Hacker News | Tsarp's comments

Why not something like https://github.com/nanobrowser/nanobrowser.

It's built really well, without exposing webdriver etc., and can comfortably run JS and communicate with LLMs. It has full agentic capabilities.

Why a new browser instead of a robust extension?


Why a new browser extension for Chrome instead of an MCP operating Chrome over Chrome DevTools Protocol?

https://chromedevtools.github.io/devtools-protocol/

Not vouching for this project, but just an example of the category existing: https://github.com/AgentDeskAI/browser-tools-mcp


CDP is great for testing. But one of the most basic checks for bot detection is checking for CDP (webdriver). It's always going to be a cat-and-mouse game. You'll see a bunch of solutions, CAPTCHA solvers etc., but they are usually only good for a few weeks.

There's no reason why the same cat and mouse game wouldn't apply to this browser as a whole.

True, but it's orders of magnitude less of a problem, given that the webdriver flag is an extremely basic bot check that is now considered 101.

It sounds like you're thinking of window.navigator.webdriver, which is a WebDriver thing not part of Chrome DevTools Protocol. With CDP, as far as I can tell the detection mechanisms are more about the heuristics of e.g. how fast a form is filled -- which this AI stuff will trigger immediately too.

(And even if CDP had an explicit marker somewhere, surely patching that out is easier than piling up enough patches to "make a new browser".)


Don't you need navigator.webdriver === true for CDP to drive automation? Maybe I need to update my understanding of this. This is usually a dead giveaway.

I see mentions that (unpatched) webdriver is easy to detect but detecting CDP only works by heuristics on timing etc.

With stuff like https://www.cloudflare.com/en-in/application-services/produc... and https://blog.cloudflare.com/ai-labyrinth/ and big money going in on both sides, the last thing you want is to be shadow-detected as a bot. It's all OK if you are scraping top-rated SEO slop, which is usually static sites, but for anything beyond that it won't work well for long. Quite a few issues on the Browserbase, crawl4ai and similar repos are about being detected as a bot.

I was initially impressed. But then I tested it a bunch, and it wasn't catching some really basic things. Mostly hit or miss.

I'd love for you to try https://carelesswhisper.app

- Locally running, wrapper around whisper.cpp

- I've done a lot of work on noise profiling and stitching the segments. So when you are speaking for anything >2-3 mins, it's actually faster than cloud transcription. (Accuracy is a few WER points off since the models are quantized.)

- You can try it without paying or putting in a CC. After that, ~$19 one-time. No need to sign up or log in.

- BYOK to use your Groq or Gemini free daily credits for rewriting. Support for thinking models too. Can also plug into any locally running LLM.

- Works on my 1st gen M1 without breaking a sweat.
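
For context on the WER remark above: word error rate is the word-level edit distance (substitutions + deletions + insertions) divided by the number of words in the reference transcript. A minimal sketch of that calculation follows. It is not the app's evaluation code; the function name and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical sample: one substitution plus one deletion over four words.
sample_wer = word_error_rate("the quick brown fox", "the quack brown")
```

WER can exceed 1.0 when the hypothesis contains many inserted words, so it is an error ratio rather than a true percentage.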


How much do you pay on average for an hour of transcription?


Runs locally on device. So no server costs.


simultaneously related and off topic:

https://arxiv.org/abs/2402.08021


huh! nice!


Maybe worth considering speech to text. Dictation has come a long way and if they are using a Mac any of the locally running whisper wrappers will work.

1. https://goodsnooze.gumroad.com/l/macwhisper (dictation + transcription)

2. https://carelesswhisper.app (does dictation only, and does it really well; cheapest)

3. https://superwhisper.com (both local and hosted models + lots of bells and whistles, but much higher pricing)


Wow. This looks awesome.

Can we build our own python sandbox using the sandboxfile spec? This is if I want to add my own packages. Would this be just having my own requirements file here - https://github.com/microsandbox/microsandbox/blob/main/MSB_V...


Thank you!

> Can we build our own python sandbox using the sandboxfile spec?

Yes and I plan to make that work with the SDK.

PS: Multi-stage build is WIP.


Great, will join the Discord. Is this embeddable? Will it work with a cross-platform desktop app (Tauri)?


An embeddable library that lets you launch Linux VMs and works across Windows, macOS, and Linux hosts would be incredible.


If by embeddable you mean having the VM run in the same process, then no. The VM aborts its process when it's done, so it has to run as a separate process.


A local dictation app for Mac to use when coding. I spend a lot of time talking to Cursor and ChatGPT and needed to get Rust and Swift library names right.

Spent a lot of time on low-level hardware libs to roll my own version of VAD, grammar correction and segment stitching.

Faster than the hosted dictation tools even though it runs locally, with a lot more control in terms of custom vocabulary.

https://carelesswhisper.app


Building a small framework for securely connecting desktop apps/CLIs directly to your existing browser using Native Messaging, i.e. no headless browsers or cloud sandboxes/proxies involved.

Inspired by secure password managers like Bitwarden, the goal is to reduce detectability, avoid CAPTCHAs, and mitigate common fingerprinting pitfalls.

The idea is simple: leverage the trust your browser already has.

https://github.com/srv1n/rzn-browser-native
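
For anyone unfamiliar with Native Messaging: the browser extension and the native host exchange messages over the host's stdin/stdout, and each message is UTF-8 JSON preceded by a 32-bit length in native byte order. A minimal Python sketch of that framing follows. It is not taken from rzn-browser-native; the function names and the "ping" payload are hypothetical, and an in-memory buffer stands in for the real stdin/stdout pipes.

```python
import io
import json
import struct


def encode_message(payload: dict) -> bytes:
    """Frame a JSON payload: 4-byte native-order length, then UTF-8 JSON."""
    body = json.dumps(payload).encode("utf-8")
    return struct.pack("=I", len(body)) + body


def read_message(stream):
    """Read one framed message from a binary stream; None on clean EOF."""
    header = stream.read(4)
    if len(header) == 0:
        return None
    if len(header) < 4:
        raise EOFError("truncated length header")
    (length,) = struct.unpack("=I", header)
    body = stream.read(length)
    if len(body) < length:
        raise EOFError("truncated message body")
    return json.loads(body.decode("utf-8"))


def roundtrip_demo() -> list:
    """Encode two hypothetical messages and decode them back from a buffer."""
    buffer = io.BytesIO(
        encode_message({"action": "ping"}) + encode_message({"id": 2})
    )
    messages = []
    while (message := read_message(buffer)) is not None:
        messages.append(message)
    return messages
```

In a real host the stream would be sys.stdin.buffer for reads and sys.stdout.buffer for writes, with a flush after every message.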


Looks neat. Have you thought about GTM? Is it B2C, or are you thinking enterprise?


Thank you, what does GTM mean?


Looks neat. How does it compare to the solutions already out there? Is anything different about it?


Thanks. I've written about it in detail in the README. But the most important things that differentiate it from other self-hosted variants are the ability to see references alongside the chat and the ability to include/exclude specific sources from your local sources for targeted discussion. Moreover, it has an integrated note-taking feature in Markdown, so that all of your digital knowledge base is in one place.


Been using the Obsidian clipper since it came out and this is really neat. The per-website, profile-based extraction is awesome.

Even if you are not an Obsidian user, the markdown extraction quality is the most reliable I've seen.


thanks for the tip!

