You get a small chat overlay on every page. Ask it about the page and it (usually) figures out which tools to call. It has a thinking mode that shows chain-of-thought reasoning as it works.
It's a 2B model in a browser. It works for simple page questions and running JavaScript, but multi-step tool chains are unreliable and it sometimes ignores its tools entirely. The agent loop has zero external dependencies and can be extracted as a standalone library if anyone wants to experiment with it.
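For anyone curious what a dependency-free agent loop can look like, here is a minimal sketch. It is not the project's actual code: the `Model` and `Tool` types, the `{"tool": ..., "args": ...}` reply format, and the `runAgent` name are all assumptions made for illustration, and the model call is kept synchronous for brevity where a real one would be async.

```typescript
// Hypothetical types: none of these names come from the project itself.
type Message = { role: "user" | "assistant" | "tool"; content: string };
type Tool = (args: Record<string, unknown>) => string;
type Model = (messages: Message[]) => string;

// Try to read a reply as a tool call of the assumed form {"tool": "...", "args": {...}}.
function parseToolCall(reply: string): { tool: string; args: Record<string, unknown> } | null {
  try {
    const parsed = JSON.parse(reply);
    if (parsed && typeof parsed.tool === "string") {
      return { tool: parsed.tool, args: parsed.args ?? {} };
    }
  } catch {
    // Not JSON: treat the reply as a plain-text answer.
  }
  return null;
}

// Ask the model, run any tool it requests, feed the result back, and repeat.
function runAgent(model: Model, tools: Map<string, Tool>, question: string, maxSteps = 4): string {
  const messages: Message[] = [{ role: "user", content: question }];
  for (let step = 0; step < maxSteps; step++) {
    const reply = model(messages);
    messages.push({ role: "assistant", content: reply });
    const call = parseToolCall(reply);
    if (call === null) return reply; // No tool requested: this is the final answer.
    const tool = tools.get(call.tool);
    const result = tool ? tool(call.args) : `error: unknown tool "${call.tool}"`;
    messages.push({ role: "tool", content: result });
  }
  return "error: step limit reached";
}
```

The step limit is there because a small model can get stuck requesting tools forever. A model that ignores its tools entirely just falls through the `call === null` branch on the first turn.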
Every webpage I've ever visited has full JS execution privileges, and I trust half of them less than an LLM.
If you think about it, everything we've done to stop malicious webpages from fiddling with your state on other sites via XHRs is exactly, and already, the set of constraints we'd want to stop models working with webpages from doing the same thing.
That said, this looks like a cool project. Writing projects like this that use local models is so valuable, both for tool building and self-education. I am writing my own “Emacs native” agentic coding harness and I am learning a lot.
And what you call sketchy is what billions of people default to every day when they use web applications.
It's usually too much when an app asks someone to set up a local LLM, but I believe this could solve that problem?
If you want to see an example of this, https://querylight.tryformation.com/ is where I put my search library and demo. It does vector search in the browser.
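As a rough idea of what vector search in the browser involves, here is a brute-force cosine-similarity sketch. It is not Querylight's API: the `Doc` type and the `cosine` and `search` functions are hypothetical, and it assumes the embeddings have already been computed and all have the same length.

```typescript
// Hypothetical document shape: an id plus a precomputed embedding.
type Doc = { id: string; vector: number[] };

// Cosine similarity of two equal-length vectors; returns 0 if either is all zeros.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every document against the query and keep the k best matches.
function search(query: number[], docs: Doc[], k: number): { id: string; score: number }[] {
  return docs
    .map((d) => ({ id: d.id, score: cosine(query, d.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A linear scan like this needs no index and no server, which is usually fine for the few thousand documents a single page is likely to hold.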
https://developer.chrome.com/docs/ai/prompt-api
I just checked the stats:
Different use case but a similar approach. I expect that at some point this will become a native web feature, but not anytime soon, since the model download is many times the size of the browser itself. Maybe at some point these APIs could use LLMs built into the OS, like we do for graphics drivers.
That's not to say that in-browser inference isn't valuable for privacy and offline use, just that the standard case is currently pretty rough.
https://sendcheckit.com/blog/ai-powered-subject-line-alterna...
(It's currently available for testing in Android's AICore under a developer preview)