gigel82's comments | Hacker News

Ooh, I might consider actually buying Bose products now. Way to go!

I'm sure if Apple keeps innovating and adopting some of the Web standards, they'll outcompete other engines. But let's be realistic: they are 100% blocking other engines and not adopting standards in their own because they want that sweet, sweet 30% cut when developers can't publish PWAs and are forced into the "app" model.

WebKit's progress has been significant in recent years; it's just been more focused on things like improving CSS instead of things like an API that tells the developer how many beers the user has in their fridge.

You are unwittingly confirming his point. Apple isn't working on random stuff; they know exactly where their bread is buttered. Features with the potential to diminish that butter get skipped, neglected, or implemented half-baked.

It depends on how you look at it.

From my perspective, Google tends to focus on somewhat niche features that will benefit a small slice of web apps. In contrast, the things Apple works on are those that benefit everything from static blog sites to huge commercial web apps.

I wish Google were more like Apple in this regard, because the primitives from which everything on the web is built are still overwhelmingly crude, which results in the half-ton-truck-built-on-a-golf-cart frameworks and apps the web has become famous for. Making the web reasonable to develop for without a dependency tree that looks like a spiral fractal would do way more to make it flourish as a platform than things like access to the GPU and USB devices.


I'd love to live in your world for a bit... I can't imagine any future where having AI in your browser is a net positive for any user. It sounds like an absolute dystopian privacy and security nightmare.


Why?

Imagine you have an AI button. When you click it, the locally running LLM gets a copy of the web site in the context window, and you get to ask it a prompt, e.g. "summarize this".
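
Concretely, such a button wouldn't be much code. A minimal sketch, assuming an Ollama-style local server exposing its OpenAI-compatible endpoint on localhost (the endpoint, model name, and crude tag stripping are all illustrative):

    # "Summarize this" backed by a locally running model.
    # Assumes something like Ollama serving its OpenAI-compatible
    # API on localhost, with a small model already pulled.
    import requests
    from html.parser import HTMLParser

    class TextExtractor(HTMLParser):
        # Crude tag stripper: collects the page's visible text.
        def __init__(self):
            super().__init__()
            self.chunks = []

        def handle_data(self, data):
            self.chunks.append(data)

    def summarize(url):
        parser = TextExtractor()
        parser.feed(requests.get(url, timeout=10).text)
        text = " ".join(parser.chunks)[:8000]  # fit the context window
        resp = requests.post(
            "http://localhost:11434/v1/chat/completions",  # assumed local server
            json={
                "model": "llama3.1:8b",  # illustrative model name
                "messages": [
                    {"role": "user", "content": "Summarize this page:\n" + text}
                ],
            },
            timeout=120,
        )
        return resp.json()["choices"][0]["message"]["content"]

    print(summarize("https://example.com"))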

Imagine the browser asks you at some point, whether you want to hear about new features. The buttons offered to you are "FUCK OFF AND NEVER, EVER BOTHER ME AGAIN", "Please show me a summary once a month", "Show timely, non-modal notifications at appropriate times".

Imagine you choose the second option, and at some point, it offers you a feature described as follows: "On search engine result pages and social media sites, use a local LLM to identify headlines, classify them as clickbait-or-not, and for clickbait headlines, automatically fetch the article in an incognito session, and add a small overlay with a non-clickbait version of the title". Would you enable it?


>Why?

Do we have to re-tread 3 years of big tech overreach, scams, user hostility in nearly every common program, questionable utility that is backed by hype more than results, and the way it's propping up the US economy's otherwise stagnant/weakening GDP?

I don't really have much new to add here. I've hated this "launch in alpha" mentality for nearly a decade. Calling 2022 "alpha" is already a huge stretch.

>When you click it, the locally running LLM gets a copy of the web site in the context window, and you get to ask it a prompt, e.g. "summarize this".

Why is this valuable? I spent my entire childhood reading, and my college years learning to research and navigate technical documents. I don't value auto-summarization. Proper writing should be able to do this in its opening paragraphs.

>Imagine the browser asks you at some point, whether you want to hear about new features. The buttons offered to you are "FUCK OFF AND NEVER, EVER BOTHER ME AGAIN", "Please show me a summary once a month", "Show timely, non-modal notifications at appropriate times"

Yes, this is my "good enough" compromise, which most applications are failing to offer. Let's hope for the best.

>Imagine you choose the second option, and at some point, it offers you a feature described as follows: "On search engine result pages and social media sites, use a local LLM to identify headlines, classify them as clickbait-or-not, and for clickbait headlines, automatically fetch the article in an incognito session, and add a small overlay with a non-clickbait version of the title". Would you enable it?

No, probably not. I don't trust the powers behind such tools to identify what is "clickbait" for me. Grok shows that these are not impartial tools, and news is the last thing I want to outsource sentiment on without a lot of built-up trust.

Meanwhile, trust has only eroded this decade.


> Imagine you have an AI button. When you click it, the locally running LLM

sure, you can imagine Firefox integrating a locally-running LLM if you want.

but meanwhile, in the real world [0]:

> In the next three years, that means investing in AI that reflects the Mozilla Manifesto. It means diversifying revenue beyond search.

if they were going to implement your imagination of a local LLM, there's no reason they'd be talking about "revenue" from LLMs.

but with ChatGPT integrating ads, they absolutely can get revenue by directing users there, in the same way they get money from Google for putting Google's ads into Firefox users' eyeballs.

that's ultimately all this is. they're adding more ads to Firefox.

0: https://blog.mozilla.org/en/mozilla/leadership/mozillas-next...


not to mention the high resource usage of a local LLM, which most PCs wouldn't be able to handle, or which would just drain a laptop's battery.


All for searching something trivial, where in 99% of cases the already-indexed Wikipedia summary is good enough and way faster.


> When you click it, the locally running LLM gets a copy of the web site in the context window, and you get to ask it a prompt, e.g. "summarize this".

I'm also now imagining my GPU whirring into life with the accompanying sound of a jet plane getting ready for takeoff, as my battery suddenly starts visibly draining.

Local LLMs are a pipe dream; the technology fundamentally requires far too much computation for any true intelligence to ever make sense with current computing technologies.


Most laptops are now shipping with an NPU for handling these tasks, so it won't be getting computed on your GPU.


That doesn't mean anything; it's just a name change. They're the same kind of unit.

And whatever accelerator you try to put into it, you're not running Gemini 3 or GPT-5.1 on your laptop, not in any reasonable time frame.


Over the last few decades I've seen people make the same comment about spell checking, voice recognition, video encoding, 3D rendering, audio effects and many more.

I'm happy to say that LLM usage will only become properly integrated into background workflows when we have performant local models.

People are trying to madly monetise cloud LLMs before the inevitable rise of local only LLMs severely diminishes the market.


Time will tell, but right now we're not solving the problem of running LLMs by increasing efficiency; we're solving it with massive, unprecedented investments in compute power, and in just plain power. Companies definitely weren't building nuclear power stations to run their spell checkers or even their 3D renderers. LLMs are unprecedented in this way.


True, but the usefulness of local models is actually getting better. I hope the current unprecedented madness is driven by the potential of cloud models, and not a dismissal of the possibility of local models. It's the biggest swing we've seen (with the possible exception of cloud computing vs. local virtualisation), but that may be due to recognition of previous market behaviour, and a desperate need not to miss out on the current boom.


Also, it does mean something. An NPU is completely different from your 5070. Yes, the 5070 has dedicated AI cores, but it also has raster cores and other things not present in an NPU.

You don't need to run GPT-5.1 to summarize a webpage. Models are small and specialized for different tasks.


And all of that is irrelevant for the AI use case. The NPU is at best slightly more efficient than a GPU for this use case, and mostly it's just cheaper by forgoing various parts of a GPU that are not useful for AI (and would not be used during inference anyway).

And the examples being given of why you'd want AI in your browser are all general text comprehension and conversational discussions about that text, applied to whatever I may be browsing. It doesn't really get less specialized than that.


No, NPUs are designed to be power-efficient in ways GPU compute isn't.

You also don't need Gemini 3 or GPT-anything running locally.


Personally, I don't need AI in my browser at all. But if I did, why would I want to run a crappy model that can't think and hallucinates constantly, instead of using a better model that kinda thinks and doesn't hallucinate quite as often?


I generally agree with you, but you'd be surprised at what lower parameter models can accomplish.

I've got Nemo 3 running on an iGPU on a shitty laptop with SO-DIMM memory, and it's good enough for my tasks that I have no use for cloud models.

Similarly, Granite 4 based models are even smaller, just a couple of gigabytes, and are capable of the automation, summarization, translation, and research tasks someone might want in a browser.

Both do chain of reasoning / "thinking", both are fast, and once NPU support lands in runtimes, they can be offloaded onto more efficient hardware.

They certainly aren't perfect, but at least in my experience, fuzzy accuracy / stochastic inaccuracy is good enough for some tasks.


That's the point. For things like summarizing a webpage or letting the user ask questions about it, not that much computation is required.

An 8B Ollama model installed on a middle of the road MacBook can do this effortlessly today without whirring. In several years, it will probably be all laptops.
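
For scale, that claim is a few lines of glue today. A tiny sketch, assuming the `ollama` Python package is installed and an 8B model has been pulled (model name and input file are illustrative):

    # Summarize already-extracted page text with a local 8B model.
    # Assumes `pip install ollama` and `ollama pull llama3.1:8b`.
    import ollama

    page_text = open("saved_page.txt").read()  # text extracted from the tab

    response = ollama.chat(
        model="llama3.1:8b",
        messages=[{"role": "user", "content": "Summarize:\n\n" + page_text}],
    )
    print(response["message"]["content"])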


But why would you want to summarize a page? If I'm reading a blog, that means I want to read it, not just a condensed version that might miss the exact information I need for an insight, or that invents something that was never there.


You can also just skim it. It feels like LLM summarization boils down to an argument to substitute technology for media literacy.

Plus, the latency on current APIs is often on the order of seconds, on top of whatever the page load time is. We know from decades [0] of research that users don't wait seconds.

[0] https://research.google/blog/speed-matters/


It makes a big difference when the query runs in a sidebar without closing the tab, opening a new one, or otherwise distracting your attention.


> without closing the tab, opening a new one, or otherwise distracting your attention.

well, 2/3 is admirable in this day and age.


You don't use it to summarize pages (or at least I don't), but to help understand content within a page while minimizing distractions.

For example: I was browsing a Reddit thread a few hours ago and came upon a comment to the effect of "Bertrand Russell argued for a preemptive nuclear strike on the Soviets at the end of WWII." That seemed to conflict with my prior understanding of Bertrand Russell, to say the least. I figured the poster had confused Russell with von Neumann or Curtis LeMay or somebody, but I didn't want to blow off the comment entirely in case I'd missed something.

So I highlighted the comment, right-clicked, and selected "Explain this." Instead of having to spend several minutes or more going down various Google/Wikipedia rabbit holes in another tab or window, the sidebar immediately popped up with a more nuanced explanation of Russell's actual position (which was very poorly represented by the Reddit comment but not 100% out of line with it), complete with citations, along with further notes on how his views evolved over the next few years.

It goes without saying how useful this feature is when looking over a math-heavy paper. I sure wish it worked in Acrobat Reader. And I hope a bunch of Luddites don't browbeat Mozilla into removing the feature or making it harder to use.


And this explanation is very likely to be entirely hallucinated, or worse, subtly wrong in ways that aren't obvious if you're not already well versed in the subject. So if you care about the truth even a little bit, you then have to go and recheck everything it has "said".

Why waste time and energy on the lying machine in the first place? Just yesterday I asked "PhD-level intelligence" for a well-known quote from a famous person because I wasn't able to find it quickly on Wikiquote.

It fabricated three different quotes in a row, none of them right. One of them was supposedly from a book that doesn't really exist.

So I resorted to a Google search and found what I needed in less time than it took to fight that thing.


> And this explanation is very likely to be entirely hallucinated, or worse, subtly wrong in ways that aren't obvious if you're not already well versed in the subject. So if you care about the truth even a little bit, you then have to go and recheck everything it has "said".

It cited its sources, which is certainly more than you've done.

> Just yesterday I asked "PhD-level intelligence" for a well-known quote from a famous person because I wasn't able to find it quickly on Wikiquote.

In my experience this means that you typed a poorly-formed question into the free instant version of ChatGPT, got an answer worthy of the effort you put into it, and drew a sweeping conclusion that you will now stand by for the next 2-3 years until cognitive dissonance finally catches up with you. But now I'm the one who's making stuff up, I guess.


Unless you've then read through those sources — and not asked the machine to summarize them again — I don't see how that changes anything.

Judging by your tone and several assumptions based on nothing I see that you're fully converted. No reason to keep talking past each other.


No, I'm not "fully converted." I reject the notion that you have to join one cult or the other when it comes to this stuff.

I think we've all seen plenty of hallucinated sources, no argument there. Source hallucination wasn't a problem 2-3 years ago simply because LLMs couldn't cite their sources at all. It was a massive problem 1-2 years ago because it happened all the freaking time. It is a much smaller problem today. It still happens too often, especially with the weaker models.

I'm personally pretty annoyed that no local model (at least that I can run on my own hardware) is anywhere near as hallucination-resistant as the major non-free, non-local frontier models.

In my example, no, I didn't bother confirming the Russell sources in detail, other than to check that they (a) existed and (b) weren't completely irrelevant. I had other stuff to do and don't actually care that much. The comment just struck me as weird, and now I'm better informed thanks to Firefox's AI feature. My takeaway wasn't "Russell wanted to nuke the Russians," but rather "Russell's positions on pacifism and aggression were more nuanced than I thought. Remember to look into this further when/if it comes up again." Where's the harm in that?

Can you share what you asked, and what model you were using? I like to collect benchmark questions that show where progress is and is not happening. If your question actually elicited such a crappy response from a leading-edge reasoning model, it sounds like a good one. But if you really did just issue a throwaway prompt to a free/instant model, then trust me, you got a very wrong impression of where the state of the art really is. The free ChatGPT is inexcusably bad. It was still miscounting the r's in "Strawberry" as late as 5.1.


> I'm personally pretty annoyed that no local model (at least that I can run on my own hardware) is anywhere near as hallucination-resistant as the major non-free, non-local frontier models.

And here you get back to my original point: to get good (or at least better) AI, you need complex and huge models, that can't realistically run locally.


You can just look downthread at what people actually expect to do - certainly not (just) text summarization. And even for summarization, if you want it to work for any web page (history blog, cooking description, GitHub project, math paper, quantum computing breakthrough), and you want it accurate, you will certainly need way more than an 8B model. Add local image processing (since huge amounts of content are not understandable or summarizable if you can't understand the images used in the content), and you'll see that for a real 99% solution you need models that will not run locally even in very wild dreams.


Sure. Let's solve our memory crisis without triggering WW3 with China over Taiwan first, and maybe then we can talk about adding even more expensive silicon to increasingly expensive laptops.


>Imagine you have an AI button. When you click it, the locally running LLM gets a copy of the web site in the context window, and you get to ask it a prompt, e.g. "summarize this".

but... why? I can read the website myself. That's why I'm on the website.


People have a limited amount of time, so they may prefer spending it on something else than what a computer can do for them.


That last one sounds like a lot of churn and resources for little result. You're not really making these sound compelling compared to just blocking clickbait sites with a normal extension. And it could also be an extension users install and configure - why a popup offering it to me, and why build it into the browser that directly?


For any mildly useful AI feature, there are hundreds of entirely dangerous ones. Either way I don't want the browser to have any AI features integrated, just like I don't want the OS to have them.

Especially since we know very well that they won't be locally running LLMs, everyone's plan is to siphon your data to their "cloud hybrid AI" to feed into the surveillance models (for ad personalization, and for selling to scammers, law enforcement and anyone else).

I'd prefer to have entirely separate, completely controlled and firewalled solutions for any useful LLM scenarios.


> Imagine you have an AI button.

That pretty much sums up the problem: an "AI" button is about as useful to me as a "do stuff" button, or one of those red "that was easy" buttons they sell at Staples. Google Translate has offered machine translation for 20+ years that is more or less adequate to understand text written in a language I don't read. Fine, add a button to do that. Mediocre page summaries? That can live in some submenu. "Agentic" things like booking flights for an upcoming trip? I would never trust an "AI" button to do that.

Machine learning can be useful for well-defined, low-consequence tasks. If you think an LLM is a robot butler, you're fundamentally misunderstanding what you're dealing with.


> The buttons offered to you are "FUCK OFF AND NEVER, EVER BOTHER ME AGAIN"

I've already hit that option before reading the other ones.

> "On search engine result pages and social media sites, use a local LLM to identify headlines, classify them as clickbait-or-not, and for clickbait headlines, automatically fetch the article in an incognito session, and add a small overlay with a non-clickbait version of the title"

Why would you bother fetching the clickbait at all? It's spam.

The main transformation I want out of a browser, the absolutely critical one, is the removal of advertising. I concede that AI might be decent at removing ads and all the overlay clutter that makes news sites unreadable; does anyone have the demo of "AI readability mode"? Crucially I do not want it changing any non-ad text found on the page.


> Imagine you have an AI button. When you click it, the locally running LLM gets a copy of the web site in the context window, and you get to ask it a prompt, e.g. "summarize this".

They basically already have this feature: https://support.mozilla.org/en-US/kb/use-link-previews-firef...


I like Firefox and, unlike many users here, don't think it's about to collapse, but I have already unchecked "Recommend features as you browse" and "Recommend extensions as you browse", along with setting the welcome page for updates to about:blank.

Ideally the user interface for any tool I use should never change unless I actively prompt it to change, and the only notifications I should get would be from my friends and family contacting me or calendars/alarms that I set myself.


Lots of imagining here.


I have already clicked the all-caps button


Most users are entirely ignorant of privacy and security and will make choices without considering it. I don’t say that to excuse it but it’s absolutely the reality.


I don't know. What if the AI can remove all junk from the page, clean it up, and only leave the content - sort of like ublock origin on steroids?


I'd pay a monthly subscription fee for this. All the service would need to do to get my money is guess which words that already exist on the page I will be interested in and show me those words in black-and-white type (in a face and a size chosen by me, not the owner of the web site) free of any CSS, styling or "innovative" manner of presentation.

Specifically, the AI does not generate text for me to read. All it does is decide which parts of the text that already exists on the page to show me. (It is allowed to interact with the web page to get past any modal windows or gates.)


haha, what if I told you that the currently existing, shipping product, "ChatGPT / Gemini uses a browser for you" will have more users than Firefox in two years? I will even bet you that will likely be the case in 2 months.


> any future

> any user


This isn't a win for anyone. Windows 11 should ask consent before installing any AI feature on user's computers.


The US doesn't train anyone; that headline would only make sense somewhere in... every other country in the world (more or less), where the government actually subsidizes education (often up to 100%).

In the US, people pay for their own training, so they can damn well go wherever they please.


By that time, it will already be too late. I'd argue it's probably already too late... a dead-man-walking kind of situation. That's definitely true for consumers; it might survive out of inertia in the corporate environment for decades.


Except when your AI psychosis PM / manager sees your throwaway vibe-coded garbage and demands it gets shipped to customers.

It's infinitely worse when your PM / manager vibe-codes some disgusting garbage, sees that it kind of looks like a real thing that solves about half of the requirements (badly) and demands engineers ship that and "fix the few remaining bugs later".


We need to be super careful with how legislation around this is passed and implemented. As it currently stands, I can totally see this as a backdoor to surveillance and government overreach.

If social media platforms are required by law to categorize content as AI generated, that means they need to check with the public "AI generation" providers. And since there is no agreed-upon (public) standard for imperceptible watermark hashing, the content (image, video, audio) needs to be uploaded in its entirety to the various providers to check whether it's AI generated.

Yes, it sounds crazy, but that's the plan; imagine every image you post on Facebook/X/Reddit/WhatsApp/whatever gets uploaded to Google / Microsoft / OpenAI / UnnamedGovernmentEntity / etc. to "check if it's AI". That's what the current law in Korea and the upcoming laws in California and the EU (for August 2026) require :(
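
To make the privacy implication concrete, here's a purely illustrative sketch of the fan-out such a law would force on platforms; every endpoint below is hypothetical, since no public detection standard or API exists:

    # Purely illustrative: the fan-out described above.
    # None of these endpoints exist; they are hypothetical
    # stand-ins for per-provider "is this AI?" checks.
    import requests

    PROVIDER_CHECK_URLS = [
        "https://provider-a.example/v1/detect",  # hypothetical
        "https://provider-b.example/v1/detect",  # hypothetical
        "https://provider-c.example/v1/detect",  # hypothetical
    ]

    def must_label_as_ai(image_bytes: bytes) -> bool:
        # With no shared watermark standard, the full image goes to
        # every provider - each now holds a copy of the user's upload.
        for url in PROVIDER_CHECK_URLS:
            resp = requests.post(url, files={"image": image_bytes}, timeout=30)
            if resp.json().get("ai_generated"):
                return True
        return False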


What is this "application containers" BS? Just add native Docker stack support. Most folks in the self-hosting community already deploy nested Docker in LXCs; just add native support so we can cut out the middleman and squeeze out that indirection.


It makes no sense to add an extra layer, and we definitely do not want to make ourselves and our users dependent on the Docker project.

There exist many OCI runtimes, and our container toolkit already provides a (ballparked) 90% feature overlap with them. Maintaining two stacks here is just needless extra work and asking for extra pain for us devs and our users, so no, thanks.

That said, PVE is not OCI-runtime compatible yet, which is why this is marked as a tech preview, but it can still be useful for the many who control their OCI images themselves or have an existing automation stack that can drive the current implementation. We plan to work more on this in the future, but in the midterm it will not be that interesting for those who want a very simple hands-off approach (let's call it "casual hobby homelabber"), or who want to replace some more complex stack with it; but I think we'll get there.


People stuck with Docker for a reason, even after Docker Inc. became user-hostile. Almost every self-hosted project in existence provides a docker-compose.yml that's easy to extend and configure to get started immediately. None provide generic OCI containers to run in generic OCI runtimes.

I understand sticking with compatibility at that layer from an "ideal goal" POV, but that is unlikely to see a lot of adoption precisely because applications don't target generic OCI runtimes.
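
For reference, the kind of artifact those projects actually ship looks like this; a hypothetical but typical docker-compose.yml (image name, port, and paths are illustrative), which presumes a Docker-style stack rather than a bare OCI runtime:

    # Hypothetical but typical self-hosted compose file;
    # image, port, and volume paths are illustrative.
    services:
      app:
        image: example/selfhosted-app:latest
        restart: unless-stopped
        ports:
          - "8080:8080"
        volumes:
          - ./data:/app/data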


The one (docker compose) builds on top of the other (OCI images & runtimes).

We would have had to implement runtime integration anyway, and I hardly see any benefit in not releasing that lower-level integration earlier.


Docker errors or escapes taking down the root system? Not for me...


Docker is mostly based on the same stuff that LXC uses under the hood.


So, Pluribus?


enlighten me?


Their intent in making the reference is a bit vague, but they seem to be referring to the recently released series of the same name, and maybe drawing some kind of parallel between AI technology and the mind-control virus depicted in the show. I haven't seen it yet myself, so I am only speculating:

https://en.wikipedia.org/wiki/Pluribus_(TV_series)

> The show follows author Carol Sturka, played by Seehorn, as the rest of humanity is suddenly joined into a hive mind that seeks to amicably assimilate Carol and other immune individuals into the mind. The title of the series refers to e pluribus unum, a Latin phrase meaning 'out of many, one'.

> Set in Albuquerque, New Mexico, the series follows author Carol Sturka, who is one of only thirteen people in the world immune to the effects of "the Joining", resulting from an extraterrestrial virus that had transformed the world's human population into a peaceful and content hive mind (the "Others").

