
Okay, if client side resources expand then larger parameter LLMs will be used. There.

The point is that it will be ubiquitously client-side, and it will happen faster than newer hardware comes out. Current hardware is very limited and slow at getting output from LLMs.




I'm not convinced - at least on current hardware this seems well positioned for the cloud.

My tiny Alexa puck isn't going to run a 180bn-parameter LLM that runs best on 10 graphics cards any time soon, but it can already call a simple API and get a response in only 50ms more. I suspect that for a lot of queries, people will prefer 50ms of cloud overhead in exchange for a better response from a bigger model.
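For anyone curious about the rough arithmetic behind that claim, here is a back-of-envelope sketch. All of the numbers (fp16 weights, 40 GB of VRAM per GPU, the token-generation rates) are illustrative assumptions, not measurements; only the 180bn parameters and ~50ms overhead come from the comment above.

```python
# Rough back-of-envelope numbers behind the local-vs-cloud tradeoff.
# All figures below are illustrative assumptions, not measurements.

PARAMS = 180e9            # the 180bn-parameter model mentioned above
BYTES_PER_PARAM = 2       # assuming fp16 weights
GPU_VRAM_GB = 40          # assuming a 40 GB datacenter GPU

weights_gb = PARAMS * BYTES_PER_PARAM / 1e9
gpus_needed = weights_gb / GPU_VRAM_GB
print(f"Weights alone: ~{weights_gb:.0f} GB -> ~{gpus_needed:.0f} GPUs just to hold them")

# Latency: the cloud adds a round trip but generates tokens much faster.
CLOUD_RTT_S = 0.05        # the ~50 ms network overhead mentioned above
CLOUD_TOK_PER_S = 50      # assumed datacenter generation speed
LOCAL_TOK_PER_S = 5       # assumed speed on a small edge device

for n_tokens in (20, 200):
    cloud = CLOUD_RTT_S + n_tokens / CLOUD_TOK_PER_S
    local = n_tokens / LOCAL_TOK_PER_S
    print(f"{n_tokens} tokens: cloud ~{cloud:.1f}s vs local ~{local:.1f}s")
```

Under these assumptions the fixed 50ms round trip is dwarfed by per-token generation time, which is why the cloud looks attractive for large models on current hardware.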

But who knows at this stage! I guess it could go either way depending on how both hardware and these models advance.

I just assume that in the near future, most people are going to be interacting with LLMs on low-cost devices with limited/varied compute.



