
So iOS LLM apps don't use the Neural Engine? Lol



None of the current iOS and macOS LLM Apps use the Neural Engine. They use the CPU and the GPU.

nb: I'm the author of a fairly popular app in that category.


How would you know that none of the Apple apps use the Neural Engine? Is the key word in that statement “LLM”?


Yes, I specifically meant autoregressive LLMs. BERT-style encoder-only models, ViTs, and CNNs run perfectly fine on the ANE. Yesterday's coremltools update[1] changes that.

[1]: https://github.com/apple/coremltools/pull/2232


Why do they not?


AFAIK there is no general-purpose "do this on the ANE" API. You have to use specific higher-level APIs like CoreML or VisionKit for the work to end up on the ANE.
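For example, the closest thing to an ANE knob is CoreML's compute-unit preference. A minimal Swift sketch (the "MyModel" class is a hypothetical one generated by Xcode from an .mlpackage):

    import CoreML

    // Ask Core ML to prefer the Neural Engine. This is only a hint;
    // Core ML still decides per-layer where each op actually runs.
    let config = MLModelConfiguration()
    config.computeUnits = .cpuAndNeuralEngine   // or .all (the default)

    // "MyModel" is a hypothetical Xcode-generated model class.
    let model = try MyModel(configuration: config)

Even then it's just a preference; Xcode's Core ML performance report is the usual way to see which ops actually landed on the ANE versus the GPU or CPU.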


This, plus Metal acceleration works quite well. 7-8B-parameter models quantized to ~3 bpw run at a good tok/s on my iPhone 15 Pro.
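As a rough sanity check on why that fits on a phone, a back-of-the-envelope estimate of the weight memory (assumed numbers, ignoring KV cache and runtime overhead):

    // Rough weight-memory estimate for a ~7B model at ~3 bpw (assumed numbers).
    let parameters = 7.0e9          // ~7B weights
    let bitsPerWeight = 3.0         // ~3 bpw quantization
    let weightBytes = parameters * bitsPerWeight / 8.0
    let gib = 1024.0 * 1024.0 * 1024.0
    print(weightBytes / gib)        // ≈ 2.4 GiB of weights, well under the 8 GB of RAM on an iPhone 15 Pro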


It works quite well as long as you don't care about battery.


If they use llama.cpp, they probably run on the GPU. Apple hasn't published much about their Neural Engine, so you kinda have to use it through CoreML. I assume they have some aces up their sleeves for running LLMs efficiently that they haven't told anyone about yet.


Probably not. The CoreML LLM stuff only works on Macs AFAIK, so the phone app most likely uses the GPU.



