AFAIK there is no general-purpose "do this on the ANE" API. You have to go through specific higher-level frameworks like CoreML or VisionKit for your workload to end up on the ANE.
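Concretely, the closest you get to "requesting" the ANE through CoreML is the `computeUnits` setting on `MLModelConfiguration` — a sketch, where `"Model.mlmodelc"` stands in for some hypothetical compiled model:

```swift
import CoreML

let config = MLModelConfiguration()
// .all lets CoreML schedule each layer on CPU, GPU, or ANE as it sees fit.
// .cpuAndNeuralEngine (macOS 13 / iOS 16+) excludes the GPU, which is the
// closest you can get to saying "prefer the ANE" — but CoreML still decides
// per-layer whether the ANE actually runs it.
config.computeUnits = .cpuAndNeuralEngine

// "Model.mlmodelc" is a placeholder path, not a real model.
let url = URL(fileURLWithPath: "Model.mlmodelc")
let model = try MLModel(contentsOf: url, configuration: config)
```

Note it's a hint, not a guarantee: layers the ANE doesn't support silently fall back to the CPU.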
If they use llama.cpp, it probably runs on the GPU via Metal. Apple hasn't published much about the Neural Engine, so you pretty much have to use it through CoreML. I assume they have some aces up their sleeves for running LLMs efficiently that they haven't told anyone about yet.