Fine-tuned custom models, models with IP knowledge, models that know what you look like, better latency, and so on. Obviously, some of these can be served by models hosted locally: you can host a model with Triton and expose an API that your native application calls.
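As a rough sketch of the Triton route, the snippet below assumes "Triton" means NVIDIA Triton Inference Server running locally with its HTTP endpoint on the default port 8000, and that requests use the KServe v2 inference protocol it implements. The model name `my_finetuned_model`, the input tensor name `INPUT0`, and the helper names `build_infer_request` and `infer` are hypothetical placeholders, not anything from the original post; swap in whatever your model's configuration actually declares.

```python
import json
import urllib.request

# Assumptions (not from the original post):
TRITON_URL = "http://localhost:8000"   # Triton's default HTTP port
MODEL_NAME = "my_finetuned_model"      # hypothetical model name


def build_infer_request(input_name, values, shape, datatype="FP32"):
    """Build a KServe v2 inference request body for a single input tensor.

    `values` is the flattened (row-major) tensor data; its length must
    match the number of elements implied by `shape`.
    """
    expected = 1
    for dim in shape:
        expected *= dim
    if expected != len(values):
        raise ValueError(
            f"shape {list(shape)} implies {expected} elements, got {len(values)}"
        )
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": list(shape),
                "datatype": datatype,
                "data": list(values),
            }
        ]
    }


def infer(payload, model_name=MODEL_NAME, base_url=TRITON_URL, timeout=5.0):
    """POST the payload to the model's v2 infer endpoint and return parsed JSON."""
    url = f"{base_url}/v2/models/{model_name}/infer"
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, a call would look something like `infer(build_infer_request("INPUT0", [0.1, 0.2, 0.3, 0.4], [1, 4]))`, and the response JSON should carry an `outputs` list in the same tensor format. A native app in another language would make the same HTTP call; Python is used here only to keep the sketch short.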