This can already be done today using a streaming-capable LLM with a streaming input/output TTS model.
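
For illustration, here's a minimal sketch of that pipeline in Python. The token generator, the `tts` object with a `synthesize_stream` method, and the `play_chunk` audio sink are all hypothetical stand-ins, not any real library's API:

    def stream_speech(llm_tokens, tts, play_chunk):
        """Pipe LLM output into a streaming TTS as soon as a clause completes."""
        buf = ""
        for token in llm_tokens:
            buf += token
            # Flush at clause boundaries so the TTS has enough context
            # for natural prosody without waiting for the full reply.
            if buf.rstrip().endswith((".", "!", "?", ",")):
                for chunk in tts.synthesize_stream(buf):
                    play_chunk(chunk)
                buf = ""
        if buf:  # flush whatever remains when generation ends
            for chunk in tts.synthesize_stream(buf):
                play_chunk(chunk)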



Any LLM is "streaming capable": autoregressive generation already produces output one token at a time.
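
For example, with Hugging Face transformers any causal model can stream token by token via TextIteratorStreamer; generation runs in a background thread while the main thread consumes text as it is produced (the model choice here is arbitrary):

    from threading import Thread
    from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tok("Streaming means", return_tensors="pt")
    streamer = TextIteratorStreamer(tok, skip_prompt=True)

    # generate() feeds decoded text into the streamer as tokens are sampled.
    Thread(target=model.generate,
           kwargs=dict(**inputs, streamer=streamer, max_new_tokens=40)).start()

    for piece in streamer:  # yields text incrementally
        print(piece, end="", flush=True)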


https://github.com/mit-han-lab/streaming-llm

On a side note, and that's what led me to the link above: I wonder if it would be possible to chain N streaming LLMs in an agent workflow and get the final output stream almost instantaneously, without waiting for the first N-1 LLMs to complete their replies.
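
A rough sketch of what such a chain could look like, assuming each stage is a generator that consumes the upstream token stream lazily (a hypothetical interface, not any specific framework). One caveat: with standard causal LLMs, a downstream model generally needs the full upstream reply as its prompt before it can start generating, so the realistic win is overlapping each stage's prompt prefill with the upstream stage's decoding rather than true token-for-token pipelining:

    from typing import Callable, Iterator

    # A stage consumes a token stream and yields its own token stream.
    Stage = Callable[[Iterator[str]], Iterator[str]]

    def chain(stages: list[Stage], source: Iterator[str]) -> Iterator[str]:
        """Lazily compose streaming stages: nothing runs until the final
        stream is consumed, and each stage pulls from the previous one
        token by token."""
        stream = source
        for stage in stages:
            stream = stage(stream)
        return stream

    # Hypothetical usage, where llm_stage(prompt) wraps one streaming LLM call:
    # final = chain([llm_stage(p) for p in prompts], iter(user_tokens))
    # for token in final:
    #     print(token, end="", flush=True)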



