
What are you talking about? It's MIT licensed.



there is no training code and devs don't plan on ever releasing it


It’s mostly there in https://github.com/lucidrains/audiolm-pytorch#hierarchical-t.... They just used FAIR's EnCodec (https://github.com/facebookresearch/encodec) instead of SoundStream.


The voices aren’t the model. While the model requires conventional training (for which code is not provided), voices are, or at least can be, built by what could be described as “accumulated in-context learning”: every time you run text with a voice (which can be null) through the inference process, the result is an audio waveform and an updated history prompt.
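A minimal sketch of that loop, assuming nothing about the real model: `synthesize` here is a hypothetical stand-in for the inference step (the actual model is a neural network, not this toy), but it shows the shape of the API being described — each call consumes an optional history prompt and returns both audio and an updated prompt that can seed the next call.

```python
def synthesize(text, history_prompt=None):
    """Toy stand-in for inference: returns (waveform, updated_history_prompt).

    In the real system the history prompt would hold the model's internal
    token representation of prior utterances; here it is just a list of the
    texts seen so far, and the "waveform" is a dummy list of floats.
    """
    history = list(history_prompt) if history_prompt else []
    waveform = [float(ord(c)) for c in text]  # placeholder audio
    updated_prompt = history + [text]         # fold the new utterance in
    return waveform, updated_prompt

# Build a voice incrementally: each inference call refines the prompt.
voice = None
for line in ["Hello there.", "How are you?"]:
    audio, voice = synthesize(line, history_prompt=voice)

# `voice` now carries the accumulated context and can be reused or saved,
# even though no gradient-based training ever ran.
```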


It's only a matter of time.



