
"After training, we can sample the network to generate synthetic utterances. At each step during sampling a value is drawn from the probability distribution computed by the network. This value is then fed back into the input and a new prediction for the next step is made. Building up samples one step at a time like this is computationally expensive, but we have found it essential for generating complex, realistic-sounding audio."

So it looks like generation is a slow process.
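To make the cost concrete, here is a rough sketch of the sampling loop the quote describes. It is not the authors' code: `toy_model`, `generate`, and `NUM_LEVELS` are hypothetical names, and the "network" is a trivial stand-in that returns a distribution centred on the previous sample. The point is only the shape of the loop: one model evaluation per output sample, and each step has to wait for the previous one.

```python
import math
import random

NUM_LEVELS = 256  # assumption: audio quantized to 256 discrete values


def toy_model(context):
    """Hypothetical stand-in for the trained network.

    Returns a probability distribution over NUM_LEVELS values for the
    next sample; here just a bump centred on the most recent sample.
    """
    prev = context[-1]
    weights = [math.exp(-abs(v - prev) / 4.0) for v in range(NUM_LEVELS)]
    total = sum(weights)
    return [w / total for w in weights]


def generate(num_samples, seed=0):
    """Draw num_samples values one at a time, feeding each back as input."""
    rng = random.Random(seed)
    samples = [NUM_LEVELS // 2]  # arbitrary starting value
    model_calls = 0
    for _ in range(num_samples):
        probs = toy_model(samples)  # one model evaluation per sample
        model_calls += 1
        value = rng.choices(range(NUM_LEVELS), weights=probs, k=1)[0]
        samples.append(value)  # fed back in for the next prediction
    return samples[1:], model_calls


if __name__ == "__main__":
    audio, calls = generate(1000)
    print(len(audio), "samples took", calls, "sequential model calls")
```

Because step t needs the value drawn at step t-1, the loop can't be parallelized across time. If the audio were, say, 16,000 samples per second (an assumption, not something the quote states), one second of output would need 16,000 sequential passes through the network.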



