
Nicolas Boulanger-Lewandowski's work focused extensively on this topic [1]. He wrote a Theano deep learning tutorial on it [2], and several people, including Kratarth Goel [3][4], have extended the work to use LSTMs and deep belief networks.

For a brief while RNN-NADE made an appearance as well, though I do not know of an open source implementation.

A few of us are also working on more advanced versions of this model for speech synthesis, rather than operating on MIDI sequences. Stay tuned in the near future!

I can say from experience that some of the samples from the LSTM-DBN are shockingly cool; they drove me to spend about a week working with K-means coded speech. It made robo-voices at least, but our research moved past that pretty fast.
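For anyone wondering what "K-means coded speech" can look like in practice, here is a minimal sketch. It is not the code mentioned above: the frame length, codebook size, synthetic test signal, and all function names are assumptions chosen for illustration. The idea is to cut the waveform into short frames, learn a small codebook with K-means, and store each frame as the index of its nearest codeword.

```python
import numpy as np

def frame_signal(signal, frame_len):
    """Split a 1-D signal into non-overlapping frames, dropping the tail."""
    n_frames = len(signal) // frame_len
    return signal[:n_frames * frame_len].reshape(n_frames, frame_len)

def kmeans(frames, k, n_iter=20, seed=0):
    """Plain Lloyd's algorithm; returns (codebook, codes)."""
    rng = np.random.default_rng(seed)
    # Initialize the codebook from k randomly chosen frames.
    codebook = frames[rng.choice(len(frames), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every frame to every codeword.
        dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        codes = dists.argmin(axis=1)
        for j in range(k):
            members = frames[codes == j]
            if len(members) > 0:  # leave empty clusters where they are
                codebook[j] = members.mean(axis=0)
    # Final assignment against the updated codebook.
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return codebook, dists.argmin(axis=1)

def decode(codebook, codes):
    """Rebuild a waveform by concatenating the codeword for each frame."""
    return codebook[codes].reshape(-1)

# Stand-in for real speech (assumption): a 1-second tone sweep at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * (200 + 300 * t) * t)

frames = frame_signal(signal, frame_len=40)
codebook, codes = kmeans(frames, k=16)
reconstructed = decode(codebook, codes)
```

With only 16 codewords every frame collapses to one of 16 shapes, which is roughly where a buzzy, robotic character would come from; a real experiment would likely quantize spectral features rather than raw samples.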

[1] http://www-etud.iro.umontreal.ca/~boulanni/
[2] http://deeplearning.net/tutorial/rnnrbm.html
[3] http://arxiv.org/pdf/1412.6093.pdf
[4] https://github.com/kratarth1203/NeuralNet/blob/master/rnndbn...




Is the robot-voice code published anywhere?

You can make money out of that kind of thing btw!

https://soniccharge.com/bitspeek

(Obviously not the same thing but the point is that silly robo-voice code is marketable :)



