Ask HN: Who makes the most advanced text-to-speech technology?
5 points by oftenwrong on Dec 28, 2017 | 6 comments
I want to convert some articles and books to audio so that I can listen to them while walking. I would like to try the current leader in commercially-available text-to-speech.

I have tried a few text-to-speech offerings, such as the ones included in Firefox and macOS, but they sound robotic to the point that they are difficult to listen to. The pacing is unnatural. I am hoping there is something available that is better.




Have you tried AWS Polly? https://aws.amazon.com/polly/
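If you want to script it for whole articles or books, here is a minimal sketch using boto3's `synthesize_speech` call. Assumptions on my part, not anything from the Polly page above: AWS credentials are already configured, the per-request text limit is around 3000 characters (check the current docs), and `"Joanna"` is just one example voice. `split_text` and `synthesize_to_mp3` are hypothetical helper names.

```python
import re

# Assumption: Polly limits the text accepted per SynthesizeSpeech request;
# 3000 characters is used here as a conservative chunk size.
MAX_CHARS = 3000


def split_text(text, max_chars=MAX_CHARS):
    """Split text into chunks of at most max_chars, breaking at sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-split.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def synthesize_to_mp3(text, out_path, voice_id="Joanna"):
    """Send each chunk to Polly and append the returned MP3 audio to one file."""
    import boto3  # imported lazily so split_text works without boto3 installed

    polly = boto3.client("polly")
    with open(out_path, "wb") as out:
        for chunk in split_text(text):
            response = polly.synthesize_speech(
                Text=chunk, OutputFormat="mp3", VoiceId=voice_id
            )
            out.write(response["AudioStream"].read())
```

Usage would be something like `synthesize_to_mp3(open("article.txt").read(), "article.mp3")`. Splitting at sentence boundaries keeps the pauses between chunks from landing mid-sentence, which matters when pacing is already the weak point.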


Seconded. AWS Polly should be good. There's also Bing Speech:

https://azure.microsoft.com/en-us/services/cognitive-service...

Google WaveNet promises even better audio, but it's not available yet. In my experience, Amazon is "good enough".

https://deepmind.com/blog/wavenet-generative-model-raw-audio...


I just gave it a go. It does a good job, but the pacing and intonation are still unnatural and difficult to follow. If this is the current state of the art, or close to it, I would rather just read. It is possible I could train myself to listen to it.

Hats off to the developers, though. I know human-like text-to-speech is difficult.


Although it's not a commercially available product, the SOTA for text-to-speech just came out this week. It's called Tacotron 2: https://google.github.io/tacotron/publications/tacotron2/ind...


This might not be very helpful if you are looking for specific texts to be read, but it might be worth trying to find human-narrated versions (audiobooks and so on). Even some Medium articles have spoken copies, if I remember correctly.


I do use audio books, and audio versions of articles when available.



