With Gradio and F5-TTS it is also possible to train your own voice: first transcribe the recordings with Whisper (speech-to-text), then use the resulting LJSpeech-style dataset to train your own voice model for F5-TTS.
This way you can basically take any audiobook with your favourite narrator, clone their voice and let them read ANY of your EPUBs. Even more, you could use F5-TTS extensions to assign different voices, e.g. for female and male characters:
{male} What's up?
{female} Nothing.
{male} Ok
{narrator} After this they both hung up the phone.
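To give an idea of how such a tagged script could be split up before synthesis, here is a rough Python sketch. It only follows the `{voice}` tags as written in the example above; the function name `parse_script` and the fallback to a `narrator` voice are my own assumptions, and the actual F5-TTS multi-voice syntax may differ in details.

```python
import re

# Matches a line such as "{male} What's up?" and captures the tag and the text.
TAG_LINE = re.compile(r"^\{(?P<voice>[^{}]+)\}\s*(?P<text>.*)$")


def parse_script(script, default_voice="narrator"):
    """Split a tagged script into (voice, text) pairs, one per non-empty line."""
    segments = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = TAG_LINE.match(line)
        if match:
            segments.append((match.group("voice").strip(), match.group("text").strip()))
        else:
            # Assumption of this sketch: untagged lines go to the default voice.
            segments.append((default_voice, line))
    return segments


script = """{male} What's up?
{female} Nothing.
{male} Ok
{narrator} After this they both hung up the phone."""

for voice, text in parse_script(script):
    print(f"{voice}: {text}")
```

Each pair could then be sent to whichever cloned voice matches the tag.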
But you'd probably need a pretty decent GPU to get this going :-)
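For the Whisper-to-LJSpeech step mentioned above, the dataset part can be sketched roughly like this. LJSpeech uses a `wavs/` folder plus a pipe-separated `metadata.csv` (`clip id|text|normalized text`). The helper name `write_ljspeech_metadata` and the sample transcripts are made up for illustration, and the training tooling you use for F5-TTS may expect a slightly different layout, so treat it as a starting point only.

```python
import tempfile
from pathlib import Path


def write_ljspeech_metadata(transcripts, dataset_dir):
    """Write an LJSpeech-style metadata.csv: one 'clip_id|text|normalized' row per clip.

    `transcripts` maps wav file names to their transcribed text. The text could
    come from Whisper, e.g. model.transcribe(path)["text"] with the openai-whisper
    package; that step is left out here so the sketch runs without a GPU.
    """
    dataset_dir = Path(dataset_dir)
    (dataset_dir / "wavs").mkdir(parents=True, exist_ok=True)  # clips belong here
    rows = []
    for wav_name, text in sorted(transcripts.items()):
        clip_id = Path(wav_name).stem
        # '|' is the column separator, so remove it from the text and tidy whitespace.
        clean = " ".join(text.replace("|", " ").split())
        rows.append(f"{clip_id}|{clean}|{clean}")
    metadata_path = dataset_dir / "metadata.csv"
    metadata_path.write_text("\n".join(rows) + "\n", encoding="utf-8")
    return metadata_path


# Hypothetical transcripts, standing in for real Whisper output.
example = {
    "chapter01_0001.wav": "It was a quiet morning.",
    "chapter01_0002.wav": "Nobody expected the phone to ring.",
}

with tempfile.TemporaryDirectory() as tmp:
    print(write_ljspeech_metadata(example, tmp).read_text(encoding="utf-8"))
```

The normalized column is just a copy of the text here; a real dataset would usually spell out numbers and abbreviations in it.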
https://github.com/JarodMica/audiobook_maker
Here is a video about it:
https://www.youtube.com/watch?v=HbUnb5znNwM
Here is a video about training your own voice model for F5-TTS:
https://www.youtube.com/watch?v=GmketyZW2c4