Cool. Maybe you'd like to take a look at F5-TTS, where you can upload a voice sample to "clone" a narrator voice, e.g.

https://github.com/JarodMica/audiobook_maker

Here is a video about it:

https://www.youtube.com/watch?v=HbUnb5znNwM

With Gradio and F5-TTS it is also possible to train your own voice model: first transcribe the narrator's audio with Whisper (speech-to-text), then use the resulting LJSpeech-style dataset to train a voice model for F5-TTS. Video:

https://www.youtube.com/watch?v=GmketyZW2c4
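
In case it helps to see what such a dataset looks like: below is a rough sketch of writing an LJSpeech-style `metadata.csv` (one `id|text|normalized text` line per clip). It assumes you already have transcripts, e.g. from Whisper, as a plain dict; the function names, the clip ids and the folder name are made up for illustration, and the exact layout the F5-TTS training tools expect may differ, so check against the video.

```python
from pathlib import Path


def ljspeech_lines(transcripts):
    """Turn {clip_id: text} into LJSpeech-style 'id|text|normalized text' lines."""
    lines = []
    for clip_id, text in sorted(transcripts.items()):
        # The pipe is the field separator, so drop it from the text,
        # then collapse runs of whitespace into single spaces.
        clean = " ".join(text.replace("|", " ").split())
        lines.append(f"{clip_id}|{clean}|{clean}")
    return lines


def write_metadata(transcripts, dataset_dir):
    """Write metadata.csv into dataset_dir (hypothetical folder) and return its path."""
    dataset_dir = Path(dataset_dir)
    dataset_dir.mkdir(parents=True, exist_ok=True)
    out = dataset_dir / "metadata.csv"
    out.write_text("\n".join(ljspeech_lines(transcripts)) + "\n", encoding="utf-8")
    return out


if __name__ == "__main__":
    # Hypothetical transcripts; in practice these would come from Whisper.
    demo = {"clip_0001": "What's  up?", "clip_0002": " Nothing. "}
    print(write_metadata(demo, "my_voice_dataset").read_text(encoding="utf-8"))
```

The matching audio clips would sit next to the file (LJSpeech uses a `wavs/` folder with `<id>.wav` names).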

This way you can basically take any audiobook with your favourite narrator, clone his voice and let him read ANY of your epubs. Even more, you could use F5-TTS extensions to assign different voices, e.g. for female and male characters:

  {male} What's up?
  {female} Nothing.
  {male} Ok
  {narrator} After this they both hung up the phone.
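
To give an idea of how a tagged script like this could be processed, here is a minimal sketch that splits the text into (voice, text) segments, which you could then send one by one to whatever TTS call you use. It only handles the `{voice}` notation shown above; `split_by_voice` is a hypothetical helper and not part of F5-TTS, whose actual multi-voice syntax may look different.

```python
import re

# A line starting with "{voice}" begins a new segment spoken by that voice.
TAG = re.compile(r"^\s*\{(\w+)\}\s*(.*)$")


def split_by_voice(script, default_voice="narrator"):
    """Return a list of (voice, text) tuples from a '{voice} text' script."""
    segments = []
    for line in script.splitlines():
        if not line.strip():
            continue  # skip blank lines
        match = TAG.match(line)
        if match:
            segments.append((match.group(1), match.group(2).strip()))
        elif segments:
            # An untagged line continues the previous speaker's text.
            voice, text = segments[-1]
            segments[-1] = (voice, f"{text} {line.strip()}")
        else:
            # Untagged text before any tag goes to the default voice.
            segments.append((default_voice, line.strip()))
    return segments


if __name__ == "__main__":
    script = """
      {male} What's up?
      {female} Nothing.
      {male} Ok
      {narrator} After this they both hung up the phone.
    """
    for voice, text in split_by_voice(script):
        print(f"{voice}: {text}")
```

Each segment would then be synthesized with the reference sample for its voice and the audio pieces joined in order.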

But you'd probably need a pretty decent GPU to get this going :-)