ElevenLabs, by some ex-Googlers: https://beta.elevenlabs.io/ If anything, the results I've heard from this are better than the demos.
And VALL-E by Microsoft, which isn't quite as good but is notable because it can clone a real voice from only a three-second sample of the speaker:
https://valle-demo.github.io/