I still find this very useful, but I think the automation here is lagging a bit behind the technology in terms of realistic voices and pronunciation errors.
I’ve been hearing much better AI-generated narrations of books, news, etc., using the newest models. (I was even listening to an audiobook and didn’t realize it was AI for about 15 minutes, until it glitched.)
Any chance this could be upgraded?
Thanks for your feedback.
For now, we think our current voice model (provided by Azure) is the best available option all things considered. There are important considerations in addition to human-like delivery (e.g. cost, speed, reliability, fine-grained control).
I’m quite surprised that an overall-much-better option hasn’t emerged before now. My guess is that something will show up later in 2024. When it does, we will migrate.
Update: we’re seeking feedback on some new voice models.