Amazon Polly, the backend that provides these recordings, supports SSML (Speech Synthesis Markup Language), a lightweight markup for annotating text that makes it easy to add pauses (as well as emphasis, pitch, volume, etc.).
So what you’re asking for is probably a 2-3 line pull request. (Nonlinear might already be using SSML; if not, it’s pretty easy to add.)
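
To give a sense of how small the change could be, here's a rough sketch in Python. I don't know how Nonlinear's pipeline is actually set up, so the function names, the 700 ms default, and the "Joanna" voice are all my own placeholders. The SSML `<break>` tag and the `TextType="ssml"` parameter of boto3's `synthesize_speech` are the real pieces.

```python
from xml.sax.saxutils import escape


def to_ssml(paragraphs, pause_ms=700):
    """Join paragraphs into one SSML document with a pause between each.

    Hypothetical helper: the name and the default pause length are
    illustrative, not taken from any existing codebase.
    """
    pause = f'<break time="{pause_ms}ms"/>'
    # Escape &, < and > so ordinary text can't break the SSML markup.
    body = pause.join(escape(p.strip()) for p in paragraphs if p.strip())
    return f"<speak>{body}</speak>"


def synthesize(ssml, voice_id="Joanna"):
    """Send SSML to Amazon Polly and return the MP3 bytes.

    Assumes boto3 is installed and AWS credentials are configured;
    the voice is an arbitrary choice.
    """
    import boto3  # imported lazily so to_ssml works without AWS set up

    polly = boto3.client("polly")
    response = polly.synthesize_speech(
        Text=ssml,
        TextType="ssml",  # tells Polly to interpret the tags instead of reading them aloud
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    return response["AudioStream"].read()
```

If the existing code already sends plain text to Polly, the actual diff would mostly be wrapping the text like this and switching the text type to SSML.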