I do think that there’s an interesting fuzzy boundary here between “derivative work” and “interpretative tool”.
E.g. with the framing “turn it into a podcast” I feel kind of uncomfortable, and at a gut level I wish I had been consulted before that happened to any of my posts.
But here’s another framing: it’s pretty easy to imagine a near-future world where anyone who wants can have a browser extension which will read things to them at this quality level rather than showing them in visual fonts. If I ask “am I in favour of people having access to that browser extension?”, I’m a fairly unambiguous yes. And then the current project can be seen as selectively providing early access to that technology. And that seems … pretty fine?
This actually makes me more favourable to the version with automated rather than human readers. Human readers would make it seem more like a derivative work, whereas the automation makes the current thing seem closer to an interpretative tool.
As far as I understand, text-to-speech browser extensions do exist, but I haven’t tested their quality relative to this one.
Extensions Read Aloud: A Text to Speech Voice Reader
Read aloud the current web-page article with one click, using text to speech (TTS). Supports 40+ languages.
Read Aloud uses text-to-speech (TTS) technology to convert webpage text to audio. It works on a variety of websites, including news sites, blogs, fan fiction, publications, textbooks, school and class websites, and online university course materials.
Gears-level info about TTS:
I feel like I am writing an Elon Musk tweet but in case anyone is interested:
As the post mentions, basically the speech quality comes from using the newer TTS models. From the sound, I think this is using Amazon Polly, the voice “Matthew, Male”.
I am an imposter, but I know this because I “made an app” for myself that does general TTS for local docs and websites.
The relevant code that produces the voices is two lines long. I did not check, but I would be surprised if there were not a browser extension already.
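For anyone curious what those "two lines" might look like, here is a minimal sketch, not the actual code from my app or from the project in the post. It assumes the `boto3` library and configured AWS credentials; the helper names `synthesize` and `make_polly_client`, the `us-east-1` region, and the choice of the neural engine are my own illustrative assumptions.

```python
from typing import Any


def synthesize(polly: Any, text: str, voice_id: str = "Matthew",
               engine: str = "neural") -> bytes:
    """Return MP3 bytes for `text` using an Amazon Polly client.

    `polly` is expected to behave like boto3.client("polly").
    The voice and engine defaults are assumptions for illustration.
    """
    # These are the "two lines": one request, one read of the audio stream.
    response = polly.synthesize_speech(
        Text=text, OutputFormat="mp3", VoiceId=voice_id, Engine=engine
    )
    return response["AudioStream"].read()


def make_polly_client(region_name: str = "us-east-1") -> Any:
    """Build a real Polly client (needs boto3 and AWS credentials)."""
    # Imported lazily so `synthesize` can be tried with a stand-in client.
    import boto3

    return boto3.client("polly", region_name=region_name)
```

Writing the result of `synthesize(make_polly_client(), "some text")` to a file opened in binary mode should give a playable MP3. Polly limits how much text one request can contain, so a long post would need to be split into chunks first; that part is left out here.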
If EAs think that a browser extension would be valuable, or want any of the permutations of forum/comment or other services, my guess is that a working, quality project with full deployment could be made for $30,000, or maybe as little as $3,000. (The crux is operational, like handling payment and accounts, as well as how you value project management.)
If there is interest, I think we can just put this in EAIF or the Future Fund or something.
Hear the audio version of this comment in “Matthew” voice.
Hear the audio version of this comment in “Kevin” voice.
You’re right that it’s Matthew!
And they do already exist; they just have the limitations I mentioned in the blog (glitchy, no playlists, not multi-platform, etc.). Here are the ones I use for various purposes:
For articles on desktop, Natural Reader
For ebooks on Android, Evie
For articles on Android, @Voice