Text-to-speechifying the EA Forum

Edit: apparently, this already exists, and I am just behind the times! This post contains a few insights about how to address some of the problems with seeking author permission, so I’ll leave it up anyway. Great job, Nonlinear Library!

Example

Here is a sample text-to-speech version of my article Don’t Be Bycatch. I chose it to obviate any concerns about permission. I’ve also tried it privately on The Case for Rare Chinese Tofus, and it worked just as well.

Key takeaways

  • We can use good text-to-speech software to convert the entire EA Forum to podcasts for about $12/month.

  • Quality is good, but we lose charts, graphs, and potentially citations.

  • We’d need to decide on a policy for getting authors’ permission.

  • I’m seeking input particularly on the authorial permission question. How should this be handled?

  • Other concerns include the loss of reputational signals and quality filtering (discussed in “Prioritization”), and the loss of comments and participation (discussed in “Conversation and feedback”).

  • I will delay implementing this at least until March 2022 to give time for any necessary discussion to play out. I’m not determined to do this if a strong consensus objection is raised, and will prioritize discussion above action.

The pros and cons here are what I thought up in an hour off the top of my head. I’ll give this at least a few weeks before moving ahead with anything, since there may be objections or alternatives I hadn’t thought of.

Background

Garrett Baker and others have been narrating EA Forum posts and publishing them as a podcast. This is wonderful, making posts more accessible and convenient. Garrett & co. seem to be doing this for free, and because it is a volunteer effort, they’ve only been able to release 6 posts over the last 40 days. I want to give them my sincere compliments for this excellent effort, and suggest a way to scale things up via Amazon’s text-to-speech service, Polly.

Most Important Concern: Permission of the Authors

On the one hand, EA Forum posts are already being turned into podcasts, and I don’t know if Garrett is seeking permission from every author before doing so. The posts are already up in public. On the other hand, turning them into a podcast is a step that some authors may not want to take. It takes away some of their ability to control their own intellectual output.

This connects to broader questions about copyright and EA Forum posting. I know next-to-nothing about copyright, so this is a major area of uncertainty for me.

The most convenient approach would be for the Forum to build in an option box that lets posters request that their post not be converted to a podcast. The box could be checked by default or not, depending on how we want to nudge the writers on this forum. More broadly, we could add an option box to the editor for selecting an alternative license, such as Creative Commons, for particular posts.

Another option is to automatically create text-to-speech versions of posts, and take down recordings at the request of the authors. Authors could be sent a message a few days before their post is recorded, to give them a chance to opt out.

A third option is to message authors, asking for their permission in each case, and only generate text-to-speech if they say “yes.” This is perhaps the safest course of action, but it is also the least efficient and takes the most effort.

Prioritization

Podcasting strips away karma and makes it harder to use the author’s reputation as a heuristic for quality. Other podcasts, like 80,000 Hours, take many steps to improve legibility and screen for quality. Mass-dumping the EA Forum into podcast form removes these reputational features. This makes the content harder to navigate, and it could present lower-quality writing on equal terms with high-quality work.

There are a couple of workarounds. One is to set a karma threshold that a post must reach before its audio version is published. Another is to put the karma the post has accrued at the time of recording into the title or description, along with the author’s name or username. Any use of karma to tag or filter posts for text-to-speech would have to wait until the post has been up for a week or so, so that karma has time to accumulate. The drawback is that the audio content would lag the Forum. Although the EA Forum has an important aspect of timelessness that mitigates this effect, it is also a live conversation among our community, and some of the value is in staying up-to-date.
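The two workarounds can be sketched together in a few lines of Python. This is only an illustration: the `Post` record, its field names, the 25-karma threshold, and the 7-day wait are all hypothetical choices, not anything taken from the EA Forum’s actual data or API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Post:
    # Hypothetical record; field names are not taken from the EA Forum's API.
    title: str
    author: str
    karma: int
    posted_at: datetime


def ready_for_audio(post, now, min_karma=25, wait=timedelta(days=7)):
    """True once the post has been up for `wait` and has reached `min_karma`.

    Both defaults are illustrative assumptions, not recommended values.
    """
    return now - post.posted_at >= wait and post.karma >= min_karma


def episode_title(post):
    """Put the author and the karma at recording time into the episode title."""
    return f"{post.title} by {post.author} ({post.karma} karma at recording)"
```

The waiting period is what makes the audio lag the Forum: shortening `wait` reduces the lag, but means the karma is measured before most readers have voted.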

An alternative perspective is that stripping away karma is a feature, not a bug. It avoids information cascades. Somebody went so far as to set up a mirror of LessWrong that doesn’t have karma.

Conversation and feedback

It’s possible to include comments, but this seems even more challenging from a permissions perspective than the posts themselves. It also would be very hard to do this in a way that’s comprehensible in audio format, though perhaps not impossible. This means that an important source of context and feedback is lost in the audio version. If people switch from reading to listening, then their relationship with the forum may also shift from one of community participation to one of relatively passive absorption. Then again, this is how many readers consume this material—they read, but don’t comment or post.

One potential positive side effect is that people may take more time to consider the content if the act of listening is a little more removed from the act of commenting.

Outreach

Podcasting is exploding in popularity. This just seems like an obvious move in terms of expanding EA’s outreach.

Cost

About 6 posts are published on the EA Forum per day, and the long ones seem to have about 15,000 characters. If every post were that long, that would come to roughly 2.7 million characters per month. AWS’s Polly cost estimator says this service would cost about $12/month.
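The arithmetic behind an estimate like this can be sketched as follows. The 30-day month and the per-million-character prices are assumptions rather than figures from the cost estimator: $4 per million characters for Polly’s standard voices and $16 for its neural voices are the rates assumed here, and they should be checked against current AWS pricing. Treating every post as a long one makes the result an upper bound.

```python
# Rough monthly cost of converting every new EA Forum post to speech.
POSTS_PER_DAY = 6          # from the estimate above
CHARS_PER_POST = 15_000    # length of a long post, used as an upper bound
DAYS_PER_MONTH = 30        # assumption

# Assumed prices in USD per one million characters; verify before relying on them.
ASSUMED_PRICE_PER_MILLION = {"standard": 4.00, "neural": 16.00}


def monthly_characters(posts_per_day=POSTS_PER_DAY,
                       chars_per_post=CHARS_PER_POST,
                       days=DAYS_PER_MONTH):
    """Total characters sent to the text-to-speech service in one month."""
    return posts_per_day * chars_per_post * days


def monthly_cost(price_per_million, characters=None):
    """Monthly cost in USD at a given price per one million characters."""
    if characters is None:
        characters = monthly_characters()
    return characters / 1_000_000 * price_per_million


if __name__ == "__main__":
    print(f"characters per month: {monthly_characters():,}")
    for voice_type, price in ASSUMED_PRICE_PER_MILLION.items():
        print(f"{voice_type}: ${monthly_cost(price):.2f}/month")
```

Under these assumptions the standard-voice figure is about $10.80 per month, in the same range as the estimator’s $12, while neural voices would cost about four times as much.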

I wouldn’t mind funding this myself, but if somebody wanted to donate, I could probably set up a Patreon or something.

Quality

You can judge for yourself from the sample above. I think it is better than many human narrators (though certainly not better than Garrett!). Any shortcomings in the narration style are compensated for by the fact that there’s no background noise.

Perhaps the most important downside is that some posts include citations, images, and graphs that are crucial to the content, yet impossible to conveniently render into speech.

One option would be for the person converting the post to speech to write up their best narration of the contents of images, and to insert citations into the main body of the text. This is how the SlateStarCodex podcast handles this problem, and in my opinion, it’s unsatisfactory. I usually find it impossible to glean any information from the description, and it’s so weird and disruptive that I feel a bit embarrassed to play it for others (even though I love the posts and appreciate the narrator’s heroic efforts!).

Another is for authors to write a podcast-friendly version of their posts. A third is just to accept this shortcoming of the podcasting format.

Another downside is that the female voices, and voices with any accent other than “neutral American masculine narrator,” are sub-par. Do we really want Polly’s “Matthew” to be the voice of the EA movement? On the other hand, we can substitute human narrators and expand into alternative AI-generated accents and voices as AWS improves Polly.

Time savings and improvement

Eyeballing it, the average long EA Forum post takes about 15 minutes to read. Some are shorter, and a few are longer. It seems like about 1-1.5 hours of reading material gets published per day, on average.

What about the time people actually spend reading posts? We can use the EA Forum’s post analytics to get a little insight, multiplying views by median reading time. My most popular post has a median reading time of 2 minutes across 3445 views, so I estimate that it has absorbed about 115 hours of total reading time. Another of my posts, with decent but not extraordinary popularity, has absorbed about 40 hours of reading time. One of my lesser-ranked posts may have absorbed 5 hours.

Let’s say that the average EA Forum post absorbs about 10 hours of total reading time. In that case, estimating 180 posts per month, that’s 1,800 person-hours of reading per month, or 21,600 per year. At 1,800 to 2,000 working hours per year, this adds up to roughly 11 to 12 person-years of reading time per year.
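For anyone who wants to plug in their own numbers, here is a minimal sketch of the two calculations above. The 2,000-hour work year is an assumption (about 50 weeks at 40 hours), and multiplying views by the median reading time is only a rough proxy for total reading time.

```python
HOURS_PER_WORK_YEAR = 2_000  # assumption: about 50 weeks at 40 hours per week


def total_reading_hours(views, median_minutes):
    """Views times median reading time, converted from minutes to hours."""
    return views * median_minutes / 60


def yearly_person_years(hours_per_post, posts_per_month,
                        hours_per_work_year=HOURS_PER_WORK_YEAR):
    """Yearly reading time across all posts, expressed in work years."""
    yearly_hours = hours_per_post * posts_per_month * 12
    return yearly_hours / hours_per_work_year


if __name__ == "__main__":
    print(f"{total_reading_hours(3445, 2):.1f} hours for the most popular post")
    print(f"{yearly_person_years(10, 180):.1f} person-years at 2,000 hours/year")
    print(f"{yearly_person_years(10, 180, 1_800):.1f} person-years at 1,800 hours/year")
```

With 10 hours per post and 180 posts per month, the result is 10.8 person-years for a 2,000-hour work year and 12 for an 1,800-hour one, so the conclusion is not very sensitive to that choice.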

A robust solution to podcasting the EA Forum could move a substantial portion of this reading into times when people are in transit. It could allow this information to be shared with others who may not read the Forum on their own, but will listen to it if their friend or partner wants to turn it on in the car. It may allow some distracted reading, such as on the phone in the grocery store checkout line, to shift to more focused listening.

Accessibility

This makes the EA Forum more accessible to people who are visually impaired, to people who cannot read, and to those who have little time for reading but some time for listening. Some people also have an easier time taking in information in an audio format.