Founder of the Existential Risk Observatory here. We've been informing the public about xrisk for the last four years, mostly through traditional media; perhaps that's a good addition to the social media work discussed here.
We also focused on measuring our impact from the beginning. Here are a few of our EA forum posts detailing AI xrisk comms effectiveness.
We measured not only exposure but also the effectiveness of our interventions, using surveys. Our main metric was the conversion rate (called the Human Extinction Events indicator in our first paper): essentially, the percentage of people who changed their mind about whether AI is an existential risk after being exposed to our media intervention. Our average persistent conversion rate was 22%. I think this methodology would also be suitable for social media work (we have already applied it to some YouTube videos; results are in the links above).
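To make the metric concrete, here is a rough sketch of how a conversion rate like this could be computed from paired pre/post survey responses. The field name and the exact denominator are illustrative placeholders, not our actual survey schema; the precise methodology is in the papers linked above.

```python
# Illustrative sketch only; the field name "ai_is_xrisk" and the
# denominator choice are placeholders, not our actual survey design.

def conversion_rate(pre_responses, post_responses):
    """Share of respondents who did not consider AI an existential risk
    before the media intervention but did afterwards."""
    assert len(pre_responses) == len(post_responses)
    converted = sum(
        1 for pre, post in zip(pre_responses, post_responses)
        if not pre["ai_is_xrisk"] and post["ai_is_xrisk"]
    )
    return converted / len(pre_responses)

# e.g. conversion_rate(pre, post) -> 0.22 for a 22% conversion rate
```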
Our total conversion was around 1.8 million people (spreadsheet here). Using engagement times of 57 seconds for short-form articles and 123 seconds for long-form ones, this yields an effectiveness rate of 254 minutes/$ (uncorrected for quality). I do think our view estimates here, which are mostly based on circulation figures, may be on the high side. On the other hand, I'd expect the quality of, say, TIME, SCMP, or NRC articles to be better than that of average YouTube content, though there may be outliers, and this will remain, to an extent, a matter of taste.
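For concreteness, here's a back-of-the-envelope version of the minutes-per-dollar calculation behind a figure like 254 minutes/$, using the engagement times mentioned above. The view counts and spend are placeholders; the actual figures are in the linked spreadsheet.

```python
# Engagement times from above; view counts and spend are placeholders.
SHORT_FORM_SECONDS = 57   # average engagement time, short-form articles
LONG_FORM_SECONDS = 123   # average engagement time, long-form articles

def minutes_per_dollar(short_form_views, long_form_views, spend_usd):
    """Engagement minutes generated per dollar spent (uncorrected for quality)."""
    total_minutes = (short_form_views * SHORT_FORM_SECONDS
                     + long_form_views * LONG_FORM_SECONDS) / 60
    return total_minutes / spend_usd
```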
I hope it’s useful to share these numbers and calculation methods publicly. I’m a big fan of trying to spend money on the most effective channels.
In the end, I think a strategy to reduce xrisk should hedge by not betting on a single communication method. It makes sense to spend some funding on the most effective social media work, some on the most effective traditional media work, and some on direct lobbying.
Hi Jamie, thanks for your comment, glad you like it!
It's hard to go into this without partly answering your question anyway, but we appreciate the user feedback too.
We got some quick data on the project yesterday (n=15, a tech audience but not an xrisk one; data here). Among other questions, we asked: “In your own words, what is this website tracking or measuring?” Almost everyone gave a correct answer. Judging from the other answers as well, I think the main points get across pretty well, so we're not planning to change too much.
The percentage you're asking about (‘Score’) is the share of questions the AI model answers correctly on a benchmark (with one or two exceptions, which we explain under ‘Benchmarks’). I agree that's not super clear; I've added an issue on GitHub to explain this a bit better.
Does 100% mean a takeover? Not really. The issue is that none of us knows exactly at which capability threshold a takeover could occur. We don't have data on takeovers, since they haven't happened yet, and the world is complex. ‘Human expert level’ is definitely a relevant boundary to cross, and we have included it in the benchmark plots wherever meaningful (not on the homepage, where it would have been too messy).
As we said, we think part of the website's purpose is to point out missing pieces of the puzzle. Threat models (AI takeover scenarios) have so far hardly been analysed scientifically, and we plan to do research into them this year (Existential Risk Observatory, MIT FutureTech, FLI). Once we have more robust threat models, we should determine which dangerous capabilities have which red lines for each model. Then we can find out whether current benchmarks can measure those and, if so, what the relevant scores are (and if not, build new ones that can). We'd like to work on these projects together with other researchers!
Currently, that work is not done. TakeOverBench is an attempt to shed more light on the matter using the research we have right now. We plan to update it when better research becomes available.