Skimmed it and mostly agree, thanks for writing. Takeover in particular, and which capabilities are needed for it, is more of a crux for me than human-level ability. Still, one realistically needs a shorthand for communication, and AGI/human-level AI is time-tested and relatively easy to understand. For policy and other more advanced comms, and as more details become available on which capabilities are and aren’t important for takeover, making the messaging more detailed is a good next step.
Otto
The recordings of our event are now online!
Announcing the AI Safety Summit Talks with Yoshua Bengio
High impact startup idea: make a decent carbon emissions model for flights.
Current ones simply attribute each flight’s own emissions to its passengers, which makes direct flights look low-emission. But in reality, some of these flights wouldn’t even exist if people were spread more efficiently over existing indirect flights, which is also why indirect flights are cheaper. Emission models should be relative to the counterfactual.
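A minimal sketch of the difference, with made-up numbers and hypothetical function names (this is an illustration of the idea, not an existing model):

```python
# Toy comparison of attributional vs. counterfactual flight emissions.
# All figures and probabilities below are illustrative, not real data.

def attributional_per_passenger(flight_kg_co2: float, seats: int) -> float:
    """Conventional model: each passenger is assigned an equal share of the flight."""
    return flight_kg_co2 / seats


def counterfactual_per_passenger(flight_kg_co2: float, seats: int,
                                 p_extra_flight: float,
                                 weight_penalty_kg: float = 30.0) -> float:
    """Rough marginal model: how much do total emissions change because this passenger flies?

    p_extra_flight: chance that this passenger's demand (together with others')
    leads to an additional flight being scheduled on the route. High on busy
    direct routes, low for a spare seat on an indirect flight that flies anyway.
    weight_penalty_kg: small extra fuel burn from carrying one more passenger.
    """
    return p_extra_flight * flight_kg_co2 / seats + (1 - p_extra_flight) * weight_penalty_kg


# Direct flight: 80,000 kg CO2, 180 seats, demand largely sustains the flight.
print(attributional_per_passenger(80_000, 180))                        # ~444 kg
print(counterfactual_per_passenger(80_000, 180, p_extra_flight=0.8))   # ~362 kg

# Indirect routing via two legs that would fly anyway (120,000 kg CO2 combined).
print(attributional_per_passenger(120_000, 180))                        # ~667 kg: looks worse
print(counterfactual_per_passenger(120_000, 180, p_extra_flight=0.05))  # ~62 kg: actually better
```

The only point of the sketch is that the ranking of direct vs. indirect can flip once you ask the counterfactual question.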
The startup can be for-profit. If you’re lucky, better models already exist in the scientific literature. Ideal for the AI-for-good crowd.
My guess is that a few man-years of work could have a big carbon emissions impact here.
Otto’s Quick takes
Great work, thanks a lot for doing this research! As you say, this is still very neglected. Also happy to see you’re citing our previous work on the topic. And interesting finding that fear is such a driver! A few questions:
- Could you share which three articles you’ve used? Perhaps this is in the dissertation, but I didn’t have the time to read that in full.
- Since it’s only one article per emotion (fear, hope, mixed), perhaps some other article property (other than emotion) could also have led to the difference you find?
- What follow-up research would you recommend?
- Is there anything orgs like ours (Existential Risk Observatory), or these days MIRI, which also focuses on comms, should do differently?
As a side note, we’re currently conducting research on where awareness has gone since our first two measurements (7% and 12% in early and mid 2023, respectively). We might also look into the existence and dynamics of a tipping point.
Again, great work, hope you’ll keep working in the field in the future!
Congratulations on a great prioritization!
Perhaps the research that we (Existential Risk Observatory) and others (e.g. @Nik Samoylov, @KoenSchoen) have done on effectively communicating AI xrisk could be something to build on. Here’s our first paper and three blog posts (the second includes a measurement of the effectiveness of Eliezer’s TIME article; its numbers are actually pretty good!). We’re currently working on a base rate public awareness update and further research.
Best of luck and we’d love to cooperate!
Recordings are now available!
Announcing #AISummitTalks featuring Professor Stuart Russell and many others
Nice post! Yet another path to impact could be to influence international regulation processes, such as the AI Safety Summit, by influencing the EU and member states’ positions. In a positive scenario, the EU could even take on a mediating role between the US and China.
It’s definitely good to think about whether a pause is a good idea. Together with Joep from PauseAI, I wrote down my thoughts on the topic here.
Since then, I have been thinking a bit more about the pause and comparing it to a more frequently mentioned option, namely applying model evaluations (evals) to see how dangerous a model is after training.
I think the difference between the supposedly more reasonable approach of evals and the supposedly more radical approach of a pause is actually smaller than it seems. Evals aim to detect dangerous capabilities. What will need to happen when those evals find that, indeed, a model has developed such capabilities? Then we’ll need to implement a pause. Evals or a pause is mostly a choice about timing, not a fundamentally different approach.
With evals, however, we’ll move precisely to the brink, look straight into the abyss, and then we plan to halt at the last possible moment. Unfortunately, though, we’re in thick mist and we can’t see the abyss (this is true even when we apply evals, since we don’t know which capabilities will prove existentially dangerous, and since an existential event may already occur before running the evals).
And even if we knew where to halt: we’ll need to make sure that the leading labs practically succeed in pausing themselves (there may be thousands of people working there), that the models don’t get leaked, that we implement the policy that’s needed, that we sign international agreements, and that we gain support from the general public. This is all difficult work that will realistically take time.
Pausing isn’t as simple as pressing a button; it’s a social process. No one knows how long that process of getting everyone on the same page will take, but it could be quite a while. Is it wise to start that process at the last possible moment, namely when the evals turn red? I don’t think so. The sooner we start, the higher our chance of survival.
Also, there’s a separate point that I think is not sufficiently addressed yet: we don’t know how to implement a pause beyond a few years’ duration. If hardware and algorithms improve, frontier models could democratize. While I believe this problem can be solved by international (peaceful) regulation, I also think this will be hard and we will need good plans (hardware or data regulation proposals) for how to do this in advance. We currently don’t have these, so I think working on them should be a much higher priority.
Thanks for the comment. I think the ways an aligned AGI could make the world safer against unaligned AGIs can be divided into two categories: preventing unaligned AGIs from coming into existence, or stopping already existing unaligned AGIs from causing extinction. The second is the offense/defense balance. The first is what you point at.
If an AGI were to prevent people from creating AI, this would likely be against their will. A state would be the only actor that could do so legally (assuming there is regulation in place), and also the most practical one. Therefore, I think your option falls under what I described in my post as “Types of AI (hardware) regulation may be possible where the state actors implementing the regulation are aided by aligned AIs”. I think this is indeed a realistic option and it may reduce existential risk somewhat. Getting the regulation in place at all, however, seems more important at this point than developing what I see as a pretty far-fetched and, at the moment, intractable way to implement it more effectively.
[Crosspost] AI Regulation May Be More Important Than AI Alignment For Existential Safety
[Crosspost] An AI Pause Is Humanity’s Best Bet For Preventing Extinction (TIME)
Hi Peter, thanks for your comment. We do think the conclusions we draw are robust given our sample size. Of course it depends on the signal: if there’s a change in e.g. awareness from 5% to 50%, a small sample size should be plenty to show that. However, if you’re trying to measure a signal of only a 1% difference, your sample size needs to be much larger. While we stand by our conclusions, we do think there would be significant value in others doing similar research, if possible with larger sample sizes.
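To make that intuition concrete, here is a rough back-of-the-envelope calculation of the sample size needed per survey wave, using a standard two-proportion normal approximation (for illustration only, not the exact method behind our study):

```python
# Approximate sample size per group needed to detect a difference between two
# proportions (two-sided test, normal approximation).
from statistics import NormalDist

def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.80) -> int:
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # e.g. 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)            # e.g. 0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return int(n) + 1

# A jump in awareness from 5% to 50% is detectable with a handful of respondents...
print(n_per_group(0.05, 0.50))  # ~12 per group
# ...but a 1-percentage-point shift (7% -> 8%) needs on the order of 10,000 per group.
print(n_per_group(0.07, 0.08))  # ~10,900 per group
```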
Again, thanks for your comments, we take the input into account.
Yes exactly!
Thanks Gabriel! Sorry for the confusion. TE stands for The Economist, so this item: https://www.youtube.com/watch?v=ANn9ibNo9SQ
Thanks for your reply. I mostly agree with many of the things you say, but I still think work to reduce the amount of emission rights should at least be on the list of high-impact things to do, and, as far as I’m concerned, rank significantly higher than a few of the other paths mentioned here.
If you’d still want to do technology-specific work, I think offshore solar might also be impactful and neglected.
As someone who worked in sustainable energy technology for ten years (wind energy, modeling, smart charging, activism) before moving into AI xrisk, my favorite neglected topic is carbon emission trading schemes (ETS).
ETSs such as those implemented by the EU, China, and others have a waterbed effect. The total amount of emissions is capped, and trading sets the price of those emissions for all sectors under the scheme (in the EU: electricity and heavy industry, expanding to other sectors). That means that (see the toy sketch after this list):
- Reducing emissions for sectors under an ETS is pointless, climate-wise.
- Deciding to reduce the amount of emission rights within an ETS should directly lead to lower emissions, without any need to understand the technologies involved.
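A stylized toy model of the waterbed effect (my own illustration, with made-up numbers; it assumes the cap binds and ignores refinements such as allowance banking and the EU’s Market Stability Reserve):

```python
# Toy ETS model: under a binding cap, covered emissions are set by the cap,
# not by any single sector's efforts. Numbers are illustrative only.

def covered_emissions(cap: float, demand_by_sector: dict) -> float:
    """Covered emissions equal total demand for allowances, but never exceed the cap."""
    return min(sum(demand_by_sector.values()), cap)

cap = 100.0
demand = {"electricity": 80.0, "industry": 70.0}  # 150 > cap, so the cap binds

print(covered_emissions(cap, demand))  # 100.0

# A campaign halves electricity-sector demand. The freed allowances are simply
# bought and used by other covered sectors, so total covered emissions don't move.
demand_after_campaign = {"electricity": 40.0, "industry": 70.0}
print(covered_emissions(cap, demand_after_campaign))  # still 100.0

# Retiring allowances (lowering the cap) is what actually cuts covered emissions.
print(covered_emissions(80.0, demand))  # 80.0
```

Reality is messier (prices can fall so low that the cap stops binding), but this is the first-order logic.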
It’s just crazy to think about all the good-hearted campaigning, awareness creation, hard engineering work, money, etc. that is being directed at decreasing emissions in sectors covered by an ETS. To the best of my understanding, as long as the ETS is working correctly, this effort is completely meaningless. At the same time, I knew of exactly one person based in my country, the Netherlands, trying to reduce ETS emission rights. This was the only person potentially actually achieving something useful for the climate.
If I wanted to do something neglected in the climate space, I would try to inform all those people currently wasting their energy that what they should really do is try to reduce the amount of ETS emission rights and let the market figure out the rest. (Note that several of the trajectories recommended above, such as working on nuclear power, reducing industry emissions, and deep geothermal energy (depending on the use case), are all covered by an ETS (at least in the EU), and improvements would therefore not benefit the climate.)
If countries or regions have an ETS, successful emission reduction should really start (and basically stop) there. It’s also quite a neglected area, so plenty of low-hanging fruit!
I sympathize with working on a topic you feel in your stomach. I worked on climate and switched to AI because I couldn’t get rid of a terrible feeling about humanity going to pieces without anyone really trying to solve the problem (that was ~4 years ago, but I’d say it’s still mostly true). If your stomach feeling is about climate instead, or animal welfare, or global poverty, I think there is a case to be made that you should be working in those fields, both because your effectiveness will be higher there and because it’s better for your own mental health, which is always important. I wouldn’t say this cannot be AI xrisk: I have this feeling about AI xrisk, and I think many PauseAI activists and others do, too.