Former AI safety research engineer, now AI governance researcher at OpenAI. Blog: thinkingcomplete.blogspot.com
richard_ngo
I wasn’t at Manifest, though I was at LessOnline beforehand. I strongly oppose attempts to police the attendee lists that conference organizers decide on. I think this type of policing makes it much harder to have a truth-seeking community. I’ve also updated over the last few years that having a truth-seeking community is more important than I previously thought—basically because the power dynamics around AI will become very complicated and messy, in a way that requires more skill to navigate successfully than the EA community has. Therefore our comparative advantage will need to be truth-seeking.
Why does enforcing deplatforming make truth-seeking so much harder? I think there are (at least) three important effects.
First is the one described in Scott’s essay on Kolmogorov complicity. Selecting for people willing to always obey social taboos also selects hard against genuinely novel thinkers. But we don’t need to take every idea a person has on board in order to get some value from them—we should rule thinkers in, not out.
Secondly, a point I made in this tweet: taboo topics tend to end up expanding, for structural reasons (you can easily appeal to taboos to win arguments). So over time it becomes more and more costly to quarantine specific topics.
Thirdly, it selects against people who are principled in defense of truth-seeking. My sense is that the people who organized Manifest are being very principled, and would also be willing to welcome left-wing figures with potentially-upsetting views. For example, there’s been a lot of anti-semitism from prominent left-wing thinkers lately. If one of them wanted to attend Manifest, I think it would be reasonable for Jews to be upset. But I also expect that they’d be treated pretty similarly to Hanania (e.g. allowed to come and host sessions, name used in promotional materials). I’m curious what critics of Manifest think should be done in these cases.
To be clear, I’m not saying all events should take a stance like Manifest’s. I’m just saying that I strongly support their right to do so.
Eh, I personally think of some things in the top 10 as “nowhere near” the most important issues, because of how heavy-tailed cause prioritization tends to be.
When you’re weighing existential risks (or other things which steer human civilization on a large scale) against each other, effects are always going to be denominated in a very large number of lives. And this is what OP said they were doing: “a major consideration here is the use of AI to mitigate other x-risks”. So I don’t think the headline numbers are very useful here (especially because we could make them far far higher by counting future lives).
It follows from alignment/control/misuse/coordination not being (close to) solved.
“AGIs will be helping us on a lot of tasks”, “collusion is hard” and “people will get more scared over time” aren’t anywhere close to overcoming it imo.
These are what I mean by the vague intuitions.
I think it should be possible to formalise it, even
Nobody has come anywhere near doing this satisfactorily. The most obvious explanation is that they can’t.
The issue is that both sides of the debate lack gears-level arguments. The ones you give in this post (like “all the doom flows through the tiniest crack in our defence”) are more like vague intuitions; equally, on the other side, there are vague intuitions like “AGIs will be helping us on a lot of tasks” and “collusion is hard” and “people will get more scared over time” and so on.
Last time there was an explicitly hostile media campaign against EA the reaction was not to do anything, and the result is that Émile P. Torres has a large media presence,[1] launched the term TESCREAL to some success, and EA-critical thoughts became a lot more public and harsh in certain left-ish academic circles.
You say this as if there were ways to respond which would have prevented this. I’m not sure these exist, and in general I think “ignore it” is a really really solid heuristic in an era where conflict drives clicks.
@Linch, see the article I linked above, which identifies a bunch of specific bottlenecks where lobbying and/or targeted funding could have been really useful. I didn’t know about these when I wrote my comment above, but I claim prediction points for having a high-level heuristic that led to the right conclusion anyway.
The article I linked above has changed my mind back again. Apparently the RTS,S vaccine has been in clinical trials since 1997. So the failure here wasn’t just an abstract lack of belief in technology: the technology literally already existed the whole time that the EA movement (or anyone who’s been in this space for less than two decades) has been thinking about it.
An article on why we didn’t get a vaccine sooner: https://worksinprogress.co/issue/why-we-didnt-get-a-malaria-vaccine-sooner
This seems like significant evidence for the tractability of speeding things up. E.g. a single (unjustified) decision by the WHO in 2015 delayed the vaccine by almost a decade, four years of which were spent in fundraising. It seems very plausible that even 2015 EA could have sped things up by multiple years in expectation either lobbying against the original decision, or funding the follow-up trial.
Great comment, thank you :) This changed my mind.
This is a good point. The two other examples which seem salient to me:
Deutsch’s brand of techno-optimism (which comes through particularly clearly when he tries to reason about the future of AI by saying things like “AIs will be people, therefore...”).
Yudkowsky on misalignment.
Ah, I see. I think the two arguments I’d give here:
Founding 1DaySooner for malaria 5-10 years earlier is high-EV and plausibly very cheap; and there are probably another half-dozen things in this reference class.
We’d need to know much more about the specific interventions in that reference class to confidently judge that we made a mistake. But IMO if everyone in 2015-EA had explicitly agreed “vaccines will plausibly dramatically slash malaria rates within 10 years” then I do think we’d have done much more work to evaluate that reference class. Not having done that work can be an ex-ante mistake even if it turns out it wasn’t an ex-post mistake.
Hmm, your comment doesn’t really resonate with me. I don’t think it’s really about being monomaniacal. I think the (in hindsight) correct thought process here would be something like:
“Over the next 20 or 50 years, it’s very likely that the biggest lever in the space of malaria will be some kind of technological breakthrough. Therefore we should prioritize investigating the hypothesis that there’s some way of speeding up this biggest lever.”

I don’t think you need this “move heaven and earth” philosophy to do that reasoning; I don’t think you need to focus on EA growth much more than we did. The mental step could be as simple as “Huh, bednets seem kinda incremental. Is there anything that’s much more ambitious?”
(To be clear I think this is a really hard mental step, but one that I would expect from an explicitly highly-scope-sensitive movement like EA.)
Makes sense, though I think that global development was enough of a focus of early EA that this type of reasoning should have been done anyway.
I’m more sympathetic about it not being done after, say, 2017.
A different BOTEC: with 500k deaths per year and $5,000 per death prevented by bednets, we’d have to get a year of vaccine speedup for $2.5 billion to match bednets.
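For concreteness, a minimal sketch of that arithmetic (using the rough figures above, which are placeholders rather than precise estimates):

```python
# Rough BOTEC check: how much could a year of vaccine speedup cost
# and still match bednets? Figures are the rough ones from this comment.
deaths_per_year = 500_000               # approximate annual malaria deaths
cost_per_death_averted = 5_000          # rough $/death averted via bednets
breakeven_speedup_cost = deaths_per_year * cost_per_death_averted
print(f"${breakeven_speedup_cost:,}")   # $2,500,000,000 for one year of speedup
```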
I agree that $2.5 billion to speed up development of vaccines by a year is tricky. But I expect that $2.5 billion, or $250 million, or perhaps even $25 million to speed up deployment of vaccines by a year is pretty plausible. I don’t know the details but apparently a vaccine was approved in 2021 that will only be rolled out widely in a few months, and another vaccine will be delayed until mid-2024: https://marginalrevolution.com/marginalrevolution/2023/10/what-is-an-emergency-the-case-of-rapid-malaria-vaccination.html
So I think it’s less a question of whether EA could have piled more money on and more a question of whether EA could have used that money + our talent advantage to target key bottlenecks.
(Plus the possibility of getting gene drives done much earlier, but I don’t know how to estimate that.)
That’s very useful info, ty. Though I don’t think it substantively changes my conclusion because:
Government funding tends to go towards more legible projects (like R&D). I expect that there are a bunch of useful things in this space where there are more funding gaps (e.g. lobbying for rapid vaccine rollouts).
EA has sizeable funding, but an even greater advantage in directing talent, which I think would have been our main source of impact.
There were probably a bunch of other possible technological approaches to addressing malaria that were more speculative and less well-funded than mRNA vaccines. Ex ante, it was probably a failure not to push harder towards them, rather than focusing on less scalable approaches which could never realistically have solved the full problem.
To be clear, I think it’s very commendable that OpenPhil has been funding gene drive work for a long time. I’m sad about the gap between “OpenPhil sends a few grants in that direction” and “this is a central example of what the EA community focuses on” (as bednets have been); but that shouldn’t diminish the fact that even the former is a great thing to have happen.
It currently seems likely to me that we’re going to look back on the EA promotion of bednets as a major distraction from focusing on scientific and technological work against malaria, such as malaria vaccines and gene drives.
I don’t know very much about the details of either. But it seems important to highlight how even very thoughtful people trying very hard to address a serious problem still almost always dramatically underrate the scale of technological progress.
I feel somewhat mournful about our failure on this front; and concerned about whether the same is happening in other areas, like animal welfare, climate change, and AI risk. (I may also be missing a bunch of context on what actually happened, though—please fill me in if so.)
More precisely, the cascade is:
- Probability of us developing TAGI, assuming no derailments
- Probability of us being derailed, conditional on otherwise being on track to develop TAGI without derailment

Got it. As mentioned I disagree with your 0.7 war derailment. Upon further thought I don’t necessarily disagree with your 0.7 “regulation derailment”, but I think that in most cases where I’m talking to people about AI risk, I’d want to factor this out (because I typically want to make claims like “here’s what happens if we don’t do something about it”).
Anyway, the “derailment” part isn’t really the key disagreement here. The key disagreement is methodological. Here’s one concrete alternative methodology which I think is better: a more symmetric model which involves three estimates:
Probability of us developing TAGI, assuming that nothing extreme happens
Probability of us being derailed, conditional on otherwise being on track to develop TAGI
Probability of us being rerailed, conditional on otherwise not being on track to develop TAGI
By “rerailed” here I mean roughly “something as extreme as a derailment happens, but in a way which pushes us over the threshold to be on track towards TAGI by 2043”. Some possibilities include:
An international race towards AGI, akin to the space race or race towards nukes
A superintelligent but expensive AGI turns out to be good enough at science to provide us with key breakthroughs
Massive economic growth superheats investment into TAGI
Suppose we put 5% credence on each of these “rerailing” us. Then our new calculation (using your numbers) would be:
The chance of being on track assuming that nothing extreme happens: 0.6*0.4*0.16*0.6*0.46 = 1%
P(no derailment conditional on being on track) = 0.7*0.9*0.7*0.9*0.95 = 38%
P(rerailment conditional on not being on track) = 1 − 0.95*0.95*0.95 = 14%
P(TAGI by 2043) = 0.99*0.14 + 0.01*0.38 = 14.2%
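For concreteness, here’s a minimal sketch of that calculation, mirroring the rounded placeholder numbers above (the structure and the 5% rerailment credences are illustrative, not considered estimates):

```python
# Symmetric derail/rerail calculation, using the rounded numbers from this comment.
p_on_track = 0.01    # chance of being on track, assuming nothing extreme happens
                     # (0.6 * 0.4 * 0.16 * 0.6 * 0.46 ~= 1%)
p_no_derail = 0.38   # P(no derailment | on track): 0.7 * 0.9 * 0.7 * 0.9 * 0.95 ~= 38%
p_rerail = 0.14      # P(rerailment | not on track): 1 - 0.95**3 ~= 14%

p_tagi = (1 - p_on_track) * p_rerail + p_on_track * p_no_derail
print(f"P(TAGI by 2043) ~= {p_tagi:.1%}")   # ~14.2%
```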
That’s over 30x higher than your original estimate, and totally changes your conclusions! So presumably you must think either that there’s something wrong with the structure I’ve used here, or else that 5% is way too high for each of those three rerailments. But I’ve tried to make the rerailments as analogous to the derailments as possible. For example, if you think a depression could derail us, then it seems pretty plausible that the opposite of a depression could rerail us using approximately the same mechanisms.
You might say “look, the chance of being on track to hit all of events 1-5 by 2043 is really low. This means that in worlds where we’re on track, we’re probably barely on track; whereas in worlds where we’re not on track, we’re often missing it by decades. This makes derailment much easier than rerailment.” Which… yeah, conditional on your numbers for events 1-5, this seems true. But the low likelihood of being on track also means that even very low rerailment probabilities could change your final estimate dramatically—e.g. even 1% for each of the rerailments above would increase your headline estimate by almost an order of magnitude. And I do think that many people would interpret a headline claim of “<1%” as pretty different from “around 3%”.
Having said that, speaking for myself, I don’t care very much about <1% vs 3%; I care about 3% vs 30% vs 60%. The difference between those is going to primarily depend on events 1-5, not on derailments or rerailments. I have been trying to avoid getting into the weeds on that, since everyone else has been doing so already. So I’ll just say the following: to me, events 1-5 all look pretty closely related. “Way better algorithms” and “far more rapid learning” and “cheaper inference” and “better robotic control” all seem in some sense to be different facets of a single underlying trend; and chip + power production will both contribute to that trend and also be boosted by that trend. And so, because of this, it seems likely to me that there are alternative factorizations which are less disjoint and therefore get very different results. I think this was what Paul was getting at, but that discussion didn’t seem super productive, so if I wanted to engage more with it a better approach might be to just come up with my own alternative factorization and then argue about whether it’s better or worse than yours. But this comment is already too long so will leave it be for now.
If events 1-5 constitute TAGI, and events 6-10 are conditional on AGI, and TAGI is very different from AGI, then you can’t straightforwardly get an overall estimate by multiplying them together. E.g. as I discuss above, 0.3 seems like a reasonable estimate of P(derailment from wars) if the chip supply remains concentrated in Taiwan, but doesn’t seem reasonable if the supply of chips is on track to be “massively scaled up”.
One person I was thinking about when I wrote the post was Mehdi Hasan (see his Wikipedia page).
Mehdi has spoken several times at the Oxford Union and also in a recent public debate on antisemitism, so clearly he’s not beyond the pale for many.
I personally also think that the “from the river to the sea” chant is pretty analogous to, say, white nationalist slogans. It does seem to have a complicated history, but in the wake of the October 7 attacks its association with Hamas should, I think, put it beyond the pale. Nevertheless, it has been defended by Rashida Tlaib. In general I am in favor of people being able to make arguments like hers, but I suspect that if Hanania were to make an argument for why a white nationalist slogan should be interpreted positively, it would be counted as a strong point against him.
I expect that either Hasan or Tlaib, were they interested in prediction markets, would have been treated similarly to Hanania by the Manifest organizers.
I don’t have more examples off the top of my head because I try not to follow this type of politics too much. I would be pretty surprised if an hour of searching didn’t turn up a bunch more though.