I mean on average; obviously you’re right that our opinions are correlated. Do you think there’s anything important about this correlation?
You say that the impact/scale of COVID is “huge”. I think this might mislead people who are used to thinking about the problems EAs think about. Here’s why.
I think COVID is probably going to cause on the order of 100 million DALYs this year, based on predictions like this; I think that 50-95% of the damage ever done by COVID will be done this year. On the scale that 80,000 Hours uses to assess the scale of problems, this would be ranked as importance level 11 or so.
I think this is lower than most things EAs consider working on or funding. For example:
Health in poor countries and factory farming are both a 13 (this seems weirdly low to me, but it’s their number)
Climate change is a 14
Nuclear security and positively shaping the development of AI are both 15
This is a logarithmic scale, so for example, according to this scale, health in poor countries is 100 times more important than COVID.
So given that COVID seems likely to be between 100x and 10000x less important than the main other cause areas EAs think about, I think it’s misleading to describe its scale as “huge”.
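To make the arithmetic behind the 100x and 10000x figures explicit, here is a minimal sketch. It assumes, as the figures above imply, that each point on the scale corresponds to a factor of 10; the function name `importance_ratio` and the parameter `factor_per_point` are illustrative and not taken from 80,000 Hours.

```python
def importance_ratio(level_a: int, level_b: int, factor_per_point: float = 10.0) -> float:
    """Return how many times more important a problem at level_a is than one at level_b.

    Assumption: the scale is logarithmic with a fixed multiplier per point
    (10x per point here, which is what the 100x-10000x range above implies).
    """
    return factor_per_point ** (level_a - level_b)


COVID_LEVEL = 11  # the rough estimate given above

# Levels as quoted in the list above.
levels = {
    "health in poor countries": 13,
    "factory farming": 13,
    "climate change": 14,
    "nuclear security": 15,
    "positively shaping the development of AI": 15,
}

ratios = {name: importance_ratio(level, COVID_LEVEL) for name, level in levels.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:,.0f}x COVID")
```

If the step size of the scale is something other than 10x per point, passing a different `factor_per_point` shows how sensitive the comparison is to that assumption.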
I’m interested in betting about whether 20% of EAs think psychedelics are a plausible top EA cause area. Eg we could sample 20 EAs from some group and ask them. Perhaps we could ask random attendees from last year’s EAG. Or we could do a poll in EA Hangout.
I think that it’s important for EA to have a space where we can communicate efficiently, rather than phrase everything for the benefit of newcomers who might be reading, so I think that this is bad advice.
I’d prefer something like the weaker and less clear statement “we **can** think ahead, and it’s potentially valuable to do so even given the fact that people might try to figure this all out later”.
I think your summary of crux three is slightly wrong: I didn’t say that we need to think about it ahead of time, I just said that we can.
Yeah, for the record I also think those are pretty plausible and important sources of impact for AI safety research.
I think that either way, it’s useful for people to think about which of these paths to impact they’re going for with their research.
My guess is I consider the activities you mentioned less valuable than you do. Probably the difference is largest for programming at MIRI and smallest for Hubinger-style AI safety research. (This would probably be a bigger discussion.)
I don’t think that peculiarities of what kinds of EA work we’re most enthusiastic about lead to much of the disagreement. When I imagine myself taking on various different people’s views about what work would be most helpful, most of the time I end up thinking that valuable contributions could be made to that work by sufficiently talented undergrads.
Independent of this, my guess would be that EA does have a decent number of unidentified people who would be about as good as people you’ve identified. E.g., I can think of ~5 people off the top of my head who I think might be great at one of the things you listed, and if I had your view on their value I’d probably think they should stop doing what they’re doing now and switch to trying one of these things. And I suspect if I thought hard about it, I could come up with 5-10 more people, and then there is the large number of people neither of us has any information about.
I am pretty skeptical of this. Eg I suspect that people like Evan (sorry Evan if you’re reading this for using you as a running example) are extremely unlikely to remain unidentified, because one of the things that they do is think about things in their own time and put the results online. Could you name a profile of such a person, and which of the types of work I named you think they’d maybe be as good at as the people I named?
It might be quite relevant if “great people” refers only to talent or also to beliefs and values/preferences
I am not intending to include beliefs and preferences in my definition of “great person”, except for preferences/beliefs like being not very altruistic, which I do count.
E.g. my guess is that there are several people who could be great at functional programming who either don’t want to work for MIRI, or don’t believe that this would be valuable. (This includes e.g. myself.)
I think my definition of great might be a higher bar than yours, based on the proportion of people who I think meet it? (To be clear I have no idea how good you’d be at programming for MIRI because I barely know you, and so I’m just talking about priors rather than specific guesses about you.)
For what it’s worth, I think that you’re not credulous enough of the possibility that the person you talked to actually disagreed with you. I think you might be doing that thing whose name I forget where you steelman someone into saying the thing you think instead of the thing they think.
For the problems-that-solve-themselves arguments, I feel like your examples have very “good” qualities for solving themselves: both personal and economic incentives are against them, they are obvious when one is confronted with the situation, and at the point where the problems become obvious, you can still solve them. I would argue that not all these properties hold for AGI. What are your thoughts about that?
I agree that it’s an important question whether AGI has the right qualities to “solve itself”. To go through the ones you named:
“Personal and economic incentives are aligned against them”—I think AI safety has somewhat good properties here. Basically no-one wants to kill everyone, and AI systems that aren’t aligned with their users are much less useful. On the other hand, it might be the case that people are strongly incentivised to be reckless and deploy things quickly.
“they are obvious when one is confronted with the situation”—I think that alignment problems might be fairly obvious, especially if there’s a long process of continuous AI progress where unaligned non-superintelligent AI systems do non-catastrophic damage. So this comes down to questions about how rapid AI progress will be.
“at the point where the problems become obvious, you can still solve them”—If the problems become obvious because non-superintelligent AI systems are behaving badly, then we can still maybe put more effort into aligning increasingly powerful AI systems after that and hopefully we won’t lose that much of the value of the future.
I’m not quite sure how high your bar is for “experience”, but many of the tasks that I’m most enthusiastic about in EA are ones which could plausibly be done by someone in their early 20s who eg just graduated university. Various tasks of this type:
Work at MIRI on various programming tasks which require being really smart and good at math and programming and able to work with type theory and Haskell. Eg we recently hired Seraphina Nix to do this right out of college. There are other people who are recent college graduates who we offered this job to who didn’t accept. These people are unusually good programmers for their age, but they’re not unique. I’m more enthusiastic about hiring older and more experienced people, but that’s not a hard requirement. We could probably hire several more of these people before we became bottlenecked on management capacity.
Generalist AI safety research that Evan Hubinger does—he led the writing of “Risks from Learned Optimization” during a summer internship at MIRI; before that internship he hadn’t had much contact with the AI safety community in person (though he’d read stuff online).
Richard Ngo is another young AI safety researcher doing lots of great self-directed stuff; I don’t think he consumed an enormous amount of outside resources while becoming good at thinking about this stuff.
I think that there are inexperienced people who could do really helpful work with me on EA movement building; to be good at this you need to have read a lot about EA and be friendly and know how to talk to lots of people.
My guess is that EA does not have a lot of unidentified people who are as good at these things as the people I’ve identified.
I think that the “EA doesn’t have enough great people” problem feels more important to me than the “EA has trouble using the people we have” problem.
One underlying hypothesis that was not explicitly pointed out, I think, was that you are looking for priority arguments. That is, part of your argument is about whether AI safety research is the most important thing you could do (it might be so obvious in an EA meeting or on the EA Forum that it’s not worth exploring, but I like making the obvious hypotheses explicit).
This is a good point.
Whereas you could argue that without pure mathematics, almost all the positive technological progress we have now (from quantum mechanics to computer science) would not exist.
I feel pretty unsure on this point; for a contradictory perspective you might enjoy this article.
[for context, I’ve talked to Eli about this in person]
I’m interpreting you as having two concerns here.
Firstly, you’re asking why this is different than you deferring to people about the impact of the two orgs.
From my perspective, the nice thing about the impact certificate setup is that if you get paid in org B impact certificates, you’re making the people at orgs A and B put their money where their mouth is. Analogously, suppose Google is trying to hire me, but I’m actually unsure about Google’s long term profitability, and I’d rather be paid in Facebook stock than Google stock. If Google pays me in Facebook stock, I’m not deferring to them about the relative values of these stocks, I’m just getting paid in Facebook stock, such that if Google is overvalued it’s no longer my problem, it’s the problem of whoever traded their Facebook stock for Google stock.
The reason why I think that the policy of maximizing impact certificates is better for the world in this case is that I think that people are more likely to give careful answers to the question “how relatively valuable is the work orgs A and B are doing” if they’re thinking about it in terms of trying to make trades than if some random EA is asking for their quick advice.
Secondly, you’re worrying that people might end up seeming like they’re endorsing an org that they don’t endorse, and that this might harm community epistemics. This is an interesting objection that I haven’t thought much about. A few possible responses:
It’s already currently an issue that people have different amounts of optimism about their workplaces, and people don’t very often publicly state how much they agree and disagree with their employer (though I personally try to be clear about this). It’s unlikely that impact equity trades will exacerbate this problem much.
Also, people often work at places for reasons that aren’t “I think this is literally the best org”, eg:
thinking that the job is fun
the job paying them a high salary (this is exactly analogous to them paying in impact equity of a different org)
thinking that the job will give you useful experience
random fluke of who happened to offer you a job at a particular point
thinking the org is particularly flawed and so you can do unusual amounts of good by pushing it in a good direction
Also, if there were liquid markets in the impact equity of different orgs, then we’d have access to much higher-quality information about the community’s guess about the relative promisingness of different orgs. So pushing in this direction would probably be overall helpful.
This was nice to read, because I’m not sure I’ve ever seen anyone actually admit this before.
Not everyone agrees with me on this point. Many safety researchers think that their path to impact is by establishing a strong research community around safety, which seems more plausible as a mechanism to affect the world 50 years out than the “my work is actually relevant” plan. (And partially for this reason, these people tend to do different research to me.)
You say you think there’s a 70% chance of AGI in the next 50 years. How low would that probability have to be before you’d say, “Okay, we’ve got a reasonable number of people to work on this risk, we don’t really need to recruit new people into AI safety”?
I don’t know what the size of the AI safety field is such that marginal effort is better spent elsewhere. Presumably this is a continuous thing rather than a discrete thing. Eg it seems to me that now compared to five years ago, there are way more people in AI safety and so if your comparative advantage is in some other way of positively influencing the future, you should more strongly consider that other thing.
What do you think about participating in a forecasting platform, e.g. Good Judgement Open or Metaculus? It seems to cover all ingredients, and even be a good signal for others to evaluate your judgement quality.
Seems pretty good for predicting things about the world that get resolved on short timescales. Sadly it seems less helpful for practicing judgement about things like the following:
judging arguments about things like the moral importance of wild animal suffering, plausibility of AI existential risk, and existence of mental illness
predictions about small-scale things like how a project should be organized (though you can train calibration on this kind of question)
Re my own judgement: I appreciate your confidence in me. I spend a lot of time talking to people who have IMO better judgement than me; most of the things I say in this post (and a reasonable chunk of things I say other places) are my rephrasings of their ideas. I think that people whose judgement I trust would agree with my assessment of my judgement quality as “good in some ways” (this was the assessment of one person I asked about this in response to your comment).
It seems that your current strategy is to focus on training, hiring and reaching out to the most promising talented individuals.
This seems like a pretty good summary of the strategy I work on, and it’s the strategy that I’m most optimistic about.
Other alternatives might include more engagement with amateurs, and providing more assistance for groups and individuals that want to learn and conduct independent research.
I think that it would be quite costly and difficult for more experienced AI safety researchers to try to cause more good research to happen by engaging more with amateurs or providing more assistance to independent research. So I think that experienced AI safety researchers are probably going to do more good by spending more time on their own research than by trying to help other people with theirs. This is because I think that experienced and skilled AI safety researchers are much more productive than other people, and because I think that a reasonably large number of very talented math/CS people become interested in AI safety every year, so we can set a pretty high bar for which people to spend a lot of time with.
Also, what would change if you had 10 times the amount of management and mentorship capacity?
If I had ten times as many copies of various top AI safety researchers and I could only use them for management and mentorship capacity, I’d try to get them to talk to many more AI safety researchers, through things like weekly hour-long calls with PhD students, or running more workshops like MSFP.
I’m a fairly good ML student who wants to decide on a research direction for AI Safety.
I’m not actually sure whether I think it’s a good idea for ML students to try to work on AI safety. I am pretty skeptical of most of the research done by pretty good ML students who try to make their research relevant to AI safety. It usually feels to me like their work ends up not contributing to one of the core difficulties, and I think that they might have been better off if they’d instead spent their effort trying to become really good at ML, so that they’d be better prepared to work on AI safety later.
I don’t have very much better advice for how to get started on AI safety; I think the “recommend to apply to AIRCS and point at 80K and maybe the Alignment Newsletter” path is pretty reasonable.
It was a good time; I appreciate all the thoughtful questions.