Although their arguments are reasonable, my big problem with this is that these guys are so motivated that I find it hard to read what they write in good faith. How can I trust these arguments are made with any kind of soberness or neutrality, when their business model is to help accelerate AI until humans aren’t doing most “valuable work” any more? I would be much more open to taking these arguments seriously if they were made by AI researchers or philosophers not running an AI acceleration company.
“Our current focus is automating software engineering, but our long-term goal is to enable the automation of all valuable work in the economy.”
I also consider “they never present any meaningful empirical evidence for their worldview” to be false. I think the evidence from Y&S is weak-ish but meaningful. They do provide a wide range of cases where AIs have gone rogue in strange and disturbing ways. I would consider driving people to delusion and suicide, killing people for self-preservation and even Hitler the man himself to be at least a somewhat “alien” style of evil. Yes, grounded in human experience, but morally incomprehensible to many people.
Although their arguments are reasonable, my big problem with this is that these guys are so motivated that I find it hard to read what they write in good faith.
People who are very invested in arguing for slowing down AI development, or decreasing catastrophic risk from AI, like many in the effective altruism community, will also be happier if they succeed in getting more resources to pursue their goals. However, I believe it is better to assess arguments on their own merits. I agree with the title of the article that it is difficult to do this. I am not aware of any empirical quantitative estimate of the risk of human extinction resulting from transformative AI.
I would consider driving people to delusion and suicide, killing people for self-preservation and even Hitler the man himself to be at least a somewhat “alien” style of evil.
I agree those actions are alien in the sense of deviating a lot from what random people do. However, I think this is practically negligible evidence about the risk of human extinction.
I don’t really like accusations of motivated reasoning. The logic you presented cuts both ways.
MIRI’s business model relies on the opposite narrative. MIRI pays Eliezer Yudkowsky $600,000 a year. It pays Nate Soares $235,000 a year. If they suddenly said that the risk of human extinction from AGI or superintelligence is extremely low, in all likelihood that money would dry up and Yudkowsky and Soares would be out of a job.
The financial basis for motivated reasoning is arguably even stronger in MIRI’s case than in Mechanize’s case. The kind of work MIRI is doing and the kind of experience Yudkowsky and Soares have isn’t really transferable to anything else. This means they are dependent on people being scared enough of AGI to give money to MIRI. On the other hand, the technical skills needed to work on trying to advance the capabilities of current deep learning and reinforcement learning systems are transferable to working on the safety of those same systems. If the Mechanize co-founders wanted to focus on safety rather than capabilities, they could.
I’m also guessing the Mechanize co-founders decided to start the company after forming their views on AI safety. They were publicly discussing these topics long before Mechanize was founded. (Conversely, Yudkowsky/MIRI’s current core views on AI were formed roughly around 2005 and have not changed in light of new evidence, such as the technical and commercial success of AI systems based on deep learning and deep reinforcement learning.)
I would consider driving people to delusion and suicide, killing people for self-preservation and even Hitler the man himself to be at least a somewhat “alien” style of evil.
The Yudkowsky/Soares/MIRI argument about AI alignment is specifically that an AGI’s goals and motivations are highly likely to be completely alien from human goals and motivations in a way that’s highly existentially dangerous. If you’re making an argument to the effect that ‘humans can also be misaligned in a way that’s extremely dangerous’, I think, at that point, you should acknowledge you’ve moved on from the Yudkowsky/Soares/MIRI argument (and maybe decided to reject it). You’re now making a quite distinct argument that needs to be evaluated independently. It may be worth asking what to do about the risk that powerful AI systems will have human-like goals and motivations that are dangerous in the same way that human goals and motivations can be dangerous. But that is a separate premise from what Yudkowsky and Soares are arguing.
MIRI’s business model relies on the opposite narrative. MIRI pays Eliezer Yudkowsky $600,000 a year. It pays Nate Soares $235,000 a year. If they suddenly said that the risk of human extinction from AGI or superintelligence is extremely low, in all likelihood that money would dry up and Yudkowsky and Soares would be out of a job.
[...] The kind of work MIRI is doing and the kind of experience Yudkowsky and Soares have isn’t really transferable to anything else.
$235K is not very much money [edit: in the context of the AI industry]. I made close to Nate’s salary as basically an unproductive intern at MIRI. $600K is also not much money. A Preparedness researcher at OpenAI has a starting salary of $310K – $460K plus probably another $500K in equity. As for nonprofit salaries, METR’s salary range goes up to $450K just for a “senior” level RE/RS, and I think it’s reasonable for nonprofits to pay someone with 20 years of experience, who might be more like a principal RS, $600K or more.
In contrast, if Mechanize succeeds, Matthew Barnett will probably be a billionaire.
If Yudkowsky said extinction risks were low and wanted to focus on some finer aspect of alignment, e.g. ensuring that AIs respect human rights a million years from now, donors who shared their worldview would probably keep donating. Indeed, this might increase donations to MIRI because it would be closer to mainstream beliefs.
MIRI’s work seems very transferable to other risks from AI, which governments and companies both have an interest in preventing. Yudkowsky and Soares have a somewhat weird skillset and I disagree with some of their research style but it’s plausible to me they could still work productively in a mathy theoretical role in either capabilities or safety.
However, things I agree with:
If the Mechanize co-founders wanted to focus on safety rather than capabilities, they could.
the Mechanize co-founders decided to start the company after forming their views on AI safety.
The Yudkowsky/Soares/MIRI argument about AI alignment is specifically that an AGI’s goals and motivations are highly likely to be completely alien from human goals and motivations in a way that’s highly existentially dangerous.
I understand the point being made (Nate plausibly could get a pay rise from an accelerationist AI company in Silicon Valley, even if the work involved was pure safetywashing, because those companies have even deeper pockets), but I would stress that these two sentences underline just how lucrative peddling doom has become for MIRI[1] as well as how uniquely positioned all sides of the AI safety movement are.
There are not many organizations whose messaging has resonated with deep-pocketed donors to the extent that they can afford to pay their [unproductive] interns north of $200k pro rata to brainstorm with them.[2] Or indeed up to $450k to someone with interesting ideas for experiments to test AI threats, communication skills, and at least enough knowledge of software to write basic Python data processing scripts. So the financial motivations to believe that AI is really important are there on either side of the debate; the real asymmetry is between the earning potential of having really strong views on AI vs really strong views on the need to eliminate malaria or factory farming.
tbf to Eliezer, he appears to have been prophesying imminent tech-enabled doom/salvation since he was a teenager on quirky extropian mailing lists, so one thing he cannot be accused of is bandwagon jumping.
Outside the Valley bubble, plenty of people at profitable or well-backed companies with specialist STEM skillsets or leadership roles are not earning that for shipping product under pressure, never mind junior research hires for nonprofits with nominally altruistic missions.
I think this misses the point: The financial gain comes from being central to ideas around AI in itself. I think given this baseline, being on the doomer side tends to carry huge opportunity cost financially. At the very least it’s unclear and I think you should make a strong argument to claim anyone financially profits from being a doomer.
The opportunity cost only exists for those with a high chance of securing comparable roles in AI companies, or very senior roles at non-AI companies, in the near future. Clearly this applies to some people working in AI capabilities research (including Nate), but if you wish to imply this applies to everyone working at MIRI and similar AI research organizations, I think the burden of proof actually rests on you. As for Eliezer, I don’t think his motivation for dooming is profit, but it’s beyond dispute that dooming is profitable for him. Could he earn orders of magnitude more money from building benevolent superintelligence based on his decision theory as he once hoped to? Well yes, but it’d have to actually work (and work in a way that didn’t kill everyone, I guess...).
Anyway, my point was less to question MIRI’s motivations or Thomas’ observation that Nate could earn at least as much if he decided to work for a pro-AI organization, and more to point out that (i) no, really, those industry-norm salaries are very high compared with pretty much any quasi-academic research job not related to treating superintelligence as imminent, and especially compared with roles typically considered “altruistic”; and (ii) if we’re worried that money gives AI company founders the wrong incentives, we should worry about the whole EA-AI ecosystem and talent pipeline EA is backing, especially since that pipeline incubated those founders.
I agree. But the reason I agree is that I think the relevant metric of what counts as a lot of money here is not whether it is a competitive salary in an ML context, but whether it would be perceived as a lot of money in a way that could plausibly threaten Eliezer’s credibility among people who would otherwise be more disposed to support AI safety, e.g. if cited broadly. I believe the answer is that it would be, in a way that even a sub-$250k salary would not be (despite how insanely high a salary that is by the standard of even most developed countries), and I would guess this expected effect is bigger than the incentive benefits of guaranteeing his financial independence. For this reason, accepting this level of income struck me as unwise, though I’m happy to be persuaded otherwise.
The context of this quote, which you have removed, is discussion of the reasonableness of wages for specific people with specific skills. Since neither Nate nor Eliezer’s counterfactual is earning the median global wage, your statistic seems irrelevant.
One should stick to the original point that raised the question about salary.
Is $600K a lot of money for most people and does EY hurt his cause by accepting this much? (Perhaps, but not the original issue)
Does EY earning $600K mean he’s benefitting substantially from maintaining his position on AI safety? E.g. if he was more pro AI development, would this hurt him financially? (Very unlikely IMO, and that was the context Thomas was responding to)
On a global scale I agree. My point is more that due to the salary standards in the industry, Eliezer isn’t necessarily out of line in drawing $600k, and it’s probably not much more than he could earn elsewhere; therefore the financial incentive is fairly weak compared to that of Mechanize or other AI capabilities companies.
Thanks for the reply. I agree with your specific point but I think it’s worth being more careful with your phrasing. How much we earn is an ethically-charged thing, and it’s not a good thing if EA’s relationship with AI companies gives us a permission structure to lose sight of this.
Edit: to be clear, I agree that “it’s probably not much more than he could earn elsewhere” but disagree that “Eliezer isn’t necessarily out of line in drawing $600k”
If Mechanize succeeds in its long-term goal of “the automation of all valuable work in the economy”, then everyone on Earth will be a billionaire.

Global wealth would have to increase a lot for everyone to become a billionaire. Assuming roughly 10 billion people, everyone being a billionaire would require a global wealth of 10^19 $ (= 10*10^9*1*10^9) for perfect distribution. Global wealth is 600 T$, so it would have to become 16.7 k (= 10^19/(600*10^12)) times as large. For a growth of 10 %/year, it would take 102 years (= LN(16.7*10^3)/LN(1 + 0.10)). For a growth of 30 %/year, it would take 37.1 years (= LN(16.7*10^3)/LN(1 + 0.30)).
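To make the arithmetic easy to check, here is a minimal Python sketch that just reruns the comment’s own round numbers (10 billion people, 1 billion $ each, 600 T$ of current global wealth, and the two growth rates); none of the inputs are precise data.

```python
import math

# The comment's own round numbers, not precise data.
population = 10e9          # ~10 billion people
target_per_person = 1e9    # $1 billion each
current_wealth = 600e12    # ~$600 trillion of global wealth today

required_wealth = population * target_per_person  # = 1e19 dollars
multiple = required_wealth / current_wealth       # ~16,700x increase needed

for growth in (0.10, 0.30):
    years = math.log(multiple) / math.log(1 + growth)
    print(f"At {growth:.0%}/year growth: {years:.0f} years")

# Prints roughly 102 years at 10 %/year and 37 years at 30 %/year.
```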
I think the claim that Yudkowsky’s views on AI risk are meaningfully influenced by money is very weak. My guess is that he could easily find another opportunity unrelated to AI risk to make $600k per year if he searched even moderately hard.
The claim that my views are influenced by money is more plausible because I stand to profit far more than Yudkowsky stands to profit from his views. However, while perhaps plausible from the outside, this claim does not match my personal experience. I developed my core views about AI risk before I came into a position to profit much from them. This is indicated by the hundreds of comments, tweets, in-person arguments, DMs, and posts from at least 2023 onward in which I expressed skepticism about AI risk arguments and AI pause proposals. As far as I remember, I had no intention to start an AI company until very shortly before the creation of Mechanize. Moreover, if I was engaging in motivated reasoning, I could have just stayed silent about my views. Alternatively, I could have started a safety-branded company that nonetheless engages in capabilities research—like many of the ones that already exist.
It seems implausible that spending my time writing articles advocating for AI acceleration is the most selfishly profitable use of my time. The direct impact of the time I spend building Mechanize is probably going to have a far stronger effect on my personal net worth than writing a blog post about AI doom. However, while I do not think writing articles like this one is very profitable for me personally, I do think it is helpful for the world because I see myself as providing a unique perspective on AI risk that is available almost nowhere else. As far as I can tell, I am one of only a very small number of people in the world who have both engaged deeply with the arguments for AI risk and yet actively and explicitly work toward accelerating AI.
In general, I think people overestimate how much money influences people’s views about these things. It seems clear to me that people are influenced far more by peer effects and incentives from the social group they reside in. As a comparison, there are many billionaires who advocate for tax increases, or vote for politicians who support tax increases. This actually makes sense when you realize that merely advocating or voting for a particular policy is very unlikely to create change that meaningfully impacts you personally. Bryan Caplan has discussed this logic in the context of arguments about incentives under democracy, and I generally find his arguments compelling.
I think the claim that Yudkowsky’s views on AI risk are meaningfully influenced by money is very weak.
To be clear, I agree. I also agree with your general point that other factors are often more important than money. Some of these factors include the allure of millennialism, or the allure of any sort of totalizing worldview or “ideology”.
I was trying to make a general point against accusations of motivated reasoning related to money, at least in this context. If two sets of people are each getting paid to work on opposite sides of an issue, why only accuse one side of motivated reasoning?
This is indicated by the hundreds of comments, tweets, in-person arguments, DMs, and posts from at least 2023 onward in which I expressed skepticism about AI risk arguments and AI pause proposals.
Thanks for describing this history. Evidence of a similar kind lends strong credence to Yudkowsky having formed his views independently of the influence of money as well.
My general view is that reasoning is complex, motivation is complex, people’s real psychology is complex, and that the forum-like behaviour of accusing someone of engaging in X bias is probably a misguided pop science simplification of the relevant scientific knowledge. For instance, when people engage in distorted thinking, the actual underlying reasoning often seems to be a surprisingly complicated multi-step sequence.
The essay above that you co-wrote is incredibly strong. I was the one who originally sent it to Vasco and, since he is a prolific cross-poster and I don’t like to cross-post under my name, encouraged him to cross-post it. I’m glad more people in the EA community have now read it. I think everyone in the EA community should read it. It’s regrettable that there’s only been one object-level comment on the substance of the essay so far, and so many comments about this (to me) relatively uninteresting and unimportant side point about money biasing people’s beliefs. I hope more people will comment on the substance of the essay at some point.
Thanks for this comment! I think your arguments about your own motivated reasoning are somewhat moot, since they seem more of an explanation that your behavior/public-facing communication isn’t outright deception (which seems right!). As I see it, motivated reasoning is to a large extent about deceiving yourself and maintaining a coherent self-narrative, so it’s perfectly plausible that one is willing to pay substantial costs in order to maintain this. (Speaking generally; I’m not very interested in discussing whether you’re doing it in particular.)
The kind of work MIRI is doing and the kind of experience Yudkowsky and Soares have isn’t really transferable to anything else.
Soares was a software engineer at Microsoft and Google before joining MIRI, and would trivially be able to rejoin industry after a few weeks of self-study to earn more money if for some reason he decided he wanted to do that. I won’t argue the point about EY—it seems obvious to me that his market value as a writer/communicator is well in excess of his 2023/2024 compensation, given his track record, but the argument here is less legible. Thankfully it turns out that somebody anticipated the exact same incentive problem and took action to mitigate it.
It’s interesting to claim that money stops being an incentive for people after a certain fixed amount well below $1 million/year. Let’s say that’s true — maybe it is true — then why do we treat people like Sam Altman, Dario Amodei, Elon Musk, and so on as having financial incentives around AI? Are we wrong to do so? (What about AI researchers and engineers who receive multi-million-dollar compensation packages? After the first, say, $5 million, are they free and clear to form unmotivated opinions?)
I think a very similar argument can be made about the Mechanize co-founders. They could make “enough” money doing something else — including their previous jobs — even if it’s less money than they might stand to gain from a successful AI capabilities startup. Should we then rule out money as an incentive?
To be clear, I don’t claim that Eliezer Yudkowsky, Nate Soares, others at MIRI, or the Mechanize co-founders are unduly motivated by money in forming their beliefs. I have no way of knowing that, and since there’s no way to know, I’m willing to give them all the benefit of the doubt. I’m saying I dislike accusations of motivated reasoning in large part because they’re so easy to level at people you disagree with, and it’s easy to overlook how the same argument could apply to yourself or people you agree with. I’m pointing out how a similar accusation could be levelled at Yudkowsky and Soares in order to illustrate this general point, specifically to challenge Nick Laing’s accusation against the Mechanize co-founders above.
I generally think that ideological motivation around AGI is a powerful motivator. I think the psychology around how people form their beliefs on AGI is complex and involves many factors (e.g. millennialist cognitive bias, to name just one).
It’s interesting to claim that money stops being an incentive for people after a certain fixed amount well below $1 million/year.
Where is this claim being made? I think the suggestion was that someone found it desirable to reduce the financial incentive gradient for EY taking any particular public stance, not some vastly general statement like what you’re suggesting.
Personally I don’t think Sam Altman is motivated by money. He just wants to be the one to build it.
I sense that Elon Musk and Dario Amodei’s motivations are more complex than “motivated by money”, but I can imagine that the actual dollar amounts are more important to them than to Sam.
I believe this is because a donor specifically requested it. The express purpose of the donation was to make Eliezer rich enough that he could afford to say “actually AI risk isn’t a big deal” and shut down MIRI without putting himself in a difficult financial situation.
Edit Feb 2: Apparently the donation I was thinking of is separate from Eliezer’s salary, see his comment
Thanks for sharing, Michael. If I was as concerned about AI risk as @EliezerYudkowsky, I would use practically all the additional earnings (e.g. above Nate’s 235 k$/year; in reality I would keep much less) to support efforts to decrease it. I would believe spending more money on personal consumption or investments would just increase AI risk relative to supporting the most cost-effective efforts to decrease it.
A donor wanted to spend their money this way; it would not be fair to the donor for Eliezer to turn around and give the money to someone else. There is a particular theory of change according to which this is the best marginal use of ~$1 million: it gives Eliezer a strong defense against accusations like
If they suddenly said that the risk of human extinction from AGI or superintelligence is extremely low, in all likelihood that money would dry up and Yudkowsky and Soares would be out of a job.
I kinda don’t think this was the best use of a million dollars, but I can see the argument for how it might be.
I got a one-time gift of appreciated crypto, not through MIRI, part of whose purpose as I understood it was to give me enough of a savings backstop (having in previous years been not paid very much at all) that I would feel freer to speak my mind or change my mind should the need arise.
I have of course already changed MIRI’s public mission sharply on two occasions, the first being when I realized in 2001 that alignment might need to be a thing, and said so to the primary financial supporter who’d previously supported MIRI (then SIAI) on the premise of charging straight ahead on AI capabilities; the second being in the early 2020s when I declared publicly that I did not think alignment technical work was going to complete in time and MIRI was mostly shifting over to warning the world of that rather than continuing to run workshops. Should I need to pivot a third time, history suggests that I would not be out of a job.
If I had Eliezer’s views about AI risk, I would simply be transparent upfront with the donor, and say I would donate the additional earnings. I think this would ensure fairness. If the donor insisted I had to spend the money on personal consumption, I would turn down the offer if I thought this would result in the donor supporting projects that would decrease AI risk more cost-effectively than my personal consumption. I believe this would be very likely to be the case.
I generally don’t love “motivated reasoning” arguments, but on the extreme ends, like tobacco companies, government propaganda, and AI accelerationist companies, I’m happy with putting that out there. Especially in a field like AI safety, which is so speculative anyway. In general I don’t think we should give too much airtime to people who have enormous personal financial gains at stake, especially in a world where money is stronger than rationalism most of the time.
Wow, I’m mind blown that Yudkowsky pays himself that much. If only because it leaves him open to criticisms like these. I still don’t think the financial incentives are as strong as for people starting an accelerationist company, but it’s a fair point.
And yes, on the alien argument, I was arguing that some previous indications of rogue AI do seem to me somewhat alien.
While motivated reasoning is certainly something to look out for, the substance of the argument should also be taken into account. I believe that the main point of this post, that Yudkowsky and Soares’s book is full of narrative arguments and unfalsifiable hypotheses mostly unsupported by references to external evidence, is obviously true. As you yourself say, OP’s arguments are reasonable. On that background, this kind of attack from you seems unjustified, and I’d like to hear what parts/viewpoints/narratives/conclusions of the post are motivated reasoning in your estimation.
I do agree that motivated reasoning is common with the proponents of AI adoption. As an example, I think the white paper Sparks of Artificial General Intelligence: Early experiments with GPT-4 by Microsoft is clearly a piece of advertising masquerading as a scientific paper. Microsoft has a lot to benefit from the commercial success of its partner company OpenAI, and the conclusions it suggests are almost certainly colored by this. The same could be said about many of OpenAI’s own white papers. But this does not mean that the examples or experiments they showcase are wrong per se (even if cherry-picked), or that there is no real information in them. Their results merely need to be read with skeptical lenses.
We should generally be skeptical of corporations (or even non-profits!) releasing pre-prints that look like scientific papers but might not pass peer review at a scientific journal. We should indeed view such pre-prints as somewhere between research and marketing. OpenAI’s pre-prints or white papers are a good example.
I think it’s hard to claim that a pre-print like Sparks of AGI is insincere (it might be, but how could we support that claim?), but this doesn’t undermine the general point. Suppose employees at Microsoft Research wanted to publish a similar report arguing that GPT-4’s seeming cognitive capabilities are actually just a bunch of cheap tricks and not sparks of anything. Would Microsoft publish that report? It’s not just about how financial or job-related incentives shape what you believe (although that is worth thinking about), it’s also about how they shape what you can say out loud. (And, importantly, what you are encouraged to focus on.)
There’s an expert consensus that tobacco is harmful, and there is a well-documented history of tobacco companies engaging in shady tactics. There is also a well-documented history of government propaganda being misleading and deceptive, and if you asked anyone with relevant expertise — historians, political scientists, media experts, whoever — they would certainly tell you that government propaganda is not reliable.
But just lumping in “AI accelerationist companies” with that is not justified. “AI accelerationist” just means anyone who works on making AI systems more capable who doesn’t agree with the AI alignment/AI safety community’s peculiar worldview. In practice, that means you’re saying most people with expertise in AI are compromised and not worth listening to, but you are willing to listen to this weird random group of people, some of whom, like Yudkowsky, have no technical expertise in contemporary AI paradigms (i.e. deep learning and deep reinforcement learning). This seems like a recipe for disaster, like deciding that capitalist economists are all corrupt and that only Marxist philosophers are worth trusting.
A problem with motivated reasoning arguments, when stretched to this extent, is that anyone can accuse anyone on the thinnest pretext. And rather than engaging with people’s views and arguments in any serious, substantive way, it just turns into a lot of finger pointing.
Yudkowsky’s gotten paid millions of dollars to prophesy AI doom. Many people have argued that AI safety/AI alignment narratives benefit the AI companies and their investors. The argument goes like this: Exaggerating the risks of AI exaggerates AI’s capabilities. Exaggerating AI’s capabilities makes the prospective financial value of AI much higher than it really is. Therefore, talking about AI risk or even AI doom is good business.
I would add that exaggerating risk may be a particularly effective way to exaggerate AI’s capabilities. People tend to be skeptical of anything that sounds like pie-in-the-sky hope or optimism. On the other hand, talking about risk sounds serious and intelligent. Notice what goes unsaid: many near-term AGI believers think there’s a high chance of some unbelievably amazing utopia just on the horizon. How many times have you heard someone imagine that utopia? One? Zero? And how many times have you heard various AI doom or disempowerment stories? Why would no one ever bring up this amazing utopia they think might happen very soon?
Even if you’re very pessimistic and think there’s a 90% chance of AI doom, a 10% chance of utopia is still pretty damn interesting. And many people are much more optimistic, thinking there’s around a 1-30% chance of doom, which implies a 70%+ chance of utopia. So, what gives? Where’s the utopia talk? Even when people talk about the utopian elements of AGI futures, they emphasize the worrying parts: what if intelligent machines produce effectively unlimited wealth, how will we organize the economy? What policies will we need to implement? How will people cope? We need to start worrying about this now! When I think about what would happen if I won the lottery, my mind does not go to worrying about the downsides.
I think the overwhelming majority of people who express views on this topic are true believers. I think they are sincere. I would only be willing to accuse someone of possibly doing something underhanded if, independently, they had a track record of deceptive behaviour. (Sam Altman has such a track record, and generally I don’t believe anything he says anymore. I have no way of knowing what’s sincere, what’s a lie, and what’s something he’s convinced himself of because it suits him to believe it.) I think the specific accusation that AI safety/AI alignment is a deliberate, conscious lie cooked up to juice AI investment is silly. It’s probably true, though, that people at AI companies have some counterintuitive incentive or bias toward talking up AI doom fears.
However, my general point is that just as it’s silly to accuse AI safety/alignment people of being shills for AI companies, it also seems silly to me to say that AI companies (or “AI accelerationist” companies, which is effectively all major AI companies and almost all startups) are the equivalent of tobacco companies, and you shouldn’t pay attention to what people at AI companies say about AI. Motivated reasoning accusations made on thin grounds can put you into a deluded bubble (e.g. becoming a Marxist) and I don’t think AI is some clear-cut, exceptional case like tobacco or state propaganda where obviously you should ignore the message.
Wow, I’m mind blown that Yudkowsky pays himself that much. If only because it leaves him open to criticisms like these. I still don’t think the financial incentives are as strong as for people starting an accelerationist company, but it’s a fair point.
I think the strength of the incentives to behave in a given way is more proportional to the resulting expected increase in welfare than to the expected increase in net earnings. Individual human welfare is often assumed to be proportional to the logarithm of personal consumption. So a given increase in earnings increases welfare less for people earning more. In addition, a 1 % chance of earning 100 times more (for example, due to one’s company being successful) increases welfare less than a 100 % chance of earning 100 % more. More importantly, there are major non-financial benefits for Yudkowsky, who is somewhat seen as a prophet in some circles.
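As a minimal sketch of that log-utility comparison (the baseline consumption figure below is arbitrary and purely illustrative):

```python
import math

baseline = 100_000  # arbitrary baseline annual consumption, in dollars

def welfare(consumption):
    # Assumes individual welfare is proportional to log(consumption),
    # as in the comment above.
    return math.log(consumption)

# A 1 % chance of earning 100 times more vs. a guaranteed 100 % raise.
gamble = 0.01 * welfare(100 * baseline) + 0.99 * welfare(baseline)
sure_double = welfare(2 * baseline)

print(round(gamble - welfare(baseline), 3))       # ~0.046 extra expected welfare
print(round(sure_double - welfare(baseline), 3))  # ~0.693 extra welfare
```

Under log utility, the sure doubling adds far more expected welfare than the 1 % shot at a 100x payoff, which is the point about startup-style payoffs above.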
The reason Eliezer gets paid so much is because a donor specifically requested it. The express purpose of the donation was to make Eliezer rich enough that he could afford to say “actually AI risk isn’t a big deal” and shut down MIRI without putting himself in a difficult financial situation.
(I don’t know about Nate’s salary but $235K looks pretty reasonable to me? That’s less than a mid-level software engineer makes.)
Edit Feb 2: Apparently the donation I was thinking of is separate from Eliezer’s salary, see his comment
I’m not sure how they decide on what salaries to pay themselves. But the reason they have the money to pay themselves those salaries in the first place is that MIRI’s donors believe there’s a significant chance of AI destroying the world within the next 5-20 years and that MIRI (especially Yudkowsky) is uniquely positioned to prevent this from happening.
The financial basis for motivated reasoning is arguably even stronger in MIRI’s case than in Mechanize’s case. The kind of work MIRI is doing and the kind of experience Yudkowsky and Soares have isn’t really transferable to anything else.
It is somewhat difficult to react to this level of absolutely incredible nonsense politely, but I’ll try.
I disagree with both Yudkowsky and Soares about many things, but very obviously their direct experience with thinking and working with existing AIs would be worth > $1M pa if evaluated anonymously based on understanding SOTA AIs, and likely >$10s M pa if they worked on capabilities.
For the companies racing to AGI, Y&S endorsing some effort as good would likely have something between billions and tens of billions of dollars in value.
“very obviously their direct experience with thinking and working with existing AIs would be worth > $1M pa if evaluated anonymously based on understanding SOTA AIs, and likely >$10s M pa if they worked on capabilities.”
“Y&S endorsing some effort as good would likely have something between billions and tens of billions of dollars in value.”
fwiw both of these claims strike me as close to nonsense, so I don’t think this is a helpful reaction.
If you ask the AIs they get numbers in the tens of millions to tens of billions range, with around 1 billion being the central estimate. (I haven’t extensively controlled for the effect and some calculations appear driven by narrative)
Personally I find it hard to judge and tend to lean no when trying to think it through, but it’s not obviously nonsense.
I agree with Ben Stewart’s response that this is not a helpful thing to say. You are making some very strange and unintuitive claims. I can’t imagine how you would persuade a reasonable, skeptical, well-informed person outside the EA/LessWrong (or adjacent) bubble that these are credible claims, let alone that they are true. (Even within the EA Forum bubble, it seems like significantly more people disagree with you than agree.)
To pick on just one aspect of this claim: it is my understanding that Yudkowsky has no meaningful technical proficiency with deep learning-based or deep reinforcement learning-based AI systems. In my understanding, Yudkowsky lacks the necessary skills and knowledge to perform the role of an entry-level AI capabilities researcher or engineer at any AI company capable of paying multi-million-dollar salaries. If there is evidence that shows my understanding is mistaken, I would like to see that evidence. Otherwise, I can only conclude that you are mistaken.
I think the claim that an endorsement is worth billions or tens of billions is also wrong, but it’s hard to disprove a claim about what would happen in the event of a strange and unlikely hypothetical. Yudkowsky, Soares, and MIRI have an outsized intellectual influence in the EA community (and obviously on LessWrong). There is some meaningful level of influence on the community of people working in the AI industry in the Bay Area, but it’s much less. Among the sort of people who could make decisions that would realize billions or tens of billions in value, namely the top-level executives at AI companies and investors, the influence seems pretty marginal. I would guess the overwhelming majority of investors either don’t know who Yudkowsky and Soares are or do but don’t care what their views are. Top-level executives do know who Yudkowsky is, but in every instance I’ve seen, they tend to be politely disdainful or dismissive toward his views on AGI and AI safety.
Anyway, this seems like a regrettably unproductive and unimportant tangent.
I think it could be a helpful response for people who are able to respond to signals of the type “someone who has demonstrably good forecasting skills, is an expert in the field, and has worked on this for a long time claims X” by at least re-evaluating whether their models make sense and are not missing some important considerations.
If someone is at least able to do that, they can for example ask a friendly AI, and it will tell them, based on conservative estimates and reference classes, that the original claim is likely wrong. It will still miss important considerations—in a way a typical forecaster would also—so the results are underestimates.
I think at the level of [some combination of lack of ability to think and motivated reasoning] when people are uninterested in e.g. sanity checking their thinking with AIs, it is not worth the time correcting them. People are wrong on the internet all the time.
(I think the debate was moderately useful—I made an update from this debate and the voting patterns, broadly in the direction of the EA Forum descending to the level of a random place on the internet where confused people talk about AI and which is broadly not worth reading or engaging with. I’m no longer very active on the EA Forum, but I’ve made some update.)
This thread seems to have gone in an unhelpful direction.
Questioning motivations is a hard point to make well. I’m unwilling to endorse that they are never relevant, but it immediately becomes personal. Keeping the focus primarily on the level of the arguments themselves is an approach more likely to enlighten and less likely to lead to flamewars.
I’m not here to issue a moderation warning to anyone for the conversation ending up on the point of motivations. I do want to take my moderation hat off and suggest that people spend more time on the object level.
I will then put my moderation hat back on and say that this and Jan’s previous comment breaks norms. You can disagree with someone without being this insulting.
I agree the thread direction may be unhelpful, and flame wars are bad.
I disagree, though, about the merits of questioning motivations; I think it’s super important.
In the AI sphere, there are great theoretical arguments on all sides, good arguments for acceleration, caution, pausing, etc. We can discuss these ad nauseam and I do think that’s useful. But I think motivations likely shape the history and current state of AI development more than unmotivated reasoning and rational thought. Money and power are strong motivators—EAs have sidelined them at their peril before. Although we cannot know people’s hearts, we can see and analyse what they have done and said in the past and what motivational pressures might affect them right now.
I also think it’s possible to have a somewhat object-level discussion about motivations.
For the companies racing to AGI, Y&S endorsing some effort as good would likely have something between billions and tens of billions of dollars in value.
Are you open to bets about this? I would be happy to bet 10 k$ that Anthropic would not pay e.g. 3 billion $ for Yudkowsky and Soares to endorse their last model as good. We could ask the marketing team at Anthropic or marketing experts elsewhere. I am not officially proposing a bet just yet. We would have to agree on a concrete operationalisation.
This doesn’t seem to be a reasonable way to operationalize. It would create much less value for the company if it was clear that they were being paid for endorsing them. And I highly doubt Amodei would be in a position to admit that they’d want such an endorsement even if it indeed benefitted them.
Thanks for the good point, Nick. I still suspect Anthropic would not pay e.g. 3 billion $ for Yudkowsky and Soares to endorse their last model as good if they were hypothetically being honest. I understand this is difficult to operationalise, but it could still be asked to people outside Anthropic.
The operationalisation you propose does not make any sense; Yudkowsky and Soares do not claim ChatGPT 5.2 will kill everyone or anything like that.
What about this:
MIRI approaches [a lab] with this offer: we have made some breakthrough in the ability to verify whether the way you are training AIs leads to misalignment in the way we are worried about. Unfortunately the way to verify this requires a lot of computation (i.e. something like ARC), so it is expensive. We expect your whole training setup will pass this, but we will need $3B from you to run it; if our test works, we will declare that your lab has solved the technical part of AI alignment we were most worried about, and make some arguments which we expect will convince many people who listen to our views.
Or this: MIRI discusses stuff with xAI or Meta and convinces themselves their—secret—plan is by far the best chance humanity has, and everyone ML/AI smart and conscious should stop whatever they are doing and join them.
(Obviously these are also unrealistic / assume something like some lab coming up with some plan which could even hypothetically work)
Thanks, Jan. I think it is very unlikely that AI companies with frontier models will seek the technical assistance of MIRI in the way you described in your 1st operationalisation. So I believe a bet which would only resolve in this case has very little value. I am open to bets against short AI timelines, or what they supposedly imply, up to 10 k$. Do you see any bet we could make that is good for both of us under our own views, considering we could invest our money and you could take loans?
I was considering hypothetical scenarios of the type “imagine this offer from MIRI arrived, would a lab accept”; clearly MIRI is not making the offer, because the labs don’t have good alignment plans and MIRI is obviously high-integrity enough not to be corrupted by relatively tiny incentives like $3b.
I would guess there are ways to operationalise the hypotheticals, and to try to have, for example, Dan Hendrycks guess what xAI would do, given that he is an advisor.
With your bets about timelines—I did an 8:1 bet with Daniel Kokotajlo against AI 2027 being as accurate as his previous forecast, so I am not sure which side of “confident about short timelines” you expect I should take. I’m happy to bet on some operationalization of your overall thinking and posting about the topic of AGI being bad, e.g. something like “3 smartest available AIs in 2035 compare everything we wrote in 2026 on EAF, LW and Twitter about AI and judge who was more confused, overconfident and miscalibrated”.
I was considering hypothetical scenarios of the type “imagine this offer from MIRI arrived, would a lab accept”
When would the offer from MIRI arrive in the hypothetical scenario? I am sceptical of an honest endorsement from MIRI today being worth 3 billion $, but I do not have a good sense of what MIRI will look like in the future. I would also agree a foolproof AI safety certification is or will be worth more than 3 billion $, depending on how it is defined.
With your bets about timelines—I did an 8:1 bet with Daniel Kokotajlo against AI 2027 being as accurate as his previous forecast, so I am not sure which side of “confident about short timelines” you expect I should take.
I was guessing I would have longer timelines. What is your median date of superintelligent AI as defined by Metaculus?
It’s not endorsing a specific model for marketing reasons; it’s about endorsing the effort, overall.
Given that Meta is willing to pay billions of dollars for people to join them, and that many people don’t work on AI capabilities (or work, e.g., at Anthropic, as a lesser evil) because they share E&S’s concerns, an endorsement from E&S would have value in the billions to tens of billions simply because of the talent that you can get as a result of this.
Meta is paying billions of dollars to recruit people with proven experience at developing relevant AI models.
Does the set of “people with proven experience in building AI models” overlap with “people who defer to Eliezer on whether AI is safe” at all? I doubt it.
Indeed, given that Yudkowsky’s arguments on AI are not universally admired, and that people who have chosen to build the thing he says will make everybody die as their career are particularly likely to be sceptical about his convictions on that issue, an endorsement might even be net negative.
Thanks for the comment, Mikhail. Gemini 3 estimates a total annualised compensation of the people working at Meta Superintelligence Labs (MSL) of 4.4 billion $. If an endorsement from Yudkowsky and Soares was as beneficial (including via bringing in new people) as making 10 % of people there 10 % more impactful over 10 years, it would be worth 440 M$ (= 0.10*0.10*10*4.4*10^9).
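Here is a minimal sketch of that Fermi estimate, taking the 4.4 billion $ annualised compensation figure as given (it is itself a rough model-generated guess, not verified data):

```python
total_annual_compensation = 4.4e9  # assumed annualised MSL compensation, in dollars
share_of_staff = 0.10              # fraction of people made more impactful
impact_boost = 0.10                # how much more impactful they become
years = 10

endorsement_value = share_of_staff * impact_boost * years * total_annual_compensation
print(f"${endorsement_value:,.0f}")  # ~$440,000,000
```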
You could imagine a Yudkowsky endorsement (say, with the narrative that Zuck talked to him and admits he went about it all wrong and is finally taking the issue seriously, just to entertain the counterfactual...) raising Meta AI from “nobody serious wants to work there and they can only get talent by paying exorbitant prices” to “they finally have access to serious talent and can get a critical mass of people to do serious work”. This’d arguably be more valuable than whatever they’re doing now.
I think your answer to the question of how much an endorsement would be worth mostly depends on some specific intuitions that I imagine Kulveit has for good reasons but most people don’t, so it’s a bit hard to argue about it. It also doesn’t help that in every other case than Anthropic and maybe deepmind it’d also require some weird hypotheticals to even entertain the possibility.
One should stick to the original point that raised the question about salary.
1. Is $600K a lot of money for most people, and does EY hurt his cause by accepting this much? (Perhaps, but that was not the original issue.)
2. Does EY earning $600K mean he’s benefitting substantially from maintaining his position on AI safety? E.g. if he were more pro AI development, would this hurt him financially? (Very unlikely IMO, and that was the context Thomas was responding to.)
On a global scale I agree. My point is more that due to the salary standards in the industry, Eliezer isn’t necessarily out of line in drawing $600k, and it’s probably not much more than he could earn elsewhere; therefore the financial incentive is fairly weak compared to that of Mechanize or other AI capabilities companies.
Thanks for the reply. I agree with your specific point but I think it’s worth being more careful with your phrasing. How much we earn is an ethically-charged thing, and it’s not a good thing if EA’s relationship with AI companies gives us a permission structure to lose sight of this.
Edit: to be clear, I agree that “it’s probably not much more than he could earn elsewhere” but disagree that “Eliezer isn’t necessarily out of line in drawing $600k”
It’s true Mechanize are trying to hire him for 650k...
If Mechanize succeeds in its long-term goal of “the automation of all valuable work in the economy”, then everyone on Earth will be a billionaire.
Global wealth would have to increase a lot for everyone to become a billionaire. There are 10 billion people. So everyone being a billionaire would require a global wealth of 10^19 $ (= 10*10^9*1*10^9) for perfect distribution. Global wealth is 600 T$. So it would have to become 16.7 k (= 10^19/(600*10^12)) times as large. For a growth of 10 %/year, it would take 102 years (= LN(16.7*10^3)/LN(1 + 0.10)). For a growth of 30 %/year, it would take 37.1 years (= LN(16.7*10^3)/LN(1 + 0.30)).
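For anyone who wants to check these numbers, here is a minimal sketch of the arithmetic above, assuming the same 10 billion people and 600 T$ of global wealth used in the comment:

```python
import math

people = 10e9            # assumed population of 10 billion, as in the comment above
per_person = 1e9         # everyone holding $1 billion
current_wealth = 600e12  # assumed global wealth of $600 trillion

required = people * per_person     # 1e19 $
ratio = required / current_wealth  # ~16,700x

for growth in (0.10, 0.30):
    years = math.log(ratio) / math.log(1 + growth)
    print(f"At {growth:.0%}/year growth: {years:.1f} years")
# Prints roughly 102.0 years and 37.1 years, matching the figures above.
```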
I think the claim that Yudkowsky’s views on AI risk are meaningfully influenced by money is very weak. My guess is that he could easily find another opportunity unrelated to AI risk to make $600k per year if he searched even moderately hard.
The claim that my views are influenced by money is more plausible because I stand to profit far more than Yudkowsky stands to profit from his views. However, while perhaps plausible from the outside, this claim does not match my personal experience. I developed my core views about AI risk before I came into a position to profit much from them. This is indicated by the hundreds of comments, tweets, in-person arguments, DMs, and posts from at least 2023 onward in which I expressed skepticism about AI risk arguments and AI pause proposals. As far as I remember, I had no intention to start an AI company until very shortly before the creation of Mechanize. Moreover, if I was engaging in motivated reasoning, I could have just stayed silent about my views. Alternatively, I could have started a safety-branded company that nonetheless engages in capabilities research—like many of the ones that already exist.
It seems implausible that spending my time writing articles advocating for AI acceleration is the most selfishly profitable use of my time. The direct impact of the time I spend building Mechanize is probably going to have a far stronger effect on my personal net worth than writing a blog post about AI doom. However, while I do not think writing articles like this one is very profitable for me personally, I do think it is helpful for the world because I see myself as providing a unique perspective on AI risk that is available almost nowhere else. As far as I can tell, I am one of only a very small number of people in the world who have both engaged deeply with the arguments for AI risk and yet actively and explicitly work toward accelerating AI.
In general, I think people overestimate how much money influences people’s views about these things. It seems clear to me that people are influenced far more by peer effects and incentives from the social group they reside in. As a comparison, there are many billionaires who advocate for tax increases, or vote for politicians who support tax increases. This actually makes sense when you realize that merely advocating or voting for a particular policy is very unlikely to create change that meaningfully impacts you personally. Bryan Caplan has discussed this logic in the context of arguments about incentives under democracy, and I generally find his arguments compelling.
To be clear, I agree. I also agree with your general point that other factors are often more important than money. Some of these factors include the allure of millennialism, or the allure of any sort of totalizing worldview or “ideology”.
I was trying to make a general point against accusations of motivated reasoning related to money, at least in this context. If two sets of people are each getting paid to work on opposite sides of an issue, why only accuse one side of motivated reasoning?
Thanks for describing this history. Evidence of a similar kind lends strong credence to Yudkowsky having formed his views independently of the influence of money as well.
My general view is that reasoning is complex, motivation is complex, people’s real psychology is complex, and that the forum-like behaviour of accusing someone of engaging in X bias is probably a misguided pop science simplification of the relevant scientific knowledge. For instance, when people engage in distorted thinking, the actual underlying reasoning often seems to be a surprisingly complicated multi-step sequence.
The essay above that you co-wrote is incredibly strong. I was the one who originally sent it to Vasco and, since he is a prolific cross-poster and I don’t like to cross-post under my name, encouraged him to cross-post it. I’m glad more people in the EA community have now read it. I think everyone in the EA community should read it. It’s regrettable that there’s only been one object-level comment on the substance of the essay so far, and so many comments about this (to me) relatively uninteresting and unimportant side point about money biasing people’s beliefs. I hope more people will comment on the substance of the essay at some point.
Thanks for this comment!
I think your arguments about your own motivated reasoning are somewhat moot, since they seem more like an explanation that your behavior/public-facing communication isn’t outright deception (which seems right!). As I see it, motivated reasoning is to a large extent about deceiving yourself and maintaining a coherent self-narrative, so it’s perfectly plausible that one is willing to pay a substantial cost in order to maintain this. (Speaking generally; I’m not very interested in discussing whether you’re doing it in particular.)
Soares was a software engineer at Microsoft and Google before joining MIRI, and would trivially be able to rejoin industry after a few weeks of self-study to earn more money if for some reason he decided he wanted to do that. I won’t argue the point about EY—it seems obvious to me that his market value as a writer/communicator is well in excess of his 2023/2024 compensation, given his track record, but the argument here is less legible. Thankfully it turns out that somebody anticipated the exact same incentive problem and took action to mitigate it.
It’s interesting to claim that money stops being an incentive for people after a certain fixed amount well below $1 million/year. Let’s say that’s true — maybe it is true — then why do we treat people like Sam Altman, Dario Amodei, Elon Musk, and so on as having financial incentives around AI? Are we wrong to do so? (What about AI researchers and engineers who receive multi-million-dollar compensation packages? After the first, say, $5 million, are they free and clear to form unmotivated opinions?)
I think a very similar argument can be made about the Mechanize co-founders. They could make “enough” money doing something else — including their previous jobs — even if it’s less money than they might stand to gain from a successful AI capabilities startup. Should we then rule out money as an incentive?
To be clear, I don’t claim that Eliezer Yudkowsky, Nate Soares, others at MIRI, or the Mechanize co-founders are unduly motivated by money in forming their beliefs. I have no way of knowing that, and since there’s no way to know, I’m willing to give them all the benefit of the doubt. I’m saying I dislike accusations of motivated reasoning in large part because they’re so easy to level at people you disagree with, and it’s easy to overlook how the same argument could apply to yourself or people you agree with. I’m pointing out how a similar accusation could be levelled at Yudkowsky and Soares in order to illustrate this general point, specifically to challenge Nick Laing’s accusation against the Mechanize co-founders above.
I generally think that ideological motivation around AGI is a powerful motivator. I think the psychology around how people form their beliefs on AGI is complex and involves many factors (e.g. millennialist cognitive bias, to name just one).
Where is this claim being made? I think the suggestion was that someone found it desirable to reduce the financial incentive gradient for EY taking any particular public stance, not some vastly general statement like what you’re suggesting.
Personally I don’t think Sam Altman is motivated by money. He just wants to be the one to build it.
I sense that Elon Musk’s and Dario Amodei’s motivations are more complex than “motivated by money”, but I can imagine that the actual dollar amounts are more important to them than to Sam.
I believe this is because a donor specifically requested it. The express purpose of the donation was to make Eliezer rich enough that he could afford to say “actually AI risk isn’t a big deal” and shut down MIRI without putting himself in a difficult financial situation.
Edit Feb 2: Apparently the donation I was thinking of is separate from Eliezer’s salary, see his comment.
Thanks for sharing, Michael. If I was as concerned about AI risk as @EliezerYudkowsky, I would use practically all the additional earnings (e.g. above Nate’s 235 k$/year; in reality I would keep much less) to support efforts to decrease it. I would believe spending more money on personal consumption or investments would just increase AI risk relative to supporting the most cost-effective efforts to decrease it.
A donor wanted to spend their money this way; it would not be fair to the donor for Eliezer to turn around and give the money to someone else. There is a particular theory of change according to which this is the best marginal use of ~$1 million: it gives Eliezer a strong defense against accusations like
I kinda don’t think this was the best use of a million dollars, but I can see the argument for how it might be.
I got a one-time gift of appreciated crypto, not through MIRI, part of whose purpose as I understood it was to give me enough of a savings backstop (having in previous years been not paid very much at all) that I would feel freer to speak my mind or change my mind should the need arise.
I have of course already changed MIRI’s public mission sharply on two occasions, the first being when I realized in 2001 that alignment might need to be a thing, and said so to the primary financial supporter who’d previously supported MIRI (then SIAI) on the premise of charging straight ahead on AI capabilities; the second being in the early 2020s when I declared publicly that I did not think alignment technical work was going to complete in time and MIRI was mostly shifting over to warning the world of that rather than continuing to run workshops. Should I need to pivot a third time, history suggests that I would not be out of a job.
If I had Eliezer’s views about AI risk, I would simply be transparent upfront with the donor, and say I would donate the additional earnings. I think this would ensure fairness. If the donor insisted I had to spend the money on personal consumption, I would turn down the offer if I thought this would result in the donor supporting projects that would decrease AI risk more cost-effectively than my personal consumption. I believe this would be very likely to be the case.
100 percent agree. I was going to write something similar, but this is better.
I generally don’t love “motivated reasoning” arguments, but at the extreme ends, like tobacco companies, government propaganda, and AI accelerationist companies, I’m happy to put that out there. Especially in a field like AI safety, which is so speculative anyway. In general I don’t think we should give too much airtime to people who have enormous personal financial gains at stake, especially in a world where money is stronger than rationalism most of the time.
Wow, I’m mind-blown that Yudkowsky pays himself that much. If only because it leaves him open to criticisms like these. I still don’t think the financial incentives are as strong as for people starting an accelerationist company, but it’s a fair point.
And yes, on the alien argument, I was arguing that some previous indications of rogue AI do seem to me somewhat alien.
While motivated reasoning is certainly something to look out for, the substance of the argument should also be taken into account. I believe that the main point of this post, that Yudkowsky and Soares’s book is full of narrative arguments and unfalsifiable hypotheses mostly unsupported by references to external evidence, is obviously true. As you yourself say, OP’s arguments are reasonable. On that background, this kind of attack from you seems unjustified, and I’d like to hear what parts/viewpoints/narratives/conclusions of the post are motivated reasoning in your estimation.
I do agree that motivated reasoning is common among the proponents of AI adoption. As an example, I think the white paper Sparks of Artificial General Intelligence: Early experiments with GPT-4 by Microsoft is clearly a piece of advertising masquerading as a scientific paper. Microsoft has a lot to gain from the commercial success of its partner company OpenAI, and the conclusions it suggests are almost certainly colored by this. The same could be said about many of OpenAI’s own white papers. But this does not mean that the examples or experiments they showcase are wrong per se (even if cherry-picked), or that there is no real information in them. Their results merely need to be read through a skeptical lens.
We should generally be skeptical of corporations (or even non-profits!) releasing pre-prints that look like scientific papers but might not pass peer review at a scientific journal. We should indeed view such pre-prints as somewhere between research and marketing. OpenAI’s pre-prints or white papers are a good example.
I think it’s hard to claim that a pre-print like Sparks of AGI is insincere (it might be, but how could we support that claim?), but this doesn’t undermine the general point. Suppose employees at Microsoft Research wanted to publish a similar report arguing that GPT-4’s seeming cognitive capabilities are actually just a bunch of cheap tricks and not sparks of anything. Would Microsoft publish that report? It’s not just about how financial or job-related incentives shape what you believe (although that is worth thinking about), it’s also about how they shape what you can say out loud. (And, importantly, what you are encouraged to focus on.)
There’s an expert consensus that tobacco is harmful, and there is a well-documented history of tobacco companies engaging in shady tactics. There is also a well-documented history of government propaganda being misleading and deceptive, and if you asked anyone with relevant expertise — historians, political scientists, media experts, whoever — they would certainly tell you that government propaganda is not reliable.
But just lumping “AI accelerationist companies” in with those is not justified. “AI accelerationist” just means anyone who works on making AI systems more capable and doesn’t agree with the AI alignment/AI safety community’s peculiar worldview. In practice, that means you’re saying most people with expertise in AI are compromised and not worth listening to, but you are willing to listen to this weird random group of people, some of whom, like Yudkowsky, have no technical expertise in contemporary AI paradigms (i.e. deep learning and deep reinforcement learning). This seems like a recipe for disaster, like deciding that capitalist economists are all corrupt and that only Marxist philosophers are worth trusting.
A problem with motivated reasoning arguments, when stretched to this extent, is that anyone can accuse anyone on the thinnest pretext. And rather than engaging with people’s views and arguments in any serious, substantive way, it just turns into a lot of finger-pointing.
Yudkowsky’s gotten paid millions of dollars to prophesize AI doom. Many people have argued that AI safety/AI alignment narratives benefit the AI companies and their investors. The argument goes like this: Exaggerating the risks of AI exaggerates AI’s capabilities. Exaggerating AI’s capabilities makes the prospective financial value of AI much higher than it really is. Therefore, talking about AI risk or even AI doom is good business.
I would add that exaggerating risk may be a particularly effective way to exaggerate AI’s capabilities. People tend to be skeptical of anything that sounds like pie-in-the-sky hope or optimism. On the other hand, talking about risk sounds serious and intelligent. Notice what goes unsaid: many near-term AGI believers think there’s a high chance of some unbelievably amazing utopia just on the horizon. How many times have you heard someone imagine that utopia? One? Zero? And how many times have you heard various AI doom or disempowerment stories? Why would no one ever bring up this amazing utopia they think might happen very soon?
Even if you’re very pessimistic and think there’s a 90% chance of AI doom, a 10% chance of utopia is still pretty damn interesting. And many people are much more optimistic, thinking there’s around a 1-30% chance of doom, which implies a 70%+ chance of utopia. So, what gives? Where’s the utopia talk? Even when people talk about the utopian elements of AGI futures, they emphasize the worrying parts: what if intelligent machines produce effectively unlimited wealth, how will we organize the economy? What policies will we need to implement? How will people cope? We need to start worrying about this now! When I think about what would happen if I won the lottery, my mind does not go to worrying about the downsides.
I think the overwhelming majority of people who express views on this topic are true believers. I think they are sincere. I would only be willing to accuse someone of possibly doing something underhanded if, independently, they had a track record of deceptive behaviour. (Sam Altman has such a track record, and generally I don’t believe anything he says anymore. I have no way of knowing what’s sincere, what’s a lie, and what’s something he’s convinced himself of because it suits him to believe it.) I think the specific accusation that AI safety/AI alignment is a deliberate, conscious lie cooked up to juice AI investment is silly. It’s probably true, though, that people at AI companies have some counterintuitive incentive or bias toward talking up AI doom fears.
However, my general point is that just as it’s silly to accuse AI safety/alignment people of being shills for AI companies, it also seems silly to me to say that AI companies (or “AI accelerationist” companies, which is effectively all major AI companies and almost all startups) are the equivalent of tobacco companies, and you shouldn’t pay attention to what people at AI companies say about AI. Motivated reasoning accusations made on thin grounds can put you into a deluded bubble (e.g. becoming a Marxist) and I don’t think AI is some clear-cut, exceptional case like tobacco or state propaganda where obviously you should ignore the message.
I think the strength of the incentives to behave in a given way is more proportional to the resulting expected increase in welfare than to the expected increase in net earnings. Individual human welfare is often assumed to be proportional to the logarithm of personal consumption. So a given increase in earnings increases welfare less for people earning more. In addition, a 1 % chance of earning 100 times more (for example, due to one’s company being successful) increases welfare less than a 100 % chance of earning 100 % more. More importantly, there are major non-financial benefits for Yudkowsky, who is somewhat seen as a prophet in some circles.
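As a rough illustration of that log-utility point (a sketch under the standard assumption that welfare is the natural log of consumption, with baseline consumption normalised to 1):

```python
import math

# Assume u(c) = ln(c), with baseline consumption normalised to 1, so u(1) = 0.
certain_double = math.log(2)             # 100% chance of earning 100% more (2x)
risky_windfall = 0.01 * math.log(100)    # 1% chance of earning 100 times as much

print(round(certain_double, 3))  # 0.693
print(round(risky_windfall, 3))  # 0.046
# Under log utility the sure doubling is worth far more than the long-shot windfall,
# even though the two gambles have roughly the same expected dollar gain.
```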
Why are they paid so much?
Copying from my other comment:
The reason Eliezer gets paid so much is that a donor specifically requested it. The express purpose of the donation was to make Eliezer rich enough that he could afford to say “actually AI risk isn’t a big deal” and shut down MIRI without putting himself in a difficult financial situation.
(I don’t know about Nate’s salary, but $235K looks pretty reasonable to me? That’s less than a mid-level software engineer makes.)
Edit Feb 2: Apparently the donation I was thinking of is separate from Eliezer’s salary, see his comment
I’m not sure how they decide on what salaries to pay themselves. But the reason they have the money to pay themselves those salaries in the first place is that MIRI’s donors believe there’s a significant chance of AI destroying the world within the next 5-20 years and that MIRI (especially Yudkowsky) is uniquely positioned to prevent this from happening.
It is somewhat difficult to react to this level of absolutely incredible nonsense politely, but I’ll try.
I disagree with both Yudkowsky and Soares about many things, but very obviously their direct experience with thinking and working with existing AIs would be worth > $1M pa if evaluated anonymously based on understanding SOTA AIs, and likely >$10s M pa if they worked on capabilities.
For the companies racing to AGI, Y&S endorsing some effort as good would likely have something between billions $ to tens of billions $ value.
“very obviously their direct experience with thinking and working with existing AIs would be worth > $1M pa if evaluated anonymously based on understanding SOTA AIs, and likely >$10s M pa if they worked on capabilities.”
“Y&S endorsing some effort as good would likely have something between billions $ to tens of billions $ value.”
fwiw both of these claims strike me as close to nonsense, so I don’t think this is a helpful reaction.
If you ask the AIs, they give numbers in the tens of millions to tens of billions range, with around 1 billion being the central estimate. (I haven’t extensively controlled for the effect, and some calculations appear driven by narrative.)
Personally I find it hard to judge and tend to lean no when trying to think it through, but it’s not obviously nonsense.
I agree with Ben Stewart’s response that this is not a helpful thing to say. You are making some very strange and unintuitive claims. I can’t imagine how you would persuade a reasonable, skeptical, well-informed person outside the EA/LessWrong (or adjacent) bubble that these are credible claims, let alone that they are true. (Even within the EA Forum bubble, it seems like significantly more people disagree with you than agree.)
To pick on just one aspect of this claim: it is my understanding that Yudkowsky has no meaningful technical proficiency with deep learning-based or deep reinforcement learning-based AI systems. In my understanding, Yudkowsky lacks the necessary skills and knowledge to perform the role of an entry-level AI capabilities researcher or engineer at any AI company capable of paying multi-million-dollar salaries. If there is evidence that shows my understanding is mistaken, I would like to see that evidence. Otherwise, I can only conclude that you are mistaken.
I think the claim that an endorsement is worth billions of dollars is also wrong, but it’s hard to disprove a claim about what would happen in the event of a strange and unlikely hypothetical. Yudkowsky, Soares, and MIRI have an outsized intellectual influence in the EA community (and obviously on LessWrong). There is some meaningful level of influence on the community of people working in the AI industry in the Bay Area, but it’s much less. Among the sort of people who could make decisions that would realize billions or tens of billions in value, namely the top-level executives at AI companies and investors, the influence seems pretty marginal. I would guess the overwhelming majority of investors either don’t know who Yudkowsky and Soares are or do but don’t care what their views are. Top-level executives do know who Yudkowsky is, but in every instance I’ve seen, they tend to be politely disdainful or dismissive toward his views on AGI and AI safety.
Anyway, this seems like a regrettably unproductive and unimportant tangent.
I think it could be a helpful response for people who are able to respond to signals of the type “someone who has demonstrably good forecasting skills, is an expert in the field, and has worked on this for a long time claims X” by at least re-evaluating whether their models make sense and are not missing some important considerations.
If someone is at least able to do that, they can, for example, ask a friendly AI, and it will tell them, based on conservative estimates and reference classes, that the original claim is likely wrong. It will still miss important considerations, as a typical forecaster would, so the results are underestimates.
I think at the level of [some combination of lack of ability to think and motivated reasoning], when people are uninterested in e.g. sanity-checking their thinking with AIs, it is not worth the time correcting them. People are wrong on the internet all the time.
(I think the debate was moderately useful—I made an update from this debate and the voting patterns, broadly in the direction of the EA Forum descending to the level of a random place on the internet where confused people talk about AI, which is broadly not worth reading or engaging with. I’m no longer that active on EAF, but I’ve made some update.)
This thread seems to have gone in an unhelpful direction.
Questioning motivations is a hard point to make well. I’m unwilling to endorse that they are never relevant, but it immediately becomes personal. Keeping the focus primarily on the level of the arguments themselves is an approach more likely to enlighten and less likely to lead to flamewars.
I’m not here to issue a moderation warning to anyone for the conversation ending up on the point of motivations. I do want to take my moderation hat off and suggest that people spend more time on the object level.
I will then put my moderation hat back on and say that this and Jan’s previous comment break norms. You can disagree with someone without being this insulting.
I agree the thread direction may be unhelpful, and flame wars are bad.
I disagree, though, about the merits of questioning motivations; I think it’s super important.
In the AI sphere, there are great theoretical arguments on all sides: good arguments for acceleration, caution, pausing, etc. We can discuss these ad nauseam and I do think that’s useful. But I think motivations likely shape the history and current state of AI development more than unmotivated reasoning and rational thought. Money and power are strong motivators—EAs have sidelined them at their peril before. Although we cannot know people’s hearts, we can see and analyse what they have done and said in the past and what motivational pressures might affect them right now.
I also think it’s possible to have a somewhat object-level discussion about motivations.
I think this article on the history of modern AI outlines some of this well: https://substack.com/home/post/p-185759007
I might write more about this later...
Hi Jan.
Are you open to bets about this? I would be happy to bet 10 k$ that Anthropic would not pay e.g. 3 billion $ for Yudkowsky and Soares to endorse their last model as good. We could ask the marketing team at Anthropic or marketing experts elsewhere. I am not officially proposing a bet just yet. We would have to agree on a concrete operationalisation.
This doesn’t seem to be a reasonable way to operationalize. It would create much less value for the company if it was clear that they were being paid for endorsing them. And I highly doubt Amodei would be in a position to admit that they’d want such an endorsement even if it indeed benefitted them.
Thanks for the good point, Nick. I still suspect Anthropic would not pay e.g. 3 billion $ for Yudkowsky and Soares to endorse their last model as good if they were hypothetically being honest. I understand this is difficult to operationalise, but it could still be asked to people outside Anthropic.
The operationalisation you propose does not make any sense; Yudkowsky and Soares do not claim ChatGPT 5.2 will kill everyone or anything like that.
What about this:
MIRI approaches [a lab] with this offer: we have made some breakthrough in our ability to verify whether the way you are training AIs leads to misalignment of the kind we are worried about. Unfortunately, the verification requires a lot of computation (i.e. something like ARC), so it is expensive. We expect your whole training setup will pass this, but we will need $3B from you to run it; if our test works out, we will declare that your lab has solved the technical part of AI alignment we were most worried about, together with arguments we expect to convince many people who listen to our views.
Or this: MIRI discusses stuff with xAI or Meta and convinces themselves their—secret—plan is by far the best chance humanity has, and everyone ML/AI smart and conscious should stop whatever they are doing and join them.
(Obviously these are also unrealistic / assume something like some lab coming up with a plan that could even hypothetically work.)
Thanks, Jan. I think it is very unlikely that AI companies with frontier models will seek the technical assistance of MIRI in the way you described in your 1st operationalisation. So I believe a bet which would only resolve in this case has very little value. I am open to bets against short AI timelines, or what they supposedly imply, of up to 10 k$. Do you see any bet we could make that would be good for both of us under our own views, considering that we could otherwise invest our money and that you could take out loans?
I was considering hypothetical scenarios of the type “imagine this offer from MIRI arrived, would a lab accept?”; clearly MIRI is not making the offer, because the labs don’t have good alignment plans and MIRI are obviously high-integrity enough not to be corrupted by relatively tiny incentives like $3B.
I would guess there are ways to operationalise the hypotheticals, for example by having Dan Hendrycks guess what xAI would do, given that he is an advisor there.
With your bets about timelines—I did an 8:1 bet with Daniel Kokotajlo against AI 2027 being as accurate as his previous forecast, so I’m not sure which side of the “confident about short timelines” bet you expect me to take. I’m happy to bet on some operationalization of your overall thinking and posting about the topic of AGI being bad, e.g. something like “the 3 smartest available AIs in 2035 compare everything we wrote in 2026 on EAF, LW and Twitter about AI and judge who was more confused, overconfident and miscalibrated”.
When would the offer from MIRI arrive in the hypothetical scenario? I am sceptical of an honest endorsement from MIRI today being worth 3 billion $, but I do not have a good sense of what MIRI will look like in the future. I would also agree a full-proof AI safety certification is or will be worth more than 3 billion $ depending on how it is defined.
I was guessing I would have longer timelines. What is your median date of superintelligent AI as defined by Metaculus?
It’s not endorsing a specific model for marketing reasons; it’s about endorsing the effort, overall.
Given that Meta is willing to pay billions of dollars for people to join them, and that many people don’t work on AI capabilities (or work, e.g., at Anthropic, as a lesser evil) because they share E&S’s concerns, an endorsement from E&S would have a value in the billions to tens of billions simply because of the talent you could get as a result.
Meta is paying billions of dollars to recruit people with proven experience at developing relevant AI models.
Does the set of “people with proven experience in building AI models” overlap with “people who defer to Eliezer on whether AI is safe” at all? I doubt it.
Indeed, given that Yudkowsky’s arguments on AI are not universally admired, and that people who have chosen, as their career, to build the thing he says will make everybody die are particularly likely to be sceptical of his convictions on that issue, an endorsement might even be net negative.
Thanks for the comment, Mikhail. Gemini 3 estimates a total annualised compensation of the people working at Meta Superintelligence Labs (MSL) of 4.4 billion $. If an endorsement from Yudkowsky and Soares was as beneficial (including via bringing in new people) as making 10 % of people there 10 % more impactful over 10 years, it would be worth 440 M$ (= 0.10*0.10*10*4.4*10^9).
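Here is a minimal sketch of that back-of-the-envelope estimate, so the sensitivity to the assumptions is easy to vary; the 4.4 billion $ payroll figure and the 10 %/10 %/10-year inputs are the ones from the comment above, and the helper function is just for illustration:

```python
def endorsement_value(total_annual_comp, frac_affected, impact_boost, years):
    # Hypothetical back-of-the-envelope: proxy the value of an endorsement as the
    # share of payroll made more impactful, times the boost, times how long it lasts.
    return total_annual_comp * frac_affected * impact_boost * years

print(endorsement_value(4.4e9, 0.10, 0.10, 10))  # 440000000.0, i.e. 440 M$
```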
You could imagine a Yudkowsky endorsement (say, with the narrative that Zuck talked to him, admits he went about it all wrong, and is finally taking the issue seriously, just to entertain the counterfactual...) raising Meta AI from “nobody serious wants to work there and they can only get talent by paying exorbitant prices” to “they finally have access to serious talent and can get a critical mass of people to do serious work”. This’d arguably be more valuable than whatever they’re doing now.
I think your answer to the question of how much an endorsement would be worth mostly depends on some specific intuitions that I imagine Kulveit has for good reasons but most people don’t, so it’s a bit hard to argue about. It also doesn’t help that in every case other than Anthropic and maybe DeepMind it would require some weird hypotheticals to even entertain the possibility.