Evolutionary psychology professor, author of ‘The Mating Mind’, ‘Spent’, ‘Mate’, & ‘Virtue Signaling’. B.A. Columbia; Ph.D. Stanford. My research has focused on human cognition, machine learning, mate choice, intelligence, genetics, emotions, mental health, and moral virtues. Interested in longtermism, X risk, longevity, pronatalism, population ethics, AGI, China, crypto.
Geoffrey Miller
Whatever people think about this particular reply by Nonlinear, I hope it’s clear to most EAs that Ben Pace could have done a much better job fact-checking his allegations against Nonlinear, and in getting their side of the story.
In my comment on Ben Pace’s original post 3 months ago, I argued that EAs & Rationalists are not typically trained as investigative journalists, and we should be very careful when we try to do investigative journalism—an epistemically and ethically very complex and challenging profession, which typically requires years of training and experience—including many experiences of getting taken in by individuals and allegations that seemed credible at first, but that proved, on further investigation, to have been false, exaggerated, incoherent, and/or vengeful.
EAs pride ourselves on our skepticism and our epistemic standards when we’re identifying large-scope, neglected, tractable cause areas to support, and when we’re evaluating different policies and interventions to promote sentient well-being. But those EA skills overlap very little with the kinds of investigative journalism skills required to figure out who’s really telling the truth, in contexts involving disgruntled ex-employees versus their former managers and colleagues.
EA epistemics are well suited to the domains of science and policy. We’re often not as savvy when it comes to interpersonal relationships and human psychology—which is the relevant domain here.
In my opinion, Mr. Pace did a rather poor job of playing the investigative journalism role, insofar as most of the facts and claims and perspectives posted by Kat Woods here were not even included or addressed by Ben Pace.
I think in the future, EAs making serious allegations about particular individuals or organizations should be held to a pretty high standard of doing their due diligence, fact-checking their claims with all relevant parties, showing patience and maturity before publishing their investigations, and expecting that they will be held accountable for any serious errors and omissions that they make.
Nathan—thanks for sharing the Time article excerpts, and for trying to promote a constructive and rational discussion.
For now, I don’t want to address any of the specific issues around SBF, FTX, or EA leadership. I just want to make a meta-comment about the mainstream media’s feeding frenzy around EA, and its apparently relentless attempts to discredit EA.
There’s a classic social/moral psychology of ‘comeuppance’ going on here: any ‘moral activists’ who promote new and higher moral standards (such as the EA movement) can make ordinary folks (including journalists) feel uncomfortable, resentful, and inadequate. This can lead to a public eagerness to detect any forms of moral hypocrisy, moral failings, or bad behavior in the moral activist groups. If any such moral failings are detected, they get eagerly embraced, shared, signal-amplified, and taken as gospel. This makes it easier to dismiss the moral activists’ legitimate moral innovations (e.g. focusing on scope-sensitivity, tractability, neglectedness, longtermism), and allows a quick, easy return to the status quo ante (e.g. national partisan politics + scope-insensitive charity as usual).
We see this ‘psychology of comeuppance’ in the delight that mainstream media took when televangelists who acted greedy, lustful, and/or mendacious suffered various falls from grace over the last few decades. We see it in the media’s focus on the (relatively minor) moral mis-steps and mis-statements of ‘enemy politicians’ (i.e. those in whatever party the journalists don’t like), compared to the (relatively major) moral harms done by bad government policies. We see it throughout cancel culture, which is basically the psychology of comeuppance weaponized through social media to attack ideological enemies.
I’m not positing an organized conspiracy among mainstream journalists to smear EA. Rather, I’m pointing out a widespread human psychological propensity to take delight in any moral failings of any activist groups that make people feel morally inadequate. This propensity may be especially strong among journalists, since it motivates a lot of their investigative reporting (sometimes in the legitimate public interest, sometimes not).
I think it’s useful to recognize the ‘comeuppance psychology’ when it’s happening, because it often overshoots, and amplifies moderately bad moral errors into looking like they’re super-bad moral errors. When a lot of credible, influential media sources are all piling onto a moral activist group (like EA), it can be extremely stressful, dispiriting, and toxic for the group. It can lead the group to doubt their own valid ideas and values, to collapse into schisms and recriminations, to over-correct its internal moral norms in an overly puritanical direction, and to ostracize formerly valued leaders and colleagues.
I’ve seen EA do a lot of soul-searching over the last few months. Some of it has been useful, valid, and constructive. Some of it has been self-flagellating, guilt-stricken, and counter-productive. I think we should take the Time article seriously, learn what we can from it, and update some of our views of issues and people. But I think our reactions should be tempered and contextualized by understanding that the media’s ‘comeuppance psychology’ can also lead to hasty, reactive, over-corrections.
A moral backlash against AI will probably slow down AGI development
Leopold—my key question here would be, if the OpenAI Preparedness team concluded in a year or two that the best way to mitigate AGI risk would be for OpenAI to simply stop doing AGI research, would anyone in OpenAI senior management actually listen to them, and stop doing AGI research?
If not, this could end up being just another example of corporate ‘safety-washing’, where the company has already decided what they’re actually going to do, and the safety team is just along for the ride.
I’d value your candid view on this; I can’t actually tell if there are any conditions under which OpenAI would decide that what they’ve been doing is reckless and evil, and they should just stop.
Rohit—if you don’t believe in epistemic integrity regarding controversial views that are socially stigmatized, you don’t actually believe in epistemic integrity.
You threw in some empirical claims about intelligence research, e.g. ‘There’s plenty of well reviewed science in the field that demonstrates that, varyingly, there are issues with measurements of both race and intelligence, much less how they evolve over time, catch up speeds, and a truly dizzying array of confounders.’
OK. Ask yourself the standard epistemic integrity checks: What evidence would convince you to change your mind about these claims? Can you steel-man the opposite position? Are you applying the scout mindset to this issue? What were your Bayesian priors about this issue, and why did you have those priors, and what would update you?
It’s OK for EAs to see a highly controversial area (like intelligence research), to acknowledge that learning more about it might be a socially handicapping infohazard, and to make a strategic decision not to touch the issue with a 10-foot-pole—i.e. to learn nothing more about it, to say nothing about it, and if asked about it, to respond ‘I haven’t studied this issue in enough depth to offer an informed judgment about it.’
What’s not OK is for EAs to suddenly abandon all rationality principles and epistemic integrity principles, and to offer empirically unsupported claims and third-hand critiques of a research area (that were debunked decades ago), just because there are high social costs to holding the opposite position.
It’s honestly not that hard to adopt the 10-foot-pole strategy regarding intelligence research controversies—and maybe that would be appropriate for most EAs, most of the time.
You just have to explain to people ‘Look, I’m not an intelligence research expert. But I know enough to understand that any informed view on this matter would require learning all about psychometric measurement theory, item response theory, hierarchical factor analysis, the g factor, factorial invariance across groups, evolutionary cognitive psychology, evolutionary neurogenetics, multivariate behavior genetics, molecular behavior genetics, genome-wide association studies for cognitive abilities, extended family twin designs, transracial adoption studies, and several other fields. I just haven’t put in the time. Have you?’
That kind of response can signal that you’re epistemically humble enough not to pretend to any expertise yourself, but that you know enough about what you don’t know that whoever you’re talking to can’t credibly pretend to expertise they don’t have either.
And, by the way, for any EAs to comment on intelligence research without actually understanding the majority of the topics I mentioned above would be pretty silly: analogous to someone commenting on technical AI alignment issues if they don’t know the difference between an expert system and a deep neural network, or the difference between supervised and reinforcement learning.
Maya—thanks for a thoughtful, considered, balanced, and constructive post.
Regarding the issue that ‘Effective Altruism Has an Emotions Problem’: this is very tricky, insofar as it raises the issue of neurodiversity.
I’ve got Aspergers, and I’m ‘out’ about it (e.g. in this and many other interviews and writings). That means I’m highly systematizing, overly rational (by neurotypical standards), more interested in ideas than in most people, and not always able to understand other people’s emotions, values, or social norms. I’m much stronger on ‘affective empathy’ (feeling distressed by the suffering of others) than on ‘cognitive empathy’ (understanding their beliefs & desires using Theory of Mind).
Let’s be honest. A lot of us in EA have Aspergers, or are ‘on the autism spectrum’. EA is, to a substantial degree, an attempt by neurodivergent people to combine our rational systematizing with our affective empathy—to integrate our heads and our hearts, as they actually work, not as neurotypical people think they should work.
This has led to an EA culture that is incredibly welcoming, supportive, and appreciative of neurodivergent people, and that capitalizes on our distinctive strengths. For those of us who are ‘Aspy’, nerdy, or otherwise eccentric by ‘normie’ standards, EA has been an oasis of rationality in a desert of emotionality, virtue-signaling, hypocrisy, and scope-insensitivity.
Granted, it is often helpful to remind neurodivergent people that we can try to improve our emotional skills, sensitivity, and cognitive empathy.
However, I worry that if we try to address this ‘emotions problem’ in ways that might feel awkward, alienating, and unnatural to many neurodivergent people in EA, we’ll lose a lot of what makes EA special and valuable.
I have no idea how to solve this problem, or how to strike the right balance between welcoming and valuing neurodiversity, versus welcoming and valuing more neurotypical norms around emotions and cognitive empathy. I just wanted to introduce this concern, and see what everybody else thinks about it.
Julia—thanks for a helpful update.
As someone who’s dealt with journalists & interviews for over 25 years, I would just add: if you do talk to any journalists for any reason, (1) be very clear up front about whether the interview is ‘on the record’, ‘off the record’, ‘background’, or ‘deep background’; (2) ask for ‘quote approval’, i.e. final approval, as the interviewee, over any quotes attributed to you; and (3) possibly ask for overall pre-publication approval of the whole piece, so its contents, tone, and approach are aligned with yours. (Most journalists will refuse 2 and 3, which reminds you they are not your friends or allies; they are seeking to produce content that will attract clicks, eyeballs, and advertisers.)
Also, record the interview on your end, using recording software, so you can later prove (if necessary, in court) whether you were quoted accurately.
If you’re not willing to take all these steps to protect yourself, your organization, and your movement, DO NOT DO THE INTERVIEW.
This piece is a useful resource about these terms and concepts.
I agree with Scott Alexander that when talking with most non-EA people, an X risk framework is more attention-grabbing, emotionally vivid, and urgency-inducing, partly due to negativity bias, and partly due to the familiarity of major anthropogenic X risks as portrayed in popular science fiction movies & TV series.
However, for people who already understand the huge importance of minimizing X risk, there’s a risk of burnout, pessimism, fatalism, and paralysis, which can be alleviated by longtermism and more positive visions of desirable futures. This is especially important when current events seem all doom’n’gloom, when we might ask ourselves ‘what about humanity is really worth saving?’ or ‘why should we really care about the long-term future, if it’ll just be a bunch of self-replicating galaxy-colonizing AI drones that are no more similar to us than we are to late Permian proto-mammal cynodonts?’
In other words, we in EA need longtermism to stay cheerful, hopeful, and inspired about why we’re so keen to minimize X risks and global catastrophic risks.
But we also need longtermism to broaden our appeal to the full range of personality types, political views, and religious views out there in the public. My hunch as a psych professor is that there are lots of people who might respond better to longtermist positive visions than to X risk alarmism. It’s an empirical question how common that is, but I think it’s worth investigating.
Also, a significant % of humanity is already tacitly longtermist in the sense of believing in an infinite religious afterlife, and trying to act accordingly. Every Christian who takes their theology seriously & literally (i.e. believes in heaven and hell), and who prioritizes Christian righteousness over the ‘temptations of this transient life’, is doing longtermist thinking about the fate of their soul, and the souls of their loved ones. They take Pascal’s wager seriously; they live it every day. To such people, X risks aren’t necessarily that frightening personally, because they already believe that 99.9999+% of sentient experience will come in the afterlife. Reaching the afterlife sooner rather than later might not matter much, given their way of thinking.
However, even the most fundamentalist Christians might be responsive to arguments that the total number of people we could create in the future—who would all have save-able souls—could vastly exceed the current number of Christians. So, more souls for heaven; the more the merrier. Anybody who takes a longtermist view of their individual soul might find it easier to take a longtermist view of the collective human future.
I understand that most EAs are atheists or agnostics, and will find such arguments bizarre. But if we don’t take the views of religious people seriously, as part of the cultural landscape we’re living in, we’re not going to succeed in our public outreach, and we’re going to alienate a lot of potential donors, politicians, and media influencers.
There’s a particular danger that overemphasizing the more exotic transhumanist visions of the future will alienate religious or political traditionalists. For many Christians, Muslims, and conservatives, a post-human, post-singularity, AI-dominated future would not sound worth saving. Without any humane connection to their human social world as it is, they might prefer a swift nuclear Armageddon followed by heavenly bliss to a godless, soulless machine world stretching ahead for billions of years.
EAs tend to score very highly on Openness to Experience. We love science fiction. We like to think about post-human futures being potentially much better than human futures. But if that becomes our dominant narrative, we will alienate the vast majority of currently living humans, who score much lower on Openness.
If we push the longtermist narrative to the general public, we better make the long-term future sound familiar enough to be worth fighting for.
A long-termist perspective on EA’s current PR crisis
The Michael Nielsen critique seems thoughtful, constructive, and well-balanced on first read, but I have some serious reservations about the underlying ethos and its implications.
Look, any compelling new world-view that is outside the mainstream culture’s Overton window can be pathologized as an information hazard that makes its believers feel unhappy, inadequate, and even mentally ill by mainstream standards. Nielsen seems to view ‘strong EA’ as that kind of information hazard, and critiques it as such.
Trouble is, if you understand that most normies are delusional about some important issue, and you develop some genuinely deeper insights into that issue, the psychologically predictable result is some degree of alienation and frustration. This is true for everyone who has a religious conversion experience. It’s true for everyone who really takes onboard the implications of any intellectually compelling science—whether cosmology, evolutionary biology, neuroscience, signaling theory, game theory, behavior genetics, etc. It’s true for everyone who learns about any branch of moral philosophy and takes it seriously as a guide to action.
I’ve seen this over, and over, and over in my own field of evolutionary psychology. The usual ‘character arc’ of ev psych insight is that (1) you read Dawkins or Pinker or Buss, you get filled with curiosity about the origins of human nature, (2) you learn some more and you feel overwhelming intellectual awe and excitement about the grandeur of evolutionary theory, (3) you gradually come to understand that every human perception, preference, value, desire, emotion, and motivation has deep evolutionary roots beyond your control, and you start to feel uneasy, (4) you ruminate about how you’re nothing but an evolved robot chasing reproductive success through adaptively self-deceived channels, and you feel some personal despair, (5) you look around at a society full of other self-deceived humans unaware of their biological programming, and you feel black-pilled civilizational despair, (6) you live with the Darwinian nihilism for a few years, adapt to the new normal, and gradually find some way to live with the new insights, climbing your way back into some semblance of normie-adjacent happiness. I’ve seen these six phases many times in my own colleagues, grad students, and collaborators.
And that’s just with a new descriptive world-view about how the human world works. EA’s challenge can be even more profound, because it’s not just descriptive, but normative, or at least prescriptive. So there’s a painful gap between what we could be doing, and what we are doing. And so there should be, if you take the world in a morally serious way.
I think the deeper problem is that given 20th century history, there’s a general dubiousness about any group of people who do take the world in a morally serious way that deviates from the usual forms of mild political virtue signaling encouraged in our current system of credentialism, careerism, and consumerism.
Some historical context on this issue. If Bostrom’s original post was written around 1996 (as I’ve seen some people suggest), that was just after the height of the controversy over ‘The Bell Curve’ book (1994) by Richard Herrnstein & Charles Murray.
In response to the firestorm around that book, the American Psychological Association appointed a blue-ribbon committee of 11 highly respected psychologists and psychometricians to evaluate the Bell Curve’s empirical claims. They published a report on their findings in 1996, which you can read here and which is summarized here. The APA committee affirmed most of the Bell Curve’s key claims, and concluded that there were well-established group differences in average general intelligence, but that the reasons for the differences were not yet clear.
More recently, Charles Murray has reviewed the last 30 years of psychometric and genetic evidence in his book Human Diversity (2020), and in his shorter, less technical book Facing Reality (2021).
This is the most controversial topic in all of the behavioral sciences. EAs might be prudent to treat this whole controversy as an information hazard, in which learning about the scientific findings can be socially and professionally dangerous. But it is worth noting that there is a big gap between what intelligence researchers have actually found, versus what most social scientists, journalists, and activists believe.
Epistemic status: As a psychology professor, I’ve worked on intelligence research for over 20 years, was on the editorial board of the journal Intelligence, and have published 3 books and 11 papers on the evolutionary origins, functions, genetics, and structure of human intelligence, which have been cited a few thousand times. However I’ve never worked directly on, or published on, group differences in intelligence.
Teaching EA through superhero thought experiments
Peter—I have mixed feelings about your advice, which is well-expressed and reasonable.
I agree that, typically, it’s prudent not to get caught up in news stories that involve high uncertainty, many rumors, and unclear long-term impact.
However, a crucial issue for the EA movement is whether there will be a big public relations blowback against EA from the FTX difficulties. If there’s significant risk of this blowback, EA leadership better develop a pro-active plan for dealing with the PR crisis—and quick.
The FTX crisis is a Very Big Deal in crypto—one of the worst crises ever. Worldwide, about 300 million people own crypto. Most of them have seen dramatic losses in the value of their tokens recently. On paper, at least, they have lost a couple of hundred billion dollars in the last couple of days. Most investors are down at least 20% this week because of this crisis. Even if prices recover, we will never forget how massive this drop has been.
Sam Bankman-Fried (SBF) himself has allegedly lost about 94% of his net worth this week, down from $15 billion to under $1 billion. (I don’t give much credence to these estimates, but it’s pretty clear the losses have been very large).
Millions of crypto investors are furious. They blame the FTX leadership, especially SBF. And some of them are blaming FTX’s difficulties on SBF’s utilitarianism, e.g. tweeting things like ‘never trust a utilitarian with your money’.
This could all blow over. The financial contagion from FTX might be contained. Crypto prices might recover soon. Binance dominating the crypto exchange space might become the new normal. Other billionaires might step up to fill any funding gap (once the asset markets recover in a year, or two, or five).
But I think it would be prudent for EA leadership to treat this FTX crisis as a potentially serious PR crisis for EA—and not just a massive financial crisis for EA funding. SBF’s close association with EA creates some potential PR risk for the EA movement, especially among crypto investors.
It all depends on how mainstream media spins the FTX story. The next couple of weeks will be critical. If crypto news, financial news, and/or mainstream news starts blaming SBF personally for these difficulties, or uncovers evidence of financial wrong-doing, or links the FTX crisis somehow to utilitarian moral reasoning and/or EA, that could be really bad for our movement.
I have no idea what the optimal PR response would be. I’m not a PR expert. But PR crisis management experts do exist, and I would strongly urge EA leadership to consult some of them. Soon.
This FTX crisis might not be an existential risk to EA, but it might be a global catastrophic risk at both the financial and the public relations levels. And we have learned to take GCRs seriously, haven’t we?
PS let me be clear: I have a lot of respect for SBF; I don’t have any real idea what happened with FTX; I’m not assigning any blame; and I hope the crisis can be resolved with minimal damage to investors and the crypto industry.
Excellent post. I hope everybody reads it and takes it onboard.
One failure mode for EA will be over-reacting to black swan events like this that might not carry as much information about our organizations and our culture as we think they do.
Sometimes a bad actor who fools people is just a bad actor who fools people, and they’re not necessarily diagnostic of a more systemic organizational problem. They might be, but they might not be.
We should be open to all possibilities at this point, and if EA decides it needs to tweak, nudge, update, or overhaul its culture and ethos, we should do so intelligently, carefully, strategically, and wisely—rather than in a reactive, guilty, depressed, panicked, or self-flagellating way.
Rob—I strongly agree with your take here.
EA prides itself on quantifying the scope of problems. Nobody seems to be actually quantifying the alleged scope of sexual misconduct issues in EA. There’s an accumulation of anecdotes, often second or third hand, being weaponized by mainstream media into a blanket condemnation of EA’s ‘weirdness’. But it’s unclear whether EA has higher or lower rates of sexual misconduct than any other edgy social movement that includes tens of thousands of people.
In one scientific society I’m familiar with, a few allegations of sexual misconduct were made over several years (out of almost a thousand members). Some sex-negative activists tried to portray the society as wholly corrupt, exploitative, sexist, unwelcoming, and alienating. But instead of taking the allegations reactively as symptomatic of broader problems, the society ran a large-scale anonymous survey of almost all members. And it found that something less than 2% of female or male members had ever felt significantly uncomfortable, unwelcome, or exploited. That was the scope of the problem. 2% isn’t 0%, but it’s a lot better than 20% or 50%. In response to this scope information, the society did not adopt the draconian anti-sex, anti-relationship, anti-socializing policies that the activists had demanded. Instead, it allowed its members to treat each other as mature adults capable of navigating their own social and sexual decisions.
If EA is serious about assessing ‘weird sexual come-ons’ as a cause area that’s worthy of attention, then we should apply the usual EA quantification methods, instead of just defaulting to emotion-driven ‘activist mode’. How widespread is the actual problem? How severe are the consequences? How neglected is this issue (given that the EA community team is already actively involved in addressing this issue)?
If we’re not willing to do a serious, scope-sensitive cause assessment of this issue, we’re just reacting as prudish, puritanical alarmists, who are willing to sex-shame, poly-shame, kink-shame, and Aspy-shame whenever it seems like the ‘empathic, concerned’ thing to do.
Historical note: If EA had emerged in the 1970s era of the gay rights movement rather than the 2010s, I can imagine an alternative history in which some EAs were utterly outraged and offended that gay or lesbian EAs had dared to invite them to a gay or lesbian event. The EA community could have leveraged the latent homophobia of the time to portray such an invitation as bizarrely unprofessional, and a big problem that needs addressing. Why are we treating polyamory and kink in 2023 with the same reactive outrage that people would have treated gay/lesbian sexuality fifty years ago?
Epistemic status/disclosure: I’m an evolutionary sex researcher who teaches courses on ‘Alternative Relationships’, ‘Psychology of Human Sexuality’, and related topics. I’ve recently been doing research on anti-polyamory stigma and anti-BDSM stigma, and I’m on the American Psychological Association (APA) Task Force on Consensual Non-Monogamy. FWIW, I’m seeing an alarming amount of anti-poly stigma, kink-shaming, and ageism (esp. outrage about age-gap relationships) emerging in EA lately—and in the mainstream media’s oddly well-coordinated attack on EA.
Leopold—thanks for a clear, vivid, candid, and galvanizing post. I agree with about 80% of it.
However, I don’t agree with your central premise that alignment is solvable. We want it to be solvable. We believe that we need it to be solvable (or else, God forbid, we might have to actually stop AI development for a few decades or centuries).
But that doesn’t mean it is solvable. And we have, in my opinion, some pretty compelling reasons to think that it is not solvable even in principle: (1) given the diversity, complexity, and ideological nature of many human values (which I’ve written about in other EA Forum posts, and elsewhere), (2) given the deep game-theoretic conflicts between human individuals, groups, companies, and nation-states (which cannot be waved away by invoking Coherent Extrapolated Volition, or ‘dontkilleveryoneism’, or any other notion that sweeps people’s profoundly divergent interests under the carpet), and (3) given that humans are not the only sentient stakeholder species that AI would need to be aligned with (advanced AI will have implications for every one of the other 65,000 vertebrate species on Earth, and most of the 1,000,000+ invertebrate species, one way or another).
Human individuals aren’t aligned with each other. Companies aren’t aligned with each other. Nation-states aren’t aligned with each other. Other animal species aren’t aligned with humans, or with each other. There is no reason to expect that any AI systems could be ‘aligned’ with the totality of other sentient life on Earth. Our Bayesian prior, based on the simple fact that different sentient beings have different interests, values, goals, and preferences, must be that AI alignment with ‘humanity in general’, or ‘sentient life in general’, is simply not possible. Sad, but true.
I worry that ‘AI alignment’ as a concept, or narrative, or aspiration, is just promising enough that it encourages the AI industry to charge full steam ahead (in hopes that alignment will be ‘solved’ before AI advances to much more dangerous capabilities), but it is not delivering nearly enough workable solutions to make their reckless accelerationism safe. We are getting the worst of both worlds—a credible illusion of a path towards safety, without any actual increase in safety.
In other words, the assumption that ‘alignment is solvable’ might be a very dangerous X-risk amplifier, in its own right. It emboldens the AI industry to accelerate. It gives EAs (probably) false hope that some clever technical solution can make humans all aligned with each other, and make machine intelligences aligned with organic intelligences. It gives ordinary citizens, politicians, regulators, and journalists the impression that some very smart people are working very hard on making AI safe, in ways that will probably work. It may be leading China to assume that some clever Americans are already handling all those thorny X-risk issues, such that China doesn’t really need to duplicate those ongoing AI safety efforts, and will be able to just copy our alignment solutions once we get them.
If we take seriously the possibility that alignment might not be solvable, we need to rethink our whole EA strategy for reducing AI X-risk. This might entail EAs putting a much stronger emphasis on slowing or stopping further AI development, at least for a while. We are continually told that ‘AI is inevitable’, ‘the genie is out of the bottle’, ‘regulation won’t work’, etc. I think too many of us buy into the over-pessimistic view that there’s absolutely nothing we can do to stop AI development, while also buying into the over-optimistic view that alignment is possible—if we just recruit more talent, work a little more, get a few more grants, think really hard, etc.
I think we should reverse these optimisms and pessimisms. We need to rediscover some optimism that the 8 billion people on Earth can pause, slow, handicap, or stop AI development by the 100,000 or so AI researchers, devs, and entrepreneurs that are driving us straight into a Great Filter. But we need to rediscover some pessimism about the concept of ‘AI alignment’ itself.
In my view, the burden of proof should be on those who think that ‘AI alignment with human values in general’ is a solvable problem. I have seen no coherent argument that it is solvable. I’ve just seen people desperate to believe that it is solvable. But that’s mostly because the alternative seems so alarming, i.e., the idea that (1) the AI industry is increasingly imposing existential risks on us all, (2) it has a lot of money, power, talent, influence, and hubris, (3) it will not slow down unless we make it slow down, and (4) slowing it down will require EAs to shift to a whole different set of strategies, tactics, priorities, and mind-sets than we had been developing within the ‘alignment’ paradigm.
Jeff—this is a useful perspective, and I agree with some of it, but I think it’s still loading a bit too much guilt onto EA people and organizations for being duped and betrayed by a major donor.
EAs might have put a little bit too much epistemic trust in subject matter experts regarding SBF and FTX—but how can we do otherwise, practically speaking?
In this case, I think there was a tacit, probably largely unconscious trust that if major VCs, investors, politicians, and journalists trusted SBF, then we could probably trust him too. This was not just a matter of large VC firms vetting SBF and giving him their seal of approval through massive investments (flawed and rushed though their vetting may have been).
It’s also a matter of ordinary crypto investors, influencers, and journalists largely (though not uniformly) thinking FTX was OK, and trusting him with billions of dollars of their money, in an industry that is actually quite skeptical a lot of the time. And major politicians, political parties, and PACs who accepted millions in donations trusting that SBF’s reputation would not suffer such a colossal downturn that they would be implicated. And journalists from leading national publications doing their own forms of due diligence and investigative journalism on their interview subjects.
So, we have a collective failure of at least four industries outside EA—venture capital, crypto experts, political fund-raisers, and mainstream journalists—missing most of the alleged, post-hoc red flags about SBF. The main difference between EA and those other four industries is that I see us doing a lot of healthy, open-minded, constructive, critical dialogue about what we could have done differently, and I don’t see the other four industries doing much—or any—of that.
Let’s consider an analogous situation in cause-area science rather than donor finance. Suppose EAs read some expert scientific literature about a potential cause area—whether global catastrophic biological risks, nuclear containment, deworming efficacy, direct cash transfers, geoengineering, or any other domain. Suppose we convince each other, and donors, to spend billions on a particular cause area based on expert consensus about what will work to reduce suffering or risk. And then suppose that some of the key research that we used to recommend that cause area turns out to have been based on false data fabricated by a powerful sociopathic scientist and their lab—but the data were published in major journals, peer-reviewed by leading scientists, cited by hundreds of other experts, informed public policy, etc.
How much culpability would EA have in that situation? Should we have done our own peer review of the key evidence in the cause area? Should we have asked the key science labs for their original data? Should we have hired subject matter experts to do some forensic analysis of the peer-reviewed papers? That seems impractical. At a certain point, we just have to trust the peer-review process—whether in science, or in finance, politics, and journalism—with the grim understanding that we will sometimes be fooled and betrayed.
The major disanalogy here would be if the key sociopathic scientist who faked the data was personally known to the leaders of a movement for many years, and was directly involved in the community. But even there, I don’t think we should be too self-castigating. I have known several behavioral scientists more-or-less well, over the years, who turned out to be very bad actors who faked data, but who were widely trusted in their fields, who didn’t raise any big red flags, and who left all of their colleagues scratching their heads afterwards, asking ‘How on Earth did I miss the fact that this was a really shady researcher?’ The answer usually turns out to be, the disgraced researcher allocated most of the time that other researchers would have put into collecting real data, into covering their tracks and duping their colleagues, and they were just very good at being deceptive and manipulative.
Science relies on trust, so it’s relatively vulnerable to intentionally bad, deceptive actors. EA also relies on trust in subject matter experts, so we’re also relatively vulnerable to bad actors. But unless we want to replicate every due diligence process, every vetting process, every political ‘opposition research’ process, every peer review process, every investigative journalism process, then we will remain vulnerable to the occasional error—and sometimes those errors will be very big and very harmful.
That might just be the price of admission when trying to do evidence-based good using finances from donors.
Of course, there are lots of ways we could do better in the future, especially in doing somewhat deeper dives into key donors, the integrity of key organizations and leaders, and the epistemics around key cause areas. I’m just cautioning against over-correcting in the direction of distrust and paranoia.
Epistemic status of this comment: I’m slightly steel-manning a potential counter-argument against Jeff’s original post, and I think I’m mostly right, but I could easily be persuaded otherwise.
So, OpenAI believes that superintelligence ‘could arrive this decade’, and ‘could lead to the disempowerment of humanity or even human extinction’.
Those sentences should strike EAs as among the most alarming ones ever written.
Personally, I’m deeply concerned that OpenAI seems to have become even more caught up in their runaway hubris spiral, such that they’re aiming not just for AGI, but for ASI, as soon as possible—whether or not they get anywhere close to achieving viable alignment solutions.
The worst part of this initiative is framing alignment as something that will require a vast new increase in AI capabilities—the AGI-level ‘automated alignment researcher’. This gives them a get-out-of-jail-free card: they can claim they must push ahead with AGI, so they can build this automated alignment researcher, so they can keep us all safe from… the AGI-level systems they’ve just built. In other words, instead of treating AI alignment with human values as a problem at the intersection of moral philosophy, moral psychology, and other behavioral sciences, they’re treating it as just another AI capabilities issue, amenable to clever technical solutions plus a whole lot of compute. Which gives them carte blanche to push ahead with capabilities research, under the guise of safety research.
So, I see this ‘superintelligence alignment’ effort as cynical PR window-dressing, intended to reassure naive and gullible observers that OpenAI is still among ‘the good guys’, even as they accelerate their imposition of extinction risks on humanity.
Let’s be honest with ourselves about that issue. If OpenAI still had a moral compass, and were still among the good guys, they would pause AGI (and ASI) capabilities research until they had achieved a viable, scalable, robust set of alignment methods that have the full support and confidence of AI researchers, AI safety experts, regulators, and the general public. They are nowhere close to that, and they probably won’t get close to it in four years. Many AI-watchers (including me) are extremely skeptical that ‘AI alignment’ is ever possible, in any meaningfully safe way, given the diversity, complexity, flexibility, and richness of human (and animal) values and preferences that AIs are trying to ‘align’ with.
In summary: If OpenAI were an ethical company, they would stop AI capabilities research until they solve alignment. Period. They’re not doing that, and have shown no intention of doing that. Therefore, I infer that they are not an ethical company, and they do not have humanity’s best interests at heart.
Brief note on why EA should be careful to remain inclusive & welcoming to neurodiverse people:
As somebody with Aspergers, I’m getting worried that in this recent ‘PR crisis’, EA is sending some pretty strong signals of intolerance to those of us with various kinds of neurodiversity that can make it hard for us to be ‘socially sensitive’, to ‘read the room’, and to ‘avoid giving offense’. (I’m not saying that any particular people involved in recent EA controversies are Aspy; just that I’ve seen a general tendency for EAs to be a little Aspier than other people, which is why I like them and feel at home with them.)
There’s an ongoing ‘trait war’ that’s easy to confuse with the Culture War. It’s not really about right versus left, or reactionary versus woke. It’s more about psychological traits: ‘shape rotators’ versus ‘wordcels’, ‘Aspies’ versus ‘normies’, systematizers versus empathizers, high decouplers versus low decouplers.
EA has traditionally been an oasis for Aspy systematizers with a high degree of rational compassion, decoupling skills, and quantitative reasoning. One downside of being Aspy is that we occasionally, or even often, say things that normies consider offensive, outrageous, unforgivable, etc.
If we impose standard woke cancel culture norms on everybody in EA, we will drive away everybody with the kinds of psychological traits that created EA, that helped it flourish, and that made it successful. Politically correct people love to Aspy-shame. They will seek out the worst things a neurodiverse person has ever said, and weaponize them to destroy their reputation, so that their psychological traits and values are allowed no voice in public discourse. (Systematizing and decoupling are existential threats to political correctness....)
I’ve seen this happen literally dozens of times in academia over the last decade. High empathizers and low decouplers are taking over the universities from high systematizers and high decouplers. They will do the same to EA, if we’re not careful.
I’ve written in much more depth about this in my 2017 essay ‘The neurodiversity case for free speech’ (paywalled here on Quillette, free pdf here). IMHO, it’s more relevant than ever, in relation to some of EA’s recent public relations issues.