AI safety researcher
Thomas Kwa
Claude thinks possible outgroups include the following, which is similar to what I had in mind:
Based on the EA Forum’s general orientation, here are five individuals/groups whose characteristic opinions would likely face downvotes:
Effective accelerationists (e/acc) - Advocates for rapid AI development with minimal safety precautions, viewing existential risk concerns as overblown or counterproductive
TESCREAL critics (like Emile Torres, as you mentioned) - Scholars who frame longtermism/EA as ideologically dangerous, often linking it to eugenics, colonialism, or techno-utopianism
Anti-utilitarian philosophers - Strong deontologists or virtue ethicists who reject consequentialist frameworks as fundamentally misguided, particularly on issues like population ethics or AI risk trade-offs
Degrowth/anti-progress advocates - Those who argue economic/technological growth is net-negative and should be reduced, contrary to EA’s generally pro-progress orientation
Left-accelerationists and systemic change advocates - Critics who view EA as a “neoliberal” distraction from necessary revolutionary change, or who see philanthropic approaches as fundamentally illegitimate compared to state redistribution
My main concern is that the arrival of AGI completely changes the situation in some unexpected way.
e.g. in the recent 80k podcast on fertility, Rob Wiblin opines that the fertility crash would be a global priority if not for AI likely replacing human labor soon and obviating the need for countries to have large human populations. There could be other effects.
My guess is that due to advanced AI, both artificial wombs and immortality will be technically feasible in the next 40 years, as well as other crazy healthcare tech. This is not an uncommon view.
Before anything like a Delphi forecast, it seems better to informally interview a couple of experts and then write your own quick report on what the technical barriers are to artificial wombs. This way you can incorporate this into the structure of any forecasting exercise, e.g. by asking experts to forecast when each of hurdles X, Y, and Z will be solved, whereupon you can identify where the level of agreement is highest and lowest and run consistency checks against the overall forecast.
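To make this concrete, here’s a minimal sketch of the kind of consistency and agreement checks I have in mind; the hurdle names, expert labels, and years are all made up:

```python
from statistics import median

# Hypothetical data: each expert forecasts a year for each technical hurdle
# plus an overall year for viable artificial wombs. All values are invented.
forecasts = {
    "expert_a": {"hurdles": {"X": 2032, "Y": 2036, "Z": 2040}, "overall": 2041},
    "expert_b": {"hurdles": {"X": 2030, "Y": 2045, "Z": 2038}, "overall": 2042},
    "expert_c": {"hurdles": {"X": 2033, "Y": 2050, "Z": 2039}, "overall": 2052},
}

# Consistency check: the overall forecast shouldn't come before the latest
# hurdle, since (by assumption) every hurdle must be cleared first.
for name, f in forecasts.items():
    latest_hurdle = max(f["hurdles"].values())
    if f["overall"] < latest_hurdle:
        print(f"{name}: overall {f['overall']} is earlier than latest hurdle {latest_hurdle}")

# Agreement check: spread of forecasts per hurdle (smaller range = more agreement).
for h in forecasts["expert_a"]["hurdles"]:
    years = [f["hurdles"][h] for f in forecasts.values()]
    print(f"hurdle {h}: median {median(years)}, range {max(years) - min(years)} years")
```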
Most infant mortality still happens in the developing world, due to much more basic factors like tropical diseases. So if the goal is reducing infant mortality globally, you won’t be addressing most of the problem, and for maternal mortality, the tech will need to be so mature that it’s affordable for the average person in low-income countries, as well as culturally accepted.
Yeah, while I think truth-seeking is a real thing I agree it’s often hard to judge in practice and vulnerable to being a weasel word.
Basically I have two concerns with deferring to experts. The first is that when the world lacks people with true subject matter expertise, whoever has the most prestige—maybe not CEOs but certainly mainstream researchers on slightly related questions—will be seen as experts, and we will need to worry about deferring to them.
Second, because EA topics are selected for being too weird/unpopular to attract mainstream attention/funding, I think a common pattern is that of the best interventions, some are already funded, some are recommended by mainstream experts and remain underfunded, and some are too weird for the mainstream. It’s not really possible to find the “too weird” kind without forming an inside view. We can start out deferring to experts, but by the time we’ve spent enough resources investigating the question that we’re at all confident in what to do, the deferral to experts is partially replaced with understanding the research ourselves, as well as the load-bearing assumptions and biases of the experts. The mainstream experts will always get some weight, but it diminishes as your views start to incorporate their models rather than their conclusions (an example that comes to mind is economists on whether AGI will create explosive growth: good economic models have recently been developed by EA sources, now including some economists, that vary assumptions and justify where they differ from the mainstream economists’ assumptions).
Wish I could give more concrete examples but I’m a bit swamped at work right now.
Not “everyone agrees” on what “utilitarianism” means either, and it remains a useful word. In context you can tell I mean someone whose attitude, methods, and incentives allow them to avoid the biases I listed and others.
I think the “most topics” thing is ambiguous. There are some topics on which mainstream experts tend to be correct and some on which they’re wrong, and although expertise is valuable on topics experts think about, they might be wrong on most topics central to EA. [1] Do we really wish we had deferred to the CEO of PETA on what animal welfare interventions are best? EAs built that field in the last 15 years far beyond what “experts” knew before.
In the real world, assuming we have more than five minutes to think about a question, we shouldn’t “defer” to experts or immediately “embrace contrarian views”, but rather use their expertise and reject it when appropriate. Since this wasn’t an option in the poll, my guess is many respondents just wrote how much they like being contrarian, and since EAs often have to be contrarian on topics they think about, it came out in favor of contrarianism.
[1] Experts can be wrong because they don’t think in probabilities, they lack imagination, there are obvious political incentives to say one thing over another, and probably other reasons. Also, lots of the central EA questions don’t have actual well-developed scientific fields around them, so many of the “experts” aren’t people who have thought about similar questions in a truth-seeking way for many years.
I think this is a significant reason why people downvote some, but not all, things they disagree with. It especially applies to a member of the outgroup who makes arguments EAs have refuted before and would need to re-explain; not saying it’s actually you.
Can you explain what you mean by “contextualizing more”? (What a curiously recursive question...)
I mean it in this sense: making people think you’re not part of the outgroup and don’t have objectionable beliefs related to the ones you actually hold, in whatever way is sensible and honest.
Maybe LW is better at using the disagreement button, as I find it’s pretty common there for unpopular opinions to get lots of upvotes and disagree votes. One could use the API to check whether the correlation between karma and disagreement differs between the two sites.
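A rough sketch of what that could look like, assuming you’ve already exported comment data (say via the forum’s GraphQL API) into CSV files with hypothetical karma and agreement columns:

```python
import csv
from statistics import correlation  # Pearson's r, Python 3.10+

# Hypothetical input: one row per comment with its karma score and net
# agreement score, exported separately for the EA Forum and LessWrong.
def karma_agreement_correlation(path: str) -> float:
    karma, agreement = [], []
    with open(path) as f:
        for row in csv.DictReader(f):
            karma.append(float(row["karma"]))
            agreement.append(float(row["agreement"]))
    return correlation(karma, agreement)

# If LW really separates the two votes better, its correlation should be lower.
print("EA Forum:", karma_agreement_correlation("eaf_comments.csv"))
print("LessWrong:", karma_agreement_correlation("lw_comments.csv"))
```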
IMO the real answer is that veganism is not an essential part of EA philosophy, just happens to be correlated with it due to the large number of people in animal advocacy. Most EA vegans and non-vegans think that their diet is a small portion of their impact compared to their career, and it’s not even close! Every time you spend an extra $5 finding a restaurant with a vegan option you could help 5,000 shrimp instead. Vegans have other reasons like non-consequentialist ethics, virtue signaling or self-signaling, or just a desire not to eat the actual flesh/body fluids of tortured animals.
If you have a similar emotional reaction to other products it seems completely valid to boycott them, although as you mention there can be significant practical burdens, both in adjusting one’s lifestyle to avoid such products and in judging whether the claims of marginal impact are valid. Being vegan is not obligatory in my culture and neither should boycotts be, unless the marginal impact of the boycott is larger than that of any other life choice, which is essentially never true.
I really enjoyed reading this post; thanks for writing it. I think it’s important to take space colonization seriously and shift into “near mode” given that, as you say, the first entity to start a Dyson swarm has a high chance of getting a decisive strategic advantage (DSA) if that isn’t already decided by AGI, and it’s probably only 10-35 years away.
Regarding COIs, the bigger one is probably that Daniela is married to Holden, and while it’s not strictly a COI, we don’t want the association with OP’s political advocacy. There are probably other things; I don’t work on strategy.
Assorted thoughts:
Rate limits should not apply to comments on your own quick takes
Rate limits could maybe not count negative karma below −10 or so; it seems much better to rate limit someone only when they have multiple downvoted comments
2.4:1 is not a very high karma:submission ratio. I have 10:1 even if you exclude the April Fools’ Day posts, though that could be because I have more popular opinions. It also means that I could double my comment rate, get −1 karma on the extras, and still be at 3.5
If I were Yarrow I would contextualize more or use friendlier phrasing or something, and also not be bothered too much by single downvotes
From scanning the linked comments I think that downvoters often think the comment in question has bad reasoning and detracts from effective discussion, not just that they disagree
Deliberately not opining on the echo chamber question
My understanding is that METR doesn’t take Good Ventures money to avoid the appearance of COIs. We could maybe avoid creating actual COIs, but it is crucial to the business model to appear as trusted and neutral as possible.
When 80,000 Hours pivoted to AI, I largely stopped listening to the podcast, thinking that as part of the industry I would already know everything. But I recently found myself driving a lot and consuming more audio content, and the recent episodes, e.g. with Holden, Daniel K, and ASB, are incredibly high quality and contain highly nontrivial, grounded opinions. If they keep this up I will probably keep listening until the end times.
What inspiring and practical examples!
Maybe a commitment to impact causes EA parents to cooperate to maximize it, which means optimally distributing the parenting workload regardless of what society thinks. In EA, with lots of conferences and hardworking, impactful women, it makes sense that the man’s opportunity cost is often lower. Elsewhere couples cooperate to maximize income, but men tend to have higher earning potential, so maybe the woman would often do more childcare anyway.
My sense is that parenting falls on the woman due not only to gender norms, but also to higher average interest in childcare and other confounders—so I wonder how much is caused by other effects, like EAs leaning liberal, questioning social expectations in general, or EA dads somehow being more keen on parenting. Also it’s unclear if EA men even contribute more than non-EA men.
I’m reminded a bit of the gender-equality paradox, where in the USSR, and maybe also in countries with restrictive gender roles, [1] there are higher rates of women in STEM and other male-dominated fields. The idea is that in liberal societies there would be a disparity due to differences in interest, and some kinds of external factors can reduce disparities on net—in the Soviet case because equality was enforced by the state, in other cases if there is economic interest or a lack of Western stereotypes. So the EA mindset is maybe one of these external factors—not to imply it’s like Soviet central planning or anything.
[1] The research seems disputed here
There are a few mistakes/gaps in the quantitative claims:
Continuity: If A ≻ B ≻ C, there’s some probability p ∈ (0, 1) where a guaranteed state of the world B is ex ante morally equivalent to “lottery p·A + (1-p)·C” (i.e., p chance of state of the world A, and the rest of the probability mass of C)
This is not quite the same as either property 3 or property 3′ in the Wikipedia article, and it’s plausible but unclear to me that you can prove 3′ from it. Property 3 uses “p ∈ [0, 1]” and 3′ has an inequality; it seems like the argument still goes through with 3′ so I’d switch to that, but then you should also say why 3 is unintuitive to you because VNM only requires 3 OR 3′.
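For reference, here is my paraphrase of the two properties as I remember the Wikipedia article stating them (worth double-checking the exact wording there):

```latex
% Paraphrase of the two VNM axioms from the Wikipedia article (from memory).
\textbf{(3) Continuity:} if $L \preceq M \preceq N$, then there is
$p \in [0,1]$ with $pL + (1-p)N \sim M$.

\textbf{(3$'$) Archimedean:} if $L \prec M \prec N$, then there is
$\varepsilon \in (0,1)$ with
$(1-\varepsilon)L + \varepsilon N \prec M \prec \varepsilon L + (1-\varepsilon)N$.
```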
This arbitrariness diminishes somewhat (though, again, not entirely) when viewed through the asymptotic structure. Once we accept that compensation requirements grow without bound as suffering intensifies, some threshold becomes inevitable. The asymptote must diverge somewhere; debates about exactly where are secondary to recognizing the underlying pattern.
“Grow without bound” just means that for any M, we have f(X) > M for sufficiently large X. This is different from there being a vertical asymptote, so a threshold is not inevitable. For instance, one could have f(X) = X or f(X) = X^2.
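To make the distinction concrete (illustrative functions, not taken from the post):

```latex
% Unbounded growth vs. a vertical asymptote (illustrative only).
\[
f(X) = X^2 \quad \text{grows without bound but is finite at every } X,
\]
\[
g(X) = \frac{1}{T - X} \quad \text{is also unbounded, but diverges at the finite threshold } X = T.
\]
```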
To be clear, whether we call this behavior ‘continuous’ depends on mathematical context and convention. In standard calculus, a function that approaches infinity exhibits an infinite discontinuity. [...]
[1] In the extended reals with appropriate topology, such a function can be rigorously called left-continuous.
It would be confusing to call this behavior continuous, because (a) the VNM axiom you reject is called continuity and (b) we are not using any other properties of the extended reals, but we are using real-valued probabilities and x values.
Once you’ve accepted that some suffering might require a number of flourishing lives that you could not write down, compute, or physically instantiate to morally justify, at least in principle, the additional step to “infinite” is smaller in some important conceptual sense than it might seem prima facie.
This may seem like a nitpick, but “write down”, “compute”, and “physically instantiate” are wildly different ranges of numbers. The largest number one could “physically instantiate” is something like 10^50 minds, while the largest one could “write down” the digits of is something like 10^10^10.
Not all large numbers are the same here, because if one thinks the offset ratio for a cluster headache is in the 10^50 range, there are only 50 ‘levels’ of suffering, each of which is 10x worse than the last. If it’s over 10^10^10, there are over 10 billion such ‘levels’, it would be impossible to rate cluster headaches on a logarithmic pain scale, and we would happily give everyone on Earth (say) a level 10,000,000,000 cluster headache to prevent one person from having a (slightly worse than average) level 10,000,000,010 cluster headache. Moving from 10^10^10 to infinity, we would then believe that suffering has a threshold t where suffering of intensity t + epsilon cannot be offset by removing suffering of intensity t − epsilon, and we would also need to propose some other mechanism, like a lexicographic order, for how to deal with suffering above the infinite-badness threshold.
So it’s already a huge step to go from numbers we can “physically instantiate” to ones we can barely “write down”, and another step from there to infinity; at both steps your treatment of comparisons between different suffering intensities changes significantly, even in thought experiments without an unphysically large number of beings.
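For concreteness, the ‘levels’ arithmetic above is just a base-10 logarithm:

```latex
% Number of 10x 'levels' up to an offset ratio R is log10(R).
\[
\log_{10}\!\left(10^{50}\right) = 50,
\qquad
\log_{10}\!\left(10^{10^{10}}\right) = 10^{10} \approx 10 \text{ billion}.
\]
```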
Ok, interesting! I’d be interested in seeing this mapped out a bit more, because it does sound weird to have BOS be offsettable with positive wellbeing, positive wellbeing not be offsettable with NOS, but BOS and NOS be offsettable with each other. Or maybe this isn’t your claim and I’m misunderstanding.
This is what kills the proposal IMO, and EJT also pointed this out. The key difference between this proposal and standard utilitarianism, where anything is offsettable, isn’t the claim that NOS is worse than TREE(3) or even 10^100 happy lives, since this isn’t a physically plausible tradeoff we will face anyway. It’s that once you believe in NOS, transitivity compels you to believe it is worse than any amount of BOS, even a variety of BOS that, according to your best instruments, only differs from NOS in the tenth decimal place. Then once you believe this, the fact that you use a utility function compels you to create arbitrary amounts of BOS to avoid a tiny probability of a tiny amount of NOS.
It is not necessary to be permanently vegan for this. I have only avoided chicken for about 4 years, and hit all of these benefits.
Because evidence suggests that when we eat animals we are likely to view them as having lower cognitive capabilities or moral status (see here for a wikipedia blurb about it).
I have felt sufficient empathy for chickens for basically the whole time I haven’t eaten them. I also went vegan for (secular) Lent four years ago, and felt somewhat more empathy for other animals, but my sense is that eating non-chicken animals didn’t cloud my moral judgment enough to be worth worrying about, given my job isn’t in animal welfare.
As a social signal, to show to others that you object to this practice as a whole.
My family eats chicken all the time, so when I visit they change to beef or vegetarian, which serves the social signal purpose without making it difficult for us to eat together
I gave up squid and octopus this year, and on two instances this has come up and people have praised me for being virtuous
You just find it easier to live according to simple ethical principles rather than calculating the expected utility in every situation.
I don’t need to think about expected utility in every situation; it’s not hard to just not eat chicken. 98% of restaurants have high-protein non-chicken options whereas less than half have high-protein vegan options.
Also it’s more convenient than being vegan because there are fewer products to worry about. A vegan has to check whether a sandwich has mayonnaise, whether pasta has cheese, or whether a pastry has lard/eggs/butter.
I separately believe that social and political change is pretty small compared to EA animal welfare efforts. But beef and high-welfare-certified meat options cut down on suffering by >90% vs. factory-farmed chicken (or eggs, squid, and some fish) and also serve many of the signaling benefits. If you eat welfare-certified animal products only, the signaling benefit may even be higher for two reasons:
You transmit a higher-fidelity message; it’s clear you want to reduce suffering whereas people are vegan for many reasons, like health and religion
Talking about welfare certifications is interesting, so you’re more likely to start positive conversations, whereas vegans are perceived as, and sometimes are, insufferable.
I perceive it as +EV for me, but I feel like I’m not the best buyer of short timelines. I would maybe do even odds on before 2045 for smaller amounts, which is still good for you if you think the yearly chance won’t increase much. Otherwise maybe you should seek a bet with someone like Eli Lifland. The reason I’m not inclined to make large bets is that the markets would probably give better odds for something that unlikely, e.g. options that pay out with very high real interest rates, whereas a few hundred dollars is enough to generate good EA Forum discussion.
No bet. I don’t have a strong view on short timelines or unemployment. We may find a bet about something else; here are some beliefs:
my or Linch’s position vs yours on probability of extinction from nuclear war (I’d put $2 against your $98 that you ever update upwards by 50:1 on extinction by nuclear war by 2050, but no more for obvious reasons)
>25% that global energy consumption will increase by 25% year over year some year before 2035 (30% is the AI expert median, superforecaster median is <1%), maybe more
probably >50% that a benchmark by Mechanize meant to measure economic value, if converted to time horizon, will double twice in the first 16 months (I’m not aware of one existing yet)
probably >50% that AIs will outperform humans at forecasting geopolitical events by 2035, as long as the humans can’t read AI analyses, though this seems hard to operationalize
You’re shooting the messenger. I’m not advocating for downvoting posts that smell of “the outgroup”, just saying that this happens in most communities centered around an ideological or even methodological framework. It’s a way you can be downvoted while still being correct, especially by the LEAST thoughtful 25% of EA Forum voters.
Please read the quote from Claude more carefully. MacAskill is not an “anti-utilitarian” who thinks consequentialism is “fundamentally misguided”; he’s the moral uncertainty guy. The moral parliament usually recommends actions similar to consequentialism with side constraints in practice.
I probably won’t engage more with this conversation.