I’d like to have conversations with people who work or are knowledgeable about energy and security. Whether that’s with respect to energy grids, nuclear power plants, solar panels, etc. I’m exploring a startup idea to harden the world’s critical infrastructure against powerful AI. (I am also building a system to make formal verification more deployable at scale so that it may reduce loss of control and misuse scenarios.)
I’ve given workshops on using AIs for productivity/research to various research organizations like MATS. I’m happy to offer a bit of my time to share my expertise on that if that would make the meeting more interesting for you (or any other topics you’d like to hear my perspective on).
Context about me: I’m Jacques. I started working on technical AI safety research in January 2022. Before that, I had been engaging with AI ethics in a more personal capacity, worked as a data scientist at the Canada Energy Regulator, and earned a BSc/master’s in Physics. I’m currently based in Montreal.
Been thinking about morality recently. Here are my current thoughts, take them with a grain of salt because they aren’t battle-tested yet.
There are some strong arguments for utilitarianism, but regardless of what is correct theoretically, in practice utilitarianism doesn’t work well without some kind of deontological bars.
Continuing with attempting to develop a pragmatic morality, it then becomes clear that virtue ethics is important too because a) rules are rigid compared to judgement, and b) decisions aren’t independent but also affect how you’ll act in the future[1].
Some folks may be quite tepid in integrating virtue ethics, but my intuition is that the more common fault will be to give yourself too much latitude, so you’ll probably want to revive some of your old deontological bars.
I view the next stage after this as introducing a sort of meta-virtue ethics to balance the three components (utilitarianism, deontology and virtue ethics; obviously it would be possible to break this down further). But this likely gives you too much latitude again, so you’ll probably want to introduce some kind of meta-deontology to limit how you update the balance.
You could go further than this, but you’d probably be running into decreasing marginal utility.
I quite liked this article by Martha Nussbaum: Virtue Ethics is a Misleading Category. She points out that both the classical utilitarians and Kant talked extensively about virtues. On the other hand, there’s great variation among those who call themselves ‘virtue ethicists’, such that it’s not clear if virtue ethics is really a thing.
But the point I want to make is: a good utilitarian has to acknowledge the role of virtue, and I think a lot of modern utilitarians have forgotten this. We want to use utility-calculation to guide our actions, but humans can’t think like calculators all the time.
I’m not really into deontological constraints myself. Rules of thumb, yes, but they should always be open to revision. Exceptional circumstances can always justify breaking rules—and in those cases, I will refer to what maximizes utility.
I wanted to make this poll to see how the community views the speed/x-risk tradeoff. I’m personally 99% x-risk and 1% speed, so I would hard agree. My prediction is most people will agree, maybe a 70/30 split, but I’m curious to see.
I would be willing to delay technological innovation by up to 100 years to significantly reduce existential risk
I think the question is too imprecisely phrased to be answered precisely. When would the delay start? Over what time period would it be felt? (E.g. a 100% delay for 100 years is very different from a 1% delay over 10,000 years.)
I’m thus giving a directional answer, assuming we’re talking about whether seeking to dramatically reduce technological progress in exchange for safety is a feasible way to make the world a better place. I don’t think it is, but I’m not sure.
My biggest gripe is that any attempt to reduce technological innovation dramatically would entail a bunch of side effects that would degrade the quality of existence (e.g. requiring authoritarianism, moving power from cooperators to defectors and from people less skilled at deception to those more skilled, and incentivising fighting for a larger slice of the pie instead of expanding it, since expanding it is far harder without improved technology).
Initially I just calculated a naive expected value function and put 100% agree, but then I realized that I don’t value realizing potential lives nearly as much as I value improving existing ones. While I do value realizing potential lives, the loss of them is not experienced by anyone other than present-day people like myself who think about them abstractly, which seems to me in sum to be less bad than the suffering otherwise avertible due to technological progress in the next 100 years. But I obviously haven’t thought about this enough or I wouldn’t have made my initial mistake.
One thing I didn’t consider in my revised answer is that I didn’t actually do the math. Taking an existential event as literally causing the end of earth-originating life, the question is whether the difference in probability multiplied by the immediate mass extinction itself would represent more death and suffering than the avertible death and suffering occurring over a 100-year period. I just don’t know. It seems unlikely that the avertible death and suffering amounts to as much as the amount caused by the mass-extinction event itself, but after multiplying by the difference in probability and acknowledging the ambiguity of the timeline proposed in this question, things become less clear. However, let’s say that the probability-adjusted, undetermined-timing mass-extinction event does cause more suffering and death and I change my answer to 50% agree. I don’t think this is what most people would interpret 50% agree to express.
I should also be clear that I’m taking the question to mean literally ending earth-originating life in more-or-less one fell swoop. Obviously, traditional x-risks actually have a spectrum of severity, so this is not so straightforward to apply to real-world resource allocation.
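For concreteness, the comparison I gestured at above can be put in code. Here’s a minimal sketch, where every input is a hypothetical placeholder rather than an estimate anyone here endorses:

```python
# Back-of-the-envelope comparison (all inputs are hypothetical placeholders).
population = 8e9                # lives lost if the extinction event happens
delta_p = 0.40                  # hypothetical reduction in extinction probability from the delay
annual_avertible_deaths = 5e7   # hypothetical deaths per year that faster progress would avert
delay_years = 100

expected_lives_saved = delta_p * population                  # 3.2e9
lives_lost_to_delay = annual_avertible_deaths * delay_years  # 5.0e9

print(f"Expected lives saved by the delay: {expected_lives_saved:.1e}")
print(f"Lives lost to the delay:           {lives_lost_to_delay:.1e}")
```

With these made-up numbers, the raw extinction toll (8e9) exceeds the avertible toll (5e9), but the probability-adjusted toll (3.2e9) falls below it, which is exactly the ambiguity described above: small changes to any input flip the sign.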
If I had to be more specific I would mean “reducing the probability of all humanity (and only humanity) dying in a few short days/weeks from 50% to 10%” by “significantly reduce existential risk”.
Also, I disagree with your methods. X risks aren’t especially bad because of all the utility lost (and “negative utility” created); they’re bad because after they happen there’s never any utility again. Unless apes re-evolve into humans and reestablish all of civilization again, but we’re getting too hypothetical. What’s 100, or even 1,000 years of death and suffering compared to 10,000 years of utopia? If stalling/slowing down technological progress for 1,000 years made the P(Doom) go from 50% to 1%, I would definitely take it. Unless of course you think utopia is gonna be some short-lived thing, but I seriously doubt that.
You are rightly grasping that we disagree, but I don’t think you are understanding my view (and to be clear, reasonable people can disagree about this).
My wife and I are debating whether we will have more children or not. Having another child is desirable to us. So much so that she’s willing to undergo the relatively risky process of childbirth to have another one. However, failing to have another child is significantly less bad than losing one of our existing children, IMO. I’d even say that failing to have 100 more children is significantly less bad than losing one of our existing children. The reason why is that the child who never existed is not sentient and so does not experience any deprivation. They do not suffer. And my suffering of that abstract loss is not nearly as bad as would be the suffering I would experience losing a living child who I know.
Now you may disagree with that, and mourn all the lost utility, and that is a reasonable perspective, but it’s not mine, and as you can see, this is a deeper philosophical difference and not some sort of misunderstanding about expected utility or something like that.
FYI, about this sentence: “X risks aren’t especially bad because of all the utility lost … they’re bad because after they happen there’s never any utility again.” I don’t really see a difference between these two statements.
I agree with Craig here. I’ve written about problems with most conceptions of utility people use and describe alternatives that I think better match what Craig is saying in this sequence.
Wrote a post about it, but the TL;DR is that extinction is THE worst-case scenario. It is the end of all utility and completely irreversible, whereas progress can always be made at a later date.
That’s fair, but I imagine X risks and S risks are very heavily correlated. Especially with regard to “speed of progress”: accelerationism will, in my view, obviously increase X risks (safety research takes time: the more time you have, the more time for research, the more research gets done, and thus the lower the risk) but also increase S risks (this is more of a personal opinion, but I don’t think the current leaders of AI innovation have stuff like animal welfare in mind. If we just keep chugging along, the first ASI might not care about animals at all).
Intuitively, that seems correct, and I’ve relied on the expression “when you have a hammer, everything looks like a nail.” This got me thinking: is it necessarily the wrong way, or is this a truism?
If I have a legitimately useful and powerful tool, isn’t it indeed valuable to look around for problems that it can help solve? E.g., if we have discovered a way to harness electricity, shouldn’t we think about the ways it can be used to improve communication, build labor-saving devices, power factories, etc.? If we have something that has demonstrated potential to generate reliable information (supposing that forecasting could do this), shouldn’t we look for fruitful opportunities to apply it?
With a set of tools and a set of problems, why is it more useful for one side to do the searching than the other? (Sorry, maybe this is getting too meta and belongs in its own shortform?)
This just came to mind: the reason that it’s the wrong way to go about solving problems is that you want to solve the largest problems (well, per resource) and not just solve any random problem. Like, there is a problem that my shoes are currently untied, and I don’t want to bend down or spend 10 seconds to tie them, but it’s not very important.
So if you want to solve the most important problems, you should start with the problem and then work backwards for what solutions you might wish existed. I think the mere fact that people often talk about forecasting as the solution they are seeking to apply, whether that be Sentinel or whoever, is evidence that things are going wrong.
Actually, the set of things you want to apply electricity to is far smaller than the set of things you don’t. For example, if your baby is crying, please don’t use electricity.
The problem side should do the searching, since they have the shape and exact know-how of the problem.
They do and it’s a powerful point. But on the other hand they may be very much unaware of the nature of available tools and solutions. So I think there should probably be some searching — and listening — in both directions. If it’s done in good faith.
I have been disappointed by the support some EAs have expressed for recent activist actions at Ridglan Farms. I share others’ outrage at the outcome of the state animal cruelty investigation, which found serious animal cruelty law violations but led to a settlement that still permits Ridglan to sell beagles through July and to continue in-house experimentation. But I personally think the tactics used in the recent open rescues, including property damage and forced entry to remove animals, violate reasonable moral bounds on what actions are permissible in response to the belief that a serious harm is occurring. My views here stem from contractualist views of democratic legitimacy and from concerns about the non-universalizability of principles that justify lawbreaking, though I think a purely act utilitarian calculus also supports them.
Regarding universalizability, in a society where many people believe that different forms of irreparable harm are occurring (e.g. viewing abortion as murder, climate change as destroying the sacredness of the natural world, immigration as ending western civilization), I worry that moral principles that allow for significant lawbreaking when one believes that irreparable harm is occurring could easily lead to great damage if broadly followed (consider for example what it would be like to live in a country where hundreds of activists were regularly smashing their way into abortion clinics, energy companies, and refugee assistance nonprofits with sledgehammers and crowbars). Regarding the legitimacy of the law, I think reasonable contractualist views can give us obligations to follow the law when the processes by which the law is determined are legitimate, and that democracies with universal suffrage qualify as such (even granting that certain groups such as animals and future generations are impossible to enfranchise).[1] Therefore, I think that if we are trying to make decisions under moral uncertainty and give meaningful credences to rule utilitarian and contractualist views, we ought to reject the kinds of lawbreaking done by the Ridglan activists.
Moreover, I think that even if one rejects this kind of moral uncertainty-based reasoning and is a pure act utilitarian, rejecting lawbreaking in the western democratic context is still a relatively robust decision procedure under epistemic uncertainty. Broadly-followed norms against lawbreaking would have prevented EA’s worst scandal (FTX) without preventing EA’s most significant successes (cage-free reforms, evidence-based health interventions in LMICs). And while there are historical examples of illegal civil disobedience clearly producing good outcomes, I don’t think these generalize well to the type of lawbreaking under consideration here. The clearest such historical cases are ones where a disenfranchised group of people broke laws that directly enforced their own exclusion from political participation or basic legal personhood. These cases are self-limiting (and thus pass reasonable tests of universalizability) since the principles justifying such lawbreaking achieve their own obsolescence once participation is granted.[2] It’s much harder to find historical cases of property-damaging civil disobedience occurring in a democracy with universal suffrage that, in hindsight, appear clearly both effective and in service of a good cause. DxE’s own history is instructive here—their work over the last decade has led to many criminal convictions among its members, as well as several organizational scandals. But their record of concrete wins for animals is at best small-scale and mixed, especially compared to the successes of groups that have purely used lawful tactics like ballot initiatives and corporate campaigns.
One last point in the utilitarian calculus, this time on the more object-level cost-benefit calculation, is that I think EAs who embrace these kinds of illegal tactics may be underestimating the downside risks of endorsing criminal activity. I think there is a set of donors and volunteers that are happy to contribute to legal activism but who would be concerned about being associated with lawbreaking (at a minimum, I would consider myself to be in this group). If people in leadership roles within the EA/EAA ecosystem endorse illegal action, any foreseen benefits may easily be swamped by the harms of driving away risk-averse donors.
None of this is to say that Ridglan’s treatment of animals is justified, or that the lack of state enforcement against Ridglan for their serious violations of animal cruelty laws is acceptable. However, these harms don’t justify using tactics that are neither clearly effective nor robustly permissible across moral views.
I don’t mean to say that literally all lawbreaking is unjustified in a democracy. In particular, if one thinks that a law in the US is unconstitutional, breaking it may be required to gain standing for a legal challenge. But this implies a narrow exception for doing the minimum amount of lawbreaking required to obtain standing; it doesn’t imply that the tactics in question at Ridglan are permissible.
Note that this self-limiting principle only holds when applied to groups of humans denied suffrage. It doesn’t extend to cases like animals, where suffrage isn’t possible and there’s no natural bound on how many such groups might be invoked to justify lawbreaking.
I’ve been somewhat disappointed in reading this post. But as I know some folks I like are reading it, I feel the need to share a few thoughts as a legal scholar and theorist. I think your post demonstrates some misunderstandings about the nature of law.
1. You seem to misunderstand contractarianism, by making it an argument for quietism, as well as the nature of law in a democracy. We don’t agree to many laws as a society; most extant laws are conventions that no one ever agreed on. They are traditions. Take property law — no one ever voted to make animals property. That’s an inherited concept. There is a difference between a law that is democratically enacted and a vestige of the English common law. The same goes for trespass, a common law principle. Moreover, no contractarian worth her salt is going to claim you can never break an unjust law. Going back to Aquinas, there is broad agreement that sometimes breaking an unjust law is morally appropriate, or at least defensible.
2. No one could reasonably argue this tactic is universalizable. It is strategic — a non-universalizable tactic meant to create significant attention and pressure for change toward universalizable norms. And more — the argument that if they do it, everyone will, can be a dangerous slippery slope. It’s infeasible for everyone to do this: they have jobs. More important, I would very much caution against that kind of thinking. It has been used to justify atrocities in the past. Take slaveholders: they used that kind of argument to push back against slave rebellions—it would destroy the antebellum south, socially and economically. That is not an argument worth keeping. I am not defending the Ridglan approach, but I am cautioning against these types of dismissals of what they are doing.
3. Historically, you seem to again misunderstand the nature of law. Slaves were considered property, and their rebellions certainly damaged their property status. They also moved the north toward abolitionism. Elements of the civil rights movement and the labor movement in the United States engaged in tactics that damaged property and yet ultimately won reforms. Property is a convention—it should not interfere with moral obligation and serious moral intervention.
I’ll just mention one thing about donors. Doing what is right sometimes risks making certain people unhappy. That’s why social movements shouldn’t rely on the whims of wealthy individuals. The right donors will see the right priorities for what they are. The priority should be supporting one another within a social movement.
Thanks for engaging with the post! You made a lot of different points, so I’ll do my best to separate them out and consider them one-by-one:
(1)
I’m not making an argument for quietism. Saying that we have an obligation to follow the law is compatible with having obligations (even extraordinarily strong ones) to use non-illegal means to combat injustice (e.g. by advocating for changes to laws).
It’s a genuinely interesting point that many of our laws are inherited traditions rather than the direct product of the democratic process. However, I don’t think that’s a strong argument in this specific case. The US has had true universal suffrage for more than 60 years, and in that time Congress and state legislative bodies have passed many laws related to the treatment of animals and the criminality of trespassing. Under any reasonable interpretation of democratic legitimacy, a democratically elected legislative body specifically dealing with an issue, and choosing to pass laws that accept the underlying common law principles and add specific penalties, related rules, etc., should confer that legitimacy.
I don’t disagree that a reasonable contractualist would think that there are cases where it would be justified to break an unjust law. The core question is whether the required conditions hold in this case. Democratic legitimacy is one important part of that, since reasonable contractualists generally would give some weight to whether laws resulted from a just process. A point I didn’t make in the OP, but I think is relevant here, is that even if you disagree about the democratic legitimacy argument, I think the specific nature of the lawbreaking here falls outside many notions of justifiable civil disobedience. That’s because the Ridglan rescues involved breaking a law to achieve a non-symbolic end (rescuing the dogs), not merely symbolically challenging a law by breaking it.
(2)
I think you’re moving between a couple different notions of universalizability here. It’s true literally everyone breaking and entering in service of moral aims is a far-fetched idea. But it’s still coherent to ask whether a tactic would have positive or negative effects if commonly used across social movements. Democratic societies can and have experienced periods of widespread civil unrest.
I agree that a similar argument could have been deployed against rebellion by enslaved people, but I think the analogy is weak because of the specifics. Slave rebellions occurred in a society where the affected population was excluded from political participation, and the principles justifying them were self-limiting in the way I described in the OP. The current case is different: the affected parties (animals) can’t be enfranchised, but the human population that cares about animal welfare has full political participation, and the legal channels for advancing animal welfare are open and have produced incremental gains over recent decades (most notably the transition of nearly half of the US egg supply to cage-free). The bar for lawbreaking is plausibly higher when those channels are more responsive than when they’re closed.
(3)
I think you’re being too quick to dismiss property as being something that can drive moral obligation. There are clearly many cases where we are obligated to not destroy or interfere with others’ property, as is obvious in cases where vulnerable groups’ property rights are infringed. The way I’d think about this is that obligations to protect property are stronger the more just a society’s system of property rights is. In a more just society, property destruction not only weakens otherwise-good norms, but is also more likely to be the result of a miscalculation: if the base rate of unjust property ownership is lower, then any given case of someone believing property destruction is justified is more likely to be wrong. So both the rule-following considerations and the act-utility considerations point toward higher property protection in more just societies.
(4)
I disagree that the priority should be supporting one another within a social movement; the priority should be trying to do the most good. Donor considerations reasonably impact that calculation, both because money is a necessary ingredient for advocacy and because donor preferences may reflect genuine moral views that are worth considering. But I do also agree that it can be worth trying to convince donors of an approach rather than just deferring to their preexisting preferences.
State laws are path dependent, and rely very often on common law principles and concepts uncritically applied. That does not equate to democratic legitimacy for every codified version of property and criminal law.
I think we have fundamentally incompatible views on the appropriate frame to apply to balancing questions—I am not at all a utilitarian, and I don’t think you should be either. But I’ll set that aside.
You again seem to conflate lawbreaking with immorality. Please don’t do that. Rosa Parks broke the law. So did the Ridglan rescuers. That doesn’t make what they did wrong. That’s a separate question. The symbolic/non-symbolic distinction is not one I find compelling.
You seem to see humans and animals as categorically different from a legal perspective. Humans care for animal welfare but animals have no voice. I fundamentally disagree — animals should have rights, including a right to a voice. That means they should have institutions that represent their interests. So there is no fundamental difference between the slave revolts and what people are trying to do here for animals in terms of voice, as their representatives. I suggest Zoopolis on this point.
Your articulation of the EA bias toward donors is particularly problematic. Successful social movements historically have not relied on extremely wealthy individuals for funding. Your concern to persuade donors is troubling. You should be worried about persuading people, not persuading donors. Lots of people are going to be needed to change social perspectives and institutions around animals. A social narrative that focuses on “earning to give” or major donors is likely to be unmoored from a durable movement.
For what it’s worth, I’m surprised that people who think abortion is murder aren’t doing more illegal stuff to destroy clinics.
Also for what it’s worth, I think factory farming is so bad that it’s by far the greatest injustice caused by humans in history. Justifiable wars have been fought for orders of magnitude less important things.
consider for example what it would be like to live in a country where hundreds of activists were regularly smashing their way into abortion clinics, energy companies, and refugee assistance nonprofits with sledgehammers and crowbars
I think this is a good/fine question and the answer is “they’ll go to jail and then stop”. I think maybe you’re conflating this question and the following:
consider for example what it would be like to live in a country where hundreds of activists were regularly smashing their way into abortion clinics, energy companies, and refugee assistance nonprofits with sledgehammers and crowbars and also we live in a tenuous society with only vigilante or no law enforcement
Plausibly the morally correct answers are different. If your policy might cause total collapse of social order (irl, not in a nested thought experiment), maybe you shouldn’t do the nonviolent disruptive protest, but if you live in the real US where you largely internalize the negative consequences and others are similarly dissuaded and you still find the ~1st order effect worthwhile, then go right ahead
It’s like a sin tax (not a perfect term here tbc) - you want some amount of Pigouvian tax on the thing that you’re worried is not generalizable (or good if it generalizes). If you find that the action is worth it to you tax included, then godspeed. It would be fallacious to say “in addition to the correctly priced carbon tax you’d be paying on the gas, consider your impact on the environment by driving”
I think we disagree about whether the harms of lawbreaking are mostly internalized. The degradation of social trust in the deliberative process seems bigger to me than the consequences to the individual? As an analogy, shoplifting is an ordinary crime where individuals do face real consequences, but the diffuse harms to consumers and businesses (goods locked up, stores closing) are large and dominate the social calculus.
The Pigouvian tax comparison doesn’t quite work here because paying a tax contributes to public resources that can directly address the harms of the act or improve welfare elsewhere, making the net outcome neutral. Going to jail doesn’t repair damaged property or restore trust in the democratic process.
1. I think we disagree about whether the harms of lawbreaking are mostly internalized. The degradation of social trust in the deliberative process seems bigger to me than the consequences to the individual? As an analogy, shoplifting is an ordinary crime where individuals do face real consequences, but the diffuse harms to consumers and businesses (goods locked up, stores closing) are large and dominate the social calculus.
I don’t think we should speak of “lawbreaking” as a general case in this context; some argue that shoplifting is too lightly punished/prosecuted (especially in e.g. liberal US cities), but even assuming that’s true, the question remains whether the more specific category of, say, “property damage via protest” is punished too lightly, too harshly, or about right.
My best guess is that it’s not “too lightly” from a purely normie “law and order and human welfare right now” perspective. Many people believe moral-ish things strongly and don’t find property destruction immoral, but far far fewer actually destroy the property of those they think are doing something immoral. This seems like good evidence that the expected punishment (including via informal mechanisms) is not too light.
2. The Pigouvian tax comparison doesn’t quite work here because paying a tax contributes to public resources that can directly address the harms of the act or improve welfare elsewhere, making the net outcome neutral. Going to jail doesn’t repair damaged property or restore trust in the democratic process.
I think we are/were both sort of failing to decouple Pigouvian taxes and restitution. My understanding, about both how the term “Pigouvian tax” is used in econ and about the real world, is that even without restitution, you can get to the socially optimal level of some bad with a tax alone and no transfer to victims.
I think the motivating intuition is that the tax is affecting the amount of eg “social disorder” supplied, but the tax revenue is just a transfer of economic power from one party to another—it’s not creating real wealth that can then be given to the victims. So the same amount of real wealth exists before and after the transfer and a separate question is what to do with that wealth given the state of the world (eg you might think that the very well-off who are harmed slightly by some negative externality, say ambient noise, should not be given restitution and a tax on decibels should really flow to some other party like the very poor)
Many people believe moral-ish things strongly and don’t find property destruction immoral, but far far fewer actually destroy the property of those they think are doing something immoral. This seems like good evidence that the expected punishment (including via informal mechanisms) is not too light.
I think that this is at best weak evidence. Activists’ decisions of whether or not to commit crimes are surely influenced by norms, not just the expected intensity of punishment. The recent history of climate activism in the UK is a good example. As far as I can tell, nothing changed about UK law to cause the rapid rise of high-profile lawbreaking by Extinction Rebellion and then Just Stop Oil in the 2018-2023 timeframe. The UK government did in the end stop the activists through increased legal penalties (going from typically no prison time for nonviolent lawbreaking when motivated by ethical concerns to 4+ year prison sentences becoming common for the more serious cases). But something other than threat of prison time was keeping climate activists from using these tactics in the early-to-mid 2010s.
On (2):
I agree that a Pigouvian tax doesn’t require restitution (as I indicated by including “improve welfare elsewhere” as something that can be done with the tax revenue). But the classical formulation (in which the optimal tax rate fully eliminates deadweight losses) does require that a dollar of consumer/producer surplus and a dollar of tax revenue produce the same social welfare. If a dollar of tax revenue produces less social welfare, then the deadweight loss cannot be eliminated.
To make this more concrete, I want to dig into an example based on your comment about driving in a world with a carbon tax. Consider taking a long trip by car rather than train under 3 different taxation schemes. Let’s assume you value the convenience of the car over the train at $101, the social cost of your carbon emissions is $100, and that all consumers in this world have identical marginal utilities of money.
World A: No carbon tax. You take the trip, gaining benefits you value at +$101 while causing social costs of -$100. In this world, we might say that you’ve done the right thing by driving (since this maximizes utility overall), but that for fairness reasons you might be obligated to donate some money to others, since your utility-maximizing decision also acted as a transfer from others to you.
World B: Carbon tax of $100 on the trip, returned as an equal dividend to all people. You take the trip, gaining net benefits after the tax of $1. The rest of society ends up net neutral (though there might still be particular winners and losers). In this world, you’ve done the right thing by driving, and have no further obligations.
World C: Carbon tax of $100 on the trip, which the government will use to buy $100 of consumer goods and dump them down an old mineshaft. You take the trip, gaining net benefits after the tax of $1. The rest of society still experiences the social costs of -$100, which the tax doesn’t do anything to reduce. In this world, you’ve clearly done the wrong thing by driving, since you caused a net utility loss of $99.
My claim is that doing crimes is similar to deciding to drive in World C. The “tax” on crime is imprisoning the criminal, which causes them to pay large costs in terms of their lost freedom and ability to work and doesn’t do anything to benefit society. And in fact it’s worse than World C, since the rest of society needs to pay the additional costs of arresting, prosecuting, and jailing them. So I think the Pigouvian tax analogy does not hold here, and it’s wrong to think that the harms of crime are properly internalized.
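To make the arithmetic of the three worlds easy to check, here’s a minimal Python sketch. The `recovered_fraction` parameter is my own illustrative knob for how much of the tax revenue ends up producing welfare (1.0 for World B’s dividend, 0 for World C’s mineshaft):

```python
# Net social utility of taking the trip in each world (illustrative numbers from above).
def net_social_utility(benefit, social_cost, tax, recovered_fraction):
    driver = benefit - tax                            # driver's surplus after paying the tax
    society = tax * recovered_fraction - social_cost  # what society gets back, minus the harm
    return driver + society

BENEFIT, SOCIAL_COST = 101, 100

print("World A (no tax):     ", net_social_utility(BENEFIT, SOCIAL_COST, tax=0,   recovered_fraction=0.0))  # +1
print("World B (tax rebated):", net_social_utility(BENEFIT, SOCIAL_COST, tax=100, recovered_fraction=1.0))  # +1
print("World C (tax burned): ", net_social_utility(BENEFIT, SOCIAL_COST, tax=100, recovered_fraction=0.0))  # -99
```

On this framing, imprisonment behaves like World C with `recovered_fraction` near zero, and worse once the additional costs of arresting, prosecuting, and jailing are added.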
Not addressing every point, but I think in some respects I agree that crime is C; then how much benefit the criminal gets/values is a case-by-case question, and we can’t just assume that in the irl case at hand the benefit is (in the analogy) $101 instead of $1000.
There’s real deadweight loss from the mineshaft drop/spending money on prisons, but also potentially real value to be gained from the crime itself (canonical case = speeding because your wife is going into labor).
Remember, property damage as activism isn’t like simple theft—the property damage can cost society amount $X, and the benefits of the activism can separately benefit society or be valued by the perpetrator at any other number $Y.
I see what you’re getting at here. But if we agree that the externalities of crime aren’t internalized, then I think we’re just back in the position of the original post. You think the act utilitarian calculus checks you, I’m both skeptical that it does and think that there are non-act-utilitarian reasons why we ought to avoid lawbreaking.
Since you still state that in some instances, some careful lawbreaking can be justified in the pursuit of a just outcome, perhaps you could spend more of the post detailing why you thought lawbreaking was a bad call in this specific instance? This is not clear to me from reading your note.
Great points. Thank you for writing this up. I think it’s a strong and fair critique of the strategy of actions like this, and would love to see more discussion at this high level of context and analysis.
(I expect you understand the legal arguments at play, but I do want to reemphasize for other readers that I stand behind the ultimate legality of all actions I took at the first Ridglan rescue in March, using a basic necessity defense argument: “you’d break a window to save a dog stuck in a locked car on a hot day”, e.g., sometimes property damage is legal to avoid a foreseeable imminent harm. We can argue whether the harms at Ridglan are foreseeable or imminent, but I believe they were and that’s the basis for why I chose to do what I did. I wasn’t there for the one last weekend in April and don’t have a settled opinion about it yet.)
To respond to part of your very good post, I feel that we should be able to discuss and analyze nonviolent direct action and other forms of civil disobedience in EA spaces. I engaged in this action in part because I think EA folks don’t think about this kind of thing enough, and I want to raise the salience of civil disobedience, at least as a secondary or tertiary sort of lever for EA to have. I don’t think it is ever likely to be primary and I don’t want it to be, but I also don’t want it to be ignored, and I think it largely has been around here.
A chunk of your argument boils down to what’s good for the overall EA brand. I strongly agree that there are bright lines I would not want the community to cross (e.g. endorsing or promoting violence). I think nonviolent direct action falls on the “OK” side of the line for me, but I agree there is probably a useful discussion to be had here, and am open to more arguments on this.
What would you say in response to a conservative abortion clinic protestor who makes the same argument you’re making? “It was ethically necessary for me to kidnap the doctor who was about to start their shift at Planned Parenthood. Yes, it’s normally illegal to kidnap people, but those babies* were in imminent danger of being killed by the doctor, and it’s permissible to break laws to avoid a foreseeable imminent harm.” (*The conservative protestor believes that fetuses have equal moral status to babies, the same way you and I believe that pigs have equal moral status to dogs.)
A better analogy for nonviolent direct action would be breaking in to disable the clinic’s capabilities to provide abortions (without harming anybody).
In this case, the protesters would be subject to the same penalties under the law that the Ridglan protesters are. That being said, there is a case that what is happening to the dogs is illegal (311 counts of animal cruelty documented already). I’m wary of an appeal-to-authority bias here—just because they are not enforcing the law doesn’t mean what Ridglan is doing is legal. As pointed out, the necessity defense is being tested here and deserves consideration.
The clearest such historical cases are ones where a disenfranchised group of people broke laws that directly enforced their own exclusion from political participation or basic legal personhood. These cases are self-limiting (and thus pass reasonable tests of universalizability) since the principles justifying such lawbreaking achieve their own obsolescence once participation is granted.
I worry this approach excludes the most vulnerable (those who cannot meaningfully participate in political life, like human babies and animals), and focuses on less fundamental rights: I think protection from torture is more urgent than legal personhood.
Why would women be justified in engaging in civil disobedience to get the vote for themselves, but not be justified in engaging in civil disobedience to rescue babies from Josef Mengele?
I agree that there’s a sense in which the constraints I’m talking about focus on less fundamental rights. But I think the more important sense is that they focus on preserving a viable process for living together in a society with people of greatly differing moral views. That doesn’t mean we have to leave behind other vulnerable groups, just that we have to try and bring about change for them through democratic means.
Regarding the Mengele example, I think it’s disanalogous because it took place in a dictatorship, where the rule utilitarian and contractualist constraints on action look very different.[1] I’m really probing at what constraints EAs should have when acting in the context of a democracy (including a flawed one), not what behavior would be correct in Nazi Germany.
Note that the act utilitarian calculus also changes in a dictatorship. Following the law in a dictatorship is unlikely to be a successful decision procedure for maximizing utility under epistemic uncertainty.
What if “protecting innocent sentient beings from torture” is a higher moral priority than “living together in a society with people of greatly differing moral views”?
I’m sceptical that the distinction between flawed democracy and dictatorship is clean enough to justify civil disobedience on behalf of others only in the latter (if this is what you’re saying). Would you support rescuing American children from deliberate infection with hepatitis at Willowbrook in the 1960s?
On your first question, I think your framing isn’t addressing what happens if other people think the same way. The equilibrium where everyone with strong moral convictions feels licensed to break laws doesn’t seem to me like it’s better for vulnerable groups, just more chaotic. I think that to some extent you’re proposing smashing the “defect” button in a prisoner’s dilemma and hoping the other side doesn’t do the same.
On your second, I agree that it’s not a clear line between flawed democracy and dictatorship, but in the US today this isn’t really relevant.
On your third, I think the Willowbrook example is worth thinking about more carefully. As I understand the history, the binding constraint at Willowbrook wasn’t legal. Many parents and guardians retained custody and could have legally removed their children. The constraint was that families without resources didn’t have a better option. And in the end, legal activism was able to marshal those resources, albeit much more slowly than I would have wished.
I think that to some extent you’re proposing smashing the “defect” button in a prisoner’s dilemma and hoping the other side doesn’t do the same.
I’ve been pondering this. I think your button-smashing characterisation is basically accurate, and it is a leap of faith that those who engage in civil disobedience make: an appeal to the conscience of society, the jury, etc.
You’re right to say that one way to think about universalisability is “if it’s okay for me to break the law to achieve what I consider to be a moral goal here, why can’t everyone break the law to achieve their own moral goals?”. But another way to think about universalisability is to go “if I were the one in Ridglan / Unit 731 / Willowbrook, what actions would I support to end my suffering?”
I don’t know whether it would be illegal for parents to break their children out of Willowbrook, but for the purposes of this question assume it was.
You should volunteer at your first EAG! (Especially if you are a student or early career)
If you don’t have a network in EA, EAGs can be overwhelming. Volunteering gives you a ready-made, organic network.
Volunteering is pretty chill—a lot of the shifts aren’t that hard.
At your first EAG, it’s unlikely that you are using your time so efficiently that a few hours of volunteering would cut into the value of your conference.
Protein/dairy tradeoffs/substitutions make more sense; honey/syrup/agave seem less necessary. For example, waffles, pancakes, french toast, etc. still taste good without much of those, and honey/syrup/agave all seem too sugary to be healthy. Since they seem less necessary, your reasoning makes more sense to me as a case against honey alternatives rather than as a case for honey.
We recently published an interview with Matthew Coleman—another entry in our Career Journeys series. Matthew is the Executive Director of Giving Multiplier, a platform that encourages donations to highly effective charities through donation matching. Before this, he completed a PhD in psychology, researching the psychology of altruism.
The interview covers quite a lot of ground, but a few of the things we talked about include:
The gap between what a career looks like from the outside and what it’s actually like day-to-day.
Advice for people wanting to make an impact through psychology.
The tension between keeping your options open and committing to a path.
Here’s one of our favorite extracts from the full interview:
On engaging with the (often mundane) realities of academic research:
I learned a lot. By the time I started my lab manager role, I was fairly confident I wanted to do a PhD. But my research lab in undergrad, which I loved, was a very small lab where I was working closely with the faculty advisor, and I wanted to try out a larger lab studying different topics to explore a bit more.
As the lab manager of an unusually large lab, I got a bird’s-eye view of a lot of the research projects going on and understood what the day-to-day looked like, whether that was grant applications, hiring and onboarding, or actually conducting research myself alongside my colleagues. I found the experience amazing and fascinating and really intellectually stimulating, which confirmed that I wanted to go the PhD route, so I followed through on my original plan from undergrad.
[…] I was certainly very fortunate to have gotten a lot of hands-on experience in research as an undergraduate, so I think I had a better sense of the day-to-day than many people do. But I do think it’s a very important point, and some related advice I like to give is: when you wake up on a random Tuesday in February, do you actually want to do the things that you have to do? Not just do you like the topics or ideas you’re studying (although that’s of course very important, too). Maybe you read a book, watched a TED talk, or listened to a podcast about some topic you found fascinating, and maybe you do want to pursue work in that domain. But I think the ideas themselves aren’t enough, because you actually have to do the day-to-day work.
So what are the actual responsibilities and tasks you like doing? For example, you may find neuroscience fascinating, but maybe you don’t want to spend a large portion of your workweek interacting with research subjects, running brain imaging sessions, or whatever it might be. In such a case, even if you think the subject matter is fascinating, maybe that’s not the best career fit for you. Or maybe you do also enjoy most of the regular responsibilities associated with that career, in which case it could be a great fit. So I think a combination of enjoying the topic itself plus the day-to-day responsibilities is important. I was lucky that, early in my career, I was able to test it out and experiment with which responsibilities I liked more than others.
Lighting has been getting ridiculously cheaper. And for the most part we seem not to be taking advantage of that positive externality: reducing crime through better lighting. This has been battle-tested as one of the effective interventions for public safety; see Chalfin, Hansen, Lerner & Parker (2022), an RCT in NYC public housing finding ~36% reductions in nighttime outdoor index crimes from added street lighting. Many, many major cities still haven’t copied this at the right levels!
But we’re also getting substantially negative externalities from bright lighting. Office buildings never turn off their lights, because why would they care? Apropos of the new office building that just opened next to my housing. This may feed the NIMBY spirits in me, God forbid. Kyba et al. (2017) document that Earth’s artificially lit outdoor area grew 2.2% per year from 2012 to 2016, with the LED transition producing a rebound effect instead of net savings. Jevons paradox and such.
This also brings all sorts of annoyances. I think malls, pharmacies, and hospitals have all become much brighter since my childhood. I may be more sensorially overloaded than most people, but this does meaningfully affect my qualia, so much so that Pigou himself would collect taxes from the pharmacies with dozens and dozens of LEDs, while Coase would advocate that I have the natural property right of not being assaulted with that many lumens while buying a Tylenol. This affects the wellbeing of more than just me (Cho et al. 2015). But lightly enough, ha, to not be a topic of discussion.
Maybe my biggest medium-term worry about transformative AI, other than the takeover stuff, is a constellation of concerns I sometimes abbreviate to “political economy.” Right now a large fraction of humans in democracies can live and support their families as a direct result of voluntarily exchanging their labor. It’d take active acts of violence to break from this (pretty good, all things considered) status quo. As a peacetime norm, this is unusually good relative to the history of human civilization.
At some point in the future (in the “good” futures, I’d add), there’ll be a natural transition from that to people living and supporting their families as a result of UBI or welfare or other gifts from companies or the State. I.e., they will now be surviving explicitly due to someone else’s largesse[1]. This seems bad!
Unfortunately I don’t have a good answer here, even in principle. But it seems worth considering! I vaguely wish more people would work on it.
State power is of course backed by the threat of violence, so it may not be just largesse. But a) “my desired system is the peaceful default, and it takes violence to wrest me away from it” is more stable and dignified than “my desired system relies on the constant threat of violence to hold”, and b) a fair amount of democratic power comes from the democratic nature (and the ease of mass mobilization) of guns, and this has also been eroded by technological developments in the last century, and will also likely be further eroded by developments in AI.
Windfall shares. Some fraction of AI stocks should be given one-time to every human alive
This still requires some form of largesse/threat, but one-time largesse feels less scary to me than continuously needing to uphold the norm.
And it’s not exactly largesse while people (especially outside of AI companies) still have real power, more like a structured negotiation
For reasons of political-economy realities, probably with more given towards rich countries and/or countries that are closer to developing AGI
I’m imagining maybe ratios like 10:1 (see the toy sketch after this list)
Not sure about the exact amount of shares but should be way more than enough to support everybody indefinitely at significantly above modern Western standards, excepting positional goods
After the initial transfer, this completely solves the largesse and political economy problems. The “dignity” problem of having your consumption no longer tied to your labor is still there but I’m less worried about this (seems more like a framing problem).
Children can still be a problem. My guess is that normal inheritance stuff is enough though in edge cases maybe we say that you aren’t allowed to disown your children completely from your windfall shares.
If people live forever maybe we have a rule that reproduction means a minimum fraction of your shares automatically go to your children I dunno.
Charter. Later on, some version of this is also written directly into the charters of the AIs, so that at minimum something like 0.1-10% of their values ought to care about something like all of current humanity’s preferences
Assuming alignment is solved, now superintelligence is (0.1-10%) on the side of all humanity.
(probably optional) some form of protection against manipulation/theft/expropriation
If there’s a transition period where AIs are good enough to do most work in the economy and generate a lot of wealth and/or disemploy most people, but AI alignment and capabilities aren’t enough for #2 to solve all the new AI-generated problems (e.g. if we’re worried about superpersuader thieves), we have ad hoc paternalism stuff to prevent obvious ways to steal people’s windfall shares.
How heavy the paternalism is depends on how serious the different concerns look. E.g. if AI superpersuasion scams are common, maybe we’d just make it legally impossible to transfer windfall shares, in the same way you can’t legally sell your organs in most countries.
To ease the transition, this should be seen in earlier stages as a complement to existing welfare systems rather than a substitute to them. Eg if someone’s dumb enough to gamble their monthly AI windfall dividends away, different societies can either choose to let them starve or (my preferred solution) still feed them, perhaps until AI-assisted tools can cure their gambling addictions. In general, don’t let “the windfall shares solution can’t solve all of society’s problems” be a blocker to implementing it.
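As a toy illustration of the allocation arithmetic flagged in the list above: every parameter here is a placeholder (the pool size, the 10:1 weighting, and the population split are all invented for illustration, not part of the proposal):

```python
# Toy windfall-share allocation (all parameters are hypothetical placeholders).
TOTAL_SHARES = 1e12                # hypothetical pool of AI-company shares set aside
WEIGHT_RICH, WEIGHT_OTHER = 10, 1  # the ~10:1 ratio floated above
POP_RICH, POP_OTHER = 1e9, 7e9     # rough split: AGI-adjacent/rich countries vs everyone else

weighted_pop = WEIGHT_RICH * POP_RICH + WEIGHT_OTHER * POP_OTHER
per_person_rich = TOTAL_SHARES * WEIGHT_RICH / weighted_pop    # ~588 shares each
per_person_other = TOTAL_SHARES * WEIGHT_OTHER / weighted_pop  # ~59 shares each

# Sanity check: the whole pool is distributed.
assert abs(per_person_rich * POP_RICH + per_person_other * POP_OTHER - TOTAL_SHARES) < 1e-6 * TOTAL_SHARES
```

Whether ~590 vs ~59 shares per person actually supports everyone significantly above modern Western standards depends entirely on per-share value, which is the kind of detail a real version of this would need to pin down.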
__
tbc I don’t think this is an amazing answer. I worry both that this won’t be enough and that we won’t implement anything as good as this. I don’t know what the bottlenecks to better answers are, and why other people aren’t working on this. Two obvious answers come to mind:
It’s just kind of a hard problem!
Most people don’t “feel the AGI”, and the people who do think they have more important/tractable problems to work on.
Claude gives some references to prior work. Maybe the most interesting is Anton Korinek:
Anton Korinek has been the most prolific economist on this. “AI’s Economic Peril to Democracy” (with Stephanie Bell, Journal of Democracy, 2023) is closest to your framing — explicitly argues that the labor-democracy linkage is what makes modern democracies stable and that AI severs it. “Preparing for the (Non-Existent?) Future of Work” (with Juelfs, in the Oxford Handbook of AI Governance) and “Economic Policy Challenges for the Age of AI” (2024 NBER WP) cover the policy space. He’s on Anthropic’s Economic Advisory Council now.
I’ve also had worries there; my naive hope is that there’ll be a meaningful plurality within the controllers of the AI, such that they’ll have to compete for feet. So if you want to grow the amount of matter and energy you govern, you’ll need more people to opt in to your system to justify yourself (unless you want to give a good excuse for everyone else to band together and smite you). Then I hope the world is held stable by something like mutually assured resource exhaustion.
If you squint, I think UBI could function more like a lease on individual consent vs a gift. Hopefully, giving people inherent political value.
But for sure seems dicey; easy to imagine a few people in power colluding to disregard the vast majority of the population.
Thinking of drafting a post on war crimes, trying to answer the following puzzles:
Why do we have a notion of war crimes at all, given how bad war itself is?
Why are some things war crimes and not others?
Why do precursor notions to war crimes appear, independently, in essentially every culture that has fought wars at scale?
Given that essentially every culture has also broken these norms, sometimes spectacularly, why does the norm always come back, and often come back stronger?
Common answers to these questions seem profoundly misguided. The naive answer, that war crimes are simply the most horrible things that we all collectively agree are wrong, does not survive even five minutes of scrutiny. More sophisticated versions of that argument also do not survive scrutiny: Just War theory is similarly flawed and question-begging on the descriptivist front, and the Schelling-shaped argument, that war crimes can’t limit all of war’s badness but are aimed at curbing the worst excesses, does not explain why mass bombings and medieval sieges are/were considered acceptable while false surrender is not.
The “cynical” answers are (differently) flawed. E.g. some people think war crimes are completely fake and that anything other than total war is just modern virtue signaling, ignoring the thousands of years of documented history we have on precursors to war crimes (Xerxes in the 400s BC: “The Spartans, when they do such things, overthrow all law and justice among men.”). If anything, the modern version of “total war” is much newer than the idea of war crimes. Similarly, a naive “power analysis”, in which war crimes are simply defined by the powerful to limit the options of the powerless, ignores that powerful people are often themselves constrained by these norms, sometimes hugely.
Instead, my core answer here is surprisingly simple: A “war crime” is, in its oldest and clearest form, the category of acts that destroy the means by which wars can be ended. The prohibitions track not the moral worst things people do in war, but the acts that, if generalized, would turn every future war into a total war.
I don’t think my theory here is very novel. Indeed, as I’ll discuss, this theory is literally thousands of years old and likely arose independently in many places. I will try, however, to make my post the best modern articulation of these ancient ideas.
No offense Linch, but aren’t these questions for jurists, historians and philosophers? Why should you develop the answers from first principles, so to speak? I’d get writing a blog post about a journey through such sources and what their theories are, but I think trying to answer such questions ourselves is not very robust.
This is not a criticism of you personally—developing ideas that require domain expertise from first principles is an approach I often see in EA and I think it’s a wrong one.
My experience with trying to investigate various questions is that it’s pretty hard to ex ante predict which things already have substantial attention from the “experts”, and many questions that seem important fall through the cracks (for some EA-relevant examples that come to mind: optimal charity, AI risk, pandemic preparedness, the impact of incarceration on crime).[1]
In this case, I don’t think the specific question I’m interested in has attracted a lot of academic attention/I don’t believe any single field has a good unifying theory. Just war theory is overwhelmingly normative with limited descriptivist content. IHL scholarship is interpretive and doctrinal. IR/game theory scholarship has partial answers, but afaik no one has synthesized them into a structural theory of war crimes, etc, etc[2].
Second, I’m definitely doing the reading here! I’m sure I’ll miss some things, but I’m certainly working through a bunch of sources, including historical ones.
Third, if you believe the most useful thing to do here is a literature review, be the change you want to see in the world! If you think what’s missing is a pure literature review summarizing various theories, go for it! I’d be happy to read your review.
Watts (2013) talking about perfidy specifically is the closest. But tbc there have been many shadows of this theory/hypothesis over the centuries, since at least 2400 years ago and probably long before.
Your experience reminded me of how Holden Karnofsky described his career so far:
The general theme of my career is just taking questions, especially questions about how to give effectively, where it’s just like no one’s really gotten started on this question. Even doing a pretty crappy analysis can be better than what already exists. So often what I have done in my career, what I consider myself to have kind of specialized in, in a sense, is I do the first cut crappy analysis of some question that has not been analyzed much and is very important. Then I build a team to do better analysis of that question. That’s been my general pattern. I think that’s the most generalizable skill I’ve had.
I guess a “but can’t we, like, just outlaw all war?” approach is not the standard one so I’m at least interested in what answers you may find. Especially with me coming from a very, umm, war-prone country...
You might like this post I wrote earlier about the bargaining theory puzzle of war. I engaged with the academic literature on the subject pretty significantly, particularly James Fearon, so you might like it. On the other hand, Fearon himself mostly reasoned from first principles rather than conducting a careful historical assessment, so in that regard it might fit your interests less.
The post never got very popular, but a few people who read it carefully really enjoyed it. One of the better compliments I’ve gotten on my writing: somebody who had read my post and several books on the subject said they were surprised that the post gave them >50% of the value of an academic book on the subject.
I like the puzzle. But I wonder if you can make your answer even simpler:
Actions taken in war have some benefit to the perpetrator, and some costs to the larger system of permitting them
When the ratio between these things gets too extreme, it’s regarded as a war crime
I think this explains the category that you outline (undermining trust in the kind of institutions that could stop the war is super destructive!), but also explains some other cases, e.g. the prohibitions on abusing prisoners, impersonating medical staff, etc.
Yeah this is fair. I outlined something like that here.
I think there are a few tricky things with this model. One is lack of precision, e.g. by whose lights are you interpreting “costs to the larger system of permitting them”? Relatedly, an advantage of advocating “war crimes are crimes against the end of war” is that it creates a clear core (even if it doesn’t describe everything) of norms that I think are a good description of commonly shared norms in history, and that I think are good to uphold morally[1]. In contrast, many other norms of war tend to be more sporadic, like protecting civilians, chivalry, or diplomatic precedence.
Another tricky thing is Schelling’s point that almost all conflict is non-zero-sum: you can’t treat the zero-sum parts of war and the non-zero-sum parts as cleanly separable.
(I’d also note that torturing POWs makes surrender less appealing, so it’s consistent with my narrower answer. My narrower answer would also predict that protecting civilians is important but not very important, which is consistent with the historical record. On the other hand, it does not have an explanation for weapons bans; my defense is that a decent enough simple theory in social science doesn’t need to explain everything.)
I agree that the model I proposed is imprecise; I think this counts against its usefulness but not its validity.
I’m not suggesting this as a thing to advocate for; merely as a descriptive pattern of what the category of war crimes is doing. I think the things which make ending war harder are an important class of really destructive thing, but it seems clarity-obscuring to me to claim that this is definitionally what war crimes are? Rather than giving your thing a new label and then getting to discuss what fraction of war crimes are in that category, and whether there are things in that category which aren’t war crimes (e.g. if torturing POWs counts under your categorization, then why doesn’t conscription count—after all, it damages the “one side runs out of soldiers” mechanism for ending war).
merely as a descriptive pattern of what the category of war crimes is doing. I think the things which make ending war harder are an important class of really destructive thing, but it seems clarity-obscuring to me to claim that this is definitionally what war crimes are? Rather than giving your thing a new label
Fair, I guess the thing I’m interested in is something like “widely shared and independently recurring norms of war.” Though I’d want to be narrow enough to exclude stuff like “norms of war include paying your soldiers and having okay logistics planning” or “norms of war descriptively include being total morons sometimes in XYZ ways”.
e.g. if torturing POWs counts under your categorization, then why doesn’t conscription count—after all, it damages the “one side runs out of soldiers” mechanism for ending war
Right, sorry. I do think the cost/benefit ratio matters significantly here.
Ok, so one place the predictions of these theories might come apart is that my theory suggests a norm against impersonating medics, whereas I think yours doesn’t (although maybe I’m just not seeing it; I don’t think I would have said that avoiding torture of prisoners was part of protecting the mechanisms of ending war, although I do kind of see what you mean). I haven’t looked into it at all, but if that norm has emerged independently multiple times that would be suggestive in favour of the broader theory; whereas if it has just emerged once it looks perhaps more potentially-idiosyncratic, which would be suggestive in favour of the narrower theory.
It’s an interesting idea but as expressed feels a little tendentious, particularly if one looks at what is actually formally considered a war crime these days (much of which would not have been recognised as war crimes in the past, including by actors who believed themselves to be unusually chivalrous). Hard to believe it will be impossible to avoid total war if a few civilians are murdered or chemical weapons are used, never mind if a pilot gets shot after ejecting or if a spy is not afforded a fair trial, and hostage taking was once considered a good way to avoid total war. We see peace agreements between prolific war criminals quite often too. Avoiding total war might be a motivation, but it can’t be the only one.
On the other hand game theory favours opposing sides agreeing to not shoot ejecting pilots or torture each others’ prisoners even if they don’t agree on anything else, whereas it is impossible to win a war if you are not permitted to fire at the other side’s soldiers, and at least in 1949 artillery and aerial bombardments were also too critical to winning for the Geneva Convention to agree to ban them. It is possible for opposing sides to agree not to shoot at people that don’t wear uniform, but only if both sides treat sneaking up on the other side and shooting them whilst not wearing uniform as also a crime.
Also, many people genuinely believe in the idea that people shooting other people in a uniform which indicates they intend to fire back represents some sort of fair play (even if the targets happen to be sleeping conscripts who haven’t had a chance to surrender yet) and the sort of people that believe in that sort of thing are disproportionately likely to be military officers. They tend to believe in just wars too...
I do agree that opposing sides are considerably more likely to respect conventions on war crimes (and even reach other bargains like prisoner swaps) whilst the infrastructure that may allow the sides to mutually end the war still exists. But there’s plenty of evidence of war crimes committed with impunity in conflicts that never came close to total war, and for that matter of individual military units choosing to abide by conventions despite there being no realistic prospect of a near-term peace agreement and plenty of war crimes being committed by others on their side.
I think the language I used above is more deontological/universalizing than ideal. I agree it’s more of a gradient than anything else. I also think some of the biggest classical norms (“don’t shoot messengers/envoys”), while still important today, are less so in the age of wireless communication, mass media, and email. I also think my primary thesis addresses the benefits of having “war crime” norms, but norms in practice are about both benefits and costs, and some of your comment here addresses costs (which are of course also important).
A quick reminder that applications for EA Global: London 2026 close this Sunday (May 10)!
We already have more applications than last year, and this looks set to be our biggest EAG yet (again)! If you’ve been meaning to apply but haven’t gotten around to it, this is your sign.
The admissions bar is more accessible than people often assume. If you’re working on or seriously exploring a high-impact problem, you should apply.
This is the EAG I’ve been most excited to put together yet. I’d love to see you all there.
📍 InterContinental London, The O2 · 29-31 May 2026
⏰ Applications close: Sunday, May 10
🔗 Apply here
Despite the real risk from hantavirus being low, it is getting covered a lot in the media right now. I think this is actually good. A lot of people had already forgotten about the pandemic we had not that long ago and moved on to worrying about other problems currently dominating the news cycle. Hopefully this serves as a (small) reminder that pandemic preparedness/biosecurity really does matter.
I was just watching the Dwarkesh/David Reich podcast; fascinating stuff. Looking back at how I was taught taxonomy and anthropological history, I find it frustrating. Note that I don’t know much about (evolutionary) biology, genetics, or the frontier of genetic-history research, so this is my layman’s attempt to explain why the way this has been explained to me (by people who probably don’t understand it either) has generally been puzzling. I’m not trying to propose that I understand something David Reich doesn’t.
My main gripe is that we are taught evolutionary history mostly through the lens of evolutionary trees. But evolutionary history probably looks like a graph/stochastic process/Markov chain, and only at very specific underlying parameters/levels of abstraction is it well modeled by a tree. The reason we use trees is that they are the most sensible simple abstraction in some ways, if you are thinking about “how did we get here?”. But it’s not a great way to think about “what happened/was happening”. I had ChatGPT try to illustrate the difference below (don’t look too closely into the details, it did some hallucinating; just the general vibe).
Taking plausible parameters here, to me, would mean thinking mixture isn’t extremely likely over short time spans because of distance/etc., but quite likely, and almost certain, over hundreds/thousands of years. So what did the Near East look like genetically 60k years ago? It could easily look like the below.
It seems totally possible that for long periods of hominid history, genetics were well modeled by pretty smooth stochastic graphs, with generally corresponding smooth genetics across geography (obviously with tons of exceptions, or less true when you zoom in, e.g. the Bell Beaker/Corded Ware cultures), and yet when you look at our specific lineage it doesn’t quite look like that (due to extinction, gene selection, or some other reason). I don’t have a clear enough vision to say much more, but I think there are some interesting implications for what we mean when we say this group or that group went extinct.
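To make the tree-versus-graph contrast concrete, here’s a minimal sketch (my own illustration, not the ChatGPT output referenced above; the population names and mixture weights are entirely made up). In a tree every population has exactly one parent; in an admixture graph some populations draw weighted ancestry from several parents:

```python
# Toy ancestry structures. A tree allows one parent per population;
# an admixture graph allows weighted mixture from several parents.
tree = {
    "HunterGatherer": [("AncestralPop", 1.0)],
    "EarlyFarmer": [("AncestralPop", 1.0)],
}

admixture_graph = {
    "HunterGatherer": [("AncestralPop", 1.0)],
    "EarlyFarmer": [("AncestralPop", 1.0)],
    # A population formed by mixture, which no tree can express:
    "LaterPop": [("HunterGatherer", 0.4), ("EarlyFarmer", 0.6)],
}

def lineage_shares(graph, pop):
    """Fraction of `pop`'s ancestry flowing through each named ancestor."""
    shares = {}
    def walk(node, weight):
        shares[node] = shares.get(node, 0.0) + weight
        for parent, w in graph.get(node, []):  # roots have no parents
            walk(parent, weight * w)
    walk(pop, 1.0)
    return shares

print(lineage_shares(admixture_graph, "LaterPop"))
# {'LaterPop': 1.0, 'HunterGatherer': 0.4, 'AncestralPop': 1.0, 'EarlyFarmer': 0.6}
```

The point is that “LaterPop” reaches the root along two distinct weighted paths; zoom out and the structure is a smooth weighted graph, not a branching tree.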
Lauren and most of the authors will be on the Forum to answer your questions throughout the week. More info to come on Monday, but I figured I’d mention in case anyone wanted to read the articles in advance (they are here, and all authors apart from Paul Niehaus will be around to answer questions).
Earning to give is lonely and requires repeated decisions. This is bad.
If you’re earning to give, you are lucky if you have one EtG team-mate. The people you talk to every day do not have moral intuitions similar to yours, and your actions seem weird to them.
If you do direct work, the psychological default every day is to wake up and do work. You are surrounded by people who think the work is important, and whose moral values at least rhyme with your own.
If you earn to give, most days you do not give (you’re probably paid bi-weekly, and transaction costs discourage even donating that frequently).
These differences apply continual pressure for EtG folks to become less hard-core than we intended to be. I wish I had more counter-pressure.
I’m EtG and would love to connect with others. My DMs are open! A bit about me: I’m a SWE based in Europe, and my preferred cause area is animal welfare.
We have a regular EtG meetup in London. You might be interested in setting up something similar where you live, perhaps branching off a preexisting Effective Giving/Giving What We Can group?
Besides my more cold-hearted response below: I agree that EtG is lonely. You are lucky if you have one other EtG’er in the same city. EtG is rare. EtG while feeling committed to it is even rarer.
“If you do direct work, the psychological default every day is to wake up and do work. You are surrounded by people who think the work is important, and whose moral values at least rhyme with your own.”
This is true for me, and true for many in richer countries, such as the awesome AIM crew. In low-income countries, though, many if not most employees (especially in BINGOs) are in non-profits for the money and status rather than the value of the work. I know a number of Ugandans who found this difficult: they cared about the work while their colleagues were just trying to weasel away as much money on allowances as they could, while trying to unnecessarily extend projects to keep their salaries going.
I think your major point stands, but direct work doesn’t universally come with motivated and encouraging peers.
You don’t need to make a donation decision biweekly. That sounds incredibly tiring. You can also set up a recurring donation to the same charity or fund. Or save up and donate once or twice a year. That has the advantage that you can plan some time to think well about your decision.
Somewhat meta point on epistemic modesty; I’m calling it out here because it is a pattern that has deeply frustrated me about EA/rationalism for as long as I have known them. (Making a quick take rather than commenting due to an app.operation_not_allowed error—I’m responding to @Linch’s quick take on war crimes.) I guess these are just EA/rationalist norms, but an approach that glosses major positions as so quickly dismissible strikes me as insufficiently epistemically modest. I would expect such a treatment to fail to properly consider alternative answers or intuitions to the author’s own, especially the strongest versions of those answers (e.g. modern just war positions); to miss the most sophisticated counterpoints (e.g. your ‘oldest and clearest form’ gambit may just be bracketing out the counterexamples that don’t fit your definition, like genocide or sexual violence); and to reinvent the wheel. E.g., the view seems to be exactly this, from 2013:
“A final rationale for the perfidy prohibition is to preserve the possibility of a return to peace. To prevent the degradation of trust and the bad faith between warring parties that would impede negotiation of peace terms. An effective perfidy prohibition preserves the good faith upon which ceasefires, armistices and conclusions of hostilities rely.”
I think deep engagement with the range of serious views on the topic is required to make your post “the best modern articulation of these ancient ideas”. I don’t think the quick take seems on a good track for that.
I feel like I’ve heard this position a lot before, and I have some sympathy for it, but I feel like it implicitly overlooks a lot of what I find valuable about writing EA Forum comments, and it sets an overly high bar.
When one writes academic papers, one is expected to cite relevant previous work. Citation is an important mechanism for tracing the evidence for claims and for assigning credit. Even in academic spheres, I think this is perhaps taken pathologically far (to the point where it probably sometimes is unduly burdensome, and vaguely implies that pretty obvious ideas or hypotheses had to have come from someone else as opposed to being generated by the author), but the reasons why it’s important to cite your claims seem a lot stronger in academia.
The EA Forum is partly intended, I believe, to be a place where people are encouraged to say things more quickly and speculatively after having done less research, and where people are more encouraged to share their own overall judgments and thinking process without necessarily fully defending all their positions. You might think it’s bad to have such a place and that people should mostly just rely on the academic literature. I disagree with that, but trying to make the EA Forum use the same standards that academia uses seems counterproductive. We can just use academia for that.
And at least in my mind, a big part of the point of writing things like what Linch wrote is to practice critical thinking skills and apply them to new areas, for eventual use in areas where there’s not already a lot of scholarship. So I value approaching an area I don’t know much about, like the topic of war crimes, trying to understand it on my own, seeing how far I can get, and forming my own view, rather than treating this strictly as an opportunity to practice building on existing literature on war crimes (or worse, just regurgitating that literature undiscerningly).
Thanks for the reference, and the point that the structural argument doesn’t handle all modern cases as well. Will address both in the post.
Though I’m confused. If you’re accusing me of reinventing the wheel, why reference Watts from 2013 and not Grotius in 1625? Or Diodotus in 427 BC, or other references in Thucydides?
I think EAs if anything are far too epistemically modest and unwilling to stick their neck out for defending true and accurate positions. I also find demands to police epistemic modesty based on hastily written quick takes annoying.
The de facto outcome if I take these concerns seriously is to showcase much less of my intermediate thinking, to bulletproof all my writing before it sees the light of day, and/or to crosspost to the EA Forum less.
I’ve indeed been taking actions like this, especially the last one, due to comments like yours (this is the third time you’ve done this), though I’m unsure if I endorse it on net.
I’m not trying to be unkind, and I apologise if I was. I’ll take this down if you ask here or via DM. I overreacted to what is a quick take because I think it was emblematic of a bad pattern—but that is unfair and disproportionate of me. My main thing here is to push for better intermediate thinking. Like the standard EA/rat approach is so often based on dismissing mainstream or non-EA views, and then acting like their individual opinion is clearly superior, often reinventing current or past views that have had lots of non-EA examination. I want EA thinking to be better, and a lot of the time it would be improved by people reading more before opining, and not thinking the views of EA are so special.
I think EAs if anything are far too epistemically modest and unwilling to stick their neck out for defending true and accurate positions.
We just have very different experiences then.
(this is the third time you’ve done this),
Do you mean critique someone on epistemic immodesty grounds? This is probably true but can you point me to the examples you have in mind? (I may indeed be doing this too much and seeing the examples would help)
Thanks for the much kinder response and the serious engagement! :) Please don’t take your comment down, it’s good to have this discussion in the open.
(Also apologies for the long comment, brain not working really well so less succinct than I want to be)
My main thing here is to push for better intermediate thinking. Like the standard EA/rat approach is so often based on dismissing mainstream or non-EA views, and then acting like their individual opinion is clearly superior.
I want to defend my own approach here, and won’t speak for the “standard EA/rat approach” except insomuch as my thinking is constitutive of that approach (as the old joke goes, “you’re not in traffic, you are traffic”). Generally, when I try to learn information about the world, I seek facts and models that are
interesting (ie, novel to me)
true
useful
The best way to do this typically involves some combination of Google searches, original thinking, reading papers, conversations, toy models, and (since ~2025) talking to AIs[1]. Since college, I’ve honed an ability to form views very quickly that I can defend, and believe I’m reasonably calibrated on. I think this is sometimes surprising to people, but it shouldn’t be. The first data point tells you a lot[2].
Similarly my bar for publishing my thoughts, ignoring opportunity cost, is also fairly low. The primary thing I’m interested in from a content perspective is some combination of novel/true/useful to my readers. Novel to whom? For me I have an implicit model of who my readers are and I try to calibrate accordingly. I want to write things that are new to a large fraction of my readers. I think you might have more of an academia-derived model where it’s very important to only share thoughts that are novel to humanity.
I think this is less good of a norm. If I can write a better intro to stealth than is widely understood/disseminated, I think this is a useful service even if no individual point there is original.
Similarly, I think it’s less important in non-academic contexts to attribute the originators of an idea or an analysis. I don’t think it’s useless, I just think it’s less important. But if I’m thinking about a problem, the academic citations are mostly directly useful inasmuch as they benefit either me or my readers, rather than being the first line of attack.
To be clear, credit attribution is valuable and I want to avoid actual plagiarism (I think academic norms are valuable in a bunch of ways and I want to respect the institution even when I disagree with it).
Also, this may be nonobvious, but I do in fact “do the reading” and “expert engagement” significantly, often past the point of diminishing marginal returns compared to honing my own thinking or writing.
For example, in my earlier post on war, where I summarized and extended James Fearon’s bargaining model, I read Fearon’s paper and skimmed a bunch of others to form a gestalt view. I also emailed my post to both Fearon and another academic on war (Fearon replied positively, the other academic didn’t respond).
In my Chiang review I read something like 10 reviews before starting my piece, and maybe more like 20 before finishing it.
And for war crimes in particular, I’ve been reading about it casually for several years. See here for one example.
I also think it’s very easy to say “do the reading,” but in practice what reading you do is highly contingent, and it’s easy to waste a bunch of time feeling virtuous for doing the homework on adjacent topics without actually learning useful things for addressing your original question. For example, you seem to believe that I should be reading the latest academic papers on just war (a plausible enough hypothesis!). Someone on Substack (with a relevant background!) suggested I read the negotiating history of the Geneva Conventions and their Additional Protocols (also plausible!). Someone on LessWrong suggested I read Tom Schelling’s treatment of the subject (plausible enough, I ordered the book). And these are just the ones that I think are sufficiently plausible! There are so many other ways to burn time seeming to do the reading instead of committing to a hypothesis and seeing where it lands.
Finally, I’d note that when you said people’s arguments are
so often based on dismissing mainstream or non-EA views
there is a major selection effect. If I think a mainstream view is both true and introduced well, I usually don’t bother writing about it.
---
> I think EAs if anything are far too epistemically modest and unwilling to stick their neck out for defending true and accurate positions.
We just have very different experiences then.
Concretely I think Bentham’s Bulldog/Matthew comes across as overconfident on his blog, as does John Wentworth on LessWrong. But most randomly selected writers on EAF and LW are underconfident and often hedge in 10 words what they could say in 3.
Maybe a background methodological difference here is that I strongly agree with Scott Alexander on the most useful forms of criticism (highly specific, targeted, concrete). Whereas I’m skeptical of deep paradigmatic criticisms really being correct, changing people’s minds, or overall being insightful/true/useful.
(this is the third time you’ve done this),
Do you mean critique someone on epistemic immodesty grounds?
I meant respond to my comments or posts in a way that seems asymmetrically easy to make but very hard to respond to/argue against on the object level. I don’t want to dredge up the links, sorry.
Anyway, thanks for the response and for giving me an opportunity to elaborate my thoughts and overall position here.
this is for the types of questions I’m interested in, and my workflow. A historian might do more primary source hunting/archival research offline. An ML researcher might run more experiments, a biologist might work in a wet lab, or a field, and so forth. In the past I also did more expert interviews.
If you have a model of the world/human epistemics where surprisal value is constant across learning time, or even that it’s highly superlinear per topic, then you might prioritize your actions very differently from me.
This is too tangential from the forecasting discussion to justify being a comment there so I’m putting it here:
Forecasting makes no sense as a cause area, because cause areas are problems, something like “people lack resources/basic healthcare/etc.”, “we might be building superintelligent AI and we have no idea what we’re doing”. Forecasting is more like a tool. People use forecasting to address AI, global poverty, and all sorts of more general problems, including ones that aren’t major EA focuses.
For instance, we could treat vaccines as a cause area. All the funding to some AI-x-biosecurity people, GAVI campaigns for existing vaccines, and people working on bird flu vaccines could be treated like they’re doing the same thing. And then we could argue about whether vaccines meet the funding bar. But that would be a pretty pointless argument, when really all those projects are trying to do different things with similar tools.
So I’d rather judge the AI forecasting by AI standards, the general-purpose forecasting by metascience standards, and the global development forecasting by global development standards, rather than trying to lump them in as a single entity. That being said, I do side with the view that there’s too much money and enthusiasm being spent on forecasting, but it’s a weakly held view, and that doesn’t mean that every forecasting project isn’t worth being funded, or even that they’re all equally inflated.
The recent work on SAEBER, which applies sparse autoencoders (SAEs) to the screening of DNA synthesis printers, marks a big step towards effective function-based screening.
This allows printers to be monitored continuously. Just as a lab technician uses gel electrophoresis to separate a messy mixture into clear, readable bands, SAEs take the muddied activations of a neural network and project them into a higher-dimensional sparse space until individual viral motifs can be seen clearly. This allows the motifs to be tracked as they move through the system in real time, rather than waiting for a final product.
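For intuition, here is a minimal sketch of the kind of sparse autoencoder this sort of screening builds on. To be clear, this is a generic ReLU SAE with an L1 sparsity penalty, the common recipe for decomposing model activations into sparse features; SAEBER’s actual architecture and training details may differ.

```python
# Minimal sparse autoencoder (SAE) sketch in PyTorch. Illustrative only:
# a standard ReLU SAE with L1 sparsity, not SAEBER's exact setup.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)  # project up into a wider, overcomplete basis
        self.decoder = nn.Linear(d_hidden, d_model)  # reconstruct the original activation

    def forward(self, acts: torch.Tensor):
        features = torch.relu(self.encoder(acts))  # sparse, mostly-zero feature activations
        recon = self.decoder(features)
        return recon, features

def sae_loss(recon, acts, features, l1_coeff: float = 1e-3):
    # Reconstruction error keeps features faithful to the original signal;
    # the L1 penalty pushes most features to zero, so each surviving one
    # tends to track a single interpretable pattern (e.g., a sequence motif).
    return ((recon - acts) ** 2).mean() + l1_coeff * features.abs().mean()

# Toy usage: decompose 512-dim activations into 4096 candidate features.
acts = torch.randn(64, 512)
sae = SparseAutoencoder(512, 4096)
recon, features = sae(acts)
loss = sae_loss(recon, acts, features)
loss.backward()
```

The design choice doing the work is the wide, L1-penalized hidden layer: it is what “projects out” the muddied activations so that individual motifs become separable, in the gel-electrophoresis sense above.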
However, while SAEBER is undoubtedly an effective method, can we say for a fact that it is the best tool for function-based screening? Would it be better to scan the digital thoughts of the AI guiding the system that generates the product, or to monitor the stability of the system itself, given that we can model the printer’s physical state at any given time step during a run?
While scanning the digital motifs helps provide an understanding of the AI’s intent, it would be interesting to see if monitoring the physical state of the printer might provide a more resilient safety net. My intuition is that modelling the printer’s state as a physical landscape and understanding the implications of changes in the landscape might be more prone to false positives from natural noise, but it also has the potential to be better at detecting divergence much earlier than waiting to interpret a complex digital signal. Has there been much discussion on combining these—using the physics of the machine to flag a problem, and the AI’s internal motifs to figure out exactly what that problem is?
I’d like to have conversations with people who work or are knowledgeable about energy and security. Whether that’s with respect to energy grids, nuclear power plants, solar panels, etc. I’m exploring a startup idea to harden the world’s critical infrastructure against powerful AI. (I am also building a system to make formal verification more deployable at scale so that it may reduce loss of control and misuse scenarios.)
I’ve given workshops on using AIs for productivity/research to various research organizations like MATS. I’m happy to offer a bit of my time to share my expertise on that if that would make the meeting more interesting for you (or any other topics you’d like to hear my perspective on).
Context about me: I’m Jacques. I started working on technical AI safety research in January 2022. Before that, I had been engaging with AI ethics in a more personal capacity, worked as a data scientist at the Canada Energy Regulator, and earned a BSc/master’s in Physics. I’m currently based in Montreal.
Please schedule a meeting if interested (or DM if you know someone I should talk to): https://calendly.com/jacquesthibodeau/45-minute-meeting
Been thinking about morality recently. Here are my current thoughts, take them with a grain of salt because they aren’t battle-tested yet.
There are some strong arguments for utilitarianism, but regardless of what is correct theoretically, in practice utilitarianism doesn’t work well without some kind of deontological bars.
Continuing with attempting to develop a pragmatic morality, it then becomes clear that virtue ethics is important too, because a) rules are rigid compared to judgement, and b) decisions aren’t independent but also affect how you’ll act in the future[1].
Some folks may be quite tepid in integrating virtue ethics, but my intuition is that the more common fault will be to give yourself too much latitude, so you’ll probably want to revive some of your old deontological bars.
I view the next stage after this as introducing a sort of meta-virtue ethics to balance the three components (utilitarianism, deontology and virtue ethics; obviously it would be possible to break this down further). But this likely gives you too much latitude again, so you’ll probably want to introduce some kind of meta-deontology to limit how you update the balance.
You could go further than this, but you’d probably be running into decreasing marginal utility.
Thanks to Austen Erickson who I first learned this perspective from.
I quite liked this article by Martha Nussbaum: Virtue Ethics is a Misleading Category. She points out that both the classical utilitarians and Kant talked extensively about virtues. On the other hand, there’s great variation among those who call themselves ‘virtue ethicists’, such that it’s not clear if virtue ethics is really a thing.
But the point I want to make is: a good utilitarian has to acknowledge the role of virtue, and I think a lot of modern utilitarians have forgotten this. We want to use utility-calculation to guide our actions, but humans can’t think like calculators all the time.
I’m not really into deontological constraints myself. Rules of thumb, yes, but they should always be open to revision. Exceptional circumstances can always justify breaking rules—and in those cases, I will refer to what maximizes utility.
I wanted to make this poll to see how the community views the speed/x-risk tradeoff. I’m personally 99% x-risk and 1% speed, so I would hard agree. My prediction is most people will agree, maybe a 70/30 split, but I’m curious to see.
I think the question is too imprecisely phrased to be answered precisely. When would the delay start? Over what time period would it be felt? (e.g. a 100% delay for 100 years is very different than a 1% delay over 10,000 years)
I’m thus giving a directional answer, assuming we’re talking about whether seeking to dramatically reduce technological progress in exchange for safety is a feasible way to make the world a better place. I don’t think it is, but I’m not sure.
My biggest gripe is that any attempt to reduce technological innovation dramatically would entail a bunch of side effects that would degrade the quality of existence (e.g. requiring authoritarianism; moving power from cooperators to defectors and from people less skilled at deception to people more skilled; incentivising fighting for a larger slice of the pie instead of expanding it, since expanding it is far harder without improved technology).
‘significantly reduce’ could mean a lot of things. I’m answering as if this reduces absolute X-risk by 20% or more over the next 10 centuries.
Initially I just calculated a naive expected value function and put 100% agree, but then I realized that I don’t value realizing potential lives nearly as much as I value improving existing ones. While I do value realizing potential lives, the loss of them is not experienced by anyone other than present-day people like myself who think about them abstractly, which seems to me in sum to be less bad than the suffering otherwise avertible due to technological progress in the next 100 years. But I obviously haven’t thought about this enough or I wouldn’t have made my initial mistake.
One thing I didn’t consider in my revised answer is that I didn’t actually do the math. Taking an existential event as literally causing the end of earth-originating life, the question is whether the difference in probability multiplied by the immediate mass extinction itself would represent more death and suffering than the avertible death and suffering occurring over a 100-year period. I just don’t know. It seems unlikely that the avertible death and suffering amounts to as much as the amount caused by the mass-extinction event itself, but after multiplying by the difference in probability and acknowledging the ambiguity of the timeline proposed in this question, things become less clear. However, let’s say that the probability-adjusted, undetermined-timing mass-extinction event does cause more suffering and death and I change my answer to 50% agree. I don’t think this is what most people would interpret 50% agree to express.
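As a gesture at the math I didn’t do: a back-of-envelope version of the comparison might look like the sketch below. All the numbers are loudly made-up placeholders, not estimates, and it ignores the suffering-versus-death and potential-lives questions discussed above.

```python
# Back-of-envelope comparison with placeholder numbers (not estimates).
delta_p = 0.4                    # hypothetical reduction in extinction probability
extinction_deaths = 8e9          # everyone alive at the time of the event
avertible_deaths_per_year = 2e7  # hypothetical deaths faster tech progress could avert
years_delayed = 100

expected_deaths_prevented = delta_p * extinction_deaths                 # 3.2e9
expected_deaths_from_delay = avertible_deaths_per_year * years_delayed  # 2.0e9
print(expected_deaths_prevented > expected_deaths_from_delay)           # True
```

On these placeholder inputs the probability-adjusted extinction deaths narrowly dominate, but modest changes to either input flip the sign, which is consistent with the ambivalence above.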
I should also be clear that I’m taking the question to mean literally ending earth-originating life in more-or-less one fell swoop. Obviously, traditional x-risks actually have a spectrum of severity, so this is not so straightforward to apply to real-world resource allocation.
If I had to be more specific I would mean “reducing the probability of all humanity (and only humanity) dying in a few short days/weeks from 50% to 10%” by “significantly reduce existential risk”.
Also, I disagree with your methods. X risks aren’t especially bad because of all the utility lost (and “negative utility” created); they’re bad because after they happen there’s never any utility again. Unless apes re-evolve into humans and reestablish all of civilization over again, but we’re getting too hypothetical. What’s 100, or even 1,000, years of death and suffering compared to 10,000 years of utopia? If stalling/slowing down technological progress for 1,000 years made the P(Doom) go from 50% to 1%, I would definitely take it. Unless, of course, you think utopia is going to be some short-lived thing, but I seriously doubt that.
You are rightly grasping that we disagree, but I don’t think you are understanding my view (and to be clear, reasonable people can disagree about this).
My wife and I are debating whether we will have more children or not. Having another child is desirable to us. So much so that she’s willing to undergo the relatively risky process of childbirth to have another one. However, failing to have another child is significantly less bad than losing one of our existing children, IMO. I’d even say that failing to have 100 more children is significantly less bad than losing one of our existing children. The reason why is that the child who never existed is not sentient and so does not experience any deprivation. They do not suffer. And my suffering of that abstract loss is not nearly as bad as would be the suffering I would experience losing a living child who I know.
Now you may disagree with that, and mourn all the lost utility, and that is a reasonable perspective, but it’s not mine. And as you can see, this is a deeper philosophical difference, not some sort of misunderstanding about expected utility or something like that.
FYI, about this sentence: “X risks aren’t especially bad because of all the utility lost … they’re bad because after they happen there’s never any utility again.” I don’t really see a difference between these two statements.
I agree with Craig here. I’ve written about problems with most conceptions of utility people use and describe alternatives that I think better match what Craig is saying in this sequence.
I can’t respond because I don’t know what “significantly reduce” means. 0.01%? 10%?
I would imagine “significantly reducing” as going from 50% to 10%, but I should have been more clear.
Wrote a post about it, but the TL;DR is that extinction is THE worst-case scenario. It is the end of all utility and completely irreversible, whereas progress can always be made at a later date.
S risks are a thing. There exist fates worse than death.
That’s fair, but I imagine X risks and S risks are very heavily correlated. Especially in regards to “speed of progress”: accelerationism will, in my view, obviously increase X risks (safety research takes time; the more time you have, the more research gets done, which reduces risk), but it will also increase S risks (this is more personal opinion, but I don’t think the current leaders of AI innovation have things like animal welfare in mind; if we just keep chugging along, the first ASI might not care about animals at all).
“On the Promotion of Safe and Socially Beneficial Artificial Intelligence” by @SethBaum from 2016
The recent forecasting is overrated post got me thinking:
Intuitively, that seems correct, and I’ve relied on the expression “when you have a hammer, everything looks like a nail.” This got me thinking: is starting from the tool necessarily the wrong way to go about solving problems, or is this a truism?
If I have a legitimately useful and powerful tool, isn’t it indeed valuable to look around for problems that it can help solve? E.g., if we have discovered a way to harness electricity, shouldn’t we think about the ways it can be used to improve communication, build labor-saving devices, power factories, etc.? If we have something that has demonstrated potential to generate reliable information (supposing that forecasting could do this), shouldn’t we look for fruitful opportunities to apply it?
With a set of tools and a set of problems, why is it more useful for one side to do the searching than the other? (Sorry, maybe this is getting too meta and belongs in its own shortform?)
This just came to mind: the reason that it’s the wrong way to go about solving problems is that you want to solve the largest problems (well, per resource) and not just solve any random problem. Like, there is a problem that my shoes are currently untied, and I don’t want to bend down or spend 10 seconds to tie them, but it’s not very important.
So if you want to solve the most important problems, you should start with the problem and then work backwards for what solutions you might wish existed. I think the mere fact that people often talk about forecasting as the solution they are seeking to apply, whether that be Sentinel or whoever, is evidence that things are going wrong.
Actually, the set of things you want to apply electricity to is far smaller than the set of things you don’t want to. For example, if your baby is crying, please don’t use electricity.
The problem side should do the searching, since they have the shape and exact know-how of the problem.
They do and it’s a powerful point. But on the other hand they may be very much unaware of the nature of available tools and solutions. So I think there should probably be some searching — and listening — in both directions. If it’s done in good faith.
I have been disappointed by the support some EAs have expressed for recent activist actions at Ridglan Farms. I share others’ outrage at the outcome of the state animal cruelty investigation, which found serious animal cruelty law violations but led to a settlement that still permits Ridglan to sell beagles through July and to continue in-house experimentation. But I personally think the tactics used in the recent open rescues, including property damage and forced entry to remove animals, violate reasonable moral bounds on what actions are permissible in response to the belief that a serious harm is occurring. My views here stem from contractualist views of democratic legitimacy and from concerns about the non-universalizability of principles that justify lawbreaking, though I think a purely act utilitarian calculus also supports them.
Regarding universalizability, in a society where many people believe that different forms of irreparable harm are occurring (e.g. viewing abortion as murder, climate change as destroying the sacredness of the natural world, immigration as ending western civilization), I worry that moral principles that allow for significant lawbreaking when one believes that irreparable harm is occurring could easily lead to great damage if broadly followed (consider for example what it would be like to live in a country where hundreds of activists were regularly smashing their way into abortion clinics, energy companies, and refugee assistance nonprofits with sledgehammers and crowbars). Regarding the legitimacy of the law, I think reasonable contractualist views can give us obligations to follow the law when the processes by which the law is determined are legitimate, and that democracies with universal suffrage qualify as such (even granting that certain groups such as animals and future generations are impossible to enfranchise).[1] Therefore, I think that if we are trying to make decisions under moral uncertainty and give meaningful credences to rule utilitarian and contractualist views, we ought to reject the kinds of lawbreaking done by the Ridglan activists.
Moreover, I think that even if one rejects this kind of moral uncertainty-based reasoning and is a pure act utilitarian, rejecting lawbreaking in the western democratic context is still a relatively robust decision procedure under epistemic uncertainty. Broadly-followed norms against lawbreaking would have prevented EA’s worst scandal (FTX) without preventing EA’s most significant successes (cage-free reforms, evidence-based health interventions in LMICs). And while there are historical examples of illegal civil disobedience clearly producing good outcomes, I don’t think these generalize well to the type of lawbreaking under consideration here. The clearest such historical cases are ones where a disenfranchised group of people broke laws that directly enforced their own exclusion from political participation or basic legal personhood. These cases are self-limiting (and thus pass reasonable tests of universalizability) since the principles justifying such lawbreaking achieve their own obsolescence once participation is granted.[2] It’s much harder to find historical cases of property-damaging civil disobedience occurring in a democracy with universal suffrage that, in hindsight, appear clearly both effective and in service of a good cause. DxE’s own history is instructive here—their work over the last decade has led to many criminal convictions among its members, as well as several organizational scandals. But their record of concrete wins for animals is at best small-scale and mixed, especially compared to the successes of groups that have purely used lawful tactics like ballot initiatives and corporate campaigns.
One last point in the utilitarian calculus, this time on the more object-level cost-benefit calculation, is that I think EAs who embrace these kinds of illegal tactics may be underestimating the downside risks of endorsing criminal activity. I think there is a set of donors and volunteers that are happy to contribute to legal activism but who would be concerned about being associated with lawbreaking (at a minimum, I would consider myself to be in this group). If people in leadership roles within the EA/EAA ecosystem endorse illegal action, any foreseen benefits may easily be swamped by the harms of driving away risk-averse donors.
None of this is to say that Ridglan’s treatment of animals is justified, or that the lack of state enforcement against Ridglan for their serious violations of animal cruelty laws is acceptable. However, these harms don’t justify using tactics that are neither clearly effective nor robustly permissible across moral views.
I don’t mean to say that literally all lawbreaking is unjustified in a democracy. In particular, if one thinks that a law in the US is unconstitutional, breaking it may be required to gain standing for a legal challenge. But this implies a narrow exception for doing the minimum amount of lawbreaking required to obtain standing; it doesn’t imply that the tactics in question at Ridglan are permissible.
Note that this self-limiting principle only holds when applied to groups of humans denied suffrage. It doesn’t extend to cases like animals, where suffrage isn’t possible and there’s no natural bound on how many such groups might be invoked to justify lawbreaking.
I’ve been somewhat disappointed in reading this post. But as I know some folks I like are reading it, I feel the need to share a few thoughts as a legal scholar and theorist. Your post I think demonstrates some misunderstandings about the nature of law.
1. You seem to misunderstand contractarianism, by making it an argument for quietism, as well as the nature of law in a democracy. We don’t agree to many laws as a society; most extant laws are conventions that no one ever agreed on. They are traditions. Take property law — no one ever voted to make animals property. That’s an inherited concept. There is a difference between a law that is democratically enacted and a vestige of the English common law. The same goes for trespass, a common law principle. Moreover, no contractarian worth her salt is going to claim you can never break an unjust law. Going back to Aquinas, there is broad agreement that sometimes breaking an unjust law is morally appropriate, or at least defensible.
2. No one could reasonably argue this tactic is universalizable. It is strategic — a non-universalizable tactic meant to create significant attention and pressure for change toward universalizable norms. And more — the argument that if they do it, everyone will, can be a dangerous slippery slope. It’s infeasible for everyone to do this: they have jobs. More importantly, I would very much caution against that kind of thinking. It has been used to justify atrocities in the past. Take slaveholders: they used that kind of argument to push back against slave rebellions—that they would destroy the antebellum South, socially and economically. That is not an argument worth keeping. I am not defending the Ridglan approach, but I am cautioning against these types of dismissals of what they are doing.
3. Historically, you seem to again misunderstand the nature of law. Slaves were considered property, and their rebellions certainly damaged their property status. They also moved the north toward abolitionism. Elements of the civil rights movement and the labor movement in the United States engaged in tactics that damaged property and yet ultimately won reforms. Property is a convention—it should not interfere with moral obligation and serious moral intervention.
I’ll just mention one thing about donors. Doing what is right sometimes risks making certain people unhappy. That’s why social movements shouldn’t rely on the whims of wealthy individuals. The right donors will see the right priorities for what they are. The priority should be supporting one another within a social movement.
Thanks for engaging with the post! You made a lot of different points, so I’ll do my best to separate them out and consider them one-by-one:
(1)
I’m not making an argument for quietism. Saying that we have an obligation to follow the law is compatible with having obligations (even extraordinarily strong ones) to use non-illegal means to combat injustice (e.g. by advocating for changes to laws).
It’s a genuinely interesting point that many of our laws are inherited traditions, rather than the direct product of the democratic process. However, I don’t think that’s a strong argument in this specific case. The US has had true universal suffrage for more than 60 years, and in that time Congress and state legislative bodies have passed many laws related to the treatment of animals and the criminality of trespassing. Under any reasonable interpretation of democratic legitimacy, a democratically-elected legislative body specifically dealing with an issue and choosing to pass laws that accept the underlying common law principles and add specific penalties, related rules etc., should confer it.
I don’t disagree that a reasonable contractualist would think that there are cases where it would be justified to break an unjust law. The core question is whether the required conditions hold in this case. Democratic legitimacy is one important part of that, since reasonable contractualists generally would give some weight to whether laws resulted from a just process. A point I didn’t make in the OP, but I think is relevant here, is that even if you disagree about the democratic legitimacy argument, I think the specific nature of the lawbreaking here falls outside many notions of justifiable civil disobedience. That’s because the Ridglan rescues involved breaking a law to achieve a non-symbolic end (rescuing the dogs), not merely symbolically challenging a law by breaking it.
(2)
I think you’re moving between a couple different notions of universalizability here. It’s true literally everyone breaking and entering in service of moral aims is a far-fetched idea. But it’s still coherent to ask whether a tactic would have positive or negative effects if commonly used across social movements. Democratic societies can and have experienced periods of widespread civil unrest.
I agree that a similar argument could have been deployed against rebellion by enslaved people, but I think the analogy is weak because of the specifics. Slave rebellions occurred in a society where the affected population was excluded from political participation, and the principles justifying them were self-limiting in the way I described in the OP. The current case is different: the affected parties (animals) can’t be enfranchised, but the human population that cares about animal welfare has full political participation, and the legal channels for advancing animal welfare are open and have produced incremental gains over recent decades (most notably the transition of nearly half of the US egg supply to cage free). The bar for lawbreaking is plausibly higher when those channels are more responsive than when they’re closed.
(3)
I think you’re being too quick to dismiss property as being something that can drive moral obligation. There are clearly many cases where we are obligated to not destroy or interfere with others’ property, as is obvious in cases where vulnerable groups’ property rights are infringed. The way I’d think about this is that obligations to protect property are stronger the more just a society’s system of property rights is. In a more just society, property destruction not only weakens otherwise-good norms, but is also more likely to be the result of a miscalculation: if the base rate of unjust property ownership is lower, then any given case of someone believing property destruction is justified is more likely to be wrong. So both the rule-following considerations and the act-utility considerations point toward higher property protection in more just societies.
(4)
I disagree that the priority should be supporting one another within a social movement; the priority should be trying to do the most good. Donor considerations reasonably impact that calculation, both because money is a necessary ingredient for advocacy and because donor preferences may reflect genuine moral views that are worth considering. But I do also agree that it can be worth trying to convince donors of an approach rather than just deferring to their preexisting preferences.
State laws are path dependent, and rely very often on common law principles and concepts uncritically applied. That does not equate to democratic legitimacy for every codified version of property and criminal law.
I think we have fundamentally incompatible views on the appropriate frame to apply to balancing questions—I am not at all a utilitarian, and I don’t think you should be either. But I’ll set that aside.
You again seem to conflate lawbreaking with immorality. Please don’t do that. Rosa Parks broke the law. So did the Ridglan rescuers. That doesn’t make what they did wrong. That’s a separate question. The symbolic/non-symbolic distinction is not one I find compelling.
You seem to see humans and animals as categorically different from a legal perspective. Humans care for animal welfare but animals have no voice. I fundamentally disagree — animals should have rights, including a right to a voice. That means they should have institutions that represent their interests. So there is no fundamental difference between the slave revolts and what people are trying to do here for animals in terms of voice, as their representatives. I suggest Zoopolis on this point.
Your articulation of the EA bias toward donors is particularly problematic. Successful social movements historically have not relied on extremely wealthy individuals for funding. Your concern to persuade donors is troubling. You should be worried about persuading people, not persuading donors. Lots of people are going to be needed to change social perspectives and institutions around animals. A social narrative that focuses on “earning to give” or major donors is likely to be unmoored from a durable movement.
For what it’s worth, I’m surprised that people who think abortion is murder aren’t doing more illegal stuff to destroy clinics.
Also for what it’s worth, I think factory farming is so bad that it’s by far the greatest injustice caused by humans in history. Justifiable wars have been fought for orders of magnitude less important things.
I think this is a good/fine question and the answer is “they’ll go to jail and then stop”. I think maybe you’re conflating this question and the following:
Plausibly the morally correct answers are different. If your policy might cause total collapse of social order (irl, not in a nested thought experiment), maybe you shouldn’t do the nonviolent disruptive protest. But if you live in the real US, where you largely internalize the negative consequences and others are similarly dissuaded, and you still find the ~1st-order effect worthwhile, then go right ahead.
It’s like a sin tax (not a perfect term here tbc) - you want some amount of Pigouvian tax on the thing you’re worried wouldn’t be good if it generalized. If you find that the action is worth it to you, tax included, then godspeed. It would be fallacious to say “in addition to the correctly priced carbon tax you’d be paying on the gas, consider your impact on the environment by driving”
Two thoughts:
I think we disagree about whether the harms of lawbreaking are mostly internalized. The degradation of social trust in the deliberative process seems bigger to me than the consequences to the individual? As an analogy, shoplifting is an ordinary crime where individuals do face real consequences, but the diffuse harms to consumers and businesses (goods locked up, stores closing) are large and dominate the social calculus.
The Pigouvian tax comparison doesn’t quite work here because paying a tax contributes to public resources that can directly address the harms of the act or improve welfare elsewhere, making the net outcome neutral. Going to jail doesn’t repair damaged property or restore trust in the democratic process.
Ok yeah I was using terms too loosely. But still:
I don’t think we should speak of “lawbreaking” as a general case in this context; some argue that shoplifting is too lightly punished/prosecuted (especially in eg liberal US cities), but even assuming that’s true, the question remains whether the more specific category of, say, “property damage via protest” is punished too lightly, too harshly, or about right.
My best guess is that it’s not “too lightly” from a purely normie “law and order and human welfare right now” perspective. Many people believe moral-ish things strongly and don’t find property destruction immoral, but far far fewer actually destroy the property of those they think are doing something immoral. This seems like good evidence that the expected punishment (including via informal mechanisms) is not too light.
I think we are/were both sort of failing to decouple Pigouvian taxes and restitution. My understanding, both of how the term “Pigouvian tax” is used in econ and of the real world, is that even without restitution, you can get to the socially optimal level of some bad with a tax alone and no transfer to victims.
I think the motivating intuition is that the tax affects the amount of eg “social disorder” supplied, but the tax revenue is just a transfer of economic power from one party to another—it’s not creating real wealth that can then be given to the victims. So the same amount of real wealth exists before and after the transfer, and a separate question is what to do with that wealth given the state of the world (eg you might think that the very well-off who are harmed slightly by some negative externality, say ambient noise, should not be given restitution, and that a tax on decibels should really flow to some other party like the very poor)
On (1): You say:
I think that this is at best weak evidence. Activists’ decisions of whether or not to commit crimes are surely influenced by norms, not just the expected intensity of punishment. The recent history of climate activism in the UK is a good example. As far as I can tell, nothing changed about UK law to cause the rapid rise of high-profile lawbreaking by Extinction Rebellion and then Just Stop Oil in the 2018-2023 timeframe. The UK government did in the end stop the activists through increased legal penalties (going from typically no prison time for nonviolent lawbreaking when motivated by ethical concerns to 4+ year prison sentences becoming common for the more serious cases). But something other than threat of prison time was keeping climate activists from using these tactics in the early-to-mid 2010s.
On (2):
I agree that a Pigouvian tax doesn’t require restitution (as I indicated by including “improve welfare elsewhere” as something that can be done with the tax revenue). But the classical formulation (in which the optimal tax rate fully eliminates deadweight losses) does require that a dollar of consumer/producer surplus and a dollar of tax revenue produce the same social welfare. If a dollar of tax revenue produces less social welfare, then the deadweight loss cannot be eliminated.
To make this more concrete, I want to dig into an example based on your comment about driving in a world with a carbon tax. Consider taking a long trip by car rather than train under 3 different taxation schemes. Let’s assume you value the convenience of the car over the train at $101, the social cost of your carbon emissions is $100, and that all consumers in this world have identical marginal utilities of money.
World A: No carbon tax. You take the trip, gaining benefits you value at +$101 while causing social costs of -$100. In this world, we might say that you’ve done the right thing by driving (since this maximizes utility overall), but that for fairness reasons you might be obligated to donate some money to others, since your utility-maximizing decision also acted as a transfer from others to you.
World B: Carbon tax of $100 on the trip, returned as an equal dividend to all people. You take the trip, gaining net benefits after the tax of $1. The rest of society ends up net neutral (though there might still be particular winners and losers). In this world, you’ve done the right thing by driving, and have no further obligations.
World C: Carbon tax of $100 on the trip, which the government will use to buy $100 of consumer goods and dump them down an old mineshaft. You take the trip, gaining net benefits after the tax of $1. The rest of society still experiences the social costs of -$100, which the tax doesn’t do anything to reduce. In this world, you’ve clearly done the wrong thing by driving, since you caused a net utility loss of $99.
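To spell the accounting out, here’s a minimal sketch of the arithmetic in the three worlds (assuming linear utility in dollars, as stipulated above):

```python
# Toy arithmetic for the three carbon-tax worlds above, assuming linear
# utility in dollars (identical marginal utility of money, as stipulated).

BENEFIT = 101  # your private value of driving rather than taking the train
HARM = 100     # social cost of the trip's carbon emissions

def net_social_welfare(tax, revenue_welfare_fraction):
    """Total welfare change if you take the trip by car.

    revenue_welfare_fraction: share of tax revenue that still produces
    welfare for society (1.0 = dividend in World B, 0.0 = mineshaft in World C).
    """
    driver = BENEFIT - tax
    rest_of_society = -HARM + tax * revenue_welfare_fraction
    return driver + rest_of_society

print(net_social_welfare(tax=0, revenue_welfare_fraction=0))    # World A: +1
print(net_social_welfare(tax=100, revenue_welfare_fraction=1))  # World B: +1
print(net_social_welfare(tax=100, revenue_welfare_fraction=0))  # World C: -99
```

The only difference between Worlds B and C is whether the tax revenue still produces welfare; that single parameter flips the net effect of driving from +$1 to −$99.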
My claim is that doing crimes is similar to deciding to drive in World C. The “tax” on crime is imprisoning the criminal, which causes them to pay large costs in terms of their lost freedom and ability to work and doesn’t do anything to benefit society. And in fact it’s worse than World C, since the rest of society needs to pay the additional costs of arresting, prosecuting, and jailing them. So I think the Pigouvian tax analogy does not hold here, and it’s wrong to think that the harms of crime are properly internalized.
Not addressing every point, but in some respects I agree that crime is like World C. But then how much benefit the criminal gets/values is a case-by-case question, and we can’t just assume that in the irl case at hand the benefit is (in the analogy) $101 instead of $1000
There’s real deadweight loss from the mineshaft drop/spending money on prisons, but also potentially real value to be gained from the crime itself (canonical case = speeding bc wife is going into labor)
Remember, property damage as activism isn’t like simple theft—the property damage can cost society some amount $X, while the activism can separately benefit society, or be valued by the perpetrator, at any other number $Y
I see what you’re getting at here. But if we agree that the externalities of crime aren’t internalized, then I think we’re just back in the position of the original post. You think the act utilitarian calculus checks you, I’m both skeptical that it does and think that there are non-act-utilitarian reasons why we ought to avoid lawbreaking.
Thank you for the quick post.
Since you still state that in some instances careful lawbreaking can be justified in the pursuit of a just outcome, perhaps you could spend more of the post detailing why you thought lawbreaking was a bad call in this specific instance? This is not clear to me from reading your note.
Great points. Thank you for writing this up. I think it’s a strong and fair critique of the strategy of actions like this, and would love to see more discussion at this high level of context and analysis.
(I expect you understand the legal arguments at play, but I do want to reemphasize for other readers that I stand behind the ultimate legality of all actions I took at the first Ridglan rescue in March, using a basic necessity defense argument: “you’d break a window to save a dog stuck in a locked car on a hot day”, e.g., sometimes property damage is legal to avoid a foreseeable imminent harm. We can argue whether the harms at Ridglan are foreseeable or imminent, but I believe they were, and that’s the basis for why I chose to do what I did. I wasn’t there for the one last weekend in April and don’t have a settled opinion about it yet.)
To respond to part of your very good post, I feel that we should be able to discuss and analyze nonviolent direct action and other forms of civil disobedience in EA spaces. I engaged in this action in part because I think EA folks don’t think about this kind of thing enough, and I want to raise the salience of civil disobedience as at least a secondary or tertiary lever that EA should have. I don’t think it is ever likely to be primary and I don’t want it to be, but I also don’t want it to be ignored, and I think it largely has been around here.
A chunk of your argument boils down to what’s good for the overall EA brand. I strongly agree that there are bright lines I would not want the community to cross (e.g. endorsing or promoting violence). I think nonviolent direct action falls on the “OK” side of the line for me, but I agree there is probably a useful discussion to be had here, and am open to more arguments on this.
What would you say in response to a conservative abortion clinic protestor who makes the same argument you’re making? “It was ethically necessary for me to kidnap the doctor who was about to start their shift at Planned Parenthood. Yes, it’s normally illegal to kidnap people, but those babies* were in imminent danger of being killed by the doctor, and it’s permissible to break laws to avoid a foreseeable imminent harm.” (*The conservative protestor believes that fetuses have equal moral status to babies, the same way you and I believe that pigs have equal moral status to dogs.)
Kidnapping somebody would be a violent action.
A better analogy for nonviolent direct action would be breaking in to disable the clinic’s capabilities to provide abortions (without harming anybody).
In this case, the protesters would be subject to the same penalties under the law that the Ridglan protesters are. That being said, there is a case that what is happening to the dogs is illegal (311 counts of animal cruelty documented already). I’m wary of an appeal-to-authority bias here—just because the authorities are not enforcing the law doesn’t mean what Ridglan is doing is legal. As pointed out, the necessity defense is being tested here, and it merits consideration.
I worry this approach excludes the most vulnerable (those who cannot meaningfully participate in political life, like human babies and animals), and focuses on less fundamental rights: I think protection from torture is more urgent than legal personhood.
Why would women be justified in engaging in civil disobedience to get the vote for themselves, but not be justified in engaging in civil disobedience to rescue babies from Josef Mengele?
I agree that there’s a sense in which the constraints I’m talking about focus on less fundamental rights. But I think the more important sense is that they focus on preserving a viable process for living together in a society with people of greatly differing moral views. That doesn’t mean we have to leave behind other vulnerable groups, just that we have to try and bring about change for them through democratic means.
Regarding the Mengele example, I think it’s disanalogous because it took place in a dictatorship, where the rule utilitarian and contractualist constraints on action look very different.[1] I’m really probing at what constraints EAs should have when acting in the context of a democracy (including a flawed one), not what behavior would be correct in Nazi Germany.
Note that the act utilitarian calculus also changes in a dictatorship too. Following the law in a dictatorship is unlikely to be a successful decision procedure for maximizing utility under epistemic uncertainty.
What if “protecting innocent sentient beings from torture” is a higher moral priority than “living together in a society with people of greatly differing moral views”?
I’m sceptical that the distinction between flawed democracy and dictatorship is clean enough to justify civil disobedience on behalf of others only in the latter (if this is what you’re saying). Would you support rescuing American children from deliberate infection with hepatitis at Willowbrook in the 1960s?
Taking each of these points in turn:
On your first question, I think your framing isn’t addressing what happens if other people think the same way. The equilibrium where everyone with strong moral convictions feels licensed to break laws doesn’t seem to me like it’s better for vulnerable groups, just more chaotic. I think that to some extent you’re proposing smashing the “defect” button in a prisoner’s dilemma and hoping the other side doesn’t do the same.
On your second, I agree that it’s not a clear line between flawed democracy and dictatorship, but in the US today this isn’t really relevant.
On your third, I think the Willowbrook example is worth thinking about more carefully. As I understand the history, the binding constraint at Willowbrook wasn’t legal. Many parents and guardians retained custody and could have legally removed their children. The constraint was that families without resources didn’t have a better option. And in the end, legal activism was able to marshal those resources, albeit much more slowly than I would have wished.
I’ve been pondering this. I think your button-smashing characterisation is basically accurate, and it is a leap of faith that those who engage in civil disobedience make: an appeal to the conscience of society, the jury, etc.
You’re right to say that one way to think about universalisability is “if it’s okay for me to break the law to achieve what I consider to be a moral goal here, why can’t everyone break the law to achieve their own moral goals?”. But another way to think about universalisability is to go “if I were the one in Ridglan / Unit 731 / Willowbrook, what actions would I support to end my suffering?”
I don’t know whether it would be illegal for parents to break their children out of Willowbrook, but for the purposes of this question assume it was.
You should volunteer at your first EAG! (Especially if you are a student or early career)
If you don’t have a network in EA, EAGs can be overwhelming. Volunteering gives you a ready-made, organic network.
Volunteering is pretty chill—a lot of the shifts aren’t that hard.
At your first EAG, it’s unlikely that you are using your time so efficiently that a few hours of volunteering would cut into the value of your conference.
I am attending my first EAG and volunteering as well. Hoping to learn and build meaningful networks.
I volunteered at my first EAGx (EAGx Australia 2023) and support this sentiment.
And also, AFAIK if you volunteer your ticket is free :)
Deleted
I no longer endorse this post, which argued that honey is basically fine to eat ethically, to the degree that I chose to delete it entirely.
Protein/dairy tradeoffs/substitutions make more sense: honey/syrup/agave seem less necessary. For example, waffles, pancakes, French toast, etc. still taste good without much of those, and honey/syrup/agave all seem too sugary to be healthy. Since they seem less necessary, your reasoning makes more sense to me as a case against honey alternatives rather than a case for honey
We recently published an interview with Matthew Coleman—another entry in our Career Journeys series. Matthew is the Executive Director of Giving Multiplier, a platform that encourages donations to highly effective charities through donation matching. Before this, he completed a PhD in psychology, researching the psychology of altruism.
The interview covers quite a lot of ground, but a few of the things we talked about include:
The gap between what a career looks like from the outside and what it’s actually like day-to-day.
Advice for people wanting to make an impact through psychology.
The tension between keeping your options open and committing to a path.
Here’s one of our favorite extracts from the full interview:
On engaging with the (often mundane) realities of academic research:
I learned a lot. By the time I started my lab manager role, I was fairly confident I wanted to do a PhD. But my research lab in undergrad, which I loved, was a very small lab where I was working closely with the faculty advisor, and I wanted to try out a larger lab studying different topics to explore a bit more.
As the lab manager of an unusually large lab, I got a bird’s-eye view of a lot of the research projects going on and understood what the day-to-day looked like, whether that was grant applications, hiring and onboarding, or actually conducting research myself alongside my colleagues. I found the experience amazing and fascinating and really intellectually stimulating, which confirmed that I wanted to go the PhD route, so I followed through on my original plan from undergrad.
[…] I was certainly very fortunate to have gotten a lot of hands-on experience in research as an undergraduate, so I think I had a better sense of the day-to-day than many people do. But I do think it’s a very important point, and some related advice I like to give is: when you wake up on a random Tuesday in February, do you actually want to do the things that you have to do? Not just do you like the topics or ideas you’re studying (although that’s of course very important, too). Maybe you read a book, watched a TED talk, or listened to a podcast about some topic you found fascinating, and maybe you do want to pursue work in that domain. But I think the ideas themselves aren’t enough, because you actually have to do the day-to-day work.
So what are the actual responsibilities and tasks you like doing? For example, you may find neuroscience fascinating, but maybe you don’t want to spend a large portion of your workweek interacting with research subjects, running brain imaging sessions, or whatever it might be. In such a case, even if you think the subject matter is fascinating, maybe that’s not the best career fit for you. Or maybe you do also enjoy most of the regular responsibilities associated with that career, in which case it could be a great fit. So I think a combination of enjoying the topic itself plus the day-to-day responsibilities is important. I was lucky that, early in my career, I was able to test it out and experiment with which responsibilities I liked more than others.
Lighting has been getting ridiculously cheaper. And for the most part we don’t seem to be taking advantage of one opportunity this creates: reducing crime through better lighting. This has been battle-tested as an effective public-safety intervention; see Chalfin, Hansen, Lerner & Parker (2022), an RCT in NYC public housing finding ~36% reductions in nighttime outdoor index crimes from added street lighting. Many, many major cities still haven’t copied this at the right levels!
But we’re also getting substantial negative externalities from bright lighting. Office buildings that never turn off their lights, because why would they care. Apropos of the new office building that just opened next to where I live. This may feed NIMBY spirits in me, God forbid. Kyba et al. (2017) document that Earth’s artificially lit outdoor area grew 2.2% per year from 2012 to 2016, with the LED transition producing a rebound effect instead of net savings. Jevons paradox and such.
Also, this has all sorts of annoyances. I think malls, pharmacies, and hospitals have all become much brighter since my childhood. I may be more sensorially overloaded than most people, but this does meaningfully affect my qualia—so much so that Pigou himself would collect taxes from the pharmacies with dozens and dozens of LEDs, while Coase would advocate that I have the natural property right of not being assaulted with that many lumens while buying a Tylenol. This affects the wellbeing of more than just me (Cho et al. 2015). But lightly enough, ha, to not be a topic of discussion.
Maybe my biggest medium-term worry about transformative AI, other than the takeover stuff, is a constellation of concerns I sometimes abbreviate to “political economy.” Right now a large fraction of humans in democracies can live and support their families as a direct result of voluntarily exchanging their labor. It’d take active acts of violence to break from this (pretty good, all things considered) status quo. As a peacetime norm, this is unusually good relative to the history of human civilization.
At some point in the future (in the “good” futures, I’d add), there’ll be a natural transition from that to people living and supporting their families as a result of UBI or welfare or other gifts from companies or the State. Ie they will now be surviving explicitly due to someone else’s largesse[1]. This seems bad!
Unfortunately I don’t have a good answer here, even in principle. But it seems worth considering! I vaguely wish more people would work on it.
State power is of course backed by the threat of violence, so it may not be just largesse. But a) “my desired system is the peaceful default, and it takes violence to wrest me away from it” is more stable and dignified than “my desired system relies on the constant threat of violence to hold”, and b) a fair amount of democratic power comes from the democratic nature (and the ease of mass mobilization) of guns, and this has also been eroded by technological developments in the last century, and will also likely be further eroded by developments in AI.
I agree. What’s the bottleneck in creating good answers to this question? Money? Talent? Would you be happy to give a shot at fleshing this out?
My current first-pass answer is:
Windfall shares. Some fraction of AI stocks should be given one-time to every human alive
This still requires some form of largesse/threat, but one-time largesse feels less scary to me than continuously needing to uphold the norm.
And it’s not exactly largesse while people (especially outside of AI companies) still have real power, more like a structured negotiation
For reasons of political-economy realities, probably with more given towards rich countries and/or countries that are closer to developing AGI
I’m imagining maybe ratios like 10:1
Not sure about the exact amount of shares, but it should be way more than enough to support everybody indefinitely at significantly above modern Western standards, excepting positional goods (see the toy arithmetic after this list)
After the initial transfer, this completely solves the largesse and political economy problems. The “dignity” problem of having your consumption no longer tied to your labor is still there but I’m less worried about this (seems more like a framing problem).
Children can still be a problem. My guess is that normal inheritance stuff is enough, though in edge cases maybe we say that you aren’t allowed to disown your children completely from your windfall shares.
If people live forever, maybe we have a rule that reproduction means a minimum fraction of your shares automatically goes to your children. I dunno.
Charter. Later on, some version of this is also written directly into the charters of the AIs, so at minimum something like 0.1-10% of their values ought to care about something like all of current humanity’s preferences
Assuming alignment is solved, now superintelligence is (0.1-10%) on the side of all humanity.
(probably optional) some form of protection against manipulation/theft/expropriation
If there’s a transition period where AIs are good enough to do most work in the economy and generate a lot of wealth and/or disemploy most people, but AI alignment and capabilities aren’t enough that #2 solves all the new AI-generated problems (eg if we’re worried about superpersuader thieves), we have ad hoc paternalism stuff to prevent obvious ways to steal people’s windfall shares.
How heavy the paternalism is depends on how serious different concerns look. Eg if AI superpersuasion scams are common, maybe we’d just make it legally impossible to transfer windfall shares, in the same way you can’t legally sell your organs in most countries.
To ease the transition, this should be seen in earlier stages as a complement to existing welfare systems rather than a substitute to them. Eg if someone’s dumb enough to gamble their monthly AI windfall dividends away, different societies can either choose to let them starve or (my preferred solution) still feed them, perhaps until AI-assisted tools can cure their gambling addictions. In general, don’t let “the windfall shares solution can’t solve all of society’s problems” be a blocker to implementing it.
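To make “way more than enough” concrete, here’s a toy back-of-the-envelope sketch (every number is a made-up assumption for illustration, not part of the proposal):

```python
# Toy arithmetic behind "way more than enough" (all numbers are hypothetical).
# Backs out how big the distributed share pool would need to be for dividends
# alone to support everyone well above modern Western standards.

WORLD_POP = 8e9
RICH_POP = 1e9                # hypothetical population of rich/AGI-adjacent countries
POOR_POP = WORLD_POP - RICH_POP
RATIO = 10                    # rich:poor per-person share ratio (the 10:1 above)
TARGET_POOR_INCOME = 50_000   # hypothetical target annual dividend per person, USD
REAL_RETURN = 0.05            # hypothetical long-run real return on the shares

per_poor_share = TARGET_POOR_INCOME / REAL_RETURN      # $1M of shares per person
pool = per_poor_share * (POOR_POP + RICH_POP * RATIO)  # total share value needed
print(f"required pool: ${pool / 1e12:,.0f}T")          # ~$17,000T
```

For scale, global equity markets today are on the order of $100T, so this version of the idea implicitly assumes AI growing the economy by roughly two orders of magnitude.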
__
tbc I don’t think this is an amazing answer. I worry both that this won’t be enough and that we won’t implement anything as good as this. I don’t know what the bottlenecks to better answers are, and why other people aren’t working on this. Two obvious answers come to mind:
It’s just kind of a hard problem!
Most people don’t “feel the AGI”, and the people who do think they have more important/tractable problems to work on.
Thanks Linch <3
Claude gives some references to prior work. Maybe the most interesting is Anton Korinek:
I’ve also had worries there; my naive hope is that there’ll be a meaningful plurality among the controllers-of-the-AI, such that they’ll have to compete for feet. So if you want to grow the amount of matter and energy you govern, you’ll need more people to opt in to your system to justify yourself (unless you want to give a good excuse for everyone else to band together and smite you). Then I hope the world is held stable by something like mutually assured resource exhaustion.
If you squint, I think UBI could function more like a lease on individual consent than a gift, hopefully giving people inherent political value.
But for sure seems dicey; easy to imagine a few people in power colluding to disregard the vast majority of the population.
Thinking of drafting a post on war crimes, trying to answer the following puzzles:
Why do we have a notion of war crimes at all, given how bad war itself is?
Why are some things war crimes and not others?
Why do precursor notions to war crimes appear, independently, in essentially every culture that has fought wars at scale?
Given that essentially every culture has also broken these norms, sometimes spectacularly, why does the norm always come back, and often come back stronger?
Common answers to these questions seem profoundly misguided. The naive answer, that war crimes are simply the most horrible things that we all agree are collectively wrong, does not survive even five minutes of scrutiny. More sophisticated versions of that argument also do not survive scrutiny: Just War theory is similarly flawed and question-begging on the descriptivist front, and the Schelling-shaped argument that war crimes can’t limit all of war’s badness but are aimed at curbing the worst excesses does not explain why mass bombings and medieval sieges are/were considered acceptable, but false surrender is not.
The “cynical” answers are (differently) flawed. Eg some people think war crimes are completely fake and anything other than total war is just modern virtue signaling, ignoring the thousands of years of documented history we have on precursors to war crimes (Xerxes in the 400s BC: “The Spartans, when they do such things, overthrow all law and justice among men.”). If anything, the modern version of “total war” is much newer than the idea of war crimes. Similarly, a naive “power analysis” on which war crimes are simply defined by the powerful to limit the options of the powerless ignores that powerful people are often themselves constrained by these norms, sometimes hugely.
Instead, my core answer here is surprisingly simple: A “war crime” is, in its oldest and clearest form, the category of acts that destroy the means by which wars can be ended. The prohibitions track not the morally worst things people do in war, but the acts that, if generalized, would turn every future war into a total war.
I don’t think my theory here is very novel. Indeed, as I’ll discuss, this theory is literally thousands of years old and likely arose independently in many places. I will try, however, to make my post the best modern articulation of these ancient ideas.
No offense Linch, but aren’t these questions for jurists, historians and philosophers? Why should you develop the answers from first principles, so to speak? I’d understand writing a blog post about a journey through such sources and what their theories are, but I think trying to answer such questions ourselves is not very robust.
This is not a criticism of you personally—developing ideas that require domain expertise from first principles is an approach I often see in EA and I think it’s a wrong one.
No offense taken!
My experience with trying to investigate various questions is that it’s pretty hard to ex ante predict which things already have substantial attention from the “experts”, and many questions that seem important fall through the cracks (for some EA-relevant examples that come to mind: optimal charity, AI risk, pandemic preparedness, the impact of incarceration on crime).[1]
In this case, I don’t think the specific question I’m interested in has attracted a lot of academic attention/I don’t believe any single field has a good unifying theory. Just war theory is overwhelmingly normative with limited descriptivist content. IHL scholarship is interpretive and doctrinal. IR/game theory scholarship has partial answers, but afaik no one has synthesized them into a structural theory of war crimes, etc, etc[2].
Second, I’m definitely reading a bunch of sources here, including historical ones! I’m sure I’ll miss some, but I’m certainly doing the reading.
Third, if you believe the most useful thing to do here is a literature review, be the change you want to see in the world! Like if you think what’s missing in the world is a pure literature review summarizing various theories, go for it! I’d be happy to read your review.
As an example in the other direction, many US gov’t agencies are just kinda reasonable and make pretty reasonable tradeoffs given their constraints.
Watts (2013) talking about perfidy specifically is the closest. But tbc there have been many shadows of this theory/hypothesis over the centuries, since at least 2400 years ago and probably long before.
Your experience reminded me of how Holden Karnofsky described his career so far:
Thanks, though tbc Holden’s way better at it than I am!
Thanks for the serious reply!
I guess a “but can’t we, like, just outlaw all war?” approach is not the standard one so I’m at least interested in what answers you may find. Especially with me coming from a very, umm, war-prone country...
You might like this post I wrote earlier about the bargaining theory puzzle of war. I engaged with the academic literature on the subject pretty significantly, particularly James Fearon, so you might like it. On the other hand, Fearon himself mostly reasoned from first principles rather than conducting a careful historical assessment, so in that regard it might fit your interests less.
The post never got very popular, but a few people who read it carefully really enjoyed it. One of the better compliments I’ve gotten on my writing came from somebody who, after reading my post and several books on the subject, said they were surprised that the post gave them >50% of the value of an academic book on the subject.
I like the puzzle. But I wonder if you can make your answer even simpler:
Actions taken in war have some benefit to the perpetrator, and some costs to the larger system of permitting them
When the ratio between these things gets too extreme, it’s regarded as a war crime
I think this explains the category that you outline (undermining trust in the kind of institutions that could stop the war is super destructive!), but also explains some other cases, e.g. norms against abusing prisoners, against impersonating medical staff, etc.
Yeah this is fair. I outlined something like that here.
I think there are a few tricky things with this model. One is lack of precision, eg by whose lights are you interpreting “costs to the larger system of permitting them.” Relatedly, an advantage of advocating “war crimes are crimes against the end of war” is that it creates a clear core of norms (even if it doesn’t describe everything) that I think is a good description of commonly shared norms in history, and that I think is good to uphold morally[1]. In contrast, many other norms of war tend to be more sporadic, like protecting civilians, chivalry, or diplomatic precedence.
Another tricky thing is Schelling’s point that almost all conflict is non-zero-sum: you can’t treat the zero-sum parts of war and the non-zero-sum parts as cleanly separable.
(I’d also note that torturing POWs makes surrender less appealing, so it’s consistent with my narrower answer. My narrower answer would also predict that protecting civilians is important but not very important, which is consistent with the historical record. On the other hand it does not have an explanation for weapons bans; my defense is that a decent enough simple theory in social science doesn’t need to explain everything).
With the important caveat of course that it reduces the cost of war, which is probably bad.
I agree that the model I proposed is imprecise; I think this counts against its usefulness but not its validity.
I’m not suggesting this as a thing to advocate for; merely as a descriptive pattern of what the category of war crimes is doing. I think the things which make ending war harder are an important class of really destructive thing, but it seems clarity-obscuring to me to claim that this is definitionally what war crimes are? Rather than giving your thing a new label and then getting to discuss what fraction of war crimes are in that category, and whether there are things in that category which aren’t war crimes (e.g. if torturing POWs counts under your categorization, then why doesn’t conscription count—after all, it damages the “one side runs out of soldiers” mechanism for ending war).
Poor choice of words on my part!
Fair, I guess the thing I’m interested in is something like “widely shared and independently recurring norms of war.” Though I’d want to be narrow enough to exclude stuff like “norms of war include paying your soldiers and having okay logistics planning” or “norms of war descriptively include being total morons sometimes in XYZ ways”
Right, sorry. I do think the costs/benefits ratio matters significantly here.
Ok, so one place the predictions of these theories might come apart is that my theory suggests a norm against impersonating medics, whereas I think yours doesn’t (although maybe I’m just not seeing it; I don’t think I would have said that avoiding torture of prisoners was part of protecting the mechanisms of ending war, although I do kind of see what you mean). I haven’t looked into it at all, but if that norm has emerged independently multiple times that would be suggestive in favour of the broader theory; whereas if it has just emerged once it looks perhaps more potentially-idiosyncratic, which would be suggestive in favour of the narrower theory.
Thanks, I like this crux/operationalization!
It’s an interesting idea but as expressed feels a little tendentious, particularly if one looks at what is actually formally considered a war crime these days (much of which would not have been recognised as war crimes in the past, including by actors who believed themselves to be unusually chivalrous). Hard to believe it will be impossible to avoid total war if a few civilians are murdered or chemical weapons are used, never mind if a pilot gets shot after ejecting or if a spy is not afforded a fair trial, and hostage taking was once considered a good way to avoid total war. We see peace agreements between prolific war criminals quite often too. Avoiding total war might be a motivation, but it can’t be the only one.
On the other hand game theory favours opposing sides agreeing to not shoot ejecting pilots or torture each others’ prisoners even if they don’t agree on anything else, whereas it is impossible to win a war if you are not permitted to fire at the other side’s soldiers, and at least in 1949 artillery and aerial bombardments were also too critical to winning for the Geneva Convention to agree to ban them. It is possible for opposing sides to agree not to shoot at people that don’t wear uniform, but only if both sides treat sneaking up on the other side and shooting them whilst not wearing uniform as also a crime.
Also, many people genuinely believe in the idea that people shooting other people in a uniform which indicates they intend to fire back represents some sort of fair play (even if the targets happen to be sleeping conscripts who haven’t had a chance to surrender yet) and the sort of people that believe in that sort of thing are disproportionately likely to be military officers. They tend to believe in just wars too...
I do agree that opposing sides are considerably more likely to respect conventions on war crimes (and even reach other bargains like prisoner swaps) whilst the infrastructure that may allow the sides to mutually end the war still exists. But there’s plenty of evidence of war crimes committed with impunity in conflicts that never came close to total war, and for that matter of individual military units choosing to abide by conventions despite there being no realistic prospect of a near-term peace agreement and plenty of war crimes being committed by others on their side.
I think the language I used above is more deontological/universalizing than ideal. I agree it’s more of a gradient than anything else. I also think some of the biggest classical norms (“don’t shoot messengers/envoys”), while still important today, are less so in the age of wireless communication, mass media, and email. I also think my primary thesis addresses the benefits of having “war crime” norms, but norms in practice are about both benefits and costs, and some of your comments here address costs (which are of course also important).
A quick reminder that applications for EA Global: London 2026 close this Sunday (May 10)!
We already have more applications than last year, and this looks set to be our biggest EAG yet (again)! If you’ve been meaning to apply but haven’t gotten around to it, this is your sign.
The admissions bar is more accessible than people often assume. If you’re working on or seriously exploring a high-impact problem, you should apply.
This is the EAG I’ve been most excited to put together yet. I’d love to see you all there.
📍 InterContinental London, The O2 · 29-31 May 2026
⏰ Applications close: Sunday, May 10
🔗 Apply here
So… what’s the general take on the hantavirus outbreak?
The best thing I’ve read on it so far is this article by Kelsey Piper.
At the risk of being too curmudgeonly, I’d say the main take is to stay away from the news cycle.
WHO does not seem too worried about it: “WHO currently assesses the risk to the global population from this event as low”.
Despite the real risk from hantavirus being low, it is getting covered a lot in media right now. I think this is actually good. A lot of people had already forgotten about the pandemic that we had not that long ago and moved on to worrying about other problems currently dominating the news cycle. Hopefully this serves as a (small) reminder to people that pandemic preparedness / biosecurity really does matter.
I was just watching the Dwarkesh/David Reich podcast; fascinating stuff. Looking back at how I was taught taxonomy and anthropological history, I find it frustrating. Note that I don’t know much about (evolutionary) biology, genetics, or the frontier of genetic-history research, so this is my layman attempt to explain why the subject has generally been puzzling for me as explained by other people who probably don’t understand it either; I’m not trying to propose that I understand something David Reich doesn’t.
My main gripe is that we are taught evolutionary history mostly through the lens of evolutionary trees. But evolutionary history probably looks like a graph/stochastic process/Markov chain, and only at very specific underlying parameters/levels of abstraction is it well modeled by a tree. The reason we use trees is that they’re the most sensible simple abstraction in some ways if you are thinking about “how did we get here?”. But it’s not a great way to think about “what happened/was happening”. I had ChatGPT try to illustrate the difference below (don’t look too closely into the details, it did some hallucinating; just take the general vibe).
Plausible parameters here, to me, would mean that mixture isn’t extremely likely over short time spans (because of distance, etc.) but is quite likely, and almost certain, over hundreds or thousands of years. So what did the Near East look like genetically 60k years ago? It could easily have looked like the below.
It seems totally possible that for long periods of hominid history, ancestry was well modeled by pretty smooth stochastic graphs, with correspondingly smooth genetic variation across geography (obviously with tons of exceptions, and less true when you zoom in, e.g. Bell Beaker/Corded Ware cultures), and yet when you look at our specific lineage it doesn’t quite look like that (due to extinction, gene selection, or some other reason). I don’t have a clear enough vision to say much more, but I think there are some interesting implications for what we mean when we say this group or that group went extinct.
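To make the tree-vs-graph distinction concrete, here’s a minimal sketch; the population names and mixture weights are illustrative stand-ins, not real estimates:

```python
# Tree model: every population has exactly one parent lineage.
tree_ancestry = {
    "EuropeanHG": "AncientNorthEurasian",
    "AncientNorthEurasian": "OutOfAfrica",
    "NearEastFarmer": "OutOfAfrica",
}

# Graph/mixture model: a population can descend from several sources, each
# with a weight; admixture events make the history a DAG, not a tree.
graph_ancestry = {
    "ModernEuropean": {"EuropeanHG": 0.2, "NearEastFarmer": 0.5, "Steppe": 0.3},
    "Steppe": {"EasternHG": 0.5, "CaucasusHG": 0.5},
}

def founders(pop, graph):
    """Trace a population's ancestry weights back to root populations."""
    if pop not in graph:  # a root: no recorded ancestors
        return {pop: 1.0}
    out = {}
    for parent, w in graph[pop].items():
        for root, frac in founders(parent, graph).items():
            out[root] = out.get(root, 0.0) + w * frac
    return out

print(founders("ModernEuropean", graph_ancestry))
# {'EuropeanHG': 0.2, 'NearEastFarmer': 0.5, 'EasternHG': 0.15, 'CaucasusHG': 0.15}
```

In the tree model, every lineage traces back along a single path; in the graph model, “where did this group come from?” only has a distributional answer.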
FYI, next week we will be highlighting the first batch of articles from In Development, @Lauren Gilbert’s new global development magazine.
Lauren and most of the authors will be on the Forum to answer your questions throughout the week. More info to come on Monday, but I figured I’d mention in case anyone wanted to read the articles in advance (they are here, and all authors apart from Paul Niehaus will be around to answer questions).
I’m looking forward to the discussion.
Earning to give is lonely and requires repeated decisions. This is bad.
If you’re earning to give, you are lucky if you have one EtG team-mate. The people you talk to every day do not have moral intuitions similar to yours, and your actions seem weird to them.
If you do direct work, the psychological default every day is to wake up and do work. You are surrounded by people who think the work is important, and whose moral values at least rhyme with your own.
If you earn to give, most days you do not give (you’re probably paid bi-weekly, and transaction costs discourage even donating that frequently).
These differences apply continual pressure for EtG folks to become less hard-core than we intended to be. I wish I had more counter-pressure.
[None of these observations are novel]
I’m EtG and would love to connect with others. My DMs are open! A bit about me: I’m a SWE based in Europe, and my preferred cause area is animal welfare.
We have a regular EtG meetup in London. You might be interested in setting up something similar where you live, perhaps branching off a preexisting Effective Giving/Giving What We Can group?
Besides my more cold-hearted response below: I agree that EtG is lonely. You are lucky if you have one other EtG’er in the same city. EtG is rare. EtG, and feeling committed to it, is even rarer.
Oh totally. I’m lucky to be in the Bay Area where EA is a thing at all.
“If you do direct work, the psychological default every day is to wake up and do work. You are surrounded by people who think the work is important, and whose moral values at least rhyme with your own.”
This is true for me, and true for many in richer countries, such as the awesome AIM crew. In low-income countries, though, many if not most employees (especially in BINGOs) are there for the money and status that non-profits offer, rather than the value of the work. I know a number of Ugandans who have found this difficult: they cared about the work while their colleagues were just trying to weasel away as much money on allowances as they could while trying to unnecessarily extend projects to keep their salaries going.
I think your major point stands, but direct work doesn’t universally come with motivated and encouraging peers.
Very fair, I’m definitely speaking from the perspective of a rich Westerner working with other rich Westerners.
You don’t need to make a donation decision biweekly. That sounds incredibly tiring. You can also set up a recurring donation to the same charity or fund. Or save up and donate once or twice a year. That has the advantage that you can plan some time to think well about your decision.
True, but as a matter of fact don’t you continuously reassess your donation decisions? I know I do (I have the impression you do too).
And every time I do that, it really helps to have people around me who I can talk to about it.
Do you reassess more frequently than annually or biannually, which is my impression of the Schelling point frequency for most folks?
Just annually, unless something exceptional changes how I think about these things
Somewhat meta point on epistemic modesty, calling it out here because it is a pattern that has deeply frustrated me about EA/rationalism for as long as I have known them:
(making a quick take rather than commenting due to an app.operation_not_allowed error—I’m responding to @Linch’s quick take on war crimes)
I guess these are just EA/rationalist norms, but an approach that glosses major positions as being so quickly dismissible strikes me as insufficiently epistemically modest. I would expect such a treatment will fail to properly consider alternative answers or intuitions to the author’s own, especially the strongest versions of those answers (e.g. modern just war positions), won’t consider the most sophisticated counterpoints (e.g. your ‘oldest and clearest form’ gambit may just be bracketing out the counterexamples that don’t fit your definition, like genocide or sexual violence), and will reinvent the wheel, e.g. the view seems to be exactly this from 2013:
I think deep engagement with the range of serious views on the topic is required to make your post “the best modern articulation of these ancient ideas”. I don’t think the quick take seems on a good track for that.
I feel like I’ve heard this position a lot before, and I have some sympathy for it, but I feel like it implicitly overlooks a lot of what I find valuable about writing EA Forum comments, and it sets an overly high bar.
When one writes academic papers, one is expected to cite relevant previous work. Citation is an important mechanism for tracing the evidence for claims and for assigning credit. Even in academic spheres, I think this is perhaps taken pathologically far (to the point where it probably sometimes is unduly burdensome and vaguely implies that pretty obvious ideas or hypotheses had to have come from someone else as opposed to being generated by the author), but the reasons why it’s important to cite your claims seem a lot stronger in academia.
The EA Forum is partly intended, I believe, to be a place where people are encouraged to say things more quickly and speculatively after having done less research, and where people are more encouraged to share their own overall judgments and thinking process without necessarily fully defending all their positions. You might think it’s bad to have such a place and that people should mostly just rely on the academic literature. I disagree with that, but trying to make the EA Forum use the same standards that academia uses seems counterproductive. We can just use academia for that.
And at least in my mind, a big part of the point of writing things like what Linch wrote is about trying to practice my critical thinking skills and applying them to new areas, for the eventual purpose of use in areas where there’s not already a lot of scholarship. So I value approaching an area I don’t know much about, like the topic of war crimes, and trying to understand it on my own and seeing how far I can get and forming my own view, rather than necessarily seeing this as strictly an opportunity to practice building on existing literature on war crimes (or worse, just regurgitating that literature undiscerningly)
Thanks for the reference, and the point that the structural argument doesn’t handle all modern cases as well. Will address both in the post.
Though I’m confused. If you’re accusing me of reinventing the wheel, why reference Watts from 2013 and not Grotius in 1625? Or Diodotus in 427 BC, or other references in Thucydides?
I think EAs if anything are far too epistemically modest and unwilling to stick their neck out for defending true and accurate positions. I also find demands to police epistemic modesty based on hastily written quick takes annoying.
The de facto outcome, if I take these concerns seriously, is to showcase much less of my intermediate thinking, to bulletproof all my writing before it sees the light of day, and/or to crosspost to the EA Forum less.
I’ve indeed been taking actions like this, especially the last one, due to comments like yours (this is the third time you’ve done this), though I’m unsure if I endorse it on net.
I’m not trying to be unkind, and I apologise if I was. I’ll take this down if you ask here or via DM. I overreacted to what is a quick take because I think it was emblematic of a bad pattern—but that is unfair and disproportionate of me.
My main thing here is to push for better intermediate thinking. Like the standard EA/rat approach is so often based on dismissing mainstream or non-EA views, and then acting like their individual opinion is clearly superior, often reinventing current or past views that have had lots of non-EA examination. I want EA thinking to be better, and a lot of the time it would be improved by people reading more before opining, and not thinking the views of EA are so special.
We just have very different experiences then.
Do you mean critique someone on epistemic immodesty grounds? This is probably true but can you point me to the examples you have in mind? (I may indeed be doing this too much and seeing the examples would help)
Thanks for the much kinder response and the serious engagement! :) Please don’t take your comment down, it’s good to have this discussion in the open.
(Also apologies for the long comment, brain not working really well so less succinct than I want to be)
I want to defend my own approach here, and won’t speak for the “standard EA/rat approach” except insomuch as my thinking is constitutive of that approach (as the old joke goes, “you’re not in traffic, you are traffic”). Generally, when I try to learn information about the world, what I go for is to seek facts and models that are:
interesting (ie, novel to me)
true
useful
The best way to do this typically involves some combination of Google searches, original thinking, reading papers and books, conversations, toy models, and (since ~2025) talking to AIs[1]. Since college, I’ve honed an ability to form views very quickly that I can defend, and believe I’m reasonably calibrated on. I think this is sometimes surprising to people but it shouldn’t be. The first data point tells you a lot[2].
Similarly my bar for publishing my thoughts, ignoring opportunity cost, is also fairly low. The primary thing I’m interested in from a content perspective is some combination of novel/true/useful to my readers. Novel to whom? For me I have an implicit model of who my readers are and I try to calibrate accordingly. I want to write things that are new to a large fraction of my readers. I think you might have more of an academia-derived model where it’s very important to only share thoughts that are novel to humanity.
I think this is less good of a norm. If I can write a better intro to stealth than is widely understood/disseminated, I think this is a useful service even if no individual point there is original.
Similarly, I think it’s less important in non-academic contexts to attribute the originators of an idea or an analysis. I don’t think it’s useless, I just think it’s less important. But if I’m thinking about a problem, the academic citations are mostly directly useful inasmuch as they benefit either me or my readers, rather than being the first line of attack.
To be clear, credit attribution is valuable and I want to avoid actual plagiarism (I think academic norms are valuable in a bunch of ways and I want to respect the institution even when I disagree with it).
Also, this may be nonobvious, but I do in fact “do the reading” and “expert engagement” significantly, often past the point of diminishing marginal returns compared to honing my own thinking or writing.
For example, in my earlier post on war, where I summarized and extended James Fearon’s bargaining model, I read Fearon’s paper and skimmed a bunch of others to form a gestalt view. I also emailed my post to both Fearon and another academic on war (Fearon replied positively, the other academic didn’t respond).
In my Chiang review I read something like 10 reviews before starting my piece, and maybe more like 20 before finishing it.
And for war crimes in particular, I’ve been reading about it casually for several years. See here for one example.
I also think it’s very easy to say “do the reading” but in practice what reading you do is highly contingent, and it’s easy to waste a bunch of time feeling virtuous for doing the homework on adjacent topics but not actually learning useful things for addressing your original question. For example, you seem to believe that I should be reading the latest academic papers on just war (a plausible enough hypothesis!). Someone on Substack (with a relevant background!) suggested I read the negotiating history of the Geneva conventions and their Additional Protocols (also plausible!). Someone on LessWrong suggested I read Tom Schelling’s treatment of the subject (plausible enough, I ordered the book). And these are just the ones that I think are sufficiently plausible! There are so many other ways to burn time seeming to do the reading instead of committing to a hypothesis and seeing where it lands.
Finally, I’d note that when you said people’s arguments are
there is a major selection effect. If I think a mainstream view is both true and introduced well, I usually don’t bother writing about it.
__
Concretely I think Bentham’s Bulldog/Matthew comes across as overconfident on his blog, as does John Wentworth on LessWrong. But most randomly selected writers on EAF and LW are underconfident and often hedge in 10 words what they could say in 3.
Maybe a background methodological difference here is that I strongly agree with Scott Alexander on the most useful forms of criticism (highly specific, targeted, concrete). Whereas I’m skeptical of deep paradigmatic criticisms really being correct, changing people’s minds, or overall being insightful/true/useful.
I meant respond to my comments or posts in a way that seems asymmetrically easy to make but very hard to respond to/argue against on the object-level. I don’t want to dredge up the links, sorry.
Anyway, thanks for the response and for giving me an opportunity to elaborate my thoughts and overall position here.
This is for the types of questions I'm interested in and for my workflow. A historian might do more primary-source hunting/archival research offline, an ML researcher might run more experiments, a biologist might work in a wet lab or in the field, and so forth. In the past I also did more expert interviews.
If you have a model of the world/human epistemics where surprisal value is constant across learning time, or even highly superlinear per topic, then you might prioritize your actions very differently from me. (Under diminishing returns, the first ten hours on a topic buy most of the insight; under constant or superlinear returns, hour 100 buys as much or more than hour 1, so deep reading dominates.)
This is too tangential to the forecasting discussion to justify being a comment there, so I'm putting it here:
Forecasting makes no sense as a cause area, because cause areas are problems: "people lack resources/basic healthcare/etc." or "we might be building superintelligent AI and we have no idea what we're doing." Forecasting is more like a tool. People use forecasting to address AI, global poverty, and all sorts of more general problems, including ones that aren't major EA focuses.
For instance, we could treat vaccines as a cause area. Then AI-x-biosecurity people, GAVI campaigns for existing vaccines, and people working on bird flu vaccines would all be treated as doing the same thing, and we could argue about whether "vaccines" meet the funding bar. But that would be a pretty pointless argument, when really all those projects are trying to do different things with similar tools.
So I'd rather judge AI forecasting by AI standards, general-purpose forecasting by metascience standards, and global development forecasting by global development standards, rather than lumping them together as a single entity. That said, I do side with the view that there's too much money and enthusiasm being spent on forecasting, but it's a weakly held view, and it doesn't mean that no forecasting project is worth funding, or that they're all equally inflated.
The recent work on SAEBER, which applies sparse autoencoders (SAEs) to the screening of DNA synthesis printers, marks a big step towards effective function-based screening.
This allows printers to be monitored continuously. Just as a lab technician uses gel electrophoresis to separate a messy mixture into clear, readable bands, SAEs take the muddled activations of a neural network and project them into a higher-dimensional space until individual viral motifs can be seen clearly. The motifs can then be tracked as they move through the system in real time, rather than waiting for a final product.
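To make the projection step concrete, here is a minimal sketch of the generic SAE mechanism described above, in PyTorch. To be clear, this is not SAEBER's actual architecture: the dimensions, the ReLU/L1 recipe, and the loss coefficient are standard SAE choices I'm assuming for illustration.

```python
# Generic sparse autoencoder sketch, NOT SAEBER's architecture.
# Dimensions and the L1 coefficient below are illustrative assumptions.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int = 512, d_hidden: int = 4096):
        super().__init__()
        # Project activations into a much higher-dimensional space,
        # where entangled directions can separate into distinct features.
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, activations: torch.Tensor):
        # ReLU keeps features non-negative; combined with an L1 penalty,
        # most features are zero for any given input, so the few that
        # fire can be read off as candidate "motifs".
        features = torch.relu(self.encoder(activations))
        reconstruction = self.decoder(features)
        return features, reconstruction

def sae_loss(activations, features, reconstruction, l1_coeff: float = 1e-3):
    # Reconstruction term keeps the sparse code faithful to the original
    # activations; the L1 term pushes the code towards sparsity.
    mse = (reconstruction - activations).pow(2).mean()
    sparsity = features.abs().mean()
    return mse + l1_coeff * sparsity
```

Screening then amounts to watching which of the (mostly zero) features fire as the model processes a synthesis order, and checking whether any of them correspond to known viral motifs.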
However, while SAEBER is undoubtedly an effective method, can we say for a fact that it is the best tool for function-based screening? Would it be better to scan the digital thoughts of the AI guiding the synthesis, or to monitor the stability of the system itself, given that we can model the printer's physical state at any time step during a run?
While scanning the digital motifs helps reveal the AI's intent, it would be interesting to see whether monitoring the physical state of the printer provides a more resilient safety net. My intuition is that modelling the printer's state as a physical landscape and tracking changes in that landscape might be more prone to false positives from natural noise, but it also has the potential to detect divergence much earlier than waiting to interpret a complex digital signal. Has there been much discussion of combining these: using the physics of the machine to flag a problem, and the AI's internal motifs to figure out exactly what that problem is?
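For what it's worth, a combined screen of the kind I'm imagining might look something like the sketch below. Everything here is hypothetical: the function names, the threshold, and the assumption that per-sensor standard deviations and feature-to-motif labels have been precomputed are my own inventions, not anything drawn from SAEBER or an existing system.

```python
# Hypothetical fusion of the two signals: a cheap physical-state check
# flags anomalies, and SAE features are consulted only on a flag to say
# what went wrong. All names and thresholds are invented for illustration.
import numpy as np

def physical_anomaly(measured: np.ndarray,
                     predicted: np.ndarray,
                     sigma: np.ndarray,
                     threshold: float = 3.0) -> bool:
    # Flag if any sensor (temperature, flow rate, reagent draw, ...)
    # drifts more than `threshold` standard deviations from the
    # physics model's prediction. `sigma` is assumed precomputed.
    z = np.abs(measured - predicted) / sigma
    return bool(np.any(z > threshold))

def interpret_flag(active_features: dict[int, float],
                   motif_labels: dict[int, str]) -> list[str]:
    # Map the SAE features firing at flag time to human-readable motif
    # labels (labels assumed to come from prior interpretability work).
    return [motif_labels[i] for i in active_features if i in motif_labels]

def screen_step(measured, predicted, sigma, active_features, motif_labels):
    # Physics says *when* something is wrong; SAE features say *what*.
    if physical_anomaly(measured, predicted, sigma):
        return interpret_flag(active_features, motif_labels)
    return []
```

The appeal of this ordering is that the physics check is cheap enough to run at every time step, while the noisier, more expensive interpretability readout only has to be trusted for diagnosis, not detection.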