Erich_Grunewald
Erich_Grunewald’s Shortform
I agree, and though it doesn’t matter from an expected value point of view, I suspect part of what people object to in those risks is not just the probabilities being low but also there being lots of uncertainty around them.
Or actually, it could change the expected value calculation too if the probabilities aren’t normally distributed. For example, one could look at an x-risk and judge most of the probability density to be around 0.001%, feel pretty confident that it’s not more than 0.01%, but be not at all confident that it’s not below 0.0001% or even 0.00001%. This makes it different from your examples, which probably have relatively narrow, roughly normal distributions (because we have well-grounded base rates for airline accidents and voting and, I believe, robust scientific models of asteroid risks).
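As a rough illustration of that point (a minimal sketch with made-up numbers, assuming purely for the sake of the example that the credence over the true risk is roughly log-normal): the mean of a skewed distribution, which is what enters the expected value, can sit well above its mode, i.e. the “most likely” estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical credence over the true risk: most density around 0.001% (1e-5),
# with a right tail reaching towards 0.01% (log-normal, so skewed).
mu, sigma = np.log(1e-5), 1.0
samples = rng.lognormal(mean=mu, sigma=sigma, size=1_000_000)

mode = np.exp(mu - sigma**2)   # most likely value, roughly the "around 0.001%" judgement
mean = samples.mean()          # what actually enters an expected value calculation

print(f"mode ~ {mode:.1e}, mean ~ {mean:.1e}")
# The mean comes out several times higher than the mode, so the shape of the
# distribution (not just its peak) matters for the expected value.
```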
Edit: I see that Richard Y Chappell made this point already.
I take this post to argue that, just as an AGI’s alignment property won’t generalise well out-of-distribution, its ability to actually do things, i.e. achieve its goals, also won’t generalise well out-of-distribution. Does that seem like a fair (if brief) summary?
As an aside, I feel like it’s more fruitful to talk about specific classes of defects rather than all of them together. You use the word “bug” to mean everything from divide-by-zero crashes to wrong beliefs, which leads you to write things like “the inherent bugginess of AI is a very good thing for AI safety”, whereas the entire field of AI safety seems to exist precisely because AIs will have bugs (i.e. deviations from desired/correct behaviour), so if anything an inherent lack of bugs in AI would be better for AI safety.
This looks very cool!
I’m curious: why do you need to sign up to view the lessons?

Also, a quibble: some links (like the author’s name next to the course) aren’t actually HTML `<a>` elements, which makes it impossible to e.g. right-click and open in a new tab, and is also bad for accessibility purposes.

For what it’s worth, I don’t think the design is particularly childish (as some others have opined). I see a similar style all the time in the creative/tech/start-up-ish world, and there it’s surely aimed at adults.
I don’t think that negates the validity of the critique.
Agreed—I didn’t mean to imply it was.
Okay, but I still don’t know what the view says about x-risk reduction (the example in my previous comment)?
By “the view”, do you mean the consequentialist person-affecting view you argued against, or one of the non-consequentialist person-affecting views I alluded to?
If the former, I have no idea.
If the latter, I guess it depends on the precise view. On the deontological view I find pretty plausible, we have, roughly speaking, a duty to humanity, and that’d mean actions that reduce x-risk are good (and vice versa). (I think there are also other deontological reasons to reduce x-risk, but that’s the main one.) I guess I don’t see any way that changes depending on what the default is? I’ll stop here since I’m not sure this is even what you were asking about …
My objection to it is that you can’t use it for decision-making because it depends on what the “default” is. For example, if you view x-risk reduction as preventing a move from “lots of happy people to no people” this view is super excited about x-risk reduction, but if you view x-risk reduction as a move from “no people to lots of happy people” this view doesn’t care.
That still seems somehow like a consequentialist critique though. Maybe that’s what it is and was intended to be. Or maybe I just don’t follow?
From a non-consequentialist point of view, whether a “no people to lots of happy people” move (like any other move) is good or not depends on other considerations, like the nature of the action, our duties or virtue. I guess what I want to say is that “going from state A to state B”-type thinking is evaluating world states in an outcome-oriented way, and that just seems like the wrong level of analysis for those other philosophies.
From a consequentalist point of view, I agree.
Risk of famine in Somalia
pablo is quoting a 10-year-old comment; the 80k article you link was published in 2020.
for what it’s worth, something like one fifth of eas don’t identify as consequentialist.
This is not to say that these people were good parents, that they didn’t have extensive help, or that they didn’t heavily rely on their spouses to do deeply unequal child rearing. But it should be surprising that if we were one of the only groups in history working so productively that we should eschew child rearing entirely.
it doesn’t seem surprising at all to me—for example, i have a hard time thinking of any historical community that has not separated child-rearing duties by gender. i mean, i’m sure there’s one out there, but it’s probably vanishingly rare. the present seems very unusual in that regard.
https://ourworldindata.org/grapher/regional-averages-of-the-composite-gender-equality-index
As you noticed, I limited the scope of the original comment to axiology (partly because moral theory is messier and more confusing to me), hence the handwaviness. Generally speaking, I trust my intuitions about axiology more than my intuitions about moral theory, because I feel like my intuition is more likely to “overfit” on more complicated and specific moral dilemmas than on more basic questions of value, or something in that vein.
Anyway, I’ll just preface the rest of this comment with this: I’m not very confident about all this and at any rate not sure whether deontology is the most plausible view. (I know that there are consequentialists who take person-affecting views too, but I haven’t really read much about it. It seems weird to me because the view of value as tethered seems to resist aggregation, and it seems like you need to aggregate to evaluate and compare different consequences?)
On Challenge 1A (and as a more general point) - if we take action against climate change, that presumably means making some sort of sacrifice today for the sake of future generations. Does your position imply that this is “simply better for some and worse for others, and not better or worse on the whole?” Does that imply that it is not particularly good or bad to take action on climate change, such that we may as well do what’s best for our own generation?
Since in deontology we can’t compare two consequences and say which one is better, the answer depends on the action used to get there. I guess what matters is whether the action that brings about world X involves us doing or neglecting (or neither) the duties we have towards people in world X (and people alive now). Whether world X is good/bad for the population of world X (or for people alive today) only matters to the extent that it tells us something about our duties to those people.
Example: Say we can do something about climate change either (1) by becoming benevolent dictators and implementing a carbon tax that way, or (2) by inventing a new travel simulation device, which reduces carbon emissions from flights but is also really addictive. (Assume the consequences of these two scenarios have equivalent expected utility, though I know the example is unfair since “dictatorship” sounds really bad—I just couldn’t think of a better one off the top of my head.) Here, I think the Kantian should reject (1) and permit or even recommend (2), roughly speaking because (2) respects people’s autonomy (though the “addictive” part may complicate this a bit) in a way that (1) does not.
Also on Challenge 1A—under your model, who specifically are the people it is “better for” to take action on climate change, if we presume that the set of people that exists conditional on taking action is completely distinct from the set of people that exists conditional on not taking action (due to chaotic effects as discussed in the dialogue)?
I don’t mean to say that a certain action is better or worse for the people that will exist if we take it. I mean more that what is good or bad for those people matters when deciding what duties we have to them, and this matters when deciding whether the action we take wrongs them. But of course the action can’t be said to be “better” for them as they wouldn’t have existed otherwise.
On Challenge 1B, are you saying there is no answer to how to ethically choose between those two worlds, if one is simply presented with a choice?
I am imagining this scenario as a choice between two actions, one involving waving a magic wand that brings world X into existence, and the other waving it to bring world Y into existence.
I guess deontology has less to say about this thought experiment than consequentialism does, given that the latter is concerned with the values of states of affairs and the former more with the values of actions. What this thought experiment does is almost eliminate the action, reducing it to a choice of value. (Of course choosing is still an action, but it seems qualitatively different to me in a way that I can’t really explain.) Most actions we’re faced with in practice probably aren’t like that, so it seems like ambivalence in the face of pure value choices isn’t too problematic?
I realise that I’m kind of dodging the question here, but in my defense you are, in a way, asking me to make a decision about consequences, and not actions. :)
On Challenge 2, does your position imply that it is wrong to bring someone into existence, because there is a risk that they will suffer greatly (which will mean they’ve been wronged), and no way to “offset” this potential wrong?
One of the weaknesses in deontology is its awkwardness with uncertainty. I think one okay approach is to put values on outcomes (by “outcome” I mean e.g. “violating duty X” or “carrying out duty Y”, not a state of affairs as in consequentialism) and multiply by probability. So I could put a value on “wronging someone by bringing them into a life of terrible suffering” and on “carrying out my duty to bring a flourishing person into the world” (if we have such a duty) and calculate expected value that way. Then whether or not the action is wrong would depend on the level of risk. But that is very tentative …
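To make that slightly more concrete, here is a minimal sketch of the calculation with purely hypothetical numbers (the risk level and the values on the two duty-outcomes are made up for illustration, not claims about what they should actually be):

```python
# Purely hypothetical numbers, only to illustrate the expected-value-over-duties
# idea sketched above; none of these values are claims about the actual stakes.

p_terrible_life = 0.01      # assumed risk that the new life is one of terrible suffering
wronging_value = -1000.0    # (dis)value of wronging someone by bringing them into such a life
duty_value = 10.0           # value of carrying out a (putative) duty to bring a flourishing person into the world

expected_value = p_terrible_life * wronging_value + (1 - p_terrible_life) * duty_value

print(expected_value)  # -0.1: narrowly negative at a 1% risk
# At a 0.1% risk the same calculation comes out clearly positive (about +9),
# so on this approach whether the action is wrong depends on the level of risk.
```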
Really like this post!
I think one important crux here is differing theories of value.
My preferred theory is the (in my view, commonsensical) view that for something to be good or bad, it has to be good or bad for someone. (This is essentially Christine Korsgaard’s argument; she calls it “tethered value”.) That is, value is conditional on some valuer. So where a utilitarian might say that happiness/well-being/whatever is the good and that we therefore ought to maximise it, I say that the good is always dependent on some creature who values things. If all the creatures in the world valued totally different things than what they do in our dimension, then that would be the good instead.
(I should mention that, though I’m not very confident about moral philosophy, to me the most plausible view is a version of Kantianism. Maybe I give 70% weight to that, 20% to some form of utilitarianism and the rest to Schopenhauerian ethics/norms/intuitions. I can recommend being a Kantian effective altruist: it keeps you on your toes. Anyway, I’m closer to non-utilitarian Holden in the post, but with some differences.)
This view has two important implications:
It no longer makes sense to aggregate value. As Korsgaard puts it, “If Jack would get more pleasure from owning Jill’s convertible than Jill does, the utilitarian thinks you should take the car away from Jill and give it to Jack. I don’t think that makes things better for everyone. I think it makes it better for Jack and worse for Jill, and that’s all. It doesn’t make it better on the whole.”
It no longer makes sense to talk about the value of potential people. Their non-existence is neither good nor bad because there is no one for it to be good or bad for. (Exception: They can still be valued by people who are alive. But let’s ignore that.)
I haven’t spent tons of time thinking about how this shakes out in longtermism, so quite a lot of uncertainty here. But here’s roughly how I think this view would apply to your thought experiments:
Challenge 1A—climate change. If we decide to ignore climate change, then we wrong future people (because climate change is bad for them). If we don’t ignore it, then we don’t wrong those people (because they won’t exist); we also don’t wrong the future people who will exist, because we did our best to mitigate the problem. In a sense, we have a duty to future generations, whoever they may be.
Challenge 1B—world A/B/C. It doesn’t make sense to compare different worlds in this way, because that would necessarily involve aggregation. Instead, we have to evaluate each action based on whether it wrongs (or benefits, or neither) the people in the world it produces.
Challenge 2 -- asymmetry. I don’t think this objection applies here. The relevant question is still: does our action wrong the person who does come into existence? If we have good reason to believe that a new life will be full of suffering, and we choose to bring it into existence, plausibly we do wrong that person. If we have good reason to believe that the life will be great, and we choose to bring it into existence, obviously we don’t wrong the person. (If we do not bring it into existence, we don’t wrong anyone, because there’s no one to wrong.)
Additional thoughts:
I want to mention a harder problem than the “should we have as many children as possible?” one you mention. It is that it seems ok to abort a fetus that would have a happy life, but it seems really wrong not to abort a fetus we know would have a terrible life full of pain and suffering. (This is apparently called the asymmetry problem in philosophy.) These intuitions make perfect sense if we take the view that value is tethered. But they don’t really make sense in total utilitarianism.
Extinction would still be very bad, but it would be bad for the people who are alive when it happens, and for all the people in history whose work to improve things in the far future is being thwarted.
(I recognise that my view gets weirder when we bring probability into the picture (as we have to). That’s something I want to think more about. I also totally recognise that my view is pretty complicated, and simplicity is one of the things I admire in utilitarianism.)
I think one important difference between me and non-utilitarian Holden is that I am not a consequentialist, but I kind of suspect that he is? Otherwise I would say that he is ceding too much ground to his evil twin. ;)
Claim (5) is more interesting. People certainly seem to value free public education and healthcare highly (“The NHS is the closest thing the English have to a religion”). Many families that send their children to public school could afford to pay tuition, if they had to.
maybe you are talking about two different things:
valuing the product alone
valuing the product and its price as a package deal
people probably really like free health care because it’s health care and free. but that doesn’t necessarily mean they value the health care they get for free as much as they would value the health care that they paid for, had they done that instead. it just means they value not having to spend any money on it.
There’s no attempt to quantify how much the “whole is greater than the sum of the parts”. If the whole is just 1% greater than the sum of the parts, maybe it’s no big deal and can be safely ignored when making rough estimates. (ie, if we magically overcame all sexism to eliminate anti-women bias, and also all racism to overcome anti-minority bias, how much anti-minority-women bias would be left? Maybe “intersectional” anti-minority-women bias is 50% or more of the problem, but maybe it’s very small relative to the first-order problems of “non-intersectional” racism and sexism. I’ve never seen anyone try to explore whether “intersectionality” is a huge deal or just a minor epicycle in the social-justice universe.)
this might be a nitpick, and i generally agree with your comment, but i think that question—whether there’d be any anti-minority-women bias left after eliminating anti-women and anti-minority bias—isn’t really the right thing to ask. if the old view was that anti-minority-women bias is anti-minority bias plus anti-women bias, the intersectional view would be closer to multiplying the two factors. in that case, anti-minority-women bias would still go to zero if the other two were eliminated. it might be better to ask something like, “how much total anti-minority-women bias is there at various levels of anti-minority and anti-women bias?”
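to illustrate what i mean, here’s a toy sketch with made-up bias levels, purely to show the difference between the two models (nothing here is meant as an estimate of real-world bias):

```python
# Toy sketch with made-up numbers. 'additive' is the old view
# (anti-minority-women bias = sum of the parts); 'with_interaction'
# adds a multiplicative term, as in the intersectional view.

def additive(anti_minority: float, anti_women: float) -> float:
    return anti_minority + anti_women

def with_interaction(anti_minority: float, anti_women: float) -> float:
    return anti_minority + anti_women + anti_minority * anti_women

for m, w in [(0.4, 0.3), (0.4, 0.0), (0.0, 0.0)]:
    print(m, w, additive(m, w), with_interaction(m, w))

# The interaction term vanishes whenever either component bias is zero,
# so both models predict roughly the same thing in that limit; the
# interesting comparison is how they differ at intermediate levels.
```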
thanks for writing this. i think the version of intersectionality that you define is useful in that it highlights non-linear effects. but it’s worth noting that intersectional analysis can lead us to interventions that are suboptimal from a cost-effectiveness perspective. like, an intersectional analysis of poverty could recommend providing aid for identity groups that are poor on average, but an effective altruist may prefer to give aid to people based on their income, no matter their identity.
by the way, i think this clearer thinking podcast episode with amber dawn and holly elmore was really good. it touches on intersectionality among other things.
To what extent if any is centralisation/decentralisation useful in improving infosec?
The obvious way to reduce infosec risk is to beef up security. Another is to disincentivise actors from attacking in the first place. Are there any good ways of doing that (other than maybe criminal justice)?
How large a portion of infosec risk is due to software/hardware issues and how large due to social engineering?
I wrote a post about Kantian moral philosophy and (human) extinction risk. Summary:
The deontologist in me thinks human extinction would be very bad for three reasons:
We’d be failing in our duty to humanity itself (55% confidence).
We’d be failing in our duty to all those who have worked for a better future (70% confidence).
We’d be failing in our duty to those wild animals whose only hope for better lives rests on future human technology (35% confidence).