Did your outcomes 2 and 3 get mixed up at some point? I feel like the evaluations don’t align with the initial descriptions of those, but maybe I’m misunderstanding.
Thanks for writing this, though; it's something I've been thinking about a little as I try to understand longtermism better. It makes sense to be risk-averse about existential risk, but at the same time I have a hard time understanding some of the more extreme takes. My wild guess would be that AI has a significantly higher chance of improving humanity's well-being than of causing extinction. Like I said, care is warranted with existential risk, but slowing AI development also delays your positive outcomes 2 and 3, and I haven't seen much discussion about the downsides of delaying.
Also, I'm not sure about outcome 1 having zero utility. Maybe that's standard notation, but it seems unintuitive to me; it kind of buries the downsides of extinction risk. It would seem more natural to me as a negative utility, relative to the positive utility that currently exists in the world.
Yep, thanks for pointing that out! Fixed it.
...I haven't seen much discussion about the downsides of delaying

I'm not sure how your first point relates to what I was saying in this post, but I'll take a guess. I said something about how investing in capabilities at Anthropic could be good. An upside would be increasing the probability that EAs end up controlling a superintelligent AGI in the future. The downside is that it could shorten timelines, though hopefully that can be mitigated by keeping all of the research under wraps (which is what they are doing). This is a controversial issue, though. I haven't thought very much about whether the upsides outweigh the downsides, but the argument in this post made me think the upsides are larger than I thought before.
Also, I'm not sure about outcome 1 having zero utility...

It doesn't matter which outcome you assign zero value to, as long as the relative values are the same: if one utility function is a positive affine transformation of another, the two produce equivalent decisions.
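To spell that out (with toy numbers purely for illustration): suppose $U'(x) = aU(x) + b$ with $a > 0$. Then for any two gambles $p$ and $q$,

$$\mathbb{E}_p[U'] - \mathbb{E}_q[U'] = a\,\bigl(\mathbb{E}_p[U] - \mathbb{E}_q[U]\bigr),$$

so $U'$ ranks $p$ above $q$ exactly when $U$ does. For instance, assigning utilities $(0, 1, 100)$ to outcomes 1, 2, and 3 yields the same ranking of every gamble as assigning $(-100, -99, 0)$, since the second assignment is just the first shifted down by $100$; the version with a negative utility for extinction and the version with zero utility are decision-theoretically identical.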
I'm not sure how your first point relates to what I was saying in this post, but I'll take a guess.

Sorry, what I said wasn't very clear. To rephrase: I was thinking more along the lines of what the possible futures for AI might look like if there were no EA interventions in the AI space. I haven't seen much discussion of the possible downsides there (for example, slowing down AI research by prioritizing alignment, which delays AI advancement and the good things it would bring about). But this was a less-than-half-baked idea; thinking about it some more, I'm having trouble coming up with scenarios where that produces a lower expected utility.
It doesn't matter which outcome you assign zero value to...

Thanks, I follow this now and see what you mean.