Yeah, no worries! I think this is helping me to figure out what my issue is, which I think is related to what probability ranges are “reasonable”.
I’m a little confused here; it seems like since I gave 6 probabilities, it would on the contrary be surprising if 2 of them weren’t pretty close to each other, even the more uncorrelated ones?
That’s the thing. I do think it’s surprising. When we are talking about speculative events, the range of probabilities should be enormous. If I estimate the odds that Vladimir Putin is killed by a freak meteor strike, the answer is not in the 1-99% range, it’s in the 1 in a billion range. What are the odds that Paraguay becomes a world superpower in the next 50 years? 1 in a million? 1 in a trillion? Conversely, what are the odds that the sun will rise on the Earth in 2070? About as close to 1 as it’s possible for an estimate to get.
When we consider question 5, we are asking about the winner of a speculative war between a society that we are very uncertain about and an AI that we know next to nothing about. In question 3, you are asking for an estimate of the competence of future AI engineers at constraining an as yet unknown AI design. In both cases, I see the “reasonable range” of estimates as being extremely broad. I would not be surprised if the “true estimate” (if that’s even a coherent concept) for Q5 was 1 in a billion, or 1 in a thousand, or nearly 1 in 1. This is what strikes me as off. Out of all the possible answers for these highly speculative questions in logarithmic space, why would both of them end up in the 75-85% range?
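To make “logarithmic space” concrete, here is a minimal sketch that converts a few of the probabilities mentioned above into base-10 log-odds. It is only an illustration of the scale being discussed, not anyone’s actual forecasting method, and the helper name log10_odds is made up for this example.

```python
import math

def log10_odds(p: float) -> float:
    """Return log10(p / (1 - p)) for a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log10(p / (1.0 - p))

# Probabilities taken from the discussion above; labels are just for printing.
examples = [
    ("1 in a billion", 1e-9),
    ("1 in a thousand", 1e-3),
    ("75%", 0.75),
    ("85%", 0.85),
]

for label, p in examples:
    print(f"{label:>16}: {log10_odds(p):+.2f}")
```

On this scale, 75% and 85% sit roughly 0.3 units apart, while 1 in a billion and 1 in a thousand are about 6 units apart, which is the sense in which two answers landing in the 75-85% band looks like a narrow target.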
Consider, by contrast, your COVID-19 predictions. These seem to be bounded in a way that the examples above aren’t. There was uncertainty about whether the vaccine would be implemented in 2021 or 2023, perhaps you could make a reasonable case for predicting it would take until like 2025. But if I gave an answer of “2112 AD”, you would look at me like a crazy person. It seems like the AI estimates are unbounded in their ranges in a way the Metaculus questions aren’t.
This is where the object level and the meta level get kinda hard to untangle. I think if you accept my meta level reasoning, it also necessitates lowering your object level estimates. If you try and make the estimates for the individual steps vary more (by putting a 0.1% in there or something), the total probability will then end up being at least as low as that step. But I’m not sure if this is necessarily wrong? If your case relies on a chain of at least somewhat independent unbounded speculative events, placing odds as high as 40% seems like an error on its face.
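The arithmetic behind that point can be sketched in a few lines. The step values below are hypothetical numbers chosen purely for illustration (they are not the estimates under discussion), and the sketch assumes each value is the probability of a step conditional on the earlier steps, so that the chain probability is their product.

```python
import math

def chain_probability(steps: list[float]) -> float:
    """Probability that every step occurs, assuming each value is already
    conditional on the previous steps, so the values can be multiplied."""
    return math.prod(steps)

# Hypothetical six-step chains, for illustration only.
clustered = [0.8, 0.8, 0.85, 0.9, 0.8, 0.95]     # every step fairly likely
one_low = [0.8, 0.8, 0.85, 0.9, 0.001, 0.95]     # one step set to 0.1%

print(f"clustered:    {chain_probability(clustered):.3f}")
print(f"one low step: {chain_probability(one_low):.6f}")

# The product can never exceed its smallest factor.
assert chain_probability(one_low) <= min(one_low)
```

With these made-up inputs the clustered chain comes out around 37%, while swapping a single step to 0.1% pulls the total below 0.1%, since a product of probabilities is capped by its smallest term.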
The way I think about what range of probabilities is reasonable is mostly by considering reference classes for (a) the object-level prediction being made and (b) the success rate of relatively similar predictions in the past. I agree that, a priori, we’d expect to have little confidence in most claims that feel very speculative, but I think we can get a lot of evidence from considering more specific reference classes.
Let’s take the example of determining whether AI would disempower humanity:
For (a), I think looking at the reference class of “do more intelligent entities disempower less intelligent entities? (past a certain level of intelligence)” is reasonable and would give a high baseline (one could then adjust down from the reference class forecast based on how strong the considerations are that we will potentially be able to see it coming to some extent, prepare in advance, etc.).
For (b), I think a reasonable reference class would be previous long-term speculative forecasts made by futurists. My read is that these were right about 30-50% of the time.
Also:
In both cases, I see the “reasonable range” of estimates as being extremely broad. I would not be surprised if the “true estimate” (if that’s even a coherent concept) for Q5 was 1 in a billion, or 1 in a thousand, or nearly 1 in 1.
I agree that we shouldn’t be shocked if it’s the case that the “true” estimate for at least one of the questions is very confident, but I don’t think we should be shocked if the “best realistically achievable” estimates for all of them aren’t that confident, where “best realistically achievable” estimates are subject to our very limited time and reasoning capacities.
I think the choice of reference class is itself a major part of the object level argument. For example, instead of asking “do more intelligent entities disempower less intelligent entities”, why not ask “does the side of a war starting off with vastly more weapons, manpower and resources usually win?” Or “do test subjects usually escape and overpower their captors?” Or “Has any intelligent entity existed without sufficient flaws to prevent them from executing world domination?” These reference classes suggest a much lower estimate.
Now, all of these reference classes are flawed in that none of them correspond 1 to 1 with the actual situation at hand. But neither does yours! For example, in none of the previous cases of higher intelligence overpowering lower intelligences has the lower intelligence had the ability to write the brain of the higher intelligence. Is this a big factor or a small factor? Who knows?
As for (b), I just don’t agree that predictions about the outcome of future AI wars are in a similar class to questions like “will there be manned missions to Mars” or “predicting the smartphone”.
Anyway, I’m not too interested in going in depth on the object level right now. Ultimately I’ve only barely scratched the surface of the flaws leading to overestimation of AI risk, and it will take time to break through, so I thank you for your illuminating discussion!
I agree that the choice of reference class matters a lot and is non-obvious (and hope I didn’t imply otherwise!).