Thanks for the response! This clears up what I was wondering about:
I intend identifying new arguments or considerations based on current evidence to be allowed
I have some more thoughts regarding the following, but want to note up front that no response is necessary—I’m just sharing my thoughts out loud:
I’m more skeptical than you that this would converge that much closer to 0% or 100%. I think there’s a ton of effectively irreducible uncertainty in forecasting something as complex as whether misaligned AI will takeover this century.
I agree there’s a ton of irreducible uncertainty here, but… what’s a way of putting it… I think there are lots of other strong forecasters who think this too, but might look at the evidence that humanity has today and come to a significantly different forecast than you.
Like who is to say that Nate Soares and Daniel Kokotajlo’s forecasts are wrong? (Though actually it takes a smaller likelihood ratio for you to update to reach their forecasts than it does for you to reach MacAskill’s forecast.) Presumably they’ve thought of some arguments and considerations that you haven’t read or thought of before. I think it wouldn’t surprise me if this team deliberating on humanity’s current evidence for a thousand years would come across those arguments or considerations (or some other ones) in their process of logical induction (to use a term I learned from MIRI that roughly means updating without new evidence) and ultimately decide on a final forecast very different than yours as a result.
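To make the likelihood-ratio aside concrete, here is a rough sketch of the arithmetic in odds form: the likelihood ratio needed to move from one probability to another is the ratio of posterior odds to prior odds. The function names are my own. I am assuming the 3% figure discussed in this thread is MacAskill's forecast, and the 80% below is purely a hypothetical placeholder for "a much higher forecast", not anyone's actual number.

```python
def odds(p: float) -> float:
    """Convert a probability in (0, 1) to odds in favor."""
    return p / (1.0 - p)


def likelihood_ratio_needed(prior: float, posterior: float) -> float:
    """Bayes factor that moves `prior` to `posterior` (odds-form Bayes' rule)."""
    return odds(posterior) / odds(prior)


def update_size(prior: float, posterior: float) -> float:
    """Size of the update as a factor >= 1, ignoring its direction."""
    lr = likelihood_ratio_needed(prior, posterior)
    return max(lr, 1.0 / lr)


current_forecast = 0.35           # the 35% forecast discussed above
low_forecast = 0.03               # assumed here to be MacAskill's figure
hypothetical_high_forecast = 0.80 # placeholder only, NOT anyone's real forecast

print(round(update_size(current_forecast, low_forecast), 1))                # 17.4
print(round(update_size(current_forecast, hypothetical_high_forecast), 1))  # 7.4
```

Under these assumed numbers, getting from 35% down to 3% takes evidence about 17 times more likely under "no takeover" than under "takeover", while getting up to the placeholder 80% takes a factor of about 7. Whether the comparison comes out the same way for any particular real forecast depends on what that forecast is.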
Perhaps another way of saying this is that your current forecast may be 35% not because that’s the best forecast that can be made with humanity’s current evidence, given the irreducible uncertainty in the world, but rather because you don’t currently have all of humanity’s current evidence. Perhaps your 35% is more reflective of your own ignorance than the actual amount of irreducible uncertainty in the world.
Reflecting a bit more, I’m realizing I should ask myself what I think is the appropriate level of confidence that 3% is too low. Thinking about it a bit more, 90% actually doesn’t seem that high, even given what I just wrote above. I think my main reason for thinking it may be too high is that 1000 years is a long time for a team of 100 reasonable people to think about the evidence humanity currently has and I’d expect such a team to get a much better understanding of what the actual risk of misaligned AI takeover is than anyone alive today has, even without new evidence. And because I feel like we’re in a state of relative ignorance about the risk still, it wouldn’t surprise me if after the 1000 years they justifiably believed they could be much more confident one way or the other about the amount of risk.