If AIs can own property and earn income by selling their labor on an open market, then they can simply work a job and use their income to purchase whatever they want, without any need to violently “take over the world” to satisfy their goals.
If an individual AI’s relative skill-level is extremely high, then this could simply translate into higher wages for them, obviating the need for them to take part in a violent coup to achieve their objectives.
For example, one can imagine a human hiring a paperclip-maximizer AI to perform work and paying it a wage. The paperclip maximizer could then use its wages to buy more paperclips.
It could be that the AI can achieve much more of its objectives if it takes over (violently or non-violently) than it can achieve by playing by the rules. To use your paperclip example, the AI might think it can get 10^22 paperclips if it takes over the world, but can only get 10^18 paperclips with the strategy of making money through legal means and buying paperclips on the open market. In this case, the AI would prefer the takeover plan even if it has only a 10% chance of success, since 0.1 × 10^22 = 10^21 paperclips in expectation still vastly exceeds 10^18.
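To make the comparison concrete, here is a minimal expected-value sketch using the numbers from the paperclip example; the 10% success probability is the one assumed above, and everything else is illustrative:

```python
# Hedged sketch: expected paperclips under the two strategies described
# above. All numbers are illustrative placeholders.

takeover_paperclips = 10**22   # paperclips if a takeover succeeds
takeover_success_prob = 0.10   # assumed chance the takeover works
legal_paperclips = 10**18      # paperclips achievable by earning wages

expected_takeover = takeover_success_prob * takeover_paperclips  # 10^21
expected_legal = legal_paperclips                                # 10^18

# For a pure paperclip maximizer, 10^21 > 10^18, so the risky takeover
# plan dominates despite its 90% chance of failure.
print(expected_takeover > expected_legal)  # True
```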
Also, the objectives of the AI must be designed in such a way that they can be achieved legally. For example, if an AI strongly prefers a higher average temperature of the planet, but the humans put a cap on the global average temperature, then this objective will be hard to achieve without breaking laws or bribing lawmakers.
There are lots of ways for an AI’s objectives to be shaped badly, and obtaining guarantees that they don’t take these bad shapes remains very difficult.
I might also say “The ‘true’ P(heads) is either 0 or 1”. If the coin comes up heads, the “‘true’ P(heads)” is 1, otherwise the “‘true’ P(heads)” is 0. Uncertainty exists in the map, not in the territory.
Okay, in this specific context I can guess what you mean by the statement, but I think this is an inaccurate way to express yourself, and the previous “I have a 95% credence that the bias is between X and Y” is a more accurate way to express the same thing.
To explain, let’s formalize things using probability theory. One way to formalize the above would be to use a random variable B which describes the bias of the coin; then you could say P(X ≤ B ≤ Y) = 0.95 and P(heads | B = b) = b for all b ∈ [0, 1]. I think this would be a good way to formalize things. A wrong way to formalize things would be to say P(X ≤ P(heads) ≤ Y) = 0.95. And people might think of the latter when you say “I have a 95% credence that the ‘true’ P(heads) is between X and Y”: they might mistakenly think that “‘true’ P(heads)” is a real number, when what you actually mean is that it is a random variable.
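A small Monte Carlo sketch may make the distinction vivid. The uniform prior on B is purely an illustrative assumption, not something implied by the discussion above:

```python
import random

# Minimal sketch of the formalization above: B is a random variable for
# the coin's bias, and P(heads | B = b) = b. The distribution chosen
# for B (uniform on [0, 1]) is an illustrative assumption.

def sample_flip():
    b = random.uniform(0.0, 1.0)   # draw the bias B
    heads = random.random() < b    # flip the coin once, given B = b
    return b, heads

samples = [sample_flip() for _ in range(100_000)]

# P(X <= B <= Y) for, say, X = 0.4 and Y = 0.6: a statement about B,
# the random variable, not about P(heads).
p_bias_in_range = sum(0.4 <= b <= 0.6 for b, _ in samples) / len(samples)

# P(heads) is still a single real number: the marginal over B.
p_heads = sum(heads for _, heads in samples) / len(samples)

print(p_bias_in_range)  # ~0.2 under the uniform assumption
print(p_heads)          # ~0.5 -- a number, not a random variable
```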
So, can you have a probability of probabilities?
If you do things formally, I think you should avoid “probability of probabilities” statements. You can have a probability of an event, but probabilities themselves are real numbers, and in almost all useful formalizations, real numbers are not events. Making sense of such a statement always requires some kind of interpretation (e.g. random variables that refer to biased coins, or other context), and I think it is better to avoid such statements. If sufficient context is provided, I can guess what you mean, but otherwise I cannot parse such statements meaningfully.
On a related note, a “median of probabilities” also does not make sense to me.
What about P(doom)?
It does make sense to model your uncertainty about doom by looking at different scenarios. I am not opposed to modeling P(doom) in more detail rather than just conveying a single number. If you have three scenarios, you can consider a random variable S with values in {1, 2, 3} which describes which scenario will happen. But in the end, the following formula is the correct way to calculate the probability of doom: P(doom) = P(doom | S=1) P(S=1) + P(doom | S=2) P(S=2) + P(doom | S=3) P(S=3).
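Spelled out in code, with placeholder numbers for the scenario weights and conditional probabilities:

```python
# Direct translation of the formula above. The scenario probabilities
# and conditional dooms are made-up placeholder numbers.

p_scenario = {1: 0.5, 2: 0.3, 3: 0.2}        # P(S = s), sums to 1
p_doom_given = {1: 0.05, 2: 0.40, 3: 0.90}   # P(doom | S = s)

# Law of total probability over the scenario variable S
p_doom = sum(p_doom_given[s] * p_scenario[s] for s in p_scenario)
print(p_doom)  # 0.325 for these placeholder numbers
```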
Talking about a “median P(doom)” does not make sense in my opinion. Two people can have the same beliefs about the world, but if they model their beliefs with different but equivalent models, they can end up with different “median P(doom)” values (assuming a certain, imo naive, calculation of that median).
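A minimal sketch of that failure mode, with made-up numbers: both models below encode the same overall P(doom) = 0.5, yet a naive weighted median over scenarios disagrees between them:

```python
# Sketch of why a naive "median P(doom)" is model-dependent. Both
# models imply the same P(doom), but splitting one scenario into two
# equivalent sub-scenarios changes the weighted median.

def weighted_median(scenarios):
    """Naive median: smallest conditional doom at cumulative weight >= 0.5."""
    cumulative = 0.0
    for weight, doom in sorted(scenarios, key=lambda s: s[1]):
        cumulative += weight
        if cumulative >= 0.5:
            return doom

model_a = [(0.5, 0.2), (0.5, 0.8)]                # (P(S=s), P(doom|S=s))
model_b = [(0.25, 0.1), (0.25, 0.3), (0.5, 0.8)]  # scenario 1 split in two

print(sum(w * d for w, d in model_a))  # 0.5
print(sum(w * d for w, d in model_b))  # 0.5 -- same overall P(doom)

print(weighted_median(model_a))  # 0.2
print(weighted_median(model_b))  # 0.3 -- a different "median P(doom)"
```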
What about the cosmologist’s belief in simulation shutdown?
As you might guess by now, I side with the journalist here. If we were to formalize the cosmologist’s expressed beliefs using different scenarios (as above in the AI x-risk example), then the resulting P(doom) would be what the journalist reports.
It is fine to say “I don’t know” when asked for a probability estimate. But the cosmologist’s beliefs look incoherent to me as soon as they enter the domain of probability theory.
I suspect it feels like a paradox because he gives much higher estimates for simulation shutdown than he actually believes, in particular in the 1% slice where doom is between 10% and 100%.