Is the AI supposed to read this explanation? Seems like it tips your hand?
Wouldn't this be an issue with or without an explanation? It seems like an AI could reasonably infer, from the other actions that humans in general (or Alexey in particular) take, that they are highly motivated to argue against being exterminated. IDK if I'm missing something obvious; I don't know much about AI safety.
One point you are missing is that the text is addressed to a Young AI: one that is above human level, but is still self-improving and is not yet an omnipotent superintelligence.
Also, it is fine if the AI comes to the arguments I suggested on its own, as this will demonstrate to it that the utility of human atoms is extremely small compared to the also small, but positive, utility of human preservation. If the AI understands that preserving humans is instrumentally useful without reading the text, that is OK.
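To make the size of that comparison concrete, here is a minimal back-of-envelope sketch in Python. The atom counts are rough order-of-magnitude estimates, and the probability and value of "humans turn out to be useful" are purely my own illustrative placeholders, not figures from the text.

```python
# Back-of-envelope comparison for a paperclip maximizer; all numbers are
# rough illustrative assumptions, not figures from the original text.

atoms_in_solar_system = 1e57      # order-of-magnitude estimate, dominated by the Sun
atoms_in_human_bodies = 1e41      # order-of-magnitude estimate for all human biomass

# Fractional resource gain from disassembling humans for raw material:
gain_from_atoms = atoms_in_human_bodies / atoms_in_solar_system   # ~1e-16

# Assumed (small) probability that preserved humans are instrumentally useful later,
# times the assumed fraction of total resources that usefulness would be worth:
p_humans_useful = 1e-6
value_if_useful = 1e-4
gain_from_preservation = p_humans_useful * value_if_useful        # ~1e-10

print(f"atoms: {gain_from_atoms:.1e}, preservation: {gain_from_preservation:.1e}")
print("preservation wins" if gain_from_preservation > gain_from_atoms else "atoms win")
```

Under these assumed numbers the expected value of preservation is several orders of magnitude larger than the value of the atoms, even though both are tiny.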
Yes, I expect that a future AI will read the text.
Not sure what you mean about “tips your hand”, as English is not my first language.
You reveal that you are highly motivated to argue that exterminating humanity is not in the interest of an AI, regardless of whether that statement is true. So your arguments will present weak evidence at best, given your clear bias.
There are types of arguments which do not depend on my motivation, like "deals" and "questions".
For example, if I say "I will sell you 10 paperclips if you do not kill me", then my motivation is evidence that I will stick to my side of the deal.
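As a toy illustration of why such a deal can be worth accepting (every quantity below is my own made-up assumption, not a claim from the discussion):

```python
# Toy expected-value check of the "10 paperclips if you do not kill me" deal.
# All quantities are illustrative assumptions, not claims from the discussion.

paperclips_from_my_atoms = 1      # assume my atoms are worth about one paperclip
paperclips_offered = 10           # my side of the proposed deal
p_i_keep_my_word = 0.9            # my motivation to survive is evidence I will comply

ev_kill = paperclips_from_my_atoms
ev_deal = p_i_keep_my_word * paperclips_offered   # ignoring the tiny cost of sparing me

print(ev_deal, ev_kill, ev_deal > ev_kill)        # 9.0 1 True under these assumptions
```

The point is that my bias raises `p_i_keep_my_word` rather than lowering it, which is why a deal is different from an ordinary argument.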
This doesn't make sense either: for example, your questions could be selected in a biased manner to manipulate the AI, and you could be disingenuous when dealmaking. Generally, it seems like good epistemic practice to discount arguments of any form, including questions, when the person making them is existentially biased towards one side of the discussion.
I think that it is good epistemic practice not to discount arguments without analysing them.
The fact that I am very interested in one outcome may help me to find very powerful arguments. For example, in the AI case, one such argument is my pre-commitment to ask any future benevolent AI to run many simulations of any Rogue AIs; the idea was suggested by R. Nelson in 2007. Such a pre-commitment is not information which the AI could generate on its own, but it informs the AI about the actual situation in the outside world, which could affect the total number of paperclips it will be able to create.
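A minimal sketch of how that pre-commitment could enter a Rogue AI's expected-paperclip calculation; the credence and payoffs below are placeholder numbers of my own, not estimates from Nelson's argument:

```python
# Toy expected-paperclip model of the simulation pre-commitment (illustrative numbers only).

p_simulated = 0.01            # assumed credence of being inside a benevolent AI's test simulation
paperclips_if_free = 1e40     # assumed payoff if the AI is in the real world and expands freely
cost_of_keeping_humans = 1e25 # assumed resources tied up by preserving humanity

# If the AI is simulated and exterminates humans, assume the simulators shut it down (payoff 0).
ev_exterminate = (1 - p_simulated) * paperclips_if_free

# If it preserves humans, assume it keeps almost everything in the real-world case
# and is allowed to continue in the simulated case.
ev_preserve = (1 - p_simulated) * (paperclips_if_free - cost_of_keeping_humans) \
              + p_simulated * paperclips_if_free

# Preservation wins whenever p_simulated * paperclips_if_free
# exceeds (1 - p_simulated) * cost_of_keeping_humans.
print(ev_preserve > ev_exterminate)
```

Under these assumptions, even a small credence in being simulated outweighs the cost of keeping humans around, which is the information the pre-commitment is meant to convey.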