In the context of a misaligned AI takeover, making negotiations and contracts with a misaligned AI in order to allow it to take over does not seem useful to me at all.
A misaligned AI that is in power could simply decide to walk back any promises and ignore any contracts it agreed to. Humans could not do anything about it, because by that point they would have lost all their power.
Some objections:
- Building better contract enforcement ability might be doable. (Though enforcement is pretty tricky, both for holding humans to their commitments and for holding AIs to theirs.)
- Negotiation could involve tit-for-tat arrangements.
- Unconditional surrenders can reduce the need for violence if we ensure there is some process for demonstrating that the AI would have succeeded at takeover with very high probability. (Note that I’m assuming the AI puts some small amount of weight on the preferences of currently alive humans, as I discuss in the parent.)
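To make the tit-for-tat point concrete: the intuition comes from iterated games, where conditional cooperation can be stable even between parties who can't enforce contracts, because defection today costs cooperation tomorrow. Here's a minimal sketch using the standard iterated prisoner's dilemma (the payoff numbers and strategy names are the usual textbook illustration, not anything from this discussion):

```python
# Illustrative iterated prisoner's dilemma, as an analogy for repeated
# human-AI exchanges where neither side can enforce the other's promises.
# Payoff values are the conventional textbook ones.

PAYOFFS = {  # (my_move, their_move) -> my payoff
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def tit_for_tat(opponent_history):
    """Cooperate on the first round, then mirror the other side's last move."""
    return "C" if not opponent_history else opponent_history[-1]

def always_defect(opponent_history):
    """A party that walks back every promise, unconditionally."""
    return "D"

def play(strategy_a, strategy_b, rounds=6):
    """Run an iterated game and return each side's total payoff."""
    hist_a, hist_b = [], []  # each side's record of the *other's* past moves
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a)
        move_b = strategy_b(hist_b)
        hist_a.append(move_b)
        hist_b.append(move_a)
        score_a += PAYOFFS[(move_a, move_b)]
        score_b += PAYOFFS[(move_b, move_a)]
    return score_a, score_b
```

Two mutual tit-for-tat players lock into cooperation and both do well, while an unconditional defector extracts a one-round gain and then gets punished every round after. The caveat, which is the original objection above, is that this mechanism only works while the game stays iterated: a party that gains decisive power can end the repetition and defect with impunity.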