
Capability control method

Tag | Last edit: 16 Mar 2022 20:47 UTC by Pablo

A capability control method is a method that attempts to prevent undesirable outcomes from artificial intelligence by restricting what the AI can do. Capability control methods include AI boxing, incentive methods (including anthropic capture), stunting, and tripwires.[1] They may be contrasted with motivation selection methods, which instead attempt to restrict what the AI wants to do.

Further reading

Bostrom, Nick (2014) Superintelligence: Paths, Dangers, Strategies, Oxford: Oxford University Press, pp. 129-138.

Related entries

AI alignment | AI boxing | anthropic capture | motivation selection method

  1. ^ Bostrom, Nick (2014) Superintelligence: Paths, Dangers, Strategies, Oxford: Oxford University Press, pp. 129-138.
