It’s plausible that we’ll have such capable but unaligned AI systems
I simply agree, no need to convince me there 👍
Ought’s approach:
Instead of giving a training signal only after the entire AI produces its output,
give a signal after each sub-module produces its output (see the sketch below).
Yes?
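To make the contrast concrete, here’s a minimal Python sketch of the two supervision regimes: one signal for the whole pipeline versus one signal per sub-module. The names (`run_pipeline`, `judge_output`, `judge_step`) and the toy pipeline are hypothetical illustrations, not Ought’s actual code.

```python
# A minimal sketch, assuming a pipeline of sub-modules that each take a
# string and return a string. All names here are hypothetical.

from typing import Callable, List

Module = Callable[[str], str]

def run_pipeline(modules: List[Module], task: str) -> str:
    """Feed each sub-module's output into the next one."""
    x = task
    for module in modules:
        x = module(x)
    return x

def outcome_signal(modules: List[Module], task: str,
                   judge_output: Callable[[str], float]) -> float:
    """Outcome-based supervision: one signal for the final output only."""
    return judge_output(run_pipeline(modules, task))

def per_module_signals(modules: List[Module], task: str,
                       judge_step: Callable[[str, str], float]) -> List[float]:
    """Process-based supervision (the approach described above): a separate
    signal after each sub-module, judging whether its step was sensible
    given its input, not just whether the final result looked good."""
    signals, x = [], task
    for module in modules:
        y = module(x)
        signals.append(judge_step(x, y))
        x = y
    return signals

# Hypothetical two-module pipeline: decompose the task, then answer it.
decompose = lambda q: "sub-questions for: " + q
answer = lambda subqs: "answer built from: " + subqs
print(per_module_signals([decompose, answer], "Is the bridge safe?",
                         lambda x, y: 1.0))  # dummy judge approving every step
```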
My worry: The sub-modules will themselves be misaligned.
Is your suggestion: limit the compute and neural memory of sub-models in order to lower the risk?
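If that is the suggestion, a toy version might look like the following: cap each sub-module’s parameter count (its “neural memory”) and its compute per call, and reject any sub-module over the caps. The class, field names, and thresholds are all made-up illustrations, not anyone’s actual proposal.

```python
# A hypothetical sketch of per-sub-module resource caps. Names and
# thresholds are invented for illustration only.

from dataclasses import dataclass

@dataclass
class SubModuleBudget:
    max_parameters: int      # cap on "neural memory" (model size)
    max_forward_flops: int   # cap on compute per invocation

def within_budget(n_params: int, flops_per_call: int,
                  budget: SubModuleBudget) -> bool:
    """Reject any sub-module that exceeds the agreed caps."""
    return (n_params <= budget.max_parameters
            and flops_per_call <= budget.max_forward_flops)

budget = SubModuleBudget(max_parameters=10_000_000, max_forward_flops=10**9)
assert within_budget(5_000_000, 10**8, budget)       # small sub-module: allowed
assert not within_budget(10**10, 10**12, budget)     # large sub-module: rejected
```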
And my second worry is that the “big AI” (the collection of sub-modules) will be so capable that you could ask it to perform a task and it will be exceedingly effective at it, in some misaligned-to-our-values (misaligned-to-what-we-actually-meant) way