I’m hearing “the current approach will fail by default, so we need a different approach. In particular, the new approach should make the AI system’s reasoning clearer than current approaches do.”
Notably, that’s different from a positive case that sounds like “Here is such an approach and why it could work.”
I’m curious how much of your thinking is currently split between the two rough possibilities below.
First:
I don’t know of another approach that could work. While I personally find some people’s ideas easier to understand than others’, the many very different concrete suggestions for better understanding these systems are all arguably similar in how likely we should think they are to pan out, and in how many resources we should want to put behind them.
Alternatively, second:
While it’s incredibly difficult to communicate mathematical intuitions of this depth, my sense is that I can see a very attractive case for why one or two particular efforts (e.g. MIRI’s embedded agency work) could work out.
Thanks :)