Marylen comments on Ask MIRI Anything (AMA)

Marylen 12 Oct 2016 5:27 UTC
5 points
I believe that the best and biggest system of morality so far is the legal system. It is an enormous database in which the fairest of men have built on the wisdom of their predecessors to strike a balance between fairness and avoiding chaos, and in which bad or obsolete judgments are weeded out. It is a system for prioritising laws which could one day be encoded. I believe that it would be a great tool for addressing corrigibility and value learning. I’m a lawyer and I’m afraid that MIRI may not appreciate the full potential of the legal system.

Could you tell me why the legal system would not be a great tool for addressing corrigibility and value learning in the near future?

I describe in a little more detail how I think it could be useful at: https://docs.google.com/document/d/1eRirDom-EA_CtLD9Q5T9hWLD6xEKJ80AL3h_u7K-ErA/edit?usp=sharing
- So8res 13 Oct 2016 0:03 UTC
  10 points
  In short: there’s a big difference between building a system that follows the letter of the law (but not the spirit), and a system that follows the intent behind a large body of law. I agree that the legal system is a large corpus of data containing information about human values and how humans currently want their civilization organized. In order to use that corpus, we need to be able to design systems that reliably act as intended, and I’m not sure how the legal corpus helps with that technical problem (aside from providing lots of training data, which I agree is useful).
  
  In colloquial terms, MIRI is more focused on questions like “if we had a big corpus of information about human values, how could we design a system to learn from that corpus how to act as intended”, and less focused on the lack of a corpus.
  
  The reason that we have to work on corrigibility ourselves is that we need advanced learning systems to be corrigible before they’ve finished learning how to behave correctly from a large training corpus. In other words, there are lots of different training corpuses and goal systems where, if the system is fully trained and working correctly, we get corrigibility for free; the difficult part is getting the system to behave corrigibly before it’s smart enough to be doing corrigibility for the “right reasons”.
- turchin 12 Oct 2016 21:16 UTC
  −1 points
  Agree