[Epistemic status: third-hand information, definitely check with someone from MIRI. ETA: confirmed by MIRI’s Rob Bensinger below.]

> Since 2016, actually “about half” of MIRI’s research has been on their ML agenda, apparently to cover the chance of prosaic AGI.
My impression is that MIRI never did much work on the ML agenda you link to because the relevant key researchers left.
For a few years, they do seem to have been doing a lot of nonpublic work that’s distinct from their agent foundations work. But I don’t know how related that work is to their old ML agenda, or what fraction of their research it represents.
I agree with Max’s take. MIRI researchers still look at Alignment for Advanced Machine Learning Systems (AAMLS) problems periodically, but per Why I am not currently working on the AAMLS agenda, they mostly haven’t felt the problems are tractable enough right now to warrant a heavy focus.

Nate describes our new work here: 2018 Update: Our New Research Directions.
> Since 2016, actually “about half” of MIRI’s research has been on their ML agenda, apparently to cover the chance of prosaic AGI.
I don’t think any of MIRI’s major research programs, including AAMLS, have been focused on prosaic AI alignment. (I’d be interested to hear if Jessica or others disagree with me.)
Paul introduced prosaic AI alignment (in November 2016) with:
> It’s conceivable that we will build “prosaic” AGI, which doesn’t reveal any fundamentally new ideas about the nature of intelligence or turn up any “unknown unknowns.” I think we wouldn’t know how to align such an AGI; moreover, in the process of building it, we wouldn’t necessarily learn anything that would make the alignment problem more approachable. So I think that understanding this case is a natural priority for research on AI alignment.
In contrast, I think of AAMLS as assuming that we’ll need new deep insights into intelligence in order to actually align an AGI system. There’s a large gulf between (1) “Prosaic AGI alignment is feasible” on the one hand and (2) “AGI may be produced by techniques that are descended from current ML techniques” or (3) “Working with ML concepts and systems can help improve our understanding of AGI alignment” on the other; I think of AAMLS as assuming some combination of 2 and 3, but not 1. From a post I wrote in July 2016:
> [… AAMLS] is intended to help more in scenarios where advanced AI is relatively near and relatively directly descended from contemporary ML techniques, while our agent foundations agenda is more agnostic about when and how advanced AI will be developed.

> As we recently wrote, we believe that developing a basic formal theory of highly reliable reasoning and decision-making “could make it possible to get very strong guarantees about the behavior of advanced AI systems — stronger than many currently think is possible, in a time when the most successful machine learning techniques are often poorly understood.” Without such a theory, AI alignment will be a much more difficult task.

> The authors of “Concrete problems in AI safety” write that their own focus “is on the empirical study of practical safety problems in modern machine learning systems, which we believe is likely to be robustly useful across a broad variety of potential risks, both short- and long-term.” Their paper discusses a number of the same problems as the [AAMLS] agenda (or closely related ones), but directed more toward building on existing work and finding applications in present-day systems.

> Where the agent foundations agenda can be said to follow the principle “start with the least well-understood long-term AI safety problems, since those seem likely to require the most work and are the likeliest to seriously alter our understanding of the overall problem space,” the concrete problems agenda [by Amodei, Olah, Steinhardt, Christiano, Schulman, and Mané] follows the principle “start with the long-term AI safety problems that are most applicable to systems today, since those problems are the easiest to connect to existing work by the AI research community.”

> Taylor et al.’s new [AAMLS] agenda is less focused on present-day and near-future systems than “Concrete problems in AI safety,” but is more ML-oriented than the agent foundations agenda.
Indeed, Why I am not currently working on the AAMLS agenda is a year-later write-up by the lead researcher. Moreover, they write:

> MIRI is not optimistic about prosaic AGI alignment and doesn’t put much time into it.

On the other hand, in its 2018 review MIRI wrote about new research directions, one of which feels ML-adjacent. But from the few paragraphs describing it, the direction doesn’t seem relevant for prosaic AI alignment:
> Seeking entirely new low-level foundations for optimization, designed for transparency and alignability from the get-go, as an alternative to gradient-descent-style machine learning foundations.
Thanks for this info.
Just a heads up that the link seems to lead back to this post. I think you mean to link to this?
Thanks; fixed.
Thanks for this, I’ve flagged this in the main text. Should’ve paid more attention to my confusion on reading their old announcement!