MIRI 2024 Mission and Strategy Update

Malo Jan 5, 2024, 1:10 AM

154 points

Organization updates Machine Intelligence Research Institute AI safety Announcements and updates Existential risk Policy

As we announced back in October, I have taken on the senior leadership role at MIRI as its CEO. It’s a big pair of shoes to fill, and an awesome responsibility that I’m honored to take on.

There have been several changes at MIRI since our 2020 strategic update, so let’s get into it.^[1]

The short version:

We think it’s very unlikely that the AI alignment field will be able to make progress quickly enough to prevent human extinction and the loss of the future’s potential value that we expect will result from loss of control to smarter-than-human AI systems.

However, developments this past year like the release of ChatGPT seem to have shifted the Overton window in a lot of groups. There’s been a lot more discussion of extinction risk from AI, including among policymakers, and the discussion quality seems greatly improved.

This provides a glimmer of hope. While we expect that more shifts in public opinion are necessary before the world takes actions that sufficiently change its course, it now appears more likely that governments could enact meaningful regulations to forestall the development of unaligned, smarter-than-human AI systems. It also seems more possible that humanity could take on a new megaproject squarely aimed at ending the acute risk period.

As such, in 2023, MIRI shifted its strategy to pursue three objectives:

Policy: Increase the probability that the major governments of the world end up coming to some international agreement to halt progress toward smarter-than-human AI, until humanity’s state of knowledge and justified confidence about its understanding of relevant phenomena has drastically changed; and until we are able to secure these systems such that they can’t fall into the hands of malicious or incautious actors.^[2]
Communications: Share our models of the situation with a broad audience, especially in cases where talking about an important consideration could help normalize discussion of it.
Research: Continue to invest in a portfolio of research. This includes technical alignment research (though we’ve become more pessimistic that such work will have time to bear fruit if policy interventions fail to buy the research field more time), as well as research in support of our policy and communications goals.^[3]

We see the communications work as instrumental support for our policy objective. We also see candid and honest communication as a way to bring key models and considerations into the Overton window, and we generally think that being honest in this way tends to be a good default.

Although we plan to pursue all three of these priorities, it’s likely that policy and communications will be a higher priority for MIRI than research going forward.^[4]

The rest of this post will discuss MIRI’s trajectory over time and our current strategy. In one or more future posts, we plan to say more about our policy/comms efforts and our research plans.

Note that this post will assume that you’re already reasonably familiar with MIRI and AGI risk; if you aren’t, I recommend checking out Eliezer Yudkowsky’s recent short TED talk, along with some of the resources cited on the TED page:

“A.I. Poses ‘Risk of Extinction,’ Industry Leaders Warn”, New York Times
“We must slow down the race to god-like AI”, Financial Times
“Pausing AI Developments Isn’t Enough. We Need to Shut it All Down”, TIME
“AGI Ruin: A List of Lethalities”, AI Alignment Forum

MIRI’s mission

Throughout its history, MIRI’s goal has been to ensure that the long-term future goes well, with a focus on increasing the probability that humanity can safely navigate the transition to a world with smarter-than-human AI. If humanity can safely navigate the emergence of these systems, we believe this will lead to unprecedented levels of prosperity.

How we’ve approached that mission has varied a lot over the years.

When MIRI was first founded by Eliezer Yudkowsky and Brian and Sabine Atkins in 2000, its goal was to try to accelerate to smarter-than-human AI as quickly as possible, on the assumption that greater-than-human intelligence entails greater-than-human morality. In the course of looking into the alignment problem for the first time (initially called “the Friendly AI problem”), Eliezer came to the conclusion that he was completely wrong about “greater intelligence implies greater morality”, and MIRI shifted its focus to the alignment problem around 2003.

MIRI has continued to revise its strategy since then. In ~2006–2012, our primary focus was on trying to establish communities and fields of inquiry focused on existential risk reduction and domain-general problem-solving ability (e.g., the rationality community). Starting in 2013, our focus was on Agent Foundations research and trying to ensure that the nascent AI alignment field outside MIRI got off to a good start. In 2017–2020, we shifted our primary focus to a new engineering-heavy set of alignment research directions.

Now, after several years of reorienting and exploring different options, we’re making policy and communications our top focus, as that currently seems like the most promising way to serve our mission.

At a high level, MIRI is a place where folks who care deeply about humanity and its future, and who share a loose set of intuitions about the challenges we face in safely navigating the development of smarter-than-human AI systems, have teamed up to work towards a brighter future. In the spirit of “keep your identity small”, we don’t want to build more content than that into MIRI-the-organization: if an old “stereotypical” MIRI belief or strategy turns out to be wrong, we should just ditch the failed belief/strategy and move on.

With that context in view, I’ll next say a bit about what, concretely, MIRI has been up to recently, and what we plan to do next.

MIRI in 2021–2022

In our last strategy update (posted in December 2020), Nate Soares wrote that the research push we began in 2017 “has, at this point, largely failed, in the sense that neither Eliezer nor I have sufficient hope in it for us to continue focusing our main efforts there[...] We are currently in a state of regrouping, weighing our options, and searching for plans that we believe may yet have a shot at working.”

In 2021–2022, we focused on:

the aforementioned process of trying to find more promising plans,
smaller-scale support for our Agent Foundations and our 2017-initiated research programs (along with some exploration of other research ideas),^[5]
dialoguing with other people working on AI x-risk, and
writing up many of our background views.^[6]

The most central write-ups from this period include:

Our sense is that these and other MIRI write-ups were helpful for better communicating our perspective on the strategic landscape, which in turn is important for understanding how we’ve changed our strategy. Some of the main points we tried to communicate were:

Our default expectation is that humanity will soon develop AI systems that are smarter than humans.^[7] We think the world isn’t ready for this, and that rushing ahead will very likely cause human extinction and the destruction of the future’s potential value.
The world’s situation currently looks extremely dire to us. For example, Eliezer wrote in late 2021, “I consider the present gameboard to look incredibly grim, and I don’t actually see a way out through hard work alone. We can hope there’s a miracle that violates some aspect of my background model, and we can try to prepare for that unknown miracle”.
The primary reason we think the situation is dire is that humanity’s current technical understanding of how to align smarter-than-human systems seems vastly lower than what’s likely to be required. Not much progress has been made to date (relative to what’s required), including at MIRI; and we see the field to date as mostly working on topics orthogonal to the core difficulties, or approaching the problem in a way that assumes away these core difficulties.
It would be a mistake to give up, but it would also be a mistake to start piling on optimistic assumptions and mentally living in a more-optimistic (but not real) world. If, somehow, we achieve a breakthrough, it’s likely to come from an unexpected direction, and we’re likely to be in a better position to take advantage of it if we’ve kept in touch with reality.

In his TIME article, Eliezer Yudkowsky describes what he sees as the sort of policy that would be minimally required in order for governments to prevent the world from destroying itself: an indefinite worldwide moratorium on new large training runs, enforced by an international agreement with actual teeth. The LessWrong mirror of Eliezer’s TIME piece goes into more detail on this, adding several clarifying addenda.

In the absence of sufficient political will/consensus to impose such a moratorium, we think the best policy objective to focus on would be building an “off switch”—that is, assembling the legal and technical capability needed to make it possible to shut down a dangerous project or impose an indefinite moratorium on the field, if policymakers decide at a future date that it’s necessary to do so. Having the option to shut things down would be an important step in the right direction.

The difficulties of proactively enacting and effectively enforcing such an international agreement are obvious to the point that, until recently, MIRI had much more hope in finding some AI-alignment-mediated solution to AGI proliferation, in spite of how little relevant technical progress has been made to date.

However, a combination of recent developments has changed our minds about this. First, our hope in anyone finding technical solutions in time has declined, and second, the moderate success of our communications efforts has increased our hope in that direction.

Although we still think the situation looks bleak, it now looks a little less bleak than we thought it did a year ago; and the source of this new hope lies in the way the public conversation about AI has recently shifted.

New developments in 2023

In the past, MIRI has mostly spent its time on alignment research and outreach to technical audiences—that is, outreach to the sort of people who might do relevant technical work.

Several developments this year have updated us toward thinking that we should prioritize outreach activities with an emphasis on influencing policymakers and groups that may influence policymakers, including AI researchers and the general public:

1. GPT-3.5 and GPT-4 were more impressive than some of us expected. We already had short timelines, but these launches were a further pessimistic update for some of us about how plausible it is that humanity could build world-destroying AGI with relatively few (or no) additional algorithmic advances.

2. The general public and policymakers have been more receptive to arguments about existential risk from AGI than we expected, and their responses have been pretty reasonable. For example, we were positively surprised by the reception to Eliezer’s February interview on Bankless and his March piece in TIME magazine (which was TIME’s highest-traffic page for a week). Eliezer’s piece in TIME mentions that we’d been surprised by how many non-specialists’ initial reactions to hearing about AI risk were quite reasonable and grounded.

3. More broadly, there has been a shift in the Overton window toward “take extinction risk from AGI more seriously”, including within ML. Geoffrey Hinton and Yoshua Bengio’s public statements seemed pivotal here; likewise the one-sentence statement signed by the CEOs of DeepMind, Anthropic, and OpenAI, and by hundreds of other AI scientists and public intellectuals:

Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.

A recent example of this 2023 shift was my participation in one of the U.S. Senate’s bipartisan AI Insight Forums. It was heartening to see how far the discourse has come—Leader Schumer opened the event by asking attendees for our probability that AI could lead to a doomsday scenario, using the term “p(doom)”, while Senators and their staff listened and took notes. We would not have predicted this level of interest and reception at the beginning of the year.

Even if policymakers turn out not to be immediately receptive to arguments for halting large training runs, this may still be a critical time for establishing precedents and laying policy groundwork that could be built on if policymakers and their constituents become more alarmed in the future, e.g., in the unfortunate event of a major, but not existential, AI-related disaster.

Policy efforts like this seem very unlikely to save us, but all other plans we know of seem even less likely to succeed. As a result, we’ve started building capacity for our public outreach efforts, and ramping up these efforts.

Looking forward

In the coming year, we plan to continue our policy, communications, and research efforts. We are somewhat limited in our ability to enact policy programs by our 501(c)(3) status, but we work closely with others at organizations with different structures that are more free to engage in advocacy work. We are growing our communications team and expanding our capacity to broadcast the basic arguments about AI x-risk. And of course, we’re continuing our alignment research programs and expanding the research team, both to communicate our results better and to explore new ideas.

If you are interested in working directly with MIRI as we grow, please watch our careers page for job listings.

1. ^
Thanks to Rob Bensinger, Gretta Duleba, Matt Fallshaw, Alex Vermeer, Lisa Thiergart, and Nate Soares for your valuable thoughts on this post.
2. ^
As Nate has written about in Superintelligent AI is necessary for an amazing future, but far from sufficient, we would consider it an enormous tragedy if humanity never developed artificial superintelligence. However, regulators may have a difficult time determining when we’ve reached the threshold “it’s now safe to move forward on AI capabilities.”
One alternative, proposed by Nate, would be for researchers to stop trying to pursue de novo AGI, and instead pursue human whole-brain emulation or human cognitive enhancement. This helps largely sidestep the issue of bureaucratic legibility, since the risks are far lower and success criteria are a lot clearer; and it could allow us to realize many of the near-term benefits of aligned AGI (e.g., for existential risk reduction).
3. ^
Various people at MIRI have different levels of hope about this. Nate and Eliezer both believe that humanity should not be attempting technical alignment at its current level of cognitive ability, and should instead pursue human cognitive enhancement (e.g., via uploading), and then have smarter (trans)humans figure out alignment.
4. ^
Lisa comments:
I personally would like to note my dissenting perspective on this overall choice.
While I agree MIRI can contribute in the short term with a comms and policy push towards effective regulation, in the medium to long term I think research is our greater comparative advantage and I think we should keep substantial focus here (as well as increase empirical research staff). We should continue doing research but shift our focus more towards technical work which can support regulatory efforts (e.g., safety standards), including empirical work but also for example drawing on our Agent Foundations experience to produce theoretical frameworks. I think stronger technical (ideally empirically grounded) research arguments targeting scientific government advisors, regulators and lab decision makers (rather than public-oriented comms using philosophical arguments) on why there is AI risk and why mitigation makes sense, are more critically missing from the picture and a component MIRI can perhaps uniquely contribute. I also personally expect more impact to come from influencing lab decision makers and creating more of an academic/research consensus on safety risks rather than hoping for substantial regulatory success in the shorter term of the next 1–3 years.
Further, solving critical open questions in the safety standards space on how to regulate using metrics (and how to do this scientifically and correctly) seems to me a priority for at least the next 1–2 years. I think it’s important to have a diversity of perspectives including non-lab organizations like MIRI creating this knowledge base.
5. ^
In the case of our Agent Foundations research team, team size stayed the same, but we didn’t put any effort into trying to expand the team.
6. ^
We also helped set up a co-working space and related infrastructure for other AI x-risk orgs like Redwood Research. We’re fans of Redwood, and often direct researchers and engineers to apply to work there if they don’t obviously fit the far more unusual and constrained research niches at MIRI.
7. ^
Where “soon” means, roughly, “there’s a lot of uncertainty here, but it’s a very live possibility that AGI is only a few years away; and it no longer seems likely to be (for example) 30+ years away.” In a fall 2023 poll of most MIRI researchers, respondents expected AGI (according to the definition from this Metaculus market) in a median of 9 years and a mean of 14.6 years. One researcher was an outlier at 52 years; the majority predicted under ten years.


Crossposted from LessWrong (223 points, 44 comments)

Geoffrey Miller Jan 5, 2024, 7:27 PM
51 points
13 ∶ 0

Malo—bravo on this pivot in MIRI’s strategy and priorities. Honestly it’s what I’ve hoped MIRI would do for a while. It seems rational, timely, humble, and very useful! I’m excited about this.
I agree that we’re very unlikely to solve ‘technical alignment’ challenges fast enough to keep AI safe, given the breakneck rate of progress in AI capabilities. If we can’t speed up alignment work, we have to slow down capabilities work.
I guess the big organizational challenge for MIRI will be whether its current staff, who may have been recruited largely for their technical AI knowledge, general rationality, and optimism about solving alignment, can pivot towards this more policy-focused and outreach-focused agenda—which may require quite different skill sets.
Let me know if there’s anything I can do to help, and best of luck with this new strategy!
NickLaing Jan 5, 2024, 8:37 AM
45 points
11 ∶ 6

I appreciate the impressive epistemic humility it must have taken for one of the original and most prestigious alignment research orgs to decide that right now prioritising policy and communications work over research might be the best course to follow. I would imagine that might be a somewhat painful decision for technical people who have devoted their life to finding a technical solution. Nice one!

“Although we plan to pursue all three of these priorities, it’s likely that policy and communications will be a higher priority for MIRI than research going forward.”
- trevor1 Jan 5, 2024, 6:31 PM
  −51 points
  0 ∶ 19
  
  Strong downvoted. This isn’t a laughing matter.
  I understand what it’s like to think of a really funny joke and not want to waste it. But this isn’t an appropriate environment to substitute charisma for substance.
  If EA grows by, say, 30% per year, then that means at any given time there’s going to be a large number of people on the forum who will see this, think it’s normal, and upvote it (reinforcing that behavior). Even if professional norms hold strong, it will still make the onboarding process that much harder and more confusing for the new people, as they are misled into making serious social-status-damaging faux pas, and that reputation might follow them around in the community for years regardless of how talented or valuable they become.
  - Larks Jan 5, 2024, 6:58 PM
    22 points
    17 ∶ 0
    
    I assumed Nick was being sincere?
    - NickLaing Jan 5, 2024, 8:42 PM
      10 points
      1 ∶ 0
      
      Yes I was being sincere. I might have missed some meta thing here as obviously I’m not steeped in AI alignment. Perhaps Trevor intended to reply on another comment but mistakenly replied here?
      - trevor1 Jan 5, 2024, 8:52 PM
        13 points
        0 ∶ 0
        
        Oops! I’m off my groove today, sorry. I’m going to go read up on some of the conflict theory vs. mistake theory literature on my backlog in order to figure out what went wrong and how to prevent it (e.g. how human variation and inferential distance causes very strange mistakes due to miscommunication).
  - titotal Jan 5, 2024, 7:00 PM
    5 points
    10 ∶ 0
    
    I’m confused. The comment reads as sincere to me? What part of it did you think was a joke?
  - Hayven Frienby Jan 9, 2024, 12:31 PM
    1 point
    0 ∶ 0
    
    This is why I don’t think the goal should be to grow the movement. Movements that grow by seeking converts usually end up drifting far from their original mission and taking on negative, irrational aspects of the societies they emerge from. Religious and political history provide dozens of examples of this process taking place.
    
    EA should be about quality over quantity, in my opinion, and “social status” is both figuratively and literally worthless in the face of extinction.
maxime Jan 5, 2024, 11:01 AM
29 points
8 ∶ 0

Could you unpack (1) how you plan to work towards “Increase the probability that the major governments of the world end up coming to some international agreement” and (2) how confident you are a foundational research org can transition into and make a difference in the policy space?
Will Aldred Jan 5, 2024, 3:49 PM
11 points
1 ∶ 1

I’m curious, since it sounds like MIRI folks may have thought about this, if you have takes on how best to allocate marginal effort between pushing for cooperation-to-halt-AI-progress on the one hand, and accelerating cognitive enhancement (e.g., mind uploading) on the other?^[1]
Like, I see that you list promoting cooperation as a priority, but to me, based on your footnote 3, it doesn’t seem obvious that promoting cooperation to buy ourselves time is a better strategy at the margin than simply working on mind uploading.^[2] (At least, I don’t see this being obviously true for people-trying-to-reduce-AI-risk at large, and I’d be interested in your—or others’—thoughts here, in case there’s something I’m missing. It may well be clearly true for MIRI given your comparative advantages; I’m asking this question from the perspective of overall AI risk reduction strategy.) Here’s that footnote 3:
Nate and Eliezer both believe that humanity should not be attempting technical alignment at its current level of cognitive ability, and should instead pursue human cognitive enhancement (e.g., via uploading), and then have smarter (trans)humans figure out alignment.
Related recent discussion:
- “Does davidad’s uploading moonshot work?”
  - Context: David Dalrymple (aka davidad) recently outlined a concrete plan for mind uploading by 2040.
Related prediction markets:
- Eliezer’s Manifold market, “If Artificial General Intelligence has an okay outcome, what will be the reason?”
  - At present, the leading answer is: “Humanity successfully coordinates worldwide to prevent the creation of powerful AGIs for long enough to develop human intelligence augmentation, uploading, or some other pathway into transcending humanity’s window of fragility.”
- My Metaculus question, “Will mind uploading happen before AGI?”
  - The current community prediction is 1%.^[3]
1. ^
  ETA: I’ve just noticed that earlier today, another Forum user posted a quick take on a similar theme, asking why there’s been no EA funding for cognitive enhancement projects. See here.
2. ^
  The immediate lines of reasoning I can think of for why “put all marginal effort towards pausing AI” is the best strategy right now are: i) uploading is intractable given AGI timelines, and ii) future, just-before-the-pause models—GPT-7, say—could help significantly with mind uploading R&D. But then, assuming that uploading is our best bet for getting alignment right, I think ii just shifts the discussion to things like “where is the best place to pause (with respect to the tradeoff between powerful automation of uploading R&D versus not pausing too late)?” and “are there ways to push for differential progress in models’ capabilities? (e.g., narrow superhuman ability in neuroscience research).”
  What’s more, as counters to i: Firstly, most problems fall within a 100x tractability range. Secondly, even if cooperation+pause efforts are clearly higher impact right now than object-level uploading work, I think there’s still the argument that field-building for mind uploading should start now, rather than once the pause is in place. Because if field-building starts now, then with luck there’ll be a body of uploading researchers ready to make the most of a future pause. (This argument doesn’t go through if the pause lasts indefinitely, because in that case there’s time to build up the mind uploading field from scratch in the pause. But it does go through if the pause is limited or fragile, which I tentatively believe are more likely possibilities. See also Scott Alexander’s taxonomy of AI pauses.)
3. ^
  Taken together, these two prediction markets arguably paint a grim picture. Namely, the trades on Eliezer’s question imply that mind uploading is the most likely way that AGI goes well for humanity, but the forecasts on my question imply that we’re very unlikely to get mind uploading before AGI.
- Geoffrey Miller Jan 5, 2024, 7:21 PM
  9 points
  1 ∶ 0
  
  Will—we seem to be many decades away from being able to do ‘mind uploading’ or serious levels of cognitive enhancement, but we’re probably only a few years away from extremely dangerous AI.
  I don’t think that betting on mind uploading or cognitive enhancement is a winning strategy, compared to pausing, heavily regulating, and morally stigmatizing AI development.
  (Yes, given a few generations of iterated embryo selection for cognitive ability, we could probably breed much smarter people within a century or two. But they’d still run a million times slower than machine intelligences. As for mind uploading, we have nowhere near the brain imaging abilities required to do whole-brain emulations of the sort envisioned by Robin Hanson.)
  - Hayven Frienby Jan 9, 2024, 12:56 PM
    0 points
    0 ∶ 0
    
    Agreed, but as I said earlier, acceptance seems to be the answer. We are limited, biological beings, who aren’t capable of understanding everything about ourselves or the universe. We’re animals. I understand this leads to anxiety and disquiet for a lot of people. Recognizing the danger of AI and the impossibility of transhumanism and mind uploading, I think the best possible path forward is to just accept our limited state, rationally stagnate our technology, and focus on social harmony and environmental protection as the way forward.
    As for the despair this could cause to some, I’m not sure what the answer is. EA has taken a lot of its organizational structure and methods of moral encouragement from philosophies like Confucianism, religions, universities, etc. Maybe an EA-led philosophical research project into human ultimate hope (in the absence of techno-salvation) would be fruitful.
    - Geoffrey Miller Jan 10, 2024, 2:19 AM
      4 points
      1 ∶ 0
      
      Hayven—there’s a huge, huge middle ground between reckless e/acc ASI accelerationism on the one hand, and stagnation on the other hand.
      I can imagine a moratorium on further AGI research that still allows awesome progress on all kinds of wonderful technologies such as longevity, (local) space colonization, geoengineering, etc—none of which require AGI.
      - Hayven Frienby Jan 10, 2024, 3:59 AM
        1 point
        0 ∶ 0
        
        We can certainly research those things, but using purely human efforts (no AI), progress will likely take many decades to show even modest gains. From a longtermist perspective that’s not a problem of course, but it’s a difficult thing to sell to someone not excited about living what is essentially a 20th century life so we can make progress long after they are gone. A ban on AI should come with a cultural shift toward a much less individualistic, less present-oriented value set.
- Greg_Colbourn ⏸️ Jan 10, 2024, 5:27 PM
  1 point
  0 ∶ 1
  
  I think there is an unstated assumption here that uploading is safe. And by safe, I mean existentially safe for humanity^[1]. If in addition to being uploaded, a human is uplifted to superintelligence, would they—indeed any given human in such a state—be aligned enough with humanity as a whole to not cause an existential disaster? Arguably humans right now are only relatively existentially safe because power imbalances between them are limited.
  
  Even the nicest human could accidentally obliterate the rest of us if uplifted to superintelligence and left running for subjective millions of years (years of our time). “Whoops, I didn’t expect that to happen from my little physics experiment”; “Uploading everyone into a hive mind is what my extrapolations suggested was for the best (and it was just so boring talking to you all at one word per week of my time)”.
  1. ^
    Although safety for the individual being uploaded would be far from guaranteed either.
  - MichaelStJules Jan 10, 2024, 8:46 PM
    3 points
    1 ∶ 0
    
    We could upload many minds, trying to represent some (sub)distribution of human values (EDIT: and psychological traits), and augment them all slowly, limiting power imbalances between them along the way.
    - Greg_Colbourn ⏸️ Jan 10, 2024, 9:47 PM
      2 points
      0 ∶ 0
      
      Perhaps. But remember they will be smarter than us, so controlling them might not be so easy (especially if they gain access to enough computing power to speed themselves up massively). And they need not be hostile, just curious, to accidentally doom us.
  - Will Aldred Jan 10, 2024, 7:39 PM
    2 points
    0 ∶ 0
    
    Yes, this is a fair point; Holden has discussed these dangers a little in “Digital People Would Be An Even Bigger Deal”. My bottom-line belief, though, is that mind uploads are still significantly more likely to be safe than ML-derived ASI, since uploaded minds would presumably work, and act, much more similarly to (biological) human minds. My impression is that others also hold this view? I’d be interested if you disagree.
    To be clear, I rank moratorium > mind uploads > ML-derived ASI, but I think it’s plausible that our strategy portfolio should include mind uploading R&D alongside pushing for a moratorium.
    - Greg_Colbourn ⏸️ Jan 10, 2024, 8:20 PM
      2 points
      0 ∶ 0
      
      I agree that they would most likely be safer than ML-derived ASI. What I’m saying is that they still won’t be safe enough to prevent an existential catastrophe. It might buy us a bit more time (if uploads happen before ASI), but that might only be measured in years. Moratorium >> mind uploads > ML-derived ASI.
      - MichaelStJules Jan 10, 2024, 8:43 PM
        4 points
        0 ∶ 0
        Parent
        
        Why do you expect an existential catastrophe from augmented mind uploads?
        Greg_Colbourn ⏸️ Jan 10, 2024, 9:44 PM
        0 points
        0 ∶ 0
        Parent
        
        Because of the crazy high power differential, and propensity for accidents (can a human really not mess up on an existential scale if acting for millions of years subjectively at superhuman capability levels?). As I say in my comment above:
        Even the nicest human could accidentally obliterate the rest of us if uplifted to superintelligence and left running for subjective millions of years (years of our time). “Whoops, I didn’t expect that to happen from my little physics experiment”; “Uploading everyone into a hive mind is what my extrapolations suggested was for the best (and it was just so boring talking to you all at one word per week of my time)”.
        MichaelStJules Jan 10, 2024, 11:28 PM
        4 points
        1 ∶ 0
        Parent
        
        This doesn’t seem like a strong enough argument to justify a high probability of existential catastrophe (if that’s what you intended?).
        
        At vastly superhuman capabilities (including intelligence and rationality), it should be easier to reduce existential-level mistakes to tiny levels. They would have vastly more capability for assessing and mitigating risks and for moral reflection (not that this would converge to some moral truth; I don’t think there is any).
        
        If you think this has a low chance of success (if we could delay AGI long enough to actually do it), then alignment seems pretty hopeless to me on that view, and a temporary pause only delays the inevitable doom.
        
        I do think we could do better (for upside-focused views) by ensuring more value pluralism and preventing particular values from dominating, e.g. by uploading and augmenting multiple minds.
        Greg_Colbourn ⏸️ Jan 11, 2024, 3:37 PM
        2 points
        0 ∶ 1
        Parent
        
        At vastly superhuman capabilities (including intelligence and rationality), it should be easier to reduce existential-level mistakes to tiny levels. They would have vastly more capability for assessing and mitigating risks and for moral reflection
        They are still human though, and humans are famous for making mistakes, even the most intelligent and rational of us. It’s even regarded by many as part of what being human is—being fallible. That’s not (too much of) a problem at current power differentials, but it is when we’re talking of solar-system-rearranging powers for millions of subjective years without catastrophic error...
        a temporary pause only delays the inevitable doom.
        Yes. The pause should be indefinite, or at least until global consensus to proceed, with democratic acceptance of whatever risk remains.
- Hayven Frienby Jan 9, 2024, 12:41 PM
  0 points
  0 ∶ 0
  Parent
  
  Thank you for this well-sourced comment. I’m not affiliated with MIRI, so I can’t answer the questions directed to the OP. With that said, I did have a small question to ask you. What would be your issue with simply accepting human fragility and limits? Does the fact that we don’t and can’t know everything, live no more than a century, and are at risk for disease and early death mean that we should fundamentally alter our nature?
  
  I think the best antidote to the present moment’s dangerous dance with AI isn’t mind uploading or transhumanism, but acceptance. We can accept that we are animals, that we will not live forever, and that any ultimate bliss or salvation won’t come via silicon. We can design policies that ensure these principles are always upheld.
SiebeRozendal Jan 10, 2024, 12:46 AM
7 points
0 ∶ 0

I just want to share that I think you did an excellent job explaining the arguments on the recent Politico Tech podcast, in a way that I think comes across as very grounded and reasonable, which makes me more optimistic that MIRI can make this shift. I also hope that you can nudge Eliezer more towards this style of communication, which I think would make his audience more receptive. (I thought the tone of the TIME piece didn’t seem professional enough). This seems especially important if Eliezer will also focus on communications and policy instead of research.
Guy Raveh Jan 5, 2024, 11:02 PM
7 points
3 ∶ 5

How does the choice to publish MIRI’s main views as LessWrong posts rather than, say, articles in peer-reviewed journals or more pieces in the media, square with the need to convince a much broader audience (including decision-makers in particular)?
- RobertM Jan 6, 2024, 2:25 AM
  18 points
  6 ∶ 2
  Parent
  
  There is no button you can press on demand to publish an article in either a peer-reviewed journal or a mainstream media outlet.
  Publishing pieces in the media (with minimal 3rd-party editing) is at least tractable on the scale of weeks, if you have a friendly journalist. The academic game is one to two orders of magnitude slower than that. If you want to communicate your views in real-time, you need to stick to platforms which allow that.
  I do think media comms is a complementary strategy to direct comms (which MIRI has been using, to some degree). But it’s difficult to escape the fact that information posted on LW, the EA forum, or Twitter (by certain accounts) makes its way down the grapevine to relevant decision-makers surprisingly often, given how little overhead is involved.
  - titotal Jan 6, 2024, 1:40 PM
    18 points
    7 ∶ 10
    Parent
    
    But it’s difficult to escape the fact that information posted on LW, the EA forum, or Twitter (by certain accounts) makes its way down the grapevine to relevant decision-makers surprisingly often, given how little overhead is involved.
    
    This isn’t necessarily a good thing, if the information being passed down is flawed or incorrect, due to the lack of rigor involved.
    The judges of quality for peer-reviewed papers are domain-level experts who contribute their relevant expertise. The judges of quality for blog posts are a collection of random people on the internet, few of whom have relevant expertise and who are often unable to distinguish between actual truth and convincing-sounding BS.
    The ideal situation would be to write peer-reviewed papers and then communicate their results on blogs, but this won’t be a good fit for a lot of things, given that some fields are not well established and some points are too small or obvious to be worth writing up academically.
  - Guy Raveh Jan 6, 2024, 1:08 PM
    12 points
    4 ∶ 0
    Parent
    
    
    Publishing pieces in the media (with minimal 3rd-party editing) is at least tractable on the scale of weeks, if you have a friendly journalist. The academic game is one to two orders of magnitude slower than that.
    
    Given that MIRI has held these views for decades, I don’t quite see how the timeline for academic publication is at issue here.
- Malo Feb 14, 2024, 10:59 PM
  1 point
  0 ∶ 0
  Parent
  
  We’ve also been doing media and we’re working on building capacity and gaining expertise to do more of it more effectively.
  Publishing research in more traditional venues is also something we’ve been chatting about internally.
Otto Jan 10, 2024, 1:39 PM
3 points
1 ∶ 1

Congratulations on a great prioritization!

Perhaps the research that we (Existential Risk Observatory) and others (e.g. @Nik Samoylov, @KoenSchoen) have done on effectively communicating AI xrisk, could be something to build on. Here’s our first paper and three blog posts (the second includes measurement of Eliezer’s TIME article effectiveness—its numbers are actually pretty good!). We’re currently working on a base rate public awareness update and further research.

Best of luck and we’d love to cooperate!
Prometheus Mar 2, 2024, 9:11 PM
1 point
0 ∶ 4

Judging from all the comments in agreement, from people who probably have no political power to actually implement these things, but who might have been useful toward actually solving the problem, this pivot is probably a net negative. You will probably fail at having much of a political influence, but succeed at dissuading people from doing technical research.
Hayven Frienby Jan 9, 2024, 12:11 PM
1 point
0 ∶ 1

I fully agree with the shift away from research and toward policy. With how close we are to what you termed smarter-than-human AI (also called AGI or ASI, but your term is much more precise, so I’ll use it going forward), research is not where efforts are best placed. We could be looking at a human extinction scenario (or an equally bad outcome, such as the permanent limiting of human potential) within 5-20 years. That’s an emergency situation as far as I’m concerned. Once the necessary laws and procedures are in place, research can continue.

I can’t speak for all EAs, but my ultimate goal is to see a world without smarter-than-human AI until humanity outgrows its tendencies to wage war and seek personal gain over the flourishing of all sentient beings. This would likely place ASI development somewhere between 100 years AP* and never, and probably closer to the “never” end of that timescale. This is something we have to accept—especially those of us with tech-loving tendencies.
I’m under no illusions that Silicon Valley would ever accept this, but in a democratic society they aren’t the ones calling the shots. A democratic government can ban agents / generative / smarter-than-human AI, and the actors I mentioned previously would simply have to accept it. We need the US, EU, Canada, Taiwan, and Japan to adopt MIRI guidelines on AI safety, security, and non-proliferation—and these conversations must begin at the local level.
If we are looking to shift the Overton window, we have to target our communications toward “ordinary people” and policymakers, not tech geeks and data wonks. This will be my top priority going forward, along with animal welfare activism.
*AP = after present
Evan_Gaensbauer Jan 5, 2024, 1:43 AM
−3 points
15 ∶ 11

I appreciate the pivot to a better-devised and merely pessimistic strategy on MIRI’s part, as opposed to a deceptively dignified and misrepresentative resignation to death.
- RobBensinger Jan 6, 2024, 4:33 AM
  16 points
  4 ∶ 2
  Parent
  
  Every aspect of that summary of how MIRI’s strategy has shifted seems misleading or inaccurate to me.
  - Evan_Gaensbauer Jan 8, 2024, 10:01 PM
    5 points
    2 ∶ 0
    Parent
    
    This was an acerbic and bitter comment I made as a reference to the fake MIRI strategy update in 2022 from Eliezer, the notorious “Dying with Dignity.” I’ve thought about this for a few days and I’m sorry I made that nasty comment.
    
    I was considering deleting or retracting it, though I’ve decided against that. The fact my comment has a significantly net negative karma score seems like punishment enough. Retracting the comment now probably wouldn’t change that anyway.
    
    I’ve decided against deleting or retracting this comment because its reception seems like a useful signal for MIRI to receive. At least as of the time I’m writing this reply, my original comment has received more agreement than disagreement. It’s valid for you, or whoever else at MIRI disagrees, to regard the perception I snarkily expressed as wrong or unserious. I expect it’s still worth MIRI being aware that almost as many people still distrust as trust MIRI as being sufficiently honest in its public communications.
    - Malo Feb 14, 2024, 11:11 PM
      3 points
      1 ∶ 1
      Parent
      
      I expect it’s still worth MIRI being aware that almost as many people still distrust as trust MIRI as being sufficiently honest in its public communications.
      FWIW, I found this last bit confusing. In my experience chatting with folk, regardless of how much they agree with or like MIRI, they usually think MIRI is quite candid and honest in its communication.
      (TBC, I do think the “Death with Dignity” post was needlessly confusing, but that’s not the same thing as dishonest.)
Bob Jan 12, 2024, 3:02 PM
−5 points
0 ∶ 0

Nonsensical cheems. Go for a walk. Have a drink. Lighten up.