Ought’s theory of change

Ought is an applied machine learning lab. In this post we summarize our work on Elicit and why we think it’s important.

We’d love to get feedback on how to make Elicit more useful to the EA community, and on our plans more generally.

This post is based on two recent LessWrong posts.

In short

Our mission is to automate and scale open-ended reasoning. To that end, we’re building Elicit, the AI research assistant.

Elicit’s architecture is based on supervising reasoning processes, not outcomes. This is better for supporting open-ended reasoning in the short run and better for alignment in the long run.

Over the last year, we built Elicit to support broad reviews of empirical literature. The literature review workflow runs on general-purpose infrastructure for executing compositional language model processes. Going forward, we’ll expand to deep literature reviews, then other research workflows, then general-purpose reasoning.

Our mission

Our mission is to automate and scale open-ended reasoning. If we can improve the world’s ability to reason, we’ll unlock positive impact across many domains including AI governance & alignment, psychological well-being, economic development, and climate change.

As AI advances, the raw cognitive capabilities of the world will increase. The goal of our work is to channel this growth toward good reasoning. We want AI to be more helpful for qualitative research, long-term forecasting, planning, and decision-making than for persuasion, keeping people engaged, and military robotics.

Good reasoning is as much about process as it is about outcomes. In fact, outcomes aren't even available when we're reasoning about the long term. So instead of training machine learning models end-to-end on outcome data, we're generally building Elicit compositionally, based on human reasoning processes.

The case for process-based ML systems

We can think about machine learning systems on a spectrum from process-based to outcome-based:

  • Process-based systems are built on human-understandable task decompositions, with direct supervision of reasoning steps. More

  • Outcome-based systems are built on end-to-end optimization, with supervision of final results. More
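
To make this distinction concrete, here is a minimal sketch in Python. It is purely illustrative: `lm` is a stub standing in for any language model call, and the decomposition and prompts are hypothetical assumptions, not Ought's actual code.

```python
def lm(prompt: str) -> str:
    """Stub standing in for a real language model API call (hypothetical)."""
    return f"<model output for: {prompt!r}>"

def outcome_based_answer(question: str) -> str:
    # One opaque end-to-end call: the system is evaluated (and trained)
    # only on how the final answer turns out.
    return lm(f"Answer: {question}")

def process_based_answer(question: str) -> str:
    # Human-understandable decomposition: each intermediate step is visible
    # and can be supervised directly, even when the final outcome is
    # unobservable (e.g. a long-range forecast).
    subquestions = lm(f"List subquestions of: {question}").split("\n")
    step_answers = [lm(f"Answer briefly: {q}") for q in subquestions]
    return lm("Combine these answers:\n" + "\n".join(step_answers))
```

In the process-based version, human feedback attaches to individual steps (did we pick good subquestions? is each step's answer reasonable?) rather than only to the end result.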

We think that process-based systems are better:

  1. In the short term, process-based ML systems have better differential capabilities: They help us apply ML to tasks where we don’t have access to outcomes. These tasks include long-range forecasting, policy decisions, and theoretical research. More

  2. In the long term, process-based ML systems help avoid catastrophic outcomes from systems gaming outcome measures and are thus more aligned. More

  3. Both process- and outcome-based evaluation are attractors to varying degrees: Once an architecture is entrenched, it’s hard to move away from it. This lock-in applies much more to outcome-based systems. More

  4. Whether the most powerful ML systems will primarily be process-based or outcome-based is up in the air. More

  5. So it’s crucial to push toward process-based training now.

Relative to the potential benefits, we think that process-based systems have gotten surprisingly little explicit attention in the AI alignment community.

How we think about success

We’re pursuing our mission by building Elicit, a process-based AI research assistant.

We succeed if:

  1. Elicit radically increases the amount of good reasoning in the world.

    1. For experts, Elicit pushes the frontier forward.

    2. For non-experts, Elicit makes good reasoning more affordable. People who don’t have the tools, expertise, time, or mental energy to make well-reasoned decisions on their own can do so with Elicit.

  2. Elicit is a scalable ML system based on human-understandable task decompositions, with supervision of process, not outcomes. This expands our collective understanding of safe AGI architectures.

Progress in 2021

We’ve made the following progress in 2021:

  1. We built Elicit to support researchers because high-quality research is a bottleneck to important progress and because researchers care about good reasoning processes. More

  2. We identified some building blocks of research (e.g. search, summarization, classification), operationalized them as language model tasks, and connected them in the Elicit literature review workflow. More

  3. On the infrastructure side, we built a streaming task execution engine for running compositions of language model tasks. This engine supports the literature review workflow in production (see the sketch after this list). More

  4. About 1,500 people use Elicit every month. More
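
As a rough illustration of how such an engine can compose primitive tasks, here is a hypothetical Python sketch in which search, summarization, and classification are chained as streaming stages. None of these names come from Elicit's actual codebase; `lm_task` is a stub for a real model call.

```python
import asyncio
from typing import AsyncIterator

async def lm_task(prompt: str) -> str:
    """Stub for a single language model call (hypothetical)."""
    await asyncio.sleep(0.1)  # stand-in for network latency
    return f"<output for: {prompt!r}>"

async def search(query: str) -> AsyncIterator[str]:
    # Emit candidate papers one at a time so later stages can start early.
    for i in range(3):
        yield await lm_task(f"find paper {i} for: {query}")

async def summarize(papers: AsyncIterator[str]) -> AsyncIterator[str]:
    async for paper in papers:
        yield await lm_task(f"summarize: {paper}")

async def classify(summaries: AsyncIterator[str]) -> AsyncIterator[str]:
    async for summary in summaries:
        yield await lm_task(f"classify relevance of: {summary}")

async def literature_review(query: str) -> None:
    # Compose the primitive tasks into a pipeline; each result streams to
    # the user as soon as it is ready instead of waiting for the whole run.
    async for result in classify(summarize(search(query))):
        print(result)

asyncio.run(literature_review("Does creatine improve cognition?"))
```

Streaming matters here because compositions of model calls are slow; surfacing partial results as each stage completes keeps the workflow responsive.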

Roadmap for 2022+

Our plans for 2022+:

  1. We expand literature review to digest the full text of papers, extract evidence, judge methodological robustness, and help researchers do deeper evaluations by decomposing questions like “What are the assumptions behind this experimental result?” More

  2. After literature review, we add other research workflows, e.g. evaluating project directions, decomposing research questions, and augmented reading. More

  3. To support these workflows, we refine the primitive tasks through verifier models and human feedback, and expand our infrastructure for running complex task pipelines, quickly adding new tasks, and efficiently gathering human data (see the sketch after this list). More

  4. Over time, Elicit becomes a general-purpose reasoning assistant, transforming any task involving evidence, arguments, plans and decisions. More
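
As a sketch of the verifier-model idea, here is hypothetical Python in which a second model scores several candidate outputs for a primitive task and the highest-scoring one is kept. The names and the scoring stub are illustrative assumptions, not Elicit's actual design.

```python
import random
from typing import List

def generate_candidates(prompt: str, n: int = 4) -> List[str]:
    """Stub: sample n candidate outputs from a generator model."""
    return [f"candidate {i} for: {prompt!r}" for i in range(n)]

def verifier_score(prompt: str, candidate: str) -> float:
    """Stub: a second model, trained on human judgments of this step,
    rates each candidate, so supervision targets the step itself rather
    than a downstream outcome."""
    return random.random()

def refined_task(prompt: str) -> str:
    # Generate several candidates and keep the one the verifier rates
    # highest; human feedback on these ratings improves the verifier.
    candidates = generate_candidates(prompt)
    return max(candidates, key=lambda c: verifier_score(prompt, c))

print(refined_task("Extract the sample size from this abstract."))
```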

We’re hiring for basically all roles: ML engineer, front-end, full-stack, operations, product design, even recruiting. Join our team!