Hey Lukas!
If the concrete problems are too watered down compared to the real thing, you also won't solve AI alignment by misleading people into thinking it's easier.
Note that even MIRI sometimes does this:
We could not yet create a beneficial AI system even via brute force.
Imagine you have a Jupiter-sized computer and a very simple goal: Make the universe contain as much diamond as possible. The computer has access to the internet and a number of robotic factories and laboratories, and by "diamond" we mean carbon atoms covalently bound to four other carbon atoms. (Pretend we don't care how it makes the diamond, or what it has to take apart in order to get the carbon; the goal is to study a simplified problem.) Let's say that the Jupiter-sized computer is running python. How would you program it to produce lots and lots of diamond?
As it stands, we do not yet know how to program a computer to achieve a goal such as that one.
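(To make the "even via brute force" point vivid, here is a minimal Python sketch. Everything in it is a hypothetical stand-in of my own rather than anything from MIRI: the action set, the simulate function, and the count_bound_carbon scorer are made up. The brute-force search loop is trivial to write; the function that looks at a predicted world state and says how much diamond it contains is the part nobody knows how to fill in.)

```python
import itertools

# Hypothetical, drastically simplified stand-ins: a tiny action set and a
# "perfect simulator" that just records what was done.
ACTIONS = ["move", "mine", "synthesize", "wait"]

def simulate(action_sequence):
    """Pretend the Jupiter-sized computer can perfectly predict the world
    state resulting from any action sequence. Even granting that, the hard
    part is still ahead of us."""
    return {"actions_taken": list(action_sequence)}

def count_bound_carbon(world_state):
    """The unsolved piece: score a predicted world state (in whatever
    internal ontology the model uses) by how many carbon atoms are
    covalently bound to four other carbon atoms. Nobody currently knows
    how to write this function robustly."""
    raise NotImplementedError("we don't know how to evaluate 'diamond' here")

def best_plan(horizon):
    """Brute force: enumerate every action sequence of the given length and
    keep the one whose predicted outcome contains the most diamond."""
    return max(
        itertools.product(ACTIONS, repeat=horizon),
        key=lambda seq: count_bound_carbon(simulate(seq)),
    )

if __name__ == "__main__":
    try:
        print(best_plan(horizon=3))
    except NotImplementedError as err:
        print("Brute force stalls at the evaluation step:", err)
```

Running it just hits the NotImplementedError, which is the point: the search is easy, the evaluation is not.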
It would be fair to say that the diamond example comes from an exposition of the importance of AI Safety rather than from a research proposal itself. But in any case, humans always solve complicated problems by breaking them up, because otherwise they are terribly hard. Of course, there is a risk that we oversimplify the problem, but researchers generally know where to stop.
Perhaps you were focusing more on vaguely related things such as fairness, but I'm arguing more for making the real AI Safety problems concrete enough that they will tackle them. And that's the challenge: knowing where to stop simplifying. :)
some original-thinking genius reasoners can produce useful shovel-ready research questions for not-so-original-thinking academics
Don't discount the originality of academics; they can also be quite cool :)
I think the best judges are the people who are already doing work that the alignment community deems valuable.
I agree!
If EAs who have specialized on this for years are so vastly confused about it, academia will be even more confused.
Yeah, I think this is right. That's why I wanted to pose this as concrete subproblems, so that academics do not feel the confusion we still have around it :)
Independently of the above argument that we're in trouble if we can't even recognize talent, I also feel pretty convinced that we can on first-order grounds. It seems pretty obvious to me that work tests or interviews conducted by community experts do an okay job at recognizing talent.
Yeah, I agree. But also notice that Holden Karnofsky believes that academic research has a lot of aptitude overlap with AI Safety research, and that an academic research track record is the highest-fidelity signal of whether you'll do well in AI Safety research. So perhaps we should not discount it entirely.
Thanks!
It sounds like our views are close!
but I'm arguing more for making the real AI Safety problems concrete enough that they will tackle them.
I agree that this would be immensely valuable if it works. Therefore, I think it's important to try it. I suspect it likely won't succeed because it's hard to usefully simplify problems in a pre-paradigmatic field. I feel like if you can do that, maybe you've already solved the hardest part of the problem.
(I think most of my intuitions about the difficulty of usefully simplifying AI alignment relate to it being a pre-paradigmatic field. However, maybe the necessity of "security mindset" for alignment also plays into it.)
In my view, progress in pre-paradigmatic fields often comes from a single individual or a tight-knit group with high-bandwidth internal communication. It doesn't come from lots of people working on a list of simplified problems.
(But maybe the picture I'm painting is too black-and-white. I agree that there's some use to getting input from a broader set of people, and occasionally someone who isn't usually very creative can have a great insight, etc.)
Don't discount the originality of academics; they can also be quite cool :)
That's true. What I said sounded like a blanket dismissal of original thinking in academia, but that's not how I meant it. Basically, my picture of the situation is as follows:
Few people are capable of making major breakthroughs in pre-paradigmatic fields, because that requires a rare kind of creativity and originality (and probably also being a genius). There are people like that in academia, but they have their quirks, and they'd mostly already be working on AI alignment if they had the relevant background. The sort of people I'm thinking of are drawn to problems like AI risk or AI alignment. They likely wouldn't need things to be simplified: if they look at a simplified problem, their mind immediately jumps to all the implications of the general principle, and they think through the more advanced version of the problem because that's way more interesting and way more relevant.
In any case, there are a bunch of people like that in long-termist EA, because EA heavily selects for this sort of thinking. People from academia who excel at this sort of thinking often end up at EA-aligned organizations.
So, who is left in academia and isn't usefully contributing to alignment but could maybe contribute to it if we knew what we wanted from them? Those are the people who don't invent entire fields on their own.