I like this way of thinking about AI risk, though I would emphasize that much of my disagreement comes from my skepticism of crux 2 and, in turn, crux 3. If AI is far away, it seems quite difficult to predict how it will end up being used, and I think this remains an issue even when timelines are 20-30 years out. [ETA: Note also that during a period of rapid economic growth, much more intellectual progress might happen in a relatively short period of physical time, since computers could automate some parts of human intellectual labor. This implies that short physical timelines could lead us to underestimate how much conceptual progress happens before systems are superhuman.]
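To make that ETA concrete, here is a toy model I made up purely for illustration: if partial automation multiplies effective research effort each year, then a short calendar window can contain far more "research-years" than the same window does today.

```python
# Toy model (my own illustration, not from the discussion above): if partial
# automation grows effective research effort geometrically each year, a short
# calendar window contains many more "research-years" than it does today.
# All numbers are made up.

def effective_research_years(calendar_years: int, annual_growth: float) -> float:
    """Total effective research effort over a span, with effort growing each year."""
    total = 0.0
    effort = 1.0  # year-0 effort, normalized to one "research-year"
    for _ in range(calendar_years):
        total += effort
        effort *= 1 + annual_growth
    return total

# 20 calendar years with no acceleration vs. 30% yearly growth in effective effort.
print(effective_research_years(20, 0.0))   # 20 research-years
print(effective_research_years(20, 0.3))   # ~630 research-years
```

Under these invented numbers, the same 20 calendar years hold roughly 30 times as much conceptual progress, which is the sense in which a lot of conceptual distance could still be covered within short physical timelines.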
I have two intuitions that pull me in this direction.
The first is that if you asked someone from 10 years ago what AI would look like now, you’d mostly get answers that wouldn’t help us much in aligning our current systems. If you agree with me here but still think we know better now, then you need to believe that the conceptual distance between now and AGI is smaller than the conceptual distance between AI in 2010 and AI in 2020.
The second intuition is that safety engineering is usually very sensitive to small details of a system, details that are hard to access unless the design schematics are right in front of you.
Without those concrete details, the major approach within AI safety (as Buck explicitly advocates here) is to define a relaxed version of the problem that abstracts low-level details away. But if safety engineering mostly involves getting the little details right rather than the big ones, that approach might not be very fruitful.
I haven’t found any examples of real-world systems where extensive abstract reasoning done beforehand was essential to making them safe. Computer security is probably the main example where abstract mathematics seems to help, but my understanding is that the math probably could have been developed alongside the computers in question, and that when these systems are compromised, it is usually not because of some conceptual mistake.
I broadly agree with this, but I feel like this is mostly skepticism of crux 3 and not crux 2. I think to switch my position on crux 2 using only timeline arguments, you’d have to argue something like <10% chance of transformative AI in 50 years.
That makes sense. “Plausibly soonish” is pretty vague, so I pattern matched it to something closer to “by default, it will come within a few decades.”
It’s reasonable for people with different comparative advantages to have a higher threshold for caring. If there were only a 2% chance of transformative AI in 50 years and I were in charge of effective altruism resource allocation, I would still want some people (perhaps 20-30) looking into it.
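To spell out that allocation intuition as a toy expected-value sketch: every number and the diminishing-returns curve below are invented purely for illustration, chosen only to show how even a 2% probability can support a small dedicated group before opportunity costs dominate.

```python
import math

# Toy expected-value sketch of the allocation intuition above.
# Every number and the diminishing-returns curve are invented for illustration.

p_tai = 0.02               # assumed probability of transformative AI within 50 years
value_at_stake = 10_000    # arbitrary units of value if it does happen
cost_per_researcher = 1.0  # arbitrary opportunity cost per researcher, same units

def risk_reduction(n_researchers: int) -> float:
    """Fraction of the value secured by n researchers, with diminishing returns."""
    return 1 - math.exp(-n_researchers / 10)

def net_expected_value(n_researchers: int) -> float:
    """Expected benefit of funding n researchers minus their opportunity cost."""
    benefit = p_tai * value_at_stake * risk_reduction(n_researchers)
    return benefit - cost_per_researcher * n_researchers

for n in (0, 10, 25, 50, 100):
    print(n, round(net_expected_value(n), 1))
# Under these made-up numbers, the net value peaks around 25-30 researchers,
# which is roughly the scale suggested above.
```

Nothing here is a real cost-effectiveness estimate; it only illustrates why a low probability by itself doesn’t settle the allocation question.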