(How to independent study)
Stephen Casper (https://stephencasper.com/) was giving advice today on how to upskill in research, and suggested doing a “deep dive”.
Deep dive: read 40-50 papers in a specific research area you’re interested in going into (e.g. adversarial examples in deep NNs). Take notes on each paper. You’ll then have comparable knowledge to people working in the area. Finish with a synthesis project where you write something up (could be a lit review, could be more original than that).
He said he’d trade any class he’d ever taken for one of these deep dives, and they’re worth doing even if it takes like 4 months.
*cool idea
This sounds like a great idea and aligns with my growing belief that classes are, more often than not, far from the best way to learn.
I think classes are great given they’re targeting something you want to learn, and you’re not uncommonly self-motivated. They add a lot of structure and force engagement (e.g. homework, problem sets) in a way that’s hard to find time / energy for by yourself. You also get a fair amount of guidance and scaffolding information, plus information presented in a pedagogical order! That said, there’s a lot of variance due to the skill and time investment of the instructor, the size of the class, the quality of the curriculum, etc.
But if you DO happen to be very self-driven, know what you want to learn, and (in a research context) are the type of person who is capable of generating novel insights without much guidance, then heck yes classes are inefficient. Even if you’re not all of these things, it certainly seems worth trying to see if you can be, since self-learning is so accessible and one learns a lot by being focusedly confused. I like how neatly presented the above deep dives idea is: it gives me enough structure to have a handle on it and makes it feel unusually feasible to do.
But yeah, for the people who are best at deep dives, I imagine it’s hard for any class to match, even with how high-variance classes can be :).
Continual investment argument for why AGI will probably happen, absent major societal catastrophes, written informally, for my notes:
We’ve been working on AI since the ~1950s, in an era of history that feels normal to us but in fact develops technologies very very very fast compared to most of human existence. In 2012, the deep learning revolution of AI started with AlexNet and GPUs. Deep learning has made progress even faster than the current very fast rate of progress: 10 years later, we have unprecedented and unpredicted progress in large language model systems like GPT-3, which have unusual emergent capabilities (text generation, translation, coding, math) for systems trained to predict the next token of a language sequence. One can imagine that if we continue to pour in resources like training data and compute (as many companies are), continue to see algorithmic improvements at the rate we’ve seen, and continue to see hardware improvements (e.g. optical computing), then maybe humanity develops something like AGI or AI at very high levels of capabilities.
Even if we don’t see this progress with deep learning and need a paradigm shift, there’s still an immense amount of human investment being poured into AI in terms of talent, money from private investors + government + company profits, and resources. There’s international competition to develop AI fast, there are immense economic incentives to make AI products that make our lives ever more convenient along with other benefits, and some of the leading companies (DeepMind, OpenAI) are explicitly aiming at AGI. Given that we’ve only been working on AI since the 1950s, and the major recent progress has been in the last 10 years, and the pace of technological innovation seems very fast or accelerating with worldwide investment, it seems likely we will arrive at advanced AI someday, and that someday could be well within our lifetimes, barring major societal disruption.
(“AI can have bad consequences” as a motivation for AI safety --> Yes, but AI can have bad consequences in meaningfully different ways!)
Here’s some frame confusion that I see a lot, that I think leads to confused intuitions (especially when trying to reason about existential risk from advanced AI, as opposed to today’s systems):
1. There’s (weapons) -- tech like nuclear weapons or autonomous weapons that, if used as intended, involves people dying. (Tech like this exists)
2. There’s (misuse) -- tech that was intended to be anywhere from beneficial <> neutral <> seemingly high on the offense-defense balance, but that wasn’t designed for harm. Examples here include social media, identification systems, language models, surveillance systems. (Tech like this exists)
3. There’s (advanced AI pursuing instrumental incentives --> causing existential risk), which is not about misuse, it’s about the *system itself* being an optimizer and seeking power (humans are not the problem here, the AI itself is, once the AI is sufficiently advanced). (Tech like this does not exist)
You can say “AI is bad” for all of them, and they’re all problems, but they’re different problems and should be thought of separately. (1) is a problem (autonomous weapons are the AI version of it) but is pretty independent from (3). Technical AI safety discussion is mostly about the power-seeking agent issue (3). (2) is a problem all the time for all tech (though some tech lends itself more to this than others). They’re all going to need to get solved, but at least (1) and (2) are problems humanity has some experience with (and so we have at least some structures in place to deal with them, and people are aware these are problems).
Update on my post “Seeking social science students / collaborators interested in AI existential risks” from ~1.5 months ago:
I’ve been running a two-month “program” with eight of the students who reached out to me! We’ve come up with research questions from my original list, and the expectation is that individuals work 9h/week as volunteer research assistants. I’ve been meeting with each person / group for 30min per week to discuss progress. We’re halfway through this experiment, with a variety of projects and progress states—hopefully you’ll see at least one EA Forum post up from those students!
I was quite surprised by the interest that this post generated; ~30 people reached out to me, and a large number were willing to do volunteer research for no credit / pay. I ended up working with eight students, mostly based on their willingness to work with me on some of my short-listed projects. I was willing to have their projects drift significantly from my original list if the students were enthusiastic and the project felt decently aligned with long-term risks from AI, and that did occur. My goal here was to get some experience training students who had limited research experience, and I’ve been enjoying working with them.
I’m not sure how likely it is that I’ll continue working with students past this 2-month program, because it does take up a chunk of time (that’s made worse by trying to wrangle schedules), but I’m considering what to do for the future. If anyone’s interested in also mentoring students with an interest in long-term risks from AI, please let me know, since I think there’s interest! It’s a decently low time commitment (30 min per week per student or group of students) once you’ve got everything sorted. However, I am doing it for the benefit of the students, rather than with the expectation of getting help on my work, so it’s more of a volunteer role.
(quick reply to a private doc on interaction effects vs direct effects for existential risks / GCR. They’re arguing for more of a focus on interaction effects overall, I’m arguing for mostly work on direct effects. Keeping for my notes.)
In addition to direct effects from AI, bio, nuclear, climate...
...there are also mitigating / interaction effects, which could make these direct effects better or worse. For each of the possible direct risks, mitigating / interaction effects are more or less important.
For AI, the mainline direct risks that are possibly existential (whether or not the risks occur) are both really bad and also roughly independent of things that don’t reduce the risk of AGI being developed, and it’s possible to work directly on the direct risks (technical and governance). E.g., misinformation or disinformation won’t particularly matter if we get an unaligned AGI developed at one of the big companies, except to the extent that mis/disinformation contributed to the unaligned AGI getting developed (which I think is probably more influenced by other factors, but it’s debatable), so I think efforts should be focused on solving the technical alignment problem. Misinformation and disinformation are more relevant to the governance problem, but working on them still seems worse to me than a focus on specifically trying to make AI governance go well, given the goal of trying to reduce existential risk (and not solve other important problems associated with misinformation and disinformation). (I expect one will make more progress trying to directly reduce xrisk from AI than working on related things.)
(Bio has excellent interaction effects with AI in terms of risk though, so that’s going to be one of the best examples.)
Just to quickly go over my intuitions here about the interactions:
AI x bio <-- re: AI, I think AI direct effects are worse
AI x nuclear <-- these are pretty separate problems imo
AI x climate <-- could go either way; I expect AI could have substantial improvements on climate depending on how advanced our AI gets. AI doesn’t contribute that much to climate change compared to other factors I think
Bio x AI <-- bio is SO MUCH WORSE with AI, this is an important interaction effect
Bio x nuclear <-- these are pretty separate problems imo
Bio x climate <-- worse climate will make pandemics worse for sure
Nuclear x AI <-- separate problems
Nuclear x bio <-- separate problems
Nuclear x climate <-- if climate influences war then climate makes nuclear worse
Climate x AI <-- could go either way, I think, but it’s probably best to work directly on climate if you don’t think we’ll get advanced AI systems soon
Climate x nuclear <-- nuclear stuff certainly does mess up climate a LOT, but then we’re thinking more about nuclear
Climate x bio <-- pandemics don’t influence climate that much I think
----
Feedback from another EA (thank you!)
> I think there are more interaction effects than your shortform is implying, but also most lines of inquiry aren’t very productive. [Agree in the general direction, but object-level disagree]
I think this is true, and if presented with arguments I’d agree with them and have a fairer / more comprehensive picture.