Do you have any thoughts on why there is not much engagement/participation in technical AI safety/alignment research by professional philosophers or people with philosophy PhDs? (I don’t know anyone except one philosophy PhD student who is directly active in this field, and Nick Bostrom who occasionally publishes something relevant.) Is it just that the few philosophers who are concerned about AI risk have more valuable things to do, like working on macrostrategy, AI policy, or trying to get more people to take ideas like existential risk and longtermism seriously? Have you ever thought about at what point it would start to make sense for the marginal philosopher (or the marginal philosopher-hour) to go into technical AI safety? Do you have a sense of why “philosophers concerned about AI risk” as a class hasn’t grown as quickly as one might have expected?
On a related note, I feel like encouraging EA people with a philosophy background to go into journalism or tech policy (as you did in the recent 80,000 Hours career review) is a big waste, since an advanced education in philosophy does not seem to create an obvious advantage in those fields, whereas there are important philosophical questions in AI alignment for which such a background would be more obviously helpful. Curious what your thinking is here.
It occurs to me that another reason for the lack of engagement by people with philosophy backgrounds may be that philosophers aren’t aware of the many philosophical problems in AI alignment that they could potentially contribute to. So here’s a list of philosophical problems that have come up just in my own thinking about AI alignment.
EDIT: Since the actual list is perhaps only of tangential interest here (and is taking up a lot of screen space that people have to scroll through), I’ve moved it to the AI Alignment Forum.
Wei’s list focused on ethics and decision theory, but I think that it would be most valuable to have more good conceptual analysis of the arguments for why AI safety matters, and particularly the role of concepts like “agency”, “intelligence”, and “goal-directed behaviour”. While it’d be easier to tackle these given some knowledge of machine learning, I don’t think that background is necessary—clarity of thought is probably the most important thing.
Hey Wei_Dai, thanks for this feedback! I agree that philosophers can be useful in alignment research by way of working on some of the philosophical questions you list in the linked post. Insofar as you’re talking about working on questions like those within academia, I think of that as covered by the suggestion to work on global priorities research. For instance, I know that working on some of those questions would be welcome at the Global Priorities Institute, and I think FHI would probably also welcome philosophers working on AI questions. But I agree that that isn’t clear from the article, and I’ve added a bit to clarify it.
But maybe the suggestion is to work on those questions outside academia. We mention DeepMind and OpenAI as having ethics divisions, but likely only some of the philosophical questions relevant to AI safety get addressed in those kinds of centers, and it could be worth listing more non-academic settings in which philosophers might be able to pursue alignment-relevant questions. There are, for instance, lots of AI ethics organizations, though most are focused only on short-term issues, and are more concerned with ‘implications’ than with philosophical questions that arise in the course of design. CHAI, AI Impacts, the Leverhulme Centre, and MIRI also each seem to do a bit of philosophy. The future Schwarzman Centre at Oxford may also be a good place for this once it gets going. I’ve edited the relevant sections to reflect this.
Do you know of any other projects or organizations that might be useful to mention? I also think your list of philosophy questions relevant to AI is useful (thanks for writing it up!) and would like to link to it in the article.
As for the comparison with journalism and AI policy, in line with what Will wrote below I was thinking of those as suggestions for people who are trying to get out of philosophy or who will be deciding not to go into it in the first place, i.e., for people who would be good at philosophy but who choose to do something else that takes advantage of their general strengths.
Thanks for making the changes. I think they address most of my concerns. However, I think splitting the AI safety organizations mentioned between academic and non-academic is suboptimal, because what seems most important is that someone who can contribute to AI safety goes to an organization that can use them, whether or not that organization belongs to a university. On a pragmatic level, I’m worried that someone will see one list of organizations where they can contribute to AI safety and not realize that there’s another list in a distant part of the article.
Do you know of any other projects or organizations that might be useful to mention?
Individual grants from various EA sources seem worth mentioning. I would also suggest mentioning FHI for AI safety research, not just global priorities research.
As for the comparison with journalism and AI policy, in line with what Will wrote below I was thinking of those as suggestions for people who are trying to get out of philosophy or who will be deciding not to go into it in the first place, i.e., for people who would be good at philosophy but who choose to do something else that takes advantage of their general strengths.
Ok, that wasn’t clear to me, as there’s nothing in the text that explicitly says those suggestions are for people who are trying to get out of philosophy. Instead the opening of that section says “If you want to leave academia”. I think you can address this as well as my “splitting” concern above by reorganizing the article into “careers inside philosophy” and “careers outside philosophy” instead of “careers inside academia” and “careers outside academia”. (But it’s just a suggestion as I’m sure you have other considerations for how to organize the article.)
Re: these being alternatives to philosophy, I see what you mean. But I think it’s ok to group together non-academic philosophy and non-philosophy alternatives since it’s a career review of philosophy academia. However, I take the point that I can better connect the two ‘alternatives’ sections in the article and have added a link.
As for individual grants, I’m hesitant to add that suggestion because I worry that it would encourage some people who aren’t able to get philosophy roles in academia or in other organizations to go the ‘independent’ route, and I think that will rarely be the right choice.
As for individual grants, I’m hesitant to add that suggestion because I worry that it would encourage some people who aren’t able to get philosophy roles in academia or in other organizations to go the ‘independent’ route, and I think that will rarely be the right choice.
I’m interested to hear why you think that. My own thinking is that a typical AI safety research organization may not currently be very willing to hire someone with mainly a philosophy background, so they may have to first prove their value by doing some independent AI safety research. After that they can either join a research org or continue down the ‘independent’ route if it seems suitable to them. Does this not seem like a good plan?
I don’t feel I have a great answer here. I think in part there just aren’t that many philosophers in the world, and most of them are already wrapped up in existing research projects. Of those that are EA-aligned, I think the field of global priorities research probably seems to them like a better fit for their skills than AI alignment. It also might be (this is a guess, based on maybe one or two very brief impressions) that philosophers in general aren’t that convinced of the value of the ‘agent foundations’ approach to AI safety, and feel that they’d need to spend a year getting to grips with machine learning before they could contribute to technical AI safety research.
Of your problems list, quite a number are more-or-less mainstream philosophical topics: standard debates in decision theory; infinite ethics; fair distribution of benefits; paternalism; metaphilosophy; nature of normativity. So philosophers are already working on those at least.
I really like your ‘metaethical policing’ bullet points, and wish there were more work from philosophers there.
On a related note, I feel like encouraging EA people with a philosophy background to go into journalism or tech policy (as you did in the recent 80,000 Hours career review)
Arden Koehler wrote that part of the post (and is the main author of that post), so I’ll leave that to her. But quite a number of people who leave philosophy do so because they no longer want to keep doing philosophy research, so it seems good to list other options outside of that.
Of your problems list, quite a number are more-or-less mainstream philosophical topics
Sure. To clarify, I think it would be helpful for philosophers to think about those problems specifically in the context of AI alignment. For example, many mainstream decision theorists seem to think mostly in terms of which decision theory best fits our intuitions about how humans should make decisions, whereas for AI alignment it’s likely more productive to think about what would actually happen if an AI were to follow a certain decision theory, and whether we would prefer that to what would happen if it were to follow a different decision theory. Another thing that would be really helpful is to act as a bridge from mainstream philosophy research to AI alignment research, e.g., pointing out relevant results from mainstream philosophy when appropriate.
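To make that concrete, here’s a minimal toy sketch (the Newcomb-style setup, payoff amounts, and predictor accuracy are all arbitrary assumptions chosen just for illustration) that compares two candidate decision rules by simulating what agents following each rule would actually end up with:

```python
# Toy sketch (illustrative only): compare two candidate decision rules in a
# Newcomb-style problem by simulating the outcomes that agents following
# them receive. The payoffs and predictor accuracy are made-up parameters.
import random

def newcomb_payoff(one_boxes: bool, predictor_accuracy: float = 0.9) -> int:
    """Payoff for one run: a predictor guesses the agent's choice with the
    given accuracy and fills the opaque box only if it predicts one-boxing."""
    prediction_correct = random.random() < predictor_accuracy
    predicted_one_box = one_boxes if prediction_correct else not one_boxes
    opaque_box = 1_000_000 if predicted_one_box else 0
    transparent_box = 1_000
    return opaque_box if one_boxes else opaque_box + transparent_box

def average_payoff(one_boxes: bool, trials: int = 100_000) -> float:
    """Average over many runs: what actually happens to agents using this rule."""
    return sum(newcomb_payoff(one_boxes) for _ in range(trials)) / trials

if __name__ == "__main__":
    print("always one-box:", average_payoff(one_boxes=True))   # ~900,000
    print("always two-box:", average_payoff(one_boxes=False))  # ~101,000
```

The real questions are of course much harder than this toy case, but the framing is the same: evaluate a decision theory by the outcomes an agent following it would get, rather than by how well it matches intuitions about human decision-making.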
Arden Koehler wrote that part of the post (and is the main author of that post), so I’ll leave that to her. But quite a number of people who leave philosophy do so because they no longer want to keep doing philosophy research, so it seems good to list other options outside of that.
Ah ok. Any chance you could discuss this issue with her and perhaps suggest adding working on technical AI safety as an option that EA-aligned philosophers or people with philosophy backgrounds should strongly consider? (One EA person with a philosophy PhD already contacted me privately to say that they didn’t realize that their background might be helpful for AI alignment and to ask for more details on how they can help, so it seems like raising awareness here is at least part of the solution.)
I think it would be helpful for philosophers to think about those problems specifically in the context of AI alignment.
That makes sense; agree there’s lots of work to do there.
Any chance you could discuss this issue with her and perhaps suggest adding working on technical AI safety as an option that EA-aligned philosophers or people with philosophy backgrounds should strongly consider?
Have sent an email! :)