Recently, I've been mulling over the question of whether it was a good idea or not to join a frontier AI company's safety team for the purposes of reducing extinction risk. One of my big cons was something like:
Jay, you think the incentives are less likely to affect you compared to most people. But most AI safety people who join frontier labs probably think this. You will be affected as well.
So I decided on a partial mitigation strategy, entirely as a precautionary principle and not at all because I thought I needed to. I committed to myself and to several people I'm close to that if I were to join a frontier lab safety team, I would donate 100% of the surplus that I would gain as a result of taking that job instead of a less lucrative job somewhere else.
At this time I was applying for a few jobs, one of which was at a frontier company. Approximately immediately, my System 1 became way less interested in that job. And I didn't even have an offer in hand for a specific amount of money. I don't have good reasons to care a lot about getting more money for myself. I have enough already, and I voluntarily live well below my means. This did not stop the effect from existing, and I didn't notice the effect before. I still don't notice the effect on my thinking in a vacuum. I only notice it by doing a mental side-by-side comparison.
I now think anyone who is considering joining a frontier company in order to reduce extinction risk should make this same commitment as a basic defensive measure against perverse incentives. I am sure there exist people who are entirely indifferent to money in this way; this is at least partially a skill issue on my part. But it does seem that "thinking you are indifferent to the money" is not a reliable signal that your thinking is unaltered by it.
This is also an opportunity to say that, if I ever do join a frontier safety team, I officially give you permission to ask me if I'm meeting this commitment of mine in conversation, even if other people are around.
For some people I think it is important to want more money, as they might have opportunities to use it quite effectively. I am mentioning this because I imagine some people might benefit from becoming more open to wanting a higher income. Sometimes people can be in a particularly good position to put more money to good use.
This advice is designed specifically for frontier AI companies. I definitely think there's nothing wrong with wanting to earn more in most situations. Usually I would advise against giving 100% of your money over some threshold, precisely because it removes the personal incentive to earn more money. This is generally bad, unless of course you happen to be in a position where money may incentivize you to do bad things. Then removing that incentive is a very good idea.
I would give this same advice if someone were, say, considering leaving an animal advocacy group to join a meat producer, earn 3x the money, and lobby from the inside for better practices. Maybe this is the best move for animals, maybe not, but you should be very suspicious about this reasoning when it stands to benefit you financially. In this exact case, wanting more money is highly counterproductive to doing good, because it distorts your thinking. In most cases, this is not true, or is true to a much smaller extent.
Basically, the pattern is "If you would gain more money from joining the powerful people doing bad things, that's a bad incentive structure and you should remove that incentive structure from yourself."