Hello Tom, thanks very much for this write-up. Three comments:
I very much admire your ability to self-criticise, but I think you’re being overly harsh on yourself. It didn’t turn out as well as you hoped, but you couldn’t have known that in advance, which was the point. I think this is a good example of what is sometimes called ‘hits-based charity’: EAs trying new things with a high expected value but a low probability of success. I also hesitate to call this a failure because, as you noted, quite a few lessons were learnt. I think your (only?) substantial mistake was in having too high expectations about what a part-time student group could achieve. Perhaps you took “EAs”, who are typically smart, conscientious and driven, as your reference group, rather than “student club/society”, which no one really expects to be very productive or world-changing.
On reflection, I wonder if OxPrio fell into a sort of research no-man’s land. It was too detailed for average student EAs to engage with, but perhaps not in-depth enough to attract critical commentary and engagement from full-time researchers, such as those at CEA or GiveWell, whose research you were, to some extent, replicating. I’m not sure who you thought the target audience of your research was.
I think a contributing factor to the lack of local, Oxford university engagement is that you’d selected a team. Presumably the people who would be most interested in OxPrio’s research applied. I imagine many of the people who applied but were rejected from the team then decided, as a standard psychological reflex, that they didn’t want to be involved further (disclaimer: I applied and was rejected, but ended up being really curious about what OxPrio was doing anyway). Hence the selection process alienated much of your intended audience. I don’t have a suggestion for what would have been better; I just think this is worth factoring in.
I will echo the conclusion of this, in that OxPrio was likely a counterfactually net-positive way to spend your time. Actually running a real team project with a deadline and things depending on you, learning basic management, and realising the difference between how you expect a group of people to behave and how they actually behave, are rare life lessons that many people don’t learn, or at least not until they’re much older.
Instead, participants strongly preferred to continue researching the area they already knew and cared most about, even as other participants were doing the same thing with a different area.
This is one of the things I fear is most likely to fundamentally undermine EA in the long term: people prefer to discuss and associate with people who share their assumptions, concrete concerns and detailed cause-specific knowledge, so EA functionally splits into three or more movements that never speak with each other and don’t understand each other’s arguments, and cause neutrality essentially stops being a thing. Notably, I think this has already happened to a significant extent.
Could public debates be helpful for this?

Debates sound useful, although it would be great to think of something functionally similar but without the oppositional/competitive aspect of debate. I think a lot of EAs would benefit from debates, but for some they would probably increase their cause partisanship and easy dismissal of other causes.
Some other things which could be useful:

Involving EAs in structured giving games with a deliberative democratic component, where they have to evaluate different causes. This could be structured something like Tom’s project here, though it would have to avoid the relativism/disinclination to think about or challenge other people’s causes noted in (7.2).

Red teaming causes (https://en.wikipedia.org/wiki/Red_team).

Involving non-supporters of a particular cause in evaluating and selecting interventions within that cause. At the moment, the people evaluating (interventions within) causes tend to be supporters of those causes. This naturally encourages wild one-sided over-optimism about your preferred cause and a lack of interest beyond it.
Thanks for this, detailed post-mortems like this are very valuable!
Some thoughts:
I considered getting involved in the project, but was somewhat put off by the messaging. Somehow it came across as a “learning exercise for students” rather than an “attempt to do genuinely new research”. I’m not sure exactly why that was (the grant size may have been part of it; see below), and I now regret not getting more involved.
You describe the grant amount of £10,000 as “substantial”. This is surprising to me, since my reaction to the grant size was that it was too small to bother with. I think this corroborates your thoughts about grant size: any size of grant would have had most of the beneficial effects that you saw, but a much larger grant would have been needed to make it seem really “serious”.
I think that the project goal was too ambitious. Global prioritization is much harder than more restricted prioritization, but also vaguer and more abstract. Usually when we’re learning to deal with vague and abstract problems we start out by becoming very adept with simple, concrete versions to build skills and intuitions before moving up the abstraction hierarchy (easier, better feedback, more motivating, etc.). If I wanted to train up some prioritization researchers I would probably start by getting them to just do lots of small, concrete prioritization tasks.
As Michael Plant says below, I think the project was in a bit of an awkward middle ground. The costs of participation (in terms of work and “top-of-mind” time) were perhaps a bit too high for both students and otherwise-busy community members (like myself), and the perceived benefits (in terms of expected quality of research produced) were perhaps too low for the professionals. (To elaborate on why engaging felt like it would be substantial work for me: in order to provide good commentary on one of your posts, I would have had to read the post, probably read some prior posts, think hard about it, possibly do some research myself, and condense that into a thoughtful reply. That could easily take up an evening of my time, for not a huge perceived reward.) I think your suggestion of running such a project as a week-long retreat is a good one: it would get a committed block of time from people, and prevent inefficiencies due to repeated time spent “re-loading” the background information.
Agree that quantitative modelling is great and under-utilised. I think a course which was more or less How To Measure Anything applied to EA with modern techniques and technologies would be a fantastic starter for prioritization research, and give people generally useful skills too.
I would have preferred less, higher-quality output from the project. My reaction to the first few blog posts was that they were fine but not terribly interesting, which meant I largely didn’t read much of the rest of the content until the models started appearing, which I did find interesting.
Even if you think the project was net-negative, I hope this doesn’t put you off starting new things. Exploration is very valuable, even if the median case is a failure.
I think a course which was more or less How To Measure Anything applied to EA with modern techniques and technologies would be a fantastic starter for prioritization research, and give people generally useful skills too.
Just want to strongly agree with this. Those are real figure-out-how-the-world-works skills. If anyone wants an overview, Luke Muehlhauser did an in-depth summary here.
Even if you think the project was net-negative, I hope this doesn’t put you off starting new things. Exploration is very valuable, even if the median case is a failure.
Further agreement. Seeing this failed project report is one of the few signs to me that EA is actually trying. I have a vague recollection of Charity Science doing a failed project report too.
A further thought. If something like OxPrio were to run again, I think it should specify, from the outset, that it won’t give money to any charity already considered mainstream by the EA world (I admit this is vague and would need some tightening). My thinking is: (1) this invites participants to do new research, rather than just replicate that done by others, which is both useful for the EA world and more interesting for the participants and the audience. I was hoping OxPrio would come up with something more radical than giving money to 80K. (2) It encourages people to question the received wisdom about what the best charities/orgs are, without risking looking stupid for choosing something different.
EA Berkeley seemed more positive about their student-led EA class, calling it “very successful”, but we believe it was many times less ambitious
Yeah, that’s accurate. I doubt that any of our students are more likely to go into prioritization research as a result of the class. I could name a few people who might change their career as a result of the class, but that would also be a pretty low number, and for each individual person I’d put the probability at less than 50%. “Very successful” here means that a large fraction of the students were convinced of EA ideas and were taking actions in support of them (such as taking the GWWC pledge, and going veg*n). It certainly seems a lot harder to cause career changes, without explicitly selecting for people who want to change their career (as in an 80K workshop).
We implicitly predicted that other team members would also be more motivated by the ambitious nature of the Project, but this turned out not to be the case. If anything, motivation increased after we shifted to less ambitious goals.
We observed the same thing. In the first iteration of EA Berkeley’s class, there was some large amount of money (probably ~$5000) that was allocated for the final project, and students were asked to propose projects that they could run with that money. This was in some sense even more ambitious than OxPrio, since donating it to a charity was a baseline—students were encouraged to think of more out-of-the-box ideas as well. What ended up happening was that the project was too open-ended for students to really make progress on, and while people proposed projects because it was required to pass the course, they didn’t actually get implemented, and we used the $5000 to fund costs for EA Berkeley in future semesters.

Sounds more like increased credence to me. People allowed their charity preferences to move, but stuck to their pet causes and pet research areas...
Regarding “direct benefits for people who would learn from reading” our research: this is very difficult to evaluate, but our tentative feeling was that this was lower than we expected. We received less direct engagement with our research on the EA forum than we expected, and we believe few people read our models. Indirectly, the models were referenced in some newsletters (for example MIRI’s). However, since our writings will remain online, there may be a small but long-lasting trickle of benefits into the future, from people coming across our models.
Facetious interpretation: “Effective Altruism found to have a weak culture of intellectual discourse. Conclusion: Deprioritize intellectual discourse.”
The Project created some benefits, but (with low confidence) I don’t think the costs were worth it.
I think it’s worth separating the outcome from the expected value at the time the project was begun.
It can still have been a +EV decision to have done it based on the information you had at the time.
i.e. it can’t be evaluated as not worth it simply because it didn’t turn out to be worth it; it can have been worth doing because the expected value at the time made it worth it, even if it ended up not being net-positive.