Peer review in a debate framework. Two referees will write evaluations: one focusing on the possible negatives, costs, and problems of the proposal, and the other on the benefits. Both referees will also suggest what kind of resources a team attempting the project should have. (2-5h / analyst / project)
I’ve thought about this stage a bunch since Jan shared the original draft with me, and I don’t expect a system like this to provide useful evaluations.
When I have had good conversations about potential projects with people in EA, it’s very rarely the case that people’s assessments could be easily summarized by two lists of the form “possible negative consequences” and “possible benefits”. In practice, I think people have models that output a net-positive or a net-negative impact depending on certain facts they are uncertain about, and understanding those cruxes and uncertainties is the key thing in understanding whether a project is worth working on. I tried for 15 minutes to write a list of just the negative consequences I expect a project to have, and failed, because most of the potential negative consequences are entwined with the potential benefits in a way that doesn’t make it easy to focus only on the negatives or only on the positives.
This is true, but “possible negative consequences” and “possible benefits” are good brainstorming prompts. Especially if someone has a bias towards one or the other, telling them to use both prompts can help even things out.
I agree. I think having reviewers that use those as prompts for their evaluations and their feedback seems like a decent choice. But I wouldn’t want to force one person to only use that one prompt.
It is very easy to replace this stage with e.g. just two reviews.
Some of the arguments for the contradictory version:
- the point of this stage is not to produce an EV estimate, but to map the space of costs, benefits, and considerations
- it is easier to be biased in a defined way than unbiased
- it removes part of the problem with social incentives

Some arguments against it are:
- such adversarial setups for truth-seeking are uncommon outside of the judicial process
- it may contribute to unnecessary polarization
- the splitting may feel unnatural
Based on Habryka’s point, what if “stage 1b” allowed the two reviewers to come to their own conclusions according to their own biases, and then, at the end, each reviewer were asked to give an initial impression as to whether the project is fund-worthy (I suppose this means its EV is equal to or greater than that of a typical GiveWell charity) or not (EV may be positive, but not high enough)?
This impression doesn’t need to be published to anyone if, as you say, the point of this stage is not to produce an EV estimate. But whenever both reviewers come to the same conclusion (whether positive or not), a third reviewer is asked to review it too, to potentially point out details the first reviewers missed.
Now, if all three reviewers give a thumbs down, I’m inclined to think … the applicant should be notified and advised to go back to the drawing board? If it’s just two, well, maybe that’s okay; maybe the EV will turn out to be decidedly good upon closer analysis.
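To make the escalation rule suggested above concrete, here is a minimal sketch in Python, assuming each reviewer’s impression is reduced to a simple thumbs-up/thumbs-down; the function name `next_step` and the returned labels are hypothetical illustrations, not part of the proposal.

```python
from typing import Optional


def next_step(first: bool, second: bool, third: Optional[bool] = None) -> str:
    """Map thumbs-up (True) / thumbs-down (False) impressions to a suggested next step."""
    if first != second:
        # The two initial reviewers disagree, so no escalation; the proposal
        # simply moves on to the usual closer analysis of expected value.
        return "proceed to closer analysis"
    if third is None:
        # The two initial reviewers agree (positively or negatively), so a
        # third reviewer is asked to look for details the first two missed.
        return "request a third review"
    if not (first or second or third):
        # All three reviewers give a thumbs down.
        return "notify applicant: back to the drawing board"
    # Anything else (including two thumbs down with one dissent): the EV may
    # still turn out decidedly good upon closer analysis.
    return "proceed to closer analysis"


# Hypothetical usage:
print(next_step(False, False))         # request a third review
print(next_step(False, False, False))  # notify applicant: back to the drawing board
```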
I think reviewers need to be able (and encouraged) to ask questions of the applicant, as applications are likely to have some points that are fuzzy or hard to understand. It isn’t just that some proposals are written by people with poor communication skills; I think this will be a particular problem with ambitious projects whose vision is hard to articulate. Perhaps the Q&As can be appended to the application when it becomes public? But personally, as an applicant, I would be very interested in editing the original proposal to clarify points at the location where they are first made.
And perhaps proposals will need to be rate-limited to discourage certain individuals from wasting too much reviewer time?