You mention
several times. How critical do you think it is to have quality at our standards or higher? One reason I’m suspicious of this is that RP chose a particular standard and we happen to be an existing example of something that works, so naively it’d be quite surprising if we hit exactly the right point on the quality vs. quantity/scalability tradeoff, such that anything worse than us is ~useless.
Another reason I’m suss is that there are quality differences within RP’s work. For example, Jason’s work on invertebrate sentience is considerably higher quality than some of the (nonpublic) projects I did, which are (I hope) still quite useful to the relevant funders.
To decompose this a little, there are several dimensions along which I think quality can be sacrificed while the work remains useful to EA orgs, the most obvious of which is time. Several projects you mentioned were done on what appear to be very short timelines (both calendar time and clock time), which makes it hard for other people to replicate RP’s performance, either because they’re more junior or otherwise weaker researchers, or because they have more external commitments.
For example, David and Jason’s report on charter cities was completed in 100 hours, a reasonable fraction of which was extra legwork for the external writeup and following up with affected parties after the original report was delivered to Open Phil. My impression is that the bulk of the work was also done on a fairly short calendar-time cycle, in ways that may be hard for external parties to replicate. But naively, the report would still have been useful to Open Phil and cost-effective to fund if it had taken 200 hours and 3x the calendar time.
Other dimensions along which I can imagine orgs accepting work at a lower bar than RP’s while still finding it useful: EA alignment, reasoning transparency, accuracy, thoroughness, formatting, etc. Of course, some of these dimensions matter more than others, and they are not uncorrelated.
Other Rethink Priorities clients (including at Open Phil) might disagree, but my hunch is that if anything, higher quality and lower quantity is the way to go, because a client like me has less context on consultants doing some project than I do on someone I’ve directly managed (internally) on research projects for 2 years. So e.g. Holden vetted my Open Phil work pretty closely for 2 years and now feels less need to do so, because he has a sense of what my strengths and weaknesses are, where he can just defer to me, where he should make sure to develop his own opinion, etc. That’s part of the massive cost of hiring, training, and managing internal talent, but it eventually gets you to a place where you don’t need to be so nervous about major crippling flaws (of some kinds) in someone’s work. A major purpose of outsourcing analysis work, though, is to get the information you need without first building up months or years of deep context with the analysts. So how can I trust the work of someone I have so little context with? I think “go kinda overboard on legibility / reasoning transparency” and “go kinda overboard on quality / thoroughness / vetting” are two major strategies, especially when the client is far more time-constrained than funding-constrained (as Open Phil is).
In this case, do you think RP should focus more on quality and less on quantity as we scale, satisficing on quantity and optimizing for research quality? Concretely, this might mean being very slow to add researchers and primarily using them as additional quality checks on existing work, rather than trying to produce more novel output. This is very much not the way we currently plan to scale, which is closer to maintaining research quality while trying to increase quantity/output.
(Reiterating that all impressions here are my own.)
I don’t feel strongly. You all have more context than I do on what seems feasible here. My hunch is in favor of RP maintaining current quality (or raising it only a tiny bit) and scaling quickly for a while — I mostly wanted to give some counterpoints to your suggestion that maybe RP should lower its quality to get more quantity.
Another reading is that maybe RP is leaving a bunch of gains on the table by not trying to be higher quality.
I think right now (as you know), while we’d like to have higher quality (and we expect to improve somewhat naturally as we gain experience, both as individual researchers and in research management), we’re prioritizing organizational resources more toward scalability/output than quality.
I’m also interested in whether this is mistaken.
Just to clarify: the 100 hours was actually just for the original report and doesn’t include any of the extra legwork for the public version; I forgot to update the time estimate in the public version. The public version took an additional 10-15 hours of work from the two of us, plus review work from others, and this extra work took place over 5 weeks of calendar time.
Thanks for the clarification!