David Rhys Bernard
Research Fellow at Open Philanthropy. Former Quantitative Researcher on the Worldview Investigations Team at Rethink Priorities. Completed a PhD at the Paris School of Economics.
1. Thinking vs. reading.
Another benefit of thinking before reading is that it can help you develop your research skills. Noticing some phenomenon and then developing a model to explain it is a super valuable exercise. If it turns out you reproduce something that someone else has already done and published, then great: you’ve gained experience solving a problem and you’ve shown that you can think it through at least as well as some expert in the field. If it turns out that you have produced something novel, then it’s time to see how it compares to existing results in the literature and get feedback on how useful it is.
That said, I think this is more true for theoretical work than applied work, e.g. the value of doing this in philosophy > in theoretical economics > in applied economics. A fair amount of EA-relevant research is summarising and synthesising what the academic literature on some topic finds, and it seems pretty difficult to do that by just thinking to yourself!
3. Is there something interesting here?
I mostly try to work out how excited I am by this idea and whether I could see myself still being excited in 6 months, since for me having internal motivation to work on a project is pretty important. I also try to chat about this idea with various other people and see how excited they are by it.
4. Survival vs. exploratory mindset.
I also haven’t heard these terms before, but from your description (which frames a survival mindset pretty negatively), an exploratory mindset comes fairly naturally to me and therefore I haven’t ever actively cultivated it. Lots of research projects fail so extreme risk aversion in particular seems like it would be bad for researchers.
5. Optimal hours of work per day.
I typically aim for 6-7 hours of deep work a day and a couple of dedicated hours for miscellaneous tasks and meetings. Since starting part-time at RP I’ve been doing 6 days a week (2 RP, 4 PhD), but before that I did 5. I find RP deep work less taxing than PhD work. 6 days a week is at the upper limit of manageable for me at the moment, so I plan to experiment with different schedules in the new year.
6. Learning a new field.
I’m a big fan of textbooks and schedule time to read a couple of textbook chapters each week. LessWrong’s “best textbooks on every subject” thread is pretty good for finding them. I usually make Anki flashcards to help me remember the key facts, but I’ve recently started experimenting with Roam Research for note-taking, which I’m also enjoying, so my “learning flow” is in flux at the moment.
8. Emotional motivators.
My main trick for dealing with this is to always plan my day the night before. I let System 2 Dave work out what is important and needs to be done and put blocks in the calendar for these things. When System 1 Dave is working the next day, his motivation doesn’t end up mattering so much because he can easily defer to what System 2 Dave said he should do. I don’t read too much into a lack of System 1 motivation; it happens, and I haven’t noticed that it is particularly correlated with how important the work is. It’s more correlated with things like how scary it is to start some new task, and irrelevant things like how much sunlight I’ve been getting.
9. Typing speed.
I struggle to imagine typing speed being a binding constraint on research productivity since I’ve never found typing speed to be a problem for getting into flow, but when I just checked, my wpm was 85, so maybe I’d feel differently if it were slower. When I’m coding, the vast majority of my time is spent thinking about how to solve the problem I’m facing, not typing the code that solves the problem. When I’m writing first drafts, I think typing speed is a bit more helpful for the reasons you mention, but again more time goes into planning the structure of what I want to say and polishing than into the first pass at writing, where speed might help.
11. Tiredness, focus, etc.
My favourite thing to do is to stop working! Not all days can be good days and I became a lot happier and more productive when I stopped beating myself up for having bad days and allowed myself to take the rest of the afternoon off.
12. Meta.
I skipped the other questions because I didn’t have much to say about them, so I’d be happy to see others’ answers to them!
For example, David and Jason’s report on charter cities was completed in 100 hours, a reasonable fraction of which was extra legwork for external writeup/following up with affected parties, after the original report was delivered to Open Phil. My impression is that the bulk of the work was done on a fairly short calendar time cycle too, in ways that may be hard for external parties to replicate. But naively the report would still be useful to Open Phil and cost-effective to fund if it took 200 hours to complete and 3x the calendar time.
Just to clarify, the 100 hours was actually just for the original report and doesn’t include any of the extra legwork for the public version, because I forgot to update that estimate of the time taken in the public version. The extra work for the public version was an additional 10-15 hours of work from the two of us, but there was also work from others reviewing the report. This extra work took place over 5 weeks of calendar time.
Within 3 days of departing the UK to return to the US, take another COVID test. This is required by the US CDC according to this link, and both PCR and Rapid Antigen tests are acceptable. I am planning to walk into an NHS location near the EA conference venue (like this) and get a free test. You don’t have to be a UK citizen to get free tests from the NHS (link).
My understanding is that you should not be using the free NHS test for travel and should instead book a private test, which is possible across London and at airports on the day of your flight. See the travelling abroad section of this NHS page. More practically, I think you only get a text message confirming your result from the NHS tests and this is not sufficient documentation for the CDC requirements.
What information must be included on the test result? A test result must be in the form of written documentation (paper or electronic copy). The documentation must include:
1. Type of test (indicating it is a NAAT or antigen test)
2. Entity issuing the result (e.g. laboratory, healthcare entity, or telehealth service)
3. Specimen collection date. A negative test result must show the specimen was collected within the 3 days before the flight. A positive test result for documentation of recovery from COVID-19 must show the specimen was collected within the 3 months before the flight.
4. Information that identifies the person (full name plus at least one other identifier such as date of birth or passport number)
5. Test result
Thanks for the post, but I don’t think you can conclude from your analysis that your criteria weren’t helpful, and the result is not necessarily that surprising.
If you look at professional NBA basketball players, there’s not much of a correlation between how tall a basketball player is and how much they get paid or some other measure of how good they are. Does this mean NBA teams are making a mistake by choosing tall basketball players? Of course not!
The mistake your analysis is making is called ‘selecting on the dependent variable’ or ‘collider bias’. You are looking at the correlation between two variables (interview score and engagement) in a specific subpopulation, the subpopulation that scored highly in interview score. However, that specific subpopulation correlation may not be representative of the correlation between interview score and engagement in the broader relevant population i.e., all students who applied to the fellowship. This is related to David Moss’s comment on range restrictions.
The correlation in the population is the thing you care about, not the correlation in your subpopulation. You want to know whether the scores are helpful for selecting people into or out of the fellowship. For this, you need to know about engagement of people not in the fellowship as well as people in the fellowship.
This sort of thing comes up all the time, like in the basketball case. Another common example with a clear analogy to your case is grad school admissions. For admitted students, GRE scores are (usually) not predictive of success. Does that mean schools shouldn’t select students based on the GRE? Only if the relationship between success and GRE scores for admitted students is representative of the relationship for unadmitted students, which is unlikely to be the case.
The simplest thing you could do to improve this would be to measure engagement for all the people who applied (or whom you interviewed, if you only have scores for them) and then re-estimate the correlation on the full sample, rather than the selected subsample. This will provide a better answer to your question of whether scores are predictive of engagement. It seems like the things included in your engagement measure are pretty easy to observe, so this should be easy to do. However, a lot of them are explicitly linked to participation in the fellowship, which biases the measure towards fellows somewhat; if you could construct an alternative engagement measure which doesn’t include these, that would likely be better.
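To make the range-restriction point concrete, here’s a minimal simulation with invented numbers (not your data): interview score and engagement are genuinely correlated in the full applicant pool, but the correlation shrinks substantially once you condition on being admitted.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical data-generating process: a latent "promise" factor drives
# both interview score and later engagement, plus independent noise.
promise = rng.normal(size=n)
score = promise + rng.normal(size=n)
engagement = promise + rng.normal(size=n)

# Correlation in the full applicant pool.
full_corr = np.corrcoef(score, engagement)[0, 1]

# Correlation among only the admitted (top 20% of interview scores).
admitted = score > np.quantile(score, 0.8)
selected_corr = np.corrcoef(score[admitted], engagement[admitted])[0, 1]

print(f"full-sample correlation:     {full_corr:.2f}")   # around 0.5
print(f"selected-sample correlation: {selected_corr:.2f}")  # much weaker
```

Under this setup the true correlation is about 0.5, but among admitted students it falls to roughly half that, purely because of selection, not because the interview stopped measuring anything.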
Side note: a Cohen’s d of 0.31 is not small. My opinion is that the rules of thumb used to interpret effect sizes in psychology are messed up, because so much p-hacking in the past produced way overinflated effect sizes. Regardless, 0.3 is typically seen as a moderate effect size; a 0.3 standard deviation increase in IQ would be about 4.5 points, which would lead to economically meaningful differences in income.
Thanks Mark, both for your time and feedback while we were writing the report and your comments now.
On 1, I agree that charter cities sit somewhere between neartermist and longtermist so thinking about them as mid/mediumtermist makes sense. I imagine Rethink Priorities’ future work in this space will be a mixture of traditionally neartermist and mediumtermist topics. However, most of the current arguments for charter cities, especially Mason (2019), have an explicitly neartermist flavour, given the direct comparisons to GiveWell charities and a focus on the direct benefits. I’m keen to see robust medium/longtermist arguments for charter cities being made more explicitly.
On 2 & 3, there’s some tension between the claims that (1) Chinese growth is a result of SEZs, (2) the charter cities movement is trying to replicate the success of China, and (3) that SEZs are not the right comparison for charter cities.
To simplify the argument somewhat, we are taking the position that the more useful currently existing empirical analogue for charter cities is all SEZs, whereas your position is that it is Shenzhen. I totally accept your points about the important differences between SEZs and charter cities; however, I am still concerned that focusing solely on the Shenzhen SEZ is cherry-picking and gives an unrepresentative picture of how we might expect charter cities to perform. I think the ideal empirical analogue would be the subset of all SEZs that were large and had relatively high autonomy and multiple industries; however, we couldn’t find any analysis of the performance of this subset.
On 4, I think the report is clear about why we are currently skeptical of the tractability of charter cities despite recent history (although I recognise that you have inside knowledge that might cause us to update more positively). I’d also highlight that regardless of what you think of the absolute tractability of charter cities, it seems intuitive that the relative tractability is lower than alternatives such as special reform zones, which aim at delivering the same benefits as charter cities without having to set up and build a brand new city. That said, I’m happy you and CCI are still working on this and I would love for you to prove us wrong!
Thanks for this Luisa, I found it very interesting and appreciated the level of detail in the different cases. One thought and related questions that came up when reading the toy calculations at the end of each case:
For a fixed number of survivors, there is a trade-off between groups of different sizes. The larger the groups, the more likely each group is to survive, but the fewer groups need to be wiped out in order for humanity to go extinct.
What might this trade-off look like and is there some optimal group size to minimise the risk of extinction?
What are the game theoretic considerations of individuals forming groups of varying sizes and how do these vary depending on the extent to which people care about their own individual survival and human extinction?
What group sizes might we expect in practice and is there anything we could do to influence group sizes in the event of a catastrophe?
Given the low likelihood of extinction you suggest, I think these are relatively low priority questions but could be potentially interesting for someone to look at in more detail.
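Out of curiosity, the first question can be sketched numerically. This is a toy model with entirely made-up parameters (the survivor count and the logistic survival curve are assumptions, not claims about real survival probabilities), just to show the shape of the trade-off:

```python
import numpy as np

# Toy model: N survivors are split into groups of size s, giving
# g = N // s independent groups. Suppose a group survives the aftermath
# with logistic probability p(s), so small groups almost surely fail and
# large groups almost surely persist. Extinction requires every group to
# fail, so we compare log P(extinction) = g * log(1 - p(s)) across sizes.
N = 10_000            # total survivors (assumed)
mid, scale = 100, 25  # logistic parameters for group survival (assumed)

def log_extinction_prob(s: int) -> float:
    g = N // s
    # log(1 - p(s)) computed stably: 1 - p(s) = sigmoid(-(s - mid)/scale)
    log_fail = -np.logaddexp(0.0, (s - mid) / scale)
    return g * log_fail

sizes = np.arange(5, 2001, 5)
logps = np.array([log_extinction_prob(s) for s in sizes])
print("safest group size tried:  ", sizes[logps.argmin()])
print("riskiest group size tried:", sizes[logps.argmax()])
```

Interestingly, under this particular parametrisation intermediate group sizes are the worst, big enough to use up survivors but not big enough to be robust, and one can flip the conclusion entirely by changing the shape of p(s). So the optimal size seems to hinge on exactly how group survival scales with group size, which is itself an open empirical question.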
I’m happy to see an increase in the number of temporary visiting researcher positions at various EA orgs. I found my time visiting GPI during their Early Career Conference Programme very valuable (hint: applications for 2021 are now open, apply!) and would encourage other orgs to run similar sorts of programmes to this and FHI’s (summer) research scholars programme. I’m very excited to see how our internship program develops as I really enjoy mentoring.
I think I was competitive for the RP job because of my T-shaped skills, broad knowledge in lots of EA-related things but also specialised knowledge in a specific useful area, economics in my case. Michael Aird probably has the most to say about developing broad knowledge given how much EA content he has consumed in the last couple of years, but in general reading things on the Forum and actively discussing them with other people (perhaps in a reading group) seems to be the way to develop in this area. Developing specialised skills obviously depends a lot on the skill, but graduate education and relevant internships are the most obvious routes here.
Hi Michael, thanks for this.
On 1: Thorstad argues that if you want to hold both claims (1) Existential Risk Pessimism—per-century existential risk is very high, and (2) Astronomical Value Thesis—efforts to mitigate existential risk have astronomically high expected value, then TOP is the most plausible way to jointly hold both claims. He does look at two arguments for TOP—space settlement and an existential risk Kuznets curve—but says these aren’t strong enough to ground TOP and we instead need a version of TOP that appeals to AI. It’s fair to think of this piece as starting from that point, although the motivation for appealing to AI here was more due to this seeming to be the most compelling version of TOP to x-risk scholars.
On 2: I don’t think I’m an expert on TOP and was mostly aiming to summarise premises that seem to be common, hence the hedging. Broadly, I think you do only need the 4 claims that formed the main headings: (1) high levels of x-risk now, (2) significantly reduced levels of x-risk in the future, (3) a long and valuable / positive EV future, and (4) a moral framework that places a lot of weight on this future. I think the slimmed-down version of the argument focuses solely on AI as it’s relevant for (1), (2) and (3), but as I say in the piece, I think there are potentially other ways to ground TOP without appealing to AI and would be very keen to see those articulated and explored more.
(2) is the part where my credences feel most fragile, especially the parts about AI being sufficiently capable to drastically reduce both other x-risks and the risk from misaligned AI, and about AI remaining aligned near-indefinitely. It would be great to have a better sense of how difficult various x-risks are to solve and how powerful an AI system we might need to near-eliminate them. No unknown unknowns seems like the least plausible premise of the group, but its very nature makes it hard to know how to cash this out.
Innovations for Poverty Action just released their Best Bets: Emerging Opportunities for Impact at Scale report. It covers what they think are the best evidence-backed opportunities in global health and development. The opportunities are:
Small-quantity lipid-based nutrient supplements to reduce stunting
Mobile phone reminders for routine childhood immunization
Social signaling for routine childhood immunization
Cognitive behavioral therapy to reduce crime
Teacher coaching to improve student learning
Psychosocial stimulation and responsive care to promote early childhood development
Soft-skills training to boost business profits and sales
Consulting services to support small and medium-sized businesses
Empowerment and Livelihoods for Adolescents to promote girls’ agency and health
Becoming One: Couples’ counseling to reduce intimate partner violence
Edutainment to change attitudes and behavior
Digital payments to improve financial health
Childcare for women’s economic empowerment and child development
Payment for ecosystem services to reduce deforestation and protect the environment
Thanks for highlighting this Michael and spelling out the different possibilities. In particular, it seems like if aliens are present and would expand into the same space we would have expanded into had we not gone extinct, then for the totalist, to the extent that aliens have similar values to us, the value of x-risk mitigation is reduced. If we are replaceable by aliens, then it seems like not much is lost if we do go extinct, since the aliens would still produce the large valuable future that we would have otherwise produced.
I have to admit though, it is personally uncomfortable for my valuation of x-risk mitigation efforts and cause prioritisation to depend partially on something as abstract and unknowable as the existence of aliens.
I’m not convinced that our CEA is particularly useful for more generalised interventions. All we really do is assume that the intervention causes some growth increase (a distribution rather than a point estimate) and then model expected income with the intervention, with the intervention 10 years later, and with no intervention. The amount by which the intervention increases growth is the key parameter and is very uncertain, so further research on this will have the highest VoI, but this will be different for each intervention. We treat how the intervention increases growth as a black box, so I think looking inside the box and trying to understand the mechanisms better would shed some light on how robust the assumed growth increase is and how we might expect it to generalise to other contexts.
Furthermore, we only model the direct benefits of the growth intervention. In general, I’d expect the indirect effects to be larger, and our modelling approach doesn’t say anything about these, so I expect looking into these indirect benefits, perhaps via an alternative model, to have higher VoI than further modelling of the direct benefits.
For charter cities in particular, we could probably further tighten the bounds on the direct benefits by getting more rigorous information on city population growth rates and the correlation between population growth and income growth.
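For concreteness, the basic structure of this kind of CEA can be sketched in a few lines. This is a stripped-down illustration with invented parameter values (the growth-boost distribution, discount rate, and horizon are placeholders), not our actual model:

```python
import numpy as np

rng = rng = np.random.default_rng(1)

# Sketch of the comparison described above: an intervention adds an
# uncertain boost to the annual income growth rate. We compare discounted
# income over a horizon under three scenarios: intervention now,
# intervention delayed 10 years, and no intervention.
baseline_growth = 0.02                          # assumed counterfactual growth
boost = rng.normal(0.005, 0.003, size=10_000)   # uncertain growth increase
boost = np.clip(boost, 0, None)                 # assume no harmful draws
horizon, delay, discount = 50, 10, 0.03
y0 = 1.0                                        # initial income, normalised

def npv_income(growth_boost, start_year):
    """Discounted sum of income when the boost kicks in at start_year."""
    years = np.arange(horizon)
    extra = np.where(years >= start_year, growth_boost[:, None], 0.0)
    income = y0 * np.cumprod(1 + baseline_growth + extra, axis=1)
    return (income / (1 + discount) ** (years + 1)).sum(axis=1)

npv_now = npv_income(boost, 0)
npv_late = npv_income(boost, delay)
npv_none = npv_income(np.zeros_like(boost), 0)

print(f"mean benefit (now vs none): {np.mean(npv_now - npv_none):.3f}")
print(f"mean cost of 10y delay:     {np.mean(npv_now - npv_late):.3f}")
```

The key parameter, as noted above, is the boost distribution; everything else in the comparison follows mechanically once that assumption is made, which is why the VoI of pinning it down is so high.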
This paper was a chapter in the book Randomized Control Trials in the Field of Development: A Critical Perspective, a collection of articles on RCTs. Assuming the author of this chapter, Timothy Ogden, doesn’t identify as a randomista, the only other author who maybe does is Jonathan Morduch, so it’s a pretty one-sided book (which isn’t necessarily a problem, just something to be aware of).
There was a launch event for the book, moderated by William Easterly, with talks from Sir Angus Deaton, Agnès Labrousse, Jonathan Morduch and Lant Pritchett, which you might find interesting if you enjoyed this post.
Hey Kaj, I just thought I’d let you know that you’re not alone in Scandinavia! A few of us are starting an EA group in Uppsala, Sweden and Trondheim, Norway launched a couple of weeks ago. I know it’s late notice, but we’re having a Google Hangout this evening, 9pm your time so if you could join, that’d be great!
I was somewhat surprised by the lack of distinction between the cases where we go extinct and the universe is barren (value 0) and big negative futures filled with suffering. The difference between these cases seems large to me and seems like it will substantially affect the value of x-risk and s-risk mitigation. This is even more the case if you don’t subscribe to symmetric welfare ranges and think our capacity to suffer is vastly greater than our capacity to feel pleasure, which would make the worst possible futures way worse than the best possible futures are good. I suspect this is related to the popularity of the term ‘existential catastrophe’, which collapses any difference between these cases (as well as cases where we bumble along and produce some small positive value, but far from our best possible future).
Hi Edo!
Our funder was interested in How Asia Works, presumably from positive reviews it’s received from people like Bill Gates and Noah Smith, and asked us to check the land section in more detail. We had a comparative advantage here given my background in development economics.
I wouldn’t be particularly interested in more land redistribution research, given that there don’t seem to be any clear funding opportunities in this space. If someone could find decent opportunities then that would make it a bit more interesting. But given the ambiguous results on the relationship between farm size and yield, I imagine research on other unexplored development interventions would have higher value of information.
I would be interested to read a deep dive into tenure reform, but this is just my personal opinion. A bunch more work, both policy and academic, seems to have been done on tenure reform so there would probably be more literature and case studies to work with. We link a couple of systematic reviews (Gignoux et al. 2014 and Lawry et al. 2017) but didn’t look into them ourselves.
The JPAL and IPA Dataverses have data from 200+ RCTs from development economics, and the 3ie portal has 500+ studies with datasets available (and you can further filter by study type if you want to limit to RCTs). I can’t point you to particular studies that have missing or mismeasured covariates, but from personal experience, a lot of them have lots of missing data.
Thanks Vasco, I’m glad you enjoyed it! I corrected the typo and your points about inverse-variance weighting and lognormal distributions are well-taken.
I agree that doing more work to specify what our priors should be in this sort of situation is valuable, although I’m unsure if it rises to the level of a crucial consideration. Our ability to predict long-run effects has been an important crux for me, hence the work I’ve been doing on it, but in general it seems to be more of an important consideration for people who lean neartermist than for those who lean longtermist.
Yep, I agree you can generate the time of perils conclusion if AI risk is the only x-risk we face. I was attempting to empirically describe a view that seems to be popular in the x-risk space, namely that other x-risks besides AI are also cause for concern, but you’re right that we don’t necessarily need this full premise.
Section 2.2.2 of their report is titled “Choosing a fixed or random effects model”. They discuss the points you make and clearly say that they use a random effects model. In section 2.2.3 they discuss the standard measures of heterogeneity they use. Section 2.2.4 discusses the specific 4-level random effects model they use and how they did model selection.
I reviewed a small section of the report prior to publication but none of these sections, and it only took me 5 minutes now to check what they did. I’d like the EA Forum to have a higher bar (as Gregory’s parent comment exemplifies) before throwing around easily checkable suspicions about what (very basic) mistakes might have been made.