Focused on impact evaluation, economics, and (lately) animal welfare
geoffrey
Any tips for running discussion groups on the WIT sequences? I’m vaguely interested in doing one on the CURVE sequence (which I’ve read deeply) or the CRAFT sequence (which I have only skimmed). However, the technical density seems like a big barrier and perhaps some posts are more key than others
What are selfish lifestyle reasons to work on the WIT team?
Is it fair to say the work WIT does is unusual outside of academia? What are closely related organizations that tackle similar problems?
How does your team define “good enough” for a sequence? What adjustments do you make when you fall behind schedule? Cutting individual posts? Shortening posts? Spending more time?
How much does the direction of a sequence change as you’re writing it? It seems like you have a vision in mind when starting out, but you also mention being surprised by some results.
Can you tell us more about the structure of research meetings? How frequently do individual authors chat with each other and for what reason? In particular, the CURVE sequence feels very intentionally like a celebration of different “EA methodologies”. Most of the posts feel individual before converging on a big cost-effectiveness analysis.
Much of your work is numerical simulation over discrete choices. Have there been attempts to define more “closed-form” analytical equations? What are pros and cons here?
What are the main constraints the WIT team faces?
Definitely wish I’d read and believed this when I was out of college.
One thing that surprised me once I got my ‘dream job’ was how behind I was on soft skills. I think if I had ‘lowered my bar’ earlier, I would have had more practice in communicating concisely, staying cool in stressful situations, and building work relationships.
Not sure if that path would have been more or less impactful in expectation, but there are definitely benefits to ‘lower ambition’ that I’m only appreciating now
Not knowing anything else about your friend, CEA intro resources + saying you’d be excited to discuss it sometime sounds like the best bet.
Cruxes here include:
How deeply does your friend want to learn about EA? They might only want to engage with it for a week or sporadically. Or they may want to know that longtermism is a thing but not go through any (or much) of the moral calculus
How does their disability manifest? The little bit I know about intellectual disabilities suggests that it’s hard to know in advance how it affects your learning, even for the person who has the disability. Struggling with math and stats is very common so that doesn’t tell me much.
Not knowing either of these, I suspect you should do the same as usual but mention that the community’s not always the best at communicating / makes stuff more complicated than it needs to be
Project-based learning seems to be an underappreciated bottleneck for building career capital in public policy and non-profits. By projects, I mean subjective problems like writing policy briefs, delivering research insights, lobbying for political change, or running community events. These have subtle domain-specific tradeoffs without a clean answer. (See the Project Work section in On-Ramps Into Biosecurity)
Thus the lessons can’t be easily generalized or made legible the way a math problem can be. With projects, even the very first step of identifying a good problem is tough. Without access to a formal network, you can spend weeks on a dead end only realizing your mistakes months or years after the fact.
This constraint seems well-known to professionals in the network, as organizers of research fellowships like SERI MATS describe their programs as valuable and highly in-demand, yet constrained in how many people they can train.
I think operations best shows the surprising importance of domain-specific knowledge. The skill set looks similar across fields, which would imply some interchangeability between the private sector and the social sector. But in practice, organizations want you to know their specific mission very well, and they’re willing (correctly or incorrectly) to hire a young Research Assistant over, say, someone with 10 years of experience at a Fortune 500 company. That domain knowledge helps you internalize the organization’s trade-offs and prioritize without using too much senior management time.
Emphasizing this supervised project-based learning mechanism of getting domain-specific career capital would clarify a few points.
With school, it would
emphasize that textbook knowledge is necessary yet insufficient for contributing to social sector work
show the benefits of STEM electives and liberal arts fields, where the material is easier from a technical standpoint but you work on open-ended problems
illustrate how research-based Master’s degrees in Europe tend to offer better training than purely coursework-based ones in the US (IMHO, true in Economics)
With young professionals, it would
highlight the “Hollywood big break” element of getting a social sector job, where it’s easier to develop your career capital after you get your target job and get feedback on what to work on (and probably not as important before that)
formalize the intuition some people have about “assistant roles in effective organizations” being very valuable even though you’re not developing many hard skills
With discussions on elitism and privilege, it would
give a reason for the two-tier system many social sectors seem to have, where the stable jobs require years of low-paid experience and financially unstable training opportunities require significant sacrifice to even access
perhaps inform some informational interventions like books highlighting the hidden curriculum in executing projects or communicating with stakeholders (Doing Economics: What You Should Have Learned in Grad School But Didn’t or The Unspoken Rules: Secrets to Starting Your Career Off Right)
I always read therapeutic alliance as advice for the patient, where one should try many therapists before finding one that fits. I imagine therapists are already putting a lot of effort on the alliance front
Perhaps an intervention could be an information campaign telling patients more about this? I feel it’s not well known or obvious that you can (1) tell your therapist their approach isn’t working and (2) switch around a ton before potentially finding a fit
I haven’t looked much into it though
Love this and excited to see more of it. (3) is the biggest surprise for me and I think I’m more positive on education now.
Interested to hear your thoughts on growth diagnostics if you ever get around to it
P.S. I imagine you’re too busy to respond, but I’d be curious to hear if these findings surprised you / what updates you made as a result
EA organizations often have to make assumptions about how long a policy intervention matters in calculating cost-effectiveness. Typically people assume that passing a policy is equivalent to having it in place for around five years more or moving the start date of the policy forward by around five years.
I am really really surprised 5 years is the typical assumption. My conservative guess would have been ~30 years persistence on average for a “referendum-sized” policy change.
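The gap between these assumptions can be sketched with some toy arithmetic (all figures below are made up for illustration, not from any actual analysis):

```python
# Hypothetical illustration of how the persistence assumption drives
# a policy cost-effectiveness estimate. All numbers are invented.

campaign_cost = 1_000_000   # $ spent to pass the policy
annual_benefit = 500_000    # $ of benefit per year the policy is in place


def benefit_per_dollar(persistence_years):
    """Benefit per dollar spent, assuming the counterfactual impact
    lasts persistence_years (the key assumption in question)."""
    return annual_benefit * persistence_years / campaign_cost


# The estimate scales linearly with the assumed persistence,
# so 5 years vs ~30 years is a 6x difference in cost-effectiveness.
print(benefit_per_dollar(5))    # 2.5
print(benefit_per_dollar(30))   # 15.0
```

Since the estimate is linear in the assumption, the choice between 5 and 30 years can easily dominate every other input in the model.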
Related, I’m surprised this paper is a big update for some people. I suppose that attests to the power of empirical work, however uncertain, for illuminating the discussion on big picture questions.
How Much Does Performance Differ Between People by Max Daniel and Benjamin Todd goes into this
Also there’s a post on being “vetting-constrained” that I can’t recall off the top of my head. The gist is that funders are risk-averse (not in the moral sense, but in the sense of relying on elite signals) because Program Officers have less time / knowledge than they’d like for evaluating grant opportunities. So they rely on credentials more than is ideal
I liked this a lot. For context, I work as an RA on an impact evaluation project. I have a light interest in / familiarity with meta-analysis + machine learning, but I did not know what surrogate indices were going into the paper. Some comments below, roughly in order of importance:
Unclear contribution. I feel there are three contributions here: (1) an application of the surrogate method to long-term development RCTs, (2) a graduate-level intro to the surrogate method, and (3) a new M-Lasso method which I mostly ignored. I read the paper mostly for the first two contributions, so I was surprised to find out that the novel contribution was actually M-Lasso
Missing relevance for “Very Long-Run” Outcomes. Given the mission of Global Priorities Institute, I was thinking throughout how the surrogate method would work when predicting outcomes on a 100-year horizon or 1000-year horizon. Long-run RCTs will get you around the 10-year mark. But presumably, one could apply this technique to some historical econ studies with (I would assume) shaky foundations.
Intuition and layout is good. I followed a lot of this pretty well despite not knowing the fiddly mechanics of many methods. And I had a good idea on what insight I would gain if I dived into the details in each section. It’s also great that the paper led with a graph diagram and progressed from simple kitchen sink regression before going into the black box ML methods.
Estimator properties could use more clarity.
Unsure what “negative bias” is. I don’t know if the “negative bias” in the surrogate index is an empirical result arising from this application, or a theoretical result where the estimator is biased in a negative direction. I’m also unsure if this is attenuation (biasing towards 0) or an honest-to-god negative bias. The paper sometimes mentions attenuation and other times negative bias, but as far as I can tell, there’s only one surrogacy technique used
Is surrogate index biased and inconsistent? Maybe machine learning sees this differently, but I think of estimators as ideally being unbiased and consistent (i.e. consistent meaning more probability mass around the true value as sample size tends to infinity). I get that the surrogate index has a bias of some kind, but I’m unclear on if there’s also the asymptotic property of consistency. And at some point, a limit is mentioned but not what it’s a limit with respect to (larger sample size within each trial is my guess, but I’m not sure)
How would null effects perform? I might be wrong about this but I think normalization of standard errors wouldn’t work if treatment effects are 0...
Got confused on the relation between the Prentice criterion and regular unconfoundedness. Maybe this is something I just have to sit down and learn one day, but I initially read the Prentice criterion as a standard econometric assumption of exogeneity. But then the theory section mentions the Prentice criterion (Assumption 3) as distinct from unconfoundedness (Assumption 1). It’s good the assumptions are spelt out, since that pointed out a bad assumption I was working with, but perhaps this could be clarified further.
Analogy to Instrumental Variables / mediators could use a bit more emphasis. The econometric section (lit review?) buries this analogy towards the end. I’m glad it’s mentioned since it clarifies the first-stage vibes I was getting through the theory section, but I feel it’s (1) possibly a good hook to lead into the theory section and (2) something worth discussing a bit more
Could expand Table 1 with summary counts on outcomes per treatment. 9 RCTs sounds tiny, until I remember that these have giant sample sizes, multiple outcomes, and multiple possible surrogates. A summary table of sample size, outcomes, and surrogates used might give a bit more heft to what’s forming the estimates.
Other stuff I really liked
The “selection bias” in long-term RCTs is cool. I like the paragraph discussing how these results are biased by what gets a long-term RCT. Perhaps it’s good emphasizing this as a limitation in the intro or perhaps it’s a good follow-on paper. Another idea is how surrogates would perform in dynamic effects that grow over time. Urban investments, for example, might have no effect until agglomeration kicks in.
The surprising result of surrogates being more precise than actual RCT outcomes. This was a pretty good hook for me, but I could have easily passed over it in the intro. I also think the result here captures the core intuition of the bias-variance tradeoff + surrogate assumption in the paper quite strongly.
I’ve read conflicting things about how individual contributor skills (writing the code) and people management skills relate to one another in programming.
Hacker News and the cscareerquestions subreddit give me the impression that they’re very separate, with many complaining about how advancement dries up on a non-management track.
But I’ve also read a few blog posts (which I can’t recall) arguing the most successful tech managers / coders switch between the two, so that they keep their technical skills fresh and know how their work fits in a greater whole.
What’s your take on this? Has it changed since starting your new job?
Flagging quickly that ProbablyGood seems to have moved into this niche. Unsure exactly how their strategy differs from 80k hours but their career profiles do seem more animals and global health focused
I think they’re funded by similar sources to 80k https://probablygood.org/career-profiles/
This looks like a really cool framework! Hoping to experiment with the inputs sometime to inform my future career decisions / my thoughts on funding desk research versus original research / value of replications.
Moving some funders from an overall lower cost effectiveness to a still relatively low or middling level of cost effectiveness can be highly competitive with, and, in some cases, more effective than working with highly cost-effective funders.
I’ve suspected this but never had the framework to formalize it. Or what parameters my claim was sensitive to. Mostly I had a toy metaphor in my head of “small nudge on giant slow rock” > “big nudge on tiny speeding rock” (the former being a bureaucrat nudging an agency budget and the latter being a CEO pivoting an EA-style non-profit). So having more of these cruxes and levers listed out here is very helpful
One concrete case I feel is a sort of “constrained optimization” where a funder has a specific theme they want to stick to. With governments, this might be a legal requirement.
Do you think your area is more talent-constrained or cash-constrained? How about your particular role? Read this in whatever way makes sense
Thanks so much for this! I don’t know why I ever thought about decomposing the idea of corruption but it seems like a really obvious framework now that you’ve mentioned it. Hoping to give that a read sometime.
Hi Khai, this depends on what you want to do in the future. The short answer is no. Both statistics and maths are broad fields with solid generalizability and respectability. They also tend to vary a bit in difficulty, rigor and focus across schools.
Math is prob better for keeping the option of various fields of academia open. Stats is prob better for industry. But it’ll depend on the classes you take too.
The most generalizable classes will be:
Calculus sequence
Linear algebra
Intro to probability and statistics
These are used in a very wide range of fields. But after that it branches out pretty quickly and you want to focus on domain knowledge or technical classes specific to a field
Economics has its own approach to stats called econometrics which deviates quite a bit culturally and technically in its focus. Andrew Gelman has some blog posts you can search on that
Stuff I don’t know which other stats people are more likely to know:
Markov chains
Monte Carlo simulations
really any simulation technique
Bayesian stats
information theory
textual analysis or ML stuff
…and a lot more. And I can / will learn a few of these in the future for work or interest. But they’re not immediately useful
Quick thought on the tangent, which I’d also love to hear more thoughts on from other people.
I’m skeptical that corruption is a big obstacle to growth and development. Measurement and historical comparisons are tricky here, but corruption seems to be a pervasive feature across many societies.
Even the United States had its local political machines and its share of bribery before the Progressive Movement of the 1920s tried to filter them out. And conventional wisdom credits the 19th-century Industrial Revolution (before the US reduced its corruption) with our modern wealth.
I suspect that if we applied our same concern about corruption to currently-developed countries’ pasts, we’d find they (1) would fare just as badly and (2) had their development periods before they dealt with corruption
My 2 cents:
Good advice but I’ll add that many of these things (solo projects, getting internships, writing, etc.) benefit substantially from attending a school with good training (which correlates somewhat with prestige and cost-of-attending).
Feedback, mentorship, and direction are bottlenecks for executing impressive projects and sometimes the best way (or only way) for someone to access these is through the conventional schooling route.
Conventional education and independent projects complement each other
Decreasing focus over time may not mean decreasing productivity:
Suppose you want to double your productivity by doubling your work hours from 30 to 60 per week. Standard advice will say this is silly, since focus decreases over time. You may still increase your productivity but it will scale slower than your work hours.
But this assumes all assigned work is equally important. In reality, many jobs have peripheral tasks that must be done before your core tasks (or your “real work”). Civil servants have reporting requirements, academic researchers have teaching obligations, and individual contributors everywhere have to attend meetings so managers can coordinate direction.
Suppose the non-core tasks take 20 hours per week. Then going from a 30-hour to a 60-hour workweek isn’t just doubling your core task hours; it’s quadrupling your core task hours from 10 to 40! And that quadrupling of core task hours can outweigh the diminishing focus over time. It can even mean that the last 20 hours are more productive than the first 20 hours.
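The arithmetic above can be sketched as a toy model (the 20-hour peripheral load and the per-hour focus decay rate are illustrative assumptions, not empirical estimates):

```python
# Toy model: core-task output under a fixed peripheral-task load.
# PERIPHERAL_HOURS and focus_decay are assumed, illustrative values.

PERIPHERAL_HOURS = 20  # weekly hours of reporting, meetings, etc.


def core_hours(total_hours):
    """Hours left for 'real work' after peripheral tasks are done."""
    return max(0, total_hours - PERIPHERAL_HOURS)


def core_output(total_hours, focus_decay=0.99):
    """Core output where each successive core hour is slightly
    less productive than the last (diminishing focus)."""
    return sum(focus_decay ** h for h in range(core_hours(total_hours)))


# Doubling the workweek (30 -> 60) quadruples core hours (10 -> 40),
# so core output more than doubles despite diminishing focus.
print(core_hours(30), core_hours(60))           # 10 40
print(core_output(60) > 2 * core_output(30))    # True
```

With this decay rate, the comparison flips only if the focus penalty is made much steeper, which is the crux of the argument: whether extra hours help depends on how much of the workweek is peripheral overhead.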
Now 20 hours of peripheral tasks is admittedly an extreme example. But it may not be that far off for modeling career advancement. Promotions are based partly on stretch assignments (or “performing above your level”) and you won’t get to work on stretch assignments all the time. Managers may split your time between your current job and the job you want to promote into.
Once you get to a certain level of seniority and organizational maturity, then more of your hours become core task hours. So diminishing focus more directly translates into diminishing productivity. But I think the earlier you are in your career, the more exploration you’re doing, and the further you are from your target job, the more likely you’ll want those extra hours.