Since November, we’ve made 27 grants worth a total of $1,650,795 (with the expectation that we will get up to $220,000 back), reserved $26,500 for possible later spending, and referred two grants worth a total of $120,000 to private funders. We anticipate being able to spend $3–8 million this year (up from $1.4 million spent in all of 2020). To fill our funding gap, we’ve applied for a $1–1.5 million grant from the Survival and Flourishing Fund, and we hope to receive more funding from small and large longtermist donors.
The composition of the fund has changed since our last grant round. The new regular fund team consists of Asya Bergal (chair), Adam Gleave, and Oliver Habryka, though we may take on additional regular fund managers over the next several months. Notably, we’re experimenting with a “guest manager” system where we invite people to act as temporary fund managers under close supervision. Our guest managers this round were Daniel Eth, Evan Hubinger, and Ozzie Gooen.
Highlights
Our grants include:
$35,000 to support John Wentworth’s independent AI safety research, specifically testing an empirical claim relevant to AI alignment called the natural abstraction hypothesis. John Wentworth has produced a huge amount of high-quality work that has influenced top AI safety researchers and pushed the field of AI safety forward; this grant will enable him to produce more work of this kind. This grant was given the maximum possible score by four of our six fund managers (one manager gave a non-maximal score; the other didn’t score this grant).
Up to $200,000 to fund PhD students and computing resources at David Krueger’s new AI safety lab at the University of Cambridge. David Krueger has done excellent safety work and was recently appointed to a faculty position in Cambridge’s Computational and Biological Learning Lab. Having a new academic lab focused on AI safety is likely to have highly positive field-building effects by attracting promising junior AI researchers to safety work and shifting the thinking of more senior researchers.
$6,917 to support a research assistant for Jess Whittlestone and Shahar Avin’s work at the Centre for the Study of Existential Risk (CSER), wherein they hope to ensure that the lessons learnt from COVID-19 improve global catastrophic risk (GCR) prevention and mitigation in the future. This is a timely grant for an ambitious project which, if successful, could shift global attitudes towards GCRs.
Grant recipients
See below for a list of grantees’ names, grant amounts, and project descriptions. Most of the grants have been accepted, but in some cases, the final grant amount is still uncertain.
Grants made during the last grant application round:
Adam Shimi ($60,000): Independent research in AI alignment for a year, to help transition from theoretical CS to AI research.
AI Safety Camp ($85,000): Running a virtual and physical camp where selected applicants test their fit for AI safety research.
Alexander Turner ($30,000): Formalizing the side effect avoidance problem.
Amon Elders ($250,000): Doing a PhD in computer science with a focus on AI safety.
Anton Korinek ($71,500): Developing a free online course to prepare students for cutting-edge research on the economics of transformative AI.
Anonymous ($33,000): Working on AI safety research, emphasizing AI learning from human preferences.
Center for Human-Compatible AI ($48,000): Hiring research engineers to support CHAI’s technical research projects.
Daniel Filan ($5,280): Technical support for a podcast about research aimed at reducing x-risk from AI.
David Krueger ($200,000, with an expected reimbursement of up to $120,000): Computing resources and researcher salaries at a new deep learning + AI alignment research group at Cambridge.
David Manheim ($80,000): Building understanding of the structure of risks from AI to inform prioritization.
David Reber ($3,273): Researching empirical and theoretical extensions of Cohen & Hutter’s pessimistic/conservative RL agent.
Joel Becker ($42,000): Conducting research into forecasting and pandemics during an economics Ph.D.
Legal Priorities Project ($135,000): Hiring staff to carry out longtermist academic legal research and increase the operational capacity of the organization.
Logan Strohl ($80,000): Developing and sharing an investigative method to improve traction in pre-theoretic fields.
Marc-Everin Carauleanu ($2,491): Writing a paper/blog post on cognitive and evolutionary insights for AI alignment.
Naomi Nederlof ($34,064): Running a six-week summer research program on global catastrophic risks for Swiss (under)graduate students.
Po-Shen Loh ($100,000, with an expected reimbursement of up to $100,000): Paying for R&D and communications for a new app to fight pandemics.
Rethink Priorities ($70,000): Researching global security, forecasting, and public communication.
Sam Chorlton ($37,500): Democratizing analysis of nanopore metagenomic sequencing data.
Stanford Existential Risks Initiative ($60,000): Providing general-purpose support via the Berkeley Existential Risk Initiative.
Sören Mindermann ($41,869): Doing a 3rd and 4th year of a PhD in machine learning, with a focus on AI forecasting.
Summer Program on Applied Rationality and Cognition ($15,000): Supporting multiple SPARC projects during 2021.
Tegan McCaslin ($80,401): Continuing research projects in AI forecasting and AI strategy.
Toby Bonvoisin ($18,000): Doing the first part of a DPhil at Oxford in modelling viral pandemics.
Off-cycle grants:
Jess Whittlestone, Shahar Avin ($6,917): Ensuring that the lessons learnt from COVID-19 improve GCR prevention and mitigation in the future.
Longtermist Entrepreneurship Fellowship (see below): Giving seed funding to projects incubated by the Longtermist Entrepreneurship Fellowship Programme.
Grants made through the Longtermist Entrepreneurship Fellowship:
Josh Jacobson ($26,500, with $26,500 earmarked for potential use later): Research into physical-world interventions to improve productivity and safeguard life.
Grant reports
Note: Many of the grant reports below are very detailed. We encourage anyone who is considering applying to the fund but would prefer a less detailed report to apply anyway; we are very sympathetic to circumstances where a grantee might be uncomfortable with a highly detailed report on their work.
We run all of our payout reports by grantees, and we think carefully about what information to include to maximize transparency while respecting grantees’ preferences. If considerations around reporting make it difficult for us to fund a request, we are able to refer requests to private donors whose grants needn’t involve public reporting. We are also able to make anonymous grants, as we did this round.
Grant reports by Adam Gleave
Adam Shimi – $60,000
Independent research in AI alignment for a year, to help transition from theoretical CS to AI research.
Recusal note: Evan Hubinger did not participate in the voting or final discussion around this grant.
Adam applied to perform independent research on AI safety for 1 year in collaboration with Evan Hubinger at MIRI, with whom he has been working since mid-2020. Adam completed a PhD from IRIT (publications) in 2020 focused on distributed computing theory. Since then, he has written a number of posts on the Alignment Forum.
I particularly liked his Literature Review on Goal-Directedness, which I felt served as an excellent introduction to the area with a useful taxonomy of different motivations (or “intuitions”) for the work. This seems like a good example of research distillation, a category of work that is generally under-incentivized both by academia and by the major corporate AI labs. I’m not sure AI safety needs much in the way of distillation at this point (it’s still a fairly small field), but at the margin I think we’d likely benefit from more of it, and this need will only grow as the field scales. Given this, I’m generally excited to see good research distillation.
In addition to continuing his research distillation, Adam plans to work on formalizing goal-directedness. He seems well-prepared for this given his CS theory background and grasp of the existing space, as evidenced by his literature review above. I find this direction promising: slippery and imprecise language is common in AI safety, which often leads to miscommunication and confusion. I doubt that Adam or anyone else will find a universally agreed-upon formalism for goal-directedness, but attempts in this area can help to surface implicit assumptions, and I expect will eventually lead to a more coherent and precise taxonomy of different kinds of goal-directedness.
I tend to apply a fairly high bar to long-term independent research, since I think it is hard to be a productive researcher in isolation. There are several factors that make me more positive about this case:
Adam already has significant research experience from his PhD, making him more likely to be able to make progress with limited oversight.
Evan Hubinger is spending around 1 hour a week on informal mentorship, which I think helps to provide accountability and an independent perspective.
Adam is interested in working at an organization long-term, but has found it particularly challenging to find a position this year owing to the pandemic.
Given this, it seems reasonable to support Adam for a year to continue his research while he searches for positions.
The main concern I have regarding this grant is that Adam’s previous work in distributed computing seems to have had a limited impact on the field, with only 6 citations as of the time of writing. While it’s entirely possible that Adam has better personal fit in AI safety, his track record in that field is brief enough that I’m not yet confident about this. However, there are other positive signs: his previous publications were all in reasonable conferences, including one in PODC, which I understand to be one of the top-tier venues in distributed computing. Unfortunately, I am not familiar enough with this area of CS to render my own judgment on the papers, so I have instead relied primarily on my impressions of his more recent AI-focused work.
Amon Elders – $250,000
Doing a PhD in computer science with a focus on AI safety.
Amon applied for funding to support a PhD position in artificial intelligence working with Prof. Michael Osborne. Receiving funding was important to securing an offer, since there is limited PhD funding available for non-UK citizens in the UK after Brexit. Moreover, the funding sources that are available have application deadlines earlier in the year, and so waiting for these would have delayed the start of the PhD by a year.
Amon graduated from UCL with an MS in CS & ML (with distinction). His research experience includes a year-long RA position at the Italian Institute of Technology in Genoa with his UCL supervisors, which resulted in a third-author publication in AIES, and an MS Thesis on learning to rank. Amon also has relevant professional experience, having worked as an ML engineer at Spark Wave, and as an undergraduate co-founded the EA society at the University of Amsterdam.
Amon is particularly interested in working on robustness and distributional shift in his PhD, though he is also considering other topics. While his research direction is currently tentative, he is familiar with the key aspects of long-term AI safety and is interested in pursuing this direction further both during and after his PhD.
Michael Osborne seems like a good advisor given Amon’s interests. Michael has a deep background in Bayesian ML, which is relevant for robustness and distributional shift, and he also has an interest in the social impact of machine learning. Michael Osborne also advises Michael Cohen, a PhD scholar at FHI. While Michael Cohen is pursuing a different concrete research agenda to Amon’s current interests, they have a similar high-level focus, so I expect being in the same group will help lead to fruitful discussions.
I generally think there’s a strong case for funding talented people to pursue PhDs on relevant topics. For one, this is a great way of training relevant researchers: I think PhDs generally do a good job of teaching relevant research skills and are a valuable credential. In addition, there’s often scope to produce directly valuable research during a PhD, although the opportunity here varies between labs.
A big reason why I’m often hesitant to fund PhDs is adverse selection. There are many other funding sources for PhDs: government fellowships, university bursaries, private endowments, advisors’ research grants, etc. Most of the people who we would want to fund can easily get funding from elsewhere. So the applications we see will disproportionately come from people whom other funders have chosen to reject. However, Amon was simply ineligible for the majority of those funding sources, so this consideration is much weaker in his case. He ended up getting 5 PhD offers, including offers from Oxford, Cambridge, UCL, and Imperial.
The main concern I have with this grant is that Amon may end up working on topics that I do not believe will have an impact on the long-term future. This could be fine if he later pivots to focus more on AI safety post-PhD, but research directions can often be “sticky” (see similar discussion in the write-up for Sören Mindermann below). However, Amon does seem to be on a promising path at the moment, and he’s keeping up with AI safety research at other labs.
David Krueger – $200,000, with an expected reimbursement of up to $120,000
Computing resources and researcher salaries at a new deep learning + AI alignment research group at Cambridge.
I am generally excited about supporting academic labs that want to focus on AI safety research. Right now CHAI is the only academic lab with a critical mass of people working on technical AI safety. While there are a handful of other early-stage labs, I believe we still need many more if we are to seriously expand the field of AI safety in academia.
The most direct path for impact I see for additional safety-focused academic labs is allowing PhD students with a pre-existing interest in safety to work on related research topics. Right now, many PhD students end up in labs focused on unrelated topics. Indeed, I’ve even recommended that prospective students “give more weight to program quality than the ability to pursue topics you consider important.” While I stand by this advice, I hope that as safety-focused labs expand, students will no longer have to make this trade-off.
I also expect that students graduating from a safety-focused academic lab will be better prepared to perform safety research than if they had worked in another lab. I’d estimate that they will save somewhere between 6 months and a year of time that would otherwise need to be spent on literature review and clarifying their thinking on safety after graduation.
Perhaps the largest upside, though also one of the more uncertain ones, is an indirect field-building effect. Right now, a large fraction of people working on AI safety have been influenced by the philosophy of effective altruism (EA). While I am grateful for the support of the EA community on this topic, this also highlights a failure to tap into the much larger talent pool of (prospective) AI researchers who aren’t connected to that community. Academic labs can help with this by recruiting PhD students who are interested in, but have little prior exposure to, safety, and by shifting the thinking of other more senior researchers.
Despite this positivity, we do not want to indiscriminately fund academic labs. In particular, most of the best academic AI research happens in the top 20 or so universities. Outside of those universities, I would expect a significant drop-off in the caliber of PhD applicants, as well as the lab being in a worse position to influence the field. On this criterion, Cambridge seems like a good place for a new lab: it is one of the best universities in the world for Bayesian ML, and benefits from the presence of strong mathematics and CS departments. While I would be even more excited by a lab at the likes of Stanford or MIT, which have amongst the strongest AI groups right now, Cambridge is clearly still a good place to do AI research.
David intends to use our funding to support new PhD students and purchase computational resources. Both of these seem like reasonable uses of funding:
PhD students: I reviewed a list of prospective PhD students and saw several applicants who seem strong, and I would be excited to see them join David’s lab. While PhD students can often receive fellowships or scholarships from other sources, these opportunities are limited for international students in the UK. Furthermore, current applicants are not eligible for most of those funding sources this year, because David received his faculty offer after many funding deadlines had already passed.
Computational resources: The CBL lab that David is joining at Cambridge has limited in-house computational resources, with a ratio of around 2 GPUs per student. Moreover, the existing resources are fragmented between many small servers and workstations. I spoke to a current student at the CBL who confirmed that there is a shortage of computational resources which both limits the scope of possible experiments and wastes time.
The main risk I see with this grant is that, since the funding is unrestricted, we might be negatively surprised by what it is spent on. In particular, it is conceivable that it could be used to fund a PhD student the LTFF would not have chosen to fund directly. However, unrestricted funding is also more valuable to David, allowing him to make timely offers to students who are faced with short deadlines from competing institutions. Moreover, David has a strong incentive to recruit good students, so it isn’t clear that the LTFF has a comparative advantage in evaluating his prospective students. Overall, these considerations seem to weigh heavily in favour of unrestricted funding, although this does make the outcome harder to predict.
It’s also possible that mentorship will cease to be a major bottleneck on the growth of the AI safety field in the near future. Senior safety researchers are graduating in increasing numbers, and many of them are interested in faculty positions. Moreover, my (by no means rigorous) impression is that the growth in the number of PhD candidates is beginning to slow (though it is still expanding). If mentorship becomes more plentiful in the future, David’s lab may be significantly less influential than expected. However, I think this grant (barely) clears the bar for funding even on the basis of its expected short-term impact alone (before mentorship “catches up” with the influx of junior research talent).
We expected that the Survival and Flourishing Fund would also be likely to want to fund David, and we thought it would be fair to share the costs with them, so we arranged that the first $120,000 of any money they decide to give to David Krueger will be returned to us instead.
Sören Mindermann – $41,869
Doing a 3rd and 4th year of a PhD in machine learning, with a focus on AI forecasting.
We are providing financial support to Sören, a PhD student at Oxford advised by Yarin Gal, to improve his research productivity during his PhD. This is a renewal of our grant made in August 2019 at the start of his PhD. Sören anticipates spending money on items such as home office equipment, meal replacements (to spend less time cooking), a cleaning service, and other miscellaneous productivity-boosting costs.
We have made several grants providing supplementary funding to graduate students, such as Vincent Luczkow and an anonymous PhD candidate, so it’s worth taking a moment to reflect on our general approach to evaluating these grants. In general, I find the basic case for providing supplementary funding to graduate students fairly strong. The stipend provided to graduate students is often very low, especially outside the US. In Sören’s case, his stipend pays £15,000/year ($21,000/year equivalent). I therefore expect him, and other students in similar positions, to be able to productively use additional funding. Concretely, I would anticipate a grant of $20,000/year to be able to “buy” an extra 2 to 8 hours of additional productive work per week, giving a cost of around $50 to $200/hour.
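As a rough sanity check on that $50 to $200/hour range, here is the back-of-the-envelope arithmetic; the 50-week working year is my own illustrative assumption rather than a figure from the application:

```python
# Back-of-the-envelope cost per marginal hour "bought" by a $20,000/year top-up.
# The 50-week working year is an illustrative assumption, not from the application.
grant_per_year = 20_000  # USD
weeks_per_year = 50      # assumed productive weeks per year

for extra_hours_per_week in (2, 8):
    hours_per_year = extra_hours_per_week * weeks_per_year  # 100 or 400 hours
    print(f"{extra_hours_per_week} extra hrs/week -> ${grant_per_year / hours_per_year:.0f}/hour")
# Prints: 2 extra hrs/week -> $200/hour
#         8 extra hrs/week -> $50/hour
```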
As a rule of thumb, I’d be very excited to buy an additional hour of the median junior AI safety researcher’s time for $50. At $100, I’d still weakly favor this purchase; at $200, I would lean against but feel uncertain. This is very sensitive to personal fit and track record — I could easily go 4x higher or lower than this depending on the candidate.
In the case of existing researchers, I think we should actually be willing to pay slightly more than this. In other words, I would prefer to have 10 researchers who each produce “1.1 units” of research than 11 researchers who each produce “1 unit”. My rationale for this is that growing the number of people in a field imposes additional communication costs. Additionally, the reputation of a field depends to some extent on the average reputation of each researcher, and this in turn affects who joins the field in the future. This makes it preferable to have a smaller field of unusually productive researchers than a larger field of less productive researchers.
In addition to increasing direct research output, I anticipate these grants also having positive indirect effects. In particular, the track record someone has in a PhD heavily influences their future job opportunities, and this can be very sticky, especially in faculty positions.
One countervailing consideration is that people already in PhD programs who apply for funding are highly likely to continue in the field. The impact on someone’s career trajectory might be greater if we provide funding pre-PhD, especially if that allows them to secure a position they otherwise wouldn’t. For this reason, all else being equal I think we should be more excited about funding people for a PhD than topping up funding of someone already in a PhD. However, in practice it’s quite rare for us to be able to truly counterfactually change someone’s career trajectory in this way. Most talented researchers secure PhD funding from other sources, such as their advisor’s grants. When we have made a grant to support someone in a PhD, I expect it has most often merely accelerated their entry into a PhD program, rather than altering their long-run trajectory.
While I do not believe this applies in Sören’s case, I also think having a norm that additional funding is available may encourage people to pursue PhDs who would otherwise have been reluctant to do so due to the financial sacrifices required. This is particularly relevant for people later in their careers, who might have to take a significant pay cut to pursue a PhD, and are more likely to have dependents and other costs.
Turning back to the specifics of this grant, Sören’s work to date has focused on causal inference. His publications include a more theoretical paper in NeurIPS (a top-tier ML conference), and a joint first-author paper in Science evaluating different government interventions against COVID-19. This seems on par with, or exceeds, the publication track record of other PhD students of similar seniority in similar programs. Having a Science paper in particular is very unusual (and impressive) in ML; however, I do think this was facilitated by working on such a topical issue, and so it may be a difficult feat to repeat.
My biggest concern is that none of the work to date has focused on directly improving the long-term future. There’s a tentative case for the COVID-19 work helping to improve governments’ response to future pandemics or generally improve methodology in epidemiology, but I’m unconvinced (although I do think the work looks good from a short-term perspective). Similarly, while Sören’s work on causal inference is in a relevant area, and will be useful for skill-building, I don’t see a clear pathway for this to directly improve technical AI safety.
However, I think that it can be a reasonable choice to focus more on skill-building during your PhD. In particular, it is often useful to start by imitating the research direction of those around you, since that will be the area for which you can get the most detailed and valuable feedback. While this may limit your short-term impact, it can set you up for an outsized future impact. The catch is that you must eventually pivot to working on things that are directly important.
In Sören’s case, I’m reasonably confident such a pivot will happen. Sören is already considering working on scaling laws, which have a clear connection to AI forecasting, for the remainder of his PhD. Additionally, we have clear evidence from his past work (e.g., internships at CHAI and FHI) that he has an interest in working on safety-related topics.
Nonetheless, I think research directions can end up surprisingly sticky. For example, you tend to get hired by teams who are working on similar things to the work you’ve already done. So I do think there’s around a 25% chance that Sören doesn’t pivot within the next 3 years, in which case I’d regret making this grant, but I think the risk is worth taking.
Grant reports by Asya Bergal
Any views expressed below are my personal views and not the views of my employer, Open Philanthropy. In particular, receiving funding from the Long-Term Future Fund should not be read as an indication that an organization or individual has an elevated likelihood of receiving funding from Open Philanthropy. Correspondingly, not receiving funding from the Long-Term Future Fund (or any risks and reservations noted in the public payout report) should not be read as an indication that an organization or individual has a diminished likelihood of receiving funding from Open Philanthropy.
Anton Korinek – $71,500
Developing a free online course to prepare students for cutting-edge research on the economics of transformative AI.
Update December 2021: This course is now live here.
Anton applied to improve and professionalize a free Coursera course on the economics of transformative AI in time for an initial public run this fall. Anton is an associate professor of economics and had previously created a beta version of the course for his own graduate students at the University of Virginia. He applied for funding to pay for his own time working on the course, the time of a graduate student assistant, professional video editing, guest lecturers, copyright fees, and support from the University of Virginia.
I have spoken to Anton before and read large parts of his and Philip Trammell’s paper, Economic growth under transformative AI, which I thought was a good summary of existing economics work on the subject. My sense is that Anton is a competent economist and well-suited to creating an academic course.
I also skimmed the course materials for the beta course and watched parts of the associated video lectures. I would summarize the course as starting in more conventional AI economics covering the automation of labor and economic growth, and ending in more longtermist AI economics covering economic singularities, the macroeconomics of AI agents, and the control problem. The course is fairly technical and aimed towards advanced economics students. I am somewhat familiar with the economics work Anton is drawing from; while I didn’t investigate the course materials in depth, I didn’t see anything obviously inaccurate. The presentation of the beta version of the course seemed rough on certain dimensions (in particular, video quality); I can imagine it benefitting substantially from additional work and editing.
I was interested in funding this course because I think it could introduce smart economists to longtermist work, though much of the course isn’t explicitly focused on the long-term impacts of transformative AI. I think the work of many economists, including Philip Trammell and Robin Hanson, has been really influential in longtermist thinking, and a lot of the AI forecasting work I’m most excited about now is based in economics. It doesn’t seem implausible to me that a course like this could find the “low-hanging fruit” of economics students who turn out to be really interested in longtermist work when exposed to it. (As a secondary path to impact, this course is the best collection of materials around the economics of transformative AI that I know of, and I can imagine using it as a resource in the future.)
One risk of funding work that isn’t strictly longtermist is that it misrepresents longtermist ideas in an attempt to have broader appeal, discouraging people who would be on board with longtermism and attracting people who don’t really understand it. Anton strikes me as someone who is careful in his presentation of concepts, so I think this is unlikely to happen here.
I think there’s a significant chance the course doesn’t have much of an impact, at least in its first run. Anton’s goal is to have 50 students complete the course in the fall; even if that goal is met, it seems likely to me that that’s too small a number to find anyone who ends up being really excited about the material. However, given that the course is likely to have multiple runs and iterations in the future, I think the upside from this grant is worth it.
Daniel Filan – $5,280
Technical support for a podcast about research aimed at reducing x-risk from AI.
Recusal note: Adam Gleave and Oliver Habryka did not participate in the voting or final discussion around this grant.
Daniel applied for funding to pay for audio editing, transcription, and recording space usage for his AI alignment podcast, the AI X-risk Research Podcast (AXRP).
I’ve listened to or read through several episodes of the podcast; I thought Daniel asked good questions and got researchers to talk about interesting parts of their work. I think having researchers talk about their work informally can provide value not provided by papers (and to a lesser extent, not provided by blog posts). In particular:
I’ve personally found that talks by researchers can help me understand their research better than reading their academic papers (e.g. Jared Kaplan’s talk about his scaling laws paper). This effect seems to have also held for at least one listener of Daniel’s podcast.
Informal conversations can expose motivations for the research and relative confidence level in conclusions better than published work.
Daniel provided some statistics about his podcast download numbers over time: 200–400 per episode as of early March. This guide suggests that Daniel’s podcast is in the top 10% to 25% of podcasts, though this single metric seems like a pretty dubious measure of podcast performance. (I also think it’s too early to tell exactly how well this podcast will do.)
Overall, the existing download counts along with personal anecdotes from people getting value out of this podcast were enough to justify this grant for me.
Rethink Priorities – $70,000
Researching global security, forecasting, and public communication.
Recusal note: Daniel Eth and Ozzie Gooen did not participate in the voting or final discussion around this grant.
Rethink Priorities is an organization that does public-facing research on a variety of EA topics; see their previous work here. Their previous longtermist work has consisted entirely of Luisa Rodriguez’s work on nuclear war. Luisa has since gone on to do other work, including this post, which I’ve referenced multiple times in this very grant round.
Rethink’s longtermist team is very new and is proposing work on fairly disparate topics, so I think about funding them similarly to how I would think about funding several independent researchers. Their longtermist hires are Linchuan Zhang, David Reinstein, and 50% of Michael Aird (he will be spending the rest of his time as a Research Scholar at FHI). I’m not familiar with David Reinstein. Michael Aird has produced a lot of writing over the past year, some of which I’ve found useful. I haven’t looked at any written work Linchuan Zhang has produced (and I’m not aware of anything major), but he has a good track record in forecasting, I’ve appreciated some of his EA forum comments, and my impression is that several longtermist researchers I know think he’s smart. Evaluating them as independent researchers, I think they’re both new and promising enough that I’m interested in paying for a year of their time to see what they produce.
They are proposing to do research in three different areas:
Global security (conflict, arms control, avoiding totalitarianism)
Forecasting (estimating existential risk, epistemic challenges to longtermism)
Polling / message testing (identifying longtermist policies, figuring out how to talk about longtermism to the public)
Broadly, I am most excited about the third of these, because I think there’s a clear and pressing need for it. I think work in the other two areas could be good, but feels highly dependent on the details (their application only described these broad directions).
Here’s some example work in these areas that I could imagine being interested in. Note that I haven’t spent time looking into these ideas; it’s possible that on further reflection I would no longer endorse them, or discover that the work has already been done:
Finding high-leverage ways to reduce geopolitical conflict that don’t require political influence (or learning that there aren’t any)
Estimating the expected badness of a totalitarian future compared to an extinction outcome
We decided to pay 25% of the budget that Rethink requested, which I guessed was our fair share given Rethink’s other funding opportunities.
Toby Bonvoisin – $18,000
Doing the first part of a DPhil at Oxford in modelling viral pandemics.
Toby applied for 7 months of funding for his tuition and living expenses during his DPhil at the Nuffield Department of Medicine at the University of Oxford. Toby’s DPhil offer came too late for him to access other sources of funding for these 7 months; this money meant he wouldn’t have to take out a loan or put off his studies.
Toby’s proposed DPhil topic is to model transmission of novel respiratory viruses within hospitals. I don’t think this is a particularly high-leverage line of work for preventing global catastrophic biological risks (GCBRs), but I think in biosecurity, it makes sense to prioritize building career capital over having an immediate impact early on.
Funding career building means taking a bet that someone will ultimately go on to do longtermist work. I didn’t know Toby at all, so I had a call with him talking about his motivations, future plans, and the interactions between short-term-focused and long-term-focused work in biosecurity. He gave thoughtful answers to those questions that made me feel like he was likely to pursue longtermist work going forward.
Jess Whittlestone, Shahar Avin – $6,917
Ensuring the lessons learnt from COVID-19 improve GCR prevention and mitigation in the future.
Jess and Shahar, who both work as researchers at the Centre for the Study of Existential Risk (CSER), applied for funding to pay for a research assistant for their project ensuring lessons learned from COVID-19 improve GCR prevention and mitigation in the future. From their application:
Described by Jaan Tallinn as a “minimum-viable catastrophe”, the C19 crisis has exposed failures across all levels of GCR prevention and mitigation. It is a compelling example of how institutions and systems can fail to prevent large-scale mortality and harm, even when much of the knowledge needed to do so is available. This presents an unprecedented opportunity to better understand failures of risk prevention and mitigation, and to enact long-lasting change in global attitudes. There will be extended and detailed investigation into “what went wrong” across different parts of society. We expect many powerful stories will be told about C19, in history books, museums, and educational curricula, which have potential to influence attitudes towards GCRs more broadly for decades to come.
Our aim is to ensure that the lessons emerging from the C19 crisis are broad and compelling enough to result in increased incentives and motivation for institutions to prioritise GCR prevention and mitigation. Actively shaping historical narratives is challenging but, we believe, powerful and doable: we have conducted preliminary research on the emergence of historical narratives about the Holocaust, which suggests that today’s widely accepted anti-fascist norms can be traced to narratives that emerged in novels, movies, museums and educational curricula over decades.
This will be a long-term project (10+ years with multiple stages), using CSER’s convening power to bring different expertise and perspectives to bear on the evolving lessons and narratives around C19.
I agree with their justification for this work. I think it’s good in expectation to put some longtermist effort towards promoting lessons learned from COVID-19, though as they say, I expect it to be very difficult to shape historical narratives.
Longtermist Entrepreneurship Fellowship
Giving seed funding to projects incubated by the Longtermist Entrepreneurship Fellowship Programme.
In November 2020, Jade Leung applied to the LTFF asking for a commitment of up to $200,000 to provide seed funding to projects incubated in the pilot run of the Longtermist Entrepreneurship Fellowship (LEF), a new program intended to help longtermist projects and organizations get off the ground. Fellows submitted proposals for seed grants at the end of the fellowship in December, and a committee consisting of Claire Zabel (representing Open Philanthropy), Kit Harris (representing Longview Philanthropy), Sjir Hoeijmakers (representing Founders Pledge), and Jonas Vollmer (representing EA Funds) decided on which proposals received funding. Funding decisions were made by committee, but Jonas retained the right of veto over the LTFF’s portion of the funding. I felt good about committing this money because I trust the collective judgement of this set of grantmakers.
Five projects run by seven fellows ended up applying, for a total of $464,500 in funding. The committee decided to approve up to three grants for a total of up to $182,000. One of those grants, worth $106,000, was accepted by the grantee. The LTFF will cover half of the cost of this grant, with Open Philanthropy’s grant covering the other half. We’ve asked Jonas Vollmer to explain the reasoning behind the grant below. We will report on any further grants in future payout reports if they are accepted (which currently seems unlikely).
A grant report from the Longtermist Entrepreneurship Fellowship, by Jonas Vollmer
Josh Jacobson – up to $53,000
Research into physical-world interventions to improve productivity and safeguard life.
Josh Jacobson applied for a grant to research physical-world interventions to improve productivity and safeguard life. He plans to brainstorm and research various interventions, such as improving air quality, preparing for nuclear war or earthquakes, or securing second citizenships. He also plans to help longtermists implement these interventions.
I overall voted in favor of this grant because:
I agree with the basic premise: Improving the productivity of people working on important issues could be very valuable, and protecting their safety might be useful, too.
I am aware that some longtermist organizations are currently researching some of these issues themselves. Josh may save these organizations time and circulate the results of his research beyond organizational boundaries.
Josh’s concrete ideas seemed relatively detailed to me, so I thought he was likely to make substantial progress on the project. His project plan and milestones seemed focused on delivering concrete output quickly, which I liked. He also previously worked with BERI on a related project that seemed successful.
Josh seems very motivated to work on this project, such that I expect him to do a good job at it.
Some reasons against this grant that I considered:
Overestimating the risks involved might lead to overspending on prevention. The potential benefits seem relatively limited (marginal reduction of a low risk), whereas the costs could grow large (many organizations could potentially spend a lot of time on this).
For instance, I estimate that an earthquake in San Francisco has a ~2% annual likelihood of occurring, and would kill ~0.03% of the population in expectation. I estimate 30% of such deaths to be preventable. The resulting preventable risk seems tiny (just two micromorts per year; I sketch the arithmetic just after this list). If longtermist organizations started taking measures to avert this risk for their staff as a result of this grant, that could be a poor use of their time.
Detailed reporting on personal safety risks could lead to unhelpful anxiety about them. During the COVID pandemic, I frequently got the impression that some EAs and rationalists seemed anxious about catching COVID to a degree where their worries seemed to take a much bigger toll on their expected quality of life than the actual risk at hand. I worry that a similar dynamic might come into play here.
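For readers who want to see how the “two micromorts per year” figure above falls out of those estimates, here is the calculation written out as a small illustrative snippet:

```python
# Preventable earthquake risk, using the estimates stated in the list above.
p_quake_per_year = 0.02        # ~2% annual likelihood of a major SF earthquake
p_death_given_quake = 0.0003   # ~0.03% of the population killed in expectation
fraction_preventable = 0.30    # ~30% of those deaths assumed preventable

preventable_risk = p_quake_per_year * p_death_given_quake * fraction_preventable
print(f"{preventable_risk * 1e6:.1f} micromorts per year")  # 1.8, i.e. roughly two
```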
Josh applied for a $106,000 grant over a year. Because we don’t yet know whether this project will be successful longer-term and the requested grant was relatively large, the LEF funding committee decided to make the grant in two installments: an initial $53,000 grant, and another $53,000 subject to a satisfactory progress report after the first six months. Half of each installment ($26,500) will be covered by the LTFF.
Grant reports by Daniel Eth
Legal Priorities Project – $135,000
Hiring staff to carry out longtermist academic legal research and increase the operational capacity of the organization.
The Legal Priorities Project (LPP) applied for funding to hire Suzanne Van Arsdale and Renan Araújo to conduct academic legal research, and Alfredo Parra to perform operations work. All have previously been involved with the LPP, and Suzanne and Renan contributed to the LPP’s research agenda. I’m excited about this grant for reasons related to LPP as an organization, the specific hires they would use the grant for, and the proposed work of the new hires.
Overall, I think the LPP is exciting enough to warrant funding in general, at least for the time being. The goal of the LPP is to generate longtermist legal research that could subsequently be used to influence policymakers in a longtermist direction. Personally, while I don’t think this theory of change is crazy, I’m also not completely sold on it (I have perhaps a more cynical view of policy, in which principled academic arguments rarely cause policies to be enacted, even if they are cited as justifications). I do, however, think this theory of change is plausible enough that it’s at least worth trying (if done well), especially since the indirect effects (discussed below) also seem most likely positive.
Insofar as the LPP’s theory of change makes sense, I think they’re likely to go about it well. Since they’re a younger organization, they do not yet have a large body of output. Their main piece of output so far is their research agenda. As part of my evaluation, I skimmed the agenda in full and read a couple of sections more closely (including the sections on longtermism and biorisk), and I found the work to be at a high level of quality. More specifically, I thought it was well written (thorough without being verbose, with concepts expressed clearly – especially for an audience that might not be familiar with longtermism), demonstrated good understanding of longtermism and related concerns (e.g., infohazards), and presented “weird” longtermist concepts in a manner unlikely to be off-putting to those who might be put off by much of the rhetoric around such concepts elsewhere. This work makes me hopeful that future work will also be high quality.
Additionally, the LPP team includes many individuals with stellar academic credentials. I’m usually skeptical of leaning too heavily on academic credentials in making assessments (I tend to consider them only one of many reasonable signals for intelligence and competence), but I’d imagine that such credentials are particularly important here given LPP’s theory of change. My sense is that policymakers and other legal scholars are much more inclined to take academic ideas seriously if they’re promoted by those with prestigious credentials.
Furthermore, I think it often makes sense to fund newer organizations without long track records if there is a plausible argument in favor (and no apparent large downside risk). The upside of such grants can be large, and the wasted effort and resources if they don’t work out will probably be only a minor downside. I think such arguments are also stronger for organizations such as the LPP that are operating in newer areas, or with a theory of change different than what has come before, as their progress (or lack thereof) provides valuable information about not only the organization, but also potentially the whole area.
In addition to the direct effects from the LPP pursuing its theory of change, I’d imagine the indirect effects here would also most likely be positive. Given my impression of the LPP’s ability to promote longtermist ideas faithfully and in a non-off-putting manner, and additionally the fact that it would be lending academic credibility to such ideas, my sense is that the LPP would, if anything, be good for the reputation of longtermism more broadly. Additionally, the LPP might allow for those involved to gain career capital.
Regarding the specific hires, I am also enthusiastic – due to their past work and educational credentials. Suzanne was the main author of the “Synthetic Biology and Biorisk” section of the LPP’s research agenda, and Renan was a co-author of the “Longtermism” section, both of which impressed me. Additionally, both have impressive academic credentials (Suzanne has a J.D. from Harvard, Renan has an M.Sc. from LSE), which I believe is valuable for reasons I outlined above. While I have less of a sense of the quality of Alfredo’s work than that of Suzanne and Renan (and, as a researcher myself, I trust my judgment less on questions related to operations than to research), it’s a positive signal that Alfredo has a Ph.D. from TU Munich and previously worked for four years in operations at EAF/CLR, and the fact that he has been involved with the LPP so far and they’re interested in hiring him is a further signal. I also think that operations work is important, and I am fairly inclined to defer to (seemingly competent) organizations when they claim a particular individual that they are familiar with would be a good operations hire.
Finally, I support this grant because I support the proposed work for the hires. According to the application, Suzanne will work on the topic of dual-use research, and Renan’s work will focus on protection of future generations via constitutional law. Both of these areas strike me as sensible areas of research for academic legal work. While Suzanne’s topic does contain potential downsides (such as infohazards) if handled poorly, her section within the research agenda on “Synthetic Biology and Biorisk” contains a subsection on “Information Hazards” which shows she is at least aware of this risk, and the section as a whole demonstrates a nuanced level of thinking that makes me less worried about these potential downsides (and I further update against being worried here given that the LPP as an institution strikes me as presenting an environment where such issues are handled well). Alfredo’s proposed work includes setting up “robust internal systems”. While vague, this work also seems valuable for the organization.
Sam Chorlton – $37,500
Democratizing analysis of nanopore metagenomic sequencing data.
Sam applied for funding for his online microbiological sequencing-analysis startup BugSeq to hire a bioinformatician for their effort to advance metagenomic sequencing (i.e., sequencing of genetic material recovered from environmental samples, which could help with early detection of novel pathogens).
My general impression of metagenomic sequencing is positive, as being able to better screen the environment for pathogens has clear defense applications and (at least as far as I can tell) lacks clear offense applications, and thus seems likely to advance biodefense over bio-offense. While I do not have a biosecurity background myself, I asked a well-respected biorisk person within the EA community about their opinion of metagenomic sequencing, and they thought it was generally valuable. Additionally, Kevin Esvelt (another well-respected biorisk expert within EA, and an MIT professor) voiced support for metagenomic sequencing in his EAG Virtual 2020 talk. One particular aspect of BugSeq’s approach that excites me is their work on improved detection of mutations in viruses, which could plausibly help detect the presence of genetic engineering; I worry more about engineered pandemics than natural ones (from an x-risk perspective).
Given my support for BugSeq’s goals, my support for the grant depended on my evaluation of BugSeq as an organization. In the application, Sam provided confidential information about organizations for which they’ve previously conducted analyses, as well as those with which they are currently working, and this information made me feel more confident in BugSeq’s competence and legitimacy. An additional factor in favor of BugSeq is that they have received a conditional matching grant of $37,500 from the Canadian Government’s IRAP (implying not just competence on their part, but also that our grant would plausibly be doubled). Overall, while I’m still unsure about how much impact to expect from BugSeq, they passed my bar for funding, and I strongly expect that further investigation wouldn’t change that.
Tegan McCaslin – $80,401
Continuing research projects in AI forecasting and AI strategy.
Tegan applied for funding to continue independent research relevant for AI strategy and AI timelines. The LTFF has twice before funded Tegan to perform independent research. I supported this grant primarily due to Tegan’s research output from these previous two grants, and despite some skepticism about an independent-research environment being the best option for Tegan.
Tegan’s main outputs from her previous grants are a report comparing primate and bird brain architectures in terms of cognition, and a draft of a report that both compares the efficiency of biological evolution vs. deep learning in terms of improving cognitive abilities and sheds light on whether there is a common ordering between biological evolution and deep learning in terms of which cognitive tasks are more difficult to learn. I had previously read the former report, and I skimmed the latter for this evaluation.
My overall evaluation of these two outputs is as follows. These research questions are very ambitious (in a good way), and would be valuable to learn about. I think these questions are also in areas that are unlikely to see significant work from academia; both projects require making somewhat fuzzy or subjective judgments with very limited data and generalizing from there, and my impression is that many academics would dismiss such work as too speculative or unscientific. Additionally, the interdisciplinary nature of this work means there are fewer individuals in academia who would be well-positioned to tackle such research. Furthermore, work along these lines is neglected within the EA community. Despite the fact that biological evolution is the only process we know of that has produced human-level intelligence, there has been very little work within EA to study evolution in hopes of gaining insights about transformative AI. That Tegan’s research focuses on questions that are important and neglected makes me more optimistic about the value of the sorts of projects she will pursue, as well as her research judgement.
Regarding research methodology, my sense is Tegan is going about this in a mostly reasonable way. My inside view is that she’s handling the fuzzy, subjective nature of this research well, her evidence largely supports her conclusions, and she appreciates the limitations of her research. The biggest weakness of both projects, in my mind, is that they suffer from very few data points. Admittedly, both projects are in areas where not much data seems to exist (and that which does exist is fuzzy and may be difficult to use). But the fact remains that with few data points, strong conclusions cannot be drawn, and the research itself ends up less informative than it would be with more data. While finding such data may be difficult, I have a reasonably strong prior that at least some further relevant data exists and that the benefits to the research from finding such data would outweigh the search costs.
In light of all this, I would like to see more research along the lines that Tegan has performed previously, and I am in favor of her continuing to pursue such research. Having said that, I think there are some potential downsides to pursuing long-term independent research. Since this will be the third grant that the LTFF will give to Tegan for independent research, I think it’s worth considering these potential downsides, as well as potential mitigation techniques. (I also think this is a valuable exercise for other long-term independent researchers, which is part of my motivation for spelling out these considerations below.)
The largest pitfall in my mind for long-term independent researchers is one’s research becoming detached from the actual concerns of a field and thereby producing negligible value. Tegan seems to have avoided this pitfall so far, thanks to her research judgment and understanding of the relevant areas, and I see no evidence that she’s headed towards it in the future.
Another potential pitfall of independent research is a general lack of feedback loops, both for specific research projects and for the individual’s research skills. One way that independent researchers may be able to produce stronger feedback loops for their work is by sharing more intermediate work. While Tegan has shared some of her intermediate work with (and received feedback from) some senior longtermist researchers, I think she would probably benefit from sharing intermediate work more broadly, such as on the EA forum.
Finally, independent research can struggle to get as much traction as other work (keeping quality constant), as it’s less likely to be connected to organizations or networks where it will naturally be passed around. My sense is that Tegan’s research hasn’t gotten as much attention as it “deserves” given its level of quality, and that many who would find value in the research aren’t aware of it. Fixing such a dynamic generally requires a more active promotion strategy from the researcher. Again, I think posting more intermediate work could help here, as it would create more instances where others see the work, learn about what the researcher is working on, and perhaps even offer feedback.
Grant reports by Evan Hubinger
Anonymous – $33,000
Working on AI safety research, emphasizing AI learning from human preferences.
This grant, to an AI alignment researcher who wished to remain anonymous, will support their work on safe exploration, learning from human preferences, and robustness to distributional shift.
I’m only moderately excited about this specific project. It partly focuses on out-of-distribution detection, which I think is likely to be useful for helping with a lot of proxy pseudo-alignment issues. However, since I think the project overall is not that exciting, this grant is somewhat speculative.
That being said, the applicant will be doing this work in close collaboration with others from a large, established AI safety research organization that we are quite positive on and that the applicant previously did some work at, which significantly increases my opinion of the project. I think that the applicant’s continuing to do AI safety research with others at this organization is likely to substantially improve their chances of becoming a high-quality AI safety researcher in the future.
Finally, we did not take the decision to make this grant anonymous lightly. We are obviously willing to make anonymous grants, but only if we believe that the reasoning presented for anonymity by the applicant is sufficiently compelling. We believe this is true for this grant.
For additional accountability, we asked Daniel Ziegler of OpenAI, who is not part of the LTFF, to look at this grant. He said he thought it looked “pretty solid.”
Finally, though the grantee is anonymous, we can say that there were no conflicts of interest in evaluating this grant.
Center for Human-Compatible AI – $48,000
Hiring research engineers to support CHAI’s technical research projects.
Recusal note: Adam Gleave did not participate in the voting or final discussion around this grant.
This grant is to support Cody Wild and Steven Wang in their work assisting CHAI as research engineers, funded through BERI.
Overall, I have a very high opinion of CHAI’s ability to produce good alignment researchers—Rohin Shah, Adam Gleave, Daniel Filan, Michael Dennis, etc.—and I think it would be very unfortunate if those researchers had to spend a lot of their time doing non-alignment-relevant engineering work. Thus, I think there is a very strong case for making high-quality research engineers available to help CHAI students run ML experiments.
However, I have found that in many software engineering projects, a bad engineer can be worse than no engineer. That being said, I think this is significantly less true when research engineers work mostly independently, as they would here, since in those cases there's less risk of bad engineers creating code debt on a central codebase. Furthermore, both Cody and Steven have already been working with CHAI doing exactly this sort of work; when we spoke to Adam Gleave early in the evaluation process, he seemed to view their work positively and find it quite helpful. Thus, the risk of this grant hurting rather than helping CHAI researchers seems very minimal, and the case for it seems quite strong overall, given our general excitement about CHAI.
David Reber – $3,273
Researching empirical and theoretical extensions of Cohen & Hutter’s pessimistic/conservative RL agent.
David applied for funding for technical AI safety research. He would like to work with Michael Cohen to build an empirical demonstration of the conservative agent detailed in Cohen et al.’s “Pessimism About Unknown Unknowns Inspires Conservatism.” David is planning on working on this project at the AI Safety Camp.
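For intuition, here is a toy sketch of the pessimism idea the project builds on: score each action by its worst-case value across an ensemble of plausible models, and defer to a mentor when even that worst case looks unattractive. This is my own simplified illustration with made-up names and toy numbers, not the actual construction or guarantees from Cohen et al.'s paper.

```python
def pessimistic_action(actions, value_models, mentor_value, defer=None):
    """Toy pessimistic decision rule: evaluate each action by its worst-case
    estimated value across an ensemble of plausible models, and defer to the
    mentor whenever even the best worst-case estimate falls short of what the
    mentor is expected to achieve."""
    worst_case = {a: min(model(a) for model in value_models) for a in actions}
    best_action = max(worst_case, key=worst_case.get)
    if worst_case[best_action] < mentor_value:
        return defer  # e.g., ask the mentor rather than act autonomously
    return best_action

# Three hypothetical value models that agree about "safe_action" but disagree
# wildly about "risky_action"; pessimism picks the action the models agree on.
models = [
    lambda a: {"safe_action": 0.60, "risky_action": 0.9}[a],
    lambda a: {"safe_action": 0.55, "risky_action": -1.0}[a],
    lambda a: {"safe_action": 0.62, "risky_action": 0.1}[a],
]
print(pessimistic_action(["safe_action", "risky_action"], models, mentor_value=0.5))
# -> 'safe_action'
```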
On my inside view, I have mixed feelings about creating an empirical demonstration of Cohen et al.’s paper. I suspect that the guarantees surrounding the agent described in that paper are likely to break in a fundamental way when applied to deep learning, due to our inability to really constrain what sorts of agents will be produced by a deep learning setup just by modifying the training setup, environment/dataset, and loss function—see “Risks from Learned Optimization in Advanced Machine Learning Systems.” That being said, I think Cohen et al.’s work does have real value to the extent that it gives us a better theoretical understanding of the space of possible agent designs, which can hopefully eventually help us figure out how to construct training processes to be able to train such agents.
On the whole, I see this as a pretty speculative grant. That being said, there are a number of reasons that I think it is still worth funding.
First, Michael Cohen has a clear and demonstrated track record of producing useful AI safety research, and I think it's important to give researchers with a strong prior track record a sort of tenure where we are willing to support their work even if we don't find it inside-view compelling, so that researchers feel comfortable working on whatever new ideas are most exciting to them. Of course, this grant is to support David rather than Michael, but given that David is going to be working directly with Michael—and Michael, when I talked with him, seemed quite excited about this—I think the same reasoning still applies.
Second, when I talked with Michael about David's work, he indicated that David was more excited about the theoretical aspects of Michael's work, and would be likely to do more theoretical work in the future. Thus, I expect that this project will have significant educational value for David and hopefully will enable him to do more AI safety work in the future—such as theoretical work with Michael—that I think is more exciting.
Third, though David initially applied for more funding from us, he lowered his requested amount after he received funding from another source, which meant that the overall quantity of money being requested was quite small, and as such our bar for funding this grant was overall lower than for other similar but larger grants. This was not a very large factor in my thinking, however, as I don’t believe that the AI safety space is very funding-constrained; if we can find good opportunities for funding, it’s likely we’ll be able to raise the necessary money.
John Wentworth – $35,000
Developing tools to test the natural abstraction hypothesis.
John Wentworth is an independent AI safety researcher who has published a large number of articles on the AI Alignment Forum, primarily focusing on agent foundations and specifically the problem of understanding abstractions.
John’s current work, which he applied for the grant to work on, is in testing what John calls the “natural abstraction hypothesis.” This work builds directly on my all-time favorite post of John’s, his “Alignment By Default,” which makes the case that there is a non-negligible chance that the abstractions/proxies that humans use are natural enough properties of the world that any trained model would likely use similar abstractions/proxies as well, making such a model aligned effectively “by default”.
According to my inside-view model of AI safety, I think this work is very exciting. I think that understanding abstractions in general is likely to be quite helpful for being able to better understand how models work internally. In particular, I think that the natural abstraction hypothesis is a very exciting thing to try and understand, in that I expect doing so to give us a good deal of information about how models are likely to use abstractions. Additionally, the truth or falsity of the general alignment by default scenario is highly relevant to general AI safety strategy. Though I don’t expect John’s analysis to actually update me that much on this specific question, I do think the relevance suggests that his work is pointed in the right direction.
Regardless, I would have supported funding John even if I didn't believe his current work was very inside-view exciting, simply because I think John has done a lot of good work in the past—e.g. his original "Alignment By Default" post, or any number of other posts of his that I've read and thought were quite good—and I think it's important to give researchers who've demonstrated the ability to do good work in the past a sort of tenure, so they feel comfortable working on things that they think are exciting even if others do not. Additionally, I think the outside-view case for John is quite strong: Scott Garrabrant, Abram Demski, and I are all very positive on John's work and excited about it continuing, and MIRI researchers seem like a good reference class for judging the quality of agent foundations work.
Marc-Everin Carauleanu – $2,491
Writing a paper/blogpost on cognitive and evolutionary insights for AI alignment.
Marc’s project is to attempt to understand the evolutionary development of psychological altruism in humans—i.e. the extent to which people intrinsically value others—and understand what sorts of evolutionary pressures led to such a development.
Marc was pretty unknown to all of us when he applied and didn’t seem to have much of a prior track record of AI safety research. Thus, this grant is somewhat speculative. That being said, we decided to fund Marc for a number of reasons.
First, I think Marc’s proposed project is very inside-view exciting, and demonstrates a good sense of research taste that I think is likely to be indicative of Marc potentially being a good researcher. Specifically, evolution is the only real example we have of a non-human-level optimization process producing a human-level optimizer, which I think makes it very important to learn about. Furthermore, understanding the forces that led to the development of altruism in particular is something that is likely to be very relevant if we want to figure out how to make alignment work in a multi-agent safety setting.
Second, after talking with Marc, and having had some experience with Bogdan-Ionut Cirstea, with whom Marc will be working, it seemed to me like both of them were very longtermism-focused, smart, and at least worth giving the chance to try doing independent AI safety research.
Third, the small amount of money requested for this grant meant that our bar for funding was lower than for other similar but larger grants. This was not a very large factor in my thinking, however, as I don’t believe that the AI safety space is overall very funding-constrained—such that if we can find good opportunities for funding, it’s likely we’ll be able to raise the necessary money.
Grant reports by Oliver Habryka
AI Safety Camp – $85,000
Running a virtual and physical camp where selected applicants test their fit for AI safety research.
We’ve made multiple grants to the AI Safety Camp in the past. From the April 2019 grant report:
I’ve talked with various participants of past AI Safety camps and heard broadly good things across the board. I also generally have a positive impression of the people involved, though I don’t know any of the organizers very well.
The material and testimonials that I’ve seen so far suggest that the camp successfully points participants towards a technical approach to AI Alignment, focusing on rigorous reasoning and clear explanations, which seems good to me.
I am not really sure whether I’ve observed significant positive outcomes of camps in past years, though this might just be because I am less connected to the European community these days.
I also have a sense that there is a lack of opportunities for people in Europe to productively work on AI Alignment related problems, and so I am particularly interested in investing in infrastructure and events there. This does however make this a higher-risk grant, since I think this means this event and the people surrounding it might become the main location for AI Alignment in Europe, and if the quality of the event and the people surrounding it isn’t high enough, this might cause long-term problems for the AI Alignment community in Europe.
Concerns
I think organizing long in-person events is hard, and conflict can easily have outsized negative effects. The reviews that I read from past years suggest that interpersonal conflict negatively affected many participants. Learning how to deal with conflict like this is difficult. The organizers seem to have considered this and thought a lot about it, but the most likely way I expect this grant to have large negative consequences is still if there is some kind of conflict at the camp that results in more serious problems.
I think it’s inevitable that some people won’t get along with organizers or other participants at the camp for cultural reasons. If that happens, I think it’s important for these people to have some other way of getting connected to people working on AI Alignment. I don’t know the best way to arrange this, but I would want the organizers to think about ways to achieve it.
[...]
I would want to engage with the organizers a fair bit more before recommending a renewal of this grant, but I am happy about the project as a space for Europeans to get engaged with alignment ideas and work on them for a week together with other technical and engaged people.
Broadly, the effects of the camp seem very likely to be positive, while the (financial) cost of the camp seems small compared to the expected size of the impact. This makes me relatively confident that this grant is a good bet.
When we next funded them, I said:
This grant is for the AI Safety Camp, to which we made a grant in the last round. Of the grants I recommended this round, I am most uncertain about this one. The primary reason is that I have not received much evidence about the performance of either of the last two camps [1], and I assign at least some probability that the camps are not facilitating very much good work. (This is mostly because I have low expectations for the quality of most work of this kind and haven’t looked closely enough at the camp to override these — not because I have positive evidence that they produce low-quality work.)
My biggest concern is that the camps do not provide a sufficient level of feedback and mentorship for the attendees. When I try to predict how well I’d expect a research retreat like the AI Safety Camp to go, much of the impact hinges on putting attendees into contact with more experienced researchers and having a good mentoring setup. Some of the problems I have with the output from the AI Safety Camp seem like they could be explained by a lack of mentorship.
From the evidence I observe on their website, I see that the attendees of the second camp all produced an artifact of their research (e.g. an academic writeup or code repository). I think this is a very positive sign. That said, it doesn't look like any alignment researchers have commented on any of this work (this may in part have been because most of it was presented in formats that require a lot of time to engage with, such as GitHub repositories), so I'm not sure the output actually led to the participants getting any feedback on their research directions, which is one of the most important things for people new to the field.
During this grant round, I spent additional time reviewing and evaluating the AI Safety Camp application, which seemed important given that we are the camp’s most central and reliable funder.
To evaluate the camp, I sent out a follow-up survey to a subset of past participants of the AI Safety camp, asking them some questions about how they benefited from the camp. I also spent some time talking to alumni of the camp who have since done promising work.
Overall, my concern above about mentorship still seems well-placed, and I continue to be concerned about the lack of mentorship infrastructure at the event, which, as far as I can tell, doesn’t seem to have improved very much.
However, some alumni of the camp reported very substantial positive benefits from attending the camp, while none of them reported noticing any substantial harmful consequences. And as far as I can tell, all alumni I reached out to thought that the camp was, at worst, only a slightly less valuable use of their time than what they would have done instead, so the downside risk seems relatively limited.
In addition to that, I also came to believe that the need for social events and workshops like this is greater than I previously thought, and that they are in high demand among people new to the AI Alignment field. I think there is enough demand for multiple programs like this one, which reduces the grant’s downside risk, since it means that AI Safety Camp is not substantially crowding out other similar camps. There also don’t seem to be many similar events to AI Safety Camp right now, which suggests that a better camp would not happen naturally, and makes it seem like a bad idea to further reduce the supply by not funding the camp.
Alexander Turner – $30,000
Formalizing the side effect avoidance problem.
Alex is planning to continue, and potentially finish, formalizing the work on impact measures that he has been doing during his PhD over the past few years. We've given two grants to Alex in the past:
April 2019 – $30,000: Building towards a “Limited Agent Foundations” thesis on mild optimization and corrigibility
September 2020 – $30,000: Understanding when and why proposed AI designs seek power over their environment.
Since then, Alex has continued to produce research that seems pretty good to me, and has also helped other researchers who seem promising find traction in the field of AI alignment. I’ve also received references from multiple other researchers who have found his work valuable.
Overall, I didn’t investigate this grant in very much additional detail this round, since I had already evaluated his last two applications in much more detail, and it seems the value proposition for this grant is very similar in nature. Here are some of the most relevant quotes from past rounds.
From the April 2019 report:
I’m excited about this because:
Alex’s approach to finding personal traction in the domain of AI Alignment is one that I would want many other people to follow. On LessWrong, he read and reviewed a large number of math textbooks that are useful for thinking about the alignment problem, and sought public input and feedback on what things to study and read early on in the process.
He wasn’t intimidated by the complexity of the problem, but started thinking independently about potential solutions to important sub-problems long before he had “comprehensively” studied the mathematical background that is commonly cited as being the foundation of AI Alignment.
He wrote up his thoughts and hypotheses in a clear way, sought feedback on them early, and ended up making a set of novel contributions to an interesting sub-field of AI Alignment quite quickly (in the form of his work on impact measures, on which he recently collaborated with the DeepMind AI Safety team)
Potential concerns
These intuitions, however, are a bit in conflict with some of the concrete research that Alex has actually produced. My inside views on AI Alignment make me think that work on impact measures is very unlikely to result in much concrete progress on what I perceive to be core AI Alignment problems, and I have talked to a variety of other researchers in the field who share that assessment. I think it’s important that this grant not be viewed as an endorsement of the concrete research direction that Alex is pursuing, but only as an endorsement of the higher-level process that he has been using while doing that research.
As such, I think it was a necessary component of this grant that I have talked to other people in AI Alignment whose judgment I trust, who do seem excited about Alex’s work on impact measures. I think I would not have recommended this grant, or at least this large of a grant amount, without their endorsement. I think in that case I would have been worried about a risk of diverting attention from what I think are more promising approaches to AI Alignment, and a potential dilution of the field by introducing a set of (to me) somewhat dubious philosophical assumptions.
From the September 2020 report:
I’ve been following Alex’s work closely since [the 2019 grant round], and overall have been quite happy with its quality. I still have high-level concerns about his approach, but have over time become more convinced that Alex is aware of some of the philosophical problems that work on impact measures seems to run into, and so am more confident that he will navigate the difficulties of this space correctly. His work also updated me on the tractability of impact-measure approaches, and though I am still skeptical, I am substantially more open to interesting insights coming out of an analysis of that space than I was before. (I think it is generally more valuable to pursue a promising approach that many people are skeptical about, rather than one already known to be good, because the former is much less likely to be replaceable).
I’ve also continued to get positive feedback from others in the field of AI alignment about Alex’s work, and have had multiple conversations with people who thought it made a difference to their thinking on AI alignment.
One other thing that has excited me about Alex’s work is his pedagogical approach to his insights. Researchers frequently produce ideas without paying attention to how understandable those ideas are to other people, and enshrine formulations that end up being clunky, unintuitive or unwieldy, as well as explanations that aren’t actually very good at explaining. Over time, this poor communication often results in substantial research debt. Alex, on the other hand, has put large amounts of effort into explaining his ideas clearly and in an approachable way, with his “Reframing Impact” sequence on the AI Alignment Forum.
I have not made any substantial updates since September, so the above still summarizes most of my perspective on this.
David Manheim – $80,000
Building understanding of the structure of risks from AI to inform prioritization.
Recusal note: Daniel Eth and Ozzie Gooen did not participate in the voting or final discussion around this grant.
We made a grant to David Manheim in the August 2019 round. In that grant report, I wrote:
However, I am excited about this grant, because I have a good amount of trust in David’s judgment. To be more specific, he has a track record of identifying important ideas and institutions and then working on/with them. Some concrete examples include:
Wrote up a paper on Goodhart’s Law with Scott Garrabrant (after seeing Scott’s very terse post on it)
Works with the biorisk teams at FHI and OpenPhil
Completed his PhD in public policy and decision theory at the RAND Corporation, which is an unusually innovative institution (e.g. this study);
Writes interesting comments and blog posts on the internet (e.g. LessWrong)
Has offered mentoring in his fields of expertise to other people working on, or preparing to work on, projects in the x-risk space; I've heard positive feedback from his mentees
Another major factor for me is the degree to which David shares his thinking openly and transparently on the internet, and participates in public discourse, so that other people interested in these topics can engage with his ideas. (He’s also a superforecaster, which I think is predictive of broadly good judgment.) If David didn’t have this track record of public discourse, I likely wouldn’t be recommending this grant, and if he suddenly stopped participating, I’d be fairly hesitant to recommend such a grant in the future.
As I said, I’m not excited about the specific project he is proposing, but have trust in his sense of which projects might be good to work on, and I have emphasized to him that I think he should feel comfortable working on the projects he thinks are best. I strongly prefer a world where David has the freedom to work on the projects he judges to be most valuable, compared to the world where he has to take unrelated jobs (e.g. teaching at university).
Since then, I haven’t received any evidence that contradicts this perspective, so I continue to think that it’s valuable for David to be able to work on projects he thinks are valuable, especially if he plans to write up his findings and thoughts publicly.
However, the scope of this project is broader than just David’s personal work, so it seems worthwhile to further explain my take on the broader project. From the application:
This grant will fund a group of researchers who are collaborating on a project working to improve understanding of the structure of risk from AI, with the aim of making results publicly available to AI safety researchers and academia. The project started as a collaboration between researchers from several institutions including FHI and APL, so while several members of the group already have funding, many do not. APL has declined funding for year 2, so we are going to take the project forward with a slightly different focus.
The project goals for year 2 (2020-2021) are first, to refine the model for uncertainties that drive the different risks which emerge from AGI and/or ASI and to write up our preliminary understanding in a series of posts on the Alignment Forum. Second, to continue developing and perform a series of elicitations to understand what AI safety experts think about both the specific uncertainties, and the structure of how these uncertainties interrelate to create risk, and then to publicise those results along with an interactive probabilistic model. In the process we will refine some inaccessible or unclear arguments so that they are suitable for academic treatment, and hopefully improve the quality of communication around AGI and ASI risk scenarios. The people who will receive the funding are being given time to think through and write more about AI safety, and we will continue to encourage them to write papers and LessWrong or Alignment Forum posts about their insights or questions.
I am co-leading the project with Daniel Eth, with assistance from Sammy Martin. The group of collaborators are involved in AI safety research already, but most have other work and/or do not have funding that allows them to focus on AI safety. The individuals are David Manheim (myself), Issa Rice, Ben Cottier, Ross Greutzmacher, Jeremy Perret, Alexis Carlier, and Sammy Martin. Daniel Eth and Aryeh Englander are also working extensively on the project, but they are funded at FHI as a Research Scholar and at APL respectively. We are also consulting with other groups, including AI Impacts and GCRI. The direct near-term impact of the project is expected to be several-fold. We believe, and have been told by others, that the direct impact of the work in clarifying and cataloguing which uncertainties exist and how they are related is of high value, and in general this issue is not being tackled by individual researchers outside our group. Our team members agree that there is a real need for a formalization and an organization of the arguments around AGI and ASI risk.
I’ve observed the work of a good number of the people involved in this project, and the group seems pretty promising to me, though I don’t know all of the people involved.
I do have some uncertainty about the value of this project. In particular, it feels quite high-level and vague to me — which isn’t necessarily bad, but I feel particularly hesitant about relatively unfocused proposals for teams as large as this. My current best guess as to the outcome of this grant is that a number of people who seem like promising researchers have more resources and time available to think in a relatively unconstrained way about AI risk.
Logan Strohl – $80,000
Developing and sharing an investigative method to improve traction in pre-theoretic fields.
Logan has previously worked with the Center for Applied Rationality, but is now running an independent project aiming to help researchers in AI alignment and related pre-paradigmatic fields find more traction on philosophically confusing questions. I find work in this space potentially very valuable, but also very high-variance, so I assign high probability to this project not producing very much value. However, I think there’s at least some chance that it helps a substantial number of future AI alignment researchers in a way that few other interventions are capable of.
My general experience of talking to people who are trying to think about the long-term future (and AI alignment in particular) is that they often find it very difficult to productively make progress on almost any of the open problems in the field, with the problems often seeming far too big and ill-defined.
Many well-established research fields seem to have had these problems in their early stages. I think that figuring out how to productively gain traction on these kinds of ill-defined questions is one of the key bottlenecks for thinking about the long-term future. I think this is substantially more of a bottleneck than, for example, getting more people involved in those fields: my current sense is that marginal researchers currently struggle to find any area where they can meaningfully contribute, and most people in the field have trouble productively integrating others' contributions into their own thinking.
The basic approach Logan is aiming for seems to be a descendant of some of Eugene Gendlin's "Focusing" material, which I know has been quite useful for many people working in AI alignment and related research fields, based on many conversations I've had with researchers in the last few years. It seems to frequently come up as the single most useful individual thinking technique, next to the ability to make rough quantitative back-of-the-envelope estimates for many domains relevant to events and phenomena on a global scale. This makes me more optimistic than I would usually be about developing techniques in this space.
The approach itself still seems to be in very early stages, with only the very first post of the planned sequence of posts being available here. The current tentative name for it is “naturalism”:
“Naturalism” (an allusion to 19th century naturalists) is the name I use for a specific way of engaging curiously with the world. It’s a method of inquiry that uses patient observation and original seeing to build models that were previously unthinkable. It takes longer to learn than the spontaneous curiosity of pure exploration, but this specialized method helps you to make deliberate progress in areas where your existing concepts are leading you astray.
Logan is planning to work closely with a small number of interested researchers, which I think is generally the right approach for work like this, and is planning to err on the side of working with people in practical contexts instead of writing long blog posts full of theory. Armchair philosophizing about how thinking is supposed to work can sometimes be useful (particularly when combined with mathematical arguments — a combination which sparked, for example, the Bayesian theory of cognition). But most of the time, it seems to me to produce suggestions that are only hypothetically interesting, but ultimately ungrounded and hard to use for anything in practice. On the margin, investigations that integrate people’s existing curiosities and problems seem better suited to making progress on this topic.
In evaluating this grant, we received a substantial number of highly positive references for Logan, both from people familiar with their historical work at CFAR and from people involved in the early stages of their "naturalism" program. I've worked a bit with Logan in the context of a few rationality-related workshops and have generally been impressed by their thinking. I've also found many of their blog posts quite valuable.
Summer Program on Applied Rationality and Cognition – $15,000
Supporting multiple SPARC project operations during 2021.
This is a relatively small grant to SPARC. I’ve written in the past about SPARC when evaluating our grant to CASPAR:
[...] SPARC or ESPR, two programs with somewhat similar goals that have been supported by other funders in the long-term future space. I currently think interventions in this space are quite valuable, and have been impressed with the impact of SPARC; multiple very promising people in the long-term future space cite it as the key reason they became involved.
SPARC has been funded by Open Philanthropy for the past few years, and they applied for a small supplement to that funding, which seemed worthwhile to me.
Historically, SPARC seems to have been quite successful in causing some of the world’s most promising high-school students (e.g. multiple IMO gold medalists) to develop an interest in the long-term future, and to find traction on related problems. In surveys of top talent within the EA community, SPARC has performed well on measures of what got people involved. Informally, I’ve gotten a sense that SPARC seems to have been a critical factor in the involvement of a number of people I think are quite competent, and also seems to have substantially helped them to become that competent.
I do have some hesitations around the future of SPARC: the pandemic harmed their ability to operate over the past year, and I also have a vague sense that some parts of the program have gotten worse (possibly not surprisingly, given that relatively few of the original founders are still involved). But I haven’t had the time to engage with those concerns in more depth, and am not sure I will find the time to do so, given that this is a relatively small grant.
Grant reports by Ozzie Gooen
General thoughts
We considered a few “general purpose” grants to sizable organizations this round and did some thinking on the shared challenges these pose to us. Some factors I considered:
Evaluation work scales with project size, but there aren’t equivalent discounts for funding a fraction of a project. For example, a typical $20K project might take 10 hours to evaluate well enough for us to be comfortable funding, but a $4 million project could easily take over 60 hours — even if they only asked the LTFF for $20K. To be cost-effective, funders typically have rough goals of how much time they should spend for grants of a certain size.
Relatively small grants to large organizations do little to influence those organizations. Theoretically, funders like us make donations whenever the direct expected value passes a certain bar. An organization could take advantage of this by having a few projects we’d be extremely happy to fund but spending most of their budget on projects that aren’t a good fit for us. Our grants might still pass the bar for expected impact based on the few projects we like, but we’d have a hard time convincing the organization to do more LTFF-shaped projects (even if we think this would make them far more impactful).
Naomi Nederlof – $34,064
Running a six-week summer research program on global catastrophic risks for Swiss (under)graduate students.
Naomi Nederlof of EA Geneva is running a new longtermism-oriented summer research program for students in Switzerland. This grant will support up to 10 students for six weeks and help to provide them with stipends for this time. You can see early information about the program here.
The EA Geneva scene is interesting. EA Geneva has introduced a (to me) surprising number of notable researchers to longtermism. These people typically leave Geneva, in large part because most longtermist organizations are based elsewhere. EA Geneva has links to the Simon Institute and the Lausanne EA scene. Geneva is notable for being one of the top two United Nations hubs, so a lot of EA work in the area is oriented around the UN ecosystem.
This grant is in some ways a fairly safe one, but in other ways, it's fairly risky. It's safe because EA Geneva has a history of running in-person events without notable problems, and the modest size of this program seems very manageable for them. It's risky because this is the first time the program will be run, and the planning is still at a somewhat early stage. Some advisors have agreed to take part, but more still need to be found (if you're reading this and think you'd be a good fit, reach out!). More uncertain is the quality of the students. There are other similar summer programs hosted by SERI and other longtermist centers, and these accept applicants globally. It's unclear how many promising students both belong to universities in Switzerland and won't attend other competitive summer programs.
All that said, my impression is that an earlier FHI summer program with similar goals was both quite good and smaller than ideal (in fact, it's not running in 2021), and I'd expect similar outcomes for the SERI version. This raises my estimate of how useful an additional summer program will be, and makes me more optimistic about this grant. We'll hopefully learn a lot over the next few months as we find out which students go through EA Geneva's program and what they produce (these sorts of summer projects often don't seem directly useful to me, but hint at future work with more utility). If this sort of program could be successfully scaled, particularly by being replicated in other locations with strong EA scenes, that could be terrific.
Po-Shen Loh – $100,000, with an expected reimbursement of up to $100,000
Paying for R&D and communications for a new app to fight pandemics.
Po-Shen is the founder of NOVID, a contact-tracing-style application that forewarns individuals when someone within a certain number of steps of them in their in-person contact network reports an infection. This helps them know to take preventative measures before they are directly at risk. I suggest visiting the site to see images and read a more detailed explanation.
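To make the mechanism concrete, here is a minimal sketch of the kind of network-distance computation such an app relies on. This is an illustration only, with made-up names and a toy contact graph; NOVID's actual data model and privacy machinery are far more involved.

```python
from collections import deque

def infection_distance(contact_graph, user, reported_infected):
    """Breadth-first search: minimum number of contact 'hops' between a user
    and anyone who has reported an infection. Returns None if unreachable."""
    if user in reported_infected:
        return 0
    seen = {user}
    frontier = deque([(user, 0)])
    while frontier:
        person, dist = frontier.popleft()
        for contact in contact_graph.get(person, ()):
            if contact in seen:
                continue
            if contact in reported_infected:
                return dist + 1
            seen.add(contact)
            frontier.append((contact, dist + 1))
    return None

# Toy contact network: an infection reported by "dana" is 3 hops from "alice",
# so an app configured to warn within 3 hops would notify alice.
graph = {
    "alice": ["bob"],
    "bob": ["alice", "carol"],
    "carol": ["bob", "dana"],
    "dana": ["carol"],
}
print(infection_distance(graph, "alice", {"dana"}))  # -> 3
```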
At first, I was highly skeptical. Most COVID-related projects seem either not very tractable, not at all longtermist, or both. NOVID stood out early on because it had some unusually good references. We asked multiple researchers whose judgment we trust (several in biosecurity) about the project, and their responses were mostly either mildly or highly positive. You can see some recent public write-ups by Andrew Gelman and Tyler Cowen.
NOVID is available on both Apple and Android. Right now, both applications have around 230 ratings, with an average score of around 4.5/5. I started using it on my iPhone. I’ve found the interface simple and straightforward, though I was a bit uncomfortable with the number of permissions that it asked for. In my ~3 weeks of using it, it hasn’t shown any n-degree cases yet. I’ve been in Berkeley, so I assume that’s due to some combination of me staying at home a lot and the app having few users in the area.
NOVID is an unusual donation for the LTFF, as it has both short-term and long-term benefits. I personally believe the long-term benefits are considerable:
The use of NOVID during COVID-19 could give us valuable data, and ideally rigorous experiments, to reveal if this method will be useful in future pandemics.
Po-Shen genuinely seems to be focused on the value of NOVID for future pandemics. He has multiple longtermist collaborators. I expect that Po-Shen and his collaborators might do valuable work in future pandemics, so the skills and networks they build now could be useful later on.
There is a chance that NOVID makes a significant improvement to the handling of COVID-19. This might have positive long-term impacts of its own. One might think that vaccine deployment will completely solve the issue, but I think there are serious risks of this not being successful.
There are some reasons why we, as longtermist funders, might not want to fund this project. Reasons that stand out to me:
It's not clear whether the sorts of pandemics that could pose extinction threats in the future would be mitigated by this kind of technology.
NOVID’s special tech requires asking for more permissions than contact-tracing apps by Apple and Google, which are relatively minimal in both data collection and benefit. Perhaps this trade-off won’t be accepted, even with more evidence of benefit.
An organization might require ample technical experience and connections to get the necessary support and follow-up for a project like this. For example, to get many users, it could be necessary to sign contracts with governments or gain large amounts of government trust. The NOVID team is quite young and small for this kind of work. This is a common issue for startups, and it’s particularly challenging in cases of public health.
NOVID is subject to strong network effects. It might be very difficult for either NOVID or NOVID-inspired applications to get the critical mass necessary to become useful. Arguably, NOVID could have a much easier time here than traditional contact tracing apps (given its focus on personal safety rather than avoiding harm to others), but it will likely still be a challenge.
I've been in the startup scene for some time, and I look at NOVID much like other tech startups I'm familiar with. There are some significant pros and significant cons, and the chances of a large success are probably small. But in my personal opinion, the NOVID team and its product-market fit seem more promising than those of many other groups I've seen at a similar stage that went on to be very successful.
Some remaining questions one might have:
“If this application is valuable for preventing COVID-19, why haven’t traditional funders been more interested?”
"Isn't there already a lot of money going to COVID relief?"
The easy answer for me would be that NOVID is particularly useful for helping with future pandemics, which other funders probably care less about, so we would have an obvious advantage. The more frustrating, but seemingly true (from what I can tell), answer is more like: "The existing COVID funding system isn't that great, and there are some substantial gaps in it." My impression is that this application sits in a strange gap in the funding environment; hopefully, as it gets more traction, other non-longtermist funders will join in more.
All that said, while I found NOVID interesting, I thought it was overall a better fit for other funders. The funding needs were significantly larger than the LTFF maximum grant size, and the LTFF is more exclusively focused on long-term value than other funders are. Unfortunately, these other funders would take a few more months to make a decision, and NOVID needed a commitment sooner.
Therefore, this donation was structured as a sort of advance via the Survival and Flourishing Fund. The SFF will consider funding NOVID in the next few months. If they decide to fund NOVID for over $100,000, the next $100,000 they donate will instead be given back to the LTFF. For example, if they decide to donate $500,000, $100,000 of this will be given to the LTFF instead of NOVID. If they decide on $150,000, the LTFF will get $50,000. This way, NOVID could get a commitment and some funding quickly, but the SFF will likely pay the bill in the end.
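In code, my understanding of the reimbursement arrangement (using the figures above; the function name is mine) looks like this:

```python
def ltff_reimbursement(sff_total: int) -> int:
    """First $100k of SFF funding goes to NOVID, the next $100k comes back to
    the LTFF, and anything beyond that goes to NOVID again."""
    return min(max(sff_total - 100_000, 0), 100_000)

for total in (50_000, 150_000, 500_000):
    print(f"SFF grants ${total:,} -> LTFF reimbursed ${ltff_reimbursement(total):,}")
# SFF grants $50,000 -> LTFF reimbursed $0
# SFF grants $150,000 -> LTFF reimbursed $50,000
# SFF grants $500,000 -> LTFF reimbursed $100,000
```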
Stanford Existential Risks Initiative – $60,000
Providing general purpose support via the Berkeley Existential Risks Initiative.
The Stanford Existential Risks Initiative (SERI) is a new existential risk mitigation program at Stanford. They have 12 undergraduate student affiliates, plus what seems like a strong team of mentors and more senior affiliates.
They seem to do a lot of things. From their website: “Our current programs include a research fellowship for Stanford undergraduates, an annual conference, speaker events, discussion groups, and a Thinking Matters class (Preventing Human Extinction) taught annually by two of the initiative’s faculty advisors.”
This grant is to support them via BERI. I’ve personally had very good experiences with BERI and have heard positive reviews from others. BERI doesn’t just provide useful operations help; more importantly, they provide an alternative funding structure for a large class of things that universities tend not to fund. (It seems like the most prestigious universities sometimes have the most intensely restrictive funding policies, so this alternative could be a big deal.) I used BERI for help funding contractors for Foretold.io, and am confident I wouldn’t have been able to do this otherwise. (Note to those reading this: if you have an organization and would like help from BERI, I suggest reading this page and emailing them.)
My guess is that dollars given to BERI organizational grants that an organization has requested are somewhat more valuable than marginal dollars going directly to the corresponding organization (if this were not the case, we would just donate directly to the organizations). For one, BERI organizational grants typically fund part-time help for small but valuable operations tasks that cut horizontally across many of an organization's projects. I think "making sure that existing projects go more smoothly" is a rather safe use of funding.
This specific grant would be spent on a combination of:
Stipends and researcher support for approximately one year
Existential risk seed grants to graduate students
Support for a spring conference
Support for future courses on existential risks at Stanford
Miscellaneous funds and BERI core operations expenses
This was a fairly easy bet to make. The Stanford EA community has historically been successful, and the current community seems promising. I reached out to a few people close to the community and got glowing reviews. The program seems to have a lot of activity and senior longtermist mentors.
The main downsides for this decision are more about the structure of the LTFF and the longtermist funding landscape than about SERI. This grant was somewhat unusual in that SERI was already funded by Open Philanthropy, but multiple parties (OP, BERI) wanted them to have a more diverse set of funding sources. One key question for me was whether the LTFF would really provide the intended benefits of a diverse funding ecosystem. It would be easy for the LTFF to simply defer to Open Phil on which groups are worth funding, for example. But if we did, that would defeat the main benefits of funding diversification.
This donation seemed like a good enough bet, and the organization is still small enough, that the grant seems like a solid use of LTFF funding.
Joel Becker – $42,000
Conducting research into forecasting and pandemics during an economics Ph.D.
Joel Becker is a first-year Ph.D. student at NYU who is pursuing forecasting research. The grant will replace the salary he would otherwise earn from teaching over three years, allowing Joel to pursue research instead.
He intends to work on a few projects, including (text taken from the application):
Extend ‘external validity’ projects with Rajeev Dehejia, using expert forecasting to select treated units in development interventions.
Explore the accuracy of long-term forecasts in simulated environments (e.g. real history with blinded participants, games with different time horizons).
Explore trade-offs in forecast elicitation techniques (accuracy vs. cost of training, accuracy from full distributions vs. point estimates, returns to different accuracy rewards, etc.).
I think Joel could plausibly become a successful academic. He's been part of the Philanthropy Advisory Fellowship at Harvard and will soon join the Forethought Foundation's Global Priorities Fellowship, which gives me the impression that he's fairly committed to improving the long-term future.
Joel currently seems to be working on somewhat standard academic forecasting topics. This has the benefits of accessible advisors and easier integration with other academic work, but the downside is that current academic forecasting topics are likely different from what would be ideal for longtermists. My intuition is that much of the most valuable work on generic forecasting will be a combination of theoretical foundations and advanced software implementations, and the existing economics work is fairly separate from this. For example, much of the literature explores laboratory experimentation with what seem to me to be unrealistic and inhibited scenarios.
I’ve read some of Joel’s in-progress work, and he seems to get some things right (asking better questions than other academic work, and using more modern software), but this work is very early-stage.
The big question is what direction Joel goes after doing this research. I could see the research helping to build valuable credentials and skills. This could put him in a position to spearhead bold and/or longtermism-centric projects later on.
David Krueger ($200,000, with an expected reimbursement of up to $120,000): Computing resources and researcher salaries at a new deep learning + AI alignment research group at Cambridge.
David Manheim ($80,000): Building understanding of the structure of risks from AI to inform prioritization.
David Reber ($3,273): Researching empirical and theoretical extensions of Cohen & Hutter’s pessimistic/conservative RL agent.
Joel Becker ($42,000): Conducting research into forecasting and pandemics during an economics Ph.D.
John Wentworth ($35,000): Developing tools to test the natural abstraction hypothesis.
Legal Priorities Project ($135,000): Hiring staff to carry out longtermist academic legal research and increase the operational capacity of the organization.
Logan Strohl ($80,000): Developing and sharing an investigative method to improve traction in pre-theoretic fields.
Marc-Everin Carauleanu ($2,491): Writing a paper/blog post on cognitive and evolutionary insights for AI alignment.
Naomi Nederlof ($34,064): Running a six-week summer research program on global catastrophic risks for Swiss (under)graduate students.
Po-Shen Loh ($100,000, with an expected reimbursement of up to $100,000): Paying for R&D and communications for a new app to fight pandemics.
Rethink Priorities ($70,000): Researching global security, forecasting, and public communication.
Sam Chorlton ($37,500): Democratizing analysis of nanopore metagenomic sequencing data.
Stanford Existential Risks Initiative ($60,000): Providing general-purpose support via the Berkeley Existential Risks Initiative.
Sören Mindermann ($41,869): Doing a 3rd and 4th year of a PhD in machine learning, with a focus on AI forecasting.
Summer Program on Applied Rationality and Cognition ($15,000): Supporting multiple SPARC projects during 2021.
Tegan McCaslin ($80,401): Continuing research projects in AI forecasting, AI strategy and forecasting.
Toby Bonvoisin ($18,000): Doing the first part of a DPhil at Oxford in modelling viral pandemics.
Off-cycle grants:
Jess Whittlestone, Shahar Avin ($6,917): Ensuring that the lessons learnt from COVID-19 improve GCR prevention and mitigation in the future.
Longtermist Entrepreneurship Fellowship (see below): Giving seed funding to projects incubated by the Longtermist Entrepreneurship Fellowship Programme.
Grants made through the Longtermist Entrepreneurship Fellowship:
Josh Jacobson ($26,500, with $26,500 earmarked for potential use later): Research into physical-world interventions to improve productivity and safeguard life.
Grant reports
Note: Many of the grant reports below are very detailed. We encourage anyone who is considering applying to the fund but would prefer a less detailed report to apply anyway; we are very sympathetic to circumstances where a grantee might be uncomfortable with a highly detailed report on their work.
We run all of our payout reports by grantees, and we think carefully about what information to include to maximize transparency while respecting grantees’ preferences. If considerations around reporting make it difficult for us to fund a request, we are able to refer to private donors whose grants needn’t involve public reporting. We are also able to make anonymous grants, as we did this round.
Grant reports by Adam Gleave
Adam Shimi – $60,000
Independent research in AI alignment for a year, to help transition from theoretical CS to AI research.
Recusal note: Evan Hubinger did not participate in the voting or final discussion around this grant.
Adam applied to perform independent research on AI safety for 1 year in collaboration with Evan Hubinger at MIRI, whom he has been working with since mid-2020. Adam completed a PhD from IRIT (publications) in 2020 focused on distributed computing theory. Since then, he has written a number of posts on the Alignment Forum.
I particularly liked his Literature Review on Goal-Directedness, which I felt served as an excellent introduction to the area with a useful taxonomy of different motivations (or “intuitions”) for the work. This seems like a good example of research distillation, a category of work that is generally under-incentivized both by academia and by the major corporate AI labs. I’m not sure AI safety needs much in the way of distillation at this point (it’s still a fairly small field), but at the margin I think we’d likely benefit from more of it, and this need will only grow as the field scales. Given this, I’m generally excited to see good research distillation.
In addition to continuing his research distillation, Adam plans to work on formalizing goal-directedness. He seems well-prepared for this given his CS theory background and grasp of the existing space, as evidenced by his literature review above. I find this direction promising: slippery and imprecise language is common in AI safety, which often leads to miscommunication and confusion. I doubt that Adam or anyone else will find a universally agreed-upon formalism for goal-directedness, but attempts in this area can help to surface implicit assumptions, and I expect will eventually lead to a more coherent and precise taxonomy of different kinds of goal-directedness.
I tend to apply a fairly high bar to long-term independent research, since I think it is hard to be a productive researcher in isolation. There are several factors that make me more positive about this case:
Adam already has significant research experience from his PhD, making him more likely to be able to make progress with limited oversight.
Evan Hubinger is spending around 1 hour a week on informal mentorship, which I think helps to provide accountability and an independent perspective.
Adam is interested in working at an organization long-term, but has found finding a position particularly challenging this year owing to the pandemic.
Given this, it seems reasonable to support Adam for a year to continue his research while he searches for positions.
The main concern I have regarding this grant is that Adam’s previous work in distributed computing seems to have had a limited impact on the field, with only 6 citations as of the time of writing. While it’s entirely possible that Adam has better personal fit in AI safety, his track record in that field is brief enough that I’m not yet confident about this. However, there are other positive signs: his previous publications were all in reasonable conferences, including one in PODC, which I understand to be one of the top-tier venues in distributed computing. Unfortunately, I am not familiar enough with this area of CS to render my own judgment on the papers, so have instead relied primarily on my impressions of his more recent AI-focused work.
Amon Elders – $250,000
Doing a PhD in computer science with a focus on AI safety.
Amon applied for funding to support a PhD position in artificial intelligence working with Prof. Michael Osborne. Receiving funding was important to securing an offer, since there is limited PhD funding available for non-UK citizens in the UK after Brexit. Moreover, the funding sources that are available have application deadlines earlier in the year, and so waiting for these would have delayed the start of the PhD by a year.
Amon graduated from UCL with an MS in CS & ML (with distinction). His research experience includes a year-long RA position at the Italian Institute of Technology in Genoa with his UCL supervisors, which resulted in a third-author publication in AIES, and an MS Thesis on learning to rank. Amon also has relevant professional experience, having worked as an ML engineer at Spark Wave, and as an undergraduate co-founded the EA society at the University of Amsterdam.
Amon is particularly interested in working on robustness and distributional shift in his PhD, though he is also considering other topics. While his research direction is currently tentative, he is familiar with the key aspects of long-term AI safety and is interested in pursuing this direction further both during and after his PhD.
Michael Osborne seems like a good advisor given Amon’s interests. Michael has a deep background in Bayesian ML, which is relevant for robustness and distributional shift, and he also has an interest in the social impact of machine learning. Additionally, Michael Osborne is advising Michael Cohen, a PhD scholar at FHI. While Michael Cohen is pursuing a different concrete research agenda from Amon’s current interests, they have a similar high-level focus, so I expect being in the same group will lead to fruitful discussions.
I generally think there’s a strong case for funding talented people to pursue PhDs on relevant topics. For one, this is a great way of training relevant researchers: I think PhDs generally do a good job of teaching relevant research skills and are a valuable credential. In addition, there’s often scope to produce directly valuable research during a PhD, although the opportunity here varies between labs.
A big reason why I’m often hesitant to fund PhDs is adverse selection. There are many other funding sources for PhDs: government fellowships, university bursaries, private endowments, advisors’ research grants, etc. Most of the people we would want to fund can easily get funding from elsewhere, so the applications we see are disproportionately from people whom other funders have chosen to reject. However, Amon was simply ineligible for the majority of those funding sources, so this consideration is much weaker in his case. He ended up getting 5 PhD offers, including offers from Oxford, Cambridge, UCL, and Imperial.
The main concern I have with this grant is that Amon may end up working on topics that I do not believe will have an impact on the long-term future. This could be fine if he later pivots to focus more on AI safety post-PhD, but research directions can often be “sticky” (see similar discussion in the write-up for Sören Mindermann below). However, Amon does seem to currently be on a promising path, and he’s keeping up with AI safety research at other labs.
David Krueger – $200,000, with an expected reimbursement of up to $120,000
Computing resources and researcher salaries at a new deep learning + AI alignment research group at Cambridge.
David, a newly appointed AI faculty member in Cambridge’s CBL Lab, applied for funding to support PhD students and to purchase computational resources. David has a strong track record of working on AI safety research. He is a co-author on the ARCHES, Trustworthy AI Development, and Recursive Reward Modeling research agendas. He has also made several relevant technical contributions, including Hidden Incentives for Auto-Induced Distributional Shift, Out-of-Distribution Generalization via REx, and Active Reinforcement Learning. David has also helped with field-building by co-organizing the AI Safety Unconference held concurrently with NeurIPS.
I am generally excited about supporting academic labs that want to focus on AI safety research. Right now CHAI is the only academic lab with a critical mass of people working on technical AI safety. While there are a handful of other early-stage labs, I believe we still need many more if we are to seriously expand the field of AI safety in academia.
The most direct path for impact I see for additional safety-focused academic labs is allowing PhD students with a pre-existing interest in safety to work on related research topics. Right now, many PhD students end up in labs focused on unrelated topics. Indeed, I’ve even recommended that prospective students “give more weight to program quality than the ability to pursue topics you consider important.” While I stand by this advice, I hope that as safety-focused labs expand, students will no longer have to make this trade-off.
I also expect that students graduating from a safety-focused academic lab will be better prepared to perform safety research than if they had worked in another lab. I’d estimate that they will save somewhere between 6 months and a year of time that would otherwise need to be spent on literature review and clarifying their thinking on safety after graduation.
Perhaps the largest upside, though also one of the more uncertain ones, is an indirect field-building effect. Right now, a large fraction of people working on AI safety have been influenced by the philosophy of effective altruism (EA). While I am grateful for the support of the EA community on this topic, this also highlights a failure to tap into the much larger talent pool of (prospective) AI researchers who aren’t connected to that community. Academic labs can help with this by recruiting PhD students who are interested in, but have little prior exposure to, safety, and by shifting the thinking of other more senior researchers.
Despite this positivity, we do not want to indiscriminately fund academic labs. In particular, most of the best academic AI research happens in the top 20 or so universities. Outside of those universities, I would expect a significant drop-off in the caliber of PhD applicants, as well as the lab being in a worse position to influence the field. On this criterion, Cambridge seems like a good place for a new lab: it is one of the best universities in the world for Bayesian ML, and benefits from the presence of strong mathematics and CS departments. While I would be even more excited by a lab at the likes of Stanford or MIT, which have amongst the strongest AI groups right now, Cambridge is clearly still a good place to do AI research.
David intends to use our funding to support new PhD students and purchase computational resources. Both of these seem like reasonable uses of funding:
PhD students: I reviewed a list of prospective PhD students and saw several applicants who seem strong, and I would be excited to see them join David’s lab. While PhD students can often receive fellowships or scholarships from other sources, these opportunities are limited for international students in the UK. Furthermore, current applicants are not eligible for most of those funding sources this year, because David received his faculty offer after many funding deadlines had already passed.
Computational resources: The CBL lab that David is joining has limited in-house computational resources, with a ratio of around 2 GPUs per student. Moreover, the existing resources are fragmented across many small servers and workstations. I spoke to a current student at the CBL who confirmed that there is a shortage of computational resources, which both limits the scope of possible experiments and wastes time.
The main risk I see with this grant is that, since the funding is unrestricted, we might be negatively surprised by what it is spent on. In particular, it is conceivable that it could be used to fund a PhD student the LTFF would not have chosen to fund directly. However, unrestricted funding is also more valuable to David, allowing him to make timely offers to students who are faced with short deadlines from competing institutions. Moreover, David has a strong incentive to recruit good students, so it isn’t clear that the LTFF has a comparative advantage in evaluating his prospective students. Overall, these considerations seem to weigh heavily in favour of unrestricted funding, although this does make the outcome harder to predict.
It’s also possible that mentorship will cease to be a major bottleneck on the growth of the AI safety field in the near future. Senior safety researchers are graduating in increasing numbers, and many of them are interested in faculty positions. Moreover, my (by no means rigorous) impression is that the growth in PhD candidates is beginning to slow (though the number is still expanding). If mentorship becomes more plentiful in the future, David’s lab may be significantly less influential than expected. However, I think this grant (barely) clears the bar for funding even on the basis of its expected short-term impact alone (before mentorship “catches up” with the influx of junior research talent).
We thought the Survival and Flourishing Fund would also be likely to want to fund David, and we thought it would be fair to share the costs with them, so we arranged with them that the first $120,000 of any money they decide to give to David Krueger will be given back to us instead.
Sören Mindermann – $41,869
Doing a 3rd and 4th year of a PhD in machine learning, with a focus on AI forecasting.
We are providing financial support to Sören, a PhD student at Oxford advised by Yarin Gal, to improve his research productivity during his PhD. This is a renewal of our grant made in August 2019 at the start of his PhD. Sören anticipates spending money on items such as home office equipment, meal replacements (to spend less time cooking), a cleaning service, and other miscellaneous productivity-boosting costs.
We have made several grants providing supplementary funding to graduate students, such as Vincent Luczkow and an anonymous PhD candidate, so it’s worth taking a moment to reflect on our general approach to evaluating these grants. In general, I find the basic case for providing supplementary funding to graduate students fairly strong. The stipend provided to graduate students is often very low, especially outside the US. In Sören’s case, his stipend pays £15,000/year (roughly $21,000/year). I therefore expect him, and other students in similar positions, to be able to productively use additional funding. Concretely, I would anticipate that a grant of $20,000/year could “buy” an extra 2 to 8 hours of productive work per week, giving a cost of around $50 to $200/hour.
As a rule of thumb, I’d be very excited to buy an additional hour of the median junior AI safety researcher’s time for $50. At $100, I’d still weakly favor this purchase; at $200, I would lean against but feel uncertain. This is very sensitive to personal fit and track record — I could easily go 4x higher or lower than this depending on the candidate.
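To make the arithmetic behind these figures explicit, here is a minimal sketch of the cost-per-hour estimate. The $20,000/year grant size and the 2 to 8 hours/week range are the figures used above; the number of productive weeks per year and the variable names are illustrative assumptions of mine.

```python
# Minimal sketch of the implied cost per hour of a supplementary grant.
# Grant size and hours range come from the text above; the number of
# productive weeks per year is an assumption made for illustration.
GRANT_PER_YEAR = 20_000  # USD per year of supplementary funding
WEEKS_PER_YEAR = 48      # assumed productive weeks per year

for extra_hours_per_week in (2, 8):
    hours_bought = extra_hours_per_week * WEEKS_PER_YEAR
    print(f"{extra_hours_per_week} h/week -> ~${GRANT_PER_YEAR / hours_bought:.0f}/hour")

# Prints roughly $208/hour at 2 extra hours/week and $52/hour at 8,
# i.e. the ~$50 to $200/hour range quoted above.
```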
In the case of existing researchers, I think we should actually be willing to pay slightly more than this. In other words, I would prefer to have 10 researchers who each produce “1.1 units” of research than 11 researchers who each produce “1 unit”. My rationale for this is that growing the number of people in a field imposes additional communication costs. Additionally, the reputation of a field depends to some extent on the average reputation of each researcher, and this in turn affects who joins the field in the future. This makes it preferable to have a smaller field of unusually productive researchers than a larger field of less productive researchers.
In addition to increasing direct research output, I anticipate these grants also having positive indirect effects. In particular, the track record someone has in a PhD heavily influences their future job opportunities, and this can be very sticky, especially in faculty positions.
One countervailing consideration is that people already in PhD programs who apply for funding are highly likely to continue in the field. The impact on someone’s career trajectory might be greater if we provide funding pre-PhD, especially if that allows them to secure a position they otherwise wouldn’t. For this reason, all else being equal, I think we should be more excited about funding people to start a PhD than about topping up the funding of someone already in one. However, in practice it’s quite rare for us to be able to truly counterfactually change someone’s career trajectory in this way. Most talented researchers secure PhD funding from other sources, such as their advisor’s grants. When we have made a grant to support someone in a PhD, I expect it has most often merely accelerated their entry into a PhD program, rather than altering their long-run trajectory.
While I do not believe this applies in Sören’s case, I also think having a norm that additional funding is available may encourage people to pursue PhDs who would otherwise have been reluctant to do so due to the financial sacrifices required. This is particularly relevant for people later in their careers, who might have to take a significant pay cut to pursue a PhD, and are more likely to have dependents and other costs.
Turning back to the specifics of this grant, Sören’s work to date has focused on causal inference. His publications include a more theoretical paper in NeurIPS (a top-tier ML conference) and a joint first-author paper in Science evaluating different government interventions against COVID-19. This is on par with, or exceeds, the publication track record of other PhD students of similar seniority in similar programs. Having a Science paper in particular is very unusual (and impressive) in ML; however, I do think this was facilitated by working on such a topical issue, and so may be a difficult feat to repeat.
My biggest concern is that none of the work to date has focused on directly improving the long-term future. There’s a tentative case for the COVID-19 work helping to improve governments’ response to future pandemics or generally improve methodology in epidemiology, but I’m unconvinced (although I do think the work looks good from a short-term perspective). Similarly, while Sören’s work on causal inference is in a relevant area, and will be useful for skill-building, I don’t see a clear pathway for it to directly improve technical AI safety.
However, I think that it can be a reasonable choice to focus more on skill-building during your PhD. In particular, it is often useful to start by imitating the research direction of those around you, since that will be the area for which you can get the most detailed and valuable feedback. While this may limit your short-term impact, it can set you up for an outsized future impact. The catch is that you must eventually pivot to working on things that are directly important.
In Sören’s case, I’m reasonably confident such a pivot will happen. Sören is already considering working on scaling laws, which have a clear connection to AI forecasting, for the remainder of his PhD. Additionally, we have clear evidence from his past work (e.g. internships at CHAI and FHI) that he has an interest in working on safety-related topics.
Nonetheless, I think research directions can end up surprisingly sticky. For example, you tend to get hired by teams who are working on similar things to the work you’ve already done. So I do think there’s around a 25% chance that Sören doesn’t pivot within the next 3 years, in which case I’d regret making this grant, but I think the risk is worth taking.
Grant reports by Asya Bergal
Any views expressed below are my personal views and not the views of my employer, Open Philanthropy. In particular, receiving funding from the Long-Term Future Fund should not be read as an indication that an organization or individual has an elevated likelihood of receiving funding from Open Philanthropy. Correspondingly, not receiving funding from the Long-Term Future Fund (or any risks and reservations noted in the public payout report) should not be read as an indication that an organization or individual has a diminished likelihood of receiving funding from Open Philanthropy.
Anton Korinek – $71,500
Developing a free online course to prepare students for cutting-edge research on the economics of transformative AI.
Update December 2021: This course is now live here.
Anton applied to improve and professionalize a free Coursera course on the economics of transformative AI in time for an initial public run this fall. Anton is an associate professor of economics and had previously created a beta version of the course for his own graduate students at the University of Virginia. He applied for funding to pay for his own time working on the course, the time of a graduate student assistant, professional video editing, guest lecturers, copyright fees, and support from the University of Virginia.
I have spoken to Anton before and read large parts of his and Philip Trammell’s paper, Economic growth under transformative AI, which I thought was a good summary of existing economics work on the subject. My sense is that Anton is a competent economist and well-suited to creating an academic course.
I also skimmed the course materials for the beta course and watched parts of the associated video lectures. I would summarize the course as starting in more conventional AI economics covering the automation of labor and economic growth, and ending in more longtermist AI economics covering economic singularities, the macroeconomics of AI agents, and the control problem. The course is fairly technical and aimed towards advanced economics students. I am somewhat familiar with the economics work Anton is drawing from; while I didn’t investigate the course materials in depth, I didn’t see anything obviously inaccurate. The presentation of the beta version of the course seemed rough on certain dimensions (in particular, video quality); I can imagine it benefitting substantially from additional work and editing.
I was interested in funding this course because I think it could introduce smart economists to longtermist work, though much of the course isn’t explicitly focused on the long-term impacts of transformative AI. I think the work of many economists, including Philip Trammell and Robin Hanson, has been really influential in longtermist thinking, and a lot of the AI forecasting work I’m most excited about now is based in economics. It doesn’t seem implausible to me that a course like this could find the “low-hanging fruit” of economics students who turn out to be really interested in longtermist work when exposed to it. (As a secondary path to impact, this course is the best collection of materials around the economics of transformative AI that I know of, and I can imagine using it as a resource in the future.)
One risk of funding work that isn’t strictly longtermist is that it misrepresents longtermist ideas in an attempt to have broader appeal, discouraging people who would be on board with longtermism and attracting people who don’t really understand it. Anton strikes me as someone who is careful in his presentation of concepts, so I think this is unlikely to happen here.
I think there’s a significant chance the course doesn’t have much of an impact, at least in its first run. Anton’s goal is to have 50 students complete the course in the fall; even if that goal is met, it seems likely to me that that’s too small a number to find anyone who ends up being really excited about the material. However, given that the course is likely to have multiple runs and iterations in the future, I think the upside from this grant is worth it.
Daniel Filan – $5,280
Technical support for a podcast about research aimed at reducing x-risk from AI.
Recusal note: Adam Gleave and Oliver Habryka did not participate in the voting or final discussion around this grant.
Daniel applied for funding to pay for audio editing, transcription, and recording space usage for his AI alignment podcast, the AI X-risk Research Podcast (AXRP).
I’ve listened to or read through several episodes of the podcast; I thought Daniel asked good questions and got researchers to talk about interesting parts of their work. I think having researchers talk about their work informally can provide value not provided by papers (and, to a lesser extent, not provided by blog posts). In particular:
I’ve personally found that talks by researchers can help me understand their research better than reading their academic papers (e.g. Jared Kaplan’s talk about his scaling laws paper). This effect seems to have also held for at least one listener of Daniel’s podcast.
Informal conversations can expose motivations for the research and relative confidence level in conclusions better than published work.
Daniel provided some statistics about his podcast download numbers over time: 200 – 400 per episode as of early March. This guide suggests that Daniel’s podcast is in the top 10% to 25% of podcasts, though this single metric seems like a pretty dubious measure of podcast performance. (I also think it’s too early to tell exactly how well this podcast will do.)
Overall, the existing download counts along with personal anecdotes from people getting value out of this podcast were enough to justify this grant for me.
Rethink Priorities – $70,000
Researching global security, forecasting, and public communication.
Recusal note: Daniel Eth and Ozzie Gooen did not participate in the voting or final discussion around this grant.
Rethink Priorities is an organization that does public-facing research on a variety of EA topics; see their previous work here. Their previous longtermist work has consisted entirely of Luisa Rodriguez’s work on nuclear war. Luisa has since gone on to do other work, including this post, which I’ve referenced multiple times in this very grant round.
Rethink’s longtermist team is very new and is proposing work on fairly disparate topics, so I think about funding them similarly to how I would think about funding several independent researchers. Their longtermist hires are Linchuan Zhang, David Reinstein, and 50% of Michael Aird (he will be spending the rest of his time as a Research Scholar at FHI). I’m not familiar with David Reinstein. Michael Aird has produced a lot of writing over the past year, some of which I’ve found useful. I haven’t looked at any written work Linchuan Zhang has produced (and I’m not aware of anything major), but he has a good track record in forecasting, I’ve appreciated some of his EA Forum comments, and my impression is that several longtermist researchers I know think he’s smart. Evaluating them as independent researchers, I think they’re both new and promising enough that I’m interested in paying for a year of their time to see what they produce.
They are proposing to do research in three different areas:
Global security (conflict, arms control, avoiding totalitarianism)
Forecasting (estimating existential risk, epistemic challenges to longtermism)
Polling / message testing (identifying longtermist policies, figuring out how to talk about longtermism to the public)
Broadly, I am most excited about the third of these, because I think there’s a clear and pressing need for it. I think work in the other two areas could also be good, but its value feels highly dependent on the details (their application only described these broad directions).
Here’s some example work in these areas that I could imagine being interested in. Note that I haven’t spent time looking into these ideas; it’s possible that on further reflection I would no longer endorse them, or discover that the work has already been done:
Finding high-leverage ways to reduce geopolitical conflict that don’t require political influence (or learning that there aren’t any)
Estimating the expected badness of a totalitarian future compared to an extinction outcome
Experimenting with daily-rolling forecasts for early warning
We decided to pay 25% of the budget that Rethink requested, which I guessed was our fair share given Rethink’s other funding opportunities.
Toby Bonvoisin – $18,000
Doing the first part of a DPhil at Oxford in modelling viral pandemics.
Toby applied for 7 months of funding for his tuition and living expenses during his DPhil at the Nuffield Department of Medicine at the University of Oxford. Toby’s DPhil offer came too late for him to access other sources of funding for these 7 months; this money meant he wouldn’t have to take out a loan or put off his studies.
Toby’s proposed DPhil topic is to model the transmission of novel respiratory viruses within hospitals. I don’t think this is a particularly high-leverage line of work for preventing global catastrophic biological risks (GCBRs), but I think in biosecurity, it makes sense to prioritize building career capital over having an immediate impact early on.
Funding career building means taking a bet that someone will ultimately go on to do longtermist work. I didn’t know Toby at all, so I had a call with him talking about his motivations, future plans, and the interactions between short-term-focused and long-term-focused work in biosecurity. He gave thoughtful answers to those questions that made me feel like he was likely to pursue longtermist work going forward.
Jess Whittlestone, Shahar Avin – $6,917
Ensuring the lessons learnt from COVID-19 improve GCR prevention and mitigation in the future.
Jess and Shahar, who both work as researchers at the Centre for the Study of Existential Risk (CSER), applied for funding to pay for a research assistant for their project ensuring lessons learned from COVID-19 improve GCR prevention and mitigation in the future. From their application:
I agree with their justification for this work. I think it’s good in expectation to put some longtermist effort towards promoting lessons learned from COVID-19, though as they say, I expect it to be very difficult to shape historical narratives.
I think Jess and Shahar are a good fit for this project because they have both done good work for public and policy audiences in the past. I’ve particularly benefited from Jess’s work on improving institutional decision-making at 80,000 Hours and Shahar’s co-authored report, Toward Trustworthy AI Development: Mechanisms for Supporting Verifiable Claims.
Longtermist Entrepreneurship Fellowship
Giving seed funding to projects incubated by the Longtermist Entrepreneurship Fellowship Programme.
In November 2020, Jade Leung applied to the LTFF asking for a commitment of up to $200,000 to provide seed funding to projects incubated in the pilot run of the Longtermist Entrepreneurship Fellowship (LEF), a new program intended to help longtermist projects and organizations get off the ground. Fellows submitted proposals for seed grants at the end of the fellowship in December, and a committee consisting of Claire Zabel (representing Open Philanthropy), Kit Harris (representing Longview Philanthropy), Sjir Hoeijmakers (representing Founders Pledge), and Jonas Vollmer (representing EA Funds) decided on which proposals received funding. Funding decisions were made by committee, but Jonas retained the right of veto over the LTFF’s portion of the funding. I felt good about committing this money because I trust the collective judgement of this set of grantmakers.
Five projects run by seven fellows ended up applying, for a total of $464,500 in funding. The committee decided to approve up to three grants for a total of up to $182,000. One of those grants, worth $106,000, was accepted by the grantee. The LTFF will cover half of the cost of this grant, with Open Philanthropy’s grant covering the other half. We’ve asked Jonas Vollmer to explain the reasoning behind the grant below. We will report on any further grants in future payout reports if they are accepted (which currently seems unlikely).
A grant report from the Longtermist Entrepreneurship Fellowship, by Jonas Vollmer
Josh Jacobson – up to $53,000
Research into physical-world interventions to improve productivity and safeguard life.
Josh Jacobson applied for a grant to research physical-world interventions to improve productivity and safeguard life. He plans to brainstorm and research various interventions, such as improving air quality, preparing for nuclear war or earthquakes, or securing second citizenships. He also plans to help longtermists implement these interventions.
I overall voted in favor of this grant because:
I agree with the basic premise: Improving the productivity of people working on important issues could be very valuable, and protecting their safety might be useful, too.
I am aware that some longtermist organizations are currently researching some of these issues themselves. Josh may save these organizations time and circulate the results of his research beyond organizational boundaries.
Josh’s concrete ideas seemed relatively detailed to me, so I thought he was likely to make substantial progress on the project. His project plan and milestones seemed focused on delivering concrete output quickly, which I liked. He also previously worked with BERI on a related project that seemed successful.
Josh seems very motivated to work on this project, such that I expect him to do a good job at it.
Some reasons against this grant that I considered:
Overestimating the risks involved might lead to overspending on prevention. The potential benefits seem relatively limited (marginal reduction of a low risk), whereas the costs could grow large (many organizations could potentially spend a lot of time on this).
For instance, I estimate that an earthquake in San Francisco has a ~2% annual likelihood of occurring, and would kill ~0.03% of the population in expectation. I estimate 30% of such deaths to be preventable. The resulting preventable risk seems tiny (just two micromorts per year; a short reconstruction of this arithmetic follows this list). If longtermist organizations started taking measures to avert this risk for their staff as a result of this grant, that could be a poor use of their time.
Detailed reporting on personal safety risks could lead to unhelpful anxiety about them. During the COVID pandemic, I frequently got the impression that some EAs and rationalists seemed anxious about catching COVID to a degree where their worries seemed to take a much bigger toll on their expected quality of life than the actual risk at hand. I worry that a similar dynamic might come into play here.
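For transparency, here is a minimal reconstruction of the earthquake estimate in the list above. All three inputs are the figures quoted there; only the multiplication (and the illustrative variable names) are added here.

```python
# Rough reconstruction of the preventable-earthquake-risk estimate above.
# All inputs are the figures quoted in the text; only the arithmetic is new.
p_quake_per_year = 0.02        # ~2% annual likelihood of a major SF earthquake
p_death_given_quake = 0.0003   # ~0.03% of the population killed, in expectation
fraction_preventable = 0.30    # ~30% of those deaths assumed preventable

preventable_risk = p_quake_per_year * p_death_given_quake * fraction_preventable
print(f"~{preventable_risk * 1e6:.1f} micromorts per year")  # ~1.8, i.e. "just two"
```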
Josh applied for a $106,000 grant over a year. Because we don’t yet know whether this project will be successful longer-term and the requested grant was relatively large, the LEF funding committee decided to make the grant in two installments: an initial $53,000 grant, and another $53,000 subject to a satisfactory progress report after the first six months. Half of each installment ($26,500) will be covered by the LTFF.
Grant reports by Daniel Eth
Legal Priorities Project – $135,000
Hiring staff to carry out longtermist academic legal research and increase the operational capacity of the organization.
The Legal Priorities Project (LPP) applied for funding to hire Suzanne Van Arsdale and Renan Araújo to conduct academic legal research, and Alfredo Parra to perform operations work. All have previously been involved with the LPP, and Suzanne and Renan contributed to the LPP’s research agenda. I’m excited about this grant for reasons related to LPP as an organization, the specific hires they would use the grant for, and the proposed work of the new hires.
Overall, I think the LPP is exciting enough to warrant funding in general, at least for the time being. The goal of the LPP is to generate longtermist legal research that could subsequently be used to influence policymakers in a longtermist direction. Personally, while I don’t think this theory of change is crazy, I’m also not completely sold on it (I have perhaps a more cynical view of policy, in which principled academic arguments are rarely the actual cause of policies being enacted, even if they are claimed as justifications). I do, however, think this theory of change is plausible enough that it’s at least worth trying (if done well), especially since the indirect effects (discussed below) also seem most likely positive.
Insofar as the LPP’s theory of change makes sense, I think they’re likely to go about it well. Since they’re a younger organization, they do not yet have a large body of output. Their main piece of output so far is their research agenda. As part of my evaluation, I skimmed the agenda in full and read a couple of sections more closely (including the sections on longtermism and biorisk), and I found the work to be of high quality. More specifically, I thought it was well written (thorough without being verbose, expressing concepts clearly – especially for an audience that might not be familiar with longtermism), demonstrated a good understanding of longtermism and related concerns (e.g., infohazards), and presented “weird” longtermist concepts in a manner unlikely to be off-putting to those who might be put off by much of the rhetoric around such concepts elsewhere. This work makes me hopeful that future work will also be high quality.
Additionally, the LPP team includes many individuals with stellar academic credentials. I’m usually skeptical of leaning too heavily on academic credentials in making assessments (I tend to consider them only one of many reasonable signals for intelligence and competence), but I’d imagine that such credentials are particularly important here given LPP’s theory of change. My sense is that policymakers and other legal scholars are much more inclined to take academic ideas seriously if they’re promoted by those with prestigious credentials.
Furthermore, I think it often makes sense to fund newer organizations without long track records if there is a plausible argument in favor (and no apparent large downside risk). The upside of such grants can be large, and the wasted effort and resources if they don’t work out will probably be only a minor downside. I think such arguments are also stronger for organizations such as the LPP that are operating in newer areas, or with a theory of change different than what has come before, as their progress (or lack thereof) provides valuable information about not only the organization, but also potentially the whole area.
In addition to the direct effects from the LPP pursuing its theory of change, I’d imagine the indirect effects here would also most likely be positive. Given my impression of the LPP’s ability to promote longtermist ideas faithfully and in a non-offputting manner, and additionally the fact that it would be lending academic credibility to such ideas, my sense is that the LPP would, if anything, be good for the reputation of longtermism more broadly. Additionally, the LPP might allow for those involved to gain career capital.
Regarding the specific hires, I am also enthusiastic – due to their past work and educational credentials. Suzanne was the main author of the “Synthetic Biology and Biorisk” section of the LPP’s research agenda, and Renan was a co-author of the “Longtermism” section, both of which impressed me. Additionally, both have impressive academic credentials (Suzanne has a J.D. from Harvard, Renan has an M.Sc. from LSE), which I believe is valuable for reasons I outlined above. While I have less of a sense of the quality of Alfredo’s work than that of Suzanne and Renan (and, as a researcher myself, I trust my judgment less on questions related to operations than to research), it’s a positive signal that Alfredo has a Ph.D. from TU Munich and previously worked for four years in operations at EAF/CLR, and the fact that he has been involved with the LPP so far and they’re interested in hiring him is a further signal. I also think that operations work is important, and I am fairly inclined to defer to (seemingly competent) organizations when they claim a particular individual that they are familiar with would be a good operations hire.
Finally, I support this grant because I support the proposed work for the hires. According to the application, Suzanne will work on the topic of dual-use research, and Renan’s work will focus on protection of future generations via constitutional law. Both of these areas strike me as sensible areas of research for academic legal work. While Suzanne’s topic does contain potential downsides (such as infohazards) if handled poorly, her section within the research agenda on “Synthetic Biology and Biorisk” contains a subsection on “Information Hazards” which shows she is at least aware of this risk, and the section as a whole demonstrates a nuanced level of thinking that makes me less worried about these potential downsides (and I further update against being worried here given that the LPP as an institution strikes me as presenting an environment where such issues are handled well). Alfredo’s proposed work includes setting up “robust internal systems”. While vague, this work also seems valuable for the organization.
Sam Chorlton – $37,500
Democratizing analysis of nanopore metagenomic sequencing data.
Sam applied for funding for his online microbiological sequencing-analysis startup, BugSeq, to hire a bioinformatician for their effort to advance metagenomic sequencing (i.e., sequencing of genetic material recovered from environmental samples, which could help with early detection of novel pathogens).
My general impression of metagenomic sequencing is positive, as being able to better screen the environment for pathogens has clear defense applications and (at least as far as I can tell) lacks clear offense applications, and thus seems likely to advance biodefense over bio-offense. While I do not have a biosecurity background myself, I asked a well-respected biorisk person within the EA community about their opinion of metagenomic sequencing, and they thought it was generally valuable. Additionally, Kevin Esvelt (another well-respected biorisk expert within EA, and an MIT professor) voiced support for metagenomic sequencing in his EAGVirtual 2020 talk. One particular aspect of BugSeq’s approach that excites me is their work on improved detection of mutations in viruses, which could plausibly help to detect the presence of genetic engineering; this excited me because I worry more about engineered than natural pandemics (from an X-risk perspective).
Given my support for BugSeq’s goals, my support for the grant depended on my evaluation of BugSeq as an organization. In the application, Sam provided confidential information about organizations for which they’ve previously conducted analyses, as well as those with which they are currently working, and this information made me feel more confident in BugSeq’s competence/legitimacy. An additional factor in favor of BugSeq includes the fact that they have received a conditional matching grant from the Canadian Government’s IRAP for $37,500 (implying not just competence on their behalf, but also that our grant would plausibly be doubled). Overall, while I’m still unsure about how much impact to expect from BugSeq, they passed my bar for funding, and I strongly expect that further investigation wouldn’t change that.
Tegan McCaslin – $80,401
Continuing research projects in AI forecasting and AI strategy.
Tegan applied for funding to continue independent research relevant for AI strategy and AI timelines. The LTFF has twice before funded Tegan to perform independent research. I supported this grant primarily due to Tegan’s research output from these previous two grants, and despite some skepticism about an independent-research environment being the best option for Tegan.
Tegan’s main outputs from her previous grants are a report comparing primate and bird brain architectures in terms of cognition, and a draft of a report that both compares the efficiency of biological evolution vs. deep learning in terms of improving cognitive abilities and sheds light on whether there is a common ordering between biological evolution and deep learning in terms of which cognitive tasks are more difficult to learn. I had previously read the former report, and I skimmed the latter for this evaluation.
My overall evaluation of these two outputs is as follows. These research questions are very ambitious (in a good way), and would be valuable to learn about. I think these questions are also in areas that are unlikely to see significant work from academia; both projects require making somewhat fuzzy or subjective judgments with very limited data and generalizing from there, and my impression is that many academics would dismiss such work as too speculative or unscientific. Additionally, the interdisciplinary nature of this work means there are fewer individuals in academia who would be well-positioned to tackle such research. Furthermore, work along these lines is neglected within the EA community. Despite the fact that biological evolution is the only process we know of that has produced human-level intelligence, there has been very little work within EA to study evolution in hopes of gaining insights about transformative AI. That Tegan’s research focuses on questions that are important and neglected makes me more optimistic about the value of the sorts of projects she will pursue, as well as her research judgement.
Regarding research methodology, my sense is Tegan is going about this in a mostly reasonable way. My inside view is that she’s handling the fuzzy, subjective nature of this research well, her evidence largely supports her conclusions, and she appreciates the limitations of her research. The biggest weakness of both projects, in my mind, is that they suffer from very few data points. Admittedly, both projects are in areas where not much data seems to exist (and that which does exist is fuzzy and may be difficult to use). But the fact remains that with few data points, strong conclusions cannot be drawn, and the research itself ends up less informative than it would be with more data. While finding such data may be difficult, I have a reasonably strong prior that at least some further relevant data exists and that the benefits to the research from finding such data would outweigh the search costs.
In light of all this, I would like to see more research along the lines that Tegan has performed previously, and I am in favor of her continuing to pursue such research. Having said that, I think there are some potential downsides to pursuing long-term independent research. Since this will be the third grant that the LTFF will give to Tegan for independent research, I think it’s worth considering these potential downsides, as well as potential mitigation techniques. (I also think this is a valuable exercise for other long-term independent researchers, which is part of my motivation for spelling out these considerations below.)
The largest pitfall in my mind for long-term independent researchers is one’s research becoming detached from the actual concerns of a field and thereby producing negligible value. Tegan seems to have avoided this pitfall so far, thanks to her research judgment and understanding of the relevant areas, and I see no evidence that she’s headed towards it in the future.
Another potential pitfall of independent research is a general lack of feedback loops, both for specific research projects and for the individual’s research skills. One way that independent researchers may be able to produce stronger feedback loops for their work is by sharing more intermediate work. While Tegan has shared some of her intermediate work with (and received feedback from) some senior longtermist researchers, I think she would probably benefit from sharing intermediate work more broadly, such as on the EA Forum.
Finally, independent research can struggle to get as much traction as other work (keeping quality constant), as it’s less likely to be connected to organizations or networks where it will naturally be passed around. My sense is that Tegan’s research hasn’t gotten as much attention as it “deserves” given its level of quality, and that many who would find value in the research aren’t aware of it. Fixing such a dynamic generally requires a more active promotion strategy from the researcher. Again, I think posting more intermediate work could help here, as it would create more instances where others see the work, learn about what the researcher is working on, and perhaps even offer feedback.
Grant reports by Evan Hubinger
Anonymous – $33,000
Working on AI safety research, emphasizing AI’s learning from human preferences.
This grant, to an AI alignment researcher who wished to remain anonymous, will support their work on safe exploration, learning from human preferences, and robustness to distributional shift.
I’m only moderately excited about this specific project. It partly focuses on out-of-distribution detection, which I think is likely to be useful for helping with a lot of proxy pseudo-alignment issues. However, since I think the project overall is not that exciting, this grant is somewhat speculative.
That being said, the applicant will be doing this work in close collaboration with others from a large, established AI safety research organization that we are quite positive on and that the applicant previously did some work at, which significantly increases my opinion of the project. I think that the applicant’s continuing to do AI safety research with others at this organization is likely to substantially improve their chances of becoming a high-quality AI safety researcher in the future.
Finally, we did not take the decision to make this grant anonymous lightly. We are obviously willing to make anonymous grants, but only if we believe that the reasoning presented for anonymity by the applicant is sufficiently compelling. We believe this is true for this grant.
For additional accountability, we asked Daniel Ziegler of OpenAI, who is not part of the LTFF, to look at this grant. He said he thought it looked “pretty solid.”
Finally, though the grantee is anonymous, we can say that there were no conflicts of interest in evaluating this grant.
Center for Human-Compatible AI – $48,000
Hiring research engineers to support CHAI’s technical research projects.
Recusal note: Adam Gleave did not participate in the voting or final discussion around this grant.
This grant is to support Cody Wild and Steven Wang in their work assisting CHAI as research engineers, funded through BERI.
Overall, I have a very high opinion of CHAI’s ability to produce good alignment researchers—Rohin Shah, Adam Gleave, Daniel Filan, Michael Dennis, etc.—and I think it would be very unfortunate if those researchers had to spend a lot of their time doing non-alignment-relevant engineering work. Thus, I think there is a very strong case for making high-quality research engineers available to help CHAI students run ML experiments.
However, I have found that in many software engineering projects, a bad engineer can be worse than no engineer. That being said, I think this is significantly less true when research engineers work mostly independently, as they would here, since in those cases there’s less risk of bad engineers creating code debt in a central codebase. Furthermore, both Cody and Steven have already been working with CHAI doing exactly this sort of work; when we spoke to Adam Gleave early in the evaluation process, he indicated that he had found their work positive and quite helpful. Thus, the risk of this grant hurting rather than helping CHAI researchers seems very minimal, and the case for it seems quite strong overall, given our general excitement about CHAI.
David Reber – $3,273
Researching empirical and theoretical extensions of Cohen & Hutter’s pessimistic/conservative RL agent.
David applied for funding for technical AI safety research. He would like to work with Michael Cohen to build an empirical demonstration of the conservative agent detailed in Cohen et al.’s “Pessimism About Unknown Unknowns Inspires Conservatism.” David is planning on working on this project at the AI Safety Camp.
On my inside view, I have mixed feelings about creating an empirical demonstration of Cohen et al.’s paper. I suspect that the guarantees surrounding the agent described in that paper are likely to break in a fundamental way when applied to deep learning, due to our inability to really constrain what sorts of agents will be produced by a deep learning setup just by modifying the training setup, environment/dataset, and loss function—see “Risks from Learned Optimization in Advanced Machine Learning Systems.” That being said, I think Cohen et al.’s work does have real value to the extent that it gives us a better theoretical understanding of the space of possible agent designs, which can hopefully eventually help us figure out how to construct training processes to be able to train such agents.
On the whole, I see this as a pretty speculative grant. That being said, there are a number of reasons that I think it is still worth funding.
First, Michael Cohen has a clear and demonstrated track record of producing useful AI safety research, and I think it’s important to give researchers with a strong prior track record a sort of tenure where we are willing to support their work even if we don’t find it inside-view compelling, so that researchers feel comfortable working on whatever new ideas are most exciting to them. Of course, this grant is to support David rather than Michael, but given that David is going to be working directly with Michael—and, when I talked with Michael, he seemed quite excited about this—I think the same reasoning still applies.
Second, when I talked with Michael about David’s work, he indicated that David was more excited about the theoretical aspects of Michael’s work, and would be likely to do more theoretical work in the future. Thus, I expect that this project will have significant educational value for David and will hopefully enable him to do more AI safety work in the future—such as theoretical work with Michael—that I think is more exciting.
Third, though David initially applied for more funding from us, he lowered his requested amount after he received funding from another source, which meant that the overall quantity of money being requested was quite small, and as such our bar for funding this grant was overall lower than for other similar but larger grants. This was not a very large factor in my thinking, however, as I don’t believe that the AI safety space is very funding-constrained; if we can find good opportunities for funding, it’s likely we’ll be able to raise the necessary money.
John Wentworth – $35,000
Developing tools to test the natural abstraction hypothesis.
John Wentworth is an independent AI safety researcher who has published a large number of articles on the AI Alignment Forum, primarily focusing on agent foundations and specifically the problem of understanding abstractions.
John’s current work, which he applied for the grant to work on, is in testing what John calls the “natural abstraction hypothesis.” This work builds directly on my all-time favorite post of John’s, his “Alignment By Default,” which makes the case that there is a non-negligible chance that the abstractions/proxies that humans use are natural enough properties of the world that any trained model would likely use similar abstractions/proxies as well, making such a model aligned effectively “by default”.
According to my inside-view model of AI safety, I think this work is very exciting. I think that understanding abstractions in general is likely to be quite helpful for being able to better understand how models work internally. In particular, I think that the natural abstraction hypothesis is a very exciting thing to try and understand, in that I expect doing so to give us a good deal of information about how models are likely to use abstractions. Additionally, the truth or falsity of the general alignment by default scenario is highly relevant to general AI safety strategy. Though I don’t expect John’s analysis to actually update me that much on this specific question, I do think the relevance suggests that his work is pointed in the right direction.
Regardless, I would have supported funding John even if I didn’t believe his current work was very inside-view exciting, simply because I think John has done a lot of good work in the past—e.g. his original “Alignment By Default” post, or any number of other posts of his that I’ve read and thought were quite good—and I think it’s important to give researchers who’ve demonstrated the ability to do good work in the past a sort of tenure, so they feel comfortable working on things that they think are exciting even if others do not. Additionally, I think the outside-view case for John is quite strong: Scott Garrabrant, Abram Demski, and I are all very positive on John’s work and excited about it continuing, with MIRI researchers seeming like a good reference class for judging the quality of agent foundations work.
Marc-Everin Carauleanu – $2,491
Writing a paper/blogpost on cognitive and evolutionary insights for AI alignment.
Marc’s project is to attempt to understand the evolutionary development of psychological altruism in humans—i.e. the extent to which people intrinsically value others—and understand what sorts of evolutionary pressures led to such a development.
Marc was pretty unknown to all of us when he applied and didn’t seem to have much of a prior track record of AI safety research. Thus, this grant is somewhat speculative. That being said, we decided to fund Marc for a number of reasons.
First, I think Marc’s proposed project is very inside-view exciting, and demonstrates a good sense of research taste that I think is likely to be indicative of Marc potentially being a good researcher. Specifically, evolution is the only real example we have of a non-human-level optimization process producing a human-level optimizer, which I think makes it very important to learn about. Furthermore, understanding the forces that led to the development of altruism in particular is something that is likely to be very relevant if we want to figure out how to make alignment work in a multi-agent safety setting.
Second, after talking with Marc, and having had some experience with Bogdan-Ionut Cirstea, with whom Marc will be working, it seemed to me like both of them were very longtermism-focused, smart, and at least worth giving the chance to try doing independent AI safety research.
Third, the small amount of money requested for this grant meant that our bar for funding was lower than for other similar but larger grants. This was not a very large factor in my thinking, however, as I don’t believe that the AI safety space is overall very funding-constrained—such that if we can find good opportunities for funding, it’s likely we’ll be able to raise the necessary money.
Grant reports by Oliver Habryka
AI Safety Camp – $85,000
Running a virtual and physical camp where selected applicants test their fit for AI safety research.
We’ve made multiple grants to the AI Safety Camp in the past. From the April 2019 grant report:
When we next funded them, I said:
During this grant round, I spent additional time reviewing and evaluating the AI Safety Camp application, which seemed important given that we are the camp’s most central and reliable funder.
To evaluate the camp, I sent out a follow-up survey to a subset of past participants of the AI Safety Camp, asking them some questions about how they benefited from the camp. I also spent some time talking to alumni of the camp who have since done promising work.
Overall, my concern above about mentorship still seems well-placed, and I continue to be concerned about the lack of mentorship infrastructure at the event, which, as far as I can tell, doesn’t seem to have improved very much.
However, some alumni of the camp reported very substantial positive benefits from attending, while none of them reported noticing any substantial harmful consequences. And as far as I can tell, all the alumni I reached out to thought that the camp was, at worst, only a slightly less valuable use of their time than what they would have done instead, so the downside risk seems relatively limited.
In addition to that, I also came to believe that the need for social events and workshops like this is greater than I previously thought, and that they are in high demand among people new to the AI Alignment field. I think there is enough demand for multiple programs like this one, which reduces the grant’s downside risk, since it means that AI Safety Camp is not substantially crowding out other similar camps. There also don’t seem to be many similar events to AI Safety Camp right now, which suggests that a better camp would not happen naturally, and makes it seem like a bad idea to further reduce the supply by not funding the camp.
Alexander Turner – $30,000
Formalizing the side effect avoidance problem.
Alex plans to continue, and potentially finish, formalizing the work on impact measures that he has pursued over the past few years of his PhD. We’ve given two grants to Alex in the past:
April 2019 – $30,000: Building towards a “Limited Agent Foundations” thesis on mild optimization and corrigibility
September 2020 – $30,000: Understanding when and why proposed AI designs seek power over their environment.
Since then, Alex has continued to produce research that seems pretty good to me, and has also helped other researchers who seem promising find traction in the field of AI alignment. I’ve also received references from multiple other researchers who have found his work valuable.
Overall, I didn’t investigate this grant in very much additional detail this round, since I had already evaluated his last two applications in much more detail, and it seems the value proposition for this grant is very similar in nature. Here are some of the most relevant quotes from past rounds.
From the April 2019 report:
From the September 2020 report:
I have not made any substantial updates since September, so the above still summarizes most of my perspective on this.
David Manheim – $80,000
Building understanding of the structure of risks from AI to inform prioritization.
Recusal note: Daniel Eth and Ozzie Gooen did not participate in the voting or final discussion around this grant.
We made a grant to David Manheim in the August 2019 round. In that grant report, I wrote:
Since then, I haven’t received any evidence that contradicts this perspective, so I continue to think that it’s valuable for David to be able to work on projects he thinks are valuable, especially if he plans to write up his findings and thoughts publicly.
However, the scope of this project is broader than just David’s personal work, so it seems worthwhile to further explain my take on the broader project. From the application:
I’ve observed the work of a good number of the people involved in this project, and the group seems pretty promising to me, though I don’t know everyone involved.
I do have some uncertainty about the value of this project. In particular, it feels quite high-level and vague to me — which isn’t necessarily bad, but I feel particularly hesitant about relatively unfocused proposals for teams as large as this one. My current best guess as to the outcome of this grant is that a number of promising-seeming researchers will have more resources and time available to think in a relatively unconstrained way about AI risk.
Logan Strohl – $80,000
Developing and sharing an investigative method to improve traction in pre-theoretic fields.
Logan has previously worked with the Center for Applied Rationality, but is now running an independent project aiming to help researchers in AI alignment and related pre-paradigmatic fields find more traction on philosophically confusing questions. I find work in this space potentially very valuable, but also very high-variance, so I assign high probability to this project not producing very much value. However, I think there’s at least some chance that it helps a substantial number of future AI alignment researchers in a way that few other interventions are capable of.
My general experience of talking to people who are trying to think about the long-term future (and AI alignment in particular) is that they often find it very difficult to productively make progress on almost any of the open problems in the field, with the problems often seeming far too big and ill-defined.
Many well-established research fields seem to have had these problems in their early stages. I think that figuring out how to productively gain traction on these kinds of ill-defined questions is one of the key bottlenecks for thinking about the long-term future, substantially more so than, for example, getting more people involved in those fields: my current sense is that marginal researchers struggle to find any area where they can meaningfully contribute, and most people in the field have trouble productively integrating others’ contributions into their own thinking.
The basic approach Logan is aiming for seems to be a descendant of some of Eugene Gendlin’s “Focusing” material, which, based on many conversations I’ve had with researchers over the last few years, has been quite useful for many people working in AI alignment and related research fields. It frequently comes up as the single most useful individual thinking technique, next to the ability to make rough quantitative back-of-the-envelope estimates in domains relevant to events and phenomena on a global scale. This makes me more optimistic than I would usually be about developing techniques in this space.
The approach itself still seems to be at a very early stage, with only the first post of the planned sequence available here. The current tentative name for it is “naturalism”:
Logan is planning to work closely with a small number of interested researchers, which I think is generally the right approach for work like this, and is planning to err on the side of working with people in practical contexts instead of writing long blog posts full of theory. Armchair philosophizing about how thinking is supposed to work can sometimes be useful (particularly when combined with mathematical arguments — a combination which sparked, for example, the Bayesian theory of cognition). But most of the time, it seems to me to produce suggestions that are only hypothetically interesting, but ultimately ungrounded and hard to use for anything in practice. On the margin, investigations that integrate people’s existing curiosities and problems seem better suited to making progress on this topic.
In evaluating this grant, we received a substantial number of highly positive references for Logan, both regarding their past work at CFAR and from people involved in the early stages of their “naturalism” program. I’ve worked a bit with Logan in the context of a few rationality-related workshops and have generally been impressed by their thinking. I’ve also found many of their blog posts quite valuable.
Summer Program on Applied Rationality and Cognition – $15,000
Supporting multiple SPARC project operations during 2021.
This is a relatively small grant to SPARC. I’ve written in the past about SPARC when evaluating our grant to CASPAR:
SPARC has been funded by Open Philanthropy for the past few years, and they applied for a small supplement to that funding, which seemed worthwhile to me.
Historically, SPARC seems to have been quite successful at getting some of the world’s most promising high-school students (e.g. multiple IMO gold medalists) interested in the long-term future and helping them find traction on related problems. In surveys of top talent within the EA community, SPARC has performed well on measures of what got people involved. Informally, I have the sense that SPARC was a critical factor in the involvement of a number of people I think are quite competent, and also substantially helped them become that competent.
I do have some hesitations around the future of SPARC: the pandemic harmed their ability to operate over the past year, and I also have a vague sense that some parts of the program have gotten worse (possibly not surprisingly, given that relatively few of the original founders are still involved). But I haven’t had the time to engage with those concerns in more depth, and am not sure I will find the time to do so, given that this is a relatively small grant.
Grant reports by Ozzie Gooen
General thoughts
We considered a few “general purpose” grants to sizable organizations this round and did some thinking on the shared challenges these pose to us. Some factors I considered:
Evaluation work scales with project size, but there aren’t equivalent discounts for funding a fraction of a project. For example, a typical $20K project might take 10 hours to evaluate well enough for us to be comfortable funding, but a $4 million project could easily take over 60 hours — even if they only asked the LTFF for $20K. To be cost-effective, funders typically have rough goals of how much time they should spend for grants of a certain size.
Relatively small grants to large organizations do little to influence those organizations. Theoretically, funders like us make donations whenever the direct expected value passes a certain bar. An organization could take advantage of this by having a few projects we’d be extremely happy to fund but spending most of their budget on projects that aren’t a good fit for us. Our grants might still pass the bar for expected impact based on the few projects we like, but we’d have a hard time convincing the organization to do more LTFF-shaped projects (even if we think this would make them far more impactful).
Naomi Nederlof – $34,064
Running a six-week summer research program on global catastrophic risks for Swiss (under)graduate students.
Naomi Nederlof of EA Geneva is running a new longtermism-oriented summer research program for students in Switzerland. This grant will support up to 10 students for six weeks and help to provide them with stipends for this time. You can see early information about the program here.
The EA Geneva scene is interesting. EA Geneva has introduced a (to me) surprising number of notable researchers to longtermism. These people typically leave Geneva, in large part because the main longtermist organizations are located elsewhere. EA Geneva has links to the Simon Institute and the Lausanne EA scene. Geneva is notable for being one of the top two United Nations hubs, so a lot of EA work in the area is oriented around the UN ecosystem.
This grant is in some ways fairly safe and in other ways fairly risky. It’s safe because EA Geneva has a history of running in-person events without notable problems, and the modest size of this program seems very manageable for them. It’s risky because this is the first time the program will run, and the planning is still at an early stage. Some advisors have agreed to participate, but more still need to be found (if you’re reading this and think you’d be a good fit, reach out!). The quality of the students is more uncertain: similar summer programs hosted by SERI and other longtermist centers accept applicants globally, and it’s unclear how many promising students both attend universities in Switzerland and won’t go to those other competitive programs.
All that said, my impression is that an earlier FHI summer program with similar goals was both quite good and smaller than ideal (in fact, it isn’t running in 2021), and I’d expect similar outcomes for the SERI version. This raises my estimate of how useful an additional summer program would be and makes me more optimistic about this grant. We’ll hopefully learn a lot over the next few months as we see which students go through EA Geneva’s program and what they produce (these sorts of summer projects often don’t seem directly useful to me, but they hint at future work with more utility). If this sort of program could be successfully scaled, particularly by being replicated in other locations with strong EA scenes, that would be terrific.
Po-Shen Loh – $100,000, with an expected reimbursement of up to $100,000
Paying for R&D and communications for a new app to fight pandemics.
Po-Shen is the founder of NOVID, a contact-tracing-style application that forewarns individuals when someone a certain number of steps away in their in-person network reports an infection, helping them take preventative measures before they are at risk. I suggest visiting the site to see images and read a more detailed explanation.
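To make the “certain number of steps” idea concrete, here is a minimal illustrative sketch in Python (this is not NOVID’s actual implementation; the contact graph, function name, and data are hypothetical) of how a network-distance warning could be computed:

```python
from collections import deque

def degrees_to_nearest_case(contact_graph, user, reported_cases):
    """Return the minimum number of in-person 'hops' between `user` and
    anyone who has reported an infection, or None if no case is reachable.
    Purely illustrative; NOVID's real system works from sensor data and
    presumably uses its own data structures."""
    if user in reported_cases:
        return 0
    seen = {user}
    queue = deque([(user, 0)])
    while queue:
        person, distance = queue.popleft()
        for contact in contact_graph.get(person, ()):
            if contact in reported_cases:
                return distance + 1
            if contact not in seen:
                seen.add(contact)
                queue.append((contact, distance + 1))
    return None

# Hypothetical contact network: Alice <-> Bob <-> Carol; Carol reports a case.
graph = {"alice": ["bob"], "bob": ["alice", "carol"], "carol": ["bob"]}
print(degrees_to_nearest_case(graph, "alice", {"carol"}))  # -> 2
```

A distance of 2 here would correspond to the kind of early warning described above: Alice hasn’t met anyone infected, but someone she meets has.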
At first, I was highly skeptical. Most COVID-related projects seem either not very tractable, not at all longtermist, or both. NOVID stood out early on because it had some unusually good references. We asked multiple researchers whose judgment we trust (several in biosecurity) about the project, and their responses were mostly either mildly or highly positive. You can see some recent public write-ups by Andrew Gelman and Tyler Cowen.
NOVID is available on both Apple and Android. Right now, both applications have around 230 ratings, with an average score of around 4.5/5. I started using it on my iPhone. I’ve found the interface simple and straightforward, though I was a bit uncomfortable with the number of permissions that it asked for. In my ~3 weeks of using it, it hasn’t shown any n-degree cases yet. I’ve been in Berkeley, so I assume that’s due to some combination of me staying at home a lot and the app having few users in the area.
NOVID is an unusual donation for the LTFF, as it has both short-term and long-term benefits. I personally believe the long-term benefits are considerable:
The use of NOVID during COVID-19 could give us valuable data, and ideally rigorous experiments, revealing whether this method will be useful in future pandemics.
Po-Shen genuinely seems to be focused on the value of NOVID for future pandemics. He has multiple longtermist collaborators. I expect that Po-Shen and his collaborators might do valuable work in future pandemics, so the skills and networks they build now could be useful later on.
There is a chance that NOVID makes a significant improvement to the handling of COVID-19. This might have positive long-term impacts of its own. One might think that vaccine deployment will completely solve the issue, but I think there are serious risks of this not being successful.
There are some reasons why we, as longtermist funders, might not want to fund this project. Reasons that stand out to me:
It’s not clear whether the sorts of pandemics that could pose extinction threats in the future would be alleviated by this kind of technology.
NOVID’s approach requires asking for more permissions than the contact-tracing apps by Apple and Google, which are relatively minimal in both data collection and benefit. Perhaps this trade-off won’t be accepted, even with more evidence of benefit.
An organization might require ample technical experience and connections to get the necessary support and follow-up for a project like this. For example, to get many users, it could be necessary to sign contracts with governments or gain large amounts of government trust. The NOVID team is quite young and small for this kind of work. This is a common issue for startups, and it’s particularly challenging in cases of public health.
NOVID is subject to strong network effects. It might be very difficult for either NOVID or NOVID-inspired applications to get the critical mass necessary to become useful. Arguably, NOVID could have a much easier time here than traditional contact tracing apps (given its focus on personal safety rather than avoiding harm to others), but it will likely still be a challenge.
I’ve been in the startup scene for some time, and I look at NOVID much as I do other tech startups I’m familiar with. There are some significant pros and significant cons, and the chances of a large success are probably small. But in my personal opinion, the NOVID team and its product-market fit seem more promising than those of many other groups I’ve seen at a similar stage that went on to be very successful.
Some remaining questions one might have:
“If this application is valuable for preventing COVID-19, why haven’t traditional funders been more interested?”
“Isn’t there already a lot of money going to COVID relief?”
The easy answer for me would be that NOVID is particularly useful for helping with future pandemics, which other funders probably care less about, so we have an obvious advantage. The more frustrating, but seemingly true (from what I can tell), answer is more like: “The existing COVID funding system isn’t that great, and there are some substantial gaps in it.” My impression is that this application sits in a strange gap in the funding environment; hopefully, as it gets more traction, other non-longtermist funders will join in.
All that said, while I found NOVID interesting, I thought it was overall a better fit for other funders: its funding needs were significantly larger than the LTFF’s maximum grant size, and the LTFF is more exclusively focused on long-term value than other funders are. Unfortunately, those other funders would take a few more months to make a decision, and NOVID needed a commitment sooner.
Therefore, this donation was structured as a sort of advance via the Survival and Flourishing Fund. The SFF will consider funding NOVID in the next few months. If they decide to fund NOVID for over $100,000, the next $100,000 they donate will instead be given back to the LTFF. For example, if they decide to donate $500,000, $100,000 of this will go to the LTFF instead of NOVID; if they decide on $150,000, the LTFF will get $50,000. This way, NOVID could get a commitment and some funding quickly, but the SFF will likely pay the bill in the end.
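As a quick sketch of the arithmetic in that arrangement (the function name is made up, and the figures simply restate the examples above):

```python
def ltff_reimbursement(sff_decision, advance=100_000):
    """Amount returned to the LTFF under the advance structure described
    above: whatever the SFF grants beyond the first $100,000, capped at
    the $100,000 the LTFF advanced. Illustrative only."""
    return min(max(sff_decision - advance, 0), advance)

print(ltff_reimbursement(500_000))  # 100000 of the SFF grant goes to the LTFF
print(ltff_reimbursement(150_000))  # 50000
print(ltff_reimbursement(80_000))   # 0 (no reimbursement below $100,000)
```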
Stanford Existential Risks Initiative – $60,000
Providing general purpose support via the Berkeley Existential Risks Initiative.
The Stanford Existential Risks Initiative (SERI) is a new existential risk mitigation program at Stanford. They have 12 undergraduate student affiliates and what seems like a strong team of affiliates and mentors.
They seem to do a lot of things. From their website:
“Our current programs include a research fellowship for Stanford undergraduates, an annual conference, speaker events, discussion groups, and a Thinking Matters class (Preventing Human Extinction) taught annually by two of the initiative’s faculty advisors.”
This grant is to support them via BERI. I’ve personally had very good experiences with BERI and have heard positive reviews from others. BERI doesn’t just provide useful operations help; more importantly, they provide an alternative funding structure for a large class of things that universities tend not to fund. (It seems like the most prestigious universities sometimes have the most intensely restrictive funding policies, so this alternative could be a big deal.) I used BERI for help funding contractors for Foretold.io, and am confident I wouldn’t have been able to do this otherwise. (Note to those reading this: if you have an organization and would like help from BERI, I suggest reading this page and emailing them.)
My guess is that dollars given via these requested BERI organizational grants are somewhat more valuable than marginal dollars going directly to the corresponding organizations (if this weren’t the case, we would just donate to the organizations directly). For one, BERI organizational grants typically fund part-time help and small but valuable operations tasks that cut horizontally across many of an organization’s projects. I think “making sure that existing projects go more smoothly” is a rather safe bet.
This specific grant would be spent on a combination of:
Stipends and researcher support for approximately one year
Existential risk seed grants to graduate students
Support for a spring conference
Support for future courses on existential risks at Stanford
Miscellaneous funds and BERI core operations expenses
This was a fairly easy bet to make. The Stanford EA community has historically been successful, and the current community seems promising. I reached out to a few people close to the community and got glowing reviews. The program seems to have a lot of activity and senior longtermist mentors.
The main downsides for this decision are more about the structure of the LTFF and the longtermist funding landscape than about SERI. This grant was somewhat unusual in that SERI was already funded by Open Philanthropy, but multiple parties (OP, BERI) wanted them to have a more diverse set of funding sources. One key question for me was whether the LTFF would really provide the intended benefits of a diverse funding ecosystem. It would be easy for the LTFF to simply defer to Open Phil on which groups are worth funding, for example. But if we did, that would defeat the main benefits of funding diversification.
This donation seemed like a good enough bet, and the organization is still small enough, that the grant is a solid use of LTFF funding.
Joel Becker – $42,000
Conducting research into forecasting and pandemics during an economics Ph.D.
Joel Becker is a first-year Ph.D. student at NYU who is pursuing forecasting research. The grant replaces three years of teaching salary, allowing Joel to pursue research instead of teaching.
He intends to work on a few projects, including (text taken from the application):
Extend ‘external validity’ projects with Rajeev Dehejia, using expert forecasting to select treated units in development interventions.
Explore the accuracy of long-term forecasts in simulated environments (e.g. real history with blinded participants, games with different time horizons).
Explore trade-offs in forecast elicitation techniques (accuracy vs. cost of training, accuracy from full distributions vs. point estimates, returns to different accuracy rewards, etc).
I think Joel could plausibly become a successful academic. He’s been part of the Philanthropy Advisory Fellowship at Harvard and will soon join the Forethought Foundation’s Global Priorities Fellowship, which gives me the impression that he’s fairly committed to improving the long-term future.
Joel currently seems to be working on fairly standard academic forecasting topics. This has the benefits of accessible advisors and easier integration with other work, but the downside is that current academic forecasting topics are likely different from what would be ideal for longtermists. My intuition is that much of the most valuable work on generic forecasting will combine theoretical foundations with advanced software implementations, and the existing economics work is fairly separate from this. For example, much of the literature explores laboratory experiments with what seem to me to be unrealistic and inhibited scenarios.
I’ve read some of Joel’s in-progress work, and he seems to get some things right (asking better questions than other academic work, and using more modern software), but this work is very early-stage.
The big question is what direction Joel goes after doing this research. I could see the research helping to build valuable credentials and skills. This could put him in a position to spearhead bold and/or longtermism-centric projects later on.
Feedback
If you have any feedback, we would love to hear from you. You can submit your thoughts through this form or email us at longtermfuture@effectivealtruismfunds.org.