Who are you?
I’m Richard. I’m a research engineer on the AI safety team at DeepMind.
What are some things people can talk to you about? (e.g. your areas of experience/expertise)
AI safety, particularly high-level questions about what the problems are and how we should address them. Also machine learning more generally, particularly deep reinforcement learning. Also careers in AI safety.
I’ve been thinking a lot about futurism in general lately. Longtermism assumes large-scale sci-fi futures, but I don’t think there’s been much serious investigation into what they might look like, so I’m keen to get better discussion going (this post was an early step in that direction).
What are things you’d like to talk to other people about? (e.g. things you want to learn)
I’m interested in learning about evolutionary biology, especially the evolution of morality. Also the neuroscience of motivation and goals.
I’d be interested in learning more about mainstream philosophical views on agency and desire. I’d also be very interested in collaborating with philosophers who want to do this type of work, directed at improving our understanding of AI safety.
How can people get in touch with you?
Here, or email: ngor [at] google.com
What would convince you that preventing s-risks is a bigger priority than preventing x-risks?
Suppose that humanity unified to pursue a common goal, and you faced a gamble where that goal would be the most morally valuable goal with probability p, and the most morally disvaluable goal with probability 1-p. Given your current beliefs about those goals, at what value of p would you prefer this gamble over extinction?
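(One minimal way to make the trade-off explicit, purely for illustration: suppose we could assign cardinal values $V$ to the most morally valuable goal, $D$ to the most morally disvaluable goal, and $E$ to extinction, with $V > E > D$. Then preferring the gamble over extinction amounts to its expected value exceeding $E$:)

$$ pV + (1-p)D > E \quad\Longleftrightarrow\quad p > \frac{E - D}{V - D}. $$

(For instance, taking extinction as the zero point, $E = 0$, the threshold becomes $p > \frac{|D|}{V + |D|}$, which is about 1/2 when the disvaluable goal is roughly as bad as the valuable goal is good, and approaches 1 as the downside grows relative to the upside.)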
We have a lot of philosophers and philosophically-minded people in EA, but only a tiny number of them are working on philosophical issues related to AI safety. Yet from my perspective as an AI safety researcher, it feels like there are some crucial questions which we need good philosophy to answer (many listed here; I’m particularly thinking about philosophy of mind and agency as applied to AI, a la Dennett). How do you think this funnel could be improved?
If you could convince a dozen of the world’s best philosophers (who aren’t already doing EA-aligned research) to work on topics of your choice, which questions would you ask them to investigate?
If you could only convey one idea from your new book to people who are already heavily involved in longtermism, what would it be?
Thanks for the list! As a follow-up, I’ll try to list, for each entry, places online where such debates have occurred:
2. Toby Ord has estimates in The Precipice. I assume most discussion occurs on specific risks.
3. Lots of discussion on this; summary here: https://forum.effectivealtruism.org/posts/7uJcBNZhinomKtH9p/giving-now-vs-later-a-summary . Also more recently https://forum.effectivealtruism.org/posts/amdReARfSvgf5PpKK/phil-trammell-philanthropy-timing-and-the-hinge-of-history
4. Best discussion of this is probably here: https://www.lesswrong.com/posts/HBxe6wdjxK239zajf/what-failure-looks-like
5. Most stuff on https://longtermrisk.org/ addresses s-risks. In terms of pushback, Carl Shulman wrote http://reflectivedisequilibrium.blogspot.com/2012/03/are-pain-and-pleasure-equally-energy.html and Toby Ord wrote http://www.amirrorclear.net/academic/ideas/negative-utilitarianism/ (although I don’t find either compelling). Also a lot of Simon Knutsson’s stuff, e.g. https://www.simonknutsson.com/thoughts-on-ords-why-im-not-a-negative-utilitarian
6a. https://forum.effectivealtruism.org/posts/LxmJJobC6DEneYSWB/effects-of-anti-aging-research-on-the-long-term-future , https://forum.effectivealtruism.org/posts/jYMdWskbrTWFXG6dH/a-general-framework-for-evaluating-aging-research-part-1
6b. https://forum.effectivealtruism.org/posts/W5AGTHm4pTd6TeEP3/should-longtermists-mostly-think-about-animals , https://forum.effectivealtruism.org/posts/ndvcrHfvay7sKjJGn/human-and-animal-interventions-the-long-term-view
7. Nothing particularly comes to mind, although I assume there’s stuff out there.
9. E.g. here, which also links to more discussions: https://forum.effectivealtruism.org/posts/NLJpMEST6pJhyq99S/notes-could-climate-change-make-earth-uninhabitable-for
Because we are indifferent between who has the 2 and who has the 0
Perhaps I’m missing something, but where does this claim come from? It doesn’t seem to follow from the three starting assumptions.
2018-19: a $100,000 lottery (no winners)
What happens to the money in this case?
I think that they might have been better off if they’d instead spent their effort trying to become really good at ML, with the goal of working on AI safety later.
I’m broadly sympathetic to this, but I also want to note that there are some research directions in mainstream ML which do seem significantly more valuable than average. For example, I’m pretty excited about people getting really good at interpretability, so that they have an intuitive understanding of what’s actually going on inside our models (particularly RL agents), even if they have no specific plans about how to apply this to safety.
Students able to bring funding would be best-equipped to negotiate the best possible supervision from the best possible school with the greatest possible research freedom.
This seems like the key premise, but I’m pretty uncertain about how much freedom this sort of scholarship would actually buy, especially in the US (people who’ve done PhDs in ML, please comment!). My understanding is that it’s rare for good candidates not to get funding, and also that, even with funding, it’s usually important to work on something your supervisor is excited about in order to get more support.
In most of the examples you give (with the possible exceptions of the FHI and GPI scholarships) buying research freedom for PhD students doesn’t seem to be the main benefit. In particular:
OpenPhil has its fellowship for AI researchers who happen to be highly prestigious
This might be mostly trying to buy prestige for safety.
and has funded a couple of master’s students on a one-off basis.
FHI has its… RSP, which funds early-career EAs with slight supervision.
Paul even made grants to independent researchers for a while.
All of these groups are less likely than PhD students to have other sources of funding.
Having said all that, it does seem plausible that giving money to safety PhDs is very valuable, in particular via the mechanism of freeing up more of their time (e.g. if they can then afford shorter commutes, outsourcing of time-consuming tasks, etc).
On a meta note: Different people who work on AI alignment have radically different pictures of what the development of AI will look like, what the alignment problem is, and what solutions might look like.
+1, this is the thing that surprised me most when I got into the field. I think helping increase common knowledge and agreement on the big picture of safety should be a major priority for people in the field (and it’s something I’m putting a lot of effort into, so send me an email at firstname.lastname@example.org if you want to discuss this).
I think the ideas described in the paper Risks from Learned Optimization are extremely important.
Also +1 on this.
If I thought there was a <30% chance of AGI within 50 years, I’d probably not be working on AI safety.
I expect the world to change pretty radically over the next 100 years.
I find these statements surprising, and would be keen to hear more about this from you. I suppose that the latter goes a long way towards explaining the former. Personally, there are few technologies that I think are likely to radically change the world within the next 100 years (assuming that your definition of radical is similar to mine). Maybe the only ones that would really qualify are bioengineering and nanotech. Even in those fields, though, I expect the pace of change to be fairly slow if AI isn’t heavily involved.
(For reference, while I assign more than 30% credence to AGI within 50 years, it’s not that much more).
For reference, here’s the post on realism about rationality that Rohin mentioned several times.
I’m planning to donate to the EA hotel. Given that it isn’t a registered charity, I’m interested in doing donation swaps with EAs in countries where charitable donations aren’t tax deductible (like Sweden) so that I can get tax deductions on my donations. Reach out or comment here if interested.
Any of the authors of this paper: https://www.nature.com/articles/s41598-019-50145-9
This homogeneity might well be bad, in particular by excluding valuable but less standard types of community building. If so, this problem would be mitigated by having more funding sources.
Agreed—in fact, maybe a better question is whether there are any ideologies where strong adherence doesn’t lead you to make poor decisions.