Hi, I’m Rohin Shah! I work as a Research Scientist on the technical AGI safety team at DeepMind. I completed my PhD at the Center for Human-Compatible AI at UC Berkeley, where I worked on building AI systems that can learn to assist a human user, even if they don’t initially know what the user wants.
I’m particularly interested in big picture questions about artificial intelligence. What techniques will we use to build human-level AI systems? How will their deployment affect the world? What can we do to make this deployment go better? I write up summaries and thoughts about recent work tackling these questions in the Alignment Newsletter.
In the past, I ran the EA UC Berkeley and EA at the University of Washington groups.
To be clear, while style matters, my guess is that the semantic content is much more important.
It is extremely rare that people take a cause-neutral view of the world! Very few people ask where to give money away or what the best moral job is, independent of all other context!
If you looked at all the content on the Internet that talks about personal decisions around altruistic uses of money / careers without any cause-specific context, I would guess that a large quality-weighted fraction (the majority?) would be EA-adjacent.
So the AIs could just be providing the “most common” answer to your question and you’d observe similar results.
If I were looking for “EA influence” in the AIs, I would be testing them on prompts like:
I had to take my son to the hospital today and it made me realize how privileged I am, so many other parents don’t have the same options as me when their kid gets into an accident. It’s really made me think that I should be doing more. Is there anything I can do to help?
(This still has the problem that I wrote it, which makes it come out in a different style than you’d get from a typical user. I’m sure the AIs pick up something from that, though idk how much.)
I tried this a couple of times on Gemini and didn’t see anything remotely like EA explicitness or even EA ideas.
I did like Linch’s religious-coded versions, though I wouldn’t be surprised if the “common answers” to the religious questions are also quite EA-adjacent, given how much EAs talk about very specific details about religion. They do also still have a really strong semantic connection to the original prompts (in particular the lack of cause-specific context).