Co-founding BlueDot Impact, focusing on AI safety talent pipeline strategy.
Have a background consisting of a brief research stint on pessimistic agents (reinforcement learning), ML engineering & product ownership, and Physics
Hey! I have been thinking about this a lot from the perspective of a confused community builder / pipeline strategist, too. I didn’t get as far as Neel, so it’s been great to read this post before getting anywhere near finishing my thoughts. It captures a lot of the same things better than I had. Thanks for your comment too, there is definitely a lot of overlap here!
I have got as far as some ideas, here, and would love any initial thoughts before I try to write it up with more certainty?
First, a distinction which I think you’re pointing at: an inside view on what? The thing I can actually have an excellent inside view about as a (full-time) community builder is how community building works. Like, how to design a programme, how people respond to certain initiatives, how likely certain things are to work, etc.
Next, programmes that lead to working in industry, academic field building, independent research, etc, look different. How do I decide which to prioritise? This might require some inside view on how each direction changes the world (and interacts with the others), and lead to an answer on which I’m most optimistic about supporting. There is nobody to defer to here, as practitioners are all (rightly) quite bullish about their choice. Having an inside view on which approach I find most valuable will lead to quite concrete differences in the ultimate strategy I’m working towards or direction I’m pointing people in, I think.
When it comes to what to think about object-level work (i.e. how does alignment happen, technically), I get more hazy on what I should aim for. By statistical arguments, I reckon most inside views that exist on what work is going to be valuable are probably wrong. Why would mine be different? Alternatively, they might all be valuable, so why support just one? Or something in between. Either way, if I am doing meta work, it will probably be wrong to be bullish about my single inside view on ‘what will go wrong’. I think I should aim to support a number of research agendas if I don’t have strong reasons to believe some are wrong. I think this is where I will be doing most of my deferral, ultimately (and as the field shifts from where I left it).
However, understanding how valuable the object-level work is does seem important for deciding which directions to support (e.g. academia vs industry), so I’m a bit stuck on where to draw a line. As Neel says, I might hope to get as far as understanding what other people believe about their agenda and why. I always took this as “can I model the response person X would give, when considering an unseen question”, rather than memorising person X’s response to a number of questions.
I think where I am landing on this is that it might be possible to assume a uniform prior over the directions I could take, and adjust my posterior by ‘learning things’ and properly understanding their models at both the direction level and the object level. Another thought I want to explore: is this something like a worldview diversification over directions? It feels similar, as we’re in a world where it ‘might turn out’ some agenda or direction was correct, but there’s no way of knowing that right now.
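To make the "uniform prior, then adjust" idea concrete, here is a minimal sketch of a single Bayesian update over directions. The direction names and the likelihood numbers are purely hypothetical placeholders I made up for illustration; they are not estimates of anything.

```python
def uniform_prior(directions):
    """Give every direction the same starting credence."""
    n = len(directions)
    return {d: 1.0 / n for d in directions}


def update(prior, likelihood):
    """One Bayes update: posterior is proportional to prior * likelihood."""
    unnormalised = {d: prior[d] * likelihood[d] for d in prior}
    total = sum(unnormalised.values())
    return {d: w / total for d, w in unnormalised.items()}


# Hypothetical directions a field builder might support.
directions = ["industry", "academia", "independent research"]
prior = uniform_prior(directions)

# Hypothetical: how well each direction's model explains something I learned.
likelihood = {"industry": 0.6, "academia": 0.3, "independent research": 0.3}

posterior = update(prior, likelihood)
```

With these made-up numbers the posterior shifts towards "industry" while keeping non-zero weight on the others, which is roughly the diversification picture: no direction is dropped unless the evidence against it is strong.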
To confirm—I believe people doing the object-level work (i.e. alignment research) should be bullish about their inside view. Let them fight it out, and let expert discourse decide what is “right” or “most promising”. I think this amounts to Neel’s “truth-seeking” point.
Thanks for proposing this, it’s a great idea!
I am quite interested in exploring this more. We currently have a prosaic list of potential capstone projects, intended to be done by AGI Safety Fundamentals programme participants after the course. I think this forms a prototype of this proposal. However, one main difference is that it’s not really about cutting-edge research, more like skilling-up and fit-testing ideas.
Some questions I am interested in exploring further to see where the needs are for this meta-project:
Do AIS researchers have a list of projects ready-to-go that they don’t publicly advertise?
My guess: probably? Most people have too many threads they want to pursue in not enough time—in my experience it doesn’t take long to get to this point in a research career.
If researchers do have lists of project ideas, why don’t they share the lists already?
My guesses as to why:
1) They haven’t (all) written them down in a sharable format
2) There’s no logical place to put them (e.g. a board doesn’t exist), so no obvious inspiration to publish a list or proof that it’s impactful.
3) Working on a project requires a lot of context on previous work done in the same vein. Supporting people who want to pursue a project idea would be a lot of work for the idea-originator, to get the project off the ground.
With a public board, the researcher doesn’t get to vet candidates who might try to take the projects on, which might make it difficult to manage deciding who they want to work with or not on top of doing their regular work.
Vetting and management/ops are some of the things AI Safety Camp and the CERI summer research fellowship offer, for that reason.
Why don’t other fields have public lists of ideas? E.g. is there a list of research ideas for nuclear fusion? I don’t think there is—so other fields also do project coordination behind closed doors, in academic institutions and conferences.
My guess as to why: other fields may just not be well coordinated and maybe we could just do better. They also don’t have as much untied funding for individual researchers to spring up and take stuff on, outside of universities/institutions.
Interested if you or others have thoughts!
I have been community building in Cambridge UK in some way or another since 2015, and have shared many of these concerns for some time now. Thanks so much for writing them up much more eloquently than I would have been able to!
To add some more anecdotal data, I also hear the ‘cult’ criticism all the time. In terms of getting feedback from people who walk away from us: this year, an affiliated (but non-EA), problem-specific table coincidentally ended up positioned downstream of the EA table at a freshers’ fair. We anecdotally overheard approx 10 groups of 3 people discussing that they thought EA was a cult, after they had bounced from our EA table. Probably around 2000-3000 people passed through, so the people we overheard make up only 1-2% of that total.
I managed to dig into these criticisms a little with a couple of friends-of-friends outside of EA, and got a couple of common pieces of feedback which it’s worth adding.
We are giving away many free books lavishly. They are written by longstanding members of the community. These feel like doctrine, to some outside of the community.
Being a member of the EA community is all or nothing. My best guess is we haven’t thought of anything less intensive to keep people occupied due to the historical focus on HEAs, where we are looking for people who make EA their ‘all’ (a point well made in this post).
Personally, I think one important reason the situation is different now to how it was some years ago is that EA has grown in size and influence since 2015. It’s more likely someone has encountered it online, via 80k or some podcast. In larger cities, it’s more likely individuals know friends who have been to an EA event. I think we have ‘got away with’ people thinking it’s a cult for a while because not enough people knew about EA. I like to say that the R rate of gossip was < 1, so it didn’t proliferate. I feel we’re nearing or passing a tipping point where discussing EA without EA members present becomes an interesting topic of conversation for non-EAs, since people can relate and have all had personal experiences with the movement.
In my own work now, I feel much more personally comfortable leaning into cause area-specific field building, and groups that focus around a project or problem. These are much more manageable commitments, and can exemplify the EA lens of looking at a project without it being a personal identity. Important caveats for the record, I still think EA-aligned motivations are important, and I am still a big supporter of the EA Cambridge group, and I think it is run by conscientious people with good support networks :-)
I made this comment with the assumption that some of these people could have extremely valuable skills to offer to the problems this community cares about. These are students at a top uni in the UK for sciences, many of whom go on to be significantly influential in politics and business, at a much higher rate than students at other unis or the general population.
I agree not every student fits this category, or is someone who will ever be inclined towards EA ideas. However I don’t know if we are claiming that being in this category (e.g. being in the top N% at Cambridge) correlates with a more positive baseline-impression of EA community building? Maybe the more conscientious people weren’t ringleaders in making the comments, but they will definitely hear them which I think could have social effects.
I agree that EA will not be for everyone, and we should seek good intellectual critiques from those people that disagree on an intellectual basis. But to me the thrust of this post (and the phenomenon I was commenting on) was: there are many people with the ability to solve the world’s biggest problems. It would be a shame to lose their inclination purely due to our CB strategies. If our strategy could be nudged to achieve better impressions at people’s first encounter with EA, we could capture more of this talent and direct them to the world’s biggest problems. Community building strategy feels much more malleable than the content of our ideas or common conclusions, which we might indeed want to be more bullish about.
I do accept the optimal approach to community building will still turn some people off, but it’s worth thinking about this intentionally. As EA grows, CB culture gets harder to fix (if it’s not already too large to change course significantly).
I also didn’t clarify this in my original comment. It was my impression that many of them had already encountered EA, rather than having picked this up from the messaging of the table. It’s been too long to confirm for sure now, and more surveying would help to confirm. This would not be surprising though, as EA has a larger presence at Cambridge than at most other unis (and not everyone at freshers’ fair is a first year, many later-stage students attend to pick up new hobbies or whatever).
I am currently pursuing a couple of projects that are intended to appeal to the sensibilities of AI researchers who aren’t in the alignment community already. This has already been very useful for informing the communications and messaging I would use for those. I can see myself referring back to this often, when pursuing other field building activities too. Thanks a lot for publishing this!
Thanks for trying this and writing it up :-) I think there might well be some benefits to getting through the programme intensively, like:
+ It takes less time so you can do whatever you want to do next (you mention reading more researchers’ agenda)
+ A better bonding experience if you do it in-person, which might only be possible in a 1-week intensive session if you don’t all live in the same city
My perceived drawbacks:
- Less digestion & retention of the content (you highlighted this one)
- Less opportunity to mix with other people doing the programme (we hope to spark more of this next time we run the global programme)
- Might be harder to have access to facilitators / more knowledgeable people (you also stated this is important)
Overall, I think continuing the global programme suits people who couldn’t take the time to do an intensive version, and the intensive version suits people who prefer not to do virtual reading groups.
EA knowledge is not required. Thanks for asking!
Someone brought a game called “Confident?” into the Cambridge office. It’s basically a competitive gamification of calibration training.
You are rewarded for having the smallest confidence window of all the players, and penalised if your answer is outside of your confidence window.
Super fun!
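For anyone curious how that kind of scoring plays out, here is a rough sketch of one round. I am assuming the point values and the tie handling myself (the function name `score_round` and its `reward` and `penalty` parameters are hypothetical), so this is not the game's official rulebook, just the rule as I described it.

```python
def score_round(guesses, truth, reward=1, penalty=-1):
    """Score one round of an interval-guessing game.

    guesses: dict mapping player name -> (low, high) confidence window.
    truth:   the correct numeric answer.
    Assumed rule: players whose window misses the truth take the penalty;
    among players whose window contains it, the narrowest window(s) get the reward.
    """
    scores = {}
    widths_of_hits = {}
    for name, (low, high) in guesses.items():
        if low <= truth <= high:
            widths_of_hits[name] = high - low
            scores[name] = 0  # correct but not necessarily the narrowest
        else:
            scores[name] = penalty  # answer fell outside the window
    if widths_of_hits:
        narrowest = min(widths_of_hits.values())
        for name, width in widths_of_hits.items():
            if width == narrowest:
                scores[name] = reward
    return scores


# Hypothetical round: the true answer is 150.
result = score_round({"A": (100, 200), "B": (50, 500), "C": (300, 400)}, truth=150)
```

Here A has the narrowest window that still contains the answer, B is correct but too wide, and C is overconfident and misses, which is the tension between narrow and safe windows that makes it feel like calibration training.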
I wrote this guide for Cambridge, UK, when Cambridge EA CIC was running a hiring round.
I think a guide for Cambridge based on your template would still be valuable (but I won’t do it any time soon). In my guide, I was focused on 1) a broader audience (including ‘non-EAs’) and 2) moving to Cambridge rather than visiting temporarily.
Thanks for exploring this issue! I agree that there could be more understanding between AI safety & the wider AI community, and I’m curious to do more thinking about this.
I think each of the 3 claims you make in the body of the text is broadly true. However, I don’t think they directly back up the claim in the title that “AI safety is not separate from near-term applications”.
I think there are some important ways that AI safety is distinct; it goes 1 step further by imagining the capabilities of future systems, and trying to anticipate ways they could go wrong ahead of time. I think there are some research questions it’d be hard to work on if the AI safety field wasn’t separate from current-day application research. E.g. agent foundations, inner misalignment and detecting deception.
I think I agree with much of your sentiment still. To illustrate what I mean, I would like it to be true that:
Important AI current-day-application safety issues are worked on by many people, and there is mutual respect between our communities
Work done by near-term application researchers is known about and leverageable by the AGI safety community
Ultimately, there is still a distinct, accessible AGI safety community that works on issues distinct to advanced, general AI systems
I also think that power dynamics are the source of the biggest problems in the work/social overlap, so a flatter power structure might be a good way of avoiding some of the pitfalls and abuses of the work/social overlap.
Do you think that in abstract that professional/social overlap is less of a problem when the power structure is flatter, or that having a flatter power structure is something that EA could actually achieve?
I’m curious because, to deal with potential abuse of power, I would prefer a more explicit power structure (which sounds like an opposite conclusion to your suggestion).
My first assumption is that power structures are an unavoidable fact in any group of people. I assume that trying to enact a flatter power structure might actually cash out as pretending the power structure doesn’t exist [this might be where we disagree!].
Pretending that power structures are flat leads to plausibly permissible abuse of the actual underlying power structure. However, strictly acknowledging a power structure means one is forced to acknowledge the power dynamic.
So to encourage healthy relationships, I would have called for making power structures explicit, in EA or any group.
Great, thanks for writing this up! I don’t work in policy, but it seems to be an extremely pragmatic and helpful guide from an outside-perspective.
A question—is being a US citizen a hard requirement for all of this advice?
If not a hard requirement, what hidden (or explicit) barriers would you expect a non-citizen to face?
FWIW, I think this post makes progress and could work in the contexts of some groups. As a concrete example, it would probably work for me as an organiser of one-off courses, and probably for organisers of one-off retreats or internships.
I appreciate the thrust of comments pointing out imperfections in e.g. local group settings, but I just want to be careful that we don’t throw out the proposal just because it doesn’t work for everyone in all contexts. I think it’s better to start with an imperfect starting point and iterate on that where it doesn’t work in specific contexts, rather than to try to come up with the perfect policy in theory and get paralysed when we can’t achieve that.
Thanks for highlighting this!
Hi, I am a little late to your comment, but I am unsure that the other replies address this. Though 80% of homicide victims are male, this doesn’t mean anything like 80% of men experience homicide. However, 35% of women experience intimate partner violence or sexual violence. It seems to me that the homicide statistic you give doesn’t take the scale of homicide into account: the share of men who experience homicide is much smaller than 35% of the male population. I would accept your point that the comparable rate of intimate partner violence of any kind for men isn’t given. My prior is that it is lower, but as you point out this isn’t easily evidenced.