L Rudolf L
Why we’re not founding a human-data-for-alignment org
Much EA value comes from being a Schelling point
Assessing SERI/CHERI/CERI summer program impact by surveying fellows
AI Risk Intro 1: Advanced AI Might Be Very Bad
Book review: The Doomsday Machine
I agree that in practice x-risk involves different types of work and people than e.g. global poverty or animal welfare. I also agree that there is a danger of x-risk / long-termism cannibalizing the rest of the movement, and this might easily lead to bad-on-net things like effectively trading large amounts of non-x-risk work for very little x-risk / long-termist work (because the x-risk people would have found their work anyway had x-risk been a smaller fraction of the movement, but as a consequence of x-risk preeminence a lot of other people are not sufficiently attracted to even start engaging with EA ideas).
However, I worry about something like intellectual honesty. Effective Altruism, both the term and the concept, is about effective forms of helping other people, and lots of people keep coming to the conclusion that preventing x-risks is one of the best ways of doing so. It seems almost intellectually dishonest to try to cut off or “hide” (in the weak sense of reducing the public salience of) the connection. One of the main strengths of EA is that it keeps pursuing that whole “impartial welfarist good” thing even if it leads to weird places, and I think EA should be open about the fact that it seems weird things follow from trying to do charity rigorously.
I think ideally this looks like global poverty, animal welfare, x-risk, and other cause areas all sitting under the EA umbrella, and engaged EAs in all of these areas being aware that the other causes are also things that people following EA principles have been drawn towards (and therefore prompted to weigh them against each other in their own decisions and cause prioritization). Of course this also requires that one cause area does not monopolize the EA image.
I agree with your concern about the combination seeming incongruous, but I think there are good ways to pitch this while tying them all into core EA ideas, e.g. something like:
If you start thinking quantitatively about how to do the most good, you might realize that some especially promising ways of doing good are cases where
-there is clear evidence that a small amount of money goes far, like helping extremely poor people in developing countries
-some suffering has clearly not historically been taken into account, like animal welfare
-the stakes are absolutely huge, like plausible catastrophes that might affect the entire world
I think you also overestimate the cultural congruence between non-x-risk causes like, for example, global poverty and animal welfare. These concerns span from hard-nosed veteran economists who think everything vegan is hippy nonsense and only people matter morally, to young non-technical vegans with concern for everything right down to worms. Grouping these causes together only looks normal because you’re so used to EA.
(likewise, I expect low weirdness gradients between x-risk and non-x-risk causes, e.g. nuclear risk reduction policy and developing-country economic development, or GCBRs and neglected diseases)
A model of research skill
[Fiction] A Disneyland Without Children
AI Risk Intro 2: Solving The Problem
(A) Call this “Request For Researchers” (RFR). OpenPhil has tried a more general version of this in the form of the Century Fellowship, but they discontinued this. That in turn is a Thiel Fellowship clone, like several other programs (e.g. Magnificent Grants). The early years of the Thiel Fellowship show that this can work, but I think it’s hard to do well, and it does not seem like OpenPhil wants to keep trying.
(B) I think it would be great for some people to get support for multiple years. PhDs work like this, and good research can be hard to do over a series of short few-month grants. But also the long durations just do make them pretty high-stakes bets, and you need to select hard not just on research skill but also the character traits that mean people don’t need external incentives.
(C) I think “agenda-agnostic” and “high quality” might be hard to combine. It seems like there are three main ways to select good people: rely on competence signals (e.g. lots of cited papers, works at a selective organisation), rely on more-or-less standardised tests (e.g. a typical programming interview, SATs), or rely on inside-view judgements of what’s good in some domain. New researchers are hard to assess by the first, I don’t think there’s a cheap programming-interview-but-for-research-in-general that spots research talent at high rates, and therefore it seems you have to rely a bunch on the third. And this is very correlated with agendas; a researcher in domain X will be good at judging ideas in that domain, but less so in others.
The style of this that I’d find most promising is:
Someone with a good overview of the field (e.g. at OpenPhil) picks a few “department chairs”, each with some agenda/topic.
Each department chair picks a few research leads who they think have promising work/ideas in the direction of their expertise.
These research leads then get collaborators/money/ops/compute through the department.
I think this would be better than a grab-bag of people selected according to credentials and generic competence, because I think an important part of the research talent selection process is the part where someone with good research taste endorses the agenda takes of someone else on agenda-specific inside-view grounds.
New academic publishing system
Research that will help us improve, Epistemic Institutions, Empowering Exceptional People
It is well-known that the incentive structure for academic publishing is messed up. Changing publish-or-perish incentives is hard. However, one particular broken thing is that some journals operate on a model where they rent out their prestige to both authors (who pay to have their works accepted) and readers (who pay to read), extracting money from both while providing little value except their brand. This seems like a situation that could be disrupted, though probably not directly through competing on prestige with the big journals. Alternatives might look like something simple like expanding the scope of free preprint services like arXiv and bioRxiv to every field, or something more complicated like providing high-quality help and services for paper authors to incentivize them to submit to the new system. If established, a popular and prestigious academic publishing system would also be a good platform from which to push other academia-related changes (especially incentivizing the right kinds of research).
Prosocial social platforms
Epistemic institutions, movement-building, economic growth
The existing set of social media platforms is not particularly diverse, and existing platforms also often create negative externalities: reducing productive work hours, plausibly lowering epistemic standards, and increasing signalling/credentialism (by making easily legible credentials more important, and in some cases reducing the dimensionality of competition, e.g. LinkedIn reducing people to their most recent jobs and place of study, again making the competition for credentials in those things harsher). An enormous amount of value is locked away because valuable connections between people don’t happen.
It might be very high-value to search through the set of possible social platforms and try to find ones that (1) make it easy to find valuable connections (hiring, co-founders, EA-aligned people, etc.) and trust in the people found through that process, (2) provide incentives to help other people and do useful things, and (3) de-emphasize unhealthy credentialism.
There is currently an active cofounder matching process going on for an organisation to do this, expected to finish in late mid-June and with work starting at the latest a month or two later. Feel free to DM me or Marc-Everin Carauleanu (who independently submitted this idea to the FTX FF idea competition) if you want to know more.
Anything concrete about the exact nature of what service alignment researchers most need, how much this problem is estimated to block progress on alignment, the pros and cons of existing orgs each having their own internal service for this, and how large the alignment-related data market is would be very welcome.
A possible introduction-to-EA essay
I’d like to add an asterisk. It is true that you can and should support things that seem good while they seem good and then retract support, or express support on the margin but not absolutely. But sometimes supporting things for a period has effects you can’t easily take back. This is especially the case if (1) added marginal support summons some bigger version of the thing that, once in place, cannot be re-bottled, or (2) increased clout for that thing changes the culture significantly (I think cultural changes are very hard to reverse; culture generally doesn’t go back, only moves on).
I think there are many cases where, before throwing their lot in with a political cause for instrumental reasons, people should’ve first paused to think more about whether this is the type of thing they’d like to see more of in general. Political movements also tend to have an enormous amount of inertia, and often end up very influenced by path-dependence and memetic fitness gradients.
I think it’s worth trying hard to stick to strict epistemic norms. The main argument you bring against this is that it’s more effective to be more permissive about bad epistemics. I doubt this. It seems to me that people overstate the track record of populist activism at solving complicated problems. If you’re considering populist activism, I would think hard about where, how, and on what it has worked.
Consider environmentalism. It seems quite uncertain whether the environmentalist movement has been net positive (!). This is an insane admission to have to make, given that the science is fairly straightforward, environmentalism is clearly necessary, and the movement has had huge wins (e.g. massive shift in public opinion, pushing governments to make commitments, & many mundane environmental improvements in developed country cities over the past few decades). However, the environmentalist movement has repeatedly spent enormous efforts on directly harming their stated goals through things like opposing nuclear power and GMOs. These failures seem very directly related to bad epistemics.
In contrast, consider EA. It’s not trivial to imagine a movement much worse along the activist/populist metrics than EA. But EA seems quite likely positive on net, and the loosely-construed EA community has gained a striking amount of power despite its structural disadvantages.
Or consider nuclear strategy. It seems a lot of influence was had by e.g. the staff of RAND and other sober-minded, highly-selected, epistemically-strong actors. Do you want more insiders at think-tanks and governments and companies, and more people writing thoughtful pieces that swing elite opinion, all working in a field widely seen as credible and serious? Or do you want more loud activists protesting on the streets?
I’m definitely not an expert here, but by thinking through what I understand about the few cases I can think of, the impression I get is that activism and protest have worked best to fix the wrongs of simple and widespread political oppression, but that on complex technical issues higher-bandwidth methods are usually how actual progress is made.
I think there are also some powerful but abstract points:
Choosing your methods is not just a choice over methods, but also a choice over who you appeal to. And who you appeal to will change the composition of your movement, and therefore, in the long run, the choice of methods. Consider carefully before summoning forces you can’t control (this applies both to superhuman AI as well as epistemically-shoddy charismatic activist-leaders).
If we make the conversation about AIS more thoughtful, reasonable, and rational, it increases the chances that the right thing (whatever that ends up being—I think we should have a lot of intellectual humility here!) ends up winning. If we make it more activist, political, and emotional, we privilege the voice of whoever is better at activism, politics, and narratives. I think you basically always want to push the thoughtfulness/reasonableness/rationality. This point is made well in one of Scott Alexander’s best essays (see section IV in particular, for the concept of asymmetric vs symmetric weapons). There is a spirit here, of truth-seeking and liberalism and building things, of fighting Moloch rather than sacrificing our epistemics to him for +30% social clout. I admit that this is partly an aesthetic preference on my part. But I do believe in it strongly.
For “virtual/intellectual hub”, the central example in my mind was the EA Forum, and more generally the way in which there’s a web of links (both literal hyperlinks and vaguer things) between the Forum, EA-relevant blogs, work put out by EA orgs, etc. Specifically in the sense that if you stumble across and properly engage with one bit of it, e.g. an EA blog post on wild animal suffering, then there’s a high (I’d guess?) chance you’ll soon see a lot of other stuff too, like being aware of centralised infrastructure like the Forum and 80k advising, and becoming aware of the central ideas like cause prio and x-risk. Therefore maybe the virtual/physical distinction was a bit misleading, and the real distinction is more like “Schelling point for intellectual output / ideas” vs “Schelling point for meeting people”.
That being said, a point that comes to mind is that geographic dispersion is one of the most annoying things for real-world Schelling points and totally absent* if you do it virtually, so maybe there’s some perspective like “don’t think about EAGx Virtual as recreating an EAG but virtually, but rather as a chance to create a meeting-people-Schelling-point without the traditional constraints, and maybe this ends up looking more ambitious”?
(*minus timezones, but you can mail people melatonin beforehand :) )
I mentioned the danger of bringing in people mostly driven by personal gain (though very briefly). I think your point about niche weirdo groups finding some types of coordination and trust very easy is underrated. As other posts point out, the transition to positive personal incentives to do EA stuff is a new thing that will cause some problems, and it’s unclear what to do about it (though as that post also says, “EA purity” tests are probably a bad idea).
I think the maximally-ambitious view of the EA Schelling point is one that attracts anyone who fits into the intersection of altruistic, ambitious / quantitative (in the sense of caring about the quantity of good done and wanting to make that big), and talented/competent in relevant ways. I think hardcore STEM weirdness becoming a defining EA feature (rather than just a hard-to-avoid incidental feature of a lot of it) would prevent achieving this.
In general, the wider the net you want to cast, the harder it is to become a clear Schelling point, both for cultural reasons (subgroup cultures tend to be more specific than their purpose strictly implies, and broad cultures tend to split), and for capacity reasons (it’s harder to get many than few people to hear about something, and also simple practical things like big conferences costing more money and effort).
There is definitely an entirely different post (or more) that could be written about how much and which parts of EA should be a Schelling-point- or platform-type thing, comparing the pros and cons. In this post I don’t even attempt to weigh this kind of choice.
I spoke with Yonatan at EAGx Oxford. Yonatan was very good at drilling down to the key uncertainties and decision points.
The most valuable thing was that he really understood the core “make something that people really want” lesson for startups. I thought I understood this (and at least on some abstract level did), but after talking with Yonatan I now have a much stronger model of what it actually takes to make sure you’re doing this in the real world, and a much better idea of what the key steps in a plan between finding a problem and starting a company around it should be.
Regular prizes/awards for EA art
Effective Altruism
Works of art (e.g. stories, music, visual art) can be a major force inspiring people to do something or care about something. Prizes can directly lead to work (see for example the creative writing contest), but might also have an even bigger role in defining and promoting some type of work or some quality in works. Creating a (for example) annual prize/award scheme might go a long way towards defining and promoting an EA-aligned genre (consider how the existence of Hugo and Nebula awards helps define and promote science fiction). The existence of a prestigious / high-paying prize for the presence of specific qualities in a work is also likely to draw attention to those qualities more broadly; news like “Work X wins award for its depiction of [thoughtful altruism] / [the long-term future] / [epistemic rigor under uncertainty]” might make those qualities more of a conversation topic and something that more artists want to depict and explore, with knock-on effects for culture.