Chris Leong
Currently doing local AI safety movement building in Australia and NZ.
[Question] Should the forum be structured such that the drama of the day doesn’t occur on the front page?
Where I agree:
Experimentation with decentralised funding is good. I feel it’s a real shame that EA may not end up learning very much from the FTX regrant program because all the staff at the foundation quit (for extremely good reasons!) before many of the grants were evaluated.
More engagement with experts. Obviously, this trades off against other things, and it’s easier to engage with experts when you have money to pay them for consultations, but I’m sure there are opportunities to engage with them more. I suspect that a lot of the time the limiting factor is simply people not knowing who to reach out to, so perhaps one way to make progress would be to compile a list of experts who are happy for people at EA orgs to contact them, subject to availability?
I would love to see more engagement from Disaster Risk Reduction, Future Studies, Science and Technology Studies, etc. I would encourage anyone with such experience to consider posting on the EA Forum. You may want to consider extracting this section into a separate forum post for greater visibility.
I would be keen to see experiments where people vote on funding decisions (although I would be surprised if this were the right funding mechanism for the vast majority of funds rather than a supplement).
Where I disagree:
I suspect it would be a mistake for EA to shift too much towards always just adopting the expert consensus. As EAs we need to back ourselves, but without becoming overconfident. If EAs had just deferred to the consensus of development studies experts, EA wouldn’t have gotten off the ground. If EAs had just deferred to the most experienced animal advocates, that would have biased us towards the wrong interventions. If EAs had just deferred to ML researchers, we would have skipped over AI safety as a cause area.
I don’t think EA is too focused on AI safety. In fact, I suspect that in a few years, we’ll probably feel that we underinvested in it given how fast it’s developing.
I see value-alignment as incredibly important for a movement that actually wants to get things done, rather than being pulled in several different directions. I agree that it comes with significant risks, such as those you’ve identified; however, I think we just have to trust in our ability to navigate those risks.
I agree that we need to seek critiques beyond what the existing red-teaming competition and cause exploration prizes have produced, although I’m less of a fan of your specific proposals. My ideal proposal would be to take a few teams of smart, young EAs who already have a strong understanding of why things are the way they are in EA, and give them a grant to spend time thinking about how they would construct the norms and institutions of EA if they were building them from the ground up. Movements tend to advance by having the youth break with tradition, so I would favour accelerating this natural process over the suggestions presented.
While I would love to see EA institutions achieve a broader base of funding, this feels more like a nice-to-have than something you should risk disrupting your operations over.
Voting isn’t a panacea. Countries have a natural answer to who gets to vote: every citizen. I can’t see open internet polls being a good idea given how easily they can be manipulated, so we’d need a definition of membership. That would require either membership fees or recording attendance at EA events, so there would be a lot of complexity in making this work.
AI Safety Microgrant Round
I have to be honest that I’m disappointed in this message. Not so much that you wrote a message along these lines, but that it adopts polished PR-speak when communicating with the community. I would prefer a much more authentic message that reads like it was written by an actual human (not the PR formula), even if that risks subjecting the EA movement to additional criticism; I suspect this would also be more impactful in the long term. It is much more important to maintain trust with your community than to worry about what outsiders think, especially since many of our critics will be opposed to us no matter what we do.
“On the other hand, we’ve had quite a bit of anti-cancel-culture stuff on the Forum lately. There’s been much more of that than of pro-SJ/pro-DEI content, and it’s generally got much higher karma. I think the message that the subset of EA that is highly active on the Forum generally disapproves of cancel culture has been made pretty clearly”
Perhaps. However, this post makes specific claims about ACE. And even though these claims have been discussed somewhat informally on Facebook, this post provides a far more solid write-up. So it does seem to be making a significant new contribution to the discussion and not just rewarming leftovers.
It would have been better if Hypatia had emailed the organisation ahead of time. However, I believe ACE staff members might have already commented on some of these issues (correct me if I’m wrong). And it’s more of a good practice than a strict requirement; I totally understand the urge to just get something out there.
“I’m sceptical that further content in this vein will have the desired effect on EA and EA-adjacent groups and individuals who are less active on the Forum, other than to alienate them and promote a split in the movement, while also exposing EA to substantial PR risk”
On the contrary, now that this has been written up on the forum, it gives people something to link to; forum posts aren’t just read by people who regularly read the forum. In any case, this kind of high-quality write-up is unlikely to have a significant effect on alienating people compared to some of the lower-quality discussions of these topics that occur in person or on Facebook. So, from my perspective, it doesn’t really make sense to focus on this post. If you want to avoid a split in the movement, I’d encourage you to join the Effective Altruists for Political Tolerance Facebook group and contribute there.
I would also suggest worrying less about PR risks. People who want to attack EA can already go around shouting about ‘techno-capitalists’, ‘overwhelmingly white straight males’, ‘AI alarmists’, etc. If someone wants to find something negative, they’ll find something negative.
Very curious to hear how Open Philanthropy has updated as a result of this competition.
Looking at the winning entries would seem to suggest that Open Philanthropy is likely to now be less worried about these risks, but it would be interesting to know by how much.
It’s worth highlighting that this research was carried out with LLMs that had safeguards in place (which admittedly could be jailbroken by the teams). It’s not clear to me that it directly applies to a scenario where a model is released as open source, where a team could likely remove any safeguards easily with fine-tuning (let alone what would happen if these models were actually fine-tuned to improve their bioterrorism capabilities).
I think it’s valuable to write critiques of grants that you believe to have mistakes, as I’m sure some of Open Philanthropy’s grants will turn out to be mistakes in retrospect and you’ve raised some quite reasonable concerns.
On the other hand, I was disappointed to read the following sentence: “Henry drops out of school because he thinks he is exceptionally smarter and better equipped to solve ‘our problems’”. I guess when I read sentences like that, I apply some (small) level of discounting to the other claims made, because it sounds like a less than completely objective analysis. To be clear, I think it is valid to write a critique of whether people are biting off more than they can chew, but I still think my point stands.
I also found this quote interesting: “What personal relationships or conflicts of interest are there between the two organizations?” since it makes it sound like there are personal relationships or conflicts of interest without actually claiming this is the case. There might be such conflicts or this implication may not be intentional, but I thought it was worth noting.
Regarding this grant in particular: if you view it from the original, highly evidence-based philanthropy end of EA, it isn’t the kind of grant that would rate highly in that framework. On the other hand, if you view it from the perspective of hits-based giving (thinking about philanthropy as a VC would), it looks like a much more reasonable investment[1]; Mark Zuckerberg, for instance, famously dropped out of college to start Facebook. Similarly, most start-ups have some degree of self-aggrandizement, and I suspect it might actually be functional in pushing founders toward greater ambition.
That said, if Open Philanthropy is pursuing this grant under a hits-based approach, it might be less controversial if they were to acknowledge this.
[1] Though of course, if the grant was made on the basis of details that were misrepresented (I haven’t looked into those claims), then this would undercut that reasoning.
I said at the time that I felt that Ben made a mistake in not waiting a week, though I wasn’t completely confident about this. Having skimmed parts of the document, I’m now much more confident that not waiting was indeed a mistake.
Disclaimer: I remotely interned at Nonlinear.
I think people are quite reasonably deciding that this post isn’t worth taking the time to engage with. I’ll just make three points even though I could make more:
“A good rule of thumb might be that when InfoWars takes your side, you probably ought to do some self-reflection on whether the path your community is on is the path to a better world.”—Reversed Stupidity is Not Intelligence
“In response, the Slate Star Codex community basically proceeded to harass and threaten to dox both the editor and journalist writing the article. Multiple individuals threatened to release their addresses, or explicitly threatened them with violence.”—The author is completely ignoring the fact that Scott Alexander specifically told people to be nice and not to take it out on them, and that he didn’t name the journalist. This seems to suggest that the author isn’t even trying to be fair.
“I have nothing to say to you — other people have demonstrated this point more clearly elsewhere”—I’m not going to claim that such differences exist, but if the author isn’t open to dialogue on one claim, it’s reasonable to infer that they mightn’t be open to dialogue on other claims, even if those claims are completely unrelated.
Quite simply, this is a low-quality post, and “I’m going to write a low-quality post on topic X and you have to engage with me because topic X is important regardless of the quality” just gives a free pass to low-quality content. But doesn’t it spur discussion? I’ve actually found that low-quality posts most often don’t even provide that claimed benefit: they don’t change people’s minds and tend to lead to low-quality discussion.
A List of Things For People To Do
Some people have criticised the timing. I think there’s some validity to this, but the trigger has been pulled and cannot be unpulled. You might say we could try writing another similar letter a bit further down the track, but it’s hard to get people to do the same thing twice and even harder to get people to pay attention.
So I guess we really have the choice of whether or not to get behind this. I think we should, as I see this letter as really opening up the Overton window. It would be a mistake to wait for a theoretical, perfectly timed letter to sign, as opposed to signing what we have in front of us.
On expanding to AI safety: given all of the recent controversies, I’d think very carefully before linking the reputations of EA and AI safety more than they are already linked. If the same group were responsible for community health for both, and it either made a mistake or made a correct but controversial decision, there would be a greater chance of the blowback affecting both communities, rather than just one.
Maybe community health functions being independent of CEA would make this less of an issue. I guess it’s plausible, but also maybe not? Might also depend on whether any new org has EA in the name?
I think the root cause is that there is no AI safety field-building coordinating committee that would naturally end up taking on such a function. Someone really needs to make that happen (I’m not the right person for this).
This would have the advantage of allowing the norms of the communities to develop somewhat separately. It would sacrifice some operational efficiencies, but I think this is one of those areas where it is better not to skimp.
I thought Buck’s comment contained useful information, but was also impolite. I can see why people in favour of these proposals would find it frustrating to read.
I’m very interested to see how this goes. I guess the main challenge with this kind of competition is finding a way to encourage high-quality criticism without encouraging low-quality bad faith criticism.
This is harder than it sounds. The more strongly you disagree with someone’s position, the more likely it is to appear as bad-faith criticism. Indeed, most of the time you can point out legitimate flaws, as everything has flaws if you look at it under a close enough microscope. The difference is that when you think the author is writing something of vital importance, any flaws seem trifling, whilst when you think the author is arguing for something morally repugnant, or something that would have disastrous consequences, the flaws scream out at you.
On the other hand, it’s possible to write a piece that satisfies any objective criteria that have been set, yet still engages in bad faith.
I think you’re underestimating the impact bad faith criticism can have. Lots of people just copy their takes from someone else.
6 Year Decrease of Metaculus AGI Prediction
I think these are great principles for apologies in the context of personal relationships; however, I do wonder how we would have to adapt them for a public context. For a start, making amends works well with an individual who is in a position to accept or reject your offer, and much less well in a public context where it can easily end up being seen as a cynical PR move. And indeed, it is especially likely to be seen as a cynical PR move if it is in fact a cynical PR move, which it would be if the person were to write something that didn’t represent their true views. So while I definitely think these apologies could have been improved, I think the situation is more complex than you give it credit for.
Footnotes are your friend here: they allow you to add detail for those who need or want it, whilst not wasting everyone else’s time.
I’m not sure that should count as brigading or unethical in these circumstances as long as they didn’t ask people to vote a particular way.
Remember that even though Ben is only a single author, he spent a bunch of time gathering negative information from various sources[1]. I think that in order to be fair, we need to allow them to ask people to present the other side of the story. Also consider: if Kat or Emerson had posted a comment containing a bunch of positive comments from people, then I expect that everyone would be questioning why those people hadn’t made the comments themselves.
I think it might also be helpful to think about it from the opposite perspective. Would anyone accuse me of brigading if I theoretically knew other people who had negative experiences with Nonlinear and suggested that they might want to chime in?
If not, then we’ve created an asymmetry where people are allowed to do things in terms of criticism, but not in terms of defense, which seems like a mistake to me.
That said, it is useful for us to know that some of these comments were solicited.
Disclaimer: I formerly interned at Nonlinear. I don’t want my meta-level stance to be taken as support for the actions of Nonlinear leadership (I’m very disappointed by what they’ve admitted to in relation to these claims), nor was I asked by them to leave any comments here. I just believe that they should be allowed to defend themselves, even though I’m not satisfied by the defense they’ve given so far, nor do I expect their full response to be satisfactory.
[1] I say “negative information” not to disparage it. I have negatively updated based on this information, as what was revealed is worse than the rumors I’d previously heard.