Thanks! I’m curious: how much more convincing would you guess the case would be if you were completely open about the infohazardous stuff? 1.5x the probability of convincing someone? 3x? 10x?
Elias Schmied
Social agency
Okay, I read more and understand it better now.
Thank you! Yeah, I’m aware of these—which is part of why I’m not working on AI safety and generally don’t have strong opinions on interventions in that area (too much uncertainty). I should’ve been more specific, sorry—I mean more like “promoting wisdom, cooperativeness, knowledge, etc”. I only skimmed the series, so I may have missed it—do you list some plausible ways this could backfire anywhere? It just seems like the world would have to be “weird” in some way—I can roughly grasp at some possibilities (knowledge could be used for bad / infohazards, cooperation could make bad lock-in more likely), but I’m sure you have some concrete visions / this has been written up? (Also feel free to say if you feel constrained here due to infohazard considerations).
Claude is pointing me to some related older CLR writing, so I may read that.
On the hypothesis space: Maybe this already exists somewhere else, and it’s also just an entirely different endeavor, so don’t take this as a criticism—but I would be much more convinced by a braindump-style post listing a bunch of really weird ways the world could be that are not obviously wrong, and that would make obvious interventions backfire. (I could think of some myself from my own thinking of course, but I assume you have thought of many). That would push me viscerally much more in the direction that there are even more possible worlds out there that nobody has ever considered, more than something abstract like this post can. I haven’t encountered new visions like that in a few years, so I don’t gut-level believe that there are that many more (even if that may be naive).
Really cool! Thanks
Clarifying the Darwinian Honeymoon
I hope my post adds something novel that I haven’t come across in the pre-existing discussion (see my footnote 1) of the specter of long-term evolutionary competition: the observation that this competition at first predictably benefits you (an instance of Goodhart’s Law), a vivid example, and a memetic handle for the concept: The Darwinian Honeymoon.
The Darwinian Honeymoon—Why I am not as impressed by human progress as I used to be
Love this, thanks!
Thanks for this post. I’ve learned things about the AI safety community that I didn’t realize before. I wonder if much of the value of external criticism isn’t in changing the behavior of those being criticized, but rather in explicitly stating, and making into common knowledge, negative factors that by default are not talked about publicly as much. (Both for future projects to do things differently, and for people today to update about how to relate to the entities involved.)
There does seem to be non-negligible content in the references to hits-based giving and the lower funding bar, but otherwise I agree.
It’s something that was recently invented on Twitter. Here is the manifesto they wrote: https://swarthy.substack.com/p/effective-accelerationism-eacc?s=w
It’s only believed by a couple people afaict, and unironically maybe by no one (although this doesn’t make it unimportant!)
Huh, I’m a little surprised to hear that, to be honest. To be clear, I mean something more like “visceral”/“rhetorical”/“de facto” convincingness, not whether it purely logically hinges on it.
Also, just thinking of this because I’m reading it right now—if you want to convince more people of your view, a critical review / “rebuttal” of https://www.forethought.org/research/how-to-make-the-future-better might be cool. Would be memetically strong. (I could also imagine reasons why you might not want that, of course).