Trying to make sure the development of powerful artificial intelligence goes well.
RomanHauksson
This is excellent research! The quality of Rethink Priorities’ output consistently impresses me.
A couple of questions:
What software did you use to create figure 1?
What made you decide to use discrete periods in your model as opposed to a continuous risk probability distribution?
How about “early-start EA (EEA)”? As a term, it could sit neatly beside “highly-engaged EA (HEA)”.
I agree that GHW is an excellent introduction to effectiveness and that we should watch out for the practical limitations of going too meta, but I want to flag that treating GHW as a pipeline to animal welfare and longtermism is problematic. It’s problematic both from a common-sense / moral uncertainty view (it feels deceitful, and that’s something to avoid for its own sake) and from a long-run strategic consequentialist view (I think the EA community would last longer and look better if it focused on being transparent, honest, and upfront about what most members care about, and it’s really important for the long-term future of society that the core EA principles don’t die).
The high success rate almost makes me think CE should be incubating even more ambitious, riskier projects, with the expectation of a lower success rate but higher overall EV. I’m very uncertain about this intuition, though, and would be interested to hear what CE thinks.
Congratulations on the launch! This is huge. I have to ask, though: why is the ebook version not free? I would assume that if you wanted to promote longtermism to a broad audience, you would make the book as accessible as possible. Maybe charging for a copy actually increases the number of people who end up reading it? For example, it would rank higher on bestseller lists, attracting more eyes. Or perhaps the reason is simply to raise funds for EA?
Can we set up a torrent link for this?
DiscordChatExporter is a tool that enables you to download an archive of all the messages in a server or channel.
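In case it’s useful, here’s a rough sketch of driving its command-line version from Python. The flag names and format values are from memory and may differ by version, so treat them as assumptions and check the project’s README for the exact syntax.

```python
import subprocess

# Sketch of invoking DiscordChatExporter's CLI (flag names assumed; verify
# against the project's documentation for your version).
def export_channel(token: str, channel_id: str, out_path: str) -> None:
    subprocess.run(
        [
            "DiscordChatExporter.Cli", "export",
            "-t", token,        # Discord authentication token
            "-c", channel_id,   # ID of the channel to archive
            "-f", "Json",       # output format (HTML, JSON, CSV, plain text)
            "-o", out_path,     # where to write the archive
        ],
        check=True,
    )

# Example: export_channel("YOUR_TOKEN", "123456789012345678", "archive.json")
```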
Here are a couple of excerpts from relevant comments on the Astral Codex Ten post about the tournament. From the anecdotes, it seems as though this tournament had some flaws in execution, namely that the “superforecasters” weren’t all that. But I want to see more context if anyone has it.
I signed up for this tournament (I think? My emails related to a Hybrid Forecasting-Persuasion tournament that at the very least shares many authors), was selected, and partially participated. I found this tournament from it being referenced on ACX and am not an academic, superforecaster, or in any way involved or qualified whatsoever. I got the Stage 1 email on June 15.
I participated and AIUI got counted as a superforecaster, but I’m really not. There was one guy in my group (I don’t know what happened in other groups) who said X-risk can’t happen unless God decides to end the world. And in general the discourse was barely above “normal Internet person” level, and only about a third of us even participated in said discourse. Like I said, haven’t read the full paper so there might have been some technique to fix this, but overall I wasn’t impressed.
I can look into how to set up a torrent link tomorrow and let you know how it goes!
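For future reference, here’s roughly what generating the .torrent file might look like, assuming the third-party torf library (pip install torf); the file path and tracker URL below are just placeholders.

```python
from torf import Torrent  # third-party library: pip install torf

# Build a torrent for the files we want to share (path and tracker are placeholders).
torrent = Torrent(
    path="archive/",  # directory or single file to share
    trackers=["udp://tracker.opentrackr.org:1337/announce"],
    comment="Archive export",
)
torrent.generate()            # hash the pieces
torrent.write("archive.torrent")
print(torrent.magnet())       # magnet link to share alongside the .torrent file
```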
Rational Animations is probably the YouTube channel the report is referring to, in case anyone’s curious.
Where did you copy the quote from?
Suggestion: use a well-designed voting system such as STAR voting, approval voting, or quadratic voting.
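To make the suggestion concrete, here’s a minimal sketch of STAR voting (score, then automatic runoff) in Python; the ballot format and candidate names are just illustrative.

```python
# Minimal STAR voting sketch: each ballot scores every candidate 0-5.
# Step 1 (Score): the two highest-scoring candidates advance.
# Step 2 (Automatic Runoff): whichever finalist more voters scored higher wins.

def star_winner(ballots: list[dict[str, int]]) -> str:
    totals: dict[str, int] = {}
    for ballot in ballots:
        for candidate, score in ballot.items():
            totals[candidate] = totals.get(candidate, 0) + score

    # Two highest-scoring candidates advance to the runoff.
    finalist_a, finalist_b = sorted(totals, key=totals.get, reverse=True)[:2]

    # Runoff: count which finalist each voter scored higher (i.e. prefers).
    prefers_a = sum(b.get(finalist_a, 0) > b.get(finalist_b, 0) for b in ballots)
    prefers_b = sum(b.get(finalist_b, 0) > b.get(finalist_a, 0) for b in ballots)
    return finalist_a if prefers_a >= prefers_b else finalist_b

# Illustrative example with made-up ballots:
ballots = [
    {"Proposal A": 5, "Proposal B": 3, "Proposal C": 0},
    {"Proposal A": 1, "Proposal B": 4, "Proposal C": 5},
    {"Proposal A": 2, "Proposal B": 5, "Proposal C": 1},
]
print(star_winner(ballots))  # "Proposal B"
```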
I made a similar deck a few months ago, and there might be some overlap: https://github.com/RomanHN/CFAR_jargon
Isn’t the opposite end of the p(doom)–longtermism quadrant also relevant? E.g. my p(doom) is 2%, but I take the arguments for longtermism seriously and think that’s a high enough chance to justify working on the alignment problem.
It would be great to have data about gaps in professional skills between what EAs are training up in and what EA organizations find most useful and neglected. I’ve heard that there’s a gap in information security expertise within the AI safety field, but it would be nice to see data to back this up before I commit to self-studying cybersecurity. Maybe someone could do a survey of EA organization managers asking them what skills they’re looking for and what roles they’ve been having a hard time filling, as well as a survey of early-career EAs asking them what skills they have and what they’re learning. We could also do this survey regularly and observe trends.
Does anyone have any advice on how I can use language models to improve my nonfiction writing? Both for making a specific piece of text better and for learning how to write better in the long term. Maybe a tool like Grammarly but more advanced? It would give critiques of the writing I have so far, ask questions, give wording suggestions, point out which sentences are especially well written, et cetera.
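In case a concrete starting point helps, here’s a rough sketch of the kind of critique loop I have in mind, using the OpenAI Python client; the model name and prompt wording are assumptions, not a recommendation of any particular tool.

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def critique(draft: str) -> str:
    """Ask a language model for structured feedback on a piece of nonfiction."""
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a writing coach. Critique the draft: point out unclear "
                    "sentences, suggest tighter wording, ask clarifying questions, "
                    "and note which sentences are especially well written."
                ),
            },
            {"role": "user", "content": draft},
        ],
    )
    return response.choices[0].message.content

# Example: print(critique("My draft paragraph goes here..."))
```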
Where can I find thoughtful research on the relationship between AI safety regulation and the centralization of power?
giving the alignment research community an edge
epistemic status: shower thought
On the question of whether humanity’s understanding of AI alignment will advance fast enough compared to its understanding of how to create AGI, many factors stack in favor of AGI: more organizations are working on it, there’s a direct financial incentive to do so, people tend to be more excited about the prospect of AGI than cautious about misalignment, et cetera. But one factor that gives me a bit of hope (besides the idea that alignment might turn out to be easier to figure out than AGI) is that alignment researchers tend to be cooperative while AGI researchers tend to be competitive. Alignment researchers are motivated to save the world, not make a buck, so if their discoveries are helpful for alignment, they’ll publish them openly, and if they’re helpful for alignment but might also advance capabilities, they’ll share them only with other alignment researchers. Meanwhile, each company trying to create AGI has only its own cutting-edge research to work with; they tend to keep to themselves, while we’re more united.
I’m curious about the ways that the alignment research community could augment this dynamic. One way could be restricting access to helpful information to other alignment researchers only, namely 1) discoveries that might be helpful for alignment but also AGI and 2) knowledge related to AI-assisted research and development. I get the impression this is already a norm, but the community might benefit from more formal and overt methods for doing this. For example, tammy created a “locked post” feature on her website that gives her control over who can decrypt certain posts of hers that relate to capabilities. In the same vein, maybe the AI Alignment Forum could add a feature that works similarly to Twitter Circle, where access to posts could be restricted to trusted members of a group (a rough sketch of how this could work follows the quote below):
Twitter Circle is a way to send Tweets to select people, and share your thoughts with a smaller crowd. You choose who’s in your Twitter Circle, and only the individuals you’ve added can reply to and interact with the Tweets you share in the circle.
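To illustrate what a “locked post” mechanism could look like under the hood, here’s a minimal sketch that encrypts a post separately to each trusted reader’s public key using the PyNaCl library. The names and key handling are assumptions for illustration, not how tammy’s feature or the forum actually works.

```python
from nacl.public import PrivateKey, SealedBox  # pip install pynacl

# Each trusted reader has a keypair; the author only needs their public keys.
reader_keys = {name: PrivateKey.generate() for name in ["alice", "bob"]}

post = b"Draft findings that should stay within the alignment community."

# Author side: encrypt one copy of the post per trusted reader.
ciphertexts = {
    name: SealedBox(key.public_key).encrypt(post)
    for name, key in reader_keys.items()
}

# Reader side: only the holder of the matching private key can decrypt their copy.
plaintext = SealedBox(reader_keys["alice"]).decrypt(ciphertexts["alice"])
assert plaintext == post
```

In practice you’d probably encrypt the post once with a symmetric key and seal only that key to each reader, but the basic idea is the same.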
Of course, then the forum developer team would have to up their security since nation state actors (for example) would be incentivized to hack the forum to learn all the latest AGI-related discoveries those alignment people are trying to keep to themselves. Another worry is that moles will network their way deep into the alignment community to gain access to privileged information, then pass it on to some company or nation (there might even already be moles today, without formal methods of restricting information). I’m sure there’s pre-existing literature on how to mitigate these risks.
Those with more knowledge about AI strategy, feel free to pick apart these thoughts; I only felt comfortable sharing them in the shortform because I feel like there’s a lot about this subject that I’m missing. Perhaps this has been written about before.
Sorry, I never got around to this. If someone wants to take this up, feel free!
https://www.lesswrong.com/posts/bkfgTSHhm3mqxgTmw/loudly-give-up-don-t-quietly-fade
I would like to emphasize that when we discuss community norms in EA, we should remember that the ultimate goal of this community is to improve the world / humanity’s future as much as possible, not to make our lives as enjoyable as possible. Increasing the wellbeing of EAs is instrumentally useful for productivity and for attracting more people to make sacrifices like “donate tens of thousands of dollars” or “change your career plan to work on this problem”, but ultimately the point isn’t to create a jolly in-group of ambitious nerds. For example, if the meshing of polyamorous and professional relationships causes less qualified candidates to earn positions in EA organizations, this may be net negative, even if the polyamorous relationships make people really happy.