Trying to make sure the development of powerful artificial intelligence goes well.
RomanHauksson
Tools for finding information on the internet
I think it’s important to give the audience some sort of analogy that they’re already familiar with, such as evolution producing humans, humans introducing invasive species in new environments, and viruses. These are all examples of “agents in complex environments which aren’t malicious or Machiavellian, but disrupt the original group of agents anyway”.
I believe these analogies are not object-level enough to be arguments for AI X-risk in themselves, but I think they’re a good way to help people quickly understand the danger of a superintelligent, goal-directed agent.
Besides reading the Cyborgism post, I admit I have not searched around yet; my apologies.
DiscordChatExporter is a tool that enables you to download an archive of all the messages in a server or channel.
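For anyone who wants to script the export, here’s a minimal sketch that shells out to DiscordChatExporter’s CLI build from Python. The `export` command with `-t` (token), `-c` (channel ID), `-f` (format), and `-o` (output path) reflects my memory of the project’s README; treat the exact flag names as assumptions and confirm them against the tool’s `--help` output.

```python
# Sketch: batch-export a few Discord channels via DiscordChatExporter's CLI.
# Flag names are assumptions based on the project's README; verify with --help.
import subprocess

TOKEN = "YOUR_DISCORD_TOKEN"          # placeholder; never commit a real token
CHANNEL_IDS = ["123456789012345678"]  # placeholder channel IDs

for channel_id in CHANNEL_IDS:
    subprocess.run(
        [
            "DiscordChatExporter.Cli", "export",
            "-t", TOKEN,
            "-c", channel_id,
            "-f", "Json",                       # or HtmlDark, PlainText, Csv
            "-o", f"export-{channel_id}.json",  # output file for this channel
        ],
        check=True,
    )
```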
giving the alignment research community an edge
epistemic status: shower thought
On the question of whether humanity’s understanding of AI alignment will advance fast enough relative to its understanding of how to create AGI, many factors stack in favor of AGI: more organizations are working on it, there’s a direct financial incentive to do so, people tend to be more excited about the prospect of AGI than cautious about misalignment, et cetera. But one factor that gives me a bit of hope (besides the idea that alignment might turn out to be easier to figure out than AGI) is that alignment researchers tend to be cooperative while AGI researchers tend to be competitive. Alignment researchers are motivated to save the world, not make a buck, so if a discovery is helpful only for alignment, they’ll publish it openly, and if it’s helpful for alignment but might also advance capabilities, they’ll share it only with other alignment researchers. Meanwhile, each company trying to create AGI only has its own cutting-edge research to work with: AGI labs tend to keep to themselves, while we’re more united.
I’m curious about ways the alignment research community could strengthen this dynamic. One way could be restricting certain information to other alignment researchers only, namely 1) discoveries that might be helpful for alignment but also for AGI and 2) knowledge related to AI-assisted research and development. I get the impression this is already a norm, but the community might benefit from more formal and overt methods for doing this. For example, tammy created a “locked post” feature on her website that gives her control over who can decrypt certain posts of hers that relate to capabilities (a rough sketch of how such gated posts might work follows the quote below). In the same vein, maybe the AI Alignment Forum could add a feature that works similarly to Twitter Circle, where access to posts could be restricted to trusted members of a group:
Twitter Circle is a way to send Tweets to select people, and share your thoughts with a smaller crowd. You choose who’s in your Twitter Circle, and only the individuals you’ve added can reply to and interact with the Tweets you share in the circle.
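To make the “locked post” idea concrete, here’s a minimal sketch of what gating a post behind a key might look like. This is not tammy’s actual implementation or a proposed Forum API; it just uses the Fernet symmetric cipher from Python’s cryptography package to illustrate that only readers who have been given the key can decrypt the post.

```python
# Illustrative sketch of a "locked post": the author encrypts the post body with
# a symmetric key and shares that key only with trusted readers. This is an
# assumption for illustration, not tammy's implementation or a Forum feature.
from cryptography.fernet import Fernet


def lock_post(plaintext: str) -> tuple[bytes, bytes]:
    """Encrypt a post and return (key, ciphertext). Distribute the key privately."""
    key = Fernet.generate_key()
    ciphertext = Fernet(key).encrypt(plaintext.encode("utf-8"))
    return key, ciphertext


def unlock_post(key: bytes, ciphertext: bytes) -> str:
    """Decrypt a post; only readers who hold the key can do this."""
    return Fernet(key).decrypt(ciphertext).decode("utf-8")


if __name__ == "__main__":
    key, blob = lock_post("Draft notes on a capabilities-adjacent result...")
    print(unlock_post(key, blob))  # readable only with the shared key
```

In practice you’d probably want per-reader keys (public-key encryption) rather than one shared secret, which is part of why the security burden discussed below isn’t trivial.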
Of course, the forum’s developer team would then have to up their security, since nation-state actors (for example) would be incentivized to hack the forum to learn all the latest AGI-related discoveries those alignment people are trying to keep to themselves. Another worry is that moles could network their way deep into the alignment community to gain access to privileged information, then pass it on to some company or nation (there might already be moles today, even without formal methods of restricting information). I’m sure there’s pre-existing literature on how to mitigate these risks.
Those with more knowledge about AI strategy, feel free to pick apart these thoughts; I only felt comfortable sharing them in the shortform because I feel like there’s a lot about this subject that I’m missing. Perhaps this has been written about before.
Does anyone have any advice on how I can use language models to write nonfiction text better? I’m interested both in improving a specific piece of text and in learning how to write better in the long term. Maybe a tool like Grammarly but more advanced? It would give critiques of the writing I have so far, ask questions, give wording suggestions, point out which sentences are especially well-written, et cetera.
Yeah, it’s not perfect… I’d like to be able to silently block people too, in case I no longer want to hang out with them. But hey, it’s open source, maybe we can improve it.
“Small World”: website that shows you which city your friends are in
The high success rate almost makes me think CE should be incubating even more ambitious, riskier projects, with the expectation of a lower success rate but higher overall EV. I’m very uncertain about this intuition, though, and would be interested to hear what CE thinks.
It would be great to have data on the gap between the professional skills EAs are training up in and the skills EA organizations find most useful and neglected. I’ve heard that there’s a gap in information security expertise within the AI safety field, but it would be nice to see data to back this up before I commit to self-studying cybersecurity. Maybe someone could run a survey of EA organization managers asking what skills they’re looking for and which roles they’ve had a hard time filling, as well as a survey of early-career EAs asking what skills they have and what they’re learning. We could also run this survey regularly and observe trends.
RomanHauksson’s Quick takes
I would like to emphasize that when we discuss community norms in EA, we should remember that the ultimate goal of this community is to improve the world and humanity’s future as much as possible, not to make our lives as enjoyable as possible. Increasing the wellbeing of EAs is instrumentally useful for productivity and for attracting more people to make sacrifices like “donate tens of thousands of dollars” or “change your career plan to work on this problem”, but ultimately the point isn’t to create a jolly in-group of ambitious nerds. For example, if the meshing of polyamorous and professional relationships causes less qualified candidates to earn positions in EA organizations, that may be net negative, even if those relationships make people really happy.
I made a similar deck a few months ago, and there might be some overlap: https://github.com/RomanHN/CFAR_jargon
Hi Isaac! We’re in a similar situation: I’m 19, studying Computer Science at a mid-tier university, with a strong interest in AI alignment (and EA in general). Have you gone through the 80,000 Hours career guide yet? If not, it should give you some clarity. It recommends that we just focus on exploration and gaining career capital right now, rather than choosing one problem area or career path and going the whole hog.
Congratulations on the launch! This is huge. I have to ask, though: why is the ebook version not free? I would assume that if you wanted to promote longtermism to a broad audience, you would make the book as accessible as possible. Maybe charging for a copy actually increases the number of people who end up reading it? For example, it would rank higher on bestseller lists, attracting more eyes. Or perhaps the reason is simply to raise funds for EA?
I plan to do some self-studying in my free time over the summer, on topics I would describe as “most useful to know in the pursuit of making the technological singularity go well”. Obviously, this includes technical topics within AI alignment, but I’ve been itching to learn a broad range of subjects to make better decisions about, for example, which position I should work in to have the most counterfactual impact or which research agendas are most promising. I believe this is important because I aim to eventually attempt something really ambitious like founding an organization, which would require especially good judgement and generalist knowledge. What advice do you have on which topics to prioritize for self-study, and in how much depth? Any other thoughts or resources about this endeavor? I would be super grateful to have a call with you if this is something you’ve thought a lot about (Calendly link). More context: I’m an undergraduate sophomore studying Computer Science.
So far, my ordered list includes:
Productivity
Learning itself
Rationality and decision making
Epistemology
Philosophy of science
Political theory, game theory, mechanism design, artificial intelligence, philosophy of mind, analytic philosophy, forecasting, economics, neuroscience, history, psychology...
...and it’s at this point that I realize I’ve set my sights too high and I need to reach out for advice on how to prioritize subjects to learn!