Currently doing local AI safety Movement Building in Australia and NZ.
Chris Leong
“If EA is trying to do the most good, letting people like Ives post their misinformed stuff here seems like a clear mistake.”
Disagree because it is at −36.
Happy to consider your points on the merits if you have an example of an objectionable post with positive upvotes.
That said: part of me feels that Effective Altruism shouldn’t be afraid of controversial discussion, whilst another part of me wants to shift it to Less Wrong. I suppose I’d have to have a concrete example in front of me to figure out how to balance these views.
The argument for near-term human disempowerment through AI
I didn’t vote, but maybe people are worried about the EA forum being filled up with a bunch of logistics questions?
Thank you for your service.
This post makes some interesting points about EA’s approach to philanthropy, but I certainly have mixed feelings on “please support at least one charity run by someone in the global south that just so happens to be my own”.
It might be more useful if you explained why the arguments weren’t persuasive to you.
So my position is that most of your arguments are worth some “debate points”, but that mitigating potential x-risks outweighs this.

“Our interest is in a system of liability that can meet AI safety goals and at the same time have a good chance of success in the real world”
I’ve personally made the mistake in the past of thinking that the Overton Window was narrower than it actually was. So even though such laws may not seem viable now, my strong expectation is that this will quickly change. At the same time, my intuition is that if we’re going to pursue the liability route, strict liability at least has the advantage of keeping developers focused on preventing the issue from occurring rather than on taking actions to avoid legal responsibility, since those actions won’t help them.
I know that I wrote above:
In any case my main worry about strong liability laws is that we may create a situation where AI developers end up thinking primarily about dodging liability more than actually making the AI safe.
and that this is in tension with what I’m writing now. I guess upon reflection I now feel that my concerns about strong liability laws only apply to strong fault-based liability laws, not to strict liability laws, so in retrospect I wouldn’t have included this sentence.
Regarding your discussion in point 1 (apologies for not addressing this in my initial reply): I just don’t buy that courts being able to handle chainsaws or medical or actuarial evidence means that they’re equipped to handle transformative AI, given how fast the situation is changing and how disputed many of the key questions are. The stakes involved also make me reluctant to place an unnecessary bet on the capabilities of the courts: even if there were a 90% chance that the courts would be fine, I’d prefer to avoid the 10% chance that they aren’t.
I find the idea of a reverse burden of proof interesting, but tbh I wasn’t really persuaded by the rest of your arguments. I guess the easiest way to respond to most of them would be “Sure, but human extinction kind of outweighs it” and then you’d reraise how these risks are abstract/speculative and then I’d respond that putting risks in two boxes, speculative and non-speculative, hinders clear thinking more than it helps. Anyway, that’s just how I see the argument play out.
In any case my main worry about strong liability laws is that we may create a situation where AI developers end up thinking primarily about dodging liability more than actually making the AI safe.
I have very mixed views on Richard Hanania.
On one hand, some of his past views were pretty terrible (even though I believe that you’ve exaggerated the extent of these views). On the other hand, he is also one of the best critics of conservatives. Take, for example, this article where he tells conservatives to stop being idiots who believe random conspiracy theories, and another where he tells them to stop scamming everyone. These are amazing, brilliant articles with great chutzpah. As someone quite far to the right, he’s able to make these points far more credibly than a moderate or liberal ever could.
So I guess I feel he’s kind of a necessary voice, at least at this particular point in time when there are few alternatives.
Yeah, it’s possible I’m taking a narrow view of what a professional organisation is. I don’t have a good sense of the landscape here.
I guess I’m a bit skeptical of this proposal.
I don’t think we’d have much credibility as a professional organisation. We could require people to do the intro and perhaps even the advanced fellowship, but that’s hardly rigorous training.
I’m worried that trying to market ourselves as a professional organisation might backfire if people end up seeing us as just a faux one.
I suspect that this kind of association might be more viable for specific cause areas than for EA as a whole, but there might not be enough people except in a couple of countries.
Thank you for posting this publicly. It’s useful information for everyone to know.
Wasn’t there some law firm that did an investigation? Plus some other projects listed here.
It would be useful for you to clarify exactly what you’d like to see happen and how this differs from what actually did happen, even though this might be obvious to someone like you with high context on the situation. As it stands, I’d have to do a bit of research to figure out what you’re suggesting.
I didn’t know that CHAI or 80,000 Hours had recommended material.
The 80,000 Hours syllabus = “Go read a bunch of textbooks”. This is probably not ideal for a “getting started” guide.
I was there for an AI Safety workshop, though I can’t remember the content. Do you know what you included?
I’ve found that purely open discussion sometimes leads to less valuable conversation, so in both cases I’d focus on a few specific discussion prompts, or on trying to help people come to a conclusion on some question.
That’s useful feedback. Maybe it’d be best to take some time at the end of the first session of the week to figure out what questions to discuss in the second session? This would also allow people to look things up before the discussion and take some time for reflection.

“I’d be keen to hear specifically what the pre-requisite knowledge is—just in order to inform people if they ‘know enough’ to take your course. Maybe it’s weeks 1-3 of the alignment course?”
Thoughts on prerequisites off the top of my head:
Week 0: Even though it is a theory course, it would likely be useful to have some basic understanding of machine learning, although this would vary depending on the exact content of the course. It might or might not make sense to run a week 0 depending on most people’s backgrounds.
Week 1 & 2: I’d assume that participants have at least a basic understanding of inner vs outer alignment, deceptive alignment, instrumental convergence, the orthogonality thesis, why we’re concerned about powerful optimisers, value lock-in, recursive self-improvement, slow vs. fast take-off, superintelligence, transformative AI and wireheading, though I could quite easily create a document that defines all of these terms. The purpose of this course also wouldn’t be to reiterate the basic AI safety argument, although it might cover debates such as the validity of counting arguments for mesa-optimisers or whether RLHF means that we should expect outer alignment to be solved by default.

“I.e. what if you ask 3-5 experts what they think the most important part of agent foundations is, and maybe try to conduct 30 min interviews with them to solicit the story they would tell in a curriculum? You can also ask them their top recommended resources, and why they recommend them. That would be a strong start, I think.”
That’s a great suggestion. I would still be tempted to create a draft curriculum though, even just at the level of “week 1 focuses on question x and includes readings on topics a, b and c”. I could also lift heavily from the previous agent foundations week and other past versions of AISF, Alignment 201, Key Phenomena in AI Safety, the MATS AI Safety Strategy Curriculum, MIRI’s Research Guide, John Wentworth’s alignment training program and the highlighted AI Safety Sequences on Less Wrong (in addition to possibly including some material from the AI Safety Bootcamp or Advanced Fellowship that I ran).
I’d want to first ask them what they would like to see included without them being anchored on my draft, then I’d show them my draft and ask for more specific feedback. Expert time is valuable, so I’d want to get the most out of their time and it is easier to critique a specific artifact.
I’m quite tempted to create a course for conceptual AI alignment, especially since agent foundations has been removed from the latest version of the BlueDot Impact course[1].
If I did this, I would probably run it as follows:

a) Each week would have two sessions: one to discuss the readings and another for people to bounce their takes off others in the cohort. I expect that people trying to learn conceptual alignment would benefit from having extra time to discuss their ideas with informed participants.
b) The course would be less introductory, though without assuming knowledge of AGISF. AGISF already serves as a general introduction for those who need it and making progress on conceptual alignment is less of a numbers game, so it would likely make sense to focus on people further along the pipeline, rather than trying to expand the top of the funnel. In terms of the rough target audience, I imagine people who have been browsing Less Wrong or hanging around the AI safety community for years; or maybe someone who found out about it more recently and has been seriously reading up on it for the last couple of months. For this reason, I would want to assume that people already know why we’re worried about AI Safety and basic ideas like inner/outer alignment and instrumental convergence.[2]
c) I’d probably follow AGISF in picking one question to focus on each week. I also like how it contextualises each reading.

Figuring out what to include seems like it’d be a massive challenge, but I agree that one of the best ways to do this would be to just create a curriculum, send it around to people and then additionally collect feedback from people who have gone through the course.
Anyway, I’d love to hear if anyone has any thoughts on what such a course should look like.
(The closest current course is the Key Phenomena in AI Safety course that PIBBSS ran, but this would assume that people are more technical—in the broader sense where technical includes maths, physics, comp sci, etc.—and would be less introductory.)

- ^ This is quite a reasonable decision. Shorter timelines make agent foundations work less pressing. Additionally, I imagine that most people who complete AGISF would not gain that much value from covering a week on agent foundations, at least not this early in their alignment journeys. Having a week where a substantial part of the cohort feels “why was I taught this?” is not a very good experience for them.
- ^ Though it wouldn’t be too hard to create a document containing assumed knowledge.
I think the biggest criticism this cause will face from an EA perspective is that it will be pretty hard to argue that moving more talent to first-world countries to do random things beats either convincing more medical, educational or business talent to move to developing countries to help them develop, or focusing on bringing more talent to top cause areas. I’m not saying that such a case couldn’t be made, just that I think it’d be tricky.
The upshot is: I recommend only choosing this career entry route if you are someone for whom working exclusively at EA organisations is incredibly high on your priority list.
I think taking a role like this early on could also be high-value if you’re trying to determine whether working in a particular cause area is for you. Often it’s useful to figure that out pretty early on. Of course, the fact that it isn’t exactly the same job you might be doing later on might make it less valuable for this.
This is a very interesting idea. I’d love to see if someone could make it work.
I’m perfectly fine with holding an opinion that goes against the consensus. Maybe I could have worded it a bit better though? Happy to listen to any feedback on this.
For anyone wondering about the definition of macrostrategy, the EA forum defines it as follows: