I’m a software engineer from Brisbane, Australia who’s looking to pivot into AI alignment. I have a grant from the Long-Term Future Fund to upskill in this area full time until early 2023, at which point I’ll be seeking work as a research engineer. I also run AI Safety Brisbane.
Jay Bailey
Passage 5 seems to prove too much, in the sense of “If you take X philosophy literally, it becomes bad for you” being applicable to most philosophies, but I very much like Passage 4, the EA judo one.
While it is very much true that disagreeing over the object-level causes shouldn’t disqualify one from EA, I do agree that it is not completely separate from EA—that EA is not defined purely by its choice of causes, but neither does it stand fully apart from them. EA is, in a sense, both a question and an ideology, and trying to make sure the ideology part doesn’t jump too far ahead of the question part is important.
“Again: if your social movement “works in principle” but practical implementation has too many problems, then it’s not really working in principle, either. The quality “we are able to do this effectively in practice” is an important (implicit) in-principle quality.”
I think this is a key thing that many movements, including EA, should keep in mind. I think that what EA should be aiming for is “EA has some very good answers to the question of how we can do the most good, and we think they’re the best answers humanity has yet come up with. That’s different from thinking our answers are objectively true, or that we have all the best answers and there are none left to find.” We can have the humility to question ourselves, but still have the confidence to suggest our answers are good ones.
I dream of a world where EA is to doing good as science is to human knowledge. Science isn’t always right, and science has been proven wrong again and again in the past, but science is collectively humanity’s best guess. I would like for EA to be humanity’s best guess at how to do the most good. EA is very young compared to science, so I’m not surprised we don’t have that same level of mastery over our field as science does, but I think that’s the target.
The point about global poverty and longtermism being very different causes is a good one, and the idea of these things being more separate is interesting.
That said, I disagree with the idea that working to prevent existential catastrophe within one’s own lifetime is selfish rather than altruistic. I suppose it’s possible someone could work on x-risk out of purely selfish motivations, but it doesn’t make much sense to me.
From a social perspective, people who work on climate change are considered altruistic even if they are doomy on climate change. People who perform activism on behalf of marginalised groups are considered altruistic even if they’re part of that marginalised group themselves and thus even more clearly acting in their own self-interest.
From a mathematical perspective, consider AI alignment. What are the chances of me making the difference between “world saved” and “world ends” if I go into this field? Let’s call it around one in a million, as a back-of-the-envelope figure. (Assuming AI risk at 10% this century, the AI safety field reducing it by 10%, and my performing 1/10,000th of the field’s total output.)
This is still sufficient to save 7,000 lives in expected value, so it seems a worthy bet. By contrast, what if, for some reason, misaligned AI would kill me and only me? Well, now I could devote my entire career to AI alignment and only reduce my chance of death by one micromort—by contrast, my Covid vaccine cost me three micromorts all by itself, and 20 minutes of moderate exercise gives a couple of micromorts back. Thus, working on AI alignment is a really dumb idea if I care only about my own life. I would have to go up to at least 1% (10,000x better odds) to even consider doing this for myself.
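Spelling out that back-of-the-envelope arithmetic, here is a rough sketch using the same assumed figures as above (the ~7 billion world population is also an assumption):

```python
# Rough back-of-the-envelope sketch of the figures above.
# All inputs are illustrative assumptions, not precise estimates.

p_ai_risk = 0.10          # chance of AI catastrophe this century
field_reduction = 0.10    # fraction of that risk the safety field removes
my_share = 1 / 10_000     # my share of the field's total output
world_population = 7e9    # ~7 billion people

p_i_save_world = p_ai_risk * field_reduction * my_share   # ~1 in a million
expected_lives_saved = p_i_save_world * world_population  # ~7,000 lives

# If misaligned AI would kill me and only me, an entire career buys
# roughly one micromort (a one-in-a-million reduction in my own risk).
my_risk_reduction_micromorts = p_i_save_world * 1e6

print(expected_lives_saved)           # 7000.0
print(my_risk_reduction_micromorts)   # 1.0
```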
Thanks for sharing this, semicycle!
One thing I would also like to point out is that relativity is the enemy here. Compared to being a billionaire, making a “mere” six figures as a successful engineer and donating 10% doesn’t seem like much, but let’s take a step back and look at it objectively. If you donate 10% of that, that’s saving 3+ lives every single year. Across a career, that could easily save a HUNDRED PEOPLE. That’s like, two school buses full of children! This is incredibly valuable, regardless of what anybody else is doing.
If you save three people, then as far as I’m concerned you’ve made a positive contribution with your life as long as you’re a somewhat decent person the rest of the time, and nobody can tell you otherwise. You’re in a position to do that every year you have an engineering job, even if it’s not in EA!
Everyone who signs that pledge (or donates the equivalent) is doing incredible work. The child you saved doesn’t care if someone else saved ten or not, and every life is precious.
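For anyone who wants to sanity-check the numbers above, here is a rough sketch; the salary, cost-per-life, and career-length figures are illustrative assumptions:

```python
# Rough sketch of the "hundred people across a career" claim.
# All figures are illustrative assumptions.
salary = 100_000            # a "mere" six figures
donation = 0.10 * salary    # the 10% pledge -> $10,000/year
cost_per_life = 3_500       # rough low-end cost to save a life (USD)
career_years = 40

lives_per_year = donation / cost_per_life         # ~2.9 lives/year
lives_per_career = lives_per_year * career_years  # ~114 lives

print(round(lives_per_year, 1), round(lives_per_career))
```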
I highly recommend Duck as an advisor. Duck is very empathic, non-judgmental, and a good listener. On top of that, Duck is a master of the ancient art of wu wei. Quite the impressive set of skills!
Also worth noting here is that, as expected, EAs have in general condemned this idea and SBF has gone against the standard wisdom of EA in doing this. I feel like EA’s principles were broken, not followed, even though I agree SBF was almost certainly a committed effective altruist. The update, for me, is not “EA as an ideology is rotten when taken very seriously” but rather “EAs are, despite our commitments to ethical behaviour, perhaps no more trustworthy with power than anyone else.”
This has caused me to pretty sharply reduce my probability of EA politicians being a good idea, but hasn’t caused a significant update against the core principles of EA.
A good call to action, I feel, should be about the upper bound rather than the lower bound. I too assumed that was “<= 3 mins” purely because “>= X time” is very unusual to put in a title. Perhaps changing it to something like “<= 15 mins” would be a good idea.
Hi Edward,
First off—you have my sympathies. That sounds terrible, and I understand his and your anger about this. Unfortunately, there are a great many problems in the world, so EAs need to think carefully about where we should allocate our resources to do as much good as we can. Currently, you can save a life for around 3,500-5,500 USD, or for animals, focusing on factory farming can lead to tremendous gains (Animal Charity Evaluators estimates that lobbying for cage-free campaigns for hens can lead to multiple hen-years affected per dollar).
So, we need to consider how much it would cost to fix this problem. Building a better enclosure would cost a great deal of money, as would convincing the zoo to let the tiger be released and transported back to its home continent. I don’t know how much construction costs, but it definitely wouldn’t be in the four figures. I’d estimate six figures, minimum. Even if it were five figures, this is enough to save multiple human lives, or prevent tens of thousands of chickens from living even more wretched lives.
I don’t want to trivialise the problem you’ve mentioned. It’s an injustice. However, this injustice is far from unique. Millions of people and animals are suffering right now, and since we cannot help them all, EA focuses on trying to help as many as we can.
This cause prioritisation is at the heart of the EA movement. If you’ve got questions, I’d be happy to answer them as best I can, whether here or via PM.
I don’t see how, if this system had been popularised five years ago, this would have actually prevented the recent problems. At best, we might have gotten a few reports of slightly alarming behaviour. Maybe one or two people would have thought “Hmm, maybe we should think about that”, and then everyone would have been blindsided just as hard as we actually were.
Also...have you ever actually been in a system that operated like this? Let’s go over a story of how this might go.
You’re a socially anxious 20-year-old who’s gone to an EA meeting or two. You’re nervous, you want people to like you, but things are mostly going well. Maybe you’re a bit awkward, but who’s not? You hear about this EA reporting thing, and being a decent and conscientious person, you ask to receive all anonymised data about you, so you can see if there are any problems.
Turns out, there is! It’s only a vague report—after all, we wanted it to be simplified, so people can use the system. Someone reported you under the category “intolerant”. Why? What did you say? Did you say something offensive? Did someone overhear half a conversation? You have no idea what you did, who reported you, or how you can improve. Nobody’s told you that it’s not a big deal to get one or two reports, and besides, you’re an anxious person at the best of times, so you’d never believe them anyway. Given this problem, what should you do? Well, you have no idea what behaviour of yours caused the report, so you can’t know that either. Your only solution is to be guarded at all times and very carefully watch what you say. This does not make it easy to enjoy yourself and make friends, and you always feel somewhat out of place. Eventually, you make excuses to yourself and just stop showing up for meetings.
This is definitely a made-up story, but almost exactly this happened to me in my first year at my first job—I had an anonymous, non-specific complaint given to me by my manager, the only one I’ve ever received. I asked what I was supposed to do about that, and my manager had no good answer. Somewhat annoyed, I said maybe the best solution would be to just not make friends at work, and my manager actually agreed with me. Needless to say, I had much more cordial relationships with most colleagues after that. I was also older than 20, and I didn’t actually care much about being liked at my job. I wanted to do well because it was my first job, but they were never my people. Eventually I grew up, got over it, and realised shit happens, but that takes time. I can imagine that if I were younger and amongst people whose ideology I admired, it would have stung far worse.
And...let’s remember the first paragraph here. Why would such a system have actually worked? SBF gets some complaints about being rude or demanding in his job, and what? EA stops taking his money and refuses to take grants from the FTX Future Fund? I don’t think such a system would ever have led to the kind of actions that would have discovered this ahead of time or significantly mitigated its effects on us.
If we’re going to propose a system that encourages people to worry about any minor interaction being recorded as a black mark on them for several years within the community, imposing high costs on the type of socially anxious people who are highly unlikely to be predatory in the first place...well, let’s at least make sure such a system solves the problem.
“So if a $50 Uber ride saves me half an hour, my half an hour must be more valuable than three months of someone else’s life. That’s a pretty big claim.”
That line hit hard. Something about reducing it to such a small scale made it really hit home—I can actually viscerally understand why there are people who agonise over every purchase and struggle so much with guilt. I’ve always been able to emotionally remain distant—to donate my 10%, save lives each year, and yet somehow be okay with not donating more, even though I could. Thinking of it in terms of a single purchase and weeks/months of someone’s life makes it feel so much more real all of a sudden, and my justifications of Schelling points and sustainable giving feel much more hollow.
Firstly, I should point out that there is an understanding of the power of both stories and emotions in EA already—see here for an example.
Secondly, I think you’re associating optimisation with a set of concepts (spreadsheets, data-oriented decision-making, technology) rather than the actual meaning of it, which is to maximise what we can do with our resources. If you associate optimisation with these concepts, it’s possible to follow it off a cliff. When people refer to the term “overoptimising”, this is what I think they mean—applying optimisation-as-concept to the point where it actually becomes less optimal overall. Making the most of our resources does mean taking into account the importance of human connection and acting accordingly—applying technology and outcome-oriented thinking as much as we can to make things better, but no further. If you “optimise” to the point of turning people away or driving them to quit the movement in disillusionment, that’s not optimising. The goal is not the process.
Thirdly, you mention the following:
Many altruistic organizations “fall back into modeling the oppressive tendencies against which we claim to be pushing…. Many align with the capitalistic belief that constant growth and critical mass is the only way to create change” (2017). A major flaw of Effective Altruism lies in its lack of reimagination, its focus on reducing suffering within the capitalist system as it exists rather than conceptualizing a new system in which oppression is reduced and eventually eliminated.
There are quite a few unspoken assumptions in these sentences—assumptions that are quite common in progressive circles, but assumptions not everyone in EA shares. Primarily, you assume that EA’s focus is on “reducing suffering within the capitalist system”. This is not how I personally view EA’s mission, and I’m not alone in this. I view EA’s mission as reducing suffering and helping people, period, regardless of what caused their problem in the first place. For instance, I don’t see malaria and schistosomiasis (worms) as “suffering within the capitalist system”, but rather suffering caused by nature. It’s possible that colonialism exacerbated this by keeping Africa poor and unable to fight off these things that Western countries have essentially eliminated, but it is important to understand that this is not a necessary precondition for us to oppose it. I would still support malaria prevention even if it were proven that capitalism/colonialism had absolutely nothing to do with the proliferation of malaria in sub-Saharan Africa. Something doesn’t have to be labelled as oppression for it to be worth fixing.
Now that we’ve established this, that means that calling something capitalistic does not automatically make it bad in the eyes of effective altruism. It is not a failure of imagination that causes many EAs to oppose replacing capitalism with a new system—it is a legitimate difference in how they view the world. EA is not ethically homogeneous—for every point I make I’m sure you can find people who identify as EA and disagree with me. That said, there is definitely an existing viewpoint similar to what I have described, and I believe enough EAs are within this cluster that they must be taken into account.
Finally, you mention several problems you see within EA, but I don’t see concrete ideas for fixing them. You mention fixes on the conceptual level, like “We must recognize that people hold value simply by existing, that time is a tool with which to cultivate meaningful social change rather than a good to be commodified”—but how exactly do you propose people do that? Let’s say I’m a community builder who is convinced by your message. What can I do, this week, to begin moving in that direction? What might I be doing that’s currently harmful, and what can I replace it with?
To summarise:
- Optimisation means “doing the most good”, not “applying techniques generally associated with optimisation”. We should be careful not to confuse the goal with the process, since as you have noticed, excessive application of the process can move us further from the goal.
- Many in EA do not view the world primarily in terms of oppressor and oppressed. Many, like myself, view the world primarily as “Humanity versus suffering”, where suffering often comes from the way nature and the universe happened to be laid out. Social structures didn’t cause malaria, or aging, or cholera. You don’t have to agree with, or even engage with, this viewpoint to be in EA. But if you wish to change the minds of people in EA who hold it, it would help to understand it deeply.
- Concrete steps to help fix a problem that you’ve identified will make for more valuable feedback.
Welcome to the Forum!
This post falls into a pretty common Internet failure mode, which is so ubiquitous outside of this forum that it’s easy to not realise that any mistake has even been made—after all, everyone talks like this. Specifically, you don’t seem to consider whether your argument would convince someone who genuinely believes these views. I am only going to agree with your answer to your trolley problem if I am already convinced invertebrates have no moral value...and in that case, I don’t need this post to convince me that invertebrate welfare is counterproductive. There isn’t any argument for why someone who does not currently agree with you should change their mind.
It is worth considering what specific reasons people who care about invertebrate welfare have, and trying to address those views directly. This requires putting yourself in their shoes and trying to understand why they might consider invertebrates to have actual moral worth.
“So what’s the problem? Why don’t I just let the invertebrate-lovers go do their thing, while I do mine? The problem is that those arguing for the invertebrate cause as an issue of moral importance have brought bad arguments to the table.”
This is much more promising, and I’d like to see actual discussion of what these arguments are, and why they’re bad.
“All leading labs coordinate to slow during crunch time: great. This delays dangerous AI and lengthens crunch time. Ideally the leading labs slow until risk of inaction is as great as risk of action on the margin, then deploy critical systems.
All leading labs coordinate to slow now: bad. This delays dangerous AI. But it burns leading labs’ lead time, making them less able to slow progress later (because further slowing would cause them to fall behind, such that other labs would drive AI progress and the slowed labs’ safety practices would be irrelevant).”
I would be more inclined to agree with this if we had a set of criteria that indicated we were in “crunch time”: criteria we are very likely to meet before dangerous systems arrive, and which we haven’t met now. Have people generated such a set? Without that, how do we know when “crunch time” is, or for that matter, whether we’re already in it?
This is an interesting read, but I believe I disagree with the premise. I’ve still upvoted it, however.
I actually DO consider 20!Jay to be a dumber and less ethical person than present-day 30!Jay. And I can only hope 40!Jay is smarter and more ethical than I am. (Using “dumb” and “smart” in the colloquial sense: 20!Jay presumably had just as much IQ as I do, but he was far less knowledgeable and effective, even at pursuing his own goals—I have every confidence that I could pursue 20!Jay’s goals far better than he could, even though I think 30!Jay’s goals are better.)
This idea about future and past selves is interesting, but it isn’t one I think I share. I don’t think 40!Jay has any obligation to 30!Jay. He should consider 30!Jay’s judgement, but overall 40!Jay wins. For the same reason, I consider it consistent that I don’t treat 20!Jay’s wishes as worth following in themselves. What 20!Jay thought is worth taking into consideration as evidence, but does not carry moral weight in and of itself simply because he thought it.
With one exception—I would consider any pledge or promise made by 20!Jay to be, if not absolutely binding, at least fairly significant, in the same way you consider 17!Austin’s commitment to Mass to be. The reason the GWWC pledge is lifelong is that it is designed to bind your future self to a value system they may no longer share. I explicitly knew and accepted that when I pledged it. I don’t think 40!Jay should be bound absolutely by this pledge, but he should err on the side of keeping it. Perhaps a good metric would be to ask “If 30!Jay understood what 40!Jay understands now, would HE have signed the pledge?” and then break it only if the answer is no, even if 40!Jay would not sign the pledge again.
Similarly, I believe the case for conservatism is best put the way you put it—the people in the past were just as intellectually capable (Flynn effect notwithstanding) as we are. We shouldn’t automatically dismiss their wisdom, just like we should consider our past selves to have valuable insights. But that’s evidence, as opposed to moral weight. To be fair, I’m biased here—the people of the past believed a lot of things that I don’t want to give moral weight today. People fifty years ago believed homosexuality was a form of deviancy—I don’t think we owe that perspective any moral weight just because people once thought it. I have taken their wisdom into consideration, dutifully considered the possibility, and determined it to be false—that’s the extent of what I owe them. I can only hope that when I am gone, people will seriously consider my moral views and judge them on their merits. After that, if they choose to ignore them...well, it’s not my world any more, and that’s their right. Hopefully they’re smarter than me.
Obviously this is a fantastic idea with zero flaws in any way, but I’d love to see it fleshed out a bit more. For instance, let’s say I know a bright young undergraduate with high levels of aggressiveness—how would I encourage them to test their personal fit for this cause area?
For me, I have:
Not wanting to donate more than 10%.
(“There are people dying of malaria right now, and I could save them, and I’m not because...I want to preserve option value for the future? Pretty lame excuse there, Jay.”)
Not being able to get beyond 20 or so highly productive hours per week.
(“I’m never going to be at the top of my field working like that, and if impact is power-lawed, then if I’m not at the top of my field, my impact is way less.”)
Though to be fair, the latter was still a pressure before EA; there was just less reason to care, because I was able to find work where I could do a competent job regardless, and I only cared about comfortably meeting expectations, not achieving maximum performance.
I organise AI Safety Brisbane—there are no AI safety orgs in Brisbane, or even Australia, so before ever forming it, I had to consider the impact of members (including myself!) eventually leaving for London or the Bay Area to do work there. While we don’t actively encourage people to do this, that certainly is the goal for some of the more committed members.
My general way of handling this is to openly admit that I expect some amount of churn as a result of this, and that this is a totally reasonable thing for any member to do. I’ve also been considering plans for how to manage handoffs in a similar way to EA university groups, where we know that members will eventually graduate out of the group. I haven’t faced any resistance over admitting this dynamic openly thus far—this may change in the future if we get Brisbane-based alternatives, but we’ll cross that bridge when we come to it.
I like the idea of keeping in touch with “alumni” of the group who head overseas to pursue impactful work!
Interesting stuff! It’s definitely got me thinking about my own comparative advantage as well, especially since EA doesn’t have a typical distribution of talents. I’m a software engineer, and we’re vastly overrepresented in EA relative to the general population.
You make a great point about how organisations know the labor pool, and individuals don’t, and that info is needed to understand comparative advantage well. I suppose that would mean the right advice for individuals is “Pay attention to the signals coming from the market.” For instance, I consider myself a better software engineer than a writer, but thus far when I apply for EA engineering jobs I haven’t had a ton of luck, and when I enter EA writing contests I win small prizes. So I wonder if this is a case where my comparative advantage might lie outside of engineering just because we already have a lot of strong engineering talent in a way that isn’t as true with writing.
Although perhaps even better would be to utilise the Pareto frontier where I have skills at both—e.g., skilling up in ML and then working to become a distiller of AI safety research, which requires both technical skill AND ability to write well.
It’s a difficult question! But the lesson I have definitely taken from this article is “Apply for things, pay attention to what happens.”
I think there’s a bit of an “ugh field” around activism for some EAs, especially the rationalist types in EA. At least, that’s my experience.
My first instinct, when I think of activism, is to think about people who:
- Have incorrect, often extreme beliefs or ideologies.
- Are aggressively partisan.
- Are more performative than effective with their actions.
This definitely does not describe all activists, but it does describe some activists, and may even describe the median activist. That said, this shouldn’t be a reason for us to discard this idea out of hand—after all, how good is the median charity? Not that great compared to what EAs actually do.
Perhaps there’s a mass-movement issue here though—activism tends to be best with a large groundswell of numbers. If you have a hundred thousand AI safety activists, you’re simply not going to have a hundred thousand people with a nuanced and deep understanding of the theory of change behind AI safety activism. You’re going to have a few hundred of those, and ninety-nine thousand people who think AI is bad for Reason X, and that’s the extent of their thinking, and X varies wildly in quality.
Thus, the question is—would such a movement be useful? For such a movement to be useful, it would need to be effective at changing policy, and it would need to be aimed at the correct places. Even if the former is true, I find myself skeptical that the latter would occur, since even AI policy experts are not yet sure where to aim their own efforts, let alone how to communicate where to aim so well that a hundred thousand casually-engaged people can point in the same useful direction.
How did you test the language models? I just prompted GPT-3 Davinci with some rudimentary prompt engineering, and it got it right on zero-shot prompting ten times out of ten.
My prompts used temperature 0.7, and were as follows:
You are an expert in English language comprehension, who is answering questions for a benchmark. In the sentence “The dog chased the cat but it got away”, what does the word “it” refer to?
And
You are an expert in English language comprehension, who is answering questions for a benchmark. In the sentence “The dog chased the cat but it tripped and fell”, what does the word “it” refer to?
Testing each case five times, the answer was correct every time, with very minor variation in answers (e.g., “In this sentence, “it” refers to the dog” and “The word “it” refers to the dog.”)
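For anyone who wants to replicate this, here is a minimal sketch of how such a test can be run with the pre-1.0 openai Python client; the model string and max_tokens are illustrative assumptions on my part, and only the prompts and temperature come from the description above:

```python
# Minimal sketch of the test described above, using the pre-1.0 openai
# Python client. The model string and max_tokens are illustrative
# assumptions; the prompts and temperature come from the comment.
import openai

openai.api_key = "YOUR_API_KEY"

PREAMBLE = ("You are an expert in English language comprehension, "
            "who is answering questions for a benchmark. ")

prompts = [
    PREAMBLE + 'In the sentence "The dog chased the cat but it got away", '
               'what does the word "it" refer to?',
    PREAMBLE + 'In the sentence "The dog chased the cat but it tripped and fell", '
               'what does the word "it" refer to?',
]

for prompt in prompts:
    for _ in range(5):  # five runs per prompt, as described above
        response = openai.Completion.create(
            model="text-davinci-003",  # assumed model string for "GPT-3 Davinci"
            prompt=prompt,
            temperature=0.7,
            max_tokens=50,
        )
        print(response["choices"][0]["text"].strip())
```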
The other thing is—this feels very much like a “god of the gaps” situation, where every time LLMs learn to do something, it doesn’t count any more. Realistically, would you have actually predicted a few years ago that a language model would be able to write full high-school-level essays from a single prompt (something I recently informally tested), and that this ability still wouldn’t be meaningful evidence towards machines eventually becoming as capable as humans in a broad range of cognitive domains?
“Huh, this person definitely speaks fluent LessWrong. I wonder if they read Project Lawful? Who wrote this post, anyway? I may have heard of them.
...Okay, yeah, fair enough.”
One thing I definitely believe, and have commented on before[1], is that median EAs (i.e., EAs without an unusual amount of influence) are over-optimising for the image of EA as a whole, which sometimes conflicts with actually trying to do effective altruism. Let the PR people and the intellectual leaders of EA handle that—people outside those roles should be focusing on saying what we sincerely believe to be true, and worrying much less about whether someone, somewhere, might call us bad people for saying it. That ship has sailed—there are people out there, by now, who already have the conclusion of “And therefore, EAs are bad people” written down—refusing to post an opinion won’t stop them filling in the middle bits with something else, and this was true even before the FTX debacle.
In short—“We should give the money back because it would help EA’s image” is, imo, a bad take. “We should give the money back because it would be the right thing to do” is, imo, a much better take, which I won’t take a stand on myself at this point since I don’t have a horse in this race.
Maybe I should write a post about this at some point...though I recognise that now, in particular, isn’t exactly the right time to do that.
[1] On a deleted post, so I can’t link the comment here, but people can search my comment history for “Personally, I disagree strenuously” if they wish to verify this.