Another question related to Task Y: supposing Task Y does exist, would you rather people working on Task Y think of themselves as “Soft EAs”, or as people who are part of the “Task Y community”? For example, if eating a vegan diet is Task Y, would you like vegans to start thinking of themselves as EAs due to their veganism? If veganism didn’t exist already, and it was an idea that originated from within the EA community, would it be best to spin it off or keep it internal?
I can think of arguments on both sides:
Maybe there’s already a large audience of people who have heard about EA and think it’s really cool but don’t know how to contribute. If these people already exist, we might as well figure out the best things for them to do. This isn’t necessarily an argument for expanding the EA movement itself, though, and it’s not totally clear which direction this consideration points in.
If Task Y is a task where the argument for positive impact is abstruse & hard to follow, then maybe a “Task Y Movement” isn’t ever going to get off the ground because it lacks popular appeal. Maybe the EA movement has more popular appeal, and the EA movement’s popular appeal can be directed into Task Y.
Some find the EA movement uninviting in its elitism. Even on this forum, reportedly the most elitist EA discussion venue, a highly upvoted post says: “Many of my friends report that reading 80,000 Hours’ site usually makes them feel demoralized, alienated, and hopeless.” There have been gripes about the difficulty of getting grant money for EA projects from grantmaking organizations after it became known that “EA is no longer funding-limited”. (I might be guilty of this griping myself.) Do we want average Janes and Joes reading EA career advice that Google software engineers find “very depressing”? How will they feel after learning that some EAs are considered 1000x as impactful as them?
Expansion of the EA movement itself could be hard to reverse and destroy option value.
If people are biased towards believing their actions have cosmic significance, does this also imply that people without math & CS skills will be biased against AI safety as a cause area?
The EA Hotel hosted an EA Retreat which sounds a bit similar. Here’s a report from a Czech EA retreat.
The Pareto Fellowship even more so for me. Here CEA explains why they discontinued it.
I changed the sentence you mention to “If you want to understand present-day algorithms, the ‘pre-driven car’ model of thinking works a lot better than the ‘self-driving car’ model of thinking. The present and past are the only tools we have to think about the future, so I expect the ‘pre-driven car’ model to make more accurate predictions.” I hope this is clearer.
That is clearer, thanks!
I think aiming for such precise language in these discussions is a hopeless endeavour at this point in time; I estimate it would take a ludicrous amount of additional intellectual labour to reach that level of rigour. It’s too high a target.
Well, it’s already possible to write code that exhibits some of the failure modes AI pessimists are worried about. If discussions about AI safety switched from trading sentences to trading toy AI programs, which operate on gridworlds and such, I suspect the clarity of discourse would improve.
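To make this concrete, here’s a minimal sketch of the sort of toy program I have in mind (the grid layout and the “vase” constraint are invented for illustration, not taken from any existing benchmark): an agent whose reward function omits an implicit constraint, so its “optimal” plan exhibits a negative-side-effect failure mode.

```python
# Toy gridworld sketch (everything here is invented for illustration).
# The agent is rewarded only for reaching G; the vase V is an implicit
# constraint the designer forgot to encode, so the shortest plan smashes it.
from collections import deque

GRID = [
    "S.V.G",   # S = start, G = goal, V = vase we implicitly want left intact
    ".....",   # . = empty floor
]

def solve(grid, penalize_vase):
    """Breadth-first search for a shortest path; the 'reward' only cares about reaching G."""
    rows, cols = len(grid), len(grid[0])
    start = next((r, c) for r in range(rows) for c in range(cols) if grid[r][c] == "S")
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        (r, c), path = queue.popleft()
        if grid[r][c] == "G":
            return path
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in seen:
                if penalize_vase and grid[nr][nc] == "V":
                    continue  # only avoided if the designer remembered to penalize it
                seen.add((nr, nc))
                queue.append(((nr, nc), path + [(nr, nc)]))
    return None

careless = solve(GRID, penalize_vase=False)
careful = solve(GRID, penalize_vase=True)
print("Path with the vase omitted from the reward:", careless)  # walks straight through V
print("Path once the vase is accounted for:       ", careful)   # detours around it
```

The point isn’t that this particular toy is interesting; it’s that a disagreement about whether the “careless” behaviour counts as a failure, or how hard it is to patch, can be settled by editing and re-running the program rather than by trading intuitions.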
I might post some scraps of arguments on my blog soonish, but those posts won’t be well-written and I don’t expect anyone to really read them.
Cool, let me know!
Presumably if the argument is for why the weight should be higher, then kbog will pay attention?
The “language” section is the strongest IMO. But it feels like “self-driving” and “pre-driven” cars probably exist on some kind of continuum. How well do the system’s classification algorithms generalize? To what degree does the system solve the “distribution shift” problem and tell a human operator to take control in circumstances that the car isn’t prepared for? (You call these circumstances “unforeseen”, but what about a car that attempts to foresee likely situations it doesn’t know what to do in and ask a human for input in advance?) What experiment would let me determine whether a particular car is self-driving or pre-driven? What falsifiable predictions, if any, are you making about the future of self-driving cars?
I was confused by this sentence: “The second pattern is superior by wide margin when it comes to present-day software”.
I think leaky abstractions are a big problem in discussions of AI risk. You’re doubtless familiar with the process by which you translate a vague idea in your head into computer code. I think too many AI safety discussions are happening at the “vague idea” level, and more discussions should be happening at the code level or the “English that’s precise enough to translate into code” level, which seems like what you’re grasping at here. I think if you spent more time working on your ontology and the clarity of your thought, the language section could be really strong.
(Any post which argues the thesis “AI safety is easily solvable” is both a post that argues for de-prioritizing AI safety and a post that is, in a sense, attempting to solve AI safety. I think posts like these are valuable; “AI safety has this specific easy solution” isn’t as within the Overton window of the community devoted to working on AI safety as I would like it to be. Even if the best solution ends up being complex, I think in-depth discussion of why easy solutions won’t work has been neglected.)
Re: the anchoring section, I’m pretty sure it is well documented by psychologists that humans are overconfident in their probabilistic judgements. Even if humans tend to anchor on a 50% probability and adjust from there, that doesn’t seem to be enough to counter our overconfidence bias. Regarding the “Discounting the future” section of your post, see the “Multiple-Stage Fallacy” (a toy illustration of the multiplication effect follows below). If a superintelligent FAI gets created, it can likely make humanity’s extinction probability almost arbitrarily low through sufficient paranoia. Regarding AI accidents going “really really wrong”, see the instrumental convergence thesis. And AI safety work could be helpful even if countermeasures aren’t implemented universally, through the creation of a friendly singleton.
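To spell out the multiple-stage fallacy concern with a toy calculation (all of the numbers below are invented): if you decompose a claim into enough stages and assign each one a modest-seeming probability, the naive product is driven toward zero, even when the stages are correlated or individually underestimated.

```python
# Toy illustration of the multiple-stage fallacy (all numbers invented).
# Eight stages, each judged merely "more likely than not", multiply out
# to a small-looking overall probability.
stage_probabilities = [0.7] * 8

product = 1.0
for p in stage_probabilities:
    product *= p

print(f"Naive product over {len(stage_probabilities)} stages: {product:.3f}")  # ~0.058
```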
Presumably the programmer will make some effort to embed the right set of values in the AI. If this is an easy task, doom is probably not the default outcome.
AI pessimists have argued human values will be difficult to communicate due to their complexity. But as AI capabilities improve, AI systems get better at learning complex things.
Both the instrumental convergence thesis and the complexity of value thesis are key parts of the argument for AI pessimism as it’s commonly presented. Are you claiming that they aren’t actually necessary for the argument to be compelling? (If so, why were they included in the first place? This sounds a bit like justification drift.)
the original texts are very clear that the massive jump in AI capability is supposed to come from recursive self-improvement, i.e. the AI helping to do AI research
...because that AI research is useful for some other goal the AI has, such as maximizing paperclips. See the instrumental convergence thesis.
At any rate, though, what does it matter whether the goal is put in after the capability growth, or before/during? Obviously, it matters, but it doesn’t matter for purposes of evaluating the priority of AI safety work, since in both cases the potential for accidental catastrophe exists.
The argument for doom by default seems to rest on a default misunderstanding of human values as the programmer attempts to communicate them to the AI. If capability growth comes before a goal is granted, it seems less likely that misunderstanding will occur.
OK. I went ahead and removed it now, so the next person to create an open thread will copy/paste the correct message.
Great idea! I don’t think mass requests are the way to go, though. I’ll bet if someone like Peter Singer, Will MacAskill, or Toby Ord sent them a proposal to write an article about EA, they’d accept. I sent Will a Facebook message to ask him what he thinks.
I think more people should be studying statistics, machine learning, and data science, especially Bayesian methods and causal inference. Not only do these skills offer a chance to contribute to AI safety, they’re also critical for evaluating scientific papers (important for any field given the replication crisis), doing predictive modeling, and generally thinking in a data-driven and evidence-based way. Math is apparently 80,000 Hours’ #1 recommendation, but when I was a student, I went to an event where math majors talked about their experiences in industry. Most of them said they didn’t use much of the math they had learned and wished they had studied more statistics. So I would suggest applied math with a statistics emphasis.
If we’re choosing between trying to improve Vox vs trying to discredit Vox, I think EA goals are served better by the former.
Tractability matters. Scott Alexander has been critiquing Vox for years. It might be that improving Vox is a less tractable goal than getting EAs to share their articles less.
they went out on a limb to hire Piper, and they’ve sacrificed some readership to maintain EA fidelity.
My understanding is that Future Perfect is funded by the Rockefeller Foundation. Without knowing the terms of their funding, I think it’s hard to ascribe either virtue or vice to Vox. For example, if the Rockefeller Foundation is paying them per content item in the “Future Perfect” vertical, I could ascribe vice to Vox by saying that they are churning out subpar EA content in order to improve their bottom line.
This is an interesting essay. My thinking is that “coalition norms”, under which politics operate, trade off instrumental rationality against epistemic rationality. I can argue that it’s morally correct from a consequentialist point of view to tell a lie in order to get my favorite politician elected so they will pass some critical policy. But this is a Faustian bargain in the long run, because it sacrifices the epistemology of the group, and causes the people who have the best arguments against the group’s thinking to leave in disgust or never join in the first place.
I’m not saying EAs shouldn’t join political coalitions. But I feel like we’d be sacrificing a lot if the EA movement began sliding toward coalition norms. If you think some coalition is the best one, you can go off and work with that coalition. Or if you don’t like any of the existing ones, create one of your own, or maybe even join one & try to improve it from the inside.
But if we actually want EA to go mainstream, we can’t rely on econbloggers and think-tanks to reach most people. We need easier explanations, and I think Vox provides that well.
Is “taking EA mainstream” the best thing for Future Perfect to try & accomplish? Our goal as a movement is not to maximize the number of people who have the “EA” label; our goal is to do the most good. If we garble the ideas or epistemology of EA in an effort to maximize the number of people who self-apply the “EA” label, that looks like a potential instance of Goodhart’s Law.
Instead of “taking EA mainstream”, how about “spread memes to Vox’s audience that will cause people in that audience to have a greater positive impact on the world”?
I don’t have stats, it’s just something I hear from vegans when I suggest an organization to provide welfare standards for meat providers. They say it has been tried before and the organization always gets co-opted by the industry. I’m actually kinda skeptical.
If you work as an agricultural inspector and err on the side of making recommendations which happen to improve animal welfare, that seems like it could be high-impact. Also: An argument I hear from vegans is that we can’t have happy meat because any organization which purports to enforce some standard of animal welfare will essentially get bribed by factory farms. If this is true, a way to address it would be to funnel un-bribable people with a passion for animal welfare into those roles.
WRT earning to give, the US Bureau of Labor Statistics maintains an Occupational Outlook Handbook with info on wages and job growth for loads of different jobs. Air traffic controller looks pretty good, although the BLS seems to think you typically need a 2-year degree, so maybe it doesn’t count as “vocational”.
I also think it is worth specifically thinking in terms of jobs which aren’t on the radar of other people, because lower supply is going to mean a higher salary. These reddit threads might be worth checking out. Finally, it might be worthwhile to get access to publicly available salary data and determine which municipalities pay a lot of money for jobs like being a police officer (a rough sketch of what that analysis could look like is at the end of this comment). (You probably also want to take a careful look at the pension plan in that municipality to ensure that it’s on solid ground fiscally.) BTW, Tyler Cowen likes to argue that hiring more cops and imprisoning fewer people would be good for the USA on both crime-reduction and humanitarian grounds; here is one presentation of the argument.
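For the salary-data idea, here’s a rough sketch of the kind of analysis I mean, assuming you’ve downloaded a public payroll export; the file name and column names below are hypothetical, and any real municipal dataset will need its own cleaning.

```python
# Hypothetical sketch: rank municipalities by median police-officer pay.
# "municipal_payroll.csv" and the column names are placeholders for whatever
# the public payroll dataset you find actually provides.
import pandas as pd

payroll = pd.read_csv("municipal_payroll.csv")
officers = payroll[payroll["job_title"].str.contains("police officer", case=False, na=False)]

median_pay = (
    officers.groupby("municipality")["annual_salary"]
    .median()
    .sort_values(ascending=False)
)
print(median_pay.head(10))  # municipalities with the highest median officer pay
```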
Has there been any research into ways to address Allee effects (where a population that falls below a critical size struggles to recover)? Seems like that could simultaneously address a range of combination existential risks.
To me it seems extraordinarily unlikely that any agent capable of performing all these tasks with a high degree of proficiency would simultaneously stand firm in its conviction that the only goal it had reasons to pursue was tiling the universe with paperclips.
Seems a little anthropomorphic. A possibly less anthropomorphic argument: if we possess the algorithms required to construct an agent capable of achieving a decisive strategic advantage, we can also apply those algorithms to pondering moral dilemmas and the like, and use the results to construct the agent’s value function.