Non-EA interests include chess and TikTok (@benthamite). Formerly @ CEA, METR + a couple now-acquired startups.
Ben_West🔸
You respond to Richard Ngo here:
> do you think that, if we had a theory of sociopolitics that was about as good as 20th-century economics, then we wouldn't be clueless about how to do sociopolitical interventions (like founding AI safety movements) effectively?
No, because I think "founding AI safety movements that succeed at making the far future go better" is a pretty out-of-distribution kind of sociopolitical intervention.
Suppose instead we had a comparably good theory of the right reference class, e.g. "movements trying to shape transformative technologies." Would we still be clueless about AI safety movement-building?
More generally: you list various considerations across your posts and I have a hard time understanding which is load-bearing for your answer here. Some possibilities:
We're clueless because we haven't yet developed the relevant theory (Richard's reading IIUC, on which cluelessness is contingent and reducible)
No such theory could be validated even in principle, because we never observe the target variable (far-future value) and calibration on near-term proxies doesn't transfer
Even a validated theory wouldn't help, because impact is dominated by considerations inaccessible to any theory (e.g. unconceived hypothesis classes)
What's the clearest example of a complex cluelessness sign flip you're aware of?
(By "clear" I mean "had a very narrow confidence interval before encountering some consideration and a narrow interval after encountering that consideration but the CIs now center points with opposite signs".[1])
The clearest examples I know of (e.g. rescuing Hitler as a child) seem to me like examples of simple cluelessness. You list some examples here, but they don't seem that clear to me, e.g. I disagree that "Early awareness-raising about AGI x-risk presumably seemed robustly good" and would guess most people involved in that had CIs which comfortably straddled zero.
[1] Or alternatively: there are two representors with narrow but non-overlapping CIs.
Thanks! "Don't arbitrarily favor some moral patients over others" was the most intuitive definition of "impartial" in my mind, so I was confused about how it was being used here.
I feel somewhat confused about what exactly the challenge here is. You say:
Grant all the premises but argue that the conclusion, that we have no impartial-altruistic reason to prefer any action over another, doesn't follow.
My understanding is that Anthony agrees we can justifiably prefer actions, and even do so on altruistic consequence-based grounds if something like bracketing works, but he'd deny that either deontological reasons or bracketed reasons are "impartial altruistic reasons" in the sense his conclusion targets: the first aren't consequence-comparisons at all, and the second are consequence-comparisons that give up full impartiality to stay determinate.
Is my understanding correct? If so, I would find a more precise statement of the inference people are supposed to challenge helpful.
My understanding is that Anthony agrees that there are still reasons to do things:
First, the unawareness argument doesn't imply that "nothing we do matters" all things considered. It only implies that impartial altruism, or any very far-reaching value system, isn't action-guiding. Other values and moral norms still matter to us, for example, rules like avoiding dishonesty or virtues like compassion. These can be action-guiding even if we're clueless about total consequences.
I think your justification "because all of my attempts to do good actually end up being a net positive for me in terms of my own self-interest" doesn't disagree with his conclusion?
Surprising that your MP was willing to meet with you for so long. Thanks for doing this (and writing it up)!
Sure, not the metaphor I would use but I broadly agree that applicants who are willing to plug the metaphorical USB stick into their computer (e.g. by following people they want to work with and applying to the jobs that they post) have a much lower rejection rate.
Sure, "hiring managers being bad at marketing is the bottleneck, not funding" is at least partially true. It still implies that if you happen to stumble across a poorly advertised position, you shouldn't expect the acceptance rate to be low!
My version of Matt's critique that you quoted is something like:
Imagine you're running a mining company, and you want to start mining Venus. You could either embark on a massive terraforming project to make Venus habitable by biological humans who can work in your mining colony, or you could just build a bunch of robots who can naturally withstand Venus' climate, think faster than a human, make better decisions than a human, etc. etc. What do you do?
Obviously you are going to choose to send the robots, and the robots aren't going to want to eat meat, so you don't need to worry about factory farming on Venus.
I don't think this argument is bulletproof. For example, ports in the U.S. are required to pay human dock workers to sit around and do nothing after their jobs had been automated. I could imagine some sort of analogous regulatory capture in the future which would require mining companies to send humans to other planets even when robots would be more efficient. Preventing this kind of lock-in is one of the few interventions targeting a post-singularity world that I feel positive about.
this claim feels at odds with what I understood your perspective to be from the shallow review you did a while ago
Yeah, I think I did a bad job explaining my views there. Could you say more about what you thought I believed? I should maybe update the post.
However, it seems like this would fall into the bucket of "not actionable unless you work directly on AI," so it seems like it might be practically useful to act as if this wasn't going to be the case?
Good question and I claim the answer is "no" because you can work on AI! E.g. The Midas Project (founded by a former THL campaigner) is bringing corporate campaign tactics to AI safety. See also e.g. AI Safety's Biggest Talent Gap Isn't Researchers. It's Generalists.
I think there's a decent chance that AI won't actually be very transformative (or at least won't be transformative soon) and therefore it's reasonable to bet your career plans on that assumption. But, to the extent you think AI is actually going to be a big deal, I would suggest considering working on it!
Fair. "In good timelines some humans continue to exist despite not really being a major force analogous to how the Amish exist today but aren't really a major force" is probably the modal view amongst people I talk to.
I shouldn't have used the term "TAI" here; I've clarified. Thanks!
What I said was unclear; I probably should have just quoted Holden Karnofsky:
I think there is a good chance that:
During the century we're in right now, we will develop technologies that cause us to transition to a state in which humans as we know them are no longer the main force in world events. This is our last chance to shape how that transition happens.
Whatever the main force in world events is (perhaps digital people, misaligned AI, or something else) will create highly stable civilizations that populate our entire galaxy for billions of years to come. The transition taking place this century could shape all of that.
I think it's very unclear whether this would be a good or bad thing.
There are some people (e.g. Evitable) who are trying to stop this transition, but I think more AI safety people would self-describe their work as "shap[ing] how that transition happens."
What Matt calls efficient many would call "weird" and hence undesirable. I'm not sure we have a reason to think AI will mitigate this rather than amplify it.
In case you haven't seen it, one argument for why AI might amplify (or at least lock-in) traditions was given by Buck here: Christian homeschoolers in the year 3000.
Thanks for writing this! I appreciate you flagging:
Here, I'm conditioning on humans as we know them continuing to exist. I take no position on how likely this is to happen, but it seems to me like this is the outcome that EAs are pushing for when trying to reduce existential risk from AI.
I am not aware of anyone who works on AI safety who would say that this is what they are pushing for, with the exception of people who are pushing for a complete pause in AI development.
The rest of us are generally resigned to biological humans disappearing once we have transformative AI [edit: this was unclear, see update], even under the most optimistic scenarios. I expect that this is a major way in which the AI safety people I know (including Matt) find arguments like this and Bentham's Bulldog's uncompelling.
Interesting post, thanks! On this:
It's possible that the lack of evidence has been accounted for in other ways. Perhaps someone who initially guesses a 20% chance of extinction is subtly dropping that down to 5% on the grounds of epistemic modesty. But it's unlikely they are doing so in the exact right way to counteract the effect of the optimizer's curse.
My understanding is that worldview diversification partially addresses things like this (see this old critique from Holden Karnofsky, which makes a similar point to yours and I think is intellectually upstream of CG's later thinking), in addition to accounting for e.g. normative uncertainty.
I'm not exactly sure how worldview diversification works (maybe someone from CG can comment) but I share your skepticism that it's being done in exactly the right way to counteract these effects.
+1, I think cluelessness-type objections are some of the strongest objections to my own work and I would be excited to see more discussion about it.
Thanks for the questions!
Are you saying that people would read "senior" in a job description as meaning "older" rather than "more experienced"?
No, I'm saying that they would interpret it to mean "having more years of formal experience (rather than e.g. having had a wider variety of experiences, or having had more useful experiences)" and I instead want a word which means "more skilled".
Can you elaborate on your reluctance to hire an "old" person?
No reluctance! I check the "20+ years of experience" box on EAG applications myself. I am just bemoaning the fact that the word "senior" indicates both age and skill, and I want a word which only applies to the latter.
It's great that you're thinking about this!
I'm confused why you are denominating options in robotics-startup-days saved. This feels like a narrow definition of "impact". I'd encourage you to consider other ways to benefit the world; parts 4-6 of the 80k career guide might be helpful. Specifically, under the assumption that the thing you terminally value is more like "reducing suffering" than "robotics progress", I would encourage you to first consider which causes advance those values, and only then drill into job options. (The 80k career guide will walk you through this.)
Also, even to the extent you do just terminally value robotics progress, you might want to consider whether robotics will advance too quickly for your estimation to be accurate.