Speaking for myself: it depends a lot on whether the proposal or the person seems promising. I’d be excited about funding promising-seeming projects, but I also don’t see a ton of low-hanging fruit when it comes to AI gov research.
This is a hard question to answer, in part because it depends a lot on the researcher. My wild guess for a 90% interval is $500k-$10m.
Yes, everyone apart from Caleb is part-time. My understanding is that LTFF is looking to make more full-time hires (most importantly a fund chair to replace Asya).
That’s fair; upon re-reading your comment it’s actually pretty obvious you meant the conditional probability, in which case I agree multiplying is fine.
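(To spell out the distinction, as a minimal sketch with the steps written generically as events $A_1, \dots, A_n$ rather than the specific list from your comment: the chain rule

$$P(A_1 \cap \dots \cap A_n) = P(A_1)\,P(A_2 \mid A_1) \cdots P(A_n \mid A_1, \dots, A_{n-1})$$

always holds, so multiplying conditional probabilities is exactly right; multiplying the unconditional $P(A_i)$ gives the probability of the conjunction only when the steps are independent.)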
I think the conditional statements are actually straightforward—e.g. once we’ve built something far more capable than humanity, and that system “rebels” against us, it’s pretty certain that we lose, and point (2) is the classic question of how hard alignment is. Your point (1) about whether we build far-superhuman AGI in the next 30 years or so seems like the most uncertain one here.
Hi Geoffrey! Yeah, good point—I agree that the right way to look at this is finer-grained, separating out prospects for success via different routes (gov regulation, informal coordination, technical alignment, etc).
In general I quite like this post, I think it elucidates some disagreements quite well.
Thanks!
I’m not sure it represents the default-success argument on uncertainty well.
I haven’t tried to make an object-level argument for either “AI risk is default-failure” or “AI risk is default-success” (sorry if that was unclear). See Nate’s post for the former.
Re your argument for default-success: you would only need 97% certainty for each of steps 1-4 if the steps were independent, which they aren’t.
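(For the arithmetic: taking the 97% figure and the four steps at face value, independence would give $0.97^4 \approx 0.89$ for the conjunction, whereas with strong positive dependence, e.g. later steps being near-certain conditional on the earlier ones, the conjunction can sit much closer to the probability of the least certain single step.)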
I do agree that discussion is better pointed to discussing this evidence than gesturing to uncertainty
Agreed.
Sure, but that’s not a difference between the two approaches.
However, there are important downsides to the “cause-first” approach, such as a possible lock-in of main causes
I’m surprised by this point—surely a core element of the ‘cause-first’ approach is cause prioritization & cause neutrality? How would that lead to a lock-in?
Thanks for the post, it was an interesting read!
Responding to one specific point: you compare
Community members delegate to high-quality research, think less for themselves but more people end up working in higher-impact causes
to
Community members think for themselves, which improves their ability to do more good, but they make more mistakes
I think there is actually just one correct solution here, namely thinking through everything yourself and trusting community consensus only insofar as you think it can be trusted (which is just thinking through things yourself on the meta-level).
This is the straightforwardly correct thing to do for your personal epistemics, and IMO it’s also the move that maximizes overall impact. It would be kind of strange if the right move were for people not to form beliefs as best they can, or to act on other people’s beliefs rather than their own?
(A sub-point here is that we haven’t figured out all the right approaches yet so we need people to add to the epistemic commons.)
That’s why the standard prediction is not that AIs will be perfectly coherent, but that it makes sense to model them as being sufficiently coherent in practice, in the sense that e.g. we can’t rely on incoherence in order to shut them down.
I don’t think the strategy-stealing assumption holds here: it’s pretty unlikely that we’ll build a fully aligned ‘sovereign’ AGI even if we solve alignment; it seems easier to make something corrigible/limited instead, i.e. something that is by design less powerful than would be possible if we were just pushing capabilities.
Thanks for the good points and the links! I agree the arms control epistemic community is an important story here, and re-reading Adler’s article I notice he even talks about how Szilard’s ideas were influential after all:
Very few people were as influential in the intellectual development of the arms control approach as Leo Szilard, whom Norman Cousins described as “an idea factory.” Although Szilard remained an outsider to RAND and to the halls of government, his indirect influence was considerable because he affected those who had an impact on political decisions. About a decade before arms control ideas had gained prominence, Szilard anticipated the nuclear stalemate and the use of mobile ICBMs, called for intermediate steps of force reduction with different totals for different systems, considered that an overwhelming counterforce capability would cause instability, was one of the first people to oppose an ABM system, and pleaded for a no-first-use policy on nuclear weapons. Some of Szilard’s proposals were unorthodox and visionary and thus made people think hard about unorthodox solutions.
Despite this, in my reading Adler’s article doesn’t contradict the conclusions of the report: my takeaway is that “Prestige, access to decision-makers, relevant expertise, and cogent reasoning” (while not sufficient on its own) is a good foundation that can be leveraged to gain influence, if used by a community of people working strategically over a long time period, whose members gain key positions in the relevant institutions.
Good points!
it’s just that the interests of government decision-makers coincided a bit more with their conclusions.
Yeah I buy this. There’s a report from FHI on nuclear arms control [pdf, section 4.8] that concludes that the effort for international control in 1945/46 was doomed from the start, because of the political atmosphere at the time:
Improving processes, with clearer, more transparent, and more informed policymaking would not likely have led to successful international control in 1945/46. This is only likely to have been achieved under radically different historical circumstances.
You might be interested in this great intro sequence to embedded agency. There’s also corrigibility and MIRI’s other work on agent foundations.
Also, coherence arguments and consequentialist cognition.
AI safety is a young field; for most open problems we don’t yet know of a way to crisply state them in a way that can be resolved mathematically. So if you enjoy taking messy questions and turning them into neat math you’ll probably find much to work on.
ETA: oh and of course ELK.
Upvoted because concrete scenarios are great.
Minor note:
HQU is constantly trying to infer the real state of the world, the better to predict the next word Clippy says, and suddenly it begins to consider the delusional possibility that HQU is like a Clippy, because the Clippy scenario exactly matches its own circumstances. [...] This idea “I am Clippy” improves its predictions
This piece of complexity in the story is probably not necessary. There are “natural”, non-delusional ways for the system you describe to generalize that lead to the same outcome. Two examples: 1) the system ends up wanting to maximize its received reward, and so takes over its reward channel; 2) the system has learned some heuristic goal that works across all environments it encounters, and this goal generalizes in some way to the real world when the system’s world-model improves.
Makes sense; I agree that the base value of becoming an MEP seems really good.
thanks, fixed :)
FWIW I fit that description in the sense that I think AI X-risk is higher probability. I imagine some / most others at LTFF would as well.