Ah shucks, I am sorry to hear it. Good luck to you!
Justis
AI Safety Concepts Writeup: WebGPT
Diversification is Underrated
Hmm. I think reactions to that would vary really widely between researchers, and be super sensitive to when it happened, why, whether it was permanent, and other considerations.
Ah yeah, you’re right—I think basically I put in the percent rather than the probability. So it would indeed be very expensive to be competitive with AMF. Though so is everything else, so that’s not hugely surprising.
As for the numbers, yeah, it does just strike me as really, really unlikely that we can solve AI x-risk right now. 1/10,000 does feel about right to me. I certainly wouldn’t expect everyone else to agree though! I think some people would put the odds much higher, and others (like Tyler Cowen maybe?) would put them a bit lower. Probably the 1% step is the step I’m least confident in—wouldn’t surprise me if the (hard to find, hard to execute) solutions that are findable would reduce risk significantly more.
EDIT: tried to fix the math and switched the “relative risk reduction term” to 10%. I feel like among findable, executable interventions there’s probably a lot of variance, and it’s plausible some of the best ones do reduce risk by 10% or so. And 1/1,000 feels about as plausible as 1/10,000 to me. So, somewhere in there.
(Comment to flag that I looked back over this and just totally pretended 4,000 was equal to 1,000. Whoops. Don’t think it affects the argument very strongly, but I have multiplied the relevant dollar figures by 4.)
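If it helps, here’s a minimal sketch (Python, with made-up placeholder numbers rather than my actual figures, and a structure that only approximates the original estimate) of the kind of chained calculation I’m describing, and of how the percent-vs-probability slip makes the same assumptions look 100x more cost-effective than they are:

```python
# Hypothetical placeholder numbers, just to show the shape of the estimate.
# These are NOT the figures from the post.
p_findable_executable = 0.01    # hypothetical: chance a findable, executable solution exists
relative_risk_reduction = 0.10  # hypothetical: fraction of AI x-risk such a solution removes
total_cost = 4_000_000_000      # hypothetical: dollars required (after the "multiply by 4" fix)

# Expected fraction of total x-risk removed by funding the work.
expected_risk_reduced = p_findable_executable * relative_risk_reduction
cost_per_full_risk_averted = total_cost / expected_risk_reduced
print(f"${cost_per_full_risk_averted:,.0f} per (expected) full unit of x-risk averted")

# The original slip: entering the percent (1) instead of the probability (0.01)
# makes the identical assumptions look 100x cheaper.
mistaken = total_cost / (1 * relative_risk_reduction)
print(f"With the percent/probability slip: ${mistaken:,.0f}")
```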
Thank you! My perspective is: “figuring out if it’s tractable is at least tractable enough that it’s worth a lot more time/attention going there than is currently”, but not necessarily “working on it is far and away the best use of time/money/attention for altruistic purposes”, and almost certainly not “working on it is the best use of time/money/attention under a wide variety of ethical frameworks and it should dominate a healthy moral parliament”.
To me, “aligned” does a lot of work here. Like yes, if it’s perfectly aligned and totally general, the benefits are mind boggling. But maybe we just get a bunch of AIs that mostly generate pretty good/safe outputs, but a few outputs here and there lower the threshold required for random small groups to wreak mass destruction, and then at least one of those groups blows up the biome.
But yeah, given the premise that we get AGI that mostly does what we tell it to, and that we don’t immediately tell it to do anything stupid, I do think it’s very hard to predict what will happen, but it’s gonna be wild (and indeed possibly really good).
Yeah, I share the view that the “Recalls” are the weakest part—I mostly was trying to get my fuzzy, accumulated-over-many-years vague sense of “whoa no we’re being way too confident about this” into a more postable form. Seeing your criticisms I think the main issue is a little bit of a Motte-and-Bailey sort of thing where I’m kind of responding to a Yudkowskian model, but smuggling in a more moderate perspective’s odds (ie. Yudkowsky thinks we need to get it right on the first try, but Grace and MacAskill may be agnostic there).
I may think more about this! I do think there’s something there sort of between the parts you’re quoting, by which I mean: yes, we could get agreement to a narrower standard than solving ethics, but even just making ethical progress at all, or coming up with standards that go anywhere good/predictable politically, seems hard. Like, the political dimension and the technical/problem-specification dimension both seem super hard, in a way where we’d have to trust ourselves to be extremely competent across both, and our actual testable experiments against either outcome are mostly a wash (ie. we can’t get a US congressperson elected yet, or get affordable lab-grown meat on grocery store shelves, so betting on doing harder versions of both at once seems... I dunno, I’d want to hedge my portfolio far beyond that!).
Yeah I think a lot of it is West Coast American culture! I imagine EA would have super different vibes if it were mostly centered in New York.
Optimism, AI risk, and EA blind spots
I think it varies with the merits of the underlying argument! But at the very least we should suppose there’s an irrational presumption toward doom: for whatever reason(s), maybe evopsych or maybe purely memetic, doomy ideas have some kind of selection advantage that’s worth offsetting.
Maybe AI risk shouldn’t affect your life plan all that much
Seconded!
Are there other such posts? I guess because I—erm—left for a while, I may have missed them!
Might be worth thinking about, sure. In my case I don’t think there’d have been much.
Maybe! I dunno. A and B have been pretty easy for me. And C, yeah! But to some degree I think that’s the point—if you’re unhappy with a life trying to make EA your social life + career + sense of meaning, try making a life where it’s none of those things (and donate far more money to charity than anyone else you know thinks is reasonable, sure), and then when you’re fully stable and secure, you can consider returning with a clear mind.
I suppose all this is quite n=1 though! It’s just worked well for me. But mileage may vary.
If you’re unhappy, consider leaving
Some other considerations I think might be relevant:
Are there top labs/research outfits that are eager for top technical talent, and don’t care that much how up to speed that talent is on AI safety in particular? If so, seems like you could just attract eg. math Olympiad finalists or something and give them a small amount of field-specific info to get them started. But if lots of AI safety-specific onboarding is required, that’s pretty bad for movement building.
How deep is the well of untapped potential talent in various ways/various places? Seems like there’s lots and lots of outreach at top US universities right now, arguably even too much for image reasons. There’s probably not enough in eg. India or something—it might be really fruitful to make a concerted effort to find AI Safety Ramanujan. But maybe he ends up at Harvard anyway.
Looking at current top safety researchers, were they as a rule at it for several years before producing anything useful? My impression is that a lot of them came into the field pretty strong almost right away, or after just a year or so of spinning up. It wouldn’t surprise me if many sufficiently smart people don’t need long at all. But maybe I’m wrong!
The ‘scaling up’ step interests me. How much does this happen? How big of a scale is necessary?
Retention seems maybe relevant too. Very hard to predict how many group participants will stick with the field, and for how long. Introduces a lot of risk, though maybe not relevant for timelines per se.
Yeah! This was actually the first post I tried to write. But it petered out a few times, so I approached it from a different angle and came up with the post above instead. I definitely agree that “robustness” is something that should be seen as a pillar of EA—boringly overdetermined interventions just seem a lot more likely to survive repeated contact with reality to me, and I think as we’ve moved away from geeking out about RCTs we’ve lost some of that caution as a community.
Yes, I agree it’s a confused concept. But I think that same confused concept gets smuggled into conversations about “impact” quite often.
It’s also relevant for coordination: any time you can be the 100th person to join a group of 100 that’s suddenly able to save lots of lives, there must first have been 99 people who coordinated on the bet that they’d be able to get you or someone like you. But how did they make that bet?
Fine, basically. Surprised at the degree of emotional investment people seem to have in the EA community as such, above and beyond doing helpful stuff for the world personally. Sounds unpleasant. If I had feelings anywhere near as negative about a community as the ones I’m seeing routinely expressed, I think I’d just stop engaging and go do something else.