(Posting in a personal capacity unless stated otherwise.) I help allocate Open Phil’s resources to improve the governance of AI with a focus on avoiding catastrophic outcomes. Formerly co-founder of the Cambridge Boston Alignment Initiative, which supports AI alignment/safety research and outreach programs at Harvard, MIT, and beyond, co-president of Harvard EA, Director of Governance Programs at the Harvard AI Safety Team and MIT AI Alignment, and occasional AI governance researcher. I’m also a proud GWWC pledger and vegan.
tlevin
Nitpick: I would be sad if people ruled themselves out for e.g. being “20th percentile conscientiousness,” since in my impression the popular OCEAN tests are very sensitive to the implicit reference class the test-taker is using.
For example, I took one a year ago and got third percentile conscientiousness, which seems pretty unlikely to be true given my abilities to e.g. hold down a grantmaking job, get decent grades in grad school, successfully run 50-person retreats, etc. I think the explanation is basically that this is how I respond to “I am often late for my appointments”: “Oh boy, so true. I really am often rushing to my office for meetings and often don’t join until a minute or two after the hour.” And I could instead be thinking, “Well, there are lots of people who just regularly completely miss appointments, don’t pay bills, lose jobs, etc. It seems to me like I’m running late a lot, but I should be accounting for the vast diversity of human experience and answer ‘somewhat disagree’.” But the first thing is way easier; you kinda have to know about this issue with the test to do the second thing.
(Unless you wouldn’t hire someone because they were only ~1.3 standard deviations more conscientious than I am, which is fair I guess!)
Reposting my LW comment here:
Just want to plug Josh Greene’s great book Moral Tribes here (disclosure: he’s my former boss). Moral Tribes basically makes the same argument in different/more words: we evolved moral instincts that usually serve us pretty well, and the tricky part is realizing when we’re in a situation that requires us to pull out the heavy-duty philosophical machinery.
Huh, it really doesn’t read that way to me. Both are pretty clear causal paths to “the policy and general coordination we get are better/worse as a result.”
Most of these have the downside of not giving the accused the chance to respond, and thereby not giving the community the chance to evaluate both the criticism and the response (which, as I wrote recently, isn’t necessarily a dominant consideration, but it is an upside of the public writeup).
Fwiw, seems like the positive performance is more censored in expectation than the negative performance: while a case that CH handled poorly could either be widely discussed or never heard about again, I’m struggling to think of how we’d all hear about a case that they handled well, since part of handling it well likely involves the thing not escalating into a big deal and respecting people’s requests for anonymity and privacy.
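To make that censoring asymmetry concrete, here’s a minimal toy simulation (all probabilities invented for illustration): if well-handled cases almost never become public while poorly handled ones sometimes do, the publicly visible sample looks much worse than the underlying track record.

```python
import random

random.seed(0)

# Invented numbers, purely illustrative.
N_CASES = 1000
P_HANDLED_WELL = 0.8      # assumed underlying rate of good handling
P_PUBLIC_IF_WELL = 0.02   # well-handled cases rarely become public
P_PUBLIC_IF_POOR = 0.30   # poorly handled cases sometimes blow up

public_well = public_poor = 0
for _ in range(N_CASES):
    handled_well = random.random() < P_HANDLED_WELL
    p_public = P_PUBLIC_IF_WELL if handled_well else P_PUBLIC_IF_POOR
    if random.random() < p_public:
        if handled_well:
            public_well += 1
        else:
            public_poor += 1

print(f"Underlying share handled well: {P_HANDLED_WELL:.0%}")
print(f"Share of *public* cases handled well: "
      f"{public_well / (public_well + public_poor):.0%}")
# Even when 80% of cases are handled well, most of the cases anyone
# hears about are the poorly handled ones.
```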
It does seem like a big drawback that the accused don’t know the details of the accusations, but it also seems like there are obvious tradeoffs here, and it would make sense for this to be very different from the criminal justice system given the difference in punishments (loss of professional and financial opportunities and social status vs. actual prison time).
Agreed that a survey seems really good.
Thanks for writing this up!
I hope to write a post about this at some point, but since you raise some of these arguments, I think the most important cruxes for a pause are:
It seems like in many people’s models, the reason the “snap back” is problematic is that the productivity of safety research is much higher when capabilities are close to the danger zone, both because the AIs that we’re using to do safety research are better and because the AIs that we’re doing the safety research on are more similar to the ones in the danger zone. If the “snap back” reduces the amount of calendar time during which we think AI safety research will be most productive in exchange for giving us more time overall, this could easily be net negative. On the other hand, a pause might just “snap back” to somewhere on the capabilities graph that’s still outside the danger zone, and lower than it would’ve been without the pause for the reasons you describe.
A huge empirical uncertainty I have is: how elastic is the long-term supply curve of compute? If, on one extreme, the production of computing hardware for the next 20 years is set in stone, then at the end of the pause there would be a huge jump in how much compute a developer could use to train a model, which seems pretty likely to produce a destabilizing/costly jump. At the other extreme, if compute supply were very responsive to expected AI progress, and a pause meant a big cut to e.g. Nvidia’s R&D budget and led TSMC to shelve plans for a leading-node fab or two, the jump would be much less worrying in expectation. I’ve heard that the industry plans pretty far in advance because of how much time and money it takes to build a fab (and how much coordination is required between the different parts of the supply chain), but at this point a lot of the expected future revenue from designing the next generations of GPUs comes from their usefulness for training huge AI systems, so it seems like there should at least be some marginal reduction in long-term capacity if there were a big regulatory response.
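To illustrate why this elasticity matters so much, here’s a toy back-of-the-envelope sketch (every number, and the interpolation between the two extremes, is a made-up assumption rather than a forecast): the more hardware production scales back during a pause, the smaller the compute overhang, and hence the smaller the post-pause jump in the largest feasible training run.

```python
# Toy model of the post-pause compute "jump" (all numbers invented).
PAUSE_YEARS = 5
BASELINE_GROWTH = 2.0  # assumed annual growth factor of available compute


def overhang_after_pause(supply_elasticity: float) -> float:
    """Compute available at the end of the pause, relative to today.

    supply_elasticity = 0.0: hardware production is set in stone, so
    compute keeps accumulating at the baseline rate during the pause.
    supply_elasticity = 1.0: production fully scales back, so there is
    no accumulation. Intermediate values are a crude interpolation.
    """
    paused_growth = BASELINE_GROWTH ** (1 - supply_elasticity)
    return paused_growth ** PAUSE_YEARS


for elasticity in (0.0, 0.5, 1.0):
    print(f"elasticity {elasticity:.1f}: post-pause frontier run could be "
          f"~{overhang_after_pause(elasticity):.0f}x today's largest")
```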
Notes on nukes, IR, and AI from “Arsenals of Folly” (and other books)
Agree, basically any policy job seems to start teaching you important stuff about institutional politics and process and the culture of the whole political system!
Though I should also add this important-seeming nuance I gathered from a pretty senior policy person who said basically: “I don’t like the mindset of, get anywhere in the government and climb the ladder and wait for your time to save the day; people should be thinking of it as proactively learning as much as possible about their corner of the government-world, and ideally sharing that information with others.”
A suggestion for how people can develop this expertise from ~scratch, in a way that should be pretty adaptable to e.g. an undergraduate or grad-level course or independent research (a much better/stronger version of things I’ve done in the past, which involved lots of talking and take-developing but not much detail or publication, and I think detail and publication are both really important):
Figure out who, both within the EA world and not, would know at least a fair amount about this topic—maybe they just would be able to explain why it’s useful in more context than you have, maybe they know what papers you should read or acronyms you should familiarize yourself with—and talk to them, roughly in increasing order of scariness/value of their time, such that you’ve at least had a few conversations by the time you’re talking to the scariest/highest-time-value people. Maybe this is like a list of 5-10 people?
During these conversations, take note of what’s confusing you, ideas that you have, connections you or your interlocutors draw between topics, takes you find yourself repeating, etc.; you’re on the hunt for a first project.
Use the “learning by writing” method and just try to write “what you think should happen” in this area, as in: a specific actor (maybe a government agency, maybe a funder in EA) should take a specific action, with as much detail as you can, noting a bunch of ways it could go wrong and how you propose to overcome those obstacles.
Treat this proposal as a hypothesis (meaning you have some sense of what could convince you it’s wrong) and seek out tests for it, e.g. talking to more experts about it (or asking them to read your draft and give feedback), finding academic or non-academic literature that bears on the important cruxes, etc., and revise your proposal (including scrapping it) as the evidence warrants.
Try to publish something from this exercise—maybe it’s the proposal, maybe it’s “hey, it turns out lots of proposals in this domain hinge on this empirical question,” maybe it’s “here’s why I now think [topic] is a dead end.” This gathers more feedback and importantly circulates the information that you’ve thought about it a nonzero amount.
Curious what other approaches people recommend!
A technique I’ve found useful for complex decisions where you gather lots of evidence over time (for example, deciding what to do after graduation or whether to change jobs), talking to lots of different people and weighing lots of considerations, is to make a spreadsheet of all the arguments you hear, each with a score for how much it supports each option.
For example, this summer, I was considering the options of “take the Open Phil job,” “go to law school,” and “finish the master’s.” I put each of these options in columns. Then, I’d hear an argument like “being in school delays your ability to take a full-time job, which is where most of your impact will happen”; I’d add a row for this argument. I thought this was a very strong consideration, so I gave the Open Phil job 10 points, law school 0, and the master’s 3 (since it was one more year of school instead of 3 years). Later, I’d hear an argument like “legal knowledge is actually pretty useful for policy work,” which I thought was a medium-strength consideration, and I’d give these options 0, 5, and 0.
I wouldn’t take the sum of these as a final answer, but it was useful for a few reasons:
In complicated decisions, it’s hard to hold all of the arguments in your head at once. This might be part of why I noticed a strong recency bias, where the most recent handful of considerations raised to me seemed the most important. By putting them all in one place, I could feel like I was properly accounting for everything I was aware of.
Relatedly, it helped me avoid double-counting arguments. When I’d talk to a new person, and they’d give me an opinion, I could just check whether their argument was basically already in the spreadsheet; sometimes I’d bump a number from 4 to 5, or something, based on them being persuasive, but sometimes I’d just say, “Oh, right, I guess I already knew this and shouldn’t really update from it.”
I also notice a temptation to simplify the decision down to a single crux or knockdown argument, but usually cluster thinking is a better way to make these decisions, and the spreadsheet helps aggregate things such that an overall balance of evidence can carry the day.
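For anyone who wants to try this, here’s a minimal sketch of the spreadsheet as code, using the two example arguments and scores from above (the third row is a hypothetical placeholder; the column totals are a rough summary, not the final answer):

```python
options = ["Open Phil job", "Law school", "Finish master's"]

# Each row: (argument, score per option for how strongly it supports it).
arguments = [
    ("Being in school delays full-time impact", (10, 0, 3)),
    ("Legal knowledge is useful for policy work", (0, 5, 0)),
    ("(placeholder) Next argument you hear", (0, 0, 0)),
]

totals = [0] * len(options)
for name, scores in arguments:
    totals = [t + s for t, s in zip(totals, scores)]
    print(f"{name}: " + ", ".join(f"{o}={s}" for o, s in zip(options, scores)))

print("Totals: " + ", ".join(f"{o}={t}" for o, t in zip(options, totals)))
```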
levin’s Quick takes
As of August 24, 2023, I no longer endorse this post for a few reasons.
I think university groups should primarily be focused on encouraging people to learn a lot of things and becoming a venue/community for people to try to become excellent at things that the world really needs, and this will mostly look like creating exciting and welcoming environments for co-working and discussion on campus. In part this is driven by the things that I think made HAIST successful, and in part it’s driven by thinking there’s some merit to the unfavorable link to this post in “University EA Groups Need Fixing.”
I also think retreats are more costly than I realized when writing, and (relatedly) if you’re going to organize a retreat or workshop or whatever, it should probably have a theory of change and be driven by a target audience’s interests and background (e.g., “early-career people interested in AI policy who haven’t spent time in DC come and meet AI policy people in DC”) rather than general-purpose uni group bonding.
I also think “retreat” is basically the wrong word for what these events are; at least the ones I’ve run have generally had enough subject-matter-driven content that “workshop” is a more appropriate term.
That said, I do still think university groups should consider doing retreats/workshops, depending on their capacity, the specific needs of their group, and the extent to which they buy the arguments for/against prioritizing them over other programs.
Edited the top of the post to reflect this.
Was going to make a very similar comment. Also, even if “someone else in Boston could have” done the things, their labor would have funged from something else; organizer time/talent is a scarce resource, and adding to that pool is really valuable.
Yep, all sounds right to me re: not deferring too much and thinking through cause prioritization yourself, and then also that the portfolio is too broad, though these are kind of in tension.
To answer your question, I’m not sure I update that much on having changed my mind, since I think that if people had listened to me and done AISTR, it would have been a better use of their time, even for a governance career, than basically anything besides AI governance work (and of course there’s a distribution within each of those categories of how useful a given project is; lots of technical projects would’ve been more useful than the median governance project).
I just want to say I really like this style of non-judgmental anthropology and think it gives an accurate-in-my-experience range of what people are thinking and feeling in the Bay, for better and for worse.
Also: one thing that I sort of expected to come up and didn’t see, except indirectly in a few vignettes, is just how much of one’s life in the Bay Area rationalist/EA scene consists of work, of AI, and/or of EA. Part of this is just that I’ve only ever lived in the Bay for up to ~6 weeks at a time and was brought there by work, and if I lived there permanently I’d probably try to carve out some non-EA/AI time, but I think it’s a fairly common experience for people who move to the Bay to do AI safety-related things to find that it absorbs everything else unless you make a conscious effort not to. At basically all the social events I attended, >25% of the attendees worked in the same office I did, and >25% of the people at any given time were talking about AI or EA. This has not been my experience even while doing related full-time work in Boston, Oxford, and DC.
Again, part of this is that I’ve been in Berkeley for shorter stints that were more work-focused. But yeah, I think it’s not just my experience that the scene is very intense in this way, and this amplifies everything in this post in terms of how much it affects your day-to-day experience.
That is a useful post, thanks. It changes my mind somewhat about EA’s overall reputational damage, but I still think the FTX crisis exploded the self-narrative of ascendancy (both in money and influence), and the prospects have worsened for attracting allies, especially in adversarial environments like politics.
I agree that we’re now in a third wave, but I think this post is missing an essential aspect of the new wave, which is that EA’s reputation has taken a massive hit. EA doesn’t just have less money because of SBF; it has less trust and prestige, less optimism about becoming a mass movement (or even a mass-elite movement), and fewer potential allies because of SBF, Bostrom’s email/apology, and the Time article.
For that reason, I’d put the start of the third wave around November 10, 2022, when it became clear that FTX was not only experiencing a “liquidity crisis” but had misled customers, investors, and the EA community and likely committed massive fraud, and when the Future Fund team resigned. The other features of the third wave (the additional scandals and the rise in public interest in AI safety due to ChatGPT, GPT-4, the FLI letter, the CAIS statement, and so on) took a few months to emerge, but that week seems like the turning point.
I spent some time last summer looking into the “other countries” idea: if we’d like to slow down Chinese AI timelines without speeding up US timelines, what if we tried to get countries other than the US (or the UK, since DeepMind is there) to accept more STEM talent from China? TLDR:
There are very few countries at the intersection of “has enough of an AI industry and general hospitality to Chinese immigrants (e.g., low xenophobia, widely spoken languages) that they’d be interested in moving” + “doesn’t have so much of an AI industry that this would defeat the purpose” + “has sufficiently tractable immigration policy that they might actually do it.” GovAI did some survey work on this. Canada, Switzerland, France, Australia, and Singapore looked like plausible candidates (in order of how much time I spent looking into them).
Because this policy might also draw migrants who would otherwise have moved to the US, whether it’s a good idea hinges in part on whether overall timelines or the West-China gap is more important (as is often the case). I think the consensus has recently moved in the “timelines” direction (in light of very fast Western progress, and of export controls and domestic regulation likely slowing things down in China), making the policy look more appealing.
Happy to share my (now somewhat outdated) draft privately if people are curious.
I don’t know how we got to whether we should update about longtermism being “bad.” As far as I’m concerned, this is a conversation about whether Eric Schmidt counts as a longtermist by virtue of being focused on existential risk from AI.
It seems to me like you’re saying: “the vast majority of longtermists are focused on existential risks from AI; therefore, people like Eric Schmidt who are focused on existential risks from AI are accurately described as longtermists.”
When stated that simply, this is an obvious logical error (in the form of “most squares are rectangles, so this rectangle named Eric Schmidt must be a square”). I’m curious if I’m missing something about your argument.
(I began working for OP on the AI governance team in June. I’m commenting in a personal capacity based on my own observations; other team members may disagree with me.)
FWIW I really don’t think OP is in the business of preserving the status quo. People who work on AI at OP have a range of opinions on just about every issue, but I don’t think any of us feel good about the status quo! People (including non-grantees) often ask us for our thoughts about a proposed action, and we’ll share if we think some action might be counterproductive, but many things we’d consider “productive” look very different from “preserving the status quo.” For example, I would consider the CAIS statement to be pretty disruptive to the status quo and productive, and people at Open Phil were excited about it and spent a bunch of time finding additional people to sign it before it was published.
I agree that OP has an easier time recruiting than many other orgs, though perhaps a harder time than frontier labs. But at risk of self-flattery, I think the people we’ve hired would generally be hard to replace — these roles require a fairly rare combination of traits. People who have them can be huge value-adds relative to the counterfactual!
I basically disagree with this. There are areas where senior staff have strong takes, but they’ll definitely engage with the views of junior staff, and they sometimes change their minds. Also, the AI world is changing fast, and as a result our strategy has been changing fast, and there are areas full of new terrain where a new hire could really shape our strategy. (This is one way in which grantmaker capacity is a serious bottleneck.)