Working on behavioral research, AI governance, compute governance. Previously an IAPS Fellow, Brown.
Tao
I wrote some criticism in this comment. Mainly, I argue that
(1) A pause could be undesirable: it could be net-negative in expectation (with high variance depending on implementation specifics), and PauseAI should take this concern more seriously.
(2) Fighting doesn’t necessarily bring you closer to winning. PauseAI’s approach *could* be counterproductive even for the aim of achieving a pause, whether or not a pause is desirable. From my comment:
“Although the analogy of war is compelling and lends itself well to your post’s argument, in politics fighting often does not get one closer to winning. Putting up a bad fight may be worse than putting up no fight at all. If the goal is winning (instead of just putting up a fight), then taking criticism of your fighting style seriously should be paramount.”
This is a valuable post, but I don’t think it engages with much of the concern about PauseAI advocacy. I have two main reasons why I broadly disagree:
1. Pausing AI development could be the wrong move, even if you don’t care about benefits and only care about risks.
AI safety is an area with a lot of uncertainty. Importantly, this uncertainty isn’t merely about the nature of the risks but also about the impact of potential interventions.
Of all interventions, pausing AI development is, some think, a particularly risky one. Potential dangers include:
Falling behind China
Creating a compute overhang with subsequent rapid catch-up development
Polarizing the AI discourse before risks are clearer (and discrediting concerned AI experts), turning AI into a politically intractable problem, and
Causing AI lab regulatory flight to countries with lower state capacity, less robust democracies, fewer safety guardrails, and a lesser ability to mandate security standards to prevent model exfiltration
People at PauseAI are probably less concerned about the above (or more concerned about model autonomy, catastrophic risks, and short timelines).
Although you may have felt that you did your “scouting” work and arrived at a position worth defending as a warrior, others’ comparably thorough scouting work has led them to a different position. Their opposition to your warrior-like advocacy, then, may not come (as your post suggests) from a purist notion that we should preserve elite epistemics at the cost of impact, but from a fundamental disagreement about the desirability of the consequences of a pause (or other policies), or of advocacy for a pause.
If our shared goal is the clichéd securing-benefits-and-minimizing-risks, or even just minimizing risks, one should be open to thoughtful colleagues’ input that one’s actions may be counterproductive to that end-goal.
2. Fighting does not necessarily get one closer to winning.
Although the analogy of war is compelling and lends itself well to your post’s argument, in politics fighting often does not get one closer to winning. Putting up a bad fight may be worse than putting up no fight at all. If the goal is winning (instead of just putting up a fight), then taking criticism of your fighting style seriously should be paramount.
I still concede that a lot of people dismiss PauseAI merely because they see it as cringe. But I don’t think this is the core of most thoughtful people’s criticism.
To be very clear, I’m not saying that PauseAI people are wrong, or that a pause will always be undesirable, or that they are using the wrong methods. I am responding to
(1) the feeling that this post dismissed criticism of PauseAI without engaging with object-level arguments, and the feeling that this post wrongly ascribed outside criticism to epistemic purism and a reluctance to “do the dirty work,” and
(2) the idea that the scout-work is “done” already and an AI pause is currently desirable. (I’m not sure I’m right here at all, but I have reasons [above] to think that PauseAI shouldn’t be so sure either.)
Sorry for not editing this better; I wanted to write it quickly. I welcome people’s responses, though I may not be able to respond to them!
Mostly the meat-eater problem, but also cost-effectiveness analyses, and higher neglectedness on priors.
A few quick ideas:
1. On the methods side, I find the potential use of LLMs/AI as research participants in psychology studies interesting (not necessarily related to safety). This may sound ridiculous at first, but the existing studies are worth a look.
From my post on studying AI-nuclear integration with methods from psychology: “[Using] LLMs as participants in a survey experiment, something that is seeing growing interest in the social sciences (see Manning, Zhu, & Horton, 2024; Argyle et al., 2023; Dillion et al., 2023; Grossmann et al., 2023).” (See the sketch after this list for what that setup can look like.)
2. You may be interested or get good ideas from the Large Language Model Psychology research agenda (safety-focused). I haven’t gone into it so this is not an endorsement.
3. Then there are comparative analyses of human and LLM behavior. E.g., the Human vs. Machine paper (Lamparth, 2024) compares human and LLM decision-making in a wargame. I do something similar with a nuclear decision-making simulation, but it’s not in paper/preprint form yet.
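To make the “LLM as survey participant” idea concrete, here is a minimal sketch of how one might collect simulated responses. It assumes the OpenAI Python client; the model name, persona, and survey item are illustrative placeholders I made up, not the design used in any of the papers cited above.

```python
# Minimal sketch: treating an LLM as a survey participant.
# Assumptions: the OpenAI Python client (openai >= 1.0) is installed and
# OPENAI_API_KEY is set; the persona and survey item are hypothetical.
from openai import OpenAI

client = OpenAI()

PERSONA = "You are a 45-year-old policy analyst answering a survey honestly."
ITEM = (
    "On a scale from 1 (strongly oppose) to 7 (strongly support), how much do "
    "you support delegating launch-authorization recommendations to an AI "
    "decision-support system? Reply with a single number."
)

def ask_once(temperature: float = 1.0) -> str:
    """Collect one 'response' from the simulated participant."""
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat model would do here
        temperature=temperature,
        messages=[
            {"role": "system", "content": PERSONA},
            {"role": "user", "content": ITEM},
        ],
    )
    return reply.choices[0].message.content.strip()

# Sampling many completions, and varying the persona or the framing of the
# item across "conditions", is what makes this resemble a survey experiment.
responses = [ask_once() for _ in range(20)]
print(responses)
```

In the actual studies, the interesting part is comparing these simulated response distributions across experimental conditions (or against human samples), rather than any single completion.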
Please feel free to add comments or ask questions, even if you think your question is probably already answered in the full manuscript. I have no problem answering or pointing you to the answer.
AI-nuclear integration: evidence of automation bias from humans and LLMs [research summary]
Thanks Clare! Your comment was super informative and thorough.
One thing that I would lightly dispute is that 360 feedback is easily gameable. I (anecdotally) feel like people with malevolent traits (“psychopaths” here) often have trouble remaining “undiscovered” and so have to constantly move or change social circles.
Of course, almost by definition I wouldn’t know any psychopaths that are still undiscovered. But 360 feedback could help discover the “discoverable” subgroup, since the test is not easily gameable by them.
Any thoughts?
Thank you for this! These are great resources; I’ll dive into them when I have the time.
Glad it was useful!
I don’t feel qualified to give advice on teaching a language to small kids, although I do have a few thoughts. Please take them with a grain of salt, as I’ve never done this.
I’m assuming you mean your kids, not kids in a classroom? If this is the case:
It seems like interactive use of the language is important for kids, so I’m skeptical of the “having them watch cartoons in the TL instead of the NL” approach, unless they already have a solid understanding of the language.
Do you speak this language yourself? If so, you could try to increasingly only speak this language with your kid. E.g., my cousins grew up strictly speaking French with their mother, German with their father, and English in school. Now they’re fully fluent in all 3.
If you don’t speak the language yourself, I’d bet it’ll be much harder to make it happen. You could send them to private lessons (depending on age and disposition). You could also try to hire a caretaker/nanny (again depending on age) that speaks the TL and is willing to speak with the kid in that language. I knew a couple of people who spoke decent Spanish because they had a Spanish-speaking nanny growing up.
That’s all I could think of. That said, I think a quick Google/YouTube search might uncover much more valuable guidance on this!
Thanks for your comment! I also think EAs sometimes fall into the trap of not considering their own interests and things that make them happy as much as they should. The importance of personal interest and enjoyment in language learning is hard to overemphasize.
Effective language-learning for effective altruists
Thank you for researching this; this is incredibly valuable.
I noticed that the OUS-Impartial Beneficence subscale correlates well with expansive altruism and effectiveness focus. Maybe I skipped over it, but did you include in your results whether this OUS subscale had higher predictive power than your two new factors?
Thank you for writing this. This is a really useful insight that I’ll be thinking more about as I engage more with IIDM — I have definitely focused disproportionately more on adding good processes than eliminating bad ones. This could be because I’m not very familiar in general with common processes within institutions, as my studies have really only focused on individual decision-making/rationality so far.
Below are a few quick thoughts on that.
Following your Putin-EU example, I wonder how much of Russia’s nimbleness is enabled by one man having so much decision-making power, which might both enable quick decision-making as well as democratic backsliding.
Although you could argue that quicker experimentation might pay off in the long run, I would worry that modern states having too few checks and balances might increase the risk of solo actors making catastrophically bad decisions. At the same time, I worry about vast bureaucracies failing to make important decisions, and that being equally catastrophic.
I agree, as you say, that the need for “caution and consensus vs. experimentation and accountability” depends on the institution and the decision to be made. I’m also not aware of attempts to describe when exactly you would want more of the former vs. the latter.
If you (or others) have good resources on eliminating bad processes/bureaucracy, I’d love to see them.
Thanks. None of the internal links transferred from the Google Doc, so I missed a few when manually creating them here.
List of Interventions for Improving Institutional Decision-Making
I’m glad you found it valuable!
Thank you for the update! I will add this to the post as soon as I can. I find it very exciting that so many constitutions have included legal protections for future generations. Even if it hasn’t influenced policy yet, it could serve as a means of justifying longtermist policy once enough momentum has built up.
Edit: changes made.
Thanks for the feedback! I don’t have access to my computer right now but I’ll try to fix that when I do.
This is missing the point of my 2nd argument. It sure sounds better to “fight and lose than roll over and die.”
But I’m saying that “fighting” in the way that PauseAI is “fighting” could make it more likely that you lose.
Not saying “fighting” in general will have this effect. Or that this won’t ever change. Or that I’m confident about this. Just saying: take criticism seriously, acknowledge the uncertainty, don’t rush into action just because you want to do something.
Unrelated to my argument: I’m not sure what you mean by “high probability,” but I’d take a combination of these views as a reasonable prior: XPT.