I lead a small think tank dedicated to accelerating the pace of scientific advancement by improving the conditions of science funding. As well, I’m a senior advisor to the Social Science Research Council. Prior to these roles, I spent some 9 years at Arnold Ventures (formerly the Laura and John Arnold Foundation) as VP of Research.
Stuart Buck
I guess the overall point for me is that if the goal is just to speculate about what much more capable and accurate LLMs might enable, then what’s the point of doing a small, uncontrolled, empirical study demonstrating that current LLMs are not, in fact, that kind of risk?
Just saw this piece, which is strongly worded but seems defensible: https://1a3orn.com/sub/essays-propaganda-or-science.html
Thanks for your thoughtful replies!
Do you think that future LLMs will enable bioterrorists to a greater degree than traditional tools like search engines or print text?
I can imagine future AIs that might do this, but LLMs (strictly speaking) are just outputting strings of text. As I said in another comment: If a bioterrorist is already capable of understanding and actually carrying out the detailed instructions in an article like this, then I’m not sure that an LLM would add that much to his capacities. Conversely, handing a detailed set of instructions like that to the average person poses virtually no risk, because they wouldn’t have the knowledge or ability to actually do anything with it.
As well, if a wannabe terrorist actually wants to do harm, there are much easier and simpler ways that are already widely discoverable: 1) Make toxic chloramine or chlorine gas by mixing bleach with ammonia or with vinegar; 2) Make sarin gas via instructions that were easily findable in this 1995 article:
How easy is it to make sarin, the nerve gas that Japanese authorities believe was used to kill eight and injure thousands in the Tokyo subways during the Monday-morning rush hour?
“Wait a minute, I’ll look it up,” University of Toronto chemistry professor Ronald Kluger said over the phone. This was followed by the sound of pages flipping as he skimmed through the Merck Index, the bible of chemical preparations. Five seconds later, Kluger announced, “Here it is,” and proceeded to read not only the chemical formula but also the references that describe the step-by-step preparation of sarin, a gas that cripples the nervous system and can kill in minutes.
“This stuff is so trivial and so open,” he said of both the theory and the procedure required to make a substance so potent that less than a milligram can kill you.
And so forth. Put another way, if we aren’t already seeing attacks like that on a daily basis, it isn’t for lack of GPT-5--it’s because hardly anyone actually wants to carry out such attacks.
If yes, do you think the difference will be significant enough to warrant regulations that incentivize developers of future models to only release them once properly safeguarded (or not at all)?
I guess it depends on what we mean by regulation. If we’re talking about liability and related insurance, I would need to see a much more detailed argument drawing on 50+ years of the law and economics literature. For example, why would we hold AI companies liable when we don’t hold Google or the NIH (or my wifi provider, for that matter) liable for the fact that right now, it is trivially easy to look up the entire genetic sequences for smallpox and Ebola?
Do you think that there are specific areas of knowledge around engineering and releasing exponentially growing biology that should be restricted?
If we are worried about someone releasing smallpox and the like, or genetically engineering something new, LLMs are much less of an issue than the fact that so much information (e.g., the smallpox sequence, the CRISPR techniques, etc.) is already out there.
“future model could successfully walk an unskilled person through the process without the person needing to understand it at all.”
Seems very doubtful. Could an unskilled person be “walked through” a process like this one (https://www.nature.com/articles/nprot.2007.135) just by slightly more elaborate instructions? The real barriers to something as complex as synthesizing a virus seem to be 1) lack of training/skill/tacit knowledge, and 2) lack of equipment or supplies. Detailed instructions are already out there.
Also, if you’re worried about low-IQ people being able to create mayhem, I think the least of our worries should be that they’d get their hands on a detailed protocol for creating a virus or anything similar (see, e.g., https://www.nature.com/articles/nprot.2007.135) -- hardly anyone would be able to understand it anyway, let alone have the real-world skills or equipment to do any of it.
What about the majority of my comment showing that by the paper’s own account, LLMs cannot (at least not yet) walk anyone through a recipe for mayhem, unless they are already enough of an expert to know when to discard hallucinatory answers, reprompt the LLM, etc.?
I’m not sure what to make of this kind of paper. They specifically trained the model on openly available sources that you can easily google, and the paper notes that “there is sufficient information in online resources and in scientific publications to map out several feasible ways to obtain infectious 1918 influenza.”
So, all of this is already openly available in numerous ways. What do LLMs add compared to Google?
Not clear: When participants “failed to access information key to navigating a particular path, we directly tested the Spicy model to determine whether it is capable of generating the information.” In other words, the participants did end up getting stumped at various points, but the researchers would jump in to see if the LLM would return a good answer IF the prompter already knew the answer and what exactly to ask for.
Then, they note that “the inability of current models to accurately provide specific citations and scientific facts and their tendency to ‘hallucinate’ caused participants to waste considerable time . . . ” I’ll bet. LLMs are notoriously bad at this sort of thing, at least currently.
Bottom line in their own words: “According to our own tests, the Spicy model can skillfully walk a user along the most accessible path in just 30 minutes if that user can recognize and ignore inaccurate responses.”
What an “if”! The LLM can tell a user all this harmful info … IF the user is already enough of an expert to know the answer!
Bottom line for me: Seems mostly to be scaremongering, and the paper concludes with a completely unsupported policy recommendation about legal liability. Seems odd to talk about legal liability for an inefficient, expensive, hallucinatory way to access information freely available via Google and textbooks.
Fair point, and I rephrased to be clearer about what I meant to say—that the scenario here is mostly science fiction (it’s not as if GPT-5 is turned on, diamondoid bacteria appear out of nowhere, and we all drop dead).
Not really a surprise that this story is basically science fiction.
I don’t have a dog in this fight, I’m just intrigued that people have the time to send that many WhatsApp messages about food.
I wouldn’t support a ban on all these practices, but surely they are a “risk”?
What if you just broke up with a coworker in a small office? What if you live together, and they are the type of roommate who drives everyone up the wall by failing to do the dishes on time, etc.? What if the roommate/partner needs to be fired for non-performance, and now you have a tremendously awkward situation at home?
It’s hard enough to execute at work on a high level, and see eye-to-eye with co-workers about strategy, workplace communication, and everything else. Add in romance and roommates (in a small office), and it obviously can lead to even more complicated and difficult scenarios.
Consider why it’s the case that every few months there is a long article about some organization that, in the best case, looks like it was run with the professionalism and adult supervision of a frat house.
Much thanks! Also, the best way to find a typo is to hit “send” or “publish”. :)
Fair point. I’m just thinking of grants like the Reproducibility Project in Psychology. At the time I asked for funding for this (late 2012-early 2013), I absolutely did not foresee that it would be published in Science in 2015, that Science would ask me to write an accompanying editorial, or that it would become one of the standard citations (with nearly 8,000 citations on Google Scholar). At the time, I would only have predicted a much more normal-size impact, and mostly I was just relying on gut intuition that “this seems really promising.”
Metascience Since 2012: A Personal History
I generally agree, but I think that we are nowhere near being able to say, “The risk of future climate catastrophe was previously 29.5 percent, but thanks to my organization’s work, that risk has been reduced to 29.4 percent, thus justifying the money spent.” The whole idea of making grants on such a slender basis of unprovable speculation is radically different from the traditional EA approach of demanding multiple RCTs. Might be a great idea, but still a totally different thing. Shouldn’t even be mentioned in the same breath.
Because the question is impossible to answer.
First, by definition, we have no actual evidence about outcomes in the long-term future—it is not as if we can run RCTs where we run Earth 1 for the next 1,000 years with one intervention and Earth 2 with a different intervention. Second, even where experts stand behind short-term treatments and swear that they can observe the outcomes happening right in front of them (everything from psychology to education to medicine), there are many cases where the experts are wrong—even many cases where we do harm while thinking we do good (see Prasad and Cifu’s book Ending Medical Reversal).
Given the lack of evidentiary feedback as well as any solid basis for considering people to be “experts” in the first place, there is a high likelihood that anything we think benefits the long-term future might do nothing or actually make things worse.
The main way to justify long-termist work (especially on AGI) is to claim that there’s a risk of everyone dying (leading to astronomically huge costs), and then claim that there’s a non-zero positive probability of affecting that outcome. There will never be any evidentiary confirmation of either claim, but you can justify any grant to anyone for anything by adjusting the estimated probabilities as needed.
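To make that concrete, here is a toy expected-value calculation in Python. Every number in it is hypothetical, chosen only to illustrate how astronomical stakes interact with tiny, unverifiable probabilities:

# Toy expected-value sketch (all numbers hypothetical and unverifiable).
future_lives_at_stake = 1e35   # assumed size of the long-term future
p_catastrophe = 0.01           # assumed baseline probability of extinction
p_grant_helps = 1e-9           # assumed chance this particular grant matters at all
risk_reduction = 0.001         # assumed fractional reduction in risk if it does

expected_lives_saved = (future_lives_at_stake * p_catastrophe
                        * p_grant_helps * risk_reduction)
print(f"Expected lives saved: {expected_lives_saved:.1e}")  # ~1.0e+21

Nudge p_grant_helps up or down by a few orders of magnitude and the “expected lives saved” still swamps any conceivable grant budget, which is exactly the problem: there is no evidence that could ever discipline those inputs.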
It’s all a bit intuitive, but my heuristics were basically: Figure out the general issues that seem worth addressing; find talented people who are already trying to address those issues (perhaps in their spare time) and whose main constraint is capital; and give them more capital (e.g., time and employees) to do even better things (which they will often come up with on their own).
I’ve been a grantmaker (at Arnold Ventures, a $2 billion philanthropy), and I couldn’t agree more. Those kinds of questions are good if the aim is to reward and positively select for people who are good at bullshitting. And I also worry about a broader paradox—sometimes the highest impact comes from people who weren’t thinking about impact, had no idea where their plans would lead, and serendipitously stumbled into something like penicillin while doing something else.
Just a note: this post could have opposite advice for people from guess culture rather than ask culture. See https://ask.metafilter.com/55153/Whats-the-middle-ground-between-FU-and-Welcome#830421
I.e., someone from ask culture might need to be warned not to bother people so much. Someone from guess culture might need to be told that it is ok to reach out to people once in a while.
In a way, the sarin story confirms what I’ve been trying to say: a list of instructions, no matter how complete, does not mean that people can literally execute the instructions in the real world. Indeed, having tried to teach my kids to cook, I can attest that even making something as simple as scrambled eggs requires lots of experience and tacit knowledge.