This makes a lot of sense to me—people usually give me a funny look if I mention AI risks. I’ll try mentioning “AI accidents” to fellow public policy students and see if that phrase is more intuitive.
It might make a lot of sense to test the risk vs. accidents framing on the next survey of AI researchers.
You will have to be sure that the researchers actually know what you mean, though. AI researchers are already concerned about accidents in the narrow sense, and they could respond positively to the idea of preventing AI accidents merely because they have something else in mind (like keeping self-driving cars safe, or something like that).
If you accept this switch to language that is appealing at the expense of precision, then eventually you will reach a motte-and-bailey situation where the motte is the broad idea of ‘preventing accidents’ and the bailey is the specific long-term AGI scheme outlined by Bostrom and MIRI. You’ll get fewer funny looks, but only by conflating and muddling the issues.
Looks like DeepMind have gone with “AI risk” and classified that into “misuse and unintended consequences”, among other ethical challenges. See https://deepmind.com/applied/deepmind-ethics-society/
What do you think the risk is of “AI accidents” simply adopting, via the euphemism treadmill, the baggage that “AI risk” carries now?
I don’t think it’s an implausible risk, but I also don’t think that it’s one that should prevent the goal of a better framing.
However, “AI accidents” doesn’t communicate the scale of a possible disaster. Something like “global catastrophic AI accidents” may be clearer. Or “permanent loss of control of a hostile AI system”.
“permanent loss of control of a hostile AI system”—This seems especially facilitative of the science-fiction interpretation to me.
I agree with the rest.
Was your “WEF” link supposed to point to something involving the World Economic Forum?
Yes. Thanks. Link has been amended. Author was in fact Luke Muehlhauser, so labeling it ‘WEF’ is only partially accurate.
So what are the risks of this verbal change?
Potentially money gets mis-allocated: just like all chemistry got rebranded as nanotech during that phase in the 2000s, if there is money in AI safety, computer science departments will rebrand their research as AI safety to prevent AI accidents. This might be a problem when governments start to try to fund AI Safety.
I personally want to be able to differentiate different types of work, between AI Safety and AGI Safety. Both are valuable: we are going to be living in a world of AI for a while, and it may cause catastrophic problems (including problems that distract us from AGI safety), and learning to mitigate them might help us with AGI Safety. I want us to be able to continue to look at both as potentially separate things, because AI Safety may not help much with AGI Safety.
I think this proposition could do with some refinement. AI safety should be a superset of both AGI safety and narrow-AI safety. Then we don’t run into problematic sentences like “AI safety may not help much with AGI Safety”, which contradicts how we currently use ‘AI safety’.
To address the point on these terms, then:
I don’t think AI safety runs the risk of being so attractive that misallocation becomes a big problem. Even if we consider risk of funding misallocation as significant, ‘AI risk’ seems like a worse term for permitting conflation of work areas.
Yes, it’s of course useful to have two different concepts for these two types of work, but this conceptual distinction doesn’t go away with a shift toward ‘AI accidents’ as the subject of these two fields. I don’t think a move toward ‘AI accidents’ awkwardly merges all AI safety work.
But if it did: The outcome we want to avoid is AGI safety getting too little funding. This outcome seems more likely in a world that makes two fields of N-AI safety and AGI safety, given the common dispreference for work on AGI safety. Overflow seems more likely in the N-AI Safety → AGI Safety direction when they are treated as the same category than when they are treated as different. It doesn’t seem beneficial for AGI safety to market the two as separate types of work.
Ultimately, though, I place more weight on the other reasons why I think it’s worth reconsidering the terms.
I agree it is worth reconsidering the terms!
The AGI/narrow AI distinction is beside the point a bit; I’m happy to drop it. I also have an AI/IA bugbear, so I’m used to not liking how things are talked about.
Part of the trouble is that we lost the marketing war before it even began: every vaguely advanced technology we currently have is marketing itself as AI, which leaves no space for anything else.
“AI accidents” brings to my mind trying to prevent robots crashing into things. 90% of robotics work could be classed as AI accident prevention, because robots are always crashing into things.
It is not just funding confusion that might be a problem. If I’m reading a journal on AI safety or taking a class on AI safety what should I expect? Robot mishaps or the alignment problem? How will we make sure the next generation of people can find the worthwhile papers/courses?
“AI risk” is not perfect, but at least it is not that.
Perhaps we should take a hard left and say that we are looking at studying Artificial Intelligence Motivation? People know that an incorrectly motivated person is bad and that figuring out how to motivate AIs might be important. It covers the alignment problem and the control problem.
Most AI doesn’t look like it has any form of motivation and is harder to rebrand as such, so it is easier to steer funding to the right people and tell people what research to read.
It doesn’t cover my IA gripe, which briefly is: AI makes people think of separate entities with their own goals/moral worth. I think we want to avoid that as much as possible. General intelligence augmentation requires its own motivation work, but work done so that the motivation of the human is inherited by the computer that is augmenting them. I think that my best hope is that AGI work might move in that direction.
I take the point. This is a potential outcome, and I see the apprehension, but I think it’s probably a low risk that users will grow to mistake robotics and hardware accidents for AI accidents (and work that mitigates each) - sufficiently low that I’d argue expected value favours the accident frame. Of course, I recognize that I’m probably invested in that direction.
I think this steers close to an older debate on AI “safety” vs “control” vs “alignment”. I wasn’t a member of that discussion so am hesitant to reenact concluded debates (I’ve found it difficult to find resources on that topic other than what I’ve linked—I’d be grateful to be directed to more). I personally disfavour ‘motivation’ on grounds of risk of anthropomorphism.
I would do some research into how well sciences that have suffered brand dilution fare.
As far as I understand it, research institutions have strong incentives to:
Find funding
Pump out tractable, digestible papers
See this kind of article for other worries along these lines.
You have to frame things with that in mind, give incentives so that people do the hard stuff and can be recognized for doing the hard stuff.
Nanotech is a classic case of a diluted research path. If you have contacts, maybe try to talk to Eric Drexler; he is interested in AI safety, so he might be interested in how AI Safety research is framed.
Fair enough, I’m not wedded to motivation (I see animals as having motivation as well, so it’s not strictly human). It doesn’t seem to cover phototaxis, which seems like the simplest thing we want to worry about, so that is an argument against motivation. I’m worded out at the moment. I’ll see if my brain thinks of anything better in a bit.
Meh, that makes it sound too narrowly technical—there are a lot of ways that advanced AI can cause problems, and they don’t all fit into the narrow paradigm of a system running into bugs/accidents that can be fixed with better programming.
This seems unnecessarily rude to me, and doesn’t engage with the post. For example, I don’t see the post anywhere characterising accidents as only coming from bugs in code, and it seems like this dismissal of the phrase ‘AI accidents’ would apply equally to ‘AI risk’.
“Rude?” Oh please, grow some thick skin.
But I didn’t say that the author is characterizing accidents as coming from bugs in code. I said that the language he is proposing has that effect. The author didn’t address this potential problem, so there was nothing for me to engage with.
It does in fact apply, since AI risk neglects important topics in AI ethics, but it doesn’t apply as strongly as it would for “AI accidents.”
Hi Kyle, I think that it’s worth us all putting effort into being friendly and polite on this forum, especially when we disagree with one another. I didn’t find your first comment informative or polite, and just commented to explain why I down-voted it.
https://www.centreforeffectivealtruism.org/blog/considering-considerateness-why-communities-of-do-gooders-should-be/
Thanks Ben, for telling us that communities of do-gooders should be considerate. But I wasn’t inconsiderate. If you linked an article titled “why communities of do-gooders should be so insanely fragile that they can’t handle a small bit of criticism” then it would be relevant.
Yeah, and now I’m commenting to explain why I downvoted yours, and how you are failing to communicate a convincing point. If you found my first comment “rude” or impolite then you’ve lost your grip on ordinary conversation. Saying “meh” is not rude, yikes.
OpenPhil’s notion of ‘accident risk’ is more general than yours in describing the scenarios that aren’t misuse risk, and their term makes perfect sense to me: https://www.openphilanthropy.org/blog/potential-risks-advanced-artificial-intelligence-philanthropic-opportunity
Yeah, well I don’t think we should only be talking about accident risk.
What do you have in mind? If these problems can’t be fixed with better programming, how will they be fixed?
Better decision theory, which is much of what MIRI does, and better guiding philosophy.
I agree that more of both is needed. Both need to be instantiated in actual code, though. And both are useless if researchers don’t care to implement them.
I admit I would benefit from some clarification on your point—are you arguing that the article assumes a bug-free AI won’t cause AI accidents? Is it the case that this arose from Amodei et al.’s definition?: “unintended and harmful behavior that may emerge from poor design of real-world AI systems”. Poor design of real-world AI systems isn’t limited to bugs in code, but I can see why this might have caused confusion.
I’m not—I’m saying that when you phrase it as accidents, it creates flawed perceptions about the nature and scope of the problem. An accident sounds like a one-time event that a system causes in the course of its performance; AI risk is about systems whose performance itself is fundamentally destructive. Accidents are aberrations from normal system behavior; the core idea of AI risk is that any known specification of system behavior, when followed comprehensively by advanced AI, is not going to work.