See this comment thread.
Sends me the message that longtermists should care less about AI risk.
I do believe that, and so does Robin. I don’t know about Paul and Adam, but I wouldn’t be surprised if they thought so too.
Though, the people in the “conversations” all support AI safety research.
Well, it’s unclear if Robin supports AI safety research, but yes, the other three of us do. This is because:
10% chance of existential risk from AI sounds like a problem of catastrophic proportions to me.
(Though I’ll note that I don’t think the 10% figure is robust.)
I’m not arguing “AI will definitely go well by default, so no one should work on it”. I’m arguing “Longtermists currently overestimate the magnitude of AI risk”.
I also broadly agree with reallyeli:
However I really think we ought to be able to discuss guesses about what’s true merely on the level of what’s true, without thinking about secondary messages being sent by some statement or another. It seems to me that if we’re unable to do so, that will make the difficult task of finding truth even more difficult.
And this really does have important implications: if you believe “non-robust 10% chance of AI accident risk”, maybe you’ll find that biosecurity, global governance, etc. are more important problems to work on. I haven’t checked myself—for me personally, it seems quite clear that AI safety is my comparative advantage—but I wouldn’t be surprised if on reflection I thought one of those areas was more important for EA to work on than AI safety.
I hope it’s sufficiently clear that I’m not trying to claim that action-relevance is *all* you should think about as a fledgling researcher?
I didn’t think that you thought that; I think the post is fine as is. I wasn’t trying to critique this post; it’s an important concept and I can certainly think of some people who I think should take this advice.
In the spirit of reversing advice, the very short case for not asking yourself whether something is action-relevant is that curiosity is an incredibly valuable tool for motivation and for directing your learning toward wherever there is something important to be learned. Justifying every question by its decision-relevance replaces curiosity with (semi-)explicit reasoning; it is not clear to me that this is a good trade (many of the best thinkers of the past seem to me to have been extremely curious, and in my experience, explicit reasoning is not very powerful).
I don’t have a strong opinion on whether the median EA interested in research should be taking this advice or its opposite.
I suspect that things like the Alignment Newsletter are causing AI safety researchers to understand and engage with each other’s work more; this seems good.
This is the goal, but it’s unclear that it’s having much of an effect. I feel like I relatively often have conversations with AI safety researchers where I mention something I highlighted in the newsletter, and the other person hasn’t heard of it, or has a very superficial / wrong understanding of it (one that I think would be corrected by reading just the summary in the newsletter).
This is very anecdotal; even if there are times when I talk to people and they do know the paper that I’m talking about because of the newsletter, I probably wouldn’t notice / learn that fact.
(In contrast, junior researchers are often more informed than I would expect, at least about the landscape, even if not the underlying reasons / arguments.)
I mostly meant phrasing it as “the model result”; the “99%-100%” figure is fine if it’s clear that it comes from a model and is not your considered belief.
I would have liked to see the models and graphs (presumably the most important part of the paper), but the images don’t load and the links to the models don’t work:
Table 1 shows the key input parameters for Model 1 (largely Denkenberger and conference poll of effective altruists)(D. Denkenberger, Cotton-Barrat, Dewey, & Li, 2019a) and Model 2 (D. Denkenberger, Cotton-Barratt, Dewey, & Li, 2019) (Sandberg inputs)(3).
However, it can be said with 99%-100% confidence that funding interventions for losing industry now is more cost effective than additional funding for AGI safety beyond the expected $3 billion.
If you don’t actually mean such confidence (which I assume you don’t because 1. it’s crazy and 2. you mention model uncertainty elsewhere), can you please not say it?
perhaps babies develop a sense of “hierarchy” which then gets applied to language, explaining how children learn languages so fast.
Though if we are to believe this paper at face value (I haven’t evaluated it), babies start learning in the womb. (The paper claims that the biases depend on which language is spoken around the pregnant mother, which suggests that it must be learned, rather than being “built-in”.)
Ah, somehow I missed that, thanks!
While I’m broadly uncertain about the overall effects of LAWs within the categories you’ve identified, and it seems plausible that LAWs are more likely to be good given those particular consequences, one major consideration for me against LAWs is that it plausibly would differentially benefit small misaligned groups such as terrorists. This is the main point of the Slaughterbots video. I don’t know how big this effect is, especially since I don’t know how much terrorism there is or how competent terrorists are; I’m just claiming that it is plausibly big enough to make a ban on LAWs desirable.
(Not sure how much of this Shah already knows.)
Not much, sadly. I don’t actually intend to learn about it in the near future, because I don’t think timelines are particularly decision-relevant to me (though they are to others, especially funders). Thanks for the links!
Tooby and Cosmides are big advocates for the “massive modularity” view—a huge amount of human cognition takes place in specialized, task-tailored modules rather than on one big, domain-general “computer”.
On my view, babies would learn a huge amount about the structure of the world simply by interacting with it (pushing over an object can in principle teach you a lot about objects, causality, intuitive physics, etc), and this leads to general patterns that we later call “inductive biases” for more complex tasks. For example, hierarchy is a very useful way to understand basically any environment we are ever in; perhaps babies develop a sense of “hierarchy” which then gets applied to language, explaining how children learn languages so fast.
From the Wikipedia page you linked, challenges to a “rationality”-based view:
1. Evolutionary theories using the idea of numerous domain-specific adaptations have produced testable predictions that have been empirically confirmed; the theory of domain-general rational thought has produced no such predictions or confirmations.
I wish they said what these predictions were. I’m not going to chase down this reference.
2. The rapidity of responses such as jealousy due to infidelity indicates a domain-specific dedicated module rather than a general, deliberate, rational calculation of consequences.
This is a good point; emotions are, for the most part, probably not learned. I’m not sure what’s going on there.
3. Reactions may occur instinctively (consistent with innate knowledge) even if a person has not learned such knowledge.
I agree that reflexes are “built-in” and not learned; reflexes are also pretty different from e.g. language. Obviously not everything our bodies do is “learned”: reflexes, breathing, digestion, etc. all fall into the “built-in” category. I don’t think this says much about what leads humans to be good at chess, language, plumbing, soccer, gardening, etc., which is what I’m more interested in.
It seems likely to me that you might need the equivalent of reflexes, breathing, digestion, etc. if you want to design a fully autonomous agent that learns without any human support whatsoever, but we will probably instead design an agent that (initially) depends on us to keep the electricity flowing, to fix any wiring issues, to keep up the Internet connection, etc. (In contrast, human parents can’t ensure that the child keeps breathing, so you need an automatic, built-in system for that.)
Top AI safety researchers are now saying that they expect AI to be safe by default, without further intervention from EA. See here and here.
“Probably safe by default” doesn’t mean “we shouldn’t work on it”. My estimate of 90% that you quote still leaves a 10% chance of catastrophe, which is worth reducing. (Though the 10% is very non-robust.) It also is my opinion before updating on other people’s views.
Those posts were published because AI Impacts was looking to have conversations with people who had safe-by-default views, so there’s a strong selection bias. If you looked for people with doom-by-default views, you could find them.
Amusingly, I use my own Amazon account so infrequently that they refuse to let me write a review. I didn’t think about GoodReads, I might do that.
I also added a bunch of comments with some other less polished thoughts on the book on the Alignment Forum version of this post.
Yes, that’s correct.
Mathematical knowledge would be knowing that the Pythagorean theorem states that a^2 + b^2 = c^2; mathematical thinking would be the ability to prove that theorem from first principles.
The way I use the phrase, mathematical thinking doesn’t only encompass proofs. It would also count as “mathematical reasoning” if you figure out that means are affected by outliers more than medians are, even if you don’t write down any formulas, equations, or proofs.
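As a toy illustration of that outlier point (my own made-up numbers, not anything from the original exchange): in the list 1, 2, 3, 4, 5 both the mean and the median are 3; swap the 5 for 1000 and the median stays at 3 while the mean jumps to 202. A minimal sketch:

```python
from statistics import mean, median

# Toy data: replacing one value with an extreme outlier
baseline = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 1000]

print(mean(baseline), median(baseline))          # 3, 3
print(mean(with_outlier), median(with_outlier))  # 202, 3  -- the mean moves a lot, the median doesn't
```

Noticing that pattern and understanding why it happens is the kind of reasoning I’d count, even without any proof.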