Safety Researcher and Scalable Alignment Team lead at DeepMind. AGI will probably be wonderful; let’s make that even more probable.
Geoffrey Irving
It’s worth distinguishing between the protests causing spread and arresting protesters causing spread. It’s quite possible more spread will be caused by the latter, and calling this spread “caused by the protests” is game-theoretically similar to “Why are you hitting yourself?” My guess is that you’re not intending to lump those into the same bucket, but it’s worth separating them out explicitly given the title.
Thanks, that’s all reasonable. Though to clarify, the game theory point isn’t about deterring police but about whether to let potential arrests and coronavirus consequences deter the protests themselves.
“Quite possible” means I am making a qualitative point about game theory but haven’t done the estimates.
Though if one did want to do estimates, that ratio isn’t enough, as spread is superlinear as a function of the size of a group arrested and put in a single room.
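To make the superlinearity point concrete, here is a minimal sketch under a toy model. Everything in it is an assumption for illustration, not an estimate: the function name `expected_new_infections`, the independence assumptions, and the values of `PREVALENCE` and `P_TRANSMIT` are all hypothetical. It compares holding 100 people in a single room against splitting them into ten rooms of 10.

```python
def expected_new_infections(n, prevalence, p_transmit):
    """Expected number of new infections in one room of n people.

    Toy model (all assumptions, chosen only for illustration):
      - each person is independently already infected with
        probability `prevalence`
      - each infected person independently infects each other person
        in the room with probability `p_transmit`
    """
    # Chance that one specific other person is infected AND transmits to me.
    p_hit = prevalence * p_transmit
    # Chance that at least one of the other n - 1 people infects me.
    p_infected = 1.0 - (1.0 - p_hit) ** (n - 1)
    # Only people who were not already infected count as new infections.
    return n * (1.0 - prevalence) * p_infected


# Hypothetical numbers, not estimates of any real situation.
PREVALENCE = 0.01
P_TRANSMIT = 0.1

one_big_room = expected_new_infections(100, PREVALENCE, P_TRANSMIT)
ten_small_rooms = 10 * expected_new_infections(10, PREVALENCE, P_TRANSMIT)
print(f"100 people in one room:  {one_big_room:.3f} expected new infections")
print(f"100 people in ten rooms: {ten_small_rooms:.3f} expected new infections")
```

In this model the expected count grows roughly like n² while the per-pair risk is small, so the same number of arrests produces far more spread when people are concentrated in one room. That is why a simple ratio of arrested to total protesters would not be enough on its own.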
As a meta-comment, I think it’s quite unhelpful that some of these “good heuristics” are written as intentional strawmen where the author doesn’t believe the assumptions hold. E.g., the author doesn’t believe that there are no insiders talking about X-risk. If you’re going to write a post about good heuristics, maybe try to make the good heuristic arguments actually good? This kind of post mostly just alienates me from wanting to engage in these discussions, which is a problem given that I’m one of the more senior AGI safety researchers.
Yes, the mocking is what bothers me. In some sense the wording of the list means that people on both sides of the question could come away feeling justified without a desire for further communication: AGI safety folk since the arguments seem quite bad, and AGI safety skeptics since they will agree that some of these heuristics can be steel-manned into a good form.
I started working on AI safety before reading Superintelligence, and despite knowing about MIRI et al., since I didn’t like their approach. So I don’t think I agree with your initial premise that the field is as much of a monoculture as you suggest.
Well, part of my job is making new people that qualify, so yes to some extent. This is true both in my current role and in past work at OpenAI (e.g., https://distill.pub/2019/safety-needs-social-scientists).
I think mostly I arrived with a different set of tools and intuitions, in particular a better sense for numerical algorithms (Paul has that too, of course) and thus intuition about how things should work with finite errors and how to build toy models that capture the finite error setting.
I do think a lot of the intuitions built by Bostrom and Yudkowsky are easy to fix into a form that works in the finite error model (though not all of it), so I don’t agree with some of the recent negativity about these classical arguments. That is, some fixing is required to make me like those arguments, but it doesn’t feel like the fixing is particularly hard.
We should also mention Stuart Russell here, since he’s certainly very aware of Bostrom and MIRI but has different views on the details and is very grounded in ML.
In the other direction, I started to think about this stuff in detail at the same time I started working with various other people and definitely learned a ton from them, so there wasn’t a long period where I had developed views but hadn’t spent months talking to Paul.
This is a great document! I agree with the conclusions, though there are a couple of factors not mentioned that seem important:
On the positive side, Google has already deployed post-quantum schemes as a test, and I believe the test was successful (https://security.googleblog.com/2016/07/experimenting-with-post-quantum.html). This was explicitly just a test and not intended as a standardization proposal, but it’s good to see that it’s practical to layer a post-quantum scheme on top of an existing scheme in a deployed system. I do think if we needed to do this quickly it would happen; the example of Google and Apple working together to get contact tracing working seems relevant.
On the negative side, there may be significant economic costs due to public key schemes deployed “at rest” which are impossible to change after the fact. This includes any encrypted communication that has been stored by an adversary across the time when we switch from pre-quantum to post-quantum, and also includes slow-to-build-up applications like PGP webs of trust which are hard to swap out quickly. I don’t think this changes the overall conclusions, since I’d expect the going-forward cost to be larger, but it’s worth mentioning.
Yep, that’s the right interpretation.
In terms of hardware, I don’t know how Chrome did it, but at least on fully capable hardware (mobile CPUs and above) you can often bitslice to make almost any circuit efficient if it has to be evaluated in parallel. So my prior is that quite general things don’t need new hardware if one is sufficiently motivated, and would want to see the detailed reasoning before believing you can’t do it with existing machines.
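As a minimal sketch of what bitslicing means (a toy illustration only, not how Chrome or any real post-quantum implementation works): bit i of each machine word holds the input for independent instance i, so one bitwise operation evaluates the same gate for every instance at once. The helper names `pack`, `unpack`, and `full_adder_bitsliced` are hypothetical, and Python's arbitrary-width integers stand in for 64-bit registers or SIMD lanes.

```python
def pack(bits):
    """Pack a list of 0/1 values into one integer, instance i in bit i."""
    word = 0
    for i, bit in enumerate(bits):
        word |= (bit & 1) << i
    return word


def unpack(word, width):
    """Inverse of pack: recover the per-instance bits as a list."""
    return [(word >> i) & 1 for i in range(width)]


def full_adder_bitsliced(a, b, c):
    """Evaluate a one-bit full adder on every packed instance at once.

    Each argument is a word whose bit i is that input for instance i.
    The circuit uses only XOR/AND/OR, so each gate costs one word-wide
    operation regardless of how many instances are packed in.
    """
    a_xor_b = a ^ b
    sum_bits = a_xor_b ^ c
    carry_bits = (a & b) | (c & a_xor_b)
    return sum_bits, carry_bits


# Eight independent instances covering every input combination.
a_bits = [0, 0, 0, 0, 1, 1, 1, 1]
b_bits = [0, 0, 1, 1, 0, 0, 1, 1]
c_bits = [0, 1, 0, 1, 0, 1, 0, 1]

s, carry = full_adder_bitsliced(pack(a_bits), pack(b_bits), pack(c_bits))
for i, (sb, cb) in enumerate(zip(unpack(s, 8), unpack(carry, 8))):
    # Cross-check each lane against ordinary integer addition.
    assert 2 * cb + sb == a_bits[i] + b_bits[i] + c_bits[i]
```

The same packing trick applies to any circuit built from bitwise gates, which is why the cost per instance drops when many evaluations are needed in parallel.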
5% probability by 2039 seems way too confident that it will take a long time: is this intended to be a calibrated estimate, or does the number have a different meaning?
I bounce off posts like this. Not sure if you’d consider me net positive or not. :)
I think that isn’t the right counterfactual, since I got into EA circles despite having only minimal (and net negative) impressions of EA-related forums. So your claim is narrowly true, but if instead the counterfactual were that my first exposure to EA was the EA Forum, then yes, I think the prominence of this kind of post would have made me substantially less likely to engage.
But fundamentally if we’re running either of these counterfactuals I think we’re already leaving a bunch of value on the table, as expressed by EricHerboso’s post about false dilemmas.
As someone who’s worked both on ML for formal verification (with security motivations in mind) and, now, directly on AGI alignment, I think most EA-aligned folk who would be good at formal verification will be close enough to being good at direct AGI alignment that it will be higher impact to work directly on AGI alignment. It’s possible this would change in the future if there are a lot more people working on theoretically-motivated prosaic AGI alignment, but I don’t think we’re there yet.
Won’t this comment get hidden soon?
If we want to include a hits-based approach to careers, but also respect people not having EA goals as the exclusive life goal, I’d have a worry that signing this pledge is incompatible with staying in a career that the EA community subsequently decides is ineffective. This could be true even if under the information known at the time of career choice the career looked like terrific expected value.
The actual wording of the pledge seems okay under this metric, as it only promises to “seek out ways to increase the impact of my career”, so maybe this is fine as long as the pledge doesn’t rise to “switch career” in all cases.
I think we might just end up in the disaster scenario where you get a bunch of karma. :)
One note: DeepMind is outside the set of typical EA orgs, but is very relevant from a longtermist perspective. It fares quite a bit better on this measure in terms of leadership: e.g., everyone above me in the hierarchy is non-white.