Geoffrey Irving

Karma: 554

Safety Researcher and Scalable Alignment Team lead at DeepMind. AGI will probably be wonderful; let’s make that even more probable.

Geoffrey Irving 1 May 2020 20:15 UTC
18 points
0 ∶ 0
on: Racial Demographics at Longtermist Organizations
One note: DeepMind is outside the set of typical EA orgs, but is very relevant from a longtermist perspective. It fares quite a bit better on this measure in terms of leadership: e.g., everyone above me in the hierarchy is non-white.

Geoffrey Irving 3 Jun 2020 21:40 UTC
1 point
0 ∶ 0
on: Will protests lead to thousands of coronavirus deaths?
It’s worth distinguishing between the protests causing spread and arresting protesters causing spread. It’s quite possible more spread will be caused by the latter, and calling this spread “caused by the protests” is game theoretically similar to “Why are you hitting yourself?” My guess is that you’re not intending to lump those into the same bucket, but it’s worth separating them out explicitly given the title.

Geoffrey Irving 4 Jun 2020 15:36 UTC
4 points
0 ∶ 0
in reply to: Larks’s comment on: Will protests lead to thousands of coronavirus deaths?
Thanks, that’s all reasonable. Though to clarify, the game theory point isn’t about deterring police but about whether to let potential arrests and coronavirus consequences deter the protests themselves.

Geoffrey Irving 4 Jun 2020 15:38 UTC
1 point
0 ∶ 0
in reply to: Pablo’s comment on: Will protests lead to thousands of coronavirus deaths?
“Quite possible” means I am making a qualitative point about game theory but haven’t done the estimates.

Though if one did want to do estimates, that ratio isn’t enough, as spread is superlinear as a function of the size of a group arrested and put in a single room.
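
To make the superlinearity concrete, here is a rough sketch under a simple assumption that is not part of the comment above: exposure scales with the number of distinct pairs of people sharing a room. The symbols $n$ (people arrested) and $k$ (number of equally sized rooms) are illustrative only.

```latex
\[
C(n) = \binom{n}{2} = \frac{n(n-1)}{2},
\qquad
k \cdot C\!\left(\frac{n}{k}\right) = \frac{n\,(n/k - 1)}{2} = \frac{n(n-k)}{2k}.
\]
```

Under this toy model, 100 people in one room give 4950 pairs, while the same 100 people split across 10 rooms give 450 pairs in total, so a simple per-person ratio would understate the effect of a single large holding room.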

Geoffrey Irving 16 Jul 2020 11:54 UTC
17 points
0 ∶ 0
on: A list of good heuristics that the case for AI X-risk fails
As a meta-comment, I think it’s quite unhelpful that some of these “good heuristics” are written as intentional strawmen where the author doesn’t believe the assumptions hold. E.g., the author doesn’t believe that there are no insiders talking about X-risk. If you’re going to write a post about good heuristics, maybe try to make the good heuristic arguments actually good? This kind of post mostly just alienates me from wanting to engage in these discussions, which is a problem given that I’m one of the more senior AGI safety researchers.

Geoffrey Irving 17 Jul 2020 16:27 UTC
4 points
0 ∶ 0
in reply to: vaniver’s comment on: A list of good heuristics that the case for AI X-risk fails
Yes, the mocking is what bothers me. In some sense the wording of the list means that people on both sides of the question could come away feeling justified without a desire for further communication: AGI safety folk since the arguments seem quite bad, and AGI safety skeptics since they will agree that some of these heuristics can be steel-manned into a good form.

Geoffrey Irving 22 Jul 2020 22:14 UTC
9 points
0 ∶ 0
on: Intellectual Diversity in AI Safety
I started working on AI safety before reading Superintelligence, and despite knowing about MIRI et al. (I didn’t like their approach). So I don’t think I agree with your initial premise that the field is as much a monoculture as you suggest.

Geoffrey Irving 22 Jul 2020 22:56 UTC
7 points
0 ∶ 0
in reply to: KR’s comment on: Intellectual Diversity in AI Safety
Well, part of my job is making new people that qualify, so yes to some extent. This is true both in my current role and in past work at OpenAI (e.g., https://distill.pub/2019/safety-needs-social-scientists).

Geoffrey Irving 22 Jul 2020 23:00 UTC
7 points
0 ∶ 0
in reply to: Buck’s comment on: Intellectual Diversity in AI Safety
I think mostly I arrived with a different set of tools and intuitions, in particular a better sense for numerical algorithms (Paul has that too, of course) and thus intuition about how things should work with finite errors and how to build toy models that capture the finite error setting.

I do think a lot of the intuitions built by Bostrom and Yudkowsky are easy to fix into a form that works in the finite error model (though not all of them), so I don’t agree with some of the recent negativity about these classical arguments. That is, some fixing is required to make me like those arguments, but it doesn’t feel like the fixing is particularly hard.

Geoffrey Irving 22 Jul 2020 23:07 UTC
10 points
0 ∶ 0
on: Intellectual Diversity in AI Safety
We should also mention Stuart Russell here, since he’s certainly very aware of Bostrom and MIRI but holds different views on the details and is very grounded in ML.

Geoffrey Irving 22 Jul 2020 23:13 UTC
4 points
0 ∶ 0
in reply to: Buck’s comment on: Intellectual Diversity in AI Safety
In the other direction, I started to think about this stuff in detail at the same time I started working with various other people and definitely learned a ton from them, so there wasn’t a long period where I had developed views but hadn’t spent months talking to Paul.

Geoffrey Irving 23 Jul 2020 8:37 UTC
13 points
0 ∶ 0
on: Assessing the impact of quantum cryptanalysis
This is a great document! I agree with the conclusions, though there are a couple of factors not mentioned which seem important:
On the positive side, Google has already deployed post-quantum schemes as a test, and I believe the test was successful (https://security.googleblog.com/2016/07/experimenting-with-post-quantum.html). This was explicitly just a test and not intended as a standardization proposal, but it’s good to see that it’s practical to layer a post-quantum scheme on top of an existing scheme in a deployed system. I do think if we needed to do this quickly it would happen; the example of Google and Apple working together to get contact tracing working seems relevant.
On the negative side, there may be significant economic costs due to public key schemes deployed “at rest” which are impossible to change after the fact. This includes any encrypted communication that has been stored by an adversary across the time when we switch from pre-quantum to post-quantum, and also includes slow-to-build-up applications like PGP webs of trust which are hard to quickly swap out. I don’t think this changes the overall conclusions, since I’d expect the going-forwards cost to be larger, but it’s worth mentioning.

Geoffrey Irving 23 Jul 2020 13:15 UTC
4 points
0 ∶ 0
in reply to: Jaime Sevilla’s comment on: Assessing the impact of quantum cryptanalysis
Yep, that’s the right interpretation.

In terms of hardware, I don’t know how Chrome did it, but at least on fully capable hardware (mobile CPUs and above) you can often bitslice to make almost any circuit efficient if it has to be evaluated in parallel. So my prior is that quite general things don’t need new hardware if one is sufficiently motivated, and would want to see the detailed reasoning before believing you can’t do it with existing machines.
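
As a minimal sketch of what bitslicing means here (not how Chrome or any real post-quantum implementation does it), the example below evaluates one small boolean circuit, a full adder, on 64 independent inputs at once. Each input bit of each instance is packed into one bit-lane of an integer, so a single bitwise operation acts on all lanes in parallel. The names `pack`, `unpack`, and `full_adder` are hypothetical helpers introduced for illustration.

```python
LANES = 64  # number of independent circuit instances evaluated together


def pack(bits):
    """Pack a list of 0/1 values into one integer, one value per bit-lane."""
    word = 0
    for lane, bit in enumerate(bits):
        word |= (bit & 1) << lane
    return word


def unpack(word, n=LANES):
    """Recover the per-lane 0/1 values from a packed integer."""
    return [(word >> lane) & 1 for lane in range(n)]


def full_adder(a, b, carry_in):
    """Bitsliced full adder: every argument holds one bit per lane.

    The same gate sequence is applied to all lanes by each bitwise op.
    """
    partial = a ^ b
    total = partial ^ carry_in
    carry_out = (a & b) | (carry_in & partial)
    return total, carry_out


# Lane i carries the three input bits taken from the binary digits of i,
# so all eight input combinations appear several times across the lanes.
a_bits = [(i >> 2) & 1 for i in range(LANES)]
b_bits = [(i >> 1) & 1 for i in range(LANES)]
c_bits = [i & 1 for i in range(LANES)]

sum_word, carry_word = full_adder(pack(a_bits), pack(b_bits), pack(c_bits))
sum_bits = unpack(sum_word)
carry_bits = unpack(carry_word)
```

The same pattern extends to larger circuits: the circuit is written once as a sequence of AND, OR, and XOR operations on machine words, and throughput comes from the word width rather than from special-purpose hardware.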

Geoffrey Irving 15 Sep 2020 16:27 UTC
7 points
0 ∶ 0
on: Quantum computing timelines
5% probability by 2039 seems way too confident that it will take a long time: is this intended to be a calibrated estimate, or does the number have a different meaning?

Geoffrey Irving 19 Apr 2021 8:24 UTC
50 points
0 ∶ 0
in reply to: Buck’s comment on: Concerns with ACE’s Recent Behavior
I bounce off posts like this. Not sure if you’d consider me net positive or not. :)

Geoffrey Irving 19 Apr 2021 21:45 UTC
33 points
0 ∶ 0
in reply to: Buck’s comment on: Concerns with ACE’s Recent Behavior
I think that isn’t the right counterfactual, since I got into EA circles despite having only minimal (and net negative) impressions of EA-related forums. So your claim is narrowly true, but if the counterfactual were instead that my first exposure to EA was the EA Forum, then I think yes, the prominence of this kind of post would have made me substantially less likely to engage.
But fundamentally if we’re running either of these counterfactuals I think we’re already leaving a bunch of value on the table, as expressed by EricHerboso’s post about false dilemmas.

Geoffrey Irving 5 Jun 2021 21:06 UTC
11 points
0 ∶ 0
on: High Impact Careers in Formal Verification: Artificial Intelligence
As someone who’s worked both on ML for formal verification with security motivations in mind, and (now) directly on AGI alignment, I think most EA-aligned folk who would be good at formal verification will be close enough to being good at direct AGI alignment that it will be higher impact to work directly on AGI alignment. It’s possible this would change in the future if there are a lot more people working on theoretically-motivated prosaic AGI alignment, but I don’t think we’re there yet.

Geoffrey Irving 20 Jul 2021 14:16 UTC
3 points
0 ∶ 0
in reply to: BrianTan’s comment on: Should EA have a career-focused “Do the most good” pledge?
Won’t this comment get hidden soon?

Geoffrey Irving 20 Jul 2021 14:21 UTC
3 points
0 ∶ 0
on: Should EA have a career-focused “Do the most good” pledge?
If we want to include a hits-based approach to careers, but also respect people not having EA goals as the exclusive life goal, I’d have a worry that signing this pledge is incompatible with staying in a career that the EA community subsequently decides is ineffective. This could be true even if under the information known at the time of career choice the career looked like terrific expected value.
The actual wording of the pledge seems okay under this metric, as it only promises to “seek out ways to increase the impact of my career”, so maybe this is fine as long as the pledge doesn’t rise to “switch career” in all cases.

Geoffrey Irving 20 Jul 2021 14:21 UTC
2 points
0 ∶ 0
in reply to: BrianTan’s comment on: Should EA have a career-focused “Do the most good” pledge?
I think we might just end up in the disaster scenario where you get a bunch of karma. :)