Most of my stuff (even the stuff of interest to EAs) can be found on LessWrong: https://www.lesswrong.com/users/daniel-kokotajlo
kokotajlod
Taboo “Outside View”
Simple charitable donation app idea
Evidence on good forecasting practices from the Good Judgment Project: an accompanying blog post
Thanks for this thoughtful and detailed deep dive!
I think it misses the main cruxes though. Yes, some people (Drexler and young Yudkowsky) thought that ordinary human science would get us all the way to atomically precise manufacturing in our lifetimes. For the reasons you mention, that seems probably wrong.
But the question I’m interested in is whether a million superintelligences could figure it out in a few years or less, since that’s the situation we’ll actually be facing. (If it takes them, say, 10 years or longer, then they’ll probably have better ways of taking over the world.) To answer that question, we need to ask questions like:
(1) Is it even in principle possible? Is there some configuration of atoms that would be a general-purpose nanofactory, capable of making more of itself, that uses diamondoid instead of some other material? Or is there no such configuration?
Seems like the answer is “Probably, though not necessarily; it might turn out that the obstacles discussed are truly insurmountable. Maybe 80% credence.” If we remove the diamondoid criterion and allow it to be built of any material (but still require it to be dramatically more impressive and general-purpose/programmable than ordinary life forms), then I feel like the credence shoots up to 95%, the remaining 5% being model uncertainty.
(2) Is it practical for an entire galactic empire of superintelligences to build in a million years? (Conditional on 1, I think the answer to 2 is ‘of course.’)
(3) OK, conditional on the above, the question becomes what the limiting factor is—is it genius insights about clever binding processes or mini-robo-arm-designs exploiting quantum physics to solve the stickiness problems mentioned in this post? Is it mucking around in a laboratory performing experiments to collect data to refine our simulations? Is it compute & sim-algorithms, to run the simulations and predict what designs should in theory work? Genius insights will probably be pretty cheap to come by for a million superintelligences. I’m torn about whether the main constraint will be empirical data to fit the simulations, or compute to run the simulations.
(4) What’s our credence distribution over orders of magnitude of the following inputs: genius, experiments, and compute, in each case assuming that it’s the bottleneck? Not sure how to think about genius, but it’s OK because I don’t think it’ll be the bottleneck. Our distributions should range over many orders of magnitude, and should update on the observation that however many experiments and simulations humans have done so far evidently weren’t close to being enough.
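Here’s a minimal sketch of that last update step, with a log-uniform prior over “experiment-equivalents” and made-up illustrative numbers (the 10^8 “done so far” figure and the grid bounds are assumptions of mine, not estimates):

```python
import numpy as np

# Toy model: log-uniform prior over how many "experiment-equivalents"
# a working nanofactory design requires, spanning many orders of
# magnitude (10^4 to 10^16 -- illustrative bounds, not estimates).
oom_grid = np.linspace(4, 16, 1201)   # log10(required experiments)
prior = np.ones_like(oom_grid)
prior /= prior.sum()

# Observation: humanity has already run ~10^8 experiment-equivalents
# (a made-up number) and that evidently wasn't enough, so zero out
# the mass below that point and renormalize.
done_so_far = 8.0
posterior = np.where(oom_grid > done_so_far, prior, 0.0)
posterior /= posterior.sum()

# Posterior probability the requirement is within reach of, say,
# 10^12 experiment-equivalents (roughly what a million
# superintelligences running automated labs might manage).
print(posterior[oom_grid <= 12.0].sum())   # -> 0.5 on this toy prior
```

The point is just that conditioning on “not enough yet” truncates the low end of the distribution but can leave plenty of mass within superintelligence-accessible range.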
I wildly guess something like 50% that we’ll see some sort of super powerful nanofactory-like thing. I’m more like 5% that it consists of diamondoid in particular: there are so many different material designs, and even if diamondoid is viable and in some sense theoretically the best, the theoretical best probably takes several OOMs more inputs to achieve than something else which is merely good enough.
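Making the arithmetic behind those headline numbers explicit (the “reachable given possible” factors below are back-of-envelope fill-ins chosen to roughly reproduce my stated 50% and 5%, not independently derived figures):

```python
# Back-of-envelope decomposition of the headline credences above.
# The "reachable" conditionals are fill-ins chosen to roughly
# reproduce the stated 50% and 5% totals, not derived numbers.

p_possible_any_material = 0.95   # stated: some material works in principle
p_possible_diamondoid   = 0.80   # stated: diamondoid works in principle

# Fill-in: chance the required experiments and compute land within
# reach of a million superintelligences in a few years, given possibility.
p_reachable_any  = 0.53
p_reachable_diam = 0.065   # theoretical-best designs cost several extra OOMs

print(p_possible_any_material * p_reachable_any)   # ~0.50
print(p_possible_diamondoid * p_reachable_diam)    # ~0.05
```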
DeepMind: Generally capable agents emerge from open-ended play
Against GDP as a metric for timelines and takeoff speeds
I feel like it was only a year or so ago that the standard critique of the AI safety community was that they were too abstract, too theoretical, that they lacked hands-on experience, lacked contact with empirical reality, etc...
Thanks for this! I think my own experience has led to different lessons in some cases (e.g. I think I should have prioritised personal fit less and engaged less with people outside the EA community), but I nevertheless very much approve of this sort of public reflection.
Vignettes Workshop (AI Impacts)
EA has a high deference culture? Compared to what other cultures? Idk but I feel like the difference between EA and other groups of people I’ve been in (grad students, City Year people, law students...) may not be that EAs defer more on average but rather that they are much more likely to explicitly flag when they are doing so. In EA the default expectation is that you do your own thinking and back up your decisions and claims with evidence*, and deference is a legitimate source of evidence so people cite it. But in other communities people would just say “I think X” or “I’m doing X” and not bother to explain why (and perhaps not even know why, because they didn’t really think that much about it).
*Other communities have this norm too, I think, but not to the same extent.
He more recently mentioned that he noticed “people continuously vanishing higher into the tower,” that is, focusing on more abstract and harder-to-evaluate issues, and that very few people have done the opposite. One commenter, Ben Weinstein-Raun, suggested several reasons, among them that longer-loop work is more visible and higher status.
I disagree that longer-loop work is more visible and higher status, I think the opposite is true. In AI, agent foundations researchers are less visible and lower status than prosaic AI alignment researchers, who are less visible and lower status than capabilities researchers. In my own life, I got a huge boost of status & visibility when I did less agent foundationsy stuff and more forecasting stuff (timelines, takeoff speeds, predicting ML benchmarks, etc.).
What 2026 looks like (Daniel’s median future)
Tiny Probabilities of Vast Utilities: Defusing the Initial Worry and Steelmanning the Problem
Tiny Probabilities of Vast Utilities: Concluding Arguments
[Question] How can I bet on short timelines?
FWIW I think that it’s pretty likely that AGI etc. will happen within 10 years absent strong regulation, and moreover that if it doesn’t, the ‘crying wolf’ effect will be relatively minor, enough that even if I had 20-year medians I wouldn’t worry about it compared to the benefits.
Beat me to it & said it better than I could.
My now-obsolete draft comment was going to say:
It seems to me that between about 2004 and 2014, Yudkowsky was the best person in the world to listen to on the subject of AGI and AI risks. That is, deferring to Yudkowsky would have been a better choice than deferring to literally anyone else in the world. Moreover, after about 2014 Yudkowsky would probably have been in the top 10; if you are going to choose 10 people to split your deference between (which I do not recommend, I recommend thinking for oneself), Yudkowsky should be one of those people and had you dropped Yudkowsky from the list in 2014 you would have missed out on some important stuff. Would you agree with this?
On the positive side, I’d be interested to see a top ten list from you of people you think should be deferred to as much or more than Yudkowsky on matters of AGI and AI risks.*
*What do I mean by this? Idk, here’s a partial operationalization: Timelines, takeoff speeds, technical AI alignment, and p(doom).
[ETA: lest people write me off as a Yudkowsky fanboy, I wish to emphasize that I too think people are overindexing on Yudkowsky’s views, I too think there are a bunch of people who defer to him too much, I too think he is often overconfident, wrong about various things, etc.]
[ETA: OK, I guess I think Bostrom probably was actually slightly better than Yudkowsky even on 20-year timespan.]
[ETA: I wish to reemphasize, but more strongly, that Yudkowsky seems pretty overconfident not just now but historically. Anyone deferring to him should keep this in mind; maybe directly update towards his credences but don’t adopt his credences. E.g. think “we’re probably doomed” but not “99% chance of doom.” Also, Yudkowsky doesn’t seem to be listening to others and understanding their positions well. So his criticisms of other views should be listened to but not deferred to, IMO.]
Hi Ajeya! I’m a huge fan of your timelines report; it’s by far the best thing out there on the topic as far as I know. Whenever people ask me to explain my timelines, I say “It’s like Ajeya’s, except...”
My question is: how important do you think it is for someone like me to do timelines research, compared to other kinds of research (e.g. takeoff speeds, alignment, acausal trade...)?
I sometimes think that even if I managed to convince everyone to shift from median 2050 to median 2032 (an obviously unlikely scenario!), it still wouldn’t matter much, because people’s decisions about what to work on are mostly driven by considerations of tractability, neglectedness, personal fit, importance, etc., and even that timelines difference would be a relatively minor consideration. On the other hand, intuitively it does feel like the difference between 2050 and 2032 is a big deal, and that people who believe one when the other is true will probably make big strategic mistakes.

Bonus question: Murphyjitsu: Conditional on TAI being built in 2025, what happened? (i.e. how was it built, what parts of your model were wrong, what do the next 5 years look like, what do the 5 years after 2025 look like?)
I haven’t considered all of the inputs to Cotra’s model, most notably the 2020 training computation requirements distribution. Without forming a view on that, I can’t really say that ~53% represents my overall view.
Sorry to bang on about this again and again, but it’s important to repeat for the benefit of those who don’t know: the training computation requirements distribution is by far the biggest cruxy input to the whole thing; it’s the input that matters most to the bottom line and is the most subjective. If you hold fixed everything else in Ajeya’s model but change this distribution to something I think is reasonable, you get something like 2030 as the median (!!!). Meanwhile, if you change the distribution to be even more extreme than Ajeya picked, you can push timelines arbitrarily far into the future.
Investigating this variable seems to have been beyond scope for the XPT forecasters, so this whole exercise is IMO merely that—a nice exercise, to practice for the real deal, which is when you think about the compute requirements distribution.
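To see why this one input dominates, here’s a toy sketch, emphatically not Ajeya’s actual model: treat the requirement as a lognormal over OOMs of training FLOP, grow available effective compute at a fixed rate, and read off the year the cumulative probability crosses 50%. The growth rate, 2020 anchor, and spread below are illustrative assumptions of mine.

```python
import numpy as np
from scipy.stats import norm

# Toy bio-anchors-style model, NOT Cotra's actual report: available
# effective training compute (in log10 FLOP) grows linearly, and TAI
# arrives once it exceeds an uncertain requirement.
start_year, start_oom = 2020, 24.0   # ~1e24 FLOP for a 2020 frontier run
growth = 0.5                         # OOMs of effective compute per year

def median_tai_year(req_median_oom, req_sigma_oom):
    years = np.arange(start_year, 2201)
    available = start_oom + growth * (years - start_year)
    # P(TAI by year y) = P(requirement <= compute available in year y)
    cdf = norm.cdf(available, loc=req_median_oom, scale=req_sigma_oom)
    hit = years[cdf >= 0.5]
    return hit[0] if len(hit) else None

# Same spread, requirement median shifted by six OOMs:
print(median_tai_year(36.0, 4.0))   # -> 2044
print(median_tai_year(30.0, 4.0))   # -> 2032
```

On this toy setup, shifting the requirement’s median down six OOMs moves the overall median from the mid-2040s to the early 2030s, and shifting it up pushes the median out indefinitely; no other single input has that kind of leverage.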
My friend Cullen once said something like “It’s good for the world to have at least one group of people committed to doing good as such.” At first I was like “Why?” but now I think I understand.
In war, it’s generally a good idea to hold back some of your force as reserves. That way as the battle progresses and you get more information about which parts are doing well and poorly, you can send in the reserves to wherever they are needed most.
In the War On Bad Things, EAs are the reserves. They are much more capable of pivoting to different cause areas, projects, etc. as needed, and they are explicitly trying to go where they are most needed (as opposed to most other groups, which are doing the equivalent of trying to take hill X or hold line Z or whatever).