Nice!
‘Predicted’ in the title is pretty clickbaity/misleading given that the market was created and driven by insider traders. ‘Knew About’ or ‘Leaked Information About’ seem much more accurate.
Otherwise, I found this very interesting. I hadn’t heard of this market before, and appreciate the analysis of what seems like it might be a very important case study, both for how to handle the leaking of embargoed or otherwise sensitive information and for what to do about insider trading.
I like that you can interact with this. It makes understanding models so much easier.
Playing with the calculator, I see that the result is driven to a surprising degree by the likelihood that “Compute needed by AGI, relative to a human brain (1e20-1e21 FLOPS)” is <1/1,000x (i.e. the bottom two options).[1]
I think this shows that your conclusion is driven substantially by your choice to hardcode “1e20-1e21 FLOPS” specifically, and then to treat this figure as a reasonable proxy for what computation an AGI would need. (That is, you suggest ~1x as the midpoint for “Compute needed by AGI, relative to… 1e20-1e21 FLOPS”).
I think it’s also a bit of an issue to call the variable “relative to a human brain (1e20-1e21 FLOPS)”. Most users will read it as “relative to a human brain”, while it’s really “relative to 1e20-1e21 FLOPS”, which is quite a specific take on what a human brain is achieving.
I value the fact that you argue for choosing this figure here. However, it seems like you’re hardcoding in confidence that isn’t warranted. Even from your own perspective, I’d guess that including your uncertainty over this figure would bump the probability up by a factor of 2-3. Other commenters have also pointed out that programs seem to use much less computation than we’d predict by applying a similar methodology to tasks computers already do.
[1] This is assuming a distribution on computation centred on ballpark ~100x as efficient in the future (just naively based on recent trends). If putting all weight on ~100x, nothing above 1/1,000x relative compute requirement matters. If putting some weight on ~1,000x, nothing above 1/100x relative compute requirement matters.
Did you intend to refer to page 83 rather than 82?
Announcing the Longtermism Fund
It seems extremely clear that working with the existing field is necessary to have any idea what to do about nuclear risk. That said, being a field specialist seems like a surprisingly small factor in forecasting accuracy, so I’m surprised by that being the focus of criticism.
I was interested in the criticism (32:02), so I transcribed it here:
Jeffrey Lewis: By the way, we have a second problem that arises, which I think Wizards really helps explain: this is why our field can’t get any money.
Aaron Stein: That’s true.
Jeffrey Lewis: Because it’s extremely hard to explain to people who are not already deep in this field how these deterrence concepts work, because they don’t get it. Like if you look at the work that the effective altruism community does on nuclear risk, it’s as misguided as SAC’s original, you know, approach to nuclear weapons, and you would need an entire RAND-sized outreach effort. And there are some people who’ve tried to do this. Peter Scoblic, who is fundamentally a member of that community, wrote a really nice piece responding to some of the like not great effective altruism assessments of nuclear risk in Ukraine. So I don’t want to, you know, criticise the entire community, but… I experience this at a cocktail party. Once I start talking to someone about nuclear weapons and deterrence… if they don’t do this stuff full-time, the popular ideas they have about this are… (a) they might be super bored, but if they are willing to listen, the popular ideas they have about it are so misguided, that it becomes impossible to make enough progress in a reasonable time. And that’s death when you’re asking someone to make you a big cheque. That’s much harder than ‘hi, I want to buy some mosquito nets to prevent malaria deaths’. That’s really straightforward. This… this is complex.
It’s a shame that this doesn’t identify any specific errors, although that is consistent with Lewis’ view that the errors can’t be explained in minutes, perhaps even in years.
Speaking for myself, I agree with Lewis that popular ideas about nuclear weapons can be wildly, bizarrely wrong. That said, I’m surprised he highlights effective altruism as a community he’s pessimistic about being able to teach. The normal ‘cocktail party’ level of discourse includes alluring claims like ‘XYZ policy is totally obvious; we just have to implement it’, and the effective altruism people I’ve spoken to on nuclear issues are generally way less credulous than this, and hence more interested in understanding how things actually work.
I’ve already sent this post to about 10 people during the calls I’ve offered to people considering applying to work as a grantmaker. Thanks for writing it!
I agree that this is the most common misconception about grantmaking. To be clear (as we’ve discussed), I think there are some ways to make a difference in grantmaking which are exceptions to the general rule explained here, but I think this approach is the right one for most people.
New Nuclear Security Grantmaking Programme at Longview Philanthropy
This was the single most valuable piece on the Forum to me personally. It provides the only end-to-end model of risks from nuclear winter that I’ve seen and gave me an understanding of key mechanisms of risks from nuclear weapons. I endorse it as the best starting point I know of for thinking seriously about such mechanisms. I wrote what impressed me most here and my main criticism of the original model here (taken into account in the current version).
This piece is part of a series. I found most articles in the series highly informative, but this particular piece did the most excellent job of improving my understanding of risks from nuclear weapons.
Details that I didn’t cover elsewhere, based on recommended topics for reviewers:
How did this post affect you, your thinking, and your actions?
It was a key part of what caused me to believe that civilisation collapsing everywhere solely due to nuclear weapons is extremely unlikely without a large increase in the number of such weapons. (The model in the post is consistent with meaningful existential risk from nuclear weapons in other ways.)
This has various implications for prioritisation between existential risks and prioritisation within the nuclear weapons space.
Does it make accurate claims? Does it carve reality at the joints? How do you know?
I spent about 2 days going through the 5 posts the author published around that time, comparing them to much rougher models I had made and looking into various details. I was very impressed.
The work that went into the post did the heavy lifting and pointed the way to a better understanding of nuclear risk. The model in the original version was exceptionally concrete and had a low error rate, such that reviewers were able to engage with it and identify the key errors.
For those interested in Triplebyte’s approach, there’s also Kelsey Piper’s thoughts on why and how the company gives feedback, and why others don’t.
Thanks for this.
Without having the data, it seems the controversy graph could be driven substantially by posts which get exactly zero downvotes.
Almost all posts get at least one vote (magnitude >= 1), and balance >= 0, so magnitude^balance >= 1. Since the controversy graph goes below 1, I assume you are including the handling which sets controversy to zero if there are zero downvotes, per the Reddit code you linked to.
e.g. if a post has 50 upvotes:
0 downvotes --> controversy 0 (not 1.00)
1 downvote --> controversy 1.08
2 downvotes --> controversy 1.17
10 downvotes --> controversy 2.27
so a lot of the action is in whether a post gets 0 downvotes or at least 1, and we know a lot of posts get 0 downvotes because the graph is often below 1.
If this is a major contributor, the spikes would look different if you run the same calculation without the handling (or, equivalently, with the override being to 1 instead of 0). This discontinuity also makes me suspect that Reddit uses this calculation for ordering only, not as a cardinal measure—or that zero downvotes is an edge case on Reddit!
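For concreteness, here’s a minimal sketch of the score as I read the linked Reddit code (the function name and the test loop are mine; the magnitude^balance formula and the zero-vote override are from the source):

```python
def controversy(ups: int, downs: int) -> float:
    """Reddit-style controversy score: magnitude ** balance,
    overridden to 0 when a post has no upvotes or no downvotes."""
    if ups <= 0 or downs <= 0:
        return 0.0
    magnitude = ups + downs
    balance = downs / ups if ups > downs else ups / downs
    return magnitude ** balance

# Reproduces the 50-upvote example above:
for downs in (0, 1, 2, 10):
    print(downs, round(controversy(50, downs), 2))
# -> 0: 0.0, 1: 1.08, 2: 1.17, 10: 2.27
```

Changing `return 0.0` to `return 1.0` (equivalent to dropping the override, since magnitude^0 = 1) gives the no-handling variant I mean.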
People from 80k, Founders Pledge and GWWC have already replied with corrections.
(I downvoted this because a large fraction of the basic facts about what organisations are doing appear to be incorrect. See other comments. Mostly I think it’s unfortunate to have incorrect things stated as fact in posts, but going on to draw conclusions from incorrect facts also seems unhelpful.)
I’m totally not a mod, but I thought I’d highlight the “Is it true? Is it necessary? Is it kind?” test. I think it’s right in general, but especially important here. The Forum team seems to have listed basically this too: “Writing that is accurate, kind, and relevant to the discussion at hand.”
I’m also excited to highlight another piece of their guidance: “When you disagree with someone, approach it with curiosity: try to work out why they think what they think, and what you can learn from each other.” On this:
Figuring out what someone thinks usually involves talking to them. If posting here is the first someone has heard of your concern, that might not be a very good way of resolving the disagreement.
Most people running a project in the community are basically trying to do good. It sounds obvious, but having a pretty strong prior on disagreements being in good faith seems wise here.
I think this is the best intro to investing for altruists that I’ve seen published. The investment concepts it covers are the most important ones, and the application to altruists seems right.
(For context: I used to work as a trader, which is somewhat but not very relevant, and have thought about this kind of thing a bit.)
I would guess that the decision of which GiveDirectly programme to support† is dominated by the principle you noted, of
the dollar going further overseas.
Maybe GiveDirectly will, in this case, be able to serve people in the US who are in comparable need to people in extreme poverty. That seems unlikely to me, but it seems like the main thing to figure out. I think your ‘criteria’ question is most relevant to checking this.
† Of course, I think the most important decision tends to be deciding which problem you aim to help solve, which would precede the question of whether and which cash transfers to fund.
The donation page and mailing list update loosely suggest that donations are project-specific by default. Likewise, GiveWell says:
GiveDirectly has told us that donations driven by GiveWell’s recommendation are used for standard cash transfers (other than some grant funding from Good Ventures and cases where donors have specified a different use of the funds).
(See the donation page for what the alternatives to standard cash transfers are.)
If funding for different GiveDirectly projects is sufficiently separate, your donation would pretty much just increase the budgets of the programmes you wish to support, perhaps especially if you give via GiveWell. If I were considering giving to GiveDirectly, I would want to look into this a bit more.
For the record, I wouldn’t describe having children to ‘impart positive values and competence to their descendants’ as a ‘common thought’ in effective altruism, at least any time recently.
I’ve been involved in the community in London for three years and in Berkeley for a year, and don’t recall ever having an in-person conversation about having children to promote values etc. I’ve seen it discussed maybe twice on the internet over those years.
--
Additionally: This seems like an ok state of affairs to me. Having children is a huge commitment (a significant fraction of a life’s work). Having children is also a major part of many people’s life goals (worth the huge commitment). Compared to those factors, it seems kind of implausible even in the best case that the effects you mention would be decisive.
Then: If one can determine a priori that these effects will rarely affect the decision of whether to have children, the value of information as discussed in this piece is small.
The commonsense meaning of ‘I predicted X’ is that I used some other information to assess that X was likely. ‘I saw the announcement of X before it was published’ is not that. I agree that it wasn’t literally false. It just gave a false impression. Hence ‘pretty clickbaity/misleading’.