The random chance argument is harder to make if the studies have large effect sizes. If the true effect is 0, it’s unlikely we’ll observe a large effect by chance.
This is exactly what p-values are designed for, so you are probably better off looking at p-values rather than effect size if that’s the scenario you’re trying to avoid.
I suppose you could imagine that p-values always hover just around 0.05 because, for a real and large effect, people use the smallest sample that's necessary to get p < 0.05, but this feels unlikely to me. I would expect that with a real, large effect you very quickly get p < 0.01, and researchers would in fact collect enough data to report that (a quick simulation below illustrates why).
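A minimal simulation sketch of this point (the parameters d = 0.8 and n = 50 per group are my assumptions, not from the thread): with a real, large effect, even a modest sample usually yields p < 0.01, while with a true effect of 0 it's rare to observe a large effect by chance.

```python
# Sketch: large true effects quickly give small p-values; large observed
# effects are rare under a true effect of 0. Parameters are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, d, trials = 50, 0.8, 10_000

# Case 1: true effect d = 0.8. How often is p < 0.01?
p_small = 0
for _ in range(trials):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(d, 1.0, n)
    if stats.ttest_ind(a, b).pvalue < 0.01:
        p_small += 1
print(f"true d={d}: fraction with p < 0.01 = {p_small / trials:.2f}")

# Case 2: true effect 0. How often does the observed effect exceed 0.8?
big_effect = 0
for _ in range(trials):
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(0.0, 1.0, n)
    obs_d = abs(a.mean() - b.mean()) / np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    if obs_d > d:
        big_effect += 1
print(f"true d=0: fraction with observed |d| > {d} = {big_effect / trials:.4f}")
```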
(I don’t necessarily disagree with the rest of your comment, I’m more unsure on the other points.)
See also my summary and Richard Ngo’s comments.
Yeah, I’ve been doing this occasionally (though that started recently).
From my present vantage, the AI alignment newsletter is becoming a pretty prominent clearinghouse for academic AI alignment research updates. (I wouldn’t be surprised if it were the primary source of such for a sizable portion of newsletter subscribers.)
To the extent that’s true, the amplification effects seem possibly strong.
I agree that’s true and that the amplification effects for AI safety researchers are strong; it’s much less strong of an amplification effect for any other category. My current model is that info hazards are most worrisome when they spread outside the AI safety community.
On confidentiality, the downsides of the newsletter failing to preserve confidentiality seem sufficiently small that I'm not worried (if you ignore info hazards). Failures of confidentiality are bad in that they harm your reputation and make it less likely that people are willing to talk to you—similar to the reason you wouldn't break a promise even when, superficially, keeping it seems slightly net-negative. But in the case of the newsletter, we would be amplifying someone else's failure to preserve confidentiality, which shouldn't reflect all that poorly on us. (Obviously if we knew that the information was supposed to be confidential we wouldn't publish it.)
This was in response to “the growing amount of AI safety research.”
Yeah, I think I phrased that question poorly. The question is both “should all of it be summarized” and “if yes, how can that be done”.
Presumably as there is more research, it takes more time to read & assess the incoming literature to figure out what's important / worth including in the newsletter.
I feel relatively capable of that—for any given paper or post, I think I can figure out in ~5 minutes, with relatively high accuracy, whether I want to include it. It's actually reading and summarizing it that takes time.
Interesting to think about what governance the newsletter should have in place re: info hazards, confidentiality, etc.
Currently we only write about public documents, so I don’t think these concerns arise. I suppose you could imagine that someone writes about something they shouldn’t have and we amplify it, but I suspect this is a rare case and one that should be up to my discretion.
What did you guys do for GPT-2?
Not sure what specifically you’re asking about here. You can see the relevant newsletter here.
My intuition is that this would be a good time to formalize the structure of the newsletter somewhat, especially given that there are multiple contributors & you are starting to function more as an editor.
Certainly more systems are being put into place, which is kind of like “formalizing the structure”. Creating an organization feels like a high fixed cost for not much benefit—what do you think the main benefits would be? (Maybe this is combined with paying content writers and editors, in which case an organization might make more sense?)
Plausibly it’s fine to keep it as an informal research product, but I’d guess that “AI alignment newsletter editor” could basically be (or soon become) a full-time job.
If I were to make this my full-time job, the newsletter would approximately double in length (assuming I found enough content to cover), and I’d expect that people wouldn’t read most of it. (People already don’t read all of it, I’m pretty sure.) What do you think would be the value of more time put into the newsletter?
My first guess is that there’s significant value in someone maintaining an open, exhaustive database of AIS research.
Yeah, I agree. But there’s also significant value in doing more AIS research, and I suspect that on the current margin for a full-time researcher (such as myself) it’s better to do more AIS research compared to writing summaries of everything.
Note that I do intend to keep adding all of the links to the database, it’s the summaries that won’t keep up.
It is plausible to me that an org with a safety team (e.g. DeepMind/OpenAI) is already doing this in-house, or planning to do so.
I’m 95% confident that no one is already doing this, and if they were seriously planning to do so I’d expect they would check in with me first. (I do know multiple people at all of these orgs.)
More broadly, these labs might have some good systems in place for maintaining databases of new research in areas with a much higher volume than AIS, so could potentially share some best-practices.
You know, that would make sense as a thing to exist, but I suspect it does not. Regardless, that's a good idea; I should make sure to check.
Comment thread for the question: What is the value of the newsletter for you?
Comment thread for the question: What is the value of the newsletter for other people?
Comment thread for the question: How should I deal with the growing amount of AI safety research?
Comment thread for the question: What can I do to get more feedback on the newsletter on an ongoing basis (rather than having to survey people at fixed times)?
Comment thread for the question: Am I underestimating the risk of causing information cascades? Regardless, how can I mitigate this risk?
Is this different from having more people on a single granting body?
Possibly with more people on a single granting body, everyone talks to each other more and so can all get stuck thinking the same thing, whereas they would have come up with more / different considerations had they been separate. But this would suggest that granting bodies would benefit from splitting into halves, going over grants individually, and then merging at the end. Would you endorse that suggestion?
Mostly agree with all of this; some nitpicks:
My understanding (and I think everyone else’s) of AI capabilities is largely shaped by how impressive the results of major papers intuitively seem.
I claim that this is not how I think about AI capabilities, and it is not how many AI researchers think about AI capabilities. For a particularly extreme example, the Go-explore paper out of Uber had a very nominally impressive result on Montezuma’s Revenge, but much of the AI community didn’t find it compelling because of the assumptions that their algorithm used.
I’m not sure I fully understand how the metric would work. For the Atari example, it seems clear to me that we could easily reach it without making a generalizable AI system, or vice versa.
Tbc, I definitely did not intend for that to be an actual metric.
But let’s say that we could come up with a relevant metric. Then I’d agree with Garfinkel, as long as people in the community had known roughly the current state of AI in relation to it and the rate of advance toward it before the release of “AI and Compute”.
I would say that I have a set of intuitions and impressions that function as a very weak prediction of what AI will look like in the future, along the lines of that sort of metric. I trust timelines based on extrapolation of progress using these intuitions more than timelines based solely on compute. To the extent that you hear timeline estimates from people like me, who do this sort of "progress extrapolation" but did not know how compute has been scaling, you would want to lengthen their timeline estimates. I'm not sure how timeline predictions break down on this axis.
DeepMind certainly seems to be saying that AlphaZero is better at searching a more limited set of promising moves than Stockfish, a traditional chess engine (unfortunately they don’t compare it to earlier versions of AlphaGo on this metric).
Only at test time. AlphaZero has much more experience gained from its training phase. (Stockfish has no training phase, though you could think of all of the human domain knowledge encoded in it as a form of “training”.)
AlphaZero went from a bundle of blank learning algorithms to stronger than the best human chess players in history...in less than two hours.
Humans are extremely poorly optimized for playing chess.
I don’t agree with Garfinkel that OpenAI’s analysis should make us more pessimistic about human-level AI timelines. While it makes sense to revise our estimate of AI algorithms downward, it doesn’t follow that we should do the same for our estimate of overall progress in AI. By cortical neuron count, systems like AlphaZero are at about the same level as a blackbird (albeit one that lives for 18 years), so there’s a clear case for future advances being more impressive than current ones as we approach the human level.
Sounds like you are using a model where (our understanding of) current capabilities and rates of progress of AI are not very relevant for determining future capabilities, because we don’t know the absolute quantitative capability corresponding to “human-level AI”. Instead, you model it primarily on the absolute amount of compute needed.
Suppose you did know the absolute capability corresponding to “human-level AI”, e.g. you can say something like “once we are able to solve Atari benchmarks using only 10k samples from the environment, we will have human-level AI”, and you found that metric much more persuasive than the compute used by a human brain. Would you then agree with Garfinkel’s point?
In my understanding, coordination/collusion can be limited by keeping donations anonymous.
It’s not hard for an individual to prove that they donated by other means, e.g. screenshots and bank statements.
(See the first two paragraphs on page 16 in the paper for an example.)
Right after that, the authors say:
There is a broader point here. If perfect harmonization of interests is possible, Capitalism leads to optimal outcomes. LR is intended to overcome such lack of harmonization and falls prey to manipulation when it wrongly assumes harmonization is difficult.
With donations it is particularly easy to harmonize interests: if I'm planning to allocate 2 votes to MIRI and you're planning to allocate 2 votes to AMF, we can instead each allocate 1 vote to MIRI and 1 vote to AMF, and we both benefit (the worked example below shows the arithmetic). Yes, we have to build trust that neither of us defects by putting both votes on our own preferred charity; but this seems doable in practice: even in the hardest case of vote trading (where there are laws attempting to enforce anonymity and the inability to prove your vote) there seems to have been some success.
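To make the arithmetic concrete, here's a toy sketch of that trade under the LR / quadratic funding rule, where a project's funding is (Σ√contributions)², i.e. casting v votes costs v². The function names are mine, purely for illustration:

```python
# Toy sketch of vote-splitting under quadratic funding (LR):
# votes are sqrt(contributions), so v votes cost v^2, and a project
# receives (sum of votes)^2. Numbers follow the MIRI/AMF example above.

def funding(votes):
    """Funding a project receives, given each donor's votes on it."""
    return sum(votes) ** 2

def cost(votes_by_project):
    """What one donor pays: quadratic in their votes on each project."""
    return sum(v ** 2 for v in votes_by_project)

# Scenario A: no coordination. I put 2 votes on MIRI, you put 2 on AMF.
print(funding([2]), funding([2]))        # MIRI = 4, AMF = 4
print(cost([2]), cost([2]))              # each donor pays 4

# Scenario B: we trade. Each of us puts 1 vote on MIRI and 1 on AMF.
print(funding([1, 1]), funding([1, 1]))  # MIRI = 4, AMF = 4 (unchanged)
print(cost([1, 1]), cost([1, 1]))        # each donor pays only 2
```

Both charities receive the same funding either way, but coordinating halves each donor's cost; equivalently, if we instead held spending fixed, splitting would double each charity's funding. This is exactly the kind of coordination the optimality theorem assumes away.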
Sorry, I meant “collusion” in the sense that it is used in the game theory literature, where it’s basically equivalent to “coordination in a way not modeled by the game theory”, and doesn’t carry the illegal/deceitful connotation it does in English. See e.g. here, which is explicitly talking about this problem for Glen Weyl’s proposal.
The overall point is, if donors can coordinate, as they obviously can in the real world, then the optimal provisioning of goods theorem no longer holds. The example with MIRI showcased this effect. I’m not saying that anyone did anything wrong in that example.