ah awesome, thanks! I second-guessed myself by adding that, but I should have third-guessed myself. the midwit meme in real life
If grantee concerns are a reason against doing this, you could allow grantees to opt into having their tiers shared publicly. Even an incomplete list could be useful.
I’d personally happily opt in with the Atlas Fellowship, even if the tier wasn’t very good.
If a concern is that the community would read too much into the tiers, some disclaimers and encouragement for independent thinking might help counteract that.
I don’t think it’s impossible: you could start from Halperin et al.’s basic setup, plug in some numbers for p(doom), the long-run growth rate, etc., and get a market opinion.
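To make that concrete, here’s a toy version of the kind of calculation I mean, using a simple Ramsey-rule setup with an extinction hazard folded into the discount rate. All parameter values below are illustrative placeholders, not numbers from the paper:

```python
# Toy Ramsey-rule calculation: what real interest rate would markets imply
# under different beliefs about growth and extinction risk?
# r = rho + p_doom + gamma * g
# (rho = pure time preference, gamma = risk aversion, g = growth rate;
# an annual extinction hazard enters the discount rate roughly one-for-one).
def implied_real_rate(rho, gamma, growth, p_doom_annual):
    """Market-implied real rate under the toy Ramsey setup."""
    return rho + p_doom_annual + gamma * growth

# Status-quo world: ~2% growth, negligible doom risk.
baseline = implied_real_rate(rho=0.01, gamma=1.0, growth=0.02, p_doom_annual=0.0)

# "Short timelines" world: explosive 10% growth plus a 1%/yr extinction hazard.
tai_world = implied_real_rate(rho=0.01, gamma=1.0, growth=0.10, p_doom_annual=0.01)

print(baseline, tai_world)
```

The point being that if markets really expected transformative AI soon, implied rates should look much more like the second number than the first.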
I would also be interested in seeing analyses from hedge fund experts and others. In our cursory lit review we didn’t come across any that was readily quantifiable (would love to learn of one if it exists!).
Nice post! Using the individual-level data, are you able to answer whether forecasts also get better if you start with the single “best” forecaster and then progressively add the next best, the next best, etc., where “best” is defined ex ante (e.g., by prior lifetime Metaculus score)? It’s a different question but might also be of interest for thinking about optimal aggregation.
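In case it’s useful, the experiment I have in mind could be sketched roughly like this, with simulated data standing in for the real individual-level forecasts (all names and numbers below are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 50 forecasters, 200 binary questions. Each forecaster
# gets an ex-ante "skill" proxy (here just a per-forecaster noise level,
# lower = better), standing in for something like prior lifetime Metaculus score.
n_forecasters, n_questions = 50, 200
noise_sd = rng.uniform(0.05, 0.40, n_forecasters)
truth = rng.integers(0, 2, n_questions).astype(float)
forecasts = np.clip(
    truth + rng.normal(0, 1, (n_forecasters, n_questions)) * noise_sd[:, None],
    0.01, 0.99,
)

# Add forecasters one at a time, best (lowest noise) first, tracking the
# Brier score of the simple mean of the pool so far.
order = np.argsort(noise_sd)
brier_by_k = []
for k in range(1, n_forecasters + 1):
    pooled = forecasts[order[:k]].mean(axis=0)
    brier_by_k.append(np.mean((pooled - truth) ** 2))

print(brier_by_k[:5])
```

With real data you’d replace the simulated `forecasts` matrix and `noise_sd` ordering with the actual forecasts and ex-ante scores, then look at where the `brier_by_k` curve flattens out.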
I don’t think you can learn much from observational data like this about the causal effect of the number of forecasters on performance. Do you have any natural experiments you could exploit? (i.e., some ‘random’ factor affecting the number of forecasters that’s not correlated with forecaster skill.) Or can you run a randomized experiment?
It sounds like you’re doing subsampling. Bootstrapping is random sampling with replacement.
If, for example, we kept increasing the size of the sample we draw, then eventually the variance would be guaranteed to go to zero (when the sample size equals the total number of forecasters and there is only one possible sample we can draw).
With bootstrapping, there are N^N possible ordered draws when the bootstrap sample size is equal to the actual sample size N. (And you could choose a bootstrap sample size K > N.)
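A quick simulation makes the difference concrete (the `scores` array and sizes here are arbitrary stand-ins, not your data):

```python
import numpy as np

rng = np.random.default_rng(0)
scores = rng.normal(size=20)   # stand-in for 20 forecasters' scores
N = len(scores)

def resample_sd(with_replacement, size, reps=5000):
    """SD of the sample mean across repeated draws of `size` forecasters."""
    means = [rng.choice(scores, size=size, replace=with_replacement).mean()
             for _ in range(reps)]
    return np.std(means)

# Subsampling (without replacement) at size N: only one possible sample,
# so the variance collapses to (essentially) zero. Bootstrapping (with
# replacement) at size N still has N**N possible ordered draws, so the
# variance stays positive.
print(resample_sd(False, N))
print(resample_sd(True, N))
```

The first number is zero up to floating-point noise; the second stays around the usual bootstrap standard error of the mean.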
Yeah, I assume the full version is impossible. But maybe there are at least some simpler statements that can be inferred? Like, “<10% chance of transformative AI by 2030.” I’d be really curious to get a better read on what market specialists around this area (maybe select hedge fund teams focused on tech disruption?) would think.
Thank you so much for hosting Denise!
I’m often pretty surprised at how insulated EAs in my field can sometimes be from non-EA approaches to the issue. I personally put a high value on people meaningfully engaging with new ideas that challenge pre-existing ones. I think EAs can be really good at this when it’s within the confines of their own perspective (say, getting criticized from within EA), but less so when the challenge comes from outside it. (Note: I think almost everyone is bad at this; it’s not unique to EA.)
In other words, I think there is some validity to the criticism that EAs can be kind of naive and generally unaware of other ways of seeing things within their own field but outside of EA (e.g., the critique that global health EA is a bit simplistic/naive/insufficient).
Ok, fair enough. Imagine I said ‘a major point’ rather than ‘the whole point’.
Apologies, I misunderstood a fundamental aspect of what you’re doing! For some reason in my head you’d picked a set of conjectures which had just been posited this year, and were seeing how Laplace’s rule of succession would perform when using it to extrapolate forward with no historical input.
I don’t know where I got this wrong impression from, because you state very clearly what you’re doing in the first sentence of your post. I should have read it more carefully before making the bold claims in my last comment. I actually even had a go at stating the terms of the bet I suggested before quickly realising what I’d missed and retracting. But if you want to hold me to it you can (I might be interpreting the forum wrong but I think you can still see the deleted comment?)
I’m not embarrassed by my original concern about the dimensions, but your original reply addressed them nicely and I can see it likely doesn’t make a huge difference here whether you take a year or a month, at least as long as the conjecture was posited a good number of years ago (in the limit that “trial period”/”time since posited” goes to zero, you presumably recover the timeless result you referenced).
New EA forum suggestion: you should be able to disagree with your own comments.
Maybe we should have a forum ranking algorithm hackathon 😅
Maybe an upvote of a post could instead become an upvote of each (post, tag) tuple – e.g., (dDudLPHv7AgPLrzef, Building effective altruism), (dDudLPHv7AgPLrzef, Community), (dDudLPHv7AgPLrzef, Software engineering), (dDudLPHv7AgPLrzef, Public interest technology), etc. The final score of the post could then be calculated as (say) median of the quantiles of the scores of the (post, tag) tuples of the post.
This would make it hard for a post to reach any but a low quantile among Community-tagged posts but should make it easy for posts to reach a high quantile among niche posts.
It could introduce a bit of an incentive for people to mistag their posts with obscure tags and leave out common tags, but using the median should make this effect relatively mild. Plus, anyone can add the common tags. One could also exclude tags with <10 posts from the calculation, or use some other suitable threshold.
I haven’t tested this, so it might not work at all!
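That said, the scoring rule itself is easy to sketch. Here’s a minimal version; the tags and karma numbers are entirely made up:

```python
import statistics

# Hypothetical data: karma scores of existing posts under each tag,
# plus the raw score and tags of one new post. All numbers are made up.
scores_by_tag = {
    "Community":                  [5, 12, 18, 25, 40, 60, 90, 120],
    "Software engineering":       [3, 6, 9, 14],
    "Public interest technology": [4, 8, 11],
}
post_score = 30

def quantile_within_tag(score, tag_scores):
    """Fraction of posts under this tag that the post matches or outscores."""
    return sum(s <= score for s in tag_scores) / len(tag_scores)

# Final score: median of the post's quantile within each of its tags.
quantiles = [quantile_within_tag(post_score, s) for s in scores_by_tag.values()]
final = statistics.median(quantiles)
print(quantiles, final)
```

Here the post sits mid-pack among Community posts but near the top of the two niche tags, so the median pulls its final score up, which is the intended effect.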
Thank you for this really helpful tip!
That depends on your primary interest in biosecurity. If it is more policy-oriented then maybe the Johns Hopkins or the BWC ones. If you are more interested in epidemiology then maybe the Pandora Report? If you are more interested in technological developments then newsletters 13-15 might be a better fit. These are just loose suggestions.
Of course! Happy to help :)
I didn’t vote, but your assertion that the “whole point” of the Forum is to discuss “topics of broad interest” seems overstated to me. That’s a purpose, but more technical/specialized discussion is also a purpose, and there are legitimate concerns that some of that is being drowned out.
PS—as always, for folks who disagree-voted on this, I’d appreciate seeing why specifically you disagree.
Can the people who agreement-downvoted this explain yourselves? Bogdan has a good point: if we really believe in short timelines to transformative AI we should either be spending our entire AI-philanthropy capital endowment now, or possibly investing it in something that will be useful after TAI exists. What does not make sense is trying to set up a slow funding stream for 50 years of AI alignment research if we’ll have AGI in 20 years.
At this point, I think it’s unfortunate that this post has not been published; a >2-month delay seems too long to me. If there’s anything I can do to help get it published, please let me know.