I’m currently researching forecasting and epistemics as part of the Quantified Uncertainty Research Institute.
Ozzie Gooen
I guess on one hand, if this were the case, then EAs would be well-represented in America, given that its population in 1776 was just 2.5M, vs. the UK population of 8M.
On the other hand, I’d assume that if they were distributed across the US, many would have been farmers / low-income workers / slaves, so wouldn’t have been able to contribute much. There is an interesting question of how much labor mobility or inequality there was at the time.
Also, it seems like EAs got incredibly lucky with Dustin Moskovitz + Good Ventures. It’s hard to picture just how lucky we were with that, and what the corresponding scenarios would have been like in 1776.
Could make for neat historical fiction.
On AI safety, I think it’s fairly likely (40%?) that, upon a lot of reflection, the level of x-risk in the next 20 years would come out at less than 20%, and that the entirety of the EA scene might be reducing it to, say, 15%.
This would mean that the entirety of the EA AI safety scene is helping the EV of the world by ~5%.
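To spell out the arithmetic (a rough framing that treats the future’s value as all-or-nothing):

$$\Delta \mathrm{EV} \approx (0.20 - 0.15) \times V = 0.05\,V$$

where $V$ is the value of the future conditional on avoiding existential catastrophe.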
On one hand, this is a whole lot. But on the other, I’m nervous that it’s not ambitious enough for a group that could be one of the most well-resourced, well-meaning, and analytical/empirical of our generation.
One thing I like about epistemic interventions is that the upper-bounds could be higher.
(There are some AI interventions that are more ambitious, but many do seem to be mainly about reducing x-risk by less than an order of magnitude, not increasing the steady-state potential outcome)
I’d also note here that an EV gain of 5% might not be particularly ambitious. It could well be the case that many different groups can do this—so it’s easier than it might seem if you think goodness is additive instead of multiplicative.
Yea, I think this setup has been incredibly frustrating downstream. I’d hope that people from OP with knowledge could publicly reflect on this, but my quick impression is that some of the following factors happened:
1. OP has had major difficulties/limitations around hiring in the last 5+ years. Some of this is lack of attention, some is that there aren’t great candidates, some is a lack of ability. This affected some cause areas more than others. For whatever reason, they seemed to have more success hiring (and retaining talent) for community work than for technical AI safety.
2. I think there have been some uncertainties / disagreements about how important / valuable current technical AI safety organizations are to fund. For example, I imagine that if this were a major priority for those in charge of OP, more could have been done.
3. OP management seems to be a bit in flux now: they lost Holden recently, are hiring a new head of GCR, etc.
4. I think OP isn’t very transparent or public about explaining their limitations/challenges.
5. I would flag that there are spots at Anthropic and DeepMind that we don’t need to fund but that are still good fits for talent.
6. I think some of the Paul Christiano-connected orgs were considered a conflict of interest, given that Ajeya Cotra was the main grantmaker.
7. Given all of this, I think it would be really nice if people could at least provide warnings about this, like making sure that people entering the field are strongly warned that the job market is very limited. But I’m not sure who feels responsible / well placed to do this.
Around EA Priorities:
Personally, I’m fairly strongly convinced to favor interventions that could help the future beyond 20 years from now. (A much lighter version of “Longtermism”).
If I had a budget of $10B, I’d probably donate a fair bit to some existing AI safety groups. But it’s tricky to know what to do with, say, $10k. And the fact that the SFF, OP, and others have funded some of the clearest wins makes it harder to know what’s exciting on the margin.
I feel incredibly unsatisfied with the public EA dialogue around AI safety strategy right now. From what I can tell, there’s some intelligent conversation happening among a handful of people at the Constellation coworking space, but very little of it is publicly visible. I think many people outside of Constellation are working with simplified models, like “AI is generally dangerous, so we should slow it all down,” as opposed to something like, “Really, there are three scary narrow scenarios we need to worry about.”
I recently spent a week in DC and found it interesting. But my impression is that a lot of people there are focused on fairly low-level details, without a great sense of the big-picture strategy. For example, there’s a lot of work going into shovel-ready government legislation, but little thinking about what the TAI transition should really look like.
This sort of myopic mindset is also common in the technical space, where I meet a bunch of people focused on narrow aspects of LLMs, without much understanding of how exactly their work fits into the big picture of AI alignment. As an example, a lot of work seems like it would help with misuse risk, even though the big-picture EAs seem much more focused on accident risk.
Some (very) positive news is that we do have far more talent in this area than we did 5 years ago, and there’s correspondingly more discussion. But it still feels very chaotic.
A bit more evidence: it seems like OP has sent very mixed messages around AI safety. They’ve provided surprisingly little funding / support for technical AI safety in the last few years (perhaps 1 full-time grantmaker?), but they have seemed to provide more support for AI safety community building / recruiting, and for AI policy. Still, all of this represents perhaps ~30% of their total budget, and I don’t sense that that’s about to change. Overall this comes off as measured and cautious. Meanwhile, it’s been difficult to convince other large donors to get into this area. (Jaan Tallinn is the exception; he might well have been the strongest dedicated donor here.)
Recently it seems like the community on the EA Forum has shifted a bit to favor animal welfare. Or maybe it’s just that the AI safety people have migrated to other blogs and organizations.
But again, I’m very hopeful that we can find interventions that will help in the long-term, so few of these excite me. I’d expect and hope that interventions that help the long-term future would ultimately improve animal welfare and more.
So on one hand, AI risk seems like the main intervention area for the long-term, but on the other, the field is a bit of a mess right now.
I feel quite frustrated that EA doesn’t have many other strong recommendations for other potential donors interested in the long-term. For example, I’d really hope that there could be good interventions to make the US government or just US epistemics more robust, but I barely see any work in that area.
“Forecasting” is one interesting area—it currently does have some dedicated support from OP. But it honestly seems to be in a pretty mediocre state to me right now. There might be 15-30 full-time people in the space at this point, and there’s surprisingly little in terms of any long-term research agendas.
I (with limited information) think the EA Animal Welfare Fund is promising, but I wonder how much of that matches the intention of this experiment. It can be a bit underwhelming if an experiment to get the crowd’s takes on charities winds up concluding, “just let the current few experts figure it out.” Though I guess that does represent a good state of the world (the public thinks that the current experts are basically right).
I occasionally hear implications that cyber + AI + rogue human hackers will cause mass devastation, in ways that roughly match “lots of cyberattacks happening all over.” I’m skeptical of this causing over $1T/year in damages (for over 5 years, pre-TAI), and definitely of it causing an existential disaster.
There are some much more narrow situations that might be more X-risk-relevant, like [A rogue AI exfiltrates itself] or [China uses cyber weapons to dominate the US and create a singleton], but I think these are so narrow that they should really be identified individually and called out. If we’re worried about them, I’d expect we’d want to take very different actions than broadly reducing cyber risks.
I’m worried that some smart+influential folks are worried about the narrow risks, but then there’s various confusion, and soon we have EAs getting scared and vocal about the broader risks.
Some more discussion in this Facebook Post.
Here’s the broader comment against cyber + AI + rogue human hacker risks, or maybe even a lot of cyber + AI + nation state risks.
Note: This was written quickly, and I’m really not a specialist/expert here.
1. There’s easily $10T of market cap in tech companies that would be dramatically reduced if AI systems could invalidate common security measures. This means there’s a lot of incentive to prevent this.
2. AI agents could oversee phone and video calls, monitor other conversations, and raise flags about potential risks. There’s already work here, and there could be a lot more.
3. If LLMs could detect security vulnerabilities, this might be a fairly standardized and somewhat repeatable process, and actors with more money could have a big advantage. If person A spends $10M using GPT5 to discover 0-days, they’d generally find a subset of what person B, who spends $100M, would find. This could mean that governments and corporations would have a large advantage. They could do this kind of investigation before software is released, and run ongoing security checks as new models come out. Or, companies would find bugs before attackers do. (There is a separate question of whether a given bug is cost-efficient to fix.)
4. The way to do a ton of damage with LLMs and cyber is to develop offensive capabilities in-house, then release a bunch of them at once in a planned massive attack. In comparison, I’d expect that many online attackers using LLMs wouldn’t be very coordinated or patient. I think that attackers are already using LLMs somewhat, and would expect this to scale gradually, providing defenders a lot of time and experience.
5. AI code generation is arguably improving quickly. This could allow us to build much more secure software, and to add security-critical features.
6. If the state of cyber-defense is bad enough, groups like the NSA might use it to identify and stop would-be attackers. It would be tricky to have a world where it’s both difficult to protect key data and easy to remain anonymous when going after others’ data. Similarly, if a lot of the online finance world is hackable, then potential hackers might not have a way to store their hacking earnings, so they could be less motivated. It just seems tough to fully imagine a world where many decentralized actors carry out attacks that completely cripple the economy.
7. Cybersecurity has a lot of very smart people and security companies. Perhaps not enough, but I’d expect these people could see threats coming and respond decently.
8. Very arguably, a lot of our infrastructure is fairly insecure, in large part because it’s just not attacked that much, and when it is, the attacks don’t cause all that much damage. Companies have historically skimped on security because the costs weren’t prohibitive. If cyberattacks get much worse, there’s likely a backlog of easy wins once companies actually get motivated to make fixes.
9. I think around our social circles, those worried about AI and cybersecurity generally talk about it far more than those not worried about it. I think this is one of a few biases that might make things seem scarier than they actually are.
10. Some companies, like Apple, have gotten good at rolling out security updates fairly quickly. In theory, an important security update to iPhones could reach 50% penetration in a day or so. These systems can improve further.
11. I think we have yet to see the markets show worry about cyber-risk. Valuations of tech companies are very high, and cyber-risk doesn’t seem to be a major factor when discussing tech valuations. Companies can get cyber-insurance; I think the rates have been going up, but not exponentially.
12. Arguably, there are many trillions of dollars held by billionaires and others who don’t know what to do with it. If something like this actually threatened to drop global wealth by 50%+, defending against it would be an enticing avenue for such money to go. Basically, as a planet, we do have large reserves to spend if the EV is positive enough.
13. In worlds with much better AI, many AI companies (and others) will be a lot richer, and be motivated to keep the game going.
14. Very obviously, if there’s $10T+ at stake, this would be a great opportunity for new security companies and products to enter the market.
15. Again, if there’s $10T+ at stake, I’d assume that people could change practices a lot to use more secure devices. In theory, all professionals could switch to one of a few locked-down phones and computers.
16. The main scary actors potentially behind AI + Cyber would be nation-states and rogue AIs. But nation-states have traditionally been hesitant to make these (meaning $1T+ damage) attacks outside of wartime, for similar reasons that they’re hesitant to carry out military attacks outside of wartime.
17. I believe that the US leads on cyber now. The US definitely leads on income. More cyber/hacking abilities would likely be used heavily by the US state. So, if these abilities become much more powerful, the NSA/CIA might become far better at using cyber attacks to go after other potential international attackers. US citizens might have a hard time being private and secure, but so would would-be attackers. Cyber-crime becomes far less profitable if the attackers themselves can’t preserve their own privacy and security. There are only 8 billion people in the world, so in theory it might be possible to oversee everyone who poses a risk of doing damage (maybe 1-10 million people)? Another way of putting this is that better cyber offense could directly lead to more surveillance by the US government. (This obviously has some other downsides, like US totalitarian control, but that is a very different risk.)
I wonder if some of the worry about AI + Cyber is akin to the “sleepwalking fallacy”. Basically, if AI + Cyber becomes a massive problem, I think we should expect that there will be correspondingly massive resources spent at that point trying to fix it. I think that many people (but not all!) worried about this topic aren’t really imagining what $1-10T of decently-effective resources spent on defense would do.
I think that AI + Cyber could be a critical threat vector for malicious and powerful AIs in the case of AI takeover. I could also easily see it doing $10-$100B/year of damage in the next few years. But I’m having trouble picturing it doing $10T/year of damage in the next few years, if controlled by humans.
I think I broadly like the idea of Donation Week.
One potential weakness: I’m curious whether the voting system promotes the more well-known charities. I’d assume that these are somewhat inversely correlated with the most neglected charities.
Relatedly, I’m curious whether future versions could feature specific subprojects/teams within charities. “Rethink Priorities” is a rather large project compared to “PauseAI US”; I assume it would be interesting if different parts of it were listed here instead.
(That said, in terms of the donation, I’d hope that we could donate to RP as a whole and trust RP to allocate it accordingly, instead of formally restricting the money, which can be quite a hassle in terms of accounting)
As a very simple example, I think Amanda Askell stands out to me as someone who used to work in philosophy, then shifted to ML work, where she now seems to be doing important work on crafting the personality of Claude. I think Claude easily has 100k direct users now (more through the API), and I expect that to expand a lot.
There have been some investigations into trying to get LLMs to be truthful:
https://arxiv.org/abs/2110.06674
And of course, LLMs have shown promise at forecasting:
https://arxiv.org/abs/2402.18563
In general, I’m both suspicious of human intellectuals (for reasons outlined in the above linked post) and suspicious of our ability to improve human intellectuals. On the latter, it’s just very expensive to train humans to adopt new practices or methods. It’s obviously insanely expensive to train humans in any complex topic like Bayesian statistics.
Meanwhile, LLM setups are rapidly improving, and arguably much more straightforward to improve. There’s of course one challenge of actually getting the right LLM companies to incorporate recommended practices, but my guess is that this is often much easier than training humans. You could also just build epistemic tools on top of LLMs, though these would generally target fewer people.
I have a lot of uncertainty about whether AI is likely to be an existential risk. But I have a lot more certainty that AI is improving quickly and will become a more critical epistemic tool than it is now. It’s also just far easier to study than humans are.
Happy to discuss / chat if that could ever be useful!
Happy to see progress on these.
One worry that I have about them is that they (at least the forecasting part of the economics one, and the psychology one) seem very focused on various adjustments to human judgement. In contrast, I think a much more urgent and tractable question is how to improve the judgement and epistemics of AI systems.
I’ve written a bit more here.
AI epistemics seems like an important area to me both because it helps with AI safety, and because I expect that it’s likely to be the main epistemic enhancement we’ll get in the next 20 years or so.
To do these questions full justice would take quite a while. I’ll give a quick-ish summary.
On Guesstimate and Squiggle, you can see some of the examples on the public websites. On Guesstimate, go here, then select “recent”; you’ll see a bunch of public models by effective altruist users. Guesstimate is still much more popular than Squiggle, but I think Squiggle has a decent amount more potential in the long term. (It’s much more powerful and flexible, but more work to learn.)
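To give a flavor of the kind of model people write, here’s a minimal made-up sketch in Squiggle (the variable names and numbers are hypothetical, not from a real model):

```
// Hypothetical cost-effectiveness sketch, for illustration only
peopleReached = 20000 to 80000   // 90% credible interval for people reached per year
effectPerPerson = 0.001 to 0.02  // benefit units per person reached
programCost = 100000 to 300000   // annual cost in dollars
totalEffect = peopleReached * effectPerPerson
programCost / totalEffect        // cost per unit of benefit, returned as a distribution
```

Each “a to b” expression defines a distribution with a 90% credible interval, and the arithmetic propagates that uncertainty through to the final estimate.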
I think Guesstimate is continuing to hold fairly steady, though its use is slowly declining each year (we haven’t been improving it; we’ve mainly been doing minimal maintenance).
With Squiggle, I understand that a fair amount of modeling is done in the public Playground, where we can’t measure activity very well. We do have metrics of use on Squiggle Hub; that use exists but is limited right now.
I’d flag that these are somewhat specialized tools that are often used for specific occasions. A bunch of orgs do modeling in specific batches, then don’t touch the models for a few months.
“examples of standout successes from these tools?” → Our largest one was the Global Unified Cost-Effectiveness Analysis (GUCEM) by the FTX Future Fund, in 2023. Leopold Aschenbrenner specifically did a lot of work making very comprehensive estimates of their funding. Frustratingly, after FTX collapsed, so too did this project.
We have not since had other users who have been as ambitious. We have had several users inside CEA, OP, and the LTFF. I’m not sure how much detail I can go into on the specifics. I think most of this has been private so far; I hope more eventually becomes public.
In Michael Dickens’s recent post, he linked to a Squiggle model he used for some of his cost-effectiveness estimates.
Some models from CEA were linked in these posts. Ben West was into this when he was the interim CEO there.
https://forum.effectivealtruism.org/posts/xrQkYh8GGR8GipKHL/how-expensive-is-leaving-your-org-squiggle-model
https://forum.effectivealtruism.org/posts/4wNDqRPJWhoe8SnoG/cea-is-fundraising-and-funding-constrained
[Charts: Guesstimate activity and Squiggle Hub activity]
I think these results, by themselves, are not as impressive as I’d like. If that were all we were aiming for and accomplished, and we were making a fairly ordinary web application, I’d consider this a minor success, but one with uncertain cost-effectiveness, especially given the opportunity cost of our team.
However, I’ll flag that:
- A lot of what we’ve been doing with Squiggle has been on the highly experimental end. I see this as a research project and an experiment to identify promising interventions, more than a direct value-adding internal tool, so far. Through this lens, I think what we have now is much more impressive. We’ve developed a usable programming language with some unique features, a novel interactive environment, and a suite of custom visualizations, all of which are iterated on and open-source. We did this with a very small team (less than 2 FTEs, for less than 2 years), and a budget that’s low for any serious technical venture. There’s a bunch of experimental features we’ve been trying out but have not yet fully written about. Relative Value Functions was one such experiment that has seen some use, and that we’re still excited to promote to other groups, though perhaps in different forms.
- A lot of what I’ve been doing is thinking through and envisioning where forecasting/epistemics should go. If you look through our posts, you can see a lot of this. I think we have one of the most ambitious and coherent visions for where we can encourage epistemic research and development. I see much of our tooling as a way to help clarify and experiment with these visions. Most of our writing is open and free.
- I’m not sure how much sense it makes to focus on increasing direct authorship of Guesstimate/Squiggle now. I think in the future, it’s very likely that a lot of Squiggle would be written by AIs, perhaps with some specialist analysts in-house. Training people to build these models is fairly high-cost. I’ve done several workshops now; I think they’ve gone decently, but I think it would take far more training to substantially increase the amount of numeric cost-effectiveness models written across most EA orgs.
“What’s the competitive landscape here? I’m slightly worried that this kind of initiative should be a for-profit and EA-independent.”
Yea, I get that often. We think about it sometimes. I plan to continue to consider it, but I’d flag that there are a lot of points that make this less appealing than it might seem:
1. I tried turning Guesstimate into a business and realized it would be an uphill battle, unless we heavily pivoted into something different.
2. The market for numeric tooling is honestly quite narrow and limited. You can maybe make a business, but it’s very hard to scale it.
3. If you go the VC route, they’ll heavily encourage you to pivot to specific products that make more money. Then you can get bought out / closed down, if it’s not growing very quickly. Causal went this route, then semi-pivoted to finance modeling, then got bought out.
4. It’s really hard to make a successful business, especially one with enough leeway to also support EAs and EA use cases. We’re a 2-person team; I don’t think we have the capacity or skills to do this well right now.
5. I want to be sure that we can quickly change focus to what’s most promising. For example, AI has been changing, and so too has what’s possible with AI tooling. When you make a business, it can be easy to get locked into a narrow product.
6. I think that what EAs/Rationalists want is generally a fair bit different from what others want, and especially what other companies would pay a lot for. So it’s difficult to support both.
I hope that helps clarify things. Happy to answer other questions!
Thanks for the clarification!
Yea, I’m not very sure what messaging to use. It’s definitely true that there’s a risk we won’t be able to maintain our current team for another year. At the same time, if we could get more than our baseline of funding, I think we could make good use of it (up to another 1-2 FTE, for 2025).
I’m definitely still hoping that we could eventually (next 1-5 years) either significantly grow (this could mean up to 5-7 FTE) or scale in other ways. Our situation now seems pretty minimal to me, but I still strongly prefer it to not having it.
I’d flag that the funding ecosystem feels fairly limited for our sort of work. The main options are really the SFF and the new Open Philanthropy forecasting team. I’ve heard that some related groups have also been having challenges with funding.
If the confusion is that you expected us to have more runway, I’m not very sure what to say. I think this sector can be pretty difficult. We’re in talks for funding from one donor, which would help cover this gap, but I’d like to not depend that much on them.
We also do have a few months of reserves that we could spend in 2025 if really needed.
We so far raised $62,000 for 2025, from the Survival and Flourishing Fund.
Slava and I are both senior software engineers; I’m in Berkeley (to be close to the EA scene here). The total is roughly $200k for the two of us (including taxes and health care).
In addition, we have server and software payments, plus other miscellaneous costs.
We then have a 14% overhead from our sponsorship with Rethink Priorities.
I said around $200k, so this assumes basically a $262k budget. This is on the low end of what I’d really prefer, but given the current EA funding situation, it’s what I’ll aim for now.
If we had more money we could bring in contractors for things like research and support.
Answering on behalf of the Quantified Uncertainty Research Institute!
We’re looking to raise another ~$200k for 2025, to cover our current two-person team plus expenses. We’d also be enthusiastic about expanding our efforts if there is donor interest.
We at QURI have been busy on software infrastructure and epistemics investigations this last year. We currently have two full-time employees: myself and Slava Matyuhin. Slava focuses on engineering; I do a mix of engineering, writing, and admin.
Our main work this year has been improving Squiggle and Squiggle Hub.
In the last few months we’ve built Squiggle AI as well, which we’ve started getting feedback on and will write more about here shortly. Basically, we believe that BOTECs and cost-benefit models are good fits for automation. So far, with some tooling, we think that we’ve created a system that produces decent first passes on many simple models. This would ideally be something EAs benefit from directly, and something that could help inspire other epistemic AI improvements.
Alongside software development, we’ve posted a series of articles about forecasting, epistemics, and effective altruism. Recently these have focused on the combination of AI and epistemics.
For 2025, we’re looking to expand more throughout the EA and AI safety ecosystems. We have a backlog of Squiggle updates to tell people about, and a long list of new things we expect people to like. We’ve so far focused on product experimentation and development, and would like to spend more time on education and outreach. In addition, we’ll probably continue focusing a lot on AI: both on improving AI systems that write and audit cost-effectiveness models and the like, and on helping build cost-effectiveness models to guide AI safety.
If you support this sort of work and are interested in chatting or donating, please reach out! You can reach me at ozzie@quantifieduncertainty.org. We’re very focused on helping the EA ecosystem, and would really like to diversify our base of close contacts and donors.
QURI is fiscally sponsored by Rethink Priorities. We have a simple donation page here.
I still think that EA Reform is pretty important. I believe that there’s been very little work so far on any of the initiatives we discussed here.
My impression is that the vast majority of money that CEA gets is from OP. I think that in practice, this means that they represent OP’s interests significantly more than I feel comfortable with. While I generally like OP a lot, I think OP’s focuses are fairly distinct from those of the regular EA community.
Some things I’d be eager to see funded:
- Work with CEA to find specific pockets of work that the EA community might prioritize, but OP wouldn’t. Help fund these things.
- Fund other parties to help represent / engage / oversee the EA community.
- Audit/oversee key EA funders (OP, SFF, etc.), as these often aren’t reviewed by third parties.
- Make sure that the management and boards of key EA orgs are strong.
- Make sure that many key EA employees and small donors are properly taken care of and are provided with support. (I think that OP has reason to neglect this area, as it can be difficult to square with naive cost-effectiveness calculations)
- Identify voices that want to tackle some of these issues head-on, and give them a space to do so. This could mean bloggers / key journalists / potential community leaders in the future.
- Help encourage or set up new EA organizations that sit apart from CEA but help oversee/manage the movement.
- Help out the Community Health team at CEA. This seems like a very tough job that could arguably use more support, some of which might be best done outside of CEA.
Generally, I feel like there’s a very significant vacuum of leadership and managerial visibility in the EA community. I think that this is a difficult area to make progress on, but also consider it much more important than other EA donation targets.
Thanks for bringing this up. I was unsure what terminology would be best here.
I mainly have in mind Fermi models and more complex but similar-in-theory estimations. But I believe this could extend gracefully to more complex models. I don’t know of many great “ontologies of types of mathematical models,” so I’m not sure how best to draw the line.
Here’s a larger list that I think could work:
- Fermi estimates
- Cost-benefit models
- Simple agent-based models
- Bayesian models
- Physical or social simulations
- Risk assessment models
- Portfolio optimization models
I think this framework is probably more relevant for models that estimate an existing or future parameter than for models that optimize some process, if that helps at all.
Ah, I didn’t quite notice that at the time; it wasn’t obvious from the UI (you need to hover over the date to see the time it was posted).
Anyway, happy this was resolved! Also, separately, kudos for writing this up. I’m looking forward to seeing where Metaculus goes over the next year+.
(The opening line was removed)
I feel like the bulk of this is interesting, but the title and opening come off as more grandiose than necessary.
That makes sense, but I’m feeling skeptical. There are just so many AI safety orgs now, and the technical ones generally aren’t even funded by OP.
For example: https://www.lesswrong.com/posts/9n87is5QsCozxr9fp/the-big-nonprofits-post
While a bunch of these salaries are on the high side, not all of them are.