I’m interested in learning more!
How many active users do your tools currently have? Any examples of standout successes from these tools?
Do you have some metrics on impact/adoption within EA/AI safety orgs?
What’s the competitive landscape here? I’m slightly worried that this kind of initiative should be a for-profit and EA-independent
To do these questions full justice would take quite a while. I’ll give a quick-ish summary.
On Guesstimate and Squiggle, you can see some of the examples on the public websites. On Guesstimate, go here, then select “recent”. We get a bunch of public models by effective altruist users. Guesstimate is still much more popular than Squiggle, but I think Squiggle has a decent amount more potential in the long term. (It’s much more powerful and flexible, but takes more work to learn.)
I think Guesstimate is continuing to hold fairly steady, though it is slowly declining in use each year (we haven’t been improving it, mainly doing minimal maintenance).
With Squiggle, I understand that a fair amount of modeling is done with the public Playground, where we can’t measure activity very well. We do have metrics of use on Squiggle Hub, which does exist but is limited now.
I’d flag that these are somewhat specialized tools that are often used only on certain occasions. A bunch of orgs do modeling in specific batches, then don’t touch the models for a few months.
“examples of standout successes from these tools?” → Our largest one was the Global Unified Cost-Effectiveness Analysis (GUCEM) by the FTX Future Fund, in 2022. Leopold Aschenbrenner specifically did a lot of work making very comprehensive estimates of their funding. Frustratingly, after FTX collapsed, so too did this project.
We have not since had other users who have been as ambitious. We have had several users inside CEA, OP, and the LTFF. I’m not sure how much I can go into the specifics. I think most of this has been private so far; I hope more eventually becomes public.
In Michael Dickens’s recent post, he linked to a Squiggle model he used for some of his cost-effectiveness estimates.
Some models from CEA were linked in these posts. Ben West was into this when he was the interim CEO there.
https://forum.effectivealtruism.org/posts/xrQkYh8GGR8GipKHL/how-expensive-is-leaving-your-org-squiggle-model
https://forum.effectivealtruism.org/posts/4wNDqRPJWhoe8SnoG/cea-is-fundraising-and-funding-constrained
[Chart: Guesstimate activity]
[Chart: Squiggle Hub activity]
I think these results, by themselves, are not as impressive as I’d like. If that were all we were aiming for and had accomplished, and we were making a fairly ordinary web application, I’d consider this a minor success, but one with uncertain cost-effectiveness, especially given the opportunity cost of our team.
However, I’ll flag that:
- A lot of what we’ve been doing with Squiggle has been on the highly experimental end. I see this as a research project and an experiment to identify promising interventions, more than a direct value-adding internal tool so far. Through this lens, I think what we have now is much more impressive. We’ve developed a usable programming language with some unique features, a novel interactive environment, and a suite of custom visualizations, all of which are iterated on and open-source. We did this with a very small team (fewer than 2 FTEs, for less than 2 years), and a budget that is low for any serious technical venture. There are a bunch of experimental features we’ve been trying out but have not yet fully written about. Relative Value Functions was one such experiment that has seen some use, and that we are still excited to promote to other groups, though perhaps in different forms.
- A lot of what I’ve been doing has been thinking through and envisioning where forecasting/epistemics should go. If you look through our posts you can see a lot of this. I think we have one of the most ambitious and coherent visions for where we can encourage epistemic research and development. I see much of our tooling as a way to help clarify and experiment with these visions. Most of our writing is open and free.
- I’m not sure how much sense it makes to focus on increasing direct authorship of Guesstimate/Squiggle now. I think in the future, it’s very likely that a lot of Squiggle will be written by AIs, perhaps with some specialist analysts in-house. Training people to build these models is fairly high-cost. I’ve done several workshops now. I think they’ve gone decently, but I think it would take far more training to substantially increase the number of numeric cost-effectiveness models written across most EA orgs.
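“What’s the competitive landscape here? I’m slightly worried that this kind of initiative should be a for-profit and EA-independent” →
Yeah, I get that often. We think about it sometimes. I plan to continue to consider it, but I’d flag that there are a lot of points that make this less appealing than it might seem: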
1. I tried turning Guesstimate into a business and realized it would be an uphill battle, unless we heavily pivoted into something different.
2. The market for numeric tooling is honestly quite narrow and limited. You can maybe make a business, but it’s very hard to scale it.
3. If you go the VC route, they’ll heavily encourage you to pivot to specific products that make more money. Then you can get bought out / closed down, if it’s not growing very quickly. Causal went this route, then semi-pivoted to finance modeling, then got bought out.
4. It’s really hard to make a successful business, especially one with enough leeway to also support EAs and EA use cases. We’re a 2-person team, and I don’t think we have the capacity or skills to do this well now.
5. I want to be sure that we can quickly change focus to what’s most promising. For example, AI has been changing, and so too has what’s possible with tools built on AI. When you make a business, it can be easy to lock in to a narrow product.
6. I think that what EAs/Rationalists want is generally a fair bit different from what others want, and especially what other companies would pay a lot for. So it’s difficult to support both.
I hope that helps clarify things. Happy to answer other questions!