A few thoughts.
I’m skeptical of using this for AI alignment. AI risk is already well funded, so if all it took was adding more resources or hitting a metric, existing orgs could just buy that directly.
I think the economic issues of AI risk lie less in a lack of legible, liquid resources and more in the difficulty of getting the AI field as a whole to cooperate and not race.
However, I still think pool-less quadratic funding is very exciting for donations to causes that have room for more funding (like direct charity or meta EA tooling).
I disagree with the strategic thinking section. People don’t think in terms of maximizing leverage, but in terms of maximizing good-to-yourself per $ spent. When other people donate after you, you spend slightly more than you already did, and you get a lot more public good “for free” (paid for by other people), which makes it worth it. And to the extent that people are more altruistic, they’ll generally fund these goods more rather than less.
Yeah, I agree that using this to further fund AI alignment wouldn’t help much. I’m less sure about “hitting the metric”: the thing is, we don’t have any good alignment metric right now. But if we somehow managed to build one, convincing AI labs to hit such a metric seems to me like the most feasible way to make the AI race safer. But yeah, building it would be really hard. Do you maybe have other ideas for how to make the AI race safer? Maybe it is possible to somehow turn them into a continuous value that the labs could coordinate to increase?
Re: strategic thinking. It may be true that most people won’t care much about their real leverage (they won’t consider the counterfactual where they donate less), but it definitely isn’t rational. So while it may more or less work, I wouldn’t want this system to give the impression that it tricks people into donating. And, more importantly, my main hope for this system is to facilitate cooperation between the most powerful agents (powerful states, future supercorporations, TAI systems), rather than individual people. I assume such powerful actors will consider what happens if they do not donate, and selfishly do what’s optimal for them.
Doesn’t the leverage go in both directions? Donating causes earlier people to pay more, but it also adds leverage for later people, such that you don’t know whether later people would’ve donated unless you also did.
Though maybe that depends on details of the system, like whether the leverage grows or shrinks with more donations. I think this might speak to your worry that it incentivizes donating later because that makes you pay less, but if actors are proper EV-maximizers, won’t they scale up their donation such that the expected payment/leverage is the same?
Seems like there are lots of strategies at play here, including donating several times. Making it work both for real-life humans with real-life problems and for TAI seems ambitious, though: they require very different incentives to work, and I imagine the designs end up significantly different. Interesting stuff!
You’re right, the leverage definitely goes both ways. The thing is, this later leverage will tend to be smaller than the one you get immediately. At least, this is how the system behaves in my naive simulations. The exception is when you expect some very big contributors to join later on; then the later leverage is bigger. So yeah, it’s a complicated situation, and I didn’t want to go into that in the post because it would get too bloated.
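To give a rough sense of what those simulations look like, here is a minimal toy sketch. It assumes a much-simplified pool-less rule (the project receives (sum of square roots of pledges)², and each donor pays their pledge scaled pro rata so payments cover that total), not the exact mechanism from the post, so treat the numbers as purely illustrative.

```python
import math

def total_funding(pledges):
    # Assumed pool-less quadratic rule: (sum of square roots of pledges)^2.
    return sum(math.sqrt(c) for c in pledges) ** 2

def payment(pledges, i):
    # Donor i's pro-rata share of the full cost (no external matching pool).
    return pledges[i] * total_funding(pledges) / sum(pledges)

def immediate_leverage(prior, pledge):
    # Extra funding raised per dollar the newest donor actually pays.
    gain = total_funding(prior + [pledge]) - total_funding(prior)
    return gain / payment(prior + [pledge], len(prior))

# The leverage a new $100 donor sees, as a function of how many equal donors
# already pledged -- equivalently, the leverage your pledge leaves on the
# table for whoever donates right after you.
for n_prior in [0, 1, 2, 5, 10]:
    print(n_prior, "prior donors ->",
          round(immediate_leverage([100.0] * n_prior, 100.0), 2))
```

In this toy version, the extra leverage your pledge adds for the next donor is a small increment compared to the leverage you collect immediately, which is the pattern I meant; again, the real mechanism may behave differently in its details.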
And yeah, humans and TAI may use different strategies, which complicates it further. This is why I’m not yet fully satisfied with this mechanism, and I will try to simplify it so that we don’t have to account for all those strategies.