I would recommend biting the decision theoretic bullet that this is not a problem. If you feel that negative outcomes are worse than positive outcomes of equal quantity, then adjust your units, they’re miscalibrated.
I’m on board with that, and the section that you’re quoting seems to express that. Or am I misunderstanding what you’re referring to? (The quoted section basically says that, e.g., +100 utility with 50% probability and −100 utility with 50% probability cancel out to 0 utility in expectation. So the positive and the negative side are weighed equally and the units are the same.)
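Written out with the illustrative ±100 figures, the quoted calculation is just:

$$\mathbb{E}[U] = 0.5 \cdot (+100) + 0.5 \cdot (-100) = 50 - 50 = 0$$

If negative outcomes were weighed more heavily than positive ones of the same size, the two terms wouldn’t cancel; that’s exactly the miscalibration of units you’re pointing at.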
Generally, this (yours) is also my critique of the conflict between prioritarianism and classic utilitarianism (or some formulations of those).
So would The Pot be like, an organization devoted especially to promoting integrity in the market? I’m not sure I can see why it would hold together.
Yeah, that’s how I imagine it. You mean it would just have a limited life expectancy like any company or charity? That makes sense. Maybe we could try to push to automate it and create several alternative implementations of it. Being able to pay people would also be great. Any profit that it could use to pay staff would detract from its influence, but that’s also a tradeoff one could make.
Oh, another idea of mine was to use Augur markets. But I don’t know enough about Augur markets yet to tell if there are difficulties there.
My Venture Granters design becomes relevant again. Investors just get paid a salary. Their career capital (ability to allocate funding) is measured in a play-currency. Selfish investors don’t apply. Unselfish investors are invited in and nurtured.
I still need to read it, but it’s on my reading list! Getting investments from selfish investors is a large part of my motivation. I’m happy to delay that to test all the mechanisms in a safe environment, but I’d like it to be the goal eventually when we deem it to be safe.
We (the Impact Certificate market convocation) just had a call, and we talked about this a bit, and we realized that most of this seems to crux on the question of whether there are any missing or underdeveloped public goods in AI.
Yeah, it would be interesting to get opinions of anyone else who is reading this.
So the way I understand this question is that there may be retro funders who reward free and open source software projects that have been useful. Lots of investors will be very quick and smart about ferreting out the long tail of the tens of thousands of tiny libraries that hold big systems like GPT-3 together. Say, maybe the training data for GPT-3 is extracted by custom software that relies on cchardet to detect the encoding of the websites it downloads whenever the encoding is undeclared, misdeclared, or ambiguously declared. That influx of funding would supercharge these tiny projects to the point where they do their jobs a lot better and accelerate development processes by 2–10x or so.
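To make the cchardet example concrete, here’s a minimal sketch of how such a tiny library typically sits inside a scraping pipeline (the `decode_page` wrapper and its fallback logic are hypothetical; only the `cchardet.detect` call is the library’s actual API):

```python
from typing import Optional

import cchardet  # small C-backed encoding detector; a typical "long-tail" dependency


def decode_page(raw_bytes: bytes, declared_encoding: Optional[str] = None) -> str:
    """Decode a downloaded page, falling back to detection when the declared
    encoding is missing, misdeclared, or ambiguous."""
    if declared_encoding:
        try:
            return raw_bytes.decode(declared_encoding)
        except (LookupError, UnicodeDecodeError):
            pass  # misdeclared encoding; fall back to detection
    guess = cchardet.detect(raw_bytes)  # e.g. {'encoding': 'WINDOWS-1252', 'confidence': 0.98}
    return raw_bytes.decode(guess["encoding"] or "utf-8", errors="replace")
```

Funding a dependency like this looks individually harmless, which is exactly what makes the aggregate effect on capabilities hard to see.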
Attributed impact, the pot, and aligned retro funders would first need to become aware of this (or similar hidden risks), and would then have to decide that software projects like that are risky and need to make a strong case for why they’re differentially more useful for safety or other net-positive work than for enhancing AI capabilities. But the risk is sufficiently hidden that this is the sort of thing where an unaligned funder with a lot of money might come in and skew the market in the direction of their goals.
The assumptions, as I see them, are:
Small software projects can be sufficiently risky.
The influence of unaligned funders who disregard this risk is great.
The unaligned funders cannot be reasoned with and either reject attributed impact or argue that small software projects are not risky.
They overwhelm the influence of the pot.
They overwhelm the influence of all other retro funders that software projects might cater to.
If we can make No Untracked Longterm Negative Externalities (NULNE) audits one of the default signatures, so that a nasty red cross logo shows on the cert until one has been acquired, that could establish a healthy culture of use.
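A minimal sketch of what that default display rule could look like (the cert and signature fields here are made up purely for illustration):

```python
from dataclasses import dataclass, field


@dataclass
class ImpactCert:
    title: str
    signatures: set = field(default_factory=set)  # audit types already acquired (hypothetical field)


def nulne_badge(cert: ImpactCert) -> str:
    """Default display rule: show the nasty red cross until a NULNE audit signature exists."""
    return "NULNE audit passed" if "NULNE" in cert.signatures else "✗ no NULNE audit"


cert = ImpactCert(title="Some open-source library")
print(nulne_badge(cert))   # ✗ no NULNE audit
cert.signatures.add("NULNE")
print(nulne_badge(cert))   # NULNE audit passed
```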
Yeah, that sounds sensible. Or make it impossible to display them anywhere in the first place without audit?