First, it may make sense not to focus too much on the credits. The ranking has to bottom out somewhere, and that’s where the credits come in: they establish a track record for our donors. The ranking itself is better thought of as the level of endorsement of a project, weighted by the track record of the endorsing donors.
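To make that framing a bit more concrete, here’s a minimal sketch of “endorsement weighted by track record.” It’s purely illustrative: the data shapes, the linear weighting, and all names are my assumptions, not our actual scoring code.

```python
# Illustrative sketch only: a project's ranking score as the sum of donor
# endorsements (here, donation sizes) weighted by each donor's track record
# (here, a single number derived from past impact credits). The linear form
# and the data shapes are assumptions made for this example.

def project_score(endorsements, track_record):
    """endorsements: list of (donor, amount) pairs for one project.
    track_record: dict mapping donor -> track-record weight."""
    return sum(amount * track_record.get(donor, 0.0)
               for donor, amount in endorsements)

endorsements = [("alice", 1_000), ("bob", 250)]
track_record = {"alice": 1.4, "bob": 0.6}  # e.g. derived from past impact credits
print(project_score(endorsements, track_record))  # 1550.0
```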
We’re still thinking about how we want to introduce funding goals and thus some approximation of short-term marginal utility. At the moment all projects discount donations at the same rate. Ideally we’d be able to use something like the S-Process to generate marginal utility curves that discount the score “payout” that donors can get. I’ve experimented with funding goals of around $100k per project and 10x sharper discounts afterwards, but it didn’t make enough of a difference to justify the added complexity and assumptions. Maybe we’ll revive that feature as a configurable funding goal at some point. But there is also the fundamental problem that we don’t have access to complete lists of donations, so less popular, less well-maintained projects would seem to have higher marginal utility just because their donation records are more incomplete. That would be an annoying incentive to introduce. Those problems, paired with the unconvincing results of my experiments, have kept me from prioritizing this yet.
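For reference, here’s a minimal sketch of what that funding-goal experiment could look like. Only the $100k goal and the “10x sharper discounts afterwards” come from the above; the exponential shape, the base rate, and all names are assumptions.

```python
import numpy as np

# Sketch of the funding-goal experiment described above. Dollars below the
# goal are discounted at an assumed base rate; dollars past the goal are
# discounted 10x more sharply. Only the $100k goal and the 10x factor come
# from the text; everything else is an assumption for illustration.

GOAL = 100_000       # assumed per-project funding goal
BASE_RATE = 1e-5     # assumed discount rate per dollar already raised
SHARP_FACTOR = 10    # "10x sharper discounts afterwards"

def marginal_weight(raised):
    """Score 'payout' per marginal dollar, given dollars already raised."""
    below = np.minimum(raised, GOAL)
    above = np.maximum(raised - GOAL, 0)
    return np.exp(-BASE_RATE * below - SHARP_FACTOR * BASE_RATE * above)

def donation_payout(already_raised, amount, step=100):
    """Approximate the integral of the marginal weight over the donated amount."""
    dollars = np.arange(already_raised, already_raised + amount, step)
    return float(np.sum(marginal_weight(dollars) * step))

print(donation_payout(0, 10_000))        # early dollars count nearly at face value
print(donation_payout(150_000, 10_000))  # post-goal dollars are steeply discounted
```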
But when it comes to the credits, the instructions to the evaluators are probably a good guide:
“Imagine that you’re given a budget of 1,000 impact credits to allocate across projects (not artifacts). (a) Please allocate them to the projects in proportion to how impactful you think they were. (b) There’s no transaction cost and no change in marginal utility (the first credit is worth the same as the last to a project). (c) We’ll … average your scores, multiply the averages with the number of evaluators, normalize, and then allocate that product to minimize the impact of recusals.”
And further down: “Note that this is a purely retroactive evaluation. (a) You can ignore the tractability of producing an output since they’ve all been produced. (b) Likewise please ignore the cost at which the output was produced. (c) Do consider neglectedness, though, and consider how likely some equivalent output would’ve been produced anyway had it not been for the given project. (d) Consider the ex ante expected utility. A bullshit project mustn’t get a high score because it somehow got unpredictably lucky. (Fictional examples.)”
Like everything else in our evaluation, the credits are retroactive too, so they are not about the current margin. One reason to ignore costs is that we don’t have the data, though we might request or estimate it next time around. The other reason is that the donors to overly expensive projects have already been “punished” for their suboptimal investment through the opportunity cost that they’ve paid. Intuitively it seems to me that it would be double-counting to also reduce the credits they receive.
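Coming back to the aggregation in the first quoted paragraph (“average your scores, multiply the averages with the number of evaluators, normalize”), here’s one literal reading of it as a sketch. Treating recusals as missing scores and renormalizing to a fixed credit budget are my interpretation, not the actual pipeline.

```python
# One possible reading of the quoted aggregation, as a sketch. Recusals are
# represented as missing scores; averaging over the evaluators who did score a
# project and then scaling by the total number of evaluators effectively
# imputes the average for recused evaluators, which is one way to "minimize
# the impact of recusals". The renormalization target is an assumption.

def aggregate_credits(scores_by_evaluator, total_budget=1_000):
    """scores_by_evaluator: dict evaluator -> dict project -> credits;
    a recused evaluator simply omits that project."""
    n_evaluators = len(scores_by_evaluator)
    projects = {p for scores in scores_by_evaluator.values() for p in scores}
    raw = {}
    for project in projects:
        given = [s[project] for s in scores_by_evaluator.values() if project in s]
        raw[project] = (sum(given) / len(given)) * n_evaluators
    norm = total_budget / sum(raw.values())  # normalize back to the credit budget
    return {project: value * norm for project, value in raw.items()}

scores = {
    "eval_a": {"project_1": 600, "project_2": 400},
    "eval_b": {"project_1": 700},  # recused on project_2
}
print(aggregate_credits(scores))  # project_2 isn't penalized for the recusal
```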
So is it reasonable to interpret your process as saying FAR was similarly impactful to AI safety events over the last year?
AI Safety Events is one of the projects where we expanded the time window because they were on hiatus in early 2023; the events that got evaluated were from 2022. Otherwise, yes. (But just to be clear, this is about the retroactive evaluation results mentioned at the bottom of the post.)