I’m a theoretical CS grad student at Columbia specializing in mechanism design. I write a blog called Unexpected Values which you can find here: https://ericneyman.wordpress.com/. My academic website can be found here: https://sites.google.com/view/ericneyman/.
Eric Neyman
This is probably my favorite proposal I’ve seen so far, thanks!
I’m a little skeptical that warnings from the organization you propose would have been heeded (especially by people who didn’t have other sources of funding, for whom relying on FTX was the only option), but perhaps if the organization had sufficient clout, this would have put pressure on FTX to engage in less risky business practices.
I think this fails (1), but more confidently, I’m pretty sure it fails (2). How are you going to keep individuals from taking crypto money? See also: https://forum.effectivealtruism.org/posts/Pz7RdMRouZ5N5w5eE/ea-should-taboo-ea-should
I think my crux with this argument is “actions are taken by individuals”. This is true, strictly speaking; but when e.g. a member of U.S. Congress votes on a bill, they’re taking an action on behalf of their constituents, and affecting the whole U.S. (and often world) population. I like to ground morality in questions of a political philosophy flavor, such as: “What is the algorithm that we would like legislators to use to decide which legislation to support?”. And as I see it, there’s no way around answering questions like this one, when decisions have significant trade-offs in terms of which people benefit.
And often these trade-offs need to deal with population ethics. Imagine, as a simplified example, that China is about to deploy an AI that has a 50% chance of killing everyone and a 50% chance of creating a flourishing future of many lives like the one many longtermists like to imagine. The U.S. is considering deploying its own “conservative” AI, which we’re pretty confident is safe, and which will prevent any other AGIs from being built but won’t do much else (so humans might be destined for a future that looks like a moderately improved version of the present). Should the U.S. deploy this AI? It seems like we need to grapple with population ethics to answer this question.
(And so I also disagree with “I can’t imagine a reasonable scenario in which I would ever have the power to choose between such worlds”, insofar as you’ll have an effect on what we choose, either by voting or more directly than that.)
Maybe you’d dispute that this is a plausible scenario? I think that’s a reasonable position, though my example is meant to point at a cluster of scenarios involving AI development. (Abortion policy is a less fanciful example: I think any opinion on the question built on consequentialist grounds needs to either make an empirical claim about counterfactual worlds with different abortion laws, or else wrestle with difficult questions of population ethics.)
Does anyone have an estimate of how many dollars donated to the campaign are about equal in value to one hour spent phonebanking? Thanks!
I guess I have two reactions. First, which of the categories are you putting me in? My guess is you want to label me as a mop, but “contribute as little as they reasonably can in exchange” seems an inaccurate description of someone who’s strongly considering devoting their career to an EA cause; also I really enjoy talking about the weird “new things” that come up (like idk actually trade between universes during the long reflection).
My second thought is that while your story about social gradients is a plausible one, I have a more straightforward story about whom EA should accept, which I like more. My story is: EA should accept/reward people in proportion to (or rather, as a monotonically increasing function of) how much good they do.* For a group that tries to do the most good, this pretty straightforwardly incentivizes doing good! Sure, there are secondary cultural effects to consider—but I do think they should be thought of as secondary to doing good.
*You can also reward trying to do good to the best of each person’s ability. I think there’s a lot of merit to this approach, but it might create some not-great incentives of the form “always looking like you’re trying” (regardless of whether you really are trying effectively).
I may have misinterpreted what exactly the concept-shaped hole was. I still think I’m right about them having been surprised, though.
If it helps clarify, the community builders I’m talking about are some of the Berkeley(-adjacent) longtermist ones. As some sort of signal that I’m not overstating my case here, one messaged me to say that my post helped them plug a “concept-shaped hole”, a la https://slatestarcodex.com/2017/11/07/concept-shaped-holes-can-be-impossible-to-notice/
Great comment, I think that’s right.
I know that “give your other values an extremely high weight compared with impact” is an accurate description of how I behave in practice. I’m kind of tempted to bite that same bullet when it comes to my extrapolated volition—but again, this would definitely be biting a bullet that doesn’t taste very good (do I really endorse caring about the log of my impact?). I should think more about this, thanks!
Yup—that would be the limiting case of an ellipse tilted the other way!
The idea for the ellipse is that what EA values is correlated (but not perfectly) with my utility function, so (under certain modeling assumptions) the space of most likely career outcomes is an ellipse, see e.g. here.
Note that the y-axis is extrapolated volition, i.e. what I endorse/strive for. Extrapolated volition can definitely change—but I think by definition we prefer ours not to?
Note that covid travel restrictions may be a consideration. For example, New Zealand’s borders are currently closed to essentially all non-New Zealanders and are scheduled to remain closed to much of the world until July.
Historically, Oregon has had ~24 Republican senators vs. ~19 Democratic senators (and 1 independent), so partisan affiliation doesn’t seem that important.
A better way of looking at this is the partisan lean of his particular district. The answer is D+7, meaning that in a neutral environment (i.e. an equal number of Democratic and Republican votes nationally), a Democrat would be expected to win this district by 7 percentage points.
This year is likely to be a Republican “wave” year, i.e. Republicans are likely to outperform Democrats (the party out of power almost always overperforms in midterm elections); however, D+7 is a substantial lean that’s hard to overcome. I’d give Carrick a 75% chance of winning the general election conditional on winning the primary. His biggest challenge is winning the primary election.
Hi! I’m an author of this paper and am happy to answer questions. Thanks to Jsevillamol for the summary!
A quick note regarding the context in which the extremization factor we suggest is “optimal”: rather than taking a Bayesian view of forecast aggregation, we take a robust/”worst case” view. In brief, we consider the following setup:
(1) you choose an aggregation method.
(2) an adversary chooses an information structure (i.e. joint probability distribution over the true answer and what partial information each expert knows) to make your aggregation method do as poorly as possible in expectation (subject to the information structure satisfying the projective substitutes condition).
In this setup, the 1.73 extremization constant is optimal, i.e. maximizes worst-case performance.
That said, I think it’s probably possible to do even better by using a non-linear extremization technique. Concretely, I strongly suspect that the less variance there is in experts’ forecasts, the less it makes sense to extremize (because the experts have more overlap in the information they know). I would be curious to see how low a loss it’s possible to get by taking into account not just the average log odds, but also the variance in the experts’ log odds. Hopefully we will have formal results to this effect (together with a concrete suggestion for taking variance into account) sometime soon :)
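If it helps to see what this looks like concretely, here is a minimal Python sketch of the kind of aggregator being discussed: average the experts’ log odds, scale by an extremization factor, and convert back to a probability. The function name and the example numbers are just illustrative, not from the paper:

```python
import math

def extremize_average_log_odds(probs, c=1.73):
    """Average the experts' log odds, scale by the extremization factor c,
    and convert back to a probability. (Name and structure are illustrative.)"""
    log_odds = [math.log(p / (1 - p)) for p in probs]
    extremized = c * sum(log_odds) / len(log_odds)
    return 1 / (1 + math.exp(-extremized))

# Made-up example: three experts leaning YES.
print(extremize_average_log_odds([0.6, 0.7, 0.65]))          # ~0.75, more extreme than any single expert
print(extremize_average_log_odds([0.6, 0.7, 0.65], c=1.0))   # ~0.65: c = 1 is plain geometric mean of odds
```

A variance-aware version of the sort described above would presumably shrink c toward 1 when the experts’ log odds are close together.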
Thanks for putting this together; I might be interested!
I just want to flag that if your goal is to avoid internships, then (at least for American students) I think the right time to do this would be late May-early June rather than late June-early July as you suggest on the Airtable form. I think the most common day for internships to start is the day after Memorial Day, which in 2022 will be May 31st. (Someone correct me if I’m wrong.)
My understanding is that the Neoliberal Project is a part of the Progressive Policy Institute, a DC think tank (correct me if I’m wrong).
Are you guys trying to lobby for any causes, and if so, what has your experience been on the lobbying front? Are there any lessons you’ve learned that may be helpful to EAs lobbying for EA causes like pandemic preparedness funding?
There sort of is—I’ve seen some EAs use the light bulb emoji 💡 on Twitter (I assume this comes from the EA logo) -- but it’s not widely used, and it’s unclear to me whether it means “identifies as an EA” or “is a practicing EA” (i.e. donates a substantial percentage of their income to EA causes and/or does direct work on those causes).
I’m unsure whether I want there to be an easy way to “identify as EA”, since identities do seem to make people worse at thinking clearly. I’ve thought/written about this (in the context of a neoliberal identity too, as it happens), and my conclusion was basically that a strong EA identity would be okay so long as the centerpiece of the identity continues to be a question (“How can we do the most good?”) as opposed to any particular answer. I’m not sure how realistic that is, though.
Thanks for writing this up; I agree with your conclusions.
There’s a neat one-to-one correspondence between proper scoring rules and probabilistic opinion pooling methods satisfying certain axioms, and this correspondence maps Brier’s quadratic scoring rule to arithmetic pooling (averaging probabilities) and the log scoring rule to logarithmic pooling (geometric mean of odds). I’ll illustrate the correspondence with an example.
Let’s say you have two experts: one says 10% and one says 50%. You see these predictions and need to come up with your own prediction, and you’ll be scored using the Brier loss: (1 - x)^2, where x is the probability you assign to whichever outcome ends up happening (you want to minimize this). Suppose you know nothing about pooling; one really basic thing you can do is to pick an expert to trust at random: report 10% with probability 1⁄2 and 50% with probability 1⁄2. Your expected Brier loss in the case of YES is (0.81 + 0.25)/2 = 0.53, and your expected loss in the case of NO is (0.01 + 0.25)/2 = 0.13.
But, you can do better. Suppose you say 35% -- then your loss is 0.4225 in the case of YES and 0.1225 in the case of NO—better in both cases! So you might ask: what is the strategy that gives me the largest possible guaranteed improvement over choosing a random expert? The answer is linear pooling (averaging the experts). This gets you 0.49 in the case of YES and 0.09 in the case of NO (an improvement of 0.04 in each case).
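(Here’s a quick Python check of those numbers; nothing below comes from the paper, it’s just the arithmetic spelled out.)

```python
# Reproduce the Brier-loss numbers above: two experts at 10% and 50%.
experts = [0.10, 0.50]

def brier_loss(p, outcome):
    """Loss (1 - x)^2, where x is the probability assigned to the realized outcome."""
    x = p if outcome == "YES" else 1 - p
    return (1 - x) ** 2

for outcome in ["YES", "NO"]:
    random_expert = sum(brier_loss(p, outcome) for p in experts) / len(experts)
    linear_pool = brier_loss(sum(experts) / len(experts), outcome)  # average is 30%
    print(outcome, round(random_expert, 4), round(linear_pool, 4))
# YES: 0.53 vs 0.49    NO: 0.13 vs 0.09   (an improvement of 0.04 in each case)
```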
Now suppose you were instead being scored with a log loss—so your loss is -ln(x), where x is the probability you assign to whichever outcome ends up happening. Your expected log loss in the case of YES is (-ln(0.1) - ln(0.5))/2 ~ 1.498, and in the case of NO is (-ln(0.9) - ln(0.5))/2 ~ 0.399.
Again you can ask: what is the strategy that gives you the largest possible guaranteed improvement over this “choose a random expert” strategy? This time, the answer is logarithmic pooling (taking the geometric mean of the odds). This is 25%, which has a loss of 1.386 in the case of YES and 0.288 in the case of NO, an improvement of about 0.111 in each case.
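And the same check for the log-loss case, with logarithmic pooling implemented as the geometric mean of odds (the helper names are mine):

```python
import math

experts = [0.10, 0.50]

def log_loss(p, outcome):
    """Loss -ln(x), where x is the probability assigned to the realized outcome."""
    x = p if outcome == "YES" else 1 - p
    return -math.log(x)

def pool_log_odds(probs):
    """Logarithmic pooling: geometric mean of odds, converted back to a probability."""
    avg_log_odds = sum(math.log(p / (1 - p)) for p in probs) / len(probs)
    return 1 / (1 + math.exp(-avg_log_odds))

pooled = pool_log_odds(experts)  # 0.25
for outcome in ["YES", "NO"]:
    random_expert = sum(log_loss(p, outcome) for p in experts) / len(experts)
    print(outcome, round(random_expert, 3), round(log_loss(pooled, outcome), 3))
# YES: 1.498 vs 1.386    NO: 0.399 vs 0.288   (an improvement of ~0.111 in each case)
```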
(This works just as well with weights: say you trust one expert more than the other. You could choose an expert at random in proportion to these weights; the strategy that guarantees the largest improvement over this is to take the weighted pool of the experts’ probabilities.)
This generalizes to other scoring rules as well. I co-wrote a paper about this, which you can find here, or here’s a talk if you prefer.
What’s the moral here? I wouldn’t say that it’s “use arithmetic pooling if you’re being scored with the Brier score and logarithmic pooling if you’re being scored with the log score”; as Simon’s data somewhat convincingly demonstrated (and as I think I would have predicted), logarithmic pooling works better regardless of the scoring rule.
Instead I would say: the same judgments that would influence your decision about which scoring rule to use should also influence your decision about which pooling method to use. The log scoring rule is useful for distinguishing between extreme probabilities; it treats 0.01% as substantially different from 1%. Logarithmic pooling does the same thing: the pool of 1% and 50% is about 10%, and the pool of 0.01% and 50% is about 1%. By contrast, if you don’t care about the difference between 0.01% and 1% (“they both round to zero”), perhaps you should use the quadratic scoring rule; and if you’re already not taking distinctions between low and extremely low probabilities seriously, you might as well use linear pooling.
Cool idea! Some thoughts I have:
A different thing you could do, instead of trading models, is compromise by assuming that there’s a 50% chance that your model is right and a 50% chance that your peer’s model is right. Then you can do utility calculations under this uncertainty. Note that this would have the same effect as the one you desire in your motivating example: Alice would scrub surfaces and Bob would wear a mask.
This would, however, make utility calculations twice as difficult compared to just using your own model, since you’d need to compute the expected utility under each model. But note that this amount of computational intensity is already assumed by the premise that it makes sense for Alice and Bob to trade models. In order for Alice and Bob to reach this conclusion, each needs to compute their utility under each action in each of their models.
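Here’s a minimal sketch of what that calculation could look like, with completely made-up numbers (the risk levels, infection cost, and action costs below are just for illustration):

```python
# Each "model" gives the probability of getting infected via a route
# if no precaution is taken against that route. Numbers are made up.
alice_model = {"surface": 0.01, "air": 0.10}   # Alice: transmission is mostly airborne
bob_model   = {"surface": 0.10, "air": 0.01}   # Bob: transmission is mostly via surfaces

COST_OF_INFECTION = 100
ACTION_COSTS = {"scrub": 3, "mask": 3}
# Simplifying assumption: scrubbing removes all surface risk, masking removes all airborne risk.
BLOCKS = {"scrub": "surface", "mask": "air"}

def expected_loss(actions, model_weights):
    """Expected loss (action costs plus expected infection cost) under a mixture of models."""
    loss = sum(ACTION_COSTS[a] for a in actions)
    blocked_routes = {BLOCKS[a] for a in actions}
    for model, weight in model_weights:
        for route, p in model.items():
            if route not in blocked_routes:
                loss += weight * p * COST_OF_INFECTION
    return loss

def best_actions(model_weights):
    options = [set(), {"scrub"}, {"mask"}, {"scrub", "mask"}]
    return min(options, key=lambda acts: expected_loss(acts, model_weights))

print(best_actions([(alice_model, 1.0)]))                    # Alice's own model: {'mask'}
print(best_actions([(alice_model, 0.5), (bob_model, 0.5)]))  # 50/50 mixture: both 'mask' and 'scrub'
```

With these (made-up) numbers, scrubbing isn’t worth its cost under Alice’s own model, but it is under the 50/50 mixture; Bob’s situation is symmetric, so he starts wearing a mask.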
I would say that this is more epistemically sound than switching models with your peer, since it’s reasonably well-motivated by the notion that you are epistemic peers and could have ended up in a world where you had had the information your peer has and vice versa.
But the fundamental issue you’re getting at here is that reaching an agreement can be hard, and we’d like to make good/informed decisions anyway. This motivates the question: how can you effectively improve your decision making without paying the cost required by trying to reach an agreement?
One answer is that you can share partial information with your peer. For instance, maybe Alice and Bob decide that they will simply tell each other their best guess about the percentage of COVID transmission that is airborne and leave it at that (without trying to resolve subsequent disagreement). This is enough to, in most circumstances, cause each of them to update a lot (and thus be much better informed in expectation) without requiring a huge amount of communication.
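As one toy way to operationalize that update (my own simplification, not something from the post): if each of them treats the other’s best guess as an independent, unbiased-but-noisy estimate, a precision-weighted average is a natural way to combine the two numbers.

```python
def precision_weighted_update(my_guess, my_sd, their_guess, their_sd):
    """Combine two point estimates (e.g. % of transmission that is airborne),
    treating each as an independent noisy estimate; a big simplification."""
    my_prec, their_prec = 1 / my_sd**2, 1 / their_sd**2
    return (my_prec * my_guess + their_prec * their_guess) / (my_prec + their_prec)

# Made-up numbers: Alice's best guess is 80% airborne, Bob's is 30%.
print(precision_weighted_update(80, 15, 30, 15))  # 55.0: equally reliable peers meet in the middle
print(precision_weighted_update(80, 10, 30, 20))  # 70.0: Alice's estimate is treated as more reliable
```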
Which is better: acting as if each model is 50% to be correct, or sharing limited information and then updating? I think the answer depends on (1) how well you can conceptualize your peer’s model, (2) how hard updating is, and (3) whether you’ll want to make similar decisions in the future but without communicating. The sort of case when the first approach is better is when both Alice and Bob have simple-to-describe models and will want to make good COVID-related decisions in the future without consulting each other. The sort of case when the second approach is better is when Alice and Bob have difficult-to-describe models, but have pretty good heuristics about how to update their probabilities based on the other’s probabilities.
I started making a formal model of the “sharing partial information” approach and came up with an example of where it makes sense for Alice and Bob to swap behaviors upon sharing partial information. But ultimately this wasn’t super interesting because the underlying behavior was that they were updating on the partial information. So while there are some really interesting questions of the form “How can you improve your expected outcome the most while talking to the other person as little as possible”, ultimately you’re getting at something different (if I understand correctly) -- that adopting a different model might be easier than updating your own. I’d love to see a formal approach to this (and may think some more about it later!)
Yeah—I think it’s unlikely that Pact would become a really large player and have distortionary effects. If that happens, we’ll solve that problem when we get there :)
The broader point that the marginal dollar might be more valuable to one campaign than to another is an important one. You could try to deal with this by making an actual market, where the ratio at which people trade campaign dollars isn’t fixed at 1, but I think that will complicate the platform and end up doing more harm than good.
Great question—you absolutely need to take that into account! You can only bargain with people who you expect to uphold the bargain. This probably means that when you’re bargaining, you should weight “you in other worlds” in proportion to how likely they are to uphold the bargain. This seems really hard to think about and probably ties in with a bunch of complicated questions around decision theory.