# Misha_Yagudin(Misha Yagudin)

Karma: 638
• I think your comment (and particularly the first point) has much more to do with the difficulty of defining causality than with x-risks.

It seems natural to talk about force causing a mass to accelerate: when I push a sofa, I cause it to start moving. But Newtonian mechanics can’t capture causality, basically because the equality sign in $F = ma$ lacks direction. Similarly, it’s hard to capture causality in probability spaces.

Following Pearl, I have come to think that causality arises from the manipulator/manipulated distinction.

So I think it’s fair to speak about factors only with relation to some framing:

• If you are focusing on bio policy, you are likely to take great-power conflict as an external factor.

• Similarly, if you are focusing on preventing nuclear war between India and Pakistan, you are likely to take bioterrorism as an external factor.

Usually, there are multiple external factors in your x-risk modeling. The most salient and undesirable ones are important enough to care about (and to give a name).

Calling bio-risks an x-factor makes sense formally but doesn’t make sense pragmatically, because bio-risks are already very salient (in our community) as a canonical x-risk. So for me, part of the difference is that I started to care about x-risks first, and that I started to care about x-risk factors because of their relationship to x-risk.

• People shared so many bad experiences with debate…

I had a great time debating (BP style) in Russia a few years ago. I clearly remember some moments which helped me to become better at thinking/​speaking and world modeling:

• The initial feedback I got during the practice session is basically don’t be a guy from the terrible video you shared :-). Make it easy for a judge to understand your arguments: improve the structure and speak slower. Focus on one core argument during your speech: don’t squeeze multiple half-baked ideas in; deliver one but prove it fully.

• At my first tournament for newbies, an experienced debater gave a lecture on playing something-something resolutions and concluded with strongly recommending reading up on game theory (IIRC, The Strategy of Conflict and Governing the Commons).

• My second tournament was in Jedi format: I, an inexperienced Padawan, played with a skilled Jedi. I matched with a person because both of us liked LessWrong. I think we even managed to use “belief should pay rent” as part of an argument in a debate on the tyranny of the majority. I think it’s plausible that we referred to Moloch at least once.

• Later on, improvement came from managing inferential distances during speeches; and grounding arguments in reality: being specific about harms and benefits, delivering appropriate ~examples to support intermediate claims.

I think the experience was worth it. It helped me to think more in-depth and about many more issues than I would have otherwise (kinda like forecasting now). I quit because (a) tournaments are time-consuming; (b) I got bored playing social issues & identity politics.

While competitive debating is not about collaborative truth-seeking, in my experience, debaters are high cognitive decouplers. Arguing with them (outside of the game) felt good, and we were able to touch topics far outside of the default Overton window (like taking the perspective of ISIS).

The culture was healthy because most people were just passionate about debating/​grokking complex issues (like investor-state dispute settlements), and their incentives were not screwed up because the only upside to winning debate tournaments in Russia is internet points.

Upd: I feel that one of your main concerns is Goodharting. I think the BP system as we played it basically encouraged maximizing the expected utility of the impacts of the arguments you brought to the table, i.e. harm/benefit to an individual × scale × probability of occurring × how well you proved it (which can be seen as the probability that your reasoning is correct). It’s a bit harder to fit the importance of framing the issue and of principled arguments into my simplification. But the first can be seen as prioritizing based on relative tractability (e.g. in almost all debates, arguing that “we will save money by not implementing a policy” is a bad move because there are multiple other ways to save money and the benefits of the policy might be unique). The second is about the importance of metagame, incentive structures, commitments, and so on.
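To make the simplification above concrete, here is a minimal sketch of that product. The function name and all numbers are hypothetical illustrations, not anything from actual BP judging; it only shows why one fully proven argument can outweigh several half-baked ones under this scoring.

```python
def argument_weight(impact_per_person, people_affected, p_occurs, p_proof_holds):
    """Expected impact of one argument under the simplification:
    harm/benefit to an individual x scale x probability of occurring
    x probability that the reasoning is correct."""
    return impact_per_person * people_affected * p_occurs * p_proof_holds


# Hypothetical numbers: same claimed impact, different quality of proof.
half_baked = argument_weight(10, 1_000_000, 0.5, 0.1)
well_proven = argument_weight(10, 1_000_000, 0.5, 0.8)

# Three half-baked arguments still weigh less than one well-proven argument.
print(well_proven > 3 * half_baked)
```

Framing and principled arguments do not fit into this product, as noted above.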

• Thank you for engaging!

• First, “note that this [misha: Shapley value of evaluator] is just the counterfactual value divided by a fraction [misha: by two].” Right, this is exactly the same in my comment. I further divide by total impact to calculate the Shapley multiplier.

• Do you think we disagree?

• Why doesn’t my conclusion follow?

• Second, you conclude “And the Shapley value multiplier would be 1/(some estimates of how many players there are)”, while your estimate is “0.3 to 0.5”. There have been like 30 participants over two lotteries that year, so you should have ended up with something an order of magnitude less, like “3% to 10%”.

• Am I missing something?

• Third, for the model with more than two players, it’s unclear to me who the players are. If these are funders + evaluators. You indeed will end up with because

• Shapley multipliers should add up to 1, and

• Shapley value of the funders is easy to calculate (any coalition without them lacks any impact).

• Please note that is from the comment above.

• (Note that this model ignores that the beneficiary might win the lottery and no donations will be made.)
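The arithmetic behind the second point can be sketched as follows, under the assumption (mine, for illustration) that all $n$ participants are treated as interchangeable players. Symmetry and efficiency of the Shapley value then give each player an equal share of the total value $v(N)$:

$$
\phi_i = \frac{v(N)}{n}, \qquad \text{multiplier}_i = \frac{\phi_i}{v(N)} = \frac{1}{n}, \qquad n = 30 \;\Rightarrow\; \frac{1}{30} \approx 0.033 .
$$

That is roughly an order of magnitude below “0.3 to 0.5”.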

In the end,

• I think that it is necessary to estimate X in “shallowly evaluated giving is as impactful as X times in-depth evaluated giving”, because if $X \approx 1$, the impact of the evaluator is close to nil.

• I might not understand how you model impact here, please, be more specific about the modeling setup and assumptions.

• I don’t think that you should split evaluators. Well, basically because you want to disentangle the impact of evaluation and funding provision and not to calculate Adam’s personal impact.

• Like, take it to the extreme: it would be pretty absurd to say that an overwhelmingly successful donor lottery (e.g. one seeding a new ACE Top Charity in a yet unknown but highly tractable area of animal welfare, or discovering an AI alignment prodigy) had an impact less than an average comment because there have been too many people (100K) contributing a dollar to participate in it.

• Recently Nuño asked me to do similar (but shallower) forecasting for ~150 project ideas. It took me about 5 hours. I think I could have done the evaluation faster but I left ~paragraph-long comments on like to projects and sentence long comments on most others; I haven’t done any advanced modeling or guesstimating.

• Thank you, Nuno!

• Do I understand correctly that the Shapley value multiplier (0.3 to 0.5) is responsible for preventing double counting?

• If so, why don’t you apply it to Positive status effects? The effect was also partially enabled by the funding providers (maybe less so).

• Huh! I am surprised that your Shapley value calculation is not explicit but is reasonable.

• Let’s limit ourselves to two players (= funding providers who are only capable of shallow evaluations and grantmakers who are capable of in-depth evaluation but don’t have their own funds). You get a Shapley multiplier of $(1 - X)/2$ for the grantmakers, where shallowly evaluated giving is as impactful as $X$ times in-depth evaluated giving. Your estimate of “0.3 to 0.5” implies that shallowly evaluated giving is as impactful as “0 to 0.4” of in-depth evaluated giving.

• This x2.5..∞ multiplier is reasonable but doesn’t feel quite right to put 10% on above ∞ :)

• This makes me further confused about the gap between the donor lottery and the alignment review.
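The two-player setup above can be checked with a small brute-force Shapley calculation. This is a sketch under stated assumptions: the player names, the normalisation of in-depth evaluated giving to 1, and the value function are my illustrative choices, not the post's actual model.

```python
from itertools import permutations


def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orders) for p, t in totals.items()}


def evaluator_multiplier(x):
    """Evaluator's share of total impact when shallowly evaluated giving
    is worth x times in-depth evaluated giving (the latter normalised to 1)."""
    def value(coalition):
        if "funder" not in coalition:
            return 0.0  # assumption: without funds there is no impact
        return 1.0 if "evaluator" in coalition else x

    values = shapley_values(["funder", "evaluator"], value)
    return values["evaluator"] / value(frozenset({"funder", "evaluator"}))


for x in (0.0, 0.4, 1.0):
    print(x, round(evaluator_multiplier(x), 3))
```

Under these assumptions the evaluator's multiplier works out to (1 − x)/2, so multipliers of 0.5 and 0.3 correspond to x = 0 and x = 0.4, and the multiplier goes to zero as x approaches 1.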

• There are a lot of things I like about this post. From small (e.g. the summary on top of it; and the table at the end) to large (e.g. it’s a good thing to do given a desire to understand how to quantify/estimate impact better).

Here are some things I am perplexed about or disagree with:

• The EAF hiring round estimate misses the enormous realized value of information. As far as I can see, EAF decided to move to London (partly) because of that.

• > We moved to London (Primrose Hill) to better attract and retain staff and collaborate with other researchers in London and Oxford.

• > Budget 2020: $994,000 (7.4 expected full-time equivalent employees). Our per-staff expenses have increased compared with 2019 because we do not have access to free office space anymore, and the cost of living in London is significantly higher than in Berlin.

• The donor lottery evaluation seems to miss that $100K would have been donated otherwise.

• Further, I would suggest another decomposition.

• Impact = impact of running the donor lottery as a tool (as opposed to donating without ~aggregation) + the counterfactual impact of particular grants (as opposed to ~expected grants) + misc. side-effects (like a grantmaker joining LTFF).

• I can understand why you added the first two terms. But it seems to me that

• we can get a principled estimate about the first one based on arguments for donor lotteries (e.g. epistemic advantage coming from spending more time per dollar donated; and freed time of donors);

• One can get more empirical and have a quick survey here.

• estimating the second term is trickier because you need to make a guess about the impact of an average epistemically advantaged donation (as opposed to an average donation of $100K, which I think is missing from your estimate)

• Both of these are doable because we saw how other donor lottery winners gave their money and how wealthy/​invested donors give their money.

• A good proxy for the impact of an average donation might come from (a) EA survey donation data, (b) a quick survey of lottery participants. The latter seems superior because participating in an early donor lottery suggests a higher engagement with EA ideas &c.

• After thinking a bit longer: the choice of decomposition depends on what you want to understand better. It seems like your choice is better if you want to empirically understand whether the donor lottery is valuable.

• Another weird thing is to see the 2017 Donor Lottery Grant having x5..10 higher impact than 2018 AI Alignment Literature Review and Charity Comparison.

• I think it might come down to you not subtracting the counterfactual impact of donating $100K w/o a lottery from the donor lottery’s impact estimate.

• The basic source of impact of the donor lottery and the charity review is an epistemic advantage (someone dedicating more time to think about/evaluate donations; people being better informed about the charities they are likely to donate to). Given how well received the literature review is, it seems (quite likely) helpful to individual donors, and given that it (according to your guess) impacted $100K..1M, it should be kinda as impactful as, or more impactful than, an abstract donor lottery.

• And it’s hard to see this particular donor lottery as overwhelmingly more impactful than an average one.

• I like your 1–5 list.

Tangentially, I just want to push back a bit on 1 and 2 being obviously good. While I think that quantification is in general good, my forecasting experience taught me that quantitative estimates without a robust track record and/​or reasoning are quite unsatisfactory. I am a bit worried that misunderstanding of the Aumann agreement theorem might lead to overpraising communication of pure probabilities (which are often unhelpful).

• I agree that the mechanisms proposed in my comment are quite costly sometimes. But I think higher-effort downstream activities only need to be invoked occasionally (e.g. not everyone who downvotes needs to explain why but it’s good that someone will occasionally) — if they are invoked consistently they will be picked up by people.

Right, I think I see how this can backfire now. Maybe upvoting “ugh, I still think that this is likely but am uncomfortable about betting” might still encourage using qualifiers for reasons 1–3 while acknowledging vulnerability and reducing pressure on commenters?

• I mostly wanted to highlight that there is a confident but uncertain mode of communication. And that displaying uncertainty or lack of knowledge sometimes helps me be more relaxed.

People surely pick up bits of style from others they respect; so aspiring EAs are likely to adopt the manners of respected members of our community. It seems plausible to me that this will lead to the negative consequences you mentioned in the fifth paragraph (e.g. there is too much deference to authority for the amounts of cluelessness and uncertainty we have). I think a solution might be not in discouraging display of uncertainty but in encouraging positive downstream activities like betting, quantification, acknowledging that arguments changed your mind &c — likely this will make cargo culting less probable (a tangential example is encouraging people to make predictions when they say “my model is…”).

I agree underconfidence and anxiety could be confused on the forum. But not in real life as people leak clues about their inner state all the time.

• Hey Chi, let me report my personal experience: uncertainty and putting qualifiers feel quite different to me than anxious social signaling. The conversation in the beginning of Confidence all the way up points to the difference. You can be uncertain or potentially wrong, and be chill about it. Acknowledging uncertainty helps with (fear of) saying “oops, was wrong” and hence makes one more at ease.

• I’m back to being the #1 forecaster there, after having momentarily lost the position to user @Hinterhunter.

This happened in 2021 :P

• I think the evidence from the financial markets is a bit weaker.

First, let’s imagine predicting that the forecasting platform will stop operating and assume that forecasting is only incentivized by points on this platform. The reasonable prediction is that the platform will continue to operate because otherwise, points will become meaningless. The same goes for predicting existential risk (because if it occurs, one won’t be able to claim a prize).

A US collapse would be devastating for the financial markets (plausible to me unless the USA gradually loses power and importance, in which case interventions are less crucial). The incentives assumption seems plausible to me as well. So the market might not be a reliable predictor of it.

• I weakly downvoted. I felt meh about coming up with better acronyms because

• it feels low-fidelity and I would rather have people forget/​rephrase EA principles rather than learn them by heart;

• guiding principles should not be changed frequently and without great need.

Also, I disliked the proposed acronym because

• pro-life/​pro-choice associations;

• while choice is a generic word, it is associated with the choice/​obligation debate within the community, which makes it not a very good choice.

• Correlating subjective metrics with objective outcomes to provide better intuitions about what an additional point on a scale might mean. The resulting intuitions still suffer from “correlation ≠ causation” and all the curses of self-reported data (which, in my opinion, makes such measurements close to useless) but are a step forward.