How to fix EA “community building”

Today, I mentioned to someone that I tend to disagree with others on some aspects of EA community building, and they asked me to elaborate further. Here’s what I sent them, very quickly written and only lightly edited:
Hard to summarize quickly, but here’s some loose gesturing in the direction:
We should stop thinking about “community building” and instead think about “talent development”. While building a community/culture is important and useful, the wording overall sounds too much like we’re inward-focused as opposed to trying to get important things done in the world.
We should focus on the object level (what’s the probability of an extinction-level pandemic this century?) over social reality (what does Toby Ord think is the probability of an extinction-level pandemic this century?).
We should talk about AI alignment, but also broaden our horizons to not-traditionally-core-EA causes to sharpen our reasoning skills and resist insularity. Example topics I think should be more present in talent development programs are optimal taxation, cybersecurity, global migration and open borders, 1DaySooner, etc.
Useful test: Does your talent development program make sense if EA didn’t exist? (I.e., is it helping people grow and do useful things, or is it just funnelling people according to shallow metrics?)
Based on personal experience and observations of others’ development, the same person can have a much higher or much lower impact depending on the cultural environment they’re embedded in, and the incentives they perceive to affect them. Much of EA talent development should be about transmitting a particular culture that has produced impressive results in the past (and avoiding cultural pitfalls that are responsible for some of the biggest fuckups of the last decade). Shaping culture is really important, and hard to measure, and will systematically be neglected by talent metrics, and avoiding this pitfall requires constantly reminding yourself of that.
Much of the culture is shaped by incentives (such as funding, karma, event admissions, etc.). We should be really deliberate in how we set these incentives.
I find it very interesting to think about the difference between what a talent development project and a community-building project would look like.
To be clear, are you saying your preference for the phrase ‘talent development’ over ‘community building’ is based on your concern that people hear ‘community building’ and think, ‘Oh, these people are more interested in investing in their community as an end in itself than they are in improving the world’?
I don’t know about Jonas, but I like this more from the self-directed perspective of “I am less likely to confuse myself about my own goals if I call it talent development.”
Thanks! So, to check I understand you, do you think when we engage in what we’ve traditionally called ‘community building’ we should basically just be doing talent development?
In other words, your theory of change for EA is talent development + direct work = arrival at our ultimate vision of a radically better world?[1]
Personally, I think we need a far more comprehensive social change portfolio.
E.g., a waypoint described by MacAskill as something like the below:
“(i) ending all obvious grievous contemporary harms, like war, violence and unnecessary suffering; (ii) reducing existential risk down to a very low level; (iii) securing a deliberative process for humanity as a whole, so that we make sufficient moral progress before embarking on potentially-irreversible actions like space settlement.”
Yes, this.
All of this looks fantastic and like it should have been implemented 10 years ago. This is not something to sleep on.
The only nitpick I have is with how object level vs. social reality is described. Lots of people are nowhere near ready to make difficult calculations; e.g., the experience of the COVID reopening makes it hard to predict that the probability of a pandemic lockdown in the next 5 years is 40%, even if that is the correct number. There are lots of situations where the division of labor is such that deferring to people at FHI etc. is the right place to start, since these predictions are really important and not about people giving their own two cents or just beginning to learn the ropes of forecasting, which is what happens all too often. (Of course, that shouldn’t get in the way of new information and models travelling upwards, or of fun community building/talent development workshops where people try out forecasting to see if it’s a good fit for them.)
Yeah, I disagree with this on my inside view—I think “come up with your own guess of how bad and how likely future pandemics could be, with the input of others’ arguments” is a really useful exercise, and seems more useful to me than having a good probability estimate of how likely it is. I know that a lot of people find the latter more helpful though, and I can see some plausible arguments for it, so all things considered, I still think there’s some merit to that.
EA Forum discourse tracks actual stakes very poorly
Examples:
There have been many posts about EA spending lots of money, but to my knowledge no posts about the failure to hedge crypto exposure against the crypto crash of the last year, or the failure to hedge Meta/Asana stock, or EA’s failure to produce more billion-dollar start-ups. EA spending norms seem responsible for $1m–$30m of 2022 expenses, but failures to preserve/increase EA assets seem responsible for $1b–$30b of 2022 financial losses, a ~1000x difference.
People are demanding transparency about the purchase of Wytham Abbey (£15m), but they’re not discussing whether it was a good idea to invest $580m in Anthropic (HT to someone else for this example). The financial difference is ~30x, the potential impact difference seems much greater still.
Basically I think EA Forum discourse, Karma voting, and the inflation-adjusted overview of top posts completely fail to track the importance of the ideas presented there. Karma seems useful for deciding which comments to read, but otherwise its use seems fairly limited.
(Here’s a related post.)
I think this is about scope of responsibility. To my knowledge, “EA” doesn’t own any Meta/Asana stock, didn’t have billions of (alleged) assets caught up in crypto, and didn’t pour $580MM into Anthropic. All that money belongs/belonged to private individuals or corporations (or so we thought...), and there’s arguably something both a bit unseemly and pointless about writing on a message board about how specific private individuals should conduct their own financial affairs.
On the other hand, EVF is a public charity—and its election of that status and solicitation of funds from the general public rightly call for a higher level of scrutiny of its actions vis-a-vis those of private individuals and for-profit corporations.
I don’t actually know the details, but as far as I know, EVF is primarily funded by private foundations/billionaires, too.
Also, some of this hedging could’ve been done by community members without actual ownership of Meta/Asana/crypto. Again, the lack of discussion of this seems problematic to me.
EVF and its US affiliate, CENTRE FOR EFFECTIVE ALTRUISM USA INC, are public charities. That means there is an indirect public subsidy (in terms of foregone tax revenues on money donated) by US and UK taxpayers somewhere in the ballpark of about 25% of donations. Based on the reports I linked, that seems to be about $10MM per year, probably more in recent years given known big grants. EVF also solicits donations from the general public on its website and elsewhere, which I don’t think is true of the big holder of Meta/Asana stock. (Good Ventures, which has somewhat favorable tax treatment as a private foundation, does not seem concentrated in this stock per the most recent 990-PF I could find.)
If an organization solicits from the public and accepts favorable tax treatment for its charitable status, the range of public scrutiny it should expect is considerably higher than for a private individual.
As far as I know, large philanthropic foundations often use DAFs to attain public charity status, getting the same tax benefits. And if they’re private foundations, they’re still getting a benefit of ~15%, and possibly a lot more via receiving donations of appreciated assets.
I also don’t think public charity status and tax benefits are especially relevant here. I think public scrutiny is not intrinsically important; I mainly care about taking actions that maximize social impact, and public scrutiny seems much worse for this than figuring out high-impact ways to preserve/increase altruistic assets.
“there’s arguably something both a bit unseemly and pointless about writing on a message board about how specific private individuals should conduct their own financial affairs”

I think you would care about this specific investment if you had more context (or at least I expect that you believe you would deserve to understand the argument). In some sense, this proves Jonas right.
“discussing whether it was a good idea to invest $580m in Anthropic (HT to someone else for this example). The financial difference is ~30x, the potential impact difference seems much greater still.”

There is a write-up specifically on this, which has been reviewed by some people. The author is now holding it back for ~4-6 weeks because they were requested to.
Yeah, I think making sure discussion of these topics (both Anthropic and Wytham) is appropriately careful seems good to me. E.g., the discussion of Wytham seemed very low-quality to me, with few contributors providing sound analysis of how to think about the counterfactuals of real estate investments.
At this point, I think it’s unfortunate that this post has not been published, a >2 month delay seems too long to me. If there’s anything I can do to help get this published, please let me know.
Actually, this particular post was drafted by a person who has been banned from the Forum, so I think it’s fine that it’s not published.
So we’re waiting 1.5 months to see if Anthropic was a bad idea? On the other hand: Wytham Abbey was purchased by EV / CEA and made the news. Anthropic is a private venture. If Anthropic shows up in an argument, I can just say I don’t have anything to do with that. But if someone mentioned that Wytham Abbey was bought by the guy who wrote the basics of how we evaluate areas and projects… I still don’t know what to say.
Requested to, by who/for what reason? Is this information you have access to?
Are there posts about those things which you think are under karma’d? My guess is the problem is more that people aren’t writing about them, rather than karma not tracking the importance of things which are written about. (At least in these two specific cases.)
There aren’t posts about them I think, but I’d also predict that they’d get less Karma if they existed.
Cool, fwiw I’d predict that a well-written Anthropic piece would get more than the 150 karma the Wytham post currently has, though I acknowledge that “well-written” is vague. Based on what this commenter says, we might get to test that prediction soon.
FWIW the Wytham Abbey post also received ~240 votes, and I doubt that a majority of downvotes were given for the reason that people found the general topic unimportant. Instead I think it’s because the post seemed written fairly quickly and in a prematurely judgemental way. So it doesn’t seem right to take the karma level as evidence that this topic actually didn’t get a ton of attention.
How do you see it got 240 votes?
Anyway I agree that I wrote it quickly and prematurely. I edited it to add my current thoughts.
You can see the number of votes by hovering your mouse above the karma.
Good point, I wasn’t tracking that the Wytham post doesn’t actually have that much Karma. I do think my claim would be correct regarding my first example (spending norms vs. asset hedges).
My claim might also be correct if your metric of choice was the sum of all the comment Karma on the respective posts.
Yeah, seems believable to me on both counts, though I currently feel more sad that we don’t have posts about those more important things than the possibility that the karma system would counterfactually rank those posts poorly if they existed.
A tangentially related point about example 1: Wow, it really surprises me that crypto exposure wasn’t hedged!
I can think of a few reasons why those hedges might be practically infeasible (possibilities: financial cost, counterparty risk of crypto hedge, relationship with donor, counterparty risk of donor). I take your point that it’d be appropriate if these sorts of things got discussed more, so I think I will write something on this hedging tomorrow. Thanks for the inspiration!
I agree that the hedges might be practically infeasible or hard. But my point is that this deserves more discussion and consideration, not that it was obviously easy to fix.
Ah, I see—I’ll edit it a bit for clarity then
Edit: should be better now
Doing Doing Good Better Better
The general idea: We need more great books!
I’ve been trying to find people willing and able to write quality books and have had a hard time finding anyone. “Doing Doing Good Better Better” seems like one of the highest-EV projects, and EA Funds (during my tenure) received basically no book proposals, as far as I remember. I’d love to help throw a lot of resources behind an upcoming book project by someone competent who isn’t established in the community yet.
The current canonical EA books—DGB, WWOTF, The Precipice—seem pretty good to me. But very few people have made serious attempts to write excellent EA books.
It generally seems to me that EA book projects have mainly been attempted by people (primarily men) who seem driven by prestige. I’m not sure if this is good—it means they’re going to be especially motivated to do a great job, but it seems uncorrelated with writing skill, so we’re probably missing out on a lot of talented writers.
In particular, I think there is still room for more broad, ambitious, canonical EA books, i.e. ones that can be given as a general EA introduction to a broad range of people, rather than a narrow treatment of e.g. niche areas in philosophy. I feel most excited about proposals that have the potential to become a canonical resource for getting talented people interested in rationality and EA.
Perhaps writing a book requires you to put yourself forward in a way that’s uncomfortable for most people, leaving only prestige-driven authors actually pursuing book projects? If true, I think this is bad, and I want to encourage people who feel shy about putting themselves out there to attempt it. If you like, you could apply for a grant and partly leave it to the grantmakers to decide if it’s a good idea.
A more specific take: Books as culture-building
What’s the secret sauce of the EA community? I.e., what are the key ideas, skills, cultural aspects, and other properties of the EA community that have been most crucial for its success so far?
My guess is that much of EA is about building a culture where people care unusually strongly about the truth, and unusually strongly about having an impact.
Current EA introduction books only embody this spirit to a limited degree. They talk about a wide range of topics that seem non-central to EA thinking, and they tend to start with a bottom line (“x-risk is 1 in 6 this century” / “longtermism should be a key moral priority”) and argue for it, instead of the other way around. (Kind of. I think this is a bit unfair, but directionally true.) They tend to hide the source of knowledge; they’re not especially reasoning-transparent. The LessWrong Sequences do this well, but they’re lengthy and don’t work for everyone. There’s a much wider range of rationality-flavored content that could be written.
If we hand out books for talent outreach purposes, it would be great if EA books could also do the cultural onboarding at the same time, not just introducing the reader to new ideas, but also new reasoning styles. This could reduce community dilution and improve EA culture.
The main concern might be that such a book won’t sell well. Maybe, I don’t know? HPMOR seems to be one of the most widely read (the most widely read?) EA and rationality books.
Some specific brainstorming ideas
A book about the world’s biggest problems, with a chapter for each plausibly important cause area, and some chapters on the philosophy (HT Damon / Fin? or someone else?)
A book that discusses a particular cause/issue and uses that to exemplify rationalist/EA-style reasoning, gesturing at the broader potential of the methodology.
A book that discusses historical and current deliberate attempts at large-scale-impact projects (e.g., Green Revolution, Pugwash Conferences, cage-free campaigns, 1DaySooner, …), both successes and failures (with lessons learnt), with a lot of insider war stories and anecdotes that allow you to understand how the sausage actually gets made.
Epistemic status: Have been thinking about this for about a year, spent 30 minutes writing it down, feel reasonably confident in the general idea but not the specifics. I wrote this quickly and may still make edits.
Consider radical changes without freaking out
As someone running an organization, I frequently entertain crazy alternatives, such as shutting down our summer fellowship to instead launch a school, moving the organization to a different continent, or shutting down the organization so the cofounders can go work in AI policy.
I think it’s important for individuals and organizations to have the ability to entertain crazy alternatives because it makes it more likely that they escape local optima and find projects/ideas that are vastly more impactful.
Entertaining crazy alternatives can be mentally stressful: it can cause you or others in your organization to be concerned that their impact, social environment, job, or financial situation is insecure. This can be addressed by pointing out why these discussions are important, by maintaining a clear mental distinction between brainstorming mode and decision-making mode, and by building a shared understanding that big changes will be made carefully.
Why considering radical changes seems important
The best projects are orders of magnitude more impactful than good ones. Moving from a local optimum to a global one often involves big changes, and the path isn’t always very smooth. Killing your darlings can be painful. The most successful companies and projects typically have reinvented themselves multiple times until they settled on the activity that was most successful. Having a wide mental and organizational Overton window seems crucial for being able to make pivots that can increase your impact several-fold.
When I took on leadership at CLR, we still had several other projects, such as REG, which raised $15 million for EA charities at a cost of $500k. That might sound impressive, but in the grand scheme of things raising a few million wasn’t very useful, given that the best money-making opportunities could make a lot more per staff per year and EA wasn’t funding-constrained anymore. It took me way too long to realize this, and it was only my successor who, after I left, stopped putting resources into the project. There’s a world where I took on leadership at CLR, realized that killing REG might be a good idea, seriously considered the idea, got input from stakeholders, and then went through with it, within a few weeks of becoming Executive Director. All the relevant information to make this judgment was available at the time.
When I took on leadership at EA Funds, I did much better: I quickly identified the tension between “raising money from a broad range of donors” and “making speculative, hits-based grants”, and suggested that perhaps these two aims should be decoupled. I still didn’t go through with it nearly as quickly as I could have, this time not because of limitations of my own reasoning, but more because I felt constrained by the large number of stakeholders who had expectations about what we’d be doing.
Going forward, I intend to be much more relentless about entertaining radical changes, even when they seem politically infeasible, unrealistic, or personally stressful. I also intend to discuss those with my colleagues, and make them aware of the importance of such thinking.
How not to freak out
Considering these big changes can be extremely stressful, e.g.:
The organization moving to a different continent could mean breaking up with your life partner or losing your job.
A staff member was excited about a summer fellowship but not a school, such that discussing setting up a school made them think there might not be a role at the organization that matches their interests anymore.
Despite this, I personally don’t find it stressful if I or others consider radical changes, partly because I use the following strategies:
Mentally flag that radical changes can be really valuable. Remind myself of my previous failings (listed above) and the importance of not repeating them. There’s a lot of upside to this type of reasoning! Part of the reason for writing this shortform post is so I can reference it in the future to contextualize why I’m considering big changes.
Brainstorm first, decide later (or “babble first, prune later”): During the brainstorming phase, all crazy ideas are allowed, and I (and my team) aim to explore novel ideas freely. We can always still decide against going through with big changes during the decision phase that happens later. A different way to put this is that considering crazy ideas must not be taken as strong evidence that they will actually be implemented. (For this to work, it’s important that your organization has a sound decision procedure that actually happens later and doesn’t mix the two stages. It’s also important to flag clearly that you’re in brainstorming mode, not in decision-making mode.)
Implement big changes carefully, and create common knowledge of that intention. Big changes should not be the result of naïve EV maximization, but should carefully take into account the full set of options (avoiding false dichotomies), the value of coordination (maximizing joint impact of the entire team, not just the decision-maker), externalities on other people/projects/communities, existing commitments, etc. Change management is hard; big changes should involve getting buy-in from the people affected by the change.
Related: Staring into the abyss as a core life skill
Should we be using Likelihood Ratios in everyday conversation the same way we use probabilities?
Disclaimer: Copy-pasting some Slack messages here, so this post is less coherent or well-written than others.
I’ve been thinking that perhaps we should be indicating likelihood ratios in everyday conversation to talk about the strength of evidence, the same way we indicate probabilities in everyday conversation to talk about beliefs; that there should be a likelihood ratio calibration game; and that we should have cached likelihood ratios for common types of evidence (e.g., experimental research papers of a given level of quality).
However, maybe this is less useful because different pieces of evidence are often correlated? Or can we just talk about the strength of the uncorrelated portion of additional evidence?
See also: Strong Evidence is Common
Example
Here’s an example with made-up numbers:
Question: Are minimum wages good or bad for low-skill workers?
Theoretical arguments that minimum wages increase unemployment, LR = 1:3
Someone sends an empirical paper and the abstract says it improved the situation, LR = 1.2:1
IGM Chicago Survey results, LR = 5:1
So if you start out with a 50% probability, your prior odds are 1:1. Multiplying the three likelihood ratios together (1 × 1.2 × 5 = 6 in favor vs. 3 × 1 × 1 = 3 against), your posterior odds after seeing all the evidence are 6:3 or 2:1, so your posterior probability is 67%.
If another person starts out with a 20% probability, their prior odds are 1:4, their posterior odds are 1:2, their posterior probability is 33%.
These two people agree on the strength of evidence but disagree on the prior. So the idea is that you can talk about the strength of the evidence / size of the update instead of the posterior probability (which might mainly depend on your prior).
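To make the arithmetic concrete, here is a minimal sketch in Python (the helper functions are just illustrative, not from any particular library):

```python
def prob_to_odds(p):
    """Convert a probability into odds (in favor : against)."""
    return p / (1 - p)

def odds_to_prob(odds):
    """Convert odds back into a probability."""
    return odds / (1 + odds)

def update(prior_prob, likelihood_ratios):
    """Multiply the prior odds by each likelihood ratio, then convert back."""
    odds = prob_to_odds(prior_prob)
    for lr in likelihood_ratios:
        odds *= lr
    return odds_to_prob(odds)

# Likelihood ratios from the example above: 1:3, 1.2:1, 5:1
lrs = [1 / 3, 1.2, 5]

print(update(0.5, lrs))  # ~0.67 (prior odds 1:1 -> posterior odds 2:1)
print(update(0.2, lrs))  # ~0.33 (prior odds 1:4 -> posterior odds 1:2)
```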
Calibration game
A baseline calibration game proposal:
You get presented with a proposition, and submit a probability. Then you receive a piece of evidence that relates to the proposition (e.g. a sentence from a Wikipedia page about the issue, or a screenshot of a paper/abstract). You submit a likelihood ratio, which implies a certain posterior probability. Then both of these probabilities get scored using a proper scoring rule.
My guess is that you can do something more sophisticated here, but I think the baseline proposal basically works.
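As a sketch of what scoring a single round could look like, assuming a log scoring rule (any proper scoring rule would work; the function names below are made up for illustration):

```python
import math

def log_score(p, outcome):
    """Log score for probability p of a proposition that turned out true or false."""
    return math.log(p if outcome else 1 - p)

def score_round(prior_prob, likelihood_ratio, outcome):
    """Score both the initial probability and the posterior implied by the submitted LR."""
    prior_odds = prior_prob / (1 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    posterior_prob = posterior_odds / (1 + posterior_odds)
    return log_score(prior_prob, outcome), log_score(posterior_prob, outcome)

# A player says 40% up front, then reports LR = 3:1 after seeing the evidence;
# the proposition turns out to be true.
print(score_round(0.4, 3.0, outcome=True))
```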
I really like the proposed calibration game! One thing I’m curious about is whether real-world evidence more often looks like a likelihood ratio or like something else (e.g. pointing towards a specific probability being correct). Maybe you could see this from the structure of priors + likelihood ratios + posteriors in the calibration game — e.g. check whether the long-run top scorers’ likelihood ratios correlate more or less than their posterior probabilities.
(If someone wanted to build this: one option would be to start with pastcasting and then give archived articles or wikipedia pages as evidence. Maybe a sophisticated version could let you start out with an old relevant wikipedia page, and then see a wikipedia page much closer to the resolution date as extra evidence.)
Interesting point, agreed that this would be very interesting to analyze!
Relevant calibration game that was recently posted - I found it surprisingly addictive—maybe they’d be interested in implementing your ideas.
Can you walk through the actual calculations here? Why did the Chicago survey shift the person from 1.2:1 to 5:1, and not a different ratio?
No, this is not a description of the absolute shift (i.e., not from 1.2:1 to 5:1) but of the relative shift (i.e., from 1:x to 5:x).
Yeah. Here’s the example in more detail:
Prior odds: 1:1
Theoretical arguments that minimum wages increase unemployment, LR = 1:3 → posterior odds 1:3
Someone sends an empirical paper and the abstract says it improved the situation, LR = 1.2:1 → posterior odds 1.2:3
IGM Chicago Survey results, LR = 5:1 → posterior odds 6:3 (or 2:1)
Ah yes, thank you, that clears it up.
Follow-up question: it seems like these likelihood ratios are fairly subjective. (Like, why is the LR for the Chicago survey 5:1 and not 10:1 or 20:1?) How can you calibrate the likelihood ratio when there is no “right answer”?
It’s the same as with probabilities. How can probabilities be calibrated, given that they are fairly subjective? The LR can be calibrated the same way given that it’s just a function of two probabilities.
You can check probability estimates against outcomes. If you make 5 different predictions and estimate a 20% probability of each, then if you are well calibrated then you expect 1 out of the 5 to happen. If all of them happened, you probably made a mistake in your predictions. I don’t think this is perfect (it’s impractical to test very low probability predictions like 1 in a million), but there is at least some level of empiricism available.
There is no similar test for likelihood ratios. A question like “what is the chance that the Chicago survey said minimum wages are fine if they actually aren’t” can’t be empirically tested.
There is also the question of whether people assign different strength to the same evidence. Maybe reporting why you think that the evidence is 1:3 rather than 1:1.5 or 1:6 would help.
Yeah exactly, that’s part of the idea here! E.g., on Metaculus, if someone posts a source and updates their belief, they could display the LR to indicate how much it updated them.
Note that bits might be better because you can sum them.
Yeah fair, although I expect people to have more difficulty converting log odds back into probabilities.
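To illustrate the point about bits: taking logs turns the multiplication of likelihood ratios into addition. A quick sketch, reusing the made-up numbers from the minimum-wage example above:

```python
import math

lrs = [1 / 3, 1.2, 5]                 # likelihood ratios from the example
bits = [math.log2(lr) for lr in lrs]  # evidence strength in bits

print(math.prod(lrs))  # ~2.0: raw ratios multiply
print(sum(bits))       # ~1.0: bits add (log2(2) = 1)
print(2 ** sum(bits))  # ~2.0: converting the summed bits back to a ratio
```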