# evelynciara’s Shortform

• Crazy idea: When charities apply for funding from foundations, they should be required to list 3-5 other charities they think should receive funding. Then, the grantmaker can run a statistical analysis to find orgs that are mentioned a lot and haven’t applied before, reach out to those charities, and encourage them to apply. This way, the foundation can get a more diverse pool of applicants by learning about charities outside their network.
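A rough sketch of what that statistical analysis could look like, assuming each application simply comes with a list of recommended charity names. The function name, the `min_mentions` threshold, and the organization names are all hypothetical placeholders, not anything a real grantmaker uses.

```python
from collections import Counter

def suggest_outreach(applications, past_applicants, min_mentions=2):
    """Return (charity, mention_count) pairs for charities that applicants
    recommend often but that have never applied themselves."""
    mentions = Counter()
    for applicant, recommended in applications.items():
        # Count each recommendation once per applicant and ignore self-mentions.
        for charity in set(recommended):
            if charity != applicant:
                mentions[charity] += 1
    known = set(past_applicants) | set(applications)
    candidates = [(name, count) for name, count in mentions.items()
                  if name not in known and count >= min_mentions]
    # Most-mentioned first; ties broken alphabetically for stable output.
    return sorted(candidates, key=lambda pair: (-pair[1], pair[0]))

# Hypothetical data: three applicants, each listing three other charities.
applications = {
    "Org A": ["Org X", "Org Y", "Org B"],
    "Org B": ["Org X", "Org Z", "Org A"],
    "Org C": ["Org X", "Org Y", "Org W"],
}
print(suggest_outreach(applications, past_applicants=["Org Z"]))
# [('Org X', 3), ('Org Y', 2)]
```

A real version would also need to deduplicate charity names and guard against applicants coordinating their lists, which this sketch ignores.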

• Great idea!

• “Quality-adjusted civilization years”

We should be able to compare global catastrophic risks in terms of the amount of time they make global civilization significantly worse and how much worse it gets. We might call this measure “quality-adjusted civilization years” (QACYs), or the quality-adjusted amount of civilization time that is lost.

For example, let’s say that the COVID-19 pandemic reduces the quality of civilization by 50% for 2 years. Then the QACY burden of COVID-19 is 0.5 × 2 = 1 QACY.

Another example: suppose climate change will reduce the quality of civilization by 80% for 200 years, and then things will return to normal. Then the total QACY burden of climate change over the long term will be 0.8 × 200 = 160 QACYs.

In the limit, an existential catastrophe would have a near-infinite QACY burden.
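The two examples above can be written as a tiny calculation. This is only a sketch: it assumes the quality loss is constant over the whole period, and the function name `qacy_burden` is made up for illustration.

```python
def qacy_burden(quality_loss, years):
    """QACYs lost = fractional loss in civilization quality x duration in years."""
    if not 0.0 <= quality_loss <= 1.0:
        raise ValueError("quality_loss must be a fraction between 0 and 1")
    return quality_loss * years

covid = qacy_burden(0.5, 2)      # 50% worse for 2 years
climate = qacy_burden(0.8, 200)  # 80% worse for 200 years
print(covid, climate)  # 1.0 160.0
```

A loss that varies over time would need a sum (or integral) of yearly losses instead of a single product.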

• I think we need to be careful when we talk about AI and automation not to commit the lump of labor fallacy. When we say that a certain fraction of economically valuable work will be automated at any given time, or that this fraction will increase, we shouldn’t implicitly assume that the total amount of work being done in the economy is constant. Historically, automation has increased the size of the economy, thereby creating more work to be done, whether by humans or by machines; we should expect the same to happen in the future. (Note that this doesn’t exclude the possibility of increasingly general AI systems performing almost all economically valuable work. This could very well happen even as the total amount of work available skyrockets.)

• An idea I liked from Owen Cotton-Barratt’s new interview on the 80K podcast: Defense in depth

If S, M, or L is any small, medium, or large catastrophe and X is human extinction, then the probability of human extinction is

P(X) = P(S) · P(M | S) · P(L | M) · P(X | L)

So halving any one of these probabilities (the probability of a small disaster occurring, the probability of a small disaster becoming a medium-sized one, and so on) would halve the probability of human extinction.
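A quick numerical check of that claim, assuming extinction can only be reached by escalating through each stage in turn (small, then medium, then large), so the overall probability is a product of conditional probabilities. The numbers are hypothetical and chosen only to make the arithmetic visible.

```python
def extinction_probability(p_small, p_medium_given_small,
                           p_large_given_medium, p_extinction_given_large):
    """Chain-rule product for an escalation S -> M -> L -> X."""
    return (p_small * p_medium_given_small
            * p_large_given_medium * p_extinction_given_large)

# Hypothetical probabilities, for illustration only.
baseline = extinction_probability(0.5, 0.2, 0.1, 0.05)
halved_one_layer = extinction_probability(0.5, 0.1, 0.1, 0.05)
halved_all_layers = extinction_probability(0.25, 0.1, 0.05, 0.025)

print(halved_one_layer / baseline)   # close to 0.5
print(halved_all_layers / baseline)  # close to 1/16
```

Halving a single layer halves the product; halving all four layers at once cuts it by a factor of 16, which is the appeal of defense in depth.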

• Nonprofit idea: YIMBY for energy

YIMBY groups in the United States (like YIMBY Action) systematically advocate for housing developments as well as rezonings and other policies to create more housing in cities. YIMBYism is an explicit counter-strategy to the NIMBY groups that oppose housing development; however, NIMBYism affects energy developments as well—everything from solar farms to nuclear power plants to power lines—and is thus an obstacle to the clean energy transition.

There should be groups that systematically advocate for energy projects (which are mostly in rural areas), borrowing the tactics of the YIMBY movement. Currently, when developers propose an energy project, they do an advertising campaign to persuade local residents of the benefits of the development, but there is often opposition as well.

• I thought YIMBYs were generally pretty in favor of this already? (Though not generally as high a priority for them as housing.) My guess is it would be easier to push the already existing YIMBY movement to focus on energy more, as opposed to creating a new movement from scratch.

• Yeah, I think that might be easier too. But YIMBY groups focus on housing in cities whereas most utility-scale energy developments are probably in suburbs or rural areas.

• Hmm, culturally YIMBYism seems much harder to do in suburbs/​rural areas. I wouldn’t be too surprised if the easiest ToC here is to pass YIMBY-energy policies on the state level, with most of the support coming from urbanites.

But sure, still probably worth trying.

• Yeah, good point. Advocating for individual projects or rezonings is so time-consuming, even in the urban housing context.

• I think an EA career fair would be a good idea. It could have EA orgs as well as non-EA orgs that are relevant to EAs (for gaining career capital or earning to give)

• EA Global normally has an EA career fair, or something similar

• On the difference between x-risks and x-risk factors

I suspect there isn’t much of a meaningful difference between “x-risks” and “x-risk factors,” for two reasons:

1. We can treat them the same in terms of probability theory. For example, if X is an “x-risk” and Y is a “risk factor” for X, then P(X | Y) > P(X). But we can also say that P(Y | X) > P(Y), because both statements are equivalent to P(X, Y) > P(X) P(Y). We can similarly speak of the total probability of an x-risk factor because of the law of total probability (e.g. P(Y) = P(Y | Z) P(Z) + P(Y | ¬Z) P(¬Z)) like we can with an x-risk.

2. Concretely, something can be both an x-risk and a risk factor. Climate change is often cited as an example: it could cause an existential catastrophe directly by making all of Earth unable to support complex societies, or indirectly by increasing humanity’s vulnerability to other risks. Pandemics might also be an example, as a pandemic could either directly cause the collapse of civilization or expose humanity to other risks.
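The symmetry in point 1 can be checked numerically. In this sketch, X stands for the x-risk and Y for the risk factor (labels assumed here), and the probabilities are hypothetical. The three comparisons always agree, because each one is a rearrangement of P(X, Y) > P(X) P(Y).

```python
def check_symmetry(p_x, p_y, p_xy):
    """Return the three equivalent comparisons for events X and Y.

    p_x and p_y are marginal probabilities; p_xy is the joint probability.
    """
    p_x_given_y = p_xy / p_y
    p_y_given_x = p_xy / p_x
    return (p_x_given_y > p_x, p_y_given_x > p_y, p_xy > p_x * p_y)

# Hypothetical numbers with P(X) = 0.1 and P(Y) = 0.3.
print(check_symmetry(0.1, 0.3, 0.06))  # (True, True, True): Y raises the risk of X
print(check_symmetry(0.1, 0.3, 0.01))  # (False, False, False): Y lowers it
```

So probability theory alone cannot say which of the two events is the "risk" and which is the "factor".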

I think the difference is that x-risks are events that directly cause an existential catastrophe, such as extinction or civilizational collapse, whereas x-risk factors are events that don’t have a direct causal pathway to x-catastrophe. But it’s possible that pretty much all x-risks are risk factors and vice versa. For example, suppose that humanity is already decimated by a global pandemic, and then a war causes the permanent collapse of civilization. We usually think of pandemics as risks and wars as risk factors, but in this scenario, the war is the x-risk because it happened last… right?

One way to think about x-risks that avoids this problem is that x-risks can have both direct and indirect causal pathways to x-catastrophe.

• I think your comment (and particularly the first point) has much more to do with the difficulty of defining causality than with x-risks.

It seems natural to talk about force causing the mass to accelerate: when I push a sofa, I cause it to start moving. But Newtonian mechanics can’t capture causality, basically because the equality sign in F = ma lacks direction. Similarly, it’s hard to capture causality in probability spaces.

Following Pearl, I have come to think that causality arises from the manipulator/manipulated distinction.

So I think it’s fair to speak about factors only with relation to some framing:

• If you are focusing on bio policy, you are likely to take great-power conflict as an external factor.

• Similarly, if you are focusing on preventing nuclear war between India and Pakistan, you are likely to take bioterrorism as an external factor.

Usually, there are multiple external factors in your x-risk modeling. The most salient and undesirable are important enough to care about them (and give them a name).

Calling bio-risks an x-risk factor makes sense formally, but not pragmatically, because bio-risks are already very salient (in our community) on their own as a canonical x-risk. So for me, part of the difference is that I started to care about x-risks first, and that I started to care about x-risk factors because of their relationship to x-risk.

• One thing the EA community should try doing is multinational op-ed writing contests. The focus would be op-eds advocating for actions or policies that are important, neglected, and tractable (although the op-eds themselves don’t have to mention EA); and by design, op-eds could be submitted from anywhere in the world. To make judging easier, op-eds could be required to be in a single language, but op-ed contests in multiple languages could be run in parallel (such as English, Spanish, French, and Arabic, each of which is an official language in at least 20 countries).

This would have two benefits for the EA community:

• It would be a cheap way to spread EA-aligned ideas in multiple countries. Also, the people writing the op-eds would know more about the political climates of the countries for which they are publishing them than the organizers of the contest would, and we can encourage them to tailor their messaging accordingly.

• It would also be a way to measure countries’ receptiveness to EA ideas. For example, if there were multiple submissions about immigration policy, we could use them to compare the receptiveness of different countries to immigration reforms that would increase global well-being.

• I think this is a great idea. A related idea I had is a competition for “intro to EA” pitches because I don’t currently feel like I can send my friends a link to a pitch that I’m satisfied with.

A simple version could literally just be an EA forum post where everyone comments an “intro to EA” pitch under a certain word limit, and other people upvote /​ downvote.

A fancier version could have a cash prize, narrowing down entries through EA forum voting, and then testing the top 5 through online surveys.

I think in a more general sense, we should create markets to incentivise and select persuasive writing on EA issues aimed at the public.

• That’s a great idea! I’ve been trying to find a good intro to EA talk for a while, and I recently came across the EA for Christians YouTube video about intro to EA. Though it leans toward the religious angle, it seemed like a pretty good intro for a novice. Would love to hear your thoughts about that. Here’s the link: https://youtu.be/Unt9iHFH5-E

• Possible outline for a 2-3 part documentary adaptation of The Precipice:

Part 1: Introduction & Natural Risks

• Introduce the idea that we are in a time of unprecedented existential risk, but that the future could be very good (Introduction and Chapter 1)

• Discuss natural risks (Chapter 3)

• Argue that man-made risks are greater and use this to lead to the next episode (Chapter 3)

Part 2: Human-Made Risks

• Well-known anthropogenic risks—nuclear war, climate change, other environmental damage (Chapter 4)

• Emerging technological risks—pandemics, AI, dystopia (Chapter 5)

• Existential risk and security factors (Chapter 6)

Part 3: What We Can Do

• Discuss actions society can take to minimize its existential risk (Chapter 7)

What this leaves out:

• Chapter 2 - mostly a discussion of the moral arguments for x-risk’s importance. Can assume that the audience will already care about x-risk at a less sophisticated level, and focus on making the case that x-risk is high and we sort of know what to do about it.

• The discussion of joint probabilities of x-risks in Chapter 6 - too technical for a general audience

Another way to do it would be to do an episode on each type of risk and what can be done about it, for example:

• Part 1: Introduction

• Part 2: Pandemic risk (timely because of COVID-19)

• Part 3: Risks from asteroids, comets, volcanoes, and supernovas

• Part 4: Climate change

• Part 5: Artificial intelligence

• and so on

Like the original book, I’d want the tone of the documentary to be authoritative and hopeful and not lean on fear.

• I just listened to Andrew Critch’s interview about “AI Research Considerations for Human Existential Safety” (ARCHES). I took some notes on the podcast episode, which I’ll share here. I won’t attempt to summarize the entire episode; instead, please see this summary of the ARCHES paper in the Alignment Newsletter.

• We need to explicitly distinguish between “AI existential safety” and “AI safety” writ large. Saying “AI safety” without qualification is confusing for both people who focus on near-term AI safety problems and those who focus on AI existential safety problems; it creates a bait-and-switch for both groups.

• Although existential risk can refer to any event that permanently and drastically reduces humanity’s potential for future development (paraphrasing Bostrom 2013), ARCHES only deals with the risk of human extinction because it’s easier to reason about and because it’s not clear what other non-extinction outcomes are existential events.

• ARCHES frames AI alignment in terms of delegation from m ≥ 1 human stakeholders (such as individuals or organizations) to n ≥ 1 AI systems. Most alignment literature to date focuses on the single-single setting (one principal, one agent), but such settings in the real world are likely to evolve into multi-principal, multi-agent settings. Computer scientists interested in AI existential safety should pay more attention to the multi-multi setting relative to the single-single one for the following reasons:

• There are commercial incentives to develop AI systems that are aligned with respect to the single-single setting, but not to make sure they won’t break down in the multi-multi setting. A group of AI systems that are “aligned” with respect to single-single may still precipitate human extinction if the systems are not designed to interact well.

• Single-single delegation solutions feed into AI capabilities, so focusing only on single-single delegation may increase existential risk.

• What alignment means in the multi-multi setting is more ambiguous because the presence of multiple stakeholders engenders heterogeneous preferences. However, predicting whether humanity goes extinct in the multi-multi setting is easier than predicting whether a group of AI systems will “optimally” satisfy a group’s preferences.

• Critch and Krueger coin the term “prepotent AI” to refer to an AI system that is powerful enough to transform Earth’s environment at least as much as humans have and where humans cannot effectively stop or reverse these changes. Importantly, a prepotent AI need not be an artificial general intelligence.

• I think partnering with local science museums to run events on EA topics could be a great way to get EA-related ideas out to the public.

• That’s a pretty cool idea

• NYC is adopting ranked-choice voting for the 2021 City Council election. One challenge will be explaining the new voting system, though.

• # Some rough thoughts on cause prioritization

• I’ve been tying myself up in knots about what causes to prioritize. I originally came back to effective altruism because I realized I had gotten interested in 23 different causes and needed to prioritize them. But looking at the 80K problem profile page (I am fairly aligned with their worldview), I see at least 17 relatively unexplored causes that they say could be as pressing as the top causes they’ve created profiles for. I’ve taken a stab at one of them: making surveillance compatible with privacy, civil liberties, and public oversight.

• I’m sympathetic to this proposal for how to prioritize given cluelessness. But I’m not sure it should dominate my decision making. It also stops feeling like altruism when it’s too abstracted away from the object-level problems (other than x-risk and governance).

• I’ve been seriously considering just picking causes from the 80K list “at random.”

• By this, I mean I could just pick a cause from the list that seems more neglected, “speaks to me” in some meaningful way, and that I have a good personal fit for. Many of the more unexplored causes on the 80K list seem more neglected, like one person worked on it just long enough to write one forum post (e.g. risks from malevolent actors).

• It feels inherently icky because it’s not really taking into account knowledge of the scale of impact, and it’s the exact thing that EA tells you not to do. But: MIRI calls it quantilizing, or picking an action at random from the top x% of actions one could do. They think it’s a promising alternative to expected utility maximization for AI agents, which makes me more confident that it might be a good strategy for clueless altruists too.
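Here is a minimal sketch of that idea applied to cause selection. It is a simplification: it samples uniformly from the top fraction of a finite list, whereas the quantilizer in MIRI's work is defined relative to a base distribution over actions. The cause names and scores are hypothetical.

```python
import random

def quantilize(actions, utility, top_fraction=0.1, rng=random):
    """Pick uniformly at random from the top `top_fraction` of actions,
    ranked by estimated utility (always keeps at least one action)."""
    if not actions:
        raise ValueError("need at least one action")
    ranked = sorted(actions, key=utility, reverse=True)
    cutoff = max(1, int(len(ranked) * top_fraction))
    return rng.choice(ranked[:cutoff])

# Hypothetical scores: a higher number means a more promising cause.
scores = {f"cause_{i}": i for i in range(10)}
rng = random.Random(0)  # seeded so the example is reproducible
picks = {quantilize(list(scores), scores.get, top_fraction=0.3, rng=rng)
         for _ in range(100)}
print(sorted(picks))
```

The point of the random draw is that the rough scores only need to get a cause into the top tier; they don't need to be trusted for the exact ranking within it.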

• Some analogies that I think support this line of thinking:

• In 2013, the British newspaper The Observer ran a contest between professional investment managers and… a cat throwing a toy at a dartboard to pick stocks. The cat won. According to the efficient market hypothesis, investors are clueless about what investing opportunities will outperform the pack, so they’re unlikely to outperform an index fund or a stock-picking cat. If we’re similarly clueless about what’s effective in the long term, then maybe the stochastic approach is fine.

• One strategy for dimensionality reduction in machine learning and statistics is to compress a high-dimensional dataset into a lower-dimensional space that’s easier to compute with by creating a random projection. Even though the random projection doesn’t take into account any information in the dataset (like PCA does), it still preserves most of the information in the dataset most of the time.
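To make the random projection analogy concrete, here is a small standard-library sketch. It projects a few hypothetical 500-dimensional points down to 100 dimensions with a Gaussian random matrix and compares pairwise distances before and after. The dimensions, seeds, and data are arbitrary choices for illustration.

```python
import math
import random

def random_projection(points, target_dim, seed=0):
    """Project points to target_dim dimensions with a Gaussian random matrix."""
    rng = random.Random(seed)
    source_dim = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    # The matrix is drawn without looking at the data at all.
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(source_dim)]
              for _ in range(target_dim)]
    return [[sum(m * x for m, x in zip(row, point)) for row in matrix]
            for point in points]

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

data_rng = random.Random(1)
points = [[data_rng.random() for _ in range(500)] for _ in range(4)]
projected = random_projection(points, target_dim=100)

# Ratio of projected distance to original distance for every pair of points.
ratios = [distance(projected[i], projected[j]) / distance(points[i], points[j])
          for i in range(4) for j in range(i + 1, 4)]
print([round(r, 2) for r in ratios])
```

The ratios should land near 1, which is the sense in which the projection keeps most of the geometric information even though it ignores the data when it is drawn.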

• I’ve also been thinking about going into EA community building activities (such as setting up an EA/​public interest tech hackathon) so I can delegate, in expectation, the process of thinking about which causes are promising to other people who are better suited to doing it. If I did this, I would most likely still be thinking about cause prioritization, but it would allow me to stretch that thinking over a longer time scale than if I had to do it all at once before deciding on an object-level cause to work on.

• Even though I think AI safety is a potentially pressing problem, I don’t emphasize it as much because it doesn’t seem constrained by CS talent. The EA community currently encourages people with CS skills to go into either AI technical safety or earning to give. Direct work applying CS to other pressing causes seems more neglected, and it’s the path I’m exploring.

• Tentative thoughts on “problem stickiness”

When it comes to comparing non-longtermist problems from a longtermist perspective, I find it useful to evaluate them based on their “stickiness”: the rate at which they will grow or shrink over time.

A problem’s stickiness is its annual growth rate. So a problem has positive stickiness if it is growing, and negative stickiness if it is shrinking. For long-term planning, we care about a problem’s expected stickiness: the annual rate at which we think it will grow or shrink. Over the long term—i.e. time frames of 50 years or more—we want to focus on problems that we expect to grow over time without our intervention, instead of problems that will go away on their own.

For example, global poverty has negative stickiness because the poverty rate has declined over the last 200 years. I believe its stickiness will continue to be negative, barring a global catastrophe like climate change or World War III.

On the other hand, farm animal suffering has not gone away over time; in fact, it has gotten worse, as a growing number of people around the world are eating meat and dairy. This trend will continue at least until alternative proteins become competitive with animal products. Therefore, farm animal suffering has positive stickiness. (I would expect wild animal suffering to also have positive stickiness due to increased habitat destruction, but I don’t know.)

The difference in stickiness between these problems motivates me to focus more on animal welfare than on global poverty, although I’m still keeping an eye on and cheering on actors in that space.

I wonder which matters more, a problem’s “absolute” stickiness or its growth rate relative to the population or the size of the economy. But I care more about differences in stickiness between problems than the numbers themselves.
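A quick way to see why differences in stickiness matter over 50 years is to compound the growth rate. The rates below are hypothetical placeholders (not estimates for poverty or animal farming), and the function name is made up for this sketch.

```python
def projected_size(current_size, stickiness, years):
    """Size of a problem after `years` of compounding at annual rate `stickiness`."""
    return current_size * (1.0 + stickiness) ** years

# Two problems of equal size today: one shrinking 2% a year, one growing 2% a year.
shrinking = projected_size(1.0, -0.02, 50)
growing = projected_size(1.0, 0.02, 50)
print(round(shrinking, 2), round(growing, 2))  # 0.36 2.69
```

Under these assumed rates, two problems that start out the same size differ by a factor of about seven after 50 years, so even small differences in stickiness compound into large differences in scale.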

• We’re probably surveilling poor and vulnerable people in developing and developed countries too much in the name of aiding them, and we should give stronger consideration to the privacy rights of aid recipients. Personal data about these people collected for benign purposes can be weaponized against them by malicious actors, and surveillance itself can deter people from accessing vital services.

“Stop Surveillance Humanitarianism” by Mark Latonero

Automating Inequality by Virginia Eubanks makes a similar argument regarding aid recipients in developed countries.

• Interesting op-ed! I wonder to what extent these issues are present in work being done by EA-endorsed global health charities; my impression is that almost all of their work happens outside of the conflict zones where some of these privacy concerns are especially potent. It also seems like these charities are very interested in reaching high levels of usage/​local acceptance, and would be unlikely to adopt policies that deter recipients unless fraud concerns were very strong. But I don’t know all the Top Charities well enough to be confident of their policies in this area.

This would be a question worth asking on one of GiveWell’s occasional Open Threads. And if you ask it on Rob Mather’s AMA, you’ll learn how AMF thinks about these things (given Rob’s response times, possibly within a day).

• Thank you for sharing this! I took a class on surveillance and privacy last semester, so I already have basic knowledge about this subject. I agree that it’s important to reject false tradeoffs. Personally, my contribution to this area would be in formulating a theory of privacy that can be used to assess surveillance schemes in this context.

• Shafi Goldwasser at Berkeley is currently working on some definitions of privacy and their applicability for law. See this paper or this talk. In a talk she gave last month she talked about how to formalize some aspects of law related to cryptographic concepts to formalize “the right to be forgotten”. The recording is not up yet, but in the meantime I paste below my (dirty/​partial) notes from the talk. I feel somewhat silly for not realizing the possible connection there earlier, so thanks for the opportunity to discover connections hidden in plain sight!

Shafi is working directly with judges, and this whole program is looking potentially promising. If you are seriously interested in pursuing this, I can connect you to her if that would help. Also, we have someone in our research team at EA Israel doing some work into this (from a more tech/​crypto solution perspective) so it may be interesting to consider a collaboration here.

The notes-

## “What Crypto can do for the Law?”—Shafi Goldwasser 30.12.19:

• There is a big language barrier between Law and CS, following a knowledge barrier.

• People in law study the law of governing algorithms, but there is not enough participation of computer scientists to help legal work.

• But, CS can help with designing algorithms and formalizing what these laws should be.

• Shafi suggests a crypto definition for “The right to be forgotten”. This should help

• Privacy regulation like CCPA and GDPR have a problem—how to test whether one is compliant?

• Do our cryptographic techniques satisfy the law?

• that requires a formal definition

• A first suggestion:

• after deletions, the state of the data collector and the history of the interaction with the environment should be similar to the case where information was never changed. [this is clearly inadequate—Shafi aims at starting a conversation]

• Application of cryptographic techniques

• History Oblivious Data Structure

• Data Summarization using Differential Privacy leaves no trace

• ML Data Deletion

• I’m excited about Open Phil’s new cause area, global aid advocacy. Development aid from rich countries could be used to serve several goals that many EAs care about:

• Economic development and poverty reduction

• Public health and biosecurity, including drug liberalization

• Promoting liberal democracy

• Climate change mitigation and adaptation

Also, development aid can fund a combination of randomista-style and systemic interventions (such as building infrastructure to promote growth).

The United States has two agencies that provide development aid: USAID, which provides grants and technical assistance, and the U.S. International Development Finance Corporation (DFC), which provides equity and debt financing. I’d like to see both of these agencies strengthened and expanded with more funding.

I’m especially excited about the DFC because it was created in 2019 as a counterweight to China’s Belt and Road Initiative (BRI). China has used the BRI to buy influence in regions such as Africa, so aid-receiving countries don’t criticize its authoritarian policies. Development finance institutions run by liberal democracies, like the DFC, can be used to peel developing countries away from China and make them more likely to support global democracy promotion initiatives.

• Making specialty meats like foie gras using cellular agriculture could be especially promising. Foie gras traditionally involves fattening ducks or geese by force-feeding them, which is especially ethically problematic (although alternative production methods exist). It could probably be produced by growing liver and fat cells in a medium without much of a scaffold, which would make it easier to develop.

• This sounds plausible to me, and there’s already at least one company working on this, but I’m actually pretty confused about what goes into foie gras. Like do we really think just having liver and fat cells will be enough, or are there weird consistency/​texture criteria that foie gras eaters really care about?

Would be excited to hear more people chime in with some expertise, eg if they have experience working in cellular agriculture or are French.

• AOC’s Among Us stream on Twitch nets $200K for coronavirus relief

“We did it! $200k raised in one livestream (on a whim!) for eviction defense, food pantries, and more. This is going to make such a difference for those who need it most right now.” — AOC’s Tweet

Video game streaming is a popular way to raise money for causes. We should use this strategy to fundraise for EA organizations.

• It’s difficult to raise money through streaming unless you already have a popular stream. I ran a charity stream for an audience of a few hundred people for three hours and raised roughly $150, and I may be the most popular video game streamer in the community (though other people with big audiences from elsewhere could probably create bigger streams than mine without much effort).

If anyone reading this is in contact with major streamers, it might be worth reaching out, but that can easily go wrong if the streamer has a charity they already feel committed to (so be cautious).

• Do emergency universal pass/fail policies improve or worsen student well-being and future career prospects?

I think a natural experiment is in order. Many colleges are adopting universal pass/fail grading for this semester in response to the COVID-19 pandemic, while others aren’t. Someone should study the impact this will have on students to inform future university pandemic response policy.

• When suggestions of this type come up, especially for causes that don’t have existing EA research behind them, my recommended follow-up is to look for people who study this as normal academics (here, “this” would be “ways that grades and grading policy influence student outcomes”). Then, write to professors who do this work and ask if they plan on taking advantage of the opportunity (here, the natural experiment caused by new grading policies).

There’s a good chance that the people you write to will have had this idea already (academics who study a subject are frequently on the lookout for opportunities of this kind, and the drastic changes wrought by COVID-19 should be increasing the frequency with which people think about related studies they could run). And if they haven’t, you have the chance to inspire them!

Writing to random professors could be intimidating, but in my experience, even when I’ve written emails like this as a private citizen without a .edu email address, I frequently get some kind of response; people who’ve made research their life’s work are often happy to hear from members of the public who care about the same odd things they do.

• Thanks for the suggestion! I imagine that most scholars are reeling from the upheavals caused by the pandemic response, so right now doesn’t feel like the right time to ask professors to do anything. What do you think?

• Maybe a better question for late May or early June, when classes are over.

• I think that’s probably true for those working directly on the pandemic, but I’m not sure education researchers would mind being bothered. If anything they might welcome the distraction.

• I think improving bus systems in the United States (and probably other countries) could be a plausible Cause X.

Importance: Improving bus service would:

• Increase economic output in cities

• Dramatically improve quality of life for low-income residents

• Reduce cities’ carbon footprint, air pollution, and traffic congestion

Neglectedness: City buses probably don’t get much attention because most people don’t think very highly of them, and focus much more on novel transportation technologies like electric vehicles.

Tractability: According to Higashide, improving bus systems is a matter of improving how the bus systems are governed. Right now, I think a nationwide movement to improve bus transit would be less polarizing than the YIMBY movement has been. While YIMBYism has earned a reputation as elitist due to some of its early advocates’ mistakes, a pro-bus movement could be seen as aligned with the interests of low-income city dwellers provided that it gets the messaging right from the beginning.

Also, bus systems are less costly to roll out, upgrade, and alter than other public transportation options like trains.

• Interesting post!

Curious what you think of Jeff Kaufman’s proposal to make buses more dangerous in the first world, the idea being that buses in the US are currently too far in the “safety” direction of the safety vs. convenience tradeoff.

GiveWell also has a standout charity (Zusha!) working in the opposite direction, trying to get public service vehicles in Kenya to be safer.

• I like Kaufman’s second, third, and fourth ideas:

• Allow the driver to start while someone is still at the front paying. (The driver should use judgment if they’re allowed to do this, because the passenger at the front might lose their balance when the bus starts. Wheelchairs might be especially vulnerable to rolling back.)

• Allow buses to drive 25mph on the shoulder of the highway in traffic jams where the main lanes are averaging below 10mph.

• Higher speed limits for buses. Let’s say 15mph over. (I’m not so sure about this: speed limits exist in part to protect pedestrians. Buses still cause fewer pedestrian and cyclist deaths than cars, though.)

But these should be considered only after we’ve exhausted the space of improvements to bus service that don’t sacrifice safety. For example, we should build more bus-only lanes first.

• Wait, do buses some place not start moving until… everyone’s sitting down? Does that mean there’s enough seats for everyone?

• I don’t have statistics, but my best guess is that if you sample random points across all public buses running in America, in over 3/4 of the time, less than half of the seats are filled.

This is extremely unlike my experiences in Asia (in China or Singapore).

• There should be a directory of EA co-living spaces, if there isn’t already. The EA Hub would be a good place for it.

• Back-of-the-envelope calculations for improving efficiency of public transit spending

The cost of building and maintaining public transportation varies widely across municipalities due to inefficiencies. For example, the NYC Second Avenue Subway has cost $2.14 billion per kilometer to build, whereas it costs an average of $80.22 million to build a kilometer of tunnel in Spain (Transit Costs Project). While many transit advocacy groups advocate for improving quality of public transit service (e.g. Straphangers Campaign in NYC), few advocate for reducing wasteful infrastructure spending.

BOTEC for operating costs

• Uday Schultz writes: “bringing NYCT’s [the NYC subway agency] facility maintenance costs down to the national average could save $1.3 billion dollars per year.”

• With a 6% discount rate, this equates to a $21.7 billion net present value. So an advocacy campaign that spent $21.7 million to reduce NYCT’s maintenance costs to the national average would yield a 1000x return.

• Things that would make the cost-effectiveness of this campaign higher or lower:

• (Higher) A lower discount rate would increase the net present value of the benefits

• (Higher) In theory, we can reduce maintenance costs to even lower levels than the US national average; Western European levels are (I think) lower.

• (Lower) We might not realize all of the potential efficiency gains for political reasons—e.g. if contractors and labor unions block the best possible reforms.
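The operating-cost arithmetic above can be reproduced in a few lines. This is a minimal sketch that assumes the savings recur forever at a constant level, so the net present value is a simple perpetuity (annual savings divided by the discount rate). That assumption reproduces the roughly $21.7 billion figure, but it is an inference rather than something stated in the original. The function name `perpetuity_npv` and the 4% alternative rate are illustrative only.

```python
def perpetuity_npv(annual_savings: float, discount_rate: float) -> float:
    """Present value of a constant annual saving that recurs forever.

    Assumption: savings are modeled as a perpetuity, PV = savings / rate.
    """
    if discount_rate <= 0:
        raise ValueError("discount_rate must be positive")
    return annual_savings / discount_rate


# Figures taken from the text above.
ANNUAL_SAVINGS = 1.3e9   # USD per year (Uday Schultz's estimate)
CAMPAIGN_COST = 21.7e6   # USD, hypothetical advocacy campaign budget

npv = perpetuity_npv(ANNUAL_SAVINGS, 0.06)
return_multiple = npv / CAMPAIGN_COST

# Sensitivity check: a lower discount rate (4% is an illustrative choice)
# raises the present value of the same stream of savings.
npv_low_rate = perpetuity_npv(ANNUAL_SAVINGS, 0.04)

print(f"NPV at 6%: ${npv / 1e9:.1f} billion")
print(f"Return multiple: about {return_multiple:.0f}x")
print(f"NPV at 4%: ${npv_low_rate / 1e9:.1f} billion")
```

The return multiple comes out slightly under 1000 because the NPV is about $21.67 billion before rounding.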

BOTEC for capital construction costs

• The NYC Second Avenue Subway will be 13.7 km long when completed and cost over $17 billion. Phase 1 of the subway line has been completed, consists of 2.9 km of tunnel, and cost $4.45 billion.

• So the rest of the planned subway line (yet to be built) consists of 10.8 km of tunnel and is expected to cost $12.55 billion, for an average of $1.16 billion per km of tunnel.

• Phase 2 of the subway will be 2.4 km long and cost $6 billion, for an average of $2.5 billion per km of tunnel.

• There will likely be cost overruns in the future, so let’s take the average of these two numbers and assume that the subway will cost an average of $1.83 billion/km to build.

• As I stated before, the average cost per km of new tunnel in Spain is $80.22 million (Transit Costs Project). If NYCT could build the rest of the Second Avenue Subway at this cost, it would save $1.75 billion per km of new tunnel, or $18.9 billion overall (since there are 10.8 km of tunnel left to build).
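The capital-cost estimate can be checked the same way. The sketch below simply re-traces the steps in the bullets above, with all amounts in billions of USD. The variable names are illustrative, and treating $17 billion as the total (the text says "over $17 billion") follows the original's own arithmetic.

```python
# All monetary amounts are in billions of USD; figures come from the text above.
TOTAL_KM, TOTAL_COST = 13.7, 17.0     # full line ("over $17 billion", taken as 17)
PHASE1_KM, PHASE1_COST = 2.9, 4.45    # completed Phase 1
PHASE2_KM, PHASE2_COST = 2.4, 6.0     # planned Phase 2
SPAIN_COST_PER_KM = 0.08022           # $80.22 million per km

# Remaining (unbuilt) portion of the line.
remaining_km = TOTAL_KM - PHASE1_KM
remaining_cost = TOTAL_COST - PHASE1_COST
avg_remaining_per_km = remaining_cost / remaining_km

# Phase 2 alone is more expensive per km than the remaining-line average.
phase2_per_km = PHASE2_COST / PHASE2_KM

# Assumption from the text: average the two estimates to allow for overruns.
assumed_per_km = (avg_remaining_per_km + phase2_per_km) / 2

savings_per_km = assumed_per_km - SPAIN_COST_PER_KM
total_savings = savings_per_km * remaining_km

print(f"Remaining line: {remaining_km:.1f} km at ${avg_remaining_per_km:.2f}B/km")
print(f"Assumed cost: ${assumed_per_km:.2f}B/km")
print(f"Savings at Spanish costs: ${savings_per_km:.2f}B/km, ${total_savings:.1f}B total")
```

Carrying full precision through each step gives the same rounded results as the bullets, so the rounding of intermediate values does not change the conclusion.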

N.B.: These are BOTECs for individual aspects of transit spending, and a transit spending advocacy project would benefit from economies of scope because it would be lobbying for cost reductions across all aspects of public transit, not just e.g. the Second Avenue Subway or operating costs.

See also: “So You Want to Do an Infrastructure Package,” Alon Levy’s whitepaper on reducing transit infrastructure costs (Niskanen Center, 2021)

• Matt Yglesias gets EA wrong :(

What EAs think is that people should make decisions guided by a rigorous empirical evaluation based on consequentialist criteria.

Ummm, no. Not all EAs are consequentialists (although a large fraction of them are), and most EAs these days understand that “rigorous empirical evaluation” isn’t the only way to reason about interventions.

It just gets worse from there:

In other words, effective altruists don’t think you should make charitable contributions to your church (again, relative to the mass public this is the most controversial part!) or to support the arts or to solve problems in your community. They think most of the stuff that people donate to (which, again, is largely religiously motivated) do is frivolous. But beyond that, they would dismiss the bulk of the kind of problems that concern most people as literal “first world problems” that blatantly fail the cost-benefit test compared to Vitamin A supplementation in Africa.

No! We’re not against supporting programs other than the global health stuff. It’s just that you gotta buy your fuzzies separate from your utils. More fundamentally, EAs disagree on whether EA is mandatory or supererogatory (merely good). If EA is supererogatory, then supporting your local museum isn’t wrong, it just doesn’t count towards your effective giving budget.

• A rebuttal of the paperclip maximizer argument

I was talking to someone (whom I’m leaving anonymous) about AI safety, and they said that the AI alignment problem is a joke (to put it mildly). They said that it won’t actually be that hard to teach AI systems the subtleties of human norms because language models contain normative knowledge. I don’t know if I endorse this claim but I found it quite convincing, so I’d like to share it here.

In the classic naive paperclip maximizer scenario, we assume there’s a goal-directed AI system, and its human boss tells it to “maximize paperclips.” At this point, it creates a plan to turn all of the iron atoms on Earth’s surface into paperclips. The AI knows everything about the world, including the fact that blood hemoglobin and cargo ships contain iron. However, it doesn’t know that it’s wrong to kill people and destroy cargo ships for the purpose of obtaining iron. So it starts going around killing people and destroying cargo ships to obtain as much iron as possible for paperclip manufacturing.

I think most of us assume that the AI system, when directed to “maximize paperclips,” would align itself with an objective function that says to create as many paper clips as superhumanly possible, even at the cost of destroying human lives and economic assets. However, I see two issues:

1. It’s assuming that the system would interpret the term “maximize” extremely literally, in a way that no reasonable human would interpret it. (This is the core of the paperclip argument, but I’m trying to show that it’s a weakness.) Most modern natural language processing (NLP) systems are based on statistical word embeddings, which capture what words mean in the source texts, rather than their strict mathematical definitions (if they even have one). If the AI system interprets commands using a word embedding, it’s going to interpret “maximize” the way humans would.

Ben Garfinkel has proposed the “process orthogonality thesis”—the idea that, for the classic AI alignment argument to work, “the process of imbuing a system with capabilities and the process of imbuing a system with goals” would have to be orthogonal. But this point shows that the process of giving the system capabilities (in this case, knowing that iron can be obtained from various everyday objects) and the process of giving it a goal (in this case, making paperclips) may not be orthogonal. An AI system based on contemporary language models seems much more likely to learn that “maximize X” means something more like “maximize X subject to common-sense constraints Y1, Y2, …” than to learn that human blood can be turned into iron for paperclips. (It’s also possible that it’ll learn neither, which means it might take “maximize” too literally but won’t figure out that it can make paperclips from humans.)

2. It’s assuming that the system would make a special case for verbal commands that can be interpreted as objective functions and set out to optimize the objective function if possible. At a minimum, the AI system needs to convert each verbal command into a plan to execute it, somewhat like a query plan in relational databases. But not every plan to execute a verbal command would involve maximizing an objective function, and using objective functions in execution plans is probably dangerous for the reason that the classic paperclip argument tries to highlight, as well as overkill for most commands.

• Ben gives a great example of how the “alignment problem” might look different than we expect:

The case of the house-cleaning robot

• Problem: We don’t know how to build a simulated robot that cleans houses well

• Available techniques aren’t suitable:

• Simple hand-coded reward functions (e.g. dust minimization) won’t produce the desired behavior

• We don’t have enough data (or sufficiently relevant data) for imitation learning

• Existing reward modeling approaches are probably insufficient

• This is sort of an “AI alignment problem,” insofar as techniques currently classified as “alignment techniques” will probably be needed to solve it. But it also seems very different from the AI alignment problem as classically conceived.

...

• One possible interpretation: If we can’t develop “alignment” techniques soon enough, we will instead build powerful and destructive dust-minimizers

• A more natural interpretation: We won’t have highly capable house-cleaning robots until we make progress on “alignment” techniques

I’ve concluded that the process orthogonality thesis is less likely to apply to real AI systems than I would have assumed (i.e. I’ve updated downward), and therefore, the “alignment problem” as originally conceived is less likely to affect AI systems deployed in the real world. However, I don’t feel ready to reject all potential global catastrophic risks from imperfectly designed AI (e.g. multi-multi failures), because I’d rather be safe than sorry.

• I think it’s worth saying that the context of “maximize paperclips” is not one where the person literally says the words “maximize paperclips” or something similar. It is instead an intuitive stand-in for building an AI capable of superhuman levels of optimization, such that if you set it the task (say, via specifying a reward function) of creating an unbounded number of paperclips, it will do things that you, as a human, wouldn’t do to maximize paperclips, because humans have competing concerns and will stop when, say, they’d have to kill themselves or their loved ones to make more paperclips.

The objection seems predicated on the interpretation of human language, which is beside the primary point. That is, you could address all the human language interpretation issues and we’d still have an alignment problem; it just might not look literally like building a paperclip maximizer if someone asks the AI to make a lot of paperclips.

• In the classic naive paperclip maximizer scenario, we assume there’s a goal-directed AI system, and its human boss tells it to “maximize paperclips.” At this point, it creates a plan to turn all of the iron atoms on Earth’s surface into paperclips. The AI knows everything about the world, including the fact that blood hemoglobin and cargo ships contain iron. However, it doesn’t know that it’s wrong to kill people and destroy cargo ships for the purpose of obtaining iron. So it starts going around killing people and destroying cargo ships to obtain as much iron as possible for paperclip manufacturing.

I don’t think this is a good representation of the classic scenario. It’s not that the AI “doesn’t know it’s wrong”. It clearly has a good enough model of the world to predict eg “if a human saw me trying to do this, they would try to stop me”. The problem is coding an AI that cares about right and wrong. Which is a pretty difficult technical problem. One key part of why it’s hard is that the interface for giving an AI goals is not the same interface you’d use to give a human goals.

Note that this is not the same as saying that it’s impossible to solve, or that it’s obviously much harder than making powerful AI in the first place, just that it’s a difficult technical problem and solving it is one significant step towards safe AI. I think this is what Paul Christiano calls intent alignment.

I think it’s possible that this issue goes away with powerful language models, if that can give us an interface to input a goal via a similar interface to instructing a human. And I’m excited about efforts like this one. But I don’t think it’s at all obvious that this will just happen to work out. For example, GPT-3’s true goal is “generate text that is as plausible as possible, based on the text in your training data”. And it has a natural language interface, and this goal correlates a bit with “do what humans want”, but it is not the same thing.

It’s assuming that the system would make a special case for verbal commands that can be interpreted as objective functions and set out to optimize the objective function if possible. At a minimum, the AI system needs to convert each verbal command into a plan to execute it, somewhat like a query plan in relational databases. But not every plan to execute a verbal command would involve maximizing an objective function, and using objective functions in execution plans is probably dangerous for the reason that the classic paperclip argument tries to highlight, as well as overkill for most commands.

This point feels somewhat backwards. Everything AI systems ever do is maximising an objective function, and I’m not aware of any AI Safety suggestions that get around this (just ones which have creative objective functions). It’s not that they convert verbal commands to an objective function, they already have an objective function, which might capture ‘obey verbal commands in a sensible way’ or it might not. And my read on the paperclip maximising scenario is that “tell the AI to maximise paperclips” really means “encode an objective function that tells it to maximise paperclips”.

Personally I think the paperclip maximiser scenario is somewhat flawed, and not a good representation of AI x-risk. I like it because it illustrates the key point of specification gaming—that it’s really, really hard to make an objective function that captures “do the things we want you to do”. But this is also going to be pretty obvious to the people making AGI, and they probably won’t have an objective function as clearly dumb as maximise paperclips. But it might not be good enough.

• By the way, there will be a workshop on Interactive Learning for Natural Language Processing at ACL 2021. I think it will be useful to incorporate the ideas from this area of research into our models of how AI systems that interpret natural-language feedback would work. One example of this kind of research is Blukis et al. (2019).

• How pressing is countering anti-science?

Intuitively, anti-science attitudes seem like a major barrier to solving many of the world’s most pressing problems: for example, climate change denial has greatly derailed the American response to climate change, and distrust of public health authorities may be stymying the COVID-19 response. (For instance, a candidate running in my district for State Senate is campaigning on opposition to contact tracing as well as vaccines.) I’m particularly concerned about anti-economics attitudes because they lead to bad economic policies that don’t solve the problems they’re meant to solve, such as protectionism and rent control, and opposition to policies that are actually supported by evidence. Additionally, I’ve heard (but can’t find the source for this) that economists are generally more reluctant to do public outreach in defense of their profession than scientists in other fields are.

• Epistemic status: Almost entirely opinion, I’d love to hear counterexamples

When I hear proposals related to instilling certain values widely throughout a population (or preventing the instillation of certain values), I’m always inherently skeptical. I’m not aware of many cases where something like this worked well, at least in a region as large, sophisticated, and polarized as the United States.

You could point to civil rights campaigns, which have generally been successful over long periods of time, but those had the advantage of being run mostly by people who were personally affected (= lots of energy for activism, lots of people “inherently” supporting the movement in a deep and personal way).

If you look at other movements that transformed some part of the U.S. (e.g. bioethics or the conservative legal movement, as seen in Open Phil’s case studies of early field growth), you see narrow targeting of influential people rather than public advocacy.

Rather than thinking about “countering anti-science” more generally, why not focus on specific policies with scientific support? Fighting generically for “science” seems less compelling than pushing for one specific scientific idea (“masks work,” “housing deregulation will lower rents”), and I can think of a lot of cases where scientific ideas won the day in some democratic context.

This isn’t to say that public science advocacy is pointless; you can reach a lot of people by doing that. But I don’t think the people you reach are likely to “matter” much unless they actually campaign for some specific outcome (e.g. I wouldn’t expect a scientist to swing many votes in a national election, but maybe they could push some funding toward an advocacy group for a beneficial policy).

****

One other note: I ran a quick search to look for polls on public trust in science, but all I found was a piece from Gallup on public trust in medical advice.

Putting that aside, I’d still guess that a large majority of Americans would claim to be “pro-science” and to “trust science,” even if many of those people actually endorse minority scientific claims (e.g. “X scientists say climate change isn’t a problem”). But I could be overestimating the extent to which people see “science” as a generally positive applause light.

• I think a more general, and less antagonizing, way to frame this is “increasing scientific literacy among the general public,” where scientific literacy is seen as a spectrum. For example, increasing scientific literacy among climate activists might make them more likely to advocate for policies that more effectively reduce CO2 emissions.

• John, Katherine, Sarah, and Hank Green are making a $6.5M donation to Partners in Health to address the maternal mortality crisis in Sierra Leone, and are trying to raise $25M in total. PIH has been working with the Sierra Leone Ministry of Health to improve the quality of maternal care through facility upgrades, supplies, and training.

• Content warning: the current situation in Afghanistan (2021)

Is there anything people outside Afghanistan can do to address the worsening situation in Afghanistan? GiveDirectly-style aid to Afghans seems like not-an-option because the previous Taliban regime “prevented international aid from entering the country for starving civilians.” (Wikipedia)

The best thing we can do is probably to help resettle Afghan refugees, whether by providing resources to NGOs that help them directly, or by petitioning governments to admit more of them. Some charities that do this:

I don’t have a good sense of how much impact donations to these charities would have. The US is already scrambling to get SIV applicants out of Afghanistan and into the US and other countries where they can wait in safety for their applications to be processed. On the margins, advocacy groups can probably advocate for this process to be improved and streamlined.

• Also, the US withdrawal from Afghanistan is a teaching moment for improving institutional decision making. Biden appears to have been blindsided by the rapid Taliban insurgency:

“The jury is still out, but the likelihood there’s going to be the Taliban overrunning everything and owning the whole country is highly unlikely,” Biden said on July 8.

(I thought that it might take 30 days for the Taliban to completely take over Afghanistan, whereas it happened over a weekend.)

And in general, the media seems to think the US drawdown was botched. USA Today has called it “predictable and disastrous.”

• I don’t have anything to add, but I think you’re right. It’s very hard to hear the “Americans matter more than other people” implied or stated in the article comments.

• I agree with you that situations like the current one in Afghanistan might be among our most impactful issues, but:

1. If you wanna talk about iidm, I’d rather think more about how to make failing states in developing countries more viable and functional than improve US government decision-making. Tbh, idk if US decision was a matter of a judgment mistake: Biden’s recent statements seem to show that he doesn’t really regret the decision—that the unwillingness to keep troops in Afghanistan dominated the risk of having Taleban back in power.

2. I’m not sure there’s still any low-hanging fruit concerning Afghanistan, but we still have many other poor countries in civil war where there is still hope of getting a viable democratic regime—like Haiti and Chad. Perhaps it could be useful to have a theory of what features make a “nation building effort” viable (like in East Timor) or not—and also what can be done to mitigate the harm caused by another government collapse. My current pet theory is that these efforts only have a winning chance when the corresponding country is afraid of being invaded by another foreign power; otherwise, the nation building effort is likely to be regarded as a colonial invasion.

3. Even though I can only think of Taleban’s return as catastrophic, I wonder if their recent displays of willingness to engage in international relations is only to get recognition for the new regime (implying we’ll be back to middle age again in a few months), or if they’re actually aiming to modernize a little bit (even if just to prevent another future invasion).

• Reason to invest in cultivated meat research: we can use meat scaffolding technology to grow nervous tissue and put chemicals in the cell media that cause it to experience constant euphoria

• Possible research/forecasting questions to understand the economic value of AGI research

A common narrative about AI research is that we are on a path to AGI, in that society will be motivated to try to create increasingly general AI systems, culminating in AGI. Since this is a core assumption of the AGI risk hypothesis, I think it’s very important to understand whether this is actually the case.

Some people have predicted that AI research funding will dry up someday as the costs start to outweigh the benefits, resulting in an “AI winter.” Jeff Bigham wrote in 2019 that the AI field will experience an “AI autumn,” in which the AI research community will shift its focus from trying to develop human-level AI capabilities to developing socially valuable applications of narrow AI.

My view is that an AI winter is unlikely to happen anytime soon (10%), an AI autumn is likely to happen eventually (70%), and continued investment in AGI research all the way to AGI is somewhat unlikely (20%). But I think we can try to understand and predict these outcomes better. Here are some ideas for possibly testable research questions:

• What will be the ROI on:

• How much money will OpenAI make by licensing GPT-3?

• How long will it take for the technology behind GPT-2 and GPT-3 (roughly, making generic language models do other language tasks without specific training) to become as economically competitive as similar technologies did after they were invented?

• How long will it take for DeepMind and OpenAI to break even?

• How do the growth rates of DeepMind and OpenAI’s revenues and expenses compare to those of other corporate research labs throughout history?

• Will Alphabet downsize or shut down DeepMind?

• Will Microsoft scale back or end its partnership with OpenAI?

Notes:

• I don’t know of any other labs actively trying to create AGI.

• I have no experience with financial analysis, so I don’t know if these questions are the kind that a financial analyst would actually be able to answer. They could be nonsensical for all I know.

• Epistemic status: Although I’m vaguely aware of the evidence on gender equality and peace, I’m not an expert on international relations. I’m somewhat confident in my main claim here.

Gender equality—in societies at large, in government, and in peace negotiations—may be an existential security factor insofar as it promotes societal stability and decreases international and intra-state conflict.

According to the Council on Foreign Relations, women’s participation in peacemaking and government at large improves the durability of peace agreements and social stability afterward. Gender equality also increases trust in political institutions and decreases risk of terrorism. According to a study by Krause, Krause, and Bränfors (2018), direct participation by women in peacemaking positively affects the quality and durability of peace agreements because of “linkages between women signatories and women civil society groups.” In principle, including other identity groups such as ethnic, racial, and religious minorities in peace negotiations may also activate these linkages and thus lead to more durable and higher quality peace.

Some organizations that advance gender equality in peacemaking and international security:

• I think the instrumental benefits of greater equality (racial, gender, economic, etc.) are hugely undersold, particularly by those of us who like to imagine that we’re somehow “above” traditional social justice concerns (including myself in this group, reluctantly and somewhat shamefully).

In this case, I think your thought is spot on and deserves a lot more exploration. I immediately thought of the claim (e.g. 1, 2) that teams with more women make better collective decisions. I haven’t inspected this evidence in detail, but on an anecdotal level I am ready to believe it.

• Note: I recognize that gender equality is a sensitive topic, so I welcome any feedback on how I could present this information better.

• Is there any appetite for a project to make high-risk, high-yield donation recommendations within global health and development? The idea would be to identify donation opportunities that could outperform the GiveWell top charities, especially ones that make long-lasting and systemic changes.

• I’ve realized that I feel less constrained when writing poetry than when writing essays/blog posts. Essays are more time-consuming for me—I spend a lot of time adding links, fact-checking my points, and organizing them in a coherent way, and I feel like I have to stake out a clear position when writing in prose. Whereas in poetry, the rules have more to do with making the form and content work well together, and evoking an emotional response in the reader.

I also think poetry is a good medium for expressing ambiguity. I’ve written a few draft poems in my notebook about EA themes, and they pose moral dilemmas in EA without imposing a straightforward answer to them. Here’s one to illustrate what I mean:

“is this creative accounting?”
I think to myself
as I enter donation amounts
into my spreadsheet -
amounts of other people’s money
that they might have given anyway -
just not to my chosen charity

I feel like providing a straightforward answer to a complex problem wastes the potential of poetry. You might as well write a persuasive essay instead.

Also, a lot of us are writing blog posts; far fewer of us are writing poetry to communicate EA ideas.

• Improving checks and balances on U.S. presidential power seems like an important, neglected, and tractable cause area.

• Importance: There is a risk that U.S. federal government policy will become erratic, since each president can easily reverse the previous president’s executive actions (for example, Biden reversed many of Trump’s EOs during his first few weeks of office). The uncertainty makes it hard to reliably adapt to policy changes (this is a recognized challenge for businesses, and also applies to other interest groups, such as LGBTQ+ people, refugees, and immigrants). Also, while strong executive leadership is important for enabling the federal government to address certain challenges, such as international relations and infrastructure, it can also be misused, such as when Trump shared top-secret intelligence with the Russian government.

• Neglectedness: Although many voters care about holding the government accountable (just ask them whether they think the U.S. is “corrupt”), it seems unlikely that they will specifically advocate for curbs on presidential power, as it’s a “wonky” policy proposal. There are a few advocacy groups working on curbing executive authority, such as Protect Democracy, but I’d be surprised if there were more than 10.

• Tractability: There are concrete proposals for curbing presidential power; some of them seem like they could gain bipartisan support, such as increasing controls on financial conflicts of interest. The goal is to turn informal norms that presidents were merely expected to follow into enforceable laws, since the Trump presidency has shown that informal norms can and will be broken.

• In my opinion, while the media has focused a lot on Trump’s tax returns, requiring presidents to disclose their tax returns seems to have more symbolic significance than actual value in holding presidents accountable.

Further reading:

• # Effective Altruism and Freedom

I think freedom is very important as both an end and a means to the pursuit of happiness.

Economic theory posits a deep connection between freedom (both positive and negative) and well-being. When sufficiently rational people are free to make choices from a broader choice set, they can achieve greater well-being than they could with a smaller choice set. Raising people’s incomes expands their choice sets, and consequently, their happiness—this is how GiveDirectly works.

I wonder what a form of effective altruism that focused on maximizing (positive and negative) freedom for all moral patients would look like. I think it would be very similar to the forms of EA focused on maximizing total or average well-being; both the freedom- and well-being-centered forms of EA would recommend actions like supporting GiveDirectly and promoting economic growth. But we know that different variants of utilitarianism have dramatically different implications in some cases. For example, the freedom-maximizing worldview would not endorse forcing people into experience machines.

We can also think of the long-term future of humanity in terms of humanity’s collective freedom to choose how it develops. We want to preserve our option value—our freedom to change course—and avoid making irreversible decisions until we are sure they are right.

• This sounds a lot like a version of preference utilitarianism, certainly an interesting perspective.

I know a lot of effort in political philosophy has gone into trying to define freedom—personally, I don’t think it’s been especially productive, and so I think ‘freedom’ as a term isn’t that useful except as rhetoric. Emphasising ‘fulfilment of preferences’ is an interesting approach, though. It does run into tricky questions around the source of those preferences (eg addiction).

• Yeah, it is very similar to preference utilitarianism. I’m still undecided between hedonic and preference utilitarianism, but thinking about this made me lean more toward preference utilitarianism.

What do you think is wrong with the current definitions of liberty? I think the concept of well-being is similarly vague. I tend to use different proxies for well-being interchangeably (fulfillment of preferences, happiness minus suffering, good health as measured by QALYs or DALYs, etc.) and I think this is common practice in EA. But I still think that freedom and well-being are useful concepts: for example, most people would agree that China has less economic and political freedom than the United States.

• I don’t mind rhetorical descriptions of China as having ‘less economic and political freedom than the United States’, in a very general discussion. But if you’re going to make any sort of proposal like ‘there should be more political freedom!’ I would feel the need to ask many follow-up clarifying questions (freedom to do what? freedom from what consequences? freedom for whom?) to know whether I agreed with you.

Well-being is vague too, I agree, but it’s a more necessary term than freedom (from my philosophical perspective, and I think most others).

• # Worldview diversification for longtermism

I think it would be helpful to get more non-utilitarian perspectives on longtermism (or ones that don’t primarily emphasize utilitarianism).

Some questions that would be valuable to address:

• What non-utilitarian worldviews support longtermism?

• Under a given longtermist non-utilitarian worldview, what are the top-priority problem areas, and what should actors do to address them?

Some reasons I think this would be valuable:

1. We’re working under a lot of moral uncertainty, so the more ethical perspectives, the better.

2. Even if we fully buy into one worldview, it would be valuable to incorporate insights from other worldviews’ perspectives on the problems we are addressing.

3. Doing this would attract more people with worldviews different from the predominant utilitarian one.

## What non-utilitarian worldviews support longtermism?

Liberalism: There are strong theoretical and empirical reasons why liberal democracy may be valuable for the long-term future; see this post and its comments. I think that certain variants of liberalism are highly compatible with longtermism, especially those focusing on:

• Inclusive institutions and democracy

• Civil and political rights (e.g. freedom, equality, and civic participation)

• International security and cooperation

• Moral circle expansion

Environmental and climate justice: Climate justice deals with climate change’s impact on the most vulnerable members of society, and it prescribes how societies ought to respond to climate change in ways that protect their most vulnerable members. We can learn a lot from it about how to respond to other global catastrophic risks.

• Also just realised that the new legal priorities research agenda touches on this with some academic citations on pages 14 and 15.

• Toby Ord has spoken about non-consequentialist arguments for existential risk reduction, which I think also work for longtermism more generally. For example, Ctrl+F for “What are the non-consequentialist arguments for caring about existential risk reduction?” in this link. I suspect relevant content is also in his book The Precipice.

Some selected quotes from the first link:

• “my main approach, the guiding light for me, is really thinking about the opportunity cost, so it’s thinking about everything that we could achieve, and this great and glorious future that is open to us and that we could do”

• “there are also these other foundations, which I think also point to similar things. One of them is a deontological one, where Edmund Burke, one of the founders of political conservatism, had this idea of the partnership of the generations. What he was talking about there was that we’ve had ultimately a hundred billion people who’ve lived before us, and they’ve built this world for us. And each generation has made improvements, innovations of various forms, technological and institutional, and they’ve handed down this world to their children. It’s through that that we have achieved greatness … is our generation going to be the one that breaks this chain and that drops the baton and destroys everything that all of these others have built? It’s an interesting kind of backwards-looking idea there, of debts that we owe and a kind of relationship we’re in. One of the reasons that so much was passed down to us was an expectation of continuation of this. I think that’s, to me, quite another moving way of thinking about this, which doesn’t appeal to thoughts about the opportunity cost that would be lost in the future.”

• “And another one that I think is quite interesting is a virtue approach … When you look at humanity’s current situation, it does not look like how a wise entity would be making decisions about its future. It looks incredibly juvenile and immature and like it needs to grow up. And so I think that’s another kind of moral foundation that one could come to these same conclusions through.”

• Thanks for sharing these! I had Toby Ord’s arguments from The Precipice in mind too.

• Epistemic status: Tentative thoughts.

I think that medical AI could be a nice way to get into the AI field for a few reasons:

• You’d be developing technology that improves global health by a lot. For example, according to the WHO, “The use of X-rays and other physical waves such as ultrasound can resolve between 70% and 80% of diagnostic problems, but nearly two-thirds of the world’s population has no access to diagnostic imaging.”[1] Computer vision can make radiology more accessible to billions of people around the world, as this project is trying to do.

• It’s also a promising starting point for careers in AI safety and applying AI/​ML to other pressing causes.

AI for animal health may be even more important and neglected.

• Stuart Russell: Being human and navigating interpersonal relationships will be humans’ comparative advantage when artificial general intelligence is realized, since humans will be better at simulating other humans’ minds than AIs will. (Human Compatible, chapter 4)

Also Stuart Russell: Automated tutoring!! (Human Compatible, chapter 3)

• I’ve been thinking about AI safety again, and this is what I’m thinking:

The main argument of Stuart Russell’s book focuses on reward modeling as a way to align AI systems with human preferences. But reward modeling seems more like an AI capabilities technology than an AI safety one. If it’s really difficult to write a reward function for a given task Y, then it seems unlikely that AI developers would deploy a system that does it in an unaligned way according to a misspecified reward function. Instead, reward modeling makes it feasible to design an AI system to do the task at all.

Even with reward modeling, though, AI systems are still going to have similar drives due to instrumental convergence: self-preservation, goal preservation, resource acquisition, etc., even if they have goals that were well specified by their developers. Although maybe corrigibility and not doing bad things can be built into the systems’ goals using reward modeling.

The ways I could see reward modeling technology failing to prevent AI catastrophes (other than misuse) are:

• An AI system is created using reward modeling, but the learned reward function still fails in a catastrophic, unexpected way. This is similar to how humans often take actions that unintentionally cause harm, such as habitat destruction, because they’re not thinking about the harms that occur.

• Possible solution: create a model garden for open source reward models that developers can use when training new systems with reward modeling. This way, developers start from a stronger baseline with better safety guarantees than they would have if they were developing reward modeling systems from scratch/​with only their proprietary training data.

• A developer cuts corners while creating an AI system (perhaps due to economic pressure) and doesn’t give the system a robust enough learned reward function, and the system fails catastrophically.

• Lots of ink has been spilled about arms race dynamics 😛

• Possible solution: Make sure reward models can be run efficiently. For example, if reward modeling is done using a neural network that outputs a reward value, make sure it can be done well even with slimmer neural networks (fewer parameters, lower bit depth, etc.).

• The main argument of Stuart Russell’s book focuses on reward modeling as a way to align AI systems with human preferences.

Hmm, I remember him talking more about IRL and CIRL and less about reward modeling. But it’s been a little while since I read it, could be wrong.

If it’s really difficult to write a reward function for a given task Y, then it seems unlikely that AI developers would deploy a system that does it in an unaligned way according to a misspecified reward function. Instead, reward modeling makes it feasible to design an AI system to do the task at all.

Maybe there’s an analogy where someone would say “If it’s really difficult to prevent accidental release of pathogens from your lab, then it seems unlikely that bio researchers would do research on pathogens whose accidental release would be catastrophic”. Unfortunately there’s a horrifying many-decades-long track record of accidental release of pathogens from even BSL-4 labs, and it’s not like this kind of research has stopped. Instead it’s like, the bad thing doesn’t happen every time, and/​or things seem to be working for a while before the bad thing happens, and that’s good enough for the bio researchers to keep trying.

So as I talk about here, I think there are going to be a lot of proposals to modify an AI to be safe that do not in fact work, but do seem ahead-of-time like they might work, and which do in fact work for a while as training progresses. I mean, when x-risk-naysayers like Yann LeCun or Jeff Hawkins are asked how to avoid out-of-control AGIs, they can spout off a list of like 5-10 ideas that would not in fact work, but sound like they would. These are smart people and a lot of other smart people believe them too. Also, even something as dumb as “maximize the amount of money in my bank account” would plausibly work for a while and do superhumanly-helpful things for the programmers, before it starts doing superhumanly-bad things for the programmers.

Even with reward modeling, though, AI systems are still going to have similar drives due to instrumental convergence: self-preservation, goal preservation, resource acquisition, etc., even if they have goals that were well specified by their developers. Although maybe corrigibility and not doing bad things can be built into the systems’ goals using reward modeling.

Yup, if you don’t get corrigibility then you failed.

• Content warning: missing persons, violence against women, racism.

Amid the media coverage of the Gabby Petito case in the United States, there’s been some discussion of how missing persons cases for women and girls of color are more neglected than those for missing White women. Some statistics:

Black girls and women go missing at high rates, but that isn’t reflected in news coverage of missing persons cases. In 2020, of the 268,884 girls and women who were reported missing, 90,333, or nearly 34% of them, were Black, according to the National Crime Information Center. Meanwhile, Black girls and women account for only about 15% of the U.S. female population, according to census data. In contrast, white girls and women — which includes those who identify as Hispanic — made up 59% of the missing, while accounting for 75% of the overall female population.

[...]

In [Wyoming], more than 400 Indigenous girls and women went missing between 2011 and the fall of 2020, according to a state report. Indigenous people made up 21% of homicide victims in Wyoming between 2000 and 2020, despite being less than 3% of the state’s population. The disparity can be seen in the media: Only 18% of Indigenous female victims received coverage. However, among white victims, 51% were in the news.

To be clear, Rivers explains, it’s not about asking for more attention or being in “competition” with white people — it’s about other groups getting the same attention as white victims and having their lives honored in the same ways.

• Hey, thanks for writing this.

I’m not quite sure I understand. Do you think this is an issue that isn’t worked on enough?

• I don’t know how neglected it is compared to EA’s standard portfolio of issues (U.S. issues tend to get disproportionate attention from Americans), but I think it’s an interesting example of how people outside EA have applied importance and neglectedness to call attention to neglected issues.

• Wild idea: Install a small modular reactor in a charter city and make energy its biggest export!

Charter cities’ advantage is their lax regulatory environment relative to their host countries. Such permissiveness could be a good environment for deploying nuclear reactors, which are controversial and regulated to death in many countries. Charter cities are good environments for experimenting with governance structures; they can also be good for experimenting with controversial technologies.

• I’ve been reading Adam Gopnik’s book A Thousand Small Sanities: The Moral Adventure of Liberalism, which is about the meaning and history of liberalism as a political movement. I think many of the ideas that Gopnik discusses are relevant to the EA movement as well:

• Moral circle expansion: To Gopnik, liberalism is primarily about calling for “the necessity and possibility of (imperfectly) egalitarian social reform and ever greater (if not absolute) tolerance of human difference” (p. 23). This means expanding the moral circle to include, at the least, all human beings. However, inclusion in the moral circle is a spectrum, not a binary: although liberal societies have made tremendous progress in treating women, POC, workers, and LGBTQ+ people fairly, there’s still a lot of room for improvement. And these societies are only beginning to improve their treatment of immigrants, the global poor, and non-human animals.

• Societal evolution and the “Long Reflection”: “Liberalism’s task is not to imagine the perfect society and drive us toward it but to point out what’s cruel in the society we have now and fix it if we possibly can” (p. 31). I think that EA’s goals for social change are mostly aligned with this approach: we identify problems and ways to solve them, but we usually don’t offer a utopian vision of the future. However, the idea of the “Long Reflection,” a process of deliberation that humanity would undertake before taking any irreversible steps that would alter its trajectory of development, seems to depart from this vision of social change. The Long Reflection involves figuring out what is ultimately of value to humanity or, failing that, coming close enough to agreement that we won’t regret any irreversible steps we take. This seems hard and very different from the usual way people do politics, and I think it’s worth figuring out exactly how we would do this and what would be required if we think we will have to take such steps in the future.

• Would you recommend the book itself to people interested in movement-building and/​or “EA history”? Is there a good review/​summary that you think would cover the important points in less time?

• Yeah, I would recommend it to anyone interested in movement building, history, or political philosophy from an EA perspective. I’m interested in reconciling longtermism and liberalism.

These paragraphs from the Guardian review summarize the main points of the book:

Given the prevailing gloom, Gopnik’s definition of liberalism is cautious and it depends on two words whose awkwardness, odd in such an elegant writer, betrays their doubtful appeal. One is “fallibilism”, the other is “imperfectability”: we are a shoddy species, unworthy of utopia. I’d have thought that this was reason for conservatively upholding the old order, but for Gopnik it’s our recidivism that makes liberal reform so necessary. We must always try to do better, cleaning up our messes. The sanity in the book’s title extends to sanitation: Gopnik whimsically honours the sewerage system of Victorian London as a shining if smelly triumph of liberal policy.

Liberalism here is less a philosophy or an ideology than a temperament and a way of living. Gopnik regards sympathy with others, not the building of walls and policing of borders, as the basis of community. “Love is love,” he avers, and “kindness is everything”. Both claims, he insists, are “true. Entirely true”, if only because the Beatles say so. But are they truths or blithe truisms? Such soothing mantras would not have disarmed the neo-Nazi thugs who marched through Charlottesville, Virginia, in 2017 or the white supremacist who murdered Jo Cox. Gopnik calls Trump “half-witted” and says Nigel Farage is a “transparent nothing”, but snubs do not diminish the menace of these dreadful men.

• Epistemic status: Raw thoughts that I’ve just started to think about. I’m highly uncertain about a lot of this.

Some works that have inspired my thinking recently:

Reading/​listening to these works has caused me to reevaluate the risks posed by advanced artificial intelligence. While AI risk is currently the top cause in x-risk reduction, I don’t think this is necessarily warranted. I think the CAIS model is a more plausible description of how AI is likely to evolve in the near future, but I haven’t read enough to assess whether it makes AI more or less of a risk (to humanity, civilization, liberal democracy, etc.) than it would be under the classic “Superintelligence” model.

I’m strongly interested in improving diversity in EA, and I think this is an interesting case study about how one could do that. Right now, it seems like there is a core/​middle/​periphery of the EA community where the core includes people and orgs in countries like the US, UK, and Australia, and I think the EA movement would be stronger if we actively tried to bring more people in more countries into the core.

I’m also interested in how we could use qualitative methods like those employed in user experience research (UXR) to solve problems in EA causes. I’m familiar enough with design thinking (the application of design methods to practical problems) that I could do some of this given enough time and training.

• Have you read “Weapons of Math Destruction” or “Invisible Women”? Both are about how bias among mostly white, mostly well-off, mostly male developers leads to unfair but self-reinforcing AI systems.

• Joan Gass (2019) recommends four areas of international development to focus on:

• New modalities to foster economic productivity

• New modalities or ways to develop state capabilities

• Global catastrophic risks, particularly pandemic preparedness

• Meta EA research on cause prioritization within global development

Improving state capabilities, or governments’ ability to render public services, seems especially promising for public-interest technologists interested in development (ICT4D). For example, the Zenysis platform helps developing-world governments make data-driven decisions, especially in healthcare. Biorisk management also looks promising from a tech standpoint.

• Effective giving, deontologist edition

I’ve got an idea for how to communicate the idea of effective giving to people even if they don’t subscribe to consequentialist ethics.

I’m gonna assume that when deontologists and virtue ethicists donate, they still care about outcomes, but not for the same reasons as consequentialists. For example, a deontologist might support anti-bullying charities to reduce bullying because bullying is wrong behavior, not just because bullying has bad consequences. This person should still seek out charities that are more cost-effective at reducing bullying to donate to.

• Some links about the alleged human male fertility crisis—it’s been suggested that this may lead to population decline, but a 2021 study has pointed out flaws in the research claiming a decline in sperm count:

• I didn’t find this response very convincing. Apart from attempting to smear the researchers as racist, it seems their key argument is that while sperm counts appear to have fallen from towards the top to the bottom of the ‘normal’ range, they’re still within the range. But this ‘normal’ range is fairly arbitrary, and if the decline continues presumably we will go below the normal range in the future.

• My shortforms on public transportation as an EA cause area:

• Practical/​economic reasons why companies might not want to build AGI systems

(Originally posted on the EA Corner Discord server.)

First, most companies that are using ML or data science are not using SOTA neural network models with a billion parameters, at least not directly; they’re using simple models, because no competent data scientist would use a sophisticated model where a simpler one would do. Only a small number of tech companies have the resources or motivation to build large, sophisticated models (here I’m assuming, like OpenAI does, that model size correlates with “sophisticated-ness”).

Second, increasing model size has diminishing returns with respect to model performance. Scaling laws usually relate model size to training loss via a power law, so every doubling of model size yields a smaller improvement in training loss than the previous one. And this is training performance, which is not the same as test set performance: improvements in training performance beyond a certain point are considered not to matter for the model’s ultimate performance. (This is why techniques like early stopping exist: you just stop training the model once its performance on held-out data stops improving.)
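To make the diminishing-returns point concrete, here is a rough sketch of what a power law implies for successive doublings. The functional form L(N) = (N_c / N)^alpha and the constants `n_c` and `alpha` are assumptions I picked for illustration; they are not fitted to any real model family, and the function names are hypothetical.

```python
# Illustrative sketch only: the constants below are made up for the example,
# not fitted to any real model family.

def power_law_loss(n_params: float, n_c: float = 1e13, alpha: float = 0.1) -> float:
    """Hypothetical scaling law: L(N) = (N_c / N) ** alpha."""
    return (n_c / n_params) ** alpha


def doubling_gains(start_params: float, doublings: int) -> list[float]:
    """Absolute loss reduction obtained from each successive doubling of model size."""
    sizes = [start_params * 2 ** k for k in range(doublings + 1)]
    losses = [power_law_loss(n) for n in sizes]
    return [losses[k] - losses[k + 1] for k in range(doublings)]


if __name__ == "__main__":
    # Start from a 100M-parameter model and double it five times.
    for k, gain in enumerate(doubling_gains(1e8, 5), start=1):
        print(f"doubling {k}: loss falls by {gain:.4f}")
```

Under this assumed form, each doubling multiplies the loss by the same factor, 2^(-alpha), so the absolute improvement shrinks geometrically even though the cost of each doubling grows.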

(Counterpoint: Software systems typically have superstar economics—e.g. the best search engine is 100x more profitable than the second-best search engine. So there could be a non-linear relationship between model performance and profitability, such that increasing a model’s performance from 97% to 98% makes a huge difference in profits whereas going from 96% to 97% does not.)

Third—and this reason only applies to AGI, not powerful narrow AIs—it’s not clear to me how you would design an engineering process to ensure that an AGI system can perform multiple tasks very well and generalize to new tasks. Typically, when we design software, we create a test suite that evaluates its suitability for the tasks for which it’s designed. Before releasing a new version of an AI system, we have to run the entire test suite on it and make sure it passes. It’s obviously easier to design a test suite for an AI that is designed to do a few tasks well than for an AI that’s supposed to be able to do any task. (On the flip side, this means that anyone seeking to design an AGI would have to design a way to test it to ensure that it’s (1) actually an AGI and (2) performant.) (While generality isn’t strictly necessary for AIs to be dangerous, I believe many of us would agree that AGIs are more dangerous as x-risks than narrow AIs.)

Fourth, setting out to create AGI would have a huge opportunity cost. Yes, technically, humans are probably not the absolute smartest, most capable beings that evolution could have built, but that doesn’t mean that building a smarter AGI machine would be profitable. It seems to me that humans have a comparative advantage in planning etc. while “technology as a whole” will have a comparative advantage in e.g. doing machine vision at scale. So most firms ought to just hire a bunch of humans and design/​purchase technological systems that complement humans’ skill sets (this is a common idea about how future AI development will go, called “intelligence augmentation”).

• Vaccine hesitancy might be a cause X (no, really!)

One thing that stuck out to me in the interview between Rob Wiblin and Ezra Klein is how much of a risk vaccine hesitancy poses to the US government’s public health response to COVID:

But there are other things where the conservatism is coming from the simple fact, to put this bluntly, they deal with the consequences of a failure in a way you and I don’t. You and I are sitting here, like, “Go faster. The trade-offs are obvious here.” They are saying, “Actually, no. The trade-offs are not obvious. If this goes wrong, we can have vaccine hesitancy that destroys the entire effort.” …

I think that there is a very different kind of feedback they are getting, and a kind of thing they fear, which is not that just the vaccine will be three weeks slower than it should have been, but if they are wrong, if they did not get enough data, if they missed something, they are going to imperil the whole effort, and that will also kill a gigantic number of people.

I’m aware of PR campaigns aimed at convincing people to get vaccinated, especially populations with higher rates of vaccine hesitancy. I wonder if these efforts could lead to a permanent shift in public attitudes toward vaccines. If that happens, then maybe governments can act faster and take more high-risk, high-reward actions during future epidemics without having to worry as much about vaccine hesitancy and mistrust of public health authorities “crashing” the public health response.

• Prevent global anastrophe we must.

• I think there should be an EA Fund analog for criminal justice reform. This could especially attract non-EA dollars.

• A social constructivist perspective on long-term AI policy

I think the case for addressing the long-term consequences of AI systems holds even if AGI is unlikely to arise.

The future of AI development will be shaped by social, economic and political factors, and I’m not convinced that AGI will be desirable in the future or that AI is necessarily progressing toward AGI. However, (1) AI already has large positive and negative effects on society, and (2) I think it’s very likely that society’s AI capabilities will improve over time, amplifying these effects and creating new benefits and risks in the future.

• A series of polls by the Chicago Council on Global Affairs shows that Americans increasingly support free trade and believe that free trade is good for the U.S. economy (87%, up from 59% in 2016). This is probably a reaction to the negative effects and press coverage of President Trump’s trade wars; anecdotally, I have seen a lot of progressives who would otherwise not care about or support free trade criticize policies such as Trump’s steel tariffs as reckless.

I believe this presents a unique window of opportunity to educate the American public about the benefits of globalization. Kimberly Clausing is doing this in her book, Open: The Progressive Case for Free Trade, Immigration, and Global Capital, in which she defends free trade and immigration to the U.S. from the standpoint of American workers.

• An EA Meta reading list:

• # Social constructivism and AI

I have a social constructivist view of technology—that is, I strongly believe that technology is a part of society, not an external force that acts on it. Ultimately, a technology’s effects on a society depend on the interactions between that technology and other values, institutions, and technologies within that society. So for example, although genetic engineering may enable human gene editing, the specific ways in which humans use gene editing would depend on cultural attitudes and institutions regarding the technology.

How this worldview applies to AI: Artificial intelligence systems have embedded values because they are inherently goal-directed, and the goals we put into them may match with one or more human values.[1] Also, because they are autonomous, AI systems have more agency than most technologies. But AI systems are still a product of society, and their effects depend on their own values and capabilities as well as economic, social, environmental, and legal conditions in society.

Because of this constructivist view, I’m moderately optimistic about AI despite some high-stakes risks. Most technologies are net-positive for humanity; this isn’t surprising, because technologies are chosen for their ability to meet human needs. But no technology can solve all of humanity’s problems.

I’ve previously expressed skepticism about AI completely automating human labor. I think it’s very likely that current trends in automation will continue, at least until AGI is developed. But I’m skeptical that all humans will always have a comparative advantage, let alone a comparative advantage in labor. Thus, I see a few ways that widespread automation could go wrong:

• AI stops short of automating everything, but instead of augmenting human productivity, displaces workers into low-productivity jobs—or worse, economic roles other than labor. This scenario would create massive income inequality between those who own AI-powered firms and those who don’t.

• AI takes over most tasks essential to governing society, causing humans to be alienated from the process of running their own society (human enfeeblement). Society drifts off course from where humans want it to go.

I think economics will determine which human tasks are automated and which are still performed by humans.

1. The embedded values thesis is sometimes considered a form of “soft determinism” since it posits that technologies have their own effects on society based on their embedded values. However, I think it’s compatible with social constructivism because a technology’s embedded values are imparted to it by people. ↩︎

• Latex markdown test:

When, in the course of human events, it becomes necessary for people to dissolve the political bands that tie it with another

• Did you mean to leave this published after finishing the test? (Not a problem if so; just wanted to check.)

• In an ironic turn of events, you leaving this comment has made it so that the comment can no longer be unpublished (since users can only delete their comments if they have no replies).

• However, if evelynciara had replied “yes,” I’d have removed the thread in their stead ;-)

• Yes, I did. But I think it would be more valuable if we had a better Markdown editor or a syntax key.

• Table test—Markdown

| Column A | Column B | Column C |
| --- | --- | --- |
| Cell A1 | Cell B1 | Cell C1 |
| Cell A2 | Cell B2 | Cell C2 |
| Cell A3 | Cell B3 | Cell C3 |

• If you’re looking at where to direct funding for U.S. criminal justice reform:

List of U.S. states and territories by incarceration and correctional supervision rate

On this page, you can sort states (and U.S. territories) by total prison/​jail population, incarceration rate per 100,000 adults, or incarceration rate per 100,000 people of all ages—all statistics as of year-end 2016.

As of 2016, the 10 states with the highest incarceration rates per 100,000 people were:

1. Oklahoma (990 prisoners/​100k)

2. Louisiana (970)

3. Mississippi (960)

4. Georgia (880)

5. Alabama (840)

6. Arkansas (800)

7. Arizona (790)

8. Texas (780)

9. Kentucky (780)

10. Missouri (730)

National and state-level bail funds for pretrial and immigration detention

• I’m playing Universal Paperclips right now, and I just had an insight about AI safety: Just programming the AI to maximize profits instead of paperclips wouldn’t solve the control problem.

You’d think that the AI can’t destroy the humans because it needs human customers to make money, but that’s not true. Instead, the AI could sell all of its paperclips to another AI that continually melts them down and turns them back into wire, and they would repeatedly sell paperclips and wire back and forth to each other, both powered by free sunlight. Bonus points if the AIs take over the central bank.

• Can someone please email me a copy of this article?

I’m planning to update the Wikipedia article on Social discount rate, but I need to know what the article says.