I think this post contains many errors/issues (especially for a post with >300 karma). Many have been pointed out by others, but I think at least several still remain unmentioned. I only have time/motivation to point out one (chosen for being relatively easy to show concisely):
Using the 3x levered TTT with duration of 18 years, a 3 percentage point rise in rates would imply a mouth-watering cumulative return of 162%.
Levered ETFs exhibit path dependency, or “volatility drag”, because they reset their leverage daily, which means you can’t calculate the return without knowing what the interest rate does in between the 3% rise. TTT’s website acknowledges this with a very prominent disclaimer:
This short ProShares ETF seeks a return that is −3x the return of its underlying benchmark (target) for a single day, as measured from one NAV calculation to the next.
Due to the compounding of daily returns, holding periods of greater than one day can result in returns that are significantly different than the target return, and ProShares’ returns over periods other than one day will likely differ in amount and possibly direction from the target return for the same period. These effects may be more pronounced in funds with larger or inverse multiples and in funds with volatile benchmarks.
You can also compare 1 and 2 and note that from Jan 1, 2019 to Jan 1, 2023, the 20-year treasury rate went up by ~1 percentage point, but TTT is down ~20% instead of up (ETA: and has paid negligible dividends).
A related point: The US stock market has averaged 10% annual returns over a century. If your style of reasoning worked, we should instead buy a 3x levered S&P 500 ETF, get 30% return per year, compounding to 1278% return over a decade, handily beating out 162%.
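To make the path dependency concrete, here is a minimal sketch (with made-up daily returns, purely for illustration) of a fund that resets 3x leverage every day. The underlying index ends exactly flat, yet the daily-reset levered fund loses money; the same drag applies symmetrically to a −3x fund like TTT:

```python
import math

def levered_cumulative(daily_returns, leverage):
    """Cumulative return of a fund that resets `leverage` every day."""
    wealth = 1.0
    for r in daily_returns:
        wealth *= 1 + leverage * r
    return wealth - 1

# Two hypothetical paths for the underlying index, both ending flat (0% total):
smooth = [0.0] * 10                 # no movement at all
choppy = [0.05, 1 / 1.05 - 1] * 5   # +5% then -4.76%, each pair nets to 0%

for name, path in [("smooth", smooth), ("choppy", choppy)]:
    underlying = math.prod(1 + r for r in path) - 1
    lev3 = levered_cumulative(path, 3)
    print(f"{name}: underlying {underlying:+.2%}, 3x daily-reset {lev3:+.2%}")
```

On the choppy path the underlying is exactly flat but the 3x daily-reset fund loses roughly 7%, which is why the cumulative return of a levered ETF cannot be computed from the endpoint move in rates alone.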
Pure selfishness can’t work, since if everyone is selfish, why would anyone believe anyone else’s PR? I guess there has to be some amount of real altruism mixed in, just that when push comes to shove, people who will make decisions truly aligned with altruism (e.g., try hard to find flaws in one’s supposedly altruistic plans, give up power after you’ve gained power for supposedly temporary purposes, forgo hidden bets that have positive selfish EV but negative altruistic EV) may be few and far between.
This is just a reasonable decision (from a selfish perspective) that went badly, right? I mean if you have empirical evidence that hand-washing greatly reduced mortality, it seems pretty reasonable that you might be able to convince the medical establishment of this fact, and as a result gain a great deal of status/influence (which could eventually be turned into power/money).
The other two examples seem like real altruism to me, at least at first glance.
The best you can do is “egoism, plus virtue signalling, plus plain insanity in the hard cases”.
Question is, is there a better explanation than this?
Do you know any good articles or posts exploring the phenomenon of “the road to hell is paved with good intentions”? In the absence of a thorough investigation, I’m tempted to think that “good intentions” are merely a PR front that human brains put up (not necessarily consciously), and that humans deeply aligned with altruism don’t really exist, or are even rarer than they appear. See my old post A Master-Slave Model of Human Preferences for a simplistic model that should give you a sense of what I mean… On second thought, that post might be overly bleak as a model of real humans, and the truth might be closer to Shard Theory, where altruism is a shard that only or mainly gets activated in PR contexts. In any case, if this is true, there seems to be a crucial problem of how to reliably do good using a bunch of agents who are not reliably interested in doing good, which I don’t see many people trying to solve or even talk about.
(Part of “not reliably interested in doing good” is that you strongly want to do things that look good to other people, but aren’t very motivated to find hidden flaws in your plans/ideas that only show up in the long run, or will never be legible to people whose opinions you care about.)
But maybe I’m on the wrong track and the main root cause of “the road to hell is paved with good intentions” is something else. Interested in your thoughts or pointers.
Over time, I’ve come to see the top questions as:
Is there such a thing as moral/philosophical progress? If yes, is there anything we can feasibly do to ensure continued moral/philosophical progress and maximize the chances that human(-descended) civilization can eventually reach moral/philosophical maturity where all of the major problems that currently confuse us are correctly solved?
Is there anything we might do prior to reaching moral/philosophical maturity that would constitute a non-negligible amount of irreparable harm? (For example, perhaps creating an astronomical amount of digital/simulated suffering would qualify.) How can we minimize the chances of this?
In one of your charts you jokingly ask, “What even is philosophy?” but I’m genuinely confused why this line of thinking doesn’t lead a lot more people to view metaphilosophy as a top priority, either in the technical sense of solving the problems of what philosophy is and what constitutes philosophical progress, or in the sociopolitical sense of how best to structure society for making philosophical progress. (I can’t seem to find anyone else who often talks about this, even among the many philosophers in EA.)
Would be interested in your (eventual) take on the following parallels between FTX and OpenAI:
Inspired/funded by EA
Taking big risks with other people’s lives/money
Large employee exodus due to safety/ethics/governance concerns
Lack of public details of concerns due in part to non-disparagement agreements
just felt like SBF immediately became a highly visible EA figure for no good reason beyond $$$.
Not exactly. From Sam Bankman-Fried Has a Savior Complex—And Maybe You Should Too:
It was his fellow Thetans who introduced SBF to EA and then to MacAskill, who was, at that point, still virtually unknown. MacAskill was visiting MIT in search of volunteers willing to sign on to his earn-to-give program. At a café table in Cambridge, Massachusetts, MacAskill laid out his idea as if it were a business plan: a strategic investment with a return measured in human lives. The opportunity was big, MacAskill argued, because, in the developing world, life was still unconscionably cheap. Just do the math: At $2,000 per life, a million dollars could save 500 people, a billion could save half a million, and, by extension, a trillion could theoretically save half a billion humans from a miserable death.
MacAskill couldn’t have hoped for a better recruit. Not only was SBF raised in the Bay Area as a utilitarian, but he’d already been inspired by Peter Singer to take moral action. During his freshman year, SBF went vegan and organized a campaign against factory farming. As a junior, he was wondering what to do with his life. And MacAskill—Singer’s philosophical heir—had the answer: The best way for him to maximize good in the world would be to maximize his wealth.
SBF listened, nodding, as MacAskill made his pitch. The earn-to-give logic was airtight. It was, SBF realized, applied utilitarianism. Knowing what he had to do, SBF simply said, “Yep. That makes sense.” But, right there, between a bright yellow sunshade and the crumb-strewn red-brick floor, SBF’s purpose in life was set: He was going to get filthy rich, for charity’s sake. All the rest was merely execution risk.
To give some additional context, China emitted 11,680 MT of CO2 in 2020, out of 35,962 MT globally. In 2022 it plans to mine 300 MT more coal than the previous year (which itself added 220 MT of coal production), causing an additional ~600 MT of CO2 from this alone (the exact figure might be a bit higher or lower depending on what kind of coal is produced). Previously, China tried to reduce its coal consumption, but that caused energy shortages and rolling blackouts, forcing the government to reverse direction.
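A quick sanity check of the arithmetic above, where the emission factor is an assumption (burning a tonne of typical thermal coal releases roughly 2 tonnes of CO2, somewhat more for higher-grade coal and less for lignite):

```python
extra_coal_mt = 300        # planned 2022 increase in Chinese coal output, MT
co2_per_tonne_coal = 2.0   # assumed average emission factor (tCO2 per tonne)

extra_co2_mt = extra_coal_mt * co2_per_tonne_coal  # ~600 MT of CO2

global_2020_mt = 35_962    # global CO2 emissions in 2020, MT
print(f"Extra CO2: ~{extra_co2_mt:.0f} MT "
      f"({extra_co2_mt / global_2020_mt:.1%} of 2020 global emissions)")
```

So this single year-over-year production increase adds on the order of 1.5–2% of total global emissions, which is the scale any proposed intervention has to be measured against.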
Given this, it’s really unclear how efforts like persuading Canadian voters to take climate change more seriously can make enough difference to be considered “effective” altruism. (Not sure if that line in your conclusions is targeted at EAs, or was originally written for a different audience.) Perhaps EAs should look into other approaches (such as geoengineering) that are potentially more neglected and/or tractable?
To take a step back, I’m not sure it makes sense to talk about the “technological feasibility” of lock-in, as opposed to, say, its expected cost: if the only feasible method of lock-in causes you to lose 99% of the potential value of the universe, that seems like a more important piece of information than “it’s technologically feasible”.
(On second thought, maybe I’m being unfair in this criticism, because feasibility of lock-in is already pretty clear to me, at least if one is willing to assume extreme costs, so I’m more interested in the question of “but can it be done at more acceptable costs”, but perhaps this isn’t true of others.)
That aside, I guess I’m trying to understand what you’re envisioning when you say “An extreme version of this would be to prevent all reasoning that could plausibly lead to value-drift, halting progress in philosophy.” What kind of mechanism do you have in mind for doing this? Also, you distinguish between stopping philosophical progress vs stopping technological progress, but since technological progress often requires solving philosophical questions (e.g., related to how to safely use the new technology), do you really see much distinction between the two?
Consider a civilization that has “locked in” the value of hedonistic utilitarianism. Subsequently some AI in this civilization discovers what appears to be a convincing argument for a new, more optimal design of hedonium, which purports to be 2x more efficient at generating hedons per unit of resources consumed. Except that this argument actually exploits a flaw in the reasoning processes of the AI (which is widespread in this civilization) such that the new design is actually optimized for something different from what was intended when the “lock in” happened. The closest this post comes to addressing this scenario seems to be “An extreme version of this would be to prevent all reasoning that could plausibly lead to value-drift, halting progress in philosophy.” But even if a civilization was willing to take this extreme step, I’m not sure how you’d design a filter that could reliably detect and block all “reasoning” that might exploit some flaw in your reasoning process.
Maybe in order to prevent this, the civilization tried to lock in “maximize the quantity of this specific design of hedonium” as their goal instead of hedonistic utilitarianism in the abstract. But 1) maybe the original design of hedonium is already flawed or highly suboptimal, and 2) what if (as an example) some AI discovers an argument that they should engage in acausal trade in order to maximize the quantity of hedonium in the multiverse, except that this argument is actually wrong?
This is related to the problem of metaphilosophy, and my hope that we can one day understand “correct reasoning” well enough to design AIs that we can be confident are free from flaws like these, but I don’t know how to argue that this is actually feasible.
I don’t have good answers to your questions, but I just want to say that I’m impressed and surprised by the decisive and comprehensive nature of the new policies. It seems that someone or some group actually thought through what would be effective policies for achieving maximum impact on the Chinese AI and semiconductor industries, while minimizing collateral damage to the wider Chinese and global economies. This contrasts strongly with other recent US federal policy-making that I’ve observed, such as COVID, energy, and monetary policies. Pockets of competence seem to still exist within the US government.
But two formidable new problems for humanity could also arise
I think there are other AI-related problems that are comparable in seriousness to these two, which you may be neglecting (since you don’t mention them here). These posts describe a few of them, and this post tried to comprehensively list my worries about AI x-risk.
They are building their own alternatives; for example, CodeGeeX is a GPT-sized language model trained entirely on Chinese GPUs.
It used Huawei Ascend 910 AI Processors, which were fabbed by TSMC; TSMC will no longer be allowed to make such chips for China.
absent a war, China can hope to achieve parity with the West (by which I mean the countries allied with the US including South Korea and Japan) on the hardware side by buying chips from Taiwan like everyone else
Apparently this is no longer true as of Oct 2022. From https://twitter.com/jordanschnyc/status/1580889364233539584:
Summary from Lam Research, which is involved with these new sanctions:
All Chinese advanced computing chip design companies are covered by these sanctions, and TSMC will no longer do any tape-out for them from now on;
This was apparently based on this document, which purports to be a transcript of a Q&A session with a Lam Research official. Here’s the relevant part in Chinese (which is consistent with the above tweet):
What precautions did you take or would you recommend, as far as preventing the (related) problems of falling in with the wrong crowd and getting infected with the wrong memes?
What morality and metaethics did you try to teach your kids, and how did that work out?
(Some of my posts that may help explain my main worries about raising a kid in the current environment: 1 2 3. Would be interested in any comments you have on them, whether from a parent’s perspective or not.)
If the latter, we’re not really seeking ‘AI alignment’. We’re talking about using AI systems as mass ‘moral enhancement’ technologies. AKA ‘moral conformity’ technologies, aka ‘political indoctrination’ technologies. That raises a whole other set of questions about power, do-gooding, elitism, and hubris.
I would draw a distinction between what I call “metaphilosophical paternalism” and “political indoctrination”, the difference being whether we’re “encouraging” what we think are good reasoning methods and good meta-level preferences (e.g., preferences about how to reason, how to form beliefs, how to interact with people with different beliefs/values), or whether we’re “encouraging” object-level preferences for example about income redistribution.
My precondition for doing this though, is that we first solve metaphilosophy, in other words have a thorough understanding of what “good reasoning” (including philosophical and moral reasoning) actually consists of, or a thorough understanding of what good meta-level preferences consist of. I would be the first to admit that we seriously lack this right now. It seems a very long shot to develop such an understanding before AGI, but I have trouble seeing how to ensure a good long term outcome for future human-AI civilization unless we succeed in doing something like this.
I think in practice what we’re likely to get is “political indoctrination” (given huge institutional pressure/incentive to do that), which I’m very worried about but am not sure how to prevent, aside from solving metaphilosophy and talking people into doing metaphilosophical paternalism instead.
So, we better be honest with ourselves about which type of ‘alignment’ we’re really aiming for.
I have had discussions with some alignment researchers (mainly Paul Christiano) about my concerns on this topic, and the impression I get is that they’re mainly focused on “aligned with individual people’s current values as they are” and they’re not hugely concerned about this leading to bad outcomes like people locking in their current beliefs/values. I think Paul said something like he doesn’t think many people would actually want their AI to do that, and others are mostly just ignoring the issue? They also don’t seem hugely concerned that their work will be (mis)used for “political indoctrination” (regardless of what they personally prefer).
So from my perspective, the problem is not so much alignment researchers “not being honest with themselves” about what kind of alignment we’re aiming for, but rather a confusing (to me) nonchalance about potential negative outcomes of AIs aligned with religious or ideological values.
ETA: What’s your own view on this? How do you see things working out in the long run if we do build AIs aligned to people’s current values, which include religious values for many of them? Based on this, are you worried or not worried?
If you think the Simulation Hypothesis seems likely, but the traditional religions are idiotic
I think the key difference here is that while traditional religions claim detailed knowledge about who the gods are, what they’re like, what they want, and what we should do in light of such knowledge, my position is that we currently actually have little idea who our simulators are and can’t even describe our uncertainty in a clear way (such as with a probability distribution), nor how such knowledge should inform our actions. It would take a lot of research, intellectual progress, and perhaps increased intellectual capacity to change that. I’m fairly certain that any confidence in the details of gods/simulators at this point is unjustified, and people like me are simply at a better epistemic vantage point compared to traditional religionists who make such claims.
These are the human values that religious people would want the AI to align with. If we can’t develop AI systems that are aligned with these values, we haven’t solved the AI alignment problem.
I also think that the existence of religious values poses a serious difficulty for AI alignment, but I have the opposite worry, that we might develop AIs that “blindly” align with religious values (for example locking people into their current religious beliefs because they seem to value faith), thus causing a great deal of harm according to more enlightened values.
It’s not clear to me what should be done with religious values though, either technically or sociopolitically. One (half-baked) idea I have is that if we can develop a good understanding of what “good reasoning” consists of, maybe aligned AI can use that to encourage people to adopt good reasoning processes that eventually cause them to abandon their false religious beliefs and the values that are based on those false beliefs, or allow the AI to talk people out of their unjustified beliefs/values based on the AI’s own good reasoning.
Have you seen Problems in AI Alignment that philosophers could potentially contribute to? (See also additional suggestions in the comments.) Might give your fellows some more topics to research or think about.
ETA: Out of those problems, solving metaphilosophy is currently the highest on my wish list. See this post for my reasons why.
I really appreciate this work. I’ve been looking into some of the same questions recently, but, like you say, everything I’ve been able to find up to now seems very siloed and fails to take into account all of the potentially important issues. To convince people of your thesis, though, I think it needs more of the following:
Discussion of more energy transition scenarios and their potential obstacles. It currently focuses a lot on the impossibility of using batteries to store 1 month worth of electricity, but I’m guessing that it might be much more realistic to use batteries only for daily storage, with seasonal/longer term variations being handled by a combination of overcapacity and fossil fuel backup, or by adaptation on the demand side.
Discussion of counterarguments to your positions. You already do some of this (e.g. “Dave finds it pessimistic, he thinks they give too much importance to land use and climate impacts, and that the model should have higher efficiency and growth of renewables.”) but would appreciate more details of the counterarguments and why you still disagree with them.
In the long run, why is it impossible to build an abundant energy system using only highly available minerals? It seems like your main argument here is that renewables have low EROI, but why can’t we greatly improve that in the future? For example, if much of the current energy investment into renewables goes to spending energy on maintaining living standards that workers demand (I don’t know if this is actually true or not), we could potentially lower that amount by increasing automation. What are the fundamental limits to such improvements?
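On the daily-vs-seasonal storage point above, a back-of-envelope calculation (using an assumed round number for global electricity generation, purely for illustration) shows why one day of battery storage is a far smaller ask than one month:

```python
# All inputs are rough assumptions for illustration, not sourced figures.
world_electricity_twh_per_year = 25_000  # assumed annual global generation

month_storage_twh = world_electricity_twh_per_year / 12   # ~2,100 TWh
day_storage_twh = world_electricity_twh_per_year / 365    # ~70 TWh

print(f"One month of storage: ~{month_storage_twh:,.0f} TWh")
print(f"One day of storage:   ~{day_storage_twh:,.0f} TWh "
      f"(~{month_storage_twh / day_storage_twh:.0f}x less)")
```

A scenario that only needs daily storage from batteries, with overcapacity and backup covering seasonal swings, thus requires roughly 30x less battery capacity than the monthly-storage scenario the report focuses on.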
Yeah, seems like we’ve surfaced some psychological difference here. Interesting.
The point of my comment was that even if you’re 100% sure about the eventual interest rate move (which of course nobody can be), you still have major risk from path dependency (as shown by the concrete example). You haven’t even given a back-of-the-envelope calculation for the risk-adjusted return, and the “first-order approximation” you did give (which both uses leverage and ignores all risk) may be arbitrarily misleading, even for the purpose of “gives an idea of how large the possibilities are”. (Because if you apply enough leverage and ignore risk, there’s no limit to how large the possibilities are of any given trade.)
I thought about not writing that sentence, but figured that other readers can benefit from knowing my overall evaluation of the post (especially given that many others have upvoted it and/or written comments indicating overall approval). Would be interested to know if you still think I should not have said it, or should have said it in a different way.