Humans are neither coherent, nor do they necessarily have a nonsatiable goal—though some might. But they have both to a far greater extent than less intelligent creatures.
Are you willing to posit that advanced systems are coherent, with at least one non-satiable component? Because that’s pretty minimal as an assumption, but implies with probability one that they prefer paperclipping of some sort.
Where and when are these supposed to occur, and how can we track that for our respective countries?
“EA has always bee[n] rather demanding,”
I want to clarify that this is a common but generally incorrect reading of EA’s views. EA leaders have repeatedly clarified that you don’t need to dedicate your life to it, and can simply donate to causes that others have identified as highly effective, and otherwise live your life.
If you want to do more than that, great, good for you—but EA isn’t utilitarianism, so please don’t confuse the demandingness of the two.
First, Utilitarianism doesn’t traditionally require the type of extreme species neutrality you propose here. Singer and many EAs gave a somewhat narrower view of what ‘really counts’ as Utilitarian, but your argument assumes that narrow view without really justifying it.
Second, you assume future AIs will have rich inner lives that are valuable, instead of paperclipping the universe. You say “one would need to provide concrete evidence about what kinds of objectives advanced AIs are actually expected to develop”—but Eliezer has done that quite explicitly.
Very much in favor of posts clarifying that cause neutrality doesn’t require value neutrality or deference to others’ values.
I very much appreciate that you are thinking about this, and the writing is great. That said, without trying to address the arguments directly, I worry that the style here is justifying a conclusion you’ve already come to and exploring analogies you like, rather than exploring the arguments and trying to decide which side to be on; it fails to embrace the scout mindset sufficiently to be helpful.
I think that replaceability is very high, so the counterfactual impact is minimal. But that said, there is very little possibility in my mind that even helping with RLHF for compliance with their “safety” guidelines is more beneficial for safety than for accelerating capabilities racing, so any impact is negative.
Thank you for fighting the good fight!
I don’t think multi-person disagreements are, in general, a tractable problem for one-hour sessions. It sounds like you need someone in charge to enable disagree-then-commit, rather than a better way to argue.
How much of the money raised by Effektiv Spenden, etc. is essentially a pass-through to GiveWell? (I know Israel now has a similar initiative, but it is in large part passing the money to the same orgs.)
I’m cheating a bit, because both of these are well on their way, but two big current goals:
Get Israel to iodize its salt!
Run an expert elicitation on Biorisk with RAND and publish it.
Not predictions as such, but lots of current work on AI safety and steering is based pretty directly on paradigms from Yudkowsky and Christiano—from Anthropic’s constitutional AI to ARIA’s Safeguarded AI program. There is also OpenAI’s Superalignment research, which was attempting to build AI that could solve agent foundations—that is, explicitly do the work that theoretical AI safety research identified. (I’m unclear whether the last is ongoing or not, given that they managed to alienate most of the people involved.)
I strongly agree that you need to put your own needs first, and think that your level of comfort with your savings and ability to withstand foreseeable challenges is a key input. My go-to, in general, is that the standard advice of keeping 3-6 months of expenses is a reasonable goal—so you can and should give, but until you have saved that much, you should at least be splitting your excess funds between savings and charity. (And the reason most people don’t manage this has a lot to do with lifestyle choices and failure to manage their spending—not just not having enough income. Normal people never have enough money to do everything they’d like to; set your expectations clearly and work to avoid the hedonic treadmill!)
To follow on to your point as it relates to my personal views (in case anyone is interested), it’s worth quoting the code of Jewish law. It introduces its discussion of Tzedakah by asking how much one is required to give: “The amount, if one has sufficient ability, is giving enough to fulfill the needs of the poor. But if you do not have enough, the most praiseworthy version is to give one fifth, the normal amount is to give a tenth, and less than that is a poor sign.” And I note that this was written in the 1500s, when local charity was the majority of what was practical; today’s situation is one where the needs are clearly beyond any one person’s ability—so the latter clauses are the relevant ones.
So I think that, in a religion that prides itself on exacting standards and exhaustive rules for the performance of mitzvot, this is endorsing exactly your point: while giving might be a standard, and norms and community behavior are helpful guides, the amount to give is always a personal and pragmatic decision, not a general rule.
You seem to be framing this as if deontology is just side constraints with a base of utilitarianism. That’s not how deontology works—it’s an entire class of ethical frameworks on its own.
Deontology doesn’t require avoiding utilitarian calculations altogether, just that the rules one follows are not justified solely on the basis of outcomes. A deontologist can believe they have a moral obligation to give 10% of their income to the most effective charity as judged by expected outcomes, for example, making them in some real sense a strictly EA deontologist.
You seem to be generally conflating EA and utilitarianism. If nothing else, there are plenty of deontologist EAs. (Especially if we’re being accurate with terminology!)
There’s a new post or two discussing this:
https://www.lesswrong.com/posts/GdBwsYWGytXrkniSy/miri-s-june-2024-newsletter
https://www.lesswrong.com/posts/cqF9dDTmWAxcAEfgf/communications-in-hard-mode-my-new-job-at-miri
And an older one from last year: https://www.lesswrong.com/posts/NjtHt55nFbw3gehzY/announcing-miri-s-new-ceo-and-leadership-team
Close enough not to have any cyclic components that would lead to infinite cycles for the nonsatiable component of their utility.
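To sketch the intuition with a toy model (my own illustrative notation, not a formal result): suppose the agent’s preferences over states are represented by

$$U(s) = B(s) + N(r(s)),$$

where $B$ is a bounded component, $r(s)$ is how much of some resource state $s$ contains, and $N$ is strictly increasing and unbounded in $r$ (the non-satiable component). Then, holding $B$ fixed, a state with more of the resource is always strictly preferred, so there is no point at which acquiring more stops being worthwhile; that unbounded appetite for some resource is all I mean by “paperclipping of some sort.”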