I think the critical crux here is the assumption about human competence, individually and working in groups. And I’m afraid I agree; humans have an optimism bias by many measures. Our track record on doing even easy projects right on the first try (or even the first few tries) is not good.
I also think optimists are often asking the question could we solve alignment, while pessimists are asking will we solve alignment, which includes a lot more practical difficulties so more opportunities for failure.
Of course there are many other relevant cruxes, but I think those two are pretty common, and the first is the biggest contribution of this particular post.
Seth Herd
I don’t have a nice clean citation. I don’t think one exists. I’ve looked at an awful lot of individual opinions and different surveys. I guess the biggest reason I’m convinced this correlation exists is that arguments for low p(doom) very rarely actually engage arguments for risk at their strong points (when they do the discussions are inconclusive in both directions—I’m not arguing that alignment is hard, but that it’s very much unknown how hard it is).
There appears to be a very high correlation between misunderstanding the state of play, and optimism. And because it’s a very complex state of arguments, the vast majority of the world misunderstands it pretty severely.
I very much wish it was otherwise; I am an optimist who has become steadily more pessimistic as I’ve made alignment my full-time focus—because the arguments against are subtle (and often poorly communicated) but strong.
The arguments for the difficulty of alignment are far too strong to be rationally dismissed down to the 1.4% or whatever it was that the superforecasters arrived at. They have very clearly missed some important points of argument.
The anticorrelation with academic success seems quite right and utterly irrelevant. As a career academic, I have been noticing for decades that academic success has some quite perverse incentives.
I agree that there are bad arguments for pessimism as well as optimism. The use of bad logic in some prominent arguments says nothing about the strength of other arguments. Arguments on both sides are far from conclusive. So you can hope arguments for the fundamental difficulty of aligning network-based AGI are wrong, but assigning a high probability to their being wrong without understanding them in detail and constructing valid counterarguments is tempting but not rational. If there’s a counterargument you find convincing, please point me to it! Because while I’m arguing from the outside view, my real argument is that this is an issue that is unique in intellectual history, so it can really only be evaluated from the inside view. So that’s where most of my thoughts on the matter go.
All of which isn’t to say the doomers are right and we’re doomed if we don’t stop building network-based AGI. I’m saying we don’t know. I’m arguing that assigning a high probability right now based on limited knowledge to humanity accomplishing alignment is not rationally justified.
I think that fact is reflected in the correlation of p(doom) with time-on-task only on alignment specifically. If that’s wrong I’d be shocked, because it looks very strong to me, and I do work hard to correct for my own biases. But it’s possible I’m wrong about this correlation. If so it will make my day and perhaps my month or year!
It is ultimately a question that needs to be resolved at the object level; we just need to take guesses about how to assign resources based on outside views.
I see! Thanks for the clarification. It’s a fascinating argument if I’m understanding it correctly now: it could be worth substantially increasing our risk of extinction if we more substantially increased our odds of capturing more of the potential value in our light cone.
I’m not a dedicated utilitarian, so I typically tend to value futures with some human flourishing and little suffering vastly higher than futures with no sentient beings. But I am actually convinced that we should tilt a little toward futures with more flourishing.
Aligning AGI seems like the crux for both survival and flourishing (and aligning society, in the likely case that “aligned” AGI is intent-aligned to take orders from individuals). But there will be small changes in strategy that emphasize flourishing vs mere survival futures, and I’ll lean toward those based on this discussion, because outside of myself and my loved ones, my preferences become largely utilitarian.
It should also be borne in mind that creating misaligned AGI runs a pretty big risk of wiping out not just us but any other sentient species in the lightcone.
Agreed on all counts, except that a strong value on rationality seems very likely to be an advantage in reaching more-correct beliefs on average. Feeling good about changing one’s mind instead of bad is going to lead to more belief changes, and those tend to lead toward truth.
Good points on the rationalist community being a bit insular. I don’t think about that much myself because I’ve never been involved with the bay area rationalist community, just LessWrong.
Copied from my comment on LW, because it may actually be more relevant over here where not everyone is convinced about alignment being hard. It’s a really sketchy presentation of what I think are strong arguments for why the consensus on this is wrong on this.
I really wish I could agree. I think we should definitely think about flourishing when it’s a win/win with survival efforts. But saying we’re near the ceiling on survival looks wildly too optimistic to me. This is after very deeply considering our position and the best estimate of our odds, primarily surrounding the challenge of aligning superhuman AGI (including the surrounding societal complications). There are very reasonable arguments to be made about the best estimate of alignment/AGI risk. But disaster likelihoods below 10% really just aren’t viable when you look in detail. And it seems like that’s what you need to argue that we’re near the ceiling on survival.
The core claim here is “we’re going to make a new species which is far smarter than we are, and that will definitely be fine because we’ll be really careful how we make it” in some combination with “oh we’re definitely not making a new species any time soon, just more helpful tools”.
When examined in detail, assigning a high confidence to those statements is just as silly as it looks at a glance. That is obviously a very dangerous thing and one we’ll do pretty much as soon as we’re able.
90% plus on survival looks like a rational view from a distance, but there are very strong arguments that it’s not. This won’t be a full presentation of those arguments; I haven’t written it up satisfactorily yet, so here’s the barest sketch.
Here’s the problem: The more people think seriously about this question, the more pessimistic they are.
(edit—we asymptote at different points but almost universally far above 10% p(doom))
And those who’ve spent more time on this particular question should be weighted far higher. Time-on-task is the single most important factor for success in every endeavor. It’s not a guarantee, but it dwarfs raw intelligence as a predictor of success in every domain (although the two are multiplicative).
The “expert forecasters” you cite don’t have nearly the time-on-task of thinking about the AGI alignment problem. Those who actually work in that area are very systematically more pessimistic the longer and more deeply we’ve thought about it. There’s not a perfect correlation, but it’s quite large. This should be very concerning from an outside view.
This effect clearly goes both ways, but that only starts to explain the effect. Those who intuitively find AGI very dangerous are prone to go into the field. And they’ll be subject to confirmation bias. But if they were wrong, a substantial subset should be shifting away from that view after they’re exposed to every argument for optimism. This effect would be exaggerated by the correlation between rationalist culture and alignment thinking; valuing rationality provides resistance (but certainly not immunity!) to motivated reasoning/confirmation bias by aligning one’s motivations with updating based on arguments and evidence.
I am an optimistic person, and I deeply want AGI to be safe. I would be overjoyed for a year if I somehow updated to only 10% chance of AGI disaster. It is only my correcting for my biases that keeps me looking hard enough at pessimistic arguments to believe them based on their compelling logic.
And everyone is affected by motivated reasoning, particularly the optimists. This is complex, but after doing my level best to correct for motivations, it looks to me like the bias effects have far more leeway to work when there’s less to push against. The more evidence and arguments are considered, the less bias takes hold. This is from the literature on motivated reasoning and confirmation bias, which was my primary research focus for a few years and a primary consideration for the last ten.
That would’ve been better as a post or a short form, and more polished. But there it is FWIW, a dashed-off version of an argument I’ve been mulling over for the past couple of years.
I’ll still help you aim for flourishing, since having an optimistic target is a good way to motivate people to think about the future.
Edit: I realize this isn’t an airtight argument and apologize for the tone of confidence in the absence of presenting the whole thing carefully and with proper references.
It seems like having genuinely safety-minded people within orgs is invaluable. Do you think that having them refuse to join is going to meaningfully slow things down?
It just takes one brave or terrified person in the know to say “these guys are internally deploying WHAT? I’ve got to stop this!”
I worry very much that we won’t have one such person in the know in OpenAI. I’m very glad we have them in Anthropic.
Having said that, I agree that Anthropic should not be shielded from criticism.
Your assumption that influence flows one way in organizations seems based on fear, not psychology. If someone believes AGI is a real risk, they should be motivated enough to resist some pressure from superiors who merely argue that they’re doing good stuff. If you won’t actively resist changing your beliefs once you join a culture with importantly different beliefs, then don’t join an org.
While Anthropic’s plan is a terrible one, so is PauseAI’s. We have no good plans. And we mustn’t fight amongst ourselves.
This seems almost exactly like the repugnant conclusion. Taken to extremes, intuition disagrees with logic. When that happens, it’s usually the worse for intuition.
I’m not a utilitarian, but I find the repugnant conclusion impossible to reject if you are.
If you want to choose what is good for everyone, there’s little argument about what that is in those cases.
And if we’re talking about what’s good for everyone, that’s got to be a linear sum of what’s good for each someone. If the sum is nonlinear, who exactly is worth less than the others? This leads to the repugnant conclusion and your conclusion here.
Other definitions of “good for everyone” seem to always mean “what I idiosyncratically prefer for everyone else but me”.
We do not have adequate help with AGI x-risk, and the societal issues demand many skillsets that alignment workers typically lack. Surviving AGI and avoiding s-risk far outweigh all other concerns by any reasonable utilitarian logic.
You were getting disagree votes because it sounded like you were claiming certainty. I realize that you weren’t trying to do that, but that’s how people were taking it, and I find that quite understandable. Chicken as an analogy has certain death if neither player swerves, in the standard formulation. Qualifying your statement even a little would’ve gotten your point across better.
FWIW I agree with your statement as I interpret it. I do tend to think that an objective measure of misalignment risk (I place it around 50% largely based on model uncertainty on all sides) makes the question of which side is safer basically irrelevant.
Which highlights the problem with this type of miscommunication. You were probably making by far the most important point here. It didn’t play a prominent role because it wasn’t communicated in a way the audience would understand.
That’s very helpful! They don’t try to include the cost of fear; the old story I’d heard about cage-free environments was that there’s more fighting and cannibalism. But given the other large benefits, I’m convinced that cage-free is better.
Wait now—I thought cage-free chickens suffered as much or more than caged? I heard the claim a long time ago but never looked into it closely.
Copied from my LW comment, since this is probably more of an EAF discussion:
This is really important pushback. This is the discussion we need to be having. Most people who are trying to track this believe China has not been racing toward AGI up to this point. Whether they embark on that race is probably being determined now—and based in no small part on the US’s perceived attitude and intentions.
Any calls for racing toward AGI should be closely accompanied with “and of course we’d use it to benefit the entire world, sharing the rapidly growing pie”. If our intentions are hostile, foreign powers have little choice but to race us.
And we should not be so confident we will remain ahead if we do race. There are many routes to progress other than sheer scale of pretraining. The release of DeepSeek r1 today indicates that China is not so far behind. Let’s remember that while the US “won” the race for nukes, our primary rival had nukes very soon after—by stealing our advancements. A standoff between AGI-armed US and China could be disastrous—or navigated successfully if we take the right tone and prevent further proliferation (I shudder to think of Putin controlling an AGI, or many potentially unstable actors).
This discussion is important, so it needs to be better. This pushback is itself badly flawed. In calling out the report’s lack of references, it provides almost none itself. Citing a 2017 official statement from China seems utterly irrelevant to guessing their current, privately held position. Almost everyone has updated massively since 2017. (edit: It’s good that this piece does note that public statements are basically meaningless in such matters.) If China is “racing toward AGI” as an internal policy, they probably would’ve adopted that recently. (I doubt that they are racing yet, but it seems entirely possible they’ll start now in response to the US push to do so—and their perspective on the US as a dangerous aggressor on the world stage. But what do I know—we need real experts on China and international relations.)
Pointing out the technical errors in the report seems somewhere between irrelevant and harmful. You can understand very little of the details and still understand that AGI would be a big, big deal if true—and that the many experts predicting short timelines could be right. Nitpicking the technical expertise of people whose assessment is probably essentially correct just sets a bad tone of fighting/arguing instead of having a sensible discussion.
And we desperately need a sensible discussion on this topic.
I completely agree.
But others may not, because most humans aren’t longtermists nor utilitarians. So I’m afraid arguments like this won’t sway the public opinion much at all. People like progress because it will get them and their loved ones (children and grandchildren, whose future they can imagine) better lives. They just barely care at all whether humanity ends after their grandchildren’s lives (to the extent they can even think about it).
This is why I believe that most arguments against AGI x-risk are really based on differing timelines. People like to think that humans are so special that AI won’t surpass us for a long time. And they mostly care about the future for their loved ones.
I think the point is making this explicit and having a solid exposition to point to when saying “progress is no good if we all die sooner!”
I don’t think it’s worth the effort; I’d personally be just as pleased with one snapshot of the participants in conversation as I would be with a whole video. The point of podcasts for me is that I can do something else while still taking in something useful for my alignment work. But I am definitely a tone-of-voice attender over a facial-expression attender, so others will doubtless get more value out of it.
Oops, I meant to say I wrote a post on one aspect of this interview on LW: Fear of centralized power vs. fear of misaligned AGI: Vitalik Buterin on 80,000 Hours. It did produce some interesting discussion.
Yes, but pursuing excellence also costs time that could be spent elsewhere, and time/results tradeoffs are often highly nonlinear.
The perfect is the enemy of the good. It seems to me that the most common LW/EA personality already pursues excellence more than is optimal.
For more, see my LW comment:
Excellent work.
To summarize one central argument in briefest form:
Aschenbrenner’s conclusion in Situational Awareness is wrong in overstating the claim.
He claims that treating AGI as a national security issue is the obvious and inevitable conclusion for those who understand the enormous potential of AGI development in the next few years. But Aschenbrenner doesn’t adequately consider the possibility of treating AGI primarily as a threat to humanity instead of a threat to the nation or to a political ideal (the free world). If we considered it primarily a threat to humanity, we might be able to cooperate with China and other actors to safeguard humanity.
I think this argument is straightforwardly true. Aschenbrenner does not adequately consider alternative strategies, and thus his claim of the conclusion being the inevitable consensus is false.
But the opposite isn’t an inevitable conclusion, either.
I currently think Aschenbrenner is more likely correct about the best course of action. But I am highly uncertain. I have thought hard about this issue for many hours both before and after Aschenbrenner’s piece sparked some public discussion. But my analysis, and the public debate thus far, are very far from conclusive on this complex issue.
This question deserves much more thought. It has a strong claim to being the second most pressing issue in the world at this moment, just behind technical AGI alignment.
This post can be summarized as “Aschenbrenner’s narrative is highly questionable”. Of course it is. From my perspective, having thought deeply about each of the issues he’s addressing, his claims are also highly plausible. To “just discard” this argument because it’s “questionable” would be very foolish. It would be like driving with your eyes closed once the traffic gets confusing.
This is the harshest response I’ve ever written. To the author, I apologize. To the EA community: we will not help the world if we fall back on vibes-based thinking and calling things we don’t like “questionable” to dismiss them. We must engage at the object level. While the future is hard to predict, it is quite possible that it will be very unlike the past, but in understandable ways. We will have plenty of problems with the rest of the world doing its standard vibes-based thinking and policy-making. The EA community needs to do better.
There is much to question and debate in Aschenbrenner’s post, but it must be engaged with at the object level. I will do that, elsewhere.
On the vibes/ad-hominem level, note that Aschenbrenner also recently wrote that Nobody’s on the ball on AGI alignment. He appears to believe (there and elsewhere) that AGI is a deadly risk, and we might very well all die from it. He might be out to make a quick billion, but he’s also serious about the risks involved.
The author’s object-level claim is that they don’t think AGI is imminent. Why? How sure are you? How about we take some action or at least think about the possibility, just in case you might be wrong and the many people close to its development might be right?
Fantastic.