harfe

Karma: 954

harfe 8 Jun 2026 11:23 UTC
19 points
11 ∶ 0
in reply to: Kaleem’s comment on: Kaleem’s Shortform
Helping high-impact orgs can be high-impact.

Whether that is through IT, HR, or providing food.

Not everyone needs to be a high-level researcher, and I think it is fine to list jobs that have very different skill requirements than the typical 80k-advertized job.

harfe 6 May 2026 15:16 UTC
6 points
2 ∶ 0
in reply to: Brandon Riggs’s comment on: Brandon Riggs’s Quick takes
Is there a reason why you are focusing on compute and not salaries? The example numbers you use are rather low compared to the yearly salary of a single AIS researcher.

harfe 29 Mar 2026 2:11 UTC
5 points
3 ∶ 0
in reply to: Matthew_Barnett’s comment on: Matthew_Barnett’s Shortform
If you and I and all of humanity get killed by AI and turned into paperclips, that would be an unprecedented moral catastrophe. If the AIs that killed all of us stay around and enjoy having more paperclips, that is still extremely bad. The very act of killing us makes these AIs unworthy successors of the human species.

This suggests that proposing to pause AI today is like proposing to pause electricity in 1880

The prospect of AI killing all of us makes these very different. Yes, in both cases a pause will probably slow GDP growth. But humans should be willing to accept lower GDP if this notably reduces the chance of all humans being killed.

harfe 17 Mar 2026 14:15 UTC
5 points
4 ∶ 0
in reply to: titotal’s comment on: Peter Thiel is actively convincing billionaires to abandon The Giving Pledge

warning the Tesla founder his wealth would go to “left-wing nonprofits that will be chosen by Bill Gates.”

Am I missing something or does this argument make no sense? As far as I can tell, Musk can easily fulfill his giving pledge by giving to his preferred not-left-wing nonprofits without deferring to Bill Gates.

harfe 10 Mar 2026 17:43 UTC
2 points
1 ∶ 0
on: There is no systematic pipeline for graduate-level formal proof training data for mathematical AI. I am trying to fix that and mitigate AI safety risk

I’d genuinely welcome feedback on whether the formalization bottleneck resonates as a priority here

It is not clear from your post why getting more people to contribute to mathlib is good for AI safety.

So far, it sounds like this is more about the AI safety risks from AI being dumb rather than AI being smart. The latter is more important for existential risk, but maybe you mean non-xrisk safety?

Suppose you have a math AI that spits out Lean proofs. What would you use it for? Which mathematical questions would you like it to tackle?

I would also note that there are already some math AI startups such as Axiom and Harmonic, who also do Lean (and they don’t seem to be about AI safety at all). EA often focuses on things that are neglected, and it is not clear why non-profit money is needed for the same goals.

I don’t want to be too discouraging: Lean could be useful somehow for reducing AI xrisk, but imo you haven’t made the case.

harfe 10 Mar 2026 15:34 UTC
9 points
2 ∶ 0
on: Prediction Markets Are Structurally Inaccurate When Forecasting Democratic Decline
I think there are many flaws in this post:
- comparing changes in probabilities with changes on a 100-point scale doesn’t make sense
- “as users realize they can leave their money in a high-yield savings account” doesn’t apply to Manifold or Metaculus
- Metaculus is not really a prediction market, and Manifold markets are not polls (there are polls on Manifold, but that is a different thing).
- The end of democracy market was not at 23% on Oct 8, 2025, but rather in the 40s. Perhaps this is an AI hallucination?
- For markets that are closer to 0%, you say that long-horizon markets are systematically overpriced, but your overall argument seems to be that these markets underestimate democratic decline and are thus underpriced.

Of the three platforms hosting the markets tracked in this post, only Polymarket has taken direct action on the long-horizon problem, introducing a 4% annualized yield on long-term political and geopolitical positions in September 2025. Manifold’s play-money structure and Metaculus’s reputation-based system have no equivalent mechanism to address opportunity cost mispricing, and given their design, likely never will.

This is wrong, as Metaculus avoids the opportunity cost mispricing entirely by not requiring currency for making predictions. And Manifold had a loan system for a long time.

Overall, I find the confident tone and bold phrases such as “avoid looking to prediction markets” off-putting.

I think forum users should not upvote posts like this that frankly look like AI slop to me.

harfe 19 Jan 2026 16:39 UTC
3 points
2 ∶ 0
in reply to: ChristianKleineidam’s comment on: ChristianKleineidam’s Quick takes
I don’t think so.

Some less tribalistic hypotheses I can think of:
- EAs concerned about animal welfare have typically focused on farmed animals, as opposed to animal testing, because of the much larger scale of the suffering.
- EAs mostly haven’t heard of it.
- Maybe some EAs have heard about it, but they don’t think it is worth the effort to write a post about it.
But tribalistic explanations could be a factor too (e.g. MAHA has anti-science vibes, and EAs like to stay on the pro-science side).

(This is probably not the most constructive feedback, but my initial reaction to this short form was that it felt like a right-wing analog of left-wing “Why don’t the EAs tweet about Gaza?”-style criticisms).

harfe 17 Nov 2025 18:28 UTC
2 points
1 ∶ 0
in reply to: Singer Robin’s comment on: Singer Robin’s Quick takes
I think the undecidability of the halting problem and Rice’s theorem are being misapplied here. It is true that no algorithm can determine, for every possible program and input, whether that program will halt. But for specific programs and inputs, it is often possible to figure out whether they halt or not.

I agree that there is no method that allows us to check all possible AGI designs for a specific nontrivial behavioral property. But this does not prevent us from selecting an AGI design for which we can prove that it has a specific behavioral property!
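
To make the distinction concrete, here is a minimal Python sketch (not from the original discussion). It assumes a toy model in which a "program" is a deterministic step function over hashable states that returns None when it halts; the names `analyze`, `countdown`, `flip`, and `count_up` are hypothetical. The checker gives a definite answer for some specific programs and says "unknown" for the rest, which is all that undecidability requires.

```python
from typing import Callable, Hashable, Optional

State = Hashable
# Toy model (an assumption): a program is a deterministic step function
# that maps a state to the next state, or to None once it has halted.
Step = Callable[[State], Optional[State]]


def analyze(step: Step, start: State, max_steps: int = 10_000) -> str:
    """Return "halts", "loops", or "unknown" for one specific program and input."""
    seen = set()
    state: Optional[State] = start
    for _ in range(max_steps):
        if state is None:
            return "halts"
        if state in seen:
            # A deterministic program that revisits a state repeats forever.
            return "loops"
        seen.add(state)
        state = step(state)
    # Budget exhausted: no verdict, which is allowed. A total decider is
    # impossible, but a partial one that sometimes abstains is not.
    return "unknown"


def countdown(n: int) -> Optional[int]:
    """Halts: counts down to zero."""
    return None if n == 0 else n - 1


def flip(n: int) -> int:
    """Never halts: alternates between 0 and 1."""
    return 1 - n


def count_up(n: int) -> int:
    """Never halts, but never repeats a state, so this checker cannot tell."""
    return n + 1


if __name__ == "__main__":
    print(analyze(countdown, 5))
    print(analyze(flip, 0))
    print(analyze(count_up, 0, max_steps=1000))
```

The same shape applies to the second point: nothing here decides the property for every program, but one can choose to deploy only programs for which the checker (or a proof) returns a definite verdict.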

harfe 30 May 2025 9:10 UTC
3 points
0 ∶ 0
in reply to: Zach Stein-Perlman’s comment on: Jacob Watts’s Quick takes
Can you say more on why you think a 1:24 ratio is the right one (as opposed to lower or higher ratios)? And how might this ratio differ for people who have different beliefs than you, for example about xrisk, LTFF, or the evilness of these companies?

harfe 22 May 2025 22:41 UTC
6 points
4 ∶ 0
in reply to: Siebe’s comment on: SiebeRozendal’s Shortform
I do not recall seeing this usage in AI safety or LW circles. Can you link to examples?

harfe 18 Apr 2025 16:19 UTC
11 points
3 ∶ 0
in reply to: Joseph Lemien’s comment on: casebash’s Shortform
Once upon a time, some people were arguing that AI might kill everyone, and EA resources should address that problem instead of fighting malaria. So OpenPhil poured millions of dollars into orgs such as EpochAI (they got 9 million). Now 3 people from EpochAI created a startup to provide training data to help AI replace human workers. Some people are worried that this startup increases AI capabilities, and therefore increases the chance that AI will kill everyone.

harfe 18 Apr 2025 15:39 UTC
4 points
2 ∶ 0
in reply to: sammyboiz🔸’s comment on: Alignment is not *that* hard

However, a model trained to obey the RLHF objective will expect negative reward if decided taking over the world

If an AI takes over the world, there is no one around to give it a negative reward. So the AI will not expect a negative reward for taking over the world.

harfe 17 Apr 2025 6:13 UTC
11 points
5 ∶ 2
on: Alignment is not *that* hard
The issue is not whether the AI understands human morality. The issue is whether it cares.

The arguments from the “alignment is hard” side that I was exposed to don’t rely on the AI misinterpreting what the humans want. In fact, superhuman AI is assumed to be better than humans at understanding human morality. It still could do things that go against human morality. Overall I get the impression you misunderstand what alignment is about (or maybe you just have different associations with words such as “alignment” than I do).

Whether a language model can play a nice character that would totally give back the dictatorial powers after takeover is barely any evidence about whether the actual super-human AI system will step back from its position of world dictator after it has accomplished some tasks.

harfe 28 Feb 2025 15:16 UTC
0 points
0 ∶ 0
in reply to: calebp’s comment on: Donor Lotteries Aren’t Worth the Effort

How is that better than individuals just donating to wherever they think makes sense on the margin?

I think the comment already addresses that here:

moreover, rule by committee enables deliberation and information transfer, so that persuasion can be used to make decisions and potentially improve accuracy or competence at the loss of independence.

harfe 18 Feb 2025 17:40 UTC
1 point
1 ∶ 0
in reply to: Grayden 🔸’s comment on: Why EA can (and should) appeal to Christians

This article has a lot of downvoting (net karma of 39 from 28)

This does not seem to be an unusual amount of downvoting to me. The net karma is even higher than the number of votes!

As a more general point, I think people should worry less about downvotes on posts with a high net karma.

harfe 16 Feb 2025 21:25 UTC
5 points
0 ∶ 0
on: Could humanity be saved by sending people to other planets (like Mars)?
As for existential risk from AI takeover, I don’t think having a self-sustaining civilization on Mars would help much.

If an AI has completed takeover on Earth and killed all humans on Earth, taking over Mars too does not sound that hard, especially since the human civilization there is likely quite fragile. (There might be some edge cases, where you solve the AI control problem well enough to guarantee that all advanced AIs leave Mars alone, but not well enough for AI to leave Australia alone, but I think scenarios like these are extremely unlikely).

For other existential risks, it might be in principle useful, but practically very difficult. Building a self-sustaining city on Mars will take a lot of time and resources. On the scale of centuries, it seems like a viable option though.

harfe 13 Feb 2025 13:57 UTC
3 points
0 ∶ 0
in reply to: David Mathers🔸’s comment on: Matthew_Barnett’s Shortform

At the same time though I don’t think you mean to endorse 1).

I have read or skimmed some of his posts and my sense is that he does endorse 1). But at the same time he says

critics seem to frequently conflate my arguments with other, simpler positions that can be more easily dismissed.

so maybe this is one of these cases and I should be more careful.

harfe 13 Feb 2025 13:40 UTC
1 point
0 ∶ 0
in reply to: gergo’s comment on: Ideas EAIF is excited to receive applications for
A recent comment says that the restriction has been lifted and the website will be updated next week: https://forum.effectivealtruism.org/posts/aBkALPSXBRjnjWLnP/announcing-the-q1-2025-long-term-future-fund-grant-round?commentId=FFFMBth8v7WBqYFzP

harfe 6 Feb 2025 13:43 UTC
3 points
3 ∶ 0
on: Why misaligned AGI won’t lead to mass killings (and what actually matters instead)

the AI won’t ever have more [...] capabilities to hack and destroy infrastructure than Russia, China or the US itself.

Having better hacking capability than China seems like a low bar for super-human AGI. The AGI would need to be better at writing and understanding code than a small group of talented humans, and have access to some servers. This sounds easy if you accept the premise of smarter-than-human AGI.

harfe 5 Feb 2025 18:30 UTC
6 points
3 ∶ 5
in reply to: Eugenics-Adjacent’s comment on: Eugenics-Adjacent’s Quick takes
Merely listing EA under “Memetics adjacence” does not support the claim “is also an avowed effective altruist.”