Jonas Hallgren 🔸

Karma: 490

Co-Director of Equilibria Network: https://eq-network.org/

I try to write as if I were having a conversation with you in person.

I would like to claim that my current safety beliefs are a mix between Paul Christiano’s, Andrew Critch’s and Def/Acc.

Jonas Hallgren 🔸Jun 21, 2025, 9:37 AM
2 points
0 ∶ 0
in reply to: Ben_West🔸’s comment on: A deep critique of AI 2027’s bad timeline models
Besides the point that “shoddy toy models” might be emotionally charged, I just want to point out that accelerating progress majorly increases variance and unknown unknowns? The higher energy a system is and the more variables you have the more chaotic it becomes. So maybe an answer is that a agile short-range model is the best? Outside view it in moderation and plan with the next few years being quite difficult to predict?
You don’t really need another model to disprove an existing one, you might as well point out that we don’t know and that is okay too.

Jonas Hallgren 🔸Jun 9, 2025, 6:04 AM
1 point
0 ∶ 0
in reply to: Patrick Gruban 🔸’s comment on: Don’t Panic About Democracy in EA
Yeah, I think you’re right and I also believe that it can be a both and?
You can have a general non-profit board and at the same time have a form of representative democracy going on which seems the best we can currently do for this?
I think it is fundamentally about a more timeless trade-off between hierarchical organisations that generally are able to act with more “commander’s intent” versus democratic models that are more of a flat voting model. The democratic models suffer when there is a lot of single person linear thinking involved but do well at providing direct information for what people care about whilst the inverse is true for the hierarchical one and the project of good governance is to some extent somewhere in between.

Jonas Hallgren 🔸Jun 9, 2025, 5:58 AM
1 point
0 ∶ 0
in reply to: Patrick Gruban 🔸’s comment on: Don’t Panic About Democracy in EA
Yeah for sure, I think the devil might be in the details here around how things are run and what the purpose of the national organisation is. Since Sweden and Norway have 8x less of a population than germany I think the effect of a “nation-wide group” might be different?
In my experience, I’ve found that EA Sweden focuses on and provides a lot of the things that you listed so I would be very curious to hear what the difference between a local and national organisation would be? Is there a difference in the dynamics of them being motivated to sustain themselves because of the scale?
You probably have a lot more experience than me in this so it would be very interesting to hear!

Jonas Hallgren 🔸Jun 8, 2025, 3:50 PM
3 points
0 ∶ 0
in reply to: Jason’s comment on: Don’t Panic About Democracy in EA
I like that decomposition.
There’s something about a prior on having democratic decision making as part of this because it allows for better community engagement usually? Representation often leads to feelings of inclusion and whilst I’ve only dabbled in the sociology here it seems like the option of saying no is quite important for members to feel heard?
My guess would be that the main pros of having democratic deliberation doesn’t come from when the going is normal but rather as a resillience mechanism? Democracies tend to react late to major changes and not change path often but when they do they do it properly? (I think this statement is true but it might as well be a cultural myth that I’ve heard in the social choice adjacent community.)

Jonas Hallgren 🔸Jun 8, 2025, 3:44 PM
1 point
0 ∶ 0
in reply to: Yi-Yang’s comment on: How Democratic Is Effective Altruism — Really?
I think I went through it in Spring 2021? I remember discussing it then as one of the advanced optional topics, maybe around steering versus rowing and that the discussion went into that? I can’t remember it more clearly than that though.

Jonas Hallgren 🔸Jun 8, 2025, 7:13 AM
19 points
4 ∶ 0
in reply to: Patrick Gruban 🔸’s comment on: Don’t Panic About Democracy in EA
First and foremost, I think the thoughts expressed here make sense and this comment is more just expressing a different perspective, not necessarily disagreeing.

I wanted to bring up an existing framework for thinking about this from Raghuram Rajan’s “The Third Pillar,” which provides economic arguments for why local communities matter even when they’re less “efficient” than centralized alternatives.
The core economic benefits of local community structures include:
- Information advantages: Local groups understand context that centralized organizations miss
- Adaptation capacity: They can respond quickly to local opportunities and constraints
- Social capital generation: They create trust networks that enable coordination
- Motivation infrastructure: They provide ongoing support that sustains long-term engagement
So when you bring up the question of efficiency and adherence to optimal reflective practices I start thinking about it from a more systemic perspective.
Here’s a question that comes to mind: if local EA communities make people 3x more motivated to pursue high-impact careers, or make it much easier for newcomers to engage with EA ideas, then even if these local groups are only operating at 75% efficiency compared to some theoretical global optimum, you still get significant net benefit.
I think this becomes a governance design problem rather than a simple efficiency question. The real challenge is building local communities that capture these motivational benefits while maintaining mechanisms for critical self-evaluation. (Which I think happens through impact evaluations and similar at least in EA Sweden.)
I disagree with the pure globalization solution here. From a broader macroeconomic perspective, we’ve seen repeatedly that dismantling local institutions in favor of “more efficient” centralized alternatives often destroys valuable social infrastructure that’s hard to rebuild. The national EA model might be preserving something important that pure optimization would eliminate.

Jonas Hallgren 🔸Apr 18, 2025, 7:00 AM
1 point
0 ∶ 0
on: Conditional Forecasting As Model Parameterization
This is very nice!

I’ve been thinking that there’s a nice generalisable analogy between bayesian updating and forecasting. (It is quite no shit when you think about it but it feels like people aren’t exploiting it?)
I’m doing a project on simulating a version of this idea but in a way that utilizes democratic decision making called Predictive Liquid Democracy (PLD) and I would love to hear if you have any thoughts on the general setup. It is model parameterization but within a specific democratic framing.
PLD is basically saying the following:
What if we could set up a trust based meritocratic voting network based on the predictions about how well a candidate will perform? It is futarchy with some twists.
Now for the generalised framing in terms of graphs that I’m thinking of:
1. As an example, if we look at a research network we can say that they’re trying to optimise for a certain set of outcomes (citations, new research) and they’re trying to make predictions that are going to work. P(U|A)
2. From a system perspective it is hard to influence the nodes even though it is possible. We therfore say that the edges of the graph that is the research network is what we’ll optimise. We can then set up a graph that has the signals and graph connections optimised to reach the truth.
3. Since we don’t care about the nodes we can also use AIs to help in a combination with human experts.
I’m writing a paper on setting up the variational mathematics behind this right now. I’m also writing a paper on some more specific simulations of this to run so I’m very grateful for any thoughts you might have of this setup!

Jonas Hallgren 🔸Apr 16, 2025, 9:58 AM
8 points
2 ∶ 0
on: How Democratic Is Effective Altruism — Really?
Some people might find that this post is written from a place of agitation which is fully okay. I think that even if you do there are two things that I would want to point out as really good points:
1. A dependence on funders and people with money as something that shapes social capital and incentives, therefore thought in itself. We should therefore be quite vary of the effect that has on people, this can definetely be felt in the community and I think it is a great point.
2. That the karma algorithm could be revisited and that we should think about what incentives are created for the forum through it.
I think there’s a very very interesting project of democratizingthe EA community in a way that makes it more effective. There are lots of institutional design that we can apply to ourselves and I would be very excited to see more work in this direction!
Edit:
Clarification on why I believe it to cause some agitation for some people:
1. I remember that some of the situation around Cremer being a bit politically loaded and that the emotions were running hot at that time and so citing that specific situation makes it lack a bit of context.
  1. There are some object level things that people within the community disagree with when it comes to these comments that point at deeper issues of epistemics and cause prioritization that is actually difficult to answer.
  2. The post makes it seem more one-sided than that situation was. Elitism in EA is something covered in the in-depth fellowship for example and there’s a bunch of back and forth there but it is an issue that you will arrive at different consequences on depending on what modelling assumptions you do.
  3. I don’t want to make a value judgement on this here, I just want to point out that specifice piece of Cremer’s writing has always felt a bit thorny which makes the references feel a bit inflammatory?
  4. For me it’s the vibe that it is written from a perspective of being post EA and something about when leaving something behind you want to get back at the thing itself by pointing out how it’s wrong? So it is kind of written from a emotionally framed perspective which makes the epistemics fraught?
2. There’s some sort of degree where the framing of the post in itself pattern matches onto other critiques that have felt bad faith and so it is “inflammatory” that it raises the immune system of people reading it. I do still think it is quite a valuable point, it is just that part of the phrasing makes it come across more like this than it has to be?
  1. I think that might be because of LLMs often liking to argue towards a specific point but I’m not sure?
    (You’ve got some writing that is reminiscent of claude so I could spot the use of it: e.g):
This isn’t just a technical issue. This is a design philosophy — one that rewards orthodoxy, punishes dissent, and enforces existing hierarchies.
I liked the post, I think it made a good point, I strong upvoted it but I wanted to mention it as a caveat.

Jonas Hallgren 🔸Apr 1, 2025, 10:57 AM
12 points
3 ∶ 0
on: Introducing The Spending What We Must Pledge
I felt that this post might be relevant for longtermism and person affecting views so I had claude write up a quick report on that:

In short: Rejecting the SWWM 💸11% pledge’s EV calculation logically commits you to person-affecting views, effectively transforming you from a longtermist into a neartermist.
Example: Bob rejects investing in a $500 ergonomic chair despite the calculation showing 10^50 * 1.2*10^-49 = 12 lives saved due to “uncertainty in the probabilities.” Yet Bob still identifies as a longtermist who believes we should value future generations. This is inconsistent, as longtermism fundamentally relies on the same expected value calculations with uncertain probabilities that SWWM uses.
The 🔮 Badge
If you’ve rejected the SWWM 💸11% Pledge while maintaining longtermist views, we’d appreciate if you could add the 🔮 “crystal ball” emoji to your social media profiles to signal your epistemic inconsistency.
FAQ
Why can’t I reject SWWM but stay a longtermist? Both longtermism and SWWM rely on the same decision-theoretic framework of accepting tiny probabilities of affecting vast future populations. Our analysis shows the error bars in SWWM calculations (±0.0000000000000000000000000000000000000000000001%) are actually narrower than the error bars in most longtermist calculations.
What alternatives do I have?
1. Accept the SWWM 💸11% pledge (consistent longtermist)
2. Reject both SWWM and longtermism (consistent person-affecting view)
3. Add the 🔮 emoji to your profile (inconsistent but transparent)
According to our comprehensive Fermi estimate, maintaining consistency between your views on SWWM and longtermism is approximately 4.2x more philosophically respectable.

Jonas Hallgren 🔸Mar 17, 2025, 8:00 AM
1 point
0 ∶ 0
on: Discussion Thread: Existential Choices Debate Week
First and foremost, I’m low confidence here.
I will focus on x-risk from AI and I will challenge the premise of this being the right way to ask the question.
What is the difference between x-risk and s-risk/increasing the value of futures? When we mention x-risk with regards to AI we think of humans going extinct but I believe that to be a shortform for wise compassionate decision making. (at least in the EA sphere)
Personally, I think that x-risk and good decision making in terms of moral value might be coupled to each other. We can think of our current governance conditions a bit like correction systems for individual errors. If they pile up, we go off the rail and increase x-risk as well as chances of a bad future.
So a good decision making system should both account for x-risk and value estimation, therefore the solution is the same and it is a false dichotomy?
(I might be wrong and I appreciate the slider question anyway!)

Jonas Hallgren 🔸Mar 9, 2025, 5:25 PM
4 points
0 ∶ 0
on: In a time of rapid change, we should re-examine system-level interventions
First and foremost, I agree with the point. I think looking at this especially from a lens of transformative AI might be interesting. (Coincidentally this is something I’m currently doing using ABMs with LLMs)
You probably know this one but here’s a link to a cool project: https://effectiveinstitutionsproject.org/

Dropping some links below, I’ve been working on this with a couple of people in Sweden for the last 2 years, we’re building an open source platform for better democratic decision making using prediction markets:
https://digitaldemocracy.world/flowback-the-future-of-democracy/
The people I’m working with there are also working on:
https://monreform.org/
I know the general space here so if anyone is curious I’m happy to link to people doing different things!
You might also want to check out:
https://metagov.org/

Jonas Hallgren 🔸Mar 4, 2025, 1:35 PM
3 points
0 ∶ 0
on: Where are all the deepfakes?
I guess a random thought I have here is that you would probably want video and you would probably want it to be pretty spammable so you have many shots at it. Looking at twitter we already see like a large amounts of bots around commenting on things which is like a text deepfake.

Like I can see in a year or so when SORA is good enough that creating a short form stabel video is easy we will see a lot more manipulation of voters through various social media through deepfakes.

(I don’t think the tech is easy enough to use yet for it to be painless to do it even though it is possible. I spent a couple of hours trying to set this up for a showcase once and you had to do some fine-tuning and training stuff, there was no plug and play which is probably a bottleneck for now.)

Jonas Hallgren 🔸Feb 28, 2025, 10:01 AM
3 points
0 ∶ 0
in reply to: Linch’s comment on: Linch’s Shortform
FWIW, I find that if you analyze places where we’ve successfully aligned things in the past (social systems or biology etc.) you find that the 1th and 2nd types of alignment really don’t break down in that way.

After doing Agent Foundations for a while I’m just really against the alignment frame and I’m personally hoping that more research in direction will happen so that we get more evidence that other types of solutions are needed. (e.g alignment of complex systems such as has happened in biology and social systems in the past)

Jonas Hallgren 🔸Feb 13, 2025, 10:03 AM
1 point
0 ∶ 2
in reply to: Matthew_Barnett’s comment on: Matthew_Barnett’s Shortform
FWIW, I completely agree with what you’re saying here and I think that if you seriously go into consciousness research and especially for what we westerners more label as a sense of self rather than anything else it quickly becomes infeasible to hold a position that the way we’re taking AI development, e.g towards AI agents will not lead to AIs having self-models.
For all matters and purposes this encompasses most theories of physicalist or non-dual theories of consciousness which are the only feasible ones unless you want to bite some really sour apples.
There’s a classic “what are we getting wrong” question in EA and I think it’s extremely likely that we will look back in 10 years and say, “wow, what are we doing here?”.
I think it’s a lot better to think of systemic alignment and look at properties that we want for the general collective intelligences that we’re engaging in such as our information networks or our institutional decision making procedures and think of how we can optimise these for resillience and truth-seeking. If certain AIs deserve moral patienthood then that truth will naturally arise from such structures.
(hot take) Individual AI alignment might honestly be counter-productive towards this view.

Jonas Hallgren 🔸Feb 9, 2025, 10:50 AM
2 points
0 ∶ 0
on: FIRE and doing good
I’m not a career councellor so take everything with a grain of salt but you did publically post this asking for unsolicited advice, so here you go!

So, more directly if you’re thinking of EA as a community that needs specific skills and you’re wondering what to do, your people management skills, strategy & general leadership skills are likely to be high in demand from other organisations: https://forum.effectivealtruism.org/posts/LoGBdHoovs4GxeBbF/meta-coordination-forum-2024-talent-need-survey

Someone else mentioned that enjoyment can be highly organisation specific and even specific to the stage of the organisation.
My thought is something like:
1. Take a year off and commit yourself to only doing exploration during the year, try out working in different organisations at different scales, maybe more early stage maybe later stage, I’m sure you got some knowledge on what is best here.
  1. Here’s a fun book that mentions optimal exploration exploitation, I think of this a lot when it comes to my own life, it might be useful:
    https://www.goodreads.com/book/show/25666050-algorithms-to-live-by
  2. I thought this book was pretty good for a very specific strategy of quick career role exploration and how you can go about doing that:
    https://www.goodreads.com/book/show/26046333-designing-your-life
2. Think about what roles that you can leverage your strategic people management & leadership skills whilst still enjoying the work? If you really want to do more coding then a CTO or similar role somewhere probably makes a lot of sense.
  1. Maybe you could work at a deep tech company?
  2. Maybe early stage startups is something you enjoy more, maybe you’re more of a zero to one type of person?
  3. Figure out what it exactly is that you don’t enjoy, you might be surprised, you might not be.
3. Test, test, test. If you’ve found yourself able to do this in the past you have a lot of clout to be able to do it again, it is a lot easier for an executive to get investment again, know what people to hire, etc.
4. I know a bunch of people who have felt similar things to what you’re doing in this moment, specifically people in executive managerial roles. The pattern that I see from everyone is that take a break (shocker!) and then it really varies when it comes to how fast they get back into it again.
  1. Maybe there are specific mental health things you can improve that makes you 20% more effective at listening that can really help at the next thing?
  2. I like to think of it as them decompressing and learning the lessons from the past very focused period before getting back at it again.
Those are some random thoughts, best of luck to you!

Jonas Hallgren 🔸Jan 20, 2025, 9:56 PM
9 points
2 ∶ 1
on: What are we doing about the EA Forum? (Jan 2025)
So I’ll just give some reporting on a vibe I’ve been feeling on the forum.
I feel a lot more comfortable posting on LessWrong compared to the EA forum because it feels like there’s a lot more moral outrage here? Like if I go back 3 or 4 years I felt that the forum was a lot more open to discussing and exploring new ideas. There’s been some controversies recently around meat-eater problem stuff and similar and I can’t help but just feel uncomfortable posting stuff with how people have started to react?
I like the different debate weeks as I think it sets up a specific context to create more content which is quite great. Maybe it’s a vibe thing, maybe it’s something else but I feel that the virtue of open-hearted truthseeking is missing compared to a couple of years back and it makes me want to avoid posting.
I do believe that the post standard should be lowered at least a bit and for things to be more exploratory again. So uhhhm, more events that invite more community writing and engagement?

Jonas Hallgren 🔸Dec 24, 2024, 3:53 PM
4 points
0 ∶ 1
on: GiveWell may have made 1 billion dollars of harmful grants, and Ambitious Impact incubated 8 harmful organisations via increasing factory-farming?
I want to preface that I don’t have a strong opinion here, just some curiosity and a question.

If we are focusing on second order effects wouldn’t it make sense to bring up something like moral circle expansion and its relation to ethical and sustainable living over time as well?

From a long-term perspective, I see one of the major effects of global health being better decision making through moral circle expansion.

My question to you is then what time period you’re optimising for? Does this matter for the argument?

Jonas Hallgren 🔸Dec 19, 2024, 8:53 AM
6 points
0 ∶ 0
in reply to: David Thorstad’s comment on: Blog update: Reflective altruism
Thank you for that substantive response, I really appreciate it! It was also very nice that you mentioned the Turner et.al definitions, I wasn’t expecting that.

(Maybe write a post on that? There’s a comment that mentions uptake from major players in the EA ecosystem and maybe if you acknowledge you understand the arguments they would be more sympathetic? Just a quick thought but it might be worth engaging there a bit more?)

I just wanted to clarify some of the points I was trying to make yesterday as I do realise that they didn’t all get across as I wanted them to.
I completely agree with you on the advancing progress point, I personally am quite against it from a “general”-level, I do not believe that we will be able to counterfactually change the “rowing” speed that much in the grand scheme of things. I also believe that is the conclusion of Toby’s posts if I remember correctly. Toby was rather stating that existential risk reduction is worth a lot compared to any progress that we might be able to make. “Steering” away from the bad stuff is worth more. (That’s the implicit claim from the modelling even though he’s as epistemically humble as you philosophers always are (which is commendable!).)
Now for the power-seeking stuff. I appreciate your careful reasoning about these things and I see what you mean in that there’s no threat model from that claim in itself. If we say that the classical way it is construed is something that is equivalent to minimizing free energy, this is a tautological statement and doesn’t help for existential risk.
I think I can agree with you that we’re not clear enough about the existential risk angle to have a clearly defined goal for what to do. I do think there’s an argument there but that we have to be quite clear with how we’re defining it for it to make foundational sense. A question that arises is if in the process of working on it we get more clarity about what it fundamentally is, similar to a startup figuring out what they’re doing along the way? It might still be worth the resources from a unknown unknown perspective and institutional practices shifting perspective if that makes sense? TAI is such a big thing and it will only happen once so spending those resources on relatively shaky foundations might still make sense?
I’m, however, not sure that this is the case and Wei Dai for example has an entire agenda about “metaphilosophy” where the claim is that we’re too philosophically confused to make sense of alignment. In general, I would agree that ensuring the philosophical and mathematical basis is very important to coordinate the field and it is something I’ve been thinking about for a while.
I personally am trying to import ideas from existing fields that deal with generally intelligent agents in biology and cognitive science such as Active Inference and Computational Biology into the mix to see how TAI will affect society. If we see smaller branches of science as specific offshoots of philosophy then I think the places with the most rigorous thinking on the foundations are the ones that have dealt with it for a long time. I’ve found a lot of interesting models about misalignment in these areas that I think can be transported into the AI Safety frame.
I really appreciate the deconstructive approach that you have to the intellectual foundations of the field. I do believe that there are alternatives to the classic risk story but you have to some extent break down the flaws in the existing arguments in order to advocate for new arguments.
Finally, where I think these threat models come from are arguments similar to the ones in What Failure Looks Like from Paul Christiano and the going out with a wimper idea. This is also explored in Yuval Noah Harari’s books Nexus and Homo Deus. This threat model is more similar to the authoritian capture idea compared to something like a runaway intelligence explosion.
I’m looking forward to more work in this area from you!

Jonas Hallgren 🔸Dec 18, 2024, 12:45 PM
21 points
1 ∶ 0
on: Blog update: Reflective altruism
Thank you for this post David!

I’ve from time to time engaged with my friends in discussion about your criticisms of longtermism and some existential risk calculations. I found that this summary post of your work and interaction calrifies my perspective on the general “inclination” that you have in engaging with the ideas, one that seems like a productive one!
Sometimes, I felt that it didn’t engage with some of the core underlying claims of longtermism and exisential risk which did annoy me.
I want to respect the underlying time spend assymmetry of the following question as I feel I could make myself less ignorant if I had the time to spend which I feel I currently do not have. But what are your thoughts on Toby Ord’s perspective and posts on existential risk?: https://forum.effectivealtruism.org/posts/XKeQbizpDP45CYcYc/on-the-value-of-advancing-progress
and:
https://forum.effectivealtruism.org/posts/hh7bgsDzP6rKZ5bbW/robust-longterm-comparisons

I felt that some of the arguments where about discount rates and that they didn’t really make that much moral sense to me, neither did person-affecting views. I have hard time seeing the arguments for them and maybe that’s just the crux of the matter.

The following will be unfair to say as I haven’t spent the time required to fully understand your models but I sometimes feel that there are deeper underlying assumptions and questions that you pass by in your arguments.

I will be going to a domain I know well, AI Safety. For example, I agree with the power-seeking arguments not being fully true, especially not the early papers yet it doesn’t engage with later follow up work such as:
https://arxiv.org/pdf/2303.16200

Finally I believe that for the power-seeking claim, there’s a large amount of evidence for power-seeking within real world systems. For me it seems an overstep to reject power-seeking due to MIRI work?

You can redefine power-seeking itself as minimizing free energy which is in itself a theory of predictive processing or Active Inference and that has showed to have remarkable predictive capacity for saying useful things about systems that are alive. Yes a specific interpretation of power-seeking may not hold true but for me it is throwing the baby out with the bathwater.

I would love to hear your thoughts here and I’m looking forward to more good-faith discussions! (this is not sarcasm but I’m genuinely happy that you’re engaging with good faith arguments!)

Edit: I do want to clarify that I do not believe that any AI system will converge towards instrumental goals and that it does make sense to question the foundations of the AI Safety assumptions and that I appluade you for doing so. It is rather a question of how much it will do so and under what conditions, in what systems it will do so. (I also made the language less combative)

Jonas Hallgren 🔸Dec 18, 2024, 9:58 AM
8 points
0 ∶ 0
on: There is No EA Sorting Hat
I’ve found a lot of my EA friends falling into this decision paralysis so thank you for this post, I will link this to them!