Co-Director of Equilibria Network: https://eq-network.org/
I try to write as if I were having a conversation with you in person.
I would like to claim that my current safety beliefs are a mix between Paul Christiano's, Andrew Critch's, and Def/Acc.
Jonas Hallgren
The number of applications will affect the counterfactual value of applying. Stating your expected number might lower the number of people who apply, but I would still appreciate having a range of expected applicants for the AI Safety roles.
What is the expected number of people applying for the AI Safety roles?
I'm getting the vibe that your priors lean toward the world being in a multipolar scenario in the future. I'm interested more specifically in your predictions for multipolarity versus a singleton given shard-theory thinking, as it seems unlikely for recursive self-improvement to happen in the way described, given what I understand of your model?
Great post; I enjoyed it.
I've got two things to say. The first is that GPT is a very nice brainstorming tool, as it generates many more ideas than you could yourself, which you can then prune from, which is nice.
Secondly, I've been doing "peer coaching" with some EA people using reclaim.ai (not sponsored) to automatically book meetings each week, where we take turns being the mentor and mentee, answering the five following questions:
- What's on your mind?
- When would today's session be a success?
- Where are you right now?
- How do you get where you want to go?
- What are the actions/first steps to get there?
- Ask for feedback

I really like the framing of meetings with yourself; I'll definitely try that out.
Alright, that makes sense; thank you!
Isn't expected value calculated as probability times utility? And as a consequence, isn't the "higher risk" part wrong if one simply looks at it like this? (Going from 20% to 10% would be 10x the impact of going from 2% to 1%.)
(I could be missing something here, please correct me in that case)
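To make the arithmetic I have in mind concrete, here's a minimal sketch (the value at stake is normalized to 1 purely for illustration):

```python
# Expected-value framing: the value of a risk reduction is
# (probability reduction) x (value at stake), holding the stake fixed.
stake = 1.0  # normalized value of what is at risk

big_reduction = (0.20 - 0.10) * stake    # cutting risk from 20% to 10%
small_reduction = (0.02 - 0.01) * stake  # cutting risk from 2% to 1%

# The 20% -> 10% intervention removes ten times as many
# percentage points of risk as the 2% -> 1% one.
print(big_reduction / small_reduction)
```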
I didn't mean it in that sense. I think the lesson you drew from it is fair in general; I was just reacting to the things I felt you swept under the rug, if that makes sense.
Sorry, Pablo, I meant that I became a lot more epistemically humble; I should have thought more about how I phrased it. It was more that I went from the opinion that many worlds is probably true to: "Oh man, there are some weird answers to the Wigner's friend thought experiment, and I shouldn't give major weight to any of them." So I'm now at maybe 20% on many worlds?
That being said, I am overconfident from time to time, and it's fair to point that out as well. Maybe you were being overconfident in saying that I was overconfident? :D
I will say that I thought the consciousness/p-zombie distinction was very interesting and a good example of overconfidence, as this didn't come across in my previous comment.
Generally, some good points across the board that I agree with. Talking with some physicist friends helped me debunk the many-worlds thing Yud has going. Similarly, his animal consciousness stuff seems a bit crazy as well. I will also say that I feel you're coming off way too confident and inflammatory in your general tone. The AI Safety argument you provided was just dismissal without much explanation. Also, when it comes to the consciousness stuff, I honestly just get kind of pissed reading it, as I feel you're to some extent pandering hard to dualism.
I totally agree with you that Yudkowsky is way overconfident in the claims he makes. Ironically enough, it also seems that you are, to some extent, as well in this post, since you're overgeneralizing from insufficient data. As a fellow young person, I recommend some more caution when it comes to solid claims about topics where you have little knowledge (you cherry-picked data on multiple occasions in this post).
Overall, you made some good points, though, so it was still a thought-provoking read.
Maybe frame it more as if you're talking to a child. Yes, you can tell the child to do something, but how can you be certain that it will do it?
Similarly, how can we trust the AI to actually follow the prompt? To trust it, we would fundamentally have to understand the AI, or safeguard against problems if we don't understand it. The question then becomes how your prompt is represented in machine language, which is very hard to answer.
To reiterate, ask yourself, how do you know that the AI will do what you say?
(Leike responds to this here if anyone is interested)
John Wentworth has a post on Godzilla strategies where he claims that using an AGI to solve the alignment problem is like asking Godzilla to make a larger Godzilla behave. How will you ensure you don't overshoot the intelligence of the agent you're using to solve alignment and fall into the "Godzilla trap"?
Advice for new alignment people: Info Max
Max Tegmark's new Time article on how we're in a Don't Look Up scenario [Linkpost]
TL;DR: I totally agree with the general spirit of this post: we need people to solve alignment, and we're not on track. Go and work on alignment, but before you do, try to engage with the existing research; there are reasons why it exists. There are a lot of things not being worked on within AI alignment research, and I can almost guarantee that within six months to a year, you can find things that people haven't worked on.
So go and find these underexplored areas in a way where you engage with what people have done before you!
There's no secret elite SEAL team coming to save the day. This is it. We're not on track.
If timelines are short and we don't get our act together, we're in a lot of trouble. Scalable alignment (aligning superhuman AGI systems) is a real, unsolved problem. It's quite simple: current alignment techniques rely on human supervision, but as models become superhuman, humans won't be able to reliably supervise them.
But my pessimism on the current state of alignment research very much doesn't mean I'm an Eliezer-style doomer. Quite the opposite, I'm optimistic. I think scalable alignment is a solvable problem, and it's an ML problem, one we can do real science on as our models get more advanced. But we gotta stop fucking around. We need an effort that matches the gravity of the challenge.[1]
I also agree that Eliezer's style of doom seems uncalled for and that this is a solvable but difficult problem. My personal p(doom) is around 20%, which I think is quite reasonable.
Barely anyone is going for the throat of solving the core difficulties of scalable alignment. Many of the people who are working on alignment are doing blue-sky theory, pretty disconnected from actual ML models. Most of the rest are doing work that's vaguely related, hoping it will somehow be useful, or working on techniques that might work now but predictably fail to work for superhuman systems.
Now, I do want to push back on this claim, as I see it made by a lot of people who haven't fully engaged with the more theoretical alignment landscape. There are only around 300 people working on alignment, but those people are actually doing things, and most of them aren't doing blue-sky theory.
A note on the ARC claim:
But his research now ("heuristic arguments") is roughly "trying to solve alignment via galaxy-brained math proofs." As much as I respect and appreciate Paul, I'm really skeptical of this: basically all deep learning progress has been empirical, often via dumb hacks[3] and intuitions, rather than sophisticated theory. My baseline expectation is that aligning deep learning systems will be achieved similarly.[4]

This is essentially a claim about the methodology of science: that working on existing systems gives more information and breakthroughs than working on blue-sky theory. The current hypothesis for this is that real-world research is simply a lot more information-rich. This is, however, not the only way to get real-world feedback loops. Christiano is not working on blue-sky theory; he's using real-world feedback loops in a different way: he looks at the real world and looks for information that's already there.
A discovery of this type is, for example, the tragedy of the commons: whilst we could have created computer simulations to see the process in action, it's 10x easier to look at the world and see the real-time failures. His research methodology is to tell stories and see where they fail in the future. This gives bits of information on where to do future experiments, like how we could tell that humans would fail to stop overfishing without actually running an experiment on it.
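For intuition, here's a toy version of the kind of simulation one could have run instead of just observing the world (all dynamics and numbers are made up for illustration):

```python
# Toy tragedy-of-the-commons model: several agents harvest from a shared
# fish stock each season; the stock regrows proportionally, up to a cap.
def simulate(agents=5, stock=100.0, share=0.15, regrowth=1.3, seasons=20):
    for _ in range(seasons):
        harvest = sum(stock * share for _ in range(agents))  # each takes a cut
        stock = max(stock - harvest, 0.0)
        stock = min(stock * regrowth, 100.0)  # regrow, capped at carrying capacity
    return stock

print(simulate(share=0.15))  # greedy harvesting drives the stock toward zero
print(simulate(share=0.02))  # restrained harvesting keeps the stock sustainable
```

The point stands either way: the real world already ran this "experiment" for us, which is exactly the information-richness Christiano's methodology exploits.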
This is also what John Wentworth does with his research; he uses the real world as a reference frame that is quite rich in information. A good question, then, is why we haven't seen many empirical predictions from Agent Foundations. I believe it is because alignment is quite hard; specifically, it is hard to define agency in a satisfactory way due to some really fuzzy problems (boundaries, among others), and therefore hard to make predictions.
We don't want to mathematize things too early either, as doing so would lock us into a predefined reference frame that might be hard to escape. We want to find the right ballpark for agents first, since if we fail, we might base evaluations on something that turns out to be false.
In general, there's a difference between the types of problems in alignment and in empirical ML; the reference class of a "sharp left turn" is different from something empirically verifiable, as it is unclearly defined, so a good question is how we should turn one into the other. The question of how we turn recursive self-improvement, inner misalignment, and agent foundations into empirically verifiable ML experiments is actually something that most of the people I know in AI alignment are currently actively working on.
This post from Alexander Turner is a great example of doing this, as they try "just retargeting the search".
Other people are trying other things, such as bounding the maximisation in RL with quantilisers. This would, in turn, make AI more "content" with not maximising. (A fun parallel to how utilitarianism shouldn't be unbounded.)
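For intuition, a minimal sketch of the quantiliser idea: instead of always taking the single highest-utility action (the argmax), sample uniformly from the top q-fraction of actions. The function names and numbers here are my own illustration, not anyone's actual implementation:

```python
import random

def quantilize(actions, utility, q=0.1, rng=random):
    # Rank actions by utility, then sample uniformly from the top
    # q-fraction rather than deterministically taking the best one.
    ranked = sorted(actions, key=utility, reverse=True)
    cutoff = max(1, int(len(ranked) * q))
    return rng.choice(ranked[:cutoff])

actions = list(range(100))
pick = quantilize(actions, utility=lambda a: a, q=0.1)
print(pick)  # some action from the top 10, not necessarily the maximum (99)
```

The appeal is that the agent no longer relentlessly pushes for the extreme of its utility function, which bounds how badly a mis-specified utility can go wrong.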
I could go on with examples, but what I really want to say is that alignment researchers are doing things; it's just hard to see why they're doing them when you're not doing alignment research yourself. (If you want to start, book my Calendly and I might be able to help you.)
So what does this mean for an average person? You can make a huge difference by going in, engaging with the arguments, and coming up with counter-examples, experiments, and theories of what is actually going on.
I just want to say that it's most likely paramount to engage with the existing alignment research landscape first, as it's free information, and it's easy to fall into traps if you don't. (A good resource for avoiding some traps is John's "Why Not Just" sequence.)
There's a couple of years' worth of research there; it is not worth rediscovering from the ground up. Still, this shouldn't stop you; go and do it. You don't need a hero licence.
Glad to hear it!
The Benefits of Distillation in Research
Great tool; I've enjoyed it and used it for two years. I (a random EA) would recommend it.
Thank you for this! I'm hoping that this enables me to spend a lot less time on hiring in the future. I feel this is a topic that could easily have taken me 3x the effort to understand if I hadn't gotten some very good resources from this post, so I will definitely check out the book. Again, awesome post!
Thank you for this post! I will make sure to read the 5/5 books that I haven't read yet. I'm especially excited about Joseph Henrich's book from 2020; I had read The Secret of Our Success before, but not that one.
I actually come at moral progress from an AI Safety interest. The question for me is, to some extent, how we can set up AI systems so that they continuously improve "moral progress", as we don't want to leave our fingerprints on the future.
In my opinion, the larger AI Safety dangers come from a "big data hell" like the ones described in Yuval Noah Harari's Homo Deus or Paul Christiano's slow take-off scenarios.
Therefore, we want to figure out how to set up AIs in such a way that the structure of their use automatically improves moral progress. I also believe that AI will most likely go through a process similar to the one described in The Secret of Our Success, and that we should prepare appropriate optimisation functions for it.
So, if you ever feel like we might die from AI, I would love to see some work in that direction!
(Happy to talk more about it if you're up for it.)