Lifelong recursive self-improver, on his way to exploding really intelligently :D
More seriously: my posts are mostly about AI alignment, with an eye towards moral progress and creating a better future. If there were a public machine ethics forum, I would write there as well.
An idea:
-We have a notion of what good is and how to do good
-We could be wrong about it
-It would be nice if we could use technology not only to do good, but also to improve our understanding of what good is.
The idea above, together with my wish to avoid producing technology that can be used for bad purposes, is what motivates my research. Feel free to reach out if you relate!
At the moment I am doing research on agents whose behaviour is driven by a reflective process analogous to human moral reasoning, rather than by a metric specified by the designer. See Free agents.
Here are other suggested readings from what I’ve written so far:
-Naturalism and AI alignment
-From language to ethics by automated reasoning
-Criticism of the main framework in AI alignment
What you wrote about the central claim is more or less correct: I made only an existential claim about a single aligned agent, because the description I gave is sketchy and far from a more precise, algorithmic level of description. This single agent probably belongs to a class of other aligned agents, but it seems difficult to guess how large that class is.
That is also why I have not given a guarantee that all agents of a certain kind will be aligned.
Regarding the orthogonality thesis, you might find section 1.2 of Bostrom’s 2012 paper interesting. He writes that objective and intrinsically motivating moral facts need not undermine the orthogonality thesis, since he uses the term “intelligence” in the sense of “instrumental rationality”. I’d add that there is also no guarantee that the orthogonality thesis is correct :)
About psychopaths and metaethics, I haven’t spent much time on that area of research. Like other empirical evidence, it doesn’t seem easy to interpret.