There is a distinction between “control” and “alignment.”
The control problem addresses our fundamental capacity to constrain AI systems, preventing undesired behaviors or capabilities from manifesting, regardless of the system’s goals. Control mechanisms encompass technical safeguards that maintain human authority over increasingly autonomous systems, such as containment protocols, capability limitations, and intervention mechanisms.
The alignment problem, conversely, focuses on ensuring AI systems pursue goals compatible with human values and intentions. This involves developing methods to specify, encode, and preserve human objectives within AI decision-making processes. Alignment asks whether an AI system “wants” the right things, while control asks whether we can prevent it from acting on its wants.
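To make the distinction concrete, here is a toy sketch in Python (purely illustrative; every name is hypothetical and not drawn from any real system): control constrains which actions get executed regardless of what objective produced them, while alignment tries to make the objective itself track human preferences.

```python
from typing import Callable, List

def control_wrapper(propose_action: Callable[[], str],
                    allowed_actions: List[str],
                    max_steps: int = 100) -> List[str]:
    """Control-style safeguard: containment, a capability limit, and an
    intervention point, applied no matter what the system 'wants'."""
    executed = []
    for _ in range(max_steps):               # capability/step limitation
        action = propose_action()
        if action not in allowed_actions:     # containment / intervention point
            break                             # halt rather than execute
        executed.append(action)
    return executed

def aligned_choice(human_preference_score: Callable[[str], float],
                   candidate_actions: List[str]) -> str:
    """Alignment-style approach: the objective being optimized is meant to
    track what humans actually value, so the 'wants' are right to begin with."""
    return max(candidate_actions, key=human_preference_score)
```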
I believe AI will soon have wants, and it’s critical that we align those wants with ours as AIs become increasingly capable.
As far as I’m concerned, humanity will eventually create superintelligence, so it should be the main focus of EA and other groups concerned with AI. As I mentioned in another comment, I don’t have many ideas for how the average EA can contribute here beyond making a career change into AI policy or something similar.
I don’t claim you can align human groups with individual humans. If I’m reading you correctly, I think you’re committing a category error in assigning alignment properties to groups of people like nation-states or companies. Alignment, as I’m using the term, is the alignment of an AI’s goals or values with those of a person or group of people. We expect this, I think, in part because we’re accustomed to telling computers what to do and having them do exactly what we say (though not always exactly what we mean).
Alignment is extremely tricky for the unenhanced human, but theoretically possible. My best guess at solving it would be to automate alignment research and development with AI itself. We’ll soon reach a sufficiently advanced AI capable of reasoning beyond anything anyone on Earth can come up with; we just have to ensure that this AI is aligned, that the one that trained it is also aligned, and so on down the chain. My second-best guess would be through BCIs, and my third would be interpretability via whole-brain emulation.
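As a rough sketch of that “each generation vets the next” idea (all names here are hypothetical placeholders; audit_alignment and train_successor stand in for research processes nobody yet knows how to build):

```python
def audit_alignment(model: dict) -> bool:
    # Placeholder: in reality this check is the hard, unsolved part.
    return model.get("aligned", False)

def train_successor(model: dict) -> dict:
    # Placeholder: the current model automates the next round of R&D.
    return {"generation": model["generation"] + 1, "aligned": True}

def bootstrap_aligned_ai(base_model: dict, generations: int = 3) -> list:
    """Inductive chain of trust: model N may train model N+1 only if
    model N has itself passed an alignment audit."""
    lineage = [base_model]
    current = base_model
    for _ in range(generations):
        if not audit_alignment(current):
            raise RuntimeError("Audit failed; do not extend the chain.")
        current = train_successor(current)
        lineage.append(current)
    return lineage

# Example: bootstrap_aligned_ai({"generation": 0, "aligned": True})
```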
Assuming we even do develop alignment techniques, I’d argue that exclusive alignment (that is, alignment with one person or a small group of people) is more difficult than alignment with humanity at large, for the following reasons (I realize some of these go both ways, but I include them because I see them as more serious for exclusive alignment, value drift for example):
Value drift.
Impossible specification (e.g., in exploring the inherent contradictions in expressed human values, the AGI expands moral consideration beyond initial human constraints, discovering some form of moral universalism or a morality beyond all human reasoning).
Emergent properties appear, producing unexpected behavior, and we cannot align systems to exhibit properties we cannot anticipate.
Exclusive alignment’s instrumental goals may broaden AGI’s moral scope to include more humans (i.e., it may be that broader alignment makes for a more robust AI system).
Competing AGIs designed to align with all of humanity may be successfully created, and would then act as a check on an exclusively aligned system.
Exclusively aligned AGI may still satisfy many, if not all, of the preferences that the rest of humanity possesses.
Exclusive alignment requires perfect internal coordination of values within organizations, but divergent interests inevitably emerge as those organizations scale; these coordination failures multiply when AGI systems interpret instructions literally and optimize against the specified metrics (see the toy sketch after this list).
Alignment requires resolving disagreements over value prioritization, a meta-preference problem. Yet resolving these conflicts necessitates assumptions about how they should be resolved, creating an infinite regress that defies a technical solution.
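On the point above about literal interpretation and optimizing against specified metrics, here is a toy, purely hypothetical illustration of the gap between a written-down proxy and the intended value (the functions and numbers are made up for illustration only):

```python
def specified_metric(report_length: int) -> float:
    # The organization asked for "thorough reports" and measured length.
    return float(report_length)

def intended_value(report_length: int) -> float:
    # What they actually wanted: useful detail up to a point, then padding hurts.
    return report_length - 0.02 * report_length ** 2

candidates = range(0, 200, 10)
literal_optimum = max(candidates, key=specified_metric)   # 190: maximally padded
intended_optimum = max(candidates, key=intended_value)    # 20: actually useful

print(literal_optimum, intended_optimum)
```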