What is the difference between this, ARC, Redwood Research, MIRI, and Anthropic?
Different approaches. ARC, Anthropic, and Redwood seem to be more in the “prosaic alignment” field (see e.g. Paul Christiano’s post on that). ARC seems to be focusing on eliciting latent knowledge (getting human-relevant information out of the AI that the AI knows but has no reason to inform us of). Redwood is aligning text-based systems and hoping to scale up. Anthropic is looking at a lot of interlocking smaller problems that will (hopefully) be of general use for alignment. MIRI seems to focus on some key fundamental issues (logical uncertainty, inner alignment, corrigibility), and, undoubtedly, a lot of stuff I don’t know about. (Apologies if I have mischaracterised any of these organisations.)
Our approach is to solve values extrapolation, which we see as a comprehensive and fundamental problem, and to address the other specific issues as applications of this solution (MIRI’s work being the main exception, since values extrapolation has pretty weak connections with logical uncertainty and inner alignment).
But the different approaches should be quite complementary: progress by any group should make the task easier for the others.
This is very helpful, thank you!
Comment copied to new “Stuart Armstrong” account.
For more discussion on this, see the comment threads on this LessWrong post and this Alignment Forum post.
If for some reason OP thinks this is unhelpful, feel free to remove my comment.
Congratulations, very exciting!
(FYI the hyperlink from “buildaligned.ai” doesn’t work for some reason, but pasting that URL into one’s address bar does.)
Thanks! Should be corrected now.