Make it an ‘undebate’. 10 points for every time you learn something, and 150 points for changing your mind on the central proposition.
Also, I’d like to see RLHF[1] debated: whether any form of RL on realistic text data can take us to a point where the resulting system is “smart enough”, either to help us align higher intelligences or simply smart enough for what we need.
[1] Reinforcement Learning from Human Feedback, a strategy for AI alignment.
I wish the forum had a feature where writing [[RLHF]] automatically creates an internal link to the topic page, or to wherever RLHF is defined in the wiki. It’s standard in personal knowledge management systems like Obsidian, Roam, and RemNote, and I think Arbital does it.
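As a rough illustration of what such a feature involves (not how LessWrong, Arbital, or any of those tools actually implement it), a [[wikilink]] pass over comment text can be a simple regex rewrite. The `expandWikilinks` function name and the `/tag/<slug>` URL scheme below are hypothetical:

```typescript
// Minimal sketch of [[wikilink]] auto-linking. Assumes a hypothetical forum
// where topic pages live at /tag/<slug>; names and URL scheme are illustrative.

/** Convert "[[RLHF]]" or "[[RLHF|reward modeling]]" into an HTML link. */
function expandWikilinks(source: string): string {
  return source.replace(
    // Match [[target]] or [[target|display label]]
    /\[\[([^\]|]+)(?:\|([^\]]+))?\]\]/g,
    (_match, target: string, label?: string) => {
      const slug = target.trim().toLowerCase().replace(/\s+/g, "-");
      const text = (label ?? target).trim();
      return `<a href="/tag/${slug}">${text}</a>`;
    },
  );
}

// Example:
//   expandWikilinks("I'd like to see [[RLHF]] debated.")
//   => `I'd like to see <a href="/tag/rlhf">RLHF</a> debated.`
```

A real implementation would also need to resolve the slug against the wiki’s actual tag index (and probably skip links inside code blocks), but the core transformation is this small.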