A handful of ideas (things that tickle my aesthetic) from an ex-topologist:
https://www.lesswrong.com/posts/Tr7tAyt5zZpdTwTQK/the-solomonoff-prior-is-malign
https://www.lesswrong.com/posts/EbFABnst8LsidYs5Y/goodhart-taxonomy/
https://ai-alignment.com/corrigibility-3039e668638 (and other things from https://ai-alignment.com/ )
The second and third strike me as useful ideas and kind of conceptually cool, but not terribly math-y; rather than feeling like these are interesting math problems, the math feels almost like an afterthought. (I’ve read a little about corrigibility before, and had the same feeling then.) The first is the coolest, but also seems like the least practical—doing math about weird simulation thought experiments is fun but I don’t personally expect it to come to much use.
Thank you for sharing all of these! I sincerely appreciate the help collecting data about how existing AI work does or doesn’t mesh with my particular sensibilities.
To me they feel like pre-formal math? Like the discussion of corrigibility gives me a tingly sense of “there’s what on the surface looks like an interesting concept here, and now the math-y question is whether one can formulate definitions which capture that and give something worth exploring”.
(I definitely identify more with the “theory builder” of Gowers’s two cultures.)
(Terry Tao’s distinction between ‘pre-rigorous’, ‘rigorous’, and ‘post-rigorous’ maths might also be relevant.)
Ah, that’s a good way of putting it! I’m much more of a “problem solver.”
Cool!
My opinionated takes for problem solvers:
(1) Over time we’ll predictably move from “need theory builders” towards “need problem solvers”, so even if you look around now and can’t find anything, it might be worth checking back every now and again.
(2) I’d look at ELK (Eliciting Latent Knowledge) now for sure, as one of the best and furthest-in-this-direction things.
(3) Actually, many things have at least some interesting problems to solve once you get deep enough. I expect curricula teaching ML very much not to do this, but if you have mastery of ML and are trying to achieve new things with it, many more interesting-problems-to-solve come up. Unfortunately I don’t know how to predict how much of the itch this will address for you … maybe one question is how much satisfaction you find in solving problems outside of pure mathematics? (e.g. logic puzzles, but also things in other domains of life)
The point about checking back in every now and then is a good one; I had been thinking in more binary terms and it’s helpful to be reminded that “not yet, maybe later” is also a possible answer to whether to do AI safety research.
I like logic puzzles, and I like programming insofar as it’s like logic puzzles. I’m not particularly interested in e.g. economics or physics or philosophy. My preferred type of problem is very clear-cut and abstract, in the sense of being solvable without reference to how the real world works. More “is there an algorithm with time complexity Y that solves math problem X” than “is there a way to formalize real-world problem X into a math problem for which one might design an algorithm.” Unfortunately AI safety seems to be a lot of the latter!
Maybe the notes on ‘ascription universality’ on ai-alignment.com are a better match for your sensibilities.