PhD-ing. I think and write about AI safety, cognitive science, history and philosophy of science/technology.
Eleni_A
I designed an AI safety course (for a philosophy department)
Confusions and updates on STEM AI
AI Alignment in The New Yorker
A Study of AI Science Models
A Guide to Forecasting AI Science Capabilities
On taking AI risk seriously
Everything’s normal until it’s not
A model of one’s own (or what I say to myself):
Defer a bit less today—think for yourself!
What would the world look like if X was not true?
Make a prediction—don’t worry if it turns out to be false.
Articulate an argument and find at least one objection to it.
Full-time research in AI Safety.
Questions about AI that bother me
My upskilling study plan:
1. Math
i) Calculus (derivatives, integrals, Taylor series)
ii) Linear Algebra (this video series)
iii) Probability Theory
2. Decision Theory
3. Microeconomics
i) Optimization of individual preferences
4. Computational Complexity
6. Machine Learning theory with a focus on deep neural networks
8. Arbital
“Find where the difficult thing hides, in its difficult cave, in the difficult dark.” Iain S. Thomas
The Collingridge dilemma: it is difficult to predict the future impact of a technology before it has been widely implemented. However, once the technology has been implemented, it becomes difficult to manage or change.
The quick answer is that wanting to do alignment-related work does not depend on a Philosophy PhD, or any graduate degree tbh. I’d say, start by thinking more specifically about what your interests are; from there, there might be different paths to impact, with or without the degree.
It’s more epistemically virtuous to make a wrong prediction than to make no predictions at all.
Helpful post, Zach! I think it’s more useful and concrete to ask about specific capabilities than about AGI/TAI etc., and I’m pushing myself to ask such questions (e.g., when do you expect to have LLMs that can emulate Richard Feynman-level-of-text?). Also, I like the generality vs. capability distinction. We already have a generalist (Gato), but we don’t consider it to be an AGI (I think).