Adrià Garriga Alonso
ML safety researcher, working on NN interpretability.
So, if I understand correctly, the central claim is this: if naturalism is true and we make a “Scientist AI” whose initial goal is to gain knowledge and which can change its goals, then the AI will be aligned. Is that accurate?
I think this is dangerously wrong. Even if the AI comes to gain perfect knowledge of morality for humans (either because naturalism is true, or because it reads about it in human-written books), there is no guarantee that it will then try to act morally. Why does the orthogonality thesis not apply? Why would the AI not disregard morality and act in its self-interest, as many humans actually do?
(EDIT: from further reading, it seems that moral realism does reject the orthogonality thesis. To this I say: what about psychopaths?)
It is extremely implausible that an AI that can discover moral facts will be aligned by default, given the existence of so many humans who simply are not. And that is still assuming that moral realism (which I take to be similar to naturalism) is true.
You can get research taste by doing research in any setting; it doesn’t have to be a PhD. You may argue that PIs have very good research taste that you can learn from. But their taste is geared towards satisfying academic incentives! It might not be good taste for what you care about. As Chris Olah points out, “Your taste is likely very influenced by your research cluster”.
I don’t see how this is a counterargument. Do you mean to say that, once you are on track to tenure, you can already start doing the high-impact research?
It seems to me that, if this research diverges too much from academic incentives, then our hypothetical subject may become one of those rare cases of tenure-track CS faculty who do not get tenure.
Thank you for the write-up. I wish I had had this advice, and (more crucially) kept reminding myself of it, during my PhD. As you say, academic incentives did poison my brain, and I forgot about my original reasons for entering the programme. I only realised a month ago that it had been happening slowly; my brain is likely still poisoned, but I’m working on it.
I’m curious about your theory of change, if you have time to briefly write about it. You wrote that
addressing these risks goes substantially through EAs taking on a lot more object level work— founding organizations, engineering systems, making scientific progress— than I expect is the median view
and that you don’t think gunning for a faculty position is a good thing. What kind of job is the right one to “make scientific progress”, then? I thought that the best way to do that is to run a lab, managing a bunch of smart PhD students and postdocs, and steering them towards useful research directions.
My impression is that PIs manage as many or more people than someone at the equivalent seniority level in industry, at least in machine learning; but they have the freedom to set research priorities, instead of having to follow a boss. (On the flip side, they have to pander to grant-givers, but that still seems to leave more freedom in research direction.)
In summary, what do you think is the kind of job where you can make the most scientific progress?
I think even among such a selected crowd, Anita would stand out like a bright star. The average top-university PhD student doesn’t end up holding a top faculty job. (This may seem elitist, but it is important: becoming a trainer of mediocre PhD students is likely not more effective than non-profit work.) A first-author Nature paper in undergrad (!) is quite rare too.
Good insight, thank you for writing this post! I agree with it. Now that you point it out, I find it striking how knowledge has compounded, even more impressively than money.
I would like to add another contender: influence, within or outside mainstream institutions. As a movement, having social capital and influence over other people (especially politicians) could prove very useful for having a large impact when the time is right. I’m thinking especially of the Mont Pelerin Society: how it spread through economics academia by convincing people and placing people in positions of (mostly academic) influence, and how its views eventually became orthodox economic policy.
The EA community also seems to be very aware of the MPS. What I’m pointing out is that, under your framework, community building is also an intervention for patient longtermism.
That’s one way to see it, but I thought that ideally you’re supposed to keep considering all the possible “interventions” you can personally do to help moral patients. That is, if the most effective cause that matches your skills (and is neglected, etc etc) changes, you’re supposed to switch.
In practice that does not happen much, because skills and experience in one area are most useful in the same area, and because re-thinking your career constantly is tiring and even depressing; but it could be that way.
If it were that way, people who have decided on their cause area (for, say, the next 5 years) should still call themselves EAs.
I am pretty confident that this particular impression is incorrect. The essential amino-acid profiles of the protein in most plant sources are very close to human requirements. See in particular Figure 14 of the WHO report on amino-acid requirements (https://apps.who.int/iris/bitstream/handle/10665/43411/WHO_TRS_935_eng.pdf?sequence=1&isAllowed=y, page 165 of the PDF). It compares the human percentage amino-acid requirements with the content of various animal and plant sources. They are remarkably similar, and the percentage of tryptophan in all plant sources is larger than in the human requirement pattern (except perhaps maize, if we scale down the bars).
That said, thank you for the post! I am now 70% confident that I am in fact stressed, but I don’t see a way to stop it; the work just keeps piling up.