This model’s performance is really impressive, and I’m glad you’re interested in large language models. But I share some of Gavin’s concerns, and I think it would be a great use of your time to write up a full theory of impact for this project. You could share it, get some feedback, and think about how to maximize the project’s positive impact while reducing its risks of harm.
One popular argument for short-term risks from advanced AI concerns AI persuasion. Beth Barnes has a great writeup on this, as does Daniel Kokotajlo. The most succinct case I can make is that the internet is already full of bots, they spread all kinds of harmful misinformation, they reduce trust and increase divisiveness, and we shouldn’t be playing around with more advanced bots without seriously considering the possible consequences.
I don’t think anybody would make the argument that this project is literally an existential threat to humanity, but that shouldn’t be the bar. Just as much as you need the technical skills of LLM training and the creativity and drive to pursue your ideas, you need to be able to faithfully and diligently evaluate the impact of your projects. I haven’t thought about it nearly enough to say the final word on the project’s impact, but before you keep publishing results, I would suggest spending some time to think and write about your impact.
As a countervailing perspective, Dan Hendrycks thinks that it would be valuable to have automated moral philosophy research assistance to “help us reduce risks of value lock-in by improving our moral precedents earlier rather than later” (though I don’t know if he would endorse this project). Likewise, some AI alignment researchers think it would be valuable to have automated assistance with AI alignment research. If EAs could write a good EA Forum post just by giving GPT-EA-Forum a prompt and revising the resulting output, that could help EAs save time and explore a broader space of research directions. Still, I think some risks are:
- The bot would write content similar to what the EA Forum has already produced, rather than advancing EA philosophy.
- The content produced is less likely to be well-reasoned, lowering the quality of content on the EA Forum.
One goal is to make it easier to understand Effective Altruism through an interactive model.
I’m sick with COVID right now. I might respond in greater depth when I’m not sick.
This advice totally applies here: https://forum.effectivealtruism.org/posts/KFMMRyk6sTFReaWjs/you-don-t-have-to-respond-to-every-comment
Good luck with your projects; I hope you’re feeling better soon.
No rush!