I’m an Atlas Fellow ’22. I have an interest in large language models.
JoyOptimizer
Calm down. It’s a complex situation developing rapidly; let’s wait and see what the final outcome is.
I used a model I fine-tuned to generate takes on Effective Altruism.
My original phrasing was unclear. It should be:
I used a model that I fine-tuned, in order to generate takes on Effective Altruism.
This model was not fine-tuned specifically on Effective Altruism content. It was developed to explore the effects of training language models on a Twitter account. I was surprised and concerned when I noticed it could generate remarkable takes about effective altruism, even though EA content was not present in the original dataset. Furthermore, these takes are always critical.
This particular model is a fine-tuned OpenAI davinci. I plan to fine-tune GPT-EA on GPT-NeoX-20B. A predecessor to GPT-EA (GPT-EA-Forum) was trained using a third-party API. I want to train GPT-EA on a cloud platform so I can download a copy of the weights myself. I am not currently receiving technical support (or funding for GPU costs); either would be helpful. I selected and cleaned the dataset myself, with some input from community members, and I’m still looking for more community input.
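For reference, the davinci fine-tune went through OpenAI's legacy fine-tuning endpoint. A minimal sketch of that kind of call, where the filename and hyperparameters are placeholders rather than the exact settings I used:

```python
import openai

openai.api_key = "sk-..."  # placeholder

# Upload the prepared prompt/completion JSONL file (filename is a placeholder)
upload = openai.File.create(file=open("ea_twitter_dataset.jsonl", "rb"), purpose="fine-tune")

# Start a davinci fine-tune; the epoch count here is illustrative, not the value actually used
job = openai.FineTune.create(training_file=upload["id"], model="davinci", n_epochs=4)
print(job["id"], job["status"])
```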
I used a model I fine-tuned to generate takes on Effective Altruism. The prompt is “effective altruism is.” Here are its first three completions; a sketch of the sampling call follows them:
effective altruism is vampirism, except instead of sucking blood you suck hours and happiness from helping people who would otherwise have spent the time improving their lives.
effective altruism is parasitic. it latches onto the success of actual altruism, which is genuine and humanizing, to justify its cold calculations and make them feel virtuous too.
effective altruism is rich kid hobbyism pretending to be a moral imperative
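For anyone who wants to reproduce the setup, the sampling call looks roughly like this; the fine-tuned model name and sampling parameters are placeholders, not my exact configuration:

```python
import openai

openai.api_key = "sk-..."  # placeholder

response = openai.Completion.create(
    model="davinci:ft-personal-2022-07-13",  # hypothetical fine-tuned model name
    prompt="effective altruism is",
    max_tokens=60,
    temperature=0.9,
    n=3,  # request three completions, as quoted above
)
for choice in response["choices"]:
    print("effective altruism is" + choice["text"])
```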
I’m somewhat concerned about the use of AI models to [generate propaganda? conduct information warfare?]. Here, the concern is that this could be used to salt the earth: poisoning the perceived vibe so that certain demographics dislike EA before they can engage with it deeply.
I find it important to note that the model was not designed to be harmful. It was fine-tuned to generate self-deprecating humor. Nevertheless, amplifying that capability seems to also amplify the capability to criticize EA.
I’m interested in what mitigations people have in mind. One could be at the epistemic level: teaching people to engage kindly with new ideas.
Edited.
Who is responsible for evaluating the success of the Century Fellowship?
What roles do different people play in reviewing applications for the fellowship, and who fills those roles?
Can you help write test prompts for GPT-EA? I want test cases and interesting prompts you want to see tried. These help track and guide the development of GPT-EA versions. The first version, GPT-EA-Forum-v1, has been developed. GPT-EA-Forum-v2 will include more posts and also comments.
this is why we’re building an AI to make humans kinder to each other
This is a call for test prompts for GPT-EA (announcement post: https://forum.effectivealtruism.org/posts/AqfWhMvfiakEcpwfv/training-a-gpt-model-on-ea-texts-what-data). I want test cases and interesting prompts you want to see tried. These help track and guide the development of GPT-EA versions. The first version, GPT-EA-Forum-v1, has been developed. GPT-EA-Forum-v2 will include more posts and also comments.
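To show what I mean by tracking, here is a rough sketch of how submitted prompts could be run against each GPT-EA version and logged for side-by-side comparison. The completion function and file name are hypothetical placeholders:

```python
import json

# Hypothetical registry of model versions; complete_fn would wrap whatever
# API or local checkpoint serves each version.
MODEL_VERSIONS = ["GPT-EA-Forum-v1", "GPT-EA-Forum-v2"]

def run_test_prompts(prompts, complete_fn, out_path="gpt_ea_test_log.jsonl"):
    """Run every submitted prompt against every model version and append
    the results to a JSONL log for comparison across versions."""
    with open(out_path, "a") as f:
        for prompt in prompts:
            for version in MODEL_VERSIONS:
                completion = complete_fn(version, prompt)
                record = {"version": version, "prompt": prompt, "completion": completion}
                f.write(json.dumps(record) + "\n")

# Example usage with prompts like the ones requested in this post:
# run_test_prompts(["effective altruism is", "the most pressing problem is"], my_complete_fn)
```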
One goal is to make it easier to understand Effective Altruism through an interactive model.
I’m sick with COVID right now. I might respond in greater depth when I’m not sick.
Digital humans would be much cheaper to query than biological humans. This is because:
An efficient general intelligence on a biological substrate uses the structure of a brain. It’s unclear whether that same structure would be efficient on silicon or photonic processors.
Thanks.
A book on ethics seems worth considering. Can you tell me more about how its ideas relate to EA? In any case, these are useful sources for future projects regarding AI alignment.
Is utilitarianism.net only about utilitarianism? If so, the rest of the training set should already have a sufficient degree of utilitarian bias.
How influential are FHI’s texts in the EA community?
This seems like a good text to make the model more generally coherent.
Thanks for your thoughts.
The goal is not to create a model that does the most good. While aligning an AI with values and principles could be an interesting project, the goal of this project is to create a descriptive model of the EA community, not a normative model of an idealized EA community.
I believe GPT-3 can do more than memorize specific objectives like malaria nets. Infusing principles deeply would require more sophisticated techniques, probably applied after fine-tuning.
upbias, a value in (-1, 1), is the Forum editors’ or users’ estimate of the fraction of a post’s upvotes that were driven by fear, other negative emotions, or limited critical thinking provoked by the post, rather than by its merits.
How do I calculate upbias?
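To make the question concrete, here is one naive sketch of how a per-post upbias score might be folded into dataset weighting once it exists. All numbers and the weighting rule are invented for illustration, not a settled method:

```python
# Purely illustrative: upbias in (-1, 1) per post, where positive values mean
# a larger share of upvotes were judged to be driven by fear or low-reflection
# engagement. A post's sampling weight in the dataset is scaled down accordingly.
posts = [
    {"id": "post_a", "karma": 120, "upbias": 0.4},   # hypothetical scores
    {"id": "post_b", "karma": 80,  "upbias": -0.1},
]

for post in posts:
    # Discount the share of upvotes judged to be bias-driven (ignore negative upbias here).
    post["weight"] = post["karma"] * (1 - max(post["upbias"], 0))
    print(post["id"], round(post["weight"], 1))
```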
Thank you for the books to use in the dataset. I will review each of them.
The original GPT-3 was trained largely on a web crawl known as Common Crawl. Users on the internet especially tend to optimize for attention. Unlike GPT-3, around a third of GPT-J’s training set is academic sources.
The SSC blog includes posts like Meditations on Moloch or the review of Seeing Like a State. These seem like perspectives important to the EA community. Are you suggesting I include posts based on whether they’re frequently linked from the EA Forum?
I’ll try to crawl the EA Funds’ grant program as well.
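Roughly, the crawl would look something like this, though the URL and page structure are assumptions I would verify before running anything:

```python
import requests
from bs4 import BeautifulSoup

# Assumed listing URL for EA Funds grants; the real page structure needs checking.
GRANTS_URL = "https://funds.effectivealtruism.org/grants"

def fetch_grant_texts():
    """Fetch the grants listing page and return the visible text of each entry."""
    html = requests.get(GRANTS_URL, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")
    # Placeholder selector; the actual element/class names must be inspected first.
    return [entry.get_text(" ", strip=True) for entry in soup.select("article")]

if __name__ == "__main__":
    for text in fetch_grant_texts():
        print(text[:200])
```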
What percentage of the training mix should be the GiveWell blog, and what percentage the 80,000 Hours blog? In other words, how many bytes of blog posts should be used from each, relative to the entire dataset?
What kinds of posts are on each blog? Which best reflects the wider EA community, and which reflects the professional EA community? How can this be used to shape the dataset?
I also checked and neither blog has a direct view count measure—some other proxy metric would need to be used.
Thanks for these sources.
How should the GiveWell blog and the 80,000 Hours blog be weighted against each other? My instinct is to weight by the number of views.
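To make that concrete, the arithmetic would look something like this, with invented view counts standing in for whatever proxy metric we settle on:

```python
# Hypothetical relative-traffic estimates (placeholders, not real data).
views = {"givewell_blog": 2_000_000, "80000_hours_blog": 3_000_000}

# Total bytes of the dataset reserved for these two blogs (also a placeholder).
blog_byte_budget = 50_000_000

total_views = sum(views.values())
for source, v in views.items():
    share = v / total_views
    print(f"{source}: {share:.0%} of the blog budget, {int(share * blog_byte_budget):,} bytes")
```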
Posts/comments in Facebook groups, Slack groups, and Discord groups?
Does the EA community have the norm that these comments are public? I want to make sure the consent of participants is obtained.
What is your budget spent on? I want to help you run more efficiently.
This is a list of EA biases to be aware of and account for.
While they are insolvent, FTX and SBF have not declared bankruptcy. In a developing situation, information is unclear and comes from sources of unknown reliability. (Alameda’s balance sheet may prove incomplete.)