[Question] Training a GPT model on EA texts: what data?
I plan to fine-tune GPT-J, a large language model created by EleutherAI that is similar to GPT-3, on effective altruism texts. GPT-J is said to be better at mathematical, logical, and analytic reasoning than GPT-3, owing to the large share of academic texts in its training data.
The goals are:
Accurately reflect how the EA community thinks
Represent texts widely read in the EA community
Help the language model think well
My proposed training mix:
60% EA Forum posts above a certain karma threshold
Bias towards newer posts according to a ?? curve
Weight the likelihood of inclusion of each post by a function of its karma (how does karma map to views?)
Books (3.3MB)
The Alignment Problem (1MB)
The Precipice (0.9MB)
Doing Good Better (0.5MB)
The Scout Mindset (0.5MB)
80,000 Hours (0.4MB)
Articles and blog posts on EA
EA Handbook
Most Important Century Sequence
Replacing Guilt Sequence (h/t Lorenzo)
Winners of the First Decade Review
… what else?
EA Forum Topic Descriptions (h/t Lorenzo)
OpenPhilanthropy.org (h/t Lorenzo)
GivingWhatWeCan.org (h/t Lorenzo)
including comments
??% Rationalism
??% Overcoming Bias
??% Slate Star Codex
??% HPMOR
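One way to make the karma weighting and recency bias for Forum posts concrete is sketched below. Everything in it is an assumption for illustration, not a settled choice: the `Post` record, the karma threshold of 50, the logarithmic karma term, and the exponential recency decay with a two-year half-life are all placeholders, and the actual curve is still the open "??" above.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Post:
    """Hypothetical record for one EA Forum post."""
    title: str
    karma: int
    age_days: float


def inclusion_weight(post, karma_threshold=50, half_life_days=730.0):
    """Unnormalized sampling weight for a post.

    All defaults are placeholder assumptions, not recommendations.
    """
    # Posts below the karma threshold are excluded outright.
    if post.karma < karma_threshold:
        return 0.0
    # Assumed karma function: logarithmic, so very high-karma posts
    # do not dominate the sample.
    karma_term = math.log1p(post.karma)
    # Assumed recency curve: the weight halves every `half_life_days`.
    recency_term = 0.5 ** (post.age_days / half_life_days)
    return karma_term * recency_term


def sample_posts(posts, k, seed=0):
    """Draw k posts with replacement, proportionally to their weights."""
    weights = [inclusion_weight(p) for p in posts]
    if sum(weights) == 0:
        return []
    rng = random.Random(seed)
    return rng.choices(posts, weights=weights, k=k)
```

Because the draw is with replacement, a heavily weighted post can appear several times, which acts as upweighting by repetition. Swapping in a different karma function or recency curve only requires changing the two terms in `inclusion_weight`.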
What sources am I missing?
Please suggest important blog posts and post series I should add to the training mix, and explain how important to, or popular in, EA they are.
Can you help me estimate how much mindshare each of the items labelled “??” occupies in a typical EA?
I’m new to EA, so I would strongly appreciate input.