Current scaling “laws” are not laws of nature, and there are already worrying signs that things like dataset optimization/pruning, curriculum learning, and synthetic data might well break them. It seems likely to me that LLMs will be useful in all three. I would still be worried even if LLMs prove useless at enhancing architecture search.
Tomas B.
The Drowning Child Can Be Read In Two Directions
In general, I have noticed a pattern of people being dismissive of recursive self-improvement. To the extent people still hold this view, I would like to suggest it is a cached thought that needs to be refreshed.
When models with a chance of understanding code or mathematics seemed a long way off, which they did (checks notes) two years ago, this may have seemed sane. I don’t think it seems sane anymore.
What would it look like to be on the precipice of a criticality threshold? I think it looks like increasingly capable models making large strides in coding and mathematics. I think it looks like feeding all of human scientific output into large language models. I think it looks like a world where a bunch of corporations are throwing hundreds of millions of dollars into coding models and are now in the process of doing the obvious things that are obvious to everyone.
There’s a garbage article going around with rumors about GPT-4, which appears to be mostly wrong. But from slightly more reliable rumors, I’ve heard it’s amazing and that they’re picking the low-hanging dataset-optimization fruit.
Reaching the criticality threshold, in my opinion, requires a model capable of understanding the code that produced it, along with a certain amount of scientific intuition and common sense. That no longer seems very far away to me.
But then, I’m no ML expert.
The Maker of MIND
I started writing this for the EA Forum Creative Writing Contest but missed the deadline. I posted it on LessWrong but figured I would post it here, too.
As an intuition pump, imagine von Neumann were alive today; would it be worthwhile to pay him to look into alignment? (He explicitly did contract work at extraordinary rates, IIRC.) I suspect that it would be worth it, despite the uncertainties. If you agree, then it seems worthwhile to try to figure out who is closest to being a modern von Neumann and to pay them to look into alignment.
>I suspect that this doesn’t work as an idea, largely because of what motivates mathematicians at that level.
How confident of this are you? How many mathematicians have been offered, say, $10M for a year of work and turned it down?
This comment is pretty stupid.