You probably want the link at the top of this post to go directly to the DeepMind paper page, instead of to the LessWrong redirect URL. I.e., the current link is:
https://www.lesswrong.com/out?url=https%3A%2F%2Fdeepmind.com%2Fblog%2Farticle%2Fgenerally-capable-agents-emerge-from-open-ended-play
When it probably should be:
https://deepmind.com/blog/article/generally-capable-agents-emerge-from-open-ended-play
Oops, sorry, thanks!
Is there already a handy way to compare the computation costs that went into training, e.g. compared to GPT-3, AlphaZero, etc.?
I would love to know! If anyone finds out how many PF-DAYs or operations or whatever were used to train this stuff, I’d love to hear it. (Alternatively: How much money was spent on the compute, or the hardware.)
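For anyone who wants to do the back-of-the-envelope version themselves: a petaflop/s-day is just 10^15 FLOP/s sustained for a day, so you can get a rough figure from chip count, peak throughput, utilization, and wall-clock time. The hardware numbers in the sketch below are made-up placeholders (DeepMind hasn't published them as far as I know); the only reported figure I'm citing is GPT-3's, from the GPT-3 paper.

```python
# Back-of-the-envelope petaflop/s-day estimate from hardware specs.
# All hardware numbers below are illustrative placeholders, not figures
# from the DeepMind paper.

def petaflop_s_days(num_accelerators: int,
                    peak_flops_per_accelerator: float,
                    utilization: float,
                    training_days: float) -> float:
    """Total training compute expressed in petaflop/s-days.

    One petaflop/s-day = 1e15 FLOP/s sustained for 24 hours
    = 1e15 * 86400 ~= 8.64e19 FLOPs.
    """
    total_flops = (num_accelerators * peak_flops_per_accelerator
                   * utilization * training_days * 86400)
    return total_flops / 8.64e19

# Hypothetical example: 128 accelerators at 100 TFLOP/s peak,
# 30% utilization, trained for 30 days -> roughly 115 pf-days.
print(petaflop_s_days(128, 100e12, 0.30, 30))

# For scale: the GPT-3 paper reports about 3.14e23 FLOPs
# (~3,640 petaflop/s-days) to train the 175B model.
```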
For what it’s worth, I’ve mostly not been interested in AI safety/alignment (and am still mostly not), but this also seems like a pretty big deal to me. I haven’t actually read the details, but this is basically not “narrow” AI anymore, right?
I guess the expressions “narrow” and “general” are a bit unfortunate, since I don’t really want to call this either one. I would want to reserve the term AGI for AI that can do at least this, but can also reason generally and abstractly, and excels at one-shot learning (although there are specific networks designed for one-shot learning, like Siamese networks. Actually, why aren’t similar networks used more often, even as subnetworks?).
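For anyone who hasn't seen the reference: a Siamese network is just one encoder with shared weights applied to two inputs, trained so that embedding distance reflects similarity, which is why it works for one-shot comparison against a single example. A minimal sketch of the idea, assuming PyTorch (purely illustrative, nothing from the paper):

```python
# Minimal Siamese-network sketch for one-shot similarity learning.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseNet(nn.Module):
    def __init__(self, in_dim: int = 784, embed_dim: int = 64):
        super().__init__()
        # A single encoder whose weights are shared across both inputs.
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, embed_dim),
        )

    def forward(self, x1, x2):
        # Same weights applied to both inputs; the output is the
        # distance between the two embeddings.
        z1, z2 = self.encoder(x1), self.encoder(x2)
        return F.pairwise_distance(z1, z2)

def contrastive_loss(distance, same_class, margin: float = 1.0):
    # Pull matching pairs together, push non-matching pairs apart.
    return torch.mean(same_class * distance.pow(2)
                      + (1 - same_class) * F.relu(margin - distance).pow(2))
```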
My take is that indeed, we now have AGI, but it’s really shitty AGI, not even close to human-level. (GPT-3 was another example of this; pretty general, but not human-level.) It seems that we now have the know-how to train a system that combines all the abilities and knowledge of GPT-3 with all the abilities and knowledge of these game-playing agents. Such a system would qualify as AGI, but not human-level AGI. The question is how long it will take, and how much money (to make it bigger and train it longer), to get to human level, or at least to something dangerously powerful.
It seems like this could extend naturally to cooperative inverse reinforcement learning. Basically, the real world is a new game the AI has to play, and humans decide the reward subjectively (rather than with some explicit rule). The AI has developed some general competence beforehand by playing games, but it has to learn the new rules in the real world, which are not explicit.
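To make that concrete, here's a toy sketch of what "learning the implicit rules from subjective human reward" might look like: the agent keeps a belief over candidate reward functions and updates it from human approval. The candidate rewards and the update rule are made up for illustration; this is not CIRL as formally defined and not anything from the paper.

```python
# Toy sketch: the agent doesn't know the reward rule, so it maintains a
# belief over candidate reward functions and updates it from subjective
# human approval signals. Purely illustrative.

# Hypothetical candidate reward functions over observed outcomes.
candidates = {
    "tidy_room": lambda outcome: outcome["objects_shelved"],
    "fast_task": lambda outcome: -outcome["time_taken"],
}
belief = {name: 0.5 for name in candidates}  # uniform prior

def update_belief(outcome, human_approved: bool):
    """Crude Bayesian-style update: hypotheses that agree with the
    human's reaction gain probability mass."""
    for name, reward_fn in candidates.items():
        predicted_good = reward_fn(outcome) > 0
        likelihood = 0.9 if predicted_good == human_approved else 0.1
        belief[name] *= likelihood
    total = sum(belief.values())
    for name in belief:
        belief[name] /= total

update_belief({"objects_shelved": 3, "time_taken": 120}, human_approved=True)
print(belief)  # mass shifts toward "tidy_room"
```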
Might be confirmation bias. But is it?
I did say it was a hot take. :D If I think of more sophisticated things to say I’ll say them.
AGI confirmed? 😬