"Level 3" is described as "Armed with this ability you can learn not just from your own experience, but from the experience of others—you can identify successful others and imitate them." This doesn't seem like something any ML model does.
This sounds like straightforward transfer learning (TL) or fine-tuning, both common by 2017.
So you could just write 15 lines of Python that shop around among a set of pretrained weights and see how each performs. TL is often many times (even 1000x) faster than starting from random weights and needs only a few examples.
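Something like this, as a rough sketch (the torchvision checkpoint names and the probe set are my own assumptions; in practice you'd probably fit a small linear head per candidate rather than scoring the frozen classifier directly):

```python
# Hypothetical "checkpoint shopping": score each pretrained model on a
# tiny probe set and keep the best. Names and dataset are illustrative.
import torch
import torchvision.models as models

CANDIDATES = ["resnet18", "resnet50", "mobilenet_v3_small"]

def few_shot_score(model, examples, labels):
    """Accuracy of the frozen model on a handful of labeled examples."""
    model.eval()
    with torch.no_grad():
        preds = model(examples).argmax(dim=1)
    return (preds == labels).float().mean().item()

def shop(examples, labels):
    best_name, best_score = None, -1.0
    for name in CANDIDATES:
        model = models.get_model(name, weights="DEFAULT")  # pretrained
        score = few_shot_score(model, examples, labels)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```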
As speculation: it seems like in one of the agent simulations you could just have agents grab other agents' weights or layers and try them out strategically (when they detect an impasse, a new environment, or something like that). There is an analogy to biology, where species alternate between asexual and sexual reproduction, and trading of genetic material occurs during periods of adversity. (This is trivial; I'm sure a second-year student has written a lot more.)
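A toy version of the grab-a-peer's-layer move might look like this (the Agent class, the impasse test, and the swap rule are all invented here for illustration; note it assumes the agents share an architecture, which matters below):

```python
# Speculative sketch: at an impasse, borrow one named layer from a peer
# and keep it only if validation improves. Assumes peers share an
# architecture so layer names line up.
import copy

class Agent:
    def __init__(self, model):
        self.model = model    # a torch.nn.Module with named sub-layers
        self.history = []     # recent validation losses, most recent last

    def at_impasse(self, window=5, eps=1e-3):
        """Crude impasse test: loss hasn't improved over the last few evals."""
        h = self.history[-window:]
        return len(h) == window and (h[0] - h[-1]) < eps

    def try_peer_layer(self, peer, layer_name, evaluate):
        """Swap in a peer's layer; revert the swap unless the loss drops."""
        mine = getattr(self.model, layer_name)
        backup = copy.deepcopy(mine.state_dict())
        mine.load_state_dict(getattr(peer.model, layer_name).state_dict())
        if evaluate(self.model) >= self.history[-1]:  # no improvement
            mine.load_state_dict(backup)
```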
This doesn't seem to fit any sort of agent framework or improve agency, though. It just makes you train faster.
Eh, there does seem to be a connection to interpretability.
For example, if the ML architecture were modular, categorized, or otherwise legible to the agents, they could swap weights or models more quickly and effectively.
So there might be some way for legibility to emerge by selection pressure, in an environment where, say, agents had limited capacity to store weights or data and had to constantly and extensively share weights with each other. You could imagine teams of agents surviving and proliferating because a shared architecture let them pass this data around fluently in the form of weights (a toy version below).
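The "shared architecture" convention could be as dumb as every agent's model being a dict with the same standardized keys, so transfer is just copying state by name (the schema and key names here are made up):

```python
# Toy "shared architecture": every agent's model is a ModuleDict with
# standardized keys, so passing capability around is just copying state
# dicts by name. The schema is invented for illustration.
import torch.nn as nn

SCHEMA = {
    "encoder": lambda: nn.Linear(32, 64),
    "policy":  lambda: nn.Linear(64, 4),
}

def make_model():
    return nn.ModuleDict({key: build() for key, build in SCHEMA.items()})

def share(sender, receiver, key):
    """Legible transfer: both sides agree on what `key` means."""
    receiver[key].load_state_dict(sender[key].state_dict())
```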
To keep the transmission mechanism itself from getting crazy baroque, you could regularize it, e.g., penalize the complexity of whatever gets shared.
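One crude way to do that, assuming an L1 penalty is a reasonable stand-in for "description length" (both the penalty choice and the `shared_encoder` key are my assumptions):

```python
# Hypothetical regularizer: penalize the magnitude of the shared module's
# weights so the transmitted interface stays simple/sparse. Expects a
# torch.nn.Module (e.g., one entry of the ModuleDict above).
def transmission_penalty(shared_module, lam=1e-4):
    """L1 penalty on shared-module weights, a cheap proxy for complexity."""
    return lam * sum(p.abs().sum() for p in shared_module.parameters())

# In a training loop, hypothetically:
# loss = task_loss + transmission_penalty(agent.model["shared_encoder"])
```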
I'm 90% sure this is a shower thought, but it can't be worse than "The Great Reflection."