Right. And this scales to fairly massive tasks. “Design a medium body airliner to these specs” is completely doable. Or an entire chip in one step.
The model doesn’t get to collaborate with future versions of itself because it doesn’t know who is checking the end product for security vulnerabilities and to be caught is death. We could give a model thousands of tasks of this complexity level and check for sabotage before giving the model 1 real task. It gets no context differentiating the real task from the test tasks, and has no memory of any of the prior steps.
And see it scales up and down the tree. The robots building the aircraft don’t get to plan their sabotage by similar limits and so on and so forth.
Your plan to deal with bad models is to use your restricted models to manufacture the weapons needed to fight them, and to optimize their engagements.
This i think is a grounded and realistic view of how to win this. Asking for pauses is not.
Right. And this scales to fairly massive tasks. “Design a medium body airliner to these specs” is completely doable. Or an entire chip in one step.
The model doesn’t get to collaborate with future versions of itself because it doesn’t know who is checking the end product for security vulnerabilities and to be caught is death. We could give a model thousands of tasks of this complexity level and check for sabotage before giving the model 1 real task. It gets no context differentiating the real task from the test tasks, and has no memory of any of the prior steps.
And see it scales up and down the tree. The robots building the aircraft don’t get to plan their sabotage by similar limits and so on and so forth.
Your plan to deal with bad models is to use your restricted models to manufacture the weapons needed to fight them, and to optimize their engagements.
This i think is a grounded and realistic view of how to win this. Asking for pauses is not.