Just something that jumped out at me. Suppose a pause applies to training runs of 1e28+ FLOP.
The human brain is made of modules organized in ways we don't fully understand, but we do know that the frontal lobes associated with executive function are a small fraction of the total tissue.
This suggests an AI system could be a collection of a few dozen specialized 1e28-FLOP models connected by API calls, hosted in a common data center for low-latency interconnects.
If “a few dozen” grows to 100+ modules, the total compute used would be on the order of 1e30 FLOP, and it might be possible to make this composite system an AGI by training on difficult tasks whose feedback drives that level of cognitive development.
Especially with “meta” system architectures, where new modules are automatically added to patch deficiencies whenever further training of the existing weights would improve the score only at the cost of regressions elsewhere (a rough sketch of the idea follows).
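A minimal sketch of the shape of such a system, in Python. The module names, per-module FLOP figures, and the `meta_update` rule are hypothetical illustrations of the idea, not a real training setup:

```python
# Sketch of the "collection of specialized sub-1e28 models" idea.
# All module names, scores, and FLOP figures are hypothetical; in a real
# system each module would be a separately trained model reached over an
# API inside one data center.

from dataclasses import dataclass, field


@dataclass
class Module:
    """One specialized model, each trained just under the FLOP cap."""
    name: str
    training_flop: float  # per-module compute, kept below the 1e28 limit


@dataclass
class ModularSystem:
    modules: dict[str, Module] = field(default_factory=dict)

    def total_flop(self) -> float:
        # The aggregate compute the pause never sees as a single run.
        return sum(m.training_flop for m in self.modules.values())

    def add_module(self, name: str, training_flop: float = 9e27) -> None:
        self.modules[name] = Module(name, training_flop)

    def meta_update(self, eval_scores: dict[str, float], threshold: float = 0.7) -> None:
        """'Meta' architecture step: if a capability scores poorly and no
        module owns it, spin up a new specialized module rather than risk
        regressions from retraining existing weights."""
        for capability, score in eval_scores.items():
            if score < threshold and capability not in self.modules:
                self.add_module(capability)


system = ModularSystem()
for name in ["vision", "planning", "language", "motor_control"]:
    system.add_module(name)

# Feedback from difficult training tasks (hypothetical scores).
system.meta_update({"tool_use": 0.45, "planning": 0.9, "theory_of_mind": 0.3})

print(f"{len(system.modules)} modules, ~{system.total_flop():.1e} total FLOP")
# With 100+ modules at ~1e28 each, the total approaches 1e30 FLOP even
# though no individual training run exceeds the cap.
```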
Interesting, and something to watch out for! Perhaps it could be caught by limiting the number of training runs any individual actor can do near or at the FLOP limit (to one per year?). Of course, actors intent on it could then try to use a maze of shell companies or something, but that could be addressed by requiring complete financial records and audits.
Sure. In practice there's the national sovereignty angle, though. This just devolves into each party “complying” with the agreement while violating it in various ways; there's too much incentive to defect.
The US government just never audits its secret national labs, China just never checks anything, Israel openly decides it can't afford to comply at all, and so on. Everyone claims to be in compliance.
Really depends on how much of a taboo develops around AGI. If it’s driven underground it becomes much less likely to happen given the resources required.
My thought on this: I think of flamethrowers, gas shells, and the worst WW1 battlefields. I'm not sure what taboo humans won't violate in order to win.
This isn't war, though. What are some peacetime examples of taboo violations (especially state-sanctioned ones)? I can only really think of North Korea and a handful of other pariah states (none of which would be capable of developing AGI).
This can be avoided with a treaty that requires giving full access to international inspectors. This already happens with the IAEA, a regime set up even amid the far greater tensions of the Cold War. If a country like Iran tries to kick out the inspectors, everyone assumes it is trying to develop nuclear weapons and takes serious action (harsh sanctions, airstrikes, even the threat of war).
If governments think of this as an existential threat, they should agree to it for the same reasons they did with the IAEA. And while there are big incentives to defect (unless they have a very high p(doom)), there is also the knowledge that kicking out inspectors would invite potential war and lead their rivals to defect too.
If this turns out to be feasible, one solution would be to have people on-site (or to have TSMC put hardware-level controls in place) to randomly sample the training data several times a day and verify that undeclared outside data isn't involved in the training run.
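A rough sketch of what that spot check could look like, assuming the lab pre-commits a manifest of SHA-256 hashes for every approved training shard and the inspector can read the shards actually being fed to the run; the file names and sampling cadence here are illustrative only:

```python
# Hypothetical inspector spot check: sample random training shards and
# flag any whose hash isn't in a manifest committed before the run.

import hashlib
import random
from pathlib import Path


def shard_hash(path: Path) -> str:
    """Hash a training shard in chunks so large files don't need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def load_manifest(path: Path) -> set[str]:
    """Manifest committed before the run: one approved shard hash per line."""
    return {line.strip() for line in path.read_text().splitlines() if line.strip()}


def spot_check(shard_dir: Path, manifest: set[str], sample_size: int = 10) -> list[Path]:
    """Sample random shards from the live training input and return any
    that don't appear in the pre-committed manifest (i.e. outside data)."""
    shards = list(shard_dir.glob("*.bin"))
    sample = random.sample(shards, min(sample_size, len(shards)))
    return [p for p in sample if shard_hash(p) not in manifest]


if __name__ == "__main__":
    manifest = load_manifest(Path("approved_shards.sha256"))
    violations = spot_check(Path("/training/input_shards"), manifest)
    if violations:
        print("ALERT: undeclared data in the training run:", violations)
    else:
        print("Sampled shards all match the committed manifest.")
```

Note that this only catches data outside the declared set; it says nothing about whether the declared set itself is acceptable, so the manifest would still need to be reviewed before the run starts.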