Yeah, idk. I think it’s plausible that proofs and products trade off against each other. There’s a sense in which theoretical legitimacy is a signal that priorities are not in order, that it selects for staff who get excited by the wrong things, whereas deeply product-oriented teams are focused on what users actually care about. In an arms race, or even just in scarcity modes like layoffs and budget cuts, we should expect theory-driven teams to either restructure themselves away from theory or just lose.
One approach to drawing a reference class: central banking predates macroeconomic theory by centuries, gambling predates probability theory by millennia, and Enigma was cracked without formal information theory (though the lag here is under a decade). However, this approach fails in at least one exception I’m thinking of: Gödel/Church/Turing can be said to have provided the “theory of the x86 instruction set” a few decades in advance. Then there are ambiguous cases: the Archimedean theory of pressure was available to the engineers of the Roman aqueducts, but it’s not clear (1) that the theory was a game changer for the engineers, rather than something they would have figured out anyway, or (2) whether it matters that the theory isn’t the same as the matured theory after civilization continued to iterate.
Important, I think, are observations from the blockchain space: Ethereum is not theory-driven. It’s very move-fast-break-things, it’s subject to hundreds of millions of dollars of hacks and security failures, its contract language intentionally tries to appeal to JavaScript developers, etc. If theory gets you any gains at all, you should expect proof-driven blockchain ecosystems to beat Ethereum, if for nothing else than the security benefits of theoretical legitimacy. And that experiment has been run: a splinter group from the Ethereum team went off and started a formal-verification-friendly blockchain ecosystem, and (this is not investing advice) it hasn’t gone super well. Basically, Ethereum says “the whitepaper markets the product” and Cardano says “the product implements the whitepaper”, and this strategy has not led to Cardano winning. (Note: this example could be washed out by first-mover advantage; Ethereum was the first mover above an important threshold of contract expressibility, so no realistic amount of technical superiority may be able to overcome that.)
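The kind of security failure meant here can be made concrete. Below is a sketch, in Python rather than Solidity, of the reentrancy pattern behind some of the best-known Ethereum contract hacks; the `Vault` class, names, and amounts are all made up for illustration. The bug is purely one of ordering: the contract pays out before updating its ledger, so a malicious callback can withdraw repeatedly.

```python
# Toy model of contract reentrancy: funds are sent (the external call)
# BEFORE the ledger is updated, so a hostile callback re-enters and
# drains more than it is owed.

class Vault:
    def __init__(self):
        self.balances = {"attacker": 1}   # attacker is owed 1 unit

    def withdraw(self, who, callback):
        amount = self.balances[who]
        if amount > 0:
            callback(amount)              # external call happens first...
            self.balances[who] = 0        # ...state update happens last (the bug)

vault = Vault()
stolen = 0
depth = 0

def evil_callback(amount):
    global stolen, depth
    stolen += amount
    depth += 1
    if depth < 5:                         # re-enter before the balance is zeroed
        vault.withdraw("attacker", evil_callback)

vault.withdraw("attacker", evil_callback)
# attacker was owed 1 unit but the callback collected 5
```

Catching this ordering bug mechanically is exactly the kind of property a verification-friendly contract language aims to make checkable by construction.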
So yeah, in spite of not being convinced by the “engineering predates theory” reference class, I still think that an aligned-takeoff strategy that rests on deployments conforming soundly and completely to specifications that everyone agrees are right has a vanishingly slim chance of getting us out of this thing.
Quinn -- very interesting analogy of Ethereum (‘move fast, break things’) vs. Cardano (‘formal verification maximizes security’). We shall see which approach wins in the long run. My hunch is that the first-mover advantage is over-estimated, and that more provably secure, open-source protocols will eventually attract more transaction volume and value. (But that’s a somewhat different issue than formal theorem-proving about the safety of AI systems.)
𝕮𝖎𝖓𝖊𝖗𝖆 -- that all sounds plausible enough, at first glance.
However, are there any other domains of security or safety in which formal theorem-proving is considered helpful, relevant, and/or reliable?
Consider crash test safety for vehicles, home security against burglars, computer system security against hackers, consensus mechanisms for crypto, political security of nuclear launch codes, military counter-intelligence, anti-counterfeiting measures for fiat currency, etc.
I can’t think of any of these security domains in which formal theorem-proving is actually very helpful. So why should we trust it for AI safety?
I know for one that computer system security and consensus mechanisms for crypto rely on proofs and theorems to guide them. It is common, when you want a highly secure computer system, to formally verify its security, and consensus mechanisms rely heavily on mechanism design. Similarly for counter-intelligence: cryptography is invaluable in that area.
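As a toy illustration of what formal verification buys over spot-checking (my own example, not drawn from any real verified system): when the input space is small enough, checking a security-relevant property on every input is, morally, a proof. Real verification tools do the same thing symbolically over unbounded domains.

```python
def ct_equal(a: int, b: int) -> bool:
    """Branch-free equality on 4-bit values (no data-dependent early exit)."""
    diff = 0
    for i in range(4):
        diff |= ((a >> i) & 1) ^ ((b >> i) & 1)
    return diff == 0

# Exhaustive check over the entire 4-bit x 4-bit input space: for a
# finite domain this is a complete proof of functional correctness,
# not merely a sample of it.
assert all(ct_equal(a, b) == (a == b)
           for a in range(16) for b in range(16))
```

For 4-bit inputs the exhaustive check is trivial; the hard part of real verification is getting the same coverage when the domain is too large to enumerate.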
What about testing code for quality, that is, verifying code correctness, thereby reducing bugs?
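Testing helps, but it samples the input space rather than covering it, which is precisely the gap the proof people care about. A small hypothetical example (the leap-year bug is contrived for illustration):

```python
# A deliberately buggy leap-year check: a handful of unit tests pass,
# while an exhaustive check over a range (verification in miniature)
# finds the counterexample the tests happened to miss.

def is_leap_buggy(y: int) -> bool:
    return y % 4 == 0   # forgets the century rules

def is_leap_correct(y: int) -> bool:
    return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)

# A plausible unit-test suite: every case agrees, so testing says "ship it".
for y in (2000, 2004, 2019, 2024):
    assert is_leap_buggy(y) == is_leap_correct(y)

# Covering the whole range disagrees: 1900 is divisible by 4 but is
# not a leap year (divisible by 100, not by 400).
counterexamples = [y for y in range(1900, 2100)
                   if is_leap_buggy(y) != is_leap_correct(y)]
```

The point is not that tests are useless, only that they certify behavior on the inputs you thought to try, while a proof certifies it on the inputs you didn’t.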
Even in deep learning, proofs have by and large been a failure. Proofs would be valuable, and there are many people trying to find angles of attack for getting useful proofs about deep learning systems, so it is hard to say the area is neglected. Unfortunately, useful proofs are rarely tractable for systems as complex as deep learning systems. Compared to other interventions, I would not bet a substantial amount on proofs for deep learning systems, given their importance, neglectedness, and tractability.
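One way to see the tractability problem concretely: the proof techniques that do scale to networks, such as interval bound propagation, give sound but increasingly loose guarantees as depth grows. A minimal sketch, with made-up weights, for one linear layer followed by a ReLU:

```python
# Interval bound propagation (IBP) sketch: push a box of allowed input
# perturbations through w @ x + b and a ReLU. The resulting bounds are
# sound (no input in the box escapes them) but widen with every layer,
# which is one reason certified guarantees for deep models stay loose.

def linear_interval(w, b, lo, hi):
    """Sound output interval of w @ x + b for x in the box [lo, hi]."""
    out_lo, out_hi = [], []
    for row, bias in zip(w, b):
        acc_lo = acc_hi = bias
        for wij, l, h in zip(row, lo, hi):
            # a positive weight maps lo->lo; a negative weight flips them
            acc_lo += wij * (l if wij >= 0 else h)
            acc_hi += wij * (h if wij >= 0 else l)
        out_lo.append(acc_lo)
        out_hi.append(acc_hi)
    return out_lo, out_hi

def relu_interval(lo, hi):
    return [max(0.0, l) for l in lo], [max(0.0, h) for h in hi]

w = [[1.0, -2.0], [0.5, 0.5]]          # toy weights, made up
b = [0.0, 1.0]
lo0, hi0 = [-0.1, -0.1], [0.1, 0.1]    # input perturbation box

lo1, hi1 = relu_interval(*linear_interval(w, b, lo0, hi0))
# Every true output for inputs in the box lies inside [lo1, hi1].
```

Computing the bounds is cheap; the problem is that the intervals compound, so after a few layers the certified box is often too wide to certify anything interesting.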
I believe that planes and trains are a success story of proof engineering—I only believe this anecdotally because proof engineers bring up planes and trains a lot when the topic of non-academic jobs comes up.
But crucially, I think Cinera is discussing a mathematical process of defining and understanding and communicating, where articulating the theorem in the first place, or setting up lemmas for the proof, can sometimes be “most” of the work. This process is, from a computer science perspective, “informal” because of how much human judgment and natural language is involved. So the question is less about formal verification than I think you’re taking it to be.
Geoffrey, you might enjoy my post from last year.