My point was that HRAD potentially enables the strategy of pushing mainstream AI research away from opaque designs (which are hard to compete with while maintaining alignment, because you don’t understand how they work and you can’t just blindly copy the computation that they do without risking safety), whereas in your approach you always have to worry about “how do I compete with an AI that doesn’t have an overseer or has an overseer who doesn’t care about safety and just lets the AI use whatever opaque and potentially dangerous technique it wants”.
I think both approaches potentially enable this, but are VERY unlikely to deliver. MIRI seems more bullish that fundamental insights will yield AI that is just plain better (Nate gave me the analogy of Judea Pearl coming up with causal PGMs as such an insight), whereas Paul just seems optimistic that we can get a nearly negligible performance hit for safe vs. unsafe AI.
But I don’t think MIRI has given very good arguments for why we might expect this; it would be great if someone could articulate or reference the best available arguments.
I have a very strong intuition that dauntingly large safety-performance trade-offs are extremely likely to persist in practice, so the only answer to the “how do I compete” question seems to be “be the front-runner”.