I’m not sure. If you look back at Biosphere 2, for example, a large number of the failure modes were identified fairly early on. In my experience, two things cause unexpected failure modes: scale and duration. Running something at a larger scale than was previously tested can often reveal unintuitive failure modes, and running something for longer than was previously tested can reveal new ones.
I get what you’re saying, that running a service in a different environment from the one it was tested in can cause unforeseen issues. But I think that with simulation and testing like they did for Beijing airport, or the kind of testing they do at SpaceX, we should be aiming to test these things to their failure points.