Hi Rose,
To your second question first: I don’t know if there are specific laws related to e.g. ASTM standards. But there are laws related to criminal negligence in every country. So if, say, you build a tank and it explodes, and it turns out that you didn’t follow the appropriate regulations, you will be held criminally liable—you will pay fines and potentially end up in jail. You may believe that the approach you took was equally safe and/or that it was unrelated to the accident, but you’re unlikely to succeed with this defence in court—it’s like arguing “I was drunk, but that’s not why I crashed my car.”
And not just you: a series of people, perhaps up to and including the CEO, and also the Safety Manager, will be liable. So these people are highly incentivised to follow the guidelines. Largely independently of whether there are actually criminal penalties for failing to follow the standards when no accident has occurred, the system more or less polices itself. As an engineer, you just follow the standards. You need no further justification for a specific safety step or cost than that it is required by the relevant standard.
But I’m very conscious that the situations are very different. In engineering, there are many years of experience, and there have been lots of accidents from which we’ve learned and in response to which the standards have gradually been modified. And the risk of any one accident, even the very worst kind like a building collapse or a major explosion, tends to be localised. With biohazards, we can imagine one incident, perhaps one that has never happened before, which could be catastrophic for humanity. So we need to be more proactive.
Now to the specific points:
Reporting:
For engineering (factories in general), there are typically two important mechanisms.
Every incident must be investigated and a detailed incident report must be filed. The report would typically be written by the most qualified person (i.e. the engineer working directly on the system) and then reviewed over several steps by more senior managers, by the Safety Manager and eventually by a very senior manager, often the CEO. It would also be quite typical for the company to bring in external safety experts, even from a competitor (who might be best placed to understand the risks), to ensure the analysis is complete. The report would need to provide a full analysis of what went wrong, what could have gone worse, why it went wrong, and how to ensure that the incident, or anything similar, never happens again. It would not be unusual for the entire production unit to be closed while an investigation is carried out, and for it not to be allowed to re-open until every concern has been fully addressed to the satisfaction of the Safety Manager. And these are just the internal processes. There will also frequently be external reviewers of the report itself, sometimes from state bodies. And all this is what happens when an incident does not lead to any criminal proceedings or negligence charges; if those are involved, the whole process becomes far more extensive.
Most companies have a “near miss” box (physical or virtual) in which any employee can report, anonymously if they choose, any case where an accident could have occurred, or any dangerous situation that has been allowed to develop. These are taken very seriously, and typically the Safety Manager will lead a full investigation into the risks or hazards identified. The fact that the accident didn’t actually happen is not really a mitigating factor, since that might have been just good fortune. An example: if an operator notices that the reactor is overheating and could potentially overflow, and so she turns on the cooling water to avert the risk, this would be considered a near miss if it was not that particular operator’s role to do this—the company was lucky that she noticed it, but what would have happened if she hadn’t?
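To make the reporting side a bit more concrete, here is a minimal sketch in Python covering both mechanisms. The field names and the review chain are my own illustrative choices, not taken from any real company’s template.

```python
from dataclasses import dataclass, field

# Illustrative sketch only: field names and the review chain are invented
# for this example, not copied from any real reporting template.
@dataclass
class SafetyReport:
    description: str                 # what happened (or nearly happened)
    near_miss: bool                  # True if no actual harm occurred
    what_could_have_gone_worse: str
    root_causes: list[str]
    corrective_actions: list[str]    # how we make sure it cannot recur
    author: str                      # ideally the person closest to the system
    sign_offs: list[str] = field(default_factory=list)

# The review escalates through roles like those described above.
REVIEW_CHAIN = ["Line Manager", "Safety Manager", "External Expert", "CEO"]

def unit_may_reopen(report: SafetyReport) -> bool:
    """The unit stays closed until every reviewer has signed off and concrete
    corrective actions exist. A near miss is deliberately treated exactly like
    an accident: the near_miss flag is not checked here."""
    return bool(report.corrective_actions) and all(
        role in report.sign_offs for role in REVIEW_CHAIN
    )
```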
Proactive vs. Reactive:
We start already with a very clear set of safety rules, which have been developed over the years. In my last company, I was one of the people who “developed” the initial safety procedures. But we didn’t start from scratch—rather, we started from lots of excellent procedures which are available online and from government bodies, from material safety data sheets which are available for every chemical and give detailed safe-handling instructions, and so on. Even so, this process probably took about one month of my time, as Head of Engineering, and similar amounts from a couple of other employees, including the Safety Manager. And this was just for a lab and pilot-plant set-up. In a sense you could consider this to be the “reactive” part: the set of rules that has been built up over years of experience. The rest of what I describe below is the proactive part, in which the people best qualified to judge evaluate what could potentially go wrong:
Starting from this as a baseline, we then do a detailed hazard analysis (HAZAN, HAZOP analyses) in which we study all the risks and hazards that exist in our lab or pilot plant, whether they be equipment or chemicals or operations. So, for example, if we use liquid nitrogen, we’d already have (in step one) a detailed set of rules for handling liquid nitrogen, and in step two we’d do a detailed analysis of what could possibly go wrong—what if the container gets dropped? What if an operator makes a mistake? What if a container has a leak? What if someone who isn’t trained to work with liquid nitrogen were in the lab? And so on. We’d then need to develop procedures to ensure that none of these scenarios could lead to accidents or personal injuries.
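If it helps to see the shape of such an analysis, here is a minimal sketch of a what-if register in Python, loosely based on the liquid-nitrogen example above. The entries and field names are purely illustrative, not taken from any real HAZOP worksheet.

```python
from dataclasses import dataclass

@dataclass
class WhatIfEntry:
    scenario: str           # the "what if ...?" question
    consequence: str        # what could plausibly go wrong
    safeguards: list[str]   # existing measures that should catch it
    action: str             # what we still need to put in place

# Illustrative entries only.
register = [
    WhatIfEntry(
        scenario="Dewar of liquid nitrogen is dropped",
        consequence="Rapid boil-off; risk of asphyxiation in a small room",
        safeguards=["Oxygen-depletion alarm", "Carrying trolley"],
        action="Restrict transport routes; two-person rule for transfers",
    ),
    WhatIfEntry(
        scenario="Untrained visitor enters the lab during a transfer",
        consequence="Cold burns or exposure without PPE",
        safeguards=["Access control", "Induction training"],
        action="Signage and escort rule during transfers",
    ),
]

# Every scenario must end up with at least one safeguard and a follow-up action.
assert all(entry.safeguards and entry.action for entry in register)
```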
Then, every time we want to do something new (e.g. we work with a new chemical, we want to try a new process experiment, …), we need to do a detailed risk analysis of that and get it approved by the Safety Manager before we can start.
In addition to the above, for major risks (e.g. solvent handling that could lead to explosions) we would sometimes do additional safety analyses—one example would be a “bowtie analysis” (a deceptively elegant name!) in which we create a picture shaped like a bowtie.
In the centre (the knot) is the event itself: let’s say an ignition event, though in a bio-hazard context it could be, say, a researcher dropping a glass container with dangerous viruses in it.
To the left are all the steps that are taken to prevent this from occurring. For example, for an ignition/explosion risk, you need three things—a flammable substance, a source of ignition (e.g. a spark) and the presence of oxygen—and typically we’d want to make sure at most one of these was present, so that we’d have two layers of security. So, on the bowtie diagram, you would start on the left and look at something that might happen (e.g. the solvent spills) and then look sequentially at what could happen next, to make sure that the safety measures in place would take care of it. An especially important concern would be any occurrence that could weaken more than one layer of security simultaneously—for example, if a researcher in a biolab takes a sample outside the safe working area, intentionally or by accident, this might mean that several layers of what appeared to be an impenetrable safety system are broken at once.
To the right are the steps taken to minimise the consequences. If the explosion does occur, these are the steps that will minimise the injuries, the casualties and the damage: appropriate PPE, fire doors, fire-handling procedures, keeping the number of people in the lab to a minimum, and so on.
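For readers who think in code, here is a minimal sketch of that bowtie structure, using the ignition example. The barrier names are my own illustrations, not a real analysis.

```python
from dataclasses import dataclass

@dataclass
class Bowtie:
    top_event: str
    preventive_barriers: list[str]   # left side: stop the event happening
    mitigating_barriers: list[str]   # right side: limit the consequences

# Illustrative barriers only, based on the ignition example above.
ignition = Bowtie(
    top_event="Ignition of solvent vapour",
    preventive_barriers=[
        "Closed solvent-transfer system (no flammable atmosphere)",
        "Spark-free, earthed equipment (no ignition source)",
        "Nitrogen blanketing of vessels (no oxygen)",
    ],
    mitigating_barriers=[
        "Fire doors and pressure-relief panels",
        "PPE and fire-handling procedures",
        "Minimum number of people in the area",
    ],
)

def layers_of_protection(bowtie: Bowtie) -> int:
    """Crude proxy for defence in depth: how many independent barriers would
    all have to fail before the top event occurs unchecked."""
    return len(bowtie.preventive_barriers)

# The common-cause worry from the text: one action (e.g. carrying a sample
# outside the controlled area) can defeat several barriers at once, so this
# simple count can badly overstate the real protection.
print(layers_of_protection(ignition))
```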
I think the idea is clear. I am sure that someone (probably many people) has done this for bio-hazards and bio-security. But what is maybe different is that for engineers there is such a wealth of documentation and examples out there, and so many qualified, respected experts, that when doing this risk analysis we have a lot to build on and we’re not relying on first principles alone, although we use those too.
For example, I can go on the internet and find some excellent guides for how to run a good HAZOP analysis, and I can easily get experienced safety experts to visit my pilot plant and lead an external review of our HAZOP analysis; typically these experts will be familiar with the specific risks we’re dealing with. I’m sure people run HAZOPs for biolabs too (I hope!!), but I’m not sure they would have the same quality of information and expertise available to help ensure they don’t miss anything.
From your analysis of biolabs, it feels much more haphazard. I’m sure every lab manager means well and does what they can, but it’s so much easier for them to miss something, or maybe just not to have the right qualified person or the relevant information available to them.
What I’ve described above is not a check-list, but it is a procedure that works in widely different scenarios, where you incorporate experience, understanding and risk-analysis to create the safest working environment possible. And even if details change, you will find more or less the same approach anywhere around the world, and anyone, anywhere, will have access to all this information online.
And … despite all this, we still have accidents in chemical factories and buildings that collapse. Luckily these do not threaten civilisation the way a bio-accident could.
Hope this helps—happy to share more details or answer questions if that would be useful.
Thanks, this is really interesting.
One follow-up question: who are safety managers? How are they trained, what’s their seniority in the org structure, and what sorts of resources do they have access to?
In the bio case it seems that in at least some jurisdictions and especially historically, the people put in charge of this stuff were relatively low-level administrators, and not really empowered to enforce difficult decisions or make big calls. From your post it sounds like safety managers in engineering have a pretty different role.
Indeed,
A Safety Manager (in a small company) or a Safety Department (in a larger company) needs to be independent of the department whose safety they monitor, so that they are not conflicted between Safety and other objectives like, say, an urgent production deadline (of course, in reality they will know people and so on; it’s never perfect). Typically, they will have reporting lines that meet higher up (e.g. at the CEO or a Vice President), and this senior manager will be responsible for resolving any disagreements. If the Safety Manager says “it’s not safe” and the production department says “we need to do this,” we do not want it to become a battle of wills. Instead, the Safety Manager focuses exclusively on the risk, and the senior manager decides whether the company will accept that risk. Typically, this would not be “OK, we accept a 10% risk of a big explosion” but rather finding a way to enable the work to be done safely, even if that means making it much more expensive and slower.
In a smaller company or a start-up, the Safety Manager will sometimes be a more experienced hire than most of the staff, and this too will give them a bit of authority.
I think what you’re describing as the people “put in charge of this stuff” are probably not the analogous people to Safety Managers. In every factory and lab, there would be junior people doing important safety work. The difference is that in addition to these, there would be a Safety Manager, one person who would be empowered to influence decisions. This person would typically also oversee the safety work done by more junior people, but that isn’t always the case.
Again, the difference is that people in engineering can point to historical incidents of oil rigs exploding with multiple casualties, of buildings collapsing, … and so they recognise that getting Safety wrong is a big deal, with catastrophic consequences. If I compare this to, say, a chemistry lab, I see what you describe. Safety is still very much emphasised and spoken about, and nobody would ever say “Safety isn’t important”, but it would be relatively common for someone (say the professor) to overrule the safety person without necessarily addressing the concerns.
Also in a lab, to some extent it’s true that each researcher’s risks mostly impact themselves—if your vessel blows up or your toxic reagent spills, it’s most likely going to be you personally who will be the victim. So there is sometimes a mentality that it’s up to each person to decide what risks are acceptable—although the better and larger labs will have moved past this.
I imagine that most people in biolabs still feel like they’re in a lab situation. Maybe each researcher feels that the primary role of Safety is to keep them and their co-workers safe (which I’m sure is something they take very seriously), but they’re not really focused on the potential for global-scale catastrophes, which is what would justify putting someone in charge.
I again emphasise that most of what I know about safety in biolabs comes from your post, so I don’t want to suggest that I really know; I’m only trying to make sense of it. Feel free to correct / enlighten me (anyone!).