(Note: For more technical readers, the original academic paper that inspired this blog can be found here.)
Robotic systems have become an integral part of numerous industries, from manufacturing and healthcare to space exploration and autonomous vehicles. Ensuring the safety of these complex machines is of paramount importance.
In an effort to mitigate potential risks and accidents, failsafe maneuvers or backup controllers are often implemented. However, relying solely on a singular failsafe maneuver may present significant limitations that can compromise the overall safety of a robotic system. This blog explores the drawbacks of depending solely on a singular failsafe maneuver and emphasizes the need for a more comprehensive approach to robotic system safety.
The dangers of relying on a singular failsafe maneuver
You create one point of failure
Implementing a singular failsafe maneuver means that the entire system's safety hinges on a single mechanism or control. While such measures may offer some level of protection, they introduce the concept of a single point of failure. If this failsafe mechanism malfunctions or is unable to detect a critical situation, the entire system may be left vulnerable, putting both operators and the environment at risk. This limitation becomes increasingly evident when facing complex or unexpected scenarios that the singular failsafe may not have been designed to handle.
You create a lack of adaptability
Robotic systems are often required to operate in dynamic and ever-changing environments. Singular failsafe maneuvers are typically designed to address specific known risks or failures, leaving little room for adaptability. In scenarios where the system encounters an unanticipated danger or unforeseen circumstances, the singular failsafe may not be capable of adequately responding or providing the necessary protection. Consequently, a more comprehensive safety approach that considers a wider range of potential risks is essential.
You create a vulnerability to sophisticated attacks
As technology advances, the potential for cyberattacks targeting robotic systems also increases. Relying on a singular failsafe maneuver exposes the system to vulnerabilities. If an attacker manages to bypass or disable the failsafe mechanism, they would effectively gain control over the entire system, posing serious threats. Employing multiple layers of security measures, such as redundant failsafes and robust cybersecurity protocols, becomes imperative to protect against sophisticated attacks.
You have limited coverage of system failures
While a singular failsafe maneuver may address certain failures or malfunctions within a robotic system, it does not guarantee full coverage. Complex systems are prone to various types of failures that may require distinct failsafe mechanisms. By relying on a singular failsafe, other critical failure modes might be overlooked, potentially leading to catastrophic consequences. A comprehensive safety strategy should encompass a range of failsafe mechanisms, each designed to address specific failure modes, so that together they form a broader safety net.
You create a false sense of security
A singular failsafe maneuver can create a false sense of security among operators and users. If they rely solely on the failsafe mechanism without fully understanding its limitations, they may underestimate the potential risks and fail to take appropriate precautions. This false sense of security can result in complacency and an inadequate response when confronted with unexpected situations. Operators should be aware of the limitations of failsafe maneuvers and be prepared to intervene and take manual control when necessary.
The answer? Multiple backup controllers
To address the shortcomings of a singular backup controller, the obvious solution is to incorporate several policies that together handle as many situations as possible. The idea is simple, but implementing it is difficult in practice: switching between policies can introduce discontinuities in the control input, and evaluating several policy options adds computational complexity.
Let’s look at a method that addresses both of these issues. We’ll take a truck driving scenario as our guide.
For a truck, a safety maneuver might be triggered if a car stops in front of it. Because of its high inertia, the truck needs to keep a relatively wide distance from the car if only one failsafe maneuver, such as an e-stop, is available. However, changing lanes when an adjacent lane is free can be a better solution, as it allows the truck to drive closer to the car.
This, then, is the answer: we first try to adjust the trajectory to avoid a collision, and if that path is not possible, the failsafe function, which is time-varying, continuously changes the maneuver to something safe and feasible, like an e-stop.
Let's stay with the example of a one-lane road, where a truck follows a car that can stop suddenly at any moment. As mentioned before, the truck has to keep a certain safe distance behind the car so that, if the car brakes in an emergency, the truck can brake too without colliding with it.
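To make the "safe distance" concrete, here is a minimal sketch of how such a gap could be estimated with constant-deceleration kinematics. This is not taken from the paper: the function names, the reaction delay, and the deceleration values are all illustrative assumptions. It also assumes the truck is at least as fast as the car and brakes no harder than the car, so the gap is smallest once both vehicles have stopped.

```python
def braking_distance(speed_mps: float, decel_mps2: float) -> float:
    """Distance travelled while braking to rest at constant deceleration."""
    if decel_mps2 <= 0:
        raise ValueError("deceleration must be positive")
    return speed_mps ** 2 / (2.0 * decel_mps2)


def min_following_gap(
    truck_speed: float,
    car_speed: float,
    truck_decel: float,
    car_decel: float,
    reaction_time: float,
) -> float:
    """Smallest initial gap (m) that lets the truck stop behind the car.

    Assumption: the truck keeps its speed for `reaction_time` seconds
    before it starts braking, and the car brakes immediately.
    """
    truck_stop = truck_speed * reaction_time + braking_distance(truck_speed, truck_decel)
    car_stop = braking_distance(car_speed, car_decel)
    return max(0.0, truck_stop - car_stop)


if __name__ == "__main__":
    # Hypothetical numbers: both vehicles at 25 m/s (90 km/h).
    gap = min_following_gap(25.0, 25.0, truck_decel=4.0, car_decel=8.0, reaction_time=0.5)
    print(f"required gap: {gap:.1f} m")
```

With these made-up values the truck needs a gap of a little over 50 meters, which illustrates why a braking-only failsafe forces a long following distance.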
With our method, we can provide this failsafe maneuver but also extend it to the case of a three-lane road. Let's say that the truck and the car are in the middle lane and the other two lanes are empty for the next 200 meters. A better failsafe maneuver could be to switch lanes.
The solution is to check continuously, at a fixed frequency, whether this new maneuver is valid and feasible over the next few seconds. If it is, it can be used as the failsafe maneuver for this time step. This failsafe maneuver allows the truck to follow the car at a closer distance. However, if the adjacent lanes are not available, the initial failsafe maneuver (now called the recovery maneuver) remains part of the extended failsafe function and is guaranteed to be available whenever the other maneuvers are not valid.
This method extends a single failsafe maneuver to a multiple failsafe policy while keeping the failsafe function continuous and avoiding the drawbacks of evaluating different policies. It does so by defining a failsafe function from t0 to tf that follows, say, the “switch lanes” policy from t0 to t1 and the recovery policy from t1 to tf. By default, the method sets the time of the function to t1; if the failsafe policy from t0 to t1 is valid, it sets the time to t0. (cf. next figure)
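The time-shifting idea can be sketched in a few lines. This is only an illustration of the description above, not the authors' implementation: the class name, the policy labels, and the example times are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailsafeFunction:
    """One failsafe function defined on [t0, tf].

    [t0, t1): the optional maneuver (here, "switch_lanes")
    [t1, tf]: the recovery maneuver (here, "recovery")
    """

    t0: float
    t1: float
    tf: float

    def start_time(self, maneuver_valid: bool) -> float:
        # Default to t1 (recovery only); move back to t0 only when the
        # optional maneuver has been checked and found valid.
        return self.t0 if maneuver_valid else self.t1

    def active_policy(self, t: float) -> str:
        if not self.t0 <= t <= self.tf:
            raise ValueError("t is outside the failsafe function's domain")
        return "switch_lanes" if t < self.t1 else "recovery"


if __name__ == "__main__":
    f = FailsafeFunction(t0=0.0, t1=3.0, tf=8.0)  # hypothetical times in seconds
    for valid in (True, False):
        start = f.start_time(valid)
        print(valid, start, f.active_policy(start))
```

Because both maneuvers live on one time axis, choosing between them reduces to choosing where on that axis to start, which is how the description above avoids switching between separate controllers.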
To avoid the complexity of evaluating multiple failsafe maneuvers, the method checks only one of them at each time interval.
In the case of a three-lane road, we have two maneuvers:
- Switch to left lane
- Switch to right lane
Alongside the failsafe policies, the emergency brake is considered the recovery maneuver.
If maneuver 1 was not feasible at the previous iteration, the method evaluates only maneuver 2. It does not evaluate maneuver 1 again unless maneuver 2 becomes unavailable. This process ensures that only one failsafe policy is evaluated at each time interval.
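The "one check per interval" rule described above might look like the following sketch. The class, the maneuver names, and the `is_valid` callback are hypothetical; the sketch also assumes that when the single checked maneuver fails, the recovery maneuver is used for that interval and the next candidate is tried at the following one.

```python
from typing import Callable, Sequence


class FailsafeSelector:
    """Evaluates exactly one candidate failsafe maneuver per time interval."""

    def __init__(self, candidates: Sequence[str], recovery: str = "emergency_brake"):
        if not candidates:
            raise ValueError("at least one candidate maneuver is required")
        self.candidates = list(candidates)
        self.recovery = recovery
        self.index = 0  # candidate to evaluate at the next interval

    def step(self, is_valid: Callable[[str], bool]) -> str:
        """Run one interval: a single validity check, then return the maneuver."""
        candidate = self.candidates[self.index]
        if is_valid(candidate):
            return candidate  # keep checking the same maneuver next time
        # Candidate failed: fall back to recovery for this interval and
        # move on to the next candidate for the following interval.
        self.index = (self.index + 1) % len(self.candidates)
        return self.recovery


if __name__ == "__main__":
    free_lanes = {"switch_left": False, "switch_right": True}  # made-up state
    selector = FailsafeSelector(["switch_left", "switch_right"])
    for _ in range(3):
        print(selector.step(lambda name: free_lanes[name]))
```

The cost per interval stays constant no matter how many candidates are listed, at the price of sometimes using the recovery maneuver for an interval even though another candidate would have been valid.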
While this solution is not optimal, it scales to an arbitrary number of failsafe policies.
One thing to keep in mind is that the emergency brake is the default solution at each time step: if all other failsafe policies are unavailable, the emergency stop is triggered when needed. This method does not force the driver to comply with these emergency policies, and the final input can be a mix of the user input and the failsafe policy.
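One simple way to picture such a mix is a weighted average of the two commands. The source does not say how the mix is computed, so the convex combination below, the `weight` parameter, and the function name are purely illustrative assumptions.

```python
def blend_input(user_input: float, failsafe_input: float, weight: float) -> float:
    """Mix the driver's command with the failsafe command.

    weight = 0.0 keeps the driver's input unchanged;
    weight = 1.0 applies the failsafe command fully.
    How `weight` is chosen is left open here (assumption).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return (1.0 - weight) * user_input + weight * failsafe_input


if __name__ == "__main__":
    # Hypothetical acceleration commands: driver coasts, failsafe brakes hard.
    for w in (0.0, 0.5, 1.0):
        print(w, blend_input(0.2, -1.0, w))
```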
Optimizing for maximum robotic system safety
While failsafe maneuvers and backup controllers play a crucial role in maintaining the safety of robotic systems, relying solely on a singular failsafe maneuver poses significant limitations. The concept of a single point of failure, lack of adaptability, vulnerability to sophisticated attacks, limited coverage of system failures, and the potential for a false sense of security are all important considerations.
To enhance the safety of robotic systems, a comprehensive approach that incorporates redundant failsafes, adaptability, robust cybersecurity measures, and operator awareness is crucial. By addressing these limitations, we can work towards a future where robotic systems can operate with enhanced safety and reliability in diverse and complex environments.