Systems Thinking in Functional Safety: Don’t Miss the Forest for the Trees

Contributor(s): Jeff Simon

It's funny how the holidays, travel, and functional safety go together. If you’ve recently traveled by plane, train, or automobile, you hopefully never had reason to stop and wonder: “How is this vehicle getting me from Point A to Point B safely, that is, free from unacceptable risk?”

This post will outline a systems thinking approach to address systematic errors in functional safety. We will show how effective product lifecycle management (PLM) can unite development processes and domains to achieve greater product safety and success.
  

Why Do We Need Systems Thinking in Functional Safety? 

When assessing a functional safety system, we often find engineering domains operating as silos, with little collaboration between them. Even well-intentioned efforts to improve such structures can easily become siloed themselves.

Yet these myriad processes form the “machine,” so how can they be considered in isolation from each other? From this angle, safety becomes another gear in the machine—not just a “bolt-on” activity but an essential component driving functional improvements.  

Conversations about functional safety often focus on the advantages of one analysis technique or standard vs. another or whether individual fault control requirements of the standard have been met. These conversations are missing the forest for the trees.  

Controlling and avoiding systematic error requires a, well, systematic view: one that encompasses the various engineering domains, potential hardware and system faults, development processes, maintenance management, and more.

Figure: Typical system view of an aircraft and its environment, showing air transportation as a series of systems within systems. Adapted from ISO 24748-1:2018 ©2018.
 

Random Hardware Faults

Engineers may be tempted to focus on implementing fault control mechanisms for random hardware faults. That is a worthy effort, but it misses the big picture. Consider how rarely accidents caused by random hardware failures make headlines. Systematic failure is more likely to be the source of high-profile accidents and recalls.

Thus, the conversation shifts. We don’t hear discussions about the safe failure fraction (SFF) of the applications department or the probabilistic metric for hardware failures (PMHF) of the quality organization. In these domains, the conversation is generally limited to “we followed the process” or “we have all the required work products.” 

That’s because systematic error offers a preventative opportunity that random hardware failures do not. We can manage random hardware faults, and even some systematic faults, after they occur, yet we have an even more powerful opportunity: avoiding the introduction of systematic faults altogether.
 

Systematic Error 

For random hardware failures, functional safety can often be treated as a bolt-on activity in which safety is improved by adding requirements. Avoiding systematic errors, on the other hand, requires a methodical approach and a systems perspective.

Think back to your last time traveling, recent or otherwise. Which would have made you more nervous: Knowing that, at some point in the trip, a single random component would fail or knowing that a person or system would make a mistake that couldn’t be known in advance?  

I know which one I’d pick!  

It seems that the safety of a trip from Point A to Point B could be more at risk from human error or system miscalculations than from any single random component failure. And that’s why we need a perspective of the whole functional safety forest, not just the individual trees.
 

Avoidance of Systematic Error  

Functional safety is an emergent property of the system, meaning it arises from interactions within the system rather than from the components themselves. These interactions occur between the… 

  • physical system and its parts,  
  • development processes,  
  • methodical maintenance processes, and 
  • lifecycle stages and their management. 

Organization, consistency, and management throughout a product’s lifecycle are key tenets of avoiding systematic error.  
 

Product Lifecycle Management: A System of Systems 

Systematic error avoidance requires a systematic understanding of the processes that move the product through its lifecycle stages, the dependencies between those processes, and their connection to systematic errors. Consider the relationship between safety and… 

  • Project management 
  • Quality management  
  • Hardware & software development 
  • Acquisition & supply process 
  • Requirements engineering 

The safety standards are essentially the same in the techniques and measures they prescribe for avoiding systematic errors. Their respective safety lifecycles also resemble any product lifecycle, with perhaps more attention to detail and rigor.

Figure: Flowchart of a typical product lifecycle management model.
 

PLM Framework for Safety 

As we’ve said before, functional safety should never be “bolted on.” Instead, it should be adapted and integrated into an organization’s existing process framework.  

This may seem like a complex problem, but remember: We’re looking at the forest, not the trees. The whole “machine” is just a system of systems, sub-systems, and their relationships, of which functional safety is an emergent property.  

Hardware, lifecycle, and management process elements are all part of that machine. By treating them as such, organizations can create effective structures for the development, manufacture, and support of functionally safe systems.  

Figure: How functional safety standards fit into overall quality and lifecycle process standards, with functional safety shown as an additional subsystem of the product lifecycle.

 

Systems within Systems: It’s All Connected  

This collection of interrelated workflows and management processes can be modeled as yet another system. Standards already exist that can provide details about this model.  

Taking an even broader view, a digital twin of the process environment can be created to analyze existing organizational interactions and to assess the organization’s capability to control systematic error. Thus, the relationships and dependencies between safety, quality, environmental management, information technology, project management, and so on begin to emerge.
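A real digital twin of a process environment is far more than a few lines of code, but the core idea of treating processes as a system with dependencies can be sketched in Python. The example below is only an illustration: the process names are borrowed from the domains mentioned in this post, while the edges between them, the `PROCESS_DEPENDENCIES` map, and the helper functions `downstream_of` and `upstream_of` are all hypothetical assumptions, not something defined by any standard.

```python
from collections import deque

# Hypothetical process map: each key feeds work products into the listed
# processes. The edges are illustrative assumptions, not taken from a standard.
PROCESS_DEPENDENCIES = {
    "project management": ["requirements engineering", "quality management"],
    "requirements engineering": ["hardware & software development", "functional safety"],
    "quality management": ["functional safety"],
    "acquisition & supply": ["hardware & software development"],
    "hardware & software development": ["functional safety"],
    "functional safety": [],
}


def downstream_of(process, dependencies):
    """Return every process reachable from `process` (breadth-first search)."""
    seen = set()
    queue = deque(dependencies.get(process, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(dependencies.get(current, []))
    return seen


def upstream_of(process, dependencies):
    """Return every process whose output can eventually reach `process`."""
    return {p for p in dependencies if process in downstream_of(p, dependencies)}


if __name__ == "__main__":
    # Which processes could introduce a systematic error that reaches safety?
    print(sorted(upstream_of("functional safety", PROCESS_DEPENDENCIES)))
```

In this assumed map, every other process sits upstream of functional safety, which is the point of the systems view: a systematic error introduced anywhere in the map can propagate to the safety outcome.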

It may seem that functional safety has little to do with areas like development efficiency, time to market, or product quality, yet from a bird’s-eye view we can see that they are all connected. Just as engineering domains work better together, functional safety delivers the most value as part of a holistic strategy.
