How to Describe a Redundant System

Redundancy is widely used in functional safety systems. A common way to describe a redundant system is MooN (M out of N). In this blog, I describe the various MooN architectures.

In functional safety, redundancy serves many purposes, including:

  • To increase the claimed systematic capability, using diverse redundancy
  • To allow the use of a lower SFF (safe failure fraction) while still meeting the hardware metrics – here the standard does not call it redundancy but rather HFT (hardware fault tolerance)
  • To reduce the claimed PFH/PFD, with the improvement limited only by the beta factor (common cause failures)
  • To increase availability – you have two safety systems, and even if one fails, you still have another to ensure safety
  • To increase the lifetime of a system through load sharing

 

I have covered some of these already in other blogs in my safety matters series, but this blog will concentrate on how to describe redundancy.

Take the second of these as an example. The IEC 61508-2 table below illustrates how you can trade off HFT against SFF. This is really the only purpose of HFT in the standard.

Figure 1 - Trading off HFT and SFF in IEC 61508

Requirements for redundancy can be indicated using language such as:

  • No single error
  • Single error fault tolerance
  • 1oo2, 2oo3 etc
  • HFT=1
  • A voted system

A common description is MooN, which means that there are N redundant subsystems and that M of them need to demand a trip before the safety system takes action to achieve a safe state.

So in summary:

  • 1oo1 means a non-redundant safety system
  • 1oo2 is a redundant system where the system trips when either of the sub-systems trips
  • 2oo2 is a redundant system, but the redundancy is added mostly for availability rather than safety, since both sub-systems need to trip before the system trips. If only one of them trips, nothing happens
  • 2oo3 is a system that is both highly available and safe. The system won’t trip if only one of the sub-systems trips but will trip if two out of the three demand it, so it is single-fault tolerant
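The voting rule in the summary above can be written down in a few lines. What follows is only a minimal sketch for illustration, not code from any standard or product. The function names `moon_trips` and `fault_tolerance` are hypothetical, and the `N - M` relation is the commonly used reading of how many channels can fail dangerously before the system can no longer trip.

```python
def moon_trips(m: int, demands: list[bool]) -> bool:
    """Return True when at least m of the N channels demand a trip."""
    n = len(demands)
    if not 1 <= m <= n:
        raise ValueError(f"need 1 <= M <= N, got M={m}, N={n}")
    return sum(demands) >= m


def fault_tolerance(m: int, n: int) -> int:
    """Channels that can fail dangerously (never demand a trip)
    while the remaining channels can still trip the system.
    Assumption: the usual N - M relation for a MooN vote."""
    return n - m


# One channel demands a trip, the others do not.
print(moon_trips(1, [True, False]))         # 1oo2 -> True
print(moon_trips(2, [True, False]))         # 2oo2 -> False
print(moon_trips(2, [True, False, False]))  # 2oo3 -> False

# 1oo2 and 2oo3 both tolerate one dangerous failure; 1oo1 and 2oo2 tolerate none.
print(fault_tolerance(1, 2), fault_tolerance(2, 3), fault_tolerance(2, 2))  # 1 1 0
```

Under this sketch, a single demanding channel is enough for 1oo2 but not for 2oo2 or 2oo3, which matches the bullet descriptions.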

You can also show some of them on the HFT vs SFF diagram from earlier.

Figure 2 - HFT vs SFF chart from IEC 61508 with columns labelled

Some of the circuits can be illustrated very nicely with relay diagrams. In the figure below, safety is achieved when the path from the top to the bottom opens. So, for instance, in the 2oo2 circuit, if channel A fails dangerously with its relay stuck closed, it doesn’t matter what channel B does; it can’t trip the system. This 2oo2 circuit is redundant, but its redundancy is there for availability. Will it guarantee some level of safety? Yes, it will when both channels are working or when one of the channels has a safe failure (one that causes its relay to open).

Figure 3 - Relays to illustrate some MooN circuits
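The stuck-relay reasoning can be checked with a small model. This is a sketch under stated assumptions: I assume the 1oo2 circuit puts the two relay contacts in series and the 2oo2 circuit puts them in parallel, which is consistent with the text but is my reading rather than a transcription of the figure. The names `Fault`, `contact_closed`, `trips_1oo2` and `trips_2oo2` are hypothetical.

```python
from enum import Enum


class Fault(Enum):
    NONE = "none"
    STUCK_CLOSED = "stuck_closed"  # dangerous failure: relay can never open
    STUCK_OPEN = "stuck_open"      # safe failure: relay is always open


def contact_closed(demand: bool, fault: Fault = Fault.NONE) -> bool:
    """A healthy relay opens its contact when its channel demands a trip."""
    if fault is Fault.STUCK_CLOSED:
        return True
    if fault is Fault.STUCK_OPEN:
        return False
    return not demand


def trips_1oo2(a_closed: bool, b_closed: bool) -> bool:
    """Contacts in series (assumed): the path opens if either contact opens."""
    return not (a_closed and b_closed)


def trips_2oo2(a_closed: bool, b_closed: bool) -> bool:
    """Contacts in parallel (assumed): the path opens only if both open."""
    return not (a_closed or b_closed)


# Both channels see a real demand, but channel A's relay is stuck closed.
a = contact_closed(True, Fault.STUCK_CLOSED)
b = contact_closed(True)
print(trips_2oo2(a, b))  # False: channel B alone cannot trip the system
print(trips_1oo2(a, b))  # True: channel B still opens the series path

# Channel A has a safe failure (stuck open); channel B sees a demand.
a = contact_closed(True, Fault.STUCK_OPEN)
print(trips_2oo2(a, b))  # True: 2oo2 still trips after a safe failure
```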

The excellent book Safety Instrumented Systems Verification provides a very nice diagram for a circuit that achieves high safety and high availability.

Figure 4 - A nice description of a 2oo3 system

If one of the three channels fails, the system remains available (no factory downtime), and you are still protected (kept safe).
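Both halves of that claim can be checked by enumerating every single-channel failure. This is a minimal sketch rather than a reliability calculation. It assumes a dangerous failure means the channel never demands a trip and a safe failure means the channel demands a trip spuriously. The function names are hypothetical.

```python
def trips_2oo3(demands: list[bool]) -> bool:
    """Trip when at least two of the three channels demand it."""
    return sum(demands) >= 2


def single_failure_results() -> list[tuple[int, bool, bool]]:
    """For each channel failing alone, return
    (failed channel, trips on a real demand, trips spuriously)."""
    results = []
    for failed in range(3):
        # Dangerous failure during a real demand: the failed channel
        # stays silent while the two healthy channels demand a trip.
        real_demand = [ch != failed for ch in range(3)]
        # Safe failure with no real demand: only the failed channel
        # demands a trip.
        spurious = [ch == failed for ch in range(3)]
        results.append((failed, trips_2oo3(real_demand), trips_2oo3(spurious)))
    return results


for failed, still_safe, spurious_trip in single_failure_results():
    print(f"channel {failed} failed: trips on demand={still_safe}, "
          f"spurious trip={spurious_trip}")
```

In every case the system still trips on a real demand and does not trip on a single spurious one, which is the safety plus availability combination described above.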

And now it gets a bit more controversial, but nothing too serious. IEC 61508-6:2010 describes several circuits, including 1oo1, 1oo2, 2oo2, 1oo2D, and 2oo3. The controversial bit is whether the “D” in 1oo2D stands for Diagnostics or Degraded. It doesn’t really matter since the functionality is the same either way.

Figure 5 - 1oo2D from IEC 61508-6:2010

Looking at the above description from IEC 61508-6:2010, since both “channels need to demand the safety function before it takes place,” it starts out as 2oo2. However, if a failure is detected in either channel, that channel is switched off (its relay opened, see previous diagrams), and it becomes a 1oo1D system. That means the “D” surely stands for degraded.
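The degradation described in this paragraph can be sketched as follows. This is a simplified model of the reading given here, not the IEC 61508-6 reliability model. It assumes that a channel with a detected fault is switched off by opening its relay, which has the same effect on the output as that channel demanding a trip. The function name and parameters are hypothetical.

```python
def trips_1oo2d(demand_a: bool, demand_b: bool,
                fault_detected_a: bool = False,
                fault_detected_b: bool = False) -> bool:
    """Simplified 1oo2D vote as read in the text above.

    Healthy: both channels must demand the safety function (2oo2).
    Detected fault: that channel's relay is opened, so the remaining
    channel decides on its own (1oo1D).
    """
    relay_a_open = demand_a or fault_detected_a
    relay_b_open = demand_b or fault_detected_b
    return relay_a_open and relay_b_open


# Healthy system behaves as 2oo2.
print(trips_1oo2d(True, False))   # False
print(trips_1oo2d(True, True))    # True

# Fault detected in channel A: channel B alone now decides.
print(trips_1oo2d(False, True, fault_detected_a=True))   # True
print(trips_1oo2d(False, False, fault_detected_a=True))  # False

# Faults detected in both channels: both relays open, so the system trips.
print(trips_1oo2d(False, False, True, True))  # True
```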

If we go to the PLC (programmable logic controller) safety standard IEC 61131-6, we find this picture of a 1oo1D safety system. It's only 1 channel, so there is no degraded mode for it, but it does have two relays, one controlled by the diagnostics and one by the functional channel. So the “D” must stand for the diagnostic channel having its own switch/relay. There are redundant means to achieve a safe state. This is sometimes called an SMOD (secondary means of disconnection).

Figure 6 - 1oo1D system according to IEC 61131-6

The bottom line is that if someone says they have a MooND system, you might want to ask for more details.

We didn’t even get to active redundancy, load sharing, or standby redundancy. I will keep those for another day.

If you want to learn more, a very nice chart on slide 4 of this presentation compares 7 different versions of MooN circuits.

Relevant previous blogs in this series include:

  • One on on-chip redundancy – see here.
  • One on availability vs safety – see here.
  • For a more detailed look at the 1oo2 architecture – see here.

Check back next month on the second Tuesday for the next blog in this series. Until then, I hope to post “mini blogs” on the other Tuesdays in the month directly from my LinkedIn account. Please follow me on LinkedIn if you are interested.

For previous blogs in this series, see here.

For the full suite of ADI blogs on the EngineerZone platform, see here.

For the full range of ADI products, see here.