I have taken the title of this blog from the book of the same name, published in 1993 and written by Lauren Ruth Wiener. I loved the title, and when I saw it could be bought second-hand on Amazon.com for less than $5 I bought it (I just checked and it is still available from as little as $1.99). I would highly recommend the investment; the book is an easy read requiring no great knowledge of functional safety (it largely predates functional safety). The book features an introduction by David Parnas, who spent some time as a professor of software engineering at the University of Limerick until his retirement in 2008 (UL is 5 miles down the road from where I am writing this). The back cover contains a review comment by Peter G. Neumann stating “Lauren Wiener’s book is a very readable, informative, and compelling view of the primary risks inherent in computer software. It clearly reflects the realities of today and why they can be expected to remain the worries of tomorrow”. If the book is good enough for Parnas, Neumann, and Co., it is worth $5!

I have been meaning to write more blogs on functional safety for software, and perhaps this will start me off; perhaps I will even do a few more book reviews, as I have built up a backlog of books to read, some old, some new. This blog features excerpts and quotes from the book which caught my eye. Obviously, the bits which catch my eye may differ from those which you, the reader, would find interesting.

Firstly, the book is interesting in that some of the future technology it mentions is actually here and working today; examples include video phones and self-parking cars. However, many of the problems it highlights remain unchanged. For instance, self-driving cars are mentioned, but they are still not on the market. It worries me that something seen as imminent 25 years ago, yet still described as imminent today, may be permanently imminent.

To start with an early quote: “Where most product labels have an explicit warranty, software products carry an explicit disclaimer”. It almost implies that the supplier knows their software is buggy!

“Software may be inherently unreliable, but the typical software development process is not apt to improve matters”. I would hope that high-integrity processes such as those given in IEC 61508-3, published in 1998 (five years after the book), would have helped in that case. However, its avionics equivalent, DO-178B, was released in 1992, so perhaps not. Several of the examples given in the book relate to avionics.

Reasons given in the book for software being unreliable include:

  • Software has more states than the human brain can comprehend
  • If you accept that software is inherently buggy, with an infinite number of things that can go wrong, then even if you remove a large number of bugs an infinite number still remain. Seen this way, although adding behavior is the task software developers concentrate on, “writing software is a matter of trying to take out as many bugs as possible”.
  • Software depends not only on the present state but also on what has gone before
  • Software isn’t analog in nature. A bridge over a river, for instance, will give warnings as the stress on it gradually becomes too large; it follows the laws of physics. Similarly, the book reminds us that “real pianos don’t have to be told to maintain their timbres; they can’t help it” whereas “Software knows nothing about real-world physical constraints”. Another example given in the book is the accelerator (gas pedal) in a car. With a mechanical accelerator, a small change in the pressure applied to the pedal leads to a small change in the car’s speed, and a large change in pedal pressure to a large change in speed, whereas a computer system is discontinuous: a software-generated output can change drastically when even a single bit flips (see the sketch after this list).
  • The interface between software and the real world is a rich source of bugs
  • “The client does not know what he wants. The client usually does not know what questions must be answered, and he has almost never thought of the problem in the detail necessary for the specification”.
  • When the product release schedule is running late, it turns out that “the only stage of the software life cycle that doesn’t turn out to be optional, in a pinch, is writing the code”. Perhaps optional isn’t the word I would have used, but certainly a lot of things can be skipped if you must make the planned release date.
  • Customers and developers live in different worlds. Customers think about air flow and accounts payable, but software developers are thinking about data structures and module interfaces.
  • Software has no common sense. It doesn’t know that when you turn the steering wheel in your car to the right, the car is supposed to go right. There is no physical link between the steering wheel and the wheels.
  • Even if you systematically test against the specification, and so know that the system does everything it was supposed to do, you still have to ask what else it does that wasn’t in the specification.
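
To make the discontinuity point concrete, here is a minimal Python sketch (my own illustration, not from the book) of how flipping a single bit in a stored value can be either harmless or catastrophic depending on which bit it is; the `speed` value and bit positions are purely illustrative:

```python
import struct

def flip_bit(value: float, bit: int) -> float:
    """Return `value` with one bit of its 64-bit IEEE 754 encoding flipped."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", bits ^ (1 << bit)))
    return flipped

speed = 50.0  # e.g. a commanded speed in km/h (illustrative value)
print(flip_bit(speed, 0))   # lowest mantissa bit: ~50.000000000000007 (harmless)
print(flip_bit(speed, 61))  # one exponent bit: ~6.7e+155 (catastrophic)
```

A mechanical linkage cannot make a jump like that; a software output can.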

There are several references to failures caused by poor change management, including many I had never heard of; I guess I had better things to do back around 1993/94. One such failure involved a change to three lines, out of millions of lines of code, in a telephone switching program covering large parts of the USA. Because only three lines were changed, the full 13-week test run wasn’t repeated, and guess what: the system crashed. Another relevant quote I like is that “it may not be too hard to make the change you want, but it’s seldom simple to make only the change you want”. As regards software maintenance, the book tells us that “when you pay for software maintenance, what you are mostly paying for is for the company to fix the bugs it did not find during its own testing”. Later the book’s author tells us “developers seldom maintain their own software for two reasons: prestige and job mobility”, but on a more positive note, “the more code you have, the more it will cost to develop and the harder it will be to maintain. The way to save money on software development is to develop less software. Lines of code are a debit, not an asset”.

An interesting example relates to GM’s Hamtramck factory in Detroit, Michigan, which opened in 1985. It featured 50 AGVs (automated guided vehicles) and over 260 robots. Needless to say, given the topic of this book, it didn’t work out so well; see here for more. Having said that, I can still find articles dated 2018 about the same facility, so they must have sorted it out eventually.

Another interesting discussion relates to the computer architecture for the Space Shuttle, which featured five identical computers, with the fifth one running diverse software to act as a backup. A train control system featuring three redundant computers is also discussed. It could track the location of the train using sensors in the wheels which contacted the track to give its location, or at least it could until the trains changed from clutch brakes to disk brakes. It turns out the clutch brakes had an unexpected feature, one not designed in or specified but there all the same: in autumn they scrubbed fallen leaves from the wheels so that contact was maintained. The disk brakes did not. The eventual fix was simple, namely to make sure that at least one carriage in every train had the older brakes.
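
The pattern behind both examples is redundancy with voting: run multiple channels and accept the answer the majority agrees on. As a rough sketch (my own, not the actual shuttle or train code), a 2-out-of-3 majority vote might look like this:

```python
from collections import Counter

def majority_vote(readings):
    """2-out-of-3 vote over the outputs of three redundant channels.

    Returns the value at least two channels agree on; if all three
    disagree there is no safe answer and the system must fail safe.
    """
    value, count = Counter(readings).most_common(1)[0]
    if count < 2:
        raise RuntimeError("no majority - fail to a safe state")
    return value

print(majority_vote([120, 120, 118]))  # one faulty channel is outvoted -> 120
```

The leaves-on-the-wheels story shows the limit of the pattern, though: voting protects against a fault in one channel, not against a common cause such as every wheel sensor losing contact with the track at the same time.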

The navigation software for the F-16 fighter also gets a mention. Early versions of the software had a bug which caused the jet to flip when it crossed the equator. However, the author does admit this bug was caught in simulation rather than in flight.
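
The book doesn’t spell out the defect, but as a contrived sketch of the general class of bug (entirely hypothetical; the function, formula, and names are mine, not the F-16 code), consider a control calculation that silently assumes latitude is never negative:

```python
import math

def commanded_roll(heading_error_deg: float, latitude_deg: float) -> float:
    """Hypothetical: a correction term that implicitly assumes latitude >= 0.

    South of the equator latitude_deg goes negative, the gain flips
    sign, and the commanded roll inverts - the simulated jet flips over.
    """
    gain = math.copysign(1.0, latitude_deg)  # +1 north, -1 south: the bug
    return gain * heading_error_deg

print(commanded_roll(5.0, 53.0))   #  5.0 - behaves correctly up north
print(commanded_roll(5.0, -0.01))  # -5.0 - inverts just south of the equator
```

It echoes the earlier point about testing: no amount of flying over the northern hemisphere would ever have revealed it.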

As regards documentation, the book reminds us that “programmers like to program, but they are notorious for hating to write” and “In my experience, most programmers work best during the rewarding problem-solving and coding phases of a project. When it is time for tedious testing and mindless paperwork, their enthusiasm dwindles, and with it their effectiveness. Sermons about rigor make them feel guilty, but rarely affect their work habits”.

It’s fair to say the author is not a fan of software being used in safety-critical systems (“The role of software in safety critical systems should always be a limited one”), nor of safety standards, suggesting that there is no evidence that they lead to safer software.

Looking at the examples of misbehaving software that the book’s author has used, I wonder what examples someone would use if writing the same book today. I imagine it would have to include the Boeing 737 MAX software issue (Boeing’s share price is down almost 25% since its peak in March) and the Toyota unintended acceleration issue.

All in all, I recommend you read the book even though it is over 25 years old. I am a member of the IEC 61508-3 working group, and reading the older material gives me a better insight into the history behind the standard and the problems it was designed to solve. Knowing why something is there gives you the confidence to change or remove it. Luckily we still have some members of the committee who have been there since the start.
