Safe Through and Through: Developing For Safety Related Systems

Software systems should always be both robust and reliable; however, the moment you introduce a safety element, this need for reliability increases significantly. The level of safety required is governed by the severity and frequency of the hazards identified. Though safety related systems do not carry the full responsibility for hazards such as loss of life that safety critical systems do, a malfunction or failure would still pose a significant risk to the people involved, e.g. the passengers on a train.

Everyone will be aware of software development lifecycles, but there are a few key differences at each stage when developing for safety related systems.

For the purpose of this article, we will be using the V-model, which is one of the lifecycles recommended by safety standards such as EN 61508 and EN 50128. To illustrate the differences covered below, we will use the worked example of Driver Only Operation (DOO) camera systems. These systems are used in sections of the UK rail network so that drivers may operate trains alone, without the need for a conductor.

Requirements

A software system’s requirements will always be at the heart of any software development, and as such, it is important to define the requirements of a project before starting to write the code. Where the biggest difference makes itself known is in the need for identifying safety requirements.

Safety requirements are the sets of requirements that are defined with the sole purpose of reducing the hazards associated with the project. In the case of our worked example, the hazard is a passenger becoming trapped in the doors and dragged along the platform, and the risk is that the image presented to the driver is not live. A safety requirement for a DOO system would therefore be to ensure that drivers are made aware of instances where images are frozen. Informing drivers when the CCTV footage they are receiving does not reflect the live status of the Platform Train Interface (PTI) is a way of mitigating that risk.

Taking great care to ensure that each safety requirement is both atomic and testable will consequently reduce the risk presented in the remaining development stages.

Design & Implementation

Moving the topic of safety requirements through into the design and implementation phases, you must ensure that each requirement can be traced through into your design specifications. Breaking down all of your designs to at least a component level is essential so that someone else, such as an assessor, can determine that the design will meet the needs of the requirement. It will also allow you to test each component in isolation from the rest of the system, focusing on a simpler part of the system rather than trying to test the whole thing at once. This, in turn, will help ensure the robustness of the system under test.

Returning to our worked example, in your design specifications you would outline a process where, in layman’s terms, the system would be expected to compare sequences of frames captured from a camera for the continued repetition of the same frame. By stating this process in detail in your documentation, you will ensure that its functionality will be put under test when it comes to verification and validation activities. It is also important during the design stage to cover aspects such as ensuring the integrity of data, particularly stored data, and checking memory, as these things can all have an impact on the execution of the software. It is important that if the software fails, it fails leaving the system in a safe state.
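The frame comparison described above can be sketched in a few lines of Python. This is an illustration only, not the design of any real DOO system: the class name FrozenFrameDetector, the max_repeats threshold and the use of a SHA-256 digest to compare frames are all assumptions made for the example. A real design would also need to state how the threshold is derived and how the comparison copes with compression and sensor noise.

```python
import hashlib
from typing import Optional


class FrozenFrameDetector:
    """Flags a feed as frozen when the same frame repeats too many times in a row.

    Hypothetical sketch: the threshold and the digest-based comparison are
    assumptions for illustration, not taken from any standard or product.
    """

    def __init__(self, max_repeats: int = 5) -> None:
        if max_repeats < 1:
            raise ValueError("max_repeats must be at least 1")
        self._max_repeats = max_repeats
        self._last_digest: Optional[bytes] = None
        self._repeat_count = 0

    def update(self, frame: bytes) -> bool:
        """Feed one frame; return True if the feed should be treated as frozen."""
        digest = hashlib.sha256(frame).digest()
        if digest == self._last_digest:
            # Same content as the previous frame: one more repetition.
            self._repeat_count += 1
        else:
            # New content: remember it and restart the count.
            self._last_digest = digest
            self._repeat_count = 0
        return self._repeat_count >= self._max_repeats


if __name__ == "__main__":
    detector = FrozenFrameDetector(max_repeats=3)
    # Four distinct frames, then the same frame delivered five times.
    feed = [bytes([i]) for i in range(4)] + [b"same"] * 5
    for index, frame in enumerate(feed):
        print(index, "FROZEN" if detector.update(frame) else "live")
```

With max_repeats set to 3, the detector reports a freeze on the fourth consecutive delivery of the same frame, i.e. after three repetitions of the original.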

Shifting focus more towards implementation, you should also be able to trace your safety requirements down into your developed code. Excess code poses a significant risk to the integrity of your software, as it is most likely not needed and will make the software more complex. In addition, there is a chance, albeit a low one, that it will be overlooked during testing. So for a given line of code, you would expect to be able to trace it back to a single design statement, which should then be traceable back to a requirement.
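As a rough illustration of that traceability chain, the sketch below checks a set of trace links for gaps in either direction. The identifiers (SR-001, DS-010 and so on), the function name find_trace_gaps and the idea of holding the links in plain dictionaries are hypothetical. It also traces at the level of named code units rather than individual lines, purely to keep the example short.

```python
from typing import Dict, List


def find_trace_gaps(
    requirements: Dict[str, str],
    design_to_requirement: Dict[str, str],
    code_to_design: Dict[str, str],
) -> Dict[str, List[str]]:
    """Report every break in the requirement -> design -> code chain."""
    return {
        # Requirements that no design statement claims to satisfy.
        "requirements_without_design": sorted(
            set(requirements) - set(design_to_requirement.values())
        ),
        # Design statements that point at a requirement that does not exist.
        "design_without_requirement": sorted(
            d for d, r in design_to_requirement.items() if r not in requirements
        ),
        # Design statements that no code unit implements.
        "design_without_code": sorted(
            set(design_to_requirement) - set(code_to_design.values())
        ),
        # Code units with no known design statement: candidate excess code.
        "code_without_design": sorted(
            c for c, d in code_to_design.items() if d not in design_to_requirement
        ),
    }


if __name__ == "__main__":
    # All identifiers below are made up for the example.
    requirements = {"SR-001": "Alert the driver when the displayed image is frozen"}
    design = {"DS-010": "SR-001", "DS-011": "SR-001"}
    code = {
        "detect_frozen_frames": "DS-010",
        "raise_driver_alert": "DS-011",
        "legacy_overlay": "DS-099",  # traces to nothing in the design
    }
    for gap, items in find_trace_gaps(requirements, design, code).items():
        print(gap, items)
```

In this made-up data set the only gap reported is legacy_overlay, which is exactly the kind of excess code that a review would be expected to question.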

It is also during the implementation stage that you should be applying coding standards in combination with defensive programming.

While coding standards are important for any software development, they are mandatory in the development of safety related systems. By applying these standards, you will ensure consistency in the quality of your code, introduce security right from the start of development and help ensure compliance with the relevant industry standards. At this point it is worth mentioning the availability of tools, such as static analysers, that can check code automatically for defects and for points of non-compliance with your coding standard and other industry best practice rules.

Defensive programming practices, e.g. checking for null values and validating input parameters, help to ensure the continued function of software under unexpected or unforeseen circumstances, which in turn works towards the necessary reliability and robustness.
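A minimal sketch of those checks is shown below, again in Python and again hypothetical. The function choose_display_state, the DisplayState values and the MAX_FRAME_AGE_MS limit are invented for the example, and it assumes that alerting the driver is the safe state for a DOO display. The point is only that every unexpected input leads to that safe state rather than to an exception or to a stale image being shown as live.

```python
from enum import Enum
from typing import Optional


class DisplayState(Enum):
    LIVE = "live"    # image may be presented to the driver as live
    ALERT = "alert"  # assumed safe state: driver is told not to trust the image


# Hypothetical limit; a real value would come from the safety requirements.
MAX_FRAME_AGE_MS = 500


def choose_display_state(
    frame: Optional[bytes], frame_age_ms: Optional[int]
) -> DisplayState:
    """Return LIVE only when every input check passes; otherwise fail safe."""
    # Null checks: a missing frame or timestamp is never treated as live.
    if frame is None or frame_age_ms is None:
        return DisplayState.ALERT
    # Type and content checks on the frame data.
    if not isinstance(frame, (bytes, bytearray)) or len(frame) == 0:
        return DisplayState.ALERT
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(frame_age_ms, bool) or not isinstance(frame_age_ms, int):
        return DisplayState.ALERT
    # Range check: negative or excessive ages both indicate a fault.
    if frame_age_ms < 0 or frame_age_ms > MAX_FRAME_AGE_MS:
        return DisplayState.ALERT
    return DisplayState.LIVE


if __name__ == "__main__":
    print(choose_display_state(b"\x00\x01", 40))   # valid input
    print(choose_display_state(None, 40))          # missing frame
    print(choose_display_state(b"\x00\x01", 900))  # frame too old
```

Returning a state rather than raising an exception is a design choice made for this sketch: the caller always receives a value it can act on, and the default outcome of any doubt is the alert.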

Verification and Validation

Commonly mistaken for the final stage in the development process, V&V activities should be integrated throughout the whole project lifecycle. You must be able to demonstrate that independent peer reviews have been carried out at every single stage of the development process. While this may not feel like the most valuable use of time, these verification activities provide an opportunity to highlight instances of missing or overlooked requirements, and to confirm that nothing extra has been added.

A second important factor is the V&V plan. This documentation should be defined and recorded during the very early stages of a project to both describe the V&V activities that will be carried out throughout the project, and demonstrate how compliance with the designated safety standard will be achieved.

As you carry out your testing you should expect to see that every single one of your safety requirements has been met. In terms of our DOO example, you would be expected to force a ‘freezing’ incident with footage designed for that purpose and expect the software to return a result where the driver is alerted to the incident.
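A test of this kind might look like the sketch below, written with Python's standard unittest module. Everything in it is a stand-in: driver_alerted represents the system under test, make_footage builds synthetic footage with a deliberate freeze at the end, and the threshold of three repetitions is an assumed value. In a real project the test would exercise the actual software, and each test case would be traced to the safety requirement it verifies.

```python
import unittest
from typing import List


def driver_alerted(frames: List[bytes], max_repeats: int = 3) -> bool:
    """Stand-in for the system under test (hypothetical).

    Returns True if any frame is repeated max_repeats times in a row.
    """
    repeats = 0
    previous = None
    for frame in frames:
        repeats = repeats + 1 if frame == previous else 0
        previous = frame
        if repeats >= max_repeats:
            return True
    return False


def make_footage(live_frames: int, frozen_repeats: int) -> List[bytes]:
    """Build test footage: distinct frames, then the last one repeated."""
    if live_frames < 1:
        raise ValueError("footage needs at least one live frame")
    live = [bytes([i % 256]) for i in range(live_frames)]
    return live + [live[-1]] * frozen_repeats


class TestFrozenImageAlert(unittest.TestCase):
    def test_live_footage_raises_no_alert(self) -> None:
        self.assertFalse(driver_alerted(make_footage(10, 0)))

    def test_forced_freeze_alerts_driver(self) -> None:
        self.assertTrue(driver_alerted(make_footage(10, 3)))

    def test_freeze_below_threshold_raises_no_alert(self) -> None:
        self.assertFalse(driver_alerted(make_footage(10, 2)))
```

Saved to a file, the tests can be run with python -m unittest. Testing just below the threshold as well as at it shows that the alert is raised when required and not before.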

Once all your testing activities have been concluded, the very last thing you would be expected to do is produce a V&V report. The purpose of this report is to confirm that you have completed all of the activities that you outlined in the V&V plan and to provide the justification for any instances where a given activity was not carried out. Looking to the future, if at any point a change is made to the software, a V&V report must be produced for each new version release. Though it may be additional effort, this ensures that the required lifecycle activities have been completed against any changes made and that the system continues to function as expected.

The next big question is: if the techniques outlined in this article produce such reliable software, why don’t we apply them to every project? Here at Zircon we have developed a software development lifecycle that combines the aforementioned techniques to be more agile as well as robust and SIL2 compliant.