The pair of SafetyProviderID and SafetyBaseID is used by the SafetyConsumer to check the authenticity of the ResponseSPDU. SafetyProviderID and SafetyBaseID are usually assigned during engineering or during commissioning. It is the responsibility of the end user or OEM to assign a unique SafetyProviderID to each individual SafetyProvider whenever this is reasonably possible. For instance, a machine builder should assign unique SafetyProviderIDs within a single machine.

As the effort for the administration of unique IDs will reach its limits when the system becomes large, OPC UA Safety uses the SafetyBaseID for cases where guaranteeing unique IDs is not possible.

A SafetyBaseID is a universally unique identifier version 4 (UUIDv4, also called globally unique identifier (GUID)), as described in https://tools.ietf.org/html/rfc4122. Basically, it is a 128-bit number where more than 96 bits are chosen randomly. The probability that two randomly generated UUIDs are identical is extremely low ($2^{-96} < 10^{-28}$) and can therefore be neglected, even when considering applications with a safety integrity level of 4.

It is not necessary to generate an individual UUID for all SafetyProviders. If two SafetyProviders can be discriminated by their SafetyProviderIDs, they may share the same SafetyBaseID. For instance, a machine builder might generate a SafetyBaseID for each instance of a machine, which is re-used for all SafetyProviders within a machine.

When implementing or using a generator for the UUIDs, it has to be ensured that each possible value is generated with equal probability (discrete uniform distribution) and that the generated values are pairwise independent of each other. When a pseudo random number generator (PRNG) is used, it is ‘seeded’ with a random source having enough collision entropy (e.g. seeds of at least 128 bits that are also uniformly distributed, with all seeds being pairwise independent of each other).

Most commercial systems offer random number generators for applications within a cryptographic context. These applications impose even stricter requirements on the quality of random numbers than the ones mentioned above. Hence, cryptographically strong random number generators are considered to be applicable to OPC UA Safety as well. See References [2]-[5] for detailed information.

Table 30 shows implementations of cryptographically strong random number generators that can be used to calculate the random part of the UUIDv4:

Table 30 – Examples for cryptographically strong random number generators.

| Environment | Function |
|---|---|
| Microsoft® Windows® Operating Systems | BCryptGenRandom, found in Bcrypt.dll |
| Unix®-like OS (e.g. Linux® / FreeBSD® / Solaris®) | Read from the file /dev/urandom |
| .NET® | RandomNumberGenerator from System.Security.Cryptography |
| JavaScript® | Crypto.getRandomValues() |
| Java® | java.security.SecureRandom |
| Python® | os.urandom(size) |

While these implementations have been evaluated from a security point of view, probably none of them has been validated with safety in mind. Therefore, there is a remaining risk that these implementations are subject to systematic implementation errors which might decrease the effectiveness of these random numbers. To overcome this problem, the output of the random number generator is not used directly; instead, a SHA-256 hash is calculated over (1) the generator's output, (2) a timestamp (wall-clock time or persistent logical clock) and (3) a unique domain name. Any bits of the SHA-256 hash can then be used to construct the random parts of the UUIDv4.
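The hashing step described above can be sketched in Python as follows. This is a minimal illustration, not a validated implementation: the function name `generate_safety_base_id`, the length-prefixed concatenation of the three inputs, the use of `time.time_ns()` as the timestamp, and the choice of the first 16 hash bytes are all assumptions made for this example.

```python
import hashlib
import os
import time
import uuid


def generate_safety_base_id(domain_name: str) -> uuid.UUID:
    """Derive a UUIDv4 from a SHA-256 hash over RNG output, timestamp and domain name."""
    rng_output = os.urandom(32)                      # (1) output of the OS random generator
    timestamp = time.time_ns().to_bytes(16, "big")   # (2) wall-clock time (assumed: nanoseconds)
    domain = domain_name.encode("utf-8")             # (3) unique domain name

    # Assumption: each field is length-prefixed so the concatenation is unambiguous.
    hasher = hashlib.sha256()
    for field in (rng_output, timestamp, domain):
        hasher.update(len(field).to_bytes(4, "big"))
        hasher.update(field)
    digest = hasher.digest()

    # Any 16 bytes of the hash may be used; version=4 makes uuid.UUID
    # overwrite the version and variant bits as required by RFC 4122.
    return uuid.UUID(bytes=digest[:16], version=4)
```

A call such as `generate_safety_base_id("machine-1.example.com")` (a hypothetical domain name) returns a `uuid.UUID` object whose `version` attribute is 4.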

[RQ11.1] The parameters SafetyBaseID and SafetyProviderID shall be stored in a nonvolatile way (i.e. persistent).

The SafetyConsumerID allows for discrimination between RequestSPDUs and ResponseSPDUs belonging to different SafetyConsumers. It is mainly used for diagnostic purposes, such as detecting unintentional concurrent access of multiple SafetyConsumers on a single SafetyProvider. Safety-related communication errors which are detected by checking the SafetyConsumerID would also be detected by other mechanisms, including the MNR, the SafetyProviderID, and the SafetyConsumerTimeOut.

From a safety point of view, there are no qualitative requirements regarding the generation or administration of the SafetyConsumerID. It can be assigned during engineering, during commissioning, or at startup, and may even change during runtime. It is not required to check for uniqueness of SafetyConsumerID.

However, assigning identical SafetyConsumerIDs to multiple consumers is not recommended because fault localization may become more difficult.

The MNR is used to discriminate telegrams stemming from the same SafetyProvider and is therefore used to detect timeliness errors such as outdated telegrams, telegrams received out-of-order, or streams of telegrams erroneously repeated by a network storing element (e.g. a router).

[RQ11.2] To be effective, the set of actually used MNR-values shall not be restricted to a small set. This could happen for connections which are restarted frequently, and which start counting from the same MNR value each time.

There are at least two ways to address this potential problem:

Option 1: Whenever the connection is terminated, the current value of the MNR shall be safely stored within non-volatile memory of the SafetyConsumer. After restart, the previously stored MNR is used for initialization of the MNR (i.e. in state S12 of the SafetyConsumer state machine).

Option 2: Whenever the SafetyConsumer is restarted (i.e. in state S12 of the SafetyConsumer state machine), the MNR is initialized with a 32-bit random number.
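Both options can be sketched in Python as follows. This is only an illustration: the function names and the plain text file standing in for non-volatile memory are assumptions, and a real device would need storage that satisfies the "safely stored" requirement of Option 1. The fallback to a random value when no stored MNR exists is also an assumption of this sketch.

```python
import secrets
from pathlib import Path

MNR_MODULUS = 2**32  # the MNR is treated as a 32-bit unsigned value here


def init_mnr_random() -> int:
    """Option 2: initialise the MNR with a 32-bit random number."""
    return secrets.randbits(32)


def store_mnr(path: Path, mnr: int) -> None:
    """Option 1, part 1: persist the current MNR when the connection terminates."""
    path.write_text(str(mnr % MNR_MODULUS))


def load_mnr(path: Path) -> int:
    """Option 1, part 2: restore the stored MNR on restart.

    Assumption: if nothing usable is stored, fall back to Option 2.
    """
    try:
        return int(path.read_text()) % MNR_MODULUS
    except (FileNotFoundError, ValueError):
        return init_mnr_random()
```

Either way, restarted connections no longer begin counting from the same MNR value each time.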

Following IEC 61784-3, OPC UA Safety uses a black-channel approach to detect all communication errors which can possibly occur in the underlying OPC UA stack. If an error is detected, the erroneous data is discarded. Moreover, OPC UA Safety is designed in such a way that a safety function becomes practically unusable if the failure rate in the black channel is higher than one error per safety error interval limit (6, 60, or 600 minutes, depending on the desired SIL of the safety function; see Table 17 and Table 31).

Thus, for operational safety functions a failure rate of $0{,}1\,\mathrm{h}^{-1}$, $1\,\mathrm{h}^{-1}$, or $10\,\mathrm{h}^{-1}$ can be assumed for communication errors occurring in the black channel. In order to obtain the communication’s contribution to the PFH-value of the safety function, this value has to be multiplied by the so-called conditional residual error probability $P_{re,cond}$. For the CRC-mechanism used in OPC UA Safety, it holds:

$$P_{re,cond} \le 4.0 \times 10^{-10}$$

This leads to the PFH and PFD values shown in Table 31.
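The arithmetic behind Table 31 can be reproduced with a short Python sketch. The PFH is the assumed black-channel error rate (one error per SafetyErrorIntervalLimit) multiplied by the bound on the conditional residual error probability. The source does not state how PFDavg is derived; the tabulated values are consistent with the common approximation PFDavg ≈ PFH · T / 2 for a mission time T of 20 years at 8760 hours per year, which is the assumption used here. All function and constant names are illustrative.

```python
P_RE_COND = 4.0e-10          # upper bound on the conditional residual error probability
MISSION_TIME_H = 20 * 8760   # 20 years in hours (assumption: 8760 h per year)


def pfh(limit_minutes: float) -> float:
    """Residual error rate per hour for a given SafetyErrorIntervalLimit."""
    error_rate_per_hour = 60.0 / limit_minutes  # one detected error per interval
    return error_rate_per_hour * P_RE_COND


def pfd_avg(limit_minutes: float) -> float:
    """Average probability of failure on demand (assumed: PFH * T / 2)."""
    return pfh(limit_minutes) * MISSION_TIME_H / 2


for minutes in (6, 60, 600):
    print(f"{minutes:>3} min: PFH < {pfh(minutes):.1e} / h, PFDavg < {pfd_avg(minutes):.3e}")
```

For a limit of 6 minutes this gives 10 errors per hour times 4.0e-10, i.e. 4.0e-9 per hour, and 4.0e-9 times 87600 hours, i.e. 3.504e-4.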

The value $4.0 \times 10^{-10}$ was justified by extensive numerical evaluation of the 32-bit CRC generator polynomial in use (0xF4ACFB13). The results of this evaluation, executed for all relevant data lengths and all relevant values of the bit error probability p, are shown in Figure 23. As can be seen, $P_{re,cond}$ never exceeds the value $4.0 \times 10^{-10}$.

Figure 23 – Conditional residual error probability of the CRC-check.

Figure 24 illustrates why it is indeed necessary to calculate $P_{re,cond}$ for all user data lengths and all relevant values of p. For the data lengths shown in this figure, $P_{re,cond}$ exceeds the desired value by several orders of magnitude. Note that the maximum value of $P_{re,cond}$ is not obtained when p becomes maximal.

Figure 24 – Counter example: data lengths not supported by OPC UA Safety.

The boundary conditions and assumptions for safety assessments and calculations of residual error rates are listed here.

Generally:

  • Number of retries in the black channel: No restrictions
  • Black Channel CRC polynomials: No restrictions
  • Message storing elements: No restrictions; any number of message storing elements is permitted
  • Size of SafetyData within one SPDU: ≤ 1500 bytes

Note: Even for safety functions which do not require manual operator acknowledgment for restart, manual operator acknowledgment is mandatory whenever the SafetyConsumer has detected certain types of errors and indicates this using OperatorAckRequested. Hence, operator acknowledgment is expected to be implemented by the safety application whenever OPC UA Safety is used. For details, see Clause 7.4.2 and Annex B.2.

The PFH-value of a logical OPC UA Safety communication link depends on the parameter SafetyErrorIntervalLimit (see Table 17) of the link’s SafetyConsumer. Whenever the SafetyConsumer detects a mismatch of the SafetyConsumerID, SPDU_ID, MNR or CRC-checksum, it will only continue operating if the last occurrence of such an error happened more than SafetyErrorIntervalLimit time units ago. Otherwise, it will make a transition to fail-safe values, which can only be left by manual operator acknowledgment, see Clause 7.4.2.

This directly limits the rate of detected errors, and indirectly limits the rate of undetected (residual) errors.
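The rate-limiting behaviour described above can be sketched in Python as follows. This is a simplified illustration and not the SafetyConsumer state machine: the class name, the use of seconds as the time unit, and the assumption that the very first detected error is tolerated are all choices made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ErrorIntervalMonitor:
    limit_s: float                        # SafetyErrorIntervalLimit, here in seconds
    last_error_s: Optional[float] = None  # time of the previous detected error
    fail_safe: bool = False               # True while fail-safe values are in use

    def on_error(self, now_s: float) -> bool:
        """Record a detected error; return True if operation may continue."""
        too_soon = (
            self.last_error_s is not None
            and (now_s - self.last_error_s) <= self.limit_s
        )
        self.last_error_s = now_s
        if too_soon:
            self.fail_safe = True  # can only be left by operator acknowledgment
        return not self.fail_safe

    def operator_acknowledge(self) -> None:
        """Manual operator acknowledgment leaves the fail-safe state."""
        self.fail_safe = False
```

With `limit_s=360.0` (6 minutes), an error at t = 0 s is tolerated, while a second error at t = 100 s forces fail-safe values until `operator_acknowledge()` is called.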

See Table 31 for numeric PFH- and PFD-values.

Table 31 – The total residual error rate for the safety communication channel

| SafetyErrorIntervalLimit | Allowed for SIL range | Total residual error rate for one logical connection of the safety function (PFH) | Total residual error probability for one logical connection of the safety function, for a mission time of 20 years (PFDavg) |
|---|---|---|---|
| 6 minutes | Up to SIL 2 | $< 4{,}0 \times 10^{-9}$ / h | $< 3{,}504 \times 10^{-4}$ |
| 60 minutes | Up to SIL 3 | $< 4{,}0 \times 10^{-10}$ / h | $< 3{,}504 \times 10^{-5}$ |
| 600 minutes | Up to SIL 4 | $< 4{,}0 \times 10^{-11}$ / h | $< 3{,}504 \times 10^{-6}$ |

Note: the estimates for PFDavg are conservative. More accurate values will be provided in the future.

Note: the parameter SafetyErrorIntervalLimit affects only the PFH/PFD of the safety communication channel. There is no effect on the PFH/PFD-values of the network nodes the SafetyProviders and SafetyConsumers are running on. The requirements for the implementation of these nodes are specified in IEC 61508.

[RQ11.3] According to IEC 61508-2, the suppliers of equipment implementing OPC UA Safety shall provide a safety manual. The instructions, information and parameters of Table 32 shall be included in this manual unless they are not relevant for a specific device.

Table 32 – Information to be included in the safety manual

| Item | Instruction and/or parameter | Remark |
|---|---|---|
| 1 | Safety handling | Instructions on how to configure, parameterize, commission and test the device safely in accordance with IEC 61508 and IEC 61784-3. |
| 2 | PFH, respectively PFDavg | The PFH, respectively PFDavg, per logical connection of the safety function. See Clause 11.3.2 and Clause 11.4. |
| 3 | $SFRT_{OPCSafety}$ | Information on how this value can be calculated by the end user / OEM. See Clause 10.2. The implementation and error reaction of ConsumerCycleTime is the responsibility of the vendor/integrator. |
| 4 | SafetyBaseID / SafetyProviderID | Information on how the SafetyBaseID and SafetyProviderID are generated and assigned. See Clause 11.1.1. |
| 5 | Commissioning | The end user / OEM is responsible for verification and validation of correct cabling and assignment of network addresses. The safety manual shall address how this can be accomplished. |
| 6 | Operator Acknowledgment | If the SafetyConsumer makes a transition to fail-safe substitute values requiring operator acknowledgment “frequently”, this is an indication that a check of the installation (for example electromagnetic interference), network traffic load, or transmission quality is required. It shall be mentioned in the manual that it is potentially unsafe to simply omit these checks. “Frequently” in this context is defined as more than once per day in SIL2 and SIL3 applications, or more than once per week in SIL4 applications. |
| 7 | Duration of demand | In safety applications where the duration of a demand signal is short (e.g. shorter than the process safety time) and it is crucial that the consumer application never misses a demand, a bidirectional communication must be arranged, and the confirmation of receiving the demand at the consumer side must be implemented in the application program by sending appropriate information within the SafetyData. |
| 8 | High demand and low demand applications | The SafetyConsumer must be executed cyclically within a shorter time frame than the SafetyConsumerTimeOut. |
| 9 | Maintenance | Specific requirements for device repair and device replacement. |

[RQ11.4] The device a SafetyConsumer is running on shall be able to indicate if SAPI.OperatorAckRequested is enabled. This can be done for example by an indicator LED or using an HMI.

[RQ11.5] If an LED is used for indication, it shall blink in green color with a frequency of 0.5 Hz whenever the output SAPI.OperatorAckRequested of at least one of the SafetyConsumers running on the device is true.
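A minimal Python sketch of this indication logic is given below. It is an illustration only: the function name, the 50 % duty cycle, and the LED being off when no acknowledgment is requested are assumptions that the requirement does not specify.

```python
from typing import Iterable

BLINK_FREQUENCY_HZ = 0.5  # from the requirement: one blink period lasts 2 s


def led_is_lit(operator_ack_requested: Iterable[bool], now_s: float) -> bool:
    """Return True if the green LED should currently be on.

    operator_ack_requested holds the SAPI.OperatorAckRequested output of
    every SafetyConsumer running on the device.
    """
    if not any(operator_ack_requested):
        return False  # assumption: LED stays off when nothing is requested
    period_s = 1.0 / BLINK_FREQUENCY_HZ
    # Assumption: 50 % duty cycle, i.e. lit during the first half of each period.
    return (now_s % period_s) < period_s / 2
```

Calling the function periodically with the current time and the outputs of all SafetyConsumers yields the on/off pattern to apply to the LED.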

The message shown on an HMI is application specific. For instance, the text could read: “Machine has stopped for safety reasons. For restart, please check for obstacles and press the green button.”