Health Status – OPC Unified Architecture – Part 110: Asset Management Basics

9 Health Status

9.1 Overview

9.2 Overall health status of an asset

9.3 Asset specific information on health status

9.4 Root cause of asset specific information on health status

9.4.1 Overview

9.4.2 IRootCauseIndicationType

9.4.3 RootCauseDataType

9.5 Standardized categories of asset specific information on health status

9.5.1 Overview

9.5.2 ConnectionFailureConditionClassType

9.5.3 OverTemperatureConditionClassType

9.5.4 CalibrationDueConditionClassType

9.5.5 SelfTestFailureConditionClassType

9.5.6 FlashUpdateInProgressConditionClassType

9.5.7 FlashUpdateFailedConditionClassType

9.5.8 BadConfigurationConditionClassType

9.5.9 OutOfResourcesConditionClassType

9.5.10 OutOfMemoryConditionClassType

9.6 Tracking of health information

In order to provide the health status of assets, this specification mainly uses the concepts defined in OPC 10000-100.

The Variable 2:DeviceHealth defined in OPC 10000-100 is used to provide the overall health status of an asset. It uses an enumeration having the values NORMAL, FAILURE, CHECK_FUNCTION, OFF_SPEC and MAINTENANCE_REQUIRED (see OPC 10000-100 for details).

Note that this Variable should be used for all assets regardless of asset type, including, for example, software components. For some types of assets, not all values of the enumeration can reasonably be applied.
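As an illustration of how a Client might handle the value of 2:DeviceHealth, the following minimal sketch models the enumeration in Python. The value names are those listed above; the numeric values are assumed to follow OPC 10000-100 and should be checked against that specification. The helper names `decode_device_health` and `needs_attention` are hypothetical.

```python
from enum import IntEnum


class DeviceHealth(IntEnum):
    """Overall health status values (numeric values assumed from OPC 10000-100)."""
    NORMAL = 0
    FAILURE = 1
    CHECK_FUNCTION = 2
    OFF_SPEC = 3
    MAINTENANCE_REQUIRED = 4


def decode_device_health(raw: int) -> DeviceHealth:
    """Convert a raw integer read from 2:DeviceHealth into the enumeration."""
    try:
        return DeviceHealth(raw)
    except ValueError:
        raise ValueError(f"unexpected DeviceHealth value: {raw}") from None


def needs_attention(health: DeviceHealth) -> bool:
    """Hypothetical helper: any status other than NORMAL deserves a closer look."""
    return health is not DeviceHealth.NORMAL
```

Because the same enumeration is used for every kind of asset, a Client can apply one decoding routine to hardware and software assets alike, even when a given asset never reports some of the values.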

OPC 10000-100 defines the interface 2:IDeviceHealthType, which optionally contains the Variable 2:DeviceHealth. In older versions of OPC 10000-100 the Variable was only defined on the 2:DeviceType.

In order to support older versions of OPC 10000-100, Objects representing an asset and providing health status information shall provide the 2:DeviceHealth Variable. This should be achieved by implementing the 2:IDeviceHealthType interface, which optionally provides the 2:DeviceHealth Variable. The interface can be implemented either directly on the instance (X:Asset1) or via the TypeDefinition (X:Asset2), as shown in Figure 3. However, it is also allowed to provide only the 2:DeviceHealth Variable by using the 2:DeviceType, as done by X:Asset3. Note that in the current version of OPC 10000-100, 2:DeviceType implements 2:IDeviceHealthType.

Figure 3 – Examples of assets providing health status

The SourceTimestamp of the 2:DeviceHealth Variable provides the time when the asset entered that health status. In order to provide an accurate time, Servers should preserve this time across Server restarts. As this might not always be possible, a Client shall be aware that the SourceTimestamp is not always the time when the asset switched the health status.

To provide asset specific information on failure or maintenance conditions the alarming mechanism of OPC UA is used. The 2:IDeviceHealthType already defines the optional 2:DeviceHealthAlarms folder, where such alarm Objects can be provided. However, it is not required to provide the folder and the alarm Objects in the AddressSpace. Servers should use the AlarmTypes defined in OPC 10000-100 (2:DeviceHealthDiagnosticsAlarmType and subtypes) to expose the asset specific information. The event fields shall be used as defined in OPC 10000-5 and OPC 10000-9. The 0:SourceNode and 0:SourceName shall identify the asset. The 0:Severity should be used as defined in Table 15. Applications may refine those categories to a more detailed categorisation.

Table 15 – Usage of Severity for Alarms containing asset specific information

Severity	Classification	Description
801-1000	Critical Fault	The asset has permanently failed.
601-800	Major Recoverable Fault	The asset can no longer perform its function. Intervention is required.
401-600	Minor Recoverable Fault	The asset has experienced a problem but is able to continue operation until intervention occurs.
301-400	Maintenance Needed	Service is required to keep the asset operating within its designed tolerances.
201-300	Limited Resource Capacity Near Limit	A limited resource met a threshold beyond which a more serious fault would occur.
1-200	-	Used when the alarm is not active.
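The severity bands of Table 15 can be applied mechanically. The following sketch maps a 0:Severity value to its classification; the function name `classify_severity` and the use of `None` for the band without a classification are illustrative choices, not part of this specification.

```python
from typing import Optional

# (lowest severity, highest severity, classification) as listed in Table 15.
SEVERITY_BANDS = [
    (801, 1000, "Critical Fault"),
    (601, 800, "Major Recoverable Fault"),
    (401, 600, "Minor Recoverable Fault"),
    (301, 400, "Maintenance Needed"),
    (201, 300, "Limited Resource Capacity Near Limit"),
    (1, 200, None),  # used when the alarm is not active
]


def classify_severity(severity: int) -> Optional[str]:
    """Return the Table 15 classification for a severity, or None for 1-200."""
    for low, high, classification in SEVERITY_BANDS:
        if low <= severity <= high:
            return classification
    raise ValueError(f"severity out of range 1-1000: {severity}")
```

An application that refines the categories could subdivide a band further while keeping the band boundaries shown here.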

An asset might have several reasons why it is not operating as expected, and these reasons often depend on each other. For example, the asset MyMachine might indicate two faults: a pump is not working, and the oil pressure is too low. The second fault is caused by the first fault. To manage assets, it is desirable to identify the root causes of failures. In order to provide the root cause of why an asset is not operating as expected, the interface IRootCauseIndicationType is defined (see 9.4.2). Servers providing this information shall use AlarmTypes implementing this interface for alarms indicating asset specific information on failures.

The IRootCauseIndicationType is an interface and should be applied to AlarmTypes. It is formally defined in Table 16.

Note: The IRootCauseIndicationType can be applied not only to AlarmTypes but, in general, to any subtype of 0:ConditionType.

Table 16 – IRootCauseIndicationType Definition

Attribute	Value
BrowseName	IRootCauseIndicationType
IsAbstract	True
Description	Information on the root cause of conditions, should be applied to alarms (AlarmType or subtypes)

References	NodeClass	BrowseName	DataType	TypeDefinition	Other
Subtype of the 0:BaseInterfaceType Type defined in OPC 10000-5
0:HasProperty	Variable	PotentialRootCauses	RootCauseDataType[]	PropertyType	M

Conformance Units
AMB Asset Health Status Root Causes

The mandatory Property PotentialRootCauses contains an array of potential root causes of the alarm. This is intended to be a hint to the client and might be a local view on the potential root causes of the alarm. The list might not contain all potential root causes, that is, other potential root causes might exist as well. If the alarm itself is considered to be the root cause, the array shall be empty. If no potential root causes have been identified, there shall be at least one entry in the array indicating that the root cause is unknown.

The child Nodes of the IRootCauseIndicationType have additional Attribute values defined in Table 17.

Table 17 – IRootCauseIndicationType Attribute values for child Nodes

BrowsePath	Value Attribute	Description Attribute
PotentialRootCauses	-	An array of potential root causes of the alarm. This is intended to be a hint to the client and might be a local view on the potential root causes of the alarm. The list might not contain all potential root causes, that is, other potential root causes might exist as well. If the alarm itself is considered to be the root cause, the array shall be empty. If no potential root causes have been identified, there shall be at least one entry in the array indicating that the root cause is unknown.

This structure contains information about the root cause of an alarm. The structure is defined in Table 18.

Table 18 – RootCauseDataType Structure

Name	Type	Description
RootCauseDataType	structure
RootCauseId	0:NodeId	The NodeId of the root cause of an alarm. This can point to another Node in the AddressSpace or a ConditionId, that is not necessarily represented as Object in the AddressSpace. Ideally, this points directly to the root cause. Potentially, it points to an alarm that has an additional root cause. Clients shall expect that they need to follow a path to find the root cause. If the root cause is unknown, the NodeId shall be set to NULL.
RootCause	0:LocalizedText	Localized description of the root cause of an alarm. This can be the DisplayName of the Node referenced by RootCauseId or a more descriptive text. If the root cause is unknown, this should be described in the text.

Its representation in the AddressSpace is defined in Table 19.

Table 19 – RootCauseDataType Definition

References	NodeClass	BrowseName	DataType	TypeDefinition	Other
Subtype of the Structure DataType defined in OPC 10000-5

Conformance Units
AMB Asset Health Status Root Causes
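Since RootCauseId may point to an alarm that itself has a further root cause, a Client has to follow a path. The sketch below shows one way to do this, under simplifying assumptions: NodeIds are represented as plain strings, `None` stands in for a NULL NodeId, and the PotentialRootCauses of each alarm are already available in a dictionary. The names `RootCause` and `find_root_causes` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class RootCause:
    """Simplified stand-in for RootCauseDataType."""
    root_cause_id: Optional[str]  # None models a NULL NodeId (root cause unknown)
    root_cause: str               # localized description, here a plain string


def find_root_causes(
    start_id: str,
    potential_root_causes: Dict[str, List[RootCause]],
) -> Set[Optional[str]]:
    """Follow PotentialRootCauses entries until no further cause is listed.

    Returns the ids where the path ends; None in the result means that
    at least one path ended in an unknown root cause.
    """
    roots: Set[Optional[str]] = set()
    visited: Set[str] = set()
    stack = [start_id]
    while stack:
        node_id = stack.pop()
        if node_id in visited:
            continue  # guard against cyclic references
        visited.add(node_id)
        causes = potential_root_causes.get(node_id)
        if not causes:
            # Empty array: the alarm itself is the root cause.
            # Missing entry: the node is not a known alarm, so the path ends here.
            roots.add(node_id)
            continue
        for cause in causes:
            if cause.root_cause_id is None:
                roots.add(None)
            else:
                stack.append(cause.root_cause_id)
    return roots


# The MyMachine example: low oil pressure is caused by the pump fault.
alarms = {
    "OilPressureLow": [RootCause("PumpNotWorking", "Pump is not working")],
    "PumpNotWorking": [],
}
```

With this data, `find_root_causes("OilPressureLow", alarms)` ends at `"PumpNotWorking"`. As the list of potential root causes is only a hint, the result should be treated as a starting point for diagnosis rather than a complete answer.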

Although the alarming mechanism of OPC UA is used to indicate asset specific information on the health status, there are common indications of alarms across assets. To standardize the common indications, and leave options for extensibility by companion specifications and vendors, this specification uses the mechanism of the ConditionClassId defined for conditions in OPC 10000-9. In the following, specific subtypes of BaseConditionClassType are defined that should be used as ConditionClassId for specific alarms. Other companion specifications and vendors might add additional subtypes of BaseConditionClassType and might inherit from the types defined in this specification.

The ConnectionFailureConditionClassType is used to classify conditions related to connection failures. It is formally defined in Table 20.

Table 20 – ConnectionFailureConditionClassType Definition

Attribute	Value
BrowseName	ConnectionFailureConditionClassType
IsAbstract	True
Description	One or more connections have failed

References	NodeClass	BrowseName	DataType	TypeDefinition	Other
Subtype of the 0:SystemConditionClassType defined in OPC 10000-9

Conformance Units
AMB Asset Health Status Alarm Categories

The OverTemperatureConditionClassType is used to classify conditions related to over temperature. It is formally defined in Table 21.

Table 21 – OverTemperatureConditionClassType Definition

Attribute	Value
BrowseName	OverTemperatureConditionClassType
IsAbstract	True
Description	Over temperature

References	NodeClass	BrowseName	DataType	TypeDefinition	Other
Subtype of the 0:SystemConditionClassType defined in OPC 10000-9

Conformance Units
AMB Asset Health Status Alarm Categories

The CalibrationDueConditionClassType is used to classify conditions related to calibration being due. It is formally defined in Table 22.

Table 22 – CalibrationDueConditionClassType Definition

Attribute	Value
BrowseName	CalibrationDueConditionClassType
IsAbstract	True
Description	Calibration is due

References	NodeClass	BrowseName	DataType	TypeDefinition	Other
Subtype of the 0:MaintenanceConditionClassType defined in OPC 10000-9

Conformance Units
AMB Asset Health Status Alarm Categories

The SelfTestFailureConditionClassType is used to classify conditions related to self-test failures. It is formally defined in Table 23.

Table 23 – SelfTestFailureConditionClassType Definition

Attribute	Value
BrowseName	SelfTestFailureConditionClassType
IsAbstract	True
Description	Self-Test failure

References	NodeClass	BrowseName	DataType	TypeDefinition	Other
Subtype of the 0:SystemConditionClassType defined in OPC 10000-9

Conformance Units
AMB Asset Health Status Alarm Categories

The FlashUpdateInProgressConditionClassType is used to classify conditions related to flash updates being in progress. It is formally defined in Table 24.

Table 24 – FlashUpdateInProgressConditionClassType Definition

Attribute	Value
BrowseName	FlashUpdateInProgressConditionClassType
IsAbstract	True
Description	Flash update in progress

References	NodeClass	BrowseName	DataType	TypeDefinition	Other
Subtype of the 0:MaintenanceConditionClassType defined in OPC 10000-9

Conformance Units
AMB Asset Health Status Alarm Categories

The FlashUpdateFailedConditionClassType is used to classify conditions related to flash update failures. It is formally defined in Table 25.

Table 25 – FlashUpdateFailedConditionClassType Definition

Attribute	Value
BrowseName	FlashUpdateFailedConditionClassType
IsAbstract	True
Description	Flash update has failed

References	NodeClass	BrowseName	DataType	TypeDefinition	Other
Subtype of the 0:SystemConditionClassType defined in OPC 10000-9

Conformance Units
AMB Asset Health Status Alarm Categories

The BadConfigurationConditionClassType is used to classify conditions related to configurations being bad. It is formally defined in Table 26.

Table 26 – BadConfigurationConditionClassType Definition

Attribute	Value
BrowseName	BadConfigurationConditionClassType
IsAbstract	True
Description	Configuration is bad

References	NodeClass	BrowseName	DataType	TypeDefinition	Other
Subtype of the 0:SystemConditionClassType defined in OPC 10000-9

Conformance Units
AMB Asset Health Status Alarm Categories

The OutOfResourcesConditionClassType is used to classify conditions related to running out of resources. It is formally defined in Table 27.

Table 27 – OutOfResourcesConditionClassType Definition

Attribute	Value
BrowseName	OutOfResourcesConditionClassType
IsAbstract	True
Description	Out of resources issues

References	NodeClass	BrowseName	DataType	TypeDefinition	Other
Subtype of the 0:SystemConditionClassType defined in OPC 10000-9

Conformance Units
AMB Asset Health Status Alarm Categories

The OutOfMemoryConditionClassType is used to classify conditions related to running out of memory. It is formally defined in Table 28.

Table 28 – OutOfMemoryConditionClassType Definition

Attribute	Value
BrowseName	OutOfMemoryConditionClassType
IsAbstract	True
Description	Out of memory issues

References	NodeClass	BrowseName	DataType	TypeDefinition	Other
Subtype of the OutOfResourcesConditionClassType defined in 9.5.9

Conformance Units
AMB Asset Health Status Alarm Categories
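The supertype relationships stated in Tables 20 to 28 allow a Client to treat a specific category as an instance of a more general one. The following sketch records those relationships in a static dictionary keyed by BrowseName (namespace prefixes omitted) and checks whether one condition class derives from another. The two entries for BaseConditionClassType are taken from OPC 10000-9; the function name `is_subtype_of` is hypothetical, and a real Client would follow HasSubtype References in the AddressSpace instead of using a fixed table.

```python
# Direct supertype of each condition class, as stated in Tables 20 to 28.
SUPERTYPE = {
    "ConnectionFailureConditionClassType": "SystemConditionClassType",
    "OverTemperatureConditionClassType": "SystemConditionClassType",
    "CalibrationDueConditionClassType": "MaintenanceConditionClassType",
    "SelfTestFailureConditionClassType": "SystemConditionClassType",
    "FlashUpdateInProgressConditionClassType": "MaintenanceConditionClassType",
    "FlashUpdateFailedConditionClassType": "SystemConditionClassType",
    "BadConfigurationConditionClassType": "SystemConditionClassType",
    "OutOfResourcesConditionClassType": "SystemConditionClassType",
    "OutOfMemoryConditionClassType": "OutOfResourcesConditionClassType",
    # Defined in OPC 10000-9, not in this specification.
    "SystemConditionClassType": "BaseConditionClassType",
    "MaintenanceConditionClassType": "BaseConditionClassType",
}


def is_subtype_of(condition_class: str, ancestor: str) -> bool:
    """Return True if condition_class equals ancestor or derives from it."""
    current = condition_class
    while current is not None:
        if current == ancestor:
            return True
        current = SUPERTYPE.get(current)  # None once the top is reached
    return False
```

For example, a Client filtering for resource problems can match OutOfMemoryConditionClassType through its supertype OutOfResourcesConditionClassType, and vendor-defined subtypes can be handled the same way once their supertype is known.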

Attributes of the RootCauseDataType (belonging to Table 19):

Attribute	Value
BrowseName	RootCauseDataType
IsAbstract	
Description	Root cause of an alarm

When the Server wants to provide the overall health status of an asset over time, the history of the 2:DeviceHealth Variable shall be used. For each 2:DeviceHealth Variable whose history is currently tracked, the Historizing Attribute is set to True and the AccessLevel is set to HistoryRead. To indicate when history tracking started, a HistoryDataConfiguration is referenced with a HasHistoricalConfiguration Reference (see OPC 10000-11 for details).
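A Client can use these two Attributes to decide whether reading history is worthwhile. The following sketch assumes that the Attribute values have already been read and that HistoryRead is bit 2 (mask 0x04) of the AccessLevel, as defined in OPC 10000-3; the function name `history_is_tracked` is hypothetical.

```python
# Assumption: HistoryRead is bit 2 of the AccessLevel Attribute (OPC 10000-3).
ACCESS_LEVEL_HISTORY_READ = 0x04


def history_is_tracked(historizing: bool, access_level: int) -> bool:
    """Return True if the Variable currently collects history that can be read."""
    can_read_history = bool(access_level & ACCESS_LEVEL_HISTORY_READ)
    return historizing and can_read_history
```

Checking the bit rather than comparing for equality keeps the test valid when other AccessLevel bits, such as CurrentRead, are set as well.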

When the Server wants to provide information for tracking the alarms with detailed health information of an asset over time, the history of the events based on the alarms shall be used. This specification does not define which Objects are used as EventNotifier. As defined in the base specification, the Server Object shall be an EventNotifier when events are supported, which can be used to subscribe to all events of the Server. It is, therefore, server-specific where the history of events can be accessed and what details of the events are stored. The general concepts are defined in OPC 10000-11. The Server should provide for each asset the history of events for the same amount of time as the history of the 2:DeviceHealth.