OPC Unified Architecture – Part 110: Asset Management Basics
9 Health Status
9.1 Overview
In order to provide the health status of assets, this specification mainly uses the concepts defined in OPC 10000-100.
9.2 Overall health status of an asset
The Variable 2:DeviceHealth defined in OPC 10000-100 is used to provide the overall health status of an asset. It uses an enumeration having the values NORMAL, FAILURE, CHECK_FUNCTION, OFF_SPEC and MAINTENANCE_REQUIRED (see OPC 10000-100 for details).
Note that this Variable should be used for all assets, independent of what type of asset it is, for example also for software components. For some types of assets, not all values of the enumeration can reasonably be applied.
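As an illustration only, a Client might model the enumeration as shown in the following Python sketch. The numeric values are an assumption based on the 2:DeviceHealthEnumeration in OPC 10000-100 and should be checked against that specification; the helper needs_attention is hypothetical and not defined by any OPC UA specification.

```python
from enum import IntEnum


class DeviceHealth(IntEnum):
    """Overall health status values named in 9.2 (numeric values are assumed)."""

    NORMAL = 0
    FAILURE = 1
    CHECK_FUNCTION = 2
    OFF_SPEC = 3
    MAINTENANCE_REQUIRED = 4


def needs_attention(status: DeviceHealth) -> bool:
    """Hypothetical helper: True for every status other than NORMAL."""
    return status != DeviceHealth.NORMAL
```

An asset type for which some values cannot reasonably be applied (for example a software component) would simply never report those values; the enumeration itself stays unchanged.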
OPC 10000-100 defines the interface 2:IDeviceHealthType containing optionally the Variable 2:DeviceHealth. In older versions of OPC 10000-100 the Variable was only defined on the 2:DeviceType.
In order to support older versions of OPC 10000-100, Objects representing an asset and providing health status information shall provide the 2:DeviceHealth Variable. This should be achieved by implementing the 2:IDeviceHealthType interface providing optionally the 2:DeviceHealth Variable. This can be done either directly on the instance (X:Asset1) or via the TypeDefinition (X:Asset2), as shown in Figure 3. However, it is also allowed to just provide the 2:DeviceHealth Variable as done by X:Asset3 by using the 2:DeviceType. Note that in the current version of OPC 10000-100 2:DeviceType does implement 2:IDeviceHealthType.
Figure 3 – Examples of assets providing health status
The SourceTimestamp of the 2:DeviceHealth Variable provides the time when the asset entered that health status. In order to provide an accurate time, Servers should preserve this time across Server restarts. As this might not always be possible, a Client shall be aware that the SourceTimestamp is not always the time when the asset switched the health status.
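One way a Server could preserve the time across restarts is to persist the current health status together with the time it was entered. The following Python sketch illustrates the idea under stated assumptions: the class HealthStateStore, the JSON file layout and the use of string status names are all hypothetical and not part of this specification.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


class HealthStateStore:
    """Hypothetical helper that persists the health status and its entry time."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self.status: Optional[str] = None
        self.entered_at: Optional[datetime] = None
        # Restore the last known status and entry time after a restart.
        if path.exists():
            saved = json.loads(path.read_text(encoding="utf-8"))
            self.status = saved["status"]
            self.entered_at = datetime.fromisoformat(saved["entered_at"])

    def update(self, status: str, now: Optional[datetime] = None) -> datetime:
        """Record a status; the entry time changes only when the status changes."""
        if status != self.status or self.entered_at is None:
            self.status = status
            self.entered_at = now or datetime.now(timezone.utc)
            record = {"status": status, "entered_at": self.entered_at.isoformat()}
            self._path.write_text(json.dumps(record), encoding="utf-8")
        # The returned value would be used as SourceTimestamp.
        return self.entered_at
```

If the stored file is lost, the sketch falls back to the current time, which is exactly the case in which the SourceTimestamp no longer reflects the actual status change.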
9.3 Asset specific information on health status
To provide asset specific information on failure or maintenance conditions the alarming mechanism of OPC UA is used. The 2:IDeviceHealthType already defines the optional 2:DeviceHealthAlarms folder, where such alarm Objects can be provided. However, it is not required to provide the folder and the alarm Objects in the AddressSpace. Servers should use the AlarmTypes defined in OPC 10000-100 (2:DeviceHealthDiagnosticsAlarmType and subtypes) to expose the asset specific information. The event fields shall be used as defined in OPC 10000-5 and OPC 10000-9. The 0:SourceNode and 0:SourceName shall identify the asset. The 0:Severity should be used as defined in Table 15. Applications may refine those categories to a more detailed categorisation.
Table 15 – Usage of Severity for Alarms containing asset specific information
Severity | Classification | Description
801-1000 | Critical Fault | The asset has permanently failed.
601-800 | Major Recoverable Fault | The asset can no longer perform its function. Intervention is required.
401-600 | Minor Recoverable Fault | The asset has experienced a problem but is able to continue operation until intervention occurs.
301-400 | Maintenance Needed | Service is required to keep the asset operating within its designed tolerances.
201-300 | Limited Resource Capacity Near Limit | A limited resource met a threshold beyond which a more serious fault would occur.
1-200 | - | Used when the alarm is not active.
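The severity ranges of Table 15 can be expressed as a simple lookup. The following Python sketch is illustrative only; the function name classify_severity is hypothetical, and returning None for the range 1-200 is a choice made here to reflect that the table assigns no classification to that range.

```python
from typing import Optional


def classify_severity(severity: int) -> Optional[str]:
    """Map a 0:Severity value to the classification of Table 15.

    Returns None for 1-200, the range used when the alarm is not active.
    """
    if not 1 <= severity <= 1000:
        raise ValueError("Severity must be in the range 1-1000")
    if severity >= 801:
        return "Critical Fault"
    if severity >= 601:
        return "Major Recoverable Fault"
    if severity >= 401:
        return "Minor Recoverable Fault"
    if severity >= 301:
        return "Maintenance Needed"
    if severity >= 201:
        return "Limited Resource Capacity Near Limit"
    return None
```

An application that refines the categories could subdivide each range further while keeping these boundaries.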
9.4 Root cause of asset specific information on health status
9.4.1 Overview
An asset might have several reasons why it is not operating as expected, and these reasons often depend on each other. For example, the asset MyMachine might indicate the faults that a pump is not working and the oil pressure is too low. The second fault is caused by the first fault. To manage assets, it is desirable to identify the root causes of failures. In order to provide the root cause of why an asset is not operating as expected, the interface IRootCauseIndicationType is defined (see 9.4.2). Servers providing this information shall use AlarmTypes implementing this interface for alarms indicating asset specific information on failures.
9.4.2 IRootCauseIndicationType
The IRootCauseIndicationType is an interface and should be applied to AlarmTypes. It is formally defined in Table 16.
Note: The IRootCauseIndicationType can be applied not only to AlarmTypes but, in general, to subtypes of 0:ConditionType.
Table 16 – IRootCauseIndicationType Definition
Attribute | Value
BrowseName | IRootCauseIndicationType
IsAbstract | True
Description | Information on the root cause of conditions, should be applied to alarms (AlarmType or subtypes)

References | NodeClass | BrowseName | DataType | TypeDefinition | Other
Subtype of the 0:BaseInterfaceType Type defined in OPC 10000-5
0:HasProperty | Variable | PotentialRootCauses | RootCauseDataType[] | PropertyType | M

Conformance Units
AMB Asset Health Status Root Causes
The mandatory Property PotentialRootCauses contains an array of potential root causes of the alarm. This is intended to be a hint to the client and might be a local view on the potential root causes of the alarm. The list might not contain all potential root causes, that is, other potential root causes might exist as well. If the alarm itself is considered to be the root cause, the array shall be empty. If no potential root causes have been identified, there shall be at least one entry in the array indicating that the root cause is unknown.
The child Nodes of the IRootCauseIndicationType have additional Attribute values defined in Table 17.
Table 17 – IRootCauseIndicationType Attribute values for child Nodes
BrowsePath | Value Attribute | Description Attribute
PotentialRootCauses | - | An array of potential root causes of the alarm. This is intended to be a hint to the client and might be a local view on the potential root causes of the alarm. The list might not contain all potential root causes, that is, other potential root causes might exist as well. If the alarm itself is considered to be the root cause, the array shall be empty. If no potential root causes have been identified, there shall be at least one entry in the array indicating that the root cause is unknown.
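The three cases described for PotentialRootCauses (empty array, only entries marking the root cause as unknown, at least one identified candidate) can be distinguished as in the following Python sketch. It is a simplified illustration: each entry is modelled as a pair of an identifier and a text, with None standing in for a NULL NodeId, and the function name and return values are hypothetical.

```python
from typing import List, Optional, Tuple

# (RootCauseId, RootCause text); None models a NULL NodeId, i.e. "unknown".
RootCauseEntry = Tuple[Optional[str], str]


def interpret_potential_root_causes(entries: List[RootCauseEntry]) -> str:
    """Classify the content of a PotentialRootCauses array.

    "self"       -> empty array: the alarm itself is considered the root cause
    "unknown"    -> only entries without an identifier: root cause is unknown
    "candidates" -> at least one entry identifies a potential root cause
    """
    if not entries:
        return "self"
    if all(root_cause_id is None for root_cause_id, _ in entries):
        return "unknown"
    return "candidates"
```

Because the array is only a hint and might be incomplete, a Client should treat "candidates" as a starting point for further investigation rather than as an exhaustive list.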
9.4.3 RootCauseDataType
This structure contains information about the root cause of an alarm. The structure is defined in Table 18.
Table 18 – RootCauseDataType Structure
Name | Type | Description
RootCauseDataType | structure |
RootCauseId | 0:NodeId | The NodeId of the root cause of an alarm. This can point to another Node in the AddressSpace or a ConditionId, that is not necessarily represented as Object in the AddressSpace. Ideally, this points directly to the root cause. Potentially, it points to an alarm that has an additional root cause. Clients shall expect that they need to follow a path to find the root cause. If the root cause is unknown, the NodeId shall be set to NULL.
RootCause | 0:LocalizedText | Localized description of the root cause of an alarm. This can be the DisplayName of the Node referenced by RootCauseId or a more descriptive text. If the root cause is unknown, this should be described in the text.
Its representation in the AddressSpace is defined in Table 19.
Table 19 – RootCauseDataType Definition
Attribute | Value
BrowseName | RootCauseDataType
IsAbstract | False
Description | Root cause of an alarm

References | NodeClass | BrowseName | DataType | TypeDefinition | Other
Subtype of the Structure DataType defined in OPC 10000-5

Conformance Units
AMB Asset Health Status Root Causes
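Since a RootCauseId might point to an alarm that itself has a further root cause, a Client needs to follow a path. The following Python sketch shows one possible traversal. It is an illustration under stated assumptions: NodeIds are modelled as strings, None stands in for a NULL NodeId, and the dictionary passed to the function is a hypothetical stand-in for reading the PotentialRootCauses Property of each condition from the Server.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class RootCauseDataType:
    """Simplified model of the structure in Table 18."""

    root_cause_id: Optional[str]  # None models a NULL NodeId (root cause unknown)
    root_cause: str               # stands in for the 0:LocalizedText


def follow_root_cause_path(
    start: str, potential_root_causes: Dict[str, List[RootCauseDataType]]
) -> Set[str]:
    """Return the nodes at which the path starting at `start` ends.

    A path ends at a node whose array is empty (the node is the root cause),
    contains only unknown entries, or is not available at all.
    """
    end_points: Set[str] = set()
    visited: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:  # guard against circular references
            continue
        visited.add(node)
        entries = potential_root_causes.get(node, [])
        next_ids = [e.root_cause_id for e in entries if e.root_cause_id is not None]
        if next_ids:
            stack.extend(next_ids)
        else:
            end_points.add(node)
    return end_points
```

Applied to the example in 9.4.1, a low oil pressure alarm referencing a pump alarm with an empty array would yield the pump alarm as the single end point.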
9.5 Standardized categories of asset specific information on health status
9.5.1 Overview
Although the alarming mechanism of OPC UA is used to indicate asset specific information on the health status, there are common indications of alarms across assets. To standardize the common indications, and leave options for extensibility by companion specifications and vendors, this specification uses the mechanism of the ConditionClassId defined for conditions in OPC 10000-9. In the following, specific subtypes of BaseConditionClassType are defined that should be used as ConditionClassId for specific alarms. Other companion specifications and vendors might add additional subtypes of BaseConditionClassType and might inherit from the types defined in this specification.
9.5.2 ConnectionFailureConditionClassType
The ConnectionFailureConditionClassType is used to classify conditions related to connection failures. It is formally defined in Table 20.
Subtype of the 0:SystemConditionClassType defined in OPC 10000-9
Conformance Units
AMB Asset Health Status Alarm Categories
9.5.6 FlashUpdateInProgressConditionClassType
The FlashUpdateInProgressConditionClassType is used to classify conditions related to flash updates being in progress. It is formally defined in Table 24.
Subtype of the OutOfResourcesConditionClassType defined in 9.5.9
Conformance Units
AMB Asset Health Status Alarm Categories
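Because companion specifications and vendors might add further subtypes, a Client filtering alarms by ConditionClassId would typically accept subtypes of the class it is interested in. The following Python sketch illustrates such a check. The supertype table contains only the two relationships stated in this clause; in practice the hierarchy would be obtained by browsing the Server, and the function name is hypothetical.

```python
from typing import Dict, Optional, Set

# Supertype relationships stated in this clause (illustrative, not complete).
SUPERTYPES: Dict[str, str] = {
    "ConnectionFailureConditionClassType": "0:SystemConditionClassType",
    "FlashUpdateInProgressConditionClassType": "OutOfResourcesConditionClassType",
}


def is_subtype_of(
    type_name: str, ancestor: str, supertypes: Dict[str, str] = SUPERTYPES
) -> bool:
    """True if type_name equals ancestor or inherits from it."""
    seen: Set[str] = set()
    current: Optional[str] = type_name
    while current is not None and current not in seen:
        if current == ancestor:
            return True
        seen.add(current)
        current = supertypes.get(current)  # None once the table has no entry
    return False
```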
9.6 Tracking of health information
When the Server wants to provide the overall health status of an asset over time, the history of the 2:DeviceHealth Variable shall be used. For each 2:DeviceHealth Variable, where the history is currently tracked, the Historizing Attribute is set to True and the AccessLevel is set to HistoryRead. In order to provide the information, when history tracking started, a HistoryDataConfiguration is referenced with a HasHistoricalConfiguration Reference (see OPC 10000-11 for details).
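A Client can determine whether the history of a 2:DeviceHealth Variable is currently tracked by inspecting the two Attributes named above. The following Python sketch assumes that "set to HistoryRead" means the HistoryRead bit of the AccessLevel bit mask is set, and uses the bit positions of the AccessLevelType defined in OPC 10000-3; the function name is hypothetical.

```python
from enum import IntFlag


class AccessLevel(IntFlag):
    """Lower bits of the AccessLevel bit mask (bit positions per OPC 10000-3)."""

    CURRENT_READ = 0x01
    CURRENT_WRITE = 0x02
    HISTORY_READ = 0x04
    HISTORY_WRITE = 0x08


def is_history_tracked(historizing: bool, access_level: int) -> bool:
    """True if Historizing is True and the HistoryRead bit is set."""
    return historizing and bool(access_level & AccessLevel.HISTORY_READ)
```

For example, an AccessLevel of 0x05 (CurrentRead and HistoryRead) together with Historizing set to True indicates that history is being tracked and can be read.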
When the Server wants to provide information for tracking the alarms with detailed health information of an asset over time, the history of the events based on the alarms shall be used. This specification does not define which Objects are used as EventNotifier. As defined in the base specification, the Server Object shall be an EventNotifier when events are supported, which can be used to subscribe to all events of the Server. It is, therefore, server-specific where the history of events can be accessed and what details of the events are stored. The general concepts are defined in OPC 10000-11. The Server should provide for each asset the history of events for the same amount of time as the history of the 2:DeviceHealth Variable.