6 Implementation and deployment considerations
6.1 Overview
Clause 6 provides guidance to vendors that implement OPC UA Applications. Since many of the countermeasures required to address the threats described above fall outside the scope of the OPC UA specification, the advice in Clause 6 suggests how some of those countermeasures should be provided.
For each of the following areas, Clause 6 defines the problem space, identifies consequences if appropriate countermeasures are not implemented and recommends best practices.
6.2 Appropriate timeouts:
Timeouts, the time that the implementation waits (usually for an event such as Message arrival), play a very significant role in influencing the security of an implementation. Potential consequences include
- Denial of service: Denial of service conditions may exist when a Client does not reset a Session, if the timeouts are very large.
- Resource consumption: When a Client is idle for long periods of time, the Server keeps the Client’s buffered Message or information for that period, leading to resource exhaustion.
The implementer should use reasonable timeouts for each connection stage.
6.3 Strict Message processing
The specifications often specify the format of the correct Messages and are silent on what the implementation should do for Messages that deviate from the specification. Typically, the implementations continue to parse such packets, leading to vulnerabilities.
- The implementer should do strict checking of the Message format and should either drop the packets or send an error Message as described below.
- Error handling uses the error code, defined in Part 4, which most precisely fits the condition and only when returning an error code is appropriate. Error codes can be used as an attack vector, thus their uses should be limited as described in Part 4. Part 4 describes that a single generic error is returned before and during the establishment of a secure channel. Once the secure channel has been established then appropriate specific error codes are returned.
- Another attack vector that can be used is timing variations; this is minimized by the description in Part 4 that requires the closing of the socket for any errors when establishing a secure channel. Vendors should be careful in their implementation to ensure that all paths that result in the closure of the socket do not provide a timing hint indicating which failure path was encountered. This can be accomplished by having a random delay before closing the socket or before returning a generic error code.
- All arrays lengths, string lengths and recursion depth should be strictly enforced and processed.
6.4 Random number generation
Random numbers that meet security needs can be generated by suitable functions that are provided by cryptography libraries. Common random functions such as using rand() provided by the “C” standard library do not generate enough entropy. As an alternative, implementers could use the random number generators provided by the Microsoft Windows Crypto library (WinCrypt library) or by OpenSSL. Even the random functions provided in cryptography libraries require a source of entropy to initialize and the required entropy is not always available on embedded devices. PCs can use several individual pieces of information (hardware ids like CPU, Mac, addresses, USB devices, screen resolution, installed software ...) to generate entropy, but embedded devices are built completely identically. Often only the time and maybe a MAC address is left for entropy. These sources of entropy can be guessed or discovered. This makes the embedded devices very vulnerable.
A common mistake is to generate cryptographic keys during the first boot. Thus even the time information is predictable (creation time is stored e.g. in a certificate). Some alternate solutions a vendor might want to consider:
- Add specific entropy generator hardware when designing embedded devices.
- Do not generate certificates on embedded devices. Use an external tool or the GDS to generate the certificate and load it onto the device. A problem could still remain for the symmetric keys, as these are normally not created directly during the boot phase; rather they are created when a client connects.
- Wait long enough until enough entropy information is available. Some operating systems provide hints when they have reached this point.
- For embedded systems without a good entropy source it may help to store the cryptographic pseudo-random number generator (CPRNG) state, so that it will not produce the same random numbers after every boot.
Vendor should ensure that cryptographic functions they use are initialized with suitable entropy and that the generated certificates are not created in a predictable manner.
6.5 Special and reserved packets
The implementation understands and correctly interprets any Message types that are reserved as special (such as broadcast and multicast addresses in IP specification). Failing to understand and interpret those special packets may lead to vulnerabilities.
6.6 Rate limiting and flow control
OPC UA does not provide rate control mechanisms, however an implementation can incorporate rate control.
6.7 Administrative access
OPC UA describes that certain functionality, such as the management of CertificateStores, should be restricted to administrators. This Multi-part standard does not describe the details associated with administrative access. The nature of administrative access varies from platform to platform. Some platforms only have a single administrator. Other platforms provide multiple levels of administrative access such as backup administrator, network administrator, configuration administrator etc. The deployment site should make appropriate selections for administrator access and the implementer should allow for the configuration of appropriate administrator account access.
6.8 Cryptographic Keys
Security Profiles defined in Part 7 describe required algorithms and required key lengths. Key length requirements may be specified as a range, i.e. 1024-2048. It is important that an application supports the entire range for its Application Instance Certificate. This allows an end user to generate a key (Application Instance Certificate) that meets their security requirements. This may extend the period of time for which the given Security profile can be used. For example key lengths less than 2048 are already considered insecure, but if an end user generates certificates for the high end of the range (2048), the application might still be considered secure (depending on the other algorithms).
6.9 Alarm related guidance
OPC UA supports a robust Alarm and Condition information model which includes the ability to disable alarms, shelve alarms, and to generally manage alarms. Alarm processing and management is an important part of maintaining efficient control of a plant. From a security point of view it is important that this avenue be adequately protected, to ensure that a rogue agent does not create a dangerous or financial situation. OPC UA provides the tools required for this protection, but the implementer needs to ensure that they are exercised correctly. All functions that allow changes to the running environment are able to generate Audit Events and are to be restricted to appropriate users.
The disabling of Alarms is one such function that should be restricted to personnel with appropriate access rights. Furthermore, any action that disables an alarm, whether it be initiated by personnel or some automated system, should generate an Audit Event indicating the action.
The shelving of alarms should follow similar guideline as the disabling of alarms with regard to access and Auditing, although it may be available to a wider range of users (operators, engineers). Also the implementer should ensure that appropriate timeouts are configured for Alarm Shelving. These timeouts should ensure that an Alarm cannot be shelved for a period of time that could cause safety concerns.
Dialog Events could also be used to overload a Client. It would be a best practice for Servers that support dialogs to restrict the number of concurrent dialogs that could be active. Also Dialogs should include some timeout period to ensure that they are not used to create a DOS. Client implementers should also ensure that any dialog processing cannot be used to overwhelm an operator. The maximum number of open dialogs should be restricted and dialogs should be able to be ignored (i.e. other processing should still be available).
6.10 Program access
OPC UA describes functionality that allows for programs to be executed as part of the OPC UA Server. These programs can be used to perform advanced control algorithms or other actions. The use of these actions should be restricted to personnel with appropriate access rights. Furthermore, the definition of Programs should be carefully monitored. It is recommended that statistics be maintained regarding the number of defined programs in addition to their execution frequency. This information is available to administrative personnel. In no case should an unlimited number of program executions be allowed.
6.11 Audit event management
The OPC UA specification describes Audit Events that are to be generated and the information that these Audit Events include as a minimum, however, the specification does not describe how these Audit Events are handled once they are generated. Audit Events can be subscribed to by multiple Audit tracking systems or logging systems. The OPC UA specification does not describe these systems. It is assumed that any number of vendor provided systems could provide this functionality. As a best practice whatever system is used to store and manage, Audit Events should ensure the following:
- That Audit Events are not tampered with once they are received.
- The Subscription for Audit Events should be via a Secure Channel to ensure they are not tampered with while in transition.
- For Clients that log audit events; it is recommended that the logged audit events be persisted in such a manner that the audit events can be authenticated and linked to the original transaction.
An Audit event management system could have additional requirements based on the site CSMS.