Cloning Devices to meet NERC CIP, An Approach

Owners conducting a NERC Cyber Vulnerability Assessment have a requirement to annually verify ports and services. On Windows and Unix based systems, it is trivial and safe to pull a list of listening ports and the configured services thanks to commands like netstat, sc query, and others (you can even do it through Bandolier credentialed scanning!). But, network and automation devices often don’t have these commands readily available, and these devices tend to be the most sensitive to port scans (or owners are simply not willing to risk the scans due to unknowns).
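As a rough illustration of how simple the host-side check is, here is a minimal Python sketch that pulls the listening TCP ports out of saved `netstat -an` output. The function name and the sample text are made up for illustration, and the parsing assumes the common Linux and Windows column layouts (local address written as `address:port`); other platforms format the output differently.

```python
def listening_tcp_ports(netstat_output):
    """Return the sorted TCP ports shown as listening in `netstat -an` text.

    Assumes Linux/Windows style output where the local address is the
    first column containing a colon. This is a sketch, not a full parser.
    """
    ports = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only TCP rows (tcp, tcp6, TCP) are of interest here.
        if not fields or not fields[0].lower().startswith("tcp"):
            continue
        # Linux prints LISTEN, Windows prints LISTENING.
        if not any(f.upper() in ("LISTEN", "LISTENING") for f in fields):
            continue
        # The local address is the first address-like column.
        local = next((f for f in fields[1:] if ":" in f), None)
        if local is None:
            continue
        port = local.rpartition(":")[2]
        if port.isdigit():
            ports.add(int(port))
    return sorted(ports)


# Hypothetical sample mixing Linux and Windows style rows.
SAMPLE = """\
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:22             10.0.0.9:51514          ESTABLISHED
tcp6       0      0 :::443                  :::*                    LISTEN
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
udp        0      0 0.0.0.0:123             0.0.0.0:*
"""

if __name__ == "__main__":
    print(listening_tcp_ports(SAMPLE))
```

The point is the contrast: on a general-purpose OS the data is a text-parsing exercise, while on many automation devices there is no equivalent command to parse.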

So, because of the potential frailty of certain devices, coupled with the operational risk, owners' choices are limited. Schedule an expensive outage to scan the devices, potentially interrupting the business of electric power? Or attempt to scan the devices online, take other risk mitigation measures, and hope for the best? As responsible stewards of bulk electric system reliability, owners more often conclude that the risk of the unknown outweighs the benefit.

I’d like to float another option: cloning the devices and running scans within a test lab. The objective is to identify network, system, and device configurations that could affect the output of the scans. Then, document the reasoning for why a scan of the clone is representative of (or even equal to) a scan of the original, including any assumptions or other conditions.

Be aware, cloning the systems in a test lab can be more costly, and cost will multiply by the number of devices that must be cloned. While the level of effort to conduct the scan is the same, the activities involved in loading the device, ensuring it is a reasonable clone of the original, and then documenting the test will add hours on a per device basis. But, the risk reduction may be worth the additional cost.

Assurance that the cloned systems will match the production systems from a cyber security perspective requires both cyber security professionals and automation professionals. There are many variables that need to be identified, and it takes that expertise to decide whether or not a variable has an effect on the compliance requirements. For example, cloning a Cisco switch would require the same firmware level at minimum and preferably the same model of physical hardware, as there are some differences between firmware revisions for different models of switch. It could also require that the exact same configuration be applied, although certain options wouldn’t affect the scan. As an example, would the final output of the scan be affected by the number of fiber vs. copper ports? Probably not.
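One way to make that judgment explicit is to record the attributes of the production device and the clone side by side, and sort any differences into those believed to affect a scan and those that are not. The sketch below is only an illustration: the attribute names, the values, and the choice of which attributes count as scan-relevant are all assumptions that each owner's security and automation staff would need to decide for themselves.

```python
# Attributes assumed (for illustration only) to influence scan output.
SCAN_RELEVANT = {"model", "firmware", "enabled_services"}


def clone_differences(production, clone, relevant=SCAN_RELEVANT):
    """Compare two attribute dicts and group the differences.

    Returns {"relevant": {...}, "other": {...}} where each entry maps an
    attribute name to a (production_value, clone_value) tuple.
    """
    diffs = {"relevant": {}, "other": {}}
    for key in sorted(set(production) | set(clone)):
        prod_value, clone_value = production.get(key), clone.get(key)
        if prod_value != clone_value:
            bucket = "relevant" if key in relevant else "other"
            diffs[bucket][key] = (prod_value, clone_value)
    return diffs


# Hypothetical records; none of these values come from a real device.
production = {
    "model": "switch-model-A",
    "firmware": "1.2.3",
    "enabled_services": ("ssh", "snmp"),
    "fiber_ports": 4,
}
clone = {
    "model": "switch-model-A",
    "firmware": "1.2.3",
    "enabled_services": ("ssh", "snmp"),
    "fiber_ports": 2,
}

if __name__ == "__main__":
    print(clone_differences(production, clone))
```

An empty "relevant" bucket supports the claim that the clone is representative, and the "other" bucket becomes the list of differences that the documentation should explain away, such as the fiber port count here.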

While the Cisco is relatively straightforward, what about a PLC, relay, or DCS controller? Here is where the automation expertise is needed, as a cyber professional may not know what is necessary to clone an automation device. For example, some PLCs will listen by default on Modbus, even if there are no Modbus points configured, while others require the Modbus interface be loaded via a module. Additionally, most automation vendors do not include port and service details in firmware change logs, which makes it difficult to track differences between firmware revisions. These details are important. If I were an auditor, I would want to review the work done for an approach like this.
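The Modbus question is the kind of thing a lab clone can answer safely. The following is a minimal sketch of a single TCP connect check, intended only for a device on an isolated lab network. The function name and the address shown are hypothetical, and it assumes Modbus/TCP on its standard port, 502; serial Modbus or a vendor-specific port would need a different check.

```python
import socket

MODBUS_TCP_PORT = 502  # standard Modbus/TCP port


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    Performs one connect attempt and closes the socket immediately.
    No Modbus request is sent, so this only shows that something is
    listening, not that it speaks Modbus.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    lab_clone = "192.0.2.10"  # hypothetical lab address (documentation range)
    state = "open" if tcp_port_open(lab_clone, MODBUS_TCP_PORT) else "closed or filtered"
    print(f"{lab_clone}:{MODBUS_TCP_PORT} is {state}")
```

Running this against the clone before and after loading the Modbus module (or before and after configuring points) gives a simple, documentable answer for that device and firmware revision.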

The level of documentation for this process is going to be based upon each owner’s assessment of compliance risk weighed against the level of understanding the personnel have. One owner might have test reports only, while another might have a full methodology that explains how they developed the clones. At minimum, I would document the basic steps taken to develop clones, and concerns for specific devices, such as the Modbus example above.

As far as cost, try not to think of this approach as a wasted effort for compliance; there are dividends for this investment. First, set up the lab as you would an automation lab, and look for problems that occur during the scanning. Problems can be reported to the vendor, and the next generation of devices will be more resilient to scans. The knowledge gained can give an owner a better understanding of the risk of scanning production systems, which can lead to cost savings during the next year by making them more efficient at meeting compliance requirements. Then, you can make a donation of intellectual currency to the community via the many (successful?) information sharing programs, such as those run by NESCO.

I’d like to see vendor support in this area. Vendors have the dedicated test equipment already, they have unit tests that can identify misoperation, and they have the knowledgeable personnel to conduct this effort. But since they have a limited incentive to do this, maybe a better question is “$Vendor, what would I have to do to rent your test lab for a few weeks?”

Comments

This comment continues to demonstrate the fact that Vendors must be willing to support those cyber security regulations, standards, and initiatives of the end-users of their equipment! It is difficult enough to find concise documentation on exactly what ports/services are supposed to be enabled/running on their proprietary products – something which is key to understanding your risk exposure presented by any single device.

It would be nice if some of the larger ICS vendors like Honeywell, Emerson, Invensys, Siemens, Rockwell, etc. followed the lead set by a company like Wago, which provides the important security settings of their device on a single data sheet.

For this reason, I believe that the vendors, though they have made some progress in helping to secure their devices, still are in a world of “security by obscurity” and need to provide more relevant documentation to integrators and users to sufficiently PROTECT these systems from cyber threats.

Having a lab is extremely important for us at Entergy. 5 years ago, my lab was two 19″ racks in my cubicle that had 2 RTUs and several Relays and IEDs. I used this for creating and testing new configs, troubleshooting troublesome as-built configs, and testing new $vendor widgets and gidgets. We moved into a different office and we had some more money to expand the lab into 7 or 8 19″ racks with an even better population of SCADA, Protection, and substation network equipment. We purchased identical devices to what was in the field in case they were needed as an emergency spare. When the new Transmission HQ was being designed, we included space for 4 labs: Relay/SCADA, Networking/Security/Communications, Insulator/Arresters, and Power System Real-time Simulator. The Relay/SCADA lab now has over 20 racks with almost every vintage. The Network lab is where we do just as your article suggests…testing of real-world substation network conditions in a nice lab environment. The latest addition to the network lab was a fuzzer. I use it as part of my testing on new firmware patches and new vendor units.

I realize that not every company can have a similar lab space. In that case, the importance of FAT/SAT is even greater. I suggest using experienced 3rd party labs such as the INL National SCADA Test Bed (one example of several).

This thread made me think of the level of information we can get on copiers as compared to ICS equipment. I had the occasion a while back to respond to a “potential” breach of one of these copiers, and the standard info provided and the relative hardening of the device resulted in it being a non-issue.
Have a look: http://global.oce.com/support/security/document-printing-security.aspx

@Joel – It really depends on the vendor. I’ve worked with several that have extensive P&S documentation available to their customer base (mainly in generation) for equipment they provide. Others, I end up diving through manuals and testing systems, and work up my own. That’s still only the first part, thoroughly checking all those devices is a huge time/money cost as well, especially when scanning is limited.

@Chris – Does your employer rent out that lab? Cause I think there will be an interest in the future for a good setup like that for both compliance and security reasons.

@Matt – We have the exact opposite problem in ICS. The standard info provided is often not from a networking and security viewpoint, and the hardening of the devices is too often poor. Just ask Reid about the triviality of popping a shell on a PLC sometime. He’s even done it over TFTP.

@Ian – I wish it were that simple. Stay tuned, I think there needs to be a future blog post to discuss exactly why this kind of scanning causes heart palpitations.

If a single server going down, whether for maintenance or an incident, is going to cause an unacceptable impact, then you have an availability issue that needs to be addressed.

This happens all the time in assessments. An owner/operator says a specific type of server cannot be tested because it can never go down, or doesn’t fail over properly, or … We don’t do online testing of that system because the risk is too great, but it does lead to a mid to high severity finding in the report related to availability.
