Career Primer: Diving Into Industrial Control System (ICS) Security

What do oil refineries, chemical processing plants, and fire suppression systems all have in common? They are all run by Industrial Control Systems/Supervisory Control and Data Acquisition (ICS/SCADA) systems, the electromechanical controls behind industrial processes. You can think of them as the Internet of Things (IoT) for industrial applications.

There used to be a difference between ICS and SCADA, but not anymore. ICS referred to all industrial automation, i.e. the control of industrial systems, while SCADA referred to wide-area control systems like oil pipelines and power grids. Nowadays the terms are used interchangeably; the tech industry has settled on ICS as the catch-all term for industrial controls, while veterans still call it “SCADA.”

Today, ICS networks are considered “operational technology” (OT), in contrast to business networks, which are “information technology” (IT). IT networks are the traditional business networks of routers and computers. OT networks use PLCs (Programmable Logic Controllers), RTUs (Remote Terminal Units), DCS (Distributed Control Systems), and other non-business hardware to communicate with and control industrial systems.

A large number of ICS systems have been operating for decades with few changes. Mainly, this is because they run critical, money-making infrastructure and are rarely taken down for maintenance such as software upgrades.

Because they are legacy systems, ICS systems are vulnerable to attackers and malware. Like Internet of Things devices, most were never designed with security in mind and can be compromised.

The Background: Power Grids and Stuxnet

To put the importance of ICS systems in perspective, consider two famous security attacks.

The Ukraine power grid attack in 2015 blacked out large portions of the Eastern European country, and Stuxnet in 2010 affected Iranian nuclear processing centrifuges. Russia was suspected of using cyber weapons in the Ukraine attack, apparently in an attempt to sow discontent in democratic processes and to demonstrate Russian power; Stuxnet has been attributed to other governments, as described further on.

Over the course of two years, Russian attackers were suspected of digitally scouting the Ukrainian national power grid and identifying security systems at 30 different substations, including firmware installations. After identifying the systems, the attackers ran a phishing campaign that delivered a trojan horse program called BlackEnergy through infected Excel or Word attachments. The attack itself took less than a day, and launching the trojan’s payload required only a few minutes. A VPN was created between the compromised systems, allowing the attackers to remotely open circuit breakers, block telephony services, shut down uninterruptible power supplies, corrupt substation firmware, and wipe several hard drives.

In the end, 73 MWh of electricity was removed from the power grid over six hours. While this is less than 1% of Ukraine’s daily power output, the fact that it affected 230,000 people at midnight near Christmas had serious ramifications for the country.
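Because the outage figure is given in energy (MWh) rather than power (MW), a quick conversion helps put it in scale. The sketch below uses only the three numbers quoted above; the function names are made up for illustration, and the per-person figure is a plain average that says nothing about how the load was actually distributed.

```python
# Figures quoted in the article.
ENERGY_MWH = 73.0
OUTAGE_HOURS = 6.0
PEOPLE_AFFECTED = 230_000


def average_power_mw(energy_mwh: float, hours: float) -> float:
    """Average power (MW) implied by an amount of energy over a duration."""
    return energy_mwh / hours


def energy_per_person_kwh(energy_mwh: float, people: int) -> float:
    """Average undelivered energy per affected person, in kWh."""
    return energy_mwh * 1000.0 / people


if __name__ == "__main__":
    mw = average_power_mw(ENERGY_MWH, OUTAGE_HOURS)
    kwh = energy_per_person_kwh(ENERGY_MWH, PEOPLE_AFFECTED)
    print(f"Average load dropped: {mw:.1f} MW")
    print(f"Energy per affected person: {kwh:.2f} kWh")
```

The average works out to roughly 12 MW of load, which is small for a national grid. The impact came from who lost power and when, not from the raw amount of energy.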

Besides leaving people without heat, the outage forced emergency services and medical facilities onto generator backups, and the telephone system was effectively down. The power system also controlled many of the natural gas pipelines that carried gas to Europe.

Stuxnet was another attack with serious ramifications. The hack was performed through a worm (a self-replicating virus that requires no human interaction) that was able to jump the air-gap from the IT network to the OT network that controlled the nuclear centrifuges at the facilities. ICS systems are frequently air-gapped as a security measure; an air-gap is the physical separation of a network or system from other networks, such as the Internet. Thus, jumping the air-gap required the malware to be physically moved from an Internet-connected system to the ICS network via a USB thumb drive.

Targeting specific PLCs that were only used for these centrifuges, the malware modified the speed of the centrifuges while the Human-Machine Interfaces (HMI) indicated to the operators that everything was operating correctly. The centrifuges are designed to spin at 100,000 rpm to allow gaseous uranium hexafluoride to be separated into U-235 (used for nuclear bombs or fuel rods) and U-238 (which can’t sustain a nuclear reaction).
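The key trick described above is that the operator display and the physical process disagree. The following is a toy model of that idea only, not a reconstruction of the actual malware: the class names, the record-then-replay approach, and the 120,000 rpm attack speed are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

NOMINAL_RPM = 100_000  # design speed quoted in the article


@dataclass
class ToyCentrifuge:
    """Hypothetical stand-in for the physical process."""
    actual_rpm: float = NOMINAL_RPM


@dataclass
class CompromisedController:
    """Hypothetical controller that records normal readings, then replays them."""
    recorded: list = field(default_factory=list)
    attacking: bool = False
    replay_index: int = 0

    def report_to_hmi(self, centrifuge: ToyCentrifuge) -> float:
        # Normal operation: pass the real value through and remember it.
        if not self.attacking or not self.recorded:
            self.recorded.append(centrifuge.actual_rpm)
            return centrifuge.actual_rpm
        # Attack: ignore the real value and replay an old, healthy one.
        value = self.recorded[self.replay_index % len(self.recorded)]
        self.replay_index += 1
        return value


def simulate(normal_steps: int = 3, attack_steps: int = 3,
             attack_rpm: float = 120_000) -> list:
    """Return a list of (actual_rpm, rpm_shown_on_hmi) pairs."""
    centrifuge = ToyCentrifuge()
    controller = CompromisedController()
    log = []
    for _ in range(normal_steps):
        log.append((centrifuge.actual_rpm, controller.report_to_hmi(centrifuge)))
    controller.attacking = True
    centrifuge.actual_rpm = attack_rpm  # assumed speed change, for illustration
    for _ in range(attack_steps):
        log.append((centrifuge.actual_rpm, controller.report_to_hmi(centrifuge)))
    return log


if __name__ == "__main__":
    for actual, shown in simulate():
        print(f"actual={actual:>9.0f} rpm   HMI shows={shown:>9.0f} rpm")
```

Once the attack starts, the actual speed changes while the value shown on the HMI stays at the nominal figure, which is why operators relying on the display alone would see nothing wrong.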

Manipulating the centrifuges disrupted the separation of the uranium isotopes and damaged the centrifuges themselves. Because the uranium is never concentrated enough to sustain a nuclear reaction, there is no possibility of causing a nuclear explosion or nuclear meltdown, but the ability to create useful U-235 for nuclear reactors is eliminated.

Later analysis attributed the creation of Stuxnet to the US and Israel, with the aim of disrupting Iran’s nuclear energy program and hindering any research into nuclear weapons.

There are other examples of ICS-related problems. Some are malicious while others are accidental. However, they all demonstrate how damaging it can be when ICS networks stop functioning correctly. At a minimum, time and money are lost. At worst, when systems are badly managed or misused, people die.

A Legacy Problem

The legacy aspect of ICS is the main problem. These systems are incredibly important, yet they are rarely reviewed for security issues and even less frequently patched. A large portion of the systems currently in use were designed and deployed by companies and public utilities 20 to 30 years ago, before information security and direct Internet access were real concerns. Most of them use plain-text communication and simple protocols because the overhead of encryption can affect the timing of operations, causing problems. IP connectivity, remote access, and ease-of-use features are frequently added with no thought about security.

While encryption is frequently the “go-to” answer for security practitioners talking about IT, for reasons such as confidentiality, it causes problems on OT networks because of its resource requirements. Encryption requires a CPU that can handle the encryption/decryption (E/D) of data on the fly. The CPUs used by older PLCs and RTUs simply don’t have the processing power to handle their normal functions and E/D at the same time.

If encryption were used, the device could simply “brick” itself, turning off all functionality and connectivity until manually reset. Even if the CPU could handle both tasks, the time required for the CPU to perform the E/D could affect the timing of the physical component the PLC/RTU is connected to. For example, if the ICS monitored power plant readings to automatically connect two power buses, and the CPU hesitated a few milliseconds to deal with encryption, the phase relationship between the buses could be out of alignment when the breaker closed, resulting in potential damage to the breaker or other equipment.
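To get a feel for why a few milliseconds matter on an AC system, it helps to see how much of a cycle goes by in that time. The sketch below is only a sense-of-scale calculation, assuming the delay means the controller acts on a waveform reading that is stale by that amount; it is not a model of how a real synchronizing relay works, and the function name is made up for illustration.

```python
def phase_shift_degrees(frequency_hz: float, delay_seconds: float) -> float:
    """Degrees of an AC cycle that elapse during a processing delay.

    One full cycle is 360 degrees and lasts 1 / frequency_hz seconds,
    so the waveform advances 360 * frequency_hz degrees per second.
    """
    return (360.0 * frequency_hz * delay_seconds) % 360.0


if __name__ == "__main__":
    for freq in (50.0, 60.0):
        for delay_ms in (1.0, 2.0, 5.0):
            shift = phase_shift_degrees(freq, delay_ms / 1000.0)
            print(f"{freq:.0f} Hz, {delay_ms:.0f} ms delay -> {shift:.1f} degrees")
```

On a 50 Hz system one cycle lasts 20 ms, so a 5 ms hesitation corresponds to about a quarter of a cycle, or roughly 90 degrees.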

Lack of Training and Awareness Leads to Vulnerabilities

For current cyber-security professionals, ICS security is almost a wide-open field. While there are people and organizations who know about the problems, there simply aren’t that many people who are trained in OT networks who can provide support without breaking things.

ICS is not a field that is frequently discussed, especially with IT students, so many people simply don’t know much about it. Because it is frequently lumped in with engineering work, such as civil or mechanical engineering, it isn’t part of a normal Computer Science or even general IT education.

OT networks use protocols, such as Modbus, that normal IT networks don’t, so it takes special training to even understand what data is flowing on the network. Frequently, technicians have to learn on the job, applying general electromechanical and system control knowledge to their particular work environment. Just because someone knows how an oil refinery works doesn’t mean they know how a fire suppression system is configured.
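As a concrete taste of what OT traffic looks like, here is a minimal sketch that builds and decodes a Modbus TCP “Read Holding Registers” request (function code 0x03) using only the Python standard library. The helper names and the example values (transaction 1, unit 1, two registers starting at address 0) are assumptions for illustration; nothing is sent over a network.

```python
import struct

READ_HOLDING_REGISTERS = 0x03


def build_read_request(transaction_id: int, unit_id: int,
                       start_address: int, quantity: int) -> bytes:
    """Build a Modbus TCP request frame (MBAP header + PDU), big-endian."""
    pdu = struct.pack(">BHH", READ_HOLDING_REGISTERS, start_address, quantity)
    length = 1 + len(pdu)  # bytes after the length field: unit id + PDU
    protocol_id = 0        # always 0 for Modbus
    header = struct.pack(">HHHB", transaction_id, protocol_id, length, unit_id)
    return header + pdu


def parse_read_request(frame: bytes) -> dict:
    """Decode the same frame; note there is no key, password, or signature."""
    tid, pid, length, unit, func, addr, qty = struct.unpack(">HHHBBHH", frame)
    return {
        "transaction_id": tid,
        "protocol_id": pid,
        "length": length,
        "unit_id": unit,
        "function_code": func,
        "start_address": addr,
        "quantity": qty,
    }


if __name__ == "__main__":
    frame = build_read_request(transaction_id=1, unit_id=1,
                               start_address=0, quantity=2)
    print(frame.hex())  # 000100000006010300000002
    print(parse_read_request(frame))
```

The whole request is twelve bytes of plain, unauthenticated fields. Anyone who can see the traffic can read it, and anyone who can reach the device can craft it, which is the plain-text problem in miniature.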

PLCs and related devices are frequently very small computers, e.g. 512MB of RAM and <1GHz CPU, and you can’t interact with them remotely. Even if there is “remote communication”, it is generally just to the HMI at the central control center over a dedicated line.

Written by Cody Jackson

A Navy veteran and manager of Technology Services, Information Security, and Software Engineering for more than 20 years, Cody recently founded a consulting company based in San Antonio, Texas.
