CLKscrew: Another side channel you didn’t know about

Network engineers focus on protocols and software, but somehow all of this work must connect to the hardware on which packets are switched, and data is processed. A big part of the physical side of what networks “do” is power—how it is used, and how it is managed. The availability of power is one of the points driving centralization; power is not universally available at a single price. If cloud is cheaper, it’s probably not because of the infrastructure, but rather because of the power and real estate costs.

A second factor is the amount of heat produced in processing. Data center designers expend a lot of energy in dealing with heat problems. Heat production is directly related to power usage; each increase in power consumption for processing shows up as heat somewhere, and that heat must be removed from the equipment and the environment.

It is important, therefore, to optimize power usage. To do this, many processors today have power management interfaces allowing software to control the speed at which a processor runs. For instance, Kevin Myers (who blogs here) recently posted an experiment in which pings ran while a laptop was plugged in and while it was on battery:

Reply from 2607:f498:4109::867:5309: time=150ms
Reply from 2607:f498:4109::867:5309: time=113ms
Reply from 2607:f498:4109::867:5309: time=538ms
Reply from 2607:f498:4109::867:5309: time=167ms
Reply from 2607:f498:4109::867:5309: time=488ms
Reply from 2607:f498:4109::867:5309: time=231ms
Reply from 2607:f498:4109::867:5309: time=104ms
Reply from 2607:f498:4109::867:5309: time=59ms
Reply from 2607:f498:4109::867:5309: time=64ms
Reply from 2607:f498:4109::867:5309: time=57ms
Reply from 2607:f498:4109::867:5309: time=58ms
Reply from 2607:f498:4109::867:5309: time=64ms
Reply from 2607:f498:4109::867:5309: time=56ms
Reply from 2607:f498:4109::867:5309: time=62ms

There is a clear difference between the “plugged in” RTT and the “on battery” RTT. A common form of power management is Dynamic Voltage and Frequency Scaling (DVFS), which allows software to change the voltage and frequency at which a chip runs based on the kind of processing being done and the power available. The authors of the CLKscrew paper examined the interface that software drivers use to manage energy on some classes of processors, and discovered a series of vulnerabilities that allow an attacker to take control of the processing speed and voltage of the chip.

DVFS relies on two regulators in the processor: a voltage regulator that controls the voltage supplied to the chip, and a Phase-Locked Loop (PLL) that determines the clock frequency of the chip. Software can reduce the voltage supplied to the chip through this interface, as well as manage the speed at which the chip is running. Perhaps the simplest way to think about this is to conceive of a chip as a very complex set of interconnected buckets. The pipeline of the chipset represents a “bucket brigade,” where each stage of the chip is filled with water from some previous stage. Because a little water is always spilled between stages, each stage must be “topped off” a little during processing.

The faster you move water between the stages of buckets, the more the water “slops,” requiring a bit more “topping off” each time. The faster you move water between the stages, the more water is required at the input stage as well, so the processing simply consumes more water. The voltage level can be seen as the level to which each bucket is filled, as how many buckets in each stage are used, as how many stages of buckets are available, or as all three. In each case, reducing the voltage reduces the amount of work the chip can accomplish in a given time period.
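Outside the bucket analogy, the usual first-order approximation for the switching power of a CMOS chip is activity × capacitance × voltage² × frequency. The sketch below is illustrative only: the function name and the capacitance, voltage, and frequency values are hypothetical and are not taken from the paper. It shows why lowering voltage and frequency together saves more power than lowering frequency alone.

```python
def dynamic_power(c_eff, voltage, freq_hz, activity=1.0):
    """Approximate CMOS switching power in watts: activity * C * V^2 * f."""
    return activity * c_eff * voltage ** 2 * freq_hz


# Hypothetical operating points; real chips publish their own tables.
C_EFF = 1e-9  # effective switched capacitance in farads (assumed)

full = dynamic_power(C_EFF, voltage=1.0, freq_hz=2.0e9)
half_freq = dynamic_power(C_EFF, voltage=1.0, freq_hz=1.0e9)
scaled = dynamic_power(C_EFF, voltage=0.8, freq_hz=1.0e9)

print(f"full speed:          {full:.2f} W")
print(f"half frequency only: {half_freq:.2f} W")
print(f"half freq, 0.8 V:    {scaled:.2f} W")
```

Because voltage enters as a square, the combined operating point draws roughly a third of the full-speed power in this toy model, while halving the frequency alone only halves it.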

What happens if you force the chip to run at a higher speed than normal, or at a higher voltage than normal? You can cause the chip to overheat, for instance, or force processing errors by leaving the chip too little time to refill against the “interstage slop.”
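In circuit terms, a processing error of this kind appears when the clock period becomes shorter than the time a signal needs to travel between two pipeline stages and settle. The sketch below is a toy model under stated assumptions: the delay formula is the common alpha-power approximation, and every constant (threshold voltage, scaling factor, setup time) is a hypothetical value chosen for illustration, not a measurement from the paper or from any real chip.

```python
def path_delay_ns(voltage, v_threshold=0.3, k=0.35, alpha=1.3):
    """Toy alpha-power delay model: delay grows as voltage nears the threshold.

    All constants are assumed values for illustration only.
    """
    if voltage <= v_threshold:
        raise ValueError("voltage must exceed the threshold voltage")
    return k * voltage / (voltage - v_threshold) ** alpha


def is_timing_safe(freq_ghz, voltage, setup_ns=0.05):
    """True if one clock period covers the path delay plus the setup time."""
    period_ns = 1.0 / freq_ghz
    return period_ns >= path_delay_ns(voltage) + setup_ns


# Hypothetical operating points: nominal, overclocked, and undervolted.
for freq_ghz, voltage in [(1.5, 1.0), (2.0, 1.0), (1.5, 0.8)]:
    status = "safe" if is_timing_safe(freq_ghz, voltage) else "timing fault likely"
    print(f"{freq_ghz} GHz at {voltage} V: {status}")
```

In this model, either raising the frequency or lowering the voltage past the nominal point pushes the chip outside its timing margin, which is the condition an attacker who controls the regulators would try to create.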

This is precisely what the authors of this paper demonstrate. They take over the power management software of an ARM chipset through some simple attack vectors (it turns out power management software is not very hardened, and hence is pretty easy to take over). They then force errors in the chipset, and show how the chipset could be destroyed through this back door.

Part of the problem with the complex systems we build today is there are just too many attack surfaces for any one person to know about, or to account for. Complexity often drives insecurity—a lesson we are still in the process of learning.