Side Channel Attacks in the Wild: The Smart Home

Side channel attacks are not something most network engineers are familiar with; I provided a brief introduction to the concept over at The Network Collective in this Short Take. If you aren’t familiar with the concept, it might be worth watching that video (a little over 4 minutes) before reading this post.

Side channel attacks are more common, and more dangerous, than many engineers understand. In this post, I’ll take a look at a 2017 research paper that builds and exploits a side channel attack against several smart home devices to see how such an attack plays out. The researchers begin their test with a series of devices, including a children’s sleep monitor, a pair of security cameras, a pair of smart power plugs, and a voice-based home assistant.

The attack itself takes place in two steps. The first is to correlate individual traffic flows with a particular device (where a traffic flow is identified by a 5 tuple). The researchers did this in three different ways. First, they observed the MAC address of each device talking on the network, comparing the first three octets of this address to a list of known manufacturers. Most home device manufacturers use a small number of Ethernet chipsets; knowing the brand of the chipset can often narrow the set of devices that could be sending a given stream to a relatively small number.
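The MAC address comparison can be sketched in a few lines of Python. This is only an illustration of the idea, not the researchers’ code: the `OUI_VENDORS` table, its prefixes, and the vendor names are all made up, and a real lookup would use the published IEEE registry.

```python
# Hypothetical prefix table: these prefixes and vendor names are invented
# for illustration and do not correspond to real registrations.
OUI_VENDORS = {
    "00:11:22": "ExampleCam Inc.",
    "AA:BB:CC": "ExamplePlug Ltd.",
}


def oui(mac: str) -> str:
    """Return the first three octets of a MAC address as XX:XX:XX."""
    octets = mac.replace("-", ":").upper().split(":")
    if len(octets) != 6:
        raise ValueError(f"not a 6-octet MAC address: {mac!r}")
    return ":".join(octets[:3])


def vendor_for(mac: str) -> str:
    """Map a MAC address to a vendor name, or 'unknown' if not in the table."""
    return OUI_VENDORS.get(oui(mac), "unknown")
```

Calling `vendor_for("aa:bb:cc:01:02:03")` against this made-up table returns the placeholder vendor for that prefix; an address with an unlisted prefix returns `"unknown"`.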

The second mechanism the researchers used was to examine the DNS queries transmitted by a device. If a device queries a domain name belonging to Amazon’s services, for instance, it is likely to be an Amazon-produced home assistant. A list of these correlations can be built by examining different devices in an experimental setup, or even in the wild. Note these DNS queries, and their responses, are unencrypted, so this information is available regardless of any other encryption being used. Finally, the kind of device can be further pinpointed by examining the rates at which each device sends traffic. Video devices are likely to send traffic at a higher rate than voice only devices, for instance.
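A rough sketch of the DNS correlation might look like the following. The `DNS_HINTS` table is an assumption of mine; the domain names in it are placeholders, not names taken from the paper.

```python
# Hypothetical mapping from DNS name suffixes to device types. The names
# below are placeholders under example.com / example.net, not real services.
DNS_HINTS = {
    "assistant.example.com": "voice assistant",
    "camera.example.net": "security camera",
}


def device_from_query(qname: str) -> str:
    """Guess a device type from an observed DNS query name."""
    name = qname.rstrip(".").lower()
    for suffix, device in DNS_HINTS.items():
        # Match the suffix itself or any subdomain of it, but not a
        # different name that merely ends with the same characters.
        if name == suffix or name.endswith("." + suffix):
            return device
    return "unknown"
```

Matching on whole labels rather than raw string suffixes avoids treating `notassistant.example.com` as a hit.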

Once the researchers identified each device, they then began inferring specific activities within the home. This primarily involved using the amount of traffic being transmitted by each device. The researchers tried different states of operation for each device in a lab setting to determine what kind of activity correlates to different traffic levels. For instance, for a sleep monitor, a sleeping child might produce one level of traffic, an awake child might produce another, and an empty room might produce a third level. Watching television or listening to music, which indicate occupancy, would produce a different level of activity on a smart assistant device, while an empty home would produce another. Security cameras increase and decrease the amount of traffic they are generating based on how much motion is in their field of view.
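To make the inference step concrete, here is a minimal sketch that maps an observed traffic rate to a state for a single sleep monitor. The rate bands in `PROFILE` are invented numbers; in practice they would come from the kind of lab measurements described above.

```python
# Hypothetical traffic bands (bytes per second) for one sleep monitor.
# The thresholds are illustrative assumptions, not measured values.
PROFILE = [
    (0, 500, "empty room"),
    (500, 5000, "child sleeping"),
    (5000, float("inf"), "child awake"),
]


def mean_rate(byte_counts, interval_s):
    """Average bytes per second over equally sized observation intervals."""
    if not byte_counts:
        raise ValueError("need at least one observation interval")
    return sum(byte_counts) / (len(byte_counts) * interval_s)


def infer_state(byte_counts, interval_s=1.0):
    """Return the state whose rate band contains the observed mean rate."""
    rate = mean_rate(byte_counts, interval_s)
    for low, high, state in PROFILE:
        if low <= rate < high:
            return state
    return "unknown"
```

Nothing here looks inside a packet; only byte counts per interval are needed, which is why encryption does not help.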

Combining these traffic levels with even a basic amount of information about what kind of device is generating the traffic can provide a fairly good view of what is going on in the home. A sleep monitor with an intermediate level of activity, combined with home assistants sending traffic that indicates background noise, suggests the house is occupied with a child sleeping in one room, and one or more adults watching television in another, for instance.

The importance of this form of attack is that it does not matter whether or not encryption is being used to mask the contents of any or all of these traffic flows. Merely being able to determine what kind of device is sending traffic, combined with knowing what “normal” traffic levels look like under different conditions, and finally with observing those traffic levels, reveals a good deal about the activity inside a home.

In the final section of this paper, the researchers attempt to find some way to prevent an observer from seeing traffic levels clearly enough to infer activity from them. What they discover is that by adding random traffic to the various streams, increasing the overall traffic flow by about 20%, they can prevent the effective determination of activity. While the authors of the paper state this seems like a small amount of traffic, it is actually a large amount of traffic to be carried, given the number of homes, the amount of aggregate bandwidth this represents, etc.
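The arithmetic of such a padding budget can be sketched as follows. This is a naive illustration of spreading a fixed amount of cover traffic at random across observation intervals; it is not the scheme from the paper, and it makes no claim about how well this particular distribution hides activity. The function name and parameters are my own.

```python
import random


def add_cover_traffic(byte_counts, overhead=0.20, seed=None):
    """Return per-interval byte counts with random cover traffic added.

    The total added is int(sum(byte_counts) * overhead). The 20% default
    mirrors the figure quoted above; everything else is an assumption.
    """
    if not byte_counts:
        return []
    rng = random.Random(seed)
    budget = int(sum(byte_counts) * overhead)
    # Random positive weights decide how the budget is split up.
    weights = [rng.random() + 1e-9 for _ in byte_counts]
    total_weight = sum(weights)
    extras = [int(budget * w / total_weight) for w in weights]
    # Rounding down leaves a small remainder; add it to the last interval.
    extras[-1] += budget - sum(extras)
    return [real + extra for real, extra in zip(byte_counts, extras)]


observed = [1000, 0, 4000, 0, 0]
padded = add_cover_traffic(observed, seed=1)
```

Whatever the split, the padded series carries 6,000 bytes where the original carried 5,000, which is the cost that adds up across every home.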

Side channel attacks of this kind are a real threat. While the paper considered here examines smart home devices, the ability to infer activity is much broader than this single use case. Side channel attacks are an important concept for network security professionals and network architects to understand.

History of Hardware Switching

On this episode of the history of networking, we talk to Tony Li about the origin and history of the Cisco Silicon Switching Engine.

Short Take: Side Channel Attacks

In this short take, recently posted over at the Network Collective, I discuss what a side channel attack is, and why they are important.

Low Latency Networking

Low latency is coming to a network near you. In fact, it’s probably coming to your network, whether or not you realize it.

Bandwidth has always been the primary measure of a network (cross sectional, or non-contending, bandwidth in the case of data center fabrics). Further research and reflection, however, have taught large scale network operators that latency is actually much more of a killer for application performance than lack of bandwidth; and not only latency, but its close cousin, jitter. Why is this?

To understand, it is useful to return to an example given by Tanenbaum in his book Computer Networks. He includes a humorous example of calculating the bandwidth of a station wagon full of VHS tapes, with each tape containing the maximum amount of data possible. For those young folks out there who didn’t understand a single word in that last sentence, think of an overnight delivery box from your favorite shipping service. Now stuff the box full of high density solid state storage of some kind, and ship it. You can calculate the bandwidth of the box by multiplying the number of devices you can stuff in there by the capacity of each device, and then dividing by 86,400 (the number of seconds in 24 hours).

What you will find, if you do this little exercise, is that the bandwidth of the box is greater than any link you can buy today. In fact, it’s probably greater than the bandwidth of every link available across your nation, region, or favorite ocean.
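If you want to try the exercise, a small sketch follows. The device count and capacity are placeholder numbers I picked for illustration; substitute whatever fits in your box. The result is multiplied by 8 so it comes out in bits per second, the unit links are sold in.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def box_bandwidth_bps(device_count, bytes_per_device,
                      transit_seconds=SECONDS_PER_DAY):
    """Effective bandwidth, in bits per second, of shipping storage devices."""
    total_bits = device_count * bytes_per_device * 8
    return total_bits / transit_seconds


# Assumed figures: 1,000 devices of 4 TB each, delivered overnight.
bps = box_bandwidth_bps(1000, 4 * 10**12)
print(f"{bps / 1e9:.1f} Gbit/s")
```

The catch, of course, is that the first bit arrives a full day after it was sent.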

So why don’t we use boxes to ship data? Because networks are, in the final analysis, a concession to human impatience. Until we reach the speed of human impatience, networks will always be able to improve in terms of delay.

To translate this into more practical terms: latency causes applications running on the network to run slowly, and humans do not like slow applications. Jitter, the close cousin of latency, can be even worse. Applications cannot run at the fastest speed available in the network; rather, they must run at the slowest speed available in the network. If an end-to-end path is very jittery, meaning it exhibits a wide range of delays, the application will be forced to adjust by running at the slowest round trip time. Or worse, the application will be constantly trying to guess what the next round trip time is going to be, and guessing wrong. This can cause dropped packets, out of order packet delivery, and a host of other problems.
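A small sketch shows why the slowest round trip time matters. It assumes a simple window-limited sender that can have at most one window of data outstanding per round trip; the sample round trip times and the 64 KB window are made-up values.

```python
def rtt_spread_ms(rtts_ms):
    """A crude jitter measure: the gap between slowest and fastest RTT."""
    return max(rtts_ms) - min(rtts_ms)


def window_throughput_bps(window_bytes, rtt_ms):
    """Upper bound on throughput when one window is sent per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000)


# Assumed samples: mostly ~20 ms, with one 80 ms outlier.
samples_ms = [20, 22, 21, 80]
window = 64 * 1024

best_case = window_throughput_bps(window, min(samples_ms))
worst_case = window_throughput_bps(window, max(samples_ms))
```

With these numbers the path is fast three times out of four, yet a sender pacing itself to the 80 ms case gets a quarter of the throughput the 20 ms case would allow.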

You might think that the world of IOT and mobile networks would be different than all of this; applications should know there is delay, and probably jitter, so they should learn to work around it, right? If you think this, you are missing a fundamental point from above: networks are a concession to human impatience, and humans are infinitely impatient. Humans do not much care what is between them and their data, they just want their data. Now!

The problem is, in the words of RFC 1925:

(2) No matter how hard you push and no matter what the priority, you can’t increase the speed of light. (2a) (corollary). No matter how hard you try, you can’t make a baby in much less than 9 months. Trying to speed this up *might* make it slower, but it won’t make it happen any quicker.

There has been a good bit of work in reducing latency in the last several years, however. Once hyperscalers started digging into the latency problem, it quickly became obvious to other kinds of operators that they needed to take a much harder look at latency and jitter. One result is a set of standards and ideas designed to help combat latency in mobile and fixed networks. Delivering Latency Critical Communication over the Internet catalogs and explains some of these efforts in handy IETF draft form.

Section 3 considers the components of latency, specifically—

  • processing delays
  • buffer delays
  • transmission delays
  • packet loss
  • propagation delays

The difference between propagation delays and transmission delays is this: one describes the speed of light through the cable or optical fiber, while the other describes the time required to clock a packet, which is held in a parallel buffer, onto a wire, which is (generally) a serial signaling channel.
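The two delays are easy to compare with a little arithmetic. In this sketch the signal velocity of 2×10^8 meters per second is a common rule of thumb for fiber (about two thirds of the speed of light in a vacuum), and the packet size, link speed, and distance are example values of my own choosing.

```python
def transmission_delay_s(packet_bytes, link_bps):
    """Time to clock a packet onto the wire: bits divided by link rate."""
    return packet_bytes * 8 / link_bps


def propagation_delay_s(distance_m, velocity_mps=2.0e8):
    """Time for a signal to travel the length of the cable or fiber."""
    return distance_m / velocity_mps


# Assumed example: a 1,500 byte packet on a 1 Gbit/s link across 1,000 km.
tx = transmission_delay_s(1500, 1e9)   # about 12 microseconds
prop = propagation_delay_s(1_000_000)  # about 5 milliseconds
```

A faster link shrinks the first number; nothing shrinks the second short of moving the endpoints closer together.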

Section 3 of this document explains some of the reasons low latency networking is needed; some of these might be quite surprising. For instance, massive data transfers from Internet of Things (IoT) devices, essentially machine to machine data transfer on a large scale, seem to require low latency support. If IoT is being used for near real time sensing, however, or unlocking the door to your house, this begins to make a little more sense. After all, when a bear is chasing you, there is some comfort in knowing there is a low latency path to the server that opens your house door.

Or… Perhaps this is another of those reasons just to stick with old fashioned keys.

The document goes on to describe the Key Performance Indicators (KPIs) and Key Quality Indicators (KQIs) for a number of different kinds of applications that might be running over a network. For instance, figure 4 in the draft describes the performance requirements for remote surgery.

Finally, in section 5, the document describes how the Path Computation Element (PCE) might be used to resolve some of these problems. This is an interesting use of PCE, as it would require a complete rethinking of the way telemetry is gathered off the network, and some modifications to the SPF calculation normally performed by PCE controllers. Since computing a path against two metrics at once has been proven to be intractable in the general case (such problems are technically NP-complete), careful design work needs to go into thinking through the problems of managing latency and bandwidth constraints at the same time.
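One common way around the two metric problem is to treat bandwidth as a pruning constraint rather than a second additive cost: drop every link that cannot carry the required bandwidth, then run an ordinary shortest path calculation on latency. The sketch below illustrates that simplification; it is not the mechanism the draft proposes, and the graph layout and function name are my own assumptions.

```python
import heapq


def constrained_shortest_path(graph, src, dst, min_bw):
    """Lowest-latency path using only links with at least min_bw bandwidth.

    graph maps node -> list of (neighbor, latency_ms, bandwidth_mbps).
    Returns (total_latency_ms, [nodes]) or None if no path qualifies.
    """
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == dst:
            break
        for nbr, latency, bw in graph.get(node, []):
            if bw < min_bw:
                continue  # prune links that fail the bandwidth constraint
            nd = d + latency
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return dist[dst], path[::-1]


# Assumed topology: a fast but thin path via B, a slower but fat path via C.
topology = {
    "A": [("B", 1, 10), ("C", 5, 100)],
    "B": [("D", 1, 10)],
    "C": [("D", 5, 100)],
    "D": [],
}
```

Asking for 1 Mbit/s from A to D returns the path through B; asking for 50 Mbit/s forces the longer path through C. This works because bandwidth is a bottleneck value rather than a sum; the hard case is two costs that both accumulate along the path.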

This is a solid draft providing a good introduction into the problems and solutions around low latency networking.

March 2018

DFS and Low Points

On a recent history of networking episode, Alia talked a little about Maximally Redundant Trees (MRTs), and the concept of

February 2018

On Breaking Things

On this short take over at the Network Collective, I talk about the importance of breaking things.

Enterprise versus Provider?

Two ideas that are widespread, and need to be addressed— FANG (read this hyper/web/large scale network operators) have very specific