Thinking about side channel attacks

When Cyrus wanted to capture Babylon, he attacked the river that flows through the city, drying it out and then sending his army under the walls through the river entrance and exit points. In a similar way, the ventilator is a movie favorite, used in both Lord of the Rings and Star Wars, probably along with a thousand other movies and stories throughout time. What do rivers and ventilators have to do with network security?

Side channel attacks. Now, I don’t know whether the attacks described in these papers, or Cyrus’ attack through the Euphrates, are properly considered side channel or just lateral, but either way: the most vulnerable point in your network is precisely the one where you assume you can’t be attacked, or the one where you haven’t thought through security at all. Two things I read this week reminded me of the importance of system-level thinking when it comes to security.

The first explores the Network Time Protocol (NTP), beginning with the general security of the protocol. Security in a time protocol is particularly difficult: the entire point of encryption is to use algorithms that take a long time for an attacker to work through, and there is probably some relationship between the cost of solving the problem as an attacker and the cost of solving it as a legitimate receiver. While you have the key in one case and not in the other, the harder the algorithm is for an attacker, the more work it is likely to impose on the receiver as well. The time lag introduced by heavier encryption probably makes it more difficult to accurately distribute time across a set of widely distributed devices. So encrypting NTP seems problematic in terms of its primary purpose.
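To make the timing argument concrete, here is a minimal sketch of the standard NTP offset and delay calculation (simplified from RFC 5905), with made-up timestamps. The point is that any processing cost landing on only one direction of the exchange, such as verifying a heavyweight signature before the request can be timestamped, shows up as apparent clock offset:

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    # Standard NTP offset/delay calculation (simplified from RFC 5905):
    # t1 = client transmit, t2 = server receive,
    # t3 = server transmit, t4 = client receive.
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay

# Two clocks that are actually in sync, 5 ms of network delay each way,
# 1 ms of server processing: the computed offset is (roughly) zero.
print(ntp_offset_and_delay(0.000, 0.005, 0.006, 0.011))   # ~(0.000, 0.010)

# Now suppose the server spends 20 ms verifying a heavyweight signature
# before it can timestamp the request. That cost lands on one direction of
# the exchange only, and the algorithm attributes half of it to clock
# offset: the client now "corrects" itself by ~10 ms even though neither
# clock has drifted.
print(ntp_offset_and_delay(0.000, 0.025, 0.026, 0.031))   # ~(0.010, 0.030)
```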

But what if we leave NTP exposed? According to this paper, a number of attacks become feasible once you’ve altered a device’s sense of time. For instance, TLS certificates (used to secure most of the web and email traffic in the world) are vulnerable to NTP attacks. DNSSEC uses time-based keys to authenticate DNS records; caching systems often use global timers to keep caches stored in different places consistent; various authentication systems use time to determine how long someone should remain logged in and when they should change their password… Even the RPKI is grounded in timers. Time, then, is an excellent side channel (or ventilator, if you like) for finding a way through systems otherwise hardened against attack.
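As one concrete example of why time makes such a good lever, here is a minimal sketch of the time check that sits inside every TLS certificate validation. The dates and the bare-bones check are hypothetical placeholders rather than any particular library’s logic, but the shape of the problem is the same: shift the clock and an expired certificate looks valid again.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical validity window, standing in for the notBefore/notAfter
# fields carried in any X.509 certificate.
not_before = datetime(2016, 1, 1, tzinfo=timezone.utc)
not_after = datetime(2016, 12, 31, tzinfo=timezone.utc)

def certificate_time_valid(now: datetime) -> bool:
    # A certificate is only trusted while the local clock falls inside its
    # validity window; everything else about the chain can be perfect.
    return not_before <= now <= not_after

# With an honest clock, this long-expired certificate fails validation.
honest_clock = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(certificate_time_valid(honest_clock))       # False

# If an attacker can drag the victim's clock backward via NTP, the same
# expired certificate (perhaps one whose private key has since leaked) is
# suddenly acceptable again.
shifted_clock = honest_clock - timedelta(days=365 * 8)
print(certificate_time_valid(shifted_clock))      # True
```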

The second is RFC 7739, Security Implications of Predictable Fragment Identification Values. In this case, the attack is against the fragment identification values used in IPv6. By attacking the fragment space, it’s apparently possible to cause sessions between two devices to fail consistently (a denial of service), to monitor telemetry information, or to inject information into a flow from some point in the middle. There are ways to prevent this sort of attack, of course (see the RFC), but it’s instructive nonetheless that here, under the covers of a protocol (or an abstraction, if you like), there is a set of operations very few engineers would think to check on.
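To see why predictable identifiers matter, here is a rough sketch contrasting a naive global counter with a per-destination, hash-offset scheme loosely modeled on the mitigations described in the RFC. The helper names and the specific hash are my own placeholders, not any particular stack’s implementation.

```python
import os
import hashlib
from itertools import count

# A naive, globally incrementing Fragment Identification value: every
# fragmented packet, toward any destination, takes the next counter value.
# An off-path attacker who can see (or elicit) a few fragments can predict
# the values used toward other hosts, and from there collide or forge
# fragments; this is the problem RFC 7739 describes.
_global_counter = count(0x1000)

def predictable_frag_id(dst: str) -> int:
    return next(_global_counter) & 0xFFFFFFFF   # IPv6 fragment IDs are 32 bits

# Loosely modeled on the hash-based schemes in the RFC: derive a
# per-destination offset from a local secret, so the sequence used toward
# one destination says nothing about the sequence used toward another.
_secret = os.urandom(16)
_per_dst = {}

def unpredictable_frag_id(dst: str) -> int:
    offset = int.from_bytes(
        hashlib.sha256(_secret + dst.encode()).digest()[:4], "big")
    _per_dst[dst] = _per_dst.get(dst, 0) + 1
    return (offset + _per_dst[dst]) & 0xFFFFFFFF

for dst in ("2001:db8::1", "2001:db8::1", "2001:db8::2"):
    print(dst, hex(predictable_frag_id(dst)), hex(unpredictable_frag_id(dst)))
```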

So, what’s the point?

The point is this: networks are systems made up of individual components pulled together through interaction surfaces. While abstraction is a useful tool for focusing on a single problem at a single point in time, it’s also important to step back and look at the network as a whole system, paying particular attention to the ventilators, in order to really assess security and threat levels.

Don’t wait until an Orc destroys your wall with a well-placed explosive charge, or an X-wing fighter drops a torpedo down an unprotected shaft.

Engineering Lessons, IPv6 Edition

Yes, we really are going to reach a point where the RIRs will run out of IPv4 addresses. As this chart from Geoff’s blog shows —

[Chart: IPv4 address exhaustion, from Geoff Huston’s blog]

Why am I thinking about this? Because I ran across a really good article by Geoff Huston over at potaroo about the state of the IPv4 address pool at APNIC. The article is a must read, so stop right here, right click on this link, open it in a new tab, read it, and then come back. I promise this blog isn’t going anyplace while you’re over on Geoff’s site. But my point isn’t to ring the alarm bells on the IPv4 situation. Rather, I’m more interested in how we got here in the first place. Specifically, why has it taken so long for the networking industry to adopt IPv6?

Inertia is a tempting answer, but I’m not certain I buy it as the sole reason for the lack of deployment. IPv6 was developed some fifteen years ago; since then we’ve deployed tons of new protocols, tons of new networking gear, and plenty of other things. Remember what a cell phone looked like fifteen years ago? In fact, if we’d started fifteen years ago with simple dual-mode devices, we could easily be fully deployed on IPv6 today. As it is, we’re really just starting now.

We didn’t see a need? Perhaps, but that’s difficult to maintain as well. When IPv6 was originally developed (remember, fifteen years ago), we all knew there was an addressing problem. I suspect there’s another reason.

I suspect that IPv6, in its original form, tried to boil the ocean, and the result might have been too much change, too fast, for the networking community to handle in such a fundamental area of the stack. What engineering lessons might we draw from the long time scale of IPv6 deployment?

For those who weren’t in the industry those many years ago, there were several drivers behind IPv6 beyond the need for more address space. For instance, the entire world exploded with “no more NATs.” In fact, many engineers, to this day, still dislike NATs, and see IPv6 as a “solution” to the NAT “problem.” Mailing lists roiled with long discussions about NAT, security by obscurity (I’m still waiting for someone who strongly believes obscurity is useless to step onto a modern battlefield with a state-of-the-art armor system painted bright orange), and a thousand other topics. Then there was the observation that ARP really isn’t all that efficient, so let’s do something a little different and create an entirely new neighbor discovery system. And then there’s that whole fragmentation issue we’ve been dealing with in IPv4 for all these years. And…

Part of the reason it’s taken so long to deploy IPv6, I think, is because it’s not just about expanding the address space. IPv6, for various reasons, has tried to address every potential failing ever found in IPv4.

Don’t miss my point here. The design and engineering decisions made for IPv6 are generally solid. But all of us (and I include myself here) tend to focus too much on building that practically perfect protocol, rather than building something that is “good enough,” with stretchy spots where obvious change can be made in the future.

In this specific case, we might have passed too easily over one question: how easy will this be to deploy in the real world? I’m not saying there weren’t discussions around this very topic, but the general answer was, “we have fifteen years to deploy this stuff.” And yet… here we are fifteen years later, still trying to convince people to deploy it. Maybe a bit of honest reflection would be useful just about now.

I’m not saying we shouldn’t deploy IPv6. Rather, I’m saying we should try and take a lesson from this — a lesson in engineering process. We needed, and need, IPv6. We probably didn’t need the NAT wars. We needed, and need, IPv6. But we probably didn’t need the wars over fragmentation.

What we, as engineers, tend to do is build solutions that are complete, total, self-contained, and practically perfect. What we, as engineers, should do is build platforms that are flexible, usable, and able to support a lot of different needs. Being a perfectionist isn’t just something you say in an interview to answer that one dumb question about your greatest weakness. Sometimes you (we, really) do need to learn to stop what we’re doing, take a look around, and ask: why are we doing this?