We normally encounter four different kinds of addresses in an IP network. We tend to assign specific purposes to each one. There are other address-like things, of course, such as the protocol number, a router ID, an MPLS label, etc. But let’s stick to these four for the moment. Looking through this list, the first thing you should notice is we often use the IP address as if it identified a host—which is generally not a good thing. There have been some efforts in the past to split the locator from the identifier, but the IP protocol suite was designed with a separate locator and identifier already: the IP address is the location and the DNS name is the identifier.
If you haven’t found the trade-offs, you haven’t looked hard enough.
A perfect illustration is the research paper under review, Securing Linux with a Faster and Scalable Iptables. Before diving into the paper, however, some background might be good. Consider the situation where you want to filter traffic being transmitted to and by a virtual workload of some kind, as shown below.
To move a packet from the user space into the kernel, the packet itself must be copied into some form of memory that processes on “both sides of the divide” can read, then the entire state of the process (memory, stack, program execution point, etc.) must be pushed into a local memory space (stack), and control transferred to the kernel. This all takes time and power, of course.
One of the recurring myths of IPv6 is its very large address space somehow confers a higher degree of security. The theory goes something like this: there is so much more of the IPv6 address space to test in order to find out what is connected to the network, it would take too long to scan the entire space looking for devices. The first problem with this myth is it simply is not true—it is quite possible to scan the entire IPv6 address space rather quickly, probing enough addresses to perform a tree-based search to find attached devices. The second problem is this assumes the only modes of attack available in IPv4 will directly carry across to IPv6. But every protocol has its own set of tradeoffs, and therefore its own set of attack surfaces.
A few weeks ago, I was in the midst of a conversation about EVPNs, how they work, and the use cases for deploying them, when one of the participants exclaimed: “This is so complicated… why don’t we stick with the older way of doing things with multi-chassis link aggregation and virtual chassis device?” Sometimes it does seem like we create complex solutions when a simpler solution is already available. Since simpler is always better, why not just use them? After all, simpler solutions are easier to understand, which means they are easier to deploy and troubleshoot.
The problem is we too often forget the other side of the simplicity equation—complexity is required to solve hard problems and adapt to demanding environments. While complex systems can be fragile (primarily through ossification), simple solutions can flat out fail just because they can’t cope with changes in their environment.
One “sideways” place to look for value in the network is in a place that initially seems far away from infrastructure, data gravity. Data gravity is not something you might often think about directly when building or operating a network, but it is something you think about indirectly. For instance, speeds and feeds, quality of service, and convergence time are all three side effects, in one way or another, of data gravity.
As with all things in technology (and life), data gravity is not one thing, but two, one good and one bad—and there are tradeoffs. Because if you haven’t found the tradeoffs, you haven’t looked hard enough. All of this is, in turn, related to the CAP Theorem.
Data gravity is, first, a relationship between applications and data location.
A long while back now, Daniel Dib and I put together a collection of blog posts and new material, and released the collection as Unintended Features. Yes, this little book needs a serious update with more recent material, but … Anyway, after setting things up so you could purchase electronic copies on Amazon, things went well for a while.
Until Amazon decided I had violated the copyright on the material published on our blogs by republishing some of the same material in a book form. It’s not that anyone actually investigated if the copyright holders on the material were the same people, it was just assumed that the same material being in two different places at the same time must be a copyright violation. After I received the first take-down notification, I patiently wrote an explanation of the situation, and the book was restored. I received another take-down notice a week or so later, to which I also responded. And another a week or so after that, then two more on a single day a bit later, finally receiving a dozen or so on one day a month or two after receiving the initial notice.
At this point, I gave up. Unintended Features is no longer available on Amazon, though it is still available here.
What brought this to mind is this—I received another take-down notice today, this time for violating the copyright on a set of slides I shared on Slideshare. Specifically, an old set of slides for a presentation called How the Internet Really Works.
Two things which seem to be universally true in the network engineering space right this moment. The first is that network engineers are convinced their jobs will not exist or there will only be network engineers “in the cloud” within the next five years. The second is a mad scramble to figure out how to add value to the business through the network. These two movements are, of course, mutually exclusive visions of the future. If there is absolutely no way to add value to a business through the network, then it only makes sense to outsource the whole mess to a utility-level provider.
The result, far too often, is for the folks working on the network to run around like they’ve been in the hot aisle so long that your hair is on fire. This result, however, somehow seems less than ideal.
A post on Martin Fowler’s blog this week started me thinking about lock-in—building a system that only allows you to purchase components from a single vendor so long as the system is running. The point of Martin’s piece is that lock-in exists in all systems, even open source, and hence you need to look at lock-in as a set of tradeoffs, rather than always being a negative outcome. Given that lock-in is a tradeoff, and that lock-in can happen regardless of the systems you decide to deploy, I want to go back to one of the foundational points Martin makes in his post and think about avoiding lock-in a little differently than just choosing between open source and vendor-based solutions.
If cannot avoid lock-in either by choosing a vendor-based solution or by choosing open source, then you have two choices. The first is to just give up and live with the results of lock-in. In fact, I have worked with a lot of companies who have done just this—they have accepted that lock-in is just a part of building networks, that lock-in results in a good transfer of risk to the vendor from the operator, or that lock-in results in a system that is easier to deploy and manage.
Giving in to lock-in, though, does not seem like a good idea on the surface, because architecture is about creating opportunity. If you cannot avoid lock-in and yet lock-in is antithetical to good architecture, what are your other options?
For any field of study, there are some mental habits that will make you an expert over time. Whether you are an infrastructure architect, a network designer, or a network reliability engineer, what are the habits of mind those involved in the building and operation of networks follow that mark out expertise?
Experts involve the user
Experts don’t just listen to the user, they involve the user. This means taking the time to teach the developer or application owner how their applications interact with the network, showing them how their applications either simplify or complicate the network, and the impact of these decisions on the overall network.
Experts think about data
Rather than applications. What does the data look like? How does the business use the data? Where does the data need to be, when does it need to be there, how often does it need to go, and what is the cost of moving it? What might be in the data that can be harmful? How can I protect the data while at rest and in flight?
Backscatter is often used to detect various kinds of attacks, but how does it work? The paper under review today, Who Knocks at the IPv6 Door, explains backscatter usage in IPv4, and examines how effectively this technique might be used to detect scanning of IPv6 addresses, as well. Scanning the IPv6 address space is much more difficult because there are 2128 addresses rather than 232. The paper under review here is one of the first attempts to understand backscatter in the IPv6 address space, which can lead to a better understanding of the ways in which IPv6 scanners are optimizing their search through the larger address space, and also to begin understanding how backscatter can be used in IPv6 for many of the same purposes as it is in IPv4.
Kensuke Fukuda and John Heidemann. 2018. Who Knocks at the IPv6 Door?: Detecting IPv6 Scanning. In Proceedings of the Internet Measurement Conference 2018 (IMC ’18). ACM, New York, NY, USA, 231-237. DOI: https://doi.org/10.1145/3278532.3278553