Raise your hand if you think moving to platform as a service or infrastructure as a service is all about saving money. Raise it if you think moving to “the cloud” is all about increasing business agility and flexibility.
Put your hand down. You’re wrong.
Let’s be honest. For the last twenty years we network engineers have specialized in building extremely complex systems and formulating the excuses required when things don’t go right. We’ve specialized in saying “yes” to every requirement (or even wish) because we think that by saying “yes” we will become indispensable. Rather than building platforms on which the business can operate, we’ve built artisanal, complex, pets that must be handled carefully lest they turn into beasts that devour time and money. You know, like the person who tries to replicate store-bought chips by purchasing expensive fryers and potatoes, and ends up just making a mess out of the kitchen?
If you are looking for a good resolution for 2020 still (I know, it’s a bit late), you can’t go wrong with this one: this year, I will focus on making the networks and products I work on truly simpler. . . We need to go beyond just figuring out how to make the user interface simpler, more “intent-driven,” automated, or whatever it is. We need to think of the network as a system, rather than as a collection of bits and bobs that we’ve thrown together across the years. We need to think about the modules horizontally and vertically, think about how they interact, understand how each piece works, understand how each abstraction leaks, and be able to ask hard questions.
Yep, it’s that time of year when everyone does “retrospective pieces…” So… why not? There were several notable events this year—first and foremost, I kicked off a new podcast called the Hedge for network engineers. It’s probably not going to make anyone’s “top ten list of must listen to podcasts” anytime soon (if ever), but it’s been a lot of fun to move out of the commercial podcast space and just talk about “whatever seems interesting.” The History of Networking podcast also became independent this year; we are chugging along at more than 60 episodes, and there are a lot of great guests yet to come.
On the personal front, I moved from LinkedIn to Juniper Networks, and made some progress at school. I have finished my coursework and passed my comprehensive exams, so I’m now a PhD candidate, or as it is more commonly known, ABD.
Rule11 has, as a blog, had a good year. The most popular posts were:
The state of automation among enterprise operators has been a matter of some interest this year, with several firms undertaking studies of the space. Juniper, for instance, recently released the first yearly edition of the SONAR report, which surveyed many network operators to set a baseline for a better future understanding of how automation is being used. Another recent report in this area is Enterprise Network Automation for 2020 and Beyond, conducted by Enterprise Management Associates.
While these reports are, themselves, interesting for understanding the state of automation in the networking world, one correlation noted on page 13 of the EMA report caught my attention: “Individuals who primarily engage with automation as users are less likely to fully trust automation.” This observation is set in parallel with two others on that same page: “Enterprises that consider network automation a high priority initiative trust automation more,” and “Individuals who fully trust automation report significant improvement in change management capacity.” It seems somewhat obvious these three are related in some way, but how? The answer to this, I think, lies in the relationship between the person and the tool.
We normally encounter four different kinds of addresses in an IP network. We tend to assign specific purposes to each one. There are other address-like things, of course, such as the protocol number, a router ID, an MPLS label, etc. But let’s stick to these four for the moment. Looking through this list, the first thing you should notice is we often use the IP address as if it identified a host—which is generally not a good thing. There have been some efforts in the past to split the locator from the identifier, but the IP protocol suite was designed with a separate locator and identifier already: the IP address is the location and the DNS name is the identifier.
If you haven’t found the trade-offs, you haven’t looked hard enough.
A perfect illustration is the research paper under review, Securing Linux with a Faster and Scalable Iptables. Before diving into the paper, however, some background might be good. Consider the situation where you want to filter traffic being transmitted to and by a virtual workload of some kind, as shown below.
To move a packet from the user space into the kernel, the packet itself must be copied into some form of memory that processes on “both sides of the divide” can read, then the entire state of the process (memory, stack, program execution point, etc.) must be pushed into a local memory space (stack), and control transferred to the kernel. This all takes time and power, of course.
One of the recurring myths of IPv6 is its very large address space somehow confers a higher degree of security. The theory goes something like this: there is so much more of the IPv6 address space to test in order to find out what is connected to the network, it would take too long to scan the entire space looking for devices. The first problem with this myth is it simply is not true—it is quite possible to scan the entire IPv6 address space rather quickly, probing enough addresses to perform a tree-based search to find attached devices. The second problem is this assumes the only modes of attack available in IPv4 will directly carry across to IPv6. But every protocol has its own set of tradeoffs, and therefore its own set of attack surfaces.
A few weeks ago, I was in the midst of a conversation about EVPNs, how they work, and the use cases for deploying them, when one of the participants exclaimed: “This is so complicated… why don’t we stick with the older way of doing things with multi-chassis link aggregation and virtual chassis device?” Sometimes it does seem like we create complex solutions when a simpler solution is already available. Since simpler is always better, why not just use them? After all, simpler solutions are easier to understand, which means they are easier to deploy and troubleshoot.
The problem is we too often forget the other side of the simplicity equation—complexity is required to solve hard problems and adapt to demanding environments. While complex systems can be fragile (primarily through ossification), simple solutions can flat out fail just because they can’t cope with changes in their environment.
One “sideways” place to look for value in the network is in a place that initially seems far away from infrastructure, data gravity. Data gravity is not something you might often think about directly when building or operating a network, but it is something you think about indirectly. For instance, speeds and feeds, quality of service, and convergence time are all three side effects, in one way or another, of data gravity.
As with all things in technology (and life), data gravity is not one thing, but two, one good and one bad—and there are tradeoffs. Because if you haven’t found the tradeoffs, you haven’t looked hard enough. All of this is, in turn, related to the CAP Theorem.
Data gravity is, first, a relationship between applications and data location.
A long while back now, Daniel Dib and I put together a collection of blog posts and new material, and released the collection as Unintended Features. Yes, this little book needs a serious update with more recent material, but … Anyway, after setting things up so you could purchase electronic copies on Amazon, things went well for a while.
Until Amazon decided I had violated the copyright on the material published on our blogs by republishing some of the same material in a book form. It’s not that anyone actually investigated if the copyright holders on the material were the same people, it was just assumed that the same material being in two different places at the same time must be a copyright violation. After I received the first take-down notification, I patiently wrote an explanation of the situation, and the book was restored. I received another take-down notice a week or so later, to which I also responded. And another a week or so after that, then two more on a single day a bit later, finally receiving a dozen or so on one day a month or two after receiving the initial notice.
At this point, I gave up. Unintended Features is no longer available on Amazon, though it is still available here.
What brought this to mind is this—I received another take-down notice today, this time for violating the copyright on a set of slides I shared on Slideshare. Specifically, an old set of slides for a presentation called How the Internet Really Works.