On Using the Right Word

A while back, I was sitting in a meeting where the presenter described switching from a “traditional, hierarchical data center fabric” to a spine-and-leaf (while drawing CLOS, in all capital letters, on the whiteboard). He pointed out that the spine-and-leaf design is simpler because it only has two tiers rather than three.

There is so much wrong with this I almost winced in physical pain. Traditional hierarchical designs are not fabrics. Spine-and-leaf fabrics are not CLOS, but Clos, fabrics. Clos fabrics have three stages, not two, even if we draw them “folded” so you only see two apparent levels to the fabric. In fact, spine-and-leaf fabrics always have an odd number of stages, and they are stages, not tiers.
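One way to see the odd stage count is to unfold the drawing and count the devices a packet crosses on the longest edge-to-edge path. The sketch below is only an illustration of that counting argument, assuming a symmetric folded fabric; the function names `unfolded_path` and `stages_for_levels` are made up for this example.

```python
def unfolded_path(levels: int) -> list[int]:
    """Return the drawn level of each device crossed on the longest
    edge-to-edge path through a folded fabric with `levels` drawn levels.

    Level 1 is the leaf layer; the top level is where traffic turns around.
    """
    if levels < 1:
        raise ValueError("a fabric needs at least one level")
    up = list(range(1, levels + 1))        # leaf ... top of the fabric
    down = list(range(levels - 1, 0, -1))  # back down to a leaf
    return up + down


def stages_for_levels(levels: int) -> int:
    """Stages in the unfolded fabric: the path length, which is 2 * levels - 1."""
    return len(unfolded_path(levels))


if __name__ == "__main__":
    for levels in (2, 3, 4):
        print(levels, "drawn levels:", stages_for_levels(levels), "stages")
```

The two levels on the whiteboard unfold into leaf, spine, leaf, which is three stages; adding a drawn level always adds two stages, so the count stays odd.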

More recently, I heard someone talking about an operating system that was built using microservices. I thought, “that would be a neat trick.” To build something with microservices does not just mean a piece of software using modules; that would be modular application (or operating system) design. Microservices architectures break the application up into the most basic components possible and then scale each kind of component out (rather than up) by spinning up new copies of each service as needed. I cannot imagine scaling an operating system out by spinning up multiple copies of the same service, and then providing some sort of way to spread load across the various copies. Would you have some sort of anycast IPC? An internal DNS server or load balancer?

You can have an OS that natively participates in a larger microservices-based architecture, but what would microservices within the operating system look like, precisely?

Maybe my recent studies in philosophy make me much more attuned to the way we use language in the network engineering world—or maybe I’m just getting old. Whatever it is, our determination to make every word mean everything is driving me nuts.

What is the difference between a router and a switch? There used to be a simple definition—routers rewrite the L2 header and switches don’t. But now that routers switch packets, and switches route packets, the only difference seems to be … buffer depth? Feature set? The line between router and switch is fuzzy to the point of being meaningless, leaving us with no real term to describe a real switch any longer (a device that doesn’t do routing).

What about software defined networks? We’ve been treated to software defined everything now, of course. And intent? I get the point of intent, but we’re already moving down the path of making the meaning so broad that it can even contain configuring the CLI on an old AGS+. And don’t get me started on artificial intelligence, which is often used to describe something closer to machine learning. Of course, machine learning is often used to describe things that are really nothing more than statistical inference.

Maybe it’s time for a general rebellion against the sloppy use of language in network engineering. Or maybe I’m just tilting at yet another windmill. Wake me up when we’ve gotten to the point that we can use any word interchangeably with any other word in the network engineering dictionary. I await the AI that routes packets by reading your mind (through intent) called a swouter… or something.

The Hedge 73: Daniel Teycheney and Open Source in Networking

Combining, or stitching together, open source projects to build something unique for your network is becoming more common. What does this look like in the real world? What are some of the positive and negative aspects of building things this way? How do open source projects interact with the commercial world? Daniel Teycheney joins Tom Ammon, Jeff Tantsura, and Russ White to discuss open source software in networking, particularly around network monitoring and management.


Rethinking BGP on the DC Fabric (part 5)

BGP is widely used as an IGP in the underlay of modern DC fabrics. This series argues this is not the best long-term solution to the problem of routing in fabrics because BGP is not ideal for this use case. This post will consider the potential harm we are doing to the larger Internet by pressing BGP into a role it was not originally designed to fulfill—an underlay protocol or an IGP.

My last post described the kinds of configuration required to make BGP work on a DC fabric—it turns out that the configuration of each BGP speaker on the fabric is close to unique. It is possible to automate configuring each speaker—but it would be better if we could get closer to autonomic operation.

To move BGP closer to autonomic operation in a DC fabric, there are several things we can do. First, we can allow a BGP speaker to peer with any other BGP speaker it receives an open message from—this is often called promiscuous mode. While each router in the fabric will still need to be configured with the right autonomous system, at least we won’t need to configure the correct peers on each router (including the remote AS).

Note, however, that using this kind of promiscuous peering does come with a set of tradeoffs (if you’re reading this blog, you know there will be tradeoffs). BGP speakers running in promiscuous mode open a large attack surface on the control plane of the network. We can close this attack surface by configuring authentication on all BGP speakers … but we are now adding complexity to reduce complexity. We could also reduce the scope of the attack surface by never permitting BGP to peer beyond a single hop, and then filtering all BGP packets at the fabric edge. Again, just a bit more complexity to manage—but remember that the road to highly fragile and complex systems is always paved with individual steps that never, on their own, seem to add “too much complexity.”

The second thing we can do to move BGP closer to autonomic operation is to advertise routes to every connected peer without any policy configured. This does, again, introduce some tradeoffs, particularly in the realm of security, but let’s leave that aside for the moment.

Assume we can create a version of BGP that has these modifications—it always accepts any peer from any other AS, and it advertises all routes without any policy configured. Put these features behind a single knob which also includes setting the MRAI to 0 or 1, tightens up the dampening parameters, and adjusts a few other things to make BGP work better in a DC fabric.

As an experiment, let’s enable this DC fabric knob on a BGP speaker at the edge of a dual-homed “enterprise customer.” What will happen?

The enterprise network will automatically peer to any speaker that sends an open message—a huge security hole on the open Internet—and it will advertise every route it learns even though there is no policy configured. This second issue—advertising routes with no policy configured—can cause the enterprise network to become a transit between two much larger provider networks, crashing out some small corner of the Internet.
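The transit problem can be sketched with a toy model. This is not how any real BGP implementation is structured; the `Speaker` class, the `readvertise_learned` flag (standing in for the hypothetical DC fabric knob), the private-use AS numbers, and the documentation prefix are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Speaker:
    """A toy BGP speaker: one AS, a list of peers, and a prefix -> AS path table."""
    asn: int
    readvertise_learned: bool = False  # True models "advertise everything, no policy"
    peers: list["Speaker"] = field(default_factory=list)
    rib: dict[str, list[int]] = field(default_factory=dict)

    def originate(self, prefix: str) -> None:
        self.rib[prefix] = []
        self._advertise(prefix, learned_from=None)

    def receive(self, prefix: str, as_path: list[int], sender: "Speaker") -> None:
        if self.asn in as_path:
            return  # AS path loop prevention
        if prefix in self.rib and len(self.rib[prefix]) <= len(as_path):
            return  # keep the path already installed
        self.rib[prefix] = as_path
        self._advertise(prefix, learned_from=sender)

    def _advertise(self, prefix: str, learned_from: "Speaker | None") -> None:
        # Locally originated routes are always sent; learned routes only if allowed.
        if learned_from is not None and not self.readvertise_learned:
            return
        for peer in self.peers:
            if peer is not learned_from:
                peer.receive(prefix, [self.asn] + self.rib[prefix], self)


def build(knob_enabled: bool) -> tuple[Speaker, Speaker, Speaker]:
    """Dual-homed enterprise (AS 65100) between two providers (AS 65001, AS 65002)."""
    provider_a = Speaker(asn=65001, readvertise_learned=True)
    provider_b = Speaker(asn=65002, readvertise_learned=True)
    enterprise = Speaker(asn=65100, readvertise_learned=knob_enabled)
    for provider in (provider_a, provider_b):
        enterprise.peers.append(provider)
        provider.peers.append(enterprise)
    return provider_a, provider_b, enterprise


if __name__ == "__main__":
    for knob in (False, True):
        a, b, _ = build(knob)
        a.originate("203.0.113.0/24")
        print("knob enabled:", knob, "provider B sees:", b.rib.get("203.0.113.0/24"))
```

With the knob off, provider B never hears provider A's prefix through the enterprise. With it on, provider B installs a path through AS 65100, which is exactly the accidental transit described above.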

This might seem like a trivial issue. After all, just don’t ever enable the DC fabric knob on an eBGP peering session upstream into the DFZ, or any other “real” internetwork. Sure, and just don’t ever hit the brakes when you mean to hit the accelerator, or the accelerator when you mean to hit the brakes. If I had a dime for every time we “just don’t ever make that mistake …” Well, I wouldn’t be blogging, I’d be relaxing in the sun someplace (okay, I’m not likely to ever stop working to sit around and “relax” all the time, but you get the picture anyway).

Maybe—just maybe—it would really be better overall to use two different protocols for IGP and EGP work. Maybe—just maybe—it’s better not to mix these two different kinds of functions in a single protocol. Not only is the single resulting protocol bound to be really complex (most BGP implementations are now over 100,000 lines of code, after all), but it will end up being really easy to make really bad mistakes.

No tool is omnicompetent. If you found a tool that was, in fact, omnicompetent, it would also be the most dangerous tool in your toolbox.

The Hedge 72: Lisa Caywood and Marketectures

The open source world is not much different than the commercial world in terms of building marketectures rather than usable software, largely because open source projects still rely on sources of funding and material support to build and maintain a product. Many times, however, the focus on these marketectures gets in the way of real work. Join Tom Ammon, Russ White, and Lisa Caywood as we discuss the problem of marketectures and the broader world of open source software.


Technologies that Didn’t: Directory Services

One of the most important features of the Network Operating Systems, like Banyan Vines and Novell Netware, available in the middle of the 1980s was their integrated directory system. These directory systems allowed for the automatic discovery of many different kinds of devices attached to a network, such as printers, servers, and computers. Printers, of course, were the important item in this list, because printers have always been the bane of the network administrator’s existence. An example of one such system, an early version of Active Directory, is shown in the illustration below.

Users, devices, and resources, such as file mounts, were stored in a tree. The root of the tree was (generally) the organization. There were Organizational Units (OUs) under this root. Users and devices could belong to an OU, and be given access to devices and services in other OUs through a fairly simple drag and drop, or GUI based checkbox style interface. These systems were highly developed, making it fairly easy to find any sort of resource, including email addresses of other users in the organization, services such as shared filers, and, yes, even printers.
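The tree structure is easy to picture in code. What follows is a minimal sketch, not the data model of any actual product: the `Entry` class, the `find` helper, the `O`/`OU`/`CN` labels, and every name and address in the example are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One node in the directory tree: an organization, an OU, or a leaf object."""
    kind: str  # "O" (organization), "OU" (organizational unit), or "CN" (leaf object)
    name: str
    attrs: dict[str, str] = field(default_factory=dict)
    children: list["Entry"] = field(default_factory=list)

    def add(self, child: "Entry") -> "Entry":
        self.children.append(child)
        return child


def find(node: Entry, name: str, ancestors: tuple[str, ...] = ()) -> list[tuple[str, dict[str, str]]]:
    """Walk the tree and return (path, attributes) for every entry called `name`.

    The path lists the most specific element first, then each enclosing container.
    """
    here = (f"{node.kind}={node.name}",) + ancestors
    hits = [(",".join(here), node.attrs)] if node.name == name else []
    for child in node.children:
        hits.extend(find(child, name, here))
    return hits


if __name__ == "__main__":
    org = Entry("O", "Example Corp")
    accounting = org.add(Entry("OU", "Accounting"))
    accounting.add(Entry("CN", "Pat Smith", {"mail": "pat@example.com"}))
    accounting.add(Entry("CN", "Printer 2", {"location": "second floor"}))
    print(find(org, "Pat Smith"))
```

Looking up a person or a printer is just a walk down from the organization through the right OU, which is what the GUI tools were doing on the user's behalf.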

The original system of this kind was Banyan’s StreetTalk, which did not have the depth or expressiveness of later systems, like the one shown above from Windows NT, or Novell’s Directory Services. A similar system existed in another network operating system called LANtastic, which was never really widely deployed (although I worked on a LANtastic system in the late 1980s).

The usual “pitch” for deploying these systems was the ease of access control they brought into the organization from the administration side, along with the ease of finding resources from the user’s perspective. Suppose you were sitting at your desk, and needed to know who over in some other department, say accounting, you could contact about some sort of problem, or idea. If you had one of these directory services up and running, the solution was simple: open the directory, look for the accounting OU within the tree, and look for a familiar name. Once you have found them, you could send them an email, find their phone number, or even—if you had permission—print a document at a printer near their desk for them to pick up. Better than a FAX machine, right?

What if you had multiple organizations who needed to work together? Or you really wanted a standard way to build these kinds of directories, rather than being required to run one of the network operating systems that could support such a system? There were two industry wide standards designed to address these kinds of problems: LDAP and X.500.

The OUs, CNs, and other elements shown in the illustration above are actually an expression of the X.500 directory system. As X.500 was standardized starting in the mid-1990s, these network operating systems changed their native directory systems to match the X.500 schema. The ultimate goal was to make these various directory services interoperate through X.500 connectors.

Given all this background, what happened to these systems? Why aren’t these kinds of directories widely available today? While there are many reasons, two of these stand out.

First, these systems are complex and heavy. Their complexity made them very hard to code and maintain; I can well remember working on a large Netware Directory Service deployment where objects fell into the wrong place on a regular basis, drive mapping did not work correctly, and objects had to be deleted and recreated to force their permissions to reset.

Large, complex systems tend to be unstable in unpredictable ways. One lesson the information technology world has not learned across the years is that abstraction is not enough; the underlying systems themselves must be simplified in a way that makes the abstraction more closely resemble the underlying reality. Abstraction can cover problems up as easily as it can solve problems.

Second, these systems fit better in a world of proprietary protocols and network operating systems than into a world of open protocols. The complexity driven into the network by trying to route IP, Novell’s IPX, Banyan’s VIP, DECnet, Microsoft’s protocols, Apple’s protocols, etc., made building and managing networks ever more complex. Again, while the interfaces were pretty abstractions, the underlying network was reminiscent of a large bowl of spaghetti. There were even attempts to build IPX/VIP/IP packet translators so a host running Vines could communicate with devices on the then nascent global Internet.

Over time, the simplicity of IP, combined with the complexity and expense of these kinds of systems, drove them from the scene. Some remnants live on in the directory structure contained in email and office software packages, but they are a shadow of StreetTalk, NDS, and the Microsoft equivalent. The more direct descendants of these systems are single sign-on and OAuth systems that allow you to use a single identity to log into multiple places.

But the primary function of finding things, rather than authenticating them, has long been left behind. Today, if you want to know someone’s email address, you look them up on your favorite social media network. Or you don’t bother with email at all.

Rethinking BGP on the DC Fabric (part 4)

Before I continue, I want to remind you what the purpose of this little series of posts is. The point is not to convince you to never use BGP in the DC underlay ever again. There’s a lot of BGP deployed out there, and there are a lot of tools that assume BGP in the underlay. I doubt any of that is going to change. The point is to make you stop and think!

Why are we deploying BGP in this way? Is this the right long-term solution? Should we, as a community, be rethinking our desire to use BGP for everything? Are we just “following the crowd” because … well … we think it’s what the “cool kids” are doing, or because “following the crowd” is what we always seem to do?

In my last post, I argued that BGP converges much more slowly than the other options available for the DC fabric underlay control plane. The pushback I received was two-fold. First, the overlay converges fast enough; the underlay convergence time does not really factor into overall convergence time. Second, there are ways to fix things.

If the first pushback is always true—the speed of the underlay control plane convergence does not matter—then why have an underlay control plane at all? Why not just use a single, merged, control plane for both underlay and overlay? Or … to be a little more shocking, if the speed at which the underlay control plane converges does not matter, why not just configure the entire underlay using … static routes?

The reason we use a dynamic underlay control plane is because we need this foundational connectivity for something. So long as we need this foundational connectivity for something, then that something is always going to be better if it is faster rather than slower.

The second pushback is more interesting. Essentially, because we work on virtual things rather than physical ones, just about anything can be adapted to serve any purpose. I can, for instance, replace BGP’s bestpath algorithm with Dijkstra’s SPF, and BGP’s packet format with a more straightforward TLV format emulating a link-state protocol, and then say, “see, now BGP looks just like a link-state protocol … I made BGP work really well on a DC fabric.”

Yes, of course you can do these things. Somewhere along the way we became convinced that we are being really clever when we adapt a protocol to do something it wasn’t designed to do, but I’m not certain this is a good way of going about building reliable systems. 

Okay, back to the point … the next reason we should rethink BGP on the DC fabric is because it is complex to configure when it’s being used as an IGP. In my last post, when discussing the configuration required to make BGP converge, I noted AS numbers and AS Path filters must be laid out in a very specific way, following where each device is located in the fabric. The MRAI must be taken down to some minimum on every device (either 0 or 1 second), and individual peers must be configured.

Further, if you are using a version of BGP that follows the IETF’s BCPs for the protocol, you must configure some sort of filter (generally a permit all) to get a BGP speaker to advertise anything to an eBGP peer. If you’re using iBGP, you need to configure route reflectors and tell BGP to advertise multiple paths.
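To get a feel for how much per-device variation this implies, here is a rough sketch that generates the eBGP parameters for a small spine-and-leaf fabric. The AS layout (one shared AS for the spines, one AS per leaf), the AS numbers, the device names, and the `fabric_bgp_parameters` function are all assumptions for illustration; they are not taken from the earlier post or from any vendor's configuration model.

```python
import json


def fabric_bgp_parameters(spines, leaves, spine_asn=65000, first_leaf_asn=65101, mrai_seconds=0):
    """Return a dict of per-speaker BGP parameters for a two-level fabric.

    Assumed layout: every spine shares `spine_asn`; each leaf gets its own AS.
    """
    spine_names = [f"spine{i}" for i in range(1, spines + 1)]
    leaf_asns = {f"leaf{i}": first_leaf_asn + i - 1 for i in range(1, leaves + 1)}
    configs = {}
    for name in spine_names:
        configs[name] = {
            "asn": spine_asn,
            "mrai_seconds": mrai_seconds,
            "peers": dict(leaf_asns),        # each spine peers with every leaf
            "export_filter": "permit-all",   # needed before anything is advertised
        }
    for name, asn in leaf_asns.items():
        configs[name] = {
            "asn": asn,
            "mrai_seconds": mrai_seconds,
            "peers": {spine: spine_asn for spine in spine_names},
            "export_filter": "permit-all",
        }
    return configs


def distinct_configs(configs):
    """Count how many speakers end up with a configuration no other speaker shares."""
    return len({json.dumps(params, sort_keys=True) for params in configs.values()})


if __name__ == "__main__":
    fabric = fabric_bgp_parameters(spines=2, leaves=4)
    print(len(fabric), "speakers,", distinct_configs(fabric), "distinct configurations")
```

Even in this simplified model, every leaf needs its own parameters. In a real fabric, where peers are identified by address rather than by name, the spine configurations would most likely differ from one another as well.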

There are two ways to solve this problem. First, you can automate all this configuration—of course! I am a huge fan of automation. It’s an important tool because it can make your network consistent and more secure.

But I’m also realistic enough to know that adding the complexity of an automation system on top of a too-complex system to make things simpler is probably not a really good idea. To give a visual example, consider the possibility of automatically wiping your mouth while eating soup.

Yes, automation can be taken too far. A good rule of thumb might be: automation works best on systems intentionally designed to be simple enough to automate. In this case, perhaps it would be simpler to just use a protocol more directly designed to solve the problem at hand, rather than trying to automate our way out of the problem.

Second, you can modify BGP to be a better fit for use as an IGP in various ways. This post has already run far too long, however, so … I’ll hold off on talking about this until the next post.

The Hedge 71: Nick Russo and Automating Productivity

When we think of automation, and more broadly tooling, we tend to think of automating the configuration and (possibly) the monitoring of a network. On the other hand, a friend once observed that when interviewing coders, the first thing he asked was about the tools they had developed and used for making themselves more efficient. This “self-tooling” process turns out to be important not just to be more efficient at work, but to use time more effectively in general. Join Nick Russo, Eyvonne Sharp, Tom Ammon, and Russ White as we discuss self-tooling.
