On Using the Right Word

A while back, I was sitting in a meeting where the presenter described switching from a “traditional, hierarchical data center fabric” to a spine-and-leaf (while drawing CLOS, in all capital letters, on the whiteboard). He pointed out that the spine-and-leaf design is simpler because it only has two tiers rather than three.

There is so much wrong with this I almost winced in physical pain. Traditional hierarchical designs are not fabrics. Spine-and-leaf fabrics are not CLOS, but Clos, fabrics. Clos fabrics have three stages, not two—even if we draw them “folded” so you only see two apparent levels to the fabric. In fact, all spine-and-leaf fabrics always have an odd number of stages, and they are stages, not tiers.

More recently, I heard someone talking about an operating system that was built using microservices. I thought—“that would be at neat trick.” To build something with microservices does not just mean a piece of software using modules—this would be modular application (or operating system) design. Microservices architectures break the application up into the most basic components possible and then scale each kind of component out (rather than up) by spinning new copies of each service as needed. I cannot imagine scaling an operating system out by spinning multiple copies of the same service, and then providing some sort way to spread load across the various copies. Would you have some sort of anycast IPC? An internal DNS server or load balancer?

You can have an OS that natively participates in a larger microservices-based architecture, but what would microservices within the operating system look like, precisely?

Maybe my recent studies in philosophy make me much more attuned to the way we use language in the network engineering world—or maybe I’m just getting old. Whatever it is, our determination to make every word mean everything is driving me nuts.

What is the difference between a router and a switch? There used to be a simple definition—routers rewrite the L2 header and switches don’t. But now that routers switch packets, and switches route packets, the only difference seems to be … buffer depth? Feature set? The line between router and switch is fuzzy to the point of being meaningless, leaving us with no real term to describe a real switch any longer (a device that doesn’t do routing).

What about software defined networks? We’ve been treated to software defined everything now, of course. And intent? I get the point of intent, but we’re already moving down the path of making the meaning so broad that it can even contain configuring the CLI on an old AGS+. And don’t get me started on artificial intelligence, which is often learned to describe something closer to machine learning. Of course machine learning is often used to describe things that are really nothing more than statistical inference.

Maybe it’s time for a general rebellion against the sloppy use of language in network engineering. Or maybe I’m just tilting at yet another windmill. Wake me up when we’ve gotten to the point that we can use any word interchangeably with any other word in the network engineering dictionary. I await the AI that routes packets by reading your mind (through intent) called a swouter… or something.

Rethinking BGP on the DC Fabric (part 5)

BGP is widely used as an IGP in the underlay of modern DC fabrics. This series argues this is not the best long-term solution to the problem of routing in fabrics because BGP is not ideal for this use case. This post will consider the potential harm we are doing to the larger Internet by pressing BGP into a role it was not originally designed to fulfill—an underlay protocol or an IGP.

My last post described the kinds of configuration required to make BGP work on a DC fabric—it turns out that the configuration of each BGP speaker on the fabric is close to unique. It is possible to automate configuring each speaker—but it would be better if we could get closer to autonomic operation.

To move BGP closer to autonomic operation in a DC fabric, there are several things we can do. First, we can allow a BGP speaker to peer with any other BGP speaker it receives an open message from—this is often called promiscuous mode. While each router in the fabric will still need to be configured with the right autonomous system, at least we won’t need to configure the correct peers on each router (including the remote AS).

Note, however, that using this kind of promiscuous peering does come with a set of tradeoffs (if you’re reading this blog, you know there will be tradeoffs). BGP speakers running in promiscuous mode open a large attack surface on the control plane of the network. We can close this attack surface by configuring authentication on all BGP speakers … but we are now adding complexity to reduce complexity. We could also reduce the scope of the attack surface by never permitting BGP to peer beyond a single hop, and then filtering all BGP packets at the fabric edge. Again, just a bit more complexity to manage—but remember that the road to highly fragile and complex systems is always paved with individual steps that never, on their own, seem to add “too much complexity.”

The second thing we can do to move BGP closer to autonomic operation is to advertise routes to every connected peer without any policy configured. This does, again, introduce some tradeoffs, particularly in the realm of security, but let’s leave that aside for the moment.

Assume we can create a version of BGP that has these modifications—it always accepts any peer from any other AS, and it advertises all routes without any policy configured. Put these features behind a single knob which also includes setting the MRAI to 0 or 1, tightens up the dampening parameters, and adjusts a few other things to make BGP work better in a DC fabric.

As an experiment, let’s enable this DC fabric knob on a BGP speaker at the edge of a dual-homed “enterprise customer.” What will happen?

The enterprise network will automatically peer to any speaker that sends an open message—a huge security hole on the open Internet—and it will advertise every route it learns even though there is no policy configured. This second issue—advertising routes with no policy configured—can cause the enterprise network to become a transit between two much larger provider networks, crashing out some small corner of the Internet.

This might seem like a trivial issue. After all, just don’t ever enable the DC fabric knob on an eBGP peering session upstream into the DFZ, or any other “real” internetwork. Sure, and just don’t ever hit the brakes when you mean to hit the accelerator, or the accelerator when you mean to hit the brakes. If I had a dime for every time we “just don’t ever make that mistake …” Well, I wouldn’t be blogging, I’d be relaxing in the sun someplace (okay, I’m not likely to ever stop working to sit around and “relax” all the time, but you get the picture anyway).

Maybe—just maybe—it would really be better overall to use two different protocols for IGP and EGP work. Maybe—just maybe—it’s better not to mix these two different kinds of functions in a single protocol. Not only is the single resulting protocol bound to be really complex (most BGP implementations are now over 100,000 lines of code, after all), but it will end up being really easy to make really bad mistakes.

No tool is omnicompetent. If you found a tool that was, in fact, omnicompetent, it would also be the most dangerous tool in your toolbox.

It is Easier to Move a Problem than Solve it (RFC1925, Rule 6)

Early on in my career as a network engineer, I learned the value of sharing. When I could not figure out why a particular application was not working correctly, it was always useful to blame the application. Conversely, the application owner was often quite willing to share their problems with me, as well, by blaming the network.

A more cynical way of putting this kind of sharing is the way RFC 1925, rule 6 puts is: “It is easier to move a problem around than it is to solve it.”

Of course, the general principle applies far beyond sharing problems with your co-workers. There are many applications in network and protocol design, as well. Perhaps the most widespread case deployed in networks today is the movement to “let the controller solve the problem.” Distributed routing protocols are hard? That’s okay, just implement routing entirely on a controller. Understanding how to deploy individual technologies to solve real-world problems is hard? Simple—move the problem to the controller. All that’s needed is to tell the controller what we intend to do, and the controller can figure the rest out. If you have problems solving any problem, just call it Software Defined Networking, or better yet Intent Based Networking, and the problems will melt away into the controller.

Pay no attention to the complexity of the code on the controller, or those pesky problems with CAP Theorem, or any protests on the part of long-term engineering staff who talk about total system complexity. They are probably just old curmudgeon who are protecting their territory in order to ensure they have a job tomorrow. Once you’ve automated the process you can safely ignore how the process works; the GUI is always your best guide to understanding the overall complexity of the system.

Examples of moving the problem abound in network engineering. For instance, it is widely known that managing customers is one of the hardest parts of operating a network. Customers move around, buy new hardware, change providers, and generally make it very difficult for providers by losing their passwords and personally identifying information (such as their Social Security Number in the US). To solve this problem, RFC8567 suggests moving the problem of storing enough information to uniquely identify each person into the Domain Name System. Moving the problem from the customer to DNS clearly solves the problem of providers (and governments) being able to track individuals on a global basis. The complexity and scale of the DNS system is not something to be concerned with, as DNS “just works,” and some method of protecting the privacy of individuals in such a system can surely be found. After all, it’s just software.

If the DNS system becomes too complex, it is simple enough to relieve DNS of the problem of mapping IP addresses to the names of hosts. Instead, each host can be assigned a single host that is used regardless of where it is attached to the network, and hence uniquely identifies the host throughout its lifetime. Such a system is suggested in RFC2100 and appears to be widely implemented in many networks already, at least from the perspective of application developers. Because DNS is “too slow,” application developers find it easier to move the problem DNS is supposed to solve into the routing system by assigning permanent IP addresses.

Another great example of moving a problem rather than solving it is RFC3215, Electricity over IP. Every building in the world, from houses to commercial storefronts, must have multiple cabling systems installed in order to support multiple kinds of infrastructure. If RFC3215 were widely implemented, a single kind of cable (or even fiber optics, if you want your electricity faster) can be installed in all buildings, and power carried over the IP network running on these cables (once the IP network is up and running). Many ancillary problems could be solved with such a scheme—for instance, power usage could be measured based on a per-packet system, rather than the sloppier kilowatt-hour system currently in use. Any bootstrap problems can be referred to the controller. After all, it’s just software.

The bottom line is this: when you cannot figure out how to solve a problem, just move it to some other system, especially if that system is “just software,” so the problem now becomes “software defined. This is also especially useful if moving the problem can be accomplished by claiming the result is a form of automation.

Moving problems around is always much easier than solving them.

Agglutinating Problems Considered Harmful (RFC2915, Rule 5)

In the networking world, many equate simplicity with the fewest number of moving parts. According to this line of thinking, if there are 100 routers, 10 firewalls, 3 control planes, and 4 management systems in a network, then reducing the number of routers to 95, the number of firewalls to 8, the number of control planes to 1, and the number of management systems to 3 would make the system “much simpler.” Disregarding the reduction in the number of management systems, scientifically proven to always increase in number, it does seem that reducing the number of physical devices, protocols in use, etc., would tend to decrease the complexity of the network.

The wise engineers of the IETF, however, has a word of warning in this area that all network engineers should heed. According to RFC1925, rule 5: “It is always possible to agglutinate multiple separate problems into a single complex interdependent solution. In most cases this is a bad idea.” When “conventional wisdom” and the wisdom of engineers with the kind of experience and background as those who write IETF documents contradict one another, it is worth taking a deeper look.

A good place to begin is with other RFCs that might provide examples, or otherwise shed light on this situation. Two of particular interest are RFC1776 and RFC3093.

RFC1776 describes a very simplified transport protocol for use in the Internet and private networks. In normal packet formats there are many different components, such as a header and data sections. The header is normally made up of many different fields, such as the source address, the destination address, the quality of service, etc. The data section of the packet may also be divided into many different fields providing information for such functionality as error detection, flow control, and indicators of which application on the destination host this information is destined to (the port number is an example).

The authors of RFC1776 decided that the wisdom of making a single appliance which provides many services, the firewall being the classic example, and the wisdom of using a single protocol for everything, for instance using BGP for data center fabrics and interdomain connectivity, should be applied fully to the formatting of transport packets. In the spirit of agglutination common to all network engineering, RFC1776 recommends replacing the entire contents of a transport packet with a single address. The address must be a bit longer, of course, to carry the actual data, but using a single large field is inherently simpler than using many different fields. To accomplish this task, RFC1776 specifies a packet with 1696 octets (bytes) of address space. The number of octets originally selected is compatible with ATM, an older technology which uses a 53-octet cell but should also be compatible with all modern transport systems.

While the many advantages of this system are not fully described in the specification, it should be obvious packets containing a single field—the destination address—will be easier to hosts to generate and transmit, and easier for hosts to receive and process. The entire processing of the packet will just be transferring the address field directly into memory for consumption by any application running on the host that desires to consume it. The specification does note, however, that security is much simpler because there is no “user data” to secure.

RFC3093, a more recent example of agglutination in order to simplify network design and operation. This authors of this RFC note that applications are already moving to using a single port, 80, for all traffic, as most firewalls already pass traffic transmitted through this port without restrictions. The authors note the operation of the Internet would be much simpler if all applications ran over port 80. In this way, all applications could pass through firewalls without modification, while the firewalls themselves remain perfectly operational, fulfilling their intended purpose. Implementing this specification would also simplify the absolute mess of port and protocol numbers used in transporting data today, agglutinating them all down to a single port. As less is always simpler, this would create a simpler, easier to manage, global Internet.
The lessons to learn, after examining the options, may not be what was originally intended. Reducing the number of parts does not necessarily reduce the complexity of the overall system. If you haven’t found the tradeoffs, you haven’t looked hard enough.

On Important Things

I tend to be a very private person; I rarely discuss my “real life” with anyone except a few close friends. I thought it appropriate, though, in this season—both the season of the year and this season in my life—to post something a little more personal.

One thing people often remark about my personality is that I seem to be disturbed by very little in life. No matter what curve ball life might throw my way, I take the hit and turn it around, regain my sense of humor, and press forward into the fray more quickly than many expect. This season, combined with a recent curve ball (one of many—few people would suspect the path my life has taken across these 50+ years), and talking to Brian Keys in a recent episode of the Hedge, have given me reason to examine foundational principles once again.

How do I stay “up” when life throws me a curve ball?

Pragmatically, the worst network outage in the world is not likely to equal the stresses I’ve faced in the military, whether on the flight line or in … “other situations.” Life and death were immediately and obviously present in those times. Coming face to face with death—having friend who is there one moment, and not the next—changes your perspective. Knowing you hold the lives of hundreds of people in your hands—that if you make a mistake, real people will die (now!)—changes your perspective. In these times you realize there is more to life than work, or skill, or knowledge.

Spiritually, I am deeply Christian. I am close to God in a real way. I know him, and I trust his character and plans for the future. Job 13:15 and Romans 12:2 are present realities every day, from the moment I wake until the moment I fall asleep.

These two lead to a third observation.

Because of these things, I am grateful.

I am grateful for the people in my life—deeply held friendships, people who have spoken into my life, people who have helped carry me in times of crisis. I am grateful for the things in my life. Gratitude is, in essence, that which turns what you have into what you want (or perhaps enough to fulfill your wants). Gratitude often goes farther, though, teaching you that, all too often, you have more than you deserve.

So this season, whatever your religious beliefs, is a good time to reflect on the importance of all things spiritual, the value of life, the value of friendship, the value of truth, and to decide to have gratitude in the face of every storm. Gratitude causes us to look outside ourselves, and there is power (and healing) in self-forgetfulness. Self-love and self-hate are equal and opposite errors; it is in forgetting yourself, pressing forward, and giving to others, that you find out who you are.

Have a merry Christmas, a miraculous Hanukkah, or … just a joyful time being with family and friends at home. Watch a movie, eat ice cream and cookies, make a new friend, care for someone who has no family to be with.

To paraphrase Abraham Lincoln, most folks are just about as happy as they choose to be—so make the choice to be joyful and grateful.

These are the important things.

Pulling Back the Curtains

One of the major sources of complexity in modern systems is the simple failure to pull back the curtains. From a recent blog post over at the ACM—

The Wizard of Oz was a charlatan. You’d be surprised, too, how many programmers don’t understand what’s going on behind the curtain either. Some years ago, I was talking with the CTO of a company, and he asked me to explain what happens when you type a URL into your browser and hit enter. Do you actually know what happens? Think about it for a moment.

Yegor describes three different reactions when a coder faces something unexpected when solving a problem.

Throw in the towel. Just give up on solving the problem. This is fairly uncommon in the networking and programming fields, so I don’t have much to say here.

Muddle through. Just figure out how to make it work by whatever means necessary.

Open the curtains and build an excellent solution. Learn how the underlying systems work, understand how to interact with them, and create a solution that best takes advantage of them.

The first and third options are rare indeed; it is the second solution that seems to dominate our world. What generally tends to happen is we set out to solve some problem, we encounter resistance, and we either “just make it work” by fiddling around with the bits or we say “this is just too complex, I’m going to build something new that simpler and easier.” The problem with building something new is the “something new” must go someplace … which generally means on top of existing “stuff.” Adding more stuff you do understand on top of stuff you don’t understand to solve a problem is, of course, a prime way to increase complexity in a network.

And thus we have one of the prime reasons for ever-increasing complexity in networks.

Yegor says being a great programmer by pulling back the curtain increases job satisfaction, helping him avoid depression. The same is probably true of network engineers who are deeply interested in solving problems—who are only happy at the end of the day if they know they have solved some problem, even if no-one ever notices.

Pulling back the curtains, then not only helps us to manage complexity, it can alos improve job satisfaction for those with the problem-solving mindset. Great reasons to pull back to the curtains, indeed.

The EXPERIENCE HAS SHOWN THAT Keyword (RFC1925, Rule 4)

The world of information technology is filled, often to overflowing, with those who “know better.” For instance, I was recently reading an introduction to networking in a very popular orchestration system that began with the declaration that routing was hard, and therefore this system avoided routing. The document then went on to describe a system of moving packets around using multiple levels of Network Address Translation (NAT) and centrally configured policy-based routing (or filter-based forwarding) that was clearly simpler than the distributed protocols used to run large-scale networks. I thought, for a moment, of writing the author and pointing out the system in question had merely reinvented routing in a rather inefficient and probably broken way, but I relented. Why? Because I know RFC1925, rule 4, by heart:

Some things in life can never be fully appreciated nor understood unless experienced firsthand. Some things in networking can never be fully understood by someone who neither builds commercial networking equipment nor runs an operational network.

Ultimately, the people who built this system will likely not listen to me; rather, they are going to have to experience the pain caused by large-scale failures for themselves before they will listen. Many network operators do wish for some way to get their experience across to users and application developers, however; one suggestion which has been made in the past is adding subliminal messages to the TELNET protocol. According to RFC1097, adding this new message type would allow operators to gently encourage users to upgrade the software they are using by displaying a message on-screen which the user’s mind can process, but is not consciously aware of reading. The uses of such a protocol extension, however, would be wide-ranging, such as informing application developers that the network is not cheap, and packets are not carried instantaneously from one host to another.

A further suggestion made in this direction is to find ways to more fully document operational experience in Internet Standards produced by the Internet Engineering Task Force (IETF). Currently, the standards for writing such standards (sometimes mistakenly called meta-standards, although these standards about standards are standards in their own right) only include a few keywords which authors of protocol standards can use to guide developers into creating well-developed implementations. For instance, according to RFC2119, a protocol designer can use the term MUST (note the uppercase, which means it must be shouted when reading the standard out loud) to indicate something an implementation must do. If the implementation does not do what it MUST, subliminal messaging (as described above) will be used to discourage the use of that implementation.

MUST NOT, SHOULD, SHOULD NOT, MAY, and MAY NOT are the remaining keywords defined by the IETF for use in standards. While these options do cover a number of situations, they do not express the full range of options available based on operational experience. RFC6919 proposed extensions to these keywords to allow a fuller range of intent which could be useful to express experience.
For instance, RFC6919 adds the keyword MUST (BUT WE KNOW YOU WON’T) to express operational frustration for those times when even subliminal messaging will not convince a user or application developer to create an implementation that will gracefully scale. The POSSIBLE keyword is also included to indicate what is possible in the real world, and the REALLY SHOULD NOT is included for those times when an application developer or user asks for the network operator to launch pigs into flight.

Of course, the keywords described in RFC6919 may, at some point, be extended further to include such keywords as EXPERIENCE HAS SHOWN THAT and THAT WILL NOT SCALE, but for now protocol developers and operators are still somewhat restricted in their ability to fully express the experience of operating large-scale networks.

Even with these additional keywords and the use of subliminal messaging, improper implementations will still slip out into the wild, of course.

And what about network operators who are just beginning to learn their craft, or have long experience but somehow still make mistakes in their deployments? Some have suggested in the past—particularly those who work in technical assistance centers—that all network devices be shipped according to the puzzle-box protocol.

All network devices should be shipped in puzzle boxes such that only those with an appropriate level of knowledge, experience, and intelligence can open the box and hence install the equipment. Some might argue the Command Line Interface (CLI) currently supplied with most networking equipment is the equivalent of a puzzle box, but given the state of most networks, it seems shipping network equipment with a complex and difficult-to-use CLI has not been fully effective.