In my last post on this topic, I laid out the purpose of this series—to start a discussion about whether BGP is the ideal underlay control plane for a DC fabric—and gave some definitions. Here, I’d like to dive into the reasons not to use BGP as a DC fabric underlay control plane—and the first of these reasons is that BGP converges very slowly, and requires a lot of help to converge at all.
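To put some rough numbers behind “slowly,” here is a back-of-the-envelope sketch in Python. The timer values are common vendor defaults and RFC 4271 suggestions rather than measurements from any particular implementation, and the propagation model is deliberately simplified; the point is only to show why fabric designs lean so heavily on BFD, aggressive timers, and zeroed advertisement intervals just to make BGP converge acceptably.

```python
# A back-of-the-envelope sketch, not a simulation of any real BGP
# implementation. Timer values are common defaults: many vendors ship
# a 180-second hold time (RFC 4271 suggests 90), and the classic eBGP
# MinRouteAdvertisementInterval (MRAI) is 30 seconds, though many
# modern implementations default it to 0.

HOLD_TIME = 180.0  # seconds to declare a silent peer dead (no BFD)
MRAI = 30.0        # per-hop worst-case batching of advertisements

def worst_case_reconvergence(bgp_hops: int) -> float:
    """Worst-case seconds to detect a silent peer failure and push
    the resulting withdrawals across bgp_hops BGP speakers, when each
    hop may sit out a full advertisement interval before passing the
    update along."""
    detection = HOLD_TIME
    propagation = bgp_hops * MRAI
    return detection + propagation

# A leaf-spine-leaf path in a three-stage fabric is two BGP hops:
print(worst_case_reconvergence(bgp_hops=2))  # 240.0 seconds
```

Real deployments close this gap with BFD for subsecond failure detection and per-neighbor timer tuning—which is exactly the “help” in question.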
Everyone uses BGP for DC underlays now because … well, just because everyone does. After all, there’s an RFC explaining the idea, every tool in the world supports BGP for the underlay, and every vendor out there recommends some form of BGP in their design documents.
I’m going to swim against the current for the moment and spend a couple of weeks here discussing the case against BGP as a DC underlay protocol. I’m not the only one swimming against this particular current, of course—there are at least three proposals in the IETF (more, if you count things that will probably never be deployed) proposing link-state alternatives to BGP. If BGP is so ideal for DC fabric underlays, then why are so many smart people (at least they seem to be smart) working on finding another solution?
Every software developer has run into “god objects”—some data structure or database that every process must access no matter what it is doing. Creating god objects in software is considered an anti-pattern—something you should not do. Perhaps the most apt description of the god object I’ve seen recently is this: you ask for a banana, and you get the gorilla as well.
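To make that concrete, here is a small, entirely hypothetical Python sketch (the class and function names are invented for illustration) of what the banana-and-gorilla problem looks like in code:

```python
# A hypothetical illustration; these classes are not from any real system.

class NetworkGodObject:
    """One object that owns topology, device configs, telemetry, and
    policy. Anything that needs one of these drags in all of them."""
    def __init__(self):
        self.topology = {}
        self.device_configs = {}
        self.telemetry = {}
        self.policies = {}

def shortest_path(world: NetworkGodObject, src: str, dst: str):
    # You asked for a banana (the topology), but every caller of this
    # function now depends on configs, telemetry, and policy as well:
    # the gorilla arrives with the banana.
    ...

# The alternative: small objects, so callers depend only on what they use.
class Topology:
    def __init__(self):
        self.links = {}

def shortest_path_clean(topo: Topology, src: str, dst: str):
    ...  # depends on exactly one small, testable object
```

The second version can be tested, reasoned about, and replaced without touching anything else; the first cannot.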
What, really, is “technical debt?” It’s tempting to say “anything legacy,” but then why do we need a new phrase to describe “legacy stuff?” Even the prejudice against legacy stuff isn’t all that rational when you think about it. Something that’s old might also just be well-tested, or well-worn but still serviceable. Let’s try another tack.
In the realm of network design—especially in the realm of security—we often react so strongly against a perceived threat, or so quickly to solve a perceived problem, that we fail to look for the tradeoffs. If you haven’t found the tradeoffs, you haven’t looked hard enough—or, as Dr. Little says, you have to ask what is gained and what is lost, rather than just what is gained. This failure to look at both sides often results in untold amounts of technical debt and complexity being dumped into network designs (and application implementations), causing outages and failures long after these decisions are made.
While software design is not the same as network design, there is enough overlap for network designers to learn from software designers. A recent paper published by Butler Lampson, updating a paper he wrote in 1983, is a perfect illustration of this principle. The paper is called Hints and Principles for Computer System Design. I’m not going to write a full review here—you should really go read the paper for yourself—but rather just point out some of the useful bits.
A recent paper on network control and management (which includes Jennifer Rexford on the author list—anything with Jennifer on the author list is worth reading) proposes a clean-slate 4D approach to solving much of the complexity we encounter in modern networks. While the paper is interesting, it’s very unlikely we will ever see a clean-slate design like the one described, not least because there will always be disagreements about where the proper splits lie—what should go where.
This week is very busy for me, so rather than writing a single long post, I’m throwing together some things that have been sitting in my pile to write about for a long while.
From Dalton Sweeny:
A physicist loses half the value of their physics knowledge in just four years whereas an English professor would take over 25 years to lose half the value of the knowledge they had at the beginning of their career. . . Software engineers with a traditional computer science background learn things that never expire with age: data structures, algorithms, compilers, distributed systems, etc. But most of us don’t work with these concepts directly. Abstractions and frameworks are built on top of these well studied ideas so we don’t have to get into the nitty-gritty details on the job (at least most of the time).
The word on the street is that everyone—especially network engineers—must learn to code. A conversation with a friend and an article passing through my RSS reader brought this to mind once again—so once more into the breach. Part of the problem here is that we seem to have a knack for asking the wrong question. When we look at network engineer skill sets, we often think about the ability to configure a protocol or set of features, and then the ability to quickly troubleshoot those protocols or features using a set of commands or techniques.
Have you ever looked at your wide area network and wondered … what would the traffic flows look like if this link or that router failed? Traffic modeling of this kind has largely been the domain of commercial tools, which means it’s been hard to play with these kinds of tools, learn how they work, and understand how they can be effective. There is, however, an open source alternative—pyNTM. While this tool won’t replace a commercial tool, it can give many network operators “enough to go on,” and give you the experience and understanding needed to justify springing for a commercial product.
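For a taste of what this looks like in practice, here is a sketch of a what-if run. The method names follow the examples in the pyNTM documentation as I remember them, so treat them as assumptions and verify against the current release; the model file name is a placeholder for a topology file you would build yourself.

```python
# A sketch of a pyNTM what-if run. Method names are taken from my
# reading of the pyNTM examples and may differ in the current release;
# 'my_network.csv' is a placeholder for your own model file.
from pyNTM import PerformanceModel

model = PerformanceModel.load_model_file('my_network.csv')
model.update_simulation()  # route all demands over the topology

# Baseline: how hot is each interface?
for interface in model.interface_objects:
    print(interface.name, interface.utilization)

# Now the "what if": fail a link and re-run the simulation.
model.fail_interface('A-to-B', 'A')
model.update_simulation()

# Compare: where did the displaced traffic land?
for interface in model.interface_objects:
    print(interface.name, interface.utilization)
```

The pattern (load a model, simulate, fail something, simulate again, and compare) is the same one the commercial tools implement with far more polish.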