One of the major sources of complexity in modern systems is the simple failure to pull back the curtains. From a recent blog post over at the ACM—
The Wizard of Oz was a charlatan. You’d be surprised, too, how many programmers don’t understand what’s going on behind the curtain either. Some years ago, I was talking with the CTO of a company, and he asked me to explain what happens when you type a URL into your browser and hit enter. Do you actually know what happens? Think about it for a moment.
According to Maor Rudick, in a recent post over at Cloud Native, programming is 10% writing code and 90% understanding why it doesn’t work. This expresses the art of deploying network protocols, security, or anything that needs thought about where and how. I’m not just talking about the configuration, either—why was this filter deployed here rather than there? Why was this BGP community used rather than that one? Why was this aggregation range used rather than some other? Even in a fully automated world, the saying holds true.
I think we can all agree networks have become too complex—and this complexity is a result of the network often becoming the “final dumping ground” of every problem that seems like it might impact more than one system, or everything no-one else can figure out how to solve. It’s rather humorous, in fact, to see a lot of server and application folks sitting around saying “this networking stuff is so complex—let’s design something better and simpler in our bespoke overlay…” and then falling into the same complexity traps as they start facing the real problems of policy and scale.
This complexity cannot be “automated away.” It can be smeared over with intent, but we’re going to find—soon enough—that smearing intent on top of complexity just makes for a dirty kitchen and a sub-standard meal.
We’re actually pretty good at finding, and “solving” (for some meaning of “solving,” of course), these kinds of immediately obvious tradeoffs. It’s obvious the street sweepers are going to lose their jobs if we replace them with a robot. What might not be so obvious is the loss of the presence of a person on the street. That’s a pair of eyes who can see when a child is being taken by someone who’s not a family member, a pair of ears that can hear the rumble of a car that doesn’t belong in the neighborhood, a pair of hands that can help someone who’s fallen, etc.
Many years ago I attended a presentation by Dave Meyers on network complexity—which set off an entire line of thinking about how we build networks that are just too complex. While it might be interesting to dive into our motivations for building networks that are just too complex, I starting thinking about how to classify and understand the complexity I was seeing in all the networks I touched. Of course, my primary interest is in how to build networks that are less complex, rather than just understanding complexity…
This led me to do a lot of reading, write some drafts, and then write a book. During this process, I ended coining what I call the complexity triad—State, Optimization, and Surface. If you read the book on complexity, you can see my views on what the triad consisted of changed through in the writing—I started out with volume (of state), speed (of state), and optimization. Somehow, though, interaction surfaces need to play a role in the complexity puzzle.
If you are looking for a good resolution for 2020 still (I know, it’s a bit late), you can’t go wrong with this one: this year, I will focus on making the networks and products I work on truly simpler. . . We need to go beyond just figuring out how to make the user interface simpler, more “intent-driven,” automated, or whatever it is. We need to think of the network as a system, rather than as a collection of bits and bobs that we’ve thrown together across the years. We need to think about the modules horizontally and vertically, think about how they interact, understand how each piece works, understand how each abstraction leaks, and be able to ask hard questions.
It’s time for a short lecture on complexity.
Networks are complex. This should not be surprising, as building a system that can solve hard problems, while also adapting quickly to changes in the real world, requires complexity—the harder the problem, the more adaptable the system needs to be, the more resulting design will tend to be. Networks are bound to be complex, because we expect them to be able to support any application we throw at them, adapt to fast-changing business conditions, and adapt to real-world failures of various kinds.
There are several reactions I’ve seen to this reality over the years, each of which has their own trade-offs.
It was quite difficult to prepare a tub full of bath water at many points in recent history (and it probably still is in some many parts of the world). First, there was the water itself—if you do not have plumbing, then the water must be manually transported, one bucket at a time, from a…
We love layers and abstraction. After all, building in layers and it’s corollary, abstraction, are the foundation of large-scale system design. The only way to build large-scale systems is to divide and conquer, which means building many different component parts with clear and defined interaction surfaces (most often expressed as APIs) and combining these many…
Simplification is a constant theme not only here, and in my talks, but across the network engineering world right now. But what does this mean practically? Looking at a complex network, how do you begin simplifying? The first option is to abstract, abstract again, and abstract some more. But before diving into deep abstraction, remember…