Back in January, I ran into an interesting article called The many lies about reducing complexity:
Reducing complexity sells. Especially managers in IT are sensitive to it as complexity generally is their biggest headache. Hence, in IT, people are in a perennial fight to make the complexity bearable.
Why are networks so insecure?
One reason is we don’t take network security seriously. We just don’t think of the network as a serious target of attack. Or we think of security as a problem “over there,” something that exists in the application realm, that needs to be solved by application developers. Or we think the consequences of a network security breach as “well, they can DDoS us, and then we can figure out how to move load around, so if we build with resilience (enough redundancy) we’re already taking care of our security issues.” Or we put our trust in the firewall, which sits there like some magic box solving all our problems.
What percentage of business-impacting application outages are caused by networks? According to a recent survey by the Uptime Institute, about 30% of the 300 operators they surveyed, 29% have experienced network related outages in the last three years—the highest percentage of causes for IT failures across the period.
A secondary question on the survey attempted to “dig a little deeper” to understand the reasons for network failure; the chart below shows the result.
Crossing from the domain of test pilots to the domain of network engineering might seem like a large leap indeed—but user interfaces and their tradeoffs are common across physical and virtual spaces. Brian Keys, Eyvonne Sharp, Tom Ammon, and Russ White as we start with user interfaces and move into a wider discussion around attitudes and beliefs in the network engineering world.
Every software developer has run into “god objects”—some data structure or database that every process must access no matter what it is doing. Creating god objects in software is considered an anti-pattern—something you should not do. Perhaps the most apt description of the god object I’ve seen recently is you ask for a banana, and you get the gorilla as well.
One of the major sources of complexity in modern systems is the simple failure to pull back the curtains. From a recent blog post over at the ACM—
The Wizard of Oz was a charlatan. You’d be surprised, too, how many programmers don’t understand what’s going on behind the curtain either. Some years ago, I was talking with the CTO of a company, and he asked me to explain what happens when you type a URL into your browser and hit enter. Do you actually know what happens? Think about it for a moment.
According to Maor Rudick, in a recent post over at Cloud Native, programming is 10% writing code and 90% understanding why it doesn’t work. This expresses the art of deploying network protocols, security, or anything that needs thought about where and how. I’m not just talking about the configuration, either—why was this filter deployed here rather than there? Why was this BGP community used rather than that one? Why was this aggregation range used rather than some other? Even in a fully automated world, the saying holds true.
I think we can all agree networks have become too complex—and this complexity is a result of the network often becoming the “final dumping ground” of every problem that seems like it might impact more than one system, or everything no-one else can figure out how to solve. It’s rather humorous, in fact, to see a lot of server and application folks sitting around saying “this networking stuff is so complex—let’s design something better and simpler in our bespoke overlay…” and then falling into the same complexity traps as they start facing the real problems of policy and scale.
This complexity cannot be “automated away.” It can be smeared over with intent, but we’re going to find—soon enough—that smearing intent on top of complexity just makes for a dirty kitchen and a sub-standard meal.
We’re actually pretty good at finding, and “solving” (for some meaning of “solving,” of course), these kinds of immediately obvious tradeoffs. It’s obvious the street sweepers are going to lose their jobs if we replace them with a robot. What might not be so obvious is the loss of the presence of a person on the street. That’s a pair of eyes who can see when a child is being taken by someone who’s not a family member, a pair of ears that can hear the rumble of a car that doesn’t belong in the neighborhood, a pair of hands that can help someone who’s fallen, etc.
Many years ago I attended a presentation by Dave Meyers on network complexity—which set off an entire line of thinking about how we build networks that are just too complex. While it might be interesting to dive into our motivations for building networks that are just too complex, I starting thinking about how to classify and understand the complexity I was seeing in all the networks I touched. Of course, my primary interest is in how to build networks that are less complex, rather than just understanding complexity…
This led me to do a lot of reading, write some drafts, and then write a book. During this process, I ended coining what I call the complexity triad—State, Optimization, and Surface. If you read the book on complexity, you can see my views on what the triad consisted of changed through in the writing—I started out with volume (of state), speed (of state), and optimization. Somehow, though, interaction surfaces need to play a role in the complexity puzzle.