Back in January, I ran into an interesting article called The many lies about reducing complexity:
Reducing complexity sells. Especially managers in IT are sensitive to it as complexity generally is their biggest headache. Hence, in IT, people are in a perennial fight to make the complexity bearable.
Gerben then discusses two ways we often try to reduce complexity. First, we try to simply reduce the number of applications we’re using. We see this all the time in the networking world—if we could only get to a single pane of glass, or reduce the number of management packages we use, or reduce the number of control planes (generally to one), or reduce the number of transport protocols … but reducing the number of protocols doesn’t necessarily reduce complexity. Instead, we can just end up with one very complex protocol. Would it really be simpler to push DNS and HTTP functionality into BGP so we can use a single protocol to do everything?
Second, we try to reduce complexity by hiding it. While this is sometimes effective, it can also lead to unacceptable tradeoffs in performance (we run into the state, optimization, surfaces triad here). It can also make the system more complex if we need to go back and leak information to regain optimal behavior. Think of the OSPF type 4, which just reinjects information lost in building an area summary, or even the complexity involved in the type7 to type 5 process required to create not-so-stubby areas.
It would seem, then, that you really can’t get rid of complexity. You can move it around, and sometimes you can effectively hide it, but you cannot get rid of it.
This is, to some extent, true. Complexity is a reaction to difficult environments, and networks are difficult environments.
Even so, there are ways to actually reduce complexity. The solution is not just hiding information because it’s messy, or munging things together because it requires fewer applications or protocols. You cannot eliminate complexity, but if you think about how information flows through a system you might be able to reduce the amount of complexity, and even create boundaries where state (hence complexity) can be more effectively hidden.
As an instance, I have argued elsewhere that building a DC fabric with distinct overlay and underlay protocols can actually create a simpler overall design than using a single protocol. Another instance might be to really think about where route aggregation takes place—is it really needed at all? Why? Is this the right place to aggregate routes? Is there any way I can change the network design to reduce state leaking through the abstraction?
The problem is there are no clear-cut rules for thinking about complexity in this way. There’s no rule of thumb, there’s no best practices. You just have to think through each individual situation and consider how, where, and why state flows, and then think through the state/optimization/surface tradeoffs for each possible way of reducing the complexity of the system. You have to take into account that local reductions in complexity can cause the overall system to be much more complex, as well, and eventually make the system brittle.
There’s no “pat” way to reduce complexity—that there is, is perhaps one of the biggest lies about complexity in the networking world.