A few weeks ago, Daniel posted a piece about using different underlay and overlay protocols in a data center fabric. He says:
There is nothing wrong with running BGP in the overlay but I oppose to the argument of it being simpler.
One of the major problems we often face in network engineering—and engineering more broadly—is confusing that which is simple with that which has lower complexity. Simpler things are not always less complex. Let me give you a few examples, all of which are going to be controversial.
When OSPF was first created, it was designed to be a simpler and more efficient form of IS-IS. Instead of using TLVs to encode data, OSPF used fixed-length fields. To process the contents of a TLV, you need to build a case/switch construction where each possible type a separate bit of code. You must count off the correct length for the type of data, or (worse) read a length field and count out where you are in the stream.
Fixed-length fields are just much easier to process. You build a structure matching the layout of the fixed-length fields in memory, then point this structure at the packet contents in-memory. From there, you can just use the structure’s contents to directly access the data.
Over time, however, as new requirements have been pushed into IGPs, OSPF has become much more complex while IS-IS has remained relatively constant (in terms of complexity). IS-IS went through a bit of a mess when transitioning from narrow to wide metrics, but otherwise the IS-IS we use today is the same protocol we used when I first started working on networks (back in the early 1990s).
OSPF’s simplicity, in the end, did not translate into a less complex protocol.
Another example is the way we transport data in BGP. A lot of people do not know that BGP’s original design allowed for carrying information other than straight reachability in the protocol. BGP speakers can negotiate multiple sessions, with each session carrying a different kind of information. Rather than using this mechanism, however, BGP has consistently been extended using address families—because it is simpler to create a new address family than it is to define a new kind of data parallel with address families.
This has resulted in AFs that are all over the place, magic numbers, and all sorts of complexity. The AF solution is simpler, but ultimately more complex.
Returning to Daniel’s example, running a single protocol for underlay and overlay is simpler, while running two different protocols is less simple. However, I’ve observed—many times—that running different protocols for underlay and overlay is less complex.
Why? Daniel mentions a couple of reasons, such as each protocol has a separate purpose, and we’re pushing features into BGP to make it serve the role of an IGP (which is, in the end, going to cause some major outages—if it hasn’t already).
Consider this: is it easier to troubleshoot infrastructure reachability separately from vrf reachability? The answer is obvious—yes! What about security? Is it easier to secure a fabric when the underlay never touches any attached workload? Again—yes!
We get this tradeoff wrong all the time. A lot of times this is because we are afraid of what we do not know. Ten years ago I struggled to convince large operators to run BGP in their networks. Today no-one runs anyone other than BGP—and they all say “but we don’t have anyone who knows OSPF or IS-IS.” I’ve no idea what happened to old-fashioned network engineering. Do people really only have one “protocol slot” in their brains? Can people really only ever learn one protocol?
Or maybe we’ve become so fixated on learning features that we no longer no protocols?
I don’t know the answer to these questions, but I will say this—over the years I’ve learned that simpler is not always less complex.