BGP is widely used as an IGP in the underlay of modern DC fabrics. This series argues this is not the best long-term solution to the problem of routing in fabrics because BGP is not ideal for this use case. This post will consider the potential harm we are doing to the larger Internet by pressing BGP into a role it was not originally designed to fulfill—an underlay protocol or an IGP.
My last post described the kinds of configuration required to make BGP work on a DC fabric—it turns out that the configuration of each BGP speaker on the fabric is close to unique. It is possible to automate configuring each speaker—but it would be better if we could get closer to autonomic operation.
Before I continue, I want to remind you what the purpose of this little series of posts is. The point is not to convince you to never use BGP in the DC underlay ever again. There’s a lot of BGP deployed out there, and there are a lot of tools that assume BGP in the underlay. I doubt any of that is going to change. The point is to make you stop and think!
Why are we deploying BGP in this way? Is this the right long-term solution? Should we, as a community, be rethinking our desire to use BGP for everything? Are we just “following the crowd” because … well … we think it’s what the “cool kids” are doing, or because “following the crowd” is what we always seem to do?
In my last post, I argued that BGP converges much more slowly than the other options available for the DC fabric underlay control plane. The pushback I received was two-fold. First, the overlay converges fast enough; the underlay convergence time does not really factor into overall convergence time. Second, there are ways to fix things.
The first post on this topic considered some basic definitions and the reasons why I am writing this series of posts. The second considered the convergence speed of BGP on a dense topology such as a DC fabric, and what mechanisms we normally use to improve BGP’s convergence speed. This post considers two of the objections to worrying about slow convergence: that convergence speed is not important, and that ECMP with high fanouts will take care of any convergence speed issues. The network below will be used for this discussion.
In my last post on this topic, I laid out the purpose of this series (to start a discussion about whether BGP is the ideal underlay control plane for a DC fabric) and gave some definitions. Here, I’d like to dive into the reasons not to use BGP as a DC fabric underlay control plane. The first of these reasons is that BGP converges very slowly and requires a lot of help to converge at all.
Everyone uses BGP for DC underlays now because … well, just because everyone does. After all, there’s an RFC explaining the idea, every tool in the world supports BGP for the underlay, and every vendor out there recommends some form of BGP in their design documents.
I’m going to swim against the current for the moment and spend a couple of weeks here discussing the case against BGP as a DC underlay protocol. I’m not the only one swimming against this particular current, of course—there are at least three proposals in the IETF (more, if you count things that will probably never be deployed) proposing link-state alternatives to BGP. If BGP is so ideal for DC fabric underlays, then why are so many smart people (at least they seem to be smart) working on finding another solution?
Tyler McDaniel joins Eyvonne, Tom, and Russ to discuss a study on BGP peerlocking, which is designed to prevent route leaks in the global Internet. From the study abstract:
BGP route leaks frequently precipitate serious disruptions to interdomain routing. These incidents have plagued the Internet for decades while deployment and usability issues cripple efforts to mitigate the problem. Peerlock, introduced in 2016, addresses route leaks with a new approach. Peerlock enables filtering agreements between transit providers to protect their own networks without the need for broad cooperation or a trust infrastructure.
I’ve been chasing BGP security since before the publication of the soBGP drafts, way back in the early 2000s (that’s almost 20 years for those who are math challenged). The most recent news largely centers on the RPKI, which is used to ensure the AS originating an advertisement is authorized to do so (or rather “owns” the resource or prefix). If you are not “up” on what the RPKI does, or how it works, you might find this old blog post useful; it’s actually the tenth post in a ten-post series on the topic of BGP security.
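For readers who think better in code, here is a minimal sketch of the origin check described above. It is not taken from the linked post. It assumes the three-state outcome used in RPKI route origin validation (valid, invalid, not-found), and the names `VRP` and `validate_origin` are hypothetical, made up for this illustration. A real router gets its validated payloads from a relying-party cache rather than a Python list.

```python
import ipaddress
from typing import Iterable, NamedTuple


class VRP(NamedTuple):
    """Hypothetical validated ROA payload: prefix, maximum length, origin AS."""
    prefix: str
    max_length: int
    asn: int


def validate_origin(route_prefix: str, origin_asn: int, vrps: Iterable[VRP]) -> str:
    """Classify an advertisement as 'valid', 'invalid', or 'not-found'."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for vrp in vrps:
        vrp_net = ipaddress.ip_network(vrp.prefix)
        # subnet_of() raises TypeError on mixed IPv4/IPv6, so skip those.
        if route.version != vrp_net.version:
            continue
        # A VRP only applies if its prefix covers the advertised prefix.
        if not route.subnet_of(vrp_net):
            continue
        covered = True
        # A match needs the same origin AS and a prefix no longer than max_length.
        if vrp.asn == origin_asn and route.prefixlen <= vrp.max_length:
            return "valid"
    # Covered by some VRP but matched by none: invalid. Not covered: not-found.
    return "invalid" if covered else "not-found"


if __name__ == "__main__":
    # Documentation prefixes and private-use AS numbers, for illustration only.
    table = [VRP("192.0.2.0/24", 24, 64500)]
    print(validate_origin("192.0.2.0/24", 64500, table))
    print(validate_origin("192.0.2.0/25", 64500, table))
    print(validate_origin("198.51.100.0/24", 64500, table))
```

Note what this check does not do: it says nothing about the rest of the AS path, only about whether the originating AS is authorized for the prefix.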
The first hour of material in my new BGP course over at Ignition dropped this week. I’m not going to talk about configuration and other operational things—this is all about understanding how BGP works, why it works that way, and thinking about design. This course will apply to cloud, Internet edge, DC fabric, and other uses of BGP.
Can you really trust what a routing protocol tells you about how to reach a given destination? Ivan Pepelnjak joins Nick Russo and Russ White to provide a longer version of the tempting one-word answer: no! Join us as we discuss a wide range of issues including third-party next-hops, BGP communities, and the RPKI.
Sue Hares, cochair of the IDR and I2RS working groups in the IETF, joins Donald Sharp and Russ White to talk about the origins of one of the first open source routing stacks, GateD. Sue was involved in MERIT and the university programs that originated this open source software, and managed its transition to a commercial offering.