Do We Really Need a New BGP?

From time to time, I run across (yet another) article about why BGP is so bad, and how it needs to be replaced. This one, for instance, is a recent example.

Cross-posted at APNIC and CircleID

It seems the easiest way to solve this problem is finding new people (ones who don’t make mistakes) to work on BGP configuration, building IRR databases, and deciding what should be included in BGP. Ivan points out how hopeless a situation this is going to be, however. As Ivan says, you cannot solve people problems with technology. You can hint in the right direction, and you can try to make things a little more sane, and a little less complex, but people cannot be fixed with technology. Given we cannot fix the people problem, would replacing BGP itself really help? Is there anything we could do to make things better?

To understand the answer to these questions, it is important to tear down a major misconception about BGP. The misconception?

BGP is a routing protocol in the same sense as OSPF, IS-IS, or EIGRP.


BGP was not designed to be a routing protocol in the way other protocols were. It was designed to provide a loop free path through a series of independently operated networks, each with its own policy and business goals. In the sense that BGP provides a loop free route to a destination, it provides routing. But the “routing” it provides is largely couched in terms of explicit, rather than implicit, policy (see the note below). Loop free routes are not always the “shortest” path in terms of hop count, or the “lowest cost” path in terms of delay, or the “best available” path in terms of bandwidth, or anything else. This is why BGP relies on the AS Path to prevent loops. We call things “metrics” in BGP in a loose way, but they are really explicit expressions of policy.

Consider this: the primary policies anyone cares about in interdomain routing are: where do I want this traffic to exit my AS, and where do I want this traffic to enter my AS? The Local Preference is an expression of where traffic to this particular destination should exit this AS. The Multiple Exit Discriminator (MED) is an expression of where this AS would like to receive traffic being forwarded to this destination. Everything else is just a tie breaker. All the rest of the stuff we do to try to influence the path of traffic into and out of an AS, like messing with the AS Path, is a hack. If you can get this pair of “things people really care about” into your head, the BGP bestpath process, and much of the routing that goes on in the DFZ, makes a lot more sense.

It really is that simple.
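
To make the ordering concrete, here is a minimal sketch in Python of a heavily simplified decision process. The `Route` fields and the `best_path` function are names invented for this illustration, not part of any BGP implementation. The order is deliberately reduced: it skips steps such as the origin code and eBGP versus iBGP, and it compares MED across all candidates, which real implementations normally do only for routes learned from the same neighboring AS.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Route:
    """One candidate path to a prefix (illustrative fields only)."""
    next_hop: str
    local_pref: int           # exit policy: higher is preferred
    as_path: Tuple[int, ...]  # loop prevention; its length is a tie breaker
    med: int                  # neighbor's entry preference: lower is preferred
    igp_cost: int             # "automatic metric", used only as a tie breaker


def best_path(candidates: List[Route]) -> Route:
    """Return the route that sorts first under the simplified order:
    highest Local Preference, then shortest AS Path, then lowest MED,
    then lowest IGP cost, then next hop as a final deterministic tie breaker.
    """
    return min(
        candidates,
        key=lambda r: (-r.local_pref, len(r.as_path), r.med, r.igp_cost, r.next_hop),
    )


routes = [
    Route("a", 100, (65001, 65010), 10, 5),
    Route("b", 200, (65002, 65003, 65010), 50, 20),
    Route("c", 200, (65002, 65003, 65010), 20, 30),
]
print(best_path(routes).next_hop)  # c
```

Route `a` has the shortest AS Path and the lowest IGP cost, yet it loses at the first step because Local Preference, the explicit exit policy, is compared before anything else. Routes `b` and `c` tie on Local Preference and AS Path length, so the MED decides between them.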

How does this relate to the problem of replacing BGP? There are several things you could improve about BGP, but automatic metrics are not one of them. There are, in fact, already “automatic metrics” in BGP, but “automatic metrics” like the IGP cost are tie breakers. A tie breaker is a convenient stand-in for what the protocol designer and/or implementor thinks the most natural policy should be. Whether or not they are right or wrong in a specific situation is a… guess.

What about something like the RPKI? The RPKI is not going to help in most situations where a human makes a mistake in a transit provider. It would help with transit edge failures and hijacks, but these are a different class of problem. You could ask for BGPsec to counter these problems, of course, but BGPsec would likely cause more problems than it solves (I’ve written on this before, here, here, here, here, and here, to start; you can find a lot more on rule11 by following this link).
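
A rough sketch of what route origin validation checks may help show why it covers only part of the problem. The `ROA` tuple and the `validate_origin` function below are hypothetical names for this illustration, and the logic is a simplified reading of the usual valid / invalid / not-found outcomes; it ignores special cases such as AS 0.

```python
import ipaddress
from typing import List, NamedTuple


class ROA(NamedTuple):
    """A simplified Route Origin Authorization (illustrative only)."""
    prefix: str
    max_length: int
    origin_as: int


def validate_origin(prefix: str, origin_as: int, roas: List[ROA]) -> str:
    """Classify an announcement as 'valid', 'invalid', or 'not-found'."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Only ROAs whose prefix covers the announced prefix are relevant.
        if net.version != roa_net.version or not net.subnet_of(roa_net):
            continue
        covered = True
        if net.prefixlen <= roa.max_length and origin_as == roa.origin_as:
            return "valid"
    # Covered by some ROA but matched by none: invalid. No covering ROA: not-found.
    return "invalid" if covered else "not-found"


roas = [ROA("192.0.2.0/24", 24, 64500)]
print(validate_origin("192.0.2.0/24", 64500, roas))
print(validate_origin("192.0.2.0/24", 64511, roas))
```

Note that the only inputs are the prefix and the origin AS. Nothing in this check looks at the ASes between the origin and the receiver, so a route leaked by a transit provider with the correct origin still validates.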

Given replacing the metrics is not a possibility, and RPKI is only going to get you “so far,” what else can be done? There are, in fact, several practical steps that could be taken.

You could specify that BGP implementations should, by default, only advertise routes if there is some policy configured. Something like, say… RFC 8212?
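
The idea of “no policy, no advertisement” can be sketched in a few lines of Python. The `routes_to_advertise` function and the `Policy` type are made up for this example and do not model any vendor's implementation; the point is only that a missing policy means nothing is sent, rather than everything.

```python
from typing import Callable, Iterable, List, Optional

# A policy takes a prefix and returns True to permit it (hypothetical type).
Policy = Callable[[str], bool]


def routes_to_advertise(prefixes: Iterable[str],
                        export_policy: Optional[Policy]) -> List[str]:
    """Return the prefixes to send to an eBGP neighbor.

    With no export policy configured, the safe default is to send nothing.
    """
    if export_policy is None:
        return []
    return [p for p in prefixes if export_policy(p)]


table = ["192.0.2.0/24", "198.51.100.0/24"]
print(routes_to_advertise(table, None))
print(routes_to_advertise(table, lambda p: p == "192.0.2.0/24"))
```

The same default applies in the other direction: without an import policy, routes received from an eBGP neighbor would not be considered for selection.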

Giving operators more information to understand what they are configuring (perhaps by cleaning up the Internet Routing Registries?) would also be helpful. Perhaps we could build a graph overlay on top of the Default Free Zone (DFZ) so a richer set of policies could be expressed, and policies could be better observed and understood (but you have to convince the transit providers that this would not harm their business before this could happen).

Maybe we could also stop trying to use BGP as the trash can of the Internet, throwing anything we don’t know what else to do with in there. We’ve somehow forgotten the old maxim that a protocol is not done until we have removed everything that is not needed. Now our mantra seems to be “the protocol isn’t done until it solves every problem anyone has ever thought of.” We just keep throwing junk at BGP as if it is the abominable snowman—we assume it’ll bounce when it hits bottom. Guess what: it’s not, and it won’t.

Replacing BGP is not realistic—nor even necessary. Maybe it is best to put it this way:

  • BGP expresses policy
  • Policy is messy
  • Therefore, BGP is messy

We definitely need to work towards building good engineers and good tools—but replacing BGP is not going to “solve” either of these problems.

P.S. I have differentiated between “metrics” and “policy” here—but metrics can be seen as an implicit form of policy. Choosing the highest bandwidth path is a policy. Choosing the path with the shortest hop count is a policy, too. The shortest path (for some meaning of “shortest”) will always be provably loop free, so it is a useful way to always choose a loop free path in the face of simple, uniform, policies. But BGP doesn’t live in the world of simple uniform policies; it lives in the world of “more than one metric.” BGP lives in a world where different policies not only overlap, but directly compete. Computing a path with more than one metric is provably at least bistable, and often completely unstable, no matter what those metrics are.
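
As a small illustration of competing policies producing more than one stable outcome, here is a sketch of the arrangement the stable paths literature calls DISAGREE. The topology, the `RANKING` table, and the function names are assumptions made for this example: AS 0 originates a prefix, AS 1 and AS 2 each connect to AS 0 and to each other, and each would rather reach AS 0 through the other than directly.

```python
from typing import Dict, List, Optional, Tuple

Path = Tuple[int, ...]

# Each AS ranks its permitted paths, most preferred first (hypothetical policy).
RANKING: Dict[int, List[Path]] = {
    1: [(1, 2, 0), (1, 0)],
    2: [(2, 1, 0), (2, 0)],
}
NEIGHBORS: Dict[int, List[int]] = {1: [0, 2], 2: [0, 1]}


def choose(asn: int, selected: Dict[int, Optional[Path]]) -> Optional[Path]:
    """Pick the most preferred path available from the neighbors' current choices."""
    available = set()
    for n in NEIGHBORS[asn]:
        path = selected.get(n)
        if path is not None and asn not in path:  # AS Path loop check
            available.add((asn,) + path)
    for candidate in RANKING[asn]:
        if candidate in available:
            return candidate
    return None


def converge(order: List[int], max_rounds: int = 10) -> Dict[int, Optional[Path]]:
    """Let each AS re-run its selection in the given order until nothing changes."""
    selected: Dict[int, Optional[Path]] = {0: (0,), 1: None, 2: None}
    for _ in range(max_rounds):
        changed = False
        for asn in order:
            new = choose(asn, selected)
            if new != selected[asn]:
                selected[asn] = new
                changed = True
        if not changed:
            break
    return selected


print(converge([1, 2]))  # AS 1 goes direct, AS 2 goes through AS 1
print(converge([2, 1]))  # AS 2 goes direct, AS 1 goes through AS 2
```

Both results are stable, and which one the network lands in depends only on who happened to decide first. Neither AS gets its preferred path in both outcomes, and no metric is involved at all, only two policies that cannot both be satisfied.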

P.P.S. This article is a more humorous take on finding perfect people.


  1. Hemanth Raj on 19 December 2017 at 9:10 am

    BGP at the edge, BGP at the core [not at the BGP-free MPLS core], BGP in the data centers,
    BGP at the edge of the ISP [transit, settlement free, peer, lateral, or as a customer].

    Policies at the edge require control over sourcing routes: can I reach the source via this ASN? Filter all routes except those originated by the peer and, if there are any, by its customers.
    Policies restrict bogon ASNs, bogon/Martian prefixes, and deprecated prefixes.

    Implementing BGP policies requires a MOP (Method of Procedure) validated by at least three peers, two internal and one at the peering neighbor, in the case of inbound and outbound peering policies.

    We have standard filters at the edge of any ISP to restrict these, and even RTBH [remotely triggered black hole] communities are part of the standard ISP core filtering.

    BGP graceful shutdown, the equivalent of the IS-IS overload bit and the OSPF max-metric router LSA, is implemented by almost all vendors to make sure BGP reroutes traffic [by matching a community and setting LP to 0] before withdrawing the route, to avoid a BGP black hole while the next best path is calculated and to drain traffic gracefully.

    BGP policies nowadays are mostly communities: there is a community for each specific event, and we also have support for 32-bit (4-byte) BGP communities, equivalent to 32-bit ASNs.

    BGP policies have to be peer reviewed and implemented via a script rather than typed in by a human, to prevent mistakes; the script should just copy and paste the config onto the device. Various checkpoints then have to be implemented, such as validating the Adj-RIB-In and Adj-RIB-Out after each addition, modification, and removal.

    BGP ORF, which is carried in the Route Refresh message, can be used to update policies.

  2. Nick Buraglio on 26 December 2017 at 1:26 pm

    I am always open to replacing anything with something that is truly better, something that improves in one way or another without adding unnecessary overhead. However, much like everything in tech, engineers often attempt to engineer around problems that are not engineering issues. Like you stated, BGP cannot be easily replaced (nor should it be); the real issue with BGP lies in education and process. The processes around BGP are still (by default) very manual. They require external tools that are not included in the communication protocol suite to be “safe”. BGP does not assume you (the operator) are stupid. It assumes you are smart and actually understand what it does, how to do it, and why. This is rarely the case, as we’ve seen by the blunders we see almost bi-annually. Perhaps, instead of replacing a perfectly good software mechanism, the industry should concentrate on building in hooks to account for things like IRR databases. Realistically, the issues we see with BGP at the global scale aren’t technical problems at all; they’re symptoms of human error, and one can only engineer that out to a certain extent. Like most things in life, education and experience trump most things. My hope is that now that we’re pushing BGP down the racks and people touch it more and more, we’ll see fewer and fewer of these issues.