Securing BGP: A Case Study (8)
Throughout the last several months, I’ve been building a set of posts examining securing BGP as a sort of case study around protocol and/or system design. The point of this series of posts isn’t to find a way to secure BGP specifically, but rather to look at the kinds of problems we need to think about when building such a system. The interplay between technical and business requirements is wide and deep. In this post, I’m going to summarize the requirements drawn from the last seven posts in the series.
Don’t try to prove things you can’t. This might feel like a bit of an “anti-requirement,” but the point is still important. In this case, we can’t prove which path traffic will actually flow along. We also can’t enforce policies, specifically “don’t transit this AS;” the best we can do is to provide information and let other operators make a local decision about what to follow and what not to follow. In the larger sense, it’s important to understand what can, and what can’t, be solved, or rather what the practical limits of any solution might be, as close to the beginning of the design phase as possible.
In the case of securing BGP, I can, at most, validate three pieces of information:
- That the origin AS in the AS Path matches the owner of the address being advertised.
- That the AS Path in the advertisement is a valid path, in the sense that each pair of autonomous systems in the AS Path is actually connected, and that no one has “inserted themselves” in the path silently.
- The policies each pair of autonomous systems along the path applies toward one another. This information is completely voluntary, of course, and cannot be enforced in any way even if it is provided, but the more information operators provide, the stronger the validation can be.
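As a rough illustration of the first two checks, here is a minimal Python sketch. The lookup tables (`roa_table`, `adjacency_set`) and all the AS numbers and prefixes are made up for illustration, standing in for whatever database a real deployment would use:

```python
# Hypothetical origin-validation table: prefix -> set of ASes
# authorized to originate it (analogous in spirit to RPKI ROAs).
roa_table = {
    "203.0.113.0/24": {65002},
}

# Hypothetical adjacency data: unordered pairs of ASes known to interconnect.
adjacency_set = {
    frozenset({65000, 65001}),
    frozenset({65001, 65002}),
}

def origin_is_valid(prefix: str, as_path: list) -> bool:
    """Check 1: the origin AS (rightmost in the AS Path) owns the prefix."""
    return as_path[-1] in roa_table.get(prefix, set())

def path_is_valid(as_path: list) -> bool:
    """Check 2: every adjacent pair of ASes in the path actually
    interconnects, so no one has silently inserted themselves."""
    return all(
        frozenset({a, b}) in adjacency_set
        for a, b in zip(as_path, as_path[1:])
    )

advert_path = [65000, 65001, 65002]                    # origin AS is 65002
print(origin_is_valid("203.0.113.0/24", advert_path))  # True
print(path_is_valid(advert_path))                      # True
print(path_is_valid([65000, 65002]))                   # False: no such adjacency
```

The third check (policy) has no equivalent table here precisely because, as noted above, it is voluntary information that cannot be enforced.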
There is a fine balance between centralized and distributed systems. There are actually two things that can be centralized or distributed in terms of BGP security: how ownership is claimed over resources, and how the validation information is carried to each participating AS. In the case of ownership, the tradeoff is between having a widely trusted third party validate ownership claims and creating a third party who can shut down an entire business. In the case of distributing the information, there is a tradeoff between the consistency and the accessibility of the validation information. These are going to be points on which reasonable people can disagree, and hence are probably areas where the successful system must have a good deal of flexibility.
Cost is a major concern. There are a number of costs that need to be considered when determining which solution is best for securing BGP, including—
- Physical equipment costs. The most obvious cost is the physical equipment required to implement each solution. For instance, any solution that requires providers to replace all their edge routers is simply not going to be acceptable.
- Process costs. Any solution that requires a lot of upkeep and maintenance is going to be cast aside very quickly. Good intentions are overruled by the tyranny of the immediate about 99.99% of the time.
Speed is also a cost that can be measured in business terms; if increasing security decreases the speed of convergence, providers who deploy security are at a business disadvantage relative to their competitors. The speed of convergence must be on the order of Internet level convergence today.
Information costs are a particularly important issue. There are at least three kinds of information that can leak out of any attempt to validate BGP, each of them related to connectivity—
- Specific information about peering, such as how many routers interconnect two autonomous systems, where interconnections are, and how interconnection points are related to one another.
- Publicly verifiable claims about interconnection. Many providers argue there is a major difference between connectivity information that can be observed and connectivity information that is claimed.
- Publicly verifiable information about business relationships. Virtually every provider considers it important not to release at least some information about their business relationships with other providers and customers.
While there is some disagreement in the community over each of these points, it’s clear that releasing the first of these is almost always going to be unacceptable, while the second and third are more situational.
With these requirements in place, it’s time to look at a couple of proposed systems to see how they measure up.
Securing BGP: A Case Study (7)
In the last post in this series on securing BGP, I considered a couple of extra questions around business problems that relate to BGP. This time, I want to consider the problem of convergence speed in light of any sort of BGP security system. The next post (to provide something of a road map) should pull all the requirements side together into a single post, so we can begin working through some of the solutions available. Ultimately, as this is a case study, we’re after a set of tradeoffs for each solution, rather than a final decision about which solution to use.
The question we need to consider here is: should the information used to provide validation for BGP be somewhat centralized, or fully distributed? The CAP theorem tells us that there are a range of choices here, with the two extreme cases being—
- A single copy of the database we’re using to provide validation information which is always consistent
- Multiple disconnected copies of the database we’re using to provide validation which is only intermittently consistent
Between these two extremes there are a range of choices (reducing all possibilities to these two extremes is, in fact, a misuse of the CAP theorem). To help understand this as a range of tradeoffs, take a look at the chart below—

The further we go to the right along this chart—
- the more copies of the database there are in existence; more copies means more devices that must have a copy, and hence more devices that must receive and update a local copy, which means slower convergence.
- the slower the connectivity between the copies of the database.
In complexity model terms, both of these relate to the interaction surface; slower and larger interaction surfaces face their tradeoff in the amount and speed of state that can be successfully (or quickly) managed in a control plane (hence the tradeoffs we see in the CAP theorem are directly mirrored in the complexity model). Given this, what is it we need out of a system used to provide validation for BGP? Let’s set up a specific situation that might help answer this question.
Assume, for a moment, that your network is under some sort of distributed denial of service (DDoS) attack. You call up some sort of DDoS mitigation provider, and they say something like “just transfer your origin validation to us, so we can advertise the route without it being black holed; we’ll scrub the traffic and transfer the clean flows back to you through a tunnel.” Now ask this: how long are you willing to wait before the DDoS protection takes place? Two or three days? A day? Hours? Minutes? If you can locate that amount of time along the chart above, then you can get a sense of the problem we’re trying to solve.
To put this in different terms: any system that provides BGP validation information must converge at roughly the speed of BGP itself.
So—why not just put the information securing BGP in BGP itself, so that routing converges at the same speed as the validation information? This implies every edge device in my network must handle cryptographic processing to verify the validation information. There are some definite tradeoffs to consider here, but we’ll leave those to the evaluation of proposed solutions.
Before leaving this post and beginning on the process of wrapping up the requirements around securing BGP (to be summarized in the next post), one more point needs to be considered. I’ll just state the point here, because the reason for this requirement should be pretty obvious.
Injecting validation information into the routing system should expose no more information about the peering structure of my AS than can be inferred through data mining of publicly available information. For instance, today I can tell that AS65000 is connected to AS65001. I can probably infer something about their business relationship, as well. What I cannot tell, today, is how many actual eBGP speakers connect the two autonomous systems, nor can I infer anything about the location of those connection points. Revealing this information could lead to some serious security and policy problems for a network operator.
Securing BGP: A Case Study (6)
In my last post on securing BGP, I said I would do something on the CAP theorem in a separate post and link it back to this series. The CAP theorem post referenced there is here.
Before I dive into the technical issues, I want to return to the business issues for a moment. In a call this week on the topic of BGP security, someone pointed out that there is no difference between an advertisement in BGP asserting some piece of information (reachability or connectivity, take your pick), and an advertisement outside BGP asserting this same bit of information. The point of the question is this: if I can’t trust you to advertise the right thing in one setting, then why should I trust you to advertise the right thing in another? More specifically, if you’re using an automated system to build both advertisements, then both are likely to fail at the same time and in the same way.
First, this is an instance of how automation can create a system that is robust yet fragile—which leads directly back to complexity as it applies to computer networks. Remember—if you don’t see the tradeoff, then you’re not looking hard enough.
Second, this is an instance of separating trust for the originator from trust for the transit. Let’s put this in an example that might be more useful. When a credit card company sends you a new card, they send it in a sealed envelope. The sealed envelope doesn’t really do much in the way of security, as it can be opened by anyone along the way. What it does do, however (so long as the tamper resistance of the envelope is good enough) is inform the receiver about whether or not the information received is the same as the information sent. The seal on the envelope cannot make any assertions about the truthfulness of the information contained in the envelope. If the credit card company is a scam, then the credit card in the envelope is still a scam even if it’s well sealed.
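The envelope analogy can be sketched in a few lines of Python, with a MAC over the message playing the role of the seal. The key and the claim below are purely illustrative; the point is that a valid seal proves only that nothing changed in transit, not that the claim is honest:

```python
import hashlib
import hmac

# The "seal" is a MAC over the message: the receiver can detect
# tampering in transit, but a valid seal says nothing about whether
# the sender's claim is actually true.
key = b"shared-secret-between-sender-and-receiver"  # illustrative

def seal(message: bytes) -> bytes:
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(message: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(seal(message), tag)

claim = b"AS65002 originates 203.0.113.0/24"  # honest? the seal can't say
tag = seal(claim)

print(verify(claim, tag))                            # True: arrived unmodified
print(verify(b"AS65002 originates 0.0.0.0/0", tag))  # False: tampered in transit
```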
There is only one way to ensure what’s in the envelope is true and useful information—having a trustworthy outsider observe the information, the sender, and the receiver, to make certain they’re all honest. But now we dive into philosophy pretty heavily, and this isn’t a philosophy blog (though I’ve often thought about changing my title to “network philosopher,” just because…).
But it’s worth considering two things about this problem, because the idea of having a third party watch to make certain everyone is honest is, itself, full of problems.
First, who watches the watchers? Pretty Good Privacy has always used the concept of a web of trust—this is actually a pretty interesting idea in the realm of BGP security, but one I don’t want to dive in to right this moment (maybe in a later post).
Second, having some form of verification will necessarily cause any proposed system to rely on third parties to “make it go.” This seems to counter the ideal state of allowing an operator to run their business based on locally available information as much as possible. From a business perspective, there needs to be a balance between obtaining information that can be trusted, and obtaining information in a way that allows the business to operate—particularly in its core realm—without relying on others.
Next time, I’ll get back to the interaction of the CAP theorem and BGP security, with a post around the problem of convergence speed.
CAP Theorem and Routing
In 2000, Eric Brewer was observing and discussing the various characteristics of database systems. Through this work, he observed that a database generally has three characteristics—
- Consistency, which means the database will produce the same result for any two readers who happen to read a value at the same moment in time.
- Availability, which means the database can be read by any reader at any moment in time.
- Partition Tolerance, which means the database can continue to operate even when partitioned into disconnected copies.
Brewer, in explaining the relationship between the three in a 2012 article, says—
The CAP theorem, therefore, represents a two out of three situation—yet another two out of three “set” we encounter in the real world, probably grounded someplace in the larger space of complexity. We’ll leave the relationship to complexity on the side for the moment, however, and just look at how CAP relates to routing.
Network engineers familiar with the concepts of complexity, and their application in the network design realm, should quickly realize that this “two out of three” paradigm almost never represents a set of “hard and fast realities,” but is better represented as a set of continuums along which a system designer can choose. Moving towards one pole in the two out of three space inevitably means giving up some measure along one of the other two; the primary choices you can make are which one and how much.
One more detail is important before diving into a deeper discussion: A partition is normally thought of as a hard divide in database theory, but for the purposes of explaining how CAP applies to routing, I’m going to soften the definition a little. Instead of strict partitions, I want to consider partitioning as relating to the distribution of the database across a set of nodes separated by a network.
So how does CAP apply to routing?
To see the application, let’s ignore the actual algorithm used to compute the best path in any given routing protocol—whether Dijkstra’s SPF, or Garcia’s DUAL, or path vector—and focus, for a moment, on just getting the information required to compute loop free paths to every router running in a single failure (and/or flooding) domain. This latter point is important, because failure and flooding domains actually represent an intentional form of information hiding that causes the database to be correct for the purposes of routing (most of the time), but inconsistent in strict database consistency terms on a permanent basis. We need, therefore, to treat aggregation and summarization as a separate topic when considering how CAP impacts routing. Since the tradeoffs are easiest to see in link state protocols, we’re actually going to focus on that area.
Once we’ve softened the definition of partition tolerance, it’s immediately obvious that all forms of flooding, for instance, are simply the synchronization process of a highly partitioned database. Further, we can observe that the database flooded through the network must be, largely, always available (as in always able to be read by a local device). Given CAP, we should immediately suspect there is some sort of consistency tradeoff, as we’ve already defined two of the three legs of the two out of three set pretty strongly.
And, in fact, we do. In link state protocols, the result of inconsistency is most easily seen in the problem of microloops during convergence. To understand this, let’s look at the network just below.

Assume, for a moment, that the link between routers A and B fails. At this moment, let’s assume:
- The best path for B is through A
- The best path for C is through B
- The best path for D is through C
Let’s also assume that each router in the network takes precisely the same amount of time to flood a packet, and to run SPF. When the link fails, C is going to detect the failure immediately, and hence will calculate SPF first. If we assume that D can only process this information about the topology change once it’s received an update to its database, then we can see that C will always process the new information before D.
But we can also observe this point: once C has processed the topology change—C has run SPF—router C will point at router D for its new best path. Until router D receives the new topology information and runs SPF, D will continue to point to C as the best path.
This is the microloop.
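To see the loop in action, here is a short Python sketch. The seven-router ring below is an assumed topology (the original diagram isn’t reproduced here), chosen so that C’s old best path toward A runs through B, while its post-failure path runs through D:

```python
from collections import deque

def next_hop(graph, src, dst):
    """First hop on a shortest path from src to dst (BFS, unit costs)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            while parent[node] != src:   # walk back to src's neighbor
                node = parent[node]
            return node
        for neighbor in sorted(graph[node]):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None

def ring(nodes):
    """Build an undirected ring topology over the given router names."""
    graph = {n: set() for n in nodes}
    for a, b in zip(nodes, nodes[1:] + nodes[:1]):
        graph[a].add(b)
        graph[b].add(a)
    return graph

before = ring(list("ABCDEFG"))                       # A-B-C-D-E-F-G-A
after = {n: set(peers) for n, peers in before.items()}
after["A"].discard("B")                              # the A-B link fails...
after["B"].discard("A")

# C has already recomputed on the new topology; D still holds the old one.
print(next_hop(after, "C", "A"))    # 'D' : C's new best path points at D
print(next_hop(before, "D", "A"))   # 'C' : D's stale path still points at C
# Until D floods and recomputes, traffic for A ping-pongs between C and D.
```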
It happens any time there is a ring of more than four hops in the topology with any link state protocol. And it’s a direct result of the relationship between consistency and availability in the distributed database. We can verify this by examining the way in which the microloop can be resolved—specifically, by using ordered FIB.
In ordered FIB, each router in the network calculates an amount of time it should wait before installing newly calculated routes. This calculation is, in turn, based on the number of routers relying on the local router for the best path. If you back up and think through how the microloop happens, you can see that what’s happening is C is now waiting for D to calculate before installing the new route into its local forwarding table.
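A much-simplified sketch of that delay calculation: each router waits long enough for the routers its new best path depends on to install first. The next-hop table and the timer unit below are illustrative assumptions, not values from any real ordered-FIB implementation (which ranks routers using the shortest-path tree rather than a simple recursion like this):

```python
DELAY_UNIT_MS = 50  # assumed per-rank timer, purely illustrative

# Each affected router's next hop on its *newly computed* best path.
new_next_hop = {"D": "E", "C": "D", "B": "C"}

def install_delay(router: str) -> int:
    """Wait until every affected router downstream on the new path
    (toward the destination) has had time to install first."""
    nh = new_next_hop.get(router)
    if nh not in new_next_hop:   # next hop was unaffected by the failure
        return 0
    return DELAY_UNIT_MS + install_delay(nh)

print(install_delay("D"))   # 0   : installs immediately
print(install_delay("C"))   # 50  : waits one unit for D
print(install_delay("B"))   # 100 : waits for C, which waits for D
```

This ordering is exactly what prevents C from forwarding to a D that would still loop traffic back.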
In CAP terms, what ordered FIB does is reduce availability some small amount—in effect by not allowing the forwarding plane at C to read the newly calculated routes for some period of time—in order to ensure consistency in the database. This observation should confirm microloops are related to the relationship between CAP and distributed routing.
There are a number of other relationships between CAP and routing; I’ll put some of these on my list for later posts—but this one has reached the 1000 word mark, and so is long enough for a starter on the topic.
Securing BGP: A Case Study (5)
BGP provides reachability for the global ‘net, as well as being used in many private networks. As a system, BGP (ultimately) isn’t very secure. But how do we go about securing BGP? This series investigates the questions, constraints, and solutions any proposal to secure BGP must deal with as a case study of asking the right questions, and working at the intersection of business and technology.
As a short review, we started off with three questions, described in the first post, each of which we’ve been considering in some detail:
- Should we focus on a centralized solution to this problem, or a distributed one?
  - Assuming we’re using some sort of encryption to secure the information used in path validation, where do the keys come from? The fourth post considers this question.
  - Should the information used to validate paths be distributed or stored in a somewhat centralized database?
- Should we consider solutions that are carried within the control plane, within BGP itself, or outside?
- What is it we can actually prove in a packet switched network? This is considered in post 2 and post 3.
Here I’m going to discuss the problem of a centralized versus distributed database to carry the information needed to secure BGP. There are actually, again, two elements to this problem—a set of pure technical issues, and a set of more business related problems. The technical problems revolve around the CAP theorem, which deserves a post of its own; I’ll do something on CAP next week and link it back to this series.
Which leaves us with the business side of things. What possible business justification could you give for choosing decentralized versus centralized storage of particular data sets? To really understand the business requirements that overlap with BGP security, you need to consider one specific point: businesses are in business to make money. For a provider, in particular, if the network isn’t up, you’re not making money. Anything which either slows down your network convergence or causes you to be dependent on connectivity outside your network is a “bad thing” from a provider’s perspective. How does the distributed versus centralized problem interact with these two points?
First, if a provider must be able to reach a centralized database in order to bring their network up from a large scale failure, this is a “bad thing.” Whatever solution is proposed, then, must be able to use a data set that is at least synchronized throughout every Autonomous System, so any information required to start the network can be stored locally during a failure.
Second, if a provider wants to minimize external control over their operations, they need to be able to build and advertise any information required to participate in the system from local resources, and they cannot count on remote resources to make the system run. While it’s okay to pull in information from outside of the local AS, it’s important not to rely on third parties more than necessary to make the network actually run.
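One way to picture these first two requirements is a validation store that synchronizes from an external source when it can, but answers every lookup from its local copy, so bringing the network up never depends on reaching anything outside the AS. The class and all the names below are made up for illustration, not any particular implementation:

```python
class ValidationStore:
    """Local-first store for validation data (illustrative sketch)."""

    def __init__(self):
        self.local = {}  # last known-good synchronized copy

    def synchronize(self, remote_snapshot: dict) -> None:
        """Called whenever the external source happens to be reachable."""
        self.local = dict(remote_snapshot)

    def lookup(self, prefix: str) -> str:
        """Always answered locally; never blocks on external connectivity."""
        return self.local.get(prefix, "unknown")

store = ValidationStore()
store.synchronize({"203.0.113.0/24": "origin AS65002"})
# ...the external source becomes unreachable during a large-scale failure...
print(store.lookup("203.0.113.0/24"))   # still answerable from the local copy
print(store.lookup("198.51.100.0/24"))  # "unknown"
```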
Third, any proposed system must allow as much flexibility as possible in its implementation within the provider’s network. The provider doesn’t want or need another system they must adapt their processes and speed of business around. Further, any deployed system must impact the convergence of the network as little as possible.
One example of these three lines of thought is the insistence, on the part of some providers (at least), that they not be required to “touch” their edge eBGP speakers. Not only would it cost a lot of money to replace every eBGP speaker on a provider’s edge (even companies like LinkedIn can have on the order of a thousand of these devices, and transit providers can have on the order of ten thousand), reducing the speed at which these eBGP speakers can converge can have a very negative effect on the ability of the provider to sell services.
The bottom line is this: If you have a choice between two providers, one of which can promise a higher level of security, and the other of which promises a higher level of performance, you’re almost always going to take the provider with higher performance. The provider who deploys a BGP security system that impacts performance in a real way, then, will be on the losing end of the economic battle—discouraging further deployment, and putting the provider at risk from a financial perspective. None of this would be a good thing.
Reaction: BGP convergence, divergence & the ‘net
Let’s have a little talk about BGP convergence.
Geoff Huston’s recent article on the reality of Internet connectivity—no, everyone cannot connect to everyone—prompted a range of reactions from various folks I know.
For instance, BGP is broken! After all, any routing protocol that can’t provide basic reachability to every attached destination must be broken, right? The problem with this statement is it assumes BGP is, at core, a routing protocol. To set the record straight, BGP is not, at heart, a routing protocol in the traditional sense of the term. BGP is a system used to describe bilateral peering arrangements between independent parties in a way that provides loop free reachability information. The primary focus of BGP is not loop free reachability, but policy.
After all, BGP convergence is a big deal, right? Part of the problem here is that we use BGP as a routing protocol in some situations (for instance, on data center fabrics), so we have a hard time adjusting our thinking to the original peering policy based focus it was designed for. In the larger ‘net, it’s not a bug that some destinations are unreachable from some sources. It’s an expression of policy, and hence it’s a feature. There are certainly times when such policies are unintentional, but unintentional/unplanned policy is policy just the same as intentional/planned policy is.
We shouldn’t declare BGP broken for doing something it’s supposed to do.
There’s another point here, as well: Some networks never converge. And that’s okay. This is, perhaps, even harder for network engineers to get their heads around. I’ve spent twenty years making sure networks converge quickly, as loop free as possible, with as little chance for failure as possible, and using the least number of resources possible. But every network in the world doesn’t always have to converge to a single view of the topology and reachability. Really!
The problem here is the micro and macro views of the world. The ‘net doesn’t converge for two reasons.
First, there’s that pesky policy problem again. Policy, in the real world, never converges. There are always contradictory policies, and policies will often form bistable states. This is maddening, of course, to the mind of an engineer, but it’s just reality intruding on our little bubble. Bubbles are, after all, meant to be burst.
Second, there’s that whole CAP theorem thing in there someplace. Not many people understand the application of CAP to routing, so I’m stuffing a post or two on this on my todo list, but just remember: you can have, at most, two of the following: a Consistent database, a database that is Accessible by every reader/user all the time, and a database that can be Partitioned. If you think about it, routing protocols are readable by every network device all the time, and they are partitioned among all the routers/intermediate systems in the network. Which means… They aren’t going to be consistent.
As in, if you feed a routing protocol enough changes often enough, it won’t ever converge—because its eventual consistency will always be catching up with reality. This is just the way the world is built—piling all the SDN unicorn magic in the world into routing isn’t going to solve this one, folks. On a network the size of the Internet, someone, somewhere, is always going to be changing something. This cripples BGP convergence; the ‘net never converges.
In the history of ideas, perhaps BGP shouldn’t float to the top as one of the most brilliant (and Tony and Yaakov would probably even agree with you)—but it has, on the other hand, been one of the most successful. It’s just tight enough to work often enough to rely on the connectivity as described, and it’s just loose enough to allow policies to be injected where they need to be. No such system is ever going to be “perfect.”
We could beat our heads against a wall trying, of course, but even virtual reality has physical limitations.
Securing BGP: A Case Study (4)
In part 1 of this series, I looked at the general problem of securing BGP, and ended by asking three questions. In part 2 and part 3, I considered the third question: what can we actually prove in a packet switched network. For this section, I want to return to the first question:
Should we focus on a centralized solution to this problem, or a distributed one?
There are, as you might expect, actually two different problems within this problem:
- Assuming we’re using some sort of encryption to secure the information used in path validation, where do the keys come from? Should each AS build its own private/public key pairs, have anyone they want to validate the keys, and then advertise them? Or should there be some central authority that countersigns keys, such as the Regional Internet Registries (RIRs) so everyone has a single trust root?
- Should the information used to validate paths be distributed or stored in a somewhat centralized database? At the extreme ends of this answer are two possibilities: every eBGP speaker individually maintains a database of path validation information, just like they maintain reachability information; or there are a few servers (like the root DNS servers) which act as a source of truth, and which are synchronized somehow to each AS.
Let’s discuss the first problem first. Another way to phrase this question is: should we use a single root to provide trust for the cryptographic keys, or should we use something like a web of trust? The diagram below might be helpful in understanding these two options.

Assume, for a moment, that AS 65000 has just received some information that purports to be from AS65002. How can AS65000 validate this information? Assuming AS65002 has signed this information with its private key, AS65000 needs to find AS65002’s public key and validate the signature. There are a number of ways AS65000 could find AS65002’s key—a URI or a distributed database of some type would do nicely—but how does AS65000 actually know that the public key it finds belongs to the real AS65002?
This is a matter of finding a third party that will say/validate/attest that this public key actually belongs to AS65002. Normally this is done by the third party signing the key with their private key, which can then be validated using their public key—but then you run into an infinite regress. You can’t go on checking everyone’s public key for all of eternity; you have to stop and trust someone someplace to tell you the truth about these keys. Where do you stop? The answer to this question is the key difference between a single set of trusted roots and a web of trust.
In the single set of roots (left side of the diagram), AS65000 would stop at either RIR1 or RIR2 because they are a single set of entities everyone participating in the system should trust. In the web of trust (right side of the diagram), you stop looking when you find someone you trust who trusts the person you’re receiving information from. If AS65000 trusts AS65001 to tell the truth about AS65002, then AS65001’s signature on AS65002’s certificate should be enough for AS65000 to trust the public key it retrieved for AS65002. Multiple steps through the web of trust are possible, as well. Assume AS65000 trusts AS65003, and AS65003 trusts AS65004 to tell the truth about AS65002. AS65000 might decide that because AS65003 trusts AS65004, it can trust AS65004, as well, and accept the public key for AS65002. This is a form of transitive trust, where you trust someone because you trust someone who trusts them.
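The transitive walk through the web of trust can be sketched as a small graph search. The `trust_edges` graph below is hypothetical ("X has signed Y's key"), and how many hops of transitive trust to accept (`max_depth`) is a purely local policy choice:

```python
from collections import deque

# Hypothetical "X has signed Y's key" relationships.
trust_edges = {
    65000: {65001, 65003},   # AS65000 trusts these ASes directly
    65001: {65002},          # AS65001 has signed AS65002's key
    65003: {65004},
    65004: {65002},          # a longer chain to AS65002 also exists
}

def trusts(start: int, target: int, max_depth: int = 2) -> bool:
    """Can `start` reach `target` in at most max_depth signature hops?"""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, depth = queue.popleft()
        if node == target:
            return True
        if depth == max_depth:
            continue
        for neighbor in trust_edges.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return False

print(trusts(65000, 65002, max_depth=2))  # True : via AS65001's signature
print(trusts(65000, 65002, max_depth=1))  # False: no direct signature
```

The `max_depth` knob is exactly where the transitive-trust judgment call described above lives: each additional hop trades convenience for a weaker claim.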
There are, of course, advantages and disadvantages to each system.
In the first system, trust is rooted in an entity that can be held financially responsible for telling the truth. This often means, however, that they will charge to use their service; this additional cost can be an effective barrier to entry as the services provided by the root entity become more complex, or as the value of the system being anchored in this entity increases. The centralized repository of trust is also a risk—the certificate used by the single trusted entity is a high value target. Finally, the single trust entity is a political risk for those who operate within the system. A single contract mistake, or a single problem of any other sort, causes the value of anyone operating a network (in this case) to immediately drop to nothing. While the single root provides an easy way to prevent abuse within the system, it’s also a single place from which to abuse those within the system.
In the second, trust is not rooted in any single entity, but rather in what might be called community consensus. This type of system is more resilient to attack, but it can also be abused through shell entities and bad actors. In operating a web of trust, you can quickly run into Byzantine failures which are difficult to resolve. Beyond this, the amount of processing required to validate a certificate will depend on the number of links in the transitive trust chain you have to cross before you find someone you trust to tell the truth.
Which one is better? It really comes down to business objectives rather than technical superiority. Which one better fits the kind of system being secured? Web of trust models tend to work better in smaller communities with personal relationships. Rooted trust models tend to provide more security in diverse environments where the participants are competitors.
I can see your eyes glazing over at this point, so I’ll leave the answer to the second question ’til next time.
