Archive for 2016
Securing BGP: A Case Study (10)
The next proposed (and actually already partially operational) system on our list is the Resource Public Key Infrastructure (RPKI) system, which is described in RFC7115 (and a host of additional drafts and RFCs). The RPKI system is focused on solving a single problem: validating that the originating AS is authorized to originate a particular prefix. An example will be helpful; we’ll use the network below.
(this is a graphic pulled from a presentation, rather than one of my usual line drawings)
Assume, for a moment, that AS65002 and AS65003 both advertise the same route, 2001:db8:0:1::/64, towards AS65000. How can the receiver determine whether both of these advertisers can actually reach the destination, or only one can? And, if only one can, how can AS65000 determine which one is the “real thing?” This is where the RPKI system comes into play. A very simplified version of the process looks something like this (assuming AS65002 is the true owner of 2001:db8:0:1::/64):
- AS65002 obtains, from the Regional Internet Registry (labeled the RIR in the diagram), a certificate showing AS65002 has been issued 2001:db8:0:1::/64.
- AS65002 places this certificate into a local database that is synchronized with all the other operators participating in the routing system.
- When AS65000 receives a route towards 2001:db8:0:1::/64, it checks this database to make certain the origin AS on the advertisement matches the owning AS.
If the owner and the origin AS match, AS65000 can increase the route’s preference. If they don’t match, AS65000 can reduce the route’s preference. It might be that AS65000 discards the route if the origin doesn’t match—or it may not. For instance, AS65003 may know, from historical data, or through a strong and long-standing business relationship, or from some other means, that 2001:db8:0:1::/64 actually belongs to AS65004, even though the RPKI data claims it belongs to AS65002. Resolving such problems falls to the receiving operator—the RPKI simply provides more information on which to act, rather than dictating a particular action to take.
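To make the origin check concrete, here is a minimal sketch of the receiver-side logic. The ROA table, prefixes, and preference values are all hypothetical; a real deployment would learn validated ROAs through an RPKI-to-router feed and apply the result through existing policy mechanisms rather than code like this. The sketch only shows the decision being made.

```python
import ipaddress

# Hypothetical, simplified ROA table: prefix -> (authorized origin AS, max length).
# A real deployment would learn this from validated RPKI data, not a literal dict.
roa_table = {
    ipaddress.ip_network("2001:db8:0:1::/64"): (65002, 64),
}

def origin_state(prefix: str, origin_as: int) -> str:
    """Return 'valid', 'invalid', or 'not-found' for an announced prefix/origin pair."""
    net = ipaddress.ip_network(prefix)
    for roa_prefix, (roa_as, max_len) in roa_table.items():
        # The announcement must fall inside the ROA prefix and respect maxLength.
        if net.subnet_of(roa_prefix) and net.prefixlen <= max_len:
            return "valid" if origin_as == roa_as else "invalid"
    return "not-found"

def local_pref(prefix: str, origin_as: int, base: int = 100) -> int:
    """Nudge preference up or down based on the origin check; never hard-drop here."""
    state = origin_state(prefix, origin_as)
    if state == "valid":
        return base + 20   # prefer routes whose origin matches the ROA
    if state == "invalid":
        return base - 50   # deprefer, but leave the final decision to local policy
    return base            # no covering ROA: leave the route alone

print(local_pref("2001:db8:0:1::/64", 65002))  # origin matches -> 120
print(local_pref("2001:db8:0:1::/64", 65003))  # origin mismatch -> 50
```

Note that the sketch only adjusts preference rather than discarding the route, which matches the point above: the RPKI provides information, and the receiving operator decides what to do with it.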
Let’s compare this to our requirements to see how this proposal stacks up, and where there might be objections or problems.
Centralized versus Decentralized: The distribution of the origin authentication information is currently undertaken with rsync, which means the certificate system is decentralized from a technical perspective.
However—there have been technical issues with the rsync solution in the past, such that it can take up to 24 hours to change the distributed database. This is a pretty extreme case of eventual consistency, and it’s a major problem in the global default free zone. BGP might converge very slowly, but it still converges more quickly than 24 hours.
Beyond the technical problems, there is a business side to the centralized/decentralized issue as well. Specifically, many businesses don’t want their operations impacted by contract issues, negotiation issues, and the like. Many large providers see the RPKI system as creating just such problems, as the “trust anchor” is located in the RIRs. There are ways to mitigate this—just use some other root, or even self-sign your certificates—but the RPKI system faces an uphill battle in this area from large transit providers.
Cost: The actual cost of setting up and running a server doesn’t appear to be very high within the RPKI system. The only things you need to “get into the game” are a couple of VMs or physical servers to run rsync, and some way to inject the information gleaned from the RPKI system into the routing decisions along the network edge (which could even be just plugging the information into existing policy mechanisms).
The business issue described above can also be counted as a cost—how much would it cost a provider if their origin authentication were taken out of the database for a day or two, or even a week or two, while a contract dispute with the RIR was worked out?
Information Cost: There is virtually no additional information cost involved in deploying the RPKI.
Other thoughts: The RPKI system wasn’t designed to, and doesn’t, validate anything other than the origin in the AS Path. It doesn’t, therefore, allow an operator to detect AS65003, for instance, claiming to be connected to AS65002 even though it’s not (or it’s not supposed to transit traffic to AS65002). This isn’t really a “lack” on the part of the RPKI, it’s just not something it’s designed to do.
Overall, the RPKI is useful, and will probably be deployed by a number of providers, and shunned by others. It would be a good component of some larger system (again, this was the original intent, so this isn’t a lack), but it cannot stand alone as a complete BGP security system.
Securing BGP: A Case Study (9)
There are a number of systems that have been proposed to validate (or secure) the path in BGP. To finish off this series on BGP as a case study, I only want to look at three of them. At some point in the future, I will probably write a couple of posts on what actually seems to be making it to some sort of deployment stage, but for now I just want to compare various proposals against the requirements outlined in the last post on this topic (you can find that post here).
The first of these systems is BGPSEC—or as it was known before it was called BGPSEC, S-BGP. I’m not going to spend a lot of time explaining how S-BGP works, as I’ve written a series of posts over at Packet Pushers on this very topic:
Part 1: Basic Operation
Part 2: Protections Offered
Part 3: Replays, Timers, and Performance
Part 4: Signatures and Performance
Part 5: Leaks
Considering S-BGP against the requirements:
- Centralized versus decentralized balance: S-BGP distributes path validation information throughout the internetwork, as this information is actually contained in a new attribute carried with route advertisements. Authorization and authentication are implicitly centralized, however, with the root certificates being held by address allocation authorities. It’s hard to say if this is the correct balance.
- Cost: In terms of financial costs, S-BGP (or BGPSEC) requires every eBGP speaker to perform complex cryptographic operations inline, as updates are received and the best path to each destination is calculated (a simplified sketch of the per-hop signing involved follows this list). This effectively means replacing every edge router in every AS in the entire world to deploy the solution—this is definitely not cost friendly. Adding to this cost is the sheer increase in the table size required to carry all this information, and the loss of commonly used (and generally effective) optimizations.
- Information cost: S-BGP leaks new information into the global table as a matter of course—not only can anyone see who is peered with whom by examining information gleaned from route view servers, they can even figure out how many actual pairs of routers connect each AS, and (potentially) what other peerings those same routers serve. This huge new chunk of information about provider topology being revealed simply isn’t acceptable.
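To give a feel for why every eBGP speaker needs cryptographic cycles, here is a toy sketch of the nested, per-hop signing idea behind S-BGP/BGPSEC. It is not the real wire format or algorithm (BGPSEC uses router certificates rooted in the RPKI and ECDSA signatures); the per-AS shared keys and the use of HMAC as a stand-in “signature” are purely illustrative so the sketch runs with the standard library.

```python
import hmac, hashlib

# Stand-in "router keys" per AS. Real BGPSEC uses per-router certificates and
# asymmetric signatures; shared secrets are used here only so the sketch runs
# without an external crypto library.
keys = {65001: b"k-65001", 65002: b"k-65002", 65003: b"k-65003"}

def sign(asn: int, payload: bytes) -> bytes:
    return hmac.new(keys[asn], payload, hashlib.sha256).digest()

def forward(prefix: str, path: list[int], sigs: list[bytes], sender: int, receiver: int):
    """Each AS signs the prefix, the path so far, and the AS it is sending to."""
    payload = f"{prefix}|{path + [sender]}|{receiver}".encode()
    return path + [sender], sigs + [sign(sender, payload)]

def validate(prefix: str, path: list[int], sigs: list[bytes], receiver: int) -> bool:
    """Walk the chain: every hop's signature must cover the path up to that hop."""
    if len(sigs) != len(path):
        return False
    for i, asn in enumerate(path):
        next_as = path[i + 1] if i + 1 < len(path) else receiver
        payload = f"{prefix}|{path[:i + 1]}|{next_as}".encode()
        if not hmac.compare_digest(sigs[i], sign(asn, payload)):
            return False
    return True

# AS65002 originates the prefix towards AS65001, which forwards it to AS65003.
path, sigs = forward("2001:db8:0:1::/64", [], [], sender=65002, receiver=65001)
path, sigs = forward("2001:db8:0:1::/64", path, sigs, sender=65001, receiver=65003)
print(validate("2001:db8:0:1::/64", path, sigs, receiver=65003))        # True
print(validate("2001:db8:0:1::/64", [65003, 65001], sigs, receiver=65003))  # False: altered path
```

Even in this toy form, the receiver checks one signature per AS hop on every update, and it must carry all of the signatures along with the route, which is exactly the processing and table-size cost described above.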
Overall, then, BGPSEC doesn’t meet the requirements as they’ve been outlined in this series of posts. Next week, I’ll spend some time explaining the operation of another potential system, a graph overlay, and then we’ll consider how well it meets the requirements as outlined in these posts.
Securing BGP: A Case Study (8)
Throughout the last several months, I’ve been building a set of posts examining securing BGP as a sort of case study around protocol and/or system design. The point of this series of posts isn’t to find a way to secure BGP specifically, but rather to look at the kinds of problems we need to think about when building such a system. The interplay between technical and business requirements is wide and deep. In this post, I’m going to summarize the requirements drawn from the last seven posts in the series.
Don’t try to prove things you can’t. This might feel like a bit of an “anti-requirement,” but the point is still important. In this case, we can’t prove which path traffic will actually flow along. We also can’t enforce policies, specifically “don’t transit this AS;” the best we can do is to provide information and let other operators make a local decision about what to follow and what not to follow. In the larger sense, it’s important to understand what can, and what can’t, be solved, or rather what the practical limits of any solution might be, as close to the beginning of the design phase as possible.
In the case of securing BGP, I can, at most, validate three pieces of information:
- That the origin AS in the AS Path matches the owner of the address being advertised.
- That the AS Path in the advertisement is a valid path, in the sense that each pair of autonomous systems in the AS Path is actually connected, and that no one has “inserted themselves” in the path silently (a minimal sketch of this kind of check follows this list).
- The policies of each pair of autonomous systems along the path towards one another. This is completely voluntary information, of course, and cannot be enforced in any way if it is provided, but more information provided will allow for stronger validation.
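As a rough illustration of the second point, here is a minimal sketch of checking an AS Path against a set of claimed adjacencies. The adjacency set and AS numbers are hypothetical; where that set would come from, and how its claims are validated, is exactly the problem the rest of this series is wrestling with.

```python
# Hypothetical set of validated AS-to-AS adjacencies, stored as unordered pairs.
# In a real system this would be built from signed adjacency claims, not hard-coded.
adjacencies = {
    frozenset({65000, 65001}),
    frozenset({65001, 65002}),
    frozenset({65002, 65004}),
}

def path_is_plausible(as_path: list[int]) -> bool:
    """True if every consecutive pair of ASes in the path is a known adjacency."""
    return all(
        frozenset({a, b}) in adjacencies
        for a, b in zip(as_path, as_path[1:])
    )

print(path_is_plausible([65000, 65001, 65002]))  # True: every hop is a known adjacency
print(path_is_plausible([65000, 65003, 65002]))  # False: 65003 has inserted itself
```

Note this only says the path could exist; it says nothing about whether traffic actually follows it, which is precisely the first point above.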
There is a fine balance between centralized and distributed systems. There are actually two things that can be centralized or distributed in terms of BGP security: how ownership is claimed over resources, and how the validation information is carried to each participating AS. In the case of ownership, the tradeoff is between having a widely trusted third party validate ownership claims and having a third party who can shut down an entire business. In the case of distributing the information, there is a tradeoff between the consistency and the accessibility of the validation information. These are going to be points on which reasonable people can disagree, and hence are probably areas where the successful system must have a good deal of flexibility.
Cost is a major concern. There are a number of costs that need to be considered when determining which solution is best for securing BGP, including—
- Physical equipment costs. The most obvious cost is the physical equipment required to implement each solution. For instance, any solution that requires providers to replace all their edge routers is simply not going to be acceptable.
- Process costs. Any solution that requires a lot of upkeep and maintenance is going to be cast aside very quickly. Good intentions are overruled by the tyranny of the immediate about 99.99% of the time.
Speed is also a cost that can be measured in business terms; if increasing security decreases the speed of convergence, providers who deploy security are at a business disadvantage relative to their competitors. The speed of convergence must be on the order of Internet level convergence today.
Information costs are a particularly important issue. There are at least three kinds of information that can leak out of any attempt to validate BGP, each of them related to connectivity—
- Specific information about peering, such as how many routers interconnect two autonomous systems, where interconnections are, and how interconnection points are related to one another.
- Publicly verifiable claims about interconnection. Many providers argue there is a major difference between connectivity information that can be observed and connectivity information that is claimed.
- Publicly verifiable information about business relationships. Virtually every provider considers it important not to release at least some information about their business relationships with other providers and customers.
While there is some disagreement in the community over each of these points, it’s clear that releasing the first of these is almost always going to be unacceptable, while the second and third are more situational.
With these requirements in place, it’s time to look at a couple of proposed systems to see how they measure up.
Securing BGP: A Case Study (7)
In the last post in this series on securing BGP, I considered a couple of extra questions around business problems that relate to BGP. This time, I want to consider the problem of convergence speed in light of any sort of BGP security system. The next post (to provide something of a road map) should pull all of the requirements together into a single post, so we can begin working through some of the solutions available. Ultimately, as this is a case study, we’re after a set of tradeoffs for each solution, rather than a final decision about which solution to use.
The question we need to consider here is: should the information used to provide validation for BGP be somewhat centralized, or fully distributed? The CAP theorem tells us that there are a range of choices here, with the two extreme cases being—
- A single copy of the database we’re using to provide validation information which is always consistent
- Multiple disconnected copies of the database we’re using to provide validation which is only intermittently consistent
Between these two extremes there are a range of choices (reducing all possibilities to these two extremes is, in fact, a misuse of the CAP theorem). To help understand this as a range of tradeoffs, take a look at the chart below—
The further we go to the right along this chart—
- the more copies of the database there are in existence—more copies means more devices that must have a copy, and hence more devices that must receive and update a local copy, which means slower convergence.
- the slower the connectivity between the copies of the database.
In complexity model terms, both of these relate to the interaction surface; slower and larger interaction surfaces face their tradeoff in the amount and speed of state that can be successfully (or quickly) managed in a control plane (hence the tradeoffs we see in the CAP theorem are directly mirrored in the complexity model). Given this, what is it we need out of a system used to provide validation for BGP? Let’s set up a specific situation that might help answer this question.
Assume, for a moment, that your network is under some sort of distributed denial of service (DDoS) attack. You call up some sort of DDoS mitigation provider, and they say something like “just transfer your origin validation to us, so we can advertise the route without it being black holed; we’ll scrub the traffic and transfer the clean flows back to you through a tunnel.” Now ask this: how long are you willing to wait before the DDoS protection takes place? Two or three days? A day? Hours? Minutes? If you can locate that amount of time along the chart above, then you can get a sense of the problem we’re trying to solve.
To put this in different terms: any system that provides BGP validation information must converge at roughly the speed of BGP itself.
So—why not just put the information securing BGP in BGP itself, so that routing converges at the same speed as the validation information? This implies every edge device in my network must handle cryptographic processing to verify the validation information. There are some definite tradeoffs to consider here, but we’ll leave those to the evaluation of proposed solutions.
Before leaving this post and beginning on the process of wrapping up the requirements around securing BGP (to be summarized in the next post), one more point needs to be considered. I’ll just state the point here, because the reason for this requirement should be pretty obvious.
Injecting validation information into the routing system should expose no more information about the peering structure of my AS than can be inferred through data mining of publicly available information. For instance, today I can tell that AS65000 is connected to AS65001. I can probably infer something about their business relationship, as well. What I cannot tell, today, is how many actual eBGP speakers connect the two autonomous systems, nor can I infer anything about the location of those connection points. Revealing this information could lead to some serious security and policy problems for a network operator.
Securing BGP: A Case Study (6)
In my last post on securing BGP, I said—
The CAP theorem post referenced above is here.
Before I dive into the technical issues, I want to return to the business issues for a moment. In a call this week on the topic of BGP security, someone pointed out that there is no difference between an advertisement in BGP asserting some piece of information (reachability or connectivity, take your pick), and an advertisement outside BGP asserting this same bit of information. The point of the question is this: if I can’t trust you to advertise the right thing in one setting, then why should I trust you to advertise the right thing in another? More specifically, if you’re using an automated system to build both advertisements, then both are likely to fail at the same time and in the same way.
First, this is an instance of how automation can create a system that is robust yet fragile—which leads directly back to complexity as it applies to computer networks. Remember—if you don’t see the tradeoff, then you’re not looking hard enough.
Second, this is an instance of separating trust for the originator from trust for the transit. Let’s put this in an example that might be more useful. When a credit card company sends you a new card, they send it in a sealed envelope. The sealed envelope doesn’t really do much in the way of security, as it can be opened by anyone along the way. What it does do, however (so long as the tamper resistance of the envelope is good enough) is inform the receiver about whether or not the information received is the same as the information sent. The seal on the envelope cannot make any assertions about the truthfulness of the information contained in the envelope. If the credit card company is a scam, then the credit card in the envelope is still a scam even if it’s well sealed.
There is only one way to ensure what’s in the envelope is true and useful information—having a trustworthy outsider observe the information, the sender, and the receiver, to make certain they’re all honest. But now we dive into philosophy pretty heavily, and this isn’t a philosophy blog (though I’ve often thought about changing my title to “network philosopher,” just because…).
But it’s worth considering two things about this problem—the idea of having a third party watching to make certain everyone is honest is, itself, full of problems.
First, who watches the watchers? Pretty Good Privacy has always used the concept of a web of trust—this is actually a pretty interesting idea in the realm of BGP security, but one I don’t want to dive in to right this moment (maybe in a later post).
Second, having some form of verification will necessarily cause any proposed system to rely on third parties to “make it go.” This seems to counter the ideal state of allowing an operator to run their business based on locally available information as much as possible. From a business perspective, there needs to be a balance between obtaining information that can be trusted, and obtaining information in a way that allows the business to operate—particularly in its core realm—without relying on others.
Next time, I’ll get back to the interaction of the CAP theorem and BGP security, with a post around the problem of convergence speed.
CAP Theorem and Routing
In 2000, Eric Brewer was observing and discussing the various characteristics of database systems. Through this work, he observed that a database generally has three characteristics—
- Consistency, which means the database will produce the same result for any two readers who happen to read a value at the same moment in time.
- Availability, which means the database can be read by any reader at any moment in time.
- Partition Tolerance, which means the database continues to operate even when the nodes holding its copies are partitioned (unable to communicate with one another).
Brewer, in explaining the relationship between the three in a 2012 article, says—
The CAP theorem, therefore, represents a two out of three situation—yet another two out of three “set” we encounter in the real world, probably grounded someplace in the larger space of complexity. We’ll leave the relationship to complexity on the side for the moment, however, and just look at how CAP relates to routing.
Network engineers familiar with the concepts of complexity, and their application in the network design realm, should quickly realize that this “two out of three” paradigm almost never represents a set of “hard and fast realities,” but is better represented as a set of continuums along which a system designer can choose. Moving towards one pole in the two out of three space inevitably means giving up some measure along one of the other two; the primary choices you can make are which one and how much.
One more detail is important before diving into a deeper discussion: A partition is normally thought of as a hard divide in database theory, but for the purposes of explaining how CAP applies to routing, I’m going to soften the definition a little. Instead of strict partitions, I want to consider partitioning as relating to the distribution of the database across a set of nodes separated by a network.
So how does CAP apply to routing?
To see the application, let’s ignore the actual algorithm used to compute the best path in any given routing protocol—whether Dijkstra’s SPF, or Garcia’s DUAL, or path vector—and focus, for a moment, on just getting the information required to compute loop free paths to every router running in a single failure (and/or flooding) domain. This latter point is important, because failure and flooding domains actually represent an intentional form of information hiding that causes the database to be correct for the purposes of routing (most of the time), but inconsistent in strict database consistency terms on a permanent basis. We need, therefore, to treat aggregation and summarization as a separate topic when considering how CAP impacts routing. Since the tradeoffs are easiest to see in link state protocols, we’re actually going to focus in that area.
Once we’ve softened the definition of partition tolerance, it’s immediately obvious that all forms of flooding, for instance, are simply the synchronization process of a highly partitioned database. Further, we can observe that the database flooded through the network must be, largely, always available (as in always able to be read by a local device). Given CAP, we should immediately suspect there is some sort of consistency tradeoff, as we’ve already defined two of the three legs of the two out of three set pretty strongly.
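To see flooding as database synchronization, here is a toy sketch of an update spreading through copies of a link-state database. The topology, the LSA name, and the hop-by-hop timing are all invented for illustration; the point is only the window during which the copies disagree.

```python
from collections import deque

# A toy view of flooding: each router holds its own copy of the link-state
# database, and a changed entry spreads hop by hop across the topology below.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
lsdb = {r: {"A-B": 1} for r in neighbors}   # sequence 1 everywhere: consistent

def flood(origin: str, lsa: str, seq: int):
    """Breadth-first flooding; yields the set of routers updated after each hop."""
    queue, updated = deque([origin]), set()
    lsdb[origin][lsa] = seq
    updated.add(origin)
    while queue:
        router = queue.popleft()
        for n in neighbors[router]:
            if lsdb[n].get(lsa, 0) < seq:
                lsdb[n][lsa] = seq
                updated.add(n)
                queue.append(n)
        yield set(updated)

# B originates a new version of the A-B LSA (the link changed); watch the copies converge.
for synced in flood("B", "A-B", seq=2):
    stale = [r for r in neighbors if lsdb[r]["A-B"] < 2]
    print(f"synchronized: {sorted(synced)}, still stale: {stale}")
```

Until the last copy is updated, some routers are computing best paths from the old database while others use the new one, and that window of inconsistency is where the microloop below comes from.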
And, in fact, we do. In link state protocols, the result of inconsistency is most easily seen in the problem of microloops during convergence. To understand this, let’s look at the network just below.
Assume, for a moment, that the link between routers A and B fails. At this moment, let’s assume:
- The best path for B is through A
- The best path for C is through B
- The best path for D is through C
Let’s also assume that each router in the network takes precisely the same amount of time to flood a packet, and to run SPF. When the link fails, C is going to detect the failure immediately, and hence will calculate SPF first. If we assume that D can only process this information about the topology change once it’s received an update to its database, then we can see that C will always process the new information before D.
But we can also observe this point: once C has processed the topology change—C has run SPF—router C will point at router D for its new best path. Until router D receives the new topology information and runs SPF, D will continue to point to C as the best path.
This is the microloop.
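A tiny sketch makes the loop visible. The next-hop table below is just the two routers from the example, frozen at the instant described: C has already installed its new route pointing at D, while D still points at C.

```python
# Forwarding state during convergence: C has run SPF on the new topology,
# D has not. Each entry is a router's next hop toward the destination.
fib_during_convergence = {"C": "D", "D": "C"}

def trace(start: str, max_hops: int = 6):
    """Follow next hops from a starting router; a repeated router means a loop."""
    seen, current = [], start
    for _ in range(max_hops):
        if current in seen:
            return seen + [current], True   # packet revisits a router: microloop
        seen.append(current)
        current = fib_during_convergence.get(current)
        if current is None:
            return seen, False              # delivered (or no route)
    return seen, False

print(trace("C"))   # (['C', 'D', 'C'], True): traffic ping-pongs between C and D
```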
It happens any time there is a ring in the topology of more than four hops with any link state protocol. And it’s a direct result of the relationship between consistency and availability in the distributed database. We can verify this by examining the way in which the microloop can be resolved—specifically, by using ordered FIB.
In ordered FIB, each router in the network calculates an amount of time it should wait before installing newly calculated routes. This calculation is, in turn, based on the number of routers relying on the local router for the best path. If you back up and think through how the microloop happens, you can see that what’s happening is C is now waiting for D to calculate before installing the new route into its local forwarding table.
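Here is a very rough sketch of that delay calculation, following the description above: each router waits in proportion to how many routers rely on it for the best path on the old topology. The dependency counts and the per-rank delay are invented for illustration; the actual mechanism (ordered FIB, described in RFC 6976) computes ranks and timers much more carefully.

```python
PER_RANK_DELAY_MS = 500   # illustrative only; not a value from any standard

# For each router, the set of routers that reach the destination through it on
# the old shortest-path tree. D has nobody behind it; C has D behind it.
dependents = {"C": {"D"}, "D": set()}

def install_delay(router: str) -> int:
    """Routers with more dependents wait longer, so downstream routers update first."""
    return len(dependents[router]) * PER_RANK_DELAY_MS

for r in ("C", "D"):
    print(r, install_delay(r), "ms")   # D installs at 0 ms, C waits 500 ms
```

With this ordering, D has already switched to its new path by the time C installs its route pointing at D, so the C↔D microloop never forms, at the cost of C holding its forwarding state (and the stale path) a little longer.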
In CAP terms, what ordered FIB does is reduce availability some small amount—in effect by not allowing the forwarding plane at C to read the newly calculated routes for some period of time—in order to ensure consistency in the database. This observation should confirm microloops are related to the relationship between CAP and distributed routing.
There are a number of other relationships between CAP and routing; I’ll put some of these on my list for later posts—but this one has reached the 1000 word mark, and so is long enough for a starter on the topic.
Rethinking BGP path validation (part 2)
This is the second post in the two part series on BGP path validation over on the LinkedIn Engineering blog.