Securing BGP: A Case Study (5)

BGP provides reachability for the global ‘net, as well as being used in many private networks. As a system, BGP (ultimately) isn’t very secure. But how do we go about securing BGP? This series investigates the questions, constraints, and solutions any proposal to secure BGP must deal with as a case study of asking the right questions, and working at the intersection of business and technology.

As a short review, we started off with three questions, described in the first post, each of which we’ve been considering in some detail:

  • Should we focus on a centralized solution to this problem, or a distributed one?
    • Assuming we’re using some sort of encryption to secure the information used in path validation, where do the keys come from? The fourth post considers this question.
    • Should the information used to validate paths be distributed or stored in a somewhat centralized database?
  • Should we consider solutions that are carried within the control plane, within BGP itself, or outside?
  • What is it we can actually prove in a packet switched network? This is considered in post 2 and post 3.

Here I’m going to discuss the problem of a centralized versus distributed database to carry the information needed to secure BGP. There are, again, two elements to this problem: a set of purely technical issues, and a set of more business-related problems. The technical problems revolve around the CAP theorem, which deserves a post of its own; I’ll write something on CAP next week and link it back to this series.

Which leaves us with the business side of things. What possible business justification could you give for choosing decentralized versus centralized storage of particular data sets? To really understand the business requirements that overlap with BGP security, you need to consider one specific point: businesses are in business to make money. For a provider, in particular, if the network isn’t up, you’re not making money. Anything which either slows down your network convergence or causes you to be dependent on connectivity outside your network is a “bad thing” from a provider’s perspective. How does the distributed versus centralized problem interact with these two points?

First, if a provider must be able to reach a centralized database in order to bring their network up from a large scale failure, this is a “bad thing.” Whatever solution is proposed, then, must be able to use a data set that is at least synchronized into every Autonomous System, so any information required to start the network is available locally during a failure.

Second, if a provider wants to minimize external control over their operations, they need to be able to build and advertise any information required to participate in the system from local resources, and they cannot count on remote resources to make the system run. While it’s okay to pull in information from outside of the local AS, it’s important not to rely on third parties more than necessary to make the network actually run.

Third, any proposed system must allow as much flexibility as possible in its implementation within the provider’s network. The provider doesn’t want or need another system they must adapt their processes and speed of business around. Further, any deployed system must impact the convergence of the network as little as possible.

One example of these three lines of thought is the insistence, on the part of some providers (at least), that they not be required to “touch” their edge eBGP speakers. Replacing every eBGP speaker on a provider’s edge would cost a lot of money; even companies like LinkedIn can have on the order of a thousand of these devices, and transit providers can have on the order of ten thousand. Beyond the cost, reducing the speed at which these eBGP speakers can converge can have a very negative effect on the ability of the provider to sell services.

The bottom line is this: If you have a choice between two providers, one of which can promise a higher level of security, and the other of which promises a higher level of performance, you’re almost always going to take the provider with higher performance. The provider who deploys a BGP security system that impacts performance in a real way, then, will be on the losing end of the economic battle—discouraging further deployment, and putting the provider at risk from a financial perspective. None of this would be a good thing.