Securing BGP: A Case Study (3)

To recap (or rather, as they used to say in old television shows, “last time on ‘net Work…”), this series is looking at BGP security as an exercise (or case study) in understanding how to approach engineering problems. We started this series by asking three questions, the third of which was:

What is it we can actually prove in a packet switched network?

From there, in part 2 of this series, we looked at this question more deeply, asking three “sub questions” designed to help us tease out the answer to this third question. Asking the right questions is a subtle, but crucial, part of learning how to deal with engineering problems of all sorts. Those questions can be summed up as:

  • Is the path through this peer going to pass through someone I don’t want it to pass through?
  • Is the path this peer is advertising a valid route to the destination?

Let’s quickly look at the first of these two to see why it’s not provable in the context of a packet switched network, using the network diagram below.

[Figure: bgp-sec-02 (network diagram)]

When working with BGP at Internet scale, we tend to think of an autonomous system as one “thing”; we draw it that way on network diagrams, for instance, as I’ve been doing so far in this series. But the reality is far different. Autonomous systems are made up of those pesky little things called routers. In a packet switched network, it’s important to remember each router makes an independent forwarding decision. For instance, in this network, assume Router B is advertising some destination in AS65004, say 2001:db8:0:1::/64, to Router A with an AS Path of [65004,65002]. When Router A sends traffic to a host within :1::/64, then, it might assume the traffic will follow a path from AS65002 directly to AS65004, with no intermediate hops.

The problem is: this assumption is wrong. There are a number of reasons Router C might forward traffic destined to :1::/64 toward Router D, and hence through AS65003, rather than toward Router F. For instance, Router C might be a route reflector running add paths, which means Router B has multiple routes to the destination, but it’s only advertising one of the available paths to Router A. Or perhaps the :1::/64 route is actually an aggregate of two longer prefixes, and the destination Router A is forwarding traffic to has a longer prefix match through Router E. Or perhaps Router C just has a static route configured forwarding traffic along a different path than the AS is advertising.
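The aggregate case can be sketched in a few lines of Python. This is only an illustration under assumed values: the forwarding table, the /65 prefix, and the next hops assigned to it are hypothetical and are not taken from the diagram. The sketch shows Router C making its own longest prefix match, regardless of the AS Path Router A was given.

```python
import ipaddress

# Hypothetical forwarding table for Router C. The prefixes and next hops
# are illustrative assumptions, not values taken from the diagram.
FIB = {
    # The aggregate, pointing directly toward AS65004.
    ipaddress.ip_network("2001:db8:0:1::/64"): "Router F",
    # A longer prefix inside the aggregate, pointing through AS65003.
    ipaddress.ip_network("2001:db8:0:1:8000::/65"): "Router D",
}


def next_hop(fib, destination):
    """Return the next hop for the longest prefix containing destination."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in fib if addr in net]
    if not matches:
        return None  # no route to this destination
    best = max(matches, key=lambda net: net.prefixlen)
    return fib[best]


# Both destinations fall inside the advertised :1::/64, yet they leave
# Router C in different directions.
print(next_hop(FIB, "2001:db8:0:1::10"))
print(next_hop(FIB, "2001:db8:0:1:8000::10"))
```

Router A only ever sees the single :1::/64 advertisement, so nothing in the AS Path it received reveals that the second destination is carried through a different AS.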

Whatever the reason, packet switched networks just don’t work this way. The first option, that traffic forwarded based on a specific advertisement will follow the AS Path in that advertisement, is false. What of the second? It all depends on what you mean by the word “valid.” There are actually (as is often the case) two different questions embedded within this question:

  • Is there a physical path between the peer advertising the route and the reachable destination?
  • Does every AS along the path between the peer advertising the route and the reachable destination agree to forward traffic towards the advertised destination?

The first question could be answered by verifying that every AS along the AS Path claims to have a physical connection to the next. The second, however, is trickier. To see why, let’s switch things around a little. Assume AS65004 is advertising 2001:db8:0:1::/64 towards AS65003, but not towards AS65002. Assume, as well, that AS65003 is a customer of AS65004 and AS65002; in other words, AS65003 should not be transiting traffic to any destination. How could AS65000 know this?

First, AS65002 could filter at Router D, for instance, based on some prior knowledge, or some sort of information provided by AS65004.

Second, AS65004 could somehow signal AS65002 that AS65003 shouldn’t be transiting traffic (either at all, or for this one destination).

We’ll explore the concept of signaling later in this series, when we start thinking about what sorts of solutions might be acceptable for the problem set we’re trying to solve. For now, it’s important to consider these two points:

  • All the signaling in the world from AS65004 isn’t going to help if AS65002 doesn’t pay attention to the signal.
  • If AS65004 is unwilling to tell AS65002 what its policy towards AS65003 is, there’s no way for anyone to enforce it.
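These two points can be made concrete with a small Python sketch. Everything here is hypothetical: the `accept_route` function, the `no_transit` set standing in for whatever signal AS65004 might provide, and the `honor_signal` flag standing in for whether AS65002 pays attention to it. It is not a description of any real BGP mechanism, only a way to see when a filter has something to act on.

```python
def accept_route(as_path, origin, no_transit, honor_signal=True):
    """Decide whether a receiving AS should accept a route.

    as_path      -- AS numbers carried in the advertisement
    origin       -- the AS that originated the route
    no_transit   -- hypothetical signal: set of AS numbers the origin says
                    must not provide transit, or None if no policy is shared
    honor_signal -- whether the receiving AS pays attention to the signal
    """
    if no_transit is None or not honor_signal:
        # Policy unknown or ignored: there is no basis for rejecting.
        return True
    # Every AS on the path other than the origin is providing transit.
    transit = [asn for asn in as_path if asn != origin]
    return not any(asn in no_transit for asn in transit)


# The route as AS65002 would see it if AS65003 passed it along.
leaked = [65003, 65004]

print(accept_route(leaked, 65004, {65003}))                      # signal shared and honored
print(accept_route(leaked, 65004, {65003}, honor_signal=False))  # signal ignored
print(accept_route(leaked, 65004, None))                         # policy never shared
```

Only the first call can reject the route. In the other two, the filter has nothing to enforce, either because the signal is ignored or because the policy was never shared.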

In other words, you can’t enforce what you don’t know, and enforcement is based on a prior trust arrangement of some sort. These two crucial points should be listed in the set of requirements we’re building before we start considering solutions.

In my next post in this series, I want to back up to the original three questions we discovered and start thinking through what sorts of requirements we can decipher from them.