What would it take to secure BGP? Let’s begin where any engineering problem should begin: what problem are we trying to solve?
In this network—in any collection of BGP autonomous systems—there are three sorts of problems that can occur at the AS level. For the purposes of this explanation, assume AS65000 is advertising 2001:db8:0:1::/64. While I’ve covered this ground before, it’s still useful to outline them:
- AS65001 could advertise 2001:db8:0:1::/64 as if it were locally attached. This is considered a false origination, or a hijacked route.
- AS65001 could advertise a route to 2001:db8:0:1::/64 with the AS path [65000,65001] to AS65003. This is another form of route hijacking, but instead of a direct hijack it’s a “one behind” attack. AS65001 doesn’t pretend to own the route in question, but rather to be connected to the AS that is originating the route.
- AS65000 could consider AS65003 a customer; that is, AS65003 might be purchasing Internet connectivity from AS65000. This would mean that any routes AS65000 advertises to AS65003 are not intended to be readvertised to AS65004. If, for instance, 2001:db8:0:1::/64 is advertised by AS65000 to AS65003, and AS65003 readvertises it to AS65004, AS65003 becomes a transit AS in the path, which AS65000 never intended. AS65003 might do this deliberately or by mistake, but either way the result is an incorrect traffic pattern that can be at the root of many problems. This is considered a route leak, and is fully described in this Internet draft.
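To make the three cases concrete, here is a minimal sketch in Python that sorts an advertisement into one of them. Everything in it is an assumption made for illustration: the `ORIGINS`, `LINKS`, and `PROVIDERS` tables, the `classify` function, the idea that AS65000 and AS65001 are not directly connected, and the idea that AS65004 is a second provider for AS65003. It also assumes one party holds complete and correct knowledge of origins, adjacencies, and business relationships, which no real BGP speaker has. AS paths are written origin first, matching the [65000,65001] notation above.

```python
PREFIX = "2001:db8:0:1::/64"

# Hypothetical ground truth: which AS legitimately originates each prefix.
ORIGINS = {PREFIX: 65000}

# Assumed adjacencies between autonomous systems (undirected).
# AS65000 and AS65001 are assumed NOT to be directly connected.
LINKS = {
    frozenset(pair)
    for pair in [(65000, 65003), (65001, 65003), (65003, 65004)]
}

# Assumed business relationships: customer AS -> set of its providers.
PROVIDERS = {65003: {65000, 65004}}


def classify(prefix, as_path, receiver):
    """Classify an advertisement received by `receiver`.

    `as_path` is listed origin first. An unknown prefix is reported as a
    false origination because no legitimate origin is on record for it.
    """
    hops = list(as_path) + [receiver]

    # Case 1: the first AS in the path is not the real origin.
    if hops[0] != ORIGINS.get(prefix):
        return "false origination"

    # Case 2: the path claims an adjacency that does not exist.
    for left, right in zip(hops, hops[1:]):
        if frozenset((left, right)) not in LINKS:
            return "one-behind hijack"

    # Case 3: an AS learned the route from one provider and passed it
    # to another provider, making itself a transit AS.
    for learned_from, middle, sent_to in zip(hops, hops[1:], hops[2:]):
        upstreams = PROVIDERS.get(middle, set())
        if learned_from in upstreams and sent_to in upstreams:
            return "route leak"

    return "plausible"


if __name__ == "__main__":
    print(classify(PREFIX, [65001], 65003))
    print(classify(PREFIX, [65000, 65001], 65003))
    print(classify(PREFIX, [65000, 65003], 65004))
    print(classify(PREFIX, [65000], 65003))
```

The sketch only works because the three tables are handed to it. Where that information would come from, and who could be trusted to hold it, is the open part of the problem.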
There are a number of other possibilities, but these three will be enough to deal with for thinking through the problem and solution sets. Given these are the problems, it’s in the engineering mindset to jump directly to a solution. But before we do, let’s start with a set of questions. For instance:
- Should we focus on a centralized solution to this problem, or a distributed one? Then there are the in-between solutions that create a single database that’s synchronized among all the participating autonomous systems.
- Should we consider solutions that are carried within the control plane, within BGP itself, or outside? In other words, should every eBGP speaker in the system participate, or should there be some smaller set of devices participating?
- What is it we can actually prove in a packet-switched network? This might seem like an odd question, but we are in a position where we are trying to manage traffic flows through the control plane; for instance, we are trying to prevent traffic between AS65004 and AS65000 from flowing through AS65003 in the route leak case. What, specifically, can we prove in such a case?
We’ll consider these questions, starting with the last one first, in the next post.
This post kicks off a series on what I consider to be a current and difficult design problem at Internet scale, one that involves just about every piece of the networking puzzle you can get into: BGP security. This is designed to be a sort of case study around approaching design problems, not just at the protocol level, but at an engineering level. I will probably intersperse this series with other posts over the coming months.