There are, in theory, three ways BGP can be deployed within a single AS. The first is a full mesh of iBGP peers; this might be practical for a smallish deployment (say, fewer than 10 BGP speakers), but it quickly becomes a management problem in larger, or constantly changing, deployments. The second is to deploy multiple BGP confederations, creating internal autonomous systems that are invisible to the outside world because the internal AS numbers are stripped at the real eBGP edge.
The third solution is (probably) the only solution anyone reading this has deployed in a production network: route reflectors. A quick review might be useful to set the stage.
In this diagram, B and E are connected to eBGP peers, each of which is advertising a different destination; F is advertising the 100::/64 prefix, and G is advertising the 101::/64 prefix. Assume A is the route reflector, and B, C, D, and E are route reflector clients. What happens when F advertises 100::/64 to B?
- B receives the route and advertises it through iBGP to A
- A adds its cluster ID (by default, its router ID) to the cluster list, and reflects the route to C, D, and E
- E receives this route and advertises it through its eBGP session towards G
- C does not advertise 100::/64 towards D, because D is an iBGP peer and C is not configured as a route reflector
- D does not advertise 100::/64 towards C, because C is an iBGP peer and D is not configured as a route reflector
Even if D did readvertise the route towards C, and C back towards A, A would reject the route because its router ID is in the cluster list. Although the improper use of route reflectors can get you into a lot of trouble, the usage depicted here is fairly simple. Here A will only have one path towards 100::/64, so it will only have one possible path across which to run the BGP bestpath calculation.
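The loop check described above can be sketched in a few lines of Python. This is only an illustration, not a BGP implementation: the `Route` class, the `reflect` function, and the cluster ID value `"A"` are hypothetical names chosen for this example, and the route carries nothing but a prefix and a cluster list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """A toy route: just a prefix and the cluster list it has collected."""
    prefix: str
    cluster_list: tuple = ()


def reflect(route, cluster_id):
    """Return the reflected copy of a route, or None if a loop is detected.

    The reflector rejects any route that already carries its own cluster ID;
    otherwise it prepends that ID before reflecting the route to clients.
    """
    if cluster_id in route.cluster_list:
        return None
    return Route(route.prefix, (cluster_id,) + route.cluster_list)


CLUSTER_ID_A = "A"  # hypothetical: stands in for A's cluster ID / router ID

from_b = Route("100::/64")                       # B advertises the route to A
reflected = reflect(from_b, CLUSTER_ID_A)        # A reflects it to C, D, and E
looped_back = reflect(reflected, CLUSTER_ID_A)   # the same route returns to A

print(reflected)     # the cluster list now contains "A"
print(looped_back)   # None: A rejects the route
```

The second call models the hypothetical case where the reflected route finds its way back to A, which drops it because its own identifier is already in the cluster list.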
The case of 101::/64 is a little different, however. The oddity here is the link metrics. In this network, A is going to receive two routes towards 101::/64, through D and E. Assuming all other things are equal (such as the local preference), A will choose the path to the speaker within the AS with the lowest IGP metric. Hence A will choose the path through E, advertising this route to B, C, and D. What if A were not a route reflector? If every router within the AS were part of an iBGP full mesh, what would happen? In this case:
- B would receive two routes to 101::/64, one from D with an IGP metric of 30, and a second from E with an IGP metric of 20. Assuming all other path attributes are equal, B will choose the path through E to reach 101::/64.
- C would receive two routes to 101::/64, one from D with an IGP metric of 10, and a second from E with an IGP metric of 20. Assuming all other path attributes are equal, C will choose the path through D to reach 101::/64.
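The comparison between the full mesh and the reflected case can be worked through with the IGP metrics given above. The following is a minimal sketch, assuming all other path attributes tie so that only the IGP metric to the exit point matters; the dictionary and function names are made up for this example.

```python
# IGP metric from each router to each exit point, as given in the text.
IGP_COST = {
    "B": {"D": 30, "E": 20},
    "C": {"D": 10, "E": 20},
}


def best_exit(router, exits):
    """Pick the exit point with the lowest IGP metric among the known paths."""
    return min(exits, key=lambda exit_point: IGP_COST[router][exit_point])


# Full mesh: every router hears about both exit points.
full_mesh = {router: best_exit(router, ["D", "E"]) for router in IGP_COST}

# Route reflection: A advertises only its own best path, the one through E.
reflected = {router: best_exit(router, ["E"]) for router in IGP_COST}

print(full_mesh)   # {'B': 'E', 'C': 'D'}
print(reflected)   # {'B': 'E', 'C': 'E'}
```

B ends up at E either way, while C is pushed from D (metric 10) to E (metric 20) once it only sees the reflector's choice.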
Inserting the route reflector, A, into the network does not change the best path to 101::/64 from the perspective of B, but it does change C’s best path from D to E. How can the shortest path be restored in the network? The State/Optimization/Surface (SOS) three-way tradeoff tells us there are two possible solutions: either the state removed by the route reflector must be restored into BGP, or some interaction surface needs to be enabled between BGP and some other system in the network that has the information required to restore optimal routing.
The first of these two options, restoring the state removed through route reflection, is represented by two different solutions, one of which can be considered a subset of the other. The first solution is for the route reflector, A, to send all the routes to 101::/64 to every route reflector client. This is called add paths, and is documented in RFC 7911. The problem with this solution is the amount of additional state every client must then carry.
A second option is to provide some set of paths beyond the best path to each client, but not the entire set of paths. This solution still attacks the suboptimal routing problem by adding back state that was removed through the reflection process. In this case, however, only a subset of that state is added back. The state added back is normally the second best path, which is enough information to re-optimize the network, but small enough not to overwhelm BGP.
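The effect of sending more than the single best path can be sketched as follows. The client metrics come from the example above; the reflector's own metrics are an assumption made for this sketch (the text only says A prefers E), and the function names are hypothetical.

```python
# Assumed: A's own IGP metrics to the two exit points. The text only states
# that A prefers E; the actual numbers here are invented for illustration.
REFLECTOR_COST = {"E": 10, "D": 20}

# IGP metric from each client to each exit point, as given in the text.
CLIENT_COST = {
    "B": {"D": 30, "E": 20},
    "C": {"D": 10, "E": 20},
}


def advertise(n):
    """Return the reflector's n best exit points, ranked by its own metrics."""
    return sorted(REFLECTOR_COST, key=REFLECTOR_COST.get)[:n]


def client_choice(client, paths):
    """Each client runs its own comparison over whatever paths it received."""
    return min(paths, key=CLIENT_COST[client].get)


best_only = {c: client_choice(c, advertise(1)) for c in CLIENT_COST}
two_paths = {c: client_choice(c, advertise(2)) for c in CLIENT_COST}

print(best_only)   # {'B': 'E', 'C': 'E'}
print(two_paths)   # {'B': 'E', 'C': 'D'}
```

With only two exit points, sending the second best path happens to be the same as sending everything; in a larger network the two approaches differ in how much state each client has to hold.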
What about the other option—allowing BGP to interact with some other system that has the information required to tell BGP specifically which state will allow the route reflector clients to compute the optimal path through the network? This third solution is described in BGP Optimal Route Reflection (BGP-ORR). To understand this solution, begin by asking: why does removing BGP advertisements from the control plane cause suboptimal routing? The answer to this question is: because the route reflector client does not have all the available routes, it cannot compare the IGP metric of every path in order to determine the shortest path.
In other words, C actually has two paths to 101::/64, one through A and another through D. If C knew about these two paths, it could compare the two IGP costs, through A and through D, and choose the closest exit point out of the AS. What other router in the network has all the relevant information? The route reflector, A. If a link state IGP is being used in this network, A can calculate the shortest path from C to both of the potential exit points, D and E. Further, because it is the route reflector, A knows about both of the routes to reach 101::/64. Hence, A can compute the best path as C would compute it, taking into account the IGP metric for both exit points, and send C the route it knows the BGP best path process on C will choose anyway. This is exactly what BGP Optimal Route Reflection (BGP-ORR) describes.
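The idea of the reflector computing the best path from each client's point of view can be sketched with a small shortest path calculation. The link costs below are a hypothetical topology, chosen only so that the resulting metrics match the ones used in this example (B to D is 30, B to E is 20, C to D is 10, C to E is 20); they are not taken from the diagram, and the function names are invented for this sketch.

```python
import heapq

# Hypothetical link-state database: undirected links and their IGP costs.
LINKS = {("A", "B"): 10, ("A", "C"): 10, ("A", "E"): 10, ("C", "D"): 10}


def build_graph(links):
    """Turn the link list into an adjacency map usable by SPF."""
    graph = {}
    for (u, v), cost in links.items():
        graph.setdefault(u, {})[v] = cost
        graph.setdefault(v, {})[u] = cost
    return graph


def spf(graph, root):
    """Dijkstra's algorithm: shortest IGP cost from root to every node."""
    dist = {root: 0}
    heap = [(0, root)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, cost in graph[node].items():
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist


def orr_best_exit(graph, client, exits):
    """Choose the exit point as the client would, using SPF rooted at it."""
    dist = spf(graph, client)
    return min(exits, key=lambda exit_point: dist[exit_point])


GRAPH = build_graph(LINKS)
EXITS = ["D", "E"]  # both exit points A has learned for 101::/64

# A runs the calculation once per client instead of once for itself.
choices = {client: orr_best_exit(GRAPH, client, EXITS) for client in ["B", "C"]}
print(choices)   # {'B': 'E', 'C': 'D'}
```

Because the reflector already holds the link state database and both BGP routes, it can send C the path through D without sending any additional paths, trading extra computation on A for less state on the clients.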
Hopefully this short tour through BGP route reflection, the problem route reflection causes by removing state from the network, and the potential solutions, is useful in understanding the various drafts and solutions being proposed.