There are—in theory—three ways BGP can be deployed within a single AS. You can deploy a full mesh of iBGP peers; this might be practical for a small’ish deployment (say less than 10), but it quickly becomes a management problem in larger, or constantly changing, deployments. You can deploy multiple BGP confederations; creating internal autonomous systems that are invisible to the world because the internal AS numbers are stripped at the real eBGP edge.
The third solution is (probably) the only solution anyone reading this has deployed in a production network: route reflectors. A quick review might be useful to set the stage.
In this diagram, B and E are connected to eBGP peers, each of which is advertising a different destination; F is advertising the 100::64 prefix, and G is advertising the 101::/64 prefix. Assume A is the route reflector, and B,C, D, and E are route reflector clients. What happens when F advertises 100::/64 to B?
- B receives the route and advertises it through iBGP to A
- A adds its router ID to the cluster list, and reflect the route to C, D, and E
- E receives this route and advertises it through its eBGP session towards G
- C does not advertise 100::/64 towards D, because D is an iBGP peer (not configured as a route reflector)
- D does not advertise 100::/64 towards C, because C is an iBGP peer (not configured as a route reflector)
Even if D did readvertise the route towards C, and C back towards A, A would reject the route because its router ID is in the cluster list. Although the improper use of route reflectors can get you into a lot of trouble, the usage depicted here is fairly simple. Here A will only have one path towards 100::/64, so it will only have one possible path across which to run the BGP bestpath calculation.
The case of 101::/64 is a little different, however. The oddity here is the link metrics. In this network, A is going to receive two routes towards 101::/64, through D and E. Assuming all other things are equal (such as the local preference), A will choose the path to the speaker within the AS with the lowest IGP metric. Hence A will choose the path through E, advertising this route to B, C, and D. What if A were not a route reflector? If every router within the AS were part of an iBGP full mesh, what would happen? In this case:
- B would receive three two routes to 101::/64, one from D with an IGP metric of 30, and a second from E with an IGP metric of 20. Assuming all other path attributes are equal, B will choose the path through E to reach 101::/64.
- C would receive two routes to 101::/64, one from D with an IGP metric of 10, and a second from E with an IGP metric of 20. Assuming all other path attributes are equal, C will choose the path through D to reach 101::/64.
Inserting the route reflector, A, into the network does not change the best path to 101::/64 from the perspective of B, but it does change C’s best path from D to E. How can the shortest path be restored in the network? The State/Optimization/Surface (SOS) three way trade off tells use there are two possible solutions—either the state removed by the route reflector must be restored into BGP, or some interaction surface needs to be enabled between BGP and some other system in the network that has the information required to restore optimal routing.
The first of these two options, restoring the state removed through route reflection, is represented by two different solutions, one of which can be considered a subset of the other. The first solution is for the route reflector, A, to send all the routes to 101::/64 to every route reflector client. This is called add paths, and is documented in RFC7911. The problem with this solution is the amount of additional state.
A second option is to provide some set of paths beyond the best path to each client, but not the entire set of paths. This solution still attacks the suboptimal problem by adding state that was removed through the reflection process. In this case, however, rather than adding back all the state, a subset of state is added back. The state added back is normally the second best path, which is enough to provide enough information to re-optimize the network, but minimal enough to not overwhelm BGP.
What about the other option—allowing BGP to interact with some other system that has the information required to tell BGP specifically which state will allow the route reflector clients to compute the optimal path through the network? This third solution is described in BGP Optimal Route Reflection (BGP-ORR). To understand this solution, begin by asking: why does removing BGP advertisements from the control plane cause suboptimal routing? The answer to this question is: because the route reflector client does not have all the available routes, it cannot compare the IGP metric of every path in order to determine the shortest path.
In other words, C actually has two paths to 101::/64, one through A and another through D. If C knew about these two paths, it could compare the two IGP costs, through A and through D, and choose the closest exit point out of the AS. What other router in the netwok has all the relevant information? The route reflector—A. If a link state IGP is being used in this network, A can calculate the shortest path from C to both of the potential exit points, D and E. Further, because it is the route reflector, A knows about both of the routes to reach 101::/64. Hence, A can compute the best path as C would compute it, taking into account the IGP metric for both exit points, and send C the route it knows the BGP best path process on C will choose anyway. This is exactly what BGP Optimal Route Reflection (BGP-ORR) describes.
Hopefully this short tour through BGP route reflection, the problem route reflection causes by removing state from the network, and the potential solutions, is useful in understanding the various drafts and solutions being proposed.
In late 2016, several companies worked together to fork Quagga—that is, to pull the code based into a new Git repository, and hence into a separately maintained project. This new open source routing stack has been dubbed Free Range Routing (FRR). There is current participation in the project from Cumulus, 6Wind, LinkedIn, and others; the project is aligned with the Linux Foundation, and available on GitHub. —LinkedIn
I’ve been reading a lot about the repeal of the rules putting the FCC in charge of privacy for access providers in the US recently—a lot of it rising to the level of hysteria and “the end is near” level. As you have probably been reading these stories, as well, I thought it worthwhile to take a moment and point out two pieces that seem to be the most balanced and thought through out there.
Essentially—yes, privacy is still a concern, and no, the sky is not falling. The first is by Nick Feamster, who I’ve worked with in the past, and has always seemed to have a reasonable take on things. The second is by Shelly Palmer, who I don’t always agree with, but in this case I think his analysis is correct.
Last week, the House and Senate both passed a joint resolution that prevent’s the new privacy rules from the Federal Communications Commission (FCC) from taking effect; the rules were released by the FCC last November, and would have bound Internet Service Providers (ISPs) in the United States to a set of practices concerning the collection and sharing of data about consumers. The rules were widely heralded by consumer advocates, and several researchers in the computer science community, including myself, played a role in helping to shape aspects of the rules. I provided input into the rules that helped preserve the use of ISP traffic data for research and protocol development. —CircleID
Is it time for the IETF to give up? Over at CircleID, Martin Geddes makes a case that it is, in fact, time for the IETF to “fade out.” The case he lays out is compelling—first, the IETF is not really an engineering organization. There is a lot of running after “success modes,” but very little consideration of failure modes and how they can and should be guarded against. Second, the IETF “the IETF takes on problems for which it lacks an ontological and epistemological framework to resolve.”
In essence, in Martin’s view, the IETF is not about engineering, and hasn’t ever really been.
The first problem is, of course, that Martin is right. The second problem is, though, that while he hints at the larger problem, he incorrectly lays it at the foot of the IETF. The third problem is the solutions Martin proposes will not resolve the problem at hand.
First things first: Martin is right. The IETF is a mess, and is chasing after success, rather than attending to failure. I do not think this is largely a factor of a lack of engineering skill, however—after spending 20 years working in the IETF, there seems to be a lot of ego feeding the problem. Long gone are the days when the saying “it is amazing how much work can get done when no-one cares who gets the credit” was true of the IETF. The working groups and mailing lists have become the playground of people who are primarily interested in how smart they look, how many times they “win,” and how many drafts they can get their names on. This is, of course, inevitable in any human organization, and it is a death of a thousand cuts to real engineering work.
Second, the problem is not (just) the IETF. The problem is the network engineering world in general. Many years ago I was mad at the engineering societies that said: “You can’t use engineer in your certifications, because you are not engineers.” Now, I think they have a point. The networking world is a curious mixture of folks who wrap sheet metal, design processors, build code, design networks, and console jockeys. And we call all of these folks “engineers.” How many of these folks actually know anything beyond the CLI they’re typing away on (or automating), or the innards of a single chipset? In my experience, very few.
On the other hand, there is something that needs to be said here in defense of network engineering, and information technology in general. There are two logical slips in the line of argument that need to be called out and dealt with.
The first line of argument goes like this: “But my father was a plane fitter, and he required certifications!” Sure, but planes are rare, and people die when the fall from the sky. Servers and networks are intentionally built not to be rare, and applications are intentionally built so that people do not die when they fail. It is certainly true that where the real world intersects with the network, specifically at the edge where real people live, there needs to be more thought put into not failing. But at the core, where the law of large numbers is right, we need to think about rabid success, rather than corner case failures.
There are many ways to engineer around failure; not all are appropriate in every case. Part of engineering is to learn to apply the right failure mode thinking to the right problem set, instead of assuming that every engineering problem needs to be addressed in the same way.
The second line of argument goes like this: “But airplanes don’t fail, so we should adopt aviation lines of thinking.” Sorry to tell you this, but even the most precise engineering fails in the face of the real world. Want some examples? Perhaps this, or this, or this, or this will do? Clear thinking does not begin with imbuing the rest of the engineering world with a mystique it does not actually posses.
Third, the solutions offered are not really going to help. Licensing is appropriate when you are dealing with rare things that, when they fall out of the sky, kill people. in many other cases, however, licensing just becomes an excuse to restrict the available talent pool, actually decreasing quality and innovation while raising prices. There needs to be a balance in here someplace—a balance that is probably going to be impossible to reach in the real world. But that does not mean we should not try.
What is to be done?
Dealing only with the IETF, a few practical things might be useful.
First, when any document is made into a working group document, it should be moved to an editor/contributor model. Individual authors should disappear when the work moves into the community, and the work transitions to the work of a team, rather than a small set of individuals. In other words, do what is possible to take the egos out of the game, and replace them with the pride of a job well done.
Second, standards need to explicitly call out what their failure modes are, and how designers are expected to deal with these failure modes. For edge computing, specifically, “build more and deploy them” should not be an option. This is a serious area that needs to be addressed, rather than glossed over by placing every technology at the core, and just assuming the IP paradigm of finding another path works.
Third, the IETF needs to strengthen the IRTF, and ask it to start thinking about how to quantify the differences between the kinds of engineering needed where, and what the intersection of these different kinds of engineering might look like. Far too often, we (the IETF) spend a lot of time navel gazing over which cities we should meet in, and end up leaving the larger questions off the table. We want one “winner,” and fail to embrace the wide array of problems in favor of “the largest vendor,” or “the most politically connected person in the room.”
Fourth, the IETF needs to learn to figure out what its bounds are, and then learn to let go. When I consider that there are hundreds of YANG models, for instance, I begin to suspect that this is one place where we are making some fundamental mistake about where to place the blurry dividing line between what the open source community should (or can) do, and what should be a standard. Perhaps the protocol used to carry a model should be a standard, and perhaps the things operators should expect to be able to find out about a protocol should be a part of the standard, and the modeling language should be a standard—but maybe the outline of the model itself should be left to faster market forces?
In the larger community, I am not going to change what I have been saying for years. We need to grow up and actually be engineers. We need to stop focusing on product lines and CLIs, and start focusing on learning how networks actually work. I am working on one project in this space, and have ideas for others, but for now, I can only point in the same directions I have always pointed in the past.
Had a DNS glitch mid morning ET in switching some configurations around. It should be back up and running now, and rule11.tech should be coming up as a secondary domain soon’ish.