Archive for 2022
BGP Policy (Part 7)

At the most basic level, there are only three BGP policies: pushing traffic through a specific exit point; pulling traffic through a specific entry point; preventing a remote AS (more than one AS hop away) from transiting your AS to reach a specific destination. In this series I’m going to discuss different reasons for these kinds of policies, and different ways to implement them in interdomain BGP.
In this post—the last post in this series—I’m going to cover do not transit options from the perspective of AS65001 in the following network—

There are cases where an operator does not want traffic to be forwarded to them through some specific AS, whether directly connected or multiple hops away. For instance, AS65001 and AS65005 might be operated by companies in politically unfriendly nations. In this case, AS65001 may be legally required to reject traffic that has passed through the nation in which AS65005 is located. There are at least three mechanisms in BGP that are used, in different situations, to enforce this kind of policy.
Do Not Advertise Communities (Provider Specific)
Many providers supply communities a customer can use to block the advertisement of their routes to a particular AS. For instance, suppose AS65002 were NTT. According to the NTT customer communities site, if AS65001 advertises 100::/64 with the community 65500:65005, NTT would advertise 100::/64 to all its other peers, but not to AS65005.
Note: NTT is not AS65002; this is only used as an illustration of using a community to block advertisement to a peer’s peer.
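The provider-side behavior described above can be sketched as a simple export filter. This is only an illustration: the Route class, the function names, and the list of neighbors are hypothetical, and the 65500:N community format is taken from the example in the text rather than from any provider's actual implementation.

```python
from dataclasses import dataclass, field

# First half of the "do not advertise to AS N" community, as used in the
# example above (65500:65005). Real providers document their own values.
NO_ADVERTISE_TO = 65500


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple
    communities: frozenset = field(default_factory=frozenset)


def should_export(route, neighbor_as):
    """Return False when the route carries the community (65500, neighbor_as)."""
    return (NO_ADVERTISE_TO, neighbor_as) not in route.communities


def export_targets(route, neighbors):
    """List the neighbor AS numbers the provider would advertise the route to."""
    return [asn for asn in neighbors if should_export(route, asn)]


# AS65001 tags 100::/64 with 65500:65005; the neighbor list is hypothetical.
route = Route("100::/64", (65001,), frozenset({(65500, 65005)}))
print(export_targets(route, [65003, 65004, 65005]))
```

The filter only controls what the provider itself advertises; it says nothing about how AS65005 might still learn the route, or a covering route, from someone else.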
The operator at AS65001 might reasonably expect that blocking AS65002 from advertising 100::/64 to AS65005 will block all traffic traveling through AS65005—but the vagaries of the global Internet routing table may well cause traffic to be forwarded through AS65005 anyway in some instances.
If AS65006 has a default route pointing to AS65005, traffic destined to 100::/64 may still be forwarded to AS65005. If AS65005 happens to have a covering aggregate route, or learned of the route via AS65004, it might still carry traffic destined for 100::/64.
It is almost impossible to block all traffic to a given reachable destination from being forwarded through a given autonomous system.
AS Path Injection
An alternate, widely used mechanism is to intentionally inject an AS Path loop when advertising a route to prevent some AS from accepting the route. For instance, AS65001 might advertise 100::/64 with the AS Path [65005,65001] to AS65002. AS65005 would then reject this advertisement because the local AS is already in the AS Path.
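A minimal sketch of why the injected path works, assuming each AS applies the standard loop check (reject any route whose AS Path already contains the local AS). The function names are hypothetical, and the path is written in the same order the text uses.

```python
def accepts(local_as, as_path):
    """BGP loop prevention: reject a route whose AS Path contains the local AS."""
    return local_as not in as_path


def forward(as_path, sender_as):
    """Return the AS Path as seen by the next AS after sender_as prepends itself."""
    return [sender_as] + list(as_path)


# AS65001 advertises 100::/64 to AS65002 with the injected path from the text.
injected = [65005, 65001]

# AS65002 prepends its own AS number before advertising the route onward.
beyond_65002 = forward(injected, 65002)

print(accepts(65005, beyond_65002))  # AS65005 finds itself in the path
print(accepts(65004, beyond_65002))  # AS65004 does not
```

Here AS65005 rejects the route while AS65004 accepts it, which is the effect the operator at AS65001 is after.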
While this might appear to “break the rules” of BGP, the reality is the AS Path was never really intended to be a “true record” of the path of an “update” (in fact, there is no such thing as an “update” that travels from one router to the next—the “update” is constructed at each hop based on local tables). This technique is problematic in providing “path security” in BGP, but it does not intrinsically break any BGP rules.
Note: For more information about this technique, refer to this episode of the Hedge.
Again, note it is almost impossible to block all traffic to a given reachable destination from being forwarded through a given autonomous system.
Do Not Advertise Communities (Well Known)
Three further well-known communities, although they are not widely used, are worth considering.
When a route is marked with NO-PEER, the AS should only advertise the route to its customers and never to its peers. For instance, if AS65001 advertises 100::/64 to AS65003 with NO-PEER, AS65003 will advertise the destination to AS65007 and AS65008 (assuming these are customers), and not to AS65002 or AS65004 (because both of these autonomous systems transit traffic to and from AS65003).
When a route is marked with NO-EXPORT, the AS should not advertise the reachable destination to any other AS. For instance, if AS65001 advertises 100::/64 to AS65003 with NO-EXPORT, AS65003 will not advertise this reachable destination to any other AS, including AS65007, AS65008, AS65002, or AS65004.
When a route is marked with NO-ADVERTISE, the receiving BGP speaker should not advertise the route to any other BGP speaker, including internal and external connections.
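The three behaviors above can be summarized as a small decision function. This is a sketch that follows the descriptions in this post rather than any particular implementation; the session type labels ("ibgp", "customer", "peer") and the function name are hypothetical, while the numeric community values are the standard well-known ones.

```python
# Well-known community values (32-bit).
NO_EXPORT = 0xFFFFFF01
NO_ADVERTISE = 0xFFFFFF02
NO_PEER = 0xFFFFFF04


def may_advertise(communities, session_type):
    """Decide whether a route may be sent over a session of the given type.

    session_type is one of "ibgp", "customer", or "peer" (hypothetical labels).
    """
    if NO_ADVERTISE in communities:
        # Not advertised to any other BGP speaker, internal or external.
        return False
    if NO_EXPORT in communities:
        # Stays inside the receiving AS.
        return session_type == "ibgp"
    if NO_PEER in communities:
        # As described above: customers yes, peers no.
        return session_type in ("ibgp", "customer")
    return True


for kind in ("ibgp", "customer", "peer"):
    print(kind, may_advertise({NO_PEER}, kind))
```

Checking NO-ADVERTISE first reflects that it is the most restrictive of the three.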
Hedge 128: Network Engineering at College

Have you ever thought about getting a college degree in computer networking? What are the tradeoffs between this and getting a certification? What is the state of network engineering at colleges—what do current students in network engineering programs think about their programs, and what they wish was there that isn’t? Rick Graziani joins Tom Ammon and Russ White in a broad ranging discussion on network engineering and college. Rick teaches network engineering full time in the Valley.
BGP Policy (Part 6)
At the most basic level, there are only three BGP policies: pushing traffic through a specific exit point; pulling traffic through a specific entry point; preventing a remote AS (more than one AS hop away) from transiting your AS to reach a specific destination. In this series I’m going to discuss different reasons for these kinds of policies, and different ways to implement them in interdomain BGP.
In this post I’m going to cover local preference via communities, longer prefix match, and conditional advertisement from the perspective of AS65001 in the following network—

Communities and Local Preference
As noted above, MED is the tool “designed into” BGP for selecting an entrance point into the local AS for specific reachable destinations. MED is not very effective, however, because a route’s preference will always win over MED, and because it is not carried between autonomous systems.
Some operators provide an alternate for MED in the form of communities that set a route’s preference within the AS. For instance, assume 100::/64 is geographically closer to the [65001,65003] link than either of the [65001,65002] links, so AS65001 would prefer traffic destined to 100::/64 enter through AS65003.
In this case, AS65001 can advertise 100::/64 with a community that makes AS65002 prefer the route through AS65003 over the direct route to AS65001 (see 2914:450 on NTT’s list of customer set communities as an example).
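To illustrate how a preference-setting community changes the outcome, here is a rough sketch from the point of view of the provider receiving the community (AS65002 in this example). The preference values, the community (65002, 450), and the names are all hypothetical; actual values are provider specific.

```python
from dataclasses import dataclass

# Hypothetical defaults: customer routes are normally preferred over peer routes.
DEFAULT_PREF = {"customer": 120, "peer": 100}
LOWER_PREF_COMMUNITY = (65002, 450)  # hypothetical "lower my preference" community
LOWERED_PREF = 90                    # hypothetical value, below the peer default


@dataclass(frozen=True)
class Path:
    learned_from: int
    relationship: str
    as_path: tuple
    communities: frozenset = frozenset()


def local_pref(path):
    """Local preference the provider assigns to a received path."""
    if LOWER_PREF_COMMUNITY in path.communities:
        return LOWERED_PREF
    return DEFAULT_PREF[path.relationship]


def best_path(paths):
    """Highest local preference wins; AS Path length only breaks ties."""
    return max(paths, key=lambda p: (local_pref(p), -len(p.as_path)))


direct = Path(65001, "customer", (65001,), frozenset({LOWER_PREF_COMMUNITY}))
via_65003 = Path(65003, "peer", (65003, 65001))
print(best_path([direct, via_65003]).learned_from)
```

Without the community the direct customer route would win on preference; with it, the longer path through AS65003 is selected even though its AS Path is longer.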
Note: Many of the communities described here have regional versions for more specific use cases. These operate on the same principles, just in a more restricted topological or geographical area.
Longer Prefix Match
While MED is often not effective, and using communities is both restricted in range and complex to configure and manage, advertising a longer-prefix match always works, is simple to configure, and easy to deploy.
For instance, if AS65001 would like traffic destined to 100::/64 to only enter from AS65003, it may advertise an aggregated route, say 100::/63, to both AS65003 and AS65002, and then advertise 100::/64 only to AS65003. Because all routing systems will select the prefix with the longest match first, the /64 through AS65003 will be selected over the /63 through AS65002 and AS65003, so the traffic always enters AS65001 the way the operator desires.
The overlapping, or covering, aggregate is advertised to provide backup reachability. If the [AS65001,AS65003] link (or peering) fails for any reason, traffic destined to 100::/64 will follow the /63 route, entering from AS65002. This is not optimal from the perspective of AS65001, but it keeps connectivity in place while any problems can be traced down and repaired.
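The primary and backup behavior can be sketched with Python's standard ipaddress module. This assumes the covering aggregate is 100::/63 and that a remote AS has chosen the copy of the aggregate learned through AS65002; the table contents and the function name are hypothetical.

```python
import ipaddress


def longest_match(table, destination):
    """Return (network, entry_as) for the most specific matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, entry_as) for net, entry_as in table.items() if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)


# A remote AS's view: the /64 is only reachable through AS65003, while the
# covering /63 was (in this sketch) selected through AS65002.
table = {
    ipaddress.ip_network("100::/64"): 65003,
    ipaddress.ip_network("100::/63"): 65002,
}

print(longest_match(table, "100::1"))

# If the [AS65001,AS65003] link fails, the /64 is withdrawn and the /63 takes over.
del table[ipaddress.ip_network("100::/64")]
print(longest_match(table, "100::1"))
```

The first lookup selects the /64 through AS65003; after the withdrawal the same destination falls back to the /63 through AS65002.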
According to Geoff Huston, a large percentage of the routes in the current global table are advertised for traffic engineering—to manipulate the point at which traffic destined to specific reachable destinations enters an AS.
Note: The use of longer prefix routes to control inbound route flows represents a “tragedy of the commons” problem to the global Internet. Work has been put into various mechanisms designed to remove these more specific routes from the routing table when they are no longer needed, but little progress has been made in implementing them, nor have any of these solutions achieved widespread adoption and deployment.
Conditional Advertisement
What if AS65001 has signed a contract with AS65003 to carry traffic only if both its links to AS65002 fail? In this case, AS65001 could advertise many more longer prefix specifics through AS65002 and one shorter covering route through AS65003.
This strategy, however, has two flaws. First, it requires AS65001 to manage the more specifics and covering routes as a set, making certain the pairs are correctly configured. Second, it could be that AS65001 does not want anyone to know about this backup arrangement unless and until it is used. This is sometimes the case when two competitors agree to back one another up, and neither wants anyone to know what their backup arrangements are.
To resolve these (and other) policy problems, operators can use conditional advertisement.
Conditional advertisement is conceptually simple; if a router does not have some route, x, in its routing table, it advertises some other route (given the route is in the local tables so it can be advertised). For instance, AS65001 might configure the router at C to advertise 100::/64 only when it does not have some other route.
The hardest part of configuring conditional advertisement is knowing when to trigger the advertisement of the alternate path. Using the lack of reachability to the destination itself (100::/64 in this case) as the trigger will fail in some circumstances, and will always require the global table to converge before the alternate path is advertised. Instead, conditional advertisement is often triggered by the lack of a route between the BGP speakers being “watched” (in this case, the two [65001,65002] links), learned from within the AS (within AS65001, rather than through the global routing table).
Triggering on the internal state of a link directly connected to a router managed by the local operator, and carried through internal convergence, removes external convergence from the time required to begin advertising the alternate path.
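The trigger logic can be sketched as follows. This is a conceptual illustration, not router configuration: the function name is hypothetical, and the two link prefixes standing in for the [65001,65002] links at A and B are invented for the example.

```python
def should_advertise_backup(local_table, watched_routes, backup_prefix):
    """Advertise the backup prefix only when every watched route is missing
    and the backup prefix itself is present in the local tables."""
    watched_present = any(route in local_table for route in watched_routes)
    return (not watched_present) and backup_prefix in local_table


# Hypothetical internal routes representing the two links to AS65002.
WATCHED = {"2001:db8:a::/64", "2001:db8:b::/64"}
BACKUP = "100::/64"

both_up = {"2001:db8:a::/64", "2001:db8:b::/64", "100::/64"}
one_down = {"2001:db8:b::/64", "100::/64"}
both_down = {"100::/64"}

for name, table in (("both up", both_up), ("one down", one_down), ("both down", both_down)):
    print(name, should_advertise_backup(table, WATCHED, BACKUP))
```

Only the last case, where both watched routes have disappeared from the internal tables, causes router C to start advertising 100::/64 toward AS65003.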
Is an IP Address Protected Information?
My third article on privacy and networking is up over at Packet Pushers—
BGP Policies (Part 5)
At the most basic level, there are only three BGP policies: pushing traffic through a specific exit point; pulling traffic through a specific entry point; preventing a remote AS (more than one AS hop away) from transiting your AS to reach a specific destination. In this series I’m going to discuss different reasons for these kinds of policies, and different ways to implement them in interdomain BGP.
In this post I’m going to cover AS Path Prepending from the perspective of AS65001 in the following network—

Since the length of the AS Path plays a role in choosing which path to use when forwarding traffic towards a given reachable destination, many (if not most) operators prepend the AS Path when advertising routes to a peer. Thus an AS Path of [65001], when advertised towards AS65003, can become [65001,65001] by adding one prepend, [65001,65001,65001] by adding two prepends, etc. Most BGP implementations allow an operator to prepend as many times as they would like, so it is possible to see twenty, thirty, or even higher numbers of prepends.
Note: The usefulness of prepending is generally restricted to around two or three prepends, as the average length of an AS Path in the global Internet is around 4 hops.
If AS65001 would like traffic destined to 100::/64 to enter from AS65003 rather than AS65002, it can prepend the AS Path at every peering point with AS65002 (A and B) with two hops (sending [65001,65001,65001] to AS65002). If preference, MED, and all other metrics are equal, AS65002 would then prefer the path with the shorter AS Path through AS65003, rather than the path directly into AS65001 (either through A or B).
That all metrics are equal is not likely, however. AS65002 will probably have preference set so routes learned directly from customers (such as AS65001) are selected over routes learned from peers (such as AS65003). The impact of prepending on route selection by directly connected peers is, therefore, uncertain.
Moving one step out in the network, consider the routes received by AS65004 to reach 100::/64. There will be one route along [65002,65001,65001,65001], and another with an AS Path of [65003,65001]. All other things being equal (same preference, etc.), AS65004 will choose to send traffic destined to 100::/64 through AS65003 rather than AS65002. How likely is it all the other BGP metrics will be equal at AS65004? So long as the peerings between AS65004, AS65003, and AS65002 are all of the same type, the odds are high, so prepending can help move some (not all) traffic from one inbound link to another.
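The comparison made at AS65004 can be worked through in a few lines. This sketch assumes every other attribute (preference, MED, and so on) is equal, so only AS Path length decides; the function names are hypothetical.

```python
def prepend(as_path, asn, count):
    """Return the path with `asn` added `count` extra times at the front."""
    return [asn] * count + list(as_path)


def shortest_as_path(paths):
    """Pick the path with the fewest AS hops (all other attributes assumed equal)."""
    return min(paths, key=len)


# AS65001 originates the route, then adds two prepends toward AS65002.
to_65002 = prepend([65001], 65001, 2)   # [65001, 65001, 65001]
to_65003 = [65001]                      # no prepends toward AS65003

# Each provider adds its own AS number before advertising to AS65004.
via_65002 = [65002] + to_65002
via_65003 = [65003] + to_65003

print(shortest_as_path([via_65002, via_65003]))
```

AS65004 sees a four-hop path through AS65002 and a two-hop path through AS65003, so it selects the path through AS65003.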
Because AS Path prepending has variable results over time, operators using this technique often “just try it” to see what the effect will be. There’s no real way to predict how effective prepending any number of times will be in moving traffic from one inbound link to another.
What if AS65001 does not want traffic destined to 100::/64 to traverse AS65005? For instance, suppose AS65006 is across an ocean, mountain range, or other difficult-to-cross geographic feature. AS65005 crosses this geography via a satellite link, while AS65004 crosses the same geography via an optical cable. Since optical cable runs can provide better delay and jitter than a satellite link, AS65001 may desire to choose which of these two autonomous systems is traversed to reach 100::/64.
This cannot be directly accomplished using AS Path prepend, as AS65004 and AS65005 will both receive the same prepended path.
To express this kind of policy, some operators allow their customers to set communities that cause the operator to remotely prepend a given route advertisement. For instance, NTT allows their customers to set a community that will cause NTT to prepend specific routes when those routes are advertised to specific autonomous systems. In this case, AS65001 could add the community 65421:65005 to the advertisement for 100::/64, which would cause NTT to prepend AS65001 when advertising 100::/64 to AS65005, and not prepend anything when advertising 100::/64 to AS65004.
This technique is subject to the same caveats as using AS Path prepend locally—it may work in some situations, or it may not—because the local operator does not have visibility into the policies of the operators they are trying to influence.
On Securing BGP
The US Federal Communications Commission recently asked for comments on securing Internet routing. While I worked on the responses offered by various organizations, I also put in my own response as an individual, which I’ve included below.
I am not providing this answer as a representative of any organization, but rather as an individual with long experience in the global standards and operations communities surrounding the Internet, and with long experience in routing and routing security.
I completely agree with the Notice of Inquiry that “networks are essential to the daily functioning of critical infrastructure [yet they] can be vulnerable to attack” due to insecurities in the BGP protocol. While proposed solutions exist that would increase the security of the BGP routing system, only some of these mechanisms are being widely deployed. This response will consider some of the reasons existing proposals are not deployed and suggest some avenues the Commission might explore to aid the community in developing and deploying solutions.
9: Measuring BGP Security.
At this point, I only know of the systems mentioned in the query for measuring BGP routing security incidents. There have been attempts to build other systems, but none of these systems have been successfully built or deployed. Three problems seem to affect these kinds of systems.
First, there is a general lack of funding for building and maintaining such systems. These kinds of systems require a fair amount of research and creative energy to design, including making the networking community aware of these kinds of tools.
Second, building such a system is difficult because of the nature of inter-provider policy. It is often difficult to tell if some change in the Default Free Zone (DFZ) routing is valid or is somehow related to an attack. False positives can have a very negative impact and are hard to detect and guard against.
Third, these kinds of systems generally focus on a single system—routing—while excluding hints and information that can be gained from other systems (particularly the DNS). This is, at least in part, because of the complexity of each individual system, and the difficulty in understanding how to correlate and understand information from overlapping systems.
10: Deployment of BGP Security Measures.
BGP security is divided into at least four different domains right now.
First is the exposure of policies and information through registries and similar mechanisms (such as peeringdb and whois). These mechanisms are generally useful only at the initial stages of peering, and hence are not very helpful in resolving hijacks, mistakes, etc., in near-real-time within the DFZ.
Second is the set of best common practices, such as BCP38, and represented by the MANRS effort. These will be more fully discussed in answer to question 13.
Third is origin validation, currently represented by the RPKI, which will be considered more fully in answering question 11.
Fourth is a more complete security system, currently represented by BGPSEC, which will be considered more fully in answering question 12.
11: The Commission seeks comment on the extent to which RPKI, as implemented by other regional internet registries, effectively prevents BGP hijacking.
The RPKI can effectively block some hijacking events—so long as most providers implement and “pay attention” to the validation process. There are, however, problems with the RPKI system, including—
- There is no “quality control” over the contents of the RPKI. Other systems, such as the Internet Routing Registries (IRRs), that store policy and origination information have, over time, deteriorated in terms of the quality of information housed there. There is very little research into the quality of information stored in the RPKI, nor do we have any sense about how the quality of this information will stand up over time.
- There are some concerns about the centralization of control over resources the RPKI represents. For instance, if a content or transit provider becomes entangled in a contract dispute over some resource with a registry, the registry can use the RPKI system to remove the provider from the Internet, essentially putting the provider out of business. Governments can, in theory, also cause registries to remove a provider’s authorization to use Internet resources. These are areas that may need to be researched and addressed to gain the trust of a larger part of the community.
- The RPKI system does not expose any information about a route other than the originator. This leaves the possibility of hijacking a route by an Autonomous System (AS) advertising a route even though they cannot reach the destination by simply claiming to be connected to the originating AS.
- The RPKI system does little to prevent an AS that should not be transiting traffic—end customers such as content providers and “enterprises”—from advertising routes in a way that pulls them into a transit role.
The RPKI system does appear to be gaining widespread acceptance, and its deployment is increasing in scope.
12: The Commission seeks comment on whether and to what extent network operators anticipate integrating BGPsec-capable routers into their networks.
BGPsec has not been deployed by a single provider on other than an experimental basis, as far as I know, and there are no active plans to implement BGPsec by any provider. BGPsec, in general, fails to provide enough additional security to justify the additional costs associated with its deployment. Specifically—
- Deploying BGPsec on individual routers requires the BGP speaker to perform complex cryptographic operations. No production router in existence today has the processing power to perform these operations quickly enough to be useful. The only apparent solution to this problem is to build specifically designed hardware to perform these operations—no router includes this hardware today, and no plans are in place to include them. The additional costs incurred to allow individual routers to perform these complex cryptographic operations would be prohibitive.
- If it is run “on the side” by moving the complex cryptographic operations onto a separate device, the cost and complexity of running a network are dramatically increased.
- BGPsec only signs the reachable destination (NLRI) and AS Path, which are only two components of a route. There are many other components in a route, such as the next hop and communities, that are just as important to the validity of an individual advertisement but are not covered by BGPsec. The signing of a “route” in BGPsec is a term of convenience, rather than a description of what is really signed.
- BGPsec will only provide some additional security (BGPsec is not “perfect” from a security perspective) if most providers deploy the technology. This leads to a “chicken and egg” problem.
- BGPsec reduces performance by eliminating specific optimizations, such as update packing, which have an important impact on BGP performance and BGP’s consumption of resources.
- The additional resources required by BGPsec represent a surface of attack for DDoS attacks against individual routers and, with coordination, against entire networks.
- BGPsec “freezes BGP in place” by assuming the best way to secure BGP is to “secure the way BGP works.” Deploying BGPsec would restrict future innovation in routing systems, particularly in the global Internet.
To these general problems, there is one further problem—BGPsec does not secure the withdrawal of reachability, only its advertisement. Because of this, BGPsec can only be considered a somewhat partial solution to the problems any BGP security system needs to solve.
Consider a BGP speaker that has received a signed NLRI/AS Path pair (a signed “route”). This BGP speaker can continue advertising this route so long as it appears to be valid—breaking the peering session does not invalidate the route.
Hence, the BGP speaker may mistakenly or intentionally replay this signed reachability information until something within the signed pair invalidates the information. There are four ways the signed route may be invalidated:
- A “better” route is propagated through the system
- Some form of “revocation list” is maintained and distributed
- Each signed route is given a defined “time-to-live,” after which it is invalidated
- The signing key is revoked and/or replaced
The first is impractical to guarantee in all situations. The second would involve maintaining a “negative routing table,” which is nearly impossible in practice.
The third—adding a time-to-live to BGP reachability information—imposes high operational costs. BGP assumes that so long as a peer advertising a reachable destination maintains the peering session, the destination remains reachable (the route is valid). This assumption replaces the workload of constantly advertising already existing routing information with a single “hello” process to ensure the connection is still valid. A single “hello,” then, is a proxy validating the routing information for hundreds of thousands (potentially millions) of reachable destinations. Routes, in other words, have an implied infinite time-to-live.
Adding a time-to-live to individual routes would mean a BGP speaker must readvertise a given reachable destination periodically for the routing information to continue to be considered valid. According to this site, there are currently 916,000 IPv4 routes carried by a BGP speaker connected to the Internet (the number varies by location, policies implemented, etc.). Note the analysis below does not consider IPv6 routes, which will probably be more numerous.
The time-to-live attached to any route determines how long the information can be replayed. If the originator sets the timer to 168 hours, the route can be replayed for a week before it is invalidated. It is difficult to say how long any given route should be valid, or what level of replay protection any given route requires. This illustration will assume 24 hours would be an average across many routes—but there are strong incentives to set the time-to-live much shorter, and there is little cost to the originator for doing so.
If each of these routes were given a time-to-live of 24 hours, the typical Internet BGP speaker would need to process about 10 updates/second (with the additional cryptographic processing requirements described above) just to process time-to-live expirations.
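The arithmetic behind this estimate is simple enough to check directly. The sketch below uses the 916,000 route count and the time-to-live values mentioned in the text; the one-hour case is added only to show how quickly the load grows, and the function name is hypothetical.

```python
ROUTES = 916_000  # IPv4 routes, the figure quoted above


def refreshes_per_second(routes, ttl_hours):
    """Average re-advertisements per second if every route must be refreshed once per TTL."""
    return routes / (ttl_hours * 3600)


for ttl_hours in (168, 24, 1):
    rate = refreshes_per_second(ROUTES, ttl_hours)
    print(f"TTL {ttl_hours:>3} h: {rate:.1f} updates/second")
```

At 24 hours this works out to roughly 10.6 updates per second, matching the figure above. Because the rate is inversely proportional to the time-to-live, shortening the timer to one hour raises the steady-state load by a factor of 24.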
The impacts of this level of activity in the DFZ—beyond the sheer processing and bandwidth requirements—are wide-ranging. For instance, logging, telemetry, false route detection systems, and the way timers are deployed to dampen and manage high speed flapping events, would all need to be reconsidered and adjusted.
The fourth alternative is for the signing key to be revoked when a route is withdrawn.
If the operator uses a single key to sign all routes being advertised by the AS, then replacing the key on a single route requires re-advertising every route. Readvertising every route is a difficult process, fraught with potential failure modes.
If the operator assigns each BGP speaker a key, then only the BGP speakers impacted by withdrawing the route must have their keys changed. Hence, only the routes advertised by or through these individual speakers need to be re-advertised into the routing system. However, assigning each BGP speaker an individual key for signing routing information exposes another set of problems.
Key management is an obvious problem with this solution; the exposure of peering information, and the security implications of that exposure, are non-obvious problems. If each BGP speaker on the edge of a network has its own signing key, then outside observers can determine the actual pair of routers used to connect any two autonomous systems. This creates a “map” of points at which the network can be attacked, and is generally an unacceptable exposure of information for most providers.
These issues have, to this point, prevented any serious plans for deploying BGPsec—and will probably continue to do so for the foreseeable future. The very best that can be hoped for is BGPsec deployment in 10–20 years, and even full deployment would not necessarily improve the overall security posture of the global Internet.
13: For network operators that currently participate in MANRS and comply with its requirements, including support for IETF Best Common Practice standards, the Commission seeks comment on the efficacy of such measures for preventing BGP hijacking.
MANRS, BCP38, and peer-to-peer BGP session authentication (such as TCP-AO) should, in theory, be effective against a large part of the unintentional and “unsophisticated” attacks and mistakes that cause large-scale BGP failures. There has been little research attempting to measure the impact of these measures, and it seems difficult to measure their impact.
The MANRS vendor program is an effective mechanism for promoting the common-sense practices, although it could probably be ramped up somewhat, and vendors more strongly encouraged to participate.
These measures should continue to be promoted through education, presentations, and other means, as they do appear to be improving the overall security posture of the Internet. TCP-AO, BCP38, and MANRS should, in particular, be encouraged and emphasized by all parties within the ecosystem.
14: Commission’s Role.
The Commission should focus on supporting the community in developing deployable standards and systems to improve the global routing system.
First, the Commission can encourage governmental organizations, and organizations funded by government organizations, to “go back to basics” and ask specific questions about what needs to be secured, how it can practically be secured, and what the tradeoffs are.
To this point, BGP security efforts have often begun with the question how we can secure the existing operation of BGP. This is not the right question to ask. Instead, the community needs to be encouraged to create and understand what needs to be secured. Possible questions might be—
- What does valid mean in relation to a route? Must it include the entire route, or is “just” the AS Path and reachable destination “enough?”
- In relation to the AS Path, is the AS Path given valid in the sense that it exists, and there are no policies preventing the use of this path to reach the given destination?
- In relation to the reachable destination, how can aggregation and other forms of alternate origination be supported while still answering the questions posed above?
- Will the providers along the path actually use the given path? Can “quality of path” be ensured? If so, how can this be accomplished without incurring unacceptable costs?
- How can the effectiveness of the system be measured?
- How can a system be designed so that increasing deployment increases security? How can the “tragedy of the commons” and “chicken and egg” problems be avoided?
Second, the Commission can encourage providers and operators, including large “enterprise” organizations, to participate in the process of understanding and building global routing system security. To this point, only a few providers have participated in the discussion. Quite often, those participating have a narrow perspective, and have been guided by groups asking the wrong question (as above). The scope of enquiry needs to be expanded.
What the Commission, or any other government organization, should not do is to push a solution from the top down. The IETF community is effective at finding solutions for these kinds of problems, and has vast experience in understanding the intended consequences, the unintended consequences, and operational aspects of deploying technologies at the scale of the Internet. Government agencies need to leverage these capacities, rather than trying to override them.
If funding is provided for research in this area, it should begin with some sort of “open research grant,” rather than selecting one solution to fund. Funding should not have an impact on the selection of a technical solution in open standards organizations (such as the IETF). Funding does, however, play a significant role by impacting the availability of implementations, time spent researching problems, time spent supporting a given solution at open meetings, etc.
The community must return to the beginning and find a solution that works by asking the right questions.
15: The Commission seeks comment on the extent to which the effectiveness of BGP security measures may be related to international participation and coordination.
International coordination and cooperation are basic requirements.
16: Costs and Benefits.
Please see the answers above, as some of the costs are considered there.
17: The Commission seeks comment on whether the Commission should encourage industry to prioritize the deployment of BGP security measures within the networks on which critical infrastructure and emergency services rely, as a means of helping industry to control costs otherwise associated with a network-wide deployment.
This is an attractive idea from the perspective of finding places where routing security could be deployed at a smaller scale and in a controlled manner to understand how the system works, make improvements in the system, etc. However, I would be concerned about how these kinds of services can be “separated out” for deployment in an effective way.
This kind of deployment would, however, make the problem of incremental deployment a fundamental requirement of any proposed system, which may at least encourage steps in the right direction.
Hedge 127: FR Routing Update
The FR Routing project is a fully featured open-source routing stack, including BGP, OSPF, and IS-IS (among others), supported by a community including NVIDIA, Orange, VMware, and many others. On today’s episode of the Hedge, Tom Ammon and Russ White are joined by Donald Sharp, Alistair Woodman, and Quentin Young to update listeners on projects completed and underway in FR Routing.
