The RPKI, for those who do not know, ties the origin AS to a prefix using a certificate (the Route Origin Authorization, or ROA) signed by a third party. The third party, in this case, is validating that the AS in the ROA is authorized to advertise the destination prefix in the ROA—if ROA’s were self-signed, the security would be no better than simply advertising the prefix in BGP. Who should be able to sign these ROAs? The assigning authority makes the most sense—the Regional Internet Registries (RIRs), since they (should) know which company owns which set of AS numbers and prefixes.
The general idea makes sense—you should not accept routes from “just anyone,” as they might be advertising the route for any number of reasons. An operator could advertise routes to source spam or phishing emails, or some government agency might advertise a route to redirect traffic, or block access to some web site. But … if you haven’t found the tradeoffs, you haven’t looked hard enough. Security, in particular, is replete with tradeoffs.
Every time you deploy some new security mechanism, you create some new attack surface—sometimes more than one. Deploy a stateful packet filter to protect a server, and the device itself becomes a target of attack, including buffer overflows, phishing attacks to gain access to the device as a launch-point into the private network, and the holes you have to punch in the filters to allow services to work. What about the RPKI?
When the RKI was first proposed, one of my various concerns was the creation of new attack services. One specific attack surface is the control a single organization—the issuing RIR—has over the very existence of the operator. Suppose you start a new content provider. To get the new service up and running, you sign a contract with an RIR for some address space, sign a contract with some upstream provider (or providers), set up your servers and service, and start advertising routes. For whatever reason, your service goes viral, netting millions of users in a short span of time.
Now assume the RIR receives a complaint against your service for whatever reason—the reason for the complaint is not important. This places the RIR in the position of a prosecutor, defense attorney, and judge—the RIR must somehow figure out whether or not the charges are true, figure out whether or not taking action on the charges is warranted, and then take the action they’ve settled on.
In the case of a government agency (or a large criminal organization) making the complaint, there is probably going to be little the RIR can do other than simply revoke your certificate, pulling your service off-line.
Overnight your business is gone. You can drag the case through the court system, of course, but this can take years. In the meantime, you are losing users, other services are imitating what you built, and you have no money to pay the legal fees.
A true story—without the names. I once knew a man who worked for a satellite provider, let’s call them SATA. Now, SATA’s leadership decided they had no expertise in accounts receivables, and they were spending too much time on trying to collect overdue bills, so they outsourced the process. SATB, a competing service, decided to buy the firm SATA outsourced their accounts receivables to. You can imagine what happens next… The accounting firm worked as hard as it could to reduce the revenue SATA was receiving.
Of course, SATA sued the accounting firm, but before the case could make it to court, SATA ran out of money, laid off all their people, and shut their service down. SATA essentially went out of business. They won some money later, in court, but … whatever money they won was just given to the investors of various kinds to make up for losses. The business itself was gone, permanently.
Herein lies the danger of giving a single entity like an RIR, even if they are friendly, honest, etc., control over a critical resource.
A recent paper presented at the ANRW at APNIC caught my attention as a potential way to solve this problem. The idea is simple—just allow (or even require) multiple signatures on a ROA. The ROA will be accepted so long as it meets some “signature threshold”—there are enough signatures to convince the receiver that the ROA is valid. This would take more work, of course, because the originator must ask multiple RIRs to sign their ROAs, rather than the one they received the prefix from. The tradeoff, however, is good enough to justify the extra work.
If one RIR—even the one that allocated the addresses you are using—revokes their signature on your ROA, the remaining signatures should be enough to convince anyone receiving your route that it is still valid. Since there are five regions, you have at least five different choices to countersign your ROA. Each RIR is under the control of a different national government; hence organizations like governments (or criminals!) would need to work across multiple RIRs and through other government organizations to have a ROA completely revoked.
The question is—how many signatures should be enough? The authors of the paper suggest there should be a “Threshold Signature Module” that makes this decision. It seems, to me, a better solution is to take the PGP route—let the receiver decide. In other words, the number of signatures required should be a matter of local policy, rather than something stipulated in a best common practice or standard of some kind.
This multiple signature idea seems like a neat way to work around one of (possibly the major) attack surfaces introduced by the RPKI system. If you are interested in Internet core routing security, you should take a read through the post linked above, and then watch the video.
In old presentations on network security (watch this space; I’m working on a new security course for Ignition in the next six months or so), I would use a pair of chocolate chip cookies as an illustration for network security. In the old days, I’d opine, network security was like a cookie that was baked to be crunchy on the outside and gooey on the inside. Now-a-days, however, I’d say network security needs to be more like a store-bought cookie—crunchy all the way through. I always used this illustration to make a point about defense-in-depth. You cannot assume the thin crunchy security layer at the edge of your network—generally in the form of stateful packet filters and the like (okay, firewalls, but let’s leave the appliance world behind for a moment)—is what you really need.
There are such things as insider attacks, after all. Further, once someone breaks through the thin crunchy layer at the edge, you really don’t want them being able to move laterally through your network.
The United States National Institute of Standards and Technology (NIST) has released a draft paper describing Zero Trust Architecture, which addresses many of the same concerns as the cookie that’s crunchy all the way through—the lateral movement of attackers through your network, for instance.
The situation, however, has changed quite a bit since I used the cookie illustration. The problem is no longer that the inside of your network needs to be just as secure as the outside of your network, but rather that there is no “inside” to your network any longer. For this we need to add a third cookie—the kind you get in the soft-baked packages, or even in the jar (or roll) of cookie dough—these cookies are gooey all the way through.
To understand why this is… It used to be, way back when, we had a fairly standard Demilitarized Zone design.
For those unfamiliar with this design, D is configured to block traffic to C or A’s interfaces, and C is configured as a stateful filter and to block access to A’s addresses. If D is taken over, it should not have access to C or A; if C is taken over, it still should not have access to A. This provides a sort-of defense-in-depth.
Building this kind of DMZ, however, anticipates there will be at most a few ways into the network. These entries are choke points that give the network operator a place to look for anything “funny.”
Moving applications to the cloud, widespread remote work, and many other factors have rendered the “choke point/DMZ” model of security. There just isn’t a hard edge any longer to harden; just because someone is “inside” the topological bounds of your network does not mean they are authorized to be there, or to access data and applications.
The new solution is Zero Trust—moving authentication out to the endpoints. The crux of Zero Trust is to prevent unauthorized access to data or services on a per user, per device basis. There is still an “implied trust zone,” a topology within a sort of DMZ, where user traffic is trusted—but these are small areas with no user-controlled hosts.
If you want to understand Zero Trust beyond just the oft thrown around “microsegmentation,” this paper is well worth reading, as it explains the terminology and concepts in terms even network engineers can understand.
Can you really trust what a routing protocol tells you about how to reach a given destination? Ivan Pepelnjak joins Nick Russo and Russ White to provide a longer version of the tempting one-word answer: no! Join us as we discuss a wide range of issues including third-party next-hops, BGP communities, and the RPKI.
The security of the global routing table is foundational to the security of the overall Internet as an ecosystem—if routing cannot be trusted, then everything that relies on routing is suspect, as well. Mutually Agreed Norms for Routing Security (MANRS) is a project of the Internet Society designed to draw network operators of all kinds into thinking about, and doing something about, the security of the global routing table by using common-sense filtering and observation. Andrei Robachevsky joins Russ White and Tom Ammon to talk about MANRS.
I’s fnny, bt yu cn prbbly rd ths evn thgh evry wrd s mssng t lst ne lttr. This is because every effective language—or rather every communication system—carried enough information to reconstruct the original meaning even when bits are dropped. Over-the-wire protocols, like TCP, are no different—the protocol must carry enough information about the conversation (flow data) and the data being carried (metadata) to understand when something is wrong and error out or ask for a retransmission. These things, however, are a form of data exhaust; much like you can infer the tone, direction, and sometimes even the content of conversation just by watching the expressions, actions, and occasional word spoken by one of the participants, you can sometimes infer a lot about a conversation between two applications by looking at the amount and timing of data crossing the wire.
The paper under review today, Off-Path TCP Exploit, uses cleverly designed streams of packets and observations about the timing of packets in a TCP stream to construct an off-path TCP injection attack on wireless networks. Understanding the attack requires understanding the interaction between the collision avoidance used in wireless systems and TCP’s reaction to packets with a sequence number outside the current window.
Beginning with the TCP end of things—if a TCP packet is received with a window falling outside the current window, TCP implementations will send a duplicate of the last ACK it sent back to the transmitter. From the Wireless network side of things, only one talker can use the channel at a time. If a device begins transmitting a packet, and then hears another packet inbound, it should stop transmitting and wait some random amount of time before trying to transmit again. These two things can be combined to guess at the current window size.
Assume an attacker sends a packet to a victim which must be answered, such as a probe. Before the victim can answer, the attacker than sends a TCP segment which includes a sequence number the attacker thinks might be within the victim’s receive window, sourcing the packet from the IP address of some existing TCP session. Unless the IP address of some existing session is used in this step, the victim will not answer the TCP segment. Because the attacker is using a spoofed source address, it will not receive the ACK from this segment, so it must find some other way to infer if an ACK was sent by the victim.
How can the attacker infer this? After sending this TCP sequence, the attacker sends another probe of some kind to the victim which must be answered. If the TCP segment’s sequence number is outside the current window, the victim will attempt to send a copy of its previous ACK. If the attacker times things correctly, the victim will attempt to send this duplicate ACK while the attacker is transmitting the second probe packet; the two packets will collide, causing the victim to back off, slowing the receipt of the probe down a bit from the attacker’s perspective.
If the answer to the second probe is slower than the answer to the first probe, the attacker can infer the sequence number of the spoofed TCP segment is outside the current window. If the two probes are answered in close to the same time, the attacker can infer the sequence number of the spoofed TCP segment is within the current window.
Combining this information with several other well-known aspects of widely deployed TCP stacks, the researchers found they could reliably inject information into a TCP stream from an attacker. While these injections would still need to be shaped in some way to impact the operation of the application sending data over the TCP stream, the ability to inject TCP segments in this way is “halfway there” for the attacker.
There probably never will be a truly secure communication channel invented that does not involve encryption—the data required to support flow control and manage errors will always provide enough information to an attacker to find some clever way to break into the channel.
In this episode of the Hedge, Stephane Bortzmeyer joins Alvaro Retana and Russ White to discuss draft-ietf-dprive-rfc7626-bis, which “describes the privacy issues associated with the use of the DNS by Internet users.” Not many network engineers think about the privacy implications of DNS, a important part of the infrastructure we all rely on to make the Internet work.
No, not that kind. 🙂
BGP security is a vexed topic—people have been working in this area for over twenty years with some effect, but we continuously find new problems to address. Today I am looking at a paper called BGP Communities: Can of Worms, which analyses some of the security problems caused by current BGP community usage in the ‘net. The point I want to think about here, though, is not the problem discussed in the paper, but rather some of the larger problems facing security in routing.
Assume there is some traffic flow passing from 101::47/64 and 100::46/64 in this network. AS65003 has helpfully set up community string-based policies that allow a peer to advertise a route with a specified AS Path prepend. In this case, if AS65003 receives a route with 3:65004x to prepend the route advertised towards 65004 with x number of additional AS Path entries, and 3:65005x to prepend the route advertised towards 65005 with x number of additional AS Path entries.
Assuming community strings set by AS65002 are carried with the 100::46/64 route through the rest of the network, AS65002 can:
- Advertise 100::/46 towards AS65003 with 3:650045, causing the route received at AS65006 from AS65004 to have a longer AS Path than the route received through AS65005, causing the traffic to flow through AS65005
- Advertise 100::/46 towards AS65003 with 3:650055, causing the route received at AS65006 from AS65005 to have a longer AS Path than the route received through AS65004, causing the traffic to flow through AS65004
A lot of abuse is possible because of this situation. For instance, AS65002 might know the cost of the link between AS65006 and AS65004 is very expensive, so directing large amounts of traffic across that link will cause financial harm to AS65004 or AS65006. A malicious actor at AS65002 could also determine it can overwhelm this link, causing a sort of denial of service against anyone connected to AS65004 or AS65006.
The potential problem, then, is real.
The problem is, however, how do we solve this? The most obvious way is to block communities from being transmitted beyond one hop past the point in the network where they are set. There are, however, two problems with this solution. First, how can anyone tell which AS set a community on a route? There is no originator code in the community string, and there’s no particular way to protect this kind of information from being forged or modified short of carrying a cryptographic hash in the update—which is probably not going to be acceptable from a performance perspective.
But the technical problem here is just the “tip of the iceberg.” Even if we could determine who modified the route to include the community, there is no particular way for anyone receiving the community to determine the originator’s intent. AS65002 may well install some system which measures, in near-real time, the delay across multiple paths to determine which performs the best. Such a system could be programmed with the correct community strings to impact traffic, and then left to run some sort of machine learning process to figure out how to mark routes to improve performance. If the operator at AS65002 does not realize the cost of the AS65004->AS65006 link is prohibitive, any sort of financial burden imposed by this system could be an unintended, rather than intended, consequence.
This, it turns out, is often the problem with security. It might be that person is bypassing building security to save a life, or it could be they are doing so to steal corporate secrets. There is simply no way to know without meeting the person in question, listening to their reasoning, and allowing a human to decide which course of action is appropriate.
In the case of BGP, we’re dealing with “spooky action at a distance;” the source of the problem is several steps removed from the result of the problem, there’s no clear way to connect the two, and there’s no clear way to resolve the problem other than “picking up the phone” even if one of these operators can figure out what is going on.
The problem of intent is what RFC3514’s evil bit is poking a bit of fun at—if we only knew the attacker’s intent, we could often figure out what to actually do. Not knowing intent, however, puts a major crimp in many of the best-laid security plans.