No, not that kind. 🙂
BGP security is a vexed topic—people have been working in this area for over twenty years with some effect, but we continuously find new problems to address. Today I am looking at a paper called BGP Communities: Can of Worms, which analyses some of the security problems caused by current BGP community usage in the ‘net. The point I want to think about here, though, is not the problem discussed in the paper, but rather some of the larger problems facing security in routing.
Assume there is some traffic flow passing from 101::47/64 and 100::46/64 in this network. AS65003 has helpfully set up community string-based policies that allow a peer to advertise a route with a specified AS Path prepend. In this case, if AS65003 receives a route with 3:65004x to prepend the route advertised towards 65004 with x number of additional AS Path entries, and 3:65005x to prepend the route advertised towards 65005 with x number of additional AS Path entries.
Assuming community strings set by AS65002 are carried with the 100::46/64 route through the rest of the network, AS65002 can:
- Advertise 100::/46 towards AS65003 with 3:650045, causing the route received at AS65006 from AS65004 to have a longer AS Path than the route received through AS65005, causing the traffic to flow through AS65005
- Advertise 100::/46 towards AS65003 with 3:650055, causing the route received at AS65006 from AS65005 to have a longer AS Path than the route received through AS65004, causing the traffic to flow through AS65004
A lot of abuse is possible because of this situation. For instance, AS65002 might know the cost of the link between AS65006 and AS65004 is very expensive, so directing large amounts of traffic across that link will cause financial harm to AS65004 or AS65006. A malicious actor at AS65002 could also determine it can overwhelm this link, causing a sort of denial of service against anyone connected to AS65004 or AS65006.
The potential problem, then, is real.
The problem is, however, how do we solve this? The most obvious way is to block communities from being transmitted beyond one hop past the point in the network where they are set. There are, however, two problems with this solution. First, how can anyone tell which AS set a community on a route? There is no originator code in the community string, and there’s no particular way to protect this kind of information from being forged or modified short of carrying a cryptographic hash in the update—which is probably not going to be acceptable from a performance perspective.
But the technical problem here is just the “tip of the iceberg.” Even if we could determine who modified the route to include the community, there is no particular way for anyone receiving the community to determine the originator’s intent. AS65002 may well install some system which measures, in near-real time, the delay across multiple paths to determine which performs the best. Such a system could be programmed with the correct community strings to impact traffic, and then left to run some sort of machine learning process to figure out how to mark routes to improve performance. If the operator at AS65002 does not realize the cost of the AS65004->AS65006 link is prohibitive, any sort of financial burden imposed by this system could be an unintended, rather than intended, consequence.
This, it turns out, is often the problem with security. It might be that person is bypassing building security to save a life, or it could be they are doing so to steal corporate secrets. There is simply no way to know without meeting the person in question, listening to their reasoning, and allowing a human to decide which course of action is appropriate.
In the case of BGP, we’re dealing with “spooky action at a distance;” the source of the problem is several steps removed from the result of the problem, there’s no clear way to connect the two, and there’s no clear way to resolve the problem other than “picking up the phone” even if one of these operators can figure out what is going on.
The problem of intent is what RFC3514’s evil bit is poking a bit of fun at—if we only knew the attacker’s intent, we could often figure out what to actually do. Not knowing intent, however, puts a major crimp in many of the best-laid security plans.
Security often lives in one of two states. It’s either something “I” take care of, because my organization is so small there isn’t anyone else taking care of it. Or it’s something those folks sitting over there in the corner take care of because the organization is, in fact, large enough to have a separate security team. In both cases, however, security is something that is done to networks, or something thought about kind-of off on its own in relation to networks.
I’ve been trying to think of ways to challenge this way of thinking for many years—a long time ago, in a universe far away, I created and gave a presentation on network security at Cisco Live (raise your hand if you’re old enough to have seen this presentation!).
Reading through my paper pile this week, I ran into a viewpoint in the Communications of the ACM that revived my older thinking about network security and gave me a new way to think about the problem. The author’s expression of the problem of supply chain security can be used more broadly. The illustration below is replicated from the one in the original article; I will use this as a starting point.
This is a nice way to visualize your attack surface. The columns represent applications or systems and the rows represent vulnerabilities. The colors represent the risk, as explained across the bottom of the chart. One simple way to use this would be just to list all the things in the network along the top as columns, and all the things that can go wrong as rows and use it in the same way. This would just be a cut down, or more specific, version of the same concept.
Another way to use this sort of map—and this is just a nub of an idea, so you’ll need to think about how to apply it to your situation a little more deeply—is to create two groups of columns; one column for each application that relies on network services, and one for network infrastructure devices and services you rely on. Rows would be broken up into three classes, from the top to bottom—protection, services, and systems. In the protection group you would have things the network does to protect data and applications, like segmentation, preventing data exfiltration, etc. In the services group, you would mostly have various forms of denial of service and configuration. In the systems group, you would have individual hardware devices, protocols, software packages used to make the network “go,” etc. Maybe something like the illustration below.
If you place the most important applications towards the left, and the protection towards the top, the more severe vulnerabilities will be in the upper left corner of the chart, with less severe areas falling to the right and (potentially) towards the bottom. You would fill this chart out starting in the upper left, figuring out what each kind of “protection” the network as a service can offer to each application. These should, in turn, roll down to the services the network offers and their corresponding configurations. These should, in turn, roll across to the devices and software used to create these services, and then roll back down to the vulnerabilities of those services and devices. For instance, if sales management relies on application access control, and application access control relies on proper filtering, and filtering is configured on BGP and some sort of overlay virtual link to a cloud service… You start to get the idea of where different kinds of services rely on underlying capabilities, and then how those are related to suppliers, hardware, etc.
You can color the squares in different ways—the way the original article does, perhaps, or your reliance on an outside vendor to solve this problem, etc. Once the basic chart is in place you can use multiple color schemes to get different views of the attack surface by using the chart as a sort of heat map.
Again, this is something of a nub of an idea, but it is a potentially interesting way to get a single view of the entire network ecosystem from a security standpoint, know where things are weak (and hence need work), and understand where cascading failures might happen.
If you haven’t found the trade-offs, you haven’t looked hard enough.
A perfect illustration is the research paper under review, Securing Linux with a Faster and Scalable Iptables. Before diving into the paper, however, some background might be good. Consider the situation where you want to filter traffic being transmitted to and by a virtual workload of some kind, as shown below.
To move a packet from the user space into the kernel, the packet itself must be copied into some form of memory that processes on “both sides of the divide” can read, then the entire state of the process (memory, stack, program execution point, etc.) must be pushed into a local memory space (stack), and control transferred to the kernel. This all takes time and power, of course.
In the current implementation of packet filtering, netfilter performs the majority of filtering within the kernel, while iptables acts as a user frontend as well as performing some filtering actions in the user space. Packets being pushed from one interface to another must make the transition between the user space and the kernel twice. Interfaces like XDP aim to make the processing of packets faster by shortening the path from the virtual workload to the PHY chipset.
What if, instead of putting the functionality of iptables in the user space you could put it in the kernel space? This would make the process of switching packets through the device faster, because you would not need to pull packets out of the kernel into a user space process to perform filtering.
But there are trade-offs. According to the authors of this paper, there are three specific challenges that need to be addressed. First, users expect iptables filtering to take place in the user process. If a packet is transmitted between virtual workloads, the user expects any filtering to take place before the packet is pushed to the kernel to be carried across the bridge, and back out into user space to the second process, Second, a second process, contrack, checks the existence of a TCP connection, which iptables then uses to determine whether a packet that is set to drop because there no existing connection. This give iptables the ability to do stateful filtering. Third, classification of packets is very expensive; classifying packets could take too much processing power or memory to be done efficiently in the kernel.
To resolve these issues, the authors of this paper propose using an in-kernel virtual machine, or eBPF. They design an architecture which splits iptables into to pipelines, and ingress and egress, as shown in the illustration taken from the paper below.
As you can see, the result is… complex. Not only are there more components, with many more interaction surfaces, there is also the complexity of creating in-kernel virtual machines—remembering that virtual machines are designed to separate out processing and memory spaces to prevent cross-application data leakage and potential single points of failure.
That these problems are solvable is not in question—the authors describe how they solved each of the challenges they laid out. The question is: are the trade-offs worth it?
The bottom line: when you move filtering from the network to the host, you are not moving the problem from a place where it is less complex. You may make the network design itself less complex, and you may move filtering closer to the application, so some specific security problems are easier to solve, but the overall complexity of the system is going way up—particularly if you want a high performance solution.
One of the recurring myths of IPv6 is its very large address space somehow confers a higher degree of security. The theory goes something like this: there is so much more of the IPv6 address space to test in order to find out what is connected to the network, it would take too long to scan the entire space looking for devices. The first problem with this myth is it simply is not true—it is quite possible to scan the entire IPv6 address space rather quickly, probing enough addresses to perform a tree-based search to find attached devices. The second problem is this assumes the only modes of attack available in IPv4 will directly carry across to IPv6. But every protocol has its own set of tradeoffs, and therefore its own set of attack surfaces.
Assume, for instance, you follow the “quick and easy” way of configuring IPv6 addresses on devices as they are deployed in your network. The usual process for building an IPv6 address for an interface is to take the prefix, learned from the advertisement of a locally attached router, and the MAC address of one of the locally attached interfaces, combining them into an IPv6 address (SLAAC). The size of the IPv6 address space proves very convenient here, as it allows the MAC address, which is presumably unique, to be used in building a (presumably unique) IPv6 address.
According to RFC7721, this process opens several new attack surfaces that did not exist in IPv4, primarily because the device has exposed more information about itself through the IPv6 address. First, the IPv6 address now contains at least some part of the OUI for the device. This OUI can be directly converted to a device manufacturer using web pages such as this one. In fact, in many situations you can determine where and when a device was manufactured, and often what class of device it is. This kind of information gives attackers an “inside track” on determining what kinds of attacks might be successful against the device.
Second, if the IPv6 address is calculated based on a local MAC address, the host bits of the IPv6 address of a host will remain the same regardless of where it is connected to the network. For instance, I may normally connect my laptop to a port in a desk in the Raleigh area. When I visit Sunnyvale, however, I will likely connect my laptop to a port in a conference room there. If I connect to the same web site from both locations, the site can infer I am using the same laptop from the host bits of the IPv6 address. Across time, an attacker can track my activities regardless of where I am physically located, allowing them to correlate my activities. Using the common lower bits, an attacker can also infer my location at any point in time.
Third, knowing what network adapters an organization is likely to use reduces the amount of raw address space that must be scanned to find active devices. If you know an organization uses Juniper routers, and you are trying to find all their routers in a data center or IX fabric, you don’t really need to scan the entire IPv6 address space. All you need to do is probe those addresses which would be formed using SLAAC with OUI’s formed from Juniper MAC addresses.
Beyond RFC7721, many devices also return their MAC address when responding to ICMPv6 probes in the time exceeded response. This directly exposes information about the host, so the attacker does not need to infer information from SLAAC-derived MAC addresses.
What can be done about these sorts of attacks?
The primary solution is to use semantically opaque identifiers when building IPv6 addresses using SLAAC—perhaps even using a cryptographic hash to create the base identifiers from which IPv6 addresses are created. The bottom line is, though, that you should examine the vendor documentation for each kind of system you deploy—especially infrastructure devices—as well as using packet capture tools to understand what kinds of information your IPv6 addresses may be leaking and how to prevent it.
There is a rising tide of security breaches. There is an even faster rising tide of hysteria over the ostensible reason for these breaches, namely the deficient state of our information infrastructure. Yet the world is doing remarkably well overall, and has not suffered any of the oft-threatened giant digital catastrophes. Andrew Odlyzko joins Tom Ammon and I to talk about cyber insecurity.
Fastnetmon began life as an open source DDoS detection tool, but has grown in scope over time. By connecting Fastnetmon to open source BGP implementations, operators can take action when a denial of service event is detected, triggering black holes and changing route preferences. Pavel Odintsov joins us to talk about this interesting and useful open source project.
you can subscribe to the Hedge on iTunes and other podcast listing sites, or use the RSS feed directly from rule11.tech