Backscatter is often used to detect various kinds of attacks, but how does it work? The paper under review today, Who Knocks at the IPv6 Door, explains backscatter usage in IPv4, and examines how effectively this technique might be used to detect scanning of IPv6 addresses, as well. The best place to begin is with an explanation of backscatter itself; the following network diagram will be helpful—
Assume A is scanning the IPv4 address space for some reason—for instance, to find some open port on a host, or as part of a DDoS attack. When A sends an unsolicited packet to C, a firewall (or some similar edge filtering device), C will attempt to discover the source of this packet. It could be there is some local policy set up allowing packets from A, or perhaps A is part of some domain none of the devices from C should be connecting to. IN order to discover more, the firewall will perform a reverse lookup. To do this, C takes advantage of the PTR DNS record, looking up the IP address to see if there is an associated domain name (this is explained in more detail in my How the Internet Really Works webinar, which I give every six months or so). This reverse lookup generates what is called a backscatter—these backscatter events can be used to find hosts scanning the IP address space. Sometimes these scans are innocent, such as a web spider searching for HTML servers; other times, they could be a prelude to some sort of attack.
Kensuke Fukuda and John Heidemann. 2018. Who Knocks at the IPv6 Door?: Detecting IPv6 Scanning. In Proceedings of the Internet Measurement Conference 2018 (IMC ’18). ACM, New York, NY, USA, 231-237. DOI: https://doi.org/10.1145/3278532.3278553
Scanning the IPv6 address space is much more difficult because there are 2128 addresses rather than 232. The paper under review here is one of the first attempts to understand backscatter in the IPv6 address space, which can lead to a better understanding of the ways in which IPv6 scanners are optimizing their search through the larger address space, and also to begin understanding how backscatter can be used in IPv6 for many of the same purposes as it is in IPv4.
The researchers begin by setting up a backscatter testbed across a subset of hosts for which IPv4 backscatter information is well-known. They developed a set of heuristics for identifying the kind of service or host performing the reverse DNS lookup, classifying them into major services, content delivery networks, mail servers, etc. They then examined the number of reverse DNS lookups requested versus the number of IP packets each received.
It turns out that about ten times as many backscatter incidents are reported for IPv4 than IPv6, which either indicates that IPv6 hosts perform reverse lookup requests about ten times less often than IPv4 hosts, or IPv6 hosts are ten times less likely to be monitored for backscatter events. Either way, this result is not promising—it appears, on the surface, that IPv6 hosts will be less likely to cause backscatter events, or IPv6 backscatter events are ten times less likely to be reported. This could indicate that widespread deployment of IPv6 will make it harder to detect various kinds of attacks on the DFZ. A second result from this research is that using backscatter, the researchers determined IPv6 scanning is increasing over time; while the IPv6 space is not currently a prime target for attacks, it might become more so over time, if the scanning rate is any indicator.
The bottom line is—IPv6 hosts need to be monitored as closely, or more closely than IPv6 hosts, for scanning events. The techniques used for scanning the IPv6 address space are not well understood at this time, either.
I was reading RFC8475 this week, which describes some IPv6 multihoming ‘net connection solutions. This set me to thinking about when you should uses IPv6 PA space. To begin, it’s useful to review the concept of IPv6 PI and PA space.
PI, or provider independent, space, is assigned by a regional routing registry to network operators who can show they need an address space that is not tied to a service provider. These requirements generally involve having a specific number of hosts, showing growth in the number of IPv6 addresses used over time, and other factors which depend on the regional registry providing the address space. PA, or provider assigned, IPv6 addresses can be assigned by a provider from their PI pool to an operator to which they are providing connectivity service.
There are two main differences between these two kinds of addresses. PI space is portable, which means the operator can take the address space when them when they change providers. PI space is also fixed; it is (generally) safe to use PI space as you might private or other IP address spaces; you can assign them to individual subnets, hosts, etc., and count on them remaining the same over time. If everyone obtained PI space, however, the IPv6 routing table in the default free zone (DFZ) could explode. PI space cannot be aggregated by the operator’s upstream provider because it is portable in just this way.
PA space, on the other hand, can be aggregated by your upstream provider because it is assigned by the provider. On the other hand, the provider can change the address block assigned to its customer at any time. The general idea is the renumbering capabilities built into IPv6 make it possible to “not care” about the addresses assigned to individual hosts on your network.
How does this work out in real life? Consider the following network—
Assume AS65000 assigns 2001:db8:1:1::/64 to the operator, and AS65001 assigns 2001:db8:2:2::/64. IPv6 provides mechanisms for A, B, and D to obtain addresses from within these two ranges, so each device has two IP addresses. Now assume A wants to send a packet to some site connected to the public Internet. If it sources the packet from its address in the 1:1::/64 range, A should send the packet to E; if it sources the packet from its address in the 2:2::/64 range, it should send the packet to F. This behavior is described in RFC6724, rule 5.5:
Prefer addresses in a prefix advertised by the next-hop. If SA or SA’s prefix is assigned by the selected next-hop that will be used to send to D and SB or SB’s prefix is assigned by a different next-hop, then prefer SA. Similarly, if SB or SB’s prefix is assigned by the next-hop that will be used to send to D and SA or SA’s prefix is assigned by a different next-hop, then prefer SB.
If the host uses this solution, it needs to remember which sources it has used in the past for which destinations—at least for the length of a single session, at any rate. RFC6724, however, notes supporting rule 5.5 is a SHOULD, rather than a MUST, which means some hosts may not have this capability. What should the network operator do in this case?
RFC8475 suggests using a set of policy based routing or filter based forwarding policies at the routers in the network to compensate. If E, for instance, receives a packet with a source in the range 2:2::/64 range, it should route the packet to F for forwarding. This will generally mean forwarding the packet out the interface on which E received the packet. Likewise, F should have a local policy that forwards packets it receives with a source address in the 1:1::/64 range to E. RFC8475 provides several examples of policies which would work for a number of different situations (for active/standby, active/active, one border router or two).
There are, however, two failure modes here. For the first one, assume AS65000 decides to assign the operator another IPv6 address range. IPv6’s renumbering capabilities will take care of getting the correct addresses onto A, B, and D—but the policies at E and F must be manually updated for the new address space to work correctly. It should be possible to automate the management of these filters in some way, of course, but the complexity injected into the network is larger than you might initially think.
The second failure relates to a deeper problem. What if B is not allowed, by policy, to talk to D? If the addressing in the network were consistent, the operator could set up a filter at C to prevent traffic from flowing between these two devices. When the network is renumbered, any such filters must be reconfigured, of course. A second instance of this same kind of failure: what if D is the internal DNS server for the network? While the DNS server’s address can be pushed out through the IPv6 renumbering capabilities (through NA, specifically), some manual or automated configuration must be adjusted to get the new address into the IPv6 advertisements so it can be distributed.
The short answer to the question above—when should you use PA space for a network?—comes down to this: when you have a very small, simple network behind a set of routers connecting to the ‘net, where the hosts attached to the network support RFC6724 rule 5.5, and intra-site communication is very simple (or there is no intra-site communications at all). Essentially, renumbering is not the only problem to solve when renumbering a network.
When rolling out a new protocol such as IPv6, it is useful to consider the changes to security posture, particularly the network’s attack surface. While protocol security discussions are widely available, there is often not “one place” where you can go to get information about potential attacks, references to research about those attacks, potential counters, and operational challenges. In the case of IPv6, however, there is “one place” you can find all this information: draft-ietf-opsec-v6. This document is designed to provide information to operators about IPv6 security based on solid operational experience—and it is a must read if you have either deployed IPv6 or are thinking about deploying IPv6.
The draft is broken up into four broad sections; the first is the longest, addressing generic security considerations. The first consideration is whether operators should use Provider Independent (PI) or Provider Assigned (PA) address space. One of the dangers with a large address space is the sheer size of the potential routing table in the Default Free Zone (DFZ). If every network operator opted for an IPv6 /32, the potential size of the DFZ routing table is 2.4 billion routing entries. If you thought converging on about 800,000 routes is bad, just wait ‘til there are 2.4 billion routes. Of course, the actual PI space is being handed out on /48 boundaries, which makes the potential table size exponentially larger. PI space, then, is “bad for the Internet” in some very important ways.
This document provides the other side of the argument—security is an issue with PA space. While IPv6 was supposed to make renumbering as “easy as flipping a switch,” it does not, in fact, come anywhere near this. Some reports indicate IPv6 re-addressing is more difficult than IPv4. Long, difficult renumbering processes indicate many opportunities for failures in security, and hence a large attack surface. Preferring PI space over PA space becomes a matter of reducing the operational attack surface.
Another interesting question when managing an IPv6 network is whether static addressing should be used for some services, or if all addresses should be dynamically learned. There is a perception out there that because the IPv6 address space is so large, it cannot be “scanned” to find hosts to attack. As pointed out in this draft, there is research showing this is simply not true. Further, static addresses may expose specific servers or services to easy recognition by an attacker. The point the authors make here is that either way, endpoint security needs to rely on actual security mechanisms, rather than on hiding addresses in some way.
Other very useful topics considered here are Unique Local Addresses (ULAs), numbering and managing point-to-point links, privacy extensions for SLAAC, using a /64 per host, extension headers, securing DHCP, ND/RA filtering, and control plane security.
If you are deploying, or thinking about deploying, IPv6 in your network, this is a “must read” document.
When deploying IPv6, one of the fundamental questions the network engineer needs to ask is: DHCPv6, or SLAAC? As the argument between these two has reached almost political dimensions, perhaps a quick look at the positive and negative attributes of each solution are. Originally, the idea was that IPv6 addresses would be created using stateless configuration (SLAAC). The network parts of the address would be obtained by listening for a Router Advertisement (RA), and the host part would be built using a local (presumably unique) physical (MAC) address. In this way, a host can be connected to the network, and come up and run, without any manual configuration. Of course, there is still the problem of DNS—how should a host discover which server it should contact to resolve domain names? To resolve this part, the DHCPv6 protocol would be used. So in IPv6 configuration, as initially conceived, the information obtained from RA would be combined with DNS information from DHCPv6 to fully configure an IPv6 host when it is attached to the network.
There are several problems with this scheme, as you might expect. The most obvious is that most network operators do not want to deploy two protocols to solve a single problem—configuring IPv6 hosts. What might not be so obvious, however, is that many network operators care a great deal about whether hosts are configured statelessly or through a protocol like DHCPv6.
Why would an operator want stateful configuration? Primarily because they want to control which devices can receive an IPv6 address, and hence communicate with other devices on the network. When using DHCPv6, just like DHCP with IPv4, the operator can set parameters around what kinds of devices, or perhaps even which specific devices, will be able to receive an IPv6 address. Further, the DHCPv6 server can be tied to the DNS server, so each host which connects to the network can also be given a DNS entry. Proper DNS entries are often a requirement for many applications. There are Dynamic DNS (DDNS) implementations that can solve this problem, but they are not often considered secure enough for a controlled network environment.
Why would an operator want stateless autoconfiguration? First, because they want any random user who can successfully connect to the network to be able to get an IPv6 address without any other configuration, and without the provider needing run any sort of special protocol or configuration to allow this. In fact, DHCPv6, in some environments, at least, can be seen as an attack surface, or rather a hole through which attacks can potentially be driven. Second, stateful configuration also has a failover problem; if the DHCPv6 server fails, then hosts can no longer obtain an IPv6 address, and the network no longer works. This could be, to say the least, problematic for service providers. Finally, SLAAC has a set of privacy extensions outlined in RFC4941 that (theoretically) prevent a host from being tracked based on its IPv6 address over time. This is a very attractive property for edge facing service providers.
The original set of drafts, however, only provided for DNS information to be carried through DHCPv6, and had no failover mechanism for DHCPv6. These two things, together, made it impossible to use just one of these two options. More recent work, however, has remedied both parts of this problem, making either option able to stand on its own. RFC6106, which is a bit older (2010), provides for DNS advertisement in the RA protocol. This allows an operator who would like to run everything completely stateless to do so, including hosts learning which DNS resolver to use. On the other side, RFC8156, which was just ratified in July of 2017, allows a pair of DHCPv6 servers to act as a failover pair. While this is more complex than simple DHCPv6, it does solve the problem of a host failing to operate correctly simply because the DHCPv6 server has failed.
Which of the two is now the best choice? If you do not have any requirement to restrict the hosts that can attach to the network using IPv6, then SLAAC, combined with DNS advertisement in the RA, and possibly with DDNS (if needed), would be the right choice. However, if the environment must be more secure, then DHCPv6 is likely to be the better solution.
A word of warning, though—using DHCPv6 to ensure each host received an IPv6 address that can be used anyplace in the network, and then stretching layer 2 to allow any host to roam “anywhere,” is really just not a good idea. I have worked on networks where this kind of thing has been taken to a global scale. It might seem cute at first, but this kind of solution will ultimately become a monster when it grows up.