Of the 4.2 billion IPv4 addresses available in the global space, how many are used—or rather, how many are “alive?” Given the increasing usage of IPv6, it might seem this is an unimportant question. Answering the question, however, resolves to another question that is actually more important: how can you determine whether or not an IP address is in use? This question might seem easy to answer: ping every address in the address space. This, however, turns out to be the wrong answer.
Scanning the Internet for Liveness. SIGCOMM Comput. Commun. Rev. 48, 2 (May 2018), 2-9. DOI: https://doi.org/10.1145/3213232.3213234
This answer is wrong because a substantial number of systems do not respond to ICMP requests. According to this paper, in fact, some 16% of the hosts they discovered that would respond to a TCP SYN, and another 2% that would respond to a UDP packet shaped to connect to a service, do not respond to ICMP requests. There are a number of possible reasons for this situation, including hosts being placed behind devices that block ICMP packets, hosts being configured not to respond to ICMP requests, or a server sitting behind a PAT or CGNAT device that only passes through service requests rather than all packets.
The paper begins by building a taxonomy of liveness, describing the process they use to determine if an address is in use or not, as shown in the image replicated from the paper.
One problem of note is that address usage can shift over time; between trying to use ICMP and a TCP SYN to determine if an IP address is in use, the device connected to that address can change. To limit the impact of this problem, the researchers sent each kind of liveness test to the same address close together in time. The authors then attempt to cross reference the liveness indicated using different techniques to an overall view of liveness for a particular address.
The research resulted in a number of interesting observations, such as the 16% of hosts that respond to TCP SYN probes on some port, but do not respond to ICMP requests. The kinds of ICMP and TCP responses was also quite interesting; many TCP implementations do not seem compliant to the TCP specification in how they respond to a SYN request.
Along the way, the authors added new capabilities to ZMap which allow them to perform these measurements. The tool they used has a web based frontend, and can be accessed here.
The results are interesting for network operators because they indicate the kinds of work required to find all the devices attached to a network using IP addresses—a mass ping utility is simply not enough. The tools developed here, and the lessons learned, can be added to the set of tools used by operators in all networks to better understand their IP address usage, and the shape of their networks.