Backscatter is often used to detect various kinds of attacks, but how does it work? The paper under review today, Who Knocks at the IPv6 Door, explains backscatter usage in IPv4, and examines how effectively this technique might be used to detect scanning of IPv6 addresses as well. The best place to begin is with an explanation of backscatter itself; the following network diagram will be helpful.
Assume A is scanning the IPv4 address space for some reason, for instance to find some open port on a host, or as part of a DDoS attack. When A sends an unsolicited packet to C, a firewall (or some similar edge filtering device), C will attempt to discover the source of this packet. It could be there is some local policy set up allowing packets from A, or perhaps A is part of some domain none of the devices behind C should be connecting to. In order to discover more, the firewall will perform a reverse lookup. To do this, C takes advantage of the PTR DNS record, looking up the IP address to see if there is an associated domain name (this is explained in more detail in my How the Internet Really Works webinar, which I give every six months or so). This reverse lookup generates what is called backscatter; these backscatter events can be used to find hosts scanning the IP address space. Sometimes these scans are innocent, such as a web spider searching for HTML servers; other times, they could be a prelude to some sort of attack.
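The reverse lookup described above queries a PTR record under a special name derived from the address. Here is a minimal sketch of how that query name is built, using Python's standard ipaddress module. The helper name ptr_query_name and the sample addresses (taken from the documentation ranges) are illustrative assumptions, not anything from the paper.

```python
import ipaddress


def ptr_query_name(address: str) -> str:
    """Return the DNS name a reverse (PTR) lookup would query for this address."""
    return ipaddress.ip_address(address).reverse_pointer


# IPv4: the four octets are reversed and placed under in-addr.arpa.
print(ptr_query_name("192.0.2.1"))

# IPv6: all 32 hex nibbles are reversed and placed under ip6.arpa.
print(ptr_query_name("2001:db8::1"))
```

Whoever operates the DNS servers for the reverse zone sees these queries arrive, which is what makes the lookups usable as a signal that something touched the address space.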
Kensuke Fukuda and John Heidemann. 2018. Who Knocks at the IPv6 Door?: Detecting IPv6 Scanning. In Proceedings of the Internet Measurement Conference 2018 (IMC ’18). ACM, New York, NY, USA, 231-237. DOI: https://doi.org/10.1145/3278532.3278553
Scanning the IPv6 address space is much more difficult because there are 2^128 addresses rather than 2^32. The paper under review here is one of the first attempts to understand backscatter in the IPv6 address space, which can lead to a better understanding of the ways in which IPv6 scanners are optimizing their search through the larger address space, and also to begin understanding how backscatter can be used in IPv6 for many of the same purposes as it is in IPv4.
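To get a feel for the difference in scale, here is a rough back-of-the-envelope calculation. The probe rate of one million packets per second is an assumption chosen purely for illustration; it does not come from the paper.

```python
IPV4_ADDRESSES = 2 ** 32
IPV6_ADDRESSES = 2 ** 128
PROBES_PER_SECOND = 1_000_000  # assumed scan rate, for illustration only
SECONDS_PER_YEAR = 365 * 24 * 3600


def exhaustive_scan_seconds(address_count: int, rate: int = PROBES_PER_SECOND) -> float:
    """Seconds needed to send one probe to every address at the given rate."""
    return address_count / rate


ipv4_minutes = exhaustive_scan_seconds(IPV4_ADDRESSES) / 60
ipv6_years = exhaustive_scan_seconds(IPV6_ADDRESSES) / SECONDS_PER_YEAR

print(f"IPv4: about {ipv4_minutes:.0f} minutes")
print(f"IPv6: about {ipv6_years:.1e} years")
```

At this assumed rate the whole IPv4 space takes a little over an hour, while the IPv6 space takes on the order of 10^25 years, which is why IPv6 scanners have to narrow their search rather than sweep the space.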
The researchers begin by setting up a backscatter testbed across a subset of hosts for which IPv4 backscatter information is well-known. They developed a set of heuristics for identifying the kind of service or host performing the reverse DNS lookup, classifying them into major services, content delivery networks, mail servers, etc. They then examined the number of reverse DNS lookups requested versus the number of IP packets each received.
It turns out that about ten times as many backscatter incidents are reported for IPv4 as for IPv6, which indicates either that IPv6 hosts perform reverse lookup requests about ten times less often than IPv4 hosts, or that IPv6 hosts are ten times less likely to be monitored for backscatter events. Either way, this result is not promising: it appears, on the surface, that IPv6 hosts will be less likely to cause backscatter events, or IPv6 backscatter events are ten times less likely to be reported. This could indicate that widespread deployment of IPv6 will make it harder to detect various kinds of attacks on the DFZ. A second result from this research is that, using backscatter, the researchers determined IPv6 scanning is increasing over time; while the IPv6 space is not currently a prime target for attacks, it might become more so over time, if the scanning rate is any indicator.
The bottom line is that IPv6 hosts need to be monitored as closely as, or more closely than, IPv4 hosts for scanning events. The techniques used for scanning the IPv6 address space are not well understood at this time, either.
When a recursive resolver receives a query from a host, it will first consult any local cache to discover if it has the information required to resolve the query. If it does not, it will begin with the rightmost section of the domain name, the Top Level Domain (TLD), moving left through each section of the Fully Qualified Domain Name (FQDN), in order to find an IP address to return to the host, as shown in the diagram below.
This is pretty simple at its most basic level, of course—virtually every network engineer in the world understands this process (and if you don’t, you should enroll in my How the Internet Really Works webinar the next time it is offered!). The question almost no-one ever asks, however, is: what, precisely, is the recursive server sending to the root, TLD, and authoritative servers?
Begin with the perspective of a coder who is developing the code for that recursive server. You receive a query from a host, you have the code check the local cache, and you find there is no matching information available locally. This means you need to send a query out to some other server to determine the correct IP address to return to the host. You could keep a copy of the query from the host in your local cache and build a new query to send to the root server.
Remember, however, that local server resources may be scarce; recursive servers must be optimized to process very high query rates very quickly. Much of the user’s perception of network performance is actually tied to DNS performance. A second option is you could save local memory and processing power by sending the entire query, as you have received it, on to the root server. This way, you do not need to build a new query packet to send to the root server.
Consider this process, however, in the case of a query for a local, internal resource you would rather not let the world know exists. The recursive server, by sending the entire query to the root server, is also sending information about the internal DNS structure and potential internal server names to the external root server. As the FQDN is resolved (or not), this same information is sent to the TLD and authoritative servers, as well.
There is something else contained here, however, that is not so obvious—the IP address of the requestor is contained in that original query, as well. Not only is your internal namespace leaking, your internal IP addresses are leaking, as well.
This is not only a massive security hole for your organization, it also exposes information from individual users on the global ‘net.
There are several things that can be done to resolve this problem. Organizationally, running a private DNS server, hard coding resolving servers for internal domains, and using internal domains that are not part of the existing TLD infrastructure, can go a long way towards preventing information leaking of this kind through DNS. Operating a DNS server internally might not be ideal, of course, although DNS services are integrated into a lot of other directory services used in operational networks. If you are using a local DNS server, it is important to remember to configure DHCP and/or IPv6 ND to send the correct, internal, DNS server address, rather than an external address. It is also important to either block or redirect DNS queries sent to public servers by hosts using hard-coded DNS server configurations.
A second line of defense is DNS query minimization. Described in RFC 7816, query minimization argues recursive servers should minimize the QNAME in each query, asking each server only about the one relevant part of the FQDN. For instance, if the recursive server receives a query for www.banana.example, the server should request information about .example from the root server, banana.example from the TLD, and send the full requested domain name only to the authoritative server. This way, the full search is not exposed to the intermediate servers, protecting user information.
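The difference between the two behaviors can be sketched in a few lines. This is a simplified illustration, not a resolver implementation: it assumes every label boundary is a zone cut, which is not always true in practice, and the function names are hypothetical.

```python
def minimized_queries(fqdn: str) -> list[tuple[str, str]]:
    """List (zone being asked, name revealed to it) with query minimization.

    Simplifying assumption: every label boundary is a zone cut.
    """
    labels = fqdn.rstrip(".").split(".")
    steps = []
    for depth in range(1, len(labels) + 1):
        # The servers for the parent zone are asked about one more label.
        zone_asked = ".".join(labels[-(depth - 1):]) + "." if depth > 1 else "."
        qname = ".".join(labels[-depth:]) + "."
        steps.append((zone_asked, qname))
    return steps


def unminimized_queries(fqdn: str) -> list[tuple[str, str]]:
    """Same walk, but every server along the way sees the full name."""
    full_name = fqdn.rstrip(".") + "."
    return [(zone, full_name) for zone, _ in minimized_queries(fqdn)]


for zone, name in minimized_queries("www.banana.example"):
    print(f"servers for {zone!r} are asked about {name!r}")
```

With minimization the root servers learn only that someone wants a name under example; without it they see www.banana.example in full.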
Some recursive server implementations already support QNAME minimization. If you are running a server for internal use, you should ensure the server you are using supports DNS query minimization. If you are directing your personal computer or device to publicly reachable recursive servers, you should investigate whether these servers support DNS query minimization.
Even with DNS query minimization, your recursive server still knows a lot about what you ask for—the topic of discussion on a forthcoming episode of the Hedge, where our guest will be Geoff Huston.
A long time ago, I worked in a secure facility. I won’t disclose the facility; I’m certain it no longer exists, and the people who designed the system I’m about to describe are probably long retired. Soon after being transferred into this organization, someone noted I needed to be trained on how to change the cipher door locks. We gathered up a ladder, placed the ladder just outside the door to the secure facility, popped open one of the tiles on the drop ceiling, and opened a small metal box with a standard, low security key. Inside this box was a jumper board that set the combination for the secure door.
First lesson of security: there is (almost) always a back door.
I was reminded of this while reading a paper recently published about a backdoor attack on certificate authorities. There are, according to the paper, around 130 commercial Certificate Authorities (CAs). Each of these CAs issues widely trusted certificates used for everything from securing web browsing sessions with TLS to RPKI certificates used to validate route origination information. When you encounter these certificates, you assume at least two things: the private key in the public/private key pair has not been compromised, and the person who claims to own the key is really the person you are talking to. The first of these two can come under attack through data breaches. The second is the topic of the paper in question.
How do CAs validate the person asking for a certificate actually is who they claim to be? Do they work for the organization they are obtaining a certificate for? Are they the “right person” within that organization to ask for a certificate? Shy of having a personal relationship with the person who initiates the certificate request, how can the CA validate who this person is and if they are authorized to make this request?
They could do research on the person—check their social media profiles, verify their employment history, etc. They can also send them something that, in theory, only that person can receive, such as a physical letter, or an email sent to their work email address. To be more creative, the CA can ask the requestor to create a small file on their corporate web site with information supplied by the CA. In theory, these electronic forms of authentication should be solid. After all, if you have administrative access to a corporate web site, you are probably working in information technology at that company. If you have a work email address at a company, you probably work for that company.
These electronic forms of authentication, however, can turn out to be much like the small metal box which holds the jumper board that sets the combination just outside the secure door. They can be more security theater than real security.
In fact, the authors of this paper found that some 70% of the CAs could be tricked into issuing a certificate for just about any organization—by hijacking a route. Suppose the CA asks the requestor to place a small file containing some supplied information on the corporate web site. The attacker creates a web server, inserts the file, hijacks the route to the corporate web site so it points at the fake web site, waits for the authentication to finish, and then removes the hijacked route.
The solution recommended in this paper is for the CAs to use multiple overlapping factors when authenticating a certificate requestor—which is always a good security practice. Another solution recommended by the authors is to monitor your BGP tables from multiple “views” on the Internet to discover when someone has hijacked your routes, and take active measures to either remove the hijack, or at least to detect the attack.
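The multi-view monitoring idea can be sketched as a comparison between what you expect to originate and what different vantage points actually see. This is a minimal illustration under stated assumptions: the Observation records would come from whatever route collectors or looking glasses you have access to, and the vantage point names, prefixes (documentation ranges), and AS numbers (documentation ASNs) are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    vantage_point: str  # hypothetical collector name
    prefix: str         # prefix as seen in that collector's BGP table
    origin_as: int      # last AS in the AS path


def suspicious_routes(my_prefix: str, my_origin_as: int,
                      observations: list[Observation]) -> list[Observation]:
    """Flag routes inside my address space that are originated by another AS."""
    mine = ipaddress.ip_network(my_prefix)
    flagged = []
    for obs in observations:
        seen = ipaddress.ip_network(obs.prefix)
        if seen.version != mine.version:
            continue  # subnet_of() cannot compare IPv4 with IPv6
        # True for the same prefix and for any more-specific prefix.
        inside_my_space = seen.subnet_of(mine)
        if inside_my_space and obs.origin_as != my_origin_as:
            flagged.append(obs)
    return flagged


views = [
    Observation("view-a", "203.0.113.0/24", 64500),   # expected route
    Observation("view-b", "203.0.113.0/25", 64511),   # more-specific, wrong origin
    Observation("view-c", "198.51.100.0/24", 64511),  # unrelated prefix
    Observation("view-d", "203.0.113.0/24", 64510),   # same prefix, wrong origin
]
for obs in suspicious_routes("203.0.113.0/24", 64500, views):
    print(f"{obs.vantage_point}: {obs.prefix} originated by AS{obs.origin_as}")
```

A check like this only catches hijacks that change the origin AS; an attacker who forges the legitimate origin at the end of the path would pass it, so it is a starting point rather than a complete defense.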
These are all good measures—ones your organization should already be taking.
But the larger point should be this: putting a firewall in front of your network is not enough. Trusting that others will “do their job correctly,” and hence that you can trust the claims of certificates or CAs, is not enough. The Internet is a low trust environment. You need to think about the possible back doors and think about how to close them (or at least know when they have been opened).
Having personal relationships with people you do business with is a good start. Being creative in what you monitor and how is another. Firewalls are not enough. Two-factor authentication is not enough. Security is systemic and needs to be thought about holistically.
There are always back doors.
Privacy problems are an area of wide concern for individual users of the Internet—but what about network operators? In this issue of The Internet Protocol Journal, Geoff Huston has an article up about privacy in DNS, and the various attempts to make DNS private on the part of the IETF—the result can be summarized with this long, but entertaining, quote:
The Internet is largely dominated, and indeed driven, by surveillance, and pervasive monitoring is a feature of this network, not a bug. Indeed, perhaps the only debate left today is one over the respective merits and risks of surveillance undertaken by private actors and surveillance by state-sponsored actors. … We have come a very long way from this lofty moral stance on personal privacy into a somewhat tawdry and corrupted digital world, where “do no evil!” has become “don’t get caught!”
Before diving into a full-blown look at the many problems with DNS security, it is worth considering what kinds of information can leak through the DNS system. Let’s ignore the recent discovery that DNS queries can be used to exfiltrate data; instead, let’s look at more mundane data leakage from DNS queries.
For instance, say you work in a marketing department for a company that is just about to release a new product. In order to build the marketing and competitive materials your sales critters will need to stand in front of customers, you do a lot of research around competitor products. In the process, you examine, in detail, each of the competing products’ pages. Or perhaps you work in a company that is determining whether purchasing or merging with another company might be a good idea. Or you are working on a new externally facing application, or component in an existing application, that relies on a new connection point into your network.
All of these processes can lead to a lot of DNS queries. For someone who knows what they are looking for, the pattern of queries, combined with strings queried from search engines and other information, may be enough to guess a lot about that new product, what company your company is thinking about buying or merging with, what your new application is going to do, etc. DNS is a treasure trove of information at a personal and organizational level.
Operators and protocol designers have been working for years to resolve these problems, making DNS queries “more private”; Geoff Huston’s article provides a good overview of many of these attempts. DNS over HTTPS (DoH), a recent (and ongoing) attempt, bears a closer look.
DNS is normally sent “in plain text” over the network; anyone who can capture the packets can read not only the query, but also the responses. The simplest way to solve this problem is to encrypt the DNS data in flight using something like TLS—hence DoT, or DNS over TLS. One problem with DoT is it is carried over a unique port number, which means it is probably blocked by default by most packet filters, and can easily be blocked by administrators who either do not know what this traffic is, or do not want it on their network. To solve this, DoH carries TLS encrypted traffic in a way that makes it look just like an HTTPS session. If you block DoH traffic, you will also block access to web servers running HTTPS. This is the logical “end” of carrying everything else over HTTPS to avoid the impact of stateful and stateless packet filters and the impact of middle boxes on Internet traffic.
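To make the DoH mechanism concrete: a DoH client builds an ordinary DNS query in wire format and carries it inside an HTTPS request, as specified in RFC 8484. The sketch below only constructs the GET URL and sends nothing. The resolver URL https://dns.example/dns-query and the helper names are hypothetical.

```python
import base64
import struct


def build_dns_query(name: str, qtype: int = 1, query_id: int = 0) -> bytes:
    """Build a minimal DNS query in wire format (qtype 1 = A, class IN)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def doh_get_url(resolver_url: str, name: str) -> str:
    """Return the URL a DoH client would fetch using the GET method."""
    wire = build_dns_query(name)
    # RFC 8484 uses base64url with the trailing padding removed.
    encoded = base64.urlsafe_b64encode(wire).rstrip(b"=").decode("ascii")
    return f"{resolver_url}?dns={encoded}"


print(doh_get_url("https://dns.example/dns-query", "www.example.com"))
```

Once this request is wrapped in TLS, an observer on the path sees only an HTTPS connection to the resolver, with the queried name hidden inside it.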
The good result is, in fact, that DNS traffic can no longer be “spied on” by anyone outside servers in the DNS system itself. Whether or not this is “enough” privacy is a matter of conjecture, however. Servers within the DNS system can still collect information about what queries you are making; if the server has access to other information about you or your organization, combining this data into a profile, or using it to determine whether some deeper investigation is warranted by looking at other sources of data, is pretty simple. Ultimately, DoH is only really useful if you trust your DNS provider.
Do you? Perhaps more importantly—should you?
DNS providers are like any other business; they must buy hardware, connectivity, and the time of smart people who can make the system work, troubleshoot the system when it fails, and think about ways of improving the system. If the service is free…
DoH, however, has another problem Geoff outlines in his article—DNS is moved up the stack so it no longer runs over TCP and UDP directly, but rather it runs over HTTPS. This means local applications, like browsers, can run DNS queries independently of the operating system. In fact, because these queries are TLS encrypted, the operating system itself cannot even “see” the contents of these DNS queries. This might be a good thing—or might be a bad thing. If nothing else, it means the browser, or any other application, can choose to use a resolver not configured by the local operating system. A browser maker, for instance, can direct their browser to send all DNS queries made within the browser to their DNS server, exposing another source of information about users (and the organizations they work for).
Remember that time you typed an internal hostname incorrectly in your browser? Thankfully, you had a local DNS server configured, so the query did not go out to a resolver on the Internet. With DoH, the query can go out to an open resolver on the Internet regardless of how your local systems are configured. Something to ponder.
The bottom line is this—the nature of DNS makes it extremely difficult to secure. Somehow you have to have someone operate, and pay for, an open database of names which translate to addresses. Somehow you have to have a protocol that allows this database to be queried. All of these “somehows” expose information, and there is no clear way to hide that information. You can solve parts of the problem, but not the whole problem. Solving one part of the problem seems to make another part of the problem worse.
If you haven’t found the tradeoff, you haven’t looked hard enough.
In the end, though, the privacy of DNS queries at a personal and organizational level is something you need to think about.
Every so often, while browsing the web, you run into a web page that asks if you would like to allow the site to push notifications to your browser. Apparently, according to the paper under review, about 12% of the people who receive this notification allow notifications. What, precisely, is this doing, and what are the side effects?
Allowing notifications allows the server to kick off one of two different kinds of processes on the local computer, a service worker. There are, in fact, two kinds of worker apps that can run “behind” a web site in HTML5: the web worker and the service worker. The web worker is designed to calculate or locally render some object that will appear on the site, such as decrypting a downloaded audio file for local playback. This moves the processing load (including the power and cooling use!) from the server to the client, saving money for the hosting provider, and (potentially) rendering the object in question more quickly.
A service worker, on the other hand, is designed to support notifications. For instance, say you keep a news web site open all day in your browser. You do not necessarily want to reload the page every few minutes; instead, you would rather the site send you a notification through the browser when some new story has been posted. Since the service worker is designed to cause an action in the browser on receiving a notification from the server, it has direct access to the network side of the host, and it can run even when the tab showing the web site is not visible.
In fact, because service workers are sometimes used to coordinate the information on multiple tabs, a service worker can both communicate between tabs within the same browser and stay running in the browser’s context even though the tab that started the service worker is closed. To make certain other tabs do not block while the service worker is running, service workers run in a separate thread; they can consume resources from a different core in your processor, so you are not aware (from a performance perspective) they are running. To sweeten the pot, a service worker can be restarted after your browser has restarted by a special push notification from the server.
If a service worker sounds like a perfect setup for running code that can mine bitcoins or launch DDoS attacks from your web browser, then you might have a future in computer security. This is, in fact, what MarioNet, a proof-of-concept system described in this paper, does—it uses a service worker to consume resources off as many hosts as it can install itself on to do just about anything, including launching a DDoS attack.
Given the above, it should be simple enough to understand how the attack works. When the user lands on a web page, ask for permission to push notifications. A lot of web sites that do not seem to need such permission ask now, particularly ecommerce sites, so the question does not seem out of place almost anywhere any longer. Install a service worker, using the worker’s direct connection to the host’s network to communicate to a controller. The controller can then install code to be run into the service worker and direct the execution of that code. If the user closes their browser, randomly push notifications back to the browser, in case the user opens it again, thus recreating the service worker.
Since the service worker runs in a separate thread, the user will not notice any impact on web browsing performance from the use of their resources—in fact, MarioNet’s designers use fine-grained tracking of resources to ensure they do not consume enough to be noticed. Since the service worker runs between the browser and the host operating system, no defenses built into the browser can detect the network traffic to raise a flag. Since the service worker is running in the context of the browser, most anti-virus software packages will give the traffic and processing a pass.
Several lessons follow. First, making something powerful from a compute perspective will always open holes like this. There will never be a system that allows the transfer of computation from one system to another without some hole which can be exploited.
Second, abstraction hides complexity, even the complexity of an attack or security breach, nicely. Abstraction is like anything else in engineering: if you haven’t found the tradeoffs, you haven’t looked hard enough.
Third, close your browser when you are done. The browser is, in many ways, an open door to the outside world through which all sorts of people can make it into your computer. I have often wanted to create a VM or container in which I can run a browser from a server on the ‘net. When I’m done browsing, I can shut the entire thing down and restore the state to “clean.” No cookies, no java stuff, no nothing. A nice fresh install each time I browse the web. I’ve never gotten around to building this, but I should really put it on my list of things to do.
Fourth, don’t accept inbound connection requests without really understanding what you are doing. A notification push is, after all, just another inbound connection request. It’s like putting a hole in your firewall for that one FTP server that you can’t control. Only it’s probably worse.