DDoS

Weekend Reads 052518

Without adtech, the EU’s GDPR (General Data Protection Regulation) would never have happened. But the GDPR did happen, and as a result websites all over the world are suddenly posting notices about their changed privacy policies, use of cookies, and opt-in choices for “relevant” or “interest-based” (translation: tracking-based) advertising. Email lists are doing the same kinds of things. @Doc Searls Weblog

A newly-uncovered form of DDoS attack takes advantage of a well-known, yet still exploitable, security vulnerability in the Universal Plug and Play (UPnP) networking protocol to allow attackers to bypass common methods for detecting their actions. —Danny Palmer @ZDNet

Today, that’s coming in the form of imperceptible musical signals that can be used to take control of smart devices like Amazon’s Alexa or Apple’s Siri to unlock doors, send money, or any of the other things that we give these wicked machines the authority to do. That’s according to a New York Times report, which says researchers in China and the United States have proven that they’re able to “send hidden commands” to smart devices that are “undetectable to the human ear” simply by playing music. —Sam Barsanti @AVI News

In a paper we recently presented at the Passive and Active Measurement Conference 2018 [PDF 652 KB], we analyzed the certificate ecosystem using CT logs. To perform this analysis we downloaded 600 million certificates from 30 CT logs. This vast certificate set gives us insight into the ecosystem itself and allows us to analyze various certificate characteristics. —Oliver Gasser @APNIC

With cybercrime skyrocketing over the past two decades, companies that do business online — whether retailers, banks, or insurance companies — have devoted increasing resources to improving security and combatting Internet fraud. But sophisticated fraudsters do not limit themselves to the online channel, and many organizations have been slow to adopt effective measures to mitigate the risk of fraud carried out through other channels, such as customer contact centers. In many ways, the phone channel has become the weak link. —Patrick Cox @Dark Reading

Just a few years after Bitcoin emerged, startups began racing to build ASICs for mining the currency. Nearly all of those companies have gone belly-up, however—except Bitmain. The company is estimated to control more than 70 percent of the market for Bitcoin-mining hardware. It also uses its hardware to mine bitcoins for itself. A lot of bitcoins: according to Blockchain.info, Bitmain-affiliated mining pools make up more than 40 percent of the computing power available for Bitcoin mining. —Mike Orcutt @Technology Review

Weekend Reads 030918: Botnet Avalanche, DNS Security, and IoT Privacy

It’s been a busy few weeks in cybercrime news, justifying updates to a couple of cases we’ve been following closely at KrebsOnSecurity. In Ukraine, the alleged ringleader of the Avalanche malware spam botnet was arrested after eluding authorities in the wake of a global cybercrime crackdown there in 2016. @Krebs on Security

Reflection amplification is a technique that allows cyber attackers to both magnify the amount of malicious traffic they can generate, and obfuscate the sources of that attack traffic. For the past five years, this combination has been irresistible to attackers, and for good reason. —Carlos Morales @Arbor

For years, we’ve been pioneering the use of DNS to enforce security. We recognized that DNS was often a blind spot for organizations and that using DNS to enforce security was both practical and effective. Why? Because DNS isn’t optional. It’s foundational to how the internet works and is used by every single device that connects to the network. If you’re considering using DNS for security, it’s important to understand the facts so you can combat the fiction. —Kevin Rollinson @Cisco

Attackers have seized on a relatively new method for executing distributed denial-of-service (DDoS) attacks of unprecedented disruptive power, using it to launch record-breaking DDoS assaults over the past week. Now evidence suggests this novel attack method is fueling digital shakedowns in which victims are asked to pay a ransom to call off crippling cyberattacks. @Krebs on Security

Amazon continues to improve the Consumer IoT space, introducing more — and smarter — WIFI-enabled gadgets. Good for us, but even better for Amazon: They get both our money and our data. —Jean-Louis Gassée @Monday Note

In December, Edward Snowden unveiled a new app called Haven, which turns your Android phone into a monitoring device to detect and record activity. Snowden has pitched Haven as a safeguard against so-called evil maid attacks, in which an adversary snoops through your digital devices or installs trackers on them when you’re not around. In interviews, Snowden was clear that one group he thought might use Haven was victims of intimate partner violence, who could use it to record abusers tampering with their devices. —Karen Levy @Slate

It’s my rather controversial view that the edge will, over the longer term (10+ years), eclipse what we call the cloud: the giant centralized hyper-scale data centers, which offer a progressive set of abstractions as a service for running applications and storing data. —Chetan Venkatesh

In earlier blog posts (Looks Like We’re Upgrading Again! Dual-Rate 40G/100G BiDi Transceiver and 40/100G QSFP BiDi Transceiver’s Backward Compatibility With 40G BiDi), we introduced the dual-rate 40/100G QSFP BiDi transceiver and described how Cisco uniquely offers 40G capability and backward compatibility. Let’s review why the QSFP+ 40G BiDi was such a big hit in the first place when it was released back in 2013, and how the BiDi value proposition still makes plenty of sense. —Pat Chou @Cisco

A large number of banks, credit unions and other financial institutions just pushed customers onto new e-banking platforms that asked them to reset their account passwords by entering a username plus some other static identifier — such as the first six digits of their Social Security number (SSN), or a mix of partial SSN, date of birth and surname. Here’s a closer look at what may be going on (spoiler: small, regional banks and credit unions have grown far too reliant on the whims of just a few major online banking platform providers). —Krebs on Security

Flowspec and RFC1998?

In a recent comment, Dave Raney asked:

Russ, I read your latest blog post on BGP. I have been curious about another development. Specifically is there still any work related to using BGP Flowspec in a similar fashion to RFC1998. In which a customer of a provider will be able to ask a provider to discard traffic using a flowspec rule at the provider edge. I saw that these were in development and are similar but both appear defunct. BGP Flowspec-ORF https://www.ietf.org/proceedings/93/slides/slides-93-idr-19.pdf BGP Flowspec Redirect https://tools.ietf.org/html/draft-ietf-idr-flowspec-redirect-ip-02.

This is a good question—to which there are two answers. The first is that this service does exist. While it’s not widely publicized, a number of transit providers do, in fact, offer the ability to send them a flowspec community that will cause them to set a filter on their end of the link. This kind of service is immensely useful for countering Distributed Denial of Service (DDoS) attacks, of course. The problem is that such services are expensive. The one provider I have personal experience with charges per prefix, and the cost is high enough to make the service much less attractive.

Why would the cost be so high? The same reason a lot of providers do not filter for unicast Reverse Path Forwarding (uRPF) failures at scale—per-packet filtering is very performance intensive, sometimes requiring recycling the packet in the ASIC. A line card normally able to support x customers without filtering may only be able to support x/2 customers with filtering. The provider has to pay for additional space, power, and configuration (the flowspec rules must be configured and maintained on the customer-facing router). All of these are costs the provider is going to pass on to their customers. The cost is high enough that I know very few network operators (in fact, so few as to be zero) who will pay for this kind of service.

The second answer is that there is another kind of service that is similar to what Dave is asking about. Many DDoS protection services offer their customers the ability to signal a request to the provider to block traffic from a particular source, or to help them manage a DDoS in some other way. This is very similar to the idea of interdomain flowspec, only using a different signaling mechanism. The signaling mechanism, in this case, is designed to allow the provider more leeway in how they respond to the request for help countering the DDoS. This system is called DDoS Open Threat Signaling (DOTS); you can read more about it at this post I wrote at the ECI Telecom blog. You can also head over to the IETF DOTS WG page, and read through the drafts yourself.

Yes, I do answer reader comments… Sometimes just in email, and sometimes with a post—so comment away, ask questions, etc.

On the ‘web: A new way to deal with DDoS

Most large scale providers manage Distributed Denial of Service (DDoS) attacks by spreading the attack over as many servers as possible, and simply “eating” the traffic. This traffic spreading routine is normally accomplished using Border Gateway Protocol (BGP) communities and selective advertisement of reachable destinations, combined with the use of anycast to regionalize and manage load sharing on inbound network paths. But what about the smaller operator, who may only have two or three entry points, and does not have a large number of servers, or a large aggregate edge bandwidth, to react to DDoS attacks?

I write for ECI about once a month; this month I explain DOTS over there. Want to know what DOTS is? Then you need to click on the link above and read the story. 🙂

Distributed Denial of Service Open Threat Signaling (DOTS)

When the inevitable 2AM call happens—”our network is under attack”—what do you do? After running through the OODA loop (1, 2, 3, 4), you have used communities to distribute the attack as much as possible and mitigated it where possible, and now you realize there is little more you can do locally. What now? You need to wander out on the ‘net and try to figure out how to stop this thing. You could try to use flowspec, but many providers do not like to support flowspec, because it directly impacts the forwarding performance of their edge boxes. Further, flowspec, used in this situation, doesn’t really work to walk the attack back to its source; the provider’s network is still impacted by the DDoS attack.

This is where DOTS comes in. There are four components of DOTS, as shown below (taken directly from the relevant draft)—

The best place to start is with the attack target—that’s you, at 6AM, after trying to chase this thing down for a few hours, panicked because the office is about to open, and your network is still down. Within your network there would also be a DOTS client; this would be a small piece of software running on a virtual machine, or in a container, someplace, for instance. This might be commercially developed, provided by your provider, or perhaps an open source version available from a Git repository or some other place. The third component is the DOTS server, which resides in the provider’s network. The diagram only shows one DOTS server, but in reality any information about an ongoing DDoS attack would be relayed to other DOTS servers, pushing the mitigation effort as close to the originating host(s) as possible. The mitigator then takes any actions required to slow or eliminate the attack (including using mechanisms such as flowspec).

The DOTS specifications in the IETF are related primarily to the signaling between the client and the server; the remainder of the ecosystem around signaling and mitigation is outside the scope of the working group (at least currently). There are actually two channels in this signaling mechanism, as shown below (again, taken directly from the draft)—

The signal channel carries information about the DDoS attack in progress, requests to mitigate the attack, and other meta information. The information is marshaled into a set of YANG models and carried in a compact binary encoding over CoAP, for efficiency in representation and processing. The information encoded in these models includes the typical five-tuple expanded into sets—a range of source and destination addresses, a range of source and destination ports, and so on.
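
To make the shape of this signaling concrete, here is a minimal sketch (purely illustrative) of the kind of mitigation scope a client might hand to a server. The field names are simplified stand-ins, not the actual YANG model or binary encoding from the drafts.

```python
import json

# Purely illustrative: a simplified stand-in for the mitigation scope a DOTS
# client signals to a DOTS server. The real signal channel marshals this data
# through YANG models and a compact binary encoding over CoAP; these field
# names are simplified, not taken from the actual model.
mitigation_request = {
    "mitigation-id": 12332,
    "target-prefixes": ["192.0.2.0/24"],        # destinations under attack
    "source-prefixes": ["198.51.100.0/24"],     # observed attack sources
    "target-port-ranges": [{"lower": 80, "upper": 443}],
    "target-protocols": [6, 17],                # TCP and UDP
    "lifetime": 3600,                           # seconds the mitigation should run
}

# A client would serialize this and send it to the DOTS server, which relays
# the request toward mitigators sitting closer to the attack sources.
print(json.dumps(mitigation_request, indent=2))
```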

The data channel is designed to carry a sample of the DDoS flow(s), so the receiving server can perform further analytics, or even examine the flow to verify the information being transmitted over the signal channel.

How is this different from flowspec mitigation techniques?

First, the signaling runs to a server on the provider side, rather than directly to the edge router. This means the provider can use whatever means might make sense, rather than focusing on performance impacting filters applied directly by a customer. This also means some intelligence can be built into the server to prevent DOTS from becoming a channel for attacks (an attack surface), unlike flowspec.

Second, DOTS is designed with third party DDoS mitigation services in mind. This means that your upstream provider is not necessarily the provider you signal to using DOTS. You can purchase access from one provider, and DDoS mitigation services from another provider.

Third, DOTS is designed to help providers drive the DDoS traffic back to its source (or sources). This allows the provider to benefit from the DDoS protection, rather than just the customer. DOTS-like systems have already been deployed by various providers; standardizing the interface between the client and the server will allow the ‘net as a whole to push DDoS back more effectively in coming years.

What can you do to help?

You can ask your upstream and DDoS providers to support DOTS in their services. You can also look for DOTS servers you can examine and test today, to get a better understanding of the technology, and how it might interact with your network. You can ask your vendors (or your favorite open source project) to support DOTS signaling in their software, or you can join with others in helping to develop open source DOTS clients.

You can also read the drafts—

Use cases for DDoS Open Threat Signaling
Distributed Denial of Service (DDoS) Open Threat Signaling Requirements
Distributed-Denial-of-Service Open Threat Signaling (DOTS) Architecture
Distributed Denial-of-Service Open Threat Signaling (DOTS) Signal Channel

Each of these drafts can use readers and suggestions in specific areas, so you can join the DOTS mailing list and participate in the discussion. You can keep up with the DOTS WG page at the IETF to see when new drafts are published, and make suggestions on those, as well.

DOTS is a great idea; it is time for the Internet to have a standardized signaling channel for spotting and stopping DDoS attacks.

Don’t Leave Features Lying Around

Many years ago, when multicast was still a “thing” everyone expected to spread throughout the Internet itself, a lot of work went into specifying not only IP multicast control planes, but also IP multicast control planes for interdomain use (between autonomous systems). BGP was modified to support IP multicast, for instance, in order to connect IP multicast groups from sender to receiver across the entire ‘net. One of these various efforts was a protocol called the Distance Vector Multicast Routing Protocol, or DVMRP. The general idea behind DVMRP was to extend many of the already well-known mechanisms for signaling IP multicast with interdomain counterparts. Specifically, this meant extending IGMP to operate across provider networks, rather than within a single network.

As you can imagine, one problem with any sort of interdomain effort is troubleshooting—how will an operator be able to troubleshoot problems with interdomain IGMP messages sourced from outside their network? There is no way to log into another provider’s network (some silliness around competition, I would imagine), so something else was needed. Hence the idea of being able to query a router for information about its connected interfaces, multicast neighbors, and other information, was written up in draft-ietf-idmr-dvmrp-v3-11 (which expired in 2000). Included in this draft are two extensions to IGMP: Ask Neighbors2 and Neighbors2. If an operator wanted to know about a particular router that seemed to be causing problems for a particular multicast traffic flow, they could ask some local router to send the remote router an Ask Neighbors2 message. The receiving router would respond with a unicast Neighbors2 message, providing details about the local configuration of interfaces, multicast neighbors, and other odds and ends.

If this is starting to sound like a bad idea, that’s because it probably is a bad idea… But many vendors implemented it anyway (probably because there were fat checks associated with implementing the feature, the main reason vendors implement most things). Now some more recent security researchers enter into the picture, and start asking questions like, “I wonder if this functionality can be used to build a DDoS attack.” As it turns out, it can.

Team Cymru set about scanning the entire IPv4 address space to discover any routers “out there” that might happen to support Ask Neighbor2, and to figure out what the response packets would look like. The key point is, of course, that the source address of the Ask Neighbor2 packet can be forged, so you can send a lot of routers Ask Neighbor2 packets, and—by using the source address of some device you would like to attack—have all of the routers known to respond send back Neighbor2 messages. The key questions were, then, how many responders would there be, and how large the replies would be.

The good news is they only found around 305,000 responders to the Ask Neighbor2 request. Those responders, however, transmitted some 263 million packets, most of which were much larger than the original query. This could, therefore, actually be a solid base for a nice DDoS attack. Cisco and Juniper issued alerts, and started working to remove this code from future releases of their operating systems.

One interesting result of this test was that the Cisco implementation of the Neighbor2 response actually contained the IOS Software version number, rather than the IGMP version number, or some other version number. The test revealed that some 1.3% of the responding Cisco routers are running IOS version 10.0, which hasn’t been supported in 20 years. 73.7% of the Cisco routers that responded are running IOS 12.x, which hasn’t been supported in 4 years.

There are a number of lessons here, including—

  • Protocols designed to aid troubleshooting can often easily be turned into an attack surface
  • Old versions of code might well be vulnerable to things you don’t know about, would never have looked for, and would likely never even have thought to look for
  • Large feature sets often also have large attack surfaces; it is almost impossible to actually know about, or even think through, where every possible attack surface might be in tens of millions of lines of code

It is the last lesson I think network engineers need to take to heart. The main thing network engineers seem to do all day is chase new features. Maybe we need an attitude adjustment in this—even new features are a tradeoff. This is one of those points that make disaggregation so very interesting in large scale networks.

At some point, it’s not adding features that is interesting. It’s removing them.

Blocking a DDoS Upstream

In the first post on DDoS, I considered some mechanisms to disperse an attack across multiple edges (I actually plan to return to this topic with further thoughts in a future post). The second post considered some of the ways you can scrub DDoS traffic. This post is going to complete the basic lineup of reacting to DDoS attacks by considering how to block an attack before it hits your network—upstream.

The key technology in play here is flowspec, a mechanism that can be used to carry packet level filter rules in BGP. The general idea is this—you send a set of specially formatted communities to your provider, who then automagically uses those communities to create filters at the inbound side of your link to the ‘net. There are two parts to the flowspec encoding, as outlined in RFC5575bis, the match rule and the action rule. The match rule is encoded as shown below—

There are a wide range of conditions you can match on. The source and destination addresses are pretty straightforward. For the IP protocol and port numbers, the operator sub-TLVs allow you to specify a set of conditions to match on, and whether to AND the conditions (all conditions must match) or OR the conditions (any condition in the list may match). Ranges of ports, greater than, less than, greater than or equal to, less than or equal to, and equal to are all supported. Fragments, TCP header flags, and a number of other header fields can be matched on, as well.
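
To illustrate how the AND/OR operator logic plays out, here is a minimal Python sketch; it models only the matching logic, not the RFC 5575bis wire encoding, and the prefixes, protocols, and ports are made-up examples.

```python
import ipaddress

# Illustrative only: this models the *logic* of a flowspec match rule
# (prefix match plus AND/OR'd numeric conditions), not the wire encoding.
def match_ports(port, conditions, mode="or"):
    """conditions is a list of (operator, value) tuples, e.g. (">=", 1024)."""
    ops = {
        "==": lambda p, v: p == v,
        ">":  lambda p, v: p > v,
        ">=": lambda p, v: p >= v,
        "<":  lambda p, v: p < v,
        "<=": lambda p, v: p <= v,
    }
    results = [ops[op](port, value) for op, value in conditions]
    return all(results) if mode == "and" else any(results)

def match_packet(packet, rule):
    dst_ok = ipaddress.ip_address(packet["dst"]) in ipaddress.ip_network(rule["dst_prefix"])
    proto_ok = packet["proto"] in rule["protocols"]
    port_ok = match_ports(packet["dst_port"], rule["port_conditions"], rule["port_mode"])
    return dst_ok and proto_ok and port_ok

# Match UDP traffic to 192.0.2.0/24 with a destination port at or below 1023,
# OR exactly 11211 (the kind of rule you might push during an amplification attack).
rule = {
    "dst_prefix": "192.0.2.0/24",
    "protocols": [17],
    "port_conditions": [("<=", 1023), ("==", 11211)],
    "port_mode": "or",
}

packet = {"dst": "192.0.2.10", "proto": 17, "dst_port": 11211}
print(match_packet(packet, rule))  # True -> the action (rate limit, drop, etc.) applies
```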

Once the traffic is matched, what do you do with it? There are a number of actions available, including—

  • Controlling the traffic rate in either bytes per second or packets per second
  • Redirecting the traffic to a VRF
  • Marking the traffic with a particular DSCP value
  • Filtering the traffic

If you think this must be complicated to encode, you are right. That’s why most implementations allow you to set pretty simple rules, and handle all the encoding bits for you. Given flowspec encoding, you should just be able to detect the attack, set some simple rules in BGP, send the right “stuff” to your provider, and watch the DDoS go away. …right… If you have been in network engineering for longer than “I started yesterday,” you should know by now that nothing is ever that simple.

If you don’t see a tradeoff, you haven’t looked hard enough.

First, from a provider’s perspective, flowspec is an entirely new attack surface. You cannot let your customer just send you whatever flowspec rules they like. For instance, what if your customer sends you a flowspec rule that blocks traffic to one of your DNS servers? Or, perhaps, to one of their competitors? Or even to their own BGP session? Most providers, to prevent these types of problems, will only apply any flowspec initiated rules to the port that connects to your network directly. This protects the link between your network and the provider, but there is little way to prevent abuse if the provider allows these flowspec rules to be implemented deeper in their network.

Second, filtering costs money. This might not be obvious at a single link scale, but when you start considering how to filter multiple gigabits of traffic based on deep packet inspection sorts of rules—particularly given the ability to combine a number of rules in a single flowspec filter rule—filtering requires a lot of resources during the actual packet switching process. There is a limited number of such resources on any given packet processing engine (ASIC), and a lot of customers who are likely going to want to filter. Since filtering costs the provider money, they are most likely going to charge for flowspec, limit which customers can send them flowspec rules (generally grounded in the provider’s perception of the customer’s cluefulness), and even limit the number of flowspec rules that can be implemented at any given time.

There is plenty of further reading out there on configuring and using flowspec, and it is likely you will see changes in the way flowspec is encoded in the future. Some great places to start are—

One final thought as I finish this post off. You should not just rely on technical tools to block a DDoS attack upstream. If you can figure out where the DDoS is coming from, or track it down to a small set of source autonomous systems, you should find some way to contact the operator of the AS and let them know about the DDoS attack. This is something Mara and I will be covering in an upcoming webinar over at ipspace.net—watch for more information on this as we move through the summer.

Mitigating DDoS

Your first line of defense to any DDoS, at least on the network side, should be to disperse the traffic across as many resources as you can. Basic math implies that if you have fifteen entry points, and each entry point is capable of supporting 10g of traffic, then you should be able to simply absorb a 100g DDoS attack while still leaving 50g of overhead for real traffic (assuming perfect efficiency, of course—YMMV). Dispersing a DDoS in this way may impact performance—but taking bandwidth and resources down is almost always the wrong way to react to a DDoS attack.
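
For reference, the arithmetic in the paragraph above works out like this (a trivial sketch using the same assumed numbers):

```python
# The same assumed numbers as the paragraph above: fifteen 10G entry points
# give 150G of aggregate edge capacity, so a 100G attack, spread perfectly
# evenly, leaves roughly 50G for legitimate traffic.
entry_points = 15
capacity_per_entry_gbps = 10
attack_gbps = 100

total_capacity_gbps = entry_points * capacity_per_entry_gbps
print("attack load per entry point:", round(attack_gbps / entry_points, 2), "Gbps")
print("headroom left for real traffic:", total_capacity_gbps - attack_gbps, "Gbps")
```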

But what if you cannot, for some reason, disperse the attack? Maybe you only have two edge connections, or the size of the DDoS is larger than all of your edge bandwidth combined. It is typically difficult to mitigate a DDoS attack, but there is an escalating chain of actions you can take that often prove useful. Let’s deal with local mitigation techniques first, and then consider some fancier methods.

  • TCP SYN filtering: A lot of DDoS attacks rely on exhausting TCP open resources. If all inbound TCP sessions can be terminated in a proxy (such as a load balancer), the proxy server may be able to screen out half open and poorly formed TCP open requests. Some routers can also be configured to hold TCP SYNs for some period of time, rather than forwarding them on to the destination host, in order to block half open connections. This type of protection can be put in place long before a DDoS attack occurs.
  • Limiting Connections: It is likely that DDoS sessions will be short lived, while legitimate sessions will be longer lived. The difference may be a matter of seconds, or even milliseconds, but it is often enough to be a detectable difference. It might make sense, then, to prefer existing connections over new ones when resources start to run low. Legitimate users may wait longer to connect when connections are limited, but once they are connected, they are more likely to remain connected. Application design is important here, as well.
  • Aggressive Aging: In cache based systems, one way to free up depleted resources quickly is to simply age them out faster. The length of time a connection can be held open can often be dynamically adjusted in applications and hosts, allowing connection information to be removed from memory faster when there are fewer connection slots available (a minimal sketch of this idea appears after this list). Again, this might impact live customer traffic, but it is still a useful technique when in the midst of an actual attack.
  • Blocking Bogon Sources: While there is a well known list of bogon addresses—address blocks that should never be routed on the global ‘net—these lists should be taken as a starting point, rather than as an ending point. Constant monitoring of traffic patterns on your edge can give you a lot of insight into what is “normal” and what is not. For instance, if your highest rate of traffic normally comes from South America, and you suddenly see a lot of traffic coming from Australia, either you’ve gone viral, or this is the source of the DDoS attack. It isn’t always useful to block all traffic from a region, or a set of source addresses, but it is often useful to use the techniques listed above more heavily on traffic that doesn’t appear to be “normal.”
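
As promised in the aggressive aging bullet above, here is a minimal Python sketch of a connection table whose idle timeout shrinks as the table fills. The thresholds and structure are made up for illustration; they are not taken from any particular product.

```python
import time

# Illustrative only: a toy connection table that ages idle entries out faster
# as it fills up, freeing resources more aggressively while under pressure
# (for example, during a DDoS). All numbers are arbitrary assumptions.
class ConnectionTable:
    def __init__(self, max_entries=10000, normal_ttl=300, pressure_ttl=10):
        self.max_entries = max_entries
        self.normal_ttl = normal_ttl      # idle timeout when the table is mostly empty
        self.pressure_ttl = pressure_ttl  # idle timeout when the table is nearly full
        self.entries = {}                 # (src, dst, sport, dport) -> last-seen timestamp

    def current_ttl(self):
        # Linearly shrink the idle timeout as utilization climbs.
        utilization = len(self.entries) / self.max_entries
        return self.normal_ttl - (self.normal_ttl - self.pressure_ttl) * utilization

    def touch(self, flow):
        self.expire()
        self.entries[flow] = time.monotonic()

    def expire(self):
        cutoff = time.monotonic() - self.current_ttl()
        for flow in [f for f, seen in self.entries.items() if seen < cutoff]:
            del self.entries[flow]

table = ConnectionTable()
table.touch(("198.51.100.7", "192.0.2.10", 40001, 443))
print(len(table.entries), "active flows; current idle timeout:", table.current_ttl(), "seconds")
```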

There are, of course, other techniques you can deploy against DDoS attacks—but at some point, you are just not going to have the expertise or time to implement every possible counter. This is where appliance and service (cloud) based services come into play. There are a number of appliance based solutions out there to scrub traffic coming across your links, such as those made by Arbor. The main drawback to these solutions is they scrub the traffic after it has passed over the link into your network. This problem can often be resolved by placing the appliance in a colocation facility and directing your traffic through the colo before it reaches your inbound network link.

There is one open source DDoS scrubbing option in this realm, as well, which uses a combination of FastNetMon, InfluxDB, Grafana, Redis, Morgoth, and Bird to create a solution you can run locally on a spun-up VM, or even bare metal on a self built appliance wired in between your edge router and the rest of the network (in the DMZ). This option is well worth looking at, if not to deploy, then to better understand how the kind of dynamic filtering performed by commercially available appliances works.

If the DDoS must be stopped before it reaches your edge link, and you simply cannot handle the volume of the attacks, then the best solution might be a cloud based filtering solution. These tend to be expensive, and they also tend to increase latency for your “normal” user traffic in some way. The way these normally work is the DDoS provider advertises your routes, or redirects your DNS address to their servers. This draws all your inbound traffic into their network, where it is scrubbed using advanced techniques. Once the traffic is scrubbed, it is either tunneled or routed back to your network (depending on how it was captured in the first place). Most large providers offer scrubbing services, and there are several public offerings available independent of any upstream you might choose (such as Verisign’s line of services).

A front line defense against DDoS is to place your DNS name, and potentially your entire site, behind a DDoS detection and mitigation DNS service and/or content distribution network. For instance, CloudFlare is a widely used service that not only proxies and caches your web site, it also protects you against DDoS attacks.

Dispersing a DDoS: Initial thoughts on DDoS protection

Distributed Denial of Service is a big deal—huge pools of Internet of Things (IoT) devices, such as security cameras, are being compromised, pulled into botnets, and used for large scale DDoS attacks. What are the tools in hand to fend these attacks off? The first misconception is that you can actually fend off a DDoS attack. There is no magical tool you can deploy that will allow you to go to sleep every night thinking, “tonight my network will not be impacted by a DDoS attack.” There are tools and services that deploy various mechanisms that will do the engineering and work for you, but there is no complete solution for DDoS attacks.

One such reaction tool is spreading the attack. In the network below, the network under attack has six entry points.

Assume the attacker has IoT devices scattered throughout AS65002 which they are using to launch an attack. Due to policies within AS65002, the DDoS attack streams are being forwarded into AS65001, and thence to A and B. It would be easy to shut these two links down, forcing the traffic to disperse across five entries rather than two (B, C, D, E, and F). By splitting the traffic among five entry points, it may be possible to simply eat the traffic—each flow is now less than one half the size of the original DDoS attack, perhaps small enough for the servers at these entry points to discard the DDoS traffic.

However—this kind of response plays into the attacker’s hand, as well. Now any customer directly attached to AS65001, such as G, will need to pass through AS65002, from whence the attacker has launched the DDoS, and enter into the same five entry points. How happy do you think the customer at G would be in this situation? Probably not very…

Is there another option? Instead of shutting down these two links, it would make more sense to try to reduce the volume of traffic coming through the links and leave them up. To put it more succinctly—if the DDoS attack is reducing the total amount of available bandwidth you have at the edge of your network, it does not make a lot of sense to reduce the available amount of bandwidth at your edge in response. What you want to do, instead, is reapportion the traffic coming in to each edge so you have a better chance of allowing the existing servers to simply discard the DDoS attack.

One possible solution is to prepend the AS path of the anycast address being advertised from one of the service instances. Here, you could add one prepend to the route advertisement from C, and check to see if the attack traffic is spread more evenly across the three sites. As we’ve seen in other posts, however, this isn’t always an effective solution (see these three posts). Of course, if this is an anycast service, we can’t really break up the address space into smaller bits. So what else can be done?
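
To see why a prepend can shift inbound traffic, here is a toy Python sketch; it ignores local preference and the other BGP tie-breakers, and the site labels and AS numbers in the paths are illustrative assumptions.

```python
# Purely illustrative: all else being equal, BGP prefers the shortest AS path,
# so adding a prepend at one site nudges remote networks toward the others.
# Site labels and AS path contents are assumptions for the sake of the example.
advertisements = {
    "site C": ["65010", "65010"],   # one prepend added at C
    "site D": ["65010"],
    "site E": ["65010"],
}

def preferred_site(paths):
    # Shortest AS path wins (ignoring local preference, MED, and the other
    # tie-breakers a real BGP implementation would consider first).
    return min(paths, key=lambda site: len(paths[site]))

print(preferred_site(advertisements))  # remote networks now prefer D (or E) over C
```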

There is a way to do this with BGP—using communities to restrict the scope of the routes being advertised by A and B. For instance, you could begin by advertising the routes to the destinations under attack towards AS65001 with the NO_PEER community. Given that AS65002 is a transit AS (assume it is for this exercise), AS65001 would accept the routes from A and B, but would not advertise them towards AS65002. This means G would still be able to reach the destinations behind A and B through AS65001, but the attack traffic would still be dispersed across five entry points, rather than two. There are other mechanisms you could use here; specifically, some providers allow you to set a community that tells them not to advertise a route towards a specific AS, whether that AS is a peer or a customer. You should consult with your provider about this, as every provider uses a different set of communities, formatted in slightly different ways—your provider will probably point you to a web page explaining their formatting.
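
Here is a rough Python sketch of the export decision such a community drives; the community names are stand-ins for the actual values, and real provider policies differ in the details.

```python
# Purely illustrative: a toy model of a provider export policy honoring
# scoping communities on a customer's route. The names stand in for real
# community values; every provider formats and interprets these differently.
NO_PEER = "no-peer"            # do not advertise toward peers
NO_ADVERTISE = "no-advertise"  # do not advertise to anyone

def should_advertise(route_communities, neighbor_type):
    """neighbor_type is 'customer', 'peer', or 'upstream'."""
    if NO_ADVERTISE in route_communities:
        return False
    if NO_PEER in route_communities and neighbor_type == "peer":
        return False
    return True

# The destinations under attack are advertised with NO_PEER attached: a
# customer (such as G) keeps reachability, while the advertisement toward
# the peering AS is suppressed.
communities = {NO_PEER}
for neighbor, kind in [("customer G", "customer"), ("peering AS", "peer")]:
    action = "advertise" if should_advertise(communities, kind) else "suppress"
    print(neighbor, "->", action)
```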

If NO_PEER does not work, it is possible to use NO_ADVERTISE, which blocks the advertisement of the destinations under attack to any of AS65001’s connections of whatever kind. G may well still be able to use the connections to A and B from AS65001 if it is using a default route to reach the Internet at large.

It is, of course, possible to automate this reaction through a set of scripts—but as always, it is important to keep a short leash on such scripts. Humans need to be alerted to either make the decision to use these communities, or to continue using these communities; it is too easy for a false positive to lead to a real problem.

Of course, this sort of response is also not possible for networks with just one or two connection points to the Internet.

But in all cases, remember that shutting down links in the face of a DDoS is rarely ever a real solution. You do not want to be reducing your available bandwidth when you are under an attack specifically designed to exhaust available bandwidth (or other resources). Rather, if you can, find a way to disperse the attack.

P.S. Yes, I have covered this material before—but I decided to rebuild this post with more in depth information, and to use to kick off a small series on DDoS protection.

DNS Cookies and DDoS Attacks

DDoS attacks, particularly for ransom (essentially, “give me some bitcoin, or we’ll attack your server(s) and bring you down”), seem to be on the rise. While ransom attacks rarely actually materialize, the threat of DDoS overall is very large, and very large scale. Financial institutions, content providers, and others regularly consume tens of gigabits of attack traffic in the normal course of operation. What can be done about stopping, or at least slowing down, these attacks?

To answer this question, we need to start with a better idea of some of the common mechanisms used to build a DDoS attack. It’s often not effective to simply take over a bunch of computers and send traffic from them at full speed; the users, and the users’ providers, will often notice a machine sending large amounts of traffic in this way. Instead, what the attacker needs is some sort of public server that can (and will) act as an amplifier. Sending a small stream of traffic to this intermediate server should cause the server to send an order of magnitude more traffic towards the attack target. Ideally, the server, or set of servers, will have almost unlimited bandwidth, and bandwidth utilization characteristics that will make the attack appear as close to “normal operation” as possible.

It is at this point that DNS servers often enter the picture. There are thousands (if not tens of thousands) of DNS servers out there, they all have public facing DNS services ripe for exploitation, and they tend to be connected to large/fat pipes. What the attacker can do is send a steady stream of DNS queries that appear to be originating at the target towards the DNS server. The figure below illustrates the concept.

[Figure: dns-cookies-01]

The attacker will carefully examine the DNS table, of course, choosing a large record—the largest TXT record possible is ideal—and send a request for this item, or a series of similar large items, asking the DNS server to send the response to the target. The DNS server, having no idea where the query actually originates, will normally reply with the information requested.

But one DNS server isn’t going to be enough. As shown in the illustration, the attacker can actually send queries to hundreds or thousands of DNS servers; an investment of ten to fifteen packets per second in queries can generate tens of thousands of replies each second in return. To ramp the attack up, the attacker can install software (probably through a botnet) that can use hundreds or thousands of hosts to generate packets towards thousands of DNS servers—this kind of amplification can easily overwhelm even the largest edge circuits available.
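
To put rough numbers on this asymmetry (all of the figures below are assumptions for illustration, not measurements), the arithmetic looks something like this:

```python
# Rough, assumed numbers for illustration only: a ~60-byte DNS query that
# elicits a ~3000-byte response gives roughly a 50x bandwidth amplification.
query_size = 60          # bytes per spoofed query (assumption)
response_size = 3000     # bytes per large TXT/ANY response (assumption)
queries_per_second = 15  # queries sent toward each server with the target's address
servers = 1000           # open resolvers abused as reflectors (assumption)

attacker_bps = query_size * queries_per_second * servers * 8
victim_bps = response_size * queries_per_second * servers * 8

print(f"attacker spends  {attacker_bps / 1e6:.1f} Mbps")
print(f"victim receives  {victim_bps / 1e6:.1f} Mbps")
print(f"amplification factor ~{response_size / query_size:.0f}x")
```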

How can this sort of attack be countered? The key point is to remove amplification wherever possible—in this case, DNS servers seem like a likely point at which the amplification attack can be cut down to size. But how? RFC7873, DNS Cookies, published in May of 2016, provides a weak authentication mechanism to reduce the effectiveness of DNS as an amplification platform.

Essentially, this RFC specifies that each DNS query should include a cookie, or what might be called a nonce. This cookie is calculated using a consistent algorithm (a hash) computed based on several items, such as the IP address of the client (sender), the IP address of the server, etc. The idea is this: the first time a host queries a DNS server, it will send a client cookie. The server can respond with an error, simply ignore the request, or respond with a DNS packet containing the server’s cookie. The rate at which the server hands the cookies out can be rather limited; assuming clients can cache the server’s cookie as well as previous DNS records retrieved from the server, the additional rate limiting should have very little impact on the host’s operation. The next time the client sends a request, it will include this server cookie. The server, on seeing a valid cookie, will reply to the request as normal.
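
As a rough sketch of the idea (this follows the general shape of the mechanism rather than the exact algorithm in RFC 7873; the hash choice and inputs are assumptions), the server-side check might look like this:

```python
import hashlib
import os

# Illustrative only: the general idea behind DNS cookies. The server derives
# its cookie from the client's cookie, the client's IP address, and a secret
# only the server knows; a forged source address cannot produce a valid cookie.
# The hash and field layout here are assumptions, not the exact RFC 7873 scheme.
SERVER_SECRET = os.urandom(16)

def server_cookie(client_cookie: bytes, client_ip: str) -> bytes:
    h = hashlib.blake2b(digest_size=8, key=SERVER_SECRET)
    h.update(client_cookie)
    h.update(client_ip.encode())
    return h.digest()

def valid_request(client_cookie: bytes, presented_cookie: bytes, client_ip: str) -> bool:
    return presented_cookie == server_cookie(client_cookie, client_ip)

# A legitimate client learns its server cookie once, then includes it in queries:
client_cookie = os.urandom(8)
cookie = server_cookie(client_cookie, "203.0.113.9")
print(valid_request(client_cookie, cookie, "203.0.113.9"))  # True: answer normally
print(valid_request(client_cookie, cookie, "192.0.2.66"))   # False: spoofed source, send a short error
```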

The figure below illustrates.

[Figure: dns-cookies-02]

This process accomplishes two things:

  • Rate limiting responses to “cookie requests” effectively caps the speed at which any given client can send packets with a forged source address to the server and expect a reply. No matter how fast the attacker sends packets, it will only infrequently receive the cookie it needs to send a valid request.
  • Given the attacker will not be able to successfully forge a cookie for the target, the only thing the attacker can force the DNS server to send to the target is a (purposefully) short error message. These error messages will have nowhere near the impact of the carefully chosen large TXT (and other) records.

This is, as the draft says, a “weak protection,” but it is enough to remove DNS servers from the realm of “easy targets” for amplification duty in DDoS attacks.