Hedge 141: Improving WAN Router Performance
Wide area networks in large-scale cores tend to be performance choke-points—partially because of differentials between the traffic they’re receiving from data center fabrics, campuses, and other sources, and the availability of outbound bandwidth, and partially because these routers tend to be a focal point for policy implementation. Rachee Singh joins Tom Ammon, Jeff Tantsura, and Russ White to discuss “Shoofly, a tool for provisioning wide-area backbones that bypasses routers by keeping traffic in the optical domain for as long as possible.”
Hedge 121: Computing in the Network with Marie-Jose Montpetit
Can computation be drawn into the network, rather than always being pushed to the edge of the network? Taking content distribution networks as a starting point, the COIN research group is looking at ways to make networks more content and computationally aware, bringing compute into the network itself. Join Alvaro Retana, Marie-Jose Montpetit, and Russ White, as we discuss the ongoing research around computing in the network.
Hedge 109: Edward Lewis and the DNS Core
What is the “core” of the DNS system, and how has it changed across the years? Edward Lewis joins Tom Ammon and Russ White to discuss his research into what the “core” of the domain name system is and how it has changed—including the rise of the large cloud players to the core of the default free zone.
Strong Reactions and Complexity
In the realm of network design—especially in the realm of security—we often react so strongly against a perceived threat, or so quickly to solve a perceived problem, that we fail to look for the tradeoffs. If you haven’t found the tradeoffs, you haven’t looked hard enough—or, as Dr. Little says, you have to ask what is gained and what is lost, rather than just what is gained. This failure to look at both sides often results in untold amounts of technical debt and complexity being dumped into network designs (and application implementations), causing outages and failures long after these decisions are made.
A 2018 paper on DDoS attacks, A First Joint Look at DoS Attacks and BGP Blackholing in the Wild, provides a good example of a response causing more damage than the attack itself. Most networks are configured to allow the operator to quickly configure a remote triggered black hole (RTBH) using BGP. Most often, a community is attached to a BGP route that points the next-hop to a local discard route on each eBGP speaker. If used on the route advertising the destination of the attack—the service under attack—the result is the DDoS attack traffic no longer has a destination to flow to. If used on the route advertising the source of the DDoS attack traffic, the result is the DDoS traffic will not pass any reverse-path forwarding policies at the edge of the AS, and hence will be dropped. Since most DDoS attacks are reflected, blocking the source traffic still prevents access to some service, generally DNS or something similar.
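The community-driven next-hop rewrite described above can be sketched in a few lines. This is a hedged illustration of the policy logic only, not any vendor's configuration; the discard next-hop address is invented, though 65535:666 is the well-known BLACKHOLE community (RFC 7999).

```python
# A hedged sketch of RTBH policy logic, not any vendor's BGP
# implementation. 65535:666 is the well-known BLACKHOLE community
# (RFC 7999); the discard next-hop address here is invented.
RTBH_COMMUNITY = "65535:666"
DISCARD_NEXT_HOP = "192.0.2.1"  # resolves to a local discard (null) route

def apply_rtbh_policy(route):
    """Rewrite the next hop toward discard when the RTBH community is
    attached, so every eBGP speaker drops traffic for this prefix."""
    if RTBH_COMMUNITY in route.get("communities", []):
        route = dict(route, next_hop=DISCARD_NEXT_HOP)
    return route

attacked = {"prefix": "198.51.100.0/24", "next_hop": "203.0.113.7",
            "communities": ["65535:666"]}
print(apply_rtbh_policy(attacked)["next_hop"])  # 192.0.2.1
```

The same rewrite applied to the route for the attack's source, rather than its destination, is what makes the reverse-path-forwarding drop described above work.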
In either case, then, stopping the DDoS through an RTBH causes damage to services rather than just the attacker. Because of this, remote triggered black holes should really only be used in the most extreme cases, where no other DDoS mitigation strategy will work.
The authors of the Joint Look use publicly available information to determine the answers to several questions. First, what scale of DDoS attacks are RTBHs used against? Second, how long after an attack begins is the RTBH triggered? Third, for how long is the RTBH left in place after the attack has been mitigated?
The answer to the first question should be—the RTBH is only used against the largest-scale attacks. The answer to the second question should be—the RTBH should be put in place very quickly after the attack is detected. The answer to the third question should be—the RTBH should be taken down as soon as the attack has stopped. The researchers found that RTBHs were most often used to mitigate the smallest of DDoS attacks, and almost never to mitigate larger ones. The authors also found that RTBHs were often left in place for hours after a DDoS attack had been mitigated. Both of these imply that current use of RTBH to mitigate DDoS attacks is counterproductive.
How many more design patterns do we follow that are simply counterproductive in the same way? This is not a matter of “following the data,” but rather one of really thinking through what it is you are trying to accomplish, and then how to accomplish that goal with the simplest set of tools available. Think through what it would mean to remove what you have put in, whether you really need to add another layer or protocol, how to minimize configuration, etc.
If you want your network to be less complex, examine the tradeoffs realistically.
Reducing RPKI Single Point of Takedown Risk
The RPKI, for those who do not know, ties the origin AS to a prefix using a certificate (the Route Origin Authorization, or ROA) signed by a third party. The third party, in this case, is validating that the AS in the ROA is authorized to advertise the destination prefix in the ROA—if ROAs were self-signed, the security would be no better than simply advertising the prefix in BGP. Who should be able to sign these ROAs? The assigning authority makes the most sense—the Regional Internet Registries (RIRs), since they (should) know which company owns which set of AS numbers and prefixes.
The general idea makes sense—you should not accept routes from “just anyone,” as they might be advertising the route for any number of reasons. An operator could advertise routes to source spam or phishing emails, or some government agency might advertise a route to redirect traffic, or block access to some web site. But … if you haven’t found the tradeoffs, you haven’t looked hard enough. Security, in particular, is replete with tradeoffs.
Every time you deploy some new security mechanism, you create some new attack surface—sometimes more than one. Deploy a stateful packet filter to protect a server, and the device itself becomes a target of attack, including buffer overflows, phishing attacks to gain access to the device as a launch-point into the private network, and the holes you have to punch in the filters to allow services to work. What about the RPKI?
When the RPKI was first proposed, one of my various concerns was the creation of new attack surfaces. One specific attack surface is the control a single organization—the issuing RIR—has over the very existence of the operator. Suppose you start a new content provider. To get the new service up and running, you sign a contract with an RIR for some address space, sign a contract with some upstream provider (or providers), set up your servers and service, and start advertising routes. For whatever reason, your service goes viral, netting millions of users in a short span of time.
Now assume the RIR receives a complaint against your service for whatever reason—the reason for the complaint is not important. This places the RIR in the position of a prosecutor, defense attorney, and judge—the RIR must somehow figure out whether or not the charges are true, figure out whether or not taking action on the charges is warranted, and then take the action they’ve settled on.
In the case of a government agency (or a large criminal organization) making the complaint, there is probably going to be little the RIR can do other than simply revoke your certificate, pulling your service off-line.
Overnight your business is gone. You can drag the case through the court system, of course, but this can take years. In the meantime, you are losing users, other services are imitating what you built, and you have no money to pay the legal fees.
A true story—without the names. I once knew a man who worked for a satellite provider, let’s call them SATA. Now, SATA’s leadership decided they had no expertise in accounts receivables, and they were spending too much time on trying to collect overdue bills, so they outsourced the process. SATB, a competing service, decided to buy the firm SATA outsourced their accounts receivables to. You can imagine what happens next… The accounting firm worked as hard as it could to reduce the revenue SATA was receiving.
Of course, SATA sued the accounting firm, but before the case could make it to court, SATA ran out of money, laid off all their people, and shut their service down. SATA essentially went out of business. They won some money later, in court, but … whatever money they won was just given to the investors of various kinds to make up for losses. The business itself was gone, permanently.
Herein lies the danger of giving a single entity like an RIR, even if they are friendly, honest, etc., control over a critical resource.
A recent paper presented at the ANRW at APNIC caught my attention as a potential way to solve this problem. The idea is simple—just allow (or even require) multiple signatures on a ROA. To be more accurate, each authorizing party issues a “partial certificate;” if “enough” pieces of the certificate are found and valid, the route will be validated.
The question is—how many signatures (or parts of the signature, or partial attestations) should be enough? The authors of the paper suggest there should be a “Threshold Signature Module” that makes this decision. The attestations of the various signers are combined in the threshold module to produce a single signature that is then used to validate the route. This way the validation process on the router remains the same, which means the only real change in the overall RPKI system is the addition of the threshold module.
If one RIR—even the one that allocated the addresses you are using—revokes their attestation on your ROA, the remaining attestations should be enough to convince anyone receiving your route that it is still valid. Since there are five regions, you have at least five different choices to countersign your ROA. Each RIR is under the control of a different national government; hence organizations like governments (or criminals!) would need to work across multiple RIRs and through other government organizations to have a ROA completely revoked.
An alternate solution here, one that follows the PGP model, might be to simply have the threshold signature module consider the number and source of attestations using the existing ROA model. Local policy could determine how to weight attestations from different RIRs, etc.
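The threshold idea reduces to a simple counting check: accept a ROA when enough RIR attestations are present and valid. The sketch below is illustrative only; the five RIR names are real, but the threshold value and the plain set-intersection test are assumptions standing in for the paper's cryptographic threshold-signature construction.

```python
# Illustrative counting model of shared ROA attestation. The RIR names
# are real; the threshold value and the set-intersection check are
# assumptions standing in for the actual threshold-signature scheme.
RIRS = {"ARIN", "RIPE NCC", "APNIC", "LACNIC", "AFRINIC"}

def roa_valid(attestations, threshold=2):
    """Accept the ROA when at least `threshold` recognized RIRs
    currently countersign it."""
    return len(attestations & RIRS) >= threshold

# The allocating RIR revoking its attestation does not, by itself,
# invalidate the ROA under a threshold of 2:
print(roa_valid({"RIPE NCC", "APNIC"}))  # True
print(roa_valid({"ARIN"}))               # False
```

Local policy weighting, as in the PGP-style variant, would simply replace the flat count with per-RIR weights.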
This multiple or “shared” attestation (or signature) idea seems like a neat way to work around one of (possibly the major) attack surfaces introduced by the RPKI system. If you are interested in Internet core routing security, you should take a read through the post linked above, and then watch the video.
Smart Network or Dumb?
Should the network be dumb or smart? Network vendors have recently focused on making the network as smart as possible because there is a definite feeling that dumb networks are quickly becoming a commodity—and it’s hard to see where and how steep profit margins can be maintained in a commodifying market. Software vendors, on the other hand, have been encroaching on the network space by “building in” overlay network capabilities, especially in virtualization products. VMWare and Docker come immediately to mind; both are either able to, or working towards, running on a plain IP fabric, reducing the number of services provided by the network to a minimum level (of course, I’d have a lot more confidence in these overlay systems if they were a lot smarter about routing … but I’ll leave that alone for the moment).
How can this question be answered? One way is to think through what sorts of things need to be done in processing packets, and then think through where it makes most sense to do those things. Another way is to measure the accuracy or speed at which some of these “packet processing things” can be done so you can decide in a more empirical way. The paper I’m looking at today, by Anirudh et al., takes both of these paths in order to create a baseline “rule of thumb” about where to place packet processing functionality in a network.
Sivaraman, Anirudh, Thomas Mason, Aurojit Panda, Ravi Netravali, and Sai Anirudh Kondaveeti. “Network Architecture in the Age of Programmability.” ACM SIGCOMM Computer Communication Review 50, no. 1 (March 23, 2020): 38–44. https://doi.org/10.1145/3390251.3390257.
The authors consider six different “things” networks need to be able to do: measurement, resource management, deep packet inspection, network security, network virtualization, and application acceleration. The first of these they measure by introducing errors into a network and measuring the dropped packet rate using various edge and in-network measurement tools. What they found was in-network measurement has a lower error rate, particularly as time scales become shorter. For instance, Pingmesh, a packet loss measurement tool that runs on hosts, is useful for measuring packet loss at the scale of minutes—but in-network telemetry can often measure packet loss at the scale of seconds or milliseconds. They observe that in-network telemetry of all kinds (not just packet loss) appears to be more accurate at the short timescales where application performance matters most—so they argue telemetry largely belongs in the network.
Resource management, such as determining which path to take, or how quickly to transmit packets (setting the window size for TCP or QUIC, for instance), is traditionally performed entirely on hosts. The authors, however, note that effective resource management requires accurate telemetry information about flows, link utilization, etc.—and these things are best performed in-network rather than on hosts. For resource management, then, they prefer a hybrid edge/in-network approach.
They argue deep packet inspection and network virtualization are both best done at the edge, in hosts, because these are processor intensive tasks—often requiring more processing power and time than network devices have available. Finally, they argue network security should be located on the host, because the host has the fine-grained service information required to perform accurate filtering, etc.
Based on their arguments, the authors propose four rules of thumb. First, tasks that leverage data only available at the edge should run at the edge. Second, tasks that leverage data naturally found in the network should be run in the network. Third, tasks that require large amounts of processing power or memory should be run on the edge. Fourth, tasks that run at very short timescales should be run in the network.
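The four rules can be roughly encoded as a placement function. Everything here is invented for illustration: the field names, the 100 ms cutoff, and the priority order among rules; the paper does not say how to resolve conflicts between the rules (say, a processing-heavy task operating at short timescales).

```python
# A rough, hypothetical encoding of the four rules of thumb. Field
# names, the 100 ms cutoff, and the rule priority order are all
# assumptions, not the paper's specification.
def place_task(task):
    """Return 'edge' or 'network' for a task described as a dict."""
    if task.get("needs_edge_only_data"):            # rule 1
        return "edge"
    if task.get("cpu_or_memory_heavy"):             # rule 3 (DPI, virtualization)
        return "edge"
    if task.get("timescale_ms", float("inf")) < 100:
        return "network"                            # rule 4 (short timescales)
    if task.get("needs_in_network_data"):           # rule 2 (link utilization)
        return "network"
    return "edge"

print(place_task({"cpu_or_memory_heavy": True}))  # edge
print(place_task({"timescale_ms": 5}))            # network
```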
I have, of course, some quibbles with their arguments … For instance, the argument that security should run on the edge, in hosts, assumes a somewhat binary view of security—all filters and security mechanisms should be “one place,” and nowhere else. A security posture that just moves “the firewall” from the edge of the network to the edge of the host, however, is going to (eventually) face the same vulnerabilities and issues, just spread out over a larger attack surface (every host instead of the entry to the network). Security shouldn’t work this way—the network and the host should work together to provide defense in depth.
The rules of thumb, however, seem to be pretty solid starting points for thinking about the problem. An alternate way of phrasing their result is through the principle of subsidiarity—decisions should be made as close as possible to the information required to make them. While this is really a concept that comes out of ethics and organizational management, it succinctly describes a good rule of thumb for network architecture.
The Network is not Free: The Case of the Connected Toaster
Latency is a big deal for many modern applications, particularly in the realm of machine learning applied to problems like determining if someone standing at your door is a delivery person or a … robber out to grab all your smart toasters and big screen television. The problem is networks, particularly in the last mile, don’t deal with latency very well. In fact, most of the network speeds and feeds available anywhere outside urban areas kind of stink. The example given by Bagchi et al. is this—
A fixed video sensor may generate 6Mbps of video 24/7, thus producing nearly 2TB of data per month—an amount unsustainable according to business practices for consumer connections; for example, Comcast’s data cap is at 1TB/month and Verizon Wireless throttles traffic over 26GB/month. For example, with DOCSIS 3.0, a widely deployed cable Internet technology, most U.S.-based cable systems deployed today support a maximum of 81Mbps aggregated over 500 homes—just 0.16Mbps per home.
Bagchi, Saurabh, Muhammad-Bilal Siddiqui, Paul Wood, and Heng Zhang. “Dependability in Edge Computing.” Communications of the ACM 63, no. 1 (December 2019): 58–66. https://doi.org/10.1145/3362068.
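The quoted numbers are easy to check with a little arithmetic (assuming a 30-day month):

```python
# Sanity-checking the quoted numbers, assuming a 30-day month.
bits_per_month = 6e6 * 86_400 * 30        # 6 Mbps, 24/7, for 30 days
tb_per_month = bits_per_month / 8 / 1e12  # bits -> terabytes
print(round(tb_per_month, 2))             # ~1.94, i.e. "nearly 2TB"

per_home_mbps = 81 / 500                  # 81 Mbps across 500 homes
print(round(per_home_mbps, 2))            # 0.16 Mbps per home
```

A single always-on camera, in other words, consumes roughly twice a typical consumer data cap on its own—the arithmetic holds up.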
The authors claim a lot of the problem here is just that edge networks have not been built out, but there is a reason these edge networks aren’t built out large enough to support pulling this kind of data load into a centrally located data center: the network isn’t free.
This is something so obvious to network engineers that it almost slips under our line of thinking unnoticed—except, of course, for the constant drive to make the network cost less money. For application developers, however, the network is just a virtual circuit data rides over… All the complexity of pulling fiber out to buildings or curbs, all the work of physically connecting things to the fiber, all the work of figuring out how to make routing scale, it’s all just abstracted away in a single QUIC or TCP session.
If you can’t bring the data to the compute, which is typically contained in some large-scale data center, then you have to bring the computing power to the data. The complexity of bringing the computing power to the data is that applications, especially modern micro-services based applications optimized for large-scale, low latency data center fabrics, just aren’t written to be broken into components and spread all over the world.
Let’s consider the case of the smart toaster—the case used in the paper in hand. Imagine a toaster with little cameras to sense the toastiness of the bread, electronically controlled heating elements, an electronically controlled toast lifter, and some sort of really nice “bread storage and moving” system that can pull bread out of a reservoir, load them into the toaster, and make it all work. Imagine being able to get up in the morning to a fresh cup of coffee and a nice bagel fresh and hot just as you hit the kitchen…
But now let’s look at the complexity required to do such a thing. We must have local processing power and storage, along with some communication protocol that periodically uploads and downloads data to improve the toasting process. You have to have some sort of handling system that can learn about new kinds of bread and adapt to them automatically—this is going to require data, as well. You have to have a bread reservoir that will keep the bread fresh for a few days so you don’t have to refill it constantly.
Will you save maybe five minutes every morning? Maybe.
Will you spend a lot of time getting this whole thing up and running? Definitely.
What will the MTBF be, precisely? What about the MTTR?
All to save five minutes in the morning? Of course the authors chose a trivial—perhaps even silly—example to use, just to illustrate the kinds of problems IoT devices combined with edge computing are going to encounter. But still … in five years you’re going to see advertisements for this smart toaster out there. There are toasters that already have a few of these features, and refrigerators that go far beyond this.
Sometimes we have to remember the cost of the network is telling us something—just because we can do a thing doesn’t mean we should. If the cost of the network forces us to consider the tradeoffs, that’s a good thing.
And remember that if your toaster makes your bread at the same time every morning, you have to adjust to the machine’s schedule, rather than the machine adjusting to yours…