Latency is a big deal for many modern applications, particularly in the realm of machine learning applied to problems like determining if someone standing at your door is a delivery person or a … robber out to grab all your smart toasters and big screen television. The problem is networks, particularly in the last mile don’t deal with latency very well. In fact, most of the network speeds and feeds available in anything outside urban areas kindof stinks.
I’s fnny, bt yu cn prbbly rd ths evn thgh evry wrd s mssng t lst ne lttr. This is because every effective language—or rather every communication system—carried enough information to reconstruct the original meaning even when bits are dropped. Over-the-wire protocols, like TCP, are no different—the protocol must carry enough information about the conversation (flow data) and the data being carried (metadata) to understand when something is wrong and error out or ask for a retransmission. These things, however, are a form of data exhaust; much like you can infer the tone, direction, and sometimes even the content of conversation just by watching the expressions, actions, and occasional word spoken by one of the participants, you can sometimes infer a lot about a conversation between two applications by looking at the amount and timing of data crossing the wire.
QUIC is a relatively new data transport protocol developed by Google, and currently in line to become the default transport for the upcoming HTTP standard. Because of this, it behooves every network engineer to understand a little about this protocol, how it operates, and what impact it will have on the network. We did record a History of Networking episode on QUIC, if you want some background.
In a recent Communications of the ACM article, a group of researchers (Kakhi et al.) used a modified implementation of QUIC to measure its performance under different network conditions, directly comparing it to TCPs performance under the same conditions. Since the current implementations of QUIC use the same congestion control as TCP—Cubic—the only differences in performance should be code tuning in estimating the round-trip timer (RTT) for congestion control, QUIC’s ability to form a session in a single RTT, and QUIC’s ability to carry multiple streams in a single connection. The researchers asked two questions in this paper: how does QUIC interact with TCP flows on the same network, and does UIC perform better than TCP in all situations, or only some?
When I think of complexity, I mostly consider transport protocols and control planes—probably because I have largely worked in these areas from the very beginning of my career in network engineering. Complexity, however, is present in every layer of the networking stack, all the way down to the physical. I recently ran across an interesting paper on complexity in another part of the network I had not really thought about before: the physical plant of a data center fabric.
The world of provider interconnection is a little … “mysterious” … even to those who work at transit providers. The decision of who to peer with, whether such peering should be paid, settlement-free, open, and where to peer is often cordoned off into a separate team (or set of teams) that don’t seem to leak a lot of information. A recent paper on current interconnection practices published in ACM SIGCOMM sheds some useful light into this corner of the Internet, and hence is useful for those just trying to understand how the Internet really works.
On this episode of the Hedge, Micah Beck joins us to discuss a paper he wrote recently considering a new model of compute, storage, and networking. Micah Beck is Associate Professor in computer science at the University of Tennessee, Knoxville, where he researches and publishes in the area of networking technologies, including the hourglass model and the end-to-end principle.
Will cyber-insurance exist as a “separate thing” in the future? The authors largely answer in the negative. The pressures of “race to the bottom,” providing maximal coverage with minimal costs (which they attribute to the structure of the cyber-insurance market), combined with lack of regulatory clarity and inaccurate measurements, will probably end up causing cyber-insurance to “fold into” other kinds of insurance.
How many 9’s is your network? How about your service provider’s? Now, to ask the not-so-obvious question—why do you care? Does the number of 9’s actually describe the reliability of the network? According to Jeffery Mogul and John Wilkes, nines are not enough. The question is—while this paper was written for commercial relationships and cloud providers, is it something you can apply to running your own network? Let’s dive into the meat of the paper and find out.
While 5 9’s is normally given as a form of Service Level Agreement (SLA), there are two other measures of reliability a network operator needs to consider—the Service Level Objective (SLO), and the Service Level Indicator (SLI).
We normally encounter four different kinds of addresses in an IP network. We tend to assign specific purposes to each one. There are other address-like things, of course, such as the protocol number, a router ID, an MPLS label, etc. But let’s stick to these four for the moment. Looking through this list, the first thing you should notice is we often use the IP address as if it identified a host—which is generally not a good thing. There have been some efforts in the past to split the locator from the identifier, but the IP protocol suite was designed with a separate locator and identifier already: the IP address is the location and the DNS name is the identifier.
If you haven’t found the trade-offs, you haven’t looked hard enough.
A perfect illustration is the research paper under review, Securing Linux with a Faster and Scalable Iptables. Before diving into the paper, however, some background might be good. Consider the situation where you want to filter traffic being transmitted to and by a virtual workload of some kind, as shown below.
To move a packet from the user space into the kernel, the packet itself must be copied into some form of memory that processes on “both sides of the divide” can read, then the entire state of the process (memory, stack, program execution point, etc.) must be pushed into a local memory space (stack), and control transferred to the kernel. This all takes time and power, of course.