
Worth Reading: AMD and the Infinity Fabric

Starting with AMD’s Ryzen desktop processor and Epyc server architecture, AMD will implement its scalable Infinity Fabric across all of its SoC and MCM products. Think of Infinity Fabric as a superset of HyperTransport, AMD’s previous socket-to-socket interconnect architecture, now managed by the HyperTransport Consortium. Infinity Fabric is a coherent high-performance fabric that uses sensors embedded in each die to scale control and data flow from die to socket to board-level. —The Next Platform

Worth Reading: The Traffic Shaping Loophole

Since the disclosures of Edward Snowden in 2013, the U.S. government has assured its citizens that the National Security Agency (NSA) cannot spy on their electronic communications without the approval of a special surveillance judge. Domestic communications, the government says, are protected by statute and the Fourth Amendment. In practice, however, this is no longer strictly true. These protections are real, but they no longer cover as much ground as they did in the past. —The Century Foundation

Worth Reading: The Internet and Trust

This narrative refers to the understanding that trust mitigates the basic uncertainties that the Internet architecture has imposed upon its operators since its inception. To this day, network engineers cannot generally be certain about the validity of the routing announcements that they receive from interconnected networks, and they have little insight into the legitimacy of the traffic that they are mandated to transmit. Historically, network operators knew and trusted one another, and, as a result, the Internet worked in spite of its uncertainties. —APNIC

Random Thoughts on Grey Failures and Scale

I have used the example of adding paths to the point where the control plane converges more slowly, increasing the Mean Time to Repair, to show that too much redundancy can actually reduce overall network availability. Many engineers I’ve talked to balk at this idea, because it seems hard to believe that adding another link could impact routing protocol convergence in such a way. I ran across a paper a while back that provides a different kind of example of the redundancy trade-off in a network, but I never got around to reading the entire paper and figuring out how it fits in.

In Gray Failure: The Achilles’ Heel of Cloud-Scale Systems, the authors argue that one of the main problems with building a cloud-scale system is grey failures—when a router fails only some of the time, or drops (or delays) only a small percentage of the traffic. The example given is—

  • A single service must collect information from many other services on the network to complete a particular operation
  • Each of these information collection operations represents a single transaction carried across the network
  • The more transactions there are, the more likely it is that every path in the network is going to be used
  • If every path in the network is used, every grey failure is going to have an impact on the performance of the application
  • The more devices there are physically, the more likely at least one device is going to exhibit a grey failure

In the paper, the fan-out is assumed to be the number of transactions over the number of routers (packet forwarding devices). The fewer routers there are, the more likely it is that every one of them will be used to handle a large number of flows; the more routers there are, the more likely it is that one of them will experience some kind of grey failure. There is some point at which there are enough flows over enough devices that the application will always be impacted by at least one grey failure, harming its performance. When this point is reached, the system will start exhibiting performance loss that is very hard to understand, much less troubleshoot and repair.

Maybe an example will be helpful here. Say a particular model of router has a 10% chance of hitting a bug where the packet processing pipeline fails, and it takes some number of milliseconds to recover. Now, look at the following numbers—

  1. 1 flow over 2 routers; the application has a 50% chance of using one path or the other
  2. 10 flows over 2 routers; the application has close to a 100% chance of using every path
  3. 10 flows over 100 routers; the application has a 10% chance of using any given path
  4. 10,000 flows over 1,000 routers; the application has close to a 100% chance of using every path
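
A quick way to sanity-check these numbers is to assume each flow is hashed onto one of the available paths uniformly at random, so the chance that a given path carries at least one flow is 1 - (1 - 1/paths)^flows. Here is a minimal sketch of that calculation (my own illustration, not something taken from the paper):

```python
# Rough check on the path-usage numbers above, assuming each flow is
# hashed onto one of the equal-cost paths uniformly at random (my
# simplifying assumption, not something the paper specifies).

def p_path_used(flows, paths):
    """Probability that a given path carries at least one of the flows."""
    return 1 - (1 - 1 / paths) ** flows

for flows, paths in [(1, 2), (10, 2), (10, 100), (10_000, 1_000)]:
    print(f"{flows:>6} flows over {paths:>5} paths: "
          f"P(a given path is used) ~ {p_path_used(flows, paths):.1%}")
```

Under this assumption the four cases work out to roughly 50%, 99.9%, 10%, and nearly 100%, which lines up with the list above.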

If you treat the number of flows as the numerator and the number of paths as the denominator of a simple fraction, then so long as the number of flows “swamps” the number of paths, the chances of every path in the network being used are very high. Now consider what happens if 1% of the routers produced and shipped have the grey failure. Some rough (and probably not perfect) back-of-the-envelope numbers, corresponding to the four cases above—

  1. There is a 0.5% chance of the application being impacted by the grey failure
  2. There is a 2% chance of the application being impacted by the grey failure
  3. There is a 1% chance of the application being impacted by the grey failure
  4. There is a close to 100% chance of the application being impacted by the grey failure

This last result is somewhat surprising. At some point, this system—the application and network combined—crosses a threshold beyond which the grey failure will always impact the operation of the application. But when you build a network of 1,000 devices and introduce massive equal-cost multipath, the problem becomes almost impossible to track down and fix. Which of the 1,000 devices has the grey failure? If the failure is transient, this is the proverbial broken connector in the box that has been sitting in the corner of the lab for the last twenty years.
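
To see where that threshold sits, the sketch above can be extended with one more assumption of mine: that 1% of routers independently carry the grey failure. The exact percentages this model produces will not match the back-of-the-envelope numbers exactly, but the shape of the result is the point.

```python
# Rough model of grey-failure exposure (my own approximation, not the
# paper's method): each router independently has a defect_rate chance of
# carrying the grey failure, and each flow lands on a router uniformly
# at random.

def p_impacted(flows, routers, defect_rate=0.01):
    """Approximate probability that at least one router the application
    actually uses exhibits the grey failure."""
    p_used = 1 - (1 - 1 / routers) ** flows            # a given router carries >= 1 flow
    return 1 - (1 - defect_rate * p_used) ** routers   # 1 - P(no used router is defective)

for flows, routers in [(1, 2), (10, 2), (10, 100), (10_000, 1_000)]:
    print(f"{flows:>6} flows over {routers:>5} routers: "
          f"P(impacted) ~ {p_impacted(flows, routers):.2%}")
```

The absolute numbers shift with the assumptions, but once the flow count swamps the router count, the probability of hitting at least one grey failure climbs to effectively 100%.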

How can this problem be solved? The paper suggests that the only real solution is more, and more accurate, measurement. The problem with this solution is, of course, that measurements generate data, and that data must be… carried over the network, which makes the measurement data itself subject to the same grey failures. What has been discovered so far, though, is this: some combination of careful measurement, careful testing across a wide variety of workload profiles, and data analytics can find these sorts of problems.

Returning to the original point of this post—is this ultimately another instance of larger amounts of state causing the network to converge more slowly? No. But it still seems to be another case of more redundancy, both in the network and the way the application is written, opening holes where lower availability becomes a reality.

Worth Reading: The Value of DRM Locks

My co-authors and I at the University of Glasgow are investigating how restrictions on interoperability imposed by Digital Rights Management (DRM) systems might impact the market for goods. We are doing this as part of a larger project to better understand the economics of DRM and to figure out what changes would likely occur if the laws were reformed. Our recent working paper is titled ‘How much do consumers value interoperability: Evidence from the price of DVD players’. —EFF

Worth Reading: Converge your network with priority flow control

Back in April, we talked about a feature called Explicit Congestion Notification (ECN). We discussed how ECN is an end-to-end method used to converge networks and save money. Priority flow control (PFC) is a different way to accomplish the same goal. Since PFC supports lossless or near lossless Ethernet, you can run applications, like RDMA, over Converged Ethernet (RoCE or RoCEv2) over your current data center infrastructure. Since RoCE runs directly over Ethernet, a different method than ECN must be used to control congestion. In this post, we’ll concentrate on the Layer 2 solution for RoCE — PFC, and how it can help you optimize your network. —Cumulus

Worth Reading: New German law encourages censorship

Social media companies and other hosts of third-party content will soon face potential fines of €50 million in Germany if they fail to promptly censor speech that may violate German law. Last week, the German parliament approved the NetzDG legislation, which goes into effect 1 October and will require social media sites and other hosts of user-generated content to remove “obviously illegal” speech within 24 hours of being notified of it. —Center for Democracy and Technology

Worth Reading: Journey into the hybrid cloud

“In the cloud” is more now than just a phrase that describes a feeling. Although the cloud began as a vision, over the past decade it has become an integral part of everyday business decisions, even being evaluated for an enterprise’s most critical high-value operations. Thanks to the public cloud, many startups find it quick and easy to set up their initial operation and product-development cycles. Some, such as Uber and Airbnb, even go to market entirely in the cloud. —The Data Center Journal

Worth Reading: Cutting through the segment routing hype

Segment Routing (SR) is a new traffic-engineering technology being developed by the IETF’s SPRING Working Group. Two forwarding plane encapsulations are being defined for SR: Multiprotocol Label Switching (MPLS) and IPv6 with a Segment Routing Extension Header. This article provides some historical context by describing the MPLS forwarding plane and control plane protocols, explains how Segment Routing works, introduces the MPLS-SR forwarding plane, and shows how the SR control plane is used. Finally, the article compares SR with legacy MPLS systems, and identifies its unique merits. —The IETF Journal

On the ‘web: Is it really simpler?

Simplification is the metabuzzword overlaying the networking world today. For instance, Software Defined Wide Area Networks (SD-WANs) propose to reduce the complexity of the enterprise wide area network, particularly the network connecting the many thousands of remote sites many enterprises operate, by replacing MPLS and private line services purchased from service providers with services that run over the top of “plain jane Internet” connectivity. A second area where network operators are simplifying is in purchasing pre-integrated stacks of compute, storage, and networking in a single rack, controlled through a single GUI—a form of hyperconvergence. A third area is the large amount of interest in deploying Software Defined Networks (SDNs), which are being promoted based on the simplifications possible from removing “complex” distributed control planes from the network. There are three questions network engineers should ask in response to the simplification metabuzzword, however. —Tech Target
