Archive for 2022
Hedge 159: Roundtable on SONiC, Antipatterns, and Resilience through Acquisition
In this last episode of 2022, Tom, Eyvonne, and Russ sit around and talk about some interesting things going on in the world of network engineering. We start with a short discussion about SONiC, which we intend to build at least one full episode about sometime in 2023. We also discuss state and antipatterns, and finally the idea of acquiring another company to build network resilience.
Hedge 158: The State of DDoS with Roland Dobbins
DDoS attacks continue to be a persistent threat to organizations of all sizes and in all markets. Roland Dobbins joins Tom Ammon and Russ White to discuss current trends in DDoS attacks, including the increasing scope and scale, as well as the shifting methods used by attackers.
Hedge 157: Vendor Lock-in with Frank Seesink
Vendor lock-in has been an issue in networking for the entire time I’ve been working in the field—since the late 1980s. I well remember the arguments over POSIX compliance, SQL middleware standards, ADA, and packet formats. It was an issue in electronics, which is where I worked before falling into a career in computer networks, too. What does “vendor independence” really mean, and what are the ways network operators can come close to having it? Frank Seesink joins Russ White and Tom Ammon to rant about—and consider—solutions to this problem.
Hedge December 22 Update
The Hedge December update contains information about upcoming episodes and training—listen in for the inside scoop!
Hedge 156: Functional Separation in Network Design with Kevin Myers
Modularization is a crucial part of network design because it supports interchangeability, reduces the size of failure domains, and controls security domains. One critical aspect of modularization is functional separation, which argues for separating services onto specific physical and logical resources. Kevin Myers joins Tom Ammon and Russ White on this episode of the Hedge to discuss the theory and importance of functional separation in network design.
The Hedge 155: DNS Deployment in Real Life with Andreas Taudte
Network engineers normally use and support DNS as a service, but don’t tend to deploy, manage, and interact with DNS servers at an application level. For this episode of the Hedge, Andreas Taudte joins Tom Ammon and Russ White to discuss the many lessons learned from planning and deploying DNS as a service.
Mean Time to Innocence is not Enough
A long time ago, I supported a wind speed detection system consisting of an impeller, a small electric generator, a 12 gauge cable running a few miles, and a voltmeter. The entire thing was calibrated through a resistive bridge–attach an electric motor to the generator, run it at a series of fixed speed, and adjust the resistive bridge until the voltmeter, marked in knots of wind speed, read correctly.
The primary problem in this system was the several miles of 12 gauge cable. It was often damaged, requiring us to dig the cable up (shovel ready jobs!), strip the cable back, splice the correct pairs together, seal it all in a plastic container filled with goo, and bury it all again. There was one instance, however, when we could not get the wind speed system adjusted correctly, no matter how we tried to tune the resistive bridge. We pulled things apart and determined there must be a problem in one of the (many) splices in the several miles of cable.
At first, we ran a Time Domain Reflectometer (TDR) across the cable to see if we could find the problem. The TDR turned up a couple of hot spots, so we dug those points up … and found there were no splices there. Hmmm … So we called in a specialized cable team. They ran the same TDR tests, dug up the same places, and then did some further testing and found … the cable was innocent.
This set up an argument, running all the way to the base commander level, between our team and the cable team. Who’s fault was this mess? Our inability to measure the wind speed at one end of the runway was impacting flight operations, so this had to be fixed. But rather than fixing the problem, we were spending our time arguing about who’s fault the problem was, and who should fix it.
When I read this line in a recent CAIDA research paper–
“Measurement is political, and often adversarial.”
It rang very true. In Internet terms, speed, congestion, and even usage are often political and adversarial. Just like the wind speed system, two teams were measuring the same thing to prove the problem wasn’t their’s–rather than to figure out what the problem is and how to fix it.
In other words, our goal is too often Mean Time to Innocence (MTTI), rather than Mean Time to Repair (MTTR).
MTTI is not enough. We need to work with our application counterparts to find and fix problems, rather than against them. Measurement should not be adversarial, it should be cooperative.
We need to learn to fix the problem, not the blame.
This is a cultural issue, but it also impacts the way we do telemetry. For instance, in the case of the wind speed indicator, the problem was ultimately a connection that “worked,” but with high capacive reactance such that some kinds of signals were attenuated while others were not. None of us were testing the cable using the right kind of signal, so we all just sat around arguing about who’s problem it was rather than solving the problem.
When a user brings a problem to you, resist the urge to go prove yourself–or your system–innocent. Even if your system isn’t the problem, your system can provide information that can help solve the problem. Treat problems as opportunities to help rather than as opportunies to swish your superhero cape and prove your expertise.