Mean Time to Innocence is not Enough

14 November 2022

A long time ago, I supported a wind speed detection system consisting of an impeller, a small electric generator, a 12 gauge cable running a few miles, and a voltmeter. The entire thing was calibrated through a resistive bridge: attach an electric motor to the generator, run it at a series of fixed speeds, and adjust the resistive bridge until the voltmeter, marked in knots of wind speed, read correctly.

The primary problem in this system was the several miles of 12 gauge cable. It was often damaged, requiring us to dig the cable up (shovel ready jobs!), strip the cable back, splice the correct pairs together, seal it all in a plastic container filled with goo, and bury it all again. There was one instance, however, when we could not get the wind speed system adjusted correctly, no matter how we tried to tune the resistive bridge. We pulled things apart and determined there must be a problem in one of the (many) splices in the several miles of cable.

The EIGRP SIA Incident: Positive Feedback Failure in the Wild

9 April 2018

Reading a paper as the basis for a research post (yes, I’ll write about the paper in question in a later post!) jogged my memory about an old case that perfectly illustrated the concept of a positive feedback loop leading to a failure. We describe positive feedback loops in Computer Networking Problems and Solutions, and in…

Troubleshooting: Half Split

16 May 2017

The best models will support the second crucial skill required for troubleshooting: seeing the system as a set of problems to be solved. The problem/solution mindset is so critical to understanding how networks really work, and hence how to troubleshoot them, that Ethan Banks and I are writing an entire book around this…

Troubleshooting: Models

9 May 2017

How well can you know each of these four systems? Can you actually know them in fine detail, down to the last packet transmitted and the last bit in each packet? Can you know the flow of every packet through the network, and every piece of information any particular application pushes into a packet,…

Troubleshooting: Basics

2 May 2017

It’s 2AM, the network is down, and the CEO is on the phone asking when it is going to be back up—the overnight job crucial to the business opening in the morning has failed, and the company stands to lose millions of dollars if the network is not fixed in the next hour or so.…