Maybe my excuse should be that it was somewhere around two in the morning. Or maybe it was just unclear thinking, and that was that. Sgt P. and I were called out to fix the AN/FPS-77 RADAR system just at the end of our day (I normally came into the shop around 6:30 AM after swimming a mile in the Ft. Dix pool, showering, and eating breakfast, so I truly had an early start), so we’d been fighting this problem for some seven or eight hours already. For some reason, a particular fuse down in the high voltage power supply kept blowing. This was the circuit that fed the magnetron with 250,000 volts at around 10 amps (yes, that’s a lot of power, especially for a device originally built in 1964), so the RADAR stayed down every time the fuse went. That made for some interesting discussion with the folks in base weather, who were now dependent on surrounding weather RADAR systems to continue flight operations.
They weren’t happy.
We traced the problem back, using our best half-splitting skills in a high voltage circuit that took minutes to power up and down, and finally decided it was a particular resistor located over on a corner of one assembly (we had boards back then, but this particular power supply was actually built on a small metal cage). We ordered another one and went to our respective houses to sleep.
The next morning, I zoomed back over to the shop — skipping my morning swim, of course — and installed the part. Power on, and… the fuse blew. I should have seen that coming, right? In the midst of the storm, we’d totally jumped outside the half split, measured something wrong, and ended up fingering the wrong component.
Back to square one. What happened? We were looking for facts that would guide us to the right component. But the facts, while interesting, were ultimately irrelevant.
It’s not what we knew that led us wrong; it’s what we didn’t know. But at two in the morning, desperate to get the station chief off our backs, and desperate to get the test equipment shelved and crawl into a warm bed, we started looking at what we knew. Rather than seeking out what we didn’t know, we started thinking, “well, if this is true, and that is true, then this over here must be true.”
Fish often says that troubleshooting is like playing detective, and she’s right. The key problem in troubleshooting (and in engineering in general, in fact) is that we often end up watching the show rather than being the detective. If you really watch any detective show (and I’ve watched hundreds, as it’s just about the only sort of on-screen entertainment I will watch), you’ll discover one interesting thing. The twist depends on getting you to focus on one set of facts so you’ll jump to a conclusion about who committed the crime.
But the story is carefully set up so one more fact will change the entire face of the mystery. There’s even a Scooby-Doo episode that plays on this. They get to the end, the part where Fred pulls the monster mask off the perpetrator of some heinous crime, and it’s someone who’s not even been in the show up to this point. Velma screams about how unfair this is, how it’s just not right for someone they hadn’t even met to be the perpetrator, etc.
There’s a reality behind this, though. The facts, while interesting, are irrelevant. What’s relevant is what you don’t know. From design to troubleshooting, the entire point is to find out what you don’t know, not to focus on what you do know.