Assuming the worst is not the best assumption
It was too bad to be true, but I should have known that assuming the worst was not the best assumption. I was driving the “other” car, the Saab, on the way back from the METNAV shop around eight in the morning. Since the shop was located in the middle of the three runways, this meant I had to drive across the 18 taxiway, along the white lines painted between the C-141s, C-130s, KC-10s, F-4s, and sometimes other odds and ends, and then past the Tower, off the flightline, and onto the “surface streets.” Because I’d been called out at around three in the morning, I wasn’t in uniform. For some reason, I hadn’t driven my normal car — a white Jeep — so the folks in the Tower certainly wouldn’t recognize me.
So when the SP flipped his lights on and pulled in behind me, I was worried. Just as the lights came on, I remembered something really important: I had forgotten to put my sticker on the car. You see, to drive on the flightline, you had to have a sticker on your car. There were various colors for the different areas you could gain access to; mine was red, which meant I had access to everything on the flightline other than the red zone and hot spot. But here I was at eight in the morning, after spending five hours putting the glideslope back on the air for the morning’s landing runs, in a plain pair of jeans and a ratty T-shirt, without a shower, with electronics junk and tools strewn across the back seat of the Saab, and with no sticker.
As an aside, I’d encountered the SPs before on the flightline. Several times, in fact. I was once pushed to the ground face first because I’d accidentally crossed the red line. One night a friend and I walked out of the shelter at the localizer to find ourselves staring down the barrels of at least a dozen M16s. It seems there was a shift change while we were inside working on something, and the outgoing duty officer had forgotten to brief the oncoming duty officer. Not a happy memory.
Needless to say, then, I was assuming the worst.
I stopped (there is no place to “pull over” on a flightline), rolled down the window, and waited. The officer walked up to the car, took a look at the back seat, took a look at me, and said, “I just wanted you to know your lights are on. Don’t forget to turn them off when you park. I wouldn’t want you to have to call a tow truck because of a dead battery.” With that, he turned, went back to his car, and drove off.
I’m glad he didn’t give me time to go through all my excuses. On reflection, it would have only made it worse. Of course I had my military ID handy, but just having an ID doesn’t help you if you’re on the flightline without authorization. In fact, it might just make things worse.
Thinking back through my life, I can recall a lot of times that I’ve made things a lot worse by assuming the worst — by making the worst assumption my first, and best, assumption. By assuming the worst about a situation (and about people), I’ve probably made a lot of things a lot worse than they ever needed to be.
Don’t do this.
What I learned that morning, even though my head was foggy, even though I was tired, and even though I had a few hours of paperwork staring me in the face, is this: don’t assume you’re being stopped for doing something wrong. You should allow each person who enters your life at least a neutral frame of reference, if not a positive one. In a court of law, you’re innocent until proven guilty. In real life, if you treat everyone as if they’re guilty, you’re going to make them all act like they’re guilty.
Sometimes someone just wants to tell you that you left your lights on.
Rule 11 is your friend
It’s common enough in the networking industry — particularly right now — to bemoan the rate of change. In fact, when I worked in the Cisco Technical Assistance Center (TAC), we had a phrase that described how we felt about the amount of information and the rate of change: sipping from the firehose. The phrase has become ubiquitous in the networking world to describe the sense we all share of being left out, left behind, and just plain unable to keep up.
It’s not much better today, either. SDNs threaten to overturn the way we build control planes, white boxes threaten to upend the way we view vendor relationships, virtualization threatens to radically alter the way we think about the relationship between services and the network, and cloud computing promises to make the entire swath of network engineers redundant. It’s enough to make a reasonable engineer ask some rather hard questions, like whether it’s better to flip burgers or move into management (because the world always needs more managers). Some of this is healthy change, of course — we need to spend more time thinking about why we’re doing what we’re doing, and the competition of the cloud is probably a good thing. But there’s another aspect here I don’t think we’ve thought about enough.
Sure, there’s a firehose here. But there are fields all over the world where there’s a veritable firehose of new information, new thinking, and new products being designed, developed, and introduced. The actual work of putting up buildings has radically changed over the last 50–100 years. Some folks were thrown out of the business in the process, but what we tend to see is more buildings going up faster, not a crowd of midlife hamburger flippers who used to design buildings. All around us we see tons of new technology being pressed into service, and yet none of those fields seem to carry the massive fear of dislocation, combined with the constant angst, that always seems to hang in the air in network engineering (and the information technology industry at large).
I know it’s easy to fly the black flag and say, “well, if you can’t keep up, get out.” I don’t know if this is precisely fair to the old, grizzled folks who have families and lives outside work. I don’t even know if this is fair to the newbies coming in—a career field that eats people by the time they are 50, and says, “just save up while you make enough to do so, and forget having a family,” just doesn’t seem all that healthy to me. Instead, we need to find ways to mitigate the firehose. Somehow, we need to learn to cut it down so we can actually learn, and understand, and still live our lives.
But before I talk about Rule 11, let me be honest for a second — this industry isn’t going to change unless we change it. There’s no real reason for it to change. After all, 20-year-olds cost less than 50-year-olds to keep on staff, the firehose makes a lot of money for vendors, and there’s a real ego boost in asking questions like, “did you see the latest vendor X box?” — or in “beating” someone in an interview.
For those of us who do want to change the networking world, or even just to keep up without sipping from the firehose, what can we use as a handle? This is where Rule 11 comes in. To refresh your memory, here it is, straight from RFC 1925:

Every old idea will be proposed again with a different name and a different presentation, regardless of whether it works.
Most people snicker when they read this, because it really is funny. But if Rule 11 is true, 90% of the water coming out of the firehose is, in fact, recycled.
Do you see it yet? If you can successfully build a mental model of each technology, and then learn to expand that mental model to each new technology you encounter, you will be able to mitigate the firehose.
If we’re going to survive as an industry, we need to get past the firehose. We need to stop thinking about the sheet metal and the cable colors, and start thinking about processes, ideas, and models. We need to stop flying by the seat of our pants, and start trying to make this stuff into real engineering, rather than black magic. Yes, I moved from working on airfield electronics to network engineering because I craved the magical side of this world, but magic just isn’t a sustainable business model, nor a sustainable way of life.
Information wants to be protected: Security as a mindset
I was teaching a class last week and mentioned something about privacy to the students. One of them shot back, “you’re paranoid.” At a meeting with some folks about missionaries — and how best to protect them when trouble comes to their door — I was again declared paranoid. In fact, I’ve been told I’m paranoid after presentations by complete strangers sitting in the audience.
Okay, so I’m paranoid. I admit it.
But what is there to be paranoid about? We’ve supposedly gotten to the point where no one cares about privacy, where encryption is pointless because everyone can see everything anyway, and all the rest. Everyone except me, that is — I’ve not “gotten over it,” nor do I think I ever will. In fact, I don’t think any engineer should “get over it,” in terms of privacy and security. Even if you think it’s not a big deal in your own life, engineers should learn to treat other people’s information with the utmost care.
In moving from the person to the digital representation of the person, we often forget it’s someone’s life we’re actually playing with. I think it’s time for engineers to take security—and privacy—personally. It’s time to actually do what we say we do, and make security a part of the design from day one, rather than something tacked on to the end.
And I don’t care if you think I’m paranoid.
Maybe it’s time to retire the old saying, “information wants to be free.” Perhaps we should replace it with something a little more realistic, like:
Information wants to be protected.
It’s true that there are many different kinds of information. For instance, there’s the information contained in a song, or the information contained in a book, or a blog, or information about someone’s browsing history. Each piece of information has a specific intent, or purpose — a goal for which it was created. Engineers should make it their default design that information is only used for the purpose intended by the creator (or owner) of that information. We should design this into our networks, into our applications, and into our thought patterns. It’s all too easy to think, “we’ll get to security once things are done, and there’s real data being pushed into the system.” And then it’s too easy to think, “no one has complained, and the world didn’t fall apart, so I’ll do it later.”
But what does it mean to design security into the system from day one? This is often, actually, the hard part. There are tradeoffs, particularly costs, involved with security. These costs might be in terms of complexity, which makes our jobs harder, or in terms of actual costs to bring the system up in the first place.
But if we don’t start pushing back, who will? The users? Most of them don’t even begin to understand the threat. The business folks who pay for the networks and applications we build? Not until they’re convinced there’s an ROI they can get their minds around. Who’s going to need to build that ROI? We are.
A good place to start might be here.
And we’re not going to until we all start nurturing the little security geek inside every engineer, until we start taking security (and privacy) a little more seriously. Until we stop thinking about this stuff as just bits on the wire, and start thinking about it as people’s lives. Until we reset our default to “just a little paranoid,” perhaps.
P.S. I’m not so certain we should get over it. Somehow I think we’re losing something of ourselves in this process of opening our lives to anyone and everyone, and I fear that by the time we figure out what it is we’re losing, it’ll be too late to reverse the process. Somehow I think that treating other people as a product (if the service is free, you are the product) is just wrong in ways we’ve not yet been able to define.
Micromanaging networks considered harmful: on (k)nerd knobs
Nerd knobs (or as we used to call them in TAC, knerd knobs) are the bane of the support engineer’s life. Well, that and crashes. And customers who call in with a decoded stack trace. Or who don’t know where to put the floppy disk that came with the router. But, anyway…
What is it with nerd knobs? Ivan has a great piece up this week on the topic. I think this is the closest he gets to what I think of as the real root cause for nerd knobs —
Greg has a response to Ivan up; again, I think he gets close to the problem with these thoughts —
A somewhat orthogonal article caught my eye, though, that I think explains what is actually going on here with those pesky nerd knobs. The article is really about SQL and the concept of micromanaging software. To give you a flavor (in case you’re too lazy/busy to head over there and read the whole thing) —
I think this gets to the heart of the nerd knob problem. What’s happening with nerd knobs is this: it’s easier to tell the system how we want something done than it is to tell the system what we want done. Think about it this way: you install a routing protocol, and you tell it what you want in broad, general terms. Something like, “I want the shortest path between each pair of points in the network.” Then you run into a situation where you need that modified, so you mess around with the metrics some, and get on with your life. Then you run into a situation where you need this flow to go here, and that flow to go there, so you install some policy-based routing along the way.
Per-link metrics are just the first level of nerd knobs. Policy-based routing is just the second. The more precise we want to get, the deeper the nerd knobs go. Want to load share over links that aren’t truly equal cost? Oh, just nerd knob it. Want to send ASes in the AS path you shouldn’t? Just nerd knob it.
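To make the what-versus-how distinction concrete, here’s a toy sketch in Python. The graph, the costs, and the policy table are all invented for illustration — this isn’t any real protocol’s code. The “what” is a plain shortest-path computation; each nerd knob is then an imperative override layered on top of it.

```python
import heapq

def shortest_path(links, src, dst):
    """The 'what': plain Dijkstra shortest path between two points."""
    dist = {src: 0}
    prev = {}
    pq = [(0, src)]
    while pq:
        d, node = heapq.heappop(pq)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue
        for nbr, cost in links.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(pq, (nd, nbr))
    # Walk back from dst to src to recover the path.
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path))

# Declarative intent: shortest path over the advertised metrics.
links = {
    "A": {"B": 1, "C": 5},
    "B": {"A": 1, "D": 1},
    "C": {"A": 5, "D": 1},
    "D": {"B": 1, "C": 1},
}
print(shortest_path(links, "A", "D"))  # ['A', 'B', 'D']

# Nerd knob, level 1: tweak a per-link metric to steer all traffic.
links["A"]["B"] = 10
print(shortest_path(links, "A", "D"))  # ['A', 'C', 'D']

# Nerd knob, level 2: policy-based routing -- a per-flow override
# that bypasses the algorithm entirely for matching traffic.
policy = {("A", "web-traffic"): ["A", "B", "D"]}

def route(links, src, dst, flow):
    return policy.get((src, flow)) or shortest_path(links, src, dst)
```

Notice the pattern: the first knob still works through the algorithm (we lie to it about a cost), while the second steps around the algorithm entirely. Each layer is another “how” bolted onto the original “what,” and each one is something a support engineer has to reverse-engineer at 2 AM.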
The reality is every nerd knob in routing represents a policy driven by a business requirement expressed as a tweak to the underlying fundamental routing algorithm. As Ivan rightly points out, going to SDNs isn’t going to solve this problem. If anything, it’s going to make it worse. Now, rather than seeing the nerd knob for what it is, a pain in the butt that needs to be explained and dealt with at 2AM when you’re half asleep and the TAC engineer is halfway around the world, it’s going to be “just another line of code.”
This might sound brilliant to someone who hasn’t managed, or dealt with, multi-million line projects and the vagaries of codebase management. Ask someone who has, though, before you get into this. It’s just a different set of problems, not a better set of problems.
The root cause here, though, isn’t nerd knobs. And it’s not business requirements. And it’s not really laziness (most of the time). It’s not even machismo most of the time (though I will admit the natural arrogance of the geek is probably worth studying by some anthropologist somewhere). There are two root causes, really.
First, we, the networking industry, haven’t really thought through what a control plane actually does. Oh, we have the seven-layer model with the control plane thrown off to the side, or the claim that there shouldn’t even be a control plane. But this is part of why I think the seven-layer model needs to die — because it’s a host-focused view of the networking world. End-to-end and dumb-as-rocks routers are nice to contemplate, but I think we need to admit that even the dumb rocks are a bit more complex than we first thought.
Second, I don’t think we’ve really incorporated complexity into our souls. As someone once told me, “the CAP theorem is just an observer problem!” Or rather, we somehow believe that by making virtual things we can skip all that ugly physical reality stuff. Faster, cheaper, and better are all three available “on tap,” if we can just figure out how to see the problem right. This is nonsense on stilts.
We need to get in here and do some serious thinking about complexity, and how to manage it in network design. We need to do things like think about interaction surfaces, and how to prevent them from becoming so deep and broad as to be unmanageable. As the article on SQL says, from above —
In a world of regulation and increasing interdependencies between organizations, expressing intent independently of implementation means that you can avoid a class of unintended consequences of systems building.
Where have I heard this before? Oh, maybe it’s in that new book on network complexity someplace.
Seriously — I know this is a long rant, so I’ll quit now, but — seriously (!) we need to grow up and start treating the control plane as an engineering problem. Then, and only then, will we get rid of nerd knobs, no matter whether they’re some hidden CLI command, or some “if/then/else” or “goto” statement hidden someplace in the controller code.
P.S. BTW, Greg, I disagree with you about routing protocols. They’ll “go away” for a short while, until we start trying to deal with networks that don’t run on standards based routing protocols. And then we’ll beg for them to come back. We’ll form something like the IETF, and solve all the same problems all over again, convinced that we can do better than that last group of engineers did. Been there. Done that. Got the t-shirt (someplace).
Engineering Lessons, IPv6 Edition
Yes, we really are going to reach a point where the RIRs run out of IPv4 addresses; the charts on Geoff’s blog make that plain enough.
Why am I thinking about this? Because I ran across a really good article by Geoff Huston over at potaroo about the state of the IPv4 address pool at APNIC. The article is a must read, so stop right here, right click on this link, open it in a new tab, read it, and then come back. I promise this blog isn’t going anyplace while you’re over on Geoff’s site. But my point isn’t to ring the alarm bells on the IPv4 situation. Rather, I’m more interested in how we got here in the first place. Specifically, why has it taken so long for the networking industry to adopt IPv6?
Inertia is a tempting answer, but I’m not certain I buy it as the sole reason for the lack of deployment. IPv6 was developed some fifteen years ago; since then we’ve deployed tons of new protocols, tons of new networking gear, and lots of other things. Remember what a cell phone looked like fifteen years ago? In fact, if we’d started fifteen years ago with simple dual-mode devices, we could easily be fully deployed in IPv6 today. As it is, we’re really just starting now.
We didn’t see a need? Perhaps, but that’s difficult to maintain, as well. When IPv6 was originally developed (remember — fifteen years ago), we all knew there was an addressing problem. I suspect there’s another reason.
I suspect that IPv6, in its original form, tried to boil the ocean, and the result may have been too much change, too fast, for the networking community to handle in such a fundamental area of the stack. What engineering lessons might we draw from the long timescales around IPv6 deployment?
For those who weren’t in the industry those many years ago, there were several drivers behind IPv6 beyond just the need for more address space. For instance, the entire world exploded with “no more NATs.” In fact, many engineers, to this day, still dislike NATs, and see IPv6 as a “solution” to the NAT “problem.” Mailing lists roiled with long discussions about NAT, security by obscurity (I’m still waiting for someone who strongly believes obscurity is useless to step onto a modern battlefield with a state-of-the-art armor system painted bright orange), and a thousand other topics. You see, ARP really isn’t all that efficient, so let’s do something a little different and create an entirely new neighbor discovery system. And then there’s that whole fragmentation issue we’ve been dealing with in IPv4 for all these years. And…
Part of the reason it’s taken so long to deploy IPv6, I think, is because it’s not just about expanding the address space. IPv6, for various reasons, has tried to address every potential failing ever found in IPv4.
Don’t miss my point here. The design and engineering decisions made for IPv6 are generally solid. But all of us — and I include myself here — tend to focus too much on building that practically perfect protocol, rather than building something that was “good enough,” along with stretchy spots where obvious change can be made in the future.
In this specific case, we might have passed over one specific question too easily — how easy will this be to deploy in the real world? I’m not saying there weren’t discussions around this very topic, but the general answer was, “we have fifteen years to deploy this stuff.” And, yet… Here we are fifteen years later, and we’re still trying to convince people to deploy it. Maybe a bit of honest reflection might be useful just about now.
I’m not saying we shouldn’t deploy IPv6. Rather, I’m saying we should try and take a lesson from this — a lesson in engineering process. We needed, and need, IPv6. We probably didn’t need the NAT wars. We needed, and need, IPv6. But we probably didn’t need the wars over fragmentation.
What we, as engineers, tend to do is build solutions that are complete, total, self-contained, and practically perfect. What we, as engineers, should do is build platforms that are flexible, usable, and able to support a lot of different needs. Being a perfectionist isn’t just something you say during the interview in answer to that one dumb question about your greatest weakness. Sometimes you — we, really — do need to learn to stop what we’re doing, take a look around, and ask: why are we doing this?

