SKILLS

Thoughts on Impostor Syndrome

How many times, on reading my blog, a book, or watching some video of mine over these many years (the first article I remember writing that was publicly available, many years ago, was the EIGRP white paper on Cisco Online, somewhere in 1997), have you thought: here is an engineer who has it all together, who knows technology in depth and breadth, and who symbolizes everything I think an engineer should be? And yet, how many times have you faced that feeling of self-doubt we call impostor syndrome?

I am going to let you in on a little secret. I’m an impostor, too. After all these years, I still feel like I am going to be speaking in front of a crowd, explaining something at a meeting, or hitting publish on something, and the entire world is going to “see through the charade,” and realize I’m not all that good of an engineer. That I am an ordinary person, just doing ordinary things.

While I often think about these things, what has led me down the path of thinking about them this week is some reading I’ve been doing for a PhD seminar about human nature, work and leisure. Another part of the reason is that I have been struggling recently with some specific things at work. And, finally, another part of the reason is that I ran into a terrific post over at The Humble Lab on the topic of the impostor syndrome.

Some of this will agree with, and echo, what I’ve read in other places. Other parts of this will be unique to my worldview (worldview warning here, for those too delicate to read things from a different perspective). Yes, this is going to be an honest post. Yes, this is going to be a long post.

Why do I feel like an impostor? For me, it is often fear. I think this is probably true of most people, if we are to be honest with ourselves.

I know many people who are afraid of public speaking. When talking to them in depth about this, what I normally find is a fear of failure. Some people who don’t get a degree or certification are hindered by a fear of failure. There are two kinds of fear of failure, I find: looking foolish in front of others, and losing control. I used to be able to climb tall towers without clips, and without fear. I could monkey the side of the RADAR tower at McGuire, 90 feet tall, and walk around on the top platform knocking wooden blocks out of their stays with a sledge. I could shimmy a 30 foot pole to reach the wind bird. I would struggle in doing those things now; I think more about what I cannot control.

So part of what drives the impostor syndrome is these two kinds of fear: fear of failure, and fear of losing control. Both of these tend to manifest themselves in another fear: the fear of missing out.

What can we do to address these fears? One answer I often hear is “man up and deal with it.” In other words, just address your fears, and get over them. There is some truth in this answer. Sometimes this is really what is necessary. My daughters have a hedgehog; this is one scared little animal. The only way the little hedgie is going to learn that it’s okay is to be placed in traumatic situations, and for nothing to happen. Sometimes we are hedgehogs, and our nails just need to be trimmed for our own good.

Sometimes you just have to deal with fear and keep going. Sometimes you are going to fail. It’s okay to fail.

To quote The Humble Lab here—

The point is, it’s what you do with that failure that defines you—not the fact that you failed at all. We need to drive a culture that encourages people to learn from failure, and grow from it.

But there is a danger in this answer, as well—that we will take this as the only answer. That whenever we face something we fear, or some obstacle, we will say to ourselves, “you can do anything you set your mind to, if you just try hard enough.” Or even “you can use every failure to learn something,” which implies that if you don’t learn from a failure, you are… a failure. The danger here is when stated absolutely, this is a lie. You cannot do anything you want to if you just set your mind to it. I cannot be a great baseball player, ice skater, or Olympic swimmer. I simply do not have the bodily attributes to do such things. And no amount of failure, with the attendant learning, is going to make me any of those things. Sometimes what you need to learn is I cannot do this.

This leads to a second answer to the impostor syndrome: learn to live in your limits. There is a difference between pushing yourself to achieve something and pushing yourself too hard. I cannot tell you how to know the difference, because I think it is different for every person, but I know there are times in my life when I have pushed too hard. The more you realize that everyone has limits, the less limited you will be by your own limits.

For instance, I recently picked up a small bit of thinking by reading about Keith’s Law in the area of complexity. One of the corollaries of Keith’s Law is this:

You can only know what is at your layer, and a little above and below. The rest is rumor and pop psychology.

We work on complex systems. When I was in tech school in the USAF, we had one classroom where all the circuit diagrams for the AN/FRPS-77 RADAR system had been pieced together into one large diagram on the wall. This is before the days of the computer being common, of course, so everything was on paper. Folks from other career fields would borrow that room from time to time, and that set of patched together diagrams always gave them a start. Yes, a RADAR system is complex. But the systems I deal with now, networks and their environment, are far more complex. I can describe the intersection of topological aggregation and summarization, virtualization, and protocol stacks, but there is no way I could draw it.

The reality is I cannot know it all. And that means there will be many times when I push a button, thinking it will do one thing, and it will actually do another. In other words, I will fail. Not because I intended to fail, but simply because I do not have all the knowledge in the world.

Not knowing does not make me a bad person, or a bad engineer; it just makes me human. It is okay to be human, and it is okay to fail because I don’t know everything.

Another answer to the impostor syndrome is to learn more. This is in tension with the second point, I know, but it is still an important point. Just like being afraid does not let you off the hook of dealing with hard situations, accepting that you cannot know everything does not let you off the hook of learning new things. There is a technique to learning, of course, but I have talked about this enough in other places (here, for instance), and this post is already too long.

Intentionally learn to counter your fears.

Finally, there is something else that needs to be said here: perhaps the best guard against impostor syndrome is to live and work in a community. Not a community of competitors, or a community of followers, but rather just a community. This means you need to stop trying to compete with everyone around you, and it means you need to stop treating everyone around you as competitors. It means not thinking I am better than another person because I know more, or I have done more. It means opening up about my fears with that community, and not being afraid to ask for help when I don’t know. It means accepting that I am never the smartest person in the room.

Intentionally build, and be a part of, a community. It’s okay to fail in front of other people.

Complexity Sells

According to Roman philosophers, simplicity is the hallmark of truth. And yet, networks have become ever more complex over time. Why is this? Because complexity sells. In this short take, I talk about why complexity sells, and some of the mental habits you can use to overcome our natural tendency to prefer the complex.

The EIGRP SIA Incident: Positive Feedback Failure in the Wild

Reading a paper to build a research post from (yes, I’ll write about the paper in question in a later post!) jogged my memory about an old case that perfectly illustrated the concept of a positive feedback loop leading to a failure. We describe positive feedback loops in Computer Networking Problems and Solutions, and in Navigating Network Complexity, but clear cut examples are hard to find in the wild. Feedback loops almost always contribute to, rather than independently cause, failures.

Many years ago, in a network far away, I was called into a case because EIGRP was failing to converge. The immediate cause was neighbor flaps, in turn caused by Stuck-In-Active (SIA) events. To resolve the situation, someone in the past had set the SIA timers really high, as in around 30 minutes or so. This is a really bad idea. The SIA timer, in EIGRP, is essentially the amount of time you are willing to allow your network to go unconverged in some specific corner cases before the protocol “does something about it.” An SIA event always represents a situation where “someone didn’t answer my query, which means I cannot stay within the state machine, so I don’t know what to do; I’ll just restart the state machine.” Now before you go beating up on EIGRP for this sort of behavior, remember that every protocol has a state machine, and every protocol has some condition under which it will restart the state machine. It just so happens that EIGRP’s conditions for this restart were too restrictive for many years, causing a lot more headaches than they needed to.
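To make the cost of a long SIA timer concrete, here is a minimal sketch in Python. It is a toy model, not EIGRP's actual state machine: the function name `active_route_outcome`, its parameters, and the 180 second comparison value are all assumptions chosen for illustration.

```python
import math


def active_route_outcome(reply_delay_s, sia_timer_s):
    """Toy model of one active route waiting on one neighbor's reply.

    Returns (seconds the route stays unconverged, what happens next).
    Pass math.inf as reply_delay_s for a neighbor that never answers.
    """
    if reply_delay_s <= sia_timer_s:
        # The reply arrives in time, so the route can leave the active state.
        return reply_delay_s, "reply received"
    # No reply before the timer expires: the neighbor is reset.
    return sia_timer_s, "SIA, neighbor reset"


# A neighbor that never replies, compared under two hypothetical timer settings.
for timer_s in (180, 30 * 60):
    waited, outcome = active_route_outcome(math.inf, timer_s)
    print(f"timer={timer_s}s -> unconverged for {waited}s, then: {outcome}")
```

Raising the timer does not make the missing reply appear; in this model it only lengthens the time spent unconverged before the reset happens anyway.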

So the situation, as it stood at the moment of escalation, was that the SIA timer had been set unreasonably high in order to “solve” the SIA problem. And yet, SIAs were still occurring, and the network was still working itself into a state where it would not converge. The first step in figuring this problem out was, as always, to reduce the number of parallel links in the network to bring it to a stable state, while figuring out what was going on. Reducing complexity is almost always a good, if counterintuitive, step in troubleshooting large scale system failure. You think you need the redundancy to handle the system failure, but in many cases, the redundancy is contributing to the system failure in some way. Running the network in a hobbled, lower readiness state can often provide some relief while figuring out what is happening.

In this case, however, reducing the number of parallel links only lengthened the amount of time between complete failures—a somewhat odd result, particularly in the case of EIGRP SIAs. Further investigation revealed that a number of core routers, Cisco 7500’s with SSE’s, were not responding to queries. This was a particularly interesting insight. We could see the queries going into the 7500, but there was no response. Why?

Perhaps the packets were being dropped on the input queue of the receiving box? There were drops, but not nearly enough to explain what we were seeing. Perhaps the EIGRP reply packets were being dropped on the output queue? No—in fact, the reply packets just weren’t being generated. So what was going on?

After collecting several show tech outputs, and looking over them rather carefully, there was one odd thing: there was a lot of free memory on these boxes, but the largest block of available memory was really small. In old IOS, memory was allocated per process on an “as needed basis.” In fact, processes could be written to allocate just enough memory to build a single packet. Of course, if two processes allocate memory for individual packets in an alternating fashion, the memory will be broken up into single packet sized blocks. This is, as it turns out, almost impossible to recover from. Hence, memory fragmentation was a real thing that caused major network outages.
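The "lots of free memory, tiny largest block" symptom can be reproduced with a small sketch. This is a simplified model, not how IOS actually managed memory: the heap is a flat list of units, the two processes "A" and "B" are hypothetical, and `heap_units` is assumed to be a multiple of `packet_units`.

```python
def largest_free_run(heap):
    """Return the length of the longest run of free (None) units."""
    best = run = 0
    for owner in heap:
        run = run + 1 if owner is None else 0
        best = max(best, run)
    return best


def simulate_fragmentation(heap_units=64, packet_units=4):
    """Interleave packet-sized allocations from two processes, then free one.

    Returns (total free units, largest contiguous free block).
    """
    heap = [None] * heap_units
    owners = ["A", "B"]  # hypothetical process names
    # The two processes take turns allocating one packet-sized block each.
    for i, start in enumerate(range(0, heap_units, packet_units)):
        owner = owners[i % 2]
        for unit in range(start, start + packet_units):
            heap[unit] = owner
    # Process "A" releases everything it allocated.
    heap = [None if owner == "A" else owner for owner in heap]
    return heap.count(None), largest_free_run(heap)


free_total, largest = simulate_fragmentation()
print(f"free units: {free_total}, largest free block: {largest}")
```

Half the heap is free at the end, yet no free block is bigger than a single packet, so any request larger than one packet fails even though plenty of memory is nominally available.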

Here what we were seeing was EIGRP allocating single packet memory blocks, along with several other processes on the box. The thing is, EIGRP was actually allocating some of the largest blocks on the system. So a query would come in, get dumped to the EIGRP process, and the building of a response would be placed on the work queue. When the worker ran, it could not find a large enough block in which to build a reply packet, so it would patiently put the work back on its own queue for future processing. In the meantime, the SIA timer is ticking in the neighboring router, eventually timing out and resetting the adjacency.

Resetting the adjacency, of course, causes the entire table to be withdrawn, which, in turn, causes… more queries to be sent, resulting in the need for more replies… Causing the work queue in the EIGRP process to attempt to allocate more packet sized memory blocks, and failing, causing…

You can see how this quickly developed into a positive feedback loop—

  • EIGRP receives a set of queries to which it must respond
  • EIGRP allocates memory for each packet to build the responses
  • Some other processes allocate memory blocks interleaved with EIGRP’s packet sized memory blocks
  • EIGRP receives more queries, and finds it cannot allocate a block to build a reply packet
  • EIGRP SIA timer times out, causing a flood of new queries…

Rinse and repeat until the network fails to converge.
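The loop above can be sketched as a toy simulation. Everything here is an assumption made for illustration: the function name, the idea of a fixed number of replies the router can build per round, and the number of new queries each adjacency reset generates. None of these values come from the incident itself.

```python
def simulate_sia_loop(rounds=5, initial_queries=10,
                      replies_per_round=0, queries_per_reset=20):
    """Toy model of the query backlog on a router that cannot build replies.

    replies_per_round stands in for how many reply packets the router can
    allocate memory for; 0 models the fully fragmented case.
    Returns the backlog of pending queries after each round.
    """
    pending = initial_queries
    history = []
    for _ in range(rounds):
        answered = min(pending, replies_per_round)
        unanswered = pending - answered
        # Any unanswered query eventually trips a neighbor's SIA timer,
        # which resets the adjacency and triggers a fresh batch of queries.
        resets = 1 if unanswered > 0 else 0
        pending = unanswered + resets * queries_per_reset
        history.append(pending)
    return history


print("fragmented router:", simulate_sia_loop(replies_per_round=0))
print("healthy router:   ", simulate_sia_loop(replies_per_round=100))
```

In the healthy case the backlog drains to zero and stays there. In the fragmented case every round ends with more outstanding work than the one before, which is the signature of a positive feedback loop.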

There are two basic problems with positive feedback loops. The first is they are almost impossible to anticipate. The interaction surfaces between two systems just have to be both deep enough to cause unintended side effects (the law of leaky abstractions almost guarantees this will be the case at least sometimes), and opaque enough to prevent you from seeing the interaction (this is what abstraction is supposed to do). There are many ways to solve positive feedback loops. In this case, cleaning up the way packet memory was allocated in all the processes in IOS, and, eventually, giving the active process in EIGRP an additional, softer, state before it declared a condition of “I’m outside the state machine here, I need to reset,” resolved most of the incidents of SIA’s in the real world.

But rest assured—there are still positive feedback loops lurking in some corner of every network.

Rehashing Certifications

While at Cisco Live in Barcelona this week, I had a chat with someone—I don’t remember who—about certifications. The main point that came out of the conversation was this:

One of the big dangers with chasing a certification is you will end up chasing knowledge about using a particular vendor feature set, rather than chasing knowledge about a technology.

At some point I’m going to edit and post a video short on engineering versus meta-engineering (no, it won’t be next week), but the danger is real. For instance, in an article I’ve had in my bookmarks pile for a long while, the author says:

My boss advised me that getting my WPCE (WordPerfect Certified Resource) cert would accomplish two things: 1. It would establish my credibility as a trainer; and 2. If I didn’t know a feature before the test, I sure as heck would afterward.

I’m not going to name the author, because this is his description of thinking through a certification many years ago, rather than his current thinking on certifications—but the example is telling. I know a lot of folks studying for certifications. They mostly spend their time labbing up various protocols and… features. The temptation to focus on features is real because—

  • The test is going to test you on features
  • Learning the features is the fastest way to pass the test

This might sound like a repetition, but many certification tests place the candidate on a very tight time leash, which means fast is important. When fast is important, you don’t have time to look up features, or study your options.

So what should we do about all of this?

First, not much can be done. I don’t really know how you write a certification that does not allow someone who has memorized the feature guide to do well. How do you test for protocol theory, and still have a broad enough set of test questions that they cannot be photographed and distributed? The problems here are not as simple as they first seem. The CCDE, I think, comes as close as any test I’ve been involved in to testing theory and concepts, rather than features.

Second, this is why I argue you should get a few certifications, and then go get a college degree. The degree might teach you things you don’t ever think you will need—but this fails to understand the point of a degree. Degree programs should not be designed like a vocational school. They should not be about learning the latest language, but rather about writing skills, thinking skills, and programming skills (in general). A good argument can still be made for a Masters Degree in Computer Science.

Finally, you will get out of certifications what you put into them. If you focus on the features, then you are going to learn the features just fine. If you do this, though, each time a new box comes out your certification will lose a little more value.

Certifications are good, when used right. They can also be “bad,” when used poorly. It’s worth thinking about.

Learning to Ask Questions

A lot of folks ask me about learning theory—they don’t have the time for it, or they don’t understand why they should. This video is in answer to that question.

One Weird Trick

I’m often asked what the trick is to become a smarter person—there are many answers, of course, which I mention in this video. But there is “one weird trick” many people don’t think about, which I focus on here.

Responding to Readers: How are these things discovered?

A while back I posted on section 10 routing loops; Daniel responded to the post with this comment:

I am curious how these things are discovered. You said that this is a contrived example, but I assume researchers have some sort of methodology to discover issues like this. I am sure some things have been found through operational mishap, but is there some “standardized” way of testing graph logic for the possibility of loops? I trust this is much easier to do today than even a decade ago.

You would think there would be some organized way to discover these kinds of routing loops, something every researcher and/or protocol designer might follow. The reality is far different—there is no systematic way that I know of to find this sort of problem. What happens, in real life, is that people with a lot of experience at the intersection of protocol design, the bounds of different ways of finding loop free paths (solving the loop free path problem), and a lot of experience in deploying and operating a network using these protocols, will figure these things out because they know enough about the solution space to look for them in the first place.

I don’t know who actually discovered this problem; it is “just” a comment in the RFC, and these kinds of comments are not normally attributed. It might have even been something that developed on a mailing list, or in private conversation between folks sitting at a table drawing diagrams on a napkin. But I would bet it was the normal sort of process—one of two ways:

  • Someone thinks: “given the way this works, there should be a loop in there…” They sit down with someone else, and think through how it could happen. Then they go find examples of it in the real world, by talking to folks who have seen the loop but could not figure out how it happened.
  • Someone sees a loop, and thinks: “now why did that happen??” They talk to some other folks who know the protocol, sketch the problem out on a napkin, and they work together to figure it out.

There are three key points here. The first is the importance of knowing not only how to configure the protocol, but how the protocol really works. The second is not only knowing how the protocol works, but enough of the theory behind why it works to be able to relate the theory to the reality you are seeing in the network. The third is having someone to talk to with the same sort of understanding, who can hash out what you are seeing, and why.

In other words: operational experience, theoretical understanding, and community.

If these three sound familiar—they should.

Learning to Ask Questions

One thing I’m often asked in email and in person is: why should I bother learning theory? After all, you don’t install SPF in your network; you install a router or switch, which you then configure OSPF or IS-IS on. The SPF algorithm is not exposed to the user, and does not seem to really have any impact on the operation of the network. Such internal functionality might be neat to know, but ultimately–who cares? Maybe it will be useful in some projected troubleshooting situation, but the key to effective troubleshooting is understanding the output of the device, rather than in understanding what the device is doing.

In other words, there is no reason to treat network devices as anything more than black boxes. You put some stuff in, other stuff comes out, and the vendor takes care of everything in the middle. I dealt with a related line of thinking in this video, but what about this black box argument? Do network engineers really need to know what goes on inside the vendor’s black box?

Let me answer this question with another question. When you shift to a new piece of hardware, how do you know what you are trying to configure? Suppose, for instance, that you need to set up BGP route reflectors on a new device, and need to make certain optimal paths are taken from eBGP edge to eBGP edge. What configuration commands would you look for? If you knew BGP as a protocol, you might be able to find the right set of commands without needing to search the documentation, or do an internet search. Knowing how it works can often lead you to knowing where to look and what the commands might be. This can save a tremendous amount of time.

Back up from configuration to making equipment purchasing decisions, or specifying equipment. Again, rather than searching the documentation randomly, if you know what protocol level feature you need the software to implement, you can search for the specific support you are looking for, and know what questions to ask about the possible limitations.

And again, from a more architectural perspective–how do you know what protocol to specify to solve any particular problem if you don’t understand how the protocols actually work?

So from configuration to architecture, knowing how a protocol works can actually help you work faster and smarter by helping you ask the right questions. Just another reason to actually learn the way protocols work, rather than just how to configure them.

Master of None

Should you be a johnny do-it-all, or so deep that no-one understands what you are saying? It’s time to talk about the shape of knowledge—and how important it is to be intentional about the shape of your knowledge.