Architecture and Process

Driving through some rural areas east of where I live, I noticed a lot of collections of buildings strung together being used as homes. The process seems to start when someone takes a travel trailer, places it on blocks (a foundation of sorts) and builds a spacious deck just outside the door. Over time, the deck is covered, then screened, then walled, becoming a room.

Once the deck becomes a room, a new deck is built, and the process begins anew. At some point, the occupants decide they need a place to store some sort of equipment, so they build a shed. Later, the shed is connected to the deck, the whole thing becomes an extension of the living space, and a new shed is built.

These … interesting … places to live are homes to the people who live in them. They are often, I assume, even happy homes.

But they are not houses in the proper sense of the word. There is no unifying theme, no thought of how traffic should flow and how people should live. They are a lot like the paths crisscrossing a campus—built where the grass died.

Our networks are like these homes—they are not houses so much as historical records of every new idea and vendor marketing drive. There is no architecture; there are many architectures strung together with a set of tightly wound and closely followed processes.

We need to support some new application or service? Throw a new overlay on top. There was a massive failure last night? Let’s spend hours closely examining our process and find some way to prevent the failure by adding a few new steps.

We never ask if our goals are realistic because we don’t have any goal beyond: “Let’s solve this problem right now.” We never ask if there is some future goal that might be better served by using this solution or that—the future will take care of itself.

Why do we fail to attend to architecture?

Architecture is hard, and we often fail to correctly anticipate the future. This objection treats architecture as a detailed plan—but there’s no reason it should be. An architecture can be a rough, slow-changing outline of how the network is laid out, a set of services the network supports, and a set of technologies the network will use to support those services. An architecture recognizes and defines limits as well as capabilities.

Processes are comforting. When things fail, we can always take comfort in saying: “I followed the process!”

We live in a culture of now. All problems take two hours, two days, two weeks, or too long. There is no history, there is no future, there is only an ever-present now. If I cannot have it now, it is not worth having at all.

These problems are hard to solve because they are cultural rather than technical—and the network engineering world has a strong bias towards “don’t tell me how it works, tell me how to configure it.” We present this as a problem-solving mentality, even though it causes more problems than it solves.

We need to rebalance the way we think about architectures and processes—perhaps we would get better results by combining lightweight architectures with lightweight processes, instead of relying on heavy processes with no architecture to build maintainable networks and sustainable lives.

AI Assistants

I have written elsewhere about the danger of AI assistants leading to mediocrity. Humans tend to rely on authority figures rather strongly (see Obedience to Authority by Stanley Milgram as one example), and we often treat “the computer” as an authority figure.

The problem is, of course, Large Language Models—and AI of all kinds—are mostly pattern-matching machines or Chinese Rooms. A pattern-matching machine can be pretty effective at many interesting things, but it will always be, in essence, a summary of “what a lot of people think.” If you choose the right people to summarize, you might get close to the truth. Finding the right people to summarize, however, is beyond the powers of a pattern-matching machine.

Just because many “experts” say the same thing does not mean the thing is true, valid, or useful.

AI assistants can make people more productive, at least in terms of sheer output. Someone using an AI assistant will write more words per minute than someone who is not. Someone using an AI assistant will write more code daily than someone who is not.

But is it just more, or is it better?

Measuring the mediocratic effect of using AI systems, even as an assistant, is difficult. We have the example of drivers using a GPS, never really learning how to get anyplace (and probably losing all larger sense of geography), but these things are hard to measure.

However, a recent research paper on programming and security has shown at least one place where this effect can be measured. Noting that most kinds of social research are problematic (they are hard to replicate, it’s hard to infer valid results accurately, etc.), this one seems well set up and executed, so I’m inclined to put at least some trust in the results.

The researchers asked programmers worldwide to write software to perform six different tasks. They constructed a control group that did not use AI assistants and a test group that did.

The result? In almost every case, participants using the AI assistant wrote much less secure code, including mistakes in building encryption functions, creating a sandbox, allowing SQL injection attacks, local pointers, and integer overflows. Participants made about the same number of mistakes in randomness—a problem not many programmers have taken the time to study—and fewer mistakes in buffer overflows.

It is possible, of course, for companies to create programming-specific AI assistants that might resolve these problems. Domain-specific AI assistants will always be more accurate and useful than general-purpose assistants.

Relying on AI assistants improves productivity but also seems to create mediocre results. In many cases, mediocre results will be “good enough.”

But what about when “good enough” isn’t … good enough?

Humans are creatures of habit. We do what we practice. If you want to become a better coder, you need to practice coding—and remember that practice does not make perfect. Perfect practice makes perfect.

 

On Writing Complexity

I’ve been on a bit of a writer’s break after finishing the CCST book, but it’s time to rekindle my “thousand words a day” habit. As always, one part of this is thinking about how I write—is there anything I need to change? Tools, perhaps, or style?

What about the grade level complexity of my writing? I’ve never really paid attention to this, but I’m working on contributing regularly to a site that does. So maybe I should.

I tend to write to the tenth- or eleventh-grade level, even when writing “popular material,” like blog posts. The recommended level is around the eighth-grade level. Is this something I need to change?

It seems the average person considers anything above the eighth-grade reading level “too hard” to read, so they give up. Every reading level calculation I’ve looked at essentially uses word and sentence length as proxies for complexity. Long words and sentences intimidate people.
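To make the "word and sentence length as proxies" point concrete, here is a minimal sketch of one widely used calculation, the Flesch-Kincaid grade level. The formula's constants are the published ones; the function names and the syllable counter are my own assumptions for illustration, and the vowel-group heuristic is deliberately crude (it miscounts silent vowels, for instance).

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not a dictionary lookup):
    count runs of consecutive vowels, with a minimum of one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def grade_level(text: str) -> float:
    """Flesch-Kincaid grade level: driven only by average sentence
    length (in words) and average word length (in syllables)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
```

Nothing in this calculation looks at meaning. Two inputs go in, sentence length and syllable count, so a sentence that says nothing in short words will still score as "easy."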

On the other hand, measuring the reading grade level can seem futile. There are plenty of complex concepts described by one- and two-syllable words. Short sentences can still have lots of meaning.

Further, the reading grade level does not tell you if the sentence makes sense. A famous politician recently said, “… it’s time for us to do what we have been doing, and that time is every day.” The reading grade level of this sentence is in the sixth grade—but saying nothing is still saying nothing, even if you say it at a sixth-grade level.

While reading level complexity might be important, it is more important to say something.

Sometimes, using long words and sentences stops people from paying attention to your words. However, replacing long words and sentences with shorter ones sometimes removes your words’ real meaning (or at least flavor). I am not, at this point, certain how to balance these. I suspect I will have to consider the tradeoff in every situation.

When you write—and if you are doing your job as a network engineer well, you do write—you might want to consider the complexity of your writing. I will use the grade level as “another tool” in my set, which means I’ll be thinking about writing complexity more—but I’m not going to allow it to drive my writing style. If I can reduce the complexity of my writing without losing meaning, I may … sometimes … or I might not. 😊

Looking at the other side of the coin—what about reading grade level from a reader’s point of view? Should we only read easy-to-read things? The answer should be obvious: no.

There is a bit of a feeling that text above a certain reading level is “sheer nonsense.” Again, though, the grade level has nothing to do with the value of the content. Sometimes, saying complex things just requires complex text. Readers (all of us) need to learn to read complex text.

Reading grade level is a good tool in many situations—but it is one tool among many.

Making Networking Cool Again? (2)

Network engineering is not “going away.” Network engineering is not less important than it was yesterday, last year, or even a decade ago.

But there still seems to be a gap somewhere. There are fewer folks interested than we need. We need more folks who want to work as full-time network engineers, and more folks with network engineering skills diffused within the larger IT community. More of both is the right answer if we’re going to continue building large-scale systems that work. The real lack of enthusiasm for learning network engineering is hurting all of IT, not just network engineering.

How do we bridge this gap? We’re engineers. We solve problems. This seems to be a problem we might be able to solve (unlike human nature). Let’s try to solve it.

As you might have guessed, I have some ideas. These are not the only ideas in the world—feel free to think up more!

If you walk into a robotics class, even an introductory robotics class, you see people … building robots. If you walk into a coding class, even an introductory one, you see people … writing software. If you walk into a network engineering class, you see … someone lecturing about the OSI model, packet formats, or how to configure BGP.

What problems are people learning to solve in robotic engineering? How to build a robot and get it to do something to solve a real-world problem. What problems are people learning to code solving? How to tackle some real-world problem.

Sure, the problems being solved at an introductory level might be trivial, like: “Read this file and spit out a sum of the numbers in the fourth column.” But they are still starting, right from the beginning, by taking requirements and converting them into solutions.

What problems are network engineers learning how to solve? How to choose hardware, string it all together, and configure BGP.

Do you see the difference?

All engineers solve problems—it’s the nature of engineering. But are we creating a mindset in prospective network engineers, or even adjacent fields, that we solve real-world problems? Or are we giving them the impression that we solve whiteboard problems by talking about bits, bytes, configurations, and cable types?

Have you ever seen the glazing over of eyes while explaining how you put four transport protocols on top of one another (look at all the pretty tunnels)? How about when you create a chart showing how TCP and QUIC can be “kind-of sort-of” forced into the OSI model? Or when you spin out your BGP packet format charts, showing how we’ve (mis)used address families to carry everything anyone can imagine?

I’ve been teaching this stuff for years (okay, decades). Over time, I’ve moved away from teaching configurations and packet formats. I’ve gone from Advanced IP Network Design to Computer Networking Problems and Solutions. These are very different ways of looking at network engineering.

Focusing on real-world problems would help connect business and other IT folks to the network, connect theory to practice, and people to network engineering. Going home at the end of the day saying, “I solved a problem,” can be satisfying. Going home at the end of the day saying, “I configured BGP?”

Another thing adopting the mindset of solving real-world problems might do is help us lose unnecessary complexity. I know complexity is necessary to build resilient systems; we cannot build what we build without creating and encountering complexity.

But we often run ourselves into the ditch on both sides of the road.

We unintentionally build things that are too complex because we try to make them too simple. Quick, which is simpler: building a data center fabric with one routing protocol or two? A single chassis system or several smaller fixed format devices? A proprietary system or something built on open standards?

How many balloons fit in a bag? (thanks, Don)

Failing to start with the tradeoffs, and thinking through what problem we’re actually trying to solve, leads to unnecessary complexity. Such designs might not immediately fail, but they will fail, and “it’s so complex” just isn’t an excuse.

Don’t even try to tell me there aren’t any tradeoffs. If you think there aren’t any tradeoffs, that just means you haven’t looked hard enough. Go find them, think about them, and document them.

We also build complex things because we think it offers job security, or it’s neat, or we like to feel like the kid who says to the world, “look what I built!”

I know it’s exciting to hear stories about that time someone rescued a network from a major failure—after all, that’s solving a real-world problem. Building a network that just works might be “boring,” but it solves many more real-world problems than raising a network from the dead.

We love our fashionable capes, but … capes can get caught in a nearby jet engine. Lose the cape. In the long run, it’ll make network engineering more attractive as a career field and field of knowledge.

The Bottom Line

No, the sky is not falling. We still need networks, and we still need network engineers.

Yes, there is a problem. Too many companies are going “to the cloud” because they cannot find people qualified to build and maintain their very complex networks. There’s too much centralization and too little openness.

So maybe let’s stop saying “we don’t need network engineers.” And maybe let’s really think about how we’re building things. And maybe let’s focus on solving real-world problems, starting from day one in network engineering classrooms.

Network engineering is still cool—let’s go out there believing—and selling—that idea to the world.

Making Networking Cool Again? (1)

Is network engineering still cool?

It certainly doesn’t seem like it, does it? College admissions seem to be down in the network engineering programs I know of, and networking certifications seem to be down, too. Maybe we’ve just passed the top of the curve, and computer networking skills are just going the way of coopering. Let’s see if we can sort out the nature of this malaise and possible solutions. Fair warning—this is going to take more than one post.

Let’s start here: It could be that computer networking is a solved problem, and we just don’t need network engineers any longer.

I’ve certainly heard people say these kinds of things—for instance, one rather well-known network engineer said, just a few years back, that network engineers would no longer be needed in five years. According to this view, the entire network should be like a car. You get in, turn the key, and it “just works.” There shouldn’t be any excitement or concern about a commodity like transporting packets. Another illustration I’ve heard used is “network bandwidth should just be like computer memory—if you need more, add it.”

Does this really hold, though? Even if we accept the car and computer memory illustrations, and accept that individual routers are like these things, is an entire network system like a car? A closer analogy for a network in the world of cars would be an entire transportation system.

You have different kinds of physical transport (rail, over-the-road trucks, air travel, ships, etc.), each with its own characteristics, and all of which must be connected to move physical objects from one place to another. There must be some kind of “control plane” that coordinates all of this, along with shared addressing, formatting rules, etc.

While a single car might, in some sense, be a commodity at this point (and I’ll bet there aren’t many car owners who would wholly agree with that characterization), I don’t see how we could call an entire transportation system a commodity—especially if we want to say “the skills needed to build a transportation system just aren’t needed any longer, there’s nothing more to learn, this is so … boring …”

Let’s dispense with this idea that networks just aren’t needed any longer. We must still build networks that carry traffic between servers, cities, countries, and continents. Building these networks is still a hard problem. Even if there is less room to improve these things than ten or twenty years ago, the problems are still hard. Even if many problems are solved at a broad level, not every problem is solved in every network in the universe.

A more reasonable take on this perspective is that networking skills are diffusing into a larger information technology (IT) skill set. Perhaps IT, in its relative “youth,” divided too sharply and finely—we created too many career fields. What is happening right now, then, is just a kind of right sizing in the market.

Network engineering skills, in fact, do seem to be dispersing to one degree or another. But let’s put this in perspective.

The first point is I’m not convinced there are fewer network engineers. Instead, it’s more likely there are just as many network engineers as there ever have been, if not more. Perhaps, though, “real” network engineering has been growing linearly while all the other IT fields have been growing at a rate faster than linear (I don’t want to say exponential, just something more than linear).

In a world that counts lack of growth as a failure, networking growing at a slower pace than, say, programming seems like a failure from the outside. People like to follow winners; growing is winning; network engineering is not growing as fast as other things, so network engineering is failing.

I dislike the modern progressive mindset—but while I’m working on something in this area, this isn’t the time or place to dive into this topic. Let’s agree that we must let go of the idea: “Growing slower is a failure.”

Returning to the idea of transportation—I will just about bet automobile designers built entire departments in the early days of car manufacturing. Today, there might be just as many automobile designers as ever. They’re just buried in large car manufacturing, servicing, etc., companies, so it feels like there are a lot fewer than there were.

Just because most new engineers must learn many different things, and network engineering skills are diffusing into many different areas of IT, does not mean network engineering is dying, regardless of what it might look like from the outside.

Second, there is nothing wrong with network engineering skills diffusing into the larger IT skill set. Has anyone reading this ever really been a “pure” network engineer? If so, I don’t know whether to envy or feel sorry for you.

When building networks in the military, I had to deal with all the politics of customer relationships and understanding mission needs. When taking cases in technical support, I had to deal with time management and customer-facing skills—and I needed to learn or use coding skills to be an effective network engineer. Today, I do network engineering, like I always have, but I work on security, privacy, DNS, coding, and all sorts of other things.

I cannot think of a time in my career when I would have considered myself a “pure network engineer.” I’ve always had to find and build adjacent skills to design, build, and maintain networks. I would say this is truer today than ever, but I do not believe my skills as a network engineer are any less useful than they have ever been.

Where does all of this leave us?

Let’s continue the discussion in part 2 next week.

Thoughts on 2023

As we close out 2023, some random observations about engineering, culture, and life.

Network engineering needs help. I am hearing, from all over the place, that network engineering is “not cool.” There is a dearth of students entering the pipeline. College programs are struggling, and many organizations are struggling with a lack of engineering talent—in fact, I would guess the most common reason for companies to move to “the cloud” is that they cannot find anyone who knows how to build and operate a network any longer.

It probably didn’t help that for the last few years many “thought leaders” in the network engineering space have been saying there is no future in network engineering. It also doesn’t help that network engineering training has become stilted and … boring. Coders are off talking about how to solve problems. Robotics folks are working on cool projects that solve problems.

Network engineers are being taught how to spend less money and told to “find another career.”

I don’t know how we think we can sustain a healthy world of IT without network engineers.

And yes, I know there are folks who think networking problems are simple, easy enough to solve with some basic software knowhow. I think I have enough knowledge and experience of the wider world of information technology to say those folks are wrong.

I’d actually like to help solve this specific problem. I’ve been looking for a Christian college someplace in the US interested in starting or growing a strong engineering program. Someplace where I can join a team to help build and teach an entire program from the first class to the last. If anyone knows of such a place, get in touch. We need to make network engineering cool again.

How much did you read this year? I read just over 40 books this year, not many of which were technology related. If you don’t read regularly, why not?

How much did you create this year? I wrote one book—the CCST Official Study Guide. I’ve written two dozen articles or so and created a few new slide decks. I’m working on several new live webinars with Pearson through Safari Books Online, including interview skills, open-source labs, some work around coding skills, and a few other things.

If you aren’t creating new things, why not?

Big is, for the most part, bad. I’ve started thinking that one of the worst things about technology-driven culture is how deeply it has enabled and taught—even encouraged—us to be passive-aggressive.

For instance, I’ve been “lifetime banned” from eBay. Why? I’ve no idea—I barely even use eBay. I logged in, listed a few items for sale, and then couldn’t log back in again. I tried to reset my password—the service accepted my new password, but still refused to allow me to log in. No notifications, no email, no … anything. I called customer support and was told I have been “banned for life.” They will not discuss why, only that some “system flagged my account.”

It is just this kind of “the computer says you are a bad person, and we will not explain why” thing that makes people dislike technology companies so deeply.

As always, feel free to get in touch if you have thoughts, want to chat, or have an idea for an episode of the Hedge.

Simple or Complex?

A few weeks ago, Daniel posted a piece about using different underlay and overlay protocols in a data center fabric. He says:

There is nothing wrong with running BGP in the overlay but I oppose to the argument of it being simpler.

One of the major problems we often face in network engineering—and engineering more broadly—is confusing that which is simple with that which has lower complexity. Simpler things are not always less complex. Let me give you a few examples, all of which are going to be controversial.

When OSPF was first created, it was designed to be a simpler and more efficient form of IS-IS. Instead of using TLVs to encode data, OSPF used fixed-length fields. To process the contents of a TLV, you need to build a case/switch construction where each possible type has a separate bit of code. You must count off the correct length for the type of data, or (worse) read a length field and count out where you are in the stream.

Fixed-length fields are just much easier to process. You build a structure matching the layout of the fixed-length fields in memory, then point this structure at the packet contents in-memory. From there, you can just use the structure’s contents to directly access the data.
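The difference between the two parsing styles can be sketched in a few lines. This is an illustration under stated assumptions, not the real OSPF or IS-IS encoding: the header layout, the type codes (1 for a metric, 2 for a hostname), and all the function names are hypothetical.

```python
import struct

# Hypothetical fixed-length layout (not a real OSPF header):
# type (1 byte), flags (1 byte), metric (2 bytes), router ID (4 bytes).
FIXED_HEADER = struct.Struct("!BBHI")


def parse_fixed(packet: bytes) -> dict:
    """Fixed-length fields: overlay one structure on the bytes and read."""
    msg_type, flags, metric, router_id = FIXED_HEADER.unpack_from(packet, 0)
    return {"type": msg_type, "flags": flags,
            "metric": metric, "router_id": router_id}


def parse_tlvs(packet: bytes) -> list:
    """TLVs: read a type and a length, then count out the value bytes."""
    tlvs = []
    offset = 0
    while offset < len(packet):
        if offset + 2 > len(packet):
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from("!BB", packet, offset)
        offset += 2
        if offset + length > len(packet):
            raise ValueError("TLV length runs past the end of the packet")
        tlvs.append((tlv_type, packet[offset:offset + length]))
        offset += length
    return tlvs


def handle_tlv(tlv_type: int, value: bytes) -> tuple:
    """The case/switch construction: one branch of code per known type.
    The type codes here are made up for this sketch."""
    if tlv_type == 1:
        return ("metric", struct.unpack("!H", value)[0])
    if tlv_type == 2:
        return ("hostname", value.decode("ascii"))
    # Unknown types are stepped over rather than treated as errors.
    return ("unknown", tlv_type)
```

The fixed-length parser is one call; the TLV parser needs a loop, bounds checks, and a branch per type. The last branch of `handle_tlv` is worth noticing, though: because every TLV carries its own length, a parser can step over a type it has never seen, which is part of why new data can be added without changing the layout.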

Over time, however, as new requirements have been pushed into IGPs, OSPF has become much more complex while IS-IS has remained relatively constant (in terms of complexity). IS-IS went through a bit of a mess when transitioning from narrow to wide metrics, but otherwise the IS-IS we use today is the same protocol we used when I first started working on networks (back in the early 1990s).

OSPF’s simplicity, in the end, did not translate into a less complex protocol.

Another example is the way we transport data in BGP. A lot of people do not know that BGP’s original design allowed for carrying information other than straight reachability in the protocol. BGP speakers can negotiate multiple sessions, with each session carrying a different kind of information. Rather than using this mechanism, however, BGP has consistently been extended using address families—because it is simpler to create a new address family than it is to define a new kind of data parallel with address families.

This has resulted in AFs that are all over the place, magic numbers, and all sorts of complexity. The AF solution is simpler, but ultimately more complex.

Returning to Daniel’s example, running a single protocol for underlay and overlay is simpler, while running two different protocols is less simple. However, I’ve observed—many times—that running different protocols for underlay and overlay is less complex.

Why? Daniel mentions a couple of reasons, such as each protocol has a separate purpose, and we’re pushing features into BGP to make it serve the role of an IGP (which is, in the end, going to cause some major outages—if it hasn’t already).

Consider this: is it easier to troubleshoot infrastructure reachability separately from VRF reachability? The answer is obvious—yes! What about security? Is it easier to secure a fabric when the underlay never touches any attached workload? Again—yes!

We get this tradeoff wrong all the time. A lot of times this is because we are afraid of what we do not know. Ten years ago I struggled to convince large operators to run BGP in their networks. Today no one runs anything other than BGP—and they all say “but we don’t have anyone who knows OSPF or IS-IS.” I’ve no idea what happened to old-fashioned network engineering. Do people really only have one “protocol slot” in their brains? Can people really only ever learn one protocol?

Or maybe we’ve become so fixated on learning features that we no longer know protocols?

I don’t know the answer to these questions, but I will say this—over the years I’ve learned that simpler is not always less complex.