

Autonomic, Automated, and Reality

Once the shipping department drops off the box with that new switch, router, or “firewall,” what happens next? You rack it, cable it up, turn it on, and start configuring, right? There are access controls to configure—SSH, keys, disabling standard accounts, disabling telnet—interface addresses to configure, routing adjacencies to configure, local policies to configure, and… After configuring all of this, you can adjust routing in the network to route around the new device, and then either canary the device “in production” (if you run your network the way it should be run), or find some prearranged maintenance time to bring the new device online and test things out. After all of this, you can leave the new device up and running in the network, and move on to the next task.

Until it breaks.

Then you consult the documentation to remind yourself why the device was configured this way, consult more documentation to understand how the application everyone is complaining about should work, and so on. Then there are the many hours spent sitting on the console, gathering information by running various commands and reading the output of various logs. Eventually, once you find the problem, you can either replace the right parts or reconfigure the right bits, and get everything running again.

In the “modern” world (such as it is), we think it’s a huge leap forward to stop configuring devices manually. If we can just automate the configuration of all that “stuff” we have to do at the beginning, after the box is opened and before the device is placed into service, we think we have this whole networking thing pretty well figured out.
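To be clear about what that automation amounts to: most of the “day zero” work described above reduces to rendering the same baseline configuration from an inventory. The sketch below is a minimal illustration in Python; the inventory, addresses, and CLI syntax are all hypothetical, not any particular vendor’s tooling or API.

```python
# A minimal sketch of "day zero" configuration automation: render the same
# baseline (management access, interface addresses, a routing stanza) for
# every new device from a small inventory. All names, addresses, and CLI
# syntax here are hypothetical.

INVENTORY = [
    {"hostname": "leaf01", "mgmt_ip": "10.0.0.11/24",
     "uplinks": ["198.51.100.1/31"], "asn": 65011},
    {"hostname": "leaf02", "mgmt_ip": "10.0.0.12/24",
     "uplinks": ["198.51.100.3/31"], "asn": 65012},
]

def render_base_config(device: dict) -> str:
    """Return a baseline configuration for one device."""
    lines = [
        f"hostname {device['hostname']}",
        "no service telnet",                 # disable legacy access
        "ssh server enable",                 # keys are pushed separately
        "interface mgmt0",
        f" ip address {device['mgmt_ip']}",
    ]
    for i, addr in enumerate(device["uplinks"]):
        lines += [f"interface uplink{i}", f" ip address {addr}"]
    lines.append(f"router bgp {device['asn']}")  # adjacency details omitted
    return "\n".join(lines)

if __name__ == "__main__":
    for device in INVENTORY:
        print(render_base_config(device))
        print("!")
```

Nothing in this sketch is conceptually hard, which is exactly why it automates so well.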

Even if you had everything in your network automated, you still haven’t figured this networking thing out.

We need to move beyond automation. Where do we need to move to? It’s not one place, but two. The first is that we need to move beyond automation to autonomous operation. As an example, there is a shiny new system currently being widely deployed to automate the deployment and management of containers. Part of this system is the automation of connectivity, including routing, between containers. The routing deployed as part of this system is essentially statically configured policy-based routing combined with network address translation.
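To make the mechanics concrete: the connectivity layer in such a system amounts to rendering static forwarding state, per node, from a container inventory. The sketch below is illustrative only; the inventory, addresses, and Linux-style commands are hypothetical stand-ins, not a description of any particular orchestration product, and the commands are printed rather than executed.

```python
# A rough sketch of "routing" by automating static state: for each container,
# emit a host route and a NAT rule. Inventory, addresses, and ports are
# hypothetical; the Linux-style commands are printed, not executed.

CONTAINERS = [
    {"name": "web-1", "node_ip": "192.0.2.10", "pod_ip": "10.244.1.5",
     "host_port": 30080, "pod_port": 8080},
    {"name": "web-2", "node_ip": "192.0.2.11", "pod_ip": "10.244.2.7",
     "host_port": 30080, "pod_port": 8080},
]

def static_rules(c):
    """Return the static route and NAT rule needed to reach one container."""
    return [
        # static host route: send traffic for this container to its node
        f"ip route add {c['pod_ip']}/32 via {c['node_ip']}",
        # NAT: map the node's published port to the container's port
        (f"iptables -t nat -A PREROUTING -d {c['node_ip']}/32 -p tcp "
         f"--dport {c['host_port']} -j DNAT "
         f"--to-destination {c['pod_ip']}:{c['pod_port']}"),
    ]

if __name__ == "__main__":
    for c in CONTAINERS:
        for rule in static_rules(c):
            print(rule)
```

Note what is missing: nothing here reacts to failure or topology change on its own. When a container moves, a controller must recompute and re-push every affected entry.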

Let me point something out that is not going to be very popular: this is a step backwards in terms of making the system autonomous. Automating static routing information is not a better solution than building a real, dynamic, proactive, autonomic, routing system. It’s not simpler—trust me, I say this as someone who has operated large networks which used automated static routes to do everything.

The “opsification of everything” is neat, but it shouldn’t be our end goal.

Now part of this, I know, is the fault of vendors. Vendors who push EGPs onto data center fabrics because, after all, “the configuration complexity doesn’t matter so long as you can automate it.” The configuration complexity does matter, because configuration complexity betrays an underlying protocol complexity, and it sets up long and difficult troubleshooting sessions that are completely unnecessary.

The second place we need to move in the networking world? The focus on automation is just another form of focusing on configuration. We abstract the configuration, and we touch a lot more devices at once, but we are still thinking about configuration. The more we think about configuration, the less we think about how the system should work, how it really works, what the gaps are, and how to bridge those gaps. So long as we are focused on the configuration, automated or not, we are not focused on how the network can bring value to the business. The longer we are focused on configuration, the less value we are bringing to the business, and the more likely we are to end up being replaced by … an automated system … no matter how poorly that automated system actually works.

And no, the cloud isn’t going to solve this. Containers aren’t going to solve this. The “automated configuration pattern” is already being repeated in the cloud. As more complex workloads are moved into the cloud, the problems there are only going to get harder. What starts out as a “simple” system using policy-based routing analogs and network address translation configured through an automation server will eventually look complex even when set against the hardest problems we had to solve using T1s, frame relay circuits, inverse multiplexers, wire-down patch panels, and mechanical switch crossbar frames. It’s fun to pretend we don’t need dynamic routing to solve the problems that face the network—at least until you hit hard problems, and have to relearn the lessons of the last 20+ years.

Yes, I know vendors are partly to blame for this. I know that, for a vendor, it’s easier to get people to buy into your CLI, or your entire ecosystem, rather than getting them to think about how to solve the problems your business is handing them.

On the other hand, none of this is going to change from the top down. This is only going to change when the average network engineer starts asking vendors for truly simpler solutions that don’t require reams of configuration information. It will change when network engineers get their heads out of the configuration and features, and into the business problems.

Stop Using the OSI Model

We all use the OSI model to describe the way networks work. I have, in fact, included it in just about every presentation, and every book I have written, someplace in the fundamentals of networking. But if you have ever looked at the OSI model and had to scratch your head trying to figure out how it really fits with the networks we operate today, or what the OSI model is telling you in terms of troubleshooting, design, or operation—you are not alone. Lots of people have scratched their heads about the OSI model, trying to understand how it fits with modern networking. There is a reason this is so difficult to figure out.

The OSI Model does not accurately describe networks.

What set me off in this particular direction this week is an article over at Errata Security:

The OSI Model was created by international standards organization for an alternative internet that was too complicated to ever work, and which never worked, and which never came to pass. Sure, when they created the OSI Model, the Internet layered model already existed, so they made sure to include today’s Internet as part of their model. But the focus and intent of the OSI’s efforts was on dumb networking concepts that worked differently from the Internet.

This is partly true, and yet a bit … over the top. 🙂 OTOH, the point is well taken: the OSI model is not an ideal model for understanding networks. Maybe a bit of analysis would be helpful in understanding why.

First, while the OSI model was developed with packet switching networks in mind, the general idea was to come as close as possible to emulating the circuit-switched networks widely deployed at the time. A lot of thought had gone into making those circuit-switched networks work, and applications had been built around the way they worked. Applications and circuit-switched networks formed a sort of symbiotic relationship, just as applications form with packet-switched networks today; it was unimaginable, at the time, that “everything would change.”

So while the designers of the OSI model understood the basic value of the packet-switched network, they also understood the value of the circuit-switched network, and tried to find a way to solve both sets of problems in the same network. Experience has shown it is possible to build a somewhat close-to-circuit switched network on top of packet switched networks, but not quite in the way, nor as close to perfect emulation, as those original designers thought. So the OSI model is a bit complex and perhaps overspecified, making it less-than-useful today.

Second, the OSI model largely ignored the role of middleboxes, focusing instead on the stacks implemented and deployed in hosts. This, again, makes sense, as there was no such thing as a device specialized in the switching of packets at the time. Hosts took packets in and processed them. Some packets were sent along to other hosts, other packets were consumed locally. Think PDP-11 with some rough code, rather than even an early Cisco CGS.

Third, the OSI model focuses on what each layer does from the perspective of an application, rather than focusing on what is being done to the data in order to transmit it. The OSI model is built “top down,” rather than “bottom up,” in other words. While this might be really useful if you are an application developer, it is not so useful if you are a network engineer.
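One way to see the difference between the two perspectives is to look at a stack “bottom up,” as a series of transformations applied to the data. The sketch below is a caricature, not a real protocol implementation; the header formats are toy placeholders used only to show the shape of the idea.

```python
# A deliberately simplified "bottom up" view of a protocol stack: each layer
# is just a transformation applied to the data so it can be carried across
# the network. Header formats here are toy placeholders, not real TCP/IP.

def transport_wrap(payload: bytes, src_port: int, dst_port: int) -> bytes:
    """Multiplexing between applications (plus, in a real stack, error and flow control)."""
    header = src_port.to_bytes(2, "big") + dst_port.to_bytes(2, "big")
    return header + payload

def network_wrap(segment: bytes, src: str, dst: str) -> bytes:
    """Reachability between hosts that are not directly connected."""
    return f"{src}>{dst}|".encode() + segment

def link_wrap(packet: bytes, src_mac: str, dst_mac: str) -> bytes:
    """Carrying the packet across one physical hop."""
    return f"{src_mac}>{dst_mac}|".encode() + packet

if __name__ == "__main__":
    data = b"GET /index.html"
    frame = link_wrap(
        network_wrap(transport_wrap(data, 49152, 80),
                     "192.0.2.1", "198.51.100.7"),
        "aa:bb:cc:00:00:01", "aa:bb:cc:00:00:02",
    )
    print(frame)
```

From this angle, the questions a network engineer cares about (what is added, what must be carried, and what is stripped or rewritten in flight) fall out naturally; they are much harder to read off a top-down, service-oriented description of the layers.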

So—what should we say about the OSI model?

It was much more useful at some point in the past, when networking was really just “something a host did,” rather than its own sort of sub-field, with specialized protocols, techniques, and designs. It was a very good attempt at sorting out what a network needed to do to move traffic, from the perspective of an application.

What it is not, however, is really all that useful for network engineers working within an engineering specialty to understand how to design protocols, and how to design networks on which those protocols will run. What should we replace it with? I would begin by pointing you to the RINA model, which I think is a better place to start. I’ve written a bit about the RINA model, and used the RINA model as one of the foundational pieces of Computer Networking Problems and Solutions.

Since writing that, however, I have been thinking further about this problem. Over the next six months or so, I plan to build a course around this question. For the moment, I don’t want to spoil the fun, or put any half-baked thoughts out there in the wild.

The Network Sized Holes in Serverless

Until about 2017, the cloud was going to replace all on-premises data centers. As it turns out, however, the cloud has not replaced all on-premises data centers. Why not? Based on the paper under review, one potential answer is because containers in the cloud are still too much like “serverfull” computing. Developers must still create and manage what appear to be virtual machines, including:

  • Machine level redundancy, including georedundancy
  • Load balancing and request routing
  • Scaling up and down based on load
  • Monitoring and logging
  • System upgrades and security
  • Migration to new instances

Serverless solves these problems by placing applications directly onto the cloud, or rather onto a set of libraries within the cloud.

Jonas, Eric, Johann Schleier-Smith, Vikram Sreekanti, Chia-Che Tsai, Anurag Khandelwal, Qifan Pu, Vaishaal Shankar, et al. “Cloud Programming Simplified: A Berkeley View on Serverless Computing.” ArXiv:1902.03383 [Cs], February 9, 2019. http://arxiv.org/abs/1902.03383.

The authors define serverless by contrasting it with serverfull computing. In a serverless environment, software runs in response to an event; in a serverfull (traditional cloud) environment, software runs until it is stopped. An application has no maximum run time in a serverfull environment, while in a serverless environment the provider sets some maximum. The server instance, operating system, and libraries are all chosen by the user in a serverfull environment, but they are chosen by the provider in a serverless environment. The serverless environment is a higher-level abstraction of compute and storage resources than a cloud instance (or an on-premises solution, even a private cloud).
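The contrast is easier to see side by side. The sketch below is generic and hypothetical rather than any provider’s actual API: the serverfull worker owns its process lifetime and its environment, while the serverless version is only a handler the platform invokes per event, with everything else chosen (and limited) by the provider.

```python
# A generic contrast between the serverfull and serverless shapes of the same
# job. The handler signature and the "platform" are invented for illustration;
# real providers each have their own APIs, limits, and runtimes.

import time

# --- serverfull: the developer owns the process, the loop, and the instance ---
def serverfull_worker(job_queue):
    while True:                      # runs until the operator stops it
        job = job_queue.get()        # the developer also owns scaling,
        if job is None:              # upgrades, monitoring, redundancy...
            break
        process(job)

# --- serverless: the developer owns only the function body ---
def handler(event, context):
    # invoked per event; the instance, OS, libraries, and maximum run time
    # are the provider's choice, and state must live outside the function
    return process(event["job"])

def process(job):
    time.sleep(0.01)                 # stand-in for useful work
    return {"done": job}
```

Every item in the bulleted list above disappears from the second version; it does not disappear from the system, it simply moves to the provider.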

These differences add up to faster application development in a serverless environment; application developers are completely freed from system administration tasks to focus entirely on developing and deploying useful software. This should, in theory, free them to focus on solving business problems rather than worrying about the infrastructure. Two key points the authors make about the serverless realm are the complex software techniques used to bring serverless processes up quickly (such as preloading and holding the VM instances that back services), and the security isolation provided through VM-level separation.

The authors provide a section on challenges in serverless environments and the workarounds to these challenges. For instance, one problem with real-time video compression is that the object store used to communicate between processes running on a serverless infrastructure is too slow to support fine-grained communication, while the functions are too coarse-grained to support some of the required tasks. To solve this problem, they propose using function-to-function communication, which moves the object store out of the process. This provides dramatic processing speedups, as well as reducing the cost of the serverless solution to a fraction of the cost of a cloud instance.
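A minimal sketch of the shape of this workaround (the object store and the channel below are invented stand-ins, not the systems described in the paper): instead of every hand-off between functions paying a round trip to shared storage, the producing function sends its results directly to the consumer.

```python
# Two ways for serverless functions to hand off intermediate results. Both
# "channels" below are hypothetical stand-ins used only to show the pattern;
# the systems described in the paper are far more sophisticated.

import queue
import time

class ObjectStore:
    """Stand-in for a cloud object store: durable, but slow per operation."""
    def __init__(self, latency=0.05):
        self._blobs = {}
        self._latency = latency
    def put(self, key, value):
        time.sleep(self._latency)
        self._blobs[key] = value
    def get(self, key):
        time.sleep(self._latency)
        return self._blobs.get(key)

def via_object_store(chunks, store):
    # every hand-off pays a round trip to storage
    for i, chunk in enumerate(chunks):
        store.put(f"chunk-{i}", chunk.upper())
    return [store.get(f"chunk-{i}") for i in range(len(chunks))]

def via_function_to_function(chunks, channel):
    # the producing function sends results straight to the consumer
    for chunk in chunks:
        channel.put(chunk.upper())
    return [channel.get() for _ in chunks]

if __name__ == "__main__":
    frames = ["i-frame", "p-frame", "p-frame"]
    print(via_object_store(frames, ObjectStore()))
    print(via_function_to_function(frames, queue.Queue()))
```

The speedup is real, but notice what it is: the workaround routes around the platform’s own abstraction, a pattern that comes up again below.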

One of the challenges discussed here is the problem of communication patterns, including broadcast, aggregation, and shuffle. Each of these, of course, relies on the underlying network to transport data between the compute nodes on which serverless functions are running. Since the serverless user cannot determine where a particular function will run, the performance of the underlying transport is quite variable. The authors say: “Since the application cannot control the location of the cloud functions, a serverless computing application may need to send two and four orders of magnitude more data than an equivalent VM-based solution.”

And this is where the network sized hole in serverless comes into play. It is common fare today to say the network is “just a commodity.” Speeds and feeds are so high, and so easy to build, that we do not need to worry about building software that knows how to use a network efficiently, or even understands the network at all. Matching the network to software requirements is a thing of the past—bandwidth is all a commodity now.

The law of leaky abstractions, however, will always have its say—a corollary here is that higher-level abstractions will always have larger and more consequential leaks. The solutions offered to each of the challenges listed in the paper are all, in fact, resolved by introducing layering violations which allow the developer to “work around” an inefficiency at some lower layer in the abstraction. Ultimately, such workarounds will compound into massive technical debt, and some “next new thing” will come along to “solve the problems.”

Moving data ultimately still takes time, still takes energy; the network still (often) needs to be tuned to the data being moved. Serverless is a great technology for some solutions—but there is ultimately no way to abstract out the hard work of building an entire system tuned to do a particular task and do it well. When you face abstraction, you should always ask: what is gained, and what is lost?

Learn to Code?

A long, long time ago, in a galaxy far away, I went to school to learn art and illustration. In those long ago days, folks in my art and illustration classes would sometimes get into a discussion about what, precisely, to do with an art degree. My answer was, ultimately, to turn it into a career building slides and illustrations in the field of network engineering. 😊 And I’m only half joking.

The discussion around the illustration board in those days was whether it was better to become an art teacher, or to focus just on the art and illustration itself. The two sides went at it hammer and tongs over weeks at a time. My only contribution to the discussion was this: even if you want to be the ultimate in the art world, a fine artist, you must still have a subject. While much of modern art might seem to be about nothing much at all, it has always seemed, to me, that art must be about something.

This week I was poking around one of the various places I tend to poke on the ‘net and ran across this collage.

Get the point? If you are a coal miner out of work, just learn to code. This struck me as the same sort of argument we used to have in our art and illustration seminars. But what really concerns me is the number of people who are considering leaving network engineering behind because they really believe there is no future in doing this work. They are replacing “learn network engineering” with “learn to code.”

Not to be too snarky, but sure—you do that. Let me know how it goes when you are out there coding… what, precisely? The entire point of coding is to code something useful, not to just run off and build trivial projects you can find in Git and code training classes.

It seems, to me, that if you are an artist, having some in-depth knowledge of something in the real world would make your art have more impact. If you know, for instance, farming, then you can go beyond just getting the light right. If you know farming, then you can paint (or photograph) a farm with much more understanding of what is going on in the images, and hence to capture more than just the light or the mood. You can capture the movement, the flow of work, even the meaning.

In the same way, it seems, to me, that if you are a coder, having some in-depth knowledge of something you might build code for would mean your code will have more impact. It might be good to have a useful subject, like maybe… building networks? Contrary to what seems to be popular belief, building a large-scale network is still not a simple thing to do. So long as there are any kind of resource constraints in the design, deployment, and operation of networks, I do not see how it can ever really be an “easy thing to do.”

So yes—learn to code. I will continue to encourage people to learn to code. But I will also continue to encourage folks to learn how to design, build, and operate big networks. All of us cannot sit around and code full time, after all. There are still engineering problems to be solved in other areas, challenges to be tackled, and things to be built.

Coding is a good skill, but understanding how networks really work, how to design them, how to build them, and how to operate them, are also all good skills. Learning to code can multiply your skills as a network engineer. But if you do not have network engineering skills to start with, multiplying them by learning to code is not going to be the most useful exercise.

Reaction: Open Source

As a long-standing contributor to open standards, and someone trying to become more involved in the open source world (I really need to find an extra ten hours a day!), I am always thinking about these ecosystems, and how they relate to the network engineering world. This article on RedisDB, and in particular this quote, caught my attention—

There’s a longstanding myth in the open-source world that projects are driven by a community of contributors, but in reality, paid developers contribute the bulk of the code in most modern open-source projects, as Puppet founder Luke Kanies explained in our story earlier this year. That money has to come from somewhere.

The point of the article is that a lot of companies that support open source projects, like RedisDB, are moving to more closed-source solutions to survive. The cloud providers are called out as a source of a lot of problems in this article, as they consume a lot of open source software, but do not really spend a lot of time or effort in supporting it. Open source, in this situation, becomes a sort of tragedy of the commons, where everyone thinks someone else is going to do the hard work of making a piece of software viable, so no-one does any of the work. Things are made worse because the open source version of the software is often “good enough” to solve 80% of the problems users need solved, so there is little incentive to purchase anything from the companies that do the bulk of the work in the community.

In some ways this problem relates directly to the concept of disaggregated networking. Of course, as I have said many times before, disaggregation is not directly tied to open source, nor even open standards. Disaggregation is simply seeing the hardware and software as two different things. Open source, in the disaggregated world, provides a set of tools the operator can use as a base for customization in those areas where customization makes sense. Hence open source and commercial solutions complement one another, rather than one replacing the other.

All that said, how can the open source community continue to thrive if some parts of the market take without giving back? Simply put, it cannot. There are ways, however, of organizing open source projects which encourage participation in the community, even among corporate interests. FR Routing is an example of a project I think is well organized to encourage community participation.

There are two key points to the way FR Routing is organized that I think are helpful in controlling the tragedy of the commons. First, there is not just one company in the world commercializing FR Routing. Rather, there are many different companies using FR Routing, either by shipping it in a commercial product, or by using it internally to build a network (and the network is then sold as a service to the customers of the company). Not every user of FR Routing is using only this one routing stack in their products or networks, either. This first point means there is a lot of participation from different companies that have an interest in seeing the project succeed.

Second, the way FR Routing is structured, no single company can gain control of the entire community. This allows healthy debate on features, code structure, and other issues within the community. There are people involved who supply routing expertise, others who supply deployment expertise, and a large group of coders, as well.

One thing I think the open source world does too often is to tie a single project to a single company, and that company’s support. Linux thrives because there are many different commercial and noncommercial organizations supporting the kernel and different packages that ride on top of the kernel. FR Routing is thriving for the same reason.

Yes, companies need to do better at supporting open source in their realm, not only for their own good, but for the good of the community. Yes, open source plays a vital role in the networking community. I would even argue closed source companies need to learn to work better with open source options in their area of expertise to provide their customers with a wider range of options. This will ultimately only accrue to the good of the companies that take this challenge on, and figure out how to make it work.

On the other side of things, open source is probably not going to solve all the problems in the networking, or any other, industry in the future. And the open source community needs to learn how to build structures around these projects that are both more independent, and more sustainable, over the long run.

Whither Network Engineering? (Part 3)

In the previous two parts of this series, I have looked at the reasons I think the networking ecosystem is bound to change and why I think disaggregation is going to play a major role in that change. If I am right about the changes happening, what will become of network engineers? The bifurcation of knowledge, combined with the kinds of networks and companies noted in the previous posts in this series, points the way. There will, I think, be three distinct careers where the “network engineer” currently exists on the operational side:

  1. Moving up the stack, towards business, into a more management-focused role. This will be captured primarily by the companies that operate in market verticals deep and narrow enough to survive without a strong focus on data, and hence can survive a transition to black box, fully integrated solutions. This position will largely be focused on deploying, integrating, and automating vertically integrated, vendor-driven systems and managing vendor relationships.
  2. Moving up the stack, towards software and business, into the disaggregated network engineering role (I don’t have a better name for this at present). This will be in support of companies that value data to the point of focusing on its management as a separate “thing.” The network will no longer be a “separate line item,” however, but rather part of a larger system revolving around the data that makes the company “go.”
  3. Moving down the stack, towards the hardware, into the network hardware role: rack-and-stack, cabling, power, and so on. Again, I do not have a good name for this role right now.

There will still be a fairly strong “soft division” between design and troubleshooting in the second role. Troubleshooting will primarily be handled by the vendor in the first role.

Perhaps the diagram below will help illustrate what I think is happening, and will continue to happen, in the network engineering field.

The old network engineering role, shown in the lower left corner of the two halves of the illustration, focused on the appliances and circuits used to build networks, with some portion of the job interacting with protocols and management tools. The goal is to provide the movement of data as a service, with minimal regard to the business value of that data. This role will, in my opinion, transition to the entire left side of the illustration as a company moves to black box solutions. The real value offered in this new role will be in managing the contracts and vendors used to supply what is essentially a commodity.

On the right side is what I think the disaggregated path looks like. Here the network engineering role has largely moved away from hardware; this will increasingly become a specialized, vendor-driven realm of work. On the other end, the network engineer will focus more on software, from protocols to applications, and how they drive and add value to the business. Again, the role will need to move up the stack towards the business to continue adding value; away from hardware, and towards software.

I could well be wrong; being right or wrong here would not make me either happy or sad.

None of these are invalid choices to make, or bad roles to fill. I do not know which role best fits you, your life, or your interests. I am simply observing what I think is happening in the market, and trying to understand where things are going, because I think this kind of thinking helps provide clarity in a confusing world.

In both the first and second roles, you must move up the stack to add value. This is what happened in the worlds of electronic engineering and personal computers as they both disaggregated away from an appliance model. Living through these past experiences is part of what leads me to believe this same kind of movement will happen in the world of networking technology. Further, I think I already see these changes happening in parts of the market, and I cannot see any reason these kinds of changes should not move throughout the entire market fairly rapidly.

What is the percentage of these two roles in the market? Some people think the second role will simply not exist, in fact, other than at vendors. Others think the second role will be a vanishingly small part of the market. I tend to think the percentages will be more balanced because of shifts in the business environment that are happening in parallel with (or rather driving) these changes. Ultimately, however, the number of people in each role will be driven by the business environment, rather than by the networking world.

Will there be “network engineers” in the future?

If we look at the progress of time from left to right, there is a big bulge ahead, followed by a slope off, and then a long tail. This is my understanding of the current network engineering skill set. We are at A as I write this, just before the big bulge of radical change at B, and I think much farther along than many others believe. At C, there will still be network engineers in the mold of the network engineers of today. They will be valiantly deploying appliance based networks for those companies who have a vertical niche deep enough to survive. There will be vendors still supporting these companies and engineers, too. There will just be very few of them. Like COBOL and FORTRAN coders today, they will live on the long tail of demand. I suspect a number of the folks who live in this long tail will even consider themselves the “real legacy” of network engineering, while seeing the rest of the network operations and engineering market as more of “software engineers” and “administrators.”

That’s all fine by me; I just know I’d rather be in the bubble of demand than the long tail. 🙂

What should I do as a network engineer? This is the tricky question.

First, I cannot tell you which path to take of the ones I have presented. I cannot, in fact, tell you precisely what these roles are going to look like, nor whether there will be other roles available. For instance, I have not discussed what I think vendors look like after this change at all; there will be some similar roles, and some different ones, in that world.

Second, all the roles I’ve described (other than the hardware focused role) involve moving up the stack into a more software and business focus. This means that to move into these roles, you need to gain some business acumen and some software skills. If this is all correct, then now is the time to gain those skills, rather than later. I intend to post more on these topics in the future, so watch this space.

Third, don’t be fatalistic about any of this. I hear a lot of people say things like “I don’t have any influence over the market or my company.” Wrong. Rather than throwing our hands up in frustration and waiting for our fates (or heads) to be handed to us on a silver platter, I want to suggest a way forward. I know that none of us can entirely control the future—my worldview does not allot the kind of radical freedom this would entail to individual humans. At the same time, I am not a fatalist, and I tend to get frustrated with people who argue they have no control, so we should just “sit back, relax, and enjoy the ride.” We have freedom to do different things in the future within the context and guard rails set by our past decisions (and other things outside the scope of a technical blog).

My suggestion is this: take a hard look at what I have written here, decide for yourself where you think I am right and where I am wrong, and make career decisions based on what you think is going to happen. I have seen multiple people end up at age 50 or 60 with a desire to work, and yet with no job. I cannot tell you what percentage of any particular person’s situation is because of ageism, declining skills, or just being in the wrong place at the wrong time (I tend to think all three play a different role in every person’s situation). On the other hand, if you focus on what you can change—your skills, attitude, and position—and stop worrying so much about the things you cannot change, you will be a happier person.

Fourth, this fatalism stretches to the company you work for, and anyplace you might work in the future. There is a strong belief that network engineers cannot influence business leadership. Let me turn this around: If you stop talking about chipsets and optical transceivers, and start talking about the value of data and how the company needs to think about that value, then you might get a seat at the table when these discussions are taking place. You are not helpless here; if you learn how to talk to the business, there is at least some chance (depending on the company, of course) that you can shape the future of the company you work for. If nothing else, you can use your thinking in this area to help you decide where you want to work next.

Now, let’s talk about some risk factors. While these trends seem strong to me, it is still worth asking: what could take things in a different direction? One thing that would certainly change the outlook would be a major economic crash or failure like the Great Depression. This might seem unthinkable to most people, but more than a few of the thinkers I follow in the economic and political realms are suggesting this kind of thing is possible. If this happens, companies will be holding things together with tin cans, baling wire, and duct tape; in this case, all bets are off. Another could be the collapse of the entire disaggregation ecosystem. Perhaps another could be someone discovering how to break the State/Optimization/Surface triad, or somehow beat the CAP theorem.

There is also the possibility that people, at large, will reject the data driven economy that is developing, intentionally moving back to a more personally focused world with local shopping, and offline friends rather than online. I would personally support such a thing, but while I think such a move could happen, I do not see it impacting every area of life. The “buy local” mantra is largely focused on bookstores, food, and some other areas. Notice this, however: if “buy local” really means what it says, it means buying from locally owned stores, rather than shifting from an online retailer to a large chain with mixed online/offline operations. Buy local is not a panacea for appliance based network engineering, and may even help drive the changes I see ahead.

So there you have it: in this first week of 2019, this is what I think is going to happen in the world of networking technology. I could be way wrong, and I am sticking my neck out a good bit in publishing this little series.

As always, this is more of a two-way conversation than you imagine. I read the comments here and on LinkedIn, and even (sometimes) on Twitter, so tell me what you think the future of network engineering will be. I am not so old, and certain of myself, that I cannot learn new things! 🙂

Whither Network Engineering? (Part 2)

In the first post of this series at the turn of 2019, I considered the forces I think will cause network engineering to radically change. What about the timing of these changes? I hear a lot of people say: “this stuff isn’t coming for twenty years or more, so don’t worry about it… there is plenty of time to adapt.” This optimism seems completely misplaced to me. Markets and ideas are like that old house you pass all the time—you know the one. No-one has maintained it for years, but it is so … solid. It was built out of the best timber, by people who knew what they were doing. The foundation is deep, and it has lasted all these years.

Then one day you pass a heap of wood on the side of the road and realize—this is that old house that seemed so solid just a few days ago. Sometime in the night, that house that was so solid collapsed. The outer shell was covering up a lot of inner rot. Kuhn, in The Structure of Scientific Revolutions, argues this is the way ideas always go. They appear to be solid one day, and then all the supports that looked so solid just moments before the collapse are all shown to be full of termites. The entire system of theories collapses in what seems like a moment compared to the amount of time the theory has stood. History has borne this way of looking at things out.

The point is: we could wake up in five years’ time and find the entire structure of the network engineering market has changed while we were asleep at the console running traceroute. I hear a lot of people talk about how it will take tens of years for any real change to take place because some class of businesses (usually the enterprise) do not take up new things very quickly. This line of thinking assumes the structure of business will remain the same—I think this tends to underestimate the symbiotic relationship between business and information technology. We have become so accustomed to seeing IT as a cost center that has little bearing on the overall business that it is hard to shift our thinking to the new realities that are starting to take hold.

While some niche retailers are doing okay, most of the broad-based ones are in real trouble. Shopping malls are like ghost towns, bookstores are closing in droves; even grocery stores are struggling in many areas. This is not about second-day delivery—this is about data. Companies must either be in a deep niche or learn to work with data to survive. Companies that can most effectively combine and use data to anticipate, adapt to, and influence consumer behavior will survive. The rest will not.

Let me give some examples that might help. Consider four retailers: Oak Island Hardware (a local hardware store), Home Depot, Sears, and Amazon. First, there are two kinds of businesses here; while all four have products that overlap, they service two different kinds of needs. In the one case, Home Depot and Oak Island Hardware cater to geographically localized wants where physical presence counts. When your plumbing starts to leak, you don’t have time to wait for next-day delivery. If you are in the middle of rebuilding a wall or a cabinet and you need another box of nails, you are not waiting for a delivery. You will get in your car and drive to the nearest place that sells such things. To some degree, Oak Island Hardware and Home Depot are in a separate kind of market from Sears and Amazon.

Consider Sears and Amazon as a pair first. Amazon internalized its data handling, and builds semi-custom solutions to support that data handling. Sears tried to focus on local stores, inventory management, and other traditional methods of retail. Sears is gone, Amazon remains. So Home Depot and Oak Island Hardware have a “niche” that protects them (to some degree) from the ravages of the data focused world. Now consider Oak Island Hardware versus Home Depot. Here the niche is primarily geographical—there just is not enough room on Oak Island to build a Home Depot. When people need a box of nails “now,” they will often choose the closer store to get those nails.

On the other hand, what kind of IT needs does a stand-alone store like Oak Island Hardware have? I do not think they will be directly hiring any network engineers in the near future. Instead, they will be purchasing IT services in the form of cloud-based applications. These cloud-based applications, in turn, will be hosted on … disaggregated stacks run by providers.

The companies in the broader markets that are doing well have built fully- or semi-customized systems to handle data efficiently. The network is no longer treated as a “thing” to be built; it is just another part of a larger data delivery system. Ultimately, businesses in broader markets that want to survive need to shift their thinking to data. The most efficient way to do this is to shift to a disaggregated, layered model similar to the one the web- and hyper-scalers have moved to.

I can hear you out there now, reading this, saying: “No! They can’t do this! The average IT shop doesn’t have the skilled people, the vision, the leadership, the… The web- and hyper-scalers have specialized systems built for a single purpose! This stuff doesn’t apply to enterprise networks!”

In answer to this plethora of objections, let me tell you a story.

Once, a long time ago, I was sent off to work on installing a project called PC3, a new US Air Force personnel management system. My job was primarily on the network side of the house, running physical circuits through the on-base systems, installing inverse multiplexers, and making certain the circuits were up and running. At the same time, I had been working on the Xerox STAR system on base, as well as helping design the new network core running a combination of Vines and Netware over optical links connecting Cabletron devices. We already had a bunch of other networks on base, including some ARCnet, token bus, thicknet, thinnet, and a few other things, so packet switching was definitely already a “thing.”

In the process of installing this PC3 system, I must have said something about how this was such old technology, and packet switching was eventually going to take over the world. In return, I got an earful or two from one of the older techs working on the job with me. “Russ,” he said, “you just don’t understand! Packet switching is going to be great for some specialized environments, but circuit switching has already solved the general purpose cases.”

Now, before you laugh at the old codger, he made a bunch of good points. At that time, we were struggling to get a packet switched network up between seven buildings, and then trying to figure out how to feed the packet switched network into more buildings. The circuit switched network, on the other hand, already had more bandwidth into every building on base than we could figure out how to bring to those seven buildings. Yes, we could push a lot more bandwidth across a couple of rooms, but even scaling bandwidth out to an entire large building was a challenge.

What changed? The ecosystem. A lot of smart people bought into the vision of packet switched networking and spent a lot of time figuring out how to make it do all the things no-one thought it could do, and apply it to problems no-one thought it could apply to. They learned how to take the best pieces of circuit-switched technology and apply them in the packet switched world (remember the history of MPLS).

So before you say “disaggregation does not apply to the enterprise,” remember the lesson of packet switched networks—and the lessons of a million other similar technologies. Disaggregation might not apply in the same way to web- and hyper-scale networks and enterprise networks, but this does not mean it does not apply at all. Do not throw the baby out with the bathwater.

As the disaggregation ecosystem grows—and it will grow—the options will become both broader and deeper. Rather than seeing the world as standards versus open source, we will need to learn to see standards plus open source. Instead of seeing the ecosystem as commercial versus open source, we will need to learn to see commercial plus open source. Instead of seeing protocols on appliances supporting applications, we will need to learn to see hardware and software. As the ecosystem grows, we will learn to learn from many places, including appliance-based networking, the world of servers, application development, and … the business. We will need to directly apply what makes sense and learn wisdom from the rest.

What does this mean for network engineering skills? That is the topic of the third post in this series.
