Reaction: Network software quality

Over at IT ProPortal, Dr Greg Law has an article up chiding the networking world for the poor software quality. To wit—

When networking companies ship equipment out containing critical bugs, providing remediation in response to their discovery can be almost impossible. Their engineers back at base often lack the data they need to reproduce the issue as it’s usually kept by clients on premise. An inability to cure a product defect could result in the failure of a product line, temporary or permanent withdrawal of a product, lost customers, reputational damage, and product reengineering expenses, any of which could have a material impact on revenue, margins, and net income.

Let me begin here: Dr. Law, you are correct—we have a problem with software quality. I think the problem is a bit larger than just the networking world—for instance, my family just purchased two new vehicles, a Volvo and a Fiat. Both have Android systems in the center screen. And neither will connect correctly with our Android based phones. It probably isn’t mission critical, like it could be for a network, but it is annoying.

But even given software quality is a widespread issue in our world, it is still true that networks are something of a special case. While networks are often just hidden behind the plug, they play a much larger role in the way the world works than most people realize. Much like the train system at the turn of the century, and the mixed mode transportation systems that enable us to put dinner on the table every night, the network carries most of what really matters in the virtual world, from our money to our medical records.

Given the assessment is correct—and I think it is—what is the answer?

One answer is to simply do better. To fuss at the vendors, and the open source projects, and to make the quality better. The beatings, as they say, will continue until moral improves. If anyone out there thinks this will really work, raise your hands. No, higher. I can’t see you. Or maybe no-one has their hands raised.

What, then, is the solution? I think Dr. Law actually gets at the corner of what the solution needs to be in this line—

The complexity of the network stack though, is higher than ever. An increased number of protocols leads to a more complex architecture, which in turn severely impacts operational efficiency of networks.

For a short review, remember that complexity is required to solve hard problems. Specifically, the one hard problem complexity is designed to solve is environmental uncertainty. Because of this, we are not going to get rid of complexity any time soon. There are too many old applications, and too many old appliances, that no-one willing to let go of. There are too many vendors trying to keep people within their ecosystem, and too many resulting one-off connectors to bridge the gap, that will never be replaced. Complexity isn’t really going to be dramatically reduced until we bite the bullet and take these kinds of organizational and people problems on head on.

In the meantime, what can we do?

Design simpler. Stop stacking tons of layers. Focus on solving problems, rather than deploying technologies. Stop being afraid to rip things out.

If you have read my work in the past, for instance Navigating Network Complexity, or Computer Networking Problems and Solutions, or even The Art of Network Architecture, you know the drill.

We can all cast blame at the vendors, but part of this is on us as network engineers. If you want better quality in your network, the best place to start is with the network you are working on right now, the people who are designing and deploying that network, and the people who make the business decisions.