Is it time for the IETF to give up? Over at CircleID, Martin Geddes makes a case that it is, in fact, time for the IETF to “fade out.” The case he lays out is compelling—first, the IETF is not really an engineering organization. There is a lot of running after “success modes,” but very little consideration of failure modes and how they can and should be guarded against. Second, the IETF “takes on problems for which it lacks an ontological and epistemological framework to resolve.”
In essence, in Martin’s view, the IETF is not about engineering, and hasn’t ever really been.
The first problem is, of course, that Martin is right. The second problem is, though, that while he hints at the larger problem, he incorrectly lays it at the foot of the IETF. The third problem is that the solutions Martin proposes will not resolve the problem at hand.
First things first: Martin is right. The IETF is a mess, and is chasing after success, rather than attending to failure. I do not think this is largely a factor of a lack of engineering skill, however—after spending 20 years working in the IETF, I have seen a lot of ego feeding the problem. Long gone are the days when the saying “it is amazing how much work can get done when no-one cares who gets the credit” was true of the IETF. The working groups and mailing lists have become the playground of people who are primarily interested in how smart they look, how many times they “win,” and how many drafts they can get their names on. This is, of course, inevitable in any human organization, and it is a death of a thousand cuts to real engineering work.
Second, the problem is not (just) the IETF. The problem is the network engineering world in general. Many years ago I was mad at the engineering societies that said: “You can’t use engineer in your certifications, because you are not engineers.” Now, I think they have a point. The networking world is a curious mixture of folks who wrap sheet metal, design processors, build code, design networks, and jockey CLIs. And we call all of these folks “engineers.” How many of these folks actually know anything beyond the CLI they’re typing away on (or automating), or the innards of a single chipset? In my experience, very few.
On the other hand, there is something that needs to be said here in defense of network engineering, and information technology in general. There are two logical slips in the line of argument that need to be called out and dealt with.
The first line of argument goes like this: “But my father was a plane fitter, and he required certifications!” Sure, but planes are rare, and people die when they fall from the sky. Servers and networks are intentionally built not to be rare, and applications are intentionally built so that people do not die when they fail. It is certainly true that where the real world intersects with the network, specifically at the edge where real people live, there needs to be more thought put into not failing. But at the core, where the law of large numbers holds, we need to think about rapid success, rather than corner case failures.
There are many ways to engineer around failure; not all are appropriate in every case. Part of engineering is to learn to apply the right failure mode thinking to the right problem set, instead of assuming that every engineering problem needs to be addressed in the same way.
The second line of argument goes like this: “But airplanes don’t fail, so we should adopt aviation lines of thinking.” Sorry to tell you this, but even the most precise engineering fails in the face of the real world. Want some examples? Perhaps this, or this, or this, or this will do? Clear thinking does not begin with imbuing the rest of the engineering world with a mystique it does not actually possess.
Third, the solutions offered are not really going to help. Licensing is appropriate when you are dealing with rare things that, when they fall out of the sky, kill people. In many other cases, however, licensing just becomes an excuse to restrict the available talent pool, actually decreasing quality and innovation while raising prices. There needs to be a balance in here someplace—a balance that is probably going to be impossible to reach in the real world. But that does not mean we should not try.
What is to be done?
Dealing only with the IETF, a few practical things might be useful.
First, when any document is made into a working group document, it should be moved to an editor/contributor model. Individual authors should disappear when the work moves into the community, and the work transitions to the work of a team, rather than a small set of individuals. In other words, do what is possible to take the egos out of the game, and replace them with the pride of a job well done.
Second, standards need to explicitly call out what their failure modes are, and how designers are expected to deal with these failure modes. For edge computing, specifically, “build more and deploy them” should not be an option. This is a serious area that needs to be addressed, rather than glossed over by placing every technology at the core, and just assuming the IP paradigm of finding another path works.
Third, the IETF needs to strengthen the IRTF, and ask it to start thinking about how to quantify the differences between the kinds of engineering needed where, and what the intersection of these different kinds of engineering might look like. Far too often, we (the IETF) spend a lot of time navel gazing over which cities we should meet in, and end up leaving the larger questions off the table. We want one “winner,” and fail to embrace the wide array of problems in favor of “the largest vendor,” or “the most politically connected person in the room.”
Fourth, the IETF needs to learn to figure out what its bounds are, and then learn to let go. When I consider that there are hundreds of YANG models, for instance, I begin to suspect that this is one place where we are making some fundamental mistake about where to place the blurry dividing line between what the open source community should (or can) do, and what should be a standard. Perhaps the protocol used to carry a model should be a standard, and perhaps the things operators should expect to be able to find out about a protocol should be a part of the standard, and the modeling language should be a standard—but maybe the outline of the model itself should be left to faster market forces?
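To make that dividing line concrete, consider a sketch of what would sit on each side. The transport (NETCONF or RESTCONF) and the modeling language (YANG itself, per RFC 7950) would remain standards; something like the following small module, a purely hypothetical example here and not any actual IETF model, is the kind of artifact that could arguably be left to faster-moving open source and market forces:

```yang
// Hypothetical module for illustration only; the point is that
// the YANG *language* is the standard, while outlines like this
// one might be left outside the standards process.
module example-interface-counters {
  namespace "urn:example:interface-counters";
  prefix exic;

  container interfaces {
    list interface {
      key "name";
      leaf name {
        type string;
        description "Interface name, e.g. eth0.";
      }
      leaf in-errors {
        type uint64;
        description "Inbound packets discarded due to errors.";
      }
    }
  }
}
```

Under this split, operators would still get interoperable tooling, because the syntax and transport are fixed, while the shape of any particular model could iterate at the speed of the community that needs it.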
In the larger community, I am not going to change what I have been saying for years. We need to grow up and actually be engineers. We need to stop focusing on product lines and CLIs, and start focusing on learning how networks actually work. I am working on one project in this space, and have ideas for others, but for now, I can only point in the same directions I have always pointed in the past.