Technologies that Didn’t: CLNS

Note: RFC1925, rule 11, reminds us that: “Every old idea will be proposed again with a different name and a different presentation, regardless of whether it works.” Understanding the past not only helps us to understand the future, it also helps us to take a more balanced and realistic view of the technologies being created and promoted for current and future use.

The Open Systems Interconnect (OSI) model is the most often taught model of data transmission—although it is not all that useful in terms of describing how modern networks work. What many engineers who have come into network engineering more recently do not know is there was an entire protocol suite that went with the OSI model. Each of the layers within the OSI model, in fact, had multiple protocols specified to fill the functions of that layer. For instance, X.25, while older than the OSI model, was adopted into the OSI suite to provide point-to-point connectivity over some specific kinds of physical circuits. Moving up the stack a little, there were several protocols that provided much the same service as the widely used Internet Protocol (IP).

The Connection Oriented Network Service, or CONS, ran on top of the Connection Oriented Network Protocol, or CONP (which is X.25). Using CONP and CONS, a pair of hosts can set up a virtual circuit. Perhaps the closest analogy available in the world of networks today would be an MPLS Label Switched Path (LSP). Another protocol, the Connectionless Network Service, or CLNS, ran on top of the Connectionless Network Protocol, or CLNP. A series of Transport Protocols ran on top of CLNS (these might also be described as modes of CLNS in some sense). Together, CLNS and these transport protocols provided a set of services similar to IP, the Transmission Control Protocol (TCP), and the User Datagram Protocol (UDP)—or perhaps even closer to something like QUIC.

The routing protocol that held the network together by discovering the network topology, carrying reachability, and calculating loop free paths was the venerable Intermediate System to Intermediate System (IS-IS) protocol, which is still in wide use today. The OSI protocol suite had some interesting characteristics.

For instance, each host had a single address, rather than each interface. The host address was calculated automatically based on a local Media Access Control (MAC) address, combined with other bits of information that were either learned, assumed, or locally configured. A single host could have multiple addresses, using each one to communicate to the different networks it might be connected to, or even being used to contact hosts in different domains. The Intermediate System (IS) played a vital role in the process of address calculation and routing; essentially routing ran all the way down to the host level, which allowed smart calculation of next hops. While a default router could be configured in some implementations, it was not required, as the hosts participated in routing.

A tidbit of interesting history: the desire to run CLNS and TCP/IP side by side on a single network drove one of the earliest uses of address families in protocols, and was one of the main drivers of the Type-Length-Value (TLV) encoding of the IS-IS routing protocol. ISO-IGRP was one of the first multi-protocol routing protocols to use a concept similar to address families. Some of the initial work in BGP address families was also done to support CLNS routing, as the OSI stack did not specify an interdomain routing protocol.
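To make the TLV idea concrete, here is a minimal sketch of encoding and walking a string of TLVs, assuming the one-byte type and one-byte length layout IS-IS uses. The type codes and the function names are made up for illustration; they are not real IS-IS code points or any library's API.

```python
import struct

def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Pack one TLV: 1-byte type, 1-byte length, then the value."""
    if not 0 <= tlv_type <= 255:
        raise ValueError("type must fit in one byte")
    if len(value) > 255:
        raise ValueError("value must be at most 255 bytes")
    return struct.pack("!BB", tlv_type, len(value)) + value

def decode_tlvs(data: bytes) -> list[tuple[int, bytes]]:
    """Walk a byte string and return (type, value) pairs in order."""
    tlvs = []
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from("!BB", data, offset)
        offset += 2
        if offset + length > len(data):
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, data[offset:offset + length]))
        offset += length
    return tlvs

# Hypothetical type codes, chosen only for this example.
TYPE_CLNS_AREA = 1
TYPE_IPV4_PREFIX = 200

packet = (encode_tlv(TYPE_CLNS_AREA, bytes.fromhex("490001"))
          + encode_tlv(TYPE_IPV4_PREFIX, bytes([192, 0, 2, 0, 24])))

# A CLNS-only speaker keeps the types it knows and skips the rest.
KNOWN_TYPES = {TYPE_CLNS_AREA}
understood = [(t, v) for t, v in decode_tlvs(packet) if t in KNOWN_TYPES]
```

Because every TLV carries its own length, a router that only understands one address family can step over the others without choking on them, which is what makes running two protocol suites over one routing protocol practical.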

Why is this interesting protocol not in wide use today? There are many reasons—probably every engineer who ever worked on both can give you a different set of reasons, in fact. From my perspective, however, there are a few basic reasons why TCP/IP “won” over the CLNS suite of protocols.

First, the addressing used with CLNS was too complex. There were spots in the address for the assigning authority, the company, subdivisions within the company, flooding domains, and other elements. While this was all very neat, it assumed either one of two things: that network topologies would be built roughly parallel to the organizational chart (starting from governmental entities), or that the topology of the network and addressing did not need to be related to one another. This flies in the face of Yaakov’s rule: either the topology must follow the addressing, or the addressing must follow the topology, if you expect the network to scale.
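As a rough illustration of how much is packed into one of these addresses, here is a sketch that splits an address written the way IS-IS operators usually write them. It assumes the common layout of a variable-length area (beginning with a one-byte AFI), a six-byte system ID, and a one-byte selector. The class and function names are hypothetical, and the sketch does not try to split the area into the authority, company, and subdivision fields described above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Nsap:
    afi: str        # authority and format identifier (first byte)
    area: str       # everything before the system ID, AFI included
    system_id: str  # six bytes, often derived from a MAC address
    selector: str   # one byte

def parse_nsap(text: str) -> Nsap:
    """Split a dotted-hex address into its assumed fields."""
    digits = text.replace(".", "").lower()
    if len(digits) % 2 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError("address must be whole bytes of hex")
    # At least AFI + system ID + selector (8 bytes), at most 20 bytes.
    if not 16 <= len(digits) <= 40:
        raise ValueError("address must be 8 to 20 bytes long")
    return Nsap(
        afi=digits[:2],
        area=digits[:-14],
        system_id=digits[-14:-2],
        selector=digits[-2:],
    )

# AFI 49 is the "private" space mentioned in the next paragraph.
addr = parse_nsap("49.0001.1921.6800.1001.00")
```

Here addr.area is "490001" and addr.system_id is "192168001001"; everything the organizational hierarchy was supposed to express lives in the variable-length area portion.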

In the later years of CLNS, this entire mess was replaced by people just using the “private” organizational information to build internal networks, and assuming these networks would never be interconnected (much like using private IPv4 address space). Sometimes this worked, of course. Sometimes it did not.

This addressing scheme left users with the impression, true or false, that CLNS implementations had to be carefully planned. Numbers must be procured, the organizational structure must be consulted, etc. IP, on the other hand, was something you could throw out there and play with. You could get it all working, and plan later, or change your plans. Well, in theory, at least; things never seem to work out in real life the way they are supposed to.

Second, the protocol stack itself was too complex. Rather than solving a series of small problems, and fitting the solutions in a sort of ad-hoc layered set of protocols, the designers of CLNS carefully thought through every possible problem, and considered every possible solution. At least they thought they did, anyway. All this thinking, however, left the impression, again, that to deploy this protocol stack you had to carefully think about what you were about. Further, it was difficult to change things in the protocol stack when problems were found, or new use cases—that had not been thought of—were discovered.

It is better, most of the time, to build small, compact units that fit together, than it is to build a detailed grand architecture. The network engineering world tends to oscillate between the two extremes of no planning or all planned; rarely do we get the temperature of the soup (or porridge) just right.

While CLNS is not around today, it is hard to call the protocol stack a failure. CLNS was widely deployed in large scale electrical networks; some of these networks may well be in use today. Further, its effects are everywhere. For instance, IS-IS is still a widely deployed protocol. In fact, because of its heritage of multiprotocol design from the beginning, it is arguably one of the easiest link state protocols to work with. Another example is the multiprotocol work that has carried over into other protocols, such as BGP. The ideas of a protocol running between the host and the router, and of autoconfiguration of addresses, have come back in very similar forms in IPv6, as well.

CLNS is another one of those designs that shape the thinking of network engineers today, even if network engineers don’t even know it existed.

Reducing RPKI Single Point of Takedown Risk

The RPKI, for those who do not know, ties the origin AS to a prefix using a certificate (the Route Origin Authorization, or ROA) signed by a third party. The third party, in this case, is validating that the AS in the ROA is authorized to advertise the destination prefix in the ROA; if ROAs were self-signed, the security would be no better than simply advertising the prefix in BGP. Who should be able to sign these ROAs? The assigning authority makes the most sense: the Regional Internet Registries (RIRs), since they (should) know which company owns which set of AS numbers and prefixes.

The general idea makes sense—you should not accept routes from “just anyone,” as they might be advertising the route for any number of reasons. An operator could advertise routes to source spam or phishing emails, or some government agency might advertise a route to redirect traffic, or block access to some web site. But … if you haven’t found the tradeoffs, you haven’t looked hard enough. Security, in particular, is replete with tradeoffs.

Every time you deploy some new security mechanism, you create some new attack surface—sometimes more than one. Deploy a stateful packet filter to protect a server, and the device itself becomes a target of attack, including buffer overflows, phishing attacks to gain access to the device as a launch-point into the private network, and the holes you have to punch in the filters to allow services to work. What about the RPKI?

When the RPKI was first proposed, one of my various concerns was the creation of new attack surfaces. One specific attack surface is the control a single organization (the issuing RIR) has over the very existence of the operator. Suppose you start a new content provider. To get the new service up and running, you sign a contract with an RIR for some address space, sign a contract with some upstream provider (or providers), set up your servers and service, and start advertising routes. For whatever reason, your service goes viral, netting millions of users in a short span of time.

Now assume the RIR receives a complaint against your service for whatever reason—the reason for the complaint is not important. This places the RIR in the position of a prosecutor, defense attorney, and judge—the RIR must somehow figure out whether or not the charges are true, figure out whether or not taking action on the charges is warranted, and then take the action they’ve settled on.

In the case of a government agency (or a large criminal organization) making the complaint, there is probably going to be little the RIR can do other than simply revoke your certificate, pulling your service off-line.

Overnight your business is gone. You can drag the case through the court system, of course, but this can take years. In the meantime, you are losing users, other services are imitating what you built, and you have no money to pay the legal fees.

A true story—without the names. I once knew a man who worked for a satellite provider, let’s call them SATA. Now, SATA’s leadership decided they had no expertise in accounts receivables, and they were spending too much time on trying to collect overdue bills, so they outsourced the process. SATB, a competing service, decided to buy the firm SATA outsourced their accounts receivables to. You can imagine what happens next… The accounting firm worked as hard as it could to reduce the revenue SATA was receiving.

Of course, SATA sued the accounting firm, but before the case could make it to court, SATA ran out of money, laid off all their people, and shut their service down. SATA essentially went out of business. They won some money later, in court, but … whatever money they won was just given to the investors of various kinds to make up for losses. The business itself was gone, permanently.

Herein lies the danger of giving a single entity like an RIR, even if they are friendly, honest, etc., control over a critical resource.

A recent paper presented at the ANRW at APNIC caught my attention as a potential way to solve this problem. The idea is simple—just allow (or even require) multiple signatures on a ROA. To be more accurate, each authorizing party issues a “partial certificate;” if “enough” pieces of the certificate are found and valid, the route will be validated.

The question is—how many signatures (or parts of the signature, or partial attestations) should be enough? The authors of the paper suggest there should be a “Threshold Signature Module” that makes this decision. The attestations of the various signers are combined in the threshold module to produce a single signature that is then used to validate the route. This way the validation process on the router remains the same, which means the only real change in the overall RPKI system is the addition of the threshold module.

If one RIR—even the one that allocated the addresses you are using—revokes their attestation on your ROA, the remaining attestations should be enough to convince anyone receiving your route that it is still valid. Since there are five regions, you have at least five different choices to countersign your ROA. Each RIR is under the control of a different national government; hence organizations like governments (or criminals!) would need to work across multiple RIRs and through other government organizations to have a ROA completely revoked.
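The effect of losing one attestation can be sketched with a plain k-of-n count. To be clear, this is not the paper's threshold signature construction, which combines partial attestations into a single signature; it is only a simplified stand-in. The keys, the function names, and the use of HMAC in place of real public-key signatures are all assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical per-RIR keys; a real system would use public-key signatures.
RIR_KEYS = {
    "AFRINIC": b"key-afrinic",
    "APNIC": b"key-apnic",
    "ARIN": b"key-arin",
    "LACNIC": b"key-lacnic",
    "RIPE": b"key-ripe",
}

def attest(rir: str, prefix: str, origin_as: int) -> bytes:
    """Produce one RIR's attestation over a (prefix, origin AS) pair."""
    message = f"{prefix}|AS{origin_as}".encode()
    return hmac.new(RIR_KEYS[rir], message, hashlib.sha256).digest()

def roa_is_valid(prefix: str, origin_as: int,
                 attestations: dict[str, bytes], threshold: int) -> bool:
    """Accept the ROA if at least `threshold` attestations verify."""
    valid = 0
    for rir, tag in attestations.items():
        if rir in RIR_KEYS and hmac.compare_digest(tag, attest(rir, prefix, origin_as)):
            valid += 1
    return valid >= threshold

prefix, origin = "192.0.2.0/24", 64500
attestations = {rir: attest(rir, prefix, origin) for rir in ("ARIN", "RIPE", "APNIC")}

before = roa_is_valid(prefix, origin, attestations, threshold=2)
del attestations["ARIN"]   # one RIR revokes its attestation
after_one = roa_is_valid(prefix, origin, attestations, threshold=2)
del attestations["RIPE"]   # a second RIR revokes as well
after_two = roa_is_valid(prefix, origin, attestations, threshold=2)
```

With three attestations and a threshold of two, the route survives one revocation (after_one is True) but not two (after_two is False). Where to set the threshold is the policy question the paper leaves to the threshold module.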

An alternate solution here, one that follows the PGP model, might be to simply have the threshold signature module consider the number and source of ROAs using the existing model. Local policy could determine how to weight attestations from different RIRs, etc.

This multiple or “shared” attestation (or signature) idea seems like a neat way to work around one of (possibly the major) attack surfaces introduced by the RPKI system. If you are interested in Internet core routing security, you should take a read through the post linked above, and then watch the video.

Everyone Must Learn to Code

The word on the street is that everyone—especially network engineers—must learn to code. A conversation with a friend and an article passing through my RSS reader brought this to mind once again—so once more into the breach. Part of the problem here is that we seem to have a knack for asking the wrong question. When we look at network engineer skill sets, we often think about the ability to configure a protocol or set of features, and then the ability to quickly troubleshoot those protocols or features using a set of commands or techniques.

This is, in some sense, what various certifications have taught us—we have reached the expert level when we can configure a network quickly, or when we can prove we understand a product line. There is, by the way, a point of truth in this. If you claim your expertise is with a particular vendor’s gear, then it is true that you must be able to configure and troubleshoot on that vendor’s gear to be an expert. There is also a problem of how to test for networking skills without actually implementing something, and how to implement things without actually configuring them. This is a problem we are discussing in the new “certification” I’ve been working on, as well.

This is also, in some sense, what the hiring processes we use have taught us. Computers like to classify things in clear and definite ways. The only clear and definite way to classify networking skills is by asking questions like “what protocols do you understand how to configure and troubleshoot?” It is, it seems, nearly impossible to test design or communication skills in a way that can be easily placed on a resume.

Coding, I think, is one of those skills that is easy to appear to measure accurately, and it’s also something the entire world insists is the “coming thing.” No coding skills, no job. So it’s easy to ask the easy question—what languages do you know, how many lines of code have you written, etc. But again, this is the wrong question (or these are the wrong questions).

What is the right question? In terms of coding skills, more along the lines of something like, “do you know how to build and use tools to solve business problems?” I phrase it this way because the one thing I have noticed about every really good coder I have known is they all spend as much time building tools as they do building shipping products. They build tools to test their code, or to modify the code they’ve already written en masse, etc. In fact, the excellent coders I know treat functions like tools—if they have to drive a nail twice, they stop and create a hammer rather than repeating the exercise with some other tool.

So why is coding such an important skill to gain and maintain for the network engineer? This paragraph seems to sum it up nicely for me—

“Coding is not the fundamental skill,” writes startup founder and ex-Microsoft program manager Chris Granger. What matters, he argues, is being able to model problems and use computers to solve them. “We don’t want a generation of people forced to care about Unicode and UI toolkits. We want a generation of writers, biologists, and accountants that can leverage computers.”

It’s not the coding that matters, it’s “being able to model problems and use computers to solve them.” This is the essence of tool building or engineering: seeing the problem, understanding the problem, and then thinking through (sometimes by trial and error) how to build a tool that will solve the problem in a consistent, easy to manage way. I fear that network engineers are carrying their configuration-centric attitude into automation, using it only to make configuration and troubleshooting faster. We seem to end up asking “how do I solve the problem of making the configuration of this network faster,” rather than asking “what business problem am I trying to solve?”

To make effective use of the coding skills we’re telling everyone to learn, we need to go back to basics and understand the problems we’re trying to solve—and the set of possible solutions we can use to solve those problems. Seen this way, the routing protocol becomes “just another tool,” just like a function call, that can be used to solve a specific set of problems—instead of a set of configuration lines that we invoke like a magic incantation to make things happen.

Coding skills are important—but they require the right mindset if we’re going to really gain the sorts of efficiencies we think are possible.

It Has to Work (RFC1925, Rule 1)

From time immemorial, humor has served to capture truth. This is no different in the world of computer networks. A notable example of using humor to capture truth is the April 1 RFC series published by the IETF. RFC1925, The Twelve Networking Truths, will serve as our guide.

According to RFC1925, the first fundamental truth of networking is: it has to work. While this might seem to be overly simplistic, it has proven—over the years—to be much more difficult to implement in real life than it looks like in a slide deck. Those with extensive experience with failures, however, can often make a better guess at what is possible to make work than those without such experience. The good news, however, is the experience of failure can be shared, especially through self-deprecating humor.

Consider RFC748, which is the first April First RFC published by the IETF, the TELNET RANDOMLY-LOSE Option. This RFC describes a set of additional signals in the TELNET protocol (for those too young to remember, TELNET is what people used to communicate with hosts before SSH and web browsers!) that instruct the server not to provide random losses through such things as “system crashes, lost data, incorrectly functioning programs, etc., as part of their services.” The RFC notes that many systems apparently have undocumented features that provide such losses, frustrating users and system administrators. The option proposed would instruct the server to disable features which cause these random losses.

Lesson learned? Although one of the general rules of application design is the network is not reliable, the counter rule suggested by RFC748 is the application is not reliable, either. This is a key point in the race to Mean Time to Innocence (MTTI). RFC1882, published some years after RFC748, is a veritable guidebook for finding problems in a network, including transceiver failures, databases with broken b-trees, unterminated contacts, and a plethora of other places to look. Published just before Christmas, RFC1882 is an ideal guide for those who want to spend time with their families during the most festive times of the year.

Another common problem in large-scale networks is services that want to choose to operate from the safety and security of an anonymous connection. RFC6593 describes the Domain Pseudonym System, specifically designed to support services that do not wish to be discovered. The specification describes two parties to the protocol, the first being the seeker, or “it,” and the second being the service which is attempting to hide from it. The process used is for the seeker to send a transmission declaring the beginning of the search sequence called the “ready or not,” followed by a countdown during which “it” is not allowed to peek at a list of available services. During this countdown, the service may change its name or location, although it will be penalized if discovered doing so. This Domain Pseudonym System is the perfect counterpart to the Domain Name System normally used to discover services on large-scale networks, as shown by the many networks that already deploy such a hide-and-seek approach to managing services.

What if all the above guidance for network operators fails, and you are stuck troubleshooting a problem? RFC2321 has an answer to this problem: RITA, the Reliable Internetwork Troubleshooting Agent. The typical RITA is described as 51.25cm in length, and yellow/orange in color. The first test the operator can perform with the RITA is placing it on the documentation for the suspect system, or on top of the suspect system itself. If the RITA eventually flies away, there is a greater than 90% chance there is a defect in the system tested. There is no guarantee, however, that the defects in the tested system are the root cause of the problem the operator is currently troubleshooting. The RITA has such a high success rate because it is believed that 100% of systems in operation do, in fact, contain defects. The 10% failure rate primarily occurs in cases where the RITA itself dies during the test, or decides to go to sleep rather than flying to some other location.

Each of these methods can help the network operator fulfill the first rule of networking: it has to work.

Understandability

According to Maor Rudick, in a recent post over at Cloud Native, programming is 10% writing code and 90% understanding why it doesn’t work. The same could be said of deploying network protocols, security, or anything else that requires thought about where and how it is deployed. I’m not just talking about the configuration, either: why was this filter deployed here rather than there? Why was this BGP community used rather than that one? Why was this aggregation range used rather than some other? Even in a fully automated world, the saying holds true.

So how can you improve the understandability of your network design? Maor defines understandability as “the dev who creates the software is to effortlessly … comprehend what is happening in it.” Continuing—“the more understandable a system is, the easier it becomes for the developers who created it to change it in a way that is safe and predictable.” What are the elements of understandability?

Documentation must be complete, clear, concise, and organized. The two primary failings I encounter in documentation are completeness and organization. Why something is done, when it was last changed, and why it was changed are often missing. The person making the change just assumes “I’ll remember this, or someone will figure it out.” You won’t, and they won’t. Concise is the “other side” of complete … Recording unsubstantial changes just adds information that won’t ever be needed. You have to balance between enough and too much, of course.

Organization is another entire problem in documentation—most people have a favorite way to organize things. When you get a team of people all organizing things based on their favorite way, you end up with a mess. Going back in time … I remember that just about everyone who was assigned to the METNAV shop began their time by re-organizing the tools. Each time the re-organization made things so much easier to find, and improved the MTTR for the airfield equipment we supported … After a while, you’d think someone would ask, “Does re-organizing all the tools every year really help? Or are you just making stuff up for new folks to do?”

Moving beyond documentation, what else can we do to make our networks more understandable?

First, we can focus on actually making networks simpler. I don’t mean just glossing things over with a pretty GUI, or automating thousands of lines of configuration using Python. I mean taking steps by using protocols that are simpler to run, require less configuration, and produce more information you can use for troubleshooting. Choose something like IS-IS for your DC fabric underlay rather than BGP, unless you really have several hundred thousand underlay destinations (hint: if you’ve properly separated “customer” routes in the overlay from “infrastructure” routes in the underlay, you shouldn’t have this kind of routing tangle in the underlay anyway).

What about having multiple protocols that do the same job? Do you really need three or four routing protocols, four or five tunneling protocols, and five or six … well, you get the idea. Reducing the sheer number of protocols running in your network can make a huge difference in tooling and troubleshooting time. What about having four or five kinds of boxes in your network that fulfill the same role? Okay, so maybe you have three DC fabrics, and you run each one using a different vendor. But is there any reason to have three DC fabrics, each of which has a broad mix of equipment from five different vendors? I doubt it.

Second, you can think about what you would measure in the case of failure, how you would measure it, and put the basic pieces in place in the design phase to do those measurements. Don’t wait until you need the data to figure out how to get at it, and what the performance results of trying to get it are going to be.

Third, you can think about where you put policy in your network. There is no “right” answer to this question, other than … be consistent. The first option is to put all your policy in one place: say, on the devices that connect the core to the aggregation, or the devices in the distribution layer. The second option is to always put the policy as close as possible to the source or destination of the traffic impacted by the policy. In a DC fabric, you should always put policy and external connectivity in the T0 or ToR, never in the spine (it’s not a core, it’s a spine).

Maybe you have other ideas on how to improve understandability in networks … If you do, get in touch and let’s talk about it. I’m always looking for practical ways to make networks more understandable.

The White Board and the Simulation

In the argument between OSPF and BGP in the data center fabric over at Justin’s blog, I am decidedly in the camp of IS-IS. Rather than lay my reasons out here, however (a topic for another blog post?), I want to focus on something else Justin said that I think is incredibly important for network engineers to understand.

I think whiteboards are the most important tool for network design currently available, which makes me sad. I wish that wasn’t true, I want much better tools. I can’t even tell you the number of disasters averted by 2-3 great network engineers arguing over a whiteboard.

I remember—way back—when I was working on the problems around making a link-state protocol work well in a Mobile Ad Hoc Network (MANET), we had two competing solutions presented to the IETF. The first solution was primarily based on whiteboarding through various options and coming up with one that should reduce flooding to an acceptable level. The second was less optimal on the whiteboard but supported by simulations showing it should reduce flooding more effectively.

Which solution “won?” I don’t know what “winning” might mean here, but the solution designed on the whiteboard has been widely deployed and is now showing up in other places—take a look at distoptflood and the flooding optimizations in RIFT. Both are like what we designed all those years ago to make OSPF work in a MANET.

Does this mean simulations, labbing, and testing are useless? No.

Each of these tools brings a different set of strengths to the table when trying to solve a problem. For instance, there is no way to optimize the flooding in a protocol unless you really know how flooding works, what information you have available, where things might fail, what features are designed into the protocol to prevent or work around failures, etc. These things are what you need to put together and understand on a whiteboard.

You cannot think of every possible situation that needs to be simulated, nor can you simulate every possible order of operation, or every possible timing problem that might occur. If you know the protocol, however, you can cover most of this ground and design in fail-safes. Simulations are necessary, but not sufficient, for network and protocol design.

On the other side of the coin, Justin points out the stack they were using was known to have a weak BGP implementation and a strong OSPF implementation. This, and other scale and timing issues, will only show up under a simulation. For instance, you cannot answer, on a whiteboard, the question of how large the LSDB can become before the processor keels over and dies; it’s just not possible. The whiteboard is necessary, but not sufficient, for network and protocol design.

The danger is that we attempt to replace one with the other—that we either ignore the value of simulation because it’s hard (or even impossible), or we ignore the power of the whiteboard because we don’t understand the protocols well enough to stand at the whiteboard and argue. Every team needs to have at least one person who can “see” how the network will converge just because they understand the protocol. And every team needs to have at least one person who can throw together a simulation and show how the network will converge in the same situation.
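To give a feel for what “throwing together a simulation” can look like at its very simplest, here is a toy sketch that counts how many copies of a single update cross the wire in a leaf-spine fabric under naive flooding (every node re-floods a new update to every neighbor except the one it came from). The topology builder, the function names, and the flooding model are assumptions made for illustration; the sketch ignores timing, acknowledgments, and retransmission, which is exactly where real simulations earn their keep.

```python
from collections import deque

def build_leaf_spine(leaves: int, spines: int) -> dict[str, set[str]]:
    """Build a fabric where every leaf connects to every spine."""
    graph = {f"leaf{i}": set() for i in range(leaves)}
    graph.update({f"spine{j}": set() for j in range(spines)})
    for i in range(leaves):
        for j in range(spines):
            graph[f"leaf{i}"].add(f"spine{j}")
            graph[f"spine{j}"].add(f"leaf{i}")
    return graph

def count_flooded_copies(graph: dict[str, set[str]], origin: str) -> int:
    """Count every copy of one update sent under naive flooding."""
    seen = {origin}
    queue = deque((origin, n) for n in sorted(graph[origin]))
    copies = 0
    while queue:
        sender, receiver = queue.popleft()
        copies += 1
        if receiver in seen:
            continue  # duplicate copy; the receiver drops it
        seen.add(receiver)
        for neighbor in sorted(graph[receiver]):
            if neighbor != sender:
                queue.append((receiver, neighbor))
    return copies

fabric = build_leaf_spine(leaves=4, spines=2)
copies = count_flooded_copies(fabric, "leaf0")
needed = len(fabric) - 1  # one copy per other node is the minimum
```

In this small fabric the update is sent 11 times to reach 5 other nodes. The whiteboard tells you why the duplicates exist and which ones are safe to suppress; a simulation tells you how the numbers grow as the fabric does.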

Whiteboards and simulations are both crucial tools. Learn the protocols and design well enough to use the one and find or build the tools needed to use the other. Missing either one is going to leave a blind spot in your abilities as an engineer.

Which one are you missing in your skill set right now? If you know the answer to that question, then you know at least one thing you need to learn “next.”

Unsolicited Multicast: Random Thoughts on the LFN White Paper

A short while back, the Linux Foundation (Networking), or LFN, published a white paper about the open source networking ecosystem. Rather than review the paper, or try to find a single theme, I decided to just write down “random thoughts” as I read through it. This is the (rather experimental) result.

The paper lists five goals of the project which can be reduced to three: reducing costs, increasing operators’ control over the network, and increasing security (by increasing code inspection). One interesting bit is the pairing of cost reduction with increasing control. Increasing control over a network generally means treating it less like an opaque box and more like a disaggregated set of components, each of which can be tuned in some way to improve the fit between network services, network performance, and business requirements. The less a network is an opaque box, however, the more time and effort required to manage it. This only makes sense: tuning a network to perform better requires time and talent, both of which cost money.

The offsetting point here is disaggregation and using open source can save money—although in my experience it never does. Again, running disaggregated software and hardware requires time and talent, both of which cost money. Intuitively, then, reducing costs and increasing control don’t “pair” together like this. Building a more tunable, flexible network might be possible at the same cost as building a network in some “more traditional way,” but “costs less” is generally the last thing on my mind when I’m thinking about disaggregation, open source, and more modular, systemic wholistic designs.

The bottom line—if you are going to try to sell disaggregation to your boss, do not play the “cost card.” Focus on the business side of things, instead. Another way of putting this—don’t reduce what you are selling to a commodity, or an item with no business value other than reducing costs. First, this is simply not true. Second, you’ll never meet a salesman who says, “what I’m selling you is a commodity, so I’m really just after getting it to you for cheaper.” Even toilet paper companies sell on quality. Networks are more important than toilet paper to the business, right?

A second interesting point: “the network control layer is where end-to-end complex network services are designed and executed.” I broadly agree with this statement, but … I have been pushing, for some time, the idea that there is not just one control plane. Part of the problem with network design is we tend to modularize topologically, using one form or another of hierarchy, quite nicely. What we do not do, however, is think about how to modularize vertically to create a true wholistic view of the system “as a whole.” We do create multiple layers of data planes (they’re called tunnels!), but we do not think about creating multiple layers of control planes.

Going way back in time, when transit providers first started scaling, BGP speakers would not advertise a route that was not also present in a local IGP table (not just the routing table, but the actual OSPF, EIGRP, or IS-IS table). This was called synchronization, and it was designed to ensure traffic being forwarded into the network from the edge had a real path through the network. Since BGP can “skip over” routers using multihop, it was far too easy to send traffic to a router that was not running BGP, and hence did not have forwarding information for the destination … even though some router just one or two hops away did because it was running BGP.
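The synchronization rule itself is small enough to sketch in a few lines. This is only a model of the check described above, not any vendor's implementation; the function name and the prefixes are made up for illustration.

```python
def advertisable_routes(bgp_routes, igp_routes, synchronization=True):
    """Return the BGP routes this speaker may advertise onward."""
    if not synchronization:
        return set(bgp_routes)
    # With synchronization on, a route must also be known to the IGP.
    return set(bgp_routes) & set(igp_routes)

bgp_table = {"198.51.100.0/24", "203.0.113.0/24"}
igp_table = {"10.0.0.0/24", "198.51.100.0/24"}

synced = advertisable_routes(bgp_table, igp_table)
unsynced = advertisable_routes(bgp_table, igp_table, synchronization=False)
```

With synchronization on, only 198.51.100.0/24 is advertised, because it is the only BGP route the IGP also carries. The scaling problem is visible even here: every external route has to live in the IGP before it can be used.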

Operators soon discovered keeping their IGP and BGP tables synchronized simply isn’t possible—the IGPs just could not support the route counts required. This led to running BGP on every router in the AS, which led to confederations, then to route reflectors, then to MPLS and other tunneling mechanisms to reduce state in the network core.

Running BGP on every router cleanly separates internal or infrastructure routes from external or customer routes. This is an effective use of information hiding vertically, an overlay with some information, and an underlay with other information. This is what I’ve been pressing for in the DC fabric space for quite a while now … and it’s something the network engineering community needs to explore and flesh out more in many other spaces.

So, there it is—I could go on, but I think you’re probably already bored with my random thoughts.