The Hedge 69: Container Networking Done Right
Everyone who’s heard me talk about container networking knows I think it’s a bit of a disaster. This is what you get, though, when someone says “that’s really complex, I can discard the years of experience others have in designing this sort of thing and build something a lot simpler…” The result is usually something that’s more complex. Alex Pollitt joins Tom Ammon and I to discuss container networking, and new options that do container networking right.
The Hedge 53: Deprecating Interdomain ASM
Interdomain Any-source Multicast has proven to be an unscalable solution, and is actually blocking the deployment of other solutions. To move interdomain multicast forward, Lenny Giuliano, Tim Chown, and Toerless Eckert wrote RFC 8815, BCP 229, recommending providers “deprecate the use of Any-Source Multicast (ASM) for interdomain multicast, leaving Source-Specific Multicast (SSM) as the recommended interdomain mode of multicast.”
Technologies that Didn’t: CLNS
Note: RFC1925, rule 11, reminds us that: “Every old idea will be proposed again with a different name and a different presentation, regardless of whether it works.” Understanding the past not only helps us to understand the future, it also helps us to take a more balanced and realistic view of the technologies being created and promoted for current and future use.
The Open Systems Interconnect (OSI) model is the most often taught model of data transmission—although it is not all that useful in terms of describing how modern networks work. What many engineers who have come into network engineering more recently do not know is there was an entire protocol suite that went with the OSI model. Each of the layers within the OSI model, in fact, had multiple protocols specified to fill the functions of that layer. For instance, X.25, while older than the OSI model, was adopted into the OSI suite to provide point-to-point connectivity over some specific kinds of physical circuits. Moving up the stack a little, there were several protocols that provided much the same service as the widely used Internet Protocol (IP).
The Connection Oriented Network Service, or CONS, ran on top of the Connection Oriented Network Protocol, or CONP (which is X.25). Using CONP and CONS, a pair of hosts can set up a virtual circuit. Perhaps the closest analogy available in the world of networks today would be an MPLS Label Switched Path (LSP). Another protocol, the Connectionless Network Service, or CLNS, ran on top of the Connectionless Network Protocol, or CLNP. A series of Transport Protocols ran on top of CLNS (these might also be described as modes of CLNS in some sense). Together, CLNS and these transport protocols provided a set of services similar to IP, the Transmission Control Protocol (TCP), and the User Datagram Protocol (UDP)—or perhaps even closer to something like QUIC.
The routing protocol that held the network together by discovering the network topology, carrying reachability, and calculating loop free paths was the venerable Intermediate System to Intermediate System (IS-IS) protocol, which is still in wide use today. The OSI protocol suite had some interesting characteristics.
For instance, each host had a single address, rather than each interface. The host address was calculated automatically based on a local Media Access Control (MAC) address, combined with other bits of information that were either learned, assumed, or locally configured. A single host could have multiple addresses, using each one to communicate to the different networks it might be connected to, or even being used to contact hosts in different domains. The Intermediate System (IS) played a vital role in the process of address calculation and routing; essentially routing ran all the way down to the host level, which allowed smart calculation of next hops. While a default router could be configured in some implementations, it was not required, as the hosts participated in routing.
A tidbit of interesting history—one of the earliest uses of address families in protocols, and one of the main drivers of the Type-Length-Vector encoding of the IS-IS routing protocol, were driven by the desire to run CLNS and TCP/IP side-by-side on a single network. ISO-IGRP was one of the first multi-protocol routing protocols to use a concept similar to address families. Some of the initial work in BGP address families was also done to support CLNS routing, as the OSI stack did not specify an interdomain routing protocol.
Why is this interesting protocol not in wide use today? There are many reasons—probably every engineer who ever worked on both can give you a different set of reasons, in fact. From my perspective, however, there are a few basic reasons why TCP/IP “won” over the CLNS suite of protocols.
First, the addressing used with CLNS was too complex. There were spots in the address for the assigning authority, the company, subdivisions within the company, flooding domains, and other elements. While this was all very neat, it assumed either one of two things: that network topologies would be built roughly parallel to the organizational chart (starting from governmental entities), or that the topology of the network and addressing did not need to be related to one another. This flies in the face of Yaakov’s rule: either the topology must follow the addressing, or the addressing must follow the topology, if you expect the network to scale.
In the later years of CLNS, this entire mess was replaced by people just using the “private” organizational information to build internal networks, and assuming these networks would never be interconnected (much like using private IPv4 address space). Sometimes this worked, of course. Sometimes it did not.
This addressing scheme left users with the impression, true or false, that CLNS implementations had to be carefully planned. Numbers must be procured, the organizational structure must be consulted, etc. IP, on the other hand, was something you could throw out there and play with. You could get it all working, and plan later, or change you plans. Well, in theory, at least—things never seem to work out in real life the way they are supposed to.
Second, the protocol stack itself was too complex. Rather than solving a series of small problems, and fitting the solutions in a sort of ad-hoc layered set of protocols, the designers of CLNS carefully thought through every possible problem, and considered every possible solution. At least they thought they did, anyway. All this thinking, however, left the impression, again, that to deploy this protocol stack you had to carefully think about what you were about. Further, it was difficult to change things in the protocol stack when problems were found, or new use cases—that had not been thought of—were discovered.
It is better, most of the time, to build small, compact units that fit together, than it is to build a detailed grand architecture. The network engineering world tends to oscillate between the two extremes of no planning or all planned; rarely do we get the temperature of the soup (or porridge) just right.
While CLNS is not around today, it is hard to call the protocol stack a failure. CLNS was widely deployed in large scale electrical networks; some of these networks may well be in use today. Further, its effects are everywhere. For instance, IS-IS is still a widely deployed protocol. In fact, because of its heritage of multiprotocol design from the beginning, it is arguably one of the easiest link state protocols to work with. Another example is the multiprotocol work that has carried over into other protocols, such as BGP. The ideas of a protocol running between the host and the router and autoconfiguration of addresses, have come back in very similar forms in IPv6, as well.
CLNS is another one of those designs that shape the thinking of network engineers today, even if network engineers don’t even know it existed.
The Hedge 44: Pete Lumbis and Open Source
Open source software is everywhere, it seems—and yet it’s nowhere at the same time. Everyone is talking about it, but how many people and organizations are actually using it? Pete Lumbis at NVIDIA joins Tom Ammon and Russ White to discuss the many uses and meanings of open source software in the networking world.
On the ‘net: Routing in DC Fabrics
Hosts Roopa Prabhu and Pete Lumbis are joined by a special guest to the podcast, Russ White! The group come together virtually to discuss what we should think about when it comes to routing protocols in the datcenter. What are the tradeoffs when using traditional protocols like OSPF or BGP? What about new protocols like RIFT or a hybrid approach with things like BGP-link state? Spoiler alert: it depends.
Understanding Internet Peering
The world of provider interconnection is a little … “mysterious” … even to those who work at transit providers. The decision of who to peer with, whether such peering should be paid, settlement-free, open, and where to peer is often cordoned off into a separate team (or set of teams) that don’t seem to leak a lot of information. A recent paper on current interconnection practices published in ACM SIGCOMM sheds some useful light into this corner of the Internet, and hence is useful for those just trying to understand how the Internet really works.
To write the paper, the authors sent requests to fill out a survey through a wide variety of places, including NOG mailing lists and blogs. They ended up receiving responses from all seven regions (based on the RIRs, who control and maintain Internet numbering resources like AS numbers and IP addresses), 70% from ISPs, 14% from content providers, and 7% from “Enterprise” and infrastructure operators. Each of these kinds of operators will have different interconnection needs—I would expect ISPs to engage in more settlement-free peering (with roughly equal traffic levels), content providers to engage in more open (settlement-free connections with unequal traffic levels), IXs to do mostly local peering (not between regions), and “enterprises” to engage mostly in paid peering. The survey also classified respondents by their regional footprint (how many regions they operate in) and size (how many customers they support).
The survey focused on three facets of interconnection: time required to form a connection, the reasons given for interconnecting, and parameters included in the peering agreement. These largely describe the status quo in peering—interconnections as they are practiced today. As might be expected, connections at IXs are the quickest to form. Since IXs are normally set up to enable peering; it makes sense that the preset processes and communications channels enabled by an IX would make the peering process a lot faster. According to the survey results, the most common timeframe to complete peering is days, with about a quarter taking weeks.
Apparently, the vast majority (99%!) of peering arrangements are by “handshake,” which means there is no legal contract behind them. This is one reason Network Operator Groups (NOGs) are so important (a topic of discussion in the Hedge 31, dropping next week); the peering workshops are vital in building and keeping the relationships behind most peering arrangements.
On-demand connectivity is a new trend in inter-AS peering. For instance, interxion recently worked with LINX and several other IXs to develop a standard set of APIs allowing operators to peer with one another in a standard way, often reducing the technical side of the peering process to minutes rather than hours (or even days). Companies are moving into this space, helping operators understand who they should peer with, and building pre-negotiated peering contracts with many operators. While current operators seem to be aware of these options, they do not seem to be using these kinds of services yet.
While this paper is interesting, it does leave many corners of the inter-AS peering world un-exposed. For instance—I would like to know how correct my assumptions are about the kinds of peering used by each of the different classes of providers is, and whether there are regional differences in the kinds of peering. While its interesting to survey the reasons providers pursue peering, it would be interesting to understand the process of making a peering determination more fully. What kinds of tools are available, and how are they used? These would be useful bits of information for an operator who only connects to the Internet, rather than being part of the Internet infrastructure (perhaps a “non-infrastructure operator,” rather than “enterprise”) in understanding how their choice of upstream provider can impact the performance of their applications and network.
Note: this is another useful, but slightly older, paper on the topic of peering.
History of Networking: Adrian Farrel and PCE
Path Computation Element (PCE) is designed to allow the computation of paths for MPLS and GMPLS Point to Point and
Point to Multi-point Traffic Engineered LSPs. Adrian Farrel, who was involved in the early work on designing an specifying PCE, joins us in this episode of the History of Networking to describe the purposes, process, and challenges involved. You can read more about Adrian on his personal home page, and about PCE on the IETF WG page.