Stop Using the OSI Model

We all use the OSI model to describe the way networks work. I have, in fact, included it in just about every presentation, and every book I have written, someplace in the fundamentals of networking. But if you have ever looked at the OSI model and scratched your head trying to figure out how it really fits with the networks we operate today, or what it is telling you in terms of troubleshooting, design, or operation—you are not alone; lots of people have. There is a reason this is so difficult to figure out.

The OSI Model does not accurately describe networks.

What set me off in this particular direction this week is an article over at Errata Security:

The OSI Model was created by international standards organization for an alternative internet that was too complicated to ever work, and which never worked, and which never came to pass. Sure, when they created the OSI Model, the Internet layered model already existed, so they made sure to include today’s Internet as part of their model. But the focus and intent of the OSI’s efforts was on dumb networking concepts that worked differently from the Internet.

This is partly true, and yet a bit … over the top. 🙂 OTOH, the point is well taken: the OSI model is not an ideal model for understanding networks. Maybe a bit of analysis would be helpful in understanding why.

First, while the OSI model was developed with packet switching networks in mind, the general idea was to come as close as possible to emulating the circuit-switched networks widely deployed at the time. A lot of thought had gone into making those circuit-switched networks work, and applications had been built around the way they worked. Applications and circuit-switched networks formed a sort of symbiotic relationship, just as applications form with packet-switched networks today; it was unimaginable, at the time, that “everything would change.”

So while the designers of the OSI model understood the basic value of the packet-switched network, they also understood the value of the circuit-switched network, and tried to find a way to solve both sets of problems in the same network. Experience has shown it is possible to build a somewhat close-to-circuit switched network on top of packet switched networks, but not quite in the way, nor as close to perfect emulation, as those original designers thought. So the OSI model is a bit complex and perhaps overspecified, making it less-than-useful today.

Second, the OSI model largely ignored the role of middleboxes, focusing instead on the stacks implemented and deployed in hosts. This, again, makes sense, as there was no such thing as a device specialized in the switching of packets at the time. Hosts took packets in and processed them. Some packets were sent along to other hosts, other packets were consumed locally. Think PDP-11 with some rough code, rather than even an early Cisco CGS.

Third, the OSI model focuses on what each layer does from the perspective of an application, rather than focusing on what is being done to the data in order to transmit it. The OSI model is built “top down,” rather than “bottom up,” in other words. While this might be really useful if you are an application developer, it is not so useful if you are a network engineer.

So—what should we say about the OSI model?

It was much more useful at some point in the past, when networking was really just “something a host did,” rather than its own sort of sub-field, with specialized protocols, techniques, and designs. It was a very good attempt at sorting out what a network needed to do to move traffic, from the perspective of an application.

What it is not, however, is all that useful for network engineers working within an engineering specialty who need to understand how to design protocols, and how to design the networks those protocols will run on. What should we replace it with? I would begin by pointing you to the RINA model, which I think is a better place to start. I’ve written a bit about the RINA model, and used it as one of the foundational pieces of Computer Networking Problems and Solutions.

Since writing that, however, I have been thinking further about this problem. Over the next six months or so, I plan to build a course around this question. For the moment, I don’t want to spoil the fun, or put any half-baked thoughts out there in the wild.

DNS Query Minimization and Data Leaks

When a recursive resolver receives a query from a host, it will first consult any local cache to discover if it has the information required to resolve the query. If it does not, it will begin with the rightmost section of the domain name, the Top Level Domain (TLD), moving left through each section of the Fully Qualified Domain Name (FQDN), in order to find an IP address to return to the host, as shown in the diagram below.
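As a toy illustration of that walk—the referral table, server names, and address below are all invented, and a real resolver builds and parses DNS messages rather than doing dictionary lookups—the process looks roughly like this:

```python
# A minimal, self-contained sketch of the iterative walk a recursive resolver
# performs on a cache miss. The referral data and address are made up for
# illustration only.

# Hypothetical referral data: which server answers for which zone.
REFERRALS = {
    "root":        {"example": "tld-server"},
    "tld-server":  {"banana.example": "auth-server"},
    "auth-server": {"www.banana.example": "192.0.2.10"},   # final answer
}

def query(server, fqdn):
    """Return (answer, next_server); only the authoritative server answers."""
    for name, target in REFERRALS[server].items():
        if fqdn.endswith(name):
            if target in REFERRALS:          # another referral: keep walking
                return None, target
            return target, None              # an address: final answer
    return None, None

def resolve(fqdn, cache):
    if fqdn in cache:                        # local cache hit: no external queries
        return cache[fqdn]
    server = "root"                          # start at the root, walk downward
    while server:
        answer, server = query(server, fqdn) # root -> TLD -> authoritative
        if answer:
            cache[fqdn] = answer
            return answer
    return None

print(resolve("www.banana.example", {}))     # -> 192.0.2.10
```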

This is pretty simple at its most basic level, of course—virtually every network engineer in the world understands this process (and if you don’t, you should enroll in my How the Internet Really Works webinar the next time it is offered!). The question almost no-one ever asks, however, is: what, precisely, is the recursive server sending to the root, TLD, and authoritative servers?

Begin with the perspective of a coder who is developing the code for that recursive server. You receive a query from a host, you have the code check the local cache, and you find there is no matching information available locally. This means you need to send a query out to some other server to determine the correct IP address to return to the host. You could keep a copy of the query from the host in your local cache and build a new query to send to the root server.

Remember, however, that local server resources may be scarce; recursive servers must be optimized to process very high query rates very quickly. Much of the user’s perception of network performance is actually tied to DNS performance. A second option is you could save local memory and processing power by sending the entire query, as you have received it, on to the root server. This way, you do not need to build a new query packet to send to the root server.

Consider this process, however, in the case of a query for a local, internal resource you would rather not let the world know exists. The recursive server, by sending the entire query to the root server, is also sending information about the internal DNS structure and potential internal server names to the external root server. As the FQDN is resolved (or not), this same information is sent to the TLD and authoritative servers, as well.

There is something else contained here, however, that is not so obvious—the IP address of the requestor is contained in that original query, as well. Not only is your internal namespace leaking, your internal IP addresses are leaking, as well.

This is not only a massive security hole for your organization, it also exposes information from individual users on the global ‘net.

There are several things that can be done to resolve this problem. Organizationally, running a private DNS server, hard coding resolving servers for internal domains, and using internal domains that are not part of the existing TLD infrastructure can go a long way towards preventing information leakage of this kind through DNS. Operating a DNS server internally might not be ideal, of course, although DNS services are integrated into a lot of other directory services used in operational networks. If you are using a local DNS server, it is important to remember to configure DHCP and/or IPv6 ND to send the correct, internal, DNS server address, rather than an external address. It is also important to either block or redirect DNS queries sent to public servers by hosts using hard-coded DNS server configurations.

A second line of defense is DNS query minimization. Described in RFC 7816, query name (QNAME) minimization directs recursive servers to ask each upstream server about only the part of the FQDN it needs to answer. For instance, if the recursive server receives a query for www.banana.example, the server should request information about .example from the root server, banana.example from the TLD server, and send the full requested domain name only to the authoritative server. This way, the full search is not exposed to the intermediate servers, protecting user information.
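A rough sketch of the difference—purely illustrative, with made-up stage names rather than real DNS messages—might look like this:

```python
# Contrast between forwarding the full query name and RFC 7816-style query
# name minimization. The output is illustrative only; real resolvers build
# DNS messages, not strings.

def full_queries(fqdn):
    """Naive behavior: the entire FQDN is sent to every server in the chain."""
    return [("root", fqdn), ("TLD", fqdn), ("authoritative", fqdn)]

def minimized_queries(fqdn):
    """QNAME minimization: each server sees only the labels it needs."""
    labels = fqdn.rstrip(".").split(".")       # ["www", "banana", "example"]
    stages = ["root", "TLD", "authoritative"]
    out = []
    for i, stage in enumerate(stages, start=1):
        out.append((stage, ".".join(labels[-i:])))   # rightmost i labels only
    return out

print(full_queries("www.banana.example"))
# [('root', 'www.banana.example'), ('TLD', 'www.banana.example'),
#  ('authoritative', 'www.banana.example')]
print(minimized_queries("www.banana.example"))
# [('root', 'example'), ('TLD', 'banana.example'),
#  ('authoritative', 'www.banana.example')]
```

The root and TLD servers never see the www.banana.example label—only as much of the name as they need to hand out a referral.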

Some recursive server implementations already support QNAME minimization. If you are running a server for internal use, you should ensure the server you are using supports DNS query minimization. If you are directing your personal computer or device to publicly reachable recursive servers, you should investigate whether these servers support DNS query minimization.

Even with DNS query minimization, your recursive server still knows a lot about what you ask for—the topic of discussion on a forthcoming episode of the Hedge, where our guest will be Geoff Huston.

Lessons Learned from the Robustness Principle

The Internet, and networking protocols more broadly, were grounded in a few simple principles. For instance, there is the end-to-end principle, which argues the network should be a simple fat pipe that does not modify data in transit. Many of these principles have tradeoffs—if you haven’t found the tradeoffs, you haven’t looked hard enough—and not looking for them can result in massive failures at the network and protocol level.

Another principle networking is grounded in is the Robustness Principle, which states: “Be liberal in what you accept, and conservative in what you send.” In protocol design and implementation, this means you should accept the widest range of inputs possible without negative consequences. A recent draft, however, challenges the robustness principle—draft-iab-protocol-maintenance.

According to the authors, the basic premise of the robustness principle lies in the problem of updating older software for new features or fixes at the scale of an Internet-sized network. The general idea is a protocol designer can set aside some “reserved bits,” using them in a later version of the protocol, and not worry about older implementations misinterpreting them—new meanings of old reserved bits will be silently ignored. In a world where even a very old operating system, such as Windows XP, is still widely used, and people complain endlessly about forced updates, it seems like the robustness principle is on solid ground in this regard.
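As a toy example of what this looks like in an implementation—the flag layout and names here are entirely invented—an older receiver simply masks off the bits it does not understand:

```python
# A minimal sketch of the robustness principle applied to reserved bits in a
# header: an older implementation masks off the bits it does not understand
# and carries on. The flag layout is invented for illustration.

KNOWN_FLAGS = 0b0000_0111        # bits this (older) implementation understands

def parse_flags(flag_byte):
    """Accept any input; silently ignore bits reserved for future versions."""
    understood = flag_byte & KNOWN_FLAGS
    # Reserved bits (flag_byte & ~KNOWN_FLAGS) are dropped without error, so a
    # newer sender can assign them meanings without breaking old receivers.
    return {
        "urgent":     bool(understood & 0b001),
        "compressed": bool(understood & 0b010),
        "encrypted":  bool(understood & 0b100),
    }

print(parse_flags(0b1010_0011))  # high bits set by a newer version are ignored
```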

The argument against this in the draft is that implementing the robustness principle allows a protocol to degenerate over time. Older implementations are not removed from service because they still work, implementations are not updated in a timely manner, and the protocol tends to have an ever-increasing amount of “dead code” in the form of older expressions of data formats. Given an infinite amount of time, an infinite number of versions of any given protocol will be deployed. As a result, the protocol can and will break in an infinite number of ways.

The logic of the draft is something along the lines of: old ways of doing things should be removed from protocols which are actively maintained in order to unify and simplify the protocol. At least for actively maintained protocols, reliance on the robustness principle should be toned down a little.

Given the long list of examples in the draft, the authors make a good case.

There is another side to the argument, however. The robustness principle is not “just” about keeping older versions of software working “at scale.” All implementations, no matter how good their quality, have defects (or rather, unintended features). Many of these defects involve failing to release or initialize memory, failing to bounds check inputs, and other similar oversights. A common way to find these errors is to fuzz test code—throw lots of different inputs at it to see if it spits up an error or crash.
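A minimal sketch of the idea—the toy parser and its deliberate missing bounds check are invented for illustration, and real fuzzers such as AFL or libFuzzer are far more sophisticated about generating inputs:

```python
# Throw random byte strings at a parser and record anything that raises.
import os
import random

def parse_header(data: bytes):
    """Toy parser with a deliberate defect: no bounds check on short input."""
    version = data[0] >> 4
    length = data[1]          # IndexError if the input is shorter than 2 bytes
    return version, length

def fuzz(parser, rounds=10_000):
    failures = []
    for _ in range(rounds):
        blob = os.urandom(random.randint(0, 4))   # tiny random inputs
        try:
            parser(blob)
        except Exception as exc:                  # crash found: record, keep going
            failures.append((blob, repr(exc)))
    return failures

bad = fuzz(parse_header)
print(f"{len(bad)} crashing inputs found")        # empty and 1-byte blobs crash
```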

The robustness principle runs deeper than infinite versions—it also helps implementations deal with defects in “the other end” that generate bad data. The robustness principle, then, can help keep a network running even in the face of an implementation defect.

Where does this leave us? Abandoning the robustness principle is clearly not a good thing—while the network might end up being more correct, it might also end up simply not running. Ever. The Internet is an interlocking system of protocols, hardware, and software; the robustness principle is the lubricant that makes it all work at all.

Clearly, then, there must be some sort of compromise position that will work. Perhaps a two-pronged attack is in order. First, don’t discard errors silently. Instead, build logging into software that catches all errors, regardless of how trivial they might seem. This will generate a lot of data, but we need to be clear on the difference between instrumenting something and actually paying attention to what is instrumented. Instrumenting code so that “unknown input” can be caught and logged periodically is not a bad thing.
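A small sketch of what that might look like—the field names and the logging policy here are invented, not taken from any particular implementation:

```python
# Unknown or malformed fields are still tolerated, but every occurrence is
# counted and logged so the operator can see the protocol drifting.
import logging
from collections import Counter

log = logging.getLogger("proto")
unknown_seen = Counter()

KNOWN_FIELDS = {"origin", "next_hop", "med"}     # invented field names

def accept_update(fields: dict):
    """Be liberal in what you accept, but instrument what you did not expect."""
    cleaned = {}
    for key, value in fields.items():
        if key in KNOWN_FIELDS:
            cleaned[key] = value
        else:
            unknown_seen[key] += 1
            # Log periodically rather than per-packet to keep the noise down.
            if unknown_seen[key] % 1000 == 1:
                log.warning("unknown field %r seen %d times",
                            key, unknown_seen[key])
    return cleaned

accept_update({"origin": "igp", "next_hop": "192.0.2.1", "wide_community": 7})
```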

Second, perhaps protocols need some way to end-of-life older versions. Part of the problem with the robustness principle is it allows an infinite number of versions of a single protocol to exist in the same network. Perhaps the IETF and other standards organizations should rethink this, explicitly taking older ways of doing things out of specs on a periodic basis. A draft that says “you shouldn’t do this any longer,” or “this is no longer in accordance with the specification,” would not be a bad thing.

For the more “average” network engineer, this discussion around the robustness principle should lead to some important lessons, particularly as we move ever more deeply into an automated world. Be clear about versioning of APIs and components. Deprecate older processes when they should no longer be used.
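As a hypothetical sketch of what this looks like in an automation context—the version numbers and policy here are made up for illustration:

```python
# Explicit versioning and deprecation for an internal automation API: old
# versions are rejected loudly instead of lingering forever.
import warnings

SUPPORTED = {3, 4}          # versions actively maintained
DEPRECATED = {2}            # still accepted, but on notice
RETIRED = {1}               # end-of-lifed: refuse, do not silently "work"

def handle_request(version: int, payload: dict):
    if version in RETIRED:
        raise ValueError(f"API v{version} was retired; upgrade the client")
    if version in DEPRECATED:
        warnings.warn(f"API v{version} is deprecated and will be removed",
                      DeprecationWarning)
    if version not in SUPPORTED | DEPRECATED:
        raise ValueError(f"unknown API version {version}")
    return {"status": "ok", "echo": payload}

print(handle_request(3, {"device": "spine1"}))   # current version: fine
print(handle_request(2, {"device": "spine1"}))   # works, but warns loudly
```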

Control your technical debt, or it will control you.

Disaggregation and Business Value

I recently spoke at CHINOG on the business value of disaggregation, and participated in a panel on getting involved in the IETF. If you’re interested in these two talks, the videos are linked below.

Research: Legal Barriers to RPKI Deployment

Much like most other problems in technology, securing the reachability (routing) information in the internet core is as much or more of a people problem than it is a technology problem. While BGP security can never be perfect (in an imperfect world, the quest for perfection is often the cause of a good solution’s failure), there are several solutions which could be used to provide the information network operators need to determine whether or not they can trust a particular piece of routing information—for instance, graph overlays for path validation, or the RPKI system for origin validation. Solving the technical problem, however, only carries us a small way towards “solving the problem.”

One of the many ramifications of deploying a new system—one we do not often think about from a purely technology perspective—is the legal ramifications. Assume, for a moment, that some authority were to publicly validate that some address, such as 2001:db8:3e8:1210::/64, belongs to a particular entity, say bigbank, and that the AS number of this same entity is 65000. On receiving an update from a BGP peer, if you note the route to x:1210::/64 ends in AS 65000, you might think you are safe in using this path to reach destinations located in bigbank’s network.
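Setting the legal questions aside for a moment, the origin-validation check itself is mechanically simple. A minimal sketch, using an invented (and assumed already-validated) ROA table, might look like this:

```python
# Compare a received route's origin AS against a table of hypothetical,
# already-validated ROAs. Real validators also walk a certificate chain back
# to a trust anchor; none of that is shown here.
from ipaddress import ip_network

# Hypothetical validated ROA data: prefix -> (origin AS, max prefix length)
ROAS = {
    ip_network("2001:db8:3e8:1210::/64"): (65000, 64),
}

def origin_state(prefix: str, origin_as: int) -> str:
    """Return 'valid', 'invalid', or 'not-found' for a received route."""
    net = ip_network(prefix)
    covering = [(roa, data) for roa, data in ROAS.items()
                if net.subnet_of(roa)]
    if not covering:
        return "not-found"                   # no ROA covers this prefix
    for roa, (roa_as, max_len) in covering:
        if origin_as == roa_as and net.prefixlen <= max_len:
            return "valid"
    return "invalid"                         # covered, but origin or length wrong

print(origin_state("2001:db8:3e8:1210::/64", 65000))   # valid
print(origin_state("2001:db8:3e8:1210::/64", 64512))   # invalid (wrong origin)
```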

What if the route has been hijacked? What if the validator is wrong, and has misidentified—or been fooled into misidentifying—the connection between AS65000 and the x:1210::/64 route? What if, based on this information, critical financial information is transmitted to an end point which ultimately turns out to be an attacker, and this attacker uses this falsified routing information to steal millions (or billions) of dollars?

Yoo, Christopher S., and David A. Wishnick. 2019. “Lowering Legal Barriers to RPKI Adoption.” SSRN Scholarly Paper ID 3308619. Rochester, NY: Social Science Research Network. https://papers.ssrn.com/abstract=3309813.

Who is responsible? This legal question ultimately plays into the way numbering authorities allow the certificates they issue to be used. Numbering authorities—specifically ARIN, which is responsible for numbering throughout North America—do not want the RPKI data misused in a way that can leave them legally responsible for the results. Some background is helpful.

The RPKI data, in each region, is stored in a database; each RPKI object (essentially and loosely) contains an origin AS/IP address pair. These are signed using a private key and can be validated using the matching public key. Somehow the public key itself must be validated; ultimately, there is a chain, or hierarchy, of trust, leading to some sort of root. The trust anchor is described in a file called the Trust Anchor Locator, or TAL. ARIN wraps access to their TAL in a strong indemnification clause to protect themselves from the sort of situation described above (and others). Many companies, particularly in the United States, will not accept the legal contract involved without a thorough investigation of their own culpability in any given situation involving misrouting traffic, which ultimately means many companies will simply not use the data, and RPKI is not deployed.

The essential point the paper makes is: is this clause really necessary? The authors make several arguments towards removing the strict legal requirements around the use of the data in the TAL provided by ARIN. First, they argue the bounds of potential liability are uncertain, and will shift as the RPKI is more widely deployed. Second, they argue the situations where harm can come from use of the RPKI data need to be more carefully framed and understood, along with how these kinds of legal issues have been treated in the past. To this end, the authors argue strict liability is not likely to be raised, and negligence liability can probably be mitigated. They offer an alternative mechanism using straight contract law to limit the liability to ARIN in situations where the RPKI data is misused or incorrect.

Whether this paper causes ARIN to rethink its legal position or not is yet to be seen. At the same time, while these kinds of discussions often leave network engineers flat-out bored, the implications for the Internet are important. This is an excellent example of an intersection between technology and policy, a realm network operators and engineers need to pay more attention to.

Research: BGP Routers and Parrots

The BGP specification suggests implementations should have three tables: the adj-rib-in, the loc-rib, and the adj-rib-out. The first of these three tables should contain the routes (NLRIs and attributes) transmitted by each of the speaker’s peers. The second table should contain the calculated best paths; these are the routes that will be (or are) installed in the local routing table and used to build a forwarding table. The third table contains the routes which have been sent to each peering speaker. Why three tables? Routing protocol standards are (sometimes—not always) written to make how the protocol works as clear as possible to someone who is writing an implementation. Not every table or process described in the specification is implemented, or implemented the way it is described.
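As a rough sketch—with heavily simplified attributes and a toy best-path decision, not anything resembling a production implementation—the three tables and the way the adj-rib-out suppresses repeated advertisements might be kept like this:

```python
# The three tables the BGP specification describes, kept as plain dictionaries.
class BgpSpeaker:
    def __init__(self):
        self.adj_rib_in = {}     # peer -> {prefix: attributes as received}
        self.loc_rib = {}        # prefix -> best path selected locally
        self.adj_rib_out = {}    # peer -> {prefix: attributes as sent}

    def receive(self, peer, prefix, attrs):
        self.adj_rib_in.setdefault(peer, {})[prefix] = attrs
        self.decide(prefix)

    def decide(self, prefix):
        candidates = [routes[prefix] for routes in self.adj_rib_in.values()
                      if prefix in routes]
        if candidates:           # toy bestpath: shortest AS path wins
            self.loc_rib[prefix] = min(candidates,
                                       key=lambda a: len(a["as_path"]))

    def advertise(self, peer, prefix):
        best = self.loc_rib.get(prefix)
        sent = self.adj_rib_out.setdefault(peer, {})
        if best is None or sent.get(prefix) == best:
            return None          # nothing new to tell this peer
        sent[prefix] = best      # record what was sent
        return ("UPDATE", prefix, best)

c = BgpSpeaker()
c.receive("A", "203.0.113.0/24", {"as_path": (65001,), "origin": "igp"})
print(c.advertise("D", "203.0.113.0/24"))   # first advertisement goes out
print(c.advertise("D", "203.0.113.0/24"))   # duplicate suppressed: None
```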

What happens when you implement things in a different way than the specification describes? In the case of BGP and the three RIBs, you can get duplicated BGP updates. The paper “What do parrots and BGP routers have in common?” describes two situations where the lack of an adj-rib-out can cause duplicate BGP updates to be sent.

David Hauweele, Bruno Quoitin, Cristel Pelsser, and Randy Bush. 2016. “What Do Parrots and BGP Routers Have in Common?” Computer Communications Review, July. http://ccracmsigcomm.info.ucl.ac.be/wp-content/uploads/2016/07/sigcomm-ccr-paper26.pdf.

The authors of this paper begin by observing BGP updates from a full feed off the default free zone. The configuration of the network, however, is designed to provide not only the feed from a BGP speaker, but also the routes received by a BGP speaker, as shown in the illustration below.

In this figure, all the labeled routers are in separate BGP autonomous systems, and the links represent physical connections as well as eBGP sessions. The three BGP updates received by D are stored in three different logs which are time stamped so they can be correlated. The researchers found two instances where duplicate BGP updates were received at D.

In the first case, the best path at C switches between A and B because of the Multi-Exit Discriminator (MED), but the remainder of the update remains the same. C, however, strips the MED before transmitting the route to D, so D simply sees what appear to be duplicate updates. In the second case, the next hop changes because of an implicit withdraw based on a route change for the previous best path. For instance, C might choose A as the best path, but then A implicitly withdraws its path, leaving the path through B as the best. When this occurs, C recalculates the best path and sends it to D; since C rewrites the next hop to itself when advertising the route to D, the new update appears to be a duplicate at D.

In both of these cases, if C had an adj-rib-out, it would find the duplicate advertisement and squash it. However, since C has no record of what it has sent to D in the past, it must send information about all local best path changes to D. While this might seem like a trivial amount of processing, these additional updates can add enough load during link flap situations to make a material difference in processor utilization or speed of convergence.

Why do implementors decide not to include an adj-rib-out in their implementations, or why, when one is provided, do operators disable the adj-rib-out? Primarily because the adj-rib-out consumes local memory; it is cheaper to push the work to a peer than it is to keep local state that might only rarely be used. This is a classic case of reducing the complexity of the local implementation by pushing additional state (and hence complexity) into the overall system. The authors of the paper suggest a better balance might be achieved if implementations kept a small cache of the most recent updates transmitted to an adjacent speaker; this would allow the implementation to reduce memory usage, while also allowing it to prevent repeating recent updates.
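A minimal sketch along the lines of that suggestion—the cache size and update representation here are invented, and a real implementation would hang something like this off each peer structure:

```python
# Instead of a full adj-rib-out, keep a small, bounded cache of the most
# recent updates sent to a peer and skip anything that matches.
from collections import OrderedDict

CACHE_SIZE = 128    # per peer; small enough to be cheap, big enough for flaps

class RecentUpdateCache:
    def __init__(self, size=CACHE_SIZE):
        self.size = size
        self.recent = OrderedDict()    # (prefix, attrs) keys in LRU order

    def should_send(self, prefix, attrs):
        key = (prefix, tuple(sorted(attrs.items())))
        if key in self.recent:         # identical update sent recently: squash
            self.recent.move_to_end(key)
            return False
        self.recent[key] = None
        if len(self.recent) > self.size:
            self.recent.popitem(last=False)   # evict the oldest entry
        return True

cache = RecentUpdateCache()
update = ("10.0.0.0/24", {"as_path": (65000, 65001), "origin": "igp"})
print(cache.should_send(*update))   # True: first time, send it
print(cache.should_send(*update))   # False: duplicate squashed
```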

Ossification and Fragmentation: The Once and Future ‘net

Mostafa Ammar, out of Georgia Tech (not my alma mater, but many of my engineering family are alumni there), recently posted an interesting paper titled The Service-Infrastructure Cycle, Ossification, and the Fragmentation of the Internet. I have argued elsewhere that we are seeing the fragmentation of the global Internet into multiple smaller pieces, primarily based on the centralization of content hosting combined with the rational economic decisions of the large-scale hosting services. The paper in hand takes a slightly different path to reach the same conclusion.

cross posted at CircleID

TL;DR

  • Networks are built based on a cycle of infrastructure modifications to support services
  • When new services are added, pressure builds to redesign the network to support these new services
  • Networks can ossify over time so they cannot be easily modified to support new services
  • This causes pressure, and eventually a more radical change, such as the fracturing of the network

The author begins by noting networks are designed to provide a set of services. Each design paradigm not only supports the services it was designed for, but also allows for some headroom, which allows users to deploy new, unanticipated services. Over time, as newer services are deployed, the requirements on the network change enough that the network must be redesigned.

This cycle, the service-infrastructure cycle, relies on a well-known process of deploying something that is “good enough,” which allows early feedback on what does and does not work, followed by quick refinement until the protocols and general design can support the services placed on the network. As an example, the author cites the deployment of unicast routing protocols. He marks the beginning of this process as 1962, when Prosser was first deployed, and the end as 1995, when BGPv4 was deployed. Across this time routing protocols were invented, deployed, and revised rapidly. Since around 1995, however—a period of over 20 years at this point—routing has not changed all that much. So there were around 35 years of rapid development, followed by what is now over 20 years of stability in the routing realm.

Ossification, for those not familiar with the term, is a form of hardening. Petrified wood is an ossified form of wood. An interesting property of petrified wood is that it is fragile; if you pound a piece of “natural” wood with a hammer, it dents, but does not shatter. Petrified, or ossified, wood shatters, like glass.

Multicast routing is held up as an opposite example. Based on experience with unicast routing, the designers of multicast attempted to “anticipate” the use cases, such that early iterations were clumsy, and failed to attain the kinds of deployment required to get the cycle of infrastructure and services started. Hence multicast routing has largely failed. In other words, multicast ossified too soon; the cycle of experience and experiment was cut short by the designers trying to anticipate use cases, rather than allowing them to grow over time.

Some further examples might be:

  • IETF drafts and RFCs were once short, and used few technical terms, in the sense of a term defined explicitly within the context of the RFC or system. Today RFCs are veritable books, and require a small dictionary to read.
  • BGP security, which is mentioned by the author as a victim of ossification, is actually another example of early ossification destroying the experiment/enhancement cycle. Early on, a group of researchers devised the “perfect” BGP security system (which is actually by no means perfect—it causes as many security problems as it resolves), and refused to budge once “perfection” had been reached. For the last twenty years, BGP security has not notably improved; the cycle of trying and changing things has been stopped this entire time.

There are weaknesses in this argument, as well. It can be argued that the reason for the failure of widespread multicast is that the content just wasn’t there when multicast was first considered—in fact, that multicast content still is not what people really want. The first “killer app” for multicast was replacing broadcast television over the Internet. What has developed instead is video on demand; multicast is just not compelling when everyone is watching something different whenever they want to.

The solution to this problem is novel: break the Internet up. Or rather, allow it to break up. The creation of a single network from many networks was a major milestone in the world of networking, allowing the open creation of new applications. If the Internet were not ossified through business relationships and the impossibility of making major changes in the protocols and infrastructure, it would be possible to undertake radical changes to support new challenges.

The new challenges offered include IoT, the need for content providers to have greater control over the quality of data transmission, and the unique service demands of new applications, particularly gaming. The result has been the flattening of the Internet, followed by the emergence of bypass networks—ultimately leading to the fragmentation of the Internet into many different networks.

Is the author correct? It seems the Internet is, in fact, becoming a group of networks loosely connected through IXPs and some transit providers. What will the impact be on network engineers? One likely result is deeper specialization in sets of technologies—the “enterprise/provider” divide that had almost disappeared in the last ten years may well show up as a divide between different kinds of providers. For operators who run a network that indirectly supports some other business goal (what we might call “enterprise”), the result will be a wide array of different ways of thinking about networks, and an expansion of technologies.

But one lesson engineers can certainly take away is this: the concept of agile must reach beyond the coding realm, and into the networking realm. There must be room “built in” to experiment, deploy, and enhance technologies over time. This means accepting and managing risk rather than avoiding it, and having a deeper understanding of how networks work and why they work that way, rather than the blind focus on configuration and deployment we currently teach.