CAP Theorem and Routing

In 2000, Eric Brewer was studying and discussing the behavior of database systems. Through this work, he observed that a database generally has three characteristics:

    • Consistency, which means the database will produce the same result for any two readers who happen to read a value at the same moment in time.
    • Availability, which means the database can be read by any reader at any moment in time.
    • Partition Tolerance, which means the database continues to operate even when the network connecting its nodes is split into partitions.

Brewer, in explaining the relationship between the three in a 2012 article, says—

The easiest way to understand CAP is to think of two nodes on opposite sides of a partition. Allowing at least one node to update state will cause the nodes to become inconsistent, thus forfeiting C (consistency). Likewise, if the choice is to preserve consistency, one side of the partition must act as if it is unavailable, thus forfeiting A (availability).
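To make Brewer's two-node example concrete, here is a minimal sketch in Python; the replica names and the prefer_consistency flag are my own illustrative assumptions, not how any particular database behaves:

```python
# Minimal sketch of Brewer's two-node example. The names and the
# prefer_consistency flag are illustrative assumptions.

class Replica:
    def __init__(self, name, value=0):
        self.name = name
        self.value = value

    def write(self, value, partitioned, prefer_consistency):
        if partitioned and prefer_consistency:
            # CP choice: refuse to act while the peer is unreachable,
            # forfeiting availability.
            raise RuntimeError(f"{self.name}: unavailable during partition")
        # AP choice (or no partition): accept the write locally. During
        # a partition this lets the replicas diverge, forfeiting consistency.
        self.value = value

a, b = Replica("A"), Replica("B")

# AP system: both sides stay available, but their state diverges.
a.write(1, partitioned=True, prefer_consistency=False)
print(a.value, b.value)  # 1 0 -> two readers see different values

# CP system: one side refuses the write, preserving consistency.
try:
    b.write(2, partitioned=True, prefer_consistency=True)
except RuntimeError as err:
    print(err)  # B: unavailable during partition
```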

The CAP theorem, therefore, represents a two out of three situation: yet another two out of three “set” we encounter in the real world, probably grounded someplace in the larger space of complexity. We’ll set the relationship to complexity aside for the moment, however, and just look at how CAP relates to routing.

Network engineers familiar with the concepts of complexity, and their application in the network design realm, should quickly realize that this “two out of three” paradigm almost never represents a set of hard and fast realities; it is better understood as a set of continuums along which a system designer can choose. Moving towards one pole in the two out of three space inevitably means giving up some measure along one of the other two; the primary choices you can make are which one and how much.

One more detail is important before diving into a deeper discussion: A partition is normally thought of as a hard divide in database theory, but for the purposes of explaining how CAP applies to routing, I’m going to soften the definition a little. Instead of strict partitions, I want to consider partitioning as relating to the distribution of the database across a set of nodes separated by a network.

So how does CAP apply to routing?

To see the application, let’s ignore the actual algorithm used to compute the best path in any given routing protocol (whether Dijkstra’s SPF, Garcia’s DUAL, or path vector) and focus, for a moment, on just getting the information required to compute loop free paths to every router in a single failure (and/or flooding) domain. This latter point is important, because failure and flooding domains represent an intentional form of information hiding, one that leaves the database correct for the purposes of routing (most of the time), but permanently inconsistent in strict database terms. We need, therefore, to treat aggregation and summarization as a separate topic when considering how CAP impacts routing. Since the tradeoffs are easiest to see in link state protocols, we’ll focus there.

Once we’ve softened the definition of partition tolerance, it’s immediately obvious that all forms of flooding, for instance, are simply the synchronization process of a highly partitioned database. Further, we can observe that the database flooded through the network must be, largely, always available (that is, always readable by the local device). Given CAP, we should immediately suspect there is some sort of consistency tradeoff, as we’ve already defined two of the three legs of the two out of three set pretty strongly.
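A toy model makes this visible; the four-router line topology, the version numbers, and the one-hop-per-tick timing below are my assumptions, chosen only to illustrate the idea:

```python
# Toy flooding model: each router holds its own copy of a link state
# database and forwards updates one hop per "tick." The topology and
# timing here are illustrative assumptions.

# A simple line of routers: A - B - C - D.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

# Every router starts with the same database version.
database = {router: {"version": 1} for router in neighbors}

def flood(origin, new_version):
    """Propagate an update one hop per tick, printing what each router reads."""
    database[origin]["version"] = new_version
    frontier, seen = [origin], {origin}
    while frontier:
        # Availability: a read at any router always succeeds...
        print({r: database[r]["version"] for r in sorted(neighbors)})
        next_frontier = []
        for router in frontier:
            for peer in neighbors[router]:
                if peer not in seen:
                    database[peer]["version"] = new_version
                    seen.add(peer)
                    next_frontier.append(peer)
        frontier = next_frontier

flood("A", 2)
# ...but until the flood completes, A and D return different versions:
# the database is available everywhere, and temporarily inconsistent.
```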

And, in fact, there is. In link state protocols, the resulting inconsistency is most easily seen in the problem of microloops during convergence. To understand this, let’s look at the network just below.

[Figure: cap-and-routing]

Assume, for a moment, that the link between routers A and B fails. At this moment, let’s assume:

      • The best path for B is through A
      • The best path for C is through B
      • The best path for D is through C

Let’s also assume that each router in the network takes precisely the same amount of time to flood a packet and to run SPF. When the link fails, C is going to detect the failure immediately, and hence will run SPF first. If we assume that D can only process this information about the topology change once it has received an update to its database, then we can see that C will always process the new information before D.

But we can also observe this point: once C has processed the topology change—C has run SPF—router C will point at router D for its new best path. Until router D receives the new topology information and runs SPF, D will continue to point to C as the best path.

This is the microloop.
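A minimal sketch of the loop, using the router names from the figure (the FIB contents and the TTL are my illustration):

```python
# Microloop sketch: C has run SPF on the new topology, D has not.
# The FIB contents below are illustrative, matching the walkthrough above.

fib = {
    "C": "D",  # C has converged: its new best path points at D
    "D": "C",  # D still holds the old route, pointing back at C
}

def forward(packet_at, ttl=8):
    """Follow next hops until delivery or until the TTL runs out."""
    while ttl > 0 and packet_at in fib:
        print(f"{packet_at} -> {fib[packet_at]}")
        packet_at = fib[packet_at]
        ttl -= 1
    if ttl == 0:
        print("TTL expired: the packet looped between C and D")

forward("C")  # ping-pongs between C and D until the TTL expires

# Once D runs SPF and installs its new route, the loop disappears.
fib["D"] = "E"  # "E" is a hypothetical converged next hop for D
forward("C")    # C -> D, D -> E, done
```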

With any link state protocol, it happens any time the topology contains a ring of more than four hops. And it’s a direct result of the relationship between consistency and availability in the distributed database. We can verify this by examining the way in which the microloop can be resolved: specifically, by using ordered FIB.

In ordered FIB, each router in the network calculates an amount of time it should wait before installing newly calculated routes. This calculation is, in turn, based on the number of routers relying on the local router for the best path. If you back up and think through how the microloop happens, you can see the effect: C now waits for D to calculate before installing the new route into its local forwarding table.
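A back-of-the-napkin sketch of the ordering; the per-router delay constant and the hand-built dependency counts below are my assumptions (RFC 6976 derives the rank from the reverse SPF tree rather than from a table like this):

```python
# Sketch of ordered FIB timing. The delay constant and the dependency
# counts are illustrative assumptions, not the full RFC 6976 computation.

MAX_CONVERGENCE_TIME = 1.0  # assumed worst-case per-router convergence, in seconds

# How many routers rely on each node to reach the failed link, taken
# from the walkthrough above: D relies on C, and C relies on B.
dependents = {"B": 2, "C": 1, "D": 0}

def install_delay(router):
    """Wait long enough for every dependent router to converge first."""
    return dependents[router] * MAX_CONVERGENCE_TIME

for router in sorted(dependents, key=install_delay):
    print(f"{router} installs new routes after {install_delay(router):.1f}s")

# D installs immediately, C waits for D, and B waits for both: updates
# roll in from the farthest router back toward the failure, so the
# microloop never forms.
```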

In CAP terms, what ordered FIB does is reduce availability some small amount (in effect by not allowing the forwarding plane at C to read the newly calculated routes for some period of time) in order to ensure consistency in the database. This observation should confirm that microloops are one expression of the CAP tradeoff in distributed routing.

There are a number of other relationships between CAP and routing; I’ll put some of these on my list for later posts—but this one has reached the 1000 word mark, and so is long enough for a starter on the topic.

Micromanaging networks considered harmful: on (k)nerd knobs

Nerd Knobs (or as we used to call them in TAC, knerd knobs) are the bane of the support engineer’s life. Well, that and crashes. And customers who call in with a decoded stack trace. Or who don’t know where to put the floppy disc that came with the router into the router. But, anyway…

What is it with nerd knobs? Ivan has a great piece up this week on the topic. I think this is the closest he gets to what I think of as the real root cause for nerd knobs —

Instead of using cookie-cutter designs, we prefer to carefully craft unique snowflakes that magically integrate the legacy stuff that should have been dead years ago with the next-generation technologies… and every unique snowflake needs at least a nerd knob or two to make it work.

Greg has a response to Ivan up; again, I think he gets close to the problem with these thoughts —

Most IT managers have lost the ability to recognise technical debt and its impacts … Nerd Knobs are symptoms of much deeper problems/technical debt in the networking market and treat the cause not the symptom.

A somewhat orthogonal article caught my eye, though, one that I think explains what is actually going on with those pesky nerd knobs. The article is really about SQL and the concept of micromanaging software. To give you a flavor (in case you’re too lazy/busy to head over there and read the whole thing) —

So, here’s an analogy that highlights the key difference between what “imperative” languages like Java or Python and “declarative” languages like SQL do to your computation. In Python, say, you specify step-by-step what the computer should do: open the file; read the first line; if the line doesn’t match some requirement, skip it; update the counter; read the next line; update the counter again; if the counter exceeds some value, stop; if the end of file is reached, close the file; return the counter. Code often accumulates like this and builds up into complex business rules that are usually poorly understood. via infoworld
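The contrast in the quote is easy to show in a few lines; the file name, the “ERROR” matching rule, and the SQL table below are made-up examples:

```python
# Imperative: spell out *how*, step by step, as the quote describes.
# (The "ERROR" rule, the limit, and the file name are made-up examples.)
def count_matching_lines(path, limit=100):
    counter = 0
    with open(path) as handle:
        for line in handle:
            if "ERROR" not in line:  # skip lines that don't match
                continue
            counter += 1
            if counter >= limit:     # stop once the counter hits the limit
                break
    return counter

# count_matching_lines("/var/log/router.log")  # hypothetical path

# Declarative: state *what* we want and let the engine decide how.
# The equivalent SQL, shown for contrast (log_lines is a made-up table):
#   SELECT COUNT(*) FROM log_lines WHERE text LIKE '%ERROR%';
```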

I think this gets to the heart of the nerd knob problem. What’s happening with nerd knobs is it’s easier to tell the system how we want something done than it is to tell the system what we want done. Think about it this way: you install a routing protocol, and you tell it what you want in broad, general terms. Something like, “I want the shortest path between each pair of points in the network.” Then you run into a situation where you need that modified, so you mess around with the metrics some, and get on with your life. Then you run into a situation where you need this flow to go here, and that flow to go there, so you install some policy based routing along the way.

Per link metrics are just the first level of nerd knobs. Policy based routing is just the second. The more precise we want to get, the deeper the nerd knobs go. Want to load share over links that aren’t truly equal cost? Oh, just nerd knob it. Want to send ASes in the AS path you shouldn’t? Just nerd knob it.

The reality is every nerd knob in routing represents a policy driven by a business requirement expressed as a tweak to the underlying fundamental routing algorithm. As Ivan rightly points out, going to SDNs isn’t going to solve this problem. If anything, it’s going to make it worse. Now, rather than seeing the nerd knob for what it is, a pain in the butt that needs to be explained and dealt with at 2AM when you’re half asleep and the TAC engineer is halfway around the world, it’s going to be “just another line of code.”

This might sound brilliant to someone who hasn’t managed, or dealt with, multi-million line projects and the vagaries of codebase management. Ask someone who has, though, before you get into this. It’s just a different set of problems, not a better set of problems.

The root cause here, though, isn’t nerd knobs. And it’s not business requirements. And it’s not really laziness (most of the time). It’s not even machismo most of the time (though I will admit the natural arrogance of the geek is probably worth studying by some anthropologist somewhere). There are two root causes, really.

First, we, the networking industry, haven’t really thought through what a control plane actually does. Oh, we have the seven layer model with the control plane thrown off to the side, or the claim that there shouldn’t even be a control plane. But this is part of why I think the seven layer model needs to die — because it’s a host focused view of the networking world. End-to-end and dumb as rocks routers are nice to contemplate, but I think we need to admit that even the dumb rocks are a bit more complex than we first thought.

Second, I don’t think we’ve really incorporated complexity into our souls. As someone once told me, “the CAP theorem is just an observer problem!” Or rather, we somehow believe that by making virtual things we can skip all that ugly physical reality stuff. Faster, cheaper, and better are all three available “on tap,” if we can just figure out how to see the problem right. This is nonsense on stilts.

We need to get in here and do some serious thinking about complexity, and how to manage it in network design. We need to do things like think about interaction surfaces, and how to prevent them from becoming so deep and broad as to be unmanageable. As the SQL article quoted above says —

In a world of regulation and increasing interdependencies between organizations, expressing intent independently of implementation means that you can avoid a class of unintended consequences of systems building.

Where have I heard this before? Oh, maybe it’s in that new book on network complexity someplace.

Seriously — I know this is a long rant, so I’ll quit now, but — seriously (!) we need to grow up and start treating the control plane as an engineering problem. Then, and only then, will we get rid of nerd knobs, no matter whether they’re some hidden CLI command, or some “if/then/else” or “goto” statement hidden someplace in the controller code.

P.S. BTW, Greg, I disagree with you about routing protocols. They’ll “go away” for a short while, until we start trying to deal with networks that don’t run on standards based routing protocols. And then we’ll beg for them to come back. We’ll form something like the IETF, and solve all the same problems all over again, convinced that we can do better than that last group of engineers did. Been there. Done that. Got the t-shirt (someplace).