This last week I was talking to someone at a small startup that intends to eliminate all the complex routing from campus networks. In the past, when reading blog posts about Kubernetes, I’ve read about how it was designed to eliminate routing protocols because “routing protocols are so complex.”
Color me skeptical.
There are two reasons for complexity in a design. The first is you’re solving a hard problem. The second is you’ve made bad design choices in the past, and you’re pasting complexity on top to solve some perceived problem (whether perceived or real).
The problem with all this talk about building something that’s “less complex” is people tend to see complexity of the first kind and think, “we can get rid of that complexity if we start over.” Failing to understand the past before building the future is a recipe for repeated failures of the same kind. Building a network without a distributed routing protocol hasn’t been tried before either, right? Well, yes, it has … We either forget how it turned out, or we say “well, that’s not the same thing I’m talking about here” (just like “real socialism hasn’t ever been tried”).
Even worse, they think they get rid of second and third kinds of complexity by starting over, or getting the humans out of the decision-making loop, or focusing on the data. Our modern penchant for relying “the data,” without ever thinking about the source of the data or how the data has been shaped and interpreted, is truly breathtaking.
They look over the horizon, see an unspoiled field, and think “the grass really is greener on the other side.”
Get rid of all those complex dynamic routing protocols … get rid of all those humans making decisions, so the decisions are “data driven” … and everything will be so much better.
Adding complexity to solve hard real-world problems is just the way things are, and they will always be, so the first reason for complexity will always be with us. People make mistakes, don’t see into the future perfectly, or just don’t have a perfect understanding of the system (technical debt), so the second kind of complexity will always be with us. You can’t “fix” people—God save us from those who think they can. The grass isn’t always greener—it just always looks that way.
What’s the practical upshot? Networks are always going to be complex. It’s just the nature of the problem being solved.
We add complexity because we fail to ask the right questions, we don’t understand the system, or we fail to do good design. The solution isn’t to seek out a greener field “out there,” but rather to make the field we currently live in greener by asking the right questions and reducing complexity through good design. Sometimes you might even need to start over with a new network … but when you start thinking about starting over with a newly designed set of protocols because the old ones are “too complex,” you need to ask how those old ones got that way, and how you’re going to stop the new ones from getting to the same place.
The grass is always greener because you looking at it through green-colored lenses just as the new grass is in its full flush, and before the weeds have had a chance to take over.
Learn how old things worked before you fall for some new “modern wonder” that’s going to solve every problem. The complexity in old things will show you where you can expect to find complexity grow up in new things.
While reading a research paper on address spoofing from 2019, I ran into this on NAT (really PAT) failures—
In the first failure mode, the NAT simply forwards the packets with the spoofed source address (the victim) intact … In the second failure mode, the NAT rewrites the source address to the NAT’s publicly routable address, and forwards the packet to the amplifier. When the server replies, the NAT system does the inverse translation of the source address, expecting to deliver the packet to an internal system. However, because the mapping is between two routable addresses external to the NAT, the packet is routed by the NAT towards the victim.
The authors state 49% of the NATs they discovered in their investigation of spoofed addresses fail in one of these two ways. From what I remember way back when the first NAT/PAT device (the PIX) was deployed in the real world (I worked in TAC at the time), there was a lot of discussion about what a firewall should do with packets sourced from addresses not indicated anywhere.
If I have an access list including 192.168.1.0/24, and I get a packet sourced from 192.168.2.24, what should the NAT do? Should it forward the packet, assuming it’s from some valid public IP space? Or should it block the packet because there’s no policy covering this source address?
This is similar to the discussion about whether BGP speakers should send routes to an external peer if there is no policy configured. The IETF (though not all vendors) eventually came to the conclusion that BGP speakers should not advertise to external peers without some form of policy configured.
My instinct is the NATs here are doing the right thing—these packets should be forwarded—but network operators should be aware of this failure mode and configure their intentions explicitly. I suspect most operators don’t realize this is the way most NAT implementations work, and hence they aren’t explicitly filtering source addresses that don’t fall within the source translation pool.
In the real world, there should also be a box just outside the NATing device that’s running unicast reverse path forwarding checks. This would resolve these sorts of spoofed packets from being forwarding into the DFZ—but uRPF is rarely implemented by edge providers, and most edge connected operators (enterprises) don’t think about the importance of uRPF to their security.
All this to say—if you’re running a NAT or PAT, make certain you understand how it works. Filters are tricky in the best of circumstances. NAT and PATs just make filters trickier.
It’s easy to assume automation can solve anything and that it’s cheap to deploy—that there are a lot of upsides to automation, and no downsides. In this episode of the Hedge, Terry Slattery joins Tom Ammon and Russ White to discuss something we don’t often talk about, the Return on Investment (ROI) of automation.
I cannot count the number of times I’ve heard someone ask these two questions—
- What are other people doing?
- What is the best common practice?
While these questions have always bothered me, I could never really put my finger on why. I ran across a journal article recently that helped me understand a bit better. The root of the problem is this—what does best common mean, and how can following the best common produce a set of actions you can be confident will solve your problem?
Bellman and Oorschot say best common practice can mean this is widely implemented. The thinking seems to run something like this: the crowd’s collective wisdom will probably be better than my thinking… more sets of eyes will make for wiser or better decisions. Anyone who has studied the madness of crowds will immediately recognize the folly of this kind of state. Just because a lot of people agree it’s a good idea to jump off a cliff does not mean it is, in fact, a good idea to jump off a cliff.
Perhaps it means something closer to this is no worse than our competitors. If that’s the meaning, though, it’s a pretty cynical result. It’s saying, “I don’t mind condemning myself to mediocrity so long as I see everyone else doing the same thing.” It doesn’t sound like much of a way to grow a business.
The authors do provide their definition—
For a given desired outcome, a “best practice” is a means intended to achieve that outcome, and that is considered to be at least as “good” as the best of other broadly considered means to achieve that same outcome.
The thinking seems to run something like this—it’s likely that everyone else has tried many different ways of doing this; that they have all settled on doing this, this way, means all those other methods are probably not as good as this one for some reason.
Does this work? There’s no way to tell without further investigation. How many of the other folks doing “this” spent serious time trying alternatives, and how many just decided the cheapest way was the best no matter how poor the result might be? In fact, how can we know what the results of doing things “this way” have in all those other networks? Where would we find this kind of information?
In the end, I can’t ever make much sense out of the question, “what is everyone else doing?” Discovering what everyone else is doing might help me eliminate possibilities (that didn’t work for them, so I certainly don’t want to try it), or it might help me understand the positive and negative attributes of a given solution. Still, I don’t understand why “common” should infer “best.”
The best solution for this situation is simply going to be the best solution. Feel free to draw on many sources, but don’t let other people determine what you should be doing.
Last week I began discussing why AS Path Prepend doesn’t always affect traffic the way we think it will. Two other observations from the research paper I’m working off of are:
- Adding two prepends will move more traffic than adding a single prepend
- It’s not possible to move traffic incrementally by prepending; when it works, prepending will end up moving most of the traffic from one inbound path to another
A slightly more complex network will help explain these two observations.
Assume AS65000 would like to control the inbound path for 100::/64. I’ve added a link between AS65001 and 65002 here, but we will still find prepending a single AS to the path won’t make much difference in the path used to reach 100::/64. Why?
Because most providers will have a local policy configured—using local preference—that causes them to choose any available customer connection over other paths. AS65001, on receiving the route to 100::/64 from AS65000, will set the local preference so it will prefer this route over any other route, including the one learned from AS65002. So while the cause is a little different in this case than the situation covered in the first post, the result is the same.
We can, of course, prepend twice onto the AS Path rather than once. What impact would that have here? It still won’t impact the traffic originating in 65005 because AS65001 is the only path available towards 100::64 from their perspective. Prepending cannot change anything if there’s only one path.
However, if most of the traffic destined to 100::/64 coming from AS65006, 7, and 8 rather than from AS65005, prepending two times will allow AS65000 to shift the traffic from the path through AS65002 to the path through AS65001. This example might seem a little contrived. Still, it’s pretty similar to networks that have one connection to some local provider (a cable company or something similar) and one connection to a more prominent national or international provider. Any time you are connected to two different providers who have different ranges of connectivity, prepending two autonomous systems on the AS Path will probably be able to shift traffic from one inbound link to another.
What about prepending more than two hops to the AS Path? Each additional prepend going to shift smaller amounts of traffic. It makes sense that increasing the number of prepends doesn’t shift much more because the further away you get from the edge of the Internet, the more fully connected the autonomous systems are, and the more likely you are to run into some other policy that will override the AS Path in determining the best path. The average length of the AS Path in the Internet is around four; prepending more than this normally won’t have much of an effect on traffic flow
The second question above can also be answered by looking at this network. Why can’t you shift traffic incrementally by prepending onto the AS Path? Because the connectivity close to the edge is probably not meshy enough. You can’t shift over just the traffic from one AS or another; you can only shift traffic from the entire set of autonomous systems behind your upstream from one inbound link to another. You can adjust traffic on a per-prefix basis, however, which can be useful for balancing between two inbound links.
What can you do to control inbound traffic with more certainty? Take a look at this older post for thoughts on using communities and de-aggregation to steer traffic.
Just about everyone prepends AS’ to shift inbound traffic from one provider to another—but does this really work? First, a short review on prepending, and then a look at some recent research in this area.
What is prepending meant to do?
Looking at this network diagram, the idea is for AS6500 (each router is in its own AS) to steer traffic through AS65001, rather than AS65002, for 100::/64. The most common method to trying to accomplish this is AS65000 can prepend its own AS number on the AS Path Multiple times. Increasing the length of the AS Path will, in theory, cause a route to be less preferred.
In this case, suppose AS65000 prepends its own AS number on the AS Path once before advertising the route towards AS65001, and not towards AS65002. Assuming there is no link between AS65001 and AS65002, what would we expect to happen? What we would expect is AS65001 will receive one route towards 100::/64 with an AS Path of 2 and use this route. AS65002 will, likewise, receive one route towards 100::/64 with an AS Path of 1 and use this route.
AS65003, however, will receive two routes towards 100::/64, one with an AS Path of 3 through AS65001, and one with an AS Path of 2 through AS65002. All other things being equal (local preference, etc.), AS65003 will prefer the route with the shorter AS Path through AS65002, and select that path to reach 100::/64. AS65004 will only receive one path towards 100::/64, the one through AS65002, because AS65003 will only advertise its best path to AS65004.
The obvious question—how much good does this really do? The only impact on the best path is two hops away, as AS65003, and beyond. The route chosen by AS65001 and AS65002 will not be affected by the prepending.
A recent paper found—
You might expect As Path prepending to have a much more consistent effect on inbound traffic. Why doesn’t it?
What might not be obvious (the danger of simplified diagrams): if autonomous systems directly attached to AS65001 originate most of the traffic destined to 100::/64, no amount of prepending is going to make any difference in the inbound traffic flow. Assume AS5001 has a connection to some cloud service, AS65002 does not have a connection to the same cloud service, and 100::64 is a local server that communicates with this cloud service on a regular basis. Since AS65001 is the only AS transiting traffic from the cloud service to the server located on the 100::/64 subnet, and AS65001 only has one route to 100::/64, you are not going to be able to shift traffic off that single path no matter how many times you prepend.
The first rule of prepending is location matters. You have to know where the traffic you want to shift is originating, and whether or not it can be shifted.
In my next post on this topic, I’ll continue exploring AS path prepending more in light of the results of the research paper above.
Back in January, I ran into an interesting article called The many lies about reducing complexity:
Reducing complexity sells. Especially managers in IT are sensitive to it as complexity generally is their biggest headache. Hence, in IT, people are in a perennial fight to make the complexity bearable.
Gerben then discusses two ways we often try to reduce complexity. First, we try to simply reduce the number of applications we’re using. We see this all the time in the networking world—if we could only get to a single pane of glass, or reduce the number of management packages we use, or reduce the number of control planes (generally to one), or reduce the number of transport protocols … but reducing the number of protocols doesn’t necessarily reduce complexity. Instead, we can just end up with one very complex protocol. Would it really be simpler to push DNS and HTTP functionality into BGP so we can use a single protocol to do everything?
Second, we try to reduce complexity by hiding it. While this is sometimes effective, it can also lead to unacceptable tradeoffs in performance (we run into the state, optimization, surfaces triad here). It can also make the system more complex if we need to go back and leak information to regain optimal behavior. Think of the OSPF type 4, which just reinjects information lost in building an area summary, or even the complexity involved in the type7 to type 5 process required to create not-so-stubby areas.
It would seem, then, that you really can’t get rid of complexity. You can move it around, and sometimes you can effectively hide it, but you cannot get rid of it.
This is, to some extent, true. Complexity is a reaction to difficult environments, and networks are difficult environments.
Even so, there are ways to actually reduce complexity. The solution is not just hiding information because it’s messy, or munging things together because it requires fewer applications or protocols. You cannot eliminate complexity, but if you think about how information flows through a system you might be able to reduce the amount of complexity, and even create boundaries where state (hence complexity) can be more effectively hidden.
As an instance, I have argued elsewhere that building a DC fabric with distinct overlay and underlay protocols can actually create a simpler overall design than using a single protocol. Another instance might be to really think about where route aggregation takes place—is it really needed at all? Why? Is this the right place to aggregate routes? Is there any way I can change the network design to reduce state leaking through the abstraction?
The problem is there are no clear-cut rules for thinking about complexity in this way. There’s no rule of thumb, there’s no best practices. You just have to think through each individual situation and consider how, where, and why state flows, and then think through the state/optimization/surface tradeoffs for each possible way of reducing the complexity of the system. You have to take into account that local reductions in complexity can cause the overall system to be much more complex, as well, and eventually make the system brittle.
There’s no “pat” way to reduce complexity—that there is, is perhaps one of the biggest lies about complexity in the networking world.