Is it really the best just because its the most common?

I cannot count the number of times I’ve heard someone ask these two questions—

  • What are other people doing?
  • What is the best common practice?

While these questions have always bothered me, I could never really put my finger on why. I ran across a journal article recently that helped me understand a bit better. The root of the problem is this—what does best common mean, and how can following the best common produce a set of actions you can be confident will solve your problem?

Bellman and Oorschot say best common practice can mean this is widely implemented. The thinking seems to run something like this: the crowd’s collective wisdom will probably be better than my thinking… more sets of eyes will make for wiser or better decisions. Anyone who has studied the madness of crowds will immediately recognize the folly of this kind of state. Just because a lot of people agree it’s a good idea to jump off a cliff does not mean it is, in fact, a good idea to jump off a cliff.

Perhaps it means something closer to this is no worse than our competitors. If that’s the meaning, though, it’s a pretty cynical result. It’s saying, “I don’t mind condemning myself to mediocrity so long as I see everyone else doing the same thing.” It doesn’t sound like much of a way to grow a business.

The authors do provide their definition—

For a given desired outcome, a “best practice” is a means intended to achieve that outcome, and that is considered to be at least as “good” as the best of other broadly considered means to achieve that same outcome.

The thinking seems to run something like this—it’s likely that everyone else has tried many different ways of doing this; that they have all settled on doing this, this way, means all those other methods are probably not as good as this one for some reason.

Does this work? There’s no way to tell without further investigation. How many of the other folks doing “this” spent serious time trying alternatives, and how many just decided the cheapest way was the best no matter how poor the result might be? In fact, how can we know what the results of doing things “this way” have in all those other networks? Where would we find this kind of information?

 

In the end, I can’t ever make much sense out of the question, “what is everyone else doing?” Discovering what everyone else is doing might help me eliminate possibilities (that didn’t work for them, so I certainly don’t want to try it), or it might help me understand the positive and negative attributes of a given solution. Still, I don’t understand why “common” should infer “best.”

The best solution for this situation is simply going to be the best solution. Feel free to draw on many sources, but don’t let other people determine what you should be doing.

The Effectiveness of AS Path Prepending (2)

Last week I began discussing why AS Path Prepend doesn’t always affect traffic the way we think it will. Two other observations from the research paper I’m working off of are:

  • Adding two prepends will move more traffic than adding a single prepend
  • It’s not possible to move traffic incrementally by prepending; when it works, prepending will end up moving most of the traffic from one inbound path to another

A slightly more complex network will help explain these two observations.

Assume AS65000 would like to control the inbound path for 100::/64. I’ve added a link between AS65001 and 65002 here, but we will still find prepending a single AS to the path won’t make much difference in the path used to reach 100::/64. Why?

Because most providers will have a local policy configured—using local preference—that causes them to choose any available customer connection over other paths. AS65001, on receiving the route to 100::/64 from AS65000, will set the local preference so it will prefer this route over any other route, including the one learned from AS65002. So while the cause is a little different in this case than the situation covered in the first post, the result is the same.

We can, of course, prepend twice onto the AS Path rather than once. What impact would that have here? It still won’t impact the traffic originating in 65005 because AS65001 is the only path available towards 100::64 from their perspective. Prepending cannot change anything if there’s only one path.

However, if most of the traffic destined to 100::/64 coming from AS65006, 7, and 8 rather than from AS65005, prepending two times will allow AS65000 to shift the traffic from the path through AS65002 to the path through AS65001. This example might seem a little contrived. Still, it’s pretty similar to networks that have one connection to some local provider (a cable company or something similar) and one connection to a more prominent national or international provider. Any time you are connected to two different providers who have different ranges of connectivity, prepending two autonomous systems on the AS Path will probably be able to shift traffic from one inbound link to another.

What about prepending more than two hops to the AS Path? Each additional prepend going to shift smaller amounts of traffic. It makes sense that increasing the number of prepends doesn’t shift much more because the further away you get from the edge of the Internet, the more fully connected the autonomous systems are, and the more likely you are to run into some other policy that will override the AS Path in determining the best path. The average length of the AS Path in the Internet is around four; prepending more than this normally won’t have much of an effect on traffic flow

The second question above can also be answered by looking at this network. Why can’t you shift traffic incrementally by prepending onto the AS Path? Because the connectivity close to the edge is probably not meshy enough. You can’t shift over just the traffic from one AS or another; you can only shift traffic from the entire set of autonomous systems behind your upstream from one inbound link to another. You can adjust traffic on a per-prefix basis, however, which can be useful for balancing between two inbound links.

What can you do to control inbound traffic with more certainty? Take a look at this older post for thoughts on using communities and de-aggregation to steer traffic.

The Effectiveness of AS Path Prepending (1)

Just about everyone prepends AS’ to shift inbound traffic from one provider to another—but does this really work? First, a short review on prepending, and then a look at some recent research in this area.

What is prepending meant to do?

Looking at this network diagram, the idea is for AS6500 (each router is in its own AS) to steer traffic through AS65001, rather than AS65002, for 100::/64. The most common method to trying to accomplish this is AS65000 can prepend its own AS number on the AS Path Multiple times. Increasing the length of the AS Path will, in theory, cause a route to be less preferred.

In this case, suppose AS65000 prepends its own AS number on the AS Path once before advertising the route towards AS65001, and not towards AS65002. Assuming there is no link between AS65001 and AS65002, what would we expect to happen? What we would expect is AS65001 will receive one route towards 100::/64 with an AS Path of 2 and use this route. AS65002 will, likewise, receive one route towards 100::/64 with an AS Path of 1 and use this route.

AS65003, however, will receive two routes towards 100::/64, one with an AS Path of 3 through AS65001, and one with an AS Path of 2 through AS65002. All other things being equal (local preference, etc.), AS65003 will prefer the route with the shorter AS Path through AS65002, and select that path to reach 100::/64. AS65004 will only receive one path towards 100::/64, the one through AS65002, because AS65003 will only advertise its best path to AS65004.

The obvious question—how much good does this really do? The only impact on the best path is two hops away, as AS65003, and beyond. The route chosen by AS65001 and AS65002 will not be affected by the prepending.

A recent paper found—

We observe that the effectiveness of prepending can strongly depend on the location (for around 20% of cases, ASPP has moved no targets, while for another 20% , it moved almost all targets).

You might expect As Path prepending to have a much more consistent effect on inbound traffic. Why doesn’t it?

What might not be obvious (the danger of simplified diagrams): if autonomous systems directly attached to AS65001 originate most of the traffic destined to 100::/64, no amount of prepending is going to make any difference in the inbound traffic flow. Assume AS5001 has a connection to some cloud service, AS65002 does not have a connection to the same cloud service, and 100::64 is a local server that communicates with this cloud service on a regular basis. Since AS65001 is the only AS transiting traffic from the cloud service to the server located on the 100::/64 subnet, and AS65001 only has one route to 100::/64, you are not going to be able to shift traffic off that single path no matter how many times you prepend.

The first rule of prepending is location matters. You have to know where the traffic you want to shift is originating, and whether or not it can be shifted.

In my next post on this topic, I’ll continue exploring AS path prepending more in light of the results of the research paper above.

Ambiguity and complexity: once more into the breach

Recent research into the text of RFCs versus the security of the protocols described came to this conclusion—

While not conclusive, this suggests that there may be some correlation between the level of ambiguity in RFCs and subsequent implementation security flaws.

This should come as no surprise to network engineers—after all, complexity is the enemy of security. Beyond the novel ways the authors use to understand the shape of the world of RFCs (you should really read the paper; it’s really interesting), this desire to increase security by decreasing the ambiguity of specifications is fascinating. We often think that writing better specifications requires having better requirements, but down this path only lies despair.

Better requirements are the one thing a network engineer can never really hope for.

It’s not just that networks are often used as a sort of “complexity sink,” the place where every hard problem goes to be solved. It’s also the uncertainty of the environment in which the network must operate. What new application will be stuffed on top of the network this week? Will anyone tell the network folks about this new application, or just open a ticket when it doesn’t work right? What about all the changes developers are making to applications right now, and their impact on the network? There are link failures, software failures, hardware failures, and the mean time between mistakes. There is the pace of innovation (which I tend to think is a bit overblown—rule11, after all—we are often talking about new products rather than new ideas).

What the network is supposed to do—just provide IP transport between two devices—turns out to be hard. It’s hard because “just transporting packets” isn’t ever enough. These packets must be delivered consistently (jitter and drops) across an ever-changing landscape.

To this end—

[C]omplexity is most succinctly discussed in terms of functionality and its robustness. Specifically, we argue that complexity in highly organized systems arises primarily from design strategies intended to create robustness to uncertainty in their environments and component parts.

Uncertainty is the key word here. What can we do about all of this?

We can reduce uncertainty. There are three ways to reduce uncertainty. First, you can obfuscate it—this is harmful. Second, you can reduce the scope of the job at hand, throwing some of the uncertainty (and therefore complexity) over the cubicle way. This can be useful in some situations, but remember that the less work you’re doing, the less value you add. Beware of self-commodifying.

Finally, you can manage the uncertainty. This generally means using modularization intelligently to partition off problems into smaller sets. It’s easier to solve a set of well-scope problems with little uncertainty than to solve one big problem with unknowable uncertainty.

This might all sound great in theory, but how do we do this in real life? Where does the rubber hit the road? This is what Ethan and I tried to show in Problems and Solutions—how to understand the problems that need to be solved, and then how to solve each of those problems within a larger system. This is also what many parts of The Art of Network Architecture are about, and then again what Jeff and I wrote about in Navigating Network Complexity.

I know it often seems like it’s not worth learning the theory; it’s so much easier to focus on the day-to-day, the configuration of this device, or the shiny thing that vendor just created. It’s easier to assume that if I can just hide all the complexity behind intent or automation, I can get my weekends back.

The truth is that we’re paid to solve hard problems, and solving hard problems involves complexity. We can either try to cover that up, or we can learn to manage it.

Complexity Reduction?

Back in January, I ran into an interesting article called The many lies about reducing complexity:

Reducing complexity sells. Especially managers in IT are sensitive to it as complexity generally is their biggest headache. Hence, in IT, people are in a perennial fight to make the complexity bearable.

Gerben then discusses two ways we often try to reduce complexity. First, we try to simply reduce the number of applications we’re using. We see this all the time in the networking world—if we could only get to a single pane of glass, or reduce the number of management packages we use, or reduce the number of control planes (generally to one), or reduce the number of transport protocols … but reducing the number of protocols doesn’t necessarily reduce complexity. Instead, we can just end up with one very complex protocol. Would it really be simpler to push DNS and HTTP functionality into BGP so we can use a single protocol to do everything?

Second, we try to reduce complexity by hiding it. While this is sometimes effective, it can also lead to unacceptable tradeoffs in performance (we run into the state, optimization, surfaces triad here). It can also make the system more complex if we need to go back and leak information to regain optimal behavior. Think of the OSPF type 4, which just reinjects information lost in building an area summary, or even the complexity involved in the type7 to type 5 process required to create not-so-stubby areas.

It would seem, then, that you really can’t get rid of complexity. You can move it around, and sometimes you can effectively hide it, but you cannot get rid of it.

This is, to some extent, true. Complexity is a reaction to difficult environments, and networks are difficult environments.

Even so, there are ways to actually reduce complexity. The solution is not just hiding information because it’s messy, or munging things together because it requires fewer applications or protocols. You cannot eliminate complexity, but if you think about how information flows through a system you might be able to reduce the amount of complexity, and even create boundaries where state (hence complexity) can be more effectively hidden.

As an instance, I have argued elsewhere that building a DC fabric with distinct overlay and underlay protocols can actually create a simpler overall design than using a single protocol. Another instance might be to really think about where route aggregation takes place—is it really needed at all? Why? Is this the right place to aggregate routes? Is there any way I can change the network design to reduce state leaking through the abstraction?

The problem is there are no clear-cut rules for thinking about complexity in this way. There’s no rule of thumb, there’s no best practices. You just have to think through each individual situation and consider how, where, and why state flows, and then think through the state/optimization/surface tradeoffs for each possible way of reducing the complexity of the system. You have to take into account that local reductions in complexity can cause the overall system to be much more complex, as well, and eventually make the system brittle.

There’s no “pat” way to reduce complexity—that there is, is perhaps one of the biggest lies about complexity in the networking world.

Loose Lips

When I was in the military we were constantly drilled about the problem of Essential Elements of Friendly Information, or EEFIs. What are EEFis? If an adversary can cast a wide net of surveillance, they can often find multiple clues about what you are planning to do, or who is making which decisions. For instance, if several people married to military members all make plans to be without their spouses for a long period of time, the adversary can be certain a unit is about to be deployed. If the unit of each member can be determined, then the strength, positioning, and other facts about what action you are taking can be guessed.

Given enough broad information, an adversary can often guess at details that you really do not want them to know.

What brings all of this to mind is a recent article in Dark Reading about how attackers take advantage of publicly available information to form Spear Phishing attacks—

Most security leaders are acutely aware of the threat phishing scams pose to enterprise security. What garners less attention is the vast amount of publicly available information about organizations and their employees that enables these attacks.

Going back further in time, during World War II, we have—

What does all of this mean for the average network engineer concerned about security? Probably nothing different than being just slightly paranoid about your personal security in the first place (way too much modern security is driven by an anti-paranoid mindset, a topic for a future post). Things like—

  • Don’t let people know, either through your job description or anything else, that you hold the master passwords for your company, or that your account holds administrator rights.
  • Don’t always go to the same watering holes, and don’t talk about work while there to people you’ve just met, or even people you see there all the time.
  • Don’t talk about when and where you’re going on vacation. You can talk about it, and share pictures, once you’re back.

If an attacker knows you are going to be on vacation, it’s a lot easier to create a fake “emergency,” tempting you to give out information about accounts, people, and passwords you shouldn’t. Phishing is primarily a matter of social engineering rather than technical acumen. Countering social engineering is also a social skill, rather than a technical one. You can start by learning to just say less about what you are doing, when you are doing it, and who holds the keys to the kingdom.

Time and Mind Savers: RSS Feeds

I began writing this post just to remind readers this blog does have a number of RSS feeds—but then I thought … well, I probably need to explain why that piece of information is important.

The amount of writing, video, and audio being thrown at the average person today is astounding—so much so that, according to a lot of research, most people in the digital world have resorted to relying on social media as their primary source of news. Why do most people get their news from social media? I’m pretty convinced this is largely a matter of “it saves time.” The resulting feed might not be “perfect,” but it’s “close enough,” and no-one wants to spend time seeking out a wide variety of news sources so they will be better informed.

The problem, in this case, is that “close enough” is really a bad idea. We all tend to live in information bubbles of one form or another (although I’m fully convinced it’s much easier to live in a liberal/progressive bubble, being completely insulated from any news that doesn’t support your worldview, than it is to live in a conservative/traditional one). If you think about the role of social media and the news feed on social media services, this makes some kind of sense. The social media service tries to guess at what will keep you interested (engaged, and therefore coming back to the service), but at the same time each social media service also has a worldview they want to promote. The service largely attempts to both cater to what keeps you there and to pull you towards what the service, itself, believes.

The solution is stop getting your news from social media. period, full stop, end of sentence (although I’ve seen a recent paper indicating people find periods and other punctuation marks offensive in some way—when you find a period offensive, maybe it’s time to grow a little thicker skin).

So how should you get information instead? There are a lot of ways, from email based newsletters to watching television (please don’t, television turns everything into entertainment, including things that are not meant to entertain). My suggestion is, however, is through RSS feeds. Grab an account on Feedly or some other service, find the RSS feeds for the sites you find informative, and subscribe to their feeds. Some services have a learning mechanism that tries to accomplish the same thing as social media feeds—building intelligent filters to emphasize things you find important. I don’t tend to use these things; I have learned to just glance at the headline and first paragraph and make a quick decision about whether I think the post is worth reading.

Following RSS feeds can help you stop binging, jumping from place to place on a single site—essentially wasting time. It works against the mechanisms designers use to “increase engagement,” which often just means to consume more of your attention and time than you intended to give away. Following RSS feeds can also help you gain a broader view of the world if you intentionally subscribe to feeds from sites and people you don’t always agree with. It’s healthy to regularly read “the other side.” Following strong, well-written arguments from “the other side” will do much more for your mind than seeing just the facile, emotionally charged, straw-man arguments often presented (and allowed through the filters) on social media.

Further, services like feedly also allow you to follow lots of other things, including twitter accounts, youtube channels, and podcasts. I follow almost all podcasts through feedly, downloading the individual episodes I want to listen to, storing them in a cloud directory, and then deleting the files when I’m done. This gives me one list of things to listen to, rather than a huge playlist full of seemingly never-ending content.

All this said, this blog has a lot of different RSS feeds available. I don’t have a complete list, but these are a good place to start—

I keep these very same links on a page of RSS feeds you can find under the about menu. If you’re interested in the RSS feeds I follow, please reach out to me directly, as feedly no longer has any way to share your feeds other than pushing an OPML file (at least not that I can find).