2017 – rule 11 reader

SLAAC and DHCPv6

When deploying IPv6, one of the fundamental questions the network engineer needs to ask is: DHCPv6, or SLAAC? As the argument between these two has reached almost political dimensions, perhaps a quick look at the positive and negative attributes of each solution are. Originally, the idea was that IPv6 addresses would be created using stateless configuration (SLAAC). The network parts of the address would be obtained by listening for a Router Advertisement (RA), and the host part would be built using a local (presumably unique) physical (MAC) address. In this way, a host can be connected to the network, and come up and run, without any manual configuration. Of course, there is still the problem of DNS—how should a host discover which server it should contact to resolve domain names? To resolve this part, the DHCPv6 protocol would be used. So in IPv6 configuration, as initially conceived, the information obtained from RA would be combined with DNS information from DHCPv6 to fully configure an IPv6 host when it is attached to the network.

There are several problems with this scheme, as you might expect. The most obvious is that most network operators do not want to deploy two protocols to solve a single problem—configuring IPv6 hosts. What might not be so obvious, however, is that many network operators care a great deal about whether hosts are configured statelessly or through a protocol like DHCPv6.

Why would an operator want stateful configuration? Primarily because they want to control which devices can receive an IPv6 address, and hence communicate with other devices on the network. When using DHCPv6, just like DHCP with IPv4, the operator can set parameters around what kinds of devices, or perhaps even which specific devices, will be able to receive an IPv6 address. Further, the DHCPv6 server can be tied to the DNS server, so each host which connects to the network can also be given a DNS entry. Proper DNS entries are often a requirement for many applications. There are Dynamic DNS (DDNS) implementations that can solve this problem, but they are not often considered secure enough for a controlled network environment.

Why would an operator want stateless autoconfiguration? First, because they want any random user who can successfully connect to the network to be able to get an IPv6 address without any other configuration, and without the provider needing run any sort of special protocol or configuration to allow this. In fact, DHCPv6, in some environments, at least, can be seen as an attack surface, or rather a hole through which attacks can potentially be driven. Second, stateful configuration also has a failover problem; if the DHCPv6 server fails, then hosts can no longer obtain an IPv6 address, and the network no longer works. This could be, to say the least, problematic for service providers. Finally, SLAAC has a set of privacy extensions outlined in RFC4941 that (theoretically) prevent a host from being tracked based on its IPv6 address over time. This is a very attractive property for edge facing service providers.

The original set of drafts, however, only provided for DNS information to be carried through DHCPv6, and had no failover mechanism for DHCPv6. These two things, together, made it impossible to use just one of these two options. More recent work, however, has remedied both parts of this problem, making either option able to stand on its own. RFC6106, which is a bit older (2010), provides for DNS advertisement in the RA protocol. This allows an operator who would like to run everything completely stateless to do so, including hosts learning which DNS resolver to use. On the other side, RFC8156, which was just ratified in July of 2017, allows a pair of DHCPv6 servers to act as a failover pair. While this is more complex than simple DHCPv6, it does solve the problem of a host failing to operate correctly simply because the DHCPv6 server has failed.

Which of the two is now the best choice? If you do not have any requirement to restrict the hosts that can attach to the network using IPv6, then SLAAC, combined with DNS advertisement in the RA, and possibly with DDNS (if needed), would be the right choice. However, if the environment must be more secure, then DHCPv6 is likely to be the better solution.

A word of warning, though—using DHCPv6 to ensure each host received an IPv6 address that can be used anyplace in the network, and then stretching layer 2 to allow any host to roam “anywhere,” is really just not a good idea. I have worked on networks where this kind of thing has been taken to a global scale. It might seem cute at first, but this kind of solution will ultimately become a monster when it grows up.

Posted in STANDARDS, TECH

If you haven’t found the tradeoff…

This week, I ran into an interesting article over at Free Code Camp about design tradeoffs. I’ll wait for a moment if you want to go read the entire article to get the context of the piece… But this is the quote I’m most interested in:

[time-span]

Just like how every action has an equal and opposite reaction, each “positive” design decision necessarily creates a “negative” compromise. Insofar as designs necessarily create compromises, those compromises are very much intentional. (And in the same vein, unintentional compromises are a sign of bad design.)

In other words, design is about making tradeoffs. If you think you’ve found a design with no tradeoffs, well… Guess what? You’ve not looked hard enough. This is something I say often enough, of course, so what’s the point? The point is this: We still don’t really think about this in network design. This shows up in many different places; it’s worth taking a look at just a few.

Hardware is probably the place where network engineers are most conscious of design tradeoffs. Even so, we still tend to think sticking a chassis in a rack is a “future and requirements proof solution” to all our network design woes. With a chassis, of course, we can always expand network capacity with minimal fuss and muss, and since the blades can be replaced, the life cycle of the chassis should be much, much, longer than any sort of fixed configuration unit. As for port count, it seems like it should always be easier to replace line cards than to replace or add a box to get more ports, or higher speeds.

Cross posted at CircleID

But are either of these really true? While it might “just make sense” that a chassis box will last longer than a fixed configuration box, is there real data to back this up? Is it really a lower risk operation to replace the line cards in a chassis (including the brains!) with a new set, rather than building (scaling) out? And what about complexity—is it better to eat the complexity in the chassis, or the complexity in the network? Is it better to push the complexity into the network device, or into the network design? There are actually plenty of tradeoffs consider here, as it turns out—it just sometimes takes a little out of the box thinking to find them.

What about software? Network engineers tend to not think about tradeoffs here. After all, software is just that “stuff” you get when you buy hardware. It’s something you cannot touch, which means you are better off buying software with every feature you think you might ever need. There’s no harm in this right? The vendor is doing all the testing, and all the work of making certain every feature they include works correctly, right out of the box, so just throw the kitchen sink in there, too.

Or maybe not. My lesson here came through an experience in Cisco TAC. My pager went off one morning at 2AM because an image designed to test a feature in EIGRP had failed in production. The crash traced back to some old X.25 code. The customer didn’t even have X.25 enabled anyplace in their network. The truth is that when software systems get large enough, and complex enough, the laws of leaky abstractions, large numbers, and unintended consequences take over. Software defined is not a panacea for every design problem in the world.

What about routing protocols? The standards communities seem focused on creating and maintaining a small handulf of routing protocols, each of which is capable of doing everything. After all, who wants to deploy a routing protocol only to discover, a few years later, that it cannot handle some task that you really need done? Again, maybe not. BGP itself is becoming a complex ecosystem with a lot of interlocking parts and pieces. What started as a complex idea has become more complex over time, and we now have engineers who (seriously) only know one routing protocol—because there is enough to know in this one protocol to spend a lifetime learning it.

In all these situations we have tried to build a universal where a particular would do just fine. There is another side to this pendulum, of course—the custom built network of snowflakes. But the way to prevent snowflakes is not to build giant systems that seem to have every feature anyone can ever imagine.

The way to prevent landing on either extreme—in a world where every device is capable of anything, but cannot be understood by even the smartest of engineers, and a world where every device is uniquely designed to fit its environment, and no other device will do—is consider the tradeoffs.

If you haven’t found the tradeoffs, you haven’t looked hard enough.

A corollary rule, returning to the article that started this rant, is this: unintentional compromises are a sign of bad design.

Posted in DESIGN, WRITTEN

Leave Your Ego at the Door

You are just about to walk into the interview room. Regardless of whether you are being interviewed, or interviewing—what are you thinking about? Are you thinking about winning? Are you thinking about whining? Or are you thinking about engaging? I have noticed, on many mailing lists, and in many other forums, that interviews in our world have devolved into a contest of egos.

The person on the other side of the table has some certification I don’t care about—how can I prove they are dumb, not as smart as their certification might indicate, or… The person on the other side of the table claims to know some protocol, can I find some bit of information they don’t know? These kinds of questions are really just ego questions—and you need to leave them at the door. This is particularly acute with certifications right now—a lot of people doubt the value of certifications, claiming folks who have them don’t know anything, the certifications are worthless, they don’t reflect the real world, etc.

I will agree that we have a problem with the depth and level of knowledge of network engineers at the moment. We all need to grow up a little, learn technologies rather than CLIs, and actually learn how to be engineers. On the other hand, when you interview someone, or when you are being interviewed…

Leave your ego at the door.

Is it really worth losing a really good hire because you needed your ego stroked by “beating” someone in an interview?

No, I didn’t think so.

Posted in CULTURE, WRITTEN