Hedge 142: George Michaelson and the Pace of IPv6 Deployment

IPv6 is still being deployed, years after the first world IPv6 day, even more years after its first acceptance as an Internet standard by the IETF. What is taking so long? George Michaelson (APNIC) joins Tom Ammon and Russ White on this episode of the Hedge to discuss the current pace of IPv6 deployment, where there are wins, and why things might be moving more slowly in other areas.

download

RFC9199: Lessons in Large-scale Service Deployment

While RFC9199 (are we really in the 9000’s?) is targeted at large-scale DNS deployments–specifically root zone operators–so it might seem the average operator won’t find a lot of value here.

This is, however, far from the truth. Every lesson we’ve learned in deploying large-scale DNS root servers applies to any other large-scale user-facing service. Internally deployed DNS recursive servers are an obvious instance, but the lessons here might well apply to a scheduling, banking, or any other multi-user application accessed from a lot of places by a lot of different users. There are some unique points in DNS, such as the relatively slower pace of database synchronization across nodes, but the network-side lessons can still be useful for a lot of applications.

What are those lessons?

First, using anycast dramatically improves performance for these kinds of services. For those who aren’t familiar with the concept, anycase turns an IP address into a service identifier. Any host with a copy (or instance) or a given service advertises the same address, causing the routing table to choose the (topologically) closest instance of the service. If you’re using anycast, traffic destined to your service will automatically be forwarded to the closest server running the application, providing a kind of load sharing among multiple instances through routing. If there are instances in New York, California, France, and Taipei, traffic from users in North Carolina will be routed to New York and traffic from users in Singapore will be routed to Taipei.

You can think of an anycast address something like a cell tower; users within a certain desintance will be “captured” by a particular instance. The more copies of the service you deploy, the smaller the geographic region the service will support. Hence you can control the number of users using a particular copy of the service by controlling the number and location of service copies.

To understand where and how to deploy service instances, create anycast catchment maps. Again, just like a wifi signal coverage map, or a cellphone tower coverage map, it’s important to understand which users will be directed to which instances. Using a catchment map will help you decide where new instances need to be deployed, which instances need the fastest links and hardware, etc. The RIPE ATLAS pobes and looking glass servers are good ways to start building such a map. If the application supports a large number of users, you might be able to convince the application developer to include some sort of geographic information in requests to help build these maps.

Third, when deploying service instance, pay as much attention to routing and connectivity as you do the number of instances deployed. As the authors note, sometimes eight instances will provide the same level of service as several thousand instances. The connectivity available into each instance of the service–bandwidth, delay, availability, etc.–still has a huge impact on service speed.

Fourth, reduce the speed at which the database needs to be synchronized where possible. Not every piece of information needs to be synchronized at the same rate. The less data being synchronized, the more consistent the view from multiple users is going to be.

RFC9199 is well worth reading, even for the average network engineer.

Hedge 141: Improving WAN Router Performance

Wide area networks in large-scale cores tend to be performance choke-points—partially because of differentials between the traffic they’re receiving from data center fabrics, campuses, and other sources, and the availability of outbound bandwidth, and partially because these routers tend to be a focal point for policy implementation. Rachee Singh joins Tom Ammon, Jeff Tantsura, and Russ White to discuss “Shoofly, a tool for provisioning wide-area backbones that bypasses routers by keeping traffic in the optical domain for as long as possible.”

download

BGP Peering (1)

Why does BGP use TCP for peering? What happens if two BGP speakers begin the peering process at the same time? In this video, recorded for Packet Pushers, I start looking at the BGP peering process.

https://packetpushers.net/learning-bgp-module-2-lesson-1-peering-part-1-video/

Learning to Ride

Have you ever taught a kid to ride a bike? Kids always begin the process by shifting their focus from the handlebars to the pedals, trying to feel out how to keep the right amount of pressure on each pedal, control the handlebars, and keep moving … so they can stay balanced. During this initial learning phase, the kid will keep their eyes down, looking at the pedals, the handlebars, and . . . the ground.

After some time of riding, though, managing the pedals and handlebars are embedded in “muscle memory,” allowing them to get their head up and focus on where they’re going rather than on the mechanical process of riding. After a lot of experience, bike riders can start doing wheelies, or jumps, or off-road riding that goes far beyond basic balance.
Network engineer—any kind of engineering, really—is the same way.

At first, you need to focus on what you are doing. How is this configured? What specific output am I looking for in this show command? What field do I need to use in this data structure to automate that? Where do I look to find out about these fields, defects, etc.?

The problem is—it is easy to get stuck at this level, focusing on configurations, automation, and the “what” of things.

You’re not going to be able to get your head up and think about the longer term—the trail ahead, the end-point you’re trying to reach—until you commit these things to muscle memory.
The point, with technology, is learning to stop focusing on the pedals, the handlebars, and the ground, and start focusing on the goal—whether its nailing this jump or conquering this trail or making it there.

Transitioning is often hard, of course, but its just like riding a bike. You won’t make the transition until you trust your muscle memory a bit at a time.

Learning the theory of how and why things work the way they are is a key point in this transition. Configuration is just the intersection of “how this works” with “what am I trying to do…” If you know how (and why) protocols work, and you know what you’re trying to do, configuration and automation will become a matter of asking the right questions.

Learn the theory, and riding the bike will become second nature—rather than something you must focus on constantly.

Hedge 140: Aftab S and RIR Policies

Regional Internet Registries (RIRs) assign and manage numbered Internet resources like IPv4 address space, IPv6 address, and AS numbers. If you ever try to get address space or an AS number, though, it might seem like the policies the RIRs use to determine what kin and scale of resources you can get are a bit arbitrary (or even, perhaps, odd). Aftab Siddiqui joins Russ White and Tom Ammon to explain how and why these policies are set the way they are.

download

Hedge 139: Open Source Supply Chain Security

There is a rising concern about the security of open source projects—particularly in terms of open source software supply chain. Alistair Woodman, who works closely with multiple open source software projects, joins Tom and Russ to discuss the reality of securing open source projects. The final answer? Essentially, buyer—or in the case of open source software, user—beware.

download