On the ‘net: The Network Collective and Choosing a Routing Protocol

The Network Collective is a new and very interesting video cast of various people sitting around a virtual table talking about topics of interest to network engineers. I was on the second episode last night, and the video is already (!) posted this morning. You should definitely watch this one!

In episode 2 our panel discusses some key differences between routing protocols and the details that should be considered before choosing to implement one over another. Is there any difference between IGP routing protocols at this point? When does it make sense to run BGP in an enterprise network? Is IS-IS an old and decaying protocol, or a viable option you should consider? Russ White, Kevin Myers, and the co-hosts of Network Collective tackle these questions and more.

MegaSwitch: an interesting new data center fabric

Data center fabrics are built today using spine and leaf topologies, lots of fiber, and a lot of routers. There has been a lot of research into all-optical solutions to replace current designs with something different; MegaSwitch is a recent paper that illustrates the research, and potentially a future trend, in data center design. The basic idea is this: give every host its own fiber in a ring that reaches to every other host. Then use optical multiplexers to pull off the signal from each ring any particular host needs in order to provide a switchable set of connections in near real time. The figure below will be used to explain.

In the illustration, there are four hosts, each of which is connected to an electrical switch (EWS). The EWS, in turn, connects to an optical switch (OWS). The OWS channels the outbound (transmitted) traffic from each host onto a single ring, where it is carried to every other OWS in the network. The optical signal is terminated at the hop before the transmitter to prevent any loops from forming (so A’s optical signal is terminated at D, for instance, assuming the ring runs clockwise in the diagram).
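The ring behavior described above can be sketched in a few lines of Python. This is only an illustration of the description, not code from the paper: the node names A through D, the clockwise ordering, and the function names are assumptions made for the example.

```python
def ring_receivers(nodes, sender):
    """Return the nodes that see the sender's signal, in ring order.

    `nodes` lists the OWS positions in (assumed) clockwise order. The
    signal visits every other node once and never returns to the sender.
    """
    start = nodes.index(sender)
    count = len(nodes)
    return [nodes[(start + hop) % count] for hop in range(1, count)]


def termination_point(nodes, sender):
    """Return the node where the sender's signal is terminated.

    This is the last hop before the transmitter, which is what keeps
    the signal from looping around the ring.
    """
    return ring_receivers(nodes, sender)[-1]


if __name__ == "__main__":
    ring = ["A", "B", "C", "D"]  # hypothetical clockwise order
    for node in ring:
        print(node, "->", ring_receivers(ring, node),
              "terminated at", termination_point(ring, node))
```

With the assumed clockwise order, the signal from A reaches B, C, and D and is terminated at D, matching the example in the text.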

The receive side is where things get interesting; there are four full fibers feeding a single fiber towards the server, so it is possible for four times as much information to be transmitted towards the server as the server can receive. The reality is, however, that not every server needs to talk to every other server all the time; some form of switching seems to be in order, so that only the traffic the server actually needs is pulled from the optical rings and carried towards it.

To support switching, the OWS is dynamically programmed to only pull traffic from rings the attached host is currently communicating with. The OWS takes the traffic for each server sending to the local host, multiplexes it onto an optical interface, and sends it to the electrical switch, which then sends the correct information to the attached host. The OWS can increase the bandwidth between two servers by assigning more wavelengths on the OWS-to-EWS link to traffic being pulled off a particular ring, and reduce available bandwidth by assigning fewer wavelengths.
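A rough way to picture the wavelength assignment is as a small bookkeeping problem on the OWS-to-EWS link. The sketch below is a simplified model, not the controller described in the paper; the class name, the number of wavelengths, and the per-wavelength rate are all assumptions chosen for illustration.

```python
class OpticalSwitch:
    """Toy model of one OWS and its link towards the local EWS."""

    def __init__(self, total_wavelengths, gbps_per_wavelength):
        self.total_wavelengths = total_wavelengths      # assumed link capacity
        self.gbps_per_wavelength = gbps_per_wavelength  # assumed rate
        self.allocation = {}  # source ring -> wavelengths assigned to it

    def free_wavelengths(self):
        return self.total_wavelengths - sum(self.allocation.values())

    def set_wavelengths(self, ring, count):
        """Assign `count` wavelengths to traffic pulled off `ring`."""
        if count < 0:
            raise ValueError("wavelength count cannot be negative")
        current = self.allocation.get(ring, 0)
        if count - current > self.free_wavelengths():
            raise ValueError("not enough free wavelengths on the link")
        if count == 0:
            # Stop pulling traffic from this ring entirely.
            self.allocation.pop(ring, None)
        else:
            self.allocation[ring] = count

    def bandwidth_gbps(self, ring):
        return self.allocation.get(ring, 0) * self.gbps_per_wavelength


if __name__ == "__main__":
    ows = OpticalSwitch(total_wavelengths=8, gbps_per_wavelength=10)
    ows.set_wavelengths("A", 2)  # modest bandwidth from host A's ring
    ows.set_wavelengths("B", 6)  # heavy transfer from host B's ring
    print(ows.bandwidth_gbps("A"), ows.bandwidth_gbps("B"))
    ows.set_wavelengths("B", 1)  # B's transfer ends; shrink its share
    ows.set_wavelengths("A", 7)  # hand the freed wavelengths to A
    print(ows.bandwidth_gbps("A"), ows.bandwidth_gbps("B"))
```

The point of the model is that bandwidth between a pair of hosts is just the number of wavelengths assigned to that ring times the per-wavelength rate, and that growing one allocation requires free wavelengths or shrinking another.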

There are a number of possible problems with such a scheme; for instance:

  • When a host sends its first packet to another host, or needs to send just a small stream, there is a massive amount of overhead in time and resources setting up a new wavelength allocation at the correct OWS. To resolve these problems, the researchers propose having a full mesh of connectivity at some small portion of the overall available bandwidth; they call this basemesh.
  • This arrangement allows for bandwidth allocation at a per pair of hosts level, but much of the modern data networking world operates on a per flow basis. The researchers suggest this can be resolved by using the physical connectivity as a base for building a set of virtual LANs, with packets routed between these various vLANs. This means that traditional routing must stay in place to actually direct traffic to the correct destination in the network, so the EWS devices must either be routers, or there must be some centralized virtual router through which all traffic passes.
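The basemesh idea from the first point can be sketched as a capacity split: hold back a small, always-on slice for every peer, and leave the rest to be assigned on demand. This is a deliberately simplified model and not the paper's actual basemesh construction; the function names, the one-wavelength-per-peer default, and the byte threshold are assumptions made for the example.

```python
def plan_basemesh(hosts, total_wavelengths, per_peer=1):
    """Split each host's receive capacity into basemesh and dynamic parts.

    Every host keeps `per_peer` wavelengths permanently tuned to each
    other host's ring; whatever is left forms the dynamic pool.
    """
    plan = {}
    for host in hosts:
        peers = [other for other in hosts if other != host]
        reserved = per_peer * len(peers)
        if reserved > total_wavelengths:
            raise ValueError("basemesh does not fit in the available wavelengths")
        plan[host] = {
            "basemesh": {peer: per_peer for peer in peers},
            "dynamic_pool": total_wavelengths - reserved,
        }
    return plan


def pick_path(flow_bytes, threshold_bytes):
    """Send small flows over the basemesh; request wavelengths for large ones."""
    return "basemesh" if flow_bytes < threshold_bytes else "dynamic"


if __name__ == "__main__":
    plan = plan_basemesh(["A", "B", "C", "D"], total_wavelengths=8)
    print(plan["A"])
    print(pick_path(flow_bytes=1_500, threshold_bytes=1_000_000))
    print(pick_path(flow_bytes=50_000_000, threshold_bytes=1_000_000))
```

In this model a first packet or a short stream never waits on a wavelength setup, at the cost of permanently tying up part of each link.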

Is something like MegaSwitch the future of data center networks? Right now it is hard to tell. All-optical fabrics have been a recurring idea in network design, but never seem to have “broken out” as a preferred solution. The idea is attractive, but what essentially amounts to a variable speed optical underlay combined with a more traditional routed overlay adds a lot of complexity into the mix, and it is hard to say if that complexity is really worth the tradeoff, which primarily seems to be simpler and cheaper cabling.

You can read the full MegaSwitch paper here.

Administravia 20170420

A couple of minor items for this week. First, I’ve removed the series page, and started adding subcategories. I think the subcategories will be more helpful in finding the material you’re looking for among the 700-ish posts on this site. I need to work through the rest of the posts here to build more subcategories, but what is there is a start. Second, I’ve changed the primary domain from to, and started using the rule 11 reader name more than the ‘net Work name. will still work to reach this site, eventually will time out and die. Finally, I’ve put it on my todo list to get a chronological post page up at some point.

Happy Reading!