What about I2RS performance?
The first post in this series provides a basic overview of I2RS; there I used a simple diagram to illustrate how I2RS interacts with the RIB—
One question that comes to mind when looking at a data flow like this (or rather should come to mind!) is what kind of performance this setup will provide. Before diving into the answer to this question, though, perhaps it’s important to ask a different question—what kind of performance do you really need? There are (at least) two distinct performance profiles in routing—the time it takes to initially start up a routing peer, and the time it takes to converge on a single topology and/or route change. In reality, this second profile can be further broken down into multiple profiles (with or without an equal cost path, with or without a loop free alternate, etc.), but for our purposes I’ll just deal with the two broad categories here.
If your first instinct is to say that initial convergence time doesn’t matter, go back and review the recent Delta Airlines outage carefully. If you are still not convinced initial convergence time matters, go back and reread what you can find about that outage. And then read about how Facebook shuts down entire data centers to learn what happens, and think about it some more. Keep thinking about it until you are convinced that initial convergence time really matters. 🙂 It’s a matter of “when,” not “if,” where major outages like this are concerned; if you think it is okay for your network to take tens of minutes (or hours) to perform initial convergence before applications can start spinning back up, then you’re just flat wrong.
How fast is fast enough for initial convergence? Let’s assume we have a moderately sized data center fabric, or larger network, with something on the order of 50,000 routes in the table. If your solution can install on the order of 8,000 routes in ten seconds in a lab test (as a recently tested system did), then you’re looking at around a minute to converge on 50,000 routes in a lab. I don’t know what the actual ratio is, but I’d guess the “real world” at least doubles route convergence times, so call it two minutes. Are you okay with that?
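The arithmetic behind those estimates can be sketched in a few lines. This is only a back-of-the-envelope calculation: the function name and the `real_world_factor` parameter are my own illustrative choices, and the factor of two is the guess stated above, not a measured value.

```python
def convergence_seconds(total_routes, routes_per_test, test_seconds,
                        real_world_factor=1.0):
    """Estimate convergence time by scaling a measured install rate.

    real_world_factor is an assumed multiplier for conditions outside
    the lab (2.0 reflects the guess in the text, not a measurement).
    """
    routes_per_second = routes_per_test / test_seconds
    return total_routes / routes_per_second * real_world_factor


# 8,000 routes in ten seconds, scaled to a 50,000 route table.
lab = convergence_seconds(50_000, 8_000, 10.0)
field = convergence_seconds(50_000, 8_000, 10.0, real_world_factor=2.0)

print(f"lab estimate:        {lab:.1f} s")    # 62.5 s, about a minute
print(f"real world estimate: {field:.1f} s")  # 125.0 s, about two minutes
```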
To be honest, I’m not. I’d want something more like ten seconds to converge on 50,000 routes in the real world (not in a lab). Let’s think about what it takes to get there. Working from a routing protocol (not an I2RS object) in the diagram above, we’d need to:
- Receive the routing information
- Calculate the best path(s)
- Install the route into the RIB
- The RIB needs to arbitrate between multiple best paths supplied by protocols
- The RIB then collects the layer 2 header rewrite information
- The RIB then installs the information into the FIB
- The FIB, using magic, pushes the entry to the forwarding ASIC
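To make the steps above concrete, here is a toy sketch of the RIB side of a route install, counting each operation as it goes. Everything in it is hypothetical: `ToyRib`, `Path`, the "lower preference wins" arbitration rule, and the preference numbers are illustrative assumptions, not how any real RIB is implemented. Receiving the route, the protocol's own best path calculation, and the push to the ASIC are left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Path:
    prefix: str
    next_hop: str
    protocol: str
    preference: int  # assumption: lower value wins arbitration


@dataclass
class ToyRib:
    candidates: dict = field(default_factory=dict)  # prefix -> list of Path
    fib: dict = field(default_factory=dict)         # prefix -> (next hop, rewrite)
    operations: int = 0

    def install(self, path, l2_table):
        """Install a protocol's best path; return True if the FIB changed."""
        # The protocol installs its best path into the RIB.
        self.candidates.setdefault(path.prefix, []).append(path)
        self.operations += 1
        # The RIB arbitrates between best paths supplied by protocols.
        best = min(self.candidates[path.prefix], key=lambda p: p.preference)
        self.operations += 1
        # The RIB collects the layer 2 rewrite for the winning next hop.
        rewrite = l2_table.get(best.next_hop)
        self.operations += 1
        if rewrite is None:
            # Next hop unresolved; a real RIB would need a callback later.
            return False
        # The RIB installs the resolved entry into the FIB.
        entry = (best.next_hop, rewrite)
        changed = self.fib.get(path.prefix) != entry
        self.fib[path.prefix] = entry
        self.operations += 1
        return changed


rib = ToyRib()
l2 = {"10.0.0.1": "aa:bb:cc:00:00:01", "10.0.0.2": "aa:bb:cc:00:00:02"}
rib.install(Path("192.0.2.0/24", "10.0.0.1", "ospf", 110), l2)
rib.install(Path("192.0.2.0/24", "10.0.0.2", "bgp", 20), l2)
print(rib.fib["192.0.2.0/24"], rib.operations)
```

Two route installs cost eight counted operations in this sketch, and that is before any callback for an unresolved next hop.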
What is the point of examining this process? To realize that a single route install is not, in fact, a single operation performed by the RIB. Rather, there are several operations here, including potential callbacks from the RIB to the protocol (what happens when BGP installs a route for which the next hop isn’t available, but then becomes available later on, for instance?). The RIB, and any API between the RIB and the protocol, need to operate at about 3 to 4 times the rate at which you expect to actually install routes.
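Applied to the earlier target of 50,000 routes in ten seconds, that multiplier translates into a required operation rate and a time budget per operation. The sketch below just does the multiplication; the function name is hypothetical, and the 3x and 4x multipliers are the rough estimate from the paragraph above.

```python
def required_ops_per_second(routes, seconds, ops_per_route):
    """Operations per second the RIB and its API must sustain."""
    return routes * ops_per_route / seconds


for ops_per_route in (3, 4):
    rate = required_ops_per_second(50_000, 10.0, ops_per_route)
    budget_us = 1_000_000 / rate  # microseconds available per operation
    print(f"{ops_per_route} ops/route: {rate:,.0f} ops/s, "
          f"{budget_us:.1f} us per operation")
```

At four operations per route, each operation has roughly 50 microseconds to complete, end to end.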
What does this mean for I2RS? To install, say, 50,000 routes in 10 seconds at roughly four operations per route, there need to be around 200,000 transactions in that 10 seconds, or about 20,000 transactions per second. Now, consider the following illustration of the entire data path the I2RS controller needs to feed routing information through:
For any route to be installed in the RIB from the I2RS controller, it must be:
- Calculated based on current information
- Marshalled, which includes pouring it into the YANG format, potentially pushed to JSON, and placed into a packet
- Transported, which includes serialization delay, queuing, and the like
- Unmarshalled, or rather locally copied from the YANG format into a format that can be installed into the RIB
- Arbitrated against other routes in the RIB, with the layer 2 rewrite information calculated
- Acknowledged, with any response, such as an “install successful” or “route overridden,” returned through the same process to the I2RS controller
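To get a rough feel for just the marshalling and unmarshalling steps in this list, the sketch below round-trips 20,000 routes through JSON using Python's standard `json` module and times it. The field names (`prefix`, `next-hop`) are made up for illustration and are not taken from the I2RS RIB data model; transport, arbitration, and the response path are not measured at all, so treat the result as a lower bound on one slice of the work.

```python
import json
import time


def marshal(route):
    """Encode one route as compact JSON bytes, ready to place in a packet."""
    return json.dumps(route, separators=(",", ":")).encode("utf-8")


def unmarshal(payload):
    """Decode JSON bytes back into a local structure."""
    return json.loads(payload.decode("utf-8"))


# 20,000 hypothetical routes: one second of work at the target rate.
routes = [
    {"prefix": f"10.{i // 256}.{i % 256}.0/24", "next-hop": "192.0.2.1"}
    for i in range(20_000)
]

start = time.perf_counter()
decoded = [unmarshal(marshal(route)) for route in routes]
elapsed = time.perf_counter() - start

assert decoded == routes  # the round trip must not alter any route
per_route_us = elapsed / len(routes) * 1_000_000
print(f"{len(routes)} round trips in {elapsed:.3f} s "
      f"({per_route_us:.1f} us per route)")
```

Whatever number this prints on your machine, compare it against the roughly 50 microseconds available per transaction at 20,000 transactions per second, and remember that every other step in the list has to fit in the same budget.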
It is, of course, possible to do all of this 20,000 times per second, especially with a lot of heavy optimization in a well-designed, well-operated network. But not all networks operate under ideal conditions all the time, so perhaps replacing the entire control plane with a remote controller isn’t the best idea in the world.
Luckily, I2RS wasn’t designed to replace the entire control plane, but rather to augment it. To explain, the next post will begin considering some use cases where I2RS can be useful.