I posted a link last week to a story worth reading about Liqid’s composable hyperconverged system. A reader (Vova Moki) commented on the LinkedIn post with this question:
Although I don’t understand how much faster is the PCIe than regular NICs?
Excellent question! PCIe is currently faster than Ethernet: this article lists the highest speed of PCIe as 15.8 GBytes/second across 16 lanes, with faster speeds expected in the future. The two have similar road maps for speed, though. In 2018, PCIe 4 is expected, which will run at 31.5 GBytes/second; Ethernet, on the other hand, expects 400 Gbits/second, or about 50 GBytes/second. The road maps are similar enough that speed will likely be only a short-term advantage on either side of the equation.
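Since PCIe speeds are usually quoted in bytes per second and Ethernet speeds in bits per second, it is worth checking the conversion. The following is a minimal Python sketch using only the figures quoted above; the function and variable names are mine, and it ignores encoding and protocol overhead on both sides.

```python
def gbits_to_gbytes(gbits_per_second: float) -> float:
    """Convert a raw rate in Gbits/second to GBytes/second (8 bits per byte)."""
    return gbits_per_second / 8.0

# PCIe figures as quoted in the text, in GBytes/second across 16 lanes.
pcie_current_x16 = 15.8
pcie4_x16 = 31.5

# Ethernet figures, converted from Gbits/second to GBytes/second.
ethernet_100g = gbits_to_gbytes(100)
ethernet_400g = gbits_to_gbytes(400)

print(f"100G Ethernet: {ethernet_100g} GBytes/s, current PCIe x16: {pcie_current_x16} GBytes/s")
print(f"400G Ethernet: {ethernet_400g} GBytes/s, PCIe 4 x16: {pcie4_x16} GBytes/s")
```

On raw numbers, then, 100G Ethernet sits a little below a current 16-lane PCIe link, and 400G Ethernet sits above PCIe 4.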
However, PCIe runs on parallel lanes, which means it must be very difficult to build a switch for the technology. The simplest way to build such a switch would be to pull the signals off the 16 different lanes, serialize them into a single packet of some sort, and then push them back out into 16 lanes again (potentially in a different order).
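To make that "simplest" design concrete, here is a toy Python sketch of the idea, assuming bytes are spread round-robin across the lanes. The function names are hypothetical, and this is only an illustration of gathering and re-striping; a real PCIe switch also has to deal with framing, routing, flow control, and timing, none of which appears here.

```python
LANES = 16

def stripe(data: bytes, lanes: int = LANES) -> list[bytes]:
    """Spread a byte stream across lanes: byte i goes to lane i % lanes."""
    return [data[lane::lanes] for lane in range(lanes)]

def gather(lane_data: list[bytes]) -> bytes:
    """Rebuild the original byte stream from the per-lane byte streams."""
    lanes = len(lane_data)
    total = sum(len(chunk) for chunk in lane_data)
    out = bytearray(total)
    for lane, chunk in enumerate(lane_data):
        out[lane::lanes] = chunk  # put each lane's bytes back in their slots
    return bytes(out)

def switch_forward(ingress_lanes: list[bytes], egress_lane_count: int = LANES) -> list[bytes]:
    """Toy switch: serialize the ingress lanes into one packet, then re-stripe it."""
    packet = gather(ingress_lanes)            # pull the data off the lanes
    return stripe(packet, egress_lane_count)  # push it back out across lanes

payload = bytes(range(40))
ingress = stripe(payload)
egress = switch_forward(ingress)
print(gather(egress) == payload)
```

Even in this toy form, the switch has to hold the whole packet before it can forward anything, which hints at why doing this at full speed in hardware is not trivial.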
So why should composable systems use something like PCIe, rather than using 100G Ethernet? After all, the Ethernet NIC is essentially doing precisely what a PCIe switch would need to do by pulling the data off a PCIe bus, serializing the data, and sending it over a network to a switch, which can, with the right design, already switch these packets back to another NIC at line rate. And line rate is in the same range: 100 Gbits/second is about 12.5 GBytes/second, against the 15.8 GBytes/second quoted for PCIe above, with 400G Ethernet on the road map.
While I am certainly not privy to vendor discussions around such decisions, I can imagine a line of argument that would be persuasive. Specifically, the argument would be based on the idea that less is more.
Assume you wanted to build a composable system much like the one Liqid is building, using Ethernet as the primary fabric. You would need to build the sort of thing shown in the illustration below.
The goal here is to be able to attach either processor, CPU1 or CPU2, to any set of storage and network devices. If you build such a system out of Ethernet, you must have one of two things: each component with an Ethernet interface, or some sort of fabric-side NIC, along with the appropriate processing power (probably just a small CPU), and the correct PCIe chipset, to translate the PCIe command set into some form transportable through Ethernet. Ultimately, the processor must somehow believe that the selected SSD (storage) and NIC (network interface) are directly connected through its local PCIe bus, including interrupt handling, parallel transfer, etc., for applications designed to run on a more traditional “non-composable” system to run on this sort of composable system.
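As a rough illustration of what "some form transportable through Ethernet" might involve, here is a minimal Python sketch that wraps an opaque PCIe packet in an Ethernet frame. Everything about the format is an assumption made for illustration: the shim header (a sequence number and a length), the function names, and the choice of 0x88B5, one of the EtherType values reserved for local experimental use. It is not any vendor's or standard's encapsulation.

```python
import struct

ETHERTYPE_EXPERIMENTAL = 0x88B5  # reserved for local experimental use
MIN_PAYLOAD = 46                 # minimum Ethernet payload size, excluding the FCS

def encapsulate(dst_mac: bytes, src_mac: bytes, seq: int, pcie_packet: bytes) -> bytes:
    """Wrap an opaque PCIe packet in an Ethernet frame with a hypothetical shim header."""
    shim = struct.pack("!IH", seq, len(pcie_packet))  # sequence number, payload length
    payload = shim + pcie_packet
    payload += b"\x00" * max(0, MIN_PAYLOAD - len(payload))  # pad short frames
    header = struct.pack("!6s6sH", dst_mac, src_mac, ETHERTYPE_EXPERIMENTAL)
    return header + payload

def decapsulate(frame: bytes) -> tuple[int, bytes]:
    """Recover the sequence number and PCIe packet from a frame built by encapsulate()."""
    _dst, _src, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != ETHERTYPE_EXPERIMENTAL:
        raise ValueError("not a tunnelled PCIe frame")
    seq, length = struct.unpack("!IH", frame[14:20])
    return seq, frame[20:20 + length]

frame = encapsulate(b"\x02" * 6, b"\x04" * 6, 7, b"opaque-pcie-bytes")
print(decapsulate(frame))
```

The framing itself is the easy part. What this sketch leaves out (recovering from lost frames, preserving ordering, delivering interrupts, and keeping latency low enough that the processor still believes the device is local) is where the real design work would be.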
In designing such a device, the question would come down to this: is it easier to extend PCIe to support switching, and longer runs, or is it easier to design an entire protocol to (effectively) run PCIe over Ethernet? There are valid arguments for both answers, I think, but this is the essential question that must be asked, and answered.
In this case, the less is more argument is that it is easier to live with the lower bandwidth PCIe bus, figuring out how to build a PCIe switch and how to transport PCIe over longer distances, than it would be to build PCIe over Ethernet, with all that would potentially entail.
This, I think, is the reason for building such a composable system in this way.