Multicast is, at best, difficult to deploy in large scale networks—PIM sparse and BIDIR are both complex, adding large amounts of state to intermediate devices. In the worst case, there is no apparent way to deploy any existing version of PIM, such as large-scale spine and leaf networks (variations on the venerable Clos fabric). BEIR, described in RFC8279, aims to solve the per-device state of traditional multicast.
In this network, assume A has some packet that needs to be delivered to T, V, and X. A could generate three packets, each one addressed to one of the destinations—but replicating the packet at A is wastes network resources on the A->B link, at least. Using PIM, these three destinations could be placed in a multicast group (a multicast address can be created that describes T, V, and X as a single destination). After this, a reverse shortest path tree can be calculated from each of the destinations in the group towards the source, A, and the correct forwarding state (the outgoing interface list) be installed at each of the routers in the network (or at least along the correct paths). This, however, adds a lot of state to the network.
BIER provides another option.
Returning to the example, if A is the sender, B is called the Bit-Forwarding Ingress Router, or BFIR. When B receives this packet from A, it determines T, V, and X are the correct destinations, and examines its local routing table to determine T is connected to M, V, is connected to N, and X is connected to R. Each of these are called a Bit-Forwarding Egress Router (BFER) in the network; each has a particular bit set aside in a bit field. For instance, if the network operator has set up an 8-bit BIER bit field, M might be bit 3, N might be bit 4, and R might be bit 6.
Given this, A will examine its local routing table to find the shortest path to M, N, and R. It will find the shortest path to M is through D, the shortest path to N is through C, and the shortest path to R is through C. A will then create two copies of the packet (it will replicate the packet). It will encapsulate one copy in a BIER header, setting the third bit in the header, and send the packet on to D. It will encapsulate the second copy into a BIER header, as well, setting the fourth and sixth bits in the header, and send the packet on to C.
D will receive the packet, determine the destination is M (because the third bit is set in the BIER header), look up the shortest path to M in its local routing table, and forward the packet. When M receives the packet, it pops the BIER header, finds T is the final destination address, and forwards the packet.
When C receives the packet, it finds there are two destinations, R and N, based on the bits set in the BIER header. The shortest path to N is through H, so it clears bit 6 (representing R as a destination), and forwards the packet to H. When H receives the packet, it finds N is the destination (because bit 4 is set in the BIER header), looks up N in its local routing table, and forwards the packet along the shortest path towards N. N will remove the BIER header and forward the packet to V. C will also forward the packet to G after clearing bit 4 (representing N as a destination BFER). G will examine the BIER header, find that R is a destination, examine its local routing table for the shortest path to R, and forward the packet. R, on receiving the packet, will strip the BIER header and forward the packet to X.
While the BFIR must know how to translate a single packet into multiple destinations, either through a group address, manual configuration, or some other signaling mechanism, the state at the remaining routers in the network is a simple table of BIER bitfield mappings to BFER destination addresses. It does not matter how large the tree gets, the amount of state is held to the number of BFERs at the network edge. This is completely different from any version of PIM, where the state at every router along the path depends on the number of senders and receivers.
The tradeoff is the hardware of the BIER Forwarding Routers (BFRs) must be able to support looking up the destination based on the bit set in the BIER bitfield in the header. Since the size of the BIER bitfield is variable, this is… challenging. One option is to use an MPLS label stack as a way to express the BIER header, which leverages existing MPLS implementations. Still, though, MPLS label depth is often limited, the ability to forward to multiple destinations is challenging, and the additional signaling required is more than trivial.
BIER is an interesting technology; it seems ideal for use in highly meshed data center fabrics, and even in longer haul and wireless networks where bandwidth is at a premium. Whether or not the hardware challenges can be overcome is a question yet to be answered.
HW challenges on the high end silicon have been mostly solved, we will see 3 implementations coming out this year.
Thanks Jeff! It’s good to know there is silicon coming out in this area.