Anycast and Latency

One of the things I hear from time to time is that smaller Internet-facing service deployments, with just a few instances, cannot really benefit from anycast. This comes up particularly in the active-active data center use case, where customers can connect to one data center or another. The cost of advertising the service as an anycast address, and the resulting requirement to keep the backend databases tightly synchronized, is often portrayed as taking on a lot of complexity in exchange for the simplicity of having a single address in the DNS, and hence not losing customer interaction time while DNS records time out so the customer can reconnect to the service.

There is, in fact, some interesting recent research in this area. The research is directed at the DNS root servers themselves, probably because they are publicly accessible and make up a well-known system that has relied on anycast for many years (so the operators of the root DNS servers are probably well versed in the ways of anycast). One interesting chart from the post over at APNIC’s blog shows how the root servers compare from an RTT perspective.

The C root has 8 servers, while the L root has around 144 (according to the article pointed to above). Why is it that the C and L roots both show about the same performance, from an RTT perspective, even though they have wildly different numbers of servers serving an anycast address? The most likely explanation lies in the problem of diminishing returns; once you get past some number of servers in an anycast cluster, the gain from adding “one more server” really isn’t all that great.

How many servers? The authors of the paper say the magic number is 12.

But it is important to point out that root servers serve a (largely) global audience, and so the “customers” of each of these anycast addresses are bound to be widely dispersed. For geographically narrower audiences, the problem of diminishing returns will likely set in more quickly, perhaps with three or four servers servicing the anycast address.
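To get a feel for why returns diminish, here is a rough Python sketch. It is a toy model, not the method used in the research: clients and servers are scattered uniformly at random over a circular region, every client is assumed to reach its geographically nearest server (real anycast follows BGP path selection, which often does not match geography), and the RTT floor assumes light covers about 200 km per millisecond in fiber. The function names, the 1,500 km regional radius, and the server counts are all made up for illustration.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0
FIBER_KM_PER_MS = 200.0  # assumption: roughly 200 km per millisecond in fiber


def random_point_in_cap(rng, cap_radius_km):
    """Uniform random (lat, lon) in radians inside a spherical cap centred on the pole."""
    cap_angle = min(math.pi, cap_radius_km / EARTH_RADIUS_KM)
    # Uniform over area means cos(polar angle) is uniform on [cos(cap_angle), 1].
    polar = math.acos(rng.uniform(math.cos(cap_angle), 1.0))
    return math.pi / 2 - polar, rng.uniform(-math.pi, math.pi)


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points given in radians."""
    dlat = b[0] - a[0]
    dlon = b[1] - a[1]
    h = (math.sin(dlat / 2) ** 2
         + math.cos(a[0]) * math.cos(b[0]) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def mean_nearest_km(num_servers, cap_radius_km, num_clients=500, trials=10, seed=1):
    """Average distance from a random client to its nearest randomly placed server."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        servers = [random_point_in_cap(rng, cap_radius_km) for _ in range(num_servers)]
        for _ in range(num_clients):
            client = random_point_in_cap(rng, cap_radius_km)
            total += min(great_circle_km(client, s) for s in servers)
    return total / (trials * num_clients)


def rtt_floor_ms(distance_km):
    """Round-trip propagation delay only; ignores queuing and indirect paths."""
    return 2.0 * distance_km / FIBER_KM_PER_MS


if __name__ == "__main__":
    GLOBAL_KM = math.pi * EARTH_RADIUS_KM  # a cap this large covers the whole sphere
    REGIONAL_KM = 1500.0                   # hypothetical regional footprint
    for label, radius in (("global", GLOBAL_KM), ("regional", REGIONAL_KM)):
        for n in (1, 2, 4, 8, 12, 24, 48):
            km = mean_nearest_km(n, radius)
            print(f"{label:8s} servers={n:3d} mean_km={km:8.0f} "
                  f"rtt_floor_ms={rtt_floor_ms(km):6.1f}")
```

Running it prints one row per server count for a global and a regional footprint, so you can see how much each additional batch of servers shaves off the average. Treat the numbers as a way to build intuition about the shape of the curve, not as a prediction for any real deployment. In the regional case there is simply less latency to win back, since under these assumptions no client is ever more than 3,000 km (about 30 ms round trip) from a server.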

The bottom line? Geographic dispersion against your customer base probably has more impact on the effectiveness of anycast in spreading load than the number of servers serving the anycast address, above some minimal number (probably around three or four). If you are running any sort of service that needs high availability, even if it is grounded in TCP rather than UDP or QUIC, it is well worth looking at anycast to provide faster service and higher availability.