Service Provider Tech Doesn’t Apply?
Service provider problems are not your problems. You should not be trying to solve your problems the same way service providers do.
This seems intuitively true—after all, just about everything about a train or a large over-the-road truck (or lorry) is different from a passenger car. If the train is the service provider network and the car is the “enterprise” network, it seems to be obvious the two have very little in common.
Or is it?
What this gets right is that if an operator sells access to their network, or a single application, their network is likely to be built differently than the more general-purpose designs used in organizations that must support a wide range of applications and purposes. These differences are likely to show up in the choice of hardware, how the network is operated, and the kinds of services offered (or not).
What this also gets right is that operators who sell access to their networks, or support a single application, always seem to build at a scale far beyond what more general-purpose networks ever reach. Microsoft and Facebook number their servers in the millions, and single purchase orders include thousands of routers. eBay and LinkedIn number their servers in the hundreds of thousands, and their routers and switches in the tens of thousands. How can a small enterprise network of a few hundred servers be anything like these larger networks?
What this gets wrong is assuming none of the technologies, tools, or attitudes from these larger-scale networks is ever applicable to the smaller networks many engineers encounter on a day-to-day basis.
All those networks with BGP deployed in their data center fabrics are using technology designed primarily for interconnecting intermediate systems on the default-free zone—in other words, for connecting the networks of transit service providers. All those networks with OSPF deployed are using a link state protocol originally designed to provide edge-to-edge reachability in transit service provider networks. All those networks with IS-IS deployed are using a link state protocol originally designed to provide connectivity to large-scale telephony-style networks.
What about transport technologies? The only transport technologies originally designed specifically for “enterprise use” have long since been replaced by optical technologies designed for large-scale provider or “hyperscale” use. Token Ring and ARCnet are long gone, as is the original shared medium Ethernet, replaced by switched Ethernet largely over optical transport. Even current general WiFi is primarily designed for public operator use cases—look at 5G and WiFi 6 and note how public operator requirements have influenced these technologies.
The truth is there is no “pure” enterprise technology; following the dictum that you should not use “service-provider technologies” in your network would leave you with … no network at all.
There is a second realm where this line of argument falls flat, and it's more important than the question of which technologies to use: the techniques and attitudes learned in the operation of truly large-scale networks hold valuable lessons for all network engineers. Should you use a spine and leaf topology in your data center, rather than a more traditional hierarchical design? The answer has nothing to do with scale, and everything to do with flexibility in design and operational agility. Should you automate your network, even if it's only ten routers? The answer has nothing to do with what Amazon is doing, and everything to do with how much time you want to spend on configuring and troubleshooting versus responding to real business needs.
Think of it this way: the driver who drives the large over-the-road truck is still going to learn lessons and instincts about driving that will make them a better driver in a minivan.
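To make the automation question above a little more concrete, here is a minimal sketch of generating base configurations for ten routers from a single template. Everything in it is an assumption for illustration: the `rtr01`-style hostnames, the `10.0.0.x` loopbacks, the NTP server address, and the generic configuration syntax, which is not any particular vendor's CLI.

```python
from string import Template

# Hypothetical per-device template. The syntax is generic and
# illustrative, not a specific vendor's configuration language.
BASE_TEMPLATE = Template(
    "hostname $hostname\n"
    "interface loopback0\n"
    " ip address $loopback/32\n"
    "ntp server $ntp_server\n"
)


def render_configs(count, ntp_server="192.0.2.1"):
    """Return a dict mapping hostname to rendered config text."""
    configs = {}
    for n in range(1, count + 1):
        hostname = f"rtr{n:02d}"  # assumed naming scheme
        configs[hostname] = BASE_TEMPLATE.substitute(
            hostname=hostname,
            loopback=f"10.0.0.{n}",  # assumed addressing plan
            ntp_server=ntp_server,
        )
    return configs


if __name__ == "__main__":
    configs = render_configs(10)
    print(configs["rtr03"])
```

Even at this size, a change to the NTP server becomes a one-line edit that is applied consistently to every device, rather than ten manual edits to check afterwards.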
Come join me at NXTWORK in November to continue the conversation in my master class on building and operating data center fabrics, as I explore how you can apply lessons from the hyperscale world to your network.
Agree with Russ on most of the points.
But you need to consider that it all boils down to the Business models that apply to Enterprises vs. Service Providers vs. Web Scales.
For most enterprises IT is a necessary Evil to keep the lights on and get business processes to work. That’s why in many cases Cloud makes sense for them, as it allows them to move to a consumption model which is flexible, low Capex, Agile and so forth (assuming they get their homework done right).
While Clos is a good example of Modular design, with other advantages besides, someone must do the required hard work to get it right and to decide whether it makes sense at all for an Enterprise.
In my personal experience, most Enterprises don’t necessarily have people with deep Technical skills. So if someone decides to move to Clos to solve problems around east-west traffic per se, they had better have data to prove that point instead of being driven by Vendor marketing.
I was called by a Telco a few years back to do a DC refresh for them. While my assessment with data said 95% of their DC traffic was North-South in nature, they ended up building a Clos fabric based on a vendor recommendation. That shouldn’t hurt that much, though, but they had no operational or theoretical understanding of how a Layer 3 fabric in a Clos design works, how much re-architecture they would need to get the new design and integrations working, or how much their operational staff would need to improve at routing, VXLAN and other stuff.
So it turned out they were set up for failure despite my recommendations, as they believed the Vendor is the ultimate God and knows better.
I guess they realized and understood it the hard way later, after a couple of outages and a project which took significantly longer than it should have.
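The kind of data the comment above asks for can be approximated from flow records. The following is only a rough sketch, not the commenter's actual method: it assumes flows are available as `(source, destination, bytes)` tuples, and that the data center's own address space is known (the `10.10.0.0/16` prefix here is a made-up example). Traffic with both endpoints inside the data center is counted as east-west; everything else is counted as north-south.

```python
import ipaddress

# Assumed data center address space, for illustration only.
DC_PREFIXES = [ipaddress.ip_network("10.10.0.0/16")]


def is_internal(addr):
    """True if the address falls inside any data center prefix."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in DC_PREFIXES)


def traffic_split(flows):
    """Return (east_west, north_south) as fractions of total bytes.

    flows is an iterable of (src, dst, byte_count) tuples. A flow is
    east-west only when both endpoints are internal; any other flow is
    treated as north-south.
    """
    east_west = 0
    north_south = 0
    for src, dst, byte_count in flows:
        if is_internal(src) and is_internal(dst):
            east_west += byte_count
        else:
            north_south += byte_count
    total = east_west + north_south
    if total == 0:
        return 0.0, 0.0
    return east_west / total, north_south / total


if __name__ == "__main__":
    sample_flows = [
        ("10.10.1.5", "10.10.2.9", 500),      # server to server
        ("10.10.1.5", "198.51.100.7", 9000),  # server to outside
        ("203.0.113.4", "10.10.3.3", 500),    # outside to server
    ]
    ew, ns = traffic_split(sample_flows)
    print(f"east-west: {ew:.0%}, north-south: {ns:.0%}")
```

A real assessment would need a representative collection period and an accurate list of internal prefixes, but even a rough split like this gives a design discussion something firmer than vendor marketing to rest on.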
[…] This article from Russ had some great insights into why it’s not wise to entirely rule out doing things the way service providers do just because you’re working in enterprise. I’ve had experience in both SPs and enterprise and I agree that there are things that can be learnt on both sides. […]