GRE was the first tunneling protocol ever designed and deployed—and although it has largely been overtaken by VXLAN and other tunneling protocols, it is still in widespread use today. For this episode of the History of Networking, Stan Hanks, the inventor of GRE—and hence the inventor of the concept of tunneling in packet-switched networks—joins us to describe how and why GRE tunneling was invented.

download

I think we can all agree networks have become too complex—and this complexity is a result of the network often becoming the “final dumping ground” for every problem that seems like it might impact more than one system, and for everything no one else can figure out how to solve. It’s rather humorous, in fact, to see a lot of server and application folks sitting around saying “this networking stuff is so complex—let’s design something better and simpler in our bespoke overlay…” and then falling into the same complexity traps as they start facing the real problems of policy and scale.

This complexity cannot be “automated away.” It can be smeared over with intent, but we’re going to find—soon enough—that smearing intent on top of complexity just makes for a dirty kitchen and a sub-standard meal.

While this is always “top of mind” in my world, what brings it to mind this particular week is a paper by Jen Rexford et al. (I know Jen isn’t in the lead position on the author list, but still…) called A Clean Slate 4D Approach to Network Control and Management. Of course, I can appreciate the paper in part because I agree with a lot of what’s being said here. For instance—

We believe the root cause of these problems lies in the control plane running on the network elements and the management plane that monitors and configures them. In this paper, we argue for revisiting the division of functionality and advocate an extreme design point that completely separates a network’s decision logic from the protocols that govern interaction of network elements.

In other words, we’ve not done our modularization homework very well—and our lack of focus on doing modularization right is adding a lot of unnecessary complexity to our world. The four planes proposed in the paper are decision, dissemination, discovery, and data. The decision plane drives network control, including reachability, load balancing, and access control. The dissemination plane “provides a robust and efficient communication substrate” across which the other planes can send information. The discovery plane “is responsible for discovering the physical components of the network,” giving each item an identifier, etc. The data plane carries packets edge-to-edge.
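
To make the decomposition a little more concrete, here is a minimal Python sketch of the four planes as separate modules with narrow interfaces between them. The class and method names are my own invention for illustration; the paper does not define any such API.

```python
class DiscoveryPlane:
    """Finds what physically exists and assigns identifiers; no policy here."""
    def inventory(self):
        # In a real system this would come from LLDP, IS-IS, serial numbers, etc.
        return {"devices": {"leaf1", "leaf2"}, "links": {("leaf1", "leaf2")}}


class DisseminationPlane:
    """Robust substrate for moving state between the decision plane and devices."""
    def push(self, device, state):
        # Stand-in for a southbound protocol or API call.
        print(f"install on {device}: {state}")


class DecisionPlane:
    """All network-wide logic: reachability, load balancing, access control."""
    def __init__(self, discovery, dissemination):
        self.discovery = discovery
        self.dissemination = dissemination

    def run(self):
        topology = self.discovery.inventory()
        for device in sorted(topology["devices"]):
            # Placeholder for real path computation and policy evaluation; the
            # data plane is simply whatever forwarding state ends up installed.
            self.dissemination.push(device, {"default-route": "drop"})


DecisionPlane(DiscoveryPlane(), DisseminationPlane()).run()
```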

I do have some quibbles with this architecture, of course. To begin, I’m not certain the word “plane” is the right one here. Maybe “layers,” or something else that implies more of a modular concept with interactions, and less of a “ships in the night” sort of affair. My more substantial disagreement is with where “interface configuration” and reachability are placed in the model.

Consider this: reachability and access control are, in a sense, two sides of the same coin. You learn where something is to make it reachable, and then you block access to it by hiding reachability from specific places in the network. There are two ways to control reachability—by hiding the destination, or by blocking traffic being sent to the destination. Each of these has positive and negative aspects.

But notice this paradox—the control plane cannot hide reachability towards something it does not know about. You must know about something to prevent someone from reaching it. While reachability and access control are two sides of the same coin, they are also opposites. Access control relies on reachability to do its job.

To solve this paradox, I would put reachability into discovery rather than decision. Discovery would then become the discovery of physical devices, paths, and reachability through the network. No policy would live here—discovery would just determine what exists. All the policy about what to expose about what exists would live within the decision plane.
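
A rough way to picture the split, in a short Python sketch of my own (the function names and prefixes are invented for illustration, not drawn from the paper): discovery reports everything it finds, while the decision plane applies policy to decide which of those destinations to actually expose.

```python
def discovered_reachability(topology):
    """Discovery plane: report every destination that exists, with no policy applied."""
    # 'topology' is assumed to map each device to the prefixes behind it.
    reachable = set()
    for device, prefixes in topology.items():
        reachable.update(prefixes)
    return reachable


def exposed_reachability(discovered, hide_policy):
    """Decision plane: start from full knowledge, then hide what policy says to hide."""
    # The paradox in action: we can only hide prefixes we already know about.
    return {prefix for prefix in discovered if prefix not in hide_policy}


topology = {"leaf1": {"10.1.1.0/24"}, "leaf2": {"10.1.2.0/24", "10.99.0.0/24"}}
hide_policy = {"10.99.0.0/24"}  # say, a management prefix that should stay hidden

print(exposed_reachability(discovered_reachability(topology), hide_policy))
# prints the two user prefixes; 10.99.0.0/24 is known to discovery but never exposed
```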

While the paper implies this kind of system must wait for some day in the future to build a network using these principles, I think you can get pretty close today. My “ideal design” for a data center fabric right now is (modified) IS-IS or RIFT in the underlay and eVPN in the overlay, with a set of controllers sitting on top. Why?

IS-IS doesn’t rely on IP, so it can serve in a mostly pure discovery role, telling any upper layers where things are and what is changing. eVPN can provide segmented reachability on top of this, as well as any other policies. Controllers can be used to manage eVPN configuration to implement intent. A separate controller can work with IS-IS on inventory and lifecycle of installed hardware. This creates a clean “break” between the infrastructure underlay and the overlays, and pretty much eliminates any dependence the underlay has on the overlay. Replacing the overlay with a more SDN’ish solution (rather than BGP) is perfectly doable and reasonable.
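
As a toy illustration of that clean break (all the names and structures here are mine, not any real controller API), the underlay only answers “what exists and is it up,” while the overlay controller turns intent into per-segment state without the underlay ever knowing about it:

```python
def underlay_view(isis_lsdb):
    """Discovery role: which leaves are up, according to the IS-IS LSDB."""
    # The underlay knows nothing about tenants, segments, or intent.
    return {node for node, adjacencies in isis_lsdb.items() if adjacencies}


def render_overlay(intent, live_leaves):
    """Overlay controller: map per-segment intent onto the leaves that are actually up."""
    return {segment: sorted(set(leaves) & live_leaves)
            for segment, leaves in intent.items()}


isis_lsdb = {"leaf1": ["spine1"], "leaf2": ["spine1", "spine2"], "leaf3": []}
intent = {"segment-200": ["leaf1", "leaf2", "leaf3"], "segment-300": ["leaf2"]}

print(render_overlay(intent, underlay_view(isis_lsdb)))
# {'segment-200': ['leaf1', 'leaf2'], 'segment-300': ['leaf2']}
```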

While not perfect, the link-state underlay/BGP overlay model comes pretty close to implementing what Jen and her co-authors are describing, only using protocols we have around today—although with some modification.

But the main point of this paper stands—a lot of the reason for the complexity we face today is simply that we modularize using aggregation and summarization and call the job “done.” We aren’t thinking about the network as a system, but rather as a bunch of individual appliances we slap into place, which we then connect through wires (or strings, like between tin cans).

Something to think about the next time you get that 2AM call.

More than 4.7 million sources in five countries — the US, China, South Korea, Russia, and India — were used to level distributed denial-of-service (DDoS) attacks against victims in the second quarter of 2020, with the portmap protocol most frequently used as an amplification vector to create massive data floods, security and services firm A10 Networks says in its threat report for the second quarter.

Thousands of people graduate from colleges and universities each year with cybersecurity or computer science degrees only to find employers are less than thrilled about their hands-on, foundational skills. Here’s a look at a recent survey that identified some of the bigger skills gaps, and some thoughts about how those seeking a career in these fields can better stand out from the crowd.

Three standards for email security that are supposed to verify the source of a message have critical implementation differences that could allow attackers to send emails from one domain and have them verified as sent from a different — more legitimate-seeming — domain, says a research team who will present their findings at the virtual Black Hat conference next month.

We’ve tried to use a GUI to simplify the YAML file, but DevOps/operators wanted more options. We’ve also tried all-in-one YAML; developers really hated those “additional” fields. Finally, I realized the problem is that the Kubernetes API is not team-centric. In many organizations, developers and DevOps/operators are two different roles, yet when using Kubernetes they have to work on the same YAML file. And that means trouble.

Service meshes have attracted an enormous amount of hype around them. With at least a few talks about service meshes during each tech conference, one can easily be convinced that having a service mesh in their infrastructure is a must. However, hype isn’t a good indicator of whether the new shiny tech is the right solution for your problems.

Twitter on Wednesday disclosed that the attackers who took over accounts belonging to several high-profile individuals last week managed to access the direct message inbox of at least 36 individuals. The update further highlights the severity of the breach at Twitter and shows why organizations need to have measures protecting against — and mitigating fallout from — compromises of corporate social media accounts and accounts belonging to their top executives.

The ability of the human brain to process massive amounts of information while consuming minimal energy has long fascinated scientists. When there is a need, the brain dials up computation, but then it rapidly reverts to a baseline state. Within the realm of silicon-based computing, such efficiencies have never been possible.

As organizations let billions of connected devices into their corporate networks, do they really know what those devices are made of and the risk they may pose? The answer is likely: not really.

With the recent launch of Chrome 83, and the upcoming release of Mozilla Firefox 79, web developers are gaining powerful new security mechanisms to protect their applications from common web vulnerabilities.

The Digital Services Act is essentially an update to the E-Commerce Directive of 2000, which provides the legal framework regulating digital services in the EU and sets out the liability regime for “information society service” providers, including Internet service providers, as well as those that act as “online intermediaries”, such as hosting and cloud providers.

There is much evidence suggesting an increase in cyberattacks during the COVID-19 pandemic — and the method of particular concern for folding, contracting, or merging brands is that of abandoned domain names.

Human beings—well, most of us anyway—are wired to help. If we see someone in trouble, we want to assist them. It is what has kept our rather soft and squishy species alive when there were lions and tigers and bears trying to eat us. Strength in numbers and all that. When we see a car broken down on the side of the road, or notice that little old lady trying to cross the street, there is that instinct to lend aid.

While many network engineers think about getting a certification, not many think about going after a degree. Is there value in getting a degree for the network engineer? If so, what is it? What kinds of things do you learn in a degree program for network engineering? Eric Osterweil, a professor at George Mason University, joins Jeremy Filliben and Russ White on this episode of the Hedge to consider degrees for network engineers.

download

Did you know that today video, gaming, and social media account for almost 80% of the world’s internet traffic? The change in the composition of traffic has been accompanied by a dramatic change in the way content is delivered to the internet user: ISPs and transit providers have diminished in number and importance as the power and profile of a few content distribution networks has increased and content is pushed closer to the edge. But isn’t that just competition at work? What are the long-term consequences of such a change? In the second episode of our three-part series on “Internet Consolidation”, we talk to Russ White, co-host of The History of Networking and The Hedge podcast and a distinguished infrastructure architect and internet transit and routing expert.

Should the network be dumb or smart? Network vendors have recently focused on making the network as smart as possible because there is a definite feeling that dumb networks are quickly becoming a commodity—and it’s hard to see where and how steep profit margins can be maintained in a commodifying market. Software vendors, on the other hand, have been encroaching on the network space by “building in” overlay network capabilities, especially in virtualization products. VMWare and Docker come immediately to mind; both are either able to, or working towards, running on a plain IP fabric, reducing the number of services provided by the network to a minimum level (of course, I’d have a lot more confidence in these overlay systems if they were a lot smarter about routing … but I’ll leave that alone for the moment).

How can this question be answered? One way is to think through what sorts of things need to be done in processing packets, and then think through where it makes most sense to do those things. Another way is to measure the accuracy or speed at which some of these “packet processing things” can be done so you can decide in a more empirical way. The paper I’m looking at today, by Anirudh et al., takes both of these paths in order to create a baseline “rule of thumb” about where to place packet processing functionality in a network.

Sivaraman, Anirudh, Thomas Mason, Aurojit Panda, Ravi Netravali, and Sai Anirudh Kondaveeti. “Network Architecture in the Age of Programmability.” ACM SIGCOMM Computer Communication Review 50, no. 1 (March 23, 2020): 38–44. https://doi.org/10.1145/3390251.3390257.

The authors consider six different “things” networks need to be able to do: measurement, resource management, deep packet inspection, network security, network virtualization, and application acceleration. The first of these they measure by introducing errors into a network and measuring the dropped packet rate using various edge and in-network measurement tools. What they found was that in-network measurement has a lower error rate, particularly as time scales become shorter. For instance, Pingmesh, a packet loss measurement tool that runs on hosts, is useful for measuring packet loss at the scale of minutes—but in-network telemetry can often measure packet loss at the scale of seconds or milliseconds. They observe that in-network telemetry of all kinds (not just packet loss) appears to be more accurate in just the situations where application performance matters most—so they argue telemetry largely belongs in the network.

Resource management, such as determining which path to take, or how quickly to transmit packets (setting the window size for TCP or QUIC, for instance), is traditionally performed entirely on hosts. The authors, however, note that effective resource management requires accurate telemetry information about flows, link utilization, etc.—and these things are best performed in-network rather than on hosts. For resource management, then, they prefer a hybrid edge/in-network approach.

They argue deep packet inspection and network virtualization are both best done at the edge, in hosts, because these are processor-intensive tasks—often requiring more processing power and time than network devices have available. Finally, they argue network security should be located on the host, because the host has the fine-grained service information required to perform accurate filtering, etc.

Based on their arguments, the authors propose four rules of thumb. First, tasks that leverage data only available at the edge should run at the edge. Second, tasks that leverage data naturally found in the network should be run in the network. Third, tasks that require large amounts of processing power or memory should be run on the edge. Fourth, tasks that run at very short timescales should be run in the network.
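
The four rules collapse naturally into a small decision procedure; here is a hedged Python sketch of how I read them, where the handling of conflicting rules is my own assumption rather than anything the paper specifies.

```python
def place_task(edge_only_data=False, in_network_data=False,
               heavy_compute=False, short_timescale=False):
    """Rough placement heuristic following the paper's four rules of thumb.

    How to resolve conflicts when the rules pull in different directions is my
    own assumption; the paper itself stops at the rules.
    """
    wants_edge = edge_only_data or heavy_compute          # rules 1 and 3
    wants_network = in_network_data or short_timescale    # rules 2 and 4
    if wants_edge and wants_network:
        return "hybrid"      # e.g., resource management, per the paper
    if wants_edge:
        return "edge"        # e.g., DPI, network virtualization
    if wants_network:
        return "in-network"  # e.g., fine-grained telemetry
    return "either"


print(place_task(in_network_data=True, short_timescale=True))  # in-network
print(place_task(heavy_compute=True))                          # edge
print(place_task(edge_only_data=True, in_network_data=True))   # hybrid
```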

I have, of course, some quibbles with their arguments… For instance, the argument that security should run on the edge, in hosts, assumes a somewhat binary view of security—that all filters and security mechanisms should be in “one place,” and nowhere else. A security posture that just moves “the firewall” from the edge of the network to the edge of the host, however, is going to (eventually) face the same vulnerabilities and issues, just spread out over a larger attack surface (every host instead of the entry to the network). Security shouldn’t work this way—the network and the host should work together to provide defense in depth.

The rules of thumb, however, seem to be pretty solid starting points for thinking about the problem. An alternate way of phrasing their result is through the principle of subsidiarity—decisions should be made as close as possible to the information required to make them. While this is really a concept that comes out of ethics and organizational management, it succinctly describes a good rule of thumb for network architecture.

Blame it on the webinar yesterday… I’m late again. 🙂

The remote work environment is different from the office. Different environments call for different communication systems. A communication system that is adapted to the remote work reality can unlock amazing benefits: (even) better productivity, a competitive advantage in the long run, and a much better work-life balance on top of it all.

Krill already lets you manage and publish ROAs seamlessly across multiple Regional Internet Registries (RIRs). Now Krill will also tell you the effect of all the ROAs you’ve created, indicating which announcements seen in BGP are authorised and which are not, along with the reason. This ensures your ROAs accurately reflect your intended routing at all times.

Though CMMI is not an exact science, it is a way to present a quantifiable level of risk within the different elements of the ISP. CMMI can be a tool to provide the justification for necessary investment in information security.

However, there have been multiple reported attempts to exploit the transfer market as part of malicious operations — such as the hijacking of dormant address space and spamming — which, judging by exchanges between operators in mailing lists and messaging boards, is of concern for network operators.

The internet has been a success largely because users across the globe can connect to each other without barriers. China wants to move away from this model, and its top tech firms such as Huawei, Tencent, and China Telecom are working hard to ensure that industry standards for next-generation networks translate into positive outcomes for China.

The Verizon DBIR team documents a precipitous decline of malware-related threats, from 50% in 2016 to just 6%, stating that “we think that other attack types such as hacking and social breaches benefit from the theft of credentials, which makes it no longer necessary to add malware in order to maintain persistence. So while we definitely cannot assert that malware has gone the way of the eight-track tape, it is a tool that sits idle in the attacker’s toolbox in simpler attack scenarios.”

They are nondescript, relatively small network devices that quietly run in an industrial plant. But these low-profile ICS protocol gateways, which translate and carry traffic between older industrial networks and IP-based ones, have been found to contain some serious security flaws an attacker could use to seize control of plant processes.

One of the key elements of Google’s software engineering culture is defining software designs through design docs. These are relatively informal documents that the primary author or authors of a software system or application create before they embark on the coding project. The design doc documents the high-level implementation strategy and key design decisions, with emphasis on the trade-offs that were considered during those decisions.

Unstructured data is the nucleus of an organization, irrespective of its size. The vast majority of an organization’s data is unstructured, thanks to heavy investment in IoT and the embrace of artificial intelligence. This kind of data holds huge potential for business leaders to leverage and gain an edge by initiating real-time campaigns through the use of content connectors.

The International Trade Commission, or ITC, is a forum that’s meant to protect U.S. industries from unfair trade practices. But in recent years, the prime beneficiaries of the sprawling ITC haven’t been American manufacturers, but patent owners, patent lawyers, and—you guessed it—patent trolls.

It sometimes feels like nanotechnology is a solution looking for a problem. Though there has been much speculation about the potential applications of nanotechnology, many benefits of the technology remain quite underdeveloped. One exception is in the area of cybersecurity.

IoT devices are relatively easy targets for malicious attackers who can use them to access networks and the confidential data they may hold, or recruit them into a network of compromised devices, called a botnet, to launch distributed denial of service (DDoS) attacks with the purpose of bringing down websites and servers.

I’m starting to feel like a broken record here, but man: Microsoft really is shaping up to be one of the most interesting forces in the Android ecosystem this year.

Hybrid creates two fundamentally different employee experiences to manage. Despite recent successes with remote work, employers are reopening offices to some of their employees to encourage social bonding, reinforce culture, and increase business collaboration.