Let G be a directed graph containing n vertices, one of which is a distinguished source s, and m edges, each with a non-negative cost. We consider the problem of finding, for each possible sink vertex v, a pair of edge-disjoint paths from s to v of minimum total edge cost. Suurballe has given an O(n^2 log n)-time algorithm for this problem. We give an implementation of Suurballe's algorithm that runs in O(m log_{1+m/n} n) time and O(m) space. Our algorithm builds an implicit representation of the n pairs of paths; given this representation, the time necessary to explicitly construct the pair of paths for any given sink is O(1) per edge on the paths.
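For context, the classic single-sink form of Suurballe's construction proceeds by one Dijkstra pass, a reduced-cost reweighting, reversal of the shortest path's edges, and a second Dijkstra pass. The sketch below computes the minimum total cost of an edge-disjoint pair for a single sink t in that classic style; it is illustrative only and does not implement the paper's O(m log_{1+m/n} n) all-sinks algorithm or its implicit path representation (the graph format and function names are assumptions).

```python
import heapq

def dijkstra(n, adj, s):
    """Shortest distances and parent edges from s; adj[u] = list of (v, w)."""
    dist = [float("inf")] * n
    parent = [None] * n              # parent[v] = (u, w) on a shortest-path tree
    dist[s] = 0
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                parent[v] = (u, w)
                heapq.heappush(pq, (dist[v], v))
    return dist, parent

def min_disjoint_pair_cost(n, edges, s, t):
    """Minimum total cost of two edge-disjoint s->t paths (classic Suurballe idea).
    edges: list of (u, v, cost) with non-negative costs. Returns inf if no pair exists."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
    dist, parent = dijkstra(n, adj, s)
    if dist[t] == float("inf"):
        return float("inf")
    # Edges of the first shortest path P1, collected as (u, v) pairs.
    p1 = set()
    v = t
    while v != s:
        u, _ = parent[v]
        p1.add((u, v))
        v = u
    # Residual graph under reduced costs w' = w + dist[u] - dist[v] (all >= 0):
    # drop the forward edges of P1 and add their reversals with cost 0.
    radj = [[] for _ in range(n)]
    for u, v, w in edges:
        if (u, v) in p1:
            radj[v].append((u, 0.0))
        elif dist[u] < float("inf"):
            radj[u].append((v, w + dist[u] - dist[v]))
    rdist, _ = dijkstra(n, radj, s)
    if rdist[t] == float("inf"):
        return float("inf")
    # Undo the cost reduction: total original cost = 2*dist[t] + reduced length of P2.
    return 2 * dist[t] + rdist[t]
```

For example, on a four-node graph with two edge-disjoint s-to-t routes of cost 3 each, the function returns 6.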
This paper proposes an approach to the design of large-scale general-purpose data center networks based on the notions of volume and area universality introduced by Leiserson in the 1980s in the context of VLSI design. In particular, we suggest that the principal goal of the network designer should be to build a single network that is provably competitive, for any application, with any network that can be built for the same amount of money. After describing our approach, we survey the technology choices available to network designers today, and examine several existing commercial data center networks. In the most recent of these networks, resources are allocated roughly as we suggest in this paper.
Large flows like videos consume significant bandwidth. Some ISPs actively manage these high-volume flows with techniques like policing, which enforces a flow rate by dropping excess traffic. While the existence of policing is well known, our contribution is an Internet-wide study quantifying its prevalence and impact on video quality metrics. We developed a heuristic to identify policing from server-side traces and built a pipeline to deploy it at scale on traces from a large online content provider, collected from hundreds of servers worldwide. Using a dataset of 270 billion packets served to 28,400 client ASes, we find that, depending on region, up to 7% of lossy transfers are policed. Loss rates are on average six times higher when a trace is policed, and this loss degrades video playback quality. We show that alternatives to policing, like pacing and shaping, can achieve traffic management goals while avoiding the deleterious effects of policing.
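To make the mechanism concrete, a policer is commonly modeled as a token bucket: tokens accrue at the configured rate up to a burst allowance, and a packet arriving when too few tokens remain is dropped rather than queued (a shaper, by contrast, would delay it). The sketch below is a generic token-bucket policer, not the paper's detection heuristic or any particular ISP's implementation; the rate and burst values are illustrative.

```python
class TokenBucketPolicer:
    """Generic token-bucket policer: drops packets that exceed rate/burst.
    (A shaper would queue and delay the excess instead of dropping it.)"""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0      # token fill rate in bytes/second
        self.burst = burst_bytes        # bucket depth in bytes
        self.tokens = burst_bytes       # bucket starts full
        self.last = 0.0                 # time of the previous packet

    def allow(self, now, pkt_bytes):
        """Return True if the packet conforms, False if it would be policed (dropped)."""
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if pkt_bytes <= self.tokens:
            self.tokens -= pkt_bytes
            return True
        return False

# Example: a 1.5 Mbit/s policer with an 8 KB burst; back-to-back 1500-byte packets
# sent at 10 Mbit/s quickly drain the bucket, and most later packets are dropped.
policer = TokenBucketPolicer(rate_bps=1_500_000, burst_bytes=8_000)
t, dropped = 0.0, 0
for _ in range(100):
    if not policer.allow(t, 1500):
        dropped += 1
    t += 1500 * 8 / 10_000_000          # inter-packet gap at a 10 Mbit/s send rate
print(f"{dropped} of 100 packets policed")
```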
By all accounts, today's Internet is not moving data as well as it should. Most of the world's cellular users experience delays of seconds to minutes; public Wi-Fi in airports and conference venues is often worse. Physics and climate researchers need to exchange petabytes of data with global collaborators but find that their carefully engineered multi-Gbps infrastructure often delivers only a few Mbps over intercontinental distances.
There exists a widely recognized need to better understand and manage complex “systems of systems,” ranging from biology, ecology, and medicine to network-centric technologies. This is motivating the search for universal laws of highly evolved systems and driving demand for new mathematics and methods that are consistent, integrative, and predictive. However, the theoretical frameworks available today are not merely fragmented but sometimes contradictory and incompatible. We argue that complexity arises in highly evolved biological and technological systems primarily to provide mechanisms to create robustness. However, this complexity itself can be a source of new fragility, leading to “robust yet fragile” tradeoffs in system design. We focus on the role of robustness and architecture in networked infrastructures, and we highlight recent advances in the theory of distributed control driven by network technologies. This view of complexity in highly organized technological and biological systems is fundamentally different from the dominant perspective in the mainstream sciences, which downplays function, constraints, and tradeoffs, and tends to minimize the role of organization and design.
Government access to the plaintext of encrypted communications and stored data presents difficult, important, and controversial issues that reveal conflicting values within the government and society at large. The debate over efforts to ensure such access is highly polarized. Critics of government access, even as they acknowledge the importance of effective law enforcement, cite legal and practical objections, including risks to security, privacy and civil liberties, and U.S. commercial interests. Government officials acknowledge the value of encryption to protect privacy and confidential information but also express the need to be able to access information relevant to investigations when properly authorized.
This paper presents a design principle that helps guide placement of functions among the modules of a distributed computer system. The principle, called the end-to-end argument, suggests that functions placed at low levels of a system may be redundant or of little value when compared with the cost of providing them at that low level. Examples discussed in the paper include bit error recovery, security using encryption, duplicate message suppression, recovery from system crashes, and delivery acknowledgement. Low level mechanisms to support these functions are justified only as performance enhancements.
Maintaining the highest levels of availability for content providers is challenging in the face of scale, network evolution, and complexity. Little, however, is known about the network failures large content providers are susceptible to, and what mechanisms they employ to ensure high availability. From a detailed analysis of over 100 high-impact failure events within Google’s network, encompassing many data centers and two WANs, we quantify several dimensions of availability failures. We find that failures are evenly distributed across different network types and across data, control, and management planes, but that a large number of failures happen when a network management operation is in progress within the network. We discuss some of these failures in detail, and also describe our design principles for high availability motivated by these failures. These include using defense in depth, maintaining consistency across planes, failing open on large failures, carefully preventing and avoiding failures, and assessing root cause quickly. Our findings suggest that, as networks become more complicated, failures lurk everywhere, and, counter-intuitively, continuous incremental evolution of the network can, when applied together with our design principles, result in a more robust network.
Configuration problems are not only prevalent, but also severely impair the reliability of today's system software. One fundamental reason is the ever-increasing complexity of configuration, reflected by the large number of configuration parameters (knobs). With hundreds of knobs, configuring system software to ensure high reliability and performance becomes a daunting, error-prone task. This paper makes a first step in understanding a fundamental question of configuration design: do users really need so many knobs?
This book describes the design and engineering tradeoffs of datacenter networks. It describes interconnection networks from topology and network architecture to routing algorithms, and presents opportunities for taking advantage of the emerging technology trends that are influencing router microarchitecture. With the emergence of “many-core” processor chips, it is evident
that we will also need “many-port” routing chips to provide a bandwidth-rich network to avoid the performance-limiting effects of Amdahl’s Law. We provide an overview of conventional topologies and their routing algorithms and show how technology, signaling rates and cost-effective optics are motivating new network topologies that scale up to millions of hosts. The book also provides detailed case studies of two high performance parallel computer systems and their networks.
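The Amdahl's Law point can be made concrete with a small calculation: if even a few percent of a parallel workload is serialized, for instance by waiting on an under-provisioned interconnect, overall speedup saturates no matter how many cores are added. A minimal illustration (the fraction and core counts are arbitrary, chosen only to show the saturation):

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Amdahl's Law: speedup = 1 / ((1 - p) + p / N)."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_cores)

# With 5% of the work serialized (e.g., waiting on the network), 64 cores
# yield only ~15x, and even 1024 cores cannot exceed 20x.
for cores in (16, 64, 256, 1024):
    print(cores, round(amdahl_speedup(0.95, cores), 1))
```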
Studying the design and implementation of a number of computer systems has led to some general hints for system design. They are described here and illustrated by many examples, ranging from hardware such as the Alto and the Dorado to application programs such as Bravo and Star.
Embedded systems are everywhere, from consumer electronics to critical infrastructure, and with the rise of the Internet of Things (IoT), such systems are set to proliferate throughout all aspects of everyday life. Due to their ubiquitous and often critical nature, such systems have myriad security and privacy concerns, but proper attention to these aspects in the embedded world is often sorely lacking. In this article I will discuss how embedded binary security in particular tends to lag behind what is commonly expected of modern general purpose systems, why bridging this gap is non-trivial, and offer some suggestions for promising defensive research directions.
The IPv6 world is a complicated one; there is no doubt about that. The numerous new features and possibilities added to the protocol, from its very first specification, RFC 2460 [1], to its later enhancements, certainly increase flexibility and capabilities, but they may also introduce operational issues, at least in “contradicting scenarios,” which may even lead to security implications.
We present our approach for overcoming the cost, operational complexity, and limited scale endemic to datacenter networks a decade ago. Three themes unify the five generations of datacenter networks detailed in this paper. First, multi-stage Clos topologies built from commodity switch silicon can support cost-effective deployment of building-scale networks. Second, much of the general, but complex, decentralized network routing and management protocols supporting arbitrary deployment scenarios were overkill for single-operator, pre-planned datacenter networks. We built a centralized control mechanism based on a global configuration pushed to all datacenter switches. Third, modular hardware design coupled with simple, robust software allowed our design to also support inter-cluster and wide-area networks. Our datacenter networks run at dozens of sites across the planet, scaling in capacity by 100x over ten years to more than 1 Pbps of bisection bandwidth.
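To see why commodity switch silicon suffices at building scale, the standard three-stage folded-Clos (fat-tree) arithmetic is instructive: k-port switches support k^3/4 hosts at full bisection using 5k^2/4 switch chips. The numbers below are a generic fat-tree illustration under that textbook formula, not the dimensions of any particular Jupiter generation:

```python
def fat_tree(k_ports, link_gbps):
    """Scale of a classic three-stage fat-tree built from k-port switches."""
    hosts = k_ports ** 3 // 4                # k^3/4 hosts at full bisection
    switches = 5 * k_ports ** 2 // 4         # edge + aggregation + core chips
    bisection_tbps = hosts // 2 * link_gbps / 1000.0
    return hosts, switches, bisection_tbps

# 64-port switches with 40 Gb/s links: 65,536 hosts, 5,120 switch chips,
# and roughly 1.3 Pb/s of bisection bandwidth.
print(fat_tree(64, 40))
```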
The research community has considered in the past the application of Artificial Intelligence (AI) techniques to control and operate networks. A notable example is the Knowledge Plane proposed by D. Clark et al. However, such techniques have not been extensively prototyped or deployed in the field yet. In this paper, we explore the reasons for the lack of adoption and posit that the rise of two recent paradigms: Software-Defined Networking (SDN) and Network Analytics (NA), will facilitate the adoption of AI techniques in the context of network operation and control. We describe a new paradigm that accommodates and exploits SDN, NA and AI, and provide use-cases that illustrate its applicability and benefits. We also present simple experimental results that support, for some relevant use-cases, its feasibility. We refer to this new paradigm as Knowledge-Defined Networking (KDN).
Loop-Free Alternates (LFAs) are a local fast-reroute mechanism defined for IP networks. They are simple but suffer from two drawbacks. Firstly, some flows cannot be protected due to missing LFAs, i.e., this concept does not provide full protection coverage, which depends on network topology. Secondly, some LFAs cause loops in case of node or multiple failures. Avoiding those LFAs decreases the protection coverage even further. In this work, we propose to apply LFAs to OpenFlow-based networks. We suggest a method for loop detection so that loops can be avoided without decreasing protection coverage. We propose an implementation with OpenFlow that requires only a single additional flow rule per switch.
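For reference, the basic loop-free condition of RFC 5286 says a neighbor N of source S is an LFA for destination D when dist(N, D) < dist(N, S) + dist(S, D). The sketch below merely evaluates that inequality over an all-pairs distance table; the data layout is illustrative and this is not the paper's OpenFlow loop-detection method.

```python
def loop_free_alternates(dist, s, d, neighbors, primary):
    """Return the neighbors of s that satisfy the basic LFA inequality
    for destination d (RFC 5286): dist[n][d] < dist[n][s] + dist[s][d].
    `dist` is an all-pairs shortest-distance table, `primary` the current next hop."""
    return [n for n in neighbors
            if n != primary and dist[n][d] < dist[n][s] + dist[s][d]]

# Toy example: a unit-cost square topology  s --- a
#                                           |     |
#                                           b --- d
dist = {
    's': {'s': 0, 'a': 1, 'b': 1, 'd': 2},
    'a': {'s': 1, 'a': 0, 'b': 2, 'd': 1},
    'b': {'s': 1, 'a': 2, 'b': 0, 'd': 1},
    'd': {'s': 2, 'a': 1, 'b': 1, 'd': 0},
}
print(loop_free_alternates(dist, 's', 'd', neighbors=['a', 'b'], primary='a'))  # ['b']
```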
The DUAL algorithm, on which EIGRP and Babel are based.
A family of distributed algorithms for the dynamic computation of the shortest paths in a computer network or internet is presented, validated, and analyzed. According to these algorithms, each node maintains a vector with its distance to every other node. Update messages from a node are sent only to its neighbors; each such message contains a distance vector of one or more entries, and each entry specifies the length of the selected path to a network destination, as well as an indication of whether the entry constitutes an update, a query, or a reply to a previous query.
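A minimal sketch of the per-node state and update handling just described is given below; it implements only the plain distance-vector step (pick the neighbor minimizing link cost plus reported distance) and omits the query/reply diffusing computations that make the full algorithms loop-free. The class and message format are illustrative, not the paper's specification.

```python
INF = float("inf")

class Node:
    """One router: keeps a distance vector and the vectors reported by each neighbor."""

    def __init__(self, name, link_cost):
        self.name = name
        self.link_cost = dict(link_cost)             # neighbor -> cost of attached link
        self.reported = {n: {} for n in link_cost}   # neighbor -> its last advertised vector
        self.dist = {name: 0}                        # our own distance vector
        self.next_hop = {}

    def handle_update(self, neighbor, vector):
        """Process a distance-vector update from a neighbor; return our new vector
        if it changed (to be advertised to all neighbors), else None."""
        self.reported[neighbor].update(vector)
        changed = False
        for dest in sorted(set(self.dist) | set(vector)):
            if dest == self.name:
                continue
            # Choose the neighbor minimizing (link cost + neighbor's reported distance).
            best_hop, best = None, INF
            for n, cost in self.link_cost.items():
                d = cost + self.reported[n].get(dest, INF)
                if d < best:
                    best_hop, best = n, d
            if best != self.dist.get(dest, INF):
                self.dist[dest] = best
                self.next_hop[dest] = best_hop
                changed = True
        return dict(self.dist) if changed else None

# Example: node A with links to B (cost 1) and C (cost 4); B advertises a path to D.
a = Node("A", {"B": 1, "C": 4})
print(a.handle_update("B", {"B": 0, "D": 2}))   # {'A': 0, 'B': 1, 'D': 3}
```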
All software construction involves essential tasks, the fashioning of the complex conceptual structures that compose the abstract software entity, and accidental tasks, the representation of these abstract entities in programming languages and the mapping of these onto machine languages within space and speed constraints. Most of the big past gains in software productivity have come from removing artificial barriers that have made the accidental tasks inordinately hard, such as severe hardware constraints, awkward programming languages, lack of machine time. How much of what software engineers now do is still devoted to the accidental, as opposed to the essential? Unless it is more than 9/10 of all effort, shrinking all the accidental activities to zero time will not give an order of magnitude improvement.
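Brooks's argument here is essentially Amdahl's Law applied to effort: if the accidental share of the work is f, eliminating it entirely yields a productivity gain of at most 1/(1 - f), so a tenfold improvement requires f > 9/10. A quick check of that arithmetic (the fractions below are illustrative):

```python
def max_speedup(accidental_fraction):
    """Best possible productivity gain if all accidental work is removed."""
    return 1.0 / (1.0 - accidental_fraction)

# Even if 80% of today's effort were accidental, removing all of it gives only 5x;
# an order-of-magnitude gain needs the accidental share to exceed 9/10.
for f in (0.5, 0.8, 0.9, 0.95):
    print(f, round(max_speedup(f), 1))
```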
The Internet’s routing system is facing stresses due to its poor fundamental scaling properties. Compact routing is a research field that studies fundamental limits of routing scalability and designs algorithms that try to meet these limits. In particular, compact routing research shows that shortest-path routing, forming a core of traditional routing algorithms, cannot guarantee routing table (RT) sizes that on all network topologies grow slower than linearly as functions of the network size. However, there are plenty of compact routing schemes that relax the shortest-path requirement and allow for improved, sublinear RT size scaling that is mathematically provable for all static network topologies. In particular, there exist compact routing schemes designed for grids, trees, and Internet-like topologies that offer RT sizes that scale logarithmically with the network size.
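To put these scaling regimes in perspective for a network of a million nodes: a linear scheme needs on the order of a million routing entries per node, a sublinear square-root scheme (the table size achieved by known stretch-3 schemes such as Thorup-Zwick) needs on the order of a thousand, and a logarithmic scheme needs only a few dozen. A quick numeric illustration, assuming nothing beyond these asymptotic forms:

```python
import math

# Routing-table entries per node under different scaling regimes (orders of magnitude).
for n in (10**4, 10**6, 10**9):
    print(f"n={n:>13,}  linear={n:>13,}  sqrt={math.isqrt(n):>6,}  log2={int(math.log2(n)):>3}")
```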
The European General Data Protection Regulation (GDPR) gives primacy to purpose: Data may be collected and stored only when (i) end-users have consented, often explicitly, to the purposes for which that data is collected, and (ii) the collected data is actually necessary for achieving these purposes. This development in data protection regulations begets the question: how do we audit a computer system’s adherence to a purpose? We propose an approach that identifies a purpose with a business process, and show how formal models of interprocess communication can be used to audit or even derive privacy policies. Based on this insight, we propose a methodology for auditing GDPR compliance. Moreover, we show how given a simple interprocess dataflow model, aspects of GDPR compliance can be determined algorithmically.
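As a toy version of the algorithmic check alluded to above, one can model the system as a dataflow graph in which each process is annotated with a declared purpose, and then verify that every process reachable by a given data item serves a purpose the data subject consented to. The processes, purposes, and policy below are hypothetical and meant only to illustrate the flavor of such an audit; they are not the paper's formalism.

```python
from collections import deque

# Hypothetical dataflow model: process -> (declared purpose, downstream processes).
PROCESSES = {
    "checkout":  ("order-fulfilment", ["billing", "analytics"]),
    "billing":   ("order-fulfilment", []),
    "analytics": ("marketing",        []),
}

def purposes_reached(entry_process):
    """Collect the purposes of every process a data item can flow to from its entry point."""
    seen, queue, purposes = set(), deque([entry_process]), set()
    while queue:
        p = queue.popleft()
        if p in seen:
            continue
        seen.add(p)
        purpose, downstream = PROCESSES[p]
        purposes.add(purpose)
        queue.extend(downstream)
    return purposes

def compliant(entry_process, consented_purposes):
    """Data may only reach processes whose purposes the user consented to."""
    return purposes_reached(entry_process) <= set(consented_purposes)

# A user who consented only to order fulfilment: the flow into "analytics"
# (purpose: marketing) makes this dataflow non-compliant.
print(compliant("checkout", {"order-fulfilment"}))   # False
```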
This paper discusses modularization as a mechanism for improving the flexibility and comprehensibility of a system while allowing the shortening of its development time. The effectiveness of a “modularization” is dependent upon the criteria used in dividing the system into modules. Two system design problems are presented, and for each, both a conventional and unconventional decomposition are described. It is shown that the unconventional decompositions have distinct advantages for the goals outlined. The criteria used in arriving at the decompositions are discussed. The unconventional decomposition, if implemented with the conventional assumption that a module consists of one or more subroutines, will be less efficient in most cases. An alternative approach to implementation which does not have this effect is sketched.
Over the last decade, the field of so-called Agile software development has grown to be a major force in the socio-economic arena of delivering quality software on time, on budget, and on spec. The acceleration in changing needs brought on by the rise in popularity of the Internet has helped push Agile practices far beyond their original boundaries, and possibly into domains where their application is not the optimal solution to the problems at hand. The question of where Agile software development practices and techniques make sense, and where they are out of place, is a valid one. It can be addressed by looking at software development as a complex endeavour, and using tools and techniques from the Cynefin method and other models of social complexity.
The bandwidth and latency requirements of next-generation datacenter networks stress the limits of CMOS manufacturing. A key trend in their design will be a move from single-channel links and switches to multi-channel links and switches. Today's network topologies erase this distinction, providing the illusion of a unified network fabric. In this work we propose P-FatTree, which is a FatTree topology designed specifically for the future multi-channel reality. P-FatTree requires fewer switch chips and as a result has lower cost, power consumption, and latency than existing approaches. Furthermore, by embracing the parallel nature of the network itself, it enables compelling new ways to better manage and deliver application traffic.
A fierce debate has been ongoing for many years over strong computer encryption of communications and data, which can both deliver security and privacy for individuals but also make it difficult for the intelligence and law enforcement communities to perform their surveillance and investigative duties. In particular, the question of whether encryption systems should be required to have a “backdoor” to give the government special access to encrypted information remains divisive.
Quantum computers are designed to outperform standard computers by running quantum algorithms. Areas in which quantum algorithms can be applied include cryptography, search and optimisation, simulation of quantum systems, and solving large systems of linear equations. Here we briefly survey some known quantum algorithms, with an emphasis on a broad overview of their applications rather than their technical details. We include a discussion of recent developments and near-term applications of quantum algorithms.
This paper argues that a common design paradigm for systems is fundamentally flawed, resulting in unstable, unpredictable behavior as the complexity of the system grows. In this flawed paradigm, designers carefully attempt to predict the operating environment and failure modes of the system in order to design its basic operational mechanisms. However, as a system grows in complexity, the diffuse coupling between the components in the system inevitably leads to the butterfly effect, in which small perturbations can result in large changes in behavior. We explore this in the context of distributed data structures, a scalable, cluster-based storage server. We then consider a number of design techniques that help a system to be robust in the face of the unexpected, including overprovisioning, admission control, introspection, and adaptivity through closed control loops. Ultimately, however, all complex systems eventually must contend with the unpredictable. Because of this, we believe systems should be designed to cope with failure gracefully.
A keynote talk discussing the relationship between agile development, the increasing availability of general-purpose processors, complexity, and security.
Speaking roughly, it may be said that the seventeenth, eighteenth, and nineteenth centuries formed the period in which physical science learned variables, which brought us the telephone and the radio, the automobile and the airplane, the phonograph and the moving pictures, the turbine and the Diesel engine, and the modern hydroelectric power plant.
As Distributed Denial-of-Service (DDoS) attacks have extended to the application layer in recent years, researchers have paid much attention to these new variants, whose low-volume and intermittent pattern gives them a higher level of stealthiness and invalidates state-of-the-art DDoS detection/defense mechanisms. We describe a new type of low-volume application-layer DDoS attack: Tail Attacks on Web Applications. Such an attack exploits a newly identified system vulnerability of n-tier web applications (millibottlenecks with sub-second duration and resource contention with strong dependencies among distributed nodes) with the goal of causing the long-tail latency problem of the target web application (e.g., 95th percentile response time > 1 second) and damaging the long-term business of the service provider, while all the system resources are far from saturation, making it difficult to trace the cause of performance degradation.
The information economy is a blue-whale economy with its energy uses mostly out of sight. Based on a mid-range estimate, the world’s Information-Communications-Technologies (ICT) ecosystem uses about 1,500 TWh of electricity annually, equal to all the electric generation of Japan and Germany combined, and as much electricity as was used for global illumination in 1985. The ICT ecosystem now approaches 10% of world electricity generation. Or, in other energy terms, the zettabyte era already uses about 50% more energy than global aviation.
The last section of our inquiry into the failure dealt with the criteria of quality of software. “In the recent struggle to deliver any software at all, the first casualty has been consideration of the quality of the software delivered. The quality of software is measured by a number of totally incompatible criteria, which must be carefully balanced in the design and implementation of every program.” We then made a list of no less than seventeen criteria which has been published in a guest editorial in Volume 2 of the journal, Software Practice and Experience.
"The network is reliable" tops Peter Deutsch's classic list, "Eight fallacies of distributed computing" (https://blogs.oracle.com/jag/resource/Fallacies.html), all [of which] prove to be false in the long run and all [of which] cause big trouble and painful learning experiences. Accounting for and understanding the implications of network behavior is key to designing robust distributed programs; in fact, six of Deutsch's fallacies directly pertain to limitations on networked communications.
We present our experience with QUIC, an encrypted, multiplexed, and low-latency transport protocol designed from the ground up to improve transport performance for HTTPS traffic and to enable rapid deployment and continued evolution of transport mechanisms. QUIC has been globally deployed at Google on thousands of servers and is used to serve traffic to a range of clients including a widely-used web browser (Chrome) and a popular mobile video streaming app (YouTube). We estimate that 7% of Internet traffic is now QUIC. We describe our motivations for developing a new transport, the principles that guided our design, the Internet-scale process that we used to perform iterative experiments on QUIC, performance improvements seen by our various services, and our experience deploying QUIC globally. We also share lessons about transport design and the Internet ecosystem that we learned from our deployment.
In this article I will first argue that a Service-Infrastructure Cycle is fundamental to networking evolution. Networks are built to accommodate certain services at an expected scale. New applications and/or a significant increase in scale require a rethinking of network mechanisms, which results in new deployments. Four decades' worth of iterations of this process have yielded the Internet as we know it today, a common and shared global networking infrastructure that delivers almost all services. I will further argue, using brief historical case studies, that the success of network mechanism deployments often hinges on whether or not mechanism evolution follows the iterations of this Cycle. Many have observed that this network, the Internet, has become ossified and unable to change in response to new demands. In other words, after decades of operation, the Service-Infrastructure Cycle has become stuck. However, novel service requirements and scale increases continue to exert significant pressure on this ossified infrastructure. The result, I will conjecture, will be a fragmentation, the beginnings of which are evident today, that will ultimately fundamentally change the character of the network infrastructure. By ushering in a ManyNets world, this fragmentation will lubricate the Service-Infrastructure Cycle so that it can continue to govern the evolution of networking. I conclude this article with a brief discussion of the possible implications of this emerging ManyNets world on networking research.
This white paper provides an overview of AT&T’s vision for an Open Architecture for a Disaggregated Network Operating System (dNOS). Our goal is to start an industry discussion on technical feasibility, build interest in participating in the formulation of technical detail, and determine suitable vehicles (standards bodies, open source efforts, consortia, etc.) for common specification and architectural realization.