
Responding to Readers: Automated Design?

Deepak responded to my video on network commoditization with a question:

What’s your thoughts on how Network Design itself can be Automated and validated. Also from Intent based Networking at some stage Network should re-look into itself and adjust to meet design goals or best practices or alternatively suggest the design itself in green field situation for example. APSTRA seems to be moving into this direction.

The answer to this question, as always, is—how many balloons fit in a bag? 🙂 I think it depends on what you mean when you use the term design. If we are talking about the overlay, or traffic engineering, or even quality of service, I think we will see a rising trend towards using machine learning in network environments to help solve those problems. I am not convinced machine learning can solve these problems, in the sense of leaving humans out of the loop, but humans could set the parameters up, let the neural network learn the flows, and then let the machine adjust things over time. I tend to think this kind of work will be pretty narrow for a long time to come.

There will be stumbling blocks here that need to be solved. For instance, if you introduce a new application into the network, do you need to re-teach the machine learning network? Or can you somehow make some adjustments? Or are you willing to let the new application underperform while the neural network adjusts? There are no clear answers to these questions, and yet we are going to need clear answers to them before we can really start counting on machine learning in this way.

If, on the other hand, you think of design as figuring out what the network topology should look like in the first place, or what kind of bandwidth you might need to build into the physical topology and where, I think machine learning can provide hints, but it is not going to be able to “design” a network in this way. There is too much intent involved here. For instance, in your original question, you noted the network can “look into itself” and “make adjustments” to better “meet the original design goals.” I’m not certain those “original design goals” are ever going to come from machine learning.

If this sounds like a wishy-washy answer, that’s because it is, in the end… It is always hard to make predictions of this kind—I’m just working off of what I know of machine learning today, compared to what I understand of the multi-variable problem of network design, which is then mushed into the almost infinite possibilities of business requirements.

Weekend Reads 011918: IoT, Cyberthreat Thinking, and Techlash

Throughout 2016 and 2017, attacks from massive botnets made up entirely of hacked IoT devices had many experts warning of a dire outlook for Internet security. But the future of IoT doesn’t have to be so bleak. Here’s a primer on minimizing the chances that your IoT things become a security liability for you or for the Internet at large. —Krebs on Security

The cybercrime and cyber terrorism raging today are the most visible symptoms of a more pervasive problem concerning cyber security. How to establish a fair and just governance regime in cyberspace and establish international rules spark a storm of controversy. The controversy reflects the competing interests and demands of three distinct cyberspace actors: the state, the citizen, and the international community. By focusing only on one’s own interests, each actor ignores the interests of the other two, resulting in the current situation in which each sticks to its own argument and refuses to reconcile. —Hao Yeli

Deputy Attorney General Rosenstein has given talks where he proposes that tech companies decrease their communications and device security for the benefit of the FBI. In a recent talk, his idea is that tech companies just save a copy of the plaintext… —Schneier on Security

In this post, I’ll talk about fingerprinting documents using text-based steganography. The problem we’re trying to solve is as follows. We have a sensitive document that must be distributed to some number of readers. Let’s say, for example, that Grandpa has decided to share his famous cookie recipe with each of his grandchildren. But it’s super important to him that the recipe stays in the family! So they’re not allowed to share it with anyone else. If Grandpa finds pieces of his cookie recipe online later, he wants to know which grandchild broke the family trust. —by Noam with Micha @FF Labs

U.S. lawmakers are urging AT&T Inc, the No. 2 wireless carrier, to cut commercial ties to Chinese phone maker Huawei Technologies Co Ltd and oppose plans by telecom operator China Mobile Ltd to enter the U.S. market because of national security concerns, two congressional aides said. —Diane Bartz @The Free Beacon

U.S. Chamber of Commerce President Thomas J. Donohue on January 10, 2018, warned that “techlash” is a threat to prosperity in 2018. What was he getting at? A “backlash against major tech companies is gaining strength — both at home and abroad, and among consumers and governments alike.” “Techlash” is a shorthand reference to a variety of impulses by government and others to shape markets, services, and products; protect local interests; and step in early to prevent potential harm to competition or consumers. —Megan L. Brown @CircleID

Cisco Live Barcelona 2018

I will be presenting at the CCDE Techtorial at Cisco Live in Barcelona on the 30th of January. This is a great opportunity to come out and learn about the Cisco Certified Design Expert from one of the best groups of speakers around.

IETF 101

I will be at IETF 101 in London in March. If you have never been to an IETF before and live in the London area, this is a great chance to come see how the standardization process works, and even get involved for the long term.

On the ‘web: The Value of MANRS

Route leaks and Distributed Denial of Service (DDoS) attacks have been in the news a good deal over the last several years; but the average non-transit network operator might generally feel pretty helpless in the face of the onslaught. Perhaps you can buy a DDoS mitigation service or appliance, and deploy the ubiquitous firewall at the edge of your network, but there is not much else to be done, right? Or maybe wait on the Internet at large to “do something” about these problems by deploying some sort of BGP security. But will adopting a “secure edge,” and waiting for someone else to solve the problem, really help? @ECI

The Overoptimization Meltdown

In simple terms, Meltdown and Spectre are easy vulnerabilities to understand. Imagine a gang of thieves waiting for a stagecoach carrying a month’s worth of payroll.

There are two roads the coach could take, and a fork, or a branch, where the driver decides which one to take. The driver could take either one. What is the solution? Station robbers along both sides of the branch, and wait to see which one the driver chooses. When you know, pull the resources from one branch to the other, so you can effectively rob the stage. This is much the same as a modern processor handling a branch—the user could have put anything into some field, or retrieved anything from a database, that might cause the software to run one of two sets of instructions. There is no way for the processor to know, so it runs both of them.

cross posted at CircleID

To run both sets of instructions, the processor will pull in the contents of specific memory locations, and begin executing code across these memory locations. Some of these memory locations might not be pieces of memory the currently running software is supposed to be able to access, but this is not checked until the branch is chosen. Hence a piece of software can force the processor to load memory it should not have access to by calling the right instructions in a speculative branch, exposing those bits of memory to be read by the software.
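To make the mechanism a little more concrete, here is a toy Python model of the idea. It is only a sketch under loud assumptions: the class `ToyCPU`, its method names, and the memory contents are all hypothetical, the "cache" is just a Python set, and a real attack would infer which line was cached by timing memory accesses rather than by looking at it directly.

```python
class ToyCPU:
    """Toy model only: no real processor exposes its cache like this."""

    def __init__(self, memory, allowed):
        self.memory = memory              # address -> value
        self.allowed = allowed            # addresses the program may read
        self.cached_probe_lines = set()   # stand-in for cache state

    def run_branch_speculatively(self, address):
        # Speculative work happens before the permission check.
        value = self.memory[address]
        # A dependent access indexed by the value leaves a cache trace.
        self.cached_probe_lines.add(value)
        # The branch resolves; only now is the access checked.
        if address not in self.allowed:
            # The result is thrown away, but the cache is not rolled back.
            return None
        return value


def recover(cpu):
    # Assumption: a real attacker would time each probe line instead.
    (value,) = cpu.cached_probe_lines
    return value


cpu = ToyCPU(memory={0: 7, 1: 42}, allowed={0})
result = cpu.run_branch_speculatively(1)   # address 1 is off limits
leaked = recover(cpu)
print(result, leaked)
```

The program never receives the protected value as a result (`result` is `None`), yet the side effect left behind by the discarded work is enough to recover it.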

If you are interested in more details, there is an entire page here of articles about these two problems.

But my point here is not to consider the problem itself. What is more interesting is the thinking that leads to this kind of software defect being placed into the code. There are, in all designs, tradeoffs. For instance, in the real (physical) world, there is the tradeoff between fast, cheap, and quality. In the database world, there is the tradeoff among consistency, accessibility, and partitionability. I have, for many years, maintained that in network design there is a tradeoff between state, optimization, and surfaces.

What Meltdown and Spectre represent is the unintended consequence of a strong drive towards enhancing performance. It’s not that the engineers who designed speculative execution, and put it into silicon, are dumb. In fact, they are brilliant engineers who have helped drive the art of computing ever faster forward in ways probably unimaginable even twenty years ago. There are known tradeoffs when using speculative execution, such as:

  • Power—some code is going to be run, and the contents of some memory fetched, that will not be used. Fetching these memory locations, and running this code, is not free; there is some amount of power used, and heat generated, in speculative execution. This was actually a point of discussion early in the life of speculative execution, but the performance gains were so solid that the power and heat concerns were eventually set aside.
  • Real Estate—speculative execution requires physical real estate in the processor. It makes processors larger, and uses silicon gates that could be used for something else. Overall, the most performance enhancing use of the available real estate was shown to be the most economically useful, and thus speculative execution became an important part of chip design.
  • State—speculative execution drives the amount of state, and the speed at which that state is changing, much higher than it would otherwise be. Again, the performance gains were strong enough to make the added state worth the effort.

There was one more tradeoff, we now know, that was not considered during the initial days and years when speculative execution was being discussed—security.

So maybe it is time to take stock, and think about lessons learned. First, it is always the unexpected consequence that will come back to bite you in the end. Second, there is almost always an unexpected consequence. The value of experience is in being bitten by unexpected consequences enough times to learn to know what to look for in the future.

Well, in theory, anyway.

Finally, if you haven’t found the tradeoffs, you haven’t looked hard enough. Any time you think you have come up with a way to do things that will outperform any other way, you need to find all the tradeoffs. Don’t just find one tradeoff, and say, “see, I have that covered.”

A single-minded focus on performance, at the cost of all else, will normally cost you more than you think, in the end. Overoptimization can sometimes cause meltdowns. And spectres.

It’s a lesson well worth learning.

Weekend Reads 011118: Mostly Security and Policy

Traveling is stressful. The last thing you want to worry about is getting scammed by crooks on the street. Your best tool? Knowledge. Know how they work. Know what they’ll do. Prevent it from happening in the first place. —Relatively Interesting

The European Union’s competition chief is zeroing in on how companies stockpile and use so-called big data, or enormous computer files of customer records, industry statistics and other information. The move diverges starkly from a hands-off approach in the U.S., where regulators emphasize the benefits big data brings to innovation. —Natalia Drozdiak @ MarketWatch

The cybersecurity industry has mushroomed in recent years, but the data breaches just keep coming. Almost every day brings news of a new data breach, with millions of records compromised — including payment details, passwords, and other information that makes those customers vulnerable to theft and identity fraud. —Alistair Johnston @ MarketWatch

To break the dominance of Google on Android, Gael Duval, a former Linux developer and creator of now defunct but once hugely popular Mandrake Linux (later known as Mandriva Linux), has developed an open-source version of Android that is not connected to Google. —Kavita Iyer @ TechWorm

China has rarely undertaken a role in developing public international cybersecurity law over the many years the provisions have existed. Only once did it submit a formal proposal — fifteen years ago to the 2002 Plenipotentiary Conference where it introduced a resolution concerning “rapid Internet growth [that] has given rise to new problems in communication security.” Thus, a China formal submission to the upcoming third EG-ITRs meeting on 17-19 January 2018 in Geneva is significant in itself. —Anthony Rutkowski @ CircleID

If all you want is the TL;DR, here’s the headline finding: due to flaws in both Signal and WhatsApp (which I single out because I use them), it’s theoretically possible for strangers to add themselves to an encrypted group chat. However, the caveat is that these attacks are extremely difficult to pull off in practice, so nobody needs to panic. But both issues are very avoidable, and tend to undermine the logic of having an end-to-end encryption protocol in the first place. —Krebs on Security

This past Friday Twitter issued what is perhaps one of the most remarkable statements in modern diplomatic history: it said both that it would not ban a world leader from its platform and that it reserved the right to delete official statements by heads of state of sovereign nations as it saw fit. Have we truly reached a point in human history where private companies now wield absolute authority over what every government on earth may say to their citizens in the online world that has become the defacto modern town square? —Kalev Leetaru @ Forbes

Section 10 Routing Loops

A (long) time ago, a reader asked me about RFC4456, section 10, which says:

Care should be taken to make sure that none of the BGP path attributes defined above can be modified through configuration when exchanging internal routing information between RRs and Clients and Non-Clients. Their modification could potentially result in routing loops. In addition, when a RR reflects a route, it SHOULD NOT modify the following path attributes: NEXT_HOP, AS_PATH, LOCAL_PREF, and MED. Their modification could potentially result in routing loops.

On first reading, this seems a little strange—how could modifying the next hop, Local Preference, or MED at a route reflector cause a routing loop? While contrived, the following network illustrates the principle.

Note the best path, from an IGP perspective, from C to E is through B, and the best path, from an IGP perspective, from B to D is through C. In this case, a route is advertised over eBGP from F towards E and D. These two eBGP speakers, in turn, advertise the route to their iBGP neighbors, B and C. Both B and C are route reflectors, so they both reflect the route on to A, which advertises the route to some other eBGP speaker outside AS65000 (not shown in the network diagram). In this case, assume the best path (for whatever reason) should be the route learned through D.

What happens if C changes the next hop for the route so it points to E rather than D? This should be fine, at first glance; when E receives traffic for the destination reachable through F, it will use the local eBGP route learned from F directly to forward the traffic. But there is a subtle problem here. Assume A receives both routes, one from B with a next hop of D, and one from C with a next hop of E. A, for whatever reason, chooses the path with a next hop of D. The best path to D, according to the IGP metrics, is through C, so A forwards the traffic to C.

C, however, has been configured to set the next hop to E through a local configuration. The best IGP path to E is through B, so C will forward the traffic towards B to be forwarded to E. B, however, has a next hop towards this destination of D, so when it receives packets destined beyond F in AS65001, it will examine its local routing table for the best path towards D, and find this is through C. Hence, B will forward the traffic to C to be forwarded towards D.

Thus a routing loop is formed because the best IGP path towards the next hop always points through another router with a next hop that points back to the router forwarding the traffic. The problem is B and C have inconsistent best paths, such that each thinks the best path is through the other.
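The forwarding behavior described above can be sketched in a few lines of Python. This is a minimal model, not a BGP implementation: the table holds only the IGP next hops stated in the text, plus two direct links (B to E and C to D) that are assumed here because the diagram is not reproduced. The names `IGP_NEXT_HOP`, `EDGE`, and `trace` are hypothetical.

```python
# IGP next hop a router uses to reach a given BGP next hop.
# A->D, B->D and C->E come from the text; B->E and C->D are assumed direct links.
IGP_NEXT_HOP = {
    ("A", "D"): "C",
    ("B", "D"): "C",
    ("B", "E"): "E",
    ("C", "D"): "D",
    ("C", "E"): "B",
}

# eBGP speakers that hand traffic directly to F.
EDGE = {"D", "E"}


def trace(bgp_next_hop, start="A"):
    """Follow a packet hop by hop; return (path, loop_detected)."""
    path = [start]
    router = start
    while True:
        if router in EDGE:
            path.append("F")          # edge router uses its own eBGP route
            return path, False
        # Each router resolves its own BGP next hop through the IGP.
        nxt = IGP_NEXT_HOP[(router, bgp_next_hop[router])]
        looped = nxt in path          # revisiting a router means a loop
        path.append(nxt)
        if looped:
            return path, True
        router = nxt


consistent = trace({"A": "D", "B": "D", "C": "D"})
modified = trace({"A": "D", "B": "D", "C": "E"})   # C rewrites the next hop to E
print(consistent)
print(modified)
```

With every router agreeing on D, the trace is A, C, D, F. Once C alone points at E, the trace becomes A, C, B, C and the function reports a loop, which is the case walked through above.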

This is, of course, an artifact of overlaying two different control planes, each with their own rules about how to determine a loop free path to any given destination. This sort of problem can arise with any pair of control planes overlaid in this way.

What about MED, Local Preference, or the AS Path? C could modify any of these while reflecting the route to cause E to be chosen as the best exit point locally, while B and A continue to choose D as the best exit point. Any of these, then, can be used to create a routing loop in this topology.

Again, this is a somewhat contrived example, but if a loop can be contrived, then it will likely show up in more complex (and not-so-contrived) networks in the real world. It would be much easier to create a loop with a hierarchical route reflector, or even by causing an inconsistent route advertisement on the AS edge (two different eBGP speakers advertising different paths to a given destination reachable through the local AS).

Administrivia 010918

I’ve reorganized the menu on the left just a little, combining some items under “reading,” and adding a new item called “topics.” Under this new item, you’ll find collections of articles on specific topics from other sources, starting with the ‘net neutrality page and the meltdown and spectre post reformatted as a page, with some new additions. I’m always trying to find new ways to organize the information here, making it easier to find things; hopefully this is a useful change.

Flowspec and RFC1998?

In a recent comment, Dave Raney asked:

Russ, I read your latest blog post on BGP. I have been curious about another development. Specifically is there still any work related to using BGP Flowspec in a similar fashion to RFC1998. In which a customer of a provider will be able to ask a provider to discard traffic using a flowspec rule at the provider edge. I saw that these were in development and are similar but both appear defunct. BGP Flowspec-ORF https://www.ietf.org/proceedings/93/slides/slides-93-idr-19.pdf BGP Flowspec Redirect https://tools.ietf.org/html/draft-ietf-idr-flowspec-redirect-ip-02.

This is a good question—to which there are two answers. The first is this service does exist. While it’s not widely publicized, a number of transit providers do, in fact, offer the ability to send them a flowspec community which will cause them to set a filter on their end of the link. This kind of service is immensely useful for countering Distributed Denial of Service (DDoS) attacks, of course. The problem is such services are expensive. The one provider I have personal experience with charges per prefix, and the cost is high enough to make it much less attractive.

Why would the cost be so high? The same reason a lot of providers do not filter for unicast Reverse Path Forwarding (uRPF) failures at scale—per packet filtering is very performance intensive, sometimes requiring recycling the packet in the ASIC. A line card normally able to support x customers without filtering may only be able to support x/2 customers with filtering. The provider has to pay for additional space, power, and configuration (the flowspec rules must be configured and maintained on the customer facing router). All of these things are costs the provider is going to pass on to their customers. The cost is high enough that I know of very few network operators (in fact, so few as to be zero) who will pay for this kind of service.

The second answer is there is another kind of service that is similar to what Dave is asking about. Many DDoS protection services offer their customers the ability to signal a request to the provider to block traffic from a particular source, or to help them manage a DDoS in some other way. This is very similar to the idea of interdomain flowspec, only using a different signaling mechanism. The signaling mechanism, in this case, is designed to allow the provider more leeway in how they respond to the request for help countering the DDoS. This system is called DDoS Open Threat Signaling (DOTS); you can read more about it at this post I wrote at the ECI Telecom blog. You can also head over to the IETF DOTS WG page, and read through the drafts yourself.

Yes, I do answer reader comments… Sometimes just in email, and sometimes with a post—so comment away, ask questions, etc.