Traffic Pattern Attacks: A Real Threat
Assume, for a moment, that you have a configuration something like this:
Some host, A, is sending queries to, and receiving responses from, a database at C. An observer, B, has access to the packets on the wire, but not to either the host or the server. All the information passing between the host and the server is encrypted. Is there anything the observer, B, can learn about the information being carried between the client and the server? Given the traffic is encrypted, you might think… “not very much.”
A recent research paper published at CCS ’16 in Vienna argues the observer could learn a lot more. In fact, based on just the patterns of traffic between the server and the client, and provided the database uses atomic operations and encrypts each record separately, it’s possible to infer the key used to query the database (the lookup key, not the cryptographic key). The paper can be found here. Specifically:
We then develop generic reconstruction attacks on any system supporting range queries where either access pattern or communication volume is leaked. These attacks are in a rather weak passive adversarial model, where the untrusted server knows only the underlying query distribution. In particular, to perform our attack the server need not have any prior knowledge about the data, and need not know any of the issued queries nor their results. Yet, the server can reconstruct the secret attribute of every record in the database after about N^4 queries, where N is the domain size.
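To get a feel for how access-pattern leakage gives information away, here is a toy simulation. It is not the paper's algorithm, only a simplified sketch under some assumptions chosen for illustration: the domain is the integers 1..N, range queries are drawn uniformly from all possible ranges, and the observer sees only which (encrypted) record ids come back in each response. The function names and the sample values are hypothetical. A record with value v is matched by v(N + 1 - v) of the N(N + 1)/2 possible ranges, so how often a record shows up hints at its value, up to the mirror image N + 1 - v.

```python
import random
from collections import Counter


def match_probability(v, n):
    """Probability that a uniformly chosen range [lo, hi] within 1..n contains v."""
    total_ranges = n * (n + 1) // 2
    return v * (n + 1 - v) / total_ranges


def observe_hits(secret_values, n, num_queries, rng):
    """Simulate the observer's view: only which record ids each response contains."""
    all_ranges = [(lo, hi) for lo in range(1, n + 1) for hi in range(lo, n + 1)]
    hits = Counter()
    for _ in range(num_queries):
        lo, hi = rng.choice(all_ranges)  # the query itself is hidden from the observer
        for record_id, v in enumerate(secret_values):
            if lo <= v <= hi:
                hits[record_id] += 1
    return hits


def estimate_values(hits, num_records, n, num_queries):
    """Guess each record's value (up to reflection) from its hit frequency alone."""
    candidates = range(1, (n + 1) // 2 + 1)
    guesses = []
    for record_id in range(num_records):
        freq = hits[record_id] / num_queries
        guesses.append(
            min(candidates, key=lambda v: abs(match_probability(v, n) - freq))
        )
    return guesses


if __name__ == "__main__":
    rng = random.Random(0)
    secret = [1, 3, 5, 8, 10]  # hypothetical secret attribute values, domain 1..10
    n, num_queries = 10, 50_000
    hits = observe_hits(secret, n, num_queries, rng)
    # Each guess should land on min(v, n + 1 - v) for the matching secret value.
    print(estimate_values(hits, len(secret), n, num_queries))
```

The estimator never looks at the secret values or the query ranges. It works only from counts an observer could collect, which is the point of the exercise. With far fewer queries the frequencies get noisy and the guesses start to drift.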
What is interesting about this research is that the attack infers a potentially useful piece of information from passive observation of encrypted data being passed between a client and server. To put this in more real world terms, assume for a moment that A is your computer, and C is a web server that stores information about you, indexed by a nonce kept in a cookie on your local hard drive. The web site uses only encrypted sockets (TLS, let’s say). Some outside observer has access to your wifi connection, but not to your computer.
The nonce, in effect, acts like a key into the database. If the site wants to know the last five purchases you’ve made without you signing in, or the last five sites you’ve visited, etc., it would pull the cookie off your computer and query a database server that translates the cookie into the information required. If this all happens on your computer (a likely scenario, as most such information gathering systems rely on locally executed code), then over enough observations, even observations of encrypted traffic, the observer can infer the value of your cookie. Using the information gathered over these observations, the attacker could then retrieve the information stored about you in the database.
The likelihood of such an attack actually working is limited by several factors. First, the attacker needs a lot of observations. Second, the key must not change across those transactions. Third, the transactions must be atomic, meaning each transaction stands alone rather than being spread across a number of packet exchanges.
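To put a rough number on "a lot of observations," the short sketch below plugs a few domain sizes into the "about N^4 queries" figure quoted from the paper. It ignores any constant or logarithmic factors, and the rate of 10 observed queries per second is purely an assumption for illustration, so treat the results as orders of magnitude only.

```python
SECONDS_PER_DAY = 86_400


def queries_needed(domain_size):
    """Order-of-magnitude query count from the paper's 'about N^4' figure."""
    return domain_size ** 4


def days_to_observe(domain_size, queries_per_second):
    """Days of passive observation needed at an assumed steady query rate."""
    return queries_needed(domain_size) / queries_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    assumed_rate = 10.0  # hypothetical queries per second seen by the observer
    for n in (10, 100, 1000):
        days = days_to_observe(n, assumed_rate)
        print(f"N={n}: ~{queries_needed(n):,} queries, ~{days:,.1f} days")
```

At this assumed rate, a domain of 100 values works out to roughly 116 days of observation, while a domain of 1,000 values is out of reach for any realistic observer.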
At the same time, this type of research shows what is possible just by observing traffic passing over the wire between two devices, which makes it something to think about when considering cloud based storage and database solutions, particularly for sensitive data.