Research: Covert Cache Channels in the Public Cloud
One of the great fears in server virtualization is the possibility of leaking information from one virtual machine, or one container, to another through some covert channel on the shared processor. Such a channel would allow an attacker who has rooted, or can otherwise install software on, one of the two virtual machines to exfiltrate data to another virtual machine running on the same processor. There have been some successful attacks in this area in recent years, most notably Meltdown and Spectre. These defects have been patched by cloud providers, at some cost in performance, but new vulnerabilities are bound to be found over time. The paper I’m looking at this week describes a new attack of this form. In this case, the researchers use the processor’s cache to transmit data between two virtual machines running on the same physical core.
The processor cache is always very small, for several reasons. First, the cache is connected to a special bus, which normally has limits on the amount of memory it can address. This special bus avoids reading data through the normal system bus, and it is (from a networking perspective) at least one hop, and often several hops, closer to the processor on the internal bus (which is essentially an internal network). Second, the memory used in the cache is faster than main memory, and faster memory is more expensive, so there is less of it.
The question is: since the cache sits at the processor level, so every virtual machine and process running on that processor shares it, is it possible for one process to place information in the cache that another process can read? Because the cache is small and fast, it is used to store information that is accessed frequently. As processes, daemons, and threads enter and exit, they access different parts of main memory, causing the contents of the cache to change rapidly. Because of this constant churn, many researchers have assumed you cannot build a covert channel through the cache in this way. In fact, there have been attempts in the past; each of these has failed.
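To make the cache-as-channel idea concrete, here is a minimal sketch of the timing primitive such channels are built on: deciding whether a memory line is cached by how long a load takes. This is my own illustration, not code from the paper; it assumes an x86-64 processor and gcc or clang, and the buffer and function names are invented.

```c
/*
 * Minimal sketch of the timing primitive behind cache covert channels:
 * decide whether a line is cached by measuring how long a load takes.
 * Assumes an x86-64 CPU and gcc/clang; names and sizes are illustrative.
 */
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

static uint8_t probe_buffer[4096];

/* Time one load of *addr in cycles; rdtscp partially serializes
 * execution, so the measurement brackets the access reasonably well. */
static uint64_t timed_load(volatile uint8_t *addr)
{
    unsigned aux;
    uint64_t start = __rdtscp(&aux);
    (void)*addr;                  /* the access being timed */
    return __rdtscp(&aux) - start;
}

int main(void)
{
    volatile uint8_t *line = probe_buffer;

    (void)*line;                      /* touch: the line is now cached  */
    uint64_t hit = timed_load(line);

    _mm_clflush((const void *)line);  /* evict the line from all caches */
    _mm_mfence();
    uint64_t miss = timed_load(line);

    /* A cache hit is typically tens of cycles; a miss, hundreds. */
    printf("hit: %llu cycles, miss: %llu cycles\n",
           (unsigned long long)hit, (unsigned long long)miss);
    return 0;
}
```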
The authors of this paper argue, however, that these failures are not because building a covert channel through the cache is impossible, but because previous attempts have operated on bad assumptions, attempting to use standard error correction mechanisms.
The first problem with using standard error correction mechanisms is that entire sections of data can be lost when a cache entry is evicted. Assume you have two processes running on a single processor, and you would like to build a covert channel between them. You write some code that inserts information into the cache at a particular memory location; this is the “blind drop.” The second process now runs and attempts to read this information. Normally this would work, but between the runs of the first and second processes, the information in the cache has been overwritten by some third process you do not know about. Because the entire data block is gone, the second process, which is trying to gather the information from the blind drop location, cannot tell there was ever any information at the drop point at all. There is no data across which an error correction code can run, because the data has been completely overwritten.
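A single blind drop might look something like the sketch below. This is hypothetical: it assumes the sender and receiver can map the same physical line (say, through a shared page), whereas real cross-VM channels typically work by evicting whole cache sets instead, and the threshold value is invented and would need calibration.

```c
/*
 * Hypothetical one-bit "blind drop": the sender leaves a bit in the
 * cache by touching (1) or flushing (0) an agreed-upon line; the
 * receiver recovers it by timing a load. Assumes sender and receiver
 * share the line's physical page; MISS_THRESHOLD must be calibrated.
 */
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

#define MISS_THRESHOLD 150      /* cycles; invented, calibrate per CPU */

static uint64_t timed_load(volatile uint8_t *addr)
{
    unsigned aux;
    uint64_t start = __rdtscp(&aux);
    (void)*addr;
    return __rdtscp(&aux) - start;
}

static void send_bit(volatile uint8_t *drop, int bit)
{
    if (bit)
        (void)*drop;                      /* load: the line is cached  */
    else
        _mm_clflush((const void *)drop);  /* flush: the line is evicted */
    _mm_mfence();
}

static int recv_bit(volatile uint8_t *drop)
{
    /* A fast load means the line was cached, i.e. the sender sent a 1.
     * The probe itself re-caches the line, so the read is destructive. */
    return timed_load(drop) < MISS_THRESHOLD;
}

int main(void)
{
    static uint8_t drop_page[4096];
    volatile uint8_t *drop = drop_page;

    send_bit(drop, 1);
    printf("received: %d\n", recv_bit(drop)); /* 1, unless evicted */
    return 0;
}
```

Notice the fragility the paragraph above describes: a transmitted 1 that is evicted by a third process simply reads back as a 0, and nothing at the drop point tells the receiver anything was ever there.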
A possible solution to this problem is to use something like a TCP window of one: the transmitter resends the information until it receives an acknowledgment of receipt. The problem with this solution, in the case of a cache, is that the sender and receiver have no way to synchronize their clocks, so there is no way to form any sort of serialized channel between the two processes.
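For contrast, here is what the window-of-one idea assumes, written as a self-contained stop-and-wait sketch over a simulated lossy channel; the channel_delivers function and its 50% loss rate are invented for illustration.

```c
/*
 * Stop-and-wait ("window of one") over a simulated lossy channel: the
 * sender repeats a symbol until it is acknowledged.
 */
#include <stdio.h>
#include <stdlib.h>

/* Simulated channel: drops a symbol 50% of the time. */
static int channel_delivers(void) { return rand() % 2; }

static void stop_and_wait_send(int symbol)
{
    int attempts = 0;
    for (;;) {
        attempts++;
        if (channel_delivers()) {   /* receiver got it, and its ACK
                                       made it back (also assumed)   */
            printf("symbol %d delivered after %d attempt(s)\n",
                   symbol, attempts);
            return;
        }
        /* No ACK before the timeout: resend the same symbol. */
    }
}

int main(void)
{
    srand(42);
    for (int bit = 0; bit < 2; bit++)
        stop_and_wait_send(bit);
    return 0;
}
```

The loop works here only because sender, receiver, and acknowledgment all share the program’s single clock and control flow; two unsynchronized processes signaling through a cache have no such shared timebase, and no reverse path for the acknowledgment.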
To overcome these problems, the researchers turn to techniques used in wireless networks to ensure reliable delivery over unreliable channels. For instance, they send each symbol (a 0 or a 1) multiple times, using different blind drops (or cache locations), so the receiver can compare the multiple transmitted instances and decide what the sender intended. The larger the number of blind drops used, the more likely the information is to make it across the process divide through the cache, as there are very few instances in which all the cache entries representing blind drops will be invalidated and replaced at once. The researchers increase the rate at which this newly opened covert channel can operate by reverse engineering some aspects of a particular processor model’s caching algorithm. This allows them to guess which cache lines will be invalidated first, how the cache sets are arranged, and so on, and hence to place the blind drops more effectively.
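A repetition code with majority voting, the kind of scheme borrowed from wireless, can be sketched in a few lines. This is my illustration rather than the authors’ code, with a made-up five-drop layout.

```c
/*
 * Repetition coding across blind drops: each bit is written to
 * NUM_DROPS independent cache locations, and the receiver takes a
 * majority vote over whatever survives. A 1 that is evicted from one
 * drop reads back as 0 there, but the vote still recovers it unless
 * most drops are wiped at once.
 */
#include <stdio.h>

#define NUM_DROPS 5

/* Decode one bit from NUM_DROPS independent reads (0 = miss/evicted). */
static int majority_decode(const int reads[NUM_DROPS])
{
    int ones = 0;
    for (int i = 0; i < NUM_DROPS; i++)
        ones += reads[i];
    return ones * 2 > NUM_DROPS;   /* a majority of drops say 1 */
}

int main(void)
{
    /* The sender transmitted a 1 to all five drops; two were evicted
     * by an unrelated process and therefore read back as 0. */
    int reads[NUM_DROPS] = { 1, 0, 1, 1, 0 };
    printf("decoded bit: %d\n", majority_decode(reads));  /* -> 1 */
    return 0;
}
```

In this simple model the noise is mostly one-directional, since an evicted drop reads back as a 0 rather than flipping randomly, which is part of why voting across several drops recovers the signal so reliably.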
By taking these steps, and using some strong error correction coding, the researchers created a covert channel of roughly 42 Kbps between two instances running on Amazon EC2. This might not sound like a lot, but it is faster than some of the quickest modems in use before DSL and other subscriber lines were widely available, and certainly fast enough to transfer a text file of passwords between two processes.
There will probably be some counter to this vulnerability in the future, but for now the main protection against this kind of attack is to prevent unknown or injected code from running on your virtual machines.