Meltdown, Spectre and the CPU Flaw That Was Everywhere

The first week of January 2018 opened with a disclosure that landed differently from most security vulnerabilities. Meltdown and Spectre were not software flaws in a particular application or operating system. They were fundamental design flaws in the processors that power almost every computer, phone, and server made over the previous two decades.

The vulnerabilities exploited a technique called speculative execution, which modern processors use to improve performance. Rather than waiting to learn which instructions will actually be needed, for example which way a branch will go, processors predict the likely path and execute it in advance, discarding the results if the prediction turns out to be wrong. This makes chips faster. The discarded work can still leave traces in the processor's cache, however, and as researchers at Google Project Zero and several academic institutions discovered, those traces create a side channel through which a malicious programme can read memory that should be completely isolated from it. In practice this meant that carefully crafted code running in one context could read sensitive data, including passwords and encryption keys, from memory belonging to the operating system kernel or to entirely separate processes.
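The idea can be illustrated with a toy model. The sketch below is not an exploit and does not touch real hardware: the names ToyCPU, victim_read, probe and leak are hypothetical, the "cache" is just a Python set, and the branch misprediction is switched on by a flag rather than induced by training a predictor. It only shows the shape of a bounds-check-bypass leak, in which a speculative read is rolled back but its cache footprint is not.

```python
class ToyCPU:
    """Toy model: architectural results are rolled back, cache state is not."""

    def __init__(self, memory, public_len):
        self.memory = memory          # public bytes followed by secret bytes
        self.public_len = public_len  # limit the victim code is meant to enforce
        self.cache = set()            # probe-array lines currently "cached"

    def victim_read(self, index, mispredict=False):
        """Bounds-checked read; returns the byte, or None if out of bounds."""
        in_bounds = index < self.public_len
        value = None
        if in_bounds or mispredict:
            # Runs speculatively when the predictor wrongly guesses "in bounds".
            value = self.memory[index]
            self.cache.add(value)     # dependent load touches probe line `value`
        if not in_bounds:
            return None               # result discarded, cache footprint remains
        return value

    def probe(self):
        """Attacker view: list the cached probe lines, then flush the cache."""
        hits = sorted(self.cache)
        self.cache.clear()
        return hits


def leak(cpu, start, length):
    """Recover bytes the victim never returns, one cache probe at a time."""
    out = bytearray()
    for i in range(start, start + length):
        cpu.probe()                               # flush before each attempt
        assert cpu.victim_read(i, mispredict=True) is None
        out.append(cpu.probe()[0])                # the cached line is the byte
    return bytes(out)


public = b"hello"
secret = b"hunter2"
cpu = ToyCPU(public + secret, len(public))
recovered = leak(cpu, len(public), len(secret))
print(recovered)
```

In the model the victim never returns a secret byte, yet the attacker reconstructs the secret from which probe line ended up cached. Real attacks infer the same thing by timing memory accesses, which is far noisier than reading a set.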

The disclosure was coordinated but messy. Intel, AMD, ARM, and the major cloud providers had been working under embargo for months to develop patches before the vulnerabilities became public. The embargo broke before the intended disclosure date when details began leaking, which forced an earlier announcement than planned. The result was a period of confusion about exactly which processors were affected, which operating systems had been patched, and what the actual risk was.

The patches created their own problem. Fixing speculative execution flaws required disabling or adding checks to the very optimisations that made modern processors fast. Performance testing in the weeks after the patches showed slowdowns that varied significantly by workload. Database-heavy operations, which make many small requests to the operating system, were more affected than computationally heavy work that rarely crosses into the kernel. For cloud providers running large shared infrastructure, the implications were significant.

What made Meltdown and Spectre particularly uncomfortable was the question of accountability. These were design decisions made over many years by chip manufacturers optimising for performance. The optimisations worked. The security implications were not understood at the time, or were not considered seriously enough. Fixing them required changes at the hardware level for new processors and software mitigations for everything already deployed, and the software mitigations imposed real costs.

The episode was a reminder that security assumptions baked into hardware have a very long lifetime, and that the cost of getting them wrong is distributed across every system that runs on the affected hardware.
