How do we know that famous works of art on display are authentic and not forgeries? It’s clearly not an easy task, as the art forgery industry is estimated to generate around $30 billion per year. Artists frequently sign their original work, but it’s rarely possible to distinguish an original signature from a reproduction.

One approach for identifying forgeries makes use of atomic bombs, or rather, the elevated carbon-14 isotope ratio found in all organic material that has formed since the atmospheric nuclear weapons tests of the 1950s and 1960s. If the organic matter used to bind the pigments in the paint came from plants that died before this period, its carbon-14 ratio will be substantially lower than that of more modern paint, allowing investigators to reliably identify modern forgeries.

As with art, we frequently care more about the authenticity of our data than whether or not others are able to view it. For example, how do we know that the code running on our devices is authentic? We aren’t always concerned with hiding the code: anyone can view the open source U-Boot bootloader code. However, it’s critical that our devices are able to verify that the bootloader code is unaltered and authentic before starting execution.

This blog introduces message authentication codes and digital signatures, which are cryptographic approaches for verifying the integrity (has the data changed?) and authenticity (who generated the data?) of data. But why not just use a checksum or CRC to verify that the data hasn’t been altered? As we will see in the next section, checksums and CRCs unfortunately don’t protect against an adversary who is intentionally modifying data.

Why CRCs Aren’t Appropriate

A well-known approach for verifying whether or not data has been altered is to add an error detecting code, such as a Cyclic Redundancy Check (CRC). For example, consider the 1-bit CRC generated by the polynomial x+1. This is equivalent to adding an even parity bit, where a 1 is appended to the data if the number of 1s in the data is odd, and a 0 is appended otherwise.

If we are given input data 10010111, we can see that there is an odd number of 1s in the data, so our parity bit will be 1. So why isn’t this efficient error detection scheme appropriate for ensuring integrity and authenticity?
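To make the equivalence concrete, here is a minimal Python sketch that computes the check bit for our example data in two ways: by counting 1s, and by doing the polynomial long division over GF(2) with the generator x+1 (written as the bit string "11"). The function names parity_bit and crc_remainder are our own, chosen for illustration only.

```python
def parity_bit(bits: str) -> int:
    """Even-parity bit: 1 if the number of 1s is odd, otherwise 0."""
    return bits.count("1") % 2


def crc_remainder(bits: str, poly: str) -> str:
    """Remainder of (bits followed by zero padding) divided by poly over GF(2).

    poly is the generator as a bit string, highest power first,
    so x + 1 is written "11".
    """
    width = len(poly) - 1
    work = [int(b) for b in bits] + [0] * width  # append `width` zero bits
    divisor = [int(b) for b in poly]
    for i in range(len(bits)):
        if work[i]:  # leading bit is set: subtract (XOR) the divisor here
            for j, d in enumerate(divisor):
                work[i + j] ^= d
    return "".join(str(b) for b in work[-width:])


data = "10010111"
print(parity_bit(data))           # 1
print(crc_remainder(data, "11"))  # 1
```

Both approaches give a check bit of 1 for 10010111, and they agree for every other 8-bit input as well.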

The problem is that CRCs are the equivalent of an artist’s signature. If we think like a CRC forger, how could a device be fooled into accepting altered data as unchanged? First, let’s take the easiest case: assume the data and its CRC are stored together in unprotected memory (e.g., flash). All the attacker needs to do is overwrite the original data with their own, then calculate and append the appropriate CRC to their data. When the device reads the altered data, it will compute the CRC and verify that it matches the CRC generated by the attacker, accepting the altered data as the original! For a real-world example of this attack, this walkthrough explains in detail how hackers were able to gain control of a Docker server by manipulating the CRCs in the Linux kernel.

Circumventing CRCs: (Left) Original Code and CRC Passes Verification, (Middle) Code with Adversarial Modifications and Original CRC Fails Verification, (Right) Code with Adversarial Modifications and Adversarial CRC Passes Verification
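The three cases in the figure can be sketched in a few lines of Python. This is only an illustration under assumptions of our own: the payload strings are made up, the CRC is the CRC-32 from Python’s zlib module rather than whatever a particular device uses, and make_image and device_accepts are hypothetical helper names.

```python
import zlib


def make_image(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC-32 to the payload (assumed layout)."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def device_accepts(image: bytes) -> bool:
    """Device-side check: recompute the CRC-32 and compare with the stored one."""
    payload, stored = image[:-4], image[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == stored


original = make_image(b"bootloader v1; mode=user")

tampered_payload = b"bootloader v1; mode=root"
stale_crc = tampered_payload + original[-4:]  # altered data, original CRC
forged = make_image(tampered_payload)         # altered data, attacker's CRC

print(device_accepts(original))   # True  (left panel)
print(device_accepts(stale_crc))  # False (middle panel)
print(device_accepts(forged))     # True  (right panel)
```

Nothing in make_image requires a secret, which is exactly the problem: the attacker can run the same computation as the legitimate author.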

To make things a bit harder for the attacker, the device could store the CRC in one-time programmable (OTP) memory. Unfortunately, this doesn’t eliminate the problem. While CRCs are a good approach for detecting benign errors from channel noise or unstable cells, they don’t protect against actively malicious modifications.

Let’s assume that our earlier example data is used to enable or disable different product features depending on the license model selected by the customer. If the high-value features are controlled by the leftmost bits, an adversary who can flip more of those bits in our example vector 10010111 to 1s could enable those features without paying for the higher-priced license model. This is trivial for an adversary to do:

Original Data    Modified Data
10010111  1      11110111  1

Since the 1-bit CRC is 1 in the original data (indicating an odd number of 1s in the data), the adversary only needs to ensure that the modified data also has an odd number of 1s. By flipping the two leftmost 0s to 1s, the adversary has enabled two additional high-value features without modifying the CRC check bit that is stored securely in OTP. The same strategy works for the larger CRCs used in real-world applications: because the CRC algorithm is public and uses no secret key, the adversary can always search for, or directly compute, a modification that leaves the check value unchanged.
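Here is a small Python sketch of this attack, assuming the check bit is fixed (as it would be in OTP) and the attacker can only rewrite the 8 data bits. The names stored_check_bit and device_accepts are hypothetical and used only for illustration.

```python
def parity_bit(bits: str) -> int:
    """Even-parity bit: 1 if the number of 1s is odd, otherwise 0."""
    return bits.count("1") % 2


# Check bit computed once from the original data; the attacker cannot change it.
stored_check_bit = parity_bit("10010111")


def device_accepts(bits: str) -> bool:
    """Device-side check against the fixed check bit."""
    return parity_bit(bits) == stored_check_bit


print(device_accepts("11110111"))  # True: two bits flipped, parity unchanged
print(device_accepts("11010111"))  # False: a single flip changes the parity

# Count how many of the 256 possible 8-bit values the device would accept.
accepted = [v for v in range(256) if device_accepts(format(v, "08b"))]
print(len(accepted))  # 128
```

Half of all possible feature vectors pass the check, so the attacker has plenty of accepted values to choose from.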

The important takeaway is that CRCs only provide protection against benign errors, such as noise on a communication channel or unstable SRAM cells. When the adversary is actively trying to modify the data, they can do so in a way that will not alter the CRC generated from the original data.
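The same idea can be sketched for a wider CRC. The example below is an illustration under assumptions of our own, not a description of any real product: the record is one feature byte followed by two reserved bytes that the device ignores, and the 16-bit CRC is the CRC-CCITT variant provided by Python’s binascii.crc_hqx. The attacker picks the feature byte they want and simply tries all 65,536 values of the reserved bytes until the CRC matches the one locked in OTP.

```python
import binascii


def crc16(data: bytes) -> int:
    """16-bit CRC-CCITT with an initial value of 0 (via binascii.crc_hqx)."""
    return binascii.crc_hqx(data, 0)


# Hypothetical record layout: 1 feature byte + 2 reserved bytes.
original = bytes([0b10010111]) + b"\x00\x00"
otp_crc = crc16(original)  # fixed in OTP; the attacker cannot change it


def forge(feature_byte: int) -> bytes:
    """Search the 2 reserved bytes for a record whose CRC equals otp_crc."""
    for filler in range(1 << 16):
        candidate = bytes([feature_byte]) + filler.to_bytes(2, "big")
        if crc16(candidate) == otp_crc:
            return candidate
    raise ValueError("no matching record found")


forged = forge(0b11110111)
print(forged != original)         # True: the data has changed
print(crc16(forged) == otp_crc)   # True: the stored CRC still matches
```

A brute-force search is used here only because it is easy to read. CRCs are linear, so the compensating bytes can also be computed directly, which keeps the attack practical even for 32-bit CRCs.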

So if CRCs don’t provide any protection against malicious adversaries, what does? Next week’s blog will share a stronger solution than error-detecting and error-correcting codes like CRCs.