As we look upon the sunset of a remarkably tiresome year, I thought it would be appropriate to talk about **cryptographic wear-out**.

### What is cryptographic wear-out?

It’s the threshold when you’ve used the same key to encrypt so much data that you should probably switch to a new key before you encrypt any more. Otherwise, you might let someone capable of observing all your encrypted data perform interesting attacks that compromise the security of the data you’ve encrypted.

The exact value of the threshold varies depending on how exactly you’re encrypting data (n.b. AEAD modes, block ciphers + cipher modes, etc. each have different wear-out thresholds due to their composition).

Let’s take a look at the wear-out limits of the more popular symmetric encryption methods, and calculate those limits ourselves.

## Specific Ciphers and Modes

### Cryptographic Limits for AES-GCM

I’ve written about AES-GCM before (and why I think it sucks).

AES-GCM is a construction that combines AES-CTR with an authenticator called GMAC, whose consumption of nonces looks something like this:

- Calculating H (used in GHASH for all messages encrypted under the same key, regardless of nonce):
`Encrypt(00000000 00000000 00000000 00000000)`

- Calculating J
_{0}(the pre-counter block):- If the nonce is 96 bits long:
`NNNNNNNN NNNNNNNN NNNNNNNN 00000001`

where the`N`

spaces represent the nonce hexits.

- Otherwise:
`s = 128 * ceil(len(nonce)/nonce) - len(nonce)`

`J0 = GHASH(H, nonce || zero(s+64) || int2bytes(len(nonce))`

- If the nonce is 96 bits long:
- Each block of data encrypted uses J0 + block counter (starting at 1) as a CTR nonce.
- J0 is additionally used as the nonce to calculate the final GMAC tag.

AES-GCM is one of the algorithms where it’s easy to separately calculate the safety limits per message (i.e. for a given nonce and key), as well as for all messages under a key.

#### AES-GCM Single Message Length Limits

In the simplest case (nonce is 96 bits), you end up with the following nonces consumed:

- For each key:
`00000000 00000000 00000000 00000000`

- For each (nonce, key) pair:
`NNNNNNNN NNNNNNNN NNNNNNNN 000000001`

for J0`NNNNNNNN NNNNNNNN NNNNNNNN 000000002`

for encrypting the first 16 bytes of plaintext`NNNNNNNN NNNNNNNN NNNNNNNN 000000003`

for the next 16 bytes of plaintext…- …
`NNNNNNNN NNNNNNNN NNNNNNNN FFFFFFFFF`

for the final 16 bytes of plaintext.

From here, it’s pretty easy to see that you can encrypt the blocks from `00000002`

to `FFFFFFFF`

without overflowing and creating a nonce reuse. This means that each (key, nonce) can be used to encrypt a single message up to blocks of the underlying ciphertext.

Since the block size of AES is 16 bytes, this means the maximum length of a single AES-GCM (key, nonce) pair is **bytes** (or 68,719,476,480 bytes). This is approximately 68 GB or 64 GiB.

Things get a bit tricker to analyze when the nonce is not 96 bits, since it’s hashed.

The disadvantage of this hashing behavior is that it’s possible for two different nonces to produce overlapping ranges of AES-CTR output, which makes the security analysis very difficult.

However, this hashed output is also hidden from network observers since they do not know the value of H. Without some method of reliably detecting when you have an overlapping range of hidden block counters, you can’t exploit this.

(If you want to live dangerously and motivate cryptanalysis research, mix 96-bit and non-96-bit nonces with the same key in a system that does something valuable.)

#### Multi-Message AES-GCM Key Wear-Out

Now that we’ve established the maximum length for a single message, how many messages you can safely encrypt under a given AES-GCM key depends entirely on how your nonce is selected.

If you have a reliable counter, which is guaranteed to never repeat, and start it at 0 you can theoretically encrypt messages safely. Hooray!

You probably don’t have a reliable counter, especially in real-world settings (distributed systems, multi-threaded applications, virtual machines that might be snapshotted and restored, etc.).

Additionally (thanks to 2adic for the expedient correction), you cannot safely encrypt more than blocks with AES because the keystream blocks–as the output of a block cipher–cannot repeat.

Most systems that cannot guarantee unique incrementing nonces simply generate nonces with a cryptographically secure random number generator. This is a good idea, but no matter how high quality your random number generator is, random functions will produce collisions with a discrete probability.

If you have possible values, you should expect a single collision(with 50% probability) after (or )samples. This is called the **birthday bound**.

However, 50% of a nonce reuse isn’t exactly a comfortable safety threshold for most systems (especially since nonce reuse will cause AES-GCM to become vulnerable to active attackers). 1 in 4 billion is a much more comfortable safety margin against nonce reuse via collisions than 1 in 2. Fortunately, you can calculate the discrete probability of a birthday collision pretty easily.

If you want to rekey after your collision probability exceeds (for a random nonce between 0 and ), you simply need to re-key after messages.

#### AES-GCM Safety Limits

- Maximum message length: bytes
- Maximum number of messages (random nonce):
- Maximum number of messages (sequential nonce): (but you probably don’t have this luxury in the real world)
**Maximum data safely encrypted under a single key with a random nonce:**about bytes

### Cryptographic Limits for ChaCha20-Poly1305

The IETF version of ChaCha20-Poly1305 uses 96-bit nonces and 32-bit internal counters. A similar analysis follows from AES-GCM’s, with a few notable exceptions.

For starters, the one-time Poly1305 key is derived from the first 32 bytes of the ChaCha20 keystream output (block 0) for a given (nonce, key) pair. There is no equivalent to AES-GCM’s H parameter which is static for each key. (The ChaCha20 encryption begins using block 1.)

Additionally, each block for ChaCha20 is 512 bits, unlike AES’s 128 bits. So the message limit here is a little more forgiving.

Since the block size is 512 bits (or 64 bytes), and only one block is consumed for Poly1305 key derivation, we can calculate a message length limit of , or 274,877,906,880 bytes–nearly 256 GiB for each (nonce, key) pair.

The same rules for handling 96-bit nonces applies as with AES-GCM, so we can carry that value forward.

#### ChaCha20-Poly1305 Safety Limits

- Maximum message length: bytes
- Maximum number of messages (random nonce):
- Maximum number of messages (sequential nonce): (but you probably don’t have this luxury in the real world)
**Maximum data safely encrypted under a single key with a random nonce:**about bytes

### Cryptographic Limits for XChaCha20-Poly1305

XChaCha20-Poly1305 is a variant of XSalsa20-Poly1305 (as used in libsodium) and the IETF’s ChaCha20-Poly1305 construction. It features 192-bit nonces and 32-bit internal counters.

XChaCha20-Poly1305 is instantiated by using HChaCha20 of the key over the first 128 bits of the nonce to produce a subkey, which is used with the remaining nonce bits using the aforementioned ChaCha20-Poly1305.

~~This doesn’t change the maximum message length,~~ but it *does* change the number of messages you can safely encrypt (since you’re actually using up to distinct keys).

Thus, even if you manage to repeat the final ChaCha20-Poly1305 nonce, as long as the total nonce differs, each encryptions will be performed with a distinct key (thanks to the HChaCha20 key derivation; see the XSalsa20 paper and IETF RFC draft for details).

**UPDATE (2021-04-15)**: It turns out, my read of the libsodium implementation was erroneous due to endian-ness. The maximum message length for XChaCha20-Poly1305 is blocks, and for AEAD_XChaCha20_Poly1305 is blocks. Each block is 64 bytes, so that changes the maximum message length to about . This doesn’t change the extended-nonce details, just the underlying ChaCha usage.

#### XChaCha20-Poly1305 Safety Limits

- Maximum message length: bytes (earlier version of this document said
~~)~~ - Maximum number of messages (random nonce):
- Maximum number of messages (sequential nonce): (but you probably don’t have this luxury in the real world)
**Maximum data safely encrypted under a single key with a random nonce:**about bytes

### Cryptographic Limits for AES-CBC

It’s tempting to compare non-AEAD constructions and block cipher modes such as CBC (Cipher Block Chaining), but they’re totally different monsters.

- AEAD ciphers have a clean delineation between message length limit and the message quantity limit
- CBC and other cipher modes do not have this separation

Every time you encrypt a block with AES-CBC, you are depleting from a **universal** bucket that affects the birthday bound security of encrypting more messages under that key. (And unlike AES-GCM with long nonces, AES-CBC’s IV is public.)

This is in addition to the operational requirements of AES-CBC (plaintext padding, initialization vectors that never repeat *and* must be unpredictable, separate message authentication since CBC doesn’t provide integrity and is vulnerable to chosen-ciphertext atacks, etc.).

For this reason, most cryptographers don’t even bother calculating the safety limit for AES-CBC in the same breath as discussing AES-GCM. And they’re right to do so!

If you find yourself using AES-CBC (or AES-CTR, for that matter), you’d best be performing a separate HMAC-SHA256 over the ciphertext (and verifying this HMAC with a secure comparison function before decrypting). Additionally, you should consider using an extended nonce construction to split one-time encryption and authentication keys.

However, for the sake of completeness, let’s figure out what our practical limits are.

CBC operates on entire blocks of plaintext, whether you need the entire block or not.

On encryption, the output of the previous block is mixed (using XOR) with the current block, then encrypted with the block cipher. For the first block, the IV is used in the place of a “previous” block. (Hence, its requirements to be non-repeating *and* unpredictable.)

This means you can informally model (IV xor PlaintextBlock) and (PB_{n} xor PB_{n+1}) as a pseudo-random function, before it’s encrypted with the block cipher.

If those words don’t mean anything to you, here’s the kicker: You can use the above discussion about birthday bounds to calculate the upper safety bounds for the **total** number of blocks encrypted under a single AES key (assuming IVs are generated from a secure random source).

If you’re okay with a 50% probability of a collision, you should re-key after blocks have been encrypted.

If your safety margin is closer to the 1 in 4 billion (as with AES-GCM), you want to rekey after blocks.

**However, blocks encrypted doesn’t map neatly to bytes encrypted.**

If your plaintext is always an even multiple of 128 bits (or 16 bytes), this allows for up to bytes of plaintext. If you’re using PKCS#7 padding, keep in mind that this will include an entire padding block per message, so your safety margin will deplete a bit faster (depending on how many individual messages you encrypt, and therefore how many padding blocks you need).

On the other extreme (1-byte plaintexts), you’ll only be able to eek encrypted bytes before you should re-key.

Therefore, to stay within the safety margin of AES-CBC, you SHOULD re-key after blocks (including padding) have been encrypted.

Keep in mind: single-byte blocks is still approximately 281 TiB of data (including padding). On the upper end, 15-byte blocks (with 1-byte padding to stay within a block) clocks in at about or about 4.22 PiB of data.

#### That’s Blocks. What About Bytes?

The actual plaintext byte limit sans padding is a bit fuzzy and context-dependent.

The local extrema occurs if your plaintext is always 16 bytes (and thus requires an extra 16 bytes of padding). Any less, and the padding fits within one block. Any more, and the data:padding ratio starts to dominate.

Therefore, the worst case scenario with padding is that you take the above safety limit for block counts, and cut it in half. Cutting a number in half means reducing the exponent by 1.

But this still doesn’t eliminate the variance. blocks could be anywhere from to bytes of real plaintext. When in situations like this, we have to assume the worst (n.b. take the most conservative value).

Therefore…

#### AES-CBC Safety Limits

**Maximum data safely encrypted under a single key with a random nonce:**bytes (approximately 141 TiB)

## Take-Away

Compared to AES-CBC, AES-GCM gives you approximately a million times as much usage out of the same key, for the same threat profile.

ChaCha20-Poly1305 and XChaCha20-Poly1305 provides even greater allowances of encrypting data under the same key. The latter is even safe to use to encrypt arbitrarily large volumes of data under a single key without having to worry about ever practically hitting the birthday bound.

I’m aware that this blog post could have simply been a comparison table and a few footnotes (or even an IETF RFC draft), but I thought it would be more fun to explain how these values are derived from the cipher constructions.

## 10 replies on “Cryptographic Wear-Out for Symmetric Encryption”

So when you are encrypting your videos or photos, how do you actually keep track of how much data you have encrypted with the same key or nonce? Is this something that a key management solution like Java KeyStore can help with?

LikeLike

I’m not familiar with the Java KeyStore, but the best effort you can probably do is use a separate (key, nonce) per photo/video and write your own code that counts the bytes processed under that (key, nonce).

Or you could just use the AWS Encryption SDK and make it their problem and not yours: https://docs.aws.amazon.com/encryption-sdk/latest/developer-guide/java.html

LikeLike

[…] cryptography), and can be safely implemented at relatively large sizes (e.g. for encrypting practically unlimited data under a single key), modern symmetric cryptography research is likely to focus on […]

LikeLike

[…] For example, ChaCha20-Poly1305 accepts a 256-bit key and 96-bit nonce (with an internal 32-bit block counter). If you generate the nonce randomly, you should expect a collision with probability 2^-32 after 2^32 messages. […]

LikeLike

I believe that Libsodium’s XChaCha20-Poly1305 actually uses a 64 bit counter, so messages can be quite large.

LikeLike

Are you sure about that? https://github.com/jedisct1/libsodium/blob/85e48722a8f0b263943cc44d579f372a1e4cbc0d/src/libsodium/crypto_aead/xchacha20poly1305/sodium/aead_xchacha20poly1305.c#L133-L140

LikeLike

Oh no! I got confused by endianness! https://github.com/jedisct1/libsodium/blob/85e48722a8f0b263943cc44d579f372a1e4cbc0d/src/libsodium/crypto_stream/chacha20/ref/chacha20_ref.c#L74-L77

This actually allows for 2^64-1 blocks, or about 2^70 byte messages.

LikeLike

I was going off of the libsodium docs, which cite the 2**64-1 number. I believe there has been some confusion over it this point, actually, something about the RFC and libsodium disagreeing on this point? I don’t know because I haven’t read it, but I’ve seen this 32 vs 64 bit counter thing come up several times.

LikeLike

Yep, it turns out you were right, and I’ve corrected the blog post.

LikeLike

Thank you! Really enjoying your blog, btw. Got into crypto stuff recently and found your postings yesterday. Illuminating and fun to read.

LikeLiked by 1 person