Categories
Cryptography

Cryptographic Wear-Out for Symmetric Encryption

As we look upon the sunset of a remarkably tiresome year, I thought it would be appropriate to talk about cryptographic wear-out.

What is cryptographic wear-out?

It’s the threshold when you’ve used the same key to encrypt so much data that you should probably switch to a new key before you encrypt any more. Otherwise, you might let someone capable of observing all your encrypted data perform interesting attacks that compromise the security of the data you’ve encrypted.

Soatok Explains it All
My definitions always aim to be more understandable than pedantically correct.
(Art by Swizz)

The exact value of the threshold varies depending on how exactly you’re encrypting data (n.b. AEAD modes, block ciphers + cipher modes, etc. each have different wear-out thresholds due to their composition).

Let’s take a look at the wear-out limits of the more popular symmetric encryption methods, and calculate those limits ourselves.

Specific Ciphers and Modes

I did the math
(Art by Khia. Poorly edited by the author.)

Cryptographic Limits for AES-GCM

I’ve written about AES-GCM before (and why I think it sucks).

AES-GCM is a construction that combines AES-CTR with an authenticator called GMAC, whose consumption of nonces looks something like this:

  • Calculating H (used in GHASH for all messages encrypted under the same key, regardless of nonce):
    • Encrypt(00000000 00000000 00000000 00000000)
  • Calculating J0 (the pre-counter block):
    • If the nonce is 96 bits long:
      • NNNNNNNN NNNNNNNN NNNNNNNN 00000001 where the N spaces represent the nonce hexits.
    • Otherwise:
      • s = 128 * ceil(len(nonce)/nonce) - len(nonce)
      • J0 = GHASH(H, nonce || zero(s+64) || int2bytes(len(nonce))
  • Each block of data encrypted uses J0 + block counter (starting at 1) as a CTR nonce.
  • J0 is additionally used as the nonce to calculate the final GMAC tag.

AES-GCM is one of the algorithms where it’s easy to separately calculate the safety limits per message (i.e. for a given nonce and key), as well as for all messages under a key.

AES-GCM Single Message Length Limits

In the simplest case (nonce is 96 bits), you end up with the following nonces consumed:

  • For each key: 00000000 00000000 00000000 00000000
  • For each (nonce, key) pair:
    • NNNNNNNN NNNNNNNN NNNNNNNN 000000001 for J0
    • NNNNNNNN NNNNNNNN NNNNNNNN 000000002 for encrypting the first 16 bytes of plaintext
    • NNNNNNNN NNNNNNNN NNNNNNNN 000000003 for the next 16 bytes of plaintext…
    • NNNNNNNN NNNNNNNN NNNNNNNN FFFFFFFFF for the final 16 bytes of plaintext.

From here, it’s pretty easy to see that you can encrypt the blocks from 00000002 to FFFFFFFF without overflowing and creating a nonce reuse. This means that each (key, nonce) can be used to encrypt a single message up to 2^{32} - 2 blocks of the underlying ciphertext.

Since the block size of AES is 16 bytes, this means the maximum length of a single AES-GCM (key, nonce) pair is 2^{36} - 256 bytes (or 68,719,476,480 bytes). This is approximately 68 GB or 64 GiB.

Things get a bit tricker to analyze when the nonce is not 96 bits, since it’s hashed.

The disadvantage of this hashing behavior is that it’s possible for two different nonces to produce overlapping ranges of AES-CTR output, which makes the security analysis very difficult.

However, this hashed output is also hidden from network observers since they do not know the value of H. Without some method of reliably detecting when you have an overlapping range of hidden block counters, you can’t exploit this.

(If you want to live dangerously and motivate cryptanalysis research, mix 96-bit and non-96-bit nonces with the same key in a system that does something valuable.)

Multi-Message AES-GCM Key Wear-Out

Now that we’ve established the maximum length for a single message, how many messages you can safely encrypt under a given AES-GCM key depends entirely on how your nonce is selected.

If you have a reliable counter, which is guaranteed to never repeat, and start it at 0 you can theoretically encrypt 2^{96} messages safely. Hooray!

Soatok is HYPED!!!
Hooray!
(Art by Swizz)

You probably don’t have a reliable counter, especially in real-world settings (distributed systems, multi-threaded applications, virtual machines that might be snapshotted and restored, etc.).

Soatok angrily grasping computer monitor
Confound you, technical limitations!
(Art by Swizz)

Additionally (thanks to 2adic for the expedient correction), you cannot safely encrypt more than 2^{64} blocks with AES because the keystream blocks–as the output of a block cipher–cannot repeat.

Most systems that cannot guarantee unique incrementing nonces simply generate nonces with a cryptographically secure random number generator. This is a good idea, but no matter how high quality your random number generator is, random functions will produce collisions with a discrete probability.

If you have 2^n possible values, you should expect a single collision(with 50% probability) after \sqrt{2^{n}} (or 2^{n/2})samples. This is called the birthday bound.

However, 50% of a nonce reuse isn’t exactly a comfortable safety threshold for most systems (especially since nonce reuse will cause AES-GCM to become vulnerable to active attackers). 1 in 4 billion is a much more comfortable safety margin against nonce reuse via collisions than 1 in 2. Fortunately, you can calculate the discrete probability of a birthday collision pretty easily.

If you want to rekey after your collision probability exceeds 2^{-32} (for a random nonce between 0 and 2^{96}), you simply need to re-key after 2^{32} messages.

AES-GCM Safety Limits

  • Maximum message length: 2^{36} - 256 bytes
  • Maximum number of messages (random nonce): 2^{32}
  • Maximum number of messages (sequential nonce): 2^{64} (but you probably don’t have this luxury in the real world)
  • Maximum data safely encrypted under a single key with a random nonce: about 2^{68} bytes
Not bad, but we can do better.
(Art by Khia.)

Cryptographic Limits for ChaCha20-Poly1305

The IETF version of ChaCha20-Poly1305 uses 96-bit nonces and 32-bit internal counters. A similar analysis follows from AES-GCM’s, with a few notable exceptions.

For starters, the one-time Poly1305 key is derived from the first 32 bytes of the ChaCha20 keystream output (block 0) for a given (nonce, key) pair. There is no equivalent to AES-GCM’s H parameter which is static for each key. (The ChaCha20 encryption begins using block 1.)

Additionally, each block for ChaCha20 is 512 bits, unlike AES’s 128 bits. So the message limit here is a little more forgiving.

Since the block size is 512 bits (or 64 bytes), and only one block is consumed for Poly1305 key derivation, we can calculate a message length limit of 2^{38} - 64, or 274,877,906,880 bytes–nearly 256 GiB for each (nonce, key) pair.

The same rules for handling 96-bit nonces applies as with AES-GCM, so we can carry that value forward.

ChaCha20-Poly1305 Safety Limits

  • Maximum message length: 2^{38} - 64 bytes
  • Maximum number of messages (random nonce): 2^{32}
  • Maximum number of messages (sequential nonce): 2^{96} (but you probably don’t have this luxury in the real world)
  • Maximum data safely encrypted under a single key with a random nonce: about 2^{70} bytes
Soatok
A significant improvement, but still practically limited.
(Art by Khia.)

Cryptographic Limits for XChaCha20-Poly1305

XChaCha20-Poly1305 is a variant of XSalsa20-Poly1305 (as used in libsodium) and the IETF’s ChaCha20-Poly1305 construction. It features 192-bit nonces and 32-bit internal counters.

XChaCha20-Poly1305 is instantiated by using HChaCha20 of the key over the first 128 bits of the nonce to produce a subkey, which is used with the remaining nonce bits using the aforementioned ChaCha20-Poly1305.

This doesn’t change the maximum message length, but it does change the number of messages you can safely encrypt (since you’re actually using up to 2^{128} distinct keys).

Thus, even if you manage to repeat the final ChaCha20-Poly1305 nonce, as long as the total nonce differs, each encryptions will be performed with a distinct key (thanks to the HChaCha20 key derivation; see the XSalsa20 paper and IETF RFC draft for details).

UPDATE (2021-04-15): It turns out, my read of the libsodium implementation was erroneous due to endian-ness. The maximum message length for XChaCha20-Poly1305 is 2^{64} blocks, and for AEAD_XChaCha20_Poly1305 is 2^{64} - 1 blocks. Each block is 64 bytes, so that changes the maximum message length to about 2^{70}. This doesn’t change the extended-nonce details, just the underlying ChaCha usage.

XChaCha20-Poly1305 Safety Limits

  • Maximum message length: 2^{70} bytes (earlier version of this document said 2^{38} - 64)
  • Maximum number of messages (random nonce): 2^{80}
  • Maximum number of messages (sequential nonce): 2^{192} (but you probably don’t have this luxury in the real world)
  • Maximum data safely encrypted under a single key with a random nonce: about 2^{118} bytes
Galaxy Brain meme, but a furry
I can see encrypt forever, man.
(Art by Khia.)

Cryptographic Limits for AES-CBC

It’s tempting to compare non-AEAD constructions and block cipher modes such as CBC (Cipher Block Chaining), but they’re totally different monsters.

  • AEAD ciphers have a clean delineation between message length limit and the message quantity limit
  • CBC and other cipher modes do not have this separation

Every time you encrypt a block with AES-CBC, you are depleting from a universal bucket that affects the birthday bound security of encrypting more messages under that key. (And unlike AES-GCM with long nonces, AES-CBC’s IV is public.)

This is in addition to the operational requirements of AES-CBC (plaintext padding, initialization vectors that never repeat and must be unpredictable, separate message authentication since CBC doesn’t provide integrity and is vulnerable to chosen-ciphertext atacks, etc.).

This isn't IND-CCA2 Secure!
My canned response to most queries about AES-CBC.
(Art by Khia.)

For this reason, most cryptographers don’t even bother calculating the safety limit for AES-CBC in the same breath as discussing AES-GCM. And they’re right to do so!

If you find yourself using AES-CBC (or AES-CTR, for that matter), you’d best be performing a separate HMAC-SHA256 over the ciphertext (and verifying this HMAC with a secure comparison function before decrypting). Additionally, you should consider using an extended nonce construction to split one-time encryption and authentication keys.

(Art by Riley.)

However, for the sake of completeness, let’s figure out what our practical limits are.

CBC operates on entire blocks of plaintext, whether you need the entire block or not.

On encryption, the output of the previous block is mixed (using XOR) with the current block, then encrypted with the block cipher. For the first block, the IV is used in the place of a “previous” block. (Hence, its requirements to be non-repeating and unpredictable.)

This means you can informally model (IV xor PlaintextBlock) and (PBn xor PBn+1) as a pseudo-random function, before it’s encrypted with the block cipher.

If those words don’t mean anything to you, here’s the kicker: You can use the above discussion about birthday bounds to calculate the upper safety bounds for the total number of blocks encrypted under a single AES key (assuming IVs are generated from a secure random source).

If you’re okay with a 50% probability of a collision, you should re-key after 2^{64} blocks have been encrypted.

This video handwaves the “block size” requirement in their attack, but it works (if you have the resources to encrypt that many blocks.)

If your safety margin is closer to the 1 in 4 billion (as with AES-GCM), you want to rekey after 2^{48} blocks.

However, blocks encrypted doesn’t map neatly to bytes encrypted.

If your plaintext is always an even multiple of 128 bits (or 16 bytes), this allows for up to 2^{52} bytes of plaintext. If you’re using PKCS#7 padding, keep in mind that this will include an entire padding block per message, so your safety margin will deplete a bit faster (depending on how many individual messages you encrypt, and therefore how many padding blocks you need).

On the other extreme (1-byte plaintexts), you’ll only be able to eek 2^{48} encrypted bytes before you should re-key.

Therefore, to stay within the safety margin of AES-CBC, you SHOULD re-key after 2^{48} blocks (including padding) have been encrypted.

Keep in mind: 2^{48} single-byte blocks is still approximately 281 TiB of data (including padding). On the upper end, 2^{48} 15-byte blocks (with 1-byte padding to stay within a block) clocks in at about 2^{51.9} or about 4.22 PiB of data.

That’s Blocks. What About Bytes?

The actual plaintext byte limit sans padding is a bit fuzzy and context-dependent.

The local extrema occurs if your plaintext is always 16 bytes (and thus requires an extra 16 bytes of padding). Any less, and the padding fits within one block. Any more, and the data:padding ratio starts to dominate.

Therefore, the worst case scenario with padding is that you take the above safety limit for block counts, and cut it in half. Cutting a number in half means reducing the exponent by 1.

But this still doesn’t eliminate the variance. 2^{47} blocks could be anywhere from 2^{47} to 2^{50.9} bytes of real plaintext. When in situations like this, we have to assume the worst (n.b. take the most conservative value).

Therefore…

AES-CBC Safety Limits

  • Maximum data safely encrypted under a single key with a random nonce: 2^{47} bytes (approximately 141 TiB)
Yet another reason to dislike non-AEAD ciphers.
(Art by Khia.)

Take-Away

Compared to AES-CBC, AES-GCM gives you approximately a million times as much usage out of the same key, for the same threat profile.

ChaCha20-Poly1305 and XChaCha20-Poly1305 provides even greater allowances of encrypting data under the same key. The latter is even safe to use to encrypt arbitrarily large volumes of data under a single key without having to worry about ever practically hitting the birthday bound.

I’m aware that this blog post could have simply been a comparison table and a few footnotes (or even an IETF RFC draft), but I thought it would be more fun to explain how these values are derived from the cipher constructions.

Soatok heart sticker
(Art by Khia.)

By Soatok

Security engineer with a fursona. Ask me about dholes or Diffie-Hellman!

16 replies on “Cryptographic Wear-Out for Symmetric Encryption”

So when you are encrypting your videos or photos, how do you actually keep track of how much data you have encrypted with the same key or nonce? Is this something that a key management solution like Java KeyStore can help with?

Like

I was going off of the libsodium docs, which cite the 2**64-1 number. I believe there has been some confusion over it this point, actually, something about the RFC and libsodium disagreeing on this point? I don’t know because I haven’t read it, but I’ve seen this 32 vs 64 bit counter thing come up several times.

Like

Thank you! Really enjoying your blog, btw. Got into crypto stuff recently and found your postings yesterday. Illuminating and fun to read.

Liked by 1 person

As XTS is the mode used for storage, (RAM and disk), and can work on parallel blocks (worn fast), hence it is interesting for the industry. Question: in XTS the attacker does not see the plaintext and ciphertext of the internal AES. Can we say that because of that it may worn out for a given tweak/address, but if the collection of information is spread all over (if attacker cannot control memory address), the collision of different tweaks do not build up to a single BDay attack but rather parallel attacks that do not wear the key at the same rate?

Like

The underlying mechanics of XTS mode don’t really matter here. You’re working with 128-bit blocks and reasoning about collision probabilities thereof.

XTS mode behaves like CBC mode, except you XOR the plaintext *and* ciphertext with the same IV (and there’s an additional encryption step from a “tweak key”), and use Galois Field multiplication instead of XOR to calculate the next IV. This effectively hides the IV from the ciphertext, but that addresses security properties irrelevant to our discussion.

When talking about symmetric wear-out, you’re only concerned with the inputs and outputs of the encryption function. XTS mode reduces to AES(IV xor plaintext, static key) xor IV. You’ll get a single IV collision after encrypting 2^64 blocks (with 50% probability).

Whether or not you can pivot from this first collision to a practical attack is an exercise left to the reader. Although with block-level encryption and how many filesystems encode data at a low level, you’ll likely have enough redundancy to collide (IV, plaintext) pairs at some point in excess of this threshold.

Thus, the discussions of XTS mode don’t really warrant a separate section.

Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s