Cryptographic agility is a vaguely defined property, but is commonly understood to mean, “Able to quickly swap between cryptographic primitives in response to new attacks.”
Wikipedia defines cryptographic agility as:
Cryptographic agility is a practice paradigm in designing information security protocols and standards in a way so that they can support multiple cryptographic primitives and algorithms at the same time. Then the systems implementing a particular standard can choose which combination of primitives they want to use. The primary goal of cryptographic agility was to enable rapid adaptations of new cryptographic primitives and algorithms without making disruptive changes to the systems’ infrastructure.
This is still a vague statement, as it covers a broad spectrum of implementation decisions. However, regardless of the implementation details, cryptographic agility inevitably leads to some degree of in-band negotiation.
In-band signaling is a major foot-gun in cryptography engineering. Consequently, cryptographic agility is at odds with securing a cryptographic protocol.
When Cryptographic Agility Leads to Insecurity
Case Study: JSON Web Tokens
Take a JWT signed by a first party or a third party.
Change the value of the "alg" header to "none". Strip off the signature. Change some of the claims to whatever you want.
Does your target system accept this token as valid?
Too often, the answer is “Yes”. This is a trivial way to bypass the cryptographic protections of JSON Web Tokens, which enables existential forgery.
Depending on what the JWT is protecting, this can lead to all sorts of bad outcomes (privilege escalation is common, but sometimes you can even get Remote Code Execution from an alg=none bug).
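To make the attack concrete, here is a minimal sketch using only the Python standard library. The `naive_verify` function is a hypothetical vulnerable verifier, not any real library's API; it stands in for any implementation that lets the token's own header decide how (or whether) it gets checked.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Unpadded URL-safe base64, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims: dict, key: bytes) -> str:
    """Issue a legitimate HS256 token."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    tag = hmac.new(key, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url(tag)}"


def naive_verify(token: str, key: bytes) -> dict:
    """Hypothetical vulnerable verifier: the token picks the algorithm."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    if header["alg"] == "none":
        # The attacker asked for no verification, and gets it.
        return json.loads(b64url_decode(payload_b64))
    if header["alg"] == "HS256":
        signing_input = f"{header_b64}.{payload_b64}".encode()
        expected = hmac.new(key, signing_input, hashlib.sha256).digest()
        if hmac.compare_digest(expected, b64url_decode(sig_b64)):
            return json.loads(b64url_decode(payload_b64))
    raise ValueError("invalid token")


def forge_alg_none(token: str, new_claims: dict) -> str:
    """Follow the steps above: set alg to none, drop the signature, edit claims."""
    header = json.loads(b64url_decode(token.split(".")[0]))
    header["alg"] = "none"
    new_header = b64url(json.dumps(header).encode())
    new_payload = b64url(json.dumps(new_claims).encode())
    return f"{new_header}.{new_payload}."
```

Note that the forger never needs the server's key: `forge_alg_none(token, {"sub": "alice", "admin": True})` produces a token that `naive_verify` accepts.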
Of course, sometimes JWT advocates cry foul when you cite alg=none as a problem with the JWT standard.
That Doesn’t Count. It’s Low-Hanging Fruit. Developers Should Know to Reject Alg=None Tokens!
Okay, so let’s do a slightly more advanced attack: Take a JWT signed by a third party.
Change the "alg" header from an asymmetric algorithm ("ES256", etc.) to a symmetric MAC algorithm (HMAC).
Change some of the claims to whatever you want.
Now use the asymmetric public key (RSA public key, ECDSA public key, etc.) as your symmetric key and calculate the HMAC tag of your altered token.
Does your target system accept this token as valid?
Too often, the answer is “Yes”. Many JWT implementations don’t provide any mechanism for enforcing that the algorithm matches what the developer expected.
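Here is a rough sketch of the confusion attack, plus the obvious countermeasure of pinning the expected algorithm. Everything here is illustrative: `PUBLIC_KEY_PEM` is placeholder bytes rather than a real key, "HS256" is assumed as the HMAC algorithm, `verify_asymmetric` is a stand-in that is never reached in this demo, and neither verifier is a real library's API.

```python
import base64
import hashlib
import hmac
import json

# Placeholder bytes standing in for a published RSA/ECDSA public key.
PUBLIC_KEY_PEM = b"-----BEGIN PUBLIC KEY-----\n(placeholder)\n-----END PUBLIC KEY-----\n"


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def hs256_tag(key: bytes, signing_input: bytes) -> bytes:
    return hmac.new(key, signing_input, hashlib.sha256).digest()


def verify_asymmetric(public_key: bytes, signing_input: bytes, signature: bytes) -> bool:
    """Stand-in for a real signature check; out of scope for this sketch."""
    raise NotImplementedError


def naive_verify(token: str, key: bytes) -> dict:
    """Hypothetical vulnerable verifier: one key argument, algorithm from the header."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    alg = json.loads(b64url_decode(header_b64))["alg"]
    signing_input = f"{header_b64}.{payload_b64}".encode()
    signature = b64url_decode(sig_b64)
    if alg == "HS256":
        # The "key" here may be public key bytes; nothing stops that.
        ok = hmac.compare_digest(hs256_tag(key, signing_input), signature)
    else:
        ok = verify_asymmetric(key, signing_input, signature)
    if not ok:
        raise ValueError("invalid token")
    return json.loads(b64url_decode(payload_b64))


def pinned_verify(token: str, key: bytes, expected_alg: str) -> dict:
    """Reject any token whose header disagrees with what the developer expects."""
    alg = json.loads(b64url_decode(token.split(".")[0]))["alg"]
    if alg != expected_alg:
        raise ValueError("unexpected algorithm")
    return naive_verify(token, key)


def forge_with_public_key(claims: dict) -> str:
    """Attacker side: HMAC the altered token using the public key as the secret."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    tag = hs256_tag(PUBLIC_KEY_PEM, f"{header}.{payload}".encode())
    return f"{header}.{payload}.{b64url(tag)}"
```

The forged token passes `naive_verify` because the server happily feeds its public key into HMAC. `pinned_verify(..., expected_alg="RS256")` refuses it before any cryptography runs.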
That’s Still Not JWT’S Fault! Blame the Implementation. Follow RFC 8725!
Ah yes, a best practices RFC. Surely they covered all of the corner cases?
Consider a web application framework in a scripting language (PHP, Python, Ruby, etc.) that encourages users to put all of their configuration into a single file or directory and injects a dependency object at runtime.
In this setup, assume that they’re using JWT for sessions (with a symmetric algorithm), while also accepting third-party JWTs for OpenID Connect (signed using RSA or ECDSA).
Let’s be generous and assume the JWT library is correctly asserting that the JWT "alg" header is what the developer expects it to be in both contexts, so neither of the two previous attacks works. Let’s also assume all of the RFC 8725 best practices are being followed strictly and consistently.
Everything should be fine, right?
Nope. You can often exploit the Key ID ("kid") header to bypass the "alg" check and reintroduce the RSA/HMAC type confusion attack.
Why Does Cryptographic Agility Fail in Practice?
Cryptographic Agility is tempting because it’s extremely convenient from a development and operations perspective: If you ever need to change your cryptosystems, you just change your configuration and you don’t have to think about migration strategies. Everything “just works”.
The problem is that it’s even more convenient for attackers: Agility allows attackers to exercise control over which validation rules the target system will follow.
Whether an attacker is disabling security entirely (alg = none), exploiting algorithm confusion (alg: RSA -> HMAC), or leveraging a lack of type safety between symmetric and asymmetric key material, the fundamental problem is that they have the freedom to change anything at all.
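The type-safety point is easiest to see with the Key ID scenario from earlier. The sketch below shows one plausible shape of that bug, assuming a single shared key registry indexed by "kid" that stores raw bytes. The registry contents, key IDs, and function names are all hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedKey:
    """Key material that remembers which algorithm it belongs to."""
    alg: str
    material: bytes


# Hypothetical shared registry: session secret and third-party public key side by side.
KEYRING = {
    "session-key": TypedKey("HS256", b"server-side-session-secret"),
    "idp-key": TypedKey("RS256", b"-----BEGIN PUBLIC KEY-----(placeholder)"),
}

# The same registry with the type information thrown away.
UNTYPED_KEYRING = {kid: key.material for kid, key in KEYRING.items()}


def untyped_session_check(kid: str, signing_input: bytes, tag: bytes) -> bool:
    """Vulnerable: the algorithm is pinned to HMAC, but kid can select any bytes."""
    key = UNTYPED_KEYRING[kid]
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


def typed_session_check(kid: str, signing_input: bytes, tag: bytes) -> bool:
    """Safer: refuse to use a key for an algorithm it was not created for."""
    key = KEYRING[kid]
    if key.alg != "HS256":
        raise ValueError("key type does not match the expected algorithm")
    expected = hmac.new(key.material, signing_input, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)
```

An attacker who knows the public key can compute an HMAC tag with it, set "kid" to "idp-key", and pass `untyped_session_check` even though the algorithm never changed. Binding each key to exactly one algorithm closes that door.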
Superior Alternatives to Agility
One True Ciphersuite To Rule Them All
This is the WireGuard approach. There is only one protocol WireGuard speaks: Noise with X25519, Cha-Poly, BLAKE2, SipHash, and HKDF.
You can’t get an AES version of WireGuard. You can’t get an X448 version of WireGuard.
If you use WireGuard, you use a very specific subset of cryptographic primitives, stitched together in a very specific way, with a formal proof of its correctness.
If a vulnerability is ever discovered with WireGuard (or a quantum computer enters the mix), the author will publish a new major version which is totally incompatible with the current version.
One True Ciphersuite: No agility.
Update: The creator of WireGuard points out that WireGuard is, in fact, versioned. However, there is only currently one version in-flight, and if a new version is specified, it’s overwhelmingly likely that the incumbent version will be deprecated. In practice, this means each runtime only has to implement one of the versions. Contrast this with the next subsection.
Versioned Protocols
This is the PASETO approach. There are two actively supported versions of PASETO (v3, v4).
PASETO v3 uses NIST cryptography (ECDSA P-384, AES-CTR + HMAC-SHA384 (encrypt then MAC), HKDF).
PASETO v4 uses modern cryptography (Ed25519, XChaCha20 + BLAKE2b (encrypt then MAC), HKDF).
Versioned protocols technically meet the vaguest sense of cryptographic agility, but each version is hard-coded for maximum security. An attacker cannot disable security (there is no “alg=none” equivalent) or choose wild ciphersuites (RC4 + SHA1, MAC-then-Encrypt), unless a protocol version is defined that permits dumb options.
If a vulnerability is discovered, the authors of the protocol will specify a new version of the protocol and deprecate the old version. This is why v1/v2 fell by the wayside.
If an attack is ever discovered against XChaCha20 or Ed25519, PASETO users can immediately move to v3 while waiting for the author to specify a new protocol version. This is still “agile”, but in a very minimal way.
Versioned protocols: Minimal agility.
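As a rough illustration of what "hard-coded per version" means for the code that receives a token, here is a small dispatch sketch. It does no cryptography and is not the PASETO reference implementation; `select_suite` is a hypothetical helper. It only relies on PASETO tokens starting with a `version.purpose.` prefix, and the suite descriptions are copied from the list above.

```python
# Each (version, purpose) pair names exactly one suite. There is no
# "none" entry, and nothing in the token can add one.
SUITES = {
    ("v3", "local"): "AES-CTR + HMAC-SHA384 (encrypt then MAC), HKDF",
    ("v3", "public"): "ECDSA P-384",
    ("v4", "local"): "XChaCha20 + BLAKE2b (encrypt then MAC), HKDF",
    ("v4", "public"): "Ed25519",
}


def select_suite(token: str, expected_version: str, expected_purpose: str) -> str:
    """Return the one suite the caller expects, or refuse the token outright."""
    expected = (expected_version, expected_purpose)
    if expected not in SUITES:
        raise ValueError("unsupported version or purpose")
    parts = token.split(".")
    if len(parts) not in (3, 4):  # version.purpose.payload[.footer]
        raise ValueError("malformed token")
    if (parts[0], parts[1]) != expected:
        # The token's header is compared to the caller's choice,
        # never used to make the choice.
        raise ValueError("token does not match the expected version and purpose")
    return SUITES[expected]
```

The design choice worth noticing: the application states which version it speaks, and the token either matches or is rejected. The attacker gets no vote.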
Cryptography Migration Strategies
Imagine you have 10+ years worth of files stored in the cloud that were encrypted client-side using AES in CBC mode. One day, your penetration test vendor suggests migrating to AES in GCM mode.
When you use cryptographic agility, you handwave the need for a migration strategy at the expense of security against active attackers. Your old data is vulnerable to a padding oracle attack, unless you re-encrypt. But the new files are protected against active attackers because of AES-GCM, right?
Nope. You can decrypt an AES-GCM ciphertext by telling the system to decrypt it as if it were AES-CBC.
By virtue of allowing more than one algorithm at the same time, you will be vulnerable to downgrade attacks forever. This is also true if you never supported CBC Mode before, but your application permits its usage if an attacker selects it.
Cryptographic agility, in truth, isn’t convenient. It’s just lazy.
You cannot simultaneously be secure and avoid the pain of a cryptosystem migration.
Migrating Encryption At Rest
Using the CBC-GCM example above, this is how you have to migrate your encryption. In this order:
1. Update every application in scope to support reading the new cryptography protocol and/or message format.
2. Only after step 1 is complete, update every application to start writing the new cryptography protocol and/or message format.
3. Only after step 2 is complete, re-encrypt all the old records to use the new cryptography protocol and/or message format.
4. Only after step 3 is complete, turn off support for the old mode.
It’s important that no step begins until the preceding step has concluded.
If you attempt to move on to step 2 before step 1 is finished, you will find your network in a split-brain scenario where freshly-encrypted data cannot be read by some machines. This can lead to outages or data corruption.
If you attempt to move on to step 3 before step 2 is finished, you will find some machines are still writing the legacy format, which means more insecurely-encrypted data at risk.
If you attempt to move on to step 4 before step 3 is finished, you will lose access to some subset of old encrypted data, since it will not be migrated yet.
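The ordering constraints above can be written down as a small gate that your deployment tooling checks before advancing. This is a sketch under assumptions: the phase names, the "cbc"/"gcm" format labels, and the idea of a per-host phase map are all hypothetical, and no encryption happens here.

```python
from enum import IntEnum


class Phase(IntEnum):
    LEGACY_ONLY = 0    # before the migration starts
    READ_NEW = 1       # step 1: can read both formats, still writes legacy
    WRITE_NEW = 2      # step 2: writes the new format
    REENCRYPT_OLD = 3  # step 3: old records are being re-encrypted
    LEGACY_OFF = 4     # step 4: legacy support removed


def readable_formats(phase: Phase) -> set:
    if phase == Phase.LEGACY_ONLY:
        return {"cbc"}
    if phase == Phase.LEGACY_OFF:
        return {"gcm"}
    return {"cbc", "gcm"}


def write_format(phase: Phase) -> str:
    return "gcm" if phase >= Phase.WRITE_NEW else "cbc"


def may_advance(fleet: dict, legacy_records: int) -> bool:
    """True only if every host finished the current step and no data would be lost."""
    phases = set(fleet.values())
    if len(phases) != 1:
        return False  # some host is still on an earlier step
    current = phases.pop()
    if current == Phase.LEGACY_OFF:
        return False  # nothing left to do
    if current == Phase.REENCRYPT_OLD and legacy_records > 0:
        return False  # turning off legacy now would strand old data
    return True
```

For example, `may_advance({"web-1": Phase.READ_NEW, "web-2": Phase.LEGACY_ONLY}, 100)` is false: starting to write the new format now would produce data that web-2 cannot read.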
Migrating Encryption in Transit
Since encryption-in-transit is more ephemeral, migration is a bit simpler (assuming custom protocols used within your network).
1. Enable support for the new encryption protocol version (preferably on a different TCP/UDP port) across your fleet.
2. Once everyone is using the new version, disable the old protocol version (and consider blocking the old port, if applicable).
The catch with this is, if we’re talking about TLS or some other Internet-scale protocol, the time between step 1 and step 2 will be measured in years.
Bonus Round: Migrating Password Hashes
If you were previously storing passwords in plaintext, apply a password hashing function over all users’ passwords and force a password reset next time they log in. Storing them in plaintext is a terrible idea!
If you were previously using a fast cryptographic hash function (MD5, SHA1, SHA256) instead of a password hashing function (bcrypt, scrypt, Argon2):
- Use a password hashing function over the existing weak hash, immediately.
- Store a “legacy” flag so your system knows to first rehash the supplied password with the old hash function.
- Opportunistically upgrade users’ hashes to not use the legacy flag when they successfully authenticate. Clear the flag.
You may also want to force a password reset when they reauthenticate.
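A minimal sketch of the legacy-flag approach, assuming the old scheme was unsalted MD5 stored as hex and using scrypt from the Python standard library as the password hashing function. The record layout, field names, and parameters are illustrative assumptions, not a recommendation for your environment.

```python
import hashlib
import hmac
import os

# Illustrative scrypt parameters; tune them for your own hardware.
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1}


def legacy_md5(password: str) -> bytes:
    """The old, weak scheme: hex-encoded MD5 of the password."""
    return hashlib.md5(password.encode()).hexdigest().encode()


def scrypt_hash(data: bytes, salt: bytes) -> bytes:
    return hashlib.scrypt(data, salt=salt, dklen=32, **SCRYPT_PARAMS)


def wrap_legacy_record(md5_hex: str) -> dict:
    """Immediately wrap an existing weak hash; no password needed."""
    salt = os.urandom(16)
    return {"salt": salt, "hash": scrypt_hash(md5_hex.encode(), salt), "legacy": True}


def verify_and_upgrade(record: dict, password: str) -> bool:
    """Check a login attempt and clear the legacy flag on success."""
    candidate = legacy_md5(password) if record["legacy"] else password.encode()
    if not hmac.compare_digest(scrypt_hash(candidate, record["salt"]), record["hash"]):
        return False
    if record["legacy"]:
        # We have the real password now, so hash it directly and drop the flag.
        salt = os.urandom(16)
        record.update(salt=salt, hash=scrypt_hash(password.encode(), salt), legacy=False)
    return True
```

`wrap_legacy_record` can be run over the whole user table in one batch job, which is what makes the "immediately" in the first step possible.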
If you were previously using a password hashing function and are migrating to a newer password hashing function (e.g. bcrypt -> argon2id), or are increasing the parameters (bcrypt cost 10 -> 12, or higher argon2 memory requirements), you can take a lazier approach:
- When a user successfully authenticates, check if the algorithm or parameters changed.
- If so, calculate a new password hash with the new algorithm/parameters.
In this scenario, you probably don’t need to force a password reset (unless motivated by a data breach).
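The lazier approach might look something like this. Again this uses scrypt from the standard library as a stand-in for whichever password hashing function you actually use; the record layout and both parameter sets are hypothetical.

```python
import hashlib
import hmac
import os

# The parameters the application wants today (illustrative values).
CURRENT_PARAMS = {"n": 2**14, "r": 8, "p": 1}


def hash_password(password: str, params: dict = None) -> dict:
    """Hash with the given parameters, defaulting to the current ones."""
    params = dict(params or CURRENT_PARAMS)
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, dklen=32, **params)
    return {"params": params, "salt": salt, "hash": digest}


def login(record: dict, password: str) -> bool:
    """Verify against the stored parameters, then rehash if they are outdated."""
    digest = hashlib.scrypt(
        password.encode(), salt=record["salt"], dklen=32, **record["params"]
    )
    if not hmac.compare_digest(digest, record["hash"]):
        return False
    if record["params"] != CURRENT_PARAMS:
        # Successful login with old parameters: upgrade in place.
        record.update(hash_password(password))
    return True
```

The parameters stored alongside the hash come from your own database, not from the person logging in, so this is not the attacker-controlled negotiation criticized earlier.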
Don’t listen to the siren song of cryptographic agility.
Instead, invest in serious security engineering efforts to protect your data while remaining organizationally agile enough to respond to an evolving threat landscape.
This may sound like the boring “eat your vegetables” of cryptography advice, but every time someone ignored this, they paid for it down the road.
A lot of these points were covered in my previous blog post, but I wanted to discuss this topic independent of my critical remarks therein.
One reply on “Cryptographic Agility and Superior Alternatives”
Hey Soatok, my name is Manu Sporny and I’m a standards editor for a variety of security specifications at W3C and IETF. I’ve read a number of your posts over the years and have found many of them on point and reflective of concerns that a number of us have at both W3C and IETF. More recently, we’ve started work at W3C in the Verifiable Credentials Working Group on a work item called Data Integrity (which has been incubating since 2012) that attempts to fix a number of the issues you highlight in this post as well as other good points you’ve made in the past. Namely, you might find this draft specification text of interest:
Cryptographic Agility and Cryptographic Hardening
Cryptographic Suite Versioning
CFRG discussion on how much cryptographic optionality, and at what layer, is useful:
As you might imagine, we’ve had significant pushback from a vocal minority in the JOSE/JWT community over the years, and large technology vendors have actively objected to creating work items that might rein in some of the worst aspects of JOSE and JWTs… actions which have greatly slowed the work down. Application developers that use JWTs often react negatively to the notion that there are fundamental problems with JWT: everything from “If that were true, why is JWT so successful and widely deployed!?”… to “oh, the problem is with the implementers, not the standards”… to “oh, the problem is that we don’t have good test suites”. The responses to those defending the design of JOSE/JWTs tend to be “Just because something is widely deployed doesn’t mean it’s easy to use in a safe manner.”… “The standards create optionality but don’t do much to protect application developers from picking the wrong options.”… “Yes, test suites might help; it’s been almost a decade, where are the JOSE test suites… where is the language and test suite that deprecates alg=none and fails implementations for not deprecating it?”. As you can guess, those conversations have been going nowhere for the better part of a decade.
So, we’re actively doing something about it by demonstrating that it is not only possible to ratchet the attack surface around cryptographic library implementations down and protect application developers from themselves, but that it’s necessary to standardize tightly controlled combinations of cryptography (cryptographic suites), and actively test implementations against a conformance test suite that is maintained by the Working Group. This is active work that’s been going on for a while now, and we’re in the last two years before we standardize such a mechanism. I invite you to contribute to the discussion, if you have the time, via the IETF CFRG, W3C VCWG, or GitHub issue trackers for Data Integrity. We’d love to have your input as we create safer cryptographic standards led by the principles you have written about in this blog.