Suppose you need to encrypt data between two peer-to-peer devices over an untrusted medium (i.e. the Internet), and you have an authenticated low-bandwidth channel that can be used to send and authenticate a few bytes (less than 100), but that channel isn’t itself encrypted (otherwise it’d be a chicken-and-egg problem).
Aside: If it helps your mental image, you can pretend that the authenticated channel is a blockchain, if you want. You can squeeze a few bytes in there, and the data structure is verifiable, but everything you transmit is public. But the actual channel implementation isn’t important here; this is a thought experiment.
This was an unsolved problem in mathematics and cryptography until the 1970’s, when Whitfeld Diffie, Martin Hellman, and Ralph Merkle published the original Diffie-Hellman protocol and invented asymmetric (a.k.a. public-key) cryptography.
Modern cryptography uses a variant of the initial Diffie-Hellman protocol, defined over a mathematical structure called an Elliptic Curve. Thus, it is called Elliptic Curve Diffie Hellman (ECDH).
Note: I’m intentionally breaking from the mathematical orthodoxy in this blog post in the hopes that it’s more accessible to everyone outside the ivory tower. As a result, my definitions may lack precision, but I’m saving you from jargon.
Elliptic Curve Diffie-Hellman
Without going into the nuts and bolts of ECDH (a subject of a future blog post!), this is basically what’s happening in layman’s (n.b. non-mathematician’s) terms:
- You’ve got an elliptic curve, which satisfies some equation. There are many different curve shapes. One example is , where A and B are integer constants. (This example is called a Montgomery curve.)
- You’ve got a field parameter, . This is usually a prime number. (If it isn’t, and you aren’t a cryptographer, you probably want to run away from it.)
- You’ve got a generator, (which is some specific point on the curve), which transforms the whole shebang into a thing called a “group” which has a thing called an “order” (which in turn is an integer of value ). The fact that this thing is a “group” at all is what makes Diffie-Hellman possible.
You don’t need to worry about what these things even are, because I’m not delving into the underlying mathematics (nuts and bolts) at all. But if I omit them, the pedants complain, so for the sake of completeness: Here you go.
Take a mathematics course that covers Galois theory if you want the inside scoop on what the hell those things are. But for now, if you’re lost, just know that you have some special parameters/properties that have specific names and don’t worry about what they mean.
These parameters are meant to be public, and preferably hard-coded. When they’re not, you get CVE-2020-0601.
Each participant (simplest case: there are only two) will need two keys to agree on a shared secret: Their secret key and its corresponding public key.
To calculate a secret key (), a random integer is selected between and (inclusive).
To calculate a public key (), the generator is added to itself times.
This sounds trivial and slow, but Point Addition in an Elliptic Curve is a peculiar operation, and the process is a form of multiplication. Strictly speaking, this is multiplication.
I will cover this in detail in a future blog post. For now, just know that you’re multiplying a number (“scalar”) by a point, and that the underlying mechanics of multiplication (which is just adding a point to itself some number of times) involves point addition, which is very weird if you’re not a cryptographer.
We can delve into the complexity later, just keep in mind you’re calculating to get a public key from a secret key.
It bears emphasizing: Each participant does this.
The orthodox characters for describing cryptographic protocols are Alice and Bob. Alice wants to talk to Bob, Bob wants to talk to Alice, and they want no one else to successfully interfere in their communication.
Alice has .
Bob has .
If they perform one calculation, they can derive a shared secret while only sharing their public keys over the authenticated channel: Multiply their secret key by the other party’s public key.
Alice performs .
Bob performs .
Since multiplication is commutative, they both arrive at the same shared value, despite only transmitting public keys. Someone snooping on the authenticated channel cannot, with current technology, recover the shared secret or either party’s secret key.
With some caveats, of course:
Which Public Key Do I AKE?
We handwaved this away earlier, but an authenticated channel is critically necessary to transmit public keys. Otherwise, an active attacker can just substitute public keys and sit in the middle, with neither party realizing they’re talking to a middleman.
This turns out to be a bit tricky, but not intractable.
TLS (the S in HTTPS) uses long-term digital signature algorithms and chains of digital signatures from a short list of trusted Root Certificate Authorities to authenticate the identity of the server you’re talking to, and then that server’s digital signature public key (signed by an intermediary Certificate Authority) is used to sign ECDH public keys.
I briefly mentioned this in the post about Authenticated Key Exchanges.
Your Curve is Invalid
This is less of a problem with newer elliptic curves, but for a very long time, public keys were transmitted as a pair of coordinates.
Earlier, we mentioned that both parties have to agree on some curve parameters. But in protocols where you send arbitrary points, an attacker could send a point on a different curve to leak bits about the target’s secret key.
The standard mitigation is to solve the elliptic curve equation and verify that the coordinates of the point satisfy the chosen elliptic curve parameters.
Modern protocols only send a single coordinate, and possibly a single bit to signify whether the other coordinate is positive or negative (either or ).
This is called point compression and used to be patented (until 2018). Since the patent has expired, protocols should investigate adopting point compression if they can’t just switch to secure curves.
Since multiplication is commutative, it’s possible that two unrelated pairs of participants could perform an ECDH step and agree on the same shared secret.
This is true for the same reason that , and because your secret key is a random integer.
For this reason, it’s generally unwise to use raw ECDH outputs as encryption keys. Instead, you want to use a cryptographic hash function over the ECDH output and each participants’ public keys to guarantee your symmetric session key is distinct from other pairs.
Whose Time Is It Anyway?
In addition to what I’ve described above, all of the algorithms involved need to perform in constant-time. Any failure to do so will result in your secret key leaking to attackers through network traffic timing (and other side-channels).
I’ll explore the engineering that goes into constant-time implementations in a future blog post.
My hope in writing this page is that anyone without a mathematics background can walk away with a basic intuition for how ECDH works and how dangerous cryptography is when left to the hands of amateurs.
If you find yourself faced with a scenario like we discussed here, you want to use a well-studied Authenticated Key Exchange instead of rolling your own.
If you have to roll your own, try to wrap or take inspiration from libsodium.