Hacker challenge part 2

Apr 3, 2009

Well, I guess I'm not quitting my day job to become a cryptographer any time soon.

As was instantly ascertained on reddit, the key algorithm in my "Hacker Challenge" is RSA. I was hoping that it would take at least an hour to crack the private key, but alas, I had severely underestimated the time that modern elliptic-curve and number field sieve integer factorizers would take. It factored in under a second, which means the key would have to be many orders of magnitude larger to offer any kind of security.

So within an hour a factorization of n was posted to reddit by c0dep0et. Even so, LoneStar309 pointed out an embarassing implementation mistake which significantly weakened it (given a valid key, you could generate several other valid encodings of the same key); I patched this, as I mentioned in my update. And then teraflop demonstrated posession of a working keygen a couple hours later.

I wanted the key generator challenge to be possible, and it definitely wasn't trivial, but it was still far easier than I had hoped. Still, I couldn't be happier with the result, and I would like to thank my fellow programming.redditors for a great discussion.

For those who haven't studied how RSA works in sufficient detail to go from a factored n to a key generator, go take a moment to read up on Wikipedia. Basically the public and private keys, e and d, are multiplicative inverses mod φ(n) where φ(n) is Euler's totient function. In the case of n=pq where p and q are prime, φ(n) = (p - 1)(q - 1). So you use the extended euclidean algorithm to find e from d and φ(n). If you're using GMP (I am), you can just call mpz_invert to do that.

Once you've recovered e from d, you just RSA-encrypt the message m = 12345678 + check_mod*N where N is the key number of your choosing and 12345678 is a "nothing up my sleeve" number I chose for validating a decryption. The ciphertext is thus m^e (mod n), calculated using exponentiation by squaring, mod n at each step (which is what expmod does in bn.h), and then you do the reverse of decode to turn the number into a string.

The code I used for generating RSA private key pairs is rsa_genpriv.c and for generating license keys is rsa_genlic.c. These require libgmp; the job is just too big for poor little bn.h.

(All my code here is MIT-licensed, by the way, so feel free to steal it for your own purposes. By all means, use it instead of some silly easy-to-duplicate hashing scheme for your application...)

So, will RSA-based license schemes work? Not with such a short key length. Can we just make the key length longer? Well, that depends. Your ciphertext is always going to be a number between 2 and n, if n is 512 bits then so is your ciphertext. 1024 bits is probably the smallest reasonably secure size you'd want to use for RSA, which is 205 characters in the A-Y,1-9 code I'm using. So if your users are pasting keys out of an email, that's probably fine, but if they're typing it in by hand off of a CD case, forget it.

Also, this scheme, though cryptographically weak, has some points in its favor. If a theoretical cracker disassembles the code, he absolutely must understand RSA at some level, extract n, and factor it in order to create a key generator. I probably wouldn't have the patience to do it if the least bit of obfuscation were used in conjunction. It's totally self-contained (so you don't have to link in libcrypto or libopenssl or libgmp), so it's pretty much a drop-in replacement for whatever hashing scheme that most software tends to use.

And, though the backbone of the challenge was quickly broken, only one person demonstrated a keygen. I guess one is all it takes.

Can we do better? Yes, I think we can do much better. RSA's security derives from the difficulty of the integer factorization problem. There are two other commonly used classes of asymmetric key cryptosystems based on harder problems: discrete logarithm and elliptic curve discrete logarithm. Each provides more "strengh" per bit of key than the last.

james_block brings up some good points along these lines. It may not be possible to create a software license scheme with both short license codes and enough security to withstand a large, coordinated effort to break it. But it's far better to use a license key scheme that could be broken with a large effort than one that will definitely be broken with a small effort, when the former is an easy drop-in replacement for the latter. Truly uncrackable (in the cryptographic sense) security will require longer keys and users who paste keys out of emails.

So here is challenge #2. I've used another common algorithm which is no longer encumbered by a patent. The ciphertext is still slightly less than 125 bits. It is not impossible to crack by any means, but it is much harder (in terms of CPU time necessary) than the previous one. And there's always the possibility that I screwed something up and left a big back door in it, which is a good reason for proposing the challenge in the first place.

The code:
keydecode2.cpp - challenge #2 decoder
bn.h - quick and dirty bignums (updated from last time)

I plan on issuing one further challenge next week, and there's a good chance that this one will be broken before then if it receives the same level of attention as the first one did.

Here is the reddit thread for part 2.

Update: The solution to part 2 has been posted.