Landon Hemsley | Digital Elegance Delivered

Bleichenbacher ’06 RSA Signature Forgery: What they assume you know
Fri, 17 Dec 2021 14:45:21 +0000

In 2006, Daniel Bleichenbacher shared a discovery in an evening session at a cryptography conference: several implementations of RSA-based PKCS#1 v1.5 cryptographic signature verification were fatally flawed and susceptible to signature forgery.

It is as bad as it sounds. The sad part: The flaw in the signature verification algorithm is that the signature submitted for validation is trusted too much. Any engineer worth his salt knows you should never trust user input.

This blog post is directly tied to cryptopals challenge 42, in which you are asked to exploit a weakness that still existed in several RSA cryptographic signature verification implementations as of 2016 (I would hope it’s fixed now). But the challenge doesn’t give you a lot of background on how to construct a signature in the first place. Some parts are obvious; many are not.

The algorithm

If you google this subject long enough, you’ll run into something that looks like this in lots of different places:

00 01 FF FF FF ... FF FF 00 ASN HASH GARBAGE

This format (sans GARBAGE) is outlined in RFC 2313. (Spend some time there… it will help). The signature algorithm, as outlined there, is pretty straightforward. Using the private RSA key (not the public key), follow these steps:

  • Use a hashing algorithm to hash the message. SHA1 is fine, but you could use MD4, MD5, SHA256, SHA512, or whatever you want.
  • Encode that hash in ASN.1 format following a specific encoding scheme (as outlined below).
  • Pad that octet string out to the width of the RSA public modulus (aka n), starting with a byte of 00, then a byte of 01, then k bytes of FF, followed by another 00 byte, and finally the ASN.1-encoded octet string.
  • RSA-encrypt that padded octet string using the private key (not the public key).
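The padding step can be sketched in Java. `Pkcs1Pad` is a hypothetical helper of my own (not code from the post), and `asn1` stands for the DigestInfo encoding produced in the next section:

```java
import java.io.ByteArrayOutputStream;

public class Pkcs1Pad {
    // build 00 01 FF .. FF 00 || asn1, padded to k bytes total,
    // where k is the byte length of the RSA public modulus n
    public static byte[] pad(byte[] asn1, int k) {
        int ffCount = k - 3 - asn1.length; // bytes of 0xFF between the 01 and the 00
        if (ffCount < 8) throw new IllegalArgumentException("modulus too small for this digest");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(0x00);
        out.write(0x01);
        for (int i = 0; i < ffCount; i++) out.write(0xFF);
        out.write(0x00);
        out.write(asn1, 0, asn1.length);
        return out.toByteArray();
    }
}
```

RFC 2313 requires at least eight FF bytes, which is why the sketch rejects a modulus that leaves fewer.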

Signature verification is basically the same, but backwards:

  • RSA-decrypt the submitted signature using the public key (not the private key).
  • Parse the decrypted octet string, verifying and validating the padding scheme.
  • Seek the beginning of the ASN.1 encoded octets.
  • Using the hash algorithm specified in the ASN.1 encoded octets, hash the message submitted for signature verification.
  • Compare the resulting hash with the hash included in the signature. If they match, then the signature is valid.

ASN.1 Formatting

If you’re a n00b like I was, the first questions are “What in the blazes is ASN.1?” and “How am I supposed to encode something in it?”

The answer: Don’t do it manually. There are libraries available all over the internet that make it easy; you just have to know how to use them. I used Bouncy Castle.

In RFC 2313, the encoding schema for the ASN.1 encoding is outlined as follows

   DigestInfo ::= SEQUENCE {
     digestAlgorithm DigestAlgorithmIdentifier,
     digest Digest }

   DigestAlgorithmIdentifier ::= AlgorithmIdentifier

   Digest ::= OCTET STRING

AlgorithmIdentifier is defined in RFC 5280 (X.509) as follows.

   AlgorithmIdentifier  ::=  SEQUENCE  {
        algorithm               OBJECT IDENTIFIER,
        parameters              ANY DEFINED BY algorithm OPTIONAL  }

People the world over have written libraries to do this. Here’s how I implemented the encoding. (It seems simple, but boiling this down to knowing which library objects to use was not obvious and took a lot of searching and reading).

The object identifier is a defined constant elsewhere in the library. It was just a matter of finding it and knowing which one to use.

    public byte[] encodeHashToAsn1SignatureFormat(final byte[] hash, final ASN1ObjectIdentifier hashAlgo) throws IOException {
        ASN1Sequence s1 = new DERSequence(new ASN1Encodable[] {
                new AlgorithmIdentifier(hashAlgo, DERNull.INSTANCE),
                new DEROctetString(hash)
        });
        // serialize the DigestInfo SEQUENCE in DER form
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(s1.getEncoded(ASN1Encoding.DER));
        return out.toByteArray();
    }

Small e and poor padding validation

Like I said… the fatal flaw in many RSA signature verification implementations is that they trusted the submitted signature too much. Specifically, they didn’t validate that the padded content filled the full width of the RSA public modulus, and didn’t make sure that the hash content of the signature was right-justified. That means that if you have a signature that nominally follows the signature encoding format of…

00 01 FF FF FF ... FF FF 00 ASN HASH

… then you can throw whatever you want (GARBAGE) after HASH, and the signature will still validate.
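To make the flaw concrete, here is a sketch of the kind of left-to-right padding walk Bleichenbacher described. `SloppyVerifier` is my own hypothetical class, not code from any real library; the point is that nothing checks where the digest ends:

```java
public class SloppyVerifier {
    // A broken padding check: it walks 00 01 FF..FF 00 from the left, then
    // reports where the digest begins, without ever checking that the digest
    // ends at the rightmost byte of the block. Trailing GARBAGE is ignored.
    public static int findDigestOffset(byte[] block) {
        if (block[0] != 0x00 || block[1] != 0x01) return -1;
        int i = 2;
        while (i < block.length && block[i] == (byte) 0xFF) i++;
        if (i == 2 || i >= block.length || block[i] != 0x00) return -1;
        return i + 1; // caller parses ASN.1 + hash starting here
    }
}
```

A strict verifier would instead require the digest to occupy the final bytes of the block, with FF padding filling everything in between.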

This is sort of a big deal. Recall from this explainer that RSA encryption is based on modular exponentiation. If you have a small exponent and a large modulus, you run the risk that a message might not be numerically large enough to wrap around the modulus after exponentiation. This being the case, all someone would have to do to forge a signature is come up with a number that, when exponentiated, follows the format:

00 01 FF FF FF ... FF FF 00 ASN HASH GARBAGE
This is hard with larger exponents. It’s almost trivial when e=3. Why? All you have to do is build a message, take the cube root of that figure, and round up.

Why round up? Let’s be honest … you’re probably not going to be crafting a message that when translated into a large integer is a perfect cube. Computers are good at lots of things. Computing roots and computing prime factors of large figures are not among those things. But you don’t need a perfect cube anyway because you don’t care what comes after the HASH, right?

If you take a cube root of your crafted message and round up, the least significant (read: the right-most) bytes will be different than where you started, but the most significant (read: the left-most) bytes will stay the same. If you’re dealing with a padding validator that doesn’t validate that HASH is right-justified, you’ve won.

Here’s my implementation of an RSA signature “forger.” Pretty simple, pretty scary that it works so well.
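Since the post’s forger code isn’t reproduced here, below is a hedged sketch of the same idea for e=3. `ForgerSketch` and its method names are hypothetical; the essentials are building the block, interpreting it as an integer, and taking the rounded-up cube root:

```java
import java.math.BigInteger;

public class ForgerSketch {
    // integer cube root of n, rounded up (binary search; BigInteger has no root method)
    public static BigInteger cubeRootCeil(BigInteger n) {
        BigInteger lo = BigInteger.ZERO, hi = BigInteger.ONE.shiftLeft(n.bitLength() / 3 + 2);
        while (lo.compareTo(hi) < 0) {
            BigInteger mid = lo.add(hi).shiftRight(1);
            if (mid.pow(3).compareTo(n) < 0) lo = mid.add(BigInteger.ONE);
            else hi = mid;
        }
        return lo;
    }

    // craft 00 01 FF 00 || asn1Hash || 00 00 ... out to k bytes (k = modulus length),
    // then take the rounded-up cube root. Cubing that root only disturbs the low
    // (zero) bytes, so a left-justified padding check still passes.
    public static BigInteger forge(byte[] asn1Hash, int k) {
        byte[] block = new byte[k];
        block[0] = 0x00; block[1] = 0x01; block[2] = (byte) 0xFF; block[3] = 0x00;
        System.arraycopy(asn1Hash, 0, block, 4, asn1Hash.length);
        return cubeRootCeil(new BigInteger(1, block));
    }
}
```

Against a vulnerable verifier with e=3, the cube of the forged value starts with the expected `00 01 FF 00 ASN HASH` prefix; everything after is the GARBAGE nobody checks.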

The alternative, and why I didn’t pursue it

Hal Finney, in his original summary of this exploit, detailed that signatures could also be forged by treating everything before GARBAGE as if it were the result of (A - B)^3. He loosely outlined a way that a message could be crafted such that, when cubed, it would follow the same basic pattern.

I saw this as needlessly complex. And, I already had a method written to find cube roots (and other roots) of large integers. I didn’t see a reason to bang my head against the wall when I was already 90% of the way there.

However, this is definitely a problem to revisit in the future.

Thanks for reading! -LH

RSA for those who aren’t number theorists
Fri, 29 Oct 2021 12:45:28 +0000

I just finished cryptopals challenge 39, in which I had to implement RSA.

For me, it wasn’t enough to just implement the RSA algorithm. I sort of needed to understand a bit about the underlying number theory. I say that because I’ve faced instances in the past where a typo or error in a cryptopals challenge description threw me off the trail for days or weeks. So, rather than bang my head against a wall, I took as deep a dive as I needed to understand the math behind this algorithm well enough to make sense of it all.

I still don’t know if I’m all the way there. I am not a math novice, but I’m definitely no number theorist. In any case, here’s my best attempt to explain this in a way that will make sense to those of us who are also not mathematicians but need to learn how RSA is supposed to work mathematically.


Here are the steps to generate an RSA key.

  1. Select two prime numbers p and q. Multiply them to get n.
  2. Working modulo n, Euler’s theorem tells us a^(kφ(n)+1) ≡ a mod n if a and n are coprime. (To make sure of this, we use very large prime numbers p and q to determine n.)
  3. We can substitute kφ(n)+1 with ed provided that we select an e that is coprime with φ(n).
  4. φ(n) = (p-1)(q-1)
  5. In order for e to be coprime with (p – 1)(q – 1), we select an e that is prime and greater than or equal to both p and q.
  6. Alternatively, we check to ensure prime number e is not a factor of φ(n) and keep picking different primes for e until we find one that isn’t a factor of φ(n). This will make e coprime to φ(n).
  7. Once we’ve selected an e, we know that a modular multiplicative inverse must exist because e and φ(n) are coprime; we find the inverse d using the extended euclidean algorithm.
  8. The lock is [e, n] and the key is [d, n].
  9. Congratulate yourself!

What in the….?

We are going to go through the theory behind these steps in more detail. We’ll start with what invertibility means, which will lead us into the notion of modular multiplicative inverses. Then we’ll discuss φ(n), called simply the “phi function,” look at phi functions of prime numbers and products of prime numbers, and we’ll finish with Euler’s theorem. Then, with that foundation, we’ll select a key pair that meets all the criteria that will allow RSA to work.

There are links spread throughout this post, but as a favor, here are some of the sources that helped me understand this. A lot of them have more number theory than I think the average joe needs to understand to make this work. If these don’t help well enough, feel free to use a search engine. The info about how this works is everywhere. Keep grinding, and it will click eventually.

Prime Numbers and RSA by Computerphile

RSA by Eddie Woo Part One and Part Two

Encryption and Huge Numbers by Numberphile

RSA Encryption in 5 minutes

Jeremy Kun Math x Programming June 2011


First things first: In order for an encryption scheme to be worth anything, it has to be invertible (or, someone somewhere has to be able to decrypt an encrypted message). If you can’t decrypt an encrypted message, what’s it worth? Nothing.

RSA is inverted using modular exponentiation. That is, exponentiate a figure, divide it by another figure, and find the remainder of that division to encrypt. Repeat using different figures to arrive back at the beginning.

(Note: You’ll notice in the equations that follow, we don’t speak in terms of equality. We speak in terms of congruence. What’s congruence? Read the section on modular arithmetic here.)

Encryption: M^e mod n ≡ C

Decryption: C^d mod n ≡ M

If you please, we can make a substitution for C and get the fundamental expression that undergirds RSA.

M^(ed) mod n ≡ M

Or, as you will see just about anywhere if you search around…

M^(ed) ≡ M mod n ⟵ The golden formula. If you get lost, remember that this is ultimately what we’re after.

Modular multiplicative inverse

Notice that because we’re talking congruence, we can replace ed with 1 and it’s still true.

M^1 ≡ M mod n

There’s a special term for a pair of integers that when multiplied together mod n is congruent to 1: Modular multiplicative inverse. Or, we are after a d that is the modular multiplicative inverse of e and vice-versa.

There’s a rule about modular multiplicative inverses. For any two figures a and n, a will have a modular multiplicative inverse b (mod n) if and only if a and n are coprime.

We must define coprime: A number a is coprime to another number n if they have no common factors.

The numbers don’t have to be prime themselves. But if they were, they would by definition be coprime since they would have no common factors.

All this to say that if we’re going to find any pair e and d that will satisfy our golden formula, it’s going to involve some sort of modular multiplicative inverse. Once we have found a pair of figures that satisfy our golden formula, then we have found an asymmetric encryption key. The lock is [e, n] (power of e, mod of n), and the key is [d, n] (power of d, mod of n).

Now, the question is this: how do we find e and d? We’re not there yet. Hang in there.

Phi function

I have to lay another piece of ground work. We need to talk about the phi function, or φ(n).

φ(n) is a special number. φ(n) is the quantity of numbers from 1 up to n that are coprime with n, meaning how many of them share no common factors with n.

As it happens, calculating φ(n) is trivial for prime numbers. Think about it. If a number is prime, then by definition, it has no factors other than 1 and itself. One is a factor of everything, so we can ignore it. Therefore, for a prime number p

φ(p) = p – 1

Phi function of products of primes

Next step: what if we have two prime numbers p and q and want to figure out what φ(pq) is? (I know this seems random, but stay with me. It’s important.)

With this one, multiply the phi-function of each prime together to find the phi of their product. The proof is a bit over my head. If you want to read the proof, go here.

φ(pq) = φ(p)φ(q) = (p – 1)(q – 1)

For RSA, pq = n, or in other words, n is the product of two primes p and q, and φ(n) is the product of the phi function of each of its prime factors.

Let’s work a simple example with two primes: 3 and 5. Here are all the numbers from 1 to 15, with multiples of 3 and 5 marked in brackets. Because the bracketed figures are multiples of our prime numbers, they are not coprime with 15.

1 2 [3] 4 [5] [6] 7 8 [9] [10] 11 [12] 13 14 [15]

Count the number of non-bolded figures. There are 8: {1, 2, 4, 7, 8, 11, 13, 14}.

φ(15) = φ(3)φ(5) = 2 * 4 = 8

Seems good to me.
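A brute-force count confirms the multiplicative rule. `PhiCheck` is a hypothetical helper of mine, not code from the post:

```java
import java.math.BigInteger;

public class PhiCheck {
    // count how many a in 1..n have gcd(a, n) == 1 — brute force, fine for tiny n
    public static int phi(int n) {
        int count = 0;
        for (int a = 1; a <= n; a++) {
            if (BigInteger.valueOf(a).gcd(BigInteger.valueOf(n)).intValue() == 1) count++;
        }
        return count;
    }
}
```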

Euler’s theorem

Last sidetrack before we find e and d, I promise.

Once upon a time a guy named Euler came up with a theorem that says that if you have two numbers a and n that are coprime, then the following is true. If you want to read up on why or how you can go here or here.

a^φ(n) ≡ 1 mod n

This can be toyed with. Let’s raise both sides to the power k and simplify.

a^(kφ(n)) ≡ 1^k mod n
a^(kφ(n)) ≡ 1 mod n

Further, let’s multiply both sides by a and simplify.

a * a^(kφ(n)) ≡ a * 1 mod n
a * a^(kφ(n)) ≡ a mod n
a^(kφ(n) + 1) ≡ a mod n

Let’s call that out again, because it’s important

a^(kφ(n) + 1) ≡ a mod n

Does this look at least a little familiar? Golden formula anyone?

M^(ed) ≡ M mod n

Therefore, if we can find two figures e and d such that the following is true, we have found our e and d.

ed = kφ(n) + 1

Choosing e… finally

Let’s make an assumption that there is a pair of figures e and d that satisfy this congruence. (Aside: recall from above that for this to be true, e and φ(n) have to be coprime.)

ed ≡ 1 mod φ(n)

Subtract one from both sides.

ed – 1 ≡ 0 mod φ(n)

If that’s true, then φ(n) divides evenly into ed – 1. Or, in other words, ed – 1 will be equal to some multiple of φ(n). We can call that multiplier k.

ed – 1 ≡ 0 mod φ(n)
ed – 1 = kφ(n)

At this point, add one to both sides again.

ed = kφ(n) + 1 ⟶ This is the equation we are after!

Therefore, given a certain condition, we can straight substitute ed into Euler’s Theorem, and this gives us our golden formula.

What are those conditions? As stated before, e and φ(n) have to be coprime.

How can we guarantee that? φ(n) = (p – 1)(q – 1). Recall that p and q are prime numbers (ideally, really large prime numbers). Choose a third prime for e that is greater than (p – 1) and (q – 1), and you’ll be guaranteed that it is coprime with φ(n): a prime e can only be a factor of φ(n) if it divides evenly into (p – 1) or (q – 1), and no number divides evenly into a smaller positive number. So if that prime is greater than the figures we multiplied to get φ(n), it will definitely be coprime to φ(n).

Alternatively, we can just pick random primes, small or large, and see if a chosen prime is a factor of φ(n). If it is, we have to pick again. We keep picking until we find one that is coprime to φ(n).

So to sum up, as long as e and φ(n) are coprime (no common factors), then there will absolutely be a d that makes the equation ed = kφ(n) + 1 true for some value k. The bounds for e are that it has to be greater than one and less than (p-1)(q-1), with the caveat that it cannot be a factor of (p-1)(q-1). The easiest way to ensure that is to pick a prime number that is greater than p and q.

Solving for d

Now that we have an e, we have to find d, which is the modular multiplicative inverse of e mod φ(n). Once we have e, d, and n, we are finished!

There’s a defined algorithm for this: the extended Euclidean algorithm. I’m not going to explain it here because, honestly, it’s over my head. There is pseudocode on Wikipedia that makes it pretty simple and straightforward to implement.

Interestingly, Java comes with an implementation of this algorithm already attached to BigInteger: the modInverse function. So, we already have what we need in order to determine the inverse of e mod φ(n). This makes any homebrew implementation pretty irrelevant in Java, but the challenge invites us to implement it anyway. Here’s my implementation.

    /**
     * find the modular inverse of a mod n, which we call t
     * an implementation of the extended euclidean algorithm
     * sourced from <a href="" target="_blank">wikipedia</a>
     * this is functionally equivalent to {@link BigInteger#modInverse(BigInteger)}
     * @param a the number
     * @param n the modulus
     * @return the result, which I am calling t
     */
    public BigInteger invMod(final BigInteger a, final BigInteger n) {
        BigInteger t = BigInteger.ZERO;
        BigInteger nextT = BigInteger.ONE;
        BigInteger r = n;
        BigInteger nextR = a;

        while (nextR.compareTo(BigInteger.ZERO) != 0) {
            var q = r.divide(nextR);
            var tempT = t;
            t = nextT;
            nextT = tempT.subtract(q.multiply(nextT));

            var tempR = r;
            r = nextR;
            nextR = tempR.subtract(q.multiply(nextR));
        }

        if (r.compareTo(BigInteger.ONE) > 0) {
            throw new ArithmeticException("a is not invertible");
        }

        if (t.compareTo(BigInteger.ZERO) < 0) {
            t = t.add(n);
        }

        return t;
    }

Generating a key

Remember, the public key (or lock) is [e, n] and the private key (or key) is [d, n]. Let’s do a quick recap on the steps it takes to find this key.

  1. Select two prime numbers p and q. Multiply them to get n.
  2. Working modulo n, Euler’s theorem tells us a^(kφ(n)+1) ≡ a mod n if a and n are coprime. (To make sure of this, we use very large prime numbers p and q to determine n.)
  3. We can substitute kφ(n)+1 with ed provided that we select an e that is coprime with φ(n).
  4. φ(n) = (p-1)(q-1)
  5. In order for e to be coprime with (p – 1)(q – 1), we select an e that is prime and greater than or equal to both p and q.
  6. Alternatively, we check to ensure prime number e is not a factor of φ(n) and keep picking different primes for e until we find one that isn’t a factor of φ(n). This will make e coprime to φ(n).
  7. Once we’ve selected an e, we know that a modular multiplicative inverse must exist because e and φ(n) are coprime; we find the inverse d using the extended euclidean algorithm.
  8. The lock is [e, n] and the key is [d, n].
  9. Congratulate yourself!
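The recap above can be sketched end to end with java.math.BigInteger. `RsaKeygenSketch` is a hypothetical name of mine, and this sketch uses the built-in modInverse for step 7:

```java
import java.math.BigInteger;
import java.security.SecureRandom;

public class RsaKeygenSketch {
    public final BigInteger n, e, d;

    public RsaKeygenSketch(int bits) {
        SecureRandom rnd = new SecureRandom();
        // step 1: two primes p and q; their product is n
        BigInteger p = BigInteger.probablePrime(bits / 2, rnd);
        BigInteger q = BigInteger.probablePrime(bits / 2, rnd);
        n = p.multiply(q);
        // step 4: phi(n) = (p - 1)(q - 1)
        BigInteger phi = p.subtract(BigInteger.ONE).multiply(q.subtract(BigInteger.ONE));
        // steps 5-6: keep picking primes for e until one is coprime with phi(n)
        BigInteger candidate = BigInteger.valueOf(3);
        while (!phi.gcd(candidate).equals(BigInteger.ONE)) {
            candidate = candidate.nextProbablePrime();
        }
        e = candidate;
        // step 7: d is the modular multiplicative inverse of e mod phi(n)
        d = e.modInverse(phi);
    }
}
```

Encrypting with [e, n] and decrypting with [d, n] (or vice versa) should round-trip any message smaller than n.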

Hopefully it makes more sense now!

Thanks for reading! -LH

Secure Remote Password Demystified
Thu, 16 Sep 2021 13:49:59 +0000

Secure Remote Password (SRP) is a protocol by which a user in a system is able to log in to that system without the system ever knowing or storing the user’s password.

Consider this description of the SRP protocol from cryptopals challenge 36:

Replace A and B with C and S (client & server)

C & S
 - Agree on N=[NIST Prime], g=2, k=3, I (email), P (password)
 - Generate salt as random integer
 - Generate string xH=SHA256(salt|password)
 - Convert xH to integer x somehow (put 0x on hexdigest)
 - Generate v=g**x % N
 - Save everything but x, xH
 - Send I, A=g**a % N (a la Diffie Hellman)
 - Send salt, B=kv + g**b % N
S, C
 - Compute string uH = SHA256(A|B), u = integer of uH
 - Generate string xH=SHA256(salt|password)
 - Convert xH to integer x somehow (put 0x on hexdigest)
 - Generate S = (B - k * g**x)**(a + u * x) % N
 - Generate K = SHA256(S)
 - Generate S = (A * v**u) ** b % N
 - Generate K = SHA256(S)
 - Send HMAC-SHA256(K, salt)
 - Send "OK" if HMAC-SHA256(K, salt) validates

This is supposed to be a simple summary of the exchanges between a client and server to securely authenticate a user. It’s a lot to take in, and it’s not super intuitive what is going on or how it’s supposed to work.

Here’s my best shot at summarizing this.

First, Secure Remote Password (SRP) mostly concerns authentication after the fact. It doesn’t concern registration, except for one thing: the server needs to have some way to verify that a user is who he claims to be without actually having the password. To do that, a server needs a “password verifier.” This is v.

Here’s where cryptopals confused me: They indicate in their challenge summary that the server is supposed to build v and salt. Well that makes no sense. If the client sends the plaintext password over the network in an unencrypted manner to begin with, what’s the point? You’ve already given away the game at that point. Solution: Have the client do it instead. The client comes up with v and salt and sends both of these values to the server for storage as part of a registration. Wikipedia agrees with me (or at least did when I wrote this).

From that point on, it’s really not too bad. The user is registered. When it comes time to actually log in, how is it that the client can prove he is the user he claims to be? Math.

The client comes up with A, an ephemeral key based on a one-time, random, private-key value a. A and the username get sent to the server. The server responds with the previously saved salt and an ephemeral public key B based on both v and another one-time, random, private-key value b. They each do math to come up with a new number S independently. If the numbers match, then it’s considered proven that the client is who he claims to be, and authentication is completed.

Now you’re thinking, “That’s nice, but that description doesn’t show how the math adds up. Can you show me?” I can do my best. Again, Wikipedia has this math pretty awesomely summarized. One major difference between me and them, though, is that I’m going to retain the mod operation rather than condensing it down. Admittedly, it’s unwieldy, but in languages like Java, it’s important to understand that these operations can be done with BigInteger#modPow and not just BigInteger#pow. So, I kept the mod operations in this math. If you want to see a version without, check Wikipedia.

Ok. Here we go…

There is a desired end value S that both the client and server should be able to independently calculate:

S = (g^b mod N)^(a + ux) mod N

Client –> Server

Client: I want to log in. Here is my username and a public key A.


The server knows k, g and N. It also knows its own key b. It just received a username from the client and can retrieve a previously saved v from storage. It also can build u with both public keys (as follows).

u = SHA256(A|B) <— This means “A concat B hashed into SHA256 and converted to an integer”

A = g^a mod N

v = g^x mod N

With these values, we can compute a value S that is mathematically equal to the prior definition of S.

S = [A(v^u mod N)]^b mod N

substitute A and v for their values

S = [(g^a mod N)((g^x mod N)^u mod N)]^b mod N

simplify by reshuffling the exponents … (base^z)^y = base^(zy)

S = [(g^a mod N)(g^(ux) mod N)]^b mod N

simplify by reshuffling exponents again … base^z * base^y = base^(z+y)

S = [g^(a + ux) mod N]^b mod N

swap the exponents … (base^z)^y = base^(yz) = base^(zy) = (base^y)^z

S = (g^b mod N)^(a + ux) mod N <— desired end value

Server –> Client

Server: I see your login request. Here is a salt and a public key B.


The client gets B and salt from the server. It also knows k, N, g, and password. It obviously knows its own private key a. And, because it has both A and B, it can build u the same way the server did. Also, from salt and the password, it can rebuild v through intermediate x.

u = SHA256(A|B)

x = SHA256(salt | password)

v = g^x mod N

B = kv + g^b mod N

With these values, we can compute a value S that is mathematically equal to the definition of S stated above as well as the value of S computed by the server.

S = [B – k(g^x mod N)]^(a + ux) mod N

substitute B for its value

S = [kv + g^b mod N – k(g^x mod N)]^(a + ux) mod N

substitute v for its value

S = [k(g^x mod N) + g^b mod N – k(g^x mod N)]^(a + ux) mod N

simplify … k(g^x mod N) cancels itself out

S = (g^b mod N)^(a + ux) mod N <— desired end value
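The client and server derivations can be sanity-checked numerically with toy parameters. `SrpMathCheck` is hypothetical and obviously not secure: it uses a tiny prime N and made-up a, b, x, and u purely to confirm the algebra; real SRP uses a large NIST prime:

```java
import java.math.BigInteger;

public class SrpMathCheck {
    // verify that the server's S and the client's S come out equal
    public static boolean check() {
        BigInteger N = BigInteger.valueOf(1019); // a small prime, illustration only
        BigInteger g = BigInteger.TWO;
        BigInteger k = BigInteger.valueOf(3);
        BigInteger a = BigInteger.valueOf(123), b = BigInteger.valueOf(456);
        BigInteger x = BigInteger.valueOf(77), u = BigInteger.valueOf(88);

        BigInteger A = g.modPow(a, N);                    // client's public key
        BigInteger v = g.modPow(x, N);                    // password verifier
        BigInteger B = k.multiply(v).add(g.modPow(b, N)); // server's public key

        // server side: S = (A * v^u)^b mod N
        BigInteger serverS = A.multiply(v.modPow(u, N)).modPow(b, N);
        // client side: S = (B - k * g^x)^(a + u*x) mod N
        BigInteger clientS = B.subtract(k.multiply(g.modPow(x, N)))
                .modPow(a.add(u.multiply(x)), N);
        return serverS.equals(clientS);
    }
}
```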

Final authentication

Once the client and server have independently found the shared secret S, the client derives K = SHA256(S) and sends HMAC-SHA256(K, salt) to the server. The server independently derives the same values and compares. If they match, the user is authenticated.

You can see my implementation of the client and server on my github repo. I had fun implementing this.

Fatal flaw?

Look at the math again. Consider the question posed in challenge 37: what if the client sends a value of 0 for A? What does that do to the shared secret value S computed by the server?

S = [A(v^u mod N)]^b mod N
  = [0(v^u mod N)]^b mod N
  = 0^b mod N
S = 0

If a malicious client sends 0 for A, then S = 0, which means the SHA256 value sent to the server for authentication becomes a constant: SHA256(0). That enables anyone to log in without the password. The same is true if A is any multiple of N, because any multiple of N modulo N equals 0.
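A quick sketch (hypothetical `SrpZeroKeyDemo`, toy numbers) shows the server-side collapse for A = 0 and for multiples of N:

```java
import java.math.BigInteger;

public class SrpZeroKeyDemo {
    // server-side S = (A * v^u)^b mod N; it collapses to 0 whenever A ≡ 0 mod N
    public static BigInteger serverS(BigInteger A, BigInteger v, BigInteger u,
                                     BigInteger b, BigInteger N) {
        return A.multiply(v.modPow(u, N)).modPow(b, N);
    }
}
```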


Timing leaks and multi-threading
Tue, 24 Aug 2021 13:48:55 +0000

What if the server that verified MACs took longer to verify a correct MAC than an incorrect one? Or, put differently, what if you could tell the difference between a more correct guess and an obviously wrong one? If you can, you can break MAC authentication schemes, and that’s what the cryptopals authors are trying to get at in challenges 31 and 32.

Write a function, call it "insecure_compare", that implements the == operation by doing byte-at-a-time comparisons with early exit (ie, return false at the first non-matching byte).

In the loop for "insecure_compare", add a 50ms sleep (sleep 50ms after each byte).

Use your "insecure_compare" function to verify the HMACs on incoming requests, and test that the whole contraption works. Return a 500 if the MAC is invalid, and a 200 if it's OK.

This is simple enough. Remember that a SHA1 hash value is simply a byte array 20 bytes long, usually in hexadecimal string form. If you take a byte array of 20 bytes and start changing one byte at a time, you can pretty easily determine when a byte in your candidate is correct. How? Brute force each byte in the hash.

  • Stand up a vulnerable server. I used spring boot and jetty. You can see my vulnerable server code here.
  • Starting with a byte array of all zeros, make 256 requests against the server, rotating byte n=0 (the first byte) through all 256 possible values. Measure how long each request takes. The correct byte in position n=0 is the one that took the longest.
  • Save the byte in position n=0
  • Repeat for positions n = 1 through 19 until you finally get a 200 out of the server.

Challenge 31 was simple because of how obvious the timing leak is at 50 ms. It got much more difficult with challenge 32 because of the statistical noise involved in sending requests with a 5 ms delay instead of a 50 ms delay. Any number of things on the server can cause delays of a millisecond or two, especially when you’re running a small web server on a not-powerful system. Every solution to this challenge I could find elsewhere skipped standing up an explicit web server like I had done and just relied on class-to-class method calls (who could blame them?), and usually relied on simply averaging the request durations and taking the group of requests that took the longest.

Ultimately, my solution did rely on averaging, but I also relied on a process of elimination: I limited the number of requests made across all possible solutions, then made 3x more requests for the five candidates that took the longest on the first go-round. For both 31 and 32, I used multi-threading, and I used the same tool to crack a timing leak at 50 ms and a timing leak at 5 ms. For the 50 ms leak, I used upwards of 30 threads; the noise that produced in response times was not enough to pollute the results. For the 5 ms leak, I could only use two or three threads and still successfully break the MAC. Otherwise, the noise was too much. You can see my solution below.

Final note: I used multi-threading to try to speed up build times, with mixed results. This crack takes a long time to run. After you find the first byte correctly, each successive request is also going to be delayed. That means finding the first byte takes a fraction of a second, but finding the second byte with a single thread takes (50ms * 256) + 50ms at minimum, which is roughly 13 seconds. The third byte takes (100ms * 256) + 50ms, which is roughly 26 seconds. This grows in linear fashion. If you can multi-thread, you can make concurrent requests and arrive at the solution more quickly. Even so, getting reliable results takes time. With my solution, each challenge takes 20 minutes or so to compute. Rather than sit around for three quarters of an hour waiting for the crack to finish, I tagged these tests and ignore them by default. I only run them when I push to a specific branch on a GitHub server. GitHub will tell me later if there’s a problem.

Here’s my timing leak exploiter:

 * a class dedicated to exploiting timing leaks in order to complete
 * challenges 31 and 32
public class C31_32_TimingLeakExploiter {

    private final String file;
    private final int port;
    private final RestTemplate restTemplate;
    private final Executor ex;

    public C31_32_TimingLeakExploiter(String file, int port, RestTemplate restTemplate, int numOfThreads) {
        this.file = file;
        this.port = port;
        this.restTemplate = restTemplate;
        this.ex = Executors.newFixedThreadPool(numOfThreads);

    public void exploitLeak(final byte[] forgedHash) {
        for (int i = 0; i < forgedHash.length; i++) {
  "starting round one with full byte set");
            //round 1
            Set<Byte> byteSet = IntStream.range(Byte.MIN_VALUE, Byte.MAX_VALUE + 1).mapToObj(n -> (byte) n).collect(Collectors.toSet());
            final SortedSet<Pair<Byte, Double>> tree = gatherSortedData(forgedHash, i, byteSet, 5, 33);
  "round one over. top five results: {}", tree);
            //round 2
            byteSet =;
            final SortedSet<Pair<Byte, Double>> tree2 = gatherSortedData(forgedHash, i, byteSet, 1, 100);
            forgedHash[i] = tree2.first().getKey();
  "found a byte ({}). now the hash is {}", tree2.first().getKey(), Hex.toHexString(forgedHash));

            if (HttpStatus.OK == makeRequest(forgedHash).get().getKey()) {
      "the hash was {}", Hex.toHexString(forgedHash));

    private SortedSet<Pair<Byte, Double>> gatherSortedData(final byte[] forgedHash, final int i, Collection<Byte> initialCandidates,
                                                           int limitOfNewCandidates, int limitOfRequest) {
        final var futuresMap = new HashMap<Byte, List<CompletableFuture<Pair<HttpStatus, Long>>>>();
        for (byte k : initialCandidates) {
            final List<byte[]> samples = new ArrayList<>();
            for (int j = 0; j < limitOfRequest; j++) {
                var clone = Arrays.clone(forgedHash);
                clone[i] = k;
            final var futures =;
            futuresMap.put(k, futures);

        SortedSet<Pair<Byte, Double>> candidates = new TreeSet<>(Comparator.comparing(Pair::getRight, Comparator.reverseOrder()));

        for (Map.Entry<Byte, List<CompletableFuture<Pair<HttpStatus, Long>>>> entry : futuresMap.entrySet()) {
            var futures = entry.getValue();
            var all = CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]));
            var totalTime = all.thenApply(v ->, Long::sum)).join();
            final double mean = totalTime.doubleValue() / (double) entry.getValue().size();
            if (candidates.size() < limitOfNewCandidates) {
                candidates.add(Pair.of(entry.getKey(), mean));
            } else if (candidates.stream().anyMatch(e -> e.getValue() < mean)) {
                candidates.add(Pair.of(entry.getKey(), mean));
                candidates.remove(candidates.last()); //evict the lowest mean to hold the limit
            }
        }
        return candidates;
    }

    public CompletableFuture<Pair<HttpStatus, Long>> makeRequest(final byte[] forgedHash) {
        return CompletableFuture.supplyAsync(() -> {
            final long startTime = System.currentTimeMillis();
            final String signature = Hex.toHexString(forgedHash);
            //port and file are fields of the test class; the names here are placeholders
            final URI uri = URI.create(String.format("http://localhost:%s/leak/test/%s?signature=%s",
                    port, file, signature));
            final ResponseEntity<String> response = restTemplate.getForEntity(uri, String.class);
            final long responseTime = System.currentTimeMillis() - startTime;
            if (response.getStatusCode() == HttpStatus.BAD_REQUEST) {
                throw new AssertionError("Got a bad request response");
            }
            return Pair.of(response.getStatusCode(), responseTime);
        }, ex); //ex is a shared Executor for the async requests
    }
SHA1 and MD4 Length Extension Attacks Explained Tue, 13 Jul 2021 13:27:21 +0000 Continuing my series on the cryptopals challenges…

In section four, two of the challenges require you to get past a checksum test by spoofing the hash associated with a forged message. The idea: if you can pass a manipulated query string to an application (say, a web application) along with a valid hash for that manipulated string, you can get other systems to do what you want. As an example, Flickr was exposed as vulnerable to this attack in 2009.

Because of the algorithms that undergird them, SHA1, MD4, and (as I understand it), MD5 are susceptible to this type of attack. Even if the message is prepended by a secret key of unknown length before being hashed, it is possible to leverage common implementations of these algorithms to generate new hashes that will validate against manipulated strings. For this reason (and others), simple message authentication codes (MACs) produced by SHA1, MD4 and MD5 shouldn’t be trusted.
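For context, the vulnerable construction in these challenges is the simple secret-prefix MAC. Here's a minimal sketch (class and variable names are mine, built on the JDK's `MessageDigest`):

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative sketch: the vulnerable "secret-prefix" construction,
// MAC = SHA1(key || message). Length extension works against exactly
// this shape, which is one reason HMAC exists.
public class SecretPrefixMac {
    public static byte[] mac(byte[] key, byte[] message) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            sha1.update(key);      // secret prefix, length unknown to the attacker
            sha1.update(message);  // attacker-controlled message
            return sha1.digest();  // 20 bytes -> 40 hex characters
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Anyone who can observe one (message, tag) pair from this construction has everything needed to mount the attack described below.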

But how? To answer that, you need to understand what happens in the underlying algorithm. In describing this, I'll note that there are some meaningful differences between SHA1 and MD4, such as endianness, digest length, and the underlying processing method, but in the ways that matter for this attack, they're the same. I'll call the differences out if they're important; otherwise, just assume I'm speaking in terms of SHA1.

When a message is hashed, that message is converted mathematically into a code. With SHA1 and MD4 (the two algorithms in the challenge), that code is a hexadecimal string 40 characters long for SHA1 (32 for MD4), derived from a byte array 20 bytes long (16 bytes for MD4). In other words, the algorithm takes the message, uses it to manipulate a byte array, and at the end of all the math, that resulting byte array is the code. Importantly, this implies the algorithm is stateful.

If you look at the bouncycastle implementation of SHA1 (the one I used), you'll notice five integer registers: H1, H2, H3, H4, H5. Each integer occupies 32 bits, or 4 bytes, and since there are five registers, that makes for 20 bytes stored in the digest's state. This state is where the ultimate hash comes from. As the digest mathematically processes the message, the end result of each processing step is to manipulate these registers. But the registers are only manipulated once for every 512-bit/64-byte block, which raises the question: what if your message's length isn't a multiple of 64 bytes?

The algorithm will pad the final block and fill it out so that there are 64 bytes in the block to process. It does this in two meaningful ways:

  • The first byte after the end of the message is always 10000000 (0x80). I'll call this the “bitflag.”
  • The last eight bytes contain a count of the bits, or bitcount, (not bytes) in the message (and if there isn’t a full eight bytes available to hold it, then a whole new, empty, 64-byte block is appended). For example:
    • if the message length is 20 bits, then the last byte will be 00010100
    • if the message contains 1000 bits, then the last two bytes will be 00000011 11101000
    • if the message contains 100,000 bits, then the last three bytes will be 00000001 10000110 10100000
    • you get it
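Putting the flag byte and the bitcount together, the whole padding can be computed from the message length alone. Here's a sketch following SHA1's big-endian convention (class and method names are mine):

```java
// Sketch: compute the padding SHA-1 appends to a message of the given byte length.
// A 0x80 flag byte, zero fill, then the bit count in the last eight bytes (big-endian).
public class Sha1Padding {
    public static byte[] paddingFor(int messageByteLength) {
        // total length (message + 1 flag byte + 8 count bytes) rounded up to 64
        int total = ((messageByteLength + 1 + 8 + 63) / 64) * 64;
        byte[] pad = new byte[total - messageByteLength];
        pad[0] = (byte) 0x80;                      // the "bitflag": 10000000
        long bitCount = (long) messageByteLength * 8;
        for (int i = 0; i < 8; i++) {              // big-endian bit count
            pad[pad.length - 1 - i] = (byte) (bitCount >>> (8 * i));
        }
        return pad;
    }
}
```

Note how the padding is at least 9 bytes long: when fewer than 9 bytes remain in the block, the rounding spills into a whole extra 64-byte block, matching the bullet above.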

After the bitflag and the bitcount are placed in the final 64-byte block of the message, the final block is processed just like any other block. The register state then becomes the digest that is output as the final result of the hash, at which point the registers are reset.

I’ll say that again: the final output of the algorithm is the state of the registers after the final block is processed.
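Concretely, for SHA1 those registers start from fixed constants (per RFC 3174), and the 20-byte digest is nothing more than their big-endian concatenation. A sketch (names are mine):

```java
// The digest IS the register state: these are SHA-1's initial values per
// RFC 3174, and the final 20 bytes are just H1..H5 laid out big-endian.
public class RegisterState {
    static final int[] INITIAL_H = {0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0};

    public static byte[] digestFromRegisters(int[] h) {
        byte[] digest = new byte[20];
        for (int i = 0; i < 5; i++) {
            for (int j = 0; j < 4; j++) {
                digest[4 * i + j] = (byte) (h[i] >>> (24 - 8 * j)); // most significant byte first
            }
        }
        return digest;
    }
}
```

The attack below amounts to running `digestFromRegisters` in reverse: split a known hash back into five ints and load them into a fresh digest.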

Ergo, if you can manipulate the registers, you can pretend that a bitflag and a bitcount were part of your message to begin with, and then feed the algorithm more message bytes (which, of course, you control) which will ultimately result in a new MAC that can authenticate against a manipulated message.

To do this, step one is to manually un-reset the registers back to their final state before the message hash was output. To be clear, you’re not just setting the registers back to what they were after the final byte of the message is processed. What you’re doing is resetting the registers to their final state after the bitflag and bitcount were put in their places. (Cryptopals calls the message contents from the bitflag to the bitcount “glue padding,” so I will too.)

Once the registers are reset back to their final state, you feed the digest more message and get a new hash. What’s really going to happen is the digest will take your new message, append new glue padding to round out a block, and will process that new block to produce your new hash.

So now you have a new hash for a manipulated message. But, if you can’t replicate the message the hash is tied to, then the hash isn’t worth much.

To do this, you have to recreate what has now become the original message: prefix + message + glue padding + appendix. You know the message and the appendix (and their lengths), obviously. Whether or not you know the prefix doesn’t matter as long as you know, or can at least guess, its length. If you know those lengths, you can rebuild the glue padding and fill out a block with that padding before tacking on the appendix.

Here’s how I did it. The end result is a byte array that starts with a single 1 and has the bitcount packed into the final eight bytes. The length of the result is enough to round out a 64-byte block (or to finish off one block and append a new 64-byte block if there isn’t enough space in the block to pack the bitcount into the final eight bytes).

    /**
     * given a message, build the MD padding as close to the same
     * way as possible.
     * the algorithm, best as i can tell, leads with a single set bit and then over-writes
     * the last bytes of the 512-bit block with the number of BITS in the message
     * @param message the message
     * @return the md padding
     */
    public byte[] buildGluePadding(final byte[] message) {
        //build a 512-bit block
        byte[] block = new byte[BLOCK_SIZE];

        //get the bytes of the message past the last full 64-byte block
        //(an empty array if the length is an exact multiple of 64)
        final byte[] subMsg;
        if (message.length == BLOCK_SIZE) {
            subMsg = new byte[0];
        } else if (message.length > BLOCK_SIZE) {
            final int startPos = (message.length / 64) * 64;
            subMsg = new byte[message.length - startPos];
            System.arraycopy(message, startPos, subMsg, 0, message.length - startPos);
        } else {
            subMsg = message;
        }

        //copy the message bytes into the block
        System.arraycopy(subMsg, 0, block, 0, subMsg.length);
        int index = subMsg.length;

        //make the next byte the flag byte (Byte.MIN_VALUE is 0x80, i.e. 10000000)
        block[index++] = Byte.MIN_VALUE;

        //have to allow for the bit count to take up to 8 bytes
        if (index > BLOCK_SIZE - BIT_COUNT_SPACE) { //exactly 8 bytes of room is enough; append only when there's less
            block = ByteArrayUtil.concatenate(block, new byte[BLOCK_SIZE]);
        }

        //get the number of bits in the message
        final long messageBitCount = (long) message.length << 3;
        packTheBitCount(block, messageBitCount);

        final int padLength = block.length - subMsg.length;
        final byte[] returnValue = new byte[padLength];
        System.arraycopy(block, subMsg.length, returnValue, 0, padLength);

        return returnValue;
    }

Note: This algorithm works for either SHA1 or MD4, but the method by which the bitcount is packed into the final block, packTheBitCount(), is different between the two. Here’s the implementation for SHA1, and here’s MD4. The biggest difference (which was frustrating) is that MD4 is little-endian, meaning least significant byte first; SHA1 is big-endian, meaning most significant byte first.
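To make the endianness difference concrete, here is a sketch of both packing conventions (names are mine; the real packTheBitCount implementations live in the repo linked above). Using the 1000-bit example from earlier, big-endian ends the block with 0x03 0xE8, while little-endian starts the final eight bytes with 0xE8 0x03:

```java
// Sketch of the two bit-count packing conventions, both writing into the
// last eight bytes of the final block.
public class BitCountPacking {
    // SHA-1: big-endian, most significant byte last written first (at the end)
    public static void packBigEndian(byte[] block, long bitCount) {
        for (int i = 0; i < 8; i++) {
            block[block.length - 1 - i] = (byte) (bitCount >>> (8 * i));
        }
    }

    // MD4: little-endian, least significant byte first
    public static void packLittleEndian(byte[] block, long bitCount) {
        for (int i = 0; i < 8; i++) {
            block[block.length - 8 + i] = (byte) (bitCount >>> (8 * i));
        }
    }
}
```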

Once you’ve got the glue padding, you can try to authenticate. The prefix is going to be prepended before being hashed, so the message to be submitted is message + glue padding + appendix. Submit that with the hash. As your manipulated message is processed, block by block, it will eventually arrive at the block that contains your manipulated glue padding. If you do it right, the register state after that block is processed should be identical to the state you forced when you obtained your new hash.

From there, the algorithm behaves deterministically. It fills out the final block that contains ;admin=true with new glue padding, then it processes the final 64-byte block, producing a hash. If the register state going into the processing of that final block was the same as it was when you overrode the register state, then the final result will be the same, and your message will authenticate.

Breaking Counter Mode Encryption Fri, 04 Jun 2021 21:29:34 +0000 The subject of today’s post is breaking counter mode encryption, which directly concerns three cryptopals challenges: challenge 19, challenge 20, and challenge 25. (And maybe more … I’m only as far as challenge 25 at this point.)

What is counter mode encryption? Counter mode encryption is a method of encryption in which the content of a message is XOR’d against an encrypted stream of bytes. It’s called counter mode because the non-encrypted stream of bytes is simply a unique nonce paired with a counter that is incremented and encrypted, block after block. Wikipedia has a great summary of what it is.
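The whole mode fits in a few lines of code. Here's a sketch (class and method names are mine; it follows the cryptopals layout of an 8-byte little-endian nonce followed by an 8-byte little-endian block counter) that builds the keystream with the JDK's own AES and XORs the message against it:

```java
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Hand-rolled CTR sketch: the keystream is AES-ECB applied to successive
// (nonce || counter) blocks, and encryption is a plain XOR against it.
// Decryption is the identical operation.
public class CtrSketch {
    public static byte[] ctr(byte[] key, long nonce, byte[] input) {
        try {
            Cipher aes = Cipher.getInstance("AES/ECB/NoPadding");
            aes.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
            byte[] out = new byte[input.length];
            for (int block = 0; block * 16 < input.length; block++) {
                byte[] counterBlock = new byte[16];
                putLongLE(counterBlock, 0, nonce);  // fixed nonce, first 8 bytes
                putLongLE(counterBlock, 8, block);  // incrementing counter, last 8 bytes
                byte[] keystream = aes.doFinal(counterBlock);
                int end = Math.min(input.length, block * 16 + 16);
                for (int i = block * 16; i < end; i++) {
                    out[i] = (byte) (input[i] ^ keystream[i % 16]);
                }
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    private static void putLongLE(byte[] b, int off, long v) {
        for (int i = 0; i < 8; i++) b[off + i] = (byte) (v >>> (8 * i));
    }
}
```

Because decryption is the same XOR, `ctr(key, nonce, ctr(key, nonce, msg))` round-trips back to `msg`.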

Look at the image at the top of this post. That’s the algorithm. Can you spot any weaknesses?

In the three previously mentioned challenges, the task is to decrypt a ciphertext without knowing the key. If you look carefully at that image, it should be relatively obvious that there are two weaknesses.

  • The block cipher is deterministic. If you can figure out what the nonce and counter are, you can produce the same keystream. That’s usually pretty hard if it’s done right, but it’s still a weakness.
  • If you can figure out what the keystream is (regardless of the nonce, counter and key), then you can figure out the plaintext. After all, the plaintext is just an XOR of the ciphertext and the keystream.

In challenges 19 and 20, multiple messages are built using the exact same keystream. This is a no-no. If you encrypt several messages with the same keystream, it becomes relatively simple to figure out the keystream. After all, for a single character in a message, there are only 256 possible bytes against which that character can be XOR’d to produce the ciphertext. When you know that the characters at the same position (say the first one) in several different messages were built via XOR against the same byte of keystream, you can find that keystream byte by trying all 256 possibilities and taking what you consider the “best one.”

How to find the best one? In challenge 19, they ask you to primarily use trial and error. Cycle through all 256 a few times for the first few columns, review the results, try some more bytes, and make more substitutions.

In challenge 20, they tell you to do this programmatically. Remember ETAOIN SHRDLU? Frequency analysis on each individual first, second, third, nth character in each of these messages is how I determined which was the best.

public abstract class AbstractFrequencyAnalyzingCTRKeyDeterminer {
    //without knowing the key, can we derive the keystream?
    // ciphertext block XOR keystream block = plaintext block
    // since we have the cipher text block, we just have to figure out what to xor against these texts to make them
    // legible

    final Chi chi = new Chi();
    final XOR xor = new XOR();

    public abstract void additionalManualTweaks(final byte[][] ciphertexts, final byte[] keyStream);

    public byte[] findTheKeyStream(final byte[][] ciphertexts) {
        //inefficient, but find the longest cipher length
        int maxLen = Arrays.stream(ciphertexts)
                .map(b -> b.length)
                .max(Integer::compareTo)
                .orElse(0);
        byte[] keyStream = new byte[maxLen];

        //get a column of letters in cipher text
        // gracefully pass by ciphertexts without letters in the column under examination
        for (int l = 0; l < keyStream.length; l++) {
            final byte[] temp = new byte[ciphertexts.length];

            int count = 0;
            for (byte[] ciphertext : ciphertexts) {
                if (l < ciphertext.length) {
                    temp[count++] = ciphertext[l];
                }
            }
            final byte[] byteColumn = new byte[count];
            System.arraycopy(temp, 0, byteColumn, 0, count);
            keyStream[l] = determineKeyByte(byteColumn);
        }

        //this function will manually place characters to resolve the full keystream
        additionalManualTweaks(ciphertexts, keyStream);

        return keyStream;
    }

    /**
     * use chi-squared scores to find the "best" byte for the key
     */
    private byte determineKeyByte(final byte[] byteColumn) {
        //get the most likely first byte, but print them all

        double lowestChiScore = Double.MAX_VALUE;
        Integer winner = null;
        for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++) {
            char[] xordFirstLetters = xor.singleKeyXOR(byteColumn, i);
            double localChi = chi.score(xordFirstLetters);
            if (localChi < lowestChiScore) {
                lowestChiScore = localChi;
                winner = i;
            }
        }

        assert winner != null;
        return (byte) winner.intValue();
    }
}

In challenge 25, they give you a new tool. They say, “Hey! You can now edit the original message via an edit function!” They tell you to build the keystream, encrypt the new text against the appropriate keystream bytes, and overwrite the original message. Then, if you decrypt the message, you’ll find your text beginning at the appropriate offset.

    public void edit(final byte[] cipherText, final int offset, final String newText) {
        // get keystream of length offset plus newText.length rounded up to block size
        final LittleEndianNonce nonce = new LittleEndianNonce();
        final int blockLength = nonce.get().length;
        final int newLength = ((newText.length() + offset) / blockLength) * blockLength + blockLength;
        final int numOfBlocks = newLength / blockLength;
        final byte[] keystream = new byte[newLength];
        for (int block = 0; block < numOfBlocks; block++) {
            var encryptedNonce = ecb.AES(nonce.get(), CipherMode.ENCRYPT);
            System.arraycopy(encryptedNonce, 0, keystream, block * encryptedNonce.length, encryptedNonce.length);
            nonce.increment(); //the counter half of the nonce must advance each block (method name assumed)
        }

        //get the portion of the keystream we actually care about
        final byte[] ktext = new byte[newText.length()];
        System.arraycopy(keystream, offset, ktext, 0, newText.length());

        // overwrite the ciphertext
        var newTextBytes = newText.getBytes();
        var sub = xor.multiByteXOR(newTextBytes, ktext);
        System.arraycopy(sub, 0, cipherText, offset, sub.length);
    }

But this is a fatal flaw. XOR has a property that dooms whoever thinks this might have been a good idea: XOR-ing a text against all zeros (and I mean binary zero –> [00000000], not character zero, which is actually 48 in binary –> [00110000]) leaves the original text. So, if you edit a ciphertext such that the entire ciphertext is overwritten with zeros, you’ll get the unaltered keystream.
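To see why, run the XOR through with zeros (the values here are arbitrary, illustrative ones):

```java
import java.util.Arrays;

// Editing the message to all binary zeros turns the ciphertext into the raw
// keystream, because 0 XOR k = k.
public class ZeroEditDemo {
    public static void main(String[] args) {
        byte[] keystream = {(byte) 0x9A, 0x27, (byte) 0xC3};
        byte[] plaintext = {'H', 'i', '!'};
        byte[] ciphertext = new byte[3];
        byte[] edited = new byte[3];
        for (int i = 0; i < 3; i++) {
            ciphertext[i] = (byte) (plaintext[i] ^ keystream[i]); // normal encryption
            edited[i] = (byte) (0x00 ^ keystream[i]);             // "edit" with zeros
        }
        System.out.println(Arrays.equals(edited, keystream)); // prints "true"
    }
}
```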

    /**
     * plaintext can be anything. Find an ascii art and use that if you want.
     */
    void recoverThePlainText() {
        final String plaintext = getPlainTextFromFile();
        final byte[] cipherText = ctr.encrypt(plaintext);

        //we can get the keystream out of the edit function by
        // passing in all 0s
        final String breakerString = new String(new byte[cipherText.length]);

        //copy the cipherText so we retain original
        final byte[] keystream = ArrayUtils.clone(cipherText);

        //edit the copy with the breaker string to get the keystream
        ctr.edit(keystream, 0, breakerString);

        //once we have the keystream, we can simply xor it against the cipherText to recover the plaintext
        final String broken = new String(xor.multiByteXOR(cipherText, keystream));

        assertEquals(plaintext, broken);
    }

Game over.

Of these three challenges, I probably enjoyed challenge 20 the most. It took me a bit to figure out that I could treat the set of messages they provided as a matrix, where every column in the matrix was XOR’d against the same character. Once I figured that out, it just became a single character XOR and frequency analysis. Challenge 25 was way too obvious, and I didn’t have the patience to sit there poking and plugging at challenge 19. I solved challenge 20 first, and then applied my solution to challenge 19.

Thanks for reading! -LH

Cloning a Mersenne Twister Random Number Generator from its output Wed, 26 May 2021 17:38:13 +0000 As I said in my last post, I’m doing cryptopals. Just last night I finished Challenge 23. I was able to successfully clone a 32-bit Mersenne Twister pseudorandom number generator (PRNG) from its output. You can see how I did this by checking out my solution in my github repo.

If you’re like me, your first look at this probably left you overwhelmed. So, I thought I would take a moment to try to explain how and why this works. I can’t do that, though, without acknowledging this blog post and its author. It was incredibly helpful in understanding how to reverse the tempering function of the Mersenne Twister.

Also, I want to address the question posed in the original problem:

How would you modify MT19937 to make this attack hard? What would happen if you subjected each tempered output to a cryptographic hash?

Let’s get going…


What’s interesting about the Mersenne Twister is that the PRNG maintains an internal array that manages the generator’s state. It has to do this to ensure that it doesn’t start repeating numbers. The period of a PRNG is how long it takes before the generator starts repeating numbers, and the primary perk of this particular PRNG is that it has a super long period: 2^19937−1. (That’s a huge number.)

When the PRNG gives you a “random number,” it pulls a number out of its state array and subjects that number to “tempering.” What that means is that the number is subjected to several bitwise mathematical operations, and is then returned. Here’s what that looks like in the version of the twister that I implemented. (The capital letters are constants … Pay attention to the two private functions.)

    /**
     * temper a value
     * @param untempered the value to be tempered
     * @return the tempered value
     */
    int temper(int untempered) {
        int y = temperRightShift(untempered, U, D);
        y = temperLeftShift(y, S, B);
        y = temperLeftShift(y, T, C);
        y = temperRightShift(y, L, FULL_MASK);
        return y;
    }

    private int temperRightShift(final int y, final int shift, final int mask) {
        return y ^ ((y >>> shift) & mask);
    }

    private int temperLeftShift(final int y, final int shift, final int mask) {
        return y ^ ((y << shift) & mask);
    }

There are two “versions,” so to speak, of transformation: a left shift operation and a right shift operation. In both cases, the original value is bit-shifted, then AND’d against a mask, and then XOR’d against the original.

What’s not obvious

If you haven’t yet, you really should go check out this article. If you have, I’m going to try to do this in a way that is perhaps a bit clearer than he put it. I’m going to use the number -252645136, which when written out in binary, looks like this.

1111 0000 1111 0000 1111 0000 1111 0000

Nice and easy to see what’s going on when the number pattern is super obvious. Let’s perform and then undo the third transformation using this number: y := y xor ((y << s) and b), where y is our number, s is the shifting constant 7, and b is the bitmask 0x9D2C5680.

Let’s start with the first operation: Shift our number 7 digits to the left and AND it against b. Let’s call the result y’. An asterisk is an original part of the number, and a – is something that came in as a result of a shift.

  **** **** **** **** **** **** *--- ----     
  0111 1000 0111 1000 0111 1000 0000 0000  <-- 7-bit shift left
& 1001 1101 0010 1100 0101 0110 1000 0000  <-- 0x9D2C5680
  0001 1000 0010 1000 0101 0000 0000 0000  <-- y'

Gonna pause there before I do the next operation. Look at the last 7 bits of y’. It’s all 0s. If you’ve ever done XOR before, you know that XOR-ing something against 0 will leave you with what you started with. It’s the same as AND-ing something against a 1. No change. And, the reason that those figures are 0 is because of the shift. (If the original number ended in all 1’s, the last 7 bits of the shifted number would still all be 0. For this reason, I said 7 bits and not 12.)

This means a piece of the original number is going to remain in our final result. That is important, but it’s not super obvious!

Let’s do the next operation: XOR the result against the unmodified original. Let’s call the result z, like bloglien does.

  1111 0000 1111 0000 1111 0000 1111 0000  <-- y
X 0001 1000 0010 1000 0101 0000 0000 0000  <-- y' 
  1110 1000 1101 1000 1010 0000 1111 0000  <-- z

Now, to undo this. If you didn’t know, XOR is reversible: q XOR r = t, and t XOR r = q. This means that all we have to do to recover the original is recover y’.
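That reversibility is easy to convince yourself of:

```java
// q XOR r = t, and t XOR r = q: XOR-ing twice with the same value is a no-op.
public class XorReversible {
    public static void main(String[] args) {
        int q = 0xF0F0F0F0;  // the example number from above
        int r = 0x9D2C5680;  // the tempering constant b
        int t = q ^ r;
        System.out.println((t ^ r) == q); // prints "true"
    }
}
```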

But how can we do that? y’ is gone once the operation is done. Is it recoverable?

Yes, it is, but only because you retain a piece of the original y in z. That’s what’s not obvious at first glance.

Recovering y’ and y

In order to recover the original y, y’ is needed. How does this work? Bloglien calls it “waterfalling.” I call it “redoing the original operation 7 bits at a time.” As I do this, I’m going to leave out what I call “extraneous bits” (which after masking are always 0). If they’re not of immediate concern, I’m just not going to write them. I will, however, keep them vertically aligned with the other figures. If you want to pull a piece of paper out and fill out every bit, feel free.

Step one: Mask the portion of z that contains the unaltered portion of y.

  1110 1000 1101 1000 1010 0000 1111 0000 <-- z
& 0000 0000 0000 0000 0000 0000 0111 1111 <-- mask the lowest 7 bits
                                 111 0000 <-- what we are left with

Step two: Just like we did to the original y, shift 7 bits to the left and AND that against b (the masking constant), and poof! We have 7 bits of y’.

                        11 1000 0         <-- lowest 7 bits shifted left 7 bits
                      & 01 0110 1         <-- 7 bits of b in these positions
                        01 0000 0         <-- 7 bits of recovered y'

Step three: XOR the recovered y’ against z to recover 7 more bits of y.

                        10 0000 1         <-- 7 bits of z
                    XOR 01 0000 0         <-- 7 bits of recovered y'
                        11 0000 1         <-- 7 more bits of y

Step four: REPEAT! You have 7 more bits of y. Mask them, shift them, AND against b to recover 7 more bits of y’, and then XOR against z to recover 7 more bits of y.

               1 1000 01                  <-- masked and shifted
             & 0 1100 01                  <-- b
               0 1000 01                  <-- recovered y'
           XOR 1 1000 10                  <-- z
               1 0000 11                  <-- y

       1000 011                           <-- y, masked and shifted
     & 1101 001                           <-- b
       1000 001                           <-- recovered y'
   XOR 1000 110                           <-- z
       0000 111                           <-- recovered y

  0111                                    <-- y, masked and shifted (we lose 3 bits)
& 1001                                    <-- b
  0001                                    <-- recovered y'
X 1110                                    <-- z
  1111                                    <-- recovered y

Now, take all the bits of recovered y and concatenate them, and you get:

 1111 0000 1111 0000 1111 0000 1111 0000  <-- the original y

That’s the algorithm. Mask, shift, AND, XOR, Repeat. I could redo this for a right shift to show you how it works in the opposite direction, but it’s the same except that you retain the most significant bits of a number instead of the least significant bits. Also, the right shift is less difficult than the left because the shifting constants are larger, meaning you run out of room more quickly. (With the fourth transformation, you don’t even need to repeat. It’s one and done.)
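To check the walkthrough mechanically, here's a self-contained version of the third transformation (s = 7, b = 0x9D2C5680) and its inverse, applied to the example number (names are mine):

```java
public class UntemperDemo {
    static final int S = 7;           // shift constant s
    static final int B = 0x9D2C5680;  // mask constant b
    static final int W = 32;          // word size in bits

    public static int temperLeftShift(int y) {
        return y ^ ((y << S) & B);    // y := y xor ((y << s) and b)
    }

    // "waterfall" the inverse: the low S bits of z are untouched y bits, so
    // redo the original operation S bits at a time, lowest bits first
    public static int unTemperLeftShift(int z) {
        int blockMask = (1 << S) - 1; // mask for the lowest S bits
        int zp = z;
        for (int shift = 0; shift < W - S; shift += S) {
            zp = zp ^ (((zp & (blockMask << shift)) << S) & B);
        }
        return zp;
    }

    public static void main(String[] args) {
        int y = 0xF0F0F0F0;           // the walkthrough's example number
        System.out.println(unTemperLeftShift(temperLeftShift(y)) == y); // prints "true"
    }
}
```

The inverse works for any input, not just the friendly bit pattern used in the walkthrough.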

In code

So how does “untempering” look when you put it in code? Well, pretty close to tempering, but with a few gotchas.

The first line in each unTemper function calls on a method to convert an integer into a bitmask. In other words, 7 becomes 0000 0000 0000 0000 0000 0000 0111 1111 in convertIntToRightEndMask. The definition for this can be found here.

From there, the mask is moved horizontally by shift bits, starting with 0 (since blockMaskShift starts at 0). Then we AND it against z, which allows us to get the original bits that stayed from the previous transformation all by themselves. Then we shift those bits, mask them against the constant, and XOR it against z, and repeat. It works for left or right shifts the exact same. Plug in the right inputs (which come from the MT spec), and you can undo tempering.

    int unTemper(final int tempered) {
        int y = unTemperRightShift(tempered, L, FULL_MASK);
        y = unTemperLeftShift(y, T, C);
        y = unTemperLeftShift(y, S, B);
        y = unTemperRightShift(y, U, D);
        return y;
    }

    private int unTemperRightShift(final int z, final int shift, final int maskConst) {
        final int blockMask = convertIntToLeftEndMask(shift);
        int zp = z;
        for (int blockMaskShift = 0; blockMaskShift < (W - shift); blockMaskShift += shift) {
            zp = zp ^ (((zp & (blockMask >>> blockMaskShift)) >>> shift) & maskConst);
        }
        return zp;
    }

    private int unTemperLeftShift(final int z, final int shift, final int maskConst) {
        final int blockMask = convertIntToRightEndMask(shift);
        int zp = z;
        for (int blockMaskShift = 0; blockMaskShift < (W - shift); blockMaskShift += shift) {
            zp = zp ^ (((zp & (blockMask << blockMaskShift)) << shift) & maskConst);
        }
        return zp;
    }

Clone the twister

Once you’ve got past the hard part of undoing the twister, all you need is a sample of 624 outputs from a Mersenne Twister PRNG to clone it. You just “untemper” those outputs, overwrite the PRNG’s internal state array with the untempered values, and boom. You’re done. See here.

Addressing their question

So, to go back to the authors’ original question about this…

How would you modify MT19937 to make this attack hard? What would happen if you subjected each tempered output to a cryptographic hash?

Good question.

If you subject the state value to a cryptographic hash at any point during tempering, you wipe out the original bits tied to the state value, which makes it super difficult to recover inasmuch as the cryptographic hash is super difficult to crack (which they generally are). At that point, you’d basically be left with brute force, trying at most 2^32-1 different permutations to see if an untempered value matches the cryptographic hash. This would effectively make the PRNG far more secure since you wouldn’t be able to easily undo it.

In sum, if you get rid of the original bits from the untempered inputs, then you make it super hard to recover the untempered value tied to the PRNG’s state array.

This is cool

At first, this sort of blew my mind, but as I worked through it a few times in code and on paper, it ended up being one of the most fun challenges to complete in cryptopals. That’s the reason behind this lengthy write-up: I really got excited about this problem. I hope that others are just as fun to crack in the future.

Peace! -LH

I’m doing cryptopals Fri, 21 May 2021 22:54:02 +0000 Cryptography fascinates me. It’s amazing how critical cryptography is to the internet and the digital economy. Even more amazing to me is how simple it is to crack if it’s insecure.

I don’t have a computer science degree; I took some courses on algorithmic design in college, but felt so totally lost and overwhelmed that I changed my major. Several years later, I’m in the mood to learn some more.

In an effort to better understand cryptography in general, and as a personal project, I started doing the cryptopals challenges. I took a brief hiatus during the summer and fall due to a remodel and a move, but now that those things are done, I’m back at it.

To date, I’ve made it through roughly two and a half of their eight sets. They’re not super easy and take some time to crack. You can check my progress at my github repo.

I don’t think it’s worthwhile to share the detailed solutions to each problem. That’s all over the internet. Mostly, I want to just fill in the gaps where the cryptopals writers leave them… because a lot of things are not at all clear as you read the problem. (The most recent example that comes to mind is the sideways mention of little endian nonces in challenge 18.) I also will probably share code snippets from time to time. Who knows.

To kick it off, I want to draw attention to challenge 3…

Single-byte XOR cipher

The hex encoded string...


... has been XOR'd against a single character. Find the key, decrypt the message.

You can do this by hand. But don't: write code to do it for you.

How? Devise some method for "scoring" a piece of English plaintext. Character frequency is a good metric. Evaluate each output and choose the one with the best score.

Achievement Unlocked
You now have our permission to make "ETAOIN SHRDLU" jokes on Twitter.

ETAOIN SHRDLU is an overt reference to frequency analysis: the study of how often each letter appears in English text. Most people know E is the most frequent letter in the English language. Few know anything beyond that. A little googling here and there can tell you what the rest of the letters are in order of frequency.

Knowing the frequencies is one thing. Testing how well a candidate decryption fits the expected letter frequencies is quite another. Fortunately, there is a statistical test that can be applied: the χ² (chi-squared) goodness-of-fit test. I implemented such a test in this class.
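Here's what a minimal χ² scorer can look like (my own sketch, not the class linked above; the frequency table holds commonly cited approximate English letter percentages):

```java
public class ChiSquared {
    // Approximate English letter frequencies for a-z, as percentages.
    // These are the commonly cited values; illustrative, not authoritative.
    private static final double[] FREQ = {
        8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77, 4.03,
        2.41, 6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98, 2.36, 0.15,
        1.97, 0.07
    };

    /** Lower score = closer fit to English letter frequency. */
    public static double score(char[] text) {
        int[] counts = new int[26];
        int letters = 0;
        for (char c : text) {
            char lower = Character.toLowerCase(c);
            if (lower >= 'a' && lower <= 'z') {
                counts[lower - 'a']++;
                letters++;
            }
        }
        if (letters == 0) return Double.MAX_VALUE; // nothing English-like at all
        double chi = 0;
        for (int i = 0; i < 26; i++) {
            double expected = letters * FREQ[i] / 100.0;
            double diff = counts[i] - expected;
            chi += diff * diff / expected; // sum of (observed - expected)^2 / expected
        }
        return chi;
    }
}
```

English-looking candidates score low; gibberish full of rare letters scores enormously high, which is exactly the signal the single-byte XOR search needs.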

Understanding chi-squared tests, and having a good implementation of such a test, comes in handy in several challenges following this one. Challenges 4, 6 and 20 depend on it to validate a candidate decryption. So take your time on this one. While writing this, I actually improved my implementation, so that’s good.

You can see what I did for my solution here.

Peace! -LH

Spring Batch Testing: Asserting file equality after running a single step Wed, 20 Jan 2021 14:16:46 +0000 For some time at SoFi, we’ve worked with Spring Batch to integrate with a third-party service that lacks a robust API but loves to work in terms of batch files.

There are a number of ways to deal with that, and we’ve taken a few different approaches. One is a Spring Batch application that parses incoming files from the third party, logs their content, and turns that content into a data stream consumable by other applications. The same little application also takes data we persist and constructs flat files the third-party vendor can consume.

Although implementations of Spring Batch can vary widely (it is a pretty flexible system, after all), the team had a concern about littering the file system with unnecessary files hundreds of megabytes in size at the end of every run. So, in our implementation, part of every “outgoing job” deletes the file from the local file system after it’s built and shipped off to the vendor.

It quickly became obvious that, with this step involved in our batch jobs, testing modifications to our files was going to be a problem. End-to-end testing of a batch job is relatively straightforward. But rather than end-to-end testing, what we needed was a way to test a single step: the step that captures data from our model and turns it into a file.

Here’s what we came up with. (Note: To protect SoFi IP, I’ve sanitized this pretty heavily. But I think that this code is representative of what was actually implemented.)

import static org.junit.jupiter.api.Assertions.*;
import static org.springframework.batch.test.AssertFile.assertFileEquals;
import static project.constants.GENERATE_PLACEMENT_FILE;

import java.io.File;
import org.jetbrains.annotations.NotNull;
import org.junit.jupiter.api.*;
import org.springframework.batch.core.*;
import org.springframework.batch.test.JobLauncherTestUtils;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.context.TestConfiguration;
import org.springframework.context.annotation.Bean;

@SpringBootTest
public class GeneratePlacementFileStepIntegrationTests {

    private static final String OUTPUT_FILE_LOCATION = "build/test_output/generate_placement_file_step_test_results.txt";
    private static final String ASSERTION_FILE_LOCATION = "src/test/resources/placement/generate_file_step_expected_test_results.txt";

    @Autowired
    private JobLauncherTestUtils jobLauncherTestUtils;

    @TestConfiguration
    static class SingleStepConfig {
        @Bean
        public JobLauncherTestUtils getJobLauncherTestUtils() {
            return new JobLauncherTestUtils() {
                @Autowired
                @Override
                public void setJob(@NotNull Job generateAndUploadPlacementFile) {
                    super.setJob(generateAndUploadPlacementFile);
                }
            };
        }
    }

    @BeforeAll @AfterAll
    static void removeFileIfExists() {
        var file = new File(OUTPUT_FILE_LOCATION);
        if (file.exists()) {
            file.delete();
        }
    }

    /**
     * this test will put data in the database
     * then it will run the step in the batch that builds a file
     * and will compare the built file to a manually built file that matches expected behavior
     */
    @Test
    void insertDataAndRunStepAndTestFileEquality() throws Exception {
        assertFalse(new File(OUTPUT_FILE_LOCATION).exists());

        //build the params for this test execution
        final var params = new JobParametersBuilder().toJobParameters();
        JobExecution je = jobLauncherTestUtils.launchStep(GENERATE_PLACEMENT_FILE, params);
        assertEquals(ExitStatus.COMPLETED, je.getExitStatus());

        //make sure the file looks like it should
        assertFileEquals(new File(ASSERTION_FILE_LOCATION),
                new File(OUTPUT_FILE_LOCATION));
    }
}
If you look at the Spring Batch Reference Documentation on end-to-end testing, they have the @SpringBatchTest annotation, and as part of that annotation, a JobLauncherTestUtils bean gets made available for you. This bean can run single steps in a batch job independently of the job as a whole… just what I wanted.

For us, though, going with @SpringBatchTest borked up our H2 data structure and didn’t spin up the JPA beans necessary to interact with our data layer. So, instead, we had to go with @SpringBootTest to spin up the full application context. This made all the JPA beans and the entire Hibernate data layer available, but left us without a JobLauncherTestUtils bean to run a single step.

So I made my own. The key here is this little snippet:

    @TestConfiguration
    static class SingleStepConfig {
        @Bean
        public JobLauncherTestUtils getJobLauncherTestUtils() {
            return new JobLauncherTestUtils() {
                @Autowired
                @Override
                public void setJob(@NotNull Job generateAndUploadPlacementFile) {
                    super.setJob(generateAndUploadPlacementFile);
                }
            };
        }
    }

This test configuration made the test-utils bean available for this job. I then autowired the bean into the test and used it. Voila! The test ran the SQL scripts that inserted my baseline and use-case-specific data into the H2 db, and then ran the step that extracted that data and produced a file.

The last step is to test file equality.

        assertFileEquals(new File(ASSERTION_FILE_LOCATION),
                new File(OUTPUT_FILE_LOCATION));

The best part about this test is that in order to test future changes to this particular batch step, I don’t have to make changes to the test itself. I just have to make changes to the data state that gets built at the start and changes to the ASSERTION_FILE.

I thought this was a robust solution to our problem. I intend to repeat this testing pattern for other file-based batch jobs in the future.

Beware Hibernate’s caching when using database filters Tue, 30 Apr 2019 14:08:03 +0000

The stack I work in every day uses Hibernate and Spring Data JPA for its object/relational mapping framework. My company is hardly alone in using these tools to map data from a database into Java objects. They’re quite commonly used, and also quite powerful.

One of the nifty features of Hibernate is filtering. You can put a filter definition on one of your entity classes, and then use that filter to dynamically alter the SQL that is actually issued against the database. Doing so, you can ignore data you know you won’t want or won’t need.

We’ve used this database filter mechanism in an interesting, object-oriented way. First, we define the filter on one of our entities. Note the @FilterDef annotation.

@Entity
@Table(name = "...obfuscated...")
@FilterDef(name = "effective", defaultCondition = "((current_date between start_date and end_date) or " +
                                                      "(start_date <= current_date and end_date is null))")
@Filter(name = "effective")
public class Entity implements AuditableText, Serializable, Comparable<Entity> { ... }

Then, create an abstract class that will leverage the Hibernate entity manager to enable or disable a filter.

import org.hibernate.Session;

import javax.inject.Inject;
import javax.persistence.EntityManager;

public abstract class MyAwesomeFilter {
    @Inject protected EntityManager em;

    public void enable() {
        em.unwrap(Session.class).enableFilter(getFilterName());
    }

    public void disable() {
        em.unwrap(Session.class).disableFilter(getFilterName());
    }

    abstract String getFilterName();
}

You can see that this class is mostly a wrapper for some methods on the hibernate entity manager.

Because it’s an abstract class, we need an implementation that is attached to the filter we defined and attached to one of our entities.

import org.springframework.context.annotation.Scope;

import javax.inject.Named;

@Named
@Scope("prototype")
public class EffectiveFilter extends MyAwesomeFilter {
    String getFilterName() {
        return "effective";
    }
}
Now, when you open a transactional window into the database and start retrieving data, all you have to do in the call path is specify whether or not you want this filter enabled. 

Notice as well that because this is a named object, it can be injected via Spring’s DI framework into whatever class it’s needed.

public class AwesomeServiceClass {

    private final EffectiveFilter effective;
    private final EntityRepo repo;

    public AwesomeServiceClass(EffectiveFilter effective, EntityRepo repo) {
        this.effective = effective;
        this.repo = repo;
    }

    private Entity getMyEntity(final int entityId) {
        //turn the filter on before hitting the database so out-of-date rows are excluded
        effective.enable();
        return repo.findOne(entityId);
    }
}

This has caused us so much frustration

While it’s cool that you can flip a filter on or off at will, it has also caused a ton of consternation for one simple reason: it’s very easy to lose track of whether the filter is on when a call is issued. Hibernate and other ORMs are supposed to let you think less about the SQL executed against the database, but once you start down the filter path (especially one you can toggle), your call path can fill up with places where the filter is turned on, turned off, and then turned on again.

Which leads us to … the Hibernate cache!

To minimize the number of round trips it has to make to the database, Hibernate (and several other ORMs; I know .NET’s Entity Framework does this too) will cache the results of queries within the JVM. The benefit is obvious: fewer round trips to the database mean better performance.

But in a filtering context, this can give you nightmares.

What if, at the beginning of an execution path, you need data with the filter turned on, but later in the path, you need that same data but with the filter turned off? As I recently discovered (the hard way), you’re sort of hosed.

Hibernate, for all its advantages, will recognize that you already retrieved the entity on the first go-round (with the filter on). Even though you’ve disabled the filter for the second go-round, Hibernate will serve the entity from its cache before it goes back to the database for the data you actually want.
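The mechanism at fault is essentially an identity map keyed by entity id. A toy model (nothing to do with Hibernate’s real internals, just an illustration of identity-map caching) shows why disabling the filter doesn’t help once the entity is already in the cache:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Toy model of a session-level identity map: once an "entity" is loaded,
// later lookups for the same id return the cached instance, even though the
// loader (here, standing in for the filtered/unfiltered SQL) has changed.
class ToySession {
    private final Map<Integer, String> firstLevelCache = new HashMap<>();

    String find(int id, IntFunction<String> database) {
        // only consults the database if the id has never been loaded
        return firstLevelCache.computeIfAbsent(id, database::apply);
    }
}
```

With this toy, a first lookup through a “filtered” loader poisons every later lookup: the second loader never runs, and you keep getting the filtered result.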

We ended up getting the data we were after via a more direct route: an explicit SQL query attached to a Spring Data repository interface. (In other words, we forced a trip to the database.)
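That workaround might look something like the sketch below. The repository, table, and column names are hypothetical (the real query was SoFi-specific); the idea is that a native query returning raw columns, rather than managed entities, forces the SQL to run and sidesteps the instances already sitting in the session.

```java
import java.util.List;

import org.springframework.data.jpa.repository.Query;
import org.springframework.data.repository.CrudRepository;
import org.springframework.data.repository.query.Param;

public interface EntityRepo extends CrudRepository<Entity, Integer> {

    // Hypothetical example of forcing a database trip: a native query that
    // returns scalar columns instead of managed entities, so neither the
    // filter state nor the session cache can get in the way
    @Query(value = "select id, start_date, end_date from obfuscated_table where id = :id",
            nativeQuery = true)
    List<Object[]> findRawRowById(@Param("id") int id);
}
```

This trades some of the ORM’s convenience for predictability, which in this case was exactly the point.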