Trying to see ECB-encrypted image shadows

by Landon | Jun 21, 2024 | Leave a comment

It’s been a couple years since I started working on the cryptopals project. But, two years later, I am returning to this project hopefully to finish it all the way through.

Given the time that has elapsed since I started cryptopals in earnest, I thought it would be a good idea to go back through earlier challenges and solidify my understanding of what I wrote and how it works. In that sense, many of the blog posts that I have written previously have been very helpful.

But, if you notice, there isn’t very much that I wrote on the earlier challenges in the series. This is because I didn’t even have the idea to write about my journeys through cryptopals until I was finished with set no. 4.

Anyway, today I want to discuss challenge 7 and challenge 10. These two challenges require you to implement ECB and CBC symmetric encryption. In challenge 7, they tell you libraries are fine to use, but in challenge 10, they tell you to basically implement the thing yourself, leaning on libraries that provide ECB (I used javax.crypto).

If you research ECB much (wikipedia is probably sufficient for our needs here), you can see examples of images encrypted with ECB vs. CBC or other cipher methods. In some of these images, you can see shadows of the original. This obviously is not ideal in a world where you’re trying to conceal a message (which in this case is a visual message). If someone can discern your image after encryption, is the encryption worth much of anything? Probably not.

In any case, two years later, I decided that I wanted to see if my implementations of ECB would betray itself in the form of showing image shadows post-encryption. I’m happy(?) to say that yes, it does.

ECB Encryption on an image with uniform colors: Before and after

So, why? Why is it that an ECB-encrypted image shows shadows of the original, especially in circumstances where there are a lot of uniform colors?

To understand this, you need to understand a common term in computer science: deterministic (or determinism). Something in computers is deterministic when, given identical inputs, you get identical outputs.

You can think of this in terms of a manufacturing plant. If I go to a Frito-Lays plant, I can see a bunch of inputs that go into the process of creating Cool Ranch Doritos: spices, corn flour, MSG, the plastic materials for the bags, glue to seal the bags, water, etc., etc. All of these inputs go into a system, much of which is mechanized. At the end of that system, out pops a tightly sealed bag of the greatest snack food known to man. In a system like that, there may be a slight variation in the number of chips, but the bag is generally always full to about the same degree, and the bag sizes are always the same.

With computers, it’s much more rigid. Imagine if there was a plant that always produced the exact same bag of chips with no variation at all. That’s what computers give you. You give computer systems inputs, and if they are programmed deterministically, they will always give you the same outputs.

What makes ECB so easy to break is that it has this deterministic property. You give it the same input, it will give you the same output everytime. For example, if I give an ECB encryption algorithm a series of three white pixels from an image, it transform it into a blue, green and yellow pixel. and it will do that for every series of three white pixels. So, if you have an image with several white pixels in a row, a pattern quickly and easily emerges (as you can see above).

This property of ECB actually is sort of implied in what the acronym ECB stands for: Electronic Code Book. This hearkens back to the days before computers when secret messages were encrypted using code books that were clandestinely distributed to parties between whom messages needed to be sent. One party would change words, numbers, letters, or phrases according to the code book. The other would receive the message, and then would undo that encryption by changing known codes with their original meangings.

The ECB algorithm is very similar to this. Here’s essentially what it does:

  • Split your message into blocks of a certain length (which is why it’s called block cryptography). In this case, the block length is 16 bytes (or 128 bits).
  • Take a user-supplied key, and transform a block using a reversible encryption algorithm (cryptopals suggests AES, the implementation of which is beyond the scope of this post) with the key as an input to that transformation.
  • Repeat for every block in the message.

Then, when you want to decrypt your message, you do the same thing, but instead of applying the encryption, you reverse it. As long as the key is the same, you end up with the same original message.

There are other cipher modes that do this also, but in a way that inject pseudo-randomness into the encryption such that, given an image, at least the message isn’t nearly so discernible post-encryption. (CBC is one such approach, and it’s the subject of challenge 10).

The same image subjected to CBC encryption

Hopefully you learned something interesting! Thanks for stopping by. 🙂