How might two people communicate without others even knowing they’re communicating? They could be communicating to harm some entity while being observed by that entity.[1] Because of this, they want to send a message whose very presence others can’t detect.
What method can we use to communicate without others even knowing we’re talking? The first and most obvious approach is to use the least significant bit to encode your message. That is, this method takes an image and hides the message in the least significant of the 8 bits of each pixel value. This process is best shown with an image:
This hardly changes the value of the pixel color, and it can encode a message (given enough pixels). The code for this approach would be:
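A minimal sketch of least-significant-bit embedding, assuming the image is an 8-bit NumPy array and the message is UTF-8 text. The function names (`text_to_bits`, `bits_to_text`, `embed_lsb`, `extract_lsb`) are hypothetical and chosen only for illustration:

```python
import numpy as np

def text_to_bits(text):
    """Turn a string into a flat array of 0/1 values (8 bits per byte)."""
    return np.unpackbits(np.frombuffer(text.encode("utf-8"), dtype=np.uint8))

def bits_to_text(bits):
    """Inverse of text_to_bits."""
    return np.packbits(bits).tobytes().decode("utf-8")

def embed_lsb(image, bits):
    """Write `bits` into the least significant bit of the first pixels."""
    bits = np.asarray(bits, dtype=np.uint8)
    flat = np.asarray(image, dtype=np.uint8).flatten()  # flatten() copies
    if bits.size > flat.size:
        raise ValueError("message needs more pixels than the image has")
    # Clear the lowest bit (AND with 1111 1110), then set it to the message bit.
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
    return flat.reshape(np.shape(image))

def extract_lsb(image, n_bits):
    """Read the lowest bit of the first `n_bits` pixels."""
    return np.asarray(image, dtype=np.uint8).ravel()[:n_bits] & 1

# Illustrative cover image: random 8-bit noise (an assumption, not a real photo).
rng = np.random.default_rng(0)
cover = rng.integers(0, 256, size=(16, 16), dtype=np.uint8)

secret = "hi there"
bits = text_to_bits(secret)
stego = embed_lsb(cover, bits)
recovered = bits_to_text(extract_lsb(stego, bits.size))
max_change = np.max(np.abs(stego.astype(int) - cover.astype(int)))
```

Each pixel value changes by at most 1 out of 255, which is why the change is invisible to the eye. The receiver has to know how many bits to read; here that length is simply passed in.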
This method can even be made secure: both the communicator and receiver can randomly shuffle the pixels of their image in the same way using a key they both know (i.e., with np.random.seed(key)). This makes it an attractive method, especially if you’re communicating something you’d like to remain secret and know you’re being watched.
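The shared-key shuffle could be sketched as follows. This is an assumption-laden illustration: it uses `np.random.default_rng(key)` in place of the `np.random.seed(key)` call mentioned above, and the names `embed_keyed` and `extract_keyed` are hypothetical:

```python
import numpy as np

def embed_keyed(image, bits, key):
    """Hide `bits` in the LSBs of pixels visited in a key-dependent order."""
    bits = np.asarray(bits, dtype=np.uint8)
    flat = np.asarray(image, dtype=np.uint8).flatten()  # flatten() copies
    if bits.size > flat.size:
        raise ValueError("message needs more pixels than the image has")
    # Both sides derive the same pixel order from the shared key.
    order = np.random.default_rng(key).permutation(flat.size)
    slots = order[:bits.size]
    flat[slots] = (flat[slots] & 0xFE) | bits
    return flat.reshape(np.shape(image))

def extract_keyed(image, n_bits, key):
    """Read the LSBs back in the same key-dependent order."""
    flat = np.asarray(image, dtype=np.uint8).ravel()
    order = np.random.default_rng(key).permutation(flat.size)
    return flat[order[:n_bits]] & 1

# Illustrative data: a random cover image and a random 100-bit message.
rng = np.random.default_rng(1)
cover = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
message = rng.integers(0, 2, size=100, dtype=np.uint8)

stego = embed_keyed(cover, message, key=1234)
right_key = extract_keyed(stego, message.size, key=1234)
wrong_key = extract_keyed(stego, message.size, key=4321)
```

With the right key the bits come back in order; with a different key the reader visits different pixels and gets what looks like noise.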
This method succeeds, at least by visual inspection. By looking at the images, it’s impossible to tell if a message has been sent.
However, this method fails under more rigorous testing. If we plot a histogram of the results, we can see a strange pattern occurring on the odd/even values:
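One way to put a number on that odd/even pattern is to compare the histogram counts of each even value with its odd neighbor. The sketch below is a hypothetical check, not the plot from this post: the name `pair_imbalance` and the synthetic exponentially distributed "cover" are assumptions made for illustration.

```python
import numpy as np

def pair_imbalance(pixels):
    """Average mismatch between the counts of values 2k and 2k+1."""
    hist = np.bincount(np.asarray(pixels, dtype=np.uint8).ravel(), minlength=256)
    even, odd = hist[0::2], hist[1::2]
    return np.abs(even - odd).sum() / hist.sum()

rng = np.random.default_rng(0)

# Synthetic cover with a sloped histogram, so neighboring values differ in count.
cover = np.minimum(rng.exponential(4.0, size=100_000), 255).astype(np.uint8)

# Overwrite every least significant bit with a random message bit.
message = rng.integers(0, 2, size=cover.size, dtype=np.uint8)
stego = (cover & 0xFE) | message

before = pair_imbalance(cover)
after = pair_imbalance(stego)
print(f"pair imbalance before: {before:.4f}, after: {after:.4f}")
```

Overwriting the lowest bit with message bits pushes the counts of 2k and 2k+1 toward each other, so the imbalance drops after embedding. A histogram that is suspiciously even across these pairs is a hint that something has been hidden.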
This method works and withstands uploading to imgur because the images are saved as PNGs, a lossless format. However, if the image is compressed to JPG to save space, the least significant bit will be corrupted because JPG compression is lossy.
Instead of the least significant bit method, let’s hide our message along the edges of an image. This is where the values are quickly changing; flipping a bit there is a small change on top of a much larger one. This means that we’ll only be able to hide a smaller message, since we can only hide it near the edges. Earlier, we made no restrictions on where we could hide the message.
To do that, we’ll use the wavelet transform. You don’t have to know this in detail, just know two things about the wavelet transform:
- It is a unique way of representing an image. It’s possible to go back and forth between the wavelet transform and the image.
- The edges of the image are represented by values that are large in magnitude in the transform. We’ll be changing the least significant bit of these large values that characterize the image.
To do this, we would find the wavelet coefficients that are larger than some threshold (the threshold would have to be known on both sides). Then we could find the support of a wavelet coefficient and change those values. By the definition of the wavelet transform, this would correspond to changing pixel values that are near edges.
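To make the two properties and the thresholding step concrete, here is a small sketch using a hand-written one-level 2D Haar transform. The choice of the Haar wavelet, the threshold of 50, and the names `haar_forward`, `haar_inverse`, and `edge_support` are all assumptions for illustration; the wavelet actually used is not specified above.

```python
import numpy as np

def haar_forward(x):
    """One-level 2D Haar transform on non-overlapping 2x2 blocks."""
    x = np.asarray(x, dtype=float)
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    approx = (a + b + c + d) / 2
    horiz = (a - b + c - d) / 2   # left/right difference (vertical edges)
    vert = (a + b - c - d) / 2    # top/bottom difference (horizontal edges)
    diag = (a - b - c + d) / 2
    return approx, horiz, vert, diag

def haar_inverse(approx, horiz, vert, diag):
    """Exact inverse of haar_forward."""
    out = np.empty((2 * approx.shape[0], 2 * approx.shape[1]))
    out[0::2, 0::2] = (approx + horiz + vert + diag) / 2
    out[0::2, 1::2] = (approx - horiz + vert - diag) / 2
    out[1::2, 0::2] = (approx + horiz - vert - diag) / 2
    out[1::2, 1::2] = (approx - horiz - vert + diag) / 2
    return out

def edge_support(x, threshold):
    """Boolean pixel mask: support of detail coefficients above threshold."""
    _, horiz, vert, diag = haar_forward(x)
    big = np.maximum(np.abs(horiz), np.maximum(np.abs(vert), np.abs(diag))) > threshold
    # Each Haar coefficient here covers exactly one 2x2 block of pixels.
    return big.repeat(2, axis=0).repeat(2, axis=1)

# Test image: dark on the left, bright on the right, with the edge at column 3.
image = np.full((8, 8), 10.0)
image[:, 3:] = 200.0

roundtrip = haar_inverse(*haar_forward(image))
mask = edge_support(image, threshold=50)
```

For this image only the 2x2 blocks straddling the edge (columns 2 and 3) produce a large coefficient, so `mask` selects just those pixels as candidates for hiding bits. The threshold must be agreed on by both sides, as noted above.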
When I first implemented this, I didn’t find the support of each term and instead changed the value of coefficients larger than some threshold. Regardless, plotting the histograms shows us that if we change values near the edges, our message is better hidden in the histogram:
By visual inspection, I can’t tell these two curves apart without knowing the other curve. This is exactly what this method hopes to achieve. It’s impossible to recover the message without knowing the message exactly – if you knew the message exactly, why would you go to the work of recovering the message?
To detect data-hiding methods like these, which hide the fact that two parties are communicating, agencies that intercept these communications might try a suite of commonly used methods to decode the message.
[1] I’m sure you can imagine more situations where other, more nefarious people are communicating and know they’re being watched.