When we blur red and green, we get this:

Why? I do not expect the brownish color in the middle: I don’t think it’s half red and half green.

I’m not an expert but I should know why. I work in an academic lab that deals with image processing (although that’s not our sole focus). Lab-mates have taken entire classes on image processing. You think they would know all the gory details of how images are taken and stored, right? It turns out there’s a surprisingly simple concept that plays an enormous role in image processing and we were all surprised to learn it!

We happened to stumble across this concept through the MinutePhysics video “Computer color is broken” (/r/programming discussion).

In short, typical image files do not represent what the camera sees. If the camera sees $x$ photons at a certain pixel, they store $\sqrt{x}$ at that pixel. That makes sense with how our eyes work and it only takes up half as much space.

This process works; when we display the color, our monitors display the square of the value found in the file.

This is only a problem when we want to modify the image. Almost every computer program lazily computes the average of the raw values found in the image file (e.g., PNG or JPG). This is widespread – programs like Photoshop include the physically incorrect behavior by default (but you can fix it by changing the gamma correction settings). In picture form, they compute

Let’s see this result. As explained in the video, when we’re mixing red and green we would expect to see a yellowish color, as that’s what happens when a camera is out of focus. We would not expect to see a gray brownish color.

But before we see the results, let’s write a script to average the two colors. Before seeing the above video, this is how I would have written this script:

from pylab import imread, hstack

x = imread('red.png')
y = imread('green.png')

middle = (x + y) / 2

# corresponds to the image on the left
final_brown = hstack((x, middle, y))

Almost by definition, that computes what I want right? I’m averaging the two pixels and using a well-known formula for averaging. But no, this average does not correspond to the number of photons hitting the camera lens in two different pixel locations, the number I want.

To get the physically natural result, we only need to add two lines of code:

from pylab import imread, hstack, sqrt

x = imread('red.png')
y = imread('green.png')

middle = (x**2 + y**2) / 2
middle = sqrt(middle)

# corresponds to the image on the right
final_yellow = hstack((x, middle, y))

When I saw this, I was shocked. How was this not default, especially in programs like Photoshop? Of course, this behavior can be enabled in Photoshop by playing with the gamma correction settings and using the sRGB color space. I assume the method described above is an approximation for the sRGB color space, with $\gamma = 2$.

At first, I was also surprised that Python/Matlab’s imread didn’t enable this behavior by default. After thinking about it more, I came to realize that it’d be almost impossible to implement this behavior and wouldn’t make sense. imread is a very raw function that gives access to raw values in image files. The pixel values could not be of type ndarray but would instead have to be wrapped in some Image class (and of course PIL has an ImageMath module). How can NumPy dictate x + y is really sqrt((x**2 + y**2) / 2) for images but not ndarrays?

At first, I decided to test an image deblurring algorithm1 – this algorithm takes in a blurred image and (ideally) produce a sharp image. I saw much more natural results with the method described above.

Something felt off about this; there could be other stuff that’s getting in the way. We were fairly drastically changing the input to a mysterious blind deconvolution algorithm. What if they computed similar results but one image needed more iterations? What if the parameters happened to be better tuned for parameters in one image?

Additionally, this also felt wrong because I was only using this new and seemingly invented color space. I knew other color spaces existed as an image doesn’t have to be represented in RGB. It can be represented in many other color spaces that emphasize different things. Some color spaces might emphasize how humans naturally see colors and others might try to make addition of two images work and some might try to represent an image but save disk space.

So I decided to test blurring in different color spaces. Going into this, I wasn’t really sure what to expect but after thinking about it (and seeing the results), I realize this tests what’s important in each color space. What does each pixel and it’s color represent?

Do they emphasize additive or subtractive color? Do they emphasize the true pigments and act like mixing paint? Do they go for smooth gradients? Are they constrained by the device and instead optimize for image size/etc?

To be clear, here are the following color spaces I tested in and what I gleaned2 from their descriptions on wiki pages/etc:

Color space Design goal
HSV hue saturation; represents how humans see color (meaning pigments?)
XYZ defined to be a link between wavelengths and humans perceive color
RGB default for images; red green blue color planes. When implemented, the storage space of the image was kept in mind.
LUV used extensively in computer graphics to deal with colored lights
LAB designed to approximate human vision; can be used to make accurate color balance corrections
HED not sure but I found a paper (and on deconvolution!)
RGB2 the method described above. I assume this a rough sRGB approximation

Okay, let’s try mixing the two colors! We should not expect to see the same result in each color space. They all optimize for different things; they might try to mix the pigments or might be constrained by device physics and produce garbage.

from skimage.filters import gaussian_filter
from skimage.color import rgb2hsv, hsv2rgb

I = imread('two_colors.png') # NxMx3 array
I_hsv = rgb2hsv(I)

# blur the image using skimage
I_hsv = gaussian_filter(I, 15, multichannel=True)

# show this rgb image (RGB since plt.imshow takes RGB values)
I = hsv2rgb(I_hsv)

After wrapping this function and using ipywidgets, I can produce this interactive widget!

Playing with the input colors, we can see that most color blends look pretty similar. Green/orange, purple/green and yellow/orange all seem to be fairly similar in different color spaces. The colors that best show the different results for different color spaces are red/green and blue/yellow.

We see that certain color spaces are constrained by device limitations (RGB, HED). We see that other color spaces emphasize the pigments (HSV) or other elements like additive/subtractive color (LUV, LAB). We see that certain color spaces play nicely with addition and perform a smooth gradient between the two colors (XYZ, RGB$^2$ aka the method described above).

While writing this blog post I learned about color spaces. I learned they’re not only transformations but also try to pull out different elements. Some color spaces try to represent how humans see color (e.g., HSV) and others optimize for device parameters (e.g., RGB optimizes for image size).

…but at the end of the day I’ll keep this trick in mind.

  1. Using blind deconvolution. In essence, this tries to undo some system’s effect on an image. In this case, that system is blurring and this algorithm needs an estimate of the point spread function 

  2. And I’m still not confident on all the details