Wednesday, July 9, 2014

Astrophotography Part 4 - The Color Matrix

Log Entry
The Color Matrix

This will be the last piece in a series of entries dedicated to decoding the Nikon images. In the previous post I described how to Huffman-decode the image. It turns out that each value decoded is still not the final value for each pixel in the image. Before getting to the missing piece, let's go into how digital cameras record color.


The Nikon D90 is a color camera. This means that it registers not only brightness information, but color as well. We learn in kindergarten that most colors we see can be made up of a linear combination of three primary colors: red, green and blue. So the simple way to obtain a color image is to have a set of three images containing the intensity for each of the red, green and blue colors.

So how do you take a red, green and blue image? One simple solution is to take three separate exposures, placing a red filter before the first, a green filter before the second, and a blue filter before the third.

It turns out that this simple scheme is not that practical for CCD cameras. How do you insert the filter? How do you switch it fast enough that you can take three images for the price of one?

Bryce Bayer, while working for Eastman Kodak, found a solution for color digital cameras. First, let's discuss the way digital cameras work. A digital camera measures an image focused on a chip covered with tiny detectors we call pixels. Each pixel basically registers the intensity of the light impinging on it in units we usually arbitrarily label Analog to Digital Units (ADU).

The idea Bayer introduced is to have each pixel dedicated to measuring only one of the three primary colors, with twice as many pixels measuring green as red or blue, to mimic the human eye. This was simple to implement, as all that was necessary was to permanently place a filter over each pixel. Simple interpolation techniques can then recover the missing color information at every pixel. Bayer may have passed away about a year and a half ago, but his legacy is in 99% of cameras today.

Here is a sample Bayer Matrix of what I believe the Nikon images contain (after some playing around):

Sample Bayer Matrix from the Nikon D90.

In this matrix, you see a single 2x2 pattern repeated across the whole CCD chip.
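As a concrete illustration, here is a minimal sketch in C of how a pixel's coordinates determine its color under an RGGB layout (the layout implied by the interpolation pseudocode at the end of this post); the function name is just for illustration:

/* Map a (row, col) position to its Bayer color, assuming an RGGB
 * 2x2 pattern: row 0 is R G R G ..., row 1 is G B G B ... */
char bayer_color(int row, int col)
{
    if ((row & 1) == 0)
        return (col & 1) == 0 ? 'R' : 'G';  /* even rows: red/green */
    else
        return (col & 1) == 0 ? 'G' : 'B';  /* odd rows: green/blue */
}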

Back to the RAW Image
The reason I am discussing the color matrix here is that, when reading the RAW image from the Nikon D90, each value obtained is actually a difference to add to a previous value. Here goes; this will be a little complicated. I will point out the key steps, but you may need to do a little work to fill in the details on your own (basically, I don't have much time):

Read the first Huffman byte sequence based on the tree in a previous post; this gives you a length. Next, read that number of bits as your value. When you obtain the value, apply the following rule (a short sketch follows this list). If:
1. The leading bit of this value is zero: subtract 2^len from it (where len is the length of the bitstream).
2. The leading bit is not zero: keep it as is.
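Here is a minimal sketch of that rule in C, with a worked example; decode_diff is a hypothetical helper name, not part of dcraw:

/* Turn the len bits read after a Huffman code into a signed difference,
 * following the leading-bit rule above. */
int decode_diff(int bits, int len)
{
    if (len == 0)
        return 0;                       /* no bits read, no difference */
    if ((bits & (1 << (len - 1))) == 0)
        bits -= 1 << len;               /* leading bit zero: negative value */
    return bits;
}

/* Example: with len = 4, bits 0101 (5) has a leading zero, so the
 * difference is 5 - 16 = -11; bits 1101 (13) keeps its value, 13. */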

Next, this value needs to be summed with a previous value, defined by the following rules:
1. If the current pixel is within the first two rows and the first two columns (the first Bayer matrix element): the previous value comes from the vpred matrix. This is a 2x2 matrix obtained from the linearization curve. A summary will be written eventually (not yet, when I have time). You may find information on vpred here for now.
2. If the current pixel is not within the first two rows but is within the first two columns: the previous value is the previous element at the same position in the Bayer matrix vertically, i.e. two rows above.
3. If the current pixel is not within the first two columns: the previous value is the previous element at the same position in the Bayer matrix horizontally, i.e. two columns to the left.

There you go. Now, for each pixel, all you need to do is decode the difference with the leading-bit rule and add it to the previous value defined by the rules above.

If you think about it, this makes sense. Nearby pixels of the same color are not expected to differ by much, so you save space by storing pixel differences instead of pixel values. Note that with these rules, the difference is always added to a pixel no more than two pixels away. The underlying logic is much simpler than the explanation: for example, if the red pixel two columns back decoded to 812 and the current difference is -3, the current red pixel is 809.



Here is a slightly edited version of dcraw's code to point out what needs to be done:
 
for (row = 0; row < imgheight; row++) {
  for (col = 0; col < imgwidth; col++) {
    len  = gethuff(huff);  /* read the next Huffman code from the byte sequence */
    diff = getbits(len);   /* read the next len bits */
    if ((diff & (1 << (len - 1))) == 0)
      diff -= 1 << len;    /* subtract 2^len if the leading bit is zero */
    if (col < 2)           /* first two columns: predict from two rows above (vpred) */
      hpred[col] = (vpred[row & 1][col] += diff);
    else                   /* otherwise: predict from two columns to the left */
      hpred[col & 1] += diff;
    /* RAW is an index into the raw image; curve[] interpolates the value
       based on the linearization curve. You can ignore this latter step
       and always do the interpolation later. */
    RAW(row, col) = curve[LIM((short)hpred[col & 1], 0, 0x3fff)];
  }
}

And that's it. To avoid any copyright issues, I must cite Dave Coffin and link to his full source code here (click the dcraw link on that page).
          

One more small detail...
It turns out that the images you read will be 4352 x 2868 pixels (the dimensions are also specified in the EXIF tags). Columns 4311 to 4352 just contain garbage (or they look like garbage to me: very high values). I simply ignore them and use a 4310 x 2868 sub-image. Note that this is still larger than the maximum resolution of the Nikon D90 (4,288 x 2,848 pixels according to Wikipedia), but who cares, it looks like good data. It is probably good to keep this in mind though.
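In code, dropping the garbage columns is just a row-by-row copy. A minimal sketch, assuming 16-bit samples; the buffer and function names are illustrative:

#include <string.h>

#define FULL_W 4352   /* width as decoded from the file */
#define CROP_W 4310   /* width after dropping the garbage columns */
#define HEIGHT 2868

/* Copy the leftmost CROP_W columns of each row from the full decoded
 * image 'raw' into the working image 'img'. */
void crop_garbage(const unsigned short *raw, unsigned short *img)
{
    for (int row = 0; row < HEIGHT; row++)
        memcpy(img + (size_t)row * CROP_W,
               raw + (size_t)row * FULL_W,
               CROP_W * sizeof(unsigned short));
}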

Interpolating from Bayer to RGB
A simple way to interpolate the image is to convert each 2x2 Bayer element into one RGB pixel, and I will do this for all my pictures here.
The dimensions of the Nikon raw images are 4352 x 2868, so for these images my final resolution will be 2176 x 1434 pixels. This is fine enough for me.

Here is some quick pseudocode to obtain each color channel from an image, where min:max:skip means starting from index min up to max, taking every skip-th value (when left blank, min defaults to the beginning, max to the end and skip to 1, so don't skip):

img = read_image();
RED = img(1::2,1::2);
GREEN = (img(2::2,1::2) + img(1::2,2::2))*.5;
BLUE = img(2::2,2::2);

Now let's see if it worked and just plot the gray image by calculating:
GRAY = RED + GREEN + BLUE;
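Here is the same 2x2 binning written out as a minimal C sketch, assuming the RGGB layout from earlier; raw, w, h and bin_bayer are illustrative names, not from dcraw:

#include <stdlib.h>

typedef struct { float r, g, b; } rgb_t;

/* Collapse each 2x2 Bayer cell of a w-by-h raw image into one RGB
 * pixel, averaging the two green samples; returns a (w/2)-by-(h/2)
 * image that the caller must free. */
rgb_t *bin_bayer(const unsigned short *raw, int w, int h)
{
    int ow = w / 2, oh = h / 2;
    rgb_t *out = malloc((size_t)ow * oh * sizeof *out);
    if (!out) return NULL;
    for (int y = 0; y < oh; y++) {
        for (int x = 0; x < ow; x++) {
            const unsigned short *p = raw + (size_t)(2 * y) * w + 2 * x;
            rgb_t *o = &out[(size_t)y * ow + x];
            o->r = p[0];                  /* top-left: red */
            o->g = 0.5f * (p[1] + p[w]);  /* the two greens, averaged */
            o->b = p[w + 1];              /* bottom-right: blue */
        }
    }
    return out;
}

The gray image is then just r + g + b for each output pixel, as in the pseudocode above.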

Let's try this with a moon image I took:

The moon at 800 ISO 1/4000 s exposure with the Nikon D90.
And voila.

This concludes the image-reading portion of the Nikon series. Any questions, feel free to comment. If you are interested in source code, please comment; I can include Dave Coffin's code with a diff file showing how to turn it into just a raw image reader.

I will try to write up a reference sort of like here but a little more complete so the next person will not feel as frustrated as I did. Good luck!
