Images take a lot of data to store. In an SxT black-and-white (grayscale) image, each pixel holds a single gray value from 0 to 255, so storing the image in full requires a sequence of SxT numbers from 0 to 255. In a color image, each pixel has a red, a green, and a blue value, each from 0 to 255, so storing the image in full requires a sequence of SxTx3 numbers from 0 to 255. Images can therefore take up large amounts of storage. However, by using the least squares method, a function with a limited number of coefficients can be used to approximate the data. One example of this uses the discrete cosine transform (DCT).
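To make the storage count concrete, here is a minimal sketch of the arithmetic. The 480x640 size is a hypothetical example, not the size of the image used in this project, and `raw_image_bytes` is a name introduced only for illustration.

```python
def raw_image_bytes(height, width, channels=1):
    """Bytes needed when every pixel stores one 8-bit value (0 to 255) per channel."""
    return height * width * channels

# Hypothetical S = 480, T = 640 image.
gray_bytes = raw_image_bytes(480, 640)        # S x T values
color_bytes = raw_image_bytes(480, 640, 3)    # S x T x 3 values
print(gray_bytes, color_bytes)
```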
The DCT allows for a reasonable amount of size compression for a small amount of loss. It uses a sum of cosine functions chosen so that the least squares computation works with an orthogonal matrix, which is ideal for this type of situation. A 2D DCT does the same for 2-dimensional data, so it can be used, for instance, to store the pixel values of an image. These cosine functions and this process can be deduced by examination of the code included in this project. To interpolate the data perfectly, the number of coefficients in this function must equal the number of data points it is meant to interpolate. However, very little information is lost when some of the coefficients are removed or reduced. To lose as little information as possible, one can use quantization: rather than removing entire coefficients, the coefficients are rounded in various ways, which slightly reduces both the accuracy of each coefficient and the storage it needs.
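The following is a minimal sketch of these ideas, not the code included in this project. It assumes the standard orthonormal DCT matrix (the usual DCT-II scaling), and the helper names `dct_matrix`, `dct2`, and `idct2` and the sample block are introduced only for illustration. It checks that the matrix is orthogonal, that keeping all 64 coefficients of an 8x8 block reproduces the block, and that keeping only 10 low-frequency coefficients of a smooth block changes it only slightly.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthogonal n x n DCT matrix; row i samples cos(i * (2j + 1) * pi / (2n))."""
    i = np.arange(n).reshape(-1, 1)      # frequency index, one per row
    j = np.arange(n).reshape(1, -1)      # sample index, one per column
    C = np.sqrt(2.0 / n) * np.cos(i * (2 * j + 1) * np.pi / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)           # constant row is rescaled so that C @ C.T = I
    return C

def dct2(block):
    """2D DCT of a square block: apply the 1D transform along both dimensions."""
    C = dct_matrix(block.shape[0])
    return C @ block @ C.T

def idct2(coeffs):
    """Inverse 2D DCT; the inverse of an orthogonal matrix is its transpose."""
    C = dct_matrix(coeffs.shape[0])
    return C.T @ coeffs @ C

# Hypothetical smooth 8x8 block of gray values (not taken from the project image).
rows, cols = np.indices((8, 8))
block = 100.0 + 5.0 * rows + 3.0 * cols

C = dct_matrix(8)
is_orthogonal = np.allclose(C @ C.T, np.eye(8))

Y = dct2(block)
exact = np.allclose(idct2(Y), block)     # all 64 coefficients kept

# Keep only the 10 low-frequency coefficients with k + l < 4 and zero the rest.
Y_low = np.where(rows + cols < 4, Y, 0.0)
max_error = np.abs(idct2(Y_low) - block).max()

print(is_orthogonal, exact, max_error)
```

Because the matrix is orthogonal, no linear system has to be solved: the transpose undoes the transform, and dropping coefficients gives the least squares fit over the remaining cosine functions.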
This project explores the effects of the DCT and of different types of quantization using this image. It is titled "Pensive Parakeet" and is included in the Dropbox download.
First, the image was broken into 8x8 blocks. Each block was transformed using the DCT, the coefficients in each block were reduced using linear quantization, and the blocks were then reassembled into a full image. The first sample was extracted from (81,81), and the second sample was extracted from (105,601).
This process can be analyzed by examining the code here.
Actual matrices for this data can be found here.
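The blockwise process described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the linear quantization matrix is assumed to be Q[k, l] = 8p(k + l + 1) with k, l counted from 0, p = 0 is assumed to mean "no quantization", pixel values are assumed to be centered by subtracting 128 before the transform, and a synthetic gradient stands in for the real image. The names `linear_q` and `compress_gray` are hypothetical.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthogonal n x n DCT matrix."""
    i = np.arange(n).reshape(-1, 1)
    j = np.arange(n).reshape(1, -1)
    C = np.sqrt(2.0 / n) * np.cos(i * (2 * j + 1) * np.pi / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def linear_q(p, n=8):
    """Assumed linear quantization matrix: Q[k, l] = 8 * p * (k + l + 1)."""
    k, l = np.indices((n, n))
    return 8.0 * p * (k + l + 1)

def compress_gray(img, p, n=8):
    """Blockwise DCT, quantize, dequantize, inverse DCT, reassemble."""
    C = dct_matrix(n)
    h, w = (img.shape[0] // n) * n, (img.shape[1] // n) * n   # crop to whole blocks
    out = np.empty((h, w))
    for r in range(0, h, n):
        for c in range(0, w, n):
            block = img[r:r + n, c:c + n].astype(float) - 128.0   # center around 0
            Y = C @ block @ C.T                                   # 2D DCT of the block
            if p > 0:                                             # assumption: p = 0 keeps Y as is
                Q = linear_q(p, n)
                Y = np.round(Y / Q) * Q                           # quantize, then dequantize
            out[r:r + n, c:c + n] = C.T @ Y @ C + 128.0           # inverse 2D DCT
    return np.clip(np.round(out), 0, 255).astype(np.uint8)

# Synthetic 64x64 gradient standing in for the real image.
x = np.linspace(0, 255, 64)
img = np.round(np.add.outer(x, x) / 2).astype(np.uint8)

# Largest pixel change for each loss parameter p used in the tables.
errors = {p: int(np.abs(compress_gray(img, p).astype(int) - img.astype(int)).max())
          for p in (0, 1, 2, 4)}
print(errors)
```

If the sample coordinates are 1-based top-left corners of 8x8 blocks (both 81 and 105 are one more than a multiple of 8, which is consistent with that reading), the first sample would correspond to `out[80:88, 80:88]` in the reconstructed array.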
[Image table: for p = 0, 1, 2, 4, the Full Image, Sample 1 (scaled), and Sample 2 (unscaled)]
Next, the same process was performed, but with the JPEG-suggested quantization matrix in place of linear quantization. This matrix can be found in the code from the previous part.
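As a rough illustration of this step, the sketch below quantizes a single 8x8 block with the luminance quantization table suggested in the JPEG standard, multiplied by the loss parameter p. Both choices are assumptions: the project's code may use a different table or a different scaling, and the sample block and the name `quantize_block` are made up for the example.

```python
import numpy as np

# Luminance quantization table suggested in the JPEG standard (assumed to be the
# "JPEG-suggested matrix"; the project's own code is the authority).
Q_JPEG = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
], dtype=float)

def dct_matrix(n=8):
    """Orthogonal n x n DCT matrix."""
    i = np.arange(n).reshape(-1, 1)
    j = np.arange(n).reshape(1, -1)
    C = np.sqrt(2.0 / n) * np.cos(i * (2 * j + 1) * np.pi / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def quantize_block(block, Q):
    """Return the integer coefficients that would be stored and the rebuilt block."""
    C = dct_matrix(8)
    Y = C @ (block.astype(float) - 128.0) @ C.T   # 2D DCT of the centered block
    Yq = np.round(Y / Q)                          # entrywise division, then rounding
    recon = C.T @ (Yq * Q) @ C + 128.0            # dequantize and invert
    return Yq, recon

# Hypothetical smooth block with values from 60 to 172.
block = np.add.outer(np.arange(8), np.arange(8)) * 8.0 + 60.0

nonzero = {}
for p in (1, 2, 4):
    Yq, recon = quantize_block(block, p * Q_JPEG)
    nonzero[p] = int(np.count_nonzero(Yq))
    print(p, nonzero[p], np.abs(recon - block).max())
```

Unlike the linear matrix, this table does not grow uniformly with frequency; its entries are generally smallest near the top-left (low-frequency) corner, so those coefficients are rounded least.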
[Image table: for p = 0, 1, 2, 4, the Full Image, Sample 1 (scaled), and Sample 2 (unscaled)]
For this part, the same process was applied to a color image. The image was broken down into its three natural components: red, green, and blue. Each of those components stores values from 0 to 255. The process previously described was performed on each component. Then, using the compressed components, new images were reconstructed.
The code for this altered process is here.
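A minimal sketch of the per-component idea follows. It is not the project's code: the function names are hypothetical, a random array stands in for the color image, and the quantization matrix is the assumed linear form 8p(k + l + 1) with p = 1. Passing `None` for the matrix is used here to mean "no quantization".

```python
import numpy as np

def dct_matrix(n=8):
    """Orthogonal n x n DCT matrix."""
    i = np.arange(n).reshape(-1, 1)
    j = np.arange(n).reshape(1, -1)
    C = np.sqrt(2.0 / n) * np.cos(i * (2 * j + 1) * np.pi / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

def compress_channel(channel, Q, n=8):
    """Blockwise DCT of one component, quantized by Q (None means no quantization)."""
    C = dct_matrix(n)
    h, w = (channel.shape[0] // n) * n, (channel.shape[1] // n) * n   # whole blocks only
    out = np.empty((h, w))
    for r in range(0, h, n):
        for c in range(0, w, n):
            Y = C @ (channel[r:r + n, c:c + n].astype(float) - 128.0) @ C.T
            if Q is not None:
                Y = np.round(Y / Q) * Q            # quantize, then dequantize
            out[r:r + n, c:c + n] = C.T @ Y @ C + 128.0
    return np.clip(np.round(out), 0, 255).astype(np.uint8)

def compress_color(rgb, Q):
    """Apply the grayscale pipeline independently to the R, G, and B components."""
    channels = [compress_channel(rgb[..., c], Q) for c in range(3)]
    return np.stack(channels, axis=-1)             # recombine into an H x W x 3 image

# Random 32x32 RGB array standing in for the color image.
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

k, l = np.indices((8, 8))
Q_linear = 8.0 * 1 * (k + l + 1)                   # assumed linear matrix with p = 1

lossless = compress_color(rgb, None)
lossy = compress_color(rgb, Q_linear)
print(lossy.shape, np.array_equal(lossless, rgb))
```

Treating the three components independently keeps the grayscale routine unchanged; the only extra work is splitting the array along its last axis and stacking the results again.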
[Image table: for p = 0, 1, 2, 4, the Full Image, Sample 1, and Sample 2]
[Image table: for p = 0, 1, 2, 4, the Full Image, Sample 1, and Sample 2]