Project 5: Image Compression

by Elisa Kim

Reference: Numerical Analysis by Timothy Sauer (p.495-514)

The assignment was to obtain a grayscale image and import it into MATLAB with the imread command. A color RGB image can be converted to grayscale with the standard formula \(X_{gray} = 0.2126R +0.7152G+0.0722B\). My original image was 700 by 940 pixels, so I cropped it to 640 by 640 pixels so that each dimension is a multiple of 8.

Here's the code to obtain the cropped 640 by 640 grayscale image.
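As an independent illustration of the conversion and cropping described above, here is a minimal sketch in Python (standard library only) rather than the project's MATLAB. The helper names `to_gray` and `center_crop` are hypothetical, the tiny synthetic image stands in for the photo, and cropping around the center is an assumption, since the text does not say which region was kept.

```python
def to_gray(rgb):
    """Convert a 2-D list of (R, G, B) pixels to grayscale values."""
    return [[0.2126 * r + 0.7152 * g + 0.0722 * b for (r, g, b) in row]
            for row in rgb]

def center_crop(img, size):
    """Return the centered size-by-size window of a 2-D list."""
    rows, cols = len(img), len(img[0])
    if size > rows or size > cols:
        raise ValueError("crop size exceeds image dimensions")
    top = (rows - size) // 2
    left = (cols - size) // 2
    return [row[left:left + size] for row in img[top:top + size]]

# Synthetic stand-in for the photo: 10 rows by 12 columns of RGB pixels.
rgb = [[(r * 10, c * 10, 100) for c in range(12)] for r in range(10)]

# Crop to 8 by 8 so the side length is a multiple of the 8-pixel block size.
gray = center_crop(to_gray(rgb), 8)
```

For the real image the same two steps would be applied with a crop size of 640.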

The two-dimensional Discrete Cosine Transform (DCT) is widely used for image compression. It separates an image into frequency components, so the components that the human eye barely perceives can be stored less accurately or discarded. The 2D-DCT is computed by applying the one-dimensional DCT first vertically (to the columns) and then horizontally (to the rows).

Definition
The two-dimensional Discrete Cosine Transform (2D-DCT) of the \(n \times n\) matrix \(X\) is the matrix \(Y=CXC^T\), where \( C_{ij}=\frac{\sqrt 2}{\sqrt n}\, a_i \cos \frac {i(2j+1)\pi}{2n} \) for \(i,j=0,\ldots,n-1\), with \(a_0 = 1/\sqrt 2\) and \(a_i = 1\) for \(i=1,\ldots,n-1\). With this scaling \(C\) is orthogonal, so \(C^{-1}=C^T\).
The inverse two-dimensional Discrete Cosine Transform (inverse 2D-DCT) of the \(n \times n\) matrix \(Y\) is the matrix \(X=C^TYC\).
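The definition can be checked numerically. The sketch below, in Python with only the standard library (not the project's dct.m), builds \(C\) with the scaling \(a_0 = 1/\sqrt 2\) for the first row and \(a_i = 1\) otherwise, then verifies that \(X=C^TYC\) undoes \(Y=CXC^T\). The function names and the test block `X` are made up for illustration.

```python
import math

def dct_matrix(n):
    """n-by-n DCT matrix: sqrt(2/n) * a_i * cos(i(2j+1)pi/(2n))."""
    C = []
    for i in range(n):
        a = 1 / math.sqrt(2) if i == 0 else 1.0
        C.append([math.sqrt(2 / n) * a *
                  math.cos(i * (2 * j + 1) * math.pi / (2 * n))
                  for j in range(n)])
    return C

def matmul(A, B):
    """Plain matrix product of two nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dct2(X):
    """2D-DCT: Y = C X C^T."""
    C = dct_matrix(len(X))
    return matmul(matmul(C, X), transpose(C))

def idct2(Y):
    """Inverse 2D-DCT: X = C^T Y C."""
    C = dct_matrix(len(Y))
    return matmul(matmul(transpose(C), Y), C)

# Arbitrary 8 by 8 test block (not data from the project).
X = [[float((3 * i + 5 * j) % 7) for j in range(8)] for i in range(8)]
Y = dct2(X)
X_back = idct2(Y)
max_err = max(abs(X[i][j] - X_back[i][j]) for i in range(8) for j in range(8))
```

Here `max_err` should be at the level of floating-point rounding, which is the numerical counterpart of \(C^TC=I\).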

8 by 8 block	Grayscale pixel values	Grayscale pixel values minus 128

The table above shows one 8 by 8 block of the picture and the numerical values of its grayscale pixels. Subtracting 128 from each pixel value centers the values approximately around zero. The loss parameter, denoted by p, controls the accuracy of the reconstructed image: the smaller the loss parameter, the more accurate the image.
Yq is the matrix of quantized DCT coefficients. It depends on the loss parameter p and the linear quantization matrix, and its entries are rounded to integers.
In this step the linear quantization matrix was built from the Hilbert matrix (MATLAB's hilb(8)).
The larger p is, the more coefficients in Yq become zero, so more data is lost.
Here is the code: p5_1.m and dct.m
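To make the quantization step concrete, here is a small Python sketch. It assumes the linear quantization of the reference text, \(Q = 8p \,./\, \mathrm{hilb}(8)\), that is \(q_{kl} = 8p(k+l+1)\) for \(k,l=0,\ldots,7\), and \(Y_q = \mathrm{round}(Y ./ Q)\); the project's own p5_1.m may differ in detail. The coefficient matrix `Y` is hypothetical, and `round_half_away` imitates MATLAB's round, which rounds halves away from zero (Python's built-in round does not).

```python
import math

def linear_q(p, n=8):
    """Linear quantization matrix with entries 8 p (k + l + 1)."""
    return [[8 * p * (k + l + 1) for l in range(n)] for k in range(n)]

def round_half_away(x):
    """Round to the nearest integer, halves away from zero."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))

def quantize(Y, Q):
    """Entrywise Yq = round(Y / Q)."""
    return [[round_half_away(y / q) for y, q in zip(yr, qr)]
            for yr, qr in zip(Y, Q)]

def dequantize(Yq, Q):
    """Entrywise Yq * Q, the coefficients used for reconstruction."""
    return [[v * q for v, q in zip(vr, qr)] for vr, qr in zip(Yq, Q)]

# Hypothetical DCT coefficients with energy concentrated in the top-left.
Y = [[400.0 / (1 + k + l) ** 2 for l in range(8)] for k in range(8)]

# Count the surviving (nonzero) coefficients for each loss parameter.
nonzero = {}
for p in (1, 2, 4):
    Yq = quantize(Y, linear_q(p))
    nonzero[p] = sum(v != 0 for row in Yq for v in row)
```

For this made-up `Y` the count of nonzero coefficients drops as p grows, which is the effect described above: larger divisors push more entries below one half, and they round to zero.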

Loss parameter p	Yq	reconstructed image
p=1	\begin{bmatrix} -9 & -6 & -2 & -1 & 0 & 0 & 0 & 0 \\ -1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}
p=2	\begin{bmatrix} -5 & -3 & -1 & -1 & 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}
p=4	\begin{bmatrix} -2 & -2 & -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}

The reconstructed images show that the larger the loss parameter becomes, the less accurate the image becomes.
The same process can be applied to the whole picture, not just to one 8 by 8 pixel block, by repeating it for every 8 by 8 block in the image.
Here is the code.
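The block-by-block loop can be sketched as follows, again in standard-library Python rather than the project's MATLAB. It assumes the image dimensions are multiples of 8, uses the linear quantization \(q_{kl}=8p(k+l+1)\), and clamps the result to the range 0 to 255; the clamping and all function names are my own assumptions.

```python
import math

N = 8  # block size

def dct_matrix(n):
    rows = []
    for i in range(n):
        a = 1 / math.sqrt(2) if i == 0 else 1.0
        rows.append([math.sqrt(2 / n) * a *
                     math.cos(i * (2 * j + 1) * math.pi / (2 * n))
                     for j in range(n)])
    return rows

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def round_half_away(x):
    return int(math.copysign(math.floor(abs(x) + 0.5), x))

C = dct_matrix(N)
CT = [list(col) for col in zip(*C)]

def compress_block(block, p):
    """Shift by 128, DCT, quantize, dequantize, inverse DCT, shift back."""
    X = [[v - 128 for v in row] for row in block]
    Y = matmul(matmul(C, X), CT)
    Yd = [[round_half_away(Y[k][l] / (8 * p * (k + l + 1))) * 8 * p * (k + l + 1)
           for l in range(N)] for k in range(N)]
    Xr = matmul(matmul(CT, Yd), C)
    return [[min(255, max(0, round_half_away(v + 128))) for v in row]
            for row in Xr]

def compress_image(img, p):
    """Apply compress_block to every 8 by 8 tile of a 2-D list."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, N):
        for c0 in range(0, cols, N):
            block = [row[c0:c0 + N] for row in img[r0:r0 + N]]
            rec = compress_block(block, p)
            for i in range(N):
                out[r0 + i][c0:c0 + N] = rec[i]
    return out

# Synthetic 16 by 16 test image (four blocks), not the project's photo.
img = [[(3 * r + 2 * c) % 256 for c in range(16)] for r in range(16)]
rec = compress_image(img, 1)
```

Because each block is processed independently, the same function works unchanged for a 640 by 640 image.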

original	p=1

p=2	p=4

The following quantization matrix is based on experiments with the human visual system and is widely used in currently distributed JPEG encoders. In this step it was used with loss parameter p=1, which is close to the default JPEG quantization.

\(Q_y = p\begin{bmatrix}16 & 11 & 10 & 16 & 24 & 40 & 51 & 61 \\ 12 & 12 & 14 & 19 & 26 & 58 & 60 & 55 \\ 14 & 13 & 16 & 24 & 40 & 57 & 69 & 56 \\ 14 & 17 & 22 & 29 & 51 & 87 & 80 & 62 \\ 18 & 22 & 37 & 56 & 68 & 109 & 103 & 77\\ 24 & 35 & 55 & 64 & 81 & 104 & 113 & 92\\ 49 & 64 & 78 & 87 & 103 & 121 & 120 & 101\\ 72 & 92 & 95 & 98 & 112 & 100 & 103 & 99\end{bmatrix}\)

	Original	Image using Hilbert matrix	Image using JPEG-Suggested Matrix
whole grayscaled picture
one 8 by 8 block
grayscaled pixel values (8 by 8)
difference from original	0

Here is the code for the 8 by 8 block and the code for the whole picture.

To see how far the reconstructed pixel values were from the original ones, I used the RMSD (root-mean-square deviation) between the original pixel values and the pixel values obtained with the different quantization matrices: RMSD = \(\sqrt{\frac{\sum_{i=1}^n(y_i'-y_i)^2}{n}}\), where \(y_i\) are the original pixel values, \(y_i'\) are the reconstructed ones, and \(n\) is the number of pixels.
The RMSD for the 8 by 8 block with the linear quantization matrix was approximately 59.2954.
The RMSD for the 8 by 8 block with the JPEG-suggested matrix was approximately 45.2351.
The RMSD for the whole picture with the linear quantization matrix was approximately 7.9849.
The RMSD for the whole picture with the JPEG-suggested matrix was approximately 5.4134.
These results tell us that the pixel values obtained with the JPEG-suggested matrix are closer to the original pixel values.
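The RMSD formula translates directly into code. This is a minimal Python sketch with a hypothetical function name `rmsd` and made-up 2 by 2 data; it does not reproduce the figures reported above.

```python
import math

def rmsd(original, approx):
    """Root-mean-square deviation between two equally sized 2-D lists."""
    sq = [(a - o) ** 2
          for orow, arow in zip(original, approx)
          for o, a in zip(orow, arow)]
    return math.sqrt(sum(sq) / len(sq))

# Made-up pixel values for illustration only.
original = [[10, 20], [30, 40]]
approx = [[12, 20], [30, 36]]
value = rmsd(original, approx)  # sqrt((4 + 0 + 0 + 16) / 4)
```

A value of zero means the two images agree exactly, and larger values mean larger average pixel errors.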

In this step, Step 1 was repeated with a color image expressed in the RGB color system. The RGB color system assigns three integers to each pixel, one each for red, green, and blue, which indicate the intensity of each color.
Therefore, for color image compression, we repeat the process for each color separately and then reconstitute the image by combining the three color intensities.
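The per-channel idea can be sketched as a split, process, and merge pattern. In the Python below all names are hypothetical, and `coarse` is only a stand-in for the grayscale DCT compressor (it floors values to multiples of 16) so that the example stays short.

```python
def split_channels(rgb):
    """Split a 2-D list of (R, G, B) tuples into three 2-D planes."""
    return tuple([[px[k] for px in row] for row in rgb] for k in range(3))

def merge_channels(r, g, b):
    """Recombine three planes into a 2-D list of (R, G, B) tuples."""
    return [list(zip(rr, gr, br)) for rr, gr, br in zip(r, g, b)]

def compress_color(rgb, compress_plane):
    """Apply one grayscale compressor to each channel independently."""
    r, g, b = split_channels(rgb)
    return merge_channels(compress_plane(r), compress_plane(g),
                          compress_plane(b))

def coarse(plane):
    """Stand-in compressor (assumption): floor to multiples of 16."""
    return [[16 * (v // 16) for v in row] for row in plane]

# Made-up 2 by 2 color image.
rgb = [[(10, 200, 37), (255, 0, 128)],
       [(16, 17, 31), (90, 91, 92)]]
out = compress_color(rgb, coarse)
```

Replacing `coarse` with a block-wise DCT compressor gives the scheme described above, with the same quantization applied to the red, green, and blue planes.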

Loss parameter p	8 by 8 reconstructed image	whole reconstructed image
p=1
p=2
p=4

Here is the code for the 8 by 8 block and the code for the whole picture.

In this step, we transform the RGB color values to luminance and color difference coordinates.
To transform the RGB color data to the YUV system, we need the following: the luminance \(Y=0.299R+0.587G+0.114B\) and the color differences \(U=B-Y\) and \(V=R-Y\).
We used the same quantization matrix \(Q_y\) as in step 2 for the luminance variable Y, and the following quantization matrix \(Q_c\) for the color differences U and V. \(Q_c = p\begin{bmatrix}17 & 18 & 24 & 47 & 99 & 99 & 99 & 99\\ 18 & 21 & 26 & 66 & 99 & 99 & 99 & 99\\ 24 & 26 & 56 & 99 & 99 & 99 & 99 & 99\\ 47 & 66 & 99 & 99 & 99 & 99 & 99 & 99\\ 99 & 99 & 99 & 99 & 99 & 99 & 99 & 99\\ 99 & 99 & 99 & 99 & 99 & 99 & 99 & 99\\ 99 & 99 & 99 & 99 & 99 & 99 & 99 & 99\\ 99 & 99 & 99 & 99 & 99 & 99 & 99 & 99\end{bmatrix}\)
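The color transform and its inverse are simple enough to write out. This Python sketch takes \(U=B-Y\) and \(V=R-Y\) and recovers G by solving the luminance equation for it; the function names and the sample pixel are illustrative assumptions.

```python
def rgb_to_yuv(r, g, b):
    """Luminance Y and color differences U = B - Y, V = R - Y."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, b - y, r - y

def yuv_to_rgb(y, u, v):
    """Invert the transform: B and R directly, G from the Y equation."""
    b = u + y
    r = v + y
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

# Made-up sample pixel.
y, u, v = rgb_to_yuv(200, 120, 40)
r, g, b = yuv_to_rgb(y, u, v)
```

For a gray pixel (equal R, G, and B) both color differences are zero, so all the information sits in Y. That is why U and V can be quantized more coarsely with \(Q_c\).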

Loss parameter p	8 by 8 reconstructed image	whole reconstructed image
p=1
p=2
p=4

Here is the code for the 8 by 8 block and the code for the whole picture.