by Elisa Kim
Reference: Numerical Analysis by Timothy Sauer (p.495-514)
We were asked to obtain a grayscale image and import it into MATLAB with the imread command.
A color RGB image can be converted to a grayscale image with the standard formula \(X_{gray} = 0.2126R + 0.7152G + 0.0722B\).
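As a rough illustration of the grayscale formula, here is a minimal Python sketch (it is not the author's MATLAB code). The function names `rgb_to_gray` and `image_to_gray` and the tiny 2 by 2 test image are assumptions made only for this example.

```python
def rgb_to_gray(r, g, b):
    """Weighted grayscale value for one RGB pixel (channels in 0..255)."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def image_to_gray(rgb_rows):
    """Convert a nested list of (R, G, B) tuples into a nested list of gray values."""
    return [[rgb_to_gray(r, g, b) for (r, g, b) in row] for row in rgb_rows]


# Hypothetical 2 by 2 image: white, pure red, pure green, pure blue.
tiny = [[(255, 255, 255), (255, 0, 0)],
        [(0, 255, 0), (0, 0, 255)]]
gray = image_to_gray(tiny)
print(gray)
```

The three weights sum to 1, so a white pixel stays at 255 and green contributes most to the perceived brightness.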
My original image was 700 by 940 pixels, so I cropped it to 640 by 640 pixels to make both dimensions multiples of 8.
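The cropping step might look like the following Python sketch. The post does not say which region of the image was kept, so the top-left placement, the helper name `crop`, and the synthetic image are all assumptions.

```python
def crop(image, height, width, top=0, left=0):
    """Return a height-by-width window of a nested-list image.

    Both dimensions must be multiples of 8 so the result splits
    evenly into 8 by 8 blocks.
    """
    if height % 8 or width % 8:
        raise ValueError("height and width must be multiples of 8")
    if top + height > len(image) or left + width > len(image[0]):
        raise ValueError("crop window does not fit inside the image")
    return [row[left:left + width] for row in image[top:top + height]]


# Synthetic 700 by 940 image whose pixel value encodes its position.
image = [[(r, c) for c in range(940)] for r in range(700)]
cropped = crop(image, 640, 640)   # assumed top-left placement
print(len(cropped), len(cropped[0]))
```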
The two-dimensional Discrete Cosine Transform (DCT) is often used to compress images. The DCT separates an image into frequency components, so we can compress the parts that are not critically visible to the human eye. For the 2-D DCT, we simply apply the one-dimensional DCT vertically first, and then horizontally.
Definition
The two-dimensional Discrete Cosine Transform (2D-DCT) of the \(n \times n\) matrix \(X\) is the matrix \(Y=CXC^T\) where
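\( C_{ij}=\frac{\sqrt 2}{\sqrt n}\, a_i \cos \frac {i(2j+1)\pi}{2n} \) for \(i,j=0,\dots,n-1\), where \(a_i = 1/\sqrt 2\) if \(i=0\) and \(a_i=1\) otherwise. With this scaling \(C\) is an orthogonal matrix, so \(C^{-1}=C^T\).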
The inverse two-dimensional Discrete Cosine Transform of the \(n \times n\) matrix \(Y\) is the matrix \(X=C^TYC\).
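The definition can be checked numerically with a short Python sketch (not the author's dct.m). It assumes the normalization used in the Sauer reference, in which row 0 of \(C\) carries an extra factor \(1/\sqrt 2\) so that \(C\) is orthogonal. The helper names `dct_matrix`, `matmul`, `transpose`, `dct2`, and `idct2` are chosen for this example.

```python
import math


def dct_matrix(n):
    """Build the n-by-n DCT matrix C; row 0 is scaled by 1/sqrt(2)."""
    c = []
    for i in range(n):
        a = 1 / math.sqrt(2) if i == 0 else 1.0
        c.append([math.sqrt(2 / n) * a * math.cos(i * (2 * j + 1) * math.pi / (2 * n))
                  for j in range(n)])
    return c


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dct2(x):
    """2D-DCT: Y = C X C^T."""
    c = dct_matrix(len(x))
    return matmul(matmul(c, x), transpose(c))


def idct2(y):
    """Inverse 2D-DCT: X = C^T Y C."""
    c = dct_matrix(len(y))
    return matmul(matmul(transpose(c), y), c)


# A constant block has no variation, so all of its energy should land
# in the top-left (lowest-frequency) coefficient.
flat = [[10.0] * 8 for _ in range(8)]
coeffs = dct2(flat)
print(round(coeffs[0][0], 6))
```

Because \(C\) is orthogonal, `idct2(dct2(x))` returns `x` up to floating-point error; the loss in JPEG-style compression comes from quantization, not from the transform itself.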
8 by 8 block | Grayscale pixel values | Grayscale pixel values minus 128 |
---|---|---|
The table above shows one of the 8 by 8 blocks of the picture and the numerical values of its grayscale pixels.
Note that subtracting 128 from each pixel value makes the values approximately centered around zero.
The loss parameter, denoted by p, determines the accuracy of the image. The smaller the loss parameter, the more accurate the image will be.
Yq is the matrix of resulting coefficients, which depends on the loss parameter p, the linear quantization matrix, and rounding to integers.
In this step we used the Hilbert matrix to build the linear quantization matrix (using the MATLAB command hilb(8)).
The larger p is, the more coefficients in Yq become zero, and the more data is lost.
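The effect of p can be sketched in Python as follows. This assumes the linear quantization matrix has the form used in the Sauer reference, \(Q = 8p\,./\,\mathrm{hilb}(8)\), so that \(Q_{ij} = 8p(i+j+1)\) with zero-based indices; the post itself does not show the exact expression. The coefficient values in `y` are hypothetical and are not the author's data.

```python
import math


def round_half_away(v):
    """Round to the nearest integer, halves away from zero (MATLAB's round convention)."""
    return int(math.copysign(math.floor(abs(v) + 0.5), v))


def linear_quantization_matrix(p, n=8):
    """Q[i][j] = 8 * p * (i + j + 1): 8p divided by the Hilbert entry 1 / (i + j + 1)."""
    return [[8 * p * (i + j + 1) for j in range(n)] for i in range(n)]


def quantize(y, q):
    """Yq = round(Y ./ Q), entry by entry."""
    return [[round_half_away(y[i][j] / q[i][j]) for j in range(len(y[0]))]
            for i in range(len(y))]


def dequantize(yq, q):
    """Approximate Y again as Yq .* Q; the rounding error is not recoverable."""
    return [[yq[i][j] * q[i][j] for j in range(len(yq[0]))]
            for i in range(len(yq))]


# Hypothetical DCT coefficients: a few low-frequency values, zeros elsewhere.
y = [[0.0] * 8 for _ in range(8)]
y[0][0], y[0][1], y[1][0] = -74.0, -100.0, -20.0

for p in (1, 2, 4):
    yq = quantize(y, linear_quantization_matrix(p))
    nonzero = sum(1 for row in yq for v in row if v != 0)
    print("p =", p, "first row:", yq[0][:3], "nonzero entries:", nonzero)
```

Python's built-in `round` rounds halves to the nearest even integer, so the sketch uses its own helper to match MATLAB's behavior.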
Here is the code: p5_1.m and dct.m
Loss parameter p | Yq | reconstructed image |
---|---|---|
\begin{bmatrix} -9 & -6 & -2 & -1 & 0 & 0 & 0 & 0 \\ -1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} | ||
\begin{bmatrix} -5 & -3 & -1 & -1 & 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} | ||
\begin{bmatrix} -2 & -2 & -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} |
Looking at the reconstructed images, we can observe that the larger the loss parameter becomes, the less accurate the image becomes.
Now we can apply this not only to a single 8 by 8 pixel block but to the whole picture, by repeating the process for every 8 by 8 block in the image.
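The block-by-block traversal could be organized as in the Python sketch below (not the author's MATLAB code). The callback `block_fn` is a placeholder for the per-block chain of subtracting 128, transforming, quantizing, dequantizing, inverse transforming, and adding 128 back; here a simple block average stands in for it so the example stays short. All names are assumptions for this illustration.

```python
def process_blocks(image, block_fn, size=8):
    """Apply block_fn to every size-by-size block and reassemble the image."""
    rows, cols = len(image), len(image[0])
    if rows % size or cols % size:
        raise ValueError("image dimensions must be multiples of the block size")
    out = [[0] * cols for _ in range(rows)]
    for top in range(0, rows, size):
        for left in range(0, cols, size):
            block = [row[left:left + size] for row in image[top:top + size]]
            new_block = block_fn(block)
            for i in range(size):
                out[top + i][left:left + size] = new_block[i]
    return out


def block_mean(block):
    """Stand-in for the real per-block compression: replace a block by its mean."""
    mean = sum(sum(row) for row in block) / (len(block) * len(block[0]))
    return [[mean] * len(block[0]) for _ in block]


# Synthetic 16 by 16 image, i.e. four 8 by 8 blocks.
img = [[r * 16 + c for c in range(16)] for r in range(16)]
flat = process_blocks(img, block_mean)
print(flat[0][0], flat[0][8])
```

Each block is handled independently, which is why the image dimensions were cropped to multiples of 8 beforehand.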
Here is the code.
original | p=1 |
---|---|
p=2 | p=4 |
The following quantization matrix is based on experiments with the human visual system and is widely used in currently distributed JPEG encoders. This matrix with loss parameter p=1, which is close to the default JPEG quantization, was used in this step.
\(Q_y = p\begin{bmatrix}16 & 11 & 10 & 16 & 24 & 40 & 51 & 61 \\ 12 & 12 & 14 & 19 & 26 & 58 & 60 & 55 \\ 14 & 13 & 16 & 24 & 40 & 57 & 69 & 56 \\ 14 & 17 & 22 & 29 & 51 & 87 & 80 & 62 \\ 18 & 22 & 37 & 56 & 68 & 109 & 103 & 77\\ 24 & 35 & 55 & 64 & 81 & 104 & 113 & 92\\ 49 & 64 & 78 & 87 & 103 & 121 & 120 & 101\\ 72 & 92 & 95 & 98 & 112 & 100 & 103 & 99\end{bmatrix}\)
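To show how this matrix treats low and high frequencies differently, here is a small Python sketch, assuming the same quantization rule as in the earlier step (divide entry by entry by \(pQ\), then round). The coefficient block `y` is hypothetical and the helper names are chosen for this example.

```python
import math

# JPEG-suggested luminance quantization matrix (the p = 1 case).
Q_Y = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def round_half_away(v):
    """Round to the nearest integer, halves away from zero (MATLAB's round convention)."""
    return int(math.copysign(math.floor(abs(v) + 0.5), v))


def quantize(y, q, p):
    """Yq = round(Y ./ (p * Q))."""
    return [[round_half_away(y[i][j] / (p * q[i][j])) for j in range(8)]
            for i in range(8)]


def dequantize(yq, q, p):
    """Approximate Y again as Yq .* (p * Q)."""
    return [[yq[i][j] * p * q[i][j] for j in range(8)] for i in range(8)]


# Hypothetical coefficients: two low-frequency entries and one high-frequency entry.
y = [[0.0] * 8 for _ in range(8)]
y[0][0], y[0][1], y[7][7] = -74.0, 30.0, 40.0

yq = quantize(y, Q_Y, 1)
restored = dequantize(yq, Q_Y, 1)
print(yq[0][0], yq[0][1], yq[7][7])
print(restored[0][0], restored[0][1], restored[7][7])
```

The large entries in the lower right of the matrix mean that a high-frequency coefficient must be much bigger than a low-frequency one to survive rounding.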
Original | Image using Hilbert matrix | Image using JPEG-Suggested Matrix | |
---|---|---|---|
To see how far the reconstructed pixel values were from the original pixel values, I used the RMSD (root-mean-square deviation) between the original pixel values and the pixel values obtained with the different quantization matrices.
RMSD = \(\sqrt{\frac{\sum_{i=1}^n(y_i'-y_i)^2}{n}}\), where \(y_i\) are the original pixel values, \(y_i'\) are the reconstructed pixel values, and \(n\) is the number of pixels.
RMSD for the 8 by 8 block with the linear quantization matrix was approximately 59.2954.
RMSD for the 8 by 8 block with the JPEG-suggested matrix was approximately 45.2351.
RMSD for the whole picture with the linear quantization matrix was approximately 7.9849.
RMSD for the whole picture with the JPEG-suggested matrix was approximately 5.4134.
These results tell us that the pixel values obtained with the JPEG-suggested matrix are closer to the original pixel values.
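The RMSD computation itself is short. A minimal Python sketch, assuming both images are nested lists of the same shape (the function name `rmsd` and the toy data are not from the original code):

```python
import math


def rmsd(original, reconstructed):
    """Root-mean-square deviation between two equally sized nested-list images."""
    flat_o = [v for row in original for v in row]
    flat_r = [v for row in reconstructed for v in row]
    if len(flat_o) != len(flat_r) or not flat_o:
        raise ValueError("images must be non-empty and have the same size")
    squared = sum((r - o) ** 2 for o, r in zip(flat_o, flat_r))
    return math.sqrt(squared / len(flat_o))


# Toy 2 by 2 example: squared errors are 9, 16, 0, 0, so the mean is 6.25.
a = [[0, 0], [0, 0]]
b = [[3, 4], [0, 0]]
print(rmsd(a, b))
```

A value of 0 means the reconstruction matches the original exactly; larger values mean the pixel values drifted further on average.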
Loss parameter p | 8 by 8 reconstructed image | whole reconstructed image |
---|---|---|
In this step, we transform the RGB color values to luminance and color difference coordinates.
To transform the RGB color data to the YUV system, we need the following:
the luminance \(Y=0.299R+0.587G+0.114B\) and the color differences \(U=B-Y\) and \(V=R-Y\).
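The color transform and its inverse can be sketched in Python as follows (not the author's MATLAB code). The inverse is obtained by solving the three equations for R, G, and B; the function names and the sample pixel are assumptions for this example.

```python
def rgb_to_yuv(r, g, b):
    """Luminance Y and color differences U = B - Y, V = R - Y."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, b - y, r - y


def yuv_to_rgb(y, u, v):
    """Invert the transform: recover B and R first, then solve the Y equation for G."""
    b = u + y
    r = v + y
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


# Hypothetical pixel, converted and converted back.
y, u, v = rgb_to_yuv(200, 120, 30)
print(y, u, v)
print(yuv_to_rgb(y, u, v))
```

For a gray pixel (R = G = B) both color differences are zero, which is why the U and V channels tend to carry less information than Y.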
We used the same quantization matrix \(Q_y\) from step 2 for the luminance variable Y, and the following quantization matrix \(Q_c\) for the color differences U and V.
\(Q_c = p\begin{bmatrix}17 & 18 & 24 & 47 & 99 & 99 & 99 & 99\\
18 & 21 & 26 & 66 & 99 & 99 & 99 & 99\\
24 & 26 & 56 & 99 & 99 & 99 & 99 & 99\\
47 & 66 & 99 & 99 & 99 & 99 & 99 & 99\\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99\\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99\\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99\\
99 & 99 & 99 & 99 & 99 & 99 & 99 & 99\end{bmatrix}\)
Loss parameter p | 8 by 8 reconstructed image | whole reconstructed image |
---|---|---|