Image & Audio Compression

Vincent Caetto


Introduction

In this project, we examined how to compress image and audio files using different compression methods. For images, we used linear quantization, JPEG-matrix compression and JPEG quantization.


Linear Quantization

For linear quantization, we first convert our image into doubles for calculations and then split it into a series of smaller 8 by 8 matrices. For color images, this process is done for each separate color layer (R, G, B), while a black and white image is simply converted as a whole. We then subtract 128 from each value, which moves the range of our color values from 0 to 255 to -128 to 127. Doing so centers the numbers around 0, which helps reduce some of the data loss that results from the compression. We now set up a least squares approximation of our subsets using the 2D-DCT (two-dimensional discrete cosine transform), an interpolation whose n by n matrix C (here n = 8) is defined by: \[ C_{ij} = \frac{\sqrt{2}}{\sqrt{n}} a_i \cos \frac{i(2j+1)\pi}{2n}, \quad 0 \le i, j \le n-1 \] where \(a_i = 1/\sqrt{2}\) for i = 0 and \(a_i = 1\) otherwise. The resulting matrix C is orthogonal, so its inverse is its transpose. Multiplying C by our 8x8 subset X and then by the inverse of C, that is, computing \(C X C^T\), interpolates the data in our subset. By interpolating the data, we set our subsets up for a least squares fit in our next step. We will later undo the interpolation to retrieve an approximation of the original data. At this point we bring in an 8x8 matrix called the quantization matrix, which regulates how each entry of our transformed 8x8 subsets is rounded and is used for the least squares fit. For linear quantization, this matrix is given by the formula:

\[ q_{kl} = 8p(k+l+1) \quad \text{for } 0 \le k, l \le 7 \] which results in the following matrix: \[ Q = p \begin{bmatrix} 8 & 16 & 24 & 32 & 40 & 48 & 56 & 64 \\ 16 & 24 & 32 & 40 & 48 & 56 & 64 & 72 \\ 24 & 32 & 40 & 48 & 56 & 64 & 72 & 80 \\ 32 & 40 & 48 & 56 & 64 & 72 & 80 & 88 \\ 40 & 48 & 56 & 64 & 72 & 80 & 88 & 96 \\ 48 & 56 & 64 & 72 & 80 & 88 & 96 & 104 \\ 56 & 64 & 72 & 80 & 88 & 96 & 104 & 112 \\ 64 & 72 & 80 & 88 & 96 & 104 & 112 & 120 \\ \end{bmatrix} \]

where p is called the loss parameter. The higher the value of p, the more compressed the image becomes and the more data is lost as a result. By dividing each value in our transformed 8x8 subsets by the corresponding value in Q and rounding, we reduce the amount of information stored in the subsets. Multiplying each rounded value by the corresponding value in Q again returns the rounded versions of our values. Afterwards, we undo the earlier 2D-DCT by multiplying the transpose of C by our 8x8 subset and then by the original C matrix. Lastly, we add the 128 back to our values to bring them back to the original range and convert them back to unsigned 8-bit integers for plotting.
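The steps of this section can be sketched for a single 8x8 subset as follows. This is only an illustrative sketch in plain Python, not the project's compression code: the helper names (`dct_matrix`, `matmul`, `transpose`, `linear_q`, `compress_block`) are made up for this example, the factor a_i is assumed to follow the usual DCT normalization (1/sqrt(2) for i = 0, otherwise 1), and Python's `round` sends ties to the nearest even integer, which may differ from the rounding used in the original code.

```python
import math


def dct_matrix(n=8):
    """n x n DCT matrix: C[i][j] = sqrt(2/n) * a_i * cos(i(2j+1)pi / (2n))."""
    rows = []
    for i in range(n):
        a_i = 1 / math.sqrt(2) if i == 0 else 1.0  # assumed standard normalization
        rows.append([
            math.sqrt(2 / n) * a_i * math.cos(i * (2 * j + 1) * math.pi / (2 * n))
            for j in range(n)
        ])
    return rows


def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def linear_q(p, n=8):
    """Linear quantization matrix with entries q_kl = 8p(k + l + 1)."""
    return [[8 * p * (k + l + 1) for l in range(n)] for k in range(n)]


def compress_block(block, q):
    """Run one square block of 0..255 values through the lossy round trip."""
    c = dct_matrix(len(block))
    ct = transpose(c)
    shifted = [[v - 128 for v in row] for row in block]      # center around 0
    coeffs = matmul(matmul(c, shifted), ct)                  # forward 2D-DCT: C X C^T
    quantized = [[round(y / qv) for y, qv in zip(yr, qr)]    # divide by Q and round
                 for yr, qr in zip(coeffs, q)]
    dequantized = [[k * qv for k, qv in zip(kr, qr)]         # multiply by Q again
                   for kr, qr in zip(quantized, q)]
    restored = matmul(matmul(ct, dequantized), c)            # inverse 2D-DCT: C^T Y C
    # add 128 back and clamp to the unsigned 8-bit range
    return [[min(255, max(0, round(v + 128))) for v in row] for row in restored]


flat = [[100] * 8 for _ in range(8)]
print(compress_block(flat, linear_q(1))[0])
```

A flat block is a convenient sanity check: after the shift, only the top-left (constant) coefficient is nonzero, and here it is a multiple of 8, so with p = 1 the block should come back unchanged. Blocks with more detail lose information in the rounding step, and more so as p grows.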

Links:

Compression code Problem 3 Problem 5


JPEG Compression

The simple JPEG compression is the same as linear quantization, except that the quantization matrix Q is different. The suggested matrix used for JPEG compression is:

\[ Q_Y = p \begin{bmatrix} 16 & 11 & 10 & 16 & 24 & 40 & 51 & 61 \\ 12 & 12 & 14 & 19 & 26 & 58 & 60 & 55 \\ 14 & 13 & 16 & 24 & 40 & 57 & 69 & 56 \\ 14 & 17 & 22 & 29 & 51 & 87 & 80 & 62 \\ 18 & 22 & 37 & 56 & 68 & 109 & 103 & 77 \\ 24 & 35 & 55 & 64 & 81 & 104 & 113 & 92 \\ 49 & 64 & 78 & 87 & 103 & 121 & 120 & 101 \\ 72 & 92 & 95 & 98 & 112 & 100 & 103 & 99 \\ \end{bmatrix}\]
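To illustrate how this matrix and the loss parameter p act on a transformed subset, here is a small hedged sketch in plain Python. The function names (`quantize`, `dequantize`, `count_nonzero`) and the sample coefficients are invented for the example and are not taken from the project code; the coefficients only imitate the typical situation in which the values shrink away from the top-left corner.

```python
# JPEG luminance quantization matrix Q_Y (before scaling by p)
Q_Y = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, q, p=1.0):
    """Divide each coefficient by the matching entry of p*Q and round."""
    return [[round(c / (p * qv)) for c, qv in zip(cr, qr)]
            for cr, qr in zip(coeffs, q)]


def dequantize(levels, q, p=1.0):
    """Multiply each rounded value by the matching entry of p*Q again."""
    return [[k * p * qv for k, qv in zip(kr, qr)]
            for kr, qr in zip(levels, q)]


def count_nonzero(levels):
    """Number of entries that survive quantization (are not rounded to 0)."""
    return sum(1 for row in levels for k in row if k != 0)


# Made-up DCT coefficients that decay away from the top-left corner
coeffs = [[400.0 / (1 + i + j) ** 2 for j in range(8)] for i in range(8)]

for p in (1, 2, 4):
    print("p =", p, "nonzero entries:", count_nonzero(quantize(coeffs, Q_Y, p)))
```

The number of surviving entries cannot increase as p grows, since every coefficient is divided by a larger number before rounding. The entries rounded to zero are where the compression comes from.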

Links:

Problem 4


JPEG-Quantization

Most of the steps for JPEG-Quantization are the same as for linear quantization. However, JPEG-Quantization does not work directly on RGB values like the previous two methods. Instead, we convert our RGB color data to the YUV system with:

\[ \begin{aligned} Y &= 0.299R + 0.587G + 0.114B \\ U &= B - Y \\ V &= R - Y \end{aligned} \] where Y is the luminance and U and V are the color differences. Using our newly obtained values, we now apply the quantization step. But unlike before, we use different Q matrices for different values. In addition to the \(Q_Y\) matrix from the previous part, we will use the matrix: \[ Q_C = p \begin{bmatrix} 17 & 18 & 24 & 47 & 99 & 99 & 99 & 99 \\ 18 & 21 & 26 & 66 & 99 & 99 & 99 & 99 \\ 24 & 26 & 56 & 99 & 99 & 99 & 99 & 99 \\ 47 & 66 & 99 & 99 & 99 & 99 & 99 & 99 \\ 99 & 99 & 99 & 99 & 99 & 99 & 99 & 99 \\ 99 & 99 & 99 & 99 & 99 & 99 & 99 & 99 \\ 99 & 99 & 99 & 99 & 99 & 99 & 99 & 99 \\ 99 & 99 & 99 & 99 & 99 & 99 & 99 & 99 \\ \end{bmatrix} \] This matrix has noticeably larger values and therefore discards more information, which is fine because we will be using it on the less important parts of our image. We apply the \(Q_Y\) matrix from the previous part to our luminance Y. Changes in the color differences U and V are much less noticeable to the human eye and, as such, we can safely apply the \(Q_C\) matrix to them. The rest of the process is the same as in the linear quantization case, except that we have to convert back into the RGB system at the end, using: \[ \begin{aligned} R &= V + Y \\ B &= U + Y \\ G &= \frac{Y - 0.299R - 0.114B}{0.587} \end{aligned} \]
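The two color conversions can be checked with a short round trip. The sketch below is plain Python with hypothetical function names (`rgb_to_yuv`, `yuv_to_rgb`) and works on a single pixel; it leaves out the quantization step, so the values should come back up to floating-point error.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel to luminance Y and color differences U, V."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = b - y
    v = r - y
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Invert rgb_to_yuv: recover R and B first, then solve the Y equation for G."""
    r = v + y
    b = u + y
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


pixel = (200.0, 120.0, 30.0)  # arbitrary sample pixel, not taken from the project images
y, u, v = rgb_to_yuv(*pixel)
print("YUV:", (y, u, v))
print("back to RGB:", yuv_to_rgb(y, u, v))
```

For a gray pixel (R = G = B), the weights sum to 1, so Y equals the gray value and both color differences are zero. This is why U and V carry only the color information and can be quantized more coarsely with \(Q_C\).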

Links:

Problem 6


Further examples

Links:

Animation code