Alright. In this section, the Fourier Basis, a particular Wavelet Basis, and the Cosine Basis are analysed using the "Lena" image. After one selects a basis, non-linear thresholding (keeping only the largest coefficients) gives the best M-term approximation to the original image. (1) With that approximation method fixed, one wants to determine for which basis the approximation error decreases the fastest as the number of retained coefficients, M, is increased. This is a way of knowing which basis is most efficient, given a particular approximation method. In the words of Mr. Peyré, "The goal is to use an ortho-basis B so that the error, ||f-f_M||, decays as fast as possible when M increases, for a large class of images." (2) Below is the image that will be used in this section.
First, the Fourier Basis is used to approximate the image. The two-dimensional DFT is performed, followed by a non-linear (hard) thresholding that retains only the Fourier coefficients whose magnitude is above 0.3. The image below is the result of the thresholding.
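As a rough sketch of this step (my own MATLAB rather than the Numerical Tours code, and assuming the image has been loaded into a variable f with pixel values rescaled to [0, 1]; the file name is also an assumption):

    % Load the image and rescale the pixel values to [0, 1].
    f = im2double(imread('lena.png'));
    % Orthonormal 2-D DFT, so Parseval's theorem holds with no extra factors.
    F = fft2(f) / sqrt(numel(f));
    % Hard thresholding: keep only the coefficients with magnitude above 0.3.
    T = 0.3;
    FT = F .* (abs(F) > T);
    % Invert the (normalized) transform and keep the real part.
    fT = real(ifft2(FT) * sqrt(numel(f)));
    imagesc(fT); colormap gray; axis image off;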
The next pair of images shows a different kind of thresholding: keeping only a fixed fraction of the largest coefficients. The first image shows the approximation with 1% of the coefficients retained; the second shows the result with 5% retained.
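A sketch of this fraction-based (best M-term) variant, reusing the normalized coefficients F from the snippet above:

    % Sort the coefficient magnitudes in decreasing order.
    cR = sort(abs(F(:)), 'descend');
    % Keep the largest 5% of the coefficients (use 0.01 for the 1% case).
    rho = 0.05;
    M = round(rho * numel(f));
    T = cR(M);                        % threshold induced by keeping M coefficients
    FM = F .* (abs(F) >= T);
    fM = real(ifft2(FM) * sqrt(numel(f)));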
Below are the Fourier Coefficients of the "Lena" image, plotted on a log scale.
To conclude the Fourier section, we define the non-linear approximation error err[M] = ||f - f_M||^2, where f_M is the approximation built from the M largest coefficients.
In other words, this represents the squared difference error between the original image and the non-linear approximation. By Parseval's theorem, it equals the squared norm of the image minus the cumulative sum of the squared coefficient magnitudes, taken in decreasing order. Using the cumsum function of Matlab, a plot is generated showing this error as the number of retained coefficients is increased. Note that the plot only shows the first 5200 or so coefficients. This graph is important because, in order to decide on the best basis to use, one wants an idea of how fast the error diminishes as more coefficients are added.
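A minimal sketch of how such a curve can be computed, again reusing the normalized Fourier coefficients F from above:

    % Squared coefficient magnitudes, sorted in decreasing order.
    c2 = sort(abs(F(:)).^2, 'descend');
    % Parseval: the M-term error is the total energy minus the retained energy.
    err = norm(f(:))^2 - cumsum(c2);
    % Plot the decay over the first few thousand coefficients, on a log scale.
    semilogy(err(1:5200)); xlabel('M'); ylabel('||f - f_M||^2');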
We now return to the familiar Wavelet Basis. "The Wavelet basis of 2-D functions is defined by scaling and translating three mother wavelet atoms" (3): the horizontal, vertical, and diagonal atoms, each built as a tensor product of a one-dimensional scaling function and a one-dimensional wavelet.
Mr. Peyré also states that "Non-linear wavelet approximation is at the heart of the JPEG-2000 compression standard." (4)
Interestingly enough, with "Lena" Mr. Peyré decides to use the Daubechies Wavelet with ten vanishing moments. I assume this is for a good reason. Below are the Wavelet Coefficients.
Next, Mr. Peyré explores retaining only a fraction of the coefficients for reconstruction, that is, the largest 1% or 5% of them. The result using 5% is a pretty good reconstruction of the image.
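Mr. Peyré's script relies on his Numerical Tours toolbox for the wavelet transform; as an alternative sketch of the same idea, here is a version using MATLAB's Wavelet Toolbox (the choice of wavedec2/waverec2 and of a full-depth decomposition are my own assumptions, not his code):

    % Full 2-D decomposition with a Daubechies wavelet (10 vanishing moments).
    lev = wmaxlev(size(f), 'db10');
    [C, S] = wavedec2(f, lev, 'db10');
    % Keep only the largest 5% of the wavelet coefficients.
    rho = 0.05;
    cR = sort(abs(C), 'descend');
    T = cR(round(rho * numel(C)));
    CM = C .* (abs(C) >= T);
    % Reconstruct the approximation from the thresholded coefficients.
    fM = waverec2(CM, S, 'db10');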
To conclude the Wavelet section, we examine the difference in the squared error between the Fourier and Wavelet Bases. This error was explained previously in the Fourier section. The main goal of this section was to determine which basis offers the fastest rate of error decay as we add more terms. So far, the Daubechies Wavelet with 10 vanishing moments is superior. I tried the same comparison with Daubechies Wavelets with 4 and 6 vanishing moments, and the Wavelet Basis was still superior to Fourier. What does this mean? Wavelet compression lets me use fewer coefficients than I would need with Fourier to achieve a desired level of approximation quality.
Here, one might wonder, "If the Fourier Basis uses sines and cosines, then the Cosine Basis is probably very similar to the Fourier Basis." Indeed, it is. Mr. Peyré describes the difference between the two: "The discrete cosine approximation (DCT) is similar to the Fourier approximation, except that it uses symmetric boundary conditions, instead of periodic boundary conditions, and is thus more useful to approximate an image." (5) What exactly is meant by these boundary conditions? In the case of the Fourier Basis, the DFT implicitly treats the image as if it were periodically repeated in both directions. Due to this periodicity, the image is given periodic boundary conditions at its borders. This has important consequences in image processing, where it can produce undesirable boundary artifacts. One can use the technique of zero padding to lessen this effect. This topic is addressed more thoroughly in "Digital Image Processing Using Matlab" by Rafael C. Gonzalez, Richard E. Woods, and Steven L. Eddins.
We know from basic trigonometry that cosine is an even function, and it should be no surprise that the DCT has even (symmetric) boundary conditions. It turns out that the DCT can yield a superior compression rate to that of the DFT, and the reason has to do with these boundary conditions. In short, the DCT implicitly extends the signal by even reflection, so the extension remains continuous at the boundaries. With the DFT, the implicit periodic extension generally has discontinuities at the boundaries, and discontinuities require more terms to represent. This stems from the rate of convergence of Fourier Series: the smoother the extended signal, the faster its coefficients decay. This is a very fascinating subject that I hope to develop a deeper understanding of.
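To make this concrete, here is a small 1-D experiment of my own (not part of the tour; it assumes the dct function from MATLAB's Signal Processing Toolbox). A smooth ramp does not match at its two endpoints, so its periodic extension has a jump and its Fourier coefficients decay slowly, while its even extension stays continuous and its DCT coefficients decay much faster.

    % A smooth signal whose endpoints do not match.
    n = 256;
    x = (0:n-1)' / n;                   % simple ramp: its periodic extension has a jump
    % Orthonormal Fourier and cosine coefficient magnitudes.
    cf = abs(fft(x)) / sqrt(n);
    cc = abs(dct(x));
    % Compare the sorted magnitudes: the DCT coefficients fall off much faster.
    loglog(sort(cf, 'descend')); hold on;
    loglog(sort(cc, 'descend'));
    legend('|DFT coefficients|', '|DCT coefficients|');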
With that in mind, the two-dimensional DCT is performed on the "Lena" image and the DCT coefficients are plotted below.
In exactly the same manner as was done with the Fourier Basis and the Wavelet Basis, the DCT approximation is produced with 1% and 5% of the largest DCT coefficients.
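A sketch of the global DCT version, assuming the dct2/idct2 functions of the Image Processing Toolbox:

    % Orthogonal 2-D DCT of the whole image.
    Fd = dct2(f);
    % Keep only the largest 5% of the DCT coefficients (use 0.01 for the 1% case).
    rho = 0.05;
    cR = sort(abs(Fd(:)), 'descend');
    T = cR(round(rho * numel(f)));
    FdM = Fd .* (abs(Fd) >= T);
    % Invert the DCT to obtain the approximation.
    fM = idct2(FdM);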
How do these compare with Fourier and the Daubechies Wavelet? The DCT fares better than the Fourier Basis, but worse than the Wavelet Basis. Next, the errors of the Fourier and Cosine Bases are compared.
This next approximation is truly fascinating. One decides on the size of a square; it should be small, and it must be smaller than the dimensions of the picture. For example, with a 512 x 512 image, Mr. Peyré uses a 16 x 16 square. Note that 512 / 16 = 32, and a square size that divides the dimensions of the picture evenly is obviously preferable. Next, one iterates through the image, taking one 16 x 16 block of pixels at a time, and performs a DCT on each block.
Pictured above is a 16 x 16 block of the "Lena" image, and next to it is the DCT of that square. Next, one iterates through the entire image, generating LDCT coefficients.
Creating an image approximation using the LDCT is very similar, code-wise, to generating the coefficients. One iterates through the coefficient array, selects a square block of coefficients, and performs the inverse DCT on that block. A sketch of both passes is given below.
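This sketch loops over non-overlapping 16 x 16 blocks (a 512 x 512 image is assumed; MATLAB's blockproc could replace the explicit loops):

    w = 16;                                  % block (square) size
    fL = zeros(size(f));                     % local DCT coefficients
    for i = 1:w:size(f,1)
        for j = 1:w:size(f,2)
            blk = f(i:i+w-1, j:j+w-1);       % one 16 x 16 block of pixels
            fL(i:i+w-1, j:j+w-1) = dct2(blk);
        end
    end
    % Threshold fL exactly as in the earlier sketches (keep the largest 1% or 5%),
    % then invert block by block.
    fM = zeros(size(f));
    for i = 1:w:size(f,1)
        for j = 1:w:size(f,2)
            cblk = fL(i:i+w-1, j:j+w-1);     % one 16 x 16 block of coefficients
            fM(i:i+w-1, j:j+w-1) = idct2(cblk);
        end
    end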
Here again, 1% and 5% of the LDCT coefficients are kept. For the record, I am calling these the LDCT coefficients, but keep in mind that the actual transform used in this process is the DCT. The first approximation, using 1% of the coefficients, has an alright SNR but is very blocky; one is literally able to see the squares that were used to iterate through the image. However, the second image is quite good; it achieves a superior SNR to the Daubechies Wavelet.
Above is the comparison of all the various bases used in this section. The LDCT is of particular interest. It starts out considerably worse than the other bases; however, as one adds more and more coefficients, it becomes the best basis of them all.
Mr. Peyré closes with a very interesting section. Images themselves have a notion of "complication" to them. Mr. Peyré writes, "An image is more complicated than another for a given orthogonal basis if its approximation error decays more slowly." (6) The examples below will help generate a proper notion of this.
From top right to top left, then bottom right to bottom left, we have, "Regular", "Phantom", "Lena", and "Mandrill". Here the Daubechies 6 Filter is used.
The first two images, "Regular" and "Phantom", are crude and blurry. These images do not require many high-frequency components in order to be represented well. On the other hand, both "Lena" and "Mandrill", especially the latter, have very fine details in them. In order to represent the fine details well, one needs more coefficients in their approximation. So, by the definition given by Mr. Peyré, we can say that, starting with "Regular", the images increase in complication: each subsequent image has finer details than the previous one, and more Wavelet coefficients are needed in order to create a satisfactory approximation to it.
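As a sketch of how such a comparison can be produced (the file names are placeholders of mine, and the db6 wavelet matches the Daubechies 6 filter mentioned above):

    % Hypothetical file names for the four test images.
    names = {'regular.png', 'phantom.png', 'lena.png', 'mandrill.png'};
    dwtmode('per');                          % periodized extension keeps the transform orthogonal
    figure; hold on; set(gca, 'YScale', 'log');
    for k = 1:numel(names)
        g = im2double(imread(names{k}));
        [C, S] = wavedec2(g, wmaxlev(size(g), 'db6'), 'db6');
        c2 = sort(abs(C).^2, 'descend');
        err = norm(g(:))^2 - cumsum(c2);     % M-term error, as in the Fourier section
        plot(err(1:5000));                   % first few thousand coefficients
    end
    legend(names); xlabel('M'); ylabel('approximation error');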
This section was a lot of fun. To me, the LDCT is the most fascinating topic in this section, so much so that I have made a pixelated LDCT of myself my Twitter avatar. The use of the LDCT raised many new questions, though. Why is its error decay more dramatic than that of the other bases? What happens if you change the size of the window? In fact, this section raised more questions than it answered for me in some ways. Suffice it to say, I find this subject matter extremely fascinating and very rewarding to study. In the next section, I explore Entropy Coding.
Works Cited
(1) Peyré, Gabriel. "Image Approximation with Orthogonal Bases." Image Approximation with Orthogonal Bases. N.p., 2010. Web. 07 July 2014.
(2) Peyré, Gabriel. "Image Approximation with Orthogonal Bases." Image Approximation with Orthogonal Bases. N.p., 2010. Web. 07 July 2014.
(3) Peyré, Gabriel. "Image Approximation with Orthogonal Bases." Image Approximation with Orthogonal Bases. N.p., 2010. Web. 09 July 2014.
(4) Peyré, Gabriel. "Image Approximation with Orthogonal Bases." Image Approximation with Orthogonal Bases. N.p., 2010. Web. 09 July 2014.
(5) Peyré, Gabriel. "Image Approximation with Orthogonal Bases." Image Approximation with Orthogonal Bases. N.p., 2010. Web. 16 July 2014.
(6) Peyré, Gabriel. "Image Approximation with Orthogonal Bases." Image Approximation with Orthogonal Bases. N.p., 2010. Web. 16 July 2014.
G. Peyré, The Numerical Tours of Signal Processing - Advanced Computational Signal and Image Processing, IEEE Computing in Science and Engineering, vol. 13(4), pp. 94-97, 2011.