fix; «I 33;: ii; 1' . Egg. ‘ :9? ' . k . \ a .l i r; ‘5‘ .flmmmg :1 .pnfima. in. {£0 6 I v.22? V n Y‘ .114 .. a tit. inunwg a I. :3: min.- !" -IC‘ 0 oulpl >~A.1d|»l 1"...- ‘ .‘0 [l- . I 2 7.5:}.118311. 3“ ....Iv3......i$h . .......9. 711%. ‘ 3.1.. 1 . . ... y. . fiazwqyaggfigfiairafia fi,§fiwfifi ,. @fimfi? ._ avg. h aéfi 3%» ; . flow. :L Ar. t : .3... \ fiesta: '2 no} This is to certify that the dissertation entitled NEW TRANSFORM-BASED REPRESENTATION FOR IMAGES presented by RAMIN ESLAMI has been accepted towards fulfillment of the requirements for the Ph.D. degree in Electrical Engineering /' 24/ fig /7/ lVTajor Professofs Signature August 1, 2006 Date MSU is an Affirmative Action/Equal Opportunity Institution v. _4fi__ -———-——— LIBRARY Michigan State University ;,----0-0-.---o-C-I-c-I-a-C-O-I-o-v- PLACE IN RETURN BOX to remove this checkout from your record. TO AVOID FINES return on or before date due. MAY BE RECALLED with earlier due date if requested. DATE DUE DATE DUE DATE DUE 2/05 p:/ClRC/DaleDue.indd-p,1 NEW TRANSFORM-BASED REPRESENTATIONS FOR IMAGES By Ramin Eslami A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Electrical and Computer Engineering 2006 ABSTRACT NEW TRANSFORM-BASED REPRESENTATIONS FOR IMAGES By Ramin Eslami This dissertation designs and develops new directional representation of images that are based on true two-dimensional (2-D) basis. Furthermore, we employ the proposed directional schemes in image denoising and coding, where the satisfactory results of these applications are heavily dependent on the nonlinear approximation behavior of the proposed schemes. First, we develop methods to find a translation-invariant version of a general multidimensional multi-channel subsampled filter bank. In particular, we extend the algorithme a trous to a generalized algorithme a trous. Then, based on this algorithm, we propose the translation-invariant contourlet transform (TICT) and also the less complex and less redundant scheme of semi-TICT (STICT). We use the proposed TICT and STICT schemes for image denoising and compare them with the state-of-the-art methods. Second, we propose a new family of nonredundant geometrical transforms using Hybrid Wavelets and Directional filter banks (HWD). We convert the wavelet basis functions in the finest scales to a flexible and rich set of directional basis elements by employing directional filter banks, where we form a nonredundant transform family, which exhibits both directional and nondirectional basis functions. We demonstrate the potential of the proposed transforms using nonlinear approximation. In addition, we employ the proposed family in two key image processing applications, image coding and denoising, and show its efficiency for these applications. Third, we examine the issue of linearly combining different denoising schemes in order to improve the denoising results. We propose to optimally mixing the denoising results by minimizing the £2 norm of the overall error, where we develop optimal and suboptimal approaches that do or do not require an access to the original image and the noisy image. To My Parents and My Wife iv ACKNOWLEDGMENTS First, I must thank the almighty God, who is the origin of all creation and innovation, for enabling me to advance my work by trusting in and relying on Him. Second, I sincerely thank Professor Hayder Radha, my advisor, whose guidance, support, encouragement, patience, and considerateness taught me precious lessons in my academic life and played the key role in order that I prosper and thrive. I am grateful to Professor John Deller, Jr. who partially advised my dissertation, and was very friendly and supportive. I would like to also thank Professors Anil Jain and Selin Aviyente as my other committee members whom I received their guidance and suggestions. I would like to express my gratitude to Professors Hassan Khalil, Lalita Udpa, and Michael Frazier for their teaching and help during my Ph.D. program. More importantly, I am sincerely and deeply grateful of my parents where their overall support, patience, and encouragements have been essential factors that I could pursue my degree. Finally, many thanks go to my wife and my daughter since it would have been impossible for me to accomplish many things in life without their support, companionship, and sympathy. TABLE OF CONTENTS LIST OF TABLES .............................................................................. viii LIST OF FIGURES .............................................................................. ix Chapter 1 INTRODUCTION ............................................................ 1 1.1 Motivation .............................................................................. l . Problem Statement ..................................................................... 3 1.3 Dissertation Proposal Outline ........................................................ 4 Chapter 2 TRANSLATION-INVARIANT CONTOURLET TRANSFORM 6 2.1 Introduction ............................................................................ 6 A. Background .................................................................... 7 B. Contributions .................................................................. 8 C. Overview of the Chapter .................................................... 10 2.2 Contourlet Denoising using Cycle Spinning .................................... 11 A. Cycle-Spinning Algorithm ................................................. 11 B. Numerical Experiment ...................................................... 13 2.3 Developing a TI Scheme for a Subsampled Filter Bank ........................ 15 2.4 Translation-Invariant Contourlet Transform .................................... 23 A. Translation-Invariant Pyramids ........................................... 23 B. Translation-Invariant DFB ................................................. 30 C. TI and Semi-TI Contourlet Transforms .................................. 33 D. Complexity Analysis and Efficient Realization ......................... 35 2.5 Image Denoising ..................................................................... 40 A. STICT Denoising Scheme Using Bivariate Shrinkage ................. 41 B. Simulation and Results ...................................................... 42 2.6 Conclusion ........................................................................... 47 Chapter 3 A NEW FAMILY OF NONREDUNDANT DIRECTIONAL TRANSFORMS ............................................................. 49 3.1 Introduction ........................................................................... 49 3.2 ' Background ........................................................................... 51 A. Motivation .................................................................... 51 B. 2-D Separable Wavelets .................................................... 53 C. Directional Filter Banks .................................................... 53 3.3 Hybrid Wavelets and DFB .......................................................... 56 A. Construction .................................................................. 56 B. HWD for Quincunx Wavelets ............................................. 64 C. Scaling Law and DFB Levels ............................................. 65 3.4 Analysis and Realization ........................................................... 67 A. Multiresolution Analysis ................................................... 67 B. Approximation ............................................................... 69 vi C. Efficient Realization ........................................................ 71 D. Complexity ................................................................... 74 3.5 Applications and Results ........................................................... 76 A. Nonlinear Approximation ................................................... 76 B. Image Coding ................................................................ 79 C. Image Denoising ............................................................ 82 3.6 Conclusion ........................................................................... 87 Chapter 4 ON THE LINEAR COMBINATION OF DENOISING SCHEMES 88 4.1 Introduction ........................................................................ 88 4.2 Related Work ........................................................................ 89 4.3 Optimal Linear Combination ...................................................... 90 A. Optimal LMS Approach ................................................... 90 B. LMS Approach using the Noisy Signal ................................ 93 C. Approximation to the Optimal LMS Approach .......................... 94 4.4 Experimental Results ............................................................... 96 A. A Brief Note on the Denoising Artifacts ................................. 96 B. Results ........................................................................ 99 4.5 Conclusion .......................................................................... 103 Chapter 5 FUTURE WORK .......................................................... 105 5.1. Image Coding ....................................................................... 105 5.2. Image Denoising .................................................................... 106 5.3. Image Watermarking ............................................................. 106 5.4. Other Possible Areas for HWD ................................................... 106 APPENDICES .................................................................................... 107 A.l Proof of Remark 2.1 ............................................................... 107 A2 Proof of Remark 2.3 ............................................................... 108 A3 The Rearrangement Algorithm .................................................. 111 REFERENCES ................................................................................... 113 vii Table 1 Table 2 Table 3 Table 4 Table 5 Table 6 Table 7 LIST OF TABLES PSNR Values of the Denoising Experiments ................................... 45 Largest First-Order Difference of the DF B When Applied to (a ............. 74 PSNR Values of the NLA Experiment for the Barbara Image ............... 78 PSNR Values of the NLA Experiment for the Barbara Image (Quincunx Case) ................................................................................. 79 PSNR Values of the Denoising Experiments Left Part: Different Transforms with Hard Thresholding, Right Part: Different Denoising Methods 84 PSNR Values of the Denoised Images ......................................... 100 PSNR Values of the Denoised Images (Advanced Methods) ............... 101 viii Figure 1. Figure 2. Figure 3. Figure 4. Figure 5. Figure 6. Figure 7. Figure 8. Figure 9. Figure 10. Figure 1 1. Figure 12. LIST OF FIGURES A flow graph of the contourlet transform. The image is first decomposed into subbands by the Laplacian pyramid and then each detail image is analyzed by the directional filter banks. ........................................ 13 The contourlet transform of the Boats image using 3 LP levels and {3, 2, 2} directional levels. For better visualization, the transform coefficients are clipped between 0 and 15. ......................................................... 14 The PSNR values of the denoised images versus the standard deviation of noise for the denoising experiments. ............................................. 15 Denoising experiment of the GoldHilI image corrupted with a Gaussian noise of sigma = 20. ................................................................ 16 A single-level multi-channel filter bank. ........................................ 17 The nonsubsampled or T1 multi-channel filter bank scheme in the polyphase domain. .............................................................................. 19 The effects of subsampling on the filters of a filter bank should be considered when developing a TI scheme. .................................... 20 Laplacian pyramid. Left: The signal x is decomposed into a detail signal, d, and approximation, c. Right: The reconstruction scheme. .................... 23 Single-level 2-D LP in the form of an oversampled filter bank. ............. 24 The frequency responses of the analysis filters in the 2-D oversampled LP (Figure 9). ........................................................................... 28 (a) The frequency response of a DFB decomposed in 3 levels. (b) An example of the vertical directional filter banks using 8/2 directions. (0) An example of the horizontal directional filter banks using 8/2 directions. .. .. 32 Some of the STICT coefficients of the Boats image using L =3 and {ii}lSiS3 = {3,2,2} directional levels. The coefficients for only one TILP Id) Id) subband (k = 3) are depicted. From left to right the subbands 771,3 772.3 and 77f; with all directions are shown. For better visualization, the transform coefficients have been clipped. ................................................... 34 ix Figure 13. Figure 14. Figure 15. Figure 16. Figure 17. Figure 18. Figure 19. Figure 20. Figure 21. Figure 22. Figure 23. (a) Reconstructed images using the subbands 71m pill. and 77:1,). for l s i s 3 , and 1 S k S 4 , (b) Image reconstructed using the subband(s) with indices (1 :1, i = 3 , and 1 S k s 4, (c) Original image. ..................... 36 The normalized joint histograms along with their contour plots of parents- children of the STICT coefficients for the images Barbara and Peppers. . 43 The denoising results of the Barbara image when (a) o = 20, (b) a = 40 ...................................................................................................................... 46 The denoising results of the GoldHill image when a = 10. .................. 47 The denoising results of the Boats image when a = 20 . ..................... 48 (a) 2-D separable wavelet transform. (b) The frequency partitioning of the separable wavelets. (c) Some basis functions corresponding to Horizontal, Vertical, and Diagonal subbands of biorthogonal 9/7 wavelets (from left to right). Note that only positive values are shown. .............................. 54 (a) The frequency response of a (full-tree) DFB decomposed in 3 levels. (b) An example of the vertical directional filter bank (VDFB) using 3 levels. (0) An example of the horizontal directional filter bank (HDFB) using 3 levels. ......................................................................................... 55 A schematic plot of the HWD-H transform using 1, = 3 directional levels. ......................................................................................... 60 The frequency partitioning in the HWD family using 11 = 3 directional levels. ................................................................................. 61 A directional subband of HWD (VDP). The shaded regions show the frequency support of the subband and the thick line indicate its major direction in the space domain. .................................................... 62 The effect of downsampling on the wavelet highpass subband which gives rise to frequency scrambling. By applying VDFB to horizontal subband in the HWD-H scheme, one can avoid inputting low-frequency regions of the wavelet subband to the directional decomposition. ........................... 63 Figure 24. Figure 25. Figure 26. Figure 27. Figure 28. Figure 29. Figure 30. Figure 31. Figure 32. Figure 33. Figure 34. Figure 35. Figure 36. (a) Some directional basis functions of the HWD-H (first two rows of which the last column corresponds to the pseudo-directional subbands) and HWD-F (all except the pseudo-directional ones) when l, = 3 . (Only positive values are shown.) (b) Four basis functions of the DFB with l = 3. (c) The corresponding magnitudes of the Fourier transform of the basis functions in (a). .................................................................................... 64 (a) The HQWD transform. (b) The frequency partitioning in the HQWD 1, =12 = 3 directional levels. (0) Left: A basis function of the QWT. Right: Some directional basis functions of HQWD. ................................... 65 Top: The analysis bank of the triplet QFB. Bottom: The synthesis bank. .. 72 Top: The fan filter pair using triplet filter bank [1]. Bottom: The fan filter pair using double-halfband filter bank [72]. .................................... 73 Left: The HWD-F transform of the Barbara image. Here J = 3, J," = 2 and 1] =12 =3. Right: The HWD-H transform of the Boats image with J=3,J,,,=2 andll=12=2. .................................................. 76 Examples of the nonlinear approximation PSNR results. Left: The NLA results for the Barbara image. Right: The NLA results for the Peppers image. ................................................................................ 77 Example of the NLA visual results for the Barbara image when M = 8192 . ......................................................................................... 79 Coding performance of the wavelet and HWD coders using SPIHT algorithm in terms of PSNR versus the rate for the Barbara image. 81 Coding results of the Barbara image at rate 0.25 bpp. ........................ 82 Coding results of the T exture16 image at rate 0.1 bpp. ....................... 83 Denoising results of the Peppers image when 0' = 20 . ....................... 85 Denoising results of the Barbara image when 0' = 40 . ...................... 86 (3) Some basis functions of wavelets. (b) The denoising result of the Barbara image using wavelets (0,, =20 ). (c) Some basis functions of contourlets. (d) The denoising result of the Barbara image using contourlets (0', = 20). ........................................................................... 97 xi Figure 37. Figure 38. Figure 39. Figure 40. The denoising results of the Boats image when a = 10_ (Note that the visual results for LMS__O and LMSfiN are similar to that of LMS_A.) ........... 102 The denoising results of the Barbara image when a = 40 . ................ 103 The polyphase domain scheme of the multi-channel filter bank given in Figure 5. ............................................................................ 109 A schematic diagram of the directional subbands using 1 levels (‘S’ is ‘ H ’ or ‘V’, and d = 2’“). Left: Subbands obtained by applying HDFB to a wavelet highpass channel. Middle: Subbands obtained by applying VDFB. Right: Subbands obtained by applying full-tree DF B. ...................... 111 xii Chapter 1 INTRODUCTION 1.1 MOTIVATION Natural images contain smooth regions separated by regular edges, which are the common singularities found in these 2-D signals. As a result, there are two major types of correlation (or forms of dependencies) that exist among pixels within these images: the dependencies over the smooth regions and those dependencies that are found along edges. This implies that images can be represented more efficiently than when using the Euclidean basis, which is their current representation in the spatial domain. There have been numerous approaches that attempted to decorrelate the dependencies in images. We can divide these methods into two major categories. Some methods work in the spatial domain, where they divide an image into homogeneous segments taking into account the edges. Binary space trees [75], quad-tree decomposition [83], and beamlets [31] are a few examples in the literature. These methods, however, are not successful in representing images with complex structure. The other category of schemes is based on a transformation or specifically, a basis or a frame decomposition, which could be overcomplete. Fourier and wavelets are two classical transforms in this category. A regular l-D signal having a singularity produces many large (or significant) coefficients in the Fourier domain, whereas the wavelet transform of this signal results in a few significant coefficients around the singularity [91]. This makes the wavelet transform an efficient scheme for piecewise smooth l-D signals. To measure this efficiency, we consider the nonlinear approximation behavior of the transform. In nonlinear approximation, one retains the M largest transform coefficients (significant coefficients) and sets the others to zero. Then one measures the (fz-norm) error of the reconstructed signal. For an orthonormal transform, this error is the same as the error in the transform domain, which is the [2 -norm of the insignificant coefficients. Hence, retaining the M largest coefficients is equivalent to the best nonlinear approximation for an orthonormal transform. It is shown that for piecewise smooth l-D signals, wavelets provide an approximation error with a fast decay as M increases [60]. For instance, this error for a piecewise constant signal with a single discontinuity is of order TM while it is of order —— when we use the Fourier transform [91]. In the 2-D case, however, although wavelets provide good nonlinear approximation results when compared to the Fourier case, they are not optimal. Two dimensional wavelets are constructed using tensor product of the 1-D transforms, which lead to separable bases. Consequently, they are not true bases for 2-D signals. That is, they ignore the regular geometry (e.g. regular edges) found in natural images. As a result, constructing new bases, which are more suited to natural images, has been a challenging task in recent years. Several studies of the human visual system have shown that it is a localized, multiresolution and directional system. In addition, [68] shows that the image patches used as a basis to represent images sparsely are localized, oriented and bandpass functions. Therefore, directionality is an important feature that is not rich in wavelets. Bandelets [57], brushlets [65], complex wavelets [53], contourlet transform [28], curvelet transform [12], directionlets [90], and steerable pyramids [84] are among the directional multiresolution transforms in the literature. There are, however, some disadvantages associated with some of these transforms. One major issue is redundancy, which is not desirable in image coding. Another issue is pseudo-Gibbs phenomena artifacts, which are introduced when setting some transform coefficients to zero prior to reconstructing the image. This happens in some image processing tasks such as image coding and denoising. It is worth noting that the problem of ringing artifacts is more serious for directional transforms when compared with their non-directional counterparts such as traditional 2-D wavelets. Another aspect of some of the recently proposed directional transforms is that they are adaptive in the sense that they do not follow a fixed procedure (independent of the signal) but rather signal- dependent procedures. Meanwhile, a fixed-procedure transform is usually desirable for a low computational complexity realization. Hence, one may argue that the recently developed signal-adaptive directional transforms have a key disadvantage when compared with non-directional transforms. 1.2 PROBLEM STATEMENT In this study, we look for new 2-D transforms, which satisfy the following criteria: 0 Provide perfect reconstruction 0 Be multiresolution and localized transforms 0 Have the feature of directionality 0 Be nonredundant (required for coding applications) 0 Provide good nonlinear approximation 0 Introduce a minimum level of ringing artifacts during nonlinear approximation (and thus when decoding and denoising) o Incur reasonable computational complexity A transform fulfilling the above criteria would be a proper choice for image denoising and coding. Therefore, we look for efficient 2-D directional transforms and study their applications in image denoising and coding. Note that redundancy is not a requirement for denoising applications. Hence, this particular feature has not been pursued by some of the transforms developed under this work. Furthermore, although the complexity aspects are critical for practical realizations of any transforms, these aspects do not represent the highest priority when developing new transforms with other (arguably more critical) features such as reconstruction with minimal artifacts and high efficiency nonlinear approximation. Nevertheless, this dissertation attempts to develop low-complexity variations of high-complexity directional transforms. 1.3 DISSERTATION PROPOSAL OUTLINE In Chapter 2 we study and propose methods to develop translation-invariant directional transforms that are primarily targeting denoising applications. To achieve this, we first develop a translation-invariant scheme corresponding to a general multidimensional multi-channel filter bank, where we extend the algorithme a trous to a generalized algorithme a trous. Then we use the proposed methods to develop translation-invariant contourlet transform and employ it to image denoising. We propose a new family of nonredundant directional transforms using Hybrid Wavelets and Directional filter banks (HWD) in Chapter 3. Then we use this family in a few image processing applications including nonlinear approximation, coding, and denoising and Show its efficiency when compared with other leading transforms. In Chapter 4 we propose and formulate methods to optimally/suboptimally and linearly combine different denoising schemes in order to improve the denoising results. Then we apply the proposed combination schemes to several image denoising approaches and provide the results. Finally, in Chapter 5 we state possible extensions and improvements that we are currently developing and pursuing for the proposed methods. Chapter 2 TRAN SLATION- INVARIAN T CONTOURLET TRANSFORM 2.1 INTRODUCTION During the past decade, wavelets have proven their capability in many signal and image processing applications such as compression and denoising [60]. Owing to the good nonlinear approximation property of wavelets for piecewise smooth signals, they have been very effective in generating efficient representation of one-dimensional (l-D) waveforms. In the case of natural images in which piecewise regions are separated by smooth curves (or edges), however, one can still observe that there are self-similarities among the wavelet subbands. This implies that one is able to further process wavelet coefficients of an image in order to achieve more de-correlation. It is well known that wavelets are properly structured to treat point-wise singularities; hence, they are appropriate in representing piecewise smooth l-D signals. In contrast, natural images contain 2-D singularities (edges), which need a more efficient transform than the wavelet transform (WT). A. Background An important factor of an effective transform is directionality. Having this feature, a transform would have the potential to handle 2-D singularities effectively. Many directional transforms have been introduced in recent years. Continuous (directional) wavelets [2], complex wavelets [54], steerable pyramids [85], and brushlets [65] are some examples in the literature. The wavelet X-ray transform [95] and ridgelet transform [l4] apply wavelets to the radon transform of an image in such a way that one could effectively represent arbitrarily-oriented lines in an image. To make the ridgelet transform applicable to a natural image, the authors in [12] constructed curvelets using three steps: subband decompositions of the image, partitioning the subbands into blocks in such a way to satisfy the anisotropic scaling law', and then applying the ridgelet transform to the resulting blocks. Using a similar idea of combining subband decomposition with a directional transform, Do and Vetterli introduced the contourlet transform (CT) [26]-[28]. In the CT, a Laplacian pyramid (LP) [10] serves as the first stage and directional filter banks (DFB) [5] as the second stage. The LP is a redundant transform with a redundancy ratio of up to 4/3; thus, since the DFB is critically sampled, the redundancy factor of the CT is up to 4/3. Both the LP and the DFB involve subsampling in their implementations and similar to the wavelet transform they are shift variant. Therefore, it follows that the CT is a shift- variant transform. An important advantage of translation invariance is that the performance for denoising applications is significantly improved when a translation- ' The anisotropic scaling law, which is different from the isotropic scaling law in wavelets, is based on having frame elements (generalized basis functions) with a 2—D support dimension that obeys the rule width or lengthz. invariant (TI) scheme of a subsampled transform is employed. This advantage of TI schemes is achieved due to the elimination of the pseudo-Gibbs phenomenon artifacts resulting from thresholding the transform coefficients [22]. Translation-invariant (sometimes called stationary or undecimated) wavelets have been introduced in several studies [6], [22], [62], [66], [71]. TI denoising can be realized through the cycle-spinning algorithm [22], [43]. Cycle-spinning, however, may not be a computationally efficient way to perform TI denoising for many applications. Hence, an alternative to cycle- spinning-based approaches is desired. Indeed, since wavelets are partially TI, by using an appropriate algorithm such as “algorithme a trous” [49], [82], the TI wavelet coefficients can be derived with low complexity, where only N log2 N wavelet coefficients are needed to obtain the TI wavelets of a signal of size N, when using L = logz N levels. In contrast, in the cycle-spinning algorithm, one needs to calculate N 2 wavelet coefficients for the same signal. In algorithme a trous, which is originally introduced for l-D wavelets, we need to update the wavelet filters at each level (except the first one) when using undecimated transform (i.e., there is no subsampling operations). In a nutshell, starting from the second level, we update the decomposition filters by upsampling them with two. That is, we put zeros between samples (trous in French means holes). In addition to using the same procedure in the synthesis stage, we also add a scaling factor equal to 1/ 2 at the end of each reconstruction level. 8. Contributions In this chapter we propose methods for developing a TI scheme from a general multidimensional and multi-channel subsampled filter bank (FB). In particular, the following contributions are presented in this chapter: 1. First, we employ the cycle-spinning algorithm (originally developed for the wavelet transform) to the contourlet transform, where it provides significant improvements over its wavelets counterpart when applied to noisy images. The high complexity of the cycle spinning algorithm, however, motivates us to develop another more efficient directional TI transform. In particular, we extend the algorithme c‘z trous [49] introduced for 1-D wavelets to a “generalized algorithme c‘z trous” (GAT), which is applicable to a general multidimensional subsampled (uniform or non-uniform) FB. We prove that the TI version of a subsampled FB obtained through the GAT provides a tight framez if the original subsampled F B has a tight frame. Using the proposed GAT along with employing modified versions of the DFB, we introduce the T I contourlet transform (TICT). Although the TICT has potential to image denoising applications [36], its high redundancy in conjunction with a relatively high complexity provides an opportunity for fiirther improvements. As a consequence, we propose semi- TIC T (STICT), which is a less redundant and less complex scheme. We also provide efficient realizations for the proposed TICT and STICT schemes. Subsequently, we propose a new image denoising scheme where we employ the STICT alongside the powerful bivariate shrinkage firnction (a Bayesian-based shrinkage approach, which considers the dependencies between transform coefficients) [79] and show its capability when compared with some 2 In the tight frame condition, the reconstruction frame elements are the same as those for the decomposition; thus, the signal can be easily reconstructed (see Section 2.4A). other outstanding denoising schemes. C. Overview ofthe Chapter The outline of the chapter is as follows: In the next section we present our first approach for translation-invariant contourlet denoising, where we employ the cycle- spinning algorithm. In Section 2.3 we study and develop a TI scheme of a subsampled F B. Then, we propose a TI scheme of the CT in Section 2.4. Section 2.5 presents a new image denoising scheme based on the STICT along with the simulation results and finally, our main conclusions and future work are given in Section 2.6. Glossary of Abbreviations: APS additions per input sample BLS-GSM Bayes least-squares estimate using Gaussian scale mixtures BS bivariate shrinkage CT contourlet transform CS cycle spinning DF B directional filter bank DTCWT dual-tree complex wavelet transform FB filter bank GAT generalized algorithme (‘1 trous HDFB horizontal DFB HT hard thresholding LP Laplacian pyramid MPS multiplications per input sample QFB quincunx filter bank 10 STICT semi-TI contourlet transform TI translation-invariant TICT TI contourlet transform TIDF B TI directional filter bank TILP TI Laplacian pyramid TIWT TI wavelet transform VDF B vertical DFB WT wavelet transform 2.2 CONTOURLET DENOISING USING CYCLE SPINNING A. C ycle-Spinning Algorithm Cycle spinning for denoising is a simple method that can be applied to a shift variant transform for signal denoising. For a shift variant transform T, operating on a noisy image x = S + noise, where S is the original image, we denote the 2-D circular shift by SM and the threshold operator by (9. Now, if the following procedure is applied: K,K2 3: 1 .2 S_,.,_,.(T“(6lT(S,-,,-(x))])), where (K I ,K 2) are the maximum number of shifis, one would expect an improvement for the estimation 5 compared to the de-noised image without cycle spinning. The above cycle spinning procedure consists of the following steps: 1. Initialize § = 0, i=0, and j=0 2. Circular-shift the noisy image by (i, j) 11 3. Decompose the shifted noisy image using the transform T 4. Threshold the transform coefficients 5. Obtain the denoised image by reconstructing the thresholded coefficients 6. Shift back the denoised image by (-i,— j) and multiply by 1/(K1K2) 7. Add the output of Step 6 to S 8. Incrementiby 1; if i 5 K1 go to Step 1 9. Incrementj by 1; if j S K2 go to Step 1 10. S is the denoising result For the wavelet transform decomposed into L levels (L S logz (N) ), where the input image is of size (N, N) and K = 2L , after K shifts in each direction, the transform output repeats. Since the contourlet transform is a shift variant transform, we can apply the same approach to the contourlet transform. Note that cycle-spinning algorithm is a highly complex procedure for translation-invariant denoising. As a result, we will later develop a translation-invariant contourlet transform to achieve efficient denoising. Figure 1 shows a flow graph of the contourlet transform. It consists of two major stages: the subband decomposition and the directional transform. At the first stage, we use Laplacian pyramid (LP), and for the second one we use directional filter banks (DFB). Quincunx filter banks are the building blocks of the DFB. We used the fan filters designed by Phoong, Kim, Vaidyanathan, and Ansari [72] with support size of (23, 23) and (45, 45) for the quincunx filter banks in the DF B stage. Figure 2 depicts an example of the contourlet coefficients of the Boats image decomposed into three LP levels and {1].}IS 1.53 = {3,2,2} directional levels (from finest 12 . Lp - DFB Image Figure l. A flow graph of the contourlet transform. The image is first decomposed into subbands by the Laplacian pyramid and then each detail image is analyzed by the directional filter banks. to lowest scales). B. Numerical Experiment To test our algorithm, we selected four images of size 512x512: Barbara, GoldHill, Mandrill, and Peppers. We used four approaches for our experiments: the contourlet transform (CT), the wavelet transform (WT), and the translation invariant wavelet transform (WT-CS) in addition to the proposed method based on the contourlet transform using cycle spinning (CT-CS). Moreover, we applied 81 cycle spins in this experiment. We used biorthogonal Daubechies 9-7 wavelet transform and the same wavelet filters for the LP stage of the contourlet transform. For the contourlet transform, we used 6 LP levels and {1].}1SjS6 = {5,4,4,3,3,2} directional levels. We added zero-mean Gaussian noise to the images and applied the above de-noising methods using a simple hard- thresholding to the noisy images. We set the thresholds to some values so that we obtain best PSNR values of the denoised images. For contourlet denoising, we set the threshold values to {1313,36 = a X {3.40‘,2.40‘,2.40‘,1.70’,1.70‘,1.20’} , where a is a constant around 13 Figure 2. The contourlet transform of the Boats image using 3 LP levels and {3, 2, 2} directional levels. For better visualization, the transform coefficients are clipped between 0 and 15. one used to obtain the best value of PSNR. Note that at the finer scales we apply higher thresholds, which improves the contourlet denoising results. Note also that this method of variable thresholding fails to improve the wavelet denoising results since it introduces annoying artifacts, which resemble speckle noise. Figure 3 shows the PSNR values of the denoised images versus a range of the input noise. For the Peppers image, which is a piecewise smooth image, and hence a “wavelet- friendly” image, the WT-CS performs almost the same as the CT-CS at a range of the input noise power. However, in case of images containing mostly textures and contours such as the Barbara image, the CT-CS yields significant improvements of up to 1.5 db over the WT-CS. To visually compare the estimated images, we show part of the denoised images of the GoldHill image for sigma = 20 in Figure 4. Barbara 2 ' .- . -._ . . _ -- 1k -1“ H iL L . LL. L . . . . - . 90 40 60 30 230 4O 60 80 standard deviation standard deviation Figure 3. The PSNR values of the denoised images versus the standard deviation Of noise for the denoising experiments. 2.3 DEVELOPING A TI SCHEME FOR A SUBSAMPLED FILTER BANK In this section we develop a TI version of a general multi-channel and multidimensional FB. Translation invariance is achieved through several ways. Consider a l-D wavelet transform scheme with periodic extension, and a signal of size N. At the first level, one decomposes the signal using the wavelet lowpass and highpass filters h and g, and then downsamples the resulting approximation and detail coefficients, that is, discards the odd-indexed coefficients. Now, if one cyclic-shifts the original signal by an even number, e.g. 2k, (k E Z) , the output will be shifted by k, that is, the single-level wavelet transform is T1 for even shifts. So, to make this wavelet transform TI, we also need the odd-indexed values of the filtered coefficients. To obtain these coefficients, one can cyclic-shift the signal by l (or an odd number) and decompose it; or, use the same signal and in downsampling shreds the even-indexed coefficients. Hence, one obtains two sets of transform coefficients for 15 Denoised Image Using WT-CS Denoised Image Using CT-CS PSNR = 28.87 PSNR = 29.39 Figure 4. Denoising experiment of the GoIdHill image corrupted with a Gaussian noise of sigma = 20. each channel (each has a size of N/2) at the first level. At the next wavelet level, we encounter the same situation. However, since we have two sets of approximation coefficients, using the same procedure as the one we used for the first level, for each wavelet channel we obtain four sets of coefficients, each having a length of M4, leading to 2N coefficients at this level. We can decompose the signal up to logzN levels, where we get a total of N logZN wavelet coefficients and an approximation signal that is constant [60]. This way we achieve translation invariance by keeping all transform coefficients. For images, however, this procedure should be done in two dimensions: row and column. Consequently, omitting even- or odd-indexed rows and even- or odd-indexed columns, one can downsample the coefficients in four ways. Below we examine the problem for a general FB. Consider a perfect reconstruction d-dimensional N—channel FB as illustrated in Figure 5. We denote M as a d xd sampling matrix. Note that if N =dM, where dM =|det(M)|, the PE is critically sampled and if N >dM, it is oversampled. Suppose we denote the outputs of the analysis filters before downsampling as H0(z) MIM ml) ”I‘M G0(1) W1(Z) C) Y1(Z) X— 111(2) ' I::>——' 01(2) x HN-l(z) WN—I (Z)YN-1(z) @ GN_1(Z) Figure 5. A single-level multi-channel filter bank. WIIHI’ for 0 S i < N, where n = (n1, ...,nd )T e Zd. Hence, we have Min]: WIIMHI- Below we state a generalized procedure for obtaining all possible shifts of a multidimensional and multi-channel FB. Remark 2.1: If one computes all possible shifts of wi[n] (see Figure 5) by3 kc EN(M), (OSchM —1), that is, {Wim'I-kcnOsCde—l’ where d M = Idet(M)| , and N (M ) is the set of integer vectors of the form Mt, t E [0,1)d , then the output of the analysis section is translation invariant. Notice that the shifting of wt.[n] is equivalent to shifting either the input signal x[n] or the analysis filter hi[n] by the value of kc . It is clear that for a multilevel FB, one can apply the above method at each level for as many inputs as that level has. (See Appendix Al for proof.) In the next remark, we give an example that illustrates the fact that the existence of subsampling operations in a FB is not sufficient for shift variance of the FB. In what md A follows we define zm = Z1 ---zd , where z = (21,...,zd)T and A . m=(ml,...,md)T EZd, and define zM =(sz‘,...,zmc‘1 )T where M 18 a dxd 3 Note that in general this shift could be kc + mM where m E Zd is an arbitrary integer vector. 17 o o l 1 matrix with mm. as its lth column. We also adopt the notatlon zM 9- 2W ) for an integer l E Z. Remark 2.2: Regarding Remark 2.1, if in a critically-sampled FB, without loss of generality, we have wi[n] = W0 [11 + RI], (0 Si S dM - I) (suppose that k0 = 0 ), the FB will be T1. In this case the analysis and synthesis filters satisfy H I. (z) = zk’ H 0 (z) and Gi(z) = z—k’GO (z), and {yi[n]}0 Wt.) G,(z > _, x z1 z 1 x, 1 x, L, ' dM W (2) k — N" —k _ z dM 1 , z dM I Figure 6. The nonsubsampled or T1 multi-channel filter bank scheme in the polyphase domain. scaled version of x, , i.e. x, = axr, where a =1/dM (see Figure 6). Furthermore, the redundancy of the resulting TI filter bank equals N. (See Appendix A2 for proof). Note that each synthesis channel of the non-subsampled FB in the polyphase domain [89], [92] yields a signal equal to the reconstructed signal, and as a result, the reconstructed signal (x,) is the average of the dM reconstructed signals resulting from the (1M output channels of PB in the polyphase domain. However, the non-subsampled PE is redundant by a factor of N. Although redundancy is not desirable in some signal processing applications such as compression, it provides abundant information about a signal, which is advantageous for some applications such as denoising. According to Remark 2.3, one can simply omit the subsampling operations at a single-level FB scheme to obtain a TI realization of the FB. For a multilevel FB, however, we cannot merely do so at every level to construct the corresponding TI scheme. In the nonsubsampled version of the FB, one has to change the analysis filters of level(s) l > 1 such that they operate the same way as if there is subsampling. In the next proposition, we show how one can construct new filters when one omits the subsampling operations in a multilevel PE, in order to achieve translation invariance. 19 Yt"’(z>—@—1 H."’ E K‘”(Z)——> H,(’)(zM)——><:>———> th(z)——> Gi(l)(z) L_.@_. Y,“’(z)——@—u Gil) (1M) __) Figure 7. The effects of subsampling on the filters of a filter bank should be considered when developing a TI scheme. Proposition 2.1 (generalized algorithme c‘z trous): Assume that we have an L-level octave-band multi-channel FB with analysis and synthesis filters at level 1 as H i(l)(z) and G,-(l)(z), (0 Si < N, IS I S L), respectively and a general d-dimensional sampling matrix M (with size d x d ). If one omits the subsampling operations in the F3 to obtain the TI scheme, the new analysis and synthesis filters at a level I, (I S l S L) 1—1 1-1 are H ,Il)(z) = H 1. (2M ) and Cit-(”(1): Gi(zM ), respectively. Proof We prove this proposition through induction. For the first level (1 =1), as stated in Remark 2.3, the filters remain unchanged. Now suppose we have the TI filters of H I. (z) = H I. (z ) and G,- (2) = G,(z ) for a level 1. Assume that the output of the analysis part at this level is Yi“)(z), (O Si < N). Now at the next level, I + l, we apply a F B using the previous level filters, which are H i“) (z) and Ci“) (2). Since in the original FB, each analysis (synthesis) filter presumes a downsampled (upsampled) version of the output of the last level, as depicted in Figure 7, the equivalent filters are obtained using the noble identities: H III“) (2) = H 5’) (2M) and GEM) (z) = Gr“) (1M) . Hence, 20 H.."+”(z) = H.(z"”""” ) = H.(z”’ ), and Gf’+”(z> = G.(z‘M’""M > = G.(z“’ >. Note that according to Remark 2.3, one has to include a scaling factor equal to 1/ (1M after each synthesis bank. 1:] The following corollary generalizes Proposition 2.1 when the sampling matrices are not the same. Corollary 2.1: Suppose that M I, (ISIS L), is the (d-dimensional) sampling matrix for the level I in the F B mentioned in Proposition 2.1. Then the equivalent analysis and synthesis filters for the non-subsampled F8 for levels I 2 2 (they remain unchanged I-I l-l HMJ] {II/"1] for the first level) are Hi(1)(z)=H,-(z[j=l ), and G,.(')(z)=G,.(z Fl ) respectively. To ensure perfect reconstruction, a scaling factor equal to l/d M1 is required after each synthesis bank having the sampling matrix M 1 . In Proposition 2.1 and Corollary 2.1 we have extended the well—known algorithme d trous proposed in [49] for the wavelet transform to a generalized algorithme d trous, which is applicable to a general multidimensional multi-channel FB system. Below we extend Remark 2.3 for a single-level non-uniform FB (i.e. has different sampling matrices). The equivalent single-level FB form of a multilevel wavelet transform is an example of such FBs. Proposition 2.2: Suppose that a perfect reconstruction multidimensional non- J uniform FB has N = Zn, channels with analysis and synthesis filters as i=1 {Hm,H,.,2,...,f-I,.,ni}15iS J, and {Gi,1,G,’2,...,Gi,ni}IS,.SJ, respectively. Suppose also 21 that the sampling matrices for the channels {(i,n); IS 11 S ”1} are equal, where we denote them as M i , (l S i S J) . Now, if we remove the sampling operations, the F B can be made perfect reconstruction by adding a scaling factor equal to 1/ (1 Mi , after each synthesis filter for the channels {(i,n); l S n S n,}. Proof: Here we provide a proof for a special case, where the FB with subsampling (which we call it A) is a simplified version (which is obtained through F Bs identities) of an L-level FB (15’), where B is composed of a series and parallel combinations of distinct FBs like the one given in Figure 5. Note that the total number of channels in B is equal to N. Now, regarding Remark 2.3, for each distinct PE in B one can omit the subsampling operations, add the proper scaling factors at the end of each distinct FE, and modify the filters taking into account the relevant level and the sampling matrices before this level. Finally, one can simplify the resulting FB (C) to a single-level one, where we denote it D. Note that the filters in D are the same as those in A. Assume that a set of distinct FBs, F , from the input to output of C with the scaling factors {1/ did; 1 S l S L}, lead to the resulting channels { (i n); l S n._ < n } in D, which have the saine scaling factor equal to 1/ d M .This scaling factor IS in fact equal to H(1/ d MI). On the other hand, one can conclude that the sampling matrices assciciated with F in the FB 8 are {M (I; ISIS L}. Thus, the sampling matrix for channels {(i, In); 1S n Sni} in A, which is La reduced version of B, is M =1:[(MiI ) , hence, dM_ =det{H(M,’ )}— — l_[(dM ,). Therefore, to obtain D fro=m A, one can simply [=1 omit the subsampling operations and add the scaling factor 1/ d Mi to the channels associated with the sampling matrix, M i . Cl In the next section, we develop a TI scheme of the contourlet transform using the algorithms mentioned in this section. 22 x— 4D D -.?—d d— WI i—x, Figure 8. Laplacian pyramid. Left: The signal x is decomposed into a detail signal, (1, and approximation, c. Right: The reconstruction scheme. 2.4 TRANSLATION-INVARIANT CONTOURLET TRANSFORM The contourlet transform is composed of two stages: a Laplacian pyramid (LP) [10], [29] and a directional filter bank (DFB) [5], [70]. The LP decomposes an image into a number of radial subbands plus an approximation image. Then, the DFB is applied to each resulting detail subband where a maximum number of directions are used at the finest subband, and this number of directional levels is decreased at every other radial detail subband to achieve the anisotropic scaling law of width at length2 [12], [27]. Since the contourlet transform is realized using two stages of subsampled F 85, to create a TI contourlet transform (TICT), we need to develop TI schemes for both stages, as explained below. A. T ranslation-In variant Pyramids A new reconstruction scheme was proposed for the LP that is based on the frame reconstruction, leading to more robustness against noise [29]. Figure 8 shows the LP decomposition as well as its new reconstruction schemes. We let the sampling matrix, M , equal to diag(2,2) for images, where diag(al,az, ...,aN) is defined as a diagonal N x N matrix having ((11,612, ...,aN) as its diagonal elements. Here H and G are 2-D 23 - m» M ’°“’ in Ge Y(z) K.(z) _s@_L<>__. F0( ) X _ z x 19(1) 173(2) 7(2) 8(1) V Figure 9. Single—level 2-D LP in the form of an oversampled filter bank. lowpass filter pair4. Note that if one removes the subsampling operations from this LP framework, the resulting nonsubsampled FB will fail to be perfect reconstruction. Do and Vetterli [29] proposed the LP in the polyphase domain [89], [92] in the form of an oversampled F B. In this form one can better observe the structure of the pyramids, and besides, it is a more suitable framework for developing the TI version of the LP. Defining the vector of the polyphase components of a signal x in the z-domain as X7, (z) = (X0 (z),...,X3 (z))T, and the filters h and g as the row and column vectors Hp(z) 2 (H0 (z),...,H3 (z)) and (373(1) = (GO (z),...,G3 (2))T, one can write the input-output relationship of the LP as (see Figure 9), ( H(z) \ (Hap zko ’ Go (1M)H(Z) K0 (2) 7(2) = (2.1) tz“ — G.(z“>H}. Figure 9 shows a single-level LP in the form of an oversampled FB. To construct a multilevel LP, one can simply iterate the single-level LP on the lowpass channel. Using the generalized algorithme a‘ trous developed in Section 2.3, the multilevel TI scheme of the LP is constructed by suppressing all subsampling Operations and modifying the filters at level I: T">(z)=(H(z“'"‘) Ko(z“’") K3(z’f"))r. and —1 —1 —1 s">(z>=(6(z”’ ) Fo(z) = (1 + z)N' (1 + z“ )N1 R(z), (2.2) where R(z) = R(z—l). Then, if Héld)(z) and HIM)(Z) are polyphase components of H(ld)(z), Hl(ld)(22) has zeros at z=ij or a)=:tn'/2. In addition, Héld)(22) 26 and Hl(ld)(22) cannot have a zero at z = —1. Proof: We can write Hg‘d>(zz) = Even{H(ld)(z)} and Hfld)(zz) = z“ Odd{H“d)(z)}. Therefore, Hl‘d’<22)|z=ij =:j2”r‘—R(—j»=o and for the second part since H (1d)(z) is a lowpass filter, we have H ("0(1) at 0 and consequently, regarding (2.2) its even and odd parts are not zero at z = —1 . D As a matter of fact, H 1(1d)(22) is also a lowpass filter with about half of the cutoff frequency of H(ld)(z). Note that the filters Gi(zM)H(z), (0SiS3) in (2.1) are separable and obtained from l-D filters Géld)(z)H(ld)(z) and Glad)(z)H(ld)(z), where they have zeros at a) = if and the latter has also zeros at a) = in/ 2. It turns out that the analysis LP filters K I. (z) , (0 Si S 3) - and similarly H(z) - have passbands with different cutoff frequencies as illustrated in Figure 10. Note that when we remove the subsampling operations in a F B, we encounter fewer restrictions in the filter design of such FBs. For instance, to ensure perfect reconstruction in a TI pyramid having five channels, the filters have to just fulfill the following condition: 3 H(z)G(z> + 26K.<21F.-(z)=1. i: However, there is no standard method for designing 2-D filters having more than two channels with arbitrary passband regions. Moreover, the McClellan transformation, 27 Figure 10. The frequency responses of the analysis filters in the 2-D oversampled LP (Figure 9). which is normally used in 2-D filter design, seems to be disadvantageous in designing 2-D multi—channel FB. As a result, we resort to biorthogonal filters in the TILP similar to the LP. Next, a frame analysis is provided for a single-level TILP. For redundant transforms, frames [25], [24] are efficient tools for analysis. A frame is defined as follows. Definition: Let the sequence {49].}1.er and signal x, be in the Hilbert space H; then {19].} jer is a frame if there exist two constants A>0, and B|2 S Bllxllz. We call A and B, the frame bounds. A frame is known jeF as a tight frame when A = B. In a tight frame, the signal is reconstructed through x=A_]Z19j. jer It is important to note that when a scheme can be expressed by a frame, it represents a stable framework, where the existence of an inverse transform is guaranteed. This is especially important for the schemes that are redundant. Since the LP is an oversampled FB, it could be better analyzed using flame theory. In the next proposition we prove that the TI realization of a single-level subsampled FB having a tight flame, is also a tight frame. Proposition 2.5: Consider a single-level multidimensional FB (see Figure 5) having a tight frame with frame bounds equal to one. Then, the corresponding TI FB provides a tight flame with frame bounds A = B = d M . Proof' From Remark 2.1, a technique to obtain a TI set of outputs is to shift the analysis filters by kc, (O S c S d M — 1). We also shift the synthesis filters, correspondingly. Hence, for each shift we have a distinct set of kernel functions. Furthermore, each set provides a tight flame as we show it below. Assume the kernel functions of the original FB are {9j}j zd , thus the tight flame condition implies that E 2 Z Kxagj» = ”xllz NOW a Shift in {6].}. 2" corresponds to the same shift in x as ’6 jeZd mentioned in Remark 2.1. As a result, the shifted version of the kernel functions is a distinct tight frame with flame bounds equal to one. Now suppose we denote the analysis 29 kernel functions for a shift of k c , by {gjc}jeZd , where O S C S d M — I. Let us denote {(01} = {9;}, (OSCSdM — l, and jEZd ), as the kernel functions of the TI FB. Then we have Z Kan->1 = Z KM) =dM llxll - jeZd 6:0 jeZd Therefore, the TI FB provides a tight flame with flame bounds A = B = d M . 1:1 As a result, if a subsampled FB provides a stable framework, the corresponding TI scheme also represents a stable realization. Corollary 2.2: If the subsampled FB in Proposition 2.5 has a tight flame with frame bounds equal to K, then the corresponding TI scheme provides a tight flame with flame bounds A = B =dMK. Corollary 2.3: Since it is proven that a LP with orthogonal filters provides a tight frame [29], the single-level 2-D TILP with orthogonal filters provides a tight frame with frame bounds equal to four. B. T ranslation-Invariant DF B The DFB is the major part of the contourlet transform. It is realized through iterated quincunx FBs, and some resampling operations that just rearrange coefficients. In an I - level DFB, we decompose the frequency space to 2’ wedge-shaped partitions (Figure 11). Using the noble identities, one can transfer all sampling Operation to the end (beginning) of the forward (inverse) transform of the DFB [70]. As a result, one obtains 2I analysis and 21 synthesis filters, Hi“) and Ci“) respectively and the overall ’ 3O sampling matrices Si”), (O Si < 21 ), as given below [70], [27]: s“): diag(2i‘l,2), for OSi<2i“ ‘ diag(2, 2H), for 2"‘Si<2’ Since it is the equivalent iterated DFB system for 1 levels, to construct the TI scheme, it is sufficient to suppress the subsampling operations and multiply the reconstructed signal by a scaling factor, which is 1/ det(S,-(I))=2-I for both vertical and horizontal directions. Therefore, the redundancy factor of such a scheme is equal to the number of . . i directions 2 . According to the passband regions of the TILP highpass filters (see Figure 10), for filters K 1 and K 2 it is more appropriate to employ vertically- and horizontally-oriented DFBS [38], respectively (as we explain further in Section 2.4C). In vertical DFB (VDF B) and horizontal DFB (HDFB) we can achieve vertical directions (directions between 45° and 135°) and horizontal directions (directions between -45° and +45°) as depicted in Figure 11. In these two modified DFB schemes, we stop decomposing the signal horizontally or vertically after the first level of the DFB. Therefore, the overall sampling matrices for VDFB and HDFB will be SN) _ Q, for subband y1 ' _ diag(2,2H), for 2"1Si<2” and SW) = {diag(2i“‘, 2), for 0 S i < 21:1 I Q, for subband yo where Q is the quincunx sampling matrix. Note that we can change the shape of subbands 31 (-fl,—7T) 41 (7r, 7!) 2 5 8 5 Lwi 7 6 8 5 3 (—7[, _7[) (a) (02 a) 2 I (7r, 7:) 11 (7F, 7?) yl 4 2 1 5 8 6 7 xa)l yo L601 7 6 yo 3 5 yl I 3 4 (—7[a —fl) (b) (C) Figure 11. (a) The frequency response of a DFB decomposed in 3 levels. (b) An example of the vertical directional filter banks using 8/2 directions. (c) An example of the horizontal directional filter banks using 8/2 directions. y0 and y1 (see Figure 11) in the spatial domain into a rectangle using a resampling matrix and shifting as explained in [38]. In the TI versions of the VDFB and HDFB we should consider the new sampling matrices to obtain the proper scaling factors. The redundancy factor of the modified (either vertical or horizontal) TIDFB will be 2’-1 + 1. Note that the construction provided for the (modified) TIDFB is not efficient in terms of complexity. We will present an efficient construction in Section 2.4D. 32 C. T I and Semi-T I Contourlet Transforms The TI contourlet transform (TICT) is constructed using the TILP and (modified) TIDFB. In fact, we employ a similar structure as the one used in the contourlet transform. However, when developing the TICT, since every level of the TILP has four highpass subbands, we propose to apply the (modified) TIDFB to each one of these subbands. The form of passbands of highpass filters in the TILP (Figure 10) suggests to apply regular TIDFB to highpass outputs of K0 and K 3 and use TI VDFB and TI HDFB for outputs of K1 and K2, respectively. To preserve the anisotropic scaling law of width 0C length2 , we apply (modified) TIDFBs with a desired maximum number of directional levels to the four finest subbands of the TILP, where we are at level one, then as we decrease the radial resolution of the TILP at higher levels, we decrease the directional levels at every other TILP level. Remark 2.4: Assume that a TILP has L levels and we apply ii-level (1 Si S L) (modified) TIDFBS to the four detail subbands of level i of the TILP. Then the redundancy factor of the constructed TICT is 3L2:=1 21". + 3 . Improvement in denoising performance is an important reason justifying the construction of a TI version of a subband scheme. Since the redundancy of the (modified) TIDFB increases exponentially as the number of directional levels is raised, it makes the TICT highly redundant when comes along with the redundant transform of the TILP. Therefore, we propose another variety of the CT, which is less redundant and less complex. This new scheme is accomplished through applying the critically-sampled (modified) DFBS to the TILP in much the same way that we employ the (modified) 33 Figure 12. Some of the STICT coefficients of the Boats image using L = 3 and {ii}lsis3 = {3,2,2} directional levels. The coefficients for only one TILP subband (k = 3) are depicted. From left to right the (d) (d) (d) . . . . . . subbands 771 3 , 772 3 , and 773 3 With all directions are shown. For better Visualization, the transform coefficients have been clipped. TIDFBs to realize the TICT. Hence, this contourlet realization is not TI and therefore, we refer to this approach as the semi-TI contourlet transform (STICT). The redundancy factor of this scheme is the same as that of the TILP, which is 4L + 1 . Figure 12 shows an example of the STICT of the Boats image using three TILP levels and (modified) DFBS with {libs-$3 = {3,2,2} directional levels. Images at the top part of each level in this figure indicate the horizontal directions. We will denote the transform coefficients of the TICT and STICT by pg) (m) and 7752 (m) , respectively, where i , (1 S i S L) indicates the pyramidal level, k , (1 S k S 4) shows the pyramidal subband at each level, (1, (1Sd.S2Ii -for regular DFB) specifies the directional subband at each level, and m denotes the position in two dimensions. Likewise, we can also denote the CT coefficients by yéd) (m) with the same definition for i , d , and m. Although the STICT is not TI, our preliminary image denoising results indicated the 34 potential of this approach [36]. The main drawback of a shift-variant FB scheme to employ for denoising is due to the appearance of artifacts when one reconstructs the signal from not all of the transform coefficients. Here, we perform a simple experiment (similar to the one in [53]) to evaluate our proposed methods. First we obtain the transform of a synthetic image of a circle using the CT, TICT, and STICT (with {1:}19.S3 = {3,2,2} ). Then, for each method we reconstruct the image by keeping one directional subband at a level (L =L) and its parent subbands in the other levels (L < L). Figure 13 shows some examples of the reconstructed images. It is clear that the images reconstructed using the CT show a lot of artifacts approving the unsuitability of this scheme for some image processing tasks such as image denoising. In contrast, the results of the TICT are almost artifact-flee with higher directional resolution. The STICT, interestingly, provides the results without noticeable differences to those of the TICT, which clarifies the importance of making the pyramidal subbands translation invariant. Hence, making the DFB stage translation invariant does not have much impact in improving denoising results. In the next subsection we will provide fast realizations of the TILP and (modified) TIDF B as well as the complexity analysis of the different proposed contourlet schemes. D. Complexity Analysis and Efficient Realization When employing the STICT and TICT we encounter alternative FB schemes for which we propose and express efficient realizations along with their individual complexities; then we identify the complexities of the above transforms. Note that we compute the complexities for the decomposition (analysis) stages while we have similar 35 TICT STICT (a) TICT STICT CT (b) (C) Figure 13. (a) Reconstructed images using the subbands 79) , péli, and ”ilk for IS i S 3 , and 1 S k S 4. (b) Image reconstructed using the subband(s) with indices 6. = I, i = 3 , and 1 S k S 4. (c) Original image. ones for the reconstruction (synthesis) bands. 1) E Since the sampling matrix M is separable, the 2-D filtering could be carried out in a 0") had) and g with lengths 1,, and lg, separable mode using the 1-D filters respectively. Therefore, from Figure 8, we have (lh + l g ) / 2 multiplications per input sample (MPS) and (1,, + lg — 2) / 2 additions per input sample (APS) (note that the input to the filter G has N 2 / 4 nonzero samples for an input image of size N x N). For 36 a multilevel LP, the complexity is up to 4/3 the complexity of a single-level LP. 2) TILP: Considering the transfer function of a single-level TILP (2.1), although the filters K I. , (0 Si S 3) are indeed nonseparable, we can do the filtering in a separable mode by computing the difference of the filtered input image (by first H (z) and then (fl-(ZN), which are both separable filters) from the input image shifted by k i. Since H (z) is a common filter in all the channels, we are just required to filter the input image by this filter once, which needs 21,, MPS and 21,, — 2 APS. Now, without loss of generality, suppose that lg is odd, then the polyphase components of gud) (denoted by g6”) and gl(1d)) will have the lengths of (Ig —l)/ 2 and (lg +l)/2. Meanwhile, the filters Cit-(ZN), (0 Si S 3) are created using the ID filters g3”) and glad) . It turns out that one needs to filter the rows and columns of the input signal by 851d) and glad) once. Hence, the complexity of filtering by (fl-(2M ), (0 Si S3) will be 2 x (lg - l)/2 + 2 x (lg +1)/2 = 21g MPS and 21g — 4 APS. Consequently, the overall complexity is 2(lg + [h) MPS and 2(lg +1}, -1) APS. For an L-level TILP, the complexity will be L times the complexity of the single-level TILP. 3) DEB; Although the quincunx sampling matrix is nonseparable and thus filtering using fan filters is nonseparable, Phoong et al. [72] proposed an efficient approach, which provides separable filtering in the polyphase domain. Suppose that the kernel function ,B(z) in the 37 ladder network [72] has length l , which generates the 2-D synthesis filters with support sizes of (213 — 1,21'3 - l) and (4lfl - 3,413 — 3). Then, the complexity of the quincunx filter bank (QF B) is 213 MPS and 2113 — 1 APS. Since we iterate the QFB at all channels for the higher directional levels, the complexity of the i -1evel DFB will be i times and that of a modified DF B is (i + 1) / 2 times the complexity of the QFB. 4) m In this case, opposite to the DFB, we have to perform nonseparable filtering at some levels due to omitting the subsampling operations. Nevertheless, we show that we can have a complexity similar to the separable filtering. Using again the filters designed in [72], we have the synthesis filters for QFB as follows: 41/3 -1 -1 HO(ZI922):(1/2)(zl +Zi 76(2122 )fl(2122)), and —4Ifl+l Hl(zl922)=zl _18(zlz2l)fl(2122)H0(zl’22), provided that the sampling matrix is Here, since both ,B(zlzz—l) and fl(zlzz) are diagonal matrices having [[3 nonzero elements, the complexity of the QFB is 41p MPS and 41 fl — 2 APS, which is the same as the separable case. In the second level of the TIDFB, we upsample the filters by Q, where ,B(z12;1)fl(zlzz) transforms to 3(222 )fl(z]2 ) and therefore, for each QFB we 38 reach the same complexity as the first level. For higher levels, in addition to the sampling matrix Q, we have resampling matrices, as well. The overall sampling matrices for these levels after the second level are in the form of [70] i-2 1 0 1% 11°10 21:21 ,. [‘4 Consequently, for a level 123, ,B(Z§)fl(zlz) converts to ,6(Z§),6(Z12 ) or ("—1 ,6(z§ )fl(212), which indicates that for each QFB at these levels also we have the same complexity as the first level. Since the total number of the QFBS employed in an i - level TIDFB is 2] — 1 and that of the modified TIDFB is 21“1 , (l 2 2), the complexity of these schemes will be 41,6 (21 — 1) MPS and (41,5 — 2)(2[ — 1) APS, and 41/, 21—1 MPS and (41,3 — 2)2H APS, respectively. 5) STICT: In this case, for an L-level STICT employing (modified) DFBS with A ll. , (1 S i S L) levels, we have 2L(l,, + lg) + 61,222. + 21,,1. MPS, and L A 2w, +1g — 1) + 3(21, “DZ-=11: + (21,, — 1) APS. 6) TICT: Considering the complexity of the TILP and (modified) TIDFB, 39 an L-level TICT having TIDFBs with 1,, (1 _<_. i s L) levels has the complexity of 2L0, +1g) + 121,321?=1 2’? —81,, MPS, and 2L(l,, + 1g — 1) + (41, — 2)ZL1(3(2’3' ) — 2) APS. Note that in the above calculations, we have considered general forms of the filters. If, however, linear phase filters are employed, we can use about half of the filter lengths in the above complexities. We see that due to the (modified) TIDFB, both the complexity and redundancy ratio of the TICT are exponentially proportional to the A directional levels 1,, (l S i S L), whereas they appear as linear terms in those of the STICT. Hence, significant reductions in complexity can be achieved when using STICT, especially when using a high number of levels. 2.5 IMAGE DENOISING One of the major applications of the wavelet transform is denoising. For images, however, directionality is an important feature that the regular WT lacks. It follows that, when one denoises images using wavelets, the edges and fine details are smeared. Therefore, using subband decompositions having the feature of directionality as well as a good nonlinear approximation property would result in superior image denoising performance [36], [80], [86]. The CT has been shown to be a better alternative choice than the WT at some cases [28], [36], [43], [74]. In [43], a cycle-spinning algorithm is employed to improve the denoising performance of contourlets. Although it is equivalent to a TI denoising if all of the possible shifts of the input image are used [22], the 40 computational complexity of this procedure for an image of size N X N is N 2 times that of the CT, which consequently makes this algorithm almost prohibitive for rather large-size images. Our preliminary work on contourlet TI denoising demonstrated the effectiveness of the STICT in image denoising [36]. Here, we improve our method through finding a suitable shrinkage function. A. S TIC T Denoising Scheme Using Bivariate Shrinkage One of the most crucial factors in image denoising is the method of shrinkage. Because of the inter-scale and intra-scale dependencies amongst the transform coefficients, it is of key importance to build the shrinkage operation upon an appropriate probability model to account for these dependencies. Bivariate shrinkage is a recent shrinkage approach, which in addition to taking into account the dependencies among the coefficients in each subband, considers the parent-children relationship into the MAP (maximum a posteriori probability) estimation [79]. In this work, we introduce a new image denoising scheme based on the proposed STICT and incorporating bivariate shrinkage. This shrinkage approach is established through modeling the joint probability distribution function (PDF) of parents and children of the transform coefficients. For wavelets and also dual-tree complex wavelet coefficients, [79] proposed the following non-Gaussian joint PDF 3 x = e a , 2.3 pX( ) 272'0'2 ( ) where x1 and x2 denote parents and children. The main advantage of this model is that it provides a closed-form shrinkage function that results in easy realization and also generates competitive results in comparison with the more sophisticated models [79]. 41 For the STICT, we need to first study the joint PDF of parents-children. In this case, we propose a parent-children relationship, which is similar to the one introduced for the CT coefficients [74]. Suppose that we have the STICT with I}, (1 S i S L) directional levels, then we consider the following parent-children relationship where III = (ml / 2, m2 )T for horizontal and III = (m1, m2 / 2)T for vertical subbands: parent child(ren) ”Six (m) if [1+1 = A1 a (i 7‘: L) nifi.’(m) —> ' nitrite) and mime) if I"... =I". +1, (in) For subbands corresponding to y0 or yl (see Figure 11), the children lie at the same position where the parents are in the next coarser level. Note that for the approximation subband 77L (m) , all the directional subbands at the previous level are children subbands with a similar relationship that was mentioned above. Using this definition, in Figure 14 we demonstrate the normalized joint histogram of parents—children for the Barbara and Peppers images, when an STICT with {BLSKZ '2 {3,3} directional levels is employed. We see that the joint histograms are similar to that of the wavelet coefficients (see [79]) and hence, we propose to use the model (2.3) for our bivariate shrinkage function in the STICT domain. B. Simulation and Results To evaluate the proposed schemes, we performed several experiments on a variety of images all of size 512 x 512. Here, we also provide the CT and TICT denoising results using hard thresholding. For the sake of comparison, we also employed some of the state 42 Barbara Image Peppers Image 0 “U E - x10- 0 Children Figure 14. The normalized joint histograms along with their contour plots of parents-children of the STICT coefficients for the images Barbara and Peppers. of the art methods in literature such as the dual-tree complex wavelet transform (DTCWT) with both hard thresholding (HT) and bivariate shrinkage (BS) [80], and the BLS-GSM denoising method propose by Portilla et a1. [73] (using full steerable pyramid with window size (3, 3) and inclusion of parents). Furthermore, we used a T1 (or undecimated) wavelet transform (TIWT) as well as adaptive Wiener filter (function wiener2 in Matlab) using a window size of (5, 5). Note that using the generalized algorithme a trous proposed in Section 2.3, we can easily construct the TIWT. Hence, a TIWT with L levels has the redundancy factor of 3L +1 and complexity of 2L(lg +l,,) MPS and 2L(lg +l,, — 2) APS where I,' and Ig are the lengths ofthe 1- hUd) (M). D analysis filters and g The filters we used for the TIWT and TILP in the (S)TICT are biorthogonal 43 Daubechies 9/7. Further, we used 5 levels for the TIWT and a 4-level TILP in the (S)TICT. For the (modified) DFB and TIDFB, we utilized the fan filters designed in [72] with [fl =12 and hence, support sizes of (23,23) and (45,45). We applied {LMS-S4 = {3,3,2,2} directional levels to the (S)TICT except for the STICT (BS) for the Barbara image where we used {1:}1554 = {4,3,3,2}. Note that if we use more directions and levels in the (S)TICT, there will be more artifacts introduced in the denoised images. The images were contaminated by a zero-mean Gaussian noise with a standard deviation of 0'. For all the denoising schemes, we assumed that 0' is unknown and we estimated it using the robust median estimator [32]. Moreover, we mirror-extended the noisy images to avoid boundary distortion. Although the size of the noisy images is rather large, the PSNR values of the denoising results change slightly (usually up to i0.1 dB) when we use a different noise instance. Hence, to obtain more accurate PSNR values, we repeated each denoising experiment ten times using different noise realizations and found the average of PSNR values. We also clipped the noisy images to set the pixel values in the allowable range of 0 to 255. Table 1 shows the PSNR values of the denoising results when the standard deviation of the input noise is varying between 0' = 5 and 0' = 100. In the first part of the table we used hard thresholding to compare different transforms for denoising. As seen, our proposed TICT (HT) method outperforms the other methods in most cases. In addition, the STICT (HT) provides competitive PSNR values to the other outstanding schemes. The second part of Table 1 shows our proposed STICT (BS) denoising results as well 44 Table l PSNR Values of the Denoisin Experiments Nois TIWT DTCWT STICT TICT TIWT prcwr BLS- STICT Image a Imagi; CT (HT) (HT) (HT) (HT) (HT) ll (BS) (BS) 180] GSM173] (BS) 5 34.15 33.64 36.54 36.04 36.75 37.07 37.85 37.09 37.96 37.95 w 10 28.13 30.17 32.58 32.66 33.19 33.49 33.72 33.50 34.40 34.42 a. 20 22. 15 26.49 28.26 28.93 29.26 29.53 29.41 29.78 30.60 30.59 3 40 16.38 23.17 24.53 25.02 25.42 25.70 25.33 26.35 26.70 26.77 70 12.25 21.04 22.19 22.28 22.54 22.78 22.38 23.44 23.54 23.57 100 10.16 19.76 20.60 20.62 20. 75 20.95 20.67 2_]_.__4_1_ 21.51 2 I .40 a F- 5 34.15 33.97 36.88 36.37 36.78 37.11 38.13 37.55 38.17 38.04 10 28.14 30.59 33.32 33.24 33.31 33.59 34.19 34.02 34.69 34.35 g9 20 22.18 27.27 29.75 29.80 29.79 30.03 30.32 30.63 31.14 30. 73 a 40 16.42 24.08 26.32 26.20 26.25 26.49 26.47 27.36 27.65 27.15 70 12.31 21.50 23.27 23.10 23.10 23.33 23.27 24.33 24.41 23.91 ‘ 100 10.20 19.83 2—l._0=6 21.00 20.89 21.08 L 21.12 21.94 21.55 5 34.15 35.00 37 33 37.35 37.30 37.52 38.27 38.00 38.22 38.20 10 28.13 32.10 34 53 34.71 34.67 34.89 35.08 35.30 35.60 35.44 g 20 22.13 28.91 31 35 31.45 31.52 31.75 31.54 32.35 32.63 32.30 8 40 16.35 25.62 27 83 27.67 27.94 28.22 27.68 29.24 29.39 28.82 70 12.22 22.78 24 55 24.31 24.52 24.82 24.23 25.93 25.96 25.35 100 10.15 21.98 g 22.09 22.15 22.39 _ 22.05 23.30 22.79 5 34.21 33.75 35 86 36.04 35.68 35.83 r 36.88 36.52 36.46 36.46 V 10 28.25 31.53 33 91 33.99 33.82 34.00 34.49 34.27 34.57 34.43 g 20 22.32 28.50 30 97 30.91 30.92 31.15 31.12 31.49 31.92 31.60 3 40 16.59 24.82 27 06 26.79 26.99 27.26 26.78 28.00 28.21 27.77 70 12.46 21.62 23 29 23.00 23.20 23.45 22.90 24.29 24.26 23.91 100 10.30 19.58 20 70 20.58 20.61 20.80 20.54 21.48 21.44 21.19 as those of the TIWT (BS), DTCWT (BS) [80] and BLS-GSM [73]. The computational times for these methods on the computer we ran the simulation were roughly 35s, 178, 58, and 95s, respectively. As seen in Table 1, for low and moderate noise (0' S 20) our method performs competitively to other methods but for higher power of noise this approach slightly degrades due to the amount of introduced artifacts. Visually, however, the proposed ST ICT (BS) method performs better in recovering very fine details found in some images such as the Barbara image. Figure 15 shows the visual results of the Barbara image where the superior performance of the proposed approach is clear. 45 TIWT (BS) DTCWT (BS) [80] BLS-GSM [73] STICT (BS) R = 30.60 )1 , H4 PSNR = 26.35 PSNR = 26.70 (b) Figure 15. The denoising results of the Barbara image when (a) 0' = 20 , (b) 0' = 40 . Another visual example is depicted in Figure 16, which illustrates part of the GoldHilI image. Again, we can see that the detail over the roofs are better recovered using the STICT (BS) approach. Finally, Figure 17 depicts another example from the Boats image. Here, we can compare the artifacts introduced around edges by these methods. Note that both the TIWT and dual-tree complex wavelets produce more (visible) artifacts around strong edges. The proposed method provides similar performance to that of the BLS-GSM in this figure. 46 TIWT (BS) DTCWT (BS) [80] PSNR = 33.09 PSNR = 32.82 BLS-GSM [73] STICT (BS) PSNR = 33.27 PSNR = 32.97 Figure 16. The denoising results of the GoldHiIl image when 0' = 10. 2.6 CONCLUSION In this work, we studied and developed new approaches for converting a general multi-channel multidimensional subsampled FB to a translation-invariant or non- subsampled FB. Particularly, we extended the algorithme £1 trous, which has been introduced for l-D wavelets, to a generalized algorithme a‘ trous, which is applicable to a general multidimensional and multi-channel FB framework. Using the proposed generalized algorithme d trous as well as incorporating modified versions of the DFB, we constructed the new scheme of the translation-invariant contourlet transform (TICT). We also proposed semi-TICT (STICT) to reduce the high redundancy and complexity of the TICT. Then, we used a competent Bayesian-based shrinkage approach in conjunction with the proposed STICT to create an efficient 47 BLS-GSM [73]; PSNR = 31 .14 STICT (BS); PSNR = 30.73 Figure 17. The denoising results of the Boats image when 0' = 20 . denoising scheme. Our results indicate the potential of this new scheme in image denoising. 48 Chapter 3 A NEW/ FAMILY OF NONRE DUNDAN T DIRECTIONAL TRANSFORMS 3.1 INTRODUCTION Recently, there have been several studies showing that separable 2-D wavelets fail to represent images optimally [60], [91]. It is well known that the wavelet transform provides efficient approximation of 1-D piecewise smooth signals; nevertheless, since natural images possess l-D singularities in the form of regular edges, approximation behavior of 2-D wavelets for images indicates the need for further improvement [91], [12]. As a means to offset this deficiency to some extent, most image processing systems utilizing the wavelet transform, for instance coding and denoising systems, usually take advantage of a post-processing stage to treat the inter- and intra-scale dependencies amongst the wavelet coefficients [73], [78], [79]. However, this approach alone does not necessarily eliminate the demand and need for more efficient image transforms. To construct an efficient image transform, the following criteria are critical. First, the transform should provide a good nonlinear approximation (NLA) [60] behavior. This 49 requires the transform to be direction-sensitive (or geometric) in addition to being able to provide perfect reconstruction, multiresolution representation, and localized analysis. Other important features include the transform performance in terms of introducing a minimum level of ringing artifacts during NLA. The second criterion is that the transform should incur reasonable computational complexity. In the light of this property, fixed- procedure transforms are more desirable in contrast to the adaptive transforms, which normally impose more computations. Finally, being nonredundant is a requirement in some image processing tasks, most notably image coding. In this chapter we introduce a new family of image transforms fulfilling the aforementioned criteria, study their properties and show their applications to coding and denoising of natural images. This family is one of the first nonadaptive directional approaches that is employed for image coding. The proposed transform family is constructed using Hybrid Wavelets and Directional filter banks (HWD); thus we refer to them as the HWD transforms. Other nonredundant geometrical image transforms include bandelets [57], CRISP- contourlets [59], directionlets [90], nonredundant complex wavelets [48], and multiresolution direction filter banks [67]. A primary difference between our proposed transform family and the other nonredundant transforms mentioned above is the following. While HWD is nonadaptive, it possesses a rich set of directions, and provides an efficient NLA by taking advantage of the wavelet transform in its construction, and thus, it could be directly employed in key image processing applications such as coding. We should also note that there have been a few attempts in the past to increase the directionality of wavelets using checkerboard filter bank [4], [17]. Although these 50 transforms provide nonadaptive directional extension of wavelets, they are limited to a small number of directions and do not have flexibility when compared with our proposed scheme. The chapter is organized as follows: In the next section we briefly present background material and the notations required for developing the proposed scheme. In Section 3.3 we explain the construction of the proposed HWD transform family. In Section 3.4 we provide multiresolution analysis and efficient realization of the transforms. The applications of the proposed family as well as the experimental results are given in Section 3.5 followed by our main conclusions in Section 3.6. 3.2 BACKGROUND A. Motivation It is known that the wavelet transform (WT) fails to provide optimal NLA decay for images containing regular regions of C a (a > 1) (i.e., a -order continuously differentiable regions) separated by regular discontinuities (or edges) of C a . While the optimal decay rate of NLA is of 0(M_a) where M is the number of retained coefficients during NLA, wavelets provide a decay rate of 0(M '1) [60], [57]. This low decay rate is due to the fact that the discontinuities in images yield many wavelet coefficients of large magnitude. That is, the regularity over the edges remains unseen from the WT. It turns out that there is quite ample room to further improve the NLA decay rate of wavelets. Here, we attempt to pass wavelet coefficients through a filter bank in order to combine the large wavelet coefficients around discontinuities to achieve a more sparse 51 representation. Although it is possible to construct a totally different basis from wavelets, we believe that improving wavelet basis has some key advantages: 0 The discrete wavelet transform can be realized using a critically-sampled filter bank and consequently, provides a nonredundant image decomposition. o Wavelets are popular in the image processing community and there exists numerous algorithms and procedures utilizing wavelets for image processing applications; hence, one can benefit from these algorithms by cleverly adapting them to the proposed transform family. 0 Wavelet packets is an alternative to handle the problem adaptively. One can also enjoy this feature of wavelets when extending it for the proposed transforms. Other leading approaches such as curvelets [12], [13] and contourlets [28] use a similar idea of combining large transform coefficients around discontinuities. The curvelet transform has been proposed in the continuous domain and therefore, implementing it in the discrete domain is challenging. The contourlet transform employs a Laplacian pyramid [10] to extract edges of an image and applies directional filter banks (DFB) [5] to all bandpass outputs of the pyramid with decreasing number of directions when moving to the coarser pyramid subbands. The DFB stage attempts to decorrelate the dependencies found over the edges in the bandpass outputs of the pyramid. The reason for choosing the Laplacian pyramid as the first stage is that because its highpass channels are not subject to downsampling and thus there is no frequency scrambling for these channels. This construction, however, leads to the existing redundancy of the contourlet scheme, which makes this transform unsuitable for image coding. Below we outline the notations we use in the chapter. Then we very briefly present 52 key aspects of the 2-D wavelet transform and DFB, which are required for the realization of the HWD family. Notation: We denote a discrete (1' -dimensional signal by x[n] where n = (nl,n2,...,nd), and its z-transform as X(z) = Znezd )c[n]z"n , where . d . z = (Zl,zz,...,zd) IS a complex vector and zn =1—IHZ?‘ . We also define zM as m m m . . z =(z 1,2 2,...,Z d),where M=[m1 m2 md]1sa dxd integer matrix with mi as its ith column. B. 2-D Separable Wavelets The 2-D separable wavelet transform [60] is obtained from the tensor product of the corresponding l-D wavelets. Suppose that H (M) (Z) and Gad)(z) are l-D lowpass and highpass decomposition filters, then the lowpass and three highpass channels corresponding to the Horizontal, Vertical, and Diagonal subbands for 2-D wavelets are obtained as Figure 18(a) illustrates. In this work we denote the sampling matrix M as diag(2, 2) = 212. Figure 18(b) shows how the WT5 partitions the frequency space. Since the WT uses a separable construction, the basis functions are merely aligned in two horizontal and vertical directions (see Figure 18(0)). As a result, wavelets have poor directionality. C. Directional Filter Banks Bamberger and Smith introduced directional filter banks (DF B) using quincunx and parallelogram filter banks [5], [3]. An improved version of the DFB using tree-structured 5 From hereafter we mention 2-D separable wavelet transform as wavelet transform (WT). 53 (7W!) To the H(z) = H(ld)(zl )H(ld)(22) @ next level D1 H" D1 x1n] 01(2) : H(‘d)(zl )G(ld)(zz) Wrizonml _ —— V. 51 v1 —-)0), G: (2) = thd)(zl )H(ld)(zz) Wired 63(2) = G“")(zl)G“"’(z.> We“! D. H, p, ('72., '7!) i (a) (b) (C) Figure 18. (a) 2-D separable wavelet transform. (b) The frequency partitioning of the separable wavelets. (c) Some basis functions corresponding to Horizontal, Vertical, and Diagonal subbands of biorthogonal 9/7 wavelets (from lefi to right). Note that only positive values are shown. filter banks was developed recently [70]. In an I -level DFB, the frequency space is divided into 21 wedge-shaped subbands (see Figure 19(a)). The overall sampling matrices Dim for channels 1 S i S 21 of such a DFB is [70] D") = diag(2H,2), for ISiSZI"l diag(2, 2H), for 2’"1 < i S 2’ ’ where the channels 151' _<_ 21‘1 correspond to the mostly horizontal subbands and the channels 2"1 < i S 2’ indicate the mostly vertical subbands. We can also construct half-tree DFBS by just decomposing the mostly vertical directions or the mostly horizontal directions, where we call the resulting schemes vertical DFB (VDFB) and horizontal DFB (HDFB) as Figure 19 depicts [38]. In VDFB 54 A (ma) 4 3 2 l 5 8 5 7 Jul 7 6 8 5 1 2 3 4 (—7[a—”) (02 502 A (7:, It) (a) Ar (7:, 72') yr 4 3 2 l 5 8 6 7 >601 y, Jo] 7 6 y 8 5 ’ (—7r9-7[) (_7r9~”) (b) (C) Figure 19. (a) The frequency response of a (full-tree) DFB decomposed in 3 levels. (b) An example of the vertical directional filter bank (VDFB) using 3 levels. (c) An example of the horizontal directional filter bank (HDFB) using 3 levels. (HDF B), we stop iterating at subband yh ( yv) after one level and do not decompose the signal horizontally (vertically) further. We call the subbands yh and yv as pseudo- directional subbands. The first level of the DFB is a simple quincunx filter bank (QFB) with fan filters. Therefore, the overall sampling matrices for VDF B and HDF B are DiVU) : Q, for subband yh (i = 'h ') diag(2, 2H), for 2"1 as 2’ ’ and 55 DH“) = diag(2"‘,2), for 1si_<_2"‘ ‘ Q, for subband y, (i = 'v')’ where Q is a quincunx sampling matrix. Note that we number the directional channels in the half-tree DFBS similar to a regular DFB. Moreover, we denote the overall reconstruction filters by F}__ L DD1 subbands Gs (Z) DFB Figure 20. A schematic plot of the HWD-H transform using 11 = 3 directional levels. 0 H WD using F 1411- Tree DFBS (H WD-F )8: Apply (full-tree) DFBs with lj levels to all three highpass subbands of wavelets at levels 1S j S Jm. We denote the subbands by VDY), HDy) . and DDS") . I - . . l - (z e I ((1! ) = {l | 1S1 S 2 I }) corresponding to the vertical, horizontal. and diagonal wavelet subbands to which we applied the DFBS. A schematic diagram of the HWD-H transform is illustrated in Figure 20. Using the noble identities [89], we can move the DFB filters before downsampling by M in the 8 We formerly called HWD-F and HWD-H as HWD type 3 and HWD type 2, respectively [42]. 6O ia’z wz l HWD-H HWD-F Figure 21. The frequency partitioning in the HWD family using II = 3 directional levels. WT. Consequently, we can find the frequency partitioning by the HWD family as Figure 21 demonstrates. Remark 3.1 (directional subbands): In HWD-H, since we apply VDFB to wavelet horizontal subband and HDFB to wavelet vertical subband, we convert wavelet horizontal and vertical subbands to mostly vertical and horizontal directional subbands, respectively. However, in HWD-F, we have all set of directions at each wavelet highpass subbands in the finest scales. Similar to the DFB, the major direction represented by each directional subband in HWD is perpendicular to the major axis passing through the subband (in the Fourier domain) as Figure 22 shows. As seen, a directional subband in the wavelet vertical subband represents a mostly horizontal direction (see also Figure 24). Remark 3.2 (frequency scrambling): Since our Objective is the construction of a critically-sampled scheme, we cannot avoid subsampling in the wavelet stage of the HWD transforms. As a result, frequency scrambling in the wavelet highpass subbands is inevitable. That is, the frequency regions of wavelet highpass subbands are subject to stretching and displacement due to downsampling by M. For instance, as Figure 23 61 a)2 ll Figure 22. A directional subband of HWD (VD?) ). The shaded regions show the frequency support of the subband and the thick line indicate its major direction in the space domain. shows, the high frequency regions (frequencies near 602 = fl') of the horizontal wavelet subband are mapped to low-frequency regions (frequencies near 602 = 0) after downsampling. Therefore, to decrease the aliasing due to downsampling, in HWD-H we decompose horizontal wavelet subbands (at finest scales) into mostly vertical directions (see Figure 23) and vertical wavelet subbands into mostly horizontal directions. Nonetheless, for some images with a large amount of textures and oscillatory patterns, taking advantage of full-tree DFBs in all wavelet finest subbands as in HWD-F, yields better results indicating the minor impact of the frequency scrambling in this case. In Figure 24 we show some basis functions of the HWD family in both the space and Fourier domains. (Note that the Fourier transform of a basis function corresponds to its relevant subband.) As a matter of fact, in the HWD family we convert the wavelet basis functions at a few finest scales to more directional basis elements. We also show a few basis functions of the DFB in Figure 24(b). Remark 3.3: Note that under HWD-F, since some Of the DFB filters are oriented 62 M HDl L.....l m subbands fl VDF B Figure 23. The effect of downsampling on the wavelet highpass subband which gives rise to frequency scrambling. By applying VDFB to horizontal subband in the HWD-H scheme, one can avoid inputting low- frequency regions of the wavelet subband to the directional decomposition. similar to the wavelet subbands, we have more aliasing. Additionally, from Figure 24(b) last row, we can see that since the wavelet filters fail to perfectly separate frequency regions, we have more leakage to low-frequency region in those subbands. Remark 3.4: As mentioned in Remark 3.1, the major direction represented by each directional subband in HWD is the same as the direction of the DFB subband that is employed in the HWD subband. As a result, all three directional subbands VD?) , HDg-i) , and DD?) represent the same direction as subband i of the DFB stands for (see Figure 19). We can see this fact from Figure 24 when we compare parts (a) and (b). Note that both stages utilized in the HWD family (i.e. the WT and DFB) are nonredundant and we can use any number of directions in this construction. Consequently, the HWD transforms provide a family of nonredundant, flexible and rich directional and nondirectional basis elements leading to good NLA decay for natural 63 HOV” vol") Win DU) (3) (b) Figure 24. (a) Some directional basis functions of the HWD-H (first two rows of which the last column corresponds to the pseudo-directional subbands) and HWD-F (all except the pseudo-directional ones) when 11 = 3. (Only positive values are shown.) (b) Four basis functions of the DFB with l = 3. (c) The corresponding magnitudes of the Fourier transform of the basis functions in (3). images as we demonstrate in Section 3.5. B. H WD for Quincunx Wavelets Similar to the HWD, we can add directionality to the quincunx wavelet transform (QWT) to construct Hybrid Quincunx Wavelets and Directional filter banks (HQWD). In contrast to the WT, the QWT uses nonseparable diamond filters and has just one highpass channel at each level. As a result, we propose the HQWD transform as follows (see Figure 25(a)): . m.- Apply (full-tree) DFBs with lj levels to the highpass subbands of quincunx 64 LL. ' 1 l v. as l the (I) next level __@_ To the I t"_.;-ll'.-il?.:z. ~- 4 x[n] QD1 subbands DFB (0) Figure 25. (a) The HQWD transform. (b) The frequency partitioning in the HQWD ll =12 =3 directional levels. (c) Left: A basis function of the QWT. Right: Some directional basis functions of wavelets at levels 1S j S Jm. We denote the resulting subbands by ODE-i) (Isiszbi Again, after using the noble identities, the frequency span of the HQWD is obtained as Figure 25(b) shows. A few basis functions are depicted in Figure 25(c). We consider the following quincunx sampling matrices for odd and even QWT levels, respectively: 91431:] at; '1‘]- In this case, at even levels (i.e. j = 2k, k e Z) we have the equivalent overall sampling of MU”). C. Scaling Law and DFB Levels Suppose that in HWD, we apply I]. -level DFBS to the highpass subbands of level j 65 hate 5 rain the b3: literal directi are C01 Cc ll‘ldll COEII'SL' abou' Shouj in the WT. Then a transform coefficient in the directional subbands of the HWD will , l'~1 . have support 8126 of about 26f x 2] t’f after the DFB reconstruction, where the maximum size of the fan filter pair of the DFB are assumed (Efjf). Now, to obtain the basis element, we pass the resulting coefficients through j -level WT synthesis bank. Therefore, we have an upsampling by Mj , which expands the size of the input by 2j x Zj followed by filtering by the overall synthesis filters of size about (Zj —1)l’g x (2" —1)€g. As a result, the support size of the basis elements in the +j—l directional subbands of the HWD family is about 211L131 x 2’ 62, where E, z 62 are constants. Consequently, similar to contourlets, we can hold the parabolic scaling law of Width 0C length2 [12], [28] through decreasing the directional levels I]. at every other coarser wavelet scales up to level Jm: [jg =le —|_(j2 — j1)/ ZJ, for levels 13 jl < j2 S Jm. Note that in the case of the HQWD, the QWT synthesis bank involves upsampling by Qj that is equal to M172. Hence, the support size of the basis elements in HQWD is +j/2—l about 2j/2+l€l x 2’j 62 (fl z 82). It follows that for the HQWD transform we should enforce the scaling rule: ljz =lj1—LU2 ‘j1)/4j, (151i = 2,, ejzzfi D.- Inn/int). for zeI.’ and 773130): 2... Z2f.‘”[m— D.“ ’njy/jma), for 1_<_is2’f(orietj,’f’). Therefore, by using orthogonal DFBS, we span the detail subspaces W }J , Wf’z , and W123 ( j S J m ) into the following orthogonal directional subspaces: w2,l:@w2,,i(l), w2,2_ ®W2,2,(i), and w2,3: @W2H3U) 1'er J ielh ield " For the other levels J m < j S J in HWD-H, we have the same functions as wavelets: {Win (t)}neZ2 ' lSkS3 68 Remark 3.5: The family 1(0) 2,(1j) 3,(1j) i1,j,n ’ i2,j,n ’ i3,j,n ilelv,izelh,i3eld neZ provides an orthonormal basis for W J? ( j S J m ). In addition, each individual family 1.(1j) 2.(’j) d 3.(’j) 7711,13" ilelv ’ 7712,13" izelh’ an ”bum i361d’ 2 2 2 Del neZ neZ . . . 2,1 2,2 2,3 provrdes an orthonormal basrs for the detail subspaces Wj , Wj , and Wj (1 S j S J m ), respectively. Proof of the above statement is similar to that of the quad-tree decomposition in wavelet packets [60]. Since we use orthonormal filters in the DFBS, we divide Wf’k , (1 S k S 3) into orthogonal detail subspaces after each directional level. As a result, the proof is achieved through induction. . LU ') 211') 3,0 ') Remark 3.6: The family {77:1, jfn (t), 77,2, ff“ (t), 77,3an(t)},.le,v,26,h,3e,d neZZ, jSJm together with {Win (t), (”Ln (t)}nezz, b1.) Jm provide an orthonormal basis for lSkS3 L2 (1R2) . We can derive similar analyses for HWD-F and also HQWD. B. Approximation Owing to the similar structure of the proposed HWD-F to contourlets [28], one can prove a similar NLA rate of decay for HWD-F for a class of signals. In particular, it can be shown that an image x containing C 2 regions separated by C 2 curves when 69 decomposed by an HWD-F transform, follows the NLA decay of 2 3 -2 “x — xM ”2 S A(logM) M , where xM is the reconstructed image using M largest-magnitude transform coefficients and A is a positive constant. Note that the directional subbands in the HWD-F transform should have as many directional vanishing moments as possible (ideally have perfect flat passbands and are zero elsewhere) and the wavelet scaling function should satisfy peC? The proof is similar to the one provided for the contourlet transform’s approximation decay [28], however, we must emphasize a few points. 1. Generally, the curves in the image will have components in all three wavelet highpass subbands where they are subject to being directionally decomposed by the DFBS. Thus, each segment of curve will have just significant components when it intersects a directional basis function oriented alongside the curve. The fact that we have three highpass channels in the WT stage of HWD as opposed to the one highpass subband of the pyramid stage of contourlets, only changes the constant A in the NLA decay rate. 2. The scales j > J m (wavelet subbands) in HWD-F mostly stand for the smooth regions and the wavelet highpass subbands furnish sufficient angular resolution for the curve components. 3. Unlike the contourlet transform, since HWD take advantage of wavelets with horizontal and vertical vanishing moments and good NLA decay when compared with the Laplacian pyramid in contourlets, practically HWD shows better NLA decay in 70 comparison to the contourlet transform (see Section 3.5A). In the next section, we provide an efficient realization for the proposed transform family. C. Efficient Realization While the WT can be implemented efficiently using 1-D filters, the DFBS in the HWD need to be treated carefully. Although the quincunx filter bank [89], which is the major building block of a DFB, is a non-separable filter bank system, it is possible to implement it using ladder network and hence benefiting from low computational cost similar to 2-D separable filtering. Phoong et al. [72] proposed a two-channel filter bank using a pair of halfband filters, which can be realized in the polyphase domain using ladder network. This scheme, however, has some restrictions. Ansari et al. [1] proposed a two-channel filter bank using a triplet of halfband filters, where they could address the restrictions in double-halfband filter bank. We use this scheme to construct the DFBS. The equivalent analysis band of the quincunx filter bank (QFB) scheme using diamond filter is shown in Figure 26. The parameter c is set to J2 — 1 , to ensure that the 1-D lowpass and highpass filters have the same magnitude of 1/J2 at a): fi/ 2; a condition not achieved using the double-halfband filter bank. To have the maximum regularity of the filters, we use the Lagrange coefficients (ak) in the FIR l-D T-filter [1] Id NT —k k-l R( )(z) = Zk=l ak (z + z ), where a ___ (—1)"+NT“1'[:1VT(NT + 0.5 — i) . " (NT - k)!(NT — 1 + k)!(k — 0.5) 71 —..——~—>|(1+c)/2IL fl} El» y, V l. x 1- —c R(z)| [05(1 +c)zl R(z)] _il +3 R(z) -cR(z) LTQ Hun-m} eggs y. Yo ——~1/\/§ w 2/(l+c) (1- film) —o.5(1+c)zl R(z) y! e-aéb 413—4 Figure 26. Top: The analysis bank of the triplet QFB. Bottom: The synthesis bank. TO have a QFB with diamond filter pair, we use the transformation R(z) = R(ld)(zl)R(ld)(z2), whereas we use time-reversed versions of the T —filters to obtain a QFB with fan filters: R(z) = R(ld)(—ZI)R(1d)(—zz). The resulting filter pair will have support sizes of (8NT — 3, SN, —3) and (12NT - 5, 12NT — 5). Figure 27 depicts the frequency responses of the fan filter pairs using both double- halfband (with support size of (23, 23) and (45, 45)) and triple-halfband (with support size of (29, 29) and (43, 43)) ladder structures. It is clear that the triplet filter bank yields smoother fan filters and consequently introduces less visible ringing artifacts when employed in the DFBS. One of the issues affecting the efficiency of a transform is the regularity of its filters [55]. Since HWD is composed of two filter bank stages, its regularity is dependent on the 72 Figure 27. Top: The fan filter pair using triplet filter bank [1]. Bottom: The fan filter pair using double- halfband filter bank [72]. regularity of both wavelets and DFBS. While the regularity of DFBS needs a comprehensive treatment, here we resort to performing a simple test. The largest first-order difference of the coefficients of the iterated filter bank in the lowpass channel (which leads to an approximate of the scaling function) is an indication of the regularity [55], [56]. Here we measure the largest first-order difference when we apply an I -level DFB to the wavelet scaling function. We approximate (p by 7 iterations of the Daubecheis 9/7 analysis filter and apply both double- and tipple-halfband DFBS with normalized filters and 1 levels to q). Table 2 shows the maximum value of the first- order differences obtained for all the DFB subbands in the n, and n2 directions. As seen 73 Table 2 Largest F irst-Order Difference of the DFB When Applied to to | DFBlevel(l) INODFB 2 3 4 I DFB(THF)' l 0.0239 0.0245 0.0320 0.0431 | | DFB(DHF)2 | 0.0239 0.0466 0.0815 0.1479 J IDF B using normalized triple-halfband filters 2DFB using normalized double-halfband filters and expected, it is clear that the DF B with triple-halfband filters has more regularity. In what follows, we examine the complexities of the proposed schemes. D. Complexity Since the HWD transforms are composed of two stages, we first express the complexities of wavelets and DFBS. Here we evaluate the complexities of the analysis banks; similar expressions for the synthesis banks can be derived. 1) BLT; Suppose that we use analysis 1-D filters of a same even length, 6 ,, , in the 2-D WT. Then the complexity of a single-level WT is 26 ,, multiplications per input sample (MPS) and 26 ,, - 2 additions per input sample (APS). If we use linear-phase filters, we have 6 ,, MPS and 26 ,, — 2 APS in the WT. For an octave-band WT, the complexity will be up to 4/ 3 times the complexity of a single-level WT [92]. 2) DFB: Regarding the ladder network shown in Figure 26, the complexity of the QFB is evaluated as follows. Since at each channel we decrease the number of samples by 2 and we have three levels of separable convolution with 1-D T -filters having a length of 6, , the complexity of the QFB is 36, +1 MPS and 36, — 3/ 2 APS. For the linear- phase maximally flat filters that we use in this work, we have 3N, +1 MPS and 3N, - 3/ 2 APS (NT = 6, / 2) for the QFB. For a J -level octave-band QWT, we 74 have 2(1 — 1/ 2J) times the complexity of the OF B. The complexity of an I -level DFB is 1 times that of the QFB (note that after each level the size of the signal is halved). A half-tree DFB needs (1+ 1)/2 times the operations required for the QFB. 3) HWD: Suppose that in the HWD (considering linear-phase filters), we apply I, - level DFBS (1 S j S J m) to the j th highpass channels of the WT. Then the complexity Of HWD-F is about (4/3)r,, + 3(3N, + 92:3, (1,741) MPS and (4/3)r,, +9(N, —1/2)Z::'l (1,741') APS. In the case of HWD-H, the complexity is about (4/3)r,, + (3N, + 02:; ((21,. +1)/4f) MPS and Jm - (4/3)t,, +3(N,. — 1/2)Z,=1 ((21,. +1)/4J) APS. And for the HQWD we have Jm - (3N, +1)(2(1-1/2J)+ 27:1 (1,721)) MPS and 3(N, —1/2)(2(1—-1/2J)+Zj_':l (1,721)) APS. 75 Figure 28. Left: The HWD-F transform of the Barbara image. Here J = 3 , J m = 2 , and I, = I, = 3. Right: The HWD-H transform of the Boats image with J = 3 , J m = 2 , and I, = 12 = 2 . 3.5 APPLICATIONS AND RESULTS In this section we show examples of the HWD transforms and then present potential applications of the proposed transforms. Particularly, we examine their applications in nonlinear approximation, image coding, and image denoising. Figure 28 depicts two examples of the HWD transforms of the Barbara and Boats images. In this figure the wavelet subbands are separated with white lines and the directional subbands at the two finest wavelet subbands are separated with gray lines. The transform coefficients are clipped for better visualization. A. Nonlinear Approximation Nonlinear approximation (NLA) is an efficient approach to measure the capability of 76 MAfafi'eF’enoeIsinage 36 38 34- /i 35' z A // , I I. 32* a: 30‘ : l ‘ Z //," g 30. hi 28» ,/ . Xi ,/ 28‘ 26»- /,/I . . I! 4 . ,/’/. —HND 26* ——HND 24» f/ff .. ' ' Wavelets 24 ~ Wavelets . 2 I I I I a I I I I 11 12 13 14 15 11 12 13 14 15 1092M 1092M Figure 29. Examples of the nonlinear approximation PSNR results. Left: The NLA results for the Barbara image. Right: The NLA results for the Peppers image. a transform in sparse representation of a signal. Having a good NLA behavior, a transform would have potential in several signal processing applications such as coding, denoising, and feature extraction. We tested our proposed transforms using a variety of images and compared them with other transforms such as the WT and contourlets [28]. We used five decomposition levels in all methods and employed Daubechies 9/7 filters for the WT. For the HWD transforms we set Jm = 2. For contourlets we used {1,}ls 1.35 = {5,4, 4,3,3} (j =1 corresponds to the finest scale) directional levels. Figure 29 shows two examples of the NLA PSNR results versus the number of retained coefficients. For the Barbara image we used HWD- F with 11 =12 = 3 directional levels while for the Peppers image (and other images that contain less texture) we used HWD-H with 11 =12 = 2. The proposed HWD transform shows promising results for the Barbara image (and other images with significant texture content) where it consistently outperforms both 77 Table 3 PSNR Values of the NLA Experiment for the Barbara Ima e Method / M 2048 4096 8192 16374 32768 HWD (THF)' 23.91 25.86 28.28 31.35 35.39 HWD (DHF)2 23.87 25.73 28.05 30.95 34.73 Wavelets 23.33 24.63 26.68 29.95 34.58 Contourlets [zg 23.29 24.95 27.08 29.63 32.91 IHWD using triple-halfband filters 2HWD using double-halfband filters wavelets and contourlets. In particular, it achieves up to 1.6 dB (1.2 dB) improvement over the WT (contourlet transform). In the case of the Peppers image, the HWD transform provides comparable result to that of wavelets. Note that for many other images such as Boats, Fingerprint, GoldHill, Mandrill, and texture images our experiments indicated that HWD always provides better NLA performance. Some numerical values for the NLA of the Barbara image are also given in Table 3. To demonstrate the effect of employing regular fan filters in HWD, we also provided the HWD results when using double-halfband filters. The superior results especially for large values of M are clear for the HWD transform using maximally flat triple-halfband fan filters. Figure 30 shows the visual results of NLA of the Barbara image when M = 8192. As seen, the proposed HWD method provides better detail in conjunction with an acceptable level Of artifacts in the result. Our experiments implied that the HWD-H is more appropriate for images that are mostly smooth, whereas HWD-F provides very good performance for images containing a significant amount of fine textures. Weialso performed NLA for the HQWD transform and compared it with the quincunx wavelet transform. Table 4 shows the PSNR values obtained for the Barbara image. As seen, HQWD provides a growing improvement in the PSNR values as the 78 ‘1 i- . . (b. Alill. ‘il‘llll 11"" v ‘m '/6// lit, \1‘” , ) t I“ Contourlets; PSNR % 27.08 Figure 30. Example of the NLA visual results for the Barbara image when M = 8192 . Table 4 PSNR Values of the NLA Experiment for the Barbara Ima e uincunx Case Method / M 2048 4096 8192 16374 32768 I HQWD 22.21 23.69 25.59 28.18 31.90 Quincunx WT 21.95 23.06 24.54 26.64 30% number of retained coefficients increases. In this experiment we used ten wavelet levels andfortheHQWDweused Jm =4 and 1, =3 (lSjSJm). B. Image Coding Due to the good NLA performance of the HWD family and since this transform family is nonredundant, a potential key application for the proposed transforms is image coding. Although the NLA decay rate of wavelets for images is suboptimal, one can benefit 79 from tree-based coding schemes to improve this decay rate [20]. The SPIHT algorithm is an efficient tree-based wavelet coding scheme [78]. In this scheme, inter-scale dependencies are considered through the parent-children relationships existing among the wavelet coefficients. Note that although there are other superior schemes such as space- frequency quantization [94] and WSFQ [93], since the scope of this chapter is not image coding, we just provide our preliminary results using SPIHT coding algorithm. To take advantage of the SPIHT scanning algorithm for the HWD transform coefficients, a new parent-children relationship should be considered. Suppose that we have an HWD transform with J levels. For the levels J m < j S J , we have the same relationship as the one in the WT, and for the levels 1 S j S J m , for each subband HD, , VD and DD j we can use a similar parent-children relationship as the one considered 1" for the contourlet coefficients [74]. The problem appears when we attempt to define the children of coefficients lying at level J m +1. By applying DFBS to level J m , we almost remove the inter-scale dependencies that existed between wavelet levels J m and J m + 1. Nevertheless, we employ a suboptimal but simple rearrangement algorithm to be able to apply a similar SPIHT scanning algorithm as the one we use for wavelets (see Appendix A3 for detail). Although the described procedure is not optimal, we will show that the low bit-rate SPIHT coding results are rather promising for images with high amount of textures and details, where we could capture more details in the HWD coded images when compared with the wavelet coder. In our coding simulation, we used the image Barbara and an image composed of 16 textures [9]. For both images we used HWD-F with 5 wavelet 80 -— ~WaleletCoder PSNR 888888888 $5 4 £3 -2 T o LOGZR (538 Figure 31. Coding performance of the wavelet and HWD coders using SPIHT algorithm in terms of PSNR versus the rate for the Barbara image. levels and J m = 2, where for the Barbara image we used 1, =12 = 3 directional levels and 1] =12 = 4 directional levels for the Texture16 image. The number of directional levels is handy-optimized. In Figure 31 we show the RD curves of the wavelet and proposed coding schemes for the Barbara image. As seen, our method provides better or comparable result for a wide range of low bitrate when compared with the wavelet SPIHT coder. Remarkably, unlike the NLA performance of HWD in comparison to that of wavelets, the coding performance does not show significant improvement, which indicates a need for more sophisticated algorithms that we would address later. Figure 32 shows visual coding results of the Barbara image at 0.25 bpp and the results for Texture16 at 0.1 bpp are depicted in Figure 33. As can be seen from the figures, more directional features are retained when using the HWD transform (for example table cover and chair in Figure 32). Further, we have improved PSNR values compared with those of the wavelet coder. 81 Wavelets/SPIHT; PSNR = 27.45 HWD-F/SPIHT; PSNR = 27.83 Figure 32. Coding results of the Barbara image at rate 0.25 bpp. C. Image Denoising Image denoising is another application of the HWD transforms. We tested the proposed transforms for denoising of noisy images corrupted with additive white Gaussian noise. For the first part of simulation, we used a simple hard-thresholding rule to shrink the transform coefficients. This way we can observe to what extent the transform is efficient without the use of more complex shrinkage schemes. The threshold is selected as 30' [60] where 0' is the standard deviation of the input noise and is estimated using robust median estimator [32]. We also mirror-extended the images to remedy boundary artifacts. Although the sizes of the noisy images are rather large, the PSNR values of the denoising results change slightly (usually up to 3.0.1 dB) when we use a different noise instance. Hence, to obtain more accurate PSNR values, we repeated each denoising experiment ten times and found the average PSNR values. We also clipped the noisy images to set the pixel values in the 82 ‘1 ~ ,.. 111431 1‘1,“ . Wavelets/SPIHT; PSNR = 18.07 Figure 33. Coding results of the Texturel6 image at rate 0.1 bpp. allowable range of 0 to 255. Since the HWD transforms are shifi variant, they introduce many artifacts in the denoising results. Therefore, we also constructed translation-invariant HWD (TIHWD) transforms by removing subsampling operations to improve the results. A delicate point in developing the TIHWD schemes, is that we should not change the frequency partitioning of the HWD transforms (see Figure 21). As a result, we first upsample the DFB filters at level j (1 S j S J m ) by Mj , where M = diag(2,2) and then remove the sampling operations using the generalized algorithme a trous introduced in [36], [44]. In addition to the proposed methods, we also employed the wavelet transform (WT), 83 Table5 PSNR Values of the Denoising Experiments Left Part: Different Transforms With Hard Thresholmgight Part: Different Denoising Methods ‘ Nois CT CuTFW DTCWT TICT DTCWT STICT BLS- TIHWD [magi WT QB] [11] HWD TIWT [53] [36] (BS)[801tBS)[36]GSM[73l (BS) 10 28.13 29.86 29.78 29.71 30.07 32.58 32.66 33.49 33.70 33.50 34.42 34.40 34.46 20 22.15 25.80 26.31 25.89 26.58 28.26 28.93 29.53 30.01 29.78 30.59 30.60 30.78 40 16.38 22.44 22.94 23.67 23.16 24.53 25.02 25.70 26.33 26.35 26.77 26.70 27.15 60 13.30 20.99 21.04 22.60 21.21 22.82 23.00 23.57 23.96 24.28 24.48 24.41 24.84 10 28.14 30.76 30.30 31.44 30.86 33.32 33.24 33.59 33.70 34.02 34.35 34.69 34.50 20 22.18 27.21 26.88 28.66 27.29 29.75 29.80 30.03 30.13 30.63 30.73 31.14 30.96 40 16.42 23.83 23.55 25.82 23.82 26.32 26.20 26.49 26.66 27.36 27.15 27.65 27.48 60 13.37 21.94 21.53 23.96 21.77 24.17 23.98 24.26 24.42 25.25 24.87 25.43 25.24 10 28.13 29.97 29.72 30.86 30.02 32.07 31.98 32.10 32.36 32.82 32.98 33.27 33.25 20 22.17 26.98 26.84 28.43 27.05 29.17 29.16 29.16 29.44 29.97 29.93 30.31 30.22 40 16.41 23.93 23.82 25.87 24.00 26.32 26.02 26.35 26.63 27.26 27.01 27.49 27.37 60 13.34 22.09 21.88 24.22 22.08 24.40 23.94 24.57 24.71 25.40 25.05 25.54 25.44 10 28.13 32.10 31.58 33.46 32.08 34.53 34.71 34.89 34.73 35.30 35.44 35.60 35.39 20 22.13 28.58 28.16 30.59 28.59 31.35 31.45 31.75 31.60 32.35 32.30 32.63 32.40 40 16.35 24.97 24.64 27.27 24.92 27.83 27.67 28.22 28.17 29.24 28.82 29.39 29.15 60 13.27 22.94 22.42 25.29 22.73 25.50 25.25 25.82 25.80 26.96 26.38 27.03 26.78 10 28.25 31.83 31.04 32.54 31.69 33.91 33.99 34.00 33.95 34.27 34.43 34.57 34.44 20 22.32 28.49 27.89 29.67 28.43 30.97 30.91 31.15 31.18 31.49 31.60 31.92 31.72 40 16.59 24.47 24.09 26.20 24.46 27.06 26.79 27.26 27.44 28.00 27.77 28.21 28.14 60 13.53 22.00 21.64 23.91 21.91 24.38 24.04 24.56 24.71 25.41 25.04 25.42 25.44 Image 0 [1‘le owqwg smog lllHP/OD vua'] sxaddad contourlet transform (CT) [28], and adaptive wiener filter using “wiener2” function in Matlab. Moreover, we used the curvelet transform via frequency wrapping (CuTFW) [11] (where we found that it gives better denoising results than curvelets via unequispaced FFT approach), translation-invariant WT (TIWT) [22], dual-tree complex ‘ wavelet transform (DTCWT) [53], and translation-invariant CT (TICT) (see Chapter 2) using hard thresholding for the sake of comparison. Except for the Barbara image that we used HWD-F with [1:12 =3, for the other images we used HWD-H with 11:12 =2. Similar to the NLA experiment, for contourlets we used {If}1S 155 = {5,4,4,3,3} directional levels. The left part of Table 5 shows the PSNR values of the denoising results for different images and noise levels using hard thresholding. As seen, the HWD transform yields in 84 TIWT; PSNR = 30.97 DTCWT; PSNR = 30.91 TIHWD; PSNR = 31.18 Figure 34. Denoising results ofthe Peppers image when 0' = 20 . better PSNR values than the CT. Moreover, for the Barbara image it achieves superior results when compared with the WT. In the case of translation-invariant (TI) denoising, we see that the proposed TIHWD denoising scheme almost always provides better results (improvements up to 1.80 dB) when compared with the TIWT and DTCWT schemes. Moreover, it outperforms curvelet (CuTFW) denoising scheme. As an example of the visual results for this part of denoising, in Figure 34 we show the T1 denoising results of the Peppers image when 0' = 20. We see that the TIHWD scheme provides less visible artifacts in the denoised image. In the second part of denoising experiment, we took advantage of the bivariate shrinkage (BS) scheme with local variance estimation [80] for TIHWD, where we also used this approach for semi-translation invariant contourlet transform (STICT) described in Chapter 2. For TIHWD (BS) we used a window size equal to (17, 17) for estimation of local variance whereas we used a window with size (5, 5) for STICT (BS). We also compared our method to some other leading denoising approaches: DTCWT (BS) [80] and BLS-GSM (Bayes least squares using Gaussian scales mixtures) [73]. The right part of Table 5 shows PSNR values resulting from the above methods. We see that for the image Barbara, our TIHWD (BS) denoising scheme provides better 85 i 7% ’ ,2’1‘111 “/7112 BLS-GSM [73]; PSNR = 30.60 V "111 ,0, 111 (ll/ll l STICT (BS) [44]; PSNR = 30.59 DTCWT (BS) [80]; PSNR = 29.78 Figure 35. Denoising results of the Barbara image when 0' = 40 . results whereas for other images it shows comparable performance (within 0.25 dB). Our results are also comparable to those reported in [23] for NSCT—LAS (nonsubsampled contourlet transform using local adaptive shrinkage). Figure 35 demonstrates a visual example for this part for the Barbara image and noise level of 0'=40. It clearly shows the superior performance of the TIHWD in retaining details along with introducing fewer (or comparable) artifacts in the result. 86 3.6 CONCLUSION We proposed a new family of nonredundant geometrical image transforms by employing wavelets and directional filter banks. We showed that to avoid artifacts introduced during nonlinear approximation (and thus coding and denoising), we should change the wavelet basis functions in only a few finest wavelet scales. This way we take advantage of both directional and nondirectional basis functions to efficiently represent natural images. The proposed family benefit from a number of essential characteristics. They are nonredundant and at the same time provide promising nonlinear approximation behavior for natural images especially those having a significant amount of periodic texture. Consequently, they have potential for image coding. In the experiments, we employed the proposed transform family in nonlinear approximation, image coding, and image denoising and demonstrated their efficiency in these applications. 87 Chapter 4 ON THE LINEAR COMBINATION OF DE NOISING SCHEMES 4.1 INTRODUCTION As have been highlighted in previous chapters of this thesis, wavelets have proven their capability in removing noise from a piece-wise smooth signal [60]. Under wavelet denoising with hard thresholding, one simply sets to zero the transform coefficients of the noisy signal, which are below a threshold, and reconstructs the resulting coefficients to obtain the denoised signal. Recently, several new image transform schemes have been introduced, where most of them take advantage of the important feature of directionality. Following a similar procedure to the wavelet denoising scheme, one can employ other transforms for denoising [36], [44]. Owing to the characteristics of a transform, a transform-domain denoising scheme introduces some artifacts in the denoising results that are different from those of other schemes. Furthermore, each denoising scheme may have some advantages over the others 88 and also some drawbacks. As a consequence, combination of different schemes would be a solution to reduce artifacts and to compensate for the drawbacks. Taking advantage of the same idea, the authors in [22] proposed translation-invariant (TI) wavelet denoising, where it is, in effect, equivalent to the average of all denoised images resulting from the cycle-spinning algorithm. In [36] we have shown that the pseudo-Gibbs phenomena artifacts that usually appear in the denoising results when we use the contourlet transform [28] denoising scheme, can be significantly reduced. In this chapter, however, we use a different strategy that is based on a linear optimization approach in conjunction with employing different denoising schemes, where we extend our preliminary work [41]. The remainder of the chapter outline is as follows. In the next section we briefly highlight some of the related work. Section 4.3 provides problem formulation for our proposed linear combination method. The experimental results are provided in Section 4.4 and finally Section 4.5 concludes the chapter. 4.2 RELATED WORK In a general framework, where one wishes to find an optimal representation over a dictionary of bases functions, a few algorithms have been proposed [51]. Suppose that D: {3, ,... B M } is the dictionary of the bases functions, where B, is the matrix corresponding to the ith basis and we want to decompose a signal x (given in a row vector of length N ) using the dictionary D. Therefore, we have M x : ZizlaiBi ’ where {a,}ls,sM are the arrays of coefficients and a, = (a,,,...,a,N). The method of frames [25], provides a with a minimized 6 2 norm. Basis pursuit [18] is another 89 algorithm, which optimizes a subject to 61 norm; and hence, results in a linear programming approach. Matching pursuit [63] attempts to find a best basis in ’D utilizing a greedy algorithm. It sequentially adds elements from D that are most correlated with the residual. The above methods, however, are computationally expensive. Total variation is a technique employed to denoising [77] and later was combined with traditional wavelet decompositions to reduce artifacts in wavelet denoising [16], [33]. Using the same idea, the authors in [15] applied the curvelet transform in conjunction with total variation to improve the denoising results of curvelets. Starck et al. [87] proposed an algorithm to combine several transforms, where they used an iterative approach to minimize an 6 1 norm instead of total variation norm. Our approach, however, is based on a linear combination of the denoising results and thus is not complex. Meanwhile, the proposed approach provides improvement in the PSNR values and, more importantly, in visual quality due to significant reduction of the artifacts. Here we assume that the image is corrupted with an additive white Gaussian noise. 4.3 OPTIMAL LINEAR COMBINATION A. Optimal LMS Approach We wish to find an analytical solution to the following problem. Suppose that an image denoted by a row vector 5? 6 RM, with size N is corrupted with an additive zero-mean white Gaussian noise 2 =11 + v. Note that we use boldface for random variables and random vectors and a tilde sign for a variable that does not have zero mean. 90 Now we employ M different denoising schemes to the noisy image to obtain the denoised signals {91,y,,...,y,,}, where 5', ERIXN. Assuming that none of the denoised signals is an exact copy of the original signal, which is usually the case because of the artifacts introduced by the denoising schemes and also because of the remaining noise in the denoising results, we wish to find a better estimation )2 of the image J? using the denoised images {y, , y, ,. . ., 52M }. In this work, we are only interested in the linear estimation approach. To obtain more compact equations, we centralize the random ~ ~ , .. N .. vectors x and {y,},S,-3M around then means m,E = mean(x) = 1/ N Zn=lx[n] and .. N .. my, 2 mean(y,)=l/Nzn=1y,[n] as where lMxN is a matrix of size M X N with all entries equal to one. Using the denoising results, we are interested in a linear estimation of x , ,. M x : 21:1 k1 yi ’ and therefore a vector of coefficients K = (k1,...,kM) in order to minimize the variance of error e = x —— it (note that the variance of error is a scalar): K0 = arg min {Heuz} = arg min{E[x — it][x — if}, K K where (.)T denotes the transpose operation. This is the well-known problem of optimal linear estimation in a least-mean-squares (LMS) sense. Note that we define the inner product of two random vectors as the expectation: 91 r 2 T 2 (x,y) = E[xy ]= CW, and ”x" = (x,x) = E[xx ]= 0,. Hence, we can express the above equations in terms of inner products where it is straightforward to switch between deterministic and stochastic cases. Defining the matrix ofthe denoising results as y = (y{,. . .,yL )T e RMXN , we can find the optimal LMS estimator coefficients K 0 as [52] K =C C‘1 (4.2) WY, where C.=E1yy71= (EIy.y,-1)..,, M e814”, =il,.. .,M is the covariance matrix of y and C..=E1xy’1=(E1xyl] E[xyhi) est“, denotes the cross-covariance matrix. Therefore, using the computed coefficients in (4.2), we find the optimal LMS estimation as fro = 0y = nyC;'y. (4.3) Further, the minimum mean-square error is found to be [52] “e H — _§a — ny C; ‘,C It follows from (4.1) that the Optimal estimated image is ) i0 =mf11xN +Ko(y—mj3) =millxN +nyC;l(y-my)a _ _ T T MxN where C,—C,-,,ny—C~-,, andmy =(mT mM) ER . }’l’ 92 While the above LMS method provides optimal coefficients to combine the denoising results, its dependence on the statistical properties of the original image 3? makes this approach unattainable in practice. Nevertheless, we use this oracle approach for the sake of comparison. Note that in the above formulation, we used only one coefficient k,- to scale all pixels in a denoising result y,. The reason for this is to facilitate developing a non-oracle method to find a set of near optimal coefficients. B. LMS Approach using the Noisy Signal Since finding the optimal coefficients K 0 as given in (4.2) is subject to having the original signal, we need to develop other approaches that do not rely on having knowledge of the original signal. One approach is to use the noisy image instead of the original. (Note that the noisy signal is the maximum likelihood estimation of the original signal [7].) Therefore, the suboptimal estimated (removed-mean) image 2,“, , and the corresponding coefficient vector K are expressed as '10 ’ A _ _ —l xno —K,,0y and Km — C2,,Cy . (4.4) Note that, since 2 = i — millxN and thus 2 = x + V, we have 0',2 = 0'3 + 0'3, C —C +C 6118”“ d] C -—C +C 6118““ 6 t1 2,, - x), v), , an a so ,2 — y, y, . onsequen y, we can calculate the covariance of the error (associated with the noisy signal) as 2 _ 2 2 —1 -1 ||e,,|| _ "e0” + a, — C,,C, (C, + C,,) - CWC, C,,. (4.5) Since the denoising schemes usually take advantage of nonlinear approaches to 93 remove noise, it is not trivial to find statistical measures such as C ,3, and va ; hence, we attempt to use approximate values. If we assume that y, = x (1 Si SM), we can express (4.5) as 2 —”e0“2 :03 _22:1C"}’i _C C vy yv ~ 2- Z” -Z'" 2 _ 0V 2 i=1 CW1“ l=1vai In addition, it is reasonable to assume that there is low correlation between the noise eno (4.6) 2 41.0112 20. It e v and the denoised signals y. Meanwhile, we always have ,0 follows that the variance of the noise is the dominant term in (4.6), and hence, for a rather good estimation of the signal using the proposed approach given in (4.4), we should ensure that the power of the input noise is low. C. Approximation to the Optimal LMS Approach Now we consider another method to this linear estimation problem. Here we suppose that the denoised signals are corrupted versions of the original signal with additive distortions expressed as y, = i + 11, (1 S i S M ) or equivalently after removing the mean values we have y, =x+qi (lSiSM), where I], E RIXN is the distortion vector due to the artifacts and remaining noise in the denoised signal y ,. In particular, we assume that the denoised signal is linearly related to the original signal. We also assume that 1|, (lSi SM), is uncorrelated with x. Denoting 1| =(111T,...,1]L)T e RMXN , we can write y = 1Mx1 x + I]. The linear LMS 94 solution [52] to this problem is found to be 0 0 it = K y, with Ko =(||x||’2 +11... C,;‘1M,,)_l1,xMC,;‘. >> 1, we obtain the Fisher estimation of x as By assuming ”x A —- _l — xF = (11XM Cr; 1 lMxl) llxM Cr] 13' , (4-7) and lleFl|=(llxMCI;-11Mxl)‘l' Note that although there is no explicit dependence on the signal x in (4.7), we have no prior knowledge of C”. Meanwhile, it is important to note that when deploying different denoising schemes, one would target a variety of such schemes that generate different types of artifacts. Ideally, the resulting artifacts from the different denoising schemes are uncorrelated. Therefore, under such scenario, we can assume that C” as I , which results in the estimation 1 1 A M x =——l =— , ., 4.8 .. MtxMy M24. ( ) = 1/ M . That is, 2,, is Obtained through averaging the denoised images (where 2 ea and we assumed that m, =mean(m,-,,)). Note that when C” =1, 32,, is the optimal unbiased LMS estimation of x. As mentioned above, for the assumption C” =1 to be viable, one should have E[n,nf]=0 (lSi,jSM and i¢j) and E[tyityiT]zl (lSiSM), which is the case (to some extent) when one employs different denoising schemes having comparable 95 performance and producing different types of artifacts. 4.4 EXPERIMENTAL RESULTS A. A Brief Note on the Denoising Artifacts Here we briefly outline the type of artifacts introduced by some different image denoising schemes to better appreciate the benefit of combining denoised image signals. Particularly, we consider denoising under the wavelet transform, contourlet transform, and adaptive Wiener filter [58]. Notice that while there exist other approaches, we have selected the above methods due to their different characteristics where each of them can be representing a different class of image denoising schemes. Remarkably, when some statistical approaches could improve the above methods, their intrinsic features that result in denoising artifacts still remain and cannot be totally eliminated. l) Wavelet Denoising Scheme: The wavelet transform has shown its capability for denoising piece-wise smooth images [60]. Wavelets, indeed, provide unconditional bases of 62 and also of many smoothness spaces [30]. As a result, wavelet shrinkage is a smoothing operation for a wide variety of signal classes. Wavelet shrinkage in comparison to other older methods such as convolutional smoothers and Fourier-domain damping is much simpler, and offers many broad near-optimality properties not achievable by the older methods [30]. An important problem that arises in a transform-domain denoising is the artifacts introduced when one thresholds the transform coefficients. The ringing artifacts are in fact due to pseudo-Gibbs phenomena, which occur near edges and discontinuities and resemble the basis functions of the transform. Figure 36(a) shows some of the basis functions of the wavelet transform. Note that this transform is efficient in capturing point- 96 (C) Figure 36. (a) Some basis functions of wavelets. (b) The denoising result of the Barbara image using wavelets (0', = 20). (c) Some basis functions of contourlets. (d) The denoising result of the Barbara image using contourlets (0', = 20). wise singularities and the basis functions are like points. Therefore, the artifacts in a denoised image will look like the basis functions as shown in Figure 36(b), which depicts an example, where the standard deviation of the input noise is 20. Notably, this denoising scheme is incapable of capturing some textures and fine details and therefore, yielding another kind of distortion. 2) Contourlet Denoising Scheme: The contourlet transform, one of the geometrical image transform, is introduced to better capture directional features of an image [28]. Owing to the directionality of this transform, the basis functions are in the form of needle-shaped segments, which can be oriented in different directions as Figure 36(c) shows. Consequently, the denoising artifacts will look like arbitrarily-oriented needle-shaped 97 segments that are more visible in the smooth regions (see Figure 36(d)). To have sufficient directional resolution, one has to use directional filters with large support. As a result, the basis functions and thus the artifacts appear in the form of long segments, which severely degrade the quality of the denoised image. Nevertheless, the contourlet denoising scheme outperforms the wavelet approach in retaining textures and fine details in the denoising results (compare Figure 36(b) and (d)). 3) Adaptive Wiener Filtering: The Wiener filtering is a traditional denoising method, which usually leads to a lowpass filtration. This approach would be the optimal linear minimum-mean-square error estimate of the signal if x[n] and v[n] are samples of stationary random processes that are linearly independent from each other and their power spectral densities (PSD) are known [58]. Practically, however, the above assumptions do not hold. To improve the performance of this scheme, we can use adaptive filtering, where one locally estimates the PSDS of the signal and noise and 9) estimates the signal. As a result, if mg, and 022(9) denote the local minimum and variance of the noisy image 2 in the window Q , respectively, the local estimate 5.10) is [58] 2(0) 2 A 0' - 0' 3(0) = min) + 2 v (210) _ mu»), 02(0) 2 2 where we assumed that V is a zero-mean white Gaussian noise. Since this approach is a locally filtering task, it introduces artifacts, which resemble speckle noise (see Figure 38). Additionally, blurring textures and edges in the output is another kind of distortion resulting from this approach. As we enlarge the window size, 98 the artifacts reduce, but more edges are smeared in the denoised image. 4) Other Methods: In addition to the above methods, we also employ our proposed linear combining of multi-scheme denoising approach in conjunction with two competitive advanced denoising methods, namely, the BLS-GSM [73] (Bayes least-squares using Gaussian scales mixtures) and TIHWD (translation-invariant hybrid wavelets and directional filter banks) using bivariate shrinkage (BS) (see Section 3.5C). In Section 3.5C we showed that the TIHWD (BS) denoising approach provides sometimes better results in retaining fine textures than other leading approaches such as the BLS-GSM denoising method. However, BLS—GSM renders fewer artifacts in the results. Consequently, we attempt to linearly combine these two methods in order to achieve better results. B. Results As we mentioned earlier, to better take advantage of the proposed linear combination approach when there is no prior information of the original signal, it is advised that the denoising schemes for which we employ the LMS approach, provide comparable performances. As a result, for the first part of our experiments, we used the three denoising techniques, wavelets [60], contourlets [28], [44], and adaptive Wiener (function “wiener2” in Matlab) [58] while employing hard thresholding for the former ones. And in the second part, we used the two leading approaches of BLS-GSM [73] and TIHWD (BS) (see Chapter 3). To test our proposed LMS approach, we selected a variety of images and used additive white Gaussian noise with 0' =10, 20, and 40. We employed all the denoising schemes with the same parameters as those reported in Section 3.5C. We assumed that 0' 99 Table 6 PSNR Values of the Denoised Images Noisy Adaptive WT Image Wiener (HT) CT (HT) LMS_O LMS_N LMS_A 10 28.13 28.31 29.86 30.17 31.36 31.32 31.14 Barbara 20 22.15 26.44 25.80 26.49 27.74 27.43 27.68 40 16.38 23.65 22.44 23.17 24.50 22.88 24.35 10 28.14 30.44 30.76 30.59 32.14 32.11 32.11 Boats 20 22.18 28.36 27.21 27.27 29.18 28.77 29.04 40 16.42 25.09 23.83 24.08 26.06 23.70 25.84 10 28.13 32.66 32.10 32.10 33.78 33.68 33.72 Lena 20 22.13 30.00 28.58 28.91 30.89 30.05 30.72 40 16.35 26.10 24.97 25.62 27.46 23.94 27.33 10 28.25 32.87 31.83 31.53 33.53 33.46 33.31 Peppers 20 22.32 30.09 28.49 28.50 30.84 30.25 30.48 40 16.59 25.79 24.47 24.82 27.04 24.06 26.63 Image 0 WT (HT): ...... wavelet transform with hard thresholding CT (HT): ...... contourlet transform with hard thresholding LMS_O: ........optimal LMS method using the original image (see Section 4.3A) LMS_N: .....LMS method using noisy image (see Section 4.38) LMS_A: ........method of averaging (see Section 4.3C) is unknown and we estimated it using the robust median estimator [32]. Moreover, we mirror-extended the noisy images to avoid boundary distortion. Although the size of the noisy images is rather large, the PSNR values of the denoising results change slightly (usually up to 21:01 dB) when we use a different noise instance. Hence, to obtain more accurate PSNR values, we repeated each denoising experiment ten times using different noise realizations and evaluated the average of PSNR values. We also clipped the noisy images to set the pixel values in the allowable range of 0 to 255. Table 6 provides the PSNR values for the first part. We refer to the optimal LMS method of Section 4.3A as LMS_O, the method of Section 4.33 as LMS_N, and finally the averaging method of Section 4.3C as LMS_A. As seen, the proposed LMS method achieves between one and two dB improvement in most cases. For low noise levels (say 0' = 10), the results where the noisy image is used as the estimate of the original signal in the LMS algorithm (LMS_N) are comparable with those of the LMS_O (optimal LMS 100 Table 7 PSNR Values of the Denoised Images (Advanced Methods) Noisy BLS- TIHWD Image GSM (ES) LMS_O LMS_N LMS_A 10 28.13 34.40 34.46 34.70 34.60 34.70 Barbara 20 22.15 30.60 30.78 30.97 30.86 30.95 40 16.38 26.70 27.15 27.30 27.02 27.15 10 28.25 34.57 34.44 34.62 34.62 34.60 Peppers 20 22.32 31.92 31.72 32.03 31.36 31.94 40 16.59 28.21 28.14 28.73 25.95 28.32 BLS-GSM: ......Bayes least-squares using Gaussian scales mixtures TIHWD (BS): ...translation-invariant HWD with bivariate shrinkage Image 0 with oracle) results. Remarkably, the averaging approach (LMS_A) provides near- optimal PSNR values. The reason is that the selected denoising methods are comparable (and the types of artifacts generated are different/uncorrelated). Note that although the PSNR values achieved using adaptive Wiener method are usually higher than those of the wavelet and contourlet schemes, it introduces more visible artifacts in the results. That is due to the fact that the PSNR measure treats the low-frequency artifacts similar to the high-frequency ones; while, eyes are usually more sensitive to the low-frequency artifacts. Table 7 shows the PSNR results for the second part where we employ the combining approach to the BLS-GSM and TIHWD (BS) denoising schemes. Since these methods provide much fewer artifacts, linear combination cannot improve the PSNR values significantly. Note that again, since these schemes yield close PSNR values, the averaging method yields near-optimal performance. Now we show some visual results. Figure 37 depicts the visual results of the Boats image when 0' =10. We see that the distortions resulting from the denoising schemes are very different as pointed out earlier. The LMS approach shows a significant reduction in the artifacts while the salient advantages of the denoising schemes are preserved. 101 \ 1373271548 ,.‘. I . .1 CT — PSNR = 30.59 LMS_A — PSNR = 32.11 Figure 37. The denoising results of the Boats image when 0' =10. (Note that the visual results for LMS_O and LMS_N are similar to that of LMS_A.) Further, all the LMS methods provide similar results in this case. Another example is demonstrated in Figure 38, where we used BLS-GSM and 102 , . - , .' .‘1u‘\ ‘ 1 I ~.- \“ o n it BLS—GSM PSNR = 26.70 LMS_A — PSNR = 2715 Figure 38. The denoising results of the Barbara image when 0 = 40 . TIHWD (BS) denoising schemes with 0' = 40. As seen, using the averaging approach takes advantage of desirable features of both methods, that is, producing fewer artifacts in the BLS-GSM approach and better retaining of fine textures in the TIHWD denoising scheme. 4.5 CONCLUSION In this chapter we proposed a method based on the least-mean-squares approach to linearly combine different denoising schemes in an optimal sense. We found the proposed scheme to be efficient in improving denoising results through significant reduction in the artifacts and hence an increase in the PSNR values. We also proposed 103 non-oracle methods of averaging and LMS estimation with the noisy image as special cases of linear combining, where we showed that we can achieve near-optimal results for comparable denoising schemes. 104 Chapter 5 FUTURE WORK Using directional filter banks, we added more directionality to the wavelet transform and proposed the new transform family using Hybrid Wavelets and Directional Filter Banks (HWD). Then we employed the proposed scheme to several image processing applications. Our preliminary results indicate the potential of the proposed HWD family for image processing. We can utilize this family in several directions explained in the following sections. 5.1 IMAGE CODING For image coding, we plan to investigate more sophisticated approaches that take care of dependencies among the HWD coefficients. For this purpose, finding appropriate statistical models is necessary. While we can benefit from a nonadaptive and thus less-complex coding scheme using HWD family, it is possible to achieve more efficiency through applying the proposed family in an adaptive framework similar to wavelet packets. In this direction, we used an early version of the HWD family in [46]. 105 5.2 IMAGE DENOISING While the denoising results of the translation-invariant HWD (TIHWD) are quite satisfactory (see Section 3.5C), there is still room to further improve the performance of the TIHWD denoising scheme by developing more suitable shrinkage schemes for this family. Furthermore, note that in all of the scenarios for image denoising in the thesis, we considered white Gaussian additive noise for greylevel images. Under a possible extension of our denoising approaches, we can consider nonlinear noise, non-Gaussian noise, and color images. 5.3 IMAGE WATERMARKING Another application that the proposed family could be useful for is image watermarking. Regarding the fact that the significant HWD transform coefficients better represent edges and fine details when compared with wavelets, one could easily embed a watermark in these image components. Since the details correspond to perceptually most significant image coefficients, we expect to have a robust watermarking scheme [34], [37]. 5.4 OTHER POSSIBLE AREAS FOR HWD One can also utilize the proposed family in other applications. For example, feature extraction iS a possibility for pattern recognition. Another example is 3-D image processing where the need to extend and develop a 3-D HWD transform family is inevitable. 106 APPENDICES A.l Proof of Remark 2.1 (see Section 2.3) Let us denote the output of channel i for each shift R, as: S,c[n], for 0Si@—dLl(—> YN‘1(Z ——>@-—+z—de —-1 TM z-kl . Figure 39. The polyphase domain scheme of the multi-channel filter bank given in Figure 5. signal as X, (z) = AT(z")G,,(zM)H,,(zM)x,,(zM). We denote T73 (ZM ) = Gp(zM )Hp(zM) as the transfer fimction of the FB in the polyphase domain. It is clear that for perfect reconstruction we should have T7, (2M ) = I 4M , which leads to X, (z) = AT (2‘1 )XP (1M) = X(z). Now, we omit the subsampling operations, leading to a non-subsampled FB as depicted in Figure 6. Note that before doing this, it is better to transfer the downsampling (upsampling) operations to the right (lefi) side of the analysis (synthesis) filters. As shown below, this is advantageous since when we omit the subsampling operations, the transform coefficients will be the same as those in Figure 5 before downsampling, i.e., W(Z) = (W0 (2),. . . , W,,,_l (z))T. The forward transform is expressed as W(z) = X(z)H,, (zM )A(z). On the other hand, X, (z) = AT(z")G,,(zM )W(z). 109 As a result, Xt(z)=dMX(z)- It follows that we need a scaling factor to ensure perfect reconstruction as a = 1/ d M . Since the resulting TI scheme is non-subsampled, the redundancy of this scheme equals the number of channels, N. Finally, since the output of the channel c, (O S c S d M — 1) of synthesis bank is X56) (2) = szz-kCX(z) = X(z), we have (1 -1 X.(z)=(1/d..131 X.‘C’