This is to certify that the dissertation entitled MODEL BASED IMAGE FUSION presented by MRITYUNJAY KUMAR has been accepted towards fulfillment of the requirements for the Ph.D. degree in Electrical Engineering.

Major Professor's Signature

Date

MODEL BASED IMAGE FUSION

By

Mrityunjay Kumar

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Electrical Engineering

2008

ABSTRACT

MODEL BASED IMAGE FUSION

By

Mrityunjay Kumar

Data fusion integrates redundant as well as complementary information present in input signals in such a manner that the fused signal describes the true source better than any of the individual signals. The exploitation of redundant information improves accuracy and reliability, whereas the integration of complementary information improves the interpretability of the signal. Data fusion plays a crucial role in image-based applications such as computer vision, robotics, and remote sensing. The goal of such fusion is to extract information from input images such that the fused image provides better information for human or machine perception than any of the input images.

The goal of the research presented in this dissertation is to develop model based image fusion algorithms. The primary advantage of the model based fusion approach is that it allows identification of the complementary and redundant information present in the input images in a systematic manner. In this thesis, we present three approaches to pixel level image fusion, based on multichannel deconvolution methods, weighted average methods, and local affine models. The proposed multichannel fusion algorithm fuses blurred images obtained from different sensors such that the fused image does not look blurred. The algorithm assumes a linear space-invariant model for each of the measurements. However, it does not require knowledge of the point spread function (PSF) of the imaging sensors and can be used in several areas of image processing such as digital camera imaging, nondestructive evaluation, etc. This framework has been extended to perform superresolution, where multiple low resolution images of the same scene are used to generate a higher resolution image of the scene. In a more general setting, the degradation in an image may be due to a nonlinear or a local process. In such cases, weighted average or local affine models are more appropriate. A Rayleigh quotient approach for fusing multisensor images using the weighted average model is proposed and contrasted with other existing techniques for pixel level fusion. Finally, the total variation (TV) norm is used to solve the fusion problem when a local affine model is used.
Results on a range of datasets demonstrate the feasibility and performance of the proposed algorithms.

To my family

ACKNOWLEDGMENTS

I would like to thank my advisor, Dr. Pradeep Ramuhalli, for his direction and encouragement. I am greatly indebted to him for his advice and support. I wish to thank the other members of my thesis committee, Dr. Satish Udpa, Dr. Lalita Udpa and Dr. Sarat Dass. During these years, I have known them as very inspiring and enthusiastic people who have made a deep impression on me. I have been fortunate to have them on my committee. I would particularly like to express my gratefulness to Dr. Satish Udpa and Dr. Lalita Udpa for showing faith in me and supporting me when I really needed it. I cannot thank them enough. I am also indebted to Dr. Dass for his encouragement and kindness in answering my questions. The numerous fruitful discussions we had helped me sort out the technical details of Chapter 5 of this dissertation.

My internship at Eastman Kodak during the summer of 2007 was a unique experience. It inspired the series of mathematical models and experiments described in this dissertation. I am grateful to my mentor at Eastman Kodak, Mr. Rodney Miller, for his encouragement and advice during my internship and thereafter.

I thank my family members for their love and support. Last, but not least, I would like to express my heartfelt gratitude to my wife Sonali, who has been a constant source of love, support, patience, encouragement and strength all these years.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

1 Introduction
1.1 Overview
1.2 Image Fusion Issues
1.3 Research Contribution
1.4 Organization of the Report

2 Literature Review
2.1 Data Fusion Models
2.2 Evaluation of Fused Image
2.3 Prior Work
2.3.1 Image Fusion using Multichannel Deconvolution
2.3.2 An Optimal Weighted Average Pixel-Level Fusion
2.3.3 Local Affine Models for Pixel-Level Fusion
2.3.4 Super-resolution

3 Image Fusion Using Multichannel Deconvolution
3.1 Richardson-Lucy Deconvolution Algorithm
3.2 Proposed Approach
3.3 Results
3.4 Discussion
3.5 Superresolution
3.6 Superresolution Results

4 An Optimal Weighted Average Pixel-Level Fusion
4.1 Proposed Approach
4.2 Overall Approach
4.3 Noise Analysis
4.4 Results
4.5 Discussion

5 A Total Variation Based Algorithm for Pixel Level Image Fusion
5.1 The Image Acquisition Model
5.2 Image Fusion
5.2.1 Total Variation Norm for Image Fusion
5.2.2 Estimation of β
5.3 Overall Approach
5.4 Results
5.5 Discussion

6 Conclusions And Future Research
6.1 Conclusions
6.2 Future Research
BIBLIOGRAPHY

LIST OF TABLES

3.1 Performance summary of the multichannel deconvolution algorithm on benchmark images
4.1 Performance summary of the weighted average pixel level fusion algorithm
5.1 Performance summary of the total variation based pixel level fusion algorithm

LIST OF FIGURES

1.1 Image fusion example (a) LLTV Image (b) FLIR Image (c) Fused Image
1.2 Single Input Multiple Output (SIMO) Model
1.3 Super-resolution Example: (a) LR Image 1 (256x256) (b) LR Image 2 (256x256) (c) LR Image 3 (256x256) (d) SR Image (512x512)
2.1 Single Sensor Image Acquisition Process
2.2 Multiple Sensors Image Acquisition Process
2.3 Observation Model for Super-resolution
3.1 Multichannel Image Acquisition Model
3.2 Serial multistage model for blind deconvolution
3.3 Flowchart of the multichannel blind deconvolution algorithm
3.4 Cameraman Image: (a) Original (b) Channel 1: Blurred by motion blur of length 5 and angle 13 (c) Channel 2: Blurred by motion blur of length 6 and angle 120 (d) Channel 3: Blurred by motion blur of length 6 in vertical direction
3.5 Deconvolution of Cameraman Image: (a) Channel 1 only (b) Channel 2 only (c) Channel 3 only (d) Proposed approach
3.6 Barbara Image: (a) Original (b) Channel 1: Blurred by motion blur of length 9 and angle 30 (c) Channel 2: Blurred by motion blur of length 9 and angle 60 (d) Channel 3: Blurred by motion blur of length 11 in horizontal direction
3.7 Deconvolution of Barbara Image: (a) Channel 1 only (b) Channel 2 only (c) Channel 3 only (d) Proposed approach
3.8 Noise Analysis (Additive white Gaussian Noise, mean = 0, SNR = 19 dB): Cameraman Image (a) Channel 1 (b) Channel 2 (c) Channel 3 (d) Fused Image: Using RL without damping (e) Fused Image: Using damped RL
3.9 Plot of SNR vs. UIQI
3.10 Convergence Analysis (a) Fused Cameraman Image (b) Fused Barbara Image
3.11 Overall Approach for Super-resolution
3.12 LR frames for Cameraman image: (a) Original Image (256x256) (b) LR Image 1 (128x128): Blurred by motion blur of length 7 and angle 60 (c) LR Image 2 (128x128): Blurred by motion blur of length 7 and angle 90 (d) LR Image 3 (128x128): Blurred by motion blur of length 9 and angle 0 (e) LR Image 4 (128x128): Blurred by motion blur of length 9 and angle 135
3.13 SR results for Cameraman image (256x256) (a) Using one input image shown in Fig. 3.12b (b) Using two input images shown in Fig. 3.12b and 3.12c (c) Using three input images shown in Fig. 3.12b-3.12d (d) Using all four input images
3.14 LR frames for Barbara image: (a) Original Image (512x512) (b) LR Image 1 (256x256): Blurred by motion blur of length 9 and angle 60 (c) LR Image 2 (256x256): Blurred by motion blur of length 9 and angle 90 (d) LR Image 3 (256x256): Blurred by motion blur of length 11 and angle 0 (e) LR Image 4 (256x256): Blurred by motion blur of length 11 and angle 135 (f) SR Image (512x512)
3.15 Performance in presence of noise, SNR = 18 dB: (a) Input 1 (b) Input 2 (c) Input 3 (d) Input 4 (e) SR Image
3.16 Plot of SNR vs. UIQI for "barbara" image
4.1 Flowchart of the Proposed Weighted Average Pixel Level Image Fusion
4.2 Medical Images (a) MRI Image (b) CT Image (c) Fused Image: Proposed Approach (d) Fused Image: "Max" rule
4.3 Aircraft Navigation Images (a) LLTV Image (b) FLIR Image (c) Fused Image: Proposed Approach (d) Fused Image: "Max" rule
4.4 Multifocus Images (a) Focus on the front part (b) Focus on the back part (c) Fused Image: Proposed Approach (d) Fused Image: "Max" rule
4.5 Medical Images (SNR = 15 dB) (a) MRI Image (b) CT Image (c) Fused Image: Proposed Approach (d) Fused Image: "Max" rule
4.6 Aircraft Navigation Images (SNR = 11 dB) (a) LLTV Image (b) FLIR Image (c) Fused Image: Proposed Approach (d) Fused Image: "Max" rule
4.7 Multifocus Images (SNR = 16 dB) (a) Focus on the front part (b) Focus on the back part (c) Fused Image: Proposed Approach (d) Fused Image: "Max" rule
4.8 Plot of SNR vs. Similarity Quality Index
5.1 Comparison of least square and TV norm based fusion (a) LLTV Image (b) FLIR Image (c) Fused Image: Least Square (d) Fused Image: Proposed Approach (TV Norm)
5.2 Flowchart of the Proposed TV Norm Based Pixel Level Image Fusion
5.3 Medical Images: (a) CT Image (b) MRI Image
5.4 Medical Images, SNR = 23 dB: (a) CT Image (b) MRI Image (c) Fused Image: Least Square (d) Fused Image: Proposed Approach
5.5 Medical Images, SNR = 12 dB: (a) CT Image (b) MRI Image (c) Fused Image: Least Square (d) Fused Image: Proposed Approach
5.6 Medical Images, SNR = 0 dB: (a) CT Image (b) MRI Image (c) Fused Image: Least Square (d) Fused Image: Proposed Approach
5.7 Aircraft Navigation Images: (a) LLTV Image (b) FLIR Image
5.8 Aircraft Navigation Images, SNR = 23 dB: (a) LLTV Image (b) FLIR Image (c) Fused Image: Least Square (d) Fused Image: Proposed Approach
5.9 Aircraft Navigation Images, SNR = 8 dB: (a) LLTV Image (b) FLIR Image (c) Fused Image: Least Square (d) Fused Image: Proposed Approach
5.10 Aircraft Navigation Images, SNR = 0 dB: (a) LLTV Image (b) FLIR Image (c) Fused Image: Least Square (d) Fused Image: Proposed Approach
5.11 Plot of SNR vs. Similarity Quality Index (a) CT, MRI dataset (b) LLTV, FLIR dataset

CHAPTER 1

Introduction

1.1 Overview

Data fusion integrates redundant as well as complementary information present in input signals in such a manner that the fused signal describes the true source better than any of the individual signals. The exploitation of redundant information improves accuracy and reliability, whereas integration of complementary information improves the interpretability of the signal [1-5]. In a typical data fusion application, multiple sensors observe a common source and a decision is taken based on the collective information. For example, radar provides the ability to accurately determine an aircraft's range, but has a limited ability to determine the angular extent of the aircraft. By contrast, an infrared imaging sensor can accurately determine the angular extent of an aircraft, but is imprecise in measuring range.
If these two observations are correctly associated, then the combination of data from the two sensors provides a better determination of location than could be obtained by either of the two independent sensors [2]. Another application, reconstruction of a high-resolution image, attempts to obtain a high-resolution image from multiple low-resolution images [6].

The concept of data fusion is not new. It is naturally performed by living beings to achieve a more accurate assessment of the surrounding environment and identification of threats, thereby improving their chances of survival. For example, the quality of an edible object can be perceived more accurately by using a combination of sight, touch, smell and taste. The advent of new sensors, advanced processing techniques, and improved processing hardware has provided the ability to emulate, in hardware and software, the natural data fusion capabilities of humans and animals [2].

Data fusion plays a crucial role in image-based applications such as computer vision, robotics, and remote sensing. The goal of image fusion is to extract information from input images such that the fused image provides better information for human or machine perception as compared to any of the input images [7-10]. An example of image fusion is demonstrated in Fig. 1.1. Fig. 1.1a and 1.1b show images of the same scene captured using a low-light-television (LLTV) sensor and a forward-looking-infrared (FLIR) sensor, respectively. The LLTV sensor provides the surface information of the ground as well as building and vegetation details, whereas the FLIR image describes the road network details accurately. The fused image shown in Fig. 1.1c contains all the salient features of both sensors.

Figure 1.1: Image fusion example (a) LLTV Image (b) FLIR Image (c) Fused Image

Image fusion has been used extensively in various areas of image processing such as remote sensing, biomedical imaging, nondestructive evaluation, etc. [11-13]. In remote sensing, the objective is the acquisition and interpretation of spectral measurements made at a distant location to obtain information about the Earth's surface. In order to produce a high-accuracy map, the classification process assigns each pixel of the image to a particular class of interest (such as water, dry soil, wet soil, etc.). The image resulting from the labeling of individual pixels is known as the thematic (or class) map [3]. A typical remote sensing system uses multispectral sensors, and the thematic map is generated using information present across the observed images [5]. As another example, recent developments in medical imaging have resulted in many imaging techniques to capture various aspects of the patient's anatomy and metabolism [14,15]. However, these techniques are complementary in nature. For example, magnetic resonance imaging (MRI) is very useful for defining anatomical structure, whereas metabolic activity can be captured very reliably using Positron Emission Tomography (PET). Hence, by using fusion, it is possible to obtain a single image that describes anatomical as well as metabolic activity of the patient effectively [16]. Similarly, in nondestructive evaluation of materials, a single sensor does not capture all the details of the flaw [13]. For example, high frequency eddy current sensors are capable of capturing the surface footprint of the flaw, but they cannot provide much information about its depth.
Similarly, low frequency eddy current sensors characterize the depth of the flaw but are poor at estimating the surface details. It has been reported in the literature that fusion of images obtained from high and low frequency eddy current sensors provides accurate and robust estimation of flaws as compared to results obtained from the input images [17,18].

1.2 Image Fusion Issues

The central objective of image fusion is to combine complementary and redundant information from multiple images such that the fused image represents the original scene better than any of the input images. However, development of fusion algorithms is a challenging task. Image fusion algorithms are always application dependent and require good knowledge of the image acquisition process. Typically, in image fusion based applications a suite of sensors is used to acquire images. Different sensors have different sensing characteristics; as a result, the input images may be significantly different from each other and cannot be compared directly. For example, at the pixel level, averaging works well if the input images are similar. But averaging cannot be used to fuse visible-band and infrared images as it will reduce the contrast [19]. Therefore, for fusion purposes, these images must be transformed into representations in which they can be compared and fused [20]. However, the transformation is application dependent. Some of the challenges that are commonly faced during the development of fusion algorithms are given below:

1. Feature Discrepancy: The designs of the sensors are based on different physical phenomena; therefore, their image acquisition processes are very different. Consequently, the features (edges, local contrast, etc.) of the images obtained using these sensors are very different. Feature mismatch makes it difficult to compare the input images and fuse them. For example, the local contrast of visible-band and infrared images has opposite polarity [21]. This is also referred to as local polarity reversal. Hence, to fuse these images, the fusion algorithm must be able to handle the local polarity reversal. In some cases, image features present in one image are missing in others. Such features are called complementary features [22,23]. The fusion algorithm should retain the complementary features as they improve the interpretability of the original scene.

2. Common Geometric Representation: Images obtained from different sensors may represent different geometries of the scene. For example, in visible-band imaging, images are formed by a 2-D projection of the 3-D scene on the image plane. However, millimeter-wave radar (MMWR) sensors have a range geometry as they measure the intensity of the reflected radio wave. To fuse images obtained from these sensors, the input images must be transformed to a common geometric representation [20].

3. Registration: Registration is the process of aligning images such that there is a one-to-one correspondence between the pixels of these images [24]. Registration is a key requirement for fusion, especially for pixel-level fusion. Images from different sensors may be misaligned because of several factors, including different spatial positions of the sensors, different temporal capture rates, etc. The misaligned input images must be co-registered before fusion.

1.3 Research Contribution

The main contribution of this thesis is the development of algorithms based on a model based image fusion framework.
The central objective of any fusion algorithm is to identify the redundant and complementary information and integrate them suitably. The relationship between the input and fused images may be represented using mathematical models. The primary advantage of the model based approach is that it allows incorporation of complementary as well as redundant information present in the input images in a mathematically convenient and analytically tractable manner. If the framework that relates the input images is sufficiently generic, then a fusion algorithm based on this framework can be used in more than one area of image processing. For example, in several areas of image processing such as NDE, digital camera applications, medical imaging, etc., the image acquisition process is modeled as a convolution of the true image and the point spread function (PSF) of the sensor. Therefore, a fusion algorithm based on a convolution model can be used in all these application areas.

The single input multiple output (SIMO) multichannel model that has been used extensively for multichannel image restoration [25,26] can also be used as an image fusion model. Fig. 1.2 shows the schematic of a SIMO model. In this model, it is assumed that the measured or observed images (I_1, ..., I_n) are obtained by imaging the same scene (I_0) using different sensors that have complementary information. Therefore, the original image can be estimated by using all of the measured images simultaneously. This model can also be perceived as an image fusion model where I_1, ..., I_n are the observed images and I_0 is the fused image. If the sensors are linear and space invariant, then the measured images can be modeled as the outputs of linear shift invariant systems.

Figure 1.2: Single Input Multiple Output (SIMO) Model

In this case, a multichannel deconvolution approach can be used to estimate I_0 from I_1, ..., I_n [25]. Currently existing algorithms typically solve this problem using a linear inverse problem framework [26]. But these approaches usually involve storage of large matrices, which makes it difficult to use these algorithms for large images. This thesis presents an alternative approach for multichannel blind deconvolution based on the Richardson-Lucy deconvolution algorithm for image deblurring [27,28]. The proposed algorithm is iterative and does not require knowledge of the sensor PSFs. The algorithm is applied to motion blur removal, which is a common problem in digital camera applications.

A multichannel deconvolution based image fusion approach is of use only if all observed images are blurred. Blurring is a global operation, i.e., the entire image appears blurry. However, sometimes observed images have blurs at the local level, i.e., only some regions of the images are blurred. The multifocus problem is a classic example of this case. The multifocus problem is caused by the limited depth of field of optical lenses. Lenses of long focal length cannot generate very clear images of all the objects in a scene that are located at different distances from the lens. As a result, the object which is in focus appears visually appealing whereas the rest of the objects appear blurry [29]. Clearly, the multichannel deconvolution model is not appropriate to solve this problem.
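As a concrete illustration of the SIMO forward model of Fig. 1.2 discussed above, the short Python/NumPy sketch below simulates a single scene observed through several linear shift-invariant channels, each with its own PSF. This is only an illustrative simulation: the helper motion_psf and the specific blur parameters (borrowed from the cameraman experiment of Chapter 3) are assumptions made for the example, not part of the dissertation's implementation.

```python
import numpy as np
from scipy.signal import fftconvolve

def motion_psf(length, angle_deg, size=15):
    """Illustrative linear motion-blur PSF (length in pixels, angle in degrees)."""
    psf = np.zeros((size, size))
    c = size // 2
    theta = np.deg2rad(angle_deg)
    for t in np.linspace(-length / 2.0, length / 2.0, 4 * length):
        r = int(round(c - t * np.sin(theta)))   # image rows grow downward
        s = int(round(c + t * np.cos(theta)))
        if 0 <= r < size and 0 <= s < size:
            psf[r, s] = 1.0
    return psf / psf.sum()

# SIMO forward model: each channel observes the same scene I_0 through its own LSI sensor.
scene = np.random.rand(256, 256)                      # stand-in for the true image I_0(x, y)
psfs = [motion_psf(5, 13), motion_psf(6, 120), motion_psf(6, 90)]
observations = [fftconvolve(scene, h, mode="same") for h in psfs]   # I_i = h_i * I_0
```

Note that this forward model assumes a global, shift-invariant blur in every channel; it does not describe local degradations such as the multifocus case just mentioned.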
Similarly, in many applications such as radar imaging and visual and IR imaging, the complementary and redundant information is available at the local level in the measured images [21,30]. These types of images are not blurred at all but contain complementary information, and convolution based models cannot be used to fuse such images. To fuse such images, two models have been investigated, namely (i) the weighted average model, and (ii) the local affine model. In the weighted average approach, the fusion process is modeled as a weighted average of input pixels, where the weights of the fusion process are estimated by maximizing the average power of the corresponding fused pixel. This approach does not require any user specified threshold, unlike other similar algorithms, and is computationally efficient. However, this model is susceptible to sensor noise. In a local affine model, the relationship between fused and input pixels is expressed using an affine model [21,30,31]. Existing approaches estimate the fused image from this model using parametric techniques. In the proposed research, a total variation (TV) norm [32] based approach has been used to estimate the fused image. The total variation norm has been used in several image processing applications [32-35] and is based on a nonlinear partial differential equation (PDE). It is robust to noise [36,37], and is a data driven approach that does not require knowledge of the probability density function of the fused pixels.

An interesting application of image fusion is super-resolution (SR). In many areas of image processing, high resolution (HR) images are desired and often required. The range of resolution of imaging devices is limited. The images captured using such devices may not capture all the details of the scene if the scene has many levels of detail. The resolution of an image can be increased either at the hardware or the software level. Hardware based approaches are sensitive to noise and the cost involved is inappropriate for general purpose commercial applications. In software based approaches, multiple low resolution images of the scene are captured and signal processing techniques are used to fuse these low resolution images to obtain a higher resolution image. This approach is called super-resolution [6,38-41]. An example of super-resolution is shown in Fig. 1.3. Fig. 1.3a - 1.3c are low resolution images and Fig. 1.3d shows the corresponding high resolution image.

Figure 1.3: Super-resolution Example: (a) LR Image 1 (256x256) (b) LR Image 2 (256x256) (c) LR Image 3 (256x256) (d) SR Image (512x512)

The super-resolution technique is much more cost effective than a hardware solution. Super-resolution is an inverse problem [6,38] and the forward model consists of three stages, namely (i) warping, (ii) blurring, and (iii) downsampling. Classic super-resolution approaches assume knowledge of all three stages [6]. In addition, these approaches are computationally expensive. A computationally efficient approach that does not require knowledge of the blurring transfer function and uses the fusion algorithm proposed here has been investigated for super-resolution.

1.4 Organization of the Report

The rest of the report is organized as follows. Chapter 2 provides a literature survey of image fusion that includes quality estimation of fused images, inverse problems and local affine based frameworks for super-resolution and image fusion.
Chapter 3 describes a multichannel deconvolution algorithm and its extension to super-resolution for digital images. Chapter 4 discusses a weighted average approach for image fusion, whereas Chapter 5 provides a local affine transform based approach for fusion. Finally, Chapter 6 provides a summary of the proposed research work.

CHAPTER 2

Literature Review

2.1 Data Fusion Models

Due to the large number of applications as well as the diversity of fusion techniques, considerable effort has been put into developing standards for data fusion. Several models for data fusion have been proposed in the recent past [42,43]. One of the models commonly used in signal processing applications is the three-level fusion model that is based on the levels at which information is represented [44]. This model classifies data fusion into three levels depending on the way information present in the data/image is represented and combined. If the raw images are directly used for fusion, it is called low level fusion. When features of the raw images such as edges, texture, etc. are used for fusion, it is called feature or intermediate level fusion. The third level of fusion is known as decision or high level fusion, in which decisions made by several experts are combined. A detailed description of these three levels of fusion is given below:

• Low Level Fusion: At this level, several raw images are combined to produce a new "raw" image that is expected to be more informative than the inputs [45-49]. The main advantage of low level fusion is that the original measured quantities are directly involved in the fusion process. Furthermore, the algorithms are computationally efficient and easy to implement. Low level fusion requires a precise registration of the available images.

• Feature or Intermediate Level Fusion: Feature level fusion combines various features such as edges, corners, lines, texture parameters, etc. [50-52]. In this model, several feature extraction methods are used to extract the features of the input images and a set of relevant features is selected from the available features. Methods of feature fusion include Principal Component Analysis (PCA), Multi-layer Perceptron (MLP), etc. [53]. The fused features are typically used for computer vision applications such as segmentation, object detection, etc. [53]. In feature level fusion, not all the pixels of the input images contribute to the fusion. Only the salient features of the images are extracted and fused.

• Decision or High Level Fusion: This stage combines decisions from several experts [54-57]. Methods of decision fusion include voting methods [55], statistical methods [54], fuzzy logic based methods [56], Dempster-Shafer's method [57], etc.

2.2 Evaluation of Fused Image

Image quality measurement is crucial for most image processing applications. Generally speaking, an image quality metric has three kinds of applications [58]:

1. Control the Quality of the Image Acquisition Process: The image quality metric can be used to optimize the performance of an image acquisition system. For example, an image and video acquisition system can use the quality metric to monitor and automatically adjust itself to obtain the best quality image and video data.

2. Benchmarking of Image Processing Systems and Algorithms: It can be employed to benchmark image processing systems and algorithms.
Suppose we need to select one from multiple image processing systems for a specific task; a quality metric can help evaluate which algorithm provides the best result.

3. Design of Image Processing Systems: An image quality metric can be used to optimize the algorithms and the parameters of image processing algorithms. For instance, in a visual communication system, a quality metric can help in the optimal design of the pre-filtering and bit assignment algorithms at the encoder and the post-processing algorithms at the decoder [58].

Usually, an image quality metric is derived by comparing the processed image with the ground truth image [59]. However, in the case of image fusion, ground truth data is not always available. Therefore, the quality must be measured by using only the input and the fused images. For this reason, the objective assessment of the quality of the fused image is a challenging problem in image processing. Some of the quality metrics that have been commonly used in the area of image fusion are summarized below.

• Similarity Index Based Quality Metric: A similarity index based quality metric is based on the universal image quality index (UIQI) proposed by Wang et al. [59]. It is used to compare a processed image with the corresponding ground truth image. For the processed image y = {y_i | i = 1, 2, ..., N} and the corresponding ground truth image x = {x_i | i = 1, 2, ..., N}, the quality metric Q is defined as

Q = \frac{4 σ_{xy} x_{av} y_{av}}{(σ_x^2 + σ_y^2)(x_{av}^2 + y_{av}^2)},   (2.1)

where

x_{av} = \frac{1}{N} \sum_{i=1}^{N} x_i,   y_{av} = \frac{1}{N} \sum_{i=1}^{N} y_i,

σ_x^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - x_{av})^2,   σ_y^2 = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - y_{av})^2,

σ_{xy} = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - x_{av})(y_i - y_{av}).

The dynamic range of Q is [-1, 1]. Due to the nonstationary nature of the image signal, (2.1) is applied to local regions of the image using a sliding window and the final index is computed by taking the average of all the local quality indices.

Cvejic et al. extended UIQI for measuring the quality of fused images by using only the input and the fused images [60]. This quality index is based on a weighted combination of the universal image quality index between the fused and the input images. A brief description of the algorithm is given below. Let x = {x_i | i = 1, 2, ..., N} and y = {y_i | i = 1, 2, ..., N} be two input images for fusion and f = {f_i | i = 1, 2, ..., N} the corresponding fused image. The fusion quality index Q_F is defined as

Q_F = sim(x, y, f) Q(x, f) + (1 - sim(x, y, f)) Q(y, f).   (2.2)

The similarity factor sim(x, y, f) is expressed as

sim(x, y, f) = 0   if σ_{xf}/(σ_{xf} + σ_{yf}) < 0,
sim(x, y, f) = σ_{xf}/(σ_{xf} + σ_{yf})   if 0 ≤ σ_{xf}/(σ_{xf} + σ_{yf}) ≤ 1,   (2.3)
sim(x, y, f) = 1   if σ_{xf}/(σ_{xf} + σ_{yf}) > 1,

where

σ_{xf} = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - x_{av})(f_i - f_{av}),   σ_{yf} = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - y_{av})(f_i - f_{av}).

The similarity index lies between -1 and +1. Index values close to ±1 indicate the robustness of the fusion algorithm.

• Entropy Based Quality Metric: The entropy [61] of an image is defined as [62,63]

E = -\sum_{i=0}^{G} p(i) \log_2 p(i),   (2.4)

where G is the number of gray levels in the image and p(i) is the normalized frequency of occurrence of the i-th gray level. The entropy is used to measure the overall information in the fused image. The larger the value of the entropy, the better the fusion result.

• Cross Entropy Based Quality Metric: The overall cross entropy [61] of input images X, Y and fused image F is defined as [63]

CE = \frac{CE(X, F) + CE(Y, F)}{2},   (2.5)

where

CE(X, F) = \sum_{i=0}^{G} p_X(i) \log_2 \frac{p_X(i)}{p_F(i)},   (2.6)

CE(Y, F) = \sum_{i=0}^{G} p_Y(i) \log_2 \frac{p_Y(i)}{p_F(i)},   (2.7)
and p_X(i), p_Y(i) and p_F(i) are the normalized frequencies of occurrence of the i-th gray level of images X, Y and F, respectively. G is the number of gray levels in the input and fused images. The overall cross entropy is used to measure the difference between the input images and the fused image. In general, the smaller the value, the better the fusion result.

The entropy and the cross entropy based quality metrics are susceptible to noise and are not used to evaluate the fused result in the presence of noise. The quality metrics for fusion discussed above are based on the first and second order statistics of the image data, which are not sufficient to capture all the features of the fused image that are relevant to the human visual system. Hence, a quality metric based on the human visual system would be more appropriate to evaluate the fused image quantitatively. The quality metrics discussed above are useful if the fused image is the final output of the system. However, in some cases, image fusion is just an intermediate step and the fused image is further used by other image processing applications. In such cases, the performance of the application that exploits the fused image can be utilized to judge the quality of the fusion.

2.3 Prior Work

In this section, a survey of prior work related to the research foci in this dissertation is provided.

2.3.1 Image Fusion using Multichannel Deconvolution

Under the assumption that the sensor used in the image acquisition process is linear and space invariant, the observed image can be modeled as the output of a linear, space invariant system [64] as shown in Fig. 2.1. I_0(x, y) and I(x, y) are the true and observed images, respectively, h(x, y) is the sensor PSF and η(x, y) is noise added during image acquisition. In this model, the noise has been assumed to be additive.

Figure 2.1: Single Sensor Image Acquisition Process

The relationship between I_0(x, y) and I(x, y) can be expressed as a convolution operation [64] as shown below:

I(x, y) = h(x, y) * I_0(x, y) + η(x, y).   (2.8)

In general, I(x, y) is a smeared representation of I_0(x, y) since the sensor PSF is not a delta function. Typically, inverse techniques are used to restore I_0(x, y) from I(x, y), and the most commonly used technique is deconvolution [64]. In cases where the sensor PSF h(x, y) is known, a Wiener filter based deconvolution approach [65] can be used. In most practical applications, the only known quantity is I(x, y). In such cases, a blind deconvolution technique is necessary to estimate both I_0(x, y) and h(x, y) [66]. However, perfect reconstruction of I_0(x, y) from I(x, y) is difficult if zeros in the frequency domain of h(x, y) coincide with non-zero frequency components of I_0(x, y).

Estimation of I_0(x, y) can be improved significantly by using a multichannel setup [25,26] as shown in Fig. 2.2. In the multichannel setup, the source is inspected using multiple sensors. If the sensors are complementary, i.e., if the sensor PSFs do not have common zeros, then it is possible to reconstruct I_0(x, y) perfectly by using all the observed images simultaneously [5]. In Fig. 2.2, h_1(x, y), h_2(x, y), ..., h_n(x, y) are the PSFs of the n sensors used to inspect I_0(x, y). I_i(x, y) (1 ≤ i ≤ n) is the output of the i-th sensor and η_i(x, y) is the corresponding sensor noise. The relationship between the original or source image I_0(x, y) and the output images I_1(x, y), ..., I_n(x, y) is expressed as given below.
I_i(x, y) = h_i(x, y) * I_0(x, y) + η_i(x, y);  1 ≤ i ≤ n.   (2.9)

Figure 2.2: Multiple Sensors Image Acquisition Process

Typically, I_0(x, y) is estimated by deconvolving all the observed images simultaneously. This approach is known as multichannel deconvolution [25,26]. Similar to single channel deconvolution, if the PSFs h_1(x, y), h_2(x, y), ..., h_n(x, y) are unknown, the approach is called blind multichannel deconvolution. Note that the multichannel deconvolution schematic can be viewed as an image fusion model in which I_0(x, y) is the fused image that needs to be estimated by using all the observed images I_1(x, y), I_2(x, y), ..., I_n(x, y) simultaneously.

In [25], a Fourier transform based approach has been proposed to estimate I_0(x, y) from multiple digital camera images. Let Î_i(f_1, f_2) (1 ≤ i ≤ n) be the Fourier transform of the i-th observed image I_i(x, y). Similarly, H_1(f_1, f_2), H_2(f_1, f_2), ..., H_n(f_1, f_2) are the Fourier transforms of h_1(x, y), h_2(x, y), ..., h_n(x, y), respectively. Let Î_0^{(Est)}(f_1, f_2) be the Fourier transform of the estimated I_0(x, y). Then,

Î_0^{(Est)}(f_1, f_2) = \frac{\sum_{i=1}^{n} H_i^*(f_1, f_2) Î_i(f_1, f_2)}{\sum_{i=1}^{n} |H_i(f_1, f_2)|^2 + K},   (2.10)

where H_i^*(f_1, f_2) is the conjugate of H_i(f_1, f_2) and K is a regularization constant. However, this approach assumes that the PSFs are estimated from the camera exposure time, which limits its application. Furthermore, the regularization constant K is selected heuristically.

Reference [26] estimates I_0(x, y) by solving a set of linear equations. Let I_0^l, I_i^l and η_i^l be the lexicographical representations of the input image I_0(x, y), the i-th observed image I_i(x, y) and the noise η_i(x, y), respectively. Therefore, if 1 ≤ x ≤ M and 1 ≤ y ≤ N, then each of I_0^l, I_i^l and η_i^l is a vector of length MN. Let H_i be the matrix representation of the PSF h_i(x, y). Clearly, the size of H_i is MN × MN. From eq. (2.9),

I_i^l = H_i I_0^l + η_i^l;  1 ≤ i ≤ n   ⇒   I = H I_0^l + n,   (2.11)

where I = [I_1^l, ..., I_n^l]^T, n = [η_1^l, ..., η_n^l]^T, H = [H_1, ..., H_n]^T, and [·]^T represents the matrix transpose operation. Estimation of I_0^l from eq. (2.11) is a classical restoration problem and there are several methods available to solve this equation [67,68]. However, H_i is very large even for moderately sized images, and it is therefore difficult to use this approach for large images.

2.3.2 An Optimal Weighted Average Pixel-Level Fusion

A simple and intuitive approach for pixel level fusion consists of estimating the fused pixel by taking a weighted combination of the corresponding pixels in the input images [47,69]. Let X and Y be two input images that are to be fused and Z the corresponding fused image. For any given spatial location (m, n), the fused pixel and the input pixels can be related as

Z(m, n) = w_x(m, n) X(m, n) + w_y(m, n) Y(m, n).   (2.12)

To compute the weights w_x and w_y, a match measure M_{XY}(m, n) is computed [47,70]. The expression for M_{XY}(m, n) is given below:

M_{XY}(m, n) = \frac{2 \sum_{s \in S, t \in T} w(s, t) X(m+s, n+t) Y(m+s, n+t)}{A_X^2(m, n) + A_Y^2(m, n)},   (2.13)

where w(s, t) is a weight window with \sum_{s \in S, t \in T} w(s, t) = 1. S and T describe the size of the window, and the weights w(s, t) are defined by the user. A_X(m, n) and A_Y(m, n) are activity levels, defined below:

A_X(m, n) = \sum_{s \in S, t \in T} w(s, t) |X(m+s, n+t)|,   (2.14)

A_Y(m, n) = \sum_{s \in S, t \in T} w(s, t) |Y(m+s, n+t)|.   (2.15)

The following rule is used to estimate the weights w_x and w_y:

w_x(m, n) = 0   if M_{XY}(m, n) < α,
w_x(m, n) = \frac{1}{2} - \frac{1}{2}\left(\frac{1 - M_{XY}(m, n)}{1 - α}\right)   if M_{XY}(m, n) ≥ α,   (2.16)

w_y(m, n) = 1 - w_x(m, n),   (2.17)

where α is a user defined threshold.

A local variance based method has also been proposed to estimate the weights w_x and w_y [47]. In this approach, the weight for each input image is determined by computing the variance in an (N × M) neighborhood centered at (m, n):

w_x(m, n) = \left[\frac{1}{MN} \sum_{s \in S, t \in T} \left(X(m+s, n+t) - X_{av}(m, n)\right)^2\right]^{a},   (2.18)

where X_{av}(m, n) is the mean value computed over the (N × M) neighborhood centered at (m, n) and the exponent a allows the weight distribution to be modified. The weight w_y is estimated in a similar manner.

Petrovic et al. proposed a background elimination method for fusion [71]. In this approach, the fused pixel Z(m, n) is estimated as

Z(m, n) = X(m, n) - μ_X(m, n) + Y(m, n) - μ_Y(m, n) + \frac{μ_X(m, n) + μ_Y(m, n)}{2},   (2.19)

where μ_X(m, n) and μ_Y(m, n) are the mean values in a given window. In [72], the following approach is proposed:

Z(m, n) = \sqrt{X(m, n) Y(m, n)} + b,   (2.20)

where the constant b is a bias used to modify the final fused image.

The approaches for weighted pixel level fusion discussed above are heuristic in nature. Furthermore, most of them depend on a user specified threshold. In this dissertation we propose a "Rayleigh Quotient" [73] based approach for weighted pixel level fusion. This algorithm is mathematically robust, computationally efficient and does not require any heuristic threshold.

2.3.3 Local Affine Models for Pixel-Level Fusion

The local affine transform based model for image fusion has gained significant attention recently [21,30,31] because it can efficiently relate the complementary as well as redundant information present in the input images at the local level. Furthermore, it allows incorporation of the effect of sensor noise. Let I_0(x, y) be the true image, which is inspected by n different sensors, and let I_1(x, y), I_2(x, y), ..., I_n(x, y) be the corresponding n measurements. The local affine transform that relates the input pixel and the corresponding pixels in the measured images is given by

I_i(x, y) = β_i(x, y) I_0(x, y) + α_i(x, y) + η_i(x, y);  1 ≤ i ≤ n,   (2.21)

where β_i(x, y), α_i(x, y) and η_i(x, y) are the gain, offset and sensor noise, respectively, of the i-th sensor at location (x, y). The goal of fusion is to estimate I_0(x, y).

In many applications such as radar imaging and visual and IR imaging, complementary, redundant and polarity reversed information is available at the local level in the measured images [21,30]. The main advantage of the local affine transform model is that it can relate these local information contents efficiently, which is not possible in the multichannel deconvolution setup. Note that two sensors i and j (i ≠ j; 1 ≤ i, j ≤ n) have complementary information at location (x, y) if β_i(x, y) ≠ β_j(x, y) and β_i(x, y), β_j(x, y) ∈ {0, 1}. Similarly, these two sensors have redundant information if β_i(x, y) = 1 and β_j(x, y) = 1. Furthermore, β_i(x, y) ≠ β_j(x, y) with β_i(x, y), β_j(x, y) ∈ {-1, 1} corresponds to polarity reversal. Eq. (2.21) can be re-written as

[I_1, ..., I_n]^T = [β_1, ..., β_n]^T I_0 + [α_1, ..., α_n]^T + [η_1, ..., η_n]^T   (2.22)

⇒ I = β I_0 + α + n,   (2.23)

where I = [I_1, ..., I_n]^T, β = [β_1, ..., β_n]^T, α = [α_1, ..., α_n]^T, and n = [η_1, ..., η_n]^T. The reference to the pixel location (x, y) has been dropped for simplicity.

In [21], eq. (2.23) is solved using a Bayesian approach. The noise vector n is assumed to be Gaussian with zero mean and diagonal covariance Σ_η = diag[σ_{η_1}^2, ..., σ_{η_n}^2], where σ_{η_i}^2
is the variance of the noise of the i-th sensor at location (x, y). Furthermore, the a priori probability density function of I_0 is assumed to be Gaussian with mean m_0 and variance σ_0^2. The input pixel I_0 is estimated using the maximum a posteriori (MAP) technique. Let I_0^{MAP} be the MAP estimate of I_0. Then,

I_0^{MAP} = \left[β^T Σ_η^{-1} β + \frac{1}{σ_0^2}\right]^{-1} \left(β^T Σ_η^{-1} (I - α) + \frac{m_0}{σ_0^2}\right).   (2.24)

The sensor gains β are estimated using factor analysis. However, the performance of this approach is compromised if the sensor noise is not i.i.d. Gaussian or the a priori density is not Gaussian. Reference [30] estimates I_0(x, y) using an Expectation-Maximization approach. However, it is assumed that β_i(x, y) ∈ {-1, 0, 1}. Furthermore, the sensor noise η_i(x, y) is modeled using a K-term mixture of Gaussian probability density functions.

2.3.4 Super-resolution

The range of resolution of imaging devices is limited. The images captured using such devices may not capture all the details of the scene if the scene has many levels of detail. In many areas of image processing, high resolution (HR) images are desired and often required. The pixel density within an HR image is high; consequently, the HR image offers more detail that may be critical in various applications such as digital imaging, medical imaging, satellite imagery, etc.

The resolution of images can be increased either at the hardware or the software level. Hardware based approaches typically include increasing the number of pixels per unit area or increasing the size of the sensor chip. Increasing the number of pixels per unit area increases shot noise, which in turn makes the HR image noisy [40]. Increasing the chip size increases the cost of the imaging device, which may be inappropriate for general purpose applications. In addition, increasing the chip size also increases the capacitance, which reduces the image pixel transfer rate. In the software based approach, multiple low resolution (LR) images of the scene are captured and signal processing techniques are used to fuse these low resolution images to obtain a high resolution image. This approach is called super-resolution [6,38-41]. The super-resolution technique is much more cost effective than a hardware solution.

Conceptually, each observed LR image is a downsampled or aliased version of the HR image. The aliasing effect cannot be removed by processing any one of the observed LR images alone. However, if these LR images are shifted with respect to each other with subpixel precision, then the shift information can be exploited to undo the aliasing effect and obtain HR images [40]. Note that if the LR images are shifted by integer units, then each image contains the same information, and thus there is no new information that can be used to reconstruct the HR image. For static scenes, the shift is introduced by the image acquisition process. For example, if multiple imaging sensors are used to image the scene, then these sensors must be located at different positions. Similarly, in the single imaging sensor case, the sensor is moved appropriately to introduce subpixel level shifts [40]. For dynamic scenes, the necessary shift is introduced by the relative motion between the camera and the scene [40].

Super-resolution has been used in several image processing applications such as remote sensing, medical imaging, biometrics, and image and video surveillance. Super-resolution techniques have been used extensively to increase the spatial resolution of remote sensing data.
The multispectral images obtained using remote sensing technologies have subpixel level shifts that are used for super-resolution purposes [74,75]. Super-resolution has also recently been applied to magnetic resonance imaging (MRI) images [76,77]. In this case, super-resolution has been used to reduce the thickness of 2D slices [77] and to obtain 3D images from 2D slices [76]. Biometrics is also an important area of application of super-resolution techniques. Applications of super-resolution to iris and face data to improve the accuracy of target identification and authentication can be found in [78] and [78], respectively. In surveillance based applications, the location of interest is monitored using digital cameras, webcams, camcorders, etc., and these images/videos are analyzed for detecting abnormalities. Due to bandwidth constraints, these data are often captured at low resolution and super-resolution is used to obtain HR images/videos [79].

Fig. 2.3 shows an observation model that relates the observed LR images and the corresponding desired HR image. Let I_1, I_2, ..., I_n be the lexicographical representations of n low resolution images obtained using n different sensors. Note that these sensors could represent n different physical devices or the same device with n different settings. The i-th low resolution image I_i is obtained by warping, blurring and downsampling the HR image I_HR (lexicographically arranged).

Figure 2.3: Observation Model for Super-resolution

Warping captures the global or local translation and rotation of the imaging sensors. Imaging sensors also introduce blurring due to out-of-focus, the diffraction limit, relative motion between the imaging sensor and the original scene, etc. The spatial resolution of the HR image is higher than that of the low resolution observed images. In the observation model, the spatial resolution reduction is represented by the downsampling stage. Furthermore, the low resolution images are corrupted by sensor noise. η_i is the noise of the i-th sensor, which has been assumed to be additive.

Representing the warping, blurring and downsampling operations of the i-th sensor by matrices W_i, H_i and D_i, the relationship between the LR images and the corresponding HR image can be expressed as

I_i = D_i H_i W_i I_{HR} + η_i;  1 ≤ i ≤ n   (2.25)

⇒ I = C I_{HR} + n,   (2.26)

where I = [I_1, ..., I_n]^T, C = [D_1 H_1 W_1, ..., D_n H_n W_n]^T, and n = [η_1, ..., η_n]^T. Estimation of I_{HR} from eq. (2.26) is a classic restoration problem [67,68]. Typically, the matrices D_i, H_i, and W_i are estimated from the observed images and then deterministic or probabilistic approaches are used to solve eq. (2.25) to estimate the high resolution image I_{HR}. The downsampling matrix D_i can be estimated if the decimation ratio between the HR image I_{HR} and the k-th LR image I_k is known [26]. The number of available LR images usually determines the upper limit on the size of the HR image [26]. To estimate the warping matrix W_i, the LR images I_1, ..., I_n are upsampled and then motion vector estimation techniques are used on these upsampled LR images to estimate W_i [80,81]. The blurring matrix H_i is typically assumed to be known a priori [82]. A brief literature survey of super-resolution methods is presented next.
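Before turning to that survey, the following minimal Python sketch makes the observation model of eqs. (2.25)-(2.26) concrete by simulating one LR frame as warp, blur, downsample, and add noise. The specific choices here (a global subpixel translation for W_i, a Gaussian PSF for H_i, plain decimation for D_i) and the helper name lr_observation are illustrative assumptions only; the dissertation's own experiments use motion-blur PSFs.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, shift

def lr_observation(hr, dx, dy, blur_sigma, factor, noise_sigma, rng):
    """Simulate one LR frame of eq. (2.25): warp (W_i), blur (H_i),
    downsample (D_i), and add sensor noise (eta_i)."""
    warped = shift(hr, (dy, dx), order=3, mode="nearest")   # W_i: global subpixel translation
    blurred = gaussian_filter(warped, blur_sigma)           # H_i: illustrative Gaussian PSF
    lr = blurred[::factor, ::factor]                        # D_i: decimation by `factor`
    return lr + rng.normal(0.0, noise_sigma, lr.shape)      # + eta_i: additive noise

# Four LR frames of the same (stand-in) HR scene with different subpixel shifts.
rng = np.random.default_rng(0)
hr_scene = rng.random((512, 512))
offsets = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
lr_frames = [lr_observation(hr_scene, dx, dy, 1.0, 2, 0.01, rng) for dx, dy in offsets]
```

Super-resolution then amounts to inverting this pipeline: estimating (or assuming) D_i, H_i and W_i and solving eq. (2.26) for I_HR, which is what the CLS, POCS, IBP and MAP approaches surveyed below do in different ways.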
The HR image I_HR can be estimated from eq. (2.25) using a constrained least squares (CLS) framework [40]. In this approach, I_HR is estimated to minimize

\sum_{i=1}^{n} || I_i - D_i H_i W_i I_{HR} ||^2 + α || C I_{HR} ||^2,   (2.27)

where α is a Lagrange multiplier and C is a weighting matrix used to impose smoothness. However, this approach is computationally expensive and the matrices D_i, H_i and W_i must be known.

A projection onto convex sets (POCS) based approach for super-resolution has been proposed in [6,83]. In this approach, each piece of a priori information about the HR image, such as positivity, smoothness, etc., is modeled as a convex set in the vector space that contains the HR image. The desired HR image is the intersection of all the convex sets. However, this approach usually takes a large number of iterations to converge and the solution also depends on the initial guess.

The iterative back projection (IBP) method for super-resolution has been discussed in [84,85]. IBP is an iterative algorithm in which the observed LR images are simulated by projecting the HR image onto the LR image space, and the error between the actual and the simulated LR images is used to update the HR image. This approach is prone to noise, and the selection of an error metric such that the solution converges is not trivial.

Several probabilistic approaches have also been proposed to solve eq. (2.25). Eq. (2.25) can be re-arranged as eq. (2.26). The HR image I_HR can be estimated by maximizing the a posteriori probability density function P(I_HR/I). This approach is known as maximum a posteriori (MAP) estimation. Using eq. (2.26), P(I_HR/I) can be expressed as:

P(I_HR/I) = \frac{P(I/I_HR) P(I_HR)}{P(I)}.   (2.28)

Note that the matrix C has been assumed to be known. If it is unknown, then it must be estimated from the observed images as explained above. Since P(I) is independent of P(I_HR), maximizing eq. (2.28) is equivalent to maximizing its numerator, i.e., P(I/I_HR) P(I_HR). P(I/I_HR) can be estimated from eq. (2.26) if the probability density function of the noise n is known. The a priori density function of I_HR, P(I_HR), is usually determined based on the prior knowledge available about the HR image. Schultz and Stevenson used a Huber Markov Random Field (HMRF) model for P(I_HR) and assumed the noise to be i.i.d. random variables to estimate the HR image using the MAP model [86]. Hardie et al. used a MAP model for super-resolution under the assumption of Gaussian noise and a Gaussian MRF prior model [87]. Sementilli et al. presented a MAP estimator for super-resolution where the a posteriori density was derived from a Poisson observation model and a Poisson prior for the HR image [88].

CHAPTER 3

Image Fusion Using Multichannel Deconvolution

An iterative approach for image fusion using blind multichannel deconvolution is discussed in this chapter. The proposed algorithm is based on the Richardson-Lucy blind deconvolution algorithm [27,28,89,90].

3.1 Richardson-Lucy Deconvolution Algorithm

Richardson-Lucy (RL) deconvolution is an iterative algorithm to deblur a blurred image [27,28]. In this approach, the observed image I(x, y) and the original image I_0(x, y) are related using a convolution relationship as given below:

I(x, y) = h(x, y) * I_0(x, y),   (3.1)

where h(x, y) is the blurring point spread function (PSF) and is assumed to be known. The RL algorithm uses a Bayesian approach to estimate I_0(x, y) iteratively. Let I_0^i(x, y) be the estimate of I_0(x, y) at the i-th iteration. Then the estimate at the (i + 1)-th iteration is
I_0^{i+1}(x, y) = \left\{\left[\frac{I(x, y)}{I_0^i(x, y) * h(x, y)}\right] * h(-x, -y)\right\} I_0^i(x, y).   (3.2)

The RL algorithm has also been extended to blind deconvolution, where both h(x, y) and I_0(x, y) are estimated iteratively [90] using eqs. (3.2) and (3.3):

h^{i+1}(x, y) = \left\{\left[\frac{I(x, y)}{I_0(x, y) * h^i(x, y)}\right] * I_0(-x, -y)\right\} h^i(x, y).   (3.3)

The RL algorithm is susceptible to noise. It amplifies the noise if the observed data is blurry and noisy [91]. A modification known as the "damped" RL algorithm has been proposed to avoid the amplification of noise that occurs in the RL iteration. In this approach, the iteration is stopped in regions of the estimated image that fit the observed image well, while the estimated image is allowed to continue to improve in regions where it fits the observed image less well [91].

3.2 Proposed Approach

Let I_1(x, y), ..., I_n(x, y) be n images acquired using n linear space-invariant sensors whose PSFs are denoted as h_1(x, y), ..., h_n(x, y). The schematic of the image acquisition process is shown in Fig. 3.1, where I_0(x, y) is the source or original image.

Figure 3.1: Multichannel Image Acquisition Model

Because the sensors are assumed to be linear and shift-invariant, the observed images can be modeled as [92,93]:

I_i(x, y) = h_i(x, y) * I_0(x, y);  1 ≤ i ≤ n.   (3.4)

The goal is to estimate I_0(x, y), h_1(x, y), ..., h_n(x, y) from eq. (3.4). Note that I_0(x, y) is common to all the measurements. Motivated by this fact, we propose a serial multistage model for blind deconvolution as shown in Fig. 3.2. S_i is the i-th stage of the model. I_0^i(x, y) and I_0^{i+1}(x, y) are the input and output variables of the i-th stage, respectively.

Figure 3.2: Serial multistage model for blind deconvolution

The PSF estimation is done in a serial manner and only one PSF is estimated at each stage, i.e., PSF h_i(x, y) is estimated at stage S_i (1 ≤ i ≤ n). Let h_i^{est}(x, y) be an estimate of h_i(x, y). Then h_i^{est}(x, y) is estimated such that the following cost function is minimized:

h_i^{est}(x, y) = arg min_{h_i(x, y)} || I_i(x, y) - h_i(x, y) * I_0^i(x, y) ||^2,   (3.5)

subject to the constraints

\sum_{x, y} h_i(x, y) = 1, and h_i(x, y) ≥ 0 ∀ i,   (3.6)

and

I_0^{i+1}(x, y) = f(I_0^i(x, y), I_i(x, y), h_i^{est}(x, y)),   (3.7)

where f(·) denotes the Richardson-Lucy image update of eq. (3.2) applied with the estimated PSF h_i^{est}(x, y). The constraint given in eq. (3.6) ensures that the PSFs are low pass in nature and do not alter the average value of the source image. Eq. (3.7) is used to combine stages S_i and S_{i+1}. The output variable of the n-th stage, I_0^{n+1}(x, y), is the estimated value of I_0(x, y).

However, knowledge of I_0^1(x, y) is required to execute the proposed serial model. Since I_0^1(x, y) is unknown, an initial guess of I_0^1(x, y) is used and the model is implemented in an iterative manner. After the first iteration, I_0^{n+1}(x, y), computed at the previous iteration, is used as I_0^1(x, y) in the next iteration. The complete flowchart of this approach is shown in Fig. 3.3.

Figure 3.3: Flowchart of the multichannel blind deconvolution algorithm

At the k-th iteration, I_{0,k-1}^{n+1}(x, y) is applied to stage 1, which in conjunction with the measurement image I_1(x, y) is used to estimate h_{1,k}(x, y). Then, h_{1,k}(x, y) and I_1(x, y) are used to estimate I_{0,k}^2(x, y), which is passed to the next stage as an input variable. This process is repeated for all the subsequent stages. At each stage, the Richardson-Lucy blind deconvolution algorithm [90] is used to estimate h_{i,k}(x, y) and I_{0,k}^{i+1}(x, y). The output variable of the n-th stage is the estimated value of the source image I_0(x, y). The algorithm terminates when the convergence criterion in eq. (3.8) is satisfied:

|| I_{0,k}^{n+1}(x, y) - I_{0,k-1}^{n+1}(x, y) || ≤ η.   (3.8)

η is a pre-defined tolerance value.
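A minimal Python/NumPy sketch of the serial multistage scheme described above is given below. It follows the structure of Fig. 3.3 (an outer iteration over stages S_1, ..., S_n, each stage refining one PSF and the common image estimate with Richardson-Lucy updates), but it is a simplified illustration rather than the dissertation's implementation: the PSFs are represented on the full image grid, the initial guesses and iteration counts are arbitrary choices, and no damping is used.

```python
import numpy as np
from scipy.signal import fftconvolve

def rl_blind_stage(obs, f_est, psf, n_inner=10, eps=1e-12):
    """One stage S_i: alternate the RL PSF update (eq. 3.3) and the RL image
    update (eq. 3.2) for a single observed channel, holding the other factor fixed."""
    for _ in range(n_inner):
        ratio = obs / (fftconvolve(f_est, psf, mode="same") + eps)
        psf = psf * fftconvolve(ratio, f_est[::-1, ::-1], mode="same")
        psf = np.clip(psf, 0.0, None)
        psf /= psf.sum() + eps                            # constraints of eq. (3.6)
        ratio = obs / (fftconvolve(f_est, psf, mode="same") + eps)
        f_est = f_est * fftconvolve(ratio, psf[::-1, ::-1], mode="same")
    return f_est, psf

def fuse_multichannel(images, support=7, n_outer=20, tol=1e-3):
    """Serial multistage blind deconvolution (Fig. 3.3): the output of stage i is the
    input of stage i+1; outer iterations repeat until the change in the fused
    estimate falls below a tolerance, as in eq. (3.8)."""
    f_est = np.mean(images, axis=0)                       # initial guess for I_0
    psfs = []
    for _ in images:                                      # uniform PSFs on an assumed support
        p = np.zeros_like(f_est)
        cy, cx = p.shape[0] // 2, p.shape[1] // 2
        h = support // 2
        p[cy - h:cy + h + 1, cx - h:cx + h + 1] = 1.0
        psfs.append(p / p.sum())
    for _ in range(n_outer):
        f_prev = f_est.copy()
        for i, obs in enumerate(images):                  # stages S_1, ..., S_n
            f_est, psfs[i] = rl_blind_stage(obs, f_est, psfs[i])
        if np.linalg.norm(f_est - f_prev) <= tol * np.linalg.norm(f_prev):
            break
    return f_est
```

In practice the inputs would be non-negative intensity images, and a damped RL update or pre-denoising (see Section 3.4) would be substituted when the observations are noisy.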
3.3 Results

The proposed algorithm was tested using benchmark images that were artificially blurred using three known motion blur filters (three-channel image restoration). In all cases, the support of the PSFs was assumed to be known. An example is the "cameraman" image in Fig. 3.4a, which was blurred using three different types of filters: (i) motion filter of length 5 and angle 13 (Channel 1), (ii) motion filter of length 6 and angle 120 (Channel 2), and (iii) motion filter of length 6 and angle 90 (Channel 3).

Figure 3.3: Flowchart of the multichannel blind deconvolution algorithm

The blurred images are shown in Fig. 3.4b-3.4d. The results of the single channel deconvolution as well as the proposed algorithm are shown in Fig. 3.5. Clearly, the proposed multichannel algorithm performs better than single channel deconvolution. Similar results for the "barbara" image are shown in Fig. 3.6 and 3.7. While the results of the proposed algorithm are visually pleasing, a quality metric is necessary to objectively compare the results of single and multichannel deconvolution. The universal image quality index (UIQI) proposed by Wang and Bovik [59] is used to compare the deconvolved image with the corresponding ground truth image. UIQI is similar to the correlation coefficient and lies between -1 and +1. Index values close to ±1 indicate the robustness of the image processing algorithm. However, this index can only be computed when the original image is available. The UIQI values for the "cameraman" and "barbara" images are summarized in Table 3.1. Clearly, the high values of UIQI indicate the feasibility of the proposed algorithm. However, note that for the "barbara" image the UIQI values of single as well as multichannel deconvolution are the same, but the visual quality of multichannel deconvolution is better than that of single channel deconvolution.

The Richardson-Lucy (RL) deconvolution algorithm amplifies noise [91]. Since the proposed algorithm is based on the RL algorithm, it is expected that the fused image will be noisy if the input images are noisy. To validate this property empirically, white Gaussian noise of mean 0 and signal-to-noise ratio (SNR) 19 dB was added to the blurred cameraman images shown in Fig. 3.4b-3.4d. The SNR was computed using the following expression:

$$\mathrm{SNR} = 10 \log_{10} \left( \frac{\text{Signal Variance}}{\text{Noise Variance}} \right) \; \mathrm{dB}. \qquad (3.9)$$

The corresponding noisy and blurred images are shown in Fig. 3.8a-3.8c. Fig. 3.8d shows the corresponding fused image. It is clear that the proposed algorithm amplifies the noise. The UIQI of the fused image, obtained by comparing it with the original cameraman image, was 0.37. A modification to RL deconvolution, known as "damped" Richardson-Lucy, has been proposed in the literature and is more robust to noise than the RL algorithm [91]. Fig. 3.8e shows the fused image obtained by using the damped RL algorithm in the proposed fusion framework. The UIQI in this case was 0.39. Note that the fused image obtained using damped RL is less noisy than the one obtained using the RL algorithm. However, damping also introduces extra smoothing. Fig. 3.9 shows a plot of the SNR of the input images vs. the UIQI of the corresponding deblurred cameraman image. It is clear that damped RL performs better than RL without damping.
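As an aside, the noisy test inputs described above can be generated by scaling white Gaussian noise to a target SNR as defined in eq. (3.9). A minimal sketch is given below; it is illustrative only, and the helper name add_noise_snr is an assumption of this sketch rather than code from the dissertation.

function [noisy, sigma] = add_noise_snr(img, snr_db)
% Add zero-mean white Gaussian noise so that eq. (3.9) yields snr_db.
signal_var = var(img(:));
noise_var  = signal_var / 10^(snr_db / 10);   % invert SNR = 10*log10(Ps/Pn)
sigma      = sqrt(noise_var);
noisy      = img + sigma * randn(size(img));
end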
The proposed fusion framework is heuristic in nature. Hence, a mathematical framework for convergence analysis is not available. However, for an empirical understanding of the convergence of the algorithm, the blurred cameraman and barbara images shown in Fig. 3.4 and 3.6 were fused using a random image, with pixel values lying between 0 and 255, as the initial guess for the fused image. The fused images, shown in Fig. 3.10, are clearly different from those shown in Fig. 3.4 and 3.6. This suggests that the proposed fusion algorithm is sensitive to the initial guess.

Figure 3.4: Cameraman image: (a) Original (b) Channel 1: blurred by motion blur of length 5 and angle 13 (c) Channel 2: blurred by motion blur of length 6 and angle 120 (d) Channel 3: blurred by motion blur of length 6 in the vertical direction

Figure 3.5: Deconvolution of cameraman image: (a) Channel 1 only (b) Channel 2 only (c) Channel 3 only (d) Proposed approach

Figure 3.6: Barbara image: (a) Original (b) Channel 1: blurred by motion blur of length 9 and angle 30 (c) Channel 2: blurred by motion blur of length 9 and angle 60 (d) Channel 3: blurred by motion blur of length 11 in the horizontal direction

Figure 3.7: Deconvolution of barbara image: (a) Channel 1 only (b) Channel 2 only (c) Channel 3 only (d) Proposed approach

Table 3.1: Performance summary of the multichannel deconvolution algorithm on benchmark images

Sl. No. | Image     | UIQI: Channel 1 | UIQI: Channel 2 | UIQI: Channel 3 | UIQI: Proposed approach
1.      | Cameraman | 0.45            | 0.41            | 0.63            | 0.75
2.      | Barbara   | 0.99            | 0.99            | 0.99            | 0.99

Figure 3.8: Noise analysis (additive white Gaussian noise, mean = 0, SNR = 19 dB): cameraman image (a) Channel 1 (b) Channel 2 (c) Channel 3 (d) Fused image: using RL without damping (e) Fused image: using damped RL

Figure 3.9: Plot of SNR vs. UIQI

Figure 3.10: Convergence analysis (a) Fused cameraman image (b) Fused barbara image

3.4 Discussion

An iterative algorithm based on Richardson-Lucy blind deconvolution is proposed in this chapter. Initial results indicate that the proposed algorithm provides superior performance when compared to single channel blind deconvolution. However, this algorithm is based on the Richardson-Lucy deconvolution algorithm, which tends to amplify noise [91]. Therefore, in the presence of noise, either the damped Richardson-Lucy algorithm [91] should be used or the input images should be denoised prior to deconvolution using denoising algorithms [94,95]. Furthermore, the proposed algorithm assumes knowledge of the true size, or support, of the PSFs.

3.5 Superresolution

Recently, a deconvolution based approach has also been proposed for super-resolution [96,97]. If the warping is a global translation and the blur is shift invariant, then the blur matrix $H_i$ and the warping matrix $W_i$ in eq. (2.25) correspond to convolution matrices and they commute. In this case eq. (2.25) can be re-written as

$$I_i = D_i H_i^N I_{HR} + \eta_i; \quad 1 \le i \le n, \qquad (3.10)$$

where $H_i^N = H_i W_i$. Since $H_i$ and $W_i$ are convolution matrices, $H_i^N$ is also a convolution matrix, and the HR image $I_{HR}$ can be estimated from eq. (3.10) using blind deconvolution. Sroubek et al. proposed a MAP estimator based blind deconvolution to estimate $I_{HR}$ [96]. The noise was assumed to be additive, white Gaussian.
Nguyen et al. also proposed a multi-channel framework, but a parametric model was assumed for the convolution matrix $H_i^N$ [97]. This is a very limiting assumption for many applications.

The multi-channel deconvolution framework has also been investigated for superresolution, in which the LR images are first upsampled and a multichannel deconvolution algorithm is applied to the upsampled LR images to estimate the HR image [82]. In this section we explore this approach for super-resolution. Representing the upsampled version of $I_i$ as $I_i^{\uparrow}$, it is clear from eq. (3.10) that

$$I_i^{\uparrow} \approx H_i^N I_{HR}; \quad 1 \le i \le n. \qquad (3.11)$$

$I_{HR}$ can be estimated from eq. (3.11) using a multichannel blind deconvolution approach. We propose to use the multi-channel blind deconvolution algorithm proposed earlier to estimate $I_{HR}$ from eq. (3.11). The overall approach for the proposed super-resolution algorithm is shown in Fig. 3.11. The observed LR images $I_1, \dots, I_n$ are assumed to be noise free. These LR images are first upsampled, and then the upsampled LR images are deconvolved using the multichannel blind deconvolution algorithm proposed above.

Figure 3.11: Overall approach for super-resolution: each LR image is upsampled and the upsampled images are passed to the multi-channel blind deconvolution stage.
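Continuing the earlier sketch, the pipeline of Fig. 3.11 can be illustrated as below. This is again only a sketch: the upsampling here uses simple nearest-neighbour replication via kron (the dissertation does not state which interpolator was used), and multichannel_blind_rl refers to the illustrative function sketched after Section 3.2, not to the author's code.

function I_hr = superresolve_mc(lr, scale, psf_support, n_outer, n_rl, eta)
% lr    : cell array of low-resolution observations (assumed noise free)
% scale : integer upsampling factor (e.g., 2)
n  = numel(lr);
up = cell(n, 1);
for i = 1:n
    up{i} = kron(lr{i}, ones(scale));      % nearest-neighbour upsampling
end
% deconvolve the upsampled images jointly, as in eq. (3.11)
I_hr = multichannel_blind_rl(up, psf_support, n_outer, n_rl, eta);
end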
3.6 Superresolution Results

The proposed super-resolution algorithm was tested using the "cameraman" and "barbara" images. In all cases, the support of the PSFs was assumed to be known. The first set of results is shown for the cameraman image. Fig. 3.12a shows the original image. Low resolution (LR) images were generated by blurring the original image using four different types of motion filter: (i) motion filter of length 7 and angle 60, (ii) motion filter of length 7 and angle 90, (iii) motion filter of length 9 and angle 0, and (iv) motion filter of length 9 and angle 135. These blurred images were further downsampled by a factor of 0.5. The LR images are shown in Fig. 3.12b-3.12e. For comparison purposes, the LR images have been shown at the same scale as the original image. The super-resolution algorithm explained in Section 3.5 was applied to these LR images. In order to investigate the effect of the number of frames used to construct the SR image, the proposed SR algorithm was applied to 1 input image, 2 input images with minimum blur, 3 input images with minimum amount of blur, and all 4 input images. Fig. 3.13 shows all four SR images. The UIQI was computed for each of these SR images by comparing them with the corresponding ground truth. The UIQI for the SR images shown in Fig. 3.13a, 3.13b, 3.13c and 3.13d were 0.43, 0.53, 0.62 and 0.61, respectively. Visual inspection also suggests that the quality of the SR images obtained using 3 and 4 input images is much better than that obtained using only 1 or 2 inputs. Furthermore, the performance with 3 and 4 inputs appears to be the same.

Similar results for the "barbara" image are shown in Fig. 3.14. The motion blurs used were: (i) motion filter of length 9 and angle 60, (ii) motion filter of length 9 and angle 90, (iii) motion filter of length 11 and angle 0, and (iv) motion filter of length 11 and angle 135. In this case the high resolution image was obtained using all four inputs. The UIQI for the "barbara" image was 0.66.

To study the performance of the proposed algorithm in the presence of noise, zero mean white Gaussian noise was added to the downsampled, blurred LR "barbara" images and the proposed SR algorithm was applied to these noisy LR frames. Fig. 3.15 shows one set of noisy images and the corresponding SR image. These noisy LR images were obtained by adding zero mean white Gaussian noise of SNR 18 dB to the LR images shown in Fig. 3.14. The SNR was computed using eq. (3.9). Clearly, the SR image appears to be noisy. The UIQI of this SR image was 0.43. Fig. 3.16 shows a plot of the UIQI of the SR image against the corresponding SNR of the input images. It is apparent that noise affects the performance of the proposed algorithm. Note that the damped RL algorithm was used in the multichannel blind deconvolution process.

Figure 3.12: LR frames for cameraman image: (a) Original image (256x256) (b) LR image 1 (128x128): blurred by motion blur of length 7 and angle 60 (c) LR image 2 (128x128): blurred by motion blur of length 7 and angle 90 (d) LR image 3 (128x128): blurred by motion blur of length 9 and angle 0 (e) LR image 4 (128x128): blurred by motion blur of length 9 and angle 135

Figure 3.13: SR results for cameraman image (256x256): (a) using one input image, shown in Fig. 3.12b (b) using two input images, shown in Fig. 3.12b and 3.12c (c) using three input images, shown in Fig. 3.12b-3.12d (d) using all four input images

Figure 3.14: LR frames for barbara image: (a) Original image (512x512) (b) LR image 1 (256x256): blurred by motion blur of length 9 and angle 60 (c) LR image 2 (256x256): blurred by motion blur of length 9 and angle 90 (d) LR image 3 (256x256): blurred by motion blur of length 11 and angle 0 (e) LR image 4 (256x256): blurred by motion blur of length 11 and angle 135 (f) SR image (512x512)

Figure 3.15: Performance in the presence of noise, SNR = 18 dB: (a) Input 1 (b) Input 2 (c) Input 3 (d) Input 4 (e) SR image

Figure 3.16: Plot of SNR vs. UIQI for "barbara" image

CHAPTER 4

An Optimal Weighted Average Pixel-Level Fusion

In the previous chapter a multichannel deconvolution algorithm was presented. Convolution is a global model in which the PSF of the sensor affects the entire image. Often, however, the complementary information is present at the local level. Furthermore, the input images that are to be fused may not be blurred at all and still contain complementary information. For example, CT and MRI sensors provide complementary information even if the images do not look blurred. Clearly, in such cases convolution is not an appropriate model for fusion. A simple and intuitive approach to extract local complementary information is pixel level fusion. In this chapter a computationally efficient approach is discussed for pixel level fusion. The proposed approach models the fusion process as a weighted average of the input data, where the weights of the fusion process are estimated by maximizing the average power of the corresponding fused pixel. The proposed approach does not require any user specified threshold and is computationally efficient.

4.1 Proposed Approach

Let $f_1(x,y), f_2(x,y), \dots, f_n(x,y)$ be $n$ observed images that are to be fused, for $(x,y) \in \Omega$. We assume the pixels of the observed images to be random variables. Furthermore, we also assume $f_i(x,y) \ge 0$ ($1 \le i \le n$). This assumption is valid for many imaging devices such as digital cameras, IR cameras, etc.,
and does not limit the proposed algorithm in any way, since data not satisfying this requirement (i.e., with negative pixel values) can always be transformed using a simple linear transformation to make the pixel values positive. Denoting the fused image as $f_0(x,y)$, the pixel of the fused image and the corresponding pixels in the observed images are related as:

$$f_0(x,y) = a_1(x,y) f_1(x,y) + a_2(x,y) f_2(x,y) + \cdots + a_n(x,y) f_n(x,y). \qquad (4.1)$$

The following constraints are imposed on the weights $a_i(x,y)$ ($1 \le i \le n$):

$$a_i(x,y) \in [0,1], \quad \sum_{i=1}^{n} a_i^2(x,y) = 1; \quad 1 \le i \le n, \;\; (x,y) \in \Omega. \qquad (4.2)$$

The weights $a_i(x,y)$ are usually unknown and must be estimated from the data. In the absence of noise, the average power of a pixel can be viewed as an estimate of the contrast of the image [59]. Therefore, we propose to estimate the weights $a_1(x,y), a_2(x,y), \dots, a_n(x,y)$ such that the average power of the fused pixel $f_0(x,y)$ is maximized. Eq. (4.1) can be re-written as

$$f_0 = \mathbf{a}^T \mathbf{f}, \qquad (4.3)$$

where $\mathbf{a} = [a_1, \dots, a_n]^T$, $\mathbf{f} = [f_1, \dots, f_n]^T$, and the reference to the pixel location $(x,y)$ has been dropped for notational simplicity. The average power of the fused pixel $f_0$, denoted $P_{f_0}$, can be estimated as shown below:

$$P_{f_0} = E[f_0 f_0^T] = E[\mathbf{a}^T \mathbf{f} \mathbf{f}^T \mathbf{a}] = \mathbf{a}^T E[\mathbf{f} \mathbf{f}^T] \mathbf{a}. \qquad (4.4)$$

Let $\Sigma_f$ be the correlation matrix of the random vector $\mathbf{f}$, i.e., $\Sigma_f = E[\mathbf{f} \mathbf{f}^T]$. Also, from eq. (4.2), it is clear that $\mathbf{a}^T \mathbf{a} = 1$. Therefore,

$$P_{f_0} = \frac{\mathbf{a}^T \Sigma_f \mathbf{a}}{\mathbf{a}^T \mathbf{a}}. \qquad (4.5)$$

We propose to estimate the weight vector $\mathbf{a}$ such that $P_{f_0}$ in eq. (4.5) is maximized. Note that $\Sigma_f$ is a real and symmetric matrix, and for any real and symmetric matrix $\Sigma_f$, $P_{f_0}$ is the Rayleigh quotient [73]. Therefore, $P_{f_0}$ is maximized if the weight vector $\mathbf{a}$ is equal to the principal eigenvector of $\Sigma_f$.

It is evident that the correlation matrix $\Sigma_f$ is unknown and must be estimated from the input images in order to estimate the weight vector $\mathbf{a}$. We use a region-based approach to estimate the correlation matrix for all the pixels in a region, as explained below. Let us assume that the input images $f_1, \dots, f_n$ are divided into blocks of size $p \times q$. For one such block, denoted $R$, assume that $(f_1^1, f_1^2, \dots, f_1^{p \times q})$, $(f_2^1, f_2^2, \dots, f_2^{p \times q})$, $\dots$, $(f_n^1, f_n^2, \dots, f_n^{p \times q})$ are the lexicographical representations of the pixel values of the $n$ input images. These pixels can be viewed as $n$-variate random variables $(f_1^1, f_2^1, \dots, f_n^1)^T$, $(f_1^2, f_2^2, \dots, f_n^2)^T$, $\dots$, $(f_1^{p \times q}, f_2^{p \times q}, \dots, f_n^{p \times q})^T$. Furthermore, we assume these $n$-variate random variables are i.i.d. Therefore, the correlation matrix of these random variables, represented as $\Sigma_\nu$, can be computed as shown below:

$$\Sigma_\nu = \frac{1}{p \times q} \sum_{k=1}^{p \times q} \nu_k \nu_k^T, \qquad (4.6)$$

where $\nu_k = (f_1^k, f_2^k, \dots, f_n^k)^T$. $\Sigma_\nu$ is used in eq. (4.5) to estimate the optimal weights to fuse all the input pixels inside region $R$. Therefore, we propose the following rule to fuse the pixels inside region $R$:

1. Compute $\Sigma_\nu$ for block $R$.
2. Let $\mu$ be the principal eigenvector of $\Sigma_\nu$, normalized such that $\mu^T \mu = 1$.
3. Set the weight vector $\mathbf{a} = \mu$ for all $(x,y) \in R$.

This process may be repeated for each block; a sketch of the resulting procedure is given below. Note that the weights computed this way are constant over the region.
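The following MATLAB-style sketch of the block-wise procedure of Sections 4.1-4.2 is illustrative rather than the author's implementation; the function name fuse_rayleigh and the handling of partial border blocks are assumptions made here for brevity.

function fused = fuse_rayleigh(imgs, p, q)
% imgs : cell array of n co-registered, nonnegative input images (same size)
% p, q : block size used to estimate the local correlation matrix
n = numel(imgs);
[rows, cols] = size(imgs{1});
fused = zeros(rows, cols);
for r = 1:p:rows
    for c = 1:q:cols
        rr = r:min(r+p-1, rows);            % clip the last (partial) block
        cc = c:min(c+q-1, cols);
        % stack the block pixels of all inputs as columns of V (n x p*q)
        V = zeros(n, numel(rr)*numel(cc));
        for i = 1:n
            blk = imgs{i}(rr, cc);
            V(i, :) = blk(:)';
        end
        S = (V * V') / size(V, 2);          % correlation matrix, eq. (4.6)
        [U, D] = eig(S);
        [dmax, idx] = max(diag(D));
        a = U(:, idx);                      % principal eigenvector (unit norm)
        a = a * sign(sum(a) + eps);         % fix the sign so the weights are nonnegative
        % weighted average of the block, eq. (4.1)
        blk_fused = zeros(numel(rr), numel(cc));
        for i = 1:n
            blk_fused = blk_fused + a(i) * imgs{i}(rr, cc);
        end
        fused(rr, cc) = blk_fused;
    end
end
end

For example, fused = fuse_rayleigh({mri, ct}, 8, 8) would correspond to the 8 x 8 block setting used in Section 4.4; the variable names mri and ct are placeholders for the two co-registered inputs.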
4.2 Overall Approach

The flowchart of the proposed pixel level fusion method is shown in Fig. 4.1. We first split the input images into several blocks and arrange the pixels inside each block lexicographically. The lexicographically arranged pixels of any given block from all the input images are used to compute the correlation matrix $\Sigma_\nu$, and the principal eigenvector of this correlation matrix is used to determine the optimal weights to fuse that block. This process is repeated for all the blocks. After the optimal weights are estimated, eq. (4.1) is used to estimate the fused pixels $f_0(x,y)$ for all locations $(x,y) \in \Omega$.

Figure 4.1: Flowchart of the proposed weighted average pixel level image fusion:
- Input: $n$ co-registered sensor images
- Output: Fused image
- Weight estimation:
  - Split each input image into $B$ blocks, each of size $p \times q$
  - For each block $b_i$, $i = 1, \dots, B$:
    - Compute the correlation matrix of the block.
    - Estimate the optimal weights for the block as the principal eigenvector of the correlation matrix, as presented in Section 4.1.
- Fusion process:
  - Use eq. (4.1) to fuse the input pixels.

4.3 Noise Analysis

To study the performance of the proposed algorithm in the presence of sensor noise, we assume that the input images $f_1(x,y), f_2(x,y), \dots, f_n(x,y)$ are corrupted by additive noise. Therefore, the fused pixel in eq. (4.1) can be written as:

$$f_0 = a_1(f_1 + \eta_1) + a_2(f_2 + \eta_2) + \cdots + a_n(f_n + \eta_n) \qquad (4.7)$$

$$\Rightarrow \; f_0 = \mathbf{a}^T \mathbf{f} + \mathbf{a}^T \boldsymbol{\eta}, \qquad (4.8)$$

where $\mathbf{a} = [a_1, \dots, a_n]^T$, $\mathbf{f} = [f_1, \dots, f_n]^T$ and $\boldsymbol{\eta} = [\eta_1, \dots, \eta_n]^T$. Assuming $\mathbf{f}$ and $\boldsymbol{\eta}$ to be uncorrelated, the average power of the fused pixel $f_0$, denoted $P_{f_0}$, can be expressed as:

$$P_{f_0} = E[f_0 f_0^T] = \mathbf{a}^T \left( \Sigma_f + \Sigma_\eta \right) \mathbf{a}, \qquad (4.9)$$

where $\Sigma_f = E[\mathbf{f} \mathbf{f}^T]$ and $\Sigma_\eta = E[\boldsymbol{\eta} \boldsymbol{\eta}^T]$. Hence, it is clear that in the presence of noise the estimation of $\mathbf{a}$ using the rule proposed in Section 4.1 will be affected by the noise, and the fused image will be noisy. Therefore, appropriate pre-filtering is necessary to mitigate the effect of noise in this algorithm.

4.4 Results

The results of the proposed fusion algorithm are presented in this section. The proposed fusion algorithm was applied to three different datasets: (i) medical imaging, (ii) aircraft navigation, and (iii) multifocus images. For each dataset, only two input images were considered for the fusion process, and these two inputs were co-registered. To compare the performance of the proposed algorithm with approaches using heuristic rules, the proposed fusion approach is also compared to the "Max" rule for fusion. In the "Max" rule, the input pixel having the maximum absolute value is selected as the fused pixel [47].

Fig. 4.2 shows the medical images that were used to validate the proposed algorithm. Fig. 4.2a and 4.2b are images of a human brain obtained using magnetic resonance imaging (MRI) and computed tomography (CT), respectively. These images are available online at [98]. CT and MRI sensors provide complementary information, with CT providing a good visual description of bone tissue, whereas soft tissues are better visualized by MRI. To apply the proposed approach, the CT and MRI images were divided into non-overlapping regions of size 8 x 8, and for each region the optimal weights for fusion were computed using the approach explained in Section 4.2. The fused image is shown in Fig. 4.2c. The fused image obtained using the "Max" rule is shown in Fig. 4.2d. In this case the performance of the "Max" rule and the proposed approach visually appears to be similar.

The proposed algorithm was also applied to the aircraft navigation images [98] shown in Fig. 4.3. Fig. 4.3a was captured using a low-light-television (LLTV) sensor and Fig. 4.3b was obtained using a forward-looking-infrared (FLIR) sensor. Note that the LLTV sensor provides the surface information of the ground as well as building and vegetation details, whereas the FLIR image describes the road network details accurately.
The goal of fusion is to have all the salient features of both sensors in the fused image. Fig. 4.3c shows the corresponding fused image. The fusion process was exactly the same as that of the medical imaging dataset. Fig. 4.3d shows the output of the "Max" rule. Visually, the proposed approach appears to be better than the heuristic approach.

The third set of results is shown for the multifocus images [99] shown in Fig. 4.4. Fig. 4.4a corresponds to the case when the camera was focused on the front part of the scene, while Fig. 4.4b shows the image when the camera was focused on the background part of the scene. Fig. 4.4c shows the corresponding fused image. The fusion process was exactly the same as those of the previous datasets. Fig. 4.4d shows the fused image obtained using the "Max" rule. Again, it is apparent that the proposed algorithm provides a better fused image.

To validate the noise analysis presented in Section 4.3, zero mean white Gaussian noise was added to the input images of all three datasets shown in Fig. 4.2-4.4. Fig. 4.5a and 4.5b show the noisy input images obtained by adding zero mean white Gaussian noise of SNR = 15 dB. The SNR was computed using eq. (3.9). Fig. 4.5c and 4.5d are the fused images obtained using the proposed approach and the "Max" rule, respectively. The results for the "aircraft navigation" and the "multifocus" images are shown in Fig. 4.6 and 4.7, respectively. The SNRs for these images were 11 dB and 16 dB, respectively. Visual inspection clearly shows that in all cases both types of fused images are noisy; however, except for the medical images, the proposed approach provides more details about the scene even in the presence of noise.

The similarity index based quality metric was used to estimate the quality of the fused image quantitatively [60]. In the similarity based index, a similarity metric is computed between the measurement images and the fused image. This index lies between -1 and +1. Index values close to ±1 indicate the robustness of the fusion algorithm. Table 4.1 summarizes this quality index for the cases depicted in Figs. 4.2-4.4 and Figs. 4.5-4.7. Note that the maximum value of the similarity index based metric is 1. Clearly, the high values of the similarity index indicate the feasibility of the proposed algorithm. However, the similarity index and visual inspection of the fusion results lead to different conclusions in some cases. For example, in the case of the "aircraft navigation" and "multifocus" images, the similarity index values for the "Max" rule are almost the same as, or slightly higher than, the corresponding values for the proposed approach. But visual inspection suggests that for these images the proposed approach provides better visual quality than the "Max" rule, even in the presence of noise.
A plot of the similarity index of the proposed fused image as a function of SNR is presented in Fig. 4.8. The similarity index for the "Max" rule has not been reported in this plot, as the quality metric provides ambiguous information in this case. It is clear from the plot that the quality index decreases as the SNR decreases. Therefore, the performance of the proposed fusion algorithm is prone to sensor noise. Hence, in the presence of noise, noise removal should be used as a pre-processing stage in the proposed fusion framework.

Table 4.1: Performance summary of the weighted average pixel level fusion algorithm

Image Dataset       | SNR (dB): Input 1 | SNR (dB): Input 2 | Similarity Index: Proposed Approach | Similarity Index: Max rule
Medical Image       | ∞ (No noise)      | ∞ (No noise)      | 0.93 | 0.94
Medical Image       | 15                | 15                | 0.39 | 0.39
Aircraft Navigation | ∞ (No noise)      | ∞ (No noise)      | 0.79 | 0.84
Aircraft Navigation | 11                | 11                | 0.28 | 0.27
Multifocus          | ∞ (No noise)      | ∞ (No noise)      | 0.84 | 0.86
Multifocus          | 16                | 16                | 0.35 | 0.34

The proposed algorithm was also found to be computationally efficient. The fusion of the images shown in Fig. 4.2, which are of size 256 x 256, took 0.25 sec. Similarly, for the images shown in Fig. 4.3 (size 512 x 512) and Fig. 4.4 (size 512 x 512), the fusion took 0.55 and 0.59 sec., respectively. All the simulations were done using Matlab 7.1 running on a 1.86 GHz Intel machine with 1.0 GB RAM.

4.5 Discussion

An optimal weighted average approach for pixel level fusion is proposed in this chapter. The average power of the fused pixel is modeled as a Rayleigh quotient, and the weights are selected such that the Rayleigh quotient is maximized. The initial results indicate the feasibility of the proposed approach. However, this algorithm is prone to sensor noise. Therefore, in the presence of noise a de-noising filter should be applied to the inputs prior to fusion.

Figure 4.2: Medical images (a) MRI image (b) CT image (c) Fused image: proposed approach (d) Fused image: "Max" rule

Figure 4.3: Aircraft navigation images (a) LLTV image (b) FLIR image (c) Fused image: proposed approach (d) Fused image: "Max" rule

Figure 4.4: Multifocus images (a) Focus on the front part (b) Focus on the back part (c) Fused image: proposed approach (d) Fused image: "Max" rule

Figure 4.5: Medical images (SNR = 15 dB) (a) MRI image (b) CT image (c) Fused image: proposed approach (d) Fused image: "Max" rule

Figure 4.6: Aircraft navigation images (SNR = 11 dB) (a) LLTV image (b) FLIR image (c) Fused image: proposed approach (d) Fused image: "Max" rule

Figure 4.7: Multifocus images (SNR = 16 dB) (a) Focus on the front part (b) Focus on the back part (c) Fused image: proposed approach (d) Fused image: "Max" rule

Figure 4.8: Plot of SNR vs. similarity quality index

CHAPTER 5

A Total Variation Based Algorithm for Pixel Level Image Fusion

The pixel level fusion discussed in the previous chapter is susceptible to sensor noise. In this chapter a total variation (TV) based approach is discussed for pixel level fusion, to fuse images acquired using multiple sensors in the presence of sensor noise. In this approach, fusion is posed as an inverse problem and a locally affine model is used as the forward model. A total variation norm based approach, in conjunction with principal component analysis, is used iteratively to estimate the fused image.

5.1 The Image Acquisition Model

Let $f_0(x,y)$ be the true image, which is inspected by $n$ different sensors, and let $f_1(x,y), f_2(x,y), \dots, f_n(x,y)$ be the corresponding $n$ measurements for $(x,y) \in \Omega$. The local affine transform that relates the input pixel and the corresponding pixel in the measured images is given by [21]

$$f_i(x,y) = \beta_i(x,y) f_0(x,y) + \eta_i(x,y); \quad 1 \le i \le n. \qquad (5.1)$$
Here, $\beta_i(x,y)$ and $\eta_i(x,y)$ are the gain and the sensor noise, respectively, of the $i$-th sensor at location $(x,y)$. The goal of fusion is to estimate $f_0(x,y)$ from $f_i(x,y)$, $1 \le i \le n$. In many applications, such as radar imaging and visual and IR imaging, the complementary as well as redundant information is available at the local level in the measured images [21,30]. The main advantage of the local affine transform model is that it can relate this local information content in a mathematically convenient manner. For example, as an extreme case, two sensors $i$ and $j$ ($i \ne j$; $1 \le i, j \le n$) have complementary information at location $(x,y)$ if $\beta_i(x,y) \ne \beta_j(x,y)$ and $\beta_i(x,y), \beta_j(x,y) \in \{0,1\}$. Similarly, these two sensors have redundant information if $\beta_i(x,y) = \beta_j(x,y)$.

5.2 Image Fusion

5.2.1 Total Variation Norm for Image Fusion

In order to estimate $f_0(x,y)$ from eq. (5.1), we assume that $f_0(x,y), f_i(x,y) \ge 0$ ($1 \le i \le n$). This assumption is valid for many imaging devices such as digital cameras, IR cameras, etc., and does not limit the proposed algorithm in any way, since data not satisfying this requirement (i.e., with negative pixel values) can always be transformed using a simple linear transformation to make the pixel values positive. Furthermore, we also assume that the sensor noises $\eta_1(x,y), \eta_2(x,y), \dots, \eta_n(x,y)$ are zero mean random variables and are independent of each other. The standard deviation of $\eta_i(x,y)$ is denoted $\sigma_i$; $\sigma_i$ is assumed to be known a priori and independent of the spatial location $(x,y)$.

From eq. (5.1),

$$\begin{bmatrix} f_1 \\ \vdots \\ f_n \end{bmatrix} = \begin{bmatrix} \beta_1 \\ \vdots \\ \beta_n \end{bmatrix} f_0 + \begin{bmatrix} \eta_1 \\ \vdots \\ \eta_n \end{bmatrix} \qquad (5.2)$$

$$\Rightarrow \; \mathbf{f} = \boldsymbol{\beta} f_0 + \boldsymbol{\eta}, \qquad (5.3)$$

where $\mathbf{f} = [f_1, \dots, f_n]^T$, $\boldsymbol{\beta} = [\beta_1, \dots, \beta_n]^T$, $\boldsymbol{\eta} = [\eta_1, \dots, \eta_n]^T$, and the reference to the pixel location $(x,y)$ has been dropped for notational simplicity. Eq. (5.3) can be re-arranged as

$$\beta^N \mathbf{f} = f_0 + \beta^N \boldsymbol{\eta}, \qquad (5.4)$$

where $\beta^N = (\boldsymbol{\beta}^T \boldsymbol{\beta})^{-1} \boldsymbol{\beta}^T$, so that $\beta^N \boldsymbol{\beta} = 1$. The goal of fusion is to estimate $f_0$ for all $(x,y) \in \Omega$ from eq. (5.4). Note that $\beta^N$ is unknown and must be estimated from the measurement images $f_i(x,y)$ ($1 \le i \le n$). The method for estimating $\beta^N$ is discussed in Section 5.2.2.

If $\beta^N$ is known, then a common approach to estimating $f_0$ from the measurements $\mathbf{f}$ minimizes the cost function $\| \beta^N \mathbf{f} - f_0 \|$. The solution to this least squares estimate is $\hat{f}_0 = \beta^N \mathbf{f}$. However, this approach is not robust to noise. A total variation norm [32] based approach is therefore proposed to estimate $f_0$ from eq. (5.4). The total variation norm has been used in several image processing applications [32-35]. It is based on a nonlinear partial differential equation (PDE) and is more robust to noise than $L_2$ norm based estimators such as least squares [36,37]. A comparison of least squares and TV norm based fusion is shown in Fig. 5.1. Fig. 5.1a and 5.1b show the input noisy images that are to be fused. Fig. 5.1c and 5.1d are the fused images obtained using the least squares and the TV norm based approach, respectively. A detailed comparison of the fusion results is presented in Section 5.4. However, a visual comparison of Fig. 5.1c and 5.1d clearly shows that the fused image obtained using the TV norm based approach is less noisy and provides more pronounced edges than the fused image obtained using the least squares approach.

Therefore, we propose to estimate $f_0(x,y)$ by minimizing its TV norm under suitable constraints, as given below:

$$\text{minimize} \;\; \int_\Omega |\nabla f_0(x,y)| \, dx \, dy, \qquad |\nabla f_0(x,y)| = \left( f_{0x}^2 + f_{0y}^2 \right)^{1/2}, \qquad (5.5)$$

subject to the following constraints:

$$\int_\Omega f_0 \, dx \, dy = \int_\Omega \beta^N \mathbf{f} \, dx \, dy, \qquad (5.6)$$

$$\int_\Omega \left( \beta^N \mathbf{f} - f_0 \right)^2 dx \, dy = \int_\Omega \sigma^2 \, dx \, dy, \qquad (5.7)$$
where $f_{0x}$ and $f_{0y}$ are the derivatives of $f_0$ in the $x$ and $y$ directions, respectively, and $\sigma^2$ is the variance of the transformed noise $\beta^N \boldsymbol{\eta}$. Eq. (5.5) represents the TV norm of $f_0$.

Figure 5.1: Comparison of least squares and TV norm based fusion: (a) LLTV image (b) FLIR image (c) Fused image: least squares (d) Fused image: proposed approach (TV norm)

The constraints given by eq. (5.6) and (5.7) involve the mean and the variance of the noise $\beta^N \boldsymbol{\eta}$. Eq. (5.6) ensures that the noise $\beta^N \boldsymbol{\eta}$ has zero mean, whereas eq. (5.7) is based on the a priori information about the variance of the noise. Note that the sensor gain $\beta^N$ depends on the spatial location, which in turn implies that $\sigma^2$ is also a function of $(x,y)$. Assuming the sensor gain vector $\boldsymbol{\beta}$ to be independent of $f_0$ and its derivatives, the approach proposed by Rudin et al. in [32] may be used to derive the following iterative solution of eq. (5.5):

$$f_0^{k+1} = f_0^k - \tau_k \left[ \nabla \cdot \left( \frac{\nabla f_0^k}{|\nabla f_0^k|} \right) + \lambda_k \left( f_0^k - \beta^N \mathbf{f} \right) \right], \qquad (5.8)$$

where $k$ is the iteration number, $\tau_k$ is the time step, "$\nabla \cdot$" denotes the divergence operator, and $\lambda_k$ is a Lagrange multiplier that is re-computed at each iteration from $f_0^k$, $\beta^N \mathbf{f}$ and the noise variance $\sigma^2$, so that the constraint of eq. (5.7) is enforced, following the procedure of Rudin et al. [32]. The boundary condition $\partial f_0 / \partial \xi = 0$ must be satisfied at each iteration of eq. (5.8), where $\partial \Omega$ represents the boundary of $\Omega$ and $\xi$ is the outward normal along $\partial \Omega$. A simple discretized sketch of this iteration is given below.
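The following MATLAB-style sketch makes the update of eq. (5.8) concrete. It is only illustrative: for simplicity it treats the fidelity weight lambda as a fixed, user-chosen constant rather than re-computing it from the noise constraint (5.7) at every iteration as described above, it uses simple forward/backward differences with replicated borders (which enforces the zero normal-derivative boundary condition), and the function name tv_fuse is an assumption of this sketch.

function f0 = tv_fuse(g, lambda, tau, n_iter)
% g      : data term beta^N * f (the least squares fused image of eq. (5.4))
% lambda : fixed fidelity weight (stand-in for the adaptive lambda_k)
% tau    : time step
% n_iter : number of gradient-descent iterations of eq. (5.8)
f0   = g;                                  % initialize with the LS estimate
eps2 = 1e-6;                               % avoids division by zero in |grad f0|
for k = 1:n_iter
    % forward differences (replicated border => zero normal derivative)
    fx = [diff(f0, 1, 2), zeros(size(f0, 1), 1)];
    fy = [diff(f0, 1, 1); zeros(1, size(f0, 2))];
    mag = sqrt(fx.^2 + fy.^2 + eps2);
    px = fx ./ mag;                        % normalized gradient field
    py = fy ./ mag;
    % divergence via backward differences of (px, py)
    div_p = [px(:, 1), diff(px, 1, 2)] + [py(1, :); diff(py, 1, 1)];
    % gradient-descent step on the TV + fidelity energy (cf. eq. (5.8))
    f0 = f0 + tau * (div_p - lambda * (f0 - g));
    f0 = max(f0, 0);                       % keep the fused image nonnegative
end
end

Here g would be formed pixel by pixel from the block-wise gains of Section 5.2.2, and a call such as f0 = tv_fuse(g, 0.05, 0.2, 200) produces a TV-regularized fused image; the numerical values are arbitrary illustration values, not settings taken from the dissertation.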
However, the solution of eq. (5.8) requires knowledge of $\boldsymbol{\beta}$ and $\sigma^2$. Since $\sigma^2$ is the variance of $\beta^N \boldsymbol{\eta}$ and the variance of the noise vector $\boldsymbol{\eta}$ is assumed to be known, $\sigma^2$ can easily be estimated if $\boldsymbol{\beta}$ is known. In general, $\boldsymbol{\beta}$ is unknown and must be estimated prior to solving eq. (5.8). This is addressed next.

5.2.2 Estimation of $\boldsymbol{\beta}$

To estimate $\beta_i$ ($1 \le i \le n$), we assume that the gain of each sensor varies slowly from one spatial location to another. Hence, $\beta_i$ can be assumed to be constant over a small region of the sensor image. Thus, the input images are split into small regions of size $p \times q$ and the sensor gains are computed for each region. Furthermore, we also assume that $\beta_i \in [0,1]$ for all $1 \le i \le n$.

From eq. (5.3), it is clear that in the absence of sensor noise the variances of the observed images $f_1, f_2, \dots, f_n$ are related to each other through the sensor gains $\beta_1, \beta_2, \dots, \beta_n$. Therefore, the variances of the observed images may be used to estimate $\boldsymbol{\beta}$. Also, it is well known that, for a given dataset, the principal eigenvector of the correlation matrix points in the direction of maximum variance [100]. Motivated by this fact, we propose to use a principal eigenvector based approach to estimate the sensor gains. The pixels of the observed images are used to construct the correlation matrix, and its principal eigenvector is used to estimate the sensor gains. This approach is explained below.

Let us assume that the input images are divided into blocks of size $p \times q$. For one such block, denoted $R$, assume that $(f_1^1, f_1^2, \dots, f_1^{p \times q})$, $(f_2^1, f_2^2, \dots, f_2^{p \times q})$, $\dots$, $(f_n^1, f_n^2, \dots, f_n^{p \times q})$ are the lexicographical representations of the pixel values of the $n$ sensor images. These pixels can be viewed as $n$-variate random variables $(f_1^1, f_2^1, \dots, f_n^1)^T$, $(f_1^2, f_2^2, \dots, f_n^2)^T$, $\dots$, $(f_1^{p \times q}, f_2^{p \times q}, \dots, f_n^{p \times q})^T$. Let $\mu = [\mu_1, \dots, \mu_n]^T$ be the principal eigenvector of the correlation matrix $\Sigma_\nu$, where

$$\Sigma_\nu = \frac{1}{(p \times q) - 1} \sum_{k=1}^{p \times q} \nu_k \nu_k^T, \qquad (5.9)$$

and $\nu_k = (f_1^k, f_2^k, \dots, f_n^k)^T$. Assuming the noise in eq. (5.3) to be zero, and using the assumption that the sensor gains are constant over block $R$, it can be shown that

$$\Sigma_\nu = \frac{1}{(p \times q) - 1} \, (\beta_1, \dots, \beta_n)^T (\beta_1, \dots, \beta_n) \sum_{k=1}^{p \times q} \left( f_0^k \right)^2$$

$$\Rightarrow \; \Sigma_\nu = C \, (\beta_1, \dots, \beta_n)^T (\beta_1, \dots, \beta_n), \qquad (5.10)$$

where $C = \dfrac{\sum_{k=1}^{p \times q} \left( f_0^k \right)^2}{(p \times q) - 1}$.

Let $\rho$ and $\mu = (\mu_1, \dots, \mu_n)^T$ be the principal eigenvalue and the principal eigenvector, respectively, of $\Sigma_\nu$. From eq. (5.10) it is clear that the rank of the matrix $\Sigma_\nu$ is one. Therefore, $\rho$ and $\mu$ can be related to the sensor gains as shown below:

$$\rho = C \left( \beta_1^2 + \cdots + \beta_n^2 \right), \qquad (5.11)$$

$$\mu = (\mu_1, \dots, \mu_n)^T \propto (\beta_1, \dots, \beta_n)^T, \qquad (5.12)$$

where "$\propto$" indicates proportionality. Therefore, we propose the following rule to estimate $\boldsymbol{\beta} = [\beta_1, \dots, \beta_n]^T$ for region $R$:

1. Compute $\Sigma_\nu$ for block $R$, and estimate $\mu$ such that $\mu^T \mu = 1$.
2. Set $\beta_1 = \beta_2 = \cdots = \beta_n = 1$ if $\mu_1 = \mu_2 = \cdots = \mu_n$; otherwise set $\boldsymbol{\beta} = \mu$.

This process may be repeated for each block. Note that the sensor gains computed this way are assumed to be constant over the region, i.e., $\beta_i(x,y) = \beta_i(r,s)$ for $(x,y) \ne (r,s)$, $(x,y), (r,s) \in R$ and $1 \le i \le n$.

Next, we show that the proposed rule for estimating the sensor gains can discriminate between common and complementary features. For the sake of simplicity, assume $n = 2$, i.e., there are only two sensors. Then,

$$\Sigma_\nu = C \begin{bmatrix} \beta_1^2 & \beta_1 \beta_2 \\ \beta_1 \beta_2 & \beta_2^2 \end{bmatrix}, \qquad (5.13)$$

$$\rho = C \left( \beta_1^2 + \beta_2^2 \right), \qquad (5.14)$$

and $\mu$ can be estimated by solving the following equation:

$$\Sigma_\nu \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} = \rho \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix}; \qquad \mu_1^2 + \mu_2^2 = 1. \qquad (5.15)$$

If the two input images have completely redundant information, i.e., $\beta_1 = \beta_2$, then $\rho = 2C\beta_1^2 = 2C\beta_2^2$ and $\mu_1 = \mu_2 = \tfrac{1}{\sqrt{2}}$. In this case, we select $\beta_1 = \beta_2 = 1$, i.e., $\boldsymbol{\beta} = [1,1]^T$. Similarly, if $\beta_1 > \beta_2$ (corresponding to some redundant information between the two measurements), then, from eq. (5.15):

$$\frac{\mu_1^2 - \mu_2^2}{\mu_1 \mu_2} = \frac{\beta_1^2 - \beta_2^2}{\beta_1 \beta_2}. \qquad (5.16)$$

It is clear that in this case $\mu_1 > \mu_2$. Therefore, according to the proposed rule, $\beta_1 = \mu_1$ and $\beta_2 = \mu_2$. In the extreme case when $\beta_1 \ne 0$ and $\beta_2 = 0$, it can be shown using a similar procedure that $\boldsymbol{\beta} = \mu = [1,0]^T$. Similarly, it can be shown that $\boldsymbol{\beta} = \mu = [0,1]^T$ if $\beta_1 = 0$ and $\beta_2 \ne 0$. Therefore, it is clear that in the case of complementary information the proposed approach selects the region that has the relevant information. Because the pixel values of the measurement images are assumed to be positive, the sensor gains estimated using the proposed approach will always be positive. Note that in the above analysis we assumed the sensor noise to be zero. If noise is present, then the principal eigenvector will provide a noisy estimate of the sensor gains. However, these gains are further used in the TV based framework of eq. (5.8), which is robust to noise and compensates for the noisy estimate of the sensor gains.

5.3 Overall Approach

The flowchart of the proposed pixel level fusion method is shown in Fig. 5.2. We first split the input images into several blocks and arrange the pixels inside each block lexicographically. The lexicographically arranged pixels of any given block from all the input images are used to compute the correlation matrix $\Sigma_\nu$, and the principal eigenvector of this correlation matrix is used to determine the gain $\boldsymbol{\beta}$ of that block. This process is repeated for all the blocks. After the sensor gains are estimated, eq. (5.8) is solved iteratively to estimate the fused pixels $f_0(x,y)$ for all locations $(x,y) \in \Omega$.

- Input: $n$ co-registered sensor images
- Output: Fused image
- Sensor gain estimation:
  - Split each input image into $B$ blocks, each of size $p \times q$
  - For each block $b_i$, $i = 1, \dots, B$: estimate the sensor gain for the block using the algorithm presented in Section 5.2.2
- Fusion process:
  - Solve eq. (5.8) iteratively for each pixel.
Figure 5.2: Flowchart of the proposed TV norm based pixel level image fusion

5.4 Results

The results of the proposed fusion algorithm are presented in this section. The proposed fusion algorithm was applied to the "medical imaging" and "aircraft navigation" datasets described in Section 4.4. The sensor noise was simulated by adding zero mean white Gaussian noise to the input images. The signal-to-noise ratio (SNR) was computed using eq. (3.9).

Fig. 5.3 shows the medical images that were used to validate the proposed algorithm. Fig. 5.3a and 5.3b are images of a human brain obtained using computed tomography (CT) and magnetic resonance imaging (MRI), respectively. To simulate the sensor noise, zero mean white Gaussian noise was added to these input images. The noisy input images were fused using the least squares as well as the proposed approach. For both fusion approaches, the CT and MRI images were divided into non-overlapping regions of size 8 x 8, and for each region the sensor gain was computed using the principal eigenvector approach explained in Section 5.2.2. For least squares based fusion, the estimated sensor gain was used in eq. (5.3) to estimate the fused image. To fuse these images using the proposed approach, the estimated sensor gains were used in the total variation algorithm described in Section 5.2.1. Several different levels of SNR were used to validate the robustness of the algorithm. Three sets of fusion results, for 23 dB, 12 dB and 0 dB signal-to-noise ratio (SNR), are presented in Fig. 5.4-5.6. It is clear that for a high SNR (Fig. 5.4) the performance of the least squares and the proposed approach is similar. However, as the SNR decreases, the proposed approach performs better than the least squares approach, as shown in Fig. 5.5 and 5.6. For very high levels of noise, i.e., low SNR, the proposed algorithm oversmoothes the edges of the fused image, as shown in Fig. 5.6.

The proposed algorithm was also applied to the aircraft navigation images [98] shown in Fig. 5.7. Fig. 5.7a was captured using a low-light-television (LLTV) sensor and Fig. 5.7b was obtained using a forward-looking-infrared (FLIR) sensor. The sensor noise simulation and the fusion process were exactly the same as for the previous dataset, and Fig. 5.8-5.10 show the fusion results for 23 dB, 8 dB and 0 dB signal-to-noise ratio (SNR). It is clear from the fused images that the proposed approach is more robust to noise than the least squares approach. Again, for low SNR the proposed fusion algorithm tends to oversmooth the edges of the fused images.

To estimate the quality of the fused image quantitatively, the similarity measure based quality index proposed in [60] was used. The goal of the quality assessment is to compare the information content in the fused image and the corresponding input images. Therefore, the similarity index was computed by comparing the fused images with the noiseless versions of the corresponding input images. The similarity indices for the medical and navigation datasets are summarized in Table 5.1, and plots of the similarity quality index for LSE and the proposed approach as a function of SNR are presented in Fig. 5.11. It is evident from both Table 5.1 and Fig. 5.11 that for very high SNR the performance of the LSE and the proposed TV norm based approaches is comparable. However, as the SNR is reduced, the proposed approach performs better than the LSE.
It is apparent that for both approaches the similarity index approaches zero as the SNR drops.

Table 5.1: Performance summary of the total variation based pixel level fusion algorithm

Image Dataset       | SNR (dB): Sensor 1 | SNR (dB): Sensor 2 | Similarity Index: LSE | Similarity Index: Proposed Approach
Medical Image       | 23 | 23 | 0.58 | 0.62
Medical Image       | 12 | 12 | 0.34 | 0.44
Medical Image       | 0  | 0  | 0.11 | 0.22
Aircraft Navigation | 23 | 23 | 0.66 | 0.67
Aircraft Navigation | 8  | 8  | 0.20 | 0.32
Aircraft Navigation | 0  | 0  | 0.07 | 0.16

5.5 Discussion

A total variation framework for pixel level fusion is proposed in this chapter. The proposed fusion approach was applied to several different types of datasets. The results on these data indicate the feasibility of the proposed approach.

Figure 5.3: Medical images: (a) CT image (b) MRI image

Figure 5.4: Medical images, SNR = 23 dB: (a) CT image (b) MRI image (c) Fused image: least squares (d) Fused image: proposed approach

Figure 5.5: Medical images, SNR = 12 dB: (a) CT image (b) MRI image (c) Fused image: least squares (d) Fused image: proposed approach

Figure 5.6: Medical images, SNR = 0 dB: (a) CT image (b) MRI image (c) Fused image: least squares (d) Fused image: proposed approach

Figure 5.7: Aircraft navigation images: (a) LLTV image (b) FLIR image

Figure 5.8: Aircraft navigation images, SNR = 23 dB: (a) LLTV image (b) FLIR image (c) Fused image: least squares (d) Fused image: proposed approach

Figure 5.9: Aircraft navigation images, SNR = 8 dB: (a) LLTV image (b) FLIR image (c) Fused image: least squares (d) Fused image: proposed approach

Figure 5.10: Aircraft navigation images, SNR = 0 dB: (a) LLTV image (b) FLIR image (c) Fused image: least squares (d) Fused image: proposed approach

Figure 5.11: Plot of SNR vs. similarity quality index: (a) CT, MRI dataset (b) LLTV, FLIR dataset

CHAPTER 6

Conclusions And Future Research

6.1 Conclusions

In this dissertation, model based image fusion algorithms have been explored. The primary advantage of the model based fusion approach is that it allows identification of complementary and redundant information present in the input images in a systematic manner. First, a multichannel blind deconvolution based approach for blur removal was considered. The approach is based on the Richardson-Lucy algorithm and is iterative in nature. Furthermore, this approach was also extended to obtain high-resolution images from several low-resolution images. However, this approach was seen to be susceptible to high levels of noise.

In several situations, the input images for fusion contain complementary information at the local level. For such cases, pixel level fusion, which derives a fusion rule based on small regions in the image, is more appropriate. A pixel level fusion method that maximizes the power of the fused pixel was presented in Chapter 4. The proposed approach does not depend on any heuristic threshold, but is susceptible to noise.

A total variation based approach for pixel level fusion is discussed in Chapter 5.
This approach was compared to the "Least Squares" approach and was seen to be more robust to noise than the "Least Squares" technique.

6.2 Future Research

Future work includes the exploration of the following issues:

- Experimental results indicate that the multichannel deconvolution algorithm presented in Chapter 3 converges. Future work will include a mathematical analysis of the convergence properties of this algorithm.
- The literature review suggests that the upsampling size for super-resolution applications is assumed to be known a priori. A mathematical framework to estimate the optimal upsampling size based on the image content of the low-resolution images will be explored.
- The sensor gain parameters for the total variation fusion algorithm presented in Chapter 5 are estimated using principal eigenvector analysis. Alternative approaches to estimating sensor gains that are more robust to sensor noise will be investigated as part of the future work.
- More robust algorithms to estimate the quality of the fused image will also be investigated. In particular, quality metrics that attempt to incorporate information about the image scene will be investigated, to provide a better estimate of image fusion quality.
- Finally, the fusion algorithms discussed in the dissertation will be validated using additional datasets. For instance, deconvolution models have been used in several different areas of image processing such as medical imaging, nondestructive evaluation, remote sensing, etc. The application of the multi-channel algorithm to these areas will be examined as part of the future work. Some initial studies on using the proposed multichannel algorithm to estimate flaw sizes from multi-frequency eddy current coils have already been conducted [17]. The initial results indicate that the proposed approach performs better than the single channel analysis. However, this study needs more investigation.

BIBLIOGRAPHY

[1] M. Kokar and K. Kim, "Review of multisensor data fusion architectures," in Proceedings of the IEEE International Symposium on Intelligent Control, August 1993, pp. 261-266.

[2] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proc. IEEE, vol. 85, no. 1, pp. 6-23, 1997.

[3] R. Mahler, "A unified foundation for data fusion," in Proc. 1994 Data Fusion Syst. Conf., Johns Hopkins University, June 1987.

[4] M. A. Abidi and R. C. Gonzalez, Data Fusion in Robotics and Machine Intelligence. San Diego, CA, USA: Academic Press Inc., 1992.

[5] J. Llinas and E. Waltz, Multisensor Data Fusion. Boston, MA, USA: Artech House, 1990.

[6] M. Elad and A. Feuer, "Restoration of a single super resolution image from several blurred, noisy, and undersampled measured images," IEEE Trans. Image Processing, vol. 6, pp. 1646-1658, 1997.

[7] E. Waltz, "The principles and practice of image and spatial data fusion," in Proc. 8th Natl. Data Fusion Conf., Dallas, TX, March 1995.

[8] R. Blum, Z. Xue, and Z. Zhang, "An overview of image fusion," in R. Blum and Z. Liu (Eds.), Multi-Sensor Image Fusion and Its Applications. Signal and Image Processing Series of Marcel Dekker/CRC Press, 2005.

[9] Z. Zhang and R. Blum, "A categorization and study of multiscale-decomposition-based image fusion schemes," Proceedings of the IEEE, vol. 87, no. 8, pp. 1315-1326, August 1999.

[10] R. S. Blum, "Robust image fusion using a statistical signal processing approach," Information Fusion, vol. 6, no. 2, pp. 119-128, June 2005.

[11] Y.
Chibani, “Multisource image fusion by using the redundant wavelet decom- position,” IEEE Geoscience and Remote Sensing Symposium, vol. 2, pp. 1383* 1385, July 2003. 91 [12] [13] [14] [15] [15] [17] [18] [19] [20] [21] [22] [23] M. Hurn, K. Mardia, T. Hainsworth, J. Kirkbride, and E. Berry, “Bayesian fused classification of medical images,” IEEE Trans. Med. Imag., vol. 15, no. 6, pp. 850—858, December 1996. X. Gros, Z. Liu, K. Tsukada, and K. Hanasaki, “Experimenting with pixel-level ndt data fusion techniques,” IEEE Trans. Instrum. Meas., vol. 49, no. 5, pp. 1083—1090, October 2000. F. Nebeker, “Golden accomplishments in biomedical engineering,” IEEE Engi- neering in Medicine and Biology Magazine, vol. 21, no. 3, pp. 17—47, May/June 2002. D. Barnes, G. Egan, G. OKeefe, and D. Abbott, “Characterization of dy- namic 3-d pet imaging for functional brain mapping,” IEEE Trans. Med. Imag., vol. 16, no. 3, pp. 261—269, June 1997. S. Wong, R. Knowlton, R. Hawkins, and K. Laxer, “Multimodal image fusion for noninvasive epilepsy surgery planning,” International Journal of Remote Sensing, vol. 16, no. 1, pp. 30—38, January 1996. M. Kumar and P. Ramuhalli, “Dynamic programming based multichannel im- age restoration,” in IEEE Conf. on Acoustics, Speech, and Signal Processing, Philadelphia, PA, March 2005, pp. 609—612. I. Edwards, X. Gros, D. Lowden, and P. Strachan, “Fusion of ndt data,” British Journal of Non Destructive Testing, vol. 35, no. 12, pp. 710-713, December 1993. M. Pavel, J. Larimer, and A. Ahumada, “Sensor fusion for synthetic vision,” in Society for Information Display, 1992, pp. 475—478. M. A. Burgess, T. Chang, D. E. Dunford, R. H. Hoh, W. F. Home, R. F. TuCker, and J. A. Zak, “Synthetic vision technology demonstration: Flight tests,” in Technical Report DOT/FAA/RD-93/40, III, Research and Development Ser- vice, Washington DC, USA, December 1993. R. K. Sharma, T. K. Leen, and M. Pavel, “Probabilistic image sensor fusion,” in Advances in neural information processing systems 11, 1999, pp. 824—830. P. J. Burt and R. J. Kolczynski, “Enhanced image capture through fusion,” in Fourth International Conference on Computer Vision, 1993, pp. 173—182. H. Li and Y. Zhou, “Automatic visual/IR. image registration,” in Optical En- gineering, vol. 35(2), 1996, pp. 391—400. 92 [24] L. G. Brown, “A survey of image registration techniques,” in Computing Sur- veys, vol. 24(4), 1992, pp. 325—376. [25] D. C. Ghiglia, “Space-invariant deblurring given n independently blurred images of a common object,” J. Opt. Soc. Am. A, vol. 1, no. 4, pp. 398—402, April 1984. [26] M. Elad and A. Feuer, “Restoration of a single superresolution image from several blurred, noisy, and undersampled measured images,” IEEE Trans. Image Processing,'vol. 6, no. 12, pp. 1646—1658, December 1997. [27] W. H. Richardson, “Bayesian-based iterative method of image restoration,” J. Opt. Soc. Amer. A, vol. 62, no. 1, pp. 55—59, 1972. [28] L. B. Lucy, “An iterative technique for the rectification of observed distribu- tions,” The Astronomical J., vol. 79, no. 6, pp. 745—754, 1974. [29] W. Seales and S. Dutta, “Everywhere-in-focus image fusion using controllable cameras,” in Proceedings of SPIE 2905, vol. 11, 1996, pp. 227—234. [30] J. Yang and R. S. Blum, “A statistical signal processing approach to image fusion for concealed weapon detection,” in Proc. of International Conference on Image Processing, vol. 1, 2002, pp. I—513—I—516. [31] A. Mohammad-Djafari, “Bayesian approach for data and image fusion,” in Int. 
Workshop on Bayesian and Maximum Entropy Methods, Moscow, Idaho, USA, August 2002. [32] L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D, vol. 60, pp. 259—268, 1992. [33] P. L. Combettes and J.-C. Pesquet, “Image restoration subject to a total varia- tion constraint,” IEEE Trans. Image Processing, vol. 13, no. 9, pp. 1213—1222, 2004. [34] T. Chen, Y. Wotao, S. Z. Xiang, S. D. Comaniciu, and T. S. Huang, “Total variation models for variable lighting face recognition,” IEEE Trans. Pattern Anal. Machine Intell, vol. 28, no. 9, pp. 1519-1524, 2006. [35] G. A. Hewer, C. Kenney, and B. S. Manjunath, “Variational image segmentation using boundary functions,” IEEE Trans. Image Processing, vol. 7, no. 9, pp. 1269-1282, 1998. [36] L. Alvarez, F. Guichard, P.-L. Lions, and J.-M. Morel, “Axioms and funda- mental equations of image processing,” Archive for Rational Mechanics and Analysis, vol. 123, pp. 199—257, 1993. 93 [37] L. Alvarez and J .-M. Morel, “Formalization and computational aspects of image analysis,” Archive for Rational Mechanics and Analysis, pp. 1—59, 1994. [38] S. Farsiu, D. Robinson, M. Elad, and P. Milanfar, “Advances and challenges in super-resolution,” International Journal of Imaging Systems and Technology, vol. 14, no. 2, pp. 47—57, August 2004. [39] —, “Fast and robust multi-frame super-resolution,” IEEE Trans. Image Pro- cessing, vol. 13, no. 10, pp. 1327—1344, October 2004. [40] S. C. Park, M. K. Park, and M. G. Kang, “Super-resolution image reconstruc- tion: a technical overview,” IEEE Signal Processing Mag., vol. 20, no. 3, pp. 21-36, May 2003. [41] M. K. Ng and N. K. Bose, “Mathematical analysis of super-resolution method- ology,” IEEE Signal Processing Mag, vol. 20, no. 3, pp. 62—74, May 2003. [42] D. Hal, Mathematical Techniques in Multisensor Data Fusion, 2nd ed. Boston, USA: Artech House, March 2004. [43] J. Llinas and D. Hall, “A challenge for the data fusion community ii: Research imperatives for improved processing,” in Proc. 7th Natl. Symp. On Sensor Fu- sion, Albuquerque, New Mexico, March 1994. [44] C. Pohl and J. L. van Genderen, “Multisensor image fusion in remote sensing: concepts, methods and applications,” International Journal of Remote Sensing, vol. 19, no. 5, pp. 823—854, 1998. [45] M. Daniel and A. Willsky, “A multiresolution methodology for signal-level fu- sion and data assimilation with applications to remote sensing,” Proc. IEEE, vol. 85, no. 1, pp. 164—180, January 1997. [46] E. Lallier and M. Farooq, “A real time pixel-level based image fusion via adap- tive weight averaging,” in Proceedings of the Third International Conference on Information Fusion, vol. 2, no. 10, Paris, France, July 2000, pp. WEC3/3 —— WEC313. [47] G. Pajares and J. D. L. Cruz, “A wavelet-based image fusion tutorial,” Pattern Recognition, vol. 37, no. 9, pp. 1855—1872, September 2004. [48] J. Richards, “Thematic mapping from multitemporal image data using the prin- cipal components transformation,” Remote Sensing of Environment, vol. 16, pp. 36—46, 1984. 94 [49] I. Bloch, “Information combination operators for data fusion: a comparative review with classification,” IEEE Trans. Syst, Man, Cybern. C, vol. 26, no. 1, pp. 52—67, January 1996. [50] Y. Gao and M. Maggs, “Feature-level fusion in personal identification,” in Pro- ceedings of the IEEE Computer Society on Computer Vision and Pattern Recog- nition, vol. 1, San Diego, CA, USA, June 2005, pp. 468-473. [51] V. Sharma and J. W. 
Davis, “Feature-level fusion for object segmentation using mutual information,” in Computer Vision and Pattern Recognition Workshop, June 2006, pp. 139-146. [52] A. Kumar and D. Zhang, “Personal recognition using hand shape and texture,” IEEE Trans. Image Processing, vol. 15, no. 8, pp. 2454—2461, August 2006. [53] K. Fukanaga, Introduction to Statistical Pattern Recognition, 2nd ed. New York, USA: Academic Press, September 1990. [54] Y. Liao, L. W. Nolte, and L. M. Collins, “Decision fusion of ground—penetrating radar and metal detector algorithmsa robust approach,” IEEE Trans. Ceosci. Remote Sensing, vol. 45, no. 2, pp. 398—409, February 2007. [55] L. O. Jimenez, A. Morales-Morell, and A. Creus, “Classification of hyperdimen- sional data based on feature and decision fusion approaches using projection pursuit, majority voting, and neural networks,” IEEE Trans. Geosci. Remote Sensing, vol. 37, no. 3, pp. 1360—1366, May 1999. [56] M. Fauvel, J. Chanussot, and J. A. Benediktsson, “Decision fusion for the classi- fication of urban remote sensing images,” IEEE Trans. Ceosci. Remote Sensing, vol. 44, no. 10, pp. 2828—2838, October 2006. [57] S. Foucher, M. Germain, J .-M. Boucher, and G. B. Benie, “Multisource classi- fication using icm and dempster—shafer theory,” IEEE Trans. Instrum. M eas., vol. 51, no. 2, pp. 277—281, April 2002. [58] Z. Wang, A. C. Bovik, and L. Lu, “Why is image quality assessment so dif- ficult?” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Orlando, USA, May 2002. [59] Z. Wang and A. Bovik, “A universal image quality index,” IEEE Signal Pro- cessing Lett., vol. 9, no. 3, pp. 81—84, March 2002. [60] N. Cvejic, A. Loza, D. Bull, and N. Canagarajah, “A similarity metric for assess- ment of image fusion algorithms,” International Journal of Signal Processing, vol. 2, no. 3, pp. 178-182, 2005. 95 [61] T. Cover and J. Thomas, Elements of Information Theory, 2nd ed. Wiley- Interscience, July 2006. [62] L. Leung, B. King, and V. Vohora, “Comparison of image data fusion techniques using entropy and ini,” in Proc. ACRS, Singapore, November 2001, pp. 152—157. [63] Z.Li, Z.Jing, G.Liu, S.Sun, and H.Leung, “A region-based image fusion algo- rithm using multiresolution segmentation,” in IEEE International Conference on Intelligent Transportation Systems, Shanghai, China, October 2003, pp. 96— 101. [64] H. C. Andrews and B. R. Hunt, Digital Image Restoration. Englewood Cliffs, NJ, USA: Prentice Hall, June 1977. [65] A. Jain and S. Ranganath, “Application of two dimensional spectral estimation in image restoration,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 6, April 1981, pp. 1113—1116. [66] D. Kundur and D. Hatzinakos, “Blind image deconvolution,” IEEE Signal Pro- cessing Mag, vol. 13, no. 3, pp. 43—64, May 1996. [67] A. Kirsch, An Introduction to the Mathematical Theory of Inverse Problems, lst ed. Springer, September 1996. [68] A. Tarantola, Inverse Problem Theory and Methods for Model Parameter Esti- mation. SIAM, December 2004. [69] J. Liu, Q. Wang, and Y. Shen, “Comparisons of several pixel-level image fusion schemes for infrared and visible light images,” in Proceedings of the IEEE Instru- mentation and Measurement Technology Conference, vol. 3, Ottawa, Canada, May 2005, pp. 2024— 2027. [70] E. H. Adelson, C. H. Anderson, J. Bergen, P. Burt, and J. Ogden, “Pyramid methods in image processing,” RCA Eng, vol. 29, no. 6, pp. 33—41, 1984. [71] V. Petrovic and C. 
[72] R. Welch and M. Ehlers, "Merging multiresolution SPOT HRV and Landsat TM data," Photogramm. Eng. Remote Sensing, vol. 53, no. 3, pp. 301-303, 1987.
[73] R. Bellman, Introduction to Matrix Analysis, 2nd ed. Society for Industrial Mathematics, January 1987.
[74] T. Akgun, Y. Altunbasak, and R. M. Mersereau, "Super-resolution reconstruction of hyperspectral images," IEEE Trans. Image Processing, vol. 14, pp. 1860-1875, 2005.
[75] R. Molina, J. Mateos, and A. K. Katsaggelos, "Super resolution reconstruction of multispectral images," in Virtual Observatory: Plate Content Digitization, Archive Mining and Image Sequence Processing, Heron Press, Sofia, Bulgaria, 2006, pp. 211-220.
[76] H. Greenspan, G. Oz, N. Kiryati, and S. Peled, "MRI inter-slice reconstruction using super resolution," Magn. Reson. Imaging, vol. 20, pp. 437-446, 2002.
[77] R. R. Peeters, P. Kornprobst, M. Nikolova, S. Sunaert, T. Vieville, G. Malandain, R. Deriche, O. Faugeras, M. Ng, and P. V. Hecke, "The use of super-resolution techniques to reduce slice thickness in functional MRI," Int. J. Imaging Syst. and Technol., vol. 14, pp. 131-138, 2004.
[78] J. Cui, Y. Wang, J. Huang, T. Tan, and Z. Sun, "An iris image synthesis method based on PCA and super-resolution," in Proc. 17th Int. Conf. Pattern Recognit., vol. IV, Cambridge, UK, 2004, pp. 471-474.
[79] S. Chaudhuri and D. R. Taur, "High-resolution slow-motion sequencing: how to generate a slow-motion sequence from a bit stream," IEEE Signal Processing Mag., vol. 22, pp. 16-24, 2005.
[80] R. Y. Tsai and T. S. Huang, "Multiframe image restoration and registration," Adv. Comput. Vis. Image Process., vol. 1, pp. 317-339, 1984.
[81] P. Vandewalle, S. Süsstrunk, and M. Vetterli, "A frequency domain approach to registration of aliased images with application to super-resolution," EURASIP J. Appl. Signal Process., Article ID 71459, 14 pages, 2006.
[82] A. K. Katsaggelos, R. Molina, and J. Mateos, Super Resolution of Images and Video, 1st ed. Morgan and Claypool Publishers, May 2007.
[83] H. Stark and P. Oskoui, "High-resolution image recovery from image-plane arrays, using convex projections," Journal of the Optical Society of America A, vol. 6, no. 11, pp. 1715-1726, 1989.
[84] S. Peleg, D. Keren, and L. Schweitzer, "Improving image resolution using subpixel motion," Pattern Recognit. Lett., vol. 5, no. 3, pp. 223-226, 1987.
[85] D. Keren, S. Peleg, and R. Brada, "Image sequence enhancement using subpixel displacement," in IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 1988, pp. 742-746.
[86] R. Schultz and R. Stevenson, "Extraction of high-resolution frames from video sequences," IEEE Trans. Image Processing, vol. 5, no. 6, pp. 996-1011, 1996.
[87] R. Hardie, T. Tuinstra, J. Bognar, K. Barnard, and E. Armstrong, "High resolution image reconstruction from digital video with global and non-global scene motion," in Proceedings of the IEEE International Conference on Image Processing, vol. I, Santa Barbara, CA, 1997, pp. 153-156.
[88] P. Sementilli, M. Nadar, and B. Hunt, "Poisson MAP super-resolution estimator with smoothness constraint," in Neural and Stochastic Methods in Image and Signal Processing II, vol. 2032 of Proc. SPIE, 1993, pp. 2-13.
[89] G. R. Ayers and J. C. Dainty, "Iterative blind deconvolution method and its applications," Opt. Lett., vol. 13, pp. 547-549, 1988.
[90] D. A. Fish, A. M. Brinicombe, and E. R. Pike, "Blind deconvolution by means of the Richardson-Lucy algorithm," J. Opt. Soc. Am. A, vol. 12, no. 1, pp. 58-65, 1995.
[91] R. L. White, "Image restoration using the damped Richardson-Lucy method," in ASP Conference Series, vol. 61, 1994, pp. 292-295.
[92] A. K. Jain, "Advances in mathematical models for image processing," Proc. IEEE, vol. 69, no. 5, pp. 502-528, 1981.
[93] C. A. Berenstein and E. V. Patrick, "Exact deconvolution for multiple convolution operators - an overview plus performance characterizations for imaging sensors," Proc. IEEE, vol. 78, pp. 723-734, 1990.
[94] A. Chambolle, R. DeVore, N. Lee, and B. Lucier, "Nonlinear wavelet image processing: variational problems, compression, and noise removal through wavelet shrinkage," IEEE Trans. Image Processing, vol. 7, no. 3, pp. 319-333, 1998.
[95] G. Chang, B. Yu, and M. Vetterli, "Adaptive wavelet thresholding for image denoising and compression," IEEE Trans. Image Processing, vol. 9, pp. 1522-1531, 2000.
[96] F. Sroubek and J. Flusser, "Resolution enhancement via probabilistic deconvolution of multiple degraded images," Pattern Recognit. Lett., vol. 27, pp. 287-293, 2006.
[97] N. Nguyen, P. Milanfar, and G. H. Golub, "Efficient generalized cross-validation with applications to parametric image restoration and resolution enhancement," IEEE Trans. Image Processing, vol. 10, pp. 1299-1308, 2001.
[98] [Online]. Available: http://www.imagefusion.org/
[99] (2006) Signal Processing & Communication Research Laboratory, Lehigh University, Bethlehem, PA, USA. [Online]. Available: http://www.ece.lehigh.edu/SPCRL/IF/image_fusion.htm
[100] A. C. Rencher, Methods of Multivariate Analysis, 2nd ed. Wiley-Interscience, March 2002.