This is to certify that the dissertation entitled

CO-OPERATIVE CODED COMMUNICATION UNDER NETWORK CONSTRAINTS

presented by

KIRAN MUKESH MISRA

has been accepted towards fulfillment of the requirements for the Doctoral degree in Electrical Engineering

Major Professor's Signature

May 4, 2010
Date

MSU is an Affirmative Action/Equal Opportunity Employer

CO-OPERATIVE CODED COMMUNICATION UNDER NETWORK CONSTRAINTS

By

Kiran Mukesh Misra

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Electrical Engineering

2010

ABSTRACT

CO-OPERATIVE CODED COMMUNICATION UNDER NETWORK CONSTRAINTS

By

Kiran Mukesh Misra

Traffic flow-control is an essential component for successful routing of information within networks.
However, flow-control algorithms based purely on forwarding often fail to reach network capacity when faced with transmission corruptions and local network bottlenecks. Over the past decade, seminal work has shown that coding information flows within the network allows us to relieve congestion at local bottlenecks by utilizing excess (available) capacity in other parts of the network. Additionally, coding at intermediate network nodes can be used to overcome network erasures (e.g., lost packets) or, equivalently, to reduce the number of transmissions required to achieve identical throughput. The body of work in which the above benefits are accrued by restricting the coding at intermediate network nodes to linear operations is collectively referred to as Linear Network Coding (LNC). LNC can be deployed at the physical layer or above in the network protocol stack. An attractive feature of LNC schemes is the existence of randomized coding algorithms which facilitate zero-configuration (ZC) network deployment. However, several LNC schemes use Gaussian elimination decoding at the receiver, which results in cubic decoder complexity and, additionally, is extremely vulnerable to the introduction of errors in the received stream. Furthermore, randomized approaches to LNC that enable ZC deployment result in a large number of coded transmissions. Consequently, random LNC solutions become increasingly untenable in energy-constrained noisy environments. In this thesis, we design channel codes which incorporate the underlying Euclidean network topology constraints into the network-code design process and can achieve performance similar to that of random LNC. We outline code construction algorithms which enable the generation of the desired coded symbol stream. The belief propagation algorithm (which has linear complexity) is employed at the decoder, enabling recovery from errors in the received stream.
The designed codes require a smaller number of transmissions than the zero-configuration approach while achieving the same throughput. We formulate the design of these Codes-over-Network-Graphs (CNG) as an optimization problem constrained by the underlying network connectivity and channel conditions. The formulated problem is difficult to analyze because of the dynamically changing network configuration, the channel conditions, and the interplay between different parts of the network. Therefore we determine solutions to the optimization problem using two distinct numerical approaches: the first uses a classic asymptotic Low Density Parity Check (LDPC) analysis tool called Density Evolution along with sequential quadratic programming, whereas the second uses a heuristic genetic algorithm, Differential Evolution, to search for a globally optimal solution. Network codes are designed for two distinct cases. In the first case, the relay channel erases part of the coded symbol stream; this is referred to as the "erasure only" case. In the second case, the relay channel not only erases part of the coded stream but also introduces errors into it; this is termed the "erasure and error" case. The codes designed for the "erasure only" case easily outperform ZC LNC approaches such as Growth Codes (GC) when comparing data recovery at the sink. Additionally, they have lower decoder complexity than random LNC while giving comparable performance. For the second, "erasure and error" case, the designed codes have lower error floors than standard LT codes. Experiments reveal that solutions designed using the Density Evolution based numerical approach are more robust to variation in network size.

COPYRIGHT

Copyright by
Kiran Mukesh Misra
2010

DEDICATION

To my family and friends.

ACKNOWLEDGMENT

It is an honor and pleasure to thank those who made this thesis possible.
Firstly, I would like to express my sincere gratitude to my advisor Professor Hayder Radha for his encouragement, guidance, support and patience. He has provided invaluable counsel to me throughout my Ph.D. years and has been an extraordinary nurturer and teacher. I am grateful for his role in shaping my intellectual outlook as well as my philosophical perspective towards problem solving. I cherish the teaching moments I have had with him over the years, both in research and in life. I owe my deepest gratitude to my committee members for the valuable time and constructive feedback they provided me. My research collaboration with Dr. Chakrabartty has been highly rewarding and has provided me with valuable insight into system realizations. His perspective on the connections between channel coding, machine learning, biologically inspired systems and VLSI realizations has been fascinating to learn. Professor Hall has been a constant source of inspiration for me and many of my colleagues. He has been a patient teacher, and his clarity of thought has helped precipitate answers to many a research question. My discussions with Dr. Aviyente have helped germinate ideas that have provided new and interesting directions to my continued research endeavors. I feel privileged to have had a committee that has been open to serving as a sounding board for my ideas and has responded by drawing on their vast research experience. Various parts of the work presented in this thesis appear in several published [MKR08a], [MKR07], [SKMR08], [KMR08], [KMR06] and unpublished [MKR09], [MKR08b], [KMSR07] papers. Many of them were co-authored with brilliant colleagues who provided critical support and invaluable intellectual contributions all along the way. I wish to thank my colleagues in the WAVES lab for maintaining an environment of comradeship and for their enthusiastic participation in discussions.
I would like to especially thank Shirish for the many debates and conversations on the merits and demerits of different ideas and philosophical approaches. He has infused a lot of excitement into the time spent doing research. Working with Sohraab has been a pure joy, and my many fruitful collaborations with him bear witness to his effectiveness. Usman's quiet and disciplined style in our collaborative efforts has instilled in me a deeper appreciation of due diligence and personal touch. The ingenuity shown by Utpal, Nima, Yongju, Jin, Wook-Joong, Ki-Moon, Ming, Siva, Sorour and others in our joint research efforts has been an intellectually stimulating experience. I am grateful for my association with all these people and feel the richer for it. I would also like to thank my friends Akshay, Bhaskar, Prajakt, Amit, Sachin, Mukta, Lavanya, Tejas, Amol, Kanchan, Tarundeep and others who have been like a second family for me. They have been a constant source of support with their friendship and have made the road more enjoyable. Last but most importantly, I would like to thank my family: my mother, father, brother, sister-in-law, uncles, aunts and cousins, every one of whom has provided invaluable encouragement and moral support. Their blessings have given me the courage to take on challenging problems and have helped alleviate any associated anxieties. I am grateful for the comfortable conditions they created for me, which helped make my Ph.D. research years a truly memorable experience. Inevitably this list of people is incomplete, and I would like to thank everyone who has provided me with support during my Ph.D. years.

TABLE OF CONTENTS

List of Tables
List of Figures
List of Symbols

Introduction & Background
1.1 Related Work
1.2 Overview of the thesis
1.3 Thesis Organization

Mapping to belief networks and analysis under density evolution
2.1 Mapping Network Code to Belief Network
2.2 Analysis of belief propagation algorithm using density evolution

Maximizing data recovery under erasures only
3.1 Problem Formulation
3.2 Network Model
3.3 Distributed Generation of Encoded Symbols
3.3.1 Requirements
3.3.2 The Distributed Code over Network Algorithm
3.3.3 Data side degree distribution
3.3.4 Coverage bound
3.4 Asymptotic Optimality and Fixed Point Stability
3.4.1 Random walk based ripple analysis under topology constraint
3.4.2 Density evolution based fixed point stability analysis for maximal partial recovery
3.5 Numerical optimization and experimental results
3.5.1 Optimization - Differential Evolution and Network Constraints
3.5.2 Experimental Results
3.5.2.1 Performance evaluation of designed mixing rules
3.5.2.2 Comparison with RNC
3.5.2.3 Comparison with Growth Codes
3.5.2.4 Sensitivity analysis
3.6 Conclusions

Minimal outage rate under both erasures and errors
4.1 Introduction
4.2 Network Model
4.2.1 Determining the code rate to be used for the mixed channel
4.3 Homogeneous sensor networks and the code construction algorithm
4.4 Bounds On Achievability/Coverage
4.4.1 Parity side feasibility bound
4.4.2 Data side coverage bound
4.5 Problem formulation
4.5.1 Sequential Quadratic Programming using Density Evolution Analysis
4.5.2 Using Genetic Algorithm - Differential Evolution
4.6 Experimental Results
4.6.1 Performance Analysis for K = 1000 data sensors
4.6.2 Sensitivity Analysis
4.7 Conclusions
4.8 Appendix

Concluding Remarks
5.1 List of contributions
5.2 Proposals for future work
5.2.1 Novel decoders at sink
5.2.2 Expediting data delivery under network embedded coding

Bibliography

LIST OF TABLES

3.2 Parameter values obtained when curve fitting P_d^(2T)
3.3 Data and parity node degree distributions for "erasure only"
4.2 Data and parity node degree distributions for "erasure and error"
4.3 Average parity node degree for "erasure and error"

LIST OF FIGURES

1.1 Illustrative example where network coding helps achieve network multicast capacity
1.2 Equivalency of network coding to belief network graph generation
1.3 Example illustrating code feasibility based on topological constraint
2.1 Converting XOR coding operations in the network into a belief network for BP decoding
2.2 Belief propagation algorithm for LDPC codes
3.1 Illustration for information exchange across a single hop with random construction of parity symbols of degree 6, 3 and 2
3.2 Illustration of restriction on linear combinations based on transmission range (T). Data generated by node n0 is linearly combined only with data generated by nodes (n1, n2, n3, n4, n5)
3.3 (a) Random sensor placement on a 100 x 100 grid with transmission range T = 6, p = 0.1, (b) Actual and predicted network degree distribution
3.4 Ripple progression under transmission range T. 123,228 already identified
3.5 Fitted approximation through average values obtained for P_d^(2T), where d = 2, 3, 4, 5, 6 and (a) T = 6 (b) T = 7 (c) T = 8
3.6 Network code to belief network after erasures
3.7 Monotonically decreasing erasure in BPA
3.8 Targeted and achieved parity degree distribution for T = 6, K = 1000 and r = 0.1, 0.2, ..., 1.5
3.9 (a) Recovery (δ) as the fraction of received symbols (γ) increases, (b) Increase in avg. parity degree with increasing recovery (δ)
3.10 Change in degree-one and degree-two fraction for transmission ranges (a) T = ∞ (b) T = 8 (c) T = 7 (d) T = 6
3.11 Recovery profile plot of "Designed network code (D)", "RNC for different levels of mixing (t1, t2, t3, t4, t5, t6)" and "Growth codes (G)" given a transmission range (a) T = ∞ (b) T = 8 (c) T = 7 (d) T = 6
3.12 Comparing the simulation-based recovery profile of designed network codes with the recovery profile over 802.15.4 wireless packet traces, for transmission range (a) T = ∞ (b) T = 8 (c) T = 7 (d) T = 6
3.13 Loss in performance for (a) K = 500 (b) K = 1000 (node interconnection sensitivity) (c) K = 1500 (d) K = 2000
4.1 (a) Network model example (b) Equivalent channel representation
4.2 Constructing generator matrix C with distribution (5,5)
4.3 Example code construction for {Ω2 = 2/6, Ω4 = 4/6} and {Ψ1 = 4/8, Ψ4 = 4/8} under network constraints
4.4 Illustration demonstrating the network constraint on generator matrix C
4.5 Belief ripple from degree-one parity node to degree-two parity node and beyond
4.6 Mean neighborhood size progression μN[i] for Ω3 = 1, Ψ6 = 1, |Nv| = 15, K = 1000, M = 2000
4.7 Coverage progression for Ω3 = 1, Ψ6 = 1, |Nv| = 15, K = 1000, M = 2000
4.8 Achieved and targeted parity degree distribution for α = 0.81, PERASURE = 0 (a) Differential Evolution q = 6, 12, K = 1000 (b) Density Evolution q = 6, 12, K = 1000
4.9 Outage Rate for codes designed for (α = 0.81, PERASURE = 0.0) (a) Number of sensors K = 1000 (b) Number of sensors K = 10000
4.10 Outage Rate of codes designed for (α = 0.68, PERASURE = 0.2) and K = 1000 sensor nodes using (a) Density Evolution design and (b) Differential Evolution design. For convenience in illustration we replace PERASURE with P in the legend.
4.11 (a) Actual mean 4-Cycle Count and (b) Estimated mean 4-Cycle Count of codes designed for (α = 0.81, PERASURE = 0)
5.1 Illustration representing non-uniform demands from different network segments at the sink
5.2 Analytic and Empirical non-uniform LT code
LIST OF SYMBOLS

K  Number of source bits
M  Number of transmitted parity bits
Ω  Vector of parity degree distribution (node perspective)
Ψ  Vector of data degree distribution (node perspective)
q  Maximum parity node degree
d  Maximum data node degree
Γ  Vector of belief network check degree distribution (node perspective)
Λ  Vector of belief network variable degree distribution (node perspective)
ρ  Vector of belief network check degree distribution (edge perspective)
λ  Vector of belief network variable degree distribution (edge perspective)
ℓ  Number of belief propagation iterations executed
P0  Probability distribution function of messages received from the relay channel
Q(ℓ)  Probability distribution function of messages from the check nodes to the variable nodes after ℓ belief propagation iterations
P(ℓ)  Probability distribution function of messages from the variable nodes to the check nodes after ℓ belief propagation iterations
Perasure  Erasure rate seen by the sink
γ  Ratio of the number of encoded symbols received at the sink to the number of encoded symbols transmitted
δ  Ratio of the average number of source symbols recovered at the sink to the total number of source symbols K
Θ  Vector of sensor network degree distribution (node perspective)
dmax  Maximum sensor network node degree
T  Euclidean distance over which a source bit may be communicated by a sensor
B  Length and width of the square grid over which the sensors are placed randomly
μθ  Mean data node degree
σθ  Standard deviation of data node degree
r  Ratio of the number of encoded symbols received by the sink to the total number of source symbols K
|Nv|  Size of the neighborhood for each sensor node in a homogeneous sensor network
σBi-AWGN  Standard deviation of noise in the last-hop binary additive white Gaussian noise channel
PBSC  Crossover probability of the aggregated binary symmetric relay channel
PERASURE  Effective erasure rate seen by the sink due to erasures within the sensor network and the relay network
μN[i]  The average available
neighborhood size while constructing the i-th parity bit
Ik  Set of parity nodes with degree k

Chapter 1

Introduction & Background

Information theory and coding theory have increasingly been playing a significant role in providing viable solutions to networking problems. This has resulted in an intriguing convergence between coding and networking that has led to the emergence of new research areas. The area of Network Coding (NC) is one such example, where coding at intermediate network nodes is used to achieve network multicast capacity. Another example is the use of Low Density Parity Check (LDPC) codes [Gal63], Decentralized Erasure Codes (DECs) [DPR06], Growth Codes (GCs) [KFMR06] etc. to achieve reliable networking and data persistence [KHH+06]. In particular, the viability of converged coding-networking solutions for wireless sensor networks (WSNs) is becoming more evident and increasingly compelling. Despite their excellent contributions, prior efforts in coding solutions for sensor networks do not take into consideration vital constraints and limitations of sensor networks. For example, most practical sensor networks have an inherently constrained topology dictated by their initial (usually random) layout during deployment. Hence any viable coding solution over such networks is by nature constrained to the underlying topology. Furthermore, sensors continuously deplete their energy as the network ages over time. Thus coding solutions that may be viable under certain energy conditions may become less applicable as the network progresses over time. This point can be illustrated by considering decentralized erasure codes, which require O(log K) XOR operations and O(√K) random-route based transmissions (per encoded symbol) to recover K source symbols [DPR06] [KWSGLA09]. For large sensor networks, as the node energy depletes this expense may become prohibitive.
The above arguments highlight the need for coding solutions that adapt to changing dynamics and network topology. Additionally, any practical coding solution should strive to achieve distributed execution, low encoding and decoding complexity, and graceful degradation. Prior contributions on coding over sensor networks usually address one (arguably two) of these crucial issues, but not all simultaneously. For example, LT codes that can adapt to dynamic network conditions require distributed feedback to the source nodes within a WSN, which leads to significant energy consumption [LLL07]. Similarly, random linear NC approaches as well as decentralized erasure codes suffer a rapid deterioration in the number of recovered source symbols if the total number of received symbols falls below a certain threshold [ADMK05][DPR06]. To address the issues raised above, the research in this thesis outlines a distributed code-over-network framework for self-configuring wireless sensor networks, with data streams that adapt to the dynamic network configuration and to fluctuating network conditions. The proposed approach maps the data sensed by the WSN to a subset of the variable nodes of an LDPC code [BL05]. Before we discuss the details of the code-over-network framework and the associated design questions, it is worthwhile to take a closer look at the coding approaches which already exist in the literature and outline their merits and drawbacks.

1.1 Related Work

In [ACLY00], Ahlswede et al. showed that mixing information from different data flows at the intermediate nodes of a network can help achieve the network multicast capacity.

Figure 1.1: Illustrative example where network coding helps achieve network multicast capacity

A simple example used to illustrate the point is shown in Figure 1.1. Here the source aims at communicating two source bits x1 and x2 to the two sinks.
Each link in the network has unit capacity, so a routing scheme based purely on packet forwarding would require a total of 10 transmissions to communicate both source bits to the sinks. Under network coding the intermediate nodes are allowed to perform coding operations on incoming data. Consequently, the bottleneck link in the middle of the figure can forward an XOR-ed version (x1 ⊕ x2) of the incoming source bits. Each sink receives a single source bit and the XOR-ed version of the two source bits. Reversing the XOR operation at the sink is a trivial task. The coding operation at the intermediate network node results in both sinks receiving the desired source information in a total of 9 transmissions. This simple yet elegant example demonstrates the gains that can be achieved by enabling information mixing at intermediate network nodes. The above network can also be viewed as two separate sources trying to communicate their information bits x1 and x2, as shown in the leftmost part of Figure 1.2. The union of the sets of symbols received by the sinks is the set of parity bits {c1, c2, c3}. The sink performs Gaussian elimination decoding to recover the original source bits. It is easy to see that the loss of a single parity bit in this case does not impact recovery. The parity bits are represented by a set of equations in Figure 1.2: c1 = x1, c2 = x2 and c3 = x1 ⊕ x2. As shown, this set of parity equations can be easily transformed into a set of check equations: x1 ⊕ c1 = 0, x2 ⊕ c2 = 0 and x1 ⊕ x2 ⊕ c3 = 0.

Figure 1.2: Equivalency of network coding to belief network graph generation

We can then create a corresponding belief network graph over which we can carry out message passing decoding [Rya04]. Chapter 2 examines this equivalency in more detail.
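The XOR relief at the bottleneck is easy to sketch. The snippet below is a minimal illustration (the function name and tuple bookkeeping are ours, not from the thesis): the bottleneck forwards x1 ⊕ x2, and each sink undoes the XOR with the bit it hears directly.

```python
# Sketch of the butterfly example above. The bottleneck forwards the XOR
# x1 ^ x2; each sink combines it with the bit it receives directly.
# Names (butterfly, sink1, ...) are illustrative, not the thesis's code.

def butterfly(x1: int, x2: int):
    c3 = x1 ^ x2                 # coding at the bottleneck node
    sink1 = (x1, c3)             # sink 1 hears x1 plus the XOR-ed bit
    sink2 = (x2, c3)             # sink 2 hears x2 plus the XOR-ed bit
    recovered1 = (sink1[0], sink1[0] ^ sink1[1])   # x2 = x1 ^ (x1 ^ x2)
    recovered2 = (sink2[0] ^ sink2[1], sink2[0])   # x1 = x2 ^ (x1 ^ x2)
    return recovered1, recovered2

for a in (0, 1):
    for b in (0, 1):
        assert butterfly(a, b) == ((a, b), (a, b))
```

Reversing the XOR costs one operation per sink, and losing any single one of {x1, x2, x1 ⊕ x2} is harmless: the two remaining symbols still determine both source bits, which is the one-erasure resilience noted above for Figure 1.2.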
The message passing decoder (for a carefully designed belief network) provides error resiliency which eludes Gaussian elimination decoding. We therefore focus our efforts on designing belief graphs which are admissible by the network under consideration. Note that WSNs are particularly amenable to mixing data streams, as the bits transmitted by a sensor node are received by all nodes within transmission range free of cost. The work in [ACLY00] was further extended in [KM03] to show that mixing can be seen as linear algebraic operations on the incoming data in higher Galois fields GF(2^q). However, operations in higher fields are unwieldy due to their significant encoding complexity (multiplication operations) and decoding complexity [KHH+06]. WSN systems in which intermediate nodes have low complexity typically limit their NC manipulations to the binary field. As a result, all coding operations are restricted to XOR-ing of binary bits [KHH+06] [KFMR06]. In [HMK+06] an approach was outlined where linear network codes were generated using random weighting coefficients. When employed in a WSN, this approach generates a sequence of encoded symbols such that the sink can recover the original source (with high probability) if it receives a subset of the encoded symbols of the same size as, or slightly greater than, the original source [KHH+06] [DPR06]. The encoded symbols received at the sink arrive along with the random weighting coefficients used to generate the symbol stream. The randomized weighting makes this approach amenable to execution in a distributed fashion. The source data is typically recovered at the sink using Gaussian Elimination (GE). The body of work associated with this approach is commonly referred to as random linear network coding (RLNC). Although RLNC is efficient in terms of the number of encoded symbols required for decoding, the GE algorithm has cubic complexity.
As a result, processing at the sink becomes increasingly infeasible for a large number of encoded symbols. Moreover, a significant drawback of GE based decoding is that if the number of received encoded symbols at the sink falls below a certain threshold, the amount of source data recovered under RLNC quickly diminishes to zero. In [DPR06], Dimakis et al. proposed decentralized erasure codes that use random linear NC for distributed data storage in WSNs. This approach differs from [HMK+06] in that it explicitly tackles the problem of multiple source nodes. Using DEC, source data can be recovered by querying a small fraction of nodes within the sensor network. However, DEC again relies on cubic-complexity GE based decoding to ensure successful data recovery. Moreover, to achieve complete randomization (for minimum received symbol overhead) the work in [DPR06] employs a random pre-routing algorithm that requires O(√K) transmissions per encoded symbol, thereby increasing communication cost. The number of transmissions per encoded symbol is derived based on two facts: the average route tree length is O(√(log K)) and the transmission radius is O(√((log K)/K)) [KWSGLA09]. A slightly different approach was taken by Kamra et al. in [KFMR06], where growth codes with a dynamically changing degree of encoded symbols were employed in conjunction with Belief Propagation (BP) decoding [Pea88]. The use of BP instead of GE decoding permits partial data recovery in [KFMR06]. However, for GC to recover all of the original data, the size of the encoded symbol stream has to be significantly higher than the length of the embedded source data (as will be demonstrated in Section 3.5.2). The larger stream size is a result of the zero-configuration (ZC) approach adopted by GC and results in higher transmission energy consumption. In time-critical, energy-constrained WSN applications, we are often allowed to transmit only a fixed number of symbols towards the sink.
The receiver (through post-processing) then tries to recover as much of the original data as possible. This makes energy-expensive options like GC unsuitable for deployment. In addition to the individual drawbacks listed for RLNC, DEC and GC, all the above approaches have the common drawback that they were designed to recover information at the sink in the presence of network erasures only. Data transmitted in WSNs is also susceptible to packet corruptions (specifically errors) due to the low transmission power used by sensor nodes, limited transmission range, and inter-node communication interference caused by high node density. Coding approaches such as RLNC and DEC cannot recover from errors introduced into the symbol stream when GE decoding is employed at the sink. An alternative coding strategy termed co-operative decode-and-forward [KGG05] helps avoid this predicament by cleaning up the symbol stream at intermediate nodes. This approach achieves higher reliability in noisy networks at the cost of increased node complexity and higher latency. In the following section we discuss the path adopted in this thesis to tackle the above issues.

1.2 Overview of the thesis

To address the issues raised above we enlist the support of work outlined in [BL05], where network embedded coding was formulated as a joint network channel coding problem. This formulation allows the construction of joint network channel codes which can operate in noisy environments and require minimal node complexity. It was shown in [BL05] that the algebraic operations (at the intermediate nodes) of joint network channel coding correspond to the check equations of a linear algebraic channel code. In this thesis we refer to this unique perspective as codes-over-network-graphs (CNG). CNG maps variable nodes of Low Density Parity Check (LDPC) codes onto data collected by network nodes, and consequently translates check equations (used in linear algebraic LDPC codes) into in-network processing.
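To make the check-equation perspective concrete, the small example of Figure 1.2 can be written as a binary parity-check matrix and decoded with a peeling pass, the erasure-channel special case of belief propagation. This is a hedged sketch: the matrix layout and the variable ordering (x1, x2, c1, c2, c3) are our own choices for illustration, not the thesis's decoder.

```python
# Rows of H encode the in-network XOR operations of Figure 1.2 as check
# equations over the variables (x1, x2, c1, c2, c3). A peeling decoder
# fills in an erased position whenever some check has exactly one unknown.
# This is the BEC special case of BP, not the full message-passing decoder.

H = [
    [1, 0, 1, 0, 0],  # x1 + c1 = 0
    [0, 1, 0, 1, 0],  # x2 + c2 = 0
    [1, 1, 0, 0, 1],  # x1 + x2 + c3 = 0
]

def peel(received):
    """received: 0/1 values with None marking erased positions."""
    vals = list(received)
    progress = True
    while progress:
        progress = False
        for row in H:
            unknown = [j for j, h in enumerate(row) if h and vals[j] is None]
            if len(unknown) == 1:
                j = unknown[0]
                # the erased bit is the XOR of the known bits in this check
                vals[j] = 0
                for k, h in enumerate(row):
                    if h and k != j:
                        vals[j] ^= vals[k]
                progress = True
    return vals

# codeword for (x1, x2) = (1, 0) is (1, 0, 1, 0, 1); erase c1 and c3
assert peel([1, 0, None, 0, None]) == [1, 0, 1, 0, 1]
```

Gaussian elimination would also solve this tiny system, but peeling touches each edge only a constant number of times, which is the linear-complexity advantage that belief propagation retains at scale.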
CNG codes not only improve achieved capacity, but are also resilient to errors in noisy environments. Traditionally CNG employs standard LDPC codes while assuming the underlying network is capable of supporting the degree distribution dictated by these codes. In practice, however, the network topology is usually predetermined by the placement of the nodes, and hence we are constrained to map a code onto a given topology. Figure 1.3 illustrates an example where the mapping is constrained by the underlying network topology. In Figure 1.3 sensors collect one bit of information each and forward it to their respective cluster heads. The cluster heads are then responsible for generating the desired code. Let us assume that the prevailing channel conditions dictate that the best code consists of six degree-5 parity symbols. As can be seen, it is not possible for the cluster heads to generate six unique parity bits with degree 5, and hence the code is non-realizable. Codes must be designed not only to overcome current channel conditions but also to be mappable to the underlying network. If such a network-constrained design process yields a code with five degree-3 parity symbols, then Figure 1.3 illustrates a realization of the desired code.

Figure 1.3: Example illustrating code feasibility based on topological constraint

The overall intent of this thesis is to design and analyze network codes which are matched to the underlying network topology. Therefore the topological information needs to be accessible while determining the network code to be deployed.
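The feasibility question behind Figure 1.3 can be phrased as a simple admission check: a cluster head can form a parity symbol of degree d only if it reaches at least d source bits. The sketch below assumes one parity symbol per cluster head and uses hypothetical neighborhood sizes chosen to mirror the figure; the function name and the greedy assignment are illustrative, not the code construction algorithm of later chapters.

```python
# Sketch of the topological feasibility check behind Figure 1.3. A cluster
# head can only form a parity bit of degree d from the d source bits it can
# reach. Neighborhood sizes below are hypothetical: no head reaches 5
# sources, so six degree-5 parities fail while five degree-3 parities fit.

def realizable(target, cluster_sizes):
    """target: {degree: count} of desired parity symbols.

    Greedily assign each requested parity to a cluster head that reaches
    at least `degree` sources (one parity per head, as in the figure)."""
    heads = sorted(cluster_sizes)
    for degree in sorted(target, reverse=True):
        for _ in range(target[degree]):
            # take the smallest head that still satisfies the degree
            for i, size in enumerate(heads):
                if size >= degree:
                    heads.pop(i)
                    break
            else:
                return False
    return True

clusters = [3, 3, 4, 3, 4, 3]            # sources reachable per cluster head
print(realizable({5: 6}, clusters))      # False: no head reaches 5 sources
print(realizable({3: 5}, clusters))      # True: five degree-3 parities fit
```

Scaling this check up to full degree distributions, rather than a yes/no answer for a single degree, is exactly where the topological information discussed above enters the design process.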
When the constructed symbol stream is transmitted over the network it is altered prior to arrival at the sink. The received symbol stream is processed at the sink using the belief propagation algorithm (BPA). Topological information is a critical parameter used by the code construction algorithm deployed in the network. The level of detail at which the topological information is required during code construction depends on the operating environment and the nature of the distortion affecting the transmitted symbol stream. If the symbol stream merely undergoes erasures, the required topological information is the network degree distribution. However, if errors are introduced into the symbol stream, the code construction algorithm outlined requires the sensor network's adjacency matrix. Chapters 3 and 4 take a more detailed look at these code construction algorithms.

The performance of the network-codes under BP decoding at the sink is predicated on the choice of a 2-tuple vector often referred to as the code degree distribution. The design problem, which involves the selection of the network-optimal code, essentially amounts to determining the code degree distribution best suited to the underlying network connectivity. We define the code degree distribution in more detail in chapter 2. We can now state the specific issues addressed in this thesis:

1. Design and analyze optimal erasure-resistant codes for networks represented by Euclidean hypergraphs (erasure only): Within the constraint of a given network topology, we use CNG and BP decoding to maximize the average number of source symbols recovered when there are erasures within the network. BP decoding under erasures allows for partial data recovery, and the code degree distribution deployed on the network is determined as a function of the number of symbols received at the sink.

2.
Design and analyze optimal erasure & error resistant codes for networks represented by Euclidean hypergraphs (erasure and error): The assumption that errors may additionally be introduced into the network streams leads to a significantly different optimal code degree distribution compared with the pure erasure case. The introduction of errors precludes reliable partial recovery, and therefore the goal corresponds to minimizing the probability of BP decoding failure (outage rate).

The larger goals outlined above are achieved through several smaller intermediate steps. These intermediate steps have some overlap between the "erasure only" and "erasure and error" cases listed above. As an example, both cases require the determination of a data coverage criterion, which puts an upper bound on the average number of source symbols that can be recovered at the sink. Another intermediate step essential to achieving the larger goals is the determination of a code construction algorithm to be employed once a target code degree distribution is determined. The code construction algorithm has an essential component which ensures that the constraints placed by the underlying graph's network connectivity are not violated. The code construction algorithm has a direct impact on the suitability of a code degree distribution to a network topology. The code admissibility criterion is stricter for the "erasure and error" case than for the "erasure only" case, since the BP decoding algorithm has stricter operating requirements in the presence of errors to ensure acceptable data recovery performance. The final solutions are obtained by leveraging existing analytic tools for LDPC codes, chiefly density evolution, and numerical algorithms such as sequential quadratic programming and differential evolution.
1.3 Thesis Organization

All solutions presented in this thesis can be imagined as being for WSNs which collect information at different sensor nodes and direct it towards, say, monitoring station(s). This example is quite helpful in visualizing the discussion presented in this thesis and represents the more generic setting of multiple sources transmitting information to sink(s) via a wireless network. A graph representing the WSN contains hyperedges. A hyperedge represents a single edge which connects all the nodes within range of a transmitting node. Such graphs are typically referred to as Euclidean hypergraphs. The rest of the thesis is organized as follows:

• In chapter 2 we discuss the mapping between network-codes and their equivalent belief networks for BP decoding. It then details the terms and notation used extensively throughout the rest of the thesis. It finally discusses the theoretical underpinnings of LDPC codes and their design using density evolution.

• Chapter 3 formulates the problem of maximizing expected data recovery for the "erasure only" case mathematically. It then considers the network model used for design and analysis. A distributed code construction algorithm is outlined next. A data coverage bound, which dictates an upper bound on the expected data recovery, is determined based on the code construction algorithm. An expression for ripple progression under BP decoding is obtained. A fixed point stability condition is determined which helps reduce the set of candidate code degree distributions. Solutions are provided based on a heuristic numerical approach (differential evolution) which searches for a global optimum. The performance of the designed code degree distribution is compared against random LNC and zero-configuration approaches such as growth codes. The sensitivity of the solution is tested by varying network size. Its validity is further tested by evaluating its performance over actual IEEE 802.15.4 wireless traces.
• Chapter 4 formulates the problem of minimizing outage rate for the "erasure and error" case mathematically. It outlines the equivalent channel model to be used for design and analysis. A simple expression determining a lower bound on the number of encoded symbols to be transmitted is obtained. A centralized coordinator based code construction algorithm is outlined. Analyzing the code construction algorithm leads to the determination of a data coverage bound and a code admissibility criterion. Solutions are obtained numerically using: (i) density evolution and sequential quadratic programming, and (ii) Monte Carlo simulations and differential evolution. The performance of the solutions obtained by both these approaches is compared to standard LT codes. The role of short cycles in code performance is conjectured, and an expression is obtained for the expected number of four-cycles in the code. The ordering in the expected number of four-cycles is compared against the actual observed code performance.

• Chapter 5 summarizes the contributions of the thesis. It then outlines promising future avenues of research for this line of work, e.g., a coding control scheme at intermediate nodes which can prioritize recovery for a segment of the source data under the "erasure only" setup.

Note, in this thesis when we use the term "code design" we imply the determination of a degree distribution which represents an ensemble of codes. This ensemble represents a collection of code instances where some may perform well and others may not. We do not attempt to determine which particular instance of a code will give the best performance and focus instead on determining ensembles which are admissible under the provided network connectivity constraints.

Chapter 2

Mapping to belief networks and analysis under density evolution

In this chapter, we first outline the principles used to convert network-codes into bipartite graphs called belief networks.
We then detail the terms and notation used to describe the different aspects of a network code and its equivalent belief network. These terms and mathematical notations are used extensively in the rest of the thesis. Representing network coded information as a belief network enables the use of the belief propagation algorithm (BPA) to recover the transmitted symbol stream. The performance of BPA can be determined using an analytic tool termed density evolution (DE) [RU01]. Using DE an upper bound on BPA performance can be determined. The equivalence between network codes and belief networks implies that DE can also be used to design belief networks for superior BPA performance. Its utility in deriving useful analytic results and its accuracy in predicting performance bounds make it the tool of choice while designing codes within the context of this thesis. In this chapter, we discuss key components of DE which are used in the remaining chapters for code design and analysis.

2.1 Mapping Network Code to Belief Network

Figure 2.1: Converting XOR coding operations in the network into a belief network for BP decoding

In linear network coding (LNC), intermediate nodes of a network perform linear algebraic operations on the received data before forwarding them to their neighbors. These network operations can be collectively represented by a global matrix at the sink. This global matrix (corresponding to a linear transformation of the source data) is referred to as the "network transformation matrix". As long as the linear algebraic operations result in a global matrix that is invertible (and the received symbols contain no errors), the transmitted data can be recovered successfully at the sink using Gaussian elimination (GE) decoding. In general, the data and the associated linear algebraic operations can be defined over higher Galois fields [HMS+03, HKM+03, MEHK03].
However, we restrict our attention to network nodes with low complexity, which are therefore capable of performing only XOR operations on the incoming binary data. It is always possible to convert the XOR coding operations at intermediate nodes into check equations of an LDPC code. This mapping, coupled with the correspondence of incoming data flows to a subset of the variable nodes of an LDPC code, enables network coding to be represented with a belief network graph. We demonstrate this with the help of a simple example. Figure 2.1 illustrates a case where a set of XOR coding operations is shown in subfigure (a). These XOR coding operations can be represented by the following set of equations:

v1 ⊕ v2 ⊕ v3 = s1
v2 ⊕ v3 = s2        (2.1)

where ⊕ represents componentwise modulo-2 addition (XOR-ing). The left side of each equation represents incoming data collected by a node, whereas the right side represents encoded data obtained after XOR-ing, which is forwarded to downstream nodes. Equation (2.1) can be rewritten as:

v1 ⊕ v2 ⊕ v3 ⊕ s1 = 0
v2 ⊕ v3 ⊕ s2 = 0        (2.2)

The above equations correspond to the bipartite belief network graph shown in Figure 2.1(b) (with zero syndrome). Each equation above represents a check node of the belief network graph, while the individual elements on the left side of each equation represent the variable nodes. This simple transformation enables the LNC operations to be viewed as check equations of a belief network. From this simple example, it is easy to see that the degree distribution of the bipartite graph corresponding to the XOR coding operations in Figure 2.1(a) determines the degree distribution of the bipartite belief network graph in Figure 2.1(b). The degree distribution of a belief network plays a pivotal role in determining the performance of the BP algorithm, as we shall see in section 2.2.
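The example above is small enough to verify mechanically. The following sketch (a toy illustration of ours, not code from the thesis) encodes the check equations of (2.2) as rows of a parity-check matrix and confirms that any XOR-encoded word has an all-zero syndrome:

```python
# Check equations of (2.2): v1^v2^v3^s1 = 0 and v2^v3^s2 = 0.
# Rows are check nodes; columns are variable nodes [v1, v2, v3, s1, s2].
H = [[1, 1, 1, 1, 0],
     [0, 1, 1, 0, 1]]

def encode(v1, v2, v3):
    """XOR-encode the data bits into the parities (s1, s2) of (2.1)."""
    return v1 ^ v2 ^ v3, v2 ^ v3

def syndrome(word):
    """All-zero syndrome <=> word satisfies every check equation."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]

s1, s2 = encode(1, 0, 1)
print(syndrome([1, 0, 1, s1, s2]))  # [0, 0]
```

Flipping any single bit of the word produces a nonzero syndrome, which is exactly the property the BP decoder exploits.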
We therefore take a closer look at the equations which establish a mathematical relationship between the degree distributions of both the above graphs. Let us first consider the terminology used in reference to the above graphs. The source symbols generated by a sensor node are referred to as data bits/nodes and the encoded symbols as parity bits/nodes. The nodes in the belief network of Figure 2.1(b) which correspond to individual XOR equations are termed check nodes. The collection of data and parity bits/nodes which appear on the left side of equation (2.2) are called variable nodes. As can be seen, the incoming data and parity bits/nodes get mapped to the variable nodes of the belief network. Now, if n represents the maximum degree a data node can take and q represents the maximum degree for a parity node, then:

λ̄ ~ {λ̄_1, λ̄_2, ..., λ̄_n} = K/(K+M) {δ̄_1 + M/K, δ̄_2, δ̄_3, ..., δ̄_n}
ρ̄ ~ {ρ̄_2, ρ̄_3, ..., ρ̄_{q+1}} ≡ {φ̄_1, φ̄_2, ..., φ̄_q}        (2.3)

where,

φ̄ ~ {φ̄_1, φ̄_2, ..., φ̄_q} : φ̄_i is the fraction of degree i parity nodes (q is the maximum parity degree)
δ̄ ~ {δ̄_1, δ̄_2, ..., δ̄_n} : δ̄_i is the fraction of degree i data nodes (n is the maximum data degree)
ρ̄ ~ {ρ̄_2, ρ̄_3, ..., ρ̄_{q+1}} : ρ̄_i is the fraction of degree i check nodes
λ̄ ~ {λ̄_1, λ̄_2, ..., λ̄_n} : λ̄_i is the fraction of degree i variable nodes
K : total number of data bits to be communicated to the sink (in every round)
M : total number of parity bits generated by the collection of nodes in the WSN for each set of K data bits

Note, the right hand side of equation (2.3) corresponds to the degree distributions for data bits (δ̄) and parity bits (φ̄) respectively. Also, since every check node represents an equation containing one parity bit and one or more data bits, a check node has at least two edges connected to it. Consequently the lowest check node degree is degree 2. This fact is reflected in the degree distribution ρ̄, which starts from ρ̄_2.
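As a sanity check on equation (2.3), the node-perspective conversion takes only a few lines; the dictionary representation and function name below are illustrative choices of ours. Each of the M parity bits contributes a degree-1 variable node (it appears in exactly one check), and a check built from a degree-i parity has degree i + 1:

```python
def node_degree_distributions(delta, phi, K, M):
    """Node-perspective conversion of equation (2.3).

    delta: {i: fraction of degree-i data nodes}
    phi:   {i: fraction of degree-i parity nodes}
    Returns the variable-node (lam) and check-node (rho) fractions.
    """
    lam = {i: K * f / (K + M) for i, f in delta.items()}
    lam[1] = lam.get(1, 0.0) + M / (K + M)    # parity variable nodes have degree 1
    rho = {i + 1: f for i, f in phi.items()}  # check degree = parity degree + 1
    return lam, rho

lam, rho = node_degree_distributions({1: 1/3, 2: 2/3}, {2: 1/2, 3: 1/2}, K=3, M=2)
# lam is {1: 3/5, 2: 2/5} and rho is {3: 1/2, 4: 1/2}, matching the worked example
```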
Since the above vectors essentially represent probability distributions, we have the following additional constraints on the degree distributions:

Σ_{i=1}^{n} λ̄_i = 1,   Σ_{i=2}^{q+1} ρ̄_i = 1,   Σ_{i=1}^{n} δ̄_i = 1,   Σ_{i=1}^{q} φ̄_i = 1        (2.4)

For the example illustrated in Figure 2.1 we have δ̄ ~ {δ̄_1 = 1/3, δ̄_2 = 2/3}, φ̄ ~ {φ̄_2 = 1/2, φ̄_3 = 1/2}, and the application of equation (2.3) gives λ̄ ~ {λ̄_1 = 3/5, λ̄_2 = 2/5}, ρ̄ ~ {ρ̄_3 = 1/2, ρ̄_4 = 1/2}.

The sink uses BPA on the resulting belief network graph to perform decoding. The BPA iteratively passes messages between the variable nodes corresponding to the codeword side and the check nodes corresponding to the check equations, as shown in Figure 2.2. The initial message set is determined based on the symbol values received at the sink. The algorithm stops when all the checks are satisfied, i.e., when a valid codeword is reached. The formulas to calculate the messages transmitted between check node C and variable node V for a log-likelihood ratio belief propagation decoder are given by [Rya04]:

L(V → C) = L_{V→C}(0) + Σ_{a ∈ N_V\C} L(a → V)        (2.5)

L(C → V) = [ Π_{a ∈ N_C\V} α_a ] × Ψ( Σ_{a ∈ N_C\V} Ψ(β_a) )        (2.6)

where,

L_{V→C}(0) is the initial log-likelihood ratio of bits arriving at the sink.
N_x\y represents the neighborhood of x in the parity check graph H except the node y.
α_a is the sign of the message coming from variable node a to check node C.
β_a is the magnitude of the message coming from variable node a to check node C.
Ψ(x) = ln((e^x + 1)/(e^x − 1))

In this thesis we primarily make use of the following two DMCs: the binary erasure channel (BEC) and the binary input additive white Gaussian noise channel (BIAWGNC). If s_i represents the i-th symbol received at the sink and it maps to the variable node V_i, then the expressions used to calculate the initial messages for the log-likelihood ratio BPA are [Rya04]:

L_{Vi→C}(0) (BEC) = +∞ if s_i = 0;  −∞ if s_i = 1;  0 if s_i = erasure        (2.8)

L_{Vi→C}(0) (BIAWGNC) = 2 s_i / σ²        (2.9)

where σ² is the variance of the BIAWGNC.
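The updates (2.5) and (2.6) can be written out directly. The sketch below uses hypothetical helper names of ours, assumes all incoming messages are nonzero (so that Ψ stays finite), and is numerically equivalent to the familiar tanh rule for check nodes:

```python
import math

def psi(x):
    """Psi(x) = ln((e^x + 1)/(e^x - 1)); its own inverse for x > 0."""
    return math.log((math.exp(x) + 1.0) / (math.exp(x) - 1.0))

def variable_to_check(l0, incoming):
    """Equation (2.5): channel LLR plus all incoming check messages except C's."""
    return l0 + sum(incoming)

def check_to_variable(incoming):
    """Equation (2.6): product of signs times Psi of the summed Psi magnitudes."""
    sign = 1.0
    for m in incoming:
        sign *= 1.0 if m >= 0 else -1.0
    return sign * psi(sum(psi(abs(m)) for m in incoming))
```

Since Ψ(x) = ln coth(x/2), the check update above equals 2·atanh(Π tanh(β_a/2)), the usual tanh-rule form.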
As illustrated above, CNG mapping translates the "inversion operation" (of the global network transformation matrix) at the sink to "decoding using BPA". The design of network embedded codes can now be posed as the design of the bipartite network graph shown in Figure 2.1(b). The "invertibility" of the global matrix is tantamount to the determination of a belief network graph with a full-rank incidence matrix (for erasure networks). The incidence matrix of a belief network graph is a (0,1)-matrix which has a row corresponding to each check node and a column for each variable node; an element of the matrix is 1 if and only if there exists an edge between the check and variable node, and 0 otherwise. This incidence matrix is also called the parity check matrix H and for our setup has the general form:

H = [G_{M×K} | I_{M×M}]        (2.10)

where G is the generator matrix (with M rows) and I is the M × M identity matrix. When errors are introduced into the network, designing an optimal parity check matrix becomes significantly more challenging since the problem is no longer restricted to the determination of a full-rank matrix.

The BPA is capable of correcting the errors seen at the sink and recovering the original codeword. The degree distribution of the bipartite graph represented by (2.10) and shown in Figure 2.2 plays a pivotal role in determining the performance of the BPA decoding algorithm. A more detailed look at its analysis is given in the following section.

Figure 2.2: Belief propagation algorithm for LDPC codes

2.2 Analysis of belief propagation algorithm using density evolution

Density evolution (DE) is a popular analytic tool often used to determine the performance of an LDPC code under BP decoding for discrete memoryless channels (DMCs). Most DMCs are characterized by a single parameter which represents the noise level in that channel.
DE is capable of predicting the performance of an LDPC code (with a given degree distribution) for a single point in the parameter space of a DMC. Using DE one can search the parameter space of a DMC and determine the maximum noise level that the LDPC code under consideration is capable of overcoming. This maximum noise level is often referred to as the noise threshold of an LDPC code. DE is an asymptotic analysis tool and therefore all results deduced using DE hold for codewords with code length tending to infinity. Some examples of DMCs where DE has proven to provide accurate LDPC noise thresholds are: the binary erasure channel (BEC), the binary symmetric channel (BSC), the binary input additive white Gaussian noise channel (BIAWGNC), the binary input Laplacian channel (BILC), etc.

We now present in detail some key concepts of the DE based LDPC performance analysis used in this thesis. The performance of the BPA can be determined by calculating the fraction of incorrect messages passed along the belief network edges from the variable node side to the check node side at the ℓ-th belief propagation iteration (assuming that the belief network graph does not contain cycles of length 2ℓ or less). The cycle-free assumption (for ℓ belief propagation iterations of the decoder) is necessary to presume independence among incoming messages. Independence in turn implies that the summation of messages at variable and check nodes translates to convolution of their probability distributions, making it easier to keep track of the evolution of the message densities. This edge centric view of density evolution leads to tracking equations which make use of the belief network degree distribution from the edge perspective. (Note, equation (2.3) outlined the belief network degree distribution from the node perspective.)
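The two perspectives are related by a simple reweighting: the fraction of edges incident on degree-i nodes is i times the node fraction, normalized by the average degree. A small illustrative helper (our naming, not the thesis's):

```python
def node_to_edge(node_fractions):
    """Convert a node-perspective degree distribution into the
    edge-perspective one: edge fraction at degree i is
    i * f_i / sum_j (j * f_j)."""
    total = sum(i * f for i, f in node_fractions.items())
    return {i: i * f / total for i, f in node_fractions.items()}

# Variable-node distribution of the Figure 2.1 example:
print(node_to_edge({1: 3/5, 2: 2/5}))  # degree-1 edges: 3/7, degree-2 edges: 4/7
```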
The convolution of message densities passed between the belief network nodes uses the following edge-centric equations:

λ(x) = λ_1 + λ_2 x + λ_3 x^⊗2 + ... + λ_n x^⊗(n−1) = Σ_{i=1}^{n} λ_i x^⊗(i−1)        (2.11)

ρ(x) = ρ_2 x + ρ_3 x^⊗2 + ... + ρ_{q+1} x^⊗q = Σ_{i=2}^{q+1} ρ_i x^⊗(i−1)        (2.12)

where,

ρ ~ {ρ_2, ρ_3, ..., ρ_{q+1}} : ρ_i is the fraction of edges connected to degree i check nodes.
λ ~ {λ_1, λ_2, ..., λ_n} : λ_i is the fraction of edges connected to degree i variable nodes.
x^⊗(i−1) represents the convolution of (i − 1) message density terms x.

Note: the two tuples (λ, ρ) ≡ (λ̄, ρ̄) ≡ (δ̄, φ̄) are often referred to as the code degree distribution.

This technique of evaluating code performance by tracking the evolution of message densities is labeled density evolution [RU01]. Using equations (2.11) and (2.12), the evolution of message densities from check nodes to variable nodes can be succinctly represented by the following equation:

R_ℓ = Γ^{−1} ρ( Γ( P_0 ⊗ λ(R_{ℓ−1}) ) )        (2.13)

where,

R_ℓ represents the probability distribution of messages from the check nodes to the variable nodes after ℓ iterations.
P_0 represents the probability distribution of messages received from the symmetric relay channel.
⊗ represents convolution.
Γ (with inverse Γ^{−1}) is the change-of-variable operation associated with the function γ(x) = ln((e^x + 1)/(e^x − 1)).

Figure 3.2: Illustration of restriction on linear combinations based on transmission range (T). Data generated by node n0 is linearly combined only with data generated by nodes (n1, n2, n3, n4, n5)

After the information exchange step each sensor node uses a "biased" coin to determine whether or not to generate symbols of a particular degree. If the coin flips to heads, a random selection of neighboring data bits is mixed to generate an encoded symbol. Obviously, a sensor node can generate only symbols with degree less than or equal to its incoming degree.
The details of how the distributed encoding algorithm achieves a particular code degree distribution are given in section 3.3. The code degree distribution (φ̄) that is employed in the WSN depends on the prevailing network connectivity, i.e., the sensor field network degree distribution ᾱ. Over time the network degree distribution changes and needs to be periodically propagated within the sensor field. It can easily be shown that any such update can be accomplished using 2(K − 1) transmissions of the network degree distribution vector along a spanning tree within the sensor field. Although the network degree distribution is a global property, we show that in several scenarios it is possible to predict the overall network degree distribution accurately in a distributed fashion.

Let us consider a network with sensor nodes placed randomly on a grid of dimensions B × B. The value of B can be obtained by the use of boundary detection algorithms such as [WGM06], leading to a one time setup cost. Each 1 × 1 cell within the grid has a probability p = K/B² of containing a sensor. The network therefore is assumed to have node density p, with each sensor node having a transmission range T. The degree distribution of the nodes within the sensor field can then be approximated by the following binomial distribution:

ᾱ ~ Binomial(N_s, p_s)

where,

N_s = K = B²p is the number of nodes in the sensor field
p_s = proportion of the area of the sensor field that is covered by a sensor node = (area covered by a sensor node)/(area of the sensor field) = πT²/B²

The mean of the binomial distribution is E(ᾱ) = N_s p_s and the variance is var(ᾱ) = N_s p_s(1 − p_s). Generally, it is cumbersome to determine the value of the binomial distribution over all the degrees within a network; we therefore approximate ᾱ by the
" " 60 40 20 - Actual — Predicted =d) P(NOde Degree Node Degree (d) (b) Figure 3.3: (a) Random sensor placement on a 100 x 100 grid with transmission range T = 6, p = 0.1, (b) Actual and predicted network degree distribution 31 following normal distribution: 5NN(NsPs,NsPs(1—Psll (3-4) It is well know that this approximation is good for large values of N3. We verify the efficacy Of this approximation by evaluating the degree distribution for a network containing sensors placed randomly in a 100 x 100 grid with p = 0.1 and T = 6. As can be seen in Figure 3.3 the predicted and actual network degree distribution are quite close. Experiments with different network shapes have revealed the approximation to be fairly independent of the network shape. Note, here we made the assumption that the transmission range of the sensor nodes is the same for all nodes. In case that the transmission range is not the same (due to asymmetric usage) we can use the knowledge of its probability distribution to determine approximately the overall a. It would involve quantizing the probability distribution T and determining the Binomial (or equivalently a Normal) distribution corresponding to each quantization interval. The overall network degree distribution can then be determined as the sum of these independent normal components. However, for the sake of clarity, in this chapter we consider only the case where the distribution of T is a shifted unit impulse, i.e., all sensors have the same transmission range. 3.3 Distributed Generation Of Encoded Symbols Assuming an optimal (3,07) is known, we now outline the details of a distributed code over network mapping algorithm to carry out mixing at the intermediate sensor nodes. 32 3.3. 1 Requirements We first consider the parameters required by the DCON at each sensor node and utilized by the code construction algorithm. These parameters are listed below: Number of encoded symbols M generated by the sensor field. 
The set of encoded symbols is collectively referred to as a codeword.

• The total number of information bits (same as the number of sensor nodes) K being embedded within the codeword.
• The target parity degree distribution φ̄.
• The overall network degree distribution ᾱ of the sensor field.

The mapping algorithm uses the above parameters along with biased coin flipping to generate a parity (encoded data) stream. In section 3.2 we have seen that there exist some situations under which it is possible to estimate the overall network degree distribution locally at the sensor. Also, the parameter K representing the number of sensors within the sensor field is usually known before deployment and can be pre-programmed within the sensors. However, after deployment some of the sensors may stop functioning, either due to the hazardous environment they are operating in or due to excessive usage. In such cases a periodic update can be sent out across the sensor field to update the value of K. In most practical applications, however, a reasonable approximation of K works just as well.

Let us now consider the number of parity bits M transmitted within the WSN. Typically M is a function of the power left (P_left) within the sensor nodes. We can therefore pre-program sensors with a function g : P_left → M. If we assume uniform power consumption across the network then M can be estimated locally; otherwise an update mechanism needs to be employed.

The target parity degree distribution φ̄ must adapt to the varying erasure rates within the network and, more specifically, to the current network degree distribution ᾱ. Typically a set of φ̄ corresponding to different network conditions can be determined offline and programmed into the sensor nodes.

3.3.2 The Distributed Code over Network Algorithm

We now look at the actual DCON mapping algorithm which generates the parity stream.
It involves two stages. In the first (information exchange) stage, each sensor node transmits its sensed information bit to all its neighbors in the sensor field. As a result each sensor receives the data sensed by its adjacent nodes, as illustrated in Figure 3.1. In the second stage, each node employs Algorithm 1, outlined in the box below, to generate parity bits such that the overall codeword adheres to the distribution φ̄ = {φ̄_1, φ̄_2, ..., φ̄_q}. The parity generation algorithm is executed at each sensor node. Since the operations at each sensor node are independent of each other, we get a distributed generation of the overall codeword.

Algorithm 1 Parity generation algorithm executed at sensors
1: d ⇐ node in-degree
2: E ⇐ {1, 2, ..., min(d, q)}
3: for all dcur in E do
4:   pR ⇐ M φ̄_dcur
5:   pE ⇐ K Σ_{j=dcur}^{h} ᾱ_j
6:   r ⇐ pR / pE
7:   while r > 0 do
8:     Draw a number c from the uniform distribution U(0, 1)
9:     if c ≤ r then
10:      Randomly select dcur bits and XOR them to generate a parity bit.
11:    end if
12:    r ⇐ r − 1
13:  end while
14: end for

We briefly describe the steps involved as follows: Each node first generates a list of degrees for which it is eligible to generate parity bits (step 2). Obviously, a node cannot generate an XOR-ing of, say, 5 bits if its own degree is 3. The sensor node then iterates through the list of eligible degrees (steps 3–14). It uses the 4-tuple (M, K, φ̄, ᾱ) to determine the number of bits of a particular degree the WSN needs to generate (step 4) and the number of sensor nodes eligible to generate it (step 5). It then uses a "biased" coin (steps 8–9) to determine if it should generate a parity of this degree (the "bias" of the coin was set in step 6). If the answer turns out to be "Yes", the sensor generates the parity bit by randomly selecting a subset of the data bits available at the sensor node and XOR-ing them (step 10).
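Algorithm 1 can be sketched for a single node in a few lines of Python. The function name, dictionary representations, and example parameters below are our own illustrative choices; the radio transmission of the generated parities and the information exchange stage are omitted:

```python
import random

def generate_parities(data_bits, M, K, phi, alpha, q):
    """Sketch of Algorithm 1 run at one sensor node.

    data_bits: {node_id: bit} available locally (own bit plus neighbors' bits)
    phi[i]:    target fraction of degree-i parity symbols
    alpha[i]:  fraction of degree-i nodes in the sensor field
    Returns a list of (contributing_ids, parity_bit) pairs.
    """
    d = len(data_bits) - 1                     # node in-degree (excluding own bit)
    parities = []
    for dcur in range(1, min(d, q) + 1):       # eligible parity degrees (step 2)
        pR = M * phi.get(dcur, 0.0)            # parities of this degree needed (step 4)
        pE = K * sum(f for i, f in alpha.items() if i >= dcur)  # eligible nodes (step 5)
        if pR == 0.0 or pE == 0.0:
            continue
        r = pR / pE                            # expected parities per eligible node (step 6)
        while r > 0:
            if random.random() <= r:           # biased coin flip (steps 8-9)
                ids = random.sample(sorted(data_bits), dcur)
                bit = 0
                for i in ids:                  # XOR the selected data bits (step 10)
                    bit ^= data_bits[i]
                parities.append((ids, bit))
            r -= 1
    return parities
```

On average the node emits about r parity bits per eligible degree, which is exactly the per-node rate the text derives below.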
Note, the set of available data bits includes the information bits received from the adjacent nodes and the information sensed by the node itself. It is important to observe that if the sensor node degree is d, it can generate only Σ_{i=1}^{d} C(d, i) unique parity bits. Moreover, it is possible that a parity bit generated by random XOR-ing at a particular node is also generated at a neighboring node. The probability of repeated encoded symbols increases as M is increased (for a fixed K), making the overall code look more and more like a repetition code. Hence we should not select codeword lengths M that are inordinately long.

In summary, if the target distribution is φ̄ then the total number of degree dcur parity symbols required is M φ̄_dcur. Only sensor nodes with network degree dcur or higher can generate degree dcur parity symbols. There are K Σ_{j=dcur}^{h} ᾱ_j such eligible sensor nodes (where h is the maximum network degree). If every eligible sensor node on average generates (M φ̄_dcur)/(K Σ_{j=dcur}^{h} ᾱ_j) parity symbols of degree dcur, we achieve the target distribution (albeit with minor fluctuations). Hence, the localized mixing of data described above results in a codeword with the required parity degree distribution φ̄. However, the distributed approach leaves little control over the data degree distribution δ̄ of the codeword. We can, however, estimate the δ̄ that will be achieved by DCON mapping for a given (M, K, φ̄, ᾱ). The details of δ̄ estimation are discussed next.

3.3.3 Data side degree distribution

We now obtain a closed form expression estimating the data degree distribution δ̄. The estimate is based on the following assumption: during codeword construction with DCON mapping, each parity degree contributes a small independent fraction to the overall data degree distribution. The aggregate of these independent fractions determines the final data degree distribution.
Let us consider the data degree distribution contributed by the formation of degree d parity bits in the sensor field:

δ̄^(d) ~ Binomial(n_d, p_d) ≈ N(n_d p_d, n_d p_d(1 − p_d))        (3.5)

where,

n_d = M φ̄_d is the number of bits of degree d in the codeword
p_d = d × (probability a node is selected while generating a single parity bit of degree d) = d / (K Σ_{i=d}^{h} ᾱ_i), the denominator being the number of eligible sensor nodes

Let us represent the mean and variance of the normal distribution shown in equation (3.5) by μ_d = n_d p_d and σ_d² = n_d p_d(1 − p_d) respectively. Now, as in section 3.2, the data side binomial distribution can be approximated by the normal distribution:

δ̄ = Σ_d δ̄^(d) ~ N(μ_δ, σ_δ²)        (3.6)

where μ_δ = Σ_d μ_d and σ_δ² = Σ_d σ_d².

3.3.4 Coverage bound

Since the DCON mapping algorithm is probabilistic in nature, there exists a nonzero possibility that a sensor node, and consequently an information bit, may not participate in the codeword formation. These non-participating information bits are considered uncovered and cannot be recovered at the data sink. Sensor networks often have inherent redundancy within the sensed information, and not all the sensed information is required at the data sink for meaningful post-processing. However, the amount of loss that can be tolerated depends on the redundancy present in the sensor field and the data processing algorithm in use. The probability that a data bit generated by a sensor participates in the formation of a codeword bit is termed the "probability of coverage" of a sensor node. A lower probability of coverage requirement can lead to a better code design for the remaining nodes. A conscious design decision can be made to relax the coverage constraint of sensor nodes by choosing an appropriate parity node degree distribution φ̄. The probability of coverage can be estimated using equation (3.6) as follows:

P(coverage) = 1 − P(uncovered) = 1 − P(δ̄ < 1) = 0.5 (1 + erf((μ_δ − 1)/(σ_δ √2)))        (3.7)

On average, K P(δ̄ < 1) source symbols do not participate in the code formation.
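Equations (3.5)-(3.7) reduce to a short computation. The helper below is an illustrative sketch with hypothetical inputs (it assumes, as above, that ᾱ assigns mass only to degrees for which eligible nodes exist):

```python
import math

def data_side_estimate(M, K, phi, alpha):
    """Normal estimate (3.5)-(3.6) of the data degree distribution and
    the resulting coverage probability (3.7)."""
    mu, var = 0.0, 0.0
    for d, phi_d in phi.items():
        n_d = M * phi_d                     # degree-d parity bits in the codeword
        p_d = d / (K * sum(f for i, f in alpha.items() if i >= d))  # selection prob.
        mu += n_d * p_d                     # mu_delta = sum_d mu_d
        var += n_d * p_d * (1 - p_d)        # sigma_delta^2 = sum_d sigma_d^2
    sigma = math.sqrt(var)
    coverage = 0.5 * (1 + math.erf((mu - 1) / (sigma * math.sqrt(2))))
    return mu, sigma, coverage

mu, sigma, cov = data_side_estimate(M=500, K=100, phi={3: 1.0}, alpha={10: 1.0})
print(mu, cov)  # mean data degree 15, coverage probability close to 1
```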
This puts an upper bound on the number of source symbols that can be recovered at the sink. Since $P(\theta < 1)$ is a function of $\bar\theta$, which in turn is a function of $\bar\phi$ (among many other things), we can control the number of non-participating source symbols by controlling $\bar\phi$. If $P(\theta < 1)$ is forced below an appropriate threshold during the design process, then we can guarantee that the average number of non-participating source symbols is below $K$ times that threshold.

3.4 Asymptotic Optimality and Fixed Point Stability Region

[Figure 3.4: Ripple progression under transmission range T; the region from which new hyperedges might enter the ripple is shown, with $v_3$, $v_8$ already identified.]

We now develop an analytic framework for code design. Additionally, we derive a necessary condition for code stability. At any iteration of the BP algorithm, the set of check equations that have exactly one variable node erased is said to constitute the ripple. To achieve maximal recovery we need to recover as many source symbols as possible before the ripple size hits zero. To assist in the design of code degree distributions that achieve maximal recovery, we outline an analytic framework for ripple analysis. We begin by describing a framework for finite transmission ranges, from which we derive an analytic expression for the infinite transmission range case. We model the ripple in the BP decoding process as a queue with a constant departure process, a time-dependent defect process, and an arrival process whose arrival rate is a function of the transmission range. In this general model, if we let the transmission range $T$ and the number of sensors $K$ tend to $\infty$, we can derive an asymptotic expression for ripple progression. These asymptotic results help us determine the nature of the expected solution, which is verified in section 3.5.2. Next, we determine an asymptotic necessary condition to ensure decay in the erasure rate at a fixed point in the decoding process.
The fixed point is chosen to be the point of maximal recovery, and the condition is termed the "fixed point stability condition". An optimal degree distribution must satisfy this fixed point stability condition.

3.4.1 Random walk based ripple analysis under topology constraint

To determine a model for the ripple in the decoding process we consider the following hypergraph representation: the codes formed under the network model described in section 3.2 with transmission range $T$ can be represented using a graph with check equations corresponding to hyperedges and vertices corresponding to the data nodes of the WSN. Since the check equations correspond to the encoded symbols received at the sink, the decoding process can be formulated as a vertex identifiability problem. Mapping the decoding process to the vertex identifiability problem was first carried out in [MS05] for $T = \infty$. An analytic expression corresponding to ripple evolution can now be determined using fluid limits of pure jump type Markov processes [Dar02]. Figure 3.4 corresponds to an example with four check equations, $\{v_2,v_7,v_8\}$, $\{v_8,v_9\}$, $\{v_9,v_3,v_{10}\}$, $\{v_{15},v_{16},v_{17}\}$, represented as hyperedges, and data nodes $v_3$, $v_8$ already identified (decoded). The next vertex that can be identified is $v_9$. The identification of $v_9$ introduces the possibility that a hyperedge defined within $2T$ distance of $v_9$ is left with a single non-identified vertex. The distance $2T$ follows from the fact that the hyperedge containing $v_9$ can cover a distance of $T$ and the affected hyperedge can cover an additional distance of $T$. The above area corresponds to a "region-of-interest" encapsulated by a circle of radius $2T$, centered around $v_9$ in Figure 3.4. Check equation(s) may enter the ripple from the region-of-interest. In Figure 3.4 this corresponds to the hyperedge $\{v_9, v_3, v_{10}\}$.
Similarly, there is a possibility that another hyperedge existed in the region-of-interest that had $v_9$ labeled as its only unidentified vertex. Such hyperedge(s) now defect from the ripple. In general, the processes entering and defecting the ripple are modeled as a Poisson and a binomial process respectively [DN05]. If we assume the number of vertices in the region-of-interest is $K^{(2T)}$, with $v^{(2T)}$ of them labeled as identified, and the probability that a hyperedge from the region-of-interest enters the ripple (when $v$ vertices have been identified in the overall graph) is $\Omega_v$, then the Poisson arrival process $A_v$ has parameter $(K^{(2T)} - v^{(2T)} - 1)\,\Omega_v$. Similarly, if we denote the size of the ripple by $X_v$, then the defecting binomial process $L_v$ has parameters $\big(X_v,\, 1/(K^{(2T)} - v^{(2T)})\big)$. The progression of the ripple can now be described by the following queue based equation:

$$X_{v+1} = X_v + A_v - L_v - 1 \tag{3.8}$$

It is important to note that the decoding process dies out as soon as the number of elements in the ripple falls to zero.

Let us now consider the determination of the arrival rate parameter $\Omega_v$ under the network constraint. Let us define $r$ as the ratio of the number of encoded symbols received at the sink to the total number of source symbols:

$$r = \frac{M\,(1 - P_{erasure})}{K} \tag{3.9}$$

where $M$ is the number of parity symbols generated by the network. $M$ is typically determined based on the energy constraint within the WSN, while $P_{erasure}$ is a function of the packet drops resulting from congestion/corruption within the network. A hyperedge cannot contain a pair of vertices that are more than $2T$ apart. Let us denote the hyperedge degree probability distribution (where degree represents the number of vertices in a hyperedge) within the region-of-interest by $\bar\phi^{(2T)} = \{\phi_1^{(2T)}, \phi_2^{(2T)}, \ldots, \phi_q^{(2T)}\}$; then:

$$\Omega_v = rK^{(2T)} \sum_{d=2}^{\min(q,\,v^{(2T)}+2)} \phi_d^{(2T)}\, \frac{\binom{v^{(2T)}}{d-2}}{\Gamma_d^{(2T)}} \tag{3.10}$$

where $rK^{(2T)}$ corresponds to the expected number of hyperedges in the region-of-interest.
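The queue of equation (3.8) is easy to simulate. The sketch below is a Monte-Carlo illustration for the infinite-transmission-range case only (region-of-interest = whole graph), using the asymptotic arrival rate $\Omega_v \approx r\phi''(t)/(K-1)$ derived later in this section; the Knuth Poisson sampler and all toy parameters are assumptions:

```python
import math
import random

def poisson(rng, lam):
    """Knuth's Poisson sampler; adequate for the small rates used here."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def simulate_ripple(K, r, phi, seed=1):
    """Monte-Carlo sketch of the ripple queue of equation (3.8) for T = infinity.
    phi[d-1] is the fraction of parity symbols of degree d."""
    rng = random.Random(seed)
    X = max(1, round(r * K * phi[0]))      # degree-1 symbols seed the ripple
    v = 0
    while X > 0 and v < K - 1:
        t = v / K
        phi_dd = sum(d * (d - 1) * p * t ** (d - 2)
                     for d, p in enumerate(phi, start=1) if d >= 2)
        arrivals = poisson(rng, (K - v - 1) * r * phi_dd / (K - 1))
        defects = sum(rng.random() < 1.0 / (K - v) for _ in range(X))
        X = X + arrivals - defects - 1
        v += 1
    return v / K   # fraction of sources identified before the ripple died

print(simulate_ripple(K=200, r=1.2, phi=[0.1, 0.9]))
```

The decoding stalls exactly when `X` hits zero, mirroring the statement after equation (3.8).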
Here $\phi_d^{(2T)}$ is the ratio of the number of hyperedges with degree $d$ to the total number of hyperedges in the region-of-interest, $\binom{v^{(2T)}}{d-2}$ represents the total number of unique hyperedges possible with degree $(d-2)$ when the vertices are chosen from the already identified vertex set within the region-of-interest, and $\Gamma_d^{(2T)}$ corresponds to the total number of unique hyperedges possible with degree $d$ when the vertices are chosen from the entire region-of-interest.

We can identify subregions within the region-of-interest as the areas centered around the vertices with radius $T$ (these correspond to the different feasible mixing subsets). If we label these sub-regions as $y_1^{(2T)}, y_2^{(2T)}, y_3^{(2T)}, \ldots, y_{K^{(2T)}}^{(2T)}$, then using the principle of inclusion-exclusion [Cam94]:

$$\Gamma_d^{(2T)} = \sum_{i}\binom{|y_i^{(2T)}|}{d} \;-\; \sum_{i<j}\binom{|y_i^{(2T)} \cap y_j^{(2T)}|}{d} \;+\; \sum_{i<j<k}\binom{|y_i^{(2T)} \cap y_j^{(2T)} \cap y_k^{(2T)}|}{d} \;-\; \cdots \tag{3.11}$$

For infinite transmission range $T = \infty$, the region-of-interest corresponds to the entire graph and we have the following simplifications:

$$v^{(2T)} = v, \qquad K^{(2T)} = K, \qquad \phi_d^{(2T)} = \phi_d, \qquad \Gamma_d^{(2T)} = \binom{K}{d}$$

Similarly,

$$\Omega_v = rK\sum_{d=2}^{v+2}\phi_d\,\frac{\binom{v}{d-2}}{\binom{K}{d}} \;\approx\; rK\sum_{d=2}^{v+2}\phi_d\,\frac{d(d-1)}{(K-1)K}\left(\frac{v}{K}\right)^{d-2}$$

If we replace $v/K = t$ and let $\phi(t) = \phi_1 t + \phi_2 t^2 + \cdots + \phi_q t^q$, then:

$$\Omega_v = \frac{r}{K-1}\,\frac{d^2\phi(t)}{dt^2} = \frac{r\,\phi''(t)}{K-1} \tag{3.12}$$

On using the above substitutions and taking the fluid limit as $K \to \infty$, equation (3.8) becomes:

$$\frac{dx}{dt} = -\frac{x}{1-t} + (1-t)\,r\,\phi''(t) - 1 \tag{3.13}$$

Solving the above equation gives:

$$x(t) = (1-t)\big(r\,\phi'(t) + \ln(1-t)\big) \tag{3.14}$$

Equation (3.14) for infinite transmission range was also derived in [Haj06a] and in [DN05], Thm. 2.1. We restate Thm. 2.1 of [DN05] below. Note that "source symbols" and "data symbols" are equivalent expressions and are used interchangeably.

Theorem 1.
Let $r \in \mathbb{R}$, $r > 0$, and define

$$s(r,\bar\phi) = \inf\big\{t \in [0,1) : r\phi'(t) + \ln(1-t) \le 0\big\} \wedge 1$$

so that $r\phi'(t) + \ln(1-t) > 0$ for $0 \le t < s(r,\bar\phi)$; the $\wedge$ above indicates that $s(r,\bar\phi)$ takes on the value of the infimum if it exists and the value 1 otherwise. Now, if $K \to \infty$ and the number of received parity bits $\sim \mathrm{Poisson}(rK)$, then $\delta \to s(r,\bar\phi)$, where $\delta$ represents the average fraction of source symbols recovered as defined in equation (3.1).

Proof. For a complete proof please refer to [DN05]. □

In [San07] a further analysis of Theorem 1 was carried out to determine optimal degree distributions for values of $r < 1$. The relevant results are summarized below, so that they may be compared to the results obtained in section 3.5.2 for more realistic settings of the network degree distribution $\bar\Delta$:

- For $0 < \delta \le 1/2 \Leftrightarrow 0 \le r \le \ln 2$, only degree 1 symbols are required at the receiver, implying that only uncoded data need be transmitted from the source. Since our problem formulation has a fixed $M$, this corresponds to network conditions where the erasure rate satisfies $1 - (K\ln 2)/M \le P_{erasure} \le 1$.
- For $1/2 < \delta \le 2/3 \Leftrightarrow \ln 2 \le r \le (3/4)\ln 3$, all encoded symbols must be of degree 2. This translates to an erasure rate in the range $1 - (3K\ln 3)/(4M) \le P_{erasure} < 1 - (K\ln 2)/M$.
- For $\delta > 2/3$ (i.e. $P_{erasure} < 1 - (3K\ln 3)/(4M)$) no analytic solution exists and the optimal degree distribution is determined numerically.

The above results imply that the optimal $(\bar\phi^*, \bar\theta^*)$ changes based on $r$. We show that this is indeed the case in section 3.5.2. Interestingly, in [Lub02] Luby has shown that digital fountain LT codes achieve $\delta = 1-\epsilon$ if $r = 1 + O\big(\ln^2(K/\epsilon)/\sqrt{K}\big)$, where $\epsilon$ is a small positive constant.

[Figure 3.5: Fitted approximation through average values obtained for $\hat\Theta_d^{(2T)}$, where $d = 2,3,4,5,6$ and (a) T = 6 (b) T = 7 (c) T = 8.]

The expression outlined in equation (3.11) is difficult to compute for increasingly large values of $K$. Moreover, the random placement of network nodes makes the expression a random variable for finite transmission ranges. The expression can be approximated using:

$$\Gamma_d^{(2T)} \approx \hat\Gamma_d^{(2T)} = \hat\Theta_d^{(2T)}\,\frac{\min\big(\pi(2T)^2,\,B^2\big)}{B^2} \tag{3.15}$$

where $\hat\Theta_d^{(2T)}$ is an estimate of the average number of hyperedges of degree $d$ in the entire WSN when the vertices are placed randomly in a grid of size $B \times B$. The second term in the expression above represents the fractional area covered by each sensor's region-of-interest within the network. We compute the average value of $\hat\Theta_d^{(2T)}$ for small values of $K = 100, 200, \ldots, 1000$ and fixed transmission ranges $T = 6, 7, 8$. The average is obtained over 1000 randomly generated instances of hypergraphs. The three-parameter approximation listed below is then fitted through the obtained values:

$$\hat\Theta_d^{(2T)} = f_0 \cdot K^{\left(f_1 \cdot K^{f_2}\right)} \tag{3.16}$$

where $f_0$, $f_1$ and $f_2$ represent parameters with ranges $(0,1]$, $[1,10]$ and $(0,1]$ respectively. Figure 3.5 shows the results of the approximation for the different transmission ranges.
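The asymptotic recovery fraction $s(r,\bar\phi)$ of Theorem 1 can be computed numerically by scanning $t$ for the first zero crossing of $r\phi'(t)+\ln(1-t)$. A minimal sketch (the grid resolution is an arbitrary assumption):

```python
import math

def phi_prime(phi, t):
    """Derivative of phi(t) = phi_1*t + phi_2*t^2 + ... evaluated at t."""
    return sum(d * p * t ** (d - 1) for d, p in enumerate(phi, start=1))

def recovery_fraction(r, phi, grid=100000):
    """s(r, phi) from Theorem 1: the first t in [0,1) where
    r*phi'(t) + ln(1-t) drops to 0 or below; 1 if it never does."""
    for i in range(1, grid):
        t = i / grid
        if r * phi_prime(phi, t) + math.log(1 - t) <= 0:
            return t
    return 1.0

# All degree-1 symbols with r = ln(2): recovers half the sources,
# matching the first case of the [San07] summary above.
print(recovery_fraction(math.log(2), [1.0]))
```

With all degree-2 symbols (`phi = [0.0, 1.0]`) and $r = 1$, the same routine returns roughly 0.80, which is the kind of numeric check used when no closed-form solution exists.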
The values obtained for the parameters when least-squares fitting was employed are listed in Table 3.2. Although an approximate expression can be obtained for the combinatorial expression in equation (3.11), solving equation (3.8) to obtain a closed form expression, as for the infinite case, remains a non-trivial problem. We therefore derive optimal solutions for finite transmission range using the evolutionary optimization techniques outlined in section 3.5.1.

Table 3.2: Parameter values obtained when curve fitting $\hat\Theta_d^{(2T)}$

$\hat\Theta_d^{(2T)}$, (T = 6)
Degree (d)   f0               f1         f2
2            2.330441e-3      2.267433   1.718959e-15
3            8.763342e-6      3.334063   3.937270e-15
4            7.883035e-8      4.025110   6.459716e-3
5            6.721843e-10     4.672583   1.115611e-2
6            6.649685e-11     4.626487   2.495819e-2

$\hat\Theta_d^{(2T)}$, (T = 7)
Degree (d)   f0               f1         f2
2            4.430364e-3      2.225912   7.873255e-16
3            1.996168e-5      3.315226   1.681690e-15
4            1.053616e-6      3.561542   1.611865e-2
5            2.209213e-10     5.400098   6.186011e-15
6            4.654027e-8      3.328975   5.260140e-2

$\hat\Theta_d^{(2T)}$, (T = 8)
Degree (d)   f0               f1         f2
2            7.595902e-3      2.190470   1.684924e-15
3            4.357461e-5      3.286796   1.780235e-15
4            2.004659e-7      4.346052   1.779443e-15
5            8.585198e-10     5.307566   1.717933e-3
6            7.830507e-6      2.436711   8.056136e-2

3.4.2 Density evolution based fixed point stability analysis for maximal partial recovery

When BP decoding is carried out over the set of received symbols at the sink, the number of source symbols recovered increases with the number of BP iterations. Consequently, the effective code erasure rate keeps dropping with every BP iteration. This drop in erasure rate continues until the ripple size reaches zero and no further recovery is possible. The final effective erasure rate for the code is then referred to as the point of maximal recovery.
We declare a code degree distribution to be "stable" at a fixed point of the BP decoding process if the erasure rate at that fixed point does not grow. In this section we derive a necessary condition for stability when the fixed point is chosen to be the point of maximal recovery. To determine the stability condition we make use of density evolution analysis of BP decoding. More specifically, equations (2.13) and (2.15) are used in determining the stability condition.

[Figure 3.6: Network code to belief network after erasures.]

The encoded symbol stream generated using Algorithm 1 in section 3.3 can be represented by the leftmost bipartite graph in Figure 3.6. This bipartite graph corresponds to the XOR operations of NC and was first described in section 2.1. The bipartite graph has $M$ nodes on its right side corresponding to the number of encoded symbols. Only a part of the encoded stream arrives at the sink due to losses within the relay network (leading to code erasures). If the erasure rate within the relay network is $P_{erasure}$, the total number of encoded symbols arriving at the sink is $M_r = (1 - P_{erasure})M$. If we remove the erased right nodes of the original bipartite graph and the edges connected to them, we get the second bipartite graph shown in the middle of Figure 3.6. As outlined in section 2.1, we can reformulate these NC operations into zero-syndrome check equations and obtain the rightmost bipartite belief network shown in Figure 3.6. This graph has an equivalent variable and check degree distribution that we represent using $\lambda_r$ and $\rho_r$ respectively. The degree distribution vectors corresponding to $\lambda_r$ and $\rho_r$ can be obtained by replacing $M$ with $M_r$ in equation (2.3).
This belief network corresponds to an LDPC code that has undergone an effective erasure of $P^c_{erasure} = \frac{K}{K + M_r}$. When the BP algorithm is executed over this derived belief network, the erasure will reduce with each iteration until no further reduction in erasure is possible. As mentioned earlier in this section, this corresponds to the point of maximal recovery, and the effective erasure rate at this point is $\beta = (1-\delta)\,\frac{K}{K + M_r}$. For stability at this point we require that the slope at this fixed point be either negative or zero, as illustrated in Figure 3.7. We are now prepared to state the stability condition for any $(\bar\phi, \bar\theta, P_{erasure}) \equiv (\lambda, \rho, P_{erasure}) \equiv (\lambda_r, \rho_r)$, given that a fraction $\delta$ of source symbols is recovered on average.

Theorem 2. A $(\bar\phi, \bar\theta, P_{erasure}) \equiv (\lambda, \rho, P_{erasure}) \equiv (\lambda_r, \rho_r)$ that recovers an average fraction $\delta$ of source symbols is stable if

$$\lambda_r'\big(1 - \rho_r(1-\beta)\big)\,\rho_r'(1-\beta) \;\le\; \frac{1}{P^c_{erasure}}$$

Proof. Let us represent the effective erasure rate at iterations $t$ and $t+1$ with $P_e^t$ and $P_e^{t+1}$ respectively. For the binary erasure channel, $P_e^{t+1}$ (the probability of erasure in the messages from variable to check nodes after $t+1$ iterations) is said to be a fixed point of $(\lambda_r, \rho_r)$ if $P_e^{t+1} = P_e^t$. For stability we require that there be no fixed points for $P_e^t \in (\beta,\, P^c_{erasure})$. This ensures the BP decoder does not get stuck at a stable point on its way to recovering a fraction $\delta$ of the source symbols. We want the effective erasure in the codeword to reduce monotonically, i.e., $P_e^{t+1} < P_e^t < \cdots < P^c_{erasure}$. Using equations (2.13) and (2.15) we have:

$$P_e^{t+1} = P^c_{erasure}\,\lambda_r\big(1 - \rho_r(1 - P_e^t)\big) \le P_e^t$$

Let $P_e^t = x$; the above inequality can then be written as:

$$P^c_{erasure}\,\lambda_r\big(1 - \rho_r(1-x)\big) - x \le 0$$

Now let $h(x) = P^c_{erasure}\,\lambda_r\big(1 - \rho_r(1-x)\big) - x$; on linearizing the density evolution equation we obtain the stability condition for the fixed point $x = (1-\delta)\frac{K}{K+M_r} = \beta$. The condition is illustrated in Figure 3.7.
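The condition of Theorem 2 is easy to check numerically for a candidate ensemble. In the sketch below the polynomial coefficient convention (`c[i]` multiplies $x^i$) and the example distributions are assumptions chosen purely for illustration:

```python
def poly(c, x):
    """Evaluate a polynomial with coefficient list c (c[i] multiplies x**i)."""
    return sum(ci * x ** i for i, ci in enumerate(c))

def dpoly(c, x):
    """Derivative of the same polynomial at x."""
    return sum(i * ci * x ** (i - 1) for i, ci in enumerate(c) if i > 0)

def is_stable(lam, rho, K, M_r, delta):
    """Check the fixed-point stability condition of Theorem 2:
    lambda_r'(1 - rho_r(1 - beta)) * rho_r'(1 - beta) <= 1 / P^c_erasure."""
    p_c = K / (K + M_r)          # effective erasure of the belief network
    beta = (1 - delta) * p_c     # erasure rate at the point of maximal recovery
    lhs = dpoly(lam, 1 - poly(rho, 1 - beta)) * dpoly(rho, 1 - beta)
    return lhs <= 1 / p_c

# Hypothetical distributions: lambda(x) = x, rho(x) = 0.5x + 0.5x^2
print(is_stable([0, 1.0], [0, 0.5, 0.5], K=1000, M_r=1200, delta=0.8))  # True
```

Replacing $\rho$ with the much denser $\rho(x) = x^6$ in the same call violates the bound, illustrating how overly heavy check degrees destabilize the fixed point.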
We therefore get:

$$h'(\beta) = P^c_{erasure}\,\lambda_r'\big(1 - \rho_r(1-\beta)\big)\,\rho_r'(1-\beta) - 1 \le 0$$

$$\lambda_r'\big(1 - \rho_r(1-\beta)\big)\,\rho_r'(1-\beta) \;\le\; \frac{1}{P^c_{erasure}} \tag{3.17}$$

[Figure 3.7: Monotonically decreasing erasure in the BPA; the slope at the fixed point must be negative or zero.]

This condition can be used to verify the asymptotic fixed point stability of the optimal degree distributions $(\bar\phi^*, \bar\theta^*)$. We can substitute the data degree distribution approximation of equation (3.6) into the stability condition (3.17) to verify code ensemble stability. Note that a similar stability analysis was carried out in [Sho02] for full recovery in LDPC codes; the above stability condition can be viewed as a generalization of the expression in [Sho02].

3.5 Numerical Optimization and experimental results

3.5.1 Optimization - Differential Evolution and Network Constraints

We now determine good LDPC codes that can be mapped to the network to recover, on average, a fraction $\delta$ of the original source symbols. $\delta$ is obtained by calculating the mean code performance over 5 instances of the WSN graph with 100 runs per graph. The optimal degree distributions $(\bar\phi^*, \bar\theta^*)$ are obtained numerically. The number of sensor nodes and the amount of parity generated by the WSN are fixed at $K = 1000$ and $M = 1500$ respectively. The sensors are randomly placed on a grid of size $100 \times 100$. For fixed $(K, M)$ we can replace $P_{erasure}$ with $r$, as seen in equation (3.9). We therefore determine optimal solutions for $r = 0.1, 0.2, 0.3, \ldots, 1.5$. We use differential evolution [SP97], a heuristic algorithm that searches for the global optimum, to determine a solution for the problem formulated in section 3.1. Infeasible solutions are avoided by using penalty functions identical to those defined in [SS00]. The initial population size for the differential evolution algorithm is set to $10q$; it includes distributions that are shifted unit impulses located at every feasible degree.
The rest of the population is determined randomly. We set the crossover parameter for the algorithm to $CR = 1$. For the next generation $G+1$, the $i$-th mutated candidate member is determined using:

$$p_{i,G+1} = p_{best,G} + 0.5\,\big(p_{r_1,G} + p_{r_2,G} - p_{r_3,G} - p_{r_4,G}\big) \tag{3.18}$$

where $r_1, r_2, r_3, r_4$ are chosen randomly. The candidate member replaces the co-located member of the previous generation if the candidate member performs better than it. Although optimization was carried out for maximum parity degrees $q = 6, 9, 12$, we only present results for $q = 6$ for brevity.

Table 3.3: Data and parity node degree distributions for "erasure only"

r     φ1       φ2       φ3       φ4       φ5       φ6
Transmission Range T = ∞
0.1   1.00000  0.00000  0.00000  0.00000  0.00000  0.00000
0.2   0.99532  0.00222  0.00021  0.00132  0.00065  0.00028
0.3   0.99789  0.00093  0.00048  0.00028  0.00039  0.00004
0.4   0.99820  0.00103  0.00042  0.00024  0.00005  0.00005
0.5   0.99834  0.00016  0.00057  0.00025  0.00021  0.00047
0.6   0.99886  0.00007  0.00018  0.00019  0.00029  0.00041
0.7   0.99270  0.00451  0.00059  0.00024  0.00068  0.00128
0.8   0.00895  0.98553  0.00177  0.00189  0.00187  0.00000
0.9   0.03364  0.56008  0.40481  0.00021  0.00113  0.00013
1.0   0.07743  0.39603  0.33428  0.10332  0.06144  0.02750
1.1   0.04909  0.56694  0.02262  0.01039  0.00720  0.34374
1.2   0.05787  0.49845  0.00287  0.00358  0.00996  0.42727
1.3   0.07847  0.40847  0.00484  0.00018  0.00545  0.50259
1.4   0.09098  0.35235  0.00107  0.00281  0.01148  0.54131
1.5   0.12074  0.26622  0.00700  0.02165  0.01275  0.57164
Transmission Range T = 8
0.1   0.99452  0.00204  0.00078  0.00027  0.00063  0.00176
0.2   0.99574  0.00045  0.00001  0.00193  0.00087  0.00100
0.3   0.99723  0.00030  0.00098  0.00057  0.00007  0.00085
0.4   0.99865  0.00003  0.00020  0.00046  0.00060  0.00006
0.5   0.99635  0.00175  0.00022  0.00071  0.00017  0.00081
0.6   0.99394  0.00373  0.00136  0.00071  0.00024  0.00003
0.7   0.97245  0.02023  0.00258  0.00276  0.00193  0.00006
0.8   0.22676  0.76302  0.00901  0.00085  0.00024  0.00013
0.9   0.02797  0.95898  0.00338  0.00818  0.00147  0.00003
1.0   0.01255  0.95725  0.01000  0.01593  0.00219  0.00208
1.1   0.03079  0.64983  0.30100  0.00575  0.00766  0.00497
1.2   0.13766  0.00021  0.83939  0.00310  0.01422  0.00542
1.3   0.05997  0.38621  0.02717  0.51057  0.00446  0.01162
1.4   0.17180  0.07060  0.01096  0.72104  0.00118  0.02442
1.5   0.25350  0.03626  0.05564  0.00680  0.64376  0.00404
Transmission Range T = 7
0.1   0.99124  0.00015  0.00407  0.00241  0.00068  0.00145
0.2   0.99079  0.00116  0.00212  0.00336  0.00200  0.00056
0.3   0.99374  0.00016  0.00006  0.00274  0.00226  0.00103
0.4   0.99206  0.00562  0.00092  0.00068  0.00021  0.00051
0.5   0.99437  0.00110  0.00056  0.00086  0.00262  0.00049
0.6   0.99903  0.00005  0.00022  0.00025  0.00004  0.00040
0.7   0.88835  0.10444  0.00289  0.00022  0.00408  0.00004
0.8   0.40647  0.57289  0.01354  0.00020  0.00153  0.00537
0.9   0.10168  0.89601  0.00011  0.00055  0.00125  0.00041
1.0   0.03212  0.94759  0.01504  0.00347  0.00031  0.00146
1.1   0.01287  0.90708  0.00574  0.01728  0.02470  0.03234
1.2   0.02656  0.77373  0.00002  0.00012  0.19535  0.00423
1.3   0.00479  0.72859  0.00548  0.00456  0.00113  0.25545
1.4   0.01727  0.60776  0.00792  0.03589  0.00065  0.33052
1.5   0.01322  0.60780  0.00532  0.00252  0.00625  0.36489
Transmission Range T = 6
0.1   0.99020  0.00107  0.00043  0.00620  0.00130  0.00080
0.2   0.99637  0.00182  0.00029  0.00028  0.00119  0.00005
0.3   0.99775  0.00020  0.00004  0.00056  0.00028  0.00117
0.4   0.99813  0.00034  0.00021  0.00016  0.00014  0.00104
0.5   0.99518  0.00009  0.00218  0.00009  0.00023  0.00223
0.6   0.99415  0.00370  0.00198  0.00007  0.00004  0.00006
0.7   0.98300  0.01585  0.00022  0.00042  0.00038  0.00013
0.8   0.45331  0.54046  0.00466  0.00015  0.00005  0.00139
0.9   0.23584  0.75475  0.00172  0.00239  0.00497  0.00033
1.0   0.06857  0.92873  0.00125  0.00000  0.00029  0.00118
1.1   0.04924  0.90482  0.00585  0.01053  0.01863  0.01092
1.2   0.02954  0.83427  0.00252  0.01358  0.00008  0.12001
1.3   0.03895  0.76327  0.00407  0.01581  0.16413  0.01377
1.4   0.02879  0.67637  0.00257  0.00326  0.00768  0.28134
1.5   0.02329  0.65324  0.00577  0.00608  0.00001  0.31161

Over a period of time the energy reserves in the sensors get depleted. The sensors in a WSN then automatically start conserving energy by reducing their transmission power and therefore their transmission range. This reduction in transmission range changes the underlying network connectivity and hence the optimal $(\bar\phi^*, \bar\theta^*)$. To simulate this, in addition to varying $r$, we set the transmission range of the sensors to $T = 6, 7, 8, \infty$ and determine the corresponding optimal $(\bar\phi^*, \bar\theta^*)$. The final choice of $T = \infty$ corresponds to a complete graph similar to that considered in [San07]. To summarize, we vary $(r, T)$ and determine the $(\bar\phi^*, \bar\theta^*)$ that gives the maximum $\delta$.

3.5.2 Experimental Results

In our simulations we fix $K$; this implies that a solution obtained for a single $r$ is applicable to a wide range of energy constraints and erasure rates within the WSN. Since we assume single-hop mixing (parity formation) of data, the transmission range of each network node places a restriction on which subset of sensed data can participate in the formation of a check equation. Figure 3.8 shows the target and a representative achieved degree distribution using the DCON algorithm for the designed code distributions listed in Table 3.3.

[Figure 3.8: Targeted and achieved parity degree distribution for T = 6, K = 1000 and r = 0.1, 0.2, ..., 1.5.]

As can be seen, the achieved degree distributions are quite close to the desired ones after execution of DCON. We now evaluate the performance of the designed mixing rules under different transmission ranges. Next we compare the performance of these designed mixing rules with RNC and growth codes. The performance of RNC, growth codes and the designed codes is then evaluated over IEEE 802.15.4 traces to further verify the validity of the obtained simulation results. For brevity we only present figures corresponding to trace-based evaluation of the designed codes.
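The differential evolution step of equation (3.18) can be sketched compactly. The objective below is a toy stand-in for the decoded-recovery objective, and the clip-and-renormalize step is a simple surrogate for the penalty functions cited above; everything here is illustrative:

```python
import random

def de_step(population, fitness, scale=0.5):
    """One generation following equation (3.18):
    candidate = best + 0.5*(p_r1 + p_r2 - p_r3 - p_r4), greedy replacement.
    Candidates are clipped and renormalized so they stay degree distributions."""
    best = max(population, key=fitness)
    next_gen = []
    for member in population:
        r1, r2, r3, r4 = random.sample(range(len(population)), 4)
        cand = [best[j] + scale * (population[r1][j] + population[r2][j]
                                   - population[r3][j] - population[r4][j])
                for j in range(len(member))]
        cand = [max(c, 0.0) for c in cand]       # feasibility surrogate
        total = sum(cand) or 1.0
        cand = [c / total for c in cand]
        next_gen.append(cand if fitness(cand) > fitness(member) else member)
    return next_gen

# Toy objective: reward mass on degree 2 (stand-in for decoded recovery)
random.seed(0)
pop = [[random.random() for _ in range(6)] for _ in range(20)]
pop = [[x / sum(p) for x in p] for p in pop]
start = max(p[1] for p in pop)
for _ in range(30):
    pop = de_step(pop, fitness=lambda p: p[1])
print(max(p[1] for p in pop) >= start)   # greedy replacement never regresses
```

Greedy replacement makes the best fitness monotone in the generation count, which is why the population can be evolved for a fixed budget of generations per $(r, T)$ point.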
Finally, we perform a sensitivity analysis for the designed mixing rules.

3.5.2.1 Performance evaluation of designed mixing rules

[Figure 3.9: (a) Recovery (δ) as the fraction of received symbols (r) increases, for T = ∞, 8, 7, 6; (b) increase in average parity degree with increasing recovery (δ).]

The data recovery profile shown in Figure 3.9a for different transmission ranges demonstrates that the loss in performance due to localized single-hop mixing (T = 6, 7, 8) is small compared to mixing across the entire network with multiple hops (T = ∞). Note that the non-smoothness in the plotted curves is due to the fact that solutions were obtained for a discrete set of points corresponding to r = 0.1, 0.2, 0.3, ..., 1.5 and the curve was plotted using linear interpolation. The true data recovery profile would be a smooth curve passing through the obtained points.

[Figure 3.10: Change in the degree one (φ1) and degree two (φ2) fractions as r varies, for transmission ranges (a) T = ∞ (b) T = 8 (c) T = 7 (d) T = 6.]

A quick look at Table 3.3 shows that the degree distribution that must be used for the same number of received symbols differs across transmission ranges. It is important to note that as the underlying network becomes more sparse, i.e., the average neighborhood size of each node becomes small (compared to the degree of the parity to be formed), multi-hop mixing needs to be deployed, as single-hop mixing will fail to generate the required fraction of higher parity degrees. Therefore, for the proposed algorithm to work well in practice, we need the underlying sensor field graph to be connected.
The underlying sensor field graph will remain connected with high probability if $T \ge \sqrt{\log(K)/(K\pi)}$, as was first determined in [GK00].

In Table 3.3 we can see that the trend in the optimal degree distribution for $T = \infty$ is similar to that predicted by the asymptotic analysis of section 3.4.1 up to $r = (3/4)\ln 3 \approx 0.8$. However, there are some significant differences for the finite transmission range cases. The asymptotic analysis predicts a sharp change in the degree distribution, from all degree one symbols to all degree two symbols, at $r = \ln 2$. However, for $T = 6, 7, 8$ the onset of the change is slightly earlier and lasts longer, as can be seen in Figure 3.10. The lower the transmission range, the slower the fall in the fraction of degree one nodes. This is essential to prevent the BP decoding ripple defined in section 3.4 from hitting zero too quickly. This gradual fall in degree one nodes results in a smaller average degree for finite transmission ranges for $\delta > 0.6$ (Figure 3.9b). The role of degree two nodes is to tie together decoded source symbols with undecoded source symbols. Other degrees are introduced at appropriate values of $r$ to reduce the probability of the ripple diminishing to zero.

3.5.2.2 Comparison with RNC

Comparing the performance of the designed degree distributions against RNC leads to some interesting observations. We evaluate RNC along with GE decoding for the following parity degree distributions:

Description   φ(t)
RNC deg: 1    t
RNC deg: 2    t^2
RNC deg: 3    t^3
RNC deg: 4    t^4
RNC deg: 5    t^5
RNC deg: 6    t^6

[Figure 3.11: Recovery profile plot of "Designed network code (D)", "RNC for different levels of mixing (t1, t2, t3, t4, t5, t6)" and "Growth codes (G)", given a transmission range (a) T = ∞ (b) T = 8 (c) T = 7 (d) T = 6.]

The network model for RNC is assumed to be the same as described in section 3.2, used in conjunction with code generation Algorithm 1. For a fair comparison, all RNC operations are restricted to the binary domain. The average recovery performance of the different RNC degree distributions under varying transmission ranges is plotted in Figure 3.11. As can be seen, for $T = \infty, 8, 7$ the designed degree distributions outlined in Table 3.3 outperform RNC when $r < 1$. RNC gives considerably superior performance when the desired $\delta = 1$; however, this comes at the cost of the significantly higher complexity of the GE algorithm. In [KMR08],[LRCK07] it was shown that executing GE decoding after the BP decoding stage, in conjunction with the introduction of a few dense rows, can achieve performance identical to RNC at lower complexity.

Choosing $T = 6$ results in several instances of the network graph that have two or more disconnected components. As a result, RNC performs better than the designed degree distribution. Graphs with disconnected components have independent ripples corresponding to the individual graph components. If the ripple corresponding to a graph component hits zero, no further data can be recovered from that component. Designing a single degree distribution for these graphs results in suboptimal performance. Interested readers can refer to [SKMR08], which investigates this problem and derives expressions for the overall ripple progression when there are two graph components.
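GE decoding of binary RNC amounts to Gaussian elimination over GF(2). A compact bitmask sketch (illustrative only; real decoders work over packet payloads rather than single bits):

```python
def gf2_solve(rows):
    """Gaussian elimination over GF(2). Each row is (mask, value): bit j of
    mask marks source symbol j appearing in that XOR equation. Returns the
    symbols whose values are fully determined, as {index: bit}."""
    pivots = {}                            # pivot column -> (mask, value)
    for mask, val in rows:
        while mask:
            col = mask.bit_length() - 1    # highest remaining column
            if col not in pivots:
                pivots[col] = (mask, val)
                break
            pmask, pval = pivots[col]
            mask, val = mask ^ pmask, val ^ pval
    for col in sorted(pivots, reverse=True):       # back-substitution
        mask, val = pivots[col]
        for c in list(pivots):
            if c != col and (pivots[c][0] >> col) & 1:
                pivots[c] = (pivots[c][0] ^ mask, pivots[c][1] ^ val)
    return {c: v for c, (m, v) in pivots.items() if m == (1 << c)}

# Three received XOR equations over sources x0, x1, x2:
# x0^x1 = 1, x1 = 1, x0^x2 = 0  =>  x0 = 0, x1 = 1, x2 = 0
print(gf2_solve([(0b011, 1), (0b010, 1), (0b101, 0)]))
```

The elimination cost grows cubically with the number of symbols, which is the complexity penalty relative to BP discussed above; BP followed by GE over the few remaining unknowns keeps the cubic term small.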
3.5.2.3 Comparison with Growth Codes

We now compare the performance of our designed approach with the zero-configuration growth codes approach. Growth codes were designed under the assumption that no network topological information is available and were targeted for maximum data persistence. The growth codes algorithm assumes that the sensed information is disseminated across the network on an epoch-by-epoch basis. During the first epoch the raw data is disseminated across the network. In the next epoch the sensors start combining the information in their buffers with what they receive from adjacent nodes to form degree-two symbols. In each following epoch the degree of the symbols is increased by one. This process is repeated till a pre-determined number of epochs is reached. The length of each epoch is determined by performing at-sink analysis for maximal data recovery. The analysis for a BP decoder for growth codes in [KFMR06] led to the following set of conditions:

- To recover R_1 = (K − 1)/2 source symbols, the expected number of encoded symbols required is κ_1 = Σ_{i=0}^{R_1−1} K/(K − i) (Theorem 5.7 in [KFMR06]).

- To recover no more than R_d = (dK − 1)/(d + 1) source symbols, the optimal degree distribution has symbols of degree no larger than d (Theorem 5.8 in [KFMR06]).

- To recover R_d = (dK − 1)/(d + 1) source symbols, at most κ_d symbols are required in expectation, where κ_d is obtained from κ_{d−1} by adding the expected number of additional degree-d symbols needed to take the decoder from R_{d−1} to R_d recovered symbols (Theorem 5.9 in [KFMR06]).

Based on the above conditions, the resulting growth codes degree distribution as obtained from [KFMR06] is:

a_d = max(0, min((κ_d − κ_{d−1})/K, (K − κ_{d−1})/K))    (3.19)

In growth codes, the requirement that the degree of the encoded symbols generated per epoch increases by one restricts the overall growth codes degree distribution that can be achieved, and therefore puts it at a disadvantage when compared to our designed codes.
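Equation (3.19) is straightforward to evaluate once the κ_d are known. In the sketch below the κ_d values are purely hypothetical placeholders (the real ones come from the Theorem 5.9 recursion); they are chosen only to show the shape of the computation for K = 100.

```python
K = 100
# Hypothetical expected symbol counts; kappa[0] = 0 by convention and
# kappa[d] is the (assumed) count at which degree d stops being optimal.
kappa = [0, 45, 66, 80, 90]

def growth_fraction(d):
    """a_d per equation (3.19)."""
    return max(0.0, min((kappa[d] - kappa[d - 1]) / K,
                        (K - kappa[d - 1]) / K))

a = [growth_fraction(d) for d in range(1, len(kappa))]
print(a)  # [0.45, 0.21, 0.14, 0.1]
```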
The data recovery profile for the growth code degree distribution given by equation (3.19) is shown in Figure 3.9. As can be seen, the designed codes perform significantly better than growth codes for the range r > 0.7.

The data recovery for RNC, growth codes and the designed codes was evaluated on actual IEEE 802.15.4 wireless sensor network traces obtained in [IR08a]. The setup involved transmitting information using a commodity Crossbow MPR2400 MICAz mote [IEEb] operating in channel 26 in the 2.480 GHz band, and receiving the transmitted information using another MICAz mote mounted on a Crossbow MIB600 ethernet gateway [IEEa]. Further detail on the experimental setup used in collecting the traces can be found in [IR08a]. The traces were used to represent the erasures seen in the relay network. The average recovery performance of all the codes closely mirrored those seen in Figure 3.11. For brevity we only show the performance of the designed codes over the wireless traces in Figure 3.12.

Figure 3.12: Comparing the simulation-based recovery profile of designed network codes with the recovery profile over 802.15.4 wireless packet traces (legend: trace based vs. simulations; y-axis: fraction of source symbols recovered), for transmission range (a) T = ∞ (b) T = 8 (c) T = 7 (d) T = 6
Figure 3.12 (continued)

Figure 3.13: Loss in performance for (a) K = 500 (b) K = 1000 (node interconnection sensitivity) (c) K = 1500 (d) K = 2000 (legend: the matched transmission ranges T*)

3.5.2.4 Sensitivity analysis

The sensor nodes can use the degree distributions defined in Table 3.3 to choose the appropriate mixing rules (for maximal recovery) as P_erasure within the WSN changes. It is however critical to determine the sensitivity of this solution to changes in network size and node interconnections. We therefore determine the loss in performance for the designed mixing rules for a varying number of sensors K* within the network. To maintain a similar network degree distribution we determine new transmission ranges T* for the sensors by choosing T* such that

Var(δ_{K*,T*}) = Var(δ_{K,T})

where δ denotes the network node degree. The new transmission ranges T* are stated in the legend of Figure 3.13. We evaluate sensitivity by measuring the performance loss Δε = ε(δ_{K,T}) − ε(δ_{K*,T*}). The loss in performance is measured for K* = 500, 1000, 1500, 2000.

Node Interconnection Sensitivity: Figure 3.13b shows the loss in performance for 1000 sensors, but with node interconnections different from the design. We observe Δε ≤ 0.02, with the most significant loss occurring in the range 0.7 ≤ r ≤ 1.3. This shows that the degree distributions have some, but not significant, sensitivity to the actual node interconnections.

Sensor Field Size Sensitivity: Figures 3.13a, 3.13c and 3.13d illustrate that the loss in performance is greater for a fall in K than for a rise in K. In fact, the higher K is, the smaller the loss in performance.
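The variance-matching step above can be realized under a simple degree model. The sketch below assumes nodes on a unit-area torus, so that a node's degree is Binomial(K − 1, πT²); the closed-form Binomial variance then lets a bisection search find T* for a new field size K*. The model and the numeric parameters are illustrative assumptions, not the thesis's exact procedure.

```python
import math

def degree_var(K, T):
    """Variance of the node degree when degree ~ Binomial(K - 1, pi*T^2)."""
    p = math.pi * T * T
    return (K - 1) * p * (1 - p)

def match_range(K, T, K_star):
    """Bisect for T* with Var(degree; K*, T*) = Var(degree; K, T)."""
    target = degree_var(K, T)
    lo, hi = 0.0, math.sqrt(0.5 / math.pi)   # keep p = pi*T^2 in [0, 0.5]
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if degree_var(K_star, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

T_star = match_range(K=1000, T=0.08, K_star=500)
print(T_star)  # a larger range compensates for the sparser field
```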
This is consistent with results from channel coding, where a fall in codeword length typically leads to worse performance. Again, significant losses in performance occur typically for 0.7 ≤ r ≤ 1.3.

3.6 Conclusions

The channel coding formulation of NC can be used to design network codes resilient to packet drops (due to congestion/corruption/link failures). We have shown how this formulation enables the design of network codes tailored to the underlying network topology. We developed an analytic framework for ripple analysis under topological constraints and outlined simplifications that lead to asymptotic analytic expressions for the infinite transmission range case. Also, we derived the necessary condition to ensure the fixed-point stability of the designed mixing rules (code degree distribution). Note, we have achieved a significant complexity reduction when compared with RNC, since the use of the BP decoding algorithm for the designed degree distributions results in linear decoding complexity.

We have demonstrated that the WSN can attain maximal recovery by choosing the appropriate mixing rules (optimal degree distribution) after considering the prevailing network conditions. A distributed code-over-network mapping algorithm was developed that uses single-hop mixing to achieve the optimal degree distribution. It was shown that for networks with sufficient connectivity, single-hop mixing is adequate to achieve performance similar to mixing across the network with multiple hops. The designed mixing rules were found to outperform RNC and growth codes so long as the underlying WSN graph remains connected. The sensitivity of the designed solution to changes in node interconnections and sensor field size was evaluated, and the performance loss was found to be small.
Chapter 4

Minimal outage rate under both erasures and errors

In the preceding chapter we examined the design of network-constrained codes when the distortion suffered by the transmitted symbol stream is restricted to erasures. In this chapter we examine the code design problem when the symbol stream arrives at the sink with errors in addition to suffering erasures. The introduction of errors in the received stream precludes reliable partial data recovery, i.e. we either successfully recover the entire transmitted codeword or none of it. As a result, the target goal is to minimize the probability of BP decoding failure. This goal can be equivalently stated as minimizing the outage rate. The framework outlined to achieve this goal is called In-Network Processing based on channel code Design (INPoD).

We begin with a brief introduction and an outline of the problem's objective function in section 4.1. Next we consider an illustrative example which is used as an aid in describing the expected operating environment. Simplifications are outlined and used to obtain an equivalent channel model in section 4.2. This equivalent channel model is then used to determine an upper bound on the channel coding rate. We next define a homogeneous sensor network in section 4.3 and give an algorithm to construct the parity generator matrix on this sensor network. The parity generation algorithm is constrained by the underlying graph's connectivity. Section 4.4 analyzes the code construction algorithm and gives a lower bound on the probability of successfully achieving the target parity degree distribution. This lower bound is referred to as the code admissibility criterion. Since the proposed code construction algorithm is probabilistic in nature, it cannot guarantee that all data symbols participate in the code formation process.
We therefore determine an expression estimating the probability that the data bits to be communicated to the sink participate in the code formation process. We can lower bound this derived expression to ensure a higher participation probability for the data being transmitted. Based on the expressions derived in the prior sections, we formally state the problem along with the associated network constraints of admissibility and coverage in section 4.5. We then discuss two different approaches to determine solutions for the problem formulated and suggest heuristics for good initial guesses. The first approach uses sequential quadratic programming along with Density Evolution [RSU00] but has the tendency to get stuck in local minima. The second approach uses the higher-complexity heuristic genetic algorithm Differential Evolution [SS00] and does a better job of searching the parameter space at the cost of increased computational complexity. We compare the outage rates of network channel codes with the standard LT degree distribution [Lub02] to those of the designed degree distribution in section 4.6. We then discuss in detail the results presented and finally present our conclusions in section 4.7.

4.1 Introduction

INPoD is a step forward for the Code-on-Network-Graph (CNG) framework. Basic CNG maps the physical nodes within a network (e.g., a sensor network) to the variable nodes of a Low Density Parity Check (LDPC) code. Consequently, any data generated by these network nodes, when combined linearly, translates to the check equations used in LDPC codes. The linear combination of data generated by network nodes leads to in-network processing operations under the CNG framework. The broadcast nature of wireless channels in sensor networks makes a strong case for the use of the above network embedded coding approaches, since data communication between adjacent nodes is achieved at no additional cost.
The basic idea behind network embedded coding is to enable transmission of information via multiple paths (diversity) while making efficient use of the available bandwidth among the intermediate coding nodes. The work in [BL05], [HSOB05] addresses the problems associated with the introduction of errors into the network code stream. A drawback of the work done in these papers is that the proposed channel codes are designed with the assumption that the underlying sensor network is capable of supporting their mapping. However, in real world scenarios the underlying topology is often pre-determined by the placement of the sensors and limits the codes which can be mapped to the given topology. Consequently, we develop a joint network channel code design approach that performs linear combinations of generated data, or In-Network Processing based on channel code Design (INPoD). Under INPoD, the formation of the parity bits of the LDPC code is restricted by the finite communication range of the nodes. The network codes designed using the above approach are resilient to noise introduced within the network due to the natural robustness of (carefully chosen) LDPC codes to erasures and errors.

Since reliable partial data recovery is infeasible in the presence of errors, the objective of the code design process is to minimize the number of BP decoding failures. For our analysis we focus our attention on a fixed number of BP iterations at the sink, ℓ = 25. Therefore the objective function to be minimized for the determination of the optimal degree distribution (λ*, ρ*) can be expressed as:

O_INPoD = P_e^{ℓ=25}(λ, ρ)    (4.1)

where P_e^{ℓ=25}(λ, ρ) corresponds to the fraction of belief network edges (corresponding to a codeword with length tending to infinity and code degree distribution (λ, ρ)) carrying incorrect messages after 25 belief propagation iterations. The steps used in the determination of P_e(λ, ρ) using density evolution (DE) are listed in section 2.2.
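As a hedged illustration of what such a recursion looks like, the sketch below runs the much simpler erasure-channel density evolution, x_{l+1} = ε·λ(1 − ρ(1 − x_l)), for a (3,6)-regular ensemble; the mixed erasure-plus-noise channel of this chapter requires the full message-density version summarized in section 2.2.

```python
def de_bec(eps, lam, rho, iters=25):
    """Erasure-channel density evolution: x_{l+1} = eps * lam(1 - rho(1 - x_l)).
    lam and rho are edge-perspective degree distributions given as
    {degree: fraction of edges} dicts."""
    poly = lambda coeffs, x: sum(f * x ** (d - 1) for d, f in coeffs.items())
    x = eps
    for _ in range(iters):
        x = eps * poly(lam, 1 - poly(rho, 1 - x))
    return x

lam = {3: 1.0}   # all variable nodes have degree 3
rho = {6: 1.0}   # all check nodes have degree 6
print(de_bec(0.30, lam, rho))  # below the (3,6) BEC threshold: erasure rate collapses
print(de_bec(0.50, lam, rho))  # above the threshold: the recursion gets stuck
```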
The network-constrained parity bits corresponding to the designed codes are transmitted over a network where they suffer distortion before arriving at the intended destination. Let us now take a closer look at the network model used for the problem.

4.2 Network Model

Figure 4.1(a) illustrates with an example the network model employed in this chapter. The white nodes represent the sensor nodes collecting data and the black node represents a network of relay nodes which forward information to the sink. In the very first iteration of data transmission all the sensor nodes transmit their data to their neighboring nodes. These data bits are then XORed by the sensors using the rules defined in the generator matrix (constructed by the algorithm described in section 4.3). The XORing rules are selected based on the average neighborhood size of the sensor network graph, which is assumed to be known to the sensor nodes. The XORed bits are then transmitted to the sink via the relay nodes. Since only the XORed parity bits are transmitted to the sink, the data bits are effectively erased. The parity bits arriving at the sink can be corrupted by noise or dropped (e.g. due to congestion) while traversing the relay network towards the sink. The relay network can therefore be envisioned as a composite channel which introduces erasures as well as errors while forwarding the transmitted bits. Moreover, the noise corrupting the parity packets at the last hop (just before the sink) can be modeled as an additive Gaussian channel. The graphical representation of the overall network model is shown in the first part of Figure 4.1(b).
Figure 4.1: (a) Network model example (b) Equivalent channel representation

For analysis, we convert the last-hop Bi-AWGN channel (having channel noise with standard deviation σ_Bi-AWGN) into a binary symmetric channel (BSC) with error rate P_Bi-AWGN determined as follows:

P_Bi-AWGN = 0.5 [1 + erf(−(σ_Bi-AWGN √2)^{−1})]    (4.2)

The cascade of the two BSC channels corresponding to the relay and the last wireless step can be aggregated into a single Bi-AWGN channel with deviation σ (as shown in the second part of Figure 4.1(b)). The deviation σ is determined using:

σ = −[√2 erf^{−1}(2(P_BSC ∥ P_Bi-AWGN) − 1)]^{−1}    (4.3)

where P_BSC is the cross-over probability of the aggregated BSC relay channel and

(P_BSC ∥ P_Bi-AWGN) = P_BSC(1 − P_Bi-AWGN) + P_Bi-AWGN(1 − P_BSC)

The cross-over probability of each hop can be estimated [KPMR06, IR08b] and the overall cross-over probability P_BSC can be communicated to the sink by storing it in the packet header. The number of bits required by P_BSC is a function of the desired fidelity. We assume that the packet header information is protected by stronger codes and can be recovered at the sink. The aggregate Bi-AWGN channel's deviation σ can then be calculated using the header information and the deviation of the last-hop wireless link using equation (4.3). The overall channel as seen by the sink is a mixture of the erasure channel and the Bi-AWGN channel with worst-case erasure probability P_ERASURE and worst-case noise deviation σ.

4.2.1 Determining the code rate to be used for the mixed channel

The capacity of the mixed channel is determined by the following equation [CT91]:

C = C_ERASURE · C_Bi-AWGN    (4.4)

where C_Bi-AWGN and C_ERASURE are the capacities of the aggregate Bi-AWGN channel and the relay erasure channel respectively. We assume that the parity bits are transmitted through the relay channel using BPSK modulation and that the deviation of the Bi-AWGN is σ.
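The conversions in equations (4.2)-(4.4) can be checked numerically. The sketch below uses illustrative parameters only; `erfinv` is obtained from Python's `statistics.NormalDist` (erfinv(y) = Φ⁻¹((y+1)/2)/√2), and the sign in the inversion is taken so the deviation comes out positive for crossover probabilities below 1/2. The BSC capacity 1 − H2(p) is used for the Bi-AWGN part expressed through its equivalent crossover probability.

```python
import math
from statistics import NormalDist

def erfinv(y):
    # inverse error function via the standard normal quantile
    return NormalDist().inv_cdf((y + 1) / 2) / math.sqrt(2)

def sigma_to_p(sigma):
    """Eq (4.2): Bi-AWGN deviation -> BSC crossover probability."""
    return 0.5 * (1 + math.erf(-1 / (sigma * math.sqrt(2))))

def compose(p1, p2):
    """Cascade of two BSCs: net crossover p1(1 - p2) + p2(1 - p1)."""
    return p1 * (1 - p2) + p2 * (1 - p1)

def p_to_sigma(p):
    """Eq (4.3): aggregate crossover -> equivalent Bi-AWGN deviation."""
    return -1 / (math.sqrt(2) * erfinv(2 * p - 1))

def c_bsc(x):
    """BSC capacity 1 - H2(x)."""
    if x in (0.0, 1.0):
        return 1.0
    return 1.0 + x * math.log2(x) + (1 - x) * math.log2(1 - x)

p_relay, sigma_last = 0.02, 1.0              # illustrative channel parameters
p_agg = compose(p_relay, sigma_to_p(sigma_last))
sigma_agg = p_to_sigma(p_agg)
C_mixed = (1 - 0.1) * c_bsc(p_agg)           # eq (4.4) with P_ERASURE = 0.1
print(sigma_agg, C_mixed)
```

The aggregate deviation comes out larger than the last hop's alone, as expected, since the relay BSC adds errors on top of the wireless link.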
Then we have the following capacity expressions [CT91]:

C_Bi-AWGN = C_BSC(0.5 [1 + erf(−(σ√2)^{−1})])
C_ERASURE = 1 − P_ERASURE    (4.5)

where C_BSC(x) = 1 − (−x log2(x) − (1 − x) log2(1 − x)). The capacity of the mixed channel C, obtained by substituting the expressions in (4.5) into equation (4.4), must be greater than the information transmission rate K/M (for asymptotically reliable communication over a noisy channel). We can use this inequality to obtain an upper bound on the channel code rate R = K/N as follows:

K/M < C  ⇒  R < (M/N)C    (4.6)

where N = K + M. While designing a degree distribution for the mixed channel we pick a channel code rate which satisfies the inequality given in equation (4.6); this results in a lower bound on the number of parity bits M to be generated for reliable communication. Note that the channel code rate of the bipartite graph representing the parity check matrix H can be determined using the belief propagation graph's edge degree distributions, as was shown in equation (2.18).

4.3 Homogeneous sensor networks and the code construction algorithm

In this section we outline an algorithm which helps us map the desired LDPC degree distribution (δ, φ) to the sensor network. More precisely, we consider the construction of the generator matrix G given the underlying WSN topological constraints. Note, we can convert the sensor network in Figure 4.1 into a directed graph based on the range of transmission of each node. The out-degree of a sensor node is determined by the number of sensor nodes lying in its transmission range, whereas its in-degree is determined by the number of nodes able to transmit data to it. The out-degree distribution plays a crucial role in the selection of the data-side degree distribution, whereas the in-degree distribution constrains the choices for the parity-side degree distribution. For our analysis, as in [BL05], we will consider a symmetric homogeneous sensor network, i.e.
a network where any sensor node v will have the same number of nodes |N_v| within its transmission range (the count |N_v| includes the sensor node v itself). Also, every sensor node has the same number of nodes transmitting to it. This implies that the in-degree and the out-degree of the graph under consideration are the same. The homogeneous network (i.e. equal transmission range) is a common assumption in the WSN literature [CK00], [GK04], [FDTT05]. The additional symmetry constraint facilitates the derivation of a relationship between the network neighborhood size and the degree distribution of the designed codes. As we shall see in section 4.4, the neighborhood size |N_v| plays a critical role in determining the admissibility of the code degree distributions during the optimization process. |N_v| also influences the coverage of the data bits (by the code employed). The system illustrated in Figure 4.1(a) can be represented as a line of sensor nodes (folding on itself) with neighboring nodes lying next to each other as shown in Figure 4.2.

Figure 4.2: Constructing generator matrix G with distribution (δ, φ)

We choose this representation since it simplifies the explanation of the construction algorithm for the generator matrix G (which in turn is used to generate the parity bit stream). The choice of a line representation does not result in any loss of generality for the symmetric homogeneous sensor network. The range of sensor transmissions determines the size of the neighborhood for each sensor node. For the line network this results in an incidence matrix I for the WSN, where each sensor node has (|N_v| − 1)/2 elements from the top and the bottom in its neighborhood.
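The folded-line layout is a ring, so its incidence structure is easy to generate; a small sketch (the function and parameter names are my own) with (|N_v| − 1)/2 neighbors on each side:

```python
def ring_neighborhoods(K, Nv):
    """Neighborhood of each node on a ring of K sensors, including the node
    itself, with (Nv - 1) // 2 nodes on either side (Nv odd)."""
    half = (Nv - 1) // 2
    return [{(v + d) % K for d in range(-half, half + 1)} for v in range(K)]

nbr = ring_neighborhoods(K=12, Nv=5)
print(sorted(nbr[0]))  # node 0: itself plus two nodes on each side -> [0, 1, 2, 10, 11]
```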
Algorithm 2 outlines how we can construct a network-constrained parity generator matrix G_{M×K} for the line sensor network given the data and parity degree distributions (δ, φ) and the network incidence matrix I (in a centralized fashion). The code construction algorithm 2 is influenced by the socket matching algorithm outlined in [RU01]. The critical difference between the two is that the selection of the first edge corresponding to a parity symbol in algorithm 2 restricts the number of available data node choices for the remaining edges (to the immediate neighborhood). As the parity symbol construction proceeds, the effective number of available data symbols (sockets) from a neighborhood decreases. Hence the code construction algorithm 2 first constructs the parity symbols with higher degree, to increase the probability of successfully constructing all the parity symbols. The input parameters required by algorithm 2 are K, M, δ, φ, I, n and q.

Algorithm 2 NetworkConstrainedParityGeneratorMatrix (Erasure and Error)
1:  V_S ⇐ list of data nodes sorted in descending order of their network degree
2:  L_D ⇐ {}; s ⇐ 1
3:  for i ⇐ 1 TO n STEP 1 do
4:    e ⇐ s + round(K·δ_i) − 1; L_SD ⇐ {}
5:    for j ⇐ s TO e STEP 1 do
6:      for c ⇐ 1 TO i STEP 1 do
7:        L_SD ⇐ {L_SD V_S(j)}
8:      end for
9:    end for
10:   L_D ⇐ {L_D L_SD}; s ⇐ e + 1
11: end for
12: s ⇐ 1
13: for i ⇐ 1 TO q STEP 1 do
14:   e ⇐ s + round(M·φ_i) − 1
15:   for j ⇐ s TO e STEP 1 do
16:     L_M(j) ⇐ i
17:   end for
18:   s ⇐ e + 1
19: end for
20: for i ⇐ 1 TO M STEP 1 do
21:   L_DQ ⇐ unique(L_D); L_DQN ⇐ {}
22:   for j ⇐ 1 TO |L_DQ| STEP 1 do
23:     if |neighborhood(L_DQ(j)) ∩ L_DQ| ≥ L_M(i) then
24:       L_DQN ⇐ {L_DQN L_DQ(j)}
25:     end if
26:   end for
27:   L_DQN ⇐ (|L_DQN| == 0) ? L_DQ : L_DQN
28:   f ⇐ randomly select one node from L_DQN
29:   f_N ⇐ neighborhood(f) ∩ L_DQN
30:   if |f_N| ≥ L_M(i) then
31:     c ⇐ randomly select L_M(i) edges from f_N
32:   else
33:     c ⇐ f_N
34:   end if
35:   Set row i of matrix G to an all-zero row
36:   Set the columns corresponding to the elements in c for row i of G to 1
37:   Remove c from L_D
38: end for
39: Remove 4-cycles using edge deletions (avoid creating degree-zero data nodes)

Let us now consider a brief description of the code construction algorithm:

Steps 2-11: Using these steps we create a list L_D containing copies of the data nodes such that a δ_i fraction of the data nodes has i copies. This step ensures that we achieve δ in the constructed generator matrix G. For non-homogeneous networks, nodes having higher out-degree are selected for generating the copies corresponding to higher-degree data nodes. This approach tries to ensure that the number of times a network edge is selected during code construction is reduced. Such an approach reduces the probability of forming duplicate parity symbols.

Steps 12-19: In these steps we determine the degree to be associated with each of the M parity nodes, ensuring that the fraction of parity nodes with degree i is φ_i. The list of parity degrees corresponding to each parity node is labeled L_M. If all the parity nodes achieve their associated degree we successfully achieve the target parity degree distribution φ. The network connectivity plays a role in determining how likely it is that a particular parity distribution is achieved, as will be demonstrated in section 4.4.1.

Steps 20-38: For each of the M parity nodes we randomly select data nodes to be associated with it. For each parity node construction, steps 21-26 create a list L_DQN of data nodes which have an available neighborhood size greater than the current parity degree under consideration. A data node is then selected at random from this list. If there are no data nodes which have an available neighborhood size greater than the current parity degree, then a data node is selected at random from all the available data nodes. This procedure is illustrated in Figure 4.2. The solid line represents the randomly selected data node f, resulting in the selection of the first edge corresponding to a parity node.
This selection limits the remaining data node choices (for the parity node under construction) to the immediate neighborhood f_N of the first data node. The dashed lines represent the remaining data nodes selected randomly from the available data node neighborhood f_N. Note, given a data node f, the available neighborhood f_N is determined using step 29.

Step 39: Steps 1-38 result in a parity generator matrix G which has the distribution (δ, φ). However, this construction can result in short cycles in the generator graph. Four-cycles in particular have a significant adverse impact on the decoder performance at the sink. Therefore four-cycles are removed in step 39 of algorithm 2. This is achieved by deleting edges which participate in four-cycle formation. We take care not to delete all the edges associated with a particular data bit, since this would make it impossible to recover that data bit at the sink.

We illustrate the execution of the above algorithm for a target code degree distribution of δ = {δ_2 = 2/6, δ_4 = 4/6} and φ = {φ_1 = 4/8, φ_4 = 4/8}. Let us assume that K = 6 and M = 8. The sensor network on which the code is to be mapped is shown on the right side of Figure 4.3(a). The execution of steps 2-11 of the code construction algorithm results in the six data nodes with edges emanating in accordance with the data degree distribution δ. If a data node has four edges, it corresponds to four copies of the data node, and so on. We start constructing with the highest-degree parity bit, which in this case corresponds to a parity bit with degree 4. As in steps 20-38, we first identify the set of eligible data nodes. These will correspond to data nodes with available degree greater than or equal to 4, as shown in Figure 4.3(b). Next a data node is selected at random from this eligible set and an edge connection is made between the parity node and the data node as shown in Figure 4.3(c). We then identify the available neighborhood of this data node. In this case the available neighborhood size is 4 (Figure 4.3(c)). The remaining 3 edges are connected to 3 vertices chosen at random from the available neighborhood of vertices. This process continues till a complete code is constructed as shown in Figure 4.3(d). Execution of step 39 will reduce the number of four-cycles in this code.

Note, as a result of the edge deletions carried out in step 39 there is a slight deviation of the achieved degree distribution from the target degree distribution.
In this case the available neighborhood size is 4 (Figure 4.3(c)). The remaining 3 edges are connected to 3 vertices chosen at random from the available neighborhood of vertices. This process continues till a complete code is constructed as shown in Figure 4.3(d). Execution of step 39 will reduce the number Of four cycles in this code. Note, as a result of the edge deletions carried out in step 39 there is a slight deviation of the achieved degree distribution from the target degree distribution. This 85 (a) V1 0: (b) Figure 4.3: Example code construction for {62 = 2/6,fi4 = 4/ 6} and {51 = 4/8,54 = 4 / 8} under network constraints 86 Figure 4.3 (continued) 87 deviation is more pronounced when the target parity degree distribution has a large fraction of high degree nodes. More specifically, when the in-degree of the network node is comparable to the desired parity degree, the large fraction Of high degree parity nodes will lead to the formation of a large number Of four cycles. This relationship between parity degree distribution and four cycle formation is discussed in more detail in section 4.6.1. Removal of a large number of four cycle leads to rapid deviation from the target degree distribution. This is especially harmful for codes designed using Density Evolution as it leads to significant performance deterioration. It is therefore desirable to restrict the maximum allowed parity degree to a small number for Density Evolution based design. Modeling the four cycle deletion process and its effects on the degree distribution is a very challenging task and a complete treatment of this task is non-trivial, which could be the subject of a separate stand-alone work in its own right. As we shall see in section 4.6 this does not deter codes designed using Density Evolution from performing well so long as the maximum allowable parity degree is kept low (compared to the network neighborhood size). 
Codes designed using Differential Evolution automatically consider the deviation from the target distribution during the optimal degree distribution selection process and therefore perform well even when the maximum allowed parity degree is high. The Differential Evolution based approach for code selection, however, does not scale well with network size.

The code construction algorithm outlined above can be executed in a distributed fashion as long as each of the sensor network nodes has the following information available to it: the average neighborhood size of the sensor network graph (|N_v|), the total number of data bits (K), the noise deviation of the aggregate Bi-AWGN channel (σ), and the erasure rate of the aggregate erasure channel (P_ERASURE). Since linear combinations are restricted to within a single hop, four-cycles can be determined by querying the sensor nodes in the neighborhood. An edge deletion would then correspond to changing one of the two linear combinations (parity bits) participating in the formation of the four-cycle to a lower degree. Such a distributed approach would however result in the achieved M being a random variable and not a constant, leading to slightly lower performance.

If x_{K×1} represents the data vector to be transmitted by the sensor field, then the parity vector p_{M×1} can be generated as follows:

p_{M×1} = G_{M×K} x_{K×1}    (4.7)

Figure 4.4 illustrates the structure of the resulting parity generator matrix G formed using algorithm 2.

Figure 4.4: Illustration demonstrating the network constraint on generator matrix G

The parity generator matrix G can be transformed into the parity check matrix H using the procedure outlined in section 2.1 and used for belief propagation decoding at the sink. It is significant to note that the original set of data bits is assumed to be lost due to erasures.
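A compressed, centralized sketch of the construction (steps 2-38, without the 4-cycle removal of step 39) and of equation (4.7) is given below. The topology, distributions and helper names are illustrative assumptions; the list names mirror the pseudocode of Algorithm 2.

```python
import random

def build_G(K, M, delta, phi, neighborhoods, rng):
    """Simplified Algorithm 2: socket list L_D, per-parity degrees L_M,
    then neighborhood-constrained random selection of data nodes per parity."""
    L_D = [v for v in range(K) for _ in range(delta[v])]      # copies per data node
    L_M = sorted((d for d, f in phi.items() for _ in range(round(M * f))),
                 reverse=True)                                # high degrees first
    G = [[0] * K for _ in range(M)]
    for i, deg in enumerate(L_M):
        avail = set(L_D)
        f = rng.choice(sorted(avail))                  # first edge fixes the neighborhood
        f_N = sorted(neighborhoods[f] & avail)
        for v in rng.sample(f_N, min(deg, len(f_N))):  # remaining edges stay local
            G[i][v] = 1
            L_D.remove(v)                              # consume one socket
    return G

def parity(G, x):
    """Equation (4.7): p = G x over GF(2)."""
    return [sum(g * xi for g, xi in zip(row, x)) % 2 for row in G]

rng = random.Random(7)
K, M = 12, 8
nbh = [{(v + d) % K for d in range(-2, 3)} for v in range(K)]  # ring, |N_v| = 5
delta = [2] * K                  # every data node contributes two sockets
phi = {1: 4 / 8, 3: 4 / 8}       # target parity degree distribution
G = build_G(K, M, delta, phi, nbh, rng)
x = [rng.randrange(2) for _ in range(K)]
print(parity(G, x))
```

Each parity row of the resulting G touches only data nodes inside a single node's neighborhood, which is the network constraint illustrated in Figure 4.4.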
Additionally, a fraction of the parity bits transmitted via the relay network will undergo erasures (say, due to congestion). When dealing with channels which induce erasures we must exercise care while designing the degree distribution for the code to be deployed. The reasons can be easily inferred by examining the messages passed between the variable and check nodes in the belief propagation algorithm. The formulas to calculate the messages transmitted between a check node C and the variable nodes V for a log-likelihood belief propagation decoder are given by equations (2.5) and (2.6).

Figure 4.5: Belief ripple from degree-one parity node to degree-two parity node and beyond.

An erased variable node will transmit zero to the check node at the very first iteration. As can be seen from equation (2.6), if more than one variable node associated with the check node C sends a zero to it, then the expression in the square brackets will always evaluate to ∞. As a result the check node C will transmit back a zero to all the variable nodes in its neighborhood N_C. In our code formulation all the check nodes of the parity check matrix H are connected to data bits which are erased. If the parity degree distribution φ does not contain a non-zero fraction of degree-one nodes, then every check node of H will have at least two variable nodes with erasures (corresponding to the erased data nodes) connected to it. Therefore, to jump-start the BPA it is critical that there are some degree-one nodes in φ. These degree-one nodes create a ripple effect [Lub02], i.e. if the data nodes connected to degree-one parity nodes are also connected to another parity node, then the belief of the degree-one parity node ripples down to the other parity node via the common data node, as shown in Figure 4.5. If this parity node in turn has degree two, then the ripple propagates further to the next data node. This ripple propagation via degree-one and degree-two parity nodes is essential for successful belief propagation decoding over channels containing erasures. In fact, for successful BPA decoding at the sink at
This ripple propagation via degree-one and degree-two parity nodes is essential for successful belief-propagation decoding on channels containing erasures. In fact, for successful BPA decoding at the sink, at least a fraction of the parity nodes should have degree one and degree two:

φ̄₁ > τ₁,  φ̄₂ > τ₂    (4.8)

While designing the code degree distributions using Density Evolution analysis, we choose τ₁ = 0.005 and τ₂ = 0.49.

4.4 Bounds on Achievability/Coverage

While mapping a channel code to the underlying sensor network, the connectivity/topology of the actual network constrains the degrees of freedom in selecting data/parity nodes. This constraint is reflected in steps 20-38 of the code construction algorithm 2 given in section 4.3. Based on the underlying graph's connectivity we give the following bounds corresponding to the parity and data nodes.

4.4.1 Parity side feasibility bound

The code construction algorithm constructs one parity node at a time, starting with the largest parity degree and moving down to the smallest. As the code construction proceeds, the available data node neighborhood size (as described in section 4.3) starts shrinking. Consequently, as the parity nodes are constructed, the mean available neighborhood size (corresponding to list L_DQN) reduces. Let us represent the progression of the mean available neighborhood size by μ_N[i], with i taking on values 0, 1, 2, ..., M. If μ_N[i] is small compared to the degree of the parity node currently being constructed, it will have an adverse influence on the probability of successfully constructing that parity node. It is therefore clear that μ_N[i] has a significant effect on the parity side feasibility of the code.
Before we can derive an expression representing the parity side feasibility constraint, we need to model the progression of μ_N[i]. To facilitate this we first define the following expected values:

μ_δ̄ = Σ_{j=1}^{n} j δ̄_j = E(δ̄) = E(data node degree)    (4.9)

μ_φ̄ = Σ_{j=1}^{q} j φ̄_j = E(φ̄) = E(parity node degree)    (4.10)

μ_ε̄ = Σ_{j=1}^{h} j ε̄_j = E(ε̄) = E(network node in-degree)    (4.11)

where ε̄ = (ε̄₁, ε̄₂, ..., ε̄_h) represents the in-degree distribution of the sensor network; ε̄_i corresponds to the fraction of sensor nodes with in-degree i; h is the maximum in-degree; and Σ_{i=1}^{h} ε̄_i = 1. Note that each parity node on average selects μ_φ̄ data nodes for its creation. Consequently, the probability that a data node (in a given neighborhood) is still available for the creation of parity node i can be represented as:

P(data node available for parity i) = 1 − ( iμ_φ̄ / (Kμ_δ̄) )^{μ_δ̄}    (4.12)

The progression of the mean available neighborhood size can then be modeled by:

μ_N[i] = μ_ε̄ [ 1 − ( iμ_φ̄ / (Kμ_δ̄) )^{μ_δ̄} ]    (4.13)

Figure 4.6 compares the prediction obtained using equation (4.13) against the actual progression for the following scenario: φ̄₃ = 1, δ̄₆ = 1, |N_v| = μ_ε̄ = 15, K = 1000, M = 2000. It can be deduced from Figure 4.6 that equation (4.13) is a good estimate of μ_N[i].

The code construction algorithm 2 proceeds with the parity node creation in descending order, starting with the highest degree first. We can therefore partition the parity node set into subsets I_k with the same degree k as follows:

I_k = { i : i ∈ ℤ, M(1 − Σ_{j=1}^{k} φ̄_j) < i ≤ M(1 − Σ_{j=1}^{k−1} φ̄_j) }

For each subset I_k, the probability that every parity node in I_k can be successfully constructed can be lower-bounded, using the Chernoff tail inequality, in terms of μ_N[i]; this yields inequality (4.20), a lower bound (in an expectation sense) on the probability of achieving a given parity degree distribution. To ensure any degree distribution selected by the optimization algorithm is achievable, we would like this probability of success to be at least greater than an appropriate constant ω₁.
We therefore introduce the following inequality as a constraint in the eventual optimization problem:

Π_{k=1}^{q} P(all degree-k parity nodes in I_k constructible) ≥ ω₁    (4.21)

Interestingly, this bound deals only with expectations and can therefore be applied even to non-homogeneous sensor networks with asymmetric transmission ranges. In our analysis we choose ω₁ = 0.9.

4.4.2 Data side coverage bound

As the code construction algorithm progresses, constructing one parity node at a time, the probability that a data node participates in the BP graph improves. The evolution of this probability depends on the outcome of the random socket selection process corresponding to steps 28 and 31 in algorithm 2. This probability of data node participation is termed the "coverage probability" of a data node. The progress of the coverage probability is a function of the network neighborhood size, the code degree distribution, and the number of parity nodes being constructed. Let us consider the coverage probability for an arbitrary data node v, given that parity node i is being constructed:

P(data node v selected | i) = P(nbhd. selected in which v lies | i) × P(v selected | nbhd. selected in which v lies, i)    (4.22)

The first term on the right hand side of equation (4.22) is given by:

P(nbhd. selected in which v lies | i) = 1 − (1 − μ_ε̄/K)^i ≈ 1 − exp(−iμ_ε̄/K)    (4.23)

Similarly, the second term on the right hand side of equation (4.22) can be calculated as:

P(v selected | nbhd. selected in which v lies, i) = E(# data nodes covered | i) / K    (4.24)

Substituting expressions (4.23) and (4.24) in equation (4.22), we predict the progression of the coverage probability for φ̄₃ = 1, δ̄₆ = 1, |N_v| = μ_ε̄ = 15, K = 1000, M = 2000. Figure 4.7 compares a plot of this predicted coverage progression to the actual coverage. As can be seen, the deduced expressions represent the evolution of coverage quite well.
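The exponential approximation in equation (4.23) is easy to check numerically; the sketch below (hypothetical helper names) compares the exact and approximate forms over the scenario above.

```python
import math

def p_covered_exact(i, mu_eps, K):
    """Left-hand side of eq. (4.23): the chance that one of the first i
    neighborhood selections picks a neighborhood containing node v,
    when v lies in mu_eps of the K candidate neighborhoods."""
    return 1.0 - (1.0 - mu_eps / K) ** i

def p_covered_approx(i, mu_eps, K):
    """Exponential approximation on the right-hand side of eq. (4.23)."""
    return 1.0 - math.exp(-i * mu_eps / K)

# Scenario from the text: mu_eps = 15, K = 1000, i = 0..M = 2000.
max_gap = max(abs(p_covered_exact(i, 15, 1000) - p_covered_approx(i, 15, 1000))
              for i in range(0, 2001, 100))
assert max_gap < 0.01  # the two expressions agree to within 1%
```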
The overall coverage probability per parity bit construction can therefore be calculated as follows:

P(coverage) = (1/M) Σ_{i=1}^{M} P(data node v selected | i)    (4.25)

Figure 4.7: Coverage progression for φ̄₃ = 1, δ̄₆ = 1, |N_v| = μ_ε̄ = 15, K = 1000, M = 2000

This quantity represents how quickly a data node comes to participate in the code construction process. Increasing it leads to higher data node participation per parity formation (and equivalently a higher μ_δ̄), making the code more resilient to relay network errors and erasures. We therefore introduce the following inequality constraint in the optimization problem formulated in section 4.5:

P(coverage) ≥ ω₂    (4.26)

In our analysis we choose ω₂ = 0.8. It is important to note that inequalities (4.21) and (4.26) are in direct conflict with each other, and therefore making them too restrictive will render the problem infeasible.

4.5 Problem formulation

We now describe the optimization problem for a generic non-homogeneous sensor network. We restrict the problem to the determination of the optimal degree distribution (λ*, ρ*) ≡ (δ̄*, φ̄*) for a fixed ℓ = 25 iterations of the BPA. The network-constrained parity bits generated using (δ̄*, φ̄*) are transmitted through a cascade of erasure and Bi-AWGN channels as outlined in section 4.2. The code is designed for the worst-case channel conditions, corresponding to an erasure rate P_ERASURE and noise deviation σ. The network degree distribution is used to define additional constraints that restrict the feasible solution space for the code search. The in-degree distribution of the sensor network is represented by ε̄ as defined in section 4.4.1.
Similarly, we can define the out-degree distribution of the sensor network as ζ̄ = (ζ̄₁, ζ̄₂, ..., ζ̄_w), where ζ̄_i is the proportion of sensor nodes with out-degree i; w is the maximum out-degree and Σ_{i=1}^{w} ζ̄_i = 1. Using the in-degree and out-degree distributions of the network we can now define upper bounds on the fraction of data and parity nodes with degree i as δ̄_i^lim = Σ_{j=i}^{w} ζ̄_j and φ̄_i^lim = Σ_{j=i}^{h} ε̄_j respectively, i.e.

δ̄_i ≤ δ̄_i^lim,  φ̄_i ≤ φ̄_i^lim    (4.27)

The design problem can now be finally stated as follows:

(δ̄*, φ̄*) = arg min_{(δ̄, φ̄)} P_e^ℓ(δ̄, φ̄)    (4.28)

where equations (2.18), (4.8), (4.21), (4.26) and (4.27) are additional constraints on the optimization problem. While determining solutions to the optimization problem (4.28) we limit the parity and data degrees to be less than a constant q. Smaller values of q result in the constraints (4.21) and (4.26) being satisfied more easily. In our analysis we present results for q = 6 and q = 12. Also, we determine solutions for a symmetric homogeneous sensor network such that ε̄ = {ε̄_{|N_v|} = 1} = ζ̄ = {ζ̄_{|N_v|} = 1}, where |N_v| = 19. We choose |N_v| = 19 because it allows us to experiment with reasonably large data-mixing limits (determined by the maximum parity degree q). Next we look at the two different approaches employed to determine solutions to the problem defined by (4.28). The first approach uses the LDPC analysis tool, Density Evolution, discussed in section 2.2. The second approach uses an evolutionary algorithm to search for globally optimal solutions.
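The upper bounds of equation (4.27) can be checked mechanically for any candidate degree distribution; a small sketch with hypothetical helper names and a dict-based representation:

```python
def tail_limits(deg_dist):
    """Eq. (4.27) upper bounds: the fraction of code nodes with degree i
    cannot exceed the fraction of sensors with network degree >= i.
    `deg_dist[d]` is the fraction of sensors with degree d."""
    max_d = max(deg_dist)
    return {i: sum(w for d, w in deg_dist.items() if d >= i)
            for i in range(1, max_d + 1)}

def satisfies_bound(code_dist, net_dist):
    limits = tail_limits(net_dist)
    return all(w <= limits.get(d, 0.0) + 1e-12 for d, w in code_dist.items())

# Homogeneous symmetric example: every sensor has network degree 18, so any
# code distribution supported on degrees <= 18 is feasible, and degree 19
# is not.
net = {18: 1.0}
assert satisfies_bound({2: 0.5, 6: 0.5}, net)
assert not satisfies_bound({19: 1.0}, net)
```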
The process of determining a solution starts with an initial guess of the code degree distribution and the assumption that the function P_e^ℓ(λ, ρ) is quadratic (convex). This allows the use of a line-search-based descent algorithm to determine an optimal solution. The solution determined this way is then taken as the starting point for a new iteration of the descent algorithm. The algorithm continues until no further improvement in P_e^ℓ(λ, ρ) is possible. The Sequential Quadratic Programming (SQP) [GMW81] process is carried out while trying to simultaneously satisfy the constraints (2.18), (4.8), (4.21), (4.26) and (4.27) using an active set strategy. To aid convergence we provide the program with initial guesses which are close to the desired channel code rate R and satisfy the constraint (2.18). The initial guesses have a low average parity degree so that decisions in the BPA can be made quickly on the check side. Also, the average data degree is chosen to be high so that the data nodes receive correct information fairly quickly. It is important to note that SQP-based optimization outputs locally optimal solutions, since the search space for degree distributions is highly multimodal. Different initial points can result in significantly different degree distributions; hence global optimality is better pursued using either simulated annealing [KGV83] or evolutionary computation [Gol89]. One such evolutionary technique is discussed below.

4.5.2 Using Genetic Algorithm - Differential Evolution

In this approach we use the differential evolution algorithm, as outlined in [SP97], to determine a global optimum of the problem formulated in section 4.5. The objective is evaluated as the average performance over five instances of the network graph with five runs per graph. Infeasible solutions are avoided by using penalty functions identical to those defined in [SS00].
However, the evolutionary nature of the algorithm makes it difficult to fully satisfy the problem constraints; we therefore make the following modifications to the problem setup:

Modification of P_e^ℓ(λ, ρ) evaluation: The objective function is evaluated by computing a statistical average as follows: we randomly construct five different instances of the generator matrix G for a sensor network with incidence matrix I; next we calculate the performance of each individual G for five different parity transmissions of length M. The objective function value is the average performance over the 25 different trials.

Dropping constraints: Since the generator matrix G is actually created for a sensor network with incidence matrix I, we no longer need to incorporate the parity feasibility and data coverage bounds defined by inequalities (4.21) and (4.26). Similarly, we no longer need explicit constraints to keep the ripple alive as defined by (4.8).

Restricting the search to the parity degree distribution: To maintain the rate constraint defined by (2.18), we restrict the search space to φ̄ and determine the data side degree distribution using the channel code rate R and the current value of φ̄. The data side degree distribution can be estimated using:

μ_δ̄ = (M/K) Σ_{i=1}^{q} i φ̄_i,  δ̄ = { δ̄_{⌊μ_δ̄⌋+1} = μ_δ̄ − ⌊μ_δ̄⌋,  δ̄_{⌊μ_δ̄⌋} = 1 − (μ_δ̄ − ⌊μ_δ̄⌋) }    (4.29)

The initial population size for the differential evolution algorithm is set to 10q; it includes distributions which are shifted unit impulses located at every feasible degree (the rest of the population is determined randomly). We set the crossover parameter for the algorithm to CR = 1; as a result the i-th member of the next generation g+1 is determined using:

p_{i,g+1} = p_{best,g} + F (p_{r1,g} + p_{r2,g} − p_{r3,g} − p_{r4,g})    (4.30)

where the elements r1, r2, r3, r4 are chosen randomly from generation g, and p_{best,g} represents the best-performing code degree distribution in the previous generation. We execute the optimization algorithm for 3000 generations, choosing the best code obtained in the process for deployment over the sensor network.
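One generation of the update above can be sketched as follows. This is a simplification: F = 0.5 is an assumed scale factor, the greedy selection step of standard differential evolution is omitted, and clipping plus renormalization stands in for the penalty functions of [SS00]:

```python
import numpy as np

rng = np.random.default_rng(0)

def de_best2_step(pop, fitness):
    """One DE/best/2 generation with CR = 1 (every component is mutated):
    trial = p_best + F*(p_r1 + p_r2 - p_r3 - p_r4), then clipped and
    renormalized so each member stays a valid degree distribution."""
    F = 0.5  # assumed scale factor; not specified in the text
    n = len(pop)
    best = pop[np.argmin(fitness)]  # lower fitness = better (e.g. outage rate)
    new_pop = np.empty_like(pop)
    for i in range(n):
        r1, r2, r3, r4 = rng.choice(n, size=4, replace=False)
        trial = best + F * (pop[r1] + pop[r2] - pop[r3] - pop[r4])
        trial = np.clip(trial, 0.0, None)   # no negative degree fractions
        new_pop[i] = trial / trial.sum()    # renormalize to a distribution
    return new_pop

pop = rng.dirichlet(np.ones(6), size=10)  # 10 candidate parity distributions
fitness = pop @ np.arange(1, 7)           # toy objective: mean parity degree
next_gen = de_best2_step(pop, fitness)
```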
4.6 Experimental Results

Table 4.2: Data and parity node degree distributions for "erasure and error"
(-: degree exceeds the design's maximum; *: see text)

              Density Evolution (Asymp.)          Differential Evol.
          max. 12   max. 12 (LT)  max. 6     max. 12    max. 6        LT

(σ = 0.81, P_ERASURE = 0)
 φ̄1       0.00500   0.00500   0.00500   0.00635   0.02529   0.00797
 φ̄2       0.49000   0.49000   0.49000   0.97111   0.78611   0.49357
 φ̄3       0.00000   0.00000   0.10691   0.00349   0.06017   0.16622
 φ̄4       0.00000   0.00672   0.34540   0.00716   0.02187   0.07265
 φ̄5       0.00000   0.02390   0.00346   0.00328   0.00604   0.08256
 φ̄6       0.00000   0.09888   0.04923   0.00183   0.10052   0.00000
 φ̄7       0.00000   0.00000      -      0.00020      -      0.00000
 φ̄8       0.14513   0.00000      -      0.00013      -      0.05606
 φ̄9       0.09845   0.00206      -      0.00032      -      0.03723
 φ̄10      0.11144   0.01386      -      0.00000      -      0.00000
 φ̄11      0.08490   0.19678      -      0.00488      -      0.00000
 φ̄12      0.06508   0.16280      -      0.00125      -      0.00000
 φ̄19         -         -         -         -         -      0.05559
 φ̄65         -         -         -         -         -      0.02502
 φ̄66         -         -         -         -         -      0.00314
 δ̄6       0.00000   0.00000   1.00000      *         *         *
 δ̄11      0.27733   0.00000   0.00000      *         *         *
 δ̄12      0.72267   1.00000   0.00000      *         *         *

(σ = 0.68, P_ERASURE = 0.20)
 φ̄1       0.06409      *      0.00500   0.08966   0.09006   0.00797
 φ̄2       0.49535      *      0.49000   0.69618   0.61581   0.49357
 φ̄3       0.00000      *      0.13191   0.10696   0.02149   0.16622
 φ̄4       0.00000      *      0.24702   0.02763   0.18025   0.07265
 φ̄5       0.00000      *      0.12523   0.00450   0.00012   0.08256
 φ̄6       0.00000      *      0.00084   0.00649   0.09227   0.00000
 φ̄7       0.00000      *         -      0.02377      -      0.00000
 φ̄8       0.00000      *         -      0.00167      -      0.05606
 φ̄9       0.00480      *         -      0.02420      -      0.03723
 φ̄10      0.08832      *         -      0.00629      -      0.00000
 φ̄11      0.15056      *         -      0.00539      -      0.00000
 φ̄12      0.19689      *         -      0.00726      -      0.00000
 φ̄19         -         -         -         -         -      0.05559
 φ̄65         -         -         -         -         -      0.02502
 φ̄66         -         -         -         -         -      0.00314
 δ̄6       0.00000      *      1.00000      *         *         *
 δ̄11      0.00000      *      0.00000      *         *         *
 δ̄12      1.00000      *      0.00000      *         *         *

Table 4.3: Average parity node degree for "erasure and error"

 Description    (σ = 0.81, P_ERASURE = 0)   (σ = 0.68, P_ERASURE = 0.20)
                Average degree              Average degree
 max. 12        5.8614                      6.0000
 max. 12 (LT)   6.0000                      *
 max. 6         3.0000                      3.0000
 max. 12        2.0891                      2.5817
 max. 6         2.4988                      2.6614
 trunc. 18      4.4881                      *
 trunc. 12      3.9856                      *
 trunc. 6       3.2594                      *

We begin by choosing K = 1000 and design CNG-based network embedded codes for two cases:

Case (I): Errors in the relay channel and noise in the wireless link. We choose the worst-case aggregate Bi-AWGN noise σ = 0.81, P_ERASURE = 0. The network therefore needs to generate M > 1995 bits (determined using equation (4.6)).

Case (II): Erasures and errors in the relay channel and noise in the wireless link. We choose the worst-case aggregate Bi-AWGN noise σ = 0.68 and the worst-case relay erasure rate P_ERASURE = 0.20. The network therefore needs to generate M > 1991 bits (determined using equation (4.6)).

For both cases we chose M = 2000, i.e. R = 1/3. The performance of the codes is evaluated by determining the "outage rate" at the sink for different noise levels. The "outage rate" corresponds to the probability that the sink is unable to recover the transmitted data vector using the BPA. It is identical to the probability of decoding failure used in traditional channel coding literature.

Figure 4.8: Achieved and targeted parity degree distribution for σ = 0.81, P_ERASURE = 0 (a) Differential Evolution q = 6, 12, K = 1000 (b) Density Evolution q = 6, 12, K = 1000

We evaluate the performance of the codes designed for Case (I) by varying the noise σ in the aggregate Bi-AWGN channel.
We compare this performance against the performance of the LT codes given in [Sho06]. The LT degree distribution (column 7 of Table 4.2) is truncated to q = 6, 12, 18, since the neighborhood size in the network is restricted to |N_v| = 19. We analyze the performance of the codes designed for Case (I) and determine the causes of improvement and/or deterioration in performance in section 4.6.1. In section 4.6.2, we analyze the sensitivity of the solutions designed for Case (I) by deploying them over a larger network of size K = 10000 and |N_v| = 19 (unchanged). We evaluate the performance of the codes designed for Case (II) by varying the noise σ in the aggregate Bi-AWGN channel and the erasure rate P_ERASURE in the relay network. We do not present the performance and sensitivity analysis done for Case (I) to maintain brevity (as the results are similar).

Table 4.2 contains codes designed for both Case (I) and Case (II) using the techniques described in sections 4.5.1 and 4.5.2. The network for which the optimal codes are designed is a symmetric homogeneous WSN with K = 1000 data sensors and neighborhood size |N_v| = 19. The degree distribution contained in column 3 of Table 4.2 was obtained by choosing the LT degree distribution as the initial point in the SQP approach. Note, for columns containing *, the data degree distribution can be estimated using equation (4.29). The parity degree distributions achieved when executing the code construction algorithm outlined in section 4.3 for four representative parity distributions chosen from Table 4.2 are shown in Figure 4.8. The target parity distribution is plotted next to the achieved parity distribution for comparison. As was mentioned earlier, the differential evolution based code design approach is quite successful in capturing the impact of the 4-cycle removal step in algorithm 2. Therefore the achieved and target parity distributions are quite similar for the differential evolution case.
For the density evolution based codes there is a larger disparity between the achieved and targeted parity distributions. As a result, codes designed using differential evolution perform better. However, the disparity is smaller for codes designed using density evolution with smaller q. This results in comparable and sometimes superior performance of density evolution designed codes relative to differential evolution designed codes. We now present the results of the performance evaluation and carry out a detailed discussion of the observed trends.

4.6.1 Performance Analysis for K = 1000 data sensors

We evaluate the performance for Case (I) by varying the noise deviation, or equivalently the Signal-to-Noise Ratio (SNR) as determined by equation (4.31) below:

SNR in dB = 10 log₁₀(1 + σ⁻²)    (4.31)

The performance curves for Case (I) are shown in Figure 4.9(a). Note, since we select a finite number of data nodes in our performance analysis, there is a loss in coding performance when compared to the asymptotic bounds given by density evolution. The performance of the codes shown in Figure 4.9(a) can be summarized by the following ordering (best performing first):

Asympt. (n = q = 6) ≺ Diff. Evol. (q = 12) ≺ Diff. Evol. (q = 6) ≺ LT trunc. (q = 6) ≺ Asympt. (n = q = 12) ≺ LT trunc. (q = 12) ≺ Asympt.-LT (n = q = 12) ≺ LT trunc. (q = 18)
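Equation (4.31) maps the design-point noise deviations onto the SNR axis of the performance plots; a one-line transcription:

```python
import math

def snr_db(sigma):
    """Eq. (4.31): SNR in dB for aggregate Bi-AWGN noise deviation sigma."""
    return 10.0 * math.log10(1.0 + sigma ** -2)

# The two worst-case design points used in this chapter:
print(round(snr_db(0.81), 2))  # Case (I):  4.02 dB
print(round(snr_db(0.68), 2))  # Case (II): 5.0 dB
```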
Figure 4.9: Outage Rate for codes designed for (σ = 0.81, P_ERASURE = 0.0) (a) Number of sensors K = 1000 (b) Number of sensors K = 10000

Similarly, the performance curves for the density evolution based codes designed for Case (II) are shown in Figure 4.10(a), and the performance curves for the differential evolution based codes designed for Case (II) are shown in Figure 4.10(b).

Figure 4.10: Outage Rate of codes designed for (σ = 0.68, P_ERASURE = 0.2) and K = 1000 sensor nodes using (a) Density Evolution design and (b) Differential Evolution design. For convenience in illustration we replace P_ERASURE with P in the legend.

Their performance can be ordered as follows (best performing first):

Asympt. (n = q = 6) ≺ Diff. Evol. (q = 6) ≺ Diff. Evol. (q = 12) ≺ Asympt. (n = q = 12)

As can be seen, codes designed using Differential Evolution have lower error floors when compared with the truncated LT distribution. Also, codes obtained using Density Evolution analysis with q = 6 perform the best. The performance of these codes can be improved if we allow data to be transmitted across larger distances [MKR07]. In Table 4.3, we list the average parity degree of every code whose performance was evaluated.
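The truncated-LT averages in Table 4.3 are reproduced (to within rounding of the tabulated distribution) if truncation to q reassigns all probability mass above q to degree q, which suggests that is how the truncation was performed; a sketch:

```python
def truncate(dist, q):
    """Truncate a degree distribution to maximum degree q by folding the
    probability mass above q into degree q. `dist[d]` is the fraction of
    parity nodes with degree d."""
    out = {}
    for d, w in dist.items():
        out[min(d, q)] = out.get(min(d, q), 0.0) + w
    return out

def mean_degree(dist):
    return sum(d * w for d, w in dist.items())

# LT column of Table 4.2.
lt = {1: 0.00797, 2: 0.49357, 3: 0.16622, 4: 0.07265, 5: 0.08256,
      8: 0.05606, 9: 0.03723, 19: 0.05559, 65: 0.02502, 66: 0.00314}
for q in (18, 12, 6):
    print(q, round(mean_degree(truncate(lt, q)), 4))
# approx. 4.4882, 3.9857 and 3.2594 -- cf. 4.4881, 3.9856, 3.2594 in Table 4.3
```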
It can be seen that, most of the time, choosing lower values of the average parity degree (μ_φ̄) results in better performance (when the noise in the relay channel is increased). Moreover, choosing a lower value of μ_φ̄ leads to a larger value for the expression on the left hand side of inequality (4.21). This in turn implies that the parity degree distributions are easier to map onto a given sensor network for lower values of μ_φ̄. Also, smaller values of the average parity degree lead to the formation of a smaller number of four-cycles (which are detrimental to code performance). We can verify this hypothesis by calculating the expected number of four-cycles that can be formed in the bipartite LDPC graph (for a random construction of a given parity degree distribution φ̄ under network constraints). Note, it is important to keep the number of four-cycles relatively small so that the deviation from the target degree distribution (when four-cycles are removed) is minimized. The expression given below in equation (4.32) is an estimate of the expected number of four-cycles formed in the parity check matrix H for a given parity degree distribution φ̄:

⟨C_|Op|⟩_φ̄ = Σ_{d1=1}^{q−1} Σ_{d2=d1+1}^{q} M² φ̄_{d1} φ̄_{d2} ⟨C_|Op|⟩_{(d1,d2)} + Σ_{d=1}^{q} [ M φ̄_d (M φ̄_d − 1) / 2 ] ⟨C_|Op|⟩_{(d,d)}    (4.32)

where ⟨C_|Op|⟩_φ̄ is the mean number of 4-cycles in a random parity check matrix H for a parity degree distribution φ̄, and ⟨C_|Op|⟩_{(d1,d2)} is the expected number of 4-cycles between two rows of H if the first row contains d1 ones and the second row contains d2 ones, positioned at random. The details of evaluating this expression are given in the Appendix.

Figure 4.11: (a) Actual mean 4-cycle count and (b) Estimated mean 4-cycle count of codes designed for (σ = 0.81, P_ERASURE = 0)

We compare this estimate to the actual mean 4-cycle count for the codes evaluated in Case (I). The actual mean 4-cycle count is determined by counting 4-cycles in 50 different instances of the parity check matrix H and taking the average count. Each H is constructed by executing steps 1 to 38 of the network-constrained code construction algorithm 2 for K = 1000, 2000, ..., 10000. The plots of the actual and estimated mean 4-cycle counts for different values of K are shown in Figure 4.11. The estimate differs slightly from the actual expected count since it ignores the data-side degree distribution δ̄. However, it can be clearly seen that the relative ordering in the number of expected 4-cycles is maintained for the estimated 4-cycle count in Figure 4.11(b). Looking at Figure 4.11 it can be inferred that, most of the time, codes with a lower average parity degree form fewer 4-cycles and therefore perform better.

The above discussion suggests using codes with a smaller average parity degree. However, a lower value of μ_φ̄ will result in a smaller data coverage probability, as can be judged from equation (4.25). This will lead to higher error floors (due to non-participation) or, equivalently, higher SNR requirements. This accounts for the higher performance of the "Asympt. (n=q=6)" codes when compared with "Diff. Evol. (q=6)", which have a lower average parity degree.

4.6.2 Sensitivity Analysis

Next we evaluate the performance of the designed codes and the LT codes for Case (I) using a larger network of size K = 10000. This will help us gauge the sensitivity of the designed solutions.
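The actual 4-cycle counts used for comparisons like the one in Figure 4.11 can be obtained directly from a parity check matrix: every pair of rows sharing t ≥ 2 columns contributes C(t,2) length-four cycles. A sketch (assuming NumPy):

```python
from itertools import combinations
import numpy as np

def count_4cycles(H):
    """Count 4-cycles in the bipartite graph of a binary parity check
    matrix H: a pair of rows sharing t >= 2 columns contributes C(t,2)
    length-four cycles."""
    H = np.asarray(H, dtype=np.int64)
    total = 0
    for r1, r2 in combinations(range(H.shape[0]), 2):
        t = int(H[r1] @ H[r2])  # number of columns where both rows have a 1
        total += t * (t - 1) // 2
    return total

H = np.array([[1, 1, 0, 1],
              [1, 1, 0, 0],
              [0, 1, 1, 1]])
print(count_4cycles(H))  # rows 0&1 and rows 0&2 each share two columns -> 2
```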
The performance curves for K = 10000, M = 20000 are shown in Figure 4.9(b). The neighborhood size is not changed and is kept at |N_v| = 19. Keeping the neighborhood size the same actually increases the probability of getting short cycles, as was illustrated above. As a result the code performance deteriorates. This is contrary to observations for traditional LDPC codes (with no network constraints), where performance improves with increasing K. Recovery from error begins later, at higher SNR (a 0.8 dB loss at an outage rate of 10⁻³), and the error floors of several codes are pushed up. Solutions obtained using the genetic algorithm, Differential Evolution, develop error floors where earlier there were none. This leads us to conclude that solutions obtained using Differential Evolution are extremely sensitive to network size (i.e., to variation from the environment for which they were designed). However, the degree distribution designed using density evolution with maximum parity degree q = 6 is surprisingly robust to variation in network size. These results suggest that when codes invariant to network size are required, we should choose codes designed using Density Evolution.

4.7 Conclusions

In this chapter, we take advantage of the ability to map network embedded codes to channel codes and present an algorithm which maps channel codes to a WSN with a predetermined topology. Next we use two different approaches, (a) Sequential Quadratic Programming using Density Evolution analysis and (b) the genetic algorithm Differential Evolution, to design good CNG network embedded codes for symmetric homogeneous sensor networks with a finite neighborhood size. The connectivity of the WSN determines the feasibility of a given degree distribution and also influences the data coverage probability. We determine lower bounds on feasibility and include them as constraints in the code design problem.
We also deduce expressions for coverage and lower-bound it as a constraint to be achieved during the code optimization process. We design codes for relay channels which can be aggregated into a cascade of erasure and Bi-AWGN channels; however, the design procedure (based on density evolution) is generic enough to be used for any other symmetric channel so long as the channel message densities are consistent. Interested readers can find a more thorough discussion of the consistency condition in [RSU00]. We observe that codes designed using the proposed approach have lower outage rates and error floors when compared to the truncated LT distribution. However, the global search based Differential Evolution algorithm produces degree distributions which are sensitive to network size. The neighborhood constraint |N_v| plays a critical role in determining the probability of short cycles for a given degree distribution. The presence of short cycles leads to degradation of code performance. Note, although the results presented here make the assumption that the sensor network is homogeneous, minor modifications of the bounds and the code construction algorithm allow the design of CNG network codes for non-homogeneous networks.

4.8 Appendix

Let us consider the formation of parity bits in the bipartite graph representing H. Assume neighborhoods Na and Nb are selected first by arbitrarily selecting vertices a and b. Let us select d1 vertices from Na and d2 vertices from Nb to form parity bits p1 and p2 respectively. Assume that the size of the neighborhood overlap between Na and Nb is |O_N| = j, and the size of the parity overlap is |O_p| = i. Note: d1 ≥ i, d2 ≥ i and j ≥ i; then the number of ways we can get a parity overlap of size |O_p| = i is given below:

|{|O_p| = i}|_{(d1,d2),|O_N|=j} = C(j, i) Σ_{x1=0}^{j−i} Σ_{x2=0}^{j−i−x1} C(j−i, x1) C(j−i−x1, x2) C(|N_v|−j, d1−i−x1) C(|N_v|−j, d2−i−x2)    (4.33)
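Equation (4.33), together with the expectation (4.34) built from it, can be transcribed directly; the sanity check below verifies that the N_i counts partition all C(|N_v|, d1) · C(|N_v|, d2) joint choices, a consequence of the partitioning argument that follows:

```python
from math import comb

def n_overlap(i, d1, d2, j, nv):
    """Eq. (4.33): ways to pick a degree-d1 parity in Na and a degree-d2
    parity in Nb with exactly i chosen positions in common, when the two
    size-nv neighborhoods overlap in j nodes."""
    total = 0
    for x1 in range(min(j - i, d1 - i) + 1):           # p1's extra bits inside the overlap
        for x2 in range(min(j - i - x1, d2 - i) + 1):  # p2's extra bits inside the overlap
            total += (comb(j - i, x1) * comb(j - i - x1, x2)
                      * comb(nv - j, d1 - i - x1)       # p1's bits outside the overlap
                      * comb(nv - j, d2 - i - x2))      # p2's bits outside the overlap
    return comb(j, i) * total

def expected_4cycles(d1, d2, j, nv):
    """Eq. (4.34): E[# 4-cycles] = sum_i C(i,2) N_i / sum_i N_i."""
    ns = [n_overlap(i, d1, d2, j, nv) for i in range(min(d1, d2, j) + 1)]
    return sum(comb(i, 2) * n for i, n in enumerate(ns)) / sum(ns)

# Sanity check: the N_i partition all C(nv,d1)*C(nv,d2) joint choices.
nv, j, d1, d2 = 15, 5, 3, 4
assert sum(n_overlap(i, d1, d2, j, nv) for i in range(min(d1, d2, j) + 1)) \
       == comb(nv, d1) * comb(nv, d2)
```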
Equation (4.33) can be interpreted as follows. The remaining (d1 − i) non-zero bits of p1 can be partitioned such that x1 of them lie in the |(Na ∩ Nb) \ (p1 ∩ p2)| = (j − i) positions and (d1 − i − x1) lie in the non-overlapping |Na \ (Na ∩ Nb)| = (|N_v| − j) positions. The same type of partitioning can be carried out for p2. Since we wish to have exactly i parity bits overlapping, the (x1 + x2) positions can be chosen from the remaining neighborhood overlap in C(j−i, x1) C(j−i−x1, x2) ways. We can use equation (4.33) to get the expected number of 4-cycles, given the neighborhood overlap size is j:

⟨C_|Op|⟩_{(d1,d2),|O_N|=j} = [ Σ_{i=2}^{j} C(i, 2) |{|O_p| = i}|_{(d1,d2),|O_N|=j} ] / [ Σ_{i=0}^{j} |{|O_p| = i}|_{(d1,d2),|O_N|=j} ]    (4.34)

For the neighborhoods to overlap by j in the line network (section 4.3), we must have distance |a − b| = |N_v| − j (unless the neighborhood overlap size j = 0). This implies:

|{|O_N| = j}| = { C(K, 2) − K(|N_v| − 1)  if j = 0;  K  otherwise }    (4.35)

Therefore, the probability of selecting two neighborhoods (Na, Nb) with overlap size j is:

Pr(|O_N| = j) = |{|O_N| = j}| / Σ_{j'=0}^{|N_v|} |{|O_N| = j'}|    (4.36)

Using (4.34) and (4.36) we obtain the mean number of 4-cycles between two rows of H with degrees (d1, d2):

⟨C_|Op|⟩_{(d1,d2)} = Σ_{j=0}^{|N_v|} Pr(|O_N| = j) ⟨C_|Op|⟩_{(d1,d2),|O_N|=j}

Chapter 5

Concluding Remarks

This thesis has examined the problem of designing network codes by mapping them to traditional LDPC codes. The codes are tailored for suitability to the underlying network connectivity. Codes were designed for the two distinct distortions, "erasure only" and "erasure and error", which the transmitted encoded stream undergoes before its arrival at the sink. In this concluding chapter we list our main results and propose future directions of research.
5.1 List of contributions

• Maximal recovery and minimal outage rate
This research determines network-constrained code degree distributions which can be used for maximal data recovery under "erasures" and minimal outage rate in the presence of "erasures and errors". Maximal recovery corresponds to recovering as much of the embedded data as possible at the sink. We can therefore recover partial information (under erasures) at the sink, which can still prove useful depending on the application under consideration. This approach leads to recovery performance significantly better than the zero-configuration approaches found in the current literature.

• Code construction algorithms under network constraints
This work outlines a centralized coding algorithm (Algorithm 2), with a centralized co-ordinator, which is used when the network-embedded code stream undergoes erasures as well as errors prior to its arrival at the sink. It also outlines a Distributed Code on Network (DCON) algorithm (Algorithm 1), used when the network-embedded codes suffer erasures only. The DCON algorithm utilizes the coarser network degree distribution information during code construction.

• Fixed Point Stability Analysis for Partial Recovery
We show that if the BPA is employed to recover a certain fraction of the source data, then the code degree distribution must satisfy a certain necessary condition, given by inequality (3.17). The inequality is derived by extending the full-recovery analysis in [RSU01].

• Admissibility Criteria
The suitability of a code degree distribution, determined by inequality (4.21) (for homogeneous symmetric wireless networks), is obtained using the Chernoff tail inequality and some simplifying assumptions. These criteria are used as constraints while solving the "erasure and error" optimization problem.

• Data Coverage bound
The coding algorithms, along with the underlying network connectivity, lead to a finite probability of source symbols not participating in codeword formation.
As a result, these non-participating source symbols cannot be recovered at the sink. We determine expressions for the data coverage bounds: inequalities (3.7) and (4.26).

• Framework for determining ripple progression under finite transmission range
We have outlined a ripple progression framework for the finite transmission range. This framework is general enough to be extended to the case of infinite transmission range, leading to deductions consistent with those seen in the literature.

• Expected 4-Cycle Estimation under network constraints
For the "erasure and error" case, the performance of the BPA deteriorates rapidly with an increasing number of short cycles within the code's belief network. We derive an expression, equation (4.32), for the expected number of 4-cycles in homogeneous symmetric line networks.

• Estimating the overall network degree distribution
We determine expressions which can be used at individual network nodes to obtain the overall network degree distribution. For random sensor placement and knowledge of the transmission range distribution, this is given by equation (3.6). This mitigates the need for a centralized co-ordinator in certain scenarios.

5.2 Proposals for future work

Although the above contributions form a good starting point for code design under network connectivity constraints, many other interesting areas of research remain open. A couple of them with significant potential are listed below.

5.2.1 Novel decoders at sink

In [KMR08] it has been demonstrated that partial recovery for the "erasure only" case can be further improved by executing Gaussian elimination (GE) after BP decoding. This approach has complexity only marginally greater than that of BP decoding alone. In [LRCK07], an attempt was made to propose code design rules for the GE-plus-BP decoder. Although the results are promising, code design tools for this approach remain elusive.
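As a concrete illustration of this two-phase idea, the sketch below runs a peeling (BP erasure) decoder first and finishes the residual linear system with Gaussian elimination over GF(2). This is our own generic sketch of the technique, not the implementation from [KMR08] or [LRCK07]; all function names are ours.

```python
import numpy as np

def decode_bp_then_ge(G, y):
    """Two-phase erasure decoding over GF(2): linear-time peeling (the BP
    erasure decoder) followed by Gaussian elimination on whatever system
    the peeler leaves behind.  Row m of the binary matrix G lists the
    source bits XORed into the received coded bit y[m] (only rows that
    survived erasure are passed in).  Returns (x, known)."""
    G = (np.asarray(G) % 2).astype(np.uint8)
    y = (np.asarray(y) % 2).astype(np.uint8)
    M, K = G.shape
    x = np.zeros(K, dtype=np.uint8)
    known = np.zeros(K, dtype=bool)

    # Phase 1: peeling -- repeatedly solve any equation with exactly one
    # unknown and back-substitute its value into every other equation.
    progress = True
    while progress:
        progress = False
        for m in range(M):
            idx = np.flatnonzero(G[m])
            if idx.size == 1:
                k = idx[0]
                x[k], known[k] = y[m], True
                rows = np.flatnonzero(G[:, k])
                y[rows] ^= x[k]          # substitute the solved bit
                G[rows, k] = 0
                progress = True

    # Phase 2: Gauss-Jordan elimination on the residual (unknown) columns.
    cols = np.flatnonzero(~known)
    A, b = G[:, cols].copy(), y.copy()
    pivots, r = [], 0
    for c in range(A.shape[1]):
        if r >= A.shape[0]:
            break
        hit = np.flatnonzero(A[r:, c])
        if hit.size == 0:
            continue
        p = hit[0] + r
        A[[r, p]], b[[r, p]] = A[[p, r]], b[[p, r]]   # bring pivot up
        for m in np.flatnonzero(A[:, c]):
            if m != r:
                A[m] ^= A[r]
                b[m] ^= b[r]
        pivots.append((r, c))
        r += 1
    for m, c in pivots:
        if A[m].sum() == 1:              # pivot is the only unknown left
            x[cols[c]], known[cols[c]] = b[m], True
    return x, known
```

The peeling phase alone is the usual erasure BP decoder; the GE phase recovers additional bits whenever the stalled residual system still has full-rank sub-blocks, which is exactly the gain reported for the BP-plus-GE approach.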
The ability of this technique to improve performance in the infinite transmission range case can potentially be translated to the finite mixing radius scenario discussed in this thesis. It is therefore a promising direction for future investigation.

5.2.2 Expediting data delivery under network embedded coding

In some networks we need to prioritize the communication of data generated in one part of the network over another. For example, networks deployed for disaster monitoring need to prioritize and distinguish between critical and non-critical data. Moreover, they need to expedite the migration of critical data towards the monitoring node (sink). Under random LNC the movement of information takes the form of percolation in all directions, rather than directed migration. In [KMSR07] it was demonstrated that non-uniform demands can be satisfied under network coding when only erasures are seen in the network.

[Diagram: a network divided into segments 1 through n, with a relay LT distribution at the sink.]
Figure 5.1: Illustration representing non-uniform demands from different network segments at the sink

The work in [Kar07] delves deeper into the realm of non-uniform demands for data from different parts (segments) of the network at the sink (Figure 5.1). The analytical framework developed in [Haj06b], which is used to analyze the performance of LT degree distributions, was extended in [Kar07] to the multiple segment case. To verify its accuracy, a simpler two-segment case was studied. Figure 5.2 from [Kar07] demonstrates that the analytic performance prediction is quite close to the actual performance for two separate non-uniform demand cases. This suggests that, using the analysis in [Kar07], one should be able to design solutions for prioritized data delivery under BPA decoding.
However, the work in [Kar07] does not take into consideration the network constraints when determining the code to be mapped to the underlying network graph. Investigating the line of research posed in [Kar07] for network-constrained code design is an important line of future research. Designing prioritized network codes when the distortion goes beyond erasures to include errors is an extremely under-investigated problem. However, the potential benefits resulting from a successful solution to the problem make it another compelling line of research.

[Plot: two panels (a) and (b), each with K = 500, comparing actual and analytic recovery curves for segments 1 and 2.]
Figure 5.2: Two examples comparing analytic and empirical results for non-uniform demand LT codes [Kar07]

The above proposals for future research must not be treated as the only avenues of research and in no way represent a complete list of possibilities. This thesis and the listed references should together form a good starting point for attacking the problems listed in the proposals for future work above.

Bibliography

[ACLY00] R. Ahlswede, Ning Cai, S.-Y. R. Li, and R. W. Yeung. Network information flow. IEEE Transactions on Information Theory, 46(4):1204–1216, 2000.

[ADMK05] S. Acedanski, S. Deb, M. Médard, and R. Koetter. How good is random linear coding based distributed networked storage? In Workshop on Network Coding, Theory and Applications, 2005.

[BL05] X. Bao and J. (T.) Li. Matching code-on-graph with network-on-graph: Adaptive network coding for wireless relay networks. In Proceedings of the 43rd Allerton Conference on Communication, Control, and Computing, September 2005.

[BS02] J. Barros and S. D. Servetto.
On the capacity of the reachback channel in wireless sensor networks. In IEEE Workshop on Multimedia Signal Processing, pages 408–411, December 2002.

[Cam94] P. J. Cameron. Combinatorics: Topics, Techniques, Algorithms. Cambridge University Press, 1994.

[CT91] T. M. Cover and J. A. Thomas. Elements of Information Theory. Wiley Series in Telecommunications, 1991.

[Dar02] R. W. R. Darling. Fluid limits of pure jump Markov processes: A practical guide. Available at arXiv:math.PR/0210109, 2002.

[DN05] R. Darling and J. Norris. Structure of large random hypergraphs. Annals of Applied Probability, 15(1A):125–152, 2005.

[DPR06] A. G. Dimakis, V. Prabhakaran, and K. Ramchandran. Decentralized erasure codes for distributed networked storage. IEEE Transactions on Information Theory, 14(6):2809–2816, June 2006.

[FDTT05] M. Franceschetti, O. Dousse, D. Tse, and P. Thiran. On the throughput capacity of random wireless networks. http://fleece.ucsd.edu/massimo/, 2005.

[Gal63] Robert G. Gallager. Low-Density Parity-Check Codes. MIT Press, Cambridge, MA, 1963.

[GK00] P. Gupta and P. R. Kumar. The capacity of wireless networks. IEEE Transactions on Information Theory, 46(2):388–404, 2000.

[GK04] P. Gupta and P. R. Kumar. Capacity bounds for ad hoc and hybrid wireless networks. Computer Communication Review, 34(3):71–81, 2004.

[GMW81] P. E. Gill, W. Murray, and M. H. Wright. Practical Optimization. Academic Press, London, 1981.

[Gol89] David E. Goldberg. Genetic Algorithms in Search, Optimization and Machine Learning. Kluwer Academic, Boston, MA, 1989.

[Haj06a] B. Hajek. Connections between network coding and stochastic network theory. In Stochastic Networks Conference, June 19–24, 2006.

[Haj06b] Bruce Hajek. Connections between network coding and stochastic network theory. In Stochastic Networks Conference, Urbana, June 2006.

[HKM+03] Tracy Ho, Ralf Koetter, Muriel Médard, David Karger, and Michelle Effros. The benefits of coding over routing in a randomized setting.
In Proceedings of the IEEE International Symposium on Information Theory, page 442, 2003.

[HMK+06] T. Ho, M. Médard, R. Koetter, D. R. Karger, M. Effros, J. Shi, and B. Leong. A random linear network coding approach to multicast. IEEE Transactions on Information Theory, 52(10):4413–4430, October 2006.

[HMS+03] Tracy Ho, Muriel Médard, J. Shi, Michelle Effros, and David R. Karger. On randomized network coding. In 41st Annual Allerton Conference on Communication, Control and Computing, October 2003.

[HSOB05] C. Hausl, F. Schreckenbach, I. Oikonomidis, and G. Bauch. Iterative network and channel decoding on a Tanner graph. In Proceedings of the 43rd Allerton Conference on Communication, Control, and Computing, September 2005.

[IEEa] MIB600CA, Ethernet Interface Board, http://www.xbow.com/Products/productsdetails.aspx?sid=90, last accessed October 14, 2006.

[IEEb] MPR2400, ZigBee, 2.4 GHz, MICA2 Mote, http://www.xbow.com/Products/productsdetails.aspx?sid=101, last accessed October 14, 2006.

[IR08a] M. U. Ilyas and H. Radha. Measurement based analysis and modeling of the error process in IEEE 802.15.4 LR-WPANs. In Proceedings of the 27th Annual Joint Conference of the IEEE Computer and Communications Societies, pages 15–17, 2008.

[IR08b] Muhammad Usman Ilyas and Hayder Radha. Measurement based analysis and modeling of the error process in IEEE 802.15.4 LR-WPANs. In 27th IEEE International Conference on Computer Communications, pages 1274–1282, April 2008.

[Kar07] Shirish Karande. Network Channel Coding: Importance of In-Network Processing and Side-Information. PhD thesis, Michigan State University, 2007.

[KFMR06] Abhinav Kamra, Jon Feldman, Vishal Misra, and Dan Rubenstein. Growth codes: Maximizing sensor network data persistence. In ACM Special Interest Group on Data Communication (SIGCOMM), volume 36, pages 255–266, October 2006.

[KGG05] Gerhard Kramer, Michael Gastpar, and Piyush Gupta. Cooperative strategies and capacity theorems for relay networks. IEEE Transactions on Information Theory, 51(9):3037–3063, September 2005.

[KGV83] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated annealing. Science, 220(4598):671–680, 1983.

[KHH+06] S. Katti, R. Hariharan, W. Hu, D. Katabi, M. Médard, and Jon Crowcroft. XORs in the air: Practical wireless network coding. In ACM Special Interest Group on Data Communication (SIGCOMM), volume 36, pages 243–254, October 2006.

[KM03] Ralf Koetter and Muriel Médard. An algebraic approach to network coding. IEEE/ACM Transactions on Networking, 11(5):782–795, 2003.

[KMR06] Shirish Karande, Kiran Misra, and Hayder Radha. CLIX: Network-coding and cross-layer information exchange (CLIX) of wireless video. In IEEE International Conference on Image Processing, pages 737–740, October 2006.

[KMR08] Shirish Karande, Kiran Misra, and Hayder Radha. Natural growth codes: Partial recovery under random linear coding. In Proceedings of the Conference on Information Sciences and Systems, 2008.

[KMSR07] Shirish Karande, Kiran Misra, Sohraab Soltani, and Hayder Radha. Design of LT codes for nonuniform demands. Submitted to IEEE Transactions on Communications, 2007.

[KPMR06] Shirish Karande, Utpal Parrikar, Kiran Misra, and Hayder Radha. On modeling of 802.11b residue errors. In 40th Annual Conference on Information Sciences and Systems, pages 283–288, March 2006.

[KWSGLA09] S. Karande, Z. Wang, H. R. Sadjadpour, and J. J. Garcia-Luna-Aceves. Multicast throughput order of network coding in wireless ad-hoc networks. In Proceedings of the 6th Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks, pages 235–243, 2009.

[LLL07] Y. Lin, B. Liang, and B. Li. Data persistence in large-scale sensor networks with decentralized fountain codes. In Proceedings of IEEE Conference on Computer Communications (INFOCOM), pages 1658–1666, 2007.

[LRCK07] K. M. Lee, H. Radha, H. Y. Cheong, and S. Karande. LT codes from an arranged encoder matrix and degree distribution design with dense rows. In Allerton Conference on Communication, Control and Computing, 2007.

[Lub02] Michael Luby. LT codes. In 43rd Annual IEEE Symposium on Foundations of Computer Science, pages 271–280, November 2002.

[MEHK03] M. Médard, M. Effros, T. Ho, and D. Karger. On coding for non-multicast networks. In Proceedings of the 41st Allerton Conference on Communication, Control, and Computing, October 2003.

[MKR07] Kiran Misra, Shirish Karande, and Hayder Radha. InPoD: In-network processing based on code design. In 4th Annual IEEE Communications Society Conference on Sensor, Mesh and Ad Hoc Communications and Networks (SECON), pages 324–333, June 2007.

[MKR08a] Kiran Misra, Shirish Karande, and Hayder Radha. Maximal recovery network coding under topology constraint. In IEEE International Conference on Computer Communications (INFOCOM), Phoenix, AZ, April 2008.

[MKR08b] Kiran Misra, Shirish Karande, and Hayder Radha. Maximal recovery network coding under topology constraint. Submitted to IEEE Transactions on Information Theory, 2008.

[MKR09] Kiran Misra, Shirish Karande, and Hayder Radha. InPoD: In-network processing based on code design. Submitted to IEEE Transactions on Information Theory, 2009.

[MR00] Rajeev Motwani and Prabhakar Raghavan. Randomized Algorithms. Cambridge University Press, 2000.

[MS05] E. N. Maneva and A. Shokrollahi. New model for rigorous analysis of LT-codes. arXiv preprint cs/0512029, 2005.

[Pea88] Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.

[RSU00] T. Richardson, A. Shokrollahi, and R. Urbanke. Design of provably good low-density parity check codes. In IEEE International Symposium on Information Theory, page 199, June 2000.

[RSU01] T. Richardson, A. Shokrollahi, and R. Urbanke. Design of capacity-approaching irregular low-density parity-check codes. IEEE Transactions on Information Theory, 47(2):619–637, February 2001.

[RU01] T. Richardson and R. Urbanke. The capacity of low-density parity-check codes under message-passing decoding. IEEE Transactions on Information Theory, 47(2):599–618, February 2001.

[RU08] Tom Richardson and Ruediger Urbanke. Modern Coding Theory. Cambridge University Press, 2008.

[Rya04] William E. Ryan. An introduction to LDPC codes. In CRC Handbook for Coding and Signal Processing for Recording Systems (B. Vasic, ed.), CRC Press, 2004.

[San07] Sujay Sanghavi. Intermediate performance of rateless codes. In IEEE Information Theory Workshop, September 2007.

[Sho02] A. Shokrollahi. An introduction to low-density parity-check codes. Springer Berlin/Heidelberg, pages 175–197, 2002.

[Sho06] Amin Shokrollahi. Raptor codes. IEEE Transactions on Information Theory, 52(6):2551–2567, June 2006.

[SKMR08] S. S. Karande, K. Misra, and H. Radha. Design and analysis of generalized LT-codes using colored ripples. In Proceedings of the IEEE Symposium on Information Theory, pages 2071–2075, 2008.

[SP97] R. Storn and K. Price. Differential evolution: A simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11(4):341–359, 1997.

[SS00] A. Shokrollahi and R. Storn. Design of efficient erasure codes with differential evolution. In International Symposium on Information Theory, page 5, 2000.

[WGM06] Y. Wang, J. Gao, and J. S. B. Mitchell. Boundary recognition in sensor networks by topological methods. In Proceedings of the 12th Annual International Conference on Mobile Computing and Networking, pages 122–133. ACM, 2006.