CODING OVER RESOURCE-CONSTRAINED WIRELESS NETWORKS

By

Rami Halloush

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Electrical Engineering

2012

ABSTRACT

CODING OVER RESOURCE-CONSTRAINED WIRELESS NETWORKS

By Rami Halloush

In practice, wireless networks operate under multiple, often severe, constraints (bandwidth, energy resources, etc.). Consequently, efficient techniques have to be employed to communicate data at sufficiently high data rates (depending on the application) while complying with the imposed constraints. Distributed Video Coding (DVC) and Network Coding (NC) are amongst the most prominent techniques employed in constrained data networks. In this dissertation, we address these two techniques with the objective of realizing practical and efficient data networking solutions that fit resource-constrained wireless networks.

In one part of the dissertation we address DVC over Visual Sensor Networks (VSNs) from a practical point of view; i.e., unlike a large body of research work on this topic, where the focus is on theoretical analysis and simulation, we study the practical aspects that arise when deploying DVC over real visual sensors. To that end, we develop a Resource-constrained DVC (RDVC) codec, deploy it over some of the widely used visual sensors, and conduct precise energy measurements that are used throughout our study. One RDVC-related challenge that we address in this dissertation is source rate estimation, i.e., efficiently identifying the source rate to be used in encoding a frame. This question is crucial since sending more bits than necessary leads to inefficiency in compression, while sending fewer bits leads to failure in decoding. We propose a practical solution that completely eliminates the need for costly feedback messages. Another challenge we address is the global choice between a DVC encoding option that involves intensive computations and leads to less transmission versus another option with minimal computations that implies higher transmission-energy overhead. We carry out an operational energy-distortion analysis for a variety of options available to RDVC on visual sensors.

Polar codes are the first codes proven to achieve capacity while having low encoding and decoding complexity. This motivates us to employ polar codes in our DVC platform. We compare the performance of polar codes with the more established and more investigated Low Density Parity Check Accumulate (LDPCA) codes in the context of DVC. Our results show that polar codes offer a clear advantage when coding smaller size image blocks. Consequently, polar codes could represent a viable solution for distributed sensor networks that capture low-resolution video signals.

In the final part of the dissertation, we survey state-of-the-art NC solutions designed to achieve high throughput in multicasting over wireless mesh networks while maintaining a 100% packet delivery ratio. We propose HopCaster, a scheme that employs the cache-and-forward transport strategy. We show that HopCaster achieves significant performance gains compared to schemes that employ the end-to-end transport strategy.

ACKNOWLEDGMENTS

I would like to show my gratitude to my advisor, Dr. Hayder Radha, for his valuable guidance. I would also like to thank my Ph.D. committee members, Dr. Subir Biswas, Dr. Selin Aviyente, and Dr. Sandeep Kulkarni, for their helpful advice and feedback.
I am extremely grateful to my parents, brothers, and sisters for their continuous support and patience. Special thanks to my brother (and former colleague at the WAVES lab) Mohammed for the generous help he provided me during my study. I would like to thank my friends for the great time I spent with them at MSU.

TABLE OF CONTENTS

List of Tables
List of Figures
Dissertation Outline
1. Introduction
1.1. Distributed Source Coding
1.1.1. Syndrome-based Distributed Source Coding
1.2. Network Coding
1.3. Dissertation Contribution
2. Distributed Video Coding Over Real Visual Sensors
2.1. Introduction
2.2. Visual Sensor Platform
2.3. Syndrome-Based DVC Codec over Real Visual Sensors
2.3.1. Key Frame Intracoding
2.3.2. Wyner-Ziv Coding
2.4. Energy Measurement Setup
2.5. Transmission-Computation Tradeoff
2.6. Discussion
3. Distributed Video Coding Based on Source Rate Estimation
3.1. Introduction
3.2. Conditional Entropy-Source Rate Dependency
3.3. Source Rate Estimation and Performance
3.4. Overall Energy Consumption Performance
3.5. Discussion
4. Polar Codes for Low Resolution Distributed Video Coding
4.1. Introduction
4.2. Rate-adaptive Polar Codes for DSC
4.2.1. Polar Codes
4.2.2. DSC with Polar Codes
4.3. Experimental Results
4.3.1. DVC Codec
4.3.2. Results
4.4. Discussion
5. HopCaster: A Network Coding-Based Hop-by-Hop Reliable Multicast Protocol
5.1. Introduction
5.2. Related Work
5.3. HopCaster Design
5.3.1. Hop-by-Hop Reliable Multicast with Network Coding
5.3.2. Multicast Rate Adaptation
5.3.3. Comparison: Hop-by-Hop vs. End-to-End
5.4. Performance Evaluation
5.4.1. Static Multicast Experiments
5.4.2. Dynamic Multicast Experiments
5.4.3. Multicast Rate Adaptation
5.5. Discussion
6. Conclusion and Future Work
6.1. Conclusion
6.2. Future Work
7. APPENDIX A: LDPCA Codes
REFERENCES

LIST OF TABLES

Table 2.1: Energy cost and time support of different operations.
Table 2.2: Cost of transmission and computation operations.
Table 3.1: Correlation coefficients of different bit-planes in the QCIF resolution Foreman and Hall Monitor video sequences.

LIST OF FIGURES

Figure 1.1: (a) Dependent encoding of two sources where each source has access to the other. (b) DSC's independent encoding where each source is completely isolated from the other. This reduces the encoding complexity (For interpretation of the references to color in this and all other figures, the reader is referred to the electronic version of this dissertation).
Figure 1.2: Rate region defined by the DSC constraints.
Figure 1.3: NC scenario to achieve reliability with high throughput.
Figure 2.1: Sensor platforms employed in our study: (a) Imote2/IMB400 and (b) Micaz/Cyclops.
Figure 2.2: Block Diagram of an abstract WZ encoder.
Figure 2.3: Pictorial demonstration of the DVC codec.
Figure 2.4: Current drawn during different operations (For interpretation of the references to color in this and all other figures, the reader is referred to the electronic version of this dissertation).
Figure 2.5: Battery voltage decay over time during different operations.
Figure 2.6: Compression performance of different encoding schemes.
Figure 2.7: Power-distortion performance of different encoding schemes.
Figure 3.1: Conditional entropy and source rate values over the first 50 WZ frames in the Foreman video sequence, seventh bit-plane.
Figure 3.2: Performance using different methods to identify source rates - QCIF Foreman video sequence.
Figure 3.3: Performance using different methods to identify source rates - QCIF Salesman video sequence.
Figure 3.4: Incremental transmission example of three bitplanes.
Figure 3.5: Performance over a real visual platform - Foreman video sequence.
Figure 3.6: Performance over a real visual platform - Salesman video sequence.
Figure 4.1: Performance of DVC using polar, regular and irregular LDPCA codes applied on (a) 128×128 Akiyo, (b) 64×64 Akiyo, and (c) 32×32 Akiyo.
Figure 4.2: Performance of DVC using polar, regular and irregular LDPCA codes applied on (a) 128×128 Hall-monitor, (b) 64×64 Hall-monitor, and (c) 32×32 Hall-monitor.
Figure 4.3: Performance of DVC using polar, regular and irregular LDPCA codes applied on (a) 128×128 Salesman, (b) 64×64 Salesman, and (c) 32×32 Salesman.
Figure 4.4: Performance of DVC using polar, regular and irregular LDPCA codes applied on (a) 128×128 Foreman, (b) 64×64 Foreman, and (c) 32×32 Foreman.
Figure 4.5: Performance of DVC using polar, regular and irregular LDPCA codes applied on (a) 128×128 Carphone, (b) 64×64 Carphone, and (c) 32×32 Carphone.
Figure 4.6: Performance averaged over the five video sequences with (a) 128×128, (b) 64×64 and (c) 32×32 resolutions.
Figure 5.1: Overhearing along a multicast tree in a WMN.
Figure 5.2: Average throughput of HopCaster and Pacifier under 10 different scenarios.
Figure 5.3: The number of transmissions (ACK and data packets) of HopCaster and Pacifier under 10 different scenarios.
Figure 5.4: The number of redundant bytes that are injected into the network by HopCaster and Pacifier in 10 different scenarios.
Figure 5.5: Average throughput of the new receiver that dynamically joins the multicast group with HopCaster and Pacifier in 10 different scenarios.
Figure 5.6: Average throughput of HopCaster and Pacifier under 10 different scenarios using 2Mbps, 11Mbps, and adaptive data rates.
Figure A.7.1: LDPCA code with different rates.

Dissertation Outline

In Chapter 1 we provide a brief background about Distributed Source Coding (DSC) and Network Coding (NC). We briefly overview the theoretical bounds of DSC, and we explain the syndrome-based DSC scheme that we employ in our study. For NC we provide an example that demonstrates how NC can be used to achieve reliability while maintaining high throughput.

In Chapter 2 we describe the design and implementation of a Resource-constrained Distributed Video Coding (RDVC) codec that we deployed over the popular Micaz/Cyclops and Imote2/IMB400 visual platforms. Further, we report a summary of accurate measurements regarding the distribution of energy consumption over the different DVC operations. We carry out an operational energy-distortion analysis for a variety of options available to RDVC on visual sensors. This aspect addresses the global choice between intensive computations that lead to less transmission versus minimal computations that imply higher transmission-energy overhead.

In Chapter 3 we present our approach towards a rate-adaptive RDVC solution that eliminates the need for feedback messages from the base station to the visual sensors; this solution is crucial for any realistic mapping of DVC over visual sensors.

In Chapter 4 we employ and evaluate the performance of polar codes in DVC. Further, we compare the performance of polar codes with the more established and more investigated Low Density Parity Check Accumulate (LDPCA) codes in the context of DVC.

In Chapter 5 we propose HopCaster, a novel protocol that incorporates intra-flow NC with hop-by-hop transport to achieve high-throughput reliable multicast over wireless mesh networks. We show that by adopting the hop-by-hop transport strategy, HopCaster avoids many problems incurred by the traditional end-to-end transport strategy.

In Chapter 6 we provide a conclusion and outline directions for future work.

Chapter 1

1. Introduction

Since Distributed Video Coding (DVC) is based on the theory of Distributed Source Coding (DSC), we dedicate this chapter to providing a brief overview of DSC. Further, we provide a brief demonstration of using Network Coding (NC) to achieve reliability while maintaining high throughput over wireless networks.

1.1. Distributed Source Coding

Distributed Source Coding (DSC) is a problem in information theory. It deals with compressing two correlated sources independently, i.e., each source has no access to the other source (this explains the name distributed). Slepian and Wolf [1] provided the theoretical bounds for the DSC problem in 1973.
Let X and Y be two correlated independent and identically distributed (i.i.d.) sources. Then X and Y can be compressed at rates RX and RY, respectively, subject to the following constraints:

RX >= H(X | Y)
RY >= H(Y | X)
RX + RY >= H(X, Y)

where H(· | ·) and H(·, ·) are the conditional and joint entropies, respectively. Note that the aggregate rate can achieve the joint entropy, which is also what dependent encoding achieves when the sources have access to each other (a practical example of joint encoding is compression using motion estimation and compensation). This means that, by having independent encoding, DSC provides a compression scheme that has low computational complexity at the encoder side while theoretically achieving the same performance as the more complex joint encoding. Figure 1.1 illustrates the independent encoding of DSC versus the dependent joint encoding scenario.

The constraints above define a set of achievable rates. These rates belong to the rate region shown in Figure 1.2. There are two cases for DSC: symmetric and asymmetric. In the symmetric case, the sources are compressed at the same rate. This case is represented by the line emanating from the green circle in the figure. In the asymmetric case, on the other hand, different rates are used to compress the sources. An important special case of the asymmetric case, on which we focus in this dissertation, is compression with decoder side information. In the latter case, one of the sources, say Y, is compressed at a rate bounded by the entropy H(Y), and the other source is compressed at a rate bounded by H(X | Y). This case is represented by the two red circles in the figure (two corners of the rate region).

Figure 1.1: (a) Dependent encoding of two sources where each source has access to the other. (b) DSC's independent encoding where each source is completely isolated from the other. This reduces the encoding complexity (For interpretation of the references to color in this and all other figures, the reader is referred to the electronic version of this dissertation).

Figure 1.2: Rate region defined by the DSC constraints.

Syndrome-based distributed source coding [2] is a DSC paradigm that involves channel codes. Channel codes are involved in DSC since the disparity between the two sources to be compressed, X and Y, can be viewed as noise. In other words, X can be viewed as a noisy version of Y. Consequently, if Y is already communicated to the decoder, then the encoder can communicate X by transmitting a number of parity bits sufficient to correct the noise in Y. Compression is achieved by transmitting the parity bits. In the next section we provide an example that explains the idea of syndrome-based DSC.

1.1.1. Syndrome-based Distributed Source Coding

Consider the following example where, at the encoder, we have two bitstreams to compress:

Y: 1011010
X: 1001010

Note that the two bitstreams differ only in the third bit. Hence Y can be viewed as a noisy version of X. If Y is already communicated to the decoder, then all the encoder needs to do to communicate X is to correct the noise in Y. To do so, the encoder uses a channel code that maps X into a number of parity bits (or syndrome bits) that is sufficient to correct this noise in Y and get X back. Compression is achieved by sending parity bits rather than bits of X.

For simplicity, we consider the binary (7,4) Hamming code characterized by the following parity check matrix H. In the corresponding Tanner graph [3], the variable nodes y1, ..., y7 correspond to the columns of H and the check nodes s1, s2, s3 correspond to its rows:

H = | 1 1 0 1 1 0 0 |
    | 1 0 1 1 0 1 0 |
    | 0 1 1 1 0 0 1 |

To encode X, its bits are placed on the variable nodes of the graph (circle-shaped nodes). At each check node (square-shaped nodes), the XOR of the bits at the variable nodes connected to that check node is computed. This results in the syndrome bit sequence s = 011 that is transmitted to the decoder (note that this is equivalent to multiplying X by H: s^T = H × X^T). At the decoder, the bits of Y are placed on the variable nodes of the graph, and the received syndrome bits are fed into the check nodes. At each check node, the XOR of the bits at the variable nodes connected to that check node is computed and compared to the syndrome bit fed to that check node. Bits of Y are flipped iteratively until the computed bits at the check nodes match the syndrome bits (this iterative process is achieved through the Belief Propagation (BP) algorithm [3]). In our example, at the decoder, the computed bits at the check nodes would be 000. BP will flip the third bit of Y and stop, because at that point the bits at the check nodes would be 011, which match the received syndrome bits, and X would be restored.
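The syndrome computation and decoding in this toy example can be traced in a few lines of code. The sketch below is illustrative only: it assumes the (7,4) Hamming matrix above and, since only a single bit differs, replaces belief propagation with a brute-force single-bit-flip search.

    import numpy as np

    # Parity check matrix of the (7,4) Hamming code from the example above.
    H = np.array([[1, 1, 0, 1, 1, 0, 0],
                  [1, 0, 1, 1, 0, 1, 0],
                  [0, 1, 1, 1, 0, 0, 1]])

    def syndrome(bits):
        # Encoder side: s^T = H x X^T over GF(2).
        return H.dot(bits) % 2

    def decode(y, s):
        # Decoder side: flip bits of the side information y until its
        # syndrome matches the received syndrome s (single-error case only).
        y = np.array(y)
        if np.array_equal(syndrome(y), s):
            return y
        for i in range(len(y)):
            candidate = y.copy()
            candidate[i] ^= 1
            if np.array_equal(syndrome(candidate), s):
                return candidate
        raise ValueError("side information too noisy for this code")

    x = [1, 0, 0, 1, 0, 1, 0]   # source X
    y = [1, 0, 1, 1, 0, 1, 0]   # side information Y (third bit corrupted)
    s = syndrome(x)             # [0, 1, 1] -- the three transmitted syndrome bits
    print(decode(y, s))         # recovers X from Y and the syndrome

Instead of the seven bits of X, only the three syndrome bits are transmitted, which is exactly where the compression comes from.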
1.2. Network Coding

Network Coding (NC) [4] is a packet forwarding technique that aims to improve on the traditional store-and-forward scheme, where forwarding nodes merely propagate the packets they receive with no processing. In NC, the role of the forwarding nodes is extended to include combining received/buffered packets before transmission. The end receiver node deduces the packets that were originally transmitted. NC aims to achieve the maximum flow [4], which is the maximum data rate for a transmission between two nodes. According to the max-flow min-cut theorem of Ford and Fulkerson [5], for a multicast topology, the maximum flow equals the minimum cut. In [6] it is proved that linear coding is sufficient to achieve this upper bound. In linear network coding, forwarding nodes generate coded packets that consist of linear combinations of previously received packets. The coefficients are chosen from a finite field and are included in the coded packet prior to sending so that they can be used in decoding.

Figure 1.3: NC scenario to achieve reliability with high throughput.

In this dissertation we focus on using NC to achieve reliability in multicast protocols over wireless networks. The advantage of NC in this context is made clear by examining the example illustrated in Figure 1.3. The figure shows a simple multicast scenario where node 1 is a source node, node 3 is a destination node, and node 2 is an intermediate forwarding node. In this example, node 1 needs to send two packets, P1 and P2, to node 3. When broadcasting the two packets, node 2 manages to receive both of them, whereas node 3 overhears packet P1. If node 2 is aware that packet P1 is available at node 3, then it will forward packet P2 only, and this will be the optimal behavior. Otherwise, it will send out both P1 and P2, which will incur one redundant transmission (P1). Using NC eliminates the need to keep track of which packets node 3 holds. Node 2 simply sends a linear combination of P1 and P2, e.g., P = P1 + P2. Upon receiving P, node 3 will be able to reconstruct P2 by simply subtracting P1 from P. In this example, reliability is achieved, as a 100% packet delivery ratio (PDR) is attained, while redundant transmissions are eliminated, resulting in high throughput.
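When the coding coefficients are taken from GF(2), the combination P = P1 + P2 is just a bitwise XOR, and node 3's "subtraction" is the same XOR again. A minimal sketch follows (the packet payloads are made up for illustration):

    def xor_packets(a: bytes, b: bytes) -> bytes:
        # Linear combination over GF(2): bitwise XOR of two equal-length packets.
        return bytes(x ^ y for x, y in zip(a, b))

    p1 = b"contents of packet P1."      # hypothetical payloads of equal length
    p2 = b"contents of packet P2."

    coded = xor_packets(p1, p2)         # node 2 broadcasts a single coded packet P
    recovered = xor_packets(coded, p1)  # node 3 cancels the P1 it already overheard
    assert recovered == p2              # node 3 obtains P2 with one transmission

With coefficients from a larger finite field, the same idea generalizes: each coded packet carries its coefficient vector, and a receiver decodes by Gaussian elimination once it holds enough linearly independent combinations.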
1.3. Dissertation Contribution

The following is a summary of the contributions of this dissertation:

• We describe the design and implementation of a Resource-constrained Distributed Video Coding (RDVC) codec that we deployed over real visual sensor platforms, namely, the popular Micaz/Cyclops and Imote2/IMB400 visual platforms. Further, we report a summary of accurate measurements regarding the distribution of energy consumption over the end-to-end chain of RDVC within an actual sensor, including frame capture, Discrete Cosine Transform (DCT), syndrome computation, and bitstream transmission. The codec and energy measurements are used to highlight practical aspects related to DVC when deployed over real visual sensors.

• We present a practical approach towards a rate-adaptive RDVC that enables the source node to efficiently identify the source rate required to compress a frame at a certain point in time. The presented solution completely eliminates the need for feedback messages from the base station to the visual sensors; this solution is crucial for any realistic mapping of DVC over visual sensors.

• We carry out an operational energy-distortion analysis for a variety of options available to RDVC on visual sensors; this aspect addresses the global choice between intensive computations that lead to less transmission versus minimal computations that imply higher transmission-energy overhead.

• The recently introduced polar codes [7] have many interesting features; e.g., they are the first codes proven to achieve capacity, and they have low encoding and decoding complexity. As polar codes are relatively recent, there has been little work assessing their potential for practical applications. We employ polar codes in Distributed Video Coding (DVC), and we compare their performance with the more established and more investigated Low Density Parity Check Accumulate (LDPCA) codes in the context of DVC.

• We propose HopCaster, a Network Coding (NC) scheme for wireless networks. NC schemes that employ the conventional end-to-end transport strategy suffer performance degradation because, under that strategy, forwarding nodes are unable to accurately determine the number of coded packets they need to forward, so a large number of redundant packets is injected into the network. HopCaster employs a novel hop-by-hop reliable multicast transport scheme that exploits in-network caching. The need for estimating the number of coded packets at forwarding nodes is completely eliminated, which results in performance gains. In addition, HopCaster is integrated with a cross-layer rate adaptation scheme that optimizes multicast throughput based on the varying receiver topology at different transmission stages.

Chapter 2

2. Distributed Video Coding Over Real Visual Sensors

Despite the vast body of work covering the theoretical, analytical, and simulation-based aspects of Distributed Video Coding (DVC), there has been little work focusing on the practical aspects of DVC over actual sensors. In this chapter, we provide a rather comprehensive description of key aspects of a Resource-constrained DVC (RDVC) framework that stems from our experience in mapping DVC solutions onto actual visual sensors.
In particular, (1) we describe the design and implementation of an RDVC codec that we deployed over the popular Micaz/Cyclops and Imote2/IMB400 visual platforms. (2) We report a summary of accurate measurements regarding the distribution of energy consumption over the end-to-end chain of RDVC within an actual sensor, including frame capture, DCT and syndrome computation, and bit-stream transmission. (3) We carry out an operational energy-distortion analysis for a variety of options available to RDVC on visual sensors; this aspect addresses the global choice between intensive computations that lead to less transmission versus minimal computations that imply higher transmission-energy overhead.

2.1. Introduction

Motivated by advances in low-power image sensing technologies and wireless sensor networking, a new paradigm of ubiquitous monitoring has emerged: Visual Sensor Networks (VSNs). Typically, a VSN consists of resource-constrained nodes (motes) and a more powerful node referred to as a base station. Motes periodically sense their surroundings and propagate the sensed visual content to the base station. Motes are usually battery operated; hence, to achieve an acceptable mote lifetime, the overall power consumption needs to be minimized. There are two main sources of power consumption in a mote: (a) computational processing and (b) data transmission. To reduce the transmission power consumption, visual sensor data has to be compressed without requiring significant computations and/or exchange of data among sensors. Therefore, a new compression paradigm has emerged for VSNs, one that employs low-complexity independent encoding at the encoder side and delegates intensive computations to the decoder side (the powerful base station). This paradigm is called Distributed Video Coding (DVC) [2] [8] and is based on the theory of Distributed Source Coding (DSC) pioneered by Slepian and Wolf [1] and Wyner and Ziv [9].

The importance of DVC as a compression paradigm that fits well in VSNs attracted researchers and led to significant advances (e.g., [2] [8]). A variety of extensions to early DVC approaches have also been proposed (e.g., [10] [11] [12]). Many research efforts addressing DVC employ theoretical and analytical formulations, while others use simulations to derive results and conclusions. Interestingly, there has been little work highlighting the practical aspects of DVC over real visual sensor platforms and devices (e.g., [13] [14]). This fact motivated us to conduct a study towards a Resource-constrained DVC (RDVC) platform deployed over actual visual sensors [15] [16]. As part of our study, we implemented a flexible syndrome-based [2] DVC codec and deployed it over different real visual sensor platforms, namely, the Micaz/Cyclops [17] [18] and the Imote2/IMB400 [18] platforms. This chapter is intended to share our experience in designing, implementing, and deploying practical DVC over real visual sensor platforms. We use our implementation to outline key practical aspects that arise when an RDVC platform is targeted.

One practical aspect that we present in this chapter is an operational energy-distortion analysis for a variety of options available to RDVC when it is deployed over an actual visual sensor; this aspect addresses the global choice between intensive computations that lead to less transmission versus minimal computations that imply higher transmission-energy overhead.
Another important practical feature desired (arguably required) in any viable RDVC solution is the ability of the DVC encoder to identify the source rate of a Wyner-Ziv (WZ) frame without the need to receive feedback messages from the decoder. A practical solution for this feature is described in the next chapter of this dissertation.

It is apparent that studying the aforementioned practical aspects necessitates having an accurate power consumption measure for radio transmission and for the different computational operations involved in DVC. Consequently, we present the approaches that we followed in measuring power, and later on, how we employ our measurements in studying the different practical aspects of DVC.

The remainder of this chapter is organized as follows. In Section 2.2, we describe the visual sensor platforms that we use in our study. In Section 2.3 we describe our syndrome-based RDVC codec implementation. In Section 2.4 we outline the approaches we followed in evaluating the power consumption of the codec when deployed over a visual sensor. In Section 2.5 we highlight the importance of having an operational power-distortion description of the different options available to a DVC codec in order to make a global choice amongst such options. In Section 2.6 we conclude the chapter and introduce the next chapter.

2.2. Visual Sensor Platform

Many visual sensor platforms can be found in the market and in the literature. They differ in many aspects such as power consumption, image quality, physical dimensions, etc. We can categorize visual sensor platforms into two categories:

1. Sensor platforms that were originally designed for sensing scalar physical phenomena (such as temperature, humidity, etc.). In general, scalar applications have lower computational demands than visual applications; hence, scalar motes are limited in their resources. For instance, Micaz is equipped with the 8-bit ATmega128L [19] microcontroller with its clock speed set to 8 MHz. It has 128 KB of program flash memory and 512 KB of flash memory to store measurements. For radio communication, it uses the low-power CC2420 RF transceiver [20], which is 2.4 GHz IEEE 802.15.4/ZigBee compliant [21]. To bridge the gap between the demanding visual application and the resource-constrained mote, the image sensor is endowed with its own computational unit, and the underlying mote is merely responsible for data communication. For instance, Cyclops is equipped with an ATmega128L microcontroller operating at 7.3728 MHz and a complex programmable logic device (CPLD). For imaging, it is equipped with a CMOS Agilent ADCM-1700 CIF-resolution imager, in addition to 64 KB of external SRAM memory to store captured images. Other visual sensor platforms that belong to this category are the CMUcam3 [22], eCAM [23], and WiCa [24].

2. The second category includes sensor platforms that were designed to support multimedia applications. Motes in this category have extended capabilities in terms of computational units and memory size. Imote2 [18] is one example of such motes. Imote2 is equipped with the PXA271 XScale microprocessor [25], whose frequency can be scaled from 13 MHz up to 416 MHz. The PXA271 includes a wireless MMX (WMMX) coprocessor to accelerate computations. Moreover, the PXA271 includes 256 KB SRAM, 32 MB SDRAM, and 32 MB flash memory. Further, Imote2 interfaces with the IMB400 multimedia board [18] to form a visual sensor node. The IMB400 is equipped with the OV7670 imager, which can capture 640×480 resolution images.
The IMB400 only captures frames, and all processing is done on the underlying Imote2 node. Another mote that belongs to this category is Stargate [18]. For a comprehensive survey on sensor platforms, we refer the reader to [26].

In this chapter, we describe our experience in implementing and deploying RDVC over two different visual sensor platforms: Micaz/Cyclops and the more recent Imote2/IMB400. Both Micaz and Imote2 are programmed using nesC [27], and they both run TinyOS [28], which is becoming a de facto standard operating system (OS) for embedded sensor networks.

Figure 2.1: Sensor platforms employed in our study: (a) Imote2/IMB400 and (b) Micaz/Cyclops.

Figure 2.1 shows both the Imote2/IMB400 and the Micaz/Cyclops platforms that are used in our study.

2.3. Syndrome-Based DVC Codec over Real Visual Sensors

There has been a vast body of work related to DVC in the literature. PRISM [12] (Power-efficient, Robust, hIgh-compression, Syndrome-based Multimedia coding) is one of the earliest schemes. To encode a frame, it is first divided into blocks (16×16 or 8×8) over which the Discrete Cosine Transform (DCT) is applied. The correlation noise between the block to be encoded and the co-located one in the previous frame is computed (the squared error difference between the DCT coefficients in the two blocks). The amount of correlation noise is used to classify the block being encoded into three categories: (1) Skip mode, when the amount of correlation noise is negligibly small; in this mode the block is not encoded at all. (2) Intra mode, when there is a significant amount of correlation noise; in this mode the block is intracoded. (3) DSC mode, when the correlation noise is relatively small and can be corrected through channel coding. In this case, the codeword space is divided into cosets, each of which is labeled by a coset leader (syndrome). Compression is achieved by transmitting the syndrome, whose size depends on the correlation channel condition. In [12], a trellis channel code is used. To aid the decoder in performing motion estimation, the encoder generates a cyclic redundancy check (CRC) of the encoded block. The decoder uses different blocks residing in the frame memory as candidate side information for decoding. Each decoding attempt is followed by a CRC computation, and the decoding attempt whose CRC matches the received CRC is the one to be adopted.

Girod et al. [29] [30] [31] [32] have contributed to developing a series of DVC schemes. In [29], even- and odd-numbered frames are separated into two different groups. Pixel-domain WZ encoding is applied to even-numbered frames, whereas odd-numbered frames are intracoded. Pixels are first quantized using a uniform scalar quantizer. The quantized pixels are then fed into a rate-compatible punctured turbo (RCPT) code. The resulting parity bits are stored in a buffer. A feedback channel is maintained between the encoder and the decoder. The encoder sends parity bits in an incremental way, in response to requests received from the decoder, until successful decoding is acknowledged. Many techniques are used to generate SI frames. The simplest is to average the predecessor and successor of the WZ frame. More sophisticated techniques involve motion estimation. After the decoder decodes the WZ frame, it uses a minimum mean square error (MMSE) estimator to restore pixel values.
In [30] the scheme is developed into a more general framework where SI frames are generated by interpolating or extrapolating frames residing in the decoder's frame memory (already decoded WZ frames or previous key frames). In [31] the framework is further extended to work in the transform domain, where the DCT is used. In [32], channel coding (turbo codes) is applied to bitplanes of quantized DCT coefficients. Further, motion estimation (ME) at the decoder side is used to generate SI. To achieve that, the encoder generates helper motion estimation information. In particular, a hash codeword is generated at the encoder side and sent to the decoder. The hash codeword is simply a coarsely quantized and subsampled version of the block to be encoded. Similar to PRISM, the distance between the hash of the current block and the co-located one in the previous frame is calculated. If the distance exceeds a certain threshold, the hash codeword is sent. The decoder uses the hash to conduct a motion search that results in the best SI block.

Figure 2.2: Block Diagram of an abstract WZ encoder.

From the description of the aforementioned schemes, we can identify the components of a WZ encoder. Figure 2.2 shows a block diagram of an abstract WZ encoder, comprising DCT, quantizer, channel code, rate control, channel model, and ME helper information modules. To cope with the stringent nature of motes, the different modules shown in the figure have to be implemented in a way that ensures low complexity. We address this requirement by making the following observations:

1. In general, bitwise logical operations (such as AND, OR, and SHIFT) are considered inexpensive in terms of power consumption [33] compared to other operations (multiplication, for instance). This reflects on the channel code module shown in Figure 2.2, as we design our codec to operate on the bit level (bitplanes of images). Here, a syndrome-based [2] Low-Density Parity-Check (LDPC) DVC scheme [3] arises as an attractive solution (see Appendix A: LDPCA Codes), where the core operation is a bitwise XOR between a bitplane and rows of a parity check matrix (as opposed to other DVC schemes that use more complex operations such as modular arithmetic [34] [12]). Furthermore, our RDVC codec avoids expensive floating point operations, and this reflects on the DCT module by employing the DCT described in [35], where computations involve only integers and expensive multiplications are replaced with shift operations.

2. The scarcity of memory is another reason that justifies employing a syndrome-based LDPC code in the channel code module, as the sparsity of the LDPC parity check matrix allows for compact representation and storage.

3. Different compression schemes in DVC have different underlying energy demands and, in return, provide different video quality scores. For example, the DCT component can be skipped and compression applied directly to raw pixels. Moreover, both DCT coefficients and pixels could be represented via Gray codes or regular binary codes. Our study shows that various combinations of these coding options provide better performance over different ranges of the energy-quality space [15]. For example, under certain conditions, the combination of raw-pixel and Gray coding can outperform other coding options (including DCT-based ones), as we show later in this chapter (Section 2.5).
4. For the rate control module, employing a feedback channel between the encoder and decoder may not be desirable in setups where DVC is deployed over real visual sensor networks, due to power considerations. In the next chapter, we present a practical solution to this problem over real visual sensors. Our solution results in using source rates close to the ideal rates while completely avoiding incremental transmissions and feedback messages.

The following is a brief description of our RDVC codec. We consider a setting where a frame captured by the encoder is classified into one of two categories:

a) Key (side information) frame: captured at time instance n. We refer to a key frame by Y.

b) WZ frame: a frame that follows a key frame, i.e., it is captured at time instance n+1. We refer to a WZ frame by X.

Therefore, our codec encodes captured frames in pairs. In each pair, the frame that was captured earlier is a key frame, and the following frame is a WZ frame.

2.3.1. Key Frame Intracoding

Key frame encoding can be achieved in many low-complexity ways following traditional still-image codecs used in popular standards, such as the ones used in the different JPEG and MPEG specifications. In summary, intracoding a key frame Y starts with applying the DCT to the frame. The resulting DCT coefficients are then quantized. The quantization process turns many DCT coefficients (the ones with small values) into zeros. Run Length Encoding (RLE) is then used to encode the quantized coefficients. Finally, Huffman encoding is used to encode the output of the RLE.

2.3.2. Wyner-Ziv Coding

Let Xi and Yi denote the i-th bitplanes of WZ frame X and key frame Y, respectively, where i ranges from 1 (least significant bitplane) to 8 (most significant bitplane). The following is a description of the encoding process. Key frame Y is intra-coded and communicated to the decoder. WZ frame X is DCT transformed (this can be skipped when DVC is applied on raw pixels). A uniform quantizer with M = 2^b intervals is used to quantize the DCT coefficients (or pixels). The quantized values are then represented via Gray codes (or regular binary codes). The resulting frame is decomposed into b bitplanes. The next step is to use Low-Density Parity-Check Accumulate (LDPCA) [36] codes to encode the bitplanes (see Appendix A: LDPCA Codes). For all values of i, we scan the h×l size bitplanes Xi and Yi into one-dimensional vectors xi and yi of size n×1, where n = h×l. xi is then multiplied by the n×n parity-check matrix H:

Si = H × xi                                                             (2.1)

The n×1 vector Si is referred to as the syndrome vector. The bits in Si are then accumulated to generate a new n×1 vector, denoted here as Ai [36]. A subset of bits is chosen from Ai to be sent to the decoder, where the number of transmitted bits depends on the disparity between Xi and Yi. The larger the disparity, the more bits the encoder needs to send (i.e., the higher the source rate, where the source rate is the number of parity bits needed for coding a WZ frame divided by the total number of bits in the WZ frame) and the smaller the achieved compression ratio. Identifying the appropriate source rate at the encoder is a crucial practical aspect that we will discuss in detail in the next chapter. For now, let us assume that the encoder transmits a sufficient number of bits ti to the decoder.

At the decoder side, the bit accumulation is inverted to restore ti syndrome bits. Further, a Tanner graph [3] that has ti check nodes and n variable nodes is constructed for decoding xi. The bits of yi (representing the i-th bitplane of Y) are fed into the n variable nodes, whereas the ti syndrome bits are fed into the check nodes, and the belief propagation [3] algorithm runs on the graph to restore xi. When this process is applied to all bitplanes, i.e., for all i from 1 to b, the decoded b bitplanes are combined into one frame. The final step is performing inverse Gray coding and inverse DCT (if they are used) to approximate the encoded WZ frame.

Figure 2.3: Pictorial demonstration of the DVC codec.

Figure 2.3 provides a pictorial illustration of the whole process. The role of hi in the figure will become clear in the next chapter.
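The per-bitplane encoder path (quantize, Gray-map, slice into bitplanes, compute and accumulate syndromes) can be summarized in a short sketch. The code below is a simplified illustration rather than our nesC implementation: a small random sparse matrix stands in for the actual LDPCA parity-check matrix, and the transmitted subset of accumulated bits is simply the first few (a real LDPCA code transmits a structured subset determined by the chosen rate).

    import numpy as np

    def to_gray(q):
        # Gray-map quantized values so adjacent levels differ in one bit.
        return q ^ (q >> 1)

    def bitplanes(frame, num_planes):
        # Slice a frame of quantization indices into bitplanes (LSB = plane 0).
        return [((frame >> i) & 1).flatten() for i in range(num_planes)]

    def encode_bitplane(x_i, H, num_tx_bits):
        # Syndrome S_i = H * x_i over GF(2) (eq. 2.1), then accumulate (running XOR)
        # and keep only num_tx_bits of the accumulated bits for transmission.
        s = H.dot(x_i) % 2
        a = np.bitwise_xor.accumulate(s)
        return a[:num_tx_bits]

    # Toy usage with made-up sizes:
    b = 4                                            # bits per quantized sample
    frame = np.random.randint(0, 2 ** b, (8, 8))     # stand-in for DCT/pixel data
    gray = to_gray(frame)
    n = frame.size
    H = (np.random.rand(n, n) < 0.05).astype(int)    # stand-in for the LDPCA matrix
    streams = [encode_bitplane(x_i, H, num_tx_bits=n // 2)
               for x_i in bitplanes(gray, b)]        # one syndrome stream per bitplane

Only bitwise operations (XOR, shifts) appear on the encoder side, which is precisely why this structure suits the constrained mote.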
2.4. Energy Measurement Setup

Because motes are battery operated, their energy resource is quite limited. A practical DVC solution aims to utilize energy in an efficient manner in order to achieve an acceptable mote lifetime. This utilization takes the form of compression (and hence less transmission) that has low computational complexity. Consequently, measuring the power consumption of transmission and computation is crucial for studying practical DVC solutions. Our key objective here has been to evaluate the energy consumption of transmission, as well as of the different DVC computational operations, in isolation from each other. Therefore, we have experimented with the following approaches:

1. A shunt resistor is placed in series between the visual sensor mote (Imote2/IMB400 node) and the power supply. The voltage across the shunt resistor is captured using an Infiniium oscilloscope [37]. To capture small voltage levels, we use an instrumentation amplifier [38]. We account for the offset introduced by the amplifier in later energy calculations. To measure the energy consumption of a particular operation (transmission or computation), we need to mark the beginning and end of that operation. Therefore, we make use of the general-purpose pins available on the Imote2's microprocessor; we program our codec so that it asserts one pin (turns the signal at that pin on) at the beginning of the operation of interest and deactivates that pin when the operation is over. By monitoring this indicator signal on one channel of the oscilloscope and the drawn-current signal on another channel, we are able to accurately quantify the energy consumption of any operation. The energy is computed as

E = Σ_{i=1}^{T} Pi                                                      (2.2)

where Pi denotes the sampled power during the operation and T is the number of samples spanning the operation's time support. Figure 2.4 shows the current drawn by the Imote2/IMB400 node while compressing and transmitting a pair of a key frame and a following WZ frame. It also shows the time support of the different operations in the codec (obtained via the indicator signal). Note that the current starts from its lowest level while capturing a frame, and then it goes up during DCT and intracoding. The radio is turned off during those operations, and it is turned back on only upon transmission. Therefore, when intracoding is over, the radio turns on and transmission begins. When transmission ends, the radio goes back off. For a WZ frame, the current starts from the lowest level while capturing the frame, then it rises during DCT and WZ encoding (denoted by DVC in the figure), and finally the radio is turned on to transmit the syndrome bit stream. Table 2.1 summarizes the per-frame energy cost of the different computational operations, in addition to the packet transmission cost. Note that transmission can take a long period of time if the data is not sufficiently compressed; as energy is a function of time duration, DVC aims to reduce the volume of transmitted data, and hence the time duration of transmission, in order to achieve energy savings.
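As a concrete illustration of this procedure, the sketch below post-processes two captured traces (drawn current and the GPIO indicator signal) to obtain the energy of one operation. It is a simplified stand-in for the actual measurement scripts; the supply voltage, sample period, and trace format here are assumptions made only for the example.

    import numpy as np

    def operation_energy(current, indicator, dt, v_supply=4.5):
        # Energy (joules) of one operation window marked by the indicator pin.
        #   current   -- sampled current in amperes
        #   indicator -- same length, nonzero while the pin is asserted
        #   dt        -- oscilloscope sample period in seconds
        #   v_supply  -- assumed supply voltage in volts
        mask = indicator > 0                 # samples inside the operation's time support
        power = v_supply * current[mask]     # P_i for each sample (cf. eq. 2.2)
        return float(np.sum(power) * dt)     # E = sum_i P_i * dt

    # Example: a 0.3 s trace sampled at 10 kHz (synthetic numbers).
    dt = 1e-4
    t = np.arange(0, 0.3, dt)
    current = np.full_like(t, 0.05)          # idle current
    current[(t > 0.10) & (t < 0.13)] = 0.20  # e.g., a radio transmission burst
    indicator = ((t > 0.10) & (t < 0.13)).astype(int)
    print(operation_energy(current, indicator, dt))  # about 2.7e-2 J in this toy trace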
Figure 2.4: Current drawn during different operations (For interpretation of the references to color in this and all other figures, the reader is referred to the electronic version of this dissertation).

  Operation     Energy            Time Period
  DCT           8.52 mJ/frame     15 ms/frame
  DVC           5.53 mJ/frame     10 ms/frame
  Capture       16.87 mJ/frame    44.25 ms/frame
  Intracoding   3.27 mJ/frame     5 ms/frame
  TX            3.4 mJ/packet     7.8125 ms/packet

Table 2.1: Energy cost and time support of different operations.

2. As power consumption affects the mote's battery voltage level, a natural way to quantify power consumption is to monitor the battery's voltage level during the course of compression and transmission. Consequently, in [15] our approach is based on voltage discharge curves [39]. We modify the codec (deployed over the Micaz/Cyclops node) so that it executes only the operation of interest (DCT, for instance) repeatedly. Meanwhile, the voltage level of the battery is sampled and recorded. This process continues until the voltage level reaches a certain threshold, at which point the encoder stops and the average cost per frame of the operation is computed. Figure 2.5 shows the voltage-level decay over time for frame capturing, DCT, and DVC (syndrome generation). Table 2.2 summarizes the frame processing cost (of the different encoding operations) in addition to the packet transmission cost. By achieving sufficient compression, DVC aims to reduce the number of transmitted packets and hence reduce energy consumption.

Figure 2.5: Battery voltage decay over time during different operations.

  Operation   Voltage Decrease
  Camera      8.87 µV/frame
  DCT         8.81 µV/frame
  DVC         21.27 µV/frame
  TX          0.509 µV/packet

Table 2.2: Cost of transmission and computation operations.

2.5. Transmission-Computation Tradeoff

There are many options available to an RDVC codec when it is deployed over an actual visual sensor. Those options have different underlying energy requirements (computation and transmission energy), and they result in different video quality (distortion) scores. While a computationally intensive scheme is costly in terms of computational power consumption, it achieves significant compression and hence transmission power savings. On the other hand, a less complex encoding scheme is less costly in terms of computational power consumption and more costly in terms of transmission power. Interestingly, as we show in [15], although the more complex scheme achieves more compression, and hence more transmission power savings, it may not be more efficient in terms of the overall power consumption (computation and transmission) beyond a particular achieved level of video quality. We show this by conducting an operational energy-distortion analysis involving two DVC codecs: (1) a DVC codec that incorporates the DCT (extra computational power is spent on performing the DCT), and (2) a DVC codec that runs directly on raw pixels (less power consuming than the DCT case, but less compression is achieved and hence more transmission power is consumed).
We compare the two codecs when both are deployed over the Micaz/Cyclops platform, using the measurements reported in Section 2.4. Both codecs are used to encode a video sequence that consists of 100 frames captured by the Cyclops camera. The frames have 64×64 resolution. Figure 2.6 compares the two codecs. Each point in the figure corresponds to the number of bytes that resulted from encoding the 50 Wyner-Ziv frames using the DCT and pixel encoding schemes with different quantization levels, with and without Gray codes. Note from the figure that when Gray coding is not used, the DCT scheme results in more data compression than the pixel scheme. This also holds when Gray coding is used.

Figure 2.6: Compression performance of different encoding schemes.

As we are more interested in comparing the two schemes in terms of the overall cost (compression and computation) at different video quality levels, we use the measurements presented in Section 2.4 (for Micaz/Cyclops) to map the points in Figure 2.6 to those in Figure 2.7. The overall cost at each point has three components: (1) the cost of capturing and applying DVC (computing syndromes) to the 50 Wyner-Ziv frames; this component is common to the DCT- and pixel-based schemes. (2) The cost of transmitting the encoded frames; this differs across the two schemes and depends on the compression performance of each. (3) The cost of applying the DCT; this is applicable only in the DCT case.

Figure 2.7: Power-distortion performance of different encoding schemes.

Figure 2.7 shows that for the case of no Gray coding, and for lower video quality values (below 34 dB), the DCT scheme demonstrates lower overall power consumption. In this case the DCT achieves significant savings in transmission power, such that the overall power (transmission and computation) is still less than that of the pixel case. On the other hand, for PSNR values greater than 34 dB, the pixel-based scheme demonstrates lower overall power consumption. In this case the transmission power saving of the DCT is not significant enough to keep the overall power consumption below that of the pixel case. The same reasoning applies to the case of Gray coding. Figure 2.7 provides an operational energy-distortion analysis for both encoding schemes. For any RDVC codec, it is extremely important to have this type of analysis when there is a set of options available for the codec, those options have different computational requirements, and they result in different quality scores. Having a power-distortion view of the different options assists in making a globally optimal choice that accounts for the tolerable video quality and the power budget.
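The mapping from compression results to points in the power-distortion plane is a simple bookkeeping exercise over the per-operation costs of Table 2.2. The sketch below shows the idea; the byte counts are placeholders, the packet payload size is an assumption, and the cost constants are the Micaz/Cyclops voltage-decrease figures from Table 2.2.

    import math

    # Per-operation costs from Table 2.2 (Micaz/Cyclops, microvolts of battery decay).
    COST = {"capture": 8.87, "dct": 8.81, "dvc": 21.27, "tx_packet": 0.509}
    PAYLOAD_BYTES = 100   # assumed bytes of coded data carried per packet

    def scheme_cost(num_wz_frames, coded_bytes, use_dct):
        # Overall cost of encoding and transmitting num_wz_frames WZ frames.
        packets = math.ceil(coded_bytes / PAYLOAD_BYTES)
        cost = num_wz_frames * (COST["capture"] + COST["dvc"])  # component common to both schemes
        if use_dct:
            cost += num_wz_frames * COST["dct"]                 # DCT-only component
        cost += packets * COST["tx_packet"]                     # transmission component
        return cost

    # Hypothetical outputs of the two codecs for the same 50 WZ frames at one PSNR:
    print(scheme_cost(50, coded_bytes=40000, use_dct=True))   # DCT scheme
    print(scheme_cost(50, coded_bytes=70000, use_dct=False))  # raw-pixel scheme

Each (cost, PSNR) pair obtained this way is one point in Figure 2.7, and comparing the curves of the two schemes is what reveals the crossover around 34 dB discussed above.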
2.6. Discussion

In this chapter we described an RDVC codec that we deployed over the popular Micaz/Cyclops and Imote2/IMB400 visual platforms. Further, we used different ways to evaluate the power consumption of the codec over both platforms. We used the codec along with the power measurements to highlight some practical aspects of DVC over actual platforms and devices. In particular, we showed that when there is a variety of options available to RDVC, some are computationally intensive and lead to less transmission, while others are less intensive and result in higher transmission-energy overhead. In this case, having a power-distortion profile of the different options assists in making a global decision amongst them. In the next chapter, we use the RDVC codec and the energy measurements described in this chapter to address another practical aspect: the problem of identifying the appropriate source rate at the encoder side without the need to exchange feedback messages with the decoder.

Chapter 3

3. Distributed Video Coding Based on Source Rate Estimation

In Distributed Video Coding (DVC) the encoder compresses frames at source rates that depend on the statistical dependency between the Wyner-Ziv (WZ) and side information frames. An important issue that we address in this chapter is providing the encoder with a mechanism to identify the source rate to be used in encoding a WZ frame (here, the source rate is the number of parity bits needed/used for coding a WZ frame divided by the total number of bits in the WZ frame). One possible solution is to follow a feedback approach: the encoder starts by sending a small amount of data (a low source rate). In case the decoder fails to recover the compressed frame, it provides the encoder with feedback requesting more bits. This solution results in using source rates that are ideal in the sense that they are the minimal rates that lead to successful decoding. Nevertheless, this solution might not be practical for visual sensor networks, as it may exhaust limited bandwidth and energy resources. In this chapter we present a mechanism to estimate ideal source rates without the need to exchange feedback messages. The proposed mechanism uses the conditional entropy (the entropy of a WZ source conditioned on a side information source) to estimate ideal source rates. Further, we show that by estimating the source rates (using optimal estimators) we can achieve video quality and compression performance close to those achieved when feedback messages are exchanged. Moreover, we show that by avoiding incremental transmission and feedback messages, the proposed estimation-based approach can demonstrate lower energy consumption over a range of video quality compared with the feedback approach when both are deployed over a real visual sensor platform, namely, the Imote2/IMB400.

3.1. Introduction

One important question is: how can the encoder know the number of parity bits that have to be sent to the decoder? Or, what is the source rate that it has to use in encoding a WZ frame? One possible solution is to use feedback messages between the decoder (base station) and the encoder (sensor node). Under the feedback approach, when coding a WZ frame, the encoder transmits the compressed data in an incremental fashion; it starts with a low source rate (a small number of parity bits). If the decoder fails to restore the WZ frame using this number of bits, it (the decoder) sends the encoder a feedback message requesting more bits. Thus, the encoder incrementally increases the source rate until the decoder acknowledges successful decoding. This approach is considered ideal in the sense that it will eventually lead to successful decoding for all encoded WZ frames while requiring the minimal number of parity bits. Nevertheless, this approach may not be desirable (or even feasible) in setups where DVC is deployed over real visual sensor networks, because limited communication bandwidth and power considerations may not afford the feedback messages and the incremental transmission overhead.
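The feedback-driven protocol just described is essentially a loop over an increasing ladder of source rates. A compact sketch of that interaction follows, with placeholder encode/decode hooks standing in for the LDPCA codec and for the radio exchange with the base station:

    def feedback_rate_search(bitplane, side_info, rates, encode, try_decode):
        # Incremental feedback approach: keep raising the source rate until the
        # decoder succeeds. `encode` and `try_decode` are placeholder hooks for
        # the LDPCA encoder and for the base station's BP decoder plus its ACK/NACK.
        for rate in sorted(rates):                  # e.g. 1/16, 2/16, ..., 16/16
            syndrome_bits = encode(bitplane, rate)  # bits accumulated so far at this rate
            if try_decode(side_info, syndrome_bits):
                return rate                         # minimal (ideal) rate found
        raise RuntimeError("decoding failed even at the highest source rate")

Every NACK in this loop costs a radio round trip, which is exactly the overhead the estimation-based approach of this chapter removes; the same loop, run offline, is also how the (hi, ri) training pairs of the next section are collected.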
In this chapter, we present a practical solution for this problem over visual sensor networks. Our solution results in using source rates that are close to the ideal rates of the feedback approach while completely avoiding incremental transmissions and feedback messages. The approach we are proposing is to use the conditional entropy H(X | Y) [40] to estimate ideal source rates. Conditional entropy gives information about the disparity between X, the WZ frame, and Y, the side information frame. As the source rate depends on this disparity, we expect the source rate to be statistically dependent on the conditional entropy, and hence, the encoder can use the conditional entropy to estimate the necessary source rate. In this chapter, we examine this dependency. Further, we build a DVC codec that employs two estimators: a Maximum a Posteriori (MAP) estimator, and a Minimum Mean Squared Error (MMSE) estimator. We use estimated source rates to encode different video sequences, and we show that the performance of the proposed estimation-based approach (in terms of compression and video quality) is close to that achieved when feedback messages are exchanged. Moreover, we show that due to the overhead incurred by employing the feedback approach in a real visual sensor application, the estimation-based approach could result in lower energy consumption for a range of video quality levels. To demonstrate that, we use the imote2/IMB400 [18] visual sensor platform.

This chapter is organized as follows. In Section 3.2 we investigate the dependency between the conditional entropy and the source rate. In Section 3.3 we use the conditional entropy to estimate source rates using two types of estimators, MAP and MMSE, and we show the performance of both estimators. In Section 3.4 we highlight the overhead of the feedback approach in a real visual sensor application, and we compare it with the proposed estimation-based approach. In Section 3.5 we outline the conclusions of this study.

3.2. Conditional Entropy-Source Rate Dependency

The conditional entropy h_i = H(X_i | Y_i) evaluates the disparity between the bitplane being encoded (X_i) and the bitplane of the same significance in the key frame (Y_i). h_i provides a lower bound on the source rate for encoding X_i. In practice, the minimum source rate for encoding X_i that leads to error-free decoding, denoted by r_i and referred to as the ideal source rate, will be greater than the h_i bound for several reasons, such as the performance of the underlying channel code.
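For concreteness, the short sketch below shows one straightforward way this per-bitplane conditional entropy could be evaluated empirically from the two bitplanes. The arrays and the 10% flip rate are illustrative values, not measurements from our codec.

```python
import numpy as np

def conditional_entropy(x_bits, y_bits):
    """Empirical H(X|Y) in bits for two equal-length binary bitplanes.

    x_bits: a bitplane of the WZ frame; y_bits: the bitplane of the same
    significance in the key frame (both flattened to 1-D arrays of 0/1).
    """
    x = np.asarray(x_bits).ravel()
    y = np.asarray(y_bits).ravel()
    h = 0.0
    for yv in (0, 1):
        p_y = np.mean(y == yv)
        if p_y == 0:
            continue
        for xv in (0, 1):
            p_xy = np.mean((x == xv) & (y == yv))
            if p_xy > 0:
                h -= p_xy * np.log2(p_xy / p_y)   # -sum p(x,y) log p(x|y)
    return h

# Example: two 64x64 bitplanes that differ in roughly 10% of positions.
rng = np.random.default_rng(0)
y_plane = rng.integers(0, 2, size=64 * 64)
x_plane = y_plane ^ (rng.random(64 * 64) < 0.1)
print(conditional_entropy(x_plane, y_plane))   # close to h2(0.1), about 0.47
```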
The encoder can identify the value of r_i by following the feedback approach, which is considered ideal in the sense that it will eventually lead to successful decoding of all encoded WZ frames while requiring the minimal number of parity bits. The estimation-based approach we are proposing aims to avoid the feedback messages and incremental transmissions that could exhaust the limited resources of visual sensor networks. At the same time, it aims to use source rates that are as close as possible to the ideal rates attained when feedback messages are exchanged. Our solution relies on estimating the ideal rate r_i given the value of the conditional entropy h_i. Efficient estimation is achievable if there is a high statistical dependency between the two variables (r_i and h_i). To examine the degree of this dependency, we conduct the following experiment.

Feedback scenario simulation experiment: We use our DVC codec described in Section 2.3 to encode a video sequence. For each pair of a key frame and its following WZ frame, we find the conditional entropy for the i-th bitplanes, h_i = H(X_i | Y_i). Then, starting from the lowest source rate, we encode X_i and attempt to decode it at that rate. If the decoding fails, we move to the next higher source rate. We keep increasing the source rate until X_i is successfully decoded. At that point we record the source rate that led to successful decoding against the conditional entropy that we already have, i.e., we record the pair (h_i, r_i). Note that this simulates what happens in the feedback scenario. We repeat this experiment for all the bitplanes across all frames in the video sequence. When the encoder finishes encoding the whole video sequence, we will have an (h_i, r_i) pair related to each WZ frame.

Figure 3.1 shows the values of h_i and r_i for the first 50 WZ frames of the Foreman video sequence (QCIF resolution). Both values are evaluated for i = 7 (the seventh bitplane). Note the resemblance in shape between the two curves. This suggests that the two values are correlated.

Figure 3.1: Conditional entropy and source rate values over the first 50 WZ frames in the Foreman video sequence, seventh bit-plane (rate versus frame number; curves: H(.|.) and source rate).

To have a more concrete measure of statistical dependency, we treat h_i and r_i for the i-th bitplane as two random variables, and we compute the correlation coefficient [41]:

\rho_{h_i, r_i} = \frac{\mathrm{COV}(h_i, r_i)}{\sigma_{h_i} \sigma_{r_i}} = \frac{E[h_i r_i] - E[h_i] E[r_i]}{\sigma_{h_i} \sigma_{r_i}}    (3.1)

Table 3.1 lists the correlation coefficient for different bitplanes in different video sequences (QCIF resolution Hall Monitor and Foreman). Note that the correlation coefficient is generally close to 1, which reflects the high correlation between the two values.

Bit-plane   Foreman   Hall Monitor
1           0.9871    0.9392
2           0.9948    0.9186
3           0.9885    0.8957
4           0.9635    0.8109
5           0.9118    0.925
6           0.9403    0.8284
7           0.97      0.952
8           0.9952    0.8246

Table 3.1: Correlation coefficients of different bit-planes in the QCIF resolution Foreman and Hall Monitor video sequences.

3.3. Source Rate Estimation and Performance

In this section we utilize the statistical dependency between the conditional entropy h_i = H(X_i | Y_i) and the ideal source rate r_i to build estimators that map h_i to an estimate \hat{r}_i. We build two types of estimators:

(1) Maximum a Posteriori (MAP) estimator:

\hat{r}_i = F(h_i) = \arg\max_{r_i} p(r_i | h_i)    (3.2)

where p(.|.) is the conditional probability mass function.

(2) Minimum Mean Squared Error (MMSE) estimator:

\hat{r}_i = F(h_i) = E[r_i | h_i]    (3.3)

where E[. | .] denotes the conditional expectation.

The probability mass functions needed for both estimators are built offline (in a training stage before the encoder starts encoding) from a large ensemble of (h_i, r_i) pairs generated by encoding a number of different video sequences. A source rate codebook is prepared and provided to the encoder. Each entry in the codebook has a conditional entropy value h_i and the corresponding estimated source rate \hat{r}_i. The codebook has entries for all values of i.
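As an illustration of how such a codebook might be built and consulted, the sketch below bins the conditional-entropy axis and stores, per bin, either the most frequent rate (MAP-style) or the mean rate (MMSE-style). The bin count, the rounding, and the lookup fallback are assumptions made for the example, not parameters taken from our codec.

```python
import numpy as np
from collections import defaultdict

def build_codebook(pairs, n_bins=32, mode="map"):
    """Build a source-rate codebook from training (h_i, r_i) pairs (a sketch)."""
    h = np.array([p[0] for p in pairs])
    r = np.array([p[1] for p in pairs])
    edges = np.linspace(h.min(), h.max(), n_bins + 1)
    cells = defaultdict(list)
    for cell, rate in zip(np.digitize(h, edges[1:-1]), r):
        cells[cell].append(rate)
    codebook = {}
    for cell, rates in cells.items():
        if mode == "map":
            vals, counts = np.unique(np.round(rates, 3), return_counts=True)
            codebook[cell] = float(vals[np.argmax(counts)])   # arg max p(r | h)
        else:
            codebook[cell] = float(np.mean(rates))            # E[r | h]
    return edges, codebook

def estimate_rate(h_value, edges, codebook):
    """Look up the estimated source rate for an observed conditional entropy."""
    cell = int(np.digitize(h_value, edges[1:-1]))
    return codebook.get(cell)   # a real codec would need a fallback for unseen cells
```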
To evaluate the performance of both estimators, we use our encoder to encode different video sequences. For a pair of a key frame and its following WZ frame in a video sequence, the conditional entropy h_i = H(X_i | Y_i) of the i-th bitplane is computed. The resulting h_i value is used to look up the codebook. When a matching h_i value is found in the codebook, the corresponding \hat{r}_i value is extracted and employed in encoding/decoding the bitplane. This is done for all values of i, i.e., for all the bitplanes that pertain to the WZ frame. When the bitplanes of this WZ frame are fully encoded, the number of bytes that resulted from compressing this frame is computed. On the other hand, when all the bitplanes that pertain to the WZ frame are fully decoded, they are merged into one frame, and the quality of that frame is evaluated using the Peak Signal to Noise Ratio (PSNR) measure. Finally, the total number of bytes that resulted from encoding all WZ frames in the video sequence, as well as the average PSNR of those frames, are computed and plotted.

Figure 3.2: Performance using different methods to identify source rates - QCIF Foreman video sequence (PSNR in dB versus number of bytes; curves: Ideal, MAP, MMSE).

Figure 3.3: Performance using different methods to identify source rates - QCIF Salesman video sequence (PSNR in dB versus number of bytes; curves: Ideal, MAP, MMSE).

Figure 3.2 and Figure 3.3 plot the average PSNR vs. the total number of bytes for two different video sequences (QCIF resolution Foreman and Salesman). For each video sequence we plot three curves: one for the feedback scenario where the minimal source rate r_i is reached in an incremental fashion, and two curves related to the MAP and MMSE estimators. Each point in the figures corresponds to a different quantization level, hence a different number of bitplanes per frame. Note that for the Foreman sequence the MAP curve is close to the feedback scenario curve. For the Salesman sequence, both MAP and MMSE curves are close to the feedback scenario curve. This means that estimating source rates can achieve performance (in terms of compression and video quality) close to that of the feedback scenario. Further, we observe that, generally, for the two sequences, the MAP estimator outperforms its MMSE counterpart.

3.4. Overall Energy Consumption Performance

In this section we compare the performance (in terms of overall energy consumption) of the feedback and estimation-based approaches when they are employed in a real visual sensor application. We implement and deploy the estimation-based DVC codec described in Section 2.3 on a real visual sensor platform that consists of the imote2 mote attached to the IMB400 multimedia board [18]. This platform is managed by the TinyOS operating system [28]. Further, we measured the energy consumption of the radio and of the different computational modules in the imote2/IMB400 codec (DCT transform, parity-check matrix multiplication, etc.). In this section, we use our codec and the measured energy values (Section 2.4) to provide a comparison between the feedback and estimation-based approaches. We can group the differences between the two approaches into two categories:

1. Computation: Both approaches have exactly the same computational modules (DCT transform, parity-check matrix multiplication, etc.) with one exception; the estimation-based approach has one extra processing stage, which is the conditional entropy computation. It is clear that the main part of this operation is computing the Hamming distance, which involves a set of XOR and addition operations.
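As a rough illustration of this step (a simplified sketch, not the imote2 implementation), the Hamming distance between two bitplanes packed into 64-bit words can be computed with word-wise XOR followed by a population count; processing 64 bit positions per operation mirrors the parallelism that the SIMD support discussed next provides in hardware.

```python
def hamming_distance(plane_a, plane_b):
    """plane_a, plane_b: equal-length lists of 64-bit integers (packed bitplanes)."""
    dist = 0
    for wa, wb in zip(plane_a, plane_b):
        dist += bin(wa ^ wb).count("1")   # popcount of the differing bit positions
    return dist

# Example: two 4096-bit (64x64) bitplanes packed into 64 words each.
import random
random.seed(1)
a = [random.getrandbits(64) for _ in range(64)]
b = [w ^ (random.getrandbits(64) & 0xFF) for w in a]   # flip a few low bits per word
print(hamming_distance(a, b))
```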
There are two features that make computing the Hamming distance on the imote2/IMB400 affordable: (1) XOR and addition operations are considered cheap in terms of energy consumption [33] compared to other types of operations (multiplication, for instance). (2) Moreover, the cost of the Hamming distance computation is reduced by using the iwmmx [42] technology supported by the imote2. This technology uses the SIMD (Single Instruction Multiple Data) processing paradigm, which accelerates multimedia processing. Using SIMD, our codec performs 64 XOR operations in parallel. This shortens the execution time and hence the energy consumption (energy is a function of execution time and power level). These two features motivate using the conditional entropy as a basis to estimate source rates.

2. Transmission: There are three main differences between the two approaches in terms of transmission: a) The ideal feedback approach results in more compression (Figure 3.2 and Figure 3.3) and hence fewer bits to transmit. b) The ideal feedback approach suffers the cost of exchanging feedback messages. c) On many occasions, the incremental transmission nature of the feedback approach results in transmitting more overhead data (packet headers) compared with the estimation-based approach. As the feedback approach transmits compressed data (parity bits) in increments, the packet payload size is constrained to a value equal to the increment size. This may lead to transmitting the compressed data of a certain frame through a large number of packets, and since each packet comprises a control header, this results in transmitting a large amount of control data. On the other hand, the packet payload size in the estimation-based approach does not adhere to such a constraint. Therefore, bigger payload sizes can be used, and hence the compressed data can be communicated via a smaller number of packets, which means less control data.

To further clarify this point, we use the following incremental transmission model: let the increment size \Delta_i be the amount of compressed data (parity bits) belonging to the i-th bitplane that the encoder transmits in each transmission before waiting for a feedback. Consequently, in each transmission, the total amount of compressed data that the encoder transmits is

\Delta = \sum_{i=1}^{q} \Delta_i

where q is the number of bitplanes that have not been completely communicated to the decoder yet. When the decoder acknowledges successful decoding of bitplane i, no more parity bits are needed (by the decoder) for that bitplane; hence the amount of transmitted data in the next transmission will decrease by \Delta_i, and q will decrease by 1. Regarding feedback messages, the payload size is 1 byte. The i-th bit in this byte corresponds to the i-th bitplane, and the decoder sets this bit to one to acknowledge successful decoding of the corresponding bitplane, or to zero to request an increment of size \Delta_i in the next transmission. Note that the maximum packet payload size in each transmission is \Delta. In practice, control data overhead and feedback messages can severely degrade the performance of the feedback approach. The following example highlights this point.

Example: The imote2 uses the CC2420 radio transceiver [21]. In TinyOS, each CC2420 packet has a header of 11 bytes. Moreover, our imote2/IMB400 codec employs an LDPCA code that supports 512 different rates to encode the 64×64 bitplanes.
The minimum source rate is 1/512 and the maximum rate is 512/512, with an increment of 1/512. Note that the increment size \Delta_i = (1/512) × 4096 = 8 bits, i.e., 1 byte (the same for all i); hence \Delta = 8 bytes. This means that, as long as the decoder has not acknowledged successful decoding of any bitplane, the encoder can send a packet with a maximum payload size of 8 bytes in each transmission.

Figure 3.4: Incremental transmission example of three bitplanes (bp1, bp2, and bp3 require 197, 124, and 32 more increments; the transmission proceeds as 32 packets with 3-byte payloads, 92 packets with 2-byte payloads, and 73 packets with 1-byte payloads, with 197 feedback messages in total).

Figure 3.4 shows the transmission status after bitplanes 4, 5, 6, 7, and 8 were acknowledged as decoded. Currently q = 3 (3 bitplanes are not acknowledged yet). As the figure shows, at this point, bitplanes 1, 2, and 3 require 197, 124, and 32 more increments respectively to get decoded (the encoder does not know that; it keeps sending increments until it gets back an acknowledgement). The payload in the next 32 packets will hold 3 bytes; each byte corresponds to one of the remaining bitplanes. After the encoder receives an acknowledgment that bitplane 3 is successfully decoded, bitplanes 1 and 2 will need 165 and 92 more increments respectively. Consequently, the next 92 packets will hold a payload of 2 bytes corresponding to bitplanes 1 and 2. After the encoder receives a feedback message indicating that bitplane 2 was successfully decoded, the next 73 packets will hold only one byte corresponding to bitplane 1. This procedure results in transmitting 197 packets with payload sizes that range from 1 to 3 bytes, in addition to 197 feedback messages. Note that, although the size of the compressed data in this example is 353 bytes, a total of 2167 control bytes (197 headers) were transmitted. This is an extremely large number compared with the compressed data size.

In the example above, if the source rates were known prior to transmission, i.e., if the encoder already knew that 197, 124, and 32 bytes would be sufficient to decode bitplanes 1, 2, and 3 respectively, then the 353 bytes of compressed data could be communicated to the decoder in a smaller number of packets, as there would be no constraint on the payload size. For example, if we choose a payload size of 40 bytes, then we need only 9 packets to deliver the compressed data, and hence only 99 control bytes are transmitted. Since knowing the ideal rates prior to transmission is not realistic, the proposed estimation-based approach provides the encoder with a mechanism to estimate those rates prior to transmission, and hence eliminate the cost of overhead data.
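The packet and overhead counts in this example follow directly from the increment model; the short sketch below reproduces them (the 11-byte header and 40-byte payload are the values used in the example).

```python
import math

header = 11                       # TinyOS/CC2420 packet header, bytes
needed = [197, 124, 32]           # increments (1 byte each) still required per bitplane

# Feedback approach: each transmission carries one 1-byte increment for every
# bitplane that has not been acknowledged yet.
feedback_packets = max(needed)                    # 197 transmissions
payload_bytes = sum(needed)                       # 353 bytes of compressed data
feedback_control = feedback_packets * header      # 2167 bytes of headers
feedback_msgs = feedback_packets                  # one feedback per transmission

# Estimation approach: rates known up front, so a larger payload can be used.
payload_size = 40
est_packets = math.ceil(payload_bytes / payload_size)   # 9 packets
est_control = est_packets * header                      # 99 bytes of headers

print(feedback_packets, payload_bytes, feedback_control, est_packets, est_control)
# -> 197 353 2167 9 99
```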
Despite this advantage, the estimation-based approach suffers an extra computation cost (the conditional entropy computation), and it achieves less compression than the feedback approach (a larger volume of compressed data). Therefore, we need to compare the two approaches in terms of the overall energy consumption (computation and transmission). To do that, we use our estimation-based imote2/IMB400 codec to encode the Foreman and the Salesman video sequences. We choose the packet payload size to be 40 bytes (recall that the header size in TinyOS is 11 bytes), and a MAP estimator to estimate rates. Further, we simulate the feedback approach to compress the same video sequences in the manner described in Section 3.3 (but now the video sequences are down-sampled to a 64×64 resolution to be consistent with the imote2/IMB400 decoder). The LDPCA code used here is the same as in the previous example (increments of 1/512). For energy measurement, we use the following values [16]: (1) the cost of transmitting/receiving 1 byte is 66.67 µJ, and (2) the cost of the conditional entropy computation is 2.7681 mJ per frame (8 bitplanes).

Figure 3.5 and Figure 3.6 demonstrate the performance of each approach in terms of energy consumption and video quality at different quantization levels. In these figures, the energy consumption of the computation modules that are common between the two scenarios (such as DCT and parity-check matrix multiplication) is not considered. Note that there are two curves related to the feedback approach; one curve (feedback1 in the figures) corresponds to a scenario where a feedback message is sent back to the encoder after each transmission. The other curve (feedback2 in the figures) assumes that the decoder sends a feedback only to acknowledge successful decoding of a bitplane. Hence, after each transmission, the encoder waits for a period of time for an acknowledgement. If no acknowledgement is received, the encoder assumes that the decoder failed to decode all bitplanes, and it transmits a new increment. Note that although this feedback scheme results in less frequent feedbacks, the estimation-based approach demonstrates lower energy consumption for a wide range of video quality levels.

Figure 3.5: Performance over a real visual platform - Foreman video sequence (PSNR in dB versus energy in J; curves: feedback1, feedback2, estimation).

Figure 3.6: Performance over a real visual platform - Salesman video sequence (PSNR in dB versus energy in J; curves: feedback1, feedback2, estimation).

3.5. Discussion

In this chapter we showed that the statistical dependency between the conditional entropy (the entropy of a WZ source conditioned on a side information source) and the ideal source rate for encoding a WZ frame (the minimal rate that leads to successful decoding) can be utilized to estimate source rates, and hence avoid exchanging feedback messages with the decoder. We showed that the estimation-based approach results in a performance that is close to that of the feedback approach in terms of compression and video quality. Further, we showed that the estimation-based approach can demonstrate lower energy consumption for a range of video quality levels compared with the feedback approach when both are deployed over a real visual sensor platform (the imote2/IMB400).

Chapter 4

4. Polar Codes for Low Resolution Distributed Video Coding

Polar codes are the first codes proven to achieve capacity while having low encoding and decoding complexity. As the development of polar codes is relatively recent, there has been little work to assess the potential of polar codes for practical applications. In this chapter, we employ and evaluate the performance of polar codes in Distributed Video Coding (DVC). Further, we compare the performance of polar codes with the more established and more investigated Low Density Parity Check Accumulate (LDPCA) codes in the context of DVC. Our study shows: (a) polar and LDPCA codes exhibit almost similar performance for DVC coding of large image blocks; (b) polar codes offer a clear advantage when coding smaller size image blocks. Consequently, polar codes could represent a viable solution for distributed sensor networks that capture a low-resolution video signal at each visual sensor node.
4.1. Introduction

Similar to previous chapters, the DVC that we investigate in this chapter is based on asymmetric Distributed Source Coding (DSC), which is a special case of the broader DSC problem [1]. Let X and Y be two sources that are statistically dependent. Y, referred to as a side information (or key) source, is intra-coded and communicated to the decoder at a rate R_Y ≥ H(Y). By utilizing the statistical dependency between the two sources at the decoder side, it is shown in [1] that X, referred to as a Wyner-Ziv (WZ) frame, can be compressed with no loss, and with no access to Y, at a rate R_X ≥ H(X | Y). Interestingly, the aggregate rate R_X + R_Y can achieve the joint entropy H(X, Y). Hence, DVC can achieve the same rate as the computationally demanding joint encoding (which requires access to both X and Y) by performing low-complexity independent encoding (with no access to Y). Due to the low-complexity encoding, DVC is considered suitable for visual sensor networking applications [43], where visual sensors are limited in their resources (computational power, storage, etc.).

Channel codes are heavily employed in DVC. The reason is that the disparity between the two sources, X and Y, can be viewed as noise. In other words, X can be viewed as a noisy version of Y. Consequently, if Y is already communicated to the decoder, then the encoder can communicate X by transmitting a number of parity bits sufficient to correct the noise in Y. Compression is achieved by transmitting the parity bits. The level of compression of DVC depends primarily on the underlying channel code. Therefore, it is common to employ capacity-approaching codes. Examples of DVC schemes that employ such codes can be found in [44], where Turbo codes are used, and in [10], where LDPC codes are used. Although these codes and others are known to be capacity-approaching, there are no formal proofs of their optimality.

Polar codes, recently introduced by Arikan [7], are the first channel codes proven to achieve capacity for many types of channels. Meanwhile, they are practical in terms of their encoding/decoding complexity. Due to these features, polar codes can be utilized to achieve optimal performance in many problems in the field of information theory. For example, in [45], polar codes were constructed to achieve optimal performance in lossy and lossless source compression, in addition to the Wyner-Ziv and Gelfand-Pinsker coding problems.

Both polar and LDPC codes have attributes that qualify them to be employed in DVC over resource-limited visual sensor networks (which is one of the main applications of DVC). For example, (1) the parity check matrix of LDPC codes and the generator matrix of polar codes are sparse, which allows for compact storage. (2) Both codes have low encoding complexity (O(N) for LDPC codes and O(N log N) for polar codes, where N is the code block length). (3) Both codes can operate in a rate-adaptive mode where the encoder transmits compressed data (parity bits) in increments until the decoder signals successful decoding. This is important in situations where the encoder lacks knowledge about the disparity (correlation channel) between the WZ and the key sources.

It is well known that polar and LDPC codes are powerful under asymptotic conditions, i.e., as N (the code block length) goes to infinity. However, in real visual sensor applications, codes are usually applied on short blocks, since many visual sensors are endowed with low-resolution imagers.
For example, Cyclops [17] captures 128×128 pixel images, and both WiSN [46] and MeshEye [47] support capturing images with 30×30 pixel resolution. In this chapter we demonstrate the practical utility of polar codes in the context of low resolution (and hence short block length) rate-adaptive DVC. More specifically, we demonstrate the performance advantages of polar codes over the more investigated LDPCA codes [36] (rate-adaptive LDPC codes designed for DSC) in the context of low resolution DVC. To that end, we implement a DVC codec that employs both codes. We apply the codec to a set of video sequences with different (low) resolutions. To our knowledge, this is the first work that incorporates polar codes in a real DVC application (as opposed to LDPCA codes).

This chapter is organized as follows. In Section 4.2, we provide a brief description of polar codes and the DSC scheme that we employ in our DVC codec. In Section 4.3, we describe our rate-adaptive DVC codec and then present our experimental results. In Section 4.4, we outline the main conclusions.

4.2. Rate-adaptive Polar Codes for DSC

In this section, we describe in detail a rate-adaptive DSC scheme that incorporates polar codes. In the next section we use this scheme to implement DVC over low resolution images. For a detailed description of LDPCA codes refer to APPENDIX A, and for a detailed description of our LDPCA DSC scheme refer to Chapter 3.

4.2.1. Polar Codes

Let X_0^{N-1} denote the random vector [X_0, ..., X_{N-1}]. If F is a subset of {0, ..., N-1}, then F^c is the complement of F. (X_0^{N-1})_F denotes the elements of X_0^{N-1} indexed by the members of F. Ber(p) denotes a Bernoulli random variable with parameter p, and h_2(p) is the corresponding binary entropy.

Let W be a Binary-input Discrete Memoryless Channel (B-DMC) with transition probability W(y | u). In this chapter, we are interested in the case where {0, 1} is the alphabet for both the input and the output of W. Let W^N be a compound channel that consists of N = 2^n independent copies of W. The transition probability of W^N is:

W^N(y_0^{N-1} | u_0^{N-1}) = \prod_{i=0}^{N-1} W(y_i | u_i)    (4.1)

Polar coding places a pre-processing stage before transmitting u_0^{N-1} over W^N:

x_0^{N-1} = u_0^{N-1} G_2^{\otimes n}    (4.2)

where G_2 is the polarization matrix defined as

G_2 = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}

and \otimes n denotes the n-th Kronecker power. This pre-processing operation results in a new compound channel W_N that consists of N dependent channels. The transition probability of the i-th channel is:

W_N^{(i)}(y_0^{N-1}, u_0^{i-1} | u_i) = \frac{1}{2^{N-1}} \sum_{u_{i+1}^{N-1}} W_N(y_0^{N-1} | u_0^{N-1})    (4.3)

where

W_N(y_0^{N-1} | u_0^{N-1}) = W^N(y_0^{N-1} | u_0^{N-1} G_2^{\otimes n})    (4.4)

The Successive Cancellation (SC) algorithm is used for decoding polar codes, where Equation 4.3 is applied iteratively. At the i-th iteration, the estimate of u_0^{i-1} obtained over the earlier i-1 iterations is used to estimate u_i. As shown in [7], asymptotically (as N goes to infinity), a group of the dependent channels W_N^{(i)} becomes completely reliable, i.e., the input of the channel can be estimated with a small probability of error. The fraction of reliable channels converges to the capacity of W. On the other hand, the remaining channels become completely unreliable. This phenomenon is known as channel polarization.

Polar coding: Let A and A^c be the sets of indices that correspond to the reliable and unreliable channels respectively. We will refer to A as the information set, and to A^c as the frozen set.
To map a data vector d into a codeword x_0^{N-1}, we set u_0^{N-1} as follows: (u_0^{N-1})_A = d, and (u_0^{N-1})_{A^c} = f, where f is a vector of bits that is fixed across all codewords. Next, u_0^{N-1} is multiplied by G_2^{\otimes n}, which results in the codeword x_0^{N-1}. f and A^c are parameters of the code that are known to both the encoder and the decoder.

4.2.2. DSC with Polar Codes

The basic idea of employing polar codes in DVC is described in [48]. It is based on the fact that X (the WZ frame) can be regarded as a noisy version of Y (the key frame), i.e., X = Y + Z, where Z is a Ber(p) noise sequence (we consider the correlation channel as a Binary Symmetric Channel with parameter p, BSC(p)). Y is assumed to be available at the decoder side. Our goal is to communicate X at a rate close to the theoretical bound. In other words, we aim to encode Z. At the encoder side, X is multiplied by G_2^{\otimes n}:

X_0^{N-1} G_2^{\otimes n} = (Y_0^{N-1} + Z_0^{N-1}) G_2^{\otimes n} = Y_0^{N-1} G_2^{\otimes n} + Z_0^{N-1} G_2^{\otimes n}    (4.5)

Note that Y_0^{N-1} G_2^{\otimes n} in Equation 4.5 is already available at the decoder. Therefore, the encoder needs to communicate Z_0^{N-1} G_2^{\otimes n} to enable the decoder to reconstruct X_0^{N-1} G_2^{\otimes n} according to (4.5). Let U_0^{N-1} = Z_0^{N-1} G_2^{\otimes n}. Note that U_0^{N-1} is the polarized noise sequence. The entropy of U_0^{N-1} is:

H(U_0^{N-1}) = N h_2(p) = \sum_{i=0}^{N-1} H(U_i | U_0^{i-1})    (4.6)

In the context of source coding, polarizing a bit sequence results in two groups of bits: (1) a group of reliable bits, indexed by the information set, where the conditional entropy in (4.6) is close to 0; this means that given U_0^{i-1}, U_i can be estimated reliably. (2) A group of unreliable bits, indexed by the frozen set, where the conditional entropy is close to 1. Let A and A^c denote the information and frozen sets respectively. It is clear from (4.6) that the fraction of frozen bits, (U_0^{N-1})_{A^c}, is h_2(p). Therefore, if the encoder transmits the frozen bits (U_0^{N-1})_{A^c}, the decoder will be able to estimate (U_0^{N-1})_A, and the source rate will be close to the theoretical bound h_2(p).

Transmitting (Z_0^{N-1} G_2^{\otimes n})_{A^c} to the decoder requires separating Z from Y at the encoder side, which, in practice, is not possible. Therefore, the encoder transmits (X_0^{N-1} G_2^{\otimes n})_{A^c} = (Y_0^{N-1} G_2^{\otimes n} + Z_0^{N-1} G_2^{\otimes n})_{A^c}. As the decoder knows A^c and already has Y_0^{N-1} G_2^{\otimes n}, it simply extracts (Z_0^{N-1} G_2^{\otimes n})_{A^c} as follows:

(Y_0^{N-1} G_2^{\otimes n} + Z_0^{N-1} G_2^{\otimes n})_{A^c} + (Y_0^{N-1} G_2^{\otimes n})_{A^c} = (Z_0^{N-1} G_2^{\otimes n})_{A^c}    (4.7)

Once the decoder obtains (Z_0^{N-1} G_2^{\otimes n})_{A^c}, it uses it to estimate (Z_0^{N-1} G_2^{\otimes n})_A using Equation 4.3 after setting the output of the channel to a vector of zeros:

W_N^{(i)}(u_0^{i-1} | u_i) = W_N^{(i)}(y_0^{N-1} = 0, u_0^{i-1} | u_i)    (4.8)

Next, the decoder combines the information and frozen bits into one vector Z_0^{N-1} G_2^{\otimes n}. Finally, the polarization is inverted:

Z_0^{N-1} = (Z_0^{N-1} G_2^{\otimes n}) G_2^{\otimes n}    (4.9)

and this sequence is used to recover X:

X_0^{N-1} = Y_0^{N-1} + Z_0^{N-1}    (4.10)
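The following sketch illustrates the encoder/decoder steps of Equations 4.2, 4.5, and 4.7 for a small block. It builds G_2^{⊗n} explicitly and uses a toy frozen set chosen only for illustration; the SC estimation step of Equation 4.8 is not implemented here.

```python
import numpy as np

G2 = np.array([[1, 0],
               [1, 1]], dtype=np.uint8)

def polar_generator(n):
    """G_2^{(kron) n}, the N x N polarization matrix with N = 2**n."""
    G = np.array([[1]], dtype=np.uint8)
    for _ in range(n):
        G = np.kron(G2, G)
    return G

n, N, p = 3, 8, 0.1
G = polar_generator(n)
frozen = np.array([0, 1, 2, 4])            # toy frozen set A^c (illustrative only)

rng = np.random.default_rng(0)
Y = rng.integers(0, 2, size=N).astype(np.uint8)        # key bitplane (decoder has it)
Z = (rng.random(N) < p).astype(np.uint8)               # BSC(p) correlation noise
X = (Y + Z) % 2                                        # WZ bitplane: X = Y + Z over GF(2)

# Encoder: polarize X and transmit only the frozen positions (Eq. 4.5).
syndrome = (X @ G % 2)[frozen]

# Decoder: it knows Y, so it removes Y's contribution to obtain the frozen part
# of the polarized noise (Eq. 4.7); SC decoding (Eq. 4.8) would estimate the rest.
frozen_noise = (syndrome + (Y @ G % 2)[frozen]) % 2
assert np.array_equal(frozen_noise, (Z @ G % 2)[frozen])
print(frozen_noise)
```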
Frozen and information set construction: As mentioned earlier, we consider the correlation channel to be a BSC(p). For a certain value of p, A and A^c are constructed in an offline training stage as follows. A sequence Z of length N is generated according to a Ber(p) random variable. Z is then polarized: U_0^{N-1} = Z_0^{N-1} G_2^{\otimes n}. The SC algorithm uses (4.8) to generate an estimate \hat{U}_i of the polarized bits for i = 0, ..., N-1. This is repeated a large number of times in order to construct a joint probability density function (pdf) of U_i and \hat{U}_i. This pdf is used to compute the mutual information I(U_i; \hat{U}_i). Values of i where the mutual information is close to 1 are assigned to A, whereas values of i with mutual information close to 0 are assigned to A^c.

Rate adaptation: In some situations where the encoder is unaware of the correlation channel parameter p, it is useful to transmit frozen bits in increments, starting from a small number of bits, until the decoder signals successful decoding. The worst case would be transmitting the whole polarized WZ bit sequence X_0^{N-1} G_2^{\otimes n}; in that case, all bits would be treated as frozen bits. Therefore, to achieve rate adaptation, we treat the N bits of X_0^{N-1} G_2^{\otimes n} as frozen bits. Meanwhile, we rank those frozen bits to indicate which ones should be transmitted first. To achieve that, we conduct the training process described earlier a number of times, starting with a small value of p and gradually increasing p until it reaches its maximum value (0.5). Note that each training run results in different A and A^c. As p gets larger, some indices that were in A in the previous training run move to A^c in the current training run (as the capacity of the channel decreases). Indices that move in earlier training runs (runs with smaller p) are considered less reliable, and they are transmitted in earlier increments. This way we are able to rank the N bits in terms of their transmission precedence.

4.3. Experimental Results

4.3.1. DVC Codec

The following is a brief description of the DVC codec that we use in our experiments. Frames are processed in pairs. In each pair, the first frame is a key frame, and the second frame is a WZ frame. A key frame Y is intra-coded and communicated to the decoder. A WZ frame X is transformed using a 4×4 Discrete Cosine Transform (DCT). A uniform quantizer of M = 2^b intervals is used to quantize the DCT coefficients. Quantized coefficients are then represented via Gray codes (as opposed to the regular binary representation). Coefficients of the same frequency are grouped together to form frequency bands. Each band is processed separately as follows. First, it is decomposed into b bitplanes. Next, polar codes are used to encode each bitplane as described in Section 4.2. The encoder transmits bits incrementally, starting with the bits that are ranked as least reliable. The corresponding bitplane in Y is used to extract and decode the polarized noise sequence. We log the minimum source rate that results in error-free decoding. To compare with LDPCA, we repeat the same experiments using our DVC codec modified to use LDPCA codes.

4.3.2. Results

Figure 4.1 demonstrates the performance of the polar and LDPCA codes when applied to the Akiyo video sequence at different resolutions. Each point in the figure corresponds to a compression ratio and the corresponding Peak Signal to Noise Ratio (PSNR). Different points correspond to different DCT quantization levels. The compression ratio is defined as the total number of bytes after compression divided by the total number of bytes before compression. We use two different LDPCA codes whose performances were reported in [36]. The first one is regular, with degree distribution given by λ3 = 1, where λr is the proportion of variable nodes with degree r. The second is irregular, with degree distribution (λ2 = 0.316, λ3 = 0.415, λ7 = 0.128, λ8 = 0.069, λ19 = 0.02, λ21 = 0.052).
Figure 4.1 (a) shows the performance for the 128×128 resolution case (the block length equals 1024, as each of the 16 DCT coefficient bands is encoded separately). The figure shows that, generally, the three codes achieve almost the same compression performance. Irregular LDPCA slightly outperforms the regular LDPCA and polar codes. Figure 4.1 (b) shows the performance for the 64×64 case (block length 256). Interestingly, at this block length, polar codes achieve more compression compared to LDPCA codes. As the resolution goes down to 32×32 (Figure 4.1 (c), where the block length is 64), not only do polar codes outperform both LDPCA codes, but the gap between the performance curves also becomes larger. This suggests that as the block length goes down, polar codes achieve more compression compared to LDPCA codes. To support this claim, we repeat our experiment using more video sequences: the Hall-monitor, Salesman, Foreman, and Carphone video sequences. The results for these video sequences are shown in Figure 4.2, Figure 4.3, Figure 4.4, and Figure 4.5. Looking at the figures, we can notice a trend: for the 128×128 case, polar and LDPCA codes behave closely. As the resolution goes down, the performance gap becomes larger, with polar codes achieving more compression. This trend can be seen clearly in Figure 4.6, which averages the results over the different video sequences.

Figure 4.1: Performance of DVC using polar, regular and irregular LDPCA codes applied on (a) 128×128 Akiyo, (b) 64×64 Akiyo, and (c) 32×32 Akiyo (PSNR in dB versus compression ratio).

Figure 4.2: Performance of DVC using polar, regular and irregular LDPCA codes applied on (a) 128×128 Hall-monitor, (b) 64×64 Hall-monitor, and (c) 32×32 Hall-monitor (PSNR in dB versus compression ratio).

Figure 4.3: Performance of DVC using polar, regular and irregular LDPCA codes applied on (a) 128×128 Salesman, (b) 64×64 Salesman, and (c) 32×32 Salesman (PSNR in dB versus compression ratio).

Figure 4.4: Performance of DVC using polar, regular and irregular LDPCA codes applied on (a) 128×128 Foreman, (b) 64×64 Foreman, and (c) 32×32 Foreman (PSNR in dB versus compression ratio).
Figure 4.5: Performance of DVC using polar, regular and irregular LDPCA codes applied on (a) 128×128 Carphone, (b) 64×64 Carphone, and (c) 32×32 Carphone (PSNR in dB versus compression ratio).

Figure 4.6: Performance averaged over the five video sequences with (a) 128×128, (b) 64×64 and (c) 32×32 resolutions (PSNR in dB versus compression ratio).

4.4. Discussion

In this chapter we built a rate-adaptive DVC codec that incorporates polar and LDPCA codes. We applied our codec to a set of video sequences with different (low) resolutions. We showed that polar and LDPCA codes exhibit almost similar compression performance for DVC coding with large blocks. We also showed that polar codes offer a clear advantage when coding smaller size image blocks.

Chapter 5

5. HopCaster: A Network Coding-Based Hop-by-Hop Reliable Multicast Protocol

Intra-flow Network Coding (NC) is an emerging technique with the potential to significantly improve the performance of multicasting over Wireless Mesh Networks (WMNs). It achieves that by allowing intermediate forwarding nodes (FNs) to utilize overhearing and coding to reduce the total number of transmissions required for end-to-end data delivery. The existing intra-flow NC-based multicast protocols are all based on the conventional end-to-end transport principle. A major downside of such a principle is that intermediate FNs are not able to accurately determine the minimum number of coded packets they should transmit in order to ensure successful data delivery to the destinations; hence, redundant packets can be injected into the network, leading to significant performance degradation. Furthermore, existing protocols do not achieve fairness between multicast receivers with good network connectivity and those with bad connectivity (heterogeneous receivers).

We argue that a receiver-driven hop-by-hop transport approach demonstrates higher performance compared to the end-to-end scheme when incorporated in intra-flow NC-based multicasting. The hop-by-hop transport approach allows exploiting not only computation power (in performing network coding) but also in-network storage (in doing hop-by-hop caching) at the intermediate FNs to achieve performance gains. In this chapter we propose HopCaster, a novel protocol that incorporates intra-flow NC with hop-by-hop transport to achieve high-throughput reliable multicasting and to solve receiver heterogeneity issues. It completely eliminates the need for estimating the number of coded packets to be transmitted by a FN. It also simplifies multicast management and congestion control. Moreover, HopCaster employs a cross-layer rate adaptation mechanism that optimizes the multicast data rate in hop-by-hop transport by exploiting the receiver population changes over time. We compare the hop-by-hop and end-to-end transport approaches.
Our evaluations show that, compared to Pacifier, a state-of-the-art end-to-end intra-flow NC-based multicasting protocol, HopCaster achieves up to 29% throughput improvement.

5.1. Introduction

Reliable multicast applications, such as file sharing, data casting, and software upgrades, are becoming increasingly common. These applications not only have a strict reliability requirement of 100% packet delivery ratio (since any loss causes quality degradation) but also require high throughput (as fast download is always desirable from a user standpoint). Achieving (1) reliability and (2) high throughput in multicasting applications over multi-hop wireless mesh networks (WMNs) faces many challenges. One is imposed by the lossy and time-varying nature of wireless transmission environments, which necessitates employing adaptive mechanisms to ensure reliability and high throughput. Another main challenge, which is unique to multicasting, is the bandwidth heterogeneity amongst multicast receivers. Receivers with poor connectivity and low-throughput paths to the source may greatly degrade the performance of receivers with good connectivity conditions.

Traditional multicasting protocols that were devised to achieve reliability and high throughput are client-server based, where intermediate routers or forwarding nodes (FNs) simply duplicate and forward packets. Many of these protocols employ end-to-end forward error correction (FEC), automatic repeat request (ARQ), or hybrid FEC-ARQ techniques to achieve reliability in wireless networks [49] [50] [51] [52] [53]. In many cases the performance is limited by the multicast destination receivers with the worst path to the source.

Network coding (NC) is emerging as an innovative technique to improve reliability and throughput in WMNs. NC takes advantage of the broadcast nature of the wireless medium by allowing intermediate FNs to encode received and overheard data packets (instead of simply duplicating and forwarding packets) [4] [6] [54] [55] [56] [57] [58] [59]. NC can be classified as (1) inter-flow NC and (2) intra-flow NC. With inter-flow NC, an intermediate FN can encode/forward data packets belonging to different flows. A receiver decodes the packets to obtain the flow of data targeted to it using its knowledge of another flow. However, in order to obtain inter-flow coding opportunities, the encoded flows need to pass through a common FN with certain network topologies, for example, the "Alice-Bob structure" or the "X structure" [55] [56] [59]. Furthermore, inter-flow NC requires that the sending FN knows which data packets have been buffered or overheard by each of the intended receivers in order to determine how to encode the packets across the flows. This leads to increased control overhead.

More recently, intra-flow NC based on random linear block codes [6] [54] has attracted research attention due to its simplicity and efficiency. In intra-flow NC, a FN generates and forwards random linear combinations of received (and overheard) packets belonging to a flow over a certain Galois field. The random mixing at each FN ensures that if a group of FNs heard the same packet transmission, the coded packets that they generate and forward are, with high probability, linearly independent. This reduces duplicate packet transmissions over the shared wireless medium, making opportunistic overhearing and forwarding more effective in WMNs [57].
A node can receive, in addition to coded packets transmitted by its direct parent node, coded packets of the same flow transmitted by other neighbors, achieving significant performance gains. Compared to inter-flow NC, intra-flow NC has the advantages of low signaling overhead and ease of implementation with no topological requirements.

The use of intra-flow NC introduces new challenges in designing a practical multicasting protocol. (1) First, when a FN receives a coded packet, it has to generate (and then forward) a new coded packet by performing random mixing on the previously received and cached packets and the most recently received one. To combat the lossy nature of wireless channels, for one received packet, the coding and forwarding have to be repeated multiple times (each time with different random mixing coefficients) in order to ensure that at least one of the coded packets makes its way closer to the destination (the others could be lost due to lossy channels). The first challenge is to decide how many coded packets a FN should generate and send in response to receiving one coded packet. (2) Second, how should the bandwidth heterogeneity of channels from the source to different destinations be handled? (3) Third, at each hop, we have an upstream node that is sending data, and a number of downstream nodes receiving the data. The population of the downstream nodes changes over time in the sense that downstream nodes with good connectivity to the upstream sender or with more overhearing opportunities will finish receiving the data (a data file, for example) faster than their poorly connected counterparts. Thus, the downstream node population decreases over time (since some nodes are no longer interested in the data). One challenge is to enable an upstream sender to adapt its transmission data rate taking into consideration the current downstream node population.

Practical designs that employ intra-flow NC for reliable multicasting in WMNs have received relatively little attention. The state-of-the-art practical intra-flow NC-based multicast protocols, such as MORE [57] and Pacifier [58], all employ the conventional end-to-end transport approach, in which the source keeps sending coded packets of a data batch. Intermediate FNs encode overheard packets and then forward the coded packets toward the multicast destination receivers. The destinations send end-to-end acknowledgements (ACKs) after they collect enough coded packets to decode the original data batch (this is similar to the TCP end-to-end ACK strategy in unicast). There is no cooperation among intermediate FNs. The number of coded packets that a FN transmits upon receiving a packet from its upstream node, the so-called transmission credit (TC), is determined using heuristics based on measurements of the expected packet loss rates to the destinations. This approach suffers from several fundamental problems.

1. FNs may transmit many more coded packets than necessary, significantly wasting network resources and degrading performance, since it is extremely difficult to obtain an accurate estimate of the value of TC for each FN in a dynamic wireless environment.

2. While an end-to-end ACK is being propagated from a multicast receiver back to the source, the source will keep transmitting coded packets that belong to an already delivered data batch (it stops upon receiving the ACK). These redundant packets will trigger the sending of yet more coded packets by the FNs, leading to bandwidth waste and more interference.
This becomes even worse if the end-to-end ACK gets delayed in cases of congestion, or lost when link quality is bad.

3. The end-to-end approach used in the existing designs cannot completely solve the unfairness issue related to heterogeneous receivers.

4. The existing end-to-end multicast protocols do not adapt the data rate at each hop to account for varying receiver population and channel conditions. Selecting a robust low data rate to broadcast coded packets to downstream receivers does not necessarily lead to higher multicast throughput.

In this chapter, we argue that a receiver-driven hop-by-hop transport approach works better with intra-flow NC. Multiple receivers interested in a content file send requests towards the content source, and a multicast tree is built. The content file is divided into batches and originally distributed from the source. An intermediate router or FN on the tree overhears coded packets sent by any neighbor (parent, grandparent, sibling, and even child nodes) and caches them in a buffer. Once it collects enough packets to decode a data batch, the FN becomes a new source for the cached batch. It informs its parent of successful reception of the batch and requests the next batch. The acknowledgement and request messages are also used by FNs for flow control. A parent node stops transmitting from a data batch after receiving acknowledgements from all of its immediate children. This approach allows exploiting not only computing power (network coding) but also in-network storage (caching) at intermediate FNs, moving data as close as possible to the content requesters, i.e., the destinations.

Based on the above basic idea, we propose HopCaster, an intra-flow NC-based hop-by-hop reliable multicast protocol that achieves high throughput over WMNs and solves the network heterogeneity issue of multicast receivers. In contrast to the existing end-to-end intra-flow NC-based multicast protocols, HopCaster completely eliminates the need for estimating the transmission credit at FNs. It also simplifies multicast management and congestion control. Moreover, HopCaster employs a novel cross-layer rate adaptation mechanism to optimize the PHY data rate selection, considering the receiver population changes at different stages of multicasting. We compare the hop-by-hop and end-to-end transport approaches. Our evaluations for static and dynamic scenarios show that, compared to Pacifier, a state-of-the-art intra-flow NC-based multicast protocol, HopCaster achieves up to 29% throughput improvement.

The remainder of this chapter is organized as follows. Related work is reviewed in Section 5.2. In Section 5.3, we present the HopCaster design; we also discuss the intuition underlying our design approach and the synergy between hop-by-hop transport and intra-flow NC. In Section 5.4, we present simulation results comparing HopCaster with Pacifier, a state-of-the-art intra-flow NC-based multicast protocol that adopts an end-to-end transport scheme. The chapter is concluded in Section 5.5.

5.2. Related Work

The majority of the work addressing reliability in multicast protocols over wireless networks is based on the traditional FEC, ARQ, or hybrid ARQ-FEC techniques [49] [50] [51] [52] [53], where intermediate nodes or routers simply duplicate and forward packets. There are only a few practical designs and performance studies for intra-flow NC-based multicast protocols. MORE [57] is the first intra-flow NC-based protocol for reliable unicast and multicast over WMNs.
In MORE, any node that overhears a transmission and that is closer to the destination than the sender may participate in forwarding the packet to the destination. This transmission participation scheme results in forming a forwarding belt instead of a path, to increase network robustness. A FN encodes packets with random network coding before forwarding them. However, in MORE, multiple FNs forward the coded packets without any coordination. Belt forwarding can be inefficient, especially in multicasting, as multiple overlapping belts are formed and the number of nodes participating in packet forwarding becomes larger than necessary. Further, MORE lacks source rate control, which can cause congestion.

Pacifier [58] is a state-of-the-art reliable multicast protocol for WMNs. It improves upon MORE by using a multicast tree instead of forwarding belts. Only nodes on the multicast tree perform random network coding on incoming packets. They also forward the coded packets along the tree. Pacifier also proposes a mechanism to limit the transmission rate of the source to prevent congestion using backpressure. It has been shown that Pacifier is able to achieve performance improvements over MORE.

In the designs of both MORE and Pacifier, the source computes a transmission credit (TC) for each FN through periodic packet loss rate measurements for the links in the network. The TC values are carried in the coded packets. When a FN receives a packet from its parent node, it uses the TC field to determine whether and how many coded packets to transmit to its children nodes. Successful data delivery is verified through end-to-end acknowledgements. The source continues broadcasting coded packets until it receives end-to-end ACKs from all destination receivers. Reacting to end-to-end ACKs, in addition to using TCs computed at the source, causes many problems, such as inaccurate estimation of TCs and inability to handle heterogeneous receivers.

Peer-to-peer (P2P) overlay networks have been widely studied [60] [61]. We consider P2P networks related to our work from an application perspective (a P2P network also aims to achieve reliable and high throughput data delivery). Further, in P2P networks, participants make a portion of their resources, such as processing power and storage, available to other network participants (this is similar to the FNs in our protocol, which share their caching and processing capabilities). However, in P2P systems, peers at the edge of networks form an application layer overlay with little knowledge about the underlying physical network topology. Therefore, it is difficult, if not impossible, to achieve efficient network routing. It is also difficult to take advantage of the characteristics of the underlying physical network (e.g., to take advantage of overhearing if the underlying physical network is a wireless network). Furthermore, most P2P applications, such as BitTorrent, employ TCP for reliable transport between peers. TCP has well-known problems in wireless networks.

Research on future Internet architectures has started recently [62] [63]. A cache-and-forward scheme has been proposed to provide unified network transport services for the future Internet [64]. However, that work mainly focuses on architectural designs and unicasting, with no NC in the picture. Our proposed protocol therefore fits well in the cache-and-forward future architecture (due to its hop-by-hop nature) to achieve reliable and high-throughput multicast applications over WMNs.
To the best of our knowledge, HopCaster is the first practical protocol that integrates NC and receiver-driven hop-by-hop transport with the optimization of link layer transmission and network layer routing for efficient reliable multicasting.

5.3. HopCaster Design

The design of HopCaster addresses the weaknesses suffered by intra-flow NC-based multicasting protocols that are based on the end-to-end transport principle, such as Pacifier and MORE. In this section, we describe the design of HopCaster, and we compare it with that of the conventional end-to-end approach.

5.3.1. Hop-by-Hop Reliable Multicast with Network Coding

HopCaster uses tree-based hop-by-hop opportunistic multicasting with intra-flow NC. Building the tree is initiated by requests sent from the content receivers (also called destinations) to the content source, connected by a set of intermediate wireless routers (also called forwarding nodes (FNs)). We describe the procedure of establishing and maintaining the hop-by-hop transport tree in detail later in this section. As shown in Figure 5.1, the solid lines represent the parent-child links on the multicast tree, and the dashed lines indicate that a node can overhear packets from other neighbors, such as grandparents, siblings, and child nodes. We see that there can be many overhearing opportunities for a node on the multicast tree.

Figure 5.1: Overhearing along a multicast tree in a WMN.

A large content file can be divided into multiple batches at the source before distribution, to reduce packet header overhead and decoding delay at the receiver. A batch is further divided and encapsulated into k packets. Random linear block codes [54] are applied on the packets belonging to a batch to generate coded packets. A coded packet here is a random linear combination of the k packets with coefficients chosen from a Galois field of size 2^8. The source broadcasts the coded packets when the MAC allows transmission. Each transmitted packet is also augmented with its batch ID and the vector of the random coefficients used to generate the packet.

When a FN on the multicast tree receives or overhears a coded packet from any of its neighbors, it checks whether the overheard packet is innovative, i.e., whether it is linearly independent of the packets obtained in previous transmissions (Gaussian elimination is used for the independence check; a small sketch of this rank test is given at the end of this subsection). An intermediate FN on the multicast tree caches all the innovative packets of the batch, and also performs random linear combinations of them and sends the re-encoded packets. Unlike in Pacifier and MORE, data transport in HopCaster operates in a receiver-driven hop-by-hop manner; after a FN successfully receives or overhears (from all possible neighbors) a number of innovative coded packets sufficient to decode a data batch (i.e., receives k linearly independent packets with the same batch ID), it performs the following operations:

• It sends an ACK to its parent. It indicates in the ACK message whether it wants to request the next batch. A FN uses the ACK-request message for controlling the data flow (more discussion about flow and congestion control is in Section 5.3.3).

• The FN becomes a new source for the batch, responsible for delivering this batch to its children. Note that if a batch has been requested by its children, a FN can start broadcasting random linear combinations of packets belonging to the batch without waiting to reach full rank (i.e., k innovative packets), so as to keep the pipeline going and reduce delay.
Random network coding eliminates the transmission of duplicate copies of a packet over the wireless medium, and the probability that a node receives non-innovative packets from its neighbors decreases exponentially with the code length [6].

The parent starts sending the next batch as soon as it receives the first acknowledgement for the current batch and a request for the next batch from one of its children. The parent node sends the batches in a round-robin fashion; i.e., after the last batch is acknowledged by a child, it goes back to sending the batches that have not yet been acknowledged by all its children. This continues until all batches are acknowledged by all children. With this scheduling scheme, the children with good link quality can finish receiving the whole file faster without waiting for their poorly connected counterparts. Children with poor connections would be served in later rounds.

A FN keeps the received batches of data in its cache (buffer memory or storage) as long as it is still in the multicast group and has capacity available in its buffer. FNs use the Least Recently Used (LRU) policy for buffer replacement. The cached copies may be used during network topology changes, or by receivers newly joining the multicast group, as described below.

The source and FNs on the multicast tree maintain a soft-state batch request and ACK table (BRAT). This table keeps track of batches with pending requests (IDs of batches that were requested but not served yet). It also keeps track of the acknowledgement state of batches (for each batch ID, the IDs of child nodes that acknowledged the batch). Every parent node knows the list of its children through multicast tree construction.

HopCaster uses a subscribe/join scheme to build the multicast tree and to populate and refresh the BRAT. The tree is rooted at the source and consists of all the shortest paths from the source to the destination receivers. It is assumed that a unicast routing protocol such as OLSR [65] or HWMP [66] is used to construct unicast paths in the network employing the expected transmission count (ETX) metric [67]. Our HopCaster protocol can essentially be implemented as a shim layer on top of layer 3. A destination receiver interested in the content file sends a request message to its parent on the path toward the source. The request message contains a batch ACK field (BAF), which is a bitmap indicating which batches have been successfully received by the sending node. The request message is propagated and processed hop-by-hop along the path toward the source. If the parent node has cached the batches needed by the sender of the request message (the corresponding bits in the BAF bitmap have not been set yet), the needed batches get served from this parent node to save bandwidth and reduce delay. The parent node updates the BAF field in the request message by setting the bit corresponding to each batch that it will serve. It then sends the request message to its parent. If at some point all the BAF bits get set, then the request message is not forwarded upstream anymore. Otherwise, the request message propagates until it gets to the root of the tree. In HopCaster, BRAT entries regarding received request messages are maintained as a soft state that is discarded after a timeout. A destination receiver is responsible for sending its request periodically toward the source in order to refresh the state in the BRAT and to indicate that its request is still valid.
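The hop-by-hop request processing just described can be sketched as follows (Python). The message fields and helper names (Request, process_request, serve_locally, forward_to_parent) are our own illustrative choices, not the actual HopCaster message format.

    from dataclasses import dataclass

    @dataclass
    class Request:
        child_id: int
        num_batches: int
        baf: int          # batch ACK field: bit b set => the sender already holds batch b

    def process_request(req, local_cache, brat_pending, serve_locally, forward_to_parent):
        # local_cache: set of batch IDs cached at this node
        # brat_pending: soft-state map {batch_id: set of child IDs with pending requests}
        for b in range(req.num_batches):
            if req.baf & (1 << b):
                continue                              # sender already has this batch
            if b in local_cache:
                brat_pending.setdefault(b, set()).add(req.child_id)
                serve_locally(b, req.child_id)        # serve the batch from this node's cache
                req.baf |= (1 << b)                   # mark the batch as handled
        if req.baf != (1 << req.num_batches) - 1:
            forward_to_parent(req)                    # some batches are still missing upstream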
For bandwidth efficiency, request messages can be combined with batch ACK messages when possible. When changes in network topology are detected or when links break, the path between the destination and the source is reconstructed by the underlying unicast routing protocol. To speed up the recovery process, the affected nodes are informed of the events of link breaks and routing changes. If a node changes its parent node toward the source, the node sends a request message to its new parent node on the reconstructed path toward the source. The request message contains the BAF field to indicate its progress in content delivery, i.e., which batches it has successfully received. If a node loses a child node due to routing changes, it removes the state for that child. Note that in HopCaster, a new receiver can be efficiently served by the closest node that has the requested content locally cached.

5.3.2. Multicast Rate Adaptation

The hop-by-hop nature of HopCaster, in which a FN keeps track of its children and serves as a new source for locally cached content, motivates a cross-layer rate adaptation algorithm to assist HopCaster in determining the optimal transmission rate at the PHY layer. Existing radios such as IEEE 802.11 [68] support multi-rate capability at the physical layer by using different modulation and coding scheme (MCS) modes. However, there is no link-layer feedback for multicast packets in 802.11. While most rate adaptation mechanisms try to maximize the PHY rate for unicast, a low basic rate is simply selected for multicast, which is certainly not optimal.

Along the multicast tree, the source node or a FN may have multiple children. The children with good channel conditions and more overhearing opportunities will finish receiving a data batch earlier. Therefore, for each parent node, the population of children nodes will change while sending a batch. In HopCaster, a FN on the multicast tree periodically reports its link conditions to its parent. The reporting messages can be combined with the batch request and ACK messages when possible to reduce the overhead. The parent then knows which children still need the coded packets for a particular batch (based on the BRAT) as well as their channel conditions. It selects a PHY rate for a packet transmission to maximize the throughput at its intended children.

Given a radio technology and PHY mode, the packet loss probability can be estimated from the channel SNR γ. Just as an example, DPSK, QPSK and CCK modulations with different channel coding schemes are used in IEEE 802.11b PHY modes m = 1, 2, 3 and 4 to get transmission rates of 1 Mbps, 2 Mbps, 5.5 Mbps, and 11 Mbps, respectively. The packet loss probability can be calculated by [69]:

P_e(L, γ, m) = 1 − [1 − P_e^1(24, γ)] × [1 − P_e^m(L, γ)]          (5.1)

where P_e^1(24, γ) is the error probability of the physical layer convergence procedure (PLCP) sublayer header, which is 24 bytes long and is always transmitted with PHY mode 1. P_e^m(L, γ) is the error probability for the L-byte data payload and MAC header transmitted with PHY mode m, which can be expressed as

P_e^m(L, γ) = 1 − (1 − P_b^m(γ))^(8L)          (5.2)

where P_b^m(γ) is the bit error rate (BER) that corresponds to SNR γ under PHY mode m. From Equation 5.1, it is clear that the packet error probability depends on the PHY mode, the packet size, and the channel SNR γ. Note that this is true for other radios such as 802.11a/g/n, even though they use more complex PHY modulation, channel coding, and antenna techniques in different PHY modes.
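As an illustration of how a parent node could use Equations 5.1 and 5.2, and of the rate selection formalized as Equation 5.3 in the next subsection, consider the following sketch (Python). The per-mode BER function ber(gamma, m) is left as a placeholder; in practice it would come from a modulation/coding analysis or from offline measurements, and the rate table below simply lists the 802.11b modes mentioned above.

    # IEEE 802.11b PHY modes and their nominal data rates in kbps
    RATE_KBPS = {1: 1000, 2: 2000, 3: 5500, 4: 11000}

    def payload_error(L, gamma, m, ber):
        # Equation 5.2: P_e^m(L, gamma) = 1 - (1 - P_b^m(gamma))^(8L) for an L-byte payload
        return 1.0 - (1.0 - ber(gamma, m)) ** (8 * L)

    def packet_error(L, gamma, m, ber):
        # Equation 5.1: the 24-byte PLCP header (always mode 1) and the payload must both succeed
        p_header = payload_error(24, gamma, 1, ber)
        p_payload = payload_error(L, gamma, m, ber)
        return 1.0 - (1.0 - p_header) * (1.0 - p_payload)

    def best_phy_mode(L, child_snrs, ber):
        # Pick the mode m maximizing sum_t r(m) * (1 - P_e^t(L, gamma_t, m)) over the children
        # that still need the current batch (cf. Equation 5.3)
        def aggregate_goodput(m):
            return sum(RATE_KBPS[m] * (1.0 - packet_error(L, g, m, ber)) for g in child_snrs)
        return max(RATE_KBPS, key=aggregate_goodput)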
In general, a relationship database between the packet loss rate and the used PHY mode under different packet sizes and channel SNR values can be built through modeling or offline measurements. Different multicast receiving nodes have different channel SNR values. Given the received channel SNR γ_t for a receiving node t, P_e^t(L, γ_t, m) can be derived for each PHY mode m based on the relationship database. Suppose a multicast transmitter N currently has T receiving child nodes; the objective is to find the best PHY mode for the multicast transmitter to achieve the overall maximum throughput among all its intended receiving nodes. The objective function can be formulated as

max_m  Σ_{t=1}^{T} r(m) (1 − P_e^t(L, γ_t, m))          (5.3)

where r(m) is the data rate for the PHY mode m used by the multicast transmitter. Each child node on the multicast tree measures its channel SNR for the packets received from its parent node and periodically feeds the channel SNR measurements back to its parent node. The parent node can then estimate the objective function and select the best PHY rate. Once the parent node receives an ACK for a batch from one of its children, the PHY rate adaptation mechanism is informed of this event, and it readjusts its multicast PHY rate to obtain the maximum total throughput based on the link quality of the children that have not completed reception of this batch.

5.3.3. Comparison: Hop-by-Hop vs. End-to-End

As discussed before, the hop-by-hop transport avoids the need to estimate the value of TC that a FN uses to determine how many coded packets it should send. Furthermore, we argue that the hop-by-hop transport approach in HopCaster offers several other advantages over the end-to-end transport used in Pacifier and MORE.
• The hop-by-hop transport handles heterogeneous receivers better. Consider a situation where a bottleneck link is close to a destination node. If the end-to-end transport protocol is used, the source will keep transmitting the packets until an end-to-end ACK from the worst receiver is received. Alternatively, hop-by-hop transport avoids this deficiency because the FN upstream of the bottleneck link will cache the data and inform its parent node to stop sending unnecessary packets. The data is pushed as close to the destinations as possible, and network bandwidth is utilized more efficiently. The performance of multicast is not limited by the path of the worst connected receiver. A lossy and low-capacity wireless link only affects the receivers behind it on the multicast tree.
• The hop-by-hop transport greatly simplifies flow and congestion control. With the end-to-end approach, there are points in the network between the source and destination where congestion may occur from traffic aggregation. A congestion control mechanism that dynamically adjusts the source transmission rate to keep the aggregate traffic volume below the network congestion level is required in order to maintain flow balance; otherwise, packet loss would occur. In HopCaster, by contrast, all communications are local and flow balance is maintained at each hop through request (pull) and ACK messages. A child node sends a request to its parent only if it is ready to accept a new batch. There is no need for additional techniques to control congestion in the middle of a path. This is not the same as backpressure-based hop-by-hop flow control [70], where backpressure between adjacent nodes is used for a node to adjust its packet forwarding rate of a continuous flow.
Note that a FN in HopCaster does not have FIFO queues but an LRU-managed cache that decouples the hop-by-hop control loops, compensates for bandwidth fluctuations, and reduces the impact of multicast receiver heterogeneity. In case all the data batches in the buffer are in current use due to a very small storage size, a node can simply delay sending its request to the parent node until buffer space is available.
• The advantages of hop-by-hop transport are even more pronounced when the network becomes congested or nodes are mobile. Even for a single destination, the end-to-end connection over a path of n hops would break or deteriorate as soon as any link in the path failed or degraded, triggering route repair or route re-discovery mechanisms. The probability of disconnection or degradation on the n-hop path would be roughly n times that of a single hop. The situation is worsened in multicast since more receivers and FNs are involved. Thus, a hop-by-hop transfer is far more reliable and allows delivery from intermediate nodes after route repair.
• The propagation time of a 1-hop ACK is shorter than that of a multi-hop end-to-end ACK. Therefore, the number of redundant packets injected into the network while a 1-hop ACK is on its way back to a parent node is much smaller than that when an end-to-end ACK is sent to the source node from the destination. Moreover, the 1-hop ACK is much less likely to be delayed or lost due to congestion and transmission errors than the end-to-end ACK. In HopCaster, transmission of coded packets at a FN is not triggered by receiving a packet from its parent, but rather by the requests from the children. Thus, even if the 1-hop ACKs get lost or delayed, a packet originated from a parent node will not trigger more transmissions from the child nodes (as is the case in MORE and Pacifier).
• The hop-by-hop feedback control allows local optimization along the multicast tree. One such optimization proposed by HopCaster is to enable each parent node to adapt its radio transmission rate to maximize multicast throughput by exploiting the children's population changes and wireless channel variations.

5.4. Performance Evaluation

To evaluate the performance of HopCaster, we use Network Simulator 2 (NS-2) [71], a widely used simulator in the networking research community, to simulate both HopCaster and Pacifier and compare their performance under various settings. We choose Pacifier, not MORE, for performance comparison with HopCaster because it is reported in [58] that Pacifier achieves better performance than MORE.

We performed our evaluation under many different network topologies and parameter settings. We report results with the following settings. We have a total of 50 nodes. We generate 10 scenarios, each corresponding to a topology, by placing 50 nodes randomly in a 1000×1000 square meter area. There is one content source node and 9 destination receivers requesting the content file. The source and destinations are chosen randomly in each scenario. Note that a node could be selected as a destination and a FN at the same time. Further, we simulate a realistic wireless channel with a shadowing propagation model [72] and we assume a maximum radio transmission range of 276 meters. In our simulations we implemented the radio multi-rate capability, as described later in this section. In the simulations, the size of the file to be transmitted from the source to the multicast destinations is 2.14 MB.
The source divides the file into 48 batches, each of which is divided into 32 packets with a 1460-byte payload per packet. To collect the packet loss probability values required by Pacifier to compute the TCs, following the Pacifier implementation [58], we allow the 50 nodes involved in the simulation to periodically exchange hello messages. By keeping track of the lost hello messages, the loss probability values of all the links in the network are calculated. Next we provide our simulation results for the scenarios with static and dynamic multicast trees, and explain the gains attained by HopCaster.

5.4.1. Static Multicast Experiments

In these experiments, the multicast tree is established at the beginning of the experiment, and no new receivers join the tree to request the content file during the course of data transmission. The dynamic case will be investigated in the next section. Figure 5.2 shows the average throughput of Pacifier and HopCaster in the 10 different scenarios, in addition to the overall average throughput over all the scenarios. Here the throughput of a destination receiver under one scenario is calculated by dividing the file size by the time it takes the destination to receive the whole file. Then, averaging the throughput values over all destination receivers in a scenario, we obtain the average throughput for that scenario. The transmission data rate in these experiments is set to 2 Mbps.

As shown in Figure 5.2, HopCaster achieves a higher throughput compared to Pacifier in all scenarios. The gain ranges from 6% (scenario 3) to 29% (scenario 9), and the average is 12%. The higher throughput of HopCaster results from injecting fewer intra-flow NC-coded packets into the network, reducing the number of required transmissions. This can be further clarified by Figure 5.3. The figure shows the number of transmissions by HopCaster and Pacifier in all the scenarios. Each bar in the graph is divided into two parts: the bottom part corresponds to the number of coded data packet transmissions and the top part corresponds to the number of ACK packet transmissions. Note that in HopCaster the ACKs are hop-by-hop, and in Pacifier the ACKs are end-to-end. HopCaster results in sending fewer data packets and more ACK packets than Pacifier. The hop-by-hop ACK in HopCaster significantly reduces the number of redundant data packets injected into the network. Thus, HopCaster has a smaller total number of packet transmissions (data and ACK) in all scenarios except scenario 4 (both protocols have almost the same number of transmissions). In addition, the channel time of sending a data packet is much larger than that of an ACK packet because the ACK packet is much smaller. Figure 5.4 shows the redundancy in both protocols. Redundancy is computed as the total number of bytes in all transmitted packets (data and ACK) divided by the file size in bytes. Note that HopCaster results in much less redundancy over all scenarios.

Figure 5.2: Average throughput (Kbps) of HopCaster and Pacifier under 10 different scenarios.

Figure 5.3: The number of transmissions (ACK and data packets) of HopCaster and Pacifier under 10 different scenarios.
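The two metrics reported in Figures 5.2 and 5.4 can be computed directly from the simulation traces; a small sketch (Python, with illustrative variable names) is:

    def average_throughput_kbps(file_bytes, completion_times_s):
        # per-receiver throughput = file size / time to receive the whole file, then averaged
        per_receiver = [(file_bytes * 8 / 1000.0) / t for t in completion_times_s]
        return sum(per_receiver) / len(per_receiver)

    def redundancy(file_bytes, transmitted_packet_sizes_bytes):
        # total bytes of all transmitted packets (data and ACK) divided by the file size
        return sum(transmitted_packet_sizes_bytes) / float(file_bytes)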
Figure 5.4: The number of redundant bytes that are injected into the network by HopCaster and Pacifier in 10 different scenarios.

5.4.2. Dynamic Multicast Experiments

As opposed to the previous static multicast simulation experiments, where no new receivers were joining the multicast tree to request the content file during transmission, this section addresses the dynamic multicast case, in which new receivers request to join the tree for receiving the content data during transmission. In particular, the simulation starts with one source and 8 receivers, i.e., one of the 9 receivers in the previous simulation experiments will be idle at the beginning of the simulation experiment. At some point in time, that node will request joining the multicast tree. For HopCaster, the joining node will use the request mechanism described in Section 5.3.1. For Pacifier, as described in [58], the joining node will send a join message toward the source (the root of the tree), which in turn will register the joining node as a new multicast receiver. We compare the new receiver's throughput under HopCaster and Pacifier in 10 different scenarios. In each scenario, the new receiver and the time of the request are chosen randomly (the same randomly chosen values are used for both HopCaster and Pacifier in an experiment).

Figure 5.5 shows that the new receiver's throughput under HopCaster is much greater than that with Pacifier. The reason is that in HopCaster, the new receiver can be immediately served from the closest upstream node in the multicast tree, whereas in Pacifier, only the source can serve a new node since the source needs to compute the TCs for the FNs to send the coded packets to the newcomer. This demonstrates a desirable feature of the hop-by-hop transport approach: over time, the content moves deeper in the multicast tree and closer to the destinations, and a new receiver is served by the closest node that has cached the requested content.

Figure 5.5: Average throughput of the new receiver that dynamically joins the multicast group with HopCaster and Pacifier in 10 different scenarios.

5.4.3. Multicast Rate Adaptation

In this section we demonstrate the throughput gain of the rate adaptation algorithm described in Section 5.3.2. In our simulations, children nodes report the SNR of received packets to their parent node every 10 seconds. Whenever a parent node is about to transmit a packet, it evaluates the optimal data rate according to Equation 5.3 for m = 1, 2, 3, 4, which correspond to data rates of 1 Mbps, 2 Mbps, 5.5 Mbps, and 11 Mbps, respectively. Equation 5.3 is evaluated only for children nodes that have not acknowledged the current batch. For comparison, Figure 5.6 illustrates the average throughput values for the 10 scenarios using the adaptive data rates as well as 2 fixed data rates: 2 Mbps and 11 Mbps.

Figure 5.6: Average throughput of HopCaster and Pacifier under 10 different scenarios using 2 Mbps, 11 Mbps, and adaptive data rates.

The performance of both HopCaster and Pacifier degrades under 11 Mbps compared to 2 Mbps. This is because of the high packet loss rate at 11 Mbps, which incurs more delay in sending data to destinations.
Furthermore, the ACK messages may be lost, leading to injecting more redundant packets into the network before switching to the next non-acknowledged batches. Note that the cross-layer rate adaptation mechanism greatly improves the throughput in all the experiment scenarios. The reason is that the rate adaptation algorithm enables a parent node to estimate the current conditions of the wireless links to its targeted children. Therefore, it is able to dynamically select a data transmission rate that maximizes the overall throughput across all its targeted children.

5.5. Discussion

To design a practical protocol that delivers the benefits of intra-flow NC in improving multicast performance over WMNs, many challenges need to be addressed, such as how many coded packets a FN should send and how to handle the heterogeneity of multicast receivers. In this chapter, we have designed HopCaster, a novel reliable multicast protocol that incorporates intra-flow NC with receiver-driven, hop-by-hop transport. Compared to the existing intra-flow NC-based multicast protocols, which all follow the conventional end-to-end transport principle, the HopCaster design eliminates the need for estimating the number of coded packets each FN should transmit, and simplifies multicast management and congestion control. We also proposed a cross-layer rate adaptation mechanism that enables HopCaster to optimize PHY rate selection in hop-by-hop multicast by exploiting the changing receiver population and wireless channel variations. Our simulations show that, compared to Pacifier, a state-of-the-art intra-flow NC-based multicast protocol, HopCaster greatly reduces the number of transmissions required over the wireless medium to deliver multicast data and achieves up to 29% throughput improvement. Furthermore, we showed that the advantages of HopCaster are more prominent in situations where new nodes dynamically request the content.

Chapter 6

6. Conclusion and Future Work

In this chapter we provide a conclusion and outline directions for future work.

6.1. Conclusion

In this dissertation we studied the practical aspects that arise when deploying DVC over real visual sensors. Further, we proposed and evaluated HopCaster, a scheme that employs the cache-and-forward transport strategy to achieve high throughput transmission in multicasting over wireless mesh networks while maintaining 100% packet delivery ratio.

In Chapter 1 we provided a brief background on Distributed Source Coding (DSC) and Network Coding (NC).

In Chapter 2 we studied DVC when deployed over real visual sensors. We described the design and implementation of a DVC codec that we deployed over the popular MicaZ/Cyclops and Imote2/IMB400 visual platforms. Further, we reported a summary of accurate measurements regarding the distribution of energy consumption over the different DVC operations. Moreover, we carried out an operational energy-distortion analysis for a variety of options available to RDVC on visual sensors.

In Chapter 3 we presented our approach towards a rate-adaptive DVC solution that eliminates the need for feedback messages from the base station to the visual sensors; this solution makes use of the correlation between the source rate and the conditional entropy of the WZ and key frames.

In Chapter 4 we employed and evaluated the performance of polar codes in DVC. Further, we compared the performance of polar codes with the more established and more investigated Low Density Parity Check Accumulate (LDPCA) codes in the context of DVC.
We showed that polar codes offer a clear advantage when coding smaller-size image blocks. Consequently, polar codes could represent a viable solution for distributed sensor networks that capture a low-resolution video signal at each visual sensor node.

In Chapter 5 we proposed HopCaster, a novel protocol that incorporates intra-flow NC with hop-by-hop transport to achieve high-throughput reliable multicast over wireless mesh networks. We showed that by adopting the hop-by-hop transport strategy, HopCaster avoids many problems incurred by the traditional end-to-end transport strategy.

6.2. Future Work

The intended future work falls into the following directions:
• An essential component of traditional video compression standards, such as H.264, is motion estimation (ME), which requires intensive computations that are mainly related to block-wise motion search. For DVC to compete with standard video compression schemes, there is a need to take advantage of ME at the decoder side. As we saw in Section 2.3 (Figure 2.2), there have been attempts to utilize ME in the context of DVC, mainly by allowing the encoder to provide the decoder with information that assists it in performing ME. This information can take the form of hash or cyclic redundancy check (CRC) data. Part of our future plan is to look deeper into this aspect, and to find ways that assist the decoder in performing ME using a minimal amount of overhead data.
• Part of our future plan is to bring NC and DVC together to realize a practical and efficient data networking platform that fits in resource-constrained networks. This is accomplished by integrating NC and DVC so that the two schemes work collaboratively. This is important in situations where the collected data demonstrates a degree of correlation (DVC proves to be efficient in compressing data before transmission), and in the meantime the collected data is propagated through a wireless network where overhearing is possible (NC is efficient in eliminating unnecessary transmissions and hence maintaining high throughput).
• In this dissertation the focus has been on DVC over one camera, i.e., the case where DVC aims to compress temporally consecutive frames that are captured by one camera. In fact, DVC can utilize the correlation amongst frames captured by many cameras, since when the cameras are located in spatial vicinity, the captured frames are expected to demonstrate mutual redundancy across the different cameras. Part of our future plan is to explore this aspect and to devise algorithms that can efficiently quantify the amount of disparity between frames captured across multiple cameras and utilize the correlation to achieve maximum compression.
• We showed in Chapter 4 that polar codes offer a clear advantage when coding smaller-size image blocks. Our results were in terms of compression performance. Although this result encourages the use of polar codes, it needs more evidence. As we saw in Chapter 3, a more conclusive measure would be in terms of the overall power consumption that takes into account savings in transmission energy in addition to computational energy. As we mentioned earlier, although polar codes have low encoding complexity, they are more computationally intensive than LDPC codes (O(N log N) for polar codes vs. O(N) for LDPC codes). Part of our future plan is to investigate the advantage of using polar codes in terms of the overall power consumption (computation plus transmission).
• Results in Chapter 4 show that polar codes offer a clear advantage when coding smaller-size image blocks compared to LDPCA codes. The results in Chapter 4 were attained when using the successive cancellation (SC) algorithm to decode polar codes. It is reported in the literature that the belief propagation (BP) algorithm can be used to decode polar codes, and that it achieves higher performance than SC. Part of our future plan is to incorporate BP in our DVC codec and compare it with LDPCA.

7. APPENDIX A: LDPCA Codes

Figure A.7.1: LDPCA code with different rates: (a) original Tanner graph, (b) syndrome accumulation, (c) merged graph for compression ratio 1/2, (d) merged graph for compression ratio 1/4.

Figure A.7.1 demonstrates the idea of LDPCA codes. We have an 8×8 parity check matrix that is represented by the Tanner graph [3] in Figure A.7.1 (a). When this graph is used to compress a WZ bitstream of 8 bits, a syndrome of 8 bits will be generated. Note that no compression is achieved in this case. LDPCA [36] suggests achieving different levels of compression by accumulating (summation modulo 2, equivalent to binary XOR) the syndrome bits:

a_1 = s_1
a_i = a_{i-1} + s_i, where i = 2 to n

The accumulation process is shown in Figure A.7.1 (b). Compression is achieved by transmitting only a subset of the accumulated syndrome bits. For example, if we transmit a_2, a_4, a_6, a_8, then a compression ratio of 1/2 is achieved. At the decoder, four syndrome bits are generated as follows:

s_1 + s_2 = a_2
s_3 + s_4 = a_2 + a_4
s_5 + s_6 = a_4 + a_6
s_7 + s_8 = a_6 + a_8

The four syndrome bits are used for decoding over a new Tanner graph. This graph (Figure A.7.1 (c)) results from merging each pair of consecutive check nodes in the original Tanner graph. To achieve a compression ratio of 1/4, only a_4 and a_8 are transmitted. A new Tanner graph (Figure A.7.1 (d)) is constructed by merging each set of 4 consecutive check nodes in the original graph. Further, two syndrome bits are generated:

s_1 + s_2 + s_3 + s_4 = a_4
s_5 + s_6 + s_7 + s_8 = a_4 + a_8
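A minimal sketch (Python) of the accumulate-and-puncture mechanism described above; the function names and bit-list representation are illustrative, and decoding over the merged Tanner graph is not shown.

    def accumulate(syndrome_bits):
        # a_1 = s_1, a_i = a_{i-1} XOR s_i
        acc, running = [], 0
        for s in syndrome_bits:
            running ^= s
            acc.append(running)
        return acc

    def transmit_subset(acc, step):
        # keep every step-th accumulated bit, e.g. step=2 -> a_2, a_4, ... (compression ratio 1/2)
        return acc[step - 1::step]

    def merged_syndromes(received):
        # recover the merged syndrome bits: a_2, a_4 XOR a_2, a_6 XOR a_4, ...
        out, prev = [], 0
        for a in received:
            out.append(a ^ prev)
            prev = a
        return out

    # Example with n = 8 syndrome bits and compression ratio 1/2
    s = [1, 0, 1, 1, 0, 0, 1, 0]
    a = accumulate(s)
    sent = transmit_subset(a, 2)          # a_2, a_4, a_6, a_8
    merged = merged_syndromes(sent)       # s1+s2, s3+s4, s5+s6, s7+s8 (mod 2)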
Misra, "Multi-Hypothesis Based Distributed Video Coding using LDPC Codes," in Proc. Allerton Conference on Communication, Control, and Computing, Allerton, IL, Sept.2005. 100 [11] I. H. Tseng and A. Ortega, "Rate-distortion Analysis and Bit Allocation Strategy for Motion Estimation at the Decodec Using Maximum Likelihood Technique in Distributed Video Coding," in Proc. IEEE International Conference on Image Processing (ICIP07), TX, Sep., 2007. [12] R. Puri and K. Ramchandran, "PRISM: A New Robust Video Coding Architecture Based on Distributed Compression Principles," in Proc. Allerton Conference on Communication, Control, and Computing, Allerton, IL, Oct. 2002. [13] B. Code, E. Kaiser, M. Shea, and L. Bavoil W. Feng, "Panoptes: scalable low-power video sensor networking technologies," in Proc. of ACM Multimedia, CA, USA, Nov. 2003. [14] D. Ganesan, P. Shenoy, and Q. Lu P. Kulkarni, "SensEye: a multi-tier camera sensor network," in Proc. of ACM Multimedia, 2005. [15] R.Halloush, K. Misra and H. Radha, "Practical Distributed Video Coding over Visual Sensors," in Proc. Picture Coding Symposium (PCS09), Chicago, IL, USA, May 6, 2009. [16] R. Halloush and H. Radha, "Practical Distributed Video Coding Based on Source Rate Estimation," in Proc. 44th Conference on Information Sciences and Systems (CISS'10), Princeton University, NJ, USA, March 2010. [17] R. Baer, O. Iroezi, J. Garcia, J. Warrior, D. Estrin, M. Srivastava M. Rahimi, "Cyclops: in situ image sensing and interpretation in wireless sensor networks," in Proc. ACM Conference on Embedded Networked Sensor Systems (SenSys), San Diego, CA, Nov. 2. [18] Crossbow Corporation. [Online]. http://www.xbow.com [19] Atmel Corporation. [Online]. http://www.atmel.com/products/avr [20] ChipCon Inc. [Online]. http://www.chipcon.com [21] ZigBee Alliance Homepage. [Online]. http://www.zigbee.org [22] A. Goode, D. Goel, and I. Nourbakhsh A. Rowe, "CMUcam3: an open programmable embedded vision sensor," Carnegie Mellon Robotics Institute, Technical Report 2007. [23] C. Park and P. Chou, "eCAM: ultra compact, high data-rate wireless sensor node with a miniature camera," in Proc. International Conference on Embedded Networked Sensor Systems (SenSys06), New York, NY, USA, 2006. 101 [24] A. Abbo, B. Schueler, A. Danilin R. Kleihorst, "Camera mote with a highperformance parallel processor for real-time frame-based video processing,” in Proc of IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS07)," , London, UK, 5 Sept. 2007. [25] Marvell Corporation. [Online]. http://www.marvell.com [26] T. Melodia and K. Chowdhury I. Akyildiz, "Wireless Multimedia Sensor Networks: Applications and Testbeds," Proceedings of the IEEE Inst, vol. 96, pp. 1588–1605, 2008. [27] P. Levis, D. Culler, E. Brewer D. Gay, nesC 1.1 Language Reference Manual., May 2003. [28] Tinyos: An operating system for networked sensors. [Online]. http://www.tinyos.net [29] R. Zhang, and B. Girod A. Aaron, "Wyner-Ziv coding of motion video," in Proc. Asilomar Conference on Signals and Systems, Pacific Grove, CA, Nov. 2002. [30] E. Setton, and B. Girod A. Aaron, "Towards practical Wyner-Ziv coding of video," in Proc. IEEE International Conference on Image Processing, Barcelona, Spain, Sept. 2003. [31] S. Rane, E. Setton, and B. Girod A. Aaron, "Transform domain Wyner-Ziv codec for video," in Proc. SPIE Visual Communications and Image Processing, San Jose, CA, Jan. 2004. [32] S. Rane and B. 
Girod Aaron, "“Wyner-Ziv video coding with hash-based motioncompensation at the receiver," in Proc. IEEE International Conference on Image Processing, Singapore, Oct. 2004. [33] R. Szewczyk and D. Culler J. Polastre, "Telos: Enabling Ultra-Low Power Wireless Research," in Proc. the fourth International Conference on Information Processing in Sensor Networks (IPSN05), 2005. [34] K. Ozonat, "Lossless Distributed Source Coding for Highly Correlated Still Images," Dept. of Electrical Engineering, Stanford University, Stanford, CA., Technical report 2000. [35] "H.264/MPEG-4 Part 10: Transform and Quantization," in White paper at http://www.vcodex.com. [36] A. Aaron, B. Girod D. Varodayan, "Rate-Adaptive Codes for Distributed Source Coding," in Proc. Asilomar Conference on Signals, Systems, and Compuers, Pacific 102 Grove, CA, 2005. [37] Agilent Infiniium Oscilloscope. [Online]. http://www.agilent.com. [38] Analog Devices AD620 Instrumentation Amplifier. [Online]. http://www.analog.com [39] M. Krämer and A. Geraldy, "Energy measurements for micaz node," University of Kaiserslautern, Kaiserslautern, Germany, Technical Report 2006. [40] J. Thomas T. Cover, Elements of Information Theory.: Wiley Series in Telecommunications, 1991. [41] A. Leon-Garcia, Probability and Random Processes for Electrical Engineering, Second Edition. ed. [42] Intel Wireless MMX Technology Developer Guide., August, 2002. [43] A. Majumbar, P. Ishwar, and K. Ramchandran R. Puri, "Distributed video coding in wireless sensor networks," IEEE Signal Processing Magazine, vol. 23, pp. 94 – 106, July 2006. [44] A. Aaron and B. Girod, "Compression with side information using turbo codes," in Proc. IEEE DCC, 2002. [45] S. B. Korada, Polar codes for channel and source coding. Ph.D. dissertation, EPFL, Lausanne, Switzerland, July 2009. [46] L. Rad, H. Aghajan I. Downes, "Development of a Mote forWireless Image Sensor Networks," in Proc. of Cognitive Systems and Interactive Sensors (COGIS), Paris, France, March 2006. [47] D. Prashanth, S. Fong, and H. Aghajan S. Hengstler, "Mesh-Eye: a hybrid-resolution smart camera mote for applications in distributed intelligent surveillance," in Proc. of the 6th International Symposium on Information Processing in Sensor Networks (IPSN07), April 2007. [48] H. Cronie and S. B. Korada, "Lossless Source Coding with Polar Codes," in Proc. IEEE International Symposium on Information Theory (ISIT), Texas, USA, June 2010. [49] K. Obraczka, S.-J. Lee, and M. Gerla K. Tang, "A reliable, congestion controlled multicast transport protocol in multimedia multi-hop networks," in Proc. of WPMC, 2004. 103 [50] L. Rizzo and L. Visicano, "RMDP: an FEC-based reliable multicast protocol for wireless environments," Mobile Computing and Communications Review, vol. 2, 1998. [51] Y. Yi, K. Obraczka, S.-J. Lee, K. Tang, and M. Gerla V. Rajendran, "Combining source- and localized recovery to achieve reliable multicast in multi-hop ad hoc networks," in Proc. of IFIP Networking, 2004. [52] E. Pagani and G. Rossi, "Reliable broadcast in mobile multihop packet networks," in Proc. of ACM MOBICOM, 1997. [53] H. Baraka, and A. Fahmy A. Sobeih, "ReMhoc: A reliable multicast protocol for wireless mobile multihop ad hoc networks," in Proc. of IEEE CCNC, 2004. [54] M. M´edard, J. Shi, M. Effros, and D. Karger T. Ho, "On randomized network coding," in Proc. of Allerton, 2003. [55] H. Rahul, W. Hu, D. Katabi, M. Medard, and J. Crowcroft S. Katti, "Xors in the air: practical wireless network coding," in Proc. of ACM SIGCOMM 2006, Sept. 
[56] J. Zhang and Q. Zhang, "Cooperative network coding-aware routing for multi-rate wireless networks," in Proc. of IEEE INFOCOM, 2009.
[57] S. Chachulski, M. Jennings, S. Katti, and D. Katabi, "Trading structure for randomness in wireless opportunistic routing," in Proc. of ACM SIGCOMM, 2007.
[58] D. Koutsonikolas, Y. Liu, and C. Wang, "Pacifier: high-throughput, reliable multicast without crying babies in wireless mesh networks," in Proc. of IEEE INFOCOM, 2009.
[59] B. Ni, N. Santhapuri, Z. Zhong, and S. Nelakuditi, "Routing with opportunistically coded exchanges in wireless mesh networks," in Proc. of IEEE SECON, 2006.
[60] BitTorrent. [Online]. http://www.bittorrent.com
[61] Kazaa. [Online]. http://www.kazaa.com
[62] V. Jacobson, D. Smetters, J. Thornton, M. Plass, N. Briggs, and R. Braynard, "Networking named content," in Proc. of the 5th ACM International Conference on Emerging Networking Experiments and Technologies (CoNEXT 2009), Rome, Italy, Dec. 2009.
[63] S. Paul, R. Yates, D. Raychaudhuri, and J. Kurose, "The cache-and-forward network architecture for efficient mobile content delivery services in the future Internet," ITU-T Innovations in NGN: Future Network and Services, May 2008.
[64] L. Dong, H. Liu, Y. Zhang, S. Paul, and D. Raychaudhuri, "On the cache-and-forward network architecture," in Proc. of IEEE ICC09, June 2009.
[65] IETF RFC 3626, "Optimized Link State Routing Protocol (OLSR)."
[66] IEEE 802.11s, "Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, Amendment 10: Mesh Networking."
[67] D. S. J. De Couto, D. Aguayo, J. C. Bicket, and R. Morris, "A high-throughput path metric for multi-hop wireless routing," in Proc. of ACM MOBICOM, 2003.
[68] IEEE 802.11-2007, "Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications," 2007.
[69] J. Prado Pavon and S. Choi, "Link adaptation strategy for IEEE 802.11 WLAN via received signal strength measurement," in Proc. of IEEE ICC'03, May 2003.
[70] B. Scheuermann, M. Transier, C. Lochert, M. Mauve, and W. Effelsberg, "Backpressure multicast congestion control in mobile ad-hoc networks," in Proc. of ACM CoNEXT, 2007.
[71] The Network Simulator ns-2. [Online]. http://www.isi.edu/nsnam/ns/
[72] I. Glover and P. Grant, Digital Communications, third edition, 2009.