This is to certify that the dissertation entitled PLANNING AND CONTROL OF MOBILE SURVEILLANCE NETWORKS presented by AMIT YOGESH GORADIA has been accepted towards fulfillment of the requirements for the Ph.D. degree in Electrical and Computer Engineering.

PLANNING AND CONTROL OF MOBILE SURVEILLANCE NETWORKS

By

Amit Goradia

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Electrical and Computer Engineering

2006

ABSTRACT

Planning and Control of Mobile Surveillance Networks

By Amit Goradia

Pervasive surveillance implies the continuous tracking of multiple targets as they move about the monitored region so that they do not leave the field of view of the sensors observing them and maintain discernable resolution for feature identification. The tasks to be performed by a surveillance system are expressed as the following requirements: (1) automatically track the identified targets over the region being monitored; (2) coordinate tasking of multiple sensors to maintain target visibility and execute other requested tasks; (3) provide concise feedback and video data of a tracked target to multiple operators. Due to the inherent complexity of large networked systems and the diversity of the sensors used, a major challenge for such large scale networked systems is to design an efficient modeling and analysis tool and devise stable control algorithms for accomplishing the surveillance task.

The ability to track multiple targets simultaneously using sensors with motion capability is an important aspect of surveillance systems. Current feature-point-based visual surveillance and tracking techniques generally employed do not provide an adequate framework to express the surveillance task of tracking multiple targets simultaneously using a single sensor. This dissertation presents the method of Hausdorff tracking, which can express the surveillance task succinctly and track multiple targets simultaneously using a single sensor. Hausdorff tracking can readily express the surveillance tasks of maintaining the visibility and adequate resolution of a target as the minimization of an error (shape function) and accomplish the tracking task using feedback directly from the acquired image. Mutational equations are used to represent the motion of the target sets with respect to the motion of the camera. Using the mutational dynamics model, a feedback map to control the sensor motion module for accomplishing the surveillance task can be developed. Further, in order to take advantage of the redundancy in the motion control, an optimal Hausdorff tracking framework is presented.
Despite the limited sensing capability and range of the individual sensors, a surveillance network can track targets over a large region by transferring the target-tracking task from one group of sensors to another based on the motion of the target. Mutational analysis and shape-based control fail to capture the discrete switching nature of the surveillance task of tracking the target using multiple switched sensors. This dissertation presents a mutational hybrid automata (MHA) model for such pervasive surveillance networks that retains the advantages of using mutational equations for modeling the active surveillance task while also being able to model the discrete switching between various sensors. Analysis of example pervasive surveillance scenarios modeled using the MHA model and experimental results that verify the proposed approach are presented.

The active sensors needed to track the target keep changing due to target motion. Hence the concise video feedback provided to the operator to assist decision-making needs to be switched to the sensors currently involved in the tracking task. A task-based metric representing the number of cameras required to track a moving target over a monitored region is presented. This metric (in conjunction with other metrics) can be used to optimally place cameras in a monitored environment. An optimized camera selection algorithm for selecting the minimum number of cameras to track a moving target over a large area is also presented.

A surveillance test-bed has been designed based on these requirements and the algorithms and subsystems described have been implemented on it. Results of the experimental implementation validate the proposed approaches.

To my parents and Prerna...

ACKNOWLEDGEMENTS

I would like to thank my advisor, Dr. Ning Xi, without whom this would not have been possible. I would like to thank Dr. Matt Mutka for providing acute insights during our numerous discussion sessions. I would like to thank all my other guidance committee members, Dr. Hassan Khalil, Dr. Erik Goodman and Dr. Gowei Wei, who have devoted many hours on the committee meetings. Their insightful comments and suggestions have enhanced the technical soundness of this dissertation. I would also like to express my deepest gratitude to the Dean, Dr. Satish Udpa, and Dr. Subir Biswas for their support and advice through the course of my career.

I am grateful to my friends and colleagues from the Robotics and Automation Laboratory. The discussions with Zhiwei Cen, Dr. Yu Sun, Mike (Quan) Shi, Yang Liu, Dr. Yantao Shen, Long Pan, Dr. Jindong Tan, Dr. Imad Elhajj, Dr. Guangyong Li, Uchechukwu Wejinya, and Jiangbo Zhang have substantially contributed to my work and broadened my knowledge. I would like to express my great appreciation to Clayton Haffner, Michael Huntwork, Mat Prokos and Craig Pomeroy for the endless hours of painstaking work they have put in at the Robotics and Automation Laboratory.

My thanks go to my family, especially my fiancee Prerna and my parents. This dissertation would not have been possible without their years of encouragement and continuous support.

TABLE OF CONTENTS

LIST OF FIGURES ix
LIST OF TABLES xii

1 Introduction 1
1.1 Background . . . 1
1.2 Challenges and Problems in Surveillance Networks . . . 5
1.3 Literature Review . . . 7
1.4 Objectives of this Research . . . 12
1.5 Major Contributions of this Dissertation . . . 13
1.6 Organization of this Dissertation . . . 15

2 Mathematical Background 17
2.1 Introduction . . . 17
2.2 Mutational Equations . . . 19
2.2.1 Evolution of a Tube . . . 20
2.2.2 Controlled Mutational Equations . . . 21
2.3 Hybrid Dynamic Systems . . . 22
2.4 Mutational Hybrid Dynamic Systems . . . 24
2.5 Shape Analysis . . . 25
2.5.1 Shape Functions . . . 25
2.5.2 Directional Derivative of Shape Functions . . . 26
2.6 Chapter Summary . . . 27

3 Modeling Mobile Surveillance Networks 28
3.1 Introduction - Example Surveillance Scenario . . . 28
3.2 Controlled Mutational Equation Model for Camera Sensor and Target . . . 33
3.2.1 Perspective Projection Model . . . 33
3.2.2 World Space Model . . . 34
3.3 Incorporating Non-holonomic Sensor Constraints . . . 37
3.4 Hausdorff Tracking . . . 39
3.4.1 Image-Based Hausdorff Tracking . . . 41
3.4.2 Cooperative Hausdorff Tracking . . . 46
3.5 Mutational Hybrid Automata Model for Pervasive Surveillance Networks . . . 50
3.5.1 Mutational Hybrid Automata Model . . . 50
3.5.2 Controller Design for Individual Cameras . . . 54
3.6 Chapter Summary . . . 55

4 Analysis of Surveillance Networks 57
4.1 Lyapunov Theorem for Shape Space . . . 57
4.2 Stability Analysis of Hausdorff Tracking . . . 60
4.3 Analysis of Example Surveillance System using Mutational Hybrid Automata Model . . . 61
4.3.1 Two-Camera Example for Pervasive Surveillance with Continuous Servoing Cameras . . . 62
4.3.2 Stability of the Two-Camera Example System . . . 70
4.3.3 Two-Camera Example for Pervasive Surveillance with Controlled Switches . . . 74
4.3.4 Stability Analysis of the Two-Camera Example with Controlled Switches . . . 79
4.4 Chapter Summary . . . 83

5 Design of Surveillance Networks 84
5.1 Wide Area Camera Calibration . . . 85
5.1.1 Unitless Calibration . . . 85
5.1.2 Method for Assigning Units . . . 88
5.2 Grouping Architectures . . . 88
5.3 Sensor Allocation for Multi-Sensor Cooperation . . . 91
5.4 Optimal Hausdorff Tracking . . . 93
5.4.1 Framework for Optimal Hausdorff Tracking Design . . . 94
5.5 Switched Video Feedback . . . 106
5.5.1 Characteristics and Implementation Details for MJPEG . . . 110
5.5.2 Characteristics and Implementation Details for H.261 . . . 110
5.5.3 Characteristics and Implementation Details for H.263 . . . 111
5.5.4 Characteristics of H.264/MPEG4-Part 10 . . . 112
5.6 Sensor Node Architecture . . . 113
5.7 Modeling Switched Video Feedback as a Dynamic System . . . 114
5.7.1 Examples of Video Feedback Algorithms - Best Resolution . . . 116
5.7.2 Example Video Feedback Algorithm - Persistent Camera . . . 117
5.7.3 Assessment Metric for Video Switching Based on Camera Locations . . . 117
5.8 Optimization for Target Tracking using Switched Sensors . . . 119
5.8.1 Optimal Control and Dynamic Programming . . . 119
5.8.2 Dynamic Programming Algorithm Implementation . . . 121
5.9 Chapter Summary . . . 121

6 Experimental Implementation and Testing 124
6.1 Experimental Test-bed Setup . . . 124
6.1.1 Hardware Setup . . . 126
6.2 Software Setup . . . 127
6.3 Experimental Results for Wide Area Camera Calibration . . . 128
6.4 Experimental Implementation for Image-Based Hausdorff Tracking . . . 131
6.4.1 Tracking a Single Target . . . 131
6.4.2 Tracking Multiple Targets Simultaneously . . . 133
6.5 Experimental Implementation for Cooperative Hausdorff Tracking . . . 136
6.6 Experimental Implementation for Redundancy Resolution using Optimal Hausdorff Tracking . . . 141
6.7 Experimental Results for Multisensor Cooperation . . . 146
6.7.1 Experimental Evaluation for the Sensor Allocation Algorithm . . . 146
6.7.2 Experimental Evaluation of Multiple Target Tracking using Multiple Formation-Constrained Sensors . . . 148
6.7.3 Experimental Evaluation for Multi-Sensor Cooperation using Mutational Hybrid Automata Model . . . 155
6.8 Performance Analysis of the Video Subsystem . . . 162
6.8.1 Performance Analysis for Capture and Processing Time . . . 164
6.8.2 Comparison for Video Size and Image Quality . . . 165
6.8.3 Performance Analysis for Video Bitrate and Frame Rate . . . 165
6.8.4 Performance Analysis for Server Switching and Camera Handoff . . . 166
6.8.5 Performance Metric Evaluation for Various Surveillance Network Tasks . . . 168
6.9 Chapter Summary . . . 169

7 Conclusions 170
7.1 Conclusions . . . 170
7.2 Future Research Directions . . . 176

Bibliography . . . 181

LIST OF FIGURES

1.1 Pervasive Surveillance Scenario . . . 2
3.1 3 Camera Pervasive Surveillance Scenario . . . 29
3.2 Modeling Pervasive Surveillance Networks . . . 30
3.3 Target and coverage set for image-based Hausdorff tracking . . . 42
3.4 Block diagram for image-based Hausdorff tracking . . . 46
3.5 Target and coverage set 2D for cooperative Hausdorff tracking . . . 48
3.6 Target and coverage set for 3D cooperative Hausdorff tracking . . . 48
4.1 Example pervasive surveillance scenario with 2 cameras . . . 62
4.2 Graphical MHA model for the two-camera scenario . . . 65
4.3 Graphical MHA model for the two-camera scenario with controlled switches . . . 76
5.1 The architecture of the optimal task distribution scheme . . . 95
5.2 The structure of the task analyzer module . . . 97
5.3 The structure of the optimization scheduling module . . . 99
5.4 Class functions for physical programming . . . 102
5.5 Modification of the preference mapping by confidence factor . . . 104
5.6 Switched video architecture implementation . . . 107
5.7 H.264 Based Switched Video Streaming Client . . . 109
5.8 Architecture of Sensor Node . . . 113
5.9 Generating Graph for the Camera Switching Strategy . . . 122
6.1 Surveillance network implementation over an IP network
6.2 Surveillance test-bed designed along with the cameras and the motion modules for camera motion
6.3 Sony EVI-D30 mounted on Robot Nomad XR-4000
6.4 Active Cameras: Sony EVI-D30 and dual head Hitachi KP-D50
6.5 Camera Setup: Location and pointing direction of the camera setup
6.6 Image-based Hausdorff tracking using active camera
6.7 Image-based Hausdorff tracking using active camera: Maintaining adequate target resolution
6.8 Tracking multiple targets using Image-based Hausdorff tracking
6.9 Image-based Hausdorff tracking for multiple targets: Estimated position of targets
6.10 Image sequence for tracking multiple targets using Image-based Hausdorff tracking
6.11 Position of robot and target set on x and y axes
6.12 Motion of target set and robot on x-z plane. The view cone of the robot is shown at a particular instance in time with the position of the target centroid shown as well
6.13 Shape functions for cooperative and image-based Hausdorff tracking
6.14 Robot's view of target at several points in time to show several conditions that the tracking system responds to
6.15 Target tracking using a Sony EVI-D30 camera mounted on Robot Nomad XR-4000 mobile robot
6.16 Results of the tracking task with optimal task distribution using physical programming
6.17 The energy consumption for no optimization, and weighted sum
6.18 The pan angle for no optimization, and weighted sum
6.19 Estimated position of targets for multisensor cooperation using sensor allocation algorithm
6.20 Experiment setup for multi-target tracking using multiple cameras . . . 150
6.21 Target tracks for multi-target tracking using multiple cameras . . . 151
6.22 Experiment setup for multi-target tracking using multiple cameras . . . 152
6.23 Experiment setup for multi-target tracking using multiple cameras . . . 153
6.24 Image sequences taken while tracking the two targets with two formation-constrained cameras . . . 154
6.25 X-Y plot for target motion . . . 155
6.26 Current MHA mode, shape function, velocity, pan angle and target visibility for Cam1 . . . 157
6.27 Current MHA mode, shape function, velocity, pan angle and target visibility for Cam2 . . . 158
6.28 Image sequences taken while tracking the target over different modes . . . 159
6.29 Current MHA mode, shape function, velocity, pan angle and target visibility for Cam1 . . . 160
6.30 Current MHA mode, shape function, velocity, pan angle and target visibility for Cam2 . . . 161

LIST OF TABLES

6.1 Scenarios and gain values . . . 164
6.2 Capture and Compression Time of the Two Schemes . . . 164
6.3 Frame Rate and Bitrate for MJPEG, H.261 and H.263 . . . 165
6.4 Server Switching Time for Monitoring and Tracking Tasks . . . 167
6.5 Metric for Comparison of Various Schemes . . . 168

Chapter 1

Introduction

1.1 Background

A sensor network is a collection of individual nodes with each node having sensing, communication, computation and motion capabilities.
Distributed wireless sensor networks have recently emerged as an important research area. Sensor networks consist of a variety of different types of sensors in distributed areas. Technological advances in wireless networking and distributed robotics have been leading to increasing research on distributed sensing applications using wireless sensor networks. Infrastructure-less surveillance and monitoring are important applications of such rapidly deployable sensor networks.

Networked surveillance systems provide an extended perception and distributed reasoning capability in monitored environments through the use of multiple networked sensors. Various types of sensors with varying sensing modalities, such as cameras, infrared detector arrays, laser rangefinders, omnidirectional acoustic sensors, etc., can be instantly deployed in hostile environments, inaccessible terrain and disaster relief operations to obtain vital reconnaissance information about the area being surveyed. Locomotion and active sensing on individual nodes can greatly increase the range and sensing capability of the network. Such instantly deployable, wireless, mobile, distributed sensor networks find myriad civilian and military applications, such as battlefield surveillance, environment monitoring, scene reconstruction, motion detection and tracking, remote sensing and global awareness.

Figure 1.1: Pervasive Surveillance Scenario

There is a growing interest in surveillance applications due to the growing availability of sensors and processors at a reasonable cost. There is also a growing need from the public for improved safety and security in large urban environments and improved usage of resources of public infrastructure. This, in conjunction with the increasing maturity of algorithms and techniques, is making possible the application of this technology in various application sectors such as security, transportation, and the automotive industry. In particular, the problem of remote surveillance of unattended environments has received growing attention in recent years, especially in the context of:

- Safety in transport applications [1], [2], such as monitoring of railway stations [4], [5], underground stations [3,4], airports [5,6] and airplane routes [7-9], motorways [10], [11], urban and city roads [12,13], maritime environments [14,15];

- Safety or quality control in industrial applications, such as monitoring of nuclear plants [16] or industrial processing cycles [1,2];

- Improved security for people's lives, such as monitoring of indoor or outdoor environments like banks [29], supermarkets [3], car parking areas, waiting rooms [17], buildings [18,19], etc., remote monitoring of the status of a patient [20], remote surveillance of human activity [21,22];

- Military applications for surveillance of strategic infrastructure [23,24], enemy movements in the battlefield [25,26] and air surveillance [27,28].

Surveillance networks are composed of numerous independent nodes which can be controlled individually. Hence they have high fault tolerance to node failure. However, this redundancy and fault tolerance comes at a price. One needs to design cooperation and coordination schemes in order for the individual nodes to perform tasks together. Further, these cooperation and coordination mechanisms should not consume too much power and other resources, as they would otherwise render the advantages due to redundancy useless.
Pervasive surveillance can be defined as the active monitoring of recognized targets as they move through a large monitored area using a network of sensors. An important characteristic of pervasive surveillance systems is their capability to track a target over a large area using multiple sensors. Figure 1.1 depicts a scenario for conducting reconnaissance on a city square. Each individual sensor can view only a part of the monitored area, but collectively the network can monitor the entire region.

There are a number of reasons that optimal coverage may not be available for all regions of a large area being monitored, such as: a lack of prior knowledge of the environment, a paucity of locations for placement of sensors in the environment, or a shortage of sensors to be deployed. Motion capability of the sensors and the use of active sensing can greatly enhance the coverage quality of the network. For example, monitoring of areas with sparse or no coverage or acquiring higher resolution coverage of the targets can be accomplished by changing the active field of view of the fixed sensors or deploying mobile sensors. Further, the area monitored using a limited number of sensors can be increased by using active (mobile) sensors such as UAVs, robot-mounted cameras or cameras mounted on pan-tilt heads, etc. The mobility of these sensors can be leveraged to increase the area monitored by each sensor, which in turn increases the total area monitored. However, the mobility combined with the scalability requirements renders modeling and analysis of pervasive surveillance systems using active sensors a challenging task and brings to light many research issues.

Compared with traditional surveillance networks, an IP-network-based surveillance system is easy to deploy, maintain, and expand. The ubiquitous nature of IP networks will dramatically reduce the cost to set up a traditionally expensive surveillance network. We envision that the possible usage of surveillance networks could span the venues of industry, military and daily lives.

The surveillance network must provide, to the human operator, a timely and concise view of the relevant activities within the environment being monitored. Providing multiple video feedback streams often causes loss of attention span of the operator and makes it hard to keep track of the various activities over the various cameras. Therefore only video streams from relevant sensors should be presented to the operator on a per activity basis. This involves automatically switching the active video stream presented to the operator.

1.2 Challenges and Problems in Surveillance Networks

The tasks to be performed by a surveillance system can be expressed as the following requirements:

1. Deploy the sensors in order to monitor the region under surveillance;

2. Identify multiple moving targets based on predefined models;

3. Automatically track the identified targets over the region being monitored;

4. Coordinate tasking of multiple sensors to maintain target visibility and execute other requested tasks;

5. Provide concise feedback and video data of a tracked target to multiple operators.

The sensors should be deployed in the monitored region in order to maximize the likelihood of identifying and tracking the targets. In adversarial scenarios, care should be taken to conceal the location of the sensors in order to ensure their safety and the integrity of the task.
A major challenge in the deployment phase is sensor placement in order to maximize the covered area while also ensuring continuous tracking of a target when it traverses the coverage regions of multiple sensors.

An important aspect of sensor deployment is calibration. Calibration of the sensors enables multiple sensors to coordinate with each other and share information amongst themselves. This cooperation amongst sensors leads to better sensing and is in general a desirable property. Calibration of the sensors can be viewed in many domains (feature spaces). The most literal one is calibrating the location of the deployed sensors with respect to each other and a global frame. Other types of calibration can involve calibrating the color spaces of various cameras, which will facilitate the sharing of target models amongst the sensors.

In order to perform automated surveillance there are two major subtasks: target detection and target tracking. A target perception and video understanding module is responsible for detecting and classifying the various targets in the active field of view (FOV) of the sensor and performing temporal consolidation of the detected targets over multiple frames of detection. Moving target detection and classification is known to be a difficult research problem and has been the focus of many recent research efforts [29]. The next challenge would be to classify and associate the various detected image blobs to discernable targets and maintain their temporal tracks in order to pervasively track them. Once the targets have been identified, the challenge is to recognize the relationships of the target to the environment and other targets in order to attribute aims and motives to the tracked targets.

The target tracking problem involves continuously keeping the target in view of at least one sensor. When the target is escaping the viewing area of the current tracking sensor, the current sensor needs to be moved (relocated) or the tracking task must be transferred to another sensor which can currently view the target. Modeling and design of such cooperative tracking using multiple sensors is a challenge. Active agents further make the modeling and design of such pervasive surveillance systems very difficult. A modeling and design framework for coordinating and controlling the motion of the active cameras to pervasively track targets is necessary.

The various methods through which the human operator communicates with the surveillance network form an integral part of the usability, usefulness and efficacy of the surveillance network. These methods include passing commands and queries to the network indicating the intention of the surveillance task and receiving real-time feedback, results of the queries in progress, and alarms for violations of various predefined conditions. Video feedback is provided to the operator in order to apprise the operator of the various tracking and monitoring tasks being currently executed by the sensors.

However, due to the large number of sensors involved, the operator can be easily overwhelmed by the amount of video information. Hence, only relevant video feeds should be provided to the operator and, based on the task, the video feed should be switched from sensor to sensor.
There are various choices for the switched video feedback implementation and they have their advantages and disadvantages based on the scenario they are being applied to. A performance evaluation metric is needed for evaluating the efficacy of a particular scheme when being applied to certain scenarios.

Direct human intervention and control of the various cameras can be used for the logging and tracking of targets which are not predetermined and programmed into the network. Hence the human operator needs to move the active cameras and receive real-time video feedback to manually track targets. However, for such surveillance systems, direct manual tele-control of the active camera over the delayed IP network suffers from stability and synchronization problems due to the random, unbounded and asymmetric communication time delay in the network [30, 31]. This can result in reduced performance and even loss of tracking.

1.3 Literature Review

Networked surveillance systems have received much attention from the research community due to their many pervasive applications [32], [33]. Previous work from the literature is reviewed here.

Deployment and Calibration: Based on their mobility and rapid-deployment capability, the operation of MSNs can be divided into two separate phases, namely deployment and surveillance. Infrastructure-less rapid deployment is an important characteristic of mobile surveillance networks and many research efforts have been dedicated to optimally deploying sensor nodes for increasing their total sensing area [34-36]. A global kinematic representation of a network of connected sensors R = {R1, R2, ..., Rn} using only localization information between one-hop neighbors is suggested in [34]. This relationship between the various nodes is key to sharing meaningful information between them.

Target Identification and Video Understanding: Beyond the initial deployment phase, in the surveillance phase there are two major subtasks: target detection and target tracking. Figure 5.8 depicts the general architecture of a sensor node. The target perception module is responsible for detecting and classifying the various targets in the active field of view (FOV) of the sensor and performing temporal consolidation of the detected targets over multiple frames of detection. Moving target detection and classification is known to be a difficult research problem [29]. Many approaches such as active background subtraction [37] [38] and temporal differentiation have been suggested for detecting and classifying various types of moving targets ranging from single humans and human groups to vehicles and wildlife [39] [38]. The next problem would be to classify and associate the various detected image blobs to discernable targets and maintain their temporal tracks in order to pervasively track them. Various approaches such as extended Kalman filtering, pheromone routing and Bayesian belief nets have been suggested for maintaining the track of the various targets [40].

Target Tracking with Active Camera: A general surveillance task involves keeping moving targets in the active sensing region of the sensors with a certain pre-specified resolution.
Research approaches to this problem found in recent literature [39] [37] generally use visual servo control [41] or gaze control [42], which mainly involve feature-point-based tracking and fail to describe the basic task of maintaining the target in the sensor's active field of view with certain resolution effectively and succinctly. These approaches result in excessive camera motion that may result in blurring and other undesired effects, which in turn have detrimental effects on the surveillance task. Further, using these approaches it is not possible to describe a multiple target tracking task with randomly moving targets using a single sensor.

Sensor Cooperation and Coordination: Multiple groups of nodes are sequentially involved in order to pervasively track the target as it traverses through the monitored area, which is much larger than the range of an individual sensor. Hence there needs to be a handoff mechanism which essentially "switches" the tracking task from one group of sensors to another based on the movement of the target. Tracking itself can be described as a switching phenomenon, since the act of tracking a target in a given monitoring area which is greater than the viewing area of any given sensor requires there to be some form of hand-off, or switch, between a number of sensors. In order to assure that target visibility will be maintained even when the target goes out of the capable field of view of one sensor, the system must switch the tracking task to another sensor that has to "pick up" the target before it leaves the field of view of the original sensor. This discrete switching phenomenon along with the continuous operation of the individual nodes can be modeled as a hybrid system consisting of both continuous and discrete transitions.

Hybrid automata models for describing switching systems have been the focus of many recent research efforts [43], [44]. They can effectively capture the discrete switching nature of the surveillance task. However, their continuous part is modeled in a point-based framework using differential equations. This model fails to effectively model the continuous operation of the individual nodes in the surveillance task and suffers from the drawbacks mentioned previously.

Network Topology: The efficient and timely exchange of information between the individual sensing nodes is imperative for cooperation and coordination of multiple nodes in order to perform pervasive target tracking. Ad-hoc, multi-hop mechanisms are best suited for such mobile infrastructure-less wireless networked systems. Networking and routing mechanisms for distributed wireless sensor networks have attracted a lot of attention in recent literature [45] [46] [47]. These approaches generally assume scalar (low volume) data transfer between the individual nodes and the sink node (which is just a consumer of the information). However, surveillance tasks require fast and timely transfer of vector (large volume) data, e.g. video data, to the sink and also within the nodes themselves. This scenario may create an unbalanced networking load on the nodes along the optimal relay path, which can be detrimental to the longevity of the network.

Operator Interface Design: Video feedback is an essential component of the surveillance system.
A single human operator cannot effectively monitor a large area by looking at dozens of monitors showing raw video output. That amount of sensory overload virtually guarantees that information will be ignored and requires a prohibitive amount of transmission bandwidth. In [39] an approach is presented which provides an interactive, graphical user interface (GUI) showing a synthetic view of the environment, upon which the system displays dynamic agents representing people and vehicles. This approach has the benefit that visualization of scene events is no longer tied to the original resolution and viewpoint of a single video sensor, and the operator can therefore infer proper spatial relationships between multiple objects and scene features. Another program, called the Modular Semi-Automated Forces (ModSAF) program, provides a 2-D graphical interface similar to the VSAM GUI [48], with the ability to insert computer-generated human and vehicle avatars that provide simulated opponents for training [49]. The ModStealth program generates an immersive, realistic 3-D visualization of texture-mapped scene geometry and computer-generated avatars [50].

Although automatic image analysis and video understanding tools [39] can be used to facilitate identification of targets and activation of alarms or logs for certain surveillance tasks, the operator needs the video feedback to make decisions about the tracking task which may not have been pre-programmed or to independently task the network based on the current feedback received from the sensors. The video feedback provided to the operator can be transmitted over either analog or digital channels [51]. The use of analog modulating and transmission techniques for surveillance applications has been reported in [52]. Receiving digital video feedback over an IP-based network from the networked camera sensors has also received much attention with the development of the various video compression and communication protocols [53-57].

Since multiple cameras are deployed to track the identified targets, multiple, concurrent feedback video streams may be required for monitoring the target. The sensors initiating these streams will be changing from time to time as the target moves out of range of the current sensors tracking it. However, providing multiple unnecessary (unrelated to the task) video feedback streams often causes loss of attention span of the operator and makes it hard to keep track of the various activities over the cameras. Hence only video streams from relevant sensors should be presented to the operator on a per-activity basis. This is done through automatic or manual switching of the camera streams that are presented to the operator.

Literature Review Summary: From the above review of recent literature on mobile surveillance networks, the focus of the major works reviewed revolves around target identification and video understanding. However, most of the approaches cited in the literature lack a generalized framework for modeling the surveillance task of maintaining visibility of multiple targets with adequate resolution. These approaches tend to use visual servoing, which involves only feature-point-based target tracking and cannot adequately describe multi-target tracking using a single camera sensor. Multi-target tracking is addressed in a very ad-hoc fashion in the literature using methods such as sensor slaving and gaze control.
A generalized modeling and analysis framework for pervasive target tracking using multiple distributed sensors is not available in the current literature. Although switched surveillance implementations using multiple distributed sensors have received much attention in recent literature, a comparative analysis of video feedback schemes for various visual surveillance scenarios has not been reported. Further, optimized sensor deployment based on minimizing the number of sensor switches in order to pervasively track a target using multiple sensors over a large area has not been explored. This dissertation tries to address these issues and develop a formal modeling, analysis and design framework for mobile surveillance applications.

1.4 Objectives of this Research

The main objective of this research is to implement a pervasive surveillance network using active camera sensors connected to each other using an IP-based network. The sensors have motion capability and are required to continuously maintain visibility of the identified targets in the surveillance region. The objectives of this research effort are to provide a modeling and design framework for such pervasive surveillance scenarios.

This dissertation proposes a mutational-analysis-based topological framework for modeling and design of mobile surveillance networks, which find applications in myriad infrastructure-less, rapidly deployable pervasive multi-target surveillance and monitoring scenarios. Using the concept of mutations of domains and shape analysis, the conditions for accomplishing various surveillance tasks such as continuous target tracking while maintaining appropriate image resolution can be derived. The design of a surveillance task using image-based Hausdorff tracking, used when the target is in the active field of view of the sensor, is presented. The method of cooperative Hausdorff tracking, used to track targets which are outside the active field of view of the sensor using the observations from other sensors, is presented.

For pervasive surveillance scenarios which require a hand-off or switching mechanism between a number of sensors, this dissertation proposes the Mutational Hybrid Automata (MHA) model, which combines the discrete switching structure of hybrid automata with the domain-based evolution of mutational systems in order to overcome their individual limitations. It further presents example surveillance tasks modeled using the proposed MHA model and provides an analysis and proof of stability of the example systems.

A switched video feedback system is an important component of the pervasive surveillance network. This dissertation compares different design alternatives for the implementation of the switched video feedback system, such as MJPEG [57], H.261 [53] and H.263 [54] transported over the RTP transport protocol [56]. Their performance under certain scenarios/tasks is measured and the advantages and disadvantages of different alternatives are analyzed. Various performance metrics are proposed to compare the suitability of these schemes based on a given scenario/task. Further, an optimal sensor selection algorithm for selecting the minimum number of sensors to track a moving target is presented.

1.5 Major Contributions of this Dissertation

The major contributions of this dissertation can be classified into various categories.
This section provides a brief statement about the major contributions of this dissertation.

Theoretical Contributions

- A new domain-based modeling framework called Hausdorff tracking for tracking multiple targets using continuous camera motion is developed.

- A mutational hybrid automata model is developed for modeling multi-sensor pervasive target tracking tasks. Using this model, autonomous as well as controlled sensor switching scenarios can be modeled.

- Stability analysis of the proposed Hausdorff tracking scheme using the shape Lyapunov theorem has been presented.

- Target tracking stability of the pervasive surveillance task modeled using the mutational hybrid automata models is analyzed for example systems using a two-step procedure.

Technological Contributions

- A switched video feedback system that enables streaming of relevant video streams to a human operator for multiple clients has been implemented using an H.264 video encoding scheme. The operator clients developed allow the automated monitoring as well as active control of the networked sensors.

- A task/scenario-based performance metric is proposed to analyze the suitability of application of a particular video transport scheme for that task. Various schemes such as MJPEG, H.261, H.263 and H.264 are compared for different scenarios using the proposed metric.

Design Contributions

- An optimality framework for exploiting the redundancy in task accomplishment using Hausdorff tracking is developed. Physical programming has been used to provide a physical meaning to the combinations of the various cost functions.

- The video switching algorithm is modeled as a discrete-time finite-state dynamic system and an assessment metric is proposed to analyze the locations and configurations of the cameras for switched video feedback in order to minimize the number of switches while maximizing the resolution sustained by the tracked target at the tracking camera.

Contributions for Experimental Work

- A visual surveillance test-bed has been implemented for testing and validating the efficacy of the various approaches proposed.

- A mobile robot and articulated robot arm have been integrated as a part of the motion platform for the visual sensors. These can be used in the future for further extension and experiments of this research.

1.6 Organization of this Dissertation

The scope of this dissertation is to investigate a novel modeling and design framework for mobile surveillance networks. The work is divided into three parts, namely modeling, analysis and design. First, in Chapter 2, we present the necessary background on the mathematical tools which will be required to develop and analyze the formal modeling and design framework proposed.

In Chapter 3, the new modeling framework using mutational equations for motion of a single sensor is presented. Using this mutational equation model, a shape-based Hausdorff tracking control scheme to track multiple targets with viewing constraints is developed. Chapter 3 further presents the Mutational Hybrid Automata model developed for modeling multi-sensor cooperative target tracking scenarios.

Chapter 4 presents the shape Lyapunov theorem for analyzing the stability (in the sense of Lyapunov) of the Hausdorff tracking method proposed. It further presents examples of multi-camera surveillance systems modeled using the mutational hybrid automata model.
A procedure for analyzing the stability of the example multi-sensor cooperative tracking systems modeled using the mutational hybrid automata model is also presented.

Chapter 5 presents the work done toward design of the switched video surveillance system. It presents a brief review on grouping architectures and presents a sensor allocation algorithm which can be used to allocate sensors to targets. A design framework for optimal tracking of multiple targets by exploiting the redundancy of task accomplishment is presented. Further, algorithms for sensor placement in order to minimize sensor switching while tracking moving targets and algorithms for optimized target tracking using switched sensors are also presented.

Experimental validations of the proposed algorithms are presented in Chapter 6. The experimental results for Hausdorff tracking, multi-sensor cooperation using the mutational hybrid automata model and multi-sensor target allocation are presented. Further, results of optimal Hausdorff tracking and multi-target tracking experiments are also presented. Chapter 6 also provides a performance analysis of the various video transmission schemes and proposes a task- and scenario-based metric for evaluating the various video compression and transmission schemes for switched video surveillance tasks. Further, simulation results for optimized target tracking using switched sensors are also presented.

Conclusions and future directions are presented in Chapter 7.

Chapter 2

Mathematical Background

This chapter presents the mathematical preliminaries that will be used in the rest of the dissertation. A definition of a dynamic system is presented. Then mutational equations, which are used to represent domain-based dynamic systems, are introduced. Further, hybrid dynamic systems are introduced and the Mutational Hybrid Automata model is presented. Shape analysis and shape functions are introduced to provide various measures on sets (shapes).

2.1 Introduction

Let X be a collection of variables; then X denotes the valuations of these variables and x will be used to refer to both the variable and its valuation (the reference should be clear from the context). 2^X denotes the power space of X, i.e., the family of all subsets of X, and K(W) denotes the space of all nonempty compact subsets of a given closed set W ⊂ R^n. We will use M to denote a general metric space and X will be used to denote a finite-dimensional vector space.

A controlled dynamical system is a system Σ = [X, Γ, U, φ] where:

- X is an arbitrary topological space called the state space of Σ;

- Γ is the time set, which is a transition semigroup with identity;

- U is a nonempty set called the control-value space of Σ; and

- the map φ : X × Γ × U → X is a continuous function satisfying the identity and semigroup properties [58].

The dynamical system can also be denoted by the system Σ = [X, Γ, U, f], where the transition function f is the generator of the extended transition function φ [58]. A discrete-time dynamic system is a dynamic system Σ for which Γ = Z, where Z is the set of integers. The discrete-time dynamic system Σ is finite dimensional if both X and U are finite dimensional, and the dimension of the system Σ is the dimension of X. A system Σ is complete if every input is admissible for every state.

A hybrid dynamic system is an indexed collection of dynamic systems with some map defined for jumping among them.
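To make the identity and semigroup properties of the extended transition map concrete, the following is a minimal, illustrative Python sketch (not code from the dissertation) of a discrete-time controlled dynamical system with Γ = Z≥0; the one-step generator f and the example dynamics are made up for illustration.

```python
# Sketch of a discrete-time controlled dynamical system Sigma = [X, Gamma, U, phi],
# with the extended transition phi generated by a one-step map f(x, u).
# Identity:  phi(x, 0, u) = x
# Semigroup: phi(x, s + t, u) = phi(phi(x, s, u), t, u) for a constant input u.

def make_system(f):
    """Build the extended transition phi from a one-step generator f."""
    def phi(x, t, u):
        for _ in range(t):          # t ranges over Gamma = {0, 1, 2, ...}
            x = f(x, u)
        return x
    return phi

# Hypothetical example: scalar state with additive control, x_{k+1} = 0.5*x_k + u.
phi = make_system(lambda x, u: 0.5 * x + u)

x0, u = 4.0, 1.0
assert phi(x0, 0, u) == x0                            # identity property
assert phi(x0, 5, u) == phi(phi(x0, 2, u), 3, u)      # semigroup property
```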
Various models for hybrid systems have been proposed in the literature [43] [44]. Generally, these approaches assume that the continuous dynamics of the hybrid system are defined by the solutions of differential equations (ẋ = f(x(t))), where x(t) ∈ X ⊂ R^n and the function f : X → R^n is called a vector field on R^n. The resulting dynamical system is given by φ(x0, t) = x(t), where x(t) is a solution of the differential equation starting at x0 at t = 0.

For the case of the pervasive surveillance problem, the variables to be tracked are not elements of a vector space but geometrical domains themselves, which are elements of the space K(W) of all non-empty compact subsets of a given closed set W ⊂ R^n with distance d. Hence we need to enhance the hybrid systems model(s) to include the cases where the continuous state is not an element of a vector space but a geometric domain K ∈ K(W). In this case we describe the continuous dynamical system as the solution of a mutational equation K̊(t) ∋ φ(t, K(t)), introduced below.

2.2 Mutational Equations

2.2.1 Evolution of a Tube

A tube is a set-valued map associating a domain K(t) ∈ K(E) with each time instant:

K(·) : R+ ⇝ K(E),  t ↦ K(t).    (2.4)

The evolution of a tube can be described using the notion of a time derivative of the tube as the perturbation of a set. Associate with any Lipschitz map φ : E → E a map called the transition ϑφ(h, q) := q(h), which denotes the value at time h of the solution of the differential equation

q̇(τ) = φ(q(τ)),  q(0) = q0.    (2.5)

Extend this concept of a transition to the space K(E) by introducing the reachable set from set K at time h of φ as

ϑφ(h, K) := {ϑφ(h, q0)}_{q0 ∈ K}.    (2.6)

The curve h ↦ ϑφ(h, K) plays the role of the half lines h ↦ x + hv for defining differential quotients in vector spaces. Using the concept of the reachable set, the time derivative of a tube can be defined as a mutation:

Definition 2.2.1 (Mutation) Let E ⊂ R^n and φ : E → E be a Lipschitz map (φ ∈ Lip(E, R^n)). If for t ∈ R+, the tube K : R+ ⇝ K(E) satisfies

lim_{h→0+} d(K(t+h), ϑφ(h, K(t))) / h = 0,    (2.7)

then φ is a mutation of K at time t and is denoted as

K̊(t) ∋ φ.    (2.8)

It should be noted that φ is not a unique representation of the mutational equation, which justifies the use of the notation (∋) [60]. Consider a function φ : R × K(E) → Lip(E, R^n) as a map associating a Lipschitz map φ with a pair (t, K). Using this map, we can define the mutation of tube K(t) as

K̊(t) ∋ φ(t, K(t)),  ∀t ≥ 0.    (2.9)

2.2.2 Controlled Mutational Equations

A controlled mutational equation can be written as

K̊(t) ∋ φ(t, K(t), u(t)),  ∀t ≥ 0,    (2.10)
u(t) ∈ U,    (2.11)

where φ : R+ × K(E) × U → Lip(E, R^n) is a continuous map associating a Lipschitz map with (t, K, u) and u(t) ∈ U is the control input. A feedback law can be defined as a map U : K(E) → U associating a control u with a domain K(t) as

u(t) = U(K(t)).    (2.12)

Using a controlled mutational equation, we can model the motion of the target and coverage sets due to the motion input u to the camera/robot.

2.3 Hybrid Dynamic Systems

Hybrid systems generally refer to dynamical systems which consist of a mixture of discrete and continuous components. Such systems describe the time evolution of systems with both continuous and discrete parts. Hybrid dynamical systems considered in the literature generally have continuous dynamics modeled using differential equations which depend on some discrete phenomena like switching and jumping in the continuous and discrete state of the system respectively. These hybrid phenomena can be modeled using hybrid automata proposed by Lygeros and Sastry [44] [61]. A hybrid automaton describes the interaction and time evolution of both the discrete and continuous parts of a hybrid dynamical system. The hybrid automata framework models the continuous part of a hybrid dynamical system using differential equations. Edge maps and reset maps are used to capture the discrete nature of the system while guard conditions and domains are used to describe the interaction of the discrete and continuous components.

A hybrid automaton (HA) is a collection

H = (Q, X, Σ, Init, D, E, G, R)    (2.13)

with the constituent parts as follows.

- Q is the set of index states or discrete states.

- X is the set of continuous variables.

- Σ = {Σq}_{q∈Q} is a collection of controlled dynamical systems, where each Σq = [Xq, Γq, fq, Uq] is a controlled dynamical system. Xq are the valuations of the continuous state variables Xq, fq are the continuous dynamics and Uq is the set of continuous controls.

- Init ⊆ Q × X is a set of initial states.

- D : Q → 2^X is a domain.
Each discrete state has a domain of possible con- tinuous states associated with it. The domain is the set of valuations of the continuous states where the particular associated discrete state is valid. E C Q x Q is a set of possible edge transitions between the discrete states. G : E H 2X is a guard condition. Each edge in E is mapped to a manifold in X which triggers a mode switch when the continuous state hits this manifold. R : E x X H 2X is a reset map which causes the continuous state to be reset due to the mode switch in the discrete state. The state of the above hybrid dynamical system can be denoted as S = UqEQXq x Q. The above model can be used to model the discrete phenomena mentioned earlier. For more details about this model refer to [44]. 23 2.4 Mutational Hybrid Dynamic Systems Mutational equations can be incorporated into the definition of general dynamical sys- tems as they satisfy the various nontrivialitry, semigroup and identity properties [58]. Using the concept of transitions of domains introduced earlier and mutational equa- tions, we can write a general dynamic system as: E =[IC(W),I‘,2999] or (2.14) s = [K(W), r, ,0) (2.15) where i9.p is the transition function and (,0 is the generator of that transition function. Using the above definition of a dynamic system, we can now define a Mutational Hybrid Automaton (MHA) as: Hm = [Q, M,E,Init,D, E,G, R] (2.16) with the constituent parts as follows. 0 Q is the set of index states or discrete states. 0 M is a collection of the continuous variables K,- 6 IC (W) E = {2%}qu is a collection of controlled dynamic systems, where each Eq 2 [K(W)q, I‘q, (,0q,qu] is a controlled dynamic system. [C(W)q are the continuous state spaces, (0., can be construed as the continuous dynamics and Hg is the set of continuous controls. Init E M x Q is a set of initial states. D : Q H 2M is a domain which associates a domain of operation with every qEQ. 24 n)- o E C_: Q x Q is a set of edges along which discrete transition takes place. 0 G : E H 2M is a collection of guard conditions which enable discrete transitions. 0 R : E x M H 2'M is a reset map which resets the continuous state when a discrete transition is enabled. Thus, S = quQ M x {q} is the hybrid state space of Hm. A MHA can be pictured as an automaton as in Figure 4.2. There, each node is a constituent dynamical system with the index state as the name of the node. This definition is a fairly broad and powerful modeling framework which subsumes continuous differential equations, mutational equations, discrete time difference equa- tions, finite automata as well as switched controlled systems and non-smooth differ- ential/ mutational equations. A more intuitive explanation of the various components will be provided via an example application to pervasive surveillance networks in the following section. 2.5 Shape Analysis Shape analysis deals with problems where the variables are not vectors of parameters or functions, but the shapes of geometric domains or sets K contained in a subset E C R”. The measure of deformation of a set K can be expressed using shape functions as: J : [C(E) H IR. The directional derivative of the shape function rep- resents the change in the shape function J (K) due to the deformation of the shape K. 2.5.] Shape Functions Shape functions are set-defined maps defined from K(E) H IR and provide a “mea- sure” of the deformation of K E [C(E). 
For example, we can use a shape function to check the distance of the set K to another set K̂, or whether the reference set K̂ is contained within the current set K. Shape acceptability and optimality can be studied using shape functions J as optimization cost functions, as is shown in [62]. Various shape functions can be readily developed which can adequately describe the surveillance task of set coverage or a measure of the size of a set, etc.

2.5.2 Directional Derivative of Shape Functions

The directional derivative of the shape function [62] represents the change in the shape function J(K) due to the deformation of the shape K. It can be construed as the analog of the directional derivative in vector spaces and provides us with a measure of the change in the task criterion due to motion of the coverage or target set.

Consider a function J : H ↦ ℝ where H is a real Hilbert space such as ℝⁿ. The Gateaux (directional) derivative of J at the point q in the direction of v is defined as:

DJ(q)(v) = lim_{t→0} [J(q + tv) - J(q)] / t   (2.17)

This can also be looked at in a different way. Suppose q(t) is defined as:

q̇(t) = v,   q(0) = q   (2.18)

This means that if j(t) = J(q(t)) then

dj(t)/dt = j̇(t) = lim_{t→0} [J(q(t)) - J(q(0))] / t   (2.19)
         = lim_{t→0} [J(q + tv) - J(q)] / t = DJ(q)(v)   (2.20)

Thus, the Gateaux (directional) derivative involves the point q and the direction v at t = 0.

Extending this same concept, when q is not an element of a Hilbert space but a shape (domain) K, the direction v is replaced by φ, which is the direction of mutation of the tube K. Thus the Eulerian semi-derivative (Gateaux directional derivative) is defined as:

J∘(K)(φ) = lim_{t→0} [J(ϑ_φ(t, K)) - J(K)] / t   (2.21)

where ϑ_φ(t, K) is the t-reachable tube of K under the mutation φ. From [60] and [63], the directional derivative of a shape function having the form

J(K) = ∫_K f(q) dq   (2.22)

can be written as:

J∘(K)(φ) = ∫_K [∇f(q) · φ(q) + f(q) div φ(q)] dq   (2.23)
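As a quick illustration of definition (2.17) in the familiar vector-space setting, the following minimal Python sketch (added here for illustration only; the function J(q) = ||q||² and the numerical values are arbitrary choices, not taken from this dissertation) compares a finite-difference estimate of the Gateaux derivative with its analytical value 2qᵀv.

import numpy as np

# Finite-difference check of the Gateaux (directional) derivative (2.17)
# for the vector-space case J(q) = ||q||^2, whose derivative is DJ(q)(v) = 2 q.v
def gateaux_fd(J, q, v, t=1e-6):
    return (J(q + t * v) - J(q)) / t

J = lambda q: float(np.dot(q, q))
q = np.array([1.0, 2.0, -0.5])
v = np.array([0.3, -1.0, 2.0])
print(gateaux_fd(J, q, v))    # finite-difference estimate
print(2.0 * np.dot(q, v))     # analytical value, 2 q.v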
2.6 Chapter Summary
This chapter presents a discussion on various dynamic systems and introduces mu-
tational equations as a tool to model the dynamics of shapes. It also presents the
hybrid automata model used to model a combination of discrete and continuous dy-
namic systems. Hybrid automata models use differential equations to model the con-
tinuous system dynamics. This chapter further extends the hybrid automata model to
use mutational equations to model the continuous dynamics. The Mutational Hybrid
Automata (MHA) model proposed allows us to model large scale networked pervasive
surveillance systems. This chapter also introduces shape analysis used to measure the
acceptability of shapes using shape functions.
Chapter 3
Modeling Mobile Surveillance
Networks
This chapter presents the modeling of a tracking task using a multiple camera surveillance network. A mutational analysis based technique christened "Hausdorff tracking" (in honor of the German mathematician Felix Hausdorff, who made significant contributions to set theory, descriptive set theory, measure theory, function theory, and functional analysis) is introduced in order to track multiple targets simultaneously using a single camera sensor. Further, the Mutational Hybrid Automata
(MHA) model is introduced for tracking a target as it moves across the viewing region
of multiple cameras.
3.1 Introduction - Example Surveillance Scenario
In a multiple camera surveillance scenario with active (moving) cameras, the target,
which moves in a continuous fashion, can be tracked using either continuous motion
of a camera sensor or by discontinuous jumps of the “current” tracking sensor from
one camera to the next or a combination of continuous motion and a discontinuous
jump. Consider the example surveillance scenario from Figure 3.1(a), which depicts
the image planes of three camera sensors and a couple of targets.

Figure 3.1: Three-camera pervasive surveillance scenario, showing the coverage sets, the (disjoint) target sets, and the optical axes.
The targets can be tracked using the continuous motion of Camera 1 as is shown
in Figure 3.2(a). We see that Camera 1 is moved in a continuous fashion to ensure
that the targets are within the active field of view of the sensor. In Figure 3.2(b),
the tracking task is transferred to Camera 2 which can already view the targets in
its active field of view. Finally, as shown in Figure 3.2(c), the tracking task can
be transferred to camera 3 which will first need to acquire the targets by executing
continuous motion and then a discrete switch will be executed to transfer the tracking
task to camera 3.
This chapter first presents the modeling of continuous motion of the camera sen-
sor and target (continuous dynamics) using mutational equations. These dynamics
models can be used for developing a servo mechanism to control the continuous camera motion.

Figure 3.2: Modeling pervasive surveillance networks: (a) coverage using continuous motion; (b) coverage using only discrete motion; (c) coverage using hybrid motion.
era motion. The continuous input to the camera for motion can be controlled using
an error feedback from either the image space analysis or the world space analysis.
Image space analysis implies that the error feedback for moving the camera is derived
from analysis of the target location in the image plane of the camera, while world
space analysis implies that the error input for moving the camera is derived from the
world space coordinates of the target (acquired from other sensors that can observe
the target) and the current world space location of the camera coverage set. Hence,
we develop models for image space servoing and world space servoing separately.
The mutational equation model for the target motion in the image space is derived
using the perspective projection model and the image jacobian [64]. The motion of
the camera coverage set in the world space model is derived separately for 2-D and
3—D scenarios. In the 2-D scenario, the camera coverage set is taken to be a pie cut
out from a circle, while in the 3—D scenario, the camera coverage is modeled as a
3-D pie (cone bounded by a sphere) cut out from a sphere. The motion of these
coverage sets due to the motion of the camera is modeled as a controlled mutational
equation. Further, we relax the implicit assumption of holonomic motion constraints
and extend the continuous motion modeling using mutational equations for a sensor
with non-holonomic constraints.
The Hausdorff tracking method for image space and world space models is intro-
duced. This method uses shape functions to derive feedback errors which are used to
control the motion input to the camera. Image space Hausdorff tracking and coop-
erative Hausdorff tracking are presented as two examples of the Hausdorff tracking
method in the image space coordinates and world space coordinates, respectively.
This chapter also introduces the Mutational Hybrid Automata (MHA) model for
modeling the hybrid camera motion scenario shown in Figure 3.2(c). Using mutational
equations, the mutational hybrid automata model can describe the continuous motion
of the cameras and target and the discrete switching behaviors when a target leaves
or enters the field of view of a camera.
Using the Hausdorff tracking method developed in this section, there are numerous
choices of the input applied to the camera which can accomplish the task. This
is because the robot system has a certain amount of redundancy built in for the
surveillance task. An example of this redundancy in task execution can be evinced
by noting that in order to bring a target, which is currently located at one of the sides of the camera FOV, into the camera's FOV, the camera can execute an angular motion by panning or a lateral motion by moving backward. Both these motions individually
or a combination of them can be used to accomplish the surveillance task. Hence we
need to select an appropriate input to the camera for task accomplishment.
We can select various other tasks that the camera can perform in order to utilize
the redundancy offered by the camera motion. For example the energy consumed
due to the motion of the sensor should also be minimized while maintaining a safe
distance from sensed obstacles (in the case of sensors mounted on mobile robots).
These combined tasks can be expressed as a multi-objective optimization problem
which tries to minimize the resolution inadequacies and energy consumption subject
to the constraint of maintaining the visibility of the target.
Various other tasks can also be considered for the multi-objective optimization
problem and the complete problem can be represented as an optimal control frame—
work for Hausdorff tracking. Once the task criteria are expressed in a multi-objective constrained optimization framework, various methods such as the weighted sum method, physical programming, non-linear programming, search space methods, etc., can be used to find the solution to the optimization problem.
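The following minimal Python sketch (an addition for illustration; the cost terms, the visibility test and all numerical values are hypothetical placeholders, not the dissertation's actual task functions) illustrates the weighted-sum scalarization idea: candidate camera inputs that satisfy a visibility constraint are scored by a weighted combination of a resolution-inadequacy term and an energy term, and the cheapest feasible input is selected.

import numpy as np

# Hypothetical weighted-sum scalarization of the multi-objective tracking problem.
def resolution_cost(u, target):          # penalize residual tracking error (placeholder)
    return float(np.sum((target - u[:2]) ** 2))

def energy_cost(u):                      # penalize actuation effort (placeholder)
    return float(np.dot(u, u))

def weighted_cost(u, target, w=(1.0, 0.1)):
    return w[0] * resolution_cost(u, target) + w[1] * energy_cost(u)

def visible(u, target):                  # placeholder visibility constraint
    return np.linalg.norm(target - u[:2]) < 5.0

target = np.array([1.0, -0.5])
candidates = [np.array([x, y, 0.0])
              for x in np.linspace(-2.0, 2.0, 9)
              for y in np.linspace(-2.0, 2.0, 9)]
feasible = [u for u in candidates if visible(u, target)]
u_star = min(feasible, key=lambda u: weighted_cost(u, target))
print(u_star)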
3.2 Controlled Mutational Equation Model for Camera Sen-
sor and Target
A controlled mutational equation can be written as:

K̊(t) ∋ φ(t, K(t), u(t)),   (3.1)
u(t) ∈ U,   (3.2)

where φ : ℝ⁺ × 𝒦(E) × U ↦ Lip(E, ℝⁿ) is a continuous map associating a Lipschitz map with (t, K, u) and u(t) ∈ U is the control input.

Using a controlled mutational equation, we can model the motion of the target and coverage sets due to the motion input u to the camera/robot.
3.2.1 Perspective Projection Model
The deformation of the target set w.r.t. the motion of the camera can be represented
using a mutational equation. The mutational equation can be derived using the optic
flow modeling as shown in [41]. Let u be a velocity screw which represents the velocity
input to the camera and let q be a vector of image space feature parameters. The
image jacobian Jv is a linear transformation that relates the camera velocity screw u
to the tangent vector q̇ of the image space feature parameter vector q as:

q̇ = J_v(c) u   (3.3)

where c represents the camera pose such that ċ = u.
Assuming that the projective geometry of the camera is modeled by the perspective projection model, a point P = [x, y, z]ᵀ, whose coordinates are expressed with respect to the camera coordinate frame, will project onto the image plane with coordinates q = [q_x, q_y]ᵀ as:

q_x = λ x / z,   q_y = λ y / z   (3.4)

where λ is the focal length of the camera lens [41]. Using the perspective projection model of the camera, the velocity of a point in the image frame with respect to the motion of the camera frame [41] can be expressed. This is called the image jacobian by [41] and is expressed as:

q̇ = φ_c(q) = B_c(q) u_c
  = [ -λ/z   0   q_x/z   q_x q_y/λ   -(λ² + q_x²)/λ   q_y   q_x/λ
      0   -λ/z   q_y/z   (λ² + q_y²)/λ   -q_x q_y/λ   -q_x   q_y/λ ] u_c   (3.5)

where q = (q_x, q_y)ᵀ are the image space coordinates of the point P whose task space coordinates are (x, y, z)ᵀ, λ is the focal length of the lens being used, u = [v_x, v_y, v_z, ω_x, ω_y, ω_z]ᵀ is the velocity screw of the camera motion, λ̇ is the rate of change of the focal length, and u_c = [uᵀ, λ̇]ᵀ.

The mutational equation of the target set can now be written as:

q̇ = φ(q) = φ_c(q) + φ_t(q)
K̊ ∋ φ(K)   (3.6)
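The sketch below (Python; added here as an illustration, not part of the original text) assembles the interaction matrix of equation (3.5) for a single point feature. The column ordering [v_x, v_y, v_z, ω_x, ω_y, ω_z, λ̇] and the numerical values are assumptions made for this example.

import numpy as np

# Image Jacobian (interaction matrix) of a point feature under perspective
# projection, following the form of equation (3.5). Column order assumed to be
# [vx, vy, vz, wx, wy, wz, lambda_dot].
def image_jacobian(qx, qy, z, lam):
    return np.array([
        [-lam / z, 0.0, qx / z,  qx * qy / lam, -(lam**2 + qx**2) / lam,  qy, qx / lam],
        [0.0, -lam / z, qy / z, (lam**2 + qy**2) / lam, -qx * qy / lam, -qx, qy / lam],
    ])

u_c = np.array([0.0, 0.0, 50.0, 0.0, 0.01, 0.0, 0.0])   # forward motion plus a slow pan
q_dot = image_jacobian(qx=10.0, qy=-5.0, z=2000.0, lam=8.0) @ u_c
print(q_dot)    # predicted image-plane velocity of the feature point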
3.2.2 World Space Model
We can derive the mutational equation for the sensor coverage set and target set
in the world space. Here we consider two cases for a 2-D model and 3-D model
separately and present the mutational model for the 2-D and 3-D cases based on the
dimensionality and geometry.
2-Dimensional World Space Model

Consider a point q ∈ ℝ² defined as:

q_x = x + r cos(ρ)
q_z = z + r sin(ρ)   (3.7)

where [x, z, θ]ᵀ are the coordinates of the robot and (r, ρ) are the polar coordinates of the point q with respect to the robot coordinates. Using equation (3.7), the coverage set for the focusing constraint can be written as:

K = {q | r ∈ (D_min, D_max), ρ ∈ (α_min, α_max)}   (3.8)

where D_min, D_max are the minimum and maximum focus distances and α_min, α_max are the minimum and maximum angles of the view lines as shown in Figure 3.5. The motion of each point in the coverage set can be written as:

[q̇_x; q̇_z] = [ 1   0   -r sin(ρ)
               0   1    r cos(ρ) ] u
q̇ = φ(q) = B u   (3.9)

where u = [ẋ, ż, θ̇]ᵀ is the velocity input to the camera. Using equation (3.9), the mutational equations for the motion of the coverage set can be derived.
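As a small illustration of equations (3.8)-(3.9) (a Python sketch added here; the limits and velocity values are arbitrary choices), the code below samples the planar "pie slice" coverage set and propagates one of its points under a camera velocity input.

import numpy as np

# Sample the 2-D coverage set of equation (3.8) and move one of its points
# with q_dot = B(q) u from equation (3.9). All numbers are illustrative.
D_MIN, D_MAX = 500.0, 3000.0          # focus-distance limits
A_MIN, A_MAX = -0.4, 0.4              # view-line angle limits (rad)

r_samples = np.linspace(D_MIN, D_MAX, 20)
rho_samples = np.linspace(A_MIN, A_MAX, 15)

def point_velocity(r, rho, u):
    B = np.array([[1.0, 0.0, -r * np.sin(rho)],
                  [0.0, 1.0,  r * np.cos(rho)]])
    return B @ u

u = np.array([10.0, 0.0, 0.05])       # [x_dot, z_dot, theta_dot]
print(point_velocity(r_samples[-1], rho_samples[0], u))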
3 Dimensional World Space Model
The coverage set can be specified as the set of points which satisfies the viewing
constraints of the sensor and can be defined as the active FOV of the camera after
incorporating the depth of field constraints such as the maximum and minimum focus
distances. Consider a point q ∈ ℝ³ in the sensor coordinate frame defined as:

q_x = r cos(θ) sin(φ)
q_y = r cos(φ)
q_z = r sin(θ) sin(φ)   (3.10)

where (r, θ, φ) are the polar coordinates of the point q in the sensor frame. Using the above equation, the coverage set can be defined as:

K = {q | r ∈ (D_min, D_max), θ ∈ (α_min, α_max), φ ∈ (β_min, β_max)}   (3.11)
where (D_min, D_max) represent the depth of field and (α_min, α_max), (β_min, β_max) are the view angles for the sensor. The global coordinates of the target set K̂ can be transformed to the local sensor coordinates using the localization information available.
The mutational equation for the motion of the coverage set can be written as:

[q̇_x; q̇_y; q̇_z] = r [ cos(θ)cos(φ)   -sin(θ)sin(φ)
                        -sin(φ)          0
                        sin(θ)cos(φ)    cos(θ)sin(φ) ] u
q̇ = φ(q) = B u   (3.12)

where u = [ω_x, ω_y]ᵀ is the velocity input to the camera. The objective is to cover the set of targets K̂ with the coverage set K such that K̂ ⊂ K.
3.3 Incorporating Non-holonomic Sensor Constraints
In Sections 3.2.1 and 3.2.2 the mutational model relating the motion of the sensor u to the motion of the target and coverage sets was introduced. However, the implicit assumption made was that the individual components of the vector u are independent of each other, i.e., the system has a holonomic structure. Consider the example of a car moving in a horizontal plane. Select the reference point to be the midpoint of the two rear wheels and the reference direction to be the main axis of the car moving forward. This car's configuration can be defined as a vector (state) (x, y, θ) ∈ ℝ³. However, these parameters and their derivatives are related by the following relation:

-ẋ sin θ + ẏ cos θ = 0   (3.13)

where ẋ and ẏ are the derivatives of x and y respectively. At any configuration (x, y, θ) along a trajectory parameterized by time, the triplet (ẋ, ẏ, θ̇) is the velocity of the robot (car). The set of all vectors (ẋ, ẏ, θ̇) passing through the point (x, y, θ) is a vector space. If the robot were not constrained by the constraint in equation (3.13), this vector space would be the tangent space of the configuration space at (x, y, θ) and would have dimension 3. However, due to the kinematic constraint in equation (3.13), the velocity vectors only span a subspace of the tangent space of dimension
2. This is called a nonholonomic constraint and the system dynamics of the car are
said to be nonholonomic.
Various sensor motion platforms have a nonholonomic structure with examples like
a car which can be steered left and right only when the car has a velocity component
in the forward direction. Hence it is required to model such nonholonomic platforms
for an accurate description of system dynamics. Another example of a nonholonomic
platform would be a sensor mounted on a unicycle-type robotic motion module with a differential drive; such modules have been quite popular recently since they require only two motors for forward motion and steering.
Further, due to the large amount of energy consumed by the motion module,
multiple sensors with varying sensing modalities may be mounted on a single motion
module. An example is the mobile manipulator depicted in Figure 6.3, which has a
PTZ camera mounted on the mobile base and a bullet camera mounted on the robot
arm.
This section presents an example of a unicycle-type robot with nonholonomic motion constraints, which can be modeled as:

ẋ = v cos θ
ẏ = v sin θ
θ̇ = ω   (3.14)

where v and ω are the linear and angular velocity control inputs of the robot. The relationship between u and u_h = (v, ω)ᵀ can be written as:

u = B_h u_h = [ cos θ   0
                sin θ   0
                0       1 ] [ v ; ω ]   (3.15)

Thus, the controlled mutational equation for the camera coverage model in equation 3.6 can be written as:

φ(q) = B(q) u = B(q) B_h u_h.   (3.16)
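The following short Python sketch (added for illustration; the planar coverage matrix B is taken from equation (3.9) and all numbers are arbitrary) shows how the unicycle inputs u_h = (v, ω)ᵀ are lifted through B_h of equation (3.15) before driving the coverage-set mutation, as in equation (3.16).

import numpy as np

# Nonholonomic input mapping of equations (3.15)-(3.16): the unicycle inputs
# u_h = [v, w] are lifted to u = B_h u_h and then pushed through B(q).
def B_h(theta):
    return np.array([[np.cos(theta), 0.0],
                     [np.sin(theta), 0.0],
                     [0.0,           1.0]])

def B(r, rho):                         # planar coverage-set matrix of equation (3.9)
    return np.array([[1.0, 0.0, -r * np.sin(rho)],
                     [0.0, 1.0,  r * np.cos(rho)]])

theta = 0.3                            # robot heading
u_h = np.array([100.0, 0.02])          # forward speed (mm/s) and turn rate (rad/s)
q_dot = B(1500.0, 0.1) @ B_h(theta) @ u_h
print(q_dot)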
3.4 Hausdorff Tracking
The task of a surveillance camera is to continuously keep a moving target or multiple
moving targets in its active field of view. The surveillance task of maintaining multi-
ple targets in the active field of view (FOV) of the sensor with an adequate resolution
can be readily expressed in a set-based framework where the variables taken into con-
sideration are not vectors of parameters but shapes (domains) themselves. However,
due to lack of vectorial structure on the space, classical differential calculus cannot be
used to describe the evolution of such domains. However, mutational equations [60]
can be used to describe the dynamics (change in shape) of the sensor FOV and target
domains and further be used to calculate feedback mechanisms to accomplish the
surveillance task.
The surveillance task can be expressed, using shape functions [62], as the min—
imization of a Hausdorff distance-based metric or the size of the target, etc. The
shape function essentially represents the error between the desired and actual shapes
and reducing it to zero will accomplish the task. This section presents the method of
Hausdorff tracking using mutational equations for performing the surveillance task.
Shape or a geometric domain can be defined as the set K ∈ 𝒦(E), E ⊂ ℝⁿ, where 𝒦(E) represents the space of all nonempty, compact subsets of E. The target and the camera coverage can be readily expressed as shapes. Mutational equations can then be used to express the change (deformation) in the coverage and target sets based on the motion of the sensor. Shape analysis [62] can be used to address problems involving geometric domains or shapes. Shape functions, which are set-defined maps J(K) : 𝒦(E) ↦ ℝ, can be used to provide a "measure" of acceptability and optimality of the shape K. For example, we can use a shape function to see if a reference set K̂ is contained within a current set K. In order to accomplish the task defined using shape functions, we need to derive a feedback map 𝒰 : 𝒦(E) ↦ U, where u = 𝒰(K(t)) is the input to the sensor, which will reduce the shape function
to zero. The convergence of the shape function can be analyzed using the shape
Lyapunov theorem [65]. The convergence to zero of the task function would imply
task accomplishment.
The task requirements for Hausdorff tracking can be expressed as a positive semi-
definite shape function which provides a measure of acceptability of a particular shape
(set). For the example shown in Figure 3.2(a), the measure of acceptability of the target image (sets) is whether they are within a certain region of the image space, and the measure can be expressed as a shape function J(K̂) ∈ ℝ⁺. The task of Hausdorff
tracking can be defined as finding an input in order to move the camera in order to
reduce the shape function to zero. Hence we need to express the evolution (change)
in the shape of the target w.r.t. the movement of the camera. This is accomplished
using mutational equations, which are the analog of differential equations for shape
spaces. Using the mutational equation, we should be able to calculate the directional
derivative of the shape function w.r.t. the motion of the camera, and using an inverse
dynamics procedure, calculate a motion input to the camera in order to reduce the
shape function (error) to zero.
For the example in Figure 3.1(a), the task requirements for Hausdorff tracking can be mathematically expressed using a positive definite shape function J(K̂) = ∫_K̂ d²_K(p) dp, where K is the coverage set for the camera and d_K(p) = inf_{q∈K} ||q - p|| is the point-to-set distance from point p to set K. Note that the shape function is zero only when the set K̂ is completely covered by the set K; otherwise it is a non-zero positive value. Initially, in Figure 3.1(a), the target set K̂ is not covered by the camera coverage set K and hence the shape function J(K̂) > 0. The Hausdorff tracking task is to reduce the shape function to zero by moving the camera as shown in Figure 3.2(a), i.e., to minimize the shape function.
3.4.1 Image-Based Hausdorff Tracking
When the target is located in the active FOV of the sensor, the method of image-
based Hausdorff tracking can be used to accomplish the surveillance task. The target
is detected using image processing algorithms and represented as a collection of pixels
(blob) on the image plane. The sensor coverage set is represented as a rectangle cen-
tered at the image center and one of the tasks is to maintain the target blob within
the sensor coverage set. Surveillance tasks such as set coverage and maintaining appropriate image resolution can be readily expressed in a set-based mutational analysis
framework using a shape function as the minimization of a Hausdorff distance-based
metric or the size of the target, etc. The task can be accomplished by reducing the
shape function to zero. The task of visual servoing can also be expressed in a set-
based framework [66]. The shape function essentially represents the error between
the desired and actual shapes.
Target and Coverage Sets for Image-Based Hausdorff Tracking
Figure 3.3 depicts the coverage and target sets. The coverage set K is the set of points
on the image plane (X, Y), within which we require the targets to be maintained.
Generally it is taken as a rectangle centered in the image plane. The target set K̂ is defined as the union of the individual image plane blobs (sets of pixels) formed by the multiple (n) targets as: K̂ = ⋃_{i=1}^{n} K̂_i.
Shape Functions for Image-Based Hausdorff Tracking
The surveillance task of maintaining visibility of the target(s) with a certain resolution
can be represented using a shape function as:
Figure 3.3: Target and coverage set for image-based Hausdorff tracking.
J(K̂) = J_FOV(K̂) + J_Amin(K̂) + J_Amax(K̂)   (3.17)

J_FOV(K̂) = ∫_K̂ d²_K(q) dq
J_Amin(K̂) = max(AREA_MIN - ∫_K̂ dq, 0)
J_Amax(K̂) = max(∫_K̂ dq - AREA_MAX, 0)

where q is a point on the target image set K̂, and AREA_MAX and AREA_MIN denote the maximum and minimum admissible areas of the target set K̂ for maintaining adequate resolution.

Note that the shape function J(K̂) is zero only when the set K̂ is completely covered by the set K and when the area of the set K̂ is within the limits (AREA_MIN, AREA_MAX); otherwise it is a non-zero positive value.
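A discrete sketch of equation (3.17) on a quantized pixel grid is given below (Python; the use of SciPy's distance transform, the grid size and the area limits are choices made for this illustration and are not prescribed by the text).

import numpy as np
from scipy.ndimage import distance_transform_edt

# Discrete evaluation of the shape function (3.17) on a quantized image grid.
# K is the coverage rectangle, K_hat the union of target blobs.
H, W = 96, 128
K = np.zeros((H, W), dtype=bool)
K[20:76, 30:98] = True                       # coverage rectangle (illustrative)

K_hat = np.zeros((H, W), dtype=bool)
K_hat[10:25, 10:30] = True                   # a target blob partly outside K

AREA_MIN, AREA_MAX = 100, 2000               # admissible target areas (illustrative)

d_K = distance_transform_edt(~K)             # distance of every pixel to the set K
area = int(K_hat.sum())

J_FOV = float((d_K[K_hat] ** 2).sum())
J_Amin = max(AREA_MIN - area, 0)
J_Amax = max(area - AREA_MAX, 0)
print(J_FOV + J_Amin + J_Amax)               # zero only when K_hat lies inside K
                                             # with an admissible area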
Shape Directional Derivative for Image-Based Hausdorff Tracking
In order to derive the feedback u to the camera, we need to find the shape directional
derivative J∘(K̂)(φ) of J(K̂) in the direction of the mutation φ(K̂) using the shape function and the mutational equation. Assuming a relatively flat object, i.e., that the z coordinates of all the points on the target are approximately the same, we can derive an expression for J∘(K̂)(φ) by substituting equations 6.1 and 3.6 into 2.23 as:
J∘(K̂)(φ) ≤ 2 ∫_K̂ inf_{q∈Π_K(p)} B(q)ᵀ(q - p) dp · u = C(K̂) · u   (3.35)

Let C(K̂)# be the pseudo-inverse of C(K̂). A choice of u can be taken as:

u = -(1/2) α C(K̂)# J(K̂),   (3.36)

where α is the gain of the control. The gain distribution of the task among the various control channels of u is controlled by assigning the null space vector of the matrix C(K̂).
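The control law (3.36) can be sketched as follows (Python; the entries of C(K̂) and the shape-function value are hypothetical numbers used only to show the pseudo-inverse computation and the null-space redistribution mentioned above).

import numpy as np

# Feedback law u = -(1/2) * alpha * C(K)^# * J(K) of equation (3.36),
# with hypothetical values for C(K) and J(K).
alpha = 2.0
C_K = np.array([[0.5, -1.2, 0.0, 3.0]])      # 1 x m aggregated derivative matrix
J_K = np.array([4.7])                        # current shape-function value

u = -0.5 * alpha * np.linalg.pinv(C_K) @ J_K
print(u)                                     # control distributed over m channels

# The null space of C_K can be used to redistribute gains among the channels
# without affecting the decrease of the shape function.
P_null = np.eye(C_K.shape[1]) - np.linalg.pinv(C_K) @ C_K
u_alt = u + P_null @ np.array([0.1, 0.0, 0.0, 0.0])
print(u_alt)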
3.5 Mutational Hybrid Automata Model for Pervasive Surveil-
lance Networks
Consider a network of N cameras which comprise the pervasive surveillance system. With each camera cam_n we associate a coverage set K_n ∈ 𝒦(W_n), which signifies the region cam_n can view actively, and W_n represents the total capable FOV of cam_n. A target to be tracked is represented by a target set K̂ ∈ 𝒦(W), W = ⋃_{n∈N} W_n. An example of the graphical MHA model for two cameras tracking a target is shown in Figure 4.2.
Groups of cameras are sequentially involved in tracking a target as it traverses
the monitored region. Within each group, the behavior of each camera is governed
by target visibility from a particular camera. This combination of continuous track-
ing and switching behavior can be modeled using the Mutational Hybrid Automata
(MHA) model presented in section 2.4. This section discusses the various parts of the
MHA model for the pervasive surveillance network H_PSN and further elaborates on
the design of stable controllers for a pervasive surveillance system modeled using the
MHA model.
3.5.1 Mutational Hybrid Automata Model
It is relatively straightforward to write the MHA model for a pervasive surveillance system as:

H_PSN = (Q, M, Σ, Init, D, E, G, R)   (3.37)

where Q represents the discrete variables, M represents the continuous variables, Σ = {Σ_q}_{q∈Q} is a collection of the continuous mutational dynamics, Init ⊆ Q × M is
the set of initial values of the variables in Q and M. D is a domain which associates
a domain of operation of the continuous states for every discrete state q ∈ Q, E is
an edge set along which discrete transition takes place, G is a set of guard conditions
which enable the discrete transitions and R is a reset map which resets the continuous
states when a discrete transition is enabled.
Discrete Variables
In the above multi-camera surveillance scenario the target can be tracked by a single camera or a group of cameras at any given time. Further, if the target is in the FOV of a particular camera, that camera can visually track the target, i.e., perform image-based tracking, while if the target is not visible, the camera can cooperatively track the target based on the location of another camera which can view the target. Hence one discrete (binary) variable b_n is associated with each camera to indicate whether the target is visible to that camera, such that b_n = 1 if the target is visible in cam_n and b_n = 0 otherwise. Another discrete variable c_n indicates whether a camera (cam_n) is involved in the particular tracking scenario. Thus the discrete variables collection can be defined as: Q = {{b_n, c_n}_{n∈N}}.
Edge Set
The edge set E defines which particular transitions are possible. Here we describe
general rules to develop E. A handoff between cameras can be initiated when the
target is visible to both cameras. An example edge set for a two—camera scenario is
developed in section 4.3.
Domains
The domain of operation D_q for each mode can be developed based on the location of the target set K̂, which decides which cameras are capable of viewing the target, and, based on the continuity of the target motion, which cameras can acquire the target as it moves through the monitored region W.
Continuous Variables
The collection of continuous variables M consists of the individual camera coverage sets K_n and the target set K̂; i.e., M = {{K_n}_{n∈N}, K̂}. Let K ∈ M represent an element of the valuations of M.
Assumption: We can safely make the assumption that the target set K̂ is convex. This assumption does not have a significant effect on the model, as all non-convex targets can be modeled as their convex cover. This assumption will be invoked to aid the proof of stability later in this section. We further make the assumption that the coverage sets K_n are also convex. This assumption can be substantiated by the fact that the active fields of view of most cameras are convex.
Continuous Dynamics
The continuous dynamics represent the change (deformation) of the sets in M, which can be modeled using mutational equations as K̊_n ∋ φ_n(K_n, u_n), where u_n ∈ U is the control input to cam_n. Based on the mode of operation of cam_n, the controller model can be changed, giving rise to different continuous dynamics for each camera in different modes of operation q. The motion of the target can also be represented as a mutational equation, K̂̊ ∋ φ_t(K̂), which can be estimated from the observations of the target.
Guard Condition
A guard condition or switching condition G associated with each edge e E E is
designed based on the location of the target set. Various events can be recognized
and processed by the multiple camera surveillance network such as the visibility of a
target in a particular target or the global location of the target or camera coverage
set reaching some predefined value.
For example, when the target moves into the visible region of camn then the bn
associated with it changes, which effects the transition along an edge e to a mode
where bn = 1.
Further, based on the current location of the target (sensed and communicated
from other cameras), the variable an associated with camera camn will change in
order to involve the camera in the tracking task.
Reset Map
Since we are dealing with continuous positions of the targets and camera coverage sets, there are no jumps effected in the continuous variables. Hence the reset map R can be defined as R : ((p, q), K) ↦ K, K ∈ M. However, in the case of multiple groups of cameras being involved in the surveillance scenario, the coverage sets, which are defined as the union of the individual coverage sets, will jump from one group to another based on the change in the discrete variable c. For example, when switching the camera group from state q to p, the reset map is given as R((p, q), K_i) ↦ K_j, where K_i = ⋃_{l∈p} K_l and K_j = ⋃_{l∈q} K_l.
Init Set
The target can be recognized only if it is in the view of at least one camera. Hence the set of initial conditions is restricted to the set of discrete modes Init = {(q, M) : ∨_{n∈N} b_n = 1}, where b_n is the target visibility condition for cam_n.
3.5.2 Controller Design for Individual Cameras
Events can be recognized by the controller which can be used to generate the control
inputs provided to the various cameras and trigger mode switches. Such events include
the recognition of the target by a particular camera and the location of the coverage
set of camn hitting certain boundaries within its domain of operation Wn. Also,
within each mode q we are provided with the number of cameras actively involved in
the surveillance, c, and the cameras which can view the target, b_n.
Camera Active, Target Visible (c_n = 1, b_n = 1)

When the target is visible in the image plane of cam_n (b_n = 1), the method of image-based Hausdorff tracking [67] can be used to track the target. In this controller, the control input is derived based on the shape function ⁱJ_n(ⁱK_n, ⁱK̂) : 𝒦(W_n) × 𝒦(W) ↦ ℝ⁺, where the superscript i represents the projection of a set on the image plane ⁱW_n of cam_n. The control input u_n can be calculated as detailed in [68].
Camera Active, Target Not Visible (c_n = 1, b_n = 0)

When the target is not visible to cam_n but is visible to cam_m (b_m = 1), we can use the method of cooperative Hausdorff tracking proposed in [67]. If the target K̂ ⊄ W_n, then cam_n should track the projection Π_{W_n}(K̂) of the target on the domain W_n. This strategy will ensure that as soon as the target moves into the region W_n it will be encompassed in the visible region of cam_n.
However, in a real tracking scenario the projection Π_{W_n}(K̂) may not be easily available. But taking into account the fact that b_m = 1, which implies that K̂ ⊂ K_m, we can deduce that Π_{W_n}(K̂) ⊂ Π_{W_n}(K_m). Thus, tracking Π_{W_n}(K_m) will also ensure tracking the set Π_{W_n}(K̂) (the proof of this proposition is not presented due to space limitations but can be done in a straightforward manner).
We can derive a control input u_n using the cooperative Hausdorff tracking technique in order to track the set Π_{W_n}(K_m) by minimizing the shape function ᶜJ_n(K_n, Π_{W_n}(K_m)) : 𝒦(W_n) × 𝒦(W_n) ↦ ℝ⁺.
Camera Inactive (c_n = 0)

When the distance of the target set to the coverage set of cam_n is large, i.e., the target is very far away from cam_n, the variable c_n is set to zero and cam_n enters the inactive mode. This mode prevents all the cameras from continuously servoing based on small motions of a target which is rather far away, and is used to conserve the energy of the camera. In this mode, the input to the camera cam_n is set to 0 and the camera does not move. The target dynamics are still represented using the mutational equation of the target set.
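The per-camera behaviour selection described in this section can be summarized by the following Python sketch (the camera object, its predicates and its controller methods are hypothetical placeholders introduced only for illustration, not the dissertation's implementation).

# Per-camera controller selection based on the discrete variables (c_n, b_n).
# cam_n, its predicates and controller methods are hypothetical placeholders.
def camera_input(cam_n, target, neighbours):
    if not cam_n.active:                         # c_n = 0: inactive mode
        return cam_n.zero_input()
    if cam_n.sees(target):                       # c_n = 1, b_n = 1: image-based tracking
        return cam_n.image_based_hausdorff(target)
    for cam_m in neighbours:                     # c_n = 1, b_n = 0: cooperative tracking
        if cam_m.sees(target):
            return cam_n.cooperative_hausdorff(cam_m.coverage_set())
    return cam_n.zero_input()                    # nobody sees the target yet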
3.6 Chapter Summary
In this chapter, we have introduced the use of controlled mutational equations to
model the motion of the sensor coverage and target sets. Using the mutational equa-
tion model, we have introduced the method of Hausdorff tracking which can be used
to control the motion of the sensors (cameras) to maintain target visibility for moving
targets. Specific examples of the Hausdorff tracking in image space and world space
have also been presented.
Cooperation among various sensors is an important characteristic for pervasive
surveillance. We have introduced the mutational hybrid automata (MHA) model for
modeling motion of multiple sensors tracking a target. The implementation of this
model for a pervasive surveillance scenario can be done in a distributed fashion as the
various events occurring, such as target visibility, etc., can be detected and processed
in a distributed fashion.
Chapter 4
Analysis of Surveillance Networks
This chapter presents the stability analysis of the proposed model and controllers.
The shape Lyapunov theorem is presented in section 4.1, and can be used to analyze
the asymptotic behavior of tubes. The stability analysis of the proposed Hausdorff
tracking method using the shape Lyapunov theorem is presented next, followed by
the analysis of an example two-camera system modeled using the mutational hybrid
automata model proposed earlier.
4.1 Lyapunov Theorem for Shape Space
The asymptotic behavior of the measure J (K (t)) of the deformation of the set K can
be studied using the shape Lyapunov theorem. The deformation is described as the
reachable tube K (t) and is the solution to the mutational equation in equation 3.1.
We usually look for assumptions that guarantee the convergence to 0 of J(K(t)).

Definition 4.1.1 (Shape Lyapunov Function) Consider a subset E of ℝⁿ and a mutational map φ defined on the set E as shown in section 1, a shape functional J : 𝒦(E) ↦ ℝ⁺ and a continuous map f : ℝ ↦ ℝ. The functional J is an f-Lyapunov function for the mutational map φ if, for any K ∈ Dom(J), we have

∀ t ≥ 0,   J(ϑ_φ(t, K)) ≤ w(t),   w(0) = J(K)   (4.1)

where w(·) is a solution of w′ = -f(w).
When the solution w(·) converges to 0 as t → ∞, the shape Lyapunov functional J(ϑ_φ(t, K)) also converges to 0.
Using shape derivatives, the above definition of Lyapunov functions can be stated as the shape Lyapunov theorem.
Theorem 4.1.1 Consider a subset E of ℝⁿ and a mutational map φ defined on the set E as shown in section 1, a shape functional J : 𝒦(E) ↦ ℝ⁺ and a continuous map f : ℝ ↦ ℝ, and let the Eulerian semi-derivative of J in the direction φ exist. Then J is an f-Lyapunov function for φ if and only if, for every K ∈ Dom(J), J∘(K)(φ) ≤ -f(J(K)).

Proof. Assume first that J is an f-Lyapunov function, so that J(ϑ_φ(t, K)) ≤ w(t) with w(0) = J(K). Using the definition of the directional shape derivative this yields

J∘(K)(φ) ≤ w′(0)   (4.4)

Then, observing that w′(0) = -f(w(0)) = -f(J(K)), we can conclude the result.
Conversely, denote y(t) = J(ϑ_φ(t, K)). We can now write

y(t) - y(s) = ∫_s^t y′(τ) dτ   (4.5)

By the hypothesis, we obtain

y(t) - y(s) ≤ -∫_s^t f(y(τ)) dτ   (4.6)

Now, we can write

y′(t) ≤ -f(y(t))   (4.7)

Now consider a solution w(·) of the differential equation

ẇ(t) = -f(w(t)),   w(0) = J(K)   (4.8)

Using the Gronwall Lemma, for any positive t, we obtain

y(t) = J(ϑ_φ(t, K)) ≤ w(t)   (4.9)

This proves the theorem. □
Using the shape Lyapunov theorem we can derive assumptions on the input u, to
move the camera/ robot, in order to accomplish the surveillance task.
4.2 Stability Analysis of Hausdorff Tracking
In this section we analyze the stability of the Hausdorff tracking method using the
shape Lyapunov theorem discussed in the previous section.
Stability in the sense of Lyapunov for the proposed Hausdorff tracking method would imply the asymptotic convergence to zero of the shape function J(K). This asymptotic convergence to zero of J(K) implies that the task criterion for the system is accomplished.
For both the image-based and cooperative Hausdorff tracking methods proposed, the shape directional derivative has been reduced to the following form:

J∘(K)(φ) ≤ C(K) · u   (4.10)
We can now propose the following theorem for stability of the Hausdorff tracking
method.
Theorem 4.2.1 For an automated tracking system modeled using the mutational equation K̊ ∋ φ(K), where K is the target set and φ(q) = B(q)u with the plant input vector u, let the shape (error) function be J(K), with the shape directional derivative in the direction of the mutation, J∘(K)(φ), bounded as in equation 4.10. Then the system is stable, i.e., the shape function asymptotically converges to zero, for the input u = -αC#(K)J(K), α > 0.
Proof. In order to prove the stability of this system, we have to show that it
satisfies the shape Lyapunov theorem from section 4.1. That would imply that the
shape function J (K) asymptotically converges to zero. The shape Lyapunov theorem
can be written as:
J°(K)( _ 0, Vi = 0 implies
that the target is covered. Based on the procedure for calculating the motion input
an to the cameras outlined in 4.3.1, we can show that z'Jn and an exponentially tend
to zero, which implies that V,- also exponentially tends to zero. Hence we can say
that the continuous evolution in the modes {q1, q2, q3} converges exponentially to the
equilibrium state.
We further note that the switching conditions G defined in equation 4.20 have a
certain hysteresis associated with them due to the finite size of the non-empty target
set K̂ and the non-empty overlap region W1 ∩ W2. Hence we can prove that Zeno executions are prevented, which completes the proof. □
Hybrid System Stability for the Two-Camera Example System without
Controlled Switches
Based on Definition 4.3.1 of internal stability and Lemma 4.3.1, it can be deduced that each state (except q0) is internally stable. We can now define stability for the
pervasive surveillance scenario as:
Definition 4.3.2 (Stability of a pervasive surveillance system) A pervasive surveillance system is said to have the stability property if a target set (K̂) will be continuously covered in the monitored region if it enters the coverage set (K_i) of one of the sensors comprising the network.
Now, if we can prove that the set of modes {q1, q2, q3} is invariant, then any hybrid execution (τ, q, K) starting in these modes will always remain in these exponentially stable modes, which implies that the target will be pervasively tracked. Based on these conditions we can put forward the following theorem for stability of a pervasive surveillance network.

Theorem 4.3.1 Consider the graphical MHA model of a pervasive surveillance system in Figure 4.2. The set of operating modes {q1, q2, q3} is an invariant set.
Proof. Consider a hybrid trajectory s_H = (τ, q, K) of the system with its initial conditions within the set {q, K | q ∈ {q1, q2, q3}}. The modes {q1, q2, q3} are exponentially stable, which implies that the energy of the system in that mode can be reduced to a very small value in any finite amount of time t. For mode q1 this implies that when the target K̂ ⊂ K2 hits the boundary of the domain W1, its projection Π_{W1}(K̂) ⊂ Π_{W1}(K2) is already covered by K1, as Π_{W1}(K2) ⊂ K1. When K̂ is inside the boundary of W2, Π_{W2}(K̂) ⊆ K̂ ⊂ K1. Hence the target is within the coverage region of K1 and will be recognized by K1, which will generate the event to set b1 = 1, which in turn will switch the mode to q3 (assuming W1 ∩ W2 ≠ ∅). Thus as soon as the target moves out of W2 and into W1 it will be covered by K1. A similar argument can be presented for the mode q2 and the target hitting the boundary of W2.
Now in mode q3, the target is located in the intersection of the domains W1 and W2 (K̂ ⊂ W1 ∩ W2) and is being covered by both K1 and K2 such that K̂ ⊂ K1 and K̂ ⊂ K2, due to the exponential stability of mode q3. Now the target can only move into the regions W1 \ (W1 ∩ W2) or W2 \ (W1 ∩ W2), and in these regions either K1 or K2 can cover the target, respectively. This implies that the system can switch modes to q1 or q2, both of which possess the exponential stability (in the sense of Lyapunov) property.
We have shown that all trajectories originating in q1 and q2 can switch only to state q3, and all trajectories originating in q3 can switch only to either q1 or q2. This implies that the set of modes {q1, q2, q3} is invariant, which proves the theorem. □
Based on the above theorem, and the fact that all the modes in the invariant set
{q1, q2, q3} are exponentially stable (in the sense of Lyapunov), we can conclude that
the hybrid system is stable i.e., the target set is continuously covered once it enters
the invariant set.
4.3.3 Two-Camera Example for Pervasive Surveillance with Controlled
Switches
For the two-camera pervasive surveillance example considered in the previous section
4.3, the discrete variables were comprised of Q = (b1, b2, c) = {q}, where b_i is a binary value which denotes whether the target is visible to sensor cam_i and c is a discrete variable which indicates which cameras are involved in the tracking scenario.
This representation of the discrete state c is not very intuitive. Hence, for a truly distributed implementation, we propose to modify the discrete variable c and separate it into its constituent parts. Each camera will be associated with its own c_i, which will assist in a distributed implementation.
For the example system presented earlier, only the modes in which the discrete
variable c was set to 2 were considered. This implies that for all possible locations of
the target all the cameras will be servoing the target. This scenario will be wasteful
for energy consumption and does not take into account the fact that the sensors need
not track the target if it is relatively far away from the sensor’s domain.
Since the cameras need not be tracking the target continuously and can shut down
when the target is not close by, the MHA model with controlled switching will be
more energy efficient. Figure 4.3 shows the MHA model of the two-camera system with a controlled switch, which is used to render a camera inactive.
Figure 4.3: Graphical MHA model for the two-camera example with controlled switches.
J(K̂) = ∫_K̂ f(q) dq   (3.18)

where f(q) can be construed as the aggregated integrand from equation 6.1. For the shape function of the type

J(K̂) = ∫_K̂ d²_K(q) dq   (3.19)

substituting Equations (3.19) and (3.6) into Equation (2.23) we get

J∘(K̂)(φ) = ∫_K̂ [∇d²_K(q) · B(q)u + d²_K(q) div(B(q)u)] dq
         = ∫_K̂ [∇d²_K(q)ᵀ B(q) + d²_K(q) div B(q)] dq · u   (3.20)

The above equation can be written as:

J∘(K̂)(φ) = 2 ∫_K̂ inf_{q∈Π_K(p)} (q - p)ᵀ φ(p) dp + ∫_K̂ d²_K(p) div φ(p) dp   (3.21)

Here it must be noted that, for the existence of the shape directional derivative, the integrand must be in C¹(ℝⁿ, ℝ). J_FOV in Equation (6.1) is continuous. However, the elements J_Amin and J_Amax in Equation (6.1) do not satisfy this criterion. Hence we modify the shape function in order to satisfy the continuity criterion using smooth approximations of the min and max functions as:

J_Amin = [ √((∫_K̂ dp - AREA_MIN)² + ε²) - (∫_K̂ dp - AREA_MIN) ] / 2   (3.22)

J_Amax = [ √((AREA_MAX - ∫_K̂ dp)² + ε²) - (AREA_MAX - ∫_K̂ dp) ] / 2   (3.23)

where ε is a very small positive number. This approximation physically manifests as adding a very small area (of size ε) to the target set size, and hence this approximation will hardly affect the validity of the shape function to represent the task of maintaining adequate resolution.
Feedback Map u for Image-Based Hausdorff Tracking

The problem now is to find a feedback map u_c such that the shape function J is reduced to zero. Equation 3.21 can be approximated as:

J∘(K̂)(φ) ≤ (1/z_t) C1(K̂) u1 + C2(K̂) u2 + C3(K̂) u_c   (3.24)

where u_c = [u1ᵀ u2ᵀ]ᵀ, u1 = [v_x, v_y, v_z, λ̇]ᵀ and u2 = [ω_x, ω_y, ω_z]ᵀ. z_t is an estimated minimum bound on the target z position; the relation z_t > z will guarantee the inequality in 3.24.
The matrices C1(K̂), C2(K̂) and C3(K̂) can be defined as:

C1(K̂) = 2 ∫_K̂ inf_{q∈Π_K(p)} B1(q)ᵀ(q - p) dp,   (3.25)

C2(K̂) = 2 ∫_K̂ inf_{q∈Π_K(p)} B2(q)ᵀ(q - p) dp,   (3.26)

C3(K̂) = 2 ∫_K̂ B(q) dp,   (3.27)

where the matrix B(q) is defined as B(q) = [B1(q), B2(q)] and its elements are defined as:

B1(q) = [ -λ/z_t   0   q_x/z_t   q_x/λ
           0   -λ/z_t   q_y/z_t   q_y/λ ]   (3.28)

B2(q) = [ q_x q_y/λ   -(λ² + q_x²)/λ   q_y
          (λ² + q_y²)/λ   -q_x q_y/λ   -q_x ]   (3.29)
After combining the relevant terms from equation 3.24, it can now be rewritten as J∘(K̂)(φ) ≤ C(K̂) · u_c.
κ : [0, r) × X ↦ U   (5.30)

The Bellman function V(s, x) should satisfy, for any s ∈ [0, r] and each x ∈ X,

V(s, x) ≥ min_κ B(r, κ, x, w)   (5.31)

and the optimal control input satisfies the condition

ξ(j + 1) = φ(ξ(j), κ(j, ξ(j))),   j = s, s + 1, …, r
ξ(s) = x   (5.32)
for each s ∈ [0, r) and each x ∈ X.
The computational effort required to solve this problem will be significantly lower than tabulating all the control input sequences. However, the storage requirements for this procedure will be significantly large.
5.8.2 Dynamic Programming Algorithm Implementation
In order to minimize the switching cost and maintain adequate resolution of the target,
we construct a graph based on the camera locations and the visibility of the targets to the cameras, given the predicted motion of the target (Figure 5.9). The procedure for
constructing the graph is shown in Algorithm 1. After the graph is constructed, we
use the dynamic programming method, a modified version of the Dijkstra algorithm,
to find the optimal switching strategy (Algorithm 2). In the graph generating stage,
we enumerate all the cameras that can see the current grid point with an acceptable
resolution. The resolution metric is marked over each camera as a weight on the
vertex. However, in order to run the dynamic programming method, it is desirable
to have all the weights only on the edges. Thus each camera is represented by two vertices in the graph, with an edge connecting them that has a weight equal to the resolution metric. We also enumerate all the cameras that can see the grid points
specified by the prediction vector. Between two prediction grid points we connect
the cameras based on their switching metric. If the two cameras are the same, the
switching metric is set to 0, otherwise a non-zero switching metric is set on the edge.
In Figure 5.9, all non-zero metrics are set to 1 for the sake of simplicity.
5.9 Chapter Summary
This chapter presents the design of the various components of a surveillance network.
For exploiting the redundancy in motion along the various axes of the sensor, an
Figure 5.9: Generating Graph for the Camera Switching Strategy
Algorithm 1 [G, W] = GraphGen(pv)
  for all cam in CameraSet do
    if pv(1) is visible by cam then
      Add two nodes representing cam to G
      Connect the two nodes with the resolution metric of cam in W
    end if
  end for
  p_prev = pv(1)
  Delete pv(1) from pv
  for all p in pv do
    for all cam in CameraSet do
      if p is visible by cam then
        Add two nodes representing cam to G
        Connect the two nodes with the resolution metric of cam
        Connect the nodes generated by p and p_prev and set the switching cost in W if they are for different cameras
      end if
    end for
    p_prev = p
  end for
optimization framework is proposed. This framework allows the accomplishment of
various other tasks under the constraints of the main tracking task, such as energy
conservation and better sensor location for sensing.
The characteristics and implementation details of the various choices for video
feedback are discussed and finally the design of a sensor node is presented. The
architecture of the sensor node consists of a camera, a target perception module, a
motion control module, a video feedback module and a communications module.
Algorithm 2 p = Dijkstra(G, W, s, t)
  for all v in V(G) do
    d(v) ← +∞  {Initialize the distance vector}
    p(v) ← undefined  {Initialize the previous-node vector}
  end for
  d(s) ← 0  {The source has 0 distance to itself}
  C ← ∅  {Initialize the checked node set}
  Q ← V(G)  {Copy the vertex set into a working set}
  while Q ≠ ∅ do
    u ← ExtractMin(Q)  {Extract the vertex with minimum value in d}
    C ← C ∪ {u}
    if u = t then
      return
    end if
    for all edges (u, v) do
      if d(u) + W(u, v) < d(v) then
        d(v) ← d(u) + W(u, v)
        p(v) ← u
      end if
    end for
  end while
  if t ∉ C then
    p ← ∅
  end if
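As a concrete illustration of Algorithms 1 and 2 (added here; the use of the networkx library, the grid points and the metric values are assumptions of this sketch, not the dissertation's implementation), the layered camera-switching graph can be built and searched as follows.

import networkx as nx

# Layered camera-switching graph in the spirit of Algorithms 1 and 2.
# Each camera visible from a predicted grid point becomes a node pair joined
# by its resolution metric; consecutive grid points are joined by the
# switching cost (zero when the camera does not change).
pv = ["p0", "p1", "p2"]                          # predicted target grid points
visible = {"p0": {"c1": 0.2, "c2": 0.5},         # camera -> resolution metric
           "p1": {"c2": 0.3},
           "p2": {"c2": 0.4, "c3": 0.1}}
SWITCH_COST = 1.0

G = nx.Graph()
for p, cams in visible.items():
    for c, res in cams.items():
        G.add_edge((p, c, "in"), (p, c, "out"), weight=res)
for a, b in zip(pv, pv[1:]):
    for ca in visible[a]:
        for cb in visible[b]:
            G.add_edge((a, ca, "out"), (b, cb, "in"),
                       weight=0.0 if ca == cb else SWITCH_COST)
for c in visible[pv[0]]:
    G.add_edge("s", (pv[0], c, "in"), weight=0.0)
for c in visible[pv[-1]]:
    G.add_edge((pv[-1], c, "out"), "t", weight=0.0)

print(nx.dijkstra_path(G, "s", "t", weight="weight"))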
Chapter 6
Experimental Implementation and
Testing
Figure 6.1 shows the general architecture of the surveillance system implemented over
an IP network. This chapter presents a discussion of the implementation issues such
as switched video feedback to the human operator/ client and a performance analysis
of the experimental results of the proposed algorithms.
6.1 Experimental Test-bed Setup
A visual surveillance test-bed comprising seven video cameras has been set up at
the Robotics and Automation Laboratory at Michigan State University. The test-bed
is used to demonstrate the integration of multiple active sensors with active target
tracking algorithms to perform a coherent pervasive surveillance task of tracking
multiple targets as they move across the monitored landscape. The cameras are
attached to processing and communication computers and some are mounted on pan-
tilt drives or robots for moving the cameras. The surveillance test-bed developed
has the functionality of an end-to-end, multi-camera monitoring system that allows a single operator or multiple human operators to monitor activities in the region of surveillance.
Figure 6.2: Surveillance test-bed designed along with the cameras and the motion modules for camera motion (wired and wireless networked sensors, mobile networked sensors, the network infrastructure, robot motion platforms, and targets).
Figure 6.3: Sony EVI-D30 mounted on Robot Nomad XR-4000
Figure 6.2 shows the test—bed setup. This section describes the hardware and
software architecture of the individual sensors used.
6.1.1 Hardware Setup
The sensor node setup consists of three Sony EVI-D30 active PTZ (pan-tilt-zoom) cameras, one dual-head stereo vision camera consisting of two Hitachi KP-D50 cameras mounted on a Digital Perception DP-250 pan-tilt drive, and one fixed-view KP-D50 camera. One of the Sony EVI-D30 cameras was mounted on a Nomad XR-4000
mobile robot for extended mobility and sensing range. Figures 6.3 and 6.4 show the
sensor and motion hardware for the implemented system.
The cameras were connected to Pentium 4 2.4 GHz computers which had PCI-
based video capture hardware cards attached to them. The camera capture cards are
Figure 6.4: Active cameras: Sony EVI-D30 and dual-head Hitachi KP-D50
capable of transcoding, in real time, the input from the cameras to an H.264 stream
(on a separate software interface) and transmitting that stream over the Internet to
clients that request the video stream. The PTZ modules for the various cameras
were controlled through serial port communications. The various computers were
connected to each other using wired ethernet and wireless 802.11g connections over
an IP network. The individual sensor nodes were provided with publicly addressable
IP addresses and hence could be accessed from the Internet. The human interface
clients were comprised of Pentium 4 laptop computers which could be connected to
the surveillance network through a direct wired or wireless connection or through the
Internet.
6.2 Software Setup
The sensor outputs are processed on the local computer for target detection and
identification. The local computers used for sensing and processing are running a Linux operating system. The targets to be tracked using the automated tracking
system were multiple humans moving around in the monitored region. For ease
of target detection and identification, the human targets were wearing solid color
clothing and CM Vision was used for color analysis and blob detection and merging
[91]. The acquired 640x480-pixel images were quantized on a 128x96 grid for reduced
computation load. The identified targets were represented using their bounding boxes
which were quantized on the 128x96 grid. The coverage set is also quantized on the
grid and the distances of the grid points to the target set are pre—computed and stored.
The tracking task is carried out at 25 fps - i.e., a new velocity input is computed for
the motion module inputs at the rate of 25 Hz.
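The grid quantization described above can be sketched as follows (Python; the bounding-box coordinates are made up for the example, and the rounding convention is one possible choice rather than the implementation actually used).

import numpy as np

# Map a target bounding box given in 640x480 pixel coordinates onto the
# 128x96 processing grid used for the tracking computations.
IMG_W, IMG_H = 640, 480
GRID_W, GRID_H = 128, 96
SX, SY = GRID_W / IMG_W, GRID_H / IMG_H          # 0.2 in both directions

def quantize_bbox(x0, y0, x1, y1):
    """Return the boolean grid mask covered by the bounding box."""
    c0, c1 = int(np.floor(x0 * SX)), int(np.ceil(x1 * SX))
    r0, r1 = int(np.floor(y0 * SY)), int(np.ceil(y1 * SY))
    mask = np.zeros((GRID_H, GRID_W), dtype=bool)
    mask[r0:r1, c0:c1] = True
    return mask

target_mask = quantize_bbox(300, 120, 420, 360)  # an example detection
print(int(target_mask.sum()), "grid cells occupied")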
At the same time, the raw camera feed is input to the H.264 transcoder card, which is provided as a separate software device (/dev/dsp0) on the computer. Various client programs implemented on Linux and Windows operating systems are available for receiving the video stream and other pertinent information from the camera servers.
6.3 Experimental Results for Wide Area Camera Calibration
The individual camera sensors involved in the setup of the camera system need to
be calibrated for their location and pose information in order to share information
amongst each other. A wide area calibration procedure described in Section 5.1 has
been used to calibrate the configuration of the cameras.
BlueCCal, developed for use at Czech Technical University by Tomas Svoboda [92],
is an attempt at calibrating many cameras without the necessity of a large calibration
object, or the necessary handwork of more classical methods. The system does,
however, have necessary conditions for the calibration to yield sufficiently accurate
results, which, among others, is a minimum of three cameras for the setup.
The only action required as a reference for calibration is the mapping of 1-point objects that are easily detectable in the captured images. For our setup, this involved
capturing images in a dark setting with a plastic-capped laser pointer to act as the
1-point reference in all images. Therefore, the only necessary work was moving the
laser pointer around within our viewable domain, and capturing images during the
course of this motion. These images would later be used in the calibration software,
to locate the relevant points necessary for calibration.
The software itself can be broken down into a number of simple steps: find all
points in the images for each sensor, discard all mis-detected points by pair-wise
RANSAC analysis, estimate projective depths, fill missing points to make the scaled
measurement matrix complete, perform the rank 4 factorization to get projective
shape and motion, and estimate parameters of non—linear distortion.
The results of the calibration are not only the necessary conversion matrices between each of the sensors, but also the distortion models of the system, the location
of all captured points in each of the sensors, and a scaled model of the location of
each of the sensors. This information proves vital in the immediate testing of the
results of the calibration; i.e., if the scaled model displays some erratic location of a
given sensor, it is likely the calibration is off w.r.t. that sensor. One can then look at
the location of the points in that sensor and notice if that sensor got the necessary
coverage of its viewing area to accurately calibrate w.r.t. the rest of the system. This
allows for a more intuitive check of the accuracy of a given calibration.
An advantage of this calibration tool, besides those stated above, is that the calibration is adaptive, and among the images captured, the one-point object (i.e., the laser pointer) does not have to be in all of the images for each sensor. The
adaptive style of this network means that the more cameras involved, the better the
calibration results. So this proves beneficial for heavily overlapped multiple sensor
networks. The only necessary condition w.r.t. the number of sensors that can view
a given point is that the resultant calibration requires a minimum number of points
that can be viewed by at least three cameras at a time; this proves incredibly helpful
Figure 6.5: Camera setup: location and pointing direction of the five-camera configuration.
when working with a disjoint network, where there exists no point that all of the cameras can see. Since only three cameras are required to see a given point, as long as the majority of the monitored region is covered by at least three cameras, the system is able to calibrate with reasonable efficiency.
Figure 6.5 shows the results of the calibration carried out on 5 cameras. The
cameras were located at different heights and were pointing to the center of the
room.
The results of this experiment demonstrate the calibration of the surveillance test-
bed setup for conducting the remaining experiments outlined. From Figure 6.5 we
can see that the camera setup was calibrated with an adequate degree of accuracy.
6.4 Experimental Implementation for Image-Based Hausdorff
Tracking
The automated tracking task was defined as maintaining the visibility of multiple
moving targets in a region with adequate resolution. The shape function used to
track the targets is:

J(K̂) = Σ_{i=1}^{N} [ J_FOV(K̂_i) + J_Amin(K̂_i) + J_Amax(K̂_i) ]   (6.1)

J_FOV(K̂_i) = ∫_{K̂_i} d²_K(q) dq
J_Amin(K̂_i) = max(AREA_MIN_i - ∫_{K̂_i} dq, 0)
J_Amax(K̂_i) = max(∫_{K̂_i} dq - AREA_MAX_i, 0)

where N is the number of targets, q is a point on the target set K̂, and AREA_MAX_i and AREA_MIN_i denote the maximum and minimum admissible areas of the target set K̂_i for maintaining adequate resolution. Note that the shape function J(K̂) is zero only when the set of targets ⋃_{i=1}^{N} K̂_i is completely covered by the sensor coverage set K and when the area of each target set K̂_i is within the limits (AREA_MIN_i, AREA_MAX_i) specified for that target. Otherwise J(K̂) is a non-zero positive value.
Based on the value of the shape function J (K ), the velocity input vector u to the
camera motion units (PTZ drives or robot) is calculated and applied at the rate of
image acquisition - i.e., 25 frames per second (fps).
6.4.1 Tracking a Single Target
The surveillance task is to maintain the target in the active FOV while maintaining
a discernable resolution of the target on the image plane. The task is described
Figure 6.6: Image-based Hausdorff tracking using active camera: (a) shape functions; (b) active sensor velocity inputs (mm/s, mrad/s) vs. time (s).
mathematically using equation 6.1. The target was a ball that was manually moved
around in 3D space during the duration of the experiment. The image was taken as a
regular grid of 128x96 pixels evenly spread over the 640x480 original image in order
to reduce the computation load. The target set K̂ was approximated as occupying a certain number of pixels on this grid. The assumptions on the input u = [v_x, v_z, ω_x, ω_y, ω_z]^T to the robot and camera system were derived using equation 3.31. The target was initially placed where the task criteria were not satisfied and then the target was moved
around manually, which generated a disturbance input to the system and the system
immediately tried to reduce the value of the shape function to zero.
Figures 6.6 and 6.7 depict the results of the image space Hausdorff tracking task
using an active camera. The shape functions J_FOV(K̂), J_AreaMin(K̂) and J_AreaMax(K̂) and the inputs u = [v_x, v_z, ω_x]^T are plotted in Figures 6.6 (a) and (b), respectively. The figure
Figure 6.7: Image-based Hausdorff tracking using active camera: maintaining adequate target resolution. (a) shape functions; (b) camera inputs.
shows that the system is stable and that the target is maintained in the camera field
of view with a desired resolution despite the seemingly random motion of the target.
These experiments demonstrate Hausdorff tracking using multiple shape functions.
We see that the shape functions are non-zero when the target moves but are quickly reduced to zero by the motion of the camera. The motion of the target is taken as a disturbance input to the tracking control system shown in Figure 3.4, and it can be seen that the Hausdorff tracking controller can reject the disturbance input from the motion of the target.
6.4.2 Tracking Multiple Targets Simultaneously
The surveillance task is to maintain the multiple targets in the active FOV of the
sensor. The targets were two humans moving around and interacting with each other.
Figure 6.8: Tracking multiple targets using image-based Hausdorff tracking: shape function J and input velocities vs. time (s).
The target set K̂ was approximated as occupying a certain number of pixels (in multiple disjoint locations) on this grid. The assumptions on the input u = [ω_x, ω_y]^T to the camera system were derived using equation 3.31. The targets were initially located such that they were out of the field of view of the camera. The targets were detected by the monitor and the sensor was tasked to cover the two targets simultaneously. At t = 0, the targets are just in the active FOV of the sensor and the task criterion is
not satisfied. The camera then moves to reduce the shape function J to zero so the
targets are now covered. The targets then randomly move around the room and the
camera tries to maintain both targets continuously in the active FOV.
Figure 6.8 depicts the shape function J and the input velocities (u = [ω_x, ω_y]^T) applied to the camera. Notice the initial large value of the shape function J, which is quickly reduced to zero. Figure 6.9 depicts the estimated X, Z positions of the two targets. We see
Figure 6.9: Image-based Hausdorff tracking for multiple targets: estimated positions of the two targets (Person 1 and Person 2) vs. time (s).
that despite the seemingly random motion of the two targets, the camera always tries
to keep both of them in the active FOV. Further, the energy efficiency of the proposed
method is demonstrated by the relatively infrequent input applied to the camera only
when one of the objects escapes the active FOV. Figure 6.10 shows the images taken
from the camera at various times during the tracking experiment shown in Figure
6.8.
This experiment shows multiple target tracking with a single camera sensor using
the Hausdorff tracking method. Tracking multiple targets using a single sensor is
advantageous for overloaded networks, where the targets outnumber the cameras available for tracking. The image sequence in Figure 6.10 shows
that the targets are moving about randomly and still being maintained in the field
of view of the camera sensor at all times, which demonstrates the advantages of
Figure 6.10: Image sequence for tracking multiple targets using image-based Hausdorff tracking (frames at t = 5, 13, 20, 42 and 46 s).
Hausdorff tracking over vector space servoing.
6.5 Experimental Implementation for Cooperative Hausdorff
Tracking
The cooperation between nodes to continuously track a target by directly tasking
various sensors can be done using cooperative Hausdorff tracking as explained in
section 3.4.2.
In this experiment, a mobile robot is deployed in order to track a target that cannot
be covered by a fixed sensor with adequate resolution. The robot uses the position
information received from other sensors in the network to compute a trajectory that
will bring the target set within its coverage set.
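The following Python sketch illustrates, in simplified form, the two-phase behavior just described: the robot first servos on the target position reported over the network and, once the target enters its view cone, hands over to the image-based law. The gains, view-cone half-angles and hysteresis margins are hypothetical, and the second phase is only a stand-in for the actual image-based Hausdorff controller.

import math

# Illustrative two-phase sketch (not the dissertation's controller). All numeric
# values are placeholders chosen for illustration.
def in_view_cone(robot_pose, target_xy, half_angle):
    x, z, heading = robot_pose
    bearing = math.atan2(target_xy[0] - x, target_xy[1] - z)
    return abs(bearing - heading) <= half_angle

def tracking_step(robot_pose, target_xy, mode):
    x, z, heading = robot_pose
    dx, dz = target_xy[0] - x, target_xy[1] - z
    dist = math.hypot(dx, dz)

    # Hysteresis on the switching criterion avoids chattering between modes.
    if mode == "cooperative" and in_view_cone(robot_pose, target_xy, 0.30):
        mode = "image_based"
    elif mode == "image_based" and not in_view_cone(robot_pose, target_xy, 0.45):
        mode = "cooperative"

    if mode == "cooperative":
        v = 0.5 * dist                              # approach the network-reported position
        w = 1.0 * (math.atan2(dx, dz) - heading)    # turn the view cone toward the target
    else:
        # Placeholder for the image-space Hausdorff law (the real controller uses
        # the acquired image directly, not the reported position).
        v = 0.5 * (dist - 2.0)                      # hold an assumed optimal range
        w = 1.0 * (math.atan2(dx, dz) - heading)
    return v, w, mode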
Figure 6.12 shows the x-z plot, which is the plane of the floor, of the data for
both the target centroid and the robot tracking it. The robot’s path shows the initial
cooperative tracking taking place. As the robot moved its view cone toward the
Figure 6.11: Position of robot and target centroid on the x and z axes vs. time (cooperative and image-based Hausdorff tracking).
target, the target also began moving. Re-projection of the target onto the view cone
caused a slight change in velocity in the robot that can be seen. Once the target set
enters the set space of the mobile robot’s view cone, image-based Hausdorff tracking
begins. Now the robot reacts to changes in the target to maintain optimal resolution.
This continues for the remainder of the experiment, as can be seen in the plot.
In Figure 6.11 we can see the data from the individual axes. The robot is quite
successful in maintaining the target set within its coverage set, as can be seen by the
correlation of the plots of each of the axes.
Finally, in Figure 6.13 we see the plot of the J value for both image—based and
Figure 6.12: Motion of target set and robot on the x-z plane (mm). The view cone of the robot is shown at a particular instant in time, with the position of the target centroid shown as well.
Figure 6.13: Shape functions for cooperative and image-based Hausdorff tracking (top), with the active mode indicated, and the velocities on the x and z axes (bottom) vs. time (s).
Figure 6.14: Robot's view of the target at several points in time showing several conditions the tracking system responds to: profile of robot and target; target at optimal distance (50 s); target distance too great (20 s); target at edge of view (80 s).
cooperative Hausdorff tracking. A demarcation line has been included to clearly mark
the switch from cooperative to image-based tracking. This occurs at approximately
seven seconds. This agrees with the data from Figure 6.12 since this is the point
in time that the values of the axes begin to synchronize. Figure 6.14 shows several
images gathered during the experiment. The first image in the series shows a profile
of the mobile robot and the target tracked. The remaining images were gathered from
the robot’s onboard camera. The second image shows the target at the determined
optimal distance. This determination is based on the set space of the target instead of simple distance, as is the case with point-based tracking. The third and fourth images show cases where the target is moving toward, and away from, the robot, which causes the robot to retreat from, and pursue, the target respectively. The final image in the series shows the target strafing left, to which the robot responds by moving to re-center the target set.
Figure 6.15: Target tracking using a Sony EVI-D30 camera mounted on a Nomad XR-4000 mobile robot.
This experiment shows that the Hausdorff tracking control can be used adequately
for task space servoing, as well. Given the task space location of a target set (possibly
multiple disjoint targets) and a model of the coverage set of a camera, the target
can be brought into view using cooperative Hausdorff tracking. This experiment also
demonstrates the stable switching (due to in-built hysteresis in the switching criteria)
from cooperative Hausdorff tracking mode to image-based Hausdorff tracking mode.
6.6 Experimental Implementation for Redundancy Resolution using Optimal Hausdorff Tracking
Based on the discussions in the preceding sections, we can now formulate the task
distribution problem for Hausdorff tracking using redundant motion stages. The
problem we are considering is the optimal task distribution for a pan-tilt camera
mounted on a mobile robot tracking a single target. The experimental setup for the
tracking task is depicted in Figure 6.15.
For the experimental implementation, we will consider the redundancy in the
robot and camera motion along the X traverse direction. When the target moves
along the X direction, either the robot can traverse linearly or the camera can pan.
Hence there is a redundancy in the task execution. The model for the Hausdorff
tracking task using the camera and the robot combination can be written as:
"U2:
[01C2] = —aJ (6.2)
my
where u 2 [ex 212ij is the velocity input vector to the robot and camera combination,
a is a positive task level scalar gain and J is the shape function which indicates the
error in accomplishing the task. The vector C = [Cl C2] describes the distribution of
the task error between the two redundant axes. Equation (6.2) can be derived using
the formula provided in [67].
The task indicators used are the energy consumed by the system and the cur-
rent pan angle of the camera. The current pan angle of the camera affects how
the translation in X by the mobile robot affects the task accomplishment. Hence,
the objective functions considered for the multi-objective optimization schemes also
included energy minimization and pan angle minimization and were represented as:
D_1 = (θ − θ_0)^2        (6.3)

D_2 = ‖v_x‖ + ‖ω_y‖      (6.4)

where θ is the current pan angle value, θ_0 is the desired pan angle value (considered as zero pan), and ‖v_x‖ and ‖ω_y‖ are the horizontal X velocity and the angular Y (pan) velocity, respectively. Note that confidence parameters are built into the objective functions. For example, when the pan angle is close to zero, we would like to reduce the weight for the objective function D_1, which is exactly what happens.
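As a minimal sketch of how the redundancy in Equation (6.2) can be exploited, the following Python fragment searches the one-dimensional null space of the task constraint for the input that best trades off the two objectives D1 and D2 under a simple weighted sum. It stands in for the physical programming optimizer actually used; all weights, gains, ranges and the time step are illustrative.

import numpy as np

# Minimal illustrative sketch of redundancy resolution for Equation (6.2): among
# all inputs u = [v_x, w_y] satisfying C u = -alpha*J, choose the one that trades
# off the pan-angle objective D1 (evaluated on the pan angle predicted one control
# period ahead) against the energy objective D2.
def resolve_redundancy(C, J, theta, theta0, alpha=1.0, w1=1.0, w2=0.1,
                       dt=0.04, search_range=200.0, grid=401):
    C = np.asarray(C, dtype=float)
    u_p = -alpha * J * C / (C @ C)            # minimum-norm solution of C u = -alpha J
    n = np.array([-C[1], C[0]])               # null-space direction of the 1x2 constraint
    n /= np.linalg.norm(n)
    best_u, best_cost = u_p, np.inf
    for s in np.linspace(-search_range, search_range, grid):
        u = u_p + s * n
        d1 = (theta + u[1] * dt - theta0) ** 2    # predicted pan-angle objective
        d2 = abs(u[0]) + abs(u[1])                # energy objective
        cost = w1 * d1 + w2 * d2
        if cost < best_cost:
            best_cost, best_u = cost, u
    return best_u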
Various optimization schemes were set up, including one with no optimization, involving just a random selection of the null space vector. The optimization schemes
Figure 6.16: Results of the tracking task with optimal task distribution using physical programming: shape function, X and pan velocities, and objective function values with their tolerable and highly desirable ranges, vs. time (s).
selected for the experiment included single objective optimization where either the
energy minimization or zero pan angle objectives were considered individually. For
the multi-objective schemes, both weighted sum and physical programming methods
were tested. The results of these tests are tabulated in the following sections. Figure
6.16 shows the results of the experiments carried out using physical programming
optimization described above. The figure shows the plots of the shape function, the
velocities in the X and pan directions and the objective function values for energy
conservation and pan angle zeroing.
Notice that the values of all the objective functions are always maintained at
least in the tolerable range by sacrificing the performance of the other cost functions
currently under consideration.
The same task was also carried out using a no-optimization scheme with a fixed
Figure 6.17: The energy consumption for no optimization and weighted sum optimization, with the desirable and highly desirable ranges indicated.
task distribution between the two redundant axes and also using a weighted sum
scheme. As mentioned in section 5.4.1, a weighted sum method with fixed weights without apparent physical meaning is used in order to combine the various cost criteria. The cost criterion used is a fixed-weight sum of the pan-angle and energy objectives D_1 and D_2.
The two cost functions, i.e., energy and pan-angle maintenance, for the no-optimization method and the weighted sum method are shown in Figures 6.17 and 6.18, respectively.
For the no—optimization case, it can be seen that in the absence of any optimiza-
tion, the linear velocity transferred to the robot is very small and most of the velocity
Figure 6.18: The pan angle for no optimization and weighted sum optimization, with the tolerable, undesirable and highly undesirable ranges indicated.
for task execution is allocated to the camera pan.
Using the weighted sum method, the various components comprising the task
criterion lack physical meaning. Based on this observation we notice that even when
the multi-objective optimization is enabled, the pan angle remains very close to the
desired value while the energy conservation objective attains unacceptable values.
It should be noted that various tolerance ranges, which are used for the physical
programming approach, are not used in the weighted sum approach. Hence, despite
the optimization being enabled on the various objectives, the values of the objective
functions can attain unacceptable values.
The above results show that the physical programming method provided better
results in terms of optimizing multiple objectives while simultaneously accomplish-
ing the task. This is because with the weighted sum method, due to the loss of
physical meaning of the weights, dynamic weighting may provide an unacceptable
solution. However, for physical programming, since the cost criteria are combined
using weights which are attributed a physical meaning, physical programming can
calculate solutions on the Pareto-optimal frontier.
This experiment further goes to show that the redundancy in the input expressed
in Equation (3.30) can be successfully exploited in order to optimize various cost func-
tions associated with the task. Hence, some criteria, such as resolution maintenance
of multiple targets, can indeed be removed from the set of constraints and used in
the optimization criteria and can provide more flexibility for task specification.
6.7 Experimental Results for Multisensor Cooperation
This section illustrates the experimental implementation and results for the two tech-
niques:
1. Sensor allocation algorithm and
2. Mutational hybrid automata modeling techniques proposed earlier.
6.7.1 Experimental Evaluation for the Sensor Allocation Algorithm
Pervasive surveillance using the proposed multi-sensor coordination mechanism was
rigorously tested. The task of tracking a target over multiple sensor coverage regions
was carried out. The target moved a large distance, spanning the capable viewing
regions of multiple sensors. Figure 6.19 depicts the results of the tracking scenario.
Initially, sensor 2 can see the complete target but the target is out of the active FOV
of sensor 1, which is depicted by the target visible flag. Based on the solution of the
optimization, sensor 1 is tasked to track the object. It uses cooperative Hausdorff
surveillance to cover the target. The global coordinates of the target for cooperative
Hausdorff tracking are provided by sensor 2. Once it acquires the target, it switches to
image-based Hausdorff tracking to ensure target coverage. The target then randomly
Figure 6.19: Estimated position of targets for multisensor cooperation using the sensor allocation algorithm: (a) estimated X, Y, Z position (m) and target visible flag vs. time (s); (b) Camera 2 estimated X, Y, Z position (m).
147
moves around in the active FOV of both the sensors, which keep continuously tracking
it. Notice the robust tracking performance of the image-based tracking scenario
despite the minor differences in the estimation of the object location. The target
then starts to move out of the tracking range of sensor 2, which again triggers the
optimization planning problem, which assigns sensor 1 to keep continuously tracking
the target.
This experiment demonstrates the cooperation between multiple sensors for track-
ing a target. It shows that information from another sensor can be used in order to
cooperatively acquire a target by a camera and that the location of the targets can
be used to optimally assign the tracking task to the various sensors based on their
availability, target priority and the locations of the sensors and targets.
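A minimal sketch of the kind of allocation decision described above is given below; the dissertation formulates this as an optimization problem, whereas the fragment uses a simple greedy cost that combines target priority, sensor-target distance and sensor availability, with hypothetical weights and data structures.

# Illustrative sketch only: a greedy cost-based assignment of sensors to targets.
def allocate_sensors(sensors, targets, w_dist=1.0, w_busy=5.0):
    # sensors: {sensor_id: {"pos": (x, y), "busy": bool}}
    # targets: {target_id: {"pos": (x, y), "priority": float}}
    assignment = {}
    # Higher-priority targets are assigned first.
    for tid, tgt in sorted(targets.items(), key=lambda kv: -kv[1]["priority"]):
        best_sid, best_cost = None, float("inf")
        for sid, sen in sensors.items():
            dx = sen["pos"][0] - tgt["pos"][0]
            dy = sen["pos"][1] - tgt["pos"][1]
            cost = w_dist * (dx * dx + dy * dy) ** 0.5 + w_busy * sen["busy"]
            if cost < best_cost:
                best_sid, best_cost = sid, cost
        assignment[tid] = best_sid
        sensors[best_sid]["busy"] = True      # mark the sensor as allocated
    return assignment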
6.7.2 Experimental Evaluation of Multiple Target Tracking using Multiple Formation-Constrained Sensors
Motion is an energy intensive resource for a sensor node and should be minimized.
Multiple sensors (possibly with varying sensing modalities) can be mounted on a single
motion drive to reduce the energy consumption. Further, the various sensors can be
tasked to track different targets simultaneously. Hence the motion of the sensor node
should be optimized in order to achieve the various tasks simultaneously. The optimal
Hausdorff tracking framework proposed in section 5.4.1 can be used to calculate the
optimal sensor input u.
Consider a scenario where two targets are being tracked by two separate formation-
constrained sensors as shown in Figure 6.20. The task is to maintain the visibility of targets K̂_1 and K̂_2 using the sensors cam1 and cam2, respectively. The two sensors are mounted on a single motion platform, although sensor cam1 can be moved independently of sensor cam2 by using the motion of joint q0 of the PUMA robot arm.
The shape functions for this task can be expressed as:

J_1 = ∫_{K̂_1} d_{K_1}(p) dp        (6.6)

J_2 = ∫_{K̂_2} d_{K_2}(p) dp        (6.7)

The motion inputs to the complete robot system are linear motion along the X axis (v_x) and angular motion about the Z axis of the robot base (ω_z). Angular motion of the robot will not affect the coverage task assigned to sensor cam2 while linear motion along the X axis will affect the task assigned to both the sensors. Using Equation (3.31), the model for Hausdorff tracking for the two-camera system can be written as:

Cu = [C_11  C_12] [v_x]  ≤  −[α_1   0 ] [J_1]        (6.8)
     [C_21  C_22] [ω_z]      [ 0   α_2] [J_2]
Figure 6.21 depicts the target tracks of the two targets in the world coordinate
frame. The two targets move independently of each other and can cross each other
as is shown in Figure 6.24.
Figure 6.22 depicts the shape functions J_1 and J_2 for the two cameras cam1 and cam2, respectively, and the velocity profiles of the two inputs v_x and ω_z to the mobile base and robot arm are shown in Figure 6.23. The inputs to the two motion axes (linear velocity v_x and angular velocity ω_z) are calculated using Equation (6.8).
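A minimal sketch of how inputs satisfying a relation of the form (6.8), taken with equality, could be computed at each control step is shown below; the entries of C and the gains are placeholders, since in the experiments C is obtained from the mutational dynamics model of Equation (3.31).

import numpy as np

# Minimal sketch: computing inputs [v_x, w_z] that satisfy the two-camera
# Hausdorff tracking relation (6.8) with equality, i.e. C u = -diag(a1, a2) [J1, J2].
def two_camera_inputs(C, J1, J2, a1=0.5, a2=0.5):
    rhs = -np.array([a1 * J1, a2 * J2])
    # Least-squares solve handles the case where C is poorly conditioned.
    u, *_ = np.linalg.lstsq(np.asarray(C, dtype=float), rhs, rcond=None)
    return u            # u = [v_x, w_z]

# Example usage with placeholder values.
C = [[0.8, 0.3],
     [0.6, 0.0]]        # second row: w_z assumed not to affect cam2's task
print(two_camera_inputs(C, J1=1200.0, J2=800.0))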
The results of this experiment show that we can use formation-constrained sensors
in order to accomplish multiple tracking tasks simultaneously using multiple sensors.
This implies that the optimal Hausdorff tracking framework can be leveraged in order
to solve multiple target tracking problems.
Figure 6.20: Experiment setup for multi-target tracking using multiple cameras.
Figure 6.21: Target tracks (X and Y position, m) for multi-target tracking using multiple cameras.
Figure 6.22: Shape functions J_1 and J_2 for cameras cam1 and cam2 during multi-target tracking.
Figure 6.23: Velocity profiles of the base velocity v_x and the arm angular velocity ω_z during multi-target tracking.
Figure 6.24: Image sequences taken while tracking the two targets with two formation-constrained cameras (arm camera Cam1, base camera Cam2) at t = 10, 25, 37 and 55 s.
Figure 6.25: X-Y plot (mm) of the target motion.
6.7.3 Experimental Evaluation for Multi-Sensor Cooperation using Mutational Hybrid Automata Model
For the experiment, the location of the target (human) moving about was captured
using a stereo vision system. Note that the location of the target was not used in the
feedback control scheme designed, but was used merely to show the random motion
of the target and coordinate it with the various mode changes in the MHA model.
Figure 6.25 represents the position, in x-y coordinates, of the centroid of the target,
located using the stereo camera. The locations of the cameras cam1 and cam2 are
superimposed on this figure.
Pervasive Target Tracking Using Two Cameras with Continuous Camera Servoing
In this experiment, the MHA model presented earlier is implemented.
Figures 6.27 and 6.26 show the results of the experiments conducted using this model. Figure 6.27 represents the data calculated by cam1 and Figure 6.26 represents the data calculated by cam2. Figures 6.27 and 6.26 show the value of the shape function J, the pan velocity input to the camera, and the mode q ∈ {q1, q2, q3} (defined in 4.2) that the system is in; subfigure (b) depicts the current pan position of the camera and the visibility flag of the target for that camera. The visibility flag indicates whether the target is within the FOV of the camera and is set when the target is recognized by the target detection subsystem.
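The following sketch gives a simplified, guard-based reading of the mode switching driven by these visibility flags; the mode labels loosely follow the q1/q2/q3 convention, but the guards and the control outputs are placeholders rather than the automaton actually implemented.

# Illustrative sketch only: a simplified guard-based mode switch for two cameras,
# driven by the per-camera target visibility flags.
def update_mode(mode, visible1, visible2):
    if visible1 and visible2:
        return "q2"                      # both cameras can servo on the target
    if visible1:
        return "q1"                      # only camera 1 tracks
    if visible2:
        return "q3"                      # only camera 2 tracks
    return mode                          # target lost by both: keep the last mode

def control_outputs(mode, J1, J2, gain=0.2):
    # Pan-velocity commands for the two cameras in each mode; a camera that is
    # not tracking is simply held still in this sketch.
    w1 = -gain * J1 if mode in ("q1", "q2") else 0.0
    w2 = -gain * J2 if mode in ("q2", "q3") else 0.0
    return w1, w2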
Figure 6.28 depicts an image sequence captured by the two cameras, with subfigures (a), (b) and (c) representing the three modes q1, q3 and q2, respectively. Within each of Figures 6.28(a), 6.28(b) and 6.28(c), the left and right images correspond to cam1 and cam2, respectively. We notice that the target is in the field of view of at least one
camera at any time during the course of the entire experiment.
The results of this experiment show that the mutational hybrid automata frame—
work can be used to systematically design a pervasive surveillance network which can
be used to track a target as it moves across a wide region covered in parts by individ-
ual cameras. It shows that despite the random motions of the target, the surveillance
network can, in fact, perform stable tracking without having to estimate the exact
location of the target.
Pervasive Target Tracking Using Two Cameras with Controlled Switch
This section presents the results of the experimental implementation of the mutational
hybrid automata model with controlled switch shown in Figure 4.3.
Figure 6.26: Current MHA mode, shape function, pan velocity, pan angle and target visibility for Cam1.
Figure 6.27: Current MHA mode, shape function, pan velocity, pan angle and target visibility for Cam2.
Figure 6.28: Image sequences taken while tracking the target over the different modes.
Figure 6.29: Current MHA mode, shape function, pan velocity, pan angle and target visibility for Cam1.
Figures 6.29 and 6.30 show the results of the experiments conducted. Figure 6.29 represents the data calculated by cam1 and Figure 6.30 represents the data calculated by cam2. Figures 6.29 and 6.30 show the value of the shape function J, the pan velocity input to the camera, and the mode q ∈ {q1, q2, q3, q4, q5} (defined in 4.3) that the system is in; subfigure (b) depicts the current pan position of the camera and the visibility flag of the target for that camera. The visibility flag indicates whether the target is within the FOV of the camera and is set when the target is recognized by the target detection subsystem.
These experiments show that irrespective of the target’s current location in the
monitored region, visibility of the target is maintained by at least one camera. This
Figure 6.30: Current MHA mode, shape function, pan velocity, pan angle and target visibility for Cam2.
experiment further demonstrates that the controlled switch scenario, where the cameras can be actively turned off, can be used to perform pervasive tracking.
6.8 Performance Analysis of the Video Subsystem
The real-time performance of the implemented video subsystem was measured and analyzed
for use in the surveillance network scenario. The parameters which affect the perfor-
mance of a video transport scheme in the context of a switched surveillance scenario
are time taken to deliver the video stream to the operator, size and quality of the
image, the rate of video frame update (framerate) and the initialization time for the
client to receive one complete image frame from the server. We compared the perfor-
mance of using H.261, H.263 and MJPEG as the encoding/decoding scheme for the
visual surveillance tasks. The various choices for the video subsystem implementa-
tion have their advantages and disadvantages based on the scenario they are being
applied to. We propose a scenario/task-based metric which can be used to evaluate
the suitability of the various schemes based on their characteristics. The parameters
for evaluation include:
1. Codec complexity
2. Bitrate of generated video stream
3. Switched video initialization time
Codec complexity can be represented by the amount of time taken for the capture and
compression of an individual image and may be an important factor for implemen-
tation on computationally constrained systems such as embedded processors. The
bitrate of the generated video stream represents the bandwidth consumed for trans-
porting the video stream to the operator and can be a limiting factor for wireless
or power constrained sensor nodes. Switched video initialization time represents the
amount of time taken to display the relevant objects (targets or environment) after a video stream switch has occurred. The proposed metric involving these parameters is:

M = K_LE / T_LE + K_B / B + K_C / T_C        (6.9)

where K_LE, K_B and K_C are gain coefficients chosen on a per-scenario/task basis using a fuzzy rule, T_LE and T_C represent the switched video initialization time and the capture-and-compress time, respectively, and B represents the bitrate of the generated stream.
The surveillance tasks can be divided into two categories - namely, monitoring and
tracking tasks. In the monitoring task, the operator must evaluate the actions of the
target, which may require knowledge of the environment w.r.t. the target, while the tracking task requires only keeping the target in view. Hence, for the tracking
task, in order to calculate the real switching time for the video system, we need to
account only for the display of a moving target, while for the monitoring task, we need
to evaluate the switching time for the moving target as well as the static environment.
Based on the characteristics of the task/scenario being evaluated, the values of the gain coefficients are chosen based on a fuzzy rule. For example, if the sensor nodes are not constrained by processing requirements or power consumption (implemented on regular PCs), the codec complexity will not have much effect on the system performance, so we assign a very low value to the gain coefficient K_C. Similarly, if the nodes are networked using a wired LAN (100 Mbps Ethernet) infrastructure having large enough bandwidth, then we assign a low value to the gain coefficient K_B. In contrast, if the nodes share a low-bandwidth wireless channel, K_B should be assigned a higher value, which signifies the importance of the bandwidth-consumed parameter.
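A small sketch of how the metric of Equation (6.9) can be evaluated for a given scenario is shown below; the gain triples mirror the scenarios discussed above (and tabulated in Table 6.1), while the measured times and bitrate in the example are placeholders rather than the reported experimental values.

# Illustrative sketch: evaluating the metric M of Equation (6.9) for a scenario.
SCENARIO_GAINS = {
    "wired_monitoring":    (1.0, 0.0, 0.0),     # (K_LE, K_B, K_C)
    "wired_tracking":      (1.0, 0.0, 0.0),
    "wireless_monitoring": (1.0, 1000.0, 0.0),
    "wireless_tracking":   (1.0, 1000.0, 0.0),
}

def metric(scenario, t_le, bitrate, t_c):
    # t_le: switched video initialization time (s); bitrate: B (KBPS);
    # t_c: capture-and-compress time (s).
    k_le, k_b, k_c = SCENARIO_GAINS[scenario]
    m = k_le / t_le + k_b / bitrate
    if k_c:                      # avoid dividing by T_C when its weight is zero
        m += k_c / t_c
    return m

# Example with placeholder measurements for a wireless tracking scenario.
print(metric("wireless_tracking", t_le=0.06, bitrate=1700.0, t_c=0.012))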
The various task/scenario combinations and their respective gain values we have
compared for the video system evaluation are tabulated in Table 6.1. In the scenarios
Table 6.1: Scenarios and gain values

#   Scenario                             K_LE   K_B    K_C
1   Wired network, monitoring task       1      0      0
2   Wired network, tracking task         1      0      0
3   Wireless network, monitoring task    1      1000   0
4   Wireless network, tracking task      1      1000   0
implemented, the sensor nodes are implemented on general purpose PC’s running
Linux and hence the capture and compress time is not very crucial to the performance
comparison of the various schemes. The software implementation of the codecs for the
tests was a modified version of the Vic software. The video streams were packetized
using the RTP transport protocol.
6.8.1 Performance Analysis for Capture and Processing Time
The capture and compression times for one image frame for MJPEG, H.261 and H.263 are shown in Table 6.2. The frame rate is set to 25 fps, the bit rate is set to 1.5 Mbps and the image size is set to CIF resolution (352x288) for the experiments.
Table 6.2: Capture and Compression Time of the Three Schemes

Scheme                   MJPEG      H.261      H.263
Capture & Compression    12.65 ms   30.09 ms   32.58 ms
We noticed that for H.261 and H.263, when the scene has dramatic changes, the
compression time tends to increase, and in the case of MJPEG the compression time
is quite constant. Since both MJPEG and H.263 are symmetric coding/decoding
schemes, the decompression time should be equivalent to the compression time.
6.8.2 Comparison for Video Size and Image Quality
The H.263 standard is very limited in its capability for selection of the frame size,
which is limited to 5 standard resolutions, namely, SQCIF, QCIF, CIF, 4CIF and 16CIF.
This may be a limitation when integrating multiple types of camera sensors. On
the other hand MJPEG allows for almost all resolutions (limited to multiples of 8,
of course) and can be used to transmit very high definition images when required.
The viewing quality of the transmitted images can be largely regarded as the same
because both schemes use DCT (discrete cosine transform) and quantization for the
compression. However, it should be noted that using MJPEG, the complete image
is updated at once, while for H.263, the image is updated in parts and may cause a
visual degradation of the perceived image quality.
6.8.3 Performance Analysis for Video Bitrate and Frame Rate
Due to the inter-frame (temporal) compression implementation, the video bitrate
generated per frame for the H.261/H.263 schemes is lower compared to MJPEG. This makes H.261/H.263 schemes more suitable for video communications over a restricted bandwidth network. The frame rates and bitrates generated by the various schemes for
fixed and panning cameras are tabulated in Table 6.3. The quantization ‘Q’ values [57]
used are based on a visual notion of picture clarity and are noted in the table.
Table 6.3: Frame Rate and Bitrate for MJPEG, H.261 and H.263

Frame Rate   MJPEG: Q=30           H.261: Q=10           H.263: Q=10
(fps)        Bitrate (KBPS)        Bitrate (KBPS)        Bitrate (KBPS)
             panning    static     panning    static     panning    static
10           668        668        550        25         700        20
20           1300       1300       1000       45         1000       38
25           1700       1700       1500       60         1300       48
6.8.4 Performance Analysis for Server Switching and Camera Hand-off
The initialization time of H.261, H.263 and MJPEG are also measured. The initial-
ization time is defined as the time from when the server starts broadcasting to when
the client receives the first image frame. In all three cases, the initialization time is
around 10ms.
For the H.261 and H.263 schemes, since inter-frame coding is used, there will be a long delay when a client joins a session where a server is already broadcasting to other clients. This effect is termed "late entry" and is caused by the use of inter-frame encoding, where each frame is not independent of the others but must be decoded using some information from other frames. The stream switching delay due to late entry can be large when using these encoding schemes, especially when the scene is relatively static. The late entry problem does not exist for the MJPEG coding scheme because all the frames transmitted are intra-coded and do not rely on information in
the previous or subsequent frames.
For the H.261 scheme implementation, the server is only required to transmit those macro blocks that have changed through consecutive frames. Thus the client has to wait until all the macro blocks have been transmitted once to be able to see the whole scene. Based on our implementation methodology for the H.261 scheme, a moving
target will be displayed almost immediately while the static environment will take
a longer time to display after the switch, which implies that the switching time will
be low for a tracking task and relatively high for a monitoring task where the static
environment needs to be taken into account.
The H.263 standard allows a completely intra-coded frame, called the 'I' frame, to be transmitted periodically in order to negate the effects of accumulated errors. In our experiments with the H.263 coded stream, carried out at various frame rate settings, the client has to wait for approximately 5-10 seconds to
decode and display the first image frame because it can start to decode a stream
only after it has received the first ‘I’ frame. Table 6.4 tabulates the maximum time
taken by the client to display all the relevant macro blocks for the first time when the
client joins the broadcast session late for both the tracking and monitoring tasks. The
maximum switching time due to late entry problem is compared for various frame
rates of the video streams with a very large bandwidth.
Table 6.4: Server Switching Time for Monitoring and Tracking Tasks

Frame rate    Monitoring task               Tracking task
(fps)         MJPEG    H.261    H.263       MJPEG    H.261     H.263
1             1 s      32 s     9.3 s       1 s      1 s       9.5 s
10            0.1 s    3.2 s    9.8 s       0.1 s    0.12 s    9.7 s
25            0.06 s   1.3 s    9.2 s       0.06 s   0.05 s    9.4 s
Note that the times given in Table 6.4 are the measured values for the display to
update. We notice that the switching time for the H.261 scheme is significantly lower
for the tracking task than for the monitoring task because the macro blocks comprising
the moving target will be updated and transmitted almost instantly; however, the
static environment blocks will not be updated until they age and expire at the end
of their 32-frame update cycle.
It should be noted that for the H.263 coding scheme, all the information needed to initialize the decoder is stored in an intra-coded 'I' reference frame, which is transmitted periodically. One method to combat this late entry problem would be to transmit an 'I' frame every time a new client joins the session, which can be implemented using the RTCP (control commands) part of the RTP protocol. This may lead to higher
bandwidth consumption, but may be acceptable in wired surveillance applications.
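A minimal sketch of this idea, using assumed encoder and session interfaces rather than the Vic/RTP implementation employed in the test-bed, is given below.

# Illustrative sketch only (hypothetical server-side logic): when an RTCP receiver
# report from a previously unseen client arrives, force the encoder to emit an
# intra-coded 'I' frame so the new client can start decoding immediately instead
# of waiting for the next periodic 'I' frame. `encoder.force_intra_frame()` and
# `session.known_clients` are assumed interfaces, not a real library API.
def on_rtcp_report(session, encoder, client_id):
    if client_id not in session.known_clients:
        session.known_clients.add(client_id)
        encoder.force_intra_frame()      # next encoded frame will be an 'I' frame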
6.8.5 Performance Metric Evaluation for Various Surveillance Net-
work Tasks
Based on the experimental results reported above, the metric M evaluated for the various scenarios and the codec schemes is tabulated in Table 6.5. The video frame rate was held constant at 25 fps with the image quality (Q) being held at 30 for MJPEG and 10 for H.261/H.263. The cameras are assumed to be stationary.
Table 6.5: Metric for Comparison of Various Schemes

Scenario                     MJPEG    H.261    H.263
1: Wired, monitoring         16.67    0.769    0.109
2: Wired, tracking           16.67    20       0.109
3: Wireless, monitoring      17.26    17.43    20.94
4: Wireless, tracking        17.26    36.67    20.94
We notice that the H.261 implementation has the best performance for the wired and wireless tracking scenarios, whereas for the wired monitoring scenario MJPEG has the best performance, and the H.263 implementation performs best for the wireless monitoring scenario. It must be kept in mind that the performance evaluations for the various schemes are qualified by their implementation methodologies outlined in section 5.5. A brief discussion regarding the advantages and disadvantages
of using the various video subsystems is presented here.
The MJPEG system can transmit video frames of various sizes and hence has the advantage of being able to handle heterogeneous hardware for video capture. Further, it has a low coding and initialization latency and allows for a direct hardware implementation, which is advantageous for using lower processing power on the sensor
node. It does not suffer from the late entry problem during switching. However, it
requires a consistently high bandwidth, which may limit its application on wireless
or low bandwidth communication channels.
H.261 and H.263 encoding schemes consume less bandwidth, allowing for higher
frame rates, which is an important characteristic for continuous and responsive real-
time monitoring. The drawbacks of these schemes are that they can handle video
frames of certain specified sizes only and that both initialization time and switching
time are higher than in MJPEG. They also suffer from the late entry problem, which
can be detrimental for systems involving video stream switching and nodes linked over
unreliable channels. However, the late entry problem can be solved at the expense
of transmitting the complete intra-frame-encoded image when requested using the
RTCP protocol.
6.9 Chapter Summary
This chapter presents the experimental implementation details of the various algo-
rithms proposed earlier. The image-based Hausdorff tracking method has been implemented for tracking single and multiple targets simultaneously. This tracking of
multiple disjoint targets moving independently is a major advantage of the proposed
method, as the current existing image—based tracking techniques, such as visual ser-
voing and gaze control, cannot achieve this. Results for the cooperative Hausdorff
tracking are also presented. The two multi-sensor cooperation methods proposed are
also validated experimentally. A task-based metric is also proposed for evaluating the
suitability of the various video feedback schemes.
Chapter 7
Conclusions
This dissertation has presented an in-depth examination of the various aspects of
planning and control of mobile surveillance networks, concentrating particularly on
the broad areas of modeling, analysis and design of mobile surveillance networks. This
chapter discusses the conclusions and gives a view towards future areas of research.
7.1 Conclusions
The non-intrusive nature of cameras makes them an attractive choice for sensors for
various surveillance applications. In recent times, with the decrease in the cost of camera sensors accompanied by increasing processing power and the availability of wireless networking capabilities in sensors of ever smaller size, large-scale target tracking systems utilizing a large number of cameras are being increasingly deployed for myriad surveillance and tracking tasks. Sensor mobility accompanied by large scale
wireless networking can overcome the limitations due to range, direction and occlu-
sion of visual sensing. However, a large number of sensors with motion capabilities
tends to increase the modeling and analysis complexity of such systems.
For successful completion of the surveillance task, it is important to allocate the
sensors to the targets that are actively being tracked. Generally, each target should
be tracked by the sensor that can view it at an optimal resolution for best visibility of
that target. However, when the system is heavily loaded, that is, the target density
in a particular region with a limited number of sensors is high, one sensor might need
to track multiple targets. Even when the system is under-loaded, multiple target
tracking by individual sensors is highly desirable, as viewpoint redundancy will lead
to better estimates of the target coordinates and also present the viewer a clearer
view of all the targets that are being tracked.
In order to track a target over a wide area with a distributed surveillance network,
the tracking sensors need to be successively switched and the tracking task needs to
be handed over from one sensor to the next. This switching, while increasing the
modeling and analysis complexity, also introduces the design problem of optimally
locating the sensors (i.e., configuring the sensors) in order to maximize the tracking
performance. Switching the tracking task from one sensor to another tends to be quite
disorienting for a human operator and hence needs to be minimized while tracking a
target over a large region, while also maintaining adequate resolution of the target to
allow discerning various target features for classification and identification of targets.
This dissertation presents various novel approaches to solve the aforementioned
problems associated with mobile surveillance networks. The original contributions of
this dissertation can be broadly classified into modeling, analysis and design contri-
butions, which are individually addressed below.
Modeling Contributions Using the concepts of continuous and discrete motion
of cameras, a generalized modeling framework for conducting pervasive surveillance
has been developed. First, we have developed a novel domain-based modeling ap-
proach, christened “Hausdorff tracking”, for visually tracking multiple targets with
a single sensor while simultaneously accomplishing various other sub-tasks such as
maintaining adequate resolution of the tracked targets. Hausdorff tracking uses a
mutational-analysis-based approach to express the motion of the target and sensor
coverage sets with respect to the camera motion input. The various task requirements
can be specified using shape functions which can be construed as analogs of vector
errors in a vector—based control system.
Using the concept of mutations of domains and shape analysis, the dynamics
(change in shape) of the sensor field of view (FOV) and target domains are described
and further feedback control mechanisms have been derived to complete the specified
task. The assumptions on the control inputs to the camera for stability of the control
system can be calculated using the shape Lyapunov theorem. Hausdorff tracking
provides a significant advantage over classical vector space servoing as it can express
the multiple target tracking succinctly and perform the tracking task in an energy-
efficient manner. The methods of image-based Hausdorff tracking and cooperative
Hausdorff tracking are introduced to perform target tracking using target information
either directly sensed from the camera image plane or acquired from other sensors
using the communication module, respectively.
In order to pervasively track a target over a large area covered by multiple sensors,
the mutational hybrid automata (MHA) model is proposed. The MHA model is
capable of capturing the inherent hybrid (discrete switching with continuous motion)
nature of the pervasive surveillance task - i.e., it accounts for both continuous camera
motion and discrete jumps in the location of the tracking camera, which indicates
a switch of the tracking camera from one sensor to another. The MHA model for
pervasive surveillance can be used to model “hand-off” scenarios as the targets move
across the coverage regions of multiple networked sensors. It allows modeling of both
autonomous and controlled switches, which provides for a rich framework for hybrid
control of the various tracking cameras based on the location of the target and other
environment inputs.
Analysis Stability properties of the Hausdorff tracking controllers are proved using
the shape Lyapunov theorem. Using the shape Lyapunov theorem, it is shown that
the shape function (error) converges to zero at least as fast as a stable first-order
linear system. Stability for a pervasive tracking scenario implies that when a target
is detected by a camera in the network, its visibility will always be ensured while
it moves around in the monitored system. Using examples of pervasive surveillance
scenarios, a two-step procedure is proposed to show the stability properties of the
MHA model for modeling pervasive surveillance tasks. The first step shows that all
the modes of the MHA model are internally stable - i.e., continuous evolution within
each mode is stable in the sense of Lyapunov and there are no Zeno executions. The
next step is to show that a particular set of stable modes in the MHA model forms an
invariant set - i.e., if the hybrid execution enters this invariant set, it will remain in
this set for all time. Now if we limit the Init set of the MHA model to the invariant
set, the execution will never leave that set. This implies that if we can prove that the
invariant set is the set of all modes where at least one camera can view the target, then
if the target enters the field of view of any one camera, it will be tracked continuously
as long as it remains in the monitored region. This procedure is developed for both
models with and without controlled switches.
Design A wide area multi-camera location and pose calibration scheme is presented
that can calibrate many heterogeneous cameras with respect to a global coordinate
system in a short time by simply waving a bright light source in the region observed
by the cameras. This method can be used even when all cameras do not have an
overlapping field of view. The problem of optimally placing a number of cameras in
a given region to facilitate a surveillance task is presented. Metrics for evaluating
the performance of the tracking task based on the target resolution and number of
switches to pervasively track the target are presented. Another problem we presented
is related to the camera switching strategy given a full or partial trajectory of a moving target. A dynamic-programming-based approach to generate the switching
strategy and optimize both the switching and resolution metrics is presented.
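As an illustration of the idea (not the dissertation's algorithm), the following dynamic-programming sketch selects which camera tracks the target at each step of a discretized trajectory so as to minimize the total resolution cost plus a penalty for every camera switch; the costs and penalty are hypothetical.

# Illustrative dynamic-programming sketch of a camera switching strategy.
def plan_switching(res_cost, switch_penalty=10.0):
    # res_cost[t][c] = resolution cost of camera c at step t, or None if the
    # target is not visible to that camera at that step.
    T, C = len(res_cost), len(res_cost[0])
    INF = float("inf")
    best = [cost if cost is not None else INF for cost in res_cost[0]]
    back = []
    for t in range(1, T):
        cur, choice = [INF] * C, [0] * C
        for c in range(C):
            if res_cost[t][c] is None:
                continue
            for p in range(C):
                cost = best[p] + res_cost[t][c] + (switch_penalty if p != c else 0.0)
                if cost < cur[c]:
                    cur[c], choice[c] = cost, p
        back.append(choice)
        best = cur
    # Recover the camera sequence by backtracking from the cheapest final camera.
    seq = [min(range(C), key=lambda c: best[c])]
    for choice in reversed(back):
        seq.append(choice[seq[-1]])
    return list(reversed(seq))

# Example: 4 trajectory steps, 2 cameras; camera 0 loses the target at step 2.
# With a small switch penalty the plan starts on camera 0 and switches once.
print(plan_switching([[1.0, 3.0], [1.0, 3.0], [None, 2.0], [None, 2.0]],
                     switch_penalty=1.0))       # -> [0, 0, 1, 1]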
The Hausdorff tracking problem is extended to incorporate an optimal control
framework for executing multiple subtasks along with the target visibility task for a
surveillance network. This optimization framework allows the surveillance task de-
signer to consider various sub-tasks in the planning phase. Various sub—tasks can
include minimizing the energy consumed by the network, which will enhance the
longevity of the deployed network sensors. Physical programming is used for find-
ing a solution to the optimization problem in order to lend a tangible and physical
interpretation to the subtasks being considered.
The design of a switched video feedback system and a sensor node architecture
are also presented. The various alternatives for the switched video feedback, namely MJPEG, H.261, H.263 and H.264, are evaluated and a task- and scenario-based metric
is proposed to analyze their performance. The MJPEG scheme does not suffer from
the late entry problem, but the bandwidth requirement is constantly high. H.261 and
H.263 schemes consume low bandwidth for relatively static scenes but suffer from
the late entry problem when video switching is required. A switched video feedback
system using a multiple-client, multiple-server architecture using the H.264 video
encoding scheme is implemented.
Implementation and testing A surveillance test-bed consisting of seven heterogeneous cameras with motion capabilities for validation and testing of the proposed
schemes and algorithms is implemented. The sensors are calibrated using the wide
area calibration scheme proposed and various target tracking scenarios are imple-
mented and tested. Experimental results validating the proposed theoretical ap-
proaches are presented.
The various theoretical approaches and tools presented in this dissertation will
enable the design, analysis and implementation of real-time mobile surveillance sys-
tems.
Overall Impacts Altogether, this dissertation presents a collection of modeling, analysis and design tools for mobile surveillance networks. The impacts of this dissertation on various fields are summarized below.
Impact on the field of surveillance and sensor networks: This dissertation presents significant advances in modeling, analysis and design for the field of surveillance and sensor networks. Modeling multiple target tracking using the Hausdorff tracking method provides a significant advantage for distributed sensor systems, as it can significantly improve sensing accuracy due to multiple detections of the same target with different sensors. The novel stability analysis approach presented for guaranteeing target coverage using distributed sensors is a significant step forward from the current approaches of probabilistic modeling.
Impact on the field of robotics and control: This dissertation presents a major contribution to the field of robotics and control. Vision-based robot control has suffered from the problem of task specificity, i.e., the tracking task is generally over-specified, and has also lacked a generalized modeling structure for multi-target visibility problems. The method of Hausdorff tracking proposed in this dissertation represents a generalized set-based modeling structure for traditional vision-based servoing. It can in fact be viewed as a superset of visual servoing, as traditional visual servoing tasks can also be readily expressed in this framework.
The use of mutational equations and shape analysis for representing shape-based control is a significant step towards extending the rich tool-set of the control literature to domain-based control. Generalizing a vector-based model to a domain-based model
opens up many avenues for creative design of control algorithms. Many real-world tasks which needed to be parameterized in order to be expressed in a control-theoretic framework can be directly solved using this domain-based approach.
Impact on general science and engineering: This dissertation presents various
results on suitability of various video streaming approaches for surveillance appli-
cations. It provides a method to measure the acceptability of various streaming
solutions which will have a significant impact on the implementation of networking
and IP-based video transmission.
Impact on society in general: There are many issues up for debate regarding privacy and the intrusion thereof. However, these concerns should not preclude the pervasive monitoring and surveillance of critical infrastructure. In the current world scenario, exercising a little precaution with regard to safety is, in my opinion, not a bad idea. This dissertation provides tools for conducting pervasive surveillance which will have many potential uses with regard to monitoring critical infrastructure such as airports, power plants, etc., and also many applications in our fuller understanding of the world around us, i.e., in environmental monitoring for understanding and preserving natural habitats. However, there are also many possible misuses of this technology, such as continuous tracking of people with the express intent of invading their privacy. But all technology, like a double-edged sword, has its advantages and disadvantages. It behooves the people who use technology to be fair, just and moral in its application.
7.2 Future Research Directions
Large-scale surveillance systems have only recently been commercialized, and the
architecture and capabilities of these systems are still an active research area as many
problems still remain to be solved. This dissertation has addressed several issues in
modeling, analysis and design of mobile surveillance systems. Certainly more can be
done to elaborate on the work presented in this dissertation and even more can be
done to address various other issues in mobile surveillance which have not been the
topic of this dissertation.
Shape functions for performing visibility- and resolution-based tasks have been developed in this dissertation. However, more general classes of admissible shape functions should be explored in order to provide a richer set of tools for designing surveillance tasks.
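To make the notion of an admissible shape function concrete, one hypothetical visibility-type candidate (given here purely as an illustration, in the spirit of, but not identical to, the functions developed earlier) penalizes the portion of a target set $T$ that lies outside a sensor's field of view $S$:
\[
J(T) \;=\; \int_{T} d_S(x)\, dx, \qquad d_S(x) = \inf_{y \in S}\|x-y\|,
\]
which vanishes exactly when $T$ is contained in the closure of $S$ and grows as the target drifts out of view. Richer admissible classes would combine such visibility terms with resolution and occlusion penalties.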
The mutational hybrid automata model presented uses an edge set in order to
describe the discrete switching dynamics of the pervasive target tracking model. It
should be noted that the number of edges and modes required to model the perva-
sive surveillance task increases exponentially as the number of cameras involved in
the surveillance task grows. This somewhat limits the applicability of the proposed
approach of using edge sets for modeling discrete dynamics. In order to alleviate this
problem, a language-based automata model can be used with an alphabet capable of
modeling very large-scale camera systems.
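As a rough illustration of this growth, the sketch below assumes, purely for the example, that each discrete mode of the automaton corresponds to one subset of active cameras and that, in the worst case, a transition edge may connect every ordered pair of distinct modes; neither assumption is the exact construction used in this dissertation.

```python
# Rough back-of-the-envelope illustration only (see assumptions above);
# not the exact mutational hybrid automaton construction of this work.

def mha_size(num_cameras: int):
    """Return (modes, worst-case directed edges) under the subset-of-active-cameras assumption."""
    modes = 2 ** num_cameras        # one mode per subset of active cameras
    edges = modes * (modes - 1)     # a directed edge for every ordered pair of distinct modes
    return modes, edges

if __name__ == "__main__":
    for n in (2, 4, 8, 12):
        modes, edges = mha_size(n)
        print(f"{n:2d} cameras: {modes:6d} modes, {edges:12,d} worst-case edges")
```

Even a dozen cameras already yield thousands of modes and millions of worst-case edges under these assumptions, which motivates the language-based alternative mentioned above.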
The proposed MHA model can be used for ensuring the visibility of a single
target moving in a region. The possibility of combining multiple such single-target
MHA models in order to track multiple targets simultaneously using the same network
should be investigated. Optimal control for the proposed mutational hybrid automata
model is still an open problem.
The switched video feedback system is implemented using a baseline encoder. A high-profile encoder with object-level support for coding the various objects separately would enhance the video feedback system by reducing the bandwidth requirement of the surveillance system. Support for user-defined queries and tasks would further enhance usability, making the system more flexible and adaptable to real-world surveillance scenarios.
Relatively little research has been done on camera deployment and placement issues for tracking applications. The switching-based metric presented in this dissertation serves as a first step towards the final goal of optimal camera placement, and a number of avenues can be pursued to carry this work further. In particular, a user-friendly placement planning system that utilizes the switching metric, among other metrics, would provide a very useful design tool.
AUTHOR'S PUBLICATIONS

1. A. Goradia, M. Huntwork, C. Haffner, N. Xi, and M. Mutka, "Modeling and Control of Pervasive Surveillance Networks using Mutational Hybrid Systems," in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006.

2. A. Goradia, C. Haffner, N. Xi, and M. Mutka, "Optimality Framework for Hausdorff Tracking using Mutational Dynamics and Physical Programming," submitted to IEEE International Conference on Robotics and Automation, 2007.

3. A. Goradia, Z. Cen, C. Haffner, N. Xi, and M. Mutka, "Sensor Selection Problem using Dynamic Programming," submitted to IEEE International Conference on Robotics and Automation, 2007.

4. A. Goradia, C. Haffner, Z. Cen, M. Mutka, and N. Xi, "Modeling, Analysis and Design of Mobile Surveillance Networks Using Mutational Hybrid Systems," submitted to IEEE Transactions on Robotics.

5. A. Goradia, M. Huntwork, C. Haffner, C. Klochko, N. Xi, and M. Mutka, "Pervasive Surveillance using a Cooperative Mobile Sensor Network," in IEEE International Conference on Robotics and Automation, 2006.

6. A. Goradia, Z. Cen, C. Haffner, Y. Liu, B. H. Song, M. Mutka, and N. Xi, "Pervasive Surveillance Networks: Design, Implementation and Performance Analysis," accepted, International Journal of Distributed Sensor Networks.

7. A. Goradia, Z. Cen, C. Haffner, N. Xi, and M. Mutka, "Design, Implementation and Performance Analysis of Pervasive Surveillance Networks," 19th International FLAIRS Conference, 2006.

8. A. Goradia, N. Xi, Z. Cen, and M. Mutka, "Modeling and Design of Mobile Surveillance Networks using a Mutational Analysis Approach," in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005.

9. A. Goradia, N. Xi, M. Prokos, Z. Cen, and M. Mutka, "Cooperative Multi-Target Surveillance using a Mutational Analysis Approach," in Proceedings of the 2005 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM 2005), Monterey, California, USA, 2005.

10. A. Goradia, I. Elhajj, and N. Xi, "Internet Based Robots: Applications, Impacts, Challenges and Future Directions," in IEEE Workshop on Advanced Robotics and its Social Impacts (ARSO), 2005.

11. Z. Cen, A. Goradia, Y. Liu, N. Xi, and M. Mutka, "Improving the Operation Efficiency of Internet-Based Teleoperation via Overlay Networks," submitted, IEEE Transactions on Robotics.

12. Z. Cen, A. Goradia, M. Mutka, N. Xi, W.-K. Fung, and Y.-H. Liu, "Improving the Operation Efficiency of Supermedia Enhanced Internet Based Teleoperation via an Overlay Network," in IEEE International Conference on Robotics and Automation, Barcelona, Spain, 2005.

13. Z. Cen, M. Mutka, Y. Liu, A. Goradia, and N. Xi, "QoS Management of Supermedia Enhanced Teleoperation via Overlay Networks," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2005), Edmonton, Alberta, Canada, 2005.

14. A. Goradia, Z. Nichol, Y. Liu, P. Suchyta, M. Prokos, and N. Xi, "Super-Media Enhanced Internet-Based Real-Time Teleoperation," in IEEE International Conference on Mechatronics and International Conference on Hands-on Intelligent Mechatronics and Automation (ICM-HIMA), 2005.

15. A. Goradia, L. Pan, and N. Xi, "Dynamic Multi-Objective Optimal Task Distribution for Teleoperated Mobile Manipulators," in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2004.

16. J. Tan, A. Goradia, N. Xi, and W. Sheng, "Coordination of Multi-Robot and Human Systems in a Perceptive Reference Frame," International Journal of Vehicle Autonomous Systems, vol. 2, 2004.

17. I. Elhajj, A. Goradia, N. Xi, C. M. Kit, Y.-H. Liu, and T. Fukuda, "Design and Analysis of Internet-Based Tele-Coordinated Multi-Robot Systems," Autonomous Robots, vol. 15, no. 3, 2003.

18. I. Elhajj, A. Goradia, N. Xi, C. M. Kit, Y.-H. Liu, and T. Fukuda, "Tele-Coordinated Control of Multi-Robot Systems via the Internet," in IEEE International Conference on Robotics and Automation, 2003.

19. J. Tan, A. Goradia, N. Xi, and W. Sheng, "Coordination of Human and Mobile Manipulator Formation in a Perceptive Reference Frame," in IEEE International Conference on Robotics and Automation, 2003.

20. I. Elhajj, A. Goradia, N. Xi, W. T. Lo, Y.-H. Liu, and T. Fukuda, "Internet-Based Tele-Manufacturing," in The Seventh International Conference on Automation Technology, Chia-yi, Taiwan, 2003.

21. A. Goradia, J. Tan, and N. Xi, "Hybrid Force/Position Control in Moving Coordinate Frames," International Conference on Control, Automation, Robotics and Computer Vision, Singapore, 2002.