DESIGN AND DEPLOYMENT OF LOW-COST WIRELESS SENSOR NETWORKS FOR

REAL-TIME EVENT DETECTION AND MONITORING

By

Dennis Edward Phillips

A DISSERTATION

Submitted to

Michigan State University

in partial fulﬁllment of the requirements

for the degree of

Computer Science – Doctor of Philosophy

2018

ABSTRACT

As sensor network technologies become more mature, they are increasingly being applied to a wide

variety of environmental monitoring applications, ranging from agricultural sensing to habitat,

oceanic, and volcanic monitoring. In this dissertation, two wireless sensor networks

(WSNs) are presented: one for monitoring residential power usage and another for producing an

image of a volcano’s internal structure.

The two WSNs presented address several common challenges facing modern sensor networks.

The first is in-network processing and the assignment of processing tasks across a heterogeneous

network architecture. By efficiently utilizing in-network processing, power consumption can be

reduced and the operational lifetime of the network can be extended. As nodes are embedded into

various environments, sensing accuracy is intrinsically affected by physical noise. The second

challenge is how to deal with this noise in a way that increases sensing accuracy. The

third challenge is ease of deployment. As WSNs become more commonplace, they will be installed

by non-experts.

As a key technology of home area networks in smart grids, ﬁne-grained power usage monitoring

may help conserve electricity. Smart homes outﬁtted with network connected appliances will

provide this capability in the future. Until smart appliances are widely adopted, there is a serious

gap in capabilities. To fill this gap, an easy-to-deploy monitoring system is needed. Several existing

systems achieve the goal of ﬁne-grained power monitoring by exploiting appliances’ power usage

signatures utilizing labor-intensive in situ training processes. Recent work shows that autonomous

power usage monitoring can be achieved by supplementing a smart meter with distributed sensors

that detect the working states of appliances. However, sensors must be carefully installed for each

appliance, resulting in high installation cost. Supero is the ﬁrst ad hoc sensor system that can

monitor appliance power usage without supervised training. By exploiting multi-sensor fusion and

unsupervised machine learning algorithms, Supero can classify the appliance events of interest and

autonomously associate measured power usage with the respective appliances. Extensive evaluation

in ﬁve real homes shows that Supero can estimate the energy consumption with errors less than

7.5%. Moreover, non-professional users can quickly deploy Supero with considerable ﬂexibility.

There are a number of active volcanoes around the world with large population centers located

nearby. An eruption poses a significant threat to the adjacent population. During times of increased

activity, being able to obtain real-time images of the interior would allow seismologists to

better understand volcanic dynamics. Volcano tomography can provide this valuable information

concerning the internal structure of a volcano. The second sensor network presented in this

dissertation is a seismic monitoring sensor network featuring in-network processing of the seismic

signals with the capability to perform volcano tomography in real-time. The design challenges,

analysis of processing/network processing times in the information processing pipeline, the system

designed to meet these challenges and the results from deploying a prototype network on two

volcanoes in Ecuador and Chile are presented. The study shows that it is possible to achieve

in-network seismic event detection and real-time tomography using a sensor network that is two orders

of magnitude less expensive than traditional seismic equipment.

ACKNOWLEDGEMENTS

As with any research effort, this work could not have been conducted in isolation. There are two groups within

Engineering at Michigan State University who deserve special recognition. The ﬁrst is the Electrical

and Computer Engineering Technical Services group. This group fabricated many prototype

seismic sensor circuit boards as I revised the ﬁnal design. The second is the Engineering Machine

Shop with the Mechanical Engineering Department. Using the equipment available within the

machine shop I was able to fabricate the various mounting brackets for the sensor modules. In

addition, the machinists were able to cut the odd-shaped connector mounting holes in the sensor

enclosures. Without the eﬀorts of these two groups the construction of the seismic sensors would

not have been achieved.

I would also like to thank the Department of Computer Science and Engineering for their

support. Without the assistantships, fellowships, and general support I would not have been able to

achieve my goal of earning my PhD.

Sincerely, Dennis E. Phillips

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
CHAPTER 2 RELATED WORK
CHAPTER 3 UNSUPERVISED RESIDENTIAL POWER USAGE MONITORING USING A WIRELESS SENSOR NETWORK
  3.1 Overview of Supero
    3.1.1 Design Objectives and Challenges
    3.1.2 Motivating Observations
    3.1.3 System Architecture
  3.2 Event Detection and Data Correlation
    3.2.1 Light Event Detection
    3.2.2 Acoustic Event Detection
    3.2.3 Power Event Detection
    3.2.4 Multi-modal Data Correlation
  3.3 Event Classification and Appliance Association
  3.4 Duty-Cycled Heating Appliances
  3.5 Implementation and Deployment
    3.5.1 Prototype System Implementation
    3.5.2 System Deployment and Configuration
  3.6 Experimental Evaluation
    3.6.1 Deployments and Evaluation Methodology
    3.6.2 Controlled Experiments in Apartment-1
      3.6.2.1 Experimental Settings
      3.6.2.2 Energy Estimation Accuracy
      3.6.2.3 Impact of Distance Errors
    3.6.3 10-Day Experiment in Apartment-1
      3.6.3.1 Evaluation Results
      3.6.3.2 Experiences and Learned Lessons
    3.6.4 Experiments in Apartment-2
    3.6.5 Experiments in House-1
    3.6.6 System Usability
    3.6.7 System Lifetime
  3.7 Conclusion and Future Work
CHAPTER 4 A SENSOR NETWORK FOR REAL-TIME VOLCANO IMAGING
  4.1 Background of Volcano Tomography
    4.1.1 Scientific Motivation
    4.1.2 Tomographic Processing Pipeline
    4.1.3 Spatial Coverage
  4.2 Design Requirements
  4.3 System Architecture and Design
    4.3.1 Signal Processing and Event Detection
    4.3.2 Sensor Node Design
    4.3.3 Design Lessons Learned and Generation 4
    4.3.4 Base Station Design
    4.3.5 Packaging
  4.4 System Modeling & Dynamic Task Assignment
    4.4.1 System Delay Modeling
    4.4.2 System Lifetime Modeling
    4.4.3 Dynamic Task Assignment
  4.5 Deployments
    4.5.1 Tungurahua Volcano Deployment
    4.5.2 Llaima Volcano Deployment
    4.5.3 Deployment Lessons Learned
  4.6 Evaluation and Deployment Experiences
    4.6.1 System Delay
    4.6.2 Data Fidelity
    4.6.3 Communication Performance
    4.6.4 Battery and Solar Panel Performance
    4.6.5 Packaging and Ease of Deployment
  4.7 Conclusion
CHAPTER 5 CONCLUSION
BIBLIOGRAPHY

LIST OF TABLES

Table 3.1: Energy breakdown for the 1-hour controlled experiment in Apartment-1.
Table 3.2: Energy breakdown during 7 days in Apartment-1∗
Table 3.3: The set of sensors detecting a light (i.e., Rm) and clustering/association results
Table 3.4: Energy breakdown in House-1∗
Table 4.1: Key Characteristics of a Traditional Seismic Station versus our sensor nodes
Table 4.2: Application processing deadlines
Table 4.3: Task execution times by tier
Table 4.4: Key characteristics of the deployments
Table 4.5: Shared SPI bus ADC Sample Intervals

LIST OF FIGURES

Figure 1.1: Tungurahua Volcano near Baños - Ecuador Deployment

Figure 3.1: The Supero architecture.

Figure 3.2: EDF result on light readings sampled at 4 Hz. Vertical lines represent detections. A person passes by Light 1 at the 31st and 53rd seconds.

Figure 3.3: Acoustic signal is separated into three bands using lattice wave digital filters for feature extraction.

Figure 3.4: Light feature vectors of two sensors.

Figure 3.5: Light intensity vs. distance (cm) in log-scale.

Figure 3.6: Acoustic event clustering and transition detection for a 3-speed fan. (a) The number of phases is identified as three; (b) Clustering and transition detection results, where the Y-axis is the major principal component (PC) and vertical lines represent the detected acoustic transitions.

Figure 3.7: Detecting stove burner. (1) Red curve: Total household power readings when a burner is working; Blue curve: The reconstructed lower envelope. (2) Standard deviation of power readings and threshold-based detection results (detection window size: 100 s).

Figure 3.8: Web configuration interface.

Figure 3.9: Apartment-1 deployment.

Figure 3.10: Results of the controlled experiment in Apartment-1. (1) The top chart shows the power readings labeled with ground truths of the events. (2) The bars in the second chart show the detections of the light sensors. Two black bars at around the 35th minute are false alarms (labeled “FA” in the chart) identified by the multi-modal data correlation. Clusters are differentiated by colors and the overhead numbers are the IDs of the associated light. (3) The third chart shows the major principal component given by PCA and the detected acoustic transitions. The acoustic transitions of the same color are associated with the same appliance. (4) The bottom chart shows the clustered and associated power events of the unattended appliances.

Figure 3.11: PRR and power traces in 10 days.

Figure 3.12: Sensor placements in Apartment-2. The numbers in the squares and circles are the sensor IDs of TelosB and Iris, respectively. If a TelosB does not face upward, the arrow represents its facing direction.

Figure 3.13: Sensor installation examples. Sensors were placed on the ground, in the corner of walls, on the fan of a range, and on a table.

Figure 3.14: Battery voltage traces of TelosB and Iris.

Figure 4.1: Seismic tomography and real-time signal processing pipeline. The tomography estimates a velocity model consisting of the seismic wave propagation speeds in cubic blocks beneath the volcano surface.

Figure 4.2: Tomography results using simulated seismic data.

Figure 4.3: General system architecture and online remote monitoring panel.

Figure 4.4: Data processing components of sensor node and base station.

Figure 4.5: STA/LTA ratio in response to a seismic signal.

Figure 4.6: Seismic Node Prototypes

Figure 4.7: Seismic Monitoring Nodes

Figure 4.8: Recorded Ecuador Seismic Event

Figure 4.9: The clock drift and correction by GPS 1PPS

Figure 4.10: Llaima Volcano deployment, Chile, 2015.

Figure 4.11: Node locations in Llaima Volcano deployment.

Figure 4.12: Hypocenter execution times by number of stations.

Figure 4.13: Tomography execution times by number of events.

Figure 4.14: One-hop link quality (circles represent nodes).

Figure 4.15: Line of Sight Path between Node 9 and Node 3

Figure 4.16: Distribution of Satellite Round Trip Ping Times

Figure 4.17: Battery daily charging cycle.

Figure 4.18: System Life Time Estimation

CHAPTER 1

INTRODUCTION

Since their introduction in the late 1990s, the use of wireless sensor networks (WSNs) to perform

monitoring has continually grown. Initially used to monitor and record various environmental

readings (temperature, humidity, etc.), their use has broadened to include detection, tracking, and

identiﬁcation of objects and events. As the complexity of the events or objects to be detected and

identiﬁed increases so must the sophistication of the techniques and algorithms. With the increase

in event detection complexity, there has also been an increase in the type and complexity of features

used in the detection process. This has gone from simple environmental features (light levels,

temperature, humidity, etc.) to multiple features (from both single and multiple sensors) to time

sequences of features.

Along with the changes in complexity, the very nature of WSNs has changed. WSNs have

moved from simple monitoring to providing information and data which is to be acted upon in

real time. This results in new challenges which must be addressed when designing a WSN. Some

of these challenges are:

Ease of deployment - As WSNs have matured, who deploys the nodes has changed. Initial WSNs

were installed by researchers who were also the designers. Today, WSNs are being installed by

ordinary homeowners, as is the case for residential power monitoring applications, or individuals

who are not the designers. This requires the individual sensor nodes to be easy to install and

insensitive to where they are placed.

Sensing accuracy - One purpose of a WSN is to provide actionable information; as a result, accuracy

is important. False alarms and false positives can have a significant effect, possibly resulting in

the wrong or inappropriate action being taken. Steps must be taken to reduce these to application

acceptable limits.

Heterogeneous network architecture - The increase in computation capabilities also allows WSN

designers to include nodes with diﬀerent computational capabilities. This results in an opportunity

for assigning the computational task to the appropriate element that minimizes overall system power

consumption while still meeting any application processing deadlines. The assignments may need

to be dynamic and adapt to changes in the sensor data or communication delays.

Operational lifetime - One challenge which hasn’t changed is power related. In many applications

the sensor nodes are still battery powered. While the computing capabilities have increased

signiﬁcantly, battery power density has not increased at the same pace. Yet, these WSNs are

expected to reliably operate for extended periods of time with little to no maintenance. As the

computational capabilities of the individual nodes have increased, this allows more processing to be

performed within the network. Rather than sending raw sensor data over the WSN communication

channels, the network now can analyze the data, extract the relevant features, and only transmit

this reduced set of features. This in-network processing can be used to reduce power consumption and extend the

operational lifetime of the WSN.
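As a rough, illustrative calculation of the saving from in-network processing, the sketch below compares the bytes a node would transmit per day when streaming raw samples with the bytes sent when only one feature record per detected event is transmitted. The 100 Hz rate lies within the 50-200 Hz range at which seismic stations typically sample; the sample width, event count, and record size are assumed values chosen for illustration, not measurements from the deployments.

```python
SECONDS_PER_DAY = 86_400


def raw_bytes_per_day(sample_rate_hz, bytes_per_sample, channels=1):
    """Bytes sent per day if every raw sample is transmitted."""
    return sample_rate_hz * bytes_per_sample * channels * SECONDS_PER_DAY


def feature_bytes_per_day(events_per_day, bytes_per_event):
    """Bytes sent per day if only one feature record per event is transmitted."""
    return events_per_day * bytes_per_event


# Assumed values: 100 Hz sampling, 3-byte (24-bit) samples, one channel,
# 200 detected events per day, one 32-byte feature record per event.
raw = raw_bytes_per_day(100, 3)
features = feature_bytes_per_day(200, 32)
reduction = raw / features

print(f"raw: {raw} B/day, features: {features} B/day, reduction: {reduction:.0f}x")
```

Under these assumptions the node would send 25,920,000 bytes per day as raw data versus 6,400 bytes as features. The actual saving depends on event rate and feature size, which vary by application.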

This dissertation presents two diﬀerent applications of wireless sensor networks illustrating

these challenges and approaches to address them. The ﬁrst is a residential power monitoring

application, Supero. Since 1978 the percentage of residential electricity has increased from 17%

to 31% [49], while the cost of energy has also been on the rise. A recent study shows that in 2001

one half of the American homes with an annual income of less than 50,000 U.S. dollars

spent, on average, 12% of their after-tax income on energy [2]. By 2011 this percentage had

increased to 21% [2]. With these increasing energy costs, homeowners are increasingly interested

in reducing the energy usage of their appliances. If they have a better understanding of the energy

consumption of their appliances, they could more easily identify any wastage of power.

As a key technology of home area networks in smart grids, ﬁne-grained power usage monitoring

may help conserve electricity. Several existing systems achieve this goal by exploiting appliances’

power usage signatures identiﬁed in labor-intensive in situ training processes. Recent work shows

that autonomous power usage monitoring can be achieved by supplementing a smart meter with

distributed sensors that detect the working states of appliances. However, sensors must be carefully

installed for each appliance, resulting in high installation cost. Chapter 3 presents Supero – the

ﬁrst ad hoc sensor system that can monitor appliance power usage without supervised training.

By exploiting multi-sensor fusion and unsupervised machine learning algorithms, Supero can

classify the appliance events of interest and autonomously associate measured power usage with

the respective appliances.

Previous systems for ﬁne-grained power usage monitoring can be broadly classiﬁed into two

categories. The ﬁrst category, direct sensing, measures per-appliance power usage by smart plugs

[18] and smart switches [17]. As smart plugs are placed between the appliances and power outlets,

they cannot be used for appliances hardwired to power lines, such as ceiling lights. Replacing

normal wall switches with smart switches needs cumbersome hardwiring and possibly expensive

modiﬁcations to walls. In light of the installation overhead, direct sensing is suitable only when

permanent monitoring is desired. However, for identifying power wastage and diagnosing ineﬃcient

appliances, a swift one-oﬀ deployment for a short time period (e.g., a few weeks) is typically

suﬃcient. The second category, indirect sensing, is less intrusive as it infers the working states

and energy consumption of individual appliances by detecting their power usage patterns [15, 35]

or ambient signals they emit during operation [14, 22]. However, these techniques require either

labor-intensive in situ supervised training, due to their dependency on the appliance characteristics

[15] and electrical wiring [35, 14], or careful sensor installation for each appliance [22], leading to

high installation cost and reduced usability.

Supero aims to design a residential power usage monitoring system that (i) uses only inexpensive

and easy-to-install sensing devices, (ii) can be deployed by non-professional users with straight-

forward instructions, and yet (iii) can work eﬀectively based on a small amount of easily obtained

prior information without resorting to supervised in situ training. Such a system must automatically

detect events of interest, autonomously associate the events with the correct appliances, and ﬁnally

infer the power usage of each appliance. This brings three key challenges. First, inexpensive sensors

typically have limited sensing capabilities; hence, they can produce false alarms or miss important

events of monitored appliances. Second, when sensors are installed in an ad hoc manner, multiple

sensors may detect the same event, and it becomes diﬃcult to associate the event with the appliance

that is the source of the event. Lastly, to make the system practical, we must minimize the amount

of prior information that users will need to collect.

Another area which can beneﬁt from the use of WSNs is seismic monitoring. It is estimated

that approximately 500 million people live within the risk range of volcanoes [7]. Unfortunately, the

dynamics of volcanoes are not fully understood. Volcano tomography provides valuable information

concerning the internal structure of a volcano. Volcano tomography allows a seismologist to

visualize the internal geological structure of a volcano [24, 25, 13]. Using the travel times from

seismic event sources to multiple monitoring stations, it is possible to model the propagation speed

of the seismic waves as they travel through diﬀerent portions of the structure. The speed is aﬀected

both by the density of the material as well as its temperature. When a volcano transitions from

a dormant state to a crisis state, as indicated by a marked increase in seismic activity, the fear is

that an eruption could occur. If seismologists could observe the internal structural changes in

real time, they would have a better idea of whether an eruption is imminent and whether nearby population centers

should be evacuated.
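The idea of inferring propagation speeds from travel times can be sketched with a toy example. Each travel time is modeled as the sum, over the blocks a ray crosses, of path length times slowness (the reciprocal of speed), and the slownesses are recovered with Kaczmarz iterations (the algebraic reconstruction technique). The 2 x 2 grid, the straight-ray geometry, the velocities, and the choice of solver are all assumptions made for illustration; they are not the tomography algorithm used in this dissertation.

```python
import math


def kaczmarz(path_lengths, travel_times, sweeps=500):
    """Solve sum_j path_lengths[i][j] * slowness[j] = travel_times[i]
    by cyclic projection onto one ray equation at a time."""
    slowness = [0.0] * len(path_lengths[0])
    for _ in range(sweeps):
        for row, t in zip(path_lengths, travel_times):
            norm2 = sum(a * a for a in row)
            if norm2 == 0.0:
                continue  # this ray crosses no block
            residual = t - sum(a * s for a, s in zip(row, slowness))
            step = residual / norm2
            slowness = [s + step * a for s, a in zip(slowness, row)]
    return slowness


# Hypothetical 2 x 2 grid of 1 km blocks, ordered
# [top-left, top-right, bottom-left, bottom-right].
D = math.sqrt(2.0)  # length of a diagonal crossing of one block, in km
PATHS = [
    [1, 1, 0, 0],  # horizontal ray through the top row
    [0, 0, 1, 1],  # horizontal ray through the bottom row
    [1, 0, 1, 0],  # vertical ray through the left column
    [0, 1, 0, 1],  # vertical ray through the right column
    [D, 0, 0, D],  # diagonal ray, top-left to bottom-right
    [0, D, D, 0],  # diagonal ray, top-right to bottom-left
]
TRUE_VELOCITY = [3.0, 3.0, 3.0, 2.0]  # km/s; one slow block
true_slowness = [1.0 / v for v in TRUE_VELOCITY]

# Synthetic, noise-free travel times generated from the true model.
TIMES = [sum(a * s for a, s in zip(row, true_slowness)) for row in PATHS]

estimated_velocity = [1.0 / s for s in kaczmarz(PATHS, TIMES)]
print([round(v, 3) for v in estimated_velocity])
```

With noise-free synthetic data and enough distinct ray paths, the estimate converges to the true block velocities. With too few stations, the system is under-determined, which is one way to see why the number of monitoring stations matters.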

Current volcano monitoring systems developed by the seismic community cannot achieve real-time

tomography due to the small number of stations deployed around active volcanoes. Most existing

volcano monitoring systems employ expensive broadband seismic stations. Portable traditional

stations cost $20,000 US and are bulky and difficult to deploy. Consequently, even the most

active volcanoes today are monitored by fewer than 20 nodes [42], which is insufficient to perform

real-time tomography. The stations typically sample seismic data at a rate of 50-200 Hz for

offline batch processing. With such a small number of nodes and a batch processing mechanism,

existing systems do not have the capability to capture the physical dynamics of volcanoes with

suﬃcient detail. This limits our ability to study volcanic activities and the internal physical

dynamics of volcanoes. Therefore, an online seismic monitoring system could lead to substantial

scientiﬁc discoveries on the geology and physics of active volcanoes and similar applications. This

requires a large-scale network with real-time sensing and online in-network processing capabilities.

Many low-power sensor systems have been developed [51, 42], but they mainly focus on data

Figure 1.1: Tungurahua Volcano near Baños - Ecuador Deployment

collection and do not address the challenge of the data-intensive computations needed to perform

real-time volcano tomography. With the high sampling rate required for seismic sensing, it is

virtually impossible to collect and transmit live data from a large-scale dense sensor network, due

to severe limitations of energy and network bandwidth with current, battery-powered sensor nodes.

Designing a sensor platform for real-time volcano tomography faces several challenges. System

hardware and software need to be carefully designed in tandem to ensure sufficient compute power

is available while remaining power-efficient, minimizing the network bandwidth used, and staying within

the cost budget. Due to the significant data intensity, the system likely consists of heterogeneous

platforms processing the information at multiple levels, such as sensor nodes, local cluster head,

and remote server. The complex trade-oﬀs between multiple system performance metrics such as

delay, energy consumption, and sensing accuracy must be carefully analyzed.

A seismic sensor network capable of performing volcano tomography was developed. This

system performs real-time seismic event detection and identiﬁes the seismic wavefront arrival

times at sensor nodes. To achieve desirable event detection accuracy while keeping computation

overhead low, we develop a new seismic event detection framework that extends and integrates

two earthquake detection algorithms and a data fusion scheme. Moreover, to leverage the system

heterogeneity, we design a novel compute task assignment scheme based on analytical models of

system delay and lifetime. Processing tasks can be assigned to the right system tiers such that

all real-time and energy requirements are met. Through two ﬁeld deployments of the system on

Tungurahua Volcano, Ecuador, and Llaima Volcano, Chile, in 2012 and 2015, respectively, it is

shown that the system can survive the harsh environmental conditions found in volcanic regions.
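To make the task assignment idea concrete, the sketch below enumerates every way of placing a four-stage pipeline on three tiers (sensor node, base station, remote server) and keeps the placement with the lowest battery energy that still meets a deadline. The stage names, execution times, power draws, data sizes, and link rates are hypothetical placeholders, and exhaustive search stands in for the model-based scheme developed in Chapter 4; this is an illustration of the trade-off, not the dissertation's algorithm.

```python
from itertools import combinations_with_replacement

TIERS = ("node", "base", "server")  # ordered from sensor to back end
TASKS = ("detect", "pick", "locate", "tomography")

# All numbers below are hypothetical placeholders.
EXEC_S = {  # execution time of each task on each tier, in seconds
    "detect": {"node": 0.5, "base": 0.1, "server": 0.01},
    "pick": {"node": 10.0, "base": 0.3, "server": 0.05},
    "locate": {"node": 60.0, "base": 5.0, "server": 0.5},
    "tomography": {"node": 3600.0, "base": 120.0, "server": 10.0},
}
# Active power in watts; the server is mains powered, so it is not counted.
POWER_W = {"node": 0.2, "base": 5.0, "server": 0.0}
IN_BYTES = {"detect": 30_000, "pick": 6_000, "locate": 64, "tomography": 64}
LINK_BPS = (2000.0, 1000.0)  # bytes/s for node->base and base->server


def evaluate(assignment):
    """Return (end-to-end delay in s, battery energy in J) for one assignment."""
    delay = energy = 0.0
    at = 0  # index of the tier currently holding the data
    for task, tier in zip(TASKS, assignment):
        target = TIERS.index(tier)
        for hop in range(at, target):  # ship this task's input up the tiers
            transfer_s = IN_BYTES[task] / LINK_BPS[hop]
            delay += transfer_s
            energy += transfer_s * POWER_W[TIERS[hop]]
        at = target
        delay += EXEC_S[task][tier]
        energy += EXEC_S[task][tier] * POWER_W[tier]
    return delay, energy


def best_assignment(deadline_s):
    """Lowest-energy assignment meeting the deadline, or None if none does."""
    best = None
    # Data only moves upward, so only non-decreasing tier sequences are tried.
    for idx in combinations_with_replacement(range(len(TIERS)), len(TASKS)):
        assignment = tuple(TIERS[i] for i in idx)
        delay, energy = evaluate(assignment)
        if delay <= deadline_s and (best is None or energy < best[2]):
            best = (assignment, delay, energy)
    return best


print(best_assignment(30.0))
print(best_assignment(15.0))
```

With these placeholder values, a 30 s deadline keeps detection and picking on the node, while a 15 s deadline moves picking to the base station at a slightly higher energy cost. Re-running the search as measured execution times or link rates change is one simple way to make the assignment dynamic.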

The system design and deployment experience provides important insight into developing

sensor networks in other application domains that require sophisticated, high-fidelity processing

of intensive, highly dynamic, and complex physical information within the network. In summary,

the main contributions are: 1) the design and implementation of a new system for seismic monitoring

consisting of a low-cost sensor node (< $500 US) with mesh data network communication, a base

station supporting data-intensive computations and satellite link communication, and a remote server

performing real-time network monitoring and tomography computations; 2) an empirical

delay analysis of the tomographic processing pipeline and a task assignment scheme

specifically targeted at computationally intensive real-time applications; 3) a description of the experiences

of two field deployments to volcanic regions; and 4) a systematic evaluation of our system using results

from these deployments.

CHAPTER 2

RELATED WORK

Research related to the WSN applications presented in this dissertation centers on three

areas. The first area is the overall approach to sensing, i.e., direct or indirect sensing. For example,

the temperature of an object could be measured by directly attaching a temperature probe or

indirectly from a distance using an infrared sensor. This is also the case when trying to determine

appliance power usage. A second area of research involves sensor systems which are very

data-intensive. In the past, WSNs sampled sensors at rates ranging from one reading per second to

one reading every several minutes. Data-intensive systems sample sensors at much higher rates,

ranging from 100 Hz to several kilohertz. The third area of research relates to assigning computation

tasks to diﬀerent hardware elements based on computation capabilities, power consumption and

communication delays.

The following discusses representative indirect sensing approaches for appliance power usage

monitoring, and identiﬁes their diﬀerences from Supero. Early work in this area [15, 8, 10] utilizes

per-appliance power operating characteristics, measured at power panels, to disaggregate the total

energy consumption. These approaches need either in situ training [15, 10] or a comprehensive

database of a priori power characteristics of appliances [15, 8]. Jiang et al. [19] present the

experience of monitoring the power usage of a laboratory using smart plugs [18] and light sensors.

In [20], binary sensors are used to help deploy power meters to estimate energy breakdowns in a

building. Both of the studies [19, 20] exploit the tree topology of the subject power supply system.

Patel et al. [35] detect and classify electrical events based on transient noises generated by the

appliances. Their transient signatures are heavily inﬂuenced by the electrical wiring, which results

in the need for in situ training. In [14, 45], appliances are recognized based on their electromagnetic

interference and acoustic signals. Similarly, this work requires labor-intensive in situ training. A

typical training process involves switching on/oﬀ appliances, and collecting and labeling signals.

Recently, Ho et al. [16] use a thermal camera to detect the on/oﬀ states of appliances and infer the

per-appliance energy consumption. The thermal camera can be hard to install and can raise privacy

concerns in residential environments.

ViridiScope [22] is a ﬁne-grained power usage monitoring system closest in design to Supero. It

features an autonomous regression framework that can calculate per-appliance energy consumption

based on the appliances’ working states and the total household power trace. ViridiScope detects

the working states by carefully installed sensors. For instance, light and magnetic sensors are placed

in close proximity to or attached to each appliance, and must not be triggered by other appliances to

ensure correct power estimation. Such an installation of the sensors is hard for diﬃcult-to-access

appliances such as ceiling lights. In Supero, due to the use of unsupervised learning and novel

sensor fusion/association techniques, sensors are not dedicated to speciﬁc appliances, and so can

be deployed in an ad hoc manner, leading to signiﬁcantly lower installation costs. ViridiScope

uses two acoustic sensors to monitor a fridge compressor and reject ambient noises. In Chapter

3, we propose a systematic approach for monitoring a range of acoustic appliances, which jointly

processes the data from all acoustic sensors to detect the appliances’ working states.

The task partitioning presented in ORBIT [29] dispatches the execution of sensing and processing

tasks in a smartphone-based multi-tier architecture to meet the requirements of data-intensive

applications. Signal processing timing profiles can exhibit significant variations in real scenarios.

To address this, ORBIT measures the statistical timing proﬁles at runtime, and periodically reﬁnes

the partitioning results. ORBIT maximizes the battery lifetime subject to the application-speciﬁc

latency constraints. Moreover, in order to support ﬁne-grained task partitioning across the tiers,

the developer speciﬁes the application’s task structure as well as real-time requirements via either

Java annotations or an XML-based application model provided by ORBIT. ORBIT also provides

a messaging interface to support a unified data-passing mechanism between heterogeneous tiers and

between diﬀerent application components.

There have been a number of wireless sensor network (WSN) deployments over the past decade.

The following discussion focuses on sensor systems that target data-intensive applications,

including those monitoring industrial equipment [23], structural health [33, 52, 21, 44], earthquakes

[39, 11] and volcanos [51, 42].

Krishnamurthy et al. [23] installed a WSN in a semiconductor plant and an operating oil tanker

to collect data used to predict equipment failure. Each node sampled multiple vibration sensors at

19.2 kHz. Data was collected at regular intervals for processing and analysis by a backend system.

Only DC component removal was performed in-network.

Xu et al. [52] and Paek et al. [33] developed WSNs for structural monitoring using Mica

motes interfaced to a vibration sensor card sampled at either 100 or 200 Hz. In both cases the focus

was primarily on reliable network communication. Kim et al. [21] installed a 64-node network

on the Golden Gate Bridge to perform structural health monitoring using vibration sensors

sampled at 1 kHz. All analysis was performed off-line. Stoianov et al. [44] used a WSN to

monitor the health of underground pipelines in Boston, MA. The nodes in this system computed

pressure sensor reading summary statistics and transmitted these statistics to a gateway.

Rosi et al. [39] deployed a 16-node landslide monitoring WSN over a 500 square meter area.

The MICAz nodes communicated with a laptop used to relay information via cellular network to a

remote server for storage and display of the sensor readings. A data compression algorithm based

on the motion state was used to reduce network bandwidth. No other in-network processing was

performed. Faulkner et al. [11] used a community-based sensor network to detect rare seismic

events. Cell phones and motion sensors attached to PCs were used to detect seismic events and send

event messages to a cloud fusion center.

Werner-Allen et al. [51] and Song et al. [42] deployed small (< 20 node) WSNs on volcanos.

Both systems employed event detection on the nodes and transmitted the data to a base station for

storage and oﬄine processing. Werner-Allen’s stations did not sample and store sensor readings

during the data communication process resulting in lost events and data. Song’s stations were large

and needed to be transported to the deployment site using a helicopter. Unlike these systems, which

focused on network communication and data collection, our system targets in-network seismic

signal processing and real-time compute-intensive tomography.

Various task oﬄoading schemes for smartphones have been developed recently. Spectra [12]

allows programmers to specify task partitioning plans given application-specific service requirements.

Chroma [3] aims to reduce the burden of manually defining the detailed partitioning plans.

Medusa [38] features a distributed runtime system to coordinate the execution of tasks between

smartphones and the cloud. Turducken [43] adopts a hierarchical power management architecture,

in which a laptop can oﬄoad lightweight tasks to tethered PDAs and sensors. While Turducken

provides a tiered hardware architecture for partitioning, it relies on the application developer to

design a partitioned application across the tiers to achieve energy eﬃciency.

The MAUI system [5] enables a ﬁne-grained oﬄoading mechanism to prolong the smartphone’s

battery lifetime. However, MAUI relies on the properties of the Microsoft .NET managed code

environment to identify the functions that can be executed remotely. When a function is executed

remotely, MAUI assumes the energy associated with its local execution is saved. In contrast,

ORBIT does not rely on any language-specific environment and its measurement-based power

proﬁles account for many realistic power characteristics such as CPU sleep, wake up and tail time.

The Wishbone system [31] also features a task dispatch scheme. Wishbone uses the CPU and

network timing proﬁles only to ﬁnd the optimal task partition. Moreover, Wishbone depends on

the timing proﬁles based on sample data under the assumption that the sample data can represent

actual runtime data. However, signal processing timing profiles can exhibit significant variations

in real scenarios due to the variations in data complexity. Moreover, Wishbone formulates the

partitioning problem as a 0/1 integer programming problem and thus supports two tiers only.

CHAPTER 3

UNSUPERVISED RESIDENTIAL POWER USAGE MONITORING USING A

WIRELESS SENSOR NETWORK

This chapter presents the design and implementation of Supero – a system for unsupervised power

monitoring.

To detect appliance operating modes, Supero utilizes an indirect sensing approach. With this

approach, sensors are not directly attached to each appliance. This also allows each sensor to

potentially monitor more than one appliance at a time. Using some a priori information provided

by a homeowner, Supero is able to estimate the energy consumption with errors less than 7.5%.

Supero utilizes a smart meter to measure real-time total household power consumption and

inexpensive light and acoustic sensors that are deployed in an ad hoc manner to detect interesting

events of appliances. It uses multi-sensor fusion to correlate data collected by power, light, and

acoustic sensors and reduce possible sensing errors. By using advanced unsupervised clustering

algorithms, Supero analyzes the signal signatures of diﬀerent appliances and identiﬁes the events

generated by the same appliance. Moreover, Supero autonomously associates the classiﬁed events

with the appliances through an optimization framework that accounts for environmental factors

such as light signal propagation. Given a small amount of easily obtained prior information such as

sensor-appliance distances and rated powers of a small subset of the appliances, our unsupervised

algorithms work together to disaggregate the total household energy consumption into usage by the

individual appliances.

To the best of our knowledge, Supero is the ﬁrst practical ad hoc sensor system that can

accurately monitor appliance power usage without supervised training. Supero aims at swift one-

oﬀ deployments for power usage diagnosis over short time periods (e.g., a few days to weeks). As

such, there should be little concern about user privacy or any negative visual impact of the sensor

installation.

Supero was prototyped using a network of TelosB/Iris motes [28, 27] and a smart meter, and

evaluated in five real homes of different sizes and with different characteristics of electricity

consumption. A 10-day evaluation in an apartment shows that Supero can estimate the energy

consumption with errors less than 7.5%. The results also demonstrate that Supero can be quickly

deployed by non-professionals with considerable ﬂexibility.

The remainder of this chapter is organized as follows. Section 3.1 presents the overview of

Supero. Section 3.2 presents the event detection and multi-sensor fusion algorithms. Section 3.3

presents the unsupervised event clustering and autonomous appliance association approaches.

Section 3.4 discusses estimating the power consumption of a class of high-power heating appliances.

Section 3.5 and Section 3.6 present our system implementation and evaluation results, respectively.

Section 3.7 concludes this chapter.

3.1 Overview of Supero

3.1.1 Design Objectives and Challenges

The goal of Supero is to produce ﬁne-grained electricity usage reports over speciﬁc time durations

in a household. A report includes the energy consumption of particular appliances, as well as

when they were turned on/oﬀ. Supero is designed to meet the following three objectives. First, it

should be possible to deploy the sensors in an ad hoc and non-intrusive manner. A non-professional

should be able to deploy battery-powered wireless sensors with intuitive instructions such as “place

a light sensor with unobstructed view to the light” and “place an acoustic sensor on top of the

microwave.” Second, we aim to reduce the effort needed for system configuration by avoiding the use

of labor-intensive training and extensive user inputs. Third, Supero should be able to operate for a

long enough time period (e.g., a few weeks) without changing the sensors’ batteries, such that the

generated report is meaningful and informative enough for identifying wasteful energy usage and

diagnosing eﬃciency problems in appliances.

Four major challenges are brought by the above design objectives. First, in an ad hoc deploy-

ment, a sensor may pick up signals emitted by multiple appliances, which can make it diﬃcult to

pinpoint the appliance that is consuming power. For instance, a light sensor can sense light emitted

by various sources, and an acoustic sensor in the kitchen can hear sounds from the exhaust fan,

disposer, microwave, etc. Second, without careful installation, sensors typically suﬀer from sensing

errors caused by ambient noises and human activities. For instance, light sensors can report false

alarms when nearby window blinds are opened, and acoustic sensors may pick up sounds such

as human conversations that are unrelated to power consumption. Third, without in situ system

training, unsupervised learning often requires more prior information than supervised learning. In

Supero, we strive to reduce the burden on users to obtain the prior information required, while

maintaining good monitoring accuracy. Finally, to extend the system lifetime, wireless sensors

should adopt lightweight sensing algorithms and minimize the data transmissions, which however

raises challenges for accurate monitoring of appliance working states.

3.1.2 Motivating Observations

To meet the aforementioned objectives, Supero utilizes a household power meter and a small number

of inexpensive light and acoustic sensors that are deployed in the home in an ad hoc manner. Based

on an unsupervised approach, it does not require any in situ system training. Rather, it leverages a

small amount of prior information that can be easily obtained by non-professional users. We now

discuss several important observations that motivate our approach.

Real-time total household power metering. Nowadays, the real-time total household power

consumption can be easily measured by installing a commercial off-the-shelf smart meter (e.g.,

TED [6] and AlertMe [1]) on the main circuit panel. These meters are inexpensive and

most of them can be easily installed without hardwiring to the power lines. Moreover,

as the coverage of smart grid increases, the real-time total household power readings are

increasingly available to the homeowners, without resorting to a personal smart meter.

Sensing modalities. According to a survey by the U.S. Department of Energy [48], the average

distribution of electricity consumption in a household is: heating 24%, lights 24%, air conditioners

20%, refrigerators 15%, dryers 9%, and electronics 9%. As most heating appliances consume

substantially more power than other appliances, their consumption trace often can be

identiﬁed from the real-time total household power readings. Most lights, air conditioners,

refrigerators and dryers emit light and acoustic signals. As a result, on average, more than

90% of the power consumption of a typical household can be captured by a combination of a smart

meter and a set of light and acoustic sensors.

Useful prior information. To avoid expensive in situ system training, Supero leverages unsupervised

learning techniques and a small amount of prior knowledge including rough sensor-

appliance distances and the rated powers of a small subset of appliances. As the light/acoustic

signal decays with the distance from the source appliance, the distances between sensors and

appliances provide important hints for associating the detected events to the right appliances.

Moreover, although the rated power of an appliance often differs slightly from the

actual power consumption, it helps identify the consumption trace of a small number of

diﬃcult-to-detect appliances from the household power readings. Rated powers are often

available from the labels on the appliances or the user manuals. Moreover, there exist a few

publicly available databases (e.g., [46]), which provide rated power based on the appliance

brand and model.

3.1.3 System Architecture

Supero consists of a number of wireless sensors distributed in the home being monitored, a smart

meter, and a base station for receiving information from the sensors and the smart meter. In this

work, we only consider light and acoustic sensors while other sensing modalities such as infrared

can be easily incorporated by Supero.

Figure 3.1: The Supero architecture. (Light/acoustic sensors send event features and the smart meter sends power readings to the base station, which runs power event detection, multi-modal data correlation, unsupervised event clustering, and cluster-appliance association. A graphic configuration interface supplies 1) light-sensor distances, 2) acoustic appliances' properties, and 3) appliances' rated powers; the output is fine-grained power usage.)

Fig. 3.1 illustrates the two-tiered architecture of Supero. In the first tier, sensors sample signals

and detect events that are possibly caused by switching appliances on/off. On the detection of

an event, a sensor extracts various features of the event and sends an event message to the base

station. Further details of the first tier will be presented in Section 3.2. The base station provides a

graphic configuration interface that allows the user to input prior information such as sensor-appliance

distances and appliances' rated powers. When Supero is requested to generate a power usage report,

the base station executes the following second-tier algorithms based on the collected data and the

prior information input by the user:

Multi-modal data correlation. The base station correlates sensor events and power readings to

diﬀerentiate between true appliance events and false alarms unrelated to power consumption.

(Section 3.2.4)

Unsupervised event clustering and event-appliance association. Leveraging unsupervised clustering

algorithms, we can classify the events generated by an appliance into the same cluster,

and estimate the power consumption of the appliance by correlating the events with measure-

ments by the smart meter. Supero associates the classiﬁed events with their appliances based

on features of the events and the prior information. It then calculates the energy consumption

of each appliance. (Section 3.3)

3.2 Event Detection and Data Correlation

3.2.1 Light Event Detection

Light sensors detect the state changes of lights from changes in the light readings. We apply an

exponential difference filter (EDF) to the light intensity samples to detect light events. The EDF is

lightweight and resilient to sensing noise and natural ambient light changes. Specifically, using

two settings for the coefficient of the exponential moving average (EMA), the sensor computes the

short-term and long-term EMAs, denoted by $\bar{x}_s$ and $\bar{x}_l$, respectively, over the periodic light samples

(4 Hz in our implementation). Note that a historical light reading has higher weight in $\bar{x}_l$ than in

$\bar{x}_s$. If $|\bar{x}_s - \bar{x}_l|$ stays above a threshold for a certain number of readings, the sensor reports a

light event message, which includes the current reading as well as the two averages. Moreover, it

sets $\bar{x}_l = \bar{x}_s$ to adapt $\bar{x}_l$ quickly to the most recent readings. The coefficients and thresholds used

in the EDF are carefully tuned in offline experiments such that the EDF is resilient to normal human

movements. Fig. 3.2 shows the operation of the EDF on the sensor readings when two lights are

turned on/off and a person moves around. It can be seen that the light events can be accurately

detected and the human movements do not trigger false alarms. Light sensors may still pick up

events unrelated to power consumption (which we refer to as non-power events), such as those

caused by human movements and the opening/closing of window blinds. These are identified

by the multi-modal data correlation technique given in Section 3.2.4 and then discarded.

Figure 3.2: EDF result on light readings sampled at 4 Hz. Vertical lines represent detections. A
person passes by Light 1 at the 31st and 53rd seconds. (Top panel: light intensity sensor readings; bottom panel: $\bar{x}_s - \bar{x}_l$; x-axis: time in seconds; annotated events: Light 1 on/off, Light 2 on/off, and human movement.)
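
The EDF procedure described in this section can be sketched as follows. This is only an illustrative sketch, not the TinyOS implementation used in Supero: the class name `EdfDetector`, the helper `detect_events`, and all coefficient, threshold, and hold-count values are assumptions chosen for the example, since the tuned values are not given in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EdfDetector:
    """Exponential difference filter; every default value is illustrative."""
    alpha_short: float = 0.5    # EMA coefficient of the short-term average
    alpha_long: float = 0.05    # smaller coefficient, so history weighs more
    threshold: float = 50.0     # minimum |x_s - x_l| that counts as a change
    hold: int = 3               # consecutive readings the gap must persist
    x_s: Optional[float] = None
    x_l: Optional[float] = None
    run: int = 0

    def update(self, x: float) -> Optional[dict]:
        """Feed one sample; return an event message or None."""
        if self.x_s is None:
            self.x_s = self.x_l = x
            return None
        self.x_s += self.alpha_short * (x - self.x_s)
        self.x_l += self.alpha_long * (x - self.x_l)
        if abs(self.x_s - self.x_l) > self.threshold:
            self.run += 1
        else:
            self.run = 0
        if self.run >= self.hold:
            event = {"reading": x, "short": self.x_s, "long": self.x_l}
            self.x_l = self.x_s   # adapt the long-term average quickly
            self.run = 0
            return event
        return None


def detect_events(samples: List[float]) -> List[int]:
    """Return the sample indices at which an event message is reported."""
    detector = EdfDetector()
    return [i for i, x in enumerate(samples) if detector.update(x) is not None]
```

With these assumed settings, a trace that steps from 1800 to 2200 and back produces one event shortly after each step, while a small alternating flicker produces none.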

3.2.2 Acoustic Event Detection

A challenge in acoustic sensing is that a high sampling rate is often required to extract event features.

Supero adopts a duty-cycled and adaptive sampling scheme to reduce the energy consumed in

sampling and computation. For each second, an acoustic sensor is active for 0.08 seconds only.

Initially, it samples the signal at 1 kHz when it is active. If the signal energy exceeds a threshold $\eta_A$,

the sensor switches to a high sampling rate of 12.5 kHz to capture more details of the potential event.

As shown in Fig. 3.3, we use three lattice wave digital software filters to decompose the signal into

low-pass, band-pass, and high-pass components. The passbands of the three filters are [0, 900 Hz],

[900 Hz, 3000 Hz], and [3000 Hz, ∞), respectively. The signal energy and zero-crossing counts

of the signals in the whole band and the three subbands are computed as acoustic features and

transmitted to the base station. The sensor remains in the fast sampling mode as long as the signal

energy is above $\eta_A$. We conservatively set $\eta_A$ low such that the acoustic sensors will not

miss any sounds generated by an appliance. Note that, unlike a light event, which refers to the

switching on/off of a light, an acoustic event refers to the sound heard by a sensor. Therefore, the

sensor will continuously report acoustic events while the sound persists. We refer to the switching

or phase change of an acoustic appliance as an acoustic transition. Owing to the intrinsic complexity

of the acoustic modality, acoustic transitions are detected by advanced learning algorithms running

on the base station, as we will discuss in Section 3.3.

Figure 3.3: Acoustic signal is separated into three bands using lattice wave digital filters for
feature extraction. (The whole-band signal and the outputs of the low-pass [0, 900 Hz], band-pass [900, 3000 Hz], and high-pass [3000 Hz, ∞) filters each pass through signal-energy and zero-crossing computations, whose results form the feature packet.)
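
The per-frame feature computation and the adaptive sampling decision can be sketched as below. This is a simplified sketch under stated assumptions: the lattice wave digital filters are not reproduced (the band signals are assumed to be already separated), signal energy is taken to be the mean squared amplitude after DC removal, and all function names are hypothetical. The two sampling rates and the 0.08 s active window are taken from the text.

```python
from typing import Dict, List, Tuple

LOW_RATE_HZ = 1_000      # idle sampling rate (from the text)
HIGH_RATE_HZ = 12_500    # rate used while the energy exceeds eta_A (from the text)
ACTIVE_SECONDS = 0.08    # active time per one-second cycle (from the text)


def _centered(frame: List[float]) -> List[float]:
    """Remove the DC offset of a frame."""
    mean = sum(frame) / len(frame)
    return [x - mean for x in frame]


def signal_energy(frame: List[float]) -> float:
    """Mean squared amplitude of the DC-removed frame (assumed definition)."""
    return sum(x * x for x in _centered(frame)) / len(frame)


def zero_crossings(frame: List[float]) -> int:
    """Count sign changes between consecutive DC-removed samples."""
    c = _centered(frame)
    return sum(1 for a, b in zip(c, c[1:]) if (a < 0) != (b < 0))


def next_rate_hz(frame: List[float], eta_a: float) -> int:
    """Stay in (or enter) fast sampling while the energy is above eta_a."""
    return HIGH_RATE_HZ if signal_energy(frame) > eta_a else LOW_RATE_HZ


def samples_per_cycle(rate_hz: int) -> int:
    """Samples taken during the active part of each one-second cycle."""
    return round(rate_hz * ACTIVE_SECONDS)


def feature_packet(bands: Dict[str, List[float]]) -> Dict[str, Tuple[float, int]]:
    """Energy and zero-crossing count for each (already filtered) band."""
    return {name: (signal_energy(f), zero_crossings(f)) for name, f in bands.items()}
```

Under these assumptions a sensor takes 80 samples per cycle in the slow mode and 1000 samples per cycle in the fast mode, which illustrates why the fast mode is entered only when the energy threshold is exceeded.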

3.2.3 Power Event Detection

As the total power consumption is critical for identifying appliance events and estimating per-

appliance consumption, real-time power readings by the smart meter are transmitted to the base

station for storage. Moreover, the base station applies EDF to detect rapid increases and drops in

the power measured. The thresholds in the EDF are tuned in oﬄine experiments such that power

changes as small as 50 W can always be detected. In this analysis, we assume that the appliances

are not duty-cycled at high rates, except those explicitly speciﬁed. In Section 3.4, we develop an

approach for monitoring high-power duty-cycled appliances (e.g., stove burner) and discuss how to

integrate the approach with Supero.

3.2.4 Multi-modal Data Correlation

Because of their limited sensing capability and the complexity of home environments, the sensors

can easily raise false alarms or miss important on/off events of appliances. For instance, opening/closing

a window blind can trigger the nearby light sensors, and human conversations may trigger

the acoustic sensors. To deal with these sensing errors, we present a two-tiered fusion approach

to correlate the light/power events and acoustic transitions reported by diﬀerent sensors. The ﬁrst

tier uses a short moving window to correlate the events/transitions reported by multiple sensors of

the same modality. The events/transitions falling into the same window are regarded as generated

by the same source. This is equivalent to an OR-rule for decision fusion and can greatly reduce

the overall miss rate. The second tier correlates the results of the ﬁrst tier with readings by the

smart meter to remove false alarms. Speciﬁcally, if the change in power on an event/transition is

smaller than a conservatively low threshold (e.g., 5 W), the event/transition will be discarded. The

evaluation in Section 3.6 shows that this approach is eﬀective in removing sensor false alarms.
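
A minimal sketch of the two-tiered fusion is given below. It makes several assumptions that are not specified in the text: events are represented by timestamps only, the first tier groups events that fall within `window_s` of the first event in a group, and the power change of an event is estimated as the difference between the mean power shortly after and shortly before it. The function names and the window and margin values are hypothetical; only the 5 W example threshold comes from the text.

```python
from typing import List, Tuple

PowerTrace = List[Tuple[float, float]]  # (timestamp in s, total power in W)


def fuse_same_modality(timestamps: List[float], window_s: float) -> List[List[float]]:
    """Tier 1: events of one modality inside the same window share a source."""
    groups: List[List[float]] = []
    for t in sorted(timestamps):
        if groups and t - groups[-1][0] <= window_s:
            groups[-1].append(t)
        else:
            groups.append([t])
    return groups


def power_change(trace: PowerTrace, t: float, margin_s: float = 3.0) -> float:
    """Absolute change of mean power around time t (assumed estimator)."""
    before = [p for ts, p in trace if t - margin_s <= ts < t]
    after = [p for ts, p in trace if t < ts <= t + margin_s]
    if not before or not after:
        return 0.0
    return abs(sum(after) / len(after) - sum(before) / len(before))


def correlate(timestamps: List[float], trace: PowerTrace,
              window_s: float = 2.0, min_change_w: float = 5.0) -> List[float]:
    """Tier 2: keep one event per group, and only if the power changed."""
    kept = []
    for group in fuse_same_modality(timestamps, window_s):
        if power_change(trace, group[0]) >= min_change_w:
            kept.append(group[0])
    return kept
```

In this sketch, two sensors reporting the same switch-on within the window yield a single retained event, and an event with no accompanying change in the power trace (for example, a window blind being opened) is dropped.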

3.3 Event Classiﬁcation and Appliance Association

Figure 3.4: Light feature vectors of two sensors. (x-axis: feature of Sensor 1 ($x_1$); y-axis: feature of Sensor 2 ($x_2$); series: Light 1, Light 2, Light 3.)

Figure 3.5: Light intensity vs. distance (cm) in log-scale. (x-axis: ln(Distance from light source); y-axis: ln(Intensity); series: 50 W, 100 W, 150 W.)

Figure 3.6: Acoustic event clustering and transition detection for a 3-speed fan. (a) The number of
phases is identified as three; (b) Clustering and transition detection results, where the Y-axis is the
major principal component (PC) and vertical lines represent the detected acoustic transitions. (Panel (a) plots $\det(S_b(k))/\det(S_w(k))$ against $k$; panel (b) plots the major PC ($\times 10^4$) against time in minutes for clusters 1 to 3.)

A novel feature of Supero is that it automatically classifies the detected events and associates

them with the right appliances, without any in situ system training. Supero does this in two

steps:

Event Clustering. Events are clustered using the features extracted by the sensor nodes. For lights,

light intensity as measured by each light sensor is used. Fig. 3.4 shows the feature vectors

measured by two light sensors when three standing lights nearby the sensors were turned on

and oﬀ. We can clearly see that the feature vectors are clustered together.

For acoustic appliances, the overall sound intensity and the intensity in three specific frequency bands

are used. To reduce the number of acoustic features, principal component analysis is used

and the principal components accounting for 99% of the variance are retained. A challenge when clustering

features is determining the number of clusters to use. For light events, we know the number

of monitored lights. This is used as the number of clusters which should be given to the

k-means clustering algorithm. For acoustic events this is more diﬃcult. Some appliances

have multiple modes with each mode consuming a diﬀerent amount of power such as a three

speed ﬂoor fan. In this case each acoustic cluster may represent a diﬀerent mode or appliance.

The number of actually used modes of an appliance depends on the habit of the user and is

therefore unpredictable. Thus, the number of clusters that will be present is not known in advance.

Supero estimates the number of clusters by comparing the between-cluster and within-cluster

variance matrices for varying numbers of clusters. As a simple example, Fig. 3.6 shows a case

study using only one acoustic sensor to detect the phase changes of a 3-speed fan. As shown

in Fig. 3.6(a), the optimum number is identified as 3 based on the acoustic event features shown

in Fig. 3.6(b). Transitions between modes can be identiﬁed as transitions between clusters

over time as shown by the vertical lines in Fig. 3.6(b). In either case, the mean power change

associated with each event in a given cluster is used as the power consumption for that device

or mode.

Appliance Association. To associate light events with the actual lights, two pieces of information are

used: a) the prior information of how far a specific light is from each light sensor and b) the fact that the

decay of light intensity follows the power law, as shown in Fig. 3.5. Using these, the association

algorithm described in detail in [36] is applied to associate a light with each light event

cluster. For associating appliances with acoustic events, the change in sound intensity is used.

The assumption is made that the sensor registering the greatest change is the sensor closest

to the appliance generating the acoustic event. A heuristic association algorithm described

in [36] is used. There may be appliances that generate power events but do not generate light

or acoustic events. These are referred to as unattended appliances. The approach to handling

unattended appliances is also described in [36].
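
For the light-event case, where the number of clusters equals the known number of monitored lights, the clustering step and the per-cluster power estimate can be sketched as follows. This is a plain k-means sketch under stated assumptions, not Supero's implementation: the farthest-point initialization, the function names, and the use of the magnitude of each power change (so that on and off events of the same light count equally) are choices made for the example. The PCA reduction, the estimation of the number of acoustic clusters, and the association algorithm of [36] are not shown.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


def _dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Point], k: int, iters: int = 50) -> List[int]:
    """Cluster feature vectors into k groups and return one label per point."""
    # Deterministic farthest-point initialization (an assumption of this sketch).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(_dist2(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: _dist2(p, centroids[j]))
                  for p in points]
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's seed
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels


def mean_power_per_cluster(labels: List[int],
                           power_changes_w: List[float]) -> Dict[int, float]:
    """Mean magnitude of the power change over the events of each cluster."""
    grouped: Dict[int, List[float]] = {}
    for lab, dp in zip(labels, power_changes_w):
        grouped.setdefault(lab, []).append(abs(dp))
    return {lab: sum(v) / len(v) for lab, v in grouped.items()}
```

Each point here stands for the feature vector of one light event as seen by two sensors, in the spirit of Fig. 3.4, and the resulting per-cluster mean is the value that would be used as the power of the corresponding light.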

Figure 3.7: Detecting stove burner. (1) Red curve: Total household power readings when a burner
is working; Blue curve: The reconstructed lower envelope. (2) Standard deviation of power
readings and threshold-based detection results (detection window size: 100 s). (Top panel: power in kW, with the point where the heat level is changed marked; bottom panel: standard deviation in kW, with the detection threshold; x-axis: time in seconds.)

3.4 Duty-Cycled Heating Appliances

As discussed in Section 3.1.2, heating appliances such as stove burners and ovens are major

electricity consumers in homes. Most modern heating appliances duty-cycle to achieve the desired

heat level. For instance, the top part of Fig. 3.7 shows the total household power readings when a GE

JB710ST2SS burner is working. As the cycle can be short (e.g., several seconds), the EDF-based

detector discussed in Section 3.2.3 may have poor performance. In this section, we propose a new

approach to detect the duty-cycling pattern from the total power readings and calculate the related

energy consumption.

A duty-cycled appliance rapidly switches between on and off, causing large variation in power

readings. Thus, we detect the duty-cycling pattern based on the standard deviation of the windowed

power readings. Denoting by $P$ and $\gamma \in (0, 1)$ the power and duty cycle of the appliance, the

standard deviation of the power readings can be derived as $P\sqrt{\gamma - \gamma^2}$. We choose a threshold

of $P\sqrt{0.05 - 0.05^2}$ by conservatively assuming that the duty cycle is greater than 5%. When $P$

is unknown, we can choose a default value of 1.5 kW for $P$ because most duty-cycled heating

appliances have a rated power around 1.5 kW [46]. As a result, the default threshold is 0.327 kW.

To suppress the false alarms caused by other high-power non-duty-cycled appliances, we further

require that the zero crossing count of the mean-removed power readings in a window is at least 2.

The bottom part of Fig. 3.7 shows the standard deviation of the power readings in the top part and

the detection result. We can see that the time duration that the burner is working can be accurately

detected. For the power readings in a window that has a positive detection, we apply the k-means

algorithm with k = 2 and then interpolate the power readings in the cluster with a smaller average

to reconstruct the lower envelope of power consumption (i.e., the background power), as shown in

the top part of Fig. 3.7. With the lower envelope, it is easy to calculate the energy consumption of

the duty-cycled appliance. In typical U.S. homes, stove burners and ovens are the major duty-cycled

heating appliances and they are often components of a range. Supero does not differentiate among

the duty-cycled heating appliances and attributes all of their energy consumption to the range. To address

multiple simultaneously working duty-cycled appliances, the number of clusters, i.e., k, can be ﬁrst

determined by the technique presented in Section 3.3.
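
The detection rule and the envelope-based energy estimate can be sketched as below. The standard deviation expression follows if each reading is modeled as $P$ with probability $\gamma$ and 0 otherwise: the mean is $\gamma P$ and the variance is $\gamma P^2 - \gamma^2 P^2$, which gives $P\sqrt{\gamma - \gamma^2}$. The code is a sketch under stated assumptions rather than Supero's Octave implementation: the function names are hypothetical, the two-cluster split is a plain one-dimensional k-means seeded with the window minimum and maximum, the envelope uses linear interpolation, and readings are assumed to be evenly spaced in time.

```python
import math
from typing import List


def duty_cycle_threshold(rated_kw: float = 1.5, min_duty: float = 0.05) -> float:
    """Threshold P * sqrt(gamma - gamma^2) for the smallest assumed duty cycle."""
    return rated_kw * math.sqrt(min_duty - min_duty ** 2)


def zero_crossings(window: List[float]) -> int:
    """Sign changes of the mean-removed power readings."""
    mean = sum(window) / len(window)
    c = [x - mean for x in window]
    return sum(1 for a, b in zip(c, c[1:]) if (a < 0) != (b < 0))


def is_duty_cycling(window: List[float], threshold_kw: float) -> bool:
    """High standard deviation plus at least two zero crossings."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    return std > threshold_kw and zero_crossings(window) >= 2


def low_cluster_mask(window: List[float]) -> List[bool]:
    """1-D k-means with k = 2; True marks readings in the lower cluster."""
    lo, hi = min(window), max(window)
    mask = [True] * len(window)
    for _ in range(100):
        mask = [abs(x - lo) <= abs(x - hi) for x in window]
        low = [x for x, m in zip(window, mask) if m]
        high = [x for x, m in zip(window, mask) if not m]
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high) if high else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return mask


def lower_envelope(window: List[float]) -> List[float]:
    """Interpolate the lower-cluster readings across the whole window."""
    idx = [i for i, m in enumerate(low_cluster_mask(window)) if m]
    env = []
    for i in range(len(window)):
        left = max((j for j in idx if j <= i), default=None)
        right = min((j for j in idx if j >= i), default=None)
        if left is None:
            env.append(window[right])
        elif right is None:
            env.append(window[left])
        elif left == right:
            env.append(window[i])
        else:
            w = (i - left) / (right - left)
            env.append(window[left] + w * (window[right] - window[left]))
    return env


def appliance_energy_kwh(window_kw: List[float], dt_s: float = 1.0) -> float:
    """Energy above the lower envelope, i.e. attributed to the appliance."""
    env = lower_envelope(window_kw)
    return sum(p - e for p, e in zip(window_kw, env)) * dt_s / 3600.0
```

With the default 1.5 kW and 5% duty cycle, `duty_cycle_threshold()` reproduces the 0.327 kW value quoted above. A single step in power (a non-duty-cycled appliance switching on) can exceed the standard deviation threshold but has only one zero crossing, which is why the second condition suppresses it.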

The rapid duty-cycling can cause significant errors in the EDF-based power event detection

(cf. Section 3.2.3) and the second tier of the multi-modal data correlation (cf. Section 3.2.4).

Hence, when a duty-cycled appliance is detected, Supero disables these two components and the

power changes of the light/acoustic events in this period are set to be missing values. Although such

a design can cause errors for other appliances, it is worthwhile to give priority to the high-power

duty-cycled appliances since they usually dominate the total power consumption of a household.

3.5 Implementation and Deployment

3.5.1 Prototype System Implementation

Sensors and smart meter. The sensors are implemented using TelosB [28] and Iris [27] motes.

TelosB has only a light sensor, while Iris has both light and acoustic sensors. According to our lab

tests, the light sensors on TelosB and Iris have satisfactory isotropic sensitivity in a considerably

large range of incoming angles, which can mitigate the impact of sensor orientation on the accuracy

of the power-law-based association algorithm. The signal sampling and event detection algorithms

described in Section 3.2 are implemented in TinyOS 2.1. The parameters used in these algorithms

are carefully tuned oﬄine and then ﬁxed for diﬀerent deployments. The sensors communicate

directly with the base station. Such a single-hop topology suﬃces for our deployments in three

apartments and two multi-story houses. TED5000 [6] is used to measure the total household power

consumption.

Base station. The base station is a TelosB mote connected to a laptop computer. A daemon service

on the computer retrieves real-time power readings from the TED5000 and stores the received

event messages. The data correlation, clustering, and association algorithms are implemented

in GNU Octave. The energy consumption of an appliance is computed by integrating estimated

power over time. Note that this simple energy calculation method can be easily replaced by the

regression-based method developed in [22] to improve robustness.
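The simple energy calculation described above (integrating estimated power over time) can be sketched as follows. This is a minimal illustration in Python rather than the GNU Octave code used in the prototype; the function name `energy_kwh` and the choice of the trapezoidal rule are assumptions made for this sketch.

```python
from typing import Sequence


def energy_kwh(times_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate a power trace (W) over time (s) with the trapezoidal rule.

    Hypothetical helper: the text only states that energy is obtained by
    integrating estimated power over time; the rule used here is an assumption.
    """
    if len(times_s) != len(power_w):
        raise ValueError("times_s and power_w must have the same length")
    joules = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        joules += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return joules / 3.6e6  # 1 kW·h = 3.6e6 J


# A 150 W light that stays on for 12 minutes (720 s)
print(round(energy_kwh([0.0, 720.0], [150.0, 150.0]), 4))  # 0.03
```
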

Groundtruth Kill-A-Watt meters. In order to evaluate the accuracy of Supero, we built 14

power meters based on the P3 Kill-A-Watt (KAW) Model P4400 [32] to provide groundtruth power

usage data of individual appliances. We connect two ADC channels of a Senshoc mote to two

pins on the internal circuit board of a KAW to sample the voltage and current signals. Senshoc

is a TelosB-compatible mote implementation with signiﬁcantly reduced cost [26]. The Senshoc

mote computes and transmits the real-time power usage data to the base station for storage. Each

modiﬁed KAW is carefully calibrated to output accurate power readings.

3.5.2 System Deployment and Conﬁguration

This section discusses the sensor deployment and initial conﬁguration of Supero.1

Sensor deployment strategies. A necessary condition for correct clustering and association is that

every light/acoustic appliance can be detected, which is referred to as the coverage requirement.

A conservative deployment strategy is to place a sensor close to each appliance. The number of

sensors can be reduced by incrementally placing sensors close to appliances, starting with those

that emit dim light/acoustic signals, until the coverage requirement is met. In our implementation,

1 An online video illustrating the system deployment and configuration can be found at https://youtu.be/4sSZaaV0Kv4

the coverage is checked by switching appliances on and checking that the sensors’ LEDs blink to

indicate detection. Note that this coverage check is diﬀerent from supervised training processes

(e.g., [35]) that are typically conducted after system deployment and involve labelling the events

with the source appliances. After the coverage requirement is met, a few extra sensors may be deployed

in regions without any sensors to provide redundancy and improve robustness. The eﬀectiveness

of the above conservative and incremental deployment strategies will be evaluated in Section 3.6.4.
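The coverage check described above can be expressed as a simple test over the sensors that reacted to each appliance. The sketch below is illustrative only: the function name `uncovered_appliances`, the dictionary layout, and the appliance/sensor values are hypothetical and are not taken from the deployments.

```python
from typing import Dict, List, Set


def uncovered_appliances(detected_by: Dict[str, Set[int]]) -> List[str]:
    """Return the appliances that no sensor detected during the switch-on check."""
    return sorted(name for name, sensors in detected_by.items() if not sensors)


# Hypothetical check results: appliance -> IDs of sensors whose LED blinked
detected_by = {
    "Dining light": {6},
    "Kitchen light": {3},
    "Exhaust fan": set(),  # nothing blinked: place a sensor closer and re-check
}
missing = uncovered_appliances(detected_by)
print(missing)  # ['Exhaust fan']
```

The coverage requirement is met once this list is empty; in the incremental strategy, a sensor would be added near each listed appliance and the check repeated.
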

User inputs. First, Supero needs a list of the monitored appliances, which are categorized as

lights, acoustic, or unattended appliances. Supero also needs to know whether an appliance has

multiple working states although the exact number of the working states is optional. Second, for

the light modality, Supero requires roughly estimated line-of-sight distances between the sensors

and lights. Third, for the acoustic modality, Supero needs to know whether an acoustic appliance

has a primary sensor or not. All the non-primarily monitored acoustic appliances need to be sorted

by their powers. Such a ranking is usually straightforward to obtain, e.g., based on common sense.

Finally, Supero requires the rated powers of the unattended appliances, which can be obtained from

the labels on the appliances or from a database of appliance rated powers. Supero only needs to be

reconﬁgured occasionally, e.g., when sensors/appliances are relocated.
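The user inputs listed above can be pictured as a small configuration record. The following Python sketch is an assumption about how such inputs might be structured and checked; the class names, field names, and validation rules are hypothetical and do not describe the actual Supero web interface or its storage format. The 2.5 m distance is an invented example value.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

CATEGORIES = {"light", "acoustic", "unattended"}


@dataclass
class Appliance:
    name: str
    category: str                          # "light", "acoustic", or "unattended"
    multi_state: bool = False              # whether it has multiple working states
    num_states: Optional[int] = None       # optional exact number of states
    rated_power_w: Optional[float] = None  # needed for unattended appliances
    primary_sensor: Optional[int] = None   # acoustic only: primary sensor ID, if any


@dataclass
class SuperoInputs:
    appliances: List[Appliance]
    # (sensor ID, light name) -> roughly estimated line-of-sight distance (m)
    light_distances_m: Dict[Tuple[int, str], float] = field(default_factory=dict)
    # acoustic appliances without a primary sensor, sorted by power
    acoustic_power_order: List[str] = field(default_factory=list)


def validate(inputs: SuperoInputs) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for a in inputs.appliances:
        if a.category not in CATEGORIES:
            problems.append(f"{a.name}: unknown category {a.category!r}")
        if a.category == "unattended" and a.rated_power_w is None:
            problems.append(f"{a.name}: unattended appliance needs a rated power")
    lights = {a.name for a in inputs.appliances if a.category == "light"}
    measured = {light for (_sensor, light) in inputs.light_distances_m}
    for name in sorted(lights - measured):
        problems.append(f"{name}: no sensor distance given")
    unranked = {a.name for a in inputs.appliances
                if a.category == "acoustic" and a.primary_sensor is None}
    if unranked != set(inputs.acoustic_power_order):
        problems.append("power ranking must list exactly the acoustic "
                        "appliances without a primary sensor")
    return problems


inputs = SuperoInputs(
    appliances=[
        Appliance("Light 1", "light"),
        Appliance("Tower fan", "acoustic", multi_state=True, num_states=3,
                  primary_sensor=13),
        Appliance("Hair dryer", "acoustic"),
        Appliance("Bath fan", "acoustic"),
        Appliance("Rice cooker", "unattended", rated_power_w=500),
    ],
    light_distances_m={(1, "Light 1"): 2.5},
    acoustic_power_order=["Bath fan", "Hair dryer"],
)
print(validate(inputs))  # []
```
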

Conﬁguration interface. We have developed a web conﬁguration interface using JavaServer Pages

served by the base station computer to help the user input all the required information. For instance,

Fig. 3.8(a) shows the conﬁguration for the acoustic sensing, where the user can input the acoustic

sensor IDs, appliance names, and other required information. In addition, we leverage TPCDB [46],

which is an online collaboratively edited database of appliance powers, to help the user input the

required rated powers. Currently, TPCDB comprises the information of more than 500 appliances.

Fig. 3.8(b) shows our interface of querying TPCDB through its web service API, where the user

can ﬁnd the rated power by appliance type, manufacturer and model. The case studies presented in

Section 3.6.6 show that this interface can be easily used by non-professionals.

(a) Acoustic conﬁguration

(b) Rated power database

Figure 3.8: Web conﬁguration interface.

3.6 Experimental Evaluation

3.6.1 Deployments and Evaluation Methodology

We deployed and evaluated Supero in ﬁve real households. We ﬁrst deployed Supero in a 40 m2

single-bedroom apartment (Apartment-1). As most of the appliances in Apartment-1 can be

monitored by groundtruth KAW meters, this deployment allows us to extensively evaluate the

accuracy of Supero. We then evaluated the sensor deployment strategies (cf. Section 3.5.2) in an

80 m2 apartment (Apartment-2). In addition, we deployed Supero in a one-story three-bedroom

ranch house (House-1) to evaluate the portability of Supero to larger homes. Lastly, we recruited

two homeowner volunteers to deploy Supero in their homes, an apartment (Apartment-3) and a two-

story house (House-2). The Apartment-3 and House-2 deployments evaluate if non-professionals

can deploy Supero easily.

We compare Supero with two baseline approaches. The ﬁrst baseline approach (referred to as

Oracle) uses appliances’ groundtruth states and then applies the regression-based energy calculation

method in ViridiScope [22]. In the second baseline approach (referred to as Baseline), the state

of each appliance is detected by the sensor closest to the appliance and then the regression is

applied. The results of Baseline will help us understand the challenges brought by an ad hoc sensor

deployment.

3.6.2 Controlled Experiments in Apartment-1

3.6.2.1 Experimental Settings

The electrical appliances in Apartment-1 include 5 standing lights, a fridge, a water boiler, a 3-

speed tower fan, a rice cooker, a bath fan, a hair dryer, 3 laptop computers, and a WiFi router.

The apartment uses a natural gas range and a steam-based central heating unit that do not draw

electrical power. The deployment consists of 4 TelosB and 5 Iris motes. The Iris motes only detect

acoustic events. The laptops and router cannot be easily detected by sensors. However, as the

router’s rated power is known and it is always on, Supero can estimate its energy consumption. The

residual energy consumption is thus mainly attributed to the laptops. The rice cooker, water boiler,

and fridge are treated as unattended appliances, because they do not emit light or stable acoustic

signals. The water boiler and fridge are also monitored by acoustic sensors. Fig. 3.9 shows the

ﬂoor plan and sensor positions. The sensors are placed on the ﬂoor, a nearby table, chairs, and a

toilet. The positions of the sensors are not carefully chosen except for the tower fan, fridge, and

water boiler. Sensors are deployed close to these quiet appliances. As the bathroom has complex

sound patterns, two acoustic sensors are deployed and both of them can hear all the appliances and

the sound of water ﬂow in the bathroom.

[Figure 3.9 floor plan omitted. The plan (8 m by 5 m) covers the living room, kitchen counter, bedroom, and bathroom, and marks Lights 1 to 5, the tower fan, refrigerator, water boiler, rice cooker, bath fan, and hair dryer, together with Nodes 1, 2, 3, 4, 11, 12, 13, 14, and 15. Legend: TelosB, Iris, appliances.]

Figure 3.9: Apartment-1 deployment.

[Figure 3.10 plots omitted. Four stacked charts share a time axis in minutes (0 to 70): (1) total power (kW) with groundtruth labels of the on/off events; (2) detections of the light sensors at nodes 1 to 4, with associated light IDs and false alarms ("FA") marked; (3) major PCA component with detected acoustic transitions of the fridge, tower fan, water boiler, bath fan, and hair dryer; (4) power change (kW) of the unattended appliances (fridge, water boiler, rice cooker).]

Figure 3.10: Results of the controlled experiment in Apartment-1. (1) The top chart shows the power readings labeled with ground
truths of the events. (2) The bars in the second chart show the detections of the light sensors. Two black bars at around the 35th minute
are false alarms (labeled “FA” in the chart) identiﬁed by the multi-modal data correlation. Clusters are diﬀerentiated by colors and the
overhead numbers are the IDs of the associated lights. (3) The third chart shows the major principal component given by PCA and the
detected acoustic transitions. The acoustic transitions of the same color are associated with the same appliance. (4) The bottom chart
shows the clustered and associated power events of the unattended appliances.

Table 3.1: Energy breakdown for the 1-hour controlled experiment in Apartment-1.

Appliance             KAW                Supero                        Oracle                 Baseline
Name          Rating  Power      Energy  Power         Energy  Error∗  Power  Energy  Error∗  Power  Energy  Error∗
              (W)     (W)        (kW·h)  (W)           (kW·h)  (%)     (W)    (kW·h)  (%)     (W)    (kW·h)  (%)
Light 1       150     152        0.0307  152           0.0309  0.7     154    0.0305  0.7     153    0.0310  1.0
Light 2       150     148        0.0298  150           0.0300  0.7     150    0.0300  0.7     151    0.0305  2.3
Light 3       150     151        0.0300  153           0.0304  1.3     153    0.0306  2.0     152    0.0307  2.3
Light 4       50      60         0.0211  60            0.0210  0.5     61     0.0210  0.5     62     0.0219  3.8
Light 5       100     102        0.0207  100           0.0205  0.5     103    0.0200  3.4     102    0.0206  0.5
Water boiler  1500    1472-1524  0.0490  1481          0.0456  6.9     1479   0.0481  1.8     232    0.0289  41.0
Tower fan     N/A     23-40      0.0031  {23, 28, 35}  0.0029  5.3     N/A    0.0028  9.7     30     0.0045  45.1
Rice cooker   500     498        0.0163  507           0.0168  3.1     508    0.0168  3.1     508    0.0163  0.0
Hair dryer    N/A     442        0.0158  459           0.0150  5.1     462    0.0150  5.1     5      0.0018  88.6
Fridge        N/A†    117-146    0.0784  122           0.0841  7.3     129    0.0795  1.4     119    0.0848  8.2
Bath fan      N/A‡    N/A        N/A     61            0.0020  N/A     60     0.0020  N/A     55     0.0048  N/A
Router        12      12.5       0.0147  13            0.0142  3.4     12     0.0154  4.8     13     0.0154  4.8
3 Laptops     N/A     37-63      0.0468  31            0.0430  8.1     36     0.4840  3.4     53     0.0472  0.9
Average error                                                  3.6                    3.1                    16.5

∗Error is the relative error of energy, in percentage, with respect to the KAW measurements.
†Fridge’s rated power is not available. However, its power events can be correctly associated when a rated power of 80 W to 400 W is given to Supero.
‡Bath fan is hardwired to the power line and hence no KAW is applied for it.

3.6.2.2 Energy Estimation Accuracy

This section presents the results of a controlled experiment, in which we intentionally turned on

and oﬀ the appliances. It allows us to understand the micro-scale performance of Supero. Fig. 3.10

shows the groundtruth information, power readings, event detection and clustering results. Both

of the two light false alarms are identiﬁed by the multi-modal event correlation. No light event is

missed. All the light events are correctly clustered and associated. For the acoustic modality, the

non-power sounds such as a toilet flush and running tap water can be identified by the multi-modal

data correlation. From the third chart in Fig. 3.10, Supero fails to detect the oﬀ event of the fridge

and four events of the water boiler. The missed detections of the water boiler are caused by the delay

of sound. However, as discussed in Section 3.3, by jointly treating the fridge and water boiler

as acoustic and unattended appliances, these misses can be successfully recovered by the events

detected from the power readings. Other detected acoustic transitions including the phase changes

of the 3-speed tower fan can be correctly associated.

Table 3.1 shows the groundtruth measurements by KAWs and the estimation results of the

various approaches. Both Supero and Oracle can accurately estimate the power and energy of

each appliance. The average energy estimation errors of both approaches are below 4%. For a few

appliances, Supero outperforms Oracle. This can be caused by small errors in the groundtruth

measurements by KAWs and the adoption of diﬀerent energy calculation methods in Supero and

Oracle. As Lights 1, 2, and 3 have no nearby sensors, Baseline uses the groundtruth states of Lights

1, 2 and 3. For other appliances, Baseline uses the closest sensor to detect the state of an appliance.

As Baseline does not perform data correlation and event clustering, it generates excessive false

alarms. For instance, as the hair dryer is very noisy, all the acoustic sensors raise detections when

the hair dryer is on, which causes false alarms for all the other acoustic appliances. Hence, Baseline

yields wrong power and energy estimates for several appliances. In fact, it is quite diﬃcult to deploy

dedicated acoustic sensors as they can be easily triggered by any noisy appliances. Acoustic data

from multiple sensors must be jointly processed to produce correct detections.
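The error metric reported in Table 3.1 is the relative error of the estimated energy with respect to the KAW measurement. A minimal sketch of that calculation is given below; the function name `relative_error_pct` is hypothetical, and the example uses the water boiler energies of Supero and KAW from Table 3.1.

```python
def relative_error_pct(estimate_kwh: float, groundtruth_kwh: float) -> float:
    """Relative error of an energy estimate, in percent of the groundtruth."""
    if groundtruth_kwh == 0:
        raise ValueError("groundtruth energy must be non-zero")
    return abs(estimate_kwh - groundtruth_kwh) / abs(groundtruth_kwh) * 100.0


# Water boiler: Supero estimate 0.0456 kW·h vs. KAW measurement 0.0490 kW·h
print(round(relative_error_pct(0.0456, 0.0490), 1))  # 6.9
```
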

3.6.2.3 Impact of Distance Errors

This section evaluates the robustness of the association algorithm in Section 3.3 with respect to the

errors in the light-sensor distances. The distances given to Supero are distorted as follows. First,

we proportionally increase all the distances. As the association algorithm can ﬁnd a best ﬁt scaling

factor β, the association remains correct even if we multiply the distances by 10. Second, we add

a random bias to a particular distance in each test. The result shows that if the bias is within 70%

of the true distance, the association remains correct. Finally, when we exclude Node 2 from the

evaluation, the results remain the same as long as the order of the distances from Node 1 to Light

1 and Light 3 is consistent with reality, i.e., Light 1 is farther from Node 1 than Light 3. These

results demonstrate that Supero is robust to the errors in the light-sensor distances.

3.6.3 10-Day Experiment in Apartment-1

We conducted a 10-day uncontrolled experiment, during which two residents led normal lives in

their apartment. In this section, we first discuss our experiences and lessons learned, and then

present the evaluation results.

3.6.3.1 Experiences and Learned Lessons

We experienced the following three issues during the 10-day experiment.

Power spikes. Power spikes are a typical dynamic in power lines; they can be caused by bad weather

conditions and by turning appliances on/off in the tested home and even neighboring homes. Power spikes

may cause errors in the appliance power estimation. In the controlled experiment, we can see a few

power spikes in the top chart of Fig. 3.10 when an appliance changes state. As we apply a guard

region for computing the power change as discussed in Section 3.2.4, the power spikes do not aﬀect

the results. However, in the 10-day experiment, we observe excessive power spikes as shown in

Fig. 3.11(b) that can aﬀect the calculation of power changes for the detected events. We suspect

that the power spikes observed on September 1 were caused by the thunderstorms during the period

[Figure 3.11 plots omitted. (a) PRR of a KAW, with the data segments "seg 1", "seg 2", and "seg 3", the router failures, and the router reset marked; (b) power trace (kW). Both span 1AM Aug 30 2011 to 11AM Sep 8 2011. (c) 12 hours of power trace (kW) starting at 4AM on September 1, 2011, showing the raw power reading and the filtered reading (window size = 7), with the changes caused by appliances marked.]

Figure 3.11: PRR and power traces in 10 days.

of experiment. A zoomed-in view of the power trace on that day is shown in Fig. 3.11(c). Almost

all power spikes can be removed by a median ﬁlter with a window size of 7 seconds. We also apply

the median ﬁlter with the same setting to the power traces collected in other experiments.
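The spike removal described above can be illustrated with a sliding-window median filter. The sketch below assumes one power reading per second, so that a 7-sample window corresponds to the 7-second window in the text; the function name, the edge handling (shrinking the window at the ends of the trace), and the sample trace are assumptions made for this example.

```python
from statistics import median
from typing import List, Sequence


def median_filter(samples: Sequence[float], window: int = 7) -> List[float]:
    """Replace each sample by the median of a centered window of readings."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    filtered = []
    for i in range(len(samples)):
        lo = max(0, i - half)                  # window shrinks at the edges
        hi = min(len(samples), i + half + 1)
        filtered.append(median(samples[lo:hi]))
    return filtered


# Synthetic trace (kW): a one-sample spike, then a genuine appliance turn-on
trace = [0.40, 0.40, 0.41, 2.30, 0.40, 0.39, 0.40,
         0.90, 0.91, 0.90, 0.90, 0.91, 0.90]
filtered = median_filter(trace)
print(filtered[3])  # 0.4
```

The isolated spike is suppressed, while the sustained step caused by an appliance survives filtering, which is why a median filter suits this use better than a moving average.
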

Router failures. The probe of TED5000 installed on the power panel sends real-time readings

through power lines to the TED5000 gateway, which was plugged into a power outlet and wired to

the WiFi router to deliver readings to the base station computer. However, the router failed twice

during the 10 days, leading to disruptions to the collection of power readings. We had to reset the

router manually to restart the data collection. We suspect that the failures were caused by bugs in

the router. As power readings are critical information to Supero, it is crucial to adopt a high-quality

and stable router. Moreover, when the base station fails to receive power readings for a while, it

can raise an alarm sound to remind the user to reset the router.
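The suggested alarm can be realized with a simple watchdog on the arrival time of power readings. This is only a sketch of the idea: the class name `ReadingWatchdog`, its methods, and the 60-second timeout are assumptions, since the text does not specify how long "a while" should be.

```python
import time
from typing import Callable


class ReadingWatchdog:
    """Flags a stall when no power reading has arrived within timeout_s seconds."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_reading = clock()

    def on_reading(self) -> None:
        """Call whenever a power reading is received from the meter."""
        self._last_reading = self._clock()

    def stalled(self) -> bool:
        """True if the user should be reminded to reset the router."""
        return self._clock() - self._last_reading > self.timeout_s


# Usage with a fake clock so the behavior is easy to follow
now = [0.0]
watchdog = ReadingWatchdog(timeout_s=60.0, clock=lambda: now[0])
now[0] = 61.0
print(watchdog.stalled())  # True
```
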

Communication performance. The quality of wireless links between the base station and sensors

can aﬀect the performance of Supero. Each Supero sensor only sends a packet when an event is

detected while each KAW meter continuously transmits groundtruth power usage to the base station

by the attached Senshoc mote equipped with a CC2420 radio. Therefore, we use the data traces of

KAWs to examine the packet reception ratio (PRR). Fig. 3.11(a) shows the PRR of a KAW during

the 10 days. We can see that the communication performance signiﬁcantly degraded and ﬂuctuated

between the evening of September 1 and noon on September 3. As the residents watched

online videos over WiFi during this period, we suspect that the poor link quality was caused by

the interference from WiFi. We also examined the traces of other KAWs. Similarly, their link

quality degraded during this outage period. We were able to repeat this phenomenon in an extra

experiment using Senshoc motes and two laptops that transferred a large ﬁle over WiFi. Although

the channel of Senshoc was set to 11, which is well separated from channel 6 used by WiFi, the

PRR of Senshoc still signiﬁcantly degraded. However, we did not observe signiﬁcant degradation

of PRR when experimenting with TelosB and Iris motes. Hence, we suspect that the performance

degradation is caused by the imperfect antenna design of Senshoc.
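The PRR used above is the fraction of transmitted packets that the base station actually received. The sketch below assumes that each KAW packet carries a monotonically increasing sequence number, which is a hypothetical packet format chosen for this example; the function name is likewise an assumption.

```python
from typing import Iterable


def packet_reception_ratio(received_seqnos: Iterable[int],
                           first_seq: int, last_seq: int) -> float:
    """PRR over the window of sequence numbers [first_seq, last_seq]."""
    expected = last_seq - first_seq + 1
    if expected <= 0:
        raise ValueError("last_seq must not be smaller than first_seq")
    # Duplicates and out-of-window packets are not counted
    unique = {s for s in received_seqnos if first_seq <= s <= last_seq}
    return len(unique) / expected


# 10 packets sent (1..10); 6 distinct packets received, one of them twice
print(packet_reception_ratio([1, 2, 2, 4, 5, 8, 10], 1, 10))  # 0.6
```
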

Nevertheless, after the 10-day experiment, we enabled packet acknowledgment and added a

retransmission mechanism to enhance the reliability of communication. Due to the router failures

and lost groundtruth information from KAWs, we only use three data segments (“seg 1”, “seg 2”

and “seg 3” shown in Fig. 3.11(a)). The total length of the three segments is more than 7 days. The

three data segments are concatenated and then fed to the clustering and association algorithms.

3.6.3.2 Evaluation Results

Table 3.2 shows the results based on the data of 7 days. During this period, 713 false alarms out

of a total of 859 light events were raised by the light sensors, and 703 of these false alarms

Table 3.2: Energy breakdown during 7 days in Apartment-1∗

Appliance     KAW     Supero                 Oracle                 Baseline
Name          E       P      E       Error   P      E       Error   P      E       Error
              (kW·h)  (W)    (kW·h)  (%)     (W)    (kW·h)  (%)     (W)    (kW·h)  (%)
Light 1       4.14    154    4.17    0.5     152    4.11    0.9     152    4.11    0.9
Light 2       4.96    150    4.96    0.1     149    4.92    0.8     149    4.92    0.8
Light 3       6.15    155    6.24    1.4     155    6.25    1.7     155    6.25    1.7
Light 4       1.45    62     1.45    0.1     62     1.45    0.1     63     1.48    1.7
Light 5       0.39    105    0.39    0.2     105    0.39    0.7     110    0.41    5.5
Water boiler  0.48    1493   0.48    0.5     1491   0.48    1.6     0      0       100
Tower fan     0.15    30     0.21    50      26     0.17    17.9    24     0.24    66.2
Rice cooker   1.00    499    0.98    2.2     513    1.01    1.2     511    1.01    0.8
Hair dryer    0.09    467    0.07    19.2    467    0.09    0.4     3      0.02    73.2
Fridge        12.22   143    11.8    3.7     127    11.8    3.2     127    11.8    3.2
Bath fan      N/A     50     0.12    N/A     57     0.17    N/A     0      0       N/A
Router        2.12    12     2.03    4.3     18     3.04    43.3    18     3.04    43.3
Average error                        7.5                    6.5                    27.0

∗Error is relative error of energy with respect to KAW measurements.

are identiﬁed by the multi-modal data correlation. All the remaining false alarms are identiﬁed as

outliers by the event clustering algorithm (cf. Section 3.3). In addition to the acoustic transitions

generated by the fridge, 60 acoustic transitions were detected. We see that Supero can accurately

estimate the energy consumption of lights. The tower fan was turned on and oﬀ twice and all its

transitions were detected. However, two bath fan transitions were incorrectly associated with the

tower fan, because Node 13 (i.e., the primary sensor for the tower fan) heard loud noises in the

living room at the same time. The two false associations introduce errors in the energy estimates of

the tower fan and hair dryer. As shown in Table 3.2, the average error of Supero is only 7.5%. The

average error of Oracle is 6.5%. Therefore, the performance of Supero is close to that of Oracle.

Baseline still fails to estimate the energy consumption of several appliances due to excessive false

alarms, leading to an average error of 27%.

3.6.4 Experiments in Apartment-2

This section evaluates the performance of Supero under diﬀerent sensor placements. We deployed

6 TelosB and 11 Iris motes in the doorway, living room, and kitchen of Apartment-2, as shown in

Fig. 3.12. As the two doorway lights are in series, they are regarded as one light. As shown in

Fig. 3.13, sensors were placed or attached on the ground, walls, appliances, and furniture. Note that

Figure 3.12: Sensor placements in Apartment-2. The numbers in the squares and circles are the
sensor IDs of TelosB and Iris, respectively. If a TelosB does not face upward, the arrow represents
its facing direction.

the positions of sensors were chosen by common sense without careful planning. We also varied

the positions of sensors in several trials and similar results were observed, as shown later in this

section. We ﬁrst evaluate the light modality. We conducted ﬁve sensor placement trials to monitor

6 lights including incandescent bulbs and ﬂuorescent lamps. Diﬀerent colors of the TelosB motes

in Fig. 3.12 represent diﬀerent placements, which are also labeled with the initials of color names,

i.e., ‘R’, ‘G’, ‘B’, ‘Y’ and ‘BK’. In the red and green placements, a sensor was placed close to each

appliance. The blue and yellow placements follow the incremental strategy to reduce the number

[Figure 3.13 photos omitted. Panel labels: Node 2 (2nd placement), Node 5 (4th placement), Node 11, Node 2.]

Figure 3.13: Sensor installation examples. Sensors were placed on the ground, in the corner of
walls, on the fan of a range, and on a table.

Table 3.3: The set of sensors detecting a light (i.e., Rm) and clustering/association results

Light
Dining
Kitchen
Doorway
Living 1
Living 2
Result

Red
{6}
{3}
{5}

Green

{6}
{3}
{5}

{1,2,4}
{1,2,4,6}

{1,2,4}
{1,2,4,6}

X

X

Blue Yellow Black
{6}
{6}
{3}
{3}
{1}
{1}
{5}
{3}
{6}
{5,6}
X

{6}
{3}
{1}
{5,3}
{5,6}

X

X

of sensors from 6 to 4. In the black placement, no sensor was deployed in the living area. All the

placements ensure the coverage requirement. We conducted a controlled experiment to evaluate

each placement. Table 3.3 shows the set of sensors that can detect the same light. The clustering

and association results of the red to yellow placements are correct. In the black placement, although

all the events can be detected, they cannot be correctly clustered. For instance, although Node 6

can detect the near dining light (13 W) and the farther “living light 2” (150 W), the changes in light

intensity from them are similar, leading to incorrect clustering.

To further demonstrate the ﬂexibility of sensor deployment, we deployed 11 Iris motes and
selected four different subsets of them as sensor placements, which are S1 = {All Iris motes}, S2 =
{10, 12, 14, 15, 16, 18, 20}, S3 = {10, 12, 14, 19}, and S4 = {10, 14}. All the subsets satisfy the
coverage requirement. However, they represent very diﬀerent deployment strategies. S1 and S2

use redundant sensors and hence are conservative. S3 follows the incremental deployment strategy.

Table 3.4: Energy breakdown in House-1∗

Appliance          Groundtruth     Supero
Name               P      E        P      E        Error
                   (W)    (kW·h)   (W)    (kW·h)   (%)
Entry light        32     .0079    33     .0081    2.3
Hall light         38     .0112    38     .0109    1.9
Kitchen light      24     .0059    23     .0056    5.8
Dining light       76     .0149    77     .0113    24.6
Living light       43     .0041    41     .0040    3.1
Master bed light   33     .0065    31     .0061    6.0
Master bath light  22     .0054    21     .0052    3.6
Master bath fan    47     .0069    47     .0068    2.3
Guest bed light    29     .0071    29     .0056    21.2
Guest bath light   20     .0070    20     .0070    0.6
Guest bath fan     41     .0097    40     .0097    0.0
Stove burner       1356   .4603    1379   .4675    1.6
Water dispenser    N/A    N/A      140    .0518    N/A
Average error                                      6.1

∗Error is relative error of energy with respect to KAW measurements.

As there is no sensor in the living area, S4 does not follow any proposed deployment strategy.

The acoustic appliances covered in the experiment include an exhaust fan over the range, a waste

disposer in the sink, a dish washer, and a vacuum cleaner. During the experiment, we used the

vacuum cleaner in both the dining and living areas. The exhaust fan has two speeds and Node 10

is designated as the primary sensor for the fan. For the other appliances, the order (rather than

the actual values) of their power consumption is provided to Supero. The event detection and

association results for S1, S2, and S3 are correct. For S4, although all the acoustic events can be

successfully detected, some of them cannot be correctly associated. For instance, when the vacuum

cleaner ran in the living area, Node 10 received the highest signal energy, which is inconsistent

with its designation as the primary sensor for the exhaust fan.

The results in this section show that both the conservative and incremental deployment strategies

can eﬀectively ensure the sensing results. Moreover, the data correlation and the unsupervised

clustering/association algorithms adopted by Supero allow the sensors to be deployed in an ad hoc

manner with considerable ﬂexibility.

3.6.5 Experiments in House-1

House-1 is a one-story three-bedroom ranch house with a living space of about 150 m2. Compared

with Apartment-1, it has more lights of various types (incandescent bulbs and standard/compact

ﬂuorescent lamps). The deployment consists of 7 TelosB and 3 Iris motes. The Iris motes detect

both light and acoustic events. We conducted a controlled experiment for more than 5 hours.

Groundtruth information was manually recorded and then rectiﬁed by checking the total power

readings. In the experiment, each light sensor could detect multiple lights, and 40 false alarms

out of a total of 127 light events were raised by the light sensors; 38 of the false alarms were

identiﬁed by multi-modal data correlation. The remaining two false alarms were identiﬁed as

outliers by the clustering algorithm. Table 3.4 shows the results. For one of the dining light events,

a sensor monitoring the light missed the event, which resulted in a misclassiﬁcation and error in

estimating the energy of the dining light. From the background cluster of unattended power events,

we observed that an unknown appliance with a power of 140 W was turned on for one minute about

every 10 minutes. The appliance turns out to be a hot water dispenser at a sink. Moreover, the

dispenser caused a missed detection of a guest bed light event, as the dispenser and the light were

once turned on/oﬀ at the same time. The average error of Supero is 6.1%.

3.6.6 System Usability

We now present two case studies on how easily Supero can be deployed and conﬁgured by non-

professionals. We recruited two homeowner volunteers to deploy Supero in their homes including

a single-bedroom apartment (Apartment-3) and a two-story house with basement (House-2). We

ﬁrst introduced Supero and explained the deployment strategies to the volunteers, which took less

than one hour. They then installed the sensors and conﬁgured the system using our web interface

without any further instructions from us. For safety reasons, they did not install the TED5000.2

2The TED5000 probe needs to be hardwired to electrical service wires to get powered and
connected to the gateway. Contactless power sensors [34], which are more friendly to non-technical
end users, can be used instead.

In Apartment-3, the volunteer deployed 5 TelosB and 3 Iris motes to monitor all the appliances

including 5 lights, a fridge, a microwave, and a fan. The deployment and conﬁguration took only

about half an hour. In House-2, the volunteer took about one hour to survey the appliances and

another hour to install the sensors. He ﬁnally deployed 12 TelosB and 10 Iris motes to monitor 12

lights, an exhaust fan in the kitchen, a waste disposer, a dish washer, a fridge, a microwave, and three

fans in three bathrooms respectively. The base station on the ﬁrst ﬂoor could reliably receive data

packets from sensors distributed on the two ﬂoors and basement. After the system deployments, we

conducted controlled experiments to evaluate the deployments and conﬁgurations. We generated

total power readings according to gathered groundtruth to run the algorithms. The event detection,

clustering, and association results of the controlled experiments are correct in both deployments.

These two case studies show that the non-professional users were able to quickly deploy Supero and

ensure correct sensing results. We also ﬁnd that both users preferred the conservative deployment

strategy discussed in Section 3.5.2.

3.6.7 System Lifetime

This section evaluates the lifetime of the battery-powered Supero sensors. In this experiment, we

force the CPUs of the motes to stay active even though they would operate in low duty cycles (e.g.,

≤ 5% for Iris) in Supero. The radios are turned on only when there are packets to transmit. The
TelosB motes report their battery voltages to the base station every minute. Fig. 3.14(a) plots the

battery voltages of two TelosB motes with Alkaline and Lithium batteries, respectively, over time.

The projected lifetime with Alkaline batteries is 79 days by conservatively setting the minimum

operating voltage (MOV) to be 2.2 V although it is 2.1 V in datasheet [28]. With the high-capacity

Lithium batteries, there is no observable voltage drop in one month. For the tested Iris mote, we

force it to always work in the fast sampling mode. It piggybacks its voltage reading on the acoustic

feature packet. Fig. 3.14(b) plots the battery voltage of the Iris with Alkaline batteries. The tested

Iris kept working from the 4th to the 9th day. Regression analysis shows that the projected lifetime

is 40 days by conservatively setting the MOV of Iris to be 2.2 V, since the MOVs of the RF230

39

radio chip and ATmega1281 8MHz MCU on Iris are 2.1 V and 1.8 V. We note that the lifetime can

be further extended by simply using Lithium batteries and duty-cycling the CPU of motes.

[Figure 3.14 plots: (a) battery voltage (V) versus days for TelosB w/ Lithium and TelosB w/ Alkaline; (b) battery voltage (V) versus days for Iris w/ Alkaline.]

Figure 3.14: Battery voltage traces of TelosB and Iris.

3.7 Conclusion and Future Work

This chapter presents Supero – a sensor system for unsupervised residential power usage

monitoring. In Supero, the multi-sensor fusion can eﬀectively reduce sensing errors in complex

household environments. By using unsupervised event clustering algorithms and a novel appliance

association framework, Supero can autonomously estimate the power and energy usage of each

monitored appliance. Extensive evaluation in ﬁve real homes shows that Supero can be deployed

with considerable ﬂexibility and provide accurate monitoring results.

Complementary to Supero, a few direct meters (e.g., the Zigbee-enabled KAW) can be applied

to handle certain other appliances that have highly complex light/acoustic signal characteristics

(e.g., TV) and power consumption proﬁles (e.g., furnace). In our future work, we will explore the

use of other sensing modalities (e.g., infrared, seismic, and magnetic) to monitor these complex

appliances. We will explore privacy-preserving strategies to prevent information leakage due to

the wireless communications in Supero. Moreover, we plan to develop an easy-to-understand user

manual to help non-professionals set up the sensor deployment, e.g., by video examples.


CHAPTER 4

A SENSOR NETWORK FOR REAL-TIME VOLCANO IMAGING

As sensor network technologies become more mature, they are increasingly being applied to a wide

variety of environmental monitoring applications, ranging from agricultural sensing to habitat

monitoring, oceanic and volcanic monitoring. This chapter presents some of the challenges facing

a designer of a large scale seismic sensor network, the system designed to meet these challenges

and the results from deploying a prototype network on active volcanoes in Ecuador and Chile. Using

low-cost hardware it is possible to record seismic activity within the local region of a volcano.

The remainder of this chapter is organized as follows. Section 4.1 presents the overview of

real-time volcano tomography. Section 4.2 presents the design requirements the system must meet

as well as the design alternatives considered. Section 4.3 presents the system architecture

and design. Section 4.4 discusses modelling system operation and the process used to assign the

various computational elements to diﬀerent nodes. Section 4.5 describes the system deployments

in Ecuador and Chile. Section 4.6 describes the deployment experiences and results from these

deployments while Section 4.7 concludes the chapter.

4.1 Background of Volcano Tomography

4.1.1 Scientiﬁc Motivation

Real-time volcano tomography creates a new paradigm for studying seismic activities and provides

a deeper scientific understanding of the complex, time-varying dynamics of volcanoes. To study

volcano dynamics, a three-dimensional (3-D) velocity model corresponding to the internal structure,

as shown in Fig. 4.1, is to be constructed. The process of creating such a model from seismic event

data is called seismic tomography [24, 25]. By observing the model changes, seismologists can

predict how soon the volcano will erupt and issue early warning to evacuate surrounding areas.


Figure 4.1: Seismic tomography and real-time signal processing pipeline. The tomography
estimates a velocity model consisting of the seismic wave propagation speeds in cubic blocks
beneath the volcano surface.

4.1.2 Tomographic Processing Pipeline

This section introduces the steps in the tomographic pipeline in Fig. 4.1. Tomography begins with

obtaining readings from seismic sensors. The seismic signal frequencies in a volcanic earthquake

range from sub hertz to 40 Hz. Volcano monitoring systems often adopt a sampling rate of 100 Hz,

which can fully capture the seismic events. However, such a high sample rate makes it diﬃcult to

collect raw data in real time from a large-scale (e.g., hundreds of nodes) WSN due to limitations

of energy and bandwidth at current, battery-powered nodes. This requires the nodes to perform

in-network event detection and only send short event messages through the network.

Seismic event detection detects the occurrence of a seismic event and determines the arrival

times of primary waves (i.e., P-phases) at the sensor nodes. The STA/LTA [9] and the ARAIC

(Autoregression with Akaike Information Criterion) [41] algorithms are widely used for seismic

event detection. The STA/LTA computes both a long-term average and a short-term average for

the signal, and makes a detection if the ratio of the two averages exceeds a predeﬁned threshold.


The event magnitude, the window sizes of the averages, and the threshold setting aﬀect STA/LTA’s

sensitivity to events and thus the accuracy in estimating events’ onset times. The ARAIC constructs

two autoregression models based on the seismic signals before and after each time instant and yields

a time instant that maximizes the dissimilarity between the two autoregression models as the P-

phase arrival time. Thus, ARAIC requires no thresholds and is more accurate than STA/LTA, but

at the cost of increased computation overhead. For instance, processing a 16-second seismic signal

requires computations on 6,500 ﬂoating point variables.
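To make the ratio test concrete, the following minimal Python sketch shows one way an STA/LTA trigger can be written. It is an illustration only, not the detector deployed on the nodes: the function name `sta_lta_trigger`, the use of the mean absolute amplitude as the averaged quantity, and the window lengths and threshold mentioned afterwards are all assumptions.

```python
def sta_lta_trigger(samples, sta_len, lta_len, threshold):
    """Return the index of the first sample whose STA/LTA ratio exceeds
    `threshold`, or None if no sample triggers.

    Assumption: both averages are taken over the absolute amplitude of a
    signal that has already been filtered and had its DC offset removed.
    """
    sta_sum = 0.0
    lta_sum = 0.0
    for i, x in enumerate(samples):
        amplitude = abs(x)
        sta_sum += amplitude
        lta_sum += amplitude
        # Slide both windows by dropping the sample that just left them.
        if i >= sta_len:
            sta_sum -= abs(samples[i - sta_len])
        if i >= lta_len:
            lta_sum -= abs(samples[i - lta_len])
        if i + 1 < lta_len:
            continue  # wait until the long-term window is full
        sta = sta_sum / sta_len
        lta = lta_sum / lta_len
        if lta > 0.0 and sta / lta > threshold:
            return i
    return None
```

With 100 Hz data, `sta_lta_trigger(samples, sta_len=50, lta_len=500, threshold=3.0)` would correspond to a 0.5 s short-term window and a 5 s long-term window. These values are placeholders rather than the settings used in this work.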

Once a seismic event is detected, the P-phases from multiple sensors are used to estimate the

earthquake hypocenter including its source location and origin time. Geiger’s method [50], a widely

used hypocenter estimation approach, applies the Gauss-Newton nonlinear optimization method

through a series of iterative linearization steps.
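The iterative linearization can be sketched in a few lines of Python. The sketch below is a simplified illustration under stated assumptions, not the implementation used in this work: it assumes a homogeneous medium with a single P-wave velocity and straight ray paths, and the names `solve_linear` and `locate_hypocenter` are hypothetical. Each iteration linearizes the predicted arrival times around the current estimate (x, y, z, origin time) and solves the Gauss-Newton normal equations for a correction.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def locate_hypocenter(stations, arrivals, velocity, guess, iterations=20):
    """Estimate (x, y, z, t0) from P-phase arrival times.

    stations: list of (x, y, z) positions; arrivals: arrival time per station.
    Assumption: constant `velocity` and straight rays.
    """
    x, y, z, t0 = guess
    for _ in range(iterations):
        jtj = [[0.0] * 4 for _ in range(4)]
        jtr = [0.0] * 4
        for (sx, sy, sz), t_obs in zip(stations, arrivals):
            dist = max(math.dist((x, y, z), (sx, sy, sz)), 1e-9)
            residual = t_obs - (t0 + dist / velocity)
            # Partial derivatives of the predicted arrival time.
            row = [(x - sx) / (velocity * dist),
                   (y - sy) / (velocity * dist),
                   (z - sz) / (velocity * dist),
                   1.0]
            for i in range(4):
                jtr[i] += row[i] * residual
                for j in range(4):
                    jtj[i][j] += row[i] * row[j]
        dx, dy, dz, dt = solve_linear(jtj, jtr)
        x, y, z, t0 = x + dx, y + dy, z + dz, t0 + dt
        if max(abs(dx), abs(dy), abs(dz), abs(dt)) < 1e-10:
            break
    return x, y, z, t0
```

Because four unknowns are estimated, at least four arrival times are needed. In practice the velocity model is not homogeneous, so the travel-time and derivative computations would be replaced by ray tracing through that model.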

Volcano tomography [24, 25] involves tracing the travel paths of the seismic signals (ray paths)

from the event source location to the monitoring stations and computing the travel times. Note

that, as the material density beneath the volcano surface is not evenly distributed, the ray paths

are curves, as illustrated in Fig. 4.1. The velocity model is represented as a vector of slowness

perturbations, which is computed from the data collected by many sensors for many events and is

thus compute-intensive.

For example, a 1992 study of Mount St. Helens [24] utilized seismic data consisting of 5,454

local seismic events recorded by 39 stations over 10 years. This produced the 35,475 rays used to

model a 27.5 km by 21 km target area. Increasing station density by a factor of 10 to 20 would result

in a corresponding reduction of the time required to obtain suﬃcient ray coverage. Thus, a viable

option is to deploy a large number of nodes, i.e., 250 to 1,000 nodes to cover a typical volcano.

This is economically infeasible for traditional platforms. It motivates us to develop easy-to-deploy

and inexpensive volcano sensor platforms with a target cost budget of approximately 500 US$ per

node.

To investigate the impact of increasing station density we created a model of the internal

structure of the Chilean Llaima volcano. Within this structure we modelled a volume to represent


a magma body through which seismic waves would travel at a different velocity. Using this model,

we explored the impact of the number of monitoring stations and events on the computed model

accuracy as compared to the original model. The following section describes the results of this

evaluation.

4.1.3 Spatial Coverage

(a) Simulated anomaly   (b) 22 stations, 600 events   (c) 22 stations, 2400 events   (d) 48 stations, 600 events   (e) 48 stations, 2400 events

Figure 4.2: Tomography results using simulated seismic data.

Suﬃcient 3-D ray coverage is a key requirement of volcano tomography. That is, each cubic

block in the 3-D model should have multiple rays passing through it. The 3-D ray coverage also

aﬀects the resolution of the model. Ray coverage can be increased either by including more events

or by increasing station density. Wide geographic coverage is also required to ensure the 3-D model

covers the entire internal structure of the volcano. The 1992 study of Mount St. Helens [24] utilized

seismic data consisting of 5,454 local seismic events recorded by 39 stations over 10 years. This

produced 35,475 rays used to model a 27.5 km by 21 km target area. Increasing station density by

a factor of 10 to 20 would result in a corresponding reduction of the time required to obtain sufficient ray coverage.

To investigate the impact of the numbers of sensors and events on the model resolution, a

seismic simulation model of the Llaima Volcano was developed. The model consisted of an 80 x 80
grid covering the volcano, with 7 layers corresponding to the velocity model at depths of 0,
2, 4, 5.5, 8, 10.5, and 13 km. To model a seismic anomaly, corresponding to a feature that might
represent magma, polygons were placed in each layer of the model and assigned a velocity factor
from the corresponding layer. Fig. 4.2(a) shows the polygon at the 2 to 4 km depth. Using this

model, we generated simulated events at random locations. For each of these events we computed

the travel time to each monitoring station. Gaussian noise was added to this travel time and used

as the event arrival time. The monitoring station locations, events’ hypocenters, and event travel

times were used to compute a velocity model. We conducted simulations with 600, 1200, and 2400

events using 22 and 48 monitoring stations. Due to space limitations, only the 2-4 km layer is

discussed here. The other layers showed similar results. As both the numbers of stations and events

increase, there are rays covering wider areas of the model. This is illustrated in the ﬁgures by areas

changing from white to a color representative of the inverse velocity (i.e., slowness). Compared

with Fig. 4.2(c), Fig. 4.2(d) contains a larger area having color, which suggests that increasing the

station density has a greater impact on the model coverage than increasing the number of events.

As the number of events increases (Fig. 4.2(d) compared to Fig. 4.2(e)), the slowness in the region

of the anomaly is more accurately modeled, which is indicated by the region’s color more closely

matching the color in the simulated ground truth in Fig. 4.2(a).
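The data generation step of this simulation can be sketched as follows. The sketch is a simplified illustration and not the simulation code used here: it assumes straight rays through a homogeneous medium with a single hypothetical velocity, whereas the simulation above used a layered velocity model with an embedded anomaly. The function name `simulate_arrivals` and all parameter names are assumptions.

```python
import math
import random


def simulate_arrivals(stations, n_events, velocity, noise_std,
                      extent, max_depth, seed=0):
    """Generate one noisy travel time per (event, station) pair.

    stations: list of (x, y, z) positions in km.
    Events are placed uniformly at random in a box of half-width `extent`
    and depth 0..max_depth. Returns a list of
    (event_id, station_id, noisy_travel_time) tuples, one per ray.
    """
    rng = random.Random(seed)
    rays = []
    for event_id in range(n_events):
        source = (rng.uniform(-extent, extent),
                  rng.uniform(-extent, extent),
                  rng.uniform(0.0, max_depth))
        for station_id, station in enumerate(stations):
            travel_time = math.dist(source, station) / velocity
            # Gaussian noise models the picking error at the station.
            noisy = travel_time + rng.gauss(0.0, noise_std)
            rays.append((event_id, station_id, noisy))
    return rays
```

In this idealized setting every station records every event, so the number of rays is the product of the number of events and the number of stations. Real deployments yield fewer rays because weak events are not detected at distant stations.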

In the above simulation, 20,000 to 50,000 rays were required to compute the tomography. To

obtain a suﬃcient ray coverage, we require either a large number of seismic events collected over a

long time period or a large number of monitoring stations spread over a large area. When a volcano

is in crisis, it is desirable to update the model one to four times per hour. Such a high temporal

resolution requires a large number of rays collected in a short time period. Thus, to obtain this

number of rays in a short time period, a large number of nodes, i.e., 250 to 1000 nodes, must be

deployed to cover a typical volcano. This is economically infeasible for traditional platforms. This

motivated us to develop easy-to-deploy and inexpensive volcano monitoring platforms with a target

cost of approximately 500 USD per station.

4.2 Design Requirements

We aim to meet the following design requirements:

Delay and data ﬁdelity requirements: Data-intensive sensing applications typically require


continuous data sampling at high rates. For seismic applications, this rate is typically 100 Hz. It is

also important for the sampling rate to be constant with no breaks in the data to minimize required

post-processing. The detected events will be correlated across several nodes, which requires the
data samples to have time stamps with sub-millisecond accuracy.

Nodes must be able to process the signal fast enough to neither fall behind nor overﬂow memory

buﬀers. Thus, the choice of event detection algorithms impacts the required node computational

capabilities. Notification of a seismic event should be reported in seconds rather than minutes,
including the delays of hypocenter computation and data communications. Tomography computations
need to be completed in 15 to 30 minutes to ensure timely model updates.

A common problem with the STA/LTA event detection algorithm [47, 30] is false alarms.

Systems sensitive enough to detect weak seismic events can be triggered by man-made events such

as vehicle movement, vibrations caused by wind moving nearby trees, etc. These events might

be detected by sensors close to the source but not by sensors further away. The system should

incorporate false alarm suppression methods.

Communication requirement: To achieve a high degree of coverage, nodes should be deployed

with 200 m to 400 m spacing over an area of several square kilometers having large variations in

terrain. Cases can arise where nodes might be up to one kilometer apart with an intervening ravine.

The area surrounding a volcano is typically remote with limited access, making it diﬃcult to access

the nodes once deployed. As a result, special attention must be paid to communication performance

and proper planning. It is also desirable to remotely monitor node health and collect event detection
results, which necessitates a remote network connection. However, remote volcanic areas often
have limited network coverage, with satellite communication as the only connection. Satellite links

have limited bandwidth (less than 500 kbps) and high latencies (one to three seconds of round trip

time). These delays and limitations must be accounted for to ensure timely reporting of detected

events.

Power requirement: A volcano monitoring system must operate unattended for several months

when a volcano becomes active, with sensor nodes sampling seismic signals continuously. In such


a case, common power conservation techniques such as sleep scheduling are often inapplicable.

Thus, it is desirable to combine power conservation with power harvesting to prolong lifetime while

achieving suﬃcient spatiotemporal coverage of seismic activities for volcano tomography.

Packaging requirement: Volcanic regions have harsh environmental conditions for sensor nodes.
Volcanic ash is extremely fine and can find its way into even the smallest places. In addition,
areas may receive heavy rain storms lasting for long periods of time. Due to the remoteness of the

deployment site, the nodes should be small and light enough that one person could comfortably

carry three to four units.

4.3 System Architecture and Design

(a) System Network Diagram and control panels   (b) Our online remote seismic monitoring panel

Figure 4.3: General system architecture and online remote monitoring panel.

To meet the requirements outlined in Sec. 4.2, we have developed a multi-tiered system shown

in Fig. 4.3(a). It consists of sensor nodes communicating over a mesh network to a base station.

The sensor nodes handle sensor sampling, time-stamping and event detection. Event information
and position information are exchanged between stations over the mesh network. The base station

computes the seismic event hypocenter and forwards the associated event information to a remote

server. The base station also provides a remote command and control link for the sensor network.

The remote server provides long-term event storage, event visualization, tomographic inversion

and velocity model visualization using a web interface shown in Fig. 4.3(b).


Figure 4.4: Data processing components of sensor node and base station.

4.3.1 Signal Processing and Event Detection

Each sensor node performs a number of processing functions as shown in Fig. 4.4. These functions

are divided between the interrupt level and the main processing loop. Sampling and time
synchronization both occur at the interrupt level, while all other processing occurs in the main
processing loop. The seismic sensor is sampled at 100 Hz. These samples are stored locally on an SD
card. Once stored, the signal is processed through a digital bandpass filter to remove the direct
current (DC) component and eliminate high-frequency noise.

Given the severe limitations on the energy and compute resources of the system, a key challenge
in the design of seismic event detection algorithms is to achieve a trade-off between detection
accuracy and compute overhead. In addition, it is desirable to adopt existing, well-established
algorithms in the seismic community, such as STA/LTA and ARAIC, because many existing
post-processing geophysical tools are designed to work with these algorithms.

The Short Term Average/Long Term Average (STA/LTA) algorithm [47, 30] is commonly used

to detect seismic activity. STA/LTA uses the ratio of the short term average to the long term

average to detect events. The long term average establishes the noise ﬂoor. When the ratio exceeds

a threshold value an event is declared. The associated time is used as a rough estimate of the event


starting time. The various STA/LTA parameters are used to tune the detector’s sensitivity and have

a secondary eﬀect of impacting the time associated with the event. The STA/LTA also requires

the seismic signal be properly ﬁltered and the DC component removed. Another event detection

algorithm is the ARAIC. This algorithm is more computationally intensive than STA/LTA but

provides a more accurate event arrival time. If this algorithm is to be used the system must have

suﬃcient computational power to perform the required computations without missing samples.

Keeping these requirements in mind, we developed a new seismic event detection framework

consisting of the following two steps. First, the preliminary event detection (CoursePick in Fig. 4.5)

is performed using STA/LTA. If the preliminary detection result is negative (no event), the node

waits for the next sample; otherwise, the node collects an additional 8 seconds of samples. These
samples, along with the 8 seconds of samples preceding the event, are passed to the ARAIC
algorithm [40] to determine the event start time (FinePick in Fig. 4.5). This event detection approach

achieves a desirable trade-oﬀ between accuracy and energy consumption by taking advantage of

complementary characteristics of two existing algorithms: STA/LTA is highly eﬃcient but sensitive

to settings and event magnitude, while ARAIC is more robust but computationally expensive.
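The refinement step can be illustrated with a short Python sketch. Note that the sketch does not implement ARAIC itself: to stay short it substitutes a simpler variance-based AIC picker, which selects the split point that best separates a window into a low-variance (noise) part and a high-variance (signal) part, and it omits the autoregressive models that ARAIC fits. The function names `aic_onset` and `fine_pick`, the `margin` parameter, and the defaults are assumptions; only the 8 seconds of samples on either side of the coarse pick come from the description above.

```python
import math


def aic_onset(window, margin=10):
    """Return the index in `window` that minimizes a variance-based AIC.

    Stand-in for ARAIC: no autoregressive model is fitted here.
    """
    n = len(window)
    prefix = [0.0]
    prefix_sq = [0.0]
    for x in window:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    def variance(lo, hi):
        count = hi - lo
        mean = (prefix[hi] - prefix[lo]) / count
        mean_sq = (prefix_sq[hi] - prefix_sq[lo]) / count
        return max(mean_sq - mean * mean, 1e-12)  # floor avoids log(0)

    best_k, best_aic = None, math.inf
    for k in range(margin, n - margin):
        aic = (k * math.log(variance(0, k))
               + (n - k) * math.log(variance(k, n)))
        if aic < best_aic:
            best_k, best_aic = k, aic
    return best_k


def fine_pick(samples, coarse_index, rate_hz=100, half_window_s=8):
    """Refine a coarse STA/LTA pick using the samples around it."""
    half = rate_hz * half_window_s
    start = max(0, coarse_index - half)
    stop = min(len(samples), coarse_index + half)
    k = aic_onset(samples[start:stop])
    return None if k is None else start + k
```

The coarse pick from STA/LTA typically lags the true onset because the short-term average needs several samples to rise above the threshold. Searching a window centered on the coarse pick lets the second stage move the pick back toward the onset.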

[Figure 4.5 plot: seismic signal (counts) and STA/LTA ratio over time, with the Threshold, Course Pick, and Fine Pick marked.]

Figure 4.5: STA/LTA ratio in response to a seismic signal.


Table 4.1: Key Characteristics of a Traditional Seismic Station versus our sensor nodes

                    Traditional Station  Gen 1               Gen 2               Gen 3               Gen 4
Seismic Sensor
  Model             GURALP CMG-40T       PASSCAL L-28-3D     PASSCAL L-28-3D     Sercel L-28LB       Sercel L-28LB
  Type              3-axis broadband     3-axis long period  3-axis long period  1-axis long period  1-axis long period
  Sensitivity       1000 V/m/s           40 V/m/s            40 V/m/s            40 V/m/s            40 V/m/s
Digitizer
  Model             Reftek 130           Microchip           Microchip           TI                  TI
  ADC               24 bit delta sigma   10 bit              10 bit              24 bit delta sigma  24 bit delta sigma
  Storage (GB)      8                    N/A                 N/A                 32                  32
  Event Triggering  STA/LTA              STA/LTA             STA/LTA             STA/LTA, ARAIC      STA/LTA, ARAIC
Power Source
  Battery           55 Ah Lead Acid      4 D-Cells           4 D-Cells           7 Ah SLA            7 Ah SLA
  Solar Panel       50W                  N/A                 N/A                 20W                 20W
  Size              36 in x 36 in        N/A                 N/A                 12 in x 24 in       12 in x 24 in
Total Weight        90 lb                10 lb               10 lb               10 lb               10 lb
Total Cost          ≈ $22,000 US         ≈ $500 US           ≈ $500 US           ≈ $500 US           ≈ $500 US

4.3.2 Sensor Node Design

Several alternative sensor node designs were considered. Traditional wireless sensor network

platforms, such as the TelosB series [28], while very energy eﬃcient, have limited computational and

communication capabilities. Nodes in this class can perform STA/LTA event detection but do not

have suﬃcient memory or speed to perform the more complex ARAIC algorithm or the hypocenter

computations. When operating in a continuous sampling mode, their energy consumption increases
significantly. The Zigbee radios have sufficient throughput (250 kbps) to send event messages, but
their range is limited to 20 to 100 meters, which is insufficient for the spacing needed by this
application. After considering these options, it was decided that the current class of low-energy
motes was not appropriate for this application. Through a series of rapid prototypes we investigated
alternative hardware designs. The hardware differences are summarized in Table 4.1 and discussed in
the following sections along with the lessons learned from each prototype. Two deployments allowed
us to investigate and better understand the environmental conditions the nodes would be subjected
to. These are discussed in Section 4.5.

(a) Generation 1 - Deployment: Tungurahua, Ecuador, July 2012   (b) Generation 2 - Arduino Test Node

Figure 4.6: Seismic Node Prototypes

Generation 1/2: An initial question which arose early in the design process was “Can the high
sampling requirement be met with low cost hardware?” To evaluate this, a prototype (Fig. 4.6(a))

was developed and deployed on the Ecuadorian Tungurahua Volcano. This prototype consisted of

an Android phone with an attached external board. The external board contained a 12 bit digitizer,

signal ampliﬁer and GPS receiver controlled by software running on the phone. During the 5-day

deployment period this prototype was able to successfully record a seismic event 20 km away (Fig. 4.8).

While capable of recording nearby seismic events, this prototype also revealed several issues.
First, a 12 bit ADC did not have sufficient dynamic range to capture all events of interest to
seismologists. Second, the high sample rate did not allow the phone to enter its low power sleep
mode, resulting in poor battery performance. Finally, the hardware was not fast enough to capture the
GPS one pulse per second digital output and could not synchronize its clock with the GPS.

To resolve these issues, the Arduino family of processors was used to investigate different
approaches to clock synchronization and to determine the amount of clock drift over long periods
of time. The generation 1 sensor board was attached to an Arduino Mega ADK processor (Fig. 4.6(b)).
It was discovered that without any synchronization the internal clock of the Arduino
Mega ADK system drifted 708 microseconds every second. The Arduino kernel was modified to
allow adjusting the clock to compensate for the drift. Using the GPS 1pps signal along with this
modified kernel it was possible to calculate the drift and adjust the tick interval, reducing the drift
to 21 microseconds per second. Unfortunately, the Arduino Mega ADK, even with an extended
memory board attached, did not have sufficient built-in memory or speed to execute the needed
event detection algorithms.

Generation 3: For the third generation of the node hardware we employed the Arduino Due processor
and a custom board containing the specialized sensing components. Fig. 4.7(a) shows
the sensor node we designed for this work. This design provides similar quantization resolution at
2% of the cost and 10% of the weight of a traditional portable seismic monitoring station. Table 4.1
lists the key characteristics of traditional portable stations versus our nodes.

The Arduino Due was chosen due to its computational power and the desire to use an existing

processor design where possible. It contains an 84 MHz Atmel 32-bit processor which is able

to digitally ﬁlter a sample using a 204th order FIR ﬁlter and compute the STA/LTA ratio in 0.15

milliseconds. It can perform an ARAIC computation on a 16 second signal (1600 samples) in

3.9 seconds. Proper buﬀering allows the node to handle 100 samples per second without losing

samples.

The analog-to-digital converter (ADC) chosen was a TI ADS1281. This is a 24-bit ADC

designed speciﬁcally for seismic applications and is capable of sampling at up to 4000 Hz. To

amplify and condition the seismic sensor signal, a TI THS4521 diﬀerential ampliﬁer with a gain

of 100 feeds the ADC. The ADC's internal clock is used to control the sampling rate. When
a conversion completes, an interrupt triggers a routine that reads the ADC and places
the reading, along with the current time, into a circular buffer. Samples are removed from
this buffer and analyzed in the main processing loop.
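The hand-off between the interrupt routine and the main loop can be sketched as a fixed-size ring buffer. The node firmware runs on a microcontroller, so the Python below only illustrates the data structure; the class name `SampleRing`, its methods, and the choice to count and drop new samples when the buffer is full are assumptions rather than a description of the deployed code.

```python
class SampleRing:
    """Fixed-capacity FIFO for (timestamp_us, reading) pairs."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._head = 0   # next slot to write (producer side)
        self._tail = 0   # next slot to read (consumer side)
        self._count = 0
        self.dropped = 0  # samples lost because the buffer was full

    def push(self, timestamp_us, reading):
        """Called from the sampling interrupt. Returns False on overflow."""
        if self._count == len(self._slots):
            self.dropped += 1
            return False
        self._slots[self._head] = (timestamp_us, reading)
        self._head = (self._head + 1) % len(self._slots)
        self._count += 1
        return True

    def pop(self):
        """Called from the main loop. Returns None when empty."""
        if self._count == 0:
            return None
        item = self._slots[self._tail]
        self._tail = (self._tail + 1) % len(self._slots)
        self._count -= 1
        return item
```

The capacity has to cover the longest stretch during which the main loop does not drain the buffer. For example, an ARAIC computation that takes 3.9 seconds at a 100 Hz sampling rate means that roughly 390 samples arrive in the meantime, so the buffer would need at least that many slots to avoid dropping samples.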

A Digi International XBee-Pro 900HP 900 MHz RF module provides network communications.


(a) Generation 3 Node

(b) Generation 4 Node

(c) Base Station

Figure 4.7: Seismic Monitoring Nodes

This module supports the DigiMesh network topology with an RF data rate of 200 kbps. 900 MHz was
chosen over 2.4 GHz due to the reduced signal absorption in heavily wooded areas. The manufacturer

claims the unit is capable of a line of sight range of 6.5 km with a 2.1 dB dipole antenna. This

radio provides suﬃcient capability to allow a large number of nodes to exchange short messages,

i.e., position information and event detection results. To facilitate oﬄine validation of detection

results, each node stores the raw signal to a 32 GB SD card. Using a compact delta-based format,
one hour of samples requires 1.4 MB of storage.
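The idea behind delta-based storage is that consecutive seismic samples usually differ by much less than the full 24-bit range, so storing differences takes fewer bytes than storing absolute values. The on-card format used by the nodes is not specified here, so the following Python sketch shows a generic, hypothetical scheme (zigzag-mapped deltas written as variable-length integers) purely to illustrate the principle. The function names `encode_deltas` and `decode_deltas` are assumptions.

```python
def encode_deltas(samples):
    """Encode integer samples as variable-length deltas (hypothetical format)."""
    out = bytearray()
    previous = 0
    for sample in samples:
        delta = sample - previous
        previous = sample
        # Zigzag mapping: 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ...
        value = (delta << 1) if delta >= 0 else ((-delta) << 1) - 1
        # Emit 7 bits per byte; the high bit marks "more bytes follow".
        while value >= 0x80:
            out.append((value & 0x7F) | 0x80)
            value >>= 7
        out.append(value)
    return bytes(out)


def decode_deltas(data):
    """Invert encode_deltas and return the list of integer samples."""
    samples = []
    previous = 0
    value = 0
    shift = 0
    for byte in data:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
            continue
        delta = (value >> 1) if value & 1 == 0 else -((value + 1) >> 1)
        previous += delta
        samples.append(previous)
        value = 0
        shift = 0
    return samples
```

In this scheme a quiet signal whose samples change by fewer than 64 counts between readings needs one byte per sample, while large jumps fall back to more bytes.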

We include a GlobalTop Technologies, Inc. MTK3339 GPS chip on each node. This chip

provides geolocation information, global time, and a pulse-per-second (PPS) signal. This signal

is used to interrupt the processor and synchronize the internal processor clock to global time. To

mitigate the increased power consumption, control circuitry was included allowing the GPS chip

to be completely powered down. A small coin battery is used to maintain the GPS memory during

power down allowing quick restart when power is reapplied.

Sealed lead acid (SLA) batteries were chosen to power the nodes. While not necessarily having
the highest energy density, SLA batteries have certain advantages over other choices. They are
inexpensive, commonly available even in developing countries, operate over a wide temperature

range, and survive large numbers of charge/discharge cycles. For slow discharge rates, the battery

should not be discharged below 10.5 volts. For a small 7 Amp-hour SLA battery with a 0.15 Amp

discharge current, the estimated runtime is 3.4 days.

A 20 watt solar panel maintains the battery charge. The charge controller regulating battery

charging is a potential point of failure. A failure could cause battery overcharging and subsequent

rupturing resulting in the release of corrosive acid. To prevent damage to the electronics the battery

was placed in a separate enclosure to isolate it from the electronics.

4.3.3 Design Lessons Learned and Generation 4

Figure 4.8: Recorded Ecuador Seismic Event

[Figure 4.9 plot: clock drift (ms) versus elapsed time (seconds) for No Correction, Every 1 PPS Correction, and Every 10 PPS Correction.]

Figure 4.9: The clock drift and correction by GPS 1PPS


Each sensor node generation improved based on lessons learned during the design process.
With Generation 1, it was shown that an inexpensive ADC with a simple input amplifier could detect
local seismic events, as shown in Fig. 4.8. This event was approximately 25 km away from the
sensor location. The external IOIO board used to interface the phone to the GPS was unable to
properly detect the timing signals. This could have been remedied by developing custom code
for the IOIO, but we were never able to reliably load the custom code. Another issue was the lack
of real-time support in the Android operating system, which caused unreliable time stamps. The
combination of these two factors led us to explore the Arduino processors for the next generation.
While the packaging chosen for this first generation did not suffer any failures during the 2 week
Ecuador deployment, further testing identified other issues. The thin walls of the enclosure were
easily cracked during shipment and the cover was subject to warping, which compromised the
weatherproofing of the enclosure. This led us to explore the Pelican cases.

The second generation was primarily used to validate that closer integration with the GPS

resulted in more accurate time stamping and clock synchronization (Fig. 4.9). With this we were

able to explore clock drift and clock drift correction. The system clock on the Arduino Mega256

drifts 35 ms every 60 seconds. Using the GPS to correct the system clock, the drift can be

reduced to the sub-millisecond range. This requires the GPS to be in continuous operation with

the corresponding increase in power consumption. Using a modiﬁed Arduino kernel, we added the

capability to compensate for the clock drift by changing the number of microseconds added to the

system time each clock tick. Using this approach we were able to reduce the clock drift without
having to operate the GPS continuously, allowing the GPS to be turned off for periods of time.
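The tick-adjustment idea can be sketched as follows. This is a simplified model in Python rather than the modified Arduino kernel: the class name `TickClock`, the 1 microsecond nominal tick, and the assumption that the oscillator frequency stays constant between calibrations are all hypothetical. The clock counts oscillator ticks between two GPS PPS edges, which are exactly one second apart, and rescales the number of microseconds credited per tick accordingly.

```python
class TickClock:
    """System clock that adds `tick_us` microseconds per oscillator tick."""

    def __init__(self, nominal_tick_us):
        self.tick_us = nominal_tick_us
        self.now_us = 0.0
        self._ticks_since_pps = 0
        self._seen_pps = False

    def advance(self, ticks):
        """Account for `ticks` oscillator ticks."""
        self.now_us += ticks * self.tick_us
        self._ticks_since_pps += ticks

    def on_pps(self):
        """Handle a GPS pulse-per-second edge.

        After the first edge, each new edge marks exactly one true second,
        so the per-tick increment is rescaled to make the counted ticks
        add up to 1,000,000 microseconds.
        """
        if self._seen_pps and self._ticks_since_pps > 0:
            self.tick_us = 1_000_000.0 / self._ticks_since_pps
        self._seen_pps = True
        self._ticks_since_pps = 0
```

As an example, an oscillator that produces 1,000,708 ticks of nominally 1 microsecond in one true second makes the uncorrected clock gain 708 microseconds per second. After a single calibration the same number of ticks advances the clock by one second, and the correction persists while the GPS is powered down, as long as the oscillator frequency does not change.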

The third generation was a major reﬁnement of the hardware and packaging. The processor had

suﬃcient processing power to eliminate the need for the phone. The packaging for the node and the

battery were such that the node and antennas could be placed inside the battery case reducing the

space needed during shipment. The cases were strong enough to survive rough handling without

suﬀering any failure of their waterproof nature.


Deploying the Generation 3 node identified several deficiencies with the hardware. The first
was power consumption: the processor could not be placed into its lowest power state
because it had to maintain the system time used to time stamp the seismic samples. This was caused
by a limitation of the Arduino Due and how its internal clock was maintained. The second issue
related to noise in the signal processing chain. While the board had sufficient shielding
and bypassing, the seismic amplifier still exhibited a high level of noise that interfered with detecting
weak seismic events. This could only be remedied by a complete redesign of the board and seismic
amplifier. The third issue was SPI bus contention between the ADC and the SD card, which used the
same bus (see Section 4.6.2 for a discussion of the impact on data fidelity). While this was initially
remedied by using 3 general-purpose I/O pins and simulating SPI bus operation in software, this was
not an optimal solution. Finally, there were memory management issues which caused the node to stop
functioning at random times. To alleviate these issues a redesign of the node was undertaken. The
Arduino Due processor was replaced with a Teensy 3.6 processor and a new board layout was
developed providing the following features:

• Realtime clock - The Teensy 3.6 processor includes a realtime clock that continues to operate

even when the processor is placed into its lower power state. In addition, when operating at

approximately the same clock frequency as the Due processor the Teensy processor consumes

less energy.

• Board Layout - The smaller physical size of the Teensy 3.6 processor allowed the sensor

board to be reconﬁgured allowing better separation of the analog and digital components.

This reduced noise in the analog signal chain.

• Seismic ampliﬁer redesign - The seismic ampliﬁer was redesigned to include signal ﬁltering

reducing high frequency signal aliasing and further lowering the noise levels.

• Multiple SPI busses - this allows the ADC and SD-Card to utilize separate SPI busses

removing bus contention.

• Additional memory and hardware floating point computations - the Due processor has 96 KB

of SRAM while the Teensy 3.6 has 256 KB of SRAM. The Teensy also adds ﬂoating point

hardware instructions. This greatly reduces the computation time of complex event detection

algorithms. (See Table 4.3)

Fig. 4.7(b) shows the resulting Generation 4 node.

4.3.4 Base Station Design

The base station is composed of a Beaglebone Black board containing a 1GHz 32bit processor

running Debian Linux. A custom cape containing a power regulator and an XBee radio module

attaches to the processor (Fig. 4.7(c)). This cape allows the battery and solar panel voltages for

both the base station and satellite link to be monitored. The provided Ethernet connector allows the

board to communicate with a Hughes 9502 one-piece integrated satellite terminal providing a 448

Kbps broadband connection. This terminal uses 3 to 4 watts with an active TCP/IP connection,

0.01 watts in hibernate mode and wakes up in under 30 seconds with LAN activity.

The control software executed by the base station receives detection events from the sensor

nodes. Geiger’s method is used to compute the hypocenter of the detected seismic events. Events

received by a suﬃcient number of nodes and within the volcano’s vicinity are forwarded over the

satellite link to a remote server for storage and display. The remote server computes the velocity

model and visualizes the result.

4.3.5 Packaging

The sensor nodes and base station are housed in Pelican 1020 micro cases. These cases, along with

the power and sensor connectors, meet the IP67 standard for dust and water. A potential problem

exists with the connections for the radio and GPS antennas. The RP-SMA connections used for

these connections are not inherently waterproof. To waterproof these entry points, a recess was

machined around the entry hole to accommodate a rubber O-ring. Once the connector, O-ring,

lock washer and nut are tightened into place, a waterproof seal is formed.

To eliminate ground eﬀects, a PVC mast raises the radio and GPS antennas to 1.5 m above the

ground. This height maximizes the radio range. The Pelican micro cases are attached to the top of

the mast using an aluminum bracket. The GPS antenna magnetically attaches to the mast as well.

The battery and charge controller are housed in an MTM survivor box. This box is slightly larger

and nicely accommodates either a 7 or 9 amp-hour 12 volt SLA battery.

4.4 System Modeling & Dynamic Task Assignment

Our system consists of three tiers: sensor node (Tier 1), base station (Tier 2), and remote

server (Tier 3). A key design question is how to assign processing tasks to speciﬁc tiers subject

to processing delays, communication throughput and system lifetime constraints. To assign tasks

to the processing tiers, it is necessary to model the processing delays, communications throughput

and system power consumption. To this end, we carefully analyzed delays at various stages of the

information processing pipeline and propose the following task assignment scheme. Our scheme

takes advantage of complementary compute/communication capabilities of diﬀerent tiers of our

system, while minimizing usage of the networking layer.

The decision of which tier to place a particular processing task depends on the computational

costs associated with the task on each tier as well as the communication cost to transfer the required

data. Subject to not exceeding the task’s processing delay bound D, computing a task should be

transferred to the next higher tier, if the following relationship is true:

$$t_i > t'_i + c_{i,i+1} + t_{i+1}, \tag{4.1}$$

where $t_i$ is the processing time for the task if it resides on tier $i$, $t'_i$ is the tier $i$ processing time if the
task moves to the next higher tier $i+1$, $c_{i,i+1}$ is the time to transfer the required data from tier $i$ to
the next higher tier, and $t_{i+1}$ is the next higher tier's processing time. Evaluating Eqn. 4.1 for each
task produces an initial set of task assignments.
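The decision rule in Eqn. 4.1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `should_move_up` and its parameters are hypothetical, and treating the delay bound D as a hard veto on moving the task is an assumed reading of "subject to not exceeding the task's processing delay bound".

```python
def should_move_up(t_i, t_i_prime, c_up, t_next, deadline=None):
    """Return True if Eqn. 4.1 favors moving a task from tier i to tier i+1.

    All times must use the same unit (seconds here).
    """
    offloaded = t_i_prime + c_up + t_next
    if deadline is not None and offloaded > deadline:
        return False  # assumed: moving the task may not exceed the bound D
    return t_i > offloaded


# Event Detection (STA/LTA) on a Gen 3 node: 0.15 ms locally and about 19 ms
# to send one second of samples one hop (values quoted in Section 4.4.3).
# t_i_prime and t_next are set to 0 as the best case for offloading.
print(should_move_up(t_i=0.15e-3, t_i_prime=0.0, c_up=19e-3, t_next=0.0))
```

Even with the most favorable assumptions for the higher tier, the transfer time alone exceeds the local processing time, so the rule keeps the task on Tier 1.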

Table 4.2: Application processing deadlines

Task | Deadline | Consequence of violating deadline
-----|----------|----------------------------------
Sensor sampling and timing | < 10 ms | Samples dropped
Event Detection | < 10 ms | Samples dropped unless sufficient buffering
Hypocenter Determination | < 10 s | Lack of timeliness reporting event source to end user
Compute Velocity Model | 15 min | Slower model update rate
Visualization | < 1 min | Image unavailable to user in timely fashion

4.4.1 System Delay Modeling

To determine the tier to which a processing task should be assigned, it is necessary to consider the

following: application processing constraints ($D$), execution times for each task ($t_i$, $t'_i$), and
communication throughput ($c_{i,i+1}$). Table 4.2 lists the application processing deadlines ($D$) along with

the consequence of violating the deadline. For this application, the most critical deadlines relate

to the Sampling/Timing task and the Event Detection task, where missing the deadline results in

missed samples. For the other tasks, Hypocenter Determination, Tomography and Visualization,

violating the deadline will result in slower model updates.

Table 4.3: Task execution times by tier

Task | Tier 1: Gen 3 Node | Tier 1: Gen 4 Node | Tier 2: Base station | Tier 3: Cloud | Assigned Tier
-----|--------------------|--------------------|----------------------|---------------|--------------
Sampling and Timing | 100 µs | < 50 µs | na | na | 1
Event Detection (STA/LTA) | 0.15 ms | 51 µs | 5 µs | na | 1
Event Detection (ARAIC) | 3.9 sec | 206.2 ms | 6 ms | na | 1
Hypocenter Determination | na | na | 278 ms | 33 ms | 2
Compute Velocity Model | na | na | na | 81 sec | 3
Visualization | na | na | na | 0.32 sec | 3

The task processing times for each tier are given in Table 4.3. Some tasks can only be executed

on a particular tier due to hardware restrictions or data requirements. For example, the Sampling

and Timing task can only be performed on Tier 1 since the sensors are connected to the nodes that

make up that tier. The Hypocenter Determination cannot be run on Tier 1 since it requires data

from all the nodes before the calculations can be performed. In addition, the Tier 1 nodes have

insuﬃcient memory to execute the task. The execution time for the Hypocenter Determination

depends on the number of nodes detecting the event, while the time for the Tomography task

depends on both the number of nodes and the number of events.

The radios in our system use both link-wise and end-to-end acknowledgements. The maximum

data throughput is determined by both the single hop link speed and the number of hops. Based on

data provided by the radio manufacturer, the data network throughput can be modeled by

$$T_{kbps} = 91.23\,h^{-0.945}, \tag{4.2}$$

where $h$ is the number of hops and $T_{kbps}$ is the data throughput in kilobits per second. Satellite

links have high latency with round-trip time of 1 to 3 seconds. They provide 464 kbps link speeds.

Thus, we can compute the communication delay based on the packet size and the number of hops

between tiers.
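As a rough illustration of how Eqn. 4.2 turns into a communication delay, the sketch below evaluates the throughput model and the time needed to move a payload across a given number of hops. The function names are hypothetical, and the transfer time ignores packet framing and retransmissions, which the text does not quantify.

```python
def throughput_kbps(hops):
    """Multi-hop radio throughput from Eqn. 4.2, in kilobits per second."""
    if hops < 1:
        raise ValueError("hops must be at least 1")
    return 91.23 * hops ** -0.945


def transfer_time_s(payload_bytes, hops):
    """Time to move a payload over `hops` radio hops, ignoring packet overhead."""
    bits = payload_bytes * 8
    return bits / (throughput_kbps(hops) * 1000.0)


# Throughput and time to move a hypothetical 1 KB payload for several path lengths.
for h in (1, 2, 5, 10):
    print(h, round(throughput_kbps(h), 2), round(transfer_time_s(1024, h), 3))
```

Because the exponent is close to -1, throughput falls almost in inverse proportion to the hop count, so the transfer time grows nearly linearly with path length.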

4.4.2 System Lifetime Modeling

System lifetime depends on the battery capacity and the amount of current being drawn from the

battery. For SLA batteries, at a constant discharge current, the voltage decreases linearly from its

fully charged voltage to its terminal discharge voltage. To maximize battery life, SLA batteries

should not be discharged below this terminal voltage. When the SLA battery capacity and discharge

current are known, Peukert’s Law can be used to estimate battery lifetime. If the system operates

in one power mode only, the current drawn can be measured oﬄine and used to estimate system

lifetime. For cases when a system has multiple power modes, e.g., sleep and active, the average
power consumption can be computed based on the modes' power consumption and the percent of
time spent in each mode. Specifically, $I_{avg} = \sum_{i=0}^{n} I_i \cdot P_i$, where $n$ is the number of power modes, and
$I_i$ and $P_i$ are the current and percent time in mode $i$, subject to $\sum P_i = 1$. Based on the initial
task assignments, $P_{active}$ for a given tier can be estimated as $P_{active} = \sum_{i=0}^{n} t_i / f_i$, where $t_i$ is
the processing time and $f_i$ is the interval between executions of task $i$ on the tier. Thus, SLA battery

lifetime is

$$T_{lifetime} = H \left( \frac{C}{I H} \right)^{k}, \tag{4.3}$$

where C is the battery capacity (in Ah), H is the rated discharge time (in hours, 20 hours for SLA

batteries), and k is Peukert’s constant (1.25 for SLA batteries).
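Eqn. 4.3 can be evaluated directly, as in the following minimal sketch. The function name is hypothetical, the defaults are the H and k values quoted above, and the operating points in the loop are illustrative rather than measured.

```python
def peukert_lifetime_hours(capacity_ah, current_a, rated_hours=20.0, k=1.25):
    """Estimate battery lifetime in hours using Eqn. 4.3 (Peukert's Law)."""
    if current_a <= 0:
        raise ValueError("discharge current must be positive")
    return rated_hours * (capacity_ah / (current_a * rated_hours)) ** k


# Hypothetical operating points for a 7 Ah battery.
for current_ma in (25, 50, 125):
    hours = peukert_lifetime_hours(7.0, current_ma / 1000.0)
    print(f"{current_ma} mA -> {hours:.1f} h")
```

At the rated current C/H (0.35 A for a 7 Ah battery) the estimate equals H for any k; below that current the exponent k stretches the estimated lifetime, so the result is sensitive to the value chosen for k.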

Estimating system lifetime using Eqn. 4.3 does not take into account energy harvesting. Since

an SLA battery's discharge voltage varies linearly for a constant discharge current, by monitoring the

change in battery voltage over time, the charging or discharging time can be computed as follows:

$$\frac{\Delta v}{\Delta t} \begin{cases} > 0, & \text{charging}, \\ = 0, & \text{equilibrium}, \\ < 0, & \text{discharging}, \end{cases} \qquad T_{charge} = (V_T - V_c) \cdot \frac{\Delta t}{\Delta v}, \qquad T_{discharge} = (V_D - V_c) \cdot \frac{\Delta t}{\Delta v}, \tag{4.4}$$

where Vc is the current battery voltage, Tchar ge is the time to recharge the battery, VT is the battery

fully charged voltage, VD is the terminal discharge voltage, and Tdischar ge is the time to discharge

the battery (system lifetime).
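A minimal sketch of Eqn. 4.4 follows. The function names and the example voltages (13.0 V present, 13.8 V full) are assumptions made for illustration; only the 10.5 V terminal voltage and the -0.0282 V/h slope appear elsewhere in this chapter (Section 4.6.4).

```python
def time_to_voltage_hours(v_now, v_target, dv_dt):
    """Hours until the battery reaches v_target at a constant slope dv_dt (V/h).

    Implements (V_target - V_c) * (dt/dv) from Eqn. 4.4. Returns None when
    the voltage is flat or moving away from the target.
    """
    if dv_dt == 0:
        return None  # equilibrium
    hours = (v_target - v_now) / dv_dt
    return hours if hours >= 0 else None


def battery_outlook(v_now, dv_dt, v_full, v_terminal):
    """Classify the voltage trend and return (state, hours) per Eqn. 4.4."""
    if dv_dt > 0:
        return "charging", time_to_voltage_hours(v_now, v_full, dv_dt)
    if dv_dt < 0:
        return "discharging", time_to_voltage_hours(v_now, v_terminal, dv_dt)
    return "equilibrium", None


# Assumed values: 13.0 V present voltage, 13.8 V full, 10.5 V terminal.
print(battery_outlook(13.0, -0.0282, v_full=13.8, v_terminal=10.5))
```

With these assumed voltages the discharging branch gives between 88 and 89 hours, which is in line with the discharge time reported in Section 4.6.4.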

With energy harvesting, the time to discharge or recharge the battery depends on both the

discharge current and the amount of energy harvested, i.e., the net battery current. Using the

change in battery voltage, the net battery discharge current can be estimated by $I = \frac{C}{H \left( t/H \right)^{1/k}}$, where

$C$, $H$, and $k$ are the same as in Eqn. 4.3 and $t$ is $(V_T - V_D) \cdot \Delta t / \Delta v$ in minutes.

4.4.3 Dynamic Task Assignment

Using the processing times and communication rates presented in the previous section, task assign-

ments were made based on evaluating Eqn. 4.1 when assigned to a tier (Table 4.3). For instance,

the Event Detection task was assigned to the sensor node (Tier 1) versus the base station (Tier 2)

one hop away. This task can be computed in 0.15 milliseconds (t1) on a 32-bit 84MHz Arduino

processor. It takes approximately 19 milliseconds (c12) to transmit 1 second of sensor readings and

associated timing information one hop. In this case, no matter how fast the task can be computed

in Tier 2 (t2), the inequality will be false and the processing should remain on Tier 1.

Another consideration in deciding tier assignments is the data needed. For example, event

detection only requires data from a single sensor. In contrast, the hypocenter computation needs

event information from multiple nodes. For rendezvous operations such as this, the communications

impact for both the single node case and the multi-node case must be considered. When a node

detects a seismic event, the associated event time must be transmitted to the base station.

The seismic event information can be transmitted in a 16-byte packet. The packet rate depends

on the frequency of seismic events and cannot be predicted. It must be determined if the network

has suﬃcient throughput to handle the event packets within a large network. To determine this we

considered nodes placed in a 20km diameter ring surrounding a volcano with 250 meter spacing

such that each node could communicate with at least two other nodes. To complete this ring

approximately 500 nodes would be required. A single node transmitting one 16-byte packet every

10 seconds would require a data throughput of 0.0128 kbps, assuming no transmission failures.

Assuming the packet must travel through half the nodes to reach the base station a 250 hop path

provides 0.49 kbps data throughput. Since all messages must ﬂow through the ﬁnal link to the base

station the aggregate data throughput needed for a 500 node network is 6.4 kbps. This is well below

the 91 kbps single hop data throughput rate.
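The capacity check above is simple arithmetic and can be reproduced with the sketch below. The function names are hypothetical; the inputs (500 nodes, 16-byte packets, one packet per node every 10 seconds, a 250-hop worst-case path) are the figures used in the text, and retransmissions are ignored as the text assumes.

```python
def throughput_kbps(hops):
    """Multi-hop throughput model from Eqn. 4.2."""
    return 91.23 * hops ** -0.945


def offered_load_kbps(nodes, packet_bytes, period_s):
    """Aggregate payload rate when every node sends one packet per period."""
    return nodes * packet_bytes * 8 / period_s / 1000.0


per_node = offered_load_kbps(1, 16, 10)      # a single node
aggregate = offered_load_kbps(500, 16, 10)   # whole ring, seen at the final link
print(per_node, aggregate)
print(round(throughput_kbps(250), 2))        # longest path in the ring
print(aggregate < throughput_kbps(1))        # does the final hop have headroom?
```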

With such a large network, the impact of delays and lost packets must be considered from the

application point of view. For seismic tomography, each event packet sent by a node results in one

ray being included in the model calculation. Each ray has the potential to provide the information

needed to compute the velocity in a new portion of the model. Since the model will be computed

using thousands of rays the loss of a low percentage of rays should have minimal impact on the

resulting model. The impact of a delayed packet depends on how frequently the model will be

recomputed. For example, assume the model were recomputed once 25,000 to 50,000 new rays

have been collected. If seismic events were occurring once every minute and reported by half the

stations (250 nodes reporting) it would take 100 to 200 minutes to receive the number of events

required to recompute the velocity model. Since delays will most likely be on the order of seconds

the resulting system will be tolerant to communication delays. The Hypocenter Determination

task, based solely on Eqn. 4.1, should be assigned to the remote server. This equation only

considers processing time and communications overhead. It does not consider other aspects of the

application environment. For example, not all seismic events contribute data to the velocity model.

These include seismic events detected by only a small number of nodes (e.g., man-made events) or

events with a source outside the local vicinity of the volcano. Once the event hypocenter has been

computed it can be used to identify these types of events. Performing this task on the base station

allows the system to avoid using the high latency satellite link to transmit data which would not

be used to update the model. This provides power savings, as the satellite station can be placed in

sleep mode when the link is not being used.

Oﬀ-line task assignment using Eqn. 4.1 also does not take into account the dynamic nature of

WSNs and the system lifetime constraints. For example, during an extended period of cloudy days,

insuﬃcient energy may be harvested from a solar panel to meet a node’s system lifetime constraint.

Under that condition a node may elect to shift a high energy task to another node having suﬃcient

energy. This can be accomplished by incorporating the system lifetime (Tdischar ge calculated by

Eqn. 4.4) into the task assignment process. If the system lifetime is less than a speciﬁed threshold,

a decision can be made to shift one or more assigned tasks to another node or tier. Once a decision

has been made to shift a task, the system records the current battery voltage as Vdecision. When

$V_c > V_{decision} + \tau$, the task will be shifted back to the node, where $\tau$ is the amount by which the

battery voltage should recover.
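The shift-and-recover behavior described above amounts to a small hysteresis controller, sketched below. The class name, the threshold values, and the reading of the recovery condition as "the voltage has risen by τ above the voltage recorded at the decision" are assumptions made for illustration, not details taken from the deployed firmware.

```python
class TaskShiftController:
    """Shift a task away when lifetime is short; bring it back after recovery."""

    def __init__(self, min_lifetime_h, tau_v):
        self.min_lifetime_h = min_lifetime_h  # lifetime threshold (hours)
        self.tau_v = tau_v                    # required voltage recovery (V)
        self.v_decision = None                # voltage recorded when shifting

    @property
    def shifted(self):
        return self.v_decision is not None

    def update(self, v_now, est_lifetime_h):
        """Return True while the task should run on another node or tier."""
        if not self.shifted:
            if est_lifetime_h is not None and est_lifetime_h < self.min_lifetime_h:
                self.v_decision = v_now       # record V_decision
        elif v_now > self.v_decision + self.tau_v:
            self.v_decision = None            # battery has recovered by tau
        return self.shifted


ctrl = TaskShiftController(min_lifetime_h=48.0, tau_v=0.5)
print(ctrl.update(12.0, 40.0))   # lifetime below threshold: shift the task
print(ctrl.update(12.3, 60.0))   # not yet recovered by tau: stay shifted
print(ctrl.update(12.6, 90.0))   # recovered: shift the task back
```

Requiring the voltage to recover by τ before shifting back keeps the task from oscillating between nodes when the battery hovers near the threshold.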

We now illustrate the impact of our dynamic assignment scheme on the system lifetime using an

experiment. Consider a node with two power states: active and sleep while performing sampling,

STA/LTA and ARAIC. Sampling takes 100 µs every 10 ms, STA/LTA takes 0.15 ms every 10 ms,

and ARAIC takes 3.9 s whenever an event is detected. If an event is detected every 60 seconds,

$P_{active}$ = (100 µs)/10 ms + 0.15 ms/10 ms + 3.9 s/60 s = 0.09. If $I_{active}$ is 125 mA and $I_{sleep}$ is

25 mA, $I_{avg}$ = 25 mA · (1 − 0.09) + 125 mA · 0.09 = 34 mA.

A 7Ah SLA battery discharging at 125 mA has an estimated life of 84.5 hours. With 34 mA

discharge rate, the battery life is 523 hours. If the ARAIC processing task were assigned to another

node, P would change from 0.09 to 0.031 (allowing 0.35 seconds to transmit the required data to

an adjacent node), the average current would be lowered to 28.1 mA providing a battery life of 699

hours.
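The duty-cycle arithmetic in this experiment can be reproduced with the short sketch below. The function names are hypothetical; the task times, intervals, and currents are the values given in the experiment above.

```python
def active_fraction(tasks):
    """Sum of (processing time / execution interval) over all tasks."""
    return sum(t / interval for t, interval in tasks)


def average_current_ma(p_active, i_active_ma, i_sleep_ma):
    """Duty-cycle weighted current for a two-mode (active/sleep) node."""
    return i_sleep_ma * (1.0 - p_active) + i_active_ma * p_active


# (processing time, interval) in seconds: sampling, STA/LTA, ARAIC.
local = [(100e-6, 10e-3), (0.15e-3, 10e-3), (3.9, 60.0)]
# ARAIC replaced by 0.35 s of radio time to hand the data to a neighbor.
offloaded = [(100e-6, 10e-3), (0.15e-3, 10e-3), (0.35, 60.0)]

for name, tasks in (("local", local), ("offloaded", offloaded)):
    p = active_fraction(tasks)
    print(name, round(p, 3), round(average_current_ma(p, 125.0, 25.0), 1))
```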

4.5 Deployments

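Table 4.4: Key characteristics of the deployments

Deployment site | Time | Duration | Area of coverage | # of stations | Station type | Communication | Cost (US$)
----------------|------|----------|------------------|---------------|--------------|---------------|-----------
Tungurahua, Ecuador | Jul 2012 | 1 week | 3 patches, 2 stations each, 200 m separation | 6 | Smartphone based | None | 2.1K
Llaima, Chile | Jan 2015 | 4 months | 925 m × 475 m | 16 | Fig. 4.7(a) | XBee-based | 6.4K
Llaima, Chile | Jan 2015 | 4 months | 33 km × 26 km | 50 | Traditional | None | 572K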

We deployed our systems on Tungurahua Volcano, Ecuador and Llaima Volcano, Chile (Fig. 4.10)

to evaluate their operations in real volcano environments. These two deployment sites present different

environments. Tungurahua is at a higher altitude, wetter, and with more vegetation. Llaima

is at a lower altitude, with less vegetation, and signiﬁcantly drier in the summer. Table 4.4 summa-

rizes the characteristics of the two deployments. Each deployment had diﬀerent evaluation goals

as described in the subsequent sections.

4.5.1 Tungurahua Volcano Deployment

To test the first generation sensor nodes, we traveled to the Tungurahua Volcano near Baños, Ecuador

and deployed six nodes in July, 2012. This ﬁeld trip aﬀorded us an opportunity to test under real

ﬁeld conditions. It also allowed us to gain ﬁrst-hand experience of the conditions we would face in

future larger scale deployments. The primary goal was to test whether our sensing hardware could

record seismic signals.

(a) Sensor node

(b) Base station

Figure 4.10: Llaima Volcano deployment, Chile, 2015.

The deployment faced several challenges. The ﬁrst was the remote location. Only a few sites

were accessible by road. Vast areas were only reachable via hiking on foot. Utilizing helicopters to

assist during deployment was not feasible due to the altitude and the high variability of the weather

conditions. Teams had to backpack all the equipment several kilometers to reach the deployment

locations. Besides the distances covered, working at high altitude (12,000 to 14,000 feet) presented

its own challenges. From the sensor design standpoint, care must be taken to minimize the package

weight. It was important for the teams to be able to maximize the number of packages carried

on each trip to minimize the number of trips. The wet weather conditions presented the second

challenge. Fog, which commonly covered the volcano, and rain challenged the equipment

packaging.

4.5.2 Llaima Volcano Deployment

The second deployment was on Llaima Volcano located near Melipeuco, Chile. The primary goal

of this deployment was to test the new generation of hardware over a long duration period. We

deployed 16 new-generation nodes in January 2015, in an 800 m by 1400 m patch. Fig. 4.11 shows

Figure 4.11: Node locations in Llaima Volcano deployment.

the deployment locations relative to the volcano summit and the terrain around the nodes. Llaima

has an access road encircling it and the lower slopes can be reached using an oﬀ-road vehicle.

With the varied terrain, it also provided a good ﬁeld test of the node to node communications. The

16 nodes were deployed by three 2-person teams in one day. One team acted as a survey team

establishing the ﬁnal node locations. These were chosen based on topology to ensure that at least

three other node locations were visible. Fig. 4.10(a) shows a deployed sensor node. A base station

with satellite link, as shown in Fig. 4.10(b), was deployed. The system ran from January 10th to

March 25th, 2015.

Together with our WSN nodes, 26 traditional seismic stations were deployed across a broad

geographic area surrounding the volcano, as shown in Fig. 4.11. These traditional stations were

installed over a period of two weeks by four multiple-person teams. The number of people in each

deployment team depended upon how far the equipment had to be carried from the nearest road.

Two stations were transported on horseback.

4.5.3 Deployment Lessons Learned

We gained several important insights from the deployment. First, the communication range was

longer than that observed during our initial testing. In the ﬁeld, the signal strength from the link

tests followed what was predicted by the path loss equations with consideration of terrain. Based

on this, given topological information, it would be possible to estimate link quality during the

planning phase of a deployment. This estimation is critical for larger scale deployment of 500 to

1,000 nodes.

Second, the antenna pole was fashioned out of two one-meter PVC pipe sections. In the ﬁeld,

it was discovered that the wind caused the pole to vibrate. This vibration can be transmitted to the

geophone, introducing noise in the seismic signal. Heavier material should be used or guy wires

attached to stabilize the poles. Based on the communication range results, an even simpler solution

would be to use a single section of PVC pipe only. This would simplify deployment and be less

likely to vibrate in the wind.

Third, assembling the solar panel frame required installation of several bolts. This proved

diﬃcult and increased the time to deploy a node. Altering the leg design would allow them

to be installed prior to shipment. In the ﬁeld, they could simply be unfolded, simplifying their

installations.

For the Llaima Volcano deployment, the 2-meter cable attaching the geophone to the node was

adequate. This cable should be increased to 6 meters to allow more flexibility in sensor placement.

This would also place the sensor further from the antenna pole, further reducing any noise generated

by the pole vibrating in the wind.

4.6 Evaluation and Deployment Experiences

This section presents the system evaluation and experiences during the Llaima Volcano deploy-

ment.

4.6.1 System Delay

To evaluate the computation overhead of the hypocenter computation, the algorithm was executed

while varying the number of stations detecting the event. The resulting execution times are shown

in Fig. 4.12 and are roughly linear with the number of stations. Extrapolating to 500 stations, the

execution time on the base station increases to approximately 4 seconds, while on the remote server

(or cloud) it increases to 0.3 seconds. To accurately compute the hypocenter it would not be necessary to

utilize event information from all 500 stations. Only a small subset of events from geographically

distributed stations are required. By limiting this subset to 10% of the nodes the execution time

can be tuned to not exceed the task deadline. Tomography involves computing a velocity model

using matrix inversion. The dimensions of this matrix are based on the resolution of the model, the

number of stations and events. Increasing these results in more computation overhead as shown in

Fig. 4.13.

Figure 4.12: Hypocenter execution times by number of stations. (Axes: Number of Stations vs. Elapsed Time (ms); series: BeagleBone Black, Cloud.)

Figure 4.13: Tomography execution times by number of events. (Axes: Number of Events vs. Elapsed Time (Seconds); series: 20 Stations, 40 Stations.)

Of the tasks executed by the sensor node, the ARAIC event detection task takes the longest time

(3.9 seconds on a Gen 3 node). If this task were executed on the base station it would take only 6

ms. Using Eqn. 4.1, if the communication time to transmit the data ($c_{i,i+1}$) plus the execution time

on the node ($t'_i$) and base station ($t_{i+1}$) is less than the 3.9 seconds, then this task could be executed

more eﬃciently on the base station. To evaluate this, a node repeatedly analyzed a simulated

seismic signal that contained an event. Each time the STA/LTA event detection task detected an

event, the seismic signal was transmitted 1-hop to a basestation for detection using the ARAIC task.

The ARAIC task requires 16 seconds of seismic data (1600 samples). Using a compressed format

the required data could be transmitted in 55 99-byte packets and one 27-byte packet. To transmit,

with acknowledgements, took 2.7 seconds with minimal processing overhead on the node. On the

surface, it would appear to make sense to move the ARAIC task to the base station. This is not the

case once the network impact of having 500 nodes sending the seismic data over multiple hops to

the base station is considered: this would saturate the network capacity. When the node executes the ARAIC task,

the average current consumption is 40 mA. When transmitting the seismic data to the base station,

the average current consumption increases to 60 mA. Considering the battery capacity consumed,

executing the ARAIC task on the node uses 156 mA·s (40 mA × 3.9 s) versus 162 mA·s

(60 mA × 2.7 s) when executing on the base station. For the Gen 4 node, transmitting this seismic data

takes the same amount of time while executing the task on the node takes only 206 ms. Thus for

Gen 4 nodes, it is vastly more efficient to compute the ARAIC task on the node.
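The charge comparison above can be checked with a few lines of Python. The function names are hypothetical. The Gen 3 figures are those given in the text; the Gen 4 line assumes the same 40 mA compute current as the Gen 3 node, which the text does not state.

```python
def charge_mas(current_ma, seconds):
    """Battery charge consumed, in milliamp-seconds."""
    return current_ma * seconds


def cheaper_on_node(i_compute_ma, t_compute_s, i_transmit_ma, t_transmit_s):
    """True if computing locally uses less charge than transmitting the data."""
    return charge_mas(i_compute_ma, t_compute_s) < charge_mas(i_transmit_ma, t_transmit_s)


print(charge_mas(40, 3.9), charge_mas(60, 2.7))   # Gen 3: compute vs transmit
print(cheaper_on_node(40, 3.9, 60, 2.7))
# Gen 4, assuming (not stated in the text) the same 40 mA compute current.
print(charge_mas(40, 0.206), cheaper_on_node(40, 0.206, 60, 2.7))
```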

4.6.2 Data Fidelity

There are two dimensions to assessing data fidelity. The first is how accurately sensor samples are time-stamped.

The GPS provides a PPS interrupt with a jitter of 10 ns. To achieve sub-ms precision,

we integrate the Due's system clock and an internal 4-µs resolution timer. Specifically, when the

node receives the 1PPS interrupt, the system clock is resynchronized with the global time and the

current value of the timer is saved. The time stamp of a seismic sample is the global time plus the

diﬀerence between the timer’s current and saved values. The timer’s drift was measured against

the PPS signal. Under normal temperature conditions, it drifts 16 µs per second. The drift of

the Real-Time Clock (RTC) in the Gen 4 node was also evaluated against GPS time. To evaluate

this the node was allowed to run for a 32 hour period. At the end of this period the GPS time was

compared with the RTC time. During this period the RTC had drifted 2.24 seconds. This was a

19.01 parts per million (PPM) drift. The RTC also has the capability to compensate for drift such

as this. The required compensation factor is calculated by the following formula [37]:

$$C = \mathrm{int}\left( \frac{D_{ppm}}{0.1192} + 0.5 \right), \tag{4.5}$$

where Dppm is the drift in Parts Per Million and C is the compensation factor. Positive compensation

values result in the RTC running faster, while negative values slow the RTC. For the 19.01 PPM

drift a compensation value of 160 was applied, the RTC synchronized with the GPS and allowed

to run an additional 32 hours. This compensation value properly corrected for the RTC drift. By

periodically monitoring the drift it is possible to correct for changes in the amount of drift caused

by temperature changes.
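The PPS-based time-stamping scheme described at the start of this section can be sketched as follows. This is an illustration of the idea rather than the node firmware: the class and method names are hypothetical, and the 32-bit counter width used to handle timer wraparound is an assumption.

```python
TIMER_TICK_S = 4e-6          # 4-microsecond timer resolution (from the text)
TIMER_MODULUS = 2 ** 32      # assumed counter width (hypothetical)


class PpsTimestamper:
    """Combine a GPS second boundary with a free-running timer."""

    def __init__(self):
        self.global_time_s = None   # GPS time at the last PPS edge
        self.saved_ticks = None     # timer value captured at that edge

    def on_pps(self, gps_time_s, timer_ticks):
        """Call from the PPS interrupt: resynchronize and save the timer."""
        self.global_time_s = gps_time_s
        self.saved_ticks = timer_ticks

    def timestamp(self, timer_ticks):
        """Global time plus the timer difference since the last PPS edge."""
        if self.global_time_s is None:
            raise RuntimeError("no PPS received yet")
        elapsed = (timer_ticks - self.saved_ticks) % TIMER_MODULUS
        return self.global_time_s + elapsed * TIMER_TICK_S


ts = PpsTimestamper()
ts.on_pps(gps_time_s=1000, timer_ticks=5_000)
print(ts.timestamp(7_500))   # 2,500 ticks after the PPS edge
```

Because the timer difference is reset at every PPS edge, timer drift accumulates for at most one second before it is corrected.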

Table 4.5: Shared SPI bus ADC Sample Intervals

Sample Interval (ms) | Number of samples
---------------------|------------------
9 | 43
10 | 154237
11 | 53
34 | 1
Total | 154334

The second factor impacting data fidelity is sampling frequency consistency. This is impacted by

the method used to communicate with the ADC. Two approaches for communicating with the ADC

were evaluated. Both the ADC and the SD card utilize a Serial Peripheral Interface (SPI) bus for

communication. In the first approach, the ADC and the SD card shared the hardware SPI bus. With

this approach, care must be taken to ensure the two devices do not interfere with each other when

accessing the SPI bus. Since the ADC is accessed at interrupt level, interrupts must be disabled

when accessing the SD card. This causes two side eﬀects. First, the ADC data ready interrupt

can be delayed resulting in an inconsistent sample rate. Second, the PPS interrupt can be delayed,

causing clock skew. During a half-hour test, the sampling intervals ranged from 9 ms to 34 ms

as shown in Table 4.5. In the second approach, the ADC is linked to three GPIO pins with SPI

signaling performed by software routines. This approach provided a consistent 10 ms sampling

period.

The ability to detect weak seismic activity is determined by the sensitivity of the sensor as well

as the noise level of the signal chain. The noise of the signal chain was measured for the Gen 3

node and two variations of the Gen 4 node. The ﬁrst Gen 4 (Gen 4a) variation used exactly the

same input ampliﬁer as the Gen 3 node. The only diﬀerence being the board layout. The second

Gen 4 (Gen 4b) variation used an entirely diﬀerent ampliﬁer design that incorporated a 6-pole low

pass ﬁlter. To measure the noise of the seismic ampliﬁer, the input of the ampliﬁer was terminated

with an input resistance equivalent to that of the seismic sensor. Then 10,000 samples were collected

using a Gen 3 node and both Gen 4 variations. The Root Mean Square values for these three

sets of samples were computed and compared. Changing the sensor board layout (Gen 3 vs Gen

4a) resulted in a 10 dB reduction in the noise level. Adding the input low pass filter (Gen 3 vs Gen

4b) resulted in a 16.8 dB reduction in the noise level.
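The noise comparison above can be sketched as an RMS calculation followed by a ratio in decibels. The function names are hypothetical, the sample data is synthetic, and the use of the amplitude convention (20·log10 of the RMS ratio) is an assumption, since the text does not state which decibel convention was used.

```python
import math


def rms(samples):
    """Root Mean Square of a sequence of ADC samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def noise_reduction_db(reference, improved):
    """Reduction in RMS noise, in dB, of `improved` relative to `reference`."""
    return 20.0 * math.log10(rms(reference) / rms(improved))


# Synthetic data only: the second trace has one tenth the amplitude.
old = [10, -10, 10, -10]
new = [1, -1, 1, -1]
print(round(noise_reduction_db(old, new), 1))
```

Under this convention, a 10 dB reduction corresponds to an RMS ratio of about 3.2, and a 16.8 dB reduction to a ratio of about 6.9.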

4.6.3 Communication Performance

Communication performance is aﬀected by station spacing, antenna gains, transmitter power, and

receiver sensitivity. Traditional WSNs rely on high density to ensure network connectivity. For

volcano monitoring, geographic coverage is more important and necessitates wider spacing.

Figure 4.14: One-hop link quality (circles represent nodes).

Figure 4.15: Line of Sight Path between Node 9 and Node 3. (Axes: Horizontal Distance between Nodes (m) vs. Altitude (m); series: Visible, Obscured, Line of Sight.)

Figure 4.16: Distribution of Satellite Round Trip Ping Times. (Axes: Roundtrip time (ms), in 250 ms bins from 1000 to 3000, vs. Packets in Interval.)

During the ﬁeld deployment, the nodes periodically assessed link quality. For each assessment,

a hundred 64-byte packets were transmitted between two nodes. Fig. 4.14 shows the link quality

for single hop links. In our assessment, a link was considered “good” (shown with solid lines) if

there were less than 10 retransmissions, “weak” (shown with dotted lines) if there were between 10

and 15 retransmissions, and “bad” (shown with dash-dot-dot lines) if there were between 15 and

20 retransmissions. Links with more than 20 retransmissions were not included in the ﬁgure. The

link quality shown in Fig. 4.14 is consistent with the ground topology. There was a ridge running

roughly south to north that nodes 1, 8, 15, 11 and 14 were installed on, as indicated by the oval in

Fig. 4.14. As a result, nodes on either side

of this ridge did not have a line-of-sight path to each other. The topographic proﬁle between nodes

9 and 3, Fig. 4.15, illustrates this ridge and how it blocks the line of sight between these nodes.

There was also a rise between nodes 9/10 and 13/16 which blocked the line of sight between these

two groups. To provide a network path between them, Node 12 was installed on the rise.

Node 5 was also blocked from several

nodes by intermediate rises.
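The link classification used for Fig. 4.14 can be expressed as a small lookup, sketched below. The function name and the sample results are hypothetical. The text leaves the boundary cases open, so counts of exactly 15 and exactly 20 are assigned to the better of the two adjacent classes here as an assumption.

```python
def classify_link(retransmissions):
    """Map a retransmission count (per 100 test packets) to a link class."""
    if retransmissions < 10:
        return "good"
    if retransmissions <= 15:   # boundary handling is an assumption
        return "weak"
    if retransmissions <= 20:   # boundary handling is an assumption
        return "bad"
    return None  # more than 20: link omitted from the figure


# Hypothetical assessment results keyed by (node, node) pairs.
results = {(1, 8): 2, (9, 10): 12, (5, 12): 18, (9, 3): 47}
links = {pair: classify_link(n) for pair, n in results.items()}
print({pair: q for pair, q in links.items() if q is not None})
```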

The base station communicated with the remote database server through a satellite link. To

assess the performance of this link, the base station periodically used the ping command to

determine the round trip times. The test showed a median round trip time of 1849 ms. Fig. 4.16

shows the distribution of these round trip times.

The satellite communication overhead of two diﬀerent protocols was also assessed. The ﬁrst

used a simple web service interface based on HTTPS Post requests. While easy to implement and

test, it is a heavy-weight protocol. For example, 962 events generated 5.85 MB of upload traﬃc and

8.227 MB download. Typically, satellite links have data packages with a limited amount of data

which can be transmitted each month. This level of traﬃc quickly exceeded our 2 MB per-month

data limit. The second protocol sends binary encoded messages over a TCP/IP connection. This

approach generated less than 100 KB of upload traﬃc for the same number of event messages.

4.6.4 Battery and Solar Panel Performance

Our design uses a 7Ah SLA battery recharged using a 20 watt solar panel. Fig. 4.17 shows a

24-hour discharge/charge cycle for this design during a full-sun daylight period. As shown in the

ﬁgure, after discharging during the night period, the battery was completely recharged during the

next daylight period. Analysis shows that the battery discharged at a rate of 0.0282 volts per hour

during the dark period. At this rate, it takes 88.6 hours (about 3.7 days) to discharge the battery to

10.5 volts, similar to the result obtained through Peukert’s Law. This was conﬁrmed during the

Figure 4.17: Battery daily charging cycle. (Plot of battery and solar panel voltage (V) against time of day.)

Figure 4.18: System Life Time Estimation. (Plot of predicted and actual remaining time (minutes) against elapsed time (minutes).)

field deployment when nodes operated for four days under heavy cloud cover. Post-deployment

analysis determined that 75% of the batteries would not accept a charge due to repeated cycles in which

the batteries discharged below 10 volts. To resolve this, a new power control circuit was designed

to disconnect the node from the battery when it discharged to 10 volts and to reconnect it when it recharged

to 11 volts.
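
The disconnect and reconnect thresholds form a hysteresis band: the node stays off until the battery has recovered a full volt, so it does not switch on and off repeatedly around a single threshold. The power control circuit itself is hardware; the sketch below is only a software model of its switching behavior, and the class and method names are hypothetical.

```python
class BatteryCutoff:
    """Software model of a low-voltage disconnect with hysteresis."""

    def __init__(self, disconnect_v=10.0, reconnect_v=11.0):
        self.disconnect_v = disconnect_v  # node is cut off at or below this
        self.reconnect_v = reconnect_v    # node is restored at or above this
        self.connected = True

    def update(self, battery_v):
        """Apply one voltage reading and return whether the node is powered."""
        if self.connected and battery_v <= self.disconnect_v:
            self.connected = False
        elif not self.connected and battery_v >= self.reconnect_v:
            self.connected = True
        return self.connected


if __name__ == "__main__":
    cutoff = BatteryCutoff()
    for volts in [12.0, 10.4, 10.0, 10.6, 10.9, 11.0, 10.5]:
        print(volts, cutoff.update(volts))
```

Readings between 10 and 11 volts leave the state unchanged, so a battery hovering at 10.6 volts after a cutoff keeps the node disconnected until charging has actually progressed.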

To evaluate the prediction of system life time using Eqn. 4.4, a node was allowed to operate, processing

a simulated seismic signal, for several days. The simulated seismic signal generated an event every

16 seconds that was transmitted to a base station. During operation, the battery voltage was sampled once

per second. These readings were then passed through an exponential low-pass filter with a period
of 11 minutes (a = 0.985) to compute a long-term average. The inverse rate of change of the battery voltage (Δt/Δv)
was then computed over the same 11-minute period, with the estimate updated every minute. The

amount of time for the battery to discharge from its current voltage to the terminal discharge voltage

(VD) of 10.5 volts was predicted using Eqn. 4.4. Fig. 4.18 displays the predictions from the

battery's fully charged voltage (VT = 12.5 volts) to the terminal discharge voltage. As can be seen

from this figure, this approach in general overestimates system life time and is very sensitive to

minor changes in the battery discharge rate, as evidenced by the large spikes in the estimated life

time, particularly at approximately the 300-minute point.
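
The procedure above can be sketched in a few lines. Eqn. 4.4 is not reproduced in this section, so the sketch assumes it has the linear form T = (V - VD) × (Δt/Δv), and it assumes the usual first-order form for the exponential filter; the function names are hypothetical.

```python
def low_pass(samples, a=0.985):
    """Exponential low-pass filter: y[n] = a*y[n-1] + (1-a)*x[n].

    The filter state is seeded with the first sample.
    """
    filtered = []
    y = samples[0]
    for x in samples:
        y = a * y + (1.0 - a) * x
        filtered.append(y)
    return filtered


def time_remaining(v_then, v_now, elapsed_min, v_terminal=10.5):
    """Minutes until v_terminal, assuming the discharge stays linear.

    v_then and v_now are filtered voltages taken elapsed_min minutes apart.
    Returns None when no discharge was observed over the window.
    """
    dv = v_then - v_now
    if dv <= 0:
        return None
    minutes_per_volt = elapsed_min / dv  # this is delta-t / delta-v
    return max(0.0, (v_now - v_terminal) * minutes_per_volt)


if __name__ == "__main__":
    # A 0.10 V drop versus a 0.01 V drop over the same 11-minute window.
    print(time_remaining(12.5, 12.4, 11))
    print(time_remaining(12.5, 12.49, 11))
```

Because the voltage drop over the window appears in the denominator, a drop of 0.01 volts instead of 0.10 volts yields a prediction roughly ten times longer, which is one way to see why small fluctuations in the discharge rate produce large spikes in the estimate.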

4.6.5 Packaging and Ease of Deployment

To deploy a node, the antenna pole is driven into the ground and the top section is attached. Once

completed, the Pelican case housing the sensor electronics snaps into a metal bracket. Next, the

GPS antenna and radio antenna are attached to the sensor node. The geophone is buried in a hole

dug nearby and the cable is plugged into the sensor node. The most time consuming task is to

assemble the solar panel leg frames using several small bolts. Once these are attached, the solar

panel and the battery box can be positioned and cabled to the sensor node. Zip ties secure the cables

in place prior to powering on the node. Nodes were placed approximately 200 meters apart across

the lower slope on the north side of the volcano. Two people could deploy a node in approximately

20 minutes including the time to hike between locations.

To assess the waterproof nature of the sensor and battery boxes, they were subjected to three

days of torrential rain. During the four months of deployment, the boxes remained dry inside with

no failure of the electrical connections. While the weather conditions during the deployment were

generally dry, moisture could condense on the equipment during the night.

4.7 Conclusion

This chapter presents the development of a seismic monitoring system supporting data-intensive

applications such as volcano tomography. By utilizing processing tiers with varying computing

capacities and a task assignment approach, low-cost sensors can support such applications. Through

our two deployments, we have shown that low-cost hardware can be used to record seismic events

in the immediate vicinity of a volcano. Moreover, we learned lessons and identiﬁed subtle but

important modiﬁcations to the system, which ensure the success of future deployments. Our

Chile deployment identiﬁed the importance of minimizing mechanical noise sources and ensuring

that sensor cables are long enough to allow the ﬂexibility needed for optimal sensor placement.

Communication measurements showed that the use of a wireless mesh network over varied terrain

is a viable communication medium, even over longer distances than originally expected.

CHAPTER 5

CONCLUSION

Designing and deploying a multi-node heterogeneous wireless sensor network presents a number

of challenges. These include:

Event Detection and Sensing Accuracy While environmental changes can be detected using

relatively simple algorithms and hardware, it is not always easy to assign them to a specific

source. For example, in Supero, simple light and sound detectors are able to detect the

operation of a number of lights and appliances. To automatically assign the detection event to a

specific light or appliance, it was necessary to utilize application-specific knowledge (power

decay of light) and a priori knowledge (distance of the lights from the sensors). By correlating

the detection events across multiple sensing modes, it was possible to eliminate the majority

of the false alarms while estimating per-appliance energy consumption with an

average error of less than 7.5%.

Controlling Power Consumption and Operational Lifetime For systems which must operate

unattended for long periods of time, controlling power consumption is a major

consideration. This is commonly handled by powering down portions of the system when not in use.

For some applications, this is handled by using the lowest sensor sampling rate the application

can tolerate while still providing the required sensing accuracy. This was particularly a

challenge for the acoustic sensing performed by Supero. Supero utilized a two-phase sampling

approach based on the observation that for large portions of the time the environment is

relatively quiet. During these quiet periods, Supero utilized a very slow sampling rate to

conserve power. Once an increase in the acoustic background sound was detected, it switched

to a higher sampling rate to provide the acoustic features needed to identify the

appliance.

The seismic sensing system utilizes a similar approach in that it was able to power down

major components, such as the GPS, and place the processor into lower power states be-

tween samples. To allow this to occur, it was necessary to utilize additional features of the

more modern processors such as built-in real-time clocks. This system also utilized power

harvesting from solar panels to recharge its batteries. This was needed due to the extended

operational lifetime and the inaccessibility of the nodes once deployed.

Assigning Processing Tasks to Functional Units In a WSN utilizing a heterogeneous mixture

of nodes with varying levels of computing resources, careful consideration must be given

to which node (functional unit) a processing task is assigned. When assigning a task,

the designer must consider the amount of data required by the task, processing resources

needed to perform the task, communication overhead and bandwidth available to move the

required data from one functional unit to another, and any real-time processing deadlines

imposed by the application. For Supero, the sensing nodes needed to be small so as to be as

unobtrusive as possible once installed. This resulted in nodes running on small batteries (two

AA batteries) and utilizing very power-efficient processors. In addition, the appliance detection

algorithms required event information from all the sensing nodes as well as significant

computational resources to execute. The combination of these two factors

determined the partitioning of the processing tasks, with the sensing nodes executing only

simple event detection and feature extraction algorithms while the base station performed

the remaining tasks.

Sensing applications composed of a wider variety of processing tasks require a more

quantitative analysis for assigning the tasks. As part of the seismic sensing system, such an

approach was developed. This approach quantified the computational, communication, and data

resources required for each processing task and developed decision criteria for assigning

a task to a particular functional unit based on the application processing deadlines. This

approach was then utilized to assign the various tasks to a functional unit, in this case a tier

in the system's hierarchical architecture.

Ease of Deployment As sensing networks are more widely deployed, they will be installed by

individuals who do not have specific knowledge or expertise related to the network. They

will more than likely be ordinary homeowners or experts in some other subject area,

such as geologists. As a result, it will be necessary for the deployment to be as simple yet

robust as possible. For example, with both Supero and the seismic sensing network, very

simple deployment instructions and guidelines were provided. In the case of Supero, the

instructions were “place a light sensor with unobstructed view to the light” and “place an

acoustic sensor on top of the microwave.” For the seismic sensing network, the guidelines

were “deploy the nodes approximately 200 to 300 meters apart in a location where 2 or 3

other nodes are visible.” In both cases, the systems were successfully deployed by

individuals with little or no expertise with sensor networks.
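
As an illustration of the two-phase sampling approach summarized above for Supero, the sketch below selects a sampling rate from a stream of sound-level readings. The rates, the threshold, and the rule for returning to the slow phase after a run of quiet readings are all assumptions made for the example; they are not values or logic taken from Supero.

```python
def two_phase_rates(levels, threshold, slow_hz=10, fast_hz=4000, hold=3):
    """Return the sampling rate selected after each sound-level reading.

    The sampler starts in the slow phase, switches to the fast phase when a
    reading exceeds `threshold`, and drops back to the slow phase after
    `hold` consecutive quiet readings (an assumed fallback rule).
    """
    rates = []
    fast = False
    quiet_run = 0
    for level in levels:
        if level > threshold:
            fast = True
            quiet_run = 0
        elif fast:
            quiet_run += 1
            if quiet_run >= hold:
                fast = False
        rates.append(fast_hz if fast else slow_hz)
    return rates


if __name__ == "__main__":
    readings = [1, 1, 9, 8, 1, 1, 1, 1]
    print(two_phase_rates(readings, threshold=5))
```

The node spends quiet periods at the slow rate and pays the cost of fast sampling only around acoustic activity, which is the power-saving idea behind the approach.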

The experiences from deploying the wireless sensor network resulted in several lessons learned.

The first lesson, learned early on in the development of Supero, involved node reliability. On

small embedded sensing nodes such as the TelosB and IRIS motes, it is difficult to identify

and debug system problems. The general lack of modern debugging tools relegates a designer

to using print statements to determine what code paths are being taken during operation. These

provide little help when the node locks up and completely stops functioning. While the Arduino

class of processors provides greater functionality, it still lacks modern debugging tools. In both

cases, although a JTAG hardware debugging capability may be supported, it is difficult to use, requires special

additional hardware, and may not be easily accessible on a commercially produced processor

board. For example, for the Teensy 3.6 processor utilized in Generation 4 of the seismic sensor

node, utilizing the JTAG functionality requires modifications to its circuit board. Once the board is

installed on the main sensor board, the JTAG connections are no longer accessible.

Another lesson learned was related to the importance of rapid prototyping and multiple small

field deployments when developing sensor hardware. While careful design can eliminate a number

of hardware-related issues, some cannot be identified until actual hardware has been produced

and used in an actual sensing application. Only by evaluating actual hardware

can subtle interactions between components be identiﬁed. Two such cases arose with the seismic

sensing nodes. The ﬁrst was related to SPI bus contention which caused erratic sensor sampling

rates. The second was related to noise in the sensor input ampliﬁer caused by interaction between

the analog and digital circuits located on the main sensor board.

The importance of field experience associated with deploying a sensor network cannot be

discounted. In the case of the seismic sensor network, the field deployments identified several areas

that needed to be improved. The first deployment identified a packaging issue that

needed to be addressed. The initial boxes were not sturdy enough, did not seal properly in some

cases, were too difficult to open in the field to service a node, and were too bulky for multiple units to be

carried by one person. This resulted in the development of an alternative packaging approach that

was tested during the second deployment.

Prior to the second deployment, it was believed the solar panel support frames required a wide

range of adjustability to account for the angle of the sun. This adjustability was in fact

not utilized in the field, and it increased the difficulty of transporting and deploying the

nodes. By redesigning the frames with less adjustability, the frames could be attached

before transporting the panel to the deployment site, which shortened the time to deploy a node by

approximately 10 minutes. While 10 minutes might not seem like a long period of time, it is when

you must deploy 15 to 25 nodes in a day in cold, rainy conditions. Under those conditions every

minute counts. As a result, performing multiple deployments, even those of limited

scope, is highly recommended for a designer of a sensor network, for there are some things that

can only be identified once a system is out in the field.

The system developed can also be applied to other sensing domains. For example, a current area

of study within the geophysical community is low frequency sound propagation at high altitudes.

For studies in this area, infrasound sensors are sent aloft in a high altitude balloon and used to

record the sounds. As the balloon proceeds on its flight path, GPS coordinates, altitude, and course

information are recorded. In 2015, a Generation 3 node was included on such a flight with the

seismic sensor replaced with an infrasound sensor [4].

Another application area is animal tracking. For this application, a camera and motion sensor

are added to the sensing nodes. When movement is detected the node captures a picture then

applies an image recognition algorithm to identify the source of movement. If the source is one

of the types of animals being tracked, the WSN is used to transmit the detection event to a central

location so that additional tracking resources can be dispatched to observe the animal.

In both of these additional application areas, field experience obtained during trial deployments

is critical to a successful final deployment. Only through a trial deployment can one be

assured that the nodes and overall system will operate in their final environment. In the case of the

high altitude balloon flight, it is difficult to simulate the environmental conditions experienced by

the system. How will the system respond to the effects of high altitude operation or the extreme

changes in temperature? For the animal tracking, how will animals, which are curious by nature,

react to seeing something new (a sensor node with a solar panel) in their environment? Will they

react in a way that introduces a system failure? Such questions can only be answered through ﬁeld

experience.

BIBLIOGRAPHY

[1] Alertme. Alertme, August 2015.

[2] American Coalition for Clean Coal Electricity. Study ﬁnds families burdened by ever-

increasing energy costs, 2011.

[3] Rajesh Krishna Balan, Mahadev Satyanarayanan, So Young Park, and Tadashi Okoshi. Tactics-

based remote execution for mobile computing. In MobiSys, 2003.

[4] D. C. Bowman, C. S. Johnson, R. A. Gupta, J. Anderson, J. M. Lees, D. P. Drob, and
D. Phillips. High Altitude Infrasound Measurements using Balloon-Borne Arrays. AGU Fall
Meeting Abstracts, pages S54B–06, December 2015.

[5] Eduardo Cuervo, Aruna Balasubramanian, Dae-ki Cho, Alec Wolman, Stefan Saroiu, Ranveer
Chandra, and Paramvir Bahl. Maui: making smartphones last longer with code oﬄoad. In
MobiSys, 2010.

[6] The Energy Detective. The energy detective, August 2015.

[7] Shannon Doocy, Amy Daniels, Shayna Dooling, and Yuri Gorokhovich. The human impact
of volcanoes: a historical review of events 1900-2009 and systematic literature review. PLoS
currents, 5, 2013.

[8] Steven Drenker and Ab Kader. Nonintrusive monitoring of electric loads. IEEE Computer

Applications in Power, 12(4):47–51, 1999.

[9] E.T. Endo and T. Murray. Real-time seismic amplitude measurement (RSAM): a volcano

monitoring and prediction tool. Bulletin of Volcanology, 53(7), 1991.

[10] Linda Farinaccio and Radu Zmeureanu. Using a pattern recognition approach to disaggregate
the total electricity consumption in a house into the major end-uses. Energy and Buildings,
30(3):245–259, 1999.

[11] Matthew Faulkner, Michael Olson, Rishi Chandy, Jonathan Krause, K Mani Chandy, and An-
dreas Krause. The next big one: Detecting earthquakes and other rare events from community-
based sensors. In Information Processing in Sensor Networks (IPSN), 2011 10th International
Conference on, pages 13–24. IEEE, 2011.

[12] Jason Flinn, SoYoung Park, and M. Satyanarayanan. Balancing performance, energy, and

quality in pervasive computing. In ICDCS, 2002.

[13] Scott W French and Barbara Romanowicz. Broad plumes rooted at the base of the earth’s

mantle beneath major hotspots. Nature, 525(7567):95–99, 2015.

[14] Sidhant Gupta, Matthew S. Reynolds, and Shwetak N. Patel. Electrisense: single-point
sensing using EMI for electrical event detection and classification in the home. In 12th ACM
International Conference on Ubiquitous Computing (UbiComp), pages 139–148, 2010.

[15] George W. Hart. Nonintrusive appliance load monitoring. Proceedings of the IEEE,
80(12):1870–1891, 1992.

[16] Bo-Jhang Ho, Hsin-Liu Cindy Kao, Nan-Chen Chen, Chuang-Wen You, Hao-Hua Chu, and
Ming-Syan Chen. Heatprobe: a thermal-based power meter for accounting disaggregated
electricity usage. In 13th International Conference on Ubiquitous Computing (UbiComp),
pages 55–64, 2011.

[17] Insteon. Insteon, August 2015.

[18] Xiaofan Jiang, Stephen Dawson-Haggerty, Prabal Dutta, and David Culler. Design and
implementation of a high-ﬁdelity ac metering network. In The 8th ACM/IEEE International
Conference on Information Processing in Sensor Networks (IPSN), pages 253–264, 2009.

[19] Xiaofan Jiang, Minh Van Ly, Jay Taneja, Prabal Dutta, and David Culler. Experiences with
a high-ﬁdelity wireless building energy auditing network. In The 7th ACM Conference on
Embedded Networked Sensor Systems (SenSys), pages 113–126, 2009.

[20] Deokwoo Jung and Andreas Savvides. Estimating building consumption breakdowns using
on/oﬀ state sensing and incremental sub-meter deployment. In The 8th ACM Conference on
Embedded Networked Sensor Systems (SenSys), pages 225–238, 2010.

[21] Sukun Kim, Shamim Pakzad, David Culler, James Demmel, Gregory Fenves, Steven Glaser,
and Martin Turon. Health monitoring of civil infrastructures using wireless sensor networks.
In Information Processing in Sensor Networks, 2007. IPSN 2007. 6th International Symposium
on, pages 254–263. IEEE, 2007.

[22] Younghun Kim, Thomas Schmid, Zainul M. Charbiwala, and Mani B. Srivastava. Viridiscope:
design and implementation of a fine grained power monitoring system for homes. In The 11th
International Conference on Ubiquitous Computing (UbiComp), pages 245–254, 2009.

[23] Lakshman Krishnamurthy, Robert Adler, Phil Buonadonna, Jasmeet Chhabra, Mick Flanigan,
Nandakishore Kushalnagar, Lama Nachman, and Mark Yarvis. Design and deployment of
industrial sensor networks: experiences from a semiconductor plant and the north sea. In
Proceedings of the 3rd international conference on Embedded networked sensor systems,
pages 64–75. ACM, 2005.

[24] J.M. Lees. The magma system of mount st. helens: non-linear high-resolution p-wave

tomography. Journal of volcanology and geothermal research, 53(1-4):103–116, 1992.

[25] Jonathan M Lees. Seismic tomography of magmatic systems. Journal of Volcanology and

Geothermal Research, 167(1):37–56, 2007.

[26] Liqun Li, Guoliang Xing, Limin Sun, Wei Huangfu, Ruogu Zhou, and Hongsong Zhu.
Exploiting fm radio data system for adaptive clock calibration in sensor networks. In The 9th
International Conference on Mobile Systems, Applications, and Services (MobiSys), pages
169–182, 2011.

[27] Memsic Corp. Iris datasheet, 2011.

[28] Memsic Corp. TelosB datasheet, 2011.

[29] Mohammad-Mahdi Moazzami, Dennis E Phillips, Rui Tan, and Guoliang Xing. Orbit: a
smartphone-based platform for data-intensive embedded sensing applications. In Proceedings
of the 14th International Conference on Information Processing in Sensor Networks, pages
83–94. ACM, 2015.

[30] Kim Munro. Automatic event detection and picking of p-wave arrivals. CREWES Report,

2004.

[31] Ryan Newton, Sivan Toledo, Lewis Girod, Hari Balakrishnan, and Samuel Madden. Wish-

bone: Proﬁle-based partitioning for sensornet applications. In NSDI, 2009.

[32] P3 International Corp. P4400 Kill A Watt TM Operation Manual, 2012.

[33] Jeongyeup Paek, Krishna Chintalapudi, John Caﬀrey, Ramesh Govindan, and Sami Masri.
A wireless sensor network for structural health monitoring: Performance and experience.
Center for Embedded Network Sensing, 2005.

[34] Shwetak N. Patel, Sidhant Gupta, and Matthew S. Reynolds. The design and evaluation
of an end-user-deployable, whole house, contactless power consumption sensor. In ACM
Conference on Human Factors in Computing Systems (CHI), pages 2471–2480, 2010.

[35] Shwetak N. Patel, Thomas Robertson, Julie A. Kientz, Matthew S. Reynolds, and Gregory D.
Abowd. At the ﬂick of a switch: Detecting and classifying unique electrical events on
the residential power line. In The 10th International Conference on Ubiquitous Computing
(UbiComp), pages 271–288, 2007.

[36] D. E. Phillips, R. Tan, M. M. Moazzami, G. Xing, J. Chen, and D. K. Y. Yau. Supero: A sensor
system for unsupervised residential power usage monitoring. In 2013 IEEE International
Conference on Pervasive Computing and Communications (PerCom), pages 66–75, March
2013.

[37] pjrc.com. Teensyduino, version 1.40 source code, 2017.

[38] Moo-Ryong Ra, Bin Liu, Tom F. La Porta, and Ramesh Govindan. Medusa: a programming

framework for crowd-sensing applications. In MobiSys, 2012.

[39] Alberto Rosi, Matteo Berti, Nicola Bicocchi, Gabriella Castelli, Alessandro Corsini, Marco
Mamei, and Franco Zambonelli. Landslide monitoring with sensor networks: experiences
and lessons learnt from a real-world deployment. Int. J. Sen. Netw., 10(3):111–122,
August 2011.

[40] Ritei Shibata. Selection of the order of an autoregressive model by akaike’s information

criterion. Biometrika, 63(1):117–126, 1976.

[41] R. Sleeman and T. van Eck. Robust automatic p-phase picking: an on-line implementation
in the analysis of broadband seismogram recordings. Physics of the earth and planetary
interiors, 113, 1999.

[42] Wen-Zhan Song, Renjie Huang, Mingsen Xu, Andy Ma, Behrooz Shirazi, and Richard
LaHusen. Air-dropped sensor network for real-time high-fidelity volcano monitoring. In
Proceedings of the 7th international conference on Mobile systems, applications, and services,
MobiSys ’09, pages 305–318, New York, NY, USA, 2009. ACM.

[43] Jacob Sorber, Nilanjan Banerjee, Mark D. Corner, and Sami Rollins. Turducken: Hierarchical

power management for mobile devices. In MobiSys, 2005.

[44] Ivan Stoianov, Lama Nachman, Sam Madden, and Timur Tokmouline. Pipenet: a wireless
sensor network for pipeline monitoring. In Proceedings of the 6th International Conference
on Information Processing in Sensor Networks, IPSN ’07, pages 264–273, New York, NY,
USA, 2007. ACM.

[45] Z. Cihan Taysi, M. Amac Guvensan, and Tommaso Melodia. Tinyears: spying on house
appliances with audio sensor nodes. In The 2nd ACM Workshop on Embedded Sensing
Systems for Energy-Efficiency in Building, pages 31–36, 2010.

[46] TPCDB. The power consumption database, August 2015.

[47] A Trnkoczy. Understanding and parameter setting of sta/lta trigger algorithm. IASPEI New

Manual of Seismological Observatory Practice, 2:1–19, 2002.

[48] U.S. DoE. Annual energy outlook, 2006.

[49] U.S. Energy Information Administration. Residential energy consumption survey, 2011.

http://www.eia.gov.

[50] Agustín Udías Vallina. Principles of seismology. Cambridge Univ. Press, 1999.

[51] G. Werner-Allen, K. Lorincz, M. Ruiz, O. Marcillo, J. Johnson, J. Lees, and M. Welsh.
Deploying a wireless sensor network on an active volcano. Internet Computing, IEEE,
10(2):18–25, March 2006.

[52] Ning Xu, Sumit Rangwala, Krishna Kant Chintalapudi, Deepak Ganesan, Alan Broad, Ramesh
Govindan, and Deborah Estrin. A wireless sensor network for structural monitoring. In
Proceedings of the 2nd International Conference on Embedded Networked Sensor Systems,
SenSys ’04, pages 13–24, New York, NY, USA, 2004. ACM.
