STATISTICAL APPROACHES FOR THE ANALYSIS, MEASUREMENT, AND

MODELING OF RFID SYSTEMS

By

Liyan Wang

A DISSERTATION

Submitted to

Michigan State University

in partial fulﬁllment of the requirements

for the degree of

Computer Science – Doctor of Philosophy

2018

ABSTRACT

STATISTICAL APPROACHES FOR THE ANALYSIS, MEASUREMENT, AND

MODELING OF RFID SYSTEMS

By

Liyan Wang

The goal of this thesis is to develop statistical and learning algorithms for the analysis,

measurement, and modeling of wireless networking( Radio frequency identiﬁcation systems).

Next, I will give a brief overview of those topics.

Radio frequency identiﬁcation (RFID) systems are widely used in logistic, supply chain

industry and inventory management. RFID is already in use in multiple industries and for

various purposes. The device in your car that lets you zoom by in the fast lane at a tollbooth,

while deducting a dollar amount from your account, is an example of RFID technology

in everyday use. Mostly, existing RFID systems are primarily used to identify the RFID

tags present in a tag population(e.g., tracking a speciﬁc tag from a tag population) while

identifying some speciﬁc tags is a critical operation, it is usually very time consuming and

is not desired or nessary in some situations. For instance, if the objective is to determine

whether any of the tags are missing(e.g., to detect some items according to a consignment),

the ﬁrst thing to do is to identify all tags’ ID and then compare with the original record to

determine if there is any tags are missing. Deﬁnitely, the whole process will be very slow

if we have a very large tag population. In this thesis, I present novel statistical algorithms

to enable fast and new applications in RFID systems. For example, detecting the missing

tags in a large tag population with high accuracy while using the existing infrastructure of

RFID systems which is already deployed in industry. More paciﬁcally, I present my work on

designing statistical algorithms for estimation the number of missing tags in a population of

RFID tags, for detecting and identifying the missing tags from a population of RFID tags.

The key distinction of my work compared to prior art is that my methods are compliant with

EPCGlobal Class 1 Generation 2 (C1G2) RFID standard. It is critical for RFID methods to

be compliant with the C1G2 standard since the commercially available of-the-shelf RFID

equipment follows the C1G2 standard. A method which does not comply with the C1G2

standard cannot be deployed on the existing installations of RFID systems because it requires

custom hardware, which will cost a lot.

In an RFID-enabled warehouse, there may be

thousands of tagged items that belong to diﬀerent categories, e.g., diﬀerent places of origin

or diﬀerent brands [207]. Each tag attached to an item has a unique ID that consists of

two ﬁelds: a category ID that speciﬁes the category of the attached object, and a member

ID that identiﬁes this object within its category. As a manager of the warehouse, one may

desire to timely monitor the product stock of each category. If the stock of a category is

too high, it may indicate that this product category is not popular, and the seller needs to

adjust the marketing strategy (e.g., lowering prices to increase sales). On the contrary, if

the stock of a category is too low, the seller should perform stock replenishment as soon as

possible. Manual checking is laborious and of low time-eﬃciency. You cannot imagine how

diﬃcult it is for a manager to manually count the number of items in each category that may

be stacked together or placed on high shelves. Hence, it is desirable to exploit the RFID

technique to quickly obtain the number of tagged items in each category. A multi-category

RFID estimation protocol should satisfy three additional requirements. First, it should be

standard compliant; otherwise, it will be diﬃcult to be deployed. Second, it should preserve

the privacy of tags by not reading their member IDs. Third, it should work with both a

single-reader and multiple-reader environments. As the communication range between a tag

and a reader is limited, a large population of tags is often covered by multiple readers whose

regions often overlap.

Copyright by
LIYAN WANG
2018

To my parents,

for their support and encouragement.

iv

ACKNOWLEDGEMENTS

Working towards a Ph.D. has been a deeply enriching and rewarding experience. Looking

back, many people have helped shape my journey. I would like to extend them my thanks.
• First and foremost, I would like to thank to my PhD advisor, Prof. Alex X. Liu, for
supporting me during ﬁve past years. He is the one of the smartest people I know.

My work would not have been possible without his constant guidance, his unwavering

encouragement, his many insights, and his exceptional resourcefulness. For all of this,

Alex, thank you.

• I would also like to thank the rest of my thesis committee Profs. Yiying Tong, Xiaoming

Liu and Susan Selke for their helpful research advice and suggestion in general.

• I would also like to thank Dr. Muhammad Shahzad, thank you for numerous insightful
suggestions and discussion on my projects. I will remember how you encourage me to

do research and think about the solution. Never give up and never stop thinking.

• Thanks NSF for supporting my Ph.D. .
• My great thanks to Michigan State University and the Department of Computer Science

and Engineering for providing me ﬁnancial support to attend the conferences.

• Many thanks to my colleagues in Systems and Security Lab at Michigan State University.
In particular, I would like to thank Muhammad Zubair Shaﬁq, Kamran Ali, Ali Munir,

and Faraz Ahmed for your friendship.

• I must say that I owe my great time in Michigan State University to all of my fabulous
friends. It is simply not feasible to list all of them here. I would like to thank them all

for their friendship and support.

• I must say thanks to my great parents. Without their encouragement, I even don’t

have a choice to pursue my PhD in U.S. Also, I love you more than I can say.

• I must sat thanks to my great friend Heidi. You gave me a lot suggestions and advices

to help me face with a lot of problems in my life.

v

TABLE OF CONTENTS

LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii

LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

ix

CHAPTER 1

INTRODUCTION . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.1 Challenges and Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.2.1 RFID Missing Tag Searching [55, 219]
. . . . . . . . . . . . . . . . .
1.2.2 RFID Multi-category Tag Estimation [115] . . . . . . . . . . . . . . .
1.2.3 RFID Valued Missing Tag Detection [119]
. . . . . . . . . . . . . . .

1
1
1
1
2
3

4
4
4
8
10
11
11
15
15
16
17
17
18
18
18
19
19
21
25
26
27
28
32
32
33
34
35
35
36
37

2.1

2.3.1
2.3.2 Estimating Tag Search Protocols

CHAPTER 2 RFID MISSING TAGS SEARCHING . . . . . . . . . . . . . . . . .
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.1.1 Motivation and Problem Statement . . . . . . . . . . . . . . . . . . .
2.1.2 Proposed Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.1.3 Limitation of Prior Art . . . . . . . . . . . . . . . . . . . . . . . . . .
2.2 RFID Tag Searching Background . . . . . . . . . . . . . . . . . . . . . . . .
2.2.1 RFID Tag Searching Motivation and Problem Statement . . . . . . .
2.3 Review of Related Research . . . . . . . . . . . . . . . . . . . . . . . . . . .
Identiﬁcation Tag Search Protocols . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . .
2.4 Proposed Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.4.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.4.2 C1G2 Compliance . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.4.3 Communication Channel . . . . . . . . . . . . . . . . . . . . . . . . .
2.4.4 Formal Development Assumption . . . . . . . . . . . . . . . . . . . .
2.5 RFID Tag Search Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.5.1 Protocol Description . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.5.2 Estimating Number of Tags in Set C . . . . . . . . . . . . . . . . . .
2.6 Parameter Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.6.1 False Positive Probability . . . . . . . . . . . . . . . . . . . . . . . .
2.6.2 Conﬁdence Condition . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.6.3 Duration Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.6.4 Handling Large Frame Sizes . . . . . . . . . . . . . . . . . . . . . . .
2.7 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.7.1 Accuracy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . .
. . . . . . . . . . . . .
. . . . . . . . . . . . .
2.7.2 Execution Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.8 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

2.7.1.1 Observed Conﬁdence interval vs. |A|
2.7.1.2 Observed Conﬁdence interval vs. |B|
2.7.1.3 Observed Conﬁdence interval vs. |C|

vi

2.8.1 Valued Tag Detection and Identiﬁcation . . . . . . . . . . . . . . . .

3.1

CHAPTER 3 RFID MULTI-CATEGORY TAGS ESTIMATION . . . . . . . . . . .
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.1.1 Background and Problem Statement
. . . . . . . . . . . . . . . . . .
3.1.2 Proposed Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.1.3 Challenges and Proposed Solutions . . . . . . . . . . . . . . . . . . .
3.1.4 Novelty and Advantage over Prior Art
. . . . . . . . . . . . . . . . .
3.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.3 Proposed Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
SEM: Estimator and Variance . . . . . . . . . . . . . . . . . . . . . .
3.3.1
3.3.2 Estimator of SEM . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.3.3 Variance of SEM . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.4 SEM: Parameter Optimization . . . . . . . . . . . . . . . . . . . . . . . . . .
3.4.1 Number of Frames k . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.4.2 Frame Sizes f and f(cid:48)
. . . . . . . . . . . . . . . . . . . . . . . . . . .
3.4.3 Dynamic Parameter Adjusting . . . . . . . . . . . . . . . . . . . . . .
3.4.4 Avoiding Premature Termination . . . . . . . . . . . . . . . . . . . .
3.5 SEM: Adaptive Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.5.1 Category Types Analysis . . . . . . . . . . . . . . . . . . . . . . . . .
3.5.1.1 Balanced Categories . . . . . . . . . . . . . . . . . . . . . .
3.5.1.2 Unbalanced Categories . . . . . . . . . . . . . . . . . . . . .
3.5.2 Adaptive Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.5.3 Discussion about SEM . . . . . . . . . . . . . . . . . . . . . . . . . .
3.5.3.1 Multi-reader Estimation . . . . . . . . . . . . . . . . . . . .
3.5.3.2 Bit Synchronization . . . . . . . . . . . . . . . . . . . . . .
3.6 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.6.1 Evaluation Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.6.2 Adaptive Partitioning . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.6.2.1 Balanced Categories . . . . . . . . . . . . . . . . . . . . . .
3.6.2.2 Unbalanced Categories . . . . . . . . . . . . . . . . . . . . .
3.6.3 Actual Reliability . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.6.4 Execution Time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.6.4.1 Balanced Categories . . . . . . . . . . . . . . . . . . . . . .
3.6.4.2 Unbalanced Categories . . . . . . . . . . . . . . . . . . . . .
3.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

CHAPTER 4 FUTURE WORK . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.1 Missing Tag Detection and Identiﬁcation . . . . . . . . . . . . . . . . . . . .
4.2 Motivation and Problem Statement . . . . . . . . . . . . . . . . . . . . . . .
4.3 Limitations of Prior Art
. . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.4 Solution Directions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

38

39
39
39
41
45
46
46
48
48
49
51
52
53
54
56
57
58
59
59
59
60
62
62
62
62
63
64
64
64
65
65
67
68
69

70
70
70
71
72

75

vii

LIST OF TABLES

Table 2.1: RFID systems with diﬀerent tags . . . . . . . . . . . . . . . . . . . . . . .

Table 2.2: Symbols used in the paper

. . . . . . . . . . . . . . . . . . . . . . . . . .

Table 3.1: Main notations used in the paper.

. . . . . . . . . . . . . . . . . . . . . .

6

20

50

viii

LIST OF FIGURES

Figure 2.1: Expected value of number of 1↔1 slots vs. |C| . . . . . . . . . . . . . . .

Figure 2.2: Comparison of theoretical and experimental Pf p . . . . . . . . . . . . . .

Figure 2.3: Total number of slots S vs. frame size f . . . . . . . . . . . . . . . . . .

Figure 2.4: Expected execution times of RTSP . . . . . . . . . . . . . . . . . . . . .
Figure 2.5: Observed conﬁdence interval vs. |A| when |B| = 5000, and |C| = 500 . .
Figure 2.6: Observed conﬁdence interval vs. |B| when |A| = 5000, and |C| = 500 . .
Figure 2.7: Observed conﬁdence interval vs. |C| when |A| = 5000, and |B| = 5000 . .

Figure 2.8: Comparison of execution times of RTSP and TH . . . . . . . . . . . . . .

Figure 3.1: Single-one Manchester Coding . . . . . . . . . . . . . . . . . . . . . . . .

Figure 3.2: From one physical frame to λ = 3 logical frames . . . . . . . . . . . . . .

Figure 3.3: Separate estimation vs. simultaneous estimation in a balanced RFID
system that contains two categories C1 and C2 with sizes of 100 and 110
tags respectively. (α, β) = (5%, 95%). (a) SEM on C1. (b) SEM on C2.
(c) SEM on C1 and C2. . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Figure 3.4: Separate estimation vs. simultaneous estimation in an unbalanced RFID
system that contains two categories C1 and C2 with sizes of 100 and
2000 tags respectively. (α, β) = (5%, 95%). (a) SEM on C1. (b) SEM
on C2. (c) SEM on C1 and C2.

. . . . . . . . . . . . . . . . . . . . . . .

Figure 3.5: Example of Adaptive Partitioning (AP): an initial unbalanced group is

partitioned into 3 balanced groups.

. . . . . . . . . . . . . . . . . . . . .

Figure 3.6: Comparing SEM+AP with SEM for balanced category sizes. Each

category has the same size of 5000 tags.

. . . . . . . . . . . . . . . . . .

Figure 3.7: Comparing SEM+AP with SEM for unbalanced category sizes. The

cardinalities of 10 categories are exponentially distributed.

. . . . . . . .

Figure 3.8: Comparing SEM+AP with SEM for unbalanced category sizes. The

cardinalities of 10 categories are linearly distributed.

. . . . . . . . . . .

ix

24

27

29

31

33

34

36

37

41

43

58

58

61

63

63

63

Figure 3.9: Actual reliability of SEM+AP for balanced categories.

. . . . . . . . . .

Figure 3.10: Actual reliability of SEM+AP for unbalanced (exponential) categories.

.

Figure 3.11: Actual reliability of SEM+AP for unbalanced (linear) categories. . . . . .

Figure 3.12: Execution time of SEM+AP ((cid:96) = 1) and prior protocols for balanced

categories. α = 5%, β = 95%.

. . . . . . . . . . . . . . . . . . . . . . . .

Figure 3.13: Execution time of SEM+AP ((cid:96) = 1) and prior protocols for unbalanced

categories. α = 5%, β = 95%.

. . . . . . . . . . . . . . . . . . . . . . . .

66

66

66

68

68

x

CHAPTER 1

INTRODUCTION

1.1 Challenges and Motivation

In this thesis I present my work on measurement, modeling, design, and analysis of RFID

systems. For RFID systems, I present my work on probabilistic network measurements. My

research is focus on the modeling, design, and analysis of probabilistic measurement schemes

for radio frequency identiﬁcation (RFID) systems. It deserves to be specially noted that I

present my work on designing statistical algorithms for estimating the number of missing

tags in a population of RFID tags, for optimizing the RFID estimation protocol, and for

identifying missing tags from a population of RFID tags. The main diﬀerence between the

prior art and my work is that my approaches are compliant with the EPCGlobal Class 1

Generation 2 (C1G2) RFID standard.

It is quite important for RFID schemes to be compliant with the C1G2 standard because the

commercially available oﬀ-the-shelf RFID equipment follows the C1G2 standard. complying

with C1G2 standard is an important factor in determining the schemas which can be promoted

in the commercial applications. A scheme that does not comply with the C1G2 standard

cannot be deployed on the existing installations of RFID systems since it requires custom

tags or readers, which will cost a lot.

1.2 Contributions

This thesis takes an in-depth look at the following research problems.

1.2.1 RFID Missing Tag Searching [55, 219]

We address the fundamental problem of estimating RFID missing tag population size, which

is needed in many applications such as consignment identiﬁcation, warehouse monitoring,

1

and privacy sensitive RFID systems. We propose a new scheme for estimating missing tag

population size named RFID Tag Searching Protocol(RTSP) The technique is based on the

average number of 1s in the receiving frames and the location of 1 in the receiving frames

while using the standardized framed slotted Aloha protocol. RTSP is signiﬁcantly faster than

prior schemes. For example, given a required conﬁdence interval of 0.1% and the number of

missing tags is 500, the tag population is 5000, RTSP takes 15 seconds to search the tags

whereas the fastest priori tag identiﬁcation protocol needs 40 seconds.(TH )

1.2.2 RFID Multi-category Tag Estimation [115]

We concern the practically important problem of multi-category RFID estimation: given a set

of RFID tags, we want to quickly and accurately estimate the number of tags in each category.

However, almost all the existing RFID estimation protocols are dedicated to the estimation

problem on a single set, regardless of tag categories. ART, the faster estimation protocol

which is based on the average run-length of 1s in the bit string received using the standardized

framed slotted Aloha protocol does not consider any tag categories. A feasible solution is to

separately execute the existing estimation protocols on each category. The execution time

of such a serial solution is proportional to the number of categories, and cannot satisfy the

delay-stringent application scenarios. Simultaneous RFID estimation over multiple categories

is desirable, hence, this paper proposes an approach called Simultaneous Estimation for

Multi-category RFID systems (SEM). SEM exploits the Manchester-coding mechanism, which

is supported by the ISO 18000-6 RFID standard, to decode the combined signals, thereby

simultaneously obtaining the reply status of tags from each category. As a result, multiple bit

vectors are decoded from just one physical slotted frame. Built on our SEM, many existing

excellent estimation protocols can be used to estimate the tag cardinality of each category in

a simultaneous manner. To ensure the predeﬁned accuracy, we calculate the variance of the

estimate in one round, as well as the variance of the average estimate in multiple rounds. To

ﬁnd the optimal frame size, we propose an eﬃcient binary search-based algorithm. To address

2

signiﬁcant variance in category sizes, we propose an Adaptive Partitioning (AP) strategy to

group categories of similar sizes together and execute the estimation protocol for each group

separately. Compared with the existing protocols, our approach is much faster, meanwhile

satisfying the predeﬁned estimation accuracy. For example, with 20 categories, the proposed

SEM+AP is about 7 times faster than prior estimation schemes.

Moreover, our approach is the only one whose normalized estimation time (i.e., time per

category) decreases as the number of categories increases.

1.2.3 RFID Valued Missing Tag Detection [119]

As I mentioned before, most methods are not considering the missing tags. The only methods

to estimate the missing tag is using the identiﬁcation protocol to identify all the tags then

ﬁgure out all the missing tags. However, mostly, there is no necessary to identify all the

tags. Moreover, sometimes people just care about the values tags(tags attached to the values

items). In this scenario we just have to detect and identify all the missing tag with the value

equal to or greater than the threshold which can be set by users. With this fundamental

problem for diﬀerent tag population size and diﬀerent values of each tag, our approach have

to meet diﬀerent accuracy requirements based on the users deﬁnitions. Our exploratory

analysis uncovers several statistically signiﬁcant ﬁndings that have important implications

for software development and deployment.

Based on the Single-one Manchester coding method we can easily classify the diﬀerent

valued tags into groups. For example, tags with value greater than 100 will be in group 1

which demand the higher accuracy requirement while group 2 with lower accuracy is consist of

tags whose value are smaller than 100. After grouping we can use diﬀerent estimation protocol

to satisfy the diﬀerent accuracy requirements. Apparently, detecting missing expensive tags

with higher accuracy will need more time.

3

CHAPTER 2

RFID MISSING TAGS SEARCHING

2.1

Introduction

2.1.1 Motivation and Problem Statement

RFID doesn’t provide much value on its own, but with RFID many companies can develop

a lot applications which can make a lot proﬁt. RFID systems can help to connect all the

things into network which enable companies to communicate, educate, sell, entertain and

distribute the products. As the RFID tags are cheaper and cheaper, RFID enables companies

to do many diﬀerent things. As we known, RFID is used to identify objects or even people

by attaching a tag which including all the information we can track. Its advantage is no

human intervention. Tags can be read by a reader and the information(e.g., the tag ID, the

reader’s ID and the time the tag was read) can be transmitted to computers in real time.

One of the most common uses of RFID is asset tracking. Companies attach tags to the

asset to prevent from stealing. RFID has been used in manufacturing plants for more than a

decade. RFID technology can track parts and work in process and to reduce defects while

increasing throughput and managing the production of diﬀerent versions of the products.

RFID technology has been used in closed loop supply chains for years. Companies distribute

RFID system to increase throughput, reduce shipping error, costs and save labor costs. Most

retailers such as BestBuy, Metro, Target, Tesco, Wal-mart and Amazon are in the forefront of

RFID adoption. Retailer are currently focus on improving supply chain eﬃciency and making

sure products are on the right shelf when customers want to purchase. RFID technology is

catching on as a convenient payment mechanism. One of the most popular uses of RFID is to

pay for road tolls without stopping. Using as an electronic key to control the access to oﬃce

or buildings is one of important application of RFID. There are many other innovative uses

4

for RFID. Locating children at theme parks, combining RFID tags with temperature loggers,

motion sensors, radiation sensors to achieve more important goal. Object tracking using

RFID since it is very convenience and low cost to deploy [33, 208, 20]. Since RFID tags can

attach to most items, they are able to support for localization, aiming to pinpoint objects in

3D space referring as 3D positioning [200, 40, 144, 131]. Antitheft can be very useful in large

warehouse since the cost of commercial RFID tags is negligible compared to the value of the

products to which they are attached.(e.g., just 5 cents per tag [21]). An RFID system consists

of tags, readers and servers. A tag is a microchip with an antenna in a compact package

with the limited computing power and communication range. There are three types of tags

[87, 104]: (1) passive tags, which are powered up by harvesting the radio frequency energy

from readers (as they do not have their own power sources) and have communication range

often less than 10 meters; (2) active tags, which have their own power sources and transmitter

thus have relatively longer communication range; (3) Battery-Assisted passive (BAP) tags,

which have the internal power source to power on, and energy transferred from the reader to

backscatter and have the moderate communication range. Active RFID systems typically

operate in the ultra-high frequency (UHF) band and oﬀer a range of up to 100 meters or more.

In general, active tags are used on large objects, such as rail cars, big reusable containers,

and other assets that need to be tracked over long distances. Passive RFID system with the

passive tags which required strong signal from readers to power on. Because passive tags do

not require a power source or transmitter, and only require a tag chip and antenna, they are

cheaper, smaller, and easier to manufacture than active tags. While most passive RFID tags

use the energy from the RFID reader’s radio signal to power on the tag’s integreted chip and

backscatter to the reader, BAP tags use an integrated power source (usually a battery) to

power on the chip, so all of the captured energy from the reader can be used for backscatter.

Unlike transponders, BAP tags do not have their own transmitters.

An RFID reader has a dedicated power source with signiﬁcant computing power. An

RFID reader’s function is to interrogate RFID tags. The means of interrogation is wireless

5

Table 2.1 RFID systems with diﬀerent tags

Passive RFID

Active RFID

Tag Power Source

Energy transfer from
the reader via RF

Internal to tag

Tag battery
Availability of Tag
Power

No
Only within ﬁeld of
reader

Yes

continuous

very high(must power
the tag)

very low

uses

source

Battery-Assisted
Passive RFID
internal
Tag
power
to
power on, and energy
transferred from the
reader via RF to
backscatter
Yes
only within ﬁeld of
reader
Moderate (does not
need to power tag,
but must
power
backscatter)

Required
Signal
Strength from Reader
to Tag

Available
Signal
Strength from Tag to
Reader

Communication Rage

Data transfer

Low

High

Moderate

range(up to

Short
10m)
Ability to read and
transfer sensor values
only when tag is pow-
ered by reader

Long range(100m or
more)

Ability to continu-
ously monitor and
record sensor input

Moderate range(up to
100m)
Ability to read and
transfer sensor values
only tag receives RF
signal from reader

and because the distance is relatively short; line of sight between the reader and tags are not

necessary. A reader contains an RF module, which acts as both a transmitter and receiver of

radio frequency signal. The transmitter consist of an oscillator to create the carrier frequency;

a modulator to impinge data commands upon this carrier signal and an ampliﬁer to boost

the signal enough to awaken the tag. The receiver has a demodulator to extract the returned

data and also contains an ampliﬁer to strengthen the signal for processing. A microprocessor

forms the control unit, which employs an operating system and memory to ﬁlter and store

the data. The data is now ready to be sent to the network. It transmits a query to a set

of tags and the tags respond over a shared wireless medium. RFID reader types are ﬁxed,

mobile or handhold units. Which type to use is governed by the application or environment

in which they will be utilized. Fixed readers are often used for large-scale deployments;

6

installed in portals at dock doors and conveyor belts to capture inventory or for tracking

parts, tools and equipment. Fixed RFID readers require access to a ground power source and

usually connect to the network by cables such as RS-232 or USB. Mobile RFID readers come

into play for hard to reach areas where it would be diﬃcult to install a ﬁxed reader. Their

robustness is beneﬁcial when it comes to mounting them on moving vehicles such as forklifts.

When self-contained, with their own battery and antennas, their wireless communication

allows them to connect to a network from a trolley or cart. Handhold RFID readers are

light, compact and ruggedly built to withstand being mishandled. By tethering a cable to be

reader, you can assure yourself of having constant power and communication to the network.

Because mobility is usually more important, most have wireless capability with integrated

antennas and a rechargeable battery.

This chapter concerns the fundamental problem of estimating the size of a missing tag

cardinality of a given large tag set with high speed and high estimation accuracy. This is

needed in many applications such as tag identiﬁcation, privacy sensitive RFID systems, theft

detecting and warehouse monitoring. In missing tag detection and identiﬁcation protocols,

which detect all the missing tags from an unknown tag population, the size of the missing tag

population is estimated at the start to guide the identiﬁcation process. For example, as tag

identiﬁcation protocols which are based on the framed slotted Aloha protocol (standardized

in EPCGlobal Class-1 Generation-2 (C1G2) RFID standard and inlemanted in commercial

RFID systems), missing tag estimation is often used to calculate the optimal frame size. In

privacy sensitive RFID systems, for example, those systems used in museums for continuously

monitoring the number of visitors in diﬀerent areas of a museum to plan the guided trips

eﬃciently, readers may not have the permission to identify human individuals. In warehouse

with RFID-based monitoring systems, managers sometimes need to estimate the number of

sold products quickly in order to estimate the stock of products or detection of employee

theft. It is very straight forward to use the tag identiﬁcation protocols to accurately measure

the missing tag population. However, it will be very slow if we use identiﬁcation protocols.

7

For example, for a population of 5k tags called A, we have a another known tag set with

size 1k which we named it B. We want to know how many tag are missing in 5k A tags

comparing with this 1k B tags. The missing tags are the tag which only exist in B not in A.

This is very important to estimate those missing tag immediately. Now we formally deﬁne

the tag search problem. Let A represent the set of IDs of tags that we want to search for

in a population. We know exactly which IDs are present in set A. Let B represent the set

of IDs of tags in the population in which we search for tags in set A. Let C represent the

set of IDs of those tags that are present in both sets A and B. We do not have any prior
knowledge about the IDs in sets B or C, however, we do know that C ⊆ A and C ⊆ B. Let
˜C represent the set of IDs in C that the tag search protocol returns, where C ⊆ ˜C ⊆ A. As
most application can tolerate a small error in determining the IDs in set C, for a required

conﬁdence interval of β, our objective is to design a tag search protocol that uses a set
of readers to quickly generate set ˜C such that | ˜C| − |C| ≤ β|C|. Conﬁdence interval β
represents the maximum tolerable fraction of tags in A that are not in C but are declared as

members of C by RTSP. Additionally, a tag search protocol should work in single as well as

multiple-reader environments, and should be compliant with the C1G2 standard.

2.1.2 Proposed Approach

For the problem of searching tags with IDs in set A in population with IDs in set B, there is

a seemingly obvious solution based on RFID tag collection protocol. Execute an RFID tag

collection protocol to ﬁrst collect IDs of all tags in set B and then compare them with the IDs

in set A. This will identify all tags in set A that are present in set B. This solution works;

however, it is too slow. For example, our experimental results show that even the fastest

existing tag collection protocol TH [176] is 2 times slower than our scheme. Slow searching

of RFID tags may have unbearable consequences in time critical applications especially when

tag search has to be performed for hundreds of thousands of diﬀerent kinds of products such

as in case of Amazon warehouses for balancing inventory. Furthermore, this solution can not

8

be used in settings where readers are not allowed to read the IDs of tags in set B due to

privacy reasons. An example of such a setting is a multi-tenant warehouse, where one tenant

may not permit readers of other tenants to read the IDs of its tags.

In this paper, we propose an identiﬁcation tag search protocol called RFID Tag Search

Protocol (RTSP) that can quickly identify tags in set C while ensuring that the requirement
| ˜C| − |C| ≤ β|C| ≥ α is satisﬁed. RTSP uses the frame slotted Aloha protocol speciﬁed in
the C1G2 standard as its MAC layer communication protocol. In Aloha protocol, the reader

ﬁrst tells the tags a frame size f and a random seed number R. Later in the paper, we

will see how a simple use of seed number R will make it straightforward to handle multiple

readers with overlapping regions. Each tag within the transmission range of the reader then

uses f, R, and its ID to select a slot in the frame by evaluating a hash function h(f, R, ID)

whose result is uniformly distributed in [1, f ]. Each tag has a counter initialized with the slot

number it chose to reply. After each slot, the reader ﬁrst transmits an end of slot signal and

then each tag decrements its counter by one. In any given slot, all the tags whose counters

equal 1 respond with a random sequence called RN16. If no tag replies in a slot, it is called

an empty slot. If one or more tags reply in a slot, it is called a nonempty slot. As per the

C1G2 standard, tags do not transmit their IDs unless the reader speciﬁcally asks them to

do so. In RTSP, reader checks if a slot is empty or nonempty using the RN16 sequence and

never asks tags to transmit their IDs. This preserves the privacy in settings where a reader is

not allowed to read IDs of tags in set B. C1G2 standard provisions this functionality of not

asking the tags for their IDs.

To identify the tags in set C, i.e., the tags in set A that are present in population of set B,

RTSP executes multiple Aloha frames with diﬀerent seeds. In each frame, each tag uses the

seed for that frame to select its slot. As RTSP already knows the IDs of all tags in set A, it

pre-computes which tags in A will select which slots in the frames. Thus, it knows which slot

in the frame must be nonempty if a certain ID in A is present in B. When a reader executes

a frame, RTSP compares the response in each slot of that frame with the corresponding slot

9

in the pre-computed frame. RTSP continues and executes n frames with diﬀerent seeds. At

the end of n frames, for any given tag in A, if RTSP observes that the n slots this tag was

supposed to respond in the n executed frames all turned out to be non-empty, it marks that

tag in A to be present in B. If, however, RTSP observes that any of the n slots this tag was

supposed to respond in the n executed frames turned out to be empty, it marks that tag in A

as absent in B. The value of n is chosen such that in executing n frames, for any given tag in

A that is not present in B, with a high probability, RTSP will see at least one slot in one of

the n frames, which is 1 in pre-computed frame due to this tag but 0 in the executed frame.

Our proposed protocol works with multiple readers with overlapping regions. To handle

multiple readers, RTSP uses a central controller for all readers to use same values of frame
size f and seed R across all readers. When a reader transmits seed Ri in its ith frame, it does
not generate Ri on its own, rather it uses the ith seed Ri issued by the central controller.
That is, each reader generates the same sequence of seeds in consecutive frames. As all
readers use the same seed Ri in the ith frame, the slot number that a particular tag chooses
in the ith frame of each reader covering this tag is the same i.e., h(f, Ri, ID) evaluated by
the tag results in same value for each reader. Once a reader completes its frame, it sends the
responses to the central controller. The controller applies logical OR operator on all the ith
frames from all readers and gets a single ith frame as if returned by one reader covering the

entire tag population. The controller repeats this process until it has n ORed frames and

then determine which tags in A are present in B.

2.1.3 Limitation of Prior Art

There are two types of RFID tag search protocols: estimating tag search protocols that

estimate the cardinality of set C [175] and identiﬁcation tag search protocols that identify the

IDs of tags in set C [176, 174, 179, 120]. Estimating tag search protocol is faster but does not

return the IDs in set C. Identiﬁcation tag search protocols return the IDs in set C but are

comparatively slower. Both approaches have their merits. In fact, they are complementary to

10

each other, and should be used together. For example, an estimating tag search protocol can

be used to determine if any tags in set A are present in set B, and if true, an identiﬁcation

tag search protocol should be invoked to identify the tags in C.

There are two major limitations of existing protocols. First, they can not achieve arbitrarily

small conﬁdence interval. Second, except KCTP (which does not return the IDs in set C),

none of the existing protocols is compliant with the EPCGlobal Class 1 Generation 2 (C1G2)

RFID standard [82] because they require the tags to receive, interpret, and act either according

to pre-frame Bloom Filters or according to protocol speciﬁc parameters. Such functionalities

are not provisioned in the C1G2 standard because tags, especially the passive ones, do not

have enough computational power. It is important for an RFID protocol to be compliant

with the C1G2 standard because the cheap commercially available oﬀ-the-shelf (COTS) tags

follow the C1G2 standard. A protocol that is not compliant with the C1G2 standard will

require custom tags, which will not only cost more but will also work only in limited settings.

For example, if an airline uses a protocol and tags that are non-compliant with the C1G2

standard, it may be able to track its baggage at its home airport but not at the airports in

rest of the world, which support only the C1G2 compliant tags.

2.2 RFID Tag Searching Background

2.2.1 RFID Tag Searching Motivation and Problem Statement

As the cost of commercial RFID tags, which is as low as 5 cents per tag [160], RFID system

has become negligible compared to the prices of the products to which they are attached,

RFID systems have been increasingly used in various applications such as supply chain

management [94], indoor localization [213, 149], 3D positioning [195], object tracking [141],

inventory control, electronic toll collection, and access control [60, 140]. For example, Walmart

uses RFID tags to track expensive clothing merchandize [161] and Honeywell Aerospace uses

RFID tags to track its products from birth to repair and retirement [185]. As we mentioned

before, an RFID system consists of tags and readers. A tag is a microchip with an integrated

11

antenna in a compact package that has limited computing power and communication range.

There are two types of tags: passive tags and active tags. Passive tags do not have their

own power source, are powered up by harvesting the radio frequency energy from readers,

and have communication ranges often less than 20 feet. Active tags have their own power

sources and have relatively longer communication ranges. A reader has a dedicated power

source with a signiﬁcant amount of computing power. RFID systems mostly work in a

query-response fashion where a reader transmits queries to a set of tags and the tags respond

with their IDs over a shared wireless medium. Based on those RFID systems, we address

the fundamental problem of RFID tag searching which can be stated as, given a set of

known tag IDs and a population of RFID tags with unknown IDs, where the tags may be

passive or active, we want to know which tag IDs are in the tag population, i.e., search in a

population of unknown tags for a set of known IDs. Searching tags with unknown IDs has

many applications such as products recall, inventory balancing, and stock veriﬁcation. For

product recall, if a manufacturer suspects that some of its products, which have already been

distributed in diﬀerent warehouses, are defective, they can use a tag searching protocol to

quickly locate defective products, where the known tag IDs are defective products and the

tag population are the products in a warehouse. For inventory balancing, if a large retailer,

such as Amazon, wants to balance the quantity of diﬀerent products among its warehouses

across the country to reduce shipping time and costs, they can use a tag searching protocol to

determine the quantity of any given product in each warehouse and then balance the quantity

among warehouses accordingly, where the known tag IDs are the ones in inventory and the

tag population are the ones in a warehouse. For stock veriﬁcation, if a large retailer wants to

check the quantity of each requested product sent to it in a large consignment, they can use a

tag searching protocol to determine whether the consignment contains all requested products,

where the known tag IDs are the ones that they are expecting and the tag population are the

ones in the consignment. In this report, we use the three terms, a tag, a tag ID, and the

product that a tag is attached to, interchangeably.

12

The tag searching problem can be formally deﬁned as: Given a set A, which is a set

of known tag IDs, a set B, which is a population of RFID tags with unknown IDs, a
required conﬁdence interval β, a tag searching protocol outputs ˜C so that C ⊆ ˜C ⊆ A and
| ˜C| − |C| ≤ β|C|, where C = A ∩ B. Conﬁdence interval β represents the maximum tolerable
fraction of tags in A that are not in C but are declared as members of C by a tag searching

protocol. A tag searching protocol should satisfy three additional requirements:

• First, it should comply with the EPCGlobal Class 1 Generation 2 (C1G2) RFID
standard [82]. Otherwise, it will be extremely diﬃcult to be practically deployed

because commercial RFID readers and tags are typically C1G2 compliant.

• Second, it should preserve the privacy of the RFID tags in set B by not reading their
tag IDs. Many RFID tag searching applications need to satisfy this privacy requirement.

For example, if a policeman searches for some items with known tag IDs in a private

house with a population of tags with unknown tag IDs, the home owner may prefer

not to read the IDs of all tags in the house.

• Third, it should work with both a single-reader and multiple-reader environments. As
the communication range between a tag and a reader is limited, a large population of

tags is often covered by multiple readers whose regions often overlap.

In this report, a protocol called RFID Tag Searching Protocol (RTSP) is proposed to

solve RFID tag searching problem, which satisﬁes the following four requirement: (1) C1G2
compliance, (2) arbitrary accuracy, i.e., C ⊆ ˜C ⊆ A and | ˜C| − |C| ≤ β|C| for any required
conﬁdence interval β, (3) privacy preserving, and (4) multiple-reader capability.

To satisfy the requirement of C1G2 compliance, RTSP uses the frame slotted Aloha

protocol speciﬁed in the C1G2 standard as its MAC layer communication protocol. In Aloha,

the reader ﬁrst tells the tags a frame size f and a random seed number R. Each tag within

the transmission range of the reader then uses f, R, and its ID to select a slot in the frame

by calculating a hash function h(f, R, ID) whose result is uniformly distributed in [1, f ].

Each tag has a counter initialized with the slot number that it chose to reply. After each slot,

13

the reader ﬁrst transmits an end of slot signal and then each tag decrements its counter by

one. In any given slot, all the tags whose counters equal 1 respond with a random sequence

called RN16. If no tag replies in a slot, it is called an empty slot. If one or more tags reply

in a slot, it is called a nonempty slot. Using 0 to denote an empty slot and 1 to denote a

nonempty slot, after we execute the Aloha protocol on a population A of tags using frame
size f and random seed R, we obtain a binary array of f bits, denoted as S(A, f, R).

To satisfy the requirement of arbitrary accuracy, RTSP executes n runs of the Aloha
protocol where each run uses a diﬀerent seed. For the ith run with frame size f and random
seed Ri, RTSP executes the Aloha protocol on both sets A and B, and thus obtains two
binary arrays S(A, f, Ri) and S(B, f, Ri). Note that RTSP executes the Aloha protocol on A
virtually as it knows all tag IDs in A. After n runs, for each tag ID t ∈ A, if for all 1 ≤ i ≤ n,
we have S(A, f, Ri)[h(f, Ri, t)] = S(B, f, Ri)[h(f, Ri, t)], (i.e., for all n runs, the two bits
corresponding to tag t in both S(A, f, Ri) and S(B, f, Ri) are 1), then RTSP outputs t ∈ ˜C.
Clearly RTSP satisﬁes C ⊆ ˜C ⊆ A. RTSP chooses a value of n so that | ˜C| − |C| ≤ β|C|.

To satisfy the requirement of privacy preserving, RTSP checks if a slot is empty or

nonempty using the RN16 sequence and never asks tags to transmit their IDs. In C1G2, tags

do not transmit their IDs unless the reader speciﬁcally asks them.

To satisfy the requirement of multi-reader capability, RTSP uses a central controller for all

readers to use the same values for frame size f and seed R across all readers. When a reader
transmits seed Ri in its ith frame, it does not generate Ri on its own, rather, it uses the ith
seed Ri issued by the central controller. Thus, for a tag t ∈ B that is covered by multiple
readers, it chooses the same slot h(f, Ri, t) for all readers. Once a reader completes its frame,
it sends its binary array to the central controller. The controller applies the bit-wise logical

OR operation on the binary arrays returned from all readers. The resulting binary array

is the same as if there is one reader that covers all tags. RTSP uses this binary array to

compute ˜C.

The key novelty of RTSP is that it statistically guarantees to achieve any required

14

accuracy and complies with the C1G2 standard. The key technical depth of RTSP lies in its

mathematical development to guarantee the arbitrary required accuracy and to minimize tag

searching time. The key advantages of RTSP over prior tag searching protocols are that RTSP

can achieve arbitrarily high accuracy and RTSP complies with the C1G2 standard. RTSP

is easy to deploy because it neither requires modiﬁcation to tags nor to the communication

protocol between tags and readers. RTSP can be implemented as a software module on

readers. We have extensively evaluated the performance of RTSP. Our results show that for
a scenario with |A| = 5000, |B| = 5000, and |C| = 500, and a required conﬁdence interval of
0.1%, RTSP takes 15 seconds to search the tags whereas the fastest prior tag identiﬁcation

protocol (TH [176]) takes 22 seconds.

2.3 Review of Related Research

To the best of our knowledge, there are only three identiﬁcation tag search protocols

[221, 54, 217] and one estimating tag search protocol [116]. By identiﬁcation, we mean those

protocols identify all the tags IDs ﬁrstly, then give the result by comparing with the IDs

we want to search. And none of them satisfy all four requirements simultaneously. Next, I

review these identiﬁcation and estimating tag search protocols in this chapter.

2.3.1

Identiﬁcation Tag Search Protocols

Zheng and Li proposed the ﬁrst RFID tag search protocol namely CATS [221]. CATS works

in two phases. In the ﬁrst phase, a server ﬁrst constructs a Bloom Filter by applying multiple

hash functions in conjunction with a random seed on each tag ID in set A. Second, an RFID

reader broadcasts the bit array of Bloom Filter generated by the server along with the random

seed to all tags in the population B. Third, on receiving the broadcast, each tag constructs a

Bloom Filter using the seed and its own ID. Fourth, if a tag ﬁnds that all the bits it has set

to 1 in its local Bloom Filter are also set to 1 in the Bloom Filter array broadcasted by the

reader, it considers itself as a candidate tag that the reader is searching for and thus stays

15

awake to participate in the next round; otherwise, it sleeps and does not participate in future

rounds. The server and reader repeat the process of generating and transmitting Bloom

Filter arrays with diﬀerent seeds until most of the tags that are left awake are those that the

reader is searching for. In the second phase, the reader executes standard Aloha protocol to

identify the tags that are awake. Unfortunately, C1G2 compliant tags can not interpret or

generate Bloom Filters, which makes CATS non-compliant with the C1G2 standard.

Chen et al. proposed another tag search protocol called ITSP, which is an improved version

of CATS [54]. The authors of ITSP realized that the Bloom Filter array that CATS uses

may be much larger than 96-bits. Therefore, they proposed to segment the implementation

of Bloom Filter into small arrays. The major diﬀerence between CATS and ITSP is that

in CATS, in the ﬁrst phase, a reader transmits a single Bloom Filter array all at once,

whereas in ITSP, reader only transmit a segment of the Bloom Filter to shrink the candidate

set. In addition to using segmented Bloom Filters, authors also proposed to observe the

empty and non-empty slots in the second phase and compare them against pre-computed

frames to further ﬁlter out any tags not in B that were not ﬁltered out by the Bloom Filters.

Unfortunately, C1G2 compliant tags can not interpret or generate Bloom Filters, which

makes ITSP non-compliant with the C1G2 standard.

Zhang et al. proposed another tag search protocol called TSM [217]. TSM extends CATS

for use with multiple readers. It ﬁrst executes CATS using each reader and then aggregates

results from all readers to identify the tags in A that are present in B. Unfortunately, due

to similar reasons as for CATS, TSM is also non-compliant with the C1G2 standard. In

contrast, our proposed protocol RTSP is C1G2 compliant.

2.3.2 Estimating Tag Search Protocols

Liu et al. proposed Basic Key tag Counting protocol (B − KC to count the number of tags in
A that are present in B [116]. In stead of observing the whole time frame, reader in B − KC
just need to focus on the singleton slots. B − KC ﬁrst pre-computes a frame using IDs in set

16

A and then executes a frame on population B to determine how many times the slots that

were 1 in the pre-computed frame turned out to be 1 in the executed frame. It then uses the

number of such slots to obtain the estimate of the number of tags in A that are present in B.
B − KC falls short because it can only estimate the number of tags in A that are present in
B, but it can not determine exactly which tags of A are present in B. Another fast estimation

scheme is ART . Executing time of ART which is providing in [96] is provably independent

of the tag population size. In contrast, our proposed protocol RT SP can identify such tags.

2.4 Proposed Research

In this chapter, I explain the proposed research methods to address the problem of RFID

tags searching.

2.4.1 Architecture

For searching RFID tags, RTSP uses a central controller connected with a set of readers that

cover the area where the tags in set B are located. The use of a central controller ensures that

all readers use consistent values of frame sizes and seeds when executing frames, which helps

in eﬃciently aggregating and processing information returned by the readers. The readers

use the standardized frame slotted Aloha protocol to communicate with tags and never ask

the tags to transmit their IDs. The use of multiple readers with overlapping coverage regions

introduces following two problems: (1) scheduling the readers such that no two readers with

overlapping regions transmit at the same time, and (2) alleviating the eﬀect of some tags

responding to multiple readers due to overlap in the coverage region of those readers. For the

ﬁrst problem, the controller uses one of the several existing reader scheduling protocols [188]

to avoid reader-reader collisions. For the second problem, we propose solution in Section 2.5.1.

RTSP does not require any modiﬁcations to tags or readers. It only requires the readers to

receive system parameters from the controller and communicate the responses in the frames

back to the controller.

17

2.4.2 C1G2 Compliance

RTSP does not require any modiﬁcations to tags or readers. It only requires the readers

to receive the frame size, persistence probability, and seed number from the controller and

communicate the responses in the frames back to the controller. Persistence probability p is

the probability with which a tag decides whether it will participate in a frame or not before

selecting a slot in that frame. Later in the paper, we will show how we use p to handle frame

sizes that exceed the C1G2 speciﬁed upper limit of 215. Such large frame sizes are required

when the size of tag population is large and required conﬁdence interval β is small. With

the use of p, the reader reduces the number of tags that participate in each frame, which in

turn reduces the optimal frame size at the expense of increased number of frames. As the

C1G2 standard does not specify the use of p, COTS tags do not support it. To avoid making

any modiﬁcations to tags, in RTSP, the reader implements p by announcing a frame size of

f /p but terminating the frame after the ﬁrst f slots, which can be done as per the C1G2

standard.

2.4.3 Communication Channel

We assume that the communication channel between readers and tags is reliable i.e., tags

correctly receives queries from the readers and the readers correctly detect transmission of

RN16 sequence in a slot if one or more tags in the population transmit in that slot. If the

channel is unreliable, the solution proposed in [176] can be easily adapted for use with RTSP.

2.4.4 Formal Development Assumption

To make the formal development tractable, we assume that instead of picking a single slot to
transmit at the start of ith frame of size f, a tag independently decides to transmit in each

slot of the frame with probability 1/f regardless of its decision about previous or forthcoming

slots. Vogt ﬁrst used this assumption for the analysis of Aloha protocol for RFID and justiﬁed

18

its use by recognizing that this problem belongs to a class of problems called occupancy

problem, which deals with the allocation of balls to urns [193]. Ever since, the use of this

assumption has become a norm in the formal analysis of all Aloha based RFID protocols

[175, 193, 218].

The implication of this assumption is that a tag can end up choosing more than one slots

in the same frame or even not choosing any at all, which is not in accordance with the C1G2

standard that requires a tag to pick exactly one slot in a frame. However, this assumption

does not create any problems because the expected number of slots that a tag chooses in a

frame is still one. The analysis with this assumption is, therefore, asymptotically the same

as that without this assumption [39]. Bordenave et al. further explained in detail why this

independence assumption in analyzing Aloha based protocols provides results just as accurate

as if all the analysis was done without this assumption [39]. This independence assumption

is made only to make the formal development tractable. In all our simulations, a tag chooses

exactly one slot at the start of a frame. Table 2.2 lists the symbols used in this paper.

2.5 RFID Tag Search Protocol

2.5.1 Protocol Description

To search which tags in set A are present in the population B, in RTSP, the central controller

executes n Aloha frames using the RFID readers. There are ﬁve steps involved in executing

each frame. First, before executing any frame i, the controller calculates the optimal values of
frame size fi, persistence probability pi, and generates a random seed number Ri. Second, as
the controller knows the IDs in set A, it pre-computes which tag in A will choose which slot
in the ith frame, i.e., it virtually executes the Aloha protocol on set A and obtains the binary
array S(A, fi, Ri). Thus, the controller knows which bits in the binary array S(B, fi, Ri)
resulting from executing ith frame on population B should be 1 if all the tags in A were

present and a single reader covered the entire population. Third, it provides each reader with
the parameters fi, pi, and Ri and asks each of them to execute the ith frame using these

19

Table 2.2 Symbols used in the paper

Symbol Description

A
B
C
˜C
β
fi
fop
Ts
n
nop
p
Ri

h(f, R, ID)

Pf p

Xij
N 11
i
S
E[.]

set of tag IDs to be searched
set of tag IDs in RFID tag population
tag IDs in set A that are present in B
IDs of A returned by RTSP to be present in B
required conﬁdence interval
frame size for round i
optimum value of frame size
duration of each slot in frame
Number of times frames are repeated
optimum value of n
persistence probability
random seed for ith frame
unform hash function in [1, f ]
false positive probability
indicator random variable for jth slot in ith
frame to be 1↔1
random variable for # of 1↔1 slots in ith frame
Total number of execution slots
Expected value

parameters. The motivation behind using the same values of fi, pi, and Ri across all readers
for the ith frame is to enable RTSP to work with multiple readers with overlapping regions.
As all readers use the same values of fi, pi, and Ri in the ith frame, the slot number that
a particular tag chooses in the ith frame of each reader covering this tag is the same i.e.,
h( fi
, Ri, ID) evaluated by the tag results in same value for each reader. Fourth, each reader
pi
executes the frame on its turn as per the reader scheduling protocol and sends the responses in
the frame back to the controller. Fifth, after the controller has received the ith frame of each
reader, it applies logical OR operator on all the received ith frames and obtains the resultant
bit array S(B, fi, Ri). This resultant bit array S(B, fi, Ri) is same as if generated by a single
reader covering all the tags. After obtaining n bit arrays S(B, fi, Ri) for 1 ≤ i ≤ n, for each
tag t in A, the controller checks whether S(A, fi, Ri)[h( fi
, Ri, t)]
pi
for all n frames, i.e., for all n frames, the two bits corresponding to tag t in both S(A, fi, Ri)
and S(B, fi, Ri) are 1, then RTSP declares that tag t is present in population B. Note that

, Ri, t)] = S(A, fi, Ri)[h( fi
pi

20

RTSP can have false positives, i.e., it can declare a tag in set A to be present in population

B, when it actually is not. Apparently, based on the design of RTSP we know that RTSP

does not have false negatives.

2.5.2 Estimating Number of Tags in Set C

Recall from the previous section that before executing any frame i, the controller calculates
the optimal values of frame size fi and persistence probability pi. To calculate these optimal
values for ith frame, the controller needs estimate of |C| at start of the ith frame, which
it obtains using the responses from the tag population in the previous i − 1 frames. We
represent the estimate of |C| at the start of ith frame by | ˜Ci|. As the controller executes
more and more frames, i.e., as i increases, the estimate | ˜Ci| asymptotically becomes equal to
|C|. Next, we present a method to estimate the value of |C| at start of any frame i.

The intuition behind our estimation method is that as the number of tags in set C
increase, the number of pairs of bits in S(A, fi, Ri) and S(B, fi, Ri) that are 1 also increase.
We represent a bit that is 1 in both S(A, fi, Ri) and S(B, fi, Ri) by 1↔1. The number of
1↔1 bits for any given frame is a function of |C| and can, therefore, be used to estimate
the value of |C|. Next, we derive an expression that relates the number of 1↔1 bits with
the value of |C|, i.e., we derive an expression for E[N 11
is
random variable for number of 1↔1 bits in pair of bit arrays i.e., S(A, fi, Ri) and S(B, fi, Ri).
To derive the expression for E[N 11
], we need the probability that any given bit in a pair of
bit arrays is 1↔1. We calculate this probability in the following lemma.

] as a function of |C|, where N 11

i

i

i

Lemma 1. Let A be the set of IDs of tags that we want to search for in a population. Let

B be the set of IDs of tags in the population in which we search for tags in set A. Let C be

the set of IDs of those tags that are present in both sets A and B. Let Xij be an indicator
random variable for the event that the jth bit in ith bit array pair is a 1↔1 bit. For frame size
fi and persistence probability pi, the probability distribution of Xij is given by the following

21

equation.

P(cid:8)Xij = 1(cid:9) = 1 − (1 − pi

fi

)|A| − (1 − pi
fi

)|B| + (1 − pi
fI

)|A|+|B|−|C|

(2.1)

Proof. Probability that any given bit j in a bit array pair is a 1↔1 bit can be obtained by
ﬁrst calculating the probability that this bit is not a 1↔1 bit, and then subtracting it from 1.
The jth bit is not 1↔1 when one of the following three cases happens.

1. None of the tags in set A select the jth slot in pre-computed frame i.e., jth bit in
S(A, fi, Ri) is 0, and none of the tags in population B select the jth slot in corresponding
executed frame i.e., jth bit in S(B, fi, Ri) is 0. We represent this event by an indicator
random variable Y00. The probability distribution of Y00 is given by the following
equations.

(cid:18)
(cid:18)

1 − p
f
1 − p
f

P {Y00 = 1} =

=

(cid:19)|A−C|(cid:18)
(cid:19)|A|+|B|−|C|

1 − p
f

(cid:19)|C|(cid:18)

(cid:19)|B−C|

1 − p
f

(2.2)

(2.3)

2. One or more tags in set A − C select the jth slot in pre-computed frame i.e., jth
bit in S(A, fi, Ri) is 1, and none of the tags in population B select the jth slot in
corresponding executed frame i.e., jth bit in S(B, fi, Ri) is 0. We represent this event
by an indicator random variable Y10. The probability distribution of Y10 is given by
the following equations.

(cid:32)
(cid:32)

(cid:19)|A−C|(cid:33)(cid:18)
(cid:19)|A−C|(cid:33)(cid:18)

(cid:18)
(cid:18)

1 − p
f
1 − p
f

1 − p
f
1 − p
f

P {Y10 = 1} =

=

1 −

1 −

(cid:19)|B−C|

(cid:19)|C|(cid:18)
(cid:19)|B|

1 − p
f

3. None of the tags in set A select the jth slot in pre-computed frame i.e., jth bit in
S(A, fi, Ri) is 0, and one or more tags in population B − C select the jth slot in
corresponding executed frame, i.e., i.e., jth bit in S(B, fi, Ri) is 1. We represent this
event by an indicator random variable Y01. The probability distribution of Y01 is given
by the following equations.

22

(cid:18)
(cid:32)

1 − p
f

(cid:18)

1 −

P {Y01 = 1} =

=

(cid:19)|A−C|(cid:18)

(cid:19)|C|(cid:32)

(cid:18)
(cid:19)|A|

1 −

1 − p
f

(cid:19)|B−C|(cid:33)(cid:18)

1 − p
f

1 − p
f

(cid:19)|B−C|(cid:33)

1 − p
f

The probability distribution of Xij is given by the following equation.

P(cid:8)Xij = 1(cid:9) = 1 − P {Y00 = 1} − P {Y10 = 1} − P {Y01 = 1}

(2.4)

(2.5)

Substituting the expressions for the probability distributions of Y00, Y10, and Y01 from
Equations (2.2), (2.3), and (2.4), respectively, into Equation (2.5) and simplifying, we get

Equation (1).

Following theorem derives the expression for E[N 11

i

] as a function of |C|.

Theorem 2. Let A be the set of IDs of tags that we want to search for in a population. Let

B be the set of IDs of tags in the population in which we search for tags in set A. Let C be
the set of IDs of those tags that are present in both sets A and B. Let N 11
be the random
variable for the number of 1↔1 slots in a pair of bit arrays of size fi each. When persistence
(cid:18)
probability is pi, the expected value of N 11

is given by the following equation.

i

i

Proof. It is straight forward to see that N 11
set of identically distributed random variables, E[N 11

j=1 Xij. As

] is given by

i

Xi1, Xi2, . . . , Xifi)

E[N 11

i

] = E[

Xij] = fi × E[Xij]

j=1

As expected value of an indicator random variable equals its probability of being 1, E[Xij] =

P(cid:8)Xij = 1(cid:9). Substituting the value of E[Xij] in the equation above with the value of
P(cid:8)Xij = 1(cid:9) from Equation (2.5), we get the equation for E[N 11

] in theorem statement.

i

23

E[N 11

i

] = fi ×

1 − (1 − pi
fi

)|B|

)|A|+|B|−|C|(cid:19)
)|A| − (1 − pi
fi
(cid:110)

+(1 − pi
fi

i =(cid:80)fi
fi(cid:88)

(cid:111)

(2.6)

forms a

Figure 2.1 plots E[N 11

] as a function of |C| using Equation (2.6). This ﬁgure is obtained
] is

using |A| = 200, |B| = 300, fi = 300 and pi = 1. We observe from this ﬁgure that E[N 11
a monotonically increasing function of |C|. To estimate the value of |C|, let ˜N 11
the observed value of number of 1↔1 bits for ith pair of bit arrays. Replacing E[N 11
] in
Equation (2.6) with ˜N 11
and solving for |C| gives an estimate of |C|. This estimate is
obtained by utilizing the information from the ith frame only. While this estimate may not be

represent

i

i

i

i

i

accurate, if we use the information from more frames, the estimate will become more accurate.

Speciﬁcally, we leverage the well known statistical result that the variance in the observed

value of a random variable reduces by x times if we take the average of x observations of
that random variable. Therefore, to obtain the estimate | ˜Ci| of |C| at the start of the ith
frame, we obtain an estimate from each of the previous i − 1 frames and take their average.
Solving Equation (2.6) for |C| and averaging over past i − 1 frames, the formal expression for
| ˜Ci| becomes

(cid:41)

E[N 11
l
fl

]

− pl
fl

− 1 + e
i − 1

A

+ e

− pl
fl

B

(2.7)

(cid:40)

(cid:80)i−1

l=1

fl
pl

ln

| ˜Ci| ≈ |A| + |B| +

Figure 2.1 Expected value of number of 1↔1 slots vs. |C|

Finally, note that the controller obtains this estimate without executing any additional

frames. It gets this estimate from the frames it was already executing to search for tags.

24

02550751001251501752006080100120140160180No. of present tagsExpected value of Ni1 12.6 Parameter Optimization

In this section, we will derive equations that the controller uses at the start of ith frame
to calculate the optimal values of frame size fi and persistence probability pi to minimize
the execution time of RTSP while ensuring that its actual conﬁdence interval is less than
the required conﬁdence interval. At the start of ith frame, the controller uses the estimate
| ˜Ci| along with the values of |A|, |B|, and β to calculate the optimal values of fi and
pi. Before asking the readers to execute the ith frame, the controller also calculates the
maximum number of frames that it should execute, represented by ni. Recall from Section
2.5.2 that as the number of executed frames increase, the estimate of |C| becomes more
accurate. Consequently, ni, fi, and pi asymptotically become equal to constants n, f, and
p, respectively. When the estimate of |C| changes by less than 2 in 10 consecutive frames,
the controller considers the estimate to be close enough to |C|.At this point, the controller
calculates the values of ni, fi, and pi one last time and puts f = fi, p = pi, and n = ni, and
uses these ﬁxed values of f and p to execute subsequent frames until the total number of

frames executed since the ﬁrst frame become equal to n. For the ﬁrst frame, i.e., when i = 1,
the controller uses n1 = ∞, f1 = max{|A|,|B|}, and p1 = 1. The choices of the values of n1,
f1, and p1 are arbitrary and do not really matter because as the controller executes more
frames, number of frames, frame size, and persistence probability converge to constants n, f,

and p, respectively.

In subsequent calculation of ni, fi, and pi, we will drop the subscript i to make the
presentation simple. Next, we ﬁrst derive the expression for false positive probability i.e.,

probability with which RTSP declares a tag in set A to be present in population B, when

it actually is not. Second, using the expression for false positive probability, we derive a

conﬁdence condition that the values of n, f, and p must satisfy to ensure that the observed

conﬁdence interval is smaller than the required conﬁdence interval β, i.e., the requirement
| ˜C| − |C| ≤ β|C| is satisﬁed. Third, we derive a duration condition, which the values of f
and p must satisfy to ensure that the execution time of RTSP is minimized. The controller

25

solves these two conditions simultaneously to obtain the optimal values of n, f, and p. Last ,

we will describe our strategy to bring the value of f within limit when the optimal value of

the frame size exceeds the C1G2 speciﬁed upper limit of 215.

2.6.1 False Positive Probability

A false positive occurs when all the bits that a particular tag in A that is not present in B
selects in the n bit arrays S(A, fi, Ri) turn out to be nonempty in S(B, fi, Ri) because some
other tags in the population made those bits 1. Lemma 3 gives the expression to calculate

the false positive probability.

Lemma 3. Let B be the set of IDs of tags in the population in which we search for tags. With

persistence probability p, frame size f, and number of frames n, the false positive probability,
Pf p, is given by the following equation.

(cid:34)

(cid:18)

(cid:19)|B|(cid:35)n

Pf p =

1 −

1 − p
f

(2.8)

Proof. Consider a tag t such that t ∈ A ∧ t /∈ B. The probability that the slot tag t selects
in the ith pre-computed frame i.e., the bit it selects in S(A, fi, Ri) is selected by at least one
f )|B|. The probability that all n bits tag t
tag in population B in S(B, fi, Ri) is 1 − (1 − p
selects in the n bit arrays S(A, fi, Ri) are also selected by some other tags in population B
f )|B|]n, which is the expression for false
in corresponding bit arrays S(B, fi, Ri) is [1 − (1 − p
positive probability given in Equation (2.8).

Figure 2.2 shows the theoretically calculated false positive probability from Equation (2.8)

represented by the solid line and experimentally observed values of false positive probability

represented by the dots. To obtain this ﬁgure, we use f = 600, p = 1, and n = 10. Each dot

represents the false positive probability calculated from 200 runs of simulation. We observe

that the theoretically calculated values match perfectly with experimentally observed values,

showing that our independence assumption that we stated in Section 2.4.4 does not cause

the theoretical analysis to deviate from practically observed values.

26

Figure 2.2 Comparison of theoretical and experimental Pf p

2.6.2 Conﬁdence Condition

Theorem 4 states the conﬁdence condition that the values of n, f, and p must satisfy to

achieve the required conﬁdence interval β.

Theorem 4. Let A be the set of IDs of tags that we want to search for in a population. Let

B be the set of IDs of tags in the population in which we search for tags in set A. Let C

be the set of IDs of those tags that are present in both sets A and B. To ensure that RTSP
satisﬁes the requirement | ˜C| − |C| ≤ β|C|, the controller must use the values for number of
frames n, frame size f, and persistence probability p that satisfy the conﬁdence condition

given in the following equation.

n =

ln

(cid:16)

β×| ˜C|
ln
|A|−| ˜C|
1 − (1 − p

(cid:18)

(cid:19)
f )|B|(cid:17)

(2.9)

Proof. Let E[| ˜C|] represent the number of tags that RTSP declares as belonging to set C after
executing n frames of size f with persistence probability p. Replacing | ˜C| in | ˜C|−|C| ≤ β|C|
by E[| ˜C|], the reliability requirement is given in the following equation.

E[| ˜C|] − |C| ≤ β|C|

(2.10)

27

20030040050001234x 10−3|B|False positive probability  TheoreticalSimulationsNext, we derive the expression for E[| ˜C|]. Recall from Section 2.5.1 that RTSP can have false
positives, but it cannot have false negatives i.e., it will always identify the tags of A present

in B and in addition, it may also declare some tags in A that are not in B to be present in
B. Thus, E[| ˜C|] = |C| + (|A − C|) × Pf p. As C ⊆ A, thus, E[| ˜C|] = |C| + (|A| − |C|) × Pf p
Substituting this value of E[| ˜C|] into Equation (2.10), we get the following equation.

|C| + (|A| − |C|) × Pf p − |C| ≤ β|C|

Substituting the value of Pf p from Equation (2.8) into equation above and rearranging, we
get

(cid:16) β×|C|

|A|−|C|
ln
1 − (1 − p

(cid:17)
f )|B|(cid:17)

(cid:16)

n ≤

ln

As we do not know the exact value of |C|, rather we know the estimate | ˜C| of |C|, replacing
|C| in this equation with | ˜C| and using the largest value for n to ensure that conﬁdence
requirement is always met, we get Equation (2.9) in theorem statement.

2.6.3 Duration Condition

Theorem 5 states the duration condition that the values of f and p must satisfy to minimize

the execution time of RTSP.

Theorem 5. Let A be the set of IDs of tags that we want to search for in a population. Let

B be the set of IDs of tags in the population in which we search for tags in set A. Let C

be the set of IDs of those tags that are present in both sets A and B. To ensure that the

execution time of RTSP is minimum, the controller must use the values for frame size f and

persistence probability p that satisfy the duration condition given in the following equation.

(cid:18)

|B|(cid:19)

(cid:26)

|B|(cid:27)

p × |B| = f ×

1 − e

p
f

× ln

1 − e

− p
f

(2.11)

Proof. Execution time is directly proportional to the total number of slots because the

duration of each slot is the same, typically 300µs for Philips I-Code RFID reader [171]. Let

28

S represent the total number of slots. Thus, S = f × n. To ensure that RTSP achieves the
required conﬁdence interval, we use the value of n from Equation (2.9). Thus,

(cid:18)

(cid:19)
f )|B|(cid:17)

(cid:16)

β×| ˜C|
f ln
|A|−| ˜C|
1 − (1 − p

S =

ln

(2.12)

Figure 2.3 plots S as a function of f using Equation (2.12). This ﬁgure is made using
|A| = 100, |B| = 100, | ˜C| = 52, p = 1, and β = 0.05. We observe from this ﬁgure that S
is a convex function of f. Therefore, optimum value of f exists, represented by fop, that
minimizes the total number of slots S. To ﬁnd optimal value of f, we diﬀerentiate Equation

(2.12) with respect to f and equate the resulting expression to 0, which results in the following

expression.

(cid:18)

Note that ln

(cid:34)

(cid:32)

ln

β×| ˜C|
|A|−| ˜C|

β × | ˜C|
|A| − | ˜C|

(cid:19)

(cid:33)(cid:35)(cid:34)

p|B| − f

(cid:32)

1 − e

|B|(cid:33)

p
f

ln

(cid:40)

1 − e

|B|(cid:41)(cid:35)

− p
f

= 0

(cid:54)= 0, which means that

(cid:18)

|B|(cid:19)

(cid:26)

1 − e

|B|(cid:27)

− p
f

= 0

ln

p|B| − f

1 − e

p
f

Rearranging the equation above, we get the duration condition in the theorem statement.

Figure 2.3 Total number of slots S vs. frame size f

29

010020030040050060070080090010001100Frame size fTotal number of slotsThe controller solves Equations (2.9) and (2.11) simultaneously using p = 1 and gets the

optimal values of n and f represented by nop and fop, respectively. Next, we study the eﬀect
of |A|, |B|, |C|, and β on execution time of RTSP.

Execution Time vs. |A|: Intuitively, as the number of tags in A increases, the execution
time of RTSP should increase because greater number of tags in A imply higher chances of

false positives. Thus, to ensure that the number of false positives stay small enough so that

the required conﬁdence interval is achieved, RTSP executes more frames, i.e., the value of nop
increases, which increases the overall execution time. Figure 2.4(a) conﬁrms our intuition.
This ﬁgure plots the expected execution time of RTSP for multiple values of |A| while ﬁxing
|B| at X5000 and |C| at 500. We calculated the execution time as nop × fop × Ts, where
Ts is the time of each slot and is equal to 300µs as per the speciﬁcations of Philips I-Code
RFID reader [171]. We observe from Figure 2.4(a) that as the number of tags in A increase,

the execution time of RTSP increases.

Execution Time vs. |B|: Intuitively, as the number of tags in B increases, the execution
time of RTSP should increase because greater number of tags in B also imply higher chances

of false positives. Thus, to ensure that the number of false positives stay small enough so

that the required conﬁdence interval is achieved, RTSP increases the frame size, i.e., the

value of fop increases according to Equation (2.11), which increases the overall execution
time. Figure 2.4(b) conﬁrms our intuition. This ﬁgure plots the expected execution time of
RTSP for multiple values of |B| while ﬁxing |A| at 5000 and |C| at 500. We observe from
Figure 2.4(b) that as the number of tags in B increase, the execution time of RTSP increases.
Execution Time vs. |C|: Intuitively, as the number of tags in C increase, the execution
time of RTSP should decrease because greater number of tags in C means RTSP has greater
margin of error i.e., β|C|. Thus, RTSP reduces the value of nop, which decreases the overall
execution time. Figure 2.4(c) conﬁrms our intuition. This ﬁgure plots the expected execution
time of RTSP for multiple values of |C| while ﬁxing |A| at 5000 and |B| at 5000. We observe
from Figure 2.4(c) that as the number of tags in C increase, the execution time of RTSP

30

(a) Execution time vs. |A|

(b) Execution time vs. |B|

(c) Execution time vs. |C|

(d) Execution time vs. β

Figure 2.4 Expected execution times of RTSP

decreases.

Execution Time vs. β: Intuitively, as the required conﬁdence interval β increases, the

execution time of RTSP should decrease because larger required conﬁdence interval means

RTSP has greater margin of error. Thus, RTSP reduces the values of nop, which decreases the
overall execution time. Figure 2.4(d) conﬁrms our intuition. This ﬁgure plots the expected
execution time of RTSP for diﬀerent values of β while ﬁxing |A| at 5000, |B| at 5000, and
|C| at 500. We observe from Figure 2.4(d) that as the required conﬁdence interval increases,
the execution time of RTSP decreases.

31

200040006000800010000510152025|A|Execution time (seconds)  beta=0.01beta=0.05beta=0.1beta=0.2200040006000800010000010203040|B|Execution time (seconds)  beta=0.01beta=0.05beta=0.1beta=0.250010001500200051015202530|C|Execution time (seconds)  beta=0.01beta=0.05beta=0.1beta=0.20.050.10.150.212141618202224βExecution time (seconds)2.6.4 Handling Large Frame Sizes

For large populations and/or small required conﬁdence interval, it is possible for the value

of fop to exceed the C1G2 speciﬁed upper limit of 215. Next, we describe how we use p to
bring the frame size within limits. Bringing the frame size within limits comes at a cost

of increased number of slots; greater than the minimum value of S that would have been

achieved if the controller could use fop > 215.

When we decrease the value of p, the number of tags that participate in a frame decreases.

Therefore, the required value of f also decreases. Participation by fewer tags means that

participation by the tags belonging to both the sets A and B decreases. This increases

the chances that a given tag in A that is present in B will not select any slot in a given

pre-computed frame, which means that chances of identifying its presence decrease. Therefore,

the overall uncertainty in identifying tags in A increases. To reduce this uncertainty, the

value of n increases when p decreases to achieve the required conﬁdence interval.

We use these two observations to reduce the value of f whenever fop > 215. When
fop > 215, the controller uses f = fmax = 215 in Equation (2.9), which leaves two unknowns,
p and n, in the resulting equation. The controller solves the resulting equation simultaneously

with Equation (2.11) to get new values of p and n. The new value of p is less than 1 and the

new value of n is greater than nop (we represent n with nop when we use f = fop to calculate
it). The controller uses these new values of n and p along with f = fmax to pre-compute the
bit array S(A, fi, Ri). Although the total number of slots S = fmax × n > fop × nop, this is
still the smallest under the constraints that the required conﬁdence interval is achieved and

the frame size does not exceed fmax.

2.7 Performance Evaluation

We simulated and implemented RTSP in Matlab. We also implemented the fastest existing

tag identiﬁcation protocol, TH [176], to compare the execution time of RTSP with it. We

choose tag ID length of 64 bits as speciﬁed in the C1G2 standard. Note that the distributions

32

(a) required β = 0.2

(b) required β = 0.1

(c) required β = 0.05

(d) required β = 0.01

Figure 2.5 Observed conﬁdence interval vs. |A| when |B| = 5000, and |C| = 500

of the IDs of tags in A and B do not matter because RTSP is independent of ID distributions.

Next, we ﬁrst evaluate the accuracy of RTSP and then compare its execution time with the

execution time of TH. All results reported in this section are obtained from averaging over

200 independent runs of RTSP.

2.7.1 Accuracy

To evaluate the accuracy of RTSP, we study whether it achieves the required conﬁdence
interval for diﬀerent values of |A|, |B|, and |C|.

If we recall the how we calculate the

33

20004000600080001000000.10.20.30.4|A|Obeserved confidence interval20004000600080001000000.050.10.150.2|A|Obeserved confidence interval20004000600080001000000.020.040.060.080.1|A|Obeserved confidence interval20004000600080001000000.0050.010.0150.02|A|Obeserved confidence interval(a) required β = 0.2

(b) required β = 0.1

(c) required β = 0.05

(d) required β = 0.01

Figure 2.6 Observed conﬁdence interval vs. |B| when |A| = 5000, and |C| = 500

parameters we will know that RTSP can ensure the conﬁdence interval with any combinations
of |A|, |B|, and |C|.

2.7.1.1 Observed Conﬁdence interval vs. |A|

Our experimental results show that RTSP always achieves the required conﬁdence interval

regardless of the size of set A. Figures 2.5(a), 2.5(b), 2.5(c), and 2.5(d) plot the actual

conﬁdence interval RTSP achieved for diﬀerent sizes of set A when the required values of

conﬁdence interval are β = 0.2, β = 0.1, β = 0.05, β = 0.01, respectively. To plot these

34

20004000600080001000000.10.20.30.4|B|Obeserved confidence interval20004000600080001000000.050.10.150.2|B|Obeserved confidence interval20004000600080001000000.020.040.060.080.1|B|Obeserved confidence interval20004000600080001000000.0050.010.0150.02|B|Obeserved confidence intervalﬁgures, we ﬁxed number of tags in set B at 5000 and number of tags in A that are in B,

i.e., number of tags in set C at 500. The dashed horizontal line in each of these ﬁgures

shows the required value of conﬁdence interval and the solid line shows the observed values of

conﬁdence interval achieved by RTSP. We observe from these ﬁgures that the observed values

of conﬁdence interval are always smaller than the required values of conﬁdence interval.

2.7.1.2 Observed Conﬁdence interval vs. |B|

Our experimental results show that RTSP always achieves the required conﬁdence interval

regardless of the number of tags in population B. Figures 2.6(a), 2.6(b), 2.6(c), and 2.6(d) plot

the actual conﬁdence interval RTSP achieved for diﬀerent sizes of set B when the required

values of conﬁdence interval are β = 0.2, β = 0.1, β = 0.05, β = 0.01, respectively. To plot

these ﬁgures, we ﬁxed number of tags in set A at 5000 and number of tags in set C at 500.

We observe from these ﬁgures that the solid lines are always below their corresponding dashed

lines, which means that the actual values of conﬁdence interval are always smaller than the

required values of conﬁdence interval.

2.7.1.3 Observed Conﬁdence interval vs. |C|

Our experimental results show that RTSP always achieves the required conﬁdence interval

regardless of the number of tags in set C. Figures 2.7(a), 2.7(b), 2.7(c), and 2.7(d) plot the

actual conﬁdence interval RTSP achieved for diﬀerent sizes of set C when the required values

of conﬁdence interval are β = 0.2, β = 0.1, β = 0.05, β = 0.01, respectively. To plot these

ﬁgures, we ﬁxed number of tags in sets A and B at 5000 each. Again, we observe from these

ﬁgures that the solid lines are always below their corresponding dashed lines, which means

that RTSP always achieves the required conﬁdence interval.

35

(a) required β = 0.2

(b) required β = 0.1

(c) required β = 0.05

(d) required β = 0.01

Figure 2.7 Observed conﬁdence interval vs. |C| when |A| = 5000, and |B| = 5000

2.7.2 Execution Time

Execution time of RTSP is smaller than TH. Fig. 2.8(a) plots the execution times of TH
and RTSP vs. |A| for β = 0.1, |B| = 3000, and C = 500. We observe from this ﬁgure that
RTSP is up to 77.27% faster compared to TH. Similarly, Fig. 2.8(b) plots the execution
times vs. |B| for β = 0.1, |A| = 1000, and |C| = 500 and Fig. 2.8(c) plots the execution
times vs. |C| for β = 0.1, |A| = 5000, and |B| = 5000 . Again, we observe from these ﬁgures
that RTSP is always faster compared to TH. Finally, Fig. 2.8(d) plots the execution times

36

50010001500200000.10.20.30.4|C|Obeserved confidence interval50010001500200000.050.10.150.2|C|Obeserved confidence interval50010001500200000.020.040.060.080.1|C|Obeserved confidence interval50010001500200000.0050.010.0150.02|C|Obeserved confidence interval(a) Execution time vs. |A|

(b) Execution time vs. |B|

(c) Execution time vs. |C|

(d) Execution time vs. β

Figure 2.8 Comparison of execution times of RTSP and TH

vs. β for |A| = 5000, |B| = 5000, and |C| = 500. We observe from this ﬁgure that RTSP is
faster compared to TH as long as required conﬁdence interval is greater than 0.01. When the

required conﬁdence interval is less than 0.01, TH is faster. Thus, if privacy is not a concern,

a user should use TH to search for tags whenever β < 0.01. If, however, privacy is a concern,

then the user should use RTSP regardless of the value of β.

2.8 Future Work

In this section, I provide an overview of the planned future work.

37

200040006000800010000510152025|A|Execution time (seconds)  RTSPTH200040006000800010000010203040|B|Execution time (seconds)  RTSPTH50010001500200051015202530|C|Execution time (seconds)  RTSPTH0.050.10.150.20510152025βExecution time (seconds)  RTSPTH2.8.1 Valued Tag Detection and Identiﬁcation

As we shown before, RTSP can be used to detect and identify some tags we are caring about.

However, when we talk about the missing tag detection we are not mean all the missing tags

which maybe very cheap. For example, given two items A and B attached with tags( we will

just name as tag A and B), value of item A is 1k and item B is just one dollar. Apparently,

those two tag should be detected with diﬀerent accuracy and conﬁdence. We want to design

a protocol which can detect and identify tag A with 99.99% accuracy comparing with tag

B with 90% accuracy. However, RFID reader have no idea about value(the value can be

anything else not just measure by money) of the each tag which means we may need to design

a method to help reader to understand the value of each tags. I am working on this problem

and one idea is to use one hot coding to categories all the tags into diﬀerent categories

according to their values.

38

CHAPTER 3

RFID MULTI-CATEGORY TAGS ESTIMATION

3.1

Introduction

In this section I will brieﬂy introduce the background and problem statement of RFID

multi-category problem. The design of our scheme will also mention in this section.

3.1.1 Background and Problem Statement

Radio Frequency Identiﬁcation (RFID) has been widely used in many applications such as

inventory management [98, 122, 109, 55, 113, 110, 53], object tracking [212, 225, 114], and

localization [209, 210, 180]. A typical RFID system consists of readers, tags, and a back-end

server. The back-end server controls the reader to interrogate a set of tags, and the tags

respond with their IDs over a shared wireless medium. A tag is a microchip with an antenna

in a compact package that has limited computing power and communication ranges. The

reader communicates with the tags via wireless channel to identify or monitor the tagged

objects. There are two types of RFID tags: passive tags, which do not have their own power

sources and are powered up by harvesting the radio frequency energy from readers, and active

tags, which have their own power sources. In an RFID-enabled warehouse, there may be

thousands of tagged items that belong to diﬀerent categories, e.g., diﬀerent places of origin

or diﬀerent brands [207]. Each tag attached to an item has a unique ID that consists of

two ﬁelds: a category ID that speciﬁes the category of the attached object, and a member

ID that identiﬁes this object within its category. As a manager of the warehouse, one may

desire to timely monitor the product stock of each category. If the stock of a category is too

high, it may indicate that this category of products are not popular, and the manager needs

to adjust the marketing strategy (e.g., lowering prices to increase sales). On the contrary,

if the stock of a category is too low, the manager should perform stock replenishment as

39

soon as possible. Manual checking is laborious and of low time-eﬃciency. You can imagine

how diﬃcult it is for a manager to manually count the number of items in each category

that may be stacked together or placed on high shelves. Hence, it is desirable to exploit

the RFID technique to quickly obtain the number of tagged items in each category. We

concerns the practically important problem of multi-category RFID estimation: given a

set of RFID tags, we want to quickly and accurately estimate the number of tags in each

category. However, almost all the existing RFID estimation protocols are dedicated to the

estimation problem on a single set, regardless of tag categories. A feasible solution is to

separately execute the existing estimation protocols on each category. The execution time

of such a serial solution is proportional to the number of categories, and cannot satisfy the

delay-stringent application scenarios. Simultaneous RFID estimation over multiple categories

is desirable, hence, this paper proposes an approach called Simultaneous Estimation for

Multi-category RFID systems (SEM). SEM exploits the Manchester-coding mechanism, which

is supported by the ISO 18000-6 RFID standard, to decode the combined signals, thereby

simultaneously obtaining the reply status of tags from each category. As a result, multiple bit

vectors are decoded from just one physical slotted frame. Built on our SEM, many existing

excellent estimation protocols can be used to estimate the tag cardinality of each category in

a simultaneous manner. To ensure the predeﬁned accuracy, we calculate the variance of the

estimate in one round, as well as the variance of the average estimate in multiple rounds. To

ﬁnd the optimal frame size, we propose an eﬃcient binary search-based algorithm. To address

signiﬁcant variance in category sizes, we propose an Adaptive Partitioning (AP) strategy to

group categories of similar sizes together and execute the estimation protocol for each group

separately. Compared with the existing protocols, our approach is much faster, meanwhile

satisfying the predeﬁned estimation accuracy. For example, with 20 categories, the proposed

SEM+AP is about 7 times faster than prior estimation schemes. Moreover, our approach

is the only one whose normalized estimation time (i.e., time per category) decreases as the

number of categories increases. Radio Frequency Identiﬁcation (RFID) has been widely used

40

Figure 3.1 Single-one Manchester Coding

in many applications such as inventory management [98, 122, 109, 55, 53], object tracking

[222, 212, 225], and localization [209, 210, 180].

This paper formulates and addresses the practical problem of multi-category RFID
estimation. Given a set of RFID tags with λ categories denoted by C1, C2,··· , Cλ, whose
cardinalities are denoted by n1, n2,··· , nλ, respectively, a conﬁdence interval α ∈ (0, 1], and
a required reliability β ∈ [0, 1), we want to estimate the number of tags in each category
using one or more readers such that for each 1 ≤ i ≤ λ, we have P{| ˆni − ni| ≤ niα} ≥ β,
where ˆni is the estimate of ni.

3.1.2 Proposed Approach

In this paper, we propose an approach called Simultaneous Estimation for Multi-category
RFID systems (SEM). At the start of SEM, we inject a so-called single-one string (SO string
for short) into each tag. Given λ categories, the SO string injected into the tag belonging
to the i-th category is a vector of λ bits where exactly the i-th bit is 1 and all other bits
are 0s. For example, given 3 categories, the SO strings are 100, 010, and 001, respectively.

41

100100100Signal 1: 100Signal 2: 100100010xx0Signal 1: 100Signal 2: 010Combined Signal (a)(b)Combined Signal 100 is injected into the tags of the ﬁrst category; 010 is injected into the tags of the second

category; 001 is injected into the tags of the third category. Such a string injecting operation

can be easily implemented as follows. The reader uses SELECT command [59] to activate the

tags in a speciﬁc category while keeping the other tags inactive. Then, the reader broadcasts
the corresponding SO string, and the active tags record the received string in their memories.
The RFID tags respond to the reader’s query with the SO strings that are modulated by
Manchester coding mechanism. When querying two tags, which are in the i-th category and

the j-th category, respectively, if i = j, then the reader obtains a vector of λ bits where
exactly the i-th bit is 1 and all other λ − 1 bits are 0s; if i (cid:54)= j, then the reader obtains a
vector of λ bits where exactly the i-th bit and the j-th bit are collisions and all other λ − 2
bits are 0s. Speciﬁcally, as illustrated in Figure 3.1, 1 is encoded as a falling edge and 0 is

encoded as a rising edge in the Manchester coding. If all tags transmit 0 (or 1) at the same

time, the reader can successfully recover the bit as 0 (or 1); otherwise, the reader will detect

a bit collision x. Thus, from the bit vector that the reader obtains, we know exactly which

categories of tags responded in this slot. Note that Manchester coding is supported by the

RFID standard ISO 18000-6 [22] for detecting bit-level collisions [56, 92]. Many excellent

literature [225, 91] makes use of the bit-level synchronization to address RFID application

problems.

SEM is based on the standard Framed Slotted Aloha protocol [97] for MAC layer commu-

nication. First, the RFID reader initializes a slotted time frame by broadcasting a binary
request (cid:104)δ, f(cid:105), where δ is a random seed and f is the frame size (i.e., the number of slots
in the forthcoming frame). Each tag randomly chooses a slot in the frame to reply its SO
string. Speciﬁcally, each tag initializes its slot counter sc = H(ID, δ) mod f, which follows
a uniform distribution within [0, f − 1]. The reader broadcasts the QueryRep command at
the end of each slot to inform every tag to decrement its slot counter sc by 1. In each slot, a

tag responds to the reader once its slot counter sc becomes 0. At the end of each frame, the

reader obtains an array of f ternary strings where each ternary string has λ bits and each bit

42

Figure 3.2 From one physical frame to λ = 3 logical frames

has a value of 0, 1, or x. We call this array a physical frame. For the λ-bit ternary string
ti of the i-th slot, for each 1 ≤ j ≤ λ, if ti[j] = 0, then there is no tag in category Cj that
responded in the i-th slot; if ti[j] = 1, then only tags in category Cj responded in the i-th
slot; if ti[j] = x, then more than one tag responded in the i-th slot: at least one in category
Cj and the remaining not in category Cj. Thus, from this physical frame, we can obtain λ
logical frames, one for each category, where the logical frame for category Ci is the same as
the physical frame that the reader could obtain if the tag population only contains the tags
in category Ci. Figure 3.2 shows an example of obtaining λ logical frames from a physical
frame. For example, in the third slot, the ternary string xxx is the collision result of three

types of single-one strings: 100, 010, and 001.

We now zoom into the logical frame for category Ci. For each slot, we either have the SO

43

100xxxxx00010xx100100100Logical frame for C1:010010010001001001Physical frame:Category 1: 100Category 2: 010Category 3: 001Logical frame for C2:Logical frame for C3:11Bit vector for C1:111111Bit vector for C2:Bit vector for C3:0100000000000000string or nothing. By denoting the slot containing SO string with 1 and the slot that is empty
with 0, we can obtain a bit vector with f bits. Figure 3.2 shows three bit vectors that we

obtain. Based on the bit vectors obtained by SEM, many excellent tag estimation protocols,

as summarized in Table ??, can be used to simultaneously estimate the tag cardinality of each

category. For example, Enhanced Zero-Based estimator (EZB) [89] relies on an important

intuition: the fewer tags are, the more empty slots will appear in the frame. Thus, EZB can

exploit the number of empty slots in a frame to conduct the tag estimation. Here, we could

use the number of 0s in each bit vector as the input of EZB to estimate the number of tags

in the corresponding category. Besides EZB, many existing protocols such as FNEB [78] that

leverages the index of the ﬁrst non-empty slots in the frame, LoF [155] that makes use of

the length of continuous non-empty slots, ART [177] that exploits the average run length of

non-empty slots, can be built on our SEM to achieve simultaneous estimation over multiple

categories.

Using our SEM approach, previous RFID estimation protocols can be signiﬁcantly acceler-

ated when facing the multi-category estimation problem. In the following, we use a numerical

example to show this point. Let tγ represent the duration of a slot for transmitting γ-bit data
and is given by τw + γ × τb, where τw is the waiting time and τb is the time for transmitting
one bit. Typically, τw = 302us and τb = 18.8us [215, 156]. As there are λ categories, in SEM
each slot contains λ-bit single-one string, i.e., γ = λ. Thus, the time cost of an SEM frame is
f (τw + λ × τb). On the other hand, the time cost of executing a frame of existing protocol
once for each category is λf (τw + τb), where in this case γ = 1 because in existing protocols,
each slot needs to carry only a single bit to indicate empty or non-empty. By comparing the

execution time of a frame of SEM and existing protocols for all categories, it is easy to see

that the number of slots executed by SEM are much smaller than the total number of slots

executed by existing protocols. For example, when λ = 30, SEM is almost 11 times faster

than the existing estimation protocols.

44

3.1.3 Challenges and Proposed Solutions

The ﬁrst key challenge is to guarantee the required estimation accuracy speciﬁed by conﬁdence
interval α ∈ (0, 1] and required reliability β ∈ [0, 1) for all categories. As the estimation
based on one round of SEM has an inherent variance due to the probabilistic nature, we

execute multiple rounds of SEM to reduce the variance of the estimate of each category.

To ensure that SEM achieves the required accuracy, we ﬁrst calculate the variance of the

estimate for one round and the variance of the average estimate in multiple rounds. Then, we

use statistical methods to ﬁnd the minimum number of rounds that can achieve the required

accuracy.

The second key challenge is to choose an optimal frame size f that minimizes the estimation

time. The key factor that aﬀects estimation time is f. We show that the execution time is a

convex function with respect to the frame size, which means that the estimation time is long

when the frame size is too small or too large. To ﬁnd the optimal frame size, we propose an

eﬃcient binary search-based algorithm.

The third key challenge is to deal with categories that vary signiﬁcantly in size. To

minimize the estimation time, categories with small sizes demand a small frame size, whereas,

categories with large sizes demand a large frame size. To address this issue, we propose an

Adaptive Partitioning (AP) to group categories of similar sizes together and execute SEM for

each group separately. Although this introduces more times of executing SEM, the estimation

time for each group is well optimized as the categories in each group have similar sizes. Such

a hybrid strategy has a smaller estimation time in comparison with the two extreme strategies

of estimating each category separately and estimating all categories together. As we do not

know category sizes in advance, we adaptively partition the categories based on the execution

of previous rounds.

45

3.1.4 Novelty and Advantage over Prior Art

The key technical novelty of this paper lies in proposing an Single-one Manchester coding-

based approach called SEM, built on which traditional tag estimation protocols can be used

to address the multi-category RFID estimation in a simultaneous manner. The key technical

depth of this paper is in the mathematical development of SEM in addressing the three

technical challenges of guaranteeing accuracy, choosing frame sizes, and partitioning categories.

The key advantage of our approach over prior art is that SEM can decode multiple bit vectors

from just one physical frame to simultaneously estimate the tag cardinality of each category.

Compared with the prior separate estimation methods, our SEM approach signiﬁcantly

reduces the number of physical slots, and thus achieves much better time-eﬃciency. For

example, for an RFID system with 20 categories, our SEM+AP uses 2 seconds whereas the

state-of-the-art ART protocol takes 14 seconds [177]. It represents that our SEM+AP is 7x

faster than ART. As the number of categories increases, the normalized estimation time of

our approach decreases, whereas, that of prior estimation protocols does not.

The rest of the paper is organized as follows. In Section 3.2, we review the related work.

In Section 3.3.1, we present our SEM approach in detail along with its analysis. In Section

3.4, we describe how to calculate the optimal values for system parameters to minimize

the estimation time of SEM while achieving the required reliability.

In Section 3.5, we

describe how SEM adaptively partitions the categories into comparable sizes to reduce the

estimation time. In Section 3.6, we present results from our extensive evaluation of the

proposed approach and its comparison with the existing protocols. Finally, in Section 3.7, we

conclude the paper.

3.2 Related Work

At the infancy stage of RFID research, the academic communities have paid much

attention to the exact tag identiﬁcation problem [97, 178], which is to exactly identify the

tag IDs within the interrogation range of an RFID reader. Generally, there are two types of

46

tag identiﬁcation protocols: Aloha-based protocols and Tree-based protocols. Their basic

principles are presented as follows. Fundamentally, the Aloha-based protocol is a kind of

Time Division Multiple Access (TDMA) mechanism. A tag ID can be successfully identiﬁed

in a slot when only one tag responds in this slot. As for tree-based protocols, the reader

broadcasts a 0/1 string to query the tags. A tag responds with its ID once it ﬁnds that

the queried string is the preﬁx of its ID. A reader identiﬁes a tag ID when only one tag

responds. Although RFID identiﬁcation protocols can be used to obtain the exact tag IDs, it

is a well-recognized fact that the tag identiﬁcation protocols are slow because their execution

time is proportional to the number of tags. For some purposes like stock monitoring, it is

not eﬃcient to execute the tag identiﬁcation protocols because we only need to know the

approximate number of tags instead of exact tag IDs.

Another direction of research on RFID systems is targeted at the cardinality estimation of

RFID tag populations. Kodialam et al. proposed the ﬁrst set of cardinality estimation schemes,

USE and UPE, which use the number of empty or collision slots to estimate population

sizes [88]. Similarly, Zheng et al. proposed Probabilistic Estimation Tree (PET) to estimate

cardinalities for tree-based RFID systems [220]. Shahzad et al. proposed ART, which uses

the average run length of non-empty slots for cardinality estimation [177]. Li et al. proposed

Maximum Likelihood Estimator (MLE), which looks at the energy aspect of cardinality

estimation [101]. Liu et al. studied the problem of key tag population tracking [121]. Gong

et al. investigated INformative Counting (INC) to estimate the number of counterfeit tags

whose IDs are not stored in a database [74]. For privacy reason, RFID estimation with the

presence of blocker tag is investigated in [118].

The above literature assumes that all tags within an RFID system belong to the same

category. However, in practical scenarios, tags are usually classiﬁed into diﬀerent categories

according to brands. In recent years, the researchers have shifted some attention to the

interesting problems raising in the multi-category RFID systems. Sheng et al. addressed the

problem of identifying categories whose cardinalities are above a given threshold [182]. They

47

proposed the Group Testing (GT) scheme, that rapidly eliminates the groups containing

small-sized categories. Luo et al. claimed that the GT protocol is not suitable for RFID

systems in which the sizes of a large number of categories are above a threshold, because

each group has a high probability of containing a large-size category, and thus is diﬃcult

to eliminate. To accommodate this situation, they proposed an eﬃcient Threshold-Based

Classiﬁcation (TBC) Protocol [128] that obtains multiple logical bitmaps from a single

time frame. Each bitmap is used to approximate the tag cardinality of a category. The

categories whose cardinalities are obviously above (or below) the given threshold can be

rapidly eliminated. Unfortunately, GT and TBC protocols can only identify the categories

with sizes greater than a threshold, but cannot estimate sizes of individual categories. The

work closest to ours, focusing on multi-category RFID system, is Ensemble Sampling (ES)

[207], which exploits the number of singleton slots occupied by each category in a time frame

to estimate the tag cardinality in each category. ES can only distinguish three types of slots:

empty slot, singleton slot, and collision slot. For collision slot, ES only knows two or more

tags responded in this slot, and nothing else. How to make full use of the information in each

type of slots especially that in the collision slots is the key to achieve better time eﬃciency.

The proposed SEM exploits the Single-one Manchester coding string, and could know which

categories of tags responded in a collision slot. From a single physical frame, it can derive

multiple logical frames, and each servers the tag cardinality estimation for a category.

3.3 Proposed Research

In this section, I explain the proposed research methods to address the problem of

multi-category RFID tags estimation.

3.3.1 SEM: Estimator and Variance

To estimate the number of tags in each category, SEM executes multiple Aloha frames. At

the end of each frame, it obtains a bit vector for each category. Based on the obtained bit

48

vectors, SEM can perform any estimator mentioned before to simultaneously estimate the

number of tags of each category. Here, for the purpose of clarity, we let SEM exploit the most

classical estimator in EZB [89]. Note that, if more advanced estimators such as ART [177]

or SRC [51] are used, the performance of SEM can be further improved. An insight behind

EZB: the fewer tags are there, the more empty slots appear in the frame. Hence, we make

use of the number of 0s in each bit vector to perform the estimation. The estimate obtained

from the number of 0s observed in a single bit vector is not accurate due to the variance

associated with the estimation process. Thus, instead of executing a single round, SEM

executes k rounds and obtains k estimates of the number of tags in that category. It then

calculates the average of those k estimates to obtain the ﬁne-grained estimate. Next, we ﬁrst

formally derive the estimator that SEM uses to estimate the size of any given category, using

the number of 0s in the bit vector corresponding to that category as input. Then, we derive

the expression for variance of the estimator, which we will use in Section 3.4 to determine

the values of system parameters to ensure that SEM achieves the required reliability in the

minimum possible time. Table 3.1 summarizes the main notations used in this paper.

3.3.2 Estimator of SEM

Based on such a logical frame, many previous literature proposed excellent schemes to estimate

the tag cardinality. Murali Kodialam et al. proposed to use the number of empty slots

to perform the estimation [88]. Muhammad Shahzad et al. proposed to use average run

length of non-empty slots for estimation [177]. As illustrated in Fig. 3.2, in an arbitrary bit
vector for category Ci, we know the number of 0s. Intuitively, the more tags there are in
category Ci, the less 0s there are in the bit vector. In fact, there is a monotonically functional
relationship between the number of 0s in the bit vector and the tag cardinality ni. Later, we
will present a rigorous analysis to derive the functional relationship. Hence, we could use the
number of 0s observed in the bit vector to estimate the tag cardinality ni [88]. Note that,
such an empty-slot-based estimator was proposed in [88]. We do not claim novelty on such

49

Table 3.1 Main notations used in the paper.

Notations Descriptions

λ
Ci
ni
α
β
ˆni
δ
f
f(cid:48)
H(·)
τw
τb
tγ
pi,0
Ni,0
ˆni,j

Ak( ˆni)

(cid:96)

# of tag categories under estimation.
category ID, i ∈ [1, λ].
tag cardinality of category Ci.
required conﬁdence interval.
required reliability.
estimate of ni.
random number.
broadcast frame size.
executed frame size.
uniform hash function.
waiting time in a slot.
time for transmitting 1-bit data from tag to reader.
slot length that transmits γ-bit data. tγ = τw+γτb.
probability that a bit in the bit vector is 0.
# of 0s in the bit vector corresponding to category Ci.
estimate for category Ci of the jth frame.
averaged result of k rounds of estimation for category
Ci. Ak( ˆni) = 1
k
the parameter of (cid:96)-sigma method. (cid:96) = 0, 1, 2, or 3.

(cid:80)k

j=1 ˆni,j

an estimation idea. The novelty of this paper lies in proposing single-one machester coding

to parallelize the estimation processes of multiple tag categories. The frame size should be no

more than 512 in practice [177, 178] (the detailed reasons can be found in literature [178]).

A solution used in [177] is to initialize a long frame with length of f, but terminate the
frame after the ﬁrst f(cid:48) slots. Let ni represent the number of tags in category Ci. Let f
represent the number of slots that the reader broadcasts at the start of the frame. We call f
the broadcast frame size. Let pi,0 represent the probability that any bit in the bit vector of
category Ci is 0. Formally, for large values of f, the probability pi,0 is given by the following
equation.

(cid:18)

(cid:19)ni ≈ e

− ni
f

pi,0 =

1 − 1
f

(3.1)

In the above equation, such an approximation is usually made in previous literature [119, 177].

50

Let the reader terminate the frame after executing f(cid:48) slots, where f(cid:48) ≤ f. We call f(cid:48)
the executed frame size. Let Ni,0 be the random variable for number of 0s observed in the
ﬁrst f(cid:48) bits of the bit vector of category Ci. As the probability for any bit to be 0 is pi,0,
the random variable Ni,0 follows binomial distribution Binom(f(cid:48), pi,0). Thus, the expected
value of Ni,0 is given by the following equation.

− ni
f

E(Ni,0) = f(cid:48) · pi,0 = f(cid:48)e
(cid:27)
(cid:26) E(Ni,0)

ni = −f ln

Solving Eq. (3.2) for ni, we get the following equation.

f(cid:48)

(3.3)
This equation shows that for ﬁxed given values of f and f(cid:48), ni is a monotonically decreasing
function of E(Ni,0). Thus, we can estimate the value of ni by substituting E(Ni,0) in the
equation above by the observed value of Ni,0 from the logical frame of category Ci. Recall that
Ni,0 represents the number of 0s observed from the bit vector of category Ci. Substituting
Ni,0 for E(Ni,0) in Eq. (3.3), we get the estimator ˆni of ni as follows.

(3.2)

(3.4)

(3.5)

ˆni = −f ln

3.3.3 Variance of SEM

(cid:26) Ni,0

(cid:27)

f(cid:48)

The following lemma calculates the variance in the estimator derived in Eq. (3.4).
Lemma 1. Let f and f(cid:48) be the broadcast and executed frame sizes, respectively, and ni
be the number of tags in category Ci. The variance in the estimate ˆni of ni is given by the
following equation.

(cid:18)

(cid:19)

V ar( ˆni) =

ni
f − 1

e

f 2
f(cid:48)

Proof. According to Eq. (3.4), ˆni is a function of the random variable Ni,0. Thus, we express
ˆni as φ(Ni,0). The Taylor’s series expansion of φ(Ni,0) around E(Ni,0) is given by the
following equation.

ˆni = φ(Ni,0) = φ(η) +

∂φ
∂Ni,0

(Ni,0 − η),

51

where

∂φ
∂Ni,0

is the ﬁrst-order derivative. Taking the expectation of both sides of the equation

above, we have:

E[ ˆni] = E[φ(η)] +

∂φ
∂Ni,0

E[(Ni,0 − η)] = φ(η)

The variance of ˆni can now be calculated using the following expression.

V ar( ˆni) = E[ ˆni − E( ˆni)]2 =

V ar(Ni,0)

(cid:20) ∂φ

(cid:21)2

∂Ni,0

As required by the equation above, we next calculate the ﬁrst-order derivative
and the variance V ar(Ni,0).

∂φ
∂Ni,0
− ni
Replacing Ni,0 by η = E(Ni,0) = f(cid:48)e

= −f × f(cid:48)

× 1
f(cid:48)

Ni,0

(3.6)

∂φ
∂Ni,0

|Ni,0=η

f in the equation above, we get:
|Ni,0=η = − f
f(cid:48) e

ni
f

∂φ
∂Ni,0

(cid:18)

(cid:19)

As Ni,0 ∼ Binom(f(cid:48), pi,0), the variance V ar(Ni,0) is given by the following equation.

V ar(Ni,0) = f(cid:48)pi,0(1 − pi,0) = f(cid:48)e

− ni
f

1 − e

− ni
f

(3.7)

Substituting the expressions of
of ˆni as given in Eq. (3.5) in the lemma statement.

∂φ
∂Ni,0

|Ni,0=η and V ar(Ni,0) into Eq. (3.6), we get the variance

3.4 SEM: Parameter Optimization

In this section, we will derive equations that the controller uses at the start of ith frame
to calculate the optimal values of frame size f and f(cid:48) and number of frames k to minimize
the execution time of SEM while ensuring that its actual conﬁdence interval is less than

the required conﬁdence interval. Recall from Section 3.3.1 that SEM executes k frames to

estimate the number of tags in each category. Next, we ﬁrst derive the expression to calculate

the value of k, which ensures that SEM achieves the required reliability. After that, we derive
expressions to calculate broadcast frame size f and executed frame size f(cid:48), which ensure that
the execution time of SEM is the minimum.

52

3.4.1 Number of Frames k

Let ˆni,j represent the estimate of the number of tags in category Ci obtained from the jth
frame. Let Ak( ˆni) represent the average of the k estimates obtained from the k frames, i.e.,
j=1 ˆni,j. In what follows, Theorem 1 calculates the value of k which ensures
Ak( ˆni) = 1
k
that the average estimate satisﬁes the required reliability.

(cid:80)k

Theorem 1. Given required conﬁdence interval α, required reliability β, broadcast frame size
f, and executed frame size f(cid:48), the average estimate Aki
( ˆni) of the number of tags in category
( ˆni) − ni| ≤ niα} ≥ β when the average is obtained from
Ci satisﬁes the requirement P{|Aki
ki frames, where ki satisﬁes the following equation.
ni
f − 1
f(cid:48)

(cid:18) f Zβ

ki ≥

(3.8)

αni

Proof. As SEM uses diﬀerent seeds for each frame, the ki frames are independent of each
other. According to the central limit theorem,
is a random variable that
follows the standard normal distribution. Let us represent this random variable by ℵ. As ℵ
follows a standard normal distribution, for any required reliability β, there exists a number
Zβ such that

( ˆni)]



(cid:19)2e
(cid:113)
( ˆni)−E[Aki
V ar[Aki
( ˆni)]

Aki

The requirement P{|Aki

Comparing Eqs. (3.9) and (3.10), SEM will achieve the required reliability when the following

conditions hold.

P (−Zβ ≤ ℵ ≤ Zβ) = β

P

 ≥ β

( ˆni)]

( ˆni)]

( ˆni)]

( ˆni)]

(cid:113)

V ar[Aki

≤ℵ≤ (1+α)ni−E[Aki
V ar[Aki

( ˆni) − ni| ≤ niα} ≥ β can be written as below.
 (1−α)ni−E[Aki
(cid:113)

(cid:113)
(1 − α)ni − E[Aki
( ˆni)]
(cid:113)
(1 + α)ni − E[Aki
( ˆni)]

≤ −Zβ

V ar[Aki

≥ Zβ,

( ˆni)]

( ˆni)]

V ar[Aki

53

(3.9)

(3.10)

(3.11)

Next we calculate the expectation and variance of Aki

( ˆni).

E[Aki

( ˆni)] =

V ar[Aki

( ˆni)] =

1
ki

1
k2
i

ki(cid:88)
ki(cid:88)

j=1

j=1

E( ˆni,j) = ni

ni
f −1) =

f 2
f(cid:48) (e

ni
f −1)

f 2
kif(cid:48) (e

(3.12)

Substituting the expressions for E[Aki
in Eq. (3.11) and rearranging, we get the inequality in Eq. (3.8).

( ˆni)] and V ar[Aki

( ˆni)] into either of the two inequalities

3.4.2 Frame Sizes f and f(cid:48)
For the given values of f(cid:48) and f, Theorem 1 calculates the number of frames that SEM must
execute to achieve the required reliability. Next we optimize the values of the executed and

broadcast frame sizes to ensure that the estimation time of SEM is minimized.

(f Zβ )2
f(cid:48)(αni)2 (e

Let Ti represent the minimum execution time needed by category Ci, tλ represent the
duration of each slot, and tξ represent the time that the reader takes to transmit the ξ-
ni
bit parameters for frame initialization. Thus, Ti = ki × (tξ + f(cid:48) × tλ) =
f −
1) × (tξ + f(cid:48) × tλ). Let T represent the execution time of SEM for all categories, which
should be equal to the longest execution time among all minimum execution times for the
λ categories. It is easy to see that Ti is a monotonically decreasing function of f(cid:48) because
tξ
its ﬁrst derivative ∂Ti
f(cid:48)2 is always negative. Therefore, SEM always
sets f(cid:48) to its maximum value. According to C1G2, the executed frame size f(cid:48) should be
no more than 512 due to practical reasons [177]. Meanwhile, executed frame size f(cid:48) should
also be smaller than the broadcast frame size f. Brieﬂy, we set f(cid:48) as min{512, f}. Next, we
will show that Ti is a convex function of ni. To prove convexity, a suﬃcient and necessary
condition is that the second-order derivative of Ti with respect to ni is always larger than 0.
The following equation calculates the second-order derivative of Ti with respect to ni.

∂f(cid:48) = − Z2

ni
f − 1)

β
α2n2
i

f 2(e

f 2Z2
β

(tξ +f(cid:48)tλ)
f(cid:48)α2

∂2Ti
∂n2
i

=

(cid:33)

(cid:35)

1

f 2n2
i

− 4
f n3
i

+

6
n4
i

− 6
n4
i

(cid:32)

(cid:34)

ni
f

e

54

(3.13)

(cid:114) 1

For simplicity, we substitute (
f 2n2
√
i
6−4
= 2
> 0. Furthermore, using the fourth-order Taylor series
f n3
i

) with Φ. Note that Φ = (

+ 6
n4
i

+ 6
n4
i

f 2n2
i

1

1

2

f 2n2
i

× 6
n4
i
expansion of e

− 4
f n3
i
ni
f , we know that e

ni
f > 1 + ni

f +

n2
i
2f 2 +

n3
i
6f 3 +

n4
24f 4 . Then, Eq. (3.13) can be
i

− 4
f n3
i

) ≥

− 4
f n3
i

written as the following inequality.

f 2Z2

β(tξ+f(cid:48)tλ)
f(cid:48)α2

∂2Ti
∂n2
i

>

(cid:34)(cid:32)

(cid:33)

(cid:35)

1+

ni
f

+

n2
i
2f 2 +

n3
i
6f 3 +

n4
i
24f 4

Φ− 6
n4
i

f 2Z2
β

(tξ+f(cid:48)tλ)
f(cid:48)α2

>

+

( 2
f n3
i

Substituting the value of Φ in the inequality above and simplifying, we get ∂2Ti
∂n2
i

n2
24f 6 ) > 0. As this second-order derivative is always greater than 0, Ti is a convex
1
i
12f 4 +
function of ni. Let Cx and Cy be the categories with the fewest and the most number of tags,
respectively, among all λ categories. Let nx and ny be the number of tags in the categories
Cx and Cy, respectively. By the property of convex function, the maximum value of Ti
lies at one of the two boundary points, i.e., (nx, Tx) or (ny, Ty). Thus, T = max{Tx, Ty}.
Minimizing the overall time T is equivalent to minimizing max{Tx, Ty}. Formally, we need
to solve the following optimization problem to ﬁnd out the optimal values of f(cid:48) and f to
minimize max{Tx, Ty}.

Minimizing max{Tx, Ty}

s.t. f(cid:48) ∈ [1, 512]

f(cid:48) ≤ f
Cx is the smallest category under estimation

(cid:32)

Cy is the largest category under estimation
tξ + f(cid:48) × tλ

ni
f − 1

Ti =

e

×(cid:16)

(cid:17)

(f Zβ)2
f(cid:48)(αni)2

(cid:33)

(3.14)

i = x or y

In the optimization problem formulated in Eq. (3.14), the executed frame size f(cid:48) should be
no more than 512 due to practical reasons [177]. It is easy to enumerate each possible value
of f(cid:48) to ﬁnd the optimal one because of its small value range. However, f has a large value

55

range, and the enumeration method is not suitable when optimizing its value. Therefore,

we investigate how to quickly optimize the value of f in the following. We ﬁrst show that
max{Tx, Ty} is a convex function of f, which means that there is a value of f for which the
execution time of SEM is the minimum. Then, we describe a simple binary search-based
method to determine the optimal value of f. The second-order derivative of Ti with respect
to f is given by the following equation.

(cid:32)

(cid:34)

ni
f

e

(cid:33)

(cid:35)

n2
f 2 − 2ni
i

f

+ 2

− 2

(3.15)

∂2Ti
∂f 2 =
For simplicity, we substitute n2

β(tξ + f(cid:48)tλ)
Z2
i f(cid:48)
α2n2
f 2− 2ni

i

Substituting e

the following inequality.

∂2Ti
∂f 2 >

Z2
β

(tξ +f(cid:48)tλ)
i f(cid:48)
α2n2

(cid:32)

f −1)2+1 > 0.
ni
f with its fourth-order Taylor series in Eq. (3.15) and simplifying, we have

f +2 with Ψ. Note that Ψ =

f +2 = ( ni

n2
f 2− 2ni
i

(cid:33)

n3
i
3f 3

+

n4
i
4f 4

+

n5
i
12f 5

+

n6
i
24f 6

> 0

(3.16)

As the second order derivative of Ti, with respect to f, is always greater than 0, Ti is a convex
function of f. Thus, Tx and Ty are both convex functions of f. Consequently, max{Tx, Ty}
is also a convex function of f.

Leveraging this convexity of max{Tx, Ty} with respect to f, SEM uses a fast binary-
searching algorithm to ﬁnd the optimal value of f. Given a f(cid:48) ≤ 512, SEM ﬁrst initializes
flow to f(cid:48), and fhigh to 3ny. We have observed through simulations that 3ny is a good upper
bound on the size of broadcast frame. Second, SEM calculates the ﬁrst-order derivative of
max{Tx, Ty} at flow+fhigh
. If this derivative is less than 0, it updates flow to flow+fhigh
otherwise, it updates fhigh to flow+fhigh
. SEM recursively performs this search until
flow = fhigh, at which point it stops and returns the value of f as f = flow = fhigh.

;

2

2

2

3.4.3 Dynamic Parameter Adjusting

To calculate the optimal values of system parameters, our proposed methods assume that

SEM already knows the size of each category apriori. However, the category sizes are unknown

56

apriori and are actually the quantity we need to estimate. Next, we present how to obtain

rough estimates of category sizes, which are then used to calculate the optimal values of

system parameters.

Before executing the ﬁrst frame, SEM sets the size of the smallest category to nmin and
the largest category to nmax, where nmin and nmax are the lower and upper bounds on
category sizes, respectively, and are provided by the system administrator. Using nmin and
nmax as inputs, we calculate the broadcast frame size f using the binary search-based method
proposed above. Note that our binary search based method is not sensitive to the rough
values of nmin and nmax because the system parameter values converge to their near optimal
values after only a few frames. After executing κ > 1 frames, we get average estimate Aκ( ˆni)
for each category Ci. This Aκ( ˆni) is used to calculate the number of required frames, and
should be repeated using Eq. (3.8).

3.4.4 Avoiding Premature Termination

As we calculate the number of times the frames are executed (i.e., ki) using the estimated
value Aκ( ˆni), which is not very accurate when κ is small, the value of ki may be smaller
than what it should be. Consequently, SEM may stop after executing fewer frames than it
should have executed causing the estimated size of category Ci do not satisfy the required
reliability. In other words, the estimation process for category Ci is terminated too early,
which we call premature termination. As ki is a monotonically increasing function of ni,

instead of substituting ni with Aκ( ˆni), SEM substitutes ni with Aκ( ˆni) + (cid:96) ·(cid:112)V ar[Aκ( ˆni)]

to calculate the value of ki. The variance of Aκ( ˆni) was calculated in Eq. (3.12). According
to the famous three-sigma rule [183], (cid:96) = 3 should be large enough. We name this method of
calculating ki as the (cid:96)-sigma method. Through extensive simulations in Section 3.6, we show
that our (cid:96)-sigma method is highly eﬀective against premature termination.

57

Figure 3.3 Separate estimation vs. simultaneous estimation in a balanced RFID system that

contains two categories C1 and C2 with sizes of 100 and 110 tags respectively.
(α, β) = (5%, 95%). (a) SEM on C1. (b) SEM on C2. (c) SEM on C1 and C2.

Figure 3.4 Separate estimation vs. simultaneous estimation in an unbalanced RFID system

that contains two categories C1 and C2 with sizes of 100 and 2000 tags respectively.

(α, β) = (5%, 95%). (a) SEM on C1. (b) SEM on C2. (c) SEM on C1 and C2.

3.5 SEM: Adaptive Partitioning

Until now, we have described how SEM executes multiple frames for all categories

simultaneously, and estimates the sizes of the categories. This strategy works well only when

all categories are balanced, i.e., sizes of all categories are similar. When the categories are

unbalanced, i.e., sizes of categories are very diﬀerent, simultaneously estimating sizes of all

categories adversely aﬀects the performance of SEM. Next, we discuss the two scenarios of

balanced and unbalanced categories, respectively.

58

406080100120406080100120  0.911.11.21.31.41.5406080100120406080100120  0.91.01.11.21.3Optimal pair(cid:269)68, 68, 0.8378s(cid:270)Executed frame sizeBroadcast frame size(a)Executed frame sizeBroadcast frame size(b)Optimal pair(cid:269)74, 74, 0.8311s(cid:270)406080100120406080100120  0.91.01.11.21.31.41.5Executed frame sizeBroadcast frame size(c)Optimal pair(cid:269)68, 68, 0.8826s(cid:270)Optimal pair (cid:9)512, 700, 1.0038s(cid:10)406080100120406080100120  0.91.01.11.21.3Optimal pair (cid:9)68, 68, 0.8378s(cid:10)Executed frame sizeBroadcast frame size450470490510450500550600650700  1.11.21.31.41.51.61.71.81.92(cid:15)(cid:17)2.1Broadcast frame size350400450500350400450500550600  33.544.5Optimal pair(cid:1)(cid:9)429, 429, 2.56s(cid:10)Broadcast frame size(a)Executed frame size(b)Executed frame size(c)3.5.1 Category Types Analysis

3.5.1.1 Balanced Categories

We ﬁrst consider an RFID system that consists of two categories C1 and C2 with similar sizes
of 100 and 110 tags, respectively. Fig. 3.3(a) and (b) respectively show the minimal execution

time of SEM when it is separately executed on tags in category C1 and C2. In the ﬁgure, the
optimal pair, e.g., (68, 68, 0.8378s), means that the optimal values of both f(cid:48) and f for SEM
are 68, and the corresponding minimum execution time is 0.8378s. Note that the minimum

time SEM takes to solely estimate the number of tags in category C1 is 0.8378s. And the
time for category C2 is 0.8311s. Clearly, the total time of SEM when executed separately
for each category is 1.6689s. In contrast, as shown in Fig. 3.3 (c), the minimum time SEM

takes to simultaneously estimate the number of tags in both categories C1 and C2 is just
0.8826s, which is much smaller than the time SEM takes to estimate the number of tags in

the categories separately. Thus, simultaneous estimation performs much better than separate

estimation method in such a balanced RFID system.

3.5.1.2 Unbalanced Categories

Fig. 3.4(a) and (b) plot the minimum execution times of SEM for two categories C1 and C2
with quite diﬀerent sizes of 100 and 2000 tags, respectively. The minimum time SEM takes

to estimate the number of tags in categories C1 and C2 separately are 0.8378s and 1.0038s,
resulting in the total time of 1.8416s. In contrast, as shown in Fig. 3.4(c), the minimum

time SEM takes to simultaneously estimate the number of tags in both categories is 2.56s,

which is much larger than the time SEM takes to separately estimate the number of tags

in the categories. This happens because, for the unbalanced categories, it is hard to ﬁnd a
pair of parameters (cid:104)f, f(cid:48)(cid:105) that simultaneously ﬁt categories with large and small sizes. Thus,
separate estimation performs better in the scenario of unbalanced categories.

From the above case studies of balanced and unbalanced categories, we conclude that

59

when the category sizes are unbalanced, SEM should ﬁrst partition categories into groups

such that the sizes of categories in the same group are comparable and then simultaneously

estimate the sizes of categories in individual groups. This will reduce the overall estimation

time of SEM. Next, we describe how SEM partitions categories into groups.

3.5.2 Adaptive Partitioning

At start, SEM assumes that all categories belong to the same group. Without loss of
generality, it assumes that all categories are arranged in a list L1,λ in ascending order, i.e.,
L1,λ = (cid:104)n1, n2, .., nλ(cid:105), and for any i, j ∈ [1, λ], if i < j, we have ni ≤ nj. As aforementioned,
the sizes of the smallest and largest categories in a group, i.e., n1 and nλ in this case,
determine the estimation time of SEM. We represent the minimum time of SEM on a group
that has the smallest category size ni and the largest category size nj by Ti,j. Recall that
the estimation time of SEM is minimum when the values of n, f(cid:48), and f are calculated as
described in Section 3.4.

SEM partitions the group represented by list Lx,y = (cid:104)nx, .., ny(cid:105) into two groups represented
by lists Lx,s = (cid:104)nx, .., ns(cid:105) and Ls+1,y = (cid:104)ns+1, .., ny(cid:105), where the value of s should satisfy the
following two conditions.

1. Tx,s + Ts+1,y ≤ Tx,y
2. ∀z ∈ [x, y − 1], Tx,s + Ts+1,y ≤ Tx,z + Tz+1,y

SEM recursively applies this partitioning method on groups starting with x = 1 and y = λ
and continues until for a given group represented by list Lx,y, there is no s ∈ [x, y − 1]
that satisﬁes the ﬁrst condition. Fig. 3.5 shows an example where SEM partitions a large
unbalanced group represented by the list (cid:104)n1, n2, n3, n4, n5, n6(cid:105) into several small balanced
groups represented by the lists (cid:104)n1, n2(cid:105), (cid:104)n3, n4(cid:105), and (cid:104)n5, n6(cid:105). Note that, in Fig. 3.5, Tx,x
(e.g., T1,1) means the minimum estimation time of SEM on a group that contains just one
category Cx (e.g., C1). After obtaining the small balanced groups, SEM takes one balanced
group at a time and estimates the sizes of categories in that group simultaneously.

60

Figure 3.5 Example of Adaptive Partitioning (AP): an initial unbalanced group is partitioned

into 3 balanced groups.

Just like in the calculation of optimal system parameters, adaptive partitioning also needs

the size of each category apriori. At the very beginning, we do not know the number of tags

in each category at all. Hence, for the ﬁrst round of SEM, we let all the categories be in the

same group. After the ﬁrst round of estimation, SEM uses the method proposed in Section

3.4.3 to obtain the rough estimates of category sizes to guide the group partitioning process,
and to ﬁnd the optimum values of broadcast frame size f and executed frame size f(cid:48) for
each group. If for any category, the estimate Aκ( ˆni) achieves the required reliability after κ
frames, SEM removes category Ci from the list L1,λ. Before executing each frame, SEM ﬁrst
updates the list L1,λ by removing the categories for which the required reliability has been
achieved, and then partitions them into groups. The estimation process terminates when

all categories achieve the required reliability. What we should clarify is that the proposed

Adaptive Partitioning method is a heuristic algorithm, and does not ensure to return the

optimal grouping result.

61

Group150556006107000720050556006107000720060061070007200Group2Group3means the best partitioning pointInitial Group3.5.3 Discussion about SEM

3.5.3.1 Multi-reader Estimation

Due to the limited communication range, a single RFID reader cannot cover a large area.

Thus, multiple RFID readers are frequently deployed. SEM uses one of the many existing

reader-scheduling protocols [211] to schedule which reader transmits and receives at what

time. All readers always send the same commands and relay the data they receive to a

back-end server. Thus, these readers essentially work like a logical big reader. SEM works

seamlessly in single as well as multi-reader environments.

3.5.3.2 Bit Synchronization

Katabi et al. reported in [197] that the synchronization oﬀset for commercial RFID tags

is normally no more than 1us. Recall that transmitting each bit from a tag to a reader

requires 18.8us. Hence, the 1us oﬀset is only about 5.3% of a bit duration. In other words,

the signal oﬀset does not have much negative impact on SEM. Hence, like many top level

RFID literature [225, 91], we also assume that the signals of each tag is well synchronized on

bit level.

3.6 Performance Evaluation

In this section, we conduct extensive simulations on a large scale multi-category RFID

system to evaluate the performance of SEM. We evaluate SEM in a variety of scenarios

both with and without adaptive partitioning. We implemented SEM and 5 existing state-

of-the-art RFID cardinality estimation schemes, namely Maximum Likelihood Estimator

(MLE) [101], Enhanced Zero Based estimator (EZB) [89], Uniﬁed Probabilistic Estimator

(UPE) [88], Average Run-based Tag estimation (ART) [177], and Ensemble Sampling (ES)

[207]. Following the simulation strategy used by these state-of-the-art cardinality estimation

schemes, we assume that the communication channel is error-free and a single reader covers

62

Figure 3.6 Comparing SEM+AP with SEM for balanced category sizes. Each category has

the same size of 5000 tags.

Figure 3.7 Comparing SEM+AP with SEM for unbalanced category sizes. The cardinalities

of 10 categories are exponentially distributed.

Figure 3.8 Comparing SEM+AP with SEM for unbalanced category sizes. The cardinalities

of 10 categories are linearly distributed.

all tags. Recall that the execution of SEM in multi-reader scenario is same as that in the

single reader scenario.

3.6.1 Evaluation Metrics

We evaluate SEM on two important metrics: (1) actual reliability, which is the percentage

of times the relative errors in the estimates calculated by SEM are less than α, and (2)

execution time, which is the time SEM takes to estimate the cardinalities of all tags in each

category. We run each simulation 1000 times and use the results from these 1000 simulations

to calculate the values of the performance metrics. Before evaluating these metrics, we ﬁrst

evaluate the eﬀectiveness of our adaptive partitioning strategy. In the rest of this section,

63

02000400060008000100001.11.21.31.41.51.6  SEMSEM+AP44.24.44.64.85  SEMSEM+AP44.24.44.64.85  SEMSEM+AP1.11.21.31.41.51.6  SEMSEM+AP2004006008001000120040060080010001 20040060080010001 20040060080010001 C1C2C3C4C5C6C7C8C9C10category1000 simulations(a) Balanced(b)    =5%,    =95%, (cid:127)m=0αβ(c)    =5%,    =95%, (cid:127)m=1αβ(d)    =3%,    =97%, (cid:127)m=0αβ(e)    =3%,    =97%, (cid:127)m=1αβ1000 simulations1000 simulations1000 simulationstag cardinalityexecution time (s)execution time (s)execution time (s)execution time (s)0246810  SEMSEM+AP010,00020,00030,00040,00050,00060,0000246810  SEMSEM+AP05101520253035  SEMSEM+AP05101520253035  SEMSEM+AP2004006008001000120040060080010001 20040060080010001 20040060080010001 C1C2C3C4C5C6C7C8C9C10category1000 simulations(a) Unbalanced (Exp.)(b)    =5%,    =95%, (cid:127)m=0αβ(c)    =5%,    =95%, (cid:127)m=1αβ(d)    =3%,    =97%, (cid:127)m=0αβ(e)    =3%,    =97%, (cid:127)m=1αβ1000 simulations1000 simulations1000 simulationstag cardinalityexecution time (s)execution time (s)execution time (s)execution time (s)20040060080010001024681012  SEMSEM+AP20040060080010001024681012  SEMSEM+AP2004006008001000105101520253035  SEMSEM+AP2004006008001000105101520253035  SEMSEM+APC1C2C3C4C5C6C7C8C9C10010,00020,00030,00040,00050,00060,000category1000 simulations(a) Unbalanced (Linear)(b)    =5%,    =95%, (cid:127)m=0αβ(c)    =5%,    =95%, (cid:127)m=1αβ(d)    =3%,    =97%, (cid:127)m=0αβ(e)    =3%,    =97%, (cid:127)m=1αβtag cardinalityexecution time (s)1000 simulations1000 simulations1000 simulationsexecution time (s)execution time (s)execution time (s)we use SEM+AP to denote SEM with adaptive partitioning and simply SEM to denote it

without partitioning.

3.6.2 Adaptive Partitioning

To evaluate the improvement in execution time due to adaptive partitioning, we simulate an

RFID system containing a tag population with 10 categories. We conduct simulations for two

accuracy requirements, i.e., α = 5%, β = 95% and α = 3%, β = 97% and two settings of (cid:96),

i.e., (cid:96) = 0 and (cid:96) = 1. We conduct simulations for both balanced and unbalanced categories.

3.6.2.1 Balanced Categories

In this case, each category has the cardinality of 5000 tags, as shown in Figure 3.6(a). Figures

3.6(b) through 3.6(e) show the execution times of both SEM+AP and SEM for the two

accuracy requirements and the two settings of (cid:96) from 1000 independent runs of simulations.

We observe from these ﬁgures that the execution time of SEM+AP and SEM are almost

the same. This is because the frame size calculated by SEM is appropriate for all categories.

This means that there is no need to partition the list L1,10 into multiple groups. In fact,
when SEM applies the adaptive partitioning algorithm on these 10 categories, the categories

are not divided into multiple groups; rather, they are returned in a single group only.

3.6.2.2 Unbalanced Categories

In this case, the category sizes vary from 1000 tags to 50000 tags. We pick the category

sizes from two diﬀerent distributions: exponential distribution as shown in Figure 3.7(a)

and linear distribution as shown in Figure 3.8(a). Figures 3.7(b) through 3.7(e) and 3.8(b)

through 3.8(e) show the execution times of both SEM+AP and SEM for the two accuracy

requirements and the two settings of (cid:96) from 1000 independent runs of simulations. We observe

from these ﬁgures that the execution time of SEM+AP is 50% smaller than the execution time

of SEM. For example, the execution times of SEM+AP and SEM with (cid:96) = 0, α = 5%, and

64

β = 95% are approximately 3.2s and 6.3s, respectively. We make similar observations about

SEM+AP and SEM for other settings of (cid:96), α, and β when the categories are unbalanced.

The underlying reason is that SEM+AP ﬁrst adaptively partitions an unbalanced group into

multiple balanced groups and then ﬁnds proper broadcast frame sizes for each group, which

signiﬁcantly reduces the execution time.

3.6.3 Actual Reliability

Recall that actual reliability is the percentage of times the estimates for any category Ci
lie in the range [(1 − α)ni, (1 + α)ni], where ni is the actual cardinality of category Ci. We
independently repeat each simulation scenario 1000 times and calculate the actual reliability

from those 1000 estimation results. Figure 3.9 plots the actual reliability of SEM+AP in a

balanced RFID system for the two accuracy requirements and the two settings of (cid:96) when the

cardinalities of the categories are those shown in Figure 3.6(a). We observe from these two

ﬁgures that the actual reliability achieved by SEM+AP for each category is higher than the

required reliability β.

Figures 3.10 and 3.11 plot the actual reliability of SEM+AP in the unbalanced RFID

system for the two accuracy requirements and the two settings of (cid:96) when the cardinalities of

the categories are those shown in Figures 3.7(a) and Figures 3.8(a), respectively. We observe

from these ﬁgures that SEM+AP with (cid:96) = 0 sometimes does not satisfy the required reliability.

This happens due to the premature termination, discussed in Section 3.4.4. However, with

(cid:96) = 1, the actual reliability of SEM+AP is always higher than the required reliability β in

all scenarios. This further shows that our (cid:96)-sigma method with (cid:96) = 1 is very eﬀective in

alleviating premature termination.

3.6.4 Execution Time

We evaluate the execution time of SEM+AP and present its side-by-side comparison with

the execution times of ﬁve existing estimation protocols, namely MLE, EZB, UPE, ART,

65

Figure 3.9 Actual reliability of SEM+AP for balanced categories.

Figure 3.10 Actual reliability of SEM+AP for unbalanced (exponential) categories.

Figure 3.11 Actual reliability of SEM+AP for unbalanced (linear) categories.

66

C1C2C3C4C5C6C7C8C9C10C1C2C3C4C5C6C7C8C9C10100%99%98%97%96%95%94%100%99%98%97%96%95%94%Actual ReliabilityActual ReliabilityRequired reliability(cid:127)()+m=0()+PPcategorycategoryαβ=5%,=95%αβ=3%,=97%(a)(b)(cid:127)m=1100%99%98%97%96%95%94%100%99%98%97%96%95%94%C1C2C3C4C5C6C7C8C9C10C1C2C3C4C5C6C7C8C9C10Actual ReliabilityActual ReliabilityPremature terminationPremature terminationRequired reliability(cid:127)()+(cid:127)()+PPcategorycategoryαβ=5%,=95%αβ=3%,=97%(a)(b)m=0m=1categoryC1C2C3C4C5C6C7C8C9C10categoryC1C2C3C4C5C6C7C8C9C10100%99%98%97%96%95%94%100%99%98%97%96%95%94%Actual ReliabilityActual ReliabilityPremature terminationPremature terminationαβ=5%,=95%αβ=3%,=97%(a)(b)Required reliability(cid:127)()+(cid:127)m=0()+PPm=1and ES. We use these existing estimation protocols to separately estimate the cardinality of

each category one by one, except for ES that simultaneously estimates the cardinalities of
the top-k largest categories. We set K in ES equal to the total number of categories. We
change the number of categories in tag populations from 1 to 20 and pick category sizes from

two distributions: a non-uniform distribution to generate balanced categories and a uniform

distribution to generate unbalanced categories. Next we present the execution time of SEM

and existing protocols for the balanced and unbalanced categories.

3.6.4.1 Balanced Categories

In this case, for each value of number of categories, we pick the sizes of categories from the

distribution shown in Figure 3.12(a). For example, the probability corresponding to 10000

tags is 0.25, which means that an arbitrary category has a 25% likelihood of being assigned a

cardinality of 10000 when simulating the RFID system. Since the cardinalities with non-zero

probabilities are within a relatively small range ([8000, 12000]), all categories will have similar

cardinalities, resulting in a balanced categories scenario. Figure 3.12(b) plots the normalized

average execution times of SEM+AP and existing protocols. Normalized execution time is

calculated by dividing the execution time with the number of categories. We observe from

this ﬁgure that SEM+AP is the only protocol whose average execution time per category

decreases as the number of categories increases. Furthermore, SEM+AP is signiﬁcantly faster

compared with the prior estimation protocols. For example, with 20 categories, the average

time per category of the fastest existing protocol, i.e., ART, is about 0.7 seconds, whereas

that of our SEM+AP is just about 0.10 seconds, which is nearly 7 times faster than ART.

67

Figure 3.12 Execution time of SEM+AP ((cid:96) = 1) and prior protocols for balanced categories.

α = 5%, β = 95%.

Figure 3.13 Execution time of SEM+AP ((cid:96) = 1) and prior protocols for unbalanced

categories. α = 5%, β = 95%.

3.6.4.2 Unbalanced Categories

In this case, for each value of number of categories, we pick the sizes of categories from the

distribution shown in Figure 3.13(a). Since the cardinalities with non-zero probabilities are

in a relatively wide range ([1000, 20000]), diﬀerent categories will have diﬀerent cardinalities,

resulting in an unbalanced categories scenario. Figure 3.12(b) plots the normalized average

execution times of SEM+AP and existing protocols. We make two important observations.

First, the average execution time of the existing protocols is almost the same as that

68

Time Per Category (s)Category number1510152000.050.10.150.20.250.3ProbabilityCardinalityx103(a) Cardinality Picking Prob.(b) Avg. time vs. category number00.51.01.52.02.53.0246810121416182000.050.10.150.20.250.3Probability00.51.01.52.02.53.0Category number15101520Cardinalityx10(a) Cardinality Picking Prob.(b) Avg. time vs. category number2468101214161820Time Per Category (s)3in the scenario of balanced categories. This is because the execution times of existing

protocols only depend on the required accuracy and are independent of tag population sizes

[177, 101, 89, 88, 51]. Thus, as long as the number of categories does not change, the execution

time of the existing protocols does not change. Second, our SEM+AP protocol is persistently

several times faster than all prior protocols for unbalanced categories as long as the number

of categories is greater than 2.

3.7 Conclusion

In this paper, we make the following three key contributions. First, we formally deﬁned the

practically important problem of multi-category RFID estimation and proposed an Single-one

Manchester coding-based approach called SEM. Our SEM approach could decode multiple

bit vectors from a single physical frame, thereby achieving simultaneous estimation over

multiple categories. Second, we propose the optimization technique of adaptive partitioning

called AP to address the issue that category sizes may have large variances. The key idea

is to group categories of similar sizes together and execute our SEM approach for each

group separately. Third, we conducted extensive simulations to evaluate the proposed

approaches. The simulation results show that our optimized SEM+AP approach can satisfy

the predeﬁned estimation accuracy while signiﬁcantly outperforming all prior schemes, in

terms of execution time. Moreover, we ﬁnd that our SEM is the only approach whose

normalized estimation time decreases as the number of categories increases. Many excellent

estimation protocols dedicated to single-set estimation can be built on our SEM+AP to

achieve fast and simultaneous estimation in multi-category RFID systems.

69

CHAPTER 4

FUTURE WORK

In this section, I provide an overview of the planned future work.

4.1 Missing Tag Detection and Identiﬁcation

After brieﬂy explain of my RFID work there is another question come out. Even RTSP

can detect and identify some speciﬁc tags we want to know but RTSP cannot tell anything

more than the ID of tag. Since previously we haven’t make any change of the entire RFID

systems which means we can not do some operations. However, as the RFID tags are designed

to be more powerful we can do more work about that.

4.2 Motivation and Problem Statement

RFID systems with active tags are widely used in various application, such as indoor

localization, object tracking , work-in-process tracking, supply chain management [143], [141],

[142] etc. A typical RFID system consists of three elements: a large number of tags, RFID

readers and a back-end data server. RFID tags are labeled in designated objects where each

tag has a small size of memory to store its unique ID and some other information (e.g.,

product price, expiry date, personal information, etc). A reader has a dedicated power source

with signiﬁcant computing power. It transmits a command to query a set of tags, and the tags

respond over a shared wireless medium. A data server which stores all tags information such

as tag ID, brands or values of tagged items will do some computing based on the information

from reader.

An interesting application of RFID is to detect and identify missing items in a large

storage [126]. Consider a warehouse that stores a large number of commercial products. Some

of the items might be expensive and some of the items might be inexpensive. Now assume

a scenario where some of the items are stolen from the warehouse. In such situations, it is

70

important to not only identify the stolen items but also detect these events in timely fashion.

However, most of the existing missing tag detection protocols spend a lot time to detect such

events. These protocols also do not consider the value of the missing items and treat all

the tags as having same value, [99], [181], [216], [61], [151]. However, this is untrue in the

real world, RFID missing tag detection protocol is expected to be unf air. The reason is

that the missing tag event of expensive products should be detected successfully with a high

probability. As in this scenario, managers may want to detect and identify the expensive

tags with a higher accuracy like 99.99%. But for other inexpensive ones, they may just want

to detect with the accuracy above 90%.

This project will address this fundamental problem of achieving unfairness in RFID missing

tag detection while minimizing identiﬁcation time. Speciﬁcally, given a tag population of

known size, with IDs and values of known distribution in the data server, and a required

detection probability β, design a missing tag detection and identiﬁcation protocol that

minimizes the execution time under the constraint that the accuracy of detecting and

identifying the expensive missing tag event is no less than β. The protocol should be

compliant with the prevalent EPCGlobal Class 1 Generation 2 (C1G2) RFID standard.

4.3 Limitations of Prior Art

To the best of our knowledge, no prior work has been targeted on developing a unfair

RFID missing tag detection protocol that can achieve any required accuracy. Existing RFID

missing tag detection protocols mostly focus on minimizing detection time and energy [126],

thus are not suitable to the real world because in these protocols, all non-singleton tag has

to transmit equal number of times on average. In contrast, our objective is to minimize the

detection time with total value of missing items constraint.

There are two types of missing tag detection and identiﬁcation protocols: probabilistic

[126], [186], [123] and deterministic [112], [99], [181], [216]. The probabilistic protocols are

faster but only report the event that some tags are missing, without pinpointing exactly

71

which ones. The deterministic protocols return IDs of all missing tags but are comparatively

slower. Both protocols are have their merits, and they are complementary to each other. For

example, a probabilistic detection protocol should be used to detect a missing tag event and

once detected positive, a deterministic protocol should be invoked to identify all the missing

tags. There are two key limitations of existing protocols. The ﬁrst limitation is that all

existing protocols don’t consider the value of items attached with tags, which is not a realistic

assumption. Here is an example, suppose in a shopping mall which uses RFID readers to

monitor only expensive merchandize, the reader receive response from tags of inexpensive

merchandize as well. Existing protocols can not handle the presence of inexpensive tags

because they pick up unexpected slot in Aloha frames resulting in unexpected false positive.

The second key limitation of the missing tags detection and identiﬁcation protocols except

TRP, none of them is compliant with the EPCGloable Class 1 Generation 2 (C1G2) RFID

standard. Most of those protocols require the manufactures to put random bit vectors in tags

that tags use to calculate specialized hash functions. They also require the tags to be able to

receive and interpret "pre-vector" and/or "post-vector" frames to select slots in frames. Such

functionalities are not provisioned in the C1G2 standard. A protocol which is not compliant

with the C1G2 RFID standard will lead to diﬃculty in deploying RFID system in diﬀerent

scenarios.

4.4 Solution Directions

For the problem of RFID missing tag detection with diﬀerent values, there are three

seemingly straightforward solutions based on previous work. The ﬁrst solution is to execute

a tag collection protocol repeatedly to collect IDs of all tags compare with IDs stored in the

data server to detect and identify the missing tags. This method works; however, the long

execution time is not appropriate for some scenarios. For example, in the airport luggage

check out, we want to detect the missing tags as soon as possible.

The second solution is to ﬁrst execute missing tag detection protocol on expensive tags

72

and then run the protocol on inexpensive tags. This can be achieved by using bloom ﬁlter

to deactivate the inexpensive tags when executing the protocol with a higher accuracy

requirement. After that, the inexpensive tags can be activated and expensive tags will not

participate, running the same protocol again with lower accuracy requirement to minimize

the total execution time. This method pays much attention to the expensive tags however

it’s not compliant with C1G2 RFID standard.

The third solution is to run our searching protocol RTSP. Since the data server has

IDs of all tags which are supposed to be in the tag population. After running RTSP, we

will know tags present in the population, therefore, the missing tag can be detected and

identiﬁed. However, RTSP only can detect and identify the missing tags. It doesn’t show

any information about the value of items attached with tags.

In this project, we will develop a new protocol based on a sampled method to detect and

identify the expensive tags with a higher accuracy requirement which can be assigned before

the execution.

The missing tag detection and identiﬁcation problem can be deﬁned as: Given a tag set

X with known IDs, a threshold value λ, a required accuracy α. Let A denote the event that

reader reports a missing tag alert after running our detection protocol. m missing tags with
values {x1, x2, ..., xm} and {x1 ≥ x2 ≥ ... ≥ xm}. Neither m or {xi, i = 1..m} is known. Our
detection protocol should meet the following probability:

P{A| m(cid:88)

≥ λ} ≥ α

Given a tag set with known tag IDs, a required conﬁdence interval β, a required accuracy

i=1

α and a threshold value λ. m missing tags x1, x2, ..., xm are unknown. If the total value
of missing tags are larger or equal to λ, a missing tag detection and identiﬁcation protocol

outputs (cid:101)y1,(cid:101)y2, ..., (cid:101)yk, ..,(cid:101)ys so that top-k expensive tags can be identiﬁed with the probability
(cid:80)m
i=1 xi−λ

greater or equals to α1, α2, ..., αk and the missing tag event can be detected with the probability
P{

≤ β} ≥ α.

λ

73

In our protocol, before each run, the data server will calculate a hash function to maximize

the number of expensive tags who will respond in the coming frame. Meanwhile, this hash

function will try to minimize the number of inexpensive tags who will participate in the

frame. After the reader broadcast all the parameters, the data server will compare the frame

collected by the reader with the logical frame it computes and deactivate those tags which are

present and identify missing tags through empty slots which are supposed to be nonempty

based on its computation. After several runs, all the missing tags can be detected.

74

BIBLIOGRAPHY

75

BIBLIOGRAPHY

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

Common Vulnerabilities and Exposures, http://cve.mitre.org/.

Community Emergency Response Teams, http://www.cert.org/.

Forum for Incident Response and Security Teams, http://www.ﬁrst.org/cvss.

http://www.centreforaviation.com/news/share-market/2010/06/17/hong-kong-
airport-sets-new-cargo-traﬃc-record-fedex-sees-surging-asian-exports/page1.

http://www.impinj.com/resources/about-rﬁd/.

http://www.informationweek.com/rsa-unveils-rﬁd-tag-blocker/d/d-id/1023433?

http://www.nordicsemi.com/eng/Products/2.4GHz-RF/nRF24LE1.

http://xACMlpdp.sourceforge.net/.

IBM Internet Security Systems, http://www.iss.net/.

[10] Managing more than 50,000 inbound freight containers and 30,000 outbound trailers
annually is a logistical nightmare. but nyk logistics has found a truckload of savings by
using an rﬁd yard-management system.

[11] National Vulnerability Database, http://nvd.nist.gov/.

[12] Organization for the Advancement of Structured Information Standards (OASIS),

http://www.oasis-open.org.

[13] Secunia, http://secunia.com/.

[14] Security Focus, http://www.securityfocus.com/.

[15] Sun’s XACML Implementation, http://sunxACMl.sourceforge.net.

[16] The Open Source Vulnerability Database, http://osvdb.org/.

[17] Vupen Security, http://www.vupen.com/english/.

[18] www.rﬁdsupplychain.com.

[19] University

of

California,

http://webs.cs.berkeley.edu/tos/hardware
/design/ORCAD_FILES/MICA2/6310-0306-01ACLEAN.pdf, March 2003.

Berkeley.

Mica2

schematics.

[20] Rﬁd insider roadmap. http://blog.atlasrfidstore.com/learn-rfid/, 2017.

[21] Rﬁd tags and readers store. https://www.atlasrfidstore.com/, 2017.

76

[22] Standard ISO 18000-6.

Information technology automatic identiﬁcation and data
capture techniques-radio frequency identiﬁcation for item management air interface.
Part 6. Parameters for Air interface communications at 860-960 MHZ, 2003.

[23] Martin Abadi and Leslie Lamport. Composing speciﬁcations. ACM Transactions on

Programming Languages and Systems, 15(1):73–132, 1993.

[24] Milton Abramowitz and Irene A. Stegun, editors. Handbook of mathematical functions

with formulas, graphs, and mathematical tables. Dover Publications, 1964.

[25] Rakesh Agrawal and Ramakrishnan Srikant. Fast algorithms for mining association
rules. In Proceedings of of 20th International Conference of on Very Large Data Bases,
pages 487–499, 1994.

[26] Omar H. Alhazmi and Yashwant K. Malaiya. Quantitative vulnerability assessment of
systems software. In Proceedings of Annual Reliability and Maintainability Symposium,
pages 615–620, 2005.

[27] Omar H. Alhazmi and Yashwant K. Malaiya. Prediction capabilities of vulnerability
discovery models. In Proceedings of Annual Reliability and Maintainability Symposium,
pages 86–91, 2006.

[28] Ross Anderson. Why information security is hard – an economic perspective.

In
Proceedings of 17th Annual Computer Security Applications Conference of , pages
358–365, 2001.

[29] Ross Anderson. Security in open versus closed systems – the dance of boltzmann, coase
and moore. In Proceedings of Open Source Software: Economics, Law, and Policy
Confocuce, June 2002.

[30] Ashish Arora, Chris Forman, Anand Nandkumar, and Rahul Telang. Competition and
patching of security vulnerabilities: An empirical analysis. Information Economics and
Policy, 22(2):164–177, 2010.

[31] Ashish Arora, Ramayya Krishnan, Rahul Telang, and Yubao Yang. An empirical
analysis of software vendors?patch release behavior: Impact of vulnerability disclosure.
Information Systems Research, 21(1):115–132, 2010.

[32] Ashish Arora, Anand Nandkumar, and Rahul Telang. Does information security attack
frequency increase with vulnerability disclosure? an empirical analysis. Information
Systems Frontiers, 8(8):350–362, 2006.

[33] E. Aydin, R. Oktem, Z. Dincer, and I. K. Akbulut. Study of an rﬁd based moving

object tracking system. In 2007 1st Annual RFID Eurasia, pages 1–5, Sept 2007.

[34] Michael Backes, Thomas R. Gross, and Guenter Karjoth. Tag identiﬁcation system,

2008.

77

[35] Elisa Bertino, Sushil Jajodia, and Pierangela Samarati. A ﬂexible authorization
mechanism for relational data management systems. ACM Transactions on Information
Systems, 17(2):101–140, 1999.

[36] N. Bhandari, A. Sahoo, and S. Iyer. Intelligent Query Tree (IQT) Protocol to Improve

RFID Tag Read Eﬃciency. Proc. of IEEE ICIT, 2006.

[37] B. A. Bjerke, J. G. Proakis, K. Y. M. Lee, and Z. Zvonar. A comparison of gsm receivers
for fading multipath channels with adjacent and co-channel interference. IEEE Journal
on Selected Areas in Communications, 18(11):2211–2219, Nov 2000.

[38] Piero Bonatti, Sabrina De Capitani di Vimercati, and Pierangela Samarati. An algebra
for composing access control policies. ACM Transactions on Information and System
Security (TISSEC), 5(1):1–35, 2002.

[39] Charles Bordenave, David McDonald, and Alexandre Proutiere. Performance of random
medium access control, an asymptotic approach. In Proceedings of ACM SIGMETRICS,
2008.

[40] M. Bouet and A. L. dos Santos. Rﬁd tags: Positioning principles and localization

techniques. In 2008 1st IFIP Wireless Days, pages 1–5, Nov 2008.

[41] Mehran Bozorgi, Lawrence K. Saul, Stefan Savage, and Geoﬀrey M. Voelker. Beyond
heuristics: Learning to classify vulnerabilities and predict exploits. In Proceedings of
of 16th International Conference of on Knowledge discovery and data mining, pages
105–114, 2010.

[42] Richard P. Brent. Algorithms for minimization without derivatives. Prentice-Hall, 2002.

[43] Hilary K. Browne, William A. Arbaugh, John McHugh, and William L. Fithen. A
trend analysis of exploitations. In Proceedings of IEEE Symposium on Security and
Privacy, pages 214–231, 2001.

[44] Michael Buettner, Richa Prasad, Alanson Sample, Daniel Yeager, Ben Greenstein,
Joshua R. Smith, and David Wetherall. RFID sensor networks with the Intel WISP. In
Proceedings of 6th ACM Conference of on Embedded Networked Sensor Systems, pages
393–394, 2008.

[45] Miguel Bustillo. Wal-mart radio tags to track clothing. The Wall Street Journal, 2010.

[46] Y. Liu C. Qian, H. Ngan and L. Ni. Cardinality estimation for large-scale rﬁd systems.

IEEE Transactions on Parallel and Distributed Systems, 22(9):1441–1454, 2011.

[47] John I. Capetanakis. Tree algorithms for packet broadcast channels. IEEE Transactions

on Information Theory, 25:505–515, 1979.

[48] Bogdan Carbunar, Murali Krishna Ramanathan, Mehmet Koyuturk, Christoph Hoﬀ-
mann, and Ananth Grama. Redundant reader elimination in RFID systems.
In
Proceedings of IEEE Communications Society Conference of on SECON, pages 576–580,
2005.

78

[49] Auto ID Center. Draft Protocol Speciﬁcation for a 900 MHz Class 0 Radio Frequency

Identiﬁcation Tag, 2–10 2007.

[50] Jae-Ryong Cha and Jae-Ryong Cha. Dynamic framed slotted aloha algorithms using
fast tag estimation method for RFID system. In Proceedings of 3rd IEEE Consumer
Communications and Networking Conference of , 2006.

[51] Binbin Chen, Ziling Zhou, and Haifeng Yu. Understanding RFID Counting Protocols.

Proc. of ACM MobiCom, 2013.

[52] Min Chen and Shigang Chen. ETAP: Enable Lightweight Anonymous RFID Authenti-

cation with O(1) Overhead. Proc. of IEEE ICNP, 2015.

[53] Min Chen and Shigang Chen. Identifying State-free Networked Tags. Proc. of IEEE

ICNP, 2015.

[54] Min Chen, Wen Luo, Zhen Mo, Shigang Chen, and Yuguang Fang. An eﬃcient tag
search protocol in large-scale RFID systems. In Proceedings of IEEE INFOCOM, 2013.

[55] Min Chen, Wen Luo, Zhen Mo, Shigang Chen, and Yuguang Fang. An Eﬃcient Tag
IEEE/ACM

Search Protocol in Large-Scale RFID Systems With Noisy Channel.
Transactions on Networking, 24(2):703–716, 2016.

[56] Yuan-Hsin Chen, Shi-Jinn Horng, Ray-Shine Run, Jui-Lin Lai, Rong-Jian Chen, Wei-
Chih Chen, Yi Pan, and Terano Takao. A Novel Anti-Collision Algorithm in RFID
Systems for Identifying Passive Tags. IEEE Transactions on Industrial Informatics,
6(1):105–121, 2010.

[57] Sandy Clark, Stefan Frei, Matt Blaze, and Jonathan Smith. Familiarity breeds contempt:
The honeymoon eﬀect and the role of legacy code in zero-day vulnerabilities.
In
Proceedings of 26th International Annual Computer Security Applications Conference
of , pages 251–260, 2010.

[58] Robert Dorfman. The detection of defective members of large populations. Annals of

Mathematical Statistics, 14:436–440, 1943.

[59] EPC EPCglobal. Radio-frequency identity protocols class-1 generation-2 uhf rﬁd
protocol for communications at 860 mhz–960 mhz version 1.0. 9. K. Chiew et al./On
False Authenticationsfor C1G2 Passive RFID Tags, 65, 2004.

[60] Klaus Finkenzeller. RFID Handbook: Fundamentals and Applications in Contactless
Smart Cards, Radio Frequency Identiﬁcation and Near-Field Communication. Wiley,
2010.

[61] B. Firner, P. Jadhav, Y. Zhang, R. Howard, W. Trappe, and E. Fenson. Towards
continuous asset tracking: Low-power communication and fail-safe presence assurance.
In Sensor, Mesh and Ad Hoc Communications and Networks, 2009. SECON ’09. 6th
Annual IEEE Communications Society Conference on, pages 1–9, June 2009.

79

[62] Philippe Flajolet and G. Nigel Martin. Probabilistic counting algorithms for data base

applications. Journal of Computer and System Sciences, 31(2):182–209, 1985.

[63] Robert C Francis, James P McGee, Robert A Sainati, Richard L Sheehan Jr, and Sai-
Kit K Tong. Object tracking and management system and method using radio-frequency
identiﬁcation tags, July 29 2003. US Patent 6,600,418.

[64] Stefan Frei, Thomas Duebendorfer, Gunter Ollmann, and Martin May. Understanding
the web browser threat: Examination of vulnerable online web browser populations
and the “insecurity iceberg”. Technical report, Eidgen
"ossische Technische Hochschule Z
"urich, 2008.

[65] Stefan Frei, Martin May, Ulrich Fiedler, and Bernhard Plattner. Large-scale vulnerability
analysis. In Proceedings of 2006 SIGCOMM workshop on Large-Scale Attack Defense,
pages 131–138, September 2006.

[66] Stefan Frei, Bernhard Tellenbach, and Bernhard Plattner. 0-day patch exposing vendors
(in) security performance. In Proceedings of Black Hat Technical Security Conference
of , volume 14, 2009.

[67] Karsten Fyhn, , Rasmus Melchior Jacobsen, Petar Popovski, and Torben Larsen. Fast
capture – recapture approach for mitigating the problem of missing rﬁd tags. IEEE
Transactions on Mobile Computing, 11(3):518–528, 2012.

[68] Xingxin Gao, Zhe Xiang, Hao Wang, Jun Shen, Jian Huang, and Song Song. An
approach to security and privacy of RFID system for supply chain. Proc. of IEEE
CEC-East, 2004.

[69] Joaquin Garcia-Alfaro, Michel Barbeau, and Evangelos Kranakis. Analysis of threats to
the security of EPC networks. In Proceedings of 6th Annual Communication Networks
and Services Research Conference of , pages 67–74, 2008.

[70] Sylvie Ghez, Sergio Verdu, and Stuart C. Schwartz. Stability properties of slotted
aloha with multipacket reception capability. IEEE Transactions on Automatic Control,
33:640–649, 1988.

[71] Omprakash Gnawali, Rodrigo Fonseca, Kyle Jamieson, and Philip Levis. CTP: Robust
and eﬃcient collection through control and data plane integration. Technical report,
Univ. of Southern California, UC Berkeley, MIT CSAIL, Stanford University, 2008.

[72] Simon Godik and Tim Moses. eXtensible Access Control Markup Language (XACML)

Version 2. Standard, OASIS. February, 2005.

[73] Wei Gong, Kebin Liu, Xin Miao, and Haoxiang Liu. Arbitrarily Accurate Approximation
Scheme for Large-Scale RFID Cardinality Estimation. Proc. of IEEE INFOCOM, 2014.

[74] Wei Gong, Kebin Liu, Xin Miao, Qiang Ma, Zheng Yang, and Yunhao Liu. Informative
Counting: Fine-grained Batch Authentication for Large-Scale RFID Systems. Proc. of
ACM Mobihoc, 2013.

80

[75] Wei Gong, Kebin Liu, Xin Miao, Qiang Ma, Zheng Yang, and Yunhao Liu. Informative
Counting: Fine-grained Batch Authentication for Large-Scale RFID Systems. Proc. of
ACM MobiHoc, 2013.

[76] Mohamed G. Gouda and Xiang-Yang Alex Liu. Firewall design: Consistency, com-
pleteness, and compactness. In Proceedings of 24th International Conference of on
Distributed Computing Systems, pages 320 – 327, 2004.

[77] Dirk Hahnel, Wolfram Burgard, Dieter Fox, Ken Fishkin, and Matthai Philipose.

Mapping and localization with RFID technology. Proc. of IEEE ICRA, 2004.

[78] H. Han, B. Sheng, C. C. Tan, Q. Li, W. Mao, and S. Lu. Counting RFID Tags Eﬃciently

and Anonymously. Proc. of IEEE INFOCOM, 2010.

[79] Hao Han, Bo Sheng, Chiu C. Tan, Qun Li, Weizhen Mao, and Sanglu Lu. Counting

RFID tags eﬃciently and anonymously. In Proceedings of IEEE INFOCOM, 2010.

[80] Peter Adam Hoeher, Sabah Badri-Hoeher, Wen Xu, and Claudiu Krakowski. Single-
antenna co-channel interference cancellation for tdma cellular radio systems. IEEE
Wireless Communications, 12(2):30–37, 2005.

[81] Don R. Hush and Cliﬀ Wood. Analysis of tree algorithms for RFID arbitration. In

Proceedings of IEEE International Symposium on Information Theory, 1998.

[82] EPCGlobal Inc. Radio-Frequency Identity Protocols Class-1 Generation-2 UHF RFID
Protocol for Communications at 860 MHz–960 MHz. EPCGlobal Inc, 1.2.0 edition,
2008.

[83] Rasmus Jacobsen, Karsten Fyhn Nielsen, Petar Popovski, and Torben Larsen. Reliable
identiﬁcation of rﬁd tags using multiple independent reader sessions. In Proceedings of
IEEE International Conference of on RFID, pages 64–71, 2009.

[84] Rajendra K. Jain, Dah-Ming W. Chiu, and William R. Hawe. A quantitative measure
of fairness and discrimination for resource allocation in shared computer systems.
Technical report, Digital Equipment Corporation, 1984.

[85] Sushil Jajodia, Pierangela Samarati, V. S. Subrahmanian, and Eliza Bertino. A uniﬁed
framework for enforcing multiple access control policies.
In Proceedings of ACM
SIGMOD International Conference of on Management of data, pages 474–485, 1997.

[86] Ari Juels, Ronald L Rivest, and Michael Szydlo. The Blocker Tag: Selective Blocking

of RFID Tags for Consumer Privacy. Proc. of ACM CCS, 2003.

[87] M Ayoub Khan, Manoj Sharma, and Ha Prabhu R. A survey of rﬁd tags.

[88] Murali Kodialam and Thyaga Nandagopal. Fast and reliable estimation schemes in
RFID systems. In Proceedings of 12th International Conference of on Mobile Computing
and Networking, pages 322–333, 2006.

81

[89] Murali Kodialam, Thyaga Nandagopal, and Wing Cheong Lau. Anonymous Tracking

using RFID tags. Proc. of IEEE INFOCOM, 2007.

[90] Murali Kodialam, Thyaga Nandagopal, and Wing Cheong Lau. Anonymous tracking

using RFID tags. In Proceedings of IEEE INFOCOM, 2007.

[91] Linghe Kong, Liang He, Yu Gu, Min-You Wu, and Tian He. A Parallel Identiﬁcation

Protocol for RFID Systems. Proc. of IEEE INFOCOM, 2014.

[92] Yuan-Cheng Lai, Ling-Yen Hsiao, Hong-Jie Chen, Ching-Neng Lai, and Jian-Wei Lin.
A Novel Query Tree Protocol with Bit Tracking in RFID Tag Identiﬁcation. IEEE
Transactions on Mobile Computing, 12(10):2063–2075, 2013.

[93] Ching Law, Kayi Lee, and Kai-Yeung Siu. Eﬃcient memoryless protocol for tag
identiﬁcation. In Proceedings of 4th International Workshop on Discrete Algorithms
and Methods for Mobile Computing and Communications, 2000.

[94] Chun Hee Lee and Chin-Wan Chung. Eﬃcient storage scheme and query processing
for supply chain management using RFID. In Proceedings of ACM Conference of on
Management of data, pages 291–302, 2008.

[95] S. Lee, S. Joo, and C. Lee. An Enhanced Dynamic Framed Slotted ALOHA Algorithm

for RFID Tag Identiﬁcation. Proc. of IEEE MobiQuitous, 2005.

[96] Su-Ryun Lee, Sung-Don Joo, and Chae-Woo Lee. An enhanced dynamic framed
slotted aloha algorithm for rﬁd tag identiﬁcation. In The Second Annual International
Conference on Mobile and Ubiquitous Systems: Networking and Services, pages 166–172,
July 2005.

[97] Su Ryun Lee, Sung Don Joo, and Chae Woo Lee. An enhanced dynamic framed
slotted ALOHA algorithm for RFID tag identiﬁcation. In Proceedings of 2nd Annual
International Conference of on Mobile and Ubiquitous Systems: Networking and Services,
pages 166–172, 2005.

[98] T. Li, S. Chen, and Y. Ling. Eﬃcient Protocols for Identifying the Missing Tags in a
Large RFID System. IEEE/ACM Transactions on Networking, 21(6):1974–1987, 2013.

[99] Tao Li, Shigang Chen, and Yibei Ling. Identifying the missing tags in a large rﬁd
system. In Proceedings of the Eleventh ACM International Symposium on Mobile Ad
Hoc Networking and Computing, MobiHoc ’10, pages 1–10, New York, NY, USA, 2010.
ACM.

[100] Tao Li, Shigang Chen, and Yibei Ling. Identifying the Missing Tags in a Large RFID

System. Proc. of ACM MobiHoc, 2010.

[101] Tao Li, Samuel Wu, Shigang Chen, and Mark Yang. Energy eﬃcient algorithms for the

RFID estimation problem. In Proceedings of IEEE INFOCOM, 2010.

82

[102] Tao Li, Samuel Wu, Shigang Chen, and Mark Yang. Generalized Energy-Eﬃcient
Algorithms for the RFID Estimation Problem. IEEE/ACM Transactions on Networking,
20(6):1978–1990, 2012.

[103] Chieh-Jan Mike Liang, Jie Liu, Liqian Luo, Andreas Terzis, and Feng Zhao. RACNet:
A high-ﬁdelity data center sensing network. In Proceedings of 7th ACM Conference of
on Embedded Networked Sensor Systems, pages 15–28, 2009.

[104] Convergence Systems Limited. http://www.csl-rfid.com/.

[105] Alex X. Liu, Fei Chen, JeeHyun Hwang, and Tao Xie. XEngine: A Fast and Scalable
XACML Policy Evaluation Engine. In Proceedings of of International Conference of on
Measurements and Modeling of Computer Systems SIGMETRICS, pages 265–276, 2008.

[106] Alex X. Liu, Fei Chen, and JeeHyun Hwang Tao Xie. XEngine: A Fast and Scalable
XACML Policy Evaluation Engine. In Proceedings of of International Conference of on
Measurements and Modeling of Computer Systems SIGMETRICS, pages 265–276, 2008.

[107] Alex X. Liu and Mohamed G. Gouda. Diverse ﬁrewall design. IEEE Transactions on

Parallel and Distributed Systems, 19(9):1237 – 1251, September 2007.

[108] Haoxiang Liu, Wei Gong, Lei Chen, Wenbo He, Kebin Liu, and Yunhao Liu. Generic

Composite Counting in RFID Systems. Proc. of IEEE ICDCS, 2014.

[109] J. Liu, B. Xiao, K. Bu, and L. Chen. Eﬃcient Distributed Query Processing in Large

RFID-enabled Supply Chains. Proc. of IEEE INFOCOM, 2014.

[110] Jia Liu, Bin Xiao, Shigang Chen, Feng Zhu, and Lijun Chen. Fast RFID grouping

protocols. Proc. of IEEE INFOCOM, 2015.

[111] Tianci Liu, Lei Yang, Qiongzheng Lin, and Yunhao Liu. Anchor-free Backscatter

Positioning for RFID Tags with High Accuracy. Proc. of IEEE INFOCOM, 2014.

[112] X. Liu, K. Li, G. Min, Y. Shen, A. X. Liu, and W. Qu. Completely pinpointing the
missing rﬁd tags in a time-eﬃcient way. IEEE Transactions on Computers, 64(1):87–96,
Jan 2015.

[113] Xiulong Liu, Keqiu Li, Geyong Min, Kai Lin, Bin Xiao, Yanming Shen, and Wenyu Qu.
Eﬃcient Unknown Tag Identiﬁcation Protocols in Large-Scale RFID Systems. IEEE
Transactions on Parallel and Distributed Systems, 25(12):3145–3155, 2014.

[114] Xiulong Liu, Keqiu Li, Geyong Min, Yanming Shen, Alex X. Liu, and Wenyu Qu. A
Multiple Hashing Approach to Complete Identiﬁcation of Missing RFID Tags. IEEE
Transactions on Communications, 62(3):1046–1057, 2014.

[115] Xiulong Liu, Keqiu Li, Heng Qi, Bin Xiao, and Xin Xie. Fast Counting the Key Tags

in Anonymous RFID Systems. Proc. of IEEE ICNP, 2014.

[116] Xiulong Liu, Keqiu Li, Heng Qi, Bin Xiao, and Xin Xie. Fast counting the key tags in

anonymous RFID systems. In Proceedings of IEEE ICNP, 2014.

83

[117] Xiulong Liu, Keqiu Li, Heng Qi, Bin Xiao, and Xin Xie. Fast Counting the Key Tags

in Anonymous RFID Systems. Proc. of IEEE ICNP, 2014.

[118] Xiulong Liu, Bin Xiao, Keqiu Li, Alex X. Liu, Jie Wu, Xin Xie, and Heng Qi. RFID
Estimation with Blocker Tags. IEEE/ACM Transactions on Networking, in press, 2016.

[119] Xiulong Liu, Bin Xiao, Keqiu Li, Jie Wu, Alex X. Liu, Heng Qi, and Xin Xie. RFID

Cardinality Estimation with Blocker Tags. Proc. of IEEE INFOCOM, 2015.

[120] Xiulong Liu, Bin Xiao, Keqiu Li, Jie Wu, Alex X. Liu, Heng Qi, and Xin Xie. RFID

Cardinality Estimation with Blocker Tags. Proc. of IEEE INFOCOM, 2015.

[121] Xiulong Liu, Xin Xie, Keqiu Li, Bin Xiao, Jie Wu, Heng Qi, and Dawei Lu. Fast Tracking
the Population of Key Tags in Large-scale Anonymous RFID Systems. IEEE/ACM
Transactions on Networking, in press, 2016.

[122] Xuan Liu, Shigeng Zhang, Bin Xiao, and Kai Bu. Flexible and Time-Eﬃcient Tag
Scanning with Handheld Readers. IEEE Transactions on Mobile Computing, pp(99):1–1,
2015.

[123] W. Luo, S. Chen, T. Li, and S. Chen. Eﬃcient missing tag detection in rﬁd systems.

In INFOCOM, 2011 Proceedings IEEE, pages 356–360, April 2011.

[124] W. Luo, S. Chen, T. Li, and S. Chen. Eﬃcient Missing Tag Detection in RFID Systems.

Proc. of IEEE INFOCOM, 2011.

[125] W. Luo, S Chen, T. Li, and Y. Qiao. Probabilistic Missing-tag Detection and Energy-

Time Tradeoﬀ in Large-scale RFID Systems. Proc. of ACM MobiHoc, 2012.

[126] Wen Luo, Shigang Chen, Tao Li, and Yan Qiao. Probabilistic missing-tag detection and
energy-time tradeoﬀ in large-scale rﬁd systems. In Proceedings of the Thirteenth ACM
International Symposium on Mobile Ad Hoc Networking and Computing, MobiHoc ’12,
pages 95–104, New York, NY, USA, 2012. ACM.

[127] Wen Luo, Yan Qiao, and Shigang Chen. An Eﬃcient Protocol for RFID Multigroup

Threshold-based Classiﬁcation. Proc. of IEEE INFOCOM, 2013.

[128] Wen Luo, Yan Qiao, Shigang Chen, and Min Chen. An Eﬃcient Protocol for RFID
Multigroup Threshold-Based Classiﬁcation Based on Sampling and Logical Bitmap.
IEEE/ACM Transactions on Networking, 24(1):397–407, 2016.

[129] Michael R. Lyu. Handbook of Software Reliability Engineering. McGraw-Hill, Inc.

Hightstown, NJ, USA, 1996.

[130] Chandan Maity, Ashutosh Gupta, and Mahua Maity. Timing analysis of passive UHF
RFID-EPC c1g2 system in dynamic frame. Contemporary Computing, pages 216–227,
2009.

[131] J. Maneesilp, C. Wang, H. Wu, and N. F. Tzeng. Rﬁd support for accurate 3d

localization. IEEE Transactions on Computers, 62(7):1447–1459, July 2013.

84

[132] Gaia Maselli, Chiara Petrioli, and Claudio Vicari. Dynamic tag estimation for optimizing
tree slotted aloha in RFID networks. In Proceedings of 11th International Symposium
on Modeling, Analysis and Simulation of Wireless and Mobile Systems, pages 315–322,
2008.

[133] Peter Mell, Karen Scarfone, and Sasha Romanosky. A Complete Guide to the Com-
mon Vulnerability Scoring System Version 2.0. National Institute of Standards and
Technology, June 2007.

[134] Katina Michael and Luke McCathie. The pros and cons of RFID in supply chain

management. Proc. of IEEE ICMB, 2005.

[135] Jihoon Myung and Wonjun Lee. An adaptive memoryless tag anti-collision protocol for

RFID netwroks. In Proceedings of IEEE INFOCOM, 2005.

[136] Jihoon Myung and Wonjun Lee. Adaptive splitting protocols for RFID tag collision
arbitration. In Proceedings of 7th ACM International Symposium on Mobile Ad Hoc
Networking and Computing, pages 202–213, 2006.

[137] Jihoon Myung and Wonjun Lee. Adaptive Splitting Protocols for RFID Tag Collision

Arbitration. Proc. of ACM MobiHoc, 2006.

[138] V. Namboodiri and L. Gao. Energy-Aware Tag Anti-Collision Protocols for RFID

Systems. Proc. of IEEE PerCom, 2007.

[139] Vinod Namboodiri and Lixin Gao. Energy-aware tag anticollision protocols for RFID
systems. In Proceedings of 5th IEEE International Conference of on Pervasive Com-
puting and Communications, pages 23–36, 2007.

[140] Badri Nath, Franklin Reynolds, and Roy Want. RFID technology and applications.

Proceedings of IEEE Pervasive Computing, 5:22–24, 2006.

[141] Aditya Nemmaluri, Mark D. Corner, and Prashant Shenoy. Sherlock: Automatically
locating objects for humans. In Proceedings of International Conference of on Mobile
Systems, Applications, and Services, 2008.

[142] T. J. Ng, M. M. Wong, J. B. Zhang, and O. P. Gan. Rﬁd for mro work in progress
tracking. In IEEE Industrial Electronics, IECON 2006 - 32nd Annual Conference on,
pages 4779–4784, Nov 2006.

[143] L. M. Ni, Yunhao Liu, Yiu Cho Lau, and A. P. Patil. Landmarc: indoor location sensing
using active rﬁd. In Pervasive Computing and Communications, 2003. (PerCom 2003).
Proceedings of the First IEEE International Conference on, pages 407–415, March 2003.

[144] Lionel M. Ni, Yunhao Liu, Yiu Cho Lau, and Abhishek P. Patil. Landmarc: Indoor

location sensing using active RFID. Wireless networks, 10:701–710, 2004.

[145] Qun Ni, Elisa Bertino, and Jorge Lobo. D-Algebra for Composing Access Control Policy
Decisions. In Proceedings of of 4th International Symp. on Information, Computer,
and Communications Security, pages 298–309, 2009.

85

[146] Qun Ni, Elisa Bertino, and Jorge Lobo. D-Algebra for Composing Access Control Policy
Decisions. In Proceedings of of 4th International Symp. on Information, Computer,
and Communications Security, pages 298–309, 2009.

[147] Jeongyeup Paek and Ramesh Govindan. RCRT: Rate-controlled reliable transport
for wireless sensor networks. In Proceedings of 5th International Conference of on
Embedded Networked Sensor Systems, pages 305–319, 2007.

[148] Lei Pan and Hongyi Wu. Smart trend-traversal: A low delay and energy tag arbitration

protocol for large RFID systems. In Proceedings of IEEE INFOCOM, 2009.

[149] Apostolia Papapostolou and Hakima Chaouchi. Rﬁd-assisted indoor localization and
the impact of interference on its performance. Journal of Network and Computer
Applications, 34(3):902 – 913, 2011. RFID Technology, Systems, and Applications.

[150] Joseph Polastre, Jason Hill, and David Culler. Versatile low power media access for
wireless sensor networks. In Proceedings of 2nd International Conference of on Embedded
Networked Sensor Systems, pages 95–107, 2004.

[151] P. Popovski, K. Fyhn, R. M. Jacobsen, and T. Larsen. Robust statistical methods for
detection of missing rﬁd tags. IEEE Wireless Communications, 18(4):74–80, August
2011.

[152] Saiyu Qi, Yuanqing Zheng, Mo Li, Li Lu, and Yunhao Liu. COLLECTOR: A Secure

RFID-Enabled Batch Recall Protocol. Proc. of IEEE INFOCOM, 2014.

[153] Chen Qian, Yunhuai Liu, Hoilun Ngan, and Lionel M. Ni. ASAP: Scalable identiﬁcation
and counting for contactless RFID systems. In Proceedings of 30th IEEE International
Conference of on Distributed Computing Systems, pages 52–61, 2010.

[154] Chen Qian, Hoilun Ngan, and Yunhao Liu. Cardinality estimation for large-scale RFID
systems. In Proceedings of 6th Annual IEEE International Conference of on Pervasive
Computing and Communications, pages 30–39, 2008.

[155] Chen Qian, Hoilun Ngan, Yunhao Liu, and Lionel M. Ni. Cardinality Estimation for
Large-scale RFID Systems. IEEE Transactions on Parallel and Distributed Systems,
22(9):1441–1454, 2011.

[156] Yan Qiao, Shigang Chen, Tao Li, and Shiping Chen. Energy-eﬃcient Polling Protocols

in RFID Systems. Proc. of ACM Mobihoc, 2011.

[157] Eric Rescorla. Security holes... who cares. In Proceedings of 12th USENIX Security

Symposium, 2003.

[158] Eric Rescorla. Is ﬁnding security holes a good idea?

3(1):14–19, Januray 2005.

IEEE Security and Privacy,

[159] John Rice. Mathematical statistics and data analysis. Cengage Learning, 2006.

[160] Mark Roberti. A 5-cent breakthrough. RFID Journal, 5(6), 2006.

86

[161] Mark Roberti. Wal-mart relaunches EPC RFID eﬀort, starting with men’s jeans and

basics. RFID Journal, 2010.

[162] L. G. Roberts. Aloha Packet System with and without Slots and capture. ACM

SIGCOMM Computer Communication Review, 5(2):28–42, 1975.

[163] Walter A. Rosenkrantz and Don Towsley. On the instability of slotted ALOHA

multiaccess algorithm. IEEE Transactions on Automatic Control, 28:994–996, 1983.

[164] Walter A. Rosenkrantz and Donald Towsley. On the instability of slotted aloha
multiaccess algorithm. IEEE Transactions on Automatic Control, 28(10):994–996, 1983.

[165] Sheldon M. Ross. Introduction to Probability Models. Academic Press, Elsevier, 9th

edition, 2007.

[166] Mark Schilling. Understanding probability: Chance rules in everyday life. The American

Statistician, 60(1):97–98, 2006.

[167] Bruce Schneier. Cryptogram September 2000-Full Disclosure and the Window of

Exposure.

[168] Frits C Schoute. Dynamic Frame Length ALOHA. IEEE Transactions on Communica-

tions, 31(4):565 – 568, 1983.

[169] Guido Schryen. A comprehensive and comparative analysis of the patching behavior of
open source and closed source software vendors. In Proceedings of 5th International
Conference of on IT Security Incident Management and IT Forensics, pages 153–168,
2009.

[170] E. Eugene Schultz, David S. Brown, and Thomas A. Longstaﬀ. Responding to Computer
Security Incidents: Guidelines for Incident Handling. Lawrence Livermore National
Laboratory, Livermore, CA, 1990.

[171] Philips Semiconductors. SL2 ICS11 I.Code UID Smart Label IC Functional Speciﬁcation

Datasheet http://www.advanide.com/datasheets/sl2ics11.pdf, 2004.

[172] Philips

I-CODE

Smart

Semiconductors.

RFID
http://www.nxp.com/acrobat_download/other/identiﬁcation/SL092030.pdf,
2004.

Label

Tags.
Jan

[173] Vahid Shah-Mansouri and Vincent W.S. Wong. Cardinality estimation in RFID systems
with multiple readers. In Proceedings of IEEE Global Communications Conference of ,
2009.

[174] Vahid Shah-Mansouri and Vincent W.S. Wong. Cardinality estimation in RFID systems
with multiple readers. IEEE Transactions on Wireless Communications, 10(5):1458–
1469, 2011.

[175] Muhammad Shahzad and Alex X Liu. Every bit counts:

estimation. In Proceedings of ACM MobiCom, 2012.

fast and scalable RFID

87

[176] Muhammad Shahzad and Alex X. Liu. Probabilistic optimal tree hopping for RFID

identiﬁcation. In Proceedings of ACM SIGMETRICS, 2013.

[177] Muhammad Shahzad and Alex X. Liu. Fast and Accurate Estimation of RFID Tags.

IEEE/ACM Transactions on Networking, 23(1):241–254, 2015.

[178] Muhammad Shahzad and Alex X. Liu. Probabilistic Optimal Tree Hopping for RFID

Identiﬁcation. IEEE/ACM Transactions on Networking, 23(3):796–809, 2015.

[179] Muhammad Shahzad, M. Zubair Shaﬁq, and Alex X. Liu. A large scale exploratory
analysis of vulnerability life cycles (extended version). Technical report, Michigan State
University, www.msu.edu/~shahzadm/ICSE2012/shahzad2011vulnsTR.pdf, 2011.

[180] Longfei Shangguan, Zimu Zhou, Xiaolong Zheng, Lei Yang, Yunhao Liu, and Jinsong
Han. ShopMiner: Mining Customer Shopping Behavior in Physical Clothing Stores
with Passive RFIDs. Proc. of ACM SenSys, 2015.

[181] Bo Sheng, Qun Li, and Weizhen Mao. Eﬃcient continuous scanning in rﬁd systems. In
Proceedings of the 29th Conference on Information Communications, INFOCOM’10,
pages 1010–1018, Piscataway, NJ, USA, 2010. IEEE Press.

[182] Bo Sheng, Chiu C. Tan, Qun Li, and Weizhen Mao. Finding Popular Categories for

RFID Tags. Proc. of ACM Mobihoc, 2008.

[183] Nikolaj V Smirnov, Igor? V Dunin-Barkovskij, and Wolfgang Richter. Mathematische

Statistik in der Technik. Dt. Verlag d. Wiss., 1963.

[184] David Eugene Smith. A source book in mathematics. Courier Dover Publications, 2012.

[185] Claire Swedberg. Honeywell aerospace tags parts for airbus. RFID Journal, 2013.

[186] C. C. Tan, B. Sheng, and Q. Li. How to monitor for missing rﬁd tags. In Distributed
Computing Systems, 2008. ICDCS ’08. The 28th International Conference on, pages
295–302, June 2008.

[187] Andrew S. Tanenbaum. Computer Networks. Prentice-Hall, 2002.

[188] ShaoJie Tang, Jing Yuan, Xiang-Yang Li, Guihai Chen, Yunhao Liu, and JiZhong
Zhao. Raspberry: A stable reader activation scheduling protocol in multi-reader RFID
systems. In Proceedings of IEEE ICNP, 2009.

[189] Rahul Telang and Sunil Wattal. An empirical analysis of the impact of software
vulnerability announcements on ﬁrm stock price. IEEE Transactions on Software
Engineering, 33(8):544–557, 2007.

[190] Robert Tibshirani, Guenther Walther, and Trevor Hastie. Estimating the number of
clusters in a data set via the gap statistic. Journal of the Royal Statistical Society:
Series B (Statistical Methodology), 63(2):411–423, 2001.

88

[191] Géraldine Vache. Vulnerability analysis for a quantitative security evaluation.

In
Proceedings of 3rd International Symp. on Empirical Software Engineering and Mea-
surement, 2009.

[192] Ehsan Vahedi, Vahid Shah-Mansouri, Vincent W. S. Wong, Ian F.Blake, and Rabab K.
Ward. Probabilistic Analysis of Blocking Attack in RFID Systems. IEEE Transactions
on Information Forensics and Security, 6(3):803 – 817, 2011.

[193] Harald Vogt. Eﬃcient object identiﬁcation with passive RFID tags. Pervasive Comput-

ing, 2414:98–113, 2002.

[194] James Waldrop, Daniel W. Engels, and Sanjay E. Sarma. Colorwave: A MAC for RFID
reader networks. In Proceedings of IEEE Wireless Communications and Networking,
pages 1701–1704, 2003.

[195] Chong Wang, Hongyi Wu, and Nian-Feng Tzeng. RFID-based 3-D positioning schemes.

In Proceedings of IEEE INFOCOM, 2007.

[196] Fei Wang, Bin Xiao, Kai Bu, and Jinshu Su. Detect and Identify Blocker Tags in

Tree-based RFID Systems. Proc. of IEEE ICC, 2013.

[197] J. Wang, H. Hassanieh, D. Katabi, and P. Indyk. Eﬃcient and Reliable Low-power

Backscatter Networks. Proc. of ACM SIGCOMM, 2012.

[198] Roy Want. An introduction to rﬁd technology. IEEE Pervasive Computing, 5(1):25–33,

2006.

[199] Stephen A. Weis, Sanjay E. Sarma, Ronald L. Rivest, and Daniel W. Engels. Security
and privacy aspects of low-cost radio frequency identiﬁcation systems. Security in
Pervasive Computing, pages 50–59, 2004.

[200] J. Werb and C. Lanzl. Designing a positioning system for ﬁnding things and people

indoors. IEEE Spectrum, 35(9):71–78, Sep 1998.

[201] D. Wijesekera and S. Jajodia. A propositional policy algebra for access control. ACM

Transactions on Information and System Security (TISSEC), 6(2):286–325, 2003.

[202] Duminda Wijesekera and Sushil Jajodia. A propositional policy algebra for access
control. ACM Transactions on Information and System Security (TISSEC), 6(2):286–
325, 2003.

[203] Ian H. Witten, Eibe Frank, Len Trigg, Mark Hall, Geoﬀrey Holmes, and Sally Jo
Cunningham. Weka: Practical Machine Learning Tools and Techniques with Java
Implementations. Citeseer, 1999.

[204] Yan Wu, Harvey Siy, and Robin Gandhi. Empirical results on the study of software
vulnerabilities: NIER track. In Proceedings of 33rd International Conference of on
Software Engineering, pages 964–967, 2011.

89

[205] Qingjun Xiao, Bin Xiao, and Shigang Chen. Diﬀerential Estimation in Dynamic RFID

Systems. Proc. of IEEE INFOCOM, 2013.

[206] Qingjun Xiao, Bin Xiao, and Shigang Chen. Diﬀerential Estimation in Dynamic RFID

Systems. Proc. of IEEE INFOCOM, 2013.

[207] Lei Xie, Hao Han, Qun Li, Jie Wu, and Sanglu Lu. Eﬃcient Protocols for Collect-
ing Histograms in Large-Scale RFID Systems. IEEE Transactions on Parallel and
Distributed Systems, 26(9):2421–2433, 2015.

[208] L. Yang, J. Cao, W. Zhu, and S. Tang. Accurate and eﬃcient object tracking based on
passive rﬁd. IEEE Transactions on Mobile Computing, 14(11):2188–2200, Nov 2015.

[209] Lei Yang, Yekui Chen, Xiang-Yang Li, Chaowei Xiao, Mo Li, and Yunhao Liu. Tagoram:
Real-Time Tracking of Mobile RFID Tags to High Precision Using COTS Devices. Proc.
of ACM MobiCom, 2014.

[210] Lei Yang, Yi Guo, Tianci Liu, Cheng Wang, and Yunhao Liu. Perceiving the Slightest
Tag Motion beyond Localization. IEEE Transactions on Mobile Computing, 14(11):2363–
2375, 2015.

[211] Lei Yang, Jinsong Han, Yong Qi, Cheng Wang, Tao Gux, and Yunhao Liu. Season:
Shelving Interference and Joint Identiﬁcation in Large-Scale RFID Systems. Proc. of
IEEE INFOCOM, 2011.

[212] Lei Yang, Qiongzheng Lin, Xiang-Yang Li, Tianci Liu, and Yunhao Liu. See Through

Walls with COTS RFID System. Proc. of ACM MobiCom, 2015.

[213] Guang yao Jin, Xiao yi Lu, and Myong-Soon Park. An indoor localization mechanism
using active rﬁd tag. In IEEE International Conference on Sensor Networks, Ubiquitous,
and Trustworthy Computing (SUTC’06), volume 1, pages 4 pp.–, June 2006.

[214] Wei Ye, John Heidemann, and Deborah Estrin. Medium access control with coordinated
adaptive sleeping for wireless sensor networks. IEEE/ACM Transactions on Networking,
12(3):493–506, 2004.

[215] Yafeng Yin, Lei Xie, Sanglu Lu, and Daoxu Chen. Check out the Rules: Towards
Time-Eﬃcient Rule Checking over RFID Tags. Mobile Networks and Applications,
19(4):524–533, 2014.

[216] R. Zhang, Y. Liu, Y. Zhang, and J. Sun. Fast identiﬁcation of the missing tags in a large
rﬁd system. In Sensor, Mesh and Ad Hoc Communications and Networks (SECON),
2011 8th Annual IEEE Communications Society Conference on, pages 278–286, June
2011.

[217] Shigeng Zhang, Xiaoxian He, Hong Song, and Daqiang Zhang. Time eﬃcient tag
searching in multiple reader RFID systems. In Proceedings of Green Computing and
Communications (GreenCom). IEEE, 2013.

90

[218] Bin Zhen, Mamoru Kobayashi, and Masashi Shimizu. Framed ALOHA for multiple
RFID objects identiﬁcation. IEICE Transactions on Communications, 88:991–999,
2005.

[219] Yuanqing Zheng and Mo Li. Fast Tag Searching Protocol for Large-Scale RFID Systems.

Proc. of IEEE ICNP, 2011.

[220] Yuanqing Zheng and Mo Li. PET: Probabilistic Estimating Tree for Large-scale RFID

Estimation. IEEE Transactions on Mobile Computing, 11(11):1763–1774, 2012.

[221] Yuanqing Zheng and Mo Li. Fast tag searching protocol for large-scale RFID systems.

IEEE/ACM Transactions on Networking (TON), 21(3):924–934, 2013.

[222] Yuanqing Zheng and Mo Li. Fast Tag Searching Protocol for Large-Scale RFID Systems.

IEEE/ACM Transactions on Networking, 21(3):924–934, 2013.

[223] Yuanqing Zheng and Mo Li. ZOE: Fast Cardinality Estimation for Large-Scale RFID

Systems. Proc. of IEEE INFOCOM, 2013.

[224] Yuanqing Zheng and Mo Li. ZOE: Fast Cardinality Estimation for Large-Scale RFID

Systems. Proc. of IEEE INFOCOM, 2013.

[225] Yuanqing Zheng and Mo Li. P-MTI: Physical-Layer Missing Tag Identiﬁcation via
Compressive Sensing. IEEE/ACM Transactions on Networking, 23(4):1356–1366, 2015.

[226] Zongheng Zhou, Himanshu Gupta, Samir R. Das, and Xianjin Zhu. Slotted scheduled tag
access in multi-reader RFID systems. In Proceedings of IEEE International Conference
of on Network Protocols, pages 61–70, 2007.

91