PRIVILEGE-BASED DECENTRALIZED DATA SHARING

By

Ehab Zaghloul

A DISSERTATION

Submitted to

Michigan State University

in partial fulﬁllment of the requirements

for the degree of

Electrical Engineering — Doctor of Philosophy

2020

ABSTRACT

PRIVILEGE-BASED DECENTRALIZED DATA SHARING

By

Ehab Zaghloul

In this dissertation, cryptographic mechanism-based data sharing schemes are presented

that address the challenges of dependence on third trusted parties to facilitate sensitive data

sharing and censorship. This work is driven by the lack of elevated security and privacy

countermeasures necessary for use with sensitive data that many third parties exhibit. Our

proposed schemes transition from centralized to distributed models, thus alleviating trust

on third parties to realize data owner sharing preferences.

We ﬁrst propose a secure Privilege-based Multilevel Organizational Data sharing (P-

MOD) scheme that uses the cloud as a storage medium. P-MOD integrates a privilege-based

access structure into an attribute-based encryption mechanism to facilitate sensitive data

sharing in hierarchical settings. This structure allows data owners to share their sensitive

data selectively among all levels of the hierarchy in a ﬁne-grained manner. It also reduces

computational complexity by minimizing the overall cryptographic operations.

Following the development of P-MOD, we wished to gain a better understanding of dis-

tributed systems in an eﬀort to eliminate the need to trust third parties. Therefore, we

conducted a comprehensive study of the ﬁrst system to adopt blockchain, Bitcoin. In this

study, we aimed to identify the security points of weakness of these distributed systems. We

delved deeply into one of the major security threats, double-spending attacks, by performing

two thorough probability analyses of its likelihood of success. Next, we conducted a prob-

ability of success versus proﬁtability analysis of double-spending attacks to investigate the

trade-oﬀs between waiting time before accepting a transaction and the proﬁtability of these

attacks.

Motivated by our study of blockchain and the underlying foundation of distributed peer-

to-peer (P2P) networks, we developed a distributed Multilevel Attribute-based EMR man-

agement (d-MABE) scheme based on our groundwork of P-MOD. The d-MABE scheme

incorporates smart contracts deployed and executed over the blockchain to ensure the data

sharing preferences of the data owners are maintained. It also replaces the cloud storage with

a distributed storage system that is managed by a P2P network to improve the reliability

of retrieving data when requested. Using electronic medical records (EMR) as a use-case,

our goal is to demonstrate the beneﬁts of alleviating dependence on the electronic record-

generating institutions and thus granting data owners (patients) control of their sensitive

data in a distributed manner.

To further expand our research and reﬂect its applicability to a wider domain space, we

proposed a blockchain-based distributed Coercion-Resistant and Anonymous Mobile Elec-

tronic (d-CRAME) voting scheme. The proposed scheme is secure and preserves voter pri-

vacy through secure multi-party computations performed by parties of diﬀering allegiances.

It also leverages a blockchain running smart contracts as a publicly accessible and tamper-

resistant bulletin board to permanently store votes and prevent double-voting. Using voting

as an application, our goal is to demonstrate the potential and feasibility of designing a

distributed and remote voting scheme for large-scale elections, thus increasing voter turnout

and accuracy in the decision-making process.

Copyright by
EHAB ZAGHLOUL
2020

I dedicate this work to my dear and loving parents.

v

ACKNOWLEDGMENTS

First and foremost, I would like to express my sincere gratitude to my advisor, Dr. Jian Ren.

Being so knowledgeable and dedicated to his ﬁeld, Dr. Ren was the best mentor anyone could

wish to have. Our conversations were always inspiring and insightful to both my academic

career and personal life. I cannot thank you enough for everything you have taught me and

will always be grateful.

I would like to thank Dr. Tongtong Li, Dr. Mi Zhang, and Dr. Richard Embody for

serving on my committee. Your support and feedback were a great value-added to this

dissertation. I would also like to express my appreciation to Dr. Tongtong Li for providing

me with guidance in my personal life and cooking us the tastiest food for Thanksgiving.

I must thank my labmates, Kai and Aﬁﬁ. You were both the coolest people to ever work

with and I have learned so much from both of you. I will not forget these fun times.

As for my family, I cannot thank my dear father enough for pushing me in the direction of

pursuing a Ph.D. You have always believed in me and motivated me in every single possible

way. My beloved mother, your constant prayers and comforting words are the reasons why I

am where I am today. My beautiful sisters, Shaza and Hadeel, you are both real inspirations

and role models that have given me all the strength I need.

My friends were my support system during this ride. Dana, you have helped me in every

single possible way, it would take me pages to thank you, so I will leave it there. Kasstawi,

you have always supported me and pushed me outside of my comfort zone to achieve the

impossible. Yousry, Beltagy, and Raghda, you guys were always there and cheered me up

when I needed it most, thank you.

vi

TABLE OF CONTENTS

LIST OF TABLES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

x

LIST OF FIGURES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

xi

LIST OF ALGORITHMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii

Chapter 1

Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.2.1 Centralized Models . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.2.2 Distributed Models . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.3 Summary of Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.3.1 P-MOD: Secure Privilege-Based Multilevel Organizational Data Shar-
ing in Cloud Computing . . . . . . . . . . . . . . . . . . . . . . . . .
1.3.2 Bitcoin and Blockchain . . . . . . . . . . . . . . . . . . . . . . . . . .
d-MABE: Distributed Multilevel Attribute-Based EMR Management
1.3.3
and Applications
. . . . . . . . . . . . . . . . . . . . . . . . . . . . .
d-CRAME: Distributed Coercion-Resistant and Anonymous Mobile
Electronic Voting . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.4 Dissertation Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . .

1.3.4

2.2.1 Design Goals
2.2.2

Chapter 2 P-MOD: Secure Privilege-Based Multilevel Organizational Data
Sharing in Cloud Computing . . . . . . . . . . . . . . . . . . . . .
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.1
2.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.3 The Proposed P-MOD Scheme . . . . . . . . . . . . . . . . . . . . . . . . . .
2.3.1 Data File Partitioning and Encryption . . . . . . . . . . . . . . . . .
2.3.2 The P-MOD Privilege-Based Access Structure . . . . . . . . . . . . .
2.3.3 The Proposed P-MOD Construction . . . . . . . . . . . . . . . . . .
2.4 Security Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.5 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.5.1 Traditional CP-ABE in a Hierarchical Setting . . . . . . . . . . . . .
2.5.2 Computational Cost
. . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.6 Empirical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
2.6.1 Key Generation Time-Cost . . . . . . . . . . . . . . . . . . . . . . . .

2.5.2.1 Encryption Cost
2.5.2.2 Decryption Cost
Storage Cost

2.5.3

1
1
3
3
5
7

8
8

9

9
10

11
11
12
12
13
15
15
17
18
22
27
27
29
29
30
32
33
35

vii

2.6.2 Encryption Time-Cost . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . .
2.6.3 Decryption Time-Cost
2.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Chapter 3 Bitcoin and Blockchain . . . . . . . . . . . . . . . . . . . . . . . .
3.1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2 Understanding Bitcoin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2.1 The Bitcoin Network . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2.2 Bitcoin Transactions . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2.3 Bitcoin Transaction Standards . . . . . . . . . . . . . . . . . . . . . .
3.2.4 Merkle Trees
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2.5 Blockchain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2.6 Bitcoin Mining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.2.7 Bitcoin Mining Pools and Payment Methods . . . . . . . . . . . . . .
3.2.8 Alternative Cryptocurrencies . . . . . . . . . . . . . . . . . . . . . . .
3.3 Bitcoin Security Issues - Double-Spending Attacks . . . . . . . . . . . . . . .
3.3.1 Types of Attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.3.1.1 Race Attack . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.3.1.2 Finney Attack . . . . . . . . . . . . . . . . . . . . . . . . .
3.3.1.3 Vector76 Attack . . . . . . . . . . . . . . . . . . . . . . . .
51% Attack . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.3.1.4
3.3.2 Probability of Success
. . . . . . . . . . . . . . . . . . . . . . . . . .
3.3.3 Attack Proﬁtability . . . . . . . . . . . . . . . . . . . . . . . . . . . .
3.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

37
38
39

40
40
41
41
45
48
51
53
54
57
61
63
64
64
65
65
66
67
73
80

Chapter 4

4.2.1 Design Goals
4.2.2

d-MABE: Distributed Multilevel Attribute-Based EMR Man-
agement and Applications . . . . . . . . . . . . . . . . . . . . . . .
4.1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.3 The Proposed d-MABE Scheme . . . . . . . . . . . . . . . . . . . . . . . . .
Scheme Orchestration . . . . . . . . . . . . . . . . . . . . . . . . . .
4.3.1
4.3.2
Smart Contracts
. . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4.3.3 Access Permission Revocation . . . . . . . . . . . . . . . . . . . . . .
4.4 Security and Privacy Analysis . . . . . . . . . . . . . . . . . . . . . . . . . .
Security Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

81
81
82
84
85
86
87
90
95
96
97
. . . . . . . . . . . . . . . . . . . . 102
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.5 Performance Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
SMR Contract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.5.1
4.5.2 AVPA Contract
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.5.3 GK Contract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
4.6 Empirical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.7 Extended Application Discussions . . . . . . . . . . . . . . . . . . . . . . . . 113

4.4.1
4.4.2 Recommended Security Practices
4.4.3 Privacy Analysis

viii

4.8 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

Chapter 5

5.3.3 Acquiring a Ballot

5.2.1 Design Goals
5.2.2

d-CRAME: Distributed Coercion- Resistant and Anonymous
Mobile Electronic Voting . . . . . . . . . . . . . . . . . . . . . . . 115
5.1
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
5.2 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
System Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
5.3 The Proposed d-CRAME Scheme . . . . . . . . . . . . . . . . . . . . . . . . 119
5.3.1
Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
5.3.2 Voter Registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119
5.3.2.1 Voter Key Generation and Registration . . . . . . . . . . . . 120
Signing Voter’s Public Key . . . . . . . . . . . . . . . . . . 120
5.3.2.2
. . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
5.3.3.1 Ballots Generation . . . . . . . . . . . . . . . . . . . . . . . 121
5.3.3.2 Voter Permutation . . . . . . . . . . . . . . . . . . . . . . . 121
5.3.3.3 Requesting a Ballot
. . . . . . . . . . . . . . . . . . . . . . 121
5.3.3.4 Voter Identity Obscuring . . . . . . . . . . . . . . . . . . . . 122
5.3.3.5 Ballot Assignment and Encryption . . . . . . . . . . . . . . 123
5.3.3.6 Encrypted Ballot Transmission . . . . . . . . . . . . . . . . 124
5.3.3.7 Deriving Ballot . . . . . . . . . . . . . . . . . . . . . . . . . 124
5.3.4 Casting Votes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5.3.4.1 Ballot Double-Encryption . . . . . . . . . . . . . . . . . . . 125
5.3.4.2
. . . . . . . . . . . . . . . . . . . . . . . . . . 126
5.3.5 Tabulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
5.3.6 Vote Veriﬁability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
5.4 Security and Privacy Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 128
5.5 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.5.1
Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.5.2 Voter Registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
5.5.3 Acquiring Ballots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
5.5.4 Casting Vote
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
5.5.5 Tabulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.6 Empirical Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

Submit Vote

Chapter 6 Conclusion and Future Work . . . . . . . . . . . . . . . . . . . . . 148
6.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
6.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150

BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

ix

LIST OF TABLES

Table 2.1:

P-MOD notations summary.

. . . . . . . . . . . . . . . . . . . . . . . .

14

Table 2.2:

P-MOD comparison of number of operations. . . . . . . . . . . . . . . .

32

Table 2.3:

P-MOD comparison of private key and ciphertext sizes.

. . . . . . . . .

32

Table 2.4:

Attribute distribution in each partition Fi.

. . . . . . . . . . . . . . . .

35

Table 4.1:

Estimated d-MABE smart contracts deployment at k=5 and N =25.

. . 110

Table 4.2:

Estimated d-MABE smart contracts deployment at k=5 and N =50.

. . 110

Table 4.3:

Estimated d-MABE smart contracts deployment at k=10 and N =100. . 110

Table 4.4:

U.S maximum state fees to copy and deliver records upon being requested
by others.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

Table 4.5:

Estimated d-MABE addStaﬀMember function costs.

. . . . . . . . . . . 111

Table 4.6:

Estimated d-MABE addKey function costs.

. . . . . . . . . . . . . . . . 111

Table 5.1:

d-CRAME performance comparison. . . . . . . . . . . . . . . . . . . . . 139

x

LIST OF FIGURES

Figure 2.1: P-MOD General scheme of privilege-based data sharing. . . . . . . . . .

14

Figure 2.2: P-MOD ﬁle partitioning.

. . . . . . . . . . . . . . . . . . . . . . . . . .

16

Figure 2.3: P-MOD privilege-based multilevel access structure. . . . . . . . . . . . .

17

Figure 2.4: CP-ABE used in a hierarchical organization.

. . . . . . . . . . . . . . .

28

Figure 2.5: P-MOD performance comparison: (a) Key generation time, (b) Encryp-
. . . . . . . . . . . . . . . . . . . .

tion time, and (c) Decryption time.

36

Figure 3.1: A single Bitcoin transaction with multiple UTXO inputs and outputs.

.

47

Figure 3.2: A Merkle tree within a block.

. . . . . . . . . . . . . . . . . . . . . . .

53

Figure 3.3: Probability of successful double-spending attacks vs. number of conﬁr-
mations waited by the merchant. . . . . . . . . . . . . . . . . . . . . . .

Figure 3.4: Proﬁts/losses of attackers with varying computational power q trying to
double-spend v = 5 BTC during October 2019. . . . . . . . . . . . . . .

73

78

Figure 4.1:

d-MABE general scheme orchestration.

. . . . . . . . . . . . . . . . . .

86

Figure 4.2: Estimated costs to run the d-MABE AVPA smart contract.

. . . . . . . 112

Figure 5.1:

d-CRAME steps to select the desired candidates: (1) application start
and (2) selecting and encrypting selected candidates. . . . . . . . . . . . 125

Figure 5.2:

d-CRAME steps to cast and verify vote: (3) encryption result, (4) im-
porting blockchain account to cast vote, (5) information on block con-
taining cast vote, and (6) verifying vote permanently exists on blockchain.127

Figure 5.3:

d-CRAME time comparison for various key sizes: (a) in computer, (b)
in smartphone, (c) smartphone/computer.

. . . . . . . . . . . . . . . . 145

xi

LIST OF ALGORITHMS

Algorithm 1: SMR smart contract . . . . . . . . . . . . . . . . . . . . . . . . . . . .

91

Algorithm 2: AVPA smart contract . . . . . . . . . . . . . . . . . . . . . . . . . . . .

92

Algorithm 3: GK smart contract . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

95

Algorithm 4: Voter Registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

Algorithm 5: Ballot Acquisition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

xii

Chapter 1

Introduction

1.1 Overview

Nowadays, if we choose to share sensitive data with someone, we must rely on a Trusted

Third Party (TTP) to accurately and eﬃciently deliver the data to the correct person. This

arrangement is the basis of nearly all data exchanges and rarely is it possible to circumvent

the need to rely on a TTP. In a perfect world, a TTP would protect user privacy and have

impenetrable security measures in place. However, this perfect scenario does not exist and

thus security and privacy issues are major concerns in current data sharing systems.

In a September 2018 press release, two UMass Memorial Healthcare entities are under

investigation by the Massachusetts Attorney General [1]. The lawsuit claims that two sep-

arate data breaches occurred by two medical entities where former employees improperly

accessed the personal and protected health information of patients for fraudulent purposes.

The lawsuit claims that the healthcare entities failed to protect patient information in-line

with federal and state law. The former employees were able to access names, addresses, social

security numbers, clinical information, and health insurance information. As a result of this

lawsuit, the UMass Memorial Healthcare entities have agreed to a hefty $230,000 settlement

and a series of remedial action steps. These steps include limiting employee access to patient

information and a third-party review of their current data security policies and procedures.

1

In this event, the detrimental eﬀects of unprivileged access to sensitive data are evident.

More so, it is apparent that data and security breaches are not conﬁned to random entities,

but can even occur in highly respected healthcare organizations. In this scenario, the TTP

failed to prevent unprivileged access to the most sensitive parts of patient records. The TTP

also failed to revoke access to data when a data user (employee) was no longer privileged to

access it. In a situation such as this, it is valuable to be able to control which parts of the

data set are shared and with whom. For instance, the duty of a physician may warrant access

to patient clinical information, but rarely would a physician need access to social security

numbers.

In a similar manner, a hospital billing department may need access to patient

insurance information, but may not need unrestricted access to patient medical records.

A healthcare organization is a representative, real-world example of a matrix where many

diﬀerent professions and role duties exist. Within this complex matrix, there are also hi-

erarchies within diﬀerent branches. For example, a chief surgeon may preside over resident

surgeons. When there is a complex matrix such as this, current methods to share data can

fall short in terms of eﬃciency, privacy, and security. As mentioned previously, a TTP is

used today to manage this. This places the burden of privacy protection and data security

solely on the TTP and removes the power from the data owner, or patient in this case. The

patient relinquishes ownership over his/her personal information, which in the wrong hands

could have severe ramiﬁcations.

Eﬃcient, secure, and private data sharing of sensitive data is essential in almost every

domain today. Unfortunately, data breaches continue to rise year after year. Therefore, it is

of interest to develop secure cryptographic protocols that allow proper data sharing meth-

ods. These schemes have the intention of de-risking unprivileged access to sensitive data by

empowering the data owners. In this dissertation, we propose sensitive data sharing schemes

2

that address eﬃciency, security, and privacy. We present our scheme by applying them to

two diﬀerent real-world applications that may be generalized to serve other applications

similarly.

1.2 Related Work

In this section, we present the related work for both centralized and distributed models.

1.2.1 Centralized Models

Over the years, cryptographic schemes have been proposed to enhance sensitive data sharing.

More recently, Attribute-Based Encryption (ABE) [2, 3] schemes have evolved to provide

versatility when sharing data. These schemes integrate two types of constructs: attributes

and access policies. Access policies are statements that join attributes to express which users

of the system are granted access and which users are denied. ABE schemes were introduced

via two diﬀerent approaches: Key-Policy Attribute-Based Encryption (KP-ABE) [4, 5] and

Ciphertext Policy Attribute-Based Encryption (CP-ABE) [6]. In KP-ABE, each ciphertext

is labeled with a set of descriptive attributes, while each private key is integrated with an

access policy. For authorized data users to decrypt the ciphertext, they must ﬁrst obtain

a private key from the key-issuer to use in decryption. The key-issuer integrates the access

policy into the keys generated. Data users can successfully decrypt a ciphertext if the set

of descriptive attributes associated with the ciphertext satisﬁes the access policy integrated

within their private keys. KP-ABE can achieve ﬁne-grained access control and is more

ﬂexible than Fuzzy IBE. However, the data owner must trust the key-issuer to only issue

private keys to data users granted the privilege of access. This is a limitation since the

3

data owner ultimately forfeits control over which data users are granted access. On the

other hand, CP-ABE is considered to be conceptually similar to Role-Based Access Control

(RBAC) [7]. It gives the data owner control over which data user is able to decrypt certain

ciphertexts. This is due to the access structure being integrated by the data owner into the

ciphertext during encryption. It allows the private key generated by the key-issuer to only

contain the set of attributes possessed by the data user. Some CP-ABE schemes [8–23] were

later introduced that can provide higher ﬂexibility and better eﬃciency.

Attribute-based encryption schemes such as KP-ABE and CP-ABE serve as a bet-

ter solution when data users are not ranked into a hierarchy and each is independent of

one another (i.e. no relationships). Therefore, Hierarchical Attribute-Based Encryption

(HABE) [24, 25] was introduced that combines the Hierarchical Identity-Based Encryption

(HIBE) [26] scheme and CP-ABE [6]. HABE is able to achieve ﬁne-grained access control

in a hierarchical organization. It consists of a root master that generates and distributes

parameters and keys, multiple domain masters that delegate keys to domain masters at the

following levels and numerous users. In this scheme, keys are generated in the same hierarchi-

cal key generation approach as the HIBE scheme. However, this scheme becomes unsuitable

for practical implementation when replicas of the same attributes are administered by other

domain authorities. Synchronizing attribute administration might become a challenging is-

sue with complex organizations that have multiple domain authorities. Examples of other

hierarchical schemes were later introduced in [27–30]. As a result, File Hierarchy Cipher-

text Policy Attribute-Based Encryption (FH-CP-ABE) [31] was introduced. It proposes a

leveled access structure to manage a hierarchical organization that shares data of various

sensitivity. A single access structure was proposed that represents both the hierarchy and

the access policies of an organization. Based on the possession of certain attributes, each

4

data user is mapped into speciﬁc transport nodes (certain levels within the hierarchy) based

on the access structure that the user satisﬁes. The main advantage of this scheme is that

it provides leveled access structures that are integrated into a single access structure. As a

result, storage space is saved as only one copy of the ciphertext is needed to be shared on

the cloud for all data users. However, since this scheme uses a single access structure to

represent the full hierarchy, the higher levels are forced to accommodate attributes of all the

levels below.

1.2.2 Distributed Models

With the continuous urge to reduce reliance on third trusted parties, distributed data sharing

schemes began to evolve. In 2015, a decentralized data management scheme was introduced

that facilitated access-control management over a blockchain [32]. In this system, the actual

data records are stored in oﬀ-blockchain storage while pointers to these records are main-

tained by a key-value storage over the blockchain. This solution helps simplify the amount

of data processed on the blockchain. However, the method used to deﬁne access policies in

this scheme does not consider hierarchical data sharing. A user that desires to share ﬁles

selectively among multiple users in a hierarchy will have to deﬁne several access policies

that could become a complex problem as the number of users increases drastically. One

year later, MedRec [33], the ﬁrst functional electronic medical record-sharing system built

on some concepts from [32] was introduced. This work builds on three Ethereum [34] smart

contracts that manage authentication, conﬁdentiality, and accountability during the data

sharing process. In this system, the primary entities involved in maintaining the blockchain

are the parties interested in gaining data, such as researchers and public health authorities.

In return, the institutions are rewarded with access to aggregate and anonymized data. How-

5

ever, the success of such a system is dependent on the participation of entities that maintain

the system in return for data. In addition, similar to [32], MedRec does not consider hierar-

chical data sharing. Finally, in 2017, another functioning electronic medical record-sharing

scheme was presented to provide a secure solution using the blockchain [35]. The system

uses a cloud-based storage system to store the medical records. With centralized storage,

the system becomes liable to a single point of failure. In contrast to the previous systems

discussed, this work builds over a permissioned blockchain, a monitored blockchain where

each node involved in maintaining consensus is known.

Distributed data sharing schemes have also began to evolve in more speciﬁc-related do-

mains such as electronic voting. Smart contracts for boardroom voting systems [36] were

implemented over the Ethereum blockchain using the Open Vote Network [37]. The main

advantages of this system are that it is completely decentralized, provides self-tallying and

achieves end-to-end characteristics. However, the system utilizes zero-knowledge proofs

which increases its overall computational costs. The majority of these cryptographic compu-

tations are implemented in smart contracts and executed over the blockchain, limiting the

protocol to small-scale elections. In addition, the system is not coercion-resistant and can be

manipulated by last-minute voters since they can tally the results before casting their votes.

To circumvent this issue, the scheme implements an optional additional round where voters

initially cast a hashed version of their votes as a commitment before casting their actual

encrypted votes. In this case, although last-minute voters can still compute the election re-

sults before casting their votes, they can no longer change their selection to manipulate the

results. However, this method requires additional smart contract callings and computations,

imposing additional costs. Another small-scale election was also introduced in [38] that re-

lies on expensive homomorphic encryption primitives to preserve voter privacy. Similarly,

6

incorporating such resource-intensive cryptographic computations requires signiﬁcant costs

and limits the scheme to small-scale elections.

Large-scale blockchain-based voting schemes were also introduced such as the scheme

presented in [39]. However, it relies heavily on a trusted central authority to facilitate

ballot generation. Eligible voters are required to blind their public keys along with a digital

commitment and send it to the central authority that signs the blinded message and returns

them to the corresponding voters. Next, the voters unblind the signed messages to end

up with signed ballots. This method is performed to prevent the central authority from

linking voters to their ballots. However, since the central authority cannot identify who it is

signing messages for, it may be subject to DDoS attacks. To improve on the issues imposed

by using blind signatures, other schemes such as [40] and [41] were proposed that utilize

homomorphic encryption alongside to ring or unlikable signatures. Aside from deploying

such expensive computations to enhance voter privacy, neither scheme addresses remote

voting and coercion-resistance.

1.3 Summary of Contributions

In this section, we outline our proposed data sharing schemes while presenting the motivation

for our transition from centralized to distributed models. The main contributions of this

research are summarized in the following subsections.

7

1.3.1 P-MOD: Secure Privilege-Based Multilevel Organizational

Data Sharing in Cloud Computing

Sharing data in privilege-based multilevel organizations initially requires data owners to

identify the sensitivity of their data they wish to share. Therefore, we begin by presenting

multiple data ﬁle partitioning techniques and propose a privilege-based access structure

that facilitates data sharing in hierarchical settings. We then present the proposed P-MOD

scheme and formally prove its security. We show that it is secure against adaptively chosen

plaintext attacks under the Decisional Bilinear Diﬃe-Hellman (DBDH) assumption. Next,

we present a performance analysis for P-MOD and compare it to three existing schemes [6,

25, 31] that aim to achieve similar hierarchical goals. To support our performance analysis,

we then implement P-MOD and conduct comprehensive simulations under various scenarios

using the Census Income data set [42]. We also compare our results to simulations we have

conducted for two other schemes [6, 31] under the same conditions.

1.3.2 Bitcoin and Blockchain

Here, we present an extensive analysis of Bitcoin and its underlying technology, blockchain.

We provide a comprehensive explanation of the primary components of Bitcoin discussed in a

sequential and logical order for the readers to comprehend. The main purpose is to cultivate

the readers with the necessary background on Bitcoin to consolidate their understanding of

the system. Next, We delve thoroughly into the analysis of double-spending attacks. We

provide a quantitative characterization between the risk of double-spending and the number

of blocks to be added to the blockchain before a transaction is accepted. We show that

the probability of success of performing double-spending attacks can be modeled using two

8

distinct probabilistic models. We also show that both models result in a similar outcome.

Using these probabilistic models, we present a proﬁtability analysis of performing double-

spending attacks. Our ﬁndings are useful to both Bitcoin users and miners. Miners can

obtain more insight into the mining process and potential methods to maximize their proﬁts.

1.3.3 d-MABE: Distributed Multilevel Attribute-Based EMR Man-

agement and Applications

To eliminate the dependence on trusted third parties, such as the cloud, we expand our

research by developing a novel distributed electronic medical record-sharing scheme for pa-

tients that is secure, preserves the privacy of patient data, and is eﬃcient. Within our

scheme, we also propose a distributed method for verifying medical institution staﬀ member

attributes over the blockchain before issuing them access keys. This method can help mini-

mize trust and dependencies on key-issuers when verifying attributes of staﬀ members. Next,

we conduct comprehensive security and privacy analyses of the proposed scheme. Finally, we

implement our proposed scheme using smart contracts and deploy them over the Ethereum

blockchain for performance evaluation and numerical demonstration.

1.3.4 d-CRAME: Distributed Coercion-Resistant and Anonymous

Mobile Electronic Voting

We develop a novel distributed, coercion-resistant, and anonymous mobile electronic voting

scheme that allows voters to cast their votes in large-scale elections. Our proposed scheme

builds over secure multi-party computations, allowing voters to cast their votes remotely

using mobile devices such as smartphones, storing them over a blockchain permanently and

9

irreversibly. Next, we conduct comprehensive security and privacy analyses of d-CRAME for-

mally proving it is secure against adaptively chosen plaintext attacks under the Decisional

Diﬃe-Hellman (DDH) assumption. We also introduce our Indistinguishability of Encryp-

tion Keys (IND-EK) Security Game showing that d-CRAME is coercion-resistant. Follow-

ing that, we present a performance analysis for d-CRAME and compare it to the existing

schemes [36,40]. Finally, we implement a desktop application and an iOS mobile application

for d-CRAME and the corresponding smart contracts which we deploy over the Ropsten

Ethereum testnet. Our empirical evaluation shows that, even when running d-CRAME un-

der encryption key sizes of 4096 bits, voters can cast their votes via mobile devices in less

than a minute.

1.4 Dissertation Organization

The rest of this dissertation is structured as follows. In Chapter 2, we propose the Privilege-

based Multilevel Organizational Data sharing (P-MOD) scheme. Following that, in Chap-

ter 3, we present our comprehensive research on the blockchain technology and provide our

results for the probability of successful double-spending attacks versus proﬁtability. Next, in

Chapter 4, we present the distributed Multilevel Attribute-Based EMR (d-MABE) scheme.

In Chapter 5, we present the distributed Coercion-Resistant and Anonymous Mobile Elec-

tronic (d-CRAME) voting scheme. Finally, in Chapter 6, we conclude this dissertation and

present our future work.

10

Chapter 2

P-MOD: Secure Privilege-Based

Multilevel Organizational Data

Sharing in Cloud Computing

2.1

Introduction

A critical issue for data owners is how to eﬃciently and securely grant privilege level-based

access rights to a set of data. Data owners are becoming more interested in selectively sharing

information with data users based on diﬀerent levels of granted privileges. The desire to grant

level-based access results in higher computational complexity and complicates the methods

in which data is shared on the cloud. Research in this ﬁeld focuses on ﬁnding enhanced

schemes that can securely, eﬃciently and intelligently share data on the cloud among users

according to granted access levels.

In this chapter, we propose a secure Privilege-based Multilevel Organizational Data shar-

ing scheme (P-MOD) that incorporates a privilege-based access structure into an attribute-

based encryption mechanism. In principle, data owners encrypt parts of their data under

each access policy at every level to grant access to speciﬁc data users based on their data

access privileges. A data user ranked at a certain level of the hierarchy can decrypt the

11

ciphertext (at that speciﬁc level) if and only if that data user possesses a correct set of

attributes that can satisfy the access policy of that level. The data user may also decrypt

the ciphertexts at the lower-levels with respect to the user’s level.

The rest of this chapter is organized as follows. In Section 2.2, the problem formulation is

described outlining the system model and design goals. In Section 2.3, the proposed scheme,

P-MOD is introduced. Following that, in Section 2.4 we formally prove the security of P-

MOD based on the hardness of the Decisional Bilinear Diﬃe-Hellman (DBDH) problem [43].

In Sections 2.5 and 2.6, a performance analysis our empirical results of P-MOD are conducted

and compared with two other schemes [6, 31]. Finally, in Section 2.7, a conclusion is drawn

to summarize the work done in this research.

2.2 Problem Formulation

Consider a data owner that possesses a data ﬁle F and wishes to selectively share diﬀerent
segments of it on the cloud among a set of data users based on certain access privileges. We

assume that the data users can be ranked into a hierarchy that deﬁnes their access privileges.

The challenge is to provide the data owners with an eﬃcient, secure and privilege-based

method that allows them to selectively share their data ﬁles among multiple data users while

minimizing the required cloud storage space needed to store the encrypted data segments.

2.2.1 Design Goals

Based on the problem described above, we have the following design goals:

• Privilege-Based Access: Data is shared in a hierarchical manner based on user privi-
leges. Data users with more privileges (ranked at the higher levels of the hierarchy) are

12

granted access to more sensitive parts of F than those with fewer privileges (ranked
at the lower-levels of the hierarchy).

• Data Conﬁdentiality: All parts of F are completely protected from unprivileged data
users (including the storage space). Data users are entitled to access the parts of F
corresponding to the levels they fall in or any other parts corresponding to the levels

below with respect to their own.

• Fine-grained access control: The data owner has the capability to encrypt any part of F
using any set of descriptive attributes he/she wishes, limiting access to only authorized

data users. The set of descriptive attributes is deﬁned by the data owner at the time

of encryption and can be selected from an inﬁnite pool.

• Collusion resistant: Two or more data users at the same/diﬀerent level cannot combine
their private keys to gain access to any part of F they are not authorized to access
independently.

2.2.2 System Model

The general model of privilege-based data sharing among hierarchically-ranked data users is

illustrated in Fig. 2.1. The system consists of four main entities:

Data owner (DO): An individual that owns a data ﬁle and wishes to selectively share it

with multiple data users based on certain desired privacy preferences.

Data users (DU): A set of hierarchically-ranked individuals {DUj ∈ DU | 1 ≤ i ≤ ∞}
interested in obtaining diﬀerent segments of a shared data ﬁle. Data users fall into diﬀerent

13

Figure 2.1: P-MOD General scheme of privilege-based data sharing.

Symbol Deﬁnition

Table 2.1: P-MOD notations summary.

DO A data owner
H
Li
Ti
Fi
ski
EFi
eski
DUj
SKj

The hierarchical layout of the organization
ith level within H where 1 ≤ i ≤ k
ith access tree at Li where 1 ≤ i ≤ k
ith data ﬁle part where 1 ≤ i ≤ k
ith symmetric key used to encrypt Fi where 1 ≤ i ≤ k
ith sym. encrypted Fi under ski where 1 ≤ i ≤ k
ith cp-abe encrypted ski under Ti at Li where 1 ≤ i ≤ k
jth data user in set DU where 1 ≤ j ≤ m
jth data user’s private key where 1 ≤ j ≤ m
uth attribute within the jth data user’s attribute set Aj where 1 ≤ u ≤ n and
1 ≤ j ≤ m
lth node of Ti where 1 ≤ l ≤ ∞ and 1 ≤ i ≤ k
Threshold value of xi

Aj,u
xi
l
k
xi
l

l where 1 ≤ l ≤ ∞ and 1 ≤ i ≤ k

levels within the hierarchy based on speciﬁc sets of attributes Aj = {Aj,1, Aj,2 . . . Aj,n} they

possess.

Key-issuer: A fully trusted entity that generates private keys for the data users that

possess a correct set of attributes.

Cloud server: A non-trusted entity that stores the encrypted segments of the data ﬁle.

14

Encrypted Data FileMost SensitiveLess SensitiveLeast SensitiveData Users in a HierarchyCloud ServerData OwnerData Fetching Key IssuerGranting Keys2.3 The Proposed P-MOD Scheme

We assume that ﬁle F is partitioned into k parts based on data sensitivity. Each part
of F is independently encrypted and shared among the data users of the system under a
privilege-based access structure.

2.3.1 Data File Partitioning and Encryption

The DO partitions ﬁle F into a set of k data sections, that is F = {F1, F2, . . . , Fk}. Each
Fi ∈ F is treated as a new ﬁle that is associated with a sensitivity value used to assign
access rights to the data users based on their privileges. The process of partitioning F is
performed based on the structure of F. We assume that F consists of at least one record,
resulting in multiple ways to partition it as shown in Fig. 2.2.

If F consists of a single record, then each Fi ∈ F represents one or more record at-
tribute(s) associated with the record, as shown in case (1) in Fig. 2.2. However, if F consists
of multiple records, then the DO has ﬂexibility in choosing how to partition it. One ap-

proach is to handle each record as a whole, where records are clustered into groups of similar

sensitivity. In this case, each Fi ∈ F represents one or more record(s), as shown case (2) in
Fig. 2.2. Alternatively, partitioning can be performed over speciﬁc record attributes, versus

the whole record. In this case, Fi ∈ F represents one or more record attribute(s) of the
records, as shown case (3) in Fig. 2.2.

Regardless of how partitioning is performed, each Fi ∈ F is then treated as a new data
ﬁle. Suppose F1 contains the most sensitive information of F that can only be accessed
by a data user at the highest level L1 and Fk contains the least sensitive information of
F that can be accessed by all data users at any level of the hierarchy. Before the DO

15

Figure 2.2: P-MOD ﬁle partitioning.

uploads {F1, F2, . . . , Fk} to the cloud, each Fi ∈ F is encrypted separately using a symmetric
encryption algorithm such as the Advanced Encryption Algorithm (AES) [44] with a secret

key ski to produce an encrypted ﬁle

EFi = Encski

(Fi).

(2.1)

For key selection, the DO randomly selects sk1. The remaining symmetric keys {sk2, . . . , skk}
are then derived from sk1 using a one-way cryptographic hash function h, that is

ski+1 = h(ski).

(2.2)

Next, the symmetric keys are encrypted as discussed in Section 2.3.3, to be accessed only

by the data users that have been granted the privilege of access. The privileged data users

that are successful in obtaining ski corresponding to level Li can derive {ski+1, . . . , skk}
using equation (2.2). However, given the properties of hash function h, ski cannot be used

to derive any of the symmetric keys {sk1, . . . , ski−1}.

16

LowestSensitivityHighestSensitivityCase (1)...Case (3)...Case (2)...F1F2FkF1F2FkF1F2FkFigure 2.3: P-MOD privilege-based multilevel access structure.

2.3.2 The P-MOD Privilege-Based Access Structure

The general privilege-based access structure of P-MOD is illustrated in Fig. 2.3. Data users

are ranked into k levels of privileges, {L1,L2, . . . ,Lk}. The DO deﬁnes an access tree
Ti at each corresponding level Li. Each Ti is associated with the appropriate leaf nodes
(attributes) that deﬁne the privileges of the level. Data users that possess the correct sets

of attributes that can satisfy Ti at Li are granted access to symmetric key ski, hence, can
derive {ski+1, . . . , skk} and are able to decrypt parts {EFi, EFi+1, . . . , EFk}.

As shown in Fig. 2.3, an access tree Ti may consist of non-leaf nodes and leaf nodes. The
non-leaf nodes are threshold gates, represented as ‘G’, while the leaf nodes are attributes,

represented as ‘A’. The DO may construct access trees from any number and layout of nodes

that satisfy the privacy preferences desired.

One of the advantages of our proposed privilege-based access structure is the ability to

reduce attribute replication when deﬁning the hierarchy. Data users that possess attributes

that can satisfy access tree Ti are granted access to ski, however, they do not need to

17

GGGAAAAGGGGAAAAGGAAAA……sk1=setToRandom()<latexit sha1_base64="oOUz7E89dRxZbS2h1/b1VkuMpVI=">AAACBnicbVDLSsNAFJ34rPUVdSnCYBHqpiQq6EYouHFZpS9oQ5hMJu3QyUyYmQgldOXGX3HjQhG3foM7/8ZpmoW2HrhwOOde7r0nSBhV2nG+raXlldW19dJGeXNre2fX3ttvK5FKTFpYMCG7AVKEUU5ammpGuokkKA4Y6QSjm6nfeSBSUcGbepwQL0YDTiOKkTaSbx+pke/Ca9iPkR6qKFNEN8U94qGIq6cT3644NScHXCRuQSqgQMO3v/qhwGlMuMYMKdVznUR7GZKaYkYm5X6qSILwCA1Iz1COYqK8LH9jAk+MEsJISFNcw1z9PZGhWKlxHJjO/Np5byr+5/VSHV15GeVJqgnHs0VRyqAWcJoJDKkkWLOxIQhLam6FeIgkwtokVzYhuPMvL5L2Wc09r7l3F5W6U8RRAofgGFSBCy5BHdyCBmgBDB7BM3gFb9aT9WK9Wx+z1iWrmDkAf2B9/gAwtJhB</latexit>T1atL1<latexit sha1_base64="uPinF5XVecQB9bG9WG/3e1uy+no=">AAACEHicbVA9SwNBEN2LXzF+RS1tFoNoFe5U0DJgY2ERIV+QHMfeZi9Zsrd37M6J4Uj+gY1/xcZCEVtLO/+Nm+QQTXww8Hhvhpl5fiy4Btv+snJLyyura/n1wsbm1vZOcXevoaNEUVankYhUyyeaCS5ZHTgI1ooVI6EvWNMfXE385h1TmkeyBsOYuSHpSR5wSsBIXvG4ExLoUyLS2shz8LgD7B5SAqMx/nFujOMVS3bZngIvEicjJZSh6hU/O92IJiGTQAXRuu3YMbgpUcCpYKNCJ9EsJnRAeqxtqCQh0246fWiEj4zSxUGkTEnAU/X3REpCrYehbzonR+p5byL+57UTCC7dlMs4ASbpbFGQCAwRnqSDu1wxCmJoCKGKm1sx7RNFKJgMCyYEZ/7lRdI4LTtnZef2vFSxszjy6AAdohPkoAtUQdeoiuqIogf0hF7Qq/VoPVtv1vusNWdlM/voD6yPb3H9nWg=</latexit>T2atL2<latexit sha1_base64="kRo6BQBsEnkYz0qsnSDk1rtvOIw=">AAACEHicbVDLSsNAFJ34rPUVdelmsIiuSlIFXRbcuHBRoS9oQ5hMp+3QyYOZG7GE9A/c+CtuXCji1qU7/8ZJG0RbD1w4nHMv997jRYIrsKwvY2l5ZXVtvbBR3Nza3tk19/abKowlZQ0ailC2PaKY4AFrAAfB2pFkxPcEa3mjq8xv3TGpeBjUYRwxxyeDgPc5JaAl1zzp+gSGlIiknroVPOkCu4eEQDrBP86NdlyzZJWtKfAisXNSQjlqrvnZ7YU09lkAVBClOrYVgZMQCZwKlha7sWIRoSMyYB1NA+Iz5STTh1J8rJUe7odSVwB4qv6eSIiv1Nj3dGd2pJr3MvE/rxND/9JJeBDFwAI6W9SPBYYQZ+ngHpeMghhrQqjk+lZMh0QSCjrDog7Bnn95kTQrZfusbN+el6pWHkcBHaIjdIpsdIGq6BrVUANR9ICe0At6NR6NZ+PNeJ+1Lhn5zAH6A+PjG3UfnWo=</latexit>sk2=h(sk1)<latexit sha1_base64="7lhqSbtLGmpY85e4U2ovz6S08Ys=">AAAB9XicbZBNSwMxEIZn/az1q+rRS7AI9VJ2q6AXoeDFYwX7Ae1asulsG5rNLklWKaX/w4sHRbz6X7z5b0zbPWjrC4GHd2aYyRskgmvjut/Oyura+sZmbiu/vbO7t184OGzoOFUM6ywWsWoFVKPgEuuGG4GtRCGNAoHNYHgzrTcfUWkey3szStCPaF/ykDNqrPWgh90KuSaDkgXvrFsoumV3JrIMXgZFyFTrFr46vZilEUrDBNW67bmJ8cdUGc4ETvKdVGNC2ZD2sW1R0gi1P55dPSGn1umRMFb2SUNm7u+JMY20HkWB7YyoGejF2tT8r9ZOTXjlj7lMUoOSzReFqSAmJtMISI8rZEaMLFCmuL2VsAFVlBkbVN6G4C1+eRkalbJ3XvbuLopVN4sjB8dwAiXw4BKqcAs1qAMDBc/wCm/Ok/PivDsf89YVJ5s5gj9yPn8AVV+RCg==</latexit>TkatLk<latexit sha1_base64="7/UJ++HPGFL9gFqLxzziQ2aKoNY=">AAACEHicbVA9SwNBEN2LXzF+RS1tFoNoFe5U0DJgY2ERIV+QC8feZpMs2ds7dufEcFz+gY1/xcZCEVtLO/+Nm+QQTXww8Hhvhpl5fiS4Btv+snJLyyura/n1wsbm1vZOcXevocNYUVanoQhVyyeaCS5ZHTgI1ooUI4EvWNMfXk385h1TmoeyBqOIdQLSl7zHKQEjecVjNyAwoEQktdQb4rEL7B4SAukY/zg3xvGKJbtsT4EXiZOREspQ9YqfbjekccAkUEG0bjt2BJ2EKOBUsLTgxppFhA5Jn7UNlSRgupNMH0rxkVG6uBcqUxLwVP09kZBA61Hgm87JkXrem4j/ee0YepedhMsoBibpbFEvFhhCPEkHd7liFMTIEEIVN7diOiCKUDAZFkwIzvzLi6RxWnbOys7tealiZ3Hk0QE6RCfIQReogq5RFdURRQ/oCb2gV+vRerberPdZa87KZvbRH1gf3yfAndw=</latexit>skk=h(skk 1)<latexit sha1_base64="coNuokSLmoRntbP+9HhlFnSab/4=">AAAB+3icbZDLSsNAFIZPvNZ6i3XpZrAIdWFJVNCNUHDjsoK9QBvCZDpth0wmYWYilpBXceNCEbe+iDvfxmmbhbb+MPDxn3M4Z/4g4Uxpx/m2VlbX1jc2S1vl7Z3dvX37oNJWcSoJbZGYx7IbYEU5E7Slmea0m0iKo4DTThDeTuudRyoVi8WDniTUi/BIsCEjWBvLtysq9EN0g8Y1A1l45uanvl116s5MaBncAqpQqOnbX/1BTNKICk04VqrnOon2Miw1I5zm5X6qaIJJiEe0Z1DgiCovm92eoxPjDNAwluYJjWbu74kMR0pNosB0RliP1WJtav5X66V6eO1lTCSppoLMFw1TjnSMpkGgAZOUaD4xgIlk5lZExlhiok1cZROCu/jlZWif192Lunt/WW04RRwlOIJjqIELV9CAO2hCCwg8wTO8wpuVWy/Wu/Uxb12xiplD+CPr8wctn5Ms</latexit>possess attributes that can also satisfy the access trees {Ti+1, . . . ,Tk} in order to obtain
{ski+1, . . . , skk}. This helps simplify the process of deﬁning a hierarchy as the number
of users or access constraints grow. With reduced-size access trees, we can greatly reduce

the computational complexity when encrypting ﬁle partitions, generating private keys for

privileged users and decrypting ciphertexts.

2.3.3 The Proposed P-MOD Construction

The scheme is based on the construction presented in [6] and formally divides the process

into four main functions:

Setup(1κ): This is a probabilistic function carried out by the key-issuer. The Setup function
takes a security parameter κ and randomly chooses values α, β ∈ Zp. The outputs of this

function are public key P K and master key M K deﬁned as:

P K = {G0, g, B = gβ, e(g, g)α},
M K = {β, gα}.

(2.3)

(2.4)

KeyGen(M K, Aj): This is a probabilistic function carried out by the key-issuer. The inputs
to this function are M K generated by the Setup function, and the attribute set Aj of jth
data user, where Aj,u ∈ Aj represents the uth attribute within the set. The KeyGen function
outputs a unique private key SKj for the data user. In order to guarantee a unique SKj,
it generates a random value rj ∈ Zp and incorporates it within the private key. Based on
the number of attributes in the input set Aj, the KeyGen function also generates a random

18

value rj,u ∈r Zp for each attribute within the set. The SKj is deﬁned as:
SKj =(cid:0)Dj = g(α+rj )/β,
(cid:1).
{Dj,u = grj · h(u)rj,u, D(cid:48)j,u = grj,u |∀Aj,u ∈ Aj}

(2.5)

The purpose of the randomly selected rj is to ensure that each SKj is unique and the

attribute components within the SKj are associated. It should be infeasible for data users

to collude by combining components of their private keys (Dj,u and D(cid:48)j,u) to decrypt data

beyond their individual access rights. This means, the attribute components from diﬀerent

private keys cannot be combined to access unauthorized data.

Encrypt(P K,ski,Ti): This is a probabilistic function carried out by the DO to encrypt the
symmetric keys that are to be shared with the privileged data users. The inputs to this

function are P K, the public key generated by the Setup function, the symmetric key ski

derived in equation (2.2) representing the data that will be encrypted, and the access tree

Ti that deﬁnes the authorized set of attributes at Li. The output of this function is the
encrypted symmetric key eski.

run, the Encrypt function chooses a polynomial q

For {sk1, . . . , skk}, the Encrypt function will run k times, once for each ski. At each
l − 1 for each node
xi
xi
l ∈ Ti. The process of assigning polynomials to each xi
l occurs in a top-bottom approach
starting from the root node in Ti. The Encrypt function chooses a secret si ∈ Zp and sets
(0) = si. Next, it randomly chooses the remaining points of the polynomial

the value of q

with degree d

= k

xi
l

xi
l

xi
1

to completely deﬁne it. For any other node xi

(0) = qparent(index(xi

l)), where qparent is the parent node polynomial of xi

q

xi
l

l ∈ Ti, the Encrypt function sets the value
l. The remaining

points of those polynomials are then randomly chosen.

19

Let Xi be the set of leaf nodes in Ti. The encrypted symmetric key eski at Li is then

constructed as:

eski =(cid:0)

Ti, ˜Ci = ski · e(g, g)α·si, Ci = gβ·si,
|∀xi

= h(xi
l)

, C(cid:48)xi

q
xi
l

q
xi
l

= g

(0)

(0)

l

{C

xi
l

l ∈ Xi}

(cid:1).

(2.6)

Decrypt(eski, SKj): This is a deterministic function carried out by the data user. The

inputs to this function are an encrypted symmetric key eski corresponding to Li, and the
private key SKj of the jth data user.

The Decrypt function operates in a recursive manner propagating through the nodes in

Ti by calling a recursive function deﬁned as DecryptNode(eski, SKj, xi
function are eski, SKj and a node xi

l within Ti. If the attributes incorporated within SKj

l). The inputs to this

satisfy the rules within Ti, the data user can decrypt eski and obtain ski.

DecryptNode(eski, SKj, xi

l) performs diﬀerently depending on whether xi
l
l) /∈ Aj, DecryptNode(eski, SKj, xi

l is a leaf node and att(xi

is a leaf or

l) returns ∅,

non-leaf node. If xi

otherwise, att(xi

l) = Aj,u ∈ Aj and DecryptNode(eski, SKj, xi

l) is deﬁned as:

DecryptNode(eski, SKj, xi

l) =

e(Dj,u, C
xi
l
e(D(cid:48)j,u, C(cid:48)xi

l

)

)

=

q
xi
l

(0)

)

(0)

)

e(grj · h(u)rj,u, g
q
xi
e(grj,u, h(xi
l)
l
(0)

rj·q

xi
l

= e(g, g)

.

(2.7)

If xi

l is a non-leaf node, DecryptNode(eski, SKj, xi

l) operates recursively. For each node

zi
l,c that is a child of xi

l, DecryptNode(eski, SKj, zi

l,c) is computed and the output is stored

in F

.

zi
l,c

20

This recursive function is based on Lagrange interpolation. The Lagrange coeﬃcient

∆a,A

j

for a ∈ Zp and the set of attributes Aj is deﬁned as:

(cid:89)
k∈A

j ,k(cid:54)=a

∆a,A

j

(x) =

x − k
a − k

.

(2.8)

Let A

j,xi
l

be an arbitrary k

xi
l

-sized set of child nodes zi

set exists then the function returns F

zi
l,c

= ∅. Otherwise, F

zi
l,c

interpolation as follows:

l,c such that F

l,c (cid:54)= ∅. If no such
zi
is computed using Lagrange

(0)(cid:33)∆

zi
l,c

(0)

a,A(cid:48)j,xi

l

rj·q

l,c))∆

(index(zi

(0)

a,A(cid:48)j,xi

l

parent(zi

l,c

)

(2.9)

rj·q

(0)

∆

a,A(cid:48)j,xi

l

F

zi
l,c

e(g, g)

(cid:32)
e(g, g)

(cid:89)
l,c∈A
zi
(cid:89)
l,c∈A
zi
(cid:89)
l,c∈A
zi
(cid:89)
l,c∈A
zi

j,xi
l

j,xi
l

j,xi
l

=

F

xi
l

=

=

=

j,xi
l
rj·q

xi
l

(0)

,

= e(g, g)

rj·q

xi
l

e(g, g)

(0)

(i)·∆

a,A(cid:48)j,xi

l

where a = index(zi

l,c) and A(cid:48)j,xi

l

= {index(zi

l,c) : zi

l,c ∈ A

l}.
j,xi

21

If the attributes in Aj satisfy Ti, then the following is computed at the root node xi
1.

Ri = DecryptNode(eski, SKj, xi
1)

= e(g, g)

rj·q

xi
1

(0)

(2.10)

= e(g, g)rj·si.

To obtain ski from the result derived in equation (2.10), we compute the following:

˜Ci

e(Ci,Dj )

Ri

=

(α+rj )/β(cid:19)
(cid:18)
ski · e(g, g)α·si
gβ·si ,g

e

e(g,g)

rj·si

= ski.

(2.11)

At this point, the data user can simply decrypt EFi using the ski derived from equation

(2.11) to obtain the plaintext Fi as follows:

Fi = Decski

(EFi).

(2.12)

If the data user is interested in attaining data ﬁles {Fi+1 . . . Fk} belonging to levels
{Li+1 . . .Lk} respectively, the user can compute the symmetric keys {ski+1, . . . , skk} of
the lower-level using the derived ski as previously discussed in equation (2.2). Finally, any

{EFi+1 . . . EFk} at the lower-levels can be decrypted as in equation (2.12).

2.4 Security Analysis

In this section, a formal proof of security for P-MOD is presented.

It is assumed that a

symmetric encryption technique such as AES is used to secure each data ﬁle Fi ∈ F. It
is also assumed that the process of attribute authentication between a data user and the

22

key-issuer, in order for the data user to obtain a private key, is secure and eﬃcient.

Theorem 1. P-MOD is secure against unprivileged accesses assuming the hash function is

collusion resistant.

Proof. Let Aj = {Aj,1, Aj,2 . . . Aj,n} be a set of attributes possessed by user DUj and Ti an
access tree at level Li of a hierarchy. If Aj ∈ Ti, the user can obtain ski as discussed in equa-
tions (2.7-2.11). Given hash function h, the user cannot derive {sk1 . . . ski−1}. Therefore,
P-MOD is immune to unprivileged accesses.

Following the work presented in [43], we also provide a security proof based on ciphertext

indistinguishability that proves that the adversary is not able to distinguish pairs of cipher-

texts. A cryptosystem is considered to be secure under this property if the probability of an

adversary to identify a data ﬁle that has been randomly selected from a two-element data

ﬁle chosen by the adversary and encrypted does not signiﬁcantly exceed 1

2. We ﬁrst present

an Indistinguishability under Chosen-Plaintext Attack (IND-CPA) security game [45]. Next,

based on the IND-CPA security game, a formal proof of security is provided for P-MOD.

IND-CPA is a game used to test for the security of asymmetric key encryption algorithms.

In this game, the adversary is modeled as a probabilistic polynomial-time algorithm. The

algorithms in the game must be completed and the results returned within a polynomial

number of time steps. The adversary will choose to be challenged on an encryption under

a leveled access tree T ∗. The adversary can impersonate any data user and request many
private keys SKj. However, the game rules require that any attribute set Aj that the
adversary claims to possess does not satisfy T ∗. The security game is divided into the
following steps:

23

Initialization: The adversary selects an access tree T ∗ to be challenged against and com-
mits to it.

Setup: The challenger runs the Setup function and sends the public key P K to the adver-

sary.

Phase 1: The adversary requests multiple private keys (SK1, . . . , SKq1) corresponding to
q1 diﬀerent sets of attributes (A1, . . . , Aq1).

Challenge: The adversary submits two equal length data ﬁles F0 and F1 to the challenger.
The adversary also sends T ∗ such that none of (SK1, . . . , SKq1) generated from Phase 1
contain correct sets of attributes that satisfy it. The challenger ﬂips a coin µ randomly and

encrypts Fµ under T ∗. Finally, the challenger sends the ciphertext CT∗ generated according
to equation (2.6) to the adversary.

Phase 2: Repeat phase 1 with the restriction that none of the newly generated private
keys (SKq1+1, . . . , SKq) corresponding to the diﬀerent sets of attributes (Aq1+1, . . . , Aq)
contain correct sets of attributes that satisfy T ∗.

Guess: The adversary outputs a guess µ(cid:48) of µ. The adversary wins the security game if

µ(cid:48) = µ and loses otherwise.

Deﬁnition 1 (Secure against adaptively chosen plaintext attack.). P-MOD is said to be

secure against an adaptively chosen plaintext attack if any polynomial-time adversary has

only a negligible advantage in the security game, where the advantage is deﬁned as Adv =

P r[µ(cid:48) = µ] − 1
2.

24

The security of P-MOD is reduced to the hardness of the DBDH problem. Based on

Theorem 1 and by proving that a single eski at any Li is secure, the whole system is proved
to be secure since all ciphertexts at any level follow the same rules.

Theorem 2. P-MOD is secure against adaptively chosen plaintext attack if the DBDH

assumption holds.

Proof. Assume there is an adversary that has non-negligible advantage ε = AdvA. We
construct a simulator that can distinguish a DBDH element from a random element with

advantage ε. Let e : G0 × G0 → G1 be an eﬃciently computable bilinear map and G0 is

of prime order p with generator g. The DBDH challenger begins by selecting the random

parameters: a, b, c ∈r Zp. Let g ∈ G0 be a generator and T is deﬁned as T = e(g, g)abc if
µ = 0, and T = R otherwise, where µ ∈r {0, 1} and R ∈r G1. The simulator acts as the

challenger in the following game:

Initialization: The simulator accepts the DBDH challenge requested by the adversary

who selects the T ∗.

Setup: The simulator runs the Setup function. It chooses a random α∗ ∈r Zp and computes
the value α = α∗ + ab. Next, it simulates e(g, g)α ← e(g, g)α∗+ab = e(g, g)α∗e(g, g)ab and
B = gβ ← gb, where b represents a simulation of the value β. Finally, it sends all components
of P K = {G0, g, B = gβ, e(g, g)α} to the adversary.

Phase 1:
In this phase, the adversary requests multiple private keys (SK1, . . . , SKq1)
corresponding to q1 diﬀerent sets of attributes (A1, . . . , Aq1). After receiving an SKj query
for a given set Aj where Aj /∈ T ∗ (i.e. ∀Aj,u ∈ Aj does not satisfy T ∗), the simulator

25

chooses a random r(cid:48)j ∈r Zp and deﬁnes rj = r(cid:48)j − b. Next, it simulates Dj = g(α+rj )/β ←
. Then, ∀Aj,u ∈ Aj, it selects a random rj,u ∈r Zp and
gα/βgrj /β = g(α∗+ab)/bg
simulates Dj,u = grj · h(u)rj,u ← g
h(u)rj,u and D(cid:48)j,u ← grj,u. Finally, the simulated
values of SKj = (Dj,{Dj,u, D(cid:48)j,u | rj,u ∈r Zp, ∀Aj,u ∈ Aj}) are sent to the adversary.

(r(cid:48)j−b)/b

r(cid:48)j−b

Challenge: The adversary sends two plaintext data ﬁles sk0 and sk1 to the simulator who

randomly chooses a µ ∈r {0, 1} by ﬂipping a coin to select one of the ﬁles. The simulator then
runs the Encrypt function and derives a ciphertext CT∗. It simulates ˜C = skµ·e(g, g)α·skµ ←
skµ · e(g, g)(α∗+ab)c = skµ · T e(g, g)α∗c, where c represents a simulation of the value skµ and
T = e(g, g)abc. Next, it simulates C = gβ·skµ ← gbc. Finally, for each attribute x ∈ X∗ (set
of leaf nodes in T ∗) it computes Cx = gqx(0) and C(cid:48)x = h(x)qx(0). The simulated values of
CT∗ = {T ∗, ˜C, C,∀x ∈ X∗ : Cx, C(cid:48)x} are then sent to the adversary.

Phase 2: Repeat Phase 1 with the restriction that the requested private keys are associated

with attribute sets such that, ∀Aj | q1 + 1 ≤ j ≤ q and Aj /∈ T ∗.

Guess: The adversary tries to guess the value µ.

If the adversary guesses the correct

value, the simulator outputs 0 to indicate that T = e(g, g)abc, or 1 to indicate that T = R,
a random group element in G1.

Given a simulator A, if T = e(g, g)abc, then CT∗ is a valid ciphertext, Adv = ε and

(cid:104)

(cid:16)

g, ga, gb, gc, T = e(g, g)abc(cid:17)

(cid:105)

P r

A

= 0

=

1
2

+ ε.

(2.13)

26

If T = R then ˜C is nothing more than a random value to the adversary. Therefore,

g, ga, gb, gc, T = R

= 0

=

1
2

.

(2.14)

(cid:104)

(cid:16)

P r

A

(cid:17)

(cid:105)

From equation (2.13) and equation (2.14), we can conclude that

(cid:12)(cid:12)(cid:12)P r

(cid:104)

A

(cid:16)

g, ga, gb, gc, T = e(g, g)abc(cid:17)

(cid:105)

= 0

(cid:16)

(cid:104)

A

− P r

(cid:17)

(cid:105)(cid:12)(cid:12)(cid:12) = ε. (2.15)

= 0

g, ga, gb, gc, T = R

Therefore, the simulator plays the DBDH game with a non-negligible advantage and the

proof is complete.

2.5 Performance Analysis

In this section, we present a performance analysis for P-MOD and compare it with three

existing schemes, CP-ABE [6], HABE [24] and FH-CP-ABE [31].

2.5.1 Traditional CP-ABE in a Hierarchical Setting

CP-ABE [6] handles the sharing of independent pieces of data based on independent access

policies. It was not designed to support a privilege-based access structure (i.e. hierarchical

organization). Therefore, to adapt CP-ABE to a privilege-based access structure, the Encrypt

function runs once for each level. However, if it were to be used in a hierarchical organization,

there would be a trade-oﬀ between the key management and the complexity of the encryption

and decryption processes. Fig. 2.4 shows the two general cases in which CP-ABE is used to

share data with users in a hierarchical organization.

In case (1), key management is favored over encryption and decryption complexities. The

27

Figure 2.4: CP-ABE used in a hierarchical organization.

individuals at each level possess attributes that deﬁne their privileges and the private keys

to encrypt the data ﬁles. The size of each private key is therefore optimized and the key

management process becomes less resource-intensive. However, since levels are independent,

attributes must be repeatedly incorporated into each access tree. As a result, the sizes of

the access trees at lower-levels will increase, as shown in case (1) of Fig. 2.4. For complex

organizations with a large number of levels, the access trees will become even larger. This

results in an increase in encryption and decryption complexities.

On the other hand, case (2) favors minimizing encryption and decryption complexities

over key management. Each data ﬁle is encrypted under an access policy with a set of

unique attributes at each level without considering privileges and relationships. This results

in simpler access trees at each level of the hierarchy and therefore lower encryption and

decryption complexities. However, to grant access to an individual at a speciﬁc level, the

key-issuer must generate a private key for that individual that incorporates the attributes

at that level and all the levels below. Complex hierarchies that include a large number of

28

OROROR……Case (1)GA1A2GA3A4GAnAm…..GA1A2GA3A4A1A2GGG……C1Cp…B1Bm…A1An…Case (2)T1atL1<latexit sha1_base64="68ef8QjNBFjq0e36R9BA/D3UFQc=">AAACEHicbVDLSsNAFJ34rPVVdelmsIiuSqIFXRbcuHBRoS9oQplMJ+3QySTM3IglpH/gxl9x40IRty7d+TdOH4i2HrhwOOde7r3HjwXXYNtf1tLyyuraem4jv7m1vbNb2Ntv6ChRlNVpJCLV8olmgktWBw6CtWLFSOgL1vQHV2O/eceU5pGswTBmXkh6kgecEjBSp3DihgT6lIi0lnUcPHKB3UNKIBvhH+fGOJ1C0S7ZE+BF4sxIEc1Q7RQ+3W5Ek5BJoIJo3XbsGLyUKOBUsCzvJprFhA5Ij7UNlSRk2ksnD2X42ChdHETKlAQ8UX9PpCTUehj6pnN8pJ73xuJ/XjuB4NJLuYwTYJJOFwWJwBDhcTq4yxWjIIaGEKq4uRXTPlGEgskwb0Jw5l9eJI2zknNecm7LxUp5FkcOHaIjdIocdIEq6BpVUR1R9ICe0At6tR6tZ+vNep+2LlmzmQP0B9bHN3MxnWw=</latexit>T2atL2<latexit sha1_base64="NUq61cVsamsV069mXmj7UXx6Fgk=">AAACEHicbVC7SgNBFJ2Nrxhfq5Y2g0G0CrsxoGXAxsIiQl6QhGV2MpsMmX0wc1cMS/IHNv6KjYUitpZ2/o2zySKaeODC4Zx7ufceNxJcgWV9GbmV1bX1jfxmYWt7Z3fP3D9oqjCWlDVoKELZdoliggesARwEa0eSEd8VrOWOrlK/dcek4mFQh3HEej4ZBNzjlICWHPO06xMYUiKS+sQp42kX2D0kBCZT/OPcaMcxi1bJmgEvEzsjRZSh5pif3X5IY58FQAVRqmNbEfQSIoFTwSaFbqxYROiIDFhH04D4TPWS2UMTfKKVPvZCqSsAPFN/TyTEV2rsu7ozPVIteqn4n9eJwbvsJTyIYmABnS/yYoEhxGk6uM8loyDGmhAqub4V0yGRhILOsKBDsBdfXibNcsk+L9m3lWK1ksWRR0foGJ0hG12gKrpGNdRAFD2gJ/SCXo1H49l4M97nrTkjmzlEf2B8fAN2U51u</latexit>TkatLk<latexit sha1_base64="+56lKINgOIZJ0X7/VRUMFZEhcag=">AAACEHicbVA9SwNBEN3zM8avU0ubxSBahTsNaBmwsbCIkC9IwrG32SRL9vaO3TkxHMk/sPGv2FgoYmtp579xLzlEEx8MPN6bYWaeHwmuwXG+rKXlldW19dxGfnNre2fX3tuv6zBWlNVoKELV9IlmgktWAw6CNSPFSOAL1vCHV6nfuGNK81BWYRSxTkD6kvc4JWAkzz5pBwQGlIikOvaGeNIGdg8JgfEE/zg3xvHsglN0psCLxM1IAWWoePZnuxvSOGASqCBat1wngk5CFHAq2DjfjjWLCB2SPmsZKknAdCeZPjTGx0bp4l6oTEnAU/X3REICrUeBbzrTI/W8l4r/ea0YepedhMsoBibpbFEvFhhCnKaDu1wxCmJkCKGKm1sxHRBFKJgM8yYEd/7lRVI/K7rnRfe2VCiXsjhy6BAdoVPkogtURteogmqIogf0hF7Qq/VoPVtv1vusdcnKZg7QH1gf3yj0neA=</latexit>T1atL1<latexit sha1_base64="68ef8QjNBFjq0e36R9BA/D3UFQc=">AAACEHicbVDLSsNAFJ34rPVVdelmsIiuSqIFXRbcuHBRoS9oQplMJ+3QySTM3IglpH/gxl9x40IRty7d+TdOH4i2HrhwOOde7r3HjwXXYNtf1tLyyuraem4jv7m1vbNb2Ntv6ChRlNVpJCLV8olmgktWBw6CtWLFSOgL1vQHV2O/eceU5pGswTBmXkh6kgecEjBSp3DihgT6lIi0lnUcPHKB3UNKIBvhH+fGOJ1C0S7ZE+BF4sxIEc1Q7RQ+3W5Ek5BJoIJo3XbsGLyUKOBUsCzvJprFhA5Ij7UNlSRk2ksnD2X42ChdHETKlAQ8UX9PpCTUehj6pnN8pJ73xuJ/XjuB4NJLuYwTYJJOFwWJwBDhcTq4yxWjIIaGEKq4uRXTPlGEgskwb0Jw5l9eJI2zknNecm7LxUp5FkcOHaIjdIocdIEq6BpVUR1R9ICe0At6tR6tZ+vNep+2LlmzmQP0B9bHN3MxnWw=</latexit>T2atL2<latexit sha1_base64="NUq61cVsamsV069mXmj7UXx6Fgk=">AAACEHicbVC7SgNBFJ2Nrxhfq5Y2g0G0CrsxoGXAxsIiQl6QhGV2MpsMmX0wc1cMS/IHNv6KjYUitpZ2/o2zySKaeODC4Zx7ufceNxJcgWV9GbmV1bX1jfxmYWt7Z3fP3D9oqjCWlDVoKELZdoliggesARwEa0eSEd8VrOWOrlK/dcek4mFQh3HEej4ZBNzjlICWHPO06xMYUiKS+sQp42kX2D0kBCZT/OPcaMcxi1bJmgEvEzsjRZSh5pif3X5IY58FQAVRqmNbEfQSIoFTwSaFbqxYROiIDFhH04D4TPWS2UMTfKKVPvZCqSsAPFN/TyTEV2rsu7ozPVIteqn4n9eJwbvsJTyIYmABnS/yYoEhxGk6uM8loyDGmhAqub4V0yGRhILOsKBDsBdfXibNcsk+L9m3lWK1ksWRR0foGJ0hG12gKrpGNdRAFD2gJ/SCXo1H49l4M97nrTkjmzlEf2B8fAN2U51u</latexit>TkatLk<latexit sha1_base64="+56lKINgOIZJ0X7/VRUMFZEhcag=">AAACEHicbVA9SwNBEN3zM8avU0ubxSBahTsNaBmwsbCIkC9IwrG32SRL9vaO3TkxHMk/sPGv2FgoYmtp579xLzlEEx8MPN6bYWaeHwmuwXG+rKXlldW19dxGfnNre2fX3tuv6zBWlNVoKELV9IlmgktWAw6CNSPFSOAL1vCHV6nfuGNK81BWYRSxTkD6kvc4JWAkzz5pBwQGlIikOvaGeNIGdg8JgfEE/zg3xvHsglN0psCLxM1IAWWoePZnuxvSOGASqCBat1wngk5CFHAq2DjfjjWLCB2SPmsZKknAdCeZPjTGx0bp4l6oTEnAU/X3REICrUeBbzrTI/W8l4r/ea0YepedhMsoBibpbFEvFhhCnKaDu1wxCmJkCKGKm1sxHRBFKJgM8yYEd/7lRVI/K7rnRfe2VCiXsjhy6BAdoVPkogtURteogmqIogf0hF7Qq/VoPVtv1vusdcnKZg7QH1gf3yj0neA=</latexit>attributes, could result in complicated key management. As a result, private keys will require

incorporating a large number of attributes.

2.5.2 Computational Cost

We formulate the encryption and decryption costs based on the number of group operations
for groups G0, G1 respectively and the number of bilinear mapping operations e

, fG

fG

0

1

involved in the Encrypt and the Decrypt functions for each scheme. Table 2.2 summarizes

the number of operations for each scheme.

2.5.2.1 Encryption Cost

Encryption cost is measured as the number of basic operations involved in generating the

ciphertext from the plaintext. It is formulated based on the Encrypt function that involves

group operations fG

and fG

.

1

0

The number of operations involved in sharing an independent piece of data using CP-ABE

and fG

is (2|X| + 1) and 2 for fG
respectively, where |X| denotes the number of leaf nodes
(attributes) of the access tree T . For a hierarchical organization, we present the encryption
complexity of case (1) in Fig. 2.4 as it involves a relationship between all levels. In a real-life

0

1

application, case (2) would not satisfy a hierarchical organization as it requires attributes

to be shared by all users regardless of which level they belong to. As shown in Table 2.2,

the number of operations involved in the encryption process is formulated as (2(|X1| +··· +
|Xk|) + k) and 2k for fG
respectively, where |X1|,|X2|,··· ,|Xk| are the number of
leaf nodes (attributes) associated with access trees T1,T2,··· ,Tk respectively. However, the
reuse of attributes needed at each level in this scheme increases the computational complexity,

and fG

0

1

making it an overall ineﬃcient solution for hierarchical organizational structures.

29

Similarly, P-MOD generates a ciphertext for each level of the hierarchy. The number of

operations involved is (2(|Y1| + ··· + |Yk|) + k) and 2k for fG
|Y1|,|Y2|,··· ,|Yk| are the number of leaf nodes (attributes) associated with access trees
T1,T2,··· ,Tk respectively. However, P-MOD leverages a privilege-based access structure as
discussed in Section 2.3.2. This results in smaller sized and level-speciﬁc access trees that

respectively, where

and fG

0

1

minimize the number of attributes where, |Yi| < |Xi|,∀i ∈ k.

When comparing P-MOD with hierarchical schemes such as HABE and FH-CP-ABE,

P-MOD minimizes the overall number of operations. This is because the encryption process

for schemes such as HABE and FH-CP-ABE involve more complex hierarchies and access

trees that contain all the access policies for all the levels. On the contrary, P-MOD involves

smaller access trees, each one limited to level-speciﬁc attributes and policies.

In HABE, a single hierarchy represents the root master, domain masters, users, and at-

tributes. This may result in complex hierarchies as the number of users increases. Therefore,

the size of |X| may end up being large making the encryption process expensive. Similarly,
FH-CP-ABE [31] uses a single access tree T to encrypt all data ﬁles to be shared. The
number of operations involved are (2|X| + k) and (2v|A
and fG
respectively, where |A
T| is the number of transport nodes (levels), and v is the number of
children nodes associated with a transport node. This may also result in large sets |X|, |A
T|

T| + 2k) for operations fG

1

0

and v and lead to an expensive encryption process.

2.5.2.2 Decryption Cost

Decryption cost is measured as the number of basic operations involved in decrypting the

ciphertext into plaintext. It is formulated based on the Decrypt function that involves bilinear

operations e and group operations fG

.

1

30

e and fG
1

For a single Decrypt run, CP-ABE [6] involves (2|Aj|) and (2|S|+2) number of operations
respectively, where |Aj| is the number of attributes possessed by the jth data user
and |S| is the least number of interior nodes that satisfy T . However, to adapt CP-ABE to a
privilege-based access structure, the Decrypt function is run as many times as the number of

ciphertexts a data user wishes to decrypt. The number of operations involved are k(2|Aj|+1)
and (2[|S1| +··· +|Sk|] + 2k) for operations e and fG
respectively, where |S1|,|S2|,··· ,|Sk|
are the least number of interior nodes that satisfy the access trees T1,T2,··· ,Tk respectively.
The number of operations involved in the decryption process for P-MOD is similar to a

1

single CP-ABE decryption run. P-MOD needs to run the Decrypt function only one time,

even if the data user needs to obtain more than one data ﬁle. The Decrypt function requires

(2|Aj|) and (2|S| + 2) for operations e and fG

1

respectively. Once the data user successfully

decrypts the ciphertext (obtains the symmetric key at his/her level), the user can derive the

remaining lower-level keys as described in equation (2.2). The complexity of the operations

involved in deriving the symmetric keys and decrypting ciphertexts at the lower-levels are

negligible in comparison with the group and bilinear operations involved in running the

Decrypt function, and therefore could be ignored.

When comparing P-MOD to HABE, the e number of operations involved are slightly

similar. However, the fG

number of operations required are |Aj−1| and 2|S|+2 respectively.

1

Therefore, for both schemes, encryption complexity would greatly depend on the number of

attributes possessed by the user and the access tree generated during encryption.

FH-CP-ABE [31] aims to satisfy a certain transport node (level) of the single access tree

T allowing the data user to decrypt certain encrypted ﬁles up to that level. The number
of operations involved in this process are (2|Aj| + 1) and (2|S| + v|A
T| + 2k) for operations
respectively. Again, since FH-CP-ABE uses a single access tree T , the number

e and fG

1

31

Encrypt

Function Op.
fG0
fG1
e
fG1

Decrypt

Table 2.2: P-MOD comparison of number of operations.

CP-ABE [6]

HABE [24, 25] FH-CP-ABE [31]

P-MOD

2(|X1| + ··· + |Xk|) + k

2k

k(2|Aj| + 1)

2(|S1| + ··· + |Sk|) + 2k

|X|
1
3|Aj|
|Aj| − 1

2|X| + k
2v|A
T| + 2k
2|Aj| + 1

2|S| + v|A

T| + 2k

2(|Y1| + ··· + |Yk|) + k

2

2|Aj|
2|S| + 2

Comp

Table 2.3: P-MOD comparison of private key and ciphertext sizes.
Len
Priv Key LG
0
LG
0
LG

2(|X1| + ··· + |Xk|) + k

HABE [24, 25] FH-CP-ABE [31]

2|Aj| + 1

|Aj|
|X|
1

CP-ABE [6]

Ciphertext

P-MOD

2|Aj| + 1

2(|Y1| + ··· + |Yk|) + k

k

1

k

2|Aj| + 1
2|X| + k
v|A
T| + k

of transport nodes |A
T| can be large, resulting in higher decryption complexity. The least
number of interior nodes |S| that satisfy T can also be large if the construction of the access
tree T is not optimized. Constructing a single access tree that accommodates a large number
of attributes is resource-intensive and could become complicated as the access rules become

more sophisticated. When comparing the decryption costs of P-MOD and FH-CP-ABE,

the decryption cost of P-MOD depends on |Aj| and |S| while the decryption complexity of
FH-CP-ABE depends on |S|, v, |A
T| and k. The size of the sets in FH-CP-ABE will always
be greater than the size of the sets in P-MOD due to the diﬀerent constructions of access

trees in each scheme.

2.5.3 Storage Cost

To evaluate the storage eﬃciency, we formulate the bit-length of the private keys and cipher-

texts generated by each scheme. Table 2.3 represents a comparison of the storage costs in
bit-length for all schemes. The bit-length of a single element in G0, G1 and Zp are denoted

as LG

, LG

1

0

and LZp respectively.

As shown in the table, the bit-length of the private keys for all schemes are similar. In

terms of space complexity, this can be reduced to O(Aj). On the other hand, the size of the

32

ciphertexts diﬀer. The size of a ciphertext is based on the output of the Encrypt function for

each scheme.

For CP-ABE, the total size of all generated ciphertexts from all levels consists of (2(|X1|+
··· + |Xk|) + k) elements from G0 and k elements from G1. Using this scheme, the lower-

levels must accommodate attributes of the higher levels. Ciphertext size can potentially end

up large in size due to attribute replication at each level.

For HABE, the ciphertext consists of |X| elements from G0 and 1 element from G1.

Similar to the discussion in the encryption cost of HABE, the size of X can be large due to

the complexity of how the hierarchy is deﬁned. Likewise, the single ciphertext generated by

T| + k) elements from G1.
FH-CP-ABE [31] consists of (2|X| + k) elements from G0 and (v|A
In this scheme, the ciphertext size depends on |X|, |A
T|, v and k. As the size of these sets
grows, the ciphertext size can grow exponentially based on how the tree T is constructed.
P-MOD generates ciphertexts in a similar approach to those generated by CP-ABE. The

total size of all generated ciphertexts consists of (2[|Y1| + ··· + |Yk|] + k) elements from G0
and k elements from G1. However, the size of the ciphertext generated by P-MOD is shown

to be smaller in size than CP-ABE in all instances. This is based on the composition of

our proposed access structure that does not duplicate attributes, therefore generates smaller

ciphertexts.

2.6 Empirical Results

In this section, the results of various simulations are presented to support the performance

analysis discussed in Section 2.5. P-MOD is implemented and simulated in Java using the

CP-ABE toolkit [46] and the Java Pairing-Based Cryptography library (JPBC) [47]. For

33

comparison, simulations are also conducted for CP-ABE [6] and FH-CP-ABE [31] under

the same conditions as P-MOD. All simulations are conducted on an Intel(R) Core(TM)

i5-4200M at 2.50 GHz and 4.00 GB RAM machine running the Windows 10 OS.

In the simulations, the number of levels k is equivalent to the number of ﬁle partitions

being shared. The total number of diﬀerent user attributes applied to users across all levels

within H is represented as N . Both variables are compared for key generation, encryption,

and decryption time-costs. The experiments include applying the values for k = {3, 6, 9},
to each value of N = {10, 100}, resulting in 6 experimental cases. Within each case, it is
possible to distribute the user attributes among the levels in numerous ways. For example,

when testing the schemes with a hierarchy H for k = 3 and N = 10, there are 36 diﬀerent

ways to distribute the 10 user attributes among the 3 levels, excluding the cases of having
zero user attributes at any level. A possible way of distributing them could be A
= 4, where A
A
L1
L2
the highest level within H and A

represents the set of user attributes possessed by users in

the set of user attributes possessed by users in the lowest

= 3,

L1

= 3 and A

L3

L3

level. As the values of k and N increase, the possible ways of user attribute distribution

among all levels increase. Without loss of generality, we assume that the number of attributes

for each level follows the normal distribution in our simulations.

The data set used in the simulations is sampled from the Census Income data set [42].

The sample consists of 30,163 records, and each record is composed of 9 diﬀerent record

attributes. The records are stored in a Microsoft Excel ﬁle F, with a total size of 2.42MB.
For simulation purposes, it is assumed that data sensitivity is deﬁned over record attributes

(vertical columns), corresponding to the second approach described in Section 2.3.1.

F is partitioned into k sections, where ∀Fi ∈ F and 1 ≤ i ≤ k, each Fi represents one or
more full columns of F. Table 2.4 represents the 9 record attributes distribution of F into

34

each partition Fi, based on the given value for k in each scenario.

Table 2.4: Attribute distribution in each partition Fi.

k F1 F2 F3 F4 F5 F6 F7 F8 F9
-
3
-
6
9
1

-
2
1

-
2
1

-
-
1

-
-
1

-
2
1

2
1
1

3
1
1

4
1
1

All simulations are performed in consideration of users at the most sensitive (highest)

level within the hierarchy. This approach considers the most complicated applications, where

a user needs to gain access to the entire ﬁle. Fig. 2.5 summarizes the time expended by the

three schemes to generate a private key for the user, encrypt all partitions of F and decrypt
all partitions respectively, in all 6 scenarios. By analyzing the ﬁgures, some observations can

be derived. The results are summarized in the following subsections.

2.6.1 Key Generation Time-Cost

To measure the time to generate a private key for a user, the same attribute and level

conditions are applied to all three schemes. P-MOD outperforms CP-ABE and FH-CP-ABE

in all experimental evaluations. As illustrated in Fig. 2.5(a), the time taken to generate

a private key for a user at the highest level in CP-ABE and FH-CP-ABE, is independent

of the value of k and remains nearly constant when N is kept constant, for both values of

N = {10, 100} tested.

In comparison, P-MOD reacts diﬀerently. When the total number of user attributes is

normally distributed among the levels, the time taken to generate a private key by P-MOD

is approximately the reciprocal of the value of k multiplied by the equivalent time taken

by CP-ABE or FH-CP-ABE to perform the same function. For example, the time taken

by CP-ABE and FH-CP-ABE in the case where k = 6 and N = 100 is approximately 2.2

35

Figure 2.5: P-MOD performance comparison: (a) Key generation time, (b) Encryption time,
and (c) Decryption time.

36

3456789Number of levels (k)00.511.522.5Time in ms#104Key GenerationCP-ABE 10AFH-CP-ABE 10AP-MOD 10ACP-ABE 100AFH-CP-ABE 100AP-MOD 100AKey Generation Time in msNumber of levels (k)3456789Number of levels (k)024681012Time in ms#104EncryptionCP-ABE 10AFH-CP-ABE 10AP-MOD 10ACP-ABE 100AFH-CP-ABE 100AP-MOD 100ANumber of levels (k)Encryption Time in ms3456789Number of levels (k)010002000300040005000600070008000Time in msDecryptionCP-ABE 10AFH-CP-ABE 10AP-MOD 10ACP-ABE 100AFH-CP-ABE 100AP-MOD 100ANumber of levels (k)Decryption Time in msseconds. The time taken by P-MOD in the same conditions is approximately 0.35 seconds,

that is nearly 1

6 times the time-cost of the other schemes. Therefore, the time taken to

generate a private key by P-MOD is inversely proportional to the value of k in the hierarchy.

This is true for a constant value N with normally distributed attributes.

Another observation that can be made from Fig. 2.5(a) is the eﬀect of the values N on

the time expended. The time-cost is directly related to the value N and the degree of this

proportional increase is diﬀerent for each scheme as the value N increases.

We deﬁne variable δ as the diﬀerence in time-cost of two experimental evaluations of

the same scheme at N = 10 and N = 100, while keeping k ﬁxed. In Fig. 2.5(a), consider

the simulated values for each scheme at k = 9. Both CP-ABE and FH-CP-ABE result in

large time-cost changes, approximately δCP-ABE = 19039ms and δFH-CP-ABE = 18821ms,

while P-MOD results in a smaller value, δP-MOD = 2204ms. Based on these values, δP-MOD

is approximately 11.7% of δCP-ABE and δFH-CP-ABE, resulting in an approximately 88.3%

improvement. The signiﬁcance of this observation can be seen in applications where N is a

large value. Eﬃciency can be gained in time expenditure by using P-MOD, versus CP-ABE

and FH-CP-ABE.

2.6.2 Encryption Time-Cost

The encryption time-cost is the time it takes each scheme to perform the encryption function

over all partitions of F. Fig. 2.5(b) represents the time expenditure of each scheme under
all six experimental scenarios. P-MOD surpasses both CP-ABE and FH-CP-ABE in every

experimental case. For example, compare the time duration of the three schemes at k = 9

and N = 100. The time duration for CP-ABE is approximately 4.3 times of P-MOD to

perform encryption. Similarly, FH-CP-ABE is approximately 1.2 times of P-MOD to perform

37

encryption.

All schemes follow a direct proportional pattern as the values k and N increase. These

results prove the correctness of the encryption computational complexity analysis presented

in Section 2.5.2.1. The encryption function in all schemes involves a number of fG0

and fG1

operations that are dependent on both the values k and N . However, based on the proposed

hierarchical access structure, P-MOD is able to outperform both CP-ABE and FH-CP-ABE.

In addition to this, the eﬀect of changing the value N is also illustrated clearly in Fig. 2.5(b).

For example, when k = 9, δP-MOD is approximately 21.6% of δCP-ABE and approximately

82% of δFH-CP-ABE.

2.6.3 Decryption Time-Cost

As previously discussed, the experiments are performed from the perspective of a user that

appears at the highest level of the hierarchy. Taking this into account, the decryption

time-cost is deﬁned as the time for the user to successfully decrypt all ciphertexts EFi

corresponding to all partitions Fi, if the user possesses the correct set of user attributes.

Fig. 2.5(c) illustrates the time to perform the decryption function by each scheme. The

decryption function of both CP-ABE and FH-CP-ABE both involve e and fG1

operations

that are dependent on the values k and N . As these values increase, the decryption time-

cost increases linearly for both schemes, proving the correctness of the decryption complexity

analysis in Section 2.5.2.2.

When measuring the decryption time-cost, P-MOD outperforms both CP-ABE and FH-

CP-ABE. This is due to the time-cost being inversely proportional to the value of k. The

time-cost decreases as the value of k increases while N is kept constant.

The decryption time-cost of P-MOD does not severely increase while the value N changes

38

from 10 to 100. In contrast to this, CP-ABE and FH-CP-ABE are greatly aﬀected, as seen

in Fig. 2.5(c). For example, when k = 9, δP-MOD is approximately 13.4% of δCP-ABE and

approximately 15.8% of δFH-CP-ABE. The decryption time-cost of P-MOD is expected to

drop when k increases while keeping the ﬁle size constant.

In summary, for a hierarchical organization with many levels, the simulation results show

that P-MOD is signiﬁcantly more eﬃcient at generating keys, encryption, and decryption

than that of both CP-ABE and FH-CP-ABE schemes.

2.7 Summary

The numerous beneﬁts provided by the cloud have driven many large multilevel organizations

to store and share their data on it.

In this chapter, we ﬁrst pointed out major security

concerns data owners have when sharing their data on the cloud. To address the concerns,

we proposed a Privilege-based Multilevel Organizational Data sharing scheme (P-MOD) that

allows data to be shared eﬃciently and securely on the cloud. P-MOD partitions a data ﬁle

into multiple segments based on user privileges and data sensitivity. Each segment of the

data ﬁle is then shared depending on data user privileges. We formally proved that P-

MOD is secure against adaptively chosen plaintext attack assuming the DBDH assumption

holds. Our comprehensive performance comparison with the two most representative schemes

showed that P-MOD can signiﬁcantly reduce the computational complexity while minimizing

the storage space.

39

Chapter 3

Bitcoin and Blockchain

3.1

Introduction

In Chapter 2, we showed that we can signiﬁcantly reduce the computational complexity when

sharing data in complex hierarchical organizations. However, our P-MOD scheme relies on

a cloud storage system which represents a single point of failure. Therefore, our next goal is

to eliminate reliance on a central system. In order to achieve that, we expand our scheme

into a blockchain-based system.

Before we delve into the details of this scheme, in this chapter, we review and analyze

the major security issues of Bitcoin and its underlying foundation, blockchain. We begin by

presenting a comprehensive background of Bitcoin and the preliminaries on security. Next,

we investigate double-spending attacks that pose a major security threat to Bitcoin. The

main purpose of this research is twofold. We aim at analyzing the security of blockchain

to understand the potential of developing schemes that build over it. We also provide an

analysis that presents a trade-oﬀ between the waiting time before accepting a transaction

in Bitcoin versus the proﬁts/losses of the attackers. This analysis can act as an indicator to

determine how secure schemes that build atop blockchain may be.

The rest of this chapter is structured as follows. In Section 3.2, we provide a compre-

hensive background review on Bitcoin. Next, in Section 3.3, we evaluate double-spending

40

attacks and present our proﬁtability analysis. Finally, in Section 3.4, we conclude our study.

3.2 Understanding Bitcoin

Blockchain as a concept was initially proposed by Stuart Haber and W. Scott Stornetta in

1991 [48]. Their blockchain aimed at certifying the creation/modiﬁcation of a digital record

by digitally time-stamping the records. However, the blockchain was not eﬃcient since each

record was independently time-stamped. To improve eﬃciency, Merkle trees [49] were incor-

porated into blockchains in 1992 [50]. They improved eﬃciency by handling multiple digital

records into one block. Finally, Satoshi Nakamoto implemented the ﬁrst real blockchain and

used it as the core technology for the Bitcoin cryptocurrency system [51]. In this section,

we will present the major building blocks and protocols of Bitcoin.

3.2.1 The Bitcoin Network

Bitcoin runs over a P2P network. The main advantage of using a P2P network is the

agile movement of data for all nodes to achieve consensus. In contrast to the typical P2P

network used to share data ﬁles between interested peers, Bitcoin uses the network to rapidly

broadcast data among all the connected nodes. This process is known as ﬂooding and

continues until all nodes within the network receive the broadcast data.

It is important to diﬀerentiate between the terms node and peer of a P2P network. A node

is a network entity that is connected to one or multiple other similar nodes. The directly

connected nodes are referred to as peers. Nodes propagate data to indirectly connected

nodes by forwarding it to their peers until the data reaches every connected node. In the

Bitcoin network, data being ﬂooded includes IP addresses of the nodes, newly generated

41

transactions, and blocks of veriﬁed transactions that extend the blockchain. Peers share IP

addresses of other nodes that they are connected to or have discovered from their peer nodes.

The goal behind sharing IP addresses is to allow peers in the network to discover and connect

to more nodes resulting in a random network topology. Newly generated transactions are

broadcast through the network to rapidly publicize their occurrence to all connected nodes.

Miners compete to mine these transactions into blocks. The winning miner broadcasts the

block to all the connected nodes to extend and update their version of the blockchain.

Nodes in the Bitcoin P2P network are deﬁned based on their roles. The main duties

are summarized as transaction generation, block/transaction routing, block/transaction ver-

iﬁcation, and transaction mining. Block/transaction routing is performed by all nodes. A

node that can perform all functions is referred to as a full node. It consistently keeps a copy

of the full blockchain allowing it to verify any transaction without needing the assistance of

other connected nodes. It also possesses a BTC wallet that can generate transactions and

compute the possessed value of BTC by the node. Moreover, a full node possesses computa-

tional resources to compete in the mining competition. Nodes that do not store a full copy of

the blockchain are referred to as Simpliﬁed Payment Veriﬁcation (SPV) nodes or lightweight

nodes [52]. These nodes require assistance from full nodes when verifying a transaction. Full

nodes feed the SPV nodes with the required information from the blockchain necessary to

complete the transaction veriﬁcation. Some nodes may only perform one particular function.

Ones that are engaged in the mining process are referred to as mining nodes while others

that generate transactions are referred to as wallets.

In most Bitcoin software implementations, all nodes are treated equally and are uniquely

identiﬁed by their IP addresses. Using these addresses, peers establish Transmission Control

Protocol (TCP) connections with one another. Each node can choose whether to connect to

42

the network using a public or private IP. A node that uses a public IP is accessible over the

Internet by any node while one with a private IP is only accessible by nodes within its private

network. By default, a node with a public IP address is granted 8 outbound connections and

117 inbound connections, resulting in a total of 125 connections. On the other hand, a node

with a private IP address is granted only 8 outbound connections. An outbound connection

is initiated by the node itself when it requests connecting to a discoverable node while an

inbound connection is initiated by other nodes in the network that desire connecting to the

node.

The node that initiates a connection is deﬁned as client and the node that waits for

an incoming connection as server. Both nodes engage in a TCP handshake by exchanging

network packets deﬁned as version and verack. The client initiates a connection request

by sending a version packet addressed to the IP address of the server. By default, the

server listens on port 8333 for incoming version packets. If the server accepts the version

packet, it responds with a verack packet and its own version packet, both addressed to the

IP address of the client. Finally, the client responds by sending a verack packet addressed

to the IP address of the server and the connection is established. The connection enables

symmetric communication allowing the client and server to exchange data bidirectionally.

The connection is lost if peers do not communicate for a speciﬁed idle time. To reconnect,

peers engage in a new TCP handshake.

As discussed previously, a node shares with its peers a list of IP addresses that it has

learned as a result of being connected to the network. Each node stores its list in two separate

tables: a tried table and a new table. The tried table of a node stores IP addresses that the

node has established connections with while its new table stores IP addresses that it has only

discovered but did not attempt to connect to yet. When a node desires sharing IP addresses

43

with its peers, it randomly selects IP addresses from both tables and sends them in addr

messages. An addr message can contain up to 23% or a maximum of 1000 IP addresses of

the total IP addresses stored in both tables [53]. To initiate sharing, a node sends a getaddr

message to its peers requesting them to share their lists of IP addresses. The peers then

respond with an addr message. In some cases, sharing IP addresses is unsolicited if a node

voluntarily sends an addr messages to its peers without receiving a getaddr message.

A node that wishes to connect to the Bitcoin network for the ﬁrst time cannot obtain

IP addresses by this method. Bootstrapping is mainly achieved by communicating with a

Domain Name Server (DNS) seeder. The node sends a DNS query requesting a list of active

IP addresses. If the DNS fails to respond with an appropriate list of active IP addresses, the

node can still connect to the network by using a hard-coded list of IP addresses, referred to

as seeds. Once connected to any of these IP addresses, the node can then request more IP

addresses from its peers by sending getaddr messages.

Nodes also relay veriﬁed transactions and blocks to their peers to reach consensus. A

node begins by broadcasting an inventory (inv) message to all its peers informing them of

the new transactions/blocks it has received and veriﬁed. The peers check whether they are

already informed of these new transactions and blocks. Then, they respond to the node

with a getdata message. The getdata message includes all the transactions and blocks that

a peer node is not aware of. The node then responds with a transaction/block message

that includes the complete transactions/blocks the peer requests. Once received, the peer

validates the transactions or blocks and continues to relay them to its peers in a similar

manner. If a received transaction or block cannot be validated, it is immediately dropped

and its propagation is discontinued.

44

3.2.2 Bitcoin Transactions

We deﬁne a Bitcoin transaction (TX) as the transfer of an amount of BTC ownership rights

from the wallet of the buyer to the wallet of the seller, in exchange for a product or service.

BTC wallets use elliptic curve digital signatures to handle the transfer of ownership rights

and ensure that unauthorized spending of the cryptocurrency is infeasible. Each wallet ran-

domly generates a private key Pr, that is used to derive its corresponding public key Pub

that is shared among all users. The Pub is used to generate the address of the wallet needed

to make payments to it while the Pr is used to generate a digital signature corresponding to

the Pub in order to claim payments made to the wallet and use them in later transactions.

A Pr is ﬁrst generated from a Cryptographically Secure Pseudo-Random Number Gener-

ator (CSPRNG) and its corresponding Pub is then calculated using Elliptic Curve Digital

Signature Algorithm (ECDSA). Calculations are performed based on the ﬁeld and curve

parameters deﬁned by secp256k1 with the curve order n [54] as follows

Pr = CSPRNG(),

Pub = Pr × G (mod n),

(3.1)

(3.2)

where G is a generator of the elliptic curve and × represents elliptic curve multiplication.
The BTC wallet of the buyer assembles a transaction using the Unspent Transaction

Outputs (UTXO) of the buyer stored in the blockchain. An UTXO speciﬁes an amount of

BTC claimed earlier by the buyer as a result of a previously processed transaction. A simple

BTC transaction is shown in Fig. 3.1. In the ﬁgure, we show that a transaction can consist

of multiple inputs and outputs. The output UTXOpay represents the transfer of ownership

rights of a certain amount of BTC from the wallet of the buyer to the wallet of the seller.

45

The output UTXOch represents redirecting ownership rights of the BTC change amount back

to the wallet of the buyer. A distinct locking script is attached to each of these outputs that

speciﬁes conditions that must be met in order to grant ownership rights. For example, the

locking script attached to UTXOpay must include the Pub of the seller needed to generate

his/her wallet address. This ensures that the payment is made to the wallet of the seller and

only he/she is granted access to it with his/her corresponding Pr. Using Pr, the seller can

generate a digital signature that corresponds to the Pub associated with the locking script,

hence claim the output.

The inputs {UTXO1, UTXO2,··· , UTXOn} represent unspent transaction outputs claimed
by the buyer from previous transactions. When a buyer decides to use a speciﬁc output from

a previous transaction as an input to a new transaction, the buyer must specify proof that

he/she still possesses ownership rights and did not previously spend them in another trans-

action. This is done by attaching an unlocking script to each input. The unlocking script

solves the locking script that was associated with the output from the previous transaction.

Likewise, the unlocking script is a digital signature produced by the Pr of the buyer that

corresponds to a Pub associated with the locking script of an UTXO. A valid unlocking script

is legitimate proof of continuous possession of ownership rights to certain BTC being used

as input. As a result, BTC can be viewed as a chain of digitally signed transactions where

ownership rights are transferred from one owner to the other by digitally signing them.

A transaction must include at least one input, however, it may include multiple outputs

to simultaneously pay diﬀerent sellers from the total value associated with the inputs. The

locking script of each output would specify the conditions of its claimer. However, it is

necessary that the total BTC value of the inputs is always equal to or greater than the total

value of the outputs. In the event that the total value of the inputs is greater than the total

46

Figure 3.1: A single Bitcoin transaction with multiple UTXO inputs and outputs.

outputs, the diﬀerence, known as the transaction fee, is rewarded to the miner that adds

the transaction into a block attached to the blockchain. For guaranteed processing, most

available wallets today derive the transaction fee as a ﬁxed amount of BTC in relation to

the size of the transaction. In other words, the transaction fee increases with the size of the

transaction.

The wallet of the user combines all the transaction inputs/outputs and their correspond-

ing scripts into one digital message M. It then applies the Secure Hash Algorithm SHA256

to M twice to increase security before releasing it into the network. The 32 byte digest

representing the identity of the transaction (IDTX) is generated as follows

IDTX = SHA256 (SHA256(M)) .

(3.3)

A newly generated transaction assembled by the BTC wallet of a buyer is released into

the Bitcoin network to be validated and stored in the blockchain. The generating node

transfers the transaction to its peers that ﬂood it to the rest of the network nodes. Each

node that receives it audits the inputs by executing the scripts associated with it. This audit

involves checking whether the execution of the unlocking script integrated by a buyer within

47

InputsOutputs+UTXOaUTXObUnlocking ScriptUTXOn...UTXOpayUTXOchLocking ScriptLocking ScriptUnlocking ScriptUnlocking Scripteach input matches its corresponding locking script deﬁned in the previous transaction. If

a match exists, the node relays the transaction to its peers and temporarily places it in its

transaction pool until chosen to be mined, otherwise, the transaction is dropped.

In some cases, transactions are not ﬂooded into the network in the same order they are

generated. As a result, during the audit, a node might not be aware of some inputs of a

transaction (child transaction) referring to the outputs of other transactions (parent transac-

tions). Instead of immediately rejecting the transaction and considering its inputs as invalid,

the node can temporarily place it into an orphan transaction pool. If the parent transaction

shows up, the inputs of the child transaction become valid and it can be transferred to the

transaction pool.

3.2.3 Bitcoin Transaction Standards

Currently, there are ﬁve Bitcoin transaction standards and a few non-standard transac-

tions [52]. All transaction types are generated with a stack-based scripting language that is

processed from left to right. A script consists of a list of instructions that must be executed

in the correct order to grant an individual the right to spend the BTC within a transaction.

The list of standards is described below.

Pay to Public Key Hash (P2PKH): This standard transaction is the most used type.

The locking script within each output of a transaction holds the public key hash (serving as

a Bitcoin address) of the seller that will claim the BTC amount included. In other words, the

locking script deﬁnes a condition that the seller must possess a speciﬁc Pr corresponding to

the public key hash to claim the output. Once claimed by the seller, the output becomes an

UTXO owned by the seller. In order for the seller to use this speciﬁc UTXO as an input to a

48

future transaction, the seller must attach a valid unlocking script to it. The unlocking script

includes the Pub of the seller and a digital signature generated by his/her Pr that corresponds

to the public key hash associated with the locking script of the previous transaction output.

Pay to Public Key: The intent behind this standard transaction is to simplify the P2PKH

standard. Rather than associating the public key hash within the locking script of the

output, the public key itself is used. As a result, the validation process is simple. The digital

signature of the seller generated with a Pr can immediately be compared to the associated

Pub by searching whether or not they match.

Multi-signature (MultiSig):

In this standard transaction, a combination of keys is re-

quired to authorize an output claim. The locking script of a transaction output is associated

with a number (N ) of public keys. In order for an individual to claim the output, the indi-

vidual must possess M -of-N private keys that correspond to the N public keys. This type

of transaction can increase security and can be used in scenarios that require more than

one user to be present in order to claim and spend BTC. However, as the number N of

public keys associated with the transaction output increases, the size of the transaction also

increases. As a result, these transactions acquire large space in the UTXO pool, therefore

requiring more storage memory. As discussed previously, larger transactions also require

larger transaction fees.

Pay to Script Hash (P2SH): This standard transaction was introduced to resolve the

complex issues caused by MultiSig transactions. The transaction has the same simple com-

plexity as a P2PKH transaction. Rather than associating the entire locking script with a

transaction output that includes multiple public keys, a double hash computation is applied

49

to the entire script, speciﬁcally SHA256(cid:0)RIPEMD160(script)(cid:1). The result is a 20-byte digest

that is attached to the locking script instead of the entire original script. In order to use

the output from this transaction as an input to another transaction, the buyer creates an

unlocking script that holds M -of-N private keys and the original script that was crypto-

graphically hashed earlier. In that way, suﬃcient information is available in the locking and

unlocking scripts to validate the UTXO for spending. In addition, the buyer no longer has

to worry about generating large transactions that might require hefty transaction fees to

process. Instead, only the seller is required to provide the unlocking script he/she wishes to

spend the output in a new transaction.

Data Output: This standard transaction is intended to store arbitrary data on the blockchain

rather than transfer BTC from a buyer to a seller. In the Bitcoin community, many mem-

bers believe that such transactions are abusive to the system since it places a burden on

the network nodes to process transactions that do not carry BTC. However, such transac-

tions exist and allow 40 bytes of data to be stored per transaction. These transactions are

un-spendable, therefore are not stored in the UTXO set.

Non-Standard: A very small percentage of transactions are processed under non-standard

transactions. Non-standard transactions use more sophisticated scripts that strive to provide

higher complexity and security. In some cases, these transactions might even be the result

of bugs or mistakes resulting in loss of BTC.

50

3.2.4 Merkle Trees

Validated transactions are grouped into blocks that are then mined and stored in the

blockchain. A single block can contain multiple transactions up to the block size limit.

Merkle trees, sometimes referred to as hash trees, are used to cluster multiple transactions

in one block.

A Merkle tree is a tree data structure generated in a bottom-up approach that can

eﬃciently summarize and verify the integrity of the transactions being combined. Starting

from the leaf nodes that are hashes of the original data, each non-leaf node is generated as

a computation of its respective children nodes. For a single non-leaf node, all its children

nodes are concatenated then hashed to produce a single digest that represents the node in

the tree. This approach continues until a single node is generated that is deﬁned as the root

node.

BTC uses a binary Merkle tree in which each non-leaf node has exactly two children. It

applies a double hash computation SHA256 (SHA256(·)) when generating nodes. The leaf
nodes used to construct the tree are the identities IDTX generated for each transaction as

discussed in equation (3.3).

In a binary Merkle tree, each row within the tree consists of an even number of nodes,

except the root node. In the case where a row consists of an odd number of nodes, a replica

of the last node is reproduced to even out the number of nodes in that row. To better

comprehend the construction of the binary Merkle tree, consider a block that consists of

ﬁve transactions, {TX1, TX2,··· , TX5}. Each one of these transactions has already been
validated by the nodes and an identity for each transaction has been generated as discussed

in equation (3.3). We denote the corresponding identities as {IDTX1

, IDTX2

,··· , IDTX5},

51

where each identity represents a leaf node in the tree.

In this example, the number of

nodes at the leaf node level is odd, therefore a replica of the ﬁfth identity is generated,

, IDTX2

, IDTX6

{IDTX1
= IDTX5}. Next, the double hash computation is applied
to the concatenation of each two identities to generate the parent non-leaf nodes of the

,··· , IDTX5

Merkle tree as follows

N1 = SHA256(cid:0)SHA256(IDTX1(cid:107)IDTX2
N2 = SHA256(cid:0)SHA256(IDTX3(cid:107)IDTX4
N3 = SHA256(cid:0)SHA256(IDTX5(cid:107)IDTX6

)(cid:1),
)(cid:1),
)(cid:1),

(3.4)

(3.5)

(3.6)

where (cid:107) is the concatenation of two identities.

As shown in the previous equations, an odd number of non-leaf nodes is generated at

that level. To even it out, we replicate N3 to produce N4 as

N4 = SHA256(cid:0)SHA256(IDTX5(cid:107)IDTX6

)(cid:1).

Using the resulting digests we can generate the following level of non-leaf nodes as

N5 = SHA256(cid:0)SHA256(N1(cid:107)N2)(cid:1),
N6 = SHA256(cid:0)SHA256(N3(cid:107)N4)(cid:1).

(3.7)

(3.8)

(3.9)

Finally, the 32 bytes root node is derived as

R = SHA256(cid:0)SHA256(N5(cid:107)N6)(cid:1).

(3.10)

52

Fig. 3.2 represents the complete construction of the Merkle tree for this example. The

dotted nodes represent the replicated nodes that are added to even out the odd rows. The

root node, R, representing the summary of all transactions is placed into the block header

of a block to be mined and chained to the blockchain.

Figure 3.2: A Merkle tree within a block.

The use of Merkle trees is more common in SPV nodes since they do not store a copy of

the full blockchain. When an SPV node needs proof of the existence of a transaction within

a block, it turns to a full node for assistance. The full node will generate a merkle path

by computing a maximum of log2 N SHA256(cid:0)SHA256(·)(cid:1) computations, where N represents

the total number of transactions in the tree. Using the Merkle path as an authentication

path, the SPV node can prove the existence of a transaction within the tree. This proof of

existence method is considered to be eﬃcient since it only requires hash computations.

3.2.5 Blockchain

The blockchain is a public ledger that stores all previous transactions since the creation of

Bitcoin.

It provides its users with transaction conﬁrmations to track ownership rights of

BTC. As new transactions are processed, the blockchain is extended. It consists of blocks

{B0, B1,··· , Bn}, each carrying a set of validated transactions, where B0 represents the

53

N1IDTX2IDTX2IDTX1IDTX1IDTX4IDTX4IDTX3IDTX3IDTX6IDTX6IDTX5IDTX5N2N3N5N6RN4ﬁrst block and Bn represents the most recent block attached to the blockchain. Blocks

are linked back-to-back, with each one referencing its previous block to form the complete

blockchain. To reference a block, a unique 32 byte identity IDBi

applying SHA256(cid:0)SHA256(·)(cid:1) to the block header. An identity is referred to as the block

is generated for Bi by

hash.

The head of the blockchain is denoted as B0 and is deﬁned as the genesis block. B0 diﬀers

from all the other blocks as it does not reference any previous block. At the launching stage

of the system, B0 was a stand-alone block waiting for the system to initiate a newly mined

block to be chained to it.

Each block consists of two parts, a header, and a body. Each header incorporates the

block hash of its predecessor block in the chain. The header also consists of a diﬃculty target,

nonce, and a time-stamp that are discussed in more detail in the following subsection. The

body carries all the leaf nodes and non-leaf nodes of the Merkle tree, excluding the Merkle

root, which is incorporated in the header. This design makes it infeasible to retroactively

alter records within any block of the blockchain. Any modiﬁcation to one block will require

adjusting all the subsequent chained blocks.

3.2.6 Bitcoin Mining

Bitcoin mining is the ﬁnal stage to secure validated transactions and add them to the

blockchain. Once a transaction is added to the blockchain, it becomes completely veri-

ﬁed and public to all users. The transaction claimer(s) can use the embedded UTXO(s) as

the input to other transactions whenever desired.

Miners begin by selecting transactions from their transaction pools that will be placed

into a block where a block cannot exceed 1MB in size. A small portion of that space is

54

speciﬁed to carry high priority transactions. Priority is based on the size and age of the

transaction inputs. The rest of the block is ﬁlled with other transactions that have greater

transaction fees to maximize the proﬁt that a miner can turn if successful in mining the

block ﬁrst. A transaction with low or no fees will probably remain in the transaction pool

of the miner until it ages and becomes a high priority transaction.

Next, the miner assembles a special transaction, known as the coinbase transaction. This

transaction is a reward paying transaction to the miner in the event of winning a mining

competition. It does not have any inputs and consists of a single output addressed to the

wallet of the miner. The amount incorporated in the output is the reward mining fee (12.5

BTC at the time of writing) plus the sum of all transaction fees included in each transaction.

All the selected transactions along with the coinbase transaction are then combined into

a Merkle tree as discussed previously. At this point, the miner has all the components

needed to construct the block header of the new block except the nonce. The nonce is a

value that if concatenated with the block header of the group of chosen transactions and then

double hashed, it produces a digest with a speciﬁc preﬁx of zeros in its binary representation.

Searching for this value is performed in a brute-force manner and is directly correlated with

the computational power available. The more available computational power, the faster a

miner is able to ﬁnd the correct nonce. A successful miner will then broadcast his/her proof-

of-work to prove that he/she consumed computational resources in order to ﬁnd the correct

nonce.

The primary advantage of the proof-of-work is to make it computationally infeasible to

perform Sybil attacks [55]. This process is intentionally designed to be resource-intensive to

perform while eﬃcient to verify that the work has been done. It is required that a certain

number of zeros appear in the preﬁx of the digest as a result of applying the double SHA256

55

computation. The preﬁx determines the diﬃculty of ﬁnding the correct nonce. The more

zeros required in the preﬁx of the digest, the harder it is to ﬁnd the correct nonce and vice

versa. The diﬃculty is dynamically altered every two weeks so that the average time it takes

a miner to ﬁnd a correct solution is approximately ten minutes. As the number of miners

increases, the diﬃculty increases, and vice versa.

The ﬁrst miner to ﬁnd the correct nonce to a block of transactions is rewarded a mining

reward as compensation for the computational power spent. The mining reward is halved

precisely every 210,000 blocks that are added to the blockchain. It is estimated to continue

until the year 2140 when nearly 21 million BTC will have been released into the system.

The reason for having a ﬁxed supply of BTC is to prevent price inﬂation in the future.

Another incentive that encourages miners to spend their computational power to perform

mining is the transaction fee. The winner is not only rewarded the mining reward but is

also given all the transaction fees incorporated with all the transactions in the block. With

time, the mining reward will decrease due to halving, which will demand higher transaction

fees in the future to compensate for the reduced mining reward.

After a block is successfully mined, all the miners check their transaction pools to elimi-

nate the transactions that have been included in the mined block and immediately construct

a new block of transactions. The end of the mining race marks the beginning of a new one.

Miners instantly begin to search for the nonce of the next block of transactions.

Simultaneously, the mined block is ﬂooded through the network so that all the nodes

can update their blockchains. The winning miner transmits the block to its peer nodes to

validate it before propagating it further through the network. The peer nodes check whether

the block is correctly assembled in terms of syntax and variables. The proof-of-work provided

by the miner must be correct and the coinbase transaction must include the correct amount

56

to pay the miner. If any information is invalid, the block is immediately dropped.

Quite regularly, as blocks are mined to extend the blockchain, a temporary incident,

known as a fork, might occur. A fork occurs when two miners are able to simultaneously

mine two diﬀerent blocks at the same time. As a result, both newly mined blocks are

accepted to extend the blockchain. The blocks are ﬂooded into the network and the miners

update their version of the blockchain-based on the block they receive ﬁrst. This results in

two valid versions of the blockchain in possession by the miners with two diﬀerent paths.

However, the miners continue to extend their version of the blockchain regardless of which

path they possess. Eventually, one path will grow longer than the other as mining continues.

The path that grows longer is the winner and all nodes immediately discard the other path

and update their blockchain to the longer one. In literature, the blocks that are dropped are

known as orphan blocks; valid blocks that were part of the blockchain at some point.

3.2.7 Bitcoin Mining Pools and Payment Methods

Although solo miners can compete in the mining process, the likelihood of a successful

return is very low. This is even the case for solo miners with the most powerful computing

machines. As a solution to this problem, solo miners collaborate in the mining process by

joining computational power into mining pools. Together, they form a large organization

with signiﬁcant computational power that can compete with the other large entities. The

members of the mining pool work together to ﬁnd the correct nonce for a candidate block

and report the result as one miner, increasing their chances of winning the competition.

In the event of success, the rewards are split among the participating miners based on the

contribution provided by each.

The concept of a mining pool can be compared to the lottery. Assuming individuals with

57

the same ﬁnancial capabilities, if a large group buys tickets together, the individuals within

the group have a better chance of winning than a single individual buying tickets alone. If

any ticket owned by the group wins the lottery, the participating individuals split the reward

proportional to the amount invested by each. In a mining pool, the computational power

provided by each solo miner is analogous to the amount invested by each ticket buyer.

A mining pool is managed by a pool operator who handles the entire pool server and re-

ceives a percentage of the rewards as compensation. The role of the operator is to coordinate

the mining performed by all the participating miners. The operator keeps a continuously

updated copy of the entire blockchain to ease the job of the participating miners. Using

the updated blockchain, the operator veriﬁes any transaction that appears in the network

and places it in a candidate block for mining. By that, miners only need to worry about

ﬁnding the correct nonce of the candidate block. If the mining pool wins the competition,

the operator divides the rewards among the participating miners.

Reward splitting can be performed in multiple forms and varies from one mining pool

to the other. As described in [56], these methods can be categorized into simple reward,

score-based reward, or risk-free pay-per share reward.

Simple reward systems consist of either proportional systems or Pay-Per Share (PPS)

systems. In the proportional systems, a reward B is split among the participating miners at

the end of each round, where a round is the consecutive time between two successful blocks

generated by the pool. The operator keeps a percentage of the reward f B and divides the

remaining (1 − f )B among the miners based on the shares they submit. Shares are deﬁned
as the number of hashes performed by each miner in attempt to ﬁnd the correct proof-of-

work. A miner that submits n shares from a total of N shares submitted by all the miners

in the pool receives a reward of n

N (1 − f )B BTC on average. Conversely, the PPS system

58

is a deterministic one where the miner knows how much reward can be turned in advance.

The operator immediately pays each miner based on the submitted shares regardless of the

mining result. In other words, a miner that submits n shares receives (1− f )pB BTC/share,
In this
where p represents the probability of one share being the correct proof-of-work.

system, the operator is taking the risk of mining independently since the miners receive

ensured payments whether or not the pool generates a block.

Score-based reward systems come in many forms and strive to prevent miners from pool-

hopping. Pool-hopping is the practice of mining in a pool only during its good times (suc-

cessfully generating blocks) and leaving it during its bad times. A pool-hopper can maximize

his/her rewards at the expense of miners that remain loyal to the pool at all times. The

method introduced by Slush [57] is one of the ﬁrst implemented score-based systems that

extends the proportional method. Rather than paying the miner an amount based on the

submitted shares after each round, the miner is given a score that is proportional to his/her

contribution and increases as more time elapses from the start of the round. The score is

used to calculate the reward share given to the miner at the end of the round. However,

this method is still susceptible to hopping since the score does not consider factors such

as the mining diﬃculty or the hashrate of the pool. Also, in this method mining at the

beginning of a round is more proﬁtable since there are fewer shares at that time. As a result,

the geometric method was introduced to address these weaknesses. This method introduced

a ﬁxed fee, a constant amount taken from the reward of each block, and a variable fee, a

score granted at the beginning of each round to the operator. As time passes, the variable

fee declines, making mining equally proﬁtable throughout the entire round. Shorter rounds

result in larger variable fees and vice versa. By that, there is no advantage to mining early

in the round.

59

Another score-based method is Pay-Per-Last-N-Shares (PPLNS) that exists in diﬀerent

forms. In this method, the concept of rewarding miners after each round is replaced with

rewarding miners that have been participating in earlier rounds, regardless of the mining

result. In other words, the operator pays miners based on their contributions from previous

eﬀorts. Later on, more advanced payment systems evolved such as the Double Geometric

Method (DGM). This system is a hybrid between the PPLNS and geometric system that

combines the advantages of both methods.

Some mining pools employ a risk-free pay-per share system. One of the ﬁrst implemented

systems is known as the Maximum Pay-Per Share (MPPS). It combines both the PPS and

proportional systems, where each participating miner has a balance of each. If the miner

submits a share, the PPS balance is incremented and when the pool successfully generates a

block, the proportional balance is incremented. At pay time, the miner receives the minimum

of both balances. This method protects the pool from taking the risk alone. However, this

method is inconsiderate to the miners, since they will always make less whether the pool

is successful or not. In addition to this, the system suﬀers from pool-hopping. A solution

was later proposed to solve this problem in the Shared Maximum Pay-Per-Share (SMPPS)

system. The miners have a PPS balance that continues to accumulate as the miners are

participating. If a block is found by the pool and there are suﬃcient funds, the miners are

paid based on their PPS balance. However, if there are no suﬃcient funds, miners are paid

proportional to the available funds and given credit to be paid later for whatever balance

that is owed.

Today, a broad range of mining pools exist that give miners a variety of options when

joining pools. The question most miners would ask is which mining pool is the best to join.

The answer here lies in the preferences of the miners. For example, some miners are not

60

willing to take the risk of not getting paid in the event of being unsuccessful in generating

a block and would prefer a PPS mining pool. Others might be willing to take the risk and

choose a score-based system for instance, in return for a larger proﬁt.

3.2.8 Alternative Cryptocurrencies

In literature, alternative cryptocurrencies are known as altcoins, most of which are inspired

by Bitcoin. Altcoins strive to oﬀer innovative features and enhanced security/privacy coun-

termeasures in an eﬀort to compete with Bitcoin. Their development process is based on the

level of innovation and security/privacy countermeasures they present.

The simplest method to develop an altcoin is by forking the open-source code of Bit-

coin [58] while adding/modifying any features to it. In software development, a fork is a

completely independent project that exploits a copy of the original source code. A Bitcoin

fork generates an entirely new blockchain and is completely independent of Bitcoin. Name-

coin [59] is the ﬁrst developed Bitcoin fork that adopted all of the characteristics of Bitcoin.

It also introduced an additional feature allowing users to store data within its transactions.

Various Bitcoin forks have evolved latterly with more features and handled security/privacy

issues. Many of these forks implemented privacy protocols to increase the anonymity of

cryptocurrencies.

On exceptional occasions, an altcoin can also be the result of a hardfork. A hardfork

occurs when modiﬁcations are made to the original software of Bitcoin making its new

transactions/blocks incompatible with those previously generated prior to the modiﬁcations.

These modiﬁcations can be as simple as altering certain parameters, such as the block size,

or as complex as changing major protocols, such as the consensus algorithm. In order to

enforce these modiﬁcations, the majority of users/miners must upgrade their client nodes to

61

the latest version that accommodates these changes. The users/miners that do not accept

the upgrade will view the new transactions/blocks as invalid and will not accept them.

As a result, the blockchain will inevitably split into two paths, one storing transactions of

the original cryptocurrency and one storing transactions generated due to the modiﬁcations

made, hence creating a new altcoin. Users in possession of the original cryptocurrency will

automatically be granted an equivalent amount of the new altcoin to what they hold.

Bitcoin Cash [60] is a notable example of a Bitcoin hardfork that occurred on August 1,

2017. It was the result of enforcing BIP91 [61] that proposed activating Segregated Witness

(SegWit) [62]. SegWit increases the transaction speed of Bitcoin by splitting the transaction

into segments and removing the unlocking signatures that are attached separately at the

end. The majority of the miners accepted this proposal resulting in Bitcoin Cash. Users

who possessed BTC were immediately granted an equivalent amount of BCC (The currency

of Bitcoin Cash) to the BTC they possessed.

While only borrowing the concept of storing transactions in a blockchain, some altcoins

have been implemented from scratch with a completely diﬀerent design and purpose. These

altcoins strive to provide services and security/privacy countermeasures beyond the capabil-

ities of Bitcoin or any of its forks. They present substantial diﬀerences such as integrating

enhanced consensus algorithms or using private (permissioned) blockchains. In contrast to

the public (permissionless) blockchain of Bitcoin, where all participating nodes are allowed to

execute the consensus protocol and maintain the blockchain, a private blockchain is limited

to only speciﬁc nodes. As a result, the cryptocurrency market has witnessed a considerable

number of altcoins with substantial innovative features.

62

3.3 Bitcoin Security Issues - Double-Spending Attacks

Double-spending is an attack that could be performed by malicious users attempting to

deceive the system by spending the same BTC more than once. The attacker generates

duplicates of the same UTXO and uses it as an input in more than one transaction. Dif-

ferentiating between the duplicated (fraudulent) copies and the original becomes an issue

when used in a decentralized system. There is no trusted entity that veriﬁes the legitimacy

of the UTXO used as input in a transaction. The inputs of a transaction may consist of

unidentiﬁable fraudulent BTC that have possibly been spent earlier.

The system defends against such attacks by relying on its users (miners) to validate the

legitimacy of the BTC used as an input to transactions. Using the information stored in

the blockchain from the previous transactions, the miners validate the inputs of any new

transaction to ensure that it does not contain previously spent inputs. Once veriﬁed, the

transaction is mined into a block that is attached to the blockchain. Any user that refers

to the blockchain becomes aware that speciﬁc UTXO(s) have been spent earlier, making

fraudulent input transactions detectable.

To ensure that attackers cannot manipulate the blockchain in their favor, the mining

process is designed to be an expensive and resource-intensive operation. To mine a block of

transactions in the blockchain, the miners must provide a valid proof-of-work. An attacker

that wishes to double-spend BTC must reverse a transaction that has been stored in the

blockchain to reuse its inputs in another transaction. Reversing an already stored transaction

in the blockchain is an extremely diﬃcult task since it requires a signiﬁcant share of the total

computational power of the system.

In the rest of this section, we will analyze the double-spending attacks. We ﬁrst discuss

63

conventional methods to perform double-spending. Next, we analyze the probability and

proﬁtability of the double-spending attack and present a trade-oﬀ between the waiting time

before accepting a transaction versus the proﬁtability of the attack.

3.3.1 Types of Attacks

A double-spending attack comes in many forms. We discuss various techniques that can be

performed.

3.3.1.1 Race Attack

A race attack refers to the case where a merchant accepts an unconﬁrmed transaction (a

transaction in a transaction pool waiting to be mined and stored in the blockchain) and

immediately provides the payer with a product/service before waiting for conﬁrmation. An

attacker with the intention of deceiving the merchant creates two transactions: (i) a trans-

action that pays the merchant an amount of BTC in return for a product/service and (ii) a

fraudulent transaction that pays the same amount to the wallet of the attacker. Both trans-

actions use the same inputs (duplicated BTC) and try to spend the same BTC. The attacker

concurrently releases both transactions into the Bitcoin network. The miners consider both

transactions as being valid until one of them gets stored in the blockchain. At that point, the

inputs of the stored transaction cannot be used as inputs to other transactions. Therefore,

the fraudulent transaction has a chance of being veriﬁed ﬁrst and added to the blockchain

making the merchant-paying transaction invalid. The invalid transaction is rejected by the

system and dropped from the transaction pools of miners.

To avoid a race attack, merchants must wait for the mining to be completed and the

transaction to appear in the blockchain before providing the payer with the product/service.

64

It is recommended that the merchant should wait for at least six subsequent blocks as

conﬁrmation before making the trade. In this case, the chances for an attacker to reverse a

transaction are negligible, assuming that the attacker can control no more than 10% of the

total computational power used in mining.

3.3.1.2 Finney Attack

Finney attack was ﬁrst suggested in a Bitcoin forum [63]. Similar to the race attack, the

attacker performing this attack will only succeed if the merchant accepts an unconﬁrmed

transaction. The attacker creates two transactions similar to those in the race attack and

holds on to both of them. The attacker then begins mining the block containing the fraudu-

lent transaction. If the attacker is successful in mining the block, the attacker then uses the

other transaction to pay a merchant immediately in exchange for a product/service. Once

the merchant makes the trade, the attacker releases the mined block that contains the fraud-

ulent transaction into the network. Given that the block is already mined, it will be added

to the blockchain immediately. As a result, the merchant-paying transaction will become

invalid. In addition to this, the attacker is rewarded the mining reward for the mined block

carrying the fraudulent transaction. However, the ability to independently mine a block is

improbable given the resources necessary to perform the task.

3.3.1.3 Vector76 Attack

In comparison to the race and Finney attacks, the Vector76 attack requires the merchant

to wait for a single block to be mined and added to the blockchain as a conﬁrmation. To

reverse the transaction, the attacker needs to create a fork in the blockchain. Initially, the

attacker creates a merchant-paying transaction and does not broadcast it to the network.

65

Next, the attacker tries to independently and secretly mine this transaction into a block. If

successful, the attacker holds onto the block until the honest miners discover another block.

The attacker then simultaneously releases the block into the network at the same time as

the honest miners release their block that will result in a fork. Before the fork is resolved,

the attacker creates a fraudulent transaction that double-spends the same BTC used in the

merchant-paying transaction. The attacker then relays the fraudulent transaction to the

honest miners that do not have the path of the blockchain that carries the merchant-paying

transaction. These miners see the fraudulent transaction as valid and begin mining it into

a block. As a result, each path of the blockchain stores one of the transactions. If the path

that holds the fraudulent transaction grows longer than the other path, the double-spending

attempt is successful.

3.3.1.4 51% Attack

51% attack is the largest threat to the BTC system. This attack is also referred to as the

majority attack in which the attacker (usually a pool of miners) controls more than half of

the total computational power of the system. By controlling the majority of the power, the

attacker is capable of interfering with the process of mining blocks and reversing any block

of transactions. During a 51% attack, the system loses integrity since the other miners no

longer have an incentive to compete in the mining process.

To better comprehend this attack, consider the case where the attacker generates a

merchant-paying transaction and releases it into the network. The merchant waits for an

appropriate number of conﬁrmations before accepting the payment and making the trade.

Simultaneously, the attacker secretly begins to mine a block that contains a fraudulent

transaction followed by more blocks to extend it. Since the computational power of the

66

attacker is more than the rest of the computational power of all the miners combined,

the attacker can mine blocks in less time. Once the merchant accepts the transaction, the

attacker releases the secretly mined blocks to create a fork in the blockchain. If the fraudulent

fork created by the attacker is longer than the original chain, it becomes dominant and all

miners begin to extend on it. By that, the merchant-paying transaction no longer exists in

the blockchain.

This attack represents the biggest threat to Bitcoin as it is directly correlated to the

resources an attacker can provide. Resources are measured in terms of ﬁnancial and compu-

tational power. Large entities such as governments or intelligence agencies have the means to

control a large share of the total computational power. They are able to destroy or push the

system into their favorable status. It is important to note that even with a computational

power that is slightly less than 50%, an attacker may still be able to severely manipulate the

system. In the next subsection, we analyze the chances of success of the attackers based on

the share of computational power they control.

3.3.2 Probability of Success

Despite the continuously increasing popularity of Bitcoin, the number of merchants that

have accepted it as a method of payment today is still relatively minimal. Many merchants

have concerns about its capabilities in terms of security, while others consider it as a slow

method to make payments. Those that accept it should try to take all precautions before

accepting a transaction to prevent double-spending attacks.

One of the important precautions is to decide when to accept a transaction before making

the trade. Merchants prefer to obtain a certain degree of conﬁdence as assurance that the

payer will not be able to reverse the transaction. Those that can aﬀord to wait a long

67

period before accepting a transaction (for example, online platforms) require a minimum

of six conﬁrmations before accepting a transaction and considering it as being irreversible.

However, others that cannot aﬀord this time waiting (such as vending machines), rush into

accepting transactions at the risk of losing the payment to a double-spending attack.

Similar to the analysis in [51], we model the race between the honest miners and the

attacker to generate blocks as a binomial random walk. The race is denoted as z that

represents the number of blocks generated by the honest miners with computational power p

minus the number of blocks generated by the attacker with computational power q = 1 − p.
If a block is generated by the honest miners, we increment z by 1. Conversely, if a block is

generated by the attacker, we decrement z by 1. The race between the honest chain and the

chain generated by the attacker can be derived as

zi + 1, with probability p,

zi − 1, with probability q,

zi+1 =

(3.11)

where i represents an individual block race. If q > p and the attacker has unlimited resources,

the attacker will eventually reach z < 0. At that point, the attacker can replace the blocks

generated by the honest miners and succeed in performing the attack.

The probability of the attackers to catch up and surpass the blocks generated by the

honest miners can be compared to the Gambler’s Ruin Problem. Similar to the description

in [64], we assume a gambler (attacker) begins with an initial fortune i, 0 < i < N , and

either wins $1 with probability q or loses $1 with probability p = 1 − q, in each successive
gamble. The game represents a random walk that terminates at i = 0 (fail) or at i = N

68

(success). The probability of success after i trials is denoted as Pi and can be calculated as:

Pi = qPi+1 + pPi−1.

Since q + p = 1, we can rewrite equation (3.12) as:

Pi+1 − Pi =

p
q

(Pi − Pi−1) .

(3.12)

(3.13)

At i = 0, the attacker has a probability of success P0 = 0. By rearranging and generalizing

equation (3.13), we have

Pi+1 = P1

=

(cid:18) p

(cid:19)j

q

i(cid:88)

j=0

p
q )i+1
p
q )

1−(
1−(

P1(i + 1),

,

if p (cid:54)= q,

if p = q = 0.5.

p
q )N
p
q )

1−(
1−(

,

if p (cid:54)= q,

1 = PN =

P1N,

if p = q = 0.5.

P1
P1


(3.14)

(3.15)

Let i = N − 1 meaning that PN = 1, we can rewrite equation (3.14) as

Solve P1 from equation (3.15) and substitute the result into equation (3.14) to obtain

if p (cid:54)= q,

if p = q = 0.5.

(3.16)

Pi =

p
q )i
1−(
q )N ,
p
1−(
i
N ,

69

Following the analysis in [65], we assume that the attacker begins with an initial fortune

i = y and can aﬀord to lose up to y dollars before giving up. The gambler wins if i = N =

y + z + 1 dollars. This assumption modiﬁes the game to account for the probability Ps of

the attacker to surpass the blocks generated by the honest miners as



Pi =

p
q )y

1−(
q )y+z+1 ,
p

1−(
y

y+z+1 ,

if p (cid:54)= q,

if p = q = 0.5.

(3.17)

Consider an attacker that possesses an unlimited amount of resources and is willing to

use as much of it as needed to perform the attack, i.e. y → ∞. If q > p, then

1 − ( p
q )y

1 − ( p

q )y+z+1

lim
y→∞

= 1.

(3.18)

then calculate the

(cid:17)y

(cid:16) p

q

.

(3.19)

For q < p, we ﬁrst divide the numerator and denominator by

limit as

( p
q )−y − 1
q )−y − ( p
( p

q )z+1

=

lim
y→∞

(cid:18) q

(cid:19)z+1

p

Finally, we can summarize the probability of the attacker to surpass the blocks generated

by the honest miners as

(cid:17)z+1



(cid:16) q

p

1,

Qz =

,

if q < p or z ≥ 0,

if q > p or z < 0.

(3.20)

The merchant has no way of ﬁguring out the number of blocks that the attacker has been

able to secretly mine. Therefore, one way to model the overall probability of the attacker

70

to surpass the honest chain is by using the Poisson distribution. The expected number of

blocks an attacker can generate is λ = (z +1)

. The overall probability Ps of the attacker

to surpass the honest chain can be computed by multiplying the Poisson density and the

(cid:16) q

(cid:17)

p

probability of surpassing the honest z − k remaining blocks as discussed in equation

∞(cid:88)

Ps =

λke−λ

k! × Qz−k

k=0

= 1 −

∞(cid:88)

k=0

λke−λ

k! ×

(cid:17)z−k+1

(cid:16) q

p

1 −

1 − 1 = 0,

,

if q < p or k ≤ z,

if q > p or k > z.

(3.21)

For equation (3.21), if q > p, we will always have Ps = 1, meaning that the attacker will

win. When q < p, the probability for the attacker to succeed is

(cid:32)

(cid:18) q

(cid:19)z−k+1(cid:33)

z+1(cid:88)

k=0

Ps = 1 −

λke−λ

k! ×

1 −

p

.

(3.22)

Another way to model this probability is by using the negative binomial distribution

assuming the attacker can pre-mine one block before broadcasting the merchant-paying

transaction to the network [66]. The merchant waits for n blocks to be generated by the

honest miners with computational power p before accepting the transaction. At that time,

the attacker can secretly generate m blocks with computational power q = 1 − p, where
m = n − z − 1. By deﬁnition, we can model this as the m number of blocks that the at-
tacker can generate (success) before the n number of blocks the honest miners can generate

(failure). Therefore, the probability of a successful double-spending attack for a given value

m can be calculated as

P (m) =

× pnqm.

(3.23)

(cid:18)m + n − 1
(cid:19)

m

71

Overall, the probability for an attacker to successfully surpass the number of blocks

generated by the honest miners can be computed as

∞(cid:88)

m=0

Ps =

P (m) × Qn−m−1

(cid:18)m + n − 1
(cid:19)

∞(cid:88)

m

= 1 −

m=0

(cid:17)n−m

(cid:16) q

p

1 −

1 − 1 = 0,

× pnqm ×

,

if q < p or k ≤ n − m,

(3.24)

if q > p or k > n − m.

Similar to the previous analysis, equation (3.24) conﬁrms that when q > p, the attacker

will always succeed since Ps = 1. When q < p, the probability of success can be deﬁned as

(cid:32)

(cid:18) q

(cid:19)n−m(cid:33)

× pnqm ×

1 −

p

(cid:18)m + n − 1
(cid:19)
(cid:18)m + n − 1
(cid:19)

k

m

n−1(cid:88)
n−1(cid:88)

m=0

m=0

Ps = 1 −

= 1 −

× (pnqm − pmqn).

(3.25)

Fig. 3.3 shows the results of Ps as n changes based on equation (3.25). From this ﬁgure,

the merchant can obtain the desired level of conﬁdence before accepting a transaction. The

obtained level of conﬁdence is deﬁnite for any q at n = 0, meaning that the attackers have

a 100% chance of success. As the number of blocks n increases, the chances of a successful

double-spending attack decline. Conversely, as q increases, the chances of a successful attack

increase. The ﬁgure also shows that if q ≥ 0.5 then we will always get Ps = 1. This is known
as the majority attack. In fact, even if the values of q are slightly less than 0.5, the chances

of a successful double-spending attack could still be high. However, the probability declines

exponentially as the value of n increases.

72

Figure 3.3: Probability of successful double-spending attacks vs. number of conﬁrmations
waited by the merchant.

3.3.3 Attack Proﬁtability

A successful double-spending attack is only proﬁtable if the revenue is higher than the cost of

performing the attack. Suppose an attacker tries to double-spend v BTC paid to a merchant

in exchange for a product/service. The attacker releases a transaction into the network

that pays v BTC to the wallet possessed by the merchant. Immediately after releasing the

transaction, the attacker secretly begins to mine blocks of transactions. One of these blocks

contains a fraudulent transaction that pays the same v BTC to the wallet possessed by the

attacker. The merchant accepts the transaction after observing that n blocks have been

extended to the blockchain. If the attacker is able to secretly mine m = n + 1 blocks and

replace the n blocks in the blockchain generated by the honest miners, then the attacker is

successful in gaining a product/service without paying for it. Assume that the attack returns

a value of 2v, one of which is the actual BTC as a result of reversing the merchant-paying

73

transaction and the other as the product/service. In addition to this, the attacker gains the

mining reward for each block mined and the transaction fees included in each transaction.

Then the revenue gained by the attacker can be formulated based on his/her corresponding

Ps as follows

Revenue ≈ v + Ps(v + Rm) BTC,

(3.26)

where R is the block reward and the transaction fee per block.

Multiple factors can impact the cost such as the price and depreciation value of machinery

used, the cost of electricity, and the amount of BTC being spent in the transaction. However,

formulating the cost with all the possible factors is infeasible. To simplify it, we focus our

analysis on the cost factors that could change signiﬁcantly as the attack is performed. These

factors include the v BTC an attacker spends in the merchant-paying transaction, the cost

of mining m blocks, and the depreciation cost d(t) of the computing device used in BTC at

time t. We derive the cost as follows

Cost ≈ v + meq(t) + d(t) BTC

(3.27)

where eq(t) is the estimated mining electrical cost in BTC/block of a miner with a share q

of the total computational power of the system. We assume eq(t) remains constant during

the total time T the attack is performed.

We also assume that the average lifespan of the mining equipment is approximately two

years. Using straight-line depreciation, d(t) is a negligible value for an attack over a short

period. Therefore, we can reduce the cost equation as follows

Cost ≈ v + meq(t) BTC

74

(3.28)

The time to mine a single block either by the honest miners or the attackers is approxi-

mately ten minutes. Therefore, we can rewrite equations (3.25), (3.26), and (3.28) as

10−1(cid:88)

T

m=0

Ps ≈ 1 −

(cid:18)m + T

(cid:19)

10 − 1
m

× (p

(cid:18)

v +

RT
10

Revenue ≈ v + Ps

T
10 ),

T

10 qm − pmq
(cid:19)

BTC,

(3.29)

(3.30)

(cid:18) T

(cid:19)

10

Cost ≈ v +

eq(t) BTC.

(3.31)

The proﬁt/loss can be formulated from equations (3.30) and (3.31) as

Proﬁt/Loss = Revenue − Cost
−

≈ Ps

v +

RT
10

(cid:19)

(cid:18) T

(cid:19)

10

eq(t) BTC.

(3.32)

(cid:18)

Nowadays, to stand a chance in mining Bitcoin, miners merge their computational power

into mining pools as discussed in Section 3.2.7. The mining pools combine the computa-

tional power provided by the computing machine of each participating miner. Machines

are categorized into one of four groups: Application-Speciﬁc Integrated Circuits (ASIC),

Field-Programmable Gate Array (FPGA), Graphics Processing Unit (GPU), and Central

Processing Unit (CPU). Each group can provide up to a certain computational power. Com-

parisons of most of those machines are presented in [67] and [68].

Each computing machine consumes electricity diﬀerently based on its speciﬁcations. Even

75

machines with similar speciﬁcations might vary in cost to operate. As a result, formulating

the cost of electricity spent by a miner in the mining process becomes challenging.

ASICs have monopolized the mining process due to their incomparable computational

power with those of the CPUs, GPUs, and FPGAs. Miners using any computing machines

other than ASICs have a negligible chance of competing. Many mining pools do not permit

miners with these machines to join their pools. A miner that joins a mining pool with one of

these machines would hardly earn BTC in the event that the pool successfully mines a block.

This is because rewards are usually divided among the miners based on their contributions

as discussed in Section 3.2.7.

Our goal now is to formulate the estimated electrical cost eq(t) of a mining pool. First

we estimate the total number of miners N (t) based on the total hashrate H(t) of the system

at a certain time t as

N (t) ≈

H(t)
h(t)

,

(3.33)

where h(t) is the average hashrate of a single mining machine involved in mining at time t.

The cost of electricity is measured in cents/kWh and varies based on the end-use sector

and time t. End-use sectors include, residential, commercial, industrial, and transportation.

We denote the average cost of electricity of all sectors at time t as ea(t). Using the computing

wattage w of the machine, the average running cost c(t) of a machine at time t is

c(t) ≈ ea(t) × w cents/hour.

Using equations (3.33) and (3.34), the total cost E(t) for all miners at time t is

E(t) ≈ N (t) × c(t) cents/hour.

76

(3.34)

(3.35)

We know that it takes approximately 10 minutes to generate one block, i.e.

in T = 1

hour, miners can generate m = 6 blocks. Therefore, the total cost C(t) for all miners to

generate one block at T = 10 minutes can be estimated as

C(t) ≈

E(t)

6

cents/10 minutes (1 block).

(3.36)

However, as discussed previously, miners merge their computational power to increase

their chances of winning in the mining competition. The largest mining pool that exists

today is Antpool [69] controlling approximately 25% of the total computational power. Other

mining pools also exist such as BTCC Pool [70], Bixin [71], BTC.com [72], and BTC.TOP [73]

that control approximately 7%-11% of the computational power.

We estimate the average electricity cost eq(t) of a mining pool based on its computational

power q as

eq(t) ≈ C(t) × q

H(t) × ea(t) × w × q

6h(t)

≈

cents/block.

(3.37)

For our simulations, we assume that the total cost of mining blocks C(t) by all miners and

computational power q of the mining pool are ﬁxed during the total mining time T . We also

assume a mining environment consists of miners using only ASICs such as Antminer S9 since

it is one of the most eﬃcient computing machines on the market today. The speciﬁcations

of this machine are h = 14 TH/s and w = 1.375 kWh.

Consider an attacker trying to perform a double-spending attack during October 2019.

During that period, 1 BTC was equal to approximately $9,200. The total hashrate power

77

Figure 3.4: Proﬁts/losses of attackers with varying computational power q trying to double-
spend v = 5 BTC during October 2019.

was approximately H = 95000000 TH/s and the average cost of electricity for all sectors

in the U.S. was approximately 10.45 cents/kWh, based on the data collected by the U.S

Energy Information Administration [74]. Under these circumstances, in Fig 3.4 we present

the expected proﬁt/loss of double-spending attacks for various computational powers q. For

this analysis, we assume the attackers try to double-spend v = 5 BTC.

In Fig. 3.4, a point above y = 0 represents a proﬁt while one below it represents a loss.

The point of intersection of a curve with y = 0 represents the break-even point of an attack.

The amount of BTC spent to perform the attack at this time is equal to the revenue returned.

By analyzing the ﬁgure, we attain the following ﬁndings:

1. For all values q at t = 0, the attacker turns a proﬁt of exactly 5 BTC. As shown in

Fig. 3.3, for any value q at n = 0 (or t = 0), Ps = 1. This may occur when the receiving user

accepts an unconﬁrmed transaction where the attacker has a theoretically perfect chance to

succeed.

78

2. If the receiving user waits for n conﬁrmations before accepting a transaction, the attacker

must compete to mine blocks for the blockchain. Based on the previous analysis, we know

that the probability of success Ps of mining blocks is based on the computational power q

of the attacker. We also know that Ps declines as n (or t) increases for all values q < 0.5.

Therefore, not only does a proﬁt turn into a loss as the attack time progresses, but an

attacker with a smaller q will more likely lose at an earlier point in time. However, with

smaller values q, losses are also smaller. This is evident as larger values q impose higher

electricity costs eq(t).

3. Looking closer at the attack with q = 0.1, we notice that the attacker breaks-even after

approximately 15 minutes, i.e. after mining at most one block. Beyond that time, if the

attacker has not been able to fork the block, she will begin losing if she continues to perform

the attack. This means, the attacker would most likely surrender at that time to avoid losses.

4. For q = 0.2 to q = 0.4, the potential proﬁts ﬁrst grow as t increases and rewards are

accumulated until they reach a turning point where they begin to decline and eventually

turn into losses. This turning point occurs when Ps starts to decline with t.

5. For q ≥ 0.5, the attacker will always turn a proﬁt representing the majority attack. This
is evident as represented by the straight line with a continuous positive slope.

In summary, an attacker with a computational power q < 0.5 will eventually lose at some

point as t increases. On the other hand, an attacker with computational power q ≥ 0.5 will
always succeed with a proﬁt. However, it is important to note that this analysis does not

include the luck factor. Consider two miners with computational powers q1 and q2 respec-

tively, where q1 > q2. The miner with computational power q1 has more resources to solve

the proof-of-work, therefore can perform mining faster than the miner with computational

79

power q2. However, the miner with computational power q2 could still ﬁnd the solution to

the proof-of-work ﬁrst due to the randomness of the exhaustive search performed. From a

probabilistic standpoint, the chances are low.

3.4 Summary

Since the inception of Bitcoin, the popularity of the blockchain technology has grown ag-

gressively. In this chapter, we ﬁrst introduced the background of Bitcoin and explicated its

major building blocks and protocols. Next, we delved into crucial security concerns. We

began by discussing the double-spending attacks and analyzing their probability of success.

Using this analysis, we further evaluated the proﬁtability of the attack. We showed that

attackers with less than half of the total computational power of the system will eventually

lose at some point while performing the attack.

80

Chapter 4

d-MABE: Distributed Multilevel

Attribute-Based EMR Management

and Applications

4.1

Introduction

After introducing the major concepts of blockchain and extensively looking into its security,

in this chapter, we propose a novel distributed data sharing scheme that leverages blockchain

and smart contracts. We choose electronic medical records as our ﬁrst application to demon-

strate the beneﬁts of alleviating dependence on the record-generating institutions and thus

empowering patients over their own sensitive data. Our proposed work aims at solving the

security and privacy challenges and interoperability issue of the current solutions. In the ex-

isting systems, EMRs are stored within the generating institution, which signiﬁcantly limits

the patients’ access of their own records. They must also trust that the institutions will not

leak any sensitive information of their EMRs to unintended organizations. In comparison,

our proposed work empowers them over their EMRs. First, it provides record ownership

to the patients, allowing them to be in actual possession of their records, rather than have

them stored by the generating institution. Second, it allows patients to selectively share

81

diﬀerent parts of their records with distinct data users based on their privacy preferences.

Using our proposed scheme, the record-generating institutions can be eliminated from the

record-sharing process, hence, eliminating the need for the third trust party.

The rest of this chapter is organized as follows. In Section 4.2 the problem formulation is

described outlining the system model and our design goals. Next, in Section 4.3, our proposed

scheme is presented in detail outlining the proposed algorithms. In Section 4.4 we formally

prove the security and privacy of our proposed scheme. Following that, in Section 4.5, we

present a performance and application analysis that is supported with numerical results in

Section 4.6. Finally, in Section 4.7 , we provide an extended application discussion and draw

a conclusion in Section 4.8.

4.2 Problem Formulation

Upon visiting a medical institution, a patient interacts with a wide spectrum of distinct

staﬀ members that may include but is not limited to: admittance staﬀ members, physicians

assisted by nurses and technicians that provide the required treatment, and maybe even ﬁ-

nancial or discharging staﬀ members. If those staﬀ members can individually and eﬃciently

access the necessary parts of previous medical record(s) of the patient generated by other

institutions required to perform their speciﬁc jobs, they can provide the patient with expe-

dited admittance and enhanced diagnosis and treatment decisions, reducing the overall time

spent during the visit. For example, an ER physician may learn severe drug allergies or cur-

rent medication for an unresponsive patient being rushed into the ER from his/her previous

records. As a result, the physician can prevent iatrogenic illnesses caused by administering

inappropriate medication to that patient or harmful drug interactions. By accessing critical

82

medical information, patient safety is enhanced while saving time and resources. However,

for the sake of patient privacy, only necessary staﬀ members should be given permission to

access patient record(s) from the medical history determined by the proﬁle of the patient.

In this chapter, we denote a record generated by an institution for a patient as R. We
denote the record attributes as {Rj ∈ R | 1 ≤ j ≤ n}, where n is the maximum number
of attributes within the record. Record attributes may include previous medical history,

medical treatment, lab tests, medication, personal information, ﬁnancial statements, credit

card information, and others. The corresponding attribute values of the record generated

for the patient can be denoted as {a(Rj) | 1 ≤ j ≤ n}.

In current systems, records often have a variety of attributes and may be blocked or

intentionally delayed by competitive record-generating institutions. To prevent this, record-

generating institutions should have minimal control over the records of patients whereby

empowering patients over their own records. Patients may have diﬀerent attitudes in terms

of sensitivity of their record attributes. The challenge today is to provide patients with

a method that allows them to share the attribute values of their records eﬃciently and

securely while maintaining the privacy preferences they desire. Patients should also be

able to selectively share certain parts of their record(s) with the healthcare providers they

select. Although patients may not easily understand how to share the medical parts of their

record(s), empowering them would at least allow them to consult their trusted medical staﬀ

such as their Primary Care Physicians (PCPs) to decide on how and which parts of their

medical record(s) are to be shared. With their consent, patients may even delegate this

process to such entities to avoid situations such as being unconscious, requiring someone to

make decisions for them. Lastly, patients should not be concerned with the availability nor

the guaranteed secure storage of their records.

83

4.2.1 Design Goals

Our proposed scheme aims at satisfying the following:

Data Ownership: Patients are empowered over their records. They have full ownership

and control over how to share their records.

Fine-grained access control: Patients can grant speciﬁc access permissions to the desired

staﬀ members based on their privacy preferences. They can selectively share any part of their

records using any set of staﬀ member attributes they wish, limiting access to speciﬁc parts

of their record and certain staﬀ members at particular medical institutions.

Collusion resistant: Two or more staﬀ members that possess diﬀerent sets of attributes

or are employed at the same/diﬀerent institution(s) cannot combine their attributes to gain

access to any part of the records they are not authorized to access individually.

Data conﬁdentiality: Records are completely protected from any unauthorized staﬀ

members that do not possess the correct set of attributes predeﬁned by the patient. This

includes any type of space that stores the records for the patients.

Distributed storage: Patient records are stored in a distributed storage network. They

can be accessed at any time and by any permissioned staﬀ member. Storage is not centralized

or controlled by the record-generating institution.

Data access trace-ability: Patients can trace-back to the entire history of accesses of

their records. This allows patients to keep track of how their records are being accessed.

84

4.2.2 System Model

The general model of our proposed record-sharing scheme is illustrated in Fig. 4.1. The

system consists of six main entities:

Patients: Patients are data owners that wish to share their data selectively based on their

privacy preferences. We denote the set of patients as {pa ∈ P | 1 ≤ a ≤ ∞}.

Medical institutions: Medical institutions provide medical treatment and generate med-

ical records for the patients. We denote the set of medical institutions as {mb ∈ M | 1 ≤
b ≤ ∞}.

Staﬀ members: personnel employed at a medical institution. We denote the set of staﬀ

members at medical institution mb as {sb,l ∀ 1 ≤ l ≤ ∞}. Staﬀ members are categorized
into groups of similar characteristics based on the set of attributes A = {A1, A2 . . . , An}

they each possess. Attributes represent identiﬁers that are selected from an inﬁnite pool set.

They may be as general as the role of a staﬀ member and could be as unique as a biometric.

Key-issuer:

a semi-trusted party that generates keys for permissioned staﬀ members upon

their requests to access parts of the record.

Distributed storage P2P network: a non-trusted P2P network used by the patients to

store and share their records with staﬀ members at any medical institution.

Blockchain P2P network:

a non-trusted P2P network that maintains a blockchain to

regulate access permissions of patient records to the staﬀ members of medical institutions.

85

Figure 4.1: d-MABE general scheme orchestration.

In general, in order for patients, medical institutions, staﬀ members or key-issuers to in-

teract with any of the two P2P networks, they must run nodes that are capable of connecting

to either network. These nodes may be light-weight nodes (similar to the simple payment

veriﬁcation nodes in Bitcoin [51]) that can assemble transactions and propagate them to

the network. Running light-weight clients does not require powerful computing machines

and can be done via simple machines such as mobile devices making our proposed scheme

accessible to everyone. The common requirement for running either node is being able to

generate the public key Pub and private key Pr pair.

4.3 The Proposed d-MABE Scheme

In this section, we ﬁrst present an overview of the major chain of events in our proposed

scheme. For discussion purposes, we assume that a patient pa has already received some

medical treatment at institution mb and that a medical record R was generated for him/her.
We also assume that pa anticipates visiting some other institution {mb | b ≥ 1} in the future
and would like to share R selectively among its staﬀ members. Based on our description,
we then propose a generalized structure through three smart contracts that can be deployed

86

Key-issuerMedical institution PatientDistributed storage P2P networkBlockchainP2P network2. Accesspolicy 1. Partition and encrypt3. Visit4. Request access permissionns5. Granting keys7. Record fetching6. Fetch keysMost sensitiveLess sensitiveLeast sensitiveon a blockchain.

4.3.1 Scheme Orchestration

The proposed scheme can be divided into seven major events as demonstrated in Fig. 4.1.

Record partition and encryption: Following the work presented in [75], pa initially

deﬁnes a privilege-based access structure that satisﬁes his/her privacy desires. The access

structure is divided into k levels {L1,L2, . . . ,Lk} where each level is associated with an
access policy {T1,T2, . . . ,Tk}. The pa then partitions R into k segments {R1, R2, . . . , Rk} of
diﬀerent sensitivity. Each Ri ∈ R may represent one or more record attribute(s) depending
on how pa partitions R. Next, pa derives a set of symmetric keys {sk1, sk2, . . . , skk} as
described in Section 2.3.1. Using these keys, pa then encrypts each corresponding Ri ∈ R as

ERi = Sym-Encski(cid:107)SBK(Ri),

(4.1)

where Sym-Enc is a symmetric encryption algorithm such as the Advanced Encryption Al-

gorithm (AES) [44] and SBK denotes the subscription based key that is only accessible to

active members through membership subscription authentication. Members have no direct

access to SBK and can only use it while on site. This design serves two purposes: (i) it

prevents any users from sharing the ER(cid:48)is even if they share their ski’s with others and give

them unprivileged access and (ii) it provides an easy option to disable unsubscribed members

from accessing EMRs. In other words, with this design, key revocation is no longer needed.

Finally, using the Encryption function discussed in Section 2.3.3, pa encrypts each ski under

87

its corresponding Ti deﬁned in the privilege-based access structure, such that

Eski = CPABE-Enc(PK, ski,Ti),

(4.2)

where CPABE-Enc is the CP-ABE encryption function and PK is the public key generated

during the Setup phase. The pa then stores the generated ciphertexts {ER1, ER2, . . . , ERk}
and {Esk1, Esk2, . . . , Eskk} over IPFS. In cases that EMRs consist of multiple/diverse at-
tributes, it may be a burden for patients to deﬁne their privilege-based access structures

and may even leak sensitive information if deﬁned inappropriately. A possible approach that

patients may consider is to divide their records into multiple categories based on the nature

of data being shared. For example, a record could be divided into personal information,

medical information, and ﬁnancial information. For each category, a patient can then deﬁne

a separate privilege-based access structure which would reduce the overall complexity of each

structure. Patients may even consult their trusted Primary Care Physicians (PCPs) on how

to deﬁne a privilege-based access structure for proper medical information sharing.

Access policy: To facilitate data management and sharing, pa shares the privilege-based

access structure that incorporates the access policies {T1,T2, . . . ,Tk}. The access structure
is incorporated into a smart contract that is then deployed over the blockchain.

Medical institution visit: When pa visits a medical institution {mb | b ≥ 1} to get
medical treatment, pa interacts with various staﬀ members that may or may not belong to

a predeﬁned privilege-based access structure.

88

Requesting access permissions: Permissioned staﬀ members will be granted access to

certain parts of a record based on the attributes they possess. To request access, a staﬀ

member sb,l interacts with the smart contract previously deployed by pa that incorporates

the privilege-based access structure. The smart contract will verify whether the sb,l possesses

a set of attributes that can satisfy an access policy within the access structure. Once the

veriﬁcation is performed the smart contract will publicly announce the access permissions of

the sb,l over the blockchain.

Granting keys: The key-issuer continuously monitors the blockchain for access announce-
ments. An announcement contains the set of attributes A that the sb,l possesses. It is a form
of veriﬁcation to the key-issuer that the sb,l possesses set A before generating a secret key SK

using the KeyGeneration function described in Section 2.3.3. The key-issuer then encrypts

SK with the public key Pub of sb,l as the following

ESK = Asym-EncPub(SK),

(4.3)

where Asym-Enc is the asymmetric encryption function. To share the key with the sb,l, the

key-issuer transacts with a global smart contract that publicly announces the encrypted key

over the blockchain.

Key fetching: The sb,l can obtain the uniquely encrypted secret key ESK by monitoring

the announcements made over the blockchain. Once obtained, the sb,l can decrypt ESK using

his/her private key Pr that corresponds to the public key Pub such that

SK = Asym-DecPr(ESK),

(4.4)

89

where Asym-Dec is the asymmetric decryption function.

Record fetching: The storage location of the encrypted EMRs over IPFS is also provided

to the sb,l in an encrypted form under his/her public key in the announcement made over

blockchain. The sb,l decrypts the IPFS location to fetch the EMRs. Here, the sb,l possesses

the necessary key to decrypt the parts he/she has been granted access to. Once fetched, the

sb,l can decrypt the encrypted symmetric key Eski using the derived key SK as explained in

Section 2.3.3. That is

ski = CPABE-Dec(Eski, SK),

(4.5)

where CPABE-Dec is the CP-ABE decryption function. After obtaining ski, the sb,l can

derive the remaining set of secret keys {ski+1, . . . , skk} using equation (2.2). Finally, the sb,l
can decrypt each encrypted partition, such that

Ri = Sym-Decski(cid:107)SBK(ERi),

(4.6)

where Sym-Dec is the decryption function.

4.3.2 Smart Contracts

Our proposed scheme includes three main smart contracts: (i) staﬀ member registration,

(ii) access veriﬁcation and permission announcements, and (iii) granting keys.

Staﬀ member registration: Staﬀ Member Registration (SMR) is a global smart contract

that registers staﬀ members of medical institutions for patient EMRs access. The process

of registering staﬀ members by interacting with this smart contract can be delegated to a

90

Algorithm 1 SMR smart contract

staﬀID
staﬀAttributes[upBound]

1: struct staﬀMember contains
2:
3:
4: end
5: variable mapping(address → staﬀMember) Map
6: function addStaﬀMember( address, id, attributes[upBound])
7:
8:
9:
10:
11:
12: end function

if (msg.sender (cid:54)∈ setOfCertiﬁers) then
else

throw

Map( address → id, attributes[upBound])

end if

13: function getAttributes( address)
14:
15: end function

return Map[ address].staﬀAttributes

number of certiﬁed institutions that manually verify staﬀ members against their possessed

attributes before registering them. This is achieved by incorporating precise policies into

the smart contract which requires certain identities to trigger it. Once a certiﬁed institution

veriﬁes the attributes of a staﬀ member, it executes the smart contract and uses the attributes

as input. This results in the attributes being stored over the blockchain. This process maps

the identity (in our case the Ethereum address) of the staﬀ member to the set of attributes

possessed. By observing the blockchain, we can trace-back the on-going activity of these

certiﬁed institutions as they add, delete, or modify the information of staﬀ members, hence,

we can also detect malicious data manipulation.

Algorithm 1 outlines the general functions of the SMR smart contract. Lines 1-4 rep-

resent a generalized data structure staﬀMember that the smart contract uses to store new

staﬀ members. It incorporates the identity staﬀID and the array of attributes staﬀAttributes

of the staﬀ member. Lines 6-15 represent the two main functions addStaﬀMember and

getAttributes of the smart contract. The addStaﬀMember function allows only certiﬁed insti-

91

Algorithm 2 AVPA smart contract

1: Initialized variables address = SMR address
2: event LogAnnounce(address, attributes,Ti)
3: function verifyRequest( msg, σ)
signer ← ecrecover( msg, σ)
4:
if (signer (cid:54)= Hash(Pub)) then
5:
6:
else
7:
8:
9:
10:
11:
12:
13:
14:
15:
16:
17: end function

r ← SMR( address)
fetched ← r.getAttributes(signer)
for i ← 1 to k do

end for

if (satisfy(Ti, fetched) == true) then

LogAnnounce(signer, fetched,Ti)
break

throw

end if

end if

tutions setOfCertiﬁers to upload the data of new staﬀ members after physically verifying their

attributes. This conditioned access is to prevent malicious attackers from granting access to

themselves or others. On the contrary, the getAttributes function can be called by any user.

Given the address address of a speciﬁc staﬀ member, this function returns the previously

veriﬁed and stored attributes of this staﬀ member.

Access veriﬁcation and permission announcements: The Access Veriﬁcation and

Permission Announcements (AVPA) is a unique smart contract deﬁned by each patient that

incorporates his/her personally deﬁned privilege-based access structure in order to facilitate

the selective record management and sharing. The staﬀ members interested in obtaining

parts of the record interact with AVPA contracts to request access permissions. The smart

contract is outlined in Algorithm 2.

The smart contract performs a sequential veriﬁcation process. This is represented in a

single function verifyRequest that takes two inputs: a message msg prior to being signed and

92

its signature σ. That is

msg = Hash(staﬀID),

σ = Sign(Pr, Pub, msg),

(4.7)

(4.8)

where Hash is the hashing function and Sign is an ECDSA [76] signing function.

In the initial veriﬁcation process, the AVPA smart contract uses an ECDSA compatible

validation function ecrecover( msg, σ) to validate σ by verifying whether

ecrecover( msg, σ) = Hash(Pub)

(4.9)

holds true. In fact, if the validation function is conducted truthfully, then the correctness

of equation (4.9) follows from the ECDSA. Next, signer is compared to the address of the

requesting staﬀ member as shown in line 5. If the addresses match, the contract uses signer to

fetch the previously veriﬁed and stored attributes of this staﬀ member in the SMR contract.

However, this method is liable to replay attacks [76]. In section 4.4, we discuss this issue

and propose a countermeasure.

If the initial veriﬁcation is successful, the smart contract executes the second veriﬁca-

tion as outlined in lines 10-14. In this process, the fetched attributes are checked against

the access policy Ti within the access structure. The access structure is deﬁned by the pa-
tient during contract deployment and maintains the desired privacy settings as discussed

in equations (4.1) and (4.2). The access policies are tested sequentially starting from the

highest ranked access policy T1 at level L1. If the attributes satisfy the access policy Ti,
the veriﬁcation is discontinued and the function ﬁres an event LogAnnounce that announces

a permanent log of the smart contract transaction stored over the blockchain. This event

93

is a public and immutable announcement to ensure that the staﬀ member in possession of

signer should be granted access to the parts of the record {Ri . . . Rk} corresponding to levels
{Li . . .Lk}.

Granting keys: Granting Keys (GK) is a global smart contract that is used to share

the location of the generated and encrypted access keys over the distributed storage. The

contract generally consists of a single function, addKey, as outlined in Algorithm 3.

When the key-issuer observes a LogAnnounce event ﬁred by the AVPA smart contract,

it generates an access secret key SK for the speciﬁed staﬀ member using the unique set of

attributes fetched stored in the logs. Next, the key-issuer encrypts this access key as shown

in Equation 4.3 with the public key of the staﬀ member. It also encrypts the IPFS record

location as ERA = Asym-EncPub( ipfsRecordAddress) with the same public key and uploads

both values to the IPFS storage, maintaining its reference location. Following that, the

smart contract ﬁres an event, LogKeys as shown in line 6, which permanently stores the

address of the requesting staﬀ member staﬀAddress along with the IPFS storage location

ipfsStorageAddress. Using ipfsStorageAddress, the staﬀ member can fetch the encrypted key

ESK and ERA stored over the network. However, as shown in the addKey function, only

certiﬁed key-issuers setOfIssuers are capable of triggering this function. Therefore, attackers

are prevented from spamming the smart contract logs with fake keys or addresses. The staﬀ

member can listen for these ﬁred events and obtain the corresponding ipfsStorageAddress as

it appears in the logs. Using the ipfsStorageAddress, the staﬀ member can fetch ESK and

ERA from the network. It is important to note that only the staﬀ member in possession of

the private key Pr corresponding to the public key Pub will be able to decrypt the fetched

encrypted key and IPFS record address. Attackers continuously listening to the ﬁred events

94

Algorithm 3 GK smart contract

1: event LogKeys(staﬀAddress, ipfsAddress)

if (msg.sender (cid:54)∈ setOfIssuers) then
else

2: function addKey( staﬀAddress, ipfsAddress)
3:
4:
5:
6:
7:
8: end function

LogKeys( staﬀAddress, ipfsStorageAddress)

throw

end if

may be able to learn certain IPFS addresses storing generated encrypted keys, but not

the actual keys or the record address. First the data user decrypts

ipfsRecordAddress =

DecPr(ERA) to learn the location of the encrypted record. Next, using the obtained secret

key SK, the staﬀ member can then decrypt Eski to generate the secret key ski. Finally, the

staﬀ member can decrypt ERi stored over IPFS at ipfsRecordAddress to obtain the record

part Ri.

4.3.3 Access Permission Revocation

Attributes possessed by members may change over time due to, for example, job switch

or retirement. As a result, the access rights to the patient records should be revoked to

be consistent with the access policy. However, this could be challenging since it requires

the attributes of the staﬀ members to be updated periodically. While adding an expiration

date [77] to each attribute when registering staﬀ members seems to be a simple solution, it

requires the patients to incorporate time constraints into their access policies.

The proposed d-MABE scheme can handle access permission revocation eﬃciently with-

out requiring any extra process for the key-issuer. The staﬀ members only need to be

registered to ensure attribute-based access control. As a result, if at any point in time a staﬀ

member is unable to provide evidence of registration to the level of access claimed, future

95

access to the patient records will be disabled. Consequently, if the staﬀ member attempts

to request records he/she is no longer entitled to access, the AVPA smart contract will fail

to verify the request.

For data users that have already accessed certain records and are no longer registered,

our proposed scheme can also revoke their access permissions since each record partition

Ri is encrypted under ski(cid:107)SBK. To revoke access from those data users that are no longer
registered, the patient only needs to generate a new subscription based key SBK then re-

encrypt his/her record partitions under ski(cid:107)SBK. This design does not require regenerating
a new set of keys {sk(cid:48)1 . . . sk(cid:48)k} or making any changes to the privilege-based access structure.
To prevent any data user that has been granted key ski at some point with an inactive

membership from accessing the record partitions, we only need to re-encrypt the record

partitions either periodically (such as monthly), or based on demand.

4.4 Security and Privacy Analysis

In this section, we present the security and privacy discussions of our proposed scheme. We

assume our system runs over a blockchain that is maintained by a signiﬁcant number of

nodes, for example, Ethereum. In such blockchain platforms, it is very expensive to tamper

with veriﬁed transactions processed into blocks. The best bet of the attackers would be

to control more than half of the computational power in the network in order to perform

51% attacks. We also consider distributed storage networks such as IPFS that are content-

addressable. These networks use the hash computation of the original data when storing

and referencing it over the network nodes, resulting in tamper-resistant storage. Moreover,

we consider adversaries that can act as any entity within the system and can manage to

96

connect to either the IPFS or the blockchain network. Such adversaries are capable of

generating fraudulent transactions and propagating them through the blockchain network.

By fraudulent transactions, we mean transacting with smart contracts in an eﬀort to access

data to which they are not granted access. They may also deploy their own smart contracts

over the blockchain, request/store data over the IPFS nodes, and continuously monitor the

network channels.

4.4.1 Security Analysis

We ﬁrst discuss how our proposed scheme is secure against potential attacks. Next, we

present good practices that may help improve the security of the system.

Theorem 3. The proposed scheme is secure against replay attacks.

Proof. The initial veriﬁcation process performed by the AVPA smart contract requires the

requesting staﬀ members to submit ECDSA digital signatures σ to prove their identities.

Given that all data sent over the blockchain is public, malicious attackers can easily maintain

a copy of the submitted signatures and later on impersonate the honest staﬀ members giving

them access to data they should not be able to access. To work around this issue, staﬀ

members are required to time-stamp their digital signatures before transacting with the

AVPA contract, such that

σ ← Sign(Pr, Pub, Hash(staﬀID(cid:107)T)),

(4.10)

where T is the time at which the signature is generated. Any submitted time-stamped sig-

nature is stored over the blockchain in order for the nodes to check if it has been submitted

before. The blockchain nodes will not execute any requests associated with time-stamped

97

signatures that have appeared previously. Therefore, to be able to perform replay attacks

requires the attackers to reverse the transactions that contain an already used digital signa-

ture. The success of reversing a transaction can be modeled as a race between the honest

miners and the attackers trying to generate blocks by competing to solve a hard cryptop-

uzzle known as Proof-of-Work [51]. The race can be modeled as a binomial random walk.

It is denoted as z which represents the number of blocks generated by the honest miners

with computational power p minus the number of blocks generated by the attackers with

computational power q = 1 − p. This race can be derived as

zi + 1, with probability p,

zi − 1, with probability q,

zi+1 =

(4.11)

Now, using the negative binomial distribution, we can model the probability of success

of the attackers. Assume the key generator waits for n blocks to be generated by the

honest miners with computational power p before generating a private key for the requesting

attacker. Also assume that, at that time, the attacker is able to secretly generate m blocks

with computational power q = 1−p, where m = n−z−1. By deﬁnition, this can be modeled
as the m number of blocks that the attacker can generate before the n number of blocks

generated by the honest miners. Therefore, the probability of a reversing a transaction for

a given value m is calculated as

P (m) =

(cid:18)m + n − 1
(cid:19)

m

× pnqm.

(4.12)

Overall, the probability for an attacker to surpass successfully the number of blocks

98

generated by the honest miners can be computed as

∞(cid:88)

m=0

Ps =

P (m) × Qn−m−1

(cid:18)m + n − 1
(cid:19)

∞(cid:88)

m

= 1 −

m=0

(cid:17)n−m

(cid:16) q

p

1 −

1 − 1 = 0,

× pnqm ×

(4.13)

,

if q < p or k ≤ n − m,

if q > p or k > n − m.

Equation (4.13) conﬁrms that when q > p, the attacker will always succeed since Ps = 1.

When q < p, the probability of success can be deﬁned as

(cid:32)

(cid:18) q

(cid:19)n−m(cid:33)

p

× pnqm ×

1 −

(cid:18)m + n − 1
(cid:19)
(cid:18)m + n − 1
(cid:19)

k

m

n−1(cid:88)
n−1(cid:88)

m=0

m=0

Ps = 1 −

= 1 −

× (pnqm − pmqn).

(4.14)

Therefore, as more blocks are appended to the blockchain and q < p, replay attacks are

infeasible.

Theorem 4. The proposed smart contracts are collusion resistant.

Proof. Our proposed scheme requires staﬀ members to become registered through the SMR

smart contract before being able to request access permissions through the AVPA contracts.
During registration, each staﬀ member address is mapped to a single set of attributes A =

{A1, A2 . . . , An} after providing proof of possession to the certiﬁed institutions. The AVPA
smart contract can only accept a single signature as input when triggered. Therefore, to

perform a collusion attack, the attackers are required to regenerate a single digital signature

of an already registered staﬀ member that possesses the set of attributes desired. For ECDSA,

a signature is generated by randomly selecting a cryptographically secure random integer d,

99

such that

(x1, y1) = d × G,

(4.15)

where (x1, y1) are the calculated elliptic curve points, G is the generator of the elliptic curve

with a large prime order n, and d ∈r [1, n − 1]. Next, the digital signature is calculated as
two components σ = (r, s). That is

r = x1 mod n,

s = d−1(z + r · pr) mod n,

(4.16)

(4.17)

where z is the Ln leftmost bits of msg such that Ln is the bit length of the group order n. If

any of the values r or s are equal to zero, d is randomly selected again and equations (4.15),

(4.16), and (4.17) are recalculated. Since the signature components (r, s) are both derived

based on the random value d, it is infeasible for the attackers to regenerate a speciﬁc signature

that belongs to a certain staﬀ member, hence, the smart contracts are collusion resistant.

Theorem 5. Using a content-addressable storage such as IPFS ensures that data is tamper-

resistant.

Proof. In IPFS, records are initially partitioned into n smaller segments before being stored.

That is

R = {R1 . . . Rn},

(4.18)

where each Ri ∈ R is 256KB, by default. However, the size of a partition may also be
customized by the patient as desired. Next, each Ri ∈ R is cryptographically-hashed as

hi = Hash(Ri),

100

(4.19)

where hi is the resulting digest and is used to reference data stored in the network nodes
through the DHT. For the Hash function, it is computationally infeasible to ﬁnd an R(cid:48)i (cid:54)= Ri,
such that

Hash(R(cid:48)i) = Hash(Ri).

(4.20)

Therefore, it is computationally infeasible for an attacker to tamper with the records of the

patients.

Theorem 6. The proposed scheme can counter Distributed Denial of Service (DDoS) at-

tempts.

Proof. DDoS attacks over IPFS aim at congesting the network by sending numerous requests

to the nodes storing the requested data. However, to perform such attacks, the attackers

must initially identify all the nodes storing the target data from the DHT. Next, they must

disrupt the service by sending these nodes a large number of requests, such that

Number of requests (cid:29) MAX TRAN,

(4.21)

where MAX TRAN is the maximum number of transactions that are accepted by a node as

deﬁned in its conﬁguration ﬁle.

Attackers may also try to isolate and monopolizes all inbound and outbound connections

and data ﬂows of the storage nodes (referred to as an Eclipse attack [78]) from the entire

network. These attacks become infeasible as the number of nodes storing the data increases

since the record is replicated across the network nodes. In addition, such attacks are limited

by the time tscheme for a request to appear over the blockchain until the staﬀ member fetches

the data from the IPFS network nodes. Attackers must be able to perform the entire attack

101

in time tattack. That is

tattack < tscheme.

(4.22)

An attacker with no prior knowledge of which data will be requested by staﬀ members

has no advantage of identifying the nodes storing the data. Therefore, as the value of

tscheme becomes smaller and with an increased record replication, DDoS attacks become

infeasible.

4.4.2 Recommended Security Practices

Security may also be enhanced by adopting good development practices. In the following,

we present some highly recommended techniques that developers should keep in the back of

their minds while implementing the system.

Global smart contracts access control:

It is necessary to ensure the integrity of the

data stored in the SMR and GK global contracts since they provide the necessary informa-

tion required to verify AVPA contracts and grant keys to the staﬀ members. Therefore, it is

imperative to halt the attackers from manipulating the state of these contracts with fraudu-

lent information. Attackers with access rights that can modify the SMR contract may easily

register themselves with any set of attributes required to pass the veriﬁcation of certain

AVPA contracts. It is also feasible to perform DDoS attacks by deleting or modifying the

registration of staﬀ members in the SMR contract or spamming the logs of the GK contract

with fraudulent information. This will result in interrupting or slowing down the process

of retrieving patient records. Therefore, when developing such contracts, it is important to

limit the ability to modify the state of global contracts to only certiﬁed institutions. This is

outlined in lines 7 and 3 of Algorithms 1 and 3, respectively. The current smart contract pro-

102

gramming languages, for example, Solidity [79], provide speciﬁc functions to verify certain

conditions or variable values before execution is performed. Well-known examples include

the require() or assert() functions that verify preliminary conditions and consume a portion

or all of the gas associated with a transaction. In general, two main reasons for charging

users gas to trigger smart contracts deployed over the blockchain are to reward the executing

network nodes and make potential attacks expensive. This means that attackers must pay

for each transaction they request for it to be executed by the network nodes. This approach

will not prevent attackers from registering themselves into the SMR with a set of fraudulent

attributes but could help halt those trying to spam the network. Assuming an attacker has

been able to modify the state of a global smart contract, the attacker will still have to pay

for each requested transaction. To increase the security measures in this scenario, smart

contracts may be designed to require a minimum gas amount in order to execute any smart

contract. However, this countermeasure results in a trade-oﬀ between security and cost for

the honest users.

External smart contract calls: Our proposed scheme relies on external smart contract

calls that fetch the stored attributes of registered staﬀ members to compare them to those

sent by the requesting staﬀ members before validating if they satisfy a certain access policy.

External calls can introduce several unexpected risks or errors if the smart contracts mistak-

enly execute malicious code. Therefore, it is important that patients cautiously implement

their AVPA contracts to handle errors when externally calling the SMR global contracts dur-

ing the initial veriﬁcation process. In Solidity [79], external calls can be performed in two

ways:

low-level calls and contract calls. Low-level calls do not throw exceptions, instead

they return false if the external call of another smart contract itself encounters an exception.

103

Patients that use low-level calls must check if the returned value within the AVPA contract

fails and then properly handle it. On the other hand, contract calls are more direct as they

throw exceptions as encountered by the external call. From a security standpoint, it may be

more secure to use contract calls especially when the patient is less experienced in handling

complex scenarios.

Non-Public Blockchains: The security of our proposed scheme could also be enhanced

by running the smart contracts over a consortium or private blockchain. In such a setting,

miners are selected nodes within the peer-to-peer network that are known and trusted. If any

of these nodes attempt to act maliciously, they are immediately identiﬁed and eliminated.

A good example of such candidate nodes could be the medical institutions themselves that

are interested in maintaining the system in return for eﬃcient data access when required.

4.4.3 Privacy Analysis

While our proposed scheme aims at satisfying the privacy desires of patients and their records,

it tends to leak knowledge about the staﬀ members and their access patterns. Each staﬀ

member must initially become registered and is linked to a unique address before engaging

with the system. Since this information can be leaked, attackers can simply monitor requests

triggered by the staﬀ members by observing the blockchain activity. However, since no

information is revealed about the records of patients through the AVPA smart contracts,

attackers can only track when staﬀ members trigger requests, but not the data they have

requested or the patient information.

From the perspective of patients, our proposed scheme is intentionally designed to allow

them to trace-back the accesses performed by staﬀ members to their records. This is possible

104

by tracing the history of LogAnnounce events ﬁred as outlined in line 12 of Algorithm 2. All

ﬁred events are permanently stored over the blockchain allowing the patients to continuously

monitor how their records are being accessed. If a patient notices irregular staﬀ member

accesses that are undesired, the patient can simply redeploy a new AVPA smart contract that

deﬁnes new or more strict access policies.

Theorem 7. Under the proposed model, the privacy of patient records location is preserved.

Proof. First, since the user record is encrypted as described in equation (4.3) before it is

uploaded to IPFS, the content of the user record is protected. Second, for the current IPFS

system, any user in possession of the data can reproduce its cryptographic-hash, search its

corresponding location that is maintained by the DHT, and locate it within the peer-to-

peer network nodes determined by the default IPFS partitioning techniques. This will also

result in recursively locating any data linked to it. To ensure data privacy, we will append

a random value r to the record before uploading it to IPFS. Therefore, the storage location

is determined by

h = Hash(R(cid:107)r).

(4.23)

The randomness of r makes it infeasible for the attacker to locate the data stored over IPFS

nodes.

4.5 Performance Analysis

In this section, we will evaluate the proposed smart contracts in terms of performance and

monetary costs.

The performance measurement of smart contracts can be performed either through (i)

105

calculating the gas costs required to deploy/transact with the contract, or (ii) measuring the

computational complexity of the algorithms. Method (i) is usually more accurate since it

presents an estimate of the actual monetary costs paid by the users in order to deploy/trans-

act with the contract. However, this method is directly dependent on the actual source code

implementation. That being said, developers that consider optimization techniques when

coding their contracts can probably end up reducing the gas costs. In addition to this, even

if we were to assume access to the entire source code of smart contracts, it is still diﬃcult

to determine the expected gas costs, since contracts may behave diﬀerently based on their

state. On the other hand, method (ii) can help give us an indication of the computational

complexity based on the number of inputs. This computational complexity directly corre-

lates to the gas costs paid by the users. In other words, the more complex an algorithm is,

the higher gas costs it will require to be deployed/executed. However, with smart contracts,

the concept of reducing computational complexity to make it run faster is not of signiﬁcant

importance since blockchain transactions are generally executed at discrete intervals. There-

fore, if we were to reduce the computational complexity, it would only be to reduce the gas

costs required rather than the total latency.

To measure the performance of our proposed smart contracts, we will present discussions

on each of our proposed algorithms that combine both methods. Our discussions assume

that smart contracts will be implemented in Solidity [79], a contract-oriented, high-level

language for implementing smart contracts deployable over the Ethereuem blockchain and

processed by the EVM.

106

4.5.1 SMR Contract

As outlined in Algorithm 1, the addStaﬀMember function takes attributes as input to register

a new staﬀ member. Due to the computational limitations of the EVM, returning dynamic

content from external function calls is not possible, i.e. returning a dynamically sized array of

attributes resulting from the getAttributes function when called by the AVPA smart contract.

This means, the size of Map[ address].staﬀAttributes must be ﬁxed in order to be executed

by the EVM. As a result, the only workaround is to use statically-sized upBound arrays of

attributes as input to the addStaﬀMember. Therefore, in this case, performance is mainly

based on the size of upBound, such that

Performance ∝ upBound.

(4.24)

4.5.2 AVPA Contract

As discussed previously in Agorithm 2, the AVPA smart contract consists of two sequential

veriﬁcations that play a dominant role in the performance of the contract.

In the initial

veriﬁcation, an input digital signature σ is validated to prove the identity of the requesting

staﬀ member. With Solidity, the only available cryptographic function that can perform

such a process is the ECDSA ecrecover function. The function recovers the Ethereum address

associated with the public key from the elliptic curve signature or returns zero on error. At

the time of writing, the ecrecover function requires 3000 gas units. In comparison to most

of the available possible computations available by the EVM [34], the ecrecover function is

considered to be relatively expensive. However, this initial veriﬁcation is ﬁxed with each

AVPA contract transaction.

107

Assuming the initial veriﬁcation is successful, the AVPA externally fetches the attributes

of the staﬀ member to test them against the incorporated access structure. Similar to

the SMR contract, the performance of the AVPA contract is dependent on the upBound of

the fetched attributes. Finally, the fetched attributes are tested against the access policies

starting at the highest level within the hierarchy. Here, performance is aﬀected by the

complexity of the overall access structure deﬁned and the number of levels k within the

hierarchy. Assuming the worst case scenario, a requesting staﬀ member might have his/her

attributes tested against all access policies within the access structure and only receive the

least sensitive part Rk of the record or nothing at all. The computational complexity can

be represented as O(k × upBound × |Ti|), where |Ti| is the number of attributes that form
the access policy at level Li. To reduce this complexity, we can modify the AVPA contract
such that the staﬀ members can request that their attributes be tested against only speciﬁc

access policies rather than being tested sequentially against all policies. This would reduce

the computational complexity to O(upBound × |Ti|). We can also incorporate optimized
search functions, for example, binary search, that can help optimize searching for attributes

in the fetched set against those in the access policies. This means that we can further reduce

the computational complexity to O(upBound × log |Ti|). However, it is important to note
that in such a scenario, the patients need to make their access structures publicly available

in order for the staﬀ members to request certain access policies. This results in a trade-oﬀ

between the performance of the AVPA contract and the privacy of the access policy deﬁned.

4.5.3 GK Contract

The GK contract requires the least computational complexity and gas costs with respect

to the SMR and AVPA smart contracts. The computations involved are as simple as ﬁring

108

a LogKeys event when access permissions are granted to staﬀ members. The gas costs of

such operations are minimal in comparison to the other computations in the SMR and AVPA

smart contracts.

4.6 Empirical Results

In this section, we implement, simulate and present the estimated costs required to de-

ploy/transact with the smart contracts of our proposed scheme in multiple scenarios. Our

experiments are performed over the Ethereum testnet blockchain [80]. We speciﬁcally choose

the Ethereum blockchain given the fact that it is by far the most widely used smart contract

hosting blockchain. However, we note that our proposed scheme can also be implemented

over any other blockchain that incorporates smart contracts.

In our simulation, the AVPA smart contract with the highest degree of complexity is

implemented. When executed, the AVPA smart contracts sequentially check whether the

attributes of the requesting users satisfy the access policies in order starting from the highest

level of the hierarchy. As discussed in Section 4.5, there is a trade-oﬀ between privacy and

performance. Therefore, we note that our presented results can be further enhanced in terms

of performance as discussed previously. However, we intentionally choose to implement our

smart contracts this way to show that even with these conditions, our proposed contracts are

still cheaper when compared to the current systems used by the record-generating institutions

to share medical records. In contrast to the current record-sharing systems, our proposed

scheme eliminates the role of the record-generating institution completely. This results

in eliminating other overhead costs required when sharing medical records. For example,

in developed countries such as the U.S., record-generating institutions must comply with

109

Table 4.1: Estimated d-MABE smart contracts deployment at k=5 and N =25.

Smart Contract

SMR
AVPA

GK

k = 5, N = 25

Gas Limit Cost/ETH Cost/USD Data/bytes

240383
857419
125617

0.004809
0.017148
0.002512

1.4
5.21
0.74

736
3323
304

Table 4.2: Estimated d-MABE smart contracts deployment at k=5 and N =50.

Smart Contract

Gas Limit Cost/ETH Cost/USD Data/bytes

k = 5, N = 50

SMR
AVPA

GK

240447
1132628
125617

0.004809
0.022653
0.002512

1.42
6.67
0.74

736
4536
304

Table 4.3: Estimated d-MABE smart contracts deployment at k=10 and N =100.

Smart Contract

Gas Limit Cost/ETH Cost/USD Data/bytes

k = 10, N = 100

SMR
AVPA

GK

240447
1303607
125617

0.004809
0.026072
0.002512

1.42
7.56
0.74

736
5188
304

certain state laws that enforce a maximum fee a record-generating institution may charge

when requested to share its records. Table 4.4 illustrates a sample of these fees that could

be inclusive or exclusive to the methods of delivering the record itself. Given the current

competitiveness, in most cases, the medical institutions would charge the entire allowed fees

to maximize their proﬁts.

For our simulations, we apply the values k = {5, 10}, upBound = {10, 20, 30, 40, 50} and
N = {25, 50, 100}, where k is the number of levels in the hierarchy and N is the total
number of attributes used in the entire access structure. Without loss of generality, we also

assume that the number of attributes for each level follows the normal distribution. For

example, if k = 5 and N = 50, then the number of attributes used to deﬁne an access policy

is |Ti| = 10. The following experiments have been conducted through an Ethereum node
running on a system with 1.8 GHz Intel Core i5 and 8 GB RAM. Our numerical results are

110

Table 4.4: U.S maximum state fees to copy and deliver records upon being requested by
others.

State

Search fee

Cost per page

Misc. costs per page

California [81]

N/A

$0.25

Microﬁlm: $0.50

Texas [82]
Florida [83]

$82.95

$1.00/year

New York [84]

N/A

Illinois [85]

$27.91

P.1-10: $45.79 ﬂat fee
P.11-60: $1.54
P.61-400: $0.76
P.401+: $0.41
$1.00
$0.75

P.1-25: $1.05
P.26-50: $0.70
P.50+: $0.35

Microﬁlm:
P.1-10: $69.74 ﬂat fee
P.11+: $1.54

Microﬁlm: $2.00

X-rays: Actual cost of reproduction

Microﬁlm: $1.74

Table 4.5: Estimated d-MABE addStaﬀMember function costs.

upBound

10

20

30

40

50

246531

Gas Limit
1060034
Cost/ETH 0.004931 0.008998 0.013065 0.017133 0.021201
Cost/USD

449890

653249

856674

2.65

3.85

5.06

6.29

1.45

Table 4.6: Estimated d-MABE addKey function costs.

upBound

10

20

30

40

50

26379

Gas Limit
Cost/ETH 0.000528 0.000528 0.00053 0.000524 0.000528
Cost/USD

26380

26375

26379

26378

0.15

0.15

0.14

0.14

0.14

also the averages of 10 trials under each scenario.

Based on our experiments, the costs of deploying our smart contracts are not aﬀected by

the value of upBound since there is no major change in code as this value changes. Therefore,

for all values of upBound that we tested, the costs remained constant. Tables 4.1, 4.2, and 4.3

summarizes these costs in terms of gas limits and their estimated and equivalent costs in ETH

and United States Dollar (USD) at the time of testing. To measure the optimum gas limit

for each smart contract deployment, we used the JSON-RPC method, eth estimateGas [86],

that generates and returns an estimate of the required gas limit based on the network success

rate. We also set the gas price to 20 GWEI, where 1 ETH = 1× 109 GWEI. We intentionally

111

Figure 4.2: Estimated costs to run the d-MABE AVPA smart contract.

select this value relatively higher than the average gas prices speciﬁed in [87] for guaranteed

and fast processing. Again, we note that our presented results can be further optimized by

selecting reduced gas price values. As demonstrated in Tables 4.1, 4.2, and 4.3, the estimated

costs for deploying the SMR and GK smart contracts remain constant as we alter the values

of k and N . However, the costs for deploying the AVPA smart contracts increase with an

increase in the values k or N .

Figure 4.2 demonstrates the changes in USD costs of transacting with already deployed

AVPA smart contracts as we alter the values k and N for permissioned staﬀ members at

levels L1, L5 and L10. As shown by the ﬁgure, we recognize an increase in costs as these
values increase. We also realize that as the upBound value increases while keeping the values

k, N , and Li constant, the costs slightly increase.

Finally, in Tables 4.5 and 4.6, we present the gas limits and costs in both ETH and USD

to transact with the addStaﬀMember and addKey functions. As shown in Table 4.5, the costs

112

upBoundEstimated cost in USDk=5,N=25,L1<latexit sha1_base64="Eyx/eJfdgOX8+REIv8C5EGpDPRs=">AAACAXicbVDLSsNAFL3xWesr6kZwM1gEF1KSquimUHDjQqSCfUAbwmQ6aYdOHsxMhBLqxl9x40IRt/6FO//GSZuFth6Y4XDOvdx7jxdzJpVlfRsLi0vLK6uFteL6xubWtrmz25RRIghtkIhHou1hSTkLaUMxxWk7FhQHHqctb3iV+a0HKiSLwns1iqkT4H7IfEaw0pJr7g+r5yfotlrRfzfAakAwT2/Gru2aJatsTYDmiZ2TEuSou+ZXtxeRJKChIhxL2bGtWDkpFooRTsfFbiJpjMkQ92lH0xAHVDrp5IIxOtJKD/mR0C9UaKL+7khxIOUo8HRltqSc9TLxP6+TKP/SSVkYJ4qGZDrITzhSEcriQD0mKFF8pAkmguldERlggYnSoRV1CPbsyfOkWSnbp2X77qxUq+RxFOAADuEYbLiAGlxDHRpA4BGe4RXejCfjxXg3PqalC0beswd/YHz+AIwZlP8=</latexit>k=5,N=25,L5<latexit sha1_base64="Pt7DeEQFWWMUIJQi0t1jahbTL7s=">AAACAXicbVC7SgNBFL0bXzG+Vm0Em8EgWEjYjQZtAgEbC5EI5gHJssxOZpMhsw9mZoWwxMZfsbFQxNa/sPNvnE1SaOKBGQ7n3Mu993gxZ1JZ1reRW1peWV3Lrxc2Nre2d8zdvaaMEkFog0Q8Em0PS8pZSBuKKU7bsaA48DhtecOrzG89UCFZFN6rUUydAPdD5jOClZZc82BYrZyi22pZ/90AqwHBPL0ZuxXXLFolawK0SOwZKcIMddf86vYikgQ0VIRjKTu2FSsnxUIxwum40E0kjTEZ4j7taBrigEonnVwwRsda6SE/EvqFCk3U3x0pDqQcBZ6uzJaU814m/ud1EuVfOikL40TRkEwH+QlHKkJZHKjHBCWKjzTBRDC9KyIDLDBROrSCDsGeP3mRNMsl+6xk350Xa+VZHHk4hCM4ARsuoAbXUIcGEHiEZ3iFN+PJeDHejY9pac6Y9ezDHxifP5IplQM=</latexit>k=5,N=50,L1<latexit sha1_base64="MtCFjrNhVo+6kvIRpT7MTQs1TH4=">AAACAXicbVA9SwNBEJ2LXzF+ndoINotBsAjhLhq0CQRsLEQimBhIjmNvs0mW7H2wuyeEIzb+FRsLRWz9F3b+G/eSKzTxwcDjvRlm5nkRZ1JZ1reRW1peWV3Lrxc2Nre2d8zdvZYMY0Fok4Q8FG0PS8pZQJuKKU7bkaDY9zi990aXqX//QIVkYXCnxhF1fDwIWJ8RrLTkmgejWrWEbmpVq4S6PlZDgnlyPXFt1yxaZWsKtEjsjBQhQ8M1v7q9kMQ+DRThWMqObUXKSbBQjHA6KXRjSSNMRnhAO5oG2KfSSaYfTNCxVnqoHwpdgUJT9fdEgn0px76nO9Mj5byXiv95nVj1L5yEBVGsaEBmi/oxRypEaRyoxwQlio81wUQwfSsiQywwUTq0gg7Bnn95kbQqZfu0bN+eFeuVLI48HMIRnIAN51CHK2hAEwg8wjO8wpvxZLwY78bHrDVnZDP78AfG5w+I9pT9</latexit>k=5,N=50,L5<latexit sha1_base64="BpGRNFDKzFgB2dbYyYDxWUITqXY=">AAACAXicbVA9SwNBEJ2LXzF+ndoINotBsAjhLhq0CQRsLEQimBhIjmNvs0mW7H2wuyeEIzb+FRsLRWz9F3b+G/eSKzTxwcDjvRlm5nkRZ1JZ1reRW1peWV3Lrxc2Nre2d8zdvZYMY0Fok4Q8FG0PS8pZQJuKKU7bkaDY9zi990aXqX//QIVkYXCnxhF1fDwIWJ8RrLTkmgejWrWEbmpVq4S6PlZDgnlyPXGrrlm0ytYUaJHYGSlChoZrfnV7IYl9GijCsZQd24qUk2ChGOF0UujGkkaYjPCAdjQNsE+lk0w/mKBjrfRQPxS6AoWm6u+JBPtSjn1Pd6ZHynkvFf/zOrHqXzgJC6JY0YDMFvVjjlSI0jhQjwlKFB9rgolg+lZEhlhgonRoBR2CPf/yImlVyvZp2b49K9YrWRx5OIQjOAEbzqEOV9CAJhB4hGd4hTfjyXgx3o2PWWvOyGb24Q+Mzx+PBpUB</latexit>k=10,N=100,L1<latexit sha1_base64="0CxFCZHJWZqQxqB1YoCdnEl4/b8=">AAACA3icbVDLSgMxFL3js9bXqDvdBIvgQspMFXQjFNy4EKlgH9AOJZNm2tBMZkgyQhkKbvwVNy4UcetPuPNvzLSz0NYDSQ7n3EvuPX7MmdKO820tLC4tr6wW1orrG5tb2/bObkNFiSS0TiIeyZaPFeVM0LpmmtNWLCkOfU6b/vAq85sPVCoWiXs9iqkX4r5gASNYG6lr7w8vXecE3ZrbPJ0Q6wHBPL0Zd92uXXLKzgRonrg5KUGOWtf+6vQikoRUaMKxUm3XibWXYqkZ4XRc7CSKxpgMcZ+2DRU4pMpLJzuM0ZFReiiIpDlCo4n6uyPFoVKj0DeV2ZBq1svE/7x2ooMLL2UiTjQVZPpRkHCkI5QFgnpMUqL5yBBMJDOzIjLAEhNtYiuaENzZledJo1J2T8vu3VmpWsnjKMABHMIxuHAOVbiGGtSBwCM8wyu8WU/Wi/VufUxLF6y8Zw/+wPr8AWIPlWk=</latexit>k=10,N=100,L10<latexit sha1_base64="MyVESVpgNUSAvMlYyfIW2KTyTcI=">AAACBnicbVDLSgMxFM34rPU16lKEYBFcSJlUQTdCwY0LkQr2Ae0wZNJMG5rJDElGKMOs3Pgrblwo4tZvcOffmGlnoa0HkhzOuZfce/yYM6Ud59taWFxaXlktrZXXNza3tu2d3ZaKEklok0Q8kh0fK8qZoE3NNKedWFIc+py2/dFV7rcfqFQsEvd6HFM3xAPBAkawNpJnH4wukXMCb81tnl6I9ZBgnt5kXoqczLMrTtWZAM4TVJAKKNDw7K9ePyJJSIUmHCvVRU6s3RRLzQinWbmXKBpjMsID2jVU4JAqN52skcEjo/RhEElzhIYT9XdHikOlxqFvKvM51ayXi/953UQHF27KRJxoKsj0oyDhUEcwzwT2maRE87EhmEhmZoVkiCUm2iRXNiGg2ZXnSatWRadVdHdWqdeKOEpgHxyCY4DAOaiDa9AATUDAI3gGr+DNerJerHfrY1q6YBU9e+APrM8fqzeWrw==</latexit>of the transactions increase as the value upBound increases. On the contrary, as presented

in Table 4.6, the costs of transacting with the addKey function remains constant regardless

of the upBound value used.

4.7 Extended Application Discussions

Aggregated patient data has the capability of transforming the future standard of care for

patients through the application of predictive analytics and big data methods. One of the

biggest challenges to using health data to its fullest extent is the inaccessibility and segmen-

tation of EMRs [88]. As demonstrated by our analyses, the proposed scheme can eliminate

these constraints. It grants healthcare providers the ability to reliably and eﬃciently extract

knowledge from disconnected EMR sources that have been willingly shared by patients.

For example, a meaningful opportunity for big data analytics in this context is the ca-

pability to model drug and treatment eﬃcacy. Healthcare providers can access previously

shared EMRs to understand how life-saving drugs and treatments perform, respective to

the disease state and proﬁle of the patient. This data can be used to make enhanced and

customized treatment decisions for patients. As healthcare treatment becomes more indi-

vidualized, successful patient outcomes are more likely and a one-size-ﬁts-all approach to

patient treatment is no longer adequate.

4.8 Summary

In this chapter, we proposed d-MABE, a secure, privacy-preserving and distributed data

sharing scheme that runs over a blockchain using smart contracts. We demonstrated that d-

MABE can be used in cases such as medical record-sharing where patients require an eﬃcient

113

and selective method to share their records with staﬀ members of medical institutions. Our

d-MABE proposed scheme eliminates reliance on the record-generating institutions when

data is shared and completely empowers the patients. Our security and privacy analyses

show that d-MABE is secure and preserves the privacy of patient records. We also presented

some recommended security practices developers should keep in mind when developing their

system. Finally, our comprehensive evaluation proves the eﬃciency of d-MABE and presents

numerical results that support our performance analysis.

114

Chapter 5

d-CRAME: Distributed Coercion-

Resistant and Anonymous Mobile

Electronic Voting

5.1

Introduction

To further demonstrate the beneﬁts of decentralization, we expand our research by proposing

our second application. In this chapter, we introduce a novel voting scheme that enables

free and fair large-scale elections. Our proposed scheme leverages the existence of at least

two parties of an election with diﬀerent allegiances that engage in a multi-party computation

along with the voters. We assume that it conﬂicts with the interest of these parties to collude

or exchange any information during the election process that may sacriﬁce the winning

chances of the candidates they support. All computations can be performed remotely at the

convenience of the parties involved. In addition, voters are able to cast their votes from a

mobile device and verify whether their votes have been cast and counted properly. We design

voter veriﬁability to be based on randomly generated values that even if shared with coercers

willingly, will not provide any information on how the voters have voted. Furthermore, we

utilize a blockchain that acts as a publicly accessible bulletin board that voters cast and store

115

their votes to. We would like to make it clear that no computations of our proposed scheme

are performed over the blockchain which will circumvent the scalability issues of blockchain

in large-scale elections.

The rest of this chapter is organized as follows. Next, in Section 5.2, the problem for-

mulation is described, outlining our design goals and system model.

In Section 5.3, our

proposed scheme is presented in detail outlining the proposed algorithms. Following that,

in Section 5.4 we formally prove the security and privacy of our proposed scheme. In Sec-

tion 5.5, a performance analysis and evaluation of d-CRAME are conducted and give our

empirical results in Section 5.6. Finally, in Section 5.7, a conclusion is drawn to summarize

the work done in this research.

5.2 Problem Formulation

Large-scale elections typically involve at least two parties with conﬂicting allegiances com-

peting to win an election. Relying on a single entity to conduct a free and fair election

between those parties requires signiﬁcant trust in that entity to be unbiased. The trusted

entity is responsible for multiple imperative tasks. Initially, it must properly register eligible

voters prior to the voting phase. Next, it must authenticate voters during the voting pro-

cess and provide them with a secure and coercion-resistant voting space for voters to cast

their desired votes freely. Once the voting phase is over, the trusted entity must also fairly

tabulate all votes, discarding the invalid ones and ﬁnally announce the winning candidates.

Given these constraints, large-scale elections are usually run or operated using in-person

poll-sites. However, this may result in reduced voter turnout. While incorporating absentee

ballots may improve voter turnout, it requires more trust not just in the organizing entity,

116

but also during vote transmission. Even with stringent audits and monitoring, the entity

performing any of those tasks may be able to cheat. Therefore, the challenge is to provide a

complete voting process that all voters and running candidates can trust. This process must

allow eligible voters to cast their votes remotely from anywhere while securing the integrity

of the election and the safety of the voters.

5.2.1 Design Goals

Based on the problems described above, we have the following design goals:

Distributed trust: Organizing an election requires the involvement of at least two parties

in which voters and candidates can trust. If a party attempts to cheat, it is exposed and

disqualiﬁed from organizing future elections, incentivizing them to remain honest.

Voter eligibility: Voting is limited to eligible voters. This requires proper voter registra-

tion that determines and conﬁrms the eligibility of voters, granting them voting rights.

Double-voting resistant: Each eligible voter is entitled to only a single vote counted

toward the election. Submitting multiple votes would disqualify the previous votes, counting

only the most recent one.

Anonymity: A cast vote cannot be linked to the identity of a voter. This protects voters,

allowing them to freely voice their desired opinions.

Coercion-resistant: Remote voting may expose voters to coercion. If subject to coercion,

voters can trick the coercers into believing that they have voted as they have been ordered,

whilst still protecting their actual votes.

117

Voter veriﬁability: Voters can verify that their votes have been cast properly and counted

toward the election results.

Election result manipulation-resistant: Last-minute voters cannot manipulate the

election results in a close race. This requires concealing all votes until the voting phase

is over.

5.2.2 System Model

Based on the design goals discussed, the system consists of a minimum of six entities, each

with a distinct role.

Voters: The eligible set of voters {vi ∈ V | 1 ≤ i ≤ n} that are granted the right to cast a
vote in an election. This set is public and subject to audits to prove to the public that only

eligible voters can vote.

Registrar R: The ﬁrst organizing entity, responsible for generating unique and random
digital ballots to be shared with voters anonymously. It cannot link a digital ballot to its

assigned voter.

Moderator M: The second organizing entity, responsible for concealing the identities of
voters and delivering the ballots to them anonymously. It cannot reveal the concealed digital

ballots as it delivers them to the voters, hence, it cannot link a digital ballot to its assigned

voter.

Election candidates: The eligible set of candidates {candk ∈ C | 1 ≤ k ≤ m} running in
an election.

118

Blockchain network: A non-trusted peer-to-peer network that maintains a publicly ac-

cessible blockchain and runs the election smart contract. The network nodes cannot link

cast votes to voters or diﬀerentiate between valid and invalid votes.

Tallying authority: A party that performs vote tabulation at the end of the casting

phase. This task is performed and monitored publicly and therefore does not require any

trust.

5.3 The Proposed d-CRAME Scheme

Our proposed scheme consists of six main sequential phases, each occurring in a speciﬁc time

frame predeﬁned by the election organizer. We assume that the registrar R and the mod-
erator M are two opposing parties in an election competing to win the election. Therefore,
they are unlikely to collude. Let G be a publicly chosen multiplicative cyclic group of prime
order p and g is a generator of G.

5.3.1 Setup

The registrar and moderator each generate a key pair. They each select a secret key xr ∈ Z∗p
and xm ∈ Z∗p randomly and then compute their corresponding public keys as yr = gxr
(mod p) and ym = gxm (mod p), respectively.

5.3.2 Voter Registration

At this phase, voters are required to prove their voting eligibility to the registrar by providing

evidence such as their identities. After validation, voters are added to the electoral roll.

119

Algorithm 4 Voter Registration
Input: Public key yi of vi.
Output: Digital Signature EG-Signyr (yi).
Setup: (Registrar R)
1: Randomly select secret key xr ∈ Z∗p.
2: Compute public key as yr = gxr (mod p).
Registration Request: (Voter vi)
1: Randomly select secret key xi ∈ Z∗p.
2: Compute public key as yi = gxi (mod p).
3: Send yi to R.
Authentication: (Registrar R) If vi is eligible:
1: Randomly select ui with 1 < ui < p − 1 and gcd(ui, p − 1) = 1.
2: Compute wi = gui (mod p).
3: Compute si = (h(yi) − xrwi)u−1
4: Send (wi, si) to vi.

(mod p).

i

Algorithm 4 summaries this process.

5.3.2.1 Voter Key Generation and Registration

Each voter vi selects a secret key xi ∈ Z∗p randomly and then computes the corresponding
public key as yi = gxi (mod p). Public keys are shared with the registrar to complete

registration.

5.3.2.2 Signing Voter’s Public Key

The registrar veriﬁes the eligibility of the voters and signs their public keys to add them to

the electoral roll. It selects ui randomly where 1 < ui < p − 1 and gcd(ui, p − 1) = 1 then
computes the following:

wi = gui

(mod p),

si = (h(yi) − xrwi)u−1

i

(mod p),

(5.1)

(5.2)

120

where (wi, si) is the signature. The registrar discloses the electoral roll of public keys once

registration is complete.

5.3.3 Acquiring a Ballot

To cast a vote, each voter must acquire a digital ballot. Algorithm 5 summarizes this process.

5.3.3.1 Ballots Generation

The registrar generates n unique and random digital ballots T = {ti | i ∈ Zn} then dig-
itally signs them using ElGamal signature scheme. We denote the set as B = {bali =
EG-Sign(ti)| i ∈ Zn}. Next, it performs a permutation π on B such that:

π : B → B.

5.3.3.2 Voter Permutation

The moderator is an intermediary that conceals voter identities from the registrar during

ballot distribution. It generates a one time permuted set σ of n unique numbers used with

every voter ballot request such that:

σ : Zn → Zn.

5.3.3.3 Requesting a Ballot

Voters initiate the request by sharing their signed public keys with the moderator.

121

(mod p). If Valid, Γ = true; otherwise Γ = false.

wi
i

si
, w
i

Algorithm 5 Ballot Acquisition
Input: Public key yi and it’s signature EG-Signyr (yi) of vi.
Output: Ballot bali
Ballot Generation (Registrar R):
1: Generate random ballots as T = {ti | i ∈ Zn}.
2: Digitally sign ballots as B = {bali = EG-Sign(ti)| i ∈ Zn}.
3: Perform permutation π : B → B.
Voter Permutation (Moderator M):
1: Generate permutation as σ : Zn → Zn.
Request Ballot (Voter vi):
1: Send public key yi and EG-Signyr (yi) to the moderator.
Voter Key Validation (Moderator M):
1: Check 0 < wi < p and 0 < si < p − 1.
2: Veriﬁes gh(yi) ≡ y
If Γ = true:
Voter Key Obscuring (Moderator M):
1: Randomly select bi ∈ Z∗p.
2: Compute y(cid:48)i = y
3: Send y(cid:48)i to R.
Ballot Assignment and Encryption (Registrar R):
1: Select ballot as bali = π(σ(i)).
2: Compute key as ki = (y(cid:48)i)qi = gxibiqi (mod p).
3: Encrypt ballot as ebali = AES-Encki
4: Compute ephemeral key as Qi = gqi (mod p).
5: Send ebali and Qi to M.
Ballot Transmission (Moderator M):
1: Encrypt ebi = (grm, bi · yrm
2: Send ebali, Qi, and ebi to vi.
Derive Ballot (Voter vi):
1: Decrypt bi = c2 · c−xi
· (grm)−xi.
2: Compute ki = (Qi)xibi = gqixibi (mod p).
3: Decrypt bali = AES-Decki

1 = bi · yrm
(ebali).

bi
i = gxibi (mod p).

(bali).

) = (c1, c2)

i

i

5.3.3.4 Voter Identity Obscuring

The moderator compares the public key with the electoral roll list to verify eligibility. It

validates the signature (wi, si) corresponding to the shared public key yi by verifying whether

122

the following equation:

gh(yi) ≡ y

wi
i

si
, w
i

(mod p),

(5.3)

holds true.

In this event, the moderator obscures yi by selecting a blind factor bi ∈ Z∗p

randomly and computing the following:

y(cid:48)i = y

bi
i = gxibi

(mod p).

(5.4)

The moderator then sends y(cid:48)i and σ(i) to the registrar.

5.3.3.5 Ballot Assignment and Encryption

Ballots are assigned randomly by the registrar to the blinded voters as follows:

bali = π(σ(i)).

To conceal bali, the registrar encrypts it under an encryption key ki derived as:

ki = (y(cid:48)i)qi = gxibiqi

(mod p),

(5.5)

(5.6)

where qi ∈ Z∗p is selected randomly. Using ki, the registrar encrypts the ballot as the

following:

ebali = AES-Encki

(bali),

(5.7)

where AES-Enc is the AES encryption function. The purpose of this encryption is to conceal

the ballot from the moderator. It enables the registrar to share this ballot with the voter

anonymously. For this purpose, it generates an ephemeral key Qi that would allow the voter

123

to regenerate ki such that:

Qi = gqi

(mod p).

(5.8)

Finally, the registrar sends ebali and Qi to the moderator.

5.3.3.6 Encrypted Ballot Transmission

Once received, the moderator sends ebali and Qi to the voter along with the encryption of

the blind factor bi as:

ebi = (grm, bi · yrm

i

) = (c1, c2),

(5.9)

where rm ∈ Z∗p is selected randomly.

5.3.3.7 Deriving Ballot

The voter can decrypt ebi and recover bi as:

bi = c2 · c−xi

1 = bi · yrm

i

· (grm)−xi.

(5.10)

Next, the voter regenerates key ki as:

ki = (Qi)xibi = gqixibi

(mod p).

(5.11)

Finally, the voter decrypts ebali as:

bali = AES-Decki

(ebali),

(5.12)

where AES-Dec is the AES decryption function.

124

Figure 5.1: d-CRAME steps to select the desired candidates: (1) application start and (2)
selecting and encrypting selected candidates.

(1)

(2)

5.3.4 Casting Votes

Before casting a vote, voters select the desired candidates they wish to vote for. Fig. 5.1

represents our designed mobile application showing candidate selection. Next, the voter

proceeds with encrypting and casting the vote.

5.3.4.1 Ballot Double-Encryption

The ballot associated with a vote, denoted as Bi, is encrypted under the public keys yr and

ym of both the registrar and moderator as:

Bi = (gv, T · (yr · ym)v) = (c3, c4),

(5.13)

125

where v ∈ Z∗p is selected randomly, T = (bali (cid:107) Vote), and Vote = (cand1, cand2, . . . , candm)

is a sequence of bits representing each candidate such that:

if voting for candk,

if voting against candk.

(5.14)

1,

0,

candk =

5.3.4.2 Submit Vote

To submit the encrypted ballot, the voter calls the election vote smart contract SCvote and

integrates Bi as input. A vote is permanently cast once the result of the smart contract is

appended to the blockchain. Fig. 5.2 is a summary of the steps performed by the voter to

cast the encrypted vote. This process involves importing the blockchain account of the voter

that allows her to call the casting smart contract deployed over the blockchain. Once the

vote is cast, the voter receives a conﬁrmation of the vote along with the transaction hash

reference. The transaction hash is used to verify that the cast vote is stored permanently

over the blockchain.

5.3.5 Tabulation

Votes that appear on the blockchain within the voting phase are collected to be validated and

counted. Both the moderator and registrar publicly disclose their secret keys xm and xr. The

registrar also discloses a new permutation γ : B → B containing all ballots distributed among
voters during the acquiring ballots phase. The tallying authority decrypts the attached

ballots appearing on the blockchain as:

T = c4 · c−xm

3

· c−xr
3 = T · (yr · ym)v · (gv)−xm · (gv)−xr .

(5.15)

126

(3)

(4)

(5)

(6)

Figure 5.2: d-CRAME steps to cast and verify vote: (3) encryption result, (4) importing
blockchain account to cast vote, (5) information on block containing cast vote, and (6)
verifying vote permanently exists on blockchain.

If bali ∈ γ, the tallying authority examines the attached vote sequence and increments the
counter of each candidate accordingly. Each election may specify its own rules that disqualify

votes which are cast incorrectly. For example, voting yes for two opposing candidates.

127

5.3.6 Vote Veriﬁability

At the end of the tabulation phase, the tallying authority publishes the results of the election

on the blockchain. Each vote is published with its corresponding ballot as proof of the

legitimacy of the results. Voters can simply recognize their ballots and verify that their

votes have been counted properly toward the election. The moderator also veriﬁes that only

eligible voters have submitted votes since it has the list of registered voters. This prevents

the registrar from attempting to issue unassigned ballots to voters of its own if they have

not been claimed by registered voters.

5.4 Security and Privacy Analysis

In this section, a formal proof of security and privacy is presented.

Unlike the conventional elections where eligible voters are granted the right to voice

their opinion exactly one time, there are many possible new security and privacy issues for

electronic and remote voting. The malicious act of attempting to cast more than one vote is

referred to as double-voting and aims to give an election candidate an advantage in winning

the election over others. In countries such as the United States, while the majority of states

prohibit voting twice in the same election, only a few of them prohibit voting in more than one

state [89]. This means that eligible voters could register in more than one state and attempt

to double-vote. The process of detecting and penalizing such voters becomes expensive and

challenging. In remote electronic-voting systems, sometimes referred to as internet-voting

(i-voting), voters receive digital ballots and cast their votes remotely rather than visiting a

polling station. Given this ﬂexibility, elections running over i-voting schemes may even allow

voters to cast their votes multiple times using the same digital ballot, counting only their

128

ﬁnal vote and discarding all previous ones. Since our proposed scheme uses the blockchain

as a bulletin board to permanently store votes, it becomes simple to identify double-voting

attempts that reuse digital ballots. However, the reuse of digital ballots is not the only

method to attempt double-voting. Adversaries may attempt to double-vote by obtaining

undeserved voting credentials giving them the right to cast more votes. We formally prove

that an election running our proposed scheme is secure against such attempts.

Deﬁnition 2 (Secure against double-voting). A voting scheme is said to be secure against

double-voting if no PPT adversary is able to forge a digital ballot that is digitally signed by

the registrar.

Theorem 8. It is infeasible for any adversary to generate a legitimate ballot that can be

used to cast a vote correctly if the DDH assumption holds.

Proof. Each ballot bali ∈ T is digitally signed by the registrar before being distributed
among the anonymous voters. For the adversary to generate acceptable ballots, it must

be able to forge a signature of a ballot bal(cid:48)i = EG-Sign(t(cid:48)i). This requires the adversary to
learn the secret key xr of the registrar or ﬁnd collisions such that h(bal(cid:48)i) = h(bali). Both

operations can be reduced to the discrete logarithm problem. Therefore, it is infeasible for

the adversary to generate a signed ballot correctly and the proof is complete.

To preserve the anonymity of voters, an adversary should not be able to link any vote to a

speciﬁc voter. The proposed scheme relies on a secure multi-party computation performed by

parties of diﬀerent allegiances to address this issue. It requires a minimum of two conﬂicting

parties to participate during the ballot distribution process. As demonstrated, a moderator

conceals the public key yi of the voter using a blind factor fi and associates it with a random

value selected from the permutation σ(i). On the other hand, the registrar selects a ballot

129

randomly and assigns it to the anonymous voter. The moderator and registrar would need

to collude for ballots to be linked to the identities of voters. We assume collusion is not

in the best interest of any of these parties, therefore, voter anonymity can be preserved.

Our proposed scheme is considered to be secure under this assumption that the probability

of an adversary to identify a public key yi that has been randomly selected from a two-

element public key chosen by the adversary and encrypted does not signiﬁcantly exceed 1
2.

Using the Indistinguishibility under Chosen-Plaintext Attack (IND-CPA) Security Game,

we provide a formal proof of security for our proposed scheme. We denote the DDH oracle

as O1 = (BFGen, Blind, Rec), where BFGen is the blind factor generation function, Blind is
the blind function, and Rec is the recovery function. The IND-CPA game consists of a set

of interactions between two PPT machines, an adversary A and a challenger C acting as the
moderator.

1. C computes a blind factor f = BFGen(1k) and keeps it secret.

2. Since A does not have access to O1, it may request that C blinds for it as many
public addresses as it likes during any time of the game. A then computes two public
addresses y0 and y1 to be challenged against and sends them to C.

3. C uniformly and randomly selects µ ∈r {0, 1} then computes y(cid:48) = Blind(f, s, yµ), where
s represents a randomness state to diversify the blind process and is a value that has

not been used in any of the previously computed ciphertexts. Next, C sends y(cid:48) to A.

4. A outputs a guess µ(cid:48) of µ. A wins the security game if µ(cid:48) = µ and loses otherwise.

An adversary that can derive which public key was blinded in polynomial time may be

able to identify the identities of the voters and link them to the ballots being assigned. The

130

DDH assumption implies that the adversary is unable to get a non-negligible advantage from

the IND-CPA Security Game in determining the public key that is blinded.

Deﬁnition 3 (Voter anonymity). A voting scheme is said to preserve the anonymity of voters

if no PPT adversary is able to get a non-negligible advantage by performing the IND-CPA

Security Game protocol, i.e. Adv = P r[µ(cid:48) = µ] < 1

2 + ε for any given negligible ε.

Theorem 9. The proposed scheme preserves the anonymity of voters if the DDH assumption

holds.

Proof. Assume there is an adversary that has non-negligible advantage ε, then AdvA > 1
We construct a simulator A that can distinguish a DDH element from a random element
with advantage ε. Let G be a publicly chosen multiplicative cyclic group of prime order p.
The DDH challenger begins by selecting the random parameters: a, b ∈r Z∗p. Let g ∈ G be
a generator and T is deﬁned as T = gab (mod p) if µ = 0, and T = gc (mod p) for some
random c ∈ Z∗p otherwise, where µ ∈r {0, 1}. The simulator acts as the challenger in the

2 +ε.

IND-CPA game.

1. C chooses a blind factor f ∈r Z∗p and state s∗ ∈r Z∗p then computes s = s∗ + ab and

keeps them secret.

2. A chooses two secret keys x0, x1 ∈r Z∗p then computes their corresponding public

addresses y0 and y1 and sends them to C.

3. C uniformly and randomly selects µ ∈r {0, 1} then computes y∗ = Blind(f, s, yµ) =
µgs = gxµf gs∗+ab = gxµf gs∗T (mod p), where T = gab (mod p). Next, C sends y∗
yf
to A.

4. A outputs a guess µ(cid:48) of µ. A wins the security game if µ(cid:48) = µ and loses otherwise.

131

Given A, if T = gab (mod p), then y∗ is a valid ciphertext, Adv = ε and

(cid:104)

(cid:16)

g, ga, gb, T = gab(cid:17)

(cid:105)

Pr

A

= 1

=

1
2

+ ε.

(5.16)

If T = gc (mod p) or T (cid:54)= gab (mod p) then y∗ is nothing more than a random value to

the adversary. Therefore,

(cid:104)

A

Pr

(cid:16)

g, ga, gb, T = gc(cid:17)

(cid:105)

= 1

=

1
2

.

From equation (5.16) and equation (5.17), we can conclude that

(cid:104)

(cid:12)(cid:12)(cid:12)Pr

A

(cid:16)

g, ga, gb, T = gab(cid:17)

(cid:105)

= 1

− Pr

A

(cid:104)

(cid:16)

g, ga, gb, T = gc(cid:17)

(5.17)

(5.18)

(cid:105)(cid:12)(cid:12)(cid:12) = ε.

= 1

The simulator plays the DDH game with a non-negligible advantage which contradicts

the DDH assumption. Therefore, the adversary cannot have advantage ε and the proof is

complete.

In traditional paper ballot based voting systems, voters can cast their votes in a physically

isolated environment without facing interference from adversaries throughout the election

process.

In i-voting systems, voting is performed remotely by the voters, therefore, it is

possible that an adversary may be able to monitor the vote casting process. To address

this concern, a stronger form of private voting known as coercion-resistance emerged [90]. A

coercion-resistant voting system is one that accounts for an adversary that can engage with

voters while they cast their votes during an election. Engagement may be in the form of an

adversary coercing voters to cast their votes in a speciﬁc form or even forcing them to divulge

132

their voting credentials by easily blackmailing the voters. Other engagements may even

include an adversary willing to peacefully buy votes from the voters. Therefore, a practical

coercion-resistant voting system should be receipt-free, allowing voters to evade proving to

their coercers how they voted. This property requires a voting system to be designed in a

way so that it does not generate any legitimate evidence that may leak information about

the votes. Concurrently, the voting system should still allow users to recognize their votes

individually and check whether their votes are being counted toward the ﬁnal election results

properly. In comparison to the receipt-free property, vote recognition is based on randomness,

providing voters with values that correspond to their votes to give them the assurance needed

to prove that their votes have been counted correctly. Even if shared by the voters with their

coercers willingly, these values should not suﬃce to expose any vote information.

In addition to individual veriﬁability, e-voting systems could be designed to provide

universal veriﬁability. This feature is intended to allow all voters and any auditors to verify

the legitimacy of each cast vote and validate the integrity of the election results [91]. However,

by deﬁnition, this property is orthogonal to coercion-resistance [92]. That is, a system that

would allow any individual to verify every vote may also leak information about how voters

cast their votes. From a practical and feasible standpoint, we believe that the coercion-

resistance property is more signiﬁcant for practical i-voting schemes. Not only does it allow

voters to freely cast their votes as they desire while defrauding any type of adversary, but

it also provides each voter with individual veriﬁability. Therefore, our goal is to introduce a

coercion-resistance i-voting scheme where we assume the adversaries can interfere with the

election but it is infeasible for them to distinguish between any real voting credentials and

defrauded ones they may obtain from the coerced voters.

In the proposed scheme, deriving valid credentials used to cast a vote traces back to the

133

voters acquiring the correct blind factors used to keep their identities anonymous as shown

in equation (5.10). To prove that the proposed scheme is coercion-resistant, we present the

following Indistinguishability of Encryption Keys (IND-EK) Security Game consisting of a

DDH oracle O2 = (KeyGen, Enc, Dec), where KeyGen is a public key pair generator, Enc is
the encryption function, and Dec is the decryption function. The IND-EK game is a set of

interactions between two PPT machines, an adversary A and a challenger C.

1. C computes two pairs of keys KeyGen(1k) → (y0, x0) and KeyGen(1k) → (y1, x1) then

sends the public keys y0 and y1 to A while keeping x0 and x1 secret.

2. A has access to O2 and can encrypt as many blind factors of its choice as it chooses
with any of the keys it receives. Next, A chooses a blind factor f and sends it to C.

3. C uniformly and randomly selects µ ∈r {0, 1} then computes c∗ = Enc(yµ, f ). Next, C

sends c∗ to A.

4. A outputs a guess µ(cid:48) of µ. A wins the security game if µ(cid:48) = µ and loses otherwise.

An adversary that can derive which public key was used in encryption eﬃciently may

potentially be able to link the votes to the actual voters. As a result, the adversary may be

able to discover whether the coerced voter has behaved as instructed. The DDH assumption

implies that the adversary is unable to get a non-negligible advantage from the IND-EK

Security Game in determining the public key of the corresponding private key.

Deﬁnition 4 (Coercion-resistance). A voting scheme is said to be coercion-resistant if no

PPT adversary is able to get a non-negligible advantage by performing the IND-EK Security

Game, i.e. AdvA = P r[µ(cid:48) = µi] < 1

2 + ε for any negligible ε.

Theorem 10. The proposed scheme is coercion-resistant if the DDH assumption holds.

134

2 + ε.

Proof. Assume there is an adversary that has non-negligible advantage ε, i.e., AdvA > 1
We construct a simulator A that can distinguish the public key used to encrypt a blind factor
with ε. Let G be a publicly chosen multiplicative cyclic group of prime order p. The DDH
challenger begins by selecting the random parameters: a, b ∈r Z∗p. Let g ∈ G be a generator
and T is deﬁned as T = gab (mod p) if µ = 0, and T = gc (mod p) for some random c ∈ Z∗p
otherwise, where µ ∈r {0, 1}. The simulator acts as the challenger in the following game:
1. C chooses the parameters x∗0, x∗1 ∈r Z∗p then computes x0 = x∗0 + a and x1 = x∗1 + a.
Next, it simulates y0 = gx0 ← gx∗0+a (mod p) and y1 = gx1 ← gx∗1+a (mod p).
Finally, it sends y0 and y1 to A and keeps x0 and x1 secret.
2. A chooses a blind factor f ∈ Z∗p and sends it to the simulator.
3. C uniformly and randomly selects µ ∈r {0, 1} then computes c∗ = Enc(yµ, f ) =
µ) = (gb, f gxµb) = (gb, f g(x∗µ+a)b) = (gb, f T gx∗µ) (mod p), where T = gab

(gb, f yb

(mod p). Next, C sends c∗ to A.

4. If A guesses the correct value, the challenger outputs 0 to indicate that T = gab, or 1

to indicate that T = R, a random group element in G.

Given A, if T = gab (mod p), then c∗ is a valid ciphertext, Adv = ε and

(cid:104)

(cid:16)

g, ga, gb, T = gab(cid:17)

(cid:105)

Pr

A

= 1

=

1
2

+ ε.

(5.19)

T = gc (mod p) or T (cid:54)= gab (mod p) then y∗ then c∗ is nothing more than a random

value to the adversary. Therefore,

(cid:104)

Pr

A

(cid:16)

g, ga, gb, T = gc(cid:17)

(cid:105)

= 1

=

1
2

.

135

(5.20)

From equation (5.19) and equation (5.20), we can conclude that

(cid:104)

(cid:12)(cid:12)(cid:12)Pr

A

(cid:16)

g, ga, gb, T = gab(cid:17)

(cid:105)

(cid:104)

(cid:16)

g, ga, gb, T = gc(cid:17)

(cid:105)(cid:12)(cid:12)(cid:12) = ε.

= 1

(5.21)

= 1

− Pr

A

The simulator plays the DDH game with a non-negligible advantage which contradicts

the DDH assumption. Therefore the adversary cannot have advantage ε and the proof is

complete.

The proposed scheme utilizes a blockchain as its public bulletin board for voters to cast

their votes. Voters interact with a voting smart contract that accepts a vote in the form

shown by equation (5.13) as input. At the end of the voting phase, the tallying authorities

scan the blockchain logs to collect all votes that have been cast during the voting phase.

Votes that pass the validation are counted towards the election results while those that fail

are discarded. Valid votes are posted to the blockchain along with their corresponding ballots

to announce election results and allow voter validation. Voters can individually recognize

that their votes have been counted correctly towards the ﬁnal election results.

Before casting their votes to the blockchain, voters are required to encrypt their votes

under the public keys xr and xm of the registrar and moderator. The encryption conceals the

actual vote during the voting phase and protects the integrity of the results from last-minute

voters that may try to manipulate the election results in a close race by voting in favor of

a certain candidate or party. However, encryption alone may be insuﬃcient to prevent an

adversary from attempting to manipulate the election results. An adversary with signiﬁcant

computational power may attempt to remove votes that have already been cast from the

blockchain to reduce the count of a candidate or party. To do this, the adversary must

compete in the blockchain mining process to fork the blockchain and remove certain blocks

136

carrying speciﬁc votes.

Theorem 11. The proposed scheme is secure against election result manipulation if the

computational power of the adversary in the blockchain network is q < 1/2 < p where p is

the computational power of the honest miners.

Proof. The race of generating a block can be modeled as a binomial random walk. Let z be

the number of blocks generated by the honest miners minus the number of blocks generated

by the adversary. This race can be derived as:

zi + 1, with probability p,

zi − 1, with probability q.

zi+1 =

(5.22)

Now, using the negative binomial distribution, we can model the probability of success of

the adversary. First, as deﬁned in [93], the probability of the attacker to surpass the blocks

generated by the honest miners as:

(cid:18) q

(cid:19)z+1

p

,

Qz =

(5.23)

where z ≥ 0. We then assume a voter waits for n new blocks to be generated by the honest
miners to be appended to the blockchain beyond the block containing their vote. We also

assume that, at that time, the adversary is able to secretly generate m = n + 1 blocks. This

can be modeled as m blocks that the adversary can generate before n blocks are generated

by the honest miners. Therefore, the probability of reversing a transaction for a given value

(cid:18)m + n − 1
(cid:19)

m

137

m is:

Pr(m) =

× pnqm.

(5.24)

The probability for the adversary to surpass successfully the number of blocks generated

by the honest miners can be computed as:

Prs =

n−1(cid:88)

m=0

= 1 −

= 1 −

Pr(m) × Qn−m−1

(cid:18)m + n − 1
(cid:19)
(cid:18)m + n − 1
(cid:19)

m

n−1(cid:88)
n−1(cid:88)

m=0

m=0

m

(cid:32)

× pnqm ×

1 −

(cid:18) q

(cid:19)n−m(cid:33)

p

× (pnqm − pmqn).

(5.25)

For attackers to succeed in performing the attack, they must be able to generate more blocks,

i.e. m > n. The likelihood decreases as n or p increase since q < p and the attackers are

limited to the election time frame. Therefore, as more blocks are appended to the blockchain,

manipulation of the election result becomes less likely or even infeasible.

5.5 Performance Evaluation

In this section, we present a performance evaluation for d-CRAME and compare it with two

blockchain-based schemes presented in [36] and [40]. We speciﬁcally select these schemes to

analyze the diﬀerence in computational complexity when applying diﬀerent cryptographic

primitives. We formulate the computational costs of each voting stage for these three schemes

based on the number of multiplication (M) and exponentiation (E) operations. Table 5.1

summarizes the number of these two operations for each scheme.

138

Scheme
M
Operation M E M E
0
0
0 0
Voter
Moderator n/a
0
0
Registrar n/a 5+κ 1+2κ 0
0
Tallier
Smart
Contract

n/a

n/a

0
2

0 0

0

0

E M E M E M
0
0
1 2 1 1
n/a 0 0
1
0
1
n/a 0 0 2n
0
n/a n/a
0

E M E M E M
0 0 n/a
1
2
n/a n/a 2n
0
n
n/a n/a
0
0
n/a n/a
0

[40] d-CRAME [36]
E M E
2
6n n/a
2n n/a
n/a
0

[40]

M

E

0
0

n/a

0
0

2 3 2(m + b + 1) 3m + b + 9

Tabulation
[40]
E M E M E
0
2
5
0
0
n/a
n/a
0

0 0 0
n/a 10
n/a
n/a

M
0
0
0
2n

E
0
0
0
2n

0

M
2
0
0
0

0

Table 5.1: d-CRAME performance comparison.

Setup

Voter Reg.

Acquiring Ballots

Casting Votes

[36]

[40]

d-CRAME [36] [40] d-CRAME [36]

d-CRAME [36]

d-CRAME

0

0

0 0 0 n 0

0

2n 2n n/a

0

0

0 0

n(1 + b)

5n

0

n 0 1 3 + m 0

5.5.1 Setup

In [36], the setup consists of an admin that authenticates voters with their Ethereum user-

controlled accounts then updates a whitelist of voters to include all eligible voters. However,

the paper did not discuss any cryptographic operations in the authentication process. We

assume that the setup for this scheme does not require any M or E operations.

On the contrary, [40] requires a smart contract admin to generate an RSA key pair

(sks, pks), imposing 2M operations. This key pair is used to sign/verify transactions between

voters and the smart contracts. The scheme also requires a voting admin to generate the

homomorphic Paillier key pair (skPaillier, pkPaillier) at a total computational cost of 4M and

a single E operation. Using the generated key, the voting admin next encrypts the value

zero κ times and stores all encryptions as set T . The set T is used to enforce randomness

as each voter is randomly assigned a value from the set during vote casting that is used to

prevent coercion. Each encryption in set T imposes a single M and 2E operations, resulting

in a total of (M + 2E)κ operations during encryption. Finally, the voting admin generates a

Short Linkable Ring Signatures (SLRS) [94] key pair which requires one M operation. This

key pair is used during the election to prevent linking voters to their ballots. For explanation

purposes, we refer to the smart contract admin as the moderator and the voting admin as the

registrar. In comparison to both schemes, d-CRAME requires the moderator and registrar to

generate their public key pairs, ym and yr, that are used during the entire election process,

139

imposing a single E operation for each.

5.5.2 Voter Registration

At the voter registration stage, [36] requires each voter to generate a key pair (ski, pki) at a

cost of one E operation. Voters must also generate a non-interactive Zero-Knowledge Proof

ZKP(xi) to prove knowledge of their secret key ski. Speciﬁcally, it uses a Schnorr proof [95]

made non-interactive using the Fiat-Shamir heuristic [96], resulting in an additional M and

E operation. Once derived, voters broadcast pki and ZKP(xi) through the speciﬁed election

smart contract.

In [40], voter registration requires interaction between the voters and an election smart

contract. Initially, eligible voters obtain the SLRS parameters generated during the setup

phase. Using these parameters, voters then generate their SLRS key pairs (ski, pki), imposing

a single M operation each. Voters then send their public keys pki to the blockchain through

the smart contract.

In order for the smart contract to accept and register their public

keys, voters must sign the transaction with the RSA key generated during the setup phase,

imposing a single E operation. The smart contract then veriﬁes the signature and adds the

public keys to the blockchain if valid. Each veriﬁcation requires a single E operation. As

mentioned in [40], this transaction veriﬁcation process is performed with every interaction

between a voter and a smart contract throughout the entire election.

In comparison, d-CREAM relies on a registrar to facilitate voter registration. Voters

begin by generating their key pairs (ski, pki). After verifying the eligibility of voters, the

registrar signs their public keys and adds them to the electoral roll. The total signatures for

registering all voters are 2nM and nE operation.

140

5.5.3 Acquiring Ballots

To acquire a ballot, [36] initially requires the election smart contract to verify all n received

ZKP(xi). A single veriﬁcation costs one M and 2E operations. Next, the smart contract gen-

erates all n digital ballots, where a single ballot generation requires a total of nM operations.

Voters acquire their corresponding generated ballots and use them in the next stage during

vote casting.

In [40], voters are not required to perform any computations to acquire a digital ballot.

Instead, they can immediately cast their votes once registration is complete.

In comparison to both schemes, d-CRAME involves voter engagement with the moder-

ator and registrar. Initially, the registrar generates the set of random and digitally signed

ballots. Next, the voters interact with the moderator, who obscures their identities from the

registrar while facilitating ballot distribution. The total operations performed by a voter,

the moderator, and the registrar are 2M + 2E, 2nM + 6nE, and 2nE operations, respectively.

5.5.4 Casting Vote

To cast a vote, [36] requires each voter to perform one M and 2E operations. Each voter

must also generate another ZKP(xi) to prove that they have voted either yes or no. The

generated ZKP(xi) requires an additional M and E operation.

In [40], voters can vote for multiple candidates, however, casting votes is a more complex

process.

Initially, voters select their desired candidates within an eligible pool of candi-

dates and encode it. Next, they homomorphically encrypt the result using their previously

generated Paillier secret key. These two functions together impose 3M and 2E operations.

Following that, the voters compute a Proof of Knowledge (PoK) to prove that they correctly

141

encrypt a candidate within the eligible pool. This proof requires 2mM and (3m + 1)E oper-

ations, where m is the total number of running candidates, which could be a big number for

a large scale election. Once computed, the voters send their encryption and derived PoKs to

the smart contract for veriﬁcation. The computational cost of a single veriﬁcation imposes

2M and 3E operations. If valid, the smart contract signs the result and returns it to the

voter. As previously discussed, a single signature imposes a single E operation for each voter.

Upon receiving it, voters validate the signature at a computation cost of one E operation.

If valid, the voters generate SLRS over the result to ﬁnalize their votes and cast it to the

blockchain. A single SLRS requires voters to initially generate a witness value, imposing

(2b − 1)M and bE operations, where b represents the number of ring members, which has to
be a reasonably big number. Next, they generate a proof of knowledge for their signatures,

imposing 5E operations. Once the SLRS is generated, the voters send it to the smart contract

for veriﬁcation. In order for the smart contract to verify a single SLRS, it imposes (b − 1)M
and one E operation. Finally, if valid, a vote is permanently stored on the blockchain.

In comparison, d-CRAME requires a voter to double encrypt bali and attach it to the

Votei where, Votei is a sequence of bits representing each candidate, 1 if voting for candk

and 0 if not voting for candk. The total operations performed by a voter include 2M and 2E

operations.

5.5.5 Tabulation

In [36], tabulation requires verifying each ZKP(xi) submitted the voters, imposing one M

and 2E operations per veriﬁcation. Next, the tally is computed requiring nM operations.

In [40], tabulation is performed over multiple steps. The election smart contract ﬁrst adds

all published and encrypted votes then signs the result and sends it to the admin, imposing

142

one E operation. The admin next veriﬁes the signature at a cost of one E operation and begins

to homomorphically decrypt the sum. Homomorphic decryption enforces a computational

cost of 3M and one E operations. The admin must also perform a decryption and verify the

correctness of the PoK and sends the results back to the smart contract, resulting in 7M and

3E operations. The smart contract veriﬁes the correctness of the information sent imposing

one M and 2E operations. If valid, the smart contract computes the ballots for m candidates,

where each candidate requires one E operation.

In comparison to both schemes, for d-CRAME, the tallier is required to double decrypt

each existing vote using the revealed moderator and registrar private keys, xm and xr. This

imposes a computational cost of 2nM and 2nE operations.

5.6 Empirical Results

In this section, we present our empirical results of various simulations of d-CRAME. We

implement two versions of d-CRAME, a desktop application, and a smartphone application.

The desktop application is implemented using Maple v16 and we simulate it on a MacBook

Air running OS X 10.13.6 equipped with 2 cores, 1.8 GHz Intel Core i5, and 8 GB 1600 MHz

DDR3. The smartphone application is developed over Xcode version 11.2.1 and is written in

Swift 5. We use the BigInt library [97] to perform large number cryptographic computations.

Our smartphone simulations are performed over an iPhone XR running iOS 13.2.2 equipped

with the Apple A12 Bionic, a 64-bit ARMv8.3-A six-core CPU, with two cores running at

2.49 GHz.

Both applications interact with our Solidity-based smart contract that we deploy over

the Ethereum Ropsten testnet blockchain to cast the encrypted votes at the end of the vot-

143

ing stage. The main inputs to the smart contract are the two encrypted vote components,

Bi = (c3, c4), as depicted in equation (5.13). For our desktop application, we utilize Meta-

Mask [98], a browser plug-in that allows voters to manage their Ethereum wallets and call our

deployed smart contract to cast their votes. For our smartphone application, we incorporate

the web3swift library [99] into our code to provide the voters with the same functionality to

cast their votes using their mobile devices. We note that the performance relies on the selec-

tion of software packages. Our selection does not guarantee the best performance. Instead,

it shows that d-CRAME is suitable for large-scale elections.

For an election, d-CRAME is run by each voter on either a desktop or mobile device.

After vote casting is complete, tabulating the votes is a job performed by the tallier, which is

generally performed oﬄine using powerful computers. We, therefore, assume that all stages

are performed over a mobile device while tabulation is performed by anyone that has access

to a desktop machine. In comparison to other schemes, d-CRAME is the only scheme that

allows voters to generate the election results themselves if they wish and can aﬀord the

required computational requirements.

Therefore, we focus our analysis on investigating the diﬀerent time costs for casting a

vote for both a computer and a smartphone. Speciﬁcally, we measure the time costs to

perform the double encryption computation presented in equation (5.13). Fig. 5.3 shows the

time costs we obtain under eight diﬀerent encryption key sizes. The presented time costs

are the average time costs of ten diﬀerent trials for each key size.

Based on our results, we come to various conclusions. Fig. 5.3(a) shows that with max-

imized security at a key size of 4096 bits, voters can cast their votes in approximately 0.11

seconds using a desktop. Correspondingly, Fig. 5.3(b) shows that to maintain the same level

of security while running d-CRAME over a smartphone, it would take less than a minute to

144

Figure 5.3: d-CRAME time comparison for various key sizes: (a) in computer, (b) in smart-
phone, (c) smartphone/computer.

145

cast a vote. In both cases, the time increases exponentially with increasing of key sizes. To-

day, in practice, key sizes of 2048 bits are generally considered secure, hence, casting a vote

through a mobile device may even be reduced to approximately 8.36 seconds. The diﬀerence

between the time costs and of running d-CRAME over a desktop and a smartphone is evi-

dent because of the processor capabilities we describe above. Desktop machines are generally

built with more powerful processors giving them a conspicuous advantage over smartphones.

However, the purpose of this analysis is to prove that even with this advantage, it is still

feasible to run d-CRAME over a smartphone and achieve acceptable results.

To further analyze our ﬁndings, we also measure the smartphone to desktop time cost

ratio and observe its change as we increase the key size. Fig. 5.3(c) presents these results

showing that as the key size increases, the ratio increases logarithmically. This suggests that,

beyond a certain key size, the advantage of running d-CRAME over a desktop versus running

it over a smartphone decays as the keys grow in size. For example, as shown in Fig. 5.3(c),

the smartphone/desktop ratio at a key size of 2048 bits is 466 and increases to 478 with key

size of 2560 bits, showing a 2.5% increase. This increase is smaller when compared to the

ratio increase between key size of 1536 bits, where the ratio is 444, and that at key size of

2048 bits, showing an increase of 5%. This pattern can be observed between all key sizes

shown in the ﬁgure.

5.7 Summary

In this chapter, we proposed d-CRAME, a novel blockchain-based and remote electronic

voting scheme. The proposed scheme is designed to run large-scale elections and aims at

improving voter turnout. Our security and privacy analyses show that d-CRAME is se-

146

cure, preserves voter privacy, protects voters against coercers, and maintains the integrity

of election results. In our performance analysis, we compare the computational complexity

of d-CRAME to two schemes and show that d-CRAME is more appropriate for large-scale

elections. Finally, in our empirical results, we present the results of running various sim-

ulations for d-CRAME over both a desktop machine and a smartphone. Our results show

that it is feasible to run d-CRAME over a mobile device, hence improving the accessibility

of elections.

147

Chapter 6

Conclusion and Future Work

6.1 Conclusion

In this dissertation, we presented sensitive data sharing schemes that address data sharing

in both centralized and distributed systems. In the ﬁrst scheme discussed, P-MOD, the data

owner partitions a data ﬁle into multiple segments and shares the independently encrypted

segments in a privilege-based access structure. We formally proved the security of this

scheme against an adaptively chosen-plaintext attack assuming the DBDH assumption holds.

We then ran simulations of P-MOD to compare its performance to CP-ABE and FH-CP-

ABE. P-MOD demonstrated an 88.3% improvement in key generation time-cost compared

to these schemes. Similarly, P-MOD outperformed these schemes in encryption time-cost by

approximately 78.4% and 18%, respectively, and in decryption time-cost by approximately

85.4% for both schemes.

Following P-MOD, we explored blockchain and distributed systems to evaluate the ben-

eﬁts of blockchain and the security pitfalls. We delved into the underlying structure and

security of the ﬁrst system to adopt blockchain, Bitcoin. To explore its security, we analyzed

double-spending attacks and evaluated the probability of success of such attacks, given the

computing power of the attackers. Our probability analysis concluded that there is a trade-

oﬀ between the time waited before accepting a transaction and the proﬁt attackers could

148

gain. Through understanding the security limitations of Bitcoin, we gained an understanding

of how to enhance the security of a distributed system.

In the second scheme presented, d-MABE, we applied the concepts of privilege-based

multilevel data sharing to a distributed system by using blockchain and smart contracts.

We used electronic medical records (EMRs) to demonstrate the beneﬁts of such a system.

We implemented smart contracts to facilitate access veriﬁcation and key granting. In the

security and performance analysis sections, we showed that d-MABE is both secure and

practical.

In the third scheme, d-CRAME, we proposed a novel remote electronic voting scheme.

The proposed scheme is designed to run large-scale elections and aims at improving voter

turnout. Our security and privacy analyses showed that d-CRAME is secure, preserves voter

privacy, protects voters against coercers, and maintains the integrity of election results. In

our performance analysis, we compared the computational complexity of d-CRAME to two

schemes and show that d-CRAME is more appropriate for large-scale elections. Finally,

in our empirical results, we implemented d-CRAME and presented the results of running

various simulations for over both a desktop machine and a smartphone. Our results show

that it is feasible to run d-CRAME over a mobile device, hence improving the accessibility

of elections and overall voter turnout.

The schemes developed and presented in this dissertation aim to provide eﬃcient, secure,

and privacy-preserving methods to share sensitive data. They share a centralized goal,

empowering the data owners to deﬁne how their sensitive data will be shared with data

users. The beneﬁts of such systems can be seen in many applications. Electronic medical

records were initially chosen as a common theme in this dissertation since electronic health

data often contain the most sensitive information of a person. The health sector is also one of

149

the most targeted in terms of malicious attacks. The schemes presented address the security

concerns seen today and provide an alternative method that can provide further advantages.

We then followed with an electronic voting scheme to show that data censorship is also an

imperative factor that must be considered and is applicable to other domains.

6.2 Future Work

There is continued interest in building data sharing and censorship schemes that mitigate the

security and privacy risks of trusted third parties. These security risks include preventing

data loss and leakage of sensitive data, protecting data integrity, and ensuring its availability

and veriﬁability whenever requested. In the following, we highlight some promising research

directions that incorporate the security and privacy advantages of distributed networks and

build on the work presented in this dissertation.

• Improving the privacy of distributed storage. Distributed systems such as IPFS can
provide eﬃcient methods for data storage to be accessed by data users. In such sys-

tems, data is stored across a P2P network, hence, eliminating single points of failure.

In addition, distributed storage systems can help improve the eﬃciency of retrieving

data when requested. However, these systems lack suﬃcient privacy countermeasures.

Users may protect the privacy of their data by encrypting it before uploading it to the

distributed storage, however, this results in additional computational and monetary

costs. We plan on performing more research in this area to improve the privacy coun-

termeasures and reducing the overall costs of using distributed storage. We plan to

investigate the feasibility of a distributed storage system as an alternative to current

centralized systems with similar or superior security and privacy countermeasures.

150

• Improving linkability/traceability of published data. We have shown that distributed
platforms may signiﬁcantly help improve security and single points of failure issues

when compared to reliance on a trusted third party. However, distributed platforms

also introduce new privacy challenges that must be addressed in order for these systems

to be accepted. Sharing data in a distributed platform exposes data owners to loss

of their anonymity. Research in mixed networks or use of expensive cryptographic

primitives such as ring signatures [100] and zero-knowledge proofs [101] have been

introduced to overcome this issue. However, it has been proven that mixing and the

current cryptographic primitives solutions cannot provide perfect secrecy [102] and are

expensive, making the proposed solutions infeasible to be adopted in real-life. This

becomes a bigger issue if an application requires user or universal veriﬁability where

individuals are allowed to verify that their data is stored correctly at all times. We wish

to explore potential solutions to this privacy issue that will allow users to publicly share

their data in a distributed manner while continuously maintaining their privacy. This

would be beneﬁcial to systems such as electronic voting schemes where voters could

continue to keep their votes secret even after the election results have been announced.

• Designing data sharing and censorship schemes for expanded application domains.
Since the outset of distributed platforms such as blockchain, it has become an inter-

esting research area to design distributed data sharing and censorship schemes. While

blockchain was originally invented to enable the decentralized digital currency Bitcoin,

in this dissertation we utilized this technology to design schemes for secure data sharing

of electronic medical records and censorship of votes in electronically held elections.

Similarly, several other domains may make use of the permanent, transparent, and

151

irreversible features of blockchain. One particular area of interest is the application of

blockchain in commercial operations and supply chain of manufacturing. In today’s

world, materials can be acquired from multiple vendors spanning across continents.

Full immutable records on the history of all materials into a ﬁnished good can allow

deeper data analysis and trending capabilities. Here we discuss two industries, the

automobile and pharmaceutical industry, in which blockchain could be leveraged to

beneﬁt consumers and companies.

In the automobile industry, recalls can cause consumer distrust.

In extreme cases,

faulty parts may be dangerous or lead to fatal outcomes. By utilizing blockchain

to create a record on the parts, manufacturing history and ownership history of a

vehicle, recall alerts can be expedited to consumers. In addition, vehicle manufacturers

will have access to data on parts and manufacturing history that they can leverage

to ﬁnd the root cause of issues. This data could even lead to the prevention of a

recall by tracing a faulty part to vehicles still in manufacture, or identiﬁcation of a

certain supplier providing a high number of faulty parts.

In another application in

the automobile industry, a public, immutable ledger may give rise to shared public

ownership of vehicles. Blockchain can be utilized to make small payments on vehicle

use as needed. In addition, a fully traceable history of the vehicle can allow preventive

maintenance to be maintained without a sole owner.

Similar to the automobile industry, medicine recalls in the pharmaceutical industry can

also cause consumer distrust and lead to dangerous outcomes. However, this industry

presents an even greater challenge for traceability because a consumer may receive the

ﬁnished product in the form of tablets dispensed by a pharmacist, with no indication

152

to the consumer if their product is aﬀected by a recall after. By leveraging blockchain,

the pharmaceutical industry can work with pharmacists to create an end-to-end history

of the materials and manufacturing of a product that can be used to inform patients

sooner of product recalls.

In the future, we will explore the applicability and feasibility of distributed platforms

such as blockchain to other domains.

153

BIBLIOGRAPHY

154

BIBLIOGRAPHY

[1] M. Healey, “Umass memorial health care entities to pay $230,000 to resolve ag’s law-
suit over data breaches,” https://www.mass.gov/news/umass-memorial-health-care-
entities-to-pay-230000-to-resolve-ags-lawsuit-over-data-breaches, 2018,
[Online; ac-
cessed 10-December-2018].

[2] A. Sahai and B. Waters, “Fuzzy identity-based encryption,” in Annual International
Springer,

Conference on the Theory and Applications of Cryptographic Techniques.
2005, pp. 457–473.

[3] R. Ostrovsky, A. Sahai, and B. Waters, “Attribute-based encryption with non-
monotonic access structures,” in Proceedings of the 14th ACM conference on Computer
and communications security, 2007, pp. 195–203.

[4] V. Goyal, O. Pandey, A. Sahai, and B. Waters, “Attribute-based encryption for ﬁne-
grained access control of encrypted data,” in Proceedings of the 13th ACM conference
on Computer and communications security. Acm, 2006, pp. 89–98.

[5] N. Attrapadung, B. Libert, and E. De Panaﬁeu, “Expressive key-policy attribute-based
encryption with constant-size ciphertexts,” in International Workshop on Public Key
Cryptography. Springer, 2011, pp. 90–108.

[6] J. Bethencourt, A. Sahai, and B. Waters, “Ciphertext-policy attribute-based encryp-
IEEE, 2007, pp.

tion,” in 2007 IEEE symposium on security and privacy (SP’07).
321–334.

[7] D. F. Ferraiolo and D. R. Kuhn, “Role-based access controls,” arXiv preprint

arXiv:0903.2171, 2009.

[8] L. Cheung and C. Newport, “Provably secure ciphertext policy abe,” in Proceedings
of the 14th ACM conference on Computer and communications security. ACM, 2007,
pp. 456–465.

[9] A. Lewko, T. Okamoto, A. Sahai, K. Takashima, and B. Waters, “Fully secure func-
tional encryption: Attribute-based encryption and (hierarchical) inner product encryp-
tion,” in 2010 Int. Conf. on the Theory and Applications of Cryptographic Techniques.
Springer, 2010, pp. 62–91.

155

[10] L. Ibraimi, M. Petkovic, S. Nikova, P. Hartel, and W. Jonker, “Mediated ciphertext-
policy attribute-based encryption and its application,” in International Workshop on
Information Security Applications. Springer, 2009, pp. 309–323.

[11] S. Roy and M. Chuah, “Secure data retrieval based on ciphertext policy attribute-based

encryption (cp-abe) system for the dtns,” Citeseer, Tech. Rep., 2009.

[12] J. Lai, R. H. Deng, and Y. Li, “Fully secure cipertext-policy hiding cp-abe,” in In-
Springer,

ternational Conference on Information Security Practice and Experience.
2011, pp. 24–39.

[13] J. Lai, R. Deng, and Y. Li, “Expressive cp-abe with partially hidden access structures,”
in Proceedings of the 7th ACM symposium on information, computer and communica-
tions security, 2012, pp. 18–19.

[14] F. Guo, Y. Mu, W. Susilo, D. S. Wong, and V. Varadharajan, “Cp-abe with constant-
size keys for lightweight devices,” IEEE transactions on information forensics and
security, vol. 9, no. 5, pp. 763–771, 2014.

[15] J. Hur and D. K. Noh, “Attribute-based access control with eﬃcient revocation in data
outsourcing systems,” IEEE Transactions on Parallel and Distributed Systems, vol. 22,
no. 7, pp. 1214–1221, 2010.

[16] K. Emura, A. Miyaji, A. Nomura, K. Omote, and M. Soshi, “A ciphertext-policy
attribute-based encryption scheme with constant ciphertext length,” in International
Conference on Information Security Practice and Experience.
Springer, 2009, pp.
13–23.

[17] Z. Zhou and D. Huang, “On eﬃcient ciphertext-policy attribute based encryption and
broadcast encryption,” in Proceedings of the 17th ACM conference on Computer and
communications security, 2010, pp. 753–755.

[18] J. Herranz, F. Laguillaumie, and C. R`afols, “Constant size ciphertexts in threshold
attribute-based encryption,” in International Workshop on Public Key Cryptography.
Springer, 2010, pp. 19–34.

[19] A. Lewko and B. Waters, “New proof methods for attribute-based encryption: Achiev-
ing full security through selective techniques,” in Annual Cryptology Conference.
Springer, 2012, pp. 180–198.

156

[20] C. Chen, Z. Zhang, and D. Feng, “Eﬃcient ciphertext policy attribute-based encryp-
tion with constant-size ciphertext and constant computation-cost,” in International
Conference on Provable Security. Springer, 2011, pp. 84–101.

[21] Z. Liu, Z. Cao, and D. S. Wong, “White-box traceable ciphertext-policy attribute-
based encryption supporting any monotone access structures,” IEEE Transactions on
Information Forensics and Security, vol. 8, no. 1, pp. 76–88, 2012.

[22] V. Goyal, A. Jain, O. Pandey, and A. Sahai, “Bounded ciphertext policy attribute
based encryption,” in International Colloquium on Automata, Languages, and Pro-
gramming. Springer, 2008, pp. 579–591.

[23] A. Ge, R. Zhang, C. Chen, C. Ma, and Z. Zhang, “Threshold ciphertext policy
attribute-based encryption with constant size ciphertexts,” in Australasian Confer-
ence on Information Security and Privacy. Springer, 2012, pp. 336–349.

[24] G. Wang, Q. Liu, and J. Wu, “Hierarchical attribute-based encryption for ﬁne-grained
access control in cloud storage services,” in Proceedings of the 17th ACM conference
on Computer and communications security. ACM, 2010, pp. 735–737.

[25] G. Wang, Q. Liu, J. Wu, and M. Guo, “Hierarchical attribute-based encryption and
scalable user revocation for sharing data in cloud servers,” computers & security,
vol. 30, no. 5, pp. 320–331, 2011.

[26] C. Gentry and A. Silverberg, “Hierarchical id-based cryptography,” in International
Conference on the Theory and Application of Cryptology and Information Security.
Springer, 2002, pp. 548–566.

[27] R. Bobba, H. Khurana, and M. Prabhakaran, “Attribute-sets: A practically motivated
enhancement to attribute-based encryption,” in European Symposium on Research in
Computer Security. Springer, 2009, pp. 587–604.

[28] Z. Wan, J. Liu, and R. H. Deng, “Hasbe: a hierarchical attribute-based solution for
ﬂexible and scalable access control in cloud computing,” IEEE transactions on infor-
mation forensics and security, vol. 7, no. 2, pp. 743–754, 2012.

[29] H. Deng, Q. Wu, B. Qin, J. Domingo-Ferrer, L. Zhang, J. Liu, and W. Shi, “Ciphertext-
policy hierarchical attribute-based encryption with short ciphertexts,” Information
Sciences, vol. 275, pp. 370–384, 2014.

157

[30] J. Li, Q. Wang, C. Wang, and K. Ren, “Enhancing attribute-based encryption with
attribute hierarchy,” Mobile networks and applications, vol. 16, no. 5, pp. 553–561,
2011.

[31] S. Wang, J. Zhou, J. K. Liu, J. Yu, J. Chen, and W. Xie, “An eﬃcient ﬁle hierar-
chy attribute-based encryption scheme in cloud computing,” IEEE Transactions on
Information Forensics and Security, vol. 11, no. 6, pp. 1265–1277, 2016.

[32] G. Zyskind, O. Nathan et al., “Decentralizing privacy: Using blockchain to protect
IEEE, 2015,

personal data,” in Security and Privacy Workshops (SPW), 2015 IEEE.
pp. 180–184.

[33] A. Azaria, A. Ekblaw, T. Vieira, and A. Lippman, “Medrec: Using blockchain for
medical data access and permission management,” in Open and Big Data (OBD),
International Conference on.

IEEE, 2016, pp. 25–30.

[34] G. Wood, “Ethereum: A secure decentralised generalised transaction ledger,”

Ethereum project yellow paper, vol. 151, pp. 1–32, 2014.

[35] A. Dubovitskaya, Z. Xu, S. Ryu, M. Schumacher, and F. Wang, “Secure and trustable
electronic medical records sharing using blockchain,” in AMIA Annual Symposium
Proceedings, vol. 2017. American Medical Informatics Association, 2017, p. 650.

[36] P. McCorry, S. F. Shahandashti, and F. Hao, “A smart contract for boardroom voting
with maximum voter privacy,” in International Conference on Financial Cryptography
and Data Security. Springer, 2017, pp. 357–375.

[37] F. Hao, P. Y. Ryan, and P. Zieli´nski, “Anonymous voting by two-round public discus-

sion,” IET Information Security, vol. 4, no. 2, pp. 62–67, 2010.

[38] G. G. Dagher, P. B. Marella, M. Milojkovic, and J. Mohler, “Broncovote: Secure voting

system using ethereum’s blockchain,” 2018.

[39] F. S. Hardwick, A. Gioulis, R. N. Akram, and K. Markantonakis, “E-voting with
blockchain: an e-voting protocol with decentralisation and voter privacy,” in 2018
IEEE International Conference on Internet of Things (iThings) and IEEE Green Com-
puting and Communications (GreenCom) and IEEE Cyber, Physical and Social Com-
puting (CPSCom) and IEEE Smart Data (SmartData).
IEEE, 2018, pp. 1561–1567.

[40] B. Yu, J. K. Liu, A. Sakzad, S. Nepal, R. Steinfeld, P. Rimba, and M. H. Au, “Platform-
independent secure blockchain-based voting system,” in International Conference on
Information Security. Springer, 2018, pp. 369–386.

158

[41] B. Wang, J. Sun, Y. He, D. Pang, and N. Lu, “Large-scale election based on

blockchain,” Procedia Computer Science, vol. 129, pp. 234–237, 2018.

[42] M. Lichman, “Uci machine learning repository,” = http://archive.ics.uci.edu/ml, 2013,

[Online; accessed 9-September-2016].

[43] B. Waters, “Ciphertext-policy attribute-based encryption: An expressive, eﬃcient, and
provably secure realization,” in International Workshop on Public Key Cryptography.
Springer, 2011, pp. 53–70.

[44] J. Daemen and V. Rijmen, The design of Rijndael: AES-the advanced encryption

standard. Springer Science & Business Media, 2013.

[45] J. Katz and Y. Lindell, Introduction to modern cryptography. Chapman and Hal-

l/CRC, 2014.

[46] J. Wang, “Ciphertext-policy attribute based encryption java toolkit,” https://github.

com/junwei-wang/cpabe, 2015, [Online; accessed 9-September-2016].

[47] A. De Caro and V. Iovino, “Jpbc: Java pairing based cryptography,” in Computers
IEEE, 2011, pp. 850–855.

and communications (ISCC), 2011 IEEE Symposium on.

[48] S. Haber and W. S. Stornetta, “How to time-stamp a digital document,” in Conference

on the Theory and Application of Cryptography. Springer, 1990, pp. 437–455.

[49] R. C. Merkle, “A digital signature based on a conventional encryption function,” in
Springer,

Conference on the Theory and Application of Cryptographic Techniques.
1987, pp. 369–378.

[50] D. Bayer, S. Haber, and W. S. Stornetta, “Improving the eﬃciency and reliability of

digital time-stamping,” in Sequences II. Springer, 1993, pp. 329–334.

[51] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” https://bitcoin.org/

bitcoin.pdf, 2008, [Online; accessed 12-December-2017].

[52] A. M. Antonopoulos, Mastering Bitcoin: unlocking digital cryptocurrencies. ” O’Reilly

Media, Inc.”, 2014.

[53] “Satoshi

client node discovery,” https://en.bitcoin.it/wiki/Satoshi Client Node

Discovery, 2017, [Online; accessed 15-December-2017].

159

[54] D. R. L. Brown, “SEC 2: Recommended elliptic curve domain parameters,” Certicom

Research, Tech. Rep., 2010.

[55] J. R. Douceur, “The sybil attack,” in International Workshop on Peer-to-Peer Systems.

Springer, 2002, pp. 251–260.

[56] M. Rosenfeld, “Analysis of bitcoin pooled mining reward systems,” arXiv preprint

arXiv:1112.4980, 2011.

[57] S. Pool, “Reward system,” https://slushpool.com/help/manual/rewards, 2017, [On-

line; accessed 12-December-2017].

[58] T. B. C. developers, “bitcoin/bitcoin,” https://github.com/bitcoin/bitcoin, 2017, [On-

line; accessed 26-March-2017].

[59] A. Loibl, “Namecoin,” Network Architectures and Services, vol. 107, 2014.

[60] “Bitcoin Cash,” https://www.bitcoincash.org/, 2017, [Online; accessed 10-October-

2017].

[61] J. Hilliard, “Reduced threshold segwit masf,” https://github.com/bitcoin/bips/wiki/

Comments:BIP-0091, 2017, [Online; accessed 13-November-2017].

[62] P. Wuille, “Segregated witness and its impact on scalability,” in SF Bitcoin Devs

Seminar, 2015.

[63] H. Finney, “Best practice for fast transaction acceptance - how high is the risk?’,”
https://bitcointalk.org/index.php?topic=3441.msg48384#msg48384, Feb. 2011, [On-
line; accessed 13-November-2017].

[64] K. Sigman, “Gambler’s Ruin Problem,” http://www.columbia.edu/∼ks20/FE-Notes/

4700-07-Notes-GR.pdf.

[65] A. P. Ozisik and B. N. Levine, “An explanation of nakamoto’s analysis of double-spend

attacks,” arXiv preprint arXiv:1701.03977, 2017.

[66] M. Rosenfeld, “Analysis of hashrate-based double spending,” arXiv preprint

arXiv:1402.2009, 2014.

[67] “Mining

hardware

comparison,”

https://en.bitcoin.it/wiki/Mining hardware

comparison, 2017, [Online; accessed 10-October-2017].

160

[68] “Non-specialized hardware comparison,” https://en.bitcoin.it/wiki/Non-specialized

hardware comparison, 2017, [Online; accessed 10-October-2017].

[69] Bitmain, “BIP process, revised,” https://www.antpool.com/, 2017, [Online; accessed

9-October-2017].

[70] B. China, “Bttc,” https://www.btcc.com/, 2011-2017, [Online; accessed 9-October-

2017].

[71] L. HK Bixin Network Technology Co., “Bixin,” https://bixin.com/, [Online; accessed

9-October-2017].

[72] Bitmain, “Btc.com,” https://btc.com/, 2017, [Online; accessed 9-October-2017].

[73] Btctop, “btc.top,” http://btc.top/, 2017, [Online; accessed 9-October-2017].

[74] “U.S. energy information administration,” https://www.eia.gov/electricity/monthly/

epm table grapher.php?t=epmt 5 6 a, [Online; accessed 9-October-2017].

[75] E. Zaghloul, K. Zhou, and J. Ren, “P-mod: Secure privilege-based multilevel organi-

zational data-sharing in cloud computing,” arXiv preprint arXiv:1801.02685, 2018.

[76] W. Stallings, Cryptography and network security: principles and practice. Pearson

Upper Saddle River, NJ, 2017.

[77] M. Pirretti, P. Traynor, P. McDaniel, and B. Waters, “Secure attribute-based systems,”

J. of Comp. Security, vol. 18, no. 5, pp. 799–837, 2010.

[78] A. Singh, T. wan Ngan, P. Druschel, and D. S. Wallach, “Eclipse attacks on overlay

networks: Threats and defenses,” in In IEEE INFOCOM. Citeseer, 2006.

[79] “Solidity,” http://solidity.readthedocs.io/en/v0.4.24/, 2018.

[80] “Ropsten (revival) testnet,” https://ropsten.etherscan.io, 2018.

[81] California Legislative Information, “Chapter 1. patient access to health records
[123100 - 123149.5],” http://leginfo.legislature.ca.gov/faces/codes displaySection.
xhtml?lawCode=HSC&sectionNum=123110, 2017.

[82] “Maximum fees allowed for providing health care information,” https://www.utmb.

edu/him/pdf/fee schedule.pdf, 2017.

161

[83] “395.3025-patient and personnel records; copies; examination,” http://www.ﬂsenate.

gov/laws/statutes/2011/395.3025, 2018.

[84] “Statute sections 17 and 18 of public health law (phl),” https://www.health.ny.gov,

2018.

[85] “Copying

fees

adjustments,”

https://illinoiscomptroller.gov/agencies/resource-

library/statutorily-required/copying-fees-adjustments/, 2018.

[86] “Json rpc,” https://github.com/ethereum/wiki/wiki/JSON-RPC#eth estimategas,

2018.

[87] “Ethereum average gasprice chart,” https://etherscan.io/chart/gasprice, 2018.

[88] S. E. White, “A review of big data in health care: challenges and opportunities,” Open

Access Bioinformatics, vol. 6, pp. 13–18, 2014.

[89] National Conference of State Legislatures, “Double voting,” http://www.ncsl.org/

research/elections-and-campaigns/double-voting.aspx, 2018.

[90] A. Juels, D. Catalano, and M. Jakobsson, “Coercion-resistant electronic elections,” in
Proceedings of the 2005 ACM workshop on Privacy in the electronic society. ACM,
2005, pp. 61–70.

[91] K. Sako and J. Kilian, “Receipt-free mix-type voting scheme,” in International Con-
ference on the Theory and Applications of Cryptographic Techniques. Springer, 1995,
pp. 393–403.

[92] D. W. Jones, “Some problems with end-to-end voting,” in End-to-End Voting Systems
Workshop, Washington DC: National Institute for Standards and Technology (NIST),
at www. divms. uiowa. edu/˜ jones/voting/E2E2009. pdf [last accessed December 23,
2011], 2009.

[93] E. Zaghloul, T. Li, M. Mutka, and J. Ren, “Bitcoin and blockchain: Security and

privacy,” arXiv preprint arXiv:1904.11435, 2019.

[94] M. H. Au, S. S. Chow, W. Susilo, and P. P. Tsang, “Short linkable ring signatures
Springer, 2006, pp.

revisited,” in European Public Key Infrastructure Workshop.
101–115.

162

[95] C.-P. Schnorr, “Eﬃcient signature generation by smart cards,” Journal of cryptology,

vol. 4, no. 3, pp. 161–174, 1991.

[96] A. Fiat and A. Shamir, “How to prove yourself: Practical solutions to identiﬁcation and
signature problems,” in Conference on the Theory and Application of Cryptographic
Techniques. Springer, 1986, pp. 186–194.

[97] “Arbitrary-precision arithmetic in pure swift,” https://github.com/attaswift/BigInt,

2019, [Online; accessed 15-May-2019].

[98] “Metamask browser extension,” https://github.com/MetaMask/metamask-extension,

2019, [Online; accessed 15-May-2019].

[99] “Elegant web3js functionality in swift. native abi parsing and smart contract inter-
actions.” https://github.com/matter-labs/web3swift, 2019, [Online; accessed 15-May-
2019].

[100] T. Ruﬃng, P. Moreno-Sanchez, and A. Kate, “Coinshuﬄe: Practical decentralized
coin mixing for bitcoin,” in European Symposium on Research in Computer Security.
Springer, 2014, pp. 345–364.

[101] E. B. Sasson, A. Chiesa, C. Garman, M. Green, I. Miers, E. Tromer, and M. Virza,
“Zerocash: Decentralized anonymous payments from bitcoin,” in Security and Privacy
(SP), 2014 IEEE Symposium on.

IEEE, 2014, pp. 459–474.

[102] M. M¨oser, K. Soska, E. Heilman, K. Lee, H. Heﬀan, S. Srivastava, K. Hogan, J. Hen-
nessey, A. Miller, A. Narayanan et al., “An empirical analysis of traceability in the
monero blockchain,” Proceedings on Privacy Enhancing Technologies, vol. 2018, no. 3,
pp. 143–163, 2018.

163