Tracing Distributed Algorithms Using Replay Clocks
In this thesis, we introduce replay clocks ($\repcl$), a novel clock infrastructure that allows us to do offline analyses of distributed computations. The replay clock structure provides a methodology to \textit{replay} a computation as it happened, with the ability to represent concurrent events effectively. It builds on the structures introduced by vector clocks ($\vc$) and the Hybrid Logical Clock ($\hlc$), combining their infrastructures to provide efficient replay. With such a clock, a user can replay a computation whilst considering multiple paths of executions, and check for constraint violations and properties that potential pathways could take, especially in the presence of concurrent events. Specifically, if event $e$ must occur before $f$ then the replay clock must ensure that $e$ is replayed before $f$. On the other hand, if $e$ and $f$ could occur in any order, replay should not force an order between them. After identifying the limitations of existing clocks to provide the replay primitive, we present the $\repcl$ structure and identify an efficient representation for the same. We demonstrate that $\repcl$ can be implemented with less than four integers for 64 processes for various system parameters if clocks are synchronized within $1ms$. Furthermore, the overhead of $\repcl$ (for computing/comparing timestamps and message size) is proportional to the size of the clock. Using simulations in a custom distributed system and NS-3, a state-of-the-art network simulator, we identify the expected overhead of $\repcl$ based on the given system settings. We also identify how a user can then identify feasibility region for $\repcl$. Specifically, given the desired overhead of $\repcl$, it identifies the region where unabridged replay is possible. Using the $\repcl$, we provide a tracer for distributed computations, that allows any computation using the $\repcl$ to be replayed efficiently. The visualization allows users to analyze specific properties and constraints in an online fashion, with the ability to consider concurrent paths independently. The visualization provides per-process views and an overarching view of the whole computation based on the time recorded by the $\repcl$ for each event.
Read
- In Collections
-
Electronic Theses & Dissertations
- Copyright Status
- Attribution 4.0 International
- Material Type
-
Theses
- Authors
-
Lagwankar, Ishaan Kiran
- Thesis Advisors
-
Kulkarni, Sandeep S.
- Committee Members
-
Xiao, Li
McKinley, Phillip K.
- Date Published
-
2024
- Subjects
-
Computer science
- Program of Study
-
Computer Science - Master of Science
- Degree Level
-
Masters
- Language
-
English
- Pages
- 64 pages
- Permalink
- https://doi.org/doi:10.25335/cbq9-be16