You are here
Search results
(1 - 7 of 7)
- Title
- Using Eventual Consistency to Improve the Performance of Distributed Graph Computation In Key-Value Stores
- Creator
- Nguyen, Duong Ngoc
- Date
- 2021
- Collection
- Electronic Theses & Dissertations
- Description
-
Key-value stores have gained increasing popularity due to their fast performance and simple data model. A key-value store usually consists of multiple replicas located in different geographical regions to provide higher availability and fault tolerance. Consequently, a protocol is employed to ensure that data are consistent across the replicas.The CAP theorem states the impossibility of simultaneously achieving three desirable properties in a distributed system, namely consistency,...
Show moreKey-value stores have gained increasing popularity due to their fast performance and simple data model. A key-value store usually consists of multiple replicas located in different geographical regions to provide higher availability and fault tolerance. Consequently, a protocol is employed to ensure that data are consistent across the replicas.The CAP theorem states the impossibility of simultaneously achieving three desirable properties in a distributed system, namely consistency, availability, and network partition tolerance. Since failures are a norm in distributed systems and the capability to maintain the service at an acceptable level in the presence of failures is a critical dependability and business requirement of any system, the partition tolerance property is a necessity. Consequently, the trade-off between consistency and availability (performance) is inevitable. Strong consistency is attained at the cost of slow performance and fast performance is attained at the cost of weak consistency, resulting in a spectrum of consistency models suitable for different needs. Among the consistency models, sequential consistency and eventual consistency are two common ones. The former is easier to program with but suffers from poor performance whereas the latter suffers from potential data anomalies while providing higher performance.In this dissertation, we focus on the problem of what a designer should do if he/she is asked to solve a problem on a key-value store that provides eventual consistency. Specifically, we are interested in the approaches that allow the designer to run his/her applications on an eventually consistent key-value store and handle data anomalies if they occur during the computation. To that end, we investigate two options: (1) Using detect-rollback approach, and (2) Using stabilization approach. In the first option, the designer identifies a correctness predicate, say $\Phi$, and continues to run the application as if it was running on sequential consistency, as our system monitors $\Phi$. If $\Phi$ is violated (because the underlying key-value store provides eventual consistency), the system rolls back to a state where $\Phi$ holds and the computation is resumed from there. In the second option, the data anomalies are treated as state perturbations and handled by the convergence property of stabilizing algorithms.We choose LinkedIn's Voldemort key-value store as the example key-value store for our study. We run experiments with several graph-based applications on Amazon AWS platform to evaluate the benefits of the two approaches. From the experiment results, we observe that overall, both approaches provide benefits to the applications when compared to running the applications on sequential consistency. However, stabilization provides higher benefits, especially in the aggressive stabilization mode which trades more perturbations for no locking overhead.The results suggest that while there is some cost associated with making an algorithm stabilizing, there may be a substantial benefit in revising an existing algorithm for the problem at hand to make it stabilizing and reduce the overall runtime under eventual consistency.There are several directions of extension. For the detect-rollback approach, we are working to develop a more general rollback mechanism for the applications and improve the efficiency and accuracy of the monitors. For the stabilization approach, we are working to develop an analytical model for the benefits of eventual consistency in stabilizing programs. Our current work focuses on silent stabilization and we plan to extend our approach to other variations of stabilization.
Show less
- Title
- Towards machine learning based source identification of encrypted video traffic
- Creator
- Shi, Yan (Of Michigan State University)
- Date
- 2019
- Collection
- Electronic Theses & Dissertations
- Description
-
The rapid growth of the Internet has helped to popularize video streaming services, which has now become the most dominant content on the Internet. The management of video streaming traffic is complicated by its enormous volume, diverse communication protocols and data formats, and the widespread adoption of encryption. In this thesis, the aim is to develop a novel firewall framework, named Soft-margined Firewall, for managing encrypted video streaming traffic while avoiding violation of user...
Show moreThe rapid growth of the Internet has helped to popularize video streaming services, which has now become the most dominant content on the Internet. The management of video streaming traffic is complicated by its enormous volume, diverse communication protocols and data formats, and the widespread adoption of encryption. In this thesis, the aim is to develop a novel firewall framework, named Soft-margined Firewall, for managing encrypted video streaming traffic while avoiding violation of user privacy. The system distinguishes itself from conventional firewall systems by incorporating machine learning and Traffic Analysis (TA) as a traffic detection and blocking mechanism. The goal is to detect unknown network traffic, including traffic that is encrypted, tunneled through Virtual Private Network, or obfuscated, in realistic application scenarios. Existing TA methods have limitations in that they can deal only with simple traffic patterns-usually, only a single source of traffic is allowed in a tunnel, and a trained classifier is not portable between network locations, requiring redundant training. This work aims to address these limitations with new techniques in machine learning. The three main contributions of this work are: 1) developing new statistical features around traffic surge periods that can better identify websites with dynamic contents; 2) a two-stage classifier architecture to solve the mixed-traffic problem with state-of-the-art TA features; and 3) leveraging a novel natural-language inspired feature to solve the mixed-traffic problem using Deep-Learning methods. A fully working Soft-margin Firewall with the above distinctive features have been designed, implemented, and verified for both conventional classifiers and the proposed deep-learning based classifiers. The efficacy of the proposed system is confirmed via experiments conducted on actual network setups with a custom-built prototype firewall and OpenVPN servers. The proposed feature-classifier combinations show superior performance compared to previous state-of-the-art results. The solution that combines natural-language inspired traffic feature and Deep-Learning is demonstrated to be able to solve the mixed-traffic problem, and capable of predicting multiple labels associated with one sample. Additionally, the classifier can classify traffic recorded from locations that are different from where the trained traffic was collected. These results are the first of their kind and are expected to lead the way of creating next-generation TA-based firewall systems.
Show less
- Title
- Signal processing and machine learning approaches to enabling advanced sensing and networking capabilities in everyday infrastructure and electronics
- Creator
- Ali, Kamran (Scientist)
- Date
- 2020
- Collection
- Electronic Theses & Dissertations
- Description
-
Mainstream commercial off-the-shelf (COTS) electronic devices of daily use are usually designed and manufactured to serve a very specific purpose. For example, the WiFi routers and network interface cards (NICs) are designed for high speed wireless communication, RFID readers and tags are designed to identify and track items in supply chain, and smartphone vibrator motors are designed to provide haptic feedback (e.g. notifications in silent mode) to the users. This dissertation focuses on...
Show moreMainstream commercial off-the-shelf (COTS) electronic devices of daily use are usually designed and manufactured to serve a very specific purpose. For example, the WiFi routers and network interface cards (NICs) are designed for high speed wireless communication, RFID readers and tags are designed to identify and track items in supply chain, and smartphone vibrator motors are designed to provide haptic feedback (e.g. notifications in silent mode) to the users. This dissertation focuses on revisiting the physical-layer of various such everyday COTS electronic devices, either to leverage the signals obtained from their physical layers to develop novel sensing applications, or to modify/improve their PHY/MAC layer protocols to enable even more useful deployment scenarios and networking applications - while keeping their original purpose intact - by introducing mere software/firmware level changes and completely avoiding any hardware level changes. Adding such new usefulness and functionalities to existing everyday infrastructure and electronics has advantages both in terms of cost and convenience of use/deployment, as those devices (and their protocols) are already mainstream, easily available, and often already purchased and in use/deployed to serve their mainstream purpose of use.In our works on WiFi signals based sensing, we propose signal processing and machine learning approaches to enable fine-grained gesture recognition and sleep monitoring using COTS WiFi devices. In our work on gesture recognition, we show for the first time thatWiFi signals can be used to recognize small gestures with high accuracy. In our work on sleep monitoring, we propose for the first time aWiFi CSI based sleep quality monitoring scheme which can robustly track breathing and body/limb activity related vital signs during sleep throughout a night in an individual and environment independent manner.In our work on RFID signals based sensing, we propose signal processing and machine learning approaches to effectively image customer activity in front of display items in places such as retail stores using commercial off-the-shelf (COTS) monostatic RFID devices (i.e. which use a single antenna at a time for both transmitting and receiving RFID signals to and from the tags). The key novelty of this work is on achieving multi-person activity tracking in front of display items by constructing coarse grained images via robust, analytical model-driven deep learning based, RFID imaging. We implemented our scheme using a COTS RFID reader and tags.In our work on smartphone's vibration based sensing, we propose a robust and practical vibration based sensing scheme that works with smartphones with different hardware, can extract fine-grained vibration signatures of different surfaces, and is robust to environmental noise and hardware based irregularities. A useful application of this sensing is symbolic localization/tagging, e.g. figuring out whether a user's device is in their hand, pocket, or at their bedroom table, etc. Such symbolic tagging of locations can provide us with indirect information about user activities and intentions without any dedicated infrastructure, based on which we can enable useful services such as context aware notifications/alarms. To make our scheme easily scalable and compatible with COTS smartphones, we design our signal processing and machine learning pipeline such that it relies only on builtin vibration motors and microphone for sensing, and it is robust to hardware irregularities and background environmental noises. We tested our scheme on two different Android smartphones.In our work on powerline communications (PLCs), we propose a distributed spectrum sharing scheme for enterprise level PLC mesh networks. This work is a major step towards using existing COTS PLC devices to connect different types of Internet of Things (IoT) devices for sensing and control related applications in large campuses such as enterprises. Our work is based on identification of a key weakness of the existing HomePlug AV (HPAV) PLC protocol that it does not support spectrum sharing, i.e., currently each link operates over the whole available spectrum, and therefore, only one link can operate at a time. Our proposed spectrum sharing scheme significantly boosts both aggregated and per-link throughputs, by allowing multiple links to communicate concurrently, while requiring a few modifications to the existing HPAV protocol.
Show less
- Title
- Metamodeling framework for simultaneous multi-objective optimization using efficient evolutionary algorithms
- Creator
- Roy, Proteek Chandan
- Date
- 2019
- Collection
- Electronic Theses & Dissertations
- Description
-
Most real-world problems are comprised of multiple conflicting objectives and solutions to those problems are multiple Pareto-optimal trade-off solutions. The main challenge of these practical problems is that the objectives and constraints do not have any closed functional forms and they are expensive for computation as well. Objectives coming from finite element analysis, computational fluid dynamics software, network flow simulators, crop modeling, weather modeling or any other simulations...
Show moreMost real-world problems are comprised of multiple conflicting objectives and solutions to those problems are multiple Pareto-optimal trade-off solutions. The main challenge of these practical problems is that the objectives and constraints do not have any closed functional forms and they are expensive for computation as well. Objectives coming from finite element analysis, computational fluid dynamics software, network flow simulators, crop modeling, weather modeling or any other simulations which involve partial differential equations are good examples of expensive problems. These problems can also be regarded as l03000300ow-budget'' problems since only a few solution evaluations can be performed given limited time. Nevertheless, parameter estimation and optimization of objectives related to these simulations require a good number of solution evaluations to come up with better parameters or a reasonably good trade-off front. To provide an efficient search process within a limited number of exact evaluations, metamodel-assisted algorithms have been proposed in the literature. These algorithms attempt to construct a computationally inexpensive representative model of the problem, having the same global optima and thereby providing a way to carry out the optimization in metamodel space in an efficient way. Population-based methods like evolutionary algorithms have become standard for solving multi-objective problems and recently Metamodel-based evolutionary algorithms are being used for solving expensive problems. In this thesis, we would like to address a few challenges of metamodel-based optimization algorithms and propose some efficient and innovative ways to construct these algorithms. To approach efficient design of metamodel-based optimization algorithm, one needs to address the choice of metamodeling functions. The most trivial way is to build metamodels for each objective and constraint separately. But we can reduce the number of metamodel constructions by using some aggregated functions and target either single or multiple optima in each step. We propose a taxonomy of possible metamodel-based algorithmic frameworks which not only includes most algorithms from the literature but also suggests some new ones. We improve each of the frameworks by introducing trust region concepts in the multi-objective scenario and present two strategies for building trust regions. Apart from addressing the main bottleneck of the limited number of solution evaluations, we also propose efficient non-dominated sorting methods that further reduce computational time for a basic step of multi-objective optimization. We have carried out extensive experiments over all representative metamodeling frameworks and shown that each of them can solve a good number of test problems. We have not tried to tune the algorithmic parameters yet and it remains as our future work. Our theoretical analyses and extensive experiments suggest that we can achieve efficient metamodel-based multi-objective optimization algorithms for solving test as well as real-world expensive and low-budget problems.
Show less
- Title
- Hardware algorithms for high-speed packet processing
- Creator
- Norige, Eric
- Date
- 2017
- Collection
- Electronic Theses & Dissertations
- Description
-
The networking industry is facing enormous challenges of scaling devices to support theexponential growth of internet traffic as well as increasing number of features being implemented inside the network. Algorithmic hardware improvements to networking componentshave largely been neglected due to the ease of leveraging increased clock frequency and compute power and the risks of implementing complex hardware designs. As clock frequencyslows its growth, algorithmic solutions become important...
Show moreThe networking industry is facing enormous challenges of scaling devices to support theexponential growth of internet traffic as well as increasing number of features being implemented inside the network. Algorithmic hardware improvements to networking componentshave largely been neglected due to the ease of leveraging increased clock frequency and compute power and the risks of implementing complex hardware designs. As clock frequencyslows its growth, algorithmic solutions become important to fill the gap between currentgeneration capability and next generation requirements. This paper presents algorithmicsolutions to networking problems in three domains: Deep Packet Inspection(DPI), firewall(and other) ruleset compression and non-cryptographic hashing. The improvements in DPIare two-pronged: first in the area of application-level protocol field extraction, which allowssecurity devices to precisely identify packet fields for targeted validity checks. By usingcounting automata, we achieve precise parsing of non-regular protocols with small, constantper-flow memory requirements, extracting at rates of up to 30gbps on real traffic in softwarewhile using only 112 bytes of state per flow. The second DPI improvement is on the longstanding regular expression matching problem, where we complete the HFA solution to theDFA state explosion problem with efficient construction algorithms and optimized memorylayout for hardware or software implementation. These methods construct automata toocomplex to be constructed by previous methods in seconds, while being capable of 29gbpsthroughput with an ASIC implementation. Firewall ruleset compression enables more firewall entries to be stored in a fixed capacity pattern matching engine, and can also be usedto reorganize a firewall specification for higher performance software matching. A novelrecursive structure called TUF is given to unify the best known solutions to this problemand suggest future avenues of attack. These algorithms, with little tuning, achieve a 13.7%improvement in compression on large, real-life classifiers, and can achieve the same results asexisting algorithms while running 20 times faster. Finally, non-cryptographic hash functionscan be used for anything from hash tables to track network flows to packet sampling fortraffic characterization. We give a novel approach to generating hardware hash functionsin between the extremes of expensive cryptographic hash functions and low quality linearhash functions. To evaluate these mid-range hash functions properly, we develop new evaluation methods to better distinguish non-cryptographic hash function quality. The hashfunctions described in this paper achieve low-latency, wide hashing with good avalanche anduniversality properties at a much lower cost than existing solutions.
Show less
- Title
- Fluid animation on deforming surface meshes
- Creator
- Wang, Xiaojun (Graduate of Michigan State University)
- Date
- 2017
- Collection
- Electronic Theses & Dissertations
- Description
-
"We explore methods for visually plausible fluid simulation on deforming surfaces with inhomogeneous diffusion properties. While there are methods for fluid simulation on surfaces, not much research effort focused on the influence of the motion of underlying surface, in particular when it is not a rigid surface, such as knitted or woven textiles in motion. The complexity involved makes the simulation challenging to account for the non-inertial local frames typically used to describe the...
Show more"We explore methods for visually plausible fluid simulation on deforming surfaces with inhomogeneous diffusion properties. While there are methods for fluid simulation on surfaces, not much research effort focused on the influence of the motion of underlying surface, in particular when it is not a rigid surface, such as knitted or woven textiles in motion. The complexity involved makes the simulation challenging to account for the non-inertial local frames typically used to describe the motion and the anisotropic effects in diffusion, absorption, adsorption. Thus, our primary goal is to enable fast and stable method for such scenarios. First, in preparation of the material properties for the surface domain, we describe textiles with salient feature direction by bulk material property tensors in order to reduce the complexity, by employing 2D homogenization technique, which effectively turns microscale inhomogeneous properties into homogeneous properties in macroscale descriptions. We then use standard texture mapping techniques to map these tensors to triangles in the curved surface mesh, taking into account the alignment of each local tangent space with correct feature directions of the macroscale tensor. We show that this homogenization tool is intuitive, flexible and easily adjusted. Second, for efficient description of the deforming surface, we offer a new geometry representation for the surface with solely angles instead of vertex coordinates, to reduce storage for the motion of underlying surface. Since our simulation tool relies heavily on long sequences of 3D curved triangular meshes, it is worthwhile exploring such efficient representations to make our tool practical by reducing the memory access during real-time simulations as well as reducing the file sizes. Inspired by angle-based representations for tetrahedral meshes, we use spectral method to restore curved surface using both angles of the triangles and dihedral angles between adjacent triangles in the mesh. Moreover, in many surface deformation sequences, it is often sufficient to update the dihedral angles while keeping the triangle interior angles fixed. Third, we propose a framework for simulating various effects of fluid flowing on deforming surfaces. We directly applied our simulator on curved surface meshes instead of in parameter domains, whereas many existing simulation methods require a parameterization on the surface. We further demonstrate that fictitious forces induced by the surface motion can be added to the surface-based simulation at a small additional cost. These fictitious forces can be decomposed into different components. Only the rectilinear and Coriolis components are relevant to our choice of local frames. Other effects, such as diffusion, adsorption, absorption, and evaporation are also incorporated for realistic stain simulation. Finally, we explore the extraction of Lagrangian Coherent Structure (LCS), which is often referred to as the skeleton of fluid motion. The LCS structures are often described by ridges of the finite time Lyapunov exponent (FTLE) fields, which describe the extremal stretching of fluid parcels following the flow. We proposed a novel improvement to the ridge marching algorithm, which extract such ridges robustly for the typically noisy FTLE estimates even in well-defined fluid flows. Our results are potentially applicable to visualizing and controlling fluid trajectory patterns. In contrast to current methods for LCS calculation, which are only applicable to flat 2D or 3D domains and sensitive to noise, our ridge extraction is readily applicable to curved surfaces even when they are deforming. The collection of these computational tools will facilitate generation of realistic and easy to adjust surface fluid animation with various physically plausible effects on surface."--Pages ii-iii.
Show less
- Title
- Achieving reliable distributed systems : through efficient run-time monitoring and predicate detection
- Creator
- Tekken Valapil, Vidhya
- Date
- 2020
- Collection
- Electronic Theses & Dissertations
- Description
-
Runtime monitoring of distributed systems to perform predicate detection is critical as well as a challenging task. It is critical because it ensures the reliability of the system by detecting all possible violations of system requirements. It is challenging because to guarantee lack of violations one has to analyze every possible ordering of system events and this is an expensive task. In this report, wefocus on ordering events in a system run using HLC (Hybrid Logical Clocks) timestamps,...
Show moreRuntime monitoring of distributed systems to perform predicate detection is critical as well as a challenging task. It is critical because it ensures the reliability of the system by detecting all possible violations of system requirements. It is challenging because to guarantee lack of violations one has to analyze every possible ordering of system events and this is an expensive task. In this report, wefocus on ordering events in a system run using HLC (Hybrid Logical Clocks) timestamps, which are O(1) sized timestamps, and present some efficient algorithms to perform predicate detection using HLC. Since, with HLC, the runtime monitor cannot find all possible orderings of systems events, we present a new type of clock called Biased Hybrid Logical Clocks (BHLC), that are capable of finding more possible orderings than HLC. Thus we show that BHLC based predicate detection can find more violations than HLC based predicate detection. Since predicate detection based on both HLC and BHLC do not guarantee detection of all possible violations in a system run, we present an SMT (Satisfiability Modulo Theories) solver based predicate detection approach, that guarantees the detection of all possible violations in a system run. While a runtime monitor that performs predicate detection using SMT solvers is accurate, the time taken by the solver to detect the presence or absence of a violation can be high. To reduce the time taken by the runtime monitor, we propose the use of an efficient two-layered monitoring approach, where the first layer of the monitor is efficient but less accurate and the second layer is accurate but less efficient. Together they reduce the overall time taken to perform predicate detection drastically and also guarantee detection of all possible violations.
Show less