You are here
Search results
(1  20 of 27)
Pages
 Title
 Formalization and verification of property specification patterns
 Creator
 Bryndin, Dmitriy
 Date
 2010
 Collection
 Electronic Theses & Dissertations
 Description

Finitestate verification (FSV) techniques are intended for proving properties of software systems. Although significant progress has been made in the last decade automating FSV techniques, the adoption of these techniques by software developers is low. The Specification Pattern System (SPS) is intended to assist users in creating such specifications. It identifies common specification patterns and indicates how to translate the patterns into a variety of different specification languages....
Show moreFinitestate verification (FSV) techniques are intended for proving properties of software systems. Although significant progress has been made in the last decade automating FSV techniques, the adoption of these techniques by software developers is low. The Specification Pattern System (SPS) is intended to assist users in creating such specifications. It identifies common specification patterns and indicates how to translate the patterns into a variety of different specification languages. However, the patterns in the SPS are defined informally and their translations are not verified. This work discusses the informal nature of these definitions, proposes a formalization for them and provides formal proofs for the translation of patterns to Linear Temporal Logic.
Show less
 Title
 A Study of Bluetooth Frequency Hopping Sequence : Modeling and Practical Attack
 Creator
 ALBAZRQAOE, WAHHAB
 Date
 2011
 Collection
 Electronic Theses & Dissertations
 Description

The Bluetooth is a wireless interface that enables electronic devices to establish shortrange, adhoc wireless connections. This kind of shortrange wireless networking is known as Wireless Personal Area Networks (WPAN). Because of its attractive features of small size, low cost, and low power, Bluetooth gains a world wide usage. It is embedded in many portable computing devices and considered as a good replacement for local wire connections. Since wireless data is inherently exposed to...
Show moreThe Bluetooth is a wireless interface that enables electronic devices to establish shortrange, adhoc wireless connections. This kind of shortrange wireless networking is known as Wireless Personal Area Networks (WPAN). Because of its attractive features of small size, low cost, and low power, Bluetooth gains a world wide usage. It is embedded in many portable computing devices and considered as a good replacement for local wire connections. Since wireless data is inherently exposed to eavesdropping, the security and confidentiality is a central issue for wireless standard as well as Bluetooth. To maintain security and confidentiality of wireless packets, the Bluetooth system mainly relies on the Frequency Hopping mechanism to equivocate an adversary. By this technique, a wireless channel is accessed for transmitting a packet. For each wireless packet, a single channel is selected in a pseudo random way. This kind of randomness in channel selection makes it difficult for an eavesdropped to predict the next channel to be accessed. Hence, capturing Bluetooth wireless packets is a challenge. In this work, we investigate the Frequency Hopping sequence and specifically the hop selection kernel. We analyze the operation of the kernel hardware by partitioning it into three parts. Based on this modeling, we propose an attacking method for the hop selection kernel. The proposed method shows how to expose the clock value hidden in the kernel. This helps to predict Bluetooth hopping sequence and, hence, capturing Bluetooth wireless packet is possible.
Show less
 Title
 The evolutionary potential of populations on complex fitness landscapes
 Creator
 Bryson, David Michael
 Date
 2012
 Collection
 Electronic Theses & Dissertations
 Description

Evolution is a highly contingent process, where the quality of the solutions produced is affected by many factors. I explore and describe the contributions of three such aspects that influence overall evolutionary potential: the prior history of a population, the type and frequency of mutations that the organisms are subject to, and the composition of the underlying genetic hardware. I have systematically tested changes to a digital evolution system, Avida, measuring evolutionary potential in...
Show moreEvolution is a highly contingent process, where the quality of the solutions produced is affected by many factors. I explore and describe the contributions of three such aspects that influence overall evolutionary potential: the prior history of a population, the type and frequency of mutations that the organisms are subject to, and the composition of the underlying genetic hardware. I have systematically tested changes to a digital evolution system, Avida, measuring evolutionary potential in seven different computational environments ranging in complexity of the underlying fitness landscapes. I have examined trends and general principles that these measurements demonstrate and used my results to optimize the evolutionary potential of the system, broadly enhancing performance. The results of this work show that history and mutation rate play significant roles in evolutionary potential, but the final fitness levels of populations are remarkably stable to substantial changes in the genetic hardware and a broad range of mutation types.
Show less
 Title
 Distancepreserving graphs
 Creator
 Nussbaum, Ronald
 Date
 2014
 Collection
 Electronic Theses & Dissertations
 Description

Let G be a simple graph on n vertices, where d_G(u,v) denotes the distance between vertices u and v in G. An induced subgraph H of G is isometric if d_H(u,v)=d_G(u,v) for all u,v in V(H). We say that G is a distancepreserving graph if G contains at least one isometric subgraph of order k for every k where 1<=k<=n.A number of sufficient conditions exist for a graph to be distancepreserving. We show that all hypercubes and graphs with delta(G)>=2n/31 are distancepreserving. Towards this end...
Show moreLet G be a simple graph on n vertices, where d_G(u,v) denotes the distance between vertices u and v in G. An induced subgraph H of G is isometric if d_H(u,v)=d_G(u,v) for all u,v in V(H). We say that G is a distancepreserving graph if G contains at least one isometric subgraph of order k for every k where 1<=k<=n.A number of sufficient conditions exist for a graph to be distancepreserving. We show that all hypercubes and graphs with delta(G)>=2n/31 are distancepreserving. Towards this end, we carefully examine the role of "forbidden" subgraphs. We discuss our observations, and provide some conjectures which we computationally verified for small values of n. We say that a distancepreserving graph is sequentially distancepreserving if each subgraph in the set of isometric subgraphs is a superset of the previous one, and consider this special case as well.There are a number of questions involving the construction of distancepreserving graphs. We show that it is always possible to add an edge to a noncomplete sequentially distancepreserving graph such that the augmented graph is still sequentially distancepreserving. We further conjecture that the same is true of all distancepreserving graphs. We discuss our observations on making nondistancepreserving graphs into distance preserving ones via adding edges. We show methods for constructing regular distancepreserving graphs, and consider constructing distancepreserving graphs for arbitrary degree sequences. As before, all conjectures here have been computationally verified for small values of n.
Show less
 Title
 Geometric and topological modeling techniques for large and complex shapes
 Creator
 Feng, Xin
 Date
 2014
 Collection
 Electronic Theses & Dissertations
 Description

The past few decades have witnessed the incredible advancements in modeling, digitizing and visualizing techniques for three–dimensional shapes. Those advancements led to an explosion in the number of three–dimensional models being created for design, manufacture, architecture, medical imaging, etc. At the same time, the structure, function, stability, and dynamics of proteins, subcellular structures, organelles, and multiprotein complexes have emerged as a leading interest in...
Show moreThe past few decades have witnessed the incredible advancements in modeling, digitizing and visualizing techniques for three–dimensional shapes. Those advancements led to an explosion in the number of three–dimensional models being created for design, manufacture, architecture, medical imaging, etc. At the same time, the structure, function, stability, and dynamics of proteins, subcellular structures, organelles, and multiprotein complexes have emerged as a leading interest in structural biology, another major source of large and complex geometric models. Geometric modeling not only provides visualizations of shapes for large biomolecular complexes but also fills the gap between structural information and theoretical modeling, and enables the understanding of function, stability, and dynamics.We first propose, for tessellated volumes of arbitrary topology, a compact data structure that offers constant–time–complexity incidence queries among cells of any dimensions. Our data structure is simple to implement, easy to use, and allows for arbitrary, user–defined 3–cells such as prisms and hexahedra, while remaining highly efficient in memory usage compared to previous work. We also provide the analysis on its time complexity for commonly–used incidence and adjacency queries such as vertex and edge one–rings.We then introduce a suite of computational tools for volumetric data processing, information extraction, surface mesh rendering, geometric measurement, and curvature estimation for biomolecular complexes. Particular emphasis is given to the modeling of Electron Microscopy Data Bank (EMDB) data and Protein Data Bank (PDB) data. Lagrangian and Cartesian representations are discussed for the surface presentation. Based on these representations, practical algorithms are developed for surface area and surface–enclosed volume calculation, and curvature estimation. Methods for volumetric meshing have also been presented. Because the technological development in computer science and mathematics has led to a variety of choices at each stage of the geometric modeling, we discuss the rationales in the design and selection of various algorithms. Analytical test models are designed to verify the computational accuracy and convergence of proposed algorithms. We selected six EMDB data and six PDB data to demonstrate the efficacy of the proposed algorithms in handling biomolecular surfaces and explore their capability of geometric characterization of binding targets. Thus, our toolkit offers a comprehensive protocol for the geometric modeling of proteins, subcellular structures, organelles, and multiprotein complexes.Furthermore, we present a method for computing “choking” loops—a set of surface loops that describe the narrowing of the volumes inside/outside of the surface and extend the notion of surface homology and homotopy loops. The intuition behind their definition is that a choking loop represents the region where an offset of the original surface would get pinched. Our generalized loops naturally include the usual2g handles/tunnels computed based on the topology of the genus–g surface, but also include loops that identify chokepoints or bottlenecks, i.e., boundaries of small membranes separating the inside or outside volume of the surface into disconnected regions. Our definition is based on persistent homology theory, which gives a measure to topological structures, thus providing resilience to noise and a well–defined way to determine topological feature size.Finally, we explore the application of persistent homology theory in protein folding analysis. The extremely complex process of protein folding brings challenges for both experimental study and theoretical modeling. The persistent homology approach studies the Euler characteristics of the protein conformations during the folding process. More precisely, the persistence is measured by the variation of van der Waals radius, which leads to the change of protein 3D structures and uncovers the inter–connectivity. Our results on fullerenes demonstrate the potential of our geometric and topological approach to protein stability analysis.
Show less
 Title
 Measurement and modeling of large scale networks
 Creator
 Shafiq, Muhammad Zubair
 Date
 2014
 Collection
 Electronic Theses & Dissertations
 Description

The goal of this thesis is to identify measurement, modeling, and optimization opportunities for large scale networks  with specific focus on cellular networks and online social networks. These networks are facing unprecedented operational challenges due to their very large scale.Cellular networks are experiencing an explosive increase in the volume of traffic for the last few years. This unprecedented increase in the volume of mobile traffic is attributed to the increase in the subscriber...
Show moreThe goal of this thesis is to identify measurement, modeling, and optimization opportunities for large scale networks  with specific focus on cellular networks and online social networks. These networks are facing unprecedented operational challenges due to their very large scale.Cellular networks are experiencing an explosive increase in the volume of traffic for the last few years. This unprecedented increase in the volume of mobile traffic is attributed to the increase in the subscriber base, improving network connection speeds, and improving hardware and software capabilities of modern smartphones. In contrast to the traditional fixed IP networks, mobile network operators are faced with the constraint of limited radio frequency spectrum at their disposal. As the communication technologies evolve beyond 3G to Long Term Evolution (LTE), the competition for the limited radio frequency spectrum is becoming even more intense. Therefore, mobile network operators increasingly focus on optimizing different aspects of the network by customized design and management to improve key performance indicators (KPIs).Online social networks are increasing at a very rapid pace, while trying to provide more contentrich and interactive services to their users. For instance, Facebook currently has more than 1.2 billion monthly active users and offers news feed, graph search, groups, photo sharing, and messaging services. The information for such a large user base cannot be efficiently and securely managed by traditional database systems. Social network service providers are deploying novel large scale infrastructure to cope with these scaling challenges.In this thesis, I present novel approaches to tackle these challenges by revisiting the current practices for the design, deployment, and management of large scale network systems using a combination of theoretical and empirical methods. I take a datadriven approach in which the theoretical and empirical analyses are intertwined. First, I measure and analyze the trends in data and then model the identified trends using suitable parametric models. Finally, I rigorously evaluate the developed models and the resulting system design prototypes using extensive simulations, realistic testbed environments, or realworld deployment. This methodology is to used to address several problems related to cellular networks and online social networks.
Show less
 Title
 Near duplicate image search
 Creator
 Li, Fengjie
 Date
 2014
 Collection
 Electronic Theses & Dissertations
 Description

Information retrieval addresses the fundamental problem of how to identify the objects from database that satisfies the information needs of users. Facing the information overload, the major challenge in search algorithm design is to ensure that useful information can be found both accurately and efficiently from large databases.To address this challenge, different indexing and retrieval methods had been proposed for different types of data, namely sparse data (e.g. documents), dense data (e...
Show moreInformation retrieval addresses the fundamental problem of how to identify the objects from database that satisfies the information needs of users. Facing the information overload, the major challenge in search algorithm design is to ensure that useful information can be found both accurately and efficiently from large databases.To address this challenge, different indexing and retrieval methods had been proposed for different types of data, namely sparse data (e.g. documents), dense data (e.g. dense feature vectors) and bagoffeatures (e.g. local feature represented images). For sparse data, inverted index and document retrieval models had been proved to be very effective for large scale retrieval problems. For dense data and bagoffeature data, however, there are still some open problems. For example, Locality Sensitive Hashing, a stateoftheart method for searching high dimensional vectors, often fails to make a good tradeoff between precision and recall. Namely, it tends to achieve high preci sion but with low recall or vice versa. The bagofwords model, a popular approach for searching objects represented bagoffeatures, has a limited performance because of the information loss during the quantization procedure.Since the general problem of searching objects represented in dense vectors and bagoffeatures may be too challenging, in this dissertation, we focus on nearly duplicate search, in which the matched objects is almost identical to the query. By effectively exploring the statistical proper ties of near duplicities, we will be able to design more effective indexing schemes and search algorithms. Thus, the focus of this dissertation is to design new indexing methods and retrieval algorithms, for near duplicate search in large scale databases, that accurately capture the data simi larity and delivers more accurate and efficient search. Below, we summarize the main contributions of this dissertation:Our first contribution is a new algorithm for searching near duplicate bagoffeatures data. The proposed algorithm, named random seeding quantization, is more efficient in generating bagof words representations for near duplicate images. The new scheme is motivated by approximating the optimal partial matching between bagoffeatures, and thus produces a bagofwords representation capturing the true similarities of the data, leading to more accurate and efficient retrieval of bagoffeatures data.Our second contribution, termed Random Projection Filtering, is a search algorithm designed for efficient near duplicate vector search. By explicitly exploiting the statistical properties of near duplicity, the algorithm projects high dimensional vectors into lower dimensional space and filter out irrelevant items. Our effective filtering procedure makes RPF more accurate and efficient to identify nearly duplicate objects in databases.Our third contribution is to develop and evaluate a new randomized range search algorithm for near duplicate vectors in high dimensional spaces, termed as Random Projection Search. Different from RPF, the algorithm presented in this chapter is suitable for a wider range of applications be cause it does not require the sparsity constrains for high search accuracy. The key idea is to project both the data points and the query point into an one dimensional space by a random projection, and perform one dimensional range search to find the subset of data points that are within the range of a given query using binary search. We prove the theoretical guarantee for the proposed algorithm and evaluate its empirical performance on a dataset of 1.1 billion image features.
Show less
 Title
 Exploiting crosstechnology interference for efficient network services in wireless systems
 Creator
 Zhou, Ruogu
 Date
 2014
 Collection
 Electronic Theses & Dissertations
 Description

In the last decade, we have witnessed the wide adoption of a variety of wireless technologies like WiFi, Cellular, Bluetooth, ZigBee, and Nearfield Communication(NFC). However, the fast growth of wireless networks generates significant crosstechnology interference, which leads to network performance degradation and potential security breach. In this dissertation, we propose two novel physical layer techniques to deal with the interference, and improve the performance and security of sensor...
Show moreIn the last decade, we have witnessed the wide adoption of a variety of wireless technologies like WiFi, Cellular, Bluetooth, ZigBee, and Nearfield Communication(NFC). However, the fast growth of wireless networks generates significant crosstechnology interference, which leads to network performance degradation and potential security breach. In this dissertation, we propose two novel physical layer techniques to deal with the interference, and improve the performance and security of sensor networks and mobile systems, respectively. First, we exploit the WiFi interference as a ``blessing" in the design of sensor networks and develop novel WiFi interference detection techniques for ZigBee sensors. Second, utilizing these techniques, we design three efficient network services: WiFi discovery which detects the existence of nearby WiFi networks using ZigBee sensors, WiFi performance monitoring which measures and tracks performance of WiFi networks using a ZigBee sensor network, and time synchronization which provides synchronized clocks for sensor networks based on WiFi signals. Third, we design a novel, noninvasive NFC security system called {\em nShield} to reduce the transmission power of NFC radios, which protects NFC against passive eavesdropping. nShield implements a novel adaptive RF attenuation scheme, in which the extra RF energy of NFC transmissions is determined and absorbed by nShield. At the same time, nShield scavenges the extra RF energy to sustain the perpetual operation. Together with the extremely lopower design, it enables nShield to provide the host uninterrupted protection against malicious eavesdropping. The above systems are implemented and extensively evaluated on a testbed of sensor networks and smartphones.
Show less
 Title
 Automated addition of faulttolerance via lazy repair and graceful degradation
 Creator
 Lin, Yiyan
 Date
 2015
 Collection
 Electronic Theses & Dissertations
 Description

In this dissertation, we concentrate on the problem of automated addition of faulttolerance that transforms a faultintolerant program to be a faulttolerant program. We solve this problem via model repair. Model repair is a correctbyconstruct technique to revise an existing model so that the revised model satisfies the given correctness criteria, such as safety, liveness, or faulttolerance. We consider two problems of using model repair to add faulttolerance. First, if the repaired...
Show moreIn this dissertation, we concentrate on the problem of automated addition of faulttolerance that transforms a faultintolerant program to be a faulttolerant program. We solve this problem via model repair. Model repair is a correctbyconstruct technique to revise an existing model so that the revised model satisfies the given correctness criteria, such as safety, liveness, or faulttolerance. We consider two problems of using model repair to add faulttolerance. First, if the repaired model violates the assumptions (e.g., partial observability, inability to detect crashed processes, etc) made in the underlying system, then it cannot be implemented. We denote these requirements as realizability constraints. Second, the addition of faulttolerance may fail if the program cannot fully recover after certain faults occur. In this dissertation, we propose a lazy repair approach to address realizability issues in adding faulttolerance. Additionally, we propose a technique to automatically add graceful degradation to a program, so that the program can recover with partial functionality (that is identified by the designer to be the critical functionality) if full recovery is impossible.A model repair technique transforms a model to another model that satisfies a new set of properties. Such a transformation should also maintain the mapping between the model and the underlying program. For example, in a distributed program, every process is restricted to read (or write) some variables in other processes. A model that represents this program should also disallow the process to read (or write) those inaccessable variables. If these constraints are violated, then the corresponding model will be unrealizable. An unrealizable model (in this context, a model that violates the read/write restrictions) may make it impossible to obtain the corresponding implementation.%In this dissertation, we call the read (or write) restriction as a realizability constraint in distributed systems. An unrealizable model (a model that violates the realizability constraints) may complicate the implementation by introducing extra amount of modification to the program. Such modification may in turn break the program's correctness.Resolving realizability constraints increases the complexity of model repair. Existing model repair techniques introduce heuristics to reduce the complexity. However, this heuristicbased approach is designed and optimized specifically for distributed programs. We need a more generic model repair approach for other types of programs, e.g., synchronous programs, cyberphysical programs, etc. Hence, in this dissertation, we propose a model repair technique, i.e., lazy repair, to add faulttolerance to programs with different types of realizability constraints. It involves two steps. First, we only focus on repairing to obtain a model that satisfies correctness criteria while ignoring realizability constraints. In the second step, we repair this model further by removing behaviors while ensuring that the desired specification is preserved. The lazy repair approach simplifies the process of developing heuristics, and provides a tradeoff in terms of the time saved in the first step and the extra work required in the second step. We demonstrate that lazy repair is applicable in the context of distributed systems, synchronous systems and cyberphysical systems.In addition, safety critical systems such as airplanes, automobiles and elevators should operate with high dependability in the presence of faults. If the occurrence of faults breaks down some components, the system may not be able to fully recover. In this scenario, the system can still operate with remaining resources and deliver partial but core functionality, i.e., to display graceful degradation. Existing model repair approaches, such as addition of faulttolerance, cannot transform a program to provide graceful degradation. In this dissertation, we propose a technique to add faulttolerance to a program with graceful degradation. In the absence of faults, such a program exhibits ideal behaviors. In the presence of faults, the program is allowed to recover with reduced functionality. This technique involves two steps. First, it automatically generates a program with graceful degradation based on the input faultintolerant program. Second, it adds faulttolerance to the output program from first step. We demonstrate that this technique is applicable in the context of high atomicity programs as well as low atomicity programs (i.e., distributed programs). We also present a case study on adding multigraceful degradation to a dangerous gas detection and ventilation system. Through this case study, we show that our approach can assist the designer to obtain a program that behaves like the deployed system.
Show less
 Title
 Computational identification and analysis of noncoding RNAs in largescale biological data
 Creator
 Lei, Jikai
 Date
 2015
 Collection
 Electronic Theses & Dissertations
 Description

Nonproteincoding RNAs (ncRNAs) are RNA molecules that function directly at the level of RNA without translating into protein. They play important biological functions in all three domains of life, i.e. Eukarya, Bacteria and Archaea. To understand the working mechanisms and the functions of ncRNAs in various species, a fundamental step is to identify both known and novel ncRNAs from largescale biological data.Largescale genomic data includes both genomic sequence data and NGS sequencing...
Show moreNonproteincoding RNAs (ncRNAs) are RNA molecules that function directly at the level of RNA without translating into protein. They play important biological functions in all three domains of life, i.e. Eukarya, Bacteria and Archaea. To understand the working mechanisms and the functions of ncRNAs in various species, a fundamental step is to identify both known and novel ncRNAs from largescale biological data.Largescale genomic data includes both genomic sequence data and NGS sequencing data. Both types of genomic data provide great opportunity for identifying ncRNAs. For genomic sequence data, a lot of ncRNA identification tools that use comparative sequence analysis have been developed. These methods work well for ncRNAs that have strong sequence similarity. However, they are not wellsuited for detecting ncRNAs that are remotely homologous. Next generation sequencing (NGS), while it opens a new horizon for annotating and understanding known and novel ncRNAs, also introduces many challenges. First, existing genomic sequence searching tools can not be readily applied to NGS data because NGS technology produces short, fragmentary reads. Second, most NGS data sets are largescale. Existing algorithms are infeasible on NGS data because of high resource requirements. Third, metagenomic sequencing, which utilizes NGS technology to sequence uncultured, complex microbial communities directly from their natural inhabitants, further aggravates the difficulties. Thus, massive amount of genomic sequence data and NGS data calls for efficient algorithms and tools for ncRNA annotation.In this dissertation, I present three computational methods and tools to efficiently identify ncRNAs from largescale biological data. ChainRNA is a tool that combines both sequence similarity and structure similarity to locate crossspecies conserved RNA elements with low sequence similarity in genomic sequence data. It can achieve significantly higher sensitivity in identifying remotely conserved ncRNA elements than sequence based methods such as BLAST, and is much faster than existing structural alignment tools. miRPREFeR (miRNA PREdiction From small RNASeq data) utilizes expression patterns of miRNA and follows the criteria for plant microRNA annotation to accurately predict plant miRNAs from one or more small RNASeq data samples. It is sensitive, accurate, fast and has lowmemory footprint. metaCRISPR focuses on identifying Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) from largescale metagenomic sequencing data. It uses a kmer hash table to efficiently detect reads that belong to CRISPRs from the raw metagonmic data set. Overlap graph based clustering is then conducted on the reduced data set to separate different CRSIPRs. A set of graph based algorithms are used to assemble and recover CRISPRs from the clusters.
Show less
 Title
 Novel computational approaches to investigate microbial diversity
 Creator
 Zhang, Qingpeng
 Date
 2015
 Collection
 Electronic Theses & Dissertations
 Description

Species diversity is an important measurement of ecological communities.Scientists believe that there is a strong relationship between speciesdiversity and ecosystem processes. However efforts to investigate microbialdiversity using whole genome shotgun reads data are still scarce. With novel applications of data structuresand the development of novel algorithms, firstly we developed an efficient kmer countingapproach and approaches to enable scalable streaming analysis of large and error...
Show moreSpecies diversity is an important measurement of ecological communities.Scientists believe that there is a strong relationship between speciesdiversity and ecosystem processes. However efforts to investigate microbialdiversity using whole genome shotgun reads data are still scarce. With novel applications of data structuresand the development of novel algorithms, firstly we developed an efficient kmer countingapproach and approaches to enable scalable streaming analysis of large and errorprone shortread shotgun data sets. Then based on these efforts, we developed a statistical framework allowing for scalable diversity analysis of large,complex metagenomes without the need for assembly or reference sequences. Thismethod is evaluated on multiple large metagenomes from differentenvironments, such as seawater, human microbiome, soil. Given the velocity ingrowth of sequencing data, this method is promising for analyzing highlydiverse samples with relatively low computational requirements. Further, as themethod does not depend on reference genomes, it also provides opportunities totackle the large amounts of unknowns we find in metagenomicdatasets.
Show less
 Title
 Statistical and learning algorithms for the design, analysis, measurement, and modeling of networking and security systems
 Creator
 Shahzad, Muhammad (College teacher)
 Date
 2015
 Collection
 Electronic Theses & Dissertations
 Description

"The goal of this thesis is to develop statistical and learning algorithms for the design, analysis, measurement, and modeling of networking and security systems with specific focus on RFID systems, network performance metrics, user security, and software security. Next, I give a brief overview of these four areas of focus."  Abstract.
 Title
 Secure and efficient spectrum sharing and QoS analysis in OFDMbased heterogeneous wireless networks
 Creator
 Alahmadi, Ahmed S.
 Date
 2016
 Collection
 Electronic Theses & Dissertations
 Description

The Internet of Things (IoT), which networks versatile devices for information exchange, remote sensing, monitoring and control, is finding promising applications in nearly every field. However, due to its high density and enormous spectrum requirement, the practical development of IoT technology seems to be not available until the release of the largemillimeter wave (mmWave) band (30GHz300GHz). Compared to existing lower band systems (such as 3G, 4G), mmWave band signals generally require...
Show moreThe Internet of Things (IoT), which networks versatile devices for information exchange, remote sensing, monitoring and control, is finding promising applications in nearly every field. However, due to its high density and enormous spectrum requirement, the practical development of IoT technology seems to be not available until the release of the largemillimeter wave (mmWave) band (30GHz300GHz). Compared to existing lower band systems (such as 3G, 4G), mmWave band signals generally require line of sight (LOS) path and suffer from severe fading effects, leading to much smaller coverage area. For network design and management, this implies that: (i) MmWave band alone could not support the IoT networks, but has to be integrated with the existing lower band systems through secure and effective spectrum sharing, especially in the lower frequency bands; and (ii) The IoT networks will have very high density node distribution, which is a significant challenge in network design, especially with the scarce energy budget of IoT applications. Motivated by these observations, in this dissertation, we consider three problems: (1) How to achieve secure and effective spectrum sharing? (2) How to accommodate the energy limited IoT devices? (3) How to evaluate the Quality of Service (QoS) in the high density IoT networks? We aim to develop innovative techniques for the design, evaluation and management of future IoT networks under both benign and hostile environments. The maincontributions of this dissertation are outlined as follows.First, we develop a secure and efficient spectrum sharing scheme in singlecarrier wireless networks. Cognitive radio (CR) is a key enabling technology for spectrum sharing, where the unoccupied spectrum is identified for secondary users (SUs), without interfering with the primary user (PU). A serious security threat to the CR networks is referred to as primary user emulation attack (PUEA), in which a malicious user (MU) emulates the signal characteristics of the PU, thereby causing the SUs to erroneously identify the attacker as the PU. Here, we consider fullband PUEA detection and propose a reliable AESassisted DTV scheme, where an AESencrypted reference signal is generated at the DTV transmitter and used as the sync bits of the DTV data frames. For PU detection, we investigate the crosscorrelation between the received sequence and reference sequence. The MU detection can be performed by investigating the autocorrelation of the received sequence. We further develop a secure and efficient spectrum sharing scheme in multicarrier wireless networks. We consider subband malicious user detection and propose a secure AESbased DTV scheme, where the existing reference sequence used to generate the pilot symbols in the DVBT2 frames is encrypted using the AES algorithm. The resulted sequence is exploited for accurate detection of the authorized PU and the MU. Second, we develop an energy efficient transmission scheme in CR networks using energy harvesting. We propose a transmitting scheme for the SUs such that each SU can perform information reception and energy harvesting simultaneously. We perform sumrate optimization for the SUs under PUEA. It is observed that the sumrate of the SU network can be improved significantly with the energy harvesting technique. Potentially, the proposed scheme can be applied directly to the energyconstrained IoT networks.Finally, we investigate QoS performance analysis methodologies, which can provide insightful feedbacks to IoT network design and planning. Taking the spatial randomness of the IoT network into consideration, we investigate coverage probability (CP) and blocking probability (BP) in relayassisted OFDMA networks using stochastic geometry. More specifically, we model the intercell interference from the neighboring cells at each typical node, and derive the CP in the downlink transmissions. Based on their data rate requirements, we classify the incoming users into different classes, and calculate the BP using the multidimensional loss model.
Show less
 Title
 Hardware Algorithms for HighSpeed Packet Processing
 Creator
 Norige, Eric
 Date
 2017
 Collection
 Electronic Theses & Dissertations
 Description

The networking industry is facing enormous challenges of scaling devices to support theexponential growth of internet traﬃc as well as increasing number of features being implemented inside the network. Algorithmic hardware improvements to networking componentshave largely been neglected due to the ease of leveraging increased clock frequency and compute power and the risks of implementing complex hardware designs. As clock frequencyslows its growth, algorithmic solutions become important to...
Show moreThe networking industry is facing enormous challenges of scaling devices to support theexponential growth of internet traﬃc as well as increasing number of features being implemented inside the network. Algorithmic hardware improvements to networking componentshave largely been neglected due to the ease of leveraging increased clock frequency and compute power and the risks of implementing complex hardware designs. As clock frequencyslows its growth, algorithmic solutions become important to ﬁll the gap between currentgeneration capability and next generation requirements. This paper presents algorithmicsolutions to networking problems in three domains: Deep Packet Inspection(DPI), ﬁrewall(and other) ruleset compression and noncryptographic hashing. The improvements in DPIare twopronged: ﬁrst in the area of applicationlevel protocol ﬁeld extraction, which allowssecurity devices to precisely identify packet ﬁelds for targeted validity checks. By usingcounting automata, we achieve precise parsing of nonregular protocols with small, constantperﬂow memory requirements, extracting at rates of up to 30gbps on real traﬃc in softwarewhile using only 112 bytes of state per ﬂow. The second DPI improvement is on the longstanding regular expression matching problem, where we complete the HFA solution to theDFA state explosion problem with eﬃcient construction algorithms and optimized memorylayout for hardware or software implementation. These methods construct automata toocomplex to be constructed by previous methods in seconds, while being capable of 29gbpsthroughput with an ASIC implementation. Firewall ruleset compression enables more ﬁrewall entries to be stored in a ﬁxed capacity pattern matching engine, and can also be usedto reorganize a ﬁrewall speciﬁcation for higher performance software matching. A novelrecursive structure called TUF is given to unify the best known solutions to this problemand suggest future avenues of attack. These algorithms, with little tuning, achieve a 13.7%improvement in compression on large, reallife classiﬁers, and can achieve the same results asexisting algorithms while running 20 times faster. Finally, noncryptographic hash functionscan be used for anything from hash tables to track network ﬂows to packet sampling fortraﬃc characterization. We give a novel approach to generating hardware hash functionsin between the extremes of expensive cryptographic hash functions and low quality linearhash functions. To evaluate these midrange hash functions properly, we develop new evaluation methods to better distinguish noncryptographic hash function quality. The hashfunctions described in this paper achieve lowlatency, wide hashing with good avalanche anduniversality properties at a much lower cost than existing solutions.
Show less
 Title
 FLUID ANIMATION ON DEFORMING SURFACE MESHES
 Creator
 Wang, Xiaojun
 Date
 2017
 Collection
 Electronic Theses & Dissertations
 Description

We explore methods for visually plausible fluid simulation on deforming surfaces withinhomogeneous diffusion properties. While there are methods for fluid simulation onsurfaces, not much research effort focused on the influence of the motion of underlyingsurface, in particular when it is not a rigid surface, such as knitted or woven textilesin motion. The complexity involved makes the simulation challenging to account forthe noninertial local frames typically used to describe the motion and...
Show moreWe explore methods for visually plausible fluid simulation on deforming surfaces withinhomogeneous diffusion properties. While there are methods for fluid simulation onsurfaces, not much research effort focused on the influence of the motion of underlyingsurface, in particular when it is not a rigid surface, such as knitted or woven textilesin motion. The complexity involved makes the simulation challenging to account forthe noninertial local frames typically used to describe the motion and the anisotropiceffects in diffusion, absorption, adsorption. Thus, our primary goal is to enable fastand stable method for such scenarios.First, in preparation of the material properties for the surface domain, we describetextiles with salient feature direction by bulk material property tensors in order toreduce the complexity, by employing 2D homogenization technique, which effectivelyturns microscale inhomogeneous properties into homogeneous properties in macroscaledescriptions. We then use standard texture mapping techniques to map these tensorsto triangles in the curved surface mesh, taking into account the alignment of each localtangent space with correct feature directions of the macroscale tensor. We show thatthis homogenization tool is intuitive, flexible and easily adjusted.Second, for efficient description of the deforming surface, we offer a new geometryrepresentation for the surface with solely angles instead of vertex coordinates, to reducestorage for the motion of underlying surface. Since our simulation tool relies heavilyon long sequences of 3D curved triangular meshes, it is worthwhile exploring suchefficient representations to make our tool practical by reducing the memory accessduring realtime simulations as well as reducing the file sizes. Inspired by anglebasedrepresentations for tetrahedral meshes, we use spectral method to restore curved surfaceusing both angles of the triangles and dihedral angles between adjacent triangles in themesh. Moreover, in many surface deformation sequences, it is often sufficient to updatethe dihedral angles while keeping the triangle interior angles fixed.Third, we propose a framework for simulating various effects of fluid flowing on deformingsurfaces. We directly applied our simulator on curved surface meshes insteadof in parameter domains, whereas many existing simulation methods require a parameterizationon the surface. We further demonstrate that fictitious forces induced bythe surface motion can be added to the surfacebased simulation at a small additionalcost. These fictitious forces can be decomposed into different components. Only therectilinear and Coriolis components are relevant to our choice of local frames. Othereffects, such as diffusion, adsorption, absorption, and evaporation are also incorporatedfor realistic stain simulation.Finally, we explore the extraction of Lagrangian Coherent Structure (LCS), whichis often referred to as the skeleton of fluid motion. The LCS structures are often describedby ridges of the finite time Lyapunov exponent (FTLE) fields, which describethe extremal stretching of fluid parcels following the flow. We proposed a novel improvementto the ridge marching algorithm, which extract such ridges robustly for thetypically noisy FTLE estimates even in welldefined fluid flows. Our results are potentiallyapplicable to visualizing and controlling fluid trajectory patterns. In contrastto current methods for LCS calculation, which are only applicable to flat 2D or 3Ddomains and sensitive to noise, our ridge extraction is readily applicable to curvedsurfaces even when they are deforming.The collection of these computational tools will facilitate generation of realisticand easy to adjust surface fluid animation with various physically plausible effects onsurface.
Show less
 Title
 Distance Preserving Graphs
 Creator
 Zahedi, Emad
 Date
 2017
 Collection
 Electronic Theses & Dissertations
 Description

The computational complexity of exploring distance properties of large graphs such as realworld social networks which consist of millions of nodes is extremely expensive. Recomputing distances in subgraphs of the original graph will add to the cost. One way to avoid this is to use subgraphs where the distance between any pair of vertices is the same as in the original graph. Such a subgraph is called {\em isometric}. A connected graph is {\em distance preserving}, for which we use the...
Show moreThe computational complexity of exploring distance properties of large graphs such as realworld social networks which consist of millions of nodes is extremely expensive. Recomputing distances in subgraphs of the original graph will add to the cost. One way to avoid this is to use subgraphs where the distance between any pair of vertices is the same as in the original graph. Such a subgraph is called {\em isometric}. A connected graph is {\em distance preserving}, for which we use the abbreviation dp, if it has an isometric subgraph of every order. In this framework we study dp graphs from both the structural and algorithmic perspectives. First, we study the structural nature of dp graphs. This involves classifying graphs based on the dp property and the relation between dp graphs to other graph classes. Second, we study the recognition problem of dp graphs. We intend to develop efficient algorithms for finding isometric subgraphs as well as deciding whether a graph is dp or not.
Show less
 Title
 Hidden Markov modelbased homology search and gene prediction in NGS ERA
 Creator
 Techaangkoon, Prapaporn
 Date
 2017
 Collection
 Electronic Theses & Dissertations
 Description

The exponential cost reduction of nextgeneration sequencing (NGS) enabled researchers to sequence a large number of organisms in order to answer various questions in biology, ecology, health, etc. For newly sequenced genomes, gene prediction and homology search against characterized protein sequence databases are two fundamental tasks for annotating functional elements in the genomes. The main goal of gene prediction is to identify the gene locus and their structures. As there is...
Show moreThe exponential cost reduction of nextgeneration sequencing (NGS) enabled researchers to sequence a large number of organisms in order to answer various questions in biology, ecology, health, etc. For newly sequenced genomes, gene prediction and homology search against characterized protein sequence databases are two fundamental tasks for annotating functional elements in the genomes. The main goal of gene prediction is to identify the gene locus and their structures. As there is accumulating evidence showing important functions of RNAs (ncRNAs), comprehensive gene prediction should include both proteincoding genes and ncRNAs. Homology search against protein sequences can aid identification of functional elements in genomes. Although there are intensive research in the fields of gene prediction, ncRNA search, and homology search, there are still unaddressed challenges. In this dissertation, I made contributions in these three areas. For gene prediction, I designed an HMMbased ab initio gene prediction tool that considers G+C gradient in grass genomes. For homology search, I designed a method that can align short reads against protein families using profile HMMs. For ncRNA search, I designed a ncRNA alignment tool that can align highly structured ncRNAs using only sequence similarity. Below I summarize my contributions.Despite decades of research about gene prediction, existing gene prediction tools are not carefully designed to deal with variant G+C content and 5'3' changing patterns inside coding regions. Thus, these tools can miss genes with positive or negative G+C gradient in grass genomes such as rice, maize, sorghum, etc. I implemented a tool named AUGUSTUSGC that accounts for 5'3' G+C gradient. Our tool can accurately predict proteincoding genes in plant genomes especially grass genomes.A large number of sequencing projects produced short reads from the whole genomes or transcriptomic data. I designed a short reads homology search tool that employs pairedend reads to improve homology search sensitivity. The experimental results show that our tool can achieve significantly better sensitivity and accuracy in aligning short reads that are part of remote homologs.Despite the extensive studies of ncRNA search, the existing tools that heavily depend on the secondary structure in homology search cannot efficiently handle RNAseq data that is accumulating rapidly. It will be ideal if we can have a faster ncRNA homology search tool with similar accuracy as those adopting secondary structure. I implemented an accurate ncRNA alignment tool called gluRNA that can achieve similar accuracy to structural alignment tools while keeping the same running time complexity as sequence alignment tools. The experimental results demonstrate that our tool can achieve more accurate alignments than the popular sequence alignment tools and a wellknown structural alignment program.
Show less
 Title
 Capturing Bluetooth Traffic in the Wild : Practical Systems and Privacy Implications
 Creator
 ALBAZRQAOE, WAHHAB
 Date
 2018
 Collection
 Electronic Theses & Dissertations
 Description

Bluetooth wireless technology is today present in billions of smartphones, mobile devices, and portable electronics. With the prevalence of personal Bluetooth devices, a practical Bluetooth traffic sniffer is of increasing interest due to the following. First, it has been reported that a traffic sniffer is an essential, daytoday tool for Bluetooth engineers and applications developers; and second, as the communication between Bluetooth devices is privacysensitive in nature, exploring the...
Show moreBluetooth wireless technology is today present in billions of smartphones, mobile devices, and portable electronics. With the prevalence of personal Bluetooth devices, a practical Bluetooth traffic sniffer is of increasing interest due to the following. First, it has been reported that a traffic sniffer is an essential, daytoday tool for Bluetooth engineers and applications developers; and second, as the communication between Bluetooth devices is privacysensitive in nature, exploring the possibility of Bluetooth traffic sniffing in practical settings sheds lights into potential user privacy leakage. To date, sniffing Bluetooth traffic has been widely considered an extremely intricate task due to wideband spread spectrum of Bluetooth, pseudorandom frequency hopping adopted by Bluetooth at baseband, and the interference in the open 2.4 GHz band.This thesis addresses these challenges by introducing novel traffic sniffers that capture Bluetooth packets in practical environments. In particular, we present the following systems. (i) BlueEar, the first practical Bluetooth traffic sniffing system only using general, inexpensive wireless platforms. BlueEar features a novel dualradio architecture where two inexpensive, Bluetoothcompliant radios coordinate with each other to eavesdrop on hopping subchannels in indiscoverable mode. Statistic models and lightweight machine learning tools are integrated to learn the adaptive hopping behavior of the target. Our results show that BlueEar maintains a packet capture rate higher than 90% consistently in dynamic settings. In addition, we discuss the implications of the BlueEar approach on Bluetooth LE sniffing and present a practical countermeasure that effectively reduces the packet capture rate of sniffer by 70%, which can be easily implemented on the Bluetooth master while requiring no modification to slave devices like keyboards and headsets.
Show less
 Title
 FASTER ALGORITHMS FOR MACHINE LEARNING PROBLEMS IN HIGH DIMENSION
 Creator
 Ye, Mingquan
 Date
 2019
 Collection
 Electronic Theses & Dissertations
 Description

When dealing with datasets with high dimension, the existing machine learning algorithms oftendo not work in practice. Actually, most of the realworld data has the nature of low intrinsicdimension. For example, data often lies on a lowdimensional manifold or has a low doublingdimension. Inspired by this phenomenon, this thesis tries to improve the time complexities of twofundamental problems in machine learning using some techniques in computational geometry.In Chapter two, we propose a bi...
Show moreWhen dealing with datasets with high dimension, the existing machine learning algorithms oftendo not work in practice. Actually, most of the realworld data has the nature of low intrinsicdimension. For example, data often lies on a lowdimensional manifold or has a low doublingdimension. Inspired by this phenomenon, this thesis tries to improve the time complexities of twofundamental problems in machine learning using some techniques in computational geometry.In Chapter two, we propose a bicriteria approximation algorithm for minimum enclosing ballwith outliers and extend it to the outlier recognition problem. By virtue of the “coreset” idea andthe Random Gradient Descent Tree, we propose an efficient algorithm which is linear in the numberof points n and the dimensionality d, and provides a probability bound. In experiments, comparedwith some existing outlier recognition algorithms, our method is proven to be efficient and robustto the outlier ratios.In Chapter three, we adopt the “doubling dimension” to characterize the intrinsic dimension ofa point set. By the property of doubling dimension, we can approximate the geometric alignmentbetween two point sets by executing the existing alignment algorithms on their subsets, whichachieves a much smaller time complexity. More importantly, the proposed approximate methodhas a theoretical upper bound and can serve as the preprocessing step of any alignment algorithm.
Show less
 Title
 Metamodeling Framework for Simultaneous MultiObjective Optimization Using Efficient Evolutionary Algorithms
 Creator
 Roy, Proteek Chandan
 Date
 2019
 Collection
 Electronic Theses & Dissertations
 Description

Most realworld problems are comprised of multiple conflicting objectives and solutions to those problems are multiple Paretooptimal tradeoff solutions. The main challenge of these practical problems is that the objectives and constraints do not have any closed functional forms and they are expensive for computation as well. Objectives coming from finite element analysis, computational fluid dynamics software, network flow simulators, crop modeling, weather modeling or any other simulations...
Show moreMost realworld problems are comprised of multiple conflicting objectives and solutions to those problems are multiple Paretooptimal tradeoff solutions. The main challenge of these practical problems is that the objectives and constraints do not have any closed functional forms and they are expensive for computation as well. Objectives coming from finite element analysis, computational fluid dynamics software, network flow simulators, crop modeling, weather modeling or any other simulations which involve partial differential equations are good examples of expensive problems. These problems can also be regarded as ``lowbudget'' problems since only a few solution evaluations can be performed given limited time. Nevertheless, parameter estimation and optimization of objectives related to these simulations require a good number of solution evaluations to come up with better parameters or a reasonably good tradeoff front. To provide an efficient search process within a limited number of exact evaluations, metamodelassisted algorithms have been proposed in the literature. These algorithms attempt to construct a computationally inexpensive representative model of the problem, having the same global optima and thereby providing a way to carry out the optimization in metamodel space in an efficient way. Populationbased methods like evolutionary algorithms have become standard for solving multiobjective problems and recently Metamodelbased evolutionary algorithms are being used for solving expensive problems. In this thesis, we would like to address a few challenges of metamodelbased optimization algorithms and propose some efficient and innovative ways to construct these algorithms. To approach efficient design of metamodelbased optimization algorithm, one needs to address the choice of metamodeling functions. The most trivial way is to build metamodels for each objective and constraint separately. But we can reduce the number of metamodel constructions by using some aggregated functions and target either single or multiple optima in each step. We propose a taxonomy of possible metamodelbased algorithmic frameworks which not only includes most algorithms from the literature but also suggests some new ones. We improve each of the frameworks by introducing trust region concepts in the multiobjective scenario and present two strategies for building trust regions. Apart from addressing the main bottleneck of the limited number of solution evaluations, we also propose efficient nondominated sorting methods that further reduce computational time for a basic step of multiobjective optimization. We have carried out extensive experiments over all representative metamodeling frameworks and shown that each of them can solve a good number of test problems. We have not tried to tune the algorithmic parameters yet and it remains as our future work. Our theoretical analyses and extensive experiments suggest that we can achieve efficient metamodelbased multiobjective optimization algorithms for solving test as well as realworld expensive and lowbudget problems.
Show less