Autonomous robots are increasingly working alongside humans in a variety of environments. While simple applications in controlled environments work well with fully autonomous robots and little interaction between human and robot, mission-critical applications in unstructured and uncertain environments require stronger collaboration between human and robot. An example of such an instance occurs in dismounted military operations in which one or more autonomous robots act as part of a team of soldiers. The performance of the human-robot team depends largely on the interaction between human and robot, more specifically the communication interfaces between the two. Furthermore, due to the complex and unstructured environments in which dismounted military missions take place, robots need a diverse skill set. Therefore, a variety of sensors, robot platform types (e.g., wheeled vs. legged), and other capabilities are needed. The goal of this research was to understand how robot platform type and the visual complexity of the human-robot interface, in particular a Mixed Reality interface, affect cooperative human-robot teaming in dismounted military operations. More specifically, the research objectives were to understand how robot platform type (wheeled vs. legged) impacts the human's perception of robot capability and performance, and to assess how the visual complexity of a Mixed Reality interface affects accuracy and response time for an information reporting task and a signal detection task. The results of this study revealed that increased visual complexity of the Mixed Reality-based human-robot interface improved response time and accuracy for an information reporting task and resulted in a more usable interface. Furthermore, the results indicated that response time and accuracy for a signal detection task did not differ between high and low visual complexity modes of the human-robot interface, likely due to a low task load. Users of the interface in high visual complexity mode reported lower perceived workload and better perceived performance compared to users of the interface in low visual complexity mode. Moreover, the findings demonstrated that the unique appearance of a biologically-inspired legged robot was not enough to produce a difference in perceived performance and trust compared to a more traditional-looking wheeled robot. Therefore, there was no basis to conclude that the unique appearance of the legged robot led users to anthropomorphize the legged robot more than the wheeled robot. Additionally, free-response feedback from users revealed that Mixed Reality-based head-mounted displays have the potential to overcome the shortcomings of Augmented Reality-based head-mounted displays and offer a suitable alternative to hand-held displays in dismounted military operations.
Finally, this study demonstrated that an increase in the visual complexity of a Mixed Reality-based human-robot interface improves the effectiveness of human-robot interaction, and ultimately human-robot team performance, as long as the additional complexity supports the tasks of the human.

Date Issued

2017

Identifier

FSU_SUMMER2017_Kopinsky_fsu_0071E_14062

Format

Thesis

Title

A Study on Semantic Relation Representations in Neural Word Embeddings.

Neural network based word embeddings have demonstrated outstanding results in a variety of tasks and have become a standard input for Natural Language Processing (NLP) related deep learning methods. Although these representations capture semantic regularities in languages, some general questions, e.g., "what kinds of semantic relations do the embeddings represent?" and "how can the semantic relations be retrieved from an embedding?", remain unclear, and very little relevant work has been done. In this study, we propose a new approach to exploring the semantic relations represented in neural embeddings based on WordNet and the Unified Medical Language System (UMLS). Our study demonstrates that neural embeddings do prefer some semantic relations and that they also represent diverse semantic relations. Our study also finds that the Named Entity Recognition (NER)-based phrase composition outperforms Word2phrase and that word variants do not affect performance on analogy and semantic relation tasks.
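
As a minimal illustration of how semantic relations are commonly probed in word embeddings, the sketch below runs the standard vector-offset analogy test with cosine similarity. The toy vectors are invented purely for demonstration; the thesis works with real learned embeddings and WordNet/UMLS relations.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def analogy(emb, a, b, c):
    """Return the vocabulary word closest to vec(b) - vec(a) + vec(c)."""
    target = emb[b] - emb[a] + emb[c]
    candidates = (w for w in emb if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(emb[w], target))

# Toy 3-d vectors purely for illustration; real embeddings are learned.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, 0.1, 0.9]),
}
print(analogy(emb, "man", "king", "woman"))  # -> "queen" on this toy data
```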

Clustering is an essential data mining tool that aims to discover inherent cluster structure in data. As such, the study of clusterability, which evaluates whether data possesses such structure, is an integral part of cluster analysis. Yet, despite their central role in the theory and application of clustering, current notions of clusterability fall short in two crucial aspects that render them impractical: most are computationally infeasible, and others fail to classify the structure of real datasets. In this thesis, we propose a novel approach to clusterability evaluation that is both computationally efficient and successfully captures the structure of real data. Our method applies multimodality tests to the (one-dimensional) set of pairwise distances based on the original, potentially high-dimensional data. We present extensive analyses of our approach for both the Dip and Silverman multimodality tests on real data as well as 17,000 simulations, demonstrating the success of our approach as the first practical notion of clusterability.
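
A minimal sketch of the core idea, assuming only NumPy/SciPy: pool all pairwise distances into one 1-D sample and check it for multimodality. Peak counting on a kernel density estimate stands in here for the Dip and Silverman tests the thesis actually uses.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import gaussian_kde
from scipy.signal import find_peaks

def looks_clusterable(X, grid_size=512):
    """Crude multimodality check on the 1-D set of pairwise distances.

    Multimodal distances suggest cluster structure; a proper analysis
    would use the Dip or Silverman test instead of peak counting.
    """
    d = pdist(X)                                 # one distance per point pair
    kde = gaussian_kde(d)
    grid = np.linspace(d.min(), d.max(), grid_size)
    density = kde(grid)
    peaks, _ = find_peaks(density, prominence=0.05 * density.max())
    return len(peaks) >= 2

rng = np.random.default_rng(0)
two_blobs = np.vstack([rng.normal(0, 1, (100, 5)), rng.normal(8, 1, (100, 5))])
print(looks_clusterable(two_blobs))                   # True: distances bimodal
print(looks_clusterable(rng.normal(0, 1, (200, 5))))  # False: unimodal
```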

The Distributed Oceanographic Match-Up Service (DOMS) currently under development is a centralized service that allows researchers to easily match in situ and satellite oceanographic data from distributed sources to facilitate satellite calibration, validation, and retrieval algorithm development. The Shipboard Automated Meteorological and Oceanographic System (SAMOS) initiative provides routine access to high-quality marine meteorological and near-surface oceanographic observations from research vessels. SAMOS is one of several endpoints connected into the DOMS network, providing in-situ data for the match-up service. DOMS in-situ endpoints currently use Apache Solr as a backend search engine on each node in the distributed network. While Solr is a high-performance solution that facilitates creation and maintenance of indexed data, it is limited in the sense that its schema is fixed. The property graph model escapes this limitation by removing any prohibiting requirements on the data model and permitting relationships between data objects. This paper documents the development of the SAMOS Neo4j property graph database, including new search possibilities that take advantage of the property graph model, performance comparisons with Apache Solr, and a vision for graph databases as a storage tool for oceanographic data. The integration of the SAMOS Neo4j graph into DOMS is also described. Various data models are explored, including spatial-temporal records from SAMOS added to a time tree using GraphAware technology. This extension provides callable Java procedures within the Cypher query language that generate in-graph structures used in data retrieval. Neo4j excels at performing relationship and path-based queries, which challenge relational SQL databases because they require memory-intensive joins. Consider a user who wants to find records over several years, but only for specific months. If a traditional database only stores timestamps, this type of query could be complex and likely prohibitively slow. Using the time tree model in a graph, one can specify a path from the root to the data which restricts resolution to certain time frames (e.g., months). This query can be executed without joins, unions, or other compute-intensive operations, putting Neo4j at a computational advantage over the SQL database alternative. That said, while this advantage may be useful, it should not be interpreted as an advantage over Solr in the context of DOMS. Solr makes use of Apache Lucene indexing at its core, while Neo4j provides its own native schema indexes. Ultimately, each provides a unique solution for data retrieval geared toward specific tasks. In the DOMS setting, Solr appears to be the most suitable option, as there seem to be very limited use cases where Neo4j outperforms Solr. This is primarily because the use case as a subsetting tool does not require the flexibility and path-based queries that graph database tools offer. Rather, DOMS nodes use high-performance indexing structures to quickly filter large amounts of raw data that are not deeply connected; deep connectivity is precisely the feature of large data sets that would make graph queries useful.
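
To make the year/month example concrete, the toy sketch below mimics a time tree with nested Python dictionaries: queries walk root → year → month paths instead of scanning timestamps. The real structure lives inside Neo4j and is built with GraphAware's time-tree procedures, which are not reproduced here.

```python
from collections import defaultdict
from datetime import datetime

# A toy in-memory "time tree": root -> year -> month -> records.
# In the real system this structure lives inside Neo4j; this dict only
# mimics the path-restriction idea.
time_tree = defaultdict(lambda: defaultdict(list))

def insert(record, timestamp):
    t = datetime.fromtimestamp(timestamp)
    time_tree[t.year][t.month].append(record)

def records_for_months(years, months):
    """Walk root -> year -> month paths only; no scan over all timestamps."""
    return [r for y in years for m in months for r in time_tree[y][m]]

insert({"sst": 28.1}, datetime(2015, 7, 3).timestamp())
insert({"sst": 12.4}, datetime(2016, 1, 9).timestamp())
insert({"sst": 27.8}, datetime(2016, 7, 21).timestamp())
print(records_for_months(years=[2015, 2016], months=[7]))  # July records only
```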

Pseudorandom number generators (PRNGs) are an essential tool in many areas, including simulation studies of stochastic processes, modeling, randomized algorithms, and games. The performance of any PRNG depends on the quality of the generated random sequences; they must be generated quickly and have good statistical properties. Several statistical test suites have been developed to evaluate a single stream of random numbers, such as TestU01, DIEHARD, the tests from the SPRNG package, and a set of tests designed to evaluate bit sequences developed at NIST. TestU01 provides batteries of tests that are sets of the mentioned suites. The predefined batteries are SmallCrush (10 tests, 16 p-values), which runs quickly, and the Crush (96 tests, 187 p-values) and BigCrush (106 tests, 2254 p-values) batteries, which take longer to run. Most pseudorandom generators use recursion to produce sequences of numbers that appear to be random. The linear congruential generator is one of the best-known pseudorandom generators; the next number in the sequence is determined by the previous one. The recurrence starts with a value called the seed, and each time a recurrence starts with the same seed, the same sequence is produced. This thesis develops a new pseudorandom number generation scheme that produces random sequences with good statistical properties by scrambling linear congruential generators. The scrambling technique is based on a simplified version of the Feistel network, a symmetric structure used in the construction of cryptographic block ciphers. The proposed research seeks to improve the quality of the linear congruential generators' output streams and to break up the regularities existing in the generators.
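
A minimal sketch of the scheme's two ingredients: a classic linear congruential generator and a toy Feistel-style scrambler applied to its output. The constants and round function below are illustrative stand-ins, not the thesis's actual simplified Feistel network.

```python
# Minimal sketch: scramble a linear congruential generator's output with a
# few Feistel rounds. Constants and round function are illustrative only.
M = 2**31 - 1          # MINSTD modulus
A = 16807              # MINSTD multiplier

def lcg(seed):
    x = seed
    while True:
        x = (A * x) % M          # same seed -> same sequence
        yield x

def feistel_scramble(x, rounds=4, bits=16):
    mask = (1 << bits) - 1
    left, right = (x >> bits) & mask, x & mask
    for k in range(rounds):
        # Toy round function; a real cipher would use a keyed, nonlinear F.
        f = (right * 2654435761 + k) & mask
        left, right = right, left ^ f
    return (left << bits) | right

gen = lcg(seed=12345)
print([feistel_scramble(next(gen)) for _ in range(5)])
```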

Processors that employ instruction fusion can improve performance and energy usage beyond traditional processors by collapsing and simultaneously executing dependent instruction chains on the critical path. This paper describes compiler mechanisms that can facilitate and guide instruction fusion in processors built to execute fused instructions. The compiler support discussed in this paper includes compiler annotations to guide fusion, exploring multiple new fusion configurations, and developing scheduling algorithms that effectively select and order fusible instructions. The benefits of providing compiler support for dependent instruction fusion include statically detecting fusible instruction chains without the need for hardware dynamic detection support and improved performance by increasing available parallelism.

Internet of Things (IoT) systems are becoming a popular component of every smart system. Many people intend to develop various IoT systems, such as a smart socket that can be controlled remotely and track electricity consumption to save energy, or a home security system that combines several sensors and covers a large area. The goal of this thesis was to introduce a method for constructing an IoT system that can monitor different parameters. The design of this project also focused on wireless interaction in order to make the system more perceptual. The design of the system was modified several times, including a change from Ethernet to Wi-Fi. Ultimately, it provides an effective method for monitoring a building system, whether the quantity of interest is temperature, humidity, light intensity, or the movement of objects. The final design fulfills the fundamental goals, and there is a visualization web page for the IoT system that includes both real-time data monitoring and real-time charting. This thesis gives a thorough overview of how to build one's own IoT system.

One of the main goals of robotics research is to give physical platforms intelligence, allowing the platforms to act autonomously with minimal direction from humans. Motion planning is the process by which a mobile robot plans a trajectory that moves the robot from one state to another. While there are many motion planning algorithms, this research focuses on Sampling Based Model Predictive Optimization (SBMPO), a motion planning algorithm that allows for the generation of trajectories that are not only dynamically feasible, but also efficient in terms of a user-defined cost function (in this research, distance traveled or energy consumed). To accomplish this, SBMPO uses the kinematic, dynamic, and power models of the robot. For a skid-steered robot, these models depend on the type and inclination of the terrain over which the robot is traversing. Previous research has successfully used SBMPO to plan trajectories on different inclinations and terrain types, but with the terrain type and inclination held constant over the trajectory. This research extends the prior work to plan trajectories where the terrain type changes over the trajectory and where the robot has the option to go over or around hills, situations extremely common in the real-world environments encountered in military and search and rescue operations. Furthermore, this research documents the design and implementation of a 3D visualization environment which allows the trajectory generated by the planner to be visualized without having a robot follow the trajectory in a physical environment.

In many pattern classification problems, efficiently learning a suitable low-dimensional representation of high-dimensional data is essential. The advantages of linear dimension reduction methods are their simplicity and efficiency. Optimal component analysis (OCA) is a recently proposed linear dimension reduction method which seeks to optimize the discriminant ability of the nearest neighbor classifier for data classification and labeling. Mathematically, OCA defines an objective function which aims to discriminatively separate data in different classes, and an optimal basis is obtained through a stochastic gradient search on the underlying Grassmann manifold. OCA shows good performance in various applications including face recognition, object recognition, and image retrieval. However, a limitation of OCA is its high computational complexity, which prevents its wide usage in real applications. In this dissertation, several efficient methods, including two-stage OCA, multi-stage OCA, scalable OCA, and two-stage sphere factor analysis (SFA), are proposed to cope with this problem and achieve both efficiency and accuracy. Two-stage and multi-stage OCA aim to speed up the OCA search by reducing the dimension of the search space; scalable OCA uses a more efficient gradient updating method to reduce the computational complexity of OCA; two-stage SFA first reduces the search space and then searches for the optimal basis on a simpler geometrical manifold than that of OCA. Furthermore, a sparse OCA method is also proposed by adding sparseness constraints to OCA. Additionally, an application of the efficient OCA methods to rapid classification trees is presented. Experimental results on face and object classification show that these methods achieve efficiency and discrimination simultaneously.

This research explores the idea of extracting three-dimensional features from video clips in order to aid various video analysis and mining tasks. Although video analysis problems are well established in the literature, the use of three-dimensional information is scarce due to the inherent difficulties of building such a system. When the only input to the system is a video stream with no previous knowledge of the scene or camera (a typical scenario in video analysis), extracting meaningful and accurate 3D representations becomes a very difficult task. However, several recently proposed methods have shown some progress towards this goal by applying techniques from various other topics including simultaneous localization and mapping, structure from motion, and 3D reconstruction. In the research presented here, I present two main contributions towards solving this problem. First, I propose a method capable of generating a three-dimensional representation of a scene as observed by a monocular video, using no previous information. The method exploits the movement of the camera while robustly tracking features over time in order to obtain multiple views of a scene and perform 3D reconstruction. This system performs automatic camera calibration, estimates the three-dimensional structure of the scene, and tracks the scene across time while refining its results as new frames are obtained. Additionally, the system can track a scene even in the presence of moving people, a limitation of most SLAM and SFM approaches available in the literature. Second, I present a method for extracting the three-dimensional pose and motion of a person in a video. The method extends previously published work on two-dimensional human pose estimation by incorporating a human motion model and expands the two-dimensional pose into three dimensions using several heuristics. Together, these methods yield an intrinsic 3D representation of the static background and the people in a scene which can be used to solve various video analysis tasks. To prove the feasibility of my proposed method, I show how it can be used to solve a selection of video analysis tasks. First, I show how a three-dimensional point cloud of the scene can be used along with robust feature tracking to detect shot boundaries in the video. Next, I present an automatic approach to stereoscopic video conversion using no prior knowledge of the input video. Finally, I illustrate how a three-dimensional human model can be incorporated with simple linear classifiers to perform human action recognition with high classification accuracy.

Internet fault diagnosis has attracted much attention in recent years. In this paper, we focus on the problem of finding the Link Pass Ratios (LPRs) when the Path Pass Ratios (PPRs) of a set of paths are given. Usually, given the PPRs of the paths, the LPRs of a significant percentage of the links cannot be uniquely determined because the system is under-constrained. We consider the Maximum Likelihood Estimation of the LPRs of such links. We prove that the problem of finding the Maximum Likelihood Estimation is NP-hard, then propose a simple algorithm based on divide-and-conquer. We first estimate the number of faulty links on a path, then use the global information to assign LPRs to the links. We conduct simulations on networks of various sizes and the results show that our algorithm performs very well in terms of identifying faulty links.
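
For intuition, the sketch below assumes the usual independence model in which a path's pass ratio is the product of its links' pass ratios; with more unknown LPRs than observed PPRs the system is under-constrained, which is what motivates the maximum likelihood treatment.

```python
import numpy as np

# Assumption (for illustration): links fail independently, so a path's
# pass ratio is the product of its links' pass ratios.
def path_pass_ratio(lprs, path):
    return float(np.prod([lprs[link] for link in path]))

lprs = {"e1": 0.99, "e2": 0.70, "e3": 0.99}
paths = {"p1": ["e1", "e2"], "p2": ["e2", "e3"]}
pprs = {p: path_pass_ratio(lprs, links) for p, links in paths.items()}
print(pprs)  # two equations but three unknown LPRs: under-constrained,
             # hence the turn to maximum likelihood estimation
```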

Date Issued

2009

Identifier

FSU_migr_etd-1501

Format

Thesis

Title

Design of a Low-Cost Adaptive Question Answering System for Closed Domain Factoid Queries.

Closed domain question answering (QA) systems achieve precision and recall at the cost of complex language processing techniques to parse the answer corpus. We propose a query-based model for indexing answers in a closed domain factoid QA system. Further, we use a phrase term inference method for improving the ranking order of related questions. We posit that a query can be used as the unique identifier of an answer, and thus the recognition of a query allows us to retrieve the correct answer. In instances where a query is unrecognized, we infer synonymous relationships with other queries through the use of a user feedback loop to improve the ranking order of closely related questions, where possible. The goal of this research is to build a prototype as proof of concept that will learn domain-specific knowledge with increased usage over time. This study focuses on the feasibility of a lightweight QA learning system that adapts its responses based on the interaction among its users. This offers a lightweight approach to a factoid question answering system for domain-specific knowledge bases with significantly simplified language processing techniques.
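
A toy sketch of the query-as-identifier idea plus the feedback loop, under the assumption that recognized queries map directly to answers and that user feedback records synonymy for unrecognized ones. The normalization and data here are placeholders, not the thesis's design.

```python
# Toy sketch: recognized queries map straight to answers; a feedback loop
# records synonymy for unrecognized queries.
answers = {
    "what is the capital of france": "Paris",
    "who wrote hamlet": "Shakespeare",
}
synonyms = {}  # learned from user feedback: new query -> known query

def normalize(q):
    return q.lower().strip("?! .")

def ask(query):
    key = normalize(query)
    key = synonyms.get(key, key)
    return answers.get(key, "unknown -- please pick the intended question")

def give_feedback(new_query, known_query):
    synonyms[normalize(new_query)] = normalize(known_query)

print(ask("What is the capital of France?"))   # Paris
print(ask("France's capital city?"))           # unknown
give_feedback("France's capital city?", "What is the capital of France?")
print(ask("France's capital city?"))           # Paris after feedback
```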

Many applications feature both text and location information, which leads to a novel type of search: spatial approximate string search (Sas). Sas has gained attention from the database community only recently, and a large variety of problems remain open. Spatial and text data have been independently studied for decades. Spatial data have many unique features that are drastically different from text data. As a result, most of the existing techniques for string processing are either inapplicable or inefficient when adapted to spatial databases. We have investigated four issues in the general area of Sas: (i) spatial approximate string search in Euclidean space (Esas); (ii) spatial approximate string search on road networks (Rsas); (iii) selectivity estimation for Esas range queries; (iv) multi-approximate-keyword routing queries on road networks. For efficiently answering spatial approximate string queries in Euclidean space, we propose a novel index structure, the MhR-tree, which is based on the R-tree augmented with the min-wise signature and the linear hashing technique. The min-wise signature for an index node u keeps a concise representation of the union of q-grams from strings under the subtree of u. We analyze the pruning functionality of such signatures based on the set resemblance between the query string and the q-grams from the sub-trees of index nodes. For Rsas, we propose a novel method, RsasSol, which is superior to the baseline algorithm in practice. RsasSol combines an inverted list with reference points. Extensive experiments on large real data sets demonstrate the efficiency and effectiveness of our methods. We also discuss how to estimate range query selectivity in Euclidean space, since query selectivity estimation is always important for query optimization. Based on the spatial uniformity principle and the minimum number of neighborhoods principle for strings, we define a partitioning metric to facilitate the construction of buckets. We present a novel adaptive algorithm for finding balanced partitions using both the spatial and string information stored in the R-tree, which works well in practice. Lastly, we go beyond the basic kNN and range queries for Sas and consider an interesting application of Sas on road networks: finding the shortest path with keyword constraints on the road network. We propose both exact and approximate solutions for this problem. For a small set of query keywords (m ≤ 6), our exact solutions can quickly find the top-k shortest paths. For a large set of query keywords, our approximate solutions can still return the answer in real time with good approximation quality.
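
The pruning idea behind the MhR-tree can be sketched in a few lines: extract q-grams from strings, compute min-wise (MinHash) signatures, and estimate set resemblance from the fraction of matching minima. The hashing scheme below is an illustrative choice, not the thesis's implementation.

```python
import hashlib

def qgrams(s, q=2):
    return {s[i:i + q] for i in range(len(s) - q + 1)}

def minhash(grams, num_hashes=32):
    """Min-wise signature: for each seeded hash, keep the minimum value."""
    sig = []
    for seed in range(num_hashes):
        h = lambda g: int.from_bytes(
            hashlib.md5(f"{seed}:{g}".encode()).digest()[:8], "big")
        sig.append(min(h(g) for g in grams))
    return sig

def resemblance(sig_a, sig_b):
    """Fraction of matching minima estimates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

a, b = qgrams("tallahassee"), qgrams("tallahasse")
print(resemblance(minhash(a), minhash(b)))  # close strings -> high estimate
```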

Date Issued

2011

Identifier

FSU_migr_etd-0988

Format

Thesis

Title

Time Parallelization Methods for the Solution of Initial Value Problems.

Many scientific problems are posed as Ordinary Differential Equations (ODEs). A large subset of these are initial value problems, which are typically solved numerically. The solution starts by using a known state of the ODE system to determine the state at a subsequent point in time, and this process is repeated several times. When the computational demand is high due to a large state space, parallel computers can be used efficiently to reduce the time to solution. Conventional parallelization strategies distribute the state space of the problem amongst processors and distribute the task of computing a single time step amongst the processors. They are not effective when the computational problems have fine granularity, for example, when the state space is relatively small and the computational effort arises largely from the long time span of the initial value problem. The above limitation is increasingly becoming a bottleneck for important applications, in particular due to a couple of architectural trends. One is the increase in the number of cores on massively parallel machines. The high-end systems of today have hundreds of thousands of cores, and machines of the near future are expected to support on the order of a million simultaneous threads. Computations that were coarse grained on earlier machines are often fine grained on these. Another trend is the increased number of cores on a chip. This has provided desktop access to parallel computing for the average user. A typical low-end user requiring the solution of an ODE with a small state space would earlier not consider a parallel system; however, such a system is now available to the user. Users of both of the above environments need to deal with the problem of parallelizing ODEs with a small state space. Parallelization of the time domain appears promising for this problem. The idea is to divide the entire time span of the initial value problem into smaller intervals and have each processor compute one interval at a time, instead of dividing the state space. The difficulty lies in the fact that time is an intrinsically sequential quantity and one time interval can only start after its preceding interval completes, since we are solving an initial value problem. Earlier attempts at parallelizing the time domain were not very successful. This thesis proposes two different time parallelization strategies and demonstrates their effectiveness in dealing with the bottleneck described above. The thesis first proposes a hybrid dynamic iterations method which combines conventional sequential ODE solvers with dynamic iterations. Empirical results demonstrate a factor of two to three improvement in performance of the hybrid dynamic iterations method over a sequential solver on an 8-core processor, while conventional state-space decomposition is not useful due to the communication overhead.
Compared to Picard iterations (also parallelized in the time domain), the proposed method shows better convergence and speedup when high accuracy is required. The second proposed method is a data-driven time parallelization algorithm. The idea is to use results from related prior computations to predict the states in a new computation, and then parallelize the new computation in the time domain. The effectiveness of this method is demonstrated on Molecular Dynamics (MD) simulations of Carbon Nanotube (CNT) tensile tests. MD simulation is a special application of initial value problems. Empirical results show that the data-driven time parallelization method scales to two to three orders of magnitude larger numbers of processors than conventional state-space decomposition methods. This approach achieves the highest scalability for MD on general purpose computers. The time parallel method can also be combined with state-space decomposition methods to improve the scalability and efficiency of the conventional parallelization method. This thesis presents a combined data-driven time parallelization and state-space decomposition method and adapts it to MD simulations of soft matter, as typically seen in computational biology. Since MD is an important atomistic simulation technique widely used in computational chemistry, biology, and materials science, the data-driven time parallel method also suggests a promising approach for realistic simulations with long time spans.
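
As a small illustration of time-domain parallelism, the sketch below runs Picard iteration for an initial value problem: each sweep evaluates the right-hand side at every grid point independently, and it is exactly those evaluations that can be distributed across processors (they are done serially in NumPy here).

```python
import numpy as np

# Picard iteration for y' = f(t, y), y(0) = y0, on a fixed time grid.
# Each sweep evaluates f at every grid point independently -- the step
# that parallelizes across the time domain (done serially here).
def picard(f, y0, t, sweeps=20):
    y = np.full_like(t, y0)
    for _ in range(sweeps):
        integrand = f(t, y)                      # parallelizable across t
        integral = np.concatenate(([0.0], np.cumsum(
            0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))))
        y = y0 + integral                        # y_{k+1}(t) = y0 + ∫ f(s, y_k(s)) ds
    return y

t = np.linspace(0.0, 2.0, 201)
y = picard(lambda t, y: y, y0=1.0, t=t)          # y' = y  =>  y = e^t
print(np.max(np.abs(y - np.exp(t))))             # small residual after sweeps
```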

Bayesian inference with Bayesian networks is a #P-complete problem in general. Exact Bayesian inference is feasible in practice only on small-scale Bayesian networks or networks that are dramatically simplified, such as with naive Bayes or other approximations. Stochastic sampling methods, in particular importance sampling, form one of the most prominent and efficient approximate inference techniques, giving answers in reasonable computing time. A critical problem for importance sampling is to choose an importance function to efficiently and accurately sample the posterior probability distribution given a set of observed variables (also referred to as evidence). Choosing an importance function, or designing a new one, faces two major hurdles: how to approximate the posterior probability distribution accurately, and how to reduce the generation of undesirable inconsistent samples caused by deterministic relations in the network (also known as the rejection problem). In my dissertation I propose Refractor Importance Sampling, a family of importance sampling algorithms that effectively break the lower bound on the error of the importance function to ensure it can approach the posterior probability distribution of a Bayesian network given evidence. The aim is to increase the convergence rate of the approximation and to reduce sampling variance. I also propose Posterior Subgraph Clustering to improve the sampling process by breaking a network into several small independent subgraphs. To address the rejection problem, I propose the Zero Probability Backfilling and Compressed Vertex Tree methods to detect and store the deterministic constraints that are the root cause of inconsistent samples. The rejection problem is NP-hard, and I prove in my dissertation that even limiting the inconsistent samples to a certain degree with positive probability cannot be done in polynomial time. I propose the k-test to measure the hardness of the rejection problem by scaling the clause density of a CNF constructed from a Bayesian network. In the final part of my dissertation, I design and implement importance sampling algorithms for GPU platforms to speed up sampling by exploiting fine-grain parallelism. GPUs are cost-effective computing devices; however, the randomness of the memory-intensive operations in Bayesian network sampling makes it difficult to obtain computational speedups on these devices. I propose a new method, Parallel Irregular Wavefront Sampling, to improve memory access on the GPU by leveraging the conditional independence relations between variables in a network. I improve this scheme further by using the posterior distribution as an oracle to reduce costly memory fetches on the GPU. The proposed methods are implemented and experimentally tested. The results show a significant contribution to the efficiency and accuracy of Bayesian network sampling for Bayesian inference with real-world and synthetic benchmark Bayesian networks.
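
For readers unfamiliar with the baseline, here is a minimal importance-sampling estimate of a posterior on a two-node network, using the evidence likelihood as the weight (plain likelihood weighting). The dissertation's Refractor Importance Sampling constructs far better importance functions; this sketch only shows the mechanism being improved.

```python
import random

# Two-node network A -> B; estimate P(A=1 | B=1) by importance sampling.
# Proposal: sample A from its prior, weight by the evidence likelihood.
P_A = 0.3                       # P(A=1)
P_B_given_A = {1: 0.9, 0: 0.2}  # P(B=1 | A)

def estimate(n=100_000, seed=1):
    random.seed(seed)
    num = den = 0.0
    for _ in range(n):
        a = 1 if random.random() < P_A else 0
        w = P_B_given_A[a]      # importance weight from evidence B=1
        num += w * a
        den += w
    return num / den

exact = P_A * 0.9 / (P_A * 0.9 + (1 - P_A) * 0.2)
print(estimate(), exact)        # both ≈ 0.6585
```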

Date Issued

2011

Identifier

FSU_migr_etd-0912

Format

Thesis

Title

Adaptive Observations in a 4D-Var Framework Applied to the Nonlinear Burgers Equation Model.

In 4D-Var data assimilation for geophysical models, the goal is to reduce the lack of fit between model and observations (the strong constraint approach, which assumes a perfect model). In the last two decades the four-dimensional variational technique has been extensively used in numerical weather prediction because time-distributed observations are assimilated to obtain a better initial condition, leading to more accurate forecasts. The use of large-scale unconstrained minimization routines to minimize a cost functional measuring the lack of fit between observations and model forecast requires the availability of the gradient of the cost functional with respect to the control variables. The nonlinear Burgers equation model is used as the numerical forecast model, and the first-order adjoint model is used to find the gradient of the cost functional. The use of targeted observations supplementing routine observations contributes to the reduction of the forecast analysis error and can provide improved forecasts of weather events of critical societal impact, for instance hurricanes, tornadoes, and sharp fronts. The optimal space and time locations of the adaptive observations can be determined using a singular vector approach. In our work we use both the adjoint sensitivity and sensitivity-to-observations approaches to identify the optimal space and time locations for targeted observations at a future time, aimed at providing an improved forecast. Both approaches are compared in this work and some conclusions are outlined.
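
For reference, the textbook strong-constraint 4D-Var cost functional has the form below (background misfit plus time-distributed observation misfits, with the model acting as a constraint); the exact functional used with the Burgers model may omit or modify individual terms.

```latex
% Textbook strong-constraint 4D-Var cost functional.
J(\mathbf{x}_0) = \tfrac{1}{2}\,(\mathbf{x}_0 - \mathbf{x}_b)^{\mathsf{T}}\mathbf{B}^{-1}(\mathbf{x}_0 - \mathbf{x}_b)
  + \tfrac{1}{2}\sum_{i=0}^{N}\bigl(H_i(\mathbf{x}_i) - \mathbf{y}_i\bigr)^{\mathsf{T}}\mathbf{R}_i^{-1}\bigl(H_i(\mathbf{x}_i) - \mathbf{y}_i\bigr),
  \qquad \mathbf{x}_i = M_{0\to i}(\mathbf{x}_0),
```

where x_b is the background state, B and R_i are error covariance matrices, H_i the observation operators, y_i the observations, and M_{0→i} the (here, Burgers) model propagator; the adjoint model supplies the gradient of J with respect to x_0 efficiently.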

Mobile Ad Hoc Networks (MANETs) are a collection of wireless mobile nodes with links that are made and broken arbitrarily. They have limited resources in power, computation, and broadcast range, and a dynamic topology with no fixed infrastructure. Communication is accomplished by routes in which individual nodes relay packets between a source and destination. A lot of work has been done in this area since 1998, focusing mainly on the availability, reliability, or efficiency of routing algorithms. Little thought has gone into security as part of the protocol development process; in many instances, MANET security has been built in as an afterthought. Before MANETs become a mainstay in computing applications, many security issues need to be addressed. One such issue is the insider threat: how does one find and effectively neutralize a malicious node? It has recently been shown how to do this with digital signatures; however, this may not always be possible because of the computational cost of digital signatures. The main contributions of this thesis are the following (the second item is sketched after this list):
• An approach for dealing with malicious faults in MANETs
• A forward and reverse symmetric key authentication chain for MANETs
• A malicious fault tracing algorithm for communication routing protocols in ad hoc networks with constrained resources
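
A minimal sketch of the one-way key chain idea behind the second contribution, in the style of TESLA: keys are generated by repeated hashing and disclosed in reverse, so each newly revealed key is authenticated with one cheap hash. The thesis's forward-and-reverse chain construction may differ in detail.

```python
import hashlib

def H(b):
    return hashlib.sha256(b).digest()

# One-way key chain: each stored key is the hash of the next one to be
# revealed. Keys are disclosed in reverse order of generation, so a
# receiver holding key k can authenticate a later-revealed key k' by
# checking H(k') == k -- cheap symmetric-key authentication.
n = 5
chain = [H(b"secret-seed")]
for _ in range(n):
    chain.append(H(chain[-1]))
chain.reverse()                     # chain[0] is the anchor, published first

anchor = chain[0]
for revealed in chain[1:]:
    assert H(revealed) == anchor    # verify each newly revealed key
    anchor = revealed
print("all revealed keys verified against the anchor")
```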

Efficient implementations of the Discrete Fourier Transform (DFT) for GPUs provide good performance with large data sizes, but are not competitive with CPU code for small data sizes. On the other hand, several applications require a large number of DFTs on small data sizes. In fact, even algorithms for large data sizes use a divide-and-conquer approach in which small DFTs eventually need to be performed. In this thesis, it is shown that the asymptotically slow matrix multiplication approach can yield better performance on GPUs than asymptotically faster algorithms for small data, due to its regular memory access and computational patterns. Also discussed in this thesis are the effects of different optimization techniques and the combination of the matrix multiplication algorithm with the mixed radix algorithm for 2-D and 3-D complex DFTs. This implementation performs up to 21 times faster than cuFFT on an NVIDIA GeForce 9800 GTX GPU and up to 5 times faster than FFTW on a CPU. Furthermore, this GPU implementation can accelerate the performance of a Quantum Monte Carlo application for which cuFFT is not effective. The primary contributions of this work lie in providing a GPU implementation that is efficient for small DFTs and in demonstrating the utility of the matrix multiplication approach.
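
The heart of the approach is that many small DFTs batch into one dense matrix product with perfectly regular access patterns. The NumPy sketch below (CPU code standing in for the GPU kernels) verifies the O(n²) matrix formulation against an FFT.

```python
import numpy as np

def dft_matrix(n):
    """Dense DFT matrix W[j, k] = exp(-2*pi*i*j*k / n)."""
    j, k = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    return np.exp(-2j * np.pi * j * k / n)

# Batch of many small DFTs as one matrix product -- the regular access
# pattern that maps well to GPU hardware (NumPy on CPU stands in here).
n, batch = 16, 1000
W = dft_matrix(n)
signals = np.random.rand(batch, n) + 1j * np.random.rand(batch, n)
out = signals @ W.T                       # O(n^2) work per transform
print(np.allclose(out, np.fft.fft(signals, axis=1)))  # True
```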

This thesis presents and evaluates a new algorithm which generates random numbers. The algorithm uses a number-theoretic class of numbers called normal numbers, whose infinite digit sequences are uniformly distributed over all block lengths. The algorithm is then integrated into the SPRNG package, with some ideas as to how it can be parallelized. Finally, the performance of this algorithm is evaluated using a standard test suite. The new algorithm is compared with similar, known-good generators using the spectral test and also as the random number generator in a Monte Carlo algorithm. The generator is shown to work well in all tests and to produce values at moderately good speed.
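
As a hedged illustration of the idea, the sketch below draws digits from Champernowne's constant 0.12345678910111213..., which is provably normal in base 10, and packs them into crude uniform variates. The thesis's actual generator and its SPRNG integration are more sophisticated.

```python
from itertools import count, islice

def champernowne_digits():
    """Digits of 0.123456789101112..., provably normal in base 10."""
    for n in count(1):
        for ch in str(n):
            yield int(ch)

# Crude uniform variates from fixed-size digit blocks (illustration only).
def uniforms(block=6):
    digits = champernowne_digits()
    while True:
        yield int("".join(str(next(digits)) for _ in range(block))) / 10**block

print(list(islice(uniforms(), 5)))
```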

Date Issued

2010

Identifier

FSU_migr_etd-3409

Format

Thesis

Title

A System Architecture That Facilitates Collaboration via Handheld Devices (PDAs).

This research examines the development of a system architecture for collaborative learning that combines feedback, group awareness, and chat in the form of both textual and auditory input. The goals are to evaluate the capability of building such a model through the design of a prototype system, and to investigate the feasibility of implementing the prototype in a handheld learning environment. This model can serve as a template for providing interface tools, communication strategies, and data manipulation. It is also easily adaptable and can be modified to support a variety of platforms, including both wireless and wired scenarios. The mobility created via a wireless network provides the opportunity for users to move about and collaborate freely without being tied to desks, workstations, or even laboratories. The implementation of voice input for soliciting user data is important in settings where users are younger (i.e., elementary-level users) or unable to manipulate the standard keypad or provide written input. This model, when coupled with a proven collaborative learning methodology, can be effective in assisting individuals in building cognition. Also, the architecture can be used by others wishing to develop collaborative learning systems for handhelds, an area of research where such systems are nonexistent. Because of the popularity of handhelds and their incorporation into a variety of settings, computer scientists need to be at the forefront in developing significant research projects that can investigate the capability, impact, and extensibility of handheld computers. In the study, a paper prototype test was conducted to determine an optimum interface layout conducive to mobile interaction between users via personal digital assistants (iPAQ™ PDAs). The test responses confirmed that the interface design strategies decided upon prior to testing were consistent with user preferences, and that speech was indeed the preferred method of input for the target group (younger users). The prototype system, developed using the Java 2 Micro Edition (J2ME) software platform and the Java Wireless Toolkit 2.1 development tool, presents a model for providing a variety of collaborative communication methods between users by incorporating both textual and voice input methods. The architecture also provides a mechanism for handling networked messaging between users on a wireless network, and demonstrates that the model can be made adaptable to a wired network with little modification. The application of the prototype to a successful reading comprehension methodology, Question-Answer Relationships (QAR), is demonstrated, and lessons learned during system development are presented. A discussion follows of future research strategies and remaining areas of application.

Date Issued

2004

Identifier

FSU_migr_etd-3712

Format

Thesis

Title

Application of Sampling-Based Model Predictive Control to Motion Planning for Robotic Manipulators.

This thesis presents the Sampling-Based Model Predictive Control (SBMPC) algorithm, a novel sampling-based planning algorithm. The algorithm was originally designed for dynamic systems, which are described by a set of system states and a sequence of control inputs that drive their evolution; however, this thesis reviews applications of SBMPC to motion planning for robotic manipulators using kinematic models and kinematic models plus integrators (also referred to as extended kinematic models). To properly evaluate SBMPC, a brief survey of conventional motion planning techniques used for manipulators is first presented. This survey includes a comparison between combinatorial and sampling-based path planning algorithms as well as an overview of two recent sampling-based path planning algorithms: Rapidly-exploring Random Trees (RRTs) and Randomized A* (RA*). Once this preliminary information is discussed, a detailed explanation of SBMPC and its components is presented. SBMPC combines four main components: an A*-type algorithm used for optimization, sampling of the control inputs, discretization of the system states, and a system model whose purpose is to predict the new system state given the current system states and a new set of control inputs. Next, results for experiments done with SBMPC are presented. More specifically, SBMPC was used in simulations of two manipulators, namely a 3-link planar manipulator and the Barrett WAM. The results show that SBMPC can generate smooth paths for these two manipulators simply by sampling joint velocities instead of joint angles, and even smoother paths by sampling joint accelerations instead of joint velocities. The results also demonstrate that SBMPC can generate smooth trajectories given that the proper cost function and goal heuristic are used. Since SBMPC is able to generate smooth paths or smooth trajectories in a single step, this represents a significant improvement over the conventional approach to robot motion planning used in the past few decades. Finally, SBMPC is still under investigation, and even though important results are presented here, it still has room for improvement that could lead to further interesting projects.
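
A toy sketch of the four components working together, on a 2-D point robot rather than a manipulator: control inputs are sampled, a model predicts the successor state, states are discretized to prune duplicates, and an A*-style queue orders expansion. Everything below is a simplified stand-in for the thesis's formulation.

```python
import heapq, random

# Toy SBMPC-style planner on a 2-D point robot. The model, cost, and
# heuristic are illustrative stand-ins for the thesis's manipulator models.
def model(state, u, dt=0.5):
    """Predict the next state from the current state and a control input."""
    return (state[0] + u[0] * dt, state[1] + u[1] * dt)

def dist(s, goal):
    return ((s[0] - goal[0])**2 + (s[1] - goal[1])**2) ** 0.5

def sbmpc(start, goal, samples=8, tol=0.35, seed=0):
    random.seed(seed)
    frontier = [(dist(start, goal), 0.0, start, [start])]
    seen = set()
    while frontier:
        _, g, s, path = heapq.heappop(frontier)     # A*-type best-first pop
        if dist(s, goal) < tol:
            return path
        cell = (round(s[0], 1), round(s[1], 1))     # state discretization
        if cell in seen:
            continue
        seen.add(cell)
        for _ in range(samples):                    # sample control inputs
            u = (random.uniform(-1, 1), random.uniform(-1, 1))
            s2 = model(s, u)                        # model predicts next state
            g2 = g + dist(s, s2)                    # cost = distance traveled
            heapq.heappush(frontier, (g2 + dist(s2, goal), g2, s2, path + [s2]))
    return None

print(sbmpc(start=(0.0, 0.0), goal=(3.0, 2.0)))
```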

There has been a great deal of research devoted to computer vision related assistive technologies. Unfortunately, this area of research has not produced many usable solutions; the long cane and the guide dog are still far more useful than most of these devices. Through the push for advanced mobile and gaming systems, new low-cost solutions have become available for building innovative and creative assistive technologies. These technologies have been used for sensory substitution projects that attempt to convert vision into either auditory or tactile stimuli, and these projects have reported some degree of measurable success. Most of them focused on converting either image brightness or depth into auditory signals. This research was devoted to the design and creation of a video game simulator capable of supporting research and training for sensory substitution concepts that convert vision into auditory stimuli. The simulator was used to perform direct comparisons between some of the popular sensory substitution techniques as well as to explore new concepts for conversion. This study of 42 participants tested different techniques for image simplification and discovered that depth-to-tone sensory substitution may be more usable than brightness-to-tone substitution. The study has shown that 3D game simulators can be used in lieu of building costly prototypes for testing new sensory substitution concepts.
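
As a concrete example of a depth-to-tone encoding, the sketch below maps nearer surfaces to higher pitch and image column to stereo pan. This is a generic simplification of such schemes; the encodings actually compared in the study may differ.

```python
import numpy as np

# Toy depth-to-tone mapping: nearer surfaces -> higher pitch, image column
# -> stereo pan. Parameters are illustrative, not the study's settings.
SAMPLE_RATE = 22050

def row_to_audio(depth_row, max_depth=5.0, duration=0.05):
    t = np.linspace(0, duration, int(SAMPLE_RATE * duration), endpoint=False)
    left, right = np.zeros_like(t), np.zeros_like(t)
    for col, depth in enumerate(depth_row):
        pitch = 200 + 1800 * (1 - min(depth, max_depth) / max_depth)
        pan = col / max(len(depth_row) - 1, 1)      # 0 = left, 1 = right
        tone = np.sin(2 * np.pi * pitch * t) / len(depth_row)
        left += (1 - pan) * tone
        right += pan * tone
    return np.stack([left, right], axis=1)          # one stereo audio frame

frame = row_to_audio([4.5, 2.0, 0.5])               # near object on the right
print(frame.shape)                                  # (1102, 2)
```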

Near point problems are widely used in computational geometry as well as a variety of other scientific fields. This work examines four common near point problems and presents original algorithms that solve them. Planar nearest neighbor searching is highly motivated by geographic information system and sensor network problems. Efficient data structures to solve near neighbor queries in the plane can exploit the extremely low dimension for fast results. To this end, DelaunayNN is an algorithm using Delaunay graphs and Voronoi cells to answer queries in O(log n) time, faster in practice than other common state-of-the-art algorithms. k-nearest neighbor graph construction arises in computer graphics in the areas of normal estimation and surface simplification. This work presents knng, an efficient algorithm using Morton ordering to solve the problem. The knng algorithm exploits cache coherence and low storage space, and is highly optimizable for parallel processors. The GeoFilterKruskal algorithm solves the problem of computing geometric minimum spanning trees (GMSTs). A common tool in tackling clustering problems, GMSTs are an extension of the minimum spanning tree graph problem applied to the complete graph of a point set. By using well-separated pair decomposition, bi-chromatic closest pair computation, and partitioning and filtering techniques, GeoFilterKruskal greatly reduces the total computation required. It is also one of the only algorithms to compute GMSTs in a manner that lends itself to parallel computation, a major advantage over its competitors. High-dimensional nearest neighbor searching is an expensive operation, due to an exponential dependence on dimension in many lower-dimensional solutions. Modern techniques to solve this problem often revolve around projecting data points into a large number of lower-dimensional subspaces. PCANN explores the idea of picking one particularly relevant subspace for projection. When used on SIFT data, principal component analysis allows for greatly reduced dimension with no need for multiple projections. Additionally, this algorithm is also highly amenable to parallel computing.
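
The Morton ordering that knng relies on is easy to sketch: interleave the coordinate bits into a Z-order code and sort. Points that are close in the plane tend to land close in the sorted order, which is the cache-coherence property the algorithm exploits.

```python
def morton2d(x, y, bits=16):
    """Interleave the bits of x and y into a single Z-order (Morton) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code

# Sorting by Morton code tends to place spatially close points near each
# other in memory; candidate neighbors can then be drawn from a small
# window around each point in the sorted order.
points = [(5, 9), (6, 8), (60, 2), (4, 10), (61, 3)]
ordered = sorted(points, key=lambda p: morton2d(*p))
print(ordered)   # the two far-away points cluster together at the end
```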

The Python Package Index (PyPI) is a repository that hosts the packages developed for the Python community. It hosts thousands of packages from different developers, and for the Python community it is the primary source for downloading and installing packages. It also provides a simple web interface to search for these packages. A direct search on PyPI returns hundreds of packages that are not intuitively ordered, making it harder to find the right package. Developers consequently resort to mature search engines like Google, Bing, or Yahoo, which redirect them to the appropriate package homepage at PyPI. Hence, the first task of this thesis is to improve search results for Python packages. Secondly, this thesis also attempts to develop a new search engine that allows Python developers to perform a code search targeting Python modules. Currently, existing search engines classify programming languages such that a developer must select a programming language from a list. As a result, every time developers perform a search operation, they have to choose Python out of a plethora of programming languages. This thesis seeks to offer a more reliable and dedicated search engine that caters specifically to the Python community and ensures a more efficient way to search for Python packages and modules.
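To illustrate the ranking problem, here is a toy relevance score combining term overlap with a popularity signal; the package records and the scoring rule are hypothetical, and this is neither PyPI's search nor the engine built in the thesis.

    import math

    def score(query, name, summary, downloads):
        terms = set(query.lower().split())
        text = set(name.lower().replace("-", " ").split()) | set(summary.lower().split())
        overlap = len(terms & text) / max(len(terms), 1)
        return overlap * (1 + math.log10(1 + downloads))   # popularity boost

    packages = [("requests", "HTTP for Humans", 5_000_000),
                ("http-tools", "low level http helpers", 12_000),
                ("pycurl", "bindings for libcurl", 300_000)]
    for name, summary, dl in sorted(packages, key=lambda p: -score("http client", *p)):
        print(name)   # requests, http-tools, pycurl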

We construct a phase field model for simulating the adhesion of a cell membrane to a substrate. The model features two phase field functions which are used to simulate the membrane and the substrate. An energy model is defined which accounts for the elastic bending energy and the contact potential energy as well as, through a penalty method, vesicle volume and surface area constraints. Numerical results are provided to verify our model and to provide visual illustrations of the interactions between a lipid vesicle and substrates having complex shapes. Examples are also provided for the adhesion process in the presence of gravitational and point pulling forces. A comparison with experimental results demonstrates the effectiveness of the two phase field approach. Similarly to vesicle-substrate adhesion, we construct a multi-phase-field model for simulating the adhesion between two vesicles. Two phase field functions are introduced to simulate each of the two vesicles. An energy model is defined which accounts for the elastic bending energy of each vesicle and the contact potential energy between the two vesicles; the vesicle volume and surface area constraints are imposed using a penalty method. Numerical results are provided to verify the efficacy of our model and to provide visual illustrations of the different types of contact. The method can be adjusted to solve endocytosis problems by modifying the bending rigidity coefficients of the two elastic bending energies. The method can also be extended to simulate multi-cell adhesion, one example of which is erythrocyte rouleaux. A comparison with laboratory observations demonstrates the effectiveness of the multi-phase-field approach. Finally, coupled with fluid, we construct a phase field model for simulating vesicle-vessel adhesion in a flow. Two phase field functions are introduced to simulate the vesicle and the vessel, respectively. The fluid is modeled and confined inside the tube by a phase-field-coupled Navier-Stokes equation. Both vesicle and vessel are transported by the fluid flow inside our computational domain. An energy model is defined which accounts for the combined behavior of vesicle-fluid interaction, vessel-fluid interaction, and vesicle-vessel adhesion. The vesicle volume and surface area constraints are imposed using a penalty method, while the vessel elasticity is modeled under Hooke's law. Numerical results are provided to verify the efficacy of our model and to demonstrate the effectiveness of our fluid-coupled vesicle-vessel adhesion phase field approach by comparison with laboratory observations.
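For reference, a common phase-field form of such a constrained bending energy (a sketch of the generic functional from the phase-field vesicle literature, not necessarily the exact energy used in this work; \phi is the phase field, \varepsilon the interface width, \kappa the bending rigidity):

    E(\phi) = \frac{\kappa}{2\varepsilon} \int_\Omega
              \Big( \varepsilon \Delta \phi + \frac{1}{\varepsilon} (1 - \phi^2)\, \phi \Big)^2 dx
              + M_1 \big( A(\phi) - \alpha \big)^2
              + M_2 \big( B(\phi) - \beta \big)^2

    A(\phi) = \int_\Omega \phi \, dx, \qquad
    B(\phi) = \int_\Omega \frac{\varepsilon}{2} |\nabla \phi|^2
              + \frac{1}{4\varepsilon} \big( \phi^2 - 1 \big)^2 \, dx

where M_1 and M_2 are the penalty constants enforcing the volume constraint A(\phi) = \alpha and the surface area constraint B(\phi) = \beta, matching the penalty method described above.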

Molecular Dynamics (MD) is an important simulation technique with widespread use in computational chemistry, biology, and materials science. An important limitation of MD is that the time step size is limited to around a femtosecond (10^-15 s). Consequently, a large number of iterations are required to simulate realistic time spans. This is a major bottleneck in MD, and has been identified as an important challenge in computational biology and nano-materials. While parallelization has been effective in dealing with the computational effort that arises in simulating large physical systems (that is, those having a large number of atoms), conventional parallelization is not effective in simulating small or moderately sized physical systems to long time spans. We recently introduced a new approach to parallelization, where data from prior simulations are used to parallelize a new computation along the time domain. We demonstrated its effectiveness in a nano-materials application, where this approach scaled to a larger number of processors than conventional parallelization. In this thesis, we parallelize a computational biology application – the AFM pulling of a protein – using this approach. The significance of this work arises in demonstrating the effectiveness of this technique in a soft-matter application, which is more challenging than the hard-matter applications considered earlier.
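To see why the femtosecond step is a bottleneck, consider the standard velocity Verlet loop that MD codes run; the thesis's data-driven time-domain parallelization is layered on top of loops like this one, shown here only for illustration. Simulating one microsecond at a 1 fs step takes 10^9 strictly sequential iterations.

    import numpy as np

    def velocity_verlet(pos, vel, force_fn, mass, dt, n_steps):
        f = force_fn(pos)
        for _ in range(n_steps):                    # inherently sequential
            pos = pos + vel * dt + 0.5 * (f / mass) * dt**2
            f_new = force_fn(pos)
            vel = vel + 0.5 * (f + f_new) / mass * dt
            f = f_new
        return pos, vel

    harmonic = lambda x: -x                         # toy force, reduced units
    x, v = velocity_verlet(np.array([1.0]), np.array([0.0]),
                           harmonic, 1.0, 1e-3, 10_000)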

Date Issued

2006

Identifier

FSU_migr_etd-3519

Format

Thesis

Title

Methods for Linear and Nonlinear Array Data Dependence Analysis with the Chains of Recurrences Algebra.

The presence of data dependences between statements in a loop iteration space imposes strict constraints on statement order and loop restructuring when preserving program semantics. A compiler determines the safe partial orderings of statements that enhance performance by explicitly disproving the presence of dependences. As a result, the false positive rate of a dependence analysis technique is a crucial factor in the effectiveness of a restructuring compiler's ability to optimize the execution of performance-critical code fragments. This dissertation investigates reducing the false positive rate by improving the accuracy of analysis methods for dependence problems and increasing the total number of problems analyzed. Fundamental to these improvements is the rephrasing of the dependence problem in terms of Chains of Recurrences (CR), a formalism that has been shown to be conducive to efficient loop induction variable analysis. An infrastructure utilizing CR-analysis methods and enhanced dependence testing techniques is developed and tested. Experimental results indicate that the capabilities of dependence analysis methods can be improved without a reduction in efficiency. This results in a reduction in the false positive rate and an increase in the number of optimized and parallelized code fragments.
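A small example of the CR formalism itself (the standard algebra, not the dissertation's analysis infrastructure): the chain of recurrences {c0, +, c1, +, ..., +, ck} denotes the loop expression whose value at iteration i is sum_j c_j * C(i, j), so array subscripts reduce to coefficient tuples that dependence tests can compare directly.

    from math import comb

    def cr_value(coeffs, i):
        """Evaluate a pure-sum CR at iteration i via its binomial closed form."""
        return sum(c * comb(i, j) for j, c in enumerate(coeffs))

    cr = [7, 8, 6]                      # the CR {7, +, 8, +, 6}
    for i in range(5):
        assert cr_value(cr, i) == 3*i*i + 5*i + 7   # subscript 3i^2 + 5i + 7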

Passwords are still one of the most common means of securing computer systems. Most organizations rely on password authentication systems, and it is therefore very important for them to ensure that their users have strong passwords. They usually try to enforce security by mandating that users follow password creation policies, which impose rules such as a minimum length or the use of symbols and numbers. However, these policies are not consistent with each other; for example, the required length of a good password differs from policy to policy. They also usually ignore the usability of the resulting passwords. The more complex the policies are, the more they frustrate users, who end up adopting coping strategies such as appending "123" to their passwords or repeating a word to make their passwords longer, which reduces password security. More importantly, there is no scientific basis for these password creation policies that ensures passwords created under these rules are resistant to real attacks. In fact, studies have shown that even the NIST proposal for a password creation policy intended to yield strong passwords is not valid. This paper describes different password creation policies and password checkers that try to help users create strong passwords, and addresses their issues. Metrics for password strength are explored, and new approaches to calculating these metrics for password distributions are introduced. Furthermore, a new technique to estimate password strength based on its likelihood of being cracked by an attacker is described. In addition, a tool called PAM has been developed and is explained in detail in this paper to help users choose strong passwords using these metrics. PAM is a password analyzer and modifier, which rejects weak passwords and suggests a new, stronger password with slight changes to the original one to preserve its usability for each individual.
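A toy sketch of the analyzer/modifier idea (the rules and the strength estimate here are hypothetical stand-ins, not the thesis's PAM metrics): estimate strength from a naive guess-space model and, if the password is weak, suggest a nearby variant instead of forcing the user to start over.

    import math, random, string

    COMMON = {"password", "123456", "qwerty", "letmein"}

    def strength_bits(pw: str) -> float:
        """Naive estimate: log2(charset_size ** length)."""
        if pw.lower() in COMMON:
            return 0.0
        charset = sum(bool(set(pw) & set(cs)) * len(cs) for cs in
                      (string.ascii_lowercase, string.ascii_uppercase,
                       string.digits, string.punctuation))
        return len(pw) * math.log2(max(charset, 1))

    def suggest(pw: str, min_bits: float = 60.0) -> str:
        """Insert random characters until the estimate clears the bar."""
        rng = random.Random(0)
        while strength_bits(pw) < min_bits:
            i = rng.randrange(len(pw) + 1)
            pw = pw[:i] + rng.choice(string.ascii_letters + string.digits
                                     + string.punctuation) + pw[i:]
        return pw

    print(suggest("sunshine12"))   # a slightly edited, stronger variant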

Energy efficiency is an important design consideration in nearly all classes of processors, but is of particular importance to mobile and embedded systems. The data cache accounts for a significant portion of processor power. We have previously presented an approach to reducing cache energy by introducing an explicitly controlled tagless access buffer (TAB) at the top of the cache hierarchy. The TAB reduces energy usage by redirecting loop memory references from the level-one data cache (L1D) to the smaller, more energy-efficient TAB. These references need not access the data translation lookaside buffer (DTLB), and can sometimes avoid unnecessary transfers from lower levels of the memory hierarchy. We improve upon our previous design to create a system that requires fewer instruction set changes and gives more explicit control over the allocation and deallocation of TAB resources. We show that with a cache line size of 32 bytes, a four-line TAB can eliminate on average 31% of L1D accesses, which reduces L1D/DTLB energy usage by 19%.

This report discusses the threats to information and services in open systems. In it we provide an illustration where these threats are confronted, describe the three basic techniques of authentication, and comment on the various methods of access control. We go on to explain information compartmentalization, information disclosure policies, preventing the unauthorized modification of data, and audit logs. Finally, we outline a hybrid approach for access control that incorporates strong authentication and uses an access control tree to represent privilege. Our system is real-time, perimeter based, community centric, and user friendly. The intelligent security decisions that result from using an access control tree are a considerable improvement over current systems. With our access control system we can dynamically determine capability based on real-world conditions by incorporating security information from external data sources, software agents, and location-based sensors.
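A hypothetical sketch of privilege represented as an access control tree: interior nodes combine child decisions (ALL or ANY), and leaves test live conditions such as authentication strength or location. The names and the policy are illustrative, not taken from the report.

    from dataclasses import dataclass, field
    from typing import Callable, List

    @dataclass
    class Node:
        mode: str = "ALL"          # "ALL" = every child; "ANY" = at least one
        tests: List[Callable[[dict], bool]] = field(default_factory=list)
        children: List["Node"] = field(default_factory=list)

        def allows(self, ctx: dict) -> bool:
            results = [t(ctx) for t in self.tests] + \
                      [c.allows(ctx) for c in self.children]
            return all(results) if self.mode == "ALL" else any(results)

    # Grant access if strongly authenticated AND (inside perimeter OR escorted).
    policy = Node("ALL",
                  tests=[lambda c: c["auth_factors"] >= 2],
                  children=[Node("ANY", tests=[lambda c: c["inside_perimeter"],
                                               lambda c: c["escorted"]])])
    print(policy.allows({"auth_factors": 2, "inside_perimeter": False,
                         "escorted": True}))   # True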

This dissertation presents contributions to several problems relevant to cyber physical system (CPS) vulnerability research. Cyber physical systems encompass any computational systems that interact with physical processes, including critical infrastructure networks, industrial control system networks, SCADA networks, and more. Vulnerability research encompasses three main areas of cybersecurity: finding new vulnerabilities, exploit development, and exploit mitigation. This work primarily focuses on exploit mitigation, and ethically utilizes the other two areas for conclusive validation of the research contributions. Contributions in this work span problems including, but not limited to, dynamic trust management; vulnerability research education; CPS situational awareness and threat intelligence; CPS physics impact analysis; embedded CPS malware analysis; and embedded CPS forensics. Notable contributions include the acclaimed Offensive Computer Security open courseware, the introduction of physics-based intrusion detection for cyber physical systems, a novel threat intelligence and situational awareness framework for CPS, and a novel embedded CPS virtualization and simulated physics integration methodology for dynamic analysis and physics analysis of embedded CPS. The contribution results have been peer reviewed and published, and the list of publications generated during the doctoral research is given in the Biographical Sketch. All of these areas and contributions are critically relevant to critical infrastructure and industrial control systems.

Quasi-Monte Carlo methods are a variant of ordinary Monte Carlo methods that employ highly uniform quasirandom numbers in place of Monte Carlo's pseudorandom numbers. Monte Carlo methods offer statistical error estimates; however, while quasi-Monte Carlo has a faster convergence rate than normal Monte Carlo, one cannot obtain error estimates from quasi-Monte Carlo sample values in any practical way. A recently proposed family of methods, called randomized quasi-Monte Carlo methods, combines the advantages of Monte Carlo and quasi-Monte Carlo. Randomness can be brought to bear on quasirandom sequences through scrambling and other related randomization techniques, which provide an elegant approach to obtaining error estimates for quasi-Monte Carlo by treating each scrambled sequence as a different and independent random sample. The core of randomized quasi-Monte Carlo is to find an effective and fast algorithm to scramble (randomize) quasirandom sequences. This dissertation surveys research on algorithms and implementations of scrambled quasirandom sequences and proposes some new algorithms to improve the quality of scrambled quasirandom sequences. Besides providing error estimates for quasi-Monte Carlo, scrambling techniques provide a natural way to parallelize quasirandom sequences. This scheme is especially suitable for distributed or grid computing. By scrambling a quasirandom sequence we can produce a family of related quasirandom sequences. Finding one or a subset of optimal quasirandom sequences within this family is an interesting problem, as such optimal quasirandom sequences can be quite useful for quasi-Monte Carlo. The process of finding such optimal quasirandom sequences is called the derandomization of a randomized (scrambled) family. We summarize aspects of this technique and propose some new algorithms for finding optimal sequences from the Halton, Faure, and Sobol sequences. Finally, we explore applications of derandomization.
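A minimal sketch of one classical scrambling idea, a randomized (generalized) Halton sequence that applies a random digit permutation per base; this is a simplified relative of the scrambling algorithms studied in the dissertation, not a specific one of them.

    import random

    def scrambled_radical_inverse(n: int, base: int, perm) -> float:
        """Radical inverse of n in the given base, digits remapped by perm."""
        inv, denom = 0.0, 1.0
        while n > 0:
            n, digit = divmod(n, base)
            denom *= base
            inv += perm[digit] / denom
        return inv

    rng = random.Random(42)
    bases = [2, 3]                      # one prime base per dimension
    perms = {}
    for b in bases:
        tail = list(range(1, b))
        rng.shuffle(tail)
        perms[b] = [0] + tail           # keep perm(0) = 0 for the digit tail

    pts = [tuple(scrambled_radical_inverse(i, b, perms[b]) for b in bases)
           for i in range(1, 6)]        # first points of a scrambled 2D Halton

Each fresh set of permutations yields another member of the scrambled family, which is what makes independent error estimates (and parallel streams) possible.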

Statically pipelined processors offer a new way to improve performance beyond that of a traditional in-order pipeline while simultaneously reducing energy usage by enabling the compiler to control more fine-grained details of the program execution. This paper describes how a compiler can exploit the features of the static pipeline architecture to apply optimizations to transfers of control that are not possible on a conventional architecture. The optimizations presented in this paper include hoisting the target address calculations for branches, jumps, and calls out of loops, performing branch chaining between calls and jumps, hoisting the setting of return addresses out of loops, and exploiting conditional calls and returns. The benefits of performing these transfer-of-control optimizations include a 6.8% reduction in execution time and a 3.6% decrease in estimated energy usage.

We have considered the problem of tracking and recognition using a three dimensional representation of human faces. First we present a review of the research in the tracking and recognition fields, including a list of several commercially available face tracking and recognition systems. Next, two algorithms are described: one for tracking faces from observed images and one for recognition of faces from observed geometries. The tracking algorithm uses the 3D shape and texture of a human face to estimate the changing position and orientation of a real face in a video image sequence. The recognition algorithm uses principal component analysis (PCA) of range images generated from the 3D shape of a human face to create a database of low-dimensional face representations for efficient recognition. Range images are robust to illumination and texture variations and thus avoid some of the current limitations in face recognition.
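A minimal sketch of the recognition side (random arrays stand in for range images; this is the generic PCA pipeline, not the thesis's exact system): flatten each range image, project onto the top principal components, and match a probe to the nearest gallery face in the low-dimensional space.

    import numpy as np

    rng = np.random.default_rng(0)
    gallery = rng.random((20, 64 * 64))            # 20 range images, flattened

    mean = gallery.mean(axis=0)
    _, _, vt = np.linalg.svd(gallery - mean, full_matrices=False)
    basis = vt[:10]                                # top 10 principal components

    def embed(x):
        return basis @ (x - mean)

    codes = np.array([embed(g) for g in gallery])
    probe = gallery[7] + 0.01 * rng.random(64 * 64)    # noisy view of face 7
    print(np.argmin(np.linalg.norm(codes - embed(probe), axis=1)))  # expect 7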

Memory-hungry applications consistently keep their memory requirement curves ahead of the growth of DRAM capacity in modern computer systems. Such applications quickly start paging to swap space on the local disk, which brings down their performance, an old and ongoing battle between the disk and RAM in the memory hierarchy. This thesis presents a practical low-cost solution to this important performance problem. We give the design, implementation, and evaluation of Anemone - an Adaptive NEtwork MemOry engiNE. Anemone pools together the memory resources of many machines in a clustered network of computers. It then presents an interface to client machines in order to use the collective memory pool in a virtualized manner, providing potentially unlimited amounts of memory to memory-hungry high-performance applications. Using real applications like the ns-2 simulator, the ray-tracing program POV-Ray, and quicksort, disk-based page-fault latencies average 6.5 milliseconds, whereas Anemone provides an average latency of 700.2 microseconds, 9.2 times faster than using the disk. Compared to disk-based paging, our results indicate that Anemone reduces the execution time of single memory-bound processes by half. Additionally, Anemone reduces the execution times of multiple, concurrent memory-bound processes by a factor of 10 on average. Another key advantage of Anemone is that this performance improvement is achieved with no modifications to the client's operating system or the memory-bound applications, due to the use of a novel NFS-based low-latency remote paging mechanism.

Date Issued

2005

Identifier

FSU_migr_etd-4038

Format

Thesis

Title

Computational Transformation Between Different Symbolic Representations of BK Products of Fuzzy Relations.

Fuzzy relational calculi based on BK products of relations have representational and computational means for handling both concrete numerical representations of relations and symbolic manipulation of relations. The BK calculus of relations together with fast fuzzy relational algorithms allows concrete numerical representations of relations to be used extensively in applications. On the other hand, when enriched by relational inequalities like the BK Bootstrap or combined with other theories such as generalized morphisms, high-level symbolic forms of relations can be used for symbolic manipulation of relations that have been abstracted from numerical representations. Furthermore, symbolic formulas of relations can be handled equationally. Equations over BK products can characterize relational properties in a universal way. The research in this dissertation focuses on symbolic manipulations of BK products of fuzzy relations. We have developed, as a proof of concept, an automated tool that works with various representational forms of relations and facilitates transformations among them. The major contribution that this system brings to the field is that it provides a link between numerical and symbolic representations of relations, which can substantially extend the applicability of fuzzy relations. The pilot implementation of the tool consists of two systems. At the high level of general fuzzy logic systems, the first system transforms BK-product formulas syntactically between three notational forms: matrix form, set form, and predicate form. We have defined for each kind of BK-product representation a tree-type data structure, called a notational tree. All transformations are then carried out by a set of transformational algorithms over the notational trees of the BK representational forms. At the lower level of t-norm based residuated logic systems (BL logic), we have developed a second system, a term rewriting theorem prover/checker that validates and generates proofs for theorems of BK relational calculi. For each given theorem, a derivation tree is first generated; a match of any node in that tree with the theorem's conclusion validates it. We proposed a generate-and-match algorithm based on a breadth-first-search navigation process through theorems' derivation trees which guarantees a loop-free result for any derivable theorem (in a given theory). The original version of this algorithm has been improved further by applying a human-like proof strategy, which we call the distance-first-search and optimized distance-first-search algorithms. These optimized versions improve the performance of our system significantly, reducing both the number of logical inferences and the CPU time required. The experiments also showed that proofs in BK calculi are significantly shorter than in the predicate calculus of BL logic. Interestingly enough, proofs generated by the tool are the same as those done by hand. This illustrates the success of our human-like strategy.
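For concreteness, here is a small numerical sketch of two BK products on finite fuzzy relations stored as matrices, using the Goedel implication; this is one standard residuated choice, whereas the dissertation's systems work over general BL logics.

    import numpy as np

    def goedel_impl(a, b):                  # a -> b = 1 if a <= b, else b
        return np.where(a <= b, 1.0, b)

    def circ(R, S):                         # sup-min composition R o S
        return np.max(np.minimum(R[:, :, None], S[None, :, :]), axis=1)

    def subproduct(R, S):                   # (R <| S)(x,z) = inf_y R(x,y) -> S(y,z)
        return np.min(goedel_impl(R[:, :, None], S[None, :, :]), axis=1)

    R = np.array([[1.0, 0.4], [0.2, 0.9]])
    S = np.array([[0.7, 0.1], [0.5, 1.0]])
    print(circ(R, S))
    print(subproduct(R, S))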

Instruction fetch is an important pipeline stage for embedded processors, as it can consume a significant fraction of the total processor energy. This dissertation describes the design and implementation of two new fetch enhancements that seek to improve overall energy efficiency without any performance tradeoff. Instruction packing is a combined architectural/compiler technique that leverages code redundancy to reduce energy consumption, code size, and execution time. Frequently occurring instructions are placed into a small instruction register file (IRF), which requires less energy to access than an L1 instruction cache. Multiple instruction register references are placed in a single packed instruction, leading to reduced cache accesses and static code size. Hardware register windows and compiler optimizations tailored for instruction packing yield greater reductions in fetch energy consumption and static code size. The Lookahead Instruction Fetch Engine (LIFE) is a microarchitectural technique designed to exploit the regularity present in instruction fetch. The nucleus of LIFE is the Tagless Hit Instruction Cache (TH-IC), a small cache that assists the instruction fetch pipeline stage as it efficiently captures information about both sequential and non-sequential transitions between instructions. TH-IC provides considerable savings in fetch energy without incurring the performance penalty normally associated with small filter instruction caches. Furthermore, TH-IC makes the common case (a cache hit) more energy efficient by making the tag check unnecessary. LIFE extends TH-IC by making use of advanced control flow metadata to further improve the utilization of fetch-associated structures such as the branch predictor, branch target buffer, and return address stack. LIFE enables significant reductions in total processor energy consumption with no impact on application execution times, even for the most aggressive power-saving configuration. Both IRF and LIFE (including TH-IC) improve overall processor efficiency by actively recognizing and exploiting the common properties of instruction fetch.

With the advancement of information and communication technologies, networked computing devices have been adopted to address real-world challenges due to their efficiency and programmability while maintaining scalability, sustainability, and resilience. As a result, computing and communication technologies have been integrated into critical infrastructures and other physical processes. Cyber physical systems (CPS) integrate computation and the physical processes of critical infrastructure systems. Historically, these systems mostly relied on proprietary technologies and were built as stand-alone systems in physically secure locations. However, the situation has changed considerably in recent years. Commodity hardware, software, and standardized communication technologies are used in CPS to enhance their connectivity, provide better accessibility to customers and maintenance personnel, and improve the overall efficiency and robustness of their operations. Unfortunately, increased connectivity, efficiency, and openness have also significantly increased the vulnerability of CPS to cyber attacks. These vulnerabilities could allow attackers to alter a system's behavior and cause irreversible physical damage, or even worse, cyber-induced disasters. However, existing security measures cannot be effectively applied to CPS directly because they are mostly designed for cyber-only systems. Thus, new approaches to preventing cyber physical system disasters are essential. We recognize the very different characteristics of cyber and physical components in CPS: cyber components are flexible with large attack surfaces, while physical components are inflexible and relatively simple with very small attack surfaces. This research focuses on the interfaces where cyber and physical components interact. Securing cyber-physical interfaces will complete a layer-based defense strategy in the "Defense in Depth" framework. In this research we propose Trusted Security Modules (TSM) as a systematic solution to guarantee the prevention of cyber-induced physical damage even when operating systems and controllers are compromised. TSMs will be placed at the interface between cyber and physical components, adapting existing integrity-enforcing mechanisms such as the Trusted Platform Module (static integrity) and Control-Flow Integrity (dynamic integrity) to enhance their own security and integrity. In this dissertation we introduce the general design and a number of ways to implement the TSM. We also show the behavior of the TSM with a working prototype and simulation.

The nature of embedded systems development places a great deal of importance on meeting strict requirements in areas such as static code size, power consumption, and execution time. Due to this, embedded developers frequently generate and tune assembly code for applications by hand. The phase ordering problem is a well-known problem affecting the design of optimizing compilers. VISTA is an optimizing compiler framework that employs iteration of optimization phase sequences and a genetic algorithm search for effective phase sequences in an effort to minimize the effects of the phase ordering problem. Hand-generated code is susceptible to a problem analogous to phase ordering, but there has been little research in mitigating its effect on the quality of the generated code. One approach for adjusting the phase ordering of such previously optimized code is to de-optimize the code by undoing the potential work done by prior optimization phases. This thesis presents an extension of the VISTA framework for investigating the effect and potential benefit of performing de-optimization before re-optimizing assembly code. The construction of a translator tool suite for the purpose of converting assembly code to the VISTA RTL input format is discussed. The design and implementation of algorithms for de-optimization of both loop-invariant code motion and register allocation are presented, along with the results of experiments on de-optimization and re-optimization of previously generated assembly code.

Date Issued

2004

Identifier

FSU_migr_etd-4039

Format

Thesis

Title

Reduced Order Modeling for a Nonlocal Approach to Anomalous Diffusion Problems.

With the recent advances in using nonlocal approaches to approximate traditional partial differential equations (PDEs), a number of new research avenues have been opened that warrant further study. One such path, which has yet to be explored, is using reduced order techniques to solve nonlocal problems. Due to the interactions between the discretized nodes or particles inherent to a nonlocal model, the system sparsity is often significantly less than that of its PDE counterpart. Coupling a reduced order approach to a nonlocal problem would ideally reduce the computational cost without sacrificing accuracy. This would allow for the use of a nonlocal approach in large parameter studies or uncertainty quantification. Additionally, because nonlocal problems inherently have no spatial derivatives, solutions with jump discontinuities are permitted. This work seeks to apply reduced order nonlocal concepts to a variety of problem situations including anomalous diffusion, advection, the advection-diffusion equation, and solutions with spatial discontinuities. The goal is to show that one can use an accurate reduced order approximation to formulate a solution at a fraction of the cost of traditional techniques.
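For context, nonlocal models of this type typically replace the Laplacian with an integral operator acting over a finite horizon \delta; the generic form below is a sketch, and the kernel used in this work may differ:

    \mathcal{L}_\delta u(x) = \int_{B_\delta(x)} \big( u(y) - u(x) \big) \, \gamma_\delta(x, y) \, dy

where \gamma_\delta \ge 0 is supported on the ball B_\delta(x). For suitable kernels, \mathcal{L}_\delta u \to \Delta u as \delta \to 0, yet no spatial derivatives of u appear, which is why jump discontinuities remain admissible and why the discretized operator couples many node pairs, reducing sparsity.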

Date Issued

2016

Identifier

FSU_2016SP_Witman_fsu_0071E_13130

Format

Thesis

Title

Reducing Power Involved in Pipeline Scheduling Through the Use of an Incremental Hybrid Scheduler.

When considering computer processors, there is a trade-off between performance and power; improved performance does not typically come without an increase in power. Similarly, a reduction in power often means a reduction in performance. This paper proposes an approach that realizes both high performance and low power levels. The approach involves an incremental hybrid scheduler. The hybrid scheduler takes advantage of the already low-power Tagless-Hit Instruction Cache (TH-IC) by caching instruction schedules. This eliminates many unnecessary creations of redundant schedules. Additionally, the hybrid scheduler essentially runs in a low-power in-order mode and switches to a more aggressive scheduling mode only in code hot spots. The hybrid scheduler offers a simple approach to gaining improvements in performance compared to an in-order processor while still maintaining a low power level.

Date Issued

2011

Identifier

FSU_migr_etd-4092

Format

Thesis

Title

Improvement of a Tracer Correlation Problem with a Non-Iterative Limiter.

A functional relation between two chemical species puts observational constraints on attempts to model the atmosphere. For example, adequate representation of these relations is important when modeling the depletion of stratospheric ozone by nitrous oxide. Previous work has shown a case where a linear functional relation is not preserved in the tracer transport scheme of the High Order Method Modeling Environment (HOMME), which is the spectral element dynamical core used by the Community Atmosphere Model (CAM). Applying a certain simple tracer chemistry reaction before each model time step can test whether the scheme actually preserves linear tracer correlations (LCs) to machine precision. Using this method, we confirm previous results that the implementation of the default shape-preserving filter used in HOMME's transport scheme does not preserve LCs. However, since we prove that this limiter, along with a few other limiter algorithms, does in fact preserve LCs in exact arithmetic, we suggest that these limiter algorithms exacerbate the growth of roundoff error in elements where tracers have very different magnitudes. Nevertheless, we put forth a limiting scheme that improves the tracer correlation problem. We also derive another new limiter that relies on multiplicative rescaling of nodal values within a given element. This algorithm does not rely on iterations for convergence and thus has the advantage of being more computationally efficient than the current default CAM-SE limiter. Results also show that the default limiter does not always introduce the lowest amount of L₂ error, which contradicts its purpose, since it was derived to minimize error in the L₂ norm.
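To illustrate the flavor of a non-iterative multiplicative limiter (a generic linear rescaling about the element mean, offered as a sketch and not necessarily the exact algorithm derived here):

    import numpy as np

    def rescale_limiter(q, w, q_min, q_max):
        """q: nodal values; w: positive quadrature weights summing to 1.
        Assumes the weighted mean itself lies within [q_min, q_max]."""
        mean = np.dot(w, q)
        theta = 1.0
        if q.max() > q_max:
            theta = min(theta, (q_max - mean) / (q.max() - mean))
        if q.min() < q_min:
            theta = min(theta, (q_min - mean) / (q.min() - mean))
        return mean + theta * (q - mean)    # bounds enforced, mean preserved

    w = np.full(4, 0.25)
    q = np.array([-0.1, 0.2, 0.5, 1.2])
    print(rescale_limiter(q, w, 0.0, 1.0))  # in bounds, same weighted mean

Because the update is a single multiplicative shrink of the deviations toward the element mean, no iteration is needed, which is the computational advantage claimed for the new limiter.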

Within the last decade, commodity Graphics Processing Units (GPUs) specialized for 2D and 3D scene rendering have seen an explosive growth in processing power compared to their general purpose counterpart, the CPU. Currently capable of near-teraflop speeds and sporting gigabytes of on-board memory, GPUs have transformed from accessory video game hardware to truly general purpose computational coprocessors. One of the first applications of GPUs for general computing was 2D Voronoi tessellation--partitioning a 2D domain into regions based on a set of seed points, such that each region contains all points closer to one seed than to any other. Although the topic has been revisited many times, related work has failed to consider GPU-based production of centroidal Voronoi tessellations, where seed points are also the regional centers of mass. This thesis presents a first look at centroidal Voronoi tessellation computed entirely on the GPU. An extension to centroidal Voronoi tessellation is also considered for partitioning surfaces (2-manifolds) of the form $f(u,v) \rightarrow (u, v, z(u,v))$ using Euclidean-based metrics. To complete these tasks, a highly efficient flooding algorithm is used for the Voronoi tessellation, while a regularized sampling approach is employed to compute the centroids of the Voronoi regions. Seeds are updated by a deterministic Lloyd's method.
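A CPU sketch of the deterministic Lloyd iteration at the core of the method (grid sampling stands in here for the thesis's GPU flooding and regularized sampling steps): assign samples to the nearest seed, move each seed to its region's center of mass, repeat.

    import numpy as np

    rng = np.random.default_rng(1)
    seeds = rng.random((8, 2))                   # 8 seeds in the unit square
    g = np.linspace(0.0, 1.0, 128)
    samples = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)

    for _ in range(50):                          # Lloyd iterations
        # Voronoi assignment: nearest seed for every sample point.
        d = np.linalg.norm(samples[:, None, :] - seeds[None, :, :], axis=2)
        owner = d.argmin(axis=1)
        # Centroid update: each seed moves to its region's mean.
        for k in range(len(seeds)):
            region = samples[owner == k]
            if len(region):
                seeds[k] = region.mean(axis=0)

    print(seeds)                                 # near-centroidal seed layout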

Date Issued

2009

Identifier

FSU_migr_etd-3606

Format

Thesis

Title

Authority Distribution in a Proxy-Based Massively Multiplayer Game Architecture.

This thesis presents a new architecture for improving the interactivity and responsiveness of massively multiplayer online games. In order to aid in the quick delivery of game updates generated by a game client, this thesis presents a stateless game communications proxy. This communications proxy allows for quick routing of updates from the game clients to the server and quick routing of updates from client to client. The communications proxy maintains very little game-state information; it stores only a list of all clients' physical locations and of the entities within the area of interest of each connected client. The communications proxy is not specific to any game and can be used to improve communications in any type of online multiplayer game. In addition to the communications proxy, this thesis presents new methods for distributing the authority to propagate game-state updates between the client and server in order to create a more responsive game. This distribution of authority away from the game server can reduce the propagation delay of game updates and reduce the computing load on the server if used in conjunction with the communications proxy. Some of the methods for authority distribution described in this thesis have been implemented in a multiplayer game called RPGQuest to illustrate their advantages and disadvantages.

In numerous applications involving high dimensional data, certain subspace techniques such as principal components analysis (PCA) may be utilized in feature extraction. Often, PCA can reduce the dimensionality while retaining most of the significant information of the original data. This can be beneficial not only for representing the data more compactly (compression), but also for transforming the data into a more useful form for applications involving feature extraction and classification. Relatively recent developments extend conventional principal components analysis to newer variants which appear particularly useful in computer vision and image applications: (1) two dimensional PCA ("2D PCA"), and (2) bidirectional or bilateral two dimensional PCA ("B2DPCA", "Bi2DPCA", or "(2D)² PCA"). The latter category includes an iterative version which is an example of coupled subspace analysis or "CSA"; the non-iterative version is known as projective Bi2DPCA. In this thesis, these PCA variants are considered as special cases of the more general CSA. The theoretical advantages of 2D PCA and bidirectional PCA over conventional PCA should arise from the fact that significant information about the spatial relationship between image pixels may be discarded in conventional PCA as the image is represented by a large column vector, whereas 2D PCA and bidirectional PCA techniques can preserve more of this information by representing the image as a matrix rather than a long vector. The problems of small sample size and the curse of dimensionality are also alleviated to some extent, particularly in the cases of B2DPCA and iterated CSA. Some of these PCA variants have been proposed in various image recognition applications recently, including biometric identification using iris texture, face images, and palm prints, and the categorization of wood species based on wood grain texture, to name a few examples. So, while much focus has been placed on feature extraction methods such as Gabor wavelets or similar techniques for applications such as iris recognition, some subspace techniques, including these PCA variants, have shown promise in conjunction with image preprocessing techniques for the removal of uneven background illumination and contrast enhancement. In this thesis, the image application of biometric iris recognition is chosen as the means of evaluating the potential advantages of these newer PCA variants, including CSA, in the context of feature extraction and classification. The rich texture information of these images, and the utilization of effective image registration techniques, yields images which are well suited for this purpose. As the primary focus of this thesis, these PCA variants are evaluated using a closed-set identification test mode and are compared using a Euclidean distance single nearest neighbor classifier; images are preprocessed using top-hat filtering and contrast-limited adaptive histogram equalization (CLAHE). The use of multiple test (probe) images is considered, and the impact on performance is also considered for training image sets with 2, 3, and 4 sample images per class. Concurrently, the application of iris image recognition is addressed in detail. Other applications for which these PCA variants and preprocessing techniques may be beneficial are discussed in the concluding section.
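A minimal sketch of the 2D PCA variant (random matrices stand in for registered iris images; the thesis's preprocessing and B2DPCA/CSA extensions are not reproduced): the image scatter matrix is built from image matrices directly, and each image is projected onto its top eigenvectors without flattening.

    import numpy as np

    rng = np.random.default_rng(0)
    images = rng.random((30, 32, 32))          # 30 images of size 32 x 32

    mean_img = images.mean(axis=0)
    # Image covariance (scatter) matrix: G = (1/M) sum (A - Abar)^T (A - Abar)
    G = sum((A - mean_img).T @ (A - mean_img) for A in images) / len(images)

    vals, vecs = np.linalg.eigh(G)             # eigenvalues in ascending order
    X = vecs[:, -5:]                           # top 5 eigenvectors (32 x 5)

    features = images @ X                      # each image -> 32 x 5 features
    print(features.shape)                      # (30, 32, 5)

Bidirectional variants apply a second, row-side projection Z^T A X, shrinking the feature matrix in both directions.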

In this dissertation we study the security and survivability of wireless mobile network systems in two distinct threat models: the Byzantine threat model and the selfish node threat model. Wireless mobile networks are collections of self-organizing mobile nodes with dynamic topologies and no fixed infrastructure. Because of their dynamic ad hoc nature, these networks are particularly vulnerable to security threats. The security of such systems is, to a large extent, based on trust associations. There are several ways in which trust can be supported in a network system. The way we adopt is to establish a secure public key management infrastructure (PKI). This enables basic cryptographic functionalities, such as integrity, privacy, etc. However, due to the dynamic character of a wireless mobile network and its ad hoc topology changes, the trust associations cannot depend on any pre-established trust relations and must support a flexible, uncertain, and incomplete trust model. One of our main goals in this dissertation is to analyze the distributed nature of trust in wireless mobile networks and to consider approaches that manage trust based only on locally available information. In our analysis of this problem we use the traditional Byzantine attack model. After reviewing the trust models proposed in the literature, we propose an extension that supports a distributed trust management infrastructure. In this model, trust is distributed horizontally via multiple disjoint trust flows. Compared to the traditional hierarchical trust distribution, our approach is appropriate for dynamic wireless systems for which there are no central trust authorities. A second goal is to manage trust based on the good behavior of nodes. Mobile wireless networks rely heavily on node collaboration. However, since the nodes are often battery powered, they may behave selfishly to preserve power. The threat model for this application is restricted to selfish node attacks. We present a simple and efficient reputation system, the Locally Aware Reputation System (LARS), that mitigates selfish node behavior. We explore methods that stimulate node cooperation in mobile wireless networks, and analyze the reputation systems proposed in the literature. The performance of LARS is evaluated in terms of its packet delivery ratio, its end-to-end delay, and its overhead, and compared to the other reputation systems proposed in the literature. Finally, to enhance the security and survivability of wireless mobile networks against selfish threats, we propose a mechanism that traces selfish node behavior. Chapters 4, 5, and 6 contain the main contributions of the dissertation.

The problems inherent in providing security for network systems are related to the openness and design of the network architecture. Typically, network security is achieved through the use of monitoring tools based on pattern recognition or behavioral analysis. One of the tools based on pattern recognition is SNORT. SNORT attempts to protect networks by alerting system administrators when received network packets match predetermined signatures contained in the SNORT tool. Unfortunately, by the very nature of this design, SNORT operates at the packet data level and has no concept of the specific properties of the network it is trying to protect. This thesis provides the design of an alert management tool which, taking SNORT alert signatures as inputs and using a knowledge base of intruders and the local network systems, attempts to reduce the false-positive and false-negative alerts sent to the system administrator. The major drawback to SNORT is that many false alerts are sent from the SNORT engine, which must then be sifted through and classified by system administrators. This thesis proposes a tool which should lessen this stress and considerably reduce the workload of the human beings who must classify alerts.

The digital revolution, fired by the development of information and communication technologies, has fundamentally changed the way we think, behave, communicate, work, and earn a livelihood (the World Summit on the Information Society). These technologies have affected all aspects of our society and economy. However, the developments of the Information Society present us not only with new benefits and opportunities, but also with new challenges. Information security is one of these challenges, and nowadays information security mechanisms are inevitable components of virtually every information system. Information authentication is one of the basic information security goals, and it addresses the issues of source corroboration and improper or unauthorized modification of data. More specifically, data integrity is the property that the data has not been changed in an unauthorized manner since its creation, transmission, or storage. Data origin authentication, or message authentication, is the property whereby a party can be corroborated as the source of the data. Usually, message authentication is achieved by appending an authentication tag or a digital signature to the message. The authentication tag (resp., digital signature) is computed in such a way that only an entity in possession of the secret key can produce it, and it is used by the verifier to determine the authenticity of the message. During this procedure, the message is considered to be an atomic object in the following sense: the verifier needs the complete message in order to check its validity. Presented with the authentication tag (resp., digital signature) and an incomplete message, the verifier cannot determine whether the presented incomplete message is authentic or not. We consider a more general authentication model, where the verifier is able to check the validity of incomplete messages. In particular, we analyze the cases of erasure-tolerant information authentication and stream authentication. Our model of erasure-tolerant information authentication assumes that a limited number of the message "letters" can be lost during transmission. Nevertheless, the verifier should still be able to check the authenticity of the received incomplete message. We provide answers to several fundamental questions in this model (e.g., lower bounds on the deception probability, distance properties, optimal constructions, etc.), and we propose some constructions of erasure-tolerant authentication codes. Streams of data are bit sequences of finite, but a priori unknown, length that a sender sends to one or more recipients; they occur naturally when on-line processing is required. In this case, the receiver should be able to verify the authenticity of a prefix of the stream, that is, the part of the stream that has been received so far. We provide efficient and provably secure schemes for both unicast and multicast stream authentication. The security proof of one of the proposed multicast stream authentication schemes assumes that the underlying block cipher is a related-key secure pseudorandom permutation. Therefore, we also study the resistance of AES (Advanced Encryption Standard) to related-key differential attacks.

This thesis describes the implementation of a fast, dynamic, approximate nearest-neighbor search algorithm that works well in fixed (low) dimensions. The implementation is competitive with the best approximate nearest neighbor searching codes available on the web, especially for creating approximate k-nearest neighbor graphs of a point cloud. An extensive C++ library implementing the research presented here has been built. It can be found at: http://www.compgeom.com/~stann