This paper presents novel statistical algorithms for protecting the iTrust information retrieval network against malicious attacks. In iTrust, metadata describing documents, and requests containing keywords, are randomly distributed to multiple participating nodes. The nodes that receive the requests try to match the keywords in the requests with the metadata they hold. If a node finds a match, the matching node returns the URL of the associated information to the requesting node. The requesting node then uses the URL to retrieve the information from the source node. The novel detection algorithm determines empirically the probabilities of the specific number of matches based on the number of responses that the requesting node receives. It also calculates the analytical probabilities of the specific numbers of matches. It compares the observed and the analytical probabilities to estimate the proportion of subverted or non-operational nodes in the iTrust network using a window-based method and the chi-squared statistic. If the detection algorithm determines that some of the nodes in the iTrust network are subverted or non-operational, then the novel defensive adaptation algorithm increases the number of nodes to which the requests are distributed to maintain the same probability of a match when some of the nodes are subverted or non-operational as compared to when all of the nodes are operational. Experimental results substantiate the effectiveness of the detection and defensive adaptation algorithms for protecting the iTrust information retrieval network against malicious attacks.

Keys play a fundamental role in all data models. They allow database systems to uniquely identify data items, and therefore, promote efficient data processing in many applications. Due to this, support is required to discover keys. These include keys that are semantically meaningful for the application domain, or are satisfied by a given database. We study the discovery of keys from SQL tables. We investigate the structural and computational properties of Armstrong tables for sets of SQL keys. Inspections of Armstrong tables enable data engineers to consolidate their understanding of semantically meaningful keys, and to communicate this understanding to other stake-holders. The stake-holders may want to make changes to the tables or provide entirely different tables to communicate their views to the data engineers. For such a purpose, we propose data mining algorithms that discover keys from a given SQL table. We combine the key mining algorithms with Armstrong table computations to generate informative Armstrong tables, that is, key-preserving semantic samples of existing SQL tables. Finally, we define formal measures to assess the distance between sets of SQL keys. The measures can be applied to validate the usefulness of Armstrong tables, and to automate the marking and feedback of non-multiple choice questions in database courses.

In recent years, the skyline query paradigm has been established as a reliable method for database query personalization. While early efficiency problems have been solved by sophisticated algorithms and advanced indexing, new challenges in skyline retrieval effectiveness continuously arise. In particular, the rise of the Semantic Web and linked open data leads to personalization issues where skyline queries cannot be applied easily. We addressed the special challenges presented by linked open data in previous work; and now further extend this work, with a heuristic workflow to boost efficiency. This is necessary; because the new view on linked open data dominance has serious implications for the efficiency of the actual skyline computation, since transitivity of the dominance relationships is no longer granted. Therefore, our contributions in this paper can be summarized as: we present an intuitive skyline query paradigm to deal with linked open data; we provide an effective dominance definition, and establish its theoretical properties; we develop innovative skyline algorithms to deal with the resulting challenges; and we design efficient heuristics for the case of predicate equivalences that may often happen in linked open data. We extensively evaluate our new algorithms with respect to performance, and the enriched skyline semantics.

Recently, shape analysis of human organs has achieved much attention, owing to its potential to localize structural abnormalities. For a group-wise shape analysis, it is important to accurately restore the shape of a target structure in each subject and to build the inter-subject shape correspondences. To accomplish this, we propose a shape modeling method based on the Laplacian deformation framework. We deform a template model of a target structure in the segmented images while restoring subject-specific shape features by using Laplacian surface representation. In order to build the inter-subject shape correspondences, we implemented the progressive weighting scheme for adaptively controlling the rigidity parameter of the deformable model. This weighting scheme helps to preserve the relative distance between each point in the template model as much as possible during model deformation. This area-preserving deformation allows each point of the template model to be located at an anatomically consistent position in the target structure. Another advantage of our method is its application to human organs of non-spherical topology. We present the experiments for evaluating the robustness of shape modeling against large variations in shape and size with the synthetic sets of the second cervical vertebrae (C2), which has a complex shape with holes.

Vehicular Ad Hoc Networks (VANETs) are highly dynamic and unstable due to the heterogeneous nature of the communications, intermittent links, high mobility and constant changes in network topology. Currently, some of the most important challenges of VANETs are the scalability problem, congestion, unnecessary duplication of data, low delivery rate, communication delay and temporary fragmentation. Many recent studies have focused on a hybrid mechanism to disseminate information implementing the store and forward technique in sparse vehicular networks, as well as clustering techniques to avoid the scalability problem in dense vehicular networks. However, the selection of intermediate nodes in the store and forward technique, the stability of the clusters and the unnecessary duplication of data remain as central challenges. Therefore, we propose an adaptable destination-based dissemination algorithm (DBDA) using the publish/subscribe model. DBDA considers the destination of the vehicles as an important parameter to form the clusters and select the intermediate nodes, contrary to other proposed solutions. Additionally, DBDA implements a publish/subscribe model. This model provides a context-aware service to select the intermediate nodes according to the importance of the message, destination, current location and speed of the vehicles; as a result, it avoids delay, congestion, unnecessary duplications and low delivery rate.