The term “Software Chrestomathy” is defined as a collection of software systems meant to be useful in learning about or gaining insight into software languages, software technologies, software concepts, programming, and software engineering. The 101companies software chrestomathy is a community project with the attributes of a Research 2.0 infrastructure for various stakeholders in software language and technology communities. The core of 101companies combines a semantic wiki with confederated open source repositories. We designed and developed an integrated ontology-based knowledge base about software languages and technologies. The knowledge is created by the community of contributors and supported with a running example and structured documentation. The complete ecosystem is exposed using Linked Data principles and equipped with additional metadata about individual artifacts. Within the context of the software chrestomathy, we explored a new type of software architecture, the linguistic architecture, which targets the language and technology relationships within a software product and is based on megamodels. Our approach to documenting software systems is highly structured and makes use of the concepts of the newly developed megamodeling language MegaL. We “connect” an emerging ontology with the megamodeling artifacts to raise the cognitive value of the linguistic architecture.

This thesis proposes the use of MSR (Mining Software Repositories) techniques to identify software developers with exclusive expertise about specific APIs and programming domains in software repositories. A pilot Tool for finding such “Islands of Knowledge” in Node.js projects is presented and applied in a case study to the 180 most popular npm packages. It is found that on average each package has 2.3 Islands of Knowledge, which is possibly explained by the finding that npm packages tend to have only one main contributor. In a survey, the maintainers of 50 packages are contacted and asked for their opinions on the results produced by the Tool. Together with their responses, this thesis reports on experiences gained with the pilot Tool and how future iterations could produce even more accurate statements about the distribution of programming expertise in developer teams.
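
A minimal sketch of the kind of repository mining involved, assuming a local Git clone of a package; the function names and the "main contributor share" heuristic are illustrative assumptions, not the thesis' actual Tool:

# Hypothetical sketch: estimate how concentrated contributions are in a
# package repository by counting commits per author e-mail address.
import subprocess
from collections import Counter

def commit_counts(repo_path):
    """Return a Counter mapping author e-mail -> number of commits."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--pretty=format:%ae"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    return Counter(log)

def main_contributor_share(repo_path):
    counts = commit_counts(repo_path)
    total = sum(counts.values())
    return counts.most_common(1)[0][1] / total if total else 0.0

# A share close to 1.0 suggests a single main contributor, which makes
# "Islands of Knowledge" more likely.
# print(main_contributor_share("/path/to/npm-package-repo"))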

Using semantic data from within general-purpose programming languages does not provide the unified experience one would want for such applications. Static error checking is lacking, especially with regard to the static typing of the data. Based on the previous work on λ-DL, which integrates semantic queries and concepts as types into a typed λ-calculus, this work takes these ideas a step further and melds them into a real-world programming language. This thesis explores how λ-DL's features can be extended and integrated into an existing language, investigates an appropriate extension mechanism, and produces Semantics4J, a JastAdd-based Java language extension for type-safe programming with OWL semantic data, together with examples of its usage.

One of the main goals of the artificial intelligence community is to create machines able to reason with dynamically changing knowledge. To achieve this goal, a multitude of different problems have to be solved, many of which have been addressed in the various sub-disciplines of artificial intelligence, such as automated reasoning and machine learning. The thesis at hand focuses on the automated reasoning aspects of these problems and addresses two of the problems which have to be overcome to reach the aforementioned goal, namely 1. the fact that reasoning in logical knowledge bases is intractable and 2. the fact that applying changes to formalized knowledge can easily introduce inconsistencies, which leads to unwanted results in most scenarios.
To ease the intractability of logical reasoning, I suggest adapting a technique called knowledge compilation, known from propositional logic, to description logic knowledge bases. The basic idea of this technique is to compile the given knowledge base into a normal form which allows queries to be answered efficiently. This compilation step is very expensive, but it has to be performed only once, and as soon as its result is used to answer many queries, the expensive compilation step pays off. In the thesis at hand, I develop a normal form, called linkless normal form, suitable for knowledge compilation of description logic knowledge bases. From a computational point of view, the linkless normal form has very desirable properties, which are introduced in this thesis.
For the second problem, I focus on changes occurring on the instance level of description logic knowledge bases. I introduce three change operators relevant to these knowledge bases, namely deletion and insertion of assertions as well as repair of inconsistent instance bases. These change operators are defined such that in all three cases the resulting knowledge base is guaranteed to be consistent and the changes performed on the knowledge base are minimal. This allows us to preserve as much of the original knowledge base as possible. Furthermore, I show how these changes can be applied by using a transformation of the knowledge base.
For both issues, I suggest adapting techniques successfully used in other logics in order to obtain promising methods for description logic knowledge bases.

Reactive local algorithms are distributed algorithms which suit the needs of battery-powered, large-scale wireless ad hoc and sensor networks particularly well. By avoiding both unnecessary wireless transmissions and proactive maintenance of neighborhood tables (i.e., beaconing), such algorithms minimize communication load and overhead, and scale well with increasing network size. This way, resources such as bandwidth and energy are saved, and the probability of message collisions is reduced, which leads to an increase in the packet reception ratio and a decrease in latency.
Currently, the two main application areas of this algorithm type are geographic routing and topology control, in particular the construction of a node's adjacency in a connected, planar representation of the network graph. Geographic routing enables wireless multi-hop communication in the absence of any network infrastructure, based on geographic node positions. The construction of planar topologies is a requirement for efficient, local solutions for a variety of algorithmic problems.
This thesis contributes to reactive algorithm research in two ways, on an abstract level, as well as by the introduction of novel algorithms:
For the very first time, reactive algorithms are considered as a whole and as an individual research area. A comprehensive survey of the literature is given which lists and classifies known algorithms, techniques, and application domains. Moreover, the mathematical concept of O- and Omega-reactive local topology control is introduced. This concept unambiguously distinguishes reactive from conventional, beacon-based, topology control algorithms, serves as a taxonomy for existing and prospective algorithms of this kind, and facilitates in-depth investigations of the principal power of the reactive approach, beyond analysis of concrete algorithms.
Novel reactive local topology control and geographic routing algorithms are introduced under both the unit disk and quasi unit disk graph model. These algorithms compute a node's local view on connected, planar, constant stretch Euclidean and topological spanners of the underlying network graph and route messages reactively on these spanners while guaranteeing the messages' delivery. All previously known algorithms are either not reactive, or do not provide constant Euclidean and topological stretch properties. A particularly important partial result of this work is that the partial Delaunay triangulation (PDT) is a constant stretch Euclidean spanner for the unit disk graph.
To conclude, this thesis provides a basis for structured and substantial research in this field and shows the reactive approach to be a powerful tool for algorithm design in wireless ad hoc and sensor networking.

The publication of open source software aims to support the reuse, the distribution, and the general utilization of software. This can only be enabled by the correct usage of open source software licenses. Therefore, associations provide a multitude of open source software licenses with different features, from which a developer can choose in order to regulate the interaction with his software. Those licenses are the core theme of this thesis.
After an extensive literature review, two general research questions are elaborated in detail. First, an analysis of license usage in the open source sector is performed to identify current trends and statistics. This includes questions concerning the distribution of licenses, the consistency of their usage, their association over a period of time, and their publication.
Afterwards, the recommendation of licenses for specific projects is investigated. To this end, a recommendation logic is presented which takes several influences on a suitable license choice into account in order to generate the most applicable recommendation possible. Besides the exact license features from which a user can choose, different methods of ranking the recommendation results are proposed. This is based on the examination of the current situation of open source licensing and license suggestion. Finally, the logic is evaluated on the exemplary use case of the 101companies project.

This thesis analyzes the online attention towards scientists and their research topics. The studies compare the attention dynamics towards the winners of important scientific prizes with those towards scientists who did not receive a prize. Web signals such as Wikipedia page views, Wikipedia edits, and Google Trends were used as a proxy for online attention. One study focused on the time between the creation of the article about a scientist and the creation of the articles about their research topics. It was discovered that articles about research topics were created closer in time to the articles of prize winners than to those of scientists who did not receive a prize. One possible explanation could be that the research topics are more closely related to the scientist who received an award. This supports the hypothesis that scientists who received the prize introduced the topics to the public. Another study considered the public attention trends towards the related research topics before and after a page about a scientist was created. It was observed that after a page about a scientist was created, research topics of prize winners received more attention than the topics of scientists who did not receive a prize. Furthermore, it was demonstrated that Nobel Prize winners receive less attention before receiving the prize than the potential nominees from Thomson Reuters' list of Citation Laureates. Also, their popularity declines faster after they receive it. It was also shown that it is difficult to predict the prize winners based on the attention dynamics towards them.
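
One way to obtain such page-view signals today is the public Wikimedia REST API; the following sketch is illustrative only (the endpoint layout follows the public API documentation and is not necessarily the exact data source of the studies):

# Fetch daily Wikipedia page views for an article as a proxy for online attention.
import requests

def daily_views(article, start="20230101", end="20230131", project="en.wikipedia"):
    url = (
        "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article/"
        f"{project}/all-access/user/{article}/daily/{start}/{end}"
    )
    response = requests.get(url, headers={"User-Agent": "attention-study-sketch"})
    items = response.json().get("items", [])
    return {item["timestamp"]: item["views"] for item in items}

# Example: compare attention before and after a prize announcement.
# views = daily_views("Marie_Curie")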

Confidentiality, integrity, and availability are often listed as the three major requirements for achieving data security and are collectively referred to as the C-I-A triad. Confidentiality of data restricts data access to authorized parties only, integrity means that the data can only be modified by authorized parties, and availability states that the data must always be accessible when requested. Although these requirements are relevant for any computer system, they are especially important in open and distributed networks. Such networks are able to store large amounts of data without having a single entity in control of ensuring the data's security. These characteristics apply to the Semantic Web as well, since it aims at creating a global and decentralized network of machine-readable data. Ensuring the confidentiality, integrity, and availability of this data is therefore also important and must be achieved by corresponding security mechanisms. However, the current reference architecture of the Semantic Web does not yet define any particular security mechanism which implements these requirements. Instead, it only contains a rather abstract representation of security.
This thesis fills this gap by introducing three different security mechanisms, one for each of the identified security requirements: confidentiality, integrity, and availability of Semantic Web data. The mechanisms are not restricted to the very basics of implementing each requirement and provide additional features as well. Confidentiality is usually achieved with data encryption. This thesis not only provides an approach for encrypting Semantic Web data, it also allows searching the resulting ciphertext data without decrypting it first. Integrity of data is typically implemented with digital signatures. Instead of defining a single signature algorithm, this thesis defines a formal framework for signing arbitrary Semantic Web graphs which can be configured with various algorithms to achieve different features. Availability is generally supported by redundant data storage. This thesis expands the classical definition of availability to compliant availability, which means that data must be available only as long as the access request complies with a set of predefined policies. This requirement is implemented with a modular and extensible policy language for regulating information flow control. This thesis presents each of these three security mechanisms in detail, evaluates them against a set of requirements, and compares them with the state of the art and related work.
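
The generic idea behind graph signing, namely canonicalize, hash, then sign, can be sketched as follows with rdflib; this is a minimal illustration of that idea, not the configurable signature framework defined in the thesis:

# Hash an RDF graph over a canonical serialization so that blank-node
# relabelling or triple reordering does not change the digest.
import hashlib
from rdflib import Graph
from rdflib.compare import to_canonical_graph

def graph_digest(turtle_data):
    g = Graph().parse(data=turtle_data, format="turtle")
    nt = to_canonical_graph(g).serialize(format="nt")
    if isinstance(nt, bytes):  # older rdflib versions return bytes
        nt = nt.decode("utf-8")
    lines = sorted(line for line in nt.splitlines() if line.strip())
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()

# The digest would then be signed with any standard signature scheme
# (e.g., RSA or ECDSA) and published alongside the graph.
print(graph_digest("@prefix ex: <http://example.org/> . ex:alice ex:knows ex:bob ."))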

The provision of electronic participation services (e-participation) is a complex socio-technical undertaking that needs comprehensive design and implementation strategies. E-participation service providers, in most cases administrations and governments, struggle with changing requirements that demand more transparency, better connectivity, and increased collaboration among different actors. At the same time, less staff are available. As a result, recent research assesses only a minority of e-participation services as successful. The challenge is that the e-participation domain lacks comprehensive approaches to design and implement (e-)participation services. Enterprise Architecture (EA) frameworks have evolved in information systems research as an approach to guide the development of complex socio-technical systems. This approach can guide the design and implementation of such services if the collection of organisations with the commonly held goal of providing participation services is understood as an E-Participation Enterprise (EE). However, research and practice in the e-participation domain have not yet exploited EA frameworks. Consequently, the problem scope that motivates this dissertation is the existing gap in research on deploying EA frameworks in e-participation design and implementation. The research question that drives this research is: What methodical and technical guides do architecture frameworks provide that can be used to design and implement better and successful e-participation?
This dissertation presents a literature study showing that existing approaches have not yet covered the challenges of comprehensive e-participation design and implementation. Accordingly, the research moves on to investigate established EA frameworks such as the Zachman Framework, TOGAF, DoDAF, FEA, ARIS, and ArchiMate for their use. While the application of these frameworks in e-participation design and implementation is possible, an integrated approach has been lacking so far. The synthesis of the literature review and practical insights into the design and implementation of e-participation services from four projects shows the challenges of adapting architecture frameworks to this domain. However, the research also shows the potential of a combination of the different approaches. Consequently, the research moves on to develop the E-Participation Architecture Framework (EPART-Framework). To this end, the dissertation applies design science research, including literature review and action research. Two independent settings test an initial EPART-Framework version. The results yield the EPART-Framework presented in this dissertation.
The EPART-Framework comprises the EPART-Metamodel with six EPART-Viewpoints, which frame the stakeholder concerns: the Participation Scope, the Participant Viewpoint, the Participation Viewpoint, the Data & Information Viewpoint, the E-Participation Viewpoint, and the Implementation & Governance Viewpoint. The EPART-Method supports the stakeholders in designing the EE and implementing e-participation, and it stores its output in an architecture description and a solution repository. It consists of five consecutive phases accompanied by requirements management: Initiation, Design, Implementation and Preparation, Participation, and Evaluation. The EPART-Framework fills the gap between the e-participation domain and the enterprise architecture framework domain. The evaluation gives reasonable evidence that the framework is a valuable addition in academia and in practice to improve e-participation design and implementation. At the same time, it shows opportunities for future research to extend and advance the framework.

This thesis addresses the problem of terrain classification in unstructured outdoor environments. Terrain classification includes the detection of obstacles and passable areas as well as the analysis of ground surfaces. A 3D laser range finder is used as the primary sensor for perceiving the surroundings of the robot. First of all, a grid structure is introduced for data reduction. The chosen data representation allows for multi-sensor integration, e.g., cameras for color and texture information or further laser range finders for improved data density. Subsequently, features are computed for each terrain cell within the grid. Classification is performed with a Markov random field for context-sensitivity and to compensate for sensor noise and varying data density within the grid. A Gibbs sampler is used for optimization and is parallelized on the CPU and GPU in order to achieve real-time performance. Dynamic obstacles are detected and tracked using different state-of-the-art approaches. The resulting information - where other traffic participants move and are going to move to - is used to perform inference in regions where the terrain surface is partially or completely invisible to the sensors. Algorithms are tested and validated on different autonomous robot platforms, and the evaluation is carried out with human-annotated ground truth maps of millions of measurements. The terrain classification approach of this thesis proved reliable in all real-time scenarios and domains and yielded new insights. Furthermore, if combined with a path planning algorithm, it enables full autonomy for all kinds of wheeled outdoor robots in natural outdoor environments.
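
A toy version of such Gibbs sampling over a Potts-style Markov random field on a label grid might look as follows; this is illustrative only, and the thesis' real-time, GPU-parallelized implementation and its terrain features are far more involved:

# Gibbs sampling for grid labeling: each cell's label is resampled from its
# conditional distribution given unary evidence and neighbor agreement.
import numpy as np

def gibbs_labeling(unary, beta=1.0, sweeps=20, rng=None):
    """unary: (H, W, K) array of per-cell, per-class log-likelihoods.
    Returns an (H, W) label map sampled from the MRF posterior."""
    rng = rng or np.random.default_rng(0)
    H, W, K = unary.shape
    labels = unary.argmax(axis=2)  # initialize with the per-cell best class
    for _ in range(sweeps):
        for y in range(H):
            for x in range(W):
                logp = unary[y, x].copy()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < H and 0 <= nx < W:
                        logp[labels[ny, nx]] += beta  # reward agreeing neighbors
                p = np.exp(logp - logp.max())
                labels[y, x] = rng.choice(K, p=p / p.sum())
    return labels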

The identification of experts for a specific technology or framework produces a large benefit for collaborative software projects, as it reduces the communication overhead that is otherwise required to identify an expert on the fly. Therefore, this thesis describes a tool and an approach that can be used to identify experts with a specific skill set. It mainly focuses on the skills and expertise of developers that use the Django framework. By adding more rules to our framework, the approach could easily be extended to different technologies or frameworks. The thesis closes with a case study on an open source project.

Geographic cluster-based routing in ad-hoc wireless sensor networks is a current field of research. Various algorithms for routing in wireless ad-hoc networks based on position information already exist. Among them are algorithms that use the traditional beaconing approach as well as algorithms that work beaconlessly (no information about the environment is required besides the node's own position and the destination). Geographic cluster-based routing with guaranteed message delivery can be carried out on overlay graphs as well. Until now, however, the required planar overlay graphs have not been constructed reactively.
This thesis proposes a reactive algorithm, the Beaconless Cluster Based Planarization (BCBP) algorithm, which constructs a planar overlay graph and noticeably reduces the number of messages required to do so. Based on an algorithm for cluster-based planarization, it beaconlessly constructs a planar overlay graph in a unit disk graph (UDG). A UDG is a model of a wireless network in which every participant has the same sending radius. Evaluation of the algorithm shows it to be more efficient than the non-beaconless variant. Another result of this thesis is the Beaconless LLRAP (BLLRAP) algorithm, for which planarity but not continued connectivity could be proven.
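
For reference, the UDG model can be stated in a few lines; the following sketch simply connects two nodes whenever their Euclidean distance is at most the common radius (node names and positions are made up):

# Build a unit disk graph from 2D node positions.
from math import dist

def unit_disk_graph(positions, radius=1.0):
    """positions: dict node -> (x, y). Returns adjacency as dict node -> set."""
    adj = {u: set() for u in positions}
    nodes = list(positions)
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if dist(positions[u], positions[v]) <= radius:
                adj[u].add(v)
                adj[v].add(u)
    return adj

# Example: print(unit_disk_graph({"a": (0, 0), "b": (0.5, 0), "c": (2, 0)}))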

One task of executives and project managers in IT companies or departments is to hire suitable developers and to assign them to suitable problems. In this paper, we propose a new technique that directly leverages the previous work experience of developers in a systematic manner. Evidence for developer expertise based on the version history of existing projects is analyzed. More specifically, we analyze the commits to a repository in terms of affected API usage. On these grounds, we associate APIs with developers and thus assess the API experience of developers. By transitive closure, we also assess programming domain experience.
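
The aggregation step can be illustrated with a small sketch; the record format and names below are hypothetical and only show how commit-level API observations could be rolled up into per-developer experience counts:

# Accumulate per-developer API experience from (author, api) observations.
from collections import defaultdict

commit_api_records = [
    ("alice@example.org", "java.io"),
    ("alice@example.org", "java.io"),
    ("bob@example.org", "javax.swing"),
]

def api_experience(records):
    experience = defaultdict(lambda: defaultdict(int))
    for author, api in records:
        experience[author][api] += 1
    return experience

# A mapping from APIs to domains (e.g., javax.swing -> GUI) would then lift
# these counts to programming-domain experience, as described above.
print({a: dict(c) for a, c in api_experience(commit_api_records).items()})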

The publication of freely available and machine-readable information has increased significantly in recent years. Especially the Linked Data initiative has been receiving a lot of attention. Linked Data is based on the Resource Description Framework (RDF): anybody can simply publish their data in RDF and link it to other datasets. The structure is similar to the World Wide Web, where individual HTML documents are connected with links. Linked Data entities are identified by URIs which can be dereferenced to retrieve information describing the entity. Additionally, so-called SPARQL endpoints can be used to access the data with an algebraic query language (SPARQL) similar to SQL. By integrating multiple SPARQL endpoints it is possible to create a federation of distributed RDF data sources which acts like one big data store.
In contrast to the federation of classical relational database systems, there are some differences for federated RDF data. RDF stores are accessed either via SPARQL endpoints or by resolving URIs. There is no coordination between RDF data sources, and machine-readable metadata about a source's data is commonly limited or not available at all. Moreover, there is no common directory which can be used to discover RDF data sources or to ask for sources which offer specific data. The federation of distributed and linked RDF data sources therefore has to deal with various challenges. In order to distribute queries automatically, suitable data sources have to be selected based on query details and the information that is available about the data sources. Furthermore, the minimization of query execution time requires optimization techniques that take into account the execution cost of query operators and the network communication overhead for contacting individual data sources. In this thesis, solutions for these problems are discussed. Moreover, SPLENDID is presented, a new federation infrastructure for distributed RDF data sources which uses optimization techniques based on statistical information.
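
As a minimal illustration of accessing a single SPARQL endpoint (SPLENDID's source selection and cost-based optimization are not shown), one could use the SPARQLWrapper library; the endpoint and query below are examples, not part of the thesis:

# Query a public SPARQL endpoint and print the English label of a resource.
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
endpoint.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label WHERE {
        <http://dbpedia.org/resource/Resource_Description_Framework> rdfs:label ?label .
        FILTER (lang(?label) = "en")
    }
""")
endpoint.setReturnFormat(JSON)
results = endpoint.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["label"]["value"])

# A federation layer would instead split a query over several such endpoints
# and join the partial results, guided by statistics about each source.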

Software systems are often developed as a set of variants to meet diverse requirements. Two common approaches to this are "clone-and-own" and software product lines. Both approaches have advantages and disadvantages. In previous work, we and our collaborators proposed an idea which combines both approaches to manage variants, similarities, and cloning by using a virtual platform and cloning-related operators.
In this thesis, we present an approach for aggregating essential metadata to enable a propagate operator, which implements a form of change propagation. For this we have developed a system to annotate code similarities which were extracted throughout the history of a software repository. The annotations express similarity maintenance tasks, which can then either be executed automatically by propagate or have to be performed manually by the user. In this work we outline the automated metadata extraction process and the system for annotating similarities; we explain how the implemented system can be integrated into the workflow of an existing version control system (Git); and, finally, we present a case study using the 101haskell corpus of variants.

Traditional Driver Assistance Systems (DAS), such as Lane Departure Warning Systems or the well-known Electronic Stability Program, have in common that their system and software architecture is static. This means that neither the number and topology of Electronic Control Units (ECUs) nor the presence and functionality of software modules changes after the vehicles leave the factory.
However, some future DAS do face changes at runtime. This is true, for example, for truck and trailer DAS, as their hardware components and software entities are spread over both parts of the combination. These new requirements cannot be met by state-of-the-art approaches to automotive software systems. Instead, a different technique for designing such Distributed Driver Assistance Systems (DDAS) needs to be developed. The main contribution of this thesis is the development of a novel software and system architecture for dynamically changing DAS, using the example of driving assistance for truck and trailer. This architecture has to be able to autonomously detect and handle changes within the topology. In order to do so, the system decides which degree of assistance and which types of HMI can be offered every time a trailer is connected or disconnected. To this end, an analysis of the available software and hardware components, a determination of possible assistance functionality, and a re-configuration of the system take place. Such adaptation is enabled by the principles of Service-oriented Architecture (SOA). In this architectural style, all functionality is encapsulated in self-contained units, so-called Services. These Services offer their functionality through well-defined interfaces whose behavior is described in contracts. Using these Services, large-scale applications can be built and adapted at runtime. This thesis describes the research conducted to achieve these goals by introducing Service-oriented Architecture into the automotive domain. SOA deals with the high degree of distribution, the demand for re-usability, and the heterogeneity of the needed components.
It also applies automatic re-configuration in the event of a system change. Instead of adapting one of the available frameworks to this scenario, the main principles of Service-orientation are picked up and tailored. This leads to the development of the Service-oriented Driver Assistance (SODA) framework, which implements the benefits of Service-orientation while ensuring compatibility and compliance with automotive requirements, best practices, and standards. Within this thesis, several state-of-the-art Service-oriented frameworks are analyzed and compared. Furthermore, the SODA framework as well as all of its different aspects regarding the automotive software domain are described in detail. These aspects include a well-defined reference model that introduces and relates terms and concepts and defines an architectural blueprint. Furthermore, some of the modules of this blueprint, such as the re-configuration module and the Communication Model, are presented in full detail. In order to prove the compliance of the framework with state-of-the-art automotive software systems, a development process respecting today's best practices in automotive design procedures as well as the integration of SODA into the AUTOSAR standard are discussed. Finally, the SODA framework is used to build a full-scale demonstrator in order to evaluate its performance and efficiency.

In recent years, software engineering research has shown a rising interest in empirical studies. Such studies are often based on empirical evidence derived from corpora - collections of software artifacts. While there are established forms of carrying out empirical research (experiments, case studies, surveys, etc.), the common task of preparing the underlying collection of software artifacts is typically addressed in an ad hoc manner.
In this thesis, by means of a literature survey, we show how frequently software engineering research employs software corpora and, using a developed classification scheme, we discuss their characteristics. Addressing the lack of methodology, we suggest a method of corpus (re-)engineering and apply it to an existing collection of Java projects.
We report two extensive empirical studies, where we perform a broad and diverse range of analyses on the language for privacy preferences (P3P) and on object-oriented application programming interfaces (APIs). In both cases, we are driven by the data at hand, by the corpus itself, discovering the actual usage of the languages.

Diffusion imaging captures the movement of water molecules in tissue by applying varying gradient fields in a magnetic resonance imaging (MRI)-based setting. It makes a crucial contribution to in vivo examinations of neuronal connections: the local diffusion profile enables inference of the position and orientation of fiber pathways. Diffusion imaging is a significant technique for fundamental neuroscience, in which pathways connecting cortical activation zones are examined, and for neurosurgical planning, where fiber reconstructions are considered as intervention-related risk structures.
Diffusion tensor imaging (DTI) is currently applied in clinical environments in order to model the MRI signal due to its fast acquisition and reconstruction time. However, the inability of DTI to model complex intra-voxel diffusion distributions gave rise to an advanced reconstruction scheme which is known as high angular resolution diffusion imaging (HARDI). HARDI received increasing interest in neuroscience due to its potential to provide a more accurate view of pathway configurations in the human brain.
In order to fully exploit the advantages of HARDI over DTI, advanced fiber reconstruction and visualization techniques are required. This work presents novel approaches contributing to current research in the field of diffusion image processing and visualization. Diffusion classification, tractography, and visualization approaches were designed to enable a meaningful exploration of neuronal connections as well as of their constitution. Furthermore, an interactive neurosurgical planning tool that takes neuronal pathways into consideration was developed.
The research results in this work provide an enhanced and task-related insight into neuronal connections for neuroscientists as well as neurosurgeons and contribute to the implementation of HARDI in clinical environments.

Web 2.0 provides technologies for the online collaboration of users as well as the creation, publication, and sharing of user-generated content in an interactive way. Twitter, CNET, CiteSeerX, etc. are examples of Web 2.0 platforms which support users in these activities and are viewed as rich sources of information. In the platforms mentioned as examples, users can participate in discussions, comment on others, provide feedback on various issues, publish articles, and write blogs, thereby producing a high volume of unstructured data which at the same time leads to an information overload. Satisfying the various types of human information needs arising from the purpose and nature of these platforms requires methods for appropriate aggregation and automatic analysis of this unstructured data. In this thesis, we propose methods which attempt to overcome the problem of information overload and help in satisfying user information needs in three scenarios.
To this end, we first look at two of the main challenges in Twitter, sparsity and content quality, and at how these challenges can influence standard retrieval models. We analyze and identify Twitter content features that reflect high-quality information. Based on this analysis we introduce the concept of "interestingness" as a static quality measure. We empirically show that our proposed measure helps in retrieving and filtering high-quality information in Twitter. Our second contribution relates to the content diversification problem in a collaborative social environment, where the motive of the end user is to gain a comprehensive overview of the pros and cons of a discussion track which results from the social collaboration of people. For this purpose, we develop the FREuD approach, which aims at solving the content diversification problem by combining latent semantic analysis with sentiment estimation approaches. Our evaluation results show that the FREuD approach provides a representative overview of sub-topics and aspects of discussions, characteristic user sentiments under different aspects, and reasons expressed by different opponents. Our third contribution presents a novel probabilistic Author-Topic-Time model, which aims at mining topical trends and user interests from social media. Our approach solves this problem by means of Bayesian modeling of the relations between authors, latent topics, and temporal information. We present results of applying the model to scientific publication datasets from CiteSeerX, showing improved detection of semantically cohesive topics and the capturing of shifts in authors' interests in relation to topic evolution.

This dissertation investigates the usage of theorem provers in automated question answering (QA). QA systems attempt to compute correct answers for questions phrased in a natural language. Commonly they utilize a multitude of methods from computational linguistics and knowledge representation to process the questions and to obtain the answers from extensive knowledge bases. These methods are often syntax-based, and they cannot derive implicit knowledge. Automated theorem provers (ATP), on the other hand, can compute logical derivations with millions of inference steps. By integrating a prover into a QA system, this reasoning strength could be harnessed to deduce new knowledge from the facts in the knowledge base and thereby improve the QA capabilities. This involves challenges in that the contrary approaches of QA and automated reasoning must be combined: QA methods normally aim for speed and robustness to obtain useful results even from incomplete or faulty data, whereas ATP systems employ logical calculi to derive unambiguous and rigorous proofs. The latter approach is difficult to reconcile with the quantity and the quality of the knowledge bases in QA. The dissertation describes modifications to ATP systems in order to overcome these obstacles. The central example is the theorem prover E-KRHyper, which was developed by the author at the Universität Koblenz-Landau. As part of the research work for this dissertation, E-KRHyper was embedded into a framework of components for natural language processing, information retrieval, and knowledge representation, together forming the QA system LogAnswer.
Also presented are additional extensions to the prover implementation and the underlying calculi which go beyond enhancing the reasoning strength of QA systems by giving access to external knowledge sources like web services. These allow the prover to fill gaps in the knowledge during the derivation, or to use external ontologies in other ways, for example for abductive reasoning. While the modifications and extensions detailed in the dissertation are a direct result of adapting an ATP system to QA, some of them can be useful for automated reasoning in general. Evaluation results from experiments and competition participations demonstrate the effectiveness of the methods under discussion.

E-KRHyper is a versatile theorem prover and model generator for first-order logic that natively supports equality. Inequality of constants, however, has to be given by explicitly adding facts. As the number of these facts grows quadratically in the number of distinct constants, the knowledge base is blown up. This makes it harder for a human reader to focus on the actual problem, and it impairs the reasoning process. We extend the E-hyper tableau calculus underlying E-KRHyper to avoid this blow-up by implementing native handling of inequality of constants. This is done by introducing the unique name assumption for a subset of the constants (the so-called distinct object identifiers). The obtained calculus is shown to be sound and complete and is implemented in the E-KRHyper system. Synthetic benchmarks, situated in the theory of arrays, are used to demonstrate the benefits of the new calculus.
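
The blow-up can be quantified with a standard counting argument (not specific to E-KRHyper): for $n$ pairwise distinct constants, the explicit axiomatization needs one inequality fact per unordered pair of constants,

\[
  \#\{\, c_i \not\approx c_j \mid 1 \le i < j \le n \,\} = \binom{n}{2} = \frac{n(n-1)}{2},
\]

which already amounts to 4950 facts for $n = 100$ constants; all of these become unnecessary under the unique name assumption for the distinct object identifiers.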

The paper gives a specific introduction to probability propagation nets. Starting from dependency nets (which in a way can be considered the maximum information that follows from the directed graph structure of Bayesian networks), the probability propagation nets are constructed by joining a dependency net and (a slightly adapted version of) its dual net. Probability propagation nets are the Petri net version of Bayesian networks. In contrast to Bayesian networks, Petri nets are transparent and easy to operate. The high degree of transparency is due to the fact that every state in a process is visible as a marking of the Petri net. The convenient operability consists in the fact that there is no algorithm apart from the firing rule of Petri net transitions. Besides the structural importance of the Petri net duality there is a semantic matter: common sense in the form of probabilities and evidence-based likelihoods are dual to each other.
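
The duality of probabilities and evidence-based likelihoods can be made concrete with Bayes' rule, the standard identity underlying propagation in Bayesian networks (stated here for orientation, not as part of the paper's construction):

\[
  P(H \mid E) = \frac{P(E \mid H)\,P(H)}{\sum_{H'} P(E \mid H')\,P(H')},
\]

where the prior $P(H)$ is propagated forward while the likelihood $P(E \mid H)$ flows backward from observed evidence.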

Modern Internet and Intranet techniques, such as Web services and virtualization, facilitate the distributed processing of data, providing improved flexibility. The gain in flexibility also incurs disadvantages. Integrated workflows forward and distribute data between departments and across organizations. The data may be affected by privacy laws, contracts, or intellectual property rights. Under such circumstances of flexible cooperation between organizations, accounting for the processing of data and restricting actions performed on the data may be legally and contractually required. In the Internet and Intranet, monitoring mechanisms provide means for observing and auditing the processing of data, while policy languages constitute a mechanism for specifying restrictions and obligations.
In this thesis, we present our contributions to these fields by providing improvements for auditing and restricting the data processing in distributed environments. We define formal qualities of auditing methods used in distributed environments. Based on these qualities, we provide a novel monitoring solution supporting a data-centric view on the distributed data processing. We present a solution for provenance-aware policies and a formal specification of obligations offering a procedure to decide whether obligatory processing steps can be met in the future.

In this paper, we demonstrate by means of two examples how to work with probability propagation nets (PPNs). The first, which comes from the book by Peng and Reggia [1], is a small example of medical diagnosis. The second one comes from [2]. It is an example of operational risk and is intended to show how the evidence flow in PPNs gives hints on how to reduce high losses. In terms of Bayesian networks, both examples contain cycles, which are resolved by the conditioning technique [3].

Dualizing marked Petri nets results in tokens for transitions (t-tokens). A marked transition can, strictly speaking, not be enabled, even if there are sufficient "enabling" tokens (p-tokens) on its input places. On the other hand, t-tokens can be moved by the firing of places. This permits flows of t-tokens which describe sequences of non-events. Their benefit for simulation is the possibility to model (and observe) causes and effects of non-events, e.g., if something has broken down.

Routing with Metric based Topology Investigation (RMTI) is an algorithm meant to extend distance-vector routing protocols. It has been under research and development at the University of Koblenz-Landau since 1999 and is currently implemented on top of the well-known Routing Information Protocol (RIP). Around mid-2009, the latest implementation of RMTI included a lot of deprecated functionality. Because of this, the first goal of this thesis was the reduction of the codebase to a minimum. Besides a lot of reorganization and a general cleanup, this mainly involved the removal of some no longer needed modes as well as the separation of the formerly mandatory XTPeer test environment. During the second part, many test series were carried out in order to ensure the correctness of the latest RMTI implementation. A replacement for XTPeer was needed, and several new ways of testing were explored. In conjunction with this thesis, the RMTI source code was finally released to the public under a free software license.

Folksonomies are Web 2.0 platforms where users share resources with each other. Furthermore, they can assign keywords (called tags) to the resources for categorizing and organizing them. Numerous types of resources, such as websites (Delicious), images (Flickr), and videos (YouTube), are supported by different folksonomies. Folksonomies are easy to use and thus attract the attention of millions of users. Along with the ease they offer, there are also some problems. This thesis addresses different problems of folksonomies and proposes solutions for them. The first problem occurs when users search for relevant resources in folksonomies. Often, the users are not able to find all relevant resources because they do not know which tags are relevant. The second problem is assigning tags to resources. Although many folksonomies (like Delicious) recommend tags for the resources, other folksonomies (like Flickr) do not recommend any tags. Tag recommendation helps the users to easily tag their resources. The third problem is that tags and resources lack semantics, which leads, for example, to ambiguous tags. The tags lack semantics because they are freely chosen keywords. The automatic identification of the semantics of tags and resources helps in reducing the problems that arise from this freedom of the users in choosing the tags. This thesis proposes methods which exploit semantics to address the problems of search, tag recommendation, and the identification of tag semantics. The semantics are discovered from a variety of sources; in this thesis, we exploit web search engines, online social communities, and the co-occurrences of tags as sources of semantics. Using different sources for discovering semantics reduces the effort needed to build systems which solve the problems mentioned earlier. This thesis evaluates the proposed methods on a large-scale data set. The evaluation results suggest that it is possible to exploit the semantics for improving search, the recommendation of tags, and the automatic identification of the semantics of tags and resources.
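
As a small illustration of the last source of semantics, tag co-occurrence can be computed directly from the tag assignments; the toy data and names below are made up and only show the counting step:

# Count how often pairs of tags are assigned to the same resource.
from collections import Counter
from itertools import combinations

tagged_resources = {
    "r1": {"python", "programming", "tutorial"},
    "r2": {"python", "snake"},
    "r3": {"programming", "tutorial"},
}

def co_occurrence(tag_sets):
    pairs = Counter()
    for tags in tag_sets:
        for a, b in combinations(sorted(tags), 2):
            pairs[(a, b)] += 1
    return pairs

# Frequently co-occurring tags (e.g., "programming" and "tutorial") hint at
# related meanings and can help disambiguate or recommend tags.
print(co_occurrence(tagged_resources.values()).most_common(3))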

This paper documents the development of an abstract physics layer (APL) for Simspark. After short introductions to physics engines and Simspark, the reasons why an APL was developed are explained. The largest part of this paper describes the new design and why certain design choices were made based on requirements that arose during development. It concludes by explaining how the new design was eventually implemented and what future possibilities the new design holds.

The semantic web and model-driven engineering are changing the enterprise computing paradigm. By introducing technologies like ontologies, metadata, and logic, the semantic web drastically improves how companies manage knowledge. In turn, model-driven engineering relies on the principle of using models to provide abstraction, enabling developers to concentrate on the system functionality rather than on technical platforms. The next enterprise computing era will rely on the synergy between both technologies. On the one side, ontology technologies organize system knowledge in conceptual domains according to its meaning. They address enterprise computing needs by identifying, abstracting, and rationalizing commonalities, and by checking for inconsistencies across system specifications. On the other side, model-driven engineering is closing the gap between business requirements, designs, and executables by using domain-specific languages with custom-built syntax and semantics. In this scenario, the research question that arises is: What are the scientific and technical results around ontology technologies that can be used in model-driven engineering and vice versa? The objective is to analyze approaches available in the literature that involve both ontologies and model-driven engineering. Therefore, we conducted a literature review that resulted in a feature model for classifying state-of-the-art approaches. The results show that the combined usage of ontologies and model-driven engineering serves multiple purposes: validation, visual notation, expressiveness, and interoperability. While approaches involving both paradigms exist, an integrated approach for UML class-based modeling and ontology modeling has been lacking so far. Therefore, we investigate the techniques and languages for designing integrated models. The objective is to provide an approach to support the design of integrated solutions. Thus, we develop a conceptual framework involving the structure and the notations of a solution to represent and query software artifacts using a combination of ontologies and class-based modeling. As proof of concept, we have implemented our approach as a set of open source plug-ins -- the TwoUse Toolkit. The hypothesis is that a combination of both paradigms yields improvements in both fields, ontology engineering and model-driven engineering. For MDE, we investigate the impact of using features of the Web Ontology Language in software modeling. The results are patterns and guidelines for designing ontology-based information systems and for supporting software engineers in modeling software. They include alternative ways of describing classes and objects and of querying software models and metamodels. Applications show improvements in changeability and extensibility. In the ontology engineering domain, we investigate the application of techniques used in model-driven engineering to fill the abstraction gap between ontology specification languages and programming languages. The objective is to provide a model-driven platform for supporting activities in the ontology engineering life cycle. Therefore, we study the development of core ontologies in our department, namely the core ontology for multimedia (COMM) and the multimedia metadata ontology. The results are domain-specific languages that allow ontology engineers to abstract from implementation issues and concentrate on the ontology engineering task. This increases productivity by filling the gap between domain models and source code.

Model-Driven Engineering (MDE) aims to raise the level of abstraction in software system specifications and to increase automation in software development. Modelware technological spaces contain the languages and tools for MDE that software developers take into consideration to model systems and domains. Ontoware technological spaces contain ontology languages and technologies to design, query, and reason on knowledge. With the advent of the Semantic Web, ontologies are now being used within the field of software development as well. In this thesis, bridging technologies are developed to combine two technological spaces in general. Transformation bridges translate models between spaces, mapping bridges relate different models between two spaces, and integration bridges merge spaces into new, all-embracing technological spaces. API bridges establish interoperability between the tools used in the spaces. In particular, this thesis focuses on the combination of modelware and ontoware technological spaces. Subsequent to a sound comparison of the languages and tools in both spaces, the integration bridge is used to build a common technological space, which allows for the hybrid use of languages and the interoperable use of tools. The new space allows for language and domain engineering. Ontology-based software languages may be designed in the new space, where syntax and formal semantics are defined with the support of ontology languages, and the correctness of language models is ensured by the use of ontology reasoning technologies. These languages represent a core means for exploiting expressive ontology reasoning in the software modeling domain, while remaining flexible enough to accommodate varying needs of software modelers. Application domains are conceptually described by languages that allow for defining domain instances and types within one domain model. Integrated ontology languages may provide formal semantics for domain-specific languages, and ontology technologies allow for reasoning over types and instances in domain models. A scenario in which configurations for network device families are modeled illustrates the approaches discussed in this thesis. Furthermore, the implementation of all bridging technologies for the combination of technological spaces and of all tools for ontology-based language engineering and use is illustrated.

Eye-trackers have been used in the past to identify visual foci in images, find task-related image regions, or localize affective regions in images. However, they have not been used for identifying specific objects in images. In this paper, we investigate whether it is possible to assign tags describing specific objects to the image regions showing these objects by analyzing the users' gaze paths. To this end, we have conducted an experiment with 20 subjects viewing 50 image-tag pairs each. We have compared the tag-to-region assignments for nine existing and four new fixation measures. In addition, we have investigated the impact of extending region boundaries, of weighting small image regions, and of the number of subjects viewing the images. The paper shows that a tag-to-region assignment with an accuracy of 67% can be achieved by using gaze information. In addition, we show that multiple regions on the same image can be differentiated with an accuracy of 38%.

Hybrid automata are used as a standard means for the specification and analysis of dynamical systems. Several research efforts have used them to formally specify reactive multi-agent systems situated in a physical environment, where the agents react continuously to their environment. The specified systems, in turn, are formally checked with the help of existing hybrid automata verification tools. However, when dealing with multi-agent systems, two problems may arise. The first is a state space problem caused by the composition process, where the agents have to be composed in parallel into one agent capturing all possible behaviors of the multi-agent system prior to the verification phase. The second problem concerns the expressiveness of verification tools when modeling and verifying certain behaviors. Therefore, this paper tackles these problems by showing how multi-agent systems, specified as hybrid automata, can be modeled and verified using constraint logic programming (CLP). In particular, a CLP framework is presented to show how the composition of multi-agent behaviors can be captured dynamically during the verification phase. This can relieve the state space complexity that may occur as a result of the composition process. Additionally, the expressiveness of the CLP model flexibly allows not only modeling multi-agent systems, but also checking various properties by means of reachability analysis. Experiments show promising results regarding the feasibility of our approach.

Conventional security infrastructures in the Internet cannot be directly adopted for ambient systems, especially if these are based on short-range communication channels: personal, mobile devices are used and the participants are present during communication, so privacy protection is a crucial issue. As ambient systems cannot rely on an uninterrupted connection to a Trust Center, certified data has to be verified locally. Security techniques have to be adjusted to this special environment. This paper introduces a public key infrastructure (PKI) to provide secure communication channels with respect to privacy, confidentiality, data integrity, non-repudiation, and user or device authentication. It supports three certificate levels with a different balance between authenticity and anonymity. This PKI is currently under implementation as part of the iCity project.

This thesis introduces fnnlib, a C++ library for recurrent neural network simulations that I developed between October 2009 and March 2010 at Osaka University's Graduate School of Engineering. After covering the theory behind recurrent neural networks, backpropagation through time, recurrent neural networks with parametric bias, continuous-time recurrent neural networks, and echo state networks, the design of the library is explained. All of the classes as well as their interrelationships are presented along with reasons as to why certain design decisions were made. Towards the end of the thesis, a small practical example is shown. Also, fnnlib is compared to other neural network libraries.
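
To give a flavor of one of the architectures covered, the following is a minimal echo state network sketch in Python with NumPy; hyperparameters, scaling, and class names are illustrative and do not reflect fnnlib's actual C++ design:

# Tiny echo state network: fixed random reservoir, trained linear readout.
import numpy as np

class TinyESN:
    def __init__(self, n_in, n_res, n_out, spectral_radius=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.w_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
        w = rng.uniform(-0.5, 0.5, (n_res, n_res))
        w *= spectral_radius / max(abs(np.linalg.eigvals(w)))  # echo state property
        self.w = w
        self.w_out = np.zeros((n_out, n_res))
        self.state = np.zeros(n_res)

    def step(self, u):
        self.state = np.tanh(self.w_in @ u + self.w @ self.state)
        return self.state

    def fit_readout(self, inputs, targets, ridge=1e-6):
        # Only the readout weights are trained, here by ridge regression.
        states = np.array([self.step(u) for u in inputs])
        a = states.T @ states + ridge * np.eye(states.shape[1])
        self.w_out = np.linalg.solve(a, states.T @ targets).T

    def predict(self, u):
        return self.w_out @ self.step(u)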

This paper has a twofold concern. First, to show that probability propagation nets (PPNs) can be derived from formulae for conditional probabilities. Second, to show that PPNs are usable for (local) propagations of mass distributions, too, if the inscriptions are suitably adapted.

Knowledge compilation is a common technique for propositional logic knowledge bases. A given knowledge base is transformed into a normal form for which queries can be answered efficiently. This precompilation step is expensive, but it only has to be performed once. We apply this technique to concepts defined in the Description Logic ALC. We introduce a normal form called linkless normal form for ALC concepts and discuss an efficient satisfiability test for concepts given in this normal form. Furthermore, we show how to efficiently calculate uniform interpolants of precompiled concepts w.r.t. a given signature.

The processing of data is often restricted by contractual and legal requirements for protecting privacy and IPRs. Policies provide means to control how and by whom data is processed. Conditions of policies may depend on the previous processing of the data. However, existing policy languages do not provide means to express such conditions. In this work we present a formal model and language allowing for specifying conditions based on the history of data processing. We base the model and language on XACML.

Specifying behaviors of multi-agent systems (MASs) is a demanding task, especially when applied in safety-critical systems. In such systems, the specification of behaviors has to be carried out carefully in order to avoid side effects that might cause unwanted or even disastrous behaviors. Thus, formal methods based on mathematical models of the system under design are helpful. They not only allow us to formally specify the system at different levels of abstraction, but also to verify the consistency of the specified systems before implementing them. The formal specification aims at a precise and unambiguous description of the behavior of MASs, whereas the verification aims at proving the satisfaction of specified requirements. A behavior of an agent can be described as discrete changes of its states with respect to external or internal actions. Whenever an action occurs, the agent moves from one state to another. Therefore, an efficient way to model this type of discrete behavior is to use a kind of state transition diagram such as a finite automaton. One remarkable advantage of such transition diagrams is that they lend themselves to formal analysis techniques using model checking. The latter is an automatic verification technique which determines whether given properties are satisfied within a model underlying a particular system. In realistic physical environments, however, it is necessary to consider continuous behaviors in addition to the discrete behaviors of MASs. Examples of such behaviors include the movement of a soccer agent to kick off or to go to the ball, the process of putting out a fire by a fire brigade agent in a rescue scenario, or any other behavior that depends on a timed physical law. Traditional state transition diagrams are not sufficient to combine these types of behaviors. Hybrid automata offer an elegant method to capture such behaviors. Hybrid automata extend regular state transition diagrams with methods that deal with continuous actions, such that the state transition diagrams are used to model the discrete changes of behavior, while differential equations are used to model the continuous changes. The semantics of hybrid automata make them accessible to formal verification by means of model checking. The main goal of this thesis is to use hybrid automata for specifying and verifying behaviors of MASs. However, specifying and verifying behaviors of MASs by means of hybrid automata raises several issues that should be considered. These issues include the complexity, modularity, and expressiveness of MAS models. This thesis addresses these issues and provides possible solutions to tackle them.

This thesis presents an analysis of API usage in a large corpus of Java software retrieved from the open source repositories hosted at SourceForge. Most large software projects use software libraries, which offer a public "application programming interface" (API) to the programmer. In order to facilitate the transition between different APIs, research projects on automated API migration are emerging. However, there is a lack of basic statistical information about in-the-wild usage of APIs, as such measurements have, until now, only been done on rather small corpora. We thus present an analysis method suitable for measurements on large corpora. First, we create a corpus of open source projects hosted on SourceForge, as well as a corpus of software libraries. Then, all projects in the corpus are compiled with an instrumented compiler: we use a compiler plugin for javac that gives detailed information about every method created by the compiler. This information is stored in a database and analyzed.
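The thesis obtains precise, method-level data from an instrumented javac; as a rough and purely hypothetical approximation of the same question, the Python sketch below only scans Java sources for import statements and tallies which library packages a project touches. The project path is a placeholder.

    import re
    from pathlib import Path
    from collections import Counter

    IMPORT_RE = re.compile(r'^\s*import\s+(?:static\s+)?([\w.]+?)\.[\w*]+\s*;', re.MULTILINE)

    def api_packages(project_dir):
        """Crude proxy for API usage: count imported packages across all .java files."""
        counts = Counter()
        for src in Path(project_dir).rglob("*.java"):
            counts.update(IMPORT_RE.findall(src.read_text(errors="ignore")))
        return counts

    # Example: which packages does a checked-out project import most often?
    for package, n in api_packages("some-project").most_common(10):
        print(f"{n:5d}  {package}")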

This minor thesis shows a way to optimise a generated oracle in order to achieve shorter runtimes. Shorter runtimes of test cases allow more test cases to be executed in the same time, and executing more test cases leads to higher confidence in the software quality. Oracles can be derived from specifications. However, specifications are written for different purposes and are therefore not necessarily executable. Even if they are executable, they may only be executable with a high runtime. Both facts mostly stem from the use of quantifiers in the logic. If the quantifier range is not bounded, or if the bounds lie outside the limits of the target language's data types, the specification is too expressive to be exported into a program. Even if the bounds lie within the data type limits, the quantification is represented as a loop, which leads to a runtime blowup, especially if quantifiers are nested. This work explains four different possibilities to reduce the execution time of the oracle by manipulating the quantified formula; the approach is only applicable if the quantified variables are of type Integer.
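As a hedged illustration (not one of the four techniques developed in this work), the Python sketch below shows how a bounded, nested Integer quantifier in a specification naively turns into nested loops in the generated oracle, and how rewriting the formula can eliminate one quantifier and hence one loop level.

    # Postcondition as written in the specification (two bounded Integer quantifiers):
    #   exists i in [0, n) . exists j in [0, n) . a[i] + a[j] == target
    def oracle_naive(a, target):
        return any(a[i] + a[j] == target           # nested loops: O(n^2) per check
                   for i in range(len(a))
                   for j in range(len(a)))

    # Equivalent formula with one quantifier eliminated:
    #   exists i in [0, n) . (target - a[i]) occurs in a
    def oracle_optimised(a, target):
        values = set(a)                            # precomputed once: O(n)
        return any(target - x in values for x in a)

    assert oracle_naive([3, 7, 1], 8) == oracle_optimised([3, 7, 1], 8) == True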

Software is vital for modern society. The efficient development of correct and reliable software is of ever-growing importance. An important technique to achieve this goal is deductive program verification: the construction of logical proofs that programs are correct. In this thesis, we address three important challenges for deductive verification on its way to wider deployment in industry: 1. verification of thread-based concurrent programs, 2. correctness management of verification systems, and 3. change management in the verification process. These challenges are consistently brought up by practitioners when applying otherwise mature verification systems. The three challenges correspond to the three parts of this thesis (not counting the introductory first part, which provides technical background on the KeY verification approach). In the first part, we define a novel program logic for specifying correctness properties of object-oriented programs with unbounded thread-based concurrency. We also present a calculus for this logic, which allows verifying actual Java programs. The calculus is based on symbolic execution, which makes it well understandable for the user. We describe the implementation of the calculus in the KeY verification system and present a case study. In the second part, we provide a first systematic survey and appraisal of the factors involved in the reliability of formal reasoning. We elucidate the potential and limitations of self-application of formal methods in this area and give recommendations based on our experience in the design and operation of verification systems. In the third part, we show how the technique of similarity-based proof reuse can be applied to the problems of the industrial verification life cycle. We address issues (e.g., coping with changes in the proof system) that are important in verification practice but have so far been neglected by research.

Hybrid systems are the result of merging the two most commonly used models for dynamical systems, namely continuous dynamical systems defined by differential equations and discrete-event systems defined by automata. One can view hybrid systems as constrained systems, where the constraints describe the possible process flows, invariants within states, and transitions on the one hand, and characterize certain parts of the state space (e.g., the set of initial states or the set of unsafe states) on the other hand. Therefore, it is advantageous to use constraint logic programming (CLP) as an approach to model hybrid systems. In this paper, we provide CLP implementations that model hybrid systems comprising several concurrent hybrid automata and whose size grows only linearly with the size of the given system description. Furthermore, we allow different levels of abstraction by making use of hierarchies as in UML statecharts. In consequence, the CLP model can be used for analyzing and testing the absence or existence of (un)wanted behaviors in hybrid systems. In summary, we thus obtain a procedure for the formal verification of hybrid systems by model checking, employing logic programming with constraints.

The efficiency of marketing campaigns is a precondition for business success. This paper discusses a technique to transfer advertisement content via Bluetooth technology while collecting market research information at the same time. Conventional advertisement media were enhanced with devices that automatically measure the number, distance, frequency, and exposure time of passersby, making information available to evaluate both the wireless medium and the location in general. This paper presents a study analyzing these data. A cryptographic one-way function protects privacy during data acquisition.
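A minimal sketch of the privacy measure mentioned here, assuming the captured identifier is a Bluetooth MAC address: applying a keyed one-way hash at acquisition time lets the system recognize repeat visits without ever storing the raw address. The key and function names below are hypothetical, not taken from the paper.

    import hmac, hashlib

    SECRET_KEY = b"rotate-me-regularly"   # hypothetical installation-specific secret

    def pseudonymize(mac_address: str) -> str:
        """One-way, keyed hash of a captured Bluetooth MAC address."""
        return hmac.new(SECRET_KEY, mac_address.encode(), hashlib.sha256).hexdigest()

    # The raw address never leaves the acquisition device; only the pseudonym is logged.
    print(pseudonymize("00:1A:7D:DA:71:13"))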

We introduce a new routing algorithm which can detect routing loops by evaluating routing updates more thoroughly. Our new algorithm, called Routing with Metric based Topology Investigation (RMTI), is based on the simple Routing Information Protocol (RIP) and is compatible with all RIP versions. In case of a link failure, a network can reorganize itself if redundant links are available. Redundant links are only available in a network system like the Internet if the topology contains loops; therefore, it is necessary to recognize and prevent routing loops. A routing loop can be seen as a circular trace of routing update information which returns to the same router, either directly from a neighbor router or via a loop in the topology. Routing loops can consume a large amount of network bandwidth and impact the end-to-end performance of the network. Our RMTI approach is capable of improving the efficiency of Distance Vector Routing.
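For background only, here is a sketch of the plain RIP-style distance-vector update that RMTI builds on (not of RMTI's loop detection itself); the data structures and function are hypothetical simplifications. A router merges a neighbour's advertised distances into its own table using the Bellman-Ford rule.

    INFINITY = 16   # RIP treats metric 16 as unreachable

    def merge_update(own_table, neighbour, neighbour_table, link_cost=1):
        """Bellman-Ford style update: adopt a route via the neighbour if it is shorter."""
        for destination, metric in neighbour_table.items():
            candidate = min(metric + link_cost, INFINITY)
            best = own_table.get(destination, (INFINITY, None))[0]
            if candidate < best:
                own_table[destination] = (candidate, neighbour)
        return own_table

    # Router A learns about network N via router B (B reports distance 2).
    table_a = {"N": (INFINITY, None)}
    merge_update(table_a, "B", {"N": 2})
    print(table_a)   # {'N': (3, 'B')}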

The lack of a formal event model hinders interoperability in distributed event-based systems. Consequently, we present in this paper a formal model of events, called F. The model is based on an upper-level ontology and provides comprehensive support for all aspects of events such as time and space, objects and persons involved, as well as structural aspects, namely mereological, causal, and correlational relationships. The event model provides a flexible means for event composition, modeling of event causality and correlation, and allows for representing different interpretations of the same event. The foundational event model F is developed in a pattern-oriented approach, is modularized in different ontologies, and can easily be extended by domain-specific ontologies.

This dissertation introduces a methodology for the formal specification and verification of user interfaces under security aspects. The methodology allows formal methods to be used pervasively in the specification and verification of human-computer interaction. This work consists of three parts. In the first part, a formal methodology for the description of human-computer interaction is developed. In the second part, existing definitions of computer security are adapted to human-computer interaction and formalized, and a generic formal model of human-computer interaction is developed. In the third part, the methodology is applied to the specification and verification of a secure email client.

The Semantic Web is based on accessing and reusing RDF data from many different sources, to which one may assign different levels of authority and credibility. Existing Semantic Web query languages, like SPARQL, have targeted the retrieval, combination, and reuse of facts, but have so far ignored all aspects of meta knowledge, such as origin, authorship, recency, or certainty of data, to name but a few. In this paper, we present an original, generic, formalized, and implemented approach for managing many dimensions of meta knowledge, like source, authorship, certainty, and others. The approach re-uses existing RDF modeling possibilities in order to represent meta knowledge. It then extends SPARQL query processing in such a way that, given a SPARQL query for data, one may request meta knowledge without modifying the query proper. Thus, our approach achieves highly flexible and automatically coordinated querying for data and meta knowledge, while completely separating the two areas of concern.
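The paper's mechanism is its own; as a rough illustration of the general idea, the hypothetical rdflib/Python sketch below stores each source's facts in a named graph and attaches provenance triples to the graph name, so a query can return data together with its origin while the data-selecting triple pattern stays untouched. All URIs and property names are made up for the example.

    from rdflib import Dataset, Literal, Namespace, URIRef

    EX = Namespace("http://example.org/")
    FOAF = Namespace("http://xmlns.com/foaf/0.1/")

    ds = Dataset()

    # Store each source's facts in its own named graph ...
    g = ds.graph(URIRef("http://example.org/graphs/source-a"))
    g.add((EX.alice, FOAF.name, Literal("Alice")))

    # ... and attach meta knowledge (here: origin and certainty) to the graph name.
    ds.add((g.identifier, EX.origin, Literal("http://source-a.example.org")))
    ds.add((g.identifier, EX.certainty, Literal(0.9)))

    # The data-selecting pattern stays as it is; the GRAPH variable additionally
    # exposes which named graph (and hence which meta knowledge) each answer comes from.
    results = ds.query("""
        SELECT ?person ?name ?g WHERE {
            GRAPH ?g { ?person <http://xmlns.com/foaf/0.1/name> ?name }
        }""")
    for person, name, graph_id in results:
        origin = ds.value(graph_id, EX.origin)
        print(name, "from", origin)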