Dependability of service provision is one of the primary goals in modern networks. Since providers and clients are part of a connecting Information and Communications Technology (ICT) infrastructure, service dependability varies with the position of actors as the ICT devices needed for service provision change. We present two approaches to quantify user-perceived service dependability. The first is a model-driven approach to calculate instantaneous service availability. Using input models of the service, the infrastructure and a mapping between the two to describe actors of service communication, availability models are automatically created by a series of model-to-model transformations. The feasibility of the approach is demonstrated using exemplary services in the network of the University of Lugano, Switzerland. The second approach, which forms the main part of this thesis, targets the responsiveness of the service discovery layer: the probability of finding service instances within a deadline, even in the presence of faults. We present a hierarchy of stochastic models to calculate user-perceived responsiveness based on monitoring data from the routing layer. Extensive series of experiments have been run on the Distributed Embedded Systems (DES) wireless testbed at Freie Universität Berlin. They serve both to demonstrate the shortcomings of current discovery protocols in modern dynamic networks and to validate the presented stochastic models. Both approaches demonstrate that the dependability of service provision indeed differs considerably depending on the position of service clients and providers, even in highly reliable wired networks. The two approaches enable optimization of service networks with respect to known or predicted usage patterns. Furthermore, they anticipate novel service dependability models which combine service discovery, timeliness, placement and usage, areas that until now have largely been treated separately.

Service Discovery (SD) is an integral part of service networks. Before a service can be used, it needs to be discovered successfully. Thus, a comprehensive service dependability analysis needs to consider the dependability of the SD process. Since SD is a time-critical operation, one of its important properties is responsiveness: the probability of successful discovery within a deadline, even in the presence of faults. This is especially true for dynamic networks with complex fault behavior such as wireless networks. We present results of a comprehensive responsiveness evaluation of decentralized SD, specifically active SD using the Zeroconf protocol. The ExCovery experiment framework has been employed in the Distributed Embedded Systems (DES) wireless testbed at Freie Universität Berlin. We present and discuss the experiment results and show how SD responsiveness is affected by the position and number of requesters and providers as well as the load in the network. Results clearly demonstrate that in all but the most favorable conditions, the configurations of current SD protocols struggle to achieve high responsiveness. We further discuss results reflecting the long-term behavior of the testbed and how its varying reliability impacts SD responsiveness.
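As a minimal sketch of how responsiveness could be estimated from a series of experiment runs, the fraction of discovery attempts that succeed within the deadline can be computed as shown here. The function name `empirical_responsiveness` and the sample values are illustrative assumptions; they are neither part of ExCovery nor taken from the reported results.

```python
from typing import Optional, Sequence


def empirical_responsiveness(
    discovery_times: Sequence[Optional[float]], deadline: float
) -> float:
    """Fraction of runs in which discovery succeeded within `deadline` seconds.

    A value of None marks a run in which the service was never discovered.
    """
    if not discovery_times:
        raise ValueError("at least one run is required")
    successes = sum(
        1 for t in discovery_times if t is not None and t <= deadline
    )
    return successes / len(discovery_times)


# Hypothetical discovery times in seconds; None means no response was received.
runs = [0.4, 1.2, None, 0.9, 3.5, None, 0.3, 2.1]
print(empirical_responsiveness(runs, deadline=1.0))
```

Runs without any response count as failures for every deadline, so the estimate cannot reach 1 by extending the deadline alone.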

Experiments are a fundamental part of science. They are needed when the system under evaluation is too complex to be described analytically, and they serve to empirically validate hypotheses. This work presents the experimentation framework ExCovery for dependability analysis of distributed processes. It provides concepts that cover the description, execution, measurement and storage of experiments. These concepts foster transparency and repeatability of experiments for further sharing and comparison. ExCovery has been tried and refined in a multitude of dependability-related experiments during the last two years. A case study is provided to describe service discovery as an experiment process. A working prototype for IP networks runs on the Distributed Embedded Systems (DES) wireless testbed at Freie Universität Berlin.

In service networks, discovery plays a crucial role as a layer where providers can be published and enumerated. This work focuses on the responsiveness of the discovery layer: the probability of operating successfully within a deadline, even in the presence of faults. It proposes a hierarchy of stochastic models for decentralized discovery and uses it to describe the discovery of a single service using three popular protocols. A methodology to use the model hierarchy in wireless mesh networks is introduced. Given a requester-provider pair, a discovery protocol and a deadline, it generates specific model instances and calculates responsiveness. Furthermore, this paper introduces a new metric, the expected responsiveness distance d_er, to estimate the maximum distance from a provider at which requesters can still discover it with a required responsiveness. Using monitoring data from the DES testbed at Freie Universität Berlin, it is shown how responsiveness and d_er of the protocols change depending on the position of nodes and the link qualities in the network.
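The idea behind the expected responsiveness distance can be illustrated with a small sketch. It assumes that distance is measured in hops and that responsiveness is averaged over all requesters at the same hop distance; the exact definition in the paper may differ. The function name and the data are hypothetical.

```python
from statistics import mean
from typing import Dict, Sequence


def expected_responsiveness_distance(
    responsiveness_by_hops: Dict[int, Sequence[float]], required: float
) -> int:
    """Largest hop distance up to which mean responsiveness stays >= required.

    Returns 0 if even the nearest requesters miss the requirement.
    """
    d_er = 0
    for hops in sorted(responsiveness_by_hops):
        # Stop at the first distance whose average falls below the requirement.
        if mean(responsiveness_by_hops[hops]) < required:
            break
        d_er = hops
    return d_er


# Hypothetical responsiveness values of requesters, grouped by hop distance.
samples = {1: [0.99, 0.98], 2: [0.95, 0.91], 3: [0.85, 0.80], 4: [0.60]}
print(expected_responsiveness_distance(samples, required=0.9))
```

Stopping at the first distance that misses the requirement keeps the result conservative: a farther distance that happens to perform well does not extend d_er past a nearer one that does not.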

Proceedings of the Joint Workshop of the German Research Training Groups in Computer Science, Algorithmic synthesis of reactive and discrete-continuous systems (AlgoSyn), Dagstuhl, Germany, May 31 – June 2, 2010

Disasters striking in inhabited areas pose a significant risk to the development and growth of modern societies. The impact of any disaster would be severe. In case a disaster strikes, fast and safe mitigation of damages is important. Information and communication technology (ICT) plays a crucial role in helping reconnaissance and first response teams on disaster sites.

Most rescue teams bring their own network equipment to use several IT services. Many of these services (e.g., infrastructure, location, communication) could be shared among teams but most of the time they are not. Coordination of teams is partly done by pen and paper-based methods. A single network for all participating teams with the possibility to reliably publish, discover and use services would be of great benefit.

Although the participating teams and the course of action differ on every site, the described service networks display certain common properties: they arise spontaneously, the number of nodes and their capabilities are subject to high fluctuation, the number and types of services also fluctuate strongly, and there is no global administrative configuration.

Because of these properties all network layers involved would need to be configured automatically. Based on the Internet Protocol (IP) — the only well-established global networking standard — a number of mechanisms promise to automatically configure service networks. In disaster management scenarios, where various services are critical for operation, mission control could benefit from these mechanisms by getting a live view of all active services and their states. It needs to be investigated if and how they are applicable.

Given an ad-hoc, auto-configuring service network, how and to what extent can we guarantee dependability properties such as availability, the ability to perform in the presence of faults (performability) and ultimately the ability to sustain certain levels of availability or performability (survivability) for critical services at run-time?

The goal of this dissertation is to provide a comprehensive dependability evaluation for such heterogeneous and dynamic service networks. A run-time dependability cycle is embedded into the network. In this cycle, the network is constantly monitored. A distributed service discovery layer provides network-wide service presence monitoring. This will be extended to provide monitoring for availability and performability assessment. Based on monitoring data, dependability properties are evaluated at run-time. The survivability of critical services can be estimated by calculating the expected availability or performability with a given fault model. If necessary, adaptation measures are triggered, which in turn can cause the monitoring to be reconfigured. Even if no adaptation is possible, run-time awareness of critical states is already a huge benefit. This cycle is the basis of a self-aware adaptive service network.
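One iteration of such a cycle could be sketched as follows. This is only an illustration under stated assumptions: the callables `monitor` and `adapt`, the service names and the thresholds are hypothetical placeholders and do not describe an existing implementation.

```python
from typing import Callable, Dict, List


def dependability_cycle_step(
    monitor: Callable[[], Dict[str, float]],
    required: Dict[str, float],
    adapt: Callable[[str], bool],
) -> List[str]:
    """Run one monitor-evaluate-adapt iteration.

    Returns the critical services that are below their requirement and
    for which no adaptation was possible.
    """
    observed = monitor()  # e.g. estimated availability per service
    unresolved = []
    for service, threshold in required.items():
        # A service missing from the monitoring data counts as unavailable.
        if observed.get(service, 0.0) >= threshold:
            continue
        # adapt() returns False when no adaptation measure could be triggered.
        if not adapt(service):
            unresolved.append(service)
    return unresolved


# Hypothetical monitoring data, requirements and adaptation capability.
observed_now = {"location": 0.999, "messaging": 0.90}
requirements = {"location": 0.99, "messaging": 0.95, "maps": 0.9}
critical = dependability_cycle_step(
    lambda: observed_now, requirements, lambda s: s == "messaging"
)
print(critical)
```

Returning the unresolved services instead of discarding them reflects the point made above: awareness of a critical state is useful even when nothing can be adapted.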