Web services technologies are changing the way we design, describe, publish, and discover Web systems. Essentially, these technologies support autonomous software components that use XML-based standards for interface definition, such as WSDL (Web Services Description Language); remote communication, such as SOAP; and service registration and discovery, such as UDDI. 1,2 Web services represent an evolution from conventional browser interfaces to Web-enabled components that integrate business processes within and across enterprise boundaries.

Quality of service is critical to Web services design, particularly when clients can choose among semantically equivalent services offered by different servers. 3,4 Such replicated services might be located in different and competing organizations or in the same organization. The communication infrastructure usually affects the QoS attributes that clients consider in service selection, such as performance and availability, so service evaluations must occur on the client side. Moreover, when different organizations deploy semantically equivalent Web services, they often provide interfaces that aren't compatible with the ones clients expect. In such cases, the flow of Web services requests must interpose adapters that enable remote interfaces to conform to the client requirements. Ideally, middleware systems used in Web services client implementations should encapsulate both server-selection policies and adapters. In other words, these systems should provide support for replication transparency.

However, most current middleware systems, such as Apache Axis (http://ws.apache.org/axis) and JAX-WS (http://java.sun.com/webservices), lack support for replication transparency. To tackle this problem, we propose using smart proxies to extend a middleware system with such transparency. Smart proxies are a metaprogramming mechanism commonly used in the customization and extension of middleware systems. 5,6 Several middleware platforms have implemented them, including TAO 6 and Java RMI. 7

SmartWS is a system that uses smart proxies to encapsulate a variety of server-selection policies representing a broad spectrum of those typically used in selecting independent, autonomous Web services (see the "Related Work in Replicated Web Resource Access" sidebar). Moreover, SmartWS supports a new server-selection policy that combines the advantages of two of the more effective policies described in the literature. SmartWS proxies can also encapsulate adapters to bridge potential interface incompatibilities between Web services and clients. Experimental performance results on a prototype system support preliminary guidelines for choosing different server-selection policies.

SmartWS users can also implement adapter objects that the system interposes between the generated smart proxies and remote servers. Smart proxies use such adapters to translate remote Web service interfaces into the interfaces clients require—that is, the abstract interfaces. The current SmartWS implementation supports the generation of smart proxies in Java using Apache Axis as the underlying middleware infrastructure.

To make its implementation backward-compatible with standard Web services clients, SmartWS makes no effort to keep the state of the available replicas consistent—for example, by using active or passive replication techniques. 8 For this reason, users should rely on SmartWS only when requesting replicated services that any available server can handle—for example, read-only services.

On the other hand, when dispatching requests that affect the contacted service providers' state, users can disable automatic server selection by defining a group of invocations (or a session) that generated smart proxies should dispatch to a single server.

Server-selection policies

Currently, SmartWS supports five server-selection policies:

Static policy. Developers must define a given service's server-invocation order at deployment time. The generated smart proxy will invoke the first server of this sequence; if the server fails, the proxy will invoke the next server and so on. The static policy fails when all the listed servers fail; in this case, the proxy generates an exception to the client application.

Random policy. The proxy accesses the server randomly at invocation time. If the selected server fails, the random-selection process repeats using the remaining servers. If all servers fail, the proxy propagates an exception to the client process.

Parallel policy. The proxy uses threads to invoke all listed servers concurrently. It returns the first reply to the client application and ignores the remaining replies. If all servers fail, the proxy raises an exception.

Best-median policy. This policy first computes, for each server, the median response time of its most recent invocations. Then the proxy invokes the server with the lowest median. If this server fails, the proxy invokes the remaining servers sequentially, following the ascending order of the medians. If all servers fail, the proxy raises an exception.

Parallel best-median policy. PBM generalizes the parallel and best-median policies, striving to combine their benefits, as we now describe in more detail.

Suppose r servers provide a replicated Web service. Like the best-median policy, PBM first computes the median response time of each server's most recent invocations. Next, it concurrently invokes the server with the lowest median and the servers whose medians are less than or equal to m * k, where m is the lowest median and k is a constant (k > 1) that defines an upper-bound median for the servers that the policy invokes in parallel. Moreover, the number of servers accessed in parallel is limited to p, where p ≤ r. If all such servers fail, the proxy invokes the remaining servers sequentially, following the ascending order of the medians. If the remaining servers also fail, the system raises an exception. At the start of each cycle of n requests, the PBM policy assumes the parallel policy's behavior for t requests. For example, supposing n = 16 and t = 3, the PBM policy will invoke all available servers in parallel at invocations (1, 2, 3), (17, 18, 19), (33, 34, 35), and so on.
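The selection step described above can be sketched in Java as follows. This is a minimal illustration, not the SmartWS implementation: the class name PbmSelector, its method names, and the use of a map from server alias to a buffer of recent response times are all assumptions made for the example. It computes the medians, picks the servers to invoke concurrently, and checks whether an invocation falls in the periodic parallel window.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not the SmartWS implementation.
final class PbmSelector {
    private final double k; // threshold factor, k > 1
    private final int p;    // maximum number of servers invoked in parallel
    private final int n;    // cycle length, in requests
    private final int t;    // fully parallel requests at the start of each cycle

    PbmSelector(double k, int p, int n, int t) {
        this.k = k;
        this.p = p;
        this.n = n;
        this.t = t;
    }

    // Median of a non-empty buffer of recent response times (milliseconds).
    static double median(long[] times) {
        long[] sorted = times.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    // True when the 1-based invocation number falls in the parallel window.
    boolean inParallelWindow(long invocation) {
        return (invocation - 1) % n < t;
    }

    // Server aliases in ascending order of median response time.
    static List<String> rankByMedian(Map<String, long[]> history) {
        List<String> ranked = new ArrayList<>(history.keySet());
        ranked.sort(Comparator.comparingDouble((String s) -> median(history.get(s))));
        return ranked;
    }

    // Servers to invoke concurrently for this invocation (history must be non-empty).
    // Servers left out of the result form the sequential fallback, in ranked order.
    List<String> firstWave(long invocation, Map<String, long[]> history) {
        List<String> ranked = rankByMedian(history);
        if (inParallelWindow(invocation)) {
            return ranked; // behave like the parallel policy
        }
        double bound = median(history.get(ranked.get(0))) * k; // m * k
        List<String> wave = new ArrayList<>();
        for (String server : ranked) {
            if (wave.size() < p && median(history.get(server)) <= bound) {
                wave.add(server);
            }
        }
        return wave;
    }
}
```

With n = 16 and t = 3, inParallelWindow is true for invocations 1 to 3, 17 to 19, and so on, matching the example in the text.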

Even reliable Web servers can present temporary response-time degradations—for example, due to bandwidth fluctuations. The best-median policy handles such situations by selecting the server with the best median at each invocation. However, this policy doesn't update ignored servers' invocation times, so it can't recover from their occasional performance degradations. To address this problem, the PBM policy periodically assumes a parallel behavior, which forces an update in the invocation time of servers that haven't been considered during the policy's normal, best-median behavior. Furthermore, PBM's normal behavior lets users configure the number of servers invoked concurrently. By adjusting this parameter, developers can establish a trade-off between selection quality and communication overhead. On the one hand, invoking more servers in parallel increases the possibility of contacting the best replica at the moment of invocation. On the other hand, accessing fewer servers reduces the incoming traffic inherent to the parallel policy.

In the PBM policy, SmartWS users must define the size of the buffer that stores the invocation times and the values of k, p, n, and t on the basis of the client, network, and server configurations. In particular, if p = 1, the PBM policy behaves similarly to the best-median policy; the only difference is the window of t parallel accesses at cycles of n invocations. If p = r and k = ∞, PBM has the same behavior as the parallel policy.
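The concurrent step shared by the parallel policy and PBM (invoke several servers, return the first reply, ignore the rest, and raise an exception if all fail) maps naturally onto the standard java.util.concurrent API. The sketch below is an assumption about how such a step could be written, not the code SmartWS generates; the class name ParallelDispatch and the choice of IllegalStateException are hypothetical.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper for illustration; not the SmartWS implementation.
final class ParallelDispatch {
    // Returns the first successful reply from a non-empty list of calls.
    // Throws IllegalStateException when every call fails.
    static <T> T firstReply(List<Callable<T>> calls) {
        ExecutorService pool = Executors.newFixedThreadPool(calls.size());
        try {
            // invokeAny returns the result of one call that completed
            // successfully and cancels the calls still running.
            return pool.invokeAny(calls);
        } catch (ExecutionException e) {
            throw new IllegalStateException("all servers failed", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a reply", e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Each Callable would wrap one remote invocation. One thread per server keeps the sketch simple; a real proxy would more likely reuse a shared pool.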

Adapters

We can't assume that developers in charge of designing semantically equivalent Web services will agree on standard interfaces for describing their systems. More likely, Web service interfaces will present type and structural incompatibilities, for example, in the names of the available operations, the number and type of invocation parameters, and the type of return values. To address this problem, SmartWS generates smart proxies that rely on adapters, that is, objects specifically designed to bridge client-server interface incompatibilities. In this first system implementation, we decided to delegate adapters' implementation to SmartWS users. However, in the future, we plan a semiautomated approach to adapter generation, for example, using template mappings.

In the current system, smart-proxy classes implement an abstract interface. The system user defines this interface, which, from the client's point of view, standardizes the signatures of the service operations that a set of autonomous servers provides. An adapter for a remote interface I1 made available by a server S1 is an object that implements the abstract interface I (expected by the client) and that wraps a reference of type I1 denoting the server S1. Each method of I implemented in the adapter performs the required adaptations and then redirects the incoming call to S1.
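As an illustration of this structure, the following sketch shows an adapter for an invented stock-quote service. All names here (QuoteService as the abstract interface I, S1StockPort as the remote interface I1, and S1QuoteAdapter) are hypothetical and chosen only to show the three kinds of mismatch mentioned above: operation name, parameter list, and return type.

```java
// Abstract interface I expected by the client (hypothetical).
interface QuoteService {
    double quote(String symbol);
}

// Remote interface I1 made available by server S1 (hypothetical stub type).
interface S1StockPort {
    String getPrice(String ticker, String currency);
}

// Adapter: implements I and wraps a reference of type I1 denoting S1.
final class S1QuoteAdapter implements QuoteService {
    private final S1StockPort s1;

    S1QuoteAdapter(S1StockPort s1) {
        this.s1 = s1;
    }

    @Override
    public double quote(String symbol) {
        // Adapt the operation name and supply the extra parameter S1 requires,
        // then convert S1's string result to the type the client expects.
        String price = s1.getPrice(symbol, "USD");
        return Double.parseDouble(price);
    }
}
```

The client and the smart proxy see only QuoteService, so a second server with yet another interface would need only its own adapter class.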

Sessions

Designing clients of semantically equivalent Web services often requires assigning a single server to handle a request sequence—for example, when operations must access a server state that a previous request had modified. Consider a client that executes a login operation in a given server: it expects subsequent requests to be delivered to the same server until it calls a logout operation. To handle such cases, SmartWS supports the definition of operations that delimit a session of service invocations that smart proxies must dispatch to a single server.
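A minimal sketch of this session behavior, under stated assumptions, appears below. The SessionRouter class, its route method, and the representation of the selection policy as a Supplier of server aliases are hypothetical; the sketch only shows how a proxy could pin one server between the session-opening and session-closing operations.

```java
import java.util.function.Supplier;

// Hypothetical helper for illustration; not the SmartWS implementation.
final class SessionRouter {
    private final String beginOp;          // operation that opens a session
    private final String endOp;            // operation that closes a session
    private final Supplier<String> policy; // server-selection policy, returns an alias
    private String pinned;                 // server bound to the open session, or null

    SessionRouter(String beginOp, String endOp, Supplier<String> policy) {
        this.beginOp = beginOp;
        this.endOp = endOp;
        this.policy = policy;
    }

    // Returns the alias of the server that should handle the given operation.
    synchronized String route(String operation) {
        if (pinned == null) {
            String chosen = policy.get();
            if (operation.equals(beginOp)) {
                pinned = chosen; // later calls in the session reuse this server
            }
            return chosen;
        }
        String server = pinned;
        if (operation.equals(endOp)) {
            pinned = null; // the closing call itself still goes to the pinned server
        }
        return server;
    }
}
```

Outside a session, every call consults the policy; inside one, the policy is bypassed until the closing operation has been dispatched.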

Configuration files

SmartWS configuration files are defined using a simple declarative language. In this language, the interface element defines the name of the Java interface containing the abstract interface that regulates the communication between the client application and the smart proxy. This interface is a standard Java interface, without any particular code.

The webserver element defines properties about each available server, including an alias, its WSDL interface, and, optionally, an adapter class. The following example illustrates the definition of the interface and webserver elements:

The policy element specifies the server-selection policy used to select the servers described in the webserver section. The following example shows how to specify that the generated smart proxy must support the PBM policy:

policy pbm(k=1.2, p=2, n=10, t=3, service1, ..., serviceN)

The session element defines session delimitation. Basically, developers must indicate abstract interface methods that start and end a group of service invocations that smart proxies must dispatch to the same server. For example,

session begin= login end= logout

Empirical evaluation

We carried out some initial experiments to evaluate the performance of smart proxies generated with our prototype SmartWS implementation. We focused mainly on assessing the impact of the different server-selection policies, so we didn't consider the SmartWS adapters and session features in our analysis.

Our experiments consisted of deploying a simple (stateless) echo service over six geographically distributed Web servers in Brazil:

In each experiment, we configured a client application to invoke the echo service at 30-second intervals, from Monday to Thursday, 24 hours per day. We used the following policies in a round-robin fashion in each client invocation: random, parallel, best median, and two PBM configurations (called PBM1 and PBM2, which used p = 1 and p = 2, respectively, and n = 16, t = 3, and k = 1.2 for the remaining parameters). During the first week of the experiment, we configured the client to invoke the echo service by passing an 8-Kbyte argument string. In the second and third weeks, the client used argument strings of 4 Kbytes and 1 Kbyte, respectively. We defined a 20-second timeout as indicating a server failure. The client executed in a Pentium IV 3-GHz computer with 1 Gbyte of RAM, Linux operating system, JDK 5.0, and a FastEthernet 100-Mbps network interface. We located the client at the Distributed Computing Lab of Pontifical Catholic University of Minas Gerais (a private Brazilian university) and connected to the Internet through a 2-Mbps asymmetric digital subscriber line.

Experiment 1

In the first experiment, we invoked the echo service 30,075 times, using 8-Kbyte messages. About 1.6 percent of the invocations failed, due to timeouts, client failures, and so on. Figure 2 presents the cumulative distribution function over the response times obtained with each policy. The two PBM policy variants performed best in this experiment. The random policy invokes all services with the same probability, so, as expected, it presented the worst cumulative distribution.

The best-median policy was next; its low performance in this experiment reflects the policy's inability to automatically adjust itself to cope with severe and occasional variations in a given server's response time. For example, when WS6, the best server in the experiment, responded slowly, a self-adjusting policy would start contacting the other servers. However, the best-median policy includes no mechanisms to refresh the invocation times of the ignored servers, so it couldn't recover from a server's sporadic performance degradation.

The parallel policy imposed a high overhead in each service call. The Figure 2 results show that only 10 percent of all invocations using that policy achieved response times under 2 seconds. For the best-median and PBM policies, this ratio was close to 60 percent.
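Percentages such as these are single points on an empirical cumulative distribution function. As a small illustration, the sketch below computes the fraction of measured response times at or below a threshold. The class and method names are hypothetical, and the sample values in the usage note are invented, not taken from the experiments.

```java
// Hypothetical helper for illustration; not part of the experimental setup.
final class EmpiricalCdf {
    // Fraction of response times (milliseconds) at or below the threshold.
    static double fractionAtOrBelow(long[] responseTimesMs, long thresholdMs) {
        if (responseTimesMs.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        int count = 0;
        for (long time : responseTimesMs) {
            if (time <= thresholdMs) {
                count++;
            }
        }
        return (double) count / responseTimesMs.length;
    }
}
```

For instance, for the invented samples 500, 1500, 1900, 2500, and 3000 ms, three of the five fall at or below a 2,000 ms threshold.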

Experiment 2

In the first experiment, WS6 performed best in almost 70 percent of the invocations. For this reason, we conducted a second experiment without that server. We wanted to evaluate the policy behaviors in a scenario where servers offer similar performance levels. In this new experiment, which we performed during the second week of our experiments, we invoked the echo service 26,010 times, using 4-Kbyte messages as invocation parameters. Even with smaller message sizes, the failure ratio increased to 3.6 percent, which reinforces that WS6 performed far better than the other servers.

Figure 3 presents the cumulative distribution function for this second experiment. Once more, the random policy presented the worst cumulative distribution, followed by best median. With smaller messages, the results for the parallel policy improved, particularly when considering invocations that required more than 3 seconds to complete. On the other hand, considering only invocations with small response times, the PBM1 and PBM2 policies performed better than the parallel policy. For example, more than 40 percent of the invocations dispatched using PBM1 and PBM2 had an overall response time of less than 1.25 seconds. For the parallel policy, only 32 percent of the invocations performed as well.

Figure 3. Cumulative distribution function over the response times obtained by each policy in experiment 2 (4 KB messages, servers WS1 to WS5).

Experiment 3

Next, we invoked the echo service 38,510 times, using 1-Kbyte messages and the same five servers from experiment 2. About 3.6 percent of those invocations failed. Figure 4 presents the cumulative distribution function over the response times obtained with each policy in this third experiment. The results were similar to those observed in experiment 2. Again, the random and best-median policies presented the worst performance overall. When considering only invocations completed in less than 500 milliseconds, PBM1 and PBM2 performed best. On the other hand, when also considering long-duration calls (with response times over 2 seconds), the parallel policy obtained the best results.

Summary results

Figure 5 presents the sum of the response times obtained with each policy, when regarding all invocations performed in each of the three experiments. It shows the PBM2 policy performing best for 8-Kbyte messages (experiment 1). For 4-Kbyte messages (experiment 2), PBM2 and parallel policies performed best, with similar results. Finally, for 1-Kbyte messages (experiment 3), the parallel policy obtained the best results.

Discussion

SmartWS relies on smart proxies to encapsulate several tasks that clients perform when accessing autonomous and semantically equivalent Web services. On the basis of our experiments, we offer some guidelines to help SmartWS users choose the server-selection policy that is most suitable to a given configuration of Web services providers:

In scenarios where developers know the involved Web servers' response times a priori and where such times are fairly constant, we recommend static and random policies. Static policies perform better when significant differences exist among response times; random policies when the response times are similar to each other.

We recommend the parallel policy when the involved servers have unpredictable response times. However, this policy requires the client, servers, and network to have enough resources to cope with the traffic overhead generated by the flooding behavior inherent to this policy. For example, considering the configurations used in our experiments, the parallel policy is suitable only for invocations involving SOAP messages up to 4 Kbytes in size.

In all other scenarios, we recommend using the PBM policy. We particularly recommend against using the best-median policy: as experiment 1 showed, it's not flexible enough to recover from short-term variations in a given server's response times. Instead, we suggest using PBM with p = 1, where p is the number of concurrent servers the policy contacts.

Conclusion

In addition to semiautomating an approach for implementing adapters, we also plan to investigate techniques that let smart proxies dynamically infer and change some parameters the proposed policies require—for example, the values of p, n, and t that PBM requires. We also intend to support dynamic updates in the list of servers the proposed smart proxies contact and to incorporate other QoS aspects (such as reliability, security, transactions, and privacy) as part of our server-selection policies.

Related Work in Replicated Web Resource Access

Researchers have traditionally used either server-side or client-side approaches to access replicated resources on the Web. 1 Server-side approaches usually deploy Domain Name System servers or dedicated/proprietary routers close to the replicated servers. These elements manage client requests toward available servers. For this reason, this approach is mostly indicated for servers of the same administrative domain. Moreover, server-side architectures usually fail to capture information about network conditions, including information about Web traffic and the client's local network. On the other hand, client-side approaches are usually recommended when accessing servers spread throughout the Internet. In this case, the clients must implement the server-selection policy.

Sandra Dykes, Kay Robbins, and Clinton Jeffrey investigated client-side selection algorithms for replicated services (including random, best-median, best-last, and others). 2 However, they restricted their research to traditional Web resource contexts, such as HTML documents and image files. Smart Client is another client-based solution. 3 It relies on Java applets to provide transparent access to replicated network services (including HTTP, FTP, telnet, and so on). However, Smart Client requires developers to implement applets manually, including applets for server-selection policies. SmartWS instead defines policies in a simple declarative language, from which the system automatically generates smart proxies.

Nabor Mendonça and José Silva proposed the Replicated Web Services framework for client-side selection. 4 RWS supports some of the SmartWS server-selection policies, including best median, parallel, and random. By comparison, SmartWS supports a new policy (parallel best-median), service adapters, and invocation sessions.

WS-Replication is a system that relies on group communication to support active replication in Web services. 5 It requires accessed replicas to be configured with components that support SOAP-based group-communication semantics. As a result, WS-Replication isn't compatible with standard Web services providers, such as the servers used in the experiments we describe in the main text. On the other hand, WS-Replication supports fault tolerance even when the called operations have side effects. FT-SOAP is another replication framework; it extends server-side components with passive-replication support to provide fault-tolerant Web services. 6

Shankar Ponnekanti and Armando Fox investigated the types of incompatibilities that might arise when Web services interfaces evolve independently. 7 They propose a GUI tool for resolving incompatibilities and generating middleware components, called cross stubs, that enable interoperation with semantically equivalent services. We plan to incorporate a similar tool in SmartWS to make generating adapters more functional and scalable.

Marco Túlio Valente is an assistant professor at the Institute of Informatics at Pontifical Catholic University of Minas Gerais. His research interests include middleware, programming languages, aspect-oriented programming, and software engineering. He received his PhD in computer science from Federal University of Minas Gerais. He's a member of the Brazilian Computer Society. Contact him at PUC Minas, Unidade Sao Gabriel, Anel Rodoviario Km 23.5, Belo Horizonte, MG, Brazil, 31980-110; mtov@pucminas.br.

Nabor C. Mendonça is a titular professor at the Center of Technological Sciences, University of Fortaleza. His research interests include distributed systems, service-oriented development, aspect-oriented programming, and software maintenance and reengineering. He received his PhD in computing from Imperial College London. He's a member of the IEEE Computer Society and the Brazilian Computer Society. Contact him at UNIFOR, Av. Washington Soares, 1321, Fortaleza, CE, Brazil, 60811-905; nabor@unifor.br.