Web-services using the SOAP or REST style are becoming more and more popular in bioinformatics too. The Embrace Registry and BioCatalogue are registries that provide a good overview of publicly available bioinformatics services. They aim to make services easier to describe and find. It is also often said that web-services provide a platform- and language-independent interface to databases and computation. Furthermore, web-services can be composed into complex workflows and used for data integration and for automated user-interface and API generation. That, at least, is the theory. How does the reality look for you?

Do you use SOAP/REST/.NET services in bioinformatics, and what are your experiences with the different services and service providers?

Main aspects I am interested in are motivated by my own recent (and very mixed) experiences:

Did you encounter interoperability or language-dependence problems?

How did the providers react?

What would make you replace local scripts and tools by web-services?

Edit: I will have difficulty choosing the right answer, because everything said so far is valid.

Here are some intermediate results. I am currently testing different SOAP services with Axis2/Java, using wsdl2code to generate the Java code and trying the adb and xmlbeans data bindings. Maybe this list will grow:

KEGG: wsdl2code: only with xmlbeans, usable: no, couldn't set arrayOfString because of the use of soapenc:array

BRENDA: wsdl2code: Error message: Wsdl not WS-I compliant, usable: no

BioMart soap: wsdl2code: yes, usable: no, after a few modifications of the WSDL file and tweaking Axis2 parameters I could send a valid message, but the response message is not valid

I have used the Stanford HIVdb web service, which is a fantastic tool for identifying drug resistant mutations in HIV sequences. Fortunately, they provide the entire client-side base code in both Perl and Java, otherwise it might have been too onerous to deal with. I think this is absolutely critical to getting any traction if you intend to roll out a web service like this.

I believe our group was the first to submit pyrosequencing data to this service, which meant they received tens of thousands of hits from us instead of just a few. Still, it was able to deal with the increased load over a weekend.

Did you encounter interoperability or language-dependence problems?

Yes, their Perl client did not work after a certain version. Fortunately they provided a Java client which continued to work.

How did the providers react?

Very well. They were very helpful.

What would make you replace local scripts and tools by web-services?

Because I work in industry now, it would be difficult for me to get the use of these services blessed by the powers that be without a security framework in place (https?). There also appears to be a lot more available on the human side than for plants.

I would say SOAP is losing the popularity contest to REST, not because of a lack of merits, but because only Pierre Lindenbaum understands how to use SOAP. Seriously though, I think SOAP was/is too complex or intimidating for most end-users to wrap their head around even though it is a more powerful framework.

Another factor that I think will be in REST's favor is the proliferation of modern web frameworks like Rails and Grails that make it easy to develop RESTy interfaces which serve both human and robot clients with the flip of a switch.

IMHO, I think SOAP isn't "too complex" in the sense that it's impossible to figure out. But it's certainly more complex than a REST service, and I've never found a situation (as a consumer or a provider) where that complexity was substantially better. So given the choice between simple and complex for the same output, I'm choosing simplicity every time...

I think many are overwhelmed by the complexity of new methods in general, and it might be too much to say that I, for example, understand SOAP. But I can generate clients and servers from WSDLs, using e.g. the Axis2 tools, and then use them quite easily without knowing all the intricacies of the protocols. What do you think about this?

I have nothing personally against SOAP, but I found it was a watershed moment when Google dropped support for its SOAP Search API. As I understand it, there will always be a need for SOAP for really complex queries, but any time you need a special library just to interact with a data source you are introducing another ingredient into the mix that can confuse people. So it is kind of a lowest common denominator thing.

The choice is yours as a programmer, and I think it's better to do one thing right than to try to do two things in a mediocre way. However, that choice gives me headaches, because (repeating myself)
"Having well-defined interfaces is crucial for e.g. automatic user-interface generation, combining web-services into work-flows. Also semantic annotation of web-services requires a defined interface.
Without a contract, the client can simply not know what to expect." That's why large projects like EMBRACE have chosen to support SOAP as a standard.

Interesting topic. The bio domain has loads of great resources available as web sites. I like it a lot when data providers give programmatic access to structured data over HTTP (a very inclusive definition of web service), and that's something we should support and promote.

How they choose to do it is up to them. My experience has been that SOAP and the tooling that surrounds it don't add much of value. Nonetheless, it beats the pants off of screen scraping, which is truly horrible. Whether a data provider serves up JSON, XML, or whatever, give me a URL that points to something I can parse, and I'm happy.
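To make the "give me a URL that points to something I can parse" point concrete, here is a minimal Python sketch using only the standard library. The function names parse_payload and fetch are made up for this illustration, and the sketch assumes the provider labels its responses with an honest Content-Type header.

```python
import json
import xml.etree.ElementTree as ET
from urllib.request import urlopen


def parse_payload(body, content_type):
    """Parse a response body (bytes) as JSON or XML, based on its Content-Type."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json" or media_type.endswith("+json"):
        return json.loads(body)       # returns dicts/lists
    if media_type in ("application/xml", "text/xml") or media_type.endswith("+xml"):
        return ET.fromstring(body)    # returns the root Element
    raise ValueError("unsupported content type: " + media_type)


def fetch(url):
    """Download a URL and parse it (this performs a real HTTP request)."""
    with urlopen(url) as response:
        content_type = response.headers.get("Content-Type", "")
        return parse_payload(response.read(), content_type)
```

The parsing step is kept separate from the download so it can be tried on saved responses without touching the network.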

For what it's worth, my experience comes from writing a Firefox plugin called Firegoose, whose purpose is to exchange data between web resources (STRING, KEGG, DAVID, among others) and desktop tools. I was able to work with web services of various kinds, when available, and suffered through screen scraping in some unhappy cases. Doing data integration in the browser is nice because you can navigate the usual way to a resource, let's say a protein interaction network from STRING. When you've got something you like, you can then pull that into a desktop tool like Cytoscape or R and work with it further. It's a lot like workflow tools such as Taverna, but more ad hoc and user-directed.

I agree, SOAP is not difficult, especially if you use a SOAP implementation like Axis2 or JavaWS. Most depends on the WSDL being made correctly. soapenc:array is a big legacy problem. It arises when people don't know how to define their XML schema for complex types correctly. It's a shame that very large institutions are unable or unwilling to test their services properly.

We're currently gearing up to implement some of our fragmented analyses into Taverna workflows. The motivation for us is to provide web applications for both ourselves and other community members (for both dry and wet lab people) to get the most out of our high throughput data sources. There is no denying that part of this motivation is not purely philanthropic, but also to raise the impact of our work which makes our funders happy.

From a technical point of view we are having to implement our own web services from quite a wide range of sources, mainly Perl, C, C++ and R. Some of these are quite easy to do fairly directly with SOAP/WSDL (especially for Perl). Others like R can take advantage of some recent tools such as RShell http://www.ncbi.nlm.nih.gov/pubmed/19607662. The hardest is turning new algorithms written in C or C++ into usable web services, where we really are feeling our way around at the moment.

If you're looking at providing R services you might want to look at http://www.rforge.net/Rserve/ "Rserve is a TCP/IP server which allows other programs to use facilities of R (see www.r-project.org) from various languages without the need to initialize R or link against R library. Every connection has a separate workspace and working directory. Client-side implementations are available for popular languages such as C/C++ and Java. Typical use is to integrate R backend for computation of statistical models, plots etc. in other applications."

Thanks Daniel, we are definitely looking at Rserve as well, but I have to admit I hadn't thought of using it for our C/C++ code, but having just looked at the documentation that looks like a great way to go.

How about the C implementation of Axis2? I found the Java Axis2 to be quite standards-compliant, so maybe it's worth giving that a try? I would like to try to convince providers to offer interoperable services, so maybe the point is mainly to find libraries that support this.

I feel somewhat embarrassed to admit that I know very little about these web-services. I knew that they existed but never saw examples of actual use cases performed with them. When simply reading about a service, they feel a little complicated, imposing a cognitive overhead that may not obviously pay off in the long term.

But I have learned a number of neat tricks here on this site on how to access various resources, and I plan to put those to use. I find cookbook-like approaches ("this is how we do X or Y with a bioservice") to be the most effective method of demonstrating their value.

Edit (forgot to answer this):

What would make you replace local scripts and tools by web-services?

For me the first priority is the simplicity of the overall solution: what is the added complexity of one approach versus the other, and how does that weigh against the overall goals?

I have always very much liked the idea of web-services, though the several standards have mixed goals and features. I quite like the idea of SOAP. In practice SOAP settled on HTTP as its transport, but there are alternatives, like SOAP over XMPP.

The standard has been complex and large, resulting in many partial implementations. This makes SOAP difficult to use in practice and has led to best practices that effectively reduce the size of the standard, so that libraries can focus on that subset. WSDL is one additional standard required by most of those best practices.

Moreover, those incomplete libraries are often mutually incompatible, which has prominently been the case for Axis1/Axis2: some SOAP services could be properly accessed by the former and not the latter, and the other way around. Try setting up a client that supports services that require both versions.

REST is much simpler, but does not offer the standardized discovery, and any service may use a different design.

Disclaimer: we developed an XMPP alternative that supports asynchronous web services recently, doi:10.1186/1471-2105-10-279.

I will take part in an EMBRACE course in Copenhagen for web-service providers in bioinformatics in June. Therefore, I need to sum up my own thoughts and experiences about recommendations that I would give to other institutions that already provide or plan to provide web-services. That's why I give an answer to my question here.

This is all based on my personal experience:
IMHO, the big promise of interoperability is currently nothing but a promise. Interoperability simply doesn't exist yet; instead there are many Perl services which talk exclusively to Perl clients, Java clients that understand Java servers, and so on. In my experience, this is not only a question of SOAP vs. REST. It is more fundamental, mainly related to a lack of testing and of implementation competence on the side of the providers.

Many of the service providers are not hobby programmers or one-man projects; they are hosted by large consortia and institutions that are funded to provide data and services in a sustainable and interoperable way. Certainly, the requirements of testing and interoperability must be interpreted more strictly for those.

To address that, I would like to give some very basic recommendations regarding interoperability to people implementing web-services. Many of them apply to both SOAP and REST, and most are basic software-engineering common sense:

test all your artifacts (WSDLs, XML schemas) and make sure they validate against their definition

test and validate generated XML/SOAP messages against their schema. Distributing artifacts that do not validate is like delivering code that does not compile! (you can use SoapUI's validator, for example)

test your service for interoperability: choose (at least two) other client languages to check the clients with

document which language/clients your service is compatible with / has been tested with

put example clients online

if you cannot perform these tests, mark your service as experimental or don't publish it at all. Don't claim it's interoperable. That will save a lot of time and frustration.
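Full schema validation needs a validating tool such as the SoapUI validator mentioned above. As a much weaker first check that can run in any test suite, the following Python sketch only verifies that a generated message is well-formed XML and is wrapped in a SOAP 1.1 Envelope with exactly one Body. The function name check_envelope is made up for this illustration, and a service using SOAP 1.2 would need a different namespace.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope; SOAP 1.2 uses a different one.
SOAP11_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def check_envelope(message):
    """Return a list of problems found in a SOAP 1.1 message (empty if none).

    This is a smoke test only; it does NOT validate against the service schema.
    """
    try:
        root = ET.fromstring(message)
    except ET.ParseError as error:
        return ["not well-formed XML: " + str(error)]
    if root.tag != "{%s}Envelope" % SOAP11_NS:
        return ["root element is not a SOAP 1.1 Envelope"]
    bodies = root.findall("{%s}Body" % SOAP11_NS)
    if len(bodies) != 1:
        return ["expected exactly one Body, found %d" % len(bodies)]
    return []


GOOD_MESSAGE = (
    b'<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">'
    b"<soapenv:Body><getGene>TP53</getGene></soapenv:Body>"
    b"</soapenv:Envelope>"
)
print(check_envelope(GOOD_MESSAGE))
```

Running the same check over real responses from several client languages is a cheap way to catch the worst problems before publishing a service.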

Some SOAP recommendations:

Make your service WS-I Basic Profile 1.2 compliant. Check that using the WS-I checker.

In particular: Avoid use of soap-encoding (soapenc:array) in schema definitions. Use of soapenc:array locks out most or all standards-compliant clients.
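A provider can look for SOAP encoding in a WSDL before publishing it. The Python sketch below is an assumption-laden illustration, not a replacement for the WS-I checker: the function name find_soap_encoding and the sample WSDL fragment are made up, and it only looks for attribute values that use a prefix bound to the SOAP encoding namespace and for use="encoded" in bindings.

```python
import io
import xml.etree.ElementTree as ET

SOAPENC_NS = "http://schemas.xmlsoap.org/soap/encoding/"


def find_soap_encoding(wsdl_bytes):
    """List the places where a WSDL document relies on SOAP encoding.

    Simplification: namespace prefixes are collected for the whole document,
    so a prefix that is re-bound in a nested scope is not tracked exactly.
    """
    prefixes = set()
    findings = []
    events = ET.iterparse(io.BytesIO(wsdl_bytes), events=("start-ns", "start"))
    for event, payload in events:
        if event == "start-ns":
            prefix, uri = payload
            if uri == SOAPENC_NS:
                prefixes.add(prefix)
            continue
        local_name = payload.tag.rsplit("}", 1)[-1]
        for name, value in payload.attrib.items():
            attr = name.rsplit("}", 1)[-1]
            # e.g. <soap:body use="encoded" .../> in a binding
            if attr == "use" and value == "encoded":
                findings.append('<%s> use="encoded"' % local_name)
            # e.g. <xsd:restriction base="soapenc:Array"> in a schema type
            elif any(value.startswith(p + ":") for p in prefixes):
                findings.append('<%s> %s="%s"' % (local_name, attr, value))
    return findings


# Hypothetical, heavily shortened WSDL fragment for illustration only.
SAMPLE_WSDL = b"""<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/">
  <types><xsd:schema>
    <xsd:complexType name="ArrayOfstring"><xsd:complexContent>
      <xsd:restriction base="soapenc:Array"/>
    </xsd:complexContent></xsd:complexType>
  </xsd:schema></types>
</definitions>"""

print(find_soap_encoding(SAMPLE_WSDL))
```

An empty result does not prove WS-I compliance; it only means this particular legacy pattern was not found.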

Avoid using or requiring legacy software, such as Axis1 (Java). Most languages have more standards-conforming libraries.

REST vs SOAP

There are many pros and cons, but each method has its merits. With REST it is probably easier to make interoperable services, since in principle you need only a browser. However, SOAP and WSDL provide a way to specify a contract between service and client which can be easily checked.

Having well-defined interfaces is crucial for e.g. automatic user-interface generation, combining web-services into work-flows. Also semantic annotation of web-services requires a defined interface.
Without a contract, the client can simply not know what to expect.

SOAP and WSDL are aimed much more at complex queries and work-flows.
The modern implementations of SOAP make it simpler to generate clients and services from the service description (WSDL); all that is required is a little more testing.

One of the concepts of a truly RESTful service is autodiscovery. The client starts from a single entry point and keeps discovering the rest of the services and the workflow without any prior assumptions.
Another part of REST is serving different data representations from the same resource URL based on the content type. For example, a REST-based URL/resource (http://myrest.org/[?]/sequence) can return a sequence in various formats (FASTA, GenBank, UniProt, GFF) depending on what the client requests.
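The "same URL, different representation" idea can be sketched on the server side in a few lines of Python. Everything here is hypothetical: the record, the media type text/x-fasta and the function names are made up for the illustration, and the negotiation is deliberately simplified (it ignores q-values in the Accept header).

```python
import json

# Hypothetical sequence record that the resource represents.
SEQUENCE = {"id": "seq1", "residues": "ATGGCC"}


def to_fasta(record):
    return ">" + record["id"] + "\n" + record["residues"] + "\n"


def to_json(record):
    return json.dumps(record)


# Media types this sketch can serve; "text/x-fasta" is an assumed name.
RENDERERS = {
    "text/x-fasta": to_fasta,
    "application/json": to_json,
}


def negotiate(accept_header, default="text/x-fasta"):
    """Pick the first acceptable media type we can serve (q-values ignored)."""
    if not accept_header.strip():
        return default  # no Accept header means anything is acceptable
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip().lower()
        if media_type in RENDERERS:
            return media_type
        if media_type == "*/*":
            return default
    return None


def get_sequence(accept_header):
    """Return (status, content type, body) for the one sequence resource."""
    media_type = negotiate(accept_header)
    if media_type is None:
        return 406, "", ""  # 406 Not Acceptable
    return 200, media_type, RENDERERS[media_type](SEQUENCE)


print(get_sequence("application/json"))
print(get_sequence("text/x-fasta"))
```

The client only changes the Accept header; the URL of the resource stays the same.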

@biosidd: I am sorry, this does not make sense to me. I don't understand what RESTful has to do with autodiscovery. How can any interaction work without "any prior assumption"? Especially when nothing is known about the service datatypes?

@Michael: By autodiscovery I meant that the response for the starting resource can contain links to the other resources.
If datatype refers to content type, then just check the HTTP header of the response.
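A minimal Python sketch of what following links from a starting resource could look like. The resource paths, the "links" key and the in-memory RESOURCES table are all made up for the illustration; a real client would fetch each URL over HTTP and would still need to know the link format in advance.

```python
from collections import deque

# Hypothetical in-memory stand-in for a service: path -> response document.
RESOURCES = {
    "/": {"links": {"genomes": "/genomes"}},
    "/genomes": {"links": {"yeast": "/genomes/yeast"}},
    "/genomes/yeast": {"links": {"self": "/genomes/yeast", "up": "/genomes"}},
}


def discover(start, fetch):
    """Breadth-first walk over everything reachable from the start resource.

    fetch is any callable that maps a URL to a parsed response document.
    """
    seen = []
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url in seen:
            continue
        seen.append(url)
        document = fetch(url)
        for target in document.get("links", {}).values():
            if target not in seen:
                queue.append(target)
    return seen


print(discover("/", RESOURCES.__getitem__))
```

The only thing the client is given up front is the entry point "/" and the convention that documents carry a "links" section.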

DAS-based bioinformatics web services are quite prevalent and are used by a number of large bio databases and resources (EBI, Ensembl, UniProt, WormBase, etc.). It's an HTTP GET based (read-only) web service standard using an XML format. I am not sure if it falls into the REST category, but it definitely provides a hierarchical way of discovering the services from a starting point. The DAS website has a number of web clients written in both Perl and Java. The DAS registry would be a good starting point to get a list of service providers.

It looks to me as if DAS is implemented using a RESTful pattern, just trying to additionally specify the semantics of the composed URLs and adding a bit to the HTTP header. This is in my opinion not standards-compliant at all. What I saw in the DAS specs actually shows the complexity and dilemma of trying to build an interface/API contract in human-readable form instead of basing it on a machine-readable (aka XML) and parsable format.

@Michael: I am not sure it is RESTful, maybe kind of REST; however, it does have the ubiquity of the HTTP protocol. Well, I agree it is kind of a bunch of query options crammed into a URL, but I don't get the point about it not being standards-compliant.

As far as I can see, DAS is in fact true REST, in that it is a protocol specification (plus data formats) with all the REST characteristics (see Roy Fielding's thesis http://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arch_style.htm). While the protocol used is based on HTTP and is compatible for most practical purposes, it is distinct and has a specification which details the behaviors required of client and server implementations. This contrasts with web applications which provide a REST-like interface by using HTTP.