Introduction

The Semantic Biomedical Tagger (SBT) recognizes 133 biomedical entity types out of the box and semantically links them to a knowledge base, in this case LinkedLifeData (LLD). The SBT can load entity names from the LLD service or from any other RDF database that exposes a SPARQL endpoint. The current version is preloaded with the latest release of the LLD dataset. All URIs used by the SBT are resolvable and can be opened in a web browser or through a machine-accessible API.

REST API

The details on the REST API for the Semantic Biomedical Tagger service are available on the Text Analytics page.

SBT annotations

The SBT creates semantic annotations that have names (Annotation type) and features: class (URI), instance (URI), and string (instance label). Both URIs can be further explored in the LLD service.

For example, if you annotate the sample text above with the S4 demo UI, you will see a result like this:
The JSON request for the Semantic Biomedical Tagger (SBT) service looks like this (please refer to the Text Analytics page for details on the JSON input/output formats):
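As a rough illustration, the request document can be assembled programmatically. The sketch below is a minimal Python example; the field names "document" and "documentType" and the placeholder text are assumptions made for illustration, so check them against the Text Analytics page before relying on them.

```python
import json


def build_sbt_request(text: str) -> str:
    """Serialize a plain-text document into a JSON request body."""
    payload = {
        "document": text,              # assumed field name for the raw text
        "documentType": "text/plain",  # assumed field name for the MIME type
    }
    # ensure_ascii=False keeps characters such as "β" readable in the output
    return json.dumps(payload, ensure_ascii=False)


# Placeholder input, not the sample text used elsewhere on this page
request_body = build_sbt_request("Keywords: Type 1 diabetes, pancreatic β cells")
print(request_body)
```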

RESTful Request (Plain Text Content)

We are now ready to send a simple RESTful request to the S4 text analytics services using a command-line tool such as curl:

Let's go step-by-step through the sample code above:

we specify the API Key and secret - all S4 requests need a valid API key and secret pair which can be generated from the S4 Management Console

we specify the S4 RESTful service to be used - in this case the "Semantic Biomedical Tagger" text analytics service. Note that as part of the endpoint URL we also provide the API key and secret

we make a RESTful request to the S4 service via curl, providing the JSON request document described earlier and the S4 service endpoint (from step 2), and we set the HTTP Content-Type header to "application/json"
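The same three steps can be sketched in Python with the standard library. This is a hedged illustration, not the official client: the endpoint URL and the JSON field names are assumptions, and the key and secret are placeholders. Instead of embedding the credentials in the URL, the sketch sends them as HTTP Basic credentials, which is what curl does with a key:secret@host URL.

```python
import base64
import json
import urllib.request

# Step 1: placeholders for the key/secret pair from the S4 Management Console
API_KEY = "your-api-key"
API_SECRET = "your-api-secret"

# Step 2: assumed endpoint URL; confirm it on the Text Analytics page
SERVICE_URL = "https://text.s4.ontotext.com/v1/sbt"


def build_request(json_body: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying the JSON document."""
    credentials = f"{API_KEY}:{API_SECRET}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(
        SERVICE_URL,
        data=json_body.encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Step 3: perform the call and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Assumed field names; only the request object is built here, nothing is sent
body = json.dumps({"document": "placeholder text", "documentType": "text/plain"})
req = build_request(body)
print(req.get_method(), req.full_url)
```

Calling send(req) performs the actual network request and requires a valid key and secret.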

RESTful Request (Office Formats Content)

The following example demonstrates the processing of Office documents (Word) as input for the S4 text analytics services. The result is in the format described in the next section.

The API key, secret, and service URL are configured in the same way as in the previous example. The request payload comprises two parts:

The RESTful request itself is performed via curl as a multipart message. The HTTP content type of the request should not be set explicitly (curl configures it properly); however, the JSON part 'meta' must explicitly set its content type ("type=application/json").
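To make the shape of such a multipart message concrete, the sketch below assembles one by hand in Python. Only the part name 'meta' and its JSON content type come from the description above; the second part name "data", the contents of the 'meta' document, and the file name are assumptions made for illustration.

```python
import json
import uuid

# Registered MIME type for Word (.docx) documents
DOCX_TYPE = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"


def build_multipart(meta: dict, file_name: str, file_bytes: bytes):
    """Return (body, content_type) for a two-part multipart/form-data message."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="meta"\r\n'
        # The JSON part declares its own content type explicitly
        "Content-Type: application/json\r\n\r\n"
        f"{json.dumps(meta)}\r\n"
        f"--{boundary}\r\n"
        # "data" is an assumed name for the part carrying the Office file
        f'Content-Disposition: form-data; name="data"; filename="{file_name}"\r\n'
        f"Content-Type: {DOCX_TYPE}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    # The overall Content-Type carries the boundary, as curl would generate it
    return head + file_bytes + tail, f"multipart/form-data; boundary={boundary}"


# Assumed 'meta' content; the four bytes stand in for a real .docx file
body, content_type = build_multipart(
    {"documentType": DOCX_TYPE}, "sample.docx", b"PK\x03\x04"
)
print(content_type)
```

The returned content type would go into the request's Content-Type header, and the body would be sent as the POST data.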

JSON Result

The result of the service invocation is another JSON document (its structure is described on the Text Analytics page) that contains the annotations, together with their offsets, for the various entities found in the text:

Pharmacologic Substance: "sitagliptin", "lansoprazole"

disease: "Type 1 diabetes"

cells: "pancreatic β cells"

molecular function: "dipeptidyl peptidase-4 inhibitors"

activities, laboratory procedures, etc.

original text snippet + sentence splits

The full JSON response is available below.

Some important details:

the original text (rows 2-5) and the sentence splits (rows 166-179) are available

the offsets of the annotations in the original text are provided with the "indices" key

additional annotation information such as type, id, string, etc. is available

the class and the instance that the annotation represents are also available
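The details above suggest how a client might walk the response. The Python sketch below is illustrative only: the sample fragment is hand-written rather than real service output, the URIs are example.org placeholders, and the key names "text", "entities", and "inst" are assumptions (only "indices" is named above). It also assumes offsets count characters from the start of the original text.

```python
# Hypothetical, hand-written fragment; NOT actual output of the service
SAMPLE_RESPONSE = {
    "text": "Keywords: Type 1 diabetes, pancreatic β cells",
    "entities": {
        "Disease": [{
            "indices": [10, 25],
            "type": "Disease",
            "string": "Type 1 diabetes",
            "class": "http://example.org/class/Disease",
            "inst": "http://example.org/instance/type-1-diabetes",
        }],
        "Cell": [{
            "indices": [27, 45],
            "type": "Cell",
            "string": "pancreatic β cells",
            "class": "http://example.org/class/Cell",
            "inst": "http://example.org/instance/pancreatic-beta-cell",
        }],
    },
}


def list_annotations(response: dict) -> list:
    """Flatten all annotations and recover the text span each one covers."""
    text = response["text"]
    rows = []
    for annotations in response.get("entities", {}).values():
        for ann in annotations:
            start, end = ann["indices"]  # offsets into the original text
            rows.append({
                "type": ann.get("type"),
                "covered_text": text[start:end],
                "class": ann.get("class"),
                "instance": ann.get("inst"),
            })
    # Order annotations by their position in the text
    return sorted(rows, key=lambda row: text.find(row["covered_text"]))


for row in list_annotations(SAMPLE_RESPONSE):
    print(row["type"], repr(row["covered_text"]), row["instance"])
```

Slicing the original text with the "indices" pair is a quick way to check that the offsets line up with the "string" feature of each annotation.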