Hypothesis: the pChEMBL property can be used to filter for active drug-target interactions

Start date: 2017-04-17 End date: 2017-04-17

Description: If we wish to integrate binding affinity data with other data sets, we need access to that data. ChEMBL is an openly licensed (CC BY-SA) data source that provides literature-reported data and offers a SPARQL endpoint to access it. The predicate for the pChEMBL value can be used to filter and list only active (pChEMBL >= 5) interactions.

Methods

Create a SPARQL query that lists all binding affinities

Create a SPARQL query that only selects those with a high pChEMBL value

Report

The example query for Gleevec activity data was used as a starting point. All SPARQL queries below use the following prefixes:
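As a minimal sketch, the declarations below cover the ChEMBL vocabulary namespace, which matches the pChEMBL predicate IRI reported further on in this report. The prefix names `cco` and `chembl_molecule`, and the molecule namespace itself, are assumptions based on common usage with the ChEMBL RDF data; the list used for the original queries may have been longer.

```sparql
# Vocabulary namespace; confirmed by the pChEMBL predicate IRI in this report.
# The prefix name "cco" is an assumed convention.
PREFIX cco: <http://rdf.ebi.ac.uk/terms/chembl#>
# Assumed namespace for molecule resources, such as the compound in the Gleevec example.
PREFIX chembl_molecule: <http://rdf.ebi.ac.uk/resource/chembl/molecule/>
```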

Then, the IRI of the predicate for the pChEMBL value was looked up with a general SPARQL query that lists all distinct predicates:

SELECT DISTINCT ?pred
WHERE {
[] ?pred []
}
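Listing every predicate in a large triple store can be slow. A narrower variant, sketched here under the assumption that the predicate IRI contains the text "pchembl" in some casing, filters on the IRI string. The endpoint may still have to scan all triples, so this is not guaranteed to be faster.

```sparql
SELECT DISTINCT ?pred
WHERE {
  [] ?pred [] .
  # Keep only predicates whose IRI mentions "pchembl", ignoring case.
  FILTER(CONTAINS(LCASE(STR(?pred)), "pchembl"))
}
```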

The predicate for the pChEMBL value turned out to be http://rdf.ebi.ac.uk/terms/chembl#pChembl. Combined with the example query, this gives the following query to count the number of activities with a defined pChEMBL value:
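A minimal sketch of such a counting query is given here. It uses only the predicate IRI found above and does not restrict the count to a single compound, which the Gleevec example query may have done. The prefix name `cco` and the variable names are assumptions.

```sparql
PREFIX cco: <http://rdf.ebi.ac.uk/terms/chembl#>

# Count every activity that has a pChEMBL value attached.
SELECT (COUNT(?activity) AS ?count)
WHERE {
  ?activity cco:pChembl ?pchembl .
}
```

Because this returns a single row, a cap on the number of returned rows should not affect it.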

Adding a filter on the pChEMBL value returns only the drug-target interactions with a pChEMBL value greater than or equal to five. However, the SPARQL endpoint caps the number of returned results at 1000. At this moment I am not sure how to overcome that limit.
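One possible workaround, not verified against this endpoint, is to page through the results with the standard SPARQL `LIMIT` and `OFFSET` modifiers. The sketch below assumes that the predicates `cco:hasMolecule` and `cco:hasAssay` exist in the ChEMBL RDF vocabulary and that the pChEMBL value is stored as a numeric literal. If it is stored as a plain string, the comparison would need a cast such as `xsd:double(?pchembl)`.

```sparql
PREFIX cco: <http://rdf.ebi.ac.uk/terms/chembl#>

SELECT ?activity ?molecule ?assay ?pchembl
WHERE {
  ?activity cco:pChembl ?pchembl ;
            cco:hasMolecule ?molecule ;   # assumed predicate name
            cco:hasAssay ?assay .         # assumed predicate name
  # Keep only active interactions.
  FILTER(?pchembl >= 5)
}
# A fixed ordering keeps the pages consistent between requests.
ORDER BY ?activity
LIMIT 1000
# First page; use 1000, 2000, ... for the following pages.
OFFSET 0
```

The query would be repeated with an increasing `OFFSET` until a page returns fewer than 1000 rows. Sorting a large result set can be expensive, so the endpoint may time out or enforce other limits. Downloading the ChEMBL RDF files and querying a local triple store is an alternative.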