Pages

Sunday, September 27, 2015

There are many fancy tools to edit ontologies. I like simple editors, like nano. And like any hacker, I can hack OWL ontologies in nano. The hacking implies OWL was never meant to be hacked on a simple text editor; I am not sure that is really true. Anyways, HTML5 and RDFa will do fine, and here is a brief write up. This post will not cover the basics of RDFa and does assume you already know how triples work. If not, read this RDFa primer first.

The BridgeDb DataSource Ontology
This example uses the BridgeDb DataSource Ontology, created by BridgeDb developers from Manchester University (Christian, Stian, and Alasdair). The ontology covers describing data sources of identifiers, a technology outlined in the BridgeDb paper by Martijn (see below) as well as terms from the Open PHACTS Dataset Descriptions for the Open Pharmacological Space by Alasdair et al.

This is the last time I show the color coding, but for a first time it is useful. In red are basically the predicates, where @about indicates a new resource is started, @typeof defines the rdf:type, and @property indicates all other predicates. The blue and green blobs are literals and object resources, respectively. If you work this out, you get this OWL code (more or less):

An OWL class
Defining OWL classes are using the same approach: define the resource it is @about, define the @typeOf and giving is properties. BTW, note that I added a @id so that ontology terms can be looked up using the HTML # functionality. For example:

An OWL object property
Defining an OWL data property is pretty much the same, but note that we can arbitrary add additional things, making use of <span>, <div>, and <p> elements. The following example also defines the rdfs:domain and rdfs:range:

<div id="aboutOrganism"
about="http://vocabularies.bridgedb.org/ops#aboutOrganism"
typeof="owl:ObjectProperty">
<h3 property="rdfs:label">About Organism</h3>
<p><span property="dc:description">Organism for all entities
with identifiers from this datasource.</span>
This property has
<a property="rdfs:domain"
href="http://vocabularies.bridgedb.org/ops#DataSource">DataSource</a>
as domain and
<a property="rdfs:range"
href="http://vocabularies.bridgedb.org/ops#Organism">Organism</a>
as range.</p>
</div>

So, now anyone can host an OWL ontology with dereferencable terms: to remove confusion, I have used the full URLs of the terms in @about attributes.

Saturday, September 19, 2015

I wanted to know when a set of publications I was aggregating on CiteULike was published. The number of publications per year, for example. I did a quick Google but could not find an R package to client to the CiteULike API, and because I wanted to play with JSON in R anyway, I created a citeuliker package. Because I'm a liker of CiteULike (see these posts). Well, to me that makes sense.

citeuliker uses jsonlite, plyr, and curl (and testthat for testing). The first converts the JSON returned by the API to a R data structure. The package unfolds the "published" field, so that I can more easily plot things by year. I use this code for that:

The percentile statistics are also useful to me. After all, there is a clear pressure to have impact with your research. Getting your research known is a first step there. That's why we submit abstracts for orals and posters too. Advertisement. Anyway, there is enough to be said about how useful #altmetrics are, and my main interest is in using them to see what people say about that, but I don't have time now to do anything with that (it's about time for dinner and Dr. Who).

But, as a last plot, and happy my online presence is useful for something, here a plot of the percentile of my papers in the journal it was published in and for the full Altmetric.com corpus:

This figure shows that my social campaign puts many of my publications in the top 10. That's a start. Of course, these do not link one-to-one to citations, which are valued more by many, even though it also does not reflect well the true impact. Sadly, scientists here commonly ignore that the citation count also includes cito:disagreesWith and cito:citesAsAuthority.

Anyways... I think I need other R packages for getting citation counts from Google Scholar, Web of Science, and Scopus.

Search This Blog

This blog deals with chemblaics in the broader sense. Chemblaics (pronounced chem-bla-ics) is the science that uses computers to solve problems in chemistry, biochemistry and related fields. The big difference between chemblaics and areas such as chem(o)?informatics, chemometrics, computational chemistry, etc, is that chemblaics only uses open source software, open data, and open standards, making experimental results reproducible and validatable. And this is a big difference!

About Me

Assistant professor at the Dept of Bioinformatics - BiGCaT at NUTRIM, Maastricht University, studying biology at an unsupervised and atomic level. Open Science is my main hobby resulting in participation in, among many others, Bioclipse, CDK and WikiPathways. ORCID:0000-0001-7542-0286. Posts on G+ are personal.

Cookies

In the EU there is a directive upcoming requiring websites to warn people about HTTP cookies. This website uses the Blogger.com platform, Google Adsense (not that is it actually paying anything significantly), and a few scripts to count how often a blog post was tweeted, using Topsy and LinkedIn. These services undoubtedly make use of cookies, which you can disallow in your browser.