It’s been almost a decade since the term “Open Science” first appeared in Wikipedia. The page was created by Aaron Swartz and initially redirected to the “Open Access” entry. Some years later this young activist took his own life, under the pressure of the criminal charges brought against him for having downloaded a large number of paywalled academic articles.

A few weeks ago I finally had my PhD dissertation defence. Below are the slides I presented (in Catalan), in which I tried to cover, in about three quarters of an hour, many of the different pieces of work involved.

One of the topics I mentioned was protein moonlighting (or multitasking), that is, the property that some protein molecules have additional functions or roles apart from the one that is primarily annotated or known. For example, the group in which I worked during my PhD studies currently maintains an exhaustive list of these cases.

This usage can be seen as a metaphor based on the original meaning of moonlighting. According to Urban Dictionary, the term refers to having an additional job, normally during moonlit hours (at night).

For a non-native English speaker it's always hard to know how popular certain words or expressions are. It was funny to learn that this word was, at least, already in use in the USA during the 1970s. However, as a conversation involving the main character of Taxi Driver shows, not everyone may have been fully familiar with it.

About a year and a half ago I started testing graph databases (only Neo4j so far), using the Gene Ontology and NCBI taxonomy datasets as sample cases. I explained my experience in this presentation from February 2015:

Regarding Py2neo, I noticed that the Neo4j REST API seems to rely more explicitly on Cypher queries than it did in the past. With the help of this article about multiprocessing in Python and Py2neo, and after several attempts, I managed to get the import to run within an acceptable time.

As final tips, if you plan to use a similar approach with your own data, I would suggest creating nodes and populating their properties at the same time (keeping the data in memory if necessary). I also noticed that trying to create relationships with multiple parallel processes fails, so keep only one worker for those steps.
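To make these two tips more concrete, here is a minimal sketch and not the code I actually used. The `Term` label, the `IS_A` relationship type, the `id` property and the helpers `chunked`, `build_jobs` and `run_import` are all hypothetical names chosen for illustration. The `send` argument stands for whatever function submits one Cypher statement with its parameters to Neo4j (for instance through Py2neo); it is deliberately left to the caller, so nothing here depends on a specific driver version.

```python
from itertools import islice

# Nodes are created together with all their properties in a single statement.
# Note: `$rows` is the parameter syntax of recent Neo4j versions; older
# releases used `{rows}` instead. Label and property names are illustrative.
CREATE_NODES = "UNWIND $rows AS row CREATE (n:Term) SET n = row"
CREATE_RELS = (
    "UNWIND $rows AS row "
    "MATCH (a:Term {id: row.source}), (b:Term {id: row.target}) "
    "CREATE (a)-[:IS_A]->(b)"
)


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def build_jobs(query, rows, batch_size):
    """Pair a Cypher query with each batch of rows as its parameters."""
    return [(query, {"rows": batch}) for batch in chunked(rows, batch_size)]


def run_import(nodes, relationships, send, batch_size=1000, node_map=map):
    """Send node batches first, then relationship batches.

    `node_map` may be swapped for a parallel map (an assumption of this
    sketch); relationship batches are always sent one after another by
    the calling process, i.e. by a single worker.
    """
    # Step 1: nodes with their properties; this step may run in parallel.
    list(node_map(send, build_jobs(CREATE_NODES, nodes, batch_size)))
    # Step 2: relationships, strictly sequential.
    for job in build_jobs(CREATE_RELS, relationships, batch_size):
        send(job)
```

With the default `node_map=map` everything runs in one process. To parallelise the node step, one could pass something like the `imap_unordered` method of a `multiprocessing.Pool`, provided that `send` is a top-level function and that each worker process opens its own connection to the database.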