Geo-referencing and name disambiguation of research institution names

1 Research Collaborator position

Fields:

Big Data, Patent Databases, Publication Databases, Disambiguation

Activity:

The researcher will extract author/inventor names and addresses, document classifications, and publication dates from the PubMed database for publications and from the EPO, PCT, and USPTO databases for patents. The addresses in these documents will be geocoded through the Yahoo Query Language, using Amazon AWS servers as the primary computational resource. The researcher will define an algorithm for quantifying the spatial extent of a research hub from the geolocated addresses. These geolocations will then guide the disambiguation of institution and author/inventor names: names located near one another will be compared to determine whether they likely correspond to the same person or institution. This approach will be combined with other common methods of name disambiguation, and the researcher will manually evaluate its success for a few major research centers (those defined algorithmically above). Finally, the dynamics of the distribution of patent or publication classes for major research institutions (those identified through the disambiguation) will be studied and quantified.
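As a rough illustration of the geolocation-guided comparison described above, the following minimal sketch pairs records whose coordinates lie close together and whose names are similar. It is only one possible reading of the activity, not the project's method: the function names, the 5 km and 0.8 thresholds, the use of Python's difflib as the string-similarity measure, and the sample records are all assumptions made for the example.

```python
import math
from difflib import SequenceMatcher


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def name_similarity(x, y):
    """Case-insensitive similarity ratio in [0, 1] (stand-in measure)."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()


def candidate_matches(records, max_km=5.0, min_sim=0.8):
    """Return name pairs that are both close in space and similar in spelling.

    records: list of (name, (lat, lon)) tuples.
    max_km, min_sim: hypothetical thresholds chosen for illustration only.
    """
    pairs = []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            (n1, p1), (n2, p2) = records[i], records[j]
            if (haversine_km(p1, p2) <= max_km
                    and name_similarity(n1, n2) >= min_sim):
                pairs.append((n1, n2))
    return pairs


# Hypothetical records: two spellings at nearly the same place,
# plus an identically named record in a distant city.
records = [
    ("Alpha Research Institute", (43.84, 10.50)),
    ("Alpha Research Inst.", (43.85, 10.51)),
    ("Alpha Research Institute", (48.85, 2.35)),
]
print(candidate_matches(records))
```

The pairwise loop is quadratic in the number of records, so at the scale of full patent and publication databases a spatial index or grid-based blocking step would presumably be needed before comparing names.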

Formal requirements:

Advanced knowledge of C/C++ or an equivalent programming language, and experience with Amazon AWS. Excellent knowledge of English, both written and spoken.

Specific requirements:

A PhD in Physics, Mathematics, Economics, or a related field and a strong publication record in interdisciplinary journals are preferred.