I work on an open-source, semantic-web-based project called VIVO. Our work on VIVO is concerned with representing information about people. Of course this includes the people we likely have in mind right now, who typically participate in some way in the scholarly environment, but increasingly we’ve also had a lot of interest from people who are more non-traditional – such as citizen scientists. It is important to represent a person’s interests, efforts, and areas of expertise: Who are they? What do they do? What do they study, and what do they contribute to our understanding of the world around us?

At each implementation, VIVO enables research discovery, providing verifiable information about research and researchers. Each institution provides its own VIVO system and data, and local governance determines what data are provided. Across institutions, VIVO provides a uniform semantic structure, enabling a new class of tools that use the data to advance science.

What is VIVO? It’s a semantic web application with rich profiles that display publications, teaching, service, and professional affiliations, plus faceted search for fast and meaningful results. What do you mean by “Semantic Web”? A group of methods and technologies that allow machines to understand the meaning – or "semantics" – of information on the World Wide Web. The goal of VIVO: improve all of science by providing the means for sharing and using current, accurate, and precise information regarding scientists’ interests, activities, and accomplishments. Foster team science by providing tools for identifying potential collaborators. Improve collaboration by creating tools that consume this data and repurpose it in such a way as to enhance new and existing teams. VIVO is not limited to science – at Cornell, it covers all disciplines across the entire institution.

Profiles are largely created via automated data feeds, but can be customized to suit the needs of the individual. Information is open source (free) and is stored in a framework that allows for exporting to other applications. Profiles are richer in content than typical web pages or social networking sites, and will rank higher in general internet searches.

VIVO harvests much of its data automatically from verified, authoritative sources, so it is accurate and current. This reduces the need for manual input of data while centralizing information and providing an integrated source. The rich information in VIVO profiles can be repurposed and shared with other institutional web pages and consumers, reducing cost and increasing efficiencies across the institution. Private or sensitive information is never imported into VIVO; only public information is stored and displayed. Data is housed and maintained at the local institutions, where it can be updated on a regular basis. Search results are faceted so information can be located rapidly, with less time spent sorting through it. So where do we get our information for VIVO? So far, agencies, repositories, and aggregators have been identified as sources.

Each element – subject, predicate, object – is governed by ontologies with semantics. VIVO 1.2 includes an ontology module representing research resources such as biological specimens, human studies, instruments, organisms, protocols, reagents, and research opportunities. This module is aligned with the top-level ontology classes and properties from the NIH-funded eagle-i Project (https://www.eagle-i.org/home/). We’re also developing extensions to the ontology that would allow more diverse types of efforts to be included in the profiles (blogs are currently available, wiki edits are not) – very important for microattribution/nanopublication efforts.

VIVO uses linked open data concepts to provide data as RDF at URIs for each scientist. This is critically important for building a web of data: predicates have addresses, and sites point to objects in other triple stores. Queries can then be resolved across triple stores – for example, “show investigators whose genetic work is implicated in breast cancer.” VIVO itself won’t hold the linkages between genes and breast cancer; other resources will. But VIVO can link to those external sources: “Mike worksOn GeneY.” So where does data about interests, activities, and accomplishments come from? Archives, data aggregators, publishers, and institutional repositories. So now we turn to tools.
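The cross-store query above can be sketched in a few lines of plain Python. The two stores, the CURIE-style names (ex:, vivo:, disease:), and the predicates are illustrative assumptions, not real VIVO or disease-ontology identifiers; a production system would run SPARQL across real triple stores.

```python
# Toy illustration of resolving a query across two triple stores:
# VIVO holds "Mike worksOn GeneY"; an external resource holds
# "GeneY implicatedIn BreastCancer". All names are hypothetical.
vivo_store = {
    ("ex:Mike", "vivo:worksOn", "ex:GeneY"),
    ("ex:Ann", "vivo:worksOn", "ex:GeneZ"),
}
disease_store = {
    ("ex:GeneY", "disease:implicatedIn", "ex:BreastCancer"),
}

# "Show investigators whose genetic work is implicated in breast cancer":
# join the two stores on the shared gene URI.
investigators = [
    person
    for (person, p1, gene) in vivo_store if p1 == "vivo:worksOn"
    for (g, p2, disease) in disease_store
    if g == gene and p2 == "disease:implicatedIn" and disease == "ex:BreastCancer"
]
# investigators == ["ex:Mike"]
```

The join works only because both stores use the same URI for the gene – which is exactly the point of linked open data.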

There is a strong open-source development component to the project – reflected in part by the top-notch applications submitted to a recent call for applications by the project.

Miles Worthington. Image from Dr. Barend Mons, Scientific Director of the Netherlands Bioinformatics Institute. This allows experts to be found, but also ties the object to specific concepts.

Nick Benik at Harvard.

There are many beautiful visualizations, developed by Katy Borner’s group at Indiana University. These include co-author and co-investigator networks, and even temporal visualizations which allow discovery of grants and publications by defined groups over time, within and beyond an institution. Most recently, the visualization team implemented a Science Map visualization, which allows users to visually explore the scientific strengths of a university, school, department, or person in the VIVO instance. Users can see where an organization’s or person’s interests lie across 13 major scientific disciplines or 554 sub-disciplines, and how these disciplines and sub-disciplines interrelate with one another on the map of science.

DEEP SEMANTIC SEARCH While searches for people are an obvious requirement for researcher networking, we don't want to limit ourselves to searching for people. VIVO's ontology-based data model is not limited to profiles of people, but includes organizations, events, publications, grants, and many other types of data. This enables VIVO to represent the relationships among people and other types of data as an interconnected network that can be accessed in many ways.

Core project development is augmented with contributions and feedback from other developers across multiple institutions on SourceForge. The open source community around VIVO is robust and dedicated. SourceForge also offers an open environment to share materials and ideas related to implementation and adoption, and more content is added every day.

As you can see, the VIVO project itself is a rather large, geographically dispersed team spanning seven institutions. Project areas include development, implementation, ontology, and outreach. It is an inspiring, hard-working group of people whom I am grateful to know and collaborate with on the project.

Public, structured linked data about investigators’ interests, activities, and accomplishments, and tools to use that data to advance science.

What is VIVO? An open-source semantic web application that enables the discovery of research and scholarship across disciplines in an institution. Populated with detailed profiles of faculty and researchers, displaying items such as publications, teaching, service, and professional affiliations. A powerful search functionality for locating people and information within or across institutions.

A VIVO profile allows you to:
• Find potential colleagues by research area, authorship, and collaborations.
• Showcase credentials, expertise, skills, and professional achievements.
• Connect within focus areas and geographic expertise.
• Simplify reporting tasks and link data to external applications – e.g., to generate biosketches or CVs.
• Publish the URL or link the profile to other applications.
• Display visualizations of complex research networks and relationships.

How does VIVO store data? Information is stored using the Resource Description Framework (RDF), and data are structured in the form of “triples”: subject-predicate-object. Concepts and their relationships use a shared ontology to facilitate the harvesting of data from multiple sources. [Diagram: example triples linking “Jane Genetics Smith” (subject) via predicates such as “is member of”, “has affiliations with”, and “author of” to objects such as the Dept. of Genetics, the College of Medicine, an institute, and journal articles, books, and book chapters.]
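The subject-predicate-object model described above can be sketched with plain Python tuples. The CURIE-style names (ex:, vivo:) and predicates are hypothetical placeholders; a real VIVO instance stores these statements as RDF governed by its ontology.

```python
# Minimal sketch of a triple store: each statement is a
# (subject, predicate, object) tuple. All names are illustrative.
triples = set()

def add(subject, predicate, obj):
    triples.add((subject, predicate, obj))

# "Jane Smith is a member of the Dept. of Genetics."
add("ex:JaneSmith", "vivo:memberOf", "ex:DeptOfGenetics")
# "Jane Smith is the author of an article."
add("ex:JaneSmith", "vivo:authorOf", "ex:SomeArticle")
# "The Dept. of Genetics is part of the College of Medicine."
add("ex:DeptOfGenetics", "vivo:partOf", "ex:CollegeOfMedicine")

# Query: everything Jane Smith is related to.
results = [(p, o) for (s, p, o) in triples if s == "ex:JaneSmith"]
```

Because every statement has the same three-part shape, new kinds of facts can be added without changing any schema – only the shared ontology that defines the predicates.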

Using VIVO data By storing VIVO data in RDF and using standard ontologies, the information can either be displayed in a human-readable web page or delivered directly to other systems as RDF. This allows the open researcher data in VIVO to be harvested, aggregated, and integrated into the Linked Open Data cloud. VIVO enables authoritative data about researchers to become part of the Linked Data cloud.
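The dual delivery described above – human-readable pages and machine-readable RDF from the same underlying triples – can be sketched as follows. The URIs and helper functions are illustrative assumptions; real VIVO serves full RDF serializations of each profile URI.

```python
# Sketch: the same profile triples rendered for people (HTML) and for
# machines (an N-Triples-like text form). URIs are hypothetical.
profile = [
    ("http://example.org/JaneSmith",
     "http://vivoweb.org/ontology/core#memberOf",
     "http://example.org/DeptOfGenetics"),
]

def as_ntriples(triples):
    # Machine-readable: one "<s> <p> <o> ." statement per line.
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)

def as_html(triples):
    # Human-readable: a simple HTML list of the same statements.
    items = "".join(f"<li>{s} {p} {o}</li>" for s, p, o in triples)
    return f"<ul>{items}</ul>"
```

Both views are generated from one data source, which is what lets the same profile feed a web page, an institutional report, and the Linked Open Data cloud.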

The Semantic Web & Researcher Networking
• Increasing recognition of the value of semantic web standards
• Increasing momentum in support of semantic web technologies to facilitate research discovery
• Recommendations for researcher networking recently endorsed by the CTSA Consortium Steering Committee represent a new standard in researcher networking. Read more at http://vivoweb.org/blog
• Examples of applications that consume these rich data include: visualizations, enhanced multi-site search, and VIVO Searchlight. Other utilities are in development across a wide range of topic areas.

Notable SemWeb projects
• DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web.
• NextBio is a database consolidating high-throughput life sciences experimental data, tagged and connected via biomedical ontologies.
• GoPubMed is a semantic search engine for the life sciences. It uses the Gene Ontology (GO) and the Medical Subject Headings (MeSH) to semantically filter millions of biomedical abstracts from MEDLINE.
• Open PHACTS will create an open, innovative platform, Open Pharmacological Space, which will be freely accessible for knowledge discovery and verification. It will provide a growing body of data on small molecules, their pharmacological profiles, pharmacokinetics, biological targets, and pathways in a semantically interoperable format. Aligning and integrating proprietary and public data sources into a single system is currently a very difficult and time-consuming task, repeated across companies, institutes, and academic laboratories.
• Open Government initiatives
• Publications efforts
• DOD
• Federal Profiling
• Many others