This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.

Abstract

A Spatial Data Infrastructure (SDI) is a framework of geospatial data, metadata, users and tools intended to provide an efficient and flexible way to use spatial information. One of the key software components of an SDI is the catalogue service which is needed to discover, query, and manage the metadata. Catalogue services in an SDI are typically based on the Open Geospatial Consortium (OGC) Catalogue Service for the Web (CSW) standard which defines common interfaces for accessing the metadata information.

A search engine is a software system capable of supporting fast and reliable search, which may use “any means necessary” to get users to the resources they need quickly and efficiently. These techniques may include features such as full text search, natural language processing, weighted results, fuzzy tolerance results, faceting, hit highlighting, recommendations, feedback mechanisms based on log mining, usage statistic gathering, and many others. In this paper we will be focusing on improving geospatial search with a search engine platform that uses Lucene, a Java-based search library, at its core.

In work funded by the National Endowment for the Humanities, the Centre for Geographic Analysis (CGA) at Harvard University is in the process of re-engineering the search component of its public domain SDI (WorldMap http://worldmap.harvard.edu ) which is based on the GeoNode platform. In the process the CGA has developed Harvard Hypermap (HHypermap), a map services registry and search platform independent from WorldMap.

The goal of HHypermap is to provide a framework for building and maintaining a comprehensive registry of web map services, and because such a registry is expected to be large, the system supports the development of clients with modern search capabilities such as spatial and temporal faceting and instant previews via an open API. Behind the scenes HHypermap scalably harvests OGC and Esri service metadata from distributed servers, organizes that information, and pushes it to a search engine. The system monitors services for reliability and uses that to improve search. End users will be able to search the SDI metadata using standard interfaces provided by the internal CSW catalogue, and will benefit from the enhanced search possibilities provided by an advanced search engine. HHypermap is built on an open source software source stack.

Author Comment

Changes Tom Kralidis name to his full correct name

Additional Information

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Paolo Corti conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, wrote the paper, prepared figures and/or tables, performed the computation work.

Benjamin G Lewis conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, reviewed drafts of the paper.

Athanasios Tom Kralidis conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, performed the computation work, reviewed drafts of the paper.

Ntabathia Jude Mwenda conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, performed the computation work, reviewed drafts of the paper.

Data Deposition

The following information was supplied regarding data availability:

HHypermap

https://github.com/cga-harvard/HHypermap

Funding

The work was funded in part by an Implementation Grant from the National Endowment for the Humanities. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Feedback on other revisions

Add your feedback

Before adding feedback, consider if it can be asked as a question instead, and if so then use the Question tab. Pointing out typos is fine, but authors are encouraged to accept only substantially helpful feedback.

Follow this preprint for updates

"Following" is like subscribing to any updates related to a preprint.
These updates will appear in your home dashboard each time you visit PeerJ.

You can also choose to receive updates via daily or weekly email digests.
If you are following multiple preprints then we will send you
no more than one email per day or week based on your preferences.

Note: You are now also subscribed to the subject areas of this preprint
and will receive updates in the daily or weekly email digests if turned on.
You can add specific subject areas through your profile settings.