Data mining of historical herbarium specimens from the Korean peninsula

Jeju Island, the largest island off the Korean peninsula, is home to many of the region's endemic vascular plant species. Photo by KOREA.NET, licensed under CC BY-SA 2.0.

Institutions outside the Korean peninsula hold much of the historical, legacy biodiversity information on the region. With nearly 140,000 specimens, including data on specimens stored at foreign herbaria, there is comprehensive chronological, historical, taxonomic, and geographic coverage of Korean plants, including those from inaccessible areas such as North Korea. Despite the abundance of biodiversity information in collections, there is a pressing need to make these data accessible and sufficiently integrated to support query-based research that can inform regional conservation priorities.

This project will thus mobilize existing biodiversity information and knowledge on the Korean peninsula using the BRAHMS database. Using BRAHMS, the project will be able to query historical records from foreign herbaria, generate georeferenced specimen data, and produce photographic images of North and South Korean vascular plants, which will be published through GBIF.org.

Through these goals the project will address the biodiversity information imbalance between South and North Korea and reduce the knowledge gap surrounding the diversity and distribution of vascular plants.

Project Progress

As part of the first phase of the project, a dataset of 15,000 SNUA specimens has been compiled. To make it usable, this dataset will undergo validation and data cleaning, including retrospective georeferencing, until March 2019. Other datasets are in preparation, and a selection of photographs is currently presented on the institute's website.

The project has also contributed to the collective work on historical collections by making visits and obtaining additional photographs of the Faurie/Taquet collections. Each photograph taken has been labelled with its species name and collection number.

To promote the project, the project team gave a presentation in September 2018 at the International Symposium of Mapping Asia Plants, Institute of Botany, Chinese Academy of Sciences, on the subject “Data cleaning process in historical collections -Old labels give clue for new science. Case of the Korean peninsula and Northeastern China”.

The project now has a total of 18,106 SNUA specimens stored in BRAHMS and has uploaded them for public access. The project visited two Japanese herbaria, TI and KYO, and obtained additional photos of the Faurie/Taquet collections.

During the project it has become clear that a data publishing framework needs to be institutionalised in Korea. The project has worked on filling gaps in biodiversity information in Asia. It is also working to resolve the conflicting scientific names currently used by China, Russia, Japan, and North and South Korea.

The project has created data on woody plants that contribute to biodiversity data for Asia, as well as distribution maps of selected herbaceous taxa, which will be completed by the end of the year.

The project also aims to produce three data papers for submission to journals in 2019.