Knowledge organization systems (KOS) in the semantic web

Since the Simple Knowledge Organization System (SKOS) specification and its SKOS eXtension for Labels (SKOS-XL) became formal W3C recommendations in 2009, offering a separate, lightweight, and intuitive language for developing and sharing new knowledge organization systems (KOS), a significant number of conventional KOS (including thesauri, classification schemes, name authorities, and lists of codes and terms, produced before the arrival of the ontology wave) have joined the Semantic Web mainstream. Semantic Web standards such as SKOS, OWL, RDFS, and SPARQL have provided adequate building blocks for publishing KOS as Linked Open Data (LOD), with two outcomes: (1) many traditional KOS vocabularies have been turned into lightweight OWL ontologies or SKOSified value vocabulary datasets; and (2) such datasets are usually available as data dumps or accessible through SPARQL endpoints. Through their vocabulary services, developers have devised strategies and technologies to ensure not only the availability but also the interoperability, stability, and scalability of the contents and applications they provide. This paper uses “LOD KOS” as an umbrella term for all value vocabularies and lightweight ontologies within the Semantic Web framework. It provides an overview of what the LOD KOS movement has brought to various communities and users. The beneficiaries are not limited to the communities of value vocabulary constructors and providers, or to the catalogers and indexers who have a long history of applying the vocabularies to their products. LOD dataset producers and LOD service providers, information architects and interface designers, and researchers in the sciences and humanities are also direct beneficiaries of LOD KOS.
The paper examines a set of cases (experimental or from real applications) collected from released LOD KOS products, journal articles, conference presentations, workshops and webinars, related tweets, blogs, and posts in community-shared spaces. This accumulated set is open-ended and has no fixed boundary. The paper does not intend to prove any pre-set hypothesis; its purpose is to identify the uses of LOD KOS in order to share best practices and ideas among communities and users. Through the viewpoints of a number of personas designed for this study, the functions of LOD KOS are examined from multiple dimensions. The full paper has two parts. Part I focuses on LOD dataset producers, vocabulary producers, and researchers (as end-users). Part II will focus on technology-oriented cases involving website/tool developers and LOD KOS service providers.