MeSH is the National Library of Medicine's controlled vocabulary thesaurus, which is updated annually. NLM uses the MeSH thesaurus to index articles from thousands of biomedical journals for the MEDLINE/PubMed database and for the cataloging of books, documents, and audiovisuals acquired by the Library.

MeSH experts/users will need to absorb the details, but some of the changes include:

Overview of Vocabulary Development and Changes for 2016 MeSH

438 Descriptors added

17 Descriptor terms replaced with more up-to-date terminology

9 Descriptors deleted

1 Qualifier (Subheading) deleted

and,

MeSH Tree Changes: Uncle vs. Nephew Project

In the past, MeSH headings were loosely organized in trees and could appear in multiple locations depending upon the importance and specificity. In some cases the heading would appear two or more times in the same tree at higher and lower levels. This arrangement led to some headings appearing as a sibling (uncle) next to the heading under which they were treed as a nephew. In other cases a heading was included at a top level so it could be seen more readily in printed material. We reviewed these headings in MeSH and removed either the Uncle or Nephew depending upon the judgement of our Internal and External reviewers. There were over 1,000 tree changes resulting from this work, many of which will affect search retrieval in MEDLINE/PubMed and the NLM Catalog.
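The "uncle vs. nephew" pattern can be made concrete with a small sketch. This is not NLM's data or algorithm; the tree numbers and the detection rule below are invented for illustration, assuming MeSH-style dotted tree numbers where one heading may occupy several tree positions.

```python
# Hypothetical sketch: detect "uncle vs. nephew" placements in a MeSH-like
# tree, i.e. a heading filed both under a node and as that node's sibling.
# Tree numbers use MeSH-style dotted notation; the sample data is invented.

def parent(tree_number: str) -> str:
    """Parent tree number, or '' at the top level."""
    return tree_number.rsplit(".", 1)[0] if "." in tree_number else ""

def uncle_nephew_pairs(tree_numbers: list[str]) -> list[tuple[str, str]]:
    """Pairs (uncle_position, nephew_position) among one heading's placements."""
    pairs = []
    for t1 in tree_numbers:          # candidate "uncle" placement
        for t2 in tree_numbers:      # candidate "nephew" placement
            # uncle: t1 sits at the same level as t2's parent node
            if t1 != t2 and parent(t1) == parent(parent(t2)):
                pairs.append((t1, t2))
    return pairs

# A heading treed at C04.557 (a sibling of C04.588) and again at C04.588.805:
print(uncle_nephew_pairs(["C04.557", "C04.588.805"]))
```

Running a rule like this over all headings with multiple tree numbers would surface the candidate duplications that the reviewers then judged case by case.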

and,

MeSH Scope Notes

MeSH had a policy that each descriptor should have a scope note regardless of how obvious its meaning. There were many legacy headings that were created without scope notes before this rule came into effect. This year we initiated a project to write scope notes for all existing headings. Thus far, 481 scope notes have been added to MeSH, and the project continues for 2017 MeSH.

Echoes of Heraclitus:

It is not possible to step twice into the same river, according to Heraclitus, or to come into contact twice with a mortal being in the same state. (Heraclitus, as quoted by Plutarch)

Semantics and the words we use to invoke them are always in a state of flux. Sometimes more, sometimes less.

The lesson here is that anyone who says you can have a fixed and stable vocabulary is not only selling something, they are selling you a broken something. If not broken on the day you start to use it, then fairly soon thereafter.

It took time for me to come to the realization that the same is true about information systems that attempt to capture changing semantics at any given point.

Topic maps in the sense of ISO 13250-2, for example, can capture and map changing semantics, but if and only if you are willing to accept its data model.

Which is good as far as it goes, but what if I want a different data model? That is, to still capture changing semantics and map between them, but using a different data model.

We may have a use case to map back to ISO 13250-2 or to some other data model. The point being that we should not privilege any data model or syntax in advance, at least not absolutely.

Not only do communities change but their preferences for technologies change as well. It seems just a bit odd to be selling an approach on the basis of capturing change only to build a dike to prevent change in your implementation.

As part of its ongoing effort to provide effective access to library materials, the Library of Congress is developing a new vocabulary, entitled Library of Congress Demographic Group Terms (LCDGT). This vocabulary will be used to describe the creators of, and contributors to, resources, and also the intended audience of resources. It will be created and maintained by the Policy and Standards Division, and be distinct from the other vocabularies that are maintained by that division: Library of Congress Subject Headings (LCSH), Library of Congress Genre/Form Terms for Library and Archival Materials (LCGFT), and the Library of Congress Medium of Performance Thesaurus for Music (LCMPT).

A general rationale for the development of LCDGT, information about the pilot vocabulary, and a link to the Tentative List of terms in the pilot may be found on LC’s Acquisitions and Bibliographic Access website at http://www.loc.gov/catdir/cpso/lcdgt-announcement.html.

The Policy and Standards Division is accepting comments on the pilot vocabulary and the principles guiding its development through June 5, 2015. Comments may be sent to Janis L. Young at jayo@loc.gov.

A follow-up question to this post asked:

Is there a list of the codes used in field 072 in these lists? Some I can figure out, but it would be nice to see a list of the categories you’re using.

At the 39th Joint Doctrine Planning Conference, a semiannual meeting on topics related to military doctrine and planning held in May 2007, a contractor for Booz Allen Hamilton named Paul Schuh gave a short presentation discussing doctrinal issues related to “cyberspace” and the military’s increasing effort to define its operations involving computer networks. Schuh, who would later become chief of the Doctrine Branch at U.S. Cyber Command, argued that military terminology related to cyberspace operations was inadequate and failed to address the expansive nature of cyberspace. According to Schuh, the existing definition of cyberspace as “the notional environment in which digitized information is communicated over computer networks” was imprecise. Instead, he proposed that cyberspace be defined as “a domain characterized by the use of electronics and the electromagnetic spectrum to store, modify, and exchange data via networked systems and associated physical infrastructures.”

Amid the disagreements about “notional environments” and “operational domains,” Schuh informed the conference that “experience gleaned from recent cyberspace operations” had revealed “the necessity for development of a lexicon to accommodate cyberspace operations, cyber warfare and various related terms” such as “weapons consequence” or “target vulnerability.” The lexicon needed to explain how the “four D’s” (deny, degrade, disrupt, destroy) and other core terms in military terminology could be applied to cyber weapons. The document that would later be produced to fill this void is The Cyber Warfare Lexicon, a relatively short compendium designed to “consolidate the core terminology of cyberspace operations.” Produced by the U.S. Strategic Command’s Joint Functional Component Command – Network Warfare, a predecessor to the current U.S. Cyber Command, the lexicon documents early attempts by the U.S. military to define its own cyber operations and place them within the larger context of traditional warfighting. A version of the lexicon from January 2009 obtained by Public Intelligence includes a complete listing of terms related to the process of creating, classifying and analyzing the effects of cyber weapons. An attachment to the lexicon includes a series of discussions on the evolution of military commanders’ conceptual understanding of cyber warfare and its accompanying terminology, attempting to align the actions of software with the outcomes of traditional weaponry.

A bit dated (2009), particularly in terms of its understanding of cyber war, but possibly useful for leaked documents from that time period and as a starting point for studying the evolution of terminology in the area.

Ranging from large national libraries to small and medium-sized institutions, many cultural heritage organizations, including libraries, archives, and museums, have been working with controlled vocabularies in linked data and semantic web contexts. Such work has included transforming existing vocabularies, thesauri, subject heading schemes, authority files, term and code lists into SKOS and other machine-consumable linked data formats.
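As a rough sketch of what such a transformation involves, the fragment below converts an invented legacy heading record into SKOS-style triples, represented here as plain Python tuples rather than with an RDF library. The record layout, identifiers, and base URI are all made up; only the SKOS property names are from the SKOS vocabulary.

```python
# Hypothetical sketch: turning a legacy subject-heading record into
# SKOS-style (subject, predicate, object) triples. The record layout
# and example.org URIs are invented; the SKOS terms are real.

SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/vocab/"

def legacy_to_skos(record: dict) -> list[tuple[str, str, str]]:
    """Emit triples for one legacy heading record."""
    uri = BASE + record["id"]
    triples = [
        (uri, SKOS + "prefLabel", record["heading"]),
        (uri, SKOS + "inScheme", BASE + "scheme"),
    ]
    for alt in record.get("see_from", []):   # legacy 'see from' references
        triples.append((uri, SKOS + "altLabel", alt))
    for broader in record.get("broader", []):
        triples.append((uri, SKOS + "broader", BASE + broader))
    return triples

triples = legacy_to_skos({
    "id": "salinity",
    "heading": "Salinity",
    "see_from": ["Saltiness"],
    "broader": ["water-properties"],
})
for t in triples:
    print(t)
```

A real conversion would serialize these as RDF (Turtle, RDF/XML, etc.) and carry over scope notes, but the mapping step itself has this shape: legacy record in, SKOS statements out.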

This special issue of the Journal of Library Metadata welcomes articles from a wide variety of types and sizes of organizations on a wide range of topics related to controlled vocabularies, ontologies, and models for linked data and semantic web deployment, whether theoretical, experimental, or actual.

Researchers and practitioners are invited to submit a proposal (approximately 500 words) including a problem statement, problem significance, objectives, methodology, and conclusions (or tentative conclusions for work in progress). Proposals must be received by March 1, 2015. Full manuscripts (4000-7000 words) are expected to be submitted by June 1, 2015. All submitted manuscripts will be reviewed on a double-blind review basis.

The Journal of Library Metadata is online. Unfortunately, it is one of those journals where authors have to pay for their work to be accessible to others. The interface makes it look like you are going to have access until you attempt to view a particular article. I didn’t stumble across any that were accessible, but I only tried four (4) or five (5) of them.

Interesting journal if you have access to it or if you are willing to pay $40.00 per article for viewing. I worked for an academic publisher for a number of years and have an acute sense of the value-add publishers bring to the table. Volunteer authors, volunteer editors, etc.

Words refer to objects in the world, but this correspondence is not one-to-one: Each word has a range of referents that share features on some dimensions but differ on others. This property of language is called underspecification. Parts of the lexicon have characteristic patterns of underspecification; for example, artifact nouns tend to specify shape, but not color, whereas substance nouns specify material but not shape. These regularities in the lexicon enable learners to generalize new words appropriately. How does the lexicon come to have these helpful regularities? We test the hypothesis that systematic backgrounding of some dimensions during learning and use causes language to gradually change, over repeated episodes of transmission, to produce a lexicon with strong patterns of underspecification across these less salient dimensions. This offers a cultural evolutionary mechanism linking individual word learning and generalization to the origin of regularities in the lexicon that help learners generalize words appropriately.

I can’t seem to access the article today but the premise is intriguing.

Perhaps people can have different “…less salient dimensions…” and therefore are generalizing words “inappropriately” from the standpoint of another person.

Curious if a test can be devised to identify those “…less salient dimensions…” in some target population? Might lead to faster identification of terms likely to be misunderstood.

I think it is important to have all the integrity checks related to this aspect clear for humans, and not only have them sealed deep in the code. These notes will help users get acquainted with this feature in advance. Once completed, these will be included also in the manual of VB.

For the moment I’ve only written the introduction, some notes about data integrity, and then described the checks carried out upon the most dangerous operation: removing a concept from a scheme. Together with the VB development group, we will add more information in the coming days. However, if you have some questions about this feature, you may post them here, as usual (or you may use the vocbench user/developer user groups).

A consistent set of operations and integrity checks for cross-scheme management is already in place for this 2.1 release, which will be out in the coming days.

VB2.2 will focus on other aspects (multi-project management), while we foresee a second wave of facilities for cross-scheme management (such as mass move/add/remove actions, fixing utilities, analysis of dangling concepts, corrective actions, etc.) for VB2.3.

I agree that:

I think it is important to have all the integrity checks related to this aspect clear for humans, and not only have them sealed deep in the code.

But I am less certain that following the integrity checks of SKOS is useful in all mappings between schemes.

The SBVR specification is applicable to the domain of business vocabularies and business rules of all kinds of business activities in all kinds of organizations. It provides an unambiguous, meaning-centric, multilingual, and semantically rich capability for defining meanings of the language used by people in an industry, profession, discipline, field of study, or organization.

This specification is conceptualized optimally for business people rather than automated processing. It is designed to be used for business purposes, independent of information systems designs to serve these business purposes:

Unambiguous definition of the meaning of business concepts and business rules, consistently across all the terms, names and other representations used to express them, and across the natural languages in which those representations are expressed, so that they are not easily misunderstood either by “ordinary business people” or by lawyers.

Expression of the meanings of concepts and business rules in the wordings used by business people, who may belong to different communities, so that each expression wording is uniquely associated with one meaning in a given context.

Transformation of the meanings of concepts and business rules as expressed by humans into forms that are suitable to be processed by tools, and vice versa.

Interpretation of the meanings of concepts and business rules in order to discover inconsistencies and gaps within an SBVR Content Model (see 2.4) using logic-based techniques.

Application of the meanings of concepts and business rules to real-world business situations in order to enable reproducible decisions and to identify conformant and non-conformant business behavior.

Exchange of the meanings of concepts and business rules between humans and tools as well as between tools without losing information about the essence of those meanings.

I do need to repeat their warning from 6.2 How to Read this Specification:

This specification describes a vocabulary, or actually a set of vocabularies, using terminological entries. Each entry includes a definition, along with other specifications such as notes and examples. Often, the entries include rules (necessities) about the particular item being defined.

The sequencing of the clauses in this specification reflects the inherent logical order of the subject matter itself. Later clauses build semantically on the earlier ones. The initial clauses are therefore rather ‘deep’ in terms of SBVR’s grounding in formal logics and linguistics. Only after these clauses are presented do clauses more relevant to day-to-day business communication and business rules emerge.

This overall form of presentation, essential for a vocabulary standard, unfortunately means the material is rather difficult to approach. A figure presented for each sub-vocabulary does help illustrate its structure; however, no continuous ‘narrative’ or explanation is appropriate.

😉

OK, so you aren’t going to read it for giggles. But you will be encountering it in the wild world of data, so at least mark the reference.

Three Recommendations were published today to enhance data interoperability, especially in government data. Each one specifies an RDF vocabulary (a set of properties and classes) for conveying a particular kind of information:

The Data Catalog (DCAT) Vocabulary is used to provide information about available data sources. When data sources are described using DCAT, it becomes much easier to create high-quality integrated and customized catalogs including entries from many different providers. Many national data portals are already using DCAT.

The Data Cube Vocabulary brings the cube model underlying SDMX (Statistical Data and Metadata eXchange, a popular ISO standard) to Linked Data. This vocabulary enables statistical and other regular data, such as measurements, to be published and then integrated and analyzed with RDF-based tools.

The Organization Ontology provides a powerful and flexible vocabulary for expressing the official relationships and roles within an organization. This allows for interoperation of personnel tools and will support emerging socially-aware software.

More vocabularies for mapping into their respective areas, backwards for pre-existing vocabularies and forward for vocabularies that succeed them.

a collection of machines that work together to deliver a customer service. Cloud clusters grow and shrink on-demand. A cloud service provides an API for scaling out a cluster, by adding more machines.

Which would distinguish (when searching) a “cluster” of computers from one of the other 38 uses of “cluster” found at: en.wikipedia.org/wiki/Cluster

Rather than using “Thing” from schema.org, I really should find or make an extension to that vocabulary for terms in various areas that are relevant to topic maps.

In my opening post on this blog I hinted that another would follow concerning vocabularies. Here it is.

When the Semantic Web first began, the expectation was that people would create their own vocabularies/schemas as required – it was all part of the open world (free love, do what you feel, dude) Zeitgeist. Over time, however, and with the benefit of a large measure of hindsight, it’s become clear that this is not what’s required.

The success of Linked Open Vocabularies as a central information point about vocabularies is symptomatic of a need, or at least a desire, for an authoritative reference point to aid the encoding and publication of data. This need/desire is expressed even more forcefully in the rapid success and adoption of schema.org. The large and growing set of terms in the schema.org namespace includes many established terms defined elsewhere, such as in vCard, FOAF, Good Relations and rNews. I’m delighted that Dan Brickley has indicated that schema.org will reference what one might call ‘source vocabularies’ in the near future, I hope with assertions like owl:equivalentClass, owl:equivalentProperty etc.

Designed and promoted as a means of helping search engines make sense of unstructured data (i.e. text), schema.org terms are being adopted in other contexts, for example in the ADMS. The Data Activity supports the schema.org effort as an important component and we’re delighted that the partners (Google, Microsoft, Yahoo! and Yandex) develop the vocabulary through the Web Schemas Task Force, part of the W3C Semantic Web Interest Group of which Dan Brickley is chair.

…

Phil then makes a pitch for doing vocabulary work at the W3C but you can see his post for the details.

I think the success of schema.org is a flashing pointer to a semantic sweet spot.

It isn’t nearly everything that you could do with RDF/OWL or with topic maps, but it’s enough to show immediate ROI for a minimum of investment.

Make no mistake, people will develop different vocabularies for the same activities. Not a problem. Topic maps will be able to help you robustly map between different vocabularies.

It’s painful for US soldiers to hear discussions and watch movies about modern wars when the dialogue is full of obsolete slang, like “chopper” and “GI.”

Slang changes with the times, and the military’s is no different. Soldiers fighting the wars in Iraq and Afghanistan have developed an expansive new military vocabulary, taking elements from popular culture as well as the doublespeak of the military industrial complex.

The US military drawdown in Afghanistan — which is underway but still awaiting the outcome of a proposed bilateral security agreement — is often referred to by soldiers as “the retrograde,” which is an old military euphemism for retreat. Of course the US military never “retreats” — rather it conducts a “tactical retrograde.”

This list is by no means exhaustive, and some of the terms originated prior to the wars in Afghanistan and Iraq. But these terms are critical to speaking the current language of soldiers, and understanding it when they speak to others. Please leave anything you think should be included in the comments.

The Linked Data Service provides access to commonly found standards and vocabularies promulgated by the Library of Congress. This includes data values and the controlled vocabularies that house them. Below are descriptions of each preservation vocabulary derived from the PREMIS standard. Inside each, a search box allows you to search the vocabularies individually.

The Webinar will introduce the new ontology plug-in developed in the context of the AIMS Community, how it works and the usage possibilities. It was created within the context of AgriOcean DSpace, however it is an independent plug-in and can be used in any other applications and information management systems.

The ontology plug-in searches multiple thesauri and ontologies simultaneously by using a web service broker (e.g. AGROVOC, ASFA, Plant Ontology, NERC-C19 ontology, and OceanExpert). It delivers as output a list of selected concepts, where each concept has a URI (or unique ID), a preferred label with optional language definition, and the ontology from which the concept has been selected. The application uses Java, JavaScript and jQuery. As it is open software, developers are invited to reuse, enrich and enhance the existing source code.
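The output described above (URI, preferred label with optional language, source ontology) might be modeled along the lines below. This is a guess at the shape of the plug-in's records, not its actual code, and the sample concepts and URIs are invented.

```python
# Hypothetical sketch of the record shape a multi-thesaurus search broker
# might return: one entry per matched concept. Not the plug-in's real code.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ConceptHit:
    uri: str                 # URI or unique ID of the concept
    pref_label: str          # preferred label
    lang: Optional[str]      # optional language of the label
    source: str              # thesaurus/ontology the concept came from

hits = [
    ConceptHit("http://example.org/agrovoc/c_123", "maize", "en", "AGROVOC"),
    ConceptHit("http://example.org/po/0025034", "leaf", "en", "Plant Ontology"),
]

# e.g. group the hits by their source vocabulary for display
by_source: dict[str, list[ConceptHit]] = {}
for h in hits:
    by_source.setdefault(h.source, []).append(h)
print(sorted(by_source))
```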

We invite the participants of the webinar to give their views on how thesauri and ontologies can be used in repositories and other types of information management systems, and how the ontology plug-in can be further developed.

Date

4th of July 2013 – 16:00 Rome Time (use the Time Converter to calculate the time difference between your location and Rome, Italy)

One of the major stumbling blocks in deploying RDF has been the difficulty data providers have in determining which vocabularies to use. For example, a publisher of scientific papers who wants to embed document metadata in the web pages about each paper has to make an extensive search to find the possible vocabularies and gather the data to decide which among them are appropriate for this use. Many vocabularies may already exist, but they are difficult to find; there may be more than one on the same subject area, but it is not clear which ones have a reasonable level of stability and community acceptance; or there may be none, i.e. one may have to be developed in which case it is unclear how to make the community know about the existence of such a vocabulary.

There have been several attempts to create vocabulary catalogs, indexes, etc. but none of them has gained a general acceptance and few have remained up for very long. The latest notable attempt is LOV, created and maintained by Bernard Vatant (Mondeca) and Pierre-Yves Vandenbussche (DERI) as part of the DataLift project. Other application areas have more specific, application-dependent catalogs; e.g., the HCLS community has established such application-specific “ontology portals” (vocabulary hosting and/or directory services) as NCBO and OBO. (Note that for the purposes of this document, the terms “ontology” and “vocabulary” are synonyms.) Unfortunately, many of the cataloging projects in the past relied on a specific project or some individuals and they became, more often than not, obsolete after a while.

Initially (1999-2003) W3C stayed out of this process, waiting to see if the community would sort out this issue by itself. We hoped to see the emergence of an open market for vocabularies, including development tools, reviews, catalogs, consultants, etc. When that did not emerge, we decided to begin offering ontology hosting (on www.w3.org) and we began the Ontaria project (with DARPA funding) to provide an ontology directory service. Implementation of these services was not completed, however, and project funding ended in 2005. After that, W3C took no active role until the emergence of schema.org and the eventual creation of the Web Schemas Task Force of the Semantic Web Interest Group. WSTF was created both to provide an open process for schema.org and as a general forum for people interested in developing vocabularies. At this point, we are contemplating taking a more active role supporting the vocabulary ecosystem. (emphasis added)

The W3C proposal fails to address two issues with vocabularies:

1. Vocabularies are not the origin of the meanings of terms they contain.

…Awful, according to yet another master of the king’s English quoted by Fries, could only mean awe-inspiring.

But it was not so. “The real meaning of any word,” argued Fries, “must be finally determined, not by its original meaning, its source or etymology, but by the content given the word in actual practical usage…. Even a hardy purist would scarcely dare pronounce a painter’s masterpiece awful, without explanations.” [The Story of Ain’t by David Skinner, HarperCollins 2012, page 47]

Vocabularies represent some community of semantic practice but that brings us to the second problem the W3C proposal ignores.

2. The meanings of terms in a vocabulary are not stable, universal, or self-evident.

The problem with most vocabularies is that they have no way to signal the context, community, or other information that would help distinguish one vocabulary meaning from another.

A human reader may intuit context and other clues from a vocabulary and use those factors when comparing the vocabulary to a text.

Computers, on the other hand, know no more than they have been told.

Vocabularies need to move beyond being simple tokens and represent terms with structures that capture some of the information a human reader knows intuitively about those terms.

Otherwise vocabularies will remain mute records of some socially defined meaning, but we won’t know which ones.
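One minimal way to picture such a structure, purely as an illustration: a term carries not just its token but the community, domain, and scope note that situate its meaning. The fields and sample entries below are invented; the "awful" example echoes the Fries passage above.

```python
# Illustrative only: a vocabulary entry as a structure rather than a bare
# token, carrying context a human reader would otherwise have to intuit.
from dataclasses import dataclass

@dataclass(frozen=True)
class TermEntry:
    token: str          # the word itself
    community: str      # who uses it this way
    domain: str         # subject area of this sense
    scope_note: str     # what this sense does and does not cover

awful_modern = TermEntry("awful", "general usage", "evaluation",
                         "very bad or unpleasant")
awful_archaic = TermEntry("awful", "19th-century purists", "evaluation",
                          "awe-inspiring")

# The bare tokens collide; the structured entries do not.
print(awful_modern.token == awful_archaic.token,   # same token
      awful_modern == awful_archaic)               # different entries
```

Nothing about the particular fields is essential; the point is that once the distinguishing information is explicit, a computer can compare it instead of silently conflating the tokens.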

If you are working in the public/national security area, you may need some vocabulary help.

I would check the definitions against other sources.

Here’s why:

Hunters (AKA Biters): Individuals who intend to follow a path toward violence and behave in ways to further that goal

I’m sure the NRA will like that one.

Identification: Thoughts of the necessity and utility of violence by a subject that are made evident through behaviors such as researching previous attackers and collecting, practicing, and fantasizing about weapons

That looks like a typo but I can’t tell where it should go.

Terrorism: Act of violence or threats of violence used to further the agenda of the perpetrator while causing fear and psychological distress

Lots of people talk about tags, and they all tend to assume they mean the same thing. However, there are lots of different types of tag, from HTML tags for marking up web pages to labels in databases, and this can lead to all sorts of confusion and problems in projects.

Here are some definitions of “tag” that I’ve heard and that are different in significant ways. If you think my definitions can be improved, please comment, and please let me know of any other usages of that tricksy little word “tag” that you’ve happened upon.

1) A tag is a free text keyword you add as part of the metadata of something to help search

…

2) A tag is a keyword that is selected from a controlled vocabulary or authority list

….

3) A tag is a keyword that is selected from a taxonomy

…

4) A tag is a type of Uniform Resource Identifier (URI)

….

5) A tag is metadata added to a web page for search engines to index

….

6) A tag is a label used to mark up content within a web page that can be used for display purposes and for indexing

Vocabulary control is used to improve the effectiveness of information storage and retrieval systems, Web navigation systems, and other environments that seek to both identify and locate desired content via some sort of description using language. The primary purpose of vocabulary control is to achieve consistency in the description of content objects and to facilitate retrieval.

1.1 Need for Vocabulary Control

The need for vocabulary control arises from two basic features of natural language, namely:

• Two or more words or terms can be used to represent a single concept

Example:

salinity/saltiness

VHF/Very High Frequency

• Two or more words that have the same spelling can represent different concepts

Example:

Mercury (planet)

Mercury (metal)

Mercury (automobile)

Mercury (mythical being)

Great examples for vocabulary control but for topic maps as well!

The topic map question is:

What do you know about the subject(s) in either case, that would make you say the words mean the same subject or different subjects?

If we can capture the information you think makes them represent the same or different subjects, there is a basis for repeating that comparison.
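In topic map terms, the captured information can be treated as subject-identifying properties: two uses of a word name the same subject only when those properties agree. A toy sketch, using the Mercury examples above; the choice of identifying properties is invented for illustration.

```python
# Toy sketch of a repeatable subject comparison: two uses of a word count
# as the same subject only if their identifying properties match.
# Which properties identify a subject is a modeling choice, invented here.

def same_subject(a: dict, b: dict,
                 keys: tuple = ("domain", "referent_type")) -> bool:
    """Compare two word-uses on the properties that identify their subject."""
    return all(a.get(k) == b.get(k) for k in keys)

mercury_planet = {"word": "Mercury", "domain": "astronomy",
                  "referent_type": "planet"}
mercury_metal = {"word": "Mercury", "domain": "chemistry",
                 "referent_type": "element"}
mercury_planet2 = {"word": "Mercury", "domain": "astronomy",
                   "referent_type": "planet"}

print(same_subject(mercury_planet, mercury_metal))    # different subjects
print(same_subject(mercury_planet, mercury_planet2))  # same subject
```

The comparison is trivial; the hard part, as the post says, is eliciting and recording the properties in the first place. Once recorded, the test can be repeated by anyone.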

DCAT is an RDF vocabulary designed to facilitate interoperability between data catalogs published on the Web. This document defines the schema and provides examples for its use.

By using DCAT to describe datasets in data catalogs, publishers increase discoverability and enable applications easily to consume metadata from multiple catalogs. It further enables decentralized publishing of catalogs and facilitates federated dataset search across sites. Aggregated DCAT metadata can serve as a manifest file to facilitate digital preservation.
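A minimal DCAT description of one dataset might look like the JSON-LD sketch below, assembled as a plain Python dict. The catalog, dataset, and URLs are invented; the `dcat:` and `dct:` terms are from the DCAT vocabulary and Dublin Core.

```python
# Sketch: a minimal DCAT catalog entry as JSON-LD, built as a plain dict.
# The example.org identifiers and the dataset itself are invented.
import json

catalog = {
    "@context": {
        "dcat": "http://www.w3.org/ns/dcat#",
        "dct": "http://purl.org/dc/terms/",
    },
    "@id": "http://example.org/catalog",
    "@type": "dcat:Catalog",
    "dcat:dataset": [{
        "@id": "http://example.org/dataset/budget-2013",
        "@type": "dcat:Dataset",
        "dct:title": "National budget 2013",
        "dcat:distribution": {
            "@type": "dcat:Distribution",
            "dcat:downloadURL": "http://example.org/budget-2013.csv",
            "dct:format": "text/csv",
        },
    }],
}
print(json.dumps(catalog, indent=2))
```

An aggregator harvesting several catalogs in this form can merge the `dcat:dataset` lists directly, which is the interoperability point the Recommendation is after.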

If you have comments, now would be a good time to finish them up for submission.

The Core Public Service Vocabulary has entered its public review period. Anyone interested is invited to provide feedback until 27 February 2013 (inclusive).

In December 2012, the ISA Programme launched the Core Public Service Vocabulary (CPSV) initiative as part of Action 1.1 on improving semantic interoperability in European e-Government systems. The CPSV is a simplified, reusable and extensible data model that captures the fundamental characteristics of a service offered by public administrations.

The CPSV is designed to make it easy to exchange basic information about the functions carried out by the public sector and the services in which those functions are carried out. By using the vocabulary, organisations publishing data about their services will for example enable:

easier discovery of those services within and across countries;

easier discovery of the legislation and policies that underpin service provision;

easier comparison of similar services provided by different organisations.

Reusing domain vocabularies in the context of developing the knowledge-based Linked Open Data system is the most important discipline on the web. Many editors are available for developing and managing vocabularies or ontologies. However, selecting the most relevant editor is very difficult, since each vocabulary construction initiative requires its own budget, time, and resources. In this paper a novel unsupervised machine learning based comparative assessment mechanism has been proposed for selecting the most relevant editor. The defined evaluation criteria were functionality, reusability, data storage, complexity, association, maintainability, resilience, reliability, robustness, learnability, availability, flexibility, and visibility. Principal component analysis (PCA) was applied to a feedback data set collected from a survey involving sixty users. The focus was to identify the least correlated features carrying the most independent information variance to optimize the tool selection process. An automatic evaluation method based on bagging decision trees has been used to identify the most suitable editor. Three tools, namely VocBench, TopBraid EVN and PoolParty Thesaurus Manager, have been evaluated. Decision tree based analysis recommended VocBench and PoolParty Thesaurus Manager as better performers than the TopBraid EVN tool, with very similar recommendation scores.

With the caveat that sixty (60) users in your organization (the number tested in this study) might reach different results, a useful study of vocabulary software.

More useful for the evaluation criteria to use with vocabulary software than as an absolute guide to the appropriate software.

EuroVoc 4.4 will be released on December 18, 2012. During this day, the website might be temporarily unavailable.

6,883 thesaurus concepts

This new edition is the result of a thorough revision, among other things according to the concepts introduced by the ‘Lisbon Treaty’. It includes 6,883 thesaurus concepts, of which 85 are new, 142 have been updated and 28 have been classified as obsolete.

These new concepts are the result of proposals sent by librarians from the libraries of the national parliaments in Europe, by the European institutions (namely the European Parliament) and by the users of EuroVoc. All terms in Portuguese have been revised according to the Portuguese language spelling reform; the prior lexical values remain available as Non-Preferred Terms.

EuroVoc, the EU’s multilingual thesaurus

EuroVoc is a multilingual, multidisciplinary thesaurus covering the activities of the EU, the European Parliament in particular. It contains terms in 22 EU languages. It is managed by the Publications Office, which has moved to ontology-based thesaurus management and semantic web technologies conformant to W3C recommendations, as well as the latest trends in thesaurus standards.
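The SKOS-style model behind a thesaurus like EuroVoc, one concept carrying a preferred label per language, with superseded spellings retained as non-preferred labels, can be mocked up in a few lines. The concept URI and the Portuguese labels below are invented for illustration, not taken from EuroVoc.

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    """A minimal SKOS-style thesaurus concept."""
    uri: str
    # language code -> single preferred term (cf. skos:prefLabel)
    pref_labels: dict = field(default_factory=dict)
    # language code -> non-preferred terms (cf. skos:altLabel)
    alt_labels: dict = field(default_factory=dict)

    def label(self, lang: str) -> str:
        """Preferred label in the requested language, falling back to English."""
        return self.pref_labels.get(lang, self.pref_labels.get("en", self.uri))

# Invented example: a Portuguese spelling revised by the orthographic reform,
# with the prior lexical value kept as a non-preferred term so old searches
# still resolve to the same concept.
c = Concept(uri="http://example.org/concept/economic-activity")  # hypothetical URI
c.pref_labels.update(en="economic activity", pt="atividade económica")
c.alt_labels.setdefault("pt", []).append("actividade económica")

print(c.label("pt"))        # current preferred Portuguese label
print(c.alt_labels["pt"])   # prior spelling, still attached to the concept
```

The point of the design is that search and indexing key on the concept, not the string, so a spelling reform updates the preferred label without orphaning documents indexed under the old form.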

There are documents prior to this version of the thesaurus and even documents prior to there being a EuroVoc thesaurus at all.

And there will be documents after EuroVoc has been superseded.

Not to mention in between there will be documents that use other vocabularies.

Good thing we have topic maps to use this resource to its best advantage.

LSC, the Linked Science Core Vocabulary, is a lightweight vocabulary providing terms to enable publishers and researchers to relate things in science to time, space, and themes. More precisely, LSC is designed for describing scientific resources, including elements of research and their context, and for interconnecting them. We introduce LSC as an example of building blocks for Linked Science to communicate the linkage between scientific resources in a machine-understandable way. The “core” in the name refers to the fact that LSC only defines the basic terms for science. We argue that the success of Linked Science—or Linked Data in general—lies in interconnected, yet distributed vocabularies that minimize ontological commitments. More specific terms needed by different scientific communities can therefore be introduced as extensions of LSC. LSC is hosted at LinkedScience.org; please also check the other vocabularies available at LinkedScience.org/vocabularies.
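The machine-understandable linkage LSC aims at, relating elements of research to time, space, and people, boils down to RDF-style triples. A minimal dependency-free sketch is below; the predicate names are LSC-flavoured guesses and the resources are invented, nothing here is verified against the actual vocabulary.

```python
LSC = "http://linkedscience.org/lsc/ns#"  # LSC namespace (per LinkedScience.org)

# (subject, predicate, object) triples; predicate local names are
# illustrative guesses, not verified LSC terms.
triples = {
    ("ex:study1", LSC + "hasResearcher",      "ex:alice"),
    ("ex:study1", LSC + "hasTemporalContext", "ex:summer2011"),
    ("ex:study1", LSC + "hasSpatialContext",  "ex:finland"),
    ("ex:study2", LSC + "hasResearcher",      "ex:alice"),
}

# Interlinking in action: every study connected to ex:alice, across datasets
# that never mention each other directly.
studies = sorted(s for s, p, o in triples if o == "ex:alice")
print(studies)  # ['ex:study1', 'ex:study2']
```

Because the linkage lives in shared identifiers rather than in any one document, community-specific extensions can add predicates without breaking consumers that only understand the core terms.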

All the vocabularies in the VEST Registry are classified by type and subject domain. Most of the vocabularies are related to indexing. The purpose of indexing is to make the content of a collection easier to search and find through the use of controlled/code lists, authority files or controlled subject vocabularies. Indexing ensures that content will be found by users when they search specifically in information management systems. There are different types of vocabularies, such as authority files, classification systems, concept maps, controlled lists, ontologies, taxonomies and subject headings. But under the concept of vocabularies you can also find dictionaries, encyclopedias, glossaries, lexical databases and topic trees.
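The indexing role described above, mapping the free text of a document to controlled terms so that searches hit consistently, can be sketched with a toy controlled list. The entry terms, preferred terms, and documents below are all invented for illustration.

```python
# Toy controlled vocabulary: non-preferred entry terms map to one preferred term.
entry_to_preferred = {
    "maize": "corn",
    "corn": "corn",
    "zea mays": "corn",
    "cows": "cattle",
    "cattle": "cattle",
}

def index_document(text: str) -> set:
    """Return the controlled terms under which this document is filed."""
    lowered = text.lower()
    return {pref for entry, pref in entry_to_preferred.items() if entry in lowered}

docs = {
    "doc1": "Effects of drought on maize yield",
    "doc2": "Grazing behaviour in dairy cows",
}
index = {name: index_document(text) for name, text in docs.items()}

# A search on the controlled term "corn" finds doc1 even though the
# document itself only ever says "maize".
hits = [name for name, terms in index.items() if "corn" in terms]
print(hits)  # ['doc1']
```

Real systems use phrase matching and human indexers rather than naive substring tests, but the payoff is the same: synonyms collapse onto one preferred term, so the user's query and the author's wording no longer have to coincide.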

If I am reading the webpage correctly, 116 separate vocabularies.

Browse through them to get an idea of the range of materials here.

Just on the homepage I see:

African Studies Thesaurus

A structured vocabulary of 12,100 English terms in the field of African studies, the African Studies Thesaurus is developed and maintained by staff at the library of the African Studies Centre Leiden. It is used for indexing and retrieving material in the library collection and is directly linked to the catalogue.
…ARABTERM United Nations Multilingual Terminology Database of the Arabic Translation Service

ARABTERM is a multilingual terminology database which provides United Nations nomenclature and special terms in four of the official UN languages – Arabic, English, French and Spanish. The database is mainly intended for use by the language and editorial staff of the United Nations to ensure consistent translation of common terms and phrases used within the Organization.
…Biological Entities

This ontology manages reference data about the biological species needed for fisheries fact sheets and statistical information, among other resources. Species items are organized and maintained in the Aquatic Science and Fisheries Information System (ASFIS), which currently includes nearly 11,000 species items related to Fisheries and Aquaculture.
…CABI thesaurus

The CAB Thesaurus is the essential search tool for all users of the CAB ABSTRACTS™ and Global Health databases and related products. The CAB Thesaurus is not only an invaluable aid for database users, but it also has many potential uses for individuals and organizations indexing their own information resources, both for internal use and on the Internet.

No slight intended towards any vocabulary I didn’t mention. Just a random listing from the homepage.

The FAAT List is a handy reference for the myriad acronyms and abbreviations used within the federal government and the emergency management and first response communities. This year’s new edition, which continues to reflect the evolving U.S. Department of Homeland Security, contains an approximately 50 percent increase in the number of entries and definitions, bringing the total to over 6,200 acronyms, abbreviations and terms. Some items listed are obsolete, but they are included because they may still appear in publications and correspondence. Obsolete items can be found at the end of the document.