Library metadata techniques and trends by Thom Hickey

Corporate VIAF

For those not familiar with VIAF, it collects, matches and makes available library authority files, mostly from national libraries. We are currently merging 21 files from 18 participants, and the latest VIAF database has 14 million clusters formed from 17 million name records and 60 million bibliographic records. Until yesterday VIAF only matched personal names.

Normally we update VIAF monthly, based on new data from participants, but since July we have been concentrating on extending VIAF to corporate and conference names. Corporate names pose something of a challenge to match across files. The names are often quite generic, as are the titles of documents associated with them. Dates are seldom available, and the names themselves are more often translated into local forms than personal names are. Luckily, the size of VIAF seems to help our matching. The more records we have for a name, the closer we come to a comprehensive set of cross-references for it, which improves the matching process.
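To illustrate why more records can help, here is a minimal sketch in Python. It is not VIAF's matching code: the `normalize` and `cluster` functions, the record identifiers, and the sample names are all hypothetical, and the only rule it applies is "two records belong together if they share any normalized name form." Under that assumption, two records with no name in common can still end up in one cluster once a third record supplies cross-references that bridge them.

```python
import unicodedata
from collections import defaultdict


def normalize(name):
    """Lowercase and strip accents and punctuation so simple variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    kept = [
        c.lower() if c.isalnum() else " "
        for c in decomposed
        if not unicodedata.combining(c)
    ]
    return " ".join("".join(kept).split())


def cluster(records):
    """Group record ids that share any normalized name form.

    `records` maps a record id to its list of names (heading plus
    cross-references). This is an illustrative rule, not VIAF's algorithm.
    """
    parent = {rid: rid for rid in records}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # normalized name -> first record id that carried it
    for rid, names in records.items():
        for name in names:
            key = normalize(name)
            if key in first_seen:
                parent[find(rid)] = find(first_seen[key])
            else:
                first_seen[key] = rid

    groups = defaultdict(set)
    for rid in records:
        groups[find(rid)].add(rid)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical records from three different authority files.
two_files = {
    "a": ["United Nations"],
    "b": ["Nations Unies"],
}
three_files = dict(two_files, c=["Vereinte Nationen", "United Nations", "Nations unies."])

print(cluster(two_files))    # "a" and "b" share no name form
print(cluster(three_files))  # "c" carries cross-references that link both
```

A rule this simple would also merge unrelated bodies that happen to share a generic name, which is exactly the kind of bad cluster described above, so any real matcher needs more evidence than a shared string.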

Since we have been working on matching personal names in VIAF for over five years, the personal name matching is more refined than the corporate matching, but we think the corporates are at a stage where they will be useful.

So, look up some corporations in VIAF and let us know what isn't working (and maybe even what is). We are especially interested in clusters that bring together names that should not be grouped together. I know there are some of these (Jenny and I found a few yesterday), but your example might be a new pattern we are not aware of.

Credits: VIAF is a joint project of the BnF, DNB, LC and OCLC, which builds on the decades of work of untold librarians around the world creating library authority files. In addition, Jenny Toves worked on the corporate matching, Jeff Young has redone the RDF, J.D. Shipengrover does our interface design, and Ralph LeVan provides VIAF's text retrieval and linked data infrastructure.