Metadata projects in three easy steps: where to begin?

04Sep

Rather than discuss cataloging alone, which is one specific type of metadata work, this post focuses on metadata as a whole. The advice applies to cataloging as well as to broader projects, and I want to share what I have learned from my most recent project. Mainly, this post is about how to start a metadata project in three steps.

1) Creating metadata

First and foremost is the creation of information about information, or metadata. Even before the creation, however, decisions must be made: what information is needed, and in how much detail? For example, suppose someone wants to make a list of all their board games. The name of each game is essential, but the number of players and typical duration might also be good to include, so that friends who come over to play can decide on a game quickly.

Usually metadata is readily available, such as the name of a board game on the box, so creating metadata in this case means collecting it. Whether it gets written down by hand or typed up in a document, this gathering of information is really the creation of metadata. But sometimes it isn’t that simple, since many items lack identifying information. What if the board game box is missing? Then the task becomes either describing the item as well as possible from what is at hand or searching to track down the name and other information. Searching can be a large part of creating metadata, and there are numerous ways to fill in what is lacking: the Internet, books or other materials, asking around, and asking the people the item belonged to or who were involved with it. If my friend made a new board game and I played it with them, I could likely provide the name and other information about it if someone asked me.

In most cases, metadata can be created or found in our tech-savvy, micro-connected information world. It might take time, searching, and re-searching, since new information goes online and is discovered every second of every day, and new experts emerge who see and piece together information differently. But other times, using your best judgment to provide some semblance of metadata for the item at hand is the only option available.

2) Organizing metadata

This step follows hand-in-hand with the creation of metadata because organizing it is crucial to ensure consistency of a project. Metadata projects usually become large very quickly and need to be orderly to ensure that all of the wanted information is there for each item. All metadata must be in the same location, in the same format, and be as complete as possible. If the board game list was partially written down on scraps of paper in different people’s handwriting while the other portion was typed up on someone’s computer and someone else’s smartphone, how would anyone know if the list was complete before deciding on the game to play? Metadata must be organized to be meaningful.
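To make the idea of "as complete as possible" concrete, here is a minimal sketch of a completeness check once everything has been gathered into one place. It assumes each game is stored as a dictionary with the fields `name`, `players`, and `duration`; those field names, the function name `find_incomplete`, and the sample games are all made up for illustration.

```python
# Hypothetical field names for the board game list; adjust to your own project.
REQUIRED_FIELDS = ("name", "players", "duration")


def find_incomplete(records):
    """Return (record index, list of missing fields) for each incomplete record."""
    problems = []
    for index, record in enumerate(records):
        # A field counts as missing if it is absent, None, or only whitespace.
        missing = [
            field
            for field in REQUIRED_FIELDS
            if not str(record.get(field) or "").strip()
        ]
        if missing:
            problems.append((index, missing))
    return problems


# Made-up sample data: the second game has a blank player count,
# and the third has no duration recorded at all.
games = [
    {"name": "Example Game A", "players": "2-4", "duration": "45 min"},
    {"name": "Example Game B", "players": "", "duration": "30 min"},
    {"name": "Example Game C", "players": "3-6"},
]

for index, missing in find_incomplete(games):
    print(f"Record {index} is missing: {', '.join(missing)}")
```

A check like this only works once the scraps of paper, the document, and the smartphone notes have been merged into one list, which is the point of the step.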

Today, placing metadata on a computer or in the cloud makes it readable and accessible, but there are still various ways to organize it. Should the information be typed into a document, a spreadsheet, a database, a MARC record (for cataloging), or a website made to house that type of metadata? This decision depends on the project itself. A board game list would suffice as a document, but a spreadsheet is a much better choice if the list needs to be sorted easily by number of players. If the list is for fun, a database is probably not needed, but if it’s for a collection of every board game ever made (or Amazon’s warehouse of board games), a database is the best option. The scale and use of the metadata should determine how it is organized.

Another aspect of organizing is sorting. This also relates to the decision of what information is needed since categories will help ensure consistency. For a board game spreadsheet, three categories for the metadata have already been decided: name of the game, number of players, and typical duration of the game. The spreadsheet needs three columns, which would have shortened column titles, so Name, Players, and Duration; identifying the spreadsheet as So-and-So’s Board Games will allow anyone who sees it to know exactly what the spreadsheet is and means. Ensuring that the metadata is clearly marked helps those who work on the project to be on the same page, as it were, and others who see or use it to know as well. When organizing metadata, make sure to categorize information precisely and concisely.
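As a rough sketch of what that three-column spreadsheet could look like in practice, the snippet below writes the list as CSV text (which any spreadsheet program can open) and sorts it by one column. The game names are placeholders, and storing Players as a maximum player count and Duration in minutes are my assumptions, chosen so the columns sort cleanly.

```python
import csv
import io

# The three column titles decided on for the board game spreadsheet.
COLUMNS = ["Name", "Players", "Duration"]

# Placeholder rows. Assumption: Players is the maximum number of players
# and Duration is the typical play time in minutes.
games = [
    {"Name": "Example Game A", "Players": 4, "Duration": 45},
    {"Name": "Example Game B", "Players": 2, "Duration": 30},
    {"Name": "Example Game C", "Players": 6, "Duration": 90},
]


def to_csv(rows, sort_by="Name"):
    """Return the rows as CSV text with a header line, sorted by one column."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(sorted(rows, key=lambda row: row[sort_by]))
    return buffer.getvalue()


print(to_csv(games, sort_by="Players"))
```

Keeping the column titles in one place (`COLUMNS` here) is what keeps every row consistent: a row with an unexpected category makes `csv.DictWriter` raise an error instead of quietly adding a fourth column.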

3) Applying metadata

Depending on the metadata project, this third step will vary. How will the metadata be used? For cataloging, it is meant to be displayed to and searchable by the library user in order to find materials, but not all of the metadata is shown because most of it won’t mean much to that user. Sure, the call number and title are helpful, but the MARC record fixed fields and some local librarian notes are not, even though they must still be included for system or librarian use. For the board game list, everything is viewable since there isn’t much in the spreadsheet and it is all relevant to selecting a game. If the spreadsheet also recorded when each game was bought, how much it cost, or whether the owner hated or loved it, those columns could be hidden, since they mean something to the owner but the owner might not want them displayed to others.
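Hiding columns can be sketched in a few lines as well. This is only an illustration under my own assumptions: the extra fields `Purchased`, `Price`, and `OwnerRating`, their values, and the function name `public_view` are invented for the example. The full record is kept, and only the fields meant for others are copied into what gets shown.

```python
# Fields that are fine to show to friends; everything else stays private.
PUBLIC_FIELDS = ("Name", "Players", "Duration")


def public_view(record):
    """Return a copy of the record containing only the public fields."""
    return {field: record[field] for field in PUBLIC_FIELDS if field in record}


# Made-up record with three hypothetical owner-only fields.
game = {
    "Name": "Example Game A",
    "Players": 4,
    "Duration": 45,
    "Purchased": "2012-06-01",
    "Price": 29.99,
    "OwnerRating": "loved",
}

print(public_view(game))
```

Listing what is shown, rather than what is hidden, means any column added later stays private until someone decides otherwise.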

Besides concealing metadata that is not relevant to users, search engines and other companies that provide databases and information on websites also conceal certain metadata, such as proprietary internal knowledge or processes. There is more than meets the eye, especially when it comes to search results from the web or a database. Hiding some information can also be a way of protecting metadata that is meant to remain anonymous even though it is still connected in some way to the displayed information. As a physical application of metadata, think about the President of the United States’ motorcade: of the five or so black SUVs driving to an event, the general public has no idea which one the President is in, although that metadata is known to a select few.

This third step also involves the how of displaying metadata. For the board game list, allowing anyone to view the spreadsheet themselves is probably fine since the list won’t be long and only has relevant information in it. However, for libraries and companies, displaying certain metadata and providing a means of interacting with it is a necessary chore. A library catalog that uses MARC record metadata can have settings adjusted to show certain fields to a library user, such as call number and title. The same is true for companies and their search results on the web or in a database. For someone beginning a metadata project from scratch, this would be more difficult without software to help. Libraries typically use integrated library system software, while companies probably build software internally to have more control. Even before considering how to display metadata, the initial question should be: does the metadata need to be displayed at all? Not every metadata project needs a display; it depends on the purpose of the project and who the users will be. If the project is meant for internal use, to organize information and keep track of items, then a display component probably isn’t necessary. But if it is going to be used by outside researchers and users, then some form of search and display capability is needed so that they can use the metadata themselves rather than ask someone on the project for help each and every time. Creating a display won’t be covered in this post, but there are lots of possibilities and plenty of capable people in the world for setting something up.

Why this blog post now, when I’ve worked as a cataloger for a few years already? In July, I began setting up a new metadata project for the website OpenCoverLetters.com which houses redacted, successful librarian cover letters. All of my knowledge of and skills for cataloging are being used but applied in a different manner for this project. This led me to think about metadata projects more broadly, especially when it comes to first beginning one.

The real challenge is in deciding if a metadata project is needed, and then the rest plays out from there. I hope this post explains metadata further and encourages metadata projects, whether personal or professional. We all have a lot of possible metadata projects in our lives that could be done, but the question is which ones should be done. Priorities matter since projects like these take a lot of time and effort to get started and to keep going as new information and items become available. Upkeep is important as well. Which metadata projects will make the most impact, either by helping locate resources or by providing new information about unique and rare materials? High-impact ones are a must. For example, my dad’s antique book collection is at the top of my list for when I am home for the holidays this year; watch for that blog post in late December 2013, if all goes well. Good luck with new or continuing metadata projects!