Catalog Software Basics

Peter Krogh

Catalog software is an essential component in the management of an image collection. It lets you know what's in your image collection, and enables you to group the images in useful ways. It also lets you browse offline images, manage migration and validation, and restore the collection in the event of drive failure.

Browsers vs. Catalogs

There are two fundamental kinds of software you can use to work with large groups of images: browsers and cataloging software. A browser is a piece of software that can display the contents of a folder or drive for inspection. It will have some organizing tools, but does not have the ability to remember what is in the collection when it’s not connected to the storage media. Cataloging software harvests a thumbnail or preview of the images, as well as the metadata, and keeps it in a database. It will have robust tools to create and manipulate metadata to organize and remember information about the images.

At first, browser and cataloging applications look similar. Each one can display multiple files, sort according to multiple criteria, and hand files off to other programs. Behind the scenes, however, there is an important difference. A browser extracts data from the files on a more or less “real-time” basis and builds its utility around this information. Cataloging software, however, keeps a permanent catalog of information about the images, including thumbnails, metadata and more.

Capabilities of catalog software

Because cataloging software keeps the extracted information in a database, it has several important advantages over a browser. The differences between the application types don’t really become apparent until you have a large number of files to work with.

It’s DAM faster

One thing cataloging software can do better than a browser is to return search results quickly. Because cataloging software keeps all the organizational information in a database document, it only needs to do a local search to find, for instance, all images with “Josie” written in the keywords. A browser may have to look through the keywords of 100,000 files stored on several different drives to return the same results.
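To make the speed difference concrete, here is a minimal sketch of a keyword search against a catalog database. It assumes a hypothetical two-table layout (images and keywords) in SQLite; real cataloging applications use their own schemas, and the paths and names below are invented for illustration. The point is only that the search reads the local database and never opens the image files.

```python
import sqlite3

# Hypothetical catalog layout: one row per image, one row per (image, keyword) pair.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT NOT NULL);
    CREATE TABLE keywords (image_id INTEGER REFERENCES images(id), keyword TEXT NOT NULL);
    CREATE INDEX idx_keyword ON keywords(keyword);
""")

# Illustrative records; the files themselves could sit on several different drives.
sample = [
    ("/Volumes/Archive_01/2008/josie_beach.dng", ["Josie", "beach"]),
    ("/Volumes/Archive_02/2009/maddy_park.dng", ["Maddy", "park"]),
    ("/Volumes/Archive_03/2010/josie_maddy.dng", ["Josie", "Maddy"]),
]
for path, words in sample:
    cur = conn.execute("INSERT INTO images (path) VALUES (?)", (path,))
    conn.executemany(
        "INSERT INTO keywords (image_id, keyword) VALUES (?, ?)",
        [(cur.lastrowid, word) for word in words],
    )


def find_by_keyword(keyword):
    """Return the paths of all images tagged with the keyword, without opening any image file."""
    rows = conn.execute(
        "SELECT i.path FROM images i JOIN keywords k ON k.image_id = i.id "
        "WHERE k.keyword = ? ORDER BY i.path",
        (keyword,),
    )
    return [path for (path,) in rows]


print(find_by_keyword("Josie"))
```

A browser answering the same question would have to open each file on each drive and read its embedded keywords, which is why the gap widens as the collection grows.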

It allows collection-wide filtering

Because cataloging software can harvest and contain so much information, it allows you to organize images with much more flexibility. Since all the date information from a large group of images can be collected in one place, for instance, it’s easy to filter down to images only from a particular time frame to find just what you’re looking for. And you can easily use the shoot-date information combined with location information to find images from a certain time and place.
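The combined time-and-place filter described above might look something like the following sketch. The Record type, its field names, and the sample entries are assumptions made for illustration; they are not taken from any particular catalog program.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    """One hypothetical catalog entry holding harvested metadata."""
    filename: str
    shot_on: date
    location: str


catalog = [
    Record("IMG_0001.dng", date(2009, 6, 14), "Grand Canyon"),
    Record("IMG_0002.dng", date(2009, 6, 15), "Grand Canyon"),
    Record("IMG_0003.dng", date(2009, 8, 2), "Iceland"),
    Record("IMG_0004.dng", date(2010, 6, 14), "Grand Canyon"),
]


def filter_records(records, start, end, location=None):
    """Keep records shot between start and end (inclusive), optionally at one location."""
    return [
        r for r in records
        if start <= r.shot_on <= end
        and (location is None or r.location == location)
    ]


# Shoot date combined with location: Grand Canyon images from June 2009.
june_canyon = filter_records(catalog, date(2009, 6, 1), date(2009, 6, 30), "Grand Canyon")
print([r.filename for r in june_canyon])
```

Because every record is already in one place, adding another criterion (camera, lens, rating) is just one more condition in the same filter.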

There will be times when it's useful to browse by any number of metadata types. You might want to search by camera or lens, as well as annotations like date or location. Catalog software lets you collect images according to what information is useful to you.

Read more about metadata categories in the Metadata Overview.

File Management with catalog software

By collecting all the information about your images in one place, catalog software can assist in all kinds of file management tasks. It can let you check that files are where they are supposed to be. Because you can see and control images from a single spot, catalog software can also help you in data migration tasks, such as transfer to new media.

Catalog software knows where stuff is supposed to be

Since catalog software knows where files are supposed to be, it can pull up an image without having to dive through folders looking for it. The file path is just another piece of metadata that the catalog remembers about a file. This information can also assist you in keeping track of images that you may have erased, renamed, or moved accidentally. A cataloging application will be able to tell you that an image is missing and should be found or restored from your backup, while a browser will simply omit the file. Cataloging software therefore helps you to truly manage your files.

Read more in Data Validation Overview.
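As a rough illustration of the difference, the sketch below compares a list of remembered paths against what is actually on disk. The function name and the file names are hypothetical, and the demonstration runs in a temporary folder rather than a real archive.

```python
import os
import tempfile


def find_missing(catalog_paths):
    """Return the catalogued paths that no longer exist on disk."""
    return [path for path in catalog_paths if not os.path.exists(path)]


with tempfile.TemporaryDirectory() as folder:
    kept = os.path.join(folder, "IMG_0001.dng")
    lost = os.path.join(folder, "IMG_0002.dng")
    for path in (kept, lost):
        with open(path, "wb") as handle:
            handle.write(b"placeholder")

    # The catalog remembers both paths at the time of indexing.
    catalog_paths = [kept, lost]

    # Simulate an accidental deletion after the catalog was built.
    os.remove(lost)

    # The catalog can report the gap; a plain folder listing just omits the file.
    missing = find_missing(catalog_paths)
    browser_view = sorted(os.listdir(folder))

print("Missing according to catalog:", [os.path.basename(p) for p in missing])
print("Visible in folder listing:", browser_view)
```

A real catalog would also need to tell a missing file apart from a file on a drive that is simply offline; this sketch does not attempt that distinction.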

It allows faster backup of important organizational work

Cataloging software allows you to back up your valuable sorting work quickly and thoroughly. Because the cataloging application stores all the information you use to organize your collection in one place (the catalog), it is easy to back up your work after every sorting session. If you are using a browser to do sorting work, you will need to write a sorting term — a keyword, for instance — back into the original files. You may then have a bunch of widely distributed metadata that you need to back up if you want to be sure you are saving this work. This adds quite a bit of time and complexity to the process of saving your work compared to simply saving the catalog document.

Virtual sets

Even more important than filtering is the ability of good cataloging software to create and keep virtual sets. Virtual sets are like folders that you keep images in, except that you create them by pointing to the original files, rather than by moving the files into folders. This allows you to include an image as part of multiple sets without having to copy the file multiple times — for instance, the same file can live in the “Vacation” group, the “Grand Canyon” group, the “Pictures of Maddy” group, the “Stock Photos” group, and the “Mom’s Favorites” group.
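One way to picture a virtual set is as a named collection of pointers to files. The sketch below uses a plain dictionary of sets keyed by group name; the structure and the file paths are illustrative assumptions, not how any specific catalog stores its groups.

```python
from collections import defaultdict

# Each virtual set maps a group name to the paths it points at.
# The files themselves are never copied or moved.
virtual_sets = defaultdict(set)


def add_to_set(set_name, image_path):
    """Add a pointer to image_path to the named virtual set."""
    virtual_sets[set_name].add(image_path)


def sets_containing(image_path):
    """List every virtual set that points at image_path, in alphabetical order."""
    return sorted(name for name, members in virtual_sets.items() if image_path in members)


photo = "/Volumes/Archive_01/2009/IMG_0001.dng"
for name in ("Vacation", "Grand Canyon", "Pictures of Maddy", "Stock Photos", "Mom's Favorites"):
    add_to_set(name, photo)
add_to_set("Vacation", "/Volumes/Archive_01/2009/IMG_0002.dng")

print(sets_containing(photo))
```

The same file belongs to five groups here, yet only one copy exists on disk.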

Virtual sets let you create groups with specific intent, which is more valuable than simple filtering. While it’s useful to filter ratings against keywords to find, say, your best architecture photos, it’s even more valuable to select specific images for your architecture portfolio and save that grouping. This group, created with specific intent, has significantly more value than the filtered group because you’ve put more work into the selection process. You’ll want to return to it and know that it won’t change unless you tell it to.

The best of the cataloging applications will also let you organize your groups into nested subgroups. For example, within the “Collections” group, you can create a subset called “Projects,” and within that you can create the “Iceland Adventure” group. This set of images can, in turn, be subdivided into “Book Candidates,” which can be further refined to the “Book Finals” group. These nested virtual sets are a versatile way to organize your collection. Unlike the standardized metadata fields, they let you organize your collection according to what you shoot and the way you think. You can even subdivide the collection in many different ways—for example, dividing images into assignments in one nested hierarchy and dividing by subject matter in another. Nested virtual sets also help to reduce the visual clutter that organization can create. We suggest using several top-level hierarchies to help wrangle all the organizational terms. Figure 1 shows one way to use these.

Figure 1: Nested virtual sets let you organize your images in whatever way makes the most sense for what you shoot and how you think.
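The nested hierarchy described above can be sketched as a simple tree, where each group holds its own images plus any number of subgroups. The VirtualSet class and its methods are hypothetical names chosen for this example, and the image names are invented.

```python
class VirtualSet:
    """A named group that points at images and can contain subgroups."""

    def __init__(self, name):
        self.name = name
        self.images = set()
        self.children = {}

    def get(self, *names):
        """Walk down the given subgroup names, creating any that do not exist yet."""
        node = self
        for name in names:
            if name not in node.children:
                node.children[name] = VirtualSet(name)
            node = node.children[name]
        return node

    def all_images(self):
        """Return the images in this group and in every subgroup beneath it."""
        found = set(self.images)
        for child in self.children.values():
            found |= child.all_images()
        return found

    def outline(self, depth=0):
        """Return the hierarchy as indented lines, one per group."""
        lines = ["  " * depth + self.name]
        for name in sorted(self.children):
            lines.extend(self.children[name].outline(depth + 1))
        return lines


collections = VirtualSet("Collections")
candidates = collections.get("Projects", "Iceland Adventure", "Book Candidates")
candidates.images.update({"glacier.dng", "harbor.dng", "geyser.dng"})
finals = candidates.get("Book Finals")
finals.images.add("glacier.dng")

print("\n".join(collections.outline()))
```

A second top-level tree, say one organized by subject matter, could point at the same images without affecting this one.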

Supported formats

Format support is a critical issue for many people managing archives. While the dedicated photo managers (Lightroom, Aperture) generally do a good job of letting you work with digital camera files, most of them limit support to in-camera still images and plain vanilla JPEG, DNG, TIFF, and PSD. Many other common file types, such as in-camera movies, some layered files, and CMYK files are often not supported. File types beyond that (PDF, Flash, QuickTime movies, text files, design documents, audio) are entirely unsupported. This will increasingly become a problem for anyone managing a photo archive, as more and more still cameras produce movies, and as you use still images to create other important documents that you must manage.

Once again, at least for the time being, we are faced with a decision between imperfect solutions. PIEware (parametric image editing software) offers wonderful functionality for working with photos, but leaves out the other file types entirely. As you weigh your options, it’s important to consider how valuable these unsupported file types are to how you work with photos and other media.

Read more about formats in File Formats Overview.

Capacity

Ideally, you would only need one catalog to manage your entire photo collection. You would have only one place to look to find any image, and you could group images with any other images in the entire collection. If your entire collection is fewer than 20,000 total image files, you can achieve that today. Once your collection stretches to 50,000 or more image files, however, things start to get more complicated. Some programs are limited to a certain number of records. Some programs don’t have a hard limit, but become slow and unwieldy as the catalog grows too large.

Check the specifications of the program to see what limits are built in. You might also want to test the software with your own image collection to see how it performs with your images on your hardware.

Exporting metadata: The prenup

Photographers sometimes express wariness about using catalog software because they don’t want to be left hanging if the manufacturer discontinues the program. Nor do they want to be obliged to own that particular application forever in order to have access to the organizational work they’ve done.

According to writer and photographer John Beardsworth, a photographer’s relationship with a catalog program should be one of “serial monogamy,” meaning that you need to be married to the application when you’re using it, even though you may divorce later and take up with some other program. If that happens, you will want to bring all your hard-won metadata with you, and not have to leave it behind. This means that the software must have a way to export the information in some usable form.

Figure 2: When a catalog first indexes an image, it imports metadata. The ability to push metadata back to the files from the catalog is your "pre-nup"; this offers you the ability to take all your work with you to a new catalog program in the future.

XMP interchange capability

For photographs, the best way to make your data portable is probably going to be an XMP export back into the original file. If the organizational work you do to your images can be written back to the metadata of the file, that work is portable and can be made universally accessible.
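To give a sense of what portable metadata looks like, the sketch below builds a minimal XMP document that carries keywords in the Dublin Core subject field, then reads them back the way a new catalog might. It only produces the XML text; embedding that XML into a DNG or other image file requires a metadata tool and is not shown. A complete XMP packet also carries wrapper markers and usually many more fields, so treat this as a simplified assumption rather than a full implementation. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespaces used by XMP for the packet wrapper, RDF structure, and Dublin Core fields.
NS = {
    "x": "adobe:ns:meta/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}
for prefix, uri in NS.items():
    ET.register_namespace(prefix, uri)


def q(prefix, tag):
    """Build a namespace-qualified tag name for ElementTree."""
    return f"{{{NS[prefix]}}}{tag}"


def build_xmp(keywords):
    """Return XMP-style XML text holding the keywords as an unordered list in dc:subject."""
    xmpmeta = ET.Element(q("x", "xmpmeta"))
    rdf = ET.SubElement(xmpmeta, q("rdf", "RDF"))
    description = ET.SubElement(rdf, q("rdf", "Description"), {q("rdf", "about"): ""})
    subject = ET.SubElement(description, q("dc", "subject"))
    bag = ET.SubElement(subject, q("rdf", "Bag"))
    for keyword in keywords:
        ET.SubElement(bag, q("rdf", "li")).text = keyword
    return ET.tostring(xmpmeta, encoding="unicode")


def read_keywords(xmp_text):
    """Read the keywords back out, as a different program might when indexing the file."""
    root = ET.fromstring(xmp_text)
    return [li.text for li in root.iterfind(".//dc:subject/rdf:Bag/rdf:li", NS)]


xmp = build_xmp(["Josie", "Grand Canyon"])
print(read_keywords(xmp))
```

Because the keywords live in a standard field, any program that understands XMP can pick them up without knowing which catalog wrote them.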

Once the data is pushed into the file, moving to a new cataloging application is really easy: simply catalog the files with the new software, and all your information should be available there. You might need to copy data from one field to another before it becomes visible in the new program.

One of the reasons we suggest use of the DNG format is that it’s very safe to use the XMP space inside the DNG file to carry the metadata. It’s a very elastic space, able to hold as much information as you like. And because there is only one place to embed any field in the file, it’s much less likely to lead to any data collisions.

There are other ways to export metadata so you can use it in new cataloging applications, such as exporting XML data from the catalog (a big list of all the metadata for each file) or even exporting it as a tab-delimited file. Both of these bulk exports, however, require some experience working with database import/export, and can be difficult to accomplish.
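For readers curious what a tab-delimited export involves, here is a minimal sketch using Python's standard csv module. The field names, the sample records, and the choice to join keywords with "; " are all assumptions for illustration; a real export would follow whatever columns the old and new programs agree on.

```python
import csv
import io

# Hypothetical catalog records to export.
records = [
    {"filename": "IMG_0001.dng", "date": "2009-06-14",
     "location": "Grand Canyon", "keywords": ["Josie", "vacation"]},
    {"filename": "IMG_0003.dng", "date": "2009-08-02",
     "location": "Iceland", "keywords": []},
]
FIELDS = ["filename", "date", "location", "keywords"]


def export_tsv(rows, out):
    """Write one header line, then one tab-delimited line per record."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for row in rows:
        flat = dict(row)
        # Multi-valued fields must be flattened into a single column.
        flat["keywords"] = "; ".join(row["keywords"])
        writer.writerow(flat)


def import_tsv(src):
    """Read the export back, splitting the keywords column into a list again."""
    restored = []
    for row in csv.DictReader(src, delimiter="\t"):
        row["keywords"] = row["keywords"].split("; ") if row["keywords"] else []
        restored.append(dict(row))
    return restored


buffer = io.StringIO()
export_tsv(records, buffer)
exported_text = buffer.getvalue()
buffer.seek(0)
round_trip = import_tsv(buffer)
print(exported_text)
```

Even in this small example, the multi-valued keywords field needs a flattening convention that both sides must share, which hints at why bulk exports take more care than writing XMP back to the files.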

Output

Any catalog program you consider should have the capability to create useful output from your image files. This includes converting and resizing images as well as creating collections of images in web galleries, contact sheets, slideshows, movies, and more. In fact, making use of your image files is one of the most important capabilities that catalog software brings to the table.

In order for output from a catalog to be valuable, it needs to be reasonably accurate, according to the intentions of the user. PIEware reproduces accurate renderings of images created in the software. With freestanding catalog programs, it can be tougher to get an accurate rendering, especially with proprietary raw files. If you edit a CR2 in Adobe Camera Raw (ACR) or Lightroom and then look at it in non-Adobe catalog software, the catalog simply won’t see the edits. If you save that same file as a DNG with a full-size preview, however, the catalog can use that preview to render the file.

Read more about DNG in Raw File Formats.

Figure 3: Your catalog software should be able to make useful derivatives of your images in bulk form. Web galleries, contact sheets, prints, files, slideshows and videos are all common capabilities.

Catalog PIEware

In the last few years, the software landscape for imaging tools has been revolutionized by the emergence of cataloging programs that include a way to do nondestructive image editing — cataloging PIEware. Adobe Lightroom and Apple’s Aperture are two examples of programs that can manage an archive as well as make the pictures look the way you want them to. This approach will eventually provide the best possible way to manage an archive, but the current versions of the software may not yet provide everything you need.

While the image editors in these cataloging all-in-one applications are as good as any PIEware out there, the catalog capabilities are not as robust as the dedicated catalog software (at least as of this writing). Building a good catalog program is surprisingly difficult, and it’s made doubly hard when the developers are incorporating a parametric image editor into the mix. It’s clear in both Lightroom and Aperture that some very significant catalog capabilities are simply not built yet. These omissions include working with the video files that come out of many digital cameras, flexible organizational tools, and multi-computer capabilities. They also don’t currently have the ability to open multiple catalogs at the same time, nor to search for missing files on a collection-wide basis.

If you can do everything you need to with an all-in-one application, your life can be considerably simpler than it would be with a multi-application workflow. A person with one “photography” computer whose images can fit into a single catalog can probably do everything inside Lightroom or Aperture and would be well advised to do so (depending on what other needs they might have).

Once you start adding more computers to the mix, or more than one user, or more than one drive for storage, or more than one catalog of images, these programs start to break down from a workflow perspective. We suggest it’s better to use a dedicated cataloging program to organize and manage the collection as a whole, even if you still use your cataloging PIEware to manage large groups of images while they are works in progress.

Read more about Catalog PIEware.