The Great Photo Organization Project

Alternate title: I am in the fifth circle of photo organization hell, reserved for those who knew they should be organizing their photos better but didn’t because they were lazy bastards.

So begins the great personal photo organization project. Here’s the situation: I’m looking at 30,000+ photos, a mix of personal and professional work, shot with 5 different cameras, spanning 10 years, in thousands of date-coded folders. The problem is simple: I can’t find anything. I’d like, for example, to see all photos of my daughter, oldest first. Or see all of the photos I’ve shot for stock. Can’t. I don’t have any catalogs, libraries, or tags.

I know. I’m an idiot.

I’ve entered the world of digital asset management (DAM). DAM in a nutshell is a process or system for organizing photos so that you can a) find the photo you want when you need it and b) archive your photos forever. You may be tempted to rely on your memory if you have just a few thousand photos. Maybe your memory is better than mine. But I’ve crossed a personal threshold where it’s becoming increasingly frustrating to find old photos.

So here’s the simple and daunting plan: starting with my most recent photos, tag every photo with keywords. I’m using software that can add tags to photos and then allows you to search for photos with those tags. So, all photos I shoot for stock might be tagged “Stock.” Photos of landscapes shot for stock might be “Landscape, Outdoors, Stock.” I’m committed to tagging for an hour a day, every day, until it’s done.

Aside from the sheer enormity of the task (30,000 is a lot of photos), I’m faced with some practical problems related to tagging that I hope you can help me with:

As soon as I tagged my second photo I realized that I need a common taxonomy for my tags. I can’t tag some photos “Bird” and others “Birds” and still others “Winged beasts.” I need to pick a definitive tag and use it for all of my photos. How do I decide? Should all tags be plural or singular? Does it matter? Also, while I’ll be the main user of the archive for now, one day my children or grandchildren will inherit my collection, so the tags will need to make sense to other people besides myself.
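One way to picture a "definitive tag" is an alias table that maps every variant to a single canonical spelling. The following is only a minimal sketch in Python: the `ALIASES` table and both function names are hypothetical, made up for illustration, and are not part of any tagging tool mentioned here.

```python
# Hypothetical alias table: every known variant maps to one canonical tag.
ALIASES = {
    "bird": "Bird",
    "birds": "Bird",
    "winged beasts": "Bird",
}


def canonical_tag(raw: str) -> str:
    """Return the canonical form of a tag, or the tidied input if it is unknown."""
    tidy = " ".join(raw.split())          # collapse stray whitespace
    return ALIASES.get(tidy.lower(), tidy)  # lookup ignores case


def canonical_tags(raw_tags):
    """Normalize a list of tags and drop duplicates, keeping first-seen order."""
    result = []
    for tag in raw_tags:
        canon = canonical_tag(tag)
        if canon not in result:
            result.append(canon)
    return result


print(canonical_tags(["Bird", "birds", "Winged  beasts", "Stock"]))
```

Whether the canonical form ends up singular or plural matters less than having exactly one entry on the right-hand side of the table.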

Tags don’t have hierarchy. But I’d like to group related tags together. People, for example. Is it better to tag photos of people with just their name? Or something like this: “People:John Watson”. Or use multiple tags like this: “People, John Watson”? The idea being, I’d like to see all of my photos of John Watson (handsome fellow) but I think sometimes I’d also like to see photos of all of my people.

Do you have an opinion on where the metadata should be stored? As I see it there are two choices: in the photo itself or in a central database. There are pros and cons for each method. Searching is much faster if you have a database. Compatibility with future DAM software is better if you store metadata inside the photo. Some people advocate not altering originals at all, not even the metadata. If you store metadata in your images, how do you alter your tags if you decide to rename one or you want to combine two similar tags into one?
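To make the rename/combine question concrete, here is a rough sketch of the logic, assuming the keywords have already been read out of the files into a plain Python dictionary. The `library` contents and the `merge_tags` name are hypothetical. Actually reading and writing embedded metadata would need a separate tool and is not shown.

```python
def merge_tags(library, old_tags, new_tag):
    """Replace every tag in old_tags with new_tag.

    library maps a file name to its set of keywords and is updated in place.
    Returns the names of the photos that changed, which are the files whose
    embedded metadata would need to be rewritten.
    """
    old = set(old_tags)
    changed = []
    for name, tags in library.items():
        if tags & old:                      # photo carries at least one old tag
            library[name] = (tags - old) | {new_tag}
            changed.append(name)
    return changed


# Made-up example data.
library = {
    "IMG_0001.jpg": {"Birds", "Stock"},
    "IMG_0002.jpg": {"Winged beasts"},
    "IMG_0003.jpg": {"Landscape", "Stock"},
}
changed = merge_tags(library, ["Birds", "Winged beasts"], "Bird")
print(changed)
```

The point of returning the changed list is that with in-file metadata every renamed tag means touching each affected file, whereas a central database would change one row.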

How do you organize your photos? How do you find photos when you need them later?

It’s honestly driving me a little nuts right now. It’s no coincidence that I just finished reading The DAM Book, Second Edition (Krogh). I’ll have a full review (and giveaway) soon, but in short: this book is incredible—get it.

John Watson

John is the original founder of Photodoto, but after running it for 4 years he had to focus on different things. If you’re interested in what John has been up to recently, you can check his personal blog or browse his photo blog.


From your post, I gather that even though you’ve referenced Lightroom you haven’t necessarily decided to use it yet? My response is pretty Lightroom-centric…

Taxonomy: I struggle with this as well. I tend to use what’s most obvious to me, (birds, not winged beasts) and I use the plural form whenever there’s a question. Lightroom does have the option of assigning aliases to tags, so you can make bird an alias for birds, but this doesn’t seem to affect the tagging process. I assume it applies to search instead.

As for your grandkids, if they prefer “winged beasts” they can always rename the tag.

Hierarchy. Tags do have hierarchy. I have a “people” tag, with a “friends” tag inside it, with a “joe” tag inside that. I also have an “animals” tag, with a “birds” tag inside it, with a “ducks” tag inside that. Tagging something with “ducks” means it’ll show up in a search for animals.

Metadata Storage Location: Ideally, both. Which is how Lightroom works by default with DNG and JPG files, I believe.

Whether you use Lightroom or not, your concerns can and should be addressed by the tool you use. Make your decision based on how well the available tools solve your problems. (Lightroom seems to do this stuff quite, quite well.)

You’re right, I’m not using Lightroom. I’d thought I’d use IPTC for tagging since, although it’s older, it seems to have the widest support among tools. I can always convert my IPTC keywords to XMP later. But maybe I should choose a tool that uses XMP from the start (like Lightroom) since XMP is “the future” and it has some nice features (like nested tags). Thanks for your advice.

Stew

I’m with Rick. Lightroom has solid features for things like hierarchies, aliases and prompting you with the current tag values (so you don’t have to remember if you previously said “bird” or “Birds”).

Since it also uses XMP by default and you believe that’s the future, why not be forward, rather than backward-compatible? It’s also supported by Adobe Bridge, so you’re not locked entirely into Lightroom.

Finally, Lightroom can show you just the photos that are un-tagged, which helps you find what you need. Unfortunately, as soon as you add the first tag, those shots will disappear from view (since they’re now tagged), which can be a pain at times.

If that’s a problem, I’d filter for untagged, select a mess of them and press “B” to put them in the Quick Collection. Then you can jump to the QC (press Ctrl-B) and you don’t have to use the “not tagged” filter any more. Tag a mess of them and when you’re happy with them, select and press “B” and they’re cleaned out of the QC. Keep working until the QC is empty.

Unfortunately, no (I use Ubuntu as well and we’ve got a Macbook in the house). Picasa is one of the best choices right now. I have done some research on this exact question though… I’ll write about it as soon as I can.

Katie

Use singular tags. My reasoning is that you’re tagging a picture’s subject, not the quantity of subjects. But this is what makes sense to me and might not make sense to others.

For hierarchy, I agree with Rick. “People:Joe” doesn’t seem like a good choice for a tag if you’re going to be searching for pictures of Joe later. Subcategories seem to be the most logical in terms of later searches.

Thanks for weighing in, Katie. I thought it would be a good idea to choose singular as well for the reason you state. But then I run across “difficult” words like “glass.” Does it mean a window? Or jewelry? Or a drinking glass? And what about glasses (for vision correction)? Maybe I’m over-analyzing.

I agree sub-categories as implemented by Lightroom and some other apps are nice, but as I understand it, there is no standard way to define hierarchies within the data structure itself (IPTC and XMP don’t define a standard for it). Keywords are intentionally flat. That means every vendor does it differently, which makes a native Lightroom hierarchy, for example, unportable (or portable with difficulty). Other apps probably can’t see the hierarchy, and if Lightroom is discontinued in the future (think 10-20 years from now) and you need to move your stuff to a new app, you may find that your keywords translate but not the hierarchy.

I use Windows Live Photo Gallery to manage 25,000 photos. One can tag through drag and drop or typing to select. Tags can be moved, renamed, deleted, and can be hierarchical (wildlife/birds/roadrunners). I’ve also struggled with plural vs singular (bird/birds?). All systems of organization have to allow for change over time. FWIW, WLPG and Picasa have separate ‘people tags.’ Eventually, automatic face recognition will play a role (may already in Picasa). Metadata *must* be part of the file so it travels with the file when you send it as an attachment or upload it to a service such as Flickr, which parses that metadata. peace, mjh

Woodie

FWIW– I use Photo Mechanic for all the upfront work; i.e., IPTC entry. This is saved into the NEF image file so that it stays with the image. The Library of Congress Thesaurus for Graphic Materials II (TGMII) is my guide for taxonomy. A recognized “standard” and I don’t need to reinvent the wheel. I switched to IDimager for cataloging from IMatch. I tried many other applications but these work well for me since they write to raw files. I found many applications did not allow a “not” search only “and” and “or.” Why do you need a not? Photos with more than one subject will have more than one keyword. Dog and cat for example. You only want pictures of the dog, no cat, no kids, no ferret (ok, you don’t have a ferret.) So the search needs to be able to exclude keywords as well as include.
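The "not" search described above comes down to simple set logic. This is a minimal sketch with made-up file names and a hypothetical `search` function, not how Photo Mechanic, IDimager, or IMatch implement it:

```python
def search(library, include=(), exclude=()):
    """Return photos that carry every tag in include and none in exclude."""
    want, avoid = set(include), set(exclude)
    return sorted(
        name
        for name, tags in library.items()
        if want <= tags and not (avoid & tags)
    )


# Made-up example data: file name -> set of keywords.
library = {
    "dog_alone.jpg": {"dog"},
    "dog_and_cat.jpg": {"dog", "cat"},
    "dog_and_kids.jpg": {"dog", "kids"},
    "cat_alone.jpg": {"cat"},
}

# "Only the dog, no cat, no kids, no ferret."
print(search(library, include=["dog"], exclude=["cat", "kids", "ferret"]))
```

With only "and" and "or" available, there is no way to drop `dog_and_cat.jpg` from a search for `dog`, which is why the exclusion set matters.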

Good luck. I’m up to about 13,000 images and still have a LONG way to go to catch up.

John, I’m in a similar state as you (although with not quite so many files). I’ve been really hoping to find a free/open tool to do this job, but it seems like most everybody has given in and joined the Lightroom cult. I don’t know if I can stand to buy a piece of software that costs as much as my last camera body.
I’ll be eagerly awaiting follow-ups here.

This post came at such a great time for me. I’m another one in a similar position, and I’ve just started selling some photos. After starting in stock I realised I needed to get my keywords under control. I’ve been using Lightroom, but my keywords were a mess. I do use hierarchical keywording to make my life easier. I’ve just finished changing all my messy keywords into something more structured and now I’m on to adding / verifying keywords. I really like your idea of 1h per day and the way you’re keeping track on the side of your blog post. I think I’ll take your advice and maybe sometime in the next few months the task will get done! Best of luck to you on your organising!

I use Adobe’s Bridge (CS3) for tagging, viewing and sorting images. It isn’t perfect but works for day-to-day work flow.

Bridge is essentially a glorified folder viewer. You can view EXIF data, move images, view and assign tags, and preview images. The hierarchical structure of tags works quite well. I really like the ability to group (stack) images. This way, all of my working files for a particular image show up as just one file within Bridge.

The big downside for me is searching across multiple folders. Yes, Bridge does this (via search) but I find it a bit cumbersome. Perhaps the CS4 version has improved upon this.

While I like Bridge for day-to-day use, I am looking for something a little better for the ‘across the folders’ search. Bridge CS4 is one possibility. Another to check out is Expression Media (Microsoft bought the app a few years ago).

Using Ubuntu allows you to use Digikam, although it is a “KDE” program. It is, I believe, a really good program.

banj

Funny how everyone seems to come to the same ‘problem’ at the same time- I’m in pretty much the same situation, I’ve just started assembling and organizing all my photos from many different sources, and I’m beginning to see that finding and importing them into Aperture was by far the easiest part of the project!!

Tagging my (30 000) photos is going to be a totally mammoth task, and Aperture doesn’t exactly streamline the process – I’d kill for a program that simply displays each photo one after another and lets you click on applicable tags and/or add new ones.

But the architecture design of the photo library is what really (actually) keeps me awake at night… I think in terms of hierarchy, so the flat nature of keywords is a little disconcerting for me. If I create a tag for one of my friends and nest it in the ‘friends’ tag, then ideally every keyword leading to ‘friends’ would also be applied to my picture. I really don’t want to have to define all those paths by hand for each and every photo…

Woodie

I’m not sure why you would want to tag 30,000 images one at a time but everyone has a different workflow. Anyway, in an earlier post I mentioned apps I use for cataloging but not how I use them. I use a hierarchical structure that you mentioned. Here’s a short example. I have 500 photos shot at a polo match. When I ingest from the card(s) (I use Photo Mechanic [PM] for speed) I add all the common info: location, photographer, copyright, etc. I also add the common hierarchical keywords that apply to all or most images: animals.horses.polo ponies, sports.polo, etc. and rename. Once copied, all 500 now have some IPTC info. (PM will ingest to two locations so I copy to an internal and external drive at the same time, giving me three copies of the images at this point.) Next I cull, rate and label images. Next, I check (a Photo Mechanic function) the images that need no additional keywording and view only those that need additional keywords. Maybe names of individuals, or something else unique to the photo or groups. I select all images that need the same additional keywords and add them. I repeat until all photos have appropriate keywords.

I then rename again. Why? I use a sequence number as part of my filename. I’m compulsive or something like that and I don’t like missing sequence numbers. If a number is missing, I don’t know if it should be missing or it was deleted by mistake. Once this is completed I synchronize with the external hard drive so I now have the culled, keyworded, rated, labeled files two places and the originals still on the card.

I now read the files into IDimager for cataloging. I could use IDimager for all the upfront work but PM just works better for me (FAST). IDimager recognizes the keyword structure animals.horses.polo ponies as hierarchical and creates (if it’s not already there) the catalog entry. If I search animals, I get all animals. If I search horses I get all the horses. A polo ponies search gets only those images. I can also search for horses not polo ponies and get all other horses.
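The search behavior described above can be sketched in a few lines. This is an illustration of the idea only, assuming dot-separated keyword paths as in the polo example. The file names and the `expand`, `build_index`, and `find` functions are hypothetical, and this says nothing about how IDimager works internally.

```python
def expand(path):
    """Split 'animals.horses.polo ponies' into its individual levels."""
    return [level.strip() for level in path.split(".") if level.strip()]


def build_index(library):
    """Map each photo to the set of every level found in its keyword paths."""
    return {
        name: {level for path in paths for level in expand(path)}
        for name, paths in library.items()
    }


def find(index, term, not_term=None):
    """Return photos tagged with term at any level, optionally excluding not_term."""
    return sorted(
        name
        for name, levels in index.items()
        if term in levels and (not_term is None or not_term not in levels)
    )


# Made-up example data: file name -> list of hierarchical keywords.
library = {
    "polo_001.nef": ["animals.horses.polo ponies", "sports.polo"],
    "farm_014.nef": ["animals.horses"],
    "pond_007.nef": ["animals.birds.ducks"],
}
index = build_index(library)

print(find(index, "animals"))                         # every animal photo
print(find(index, "horses"))                          # all horses
print(find(index, "polo ponies"))                     # polo ponies only
print(find(index, "horses", not_term="polo ponies"))  # horses that are not polo ponies
```

Because each level is stored as its own keyword, the photo is tagged once with the full path and the parent levels come along for free.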

Most catalog apps have trials available so I encourage you to try a lot of them until you find the one(s) that work the way you do. The one thing that I cannot stress enough, no matter what apps you use or what your workflow is: be CONSISTENT with your keywords. Automobile, automobiles, auto, autos, car, and cars are all DIFFERENT to the app. Bob and Robert aren’t the same. Lastname, firstname is not the same as firstname, lastname. Did I say it already: be CONSISTENT.

A final word and I’m out of here. If you plan to submit to a stock agency, do yourself a favor and use the agency’s keyword structure. Sounds obvious but it’s sometimes overlooked.

My sorting workflow is based on Adobe Bridge:
1. I find that it’s better to have a root-keyword with sub-keys like (year[root]:1999, 2000, 2001,…,[keys]), (Photo Type: People, Automotive, Nature,…,), (Month: January, February,…,), (Location#1: Country), (Location#2: City) etc.
2. Adobe Bridge has a better keyword system than LR because it has some kind of keyword-tree available and you can search your entire archive with just a right-click.
3. Do all your keywording in Bridge after each Camera-photo-download.