AI for Art Direction

Neural networks and other emerging tech are changing the role of the art director.

For decades, art directors, photographers, and designers in agencies everywhere have spent a large chunk of their time searching.

Searching for images to use as comps, styling references, and moodboard visuals. Quite often, the execution of a good idea relies on finding just the right shot.

On that note, in March this year (2017) Flickr released its Similarity Search Pivot tool. Clunky name, but sweet tech. It’s a fancy way of saying they’re using digital smarts to make finding the right picture a lot easier.

The technology seems non-trivial at first glance; look closer and it’s insane. Similarity Search employs neural networks and deep learning to surface images that are similar to the one you’re looking at, based on qualities like tone, composition, color, texture, and even aesthetic quality. This isn’t a simple text or keyword search. This is A.I. software letting you find images based on a purely visual characteristic, something that is often hard to articulate.

The technology itself is tricky to put into words, too. So here’s an example. Thanks to this recent innovation, Flickr now knows that this image:

…is similar to this image:

That’s insane. It’s something that’d be extremely difficult for a human to do manually. You’d have to trawl through millions of images, scanning for a certain texture or other hard-to-define quality. Just searching for “sneaker” or “black and white” wouldn’t get you anything helpful either. Keywords don’t cut it, unless every image has previously been meticulously tagged with a hundred eloquent descriptive terms. By definition, pictures are visual (“woah, woah, slow down egghead”). But searching for them has always required words, which means translating an image into an often-clunky prose description that can in turn be used to surface that image in a search. Similarity Search does away with that bothersome human middleman and the words-to-pictures conversion, and just lets the software use imagery to find imagery. This kind of thing could help creative types come up with fresh visual ideas, or even generate new and interesting associations they hadn’t considered.
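Flickr hasn’t published its internals, but the usual recipe for this kind of visual search is to run each image through a pretrained convolutional network, keep the activations of a late layer as a feature vector (an “embedding”), and rank the whole library by cosine similarity to the query. Here’s a minimal sketch of just the ranking step, assuming the embeddings have already been computed (the function names are mine, not Flickr’s):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two feature vectors (1.0 = identical direction)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def most_similar(query, library):
    """Indices of library vectors, ranked from most to least similar to the query."""
    scores = [cosine_similarity(query, vec) for vec in library]
    return sorted(range(len(library)), key=lambda i: -scores[i])
```

In a real system the library holds millions of vectors, so the linear scan above would be swapped for an approximate-nearest-neighbor index; the principle is the same.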

Chris Rodley on Twitter first alerted me to this tool. He used the trick to find images similar to The Last Supper:

Rodley is the guy who recently used deep learning to cross an image of a dinosaur with a book about vintage flowers, and the result was weird and beautiful:

But this is even more impressive. I can’t imagine it’s too tricky for a piece of software to determine the predominant RGB color of a JPEG from pixel data. However, being able to evaluate imagery for things like tone or texture must be a little more abstract. The former tells you what an image looks like; the latter starts to get at what an image feels like.
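That first task really is straightforward. Assuming you’ve already decoded the image into (r, g, b) tuples, the predominant color is just a frequency count; this is a naive sketch, and real tools usually quantize or cluster the colors first so near-identical shades pool together:

```python
from collections import Counter

def dominant_color(pixels):
    """Most frequent exact (r, g, b) value in a flat list of pixels."""
    return Counter(pixels).most_common(1)[0][0]

def average_color(pixels):
    """Mean color, an alternative one-number summary of an image."""
    n = len(pixels)
    return tuple(sum(channel) // n for channel in zip(*pixels))
```

Summarizing tone or texture, by contrast, has no two-line formula; that’s exactly where the learned models earn their keep.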

Another interesting development in this area is Google’s Creatism effort. The company trained software to scrape through terabytes of the panoramic landscape photography its Streetview platform has gathered, pick out the most aesthetically pleasing shots, and then crop and color them according to current visual trends: aspect ratios, film-emulation filters, gradients and so on. The result is that Google Streetview is taking pictures composed as pleasingly as most tourists’ shots, or more so, turning this:

…into this:

Beautiful, no? What’s not as pretty are recent efforts to use deep learning in the field of logomark creation. The MarkMaker online tool lets you (or, more accurately, the robot under the hood) algorithmically create a logo. It uses a combination of public-domain Google Fonts and stock vector imagery from The Noun Project. The technical effort is impressive, but the executions look like they came from the graphic design intern you just fired:

With all of these AI platforms/generative design algorithms/terrifying-threats-to-human-labor, it helps that the companies involved have huge datasets on hand (the exception being MarkMaker, above, which trains its algorithm only on user feedback about the logos created so far). It’s hard to train up an automaton in what makes an image look nice without thousands of nice images that flesh-and-blood photographers (or driverless cars, in the case of Streetview) have already taken. But the next step — computers generating their own images and their own standards for what’s artistically valid — probably ain’t far off.
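Google hasn’t released Creatism’s code, but the simplest step in its pipeline, cropping a panorama to a fashionable aspect ratio, is easy to sketch. The approach and names below are mine, not Google’s: find the largest centered box with the target width-to-height ratio.

```python
def crop_to_ratio(width, height, target_ratio):
    """Largest centered crop box (left, top, right, bottom) with width/height == target_ratio."""
    if width / height > target_ratio:
        # Image is too wide: keep full height, trim the sides.
        new_w = int(height * target_ratio)
        left = (width - new_w) // 2
        return (left, 0, left + new_w, height)
    # Image is too tall (or already matches): keep full width, trim top and bottom.
    new_h = int(width / target_ratio)
    top = (height - new_h) // 2
    return (0, top, width, top + new_h)
```

The hard parts of Creatism — scoring aesthetics and choosing where (not just how much) to crop — are exactly what the learned model handles.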

Below is another example of Flickr’s tech. I tried the technique on an image of some spacemen a tourist had taken at NASA (top row, second from right, below). What came back was an array of similarly beautiful images: underexposed, with dark shadows, softly highlighted subjects, and coolish, muted colors. These are image qualities that only someone (or something) with an eye for aesthetics would be able to pinpoint or convey in a textual search.