Ars talks to Yahoo about making photos easier to find, share, and communicate with.

SAN FRANCISCO—“We found that people were searching for squirrels just to favorite them, just to click 'like.' And the same with buses."

That's what David Ayman Shamma, senior research manager at Yahoo Labs, told a small group of journalists at the company's local headquarters on Friday. It's a bit of trivia that pretty accurately reflects our obsession with images in the digital age.

But given our apparent love for pictures, searching for the right photos online remains inconvenient. While text search is effective enough that we tend to take it for granted, image search has traditionally been a more difficult problem. Searching the image itself requires hefty computer vision resources, and searching the metadata of a photo is not always effective.

That's why Shamma and his colleagues are taking aim at the issue, with Yahoo Labs leveraging its photo-sharing property Flickr to make image search better.

A social call

In particular, Yahoo has two immediate goals for improving its image search capabilities: first, to use its wealth of Flickr photos to drive engagement on the photo-sharing platform; second, to use photos in products like Yahoo Weather and on e-commerce sites. Better search helps the people searching for "cactus" find more cacti and fewer dogs named Cactus. Making Yahoo's Weather product more visually appealing posed a similar problem: Yahoo wanted to present local images of weather patterns when a user searched for the weather in a particular place, but to do that the company had to surface hundreds of thousands of high-quality weather photos taken at locations around the world.

Part of this problem was initially solved by the community: users voluntarily added photos to Flickr's Project Weather group and real, human editors went through the photos and selected the best and most interesting ones. But at a certain point the editors needed to search outside Project Weather, and they wanted to find the weather photos most interesting to the Flickr community out of the 10 billion-or-so images uploaded to the platform.

Instead of turning immediately to computer vision or geo-location, the people at Yahoo Labs posited that people who like weather images on Flickr tend to view more of them. They looked at which users liked which photos and then drew implied connections from users who clicked the same photos without the users being socially connected. Flickr grouped those people with implied connections using what researchers have called “Clique Percolation,” allowing Yahoo to find the photos favorited by the resulting groups of unrelated users with similar interests.
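Flickr's production system is not public, but the grouping idea can be sketched in a few lines of Python: link any two users who favorited enough of the same photos, find small cliques in that implicit graph, then merge cliques that overlap heavily (the "percolation" step). Everything below (the `favs` data, the thresholds, `k`) is illustrative, not Yahoo's actual code.

```python
from itertools import combinations

def cofavorite_graph(favorites, min_shared=2):
    """Link two users when they favorited at least `min_shared` of the
    same photos, even if they are not socially connected."""
    edges = set()
    for u, v in combinations(favorites, 2):
        if len(favorites[u] & favorites[v]) >= min_shared:
            edges.add(frozenset((u, v)))
    return edges

def clique_percolation(favorites, k=3, min_shared=2):
    """Group users by k-clique percolation: find every k-clique in the
    co-favorite graph, then merge cliques that share k-1 members."""
    edges = cofavorite_graph(favorites, min_shared)
    users = list(favorites)
    cliques = [frozenset(c) for c in combinations(users, k)
               if all(frozenset(p) in edges for p in combinations(c, 2))]
    groups = []  # each group is a set of overlapping cliques
    for c in cliques:
        merged = [g for g in groups if any(len(c & q) >= k - 1 for q in g)]
        for g in merged:
            groups.remove(g)
        new = {c}
        for g in merged:
            new |= g
        groups.append(new)
    # Flatten each group of cliques into one community of users
    return [frozenset().union(*g) for g in groups]

# Hypothetical favorite data: three users who click the same storm photos
favs = {
    "ana": {"storm1", "storm2", "sunset"},
    "bo":  {"storm1", "storm2", "cat"},
    "cai": {"storm1", "storm2", "sunset"},
    "dee": {"dog1", "dog2"},
}
communities = clique_percolation(favs, k=3, min_shared=2)
```

Here `communities` comes out as a single group of the three storm-clickers, and the photos favorited by that group are the candidate weather images; "dee" never co-favorites anything and is left out.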

That method surfaced over six million “reasonably high-quality” images of weather on Flickr, Shamma said, at which point computer vision was used to eliminate photos with faces in them and other undesirable image artifacts. Editors at Yahoo contacted the owners of the photos and asked if they would agree to having their photos displayed by Yahoo Weather. Next, Yahoo used a method known as “Deep Convolutional Neural Networks” to assign each photo a “daytime/nighttime” classification as well as a designation for various weather patterns: sunny, rainy, cloudy, snowy, and so on. (Shamma noted that a photo's metadata showing time and location was often not enough to determine night or day correctly because of inaccuracies caused by individual devices.)

A team led by Pierre Garrigues, a senior research engineer for Flickr, publicly demonstrated the "Deep Convolutional Neural Networks" method in October of this year with a tool called "Flickr PARK or BIRD," which automatically checks whether a photo was taken in a park or contains a bird. The project was inspired by an XKCD comic poking fun at how lopsided image identification is: it's easy to find out if an image was taken in a National Park (just check the location data), but it's much harder to figure out if the image has a bird in it. Flickr managed that feat by training an algorithm, feeding it "millions of images" of birds and then asking it to analyze each image in layers, from the most basic features to simple shapes to bird heads and wings.
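The "layers" described above are stacks of convolution filters whose weights are learned from training images. As a toy illustration of the basic operation (not Flickr's model), here is a single hand-written convolution filter that responds to vertical edges, the kind of low-level feature the first layer of such a network typically learns before deeper layers combine edges into shapes and, eventually, bird parts:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution: slide the kernel over the image and take
    dot products -- the basic operation every CNN layer repeats."""
    kh, kw = kernel.shape
    h, w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Nonlinearity applied between layers: keep positive responses."""
    return np.maximum(x, 0)

# A hand-written vertical-edge filter; in a trained network these
# weights would be learned from millions of labeled images.
vertical_edge = np.array([[1., 0., -1.],
                          [1., 0., -1.],
                          [1., 0., -1.]])

# A tiny 6x6 "image": bright on the left, dark on the right
img = np.tile([1., 1., 1., 0., 0., 0.], (6, 1))

features = relu(conv2d(img, vertical_edge))
# The feature map lights up only along the bright-to-dark boundary.
```

A real network stacks dozens of such layers with thousands of learned filters each; the point of the sketch is only that each layer is made of this same multiply-and-sum primitive.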

Find me a blouse with cats on it

Yahoo's Jia Li, a senior research scientist who specializes in visual computing, talked about another initiative that the company has tested on its Taiwan e-commerce site: letting users search for other products based on images of the first product. The example in this case was blouses. A user types “blouses” into Yahoo's search engine and is presented with a number of different blouses from different clothiers with different patterns. If you see a blouse you like but don't know how to describe it in order to find more of that kind of shirt, Yahoo presents a button you can click to find images of shirts that are similar to the blouse you liked based on a computer analysis of the patterns and colors in it.
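Yahoo's system compares learned deep features, which aren't public. A crude stand-in that shows the same search-by-image idea is to reduce each product photo to a color histogram and rank the catalog by cosine similarity to the query image; all names and data below are made up for illustration.

```python
import numpy as np

def color_histogram(image, bins=8):
    """Reduce an image (H x W x 3, values 0-255) to a normalized
    per-channel color histogram -- a crude stand-in for the learned
    features a deep network would extract."""
    hist = np.concatenate([
        np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
        for c in range(3)
    ]).astype(float)
    return hist / hist.sum()

def similarity(a, b):
    """Cosine similarity between two feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical catalog: random pixel arrays standing in for blouse photos
rng = np.random.default_rng(0)
red_floral = rng.integers(0, 256, (64, 64, 3))
red_floral[..., 0] |= 128    # bias the red channel bright
red_striped = rng.integers(0, 256, (64, 64, 3))
red_striped[..., 0] |= 128   # also red-dominated
blue_plain = rng.integers(0, 256, (64, 64, 3))
blue_plain[..., 2] |= 128    # blue-dominated

# "Find more like this": rank the catalog against the clicked blouse
query = color_histogram(red_floral)
scores = {name: similarity(query, color_histogram(img))
          for name, img in [("red_striped", red_striped),
                            ("blue_plain", blue_plain)]}
```

The red-dominated blouse ranks above the blue one regardless of what either is called in any product listing, which is the point Li makes: the query is an image, not a set of terms.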

“I’m searching not with a set of terms, I’m searching with an image,” Li told the group. “The brand of the blouse could be very different and the language description could be very different for this blouse. The way we could describe the visual similarity is by using an advanced visual detection system… and advanced machine learning technique like deep learning.”

Of course, deep learning techniques often require training, and humans are still best at weeding out false positives returned by computers. So for refining Flickr's search at least, human input is still necessary. “By combining computer vision and humans in the loop we can reinforce learning and…people can have a better experience as well,” Li said.

A personal product

Yahoo's efforts to make photo search better have a simple mantra: “more relevant photos for users, not just the most popular photos,” as Li put it. To that end, Flickr tries to improve general search while also improving search relevance within a person's likely-massive online photo album.

Shamma noted that batch upload and the gigabytes and terabytes of storage offered to customers at relatively cheap prices have changed how we photograph things. Accordingly, the storage and recall of photographs have to adapt to fit the morphing definition of photography. “The practice of photography is changing very quickly, using photos for communication has been growing,” Shamma said.

Garrigues added that Yahoo is “developing technologies to help people handle their increasingly large photo collections; image recognition is not enough in itself to handle the increasingly large corpus of photography.”

I tried to stump the Park or Bird tool and I failed. That is indeed a bird.

The company says it is relying on metadata, social interactions (like tracking favorites for Yahoo Weather), multimedia signals, geodata, and other social media inputs to make photos more searchable beyond just applying image recognition technology to each photo in order to classify it.

Still, image recognition could become a part of Flickr's future search. Currently, the company is working on adding what it calls “auto-tags” to photos uploaded to the platform. These auto-tags are assigned by computer vision algorithms that can “recognize over 1,000 different concepts”—like shoes, cats, and so forth—and tag the photo appropriately. The tags are invisible to the end user, however, and Garrigues noted that users can't edit or change them for now. That may change soon. “At the moment they are indexed into our search engine, we have ongoing research on how people can interact with the suggested tags.”
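Details of Flickr's auto-tagger aren't public, but the final step described above amounts to thresholding classifier confidences over the recognized concepts. A minimal sketch, with made-up concept names and an assumed confidence threshold:

```python
def auto_tag(concept_scores, threshold=0.8, max_tags=5):
    """Keep only concepts the classifier is confident about, best first.
    `concept_scores` maps concept name -> classifier confidence in [0, 1].
    The threshold, cap, and concept names are illustrative, not Flickr's."""
    confident = sorted(
        ((score, concept) for concept, score in concept_scores.items()
         if score >= threshold),
        reverse=True)
    return [concept for _, concept in confident[:max_tags]]

# Hypothetical classifier output for one uploaded photo
example_scores = {"cat": 0.97, "shoe": 0.12, "sofa": 0.85, "dog": 0.40}
tags = auto_tag(example_scores)  # -> ["cat", "sofa"]
```

In such a design the threshold is the lever Garrigues's bias concern points at: set it too low and wrong tags get silently indexed into search, which is exactly why hidden, uneditable tags are delicate.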

The tags don't affect whether a photo is public or private, so once-private photos will not become publicly searchable. “Privacy is the most important concern here at the moment,” Garrigues told the journalists.

Still, having a hidden label on a photo could present some ethical problems, if not practical ones. “We have to think about are we going to bias this in a certain way? We have to be very careful,” Garrigues admitted. He did not go into detail about how Flickr might prevent the wrong tags from being applied to photos.

The kind of image search being demoed at Yahoo—combining human editorial power, social network data, and computer-based image search—is a first step toward decluttering a world increasingly dominated by images. As Garrigues said, "So far there's something that’s missing, you hear a lot about it in the press but [deep learning] hasn’t changed [people's] lives yet. We think there is a missing link there between having this technology and bringing it to people."