Posted
by
samzenpus
on Monday August 18, 2014 @02:49PM
from the best-representation dept.

Zothecula writes: If you're trying to find out what the common features of tabby cats are, a Google image search will likely yield more results than you'd ever have the time or inclination to look over. New software created at the University of California, Berkeley, however, is designed to make such quests considerably easier. Known as AverageExplorer, it searches out thousands of images of a given subject, then amalgamates them into one composite "average" image.
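For the curious, the naive per-pixel version of that averaging step is only a few lines of code. This is a toy sketch with made-up 2x2 grayscale "images" -- the real tool aligns features before averaging, which is exactly what keeps its results from being a blur:

```python
# Naive per-pixel averaging -- images are lists of rows of grayscale
# values, all assumed to be the same size.
def average_image(images):
    h, w = len(images[0]), len(images[0][0])
    avg = [[0.0] * w for _ in range(h)]
    for img in images:
        for y in range(h):
            for x in range(w):
                avg[y][x] += img[y][x] / len(images)
    return avg

# Two tiny 2x2 "images"
a = [[0, 100], [200, 50]]
b = [[100, 100], [0, 150]]
print(average_image([a, b]))  # [[50.0, 100.0], [100.0, 100.0]]
```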

I was also going to point that out. Any graphics program can blur an image with very similar results.

I could see a benefit to this for pattern recognition, such as determining people's ancestral makeup or what breeds a particular dog is composed of.

The key would be well defined inputs. A large sample of each possible output value would be needed, along with details about a particular value. This would be the training (200 Labradors, 200 Beagles, etc.).
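That training idea, boiled down to a toy: average a labeled sample per breed, then classify a new example by whichever average it's nearest to. The features here are made-up single numbers (say, some size measurement), not real image data:

```python
# Toy nearest-mean classifier: "train" by averaging labeled samples,
# then assign a new example to the label with the closest average.
def train(samples):  # samples: {label: [feature, feature, ...]}
    return {label: sum(xs) / len(xs) for label, xs in samples.items()}

def classify(centroids, x):
    return min(centroids, key=lambda label: abs(centroids[label] - x))

centroids = train({"labrador": [30, 34, 32], "beagle": [18, 20, 22]})
print(classify(centroids, 31))  # labrador
print(classify(centroids, 19))  # beagle
```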

You may have read the article (doubtful), but you didn't watch the video or read the SIGGRAPH paper. They demonstrate a browsing tool that enables you to, for example, find an average nose nearly instantly. You can then filter the thousands or millions of images to find specific cat breeds, poses, situations, or colors in seconds.

The tool is called AverageExplorer, and it allows a user to interactively explore a vast set of image data quickly and efficiently. The one picture you describe was a single click in the explorer.

You did the equivalent of saying "Wow, I can make a black dot on a white canvas. That's not very exciting." when presented a single click with a single tool in Photoshop.

Somehow, I still fail to see the point... I can search for "cat" in Google Images, then if I'm not happy, "siamese cat", and finally "siamese cat jumping", because I'm probably looking for one useful picture, not the blurred mess I'd expect from trying to average what a "jump" looks like. And when people ask what an average face looks like, they mean the average feature size and location, not a mathematical average. I'm trying to think of one single purpose where the results of this "average browser" are what I'm looking for.

Well, I did read the article. I did not immediately watch the video, and now that I have, I'm still not impressed.

The strength of the tool is NOT the averaging of multitudes of shapes, which is what is essentially advertised. Instead, it is in finding images in the set that conform to what the user selects: filtering, not combining.

So, the "average" of blue butterfly wings with this shape is that they are blue and have this shape. You're not AVERAGING, you're FILTERING.

If this software searches out all images of a subject and averages them automatically, that means that there's no human control over which images to use and which to reject. Imagine what would happen if you were to let this program loose to create an average image of Shirley Temple. [wikipedia.org] She started in films at the age of three and lived to the age of 85, so the software would create an "average image" by mixing pictures of her as a small child with ones of her as an elderly woman. Even worse, there's a non-alcoholic cocktail [wikipedia.org] named after her, and pictures of it would almost certainly get included.

best Ima-drink-agen EVER. combines the vivacity of youth, the wisdom of decades, the trials and tribulations, loves and losses of an entire lifetime, with a splash of grenadine. Would definitely drink that in, as it were.

i'd be more afraid of image searches for rick santorum or prince albert.

Not necessarily.
Machine learning algorithms like K-means clustering are designed for exactly this sort of problem. In principle, such an algorithm can figure out that there are two distinct clusters of Shirley Temple images: Shirley Temple the person, and the shirley temple, the bright-red drink.
Of course, depending on *how* you preprocess the image data into something that can be analyzed, it could produce unexpected categories like "Shirley Temple black & white" vs "Shirley Temple mostly red" vs "everything else".
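A toy 1-D k-means over a single made-up "redness" feature shows the idea -- drink photos would cluster near high redness, black-and-white film stills near low:

```python
import random

# Toy 1-D k-means: assign each point to its nearest center, recompute
# centers as group means, repeat. Points are made-up "redness" values.
def kmeans(points, k, iters=20):
    centers = random.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda i: abs(p - centers[i]))].append(p)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return sorted(centers)

random.seed(0)
data = [0.05, 0.1, 0.12, 0.08, 0.85, 0.9, 0.95, 0.88]
print(kmeans(data, 2))  # roughly [0.09, 0.9] -- two clear clusters
```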

But seriously, I've seen the same technique used to discredit a movie of a UFO shot on 8 mm film. If you just watch the movie, you see an elliptical blob flying. Someone scanned the blob from each frame, aligned them, and averaged them. The increased contrast (effectively greater bit depth and signal-to-noise) let you see that the elliptical blob was more of a diagonal prism, and that there were dark features underneath it. Basically it was a Cessna with the sun reflecting off the top of the wing.
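The underlying trick is frame stacking: averaging N aligned frames shrinks uncorrelated noise by roughly 1/sqrt(N), so detail buried in any single frame emerges in the stack. A quick simulation with made-up pixel values (no real footage involved):

```python
import random

# Simulate 400 aligned, noisy measurements of the same "pixel" whose
# true value is 100. A single frame is off by roughly the noise sigma
# (10); the stacked average lands much closer to the true value.
random.seed(1)
true_pixel = 100.0
frames = [true_pixel + random.gauss(0, 10) for _ in range(400)]

stacked = sum(frames) / len(frames)
print(round(abs(frames[0] - true_pixel), 2),  # single-frame error
      round(abs(stacked - true_pixel), 2))    # stacked error
```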

I'd like to see one that constructs a 3D model. Perhaps it could use a genetic algorithm (GA) to breed a 3D model that can best represent the most actual specimens of the target object type.

It may require a lot of computation, however, because you're not just running the genetic algorithm: you also have to rotate all the candidate 3D models and vary the lighting conditions to see which best fits the actual specimen images, PER GA candidate PER specimen. Perhaps a 3D thumbnail version could be used for initial placement estimates, to be fine-tuned with a fuller model.
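The GA part by itself is the cheap bit. Here's a toy sketch where a made-up 3-parameter "model" (standing in for shape parameters) evolves toward a known target; in the real version the fitness function would be the expensive render-and-compare step against the specimen photos:

```python
import random

# Toy genetic algorithm: keep the 10 fittest genomes each generation
# and refill the population with mutated copies of them.
random.seed(2)
TARGET = [0.3, 0.7, 0.5]  # hypothetical ideal shape parameters

def fitness(genome):
    # Negative squared error; a real fitness would render the 3D model
    # under many poses/lights and score it against specimen images.
    return -sum((g - t) ** 2 for g, t in zip(genome, TARGET))

def mutate(genome):
    return [min(1.0, max(0.0, g + random.gauss(0, 0.05))) for g in genome]

pop = [[random.random() for _ in range(3)] for _ in range(30)]
for _ in range(100):
    pop.sort(key=fitness, reverse=True)
    pop = pop[:10] + [mutate(random.choice(pop[:10])) for _ in range(20)]

best = max(pop, key=fitness)
print(best)  # should end up close to TARGET
```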

Then you've got spot and texture variations within specimens. You have to model varying textures, not average them out. But even if it ignores texture & spots to simplify things, a 3D shape-model result would be cool.