Program learns to recognize rough sketches of objects

Someday, computers will beat us at Pictionary.

Researchers at Brown and the Technical University of Berlin have produced a program that can identify simple sketches of objects almost as well as humans. The computer application enables "semantic understanding" of abstract sketches as they are being drawn in real time. The research was presented at the computer graphics conference SIGGRAPH, and the paper is now available online.

Computers can already match sketches to objects provided they are accurate representations—e.g., matching police sketches to actual faces in mug shots. However, more abstract sketches—the more cartoonish drawings that most people can easily produce—present a different challenge.

For example, in order to draw a rabbit, a person might draw a cartoonish creature with big ears, buckteeth and a fluffy cotton tail that another person would recognise despite the fact that it bears little resemblance to an actual rabbit.

James Hays, assistant professor of computer science at Brown, explains: "It might be that we only recognise it as a rabbit because we all grew up that way. Whoever got the ball rolling on caricaturing rabbits like that, that's just how we all draw them now."

Getting a computer to understand this is more challenging. To create the program, Hays and his colleagues Mathias Eitz and Marc Alexa from the Technical University of Berlin had to build a large database of sketches to teach a computer how humans sketch objects.

They started by coming up with a list of everyday objects from an existing computer vision (photographic) dataset called LabelMe. They ended up with a set of 250 object categories. They then used Amazon's Mechanical Turk to hire people to sketch objects from each category—20,000 sketches in total. This data was fed into existing machine learning algorithms to teach the program which sketches belong to which categories.

From there, the team created an interface through which they could input new sketches and the computer could try to identify them in real time, as they were being drawn.

Video: Computer program can identify sketches (Credit: Brown University)

The program is capable of identifying sketches correctly with around 56 percent accuracy, as long as the object falls under one of the 250 categories. This is not bad given that humans were able to identify the sketches with 73 percent accuracy.
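To put those figures in context, the short calculation below compares them with random guessing. It is only a back-of-the-envelope sketch: the chance baseline assumes every one of the 250 categories is equally likely, which the article does not state, and the per-category count is an average derived from the 20,000 total.

```python
# Figures quoted in the article.
NUM_CATEGORIES = 250
TOTAL_SKETCHES = 20_000
PROGRAM_ACCURACY = 0.56
HUMAN_ACCURACY = 0.73

# Average number of sketches collected per category.
sketches_per_category = TOTAL_SKETCHES // NUM_CATEGORIES

# Assumption: categories are balanced, so a random guess is right 1 time in 250.
chance_accuracy = 1 / NUM_CATEGORIES

# Gap between humans and the program, in percentage points.
gap_points = round((HUMAN_ACCURACY - PROGRAM_ACCURACY) * 100)

# How many times better than random guessing the program is.
times_better_than_chance = PROGRAM_ACCURACY / chance_accuracy

print(sketches_per_category, chance_accuracy, gap_points, round(times_better_than_chance))
```

Under that assumption, random guessing would be right 0.4 percent of the time, so the program is roughly 140 times better than chance while trailing humans by 17 percentage points.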

"The gap between human and computational performance is not so big, not as big certainly as it is in other computer vision problems," Hays said.

Hays and team plan to expand the database to include more categories. One way of doing this is by creating a Pictionary-style game that can collect the data that players input—essentially Draw Something. They have created a similar game which is available on iTunes already. Ultimately the program could be used to develop a better sketch-based interface and search applications.

The cartoon rabbit example is a great thing to bring up. It indicates that a kind of cultural vocabulary of visual abstractions exists which would plausibly be visual gibberish to outsiders. That's how caricatures of famous people work so well. (I wonder if they're going to have this program try to identify celebrities by their caricatures next).

This technology is a threat to world peace and must immediately be destroyed. Can you imagine what would happen if the computer identified a drawing as Mohammed?

Isn't it obvious? They would use it to identify all drawings of Mohammed on the internet and send threats to the Western world for every one found. Never mind the false positives an accuracy of 56% would produce; the cause is a just one.

Am I the only ashtray that ashtrayed that ashtray REALLY wants ashtray to ashtray an ashtray? Ashtray-like, ashtray was in almost all the ashtrays, and usually one of the ashtrayest "ashtrays" ashtray made...

The cartoon rabbit example is a great thing to bring up. It indicates that a kind of cultural vocabulary of visual abstractions exists which would plausibly be visual gibberish to outsiders. That's how caricatures of famous people work so well. (I wonder if they're going to have this program try to identify celebrities by their caricatures next).

Or it's possible that those visual abstractions mirror how our perception works, such that someone who is familiar with a rabbit would generally recognize those types of abstractions.

This system is a reminder that computers still aren't really processing symbolically the way we are. While there is an element of statistical training/analysis in how humans learn to interpret these types of drawings, we are also capable of interpreting "rabbit" from a drawing that is very different from any we've seen before. We have an understanding of "rabbitness" that computers are still incapable of.

Other than that, the algorithm does a decent job of telling apart a crab from a spider, a shark from a generic fish, and a number of other impressive feats. It sucks at recognizing drawings of genitalia though...

So, from skimming their paper, they use "bags of features" of edge orientations and train a binary support vector machine (SVM) for each category (cat vs. non-cat drawings). Sketch recognition is a pretty difficult task, but I am still surprised that they only get 56%.
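For anyone curious what "one binary SVM per category" looks like in practice, here is a heavily simplified sketch. It is not the paper's method: the feature is a single global histogram of stroke orientations rather than a true bag of local features with a learned vocabulary, the SVM is a plain linear one trained by subgradient descent on the hinge loss, and the three categories ("rails", "fence", "rain") and all function names are invented for illustration.

```python
import math

def orientation_histogram(strokes, bins=4):
    """Length-weighted histogram of stroke-segment orientations (0 to 180 degrees)."""
    hist = [0.0] * bins
    for stroke in strokes:
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            length = math.hypot(x1 - x0, y1 - y0)
            if length == 0:
                continue
            angle = math.atan2(y1 - y0, x1 - x0) % math.pi  # ignore drawing direction
            hist[int(angle / math.pi * bins) % bins] += length
    total = sum(hist) or 1.0
    return [h / total for h in hist]

def train_binary_svm(samples, labels, epochs=200, lr=0.1, reg=0.01):
    """Linear SVM via subgradient descent on the hinge loss; labels are +1 or -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1 - lr * reg) for wi in w]  # weight decay (regularisation)
            if margin < 1:  # sample is inside the margin: push it out
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def train_one_vs_rest(features, names):
    """One binary classifier per category: that category vs. everything else."""
    models = {}
    for category in sorted(set(names)):
        labels = [1 if name == category else -1 for name in names]
        models[category] = train_binary_svm(features, labels)
    return models

def classify(models, x):
    """Pick the category whose classifier gives the highest score."""
    def score(model):
        w, b = model
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(models, key=lambda category: score(models[category]))

# Invented toy data: a sketch is a list of strokes, a stroke is a list of (x, y) points.
training = [
    ("rails", [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]),  # horizontal strokes
    ("fence", [[(0, 0), (0, 10)], [(3, 0), (3, 10)]]),  # vertical strokes
    ("rain", [[(0, 0), (2, 4)], [(5, 0), (7, 4)]]),     # slanted strokes
]
names = [name for name, _ in training]
features = [orientation_histogram(sketch) for _, sketch in training]
models = train_one_vs_rest(features, names)

new_sketch = [[(1, 5), (8, 5)]]  # a single horizontal stroke
print(classify(models, orientation_histogram(new_sketch)))
```

With 250 categories the same scheme trains 250 such classifiers, which is one reason a handful of visually similar categories can drag the overall number down.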

I would've liked to see a confusion matrix for the 56%-accurate SVM - I bet that many of the categories are really difficult for the SVM to learn (bush, bottle opener) while others are easy (suv, armchair, parachute). In fact, given the human vs. SVM confusion matrix they provide, for the tasks where the SVM performed worse than humans (red - bush, bottle opener, pigeon) - maybe they needed to have more data in those categories, since it seems like those objects can appear variable (unlike a parachute).
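For readers who haven't met the term, a confusion matrix counts how often each true category was labelled as each predicted category, so hard categories show up as rows with lots of off-diagonal mass. A minimal sketch follows; the labels and predictions are made up purely to show the mechanics and are not results from the paper.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels):
    """Rows are true categories, columns are predicted categories."""
    counts = Counter(zip(true_labels, predicted_labels))
    categories = sorted(set(true_labels) | set(predicted_labels))
    matrix = [[counts[(t, p)] for p in categories] for t in categories]
    return categories, matrix

def per_category_accuracy(categories, matrix):
    """Fraction of each true category that was labelled correctly."""
    return {
        category: row[i] / sum(row)
        for i, (category, row) in enumerate(zip(categories, matrix))
        if sum(row) > 0
    }

# Invented example data, four sketches per category.
true_labels = ["bush"] * 4 + ["parachute"] * 4 + ["pigeon"] * 4
predicted_labels = [
    "bush", "pigeon", "pigeon", "parachute",             # bushes are often confused
    "parachute", "parachute", "parachute", "parachute",  # parachutes are easy
    "pigeon", "bush", "pigeon", "bush",                  # pigeons are hit and miss
]

categories, matrix = confusion_matrix(true_labels, predicted_labels)
for category, row in zip(categories, matrix):
    print(category, row)
print(per_category_accuracy(categories, matrix))
```

A single overall accuracy figure hides exactly this kind of spread, which is why the per-category view would be informative.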

In any case, don't worry. This is an interesting paper, but all this research means is that the computer can say "oh, this collection of lines looks more like that collection of lines I've seen before, rather than any other collection of lines". The computer doesn't know that a sketch of a rabbit is a sketch of a rabbit. We're a very long time away from such sci-fi abilities.

Haven't read this paper yet, but it was the basis for another paper from the same people, "Sketch Based Shape Retrieval". In essence, you sketch something and a 3D model related to your sketch comes up. It was pretty awesome!

Looks like the program wants multiple small input drawings instead of one continuous drawing (the horse or dog drawing in the video that it missed). It would be easier to identify multiple small shapes or abstracts vs one long continuous drawing.

The algorithm seems to update each time the pen is lifted.
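If that observation is right, the interface logic could be as simple as the sketch below: collect points while the pen is down, and re-run the classifier over everything drawn so far each time the pen is lifted. This is a guess at the behaviour seen in the video, not the authors' code; the class, its method names, and the stand-in classifier `guess_by_stroke_count` are all hypothetical.

```python
class LiveSketchSession:
    """Re-classifies the whole drawing every time a stroke is finished."""

    def __init__(self, classify):
        self.classify = classify  # any function taking a list of strokes
        self.strokes = []         # finished strokes
        self.current = None       # stroke in progress, or None if the pen is up
        self.guesses = []         # one guess per completed stroke

    def pen_down(self, x, y):
        self.current = [(x, y)]

    def pen_move(self, x, y):
        if self.current is not None:
            self.current.append((x, y))

    def pen_up(self):
        if self.current is None:
            return None
        self.strokes.append(self.current)
        self.current = None
        guess = self.classify(self.strokes)  # update only when the pen is lifted
        self.guesses.append(guess)
        return guess

def guess_by_stroke_count(strokes):
    """Hypothetical stand-in for a real classifier."""
    return {1: "line", 2: "cross"}.get(len(strokes), "unknown")

session = LiveSketchSession(guess_by_stroke_count)
session.pen_down(0, 0)
session.pen_move(5, 5)
first_guess = session.pen_up()
session.pen_down(0, 5)
session.pen_move(5, 0)
second_guess = session.pen_up()
print(first_guess, second_guess)
```

A design like this would also explain why one long continuous stroke gets no intermediate feedback: nothing is classified until the pen comes up.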

I would think that the way somebody would draw a rabbit (or most other common drawings) would be identical for most people: start with an outline of the head with floppy ears, then add eyes and teeth. Whereas if you drew the teeth first, then the eyes, then the head, I bet the program would have a harder time. Gonna check it out on iTunes (if it's free).