Macroglossa’s Visual Search Engine fails to meet basic expectations

Macroglossa, an Italian search engine based on image comparison, is launching the alpha version of its “Engine Comparer Technology”, theoretically pushing the limits of search by working solely on visual recognition. But right now, that is true only in theory.

How it is supposed to work: take a picture whose content you can’t quite make out. Upload it to the website http://www.macroglossa.com and launch your search. It should tell you what it is. Simples.

This technology has several advantages. First, it lets users pull results from collections of visual content without relying on tags. Of course, this fits the burgeoning trend of paid search without keywords. Second, the visuals can be crowdsourced. Just imagine the endless possibilities.

For now, the company says it plans to release a faster and more comprehensive version at the end of April (the only category working today is “animals”), with an iPhone app to follow “within the first week of June”.

Sounds cool? Read on.
Yes, visual recognition is currently the ultimate level in robotics, as it makes us think the computer or device has a consciousness, although it clearly does not. Look at ASIMO (for “Advanced Step in Innovative MObility”), the Japanese humanoid robot, for example. ASIMO has spatial, sound, kinetic and (albeit limited) visual recognition capacity, being able to “store” only ten faces. With the help of a little memory boost, a technology like Macroglossa’s would put a whole library of visuals at the robot’s disposal and make it seem almost human, if not for its astronaut-like gait and overall look.

The applications of this image recognition search could be amazing. Beyond spotting and naming the animals, buildings or places you snapped during your last holiday, it has implications for defence, intelligence and security that also raise serious privacy issues. It could spark another round of criticism and scepticism from official authorities, just as Google Buzz and Street View did (Canada, France, Germany, Israel, Italy, Ireland, Netherlands, New Zealand, Spain and the UK have addressed an open letter to Google to that effect).

Time for a test drive.
Curious to see how it works, I first fed the search engine a picture of the fail whale. Entered the captcha words. Launched the search. It failed. The fail whale is a cartoon, so I guess it does not qualify.

Then I tried a picture of a hummingbird. It failed. Again. The error message told me that the format was not jpeg, although it was (.jpg, in fact). More annoyingly, it also told me to “use the browser back button to return to the form” and do another search. Second fail in a row: poor usability.

Finally, I used a picture of a puppy and it did work… after a long wait of just under five minutes. The results were ‘amazing’: I was given a whole list of pictures of animals with no apparent link to my puppy, ranging from birds to bears, tigers, spiders, kangaroos, butterflies and monkeys. Again, a great disappointment.

Here’s a screenshot:

Back in 2009, IBM was already working on its SAPIR (Search in Audio-Visual Content Using Peer-to-peer Information Retrieval) system. This other visual recognition search technology allows users to upload images and match them to similar ones. Unlike Macroglossa, IBM’s system analyzes everything from digital photographs to sound files to video, which it automatically indexes and ranks for retrieval.

More recently, Cortexica, a London-based bio-inspired image recognition business, announced yesterday at GeeknRolla that it has gone live with its Wine Findr iPhone app. Wine Findr is visual search software that enables users to snap a bottle of wine and get price comparisons from different retailers, as well as other information related to the wine.

Based on research on the “human visual cortex”, the system’s algorithms and computer models can accurately mimic human visual recognition. This technology takes the challenge further by integrating image search into mobile devices and by indexing 200 million frames of video per day. Cortexica is notably offering that capability on a commercial basis to media monitoring agencies.

So in the end, although the technology has great potential, Macroglossa itself is a disappointing project with much less to offer than both its early and current competitors: it focuses solely on still imagery, the technology is still not up to par, and the usability is poor.

But you never know, it may be worth following. As the company said, “all future updates and improvements to the engine will be made available on our Twitter Channel: http://www.twitter.com/macroglossa.”

You know where to find them.

This post was written by Liva Judic, who starts writing for SEW next week. Please give her a warm welcome!