State-of-the-Art Computer Vision Technologies

Posts tagged: computer vision

The SIGGRAPH 2009 accepted papers are coming out. A total of 78 papers out of 439 submissions were accepted (acceptance rate: ~18%). Congratulations to the authors. Many researchers cannot wait long to post their work online. Thanks go to Ke-Sen Huang, who maintains a very nice website of accepted papers.

This Sunday (Feb. 01, 2009) is a football fan’s celebration day. It is also a 3D TV fan’s big day. At the end of the second quarter, there will be three 3D commercials on NBC: the animated movie Monsters vs. Aliens by DreamWorks, SoBe LifeWater energy drinks, and the special 3D episode of Chuck. Are you ready for this new experience at home? Use the picture above to make sure you can see the vivid effects with your 3D glasses. If not, you still have time to get a free pair of glasses from many local stores (KMart, BestBuy, etc.).

ZuraVision is a website for embedding pictures or videos into a video, using computer vision technology developed by Stanford students.

One immediate application of ZuraVision is embedding visual advertisements into video. Just as Google AdSense does for text advertisements, ZuraVision may bring you new opportunities to monetize your videos (e.g., on YouTube) in the near future.

The image on the left was cropped from a video into which our “VisionWang” logo was embedded with a realistic perspective.

There were around 10,000 applications from third-party developers for the Apple iPhone by the end of 2008. Among them are a number of camera- and photo-related applications for you to play with. You will be amazed by how computer vision and image processing technologies can deliver astonishing visual effects and fun, even with a 2MP mobile camera. Some of them offer a set of image editing functions similar to Adobe Photoshop, such as image enhancement, image filtering, and image/video editing. Others use more advanced and complex computer vision technologies, such as object detection/recognition and image retrieval. Here we will review some of the top computer vision apps for the iPhone and their core technologies.

GazoPa is another image-based visual search engine (still in beta). It is owned by Hitachi, Japan. Users can search for similar images on the web based on image color, shape, or faces. It is similar to Like.com, but GazoPa is interested in general objects rather than only commercial products. Unlike TinEye, GazoPa does not focus on finding copies of the same image.
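To make the idea of searching "based on image color" concrete, here is a minimal sketch of one classic approach: quantized color histograms compared by histogram intersection. This is not GazoPa's actual method, which the post does not describe. The function names (`color_histogram`, `histogram_intersection`, `rank_by_color`) and the pixel-list image format are assumptions made for illustration.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Quantize (r, g, b) pixels (0-255 each) into a normalized histogram.

    Assumes a non-empty pixel list and that bins_per_channel divides 256.
    """
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {color_bin: n / total for color_bin, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return sum(min(p, h2.get(color_bin, 0.0)) for color_bin, p in h1.items())


def rank_by_color(query_pixels, library):
    """Return library image names sorted from most to least similar in color.

    `library` is a hypothetical dict mapping image name -> list of pixels.
    """
    query_hist = color_histogram(query_pixels)
    scores = {
        name: histogram_intersection(query_hist, color_histogram(pixels))
        for name, pixels in library.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Toy example: a red query should rank a mostly red image above a blue one.
red = [(250, 10, 10)] * 10
reddish = [(240, 20, 5)] * 8 + [(10, 10, 250)] * 2
blue = [(5, 5, 240)] * 10
ranking = rank_by_color(red, {"blue": blue, "reddish": reddish})
```

A color histogram ignores where colors appear in the image, so it captures only the "color" criterion mentioned above; shape or face search would need different features.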

In such a wide open area, the traditional fire and smoke detectors used in buildings usually do not apply, because they require smoke particles to reach the detector before an alarm is raised, resulting in significant latency. It is also not practical to install a huge number of detectors in a large open space.

Recent computer vision technology for video fire detection enables large-area monitoring with an instant alarm system, with the help of volumetric sensors (cameras) and intelligent video analytics software. Most importantly, existing video surveillance infrastructure (cameras and network) can be leveraged for this application.

Like.com is one of the earliest visual search engines. It grew out of another visual search engine called Riya, which was originally used to search for celebrities, objects in pictures, and faces in photos. For the sake of profitability, Like.com has since changed its business model to high-end online shopping based on visual search.

The first thing I would like to talk about is Like.com’s business model. There are always cool computer vision technologies. However, leveraging the technology to make a profit is a different matter. Like.com is undoubtedly a good model for visual technology entrepreneurs. Visual search is high-tech, and online shopping is popular and profitable. The marriage of a high-tech search engine with online shopping will undoubtedly attract venture capital and eventually turn a profit.

From now on, a series of posts will focus on reviewing several of the best visual search engines in the world. These search engines represent the state-of-the-art developments and applications in the field of visual search and retrieval. Each engine will be evaluated based on technology, functionality, search speed and accuracy, and business model.

The first one to be reviewed here is TinEye, owned by idee, a Canada-based company. TinEye provides search-by-image functionality and shows where and how a given image appears all over the web. This is one of the most brilliant applications of computer vision.
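As a rough illustration of how a search-by-image engine can recognize copies of the same picture, here is a sketch of an "average hash" fingerprint compared by Hamming distance. TinEye's actual algorithm is not described in this post, so this is a generic technique swapped in for illustration. The function names and the 2D-list grayscale format are assumptions.

```python
def average_hash(gray, hash_size=8):
    """Fingerprint a grayscale image given as a 2D list of 0-255 values.

    Assumes the image is at least hash_size pixels in each dimension.
    """
    height, width = len(gray), len(gray[0])
    cells = []
    # Shrink the image to a hash_size x hash_size grid by block averaging.
    for i in range(hash_size):
        r0, r1 = i * height // hash_size, (i + 1) * height // hash_size
        for j in range(hash_size):
            c0, c1 = j * width // hash_size, (j + 1) * width // hash_size
            block = [gray[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of differing bits; small values suggest the same image."""
    return bin(hash_a ^ hash_b).count("1")


# Toy example: a 16x16 gradient, a 2x upscaled copy, and an inverted copy.
original = [[r * 16 + c for c in range(16)] for r in range(16)]
upscaled = [[original[r // 2][c // 2] for c in range(32)] for r in range(32)]
inverted = [[255 - v for v in row] for row in original]

same = hamming_distance(average_hash(original), average_hash(upscaled))
different = hamming_distance(average_hash(original), average_hash(inverted))
```

In this toy case the resized copy keeps the same fingerprint while the inverted image differs in most bits. A real engine would also need an index that can look up near-matching fingerprints across billions of images, which this sketch does not attempt.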

Text-based search engines, such as Google, Yahoo, and Microsoft search engines, have become an important tool for obtaining useful information through the Internet. Interestingly, even if you want to find a picture on the web, a text description of the picture has to be typed into the search box. The search engine is then actually doing a search based on text and its context.

Imagine the following scenarios. Your girlfriend/wife comes across a lady carrying a beautiful handbag on 5th Ave in New York City. She quickly takes a photo of that bag using her iPhone. Can she find similar bags instantly through the iPhone? In another scenario, she comes across a stylish Chanel handbag, which is too expensive for you to afford. Can you quickly find a cheaper one of similar style online?

Obviously, the current functionality of major search engines is far from meeting our increasing demand. First of all, “a picture is worth a thousand words”, so it is sometimes difficult to describe what you really want in a concise phrase. Secondly, even if you can describe the scene, many unrelated pictures are usually returned, and you have to sift through them manually, page by page.

Is there an efficient and automatic way to solve this problem? The answer is yes, I believe. Visual search will be the future and the ultimate solution for the scenarios above.