Wednesday, October 30, 2013

The ACM Multimedia Systems conference (http://www.mmsys.org) provides a forum for researchers, engineers, and scientists to present and share their latest research findings in multimedia systems. While research about specific aspects of multimedia systems is regularly published in the various proceedings and transactions of the networking, operating system, real-time system, and database communities, MMSys aims to cut across these domains in the context of multimedia data types. This provides a unique opportunity to view the intersections and interplay of the various approaches and solutions developed across these domains to deal with multimedia data types.

Furthermore, MMSys provides an avenue for communicating research that addresses multimedia systems holistically. As an integral part of the conference since 2012, the Dataset Track provides an opportunity for researchers and practitioners to make their work available (and citable) to the multimedia community. MMSys encourages and recognizes dataset sharing, and seeks contributions in all areas of multimedia (not limited to MM systems). Authors publishing datasets will benefit by increasing the public awareness of their effort in collecting the datasets. In particular, authors of datasets accepted for publication will receive:

Dataset hosting from MMSys for at least 5 years

Citable publication of the dataset description in the proceedings published by ACM

15 minutes of oral presentation time at the MMSys 2014 Dataset Track

All submissions will be peer-reviewed by at least two members of the technical program committee of MMSys 2014. Datasets will be evaluated by the committee on the basis of the collection methodology and the value of the dataset as a resource for the research community.

Submission Guidelines

Authors interested in submitting a dataset should:

1. Make their data available by providing a public URL for download
2. Write a short paper describing
   a. motivation for data collection and intended use of the data set,
   b. the format of the data collected,
   c. the methodology used to collect the dataset, and
   d. basic characterizing statistics from the dataset.

Papers should be at most 6 pages long (in PDF format) prepared in the ACM style and written in English.

The 12th International CBMI Workshop aims at bringing together the various communities involved in all aspects of content-based multimedia indexing, retrieval, browsing and presentation. The scientific program of CBMI 2014 will include invited keynote talks and regular, special and demo sessions with contributed research papers.

We sincerely hope that a carefully crafted program, the scientific discussions that the workshop will hopefully stimulate, and your additional activities in Klagenfurt and its surroundings, most importantly the lovely Lake Wörthersee, will make your CBMI 2014 participation worthwhile and a memorable experience.

The Solr plugin itself is fully functional for Solr 4.4 and the source is available at https://bitbucket.org/dermotte/liresolr. There is a markdown document, README.md, explaining what can be done with the plugin and how to install it. In short, it supports content-based search and content-based re-ranking of text searches, and it brings along a custom field implementation and sub-linear search based on hashing.
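Sub-linear search based on hashing means, in general terms, that a query is compared only against the items that fall into the same hash bucket instead of against the whole index. The following is a minimal Python sketch of that idea using random-hyperplane hashes. It is not the plugin's implementation: the function names, the number of hash bits, and the toy feature vectors are all assumptions made for illustration.

```python
import random


def make_hyperplanes(dim, bits, seed=0):
    """Draw `bits` random hyperplanes (hypothetical helper for this sketch)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def hash_vector(vec, planes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(features, planes):
    """Group document ids by the hash of their feature vector."""
    index = {}
    for doc_id, vec in features.items():
        index.setdefault(hash_vector(vec, planes), []).append(doc_id)
    return index


def search(query, features, index, planes):
    """Rank only the documents that share the query's bucket."""
    candidates = index.get(hash_vector(query, planes), [])

    def squared_distance(doc_id):
        return sum((a - b) ** 2 for a, b in zip(query, features[doc_id]))

    return sorted(candidates, key=squared_distance)


# Toy feature vectors standing in for image descriptors (made up).
features = {
    "img_a": [1.0, 0.2, 0.0],
    "img_b": [-0.9, 0.1, 0.3],
    "img_c": [0.0, -1.0, 0.5],
}
planes = make_hyperplanes(dim=3, bits=4)
index = build_index(features, planes)
results = search(features["img_a"], features, index, planes)
```

The trade-off is recall: a near neighbour that lands in a different bucket is missed, which is why practical systems usually combine several hash tables or probe neighbouring buckets.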

Thursday, October 24, 2013

Essentia 2.0 beta is an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPLv3 license (also available under a proprietary license upon request). It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal and high-level music descriptors. In addition, Essentia can be complemented with Gaia, a C++ library with Python bindings which implements similarity measures and classifications on the results of audio analysis, and generates classification models that Essentia can use to compute high-level descriptions of music (same license terms apply).

Essentia is not a framework, but rather a collection of algorithms (plus some infrastructure for multithreading and low memory usage) wrapped in a library. It doesn’t provide common high-level logic for descriptor computation (so you aren’t locked into a certain way of doing things). Instead, it focuses on the robustness, performance and optimality of the provided algorithms, as well as ease of use. The flow of the analysis is decided and implemented by the user, while Essentia takes care of the implementation details of the algorithms being used. An example extractor is provided, but it should be considered as an example only, not “the” only correct way of doing things.
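The "collection of algorithms" design can be pictured with a small pure-Python sketch in which the caller wires independent algorithms together and decides the analysis flow. This does not use Essentia's API: the functions `frames`, `hann`, `rms` and `zero_crossing_rate`, the frame sizes, and the synthetic test tone are hypothetical stand-ins written for this example.

```python
import math


def frames(signal, frame_size, hop_size):
    """Cut the signal into overlapping frames (hypothetical helper)."""
    for start in range(0, len(signal) - frame_size + 1, hop_size):
        yield signal[start:start + frame_size]


def hann(frame):
    """Apply a Hann window to one frame."""
    n = len(frame)
    return [
        x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]


def rms(frame):
    """Root-mean-square energy of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose sign differs."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


# The analysis flow is chosen by the caller, not by the library:
# here, one second of a synthetic 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]

descriptors = [
    {"rms": rms(hann(frame)), "zcr": zero_crossing_rate(frame)}
    for frame in frames(signal, frame_size=512, hop_size=256)
]
```

Swapping a descriptor in or out only changes the caller's loop, which is the flexibility the paragraph above describes.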

The library is also wrapped in Python and includes a number of predefined executable extractors for the available music descriptors, which facilitates its use for fast prototyping and allows setting up research experiments very rapidly. Furthermore, it includes a Vamp plugin to be used with Sonic Visualiser for visualization purposes. The library is cross-platform and currently supports Linux, Mac OS X, and Windows systems. Essentia is designed with a focus on the robustness of the provided music descriptors and is optimized in terms of the computational cost of the algorithms. The provided functionality, specifically the music descriptors included in-the-box and signal processing algorithms, is easily expandable and allows for both research experiments and development of large-scale industrial applications.

This paper presents a novel Attribute-augmented Semantic Hierarchy (A2SH) and demonstrates its effectiveness in bridging both the semantic and intention gaps in Content-based Image Retrieval (CBIR). A2SH organizes the semantic concepts into multiple semantic levels and augments each concept with a set of related attributes, which describe the multiple facets of the concept and act as the intermediate bridge connecting the concept and low-level visual content. A hierarchical semantic similarity function is learnt to characterize the semantic similarities among images for retrieval. To better capture user search intent, a hybrid feedback mechanism is developed, which collects hybrid feedbacks on attributes and images. These feedbacks are then used to refine the search results based on A2SH. We develop a content-based image retrieval system based on the proposed A2SH. We conduct extensive experiments on a large-scale data set of over one million Web images. Experimental results show that the proposed A2SH can characterize the semantic affinities among images accurately and can shape user search intent precisely and quickly, leading to more accurate search results as compared to state-of-the-art CBIR solutions.

Wednesday, October 23, 2013

This dataset contains the facial expression images captured using the novaemötions game. It contains over 40,000 images, labeled with the challenged expression and the expression recognized by the game algorithm, augmented with labels obtained through crowdsourcing.

Tuesday, October 22, 2013

Golden Retriever Image Retrieval Engine (GRire) is an open source, lightweight Java library developed for Content Based Image Retrieval (CBIR) tasks, employing the Bag of Visual Words (BOVW) model. It provides a complete framework for creating CBIR systems, including image analysis tools, classifiers, weighting schemes etc., for efficient indexing and retrieval procedures. Its most prominent feature is its extensibility, achieved through the open source nature of the library as well as a user-friendly embedded plug-in system. GRire is available online along with install and development documentation at http://www.grire.net and on its Google Code page http://code.google.com/p/grire. It is distributed either as a Java library or as a standalone Java application, both GPL licensed.
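For readers unfamiliar with the Bag of Visual Words model, the following Python sketch shows its core steps: each local descriptor is assigned to its nearest visual word, the assignments are counted into a normalized histogram, and two images are compared through their histograms. It does not reflect GRire's API. The codebook would normally be learned by clustering (for example k-means) over many descriptors; here the codebook and the two-dimensional "descriptors" are hand-picked toy values, and cosine similarity is just one possible comparison measure.

```python
import math


def nearest_word(descriptor, codebook):
    """Index of the visual word closest to the descriptor (squared Euclidean)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(descriptor, codebook[i])),
    )


def bovw_histogram(descriptors, codebook):
    """Normalized count of how often each visual word occurs in an image."""
    hist = [0.0] * len(codebook)
    for descriptor in descriptors:
        hist[nearest_word(descriptor, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def cosine_similarity(h1, h2):
    """Cosine of the angle between two histograms (0.0 if either is empty)."""
    dot = sum(a * b for a, b in zip(h1, h2))
    norm = math.sqrt(sum(a * a for a in h1)) * math.sqrt(sum(b * b for b in h2))
    return dot / norm if norm else 0.0


# Toy codebook of three visual words and three toy "images" (made up).
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
image_a = [[0.1, 0.0], [0.9, 1.1], [0.0, 0.1]]
image_b = [[0.0, 0.2], [1.0, 0.9], [0.1, 0.1]]
image_c = [[0.1, 0.9], [0.0, 1.1], [0.2, 1.0]]

hist_a = bovw_histogram(image_a, codebook)
hist_b = bovw_histogram(image_b, codebook)
hist_c = bovw_histogram(image_c, codebook)

sim_ab = cosine_similarity(hist_a, hist_b)  # same word distribution
sim_ac = cosine_similarity(hist_a, hist_c)  # no words in common
```

Weighting schemes such as the ones mentioned above would typically be applied to the histograms before the comparison step.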

Monday, October 21, 2013

This approach tackles the problem of globally localizing a camera-equipped micro aerial vehicle flying within urban environments for which a Google Street View image database exists. To avoid the caveats of current image-search algorithms in case of severe viewpoint changes between the query and the database images, the authors propose to generate virtual views of the scene, which exploit the air-ground geometry of the system. To limit the computational complexity of the algorithm, they rely on a histogram-voting scheme to select the best putative image correspondences. The proposed approach is tested on a 2 km image dataset captured with a small quadrocopter flying in the streets of Zurich. The success of the approach shows that the new air-ground matching algorithm can robustly handle extreme changes in viewpoint, illumination, perceptual aliasing, and over-season variations, thus outperforming conventional visual place-recognition approaches.
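The general idea behind a histogram-voting scheme can be sketched in a few lines of Python: every feature match casts one vote for the database image it points to, and only the images with the most votes are kept as candidates for more expensive verification. This is an illustration of the voting idea only, not the authors' implementation; the match list, the image identifiers, and the `top_k` parameter are hypothetical.

```python
from collections import Counter


def vote_for_candidates(matches, top_k=2):
    """Count one vote per match and return the top_k database images.

    `matches` is a list of (query_feature_id, database_image_id) pairs.
    """
    votes = Counter(db_image for _, db_image in matches)
    return votes.most_common(top_k)


# Made-up matches between query features and Street View images.
matches = [
    (0, "sv_012"),
    (1, "sv_012"),
    (2, "sv_047"),
    (3, "sv_012"),
    (4, "sv_003"),
    (5, "sv_047"),
]
candidates = vote_for_candidates(matches)
```

Restricting later stages to the top-voted images is what keeps the overall cost bounded as the database grows.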

Wednesday, October 16, 2013

Computer technology still lags far behind the abilities of the human brain, which has billions of neurons that help us simultaneously process a plethora of stimuli from our many senses. But Qualcomm hopes to narrow that gap with a new type of computer architecture modeled after the brain, which would be able to learn new skills and react to inputs without needing a human to manually write any code. It's calling its new chips Qualcomm Zeroth Processors, categorized as Neural Processing Units (NPUs), and already has a suite of software tools that can teach computers good and bad behavior without explicit programming.

Qualcomm demoed its technology by creating a robot that learns to visit only white tiles on a gridded floor. The robot first explores the environment, then is given positive reinforcement while on a white tile, and proceeds to only seek out other white tiles. The robot learns to like the white tile due to a simple "good robot" command, rather than any unique algorithm or code.

The computer architecture is modeled after biological neurons, which respond to the environment through a series of electrical pulses. This allows the NPU to passively respond to stimuli, waiting for neural spikes to return relevant information for a more effective communication structure. According to MIT Technology Review, Qualcomm is hoping to have a software platform ready for researchers and startups by next year.

Qualcomm isn't the only company working on building a brain-like computer system. IBM has a project known as SyNAPSE that relates to objects and ideas, rather than the typical if-this-then-that computer processing model. This new architecture would someday allow a computer to efficiently recognize a friendly face in a crowd, something that takes significant computing power with today's current technology. Modeling new technology after the human brain may be the next big evolutionary step in creating more powerful computers.

Thursday, October 10, 2013

Small cubes with no exterior moving parts can propel themselves forward, jump on top of each other, and snap together to form arbitrary shapes.

In 2011, when an MIT senior named John Romanishin proposed a new design for modular robots to his robotics professor, Daniela Rus, she said, “That can’t be done.”

Two years later, Rus showed her colleague Hod Lipson, a robotics researcher at Cornell University, a video of prototype robots, based on Romanishin’s design, in action. “That can’t be done,” Lipson said.

In November, Romanishin — now a research scientist in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) — Rus, and postdoc Kyle Gilpin will establish once and for all that it can be done, when they present a paper describing their new robots at the IEEE/RSJ International Conference on Intelligent Robots and Systems.

Known as M-Blocks, the robots are cubes with no external moving parts. Nonetheless, they’re able to climb over and around one another, leap through the air, roll across the ground, and even move while suspended upside down from metallic surfaces.

Inside each M-Block is a flywheel that can reach speeds of 20,000 revolutions per minute; when the flywheel is braked, it imparts its angular momentum to the cube. On each edge of an M-Block, and on every face, are cleverly arranged permanent magnets that allow any two cubes to attach to each other.
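The momentum transfer can be put into rough numbers with a short Python sketch. Only the 20,000 rpm figure comes from the text. The two moments of inertia are made-up placeholder values, and the calculation assumes an idealized, lossless and instantaneous transfer in which the braked flywheel and the cube end up rotating together about the same axis.

```python
import math


def rpm_to_rad_per_s(rpm):
    """Convert revolutions per minute to radians per second."""
    return rpm * 2.0 * math.pi / 60.0


def angular_momentum(moment_of_inertia, omega):
    """L = I * omega for rotation about a fixed axis."""
    return moment_of_inertia * omega


FLYWHEEL_RPM = 20_000        # from the article
FLYWHEEL_INERTIA = 5.0e-6    # kg*m^2, hypothetical placeholder
CUBE_INERTIA = 1.0e-4        # kg*m^2, hypothetical placeholder

omega_flywheel = rpm_to_rad_per_s(FLYWHEEL_RPM)
momentum = angular_momentum(FLYWHEEL_INERTIA, omega_flywheel)

# Conservation of angular momentum once the flywheel is locked to the cube:
# I_f * w_f = (I_c + I_f) * w_c
omega_cube = momentum / (CUBE_INERTIA + FLYWHEEL_INERTIA)
```

With these placeholder values the flywheel spins at roughly 2,094 rad/s, and the cube would start rotating at roughly 100 rad/s. A real cube pivots about an edge held by its magnets, so the actual numbers depend on geometry and losses that this sketch ignores.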

A prototype of a new modular robot, with its innards exposed and its flywheel — which gives it the ability to move independently — pulled out. Photo: M. Scott Brauer

“It’s one of these things that the [modular-robotics] community has been trying to do for a long time,” says Rus, a professor of electrical engineering and computer science and director of CSAIL. “We just needed a creative insight and somebody who was passionate enough to keep coming at it — despite being discouraged.”

Embodied abstraction

As Rus explains, researchers studying reconfigurable robots have long used an abstraction called the sliding-cube model. In this model, if two cubes are face to face, one of them can slide up the side of the other and, without changing orientation, slide across its top.
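The sliding-cube abstraction is easy to state in code. The Python sketch below is a two-dimensional simplification written for illustration, not taken from the researchers' work: cubes are grid cells, a cube may slide one cell along the face of a neighbour, and it may round the corner of a neighbour to end up on an adjacent face (the "up the side and across the top" move). The function name and the grid encoding are assumptions, and the sketch does not check that the structure stays connected after a move.

```python
DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def add(a, b):
    """Component-wise sum of two grid coordinates."""
    return (a[0] + b[0], a[1] + b[1])


def legal_moves(cube, occupied):
    """Cells the given cube can reach in one sliding-cube move (2D sketch)."""
    others = occupied - {cube}
    moves = set()
    for step in DIRECTIONS:
        target = add(cube, step)
        if target in others:
            continue  # the cell we would slide into is blocked
        for side in DIRECTIONS:
            if side[0] * step[0] + side[1] * step[1] != 0:
                continue  # the supporting cube must be beside the path
            support = add(cube, side)
            if support not in others:
                continue  # nothing to slide along on this side
            ahead = add(target, side)
            if ahead in others:
                moves.add(target)  # straight slide along two neighbours
            else:
                moves.add(ahead)   # corner move onto the next face of `support`
    return moves


# Two cubes face to face: the left one can round either corner of the right one.
pair = {(0, 0), (1, 0)}
pair_moves = legal_moves((0, 0), pair)

# A two-cube wall: the lone cube can slide straight up along it.
wall = {(0, 0), (1, 0), (1, 1)}
wall_moves = legal_moves((0, 0), wall)
```

In the first example the moving cube ends at (1, 1) or (1, -1), that is, on top of or underneath its neighbour, without ever changing orientation.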

The sliding-cube model simplifies the development of self-assembly algorithms, but the robots that implement them tend to be much more complex devices. Rus’ group, for instance, previously developed a modular robot called the Molecule, which consisted of two cubes connected by an angled bar and had 18 separate motors. “We were quite proud of it at the time,” Rus says.

According to Gilpin, existing modular-robot systems are also “statically stable,” meaning that “you can pause the motion at any point, and they’ll stay where they are.” What enabled the MIT researchers to drastically simplify their robots’ design was giving up on the principle of static stability.

“There’s a point in time when the cube is essentially flying through the air,” Gilpin says. “And you are depending on the magnets to bring it into alignment when it lands. That’s something that’s totally unique to this system.”

That’s also what made Rus skeptical about Romanishin’s initial proposal. “I asked him to build a prototype,” Rus says. “Then I said, ‘OK, maybe I was wrong.’”

Geoff Hinton presents as part of the UBC Department of Computer Science's Distinguished Lecture Series, May 30, 2013. Professor Hinton was awarded the 2011 Herzberg Canada Gold Medal for Science and Engineering, among many other prizes. He is also responsible for many technological advancements impacting many of us (better speech recognition, image search, etc.)

Wednesday, October 9, 2013

Disney researchers have found a way for people to "feel" the texture of objects seen on a flat touchscreen.

The technique involves sending tiny vibrations through the display that let people "feel" the shallow bumps, ridges and edges of an object.

The vibrations fooled fingers into believing they were touching a textured surface, said the Disney researchers.

The vibration-generating algorithm should be easy to add to existing touchscreen systems, they added.

Developed by Dr Ali Israr and colleagues at Disney's research lab in Pittsburgh, the vibrational technique re-creates what happens when a finger tip passes over a real bump.

"Our brain perceives the 3D bump on a surface mostly from information that it receives via skin stretching," said Ivan Poupyrev, head of the interaction research group in Pittsburgh.

To fool the brain into thinking it is touching a real feature, the vibrations imparted via the screen artificially stretch the skin on a fingertip so a bump is felt even though the touchscreen surface is smooth.

The researchers have developed an underlying algorithm that can be used to generate textures found on a wide variety of objects.

A video depicting the system in action shows people feeling apples, jellyfish, pineapples, a fossilised trilobite as well as the hills and valleys on a map.

The more pronounced the feature, the greater the vibration needed to mimic its feel.
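One simple way to picture this relationship is to drive the vibration amplitude from the local slope of a virtual height profile under the finger, so that steeper features produce stronger output. The Python sketch below is an assumption-laden illustration, not the researchers' algorithm: the slope-based mapping, the `gain` and `max_amplitude` parameters, and the sample profiles are all invented for the example.

```python
def slope(profile, i, spacing=1.0):
    """Central-difference slope of a 1D height profile at index i."""
    left_index = max(i - 1, 0)
    right_index = min(i + 1, len(profile) - 1)
    span = (right_index - left_index) * spacing
    if span == 0:
        return 0.0  # single-sample profile has no slope
    return (profile[right_index] - profile[left_index]) / span


def vibration_amplitude(profile, i, gain=0.5, max_amplitude=1.0):
    """Map slope magnitude to a clipped drive amplitude (hypothetical mapping)."""
    return min(max_amplitude, gain * abs(slope(profile, i)))


# Made-up height profiles sampled along the finger's path.
flat = [0.0, 0.0, 0.0, 0.0, 0.0]
bump = [0.0, 0.5, 1.0, 0.5, 0.0]
tall_bump = [0.0, 1.0, 2.0, 1.0, 0.0]

amp_flat = vibration_amplitude(flat, 2)
amp_bump = vibration_amplitude(bump, 1)
amp_tall = vibration_amplitude(tall_bump, 1)
```

Because the amplitude is computed from whatever profile is currently displayed, the effect follows the content rather than being chosen from a fixed set.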

The vibration system should be more flexible than existing systems used to give tactile feedback on touchscreens, which typically use a library of canned effects, said Dr Israr.

"With our algorithm we do not have one or two effects, but a set of controls that make it possible to tune tactile effects to a specific visual artefact on the fly," he added.

SIMPLE Descriptors

A set of local image descriptors specifically designed for image retrieval tasks.

Compact Composite Descriptors

A set of global image descriptors for image retrieval tasks.

MPEG-7 Descriptors

Download the latest version of the MPEG-7 descriptors for C#. The implementation of these descriptors is based on the LIRE image retrieval system.

The LIRE (Lucene Image REtrieval) library provides a simple way to retrieve images and photos based on their color and texture characteristics. LIRE creates a Lucene index of image features for content based image retrieval (CBIR). Three of the available image features are taken from the MPEG-7 Standard: ScalableColor, ColorLayout and EdgeHistogram. A fourth one, the Auto Color Correlogram, has been implemented based on recent research results. Furthermore, LIRE provides simple methods for searching the index and browsing results. The LIRE library and the LIRE Demo application, as well as all the source, are available under the GNU GPL license.

Img(Rummager)

Img(Rummager) software can be connected to a database and execute a retrieval procedure, extracting the features necessary for the comparison in real time. The image database can be stored either on the computer where the retrieval is actually taking place or on a local network. Moreover, the software is capable of executing a retrieval procedure among the keyword-based results that Flickr provides.

Several image processing and retrieval examples using C#

Caliph & Emir
Caliph & Emir are MPEG-7 based Java prototypes for digital photo and image annotation and retrieval, supporting graph-like annotation for semantic metadata and content based image retrieval using MPEG-7 descriptors.