ATLC is always keen to establish research collaborations with local research entities and universities, with the goal of advancing the state of the art in computer science and promoting world-class research within the local research community. One such effort is undertaken with Nile University in the field of computer vision.

The ubiquity of image and video capturing devices such as digital cameras and mobile phones has resulted in an unprecedented volume of image and video data being produced and shared every day. This has spurred the development of more effective techniques for accessing this data, whether for retrieval, browsing, filtering or summarization. To date, most commercially deployed solutions are based on metadata associated with the image and video content. Unfortunately, only a small percentage of multimedia data comes with user annotations, and even where metadata is available, it falls far short of covering the diverse range of user intentions, especially since an image is worth a thousand words and video is even richer in semantics. Moreover, textual annotations are usually associated with the video content as a whole, making it difficult to randomly access a particular time point in the video. The goal of this collaboration with Nile University is to enable such in-content access to video footage. We are investigating a set of research directions that would advance the current state-of-the-art approaches for detecting and recognizing activities in real-world videos. For example, one outcome of the project is to provide a textual label for a scene in the video sequence, such as “person running”, “walking” or “shoplifting”.
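To illustrate the kind of in-content access described above, the sketch below tags fixed-length windows of a video with activity labels, so that a particular time span can be looked up by its tag. This is a toy illustration only, not the project's actual method: the 2-D feature vectors, the centroids, and the nearest-centroid classifier are all hypothetical stand-ins for learned motion features and a trained activity recognizer.

```python
from math import dist  # Euclidean distance (Python 3.8+)

# Hypothetical activity "centroids": a real system would use learned
# motion/appearance features; these are toy 2-D vectors for illustration.
CENTROIDS = {
    "person running": (1.0, 0.0),
    "walking": (0.3, 0.0),
    "shoplifting": (0.0, 1.0),
}

def label_frame(feature):
    """Assign the nearest-centroid activity label to one frame's feature vector."""
    return min(CENTROIDS, key=lambda name: dist(CENTROIDS[name], feature))

def tag_video(features, window=4):
    """Split a frame-feature sequence into fixed windows and tag each window
    with its majority frame label, yielding (start_frame, end_frame, label)
    triples so a time span of the video can be accessed by its activity tag."""
    tags = []
    for start in range(0, len(features), window):
        chunk = features[start:start + window]
        labels = [label_frame(f) for f in chunk]
        majority = max(set(labels), key=labels.count)
        tags.append((start, start + len(chunk) - 1, majority))
    return tags

# Toy "video": 4 running-like frames followed by 4 walking-like frames.
video = [(0.9, 0.1)] * 4 + [(0.25, 0.05)] * 4
print(tag_video(video))
# → [(0, 3, 'person running'), (4, 7, 'walking')]
```

The per-window triples are exactly the kind of index that lets a user jump to "the part where someone is running" instead of scanning the whole footage.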

Project outcomes so far:

The project has resulted in a number of conference papers and paper submissions: