Abstract

This document specifies use cases and requirements as an input for the
development of the "Ontology for Media Object 1.0" and the "API for Media
Object 1.0". The ontology will be a simple ontology to support cross-community
data integration of information related to media objects on the Web. The API
will provide read access and potentially write access to media objects, relying
on the definitions from the ontology.

The main scope of this document is video.
Metadata for other media objects like audio or images will be taken into
account if it is applicable for videos as well.

Status of this Document

This section describes the status of this document at the time of its
publication. Other documents may supersede this document. A list of current W3C
publications and the latest revision of this technical report can be found in
the W3C technical reports index at
http://www.w3.org/TR/.

This is an updated Working Draft of the Use Cases and Requirements for
Ontology and API for Media Object 1.0 specification. It has been produced by
the Media Annotations
Working Group, which is part of the W3C Video on the Web Activity. The
purpose of this publication is to reflect the progress of the Working Group.
There are still topics, e.g. in the area of terminology, about which the Working
Group has not reached consensus.

Publication as a Working Draft does not imply endorsement by the W3C
Membership. This is a draft document and may be updated, replaced or obsoleted
by other documents at any time. It is inappropriate to cite this document as
other than work in progress.

1 Introduction

Anticipating the increase in online video and audio in the upcoming years,
we can foresee that it will become progressively more difficult for viewers to
find content using current search tools. In addition, video services on the
Web that allow users to upload video need to display selected information about
the media documents. This could be facilitated by uniform access to selected
metadata across a variety of file formats.

The "Ontology for Media Object 1.0" will address the intercompatiblity
problem by providing a common set of properties to define the basic metadata
needed for media objects and the semantic links between their values in
different existing vocabularies. It will help circumventing the current
proliferation of video metadata formats by providing full or partial
translation and mapping between the existing formats. The ontology will be
accompanied by an API that provides uniform access to all elements defined by
the ontology, which are selected elements from different formats.

This document specifies the use cases and requirements that are motivating
the development of the "Ontology for Media Object 1.0". The scope is mainly
video media objects, but we also take other media objects into account if their
metadata information is related to video.

The development of the requirements has three major inputs: Use cases,
analysis of existing standards, and a description of canonical media
processes.

2 Purpose of this draft publication

This initial version of this document contains only a small set of use cases
and requirements. Nevertheless, it is being published to gather wide feedback on
the general direction of the Working Group. Hence, we would especially like to
encourage feedback on 6 Requirements: both on the requirements which we are
planning to implement and on those which we are planning not to take into
account.

Currently, there is an additional section under development, describing a
top-down modeling approach to describe the media annotation problem. The
Working Group is considering publishing that section in an updated version of
this document.

3 Purpose of the Ontology and the API

The following figure visualizes the purpose of the ontology, the purpose of the
API and their relation to applications.

The ontology will define mappings from properties in formats to a common set
of properties. The API then will define methods to access heterogeneous
metadata, using such mappings. An example: the property createDate
from XMP [XMP] can be mapped to the property
DateCreated from IPTC [IPTC]. The API will
then define a method getCreateDate that will return values either from XMP or IPTC metadata.
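
The mapping idea can be illustrated with a minimal sketch. It is not the API that the Working Group will define; it only assumes that metadata is available as nested dictionaries keyed by format name. The mapping table, the function names and the sample values are hypothetical. Only the pairing of createDate (XMP) with DateCreated (IPTC) is taken from the text above.

```python
from typing import Optional

# Hypothetical mapping table: common property -> (format, native property name),
# listed in lookup order. Only the XMP/IPTC pair comes from the text above.
MAPPINGS = {
    "createDate": [("XMP", "createDate"), ("IPTC", "DateCreated")],
}


def get_property(metadata: dict, common_name: str) -> Optional[str]:
    """Return the first value found for a common property, or None."""
    for format_name, native_name in MAPPINGS.get(common_name, []):
        value = metadata.get(format_name, {}).get(native_name)
        if value is not None:
            return value
    return None


def get_create_date(metadata: dict) -> Optional[str]:
    """Illustrative counterpart of the getCreateDate method mentioned above."""
    return get_property(metadata, "createDate")


# Sample records (made-up values); each carries metadata in one format only.
xmp_only = {"XMP": {"createDate": "2008-06-01"}}
iptc_only = {"IPTC": {"DateCreated": "20080601"}}
```

Note that the sketch returns the values unchanged, so the caller still sees format-specific date notations. Whether the API should normalize such values is a separate design question.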

An important aspect of the above figure is that everything visualized above
the API is left to applications. For example.

The collections of cultural heritage institutions (libraries, museums,
archives, etc.) are increasingly digitised and made available on the Web. These
collections range from text to image, video and audio (music and radio
collections, for example). Comprehensive, professionally created documentation
is usually available, but it often uses domain-specific or even proprietary
metadata models. This hinders accessing these collections in a homogeneous or
centralized way and linking them across collections.

For example, Jane is a TV journalist searching for material about some event
in contemporary history. She is interested in television clips and radio
broadcasts from this event, along with photos and newspaper articles. All these
resources come from different collections, and some are in different languages.
A homogeneous way of accessing them across the Web would improve her work.

5.2 Recommendation across different media types

Summary: Accessing the metadata of heterogeneous media objects as input for
creating recommendations based on user preferences.

People nowadays are able to enjoy a large number of programs from different
content providers (broadcasting companies, Internet video websites, etc.). To
achieve a better user experience, reduce the user's experience of being
overloaded, and hence retain users, some systems provide recommendations based
on the user's history, ratings, or stated preferences. However, different
content providers usually have their specific or proprietary metadata models,
which is one of the key problems faced by recommendation service providers.
A common ontology spanning different metadata sets can allow recommendation
systems to return a better, larger, and more relevant selection than when the
metadata systems are unrelated.

Company A is an IPTV value-added service provider. One of their services is to
recommend programs that users might like, based on their watching history or
explicit rating of programs. In this system, users are able to watch regular TV
programs with electronic program guide (EPG) format metadata, videos such as
those from YouTube with website-specific metadata, etc. In order to perform
uniform and effective recommendation in the absence of a common set of
vocabularies, they would need to design their own integrated media annotation
model.

With modern devices, a person can capture his or her experience, including all
sorts of daily events, by creating image, audio and video files, and publish
them on the Web. These are called "Life Logs". Life Logs contain various
information such as time, location, the creator's profile, relations between
different people, and even emotion. If accessed via an ontology providing links
between the different metadata used to describe this information, a user could
easily and efficiently search for his or her personal Life Log information,
including emotional information (this type of information can be described
using a vocabulary like [Emotions ML 1.0]) or geolocation information (which
can be described using the [Geolocation API] specification). Other people's
Life Log contents could also be searched and accessed via this ontology.

John is developing a JavaScript library for accessing metadata of media
objects (e.g. video) in various formats. These
objects are available within a database, such as that of a search engine
indexing the internet or other web-accessible content (e.g. a corporate
repository, library, etc.). His library can be used to make
queries of the media objects like:

"Find me all media objects which have been created by a specified
person"

"Find me all media objects which have been created this year"

"Find me all videos which are not longer than a specified time"

"Extract all user added tags from all media objects available"

This use case is related to many other use cases. Nevertheless, it is
mentioned separately since, unlike the other use cases, its implementation
requires only a small set of requirements. The difference from the Cultural
Heritage use case is that this use case is very strongly tied to the
requirement of such a read-only client side API.

5.5 User generated Metadata

Use case summary: Adding or linking to external metadata by different
users.

John wants to publish comments on the last movies he has seen on
http://example.cheap-vod.com/ . For each movie, he uses the description
metadata field to provide a personal summary of the movie (with incentive to
see or avoid the movie according to his own opinions), and the ranking
metadata. John is also not satisfied with the genre classification of the
website, so he uses the genre metadata field to provide his appreciation of the
genre with regard to a better scheme. He then publishes these metadata on his
blog (maybe in the form of a podcast), but only links to the videos
themselves.

Jane, a friend of John's and another cheap-vod customer, can now configure
her cheap-vod account or her browser, to have John's metadata added to or
replacing the original metadata embedded in each file.
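
The two options mentioned above, adding John's metadata to the embedded metadata or replacing it, can be sketched as follows. This is only an illustration under the assumption that both metadata sets are flat dictionaries. The function name, the mode names and the sample values are hypothetical.

```python
def overlay(original: dict, external: dict, mode: str = "replace") -> dict:
    """Combine embedded metadata with externally published metadata.

    mode "replace": an external value replaces the embedded one.
    mode "add": both values are kept side by side in a list.
    The input dictionaries are not modified.
    """
    if mode not in ("add", "replace"):
        raise ValueError("mode must be 'add' or 'replace'")
    merged = dict(original)
    for key, value in external.items():
        if mode == "replace" or key not in merged:
            merged[key] = value
        else:
            existing = merged[key] if isinstance(merged[key], list) else [merged[key]]
            merged[key] = existing + [value]
    return merged


# Made-up sample data: metadata embedded in the file and John's published metadata.
embedded = {"title": "Some Movie", "genre": "Drama"}
johns = {"genre": "Tragicomedy", "description": "Worth seeing."}

replaced = overlay(embedded, johns, mode="replace")
added = overlay(embedded, johns, mode="add")
```

In "add" mode a property can end up with several values, so an application has to decide how to present them.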

Now Jane wants to study the characters of the movie more closely. To make
this easier, she defines one custom metadata field for each of the main
characters, and sets these fields to "yes" or "no" for each sequence, to
indicate if they contain that character or not. For example:

In this context, the ontology would enhance the
interoperability between different users.

5.6 Use cases: to be done

Editorial note

In a future draft of this
document, the following use cases will be spelled out separately,
integrated into existing use cases or dropped.

Multimedia adaptation, at least partly to be covered by

Multimedia presentation

Digital imaging lifecycle

Accessibility

6 Requirements

This section describes requirements for the ontology and the API. The
Working Group has agreed to implement the following requirements. For the other
requirements, there is no agreement yet, and the Working Group is asking
reviewers of this document for feedback about their implementation.

Description: The API MUST provide methods for setting
metadata in media objects in different formats.

Rationale: The implementation of this requirement is mainly necessary for
use cases which involve change of media objects by users.

Target (API and / or ontology): API

Note:

The implementation of this requirement may pose several problems, like:
how to set information in formats which have more detailed information than our
ontology, or how to implement the setting process in the API (e.g. what
protocol to use). Due to such problems and since there seem to be no
implementations achieving this functionality, we might not take this
requirement into account.

6.3 Requirement r03: Providing in the
API a means for supporting structured annotations

Description: The API MUST provide a means to support
structured metadata for media objects, like the name of the creator being
structured in "first name" and "last name".

Rationale: There are existing, widely used formats like [XMP] which are defined in a structured manner. To be able to
support meta information for media objects, including such formats, the API
needs to have a means to achieve this.
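
As an illustration of this requirement, the following minimal sketch shows a creator value that may be either a plain string or a structured name. The class and function names are hypothetical and do not reflect a decision by the Working Group.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class PersonName:
    """Structured creator name, split as in the description above."""
    first_name: str
    last_name: str


def creator_as_text(creator: Union[str, PersonName]) -> str:
    """Flatten a creator value for applications that only handle plain text."""
    if isinstance(creator, PersonName):
        return f"{creator.first_name} {creator.last_name}"
    return creator
```

Applications that understand the structure can use the two fields directly, while simpler applications can fall back to the flattened text.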

Description: It MUST be possible to access user-defined metadata for media objects.
"User-defined metadata" means
metadata that is not defined in a standardized format, but which is
created entirely by the user.

Target (API and / or ontology): Mainly the ontology, but possibly also the
API, if we require it to process this format.

6.7 Requirement r07: Introducing several
abstraction levels in the ontology

Description: The ontology MUST provide several abstraction
levels.

Rationale: Several metadata standards like [FRBR] or [CIDOC] allow referring to multimedia objects on several
abstraction levels, in order to separate e.g. a movie, a DVD which contains the
movie and a specific copy of the DVD. Especially for collections of multimedia
objects, knowledge about such abstraction levels is helpful, as a means for
accessing the objects on each level.

Target (API and / or ontology): ontology and potentially API, if we want to
provide access to metadata and multimedia objects on several abstraction
levels.
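
The movie, DVD and copy example from the rationale can be sketched as a chain of levels in which a lookup falls back to the next more abstract level. The class, the level names and the sample values are assumptions for illustration; they are not taken from [FRBR] or [CIDOC].

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Level:
    name: str
    kind: str  # hypothetical level labels: "work", "manifestation", "item"
    metadata: dict
    parent: Optional["Level"] = None


def lookup(level: Optional[Level], prop: str) -> Optional[str]:
    """Search for a property on this level, then on each more abstract level."""
    current = level
    while current is not None:
        if prop in current.metadata:
            return current.metadata[prop]
        current = current.parent
    return None


# Made-up sample data following the example in the rationale.
movie = Level("the movie", "work", {"director": "A. Director"})
dvd = Level("DVD edition", "manifestation", {"regionCode": "2"}, parent=movie)
dvd_copy = Level("library copy 17", "item", {"shelf": "B-12"}, parent=dvd)
```

With this arrangement, metadata such as the director is stored once on the most abstract level and remains reachable from every DVD and every copy.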

6.8 Requirement r08: Being able to apply
the ontology / API for collections of metadata

Description: Different roles in metadata processing MUST be
taken into account.

Rationale: Metadata is dealt with by, for example, producers of metadata
(e.g. a video camera), changers (e.g. a person who modifies initially created
metadata) and consumers (e.g. an application which processes metadata to make
it accessible for search). If several pieces of metadata, created by machines
or people in different roles, are in conflict (e.g. contradictory creation
dates), a description of provenance related to roles can be useful for conflict
resolution (e.g. "metadata produced by the changer takes precedence over
metadata produced by the creator").

Target (API and / or ontology): ontology
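
The conflict resolution rule quoted in the rationale can be sketched as follows, assuming that every metadata statement records the role of its source. The priority table, the record layout and the sample values are hypothetical.

```python
from typing import List, Optional

# Assumption: a higher number wins. This encodes the example rule from the
# rationale, in which the changer's metadata is preferred over the producer's.
ROLE_PRIORITY = {"producer": 0, "changer": 1}


def resolve(statements: List[dict], prop: str) -> Optional[str]:
    """Pick the value for prop that comes from the highest-priority role."""
    candidates = [s for s in statements if s["property"] == prop]
    if not candidates:
        return None
    return max(candidates, key=lambda s: ROLE_PRIORITY[s["role"]])["value"]


# Made-up sample data: two contradictory creation dates.
statements = [
    {"property": "createDate", "value": "2008-01-01", "role": "producer"},
    {"property": "createDate", "value": "2008-06-01", "role": "changer"},
]
```

Other policies are equally possible, for example preferring the most recent statement. The sketch only shows that role information makes such a policy expressible.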

6.10 Requirement r10: Being able to
describe fragments of media objects

Description: It MUST be possible to relate metadata to
fragments of media objects.

Rationale: Processes like search may be specific to fragments of media
objects, e.g. a search for all kiss scenes in a movie. The implementation of
this requirement provides the means to implement such processes.
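
A fragment-specific search like the one in the rationale can be sketched as follows. The sketch assumes temporal fragments given as start and end times in seconds; the class name, the field names and the sample annotations are hypothetical, and spatial or track-based fragments are not covered.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FragmentAnnotation:
    media_id: str
    start: float  # seconds from the beginning of the media object
    end: float
    label: str


def find_fragments(annotations: List[FragmentAnnotation],
                   media_id: str, label: str) -> List[Tuple[float, float]]:
    """Return the (start, end) ranges of one media object carrying the label."""
    return [(a.start, a.end) for a in annotations
            if a.media_id == media_id and a.label == label]


# Made-up sample data.
annotations = [
    FragmentAnnotation("movie-1", 120.0, 135.5, "kiss scene"),
    FragmentAnnotation("movie-1", 300.0, 320.0, "car chase"),
    FragmentAnnotation("movie-1", 4000.0, 4010.0, "kiss scene"),
    FragmentAnnotation("movie-2", 10.0, 20.0, "kiss scene"),
]
```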

6.11 Requirement r11: Providing the
ontology in slices of conformance

Description: The ontology MUST be provided in a prose
description and MAY be provided in different serializations
(RDF, XML). The yet to be produced general conformance description
MUST require implementations to take the prose description
into account. Additional conformance descriptions, being specific to a
serialization, MAY be provided.

Rationale: Existing metadata formats use a wide range of serializations like
RDF and XML. To foster a widespread adoption of the ontology, we do not want to
be specific to one serialization, but rather state that following the prose
description is sufficient for an implementation. If there is an interest in the
Working Group to create one or more serializations, we may provide additional
types of conformance for them.

Target (API and / or ontology): ontology

6.12 Requirement r12: Provide support
for controlled vocabularies for the values of different properties

Description: It MUST be possible to take information from
controlled vocabularies for certain properties into account.

Rationale: Media archives often make use of controlled vocabularies (e.g.
classifications, thesauri, ontologies) for certain properties. Providing access
to knowledge about which vocabulary is actually in use for a media
object is an important requirement for such archives.

Target (API and / or ontology): ontology (for describing properties which
need a slot for specifying a controlled vocabulary) and the API (for getting
information about which vocabulary is being used for a media object)

6.13 Requirement r13: Allow for
different return types for the same property

Description: It MUST be possible to provide different
return types for the same property.

Rationale: Some properties are defined with the same name and functionality
(e.g. conveying information about the creator of a media object), but use
different value types (e.g. string versus URI). This
raises the question whether the API should be specific to only one return type,
or allow several return types for the same property.
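
One possible way to allow several return types is to return the value together with an indication of its type. The following minimal sketch illustrates this option; it is not a decision by the Working Group. The class and function names, and the rule that treats values starting with "http://" or "https://" as URIs, are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyValue:
    value: str
    value_type: str  # hypothetical type labels: "string" or "uri"


def get_creator(record: dict) -> PropertyValue:
    """Return the creator together with a tag describing its value type."""
    raw = record["creator"]
    # Simplifying assumption: only http(s) values are treated as URIs.
    if raw.startswith(("http://", "https://")):
        return PropertyValue(raw, "uri")
    return PropertyValue(raw, "string")
```

The caller can then branch on the type tag instead of guessing the type from the value.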