Handbook of Multimedia for Digital Entertainment and Arts- P3

The advances in computer entertainment, multi-player and online games,
technology-enabled art, culture and performance have created a new form of entertainment
and art, which attracts and absorbs its participants. The fantastic success
of this new field has influenced the development of the new digital entertainment
industry and related products and services, which has impacted every aspect of our
lives.

46 N. Kamimaeda et al.
Fig. 15 Automatic Metadata Expansion
title, keywords, and so on. Content profiles (CP) are created from these data;
each consists of vectors of the form
CP(i) = (ContentId, AttributeId, ValueId).
ContentId is a primary key that distinguishes one content item from another. An attribute
is a class such as cast or genre, and a value is an instance of that class. For example,
comedy, sports, and drama are values of the genre attribute, and
A, B, C, D, and E are values of the cast attribute (Figure 15).
AME creates new content metadata of the form (ContentId, Destination Attribute,
Destination Value) from the original content metadata of the form (ContentId, Origin
Attribute, Origin Value). Consider the following example with content named
001. It has Mr. A in its cast, which gives the vector (001, Cast, Mr. A). The content
gets the new metadata (001, Personality, Intelligent) because the ACD declares that
customers think Mr. A is an intelligent person. Our system can therefore use
not only the original content metadata but also these expanded metadata for
recommendation purposes. These processes are performed in our system as a part of content
metadata creation in the mining engine.
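The expansion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout of the ACD, the `ContentProfile` type, and the function name `expand_metadata` are hypothetical and are not taken from the actual system.

```python
from typing import NamedTuple


class ContentProfile(NamedTuple):
    """One content profile vector: (ContentId, Attribute, Value)."""
    content_id: str
    attribute: str
    value: str


# Hypothetical ACD excerpt (assumed layout): an (origin attribute, origin value)
# pair maps to the (destination attribute, destination value) pairs it implies.
ACD = {
    ("Cast", "Mr. A"): [("Personality", "Intelligent")],
}


def expand_metadata(profiles, acd):
    """Return the original vectors followed by the vectors derived via the ACD."""
    expanded = list(profiles)
    seen = set(profiles)
    for profile in profiles:
        for dest_attr, dest_value in acd.get((profile.attribute, profile.value), []):
            new_row = ContentProfile(profile.content_id, dest_attr, dest_value)
            if new_row not in seen:  # avoid duplicate expanded vectors
                seen.add(new_row)
                expanded.append(new_row)
    return expanded


original = [
    ContentProfile("001", "Cast", "Mr. A"),
    ContentProfile("001", "Genre", "Comedy"),
]
expanded = expand_metadata(original, ACD)
```

In this sketch the expanded vectors have the same shape as the original ones, so a recommender can treat both kinds of metadata uniformly.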
If the ACD has lifestyle-based knowledge, for example “a person who
likes the title EB is an Early Adopter” or “a Follower likes Cast A,” it is very efficient to
apply this method to cross-category recommendation. Lifestyle is very suitable as
common metadata. Moreover, if the content metadata is expanded using lifestyle
knowledge, it becomes easier to realize advertisement recommendation.

2 Cross-category Recommendation for Multimedia Content 47
ICF
The second method is ICF. ICF can recommend items unknown to the user, which is
the same advantage that traditional CF methods have, but ICF also works well for
recommending completely new items, less reusable items such as TV programs, and
items with a high merchandise turnover rate.
As mentioned earlier, item-based CF selects recommendation items based on
groups of similar items and user-based CF selects recommendation items based on
groups of similar users. In either case, recommendation items are directly predicted
based on these groups (Figure 16).
On the other hand, ICF does not directly select favorable items. ICF predicts the
user preferences. These are registered as the expected user preferences as a part of
the user preference vector. Then, the recommendation items are selected based on
an existing recommendation method like the VSM by using not only the original
user preferences but also the expected user preferences. This is why this method is
called an “indirect” method (Figure 17).
ICF necessitates the following two steps:
1. The calculation of similarity between the users based on the original user pref-
erences. Similarities between the users are calculated based on the original user
preferences such as lifestyle, viewer’s age, viewing style, cast, and so on, as men-
tioned earlier. Formula (2) indicates how to calculate the similarities between
user X and user Y.
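\[
\mathrm{sim}_{XY} = \frac{\sum_{v} (X_v - \bar{X})(Y_v - \bar{Y})}{\sqrt{\sum_{v} (X_v - \bar{X})^2}\,\sqrt{\sum_{v} (Y_v - \bar{Y})^2}} \qquad (2)
\]
$X_v$: User X’s preference value of V
$\bar{X}$: User X’s average preference value.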
Fig. 16 Traditional (Direct) CF
Fig. 17 ICF

When the number of users becomes considerable, we need to decrease the calcu-
lation effort to ﬁnd similar users. In our system, hundreds of typical users can be
found by employing some clustering algorithms before performing the similarity
calculations.
2. Expectations of user preferences
User preferences that are not contained in the original user preferences are
predicted. They are referred to as the “expected user preferences.” Formula (3)
indicates how to calculate user X’s expected user preference value of V . The
expected user preferences are registered in the database for recommendation as
a part of the user preference vectors.
\[
\mathrm{Expect}_{Xv} = \bar{X} + \frac{\sum_{N} (N_v - \bar{N})\,\mathrm{sim}_{XN}}{\sum_{N} |\mathrm{sim}_{XN}|} \qquad (3)
\]
N: User who is similar to user X.
Therefore, the user can enjoy the recommendation based on not only the normal
user preference vectors but also the expected user preference vectors. The user can
find favorable items that she/he has never seen before. Moreover, this method is
also beneficial to a system administrator of a recommendation system. This method
is easy to apply to an existing recommendation system such as the VSM since the
expected user preference vectors have the same style as normal user preference vec-
tors and can be stored in the same table in the database. The system administrator
can easily create a multi-algorithm recommendation system with ICF. In VE, ICF is
used as a part of the user preference creation in the mining engine.
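The two ICF steps, Formulas (2) and (3), can be sketched as follows. This is a minimal illustration, not the VE implementation: preference vectors are assumed to be plain dictionaries from a preference name to a numeric value, the sums in Formula (2) are assumed to run over the preferences both users share, and all function names are hypothetical.

```python
import math


def mean(prefs):
    """Average preference value of one user."""
    return sum(prefs.values()) / len(prefs)


def similarity(x, y):
    """Formula (2): correlation between users X and Y.

    Assumption: the sums run over the preferences both users have values for.
    """
    common = set(x) & set(y)
    if not common:
        return 0.0
    mean_x, mean_y = mean(x), mean(y)
    numerator = sum((x[v] - mean_x) * (y[v] - mean_y) for v in common)
    denominator = (
        math.sqrt(sum((x[v] - mean_x) ** 2 for v in common))
        * math.sqrt(sum((y[v] - mean_y) ** 2 for v in common))
    )
    return numerator / denominator if denominator else 0.0


def expected_preference(x, similar_users, v):
    """Formula (3): expected value of preference v for user X.

    Returns None when no similar user provides usable information about v.
    """
    numerator = 0.0
    denominator = 0.0
    for n in similar_users:
        if v not in n:
            continue
        s = similarity(x, n)
        numerator += (n[v] - mean(n)) * s
        denominator += abs(s)
    if denominator == 0:
        return None
    return mean(x) + numerator / denominator


# Illustrative data only: user X has no value for "news", a similar user does.
user_x = {"comedy": 5, "drama": 1, "sports": 3}
user_n = {"comedy": 4, "drama": 2, "sports": 3, "news": 5}
expected_news = expected_preference(user_x, [user_n], "news")
```

The value returned by `expected_preference` would then be stored next to the original preferences, so that an existing method such as the VSM can use both without modification.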
RCF
As mentioned earlier, CF methods often do not work well with completely new
items. RCF attempts to resolve this problem by using not only the traditional CF-based
similar items of an item but also the CF-based similar items of its CBF-based similar items. RCF
requires the following two steps:
1. The calculation of similar items based on traditional item-based CF for all of the
items
2. The calculation of RCF-based similar items using CBF-based similar items and
CF-based similar items
The following four small steps are involved in this process.
1) Select several attributes used for the CBF calculation from the metadata.
2) By using these attributes, CBF-based similar items are calculated based on the
cosine measure or inner product.
3) By using equation (4), calculate the RCF-based similarity based on the CBF-
based similarities calculated in step (2) and CF-based similarities calculated
in step (1).

Fig. 18 RCF Step 1: Calculation of Similar Items Based on Traditional Item-based CF
Fig. 19 RCF Step 2: Calculation of RCF-based Similar Items
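\[
\mathrm{Sim}_{RCF}(A, B) = (1 - \beta)\,\mathrm{Sim}_{CF}(A, B) + \beta\,\frac{\sum_{n \in N} \mathrm{Sim}_{CBF}(A, n)\,\mathrm{Sim}_{CF}(n, B)}{\sum_{n \in N} \mathrm{Sim}_{CBF}(A, n)} \qquad (4)
\]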
In this equation, A, B, and n are items. Here, n ranges over N, the set of all CBF-based
similar items for A, and β weights the two terms. Moreover, Sim denotes the similarity
between two items.
4) Based on step (3), RCF-based similar items are selected as the recommended
items.
This algorithm is illustrated in Figures 18 and 19.
Evidently, this method can work well for almost all items, even completely
new ones, unless the seed item has no CBF-based similar items
or none of its CBF-based similar items has CF-based similar items. In
VE, RCF is used as a part of content profiling in the mining engine.
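Equation (4) can be sketched as a single function. The data layout is an assumption made for illustration: similarities are stored in dictionaries keyed by item pairs, `cbf_neighbours` lists the CBF-based similar items of each item, and the default β of 0.5 is arbitrary. Falling back to the CF term alone when an item has no CBF-based similar items is also an assumption, since the equation is undefined in that case.

```python
def rcf_similarity(a, b, sim_cf, sim_cbf, cbf_neighbours, beta=0.5):
    """Equation (4): blend direct CF similarity with CF similarity reached
    through the CBF-based similar items of item a."""
    direct = sim_cf.get((a, b), 0.0)
    neighbours = cbf_neighbours.get(a, [])
    total_weight = sum(sim_cbf.get((a, n), 0.0) for n in neighbours)
    if total_weight == 0:
        # Assumed fallback: no CBF-based similar items, keep only the CF term.
        return (1 - beta) * direct
    indirect = sum(
        sim_cbf.get((a, n), 0.0) * sim_cf.get((n, b), 0.0) for n in neighbours
    ) / total_weight
    return (1 - beta) * direct + beta * indirect


# Illustrative data only: "A" is a completely new item with no CF similarities,
# but it has two CBF-based similar items that do have CF similarities to "B".
sim_cf = {("n1", "B"): 0.5, ("n2", "B"): 1.0}
sim_cbf = {("A", "n1"): 0.8, ("A", "n2"): 0.2}
cbf_neighbours = {"A": ["n1", "n2"]}
score = rcf_similarity("A", "B", sim_cf, sim_cbf, cbf_neighbours, beta=0.5)
```

In this example the new item A still receives a non-zero similarity to B, which is the situation RCF is meant to handle.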
Example of Practical Applications
Multimedia Content Recommendation
There have already been many systems and studies regarding multimedia recom-
mendation. Here, we introduce two practical systems as examples for multimedia

content recommendation: branco [33] and SensMe [34]. branco is an IPTV recom-
mendation service. This recommendation function is realized using the VE. SensMe
is an automatic music playlist generator used in mobile phones made by Sony
Ericsson [35]. This application is based on the 12 Tone technology [36] developed
by Sony Corporation.
branco
branco is an example of content-meta-based search and user-preference-based
search. It is an IPTV service that uses an IP multicast network and has several channels
for broadcast, so a user can watch content as on TV. Moreover, branco
adopts an advertising model, so users who can connect to the IP multicast
network can use branco for free. The recommendation function for this service
is realized using VE. As an example, a user-preference-based search is shown in
Figure 20. The programs recommended by VE are shown as “anapita,” which means
“just fitting you” in Japanese.
SensMe
SensMe is an example of content-meta-based search. It is an automatic playlist gen-
erator from the music content of the user. An image of the SensMe application is
shown in Figure 21. SensMe analyzes the music automatically and then extracts
music features such as speed, tone, and mood. After that, SensMe maps the music onto an
X–Y plane using tempo and mood. On this plane, the user can see the music
that he or she has listened to. Moreover, SensMe automatically generates 11 different
Fig. 20 Recommendation in branco

Fig. 21 SensMe
music channels such as “Morning,” “Relax,” and “Upbeat.” Therefore, by choosing
a channel instead of selecting individual tracks, the user can listen to a music playlist
generated by SensMe as a recommendation.
Cross-category Recommendation
In this section, we introduce two systems as examples of cross-category recommen-
dation services: VAIO Giga Pocket Digital [37] and TV Kingdom service [12]. Both
these systems use the VE to realize recommendation functions. Giga Pocket Digi-
tal is a TV content manager for the VAIO system. The TV Kingdom service is an
online TV guide service in So-net (So-net Entertainment Corporation). Giga Pocket
Digital is an example of crossing categories only for user preference. TV Kingdom
is an example of crossing categories not only for user preference but also for
recommendation. These two systems are described in the following
two sections.
VAIO Giga Pocket Digital
Giga Pocket Digital is a TV content manager. Using this system, a user can watch
TV programs in real time and record them manually by favorite keyword and
user preference. In this system, the VE realizes content-meta-based search and
user-preference-based search. For example, Figure 22 shows an image of the user-
preference-based search.
VE recommends only TV programs, because this application handles only TV
programs. However, VE learns the user preference from not only the user’s behav-
ior on this application but also what kind of music the user possesses. This means

Fig. 22 User-preference-based search
that the user preference is generated from the user’s behavior in the TV and music
categories. Crossing categories is very effective even if only the user preferences
are crossed. By crossing the user preference for the TV and music categories,
VE can recommend TV programs in which the user’s favorite artist appears.
VE also provides an edit function for user preferences, as shown in Figure 23.
This pane is called “My Carte.” Here, the user can see his/her own preferences, such
as frequently viewed casts, genres, and keywords. These casts include not only the
persons watched by the user but also the persons whose music the user possesses.
Users can also check their own TV-viewing style, e.g., that they frequently watch
sports programs and infrequently watch drama programs. Moreover, users can
edit their own preferences to customize the recommendation result.
TV Kingdom Service
TV Kingdom now has 800,000 unique users per month who enjoy not only a con-
ventional TV guide but also a personalized TV guide, which can work together with
a consumer electronics appliance or a personal computer. Figure 24 indicates the
basic services offered by TV Kingdom.
1. A user can easily ﬁnd TV programs and enjoy a useful EPG service having
interesting functions, such as category-oriented list, recording-ranking list, and
cast-oriented list.
2. The user can record and reserve TV programs on her/his local personal video
recorder (PVR) through the TV Kingdom EPG by just clicking the iEPG icons.

Fig. 23 My Carte
Fig. 24 Basic Services of TV Kingdom
3. The user can record programs on her/his local PVR even when away from home
using a mobile PC or cell phones provided by companies such as NTT-DoCoMo,
Softbank, and au.
4. The PVR automatically records the programs having the same keywords, such as
genre, title, or cast, as registered by the user.
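The keyword-based automatic recording in item 4 can be pictured with the following sketch. It is purely illustrative: the program dictionaries, the exact, case-insensitive matching rule, and the function name `programs_to_record` are assumptions and do not describe how TV Kingdom or any PVR actually matches keywords.

```python
def programs_to_record(programs, registered_keywords):
    """Select programs whose genre, title, or cast equals a registered keyword.

    Assumption: matching is exact and case-insensitive on whole field values.
    """
    wanted = {keyword.lower() for keyword in registered_keywords}
    selected = []
    for program in programs:
        fields = {program["genre"].lower(), program["title"].lower()}
        fields |= {member.lower() for member in program["cast"]}
        if fields & wanted:
            selected.append(program)
    return selected


# Illustrative EPG entries only.
epg = [
    {"title": "Evening News", "genre": "News", "cast": ["Cast A"]},
    {"title": "Cup Final", "genre": "Sports", "cast": ["Cast B"]},
    {"title": "Late Drama", "genre": "Drama", "cast": ["Cast C"]},
]
to_record = programs_to_record(epg, ["sports", "Cast A"])
```

The sketch also makes the limitation discussed next visible: nothing is recorded unless the user has registered suitable keywords by hand.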
These basic functions are really useful for the PVR user. However, it is somewhat
tedious for the user to find her/his favorite programs and make a reservation
for recording. The automatic recording function resolves this problem on some

Fig. 25 Example of Cross-category Recommendation
level, but the user has to set her/his favorite keywords manually. We have tried to
offer a better solution for this issue by using the VE. Here, content-meta-based search
and user-preference-based search are realized by the VE.
Moreover, the TV Kingdom service offers not only TV program recommenda-
tions but also video, e-Book, DVD, CD, book, and cross-category recommendation
among these categories. An image of the cross-category recommendation service is
shown in Figure 25.
VE learns the user preferences from the user’s behavior on all the categories. By
employing these user preferences, the VE realizes cross-category recommendation
among these categories.
Difﬁculties
There are three difficulties in realizing multimedia content recommendation and
cross-category recommendation: how to extract features from the content itself (for
multimedia content recommendation), how to generate common metadata, and how to
merge each category’s user preference (the latter two for cross-category recommendation).
The first problem involves the extraction of features from the content itself. This
creates difficulty in realizing multimedia content recommendation. It is necessary to
develop feature extraction tools for each category such as music, picture or motion

picture. For example, in the motion picture category, scene detection or face
recognition tools are necessary. However, it is cumbersome to implement
these tools for each category. Moreover, sufficient metadata cannot be extracted
from some content (e.g., motion picture) for efficient recommendation. Therefore,
we need to consider recommendation algorithms that use limited metadata when
realizing recommendation functions for such content.
The second problem involves the generation of common metadata. This creates
difficulty in realizing cross-category recommendation. We can easily employ person
and keyword attributes as direct common metadata because almost all of these
items have these attributes. However, it is difficult to select other common attributes,
because most of these items have different metadata. Therefore, in order to realize
a cross-category recommendation system, it is necessary to investigate what kind of
metadata the system can actually use. If common metadata do not exist, we should
consider an appropriate way to treat such content. For example, VE has personality
data for each cast in the ACD and the content metadata are expanded using the
AME. Lifestyle segmentation data can also be used as common metadata. Generating
such data normally requires a large amount of time. However,
common metadata are key to successfully realizing cross-category recommendation.
Therefore, it is very important to consider this issue.
The final problem is how to merge each category’s user preference. This creates
difficulty in realizing cross-category recommendation. Moreover, it is also
essential for successfully realizing cross-category recommendation, because the
recommended items change depending on how the user preferences are merged. The
best way to merge depends on the system requirements. Therefore, we determine
how to merge the user preferences based on the results of the evaluation experiments
for each application and each system. In addition, on the basis of these results, we
change the weights of each attribute such as genre and keyword.
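One possible way to merge per-category preferences is a weighted sum, sketched below. This is an assumption chosen for illustration, not the method used in VE: as stated above, the best way to merge depends on the system requirements. The function name, the dictionary layout, and all weights are hypothetical.

```python
def merge_preferences(per_category, category_weights, attribute_weights):
    """Merge per-category preference vectors with a weighted sum.

    per_category: {category: {(attribute, value): score}}
    Weights that are not listed default to 1.0 (an assumption of this sketch).
    """
    merged = {}
    for category, prefs in per_category.items():
        category_weight = category_weights.get(category, 1.0)
        for (attribute, value), score in prefs.items():
            attribute_weight = attribute_weights.get(attribute, 1.0)
            key = (attribute, value)
            merged[key] = merged.get(key, 0.0) + category_weight * attribute_weight * score
    return merged


# Illustrative data only: the same artist appears in the TV and music categories.
per_category = {
    "tv": {("Cast", "Artist A"): 0.2, ("Genre", "Drama"): 0.6},
    "music": {("Cast", "Artist A"): 0.9},
}
merged = merge_preferences(
    per_category,
    category_weights={"tv": 1.0, "music": 0.5},
    attribute_weights={"Cast": 2.0},
)
```

Here the music category raises the merged score for the artist, which is the effect that lets a TV recommender surface programs featuring an artist the user listens to. Tuning the two weight tables corresponds to the evaluation-driven adjustment described above.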
Summary and Future Prospects
This article has introduced cross-category recommendation technologies for
multimedia content. First, an overview of the recommendation technologies was
given. After that, practical applications and services realizing multimedia
recommendation and cross-category recommendation were described. Finally, we
discussed the difficulties in multimedia content recommendation and cross-category
recommendation. Addressing these difficulties is imperative to realize a good recommendation system.
Recently, multimedia recommendation technologies have become more impor-
tant because many products and services that involve multimedia content have been
developed. In the near future, cross-category recommendation technologies will become
more important. For example, these technologies are necessary to realize advertisement
recommendations based on TV programs watched by the user or based on the
music listened to by the user. Moreover, these technologies will also be important
for enhancing the user experience, because they eliminate the
differences among categories and help users explore the huge information

Chapter 3
Semantic-Based Framework for Integration
and Personalization of Television Related Media
Pieter Bellekens, Lora Aroyo, and Geert-Jan Houben
Introduction
The online information locomotive drives on at an ever increasing pace. Constantly
we see expansion of existing methods and systems, while at the same time, new
innovations and techniques sprout out of nowhere. These changes bring new possi-
bilities and challenges that affect the whole media chain: from content production,
via distribution, to last but not least the end-user (the consumer). Lately however,
the consumer himself transformed more and more into a content producer, as shown
by Berman [4], making the circle round and the speed of information growth even
larger. Subsequently, this breaks the traditional business model where companies
and institutions are the sole content providers. We describe in this paper our re-
search focusing on the synergy between available content on various media sources
and the consumer at home who wants to experience multimedia content through a
connected media centre.
As an effect, the new forms of home media that emerge as digital systems are
converging. Different content, e.g., from TV, social networks, music, homemade im-
ages and videos, is no longer bound to separate devices or to local storage, and the
development of the Internet makes the media boundaries become less limiting. As
envisioned by for instance IBM [4], the future media may become more pervasive
and offer a more ubiquitous and immersive experience, as increasing technological
sophistication brings new media environments. The transfer to digital content along
L. Aroyo (✉)
Department of Computer Science, Free University of Amsterdam, Amsterdam, Netherlands
e-mail: l.m.aroyo@cs.vu.nl
P. Bellekens
Department of Mathematics & Computer Science, Eindhoven University of Technology,
Eindhoven, Netherlands
e-mail: p.a.e.bellekens@tue.nl
G.-J. Houben
Department of Software Technology, Delft University of Technology, Delft, Netherlands
e-mail: g.j.p.m.houben@tudelft.nl
B. Furht (ed.), Handbook of Multimedia for Digital Entertainment and Arts, 59
DOI 10.1007/978-0-387-89024-1_3, © Springer Science+Business Media, LLC 2009

60 P. Bellekens et al.
with technologies and standards like DVB¹, HDTV, voice over IP, Blu-ray² and
TV-Anytime³ creates opportunities to bring new interactivity to the traditional TV
concept and change it drastically. The television industry has always been a
conservative one. It has not experienced a major revolution in the past fifty years,
which constitutes a strong contrast to the Internet, which has quickly evolved from
mere textual information to multimedia content. We believe that using Semantic
Web technology in the concept of TV content interaction may provide a change
from a traditional one-way communication to a two-way communication where the
user changes from a passive viewer to a more active participant and program
structures change from fixed to dynamic.
In this paper we try to identify requirements, opportunities and problems in home
media centers and we propose an approach to address them by describing an intel-
ligent home media environment. The major issues investigated are coping with the
information overﬂow in the current provision of TV programs and channels and
the need for personalization to speciﬁc users by adapting to their age, interests,
language abilities, and various context characteristics. The research presented in
this paper follows from a collaboration between Eindhoven University of Technol-
ogy, the Philips Applied Technologies group and Stoneroos Interactive Television.
The work has been partially carried out within the ITEA-funded European project
Passepartout, which also includes partners like Thomson, INRIA and ETRI.
In the following chapter we describe the motivation and research problem in re-
lation to related work, followed by an illustrative use case scenario. Afterwards, we
explain our data model which starts with explaining the TV-Anytime structure and
its enrichments with semantic knowledge from various ontologies and vocabularies.
The data model description then serves as the background for understanding our
proposed system architecture SenSee. Afterwards we go deeper into the user mod-
eling part and explain how our personalization approach works. The latter elaborates
on a design targeting interoperability and on semantic techniques for enabling in-
telligent context-aware personalization. In the implementation chapter we describe
some practical issues as well as our main interface showcase, iFanzy. Future work
and conclusions end this chapter.
Related Work
We investigate the design of a home media architecture of connected devices that
can provide access to a wide range of media sources, yet at the same time avoid an
overﬂow of information for the user. In our framework called SenSee, for sensing
the user and seeing the content, and the iFanzy application (a personalized EPG
running on top of SenSee), we aim to connect different devices, such as shared
¹ http://www.dvb.org
² http://www.blu-ray.com
³ http://www.tv-anytime.org

3 Semantic-Based Framework for Integration and Personalization 61
(large) screens with set-top boxes, personal (small) handheld devices and biosensor-
based interfaces, and different media sources like IP, broadcast and local storage.
This intentionally goes beyond the traditional limited solution of a single TV screen
and simple remote control and thus creates the foundation for an ambient home
environment to collect various data about the users and to subsequently use this data
for the personalization of their interaction with the TV content. Related work on
connected homes can be found in the ﬁeld of ambient intelligence, investigated for
instance at the Philips HomeLab [9].
Regarding the information overﬂow aspect, we assume that the amount of avail-
able digital content will increase enormously with the current digital development,
as also indicated by Murugesan and Deshpande [18]. Both paper program guides
and simple EPGs are thus likely to become inefficient in helping the user
choose from an overwhelming amount of content, a situation previously
shown by both Chorianopoulos and O’Sullivan et al. [7, 20]. This creates a need
for media systems to support the user by providing intelligent search and
recommendations that propose the most relevant and interesting programs. Similar research
focused on filtering for interactive TV systems in home environments has previously
been done by, e.g., Goren-Bar and Glinansky [10]. Here content filtering and
user stereotypes were used for capturing and using user preferences.
Various researchers furthermore emphasize that there is a need for person-
alization in dealing with a vast amount of TV content [1]. We believe that a
personalization approach in home media centers is significant in order to handle
the user’s preferences as a basis for the interaction, both regarding content and devices.
Since users differ in age, interests, abilities and language preferences, it is
important that these preferences can be reflected in the system. For instance, an
eight-year-old will have very different favorite programs than an adult, and
a given user might want the movies to always be displayed on the biggest screen,
but private content only on his or her handheld device. By creating a user model,
as described by Kobsa [13], for each user of the system, such personal preferences
may be stored. This needs to capture both a user profile, with the user’s preferences,
and a user context, which describes the current situation that the user is in, for ex-
ample whether the user is alone or with a group, what the available devices are at
the moment, what the time is, what the location is, etc. Several authors [25, 26, 27] argue that not
taking contextual information into account for recommendations seriously limits
the relevance of the results, and, as in SenSee and iFanzy, they advocate context-awareness
as a promising approach to enhance the performance of recommenders.
While Yap et al. [27] and Tung et al. [25] illustrate their framework with a restau-
rant recommender, which takes location, weather and restaurant-related data into
account, Woerndl and Groh [26] apply context-aware recommender systems (by us-
ing location and acceleration) in the domain of inter-networked cars. The models
furthermore constitute a necessary requirement for enabling intelligent filtering of
content to make recommendations, as explained by van Setten [24]. By this we
mean ﬁnding and suggesting content that should be interesting for the user, while
ﬁltering out unwanted or uninteresting information. Various ﬁltering techniques for
recommending movies have previously been explored by Masthoff [15], in which

several user models are combined to create group ﬁltering. Other related work is the
PTVPlus online recommendation system for the television domain by O’Sullivan
et al. [20] and the Adaptive Assistant for Personalized TV by Yu and Zhou [27].
However, in general when new users (with empty profiles) are introduced, recommender
systems have a hard time since they miss essential information to provide
recommendations. To fill this informational gap, we make use of social networks
like Hyves⁴ as they often harbor a vast amount of useful preferences and interests.
Also in work from Alshabib et al. [1] a social network (LinkedIn) is used to aggre-
gate ratings based on the structure of the network, by calculating the neighborhood
of users.
Apart from supplying semantic models of the user, it is also necessary to have
sufﬁcient metadata descriptions of the content. This constitutes the basis for content
classiﬁcation, i.e., sorting the content into different types like ﬁction, non-ﬁction,
news, sports, etc. Intelligent search and ﬁltering of content moreover beneﬁt from
metadata descriptions suitable for reasoning, to deduce new information and to
enrich content search. Current ongoing research in this area by the W3C Multi-
media Annotation on the Semantic Web Task Force has been described by Stamou
et al. [23].
Research similar to that presented in this paper has furthermore been performed by
Hong and Lim [12], who also propose using TV-Anytime for handling content in
a personalized way. However, they focus on broadcast content, whereas we also
consider content from IP and removable media like Blu-ray. Furthermore, their solution
for content search also uses keywords and user history to recommend content,
although the architecture differs in that all processing occurs at a metadata server.
As will be described later, we propose the modeling of TV content with the use of
ontologies. Relevant related work in this field can therefore be found in the domain
of the Semantic Web. Necib and Freytag [19] have focused on using ontologies in
query processing with a similar approach to ours, which aims at refining search
queries with synonyms (and yet avoiding homonyms). However, we intend to take
this one step further in our process as we also use other semantically related concepts
and a measurement for semantic closeness.
Application Scenario
In this section we describe a scenario to illustrate the target functionality of our
demonstrator. The setting is in the home of a European, well-off family in the year
2010, which is living in a region outside their original parental background. While
they wish for the children to integrate with the local community and live and learn
from their neighbors, they also value their heritage (linguistic, cultural and reli-
gious), to effectively communicate with distant relatives and friends. The family
consists of a mother, a father, a four year old child, a deaf nine year old, and a
⁴ http://www.hyves.nl

teenager. The parents are determined that the children should be effectively multi-
lingual and multi-cultural, and will invest time to adapt the multimedia content in
the home. They therefore act as media guides and to some extent teachers for their
children, by selecting and adapting the content. Since the parents have immigrated
to the region, they will have a different preference of content than the default local
selection and they use their home media centre to include also programs from their
original home area, e.g., for news, music and movies. They may also choose to alter
the language or subtitles of the content.
As the family gathers for a movie night together, the home media centre has
suggested a movie that suits each family member’s preference and interest. The
mother has brieﬂy scanned the story of the movie and discovered that the ending is
in her opinion not suitable for the children. She therefore changes to an alternative
ending. As they start the movie, they all together use a shared big screen. Although
they use subtitles on this shared screen, this evening the deaf child also includes
additional sign language on his personal small screen. The teenager on the other
hand needs to practice her second language so her parents asked her this time to
listen to an alternative language version with her headphones. Although they enjoy
the movie together, the father also wants to follow a live soccer game broadcast, and
therefore uses his own handheld screen to view this private video stream. The media
devices in the home are all connected to the ambient home media environment.
TV-Anytime
A content structure which goes beyond a fixed linear time structure and allows multiple
languages, alternative versions, etc. puts high demands on the content model.
It needs to have a dynamic structure, rich metadata, and be suitable for various
media. We believe that the TV-Anytime⁵ standard can serve as the basis for such
requirements and therefore we have built our demonstrator upon the TV-Anytime
concepts. TV-Anytime is a full and synchronized set of XML specifications established
by the TV-Anytime forum, built to enable search, selection, acquisition and
rightful use of content from both broadcast and online services. It basically consists
of two main parts, usually referred to as Phase I and Phase II, each serving its
specific goals.
TV-Anytime Phase I
The TV-Anytime Phase I specification is a very extensive metadata schema which
describes all content retrieved by the system, a fundamental feature for searching
⁵ http://www.tv-anytime.org

and filtering. A program description consists of a set of information tables, where
each describes a specific aspect of a program P:
ProgramInformationTable: Overview of metadata fields like title, synopsis, etc.
GroupInformationTable: Describes the groups P belongs to
ProgramLocationTable: States where/when P can be found/will be broadcast
ServiceInformationTable: States who is the rightful owner/broadcaster of P
CreditsInformationTable: Contains all the credits of people in P
ProgramReviewTable: Lists reviews of P
SegmentInformationTable: Contains metadata about specific segments of P
PurchaseInformationTable: Contains the information about how to obtain P
TV-Anytime is built to suit the needs of the future. It contains constructs to model
metadata which is not yet widely available, e.g. metadata describing specific
scenes (segments) within a program. This suggests that the TV-Anytime specification
will remain usable as the television market evolves and more metadata is generated.
Among these tables, the most important one for us is the ProgramInformationTable, as
it contains all the essential program metadata. The following abbreviated example
shows the metadata in this table for an arbitrary program P:
All Stars
Football, friendship and ...
Comedy
Football (soccer)
EN
2008-10-11 20:00:00
London
Every ProgramInformationTable consists of a list of ProgramInformation components,
and each has a BasicDescription block which contains P’s main description.
Apart from technical descriptions such as the screen aspect ratio and the number
of audio channels (not shown in the example), it includes, for example, a title, a
synopsis and a list of genres.
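As a rough illustration of the nesting just described, the sketch below builds a ProgramInformationTable with one ProgramInformation component and its BasicDescription, then reads the title and genres back. The table, component and block names come from the text; the child element names Title, Synopsis and Genre, the helper function names, and the omission of XML namespaces and attributes are assumptions made for readability and are not taken from the TV-Anytime schema.

```python
import xml.etree.ElementTree as ET


def build_program_table(title, synopsis, genres):
    """Build a simplified ProgramInformationTable holding one program.

    Element names below BasicDescription are assumptions for this sketch.
    """
    table = ET.Element("ProgramInformationTable")
    info = ET.SubElement(table, "ProgramInformation")
    basic = ET.SubElement(info, "BasicDescription")
    ET.SubElement(basic, "Title").text = title
    ET.SubElement(basic, "Synopsis").text = synopsis
    for genre in genres:
        ET.SubElement(basic, "Genre").text = genre
    return table


def read_basic_description(table):
    """Extract title, synopsis and genre list from the first program."""
    basic = table.find("ProgramInformation/BasicDescription")
    return {
        "title": basic.findtext("Title"),
        "synopsis": basic.findtext("Synopsis"),
        "genres": [g.text for g in basic.findall("Genre")],
    }


# Values taken from the abbreviated example in the text.
table = build_program_table(
    "All Stars",
    "Football, friendship and ...",
    ["Comedy", "Football (soccer)"],
)
description = read_basic_description(table)
print(description["title"], description["genres"])
```

A real description would carry the genres as references into the controlled genre hierarchy rather than as free text; plain strings are used here only to keep the sketch short.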
To keep control over the values that metadata creators put in these TV-Anytime
fields, the TV-Anytime specification contains a set of controlled term
hierarchies which provide the only valid values for such properties. A good example
is the genre field. The genre description in TV-Anytime is a fine-grained taxonomy,
going from general concepts like fiction/non-fiction down to specific categories
in the leaf nodes (typically well-known genres like comedy,
drama, daily news, weather forecast, etc.). It should for example be avoided that

3 Semantic-Based Framework for Integration and Personalization 65
people can create their own genres, since doing so would break interoperability
and consistency between the various content descriptions. The following
example shows a part of the genre hierarchy:
NON-FICTION/INFORMATION
News
Sport News
News of sport events
Every genre has an id which encodes its depth in the tree and shows which
other genre is its parent (e.g. genre 3.1 is the parent of genre 3.1.1). The genre ‘sport
news’ is a specialization of the genre ‘news’, which in turn belongs to the group of
‘non-fiction/information’ genres.
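The way such dotted ids encode the hierarchy can be sketched in a few lines. The function names are hypothetical, and the only ids used are the ones the text mentions (3.1 and 3.1.1); the sketch assumes that every id is a dot-separated path from the root of the tree.

```python
def genre_depth(genre_id):
    """Depth in the tree equals the number of dot-separated components."""
    return len(genre_id.split("."))


def parent_genre(genre_id):
    """Drop the last component to get the parent id; top-level ids have none."""
    head, sep, _ = genre_id.rpartition(".")
    return head if sep else None


def is_specialization(genre_id, ancestor_id):
    """True if genre_id lies strictly below ancestor_id in the hierarchy."""
    # Appending the dot avoids treating "3.10" as a child of "3.1".
    return genre_id.startswith(ancestor_id + ".")


print(parent_genre("3.1.1"))              # parent id of genre 3.1.1
print(genre_depth("3.1.1"))               # number of levels down from the root
print(is_specialization("3.1.1", "3.1"))  # is 3.1.1 below 3.1?
```

Because parentage is derived from the id alone, a recommender can generalize from a specific genre to its ancestors without loading the full taxonomy.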
To identify a program in TV-Anytime, the notion of a Content Reference Identifier
(CRID) is used, following an RFC standard [8]. The program above, for example,
is identified by the CRID “crid://bds.tv/13594946”. With such a CRID, which
always uniquely identifies a program, we can retrieve the program’s metadata.
TV-Anytime describes a Metadata Service (MS) which is responsible for the provision
of metadata. For every existing CRID, the MS can provide a metadata description,
provided that whoever created the program belonging to this CRID made this
description public. However, people cannot simply start creating CRIDs, as CRID
creation should be centrally controlled. Therefore, the TV-Anytime
specification describes the concept of a CRID Authority (CA), whose main purpose
is to oversee CRID creation. Anyone who wants to create a new program instance
first needs to request a new unique CRID from the CA. In turn, this CRID can
then be used both to refer to this program and to obtain its metadata via the MS.
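The division of labour between the CRID Authority and the Metadata Service can be sketched as follows. This is a toy in-memory model, not the TV-Anytime protocol: the class and method names, the counter-based numbering and the authority name example.tv are all assumptions introduced for illustration. Only the crid://authority/data shape of the identifier is taken from the example CRID in the text.

```python
import itertools


def split_crid(crid):
    """Split 'crid://<authority>/<data>' into (authority, data)."""
    prefix = "crid://"
    if not crid.lower().startswith(prefix):
        raise ValueError(f"not a CRID: {crid!r}")
    authority, _, data = crid[len(prefix):].partition("/")
    return authority, data


class CridAuthority:
    """Toy central authority that hands out unique CRIDs under one name."""

    def __init__(self, authority_name):
        self.authority_name = authority_name
        self._counter = itertools.count(1)  # hypothetical numbering scheme

    def new_crid(self):
        return f"crid://{self.authority_name}/{next(self._counter)}"


class MetadataService:
    """Toy metadata service: returns a description only if it was published."""

    def __init__(self):
        self._public = {}

    def publish(self, crid, metadata):
        self._public[crid] = metadata

    def lookup(self, crid):
        return self._public.get(crid)  # None if nothing was made public


ca = CridAuthority("example.tv")  # made-up authority name
ms = MetadataService()
crid = ca.new_crid()
ms.publish(crid, {"title": "All Stars"})
print(crid, ms.lookup(crid))
print(split_crid("crid://bds.tv/13594946"))
```

Keeping issuance in one place is what makes the uniqueness guarantee cheap: the Metadata Service never has to check for collisions, it only maps identifiers that already exist to whatever description their creator chose to publish.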
TV-Anytime Phase II
Currently, an important evolution in data retrieval is that different pieces (or sets)
of content are being linked together. Whether this is done via similarity properties
for recommendations (e.g. Amazon’s “Maybe you are also interested in ...”) or via
clustering (e.g. connecting all episodes of Friends), it all serves the need for proper
navigation through structured information. TV-Anytime also accommodates these
kinds of data structuring via its packaging concept, described in Phase II of the
specification. A package is an interconnected structure where each piece of content
is referred to by a CRID, which can here be used for several purposes. Besides
identifying program instances, CRIDs are also used to define locators, which give
the actual location where the content is stored, or to refer to some other set of
CRIDs. The TV-Anytime package is thus a structured collection of related CRIDs.

Fig. 1 Package structure
The data model of a package adopts the multi-level structure of the MPEG-21
Digital Item Declaration Language [6], i.e., a container-item-component structure,
with some extensions.
For example, a language course structured as a package could be organized and
divided into chapters and sections, where each chapter or section is identified by a
CRID. Figure 1 shows an example “Short break in Paris” language learning package
consisting of three exercises, one of which has additional video clips. The
content elements are not stored in the package itself, but are referenced using locators.
Thus one part can be distributed on a disc and another via IP. The main content of
the language course could for instance be on a disc that the user has bought, while
extra interactive content and a trailer for the next course may be distributed via IP.
This packaging structure is very dynamic, since parts can easily be modified or extended;
for example, the course could be extended with a new chapter by simply adding a
CRID reference. Since packages are complex collections of CRIDs, they need to
be resolved to discover which items they contain, as well as to get the locator(s)
when viewing the actual content. This resolving process is also performed by the
previously described CRID Authority. The response to such a resolving request is an
XML document containing a list of all CRIDs and locators to which the CRID resolved.
As can be seen in the figure, one object can reside in multiple locations, so that
in a given situation the most appropriate content location in terms of availability,
quality, connection speed, etc. can be chosen.
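The resolving step described above can be sketched as a recursive lookup. This is a minimal in-memory model under stated assumptions: a plain dictionary stands in for the XML resolution response, and every CRID, locator, scheme name (dvd, http) and function name below is invented for illustration and does not come from the TV-Anytime specification or from Figure 1.

```python
# Hypothetical resolution table: a CRID resolves either to further CRIDs
# or to one or more locators. All identifiers and URLs here are made up.
RESOLUTION = {
    "crid://example.org/course": [
        "crid://example.org/ex1",
        "crid://example.org/ex2",
    ],
    "crid://example.org/ex1": [
        "dvd://disc1/ex1",
        "http://media.example.org/ex1",
    ],
    "crid://example.org/ex2": ["http://media.example.org/ex2"],
}


def resolve(crid, table, seen=None):
    """Recursively resolve a CRID to a mapping {CRID: [locators]}."""
    seen = set() if seen is None else seen
    if crid in seen:  # guard against circular references
        return {}
    seen.add(crid)
    targets = table.get(crid, [])
    result = {}
    locators = [t for t in targets if not t.startswith("crid://")]
    if locators:
        result[crid] = locators
    for target in targets:
        if target.startswith("crid://"):
            result.update(resolve(target, table, seen))
    return result


def pick_locator(locators, preferred_schemes):
    """Return the first locator matching the scheme preference order."""
    for scheme in preferred_schemes:
        for locator in locators:
            if locator.startswith(scheme + ":"):
                return locator
    return locators[0] if locators else None


resolved = resolve("crid://example.org/course", RESOLUTION)
for item, locs in resolved.items():
    print(item, "->", pick_locator(locs, ["dvd", "http"]))
```

The preference list passed to pick_locator is where criteria such as availability, quality or connection speed would enter; here it is reduced to an ordering of schemes to keep the sketch short.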
Semantically Enriched Content Model
The content metadata described previously is fundamental for searching and
filtering of content. However, we imagine that, due to the potentially vast amount of
content, it is not enough to simply describe and classify the content; there must
also be more intelligent ways of handling it. We therefore propose adding semantic