1. Introduction

Electronic commerce and the Internet are changing the way information about
customers is gathered and used. Unfortunately, most of the changes have
resulted in the reduction of consumer privacy. The ease of processing,
obtaining and transmitting information has made both trading in data and
collating information from different sources easier, and information about
individuals is often collected and sold without their knowledge or consent.
The ease of breaking into data stores and of wiretapping has reduced the
security of stored and transmitted information. The transfer of data between
jurisdictions with different laws complicates the privacy problem further.
There is an increasing awareness among consumers of privacy violations and,
with it, an increasing resistance to going along with this dilution of
privacy. This makes the legal position of the data collector extremely
fragile.

The potential of e-commerce in digital assets makes the privacy problem
even more acute. Electronic tracking and user authentication make the
gathering of extremely granular, personally-identifiable digital asset usage
information a simple task, and increase the legal liability of the data
collector. In particular, those who benefit from the collection of this
information, as well as those who depend on the collection of this information
to prevent contract circumvention and thus determine fraudulent use of digital
assets, are vulnerable. It is not necessary to compromise consumer privacy to
prevent fraud, and, to be successful, DRM systems and frameworks should not
assume it is.

The class action suit against Real Networks, brought because its Real Media
Jukebox assigned a unique identifier, and hence tracking potential, to each
installation [1], and the tremendous negative publicity Intel received for
giving unique numerical identities to individual Pentium processors [2], are
only the first examples of the impact of privacy concerns. A number of
attempts to break the security of rights enforcement systems were initiated
because of growing public awareness of being `watched' by these systems. To
reduce their legal liability, and to succeed among consumers whose privacy
awareness is growing dramatically, DRM systems need to protect the rights of
consumers along with those of content providers. The very technology used to
protect content provider rights can, and should, be used symmetrically to
protect consumer privacy.

Privacy concerns are extremely important for W3C. P3P is beginning to gain
positive publicity, and Microsoft [3] has committed to implementing tools
for setting P3P preferences in the next versions of its browser. A DRM
standards effort from W3C that does not address privacy will have two-fold
negative impact: it will handicap the DRM standard itself, and dilute the
credibility of P3P. On the other hand, including privacy in a DRM effort will
enhance the case of both P3P and the DRM standard.

Current rights management systems focus on the rights of the content
provider, who is, from this point of view, the only first-class participant in
the systems. Privacy protection schemes exist that would enable the
protection of consumer rights while also protecting those of content
providers. We propose that the W3C provide a rights management framework that
is inclusive of these technologies and thus includes the consumer as a
first-class participant. Details of what this means follow. Section 2 of this paper
addresses specific privacy infringement possibilities in DRM systems and ways
of addressing these. Section 3 briefly mentions existing privacy technologies
that address some of the privacy issues mentioned in section 2, and section 4
describes a couple of example outcomes of a W3C DRM standard that would
address privacy.

2. Privacy infringement and consumer as first-class participant

A system that treats the consumer as a first-class participant
is defined as one in which:

- Personal/customer profiles are assets in the system, with
associated ownership, access rights, and rights and descriptive
metadata.

- Identity is part of the personal profile.

- Proof of identity, insofar as it involves revelation of the
profile, or enables its revelation through the use of unique
identifiers, is a trade in an asset whenever the information
revealed exceeds the minimum required with current technology.

- All transactions in the personal profile are explicit and carried
out with consumer participation (even when through default
settings).

The rest of this section elaborates on specific implications
of the above more abstract description of what is meant by
first-class participation.

There are two essential steps in current rights management
systems that violate the privacy of the consumer or, in B2B
situations, the commercial buyer. The first is the consumer/buyer
authentication step. This step establishes who the buyer is, and
also establishes a unique identifier for the buyer. The unique
identifier can thereafter be used to collate information about
the buyer obtained from the current transaction with all kinds of
other information divulged by the buyer using the same
identifier. The very requirement of this step prevents the
possibility of anonymous browsing [4]. The
second step that violates privacy is the tracking step. The
amount and quality of tracking information that can be generated
for digital media differs by many orders of magnitude from that
generated for physical media, and it can be very granular and
accurate. A usage log for a single user can itself be a fairly
valuable digital asset, often more valuable than the asset whose
use it logs.

The justification provided for user authentication and
tracking is that they form the fraud prevention mechanism of
current rights management systems. If a user identifies herself
and agrees to a contract, she can later be sued if tracking
indicates she has violated the contract. While this is true, it
is not the only way in which fraud can be prevented, and fraud
need not be highly prevalent in systems with more privacy. The
literature on electronic cash is rife with ways of preventing
fraud while retaining degrees of anonymity - for a great overview
and critical review that separates the anonymous from the
not-so-anonymous, see [5].

There is no doubt that both user identification and the
generation of user profiles can provide tremendous value beyond
fraud prevention, to both the consumer and the content
provider. For example, the detailed information can be used in
provider. For example, the detailed information can be used in
pay-per-view business models; it can be fed back into pricing
models; it can be used for highly targeted marketing; and it can
also be used for efficient classification and associated search
and retrieval, providing dramatic benefits to both the consumers
and the sellers of media assets and associated services.
Tracking of digital media is also useful in a closed digital
media publishing system (like a commercial printing workflow)
where the players may be assumed to be trusted and payments are
made based on the amount of usage of individual assets. In highly
trusted, closed systems, this might be the only expression of
rights management.

The value of tracking and user identification is considerably
diluted, however, when the consumer is not
allowed to participate in the determination of the degree of tracking, and
when he is not allowed to control the degree of anonymity allowed in the
system. While we do not propose allowing only the consumer to
determine these, they should not be established solely by the
needs and assumptions of the content provider as they are
today.

The focus of DRM systems needs to change to include the
consumer as a first-class participant. This implies the
following:

- User authentication should not assume that the consumer does not
wish to be anonymous. The consumer should be allowed to choose
from a range of methods with different degrees of anonymity; the
maximum extent of anonymity allowed by the system should be
determined by technical feasibility; and whether the more
anonymous methods are used should be left as a choice for the
consumer and the content provider.

- Rights clearing should not assume that the consumer can be
tracked to any degree; the consumer should be allowed to
participate in setting the degree of tracking.

- Consumer profiles should be treated as consumer assets in the
system.

3. Existing (relevant) privacy protection technology

As in rights management, privacy technology can be thought of as
(policy/contract) expression technology and (policy/contract) compliance
technology. While W3C may not be an appropriate body for the details of
compliance technology, it has an impressive history in expression technology
for other applications (P3P, RDF, XML, HTML).

There are some aspects of compliance technology that cannot be ignored,
however, because they are interwoven very finely with rights management
protocols. A good example is anonymity. A rights management system can be
built on the assumption that each user will present a public key, which is
usually closely linked with personal identity, or it can be built on the
assumption that a user will reveal only the minimum information required to
prevent fraud. The latter allows the user to volunteer any further
information, perhaps in return for a discount from the seller. It also
enables the use of the protocol by those users who wish to retain more
anonymity.

3.1 Anonymity

Anonymity may be thought of as protection of the unique identifier
associated with a user. Different degrees of anonymity are required for
different applications, and by different users. It is important to allow
varying degrees of anonymity. Example existing schemes are:

3.1.1 Trusted Third Party:

This kind of anonymity implies the use of a trusted screening party as a
mediator. The trusted party strips information passing through it of any
identifiers that can be used by outsiders. It is not very strong anonymity
because all the information is available to the third party. The third party
may encrypt the information with the user’s public key so that only the user
may access it thereafter, thus preventing even the third party itself from
accessing the data. Even so, the third party knows that information was
generated, when it was generated, and between what two parties. This kind of
anonymity is broken if the third party reneges on the understanding that the
information held is private, and shares/sells the unencrypted information.
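The screening behavior described above can be sketched in a few lines of
Python. The field names and the session-token scheme here are hypothetical,
chosen purely for illustration:

```python
import secrets

# Field names the mediator treats as identifying; hypothetical, for illustration.
IDENTIFYING_FIELDS = {"name", "email", "ip_address", "device_id"}

def screen(message):
    """Forward a message with all identifying fields stripped.

    The mediator attaches a fresh opaque token so that it can route
    replies back to the user without exposing who the user is.
    """
    forwarded = {k: v for k, v in message.items() if k not in IDENTIFYING_FIELDS}
    forwarded["session_token"] = secrets.token_hex(8)
    return forwarded

request = {
    "name": "Alice",
    "ip_address": "10.0.0.7",
    "payload": {"asset_id": "track-42", "operation": "play"},
}

screened = screen(request)
assert "name" not in screened and "ip_address" not in screened
assert screened["payload"] == {"asset_id": "track-42", "operation": "play"}
```

As the text notes, this is weak anonymity: the mediator itself still sees the
full request, so the scheme depends entirely on the third party honoring its
promise not to share the data.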

3.1.2 Nyms:

This kind of anonymity is slightly stronger than screening and can be used
in association with it. It prevents privacy violation by not allowing the
composition of data from different sources/sessions to compile a composite
personality. A user maintains a number of keys instead of simply one key and
uses different keys for different transactions/merchants/sessions. There is no
one unique identifier associated with the user. Hence, for example, the
user's profile with amazon.com cannot be merged with his profile at hp.com,
preventing complete identities from being developed by collaboration among
merchants. At the same time, this scheme allows the user to maintain a profile
with an individual merchant - the profile itself can be very beneficial to the
user because it helps in the generation of targeted marketing that can be very
consonant with the user's tastes.
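A minimal sketch of the multiple-key idea: the user derives a stable but
merchant-specific identifier from a single master secret, so profiles held by
different merchants cannot be linked. This sketch derives only the
identifiers; a full nym scheme, as described above, would use a separate key
pair per merchant rather than a bare HMAC tag:

```python
import hashlib
import hmac

def nym_for(master_secret, merchant):
    """Derive a stable, merchant-specific pseudonym from one master secret."""
    return hmac.new(master_secret, merchant.encode(), hashlib.sha256).hexdigest()

master_secret = b"users-long-term-secret"   # known only to the user

nym_amazon = nym_for(master_secret, "amazon.com")
nym_hp = nym_for(master_secret, "hp.com")

assert nym_amazon != nym_hp                                # profiles cannot be linked
assert nym_amazon == nym_for(master_secret, "amazon.com")  # but each is stable
```

Because the derivation is deterministic, the user keeps the benefit of a
per-merchant profile while denying merchants any common identifier to merge
on.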

3.1.3 Proof of Knowledge protocols:

Stefan Brands of Zero Knowledge Systems has developed a number of
schemes that provide remarkably strong anonymity while helping prevent fraud.
These schemes build on earlier exceptional work by David Chaum [8]. A very good review of these and
other schemes may be found in [5].
The essential idea of these schemes is to enable the use of tokens that
contain the information required to carry out a transaction. These tokens may
be electronic cash tokens, symbolizing a certain amount of money, or vouchers
such as those required for rights management transactions. The schemes provide
protocols to prove possession and honest use of the token without requiring
the disclosure of additional information, and enable simultaneous anonymity
and fraud prevention to degrees not previously possible. Current
public-key-based rights management protocols are special cases of these
protocols, but they make assumptions that the more general protocols do not,
and hence cannot incorporate
them. We will refer to these protocols as Proof of Knowledge (POK) protocols
in this paper. We use the term for both the most general case that subsumes
all others including protocols based on PKI and SPKI (Simple Public Key
Infrastructure), as well as for the specific strong anonymity technologies
built around the protocols of [5]. It
will be clear from the context which of these we mean.
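To illustrate the flavor of such protocols, the sketch below implements the
classic Schnorr identification protocol, a basic proof of knowledge of a
discrete logarithm. It is not the specific construction of Brands or Chaum,
and the toy parameters are far too small for real use; real deployments use
large primes or elliptic-curve groups:

```python
import secrets

# Toy parameters: p = 2q + 1 with q prime; g generates the order-q subgroup.
q = 1019                 # small prime, illustration only
p = 2 * q + 1            # 2039, also prime
g = 4                    # a quadratic residue mod p, so its order is q

# Prover's long-term secret x and public key y = g^x mod p.
x = secrets.randbelow(q - 1) + 1
y = pow(g, x, p)

# --- One round of the Schnorr identification protocol ---
# 1. Commitment: prover picks random r and sends t = g^r mod p.
r = secrets.randbelow(q - 1) + 1
t = pow(g, r, p)

# 2. Challenge: verifier picks a random challenge c.
c = secrets.randbelow(q)

# 3. Response: prover sends s = r + c*x mod q.
s = (r + c * x) % q

# 4. Verification: g^s must equal t * y^c (mod p).
#    The verifier learns that the prover knows x, but not x itself.
assert pow(g, s, p) == (t * pow(y, c, p)) % p
```

The verifier is convinced the prover knows the secret behind the public key
without the secret, or any other identifying information, ever being
transmitted.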

The anonymity technologies begin to enable different ways of looking at
identity, including, specifically, the strong connections between personal
profile revelation and proving identity. Parts of the personal profile are
revealed to identify oneself. Because the personal profile is an asset,
these parts are revealed carefully and the degree of revelation is
explicit.

3.2 Technology for Expression

In addition to the anonymity technologies of various degrees, usage
information can be made available at different levels of granularity by the
asset viewer. A standard vocabulary for the degree of granularity is needed.
In its absence, one can think of the granularity as being descriptive metadata
about a personal profile asset.

Expression technology for the access rights to a personal profile is a
large part of the technical contribution of P3P. Its future looks positive,
and time will tell how well consumers take to it as a mode of expression
for privacy policy.

Finally, tracking information also needs to be expressed, and no standard
vocabulary for this exists yet either.

4. Example outcomes of the workshop

To illustrate the use of the ideas discussed earlier in the paper, we
present a couple of example outcomes.

4.1 A basic framework that is consistent with:

- User authentication with
  - degrees and types of anonymity, for example: PKI, SPKI, nyms,
    anonymization through a trusted third party, or POK
  - choice of when to reveal identity, and to what extent

- Usage tracking with
  - extent of tracking (what is being tracked?)
  - controlled revelation of usage data: specification of the granularity
    level of the usage data (in what detail is it being tracked?)

- Rights clearing with
  - degree of usage and rights information staying with the client vs. the
    rights clearing agency (how much of the tracking information is sent
    back to the clearing agency, and at what level of aggregation?)

4.2 A fulfillment protocol including:

- how often the rights clearing agency is contacted with respect to asset
  access

- granularity of divulged usage logs
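The granularity choice in such a fulfillment protocol can be made concrete
with a small sketch. The log fields and granularity names below are
hypothetical, invented only to show how the same raw log can be divulged at
different levels of detail:

```python
from collections import Counter

# A raw usage log: one entry per access event (fields are hypothetical).
raw_log = [
    {"asset": "book-1", "unit": "chapter-3", "time": "2001-01-05T10:00"},
    {"asset": "book-1", "unit": "chapter-3", "time": "2001-01-05T10:20"},
    {"asset": "book-1", "unit": "chapter-7", "time": "2001-01-06T09:00"},
    {"asset": "song-9", "unit": "full",      "time": "2001-01-06T21:00"},
]

def divulge(log, granularity):
    """Produce the report actually sent to the rights clearing agency.

    granularity 'event'  -> the full log (least private)
    granularity 'asset'  -> access counts per asset only
    granularity 'totals' -> a single aggregate count (most private)
    """
    if granularity == "event":
        return log
    if granularity == "asset":
        return dict(Counter(entry["asset"] for entry in log))
    if granularity == "totals":
        return {"accesses": len(log)}
    raise ValueError(granularity)

assert divulge(raw_log, "asset") == {"book-1": 3, "song-9": 1}
assert divulge(raw_log, "totals") == {"accesses": 4}
```

Under a consumer-as-first-class-participant framework, the granularity
argument would be negotiated between consumer and clearing agency rather than
fixed by the content provider.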

4.3 An example outcome with respect to the main HP position paper [9].

The main HP position paper by John Erickson et al. [9] proposes a Policy and Rights
Expression Platform (PREP) that provides `a model defining open interfaces
between three architectural levels of abstraction: rights expression
languages, rights messaging protocols and mechanisms for policy enforcement
and compliance' [section 4, 9]. In
this section we propose aspects of these levels of abstraction that would be
useful from the point of view of privacy.

A personal profile would be an asset in the system, with ownership, access
rights and descriptive as well as rights metadata associated with it.

Rights Expression Languages: A semantic layer that all rights
expression languages can be translated into should address the needs of
privacy vocabularies and syntaxes including vocabularies for profile
description (vocabulary not for the profile itself but for metadata about
the profile such as the level of granularity), access rights to profiles
(example P3P, XrML), degrees of anonymity, and degrees of tracking. As far
as possible, this layer should not divide profiles and media assets into
two groups, and should instead enable possible combinations of these into
composite documents.

Rights Messaging Protocol: The rights messaging protocol
should not require user identification with a key as in traditional
PKI. It should, instead, allow a choice of identification from the choices
of 4.1 and, in general, allow for POK with or without a third party
mediator (which generalizes all the options of 4.1). This is consistent
with the notion mentioned in section 3.1 that detailed identity is
synonymous with personal profile, which is an asset and hence revealed
carefully and explicitly and not frivolously.

Mechanisms for Policy Enforcement: The bindings mentioned in
[9] between elements of the
language layer and compliance mechanisms should not depend on traditional
identity, but on POKs. Further, these bindings should enable privacy
compliance for personal profiles. Authorization tokens for rights
enforcement at the server end should, again, use POK instead of
traditional identity. Secure containers should accommodate not only media
assets (where there is a minimum granularity for access: in an electronic
book, for example, a paragraph could be the smallest allowed unit for
granular specification of rights) but also personal profiles (where
granularity is a very different issue).

5. Conclusions

We have described the privacy invasions possible in rights management
systems and explained why a W3C DRM proposal ought to avoid these. We propose
that, instead, a W3C DRM standard treat the consumer as a first-class
participant along with the content provider, in a symmetric system which
treats consumer identity and personal profiles as assets. This would protect
content providers from the legal liabilities of privacy invasion, promote
success among consumers whose privacy awareness increases by the day, and
enhance the credibility of P3P. We have surveyed a number of
privacy-protecting technologies and pointed out that it is not necessary to
disregard privacy to prevent fraud. While profile and identity revelation
offer important benefits to both consumer and content provider, these
revelations should not be assumed, and there should be an explicit mechanism
for them. Further, the consumer should have control over what is
acceptable. We provide example outcomes of the workshop that would be in
keeping with this vision, including specific suggestions with respect to the
PREP framework proposed in [9].