A Nutrition Label for Privacy

Carnegie Mellon University

School of Computer Science
pkelley, jbresee, lorrie@cs.cmu.edu

Microsoft
Trust User Experience (TUX)
roreeder@microsoft.com

ABSTRACT
We used an iterative design process to develop a privacy label that presents to consumers the ways organizations collect, use, and share personal information. Many surveys have shown that consumers are concerned about online privacy, yet current mechanisms to present website privacy policies have not been successful. This research addresses the present gap in the communication and understanding of privacy policies, by creating an information design that improves the visual presentation and comprehensibility of privacy policies. Drawing from nutrition, warning, and energy labeling, as well as from the effort towards creating a standardized banking privacy notification, we present our process for constructing and refining a label tuned to privacy. This paper describes our design methodology; findings from two focus groups; and accuracy, timing, and likeability results from a laboratory study with 24 participants. Our study results demonstrate that compared to existing natural language privacy policies, the proposed privacy label allows participants to find information more quickly and accurately, and provides a more enjoyable information seeking experience.

1. INTRODUCTION
Website privacy policies are intended to assist consumers. By notifying consumers of what information will be collected, how it will be used, and with whom it will be shared, these policies should, in theory, enable consumers to make informed decisions. These policies are also meant to inform consumers of the choices they have in managing their information: whether use of their information or sharing with third parties can be limited, and if it is possible to request modification or removal of their information.

However, Internet privacy is largely unregulated in the United States (except for children's privacy and some sector-specific regulations) and the privacy policies created by companies are frequently difficult for consumers to understand. Online privacy policies are confusing due to the use of specific terms that many people do not understand, descriptions of activities that people have difficulty relating to their own use of websites, a readability level that is congruent with a college education, and a noncommittal attitude towards specifics [14]. These issues are complicated by companies creating policies that are tested by their lawyers, not their customers. It has further been established through numerous studies that people do not read privacy policies [21] and make mistaken assumptions based upon seeing that a site has a link to a privacy policy [26]. A recent study estimated that if consumers were somehow convinced to read the policies of all the companies they interact with, it would cost 365 billion dollars per year in lost productivity [20].

In addition, research has shown that consumers do not actually believe they have choices when it comes to their privacy. Based solely on expectations, they believe there are no options for limiting or controlling companies' use of their personal information [16]. This is a finding that we again validated in our work.

In short, today's online privacy policies are failing consumers because finding information in them is difficult, consumers do not understand that there are differences between privacy policies, and policies take too long to read. We set out to design a clear, uniform, single-page summary of a company's privacy policy that would help to remedy each of these three concerns.

This paper first presents related work describing standardization efforts in other domains in which companies present information to consumers to aid in their decision making, as well as early standardization efforts for privacy policies. Our approach comes from a broad survey of work that provides consumers with information: nutrition labeling, drug facts, energy information, and most recently work commissioned by the Federal Trade Commission to create a standard financial privacy notice. We discuss our iterative design approach, including focus group testing, as we developed and refined our information design over several months. Finally, we describe our 24-participant laboratory study and discuss the results of our initial evaluation.

2.1 The Nutrition Facts Panel

Copyright is held by the author/owner. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee. Symposium On Usable Privacy and Security (SOUPS) 2009, July 15-17, 2009, Mountain View, CA, USA.

In the United States, the nutrition label, seen in Figure 1, has become iconic after being mandated by the Nutrition Labeling and Education Act of 1990 (NLEA) [28]. In the last nineteen years, its increasing ubiquity has led to a number of studies examining the costs of adoption and the ability to inform and change consumer purchasing decisions.

Figure 1. The Food and Drug Administration's Nutrition Facts panel as regulated by the NLEA. Source: www.fda.gov

The sparse literature around the design of the nutrition label [3] focuses on the decisions made to simplify the information as much as possible for consumers. These decisions were made in part to address low literacy rates and the needs of older Americans. These guidelines include defining a zone of authority, providing quantitative information about nutrients, defining minimum font sizes, and equalizing labels across products by providing defined serving sizes and calculating percentages based on standardized daily amounts.

Surveys indicate that consumers would prefer that nutrition labels include more information. However, studies have shown that including more information would not actually be beneficial [10]. Studies conducted to examine the impact of the NLEA have found that it is the populations of people who are educated and already motivated to investigate nutritional information who benefit the most from nutrition labels [2][10]. Another study found that nutrition information had the greatest impact when there was a limited number of items from which to make a selection [24]. This result implies that the nutrition label made it easier to compare between a small set of items, allowing consumers to benefit through informed decision making. Studies have demonstrated that nutrition labels have an impact on consumer decision making, with some user-reported effect sizes up to 48% after the initiation of NLEA [10]. For most studies, however, the effect of the nutrition label is small and most studies focus on specific nutrients such as fat intake or specific products such as salad dressings. We are not aware of controlled studies that measure the impact of nutrition labels on consumer behavior over an extended period of time.

Other studies have found that the effects of providing calorie information (not a complete nutrition facts label) in restaurant menus are often very small and the effects may vary depending on the population studied. In a study of meal choices at a sandwich shop, Downs et al. found that if participants were given menus that included calorie information, they ordered meals with about 50 fewer total calories than participants who did not receive calorie information. However, the authors stated that this was an effect smaller than this study was powered to test. Nonetheless, they pointed out that if the finding proved reliable, it could be significant if it caused people to reduce their caloric intake by a similar amount for multiple meals each day. In a related study of food purchases at three New York City restaurants before and after a law went into effect mandating the posting of calorie information on menu boards, the authors found no effects of the legislation at two of the three restaurants. At the third restaurant they found a small effect. They noted that the effect was larger for dieters than for non-dieters, suggesting that the availability of label information may again be most useful to people who are already interested in the information provided by the label [9].

2.2 Other Privacy Notices

Layered Privacy Policies, a policy display format popularized by the law firm Hunton & Williams [25], involve a short form or summarized version of a privacy policy created using a step-by-step process. This summary has standardized headings for the policy information, but the information itself is provided by the company, in free-form natural language text.

The US Federal Trade Commission (FTC) is currently leading an effort to develop a standardized financial privacy notice. The Kleimann Group used an iterative design process to develop a prototype notice for the FTC, focusing on user comprehension, allowing users to identify differences in sharing practices, and compliance with the regulations surrounding financial privacy notices specified in the Gramm-Leach-Bliley Act. Over a 12-month period the Kleimann Group iterated on several design prototypes, conducting focus groups and diagnostic usability testing [16]. Our iterative design approach followed a similar process of testing labels for comprehension and then overall design through focus groups.

The Kleimann Group final prototype consists of four parts: the title, the frame, the disclosure table, and the opt-out form. The disclosure table, which actually displays the company's privacy practices, makes up the majority of our label. The rest of the Kleimann Group prototype was educational information to build a foundation of terms and understanding for the user [16].

More recently, the Levy-Hastak report was released, detailing the results of a 1032-participant mail/interview study [17]. The authors conclude that the table format performed the best on a diverse set of measures. Additionally, this success is attributed to the table providing a more holistic context for the particular sharing of each financial institution.

2.3 Other Labeling Programs

We also explored energy labeling programs from the European Union [12] and Australia [11], the US Consumer Product Safety Commission's toy and game warnings [8], and the US FDA Drug Facts label [29], to gain a broader understanding of practices used in designing and defining labeling requirements.

In general, the standards documents [7][12][28] are occupied with defining precise guidelines to describe compliance with the various labeling requirements. This includes point sizes of rules and text, allowable typefaces, allowable colors, and minimum sizes. In some instances, such as choking warnings on children's games, standards also include placement requirements.

Recently, a number of labels have been introduced to provide ratings to consumers on a fixed scale, focusing on a single metric or small number of metrics. The Australian Water Efficiency Labeling System (WELS) [32] and the British Food Standards Agency's Signposting (or Traffic Light) [13] use very small indicators with accompanying ratings. The WELS program uses an indicator with a possible score out of six blue stars. The Signposting initiative rates the quantities of fat, saturates, sugar, and salt in foods using a red, amber, green traffic light coloring system. Early research [2][18] has shown that Signposting enhances consumers' ability to evaluate products more accurately and surveys show that ninety percent of consumers find this type of label useful.

2.4 The Platform for Privacy Preferences

Due to the difficulties surrounding the use of text privacy policies, the World Wide Web Consortium created the Platform for Privacy Preferences (P3P) [30]. P3P is a standard machine-readable format for encoding the online privacy policy of a company or organization. Once this P3P policy has been provided, consumers must use a user agent to interpret it into something understandable. Unfortunately, widely available P3P user agents have limited functionality. These include the P3P policy processing elements of common web browsers and a few privacy-specific browser add-ons [6].

To provide consumers with an active tool where they can investigate and explore the privacy policy of a website, earlier work from the CyLab Usable Privacy and Security Lab (CUPS) produced the P3P Expandable Grid. This user agent was based on one of the central Expandable Grid objectives of displaying a holistic policy view [22]. The interface was created to use the entire P3P specification, broken down by categories. An example of the grid is shown in Figure 2.

Figure 2. Our P3P Expandable Grid, an early attempt at a standardized information design for privacy policies. Due to its implementation of the entire P3P specification, its complexity prevented large performance gains.

The P3P Expandable Grid has two main parts: the header and the information display. In the header, there is a title, a legend that explains the 10 possible symbols (8 pictured) that may appear in the body of the grid, as well as expandable column headers that explain how that company uses data, and who they will share it with. Finally, in the top-right corner of the header is a button that toggles between showing and hiding information that isn't collected (i.e., hide rows that would be blank).

In the body, information is displayed in blocks that correspond to P3P Statements. Each block starts with a title and a short textual description (if available) and is followed by a hierarchy of expandable rows, which list what information this company collects. The symbols in each row show how that specific piece of information could be used or shared according to the policy. In this way we were able to show the entire depth of the P3P specification in a two-dimensional grid.

Based on an online survey of over 800 people in the summer of 2007, we found further evidence that people generally do not understand the information presented in privacy policies and also do not enjoy reading them. When comparing three formats: a standard natural language policy; PrivacyFinder, which is a simplified human-readable version based on a P3P policy and consisting mostly of bulleted lists; and the above version of the P3P Expandable Grid, we found that none of the three formats was pleasurable to read or easy to comprehend. Notably, we found the P3P Expandable Grid to be slightly worse than the other formats, both in enjoyment and comprehension [23].

3. DESIGN METHODOLOGY
This section elaborates on our iterative design process, presenting several prototype labels with benefits and criticisms, and highlighting where knowledge from other label designs was applied. Throughout this process we leveraged informal user feedback as well as focus groups, which are discussed in detail in Section 4.

3.1 Problems with the P3P Expandable Grid

Based on the analysis of the previously mentioned P3P Expandable Grid study results and a subsequent lab evaluation, we identified five major problems with the Expandable Grid [15]:

Many of the P3P labels are not clear to users. For example, "Profiling" and "Miscellaneous Data" are not terms that users encounter in the context of their use of websites.

The legend has a large number of symbols including multiple symbols for expansion (depending on directionality), which the user may not understand.

Multiple statements that may be related to the same types of information in a P3P policy are displayed separately, possibly requiring the user to check multiple rows to answer a single question.

The "Hide Used Information" button in the top right only condenses unused rows, not columns.

Rows with a plus symbol may be expanded; however, many users (40.7%) never expanded any data types. By not expanding data types, users never saw some important parts of the policy [23].

With these initial five problems in mind we abstracted several general principles from the nutrition labeling literature [3][4][27][28].

Putting a box around the label identifies the boundaries of the information, and, importantly, defines the areas that are regulated or should be trusted. This is a common issue when the label is placed in close proximity to other information, but may not be as significant an issue online.

Using bold rules to separate sets of information gives the reader an easy roadmap through the label and clearly designates sections that can be grouped by similarity.

Providing a clear and boldfaced title, e.g., "Privacy Facts," communicates the content and purpose of the label specifically and assists in recognition.

While much of the labeling literature also focuses on quantifiable properties, such as amounts of fats or fiber or percentages of active ingredients or calories from a standardized expected daily value, privacy policies typically do not include quantifiable measures, and the P3P specification includes no quantifiable fields. The Kleimann Group dealt with this lack of quantifiable information by moving to binary Yes/No statements, which they found to be readily understood by focus group participants.

Figure 3. Our Simplified Label, an early attempt at a privacy label.

3.2 The Simplified Label

Our next design, following the P3P Expandable Grid, was the Simplified Label. In creating the Simplified Label, we used Yes/No statements and applied the three general principles discussed above. The Simplified Label is shown in Figure 3. (Note: as with each of the screenshots shown below, this is one of many variants of a similar vein. We show only one of each that we believe is representative of the entire series.)

While we made visual changes including adding a title and subhead, adding bold lines, and simplifying the table view, the most significant change is a reduction in complexity. Two changes contributed most to simplifying the label: eliminating P3P statement groupings and eliminating the use of P3P data hierarchies. These changes are detailed below.

3.2.1 P3P Statements

P3P specifies data groupings called STATEMENT elements [31]:

"The STATEMENT element is a container that groups together a PURPOSE element, a RECIPIENT element, a RETENTION element, a DATA-GROUP element, and optionally a CONSEQUENCE element and one or more extensions. All of the data referenced by the DATA-GROUP is handled according to the disclosures made in the other elements contained by the statement."

This means that all of the collected information in a statement can be used for certain purposes, and can be shared in the same way. A useful model is to think of P3P as consisting of multiple triplets of information, {data, purpose, recipient}. We do not include retention because our analysis of over 5000 unique P3P policies collected by the Privacy Finder search engine [6] shows that the majority of P3P policies state that data is retained indefinitely. In cases where a website has a different data retention policy we include a note at the bottom of the label.

Due to P3P information naturally falling into these triplets, a display such as the list in Figure 3 suffers some information loss. For example, it is possible that contact information is used for marketing exclusively and purchase information is used for profiling purposes exclusively. Or it is possible that both contact and purchase information could be used for either purpose. By removing the triplets and only displaying a list, we lose that distinction. This tends to make privacy policies appear more permissive than they actually are.

A P3P policy may also have multiple statements. In the P3P Expandable Grid, statements were displayed in a numbered list. In the Simplified Label we have merged multiple statements into a single list. For example, consider a policy where the first statement of a policy was about cookies and the second dealt with web activity. In the P3P Expandable Grid we would list the categories twice. The first time only cookies would be highlighted; the second, web activity. With the Simplified Label we show the information from all of the statements in a single list.
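The information loss described above can be illustrated with a minimal sketch. It assumes a toy policy with two statements; the dictionary layout, the category names, and the helper functions `triplets` and `flatten` are hypothetical and are not part of P3P or of our implementation.

```python
from itertools import product

# Hypothetical two-statement policy. Within a statement, every listed data
# type may be used for every listed purpose and shared with every recipient.
statements = [
    {"data": {"contact"}, "purposes": {"marketing"}, "recipients": {"ours"}},
    {"data": {"purchase"}, "purposes": {"profiling"}, "recipients": {"ours"}},
]


def triplets(stmts):
    """Expand statements into the permitted (data, purpose, recipient) triplets."""
    allowed = set()
    for s in stmts:
        allowed |= set(product(s["data"], s["purposes"], s["recipients"]))
    return allowed


def flatten(stmts):
    """Merge all statements into three independent lists, as a list-style label does."""
    return {
        key: set().union(*(s[key] for s in stmts))
        for key in ("data", "purposes", "recipients")
    }


exact = triplets(statements)
flat = flatten(statements)

# A reader of the flattened lists cannot rule out any combination of entries.
implied = set(product(flat["data"], flat["purposes"], flat["recipients"]))

# Combinations the list display suggests but the policy never stated.
extra = implied - exact
```

Here `exact` holds two triplets while `implied` holds four; the two in `extra` (contact data used for profiling, purchase data used for marketing) are the sense in which the merged list looks more permissive than the policy it summarizes.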

3.2.2 P3P Data Hierarchies

P3P allows for two interchangeable and different hierarchies of data (collectable information). The more commonly used is categories: a list of 17 types of information that companies can collect. When a category is specified a company reserves the right to collect any information that falls under that category (e.g., Physical Contact Information includes name and telephone number). The other data hierarchy, the base data schema, includes every data element that can be specified using P3P, hierarchically arranged (e.g., NAME is a child of USER and includes GIVEN[name], MIDDLE[name], and FAMILY[name]). Further complicating the situation, every element belongs to one or more categories (NAME is a member of both demographic data and physical contact information because one's GIVEN name is part of their contact information while one's FAMILY name provides demographic information).

In the original P3P Expandable Grid, each category was displayed in its entirety in each statement, with each element of the base data schema hierarchically arranged as children. This led to nearly 800 elements per category (if fully expanded). To simplify, we decided to display only data categories. While this affords us a list of possible information that can fit on a page, it suffers when companies state they will only collect specific items. For example, Contact Information would be displayed similarly if a company collected a consumer's name, their postal address, their telephone number, or all of the above information. One way of preserving some of this detail would be to display the specific data elements a company collects when a user clicks on the name of a category.
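The trade-off of showing only categories can be sketched as follows. The element-to-category table is a deliberately tiny, hypothetical stand-in for the base data schema, and `rows_shown` is an illustrative helper rather than code from our prototype.

```python
# Hypothetical, heavily abbreviated element-to-category mapping. The real base
# data schema is far larger; these three entries are assumptions for illustration.
ELEMENT_CATEGORIES = {
    "name": {"contact information", "demographic data"},
    "postal address": {"contact information"},
    "telephone number": {"contact information"},
}


def rows_shown(collected_elements):
    """Return the category rows a category-only label would mark as collected."""
    rows = set()
    for element in collected_elements:
        rows |= ELEMENT_CATEGORIES[element]
    return sorted(rows)


only_phone = rows_shown({"telephone number"})
only_postal = rows_shown({"postal address"})
everything = rows_shown(set(ELEMENT_CATEGORIES))
```

In this sketch `only_phone` and `only_postal` are identical, and the "contact information" row is marked in every case, so the label cannot tell a reader which specific items are collected.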

3.2.3 Design Notes

To further reduce complexity, information that is not collected or purposes that are not mentioned in a particular policy are not shown. The Show/Hide information button has also been removed; thus, there is no way to see uncollected information.

3.3 The Simplified Grid

While the above label is extremely simple and closely follows a pattern established by the nutrition facts panel and the financial privacy notice, we felt that it sacrificed too much detail.

The goal of our next design was to bring back more of the detailed information that privacy policies can provide without overwhelming users. To do this we decided to try to find a happy medium between our Simplified Label and the best aspects of the original P3P Expandable Grid. We adopted a two-dimensional grid layout, as shown in Figure 4. We call the resulting design the Simplified Grid.

Figure 4. Our Simplified Grid in which the grid concept is reintroduced to the label.

3.3.1 Simplifying the P3P Expandable Grid

While the P3P Expandable Grid was not successful, this failure was not a result of the tabular display. Also, as discussed above, due to the nature of P3P Statements each reduction in dimensionality causes a loss of information and we wanted to minimize information loss to most benefit consumers. With the reintroduction of the two-dimensional layout several changes were made. As mentioned in 3.2.2 we only used Data Categories to show what information companies collect, but we also simplified recipients and purposes.

Purposes, of which there are 12 specified¹ in the P3P specification, were grouped similarly to the categories in the P3P Expandable Grid. However the sub-categories were removed. Thus, Administration, Current Transaction, and Tailoring are all grouped under the title "Provide service and maintain site." We split the four P3P profiling-related purposes into two categories, based on whether that profiling is linked to the user's identity or performed anonymously. However, during our user testing, this distinction proved unclear to users.

Of the 6 recipients specified by P3P², Ours and Delivery are both never shown, as it is implied that the given company will always maintain the information. Other Companies merges the three remaining types of recipients, distinguished by their own privacy practices. We decided the importance of this column was to show whether any sharing with other companies was taking place. Public forums remained unchanged.

Finally, we have defined a maximum width of 760px for this label and all following designs in this paper. One important consideration was that the privacy label design be printable to a single page and viewable in the standard width of today's internet browsers.

¹ The P3P specification specifies 12 purpose elements: Current,

² The P3P specification specifies 6 recipient elements: Ours, Same, Other-recipient, Delivery, Public, Unrelated [31].
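The recipient grouping in Section 3.3.1 amounts to a small lookup table, sketched below. The recipient names follow the footnote listing the six P3P recipient elements; the dictionary and the `sharing_columns` helper are hypothetical illustrations, not our implementation.

```python
# Mapping from P3P recipient elements to Simplified Grid columns, as described
# in Section 3.3.1. None means the recipient is never shown on the label.
COLUMN_FOR_RECIPIENT = {
    "Ours": None,
    "Delivery": None,
    "Same": "Other Companies",
    "Other-recipient": "Other Companies",
    "Unrelated": "Other Companies",
    "Public": "Public forums",
}


def sharing_columns(recipients):
    """Return the set of sharing columns that should be marked for these recipients."""
    columns = {COLUMN_FOR_RECIPIENT[r] for r in recipients}
    columns.discard(None)  # hidden recipients mark no column
    return columns


internal_only = sharing_columns(["Ours", "Delivery"])
partners = sharing_columns(["Ours", "Same", "Unrelated"])
public_too = sharing_columns(["Other-recipient", "Public"])
```

Under this mapping a policy that names only Ours and Delivery marks no sharing column at all, while Same and Unrelated become indistinguishable under Other Companies; that is the detail this column gives up in exchange for a narrower grid.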

3.3.2 Symbols & Mixed Control

While you cannot opt in to or out of the trans-fat in your salad dressing, you might be able to have control over certain aspects of your information sharing on the internet. The Yes/No dichotomy advocated by participants in the Kleimann Group's studies works when there are only one, or maybe two, columns of information. Here we would have needed 8 columns and 10 rows of Yes/No information, which would have been visually difficult to parse.

Instead we again looked back to the P3P Expandable Grid and used symbols. However, while the P3P Expandable Grid had an array of 10 symbols, the Simplified Grid uses only four:

Exclamation Point: Data is collected and used in this way.

OUT (in a square): You can opt-out of this data use.

IN (in a circle): Your data will not be used in this way unless you opt-in.

Square and circle: You can opt-in or opt-out of some uses of this data.

Each of these four symbols was defined in a legend labeled "Understanding this privacy report" directly below the policy. The legend is another device borrowed from the P3P Expandable Grid; however, it has been moved below the policy.

Again, due to the way P3P uses data statements, it is possible that in some instances consumers might be able to opt-out of allowing their demographic information to be used for profiling, but in others it is required, or opt-in. The square and circle or "mixed choices" symbol attempts to convey this possibility; however, in our user testing it was found to be incomprehensible.

3.3.3 Visual Intensity

The Simplified Grid is the first iteration of our label to use visual intensity to provide a high-level indication of the quality of a given policy. Each of the four symbols has been colored such that darker symbols represent what could be more privacy-invasive practices. The use of intensity allows users to make quick visual comparisons that would not have been possible with text alone.

3.3.4 Testing

The most significant issue that arose in our testing was confusion over blank areas of the label. We thought that blank areas would clearly indicate information a company does not collect; after all, natural language policies typically leave out any mention of types of information the company does not collect. However, in testing, many participants were unclear on the meaning of the blank cells. Some inferred the accurate meaning that such information uses would not occur, but others thought it allowed the company free rein to do anything in those situations or that they simply had not yet decided their practices.

3.4 Final Proposed Privacy Nutrition Label

Our Privacy Nutrition Label, shown in Figure 5, is a direct descendant of the Simplified Grid. With the Privacy Nutrition Label, we sought to refine the strengths of the Simplified Grid by reducing clutter, introducing color, and simplifying symbols.

3.4.1 Types of Information Displayed

We made changes in the way we present data categories as rows in the table to better facilitate comparisons between policies and to reduce confusion about what data is being collected.

All of the P3P Data Categories are now represented in rows regardless of whether they are collected or not. For example, the label shown in Figure 5 indicates health & financial information are not collected (and thus not used or shared), but they have not been removed. Any policy displayed in this format will have exactly 10 rows, and the ordering will always be consistent. This allows two policies to be easily, visually compared side-by-side.

Participants in a focus group we conducted after making this change did not understand which information companies were not collecting. We indicated the information that was not collected with rows completely filled with minus symbols, but participants believed that companies collected every piece of information listed on the grid. One participant asked, "Why would they collect all that information if they're not going to do anything with it?" In the final prototype we grayed out the labels for data that companies did not collect, and we changed the minus symbol's description from "we will not use your information in this way" to "we will not collect or we will not use your information in this way." We also changed the row-heading label from "What we Collect" to "types of information." This change was made to highlight the fact that we now show even un-collected information and to reduce confusion about what was and was not being collected.

3.4.2 Symbol Changes & Color

In the Simplified Grid design, we marked types of information that companies collected and left other cells in the policy blank. However, half of the participants were afraid of the blank spaces; for instance, one said, "Nothing is mentioned. It is completely open-ended. These guys [the company] can modify these values." Therefore, in the final version we introduced a symbol to indicate that information was not collected or used.

Focus group participants found the mixed choices symbol confusing so we removed it. Instead we now display the symbol for the most invasive practice. For example, if in some circumstance one can opt-in and in another one can opt-out, we display the opt-out symbol.

We constrained our initial designs to grayscale to facilitate easy printing without loss of information and to reserve color for highlighting differences between a policy and a user's personal preferences (something we plan to implement later). However, feedback indicates that color seems to improve user enjoyment in reading the label, although we have not yet quantified this improvement. We selected the colors used in our label with care to accommodate viewers with color-blindness, allow for grayscale reproduction, and maintain the darker-is-worse high-level visual feedback discussed in Section 3.3.3.
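The rule of displaying the most invasive practice in each cell can be written as a short sketch. The text above only fixes that opt-out outranks opt-in; the full ordering used here, the practice names, and the `cell_symbol` helper are assumptions made for illustration.

```python
# Assumed invasiveness ranking, least to most invasive. Only "opt-out outranks
# opt-in" is stated in the text; the positions of the other two are assumptions.
INVASIVENESS = {
    "not used": 0,  # minus symbol: not collected or not used in this way
    "opt-in": 1,    # not used unless the consumer opts in
    "opt-out": 2,   # used unless the consumer opts out
    "used": 3,      # collected and used, no choice offered
}


def cell_symbol(practices):
    """Pick the single symbol for a grid cell from every practice that applies to it."""
    if not practices:
        return "not used"  # nothing stated for this cell
    return max(practices, key=INVASIVENESS.__getitem__)


mixed = cell_symbol(["opt-in", "opt-out"])       # the former "mixed choices" case
required_somewhere = cell_symbol(["opt-out", "used", "opt-in"])
empty = cell_symbol([])
```

With this ranking the former mixed choices case resolves to the opt-out symbol, matching the example in the text, and a cell with no stated practice receives the explicit not-used symbol rather than being left blank.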

3.5 Useful Terms

Even with the "understanding this privacy policy" legend in place there was still confusion over many of the terms used in the label. This was also a common issue during the development of the Kleimann Group's Financial Privacy Notice, and in response they developed what they call the "Secondary Frame." This portion of the prototype notice included both frequently asked questions and a series of extended definitions, which are: "[not] information as essential for consumers to have, but consumers often commented that they liked having it included." [16 p.27]

Our version of the "Secondary Frame" is a single-page handout of useful terms. Our useful terms information was informed by the Human Readable definitions included in the P3P 1.1 Working Group Note [31] and consists of seventeen definitions, one for each of the row and column headers. Some are straightforward, others more detailed. For example, the definition of telemarketing states: "Contacting you by telephone to market services or products," while the profiling definition is:

Collecting information about you in order to:

Do research and analysis

Make decisions that directly affect you, such as to display ads based on your activity on the site.

Information that the site collects about you may be linked to an anonymous ID code, or may be linked to your identity.

In future versions, clicking on or hovering over the headers could pop up these definitions.

4. FOCUS GROUPS
We held two hour-long focus group sessions to review the design and discuss participants' impressions and questions. We recruited focus group participants from the Carnegie Mellon University (CMU) Center for Behavioral Decision Research (CBDR) participant recruitment website. We paid participants $10 to participate in a 60-minute focus group.

The first focus group was composed of three female and seven male CMU students. The participants reacted positively to the Simplified Grid. For example, one participant stated, "This is more convenient than scrolling through reams and reams of paragraphs. I mean who reads them?" and another participant said, "I like the chart. [It's] better than long sentences." However, we found that some participants still had problems understanding privacy concepts. For example, one participant asked, "What is the difference between opt-in or opt-out?" and many others agreed that they did not understand this distinction. Additionally, many participants had trouble distinguishing different privacy concepts. Most participants were familiar with profiling, but did not understand the difference between "Profiling linked to you" and "Profiling not linked to you." Similarly, participants did not understand the different meanings of "cookies" and "unique identifiers." It was this vein of feedback that led to the inclusion of the useful terms definitions described in Section 3.5.

By asking participants to compare two policies, we found that participants could easily isolate and describe differences. Participants noticed that Policy A had more opt-in symbols and Policy B had more opt-out symbols. However, participants were not able to make accurate judgments about the policies. When we asked the participants to choose the company with whom they would prefer to do business, five of the ten participants chose Policy B: the company that collected and used more of their personal information.

Using the feedback from the first focus group, we initiated another series of rapid iteration and prototyping, which resulted in the final label prototype. Our second focus group compared the final Privacy Nutrition Label to the Simplified Label.

The second focus group was composed of four female and three male undergraduate students from CMU and the University of Pittsburgh. When reviewing the Privacy Nutrition Label vs. the Simplified Label we found that participants better understood the grid and were able to make more accurate side-by-side comparisons. Participants understood the significance of the red symbols, saying, "Red is for stop or danger." We passed out two privacy policies, Policy A and Policy B, and asked the participants to raise their hands if they believed that Policy A is the better policy. Every participant raised his or her hand, correctly identifying Policy A as the more favorable policy. Participants demonstrated a detailed understanding of the differences between the policies with comments such as "It's very clear which site is best" and "You should pick a site with more opt-ins than opt-outs." Some participants even noted subtle differences between the two policies, saying, "Policy A isn't perfect either, because they share your preferences, and this may include things like your religious or political preferences."

After reviewing the grid design, we passed out the simple text policy. Participants reacted negatively to the text policy because they felt that it did not provide enough information, saying, "This is an empty policy, it says nothing. I wouldn't trust it." Participants wanted to see how each piece of information was being used. For example, one participant stated, "With the grid it's easier to see things. What information is being shared? We don't know that anymore."

5. USER STUDY METHODOLOGY

Based on the feedback from our second focus group we performed a 24-participant laboratory user study comparing a standard natural language (NL) privacy policy with privacy policies presented in our Privacy Nutrition Label.

We used a within-subjects design where participants were randomly assigned to first use either the label or the natural language format. Each participant completed 24 questions relating to the policy format they were shown first and then the same 24 questions again with the other format. These tasks are detailed below. We recorded accuracy as well as time for each participant.
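As an illustration of the order assignment in a within-subjects design like this one, the following minimal sketch shuffles participant IDs and gives half of them the label first. The even split, the seed, and the function name `assign_order` are assumptions made for the example; the study only states that assignment was random.

```python
import random


def assign_order(participant_ids, seed=0):
    """Randomly assign each participant a format order.

    Assumption: orders are balanced, so half of the participants see the
    label first and the rest see the natural language (NL) policy first.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    ids = list(participant_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    orders = {}
    for position, pid in enumerate(ids):
        # First half of the shuffled list: label first; remainder: NL first.
        orders[pid] = ("label", "NL") if position < half else ("NL", "label")
    return orders


# Hypothetical participant IDs 1..24.
orders = assign_order(range(1, 25), seed=42)
```

Every participant still completes both formats; only the order differs, which is what later allows order (priming) effects to be examined.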

5.1 Participants

We recruited the 24 participants through the CBDR website. Our only requirement was that English be the participant's native language. We offered participants $10 to participate in a 45-minute study in our laboratory.

Our participants included 16 students and 8 non-students. Of the 16 students, 5 studied humanities, 5 economics or business, 2 science, and 4 information science. Sixteen of our participants were male and 8 were female.

5.2 Privacy Policy Selection

Our study used two NL privacy policies and two label-formatted policies. We started with the current actual P3P policy of a popular online e-commerce website. We modified this policy in three ways to produce two different label policies for the mythical companies Acme and Button Co. The first change was to the data collected. Acme collects preference information but not demographic information, whereas Button Co. collects demographic, not preference, information. This change is not especially significant but does distinguish the data collection. The second change was to the data uses. Acme does not do any profiling while Button Co. does. The third change was to information sharing practices. While Acme only shares information when consumers opt in, Button Co. shares information unless consumers opt out. These significant differences were introduced so that there would be a clear correct response for participant tasks that require them to determine which company better protects their privacy (see 5.3.3).

The two NL policies for the mythical companies ABC Group and Bell General represent the exact same policies as described above. The ABC Group policy is the natural language policy of the same company whose P3P policy was used to populate the grid, again with the three modifications above made to make it match Acme's. We could not, however, simply make the three modifications to the policy and also present it as the other natural language option because two different companies, no matter how similar their practices, would not share the same text. The introduction, structure, and actual language used needed to be different. Thus, to create the Bell General policy we used the text of a different, yet comparable e-commerce website, and changed the practices so as to match those of Button Co.

In editing the natural language policies we removed any references to programs that would distinguish the companies (such as specially branded programs), removed lists of links from the beginning of the policies, removed references to Safe Harbor, and additionally modified the second policy so that both were approximately the same length. For a more complete comparison see Table 1.

Table 1. Extended Text & Readability Comparison for NL Policy

Metric                  ABC      Bell
Word Count              2287     2299
Sentence Count          136      130
Flesch Reading Ease     42.06    41.69
Flesch-Kincaid Grade    11.57    11.84

We chose not to use layered policies. This decision was made because layered policy adoption is not consistent or widespread, most common layered policies would not be suitable for answering the questions we asked, and finally recent research has suggested layered policies are no better at helping consumers understand privacy than full natural language policies [19].

5.3 Task Structure

The task structure for each condition was exactly the same, with 24 tasks comprising a section. These sections can be split into four parts, each of which is detailed here:

5.3.1 Information Finding

The first 8 questions were all Yes/No questions asked of a single policy (ABC Group for NL, Acme for the label). Of these 8 questions, 6 were single-element questions, involving only one element of the P3P statement triplet. For example: "Does the policy allow the Acme website to use cookies?" to which the answer was Yes, or "Does the policy allow the Acme website to share your information on public bulletin boards?" to which the answer was also Yes.

The remaining two questions both required two parts of the triplet to answer, for example "By default, does the policy allow the Acme website to collect your email address and use it for marketing?"

5.3.2 Perceived Privacy Policy Understanding

Following the 8 information-finding questions, participants were given 6 questions on a 5-point Likert scale, from Strongly Disagree (1) to Strongly Agree (5). Each of these is described below.

The first question, L1: "I feel secure about sharing my personal information with Acme after viewing their privacy practices," attempts to capture participants' reaction to the actual content of the privacy policy they read. L2: "I feel that Acme's privacy practices are explained thoroughly in the privacy policy I read" asks whether participants believe the practices are well displayed.

The next three questions deal with the experience of interacting with the privacy policy in the format we presented. L3: "Finding information in Acme's privacy policy was a pleasurable experience" has participants rate their enjoyment of finding information. L4: "I feel confident in my understanding of what I read of Acme's privacy policy" investigates participants' perceived accuracy in the earlier questions. L5: "It was hard to find information in Acme's policy" has participants rate the difficulty they had in finding information.

The final question, L6: "If all privacy policies looked just like this I would be more likely to read them," attempts to capture whether our proposed label would encourage more people to read privacy policies.

5.3.3 Policy Comparison Questions

The third section requires participants to compare two policies of the same format (ABC Group v. Bell General for NL or Acme v. Button Co. for the label). One of the policies in each comparison is the same policy from the initial 8 information-finding questions.

The first four questions in this section are True/False statements such as "By default, Button Co. can share information about your purchases with other companies, but Acme cannot."

The final two questions in this section are opinion questions, asking: "Which company will better protect your information online?" and "You're looking to buy a gift online. At which company would you prefer to shop?"

5.3.4 Policy Comparison Enjoyment & Ease

The final four questions are again on the 5-point Likert scale presented earlier. They are in two pairs, the first pair asking if "Looking at policies to find information was an enjoyable experience" and "Looking at policies to find information was easy to do." The second pair focuses specifically on the comparison task: "Comparing two policies was an enjoyable experience" and "Comparing two policies was easy to do."

6. RESULTS

We will address the issue of information finding through our quantifiable accuracy results. Next we describe the timing data on those questions, showing that information finding is not only more accurate but also faster with label policies than with NL policies. To conclude this section we will present the likeability of the privacy label.

6.1 Accuracy Results

At a high level, people were able to answer more questions correctly with the label. We compared the total number of correct answers, per participant, for the label vs. the natural language policy, M = 10.13 and M = 6.83 respectively, t(23) = 7.41, p < 0.001.

We explored each of the questions individually by testing the proportions of correctness for each question by condition, using McNemar's test. These results combine participants who saw the label first with participants who saw the label second, as accuracy differences were not significant between these two conditions. These comparisons show that the label is significantly more accurate in 2 of the 8 information-finding questions and 2 of the 4 policy-comparison questions. The accuracy rates for each question are shown in Table 2, with statistically significant comparisons shown in bold.

We performed a Benjamini-Hochberg correction to account for multiple testing across comparisons. Each of the paired proportions is shown in Table 2 along with the McNemar's p-values and the corrected p-values.
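The per-question analysis described above can be sketched in a few lines of Python. This is an illustrative sketch only: the paper does not say which variant of McNemar's test was used, so the exact (binomial) form is assumed here, and the function names and the numbers in the usage lines are hypothetical rather than taken from the study.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts.

    b: participants correct with the label only
    c: participants correct with the NL policy only
    Assumption: the exact binomial form of the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical discordant counts for one question, and hypothetical raw
# p-values for four questions.
p_single = mcnemar_exact(0, 5)
p_corrected = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Only participants whose correctness differs between the two formats (the discordant pairs) contribute to McNemar's test, which is why it suits a within-subjects comparison of Yes/No accuracy.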

6.2 Timing Data

For each of the information-finding and policy-comparison questions we collected time-to-task completion data. As shown in Table 3, the label was significantly faster than the natural language policies for both the group of information-finding questions and the group of policy-comparison questions (p < 0.001).

To test the mean task completion time for accurate answers we removed all timing results where the resulting answer was inaccurate and calculated means per question, per condition. Using a 2-sided t-test, the label is significantly faster in 2 of the 8 information-finding questions and significantly faster in 3 of the 4 policy-comparison questions. In only one question was the average time faster for participants using the natural language policy, and this difference was not significant. The full results for this test can be found in Table 4.
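The filtering step described above (drop inaccurate answers, then average per question and condition) can be sketched as follows. The record layout, the function names, and the timing values are assumptions made for illustration; the difference is computed as NL minus label, which matches the sign convention of the Difference column in Table 4.

```python
from collections import defaultdict
from statistics import mean


def mean_time_correct(trials):
    """Mean completion time per (question, condition), correct answers only.

    trials: iterable of (question, condition, seconds, correct) tuples,
    a hypothetical record layout chosen for this sketch.
    """
    buckets = defaultdict(list)
    for question, condition, seconds, correct in trials:
        if correct:  # inaccurate answers are excluded from the timing means
            buckets[(question, condition)].append(seconds)
    return {key: mean(times) for key, times in buckets.items()}


def time_difference(means, question):
    """NL mean minus label mean; positive values mean the label was faster."""
    return means[(question, "NL")] - means[(question, "label")]


# Hypothetical trials for a single question.
trials = [
    (1, "label", 30.0, True),
    (1, "label", 40.0, True),
    (1, "label", 500.0, False),  # wrong answer, so its time is ignored
    (1, "NL", 60.0, True),
    (1, "NL", 80.0, True),
]
means = mean_time_correct(trials)
diff = time_difference(means, 1)
```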

6.3 Satisfaction Results

The satisfaction results were captured based on participants' responses on a Likert scale from 1 (Strongly Disagree) to 5 (Strongly Agree). We computed the mean response for the label and for natural language, both combined and separated by which format was viewed first. For each of these questions higher is better, including Question L5, "information was hard to find," which was reversed to be consistent with the remaining questions.

We performed t-tests for each of these questions to compare the label to the natural language policies. All but 2 of these 10 questions showed significant differences. The label was rated significantly more pleasurable, easier to find information in, and easier and more enjoyable to use when comparing two policies.
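Two steps in this analysis lend themselves to a short sketch: reversing a negatively worded Likert item so that higher is better, and comparing the two formats with a t-test. The sketch assumes a paired t-test, which fits the within-subjects design and the t(23) reported for 24 participants in Section 6.1, but the paper does not name the variant used for the Likert items. The function names and ratings below are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def reverse_code(response, low=1, high=5):
    """Reverse a Likert response so that higher is better (1 <-> 5, 2 <-> 4)."""
    return low + high - response


def paired_t(xs, ys):
    """Paired t statistic and degrees of freedom for matched samples.

    Assumption: each participant rated both formats, so ratings are paired.
    """
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical raw L5 responses ("hard to find"), reversed before analysis.
label_l5 = [reverse_code(r) for r in [2, 1, 3, 2]]
nl_l5 = [reverse_code(r) for r in [4, 3, 3, 5]]
t_stat, df = paired_t(label_l5, nl_l5)
```

Reversing L5 before averaging keeps all ten items on the same footing, so a higher mean always favors the format being rated.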


Table 4. Time differences and p-values for average time per question, comparing only correct answers. All times reported in seconds.

Question    Label    NL        Difference    p-value
1           37.58    61.27     23.69         0.07
2           21.67    85.7      64.03         0.04
3           14.35    50.07     35.72         <.001
4           18.89    23.09     4.2           0.4
5           34.51    29.95     -4.56         0.46
6           20.19    50.24     30.05         0.06
7           16.32    22.82     6.5           0.88
8           26.93    36.79     9.86          0.73
15          46.58    132.69    86.11         0.0006
16          34.36    68.32     33.96         0.05
17          21.91    35.48     13.57         0.28
18          12.24    47.36     35.12         0.03

The results from each of these questions are shown with means and p-values in Figure 8.

Additionally we performed 2-sample t-tests between conditions to explore priming effects, where opinions changed based on the policy format a participant viewed first. When looking at how participants answered the Likert scale questions about the label by condition, 3 questions had significant results. Participants felt significantly more secure when viewing the grid if they saw the NL policy first (label first=2.92, NL policy first=3.92, p=0.03), reported they were significantly more likely to read policies in the label format if they saw the NL policy first (label first=4, NL policy first=4.5, p=0.04), and found comparisons on the label significantly easier when viewing the NL policy first (label first=3.92, NL policy first=4.58, p=0.004). These results show significant priming to appreciate the grid more when the NL policy was viewed first.

6.4 Observations

The initial results we have presented above are very strong; however, there is still much room for improvement. We observed that some participants still found elements of the label confusing. We began an additional round of iterative design and testing to address some of the issues we observed during the lab study.

Several participants were confused by the symbols we used to indicate opt-in and opt-out. For instance, one participant did not understand what "out" meant, saying, "I've been messing things up because I thought 'out' meant out of the question." To improve users' comprehension, we will alter the symbol design to include the full phrases "opt-out" and "opt-in."

In addition, several participants in the lab study were completely unfamiliar with the terms opt-out and opt-in, and they assumed that the terms meant exactly the same thing. We will continue to refine our glossary definitions to help educate users about these concepts. The original definitions did not explain the terms opt-in and opt-out, with the legend reading "we will collect and use your information in this way unless you opt-out." The new definitions help explain the concepts, stating: "we will collect and use your information in this way unless you tell us not to by opting out."

We plan to further test our design changes in focus groups, and believe that the design iterations will continue to improve the speed and comprehensibility of the Privacy Nutrition Label.

7. DISCUSSION

We began this paper with three factors in mind: the ability to find information, the understanding that there are differences between privacy policies and control over one's information, and the simple time-based costs of reading privacy policies. We strove to design a single-page summary of a company's privacy policy that would help to remedy each of these three concerns and at the same time be enjoyable.

We believe that the results presented above clearly show that each of these areas was addressed. Accuracy results were better or similar for information finding and policy comparison. Task completion times were significantly lower when using the label than when using a natural language policy. And across the board, participants believed information was easier to find and had a more pleasurable time finding it using the label.

The final label design allows for information to be found in the same place every time. It removes wiggle room and complicated terminology by using four standard symbols that can be compared easily. It allows for quick high-level visual feedback by looking at the overall intensity of the page, can be printed, can fit in a browser window, and has a glossary of useful terms attached. People who have used it to find privacy information rated it as pleasurable. They not only rated it better than the natural language, but actually rated it enjoyable to use.

When using the label, people far more consistently selected the company that had the stronger privacy policy. Participants also realized the benefits of the label for comparison: "This may actually be the biggest advantage of this system because you can put down two policies that are formatted the same and see the exact differences between them. It's really easy." Even more directly, one participant said, "I guess I'll look to see which policy has more blue," exactly capturing one of our intended design goals.

A number of open questions remain about how people will use the label in practice. Will people make more use of the label than they currently do of privacy policies? How will their use change as they become more familiar with the labels through continued use over time?

Our next step will be to iterate on a number of additional minor changes and then run a large online study, similar to Reeder et al.'s original test of the P3P Expandable Grid [23]. This will let us confirm over a much larger and more diverse group of people that the label is, in fact, more accurate, faster, and more pleasurable to use. Additionally, as this study will be conducted online, people will be viewing privacy policies just as they normally would, at their computer, which is very different from performing these tasks in our laboratory on paper.

Finally, we plan to integrate a version of the privacy label into Privacy Finder, a privacy search engine maintained by the CUPS Laboratory. This will allow people to use the label outside of the context of a research study and will allow us to monitor frequency of use while collecting feedback on the label design. It is likely this public online deployment that will bring us closer to answering how much a standardized label design assists people over time as they become accustomed to using it.

8. ACKNOWLEDGMENTS

The authors would like to acknowledge Sungjoon Steve Won for his early designs, including the simplified grid; Janice Tsai for her statistical expertise; Daniel Rhim, Robert McGuire, and Cristian Bravo-Lillo for their technical assistance and assistance in conducting user studies; Norman Sadeh and Aleecia McDonald for their guidance and advice; and everyone who provided input throughout the design process.

This work was supported in part by U.S. Army Research Office contract DAAD19-02-1-0389 ("Perpetually Available and Secure Information Systems") to Carnegie Mellon University's CyLab, by NSF Cyber Trust grant CNS-0627513, by Microsoft through the Carnegie Mellon Center for Computational Thinking, by FCT through the CMU/Portugal Information and Communication Technologies Institute, and by the IBM OCR project on Privacy and Security Policy Management.