A Face Is Exposed for AOL Searcher No. 4417749

Published: August 9, 2006

Buried in a list of 20 million Web search queries collected by AOL and recently released on the Internet is user No. 4417749. The number was assigned by the company to protect the searcher's anonymity, but it was not much of a shield.

No. 4417749 conducted hundreds of searches over a three-month period on topics ranging from ''numb fingers'' to ''60 single men'' to ''dog that urinates on everything.''

And search by search, click by click, the identity of AOL user No. 4417749 became easier to discern. There are queries for ''landscapers in Lilburn, Ga,'' several people with the last name Arnold and ''homes sold in shadow lake subdivision gwinnett county georgia.''

It did not take much investigating to follow that data trail to Thelma Arnold, a 62-year-old widow who lives in Lilburn, Ga., frequently researches her friends' medical ailments and loves her three dogs. ''Those are my searches,'' she said, after a reporter read part of the list to her.

AOL removed the search data from its site over the weekend and apologized for its release, saying it was an unauthorized move by a team that had hoped it would benefit academic researchers.

But the detailed records of searches conducted by Ms. Arnold and 657,000 other Americans, copies of which continue to circulate online, underscore how much people unintentionally reveal about themselves when they use search engines -- and how risky it can be for companies like AOL, Google and Yahoo to compile such data.

Those risks have long pitted privacy advocates against online marketers and other Internet companies seeking to profit from the Internet's unique ability to track the comings and goings of users, allowing for more focused and therefore more lucrative advertising.

But the unintended consequences of all that data being compiled, stored and cross-linked are what Marc Rotenberg, the executive director of the Electronic Privacy Information Center, a privacy rights group in Washington, called ''a ticking privacy time bomb.''

Mr. Rotenberg pointed to Google's own joust earlier this year with the Justice Department over a subpoena for some of its search data. The company successfully fended off the agency's demand in court, but several other search companies, including AOL, complied. The Justice Department sought the information to help it defend a challenge to a law that is meant to shield children from sexually explicit material.

''We supported Google at the time,'' Mr. Rotenberg said, ''but we also said that it was a mistake for Google to be saving so much information because it creates a risk.''

Ms. Arnold, who agreed to discuss her searches with a reporter, said she was shocked to hear that AOL had saved and published three months' worth of them. ''My goodness, it's my whole personal life,'' she said. ''I had no idea somebody was looking over my shoulder.''

In the privacy of her four-bedroom home, Ms. Arnold searched for the answers to scores of life's questions, big and small. How could she buy ''school supplies for Iraq children''? What is the ''safest place to live''? What is ''the best season to visit Italy''?

Her searches are a catalog of intentions, curiosity, anxieties and quotidian questions. There was the day in May, for example, when she typed in ''termites,'' then ''tea for good health'' then ''mature living,'' all within a few hours.

Her queries mirror millions of those captured in AOL's database, which reveal the concerns of expectant mothers, cancer patients, college students and music lovers. User No. 2178 searches for ''foods to avoid when breast feeding.'' No. 3482401 seeks guidance on ''calorie counting.'' No. 3483689 searches for the songs ''Time After Time'' and ''Wind Beneath My Wings.''

There are also many thousands of sexual queries, along with searches about ''child porno'' and ''how to kill oneself by natural gas'' that raise questions about what legal authorities can and should do with such information.

But while these searches can tell the casual observer -- or the sociologist or the marketer -- much about the person who typed them, they can also prove highly misleading.

At first glance, it might appear that Ms. Arnold fears she is suffering from a wide range of ailments. Her search history includes ''hand tremors,'' ''nicotine effects on the body,'' ''dry mouth'' and ''bipolar.'' But in an interview, Ms. Arnold said she routinely researched medical conditions for her friends to assuage their anxieties. Explaining her queries about nicotine, for example, she said: ''I have a friend who needs to quit smoking and I want to help her do it.''

Asked about Ms. Arnold, an AOL spokesman, Andrew Weinstein, reiterated the company's position that the data release was a mistake. ''We apologize specifically to her,'' he said. ''There is not a whole lot we can do.''

Mr. Weinstein said he knew of no other cases thus far where users had been identified as a result of the search data, but he was not surprised. ''We acknowledged that there was information that could potentially lead to people being identified, which is why we were so angry.''

AOL keeps a record of each user's search queries for one month, Mr. Weinstein said. This allows users to refer back to previous searches and is also used by AOL to improve the quality of its search technology. The three-month data that was released came from a special system meant for AOL's internal researchers that does not record the users' AOL screen names, he said.

Several bloggers claimed yesterday to have identified other AOL users by examining data, while others hunted for particularly entertaining or shocking search histories. Some programmers made this easier by setting up Web sites that let people search the database of searches.

John Battelle, the author of the 2005 book ''The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture,'' said AOL's misstep, while unfortunate, could have a silver lining if people began to understand just what was at stake. In his book, he says search engines are mining the priceless ''database of intentions'' formed by the world's search requests.

''It's only by these kinds of screw-ups and unintended behind-the-curtain views that we can push this dialogue along,'' Mr. Battelle said. ''As unhappy as I am to see this data on people leaked, I'm heartened that we will have this conversation as a culture, which is long overdue.''

Ms. Arnold says she loves online research, but the disclosure of her searches has left her disillusioned. In response, she plans to drop her AOL subscription. ''We all have a right to privacy,'' she said. ''Nobody should have found this all out.''

Photo: Thelma Arnold's identity was betrayed by AOL records of her Web searches, like ones for her dog, Dudley, who clearly has a problem. (Photo by Erik S. Lesser for The New York Times)(pg. A1)

Chart/Photo: ''What Revealing Search Data Reveals''
AOL posted, but later removed, a list of the Web search inquiries of 658,000 unnamed users on a new Web site for academic researchers. An interview with one of those unnamed users, Thelma Arnold, combined with her data, reveals what she was searching for, why and on which Web sites.

A photo shows a sample of Thelma Arnold's search data released by AOL. In a list alongside the search samples, Ms. Arnold explains why each search was conducted. Here are some of her answers:
''I was thinking about my grandchildren''
''I was looking for some.''
''I wanted to find out what my house was worth.''
''A woman was in the [public] bathroom crying. She was going through a divorce. I thought there was a place called 'Dances by Lori,' for singles.''