Online Diary

Detecting Gender by Prose and How to Make Everyday Things

By PAMELA LiCALZI O'CONNELL

Published: September 11, 2003

Gender Genie

Does your writing style indicate whether you are male or female? In a development reported in The New York Times Magazine and elsewhere, Israeli researchers recently created an algorithm that can predict the sex of an author from a writing sample. After the news coverage, some cheeky members of an online book discussion site, BookBlog, turned the algorithm into a Web application called Gender Genie ( www.bookblog.net/gender/genie.html ) that allows anyone to punch in some prose and receive a verdict. That's when the fun started. Since Aug. 15, Gender Genie has analyzed 250,000 documents submitted by 125,000 people, according to Mary Delli Santi, who runs the site. But although the researchers achieved an 80 percent accuracy rate, Gender Genie correctly guessed an author's sex only about half the time.

Contacted by e-mail, one of the scientists, Moshe Koppel, argued that the algorithm had not been applied correctly. (The researchers are working with Ms. Delli Santi on revising the Web version.) He also said that people were entering texts that were too short and ignoring the fact that the algorithm was designed to assess fiction, not blog entries, e-mail or the other sundry nonfiction samples that dominate submissions (for example, the Gender Genie said this column was written by a man).

But beyond the accuracy of the algorithm, the draw of Gender Genie has been to see how differently men and women react to the site's sexual judgments. "Men are definitely more sensitive about having their writing labeled as female" than the other way around, Ms. Delli Santi said. And some find the results too stark: "Transgendered people are asking that the Genie come up with more results than just male or female."

Amazon Fun

All sorts of fun and funky applications related to Amazon.com are popping up, only a fraction of them from the company's development team.

A little over a year ago, Amazon provided outside programmers with a free interface for interacting with its vast product database. (Google and eBay have done the same.) The result has been a creative explosion of ancillary Web sites and services, as detailed in "Amazon Hacks," a new book ( www.oreilly.com/catalog/amazonhks ) by Paul Bausch. "People should now think of Amazon as a technology platform," Mr. Bausch said. "The company does not know or control how people use its data. It's a bit surprising the way Amazon has embraced chaos."

Some applications build on features that Amazon already provides. For example, Amazon offers a list of its top-selling items over the last 24 hours. But what if you want to track the sales rank of specific items over time? The independent site JungleScan.com allows anyone to set up a free portfolio to do just that. (Mr. Bausch uses it to track his book sales, for instance.)

Another of Mr. Bausch's favorite hacks (the book lists 100) lets him call up his Amazon Wish List from his cellphone. "You can see why Amazon would not create such an application itself," he said. "Why would they want you to access your Wish List while you were standing in Barnes & Noble?"

Some of the hacks are a bit silly but indulge the desire of users to treat Amazon as a blank slate on which to project their own obsessive interests. Take the application called MP3 Piranha ( www.capescience.com/piranha ). If you forgo buying CD's to collect MP3 music files on your computer, you are missing out on the original album cover art. This free application asks Amazon to automatically pull the cover art for the albums from which your song collection is drawn.

Make My Day

How is a jelly bean made? A plastic bottle? An airplane? At the eye-opening site How Everyday Things Are Made ( manufacturing.stanford.edu ), more than 40 products and manufacturing processes are explained. Think of it as the mother of all factory tours.

This new site, based on a course offered annually to non-engineers at Stanford, provides nearly four hours of video of industrial plants. (A high-bandwidth connection to the Internet is necessary.) "We tried to focus on which processes are most mysterious to people," said Mark V. Martin, a lecturer at Stanford and the chief executive of design4X, the company that developed the site with the Alliance for Innovative Manufacturing.

The site's interactive features include discussion boards and, fortunately, a slow-motion dial, since some factories appear to operate at warp speed. Recurring sections ask site visitors to apply the knowledge they have just acquired to a new problem ("How would you get nuts or a creamy filling inside a chocolate candy bar?"). You must submit a response to find out the answer.

Dr. Martin would like to create a version of the site for younger children that would focus more on toys and food. Yet "the food industry is the hardest industry to get information out of," he said. "Apparently the breakfast-cereal people have a lot of proprietary manufacturing processes."

On the Radar

Famous Oscar acceptance speeches, sermons and political orations are a few examples of what can be found in the "online speech bank" at americanrhetoric.com. Don't skip the debatable list of the top 100 speeches. Imaging Everest ( imagingeverest.rgs.org ) offers 20,000 photos from expeditions up that majestic mount. Most of what can be found at United States Government Graphics and Photos ( firstgov.gov/Topics/Graphics.shtml ) is in the public domain and available for use. Paste away!