Last week I was in Omaha, NE, at the headquarters of InfoUSA (which produces the ReferenceUSA database) to participate with other public librarians in a "customer conference."

Their goal was to get feedback from us on how we (and our patrons) use ReferenceUSA, and what we felt could be changed or added to improve the database. They also gave us a sneak peek at a bunch of new product offerings, as well as a tour of their facility and an overview of how they actually construct their database (and other products - they also produce the Polk City Directories).

I felt a bit out of place in the group of librarians they assembled. Here I am, representing Chelmsford, MA (pop. 32,000), and the other librarians are from places like Dallas, Denver, Brooklyn, Oklahoma City, Pittsburgh, Kansas City, Annapolis - all bigger library systems by far. Most of them were also business specialists, and if it weren't for my undergraduate marketing degree, I would have been lost somewhere between the "SWOT analysis" and "B2C channel positioning." But we all had experiences and viewpoints to share, so it worked out.

The Process for Data Integrity
By the end of the conference, I was very impressed. I had always trusted their data, in the same way I more or less trust the data and articles in the other databases the library subscribes to. But after the tour of their work area, their claim of "99% accuracy" really means something.

They subscribe to over 6,000 phone books from across the country (which they have in a resource library - see photo above), and then, using a variety of processes, move that information into their database. A lot of it is automated, with most of their software being proprietary and home-grown. But the emphasis was clearly on using actual people to review the data and make intelligent decisions to ensure accuracy. And then those people's work was checked, and the checker's work was checked. Which all makes for a high degree of accuracy.

Some notes about the data:

Data for their consumer database comes almost entirely from white pages. Since there is no reliable source for cell phone numbers, those are not in ReferenceUSA

All consumer data is scoured against national and state "do not call" lists, as well as the DMA's "do not mail" lists (so, even if a person is listed in the phone book, they won't be in ReferenceUSA if they've properly registered to protect their information)

ReferenceUSA is easy to contact if you need to add, remove or change a record, either business or consumer

It is difficult to remove people who have died - their main sources are death benefit check records, but since these are often sent to next of kin at different addresses, it is hard to reconcile that back to the deceased's home address and social security number

They've been adding "store front images" of businesses in the database. There are over 3 million so far (each business has one close up shot and one wide shot)

In the case of moves, they keep previous address records for at least five years, but this information is not in the database or otherwise available to the public

The competitor report in the business module is compiled based on SIC and geography. So, if you want to see all the competitors of a local pet store, it's great; but if you want to see a bigger or national company's competitors, it's not much help at all

New features in the business module:

New data points include the number of PCs per location, square footage per location, and the gender of the executives

Annual reports are now included in the database, as are the last three years of historical financial data

More powerful custom field selection/sorting for downloading records (hard to explain, but it's pretty slick)

They added all public libraries and branches into the database, based on ALA's library directory (neat)

Up-and-coming things for the next 6-8 months:

Section 508 compliance (mostly ALT tags)

Adding US territories (Puerto Rico and US Virgin Islands) to the business file

Adding a search for brands and products, so you can find out which parent company manufactures and sells them

Enhanced mapping, which will allow searching by map, plotting data points and drawing corridor grids (as in, "let me see all businesses of this type between point A and point B")

A historical module, with the last 10 years' worth of business financial data

An analytical module, with industry reports, size of business, etc (this is what my notes say, but I forget what it actually means)

A guided search, which prompts you to design a properly-formed search (only available on some modules initially)

New Products coming out soon:

New Movers module (people who have moved recently)

New Homeowners module (people who have recently purchased a house)

Business to Consumer Research module (for businesses to identify customers based on "lifestyle choices," such as hunters, skiers, pet owners, etc)

New Business module (which pulls data from city, county, utility and tax records, which businesses have to file before they open - which means that these new businesses will be in the database before they even open their doors). This is great for insurers or other business-to-business companies, but also can answer "what restaurants are coming to town?" 50,000 businesses are added weekly, and they stay in this database for two years

My second day at NELA2007 (Tuesday, the last day of the conference) was a quick one. I just went to two morning sessions, and then left after lunch (I had to come home to pack for my trip tomorrow morning to Omaha).

I blogged both sessions today, and posted them on the NELA2007 blog. They were:

A patron and her son came to the desk, and she asked if we had math textbooks. During the reference interview, I learned that the patron didn't actually want a math textbook.

What she wanted was a way to help her sixth grade son, who was struggling with math in school. We do have textbooks for the junior high classes, but we also have more general math books, and the Learning Express Library database, which has skills lessons and practice tests.

While showing her the available resources, the patron stopped me and said:

"Don't you have something like Math for Dummies? That's kind of the level he's at."

Keep in mind that her son was standing right next to me. I felt so bad for him. Perhaps we should have been looking for a How Not To Destroy Your Child's Self-Esteem for Dummies book.

On a related note: a few weeks ago, someone came in asking if we had Homeschooling for Dummies. We do, and the Idiot's Guide, too, but I wonder: should "dummies" and "idiots" be teaching our future generations? I know "you can learn a lot from a dummy," but still.

This year I'm also attending as an official conference blogger. A group of us will be posting notes from (hopefully all) the sessions to the NELA2007 blog, so add the feed to your reader to catch all the action.

There's also a Flickr pool, for those of you who like looking at photos of librarians.

ReferenceUSA Conference
On Wednesday and Thursday, I'll be in Omaha, NE, participating in a conference held by ReferenceUSA. They invited about 20 librarians from across the country to come and meet with their product development team. I think we'll be talking about how their database is currently used by libraries, and what new avenues they could pursue to improve delivering their content to our patrons.

That should be a fun trip. I even have friends in Omaha, but they chose that particular week to fly to Japan. Sheesh.

It'll be an active week, and if you can't reach me, this is why. But if you happen to see me anywhere along the way, please come up and say hi.