Categorizing Contributors

The final stage of the fingerprinting process is the real payoff. In this phase of the research, you'll be assigning category codes to each contribution. The codes will correspond to the contributor's specific industry or interest group. When this phase is done, you'll have completed the database. All that will remain is reviewing it and finding whatever patterns stand out.

Though this is a rewarding step in the process, it's a challenging one. Now that you've determined that Winifred Wyzinski, for example, works for Wyzinski & Associates, you've got to figure out what in the world Wyzinski & Associates does. Is it a lobbying firm? A trucking company? A management consulting firm? There's no telling from the name, so you'll have to find out elsewhere.

Fortunately, you can learn a surprising amount about what different companies do right on the shelves of your local library. Tens of thousands of corporations are listed and described - and their officers named - in publications such as Standard & Poor's Register of Corporations, Directors and Executives and Dun & Bradstreet's Million Dollar Directory. Doctors, lawyers, and many other professionals can be found in professional directories that you'll also find on the shelves of a well-stocked library. And if the companies are local, you can always phone them up.

The categorization process applies not only to individual contributors' employers. You'll be using the same categories for all classes of contributors - PACs, individuals, corporations and labor unions. For each one, you'll be trying to figure out which industry or interest group best describes what the contributor does, or what it stands for. Before reviewing the research materials that will help you categorize the contributors, an explanation of the categories themselves is in order.

The need for a system of categories is obvious, as soon as you start compiling contributions into a database. It is certainly useful to know who the biggest contributors are, and how much particular unions, companies and PACs are giving to political candidates. But what is even more important (and revealing) is figuring out how much whole industries are giving.

The Center's coding system had its roots in Alaska, where it was originally designed to match the patterns of political money going to members of the state legislature. That original system has undergone countless revisions over the years, along with a major expansion when the categories were applied to Congress. The system is still evolving; with each new election cycle we still tweak one or two categories, based on recent shifts in contribution patterns.

The coding system is hierarchical. At the very highest level, there are five super-categories: Business, Labor, Ideological/Single-Issue, Other and Unknown. Below that top level there are 13 "sectors," about 100 "industries" and, in all, some 400 categories. A full list of all the categories is included here, but for the moment, here's how the sectors and industries break down. (The most detailed "category" level has been omitted here to save space.)

Agriculture: Crop Production & Basic Processing; Tobacco; Dairy; Poultry & Eggs; Livestock; Agricultural Services/Products; Food Processing & Sales; Forestry & Forest Products; Miscellaneous Agriculture

Communications & Electronics: Printing & Publishing; Media/Entertainment; Telephone Utilities; Telecom Services & Equipment; Electronics Manufacturing & Services; Computer Equipment & Services

Construction: General Contractors; Home Builders; Special Trade Contractors; Construction Services; Building Materials & Equipment

Defense: Defense Aerospace; Defense Electronics; Miscellaneous Defense

Energy & Natural Resources: Oil & Gas; Mining; Electric Utilities; Environmental Services/Equipment; Waste Management; Fisheries & Wildlife; Miscellaneous Energy

Finance, Insurance & Real Estate: Commercial Banks; Savings & Loans; Credit Unions; Finance/Credit Companies; Securities & Investment; Insurance; Real Estate; Accountants; Miscellaneous Finance

Health: Health Professionals; Hospitals/Nursing Homes; Health Services; Pharmaceuticals/Health Products; Miscellaneous Health

Lawyers & Lobbyists: Lawyers/Law Firms; Lobbyists/Public Relations

Transportation: Air Transport; Automotive; Trucking; Railroads; Sea Transport; Miscellaneous Transport

Miscellaneous Business: Business Associations; Food & Beverage; Beer, Wine & Liquor; Retail Sales; Miscellaneous Services; Business Services; Recreation/Live Entertainment; Casinos/Gambling; Lodging/Tourism; Chemical & Related Manufacturing; Steel Production; Misc Manufacturing & Distributing; Textiles; Miscellaneous Business

Labor: Building Trade Unions; Industrial Unions; Transportation Unions; Public Sector Unions; Miscellaneous Unions

Ideological/Single-Issue: Republican/Conservative; Democratic/Liberal; Leadership PACs; Foreign & Defense Policy; Pro-Israel; Abortion Policy; Gun Rights/Gun Control; Women's Issues; Human Rights; Miscellaneous Issues

Other: Non-Profit Institutions; Civil Servants/Public Officials; Education; Retired; Other

Unknown: Homemakers/Non-income earners; No Employer Listed or Found; Generic Occupation/Category Unknown; Engineers, unclassified; Employer Listed/Category Unknown; Unknown

Each category has its own five-character code, which is entered in the computer as the category is learned. The first character is a letter and generally corresponds to the sector - A for Agriculture, H for Health, etc. The other characters are numbers, which are also arranged hierarchically. As an example, Energy Production & Distribution is E1000, the Oil & Gas industry is E1100, Gasoline Service Stations are E1170.

Besides its own code, each category is also linked to higher "industry" and "sector" codes. So when you enter E1170 for Milo's Texaco, his contributions will be included under the Oil & Gas industry, and the Energy & Natural Resources sector.
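The roll-up described above can be sketched in a few lines of Python. This is only an illustration: the lookup table, the function name and the dollar amounts are hypothetical, and only the E1170 code with its industry and sector names comes from the text. A real table would hold an entry for every category code.

```python
from collections import defaultdict

# Hypothetical lookup: category code -> (industry, sector).
# Only the E1170 entry is taken from the text above.
CATEGORY_PARENTS = {
    "E1170": ("Oil & Gas", "Energy & Natural Resources"),
}

def roll_up(contributions):
    """Sum (catcode, amount) pairs by industry and by sector."""
    by_industry = defaultdict(int)
    by_sector = defaultdict(int)
    for catcode, amount in contributions:
        industry, sector = CATEGORY_PARENTS[catcode]
        by_industry[industry] += amount
        by_sector[sector] += amount
    return dict(by_industry), dict(by_sector)

# Made-up amounts for two contributions coded E1170.
industries, sectors = roll_up([("E1170", 250), ("E1170", 500)])
```

Because every record carries only the detailed code, the industry and sector totals always come from one lookup table, so correcting a parent assignment there corrects every total at once.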

SIC CODES

The category codes for business types are based, loosely, on a system of business classifications developed by the U.S. Government's Office of Management and Budget. That system is known as the Standard Industrial Classification, or SIC code. The basic SIC code list is a set of four-digit numbers that covers virtually every conceivable type of business, from "abrasive products" to "yarn texturizing, throwing, twisting and winding mills." The government uses these SIC codes to classify the millions of different businesses that operate in the U.S. More importantly, this standard government code has also been picked up by corporate directories, such as those put out by Standard & Poor's and Dun & Bradstreet. If you look up a company in Standard & Poor's Register of Corporations (which you'll find in the business reference section of any moderately sized library), you'll find the name of the company, its top corporate officers, and a full listing of any SIC codes that describe the company's lines of business.

SIC codes based on Yellow Pages listings have also been used by a number of CD-ROM publishers to classify businesses. Look up Bank of America on the PhoneDisc Business CD-ROM, for instance, and you'll find it identified under the SIC code of 6021, National Commercial Banks - because that's the Yellow Pages category under which its ad was listed. SIC codes are also used by several other invaluable CD-ROMs, including Dun's Business Locator, which uses the codes to identify over 9.2 million businesses across the country. (More on reference materials below.)

Electronic copies of the Center's category system are available from the Center for Responsive Politics at nominal cost. Phone the Center at 202-857-0044 for details.

It's possible that the Center's category system - which is designed primarily for congressional candidates, and is arranged to coincide with the jurisdictions of congressional committees - will need to be amended to fit your local circumstances. Feel free to amend it as needed; the system here is offered as a starting point. If you can keep fairly close to the coding system, however, it will help if you later want to exchange databases with someone from another state, or to supplement the Center's coding of federal candidates with your own coding of state and local candidates.

ASSIGNING CATEGORIES TO CONTRIBUTORS

There are any number of techniques for fingerprinting PACs, companies, labor unions, trade associations, and other contributors. For individuals, you'll be using the contributor's occupation/employer to determine the category code (unless you've identified them as an ideological contributor, as explained in the previous chapter).

It's often easiest to start with the PACs. First of all there are fewer of them, and they represent a large proportion of the campaign dollars at both the federal and state levels. At the federal level, PACs (or "political committees," as they're officially designated) don't have to declare to the Federal Election Commission what their agenda is. The only thing a PAC like "Citizens for Better Government" has to disclose is its name, address and treasurer. But if a PAC is sponsored by a corporation, labor union, trade association, or other organization, it must list its sponsoring group with the FEC.

The Realtors PAC, for example, is sponsored by the National Association of Realtors. If you're trying to categorize a state or local PAC that is affiliated with a federal PAC, your best bet may be to contact the Center and find out how we've coded it.

Most corporate and union PACs - even if your state doesn't have a counterpart to the FEC's "sponsor" - are relatively easy to identify simply from their name. The GTE Corporation Good Government Club represents GTE. The American Medical Association PAC represents the AMA. More problematic are ideological and single-issue groups, most of whose national PACs do not have sponsors. Americans for Good Government, for example, is a pro-Israel PAC, as is San Franciscans for Good Government and Citizens Concerned for the National Interest. Campaign America is a so-called "leadership" PAC sponsored by Senator Bob Dole of Kansas. Wish List is a PAC concerned with women's issues. The only way you're going to identify PACs with generic names like those is to ask them, or look them up if your state requires PACs to state their political agendas.

A good source for identifying federal PACs is the Almanac of Federal PACs by Edward Zuckerman (Amward Publishing). Updated biennially, it profiles all PACs that gave $50,000 or more to federal candidates and identifies their business or ideological interests.

Another good source is the Center's Open Secrets: The Encyclopedia of Congressional Money & Politics. The book identifies the primary interest of every PAC that gave $20,000 or more in the 1992 elections.

Before you can begin entering the codes, you've got to have a field in your database to hold them. If you haven't done it already, now is the time to add a five-character "catcode" field to hold the category code, and a second "source" field (10 characters or less in length) that you'll use to record how you identified the code.

| Data | Field name | Length | Field type |
|---|---|---|---|
| Category code | Catcode | 5 | Character |
| Source | Source | 5 or 10 | Character |

For example, you look up Harold Farquard in the Martindale-Hubbell Law Directory and find that he's a lawyer for Smith, Farquard & Fritz of Seattle. Here's how you fill out the database:

| Name | Newemploy | Catcode | Source |
|---|---|---|---|
| Farquard, Harold | Smith, Farquard & Fritz | K1000 | MartHubb |
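As a rough illustration of the two new fields, here is a minimal sketch using Python's built-in sqlite3 module. The choice of SQLite, the table name "contributions" and the helper function are assumptions made for the example; the text does not prescribe any particular database program. SQLite does not enforce character lengths on its own, so the sketch checks them in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE contributions ("
    " name TEXT, newemploy TEXT,"
    " catcode TEXT,"   # five-character category code
    " source TEXT)"    # how the code was identified, 10 characters or fewer
)

def set_category(conn, name, catcode, source):
    """Record a category code and its source, enforcing the field lengths."""
    if len(catcode) != 5:
        raise ValueError("catcode must be exactly five characters")
    if len(source) > 10:
        raise ValueError("source must be 10 characters or fewer")
    conn.execute(
        "UPDATE contributions SET catcode = ?, source = ? WHERE name = ?",
        (catcode, source, name),
    )

conn.execute(
    "INSERT INTO contributions (name, newemploy) VALUES (?, ?)",
    ("Farquard, Harold", "Smith, Farquard & Fritz"),
)
set_category(conn, "Farquard, Harold", "K1000", "MartHubb")
row = conn.execute("SELECT catcode, source FROM contributions").fetchone()
```

Keeping the source next to the code means that anyone reviewing the database later can see which entries came from a directory and which were judged from the name alone.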

You can figure out many of the category codes simply by looking at the name of the contributor. If you see a contribution from the AT&T PAC, or from an employee or officer of AT&T, you simply look up the code for long distance telephone carrier - C4200. Since AT&T is a well-known company, and you know what business they're in, you can safely apply its code and put the source down as "Name." The same would be true of contributions from the American Medical Association, the National Rifle Association, or any other high-profile group.

Many other contributors can be identified by name even if you never heard of them before, simply because the type of business is evident from the name. Fred's Texaco is clearly a gas station, E1170. Betty's Beauty Salon is a beauty parlor, G5100. Mercy Hospital is a hospital, H2100. Gibson Pharmaceuticals makes drugs (H4300), Bandon Ford-Mercury sells cars (T2300), Main Street Savings & Loan is an S&L (F1200). Contributions from any of these can safely be categorized simply from the name.

When you're beginning the categorization process, a useful way to proceed is to search for certain key words in the "newemploy" field (or in the "contributor name" field if you're reviewing corporate contributions). Isolate, for example, every contribution with the word "Hospital" - or better, "Hosp", since you'll find plenty of abbreviations in the reports. Once you've got all the "hosps" on your computer screen, you can eyeball the records, make sure they're all hospitals, then mark them with the appropriate code. Never let the computer do this step automatically. The search of "hosp" may also turn up entries like "Ray's Animal Hospital," which should be classified under veterinarians, or "Vanessa's Hospitality Service" which might require further investigation. For that reason, even when you use the computer to search key phrases, always review each name by hand, or you're asking for trouble. And then there are always the bedeviling exceptions of companies with misleading names - Rhode Island Hospital Trust, for example, which is not a hospital at all, but a commercial bank. If you're at all in doubt, don't fill in the code based on the name alone. You can always look up the company later.

To help you grind through the lists of companies in your database, here's a partial listing of keywords that can help you identify the type of business. Again, don't automatically assume these codes follow, but search the keywords and go through each list one by one.

| Keyword | Type of business | Code |
|---|---|---|
| Hosp | Hospital | H2100 |
| Real, RE, R E | Real estate | F4200 |
| Nursing | Nursing home | H2200 |
| Sav, S&L | Savings & loan | F1200 |
| National Bank | Commercial banks | F1100 |
| Natl Bank | Commercial banks | F1100 |
| Ford, Olds, Buick, etc | Car dealers | T2300 |
| Toyota, Honda, etc | Japanese import dealers | T2310 |
| Truck | Trucking company | T3100 |
| ISD, USD | Public school district | X3500 |
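The search-then-review routine might look something like the following Python sketch. The function name and the sample employer list are hypothetical, and only a handful of the keywords above are included. Note that the function only proposes a code for each match; it never writes one into the database, in keeping with the warning against letting the computer do this step automatically.

```python
# A few keyword -> suggested code pairs from the table above.
KEYWORDS = {
    "hosp": "H2100",
    "nursing": "H2200",
    "national bank": "F1100",
    "natl bank": "F1100",
    "truck": "T3100",
}

def candidates(employers, keyword):
    """Return (employer, suggested_code) pairs for a human to review.

    Nothing is assigned here; the reviewer confirms or rejects each one.
    """
    code = KEYWORDS[keyword.lower()]
    return [(e, code) for e in employers if keyword.lower() in e.lower()]

employers = [
    "Mercy Hospital",
    "Ray's Animal Hospital",        # really a veterinarian
    "Rhode Island Hospital Trust",  # really a commercial bank
    "Wyzinski & Associates",
]
for_review = candidates(employers, "Hosp")
```

All three "Hosp" matches come back with the hospital code as a suggestion, which is exactly why each one needs a human look before the code is saved.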

Some codes can be applied based on the name of the contributor. Melvin G. Hobbes, MD is a physician, H1100. Rodney Jones, Esq. is a lawyer, K1000. A handful of other initials attached to names can also tell you what the contributor does for a living. Here's a short list.

| Abbreviation | Occupation | Code |
|---|---|---|
| MD | Physician | H1100 |
| DDS | Dentist | H1400 |
| Esq | Attorney | K1000 |
| CLU | Life insurance agent | F3300 |
| DO | Physician (osteopath) | H1100 |
| DVM | Veterinarian | A4110 |
| CPA | Accountant | F5100 |
| The Hon. | Public official | X3100 |
| Rev | Clergy | X0000 |

By the way, do not assume that "Dr." before a contributor's name means they're a physician. Dr. Henry Kissinger is not, nor are most people with PhDs. On the other hand, if you've got a number of otherwise unidentified contributors who list themselves as "Dr," you might want to check their names against a directory of physicians.
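A suffix lookup of this kind could be sketched as below. The function name is hypothetical, the table holds only a subset of the abbreviations listed above, and the sketch assumes the suffix is whatever follows the last comma in the name. It deliberately ignores a leading "Dr." for the reason just given.

```python
# Suffix -> code pairs taken from the table above (a subset).
SUFFIX_CODES = {
    "MD": "H1100",
    "DDS": "H1400",
    "ESQ": "K1000",
    "DVM": "A4110",
    "CPA": "F5100",
}

def code_from_suffix(name):
    """Return a suggested code if the name ends in a known suffix, else None."""
    if "," not in name:
        return None
    # Take the text after the last comma, drop periods, ignore case.
    suffix = name.rsplit(",", 1)[1].strip().replace(".", "").upper()
    return SUFFIX_CODES.get(suffix)
```

For instance, code_from_suffix("Melvin G. Hobbes, MD") returns "H1100", while code_from_suffix("Dr. Henry Kissinger") returns None. As with keyword searches, treat the result as a suggestion to confirm rather than a final answer.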

REFERENCE MATERIALS FOR IDENTIFYING CONTRIBUTORS

Fortunately for journalists who are trying to get a handle on the financial affiliations of political contributors, there is no shortage of reference materials that describe the business interests of different companies. You'll find many invaluable reference books on the shelves of your local library - look in the business reference section. Below is a rundown of some of the most valuable reference sources for identifying companies and contributors. Most are updated annually.

Standard & Poor's Register of Corporations, Directors & Executives. The biggest (and most useful) book in this three-volume set is the nearly 3,000-page listing of over 55,000 corporations. The companies are arranged alphabetically, and each listing identifies the companies' SIC codes and chief lines of business, their top executives, and in many cases their board of directors. It also provides their address and phone number, and additional data like their annual sales and number of employees. Two other smaller volumes complete the set. One lists some 70,000 corporate executives, along with their business affiliations. The other holds a number of indexes (including a very handy listing of all the SIC codes). Probably the most useful indexes are the Cross-Reference and Ultimate Parent indexes. Both of these link subsidiaries and affiliates with their corporate parents. If there's one book you should seek out above all others at the local library for help in identifying companies, this is the one. One caution, though: for the most part, these are not small companies. Ed's Towing, Farley's Bar & Grill, or Jones & Associates won't be listed. (Neither will most law firms or other professional offices, since they tend not to be corporations.)

Million Dollar Directory, published by Dun & Bradstreet Information Services. The format here is similar to Standard & Poor's. Corporate profiles are not as complete, but there are more of them - about 160,000 in the 1993 edition. It's a good backup or supplement to the S&P guide.

Ward's Business Directory of U.S. Private and Public Companies, published by Gale Research. Another useful publication, with the same general format as the books above, but without names of corporate executives. The 1993 edition listed 135,000 companies.

Directory of Leading Private Companies, published by the National Register Publishing Co. This reference, similar in format to the ones above, lists only privately-held companies with annual sales of $10 million or more.

American Medical Directory, published by the American Medical Association. This book lists, or attempts to list, every physician in the U.S. The names are arranged alphabetically, and show each doctor's name and city.

Martindale-Hubbell Law Directory is the closest thing you'll find to a listing of every lawyer and law firm in America. Entries in this multi-volume set are arranged by state, then city, then alphabetically by lawyer and law firm. If you can find (or buy) a copy, try to get your hands on Martindale-Hubbell's CD-ROM version of the directory, which is much easier to search than the printed version - at least if you're a fast typist. Whichever version you get, this directory is a real gold mine, since lawyers are one of the very biggest contributor groups, and they give heavily through individual contributions as opposed to PACs.

All the above directories list companies and individuals from across the nation. Your local library may also have a number of regional directories that can be quite valuable when you're looking at contributors to state and local campaigns.

Specialized volumes like the Texas Oil & Gas Directory and the Hollywood Creative Directory are extremely useful, if your database includes contributors from either of those areas or industries. Similar directories exist in every region of the country. The best thing to do is search the shelves of the biggest libraries in your area.

One geographically specialized volume is worthy of particular note, as its focus is Washington, D.C. and its subject matter includes some of the most influential contributors in the nation. Washington Representatives, published by Columbia Books, is the definitive guide to lobbyists in the nation's capital. It lists both individuals and law and lobbying firms, as well as their clients. It's also cross-indexed, so you can quickly find out who represents a particular company or organization in Washington. If you're trying to track down Washington lobbyists, this book is invaluable. Highly recommended.

The most efficient way to use these reference books, if you can't borrow a copy for the newsroom, is to head to the library with a printout of the companies you're trying to identify. Scope out the books in advance, and make sure your printout is sorted in the same order as the book.

CD-ROMS

The real frontier in directory publishing won't be found on bookshelves any more, but on computer. Many of the biggest publishers - including Standard & Poor's, Dun & Bradstreet and Martindale-Hubbell - are issuing electronic versions of their directories (often with more information than the printed version) on CD-ROM. These electronic versions, while sometimes prohibitively expensive, are an excellent reason for your newsroom to invest in a CD-ROM player if you don't already have one.

If you're unfamiliar with the technology, here's a quick description. CD-ROM stands for "compact disk - read-only memory." Beneath that humdrum nomenclature, however, lies the biggest revolution in media since the invention of the personal computer. The computer companies are hyping it to the hilt. "Multimedia" is their more glamorous buzzword. Essentially, a CD-ROM is a flat circular disk that looks identical to a music CD (it is), but instead of containing music, it contains data. Amazing quantities of data. Six hundred megabytes, to be exact - thousands and thousands of pages of text. The most popular CD-ROMs these days (like the growing number of multimedia encyclopedias) pack that little disk not with words so much as graphics and sound. The most useful ones for investigative reporters are light on the multimedia, but heavy on data. Unfortunately, many are also very expensive. If your newsroom can't afford them, hunt them down at the library.

Dun's Business Locator. A very expensive CD-ROM ($2,395 per year), but the best single source anywhere for millions of smaller companies all around the country. The most recent editions list over 9.1 million businesses. If you're trying to find out what "Jones & Associates" of New Orleans is, this is the place to look. One caveat, however: Dun's lists an inordinate number of businesses as "management services." Whether it's due to their interview techniques or some other reason, companies that shouldn't be classified this way often are. Aside from that one quirk, however, this disk is likely to be the most valuable one you'll find anywhere for identifying literally millions of companies that are listed nowhere else.

Standard & Poor's Corporations. This electronic edition of the classic Standard & Poor's Register of Corporations, Directors and Executives contains much more information than the book alone, though the price ($4,900 per year) tends to keep it in the hands of serious investors only. Much of it is more of interest to investors than to investigative reporters, but among the valuable things it does include are sections for most companies that describe more completely their different lines of business. This is most important when you're dealing with PACs sponsored by big, diversified corporations that could have multiple political interests. The CD-ROM often shows how much of a company's revenue comes from each line of its business - how much General Electric gets, for example, from its aerospace operations as opposed to its home appliance or power generation divisions. You can also search corporate executives by name, but beware - the "hit rate" on random searches in this database is pretty low. The disk is much more valuable for identifying the business interests of mid-sized to large corporations. (A less expensive version, which will consist of only the same material that's in the written edition, is in the works.)

Martindale-Hubbell Law Directory. This is an electronic version of the multi-volume print edition that lists virtually every lawyer in the nation. ($995 for a one-year subscription). Type in the name and up pops the entry that identifies the attorney's law firm, address, and a host of other information, including where they went to law school and when they graduated. Use this reference as a confirmation that a particular contributor is a lawyer, but be aware that lawyers do switch firms from time to time. If Martindale-Hubbell lists them as working for one firm, while their contribution report lists them with another firm, go with the contribution report, as it will have the latest information. This CD-ROM is also valuable for searching names of companies that sound like they're law firms. A good bet here is to search for any unidentified company names that consist solely of names (Kirkland & Ellis, for example), that include the words "et al" (Reed, Smith et al), or that have an ampersand (&) in their title - though you'll want to eliminate firms that have "& Co" or "& Son," as they tend not to be law firms.
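The name screen described above can be approximated with a short Python sketch. The function name is hypothetical, and the sketch covers only the "et al" and ampersand rules with the "& Co" and "& Son" exclusions; spotting names that "consist solely of names" is left to the human reviewer. A match means only that the name is worth looking up in the law directory, not that it is a law firm.

```python
import re

def looks_like_law_firm(company):
    """Rough screen for names worth checking against a law directory."""
    name = company.lower()
    # "& Co" and "& Son" names tend not to be law firms.
    if re.search(r"&\s*(co|sons?)\b", name):
        return False
    # Keep names with an ampersand or the words "et al".
    return "&" in name or re.search(r"\bet al\b", name) is not None
```

With this screen, "Kirkland & Ellis" and "Reed, Smith et al" are flagged for lookup, while "Jones & Co" and "Smith & Son" are skipped.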

PhoneDisc. If you have ever stumbled through the stacks of a large library poring through their collection of out-of-town telephone directories, you will thank your lucky stars for this disk. Or rather, this collection of four different sets of disks - PhoneDisc Business, which applies SIC code classifications to nine million businesses based on the category they appear under in Yellow Pages ads; PhoneDisc Residential, which offers 80 million residential phone listings from the nation's White Pages directories; PhoneDisc ComboPack, with business and residential listings in one package; and PhoneDisc PowerFinder, which includes business and residential listings, by phone number, address, SIC code and name on five regional disks. List prices for the packages range from $79 to $249, though they're widely discounted at mail order computer houses. Not as complete or reliable as Dun's Business Locator (since it's based on the secondary source of Yellow Pages listings), but only a tiny fraction of the price. Essentially, it's the poor man's Standard & Poor's. It's got its limitations, but what a bargain! This disk alone is reason enough to invest in a CD-ROM player for the newsroom. Highly recommended.

Street Atlas USA by DeLorme Mapping. It's a stretch to include this under campaign finance research materials, but it's such an amazing resource no newsroom should be without it. This one disk contains street maps of virtually every city, town, hamlet, highway and byway in America. You can zoom in to the town of your choice not only by typing in its name, but by typing its five-digit zip code, or even the area code and local telephone exchange, as in 202-857. Nearly every street in the country - interstate to gravel road - is in here, and nearly all of them are named. Once you've zoomed in to a particular city or town, you can type in the name of a street and the program will highlight it. In cities medium-sized and up, you can even look up addresses, like the 900 block of S. Halsted Street in Chicago! The list price is $169, but it's commonly discounted to $99 or less.

REAL-WORLD CATEGORIZATION PROBLEMS

Though it would be ideal to categorize every contribution to every politician, given the realities of the data you're dealing with, it's a practical impossibility. Even in states where filling out the occupation/employer of each contributor is required, not every candidate fills out every blank in every report. Even those that are filled out are not always possible to classify. "Self-employed," for example, tells you nothing about the contributor's source of income that you can translate into an industry or interest group category.

In the Center's research into the 1992 federal elections, we were able to categorize approximately 70 percent of the individual contributions with some meaningful category code. The ones that got away fell into five categories:

Homemakers, students and other non-income earners. These are the contributors who don't draw a salary and who haven't been linked either with an income-earning spouse or parent, or with a contribution to an ideological PAC.

Generic occupation - impossible to assign category. These are the ambiguous descriptions - "self-employed," "businessman," "entrepreneur," etc. Without more information, you can't tell how they earn their money.

Engineers, type unknown. This is a subset of the "generic occupation" problem. If someone says they're an engineer, they could work in any number of industries - from construction to oil & gas, manufacturing, electronics, even railroads.

No employer listed or discovered. These are the blanks. No occupation has been listed on the contribution reports, and you've been unable to find the occupation through other means.

Employer listed/category unknown. Even when a contributor lists his or her employer, you're not always going to be able to find out what that company does. D&E Enterprises could be anything. If you can't find them in a business directory, and they're not in the phone book (or you haven't got time to call), you're out of luck.

Given enough time and resources (and state laws that require contributors to list their occupations and employers) it should be possible to categorize 90 percent or more of all contributions. But in the real world of budgets and deadlines and multiple responsibilities, there is never enough time nor all the resources you'd like to have at hand. You should be able to identify virtually every political action committee, as well as the great majority of corporate contributors. The challenge here is the money that comes from individuals. If you can categorize 60 or 70 percent of that, you'll still be doing a job you can be proud of.


Except for the Revolving Door section, content on this site is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States License by OpenSecrets.org. To request permission for commercial use, please contact us.