Why All Genealogy Records Are Not Free

The following article is from Eastman's Online Genealogy Newsletter
and is copyright by Richard W. Eastman. It is re-published here with
the permission of the author. Information about the newsletter is
available at http://www.eogn.com.

- Why Isn't It Free?

The following was published as a Plus Edition article a couple of
weeks ago. At the suggestion of several newsletter readers, I am
now republishing it in the Standard Edition newsletter for all to
see. If you wish, you do have permission to forward this article
to others or to republish it elsewhere for non-profit use. For
details, look at the menus on the upper right side of this web
page and click on COPYRIGHTS.

One topic that surprises me has appeared several times recently in
comments from this newsletter's readers. Some people have questioned
the idea of placing public domain data online and charging for access
to that information, as is done by Ancestry.com, Footnote.com,
FindMyPast, WorldVital Records, and others. One person claimed that
it is illegal to charge for access to public domain data, and another
reader stated that the online sites are "violating my constitutional
rights to view the census."

Sorry, folks, but that simply isn't
true.

Indeed, in the U.S. and Canada, governmental records are
public domain, available free of charge to those who can travel to
the repositories where the original records are stored. Many
private records, such as church records, may not be public domain,
but they are also often available at no charge if one can travel
to view them. When travel is not an option, a trip to a local
library may suffice if that library has microfilms of the original
records that patrons can view for free. (For this article, I will
ignore the costs of sending a filming crew to a repository to make
the microfilms and the expenses of reproducing and distributing
microfilms. However, those expenses are not trivial.)

Given the fact that the records are already available "free of
charge," one might question the need to pay $50 or $100 or
more per year to access the same records on a subscription service
such as Footnote.com, Ancestry.com, Origins.net,
NewEnglandAncestors.org, and other genealogy web sites.

First of all, the idea that the records are available "free" is
only true for those who live near the repository that houses the
original records or photocopies of the records and can walk to
that repository. If you have to travel some distance to a library
that houses the records you seek, you will incur travel expenses.
Even a trip to a library a few miles away will incur costs for
gasoline and perhaps for parking. Such records are not truly
"free."

While perhaps the visitor doesn't pay
anything to view records in books or in microfilms, that library
had to pay someone for the books, the microfilm, the microfilm
reader, the building, the employees, heat, electricity, etc. The
library may not charge the patron to look at the microfilms, but
the process certainly is not free. Information in a library is
never really free. Someone always pays, usually the taxpayers.

A longer trip will incur airfare or automobile expenses, along
with hotel rooms and meals. I can go to Salt Lake City to view the
“free” records available at the Family History Library. The
last time I made that trip, it certainly was not “free.”

A three-day trip to a distant repository can easily cost $500 or
more. If I want to go back to the "old country" to look
at records, expenses will be much higher, of course. For many who
do not live near major genealogy libraries, this quickly changes
the concept of "free."

From the genealogist's
viewpoint, accessing records published on the Internet greatly
increases convenience and reduces travel expenses. From the
publisher's viewpoint, the financial realities of publishing on
the web add up rather quickly when one looks at the expenses
involved with acquiring, digitizing, and electronically publishing
records of interest to genealogists. Such an effort is not
cheap.

To be sure, there are hundreds of web pages available
today at no charge that contain transcribed records from a variety
of sources. RootsWeb has many such pages, as do freebmd.org.uk,
genuki.org.uk, Find-A-Grave, hundreds of local society web sites,
and many others. These web sites contain records transcribed by
volunteers, and someone pays for the web servers, often without
passing those expenses on to users. In most cases, the expenses are
not huge, and advertising can help pay the bills. A few of these web
sites may even contain images of the original records. Most of these
sites have databases that contain hundreds or even thousands of
records. In contrast, commercial services typically provide
millions of records, usually many millions. With larger databases
come larger expenses.

Let's assume that a company or even a
genealogy society, such as the New England Historic Genealogical
Society, decides to make state vital records available on the
World Wide Web. Once an agreement has been negotiated with the
state, the company or society starts work. I will make some rough
estimates of the expenses involved.

In our example, let's say
that the project entails 25 million handwritten records that were
recorded over a 50-year period. (This would be for a state with a
rather small population; many states will have more records than
that in a 50-year period.) Digitizing these records will require
thousands of man-hours. It is doubtful that anyone can find that
number of unpaid volunteers to travel to the repository, run the
scanners, and enter the data. In fact, the repository may not
even have room for a crew of that size.

If you own a
scanner, calculate how many pages you can scan in one hour. Then
calculate how long it would take you to scan twenty-five
million pages. Using a scanner purchased at a local computer
store, I can scan one page every 2 minutes. Assuming a 40-hour
work week, I will need 20,833 weeks for this project. Clearly,
hobbyist-grade scanners will never get the job done. Expensive,
high-speed scanners need to be purchased. Five thousand dollars is
a typical price for high-volume scanners, and this project will
probably require two or more of them. Next, operators need to be
hired to sit at the scanners 40 hours a week to create the
digitized images. Those operators need to be paid.
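
The scanning arithmetic above can be checked with a minimal Python sketch. The page count, scan rate, and 40-hour week come from the text; the 52-weeks-per-year figure (no vacation) and the function name are assumptions added for illustration.

```python
# Figures from the article.
PAGES = 25_000_000          # records to scan
MINUTES_PER_PAGE = 2        # hobbyist-grade scanner
HOURS_PER_WEEK = 40         # one full-time operator

# Assumption (not from the article): the operator works every week of the year.
WEEKS_PER_YEAR = 52


def scanning_weeks(pages, minutes_per_page, hours_per_week):
    """Return the number of work weeks one operator needs to scan all pages."""
    total_hours = pages * minutes_per_page / 60
    return total_hours / hours_per_week


weeks = scanning_weeks(PAGES, MINUTES_PER_PAGE, HOURS_PER_WEEK)
years = weeks / WEEKS_PER_YEAR

print(f"{weeks:,.0f} work weeks, or roughly {years:,.0f} years for one person")
```

Under these assumptions, a single operator with a consumer scanner would need about four centuries, which is why faster scanners and several paid operators are unavoidable.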

This process only makes scanned images of the records, probably
the simplest and least-expensive part of the project. Somebody
else then needs to make indexes as well. The process will vary,
depending upon what is already available. In many cases, someone
sitting at a computer will need to index each and every one of the
millions of entries. Add in many more thousands of dollars in
labor charges.

Now we have created images, plus indexes to
those images. We need some skilled programmers to combine all the
data into one huge database. Skilled database administrators'
labor also is not cheap.

Once the records have been digitized
and a database has been created, the real expenses begin. This
database with twenty-five million high-quality images requires
several terabytes of disk storage. (A terabyte equals one thousand
gigabytes, the same as one million megabytes.) The purchase of a
high-uptime, high-throughput disk array of that size, along with
built-in backup capabilities, easily costs $25,000 or more per
terabyte. Add in the expense of a web server, a database, and the
required software, and the cost soon exceeds $100,000 for
the required hardware and software to make these records available
online to genealogists. This figure does not include the labor
charges mentioned earlier. All this is for a small web site. High
activity web sites such as Ancestry.com will cost much, much
more.
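
To see how "several terabytes" and the $100,000 figure fit together, here is a rough Python sketch. The record count and the $25,000-per-terabyte price come from the text; the average image size of 200 KB is purely an assumption for illustration, since the article does not state one.

```python
# Figures from the article.
RECORDS = 25_000_000
COST_PER_TB = 25_000        # dollars per terabyte of high-uptime disk array

# Assumption (not from the article): average size of one scanned image.
AVG_IMAGE_KB = 200

# 1 TB = 1,000 GB = 1,000,000 MB = 1,000,000,000 KB (decimal units, as in the text).
KB_PER_TB = 1_000_000_000


def storage_terabytes(records, avg_image_kb):
    """Return the disk space needed for the images, in terabytes."""
    return records * avg_image_kb / KB_PER_TB


tb = storage_terabytes(RECORDS, AVG_IMAGE_KB)
disk_cost = tb * COST_PER_TB

print(f"{tb:.1f} TB of images, about ${disk_cost:,.0f} for the disk array alone")
```

With the assumed image size, the disk array by itself already passes the $100,000 mark; a larger or smaller average image changes the total in direct proportion.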

Next, we need very high-speed connections to connect the
hardware to the Internet so that we can serve 100 or more
simultaneous users who wish to view these large graphics files. A
single T-1 line is the minimum requirement for 20 or 30
simultaneous users, but most commercial web servers today are
connected by multiple OC-3 connections. (I'll skip the technical
discussion of T-1 and OC-3 connections. Let's just say that they
are very high-speed lines, capable of handling many
simultaneous users. They also cost a lot of money.)

In most cases, it is cheaper to install the disk array, database
server, and web server at a commercial web hosting service than to
build one's own data center. Hosting fees for a high-usage database
start at $1,000 a month and quickly go up. Way up. Commercial
genealogy companies with lots of users typically pay $10,000 or
more per month in hosting fees. This may seem high, but it is
still much less expensive than building your own data center.

The bottom line is clear to anyone with a calculator: more than a
quarter million dollars is easily expended to make high-quality
original source records available to genealogists. Following that
cost are monthly fees to keep this data available.
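
For anyone who wants to run the calculator exercise, the following Python sketch adds the recurring hosting fees to the up-front figure. It uses only numbers quoted in this article ($250,000 as a lower bound for the initial outlay, and $1,000 to $10,000 a month for hosting); the one-year and five-year horizons and the function name are assumptions chosen for illustration.

```python
# Lower-bound figures quoted in the article.
UPFRONT = 250_000                    # "more than a quarter million dollars"
MONTHLY_HOSTING = (1_000, 10_000)    # small site vs. busy commercial site

# Assumption (not from the article): time horizons to compare.
HORIZONS_YEARS = (1, 5)


def total_cost(upfront, monthly_hosting, years):
    """Return the up-front cost plus hosting fees over the given number of years."""
    return upfront + monthly_hosting * 12 * years


for monthly in MONTHLY_HOSTING:
    for years in HORIZONS_YEARS:
        cost = total_cost(UPFRONT, monthly, years)
        print(f"${monthly:,}/month for {years} year(s): ${cost:,}")
```

Even at the lowest hosting rate, five years of operation adds $60,000 to the initial outlay; at $10,000 a month, the hosting fees alone exceed the original investment within about two years.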

The result is a database in which one can search for a name, find
it, double-click on the entry, and then see an image of the
original record. In other words, primary source records are
visible to anyone in Virginia or California or Australia or
anywhere else in the world with no travel expenses required.

Of course, I have ignored many other expenses. When a popular
database of this sort is placed online, users will have questions.
Someone needs to answer those questions; so, we must create a
customer service department. In the case of a society, a few
members might step forward to answer questions. In the case of
Ancestry.com, it means several hundred employees and a large
building with telephones, computers, and high-speed data
connections. Again, you can guess at the expenses.

Where did
this money come from?

Yes, it would be nice to provide
genealogy information online at no cost. However, if you are the
person who wishes to provide that information, a few minutes with
a calculator will quickly bring you back to reality.

I like to use the analogy of water. Water is free. If I wish, I can
obtain all the water I want at no charge. All I have to do is go to
where the water is located. I can leave buckets on the lawn when it
rains to obtain free water. If that is insufficient to meet my needs,
I can walk to the nearest river or lake with buckets, scoop up all
the water I want, and carry it home at no charge. Our ancestors
did that centuries ago, and we can still do that today if we want.
Nothing has changed. Water is still free.

However, if we
want the convenience of having water delivered to our homes, we
will incur expenses. Our ancestors did not have this option.

Someone paid to purchase large pumps, and they paid for the pipes to
be buried underground to connect our house to the water mains. The
entire construction effort cost many thousands of dollars. In
addition, employees were hired to maintain the pumps and the pipes
to make sure everything continues to work correctly. As a result,
those who consume the water must pay a fee. Yes, the water is
free, but the pipes, the pumps, and the employees are not. Almost
all urban homeowners today pay a water bill. We pay for the
convenience of home delivery. Those who do not want to pay the
delivery fee could elect to have the water shut off and then
obtain free water in the same manner that our ancestors did.

In my mind, public domain information is the same. The information
is free, always has been free, and probably always will be free. I
can still obtain information today at no charge in the same manner
I always have: by going to the source records and looking at them
in person. If I want to go to the location where the information
is located, I can do so at no charge, assuming I am willing to
walk. If the information is located hundreds or thousands of miles
away, I may encounter significant travel expenses, but the
information itself remains free of charge.

HOWEVER, if I want
someone to conveniently deliver the information to my home at any
hour of the day or night that I might want it, I have to pay for
"the pipes" and for the labor of those who provide that
convenient access. We might consider the information to still be
free, but the "pipes" (the servers, the high-speed data
connections, the data centers, and the air conditioning to keep
the equipment cooled, etc.) are not free, nor is all the labor of
the hundreds of people who are involved in delivering that
information to me. Those who invest millions of dollars in
high-speed data "pipes" and all the associated labor
certainly do deserve fair compensation for their
investments.

Yes, the data was free once, and it is still free
today. As always, I still may go to the location where the
information is stored and, in most cases, I can look at that
information free of charge. Nothing has changed. The only
significant change is that we all now have another option: we can
still do things the old way at no charge, or we may use new,
convenient delivery options if we are willing to pay for
that convenience.

Personally, I cannot afford to travel to
Maine or Texas or England or Sweden to look at every single bit of
information about my ancestors that I want to see. I find it much
cheaper to sit at home and pay $10 or $30 a month to look at that
information. Heck, ten bucks won't even pay for the shuttle bus to
the airport, much less airline tickets, hotels, restaurant meals,
and other required expenses to look at the "free" records.
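
That comparison can be put in numbers with a short Python sketch. The $10 and $30 monthly fees and the $500 three-day trip are the figures used in this article; the function name is hypothetical, and a real trip or subscription may of course cost more.

```python
# Figures from the article.
TRIP_COST = 500              # dollars, a three-day trip to a distant repository
MONTHLY_FEES = (10, 30)      # dollars per month for an online subscription


def annual_subscription(monthly_fee):
    """Return the cost of one year of a subscription at the given monthly fee."""
    return monthly_fee * 12


for fee in MONTHLY_FEES:
    yearly = annual_subscription(fee)
    share = yearly / TRIP_COST
    print(f"${fee}/month is ${yearly}/year, {share:.0%} of one ${TRIP_COST} trip")
```

At either price, a full year of online access costs less than a single three-day research trip.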

The only practical method of placing large amounts of
genealogy information on the web is to have someone pay the
expenses of acquiring, digitizing, and providing the data. In most
cases, this means that the customers who benefit will pay. If the
genealogy public does not wish to pay the expenses of "piping"
the information to our homes, we can always do what all the
genealogists of yesteryear used to do: travel to the repositories
where the documents are kept.

As for me, I will choose the
cheaper option and pay a modest fee for someone to "pipe"
the information directly to my home.