Monday, May 21, 2012

Why Genealogical Data Standards?

When I was a whole lot younger, I was interested in model trains. I still am, but as with many of my interests, I have to choose between genealogy and other time-consuming passions. One thing I learned about almost immediately was model train scale standards. These came designated by letters such as O, OO, HO, N and NN. That was only the beginning. Now there are dozens of different standards. Here's the rub. It is like the old saying about driving on dirt roads: choose your rut, because you will be in it for a long time. The cost and availability of accessories depended heavily on the scale standard you chose. I still have a box or two of trains that I just might get to some time in my life.

How does this apply to genealogy? I think the comparison is transparent. Some manufacturers or developers, either by government mandate or by overwhelming market forces, have managed to establish standards. In the computer industry, standards have been evanescent; witness SCSI and such. But still there are some standards that make life a whole lot easier. For example, USB connectors. You don't have to try to match cables or buy special adapters because all of the major manufacturers of computers have caved in and used USB. Not that some don't try different standards; witness the new Intel Thunderbolt.

In the software industry, the battle of standards is over file types. There are literally hundreds of different file types. It is extremely common to get a message when trying to load a file that the file type is not recognized. This has carried over into the genealogical software industry, where virtually every single program has its own proprietary, locked up, perfectly incompatible file type designated by its unique extension. This has been carried so far that, as an example, the Mac version of Family Tree Maker creates files that are not compatible with the Windows version of the same program from the same company!

Recognizing this issue years and years ago, FamilySearch developed a standard for exchanging genealogical information between programs. We all know about this standard, called GEDCOM or Genealogical Data Communication. Fortunately, GEDCOM opened up a pathway to share data between different programs, each with its own proprietary format. That was the good news. The bad news is that even though there was a "standard," hardly any of the software developers observed that standard completely, and when you exchanged information, you frequently got a little message telling you what parts of your file were not included in the transfer.
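
For readers who like to see how this plays out, here is a minimal Python sketch of why those "not included" messages appear. It assumes only the basic GEDCOM line layout (a level number, an optional @cross-reference@, a tag, and an optional value). The list of supported tags, the sample file, and the function names are all hypothetical, invented for illustration; no real program's importer is being described.

```python
import re

# Basic GEDCOM line layout: LEVEL [@XREF@] TAG [VALUE]
LINE_RE = re.compile(r"^(\d+)\s+(?:(@[^@]+@)\s+)?(\w+)(?:\s(.*))?$")

# Hypothetical: the only tags an imaginary importing program understands.
SUPPORTED_TAGS = {"HEAD", "INDI", "NAME", "BIRT", "DATE", "PLAC", "TRLR"}

# Hypothetical sample file; _CUSTOM and _DETAIL stand in for vendor extensions.
SAMPLE = """\
0 HEAD
0 @I1@ INDI
1 NAME John /Smith/
1 BIRT
2 DATE 1 JAN 1900
1 _CUSTOM some vendor note
2 _DETAIL nested value
0 TRLR
"""


def parse_line(line):
    """Split one GEDCOM line into (level, xref, tag, value)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a GEDCOM line: {line!r}")
    level, xref, tag, value = match.groups()
    return int(level), xref, tag, value or ""


def import_gedcom(text, supported=SUPPORTED_TAGS):
    """Return (kept, dropped): parsed lines the importer understands,
    and the raw lines it had to leave out of the transfer."""
    kept, dropped = [], []
    skip_below = None  # level of an unsupported tag; its children are dropped too
    for raw in text.splitlines():
        if not raw.strip():
            continue
        level, xref, tag, value = parse_line(raw)
        if skip_below is not None and level > skip_below:
            dropped.append(raw.strip())
            continue
        skip_below = None
        if tag in supported:
            kept.append((level, xref, tag, value))
        else:
            dropped.append(raw.strip())
            skip_below = level
    return kept, dropped


kept, dropped = import_gedcom(SAMPLE)
for line in dropped:
    print("Not included in the transfer:", line)
```

Every program draws the line around its supported tags a little differently, so the same file can lose different pieces depending on where it lands.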

So who cares about standards? Mostly genealogists who want to share their information with others. Who doesn't care about standards? Mostly anyone who wants to sell their own genealogy program. Why is this? Because why would you want to make it easy for people to switch to another program? If you lock up your file formats, then when people try to move from one program to another, they will have to start all over again. Lack of standards promotes brand loyalty.

Without picking on any one developer or company, some developers have, out of self-preservation, created ways for competitors' files to be imported into their programs and for their own files to be exported. It is interesting that some of the most popular programs are those that support file imports from other databases, although this is not necessarily the reason for their popularity.

One other issue is the rapidly changing technology. To be quite frank, GEDCOM no longer supports the current technology.

Recently I wrote a post about FamilySearch, Ancestry.com and MyHeritage concerning their recent positions with regard to possibly "new" standards. Back when GEDCOM was first established, FamilySearch (then operating under a different name) was the big kid on the block. Now, FamilySearch is only one of many big kids on the block and there are other big kids waiting on the next block over.

Will the genealogy program developers all magically agree on a common standard? Very, very unlikely. In all the discussion of GEDCOM X, BetterGEDCOM and so on, the developers, for the most part, have been conspicuously absent. Sure, there are some very progressive companies out there (again without naming anyone in particular), but by and large, there are popular genealogy programs whose developers have never even appeared at a genealogy conference!

It is all well and good to talk about a data exchange standard, but even if Ancestry.com and FamilySearch and MyHeritage and brightsolid and others were to agree, why would that change the proprietary, unique file type mentality of the developers? Remember my comment above: Ancestry.com's Mac and Windows versions of Family Tree Maker cannot exchange files without a translation program that is only partially successful.

1 comment:

Once upon a time, long ago and far away, there was a spreadsheet program called 123. It had its own file format. Along came a company with a spreadsheet program called Excel. The Excel company realized that not everyone used the same spreadsheet program and that people want and need to transfer data between programs. So they developed a way to import other programs' spreadsheet data from other file formats AND they included a way to export their spreadsheet data to other programs' file formats.

What happened was this: People evaluated 123 versus other spreadsheet programs. Some picked 123. Some picked Excel. Some picked yet other spreadsheet programs.

People who picked 123 often found they needed to work with spreadsheets from other people. They could not work with Excel files or other programs' files. They ended up buying Excel as well, so they could transfer the data from other programs. While doing so, those people were introduced to Excel. They may have seen some things they liked in Excel. The next time they bought a spreadsheet program for a new machine, many decided to buy just Excel rather than keep both.

As a developer, I feel the "unique file mentality" you talk about is the biggest mistake any developer can make. Shareability both ways will be the number one selling feature of any program - especially genealogy software.