Thursday, December 5, 2013

Analyzing Mistakes in Family Trees - Part One

It seems that one of the hallmarks of the online family tree programs is the proliferation of inaccurate and downright sloppy genealogical information. I could give a huge number of absolutely outrageous examples from a whole variety of sources and I probably will at some point in this post. But I am not picking on any one of the dozens (or hundreds) of family tree programs. These comments apply to all of them collectively as well as individually. They all have these same problems.

Responses by genealogists to these examples of poor research and recording practices range from those who shun family trees altogether to those who are simply mildly amused by the content. But with all of the genealogical commentary and complaints about user-generated online family trees, I find there is almost no analysis of the types of errors that are common and why they might be occurring. It is too easy, as I could have done at the beginning of this post, to simply shrug your shoulders and dismiss the problems, but this is a serious issue and needs some serious consideration. If we are truly concerned about the problems, perhaps we need to start thinking of ways to solve them rather than dismissing the whole issue as a lost cause. I am sure there are those of us out there who, when confronted with a messy family tree, wish there was some way to divorce our relatives.

Enough of the frustration and now to the problem. The first and most obvious issue is that of copying existing family trees without any critical analysis of the contents. If I go online and see a huge family tree of all my relatives and I have no prior experience, I would normally believe that the online record was "correct" merely because it looks complete and I have nothing to compare it to. Unfortunately, this is a problem that is not likely to respond to anything genealogists can do. Most of the people uploading a copy of someone's file likely have little or no interest in genealogy as such, but merely think there is some reason to have a copy online. I am certain that there are some who think that they are "doing their genealogy or family history" simply by virtue of the fact that they put something (anything) online. As long as the hosting entities indiscriminately advertise the huge numbers of people they have on user-submitted family trees as if that had some meaning, this situation will continue to exist and grow even larger.

For every copy online there is some "original" that is being copied. If you examine the online family trees you will soon see the patterns of copying because there will be family trees with exactly the same mistakes over and over again. In the past, I have written about some of the possible variations that have occurred just in a few of my own ancestors' listings in family trees online. But the reality is that there are some files that have been copied more than others. This likely occurs because many people have access to the same surname book. For example, in my Tanner line, there are several surname books about our common ancestors. The mistakes in these books have been copied over and over again. It is relatively easy to see where the information originated because of the replication of the same errors in the same individuals.

One of the most obvious common errors involves place names. These errors can be minimized by having a suggested or mandatory standard place name. But the use of standardized place names gives rise to another set of problems. As genealogists, we should be recording the place as it was at the time the event occurred. Standardization often pushes users to record a place name with geographic and jurisdictional information different from what applied at the time of the event. This usually involves entering a present-day county for the location when, historically, the event occurred in an older "parent" county. This issue also arises when there has been a change in the name of an ancestral location, either due to some local issue or because a new ruler has taken over the country.

So in the online family trees, the copying errors can come from previously shared pedigrees and family group records or from "helpful" suggestions from those who want "standardized" place names. This would appear to be an intractable problem, but the solution, from the standpoint of the user, is obvious. Watch for indicator errors. Look for those variations in individuals in a family tree that indicate that the pedigree was copied. In addition, look to the source citations. This is perhaps the easiest way to determine if an online family tree is worth examining. If there are no source citations, then it is highly probable that the information was all copied, all at one time, from a previous family file.
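
For readers who keep their comparisons in a spreadsheet or a script, the two checks just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree layout (a set of `facts` and a list of `citations`), the function names and the sample entries are all hypothetical, and no hosting service is known to expose data this way.

```python
from collections import defaultdict


def copy_indicators(trees, known_errors):
    """Return each known error that appears in more than one tree.

    trees: dict mapping a tree name to {"facts": set, "citations": list}
           (a hypothetical layout, not any real service's format).
    known_errors: set of facts already shown to be wrong.
    """
    shared = defaultdict(list)
    for name, tree in trees.items():
        for fact in tree["facts"]:
            if fact in known_errors:
                shared[fact].append(name)
    # An error repeated across trees suggests a common copied original.
    return {fact: names for fact, names in shared.items() if len(names) > 1}


def unsourced(trees):
    """Return the names of trees that carry no source citations at all."""
    return sorted(name for name, tree in trees.items() if not tree["citations"])


if __name__ == "__main__":
    # Placeholder data, invented purely for illustration.
    wrong = ("Person A", "birth_place", "Wrong County")
    trees = {
        "tree_1": {"facts": {wrong}, "citations": []},
        "tree_2": {"facts": {wrong}, "citations": []},
        "tree_3": {"facts": {("Person A", "birth_place", "Parent County")},
                   "citations": ["parish register"]},
    }
    print(copy_indicators(trees, {wrong}))
    print(unsourced(trees))
```

A shared error does not prove which tree was the original, only that the trees probably descend from the same file or book.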

One of the most common responses to my concerns about online family trees is that the trees are valuable because they may provide suggested ancestors that later turn out to be valid. I would certainly agree that this is possible, but, as I have said before, I doubt that I would take the time to examine hundreds of unsourced family trees on the mere possibility that there might be an entry that would be helpful. It would appear to me that the most expedient way of resolving this issue is to block any family tree entries that are not supported by some sort of evidence. Another group of people comes back to me with the comment that we should be extremely lenient with family tree people who are just starting out and not hurt their feelings by forcing them to add something to their photo, story or document that they don't care to do and do not see the utility of. Hmm. Well, that is a consideration, but should we simply allow the continued proliferation of duplicate family trees? Where does the researcher fit in this process? I am not aware of any online family tree hosting service that refuses postings based on the lack of supporting citations of sources. Maybe someone should try that tactic.

14 comments:

I'm sure you have seen the blog "Barking up the Wrong Tree". This has examples of what can happen when you just click away and copy what has been done before.

For myself, I'm slowly adding my direct line to FamilySearch Family Tree person by person, merging as I go. I haven't found anything with a source and some of the data is downright cryptic. I'm stuck now on a person with a death date (which I haven't been able to locate myself) and some other reference in the date field.

Genealogy's Star wrote "I am not aware of any online family tree hosting service that refuses postings based on the lack of supporting citations of sources. Maybe someone should try that tactic."

NO, no, no that would only suggest to those who copy online trees that the tree they are copying is accurate.

Any compiled tree should only be viewed as a possible clue to relationships. Note: any tree, not just online trees. Even the heralds in the Middle Ages compiled trees to suit their "clients" at times rather than sticking to the accurate facts of the day.

The public often view sources as proof of accuracy but in reality citing sources only indicates where the relevant information came from. It claims nothing about accuracy of the conclusion.

It could be argued that an unsourced tree is the better option than a sourced tree, as full research must be done to discover the truth, whereas a sourced tree may lead someone to make the same errors in deduction as the original researcher.

Cheers, Guy

1) We cannot stop people copying information from others, although I'm always surprised at the level of acceptance shown by those people. Software (especially online stuff) has historically been very poor about encouraging citations. They want to hit the mass market, who in turn simply want a big tree to show off. That's changing, but attribution is equally important and should be automatic. If you copy data from someone's tree, your copy absolutely should say where and when it came from. I have mentioned elsewhere that parts of my tree have conflicted with the majority of online trees. I can cite my sources but, despite these online trees obviously being copies, I have no way to trace the source tree(s).

2) I disagree about standardised place names. They should be recorded as they were in the evidence, and not changed to their modern equivalents. However, in http://parallax-viewpoint.blogspot.com/2013/08/a-place-for-everything.html, I argued that every place needs some unique identifying reference that allows us to correlate disparate references, and this probably means an Internet-based Place Authority. Hence, if the data of two people record the name of a particular place differently, but also accurately, they can still agree that it's the same place based on it being associated with this same identifier. This removes all the issues of variant spellings, names in different languages, and jurisdictional changes over time -- that can all be held in the authoritative database, and keyed by that identifier.

3) I believe this issue is especially associated with family trees. People who research 'family history', on the other hand, are much more likely to use citation & attribution. The industry needs to drop this ridiculous "family tree" focus (apparent in virtually all advertising and publicity) as people assume that trees are all there is (http://parallax-viewpoint.blogspot.com/2013/10/ok-i-have-family-tree-now-what.html and http://parallax-viewpoint.blogspot.com/2013/10/micro-history-for-genealogists.html).
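
The identifier idea in point 2 can be sketched roughly in Python. This is a minimal sketch, not a description of any existing Place Authority: the class name, the method names and the identifier `place:0001` are all made up for illustration, and the sketch assumes each name belongs to only one place, which a real authority could not assume.

```python
class PlaceAuthority:
    """Toy lookup table: many recorded names, one identifier per place."""

    def __init__(self):
        self._names_by_id = {}   # identifier -> set of names as recorded
        self._id_by_name = {}    # case-folded name -> identifier

    def register(self, place_id, *names):
        """Attach one or more recorded names to a place identifier."""
        known = self._names_by_id.setdefault(place_id, set())
        for name in names:
            known.add(name)
            # Simplification: a name maps to a single place only.
            self._id_by_name[name.casefold()] = place_id

    def resolve(self, name):
        """Return the identifier for a recorded name, or None if unknown."""
        return self._id_by_name.get(name.casefold())

    def same_place(self, first, second):
        """True only when both names are known and share an identifier."""
        first_id = self.resolve(first)
        return first_id is not None and first_id == self.resolve(second)


if __name__ == "__main__":
    authority = PlaceAuthority()
    # One city, recorded under the names it carried at different times.
    authority.register("place:0001", "Saint Petersburg", "Petrograd", "Leningrad")
    print(authority.same_place("Petrograd", "Leningrad"))
    print(authority.resolve("Unknown Village"))
```

Each researcher keeps the name exactly as it appears in the evidence; only the identifier is shared, so nothing has to be rewritten to a modern equivalent.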

Once again, more ideas than I can write about in a lifetime. Thanks for all your thoughts. I think I will take on standardization again shortly. Personally, I think people should use geographic coordinates more often.

One common place-name mistake is conflating a place of birth or death with the record-set where the data is recorded (or assumed to have been recorded), such as "Deerfield Monthly Meeting," or "German Flats German Reformed Church." For USA Maryland and Virginia families it is also common to see trees giving such event locations as in Church of England Parishes or Parish churches (which often did not exist at the time of the events) rather than in the Hundreds or possibly towns. Without specific source citations it is impossible for the viewer to determine what exactly the data-writer meant.

Following on from Geolover's comment - I just looked in one web-site for my 5G GM, Margaret Gandy, m 1791 in Cheshire. I found 17 entries for her (under that name) in trees. 14 of them claim she was born in Cheshire, when in fact - as far as I know - the only evidence for her birth is her age at various points in life. Nothing about where she was born. Perfectly logical that she was born in Cheshire, as that's where the evidence for her comes from, but it's an assumption.

Just to add to the flaws, she's the subject of another typical "error". In 16 of those 17 trees, she's named as "Margaret Gandy" only. In fact, if you were to look at her marriage to my 5G GF, she's a widow on her marriage to him and it's thus very unlikely her birth name is Gandy. It actually doesn't take much to find that her birth name was Hughes - it appears on her first marriage (Gandy isn't too difficult a name to trace) and on the baptism of my 4G GF as the church concerned was in one of those brilliant periods when they recorded grandparents' names as well as parents. Sadly, all too often people never get beyond the index to read the original so assume her name on marriage must be her birth name.