Thursday, February 5, 2015

Why not a ranking and review system for online family tree databases

Although we seem to live in a world society that is absorbed with political correctness above any other concern, I think it is time to implement a ranking and rating system for online genealogy databases. I am certain that the commercial companies will be concerned that some of their customers might be offended by low rankings, but such a ranking may also let the uninitiated know that sloppy genealogy is not excused by the greater genealogical community. As a matter of fact, I would probably rate some of my own sources and entries in my own family trees very low, in the one-star category.

The system used by Amazon.com and many other sales outlets gives the user a way to rank any product sold on a scale of one through five stars. There is also a review process where people can comment on the reasons why they awarded a low or high ranking. Let's suppose I had published a family tree on Ancestry.com. What if other users could rank my tree from one to five stars? Wouldn't that give other users an idea as to whether or not to use the information?

Better yet, it would be a good idea if individual entries could be ranked according to a system of one through five stars. That system could also be extended to ranking sources one by one and so forth. Let's get away from all the direct vs. indirect source stuff and move on to what really counts: is the source useful and if so, how useful? I also thought about ranking our ancestors, but then we would really get into trouble.

This whole concept could be expanded into a ranking system for the accuracy of every entry and extended to the accuracy of names, places and dates. This idea would give the casual user of the programs the ability to decide what to do with certain information and to see whether or not other users thought it was reliable.
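To make the idea a little more concrete, here is a minimal sketch in Python of how star ratings and short reviews might be collected for the parts of a single tree entry. Everything in it is an assumption made for illustration: the class name `EntryRatings`, the field names, and the use of a simple average are hypothetical and do not describe any feature of Ancestry.com, FamilySearch or any other site.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical parts of an entry that users could rate separately.
FIELDS = ("name", "place", "date", "source")


class EntryRatings:
    """Collects one-to-five-star ratings and optional reviews for one tree entry."""

    def __init__(self):
        self._stars = defaultdict(list)    # field -> list of star values
        self._reviews = defaultdict(list)  # field -> list of review texts

    def rate(self, field, stars, review=""):
        """Record one user's rating of a field, with an optional written reason."""
        if field not in FIELDS:
            raise ValueError(f"unknown field: {field}")
        if not isinstance(stars, int) or not 1 <= stars <= 5:
            raise ValueError("stars must be an integer from 1 to 5")
        self._stars[field].append(stars)
        if review:
            self._reviews[field].append(review)

    def average(self, field):
        """Mean star rating for a field, or None if nobody has rated it yet."""
        values = self._stars.get(field)
        return mean(values) if values else None

    def reviews(self, field):
        """Written reasons that users gave for their ratings of a field."""
        return list(self._reviews.get(field, []))
```

A casual user would then see, for example, that the date on an entry averages two stars and could read the reviews to learn why, rather than having to guess whether anyone else doubted it.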

Right now, if I think some entry is baloney, all I can do is make a remark that no one will read. I could also delete the information or put in my own information. The problem is what to do if I do not know whether the information is correct but think it is unreliable. Comments do not seem to get anywhere.

13 comments:

James, you call for a "system of one through five stars." With all due respect, that is not a system; it is a grade.

The essential for such a thing would be criteria for judgment. Most folks looking at trees do not have skills to judge quality of evidence used for conclusions presented in such trees. Many persons who have uploaded trees or portions thereof have not included information about what motivated their conclusions -- so whether an item is correct or not, it could be judged as 0-star because evidence is not given. Is that your intention?

I can also see that at least two trees I could upload would be low-ranked by myriad persons who ~believe in~ certain published books and other material. My accounts do not agree with the assertions from that stuff because I have done the requisite research that disproves it.

So what would be the point of rankings? I will never copy something because someone else ranks it highly. If it is of interest to me, I **will** look at whatever sources are given and search also for additional material that might support or refute the given conclusions.

There already has been some discussion of the notion of "voting" for parts of the FamilySearch Family Tree. I hope this notion will not be revived as an intention. The instant it is implemented I will cease making and documenting corrections in that tree.

You make some very good points. But I respectfully disagree. You make the assumption that the reviews and ranking would "judge the quality of the evidence." You are also making a sweeping conclusion that "Most folks looking at trees do not have skills..." That is really one of the strongest arguments for implementing some kind of ranking and review system. You are right that ranking is nothing more or less than a "grade," but coupled with a system of reviews, people have a way of explaining why they have come to their conclusion. In a real sense, for those who know better, Ancestry.com already ranks trees by indicating how many sources support any of the entries when they are found. I use Ancestry's "ranking" to summarily dismiss any connection to any tree with no or a very few sources. I suggest everyone who knows the importance of sources does the same. How would you suggest we let those with no skills know that some trees, sources and entries are unreliable?

James, you say, ". . . Ancestry.com already ranks trees by indicating how many sources support any of the entries when they are found. I use Ancestry's "ranking" to summarily dismiss any connection to any tree with no or a very few sources. I suggest everyone who knows the importance of sources does the same."

Usually when an appreciable number of sources is listed for a tree entry, the constituents are junk databases such as other trees and Millennium File. Even where documentary sources are included, they may be erroneous material from Ancestry's Findagrave Index, or incorrect attribution of Census or other entries (so many enumerations and childbirths after death). All are "sources" from which the tree creator copied or constructed some fiction or other.

Most trees are unreliable, although there are sterling exceptions. There is no quick-fix for evaluating genealogical accuracy. I personally am not going to devote time to going over a 10,000-person tree to get a sense of where there might be accurate bits and where not, in order to give it a grade which will not give any specifics as to what is right, what is merely plausible, and what is wrong. Persons with no skills ought to pursue learning. Persons who want to copy trees will always be myriad.

Your view of genealogical content may be accurate but seems overly pessimistic. In my experience, the number of sources is a threshold way to evaluate whether or not I want to look at the information in the tree at all. As it often happens, I am acquainted with the contributors who have accumulated more sources and already know whether or not to spend any time looking at the content. I find that Ancestry's limited system is a good way to discover those people who are actually working on and contributing to my family lines as opposed to the copy-cats. Do you have a suggestion for how to deal with the flood of online family trees other than simply ignoring them?

Over the past 10 years (or so) there has been nearly constant discussion in various venues regarding what, if anything, to do concerning junk trees and related discussion concerning the sheer number of duplications.

Some, like you, propose a rating system. Respondents ruminate on what criteria would be applied, and/or holler about "genealogy police" and who would designate or vet such persons. Some tree-hosting sites supposedly have means of at least encouraging genealogical accuracy, but in the end it is up to the users to put in their entries.

Some of the genealogy-providers are adopting programs that enable users to copy from existing trees hosted elsewhere. This trend might slightly minimize numbers of trees, but does not hold out a lot of hope for increased genealogical accuracy. Evaluating applicability and accuracy requires a human brain; the sundry search engines can only take us to the water.

I do not have a suggestion for "how to deal with the flood . . . ." As with anything else published anywhere, folks have the right to do whatever they wish. Even if a genie hacker group decided to destroy them all, there still are libraries, historical societies and genealogical societies holding material from which the trees have been drawn, not to mention the innumerable documents in courthouses, regional and religious archives and other places.

I collaborate with actually-researching cousins whose trees include some of our intersecting lines.

I do ignore scores/hundreds of trees that have silly versions of some of my family lines. There is no correcting them, and no point in letting their existence bother me any more than the proclamations of people who don't believe men actually went to the moon.

Good points, but I think that there should be some way to let others in the tree communicate that they think the information is junk. Many if not most of the trees that are copies of my own research have good information but most of the copies have failed to include the documentation. Unified family trees, such as FamilySearch.org's Family Tree, WikiTree.com, Geni.com and others give us a forum where differences can be hashed out, but I still think a ranking system would help to support more reliable online tree information rather than the "wild West" approach today where every online tree is apparently equal in validity.

The provision of source citations does not determine the accuracy, or inaccuracy, of a tree, James. I have seen too many that simply cite an entry in a census, or an entry in the GRO index of civil registrations (England & Wales), and yet are entirely wrong, often with obviously impossible implications. Conversely, the absence of source citations may mean that the tree was simply posted as "cousin bait" rather than a complete genealogy.

My only online tree fell into this latter category, and I recently took it down because of people misguidedly ranking it by whether it cites anything, as opposed to whether the content is correct. Online trees don't offer any way of saying "sources on request", but a much bigger issue -- at least for me -- is that a simple list of citations is wholly inadequate. There are very few cases where a single source can reliably justify a conclusion, and if there are several supporting sources then a proof argument is required. I write these in a largely rich-text narrative form (using STEMMA) but there is no easy way to upload them to an appropriate place without losing their structure and relationship to other data.

Your comments are very insightful. But the present system of presenting family trees online leaves little opportunity for comment either on the accuracy or completeness of the entries. Perhaps there is a better way, but there should be some system of letting the viewer of the family tree know that others consider the information to be either reliable or unreliable and to what degree. You do point out another interesting issue and that is that there is no way for those who put their family trees online to rate their own contributions and explain why there are or are not any supporting citations etc.

What is happening now is that there are simply hundreds or thousands of copies of trees with no system that will give anyone an idea of how reliable or complete the information is absent sources or a rating system or comments or whatever. For example, on FamilySearch Family Tree, you can leave a discussion item on a person, but over the past few years I have never had such a discussion item show up with a question or a comment on any of my thousands of entries. If there were a rating system, it might, at least, start a dialogue about the content of the records.

Good point about ranking your own tree. When findmypast recently introduced their own version, I suggested that they could be the first to provide a "tentative data" flag for selected parts of a tree.

We all have research that's still in progress, or which we want to be a little more sure of, but if you upload it to a tree with no such warning or tag then someone will just copy it and propagate it, whether correct or not.

One venue, the WorldConnect trees on Rootsweb, provides for the tree-owners to enter introductory material, such as emphasis and website URLs, at the head of every person-page in a given tree. No few tree owners have entered disclaimers in this section, saying they have copied the material from others, have not researched it themselves, and even cautioning viewers to verify data. I do not think there is a way to determine whether these statements have influenced whether others copy from the published material.

The general issues are these: with user-owned trees, copying will always happen, but some source trees are better than others; with a unified tree, there is no single version of the truth, and attribution & integrity of any authored work is required.

These are often viewed as the only alternatives rather than just two extremes. My own view -- which is usually a bit different -- is that it is possible to allow people to load their own immutable contributions onto a shared lineage-based framework, and to be able to link to selected contributions from others that they deem accurate/interesting. This is especially useful if someone had uploaded a written work, transcriptions of privately-held letters, images of original photographs, etc., because copying is not necessary. Attribution is automatic through that linking, as is a ranking system (which contributions get linked more often), and the end user creates a richer tree through "composition" rather than "copying".
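As a rough illustration of how linking could double as a ranking, the sketch below counts how many different trees link to each contribution and lists the most-linked first. The function name `rank_by_links`, the shape of the input, and the contribution identifiers are all hypothetical, invented only for this example; it is not how STEMMA or any existing site works.

```python
from collections import Counter


def rank_by_links(trees):
    """Rank contributions by how many trees link to them.

    trees: mapping of tree owner -> iterable of contribution ids linked into that tree.
    Returns (contribution_id, link_count) pairs, most-linked first,
    with ties broken alphabetically by contribution id.
    """
    counts = Counter()
    for linked in trees.values():
        counts.update(set(linked))  # count each contribution at most once per tree
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical example data: three users composing trees from shared contributions.
example = {
    "user_a": ["letter-1", "photo-2"],
    "user_b": ["letter-1"],
    "user_c": ["letter-1", "photo-2", "essay-3"],
}
ranking = rank_by_links(example)
```

Because each contribution stays with its author and is only linked, the count reflects how many people chose to rely on it rather than how many times it was copied.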

Well, when I was practicing law, I used to tell my clients that there was their version of the truth, the other side's version of the truth and the facts that came out of a court decision which was another version of the truth. In historical/genealogical research, we don't deal in absolutes.