May 31, 2007

More Dialogue on NMRShiftDB Debate

Peter offers a great justification for providing open access to scientific information:

In the case of NMRShiftDB I am firmly of the opinion that it leads the
way in opening access to scientific information. If the community
wishes it can use it as a growing point to develop more and better
data. If they don’t, they will continue to use existing non-open
systems (or in most cases not use anything at all).

Tony goes into more depth on the ACD/Labs validation study and specifically addresses Wolfgang's claim about large errors:

What I do NOT mean is that a chemical shift at 120ppm is predicted
to be at 80ppm and therefore there is a large error. No, the chemical
shift at 120ppm could be experimentally correct but the prediction
algorithm could fail to predict it correctly.

What I DO mean is that an assignment of a particular nucleus to
120ppm may be entered into the database but the ACTUAL shift should be
12ppm….that additional zero just showing up as an error during the
submission process. So, the errors I am pointing to are those of
incorrectly drawn structures, mis-assignments, transcription errors and
other potential sources of error. My estimates refer to the number of
significant assignment or structural errors that were glaringly
incorrect and I was subjectively thinking of situations where the
difference between the actual experimental shift value and the one
assigned to nucleus was >20ppm….this does not mean that
mis-assignments of even 1ppm are any less importance, just not
necessarily as easy to detect and not part of my subjective criteria.

At present, the data have been examined in more detail and I believe
I overestimated…a report of potential glaring errors has been returned
to Christoph for him to examine and make changes to the database as he
sees fit. Glaring errors are less than 250 in number based on my
subjective criteria. Again, this does not mean that there aren’t
hundreds or thousands of errors buried in the data…they are not obvious
errors and require more manual examination.
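
To make Tony's subjective criterion concrete, here is a minimal sketch of a ">20ppm" check. It is only an illustration, not the procedure ACD/Labs actually used: the `Assignment` record, its field names, and the sample values are all hypothetical, and it assumes an independently verified reference shift is available for each nucleus (a predicted shift would not do, since Tony is careful to separate prediction failures from database errors).

```python
from dataclasses import dataclass

# Tony's subjective cutoff for a "glaring" error, taken from the quote above.
GLARING_THRESHOLD_PPM = 20.0


@dataclass
class Assignment:
    """Hypothetical record for one assigned nucleus (not the NMRShiftDB schema)."""
    atom_label: str       # illustrative identifier, e.g. "C1"
    entered_ppm: float    # shift as entered in the database
    reference_ppm: float  # independently verified experimental shift (assumed available)


def glaring_errors(assignments, threshold=GLARING_THRESHOLD_PPM):
    """Return assignments whose entered shift differs from the reference by more than threshold."""
    return [
        a for a in assignments
        if abs(a.entered_ppm - a.reference_ppm) > threshold
    ]


# Made-up sample data: C1 mimics the "additional zero" transcription error (12 entered as 120).
records = [
    Assignment("C1", 120.0, 12.0),
    Assignment("C2", 77.2, 76.2),    # 1 ppm off: still an error, but not "glaring"
    Assignment("C3", 128.5, 128.5),  # correct
]

flagged = glaring_errors(records)
for a in flagged:
    print(f"{a.atom_label}: entered {a.entered_ppm} ppm, reference {a.reference_ppm} ppm")
```

Note that the 1 ppm mis-assignment slips through, which is exactly Tony's point: a threshold like this catches only the obvious errors, and the rest need manual examination.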

Make sure you check out their blogs.

EDIT: This conversation has continued in the following entries (in order):