Current foci for research and reflection.

The summary for a busy person: I am
interested in Tufte's sparklines and teaching old dogs new
tricks. My interests are eclectic and not at all well focused.
However, this particular page is just about biology-related
informatics. For this discussion, biomedical informatics is
the macro and bioinformatics is the micro division of the
subject matter. All kinds of informatics benefit from the
database-backed Web site and statistical models. Many of these
terms are new, but they derive from problems that have attracted
me since the late sixties.

Bioinformatics problems can be subdivided into three groups:
first, the algorithms used to piece together the genome;
second, the predictions about proteins, and hence cellular
behavior, made from that genome; third, translational research,
or accelerating the transformation of genomic data into
therapeutic products.

The global recession has shifted the emphasis
of all health-care-related research considerably toward those
areas which are, or are perceived to be, likely to lower costs
and improve quality. Vaccines, insulin and ACE inhibitors are
examples of the very few innovations that have done both.
Performance-based pay has not been demonstrated to do either.
The U.S. government's incentives to adopt IT in health care
reward use of the current technology. Because this existing
technology is fragmented into EHR, diagnostic and e-prescribing
software, it has little chance of lowering costs or increasing
quality.

The Human Genome Project enabled us to
specify the state of an individual human organism in far greater
detail, but at the cost of entering, saving and retrieving a
far greater number of data points. Edward Tufte's invention,
sparklines, is a very important attempt to re-compress this
explosion of data so that it can be communicated to decision
makers in a timely manner -- thus saving time and money and
reducing errors and waste.
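
To make the idea concrete, here is a minimal sketch of a text-only
sparkline in Python. It is an illustration, not Tufte's own design:
the function name, the use of Unicode block characters, and the
glucose readings are all assumptions made up for this example.

```python
# Unicode block characters, ordered from the shortest bar to the tallest.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Return a one-line text sparkline for a sequence of numbers."""
    values = list(values)
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series carries no shape, so draw every point at mid height.
        return BARS[len(BARS) // 2] * len(values)
    top = len(BARS) - 1
    # Scale each value into the range 0..top and pick the matching bar.
    return "".join(BARS[round((v - lo) / span * top)] for v in values)


# Hypothetical glucose readings (mg/dL), invented purely for illustration.
glucose = [92, 95, 101, 110, 126, 140, 133, 118, 104, 97]
print(sparkline(glucose))
```

Ten readings collapse into ten characters that fit inside a line of
text, which is the point: the reader sees the rise and fall without
leaving the sentence. A real clinical sparkline would also need a
reference range and units, which this sketch omits.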

Where will the manpower come from to manage
all this data? For such a big problem, it would be efficient
to provide some more training to those who have already
mastered some of the requisite skills. We can teach old
clinician dogs new tricks like sparklines and Ruby on
Rails. We can teach old computer dogs the advances in
molecular biology. The Y2K problem employed a lot of old
dogs, and young dogs who learned some old COBOL tricks.
The problems where biology and informatics intersect dwarf
the Y2K problem.

There are persistent problems. More than
half a century ago Turing found that a computer, even if it knew
another computer's whole software program or genome, could not
always predict what the other computer would do. Another old
problem lies with the relational database that underlies most
database-backed Web sites. No algorithm exists to get to a
fully normalized database where each of the rules is both
necessary and sufficient for modeling an enterprise. So our
models in biomedical informatics will be approximate and rely
on statistics. As George Box put it, "All models are wrong,
but some are useful."