Dr. Bourne gave an excellent talk, laying out a vision for how data science will improve health. He also reiterated his view of "Big Data," published recently in the Journal of the American Medical Informatics Association (JAMIA), which focused less on the quantity of data and more on clinical, research, and other health-related organizations making maximal use of all of their data assets [1]. This contrasts with the vagueness, noted in another commentary, that arises when definitions of Big Data focus on the word "big" [2]. Dr. Bourne's utilitarian view makes more sense to me, since there are many "small" data issues around clinical data, such as quality, completeness, and provenance, that must be solved before we can trust and apply the output of Big Data systems [3].

Nonetheless, what I believe were under-appreciated in Dr. Bourne's talk, as is common among those coming from the bioinformatics world where data is more regular and complete, were the scientific issues underlying the challenges of clinical data. Yes, we are (finally!) entering an era when patient data is increasingly captured in electronic form. But just because clinical data is plentiful does not mean it is good data, and there is no evidence, as is sometimes asserted, that larger quantities of data will overcome some of its quality problems. I certainly agree that clinical trials as we now perform them are small, expensive, and may not be generalizable. But that does not prove that observational data in quantities multiple orders of magnitude larger will be better.

I certainly have enthusiasm for using data in our clinical systems. I believe there will be tremendous opportunities for leveraging the value of data, especially when it is of high quality. We will, for example, be able to validate the results of experimental studies on a much larger scale. We will also be able to find many uses for predictive analytics, such as identifying patients where we can intervene to ward off poor outcomes or find ways to deliver healthcare services more efficiently. There is no end to the possible value of Big Data in healthcare and biomedicine.

But the fruits of more data will not be realized just by accumulating more of it in digital systems. One of the big challenges was eloquently stated by another attendee of the talk, Dr. Justin Starren of Northwestern University, who noted that while data science deals with important problems, it takes place outside of the workflows addressed by clinical informatics. On the front end, data science says very little about data entry, workflow, usability of EHRs, and other factors that have, according to a recent survey by Medical Economics magazine, made EHRs the bane of many clinicians [4]. On the back end, there are challenges too, such as whether the output of data analytical algorithms can be applied in ways that measurably benefit clinical outcomes [5].

These challenges matter all the more as criticism of currently used EHRs grows among clinicians. We also know that while a good deal of research shows benefits of IT [6], other research raises concerns about its safety [7]. Clearly we have a ways to go before we solve the end-to-end goal of electronic record-keeping leading to improved health or healthcare delivery.

I recognize we are in an era of tight federal research funding, with few dollars for investing in new programs. I am hopeful that the investments being made in data science will take a broad focus and include investigation into better ways to produce high-quality clinical data as well as optimally use it to improve health, clinical outcomes, and healthcare delivery. In the long run, however, our healthcare system really needs a research agenda and program for clinical informatics.

Last week, the ONC Health IT Buzz Blog featured two retrospectives on the Workforce Development Program, one from Chitra Mohla, Director of the Community College Workforce Program, and the other from me. Within her posting, Ms. Mohla linked to the final summative evaluation report of the workforce program, available both as a summary and as a full report. In my posting, I reiterated what the program accomplished from my perspective, and some of the challenges we faced. All of the data point to a successful investment of HITECH funding, both in meeting acute needs and building capacity for the longer term.

Even though funding has ended, there is still good news about the health information technology (HIT) workforce, some of which I have noted in past blog postings. Probably the best news is that job growth has exceeded all predictions, and is unlikely to abate as healthcare organizations need to use their information systems to improve quality and safety while staying economically competitive. As I have noted before, the work and skills required will change as the focus of HIT shifts from implementing systems to making the best use of them, particularly using their data to achieve better health and healthcare delivery.

ONC continues to promote the workforce agenda as well, mainly through the Workforce Subgroup that is part of the ONC Health IT Policy Committee. I have enjoyed being a member of this subgroup, whose current major effort is to get one or more codes for health informatics added to the Bureau of Labor Statistics (BLS) Standard Occupational Classification (SOC), which is now undergoing revision for its 2018 release. The value of one or more SOC codes will be to make jobs in the field part of US federal employment statistics.

Informatics will continue to be an important part of healthcare, and as a result, careers in informatics will be plentiful and rewarding. Part of the challenge is getting out the word, especially to young people who have had less exposure to the healthcare environment. As such, they may not appreciate the problems in healthcare that informatics addresses or the solutions it is poised to contribute.