Presenting Data Effectively

This article is the first in a four-part series on essential statistical techniques for any scientist or engineer working in the biotechnology field. Though many of the techniques seem trivial, they are often misused or misunderstood in practice. Future topics will include hypothesis testing, confidence intervals, Design of Experiments, and analysis of variance. This column deals with the ways data can be presented to maximize effectiveness, including methods to summarize data sets.

DATA TYPES

There are two main types of data: quantitative and qualitative. Quantitative, or numerical, data are continuous: they can take any of an infinite number of possible values within a range. For example, protein concentration is considered continuous data because its resolution is limited only by the sensitivity of the measurement device. Qualitative, or categorical, data have a finite number of possible values. For example, the number of defective vials in a lot is considered qualitative because it can take only whole-number values, from zero up to the number of vials in the lot.

Each data type calls for different statistical methods of summarizing and reporting.

QUALITATIVE DATA

Figure 1. An example of a bar graph

Qualitative data are usually presented in tabular format or as percentages. Graphically, a bar graph can be used to present a single variable or to compare two or more variables. Figure 1 and Table 1 present the reasons for lot failures.

Table 1. The reasons for a lot failure

Plotting the raw frequencies is not enough to compare the data by year. If the number of lots differs from year to year, it is preferable to plot percentages, which put both years on the same scale. Figure 2 shows the preferred graph.

Figure 2. A preferred bar graph for comparing data from two years. The table below the bar graph makes it clear that the number of lots was different.
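The conversion from frequencies to percentages can be sketched in a few lines of Python. This is a minimal illustration only: the counts and the names `failures_by_year` and `percent_failed` are hypothetical and are not the data behind Table 1 or the figures.

```python
# Hypothetical counts, for illustration only (not the data from Table 1).
failures_by_year = {
    "2006": {"lots": 40, "failed": 10},
    "2007": {"lots": 80, "failed": 12},
}


def percent_failed(failed, lots):
    """Return the failed lots as a percentage of all lots produced."""
    if lots <= 0:
        raise ValueError("lots must be a positive number")
    return 100.0 * failed / lots


# Percent of lots that failed in each year.
percentages = {
    year: percent_failed(counts["failed"], counts["lots"])
    for year, counts in failures_by_year.items()
}

for year, counts in failures_by_year.items():
    print(
        f"{year}: {counts['failed']} failed of {counts['lots']} lots "
        f"({percentages[year]:.1f}%)"
    )
```

With these assumed numbers, the raw count of failed lots rises from 10 to 12, yet the failure rate falls because twice as many lots were produced in the second year. That is the kind of difference a frequency plot can hide and a percentage plot makes visible.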

The comparative graph in Figure 1 does not show the marked improvement from 2006 to 2007 that can be seen in Figure 2. Additional statistical tests, such as the chi-squared test, can be used to show that the percent defective in 2007 was significantly lower than in 2006 (p = 0.005).
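For readers who want to see how such a comparison might be computed, the following is a minimal sketch of a Pearson chi-squared test on a 2 x 2 table of counts, using only the Python standard library. Several things here are assumptions: the article does not say which form of the test was used, so this sketch applies no continuity correction; the function name `chi_squared_2x2` is hypothetical; and the counts are invented for illustration, so the result is not expected to reproduce the p-value quoted above.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test, without continuity correction, for the
    2 x 2 table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("every row and column total must be nonzero")
    # Shortcut formula for the 2 x 2 Pearson statistic.
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, the upper-tail chi-squared probability
    # equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical counts, for illustration only:
# rows are years, columns are (defective, acceptable).
statistic, p_value = chi_squared_2x2(30, 70, 15, 85)
print(f"chi-squared = {statistic:.2f}, p = {p_value:.4f}")
```

A small p-value indicates that a difference in percent defective this large would be unlikely if both years shared the same underlying defect rate. In practice, a statistics package would normally be used instead of a hand-written function, and small expected counts call for a continuity correction or an exact test.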