How to Create and Interpret Dot Plots and Histograms in a Six Sigma Project

Both dot plots and histograms give you lots of information about the variation of a critical characteristic in a process for a Six Sigma initiative. A dot plot shows the scatter and grouping of the data from a single characteristic using (no surprise here) dots. A histogram takes the data from the dot plot and replaces the dots with bars.

After collecting measurements or data for a characteristic, create a dot plot for it by using the following steps:

1. Create a horizontal line that represents the scale of measure for the characteristic.

This scale should be in whatever measure best quantifies the aspect of the characteristic you’re interested in: for example, millimeters for length, pounds for weight, minutes for time, or number of defects found on an inspected part.

2. Divide the horizontal scale of measure into equal chunks or buckets along its length.

Select a bucket width that creates about 10 to 20 equal divisions between the largest and smallest observed values for the characteristic.

3. For each observed measurement of the characteristic, locate its value along the horizontal scale and place a dot for it in its corresponding bucket.

If another observed measurement falls into the same bucket, stack the second (or third, or fourth) dot above the previous one.

4. Repeat Step 3 until you’ve placed all the observed measurements onto the plot.

To create a histogram, replace each of the stacks of dots with a solid vertical bar of the same height as its corresponding stack of dots. Note: The vertical dimension on a dot plot or histogram is sometimes called frequency or count.
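The dot plot steps and the histogram conversion above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed Six Sigma tool: the function names `bucket_counts` and `render` and the sample measurements are assumptions made up for the example, and it uses only 4 buckets to keep the output small, whereas the guideline above suggests about 10 to 20 for real data.

```python
def bucket_counts(values, n_buckets=10):
    """Count observations per equal-width bucket between min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; nothing to divide into buckets")
    width = (hi - lo) / n_buckets
    counts = [0] * n_buckets
    for v in values:
        # The largest value computes to index n_buckets, so clamp it
        # into the last bucket.
        idx = min(int((v - lo) / width), n_buckets - 1)
        counts[idx] += 1
    return lo, width, counts


def render(counts, mark="o"):
    """Draw one column per bucket, stacking `mark` up to each bucket's count."""
    rows = []
    for level in range(max(counts), 0, -1):
        rows.append(" ".join(mark if c >= level else " " for c in counts))
    return "\n".join(rows)


# Hypothetical measurements, for illustration only.
measurements = [1, 2, 2, 3, 3, 3, 4, 4, 5]
lo, width, counts = bucket_counts(measurements, n_buckets=4)

print(render(counts, mark="o"))  # dot plot: stacked dots
print(render(counts, mark="#"))  # histogram: solid bars of the same height
```

The same `counts` list drives both drawings, which mirrors the point that a histogram is a dot plot with each stack replaced by a bar of equal height.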

A dot plot and its fancy cousin, the histogram, offer ready access to a wealth of information about the variation of a characteristic’s performance. The following points are a few aspects of a dot plot or histogram to note:

Plot height: The frequency — the height of dots or the bar — in a dot plot or histogram indicates how often the corresponding value on the horizontal axis was observed.

Variation shape: The shape of the variation on a histogram comes in three basic flavors: normal, uniform, and skewed. A normally distributed variation shape is bell-shaped. For a normal distribution, most of the observed values of the characteristic are close to a central point, with fewer and fewer values appearing as you get farther away from that central point.

With a uniform distribution, the variation is evenly spread out across a bounded range. That is, you’re just as likely to observe a value for a characteristic at one end of the interval as you are at the other, or anywhere in between.

A skewed distribution is a variation shape that isn’t symmetrical; one side of the distribution extends out farther than the other side.

Variation mode: The mode of a distribution is its most often repeated value, or in other words, its peak. Usually, the variation in a characteristic has a single peak.

But sometimes, a characteristic displays two or more modes because two or more values dominate the variation. A histogram showing two or more distinct peaks is multi-modal. Multiple major peaks aren’t usual; this situation typically means that a factor affecting the characteristic’s performance is causing the system to behave in two or more distinct ways.

When you encounter a multi-modal distribution, always dig deeper to discover what factor or factors are causing the characteristic’s split behavior.
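As a rough sketch of how you might check bucket counts for more than one peak, the following Python counts buckets that stand taller than both of their neighbors. The function name `find_peaks`, the `min_height` parameter, and the sample counts are all hypothetical. Small random bumps can also register as peaks, so treat the result as a prompt for a closer look at the plot rather than a verdict.

```python
def find_peaks(counts, min_height=1):
    """Return indexes of buckets strictly taller than both neighbors.

    Flat-topped peaks (two equal neighboring buckets) are not detected
    by this simple rule.
    """
    peaks = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i < len(counts) - 1 else 0
        if c >= min_height and c > left and c > right:
            peaks.append(i)
    return peaks


# Hypothetical bucket counts with two distinct humps.
counts = [1, 4, 9, 4, 2, 5, 10, 6, 1]
peaks = find_peaks(counts, min_height=3)  # min_height ignores tiny bumps

if len(peaks) > 1:
    print(f"Multi-modal: peaks at buckets {peaks}")
else:
    print(f"Single mode at bucket {peaks}")
```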

Variation average: A dot plot or histogram lets you visually estimate a characteristic’s mean, or average value, without your having to crunch any numbers.

Variation range: The extent or width of variation present in a characteristic is immediately visible in a dot plot or histogram. The difference between the greatest observed value xMAX and the smallest observed value xMIN is the range of the distribution. The symbol R conventionally represents the range, which you calculate with the following equation:

R = xMAX – xMIN
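If you do want to crunch the numbers, the mean and the range formula above take only a few lines of Python. The measurements here are hypothetical values chosen for illustration.

```python
from statistics import mean

# Hypothetical measurements, for illustration only (say, in millimeters).
measurements = [48, 51, 53, 50, 49, 56, 52]

x_max = max(measurements)   # greatest observed value
x_min = min(measurements)   # smallest observed value
R = x_max - x_min           # range: R = xMAX - xMIN

average = mean(measurements)

print(f"mean = {average:.2f}, range R = {R}")
```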

Outliers: Outliers are measured observations that don’t seem to fit the grouping of the rest of the observations. They’re either too far to the right or too far to the left of the rest of the data for you to conclude that they come from the same set of circumstances that created all the other points.

When you see an outlier or outliers on a dot plot or histogram, you immediately know that something is probably different about the conditions that created those points, whether it’s the process’s setup or execution or the way you measured the process.
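The text above identifies outliers by eye. If you want a numeric screen to go with the plot, one common convention (an assumption here, not something this article prescribes) is to flag points more than 1.5 times the interquartile range beyond the first or third quartile. The function name `flag_outliers` and the sample data are hypothetical.

```python
from statistics import quantiles


def flag_outliers(values, k=1.5):
    """Flag values beyond k * IQR outside the quartiles.

    This is the widely used 1.5 * IQR convention, swapped in for the
    visual judgment described in the text.
    """
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    return [v for v in values if v < low_fence or v > high_fence]


# Hypothetical measurements with one point far to the right.
measurements = [10, 11, 12, 12, 13, 13, 14, 30]
suspects = flag_outliers(measurements)
print(suspects)
```

A flagged point is only a candidate: you still need to check whether the setup, execution, or measurement really differed for that observation.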

If you want to get more quantitative with your dot plots and histograms, you can use them to calculate the proportion of observations you’ve measured within an interval of interest or to predict the likelihood of observing certain values in the future.

Suppose you measure a characteristic 50 times. Counting and adding up what’s in each of the buckets of your dot plot or histogram, you observe 17 measurements that occur between the values of 5 and 6. You can conclude, then, that 17 out of 50, or 34 percent, of your measurements ended up between 5 and 6.

Now, peering into the future, you can predict that if the characteristic continues to operate as it did during the time of your measurements, about 34 percent of future observations will end up being between 5 and 6. The casinos of Las Vegas thrive in business because they use statistics this way to know what will happen, on average, when you sit down for, say, a game of craps.
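The 17-out-of-50 calculation can be sketched as follows. The function name `proportion_in_interval` and the stand-in data are hypothetical, and treating the interval as including 5 but excluding 6 is an assumption, since the text only says "between 5 and 6."

```python
def proportion_in_interval(values, low, high):
    """Fraction of observations with low <= value < high."""
    inside = sum(1 for v in values if low <= v < high)
    return inside / len(values)


# Stand-in data: 50 measurements, 17 of which fall between 5 and 6.
measurements = [5.5] * 17 + [7.0] * 33

share = proportion_in_interval(measurements, 5, 6)
print(f"{share:.0%} of measurements fell between 5 and 6")
```

The result is an estimate from one sample, so the share you see in future observations will vary around it, and it holds only while the process keeps behaving as it did during measurement.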