Wednesday, December 14, 2011

Looking at Past Weather Data with Time Series Plots

Christmas is right around the corner and the news media is abuzz about retail sales forecasts
for this year's holiday shopping season. As much as I hate to admit it,
another type of forecast is also on the horizon—a forecast for
impending winter weather!

Last year’s winter weather started early, and you might remember that
New York City got hit with quite the winter storm the day after
Christmas 2010. In fact, New York City’s December snowfall total for
2010 was a whopping 20.1 inches!

While meteorologists will be releasing their winter weather predictions
in coming weeks, I thought it might be interesting to use Minitab time
series plots to graphically view past snowfall histories for NYC,
Philadelphia, and Washington, D.C. You know what they say—“Learning
about the past can help you prepare for the future.”

To start, I visited NOAA.gov
to gather monthly snowfall totals for December in New York City
(Central Park), Philly, and Washington, D.C. from 1991-2010. I recorded
the data I collected for each city in a Minitab worksheet:

In Minitab, I chose Graph > Time Series Plot > Simple
to get a graphical representation of the snowfall data. Time series
plots are a good graphing option when each observation in your data
corresponds to a single point in time and the observations are recorded
in time order. They’re also really helpful for quickly spotting patterns
in time series data by eye.

What can we infer from this time series plot?

While the amount of December snowfall in NYC varies quite a bit from
1991 to 2010, I see a pattern of steep increases followed by steep
decreases. It seems that after a December with heavy snowfall, the
following December has much less. However, I’m sure there would be
several exceptions to this pattern if we added more years of December
snowfall data to our time series plot, say dating back to 1900.
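If you'd rather check that up-and-down pattern with numbers than by eye, here is a minimal Python sketch. It counts how often the year-to-year change flips direction. The snowfall values are made-up placeholders, not the NOAA totals from my worksheet, and the function names are just ones I picked for illustration.

```python
# Illustrative placeholder values only; these are NOT the NOAA snowfall totals.
december_snowfall = [0.7, 9.1, 2.4, 11.5, 0.0, 6.3, 1.2, 8.0]


def yearly_changes(values):
    """Return the change from each year to the next."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def count_reversals(values):
    """Count how often an increase is followed by a decrease, or vice versa."""
    changes = yearly_changes(values)
    reversals = 0
    for previous, current in zip(changes, changes[1:]):
        if previous * current < 0:  # opposite signs mean the direction flipped
            reversals += 1
    return reversals


changes = yearly_changes(december_snowfall)
reversals = count_reversals(december_snowfall)
print(f"{reversals} reversals out of {len(changes) - 1} consecutive pairs of changes")
```

A count close to the number of pairs would back up the "steep increase, then steep decrease" impression. With only 20 years of data, though, I wouldn't read too much into it.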

You can see from the time series plot (with multiple series) of Philly
and Washington, D.C. that a similar pattern of sharp increases followed
by sharp declines exists. Also, notice that December snowfall amounts
for the two cities have always increased or decreased together. From
1991 to 2010, not once did one increase while the other decreased from
the previous year! The correlation between the two measurements is
0.863, so I think it's pretty fair to assume that this December's
snowfall amount will move in the same direction in both cities.

I’ll leave the real weather forecasting to the professionals, but I
think that because Philly and D.C. are so close together, what happens
weather-wise in the two cities is likely to be highly correlated (and
the plot above and the correlation calculation seem to reflect this).
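For anyone who wants to reproduce this kind of comparison outside Minitab, here is a rough Python sketch of the two checks I described: the Pearson correlation between two series, and a count of the years in which both moved the same way. The numbers are placeholders I invented for the example, not my Philly and D.C. data, so don't expect them to give 0.863.

```python
from math import sqrt

# Illustrative placeholder values only; these are NOT the NOAA snowfall totals.
philly = [1.0, 6.5, 0.5, 8.0, 2.0, 12.0]
washington = [0.5, 4.0, 0.2, 5.5, 1.5, 9.0]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def same_direction_count(xs, ys):
    """Count the years in which both series rose, or both fell, versus the prior year."""
    count = 0
    for i in range(1, len(xs)):
        change_x = xs[i] - xs[i - 1]
        change_y = ys[i] - ys[i - 1]
        if change_x * change_y > 0:  # same sign means same direction
            count += 1
    return count


print(f"correlation: {pearson_r(philly, washington):.3f}")
print(f"moved together in {same_direction_count(philly, washington)} "
      f"of {len(philly) - 1} year-to-year changes")
```

Keep in mind that these are two different questions. The correlation compares snowfall amounts, while the direction count compares year-to-year changes, so a high value on one doesn't guarantee a high value on the other.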

Speaking of forecasts, one thing you can’t sensibly do with this data
is develop a forecast model using Minitab's trend analysis feature. You
can tell from the time series plots above that a trend analysis will not
be helpful because there is no real trend: the snowfall amount doesn't
generally drift up or down over time.

While it’s tempting to make forecasts based on the
trend analysis graph below, this data shows no overall trend in one
direction or the other.

A valid trend analysis plot in Minitab might look like this:

This forecast is acceptable because the employment data collected in
this example is trending upward (and not all over the place like the
snowfall data).
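To make the contrast between the two kinds of series concrete, here is a small Python sketch that fits a straight-line trend by least squares and reports how much of the variation the line explains. Both data sets are invented placeholders (one steadily rising, one jumping around), not the employment or snowfall data from the plots. R-squared is simply the yardstick I chose for this sketch; it isn't meant to mirror the accuracy measures Minitab reports.

```python
def linear_trend(values):
    """Fit values = intercept + slope * t by least squares, with t = 0, 1, 2, ...

    Returns (slope, intercept, r_squared).
    """
    n = len(values)
    times = range(n)
    mean_t = sum(times) / n
    mean_y = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    ss_total = sum((y - mean_y) ** 2 for y in values)
    ss_resid = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(times, values))
    r_squared = 1 - ss_resid / ss_total
    return slope, intercept, r_squared


# Illustrative placeholder values only; neither list is real data.
steady_growth = [100, 104, 109, 113, 118, 121, 127, 130]
erratic = [0.7, 9.1, 2.4, 11.5, 0.0, 6.3, 1.2, 8.0]

for name, series in [("steady growth", steady_growth), ("erratic", erratic)]:
    slope, intercept, r_squared = linear_trend(series)
    print(f"{name}: slope={slope:.2f}, R-squared={r_squared:.2f}")
```

When the line explains most of the variation, as in the steady series, extending it a few periods ahead is at least defensible. When it explains almost none, as in the erratic series, the fitted line tells you very little about next year.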

How have you successfully used trend analysis in the past at your company?