Daily news about using open source R for big data analysis, predictive modeling, data science, and visualization since 2008

June 27, 2013

Learning Time Series with R

by Joseph Rickert

Late last Saturday afternoon I was reading in my usual spot at the Dana Street Coffee House in Mt. View. A stranger walking by my table noticed my copy of Madsen’s Time Series Analysis (sitting there untouched again), said he needed to learn something about time series and asked if I could recommend a book. He looked serious so I asked him if he knew any R. To my delight the guy replied: “Of course,... I learned some R for my Ph.D.”. I suggested Cowpertwait’s Introductory Time Series with R. The fellow pulled out his smartphone and bought a copy right then and there. As he left, the stranger said: “God I love this coffee shop”. Yes. I love the place too: funky Dana Street for sure, but I mean the whole Bay Area: R people are everywhere.

Anyway, because this was the second time in less than a week that someone asked me about time series, I thought it would be useful to collect some information on how one might go about learning time series with R. This is by no means a comprehensive survey. I have just gathered some low-hanging fruit and listed a few books that have been helpful to me.

First off: Why R? Well, attempting to learn time series without good computational tools would be madness, and unless you are already a SAS, Stata or even a MATLAB expert there would be little reason to even consider these systems. The three of them together can’t match the scope and depth of the time series tools listed in R’s Time Series Task View. Like all R task views, this one organizes the libraries of available R functions by topic and is the place to start if you know what you are looking for.

Otherwise, learning time series comes down to matching your learning style and experience with the available R resources. If you are a book person looking for a general introduction to R that has some time series material, then I would suggest Paul Teetor’s R Cookbook. Chapter 14 is very good. It begins by making the case for using zoo and time series objects and then moves briskly through manipulating time series and the basics of ARIMA models.
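To give a flavor of that kind of workflow, here is a minimal sketch that uses only base R and the built-in AirPassengers series. It is not taken from Teetor's book; the choice of dataset, the date window and the model order are assumptions made purely for illustration. zoo objects respond to the same window() and diff() generics.

```r
# Base R only: AirPassengers ships with the datasets package.
ap <- AirPassengers                    # monthly "ts" object, 1949 to 1960

# Subset by time rather than by position.
ap.50s <- window(ap, start = c(1950, 1), end = c(1959, 12))

# Common manipulations: log transform and first difference.
log.ap   <- log(ap.50s)
d.log.ap <- diff(log.ap)

# Fit a seasonal ARIMA model (order chosen for illustration only)
# and predict the next 12 months on the log scale.
fit  <- arima(log.ap, order = c(0, 1, 1),
              seasonal = list(order = c(0, 1, 1), period = 12))
pred <- predict(fit, n.ahead = 12)
pred$pred                              # point forecasts (log scale)
```

The point is that a time series object carries its own time index, so subsetting, differencing and forecasting all stay aligned to dates without extra bookkeeping.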

If you are looking for a first book devoted entirely to time series then, in addition to Cowpertwait, you might want to look at Time Series Analysis with Applications in R by Jonathan D. Cryer and Kung-Sik Chan. Cowpertwait and Metcalfe cover more ground: in addition to the basics they have chapters on non-stationary series, long memory processes, spectral analysis, multivariate models and state space models in less than 250 pages. Cryer and Chan take things a little slower but they still cover ARIMA models, ARCH and GARCH models, regression models and spectral analysis. The TSA package that goes with the book is also helpful.

If you are looking for one book that will see you through graduate school then I don’t think you can do better than the new R-friendly edition of Shumway and Stoffer’s Time Series Analysis and Its Applications with R Examples. Previous editions of this book have defined time series for a generation of students.

If you are interested in time series for Finance then a good place to start would be David Ruppert’s book Statistics and Data Analysis for Financial Engineering. This book has several very nice chapters on ARIMA, GARCH and regression models and even has a discussion on fitting ARMA models with Bayesian techniques. Also, anyone interested in time series for Finance would find the ebooks on the RMetrics site a valuable resource.

If forecasting is your focus, then Forecasting: principles and practice by Rob Hyndman and George Athanasopoulos is online and free. This book would be a good deal at full hardback prices, and it is very generous of the authors to promise to keep the online version free even after a print version becomes available on Amazon. Hyndman and Athanasopoulos are focused on basic principles and forecast accuracy, recommending simple tools when they are right for the job. The following illustrates the kind of big-picture, context-setting knowledge that is not often explicitly mentioned by other authors:

It is a common myth that ARIMA models are more general than exponential smoothing. While linear exponential smoothing models are all special cases of ARIMA models, the non-linear exponential smoothing models have no equivalent ARIMA counterparts. ...
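The "special case" half of that statement can be checked numerically for the simplest linear model. The sketch below is my own illustration, not code from the book: it relies on the standard result that simple exponential smoothing with parameter alpha produces the same one-step forecasts as an ARIMA(0,1,1) model with MA coefficient theta = alpha - 1 (in R's sign convention). The simulated data, the value of alpha and the shared starting forecast are all assumptions.

```r
set.seed(1)
x     <- cumsum(rnorm(200))   # toy data: a simulated random walk (assumption)
alpha <- 0.3                  # smoothing parameter, chosen arbitrarily
theta <- alpha - 1            # implied MA(1) coefficient

n       <- length(x)
ses     <- numeric(n)         # ses[t]: SES forecast of x[t] made at time t - 1
arima.f <- numeric(n)         # arima.f[t]: ARIMA(0,1,1) forecast of x[t]
ses[1]     <- x[1]            # start both recursions from the same value
arima.f[1] <- x[1]

for (t in 1:(n - 1)) {
  # SES: weighted average of the latest observation and the previous forecast
  ses[t + 1] <- alpha * x[t] + (1 - alpha) * ses[t]
  # ARIMA(0,1,1): latest observation plus theta times the latest forecast error
  e <- x[t] - arima.f[t]
  arima.f[t + 1] <- x[t] + theta * e
}

same <- all.equal(ses, arima.f)   # the two forecast paths should coincide
same
```

Substituting theta = alpha - 1 into the ARIMA recursion gives alpha * x[t] + (1 - alpha) * forecast[t], which is exactly the smoothing update, so the two paths agree up to floating point error.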

The paper by McLeod, Yu, and Mahdi, Time Series Analysis with R, provides an overview of time series topics at a more demanding level of mathematical sophistication. For example, there is a section on stochastic differential equations. The paper contains an extensive bibliography and many links to further reading.

At the risk of oversimplifying things, here are just a few lines of R code that take you from fetching real data to automatically fitting an ARIMA model.

library(xts)
library(forecast)
# Fetch IBM stock data from Yahoo Finance
# Go to http://finance.yahoo.com/q/hp?s=IBM+Historical+Prices and copy the link to the table
url <- "http://ichart.finance.yahoo.com/table.csv?s=IBM&a=00&b=2&c=1962&d=05&e=26&f=2013&g=d&ignore=.csv"
IBM.df <- read.table(url, header=TRUE, sep=",")   # Read data into a data frame
head(IBM.df)                                      # Look at the first few lines
IBM <- xts(IBM.df$Close, as.Date(IBM.df$Date))    # Make a time series object
plot(IBM)                                         # Plot the time series
IBM.1982 <- window(IBM, start="1982-01-01", end="2013-01-23")   # Select a subset
fit <- auto.arima(IBM.1982)                       # Auto fit an ARIMA model
fit                                               # Examine the model
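A natural next step is to forecast from the fitted model and glance at the residuals. The sketch below is a hedged variation on the code above: because the Yahoo Finance link may no longer respond, it swaps in the built-in EuStockMarkets data (the DAX column) as a stand-in for the IBM series, and the 20-step horizon and the Ljung-Box lag are arbitrary choices. It assumes the forecast package is installed.

```r
library(forecast)                      # provides auto.arima() and forecast()

# Stand-in data (assumption): daily DAX closing prices from the datasets package.
dax <- EuStockMarkets[, "DAX"]

# Restrict the search to non-seasonal models; the series has a nominal
# frequency of 260 observations per year, which is not a useful seasonal period here.
fit <- auto.arima(dax, seasonal = FALSE)
fit                                    # Examine the selected model

fc <- forecast(fit, h = 20)            # Forecast 20 steps ahead
plot(fc)                               # Point forecasts with prediction intervals

# Quick residual check: large p-values are consistent with white-noise residuals.
lb <- Box.test(residuals(fit), lag = 10, type = "Ljung-Box")
lb
```

If the residual test rejects, the automatically selected model is a starting point for further modeling rather than a finished result.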

This is an example of how R places powerful tools at your fingertips without getting in the way. For sure, there is real work in developing the understanding and intuition to build meaningful time series models, but there is not much of a learning curve to climb to get a handle on the required R functions. All of the authors cited have made conscientious efforts to help you become fluent in time series by using powerful, easy-to-use R tools that will enhance your learning experience.

Oh, about Madsen's book: there is no R code in it. However, Madsen's approach, emphasizing the similarity between linear systems and time series, is interesting and unusual. For the mathematically inclined it is a great reference for topics like Kalman Filters.

Comments


I have been investigating the bibliography for time series analysis for a long time now, and though I have already found the majority of those books it's nice to have a repository of links like this article.

For a very pragmatic approach and one of the best introductory texts I've seen, consider
Forecasting: principles and practice
An online textbook by Rob J Hyndman and George Athanasopoulos
http://otexts.com/fpp/

One comment about the RMetrics ebooks: I find them dramatically expensive. Also, I have had a lot of problems with the RMetrics packages like fPortfolio, which seems to be only semi-functional, at best. I would have been very upset to pay those prices for the RMetrics documentation only to find that the software didn't work.