January 15, 2011

Parsing and plotting time series data

This morning I came across a post that discusses the differences between Scala, Ruby and Python when analysing time series data. Essentially, there is a text file consisting of times in the format HH:MM, and we want to get an idea of their distribution. Tom discusses how this would be a bit clunky in Ruby and gives a solution in Scala.

However, I think the data is just crying out to be “analysed” in R:
library(ggplot2) # Load the plotting package
times = c("17:05", "16:53", "16:29", ...) # would be loaded from a file
times = as.POSIXct(strptime(times, "%H:%M")) # convert to POSIXct format
qplot(times, fill = I('steelblue'), col = I('black')) # Plot with nice colours

Which gives a histogram of the times.
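
For anyone wanting to try the file-loading step, here is a rough sketch of how it might look. The temporary file and its four sample times are made up for illustration (they stand in for the real data file), and the per-hour count in base R is just one quick way to check the distribution as text. The time zone is pinned to UTC as an assumption, so that daylight-saving changes on the day the script runs cannot affect the parsing.

```r
# Hypothetical input: a temporary file standing in for the real data file,
# with one HH:MM time per line (sample values are made up)
path <- tempfile(fileext = ".txt")
writeLines(c("17:05", "16:53", "16:29", "09:15"), path)

# Read the raw strings, one per line
raw <- readLines(path)

# Parse the times; strptime fills in today's date, which is fine here
# because only the time of day matters
parsed <- as.POSIXct(strptime(raw, "%H:%M", tz = "UTC"))

# Count how many observations fall in each hour of the day
hours <- as.integer(format(parsed, "%H"))
counts <- table(hours)
print(counts)
```

The `parsed` vector can be passed to the plotting call in the same way as `times` above.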

I definitely don’t want to get into any religious wars of R vs XYZ. I just wanted to point out that when analysing data, R does a really good job.