
Information Density and Custom Chart Designs

I’ve been doodling today with some charts for the Wrangling F1 Data With R living book, trying to see how much information I can pack into a single chart.

The initial impetus came simply from thinking about a count of laps led in a particular race by each driver; this morphed into charting the number of laps in each position for each driver, and then into a more comprehensive race summary chart (see More Shiny Goodness – Tinkering With the Ergast Motor Racing Data API for an earlier graphical attempt at producing a race summary chart).

The chart shows:

– grid position: identified using an empty grey square;
– race position after the first lap: identified using an empty grey circle;
– race position on each driver’s last lap: y-value (position) of corresponding pink circle;
– points cutoff line: a faint grey dotted line to show which positions are inside – or out of – the points;
– number of laps completed by each driver: size of pink circle;
– total laps completed by driver: greyed annotation at the bottom of the chart;
– whether a driver was classified or not: the total lap count is displayed using a bold font for classified drivers, and in italics for unclassified drivers;
– finishing status of each driver: classification statuses other than *Finished* are also recorded at the bottom of the chart.

The chart also shows drivers who started the race but did not complete the first lap.

What the chart doesn’t show is at what stage of the race a driver held each position, or for how long at a stretch. But I have an idea for another chart that could help there, and that could reuse elements from the chart shown here.
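As a rough illustration of the sort of data that would be needed to show that, a driver's lap-by-lap positions can be collapsed into "stints" using run-length encoding. This is a minimal sketch in base R using made-up positions for a single driver; the `position` vector and the `stints` data frame are hypothetical names, and this is not necessarily the shape a follow-up chart would use.

```r
# Made-up lap-by-lap positions for a single driver, in lap order (hypothetical data)
position <- c(3, 3, 2, 2, 2, 1, 1, 2, 2, 2)

# Run-length encode the positions: each run is a stint spent in one position
runs <- rle(position)

# The lap on which each stint ends is the running total of stint lengths
endLap <- cumsum(runs$lengths)

# One row per stint: the position held, the laps it spans, and how long it lasted
stints <- data.frame(
  position = runs$values,
  startLap = endLap - runs$lengths + 1,
  endLap   = endLap,
  laps     = runs$lengths
)

print(stints)
```

Each row then says when a position was taken up and how many consecutive laps it was held for, which is the information that the per-position lap counts alone lose.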

FWIW, the following fragment of R code shows the ggplot code used to create the chart. The data came from the Ergast API, though it did require a bit of wrangling to get it into a shape that I could use to power the chart.

[sourcecode language="R"]
library(ggplot2)
#Reorder the drivers according to a final ranked position
g=ggplot(finalPos,aes(x=reorder(driverRef,finalPos)))
#Highlight the points cutoff
g=g+geom_hline(yintercept=10.5,colour='lightgrey',linetype='dotted')
#Highlight the position each driver was in on their final lap
g=g+geom_point(aes(y=position,size=lap),colour='red',alpha=0.15)
#Highlight the grid position of each driver
g=g+geom_point(aes(y=grid),shape=0,size=7,alpha=0.2)
#Highlight the position of each driver at the end of the first lap
g=g+geom_point(aes(y=lap1pos),shape=1,size=7,alpha=0.2)
#Provide a count of how many laps each driver held each position for
g=g+geom_text(data=posCounts,
aes(x=driverRef,y=position,label=poscount,alpha=alpha(poscount)),
size=4)
#Number of laps completed by driver (bold if classified, italic if not)
g=g+geom_text(aes(x=driverRef,y=-1,label=lap,fontface=ifelse(is.na(classification), 'italic' , 'bold')),size=3,colour='grey')
#Record the status of each driver
g=g+geom_text(aes(x=driverRef,y=-2,label=ifelse(status!='Finished', status,'')),size=2,angle=30,colour='grey')
#Styling: tidy the chart by removing the transparency legend
#xRotn() is a custom helper defined elsewhere (not shown here)
g+theme_bw()+xRotn()+xlab(NULL)+ylab("Race Position")+guides(alpha="none")
[/sourcecode]
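To give a flavour of the wrangling involved, here is a minimal sketch, in base R and with made-up data, of how lap-by-lap positions might be reshaped into something like the `posCounts` and `finalPos` data frames the chart code expects. The toy `laps` data frame and the ranking rule (most laps completed first, then position on the last lap) are assumptions for illustration only; the `grid`, `status` and `classification` columns would come from the race results data and are left out here.

```r
# Toy lap-by-lap data: one row per driver per completed lap (made-up values)
laps <- data.frame(
  driverRef = c(rep("alpha", 4), rep("bravo", 4), rep("charlie", 2)),
  lap       = c(1:4, 1:4, 1:2),
  position  = c(2, 1, 1, 1,  1, 2, 2, 2,  3, 3),
  stringsAsFactors = FALSE
)

# Laps led by each driver: count the laps spent in position 1
lapsLed <- table(laps$driverRef[laps$position == 1])

# Number of laps each driver spent in each position
posCounts <- aggregate(lap ~ driverRef + position, data = laps, FUN = length)
names(posCounts)[names(posCounts) == "lap"] <- "poscount"

# Last lap completed by each driver, and the position held on that lap
lastLap  <- aggregate(lap ~ driverRef, data = laps, FUN = max)
finalPos <- merge(lastLap, laps, by = c("driverRef", "lap"))

# Position of each driver at the end of the first lap
lap1 <- laps[laps$lap == 1, c("driverRef", "position")]
names(lap1)[2] <- "lap1pos"
finalPos <- merge(finalPos, lap1, by = "driverRef", all.x = TRUE)

# Assumed ranking rule: most laps completed first, then position on the last lap
finalPos <- finalPos[order(-finalPos$lap, finalPos$position), ]
finalPos$finalPos <- seq_len(nrow(finalPos))

print(posCounts)
print(finalPos)
```

Note that a driver who failed to complete the first lap would have no rows in lap data like this, so would need to be added back in from the results data.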