Share

2.4 Using data in R

Set working directory to the extracted folder in R under File - Change dir...

First we need to load the packages needed for the exercise

library(adehabitatHR)library(rgdal)library(gstat)library(plyr)

Now open the script "Weather-Station-Code_Snowfall.R" and run code directly from the script

This code is designed to process data downloaded from NOAA Date. This version looks at weather stations that provide snowfall data in and around Pennsylvania. The code pulls out the desired data from the downloaded aggregate weather-station file and calculates the mean annual snowfall per weather station for 12/1/94 - 3/31/05. The data is then exported to a text file for interpolation in ArcGIS.

### remove other objects from the working session ###rm(list=ls())

Read weather station data - note the use of stringsAsFactors=FALSE and na.strings=’-9999’

Make a subset of WS that includes only the months of Dec-March with further manipulation of the data for desired output of project objectives.

Winter <- WS[WS$Month %in% c(1,2,3,12), ]

#For December, add 1 to Year so that Year matches Jan-March in that seasonWinter <- within(Winter, Year[Month==12] <- Year[Month==12] +1)

#Check subset, including random row to make sure only selected months includeddim(Winter)head(Winter)Winter[699,]

#Create a matrix of unique STATION values (GHCND ) with Lat/Long values for later reference. Data contains some multiple versions of individual GHCND coordinates. Only want 1 set per.PulledCoords <- Winter[!duplicated(Winter[,1]),]head(PulledCoords)dim(PulledCoords)

#Get the number of snowfall records for each STATION for each year and name it RecordTotal. Note that NA is omited from the length count #WinterRecords <- ddply(Winter, .(STATION,Year), summarize, RecordTotal = length(na.omit(SNOW)))head(WinterRecords)tail(WinterRecords)dim(WinterRecords)

#Get the total amount of snowfall per STATION per year and name it YearlySnowYearlySnow <- ddply(Winter, .(STATION,Year), summarize, Snow = sum(SNOW, na.rm=TRUE))head(YearlySnow)tail(YearlySnow)dim(YearlySnow)

#Only include years that have more than 75% of days recorded #WinterDays <- 121FullWinters <- AllWinters[AllWinters$RecordTotal/WinterDays > 0.75, ]head(FullWinters)tail(FullWinters)dim(FullWinters)

#Get the number of years with more than 75% of days recorded for each STATIONWinterYears <- ddply(FullWinters, c(’STATION’), function(x) c(TotalYears=length(x$Year)))head(WinterYears)tail(WinterYears)dim(WinterYears)

#Get the total amount of snow for each station for all years ###TotalWinterSnow <- ddply(FullWinters, c(’STATION’), function(x) c(TotalWinterSnow=sum(x$Snow)))head(TotalWinterSnow)dim(TotalWinterSnow)

#Add a column to CoordChart showing whether each row matches a STATION #in Complete.Records. Use "NA" for value if no match, then delete rows with "NA" value.#Number of rows in CoordChart should now equal number of rows in Complete.RecordsCoordChart$match <- match(CoordChart$STATION, Complete.Records$STATION, nomatch=NA)CoordChart <- na.omit(CoordChart)head(CoordChart)dim(CoordChart)dim(Complete.Records)

#Combine Complete.Records and CoordChart. Make sure each STATION #matches in row. Delete any rows that don’t match. Shouldn’t be any. If # of rows in #Final.Values is less than number of rows in CoordChart, there is a problem (but note #that number of cols does change).Final.Values <- cbind(Complete.Records,CoordChart)Final.Values$match2 <- match(Final.Values[ ,1], Final.Values[ ,5], nomatch=NA)Final.Values <- na.omit(Final.Values)dim(Final.Values)dim(CoordChart)