Interpolating gridded datasets in R: UV-B data

In my most recent published paper, I analysed the effects of incoming solar UV-B radiation on the geochemistry of Atlas cedar pollen, focused on the Middle Atlas Mountains in Morocco. The study area was relatively small, with sample sites fairly close together.

Gridded datasets are an excellent source of data for doing global or macro-scale studies. However, if working in a relatively small area, you may find that your study area is covered by just a few grid cells due to the often low resolution of gridded data. And this can sometimes make it more difficult to carry out analysis.

To overcome the problem, you can interpolate the data to increase the resolution. After interpolation, the gridded data will go from looking like the image on the left, to looking like the image on the right, which is much more detailed for the study area.

Overview

! This guide was written using R version 3.5.0 on Windows 10.

Gridded datasets are available for all kinds of data, particularly for climate data (see my previous blog post for getting climate data). They are available in different resolutions, and usually cover the entire planet.

In this guide, I will show you how to interpolate gridded data using bilinear interpolation with the "raster" package in R. The example will use gridded UV-B data from the glUV dataset, available in ASCII format.

You can also use this guide to interpolate other kinds of gridded data (e.g. CRU climate data), although I will publish a separate step-by-step guide for doing this in a future blog post.

Getting the data

You can download the UV-B data from the glUV: Global UV-B radiation dataset website. Click on the Download link on the left to see the available datasets. For this guide, we'll use the Monthly Mean UV-B dataset for the Month of August. To download, right-click on the link that says "August (16 MB)" and save the file to your computer.

The UV-B data file is in the ASCII format, which is essentially a text file that can be read by almost any text editor. Here is how the data looks in Notepad++

There is some important information contained within the file. The NCOLS and NROWS values refer to how many rows and columns make up the grid. The XLLCORNER and YLLCORNER refer to decimal degrees coordinates, and indicate where the grid starts. The CELLSIZE refers to the size of the grid cells in degrees. In this case, each cell represents an area 0.25° squared. This is equivalent to approximately 27.5 km2. The NODATA_value refers to the value where no data is available for that grid cell. In this case, if a grid cell contains "-9999", then no UV-B data is available.

The UV-B values in this dataset are in J/m2/day. I.e. the values represent the average UV-B dose per day (in Joules per metre squared), averaged over a month.

Importing the data into R

The benefit of ASCII files, is that they can be easily read by many programs, including R. To import this dataset into R, we'll make use of the "raster" package. If you do not already have this package installed, use install.packages("raster") to install it from within the R console.

To import the data, you should save the file to the R working directory, and then use the following code:

# Load library
library(raster)
# Import UV-B data into R
uvb8

(You will need to adjust the code if you have downloaded the UV-B data to a different file location).

If you type "uvb8" into the R console, you'll be presented with information about the data:

Interpolating gridded data

Setting an extent

Before we interpolate the data, we first need to subset the data to cover the study area. In this example, we will be looking at the Middle Atlas Mountains, and the surrounding area in Morocco.

So, we need to create an "extent" - this tells R the coordinates of the area that we would like to subset the data for. For the Middle Atlas area, you can set the extent using the following code:

# Set Middle Atlas Extent (interpolation area)
midatlas

You can change the coordinates to different values depending on your own study area. The coordinates should be input using decimal degrees, where the first and second numbers represent the longitude (xmin, xmax), while the third and fourth numbers represent latitude (ymin, ymax).

It is generally a good idea to create a slightly bigger extent than your study area. E.g. add at least 1° to the xmin, xmax, ymin, ymax values.

Also, as we'll be using bilinear interpolation, it is important that the area we subset (i.e. the extent) is a square, containing an equal number of grid cells on each axis. If using a different interpolation method, this is less important.

If you do not know the coordinates to set the extent, it is possible to draw the extent on the map using the drawExtent() function, using the following code:

# Plot map
plot(uvb8)
# Draw extent
drawExtent()

This will start an interactive session - you should now click on the plot to draw your extent. The first click should be the xmin, and the second click should be the ymax. (The best thing to do, is to play around, clicking on the map to understand how this works).

In this code, we create a new object (uvb8.ex) which contains the UV-B data we imported earlier for our study area (the midatlas extent).

The ncol=24 argument tells R to create the new data grid with 24 columns. This value is important for the interpolation to work properly.

The extent in our example covers the longitude (x axis) from -8 to -2 on our map, which is equal to 6°. As the grid cells in the gridded dataset are 0.25° in size (4 grid cells for each 1°), this means that for our extent, there are 24 grid cells along the x axis (6 X 4 = 24). There are also 24 grid cells along the y axis, although it is not necessary to specify this here.

If you have used a different extent, or you have used a different resolution dataset, you will need to re-calculate the ncol= value.

Next, we need to create a RasterLayer which has a "fine" resolution grid for our extent:

# Create fine resolution grid
ma.f

This code will create an empty grid, with grid cells that are 100 times smaller than the original grid. (The smaller the grid cells, the higher the resolution).

You can adjust the resolution of the grid by changing the ncol= and nrow= values. You could for example increase the resolution by 1000. However, be warned, that increasing the resolution increases the amount of time, and the amount of memory needed by R to do the interpolation. You may be limited by your computer power as to what resolution you can use.

Again, you can verify the new RasterLayer grid by typing "ma.f" into the R console:

The RasterLayer should contain additional information (data source, names, values) which will indicate that the interpolation was successful. You could also plot the data to compare the original (non-interpolated) vs. the interpolated data:

Here, you can clearly see the interpolation has worked, with a much higher resolution image on the right.

You can extract the interpolated values for your sample sites using the extract() function. Check out my guide to extracting climate data for details on how to use this function. Simply adjust the code to extract from the "ma.uvb8" object.

You can also do various things to improve the look of the plot. For a start, you can change the colour scheme to something more suitable. There are some great resources online to help pick colour schemes, such as ColorBrewer 2.0.

You could also overlay the interpolated UV-B data on top a DEM, to show the terrain, so it is easier to associate with a place. In addition, you could add place names, sample site names, and indicate other places of interest.

Stay tuned for a future blog post, showing you how to create great looking maps!