Building Data Visualization Tools

    library(dplyr)
    library(ggplot2)  # map_data() also requires the maps package to be installed

    # Get the italy dataset from ggplot2
    # Consider only the following provinces: "Bergamo", "Como", "Lecco", "Milano", "Varese"
    # and arrange by group and order (ascending order)
    italy_map <- ggplot2::map_data(map = "italy")
    italy_map_subset <- italy_map %>%
      filter(region %in% c("Bergamo", "Como", "Lecco", "Milano", "Varese")) %>%
      arrange(group, order)

Each observation in the data frame defines a geographical point with some extra information:

- long & lat: the longitude and latitude of the geographical point
- group: an identifier for the polygon the point belongs to. A map can be made of several polygons (e.g. one polygon for the mainland and one for each island, one polygon for each state, …)
- order: the position of the point within its group, i.e. the order in which the points of the same group should be connected to draw the polygon
- region: the name of the province (Italy) or state (USA)

    head(italy_map, 3)
    ##       long      lat group order        region subregion
    ## 1 11.83295 46.50011     1     1 Bolzano-Bozen
    ## 2 11.81089 46.52784     1     2 Bolzano-Bozen
    ## 3 11.73068 46.51890     1     3 Bolzano-Bozen

How to work with maps

Having spatial information in the data makes it possible to map the data or, in other words, to visualize the information it contains in a geographical context. R offers several ways to map data, from ordinary plots that use longitude/latitude as x/y to more complex spatial data objects (e.g. shapefiles).

Mapping with ggplot2 package

The most basic way to create maps with your data is to use ggplot2: create a ggplot object and then add a specific geom, mapping longitude to the x aesthetic and latitude to the y aesthetic [4] [5]. This simple approach can be used to:

- create maps of geographical areas (states, countries, etc.)
- map locations as points, lines, etc.

Create a map showing “Bergamo,” “Como,” “Lecco,” “Milano,” and “Varese” provinces in Italy using simple points…

When plotting simple points, the geom_point function is used. In this case the polygon each point belongs to and the order of the points do not matter.

    italy_map_subset %>%
      ggplot(aes(x = long, y = lat)) +
      geom_point(aes(color = region))

Create a map showing “Bergamo,” “Como,” “Lecco,” “Milano,” and “Varese” provinces in Italy using lines…

The geom_path function is used to create such plots. From the R documentation, geom_path “… connects the observations in the order in which they appear in the data.” When plotting with geom_path it is important to consider, for each point in the map, the polygon it belongs to and its order within that polygon.

In italy_map_subset the points are arranged by group and, within each group, by order. If no information about the region is provided, the points are connected in the sequential order of the observations and, for this reason, “unexpected” lines are drawn when moving from one region to the next. If, on the other hand, region is mapped to the group or color aesthetic, the “unexpected” lines are removed (see the example below).

    library(gridExtra)  # provides grid.arrange()

    # No grouping: consecutive regions are joined by spurious lines
    plot_1 <- italy_map_subset %>%
      ggplot(aes(x = long, y = lat)) +
      geom_path() +
      ggtitle("No mapping with region, unexpected lines")

    # Grouping by region: one separate path per region
    plot_2 <- italy_map_subset %>%
      ggplot(aes(x = long, y = lat)) +
      geom_path(aes(group = region)) +
      ggtitle("With group mapping")

    # Coloring by region also splits the data into one path per region
    plot_3 <- italy_map_subset %>%
      ggplot(aes(x = long, y = lat)) +
      geom_path(aes(color = region)) +
      ggtitle("With color mapping")

    # plot_1 spans the top row; plot_2 and plot_3 share the bottom row
    grid.arrange(plot_1, plot_2, plot_3,
                 ncol = 2, layout_matrix = rbind(c(1, 1), c(2, 3)))

With ggplot2 it is also possible to create more sophisticated maps, such as choropleth maps [3].
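
As a rough illustration of the choropleth idea, the sketch below fills each province with a color that depends on a per-province value. It is only a sketch under stated assumptions, not the approach described in [3]: the statistic used here (n_points, the number of boundary points stored for each province) is a placeholder computed from the map data itself, and the names region_stats, choropleth_data and choropleth_plot are hypothetical. In practice the value would come from a real dataset joined on region.

```r
library(dplyr)
library(ggplot2)  # map_data() also needs the maps package installed

# Same subset of provinces used in the examples above
italy_map_subset <- ggplot2::map_data(map = "italy") %>%
  filter(region %in% c("Bergamo", "Como", "Lecco", "Milano", "Varese")) %>%
  arrange(group, order)

# Placeholder statistic (an assumption, for illustration only):
# the number of boundary points stored for each province
region_stats <- italy_map_subset %>%
  count(region, name = "n_points")

# Attach the statistic to every boundary point;
# left_join() keeps the row order of italy_map_subset
choropleth_data <- italy_map_subset %>%
  left_join(region_stats, by = "region")

# geom_polygon() draws one filled polygon per value of `group`
choropleth_plot <- choropleth_data %>%
  ggplot(aes(x = long, y = lat, group = group, fill = n_points)) +
  geom_polygon(color = "white") +
  coord_quickmap() +
  ggtitle("Choropleth sketch: boundary points per province")

print(choropleth_plot)
```

Mapping group to the group column, rather than to region, keeps each polygon separate even when a region is made of more than one polygon, and coord_quickmap() keeps the aspect ratio of the map approximately correct.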