Exercise

Proportional/stacked bar plots

Before you head over to ternary plots, let's make a classical proportional/stacked bar plot of a subset of the data. We'll use a stacked bar plot and the coord_flip() function to flip the x and y axes.

The data frame for the African Soil Profiles Database is available in your workspace as africa and can be found in the GSIF package. It contains three columns: Sand, Silt and Clay. A smaller version, containing only 50 observations, is stored in africa_sample.

In the first course we mentioned that in the data layer, the structure of the data should reflect how you wish to plot it. For a ternary plot, you need three separate variables, for example Sand, Silt and Clay in africa. For a proportional/stacked bar plot, however, you need just two: one holding the soil type and one holding the measured value. The type should be defined as three levels within a single factor variable. That is, you want tidy data.
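As a minimal sketch of that wide-to-long reshaping, the example below gathers three columns into a two-column data frame. The data frame soil_wide, its values, and the column names type and value are made up for illustration; they are not taken from africa or africa_sample_tidy.

```r
library(tidyr)

# Hypothetical wide data: one column per soil component (values are invented)
soil_wide <- data.frame(
  Sand = c(60, 20, 35),
  Silt = c(25, 30, 40),
  Clay = c(15, 50, 25)
)

# Gather the three columns into a key/value pair: one row per measurement.
# The names "type" and "value" are assumptions chosen for this sketch.
soil_long <- gather(soil_wide, key = "type", value = "value", Sand, Silt, Clay)

# Store the soil type as a single factor variable with three levels
soil_long$type <- factor(soil_long$type, levels = c("Sand", "Silt", "Clay"))

str(soil_long)
```

The wide version suits a ternary plot, where each component is its own variable; the long version suits a bar plot, where the component can be mapped to a single aesthetic such as fill.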

It's also useful to maintain the site IDs as a variable within the data frame. Currently, they are stored as row names, which is poor style and not useful for plotting.
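Moving row names into a regular column can be sketched as follows. The data frame soil and its site names are hypothetical stand-ins for africa_sample, used only so that the example runs on its own.

```r
# Hypothetical stand-in for africa_sample: site IDs live in the row names
soil <- data.frame(
  Sand = c(60, 20, 35),
  Silt = c(25, 30, 40),
  Clay = c(15, 50, 25),
  row.names = c("site_a", "site_b", "site_c")
)

# Copy the row names into an ordinary column so they can be used as a variable
soil$ID <- row.names(soil)

str(soil)
```

Once ID is a column, it survives reshaping and can be mapped to an aesthetic, which row names cannot.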

Instructions

Explore the structure of the africa and africa_sample datasets with str().

Add a column ID to africa_sample; use row.names(africa_sample) to populate it.

If you are not familiar with gather(), please review the course on tidyr. africa_sample_tidy contains tidy data arranged as a long data frame. Execute head(africa_sample_tidy) to view the first few rows so that you know the variable names and contents. Contrast africa_sample_tidy with africa_sample.

Finish the ggplot() command: define the data and aesthetics layers to create a stacked bar plot showing the soil composition levels of each observation (ID). Notice that geom_col() is used, since the values in the data frame are exactly where you want the bars to be drawn.
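The overall pattern, from wide data to a flipped stacked bar plot, might look like the sketch below. It runs on a small invented data frame rather than africa_sample, and the names soil, soil_tidy, type and value are assumptions made for this example, so the variable names in africa_sample_tidy may differ.

```r
library(tidyr)
library(ggplot2)

# Hypothetical data with the site ID already stored as a column
soil <- data.frame(
  ID   = c("site_a", "site_b", "site_c"),
  Sand = c(60, 20, 35),
  Silt = c(25, 30, 40),
  Clay = c(15, 50, 25)
)

# Long format: gather every column except ID into a type/value pair
soil_tidy <- gather(soil, key = "type", value = "value", -ID)

# One bar per ID, filled by soil type; geom_col() draws the values as given
p <- ggplot(soil_tidy, aes(x = ID, y = value, fill = type)) +
  geom_col() +
  coord_flip()

print(p)
```

In this toy data each site sums to 100, so the stacked bars are already proportional. If the totals differed between sites, geom_col(position = "fill") would rescale every bar to the same height.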