Exercise

Reading a .tsv file

A column specification message is printed every time a readr function is called to import data without an explicit column specification. In fact, all readr functions that import data have the argument col_types, which allows for custom column specifications. The message shows how the col_types argument was specified by default. Notice that the column specification is created by cols(), which contains the column names paired with col_*() functions that tell R which type to import each column as. For example, the weight column was imported as integer and the feed column as character, by default. One special col_*() function is col_skip(), which tells R to skip that column when importing the data.
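A minimal sketch of a custom column specification follows. The sample data, the temporary file, and the id column are hypothetical and exist only to keep the example self-contained; the weight and feed columns mirror the ones mentioned above.

```r
library(readr)

# Hypothetical sample data written to a temporary file (not the course data set)
path <- tempfile(fileext = ".csv")
writeLines(c("weight,feed,id", "179,horsebean,1", "160,linseed,2"), path)

# Spell out the type of each column; col_skip() drops the id column entirely
chicks <- read_csv(
  path,
  col_types = cols(
    weight = col_integer(),
    feed   = col_character(),
    id     = col_skip()
  )
)

names(chicks)         # "weight" "feed"
class(chicks$weight)  # "integer"
```

Because col_types is supplied explicitly, the columns are imported with exactly the types listed rather than guessed ones.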

All readr functions that import data also have the argument col_names, which is used when you want to name the columns differently from what's in the data file. This argument takes TRUE, FALSE, or a character vector of column names. If TRUE (the default), the first row of the input will be used as the column names. If FALSE, column names will be generated automatically: X1, X2, X3, etc. If col_names is a character vector, its values will be used as the names of the columns, and the first row of the input will be read into the first row of the output data frame.
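The three col_names options can be compared side by side, as in the sketch below. The two-column file and the names position and pay are made up for illustration; every column is read as character so that the header row can also be read as data.

```r
library(readr)

# Hypothetical tab-separated sample: one header row and two data rows
path <- tempfile(fileext = ".tsv")
writeLines(c("rank\tsalary", "Prof\t139750", "AsstProf\t79750"), path)

all_chr <- cols(.default = col_character())

# TRUE: the first row of the input becomes the column names
with_header <- read_tsv(path, col_names = TRUE, col_types = all_chr)
names(with_header)  # "rank" "salary"
nrow(with_header)   # 2

# FALSE: names are generated automatically, and the first row is kept as data
no_header <- read_tsv(path, col_names = FALSE, col_types = all_chr)
names(no_header)    # "X1" "X2"
nrow(no_header)     # 3

# Character vector: the supplied names are used, and the first row is kept as data
custom <- read_tsv(path, col_names = c("position", "pay"), col_types = all_chr)
names(custom)       # "position" "pay"
custom$position[1]  # "rank"
```

Note that with FALSE or a character vector, a file that does contain a header row ends up with that header as its first data row, so these options fit best when the file has no header or when the header row is dealt with separately.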

These arguments are useful when you want to import specific columns of the data with specific types and names. The readr package does a great job of guessing what each column's type and name should be, but it's important to know that you can override those guesses with the col_names and col_types arguments.

In this exercise, you’ll import a set of data on professors’ salaries called Salaries.tsv with read_tsv(), another readr function that imports files with tab-separated values. This time, you’ll also provide custom column specifications when you're reading in the data.

Instructions

100 XP

In this exercise and all following, the readr package will be preloaded in your workspace so you don't need to load it yourself with library(readr).

Use the read_tsv() function to read in the Salaries.tsv file. Set col_names so that R autogenerates the column names, and pass col_types a custom cols() specification that skips columns X2, X3, and X4. Store the result in an object called salaries.