15. Analysis Routine

Overview

The following exercise introduces a variety of useful data analysis utilities in R.

Analysis Routine: Data Import

Step 1: To get started with this exercise, set the working directory of your R session to a dedicated workshop directory and download the following sample tables into it. Then import the files into Excel and save them as tab-delimited text files.
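As a rough sketch of how a tab-delimited file can be read into R once it has been saved: the file name, column names, and values below are invented for illustration and are not the workshop tables. In a real session you would first point R at the workshop directory with setwd() and pass the name of your own file.

```r
# Hypothetical file written to a temporary directory so the example is
# self-contained; replace 'tmp' with the path to your own exported table.
tmp <- file.path(tempdir(), "sample_table.txt")
writeLines(c("ID\tMW\tLoc",
             "id1\t49426\tC",
             "id2\t28092\tM"), tmp)

# read.delim() assumes a header line and tab-separated fields by default
my_table <- read.delim(tmp, stringsAsFactors = FALSE)

# Inspect the column types and dimensions after import
str(my_table)
dim(my_table)
```

Checking the result with str() right after import is a cheap way to catch columns that were read as text when numbers were expected.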

Homework 3D: How can the merge function in the previous step be executed so that only the rows common to both data frames are returned? Prove that both methods (the two-step version with na.omit and your method) return identical results.
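One possible way to approach this comparison is sketched below on two toy data frames. The names df_a and df_b and their values are made up, and the sketch assumes the previous step used merge with all.x = TRUE, which is not shown in this section.

```r
# Toy data frames with invented values (not the workshop tables)
df_a <- data.frame(ID = c("a", "b", "c"), MW = c(100, 200, 300),
                   stringsAsFactors = FALSE)
df_b <- data.frame(ID = c("b", "c", "d"), Loc = c("C", "M", "C"),
                   stringsAsFactors = FALSE)

# Two-step version: keep every row of df_a, then drop rows containing NAs
two_step <- na.omit(merge(df_a, df_b, by = "ID", all.x = TRUE))

# One-step version: all = FALSE (the default) keeps only IDs found in both
one_step <- merge(df_a, df_b, by = "ID", all = FALSE)

# Compare the values only; na.omit() leaves different row names and an
# extra 'na.action' attribute, so attributes are ignored here
same <- all.equal(two_step, one_step, check.attributes = FALSE)
same
```

Note that identical() would likely report a difference here because of the attributes left behind by na.omit, which is why the sketch compares values with all.equal instead.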

Homework 3E: Replace all NAs in the data frame my_mw_target2a with zeros.
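A minimal sketch of NA replacement with logical indexing, using a small stand-in data frame (toy_na and its values are hypothetical; the same pattern should apply to my_mw_target2a):

```r
# Toy stand-in with invented values
toy_na <- data.frame(ID = c("a", "b", "c"),
                     MW = c(100, NA, 300),
                     Count = c(NA, 2, 5),
                     stringsAsFactors = FALSE)

# is.na() returns a logical matrix with the same shape as the data frame;
# assigning through it overwrites only the NA cells
toy_na[is.na(toy_na)] <- 0
toy_na
```

This works cleanly when the NAs sit in numeric columns. If NAs occur in character or factor columns, assigning 0 there would change the column's content in ways that may not be intended.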

Filtering Data

Step 5: Retrieve all records with a value greater than 100,000 in the ‘MW’ column and a value of ‘C’ in the ‘Loc’ column (targeted to the chloroplast).
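The query in this step can be sketched with a combined logical condition in the row index. The data frame toy_filter and its values are invented; only the column names MW and Loc are taken from the text.

```r
# Toy data with invented values
toy_filter <- data.frame(ID = c("id1", "id2", "id3", "id4"),
                         MW = c(120000, 95000, 150000, 4500),
                         Loc = c("C", "C", "M", "C"),
                         stringsAsFactors = FALSE)

# '&' combines the two conditions element-wise; the trailing comma
# selects all columns for the matching rows
query <- toy_filter[toy_filter$MW > 100000 & toy_filter$Loc == "C", ]
query
```

Only id1 satisfies both conditions in this toy table: id2 is too light and id3 has a different Loc value.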

Homework 3F: How many protein entries in the my_mw_target data frame have a MW greater than 4,000 and less than 5,000? Subset the data frame accordingly and sort it by MW to check that your result is correct.
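The count-then-verify pattern asked for here might look like the following on toy data (toy_mw and its values are hypothetical):

```r
# Toy data with invented values
toy_mw <- data.frame(ID = c("p1", "p2", "p3", "p4"),
                     MW = c(4800, 3900, 4200, 5100),
                     stringsAsFactors = FALSE)

# Logical vector marking rows strictly between the two bounds
in_range <- toy_mw$MW > 4000 & toy_mw$MW < 5000

# TRUE counts as 1, so sum() gives the number of matching rows
n_hits <- sum(in_range)

# Subset and sort by MW so the smallest and largest values are easy to check
subset_sorted <- toy_mw[in_range, ]
subset_sorted <- subset_sorted[order(subset_sorted$MW), ]
subset_sorted
```

After sorting, the first and last rows show the extreme values directly, so it is easy to confirm that nothing outside the bounds slipped in.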

Homework 3G: Retrieve those rows in my_mw_target3 where the second column contains the following identifiers: c("AT5G52930.1", "AT4G18950.1", "AT1G15385.1", "AT4G36500.1", "AT1G67530.1"). Use the %in% operator for this query. As an alternative approach, use the second column as the row names of the data frame and then perform the same query again using the row names. Explain the difference between the two methods.
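To illustrate the two query styles, here is a small sketch on invented data. The data frame toy_ids, its identifiers, and the query vector are all hypothetical stand-ins for my_mw_target3 and the identifiers listed above.

```r
# Toy data with invented identifiers in the second column
toy_ids <- data.frame(Loc = c("C", "M", "C", "S"),
                      ID = c("id1", "id2", "id3", "id4"),
                      stringsAsFactors = FALSE)
query_ids <- c("id3", "id1")

# Method 1: %in% builds a logical vector over the rows of the data frame,
# so matches come back in the data frame's own row order
by_in <- toy_ids[toy_ids[, 2] %in% query_ids, ]

# Method 2: use the second column as row names (they must be unique),
# then index by name; rows come back in the order of the query vector
rownames(toy_ids) <- toy_ids[, 2]
by_name <- toy_ids[query_ids, ]

by_in$ID    # "id1" "id3"
by_name$ID  # "id3" "id1"
```

Another difference worth testing yourself is what each method returns when the query contains an identifier that is absent from the data frame.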

Calculations on Data Frames

Step 7: Count the number of duplicates in the loci column with the table function and append the result to the data frame with the cbind function.
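A possible sketch of this step on toy data follows. The data frame toy_loci, the column name dups, and all values are assumptions made for illustration; only the use of table and cbind comes from the text.

```r
# Toy data: two gene models share the first locus (values are invented)
toy_loci <- data.frame(ID = c("g1.1", "g1.2", "g2.1"),
                       loci = c("g1", "g1", "g2"),
                       stringsAsFactors = FALSE)

# table() counts how often each locus occurs
dup_counts <- table(toy_loci$loci)

# Index the table by locus name so each row receives the count for its own
# locus; as.vector() strips the table class and names before appending
toy_loci <- cbind(toy_loci, dups = as.vector(dup_counts[toy_loci$loci]))
toy_loci
```

Looking the counts up by name, rather than binding the table directly, keeps them aligned with the rows even when the table has fewer entries than the data frame has rows.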

Export Results and Run Entire Exercise as Script

Homework 3H: Write all commands from this exercise into an R script named exerciseRbasics.R, or download it from here. Then execute the script with the source function like this: source("exerciseRbasics.R"). This will run all commands of this exercise and generate the corresponding output files in the current working directory.
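The mechanics of sourcing a script that writes an output file can be tried out with a throwaway example first. Everything below (the script name toy_script.R, the output file toy_result.txt, and the data) is hypothetical and written to a temporary directory; it is not the content of exerciseRbasics.R.

```r
# Paths in a temporary directory so nothing in the working directory changes
script_file <- file.path(tempdir(), "toy_script.R")
out_file <- file.path(tempdir(), "toy_result.txt")

# Write a two-line script; '\\t' puts a literal backslash-t into the script
writeLines(c(
  'toy_out <- data.frame(ID = c("a", "b"), MW = c(100, 200))',
  'write.table(toy_out, file = out_file, sep = "\\t", quote = FALSE, row.names = FALSE)'
), script_file)

# local = TRUE evaluates the script in the calling environment,
# where out_file is defined
source(script_file, local = TRUE)

# Confirm that the script produced its output file
file.exists(out_file)
result <- read.delim(out_file, stringsAsFactors = FALSE)
```

Setting quote = FALSE and row.names = FALSE in write.table keeps the exported file free of quotation marks and of an extra unnamed first column, which makes it easier to reopen in a spreadsheet.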