From script to package

Scripts contain a combination of data and transformations. Often the
transformations are idiosyncratic, and rely heavily on functions
provided by various packages. Sometimes the script contains a useful
chunk of code that could be reused in different places. Examples we've
encountered in this course include computing the GC content of DNA
sequences (or is there a function for that already? Check out
Biostrings!) and creating a simple 'map' from one type of
annotation to another.

It is easy and beneficial to create a package:

You only need to think carefully about how to implement the function once.

Code in the package is reused – if you correct an error, then all
your scripts automatically benefit.

Share with your lab mates / colleagues for consistent results, and
with the wider community for fame and glory.

Lab

GC-content

Write a function that takes a DNAStringSet and returns the GC content.

Modify the function using a conditional statement to work whether
provided a DNAString or a DNAStringSet. Test the function.

Save the function in a file on your AMI.
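
One possible shape for the GC-content steps above is sketched here. It assumes the Biostrings package is installed. The function name gcContent is a hypothetical choice, and using letterFrequency() is only one of several ways to count G and C.

```r
library(Biostrings)

# Hypothetical helper: fraction of G or C letters in each sequence.
# Accepts either a single DNAString or a DNAStringSet.
gcContent <- function(x) {
    if (is(x, "DNAString")) {
        # Promote a single sequence to a set of length 1 so that
        # both kinds of input follow the same code path below.
        x <- DNAStringSet(x)
    } else if (!is(x, "DNAStringSet")) {
        stop("'x' must be a DNAString or a DNAStringSet")
    }
    # letterFrequency() returns a one-column matrix (one row per
    # sequence); as.vector() turns it into a plain numeric vector.
    as.vector(letterFrequency(x, letters = "GC", as.prob = TRUE))
}

# Quick checks with both input types
gcContent(DNAString("GGCCAATT"))
gcContent(DNAStringSet(c("AAAA", "GGCC", "ACGT")))
```

To complete the last step, the function definition (without the quick checks) could be saved in a file such as gcContent.R; the file name is an assumption, not something the course prescribes.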

Annotation-helper

Write a function that takes as its argument Ensembl gene identifiers
(like the rownames() of the SummarizedExperiment object in the
RNASeq vignette yesterday) and uses the select() method and
annotation package org.Hs.eg.db to return a named
character vector, where the names of the vector are the Ensembl
identifiers and the values are the corresponding gene SYMBOLs. Adopt
some simple-to-implement policy for handling Ensembl identifiers that
map to more than one gene symbol. Save this function to another file.
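
A minimal sketch of such a helper follows, assuming the org.Hs.eg.db and AnnotationDbi packages are installed. The function name ensemblToSymbol is hypothetical. The policy used here, keeping the first symbol that select() returns for each identifier, is just one simple option; returning NA for ambiguous identifiers would be equally easy to implement.

```r
library(org.Hs.eg.db)

# Hypothetical helper: map Ensembl gene identifiers to gene symbols.
# Returns a character vector of symbols named by the input identifiers.
ensemblToSymbol <- function(ensembl_ids) {
    # select() emits a message when the mapping is 1:many;
    # it is silenced here because duplicates are handled explicitly.
    map <- suppressMessages(
        AnnotationDbi::select(
            org.Hs.eg.db,
            keys = unique(ensembl_ids),
            keytype = "ENSEMBL",
            columns = "SYMBOL"
        )
    )
    # Policy (an assumption of this sketch): keep only the first
    # symbol reported for each Ensembl identifier.
    map <- map[!duplicated(map$ENSEMBL), ]
    # Re-order to match the input; identifiers without a symbol get NA.
    symbols <- map$SYMBOL[match(ensembl_ids, map$ENSEMBL)]
    setNames(symbols, ensembl_ids)
}

# Example with two human Ensembl gene identifiers
ensemblToSymbol(c("ENSG00000141510", "ENSG00000000003"))
```

Note that select() signals an error when none of the supplied keys is valid for the chosen keytype, so a more defensive version might check the identifiers against keys(org.Hs.eg.db, "ENSEMBL") first.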

A package

Use the RStudio wizard to create a package from the files containing
your GC-content and annotation-helper functions.
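
For reference, roughly the same result can be obtained from the R console with package.skeleton() from the utils package instead of the wizard. The sketch below is only an illustration: it works in a temporary directory, the package name LabHelpers is hypothetical, and a placeholder file stands in for the real function files so that the example runs on its own.

```r
# Work in a temporary directory so nothing in the project is touched.
pkg_dir <- tempfile("pkgdemo")
dir.create(pkg_dir)

# Placeholder source file; in the lab this would be the files that
# hold the GC-content and annotation-helper functions.
src <- file.path(pkg_dir, "gcContent.R")
writeLines("gcContent <- function(x) NULL  # placeholder body", src)

# Create the package skeleton; the listed files are copied into R/.
utils::package.skeleton(
    name = "LabHelpers",   # hypothetical package name
    code_files = src,
    path = pkg_dir
)

# Inspect what was created (DESCRIPTION, NAMESPACE, R/, man/, ...)
list.files(file.path(pkg_dir, "LabHelpers"))
```

Whichever route is used, functions that rely on Biostrings, AnnotationDbi or org.Hs.eg.db would normally list those packages in the Imports field of DESCRIPTION rather than calling library() inside the package code.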