quasi-public goods, natural resource economics, public policy, computational economics and other stuff I like

Quick and dirty investment analysis with R & Quantmod

In a previous post I illustrated a few really cool features of the Quantmod package in R. More specifically, I have been using Quantmod to pull historical pricing data on stocks and mutual funds directly from Yahoo Finance into R.

I’ve streamlined a few things from that post which I will show you here. I still don’t have a full-scale portfolio analyzer ready to unveil but I’ve made a few code tweaks that some of you might find useful.

First I should say: the Quantmod package has a lot of cool features built in for charting financial data and doing technical analysis (Bollinger Bands, MACD, etc.). However, I’ve had a little trouble working with several assets at once in the Quantmod environment. This might be because I have a lot of time invested in the dplyr way of dealing with data frames and I’m not as familiar with xts objects. So one of my big challenges has been how to efficiently build a dplyr data frame with asset price data pulled from Yahoo Finance.

Here I’m going to do something really simple: use the mutual fund screener in Fidelity to identify some of the top performing small and mid-cap growth funds, then use Quantmod to pull historical data for those funds, and finally, I’m going to plot the growth of a hypothetical $10,000 invested in each of the funds over different time horizons.

library(quantmod)   # note the lowercase package name; R is case-sensitive
library(data.table)
library(dplyr)

df.pull <- function(tickers, startDate, endDate) {
  # Make a new environment for quantmod to store data in
  stockData <- new.env()
  # Download the stock history (for all tickers)
  getSymbols(tickers, env = stockData, src = "yahoo", from = startDate, to = endDate)
  # First coerce each object in the environment to a data frame....this gives
  # us a list of data frames
  df2 <- eapply(stockData, as.data.frame)
  # Now if we change the column names in each of the list objects we can use
  # rbindlist to get them in a single data frame...the problem here is that
  # the assets are not in the list 'df2' in the same order I put them in
  # the object 'tickers'. For each data frame in the list I need to recover the
  # name of the asset and also give the columns new names.
  df.clean <- function(tmp.data) {
    # Column names look like "BUFTX.Open", so the ticker is the part before the dot
    asset <- unlist(strsplit(names(tmp.data)[1], "[.]"))[1]
    names(tmp.data) <- c("Open", "High", "Low", "Close", "Volume", "Adjusted")
    tmp.data$asset <- asset
    # The dates are stored as row names; move them into a proper Date column
    tmp.data$date <- as.Date(row.names(tmp.data), format = "%Y-%m-%d")
    return(tmp.data)
  }
  # Apply the function above to each element in the df2 list...then coerce the
  # output to a dplyr data frame
  return(as_tibble(data.frame(rbindlist(lapply(df2, df.clean)))))
}

The function above accepts as inputs:

a character vector of ticker symbols for the assets I want historical prices for

a starting date and an ending date, which should both be Date class objects

And returns a dplyr data frame with daily prices for each of the assets passed into the function.
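To see what the cleaning step inside the function does without hitting Yahoo, here is a small sketch that runs the same logic on a made-up data frame. The object `toy` and its prices are invented for illustration; it is only shaped like what you get for a single ticker (columns prefixed with the symbol, dates stored as row names).

```r
# A made-up data frame shaped like the per-ticker data frames in the list 'df2'
# (the prices here are invented, not real BUFTX data)
toy <- data.frame(
  BUFTX.Open     = c(20.1, 20.3),
  BUFTX.High     = c(20.1, 20.3),
  BUFTX.Low      = c(20.1, 20.3),
  BUFTX.Close    = c(20.1, 20.3),
  BUFTX.Volume   = c(0, 0),
  BUFTX.Adjusted = c(19.8, 20.0),
  row.names      = c("2015-01-02", "2015-01-05")
)

# Same steps as df.clean(): recover the ticker from the first column name,
# standardize the column names, and turn the row-name dates into a Date column
asset <- unlist(strsplit(names(toy)[1], "[.]"))[1]
names(toy) <- c("Open", "High", "Low", "Close", "Volume", "Adjusted")
toy$asset <- asset
toy$date <- as.Date(row.names(toy), format = "%Y-%m-%d")

print(toy)
```

Because every cleaned data frame ends up with the same eight columns, the pieces can be stacked into one long data frame regardless of the order the tickers come back in.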

So now I’m going to use my data pull function to get some historical data on mutual funds that I identified using Fidelity’s Fund screener – I basically used the screener to look for funds with a high Morningstar rating, low expenses, above average returns, and funds of the class ‘small-cap growth’ or ‘mid-cap growth.’ The assets I chose for this are:

BUFTX – Buffalo Discovery Fund

FCPGX – Fidelity Small Cap Growth Fund

FDEGX – Fidelity Growth Strategies Fund

IWB – This is the iShares exchange traded fund for the Russell 1000 index (a large-cap index, not a small-cap one)…I included this as a benchmark

FMCSX – Fidelity Mid Cap Stock Fund

VISGX – Vanguard Small Cap Growth Index

VMGRX – Vanguard Mid Cap Growth Investor Class

I’m going to feed these ticker symbols into my function, get the daily prices, and then roll the daily data up to a monthly data frame:
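A minimal sketch of that roll-up might look like the following. The helper name `monthly.rollup`, the choice to keep the last trading day of each month, and the toy prices are all assumptions made for illustration; the date range in the commented-out call is also just a placeholder.

```r
library(dplyr)

# Hypothetical helper: collapse daily prices to one row per asset per month,
# keeping the last available trading day in each calendar month
monthly.rollup <- function(daily) {
  daily %>%
    mutate(month = format(date, "%Y-%m")) %>%
    group_by(asset, month) %>%
    filter(date == max(date)) %>%
    ungroup() %>%
    select(asset, month, date, Adjusted) %>%
    arrange(asset, date)
}

# Toy daily data (invented prices) with the same columns df.pull() returns
toy.daily <- data.frame(
  asset    = rep(c("AAA", "BBB"), each = 4),
  date     = rep(as.Date(c("2015-01-02", "2015-01-30",
                           "2015-02-02", "2015-02-27")), times = 2),
  Adjusted = c(10, 11, 12, 13, 20, 19, 18, 17)
)
toy.m <- monthly.rollup(toy.daily)
print(toy.m)

# With real data (needs a connection to Yahoo Finance) it would be something like:
# tickers <- c("BUFTX", "FCPGX", "FDEGX", "IWB", "FMCSX", "VISGX", "VMGRX")
# funds.m <- monthly.rollup(df.pull(tickers, as.Date("2010-01-01"), as.Date("2015-12-31")))
```

Keeping the month-end adjusted close is one reasonable convention; a monthly average would be another, and the choice changes the numbers slightly.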

Next, I’m going to write a pretty simple function to calculate the growth of a hypothetical $10,000 invested in each asset. I’m once again going to allow start date and end date to be inputs to the function so I can look at how these assets have performed over different past time periods.
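Here is a rough sketch of what such a function could look like. The name `growth.10k`, the argument `monthly`, and the toy prices are assumptions for illustration. The arithmetic is just the starting $10,000 scaled by the ratio of each month's adjusted price to the first adjusted price in the window.

```r
library(dplyr)

# Hypothetical helper: value of $10,000 invested in each asset at the first
# available month in [startDate, endDate], tracked through the end of the window.
# By default it reads the monthly data from an object called funds.m.
growth.10k <- function(asset.names, startDate, endDate, monthly = funds.m) {
  monthly %>%
    filter(asset %in% asset.names, date >= startDate, date <= endDate) %>%
    arrange(asset, date) %>%
    group_by(asset) %>%
    mutate(value = 10000 * Adjusted / first(Adjusted)) %>%
    ungroup()
}

# Toy monthly data (invented prices) standing in for funds.m
toy.m <- data.frame(
  asset    = rep(c("AAA", "BBB"), each = 3),
  date     = rep(as.Date(c("2015-01-30", "2015-02-27", "2015-03-31")), times = 2),
  Adjusted = c(10, 11, 12.1, 20, 19, 22)
)

out <- growth.10k(c("AAA", "BBB"), as.Date("2015-01-01"), as.Date("2015-03-31"), toy.m)
print(out)
```

Using the adjusted price rather than the raw close means distributions and splits are already folded into the growth figure.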

Note that I can control the assets to be analyzed using the asset.names argument but – and this may be a little pedantic – the function works with the data stored in the object funds.m so I can only feed it asset names that exist in the funds.m data frame.