Quantitative research, trading strategy ideas, and backtesting for the FX and equity markets

Category Archives: R

The folks at RStudio have done some amazing work with the shiny package. From the shiny homepage, “Shiny makes it super simple for R users like you to turn analyses into interactive web applications that anyone can use.” Developing web applications has always appealed to me, but hosting, learning JavaScript, HTML, etc. made me put this pretty low on my priority list. With shiny, one can write web applications in R.

This example uses the managers dataset with calls to charts.PerformanceSummary and table.Stats from the PerformanceAnalytics package to display a plot and table in the shiny application.

Below is a screenshot of the application.

You need to have the shiny and PerformanceAnalytics packages installed to run the application. Once those are installed, open your R prompt and run:

shiny::runGist("https://gist.github.com/rbresearch/5081906")

There is a great shiny tutorial from RStudio as well as examples from SystematicInvestor for those interested in learning more.

The past few posts on momentum with R focused on a relatively simple way to backtest momentum strategies. In part 4, I use the quantstrat framework to backtest a momentum strategy. Using quantstrat opens the door to several features and options as well as an order book to check the trades at the completion of the backtest.

I introduce a few new functions that are used to prep the data and compute the ranks. I won’t go through them in detail, but these functions are available in my github repo in the rank-functions folder.

This first chunk of code just loads the necessary libraries and data, and applies the ave3ROC function to rank the assets based on averaging the 2, 4, and 6 month returns. Note that you will need to load the functions in Rank.R and monthly-fun.R.

The next chunk of code is a critical step in preparing the data to be used in quantstrat. With the ranks computed, the next step is to bind the ranks to the actual market data to be used with quantstrat. It is also important to change the column names to e.g. XLY.Rank because that will be used as the trade signal column when quantstrat is used.

# this is an important step in naming the columns, e.g. XLY.Rank
# the "Rank" column is used as the trade signal (similar to an indicator)
# in the qstratRank function
colnames(sym.rank) <- gsub(".Adjusted", ".Rank", colnames(sym.rank))
# ensure the order of symbols is equal to the order of columns
# in symbols.close
stopifnot(all.equal(gsub(".Adjusted", "", colnames(symbols.close)), symbols))
# bind the rank column to the appropriate symbol market data
# loop through symbols, convert the data to monthly and cbind the data
# to the rank
for(i in 1:length(symbols)) {
  x <- get(symbols[i])
  x <- to.monthly(x, indexAt='lastof', drop.time=TRUE)
  indexFormat(x) <- '%Y-%m-%d'
  colnames(x) <- gsub("x", symbols[i], colnames(x))
  x <- cbind(x, sym.rank[,i])
  assign(symbols[i], x)
}

Changing the argument to max.levels=2 gives the flexibility of “scaling” in a trade. In this example, say asset ABC is ranked 1 in the first month — I buy 500 units. In month 2, asset ABC is still ranked 1 — I buy another 500 units.

In the previous post, I demonstrated simple backtests for trading a number of assets ranked based on their 3, 6, 9, or 12 month simple returns (i.e. the lookback periods). While it was not an exhaustive backtest, the results showed that when trading the top 8 ranked assets, the rankings based on 3, 6, 9, and 12 month returns resulted in similar performance.

If the results were similar for the different lookback periods, which lookback period should I choose for my strategy? My answer is to include multiple lookback periods in the ranking method.

This can be accomplished by taking the average of the 6, 9, and 12 month returns, or any other n-month returns. This gives us the benefit of diversifying across multiple lookback periods. If I believe that the lookback period of 9 month returns is better than that of the 6 and 12 month, I can use a weighted average to give the 9 month return a higher weight so that it has more influence on determining the rank. This can be implemented easily with what I am calling the WeightAve3ROC() function shown below.
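As a rough illustration of the idea (not the WeightAve3ROC() function itself), here is a minimal base R sketch. The helper names `roc_n` and `weighted_ave_roc_rank`, the toy price matrix, and the weights are all assumptions made up for this example. It computes the 6, 9, and 12 period discrete returns, takes their weighted average, and ranks the assets in each row so that rank 1 is the strongest.

```r
# Discrete n-period rate of change for each column of a price matrix.
roc_n <- function(prices, n) {
  out <- matrix(NA_real_, nrow(prices), ncol(prices), dimnames = dimnames(prices))
  idx <- (n + 1):nrow(prices)
  out[idx, ] <- prices[idx, , drop = FALSE] / prices[idx - n, , drop = FALSE] - 1
  out
}

# Hypothetical helper: weighted average of three lookback returns,
# then a per-row rank where 1 = highest weighted average return.
weighted_ave_roc_rank <- function(prices, n = c(6, 9, 12), w = c(1, 1, 1)) {
  stopifnot(length(n) == 3, length(w) == 3, max(n) < nrow(prices))
  w <- w / sum(w)  # normalize so the weights sum to 1
  score <- w[1] * roc_n(prices, n[1]) +
           w[2] * roc_n(prices, n[2]) +
           w[3] * roc_n(prices, n[3])
  ranks <- t(apply(-score, 1, rank, na.last = "keep"))
  dimnames(ranks) <- dimnames(prices)
  list(score = score, rank = ranks)
}

# Toy monthly prices (made up): A grows 1% a month, B grows 2%, C is flat.
months <- 0:13
prices <- cbind(A = 100 * 1.01^months,
                B = 100 * 1.02^months,
                C = rep(100, 14))

# Give the 9 month return twice the weight of the 6 and 12 month returns.
res <- weighted_ave_roc_rank(prices, n = c(6, 9, 12), w = c(0.25, 0.5, 0.25))
res$rank[14, ]
```

Rows before the longest lookback is available are left as NA, so no ranks are produced until 12 months of history exist.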

This test demonstrates how it may be possible to achieve better risk adjusted returns (higher CAGR and lower drawdowns in this case) by considering multiple lookback periods in the ranking method.

Full R code is below. I have included all the functions in the R script below to make it easy for you to reproduce the tests and try things out, but I would recommend putting the functions in a separate file and using source() to load the functions to keep the code cleaner.

Many of the sites I linked to in the previous post have articles or papers on momentum investing that investigate the typical ranking factors: 3, 6, 9, and 12 month returns. Most (not all) of the articles seek to find which is the “best” look-back period to rank the assets. Say that the outcome of the article is that the 6 month look-back has the highest returns. A trading strategy that just uses a 6 month look-back period to rank the assets leaves me vulnerable to over-fitting based on the backtest results. The backtest tells us nothing more than which strategy performed the best in the past; it tells us nothing about the future… duh!

Whenever I review the results from backtests, I always ask myself a lot of “what if” questions. Here are 3 “what if” questions that I would ask for this backtest:

What if the strategy based on a 6 month look-back underperforms and the 9 month or 3 month starts to outperform?

What if the strategies based on 3, 6, and 9 month look-back periods have about the same return and risk profile, which strategy should I trade?

What if the assets with high volatility are dominating the rankings and hence driving the returns?

The backtests shown are simple backtests meant to demonstrate the variability in returns based on look-back periods and number of assets traded.

The graphs below show the performance of a momentum strategy using 3, 6, 9, and 12 month returns and trading the Top 1, 4, and 8 ranked assets. You will notice that there is significant volatility and variability in returns when trading only 1 asset. The variability between look-back periods is reduced when more assets are traded, but there is still no one clear “best” look-back period. There are periods of underperformance and outperformance for all look-back periods in the test.

Here is the R code used for the backtests and the plots. Leave a comment if you have any questions about the code below.

Time really flies… it is hard to believe that it has been over a month since my last post. Work and life in general have consumed much of my time lately and left little time for research and blog posts. Anyway, on to the post!

This post will be the first in a series covering a momentum strategy using R.

One of my favorite strategies is a momentum or relative strength strategy. Here are just a few of the reasons why I like momentum:

Simple to implement

Long only or long/short portfolios

Many ways to define the strength or momentum measure

It just works

Also, a momentum strategy lends itself well to potential for diversification. The universe of instruments can be infinite, but the instruments traded are finite. Think about it this way… Investor A looks at 10 instruments and invests $1000 in the top 5 instruments ranked by momentum. Investor B looks at 100 instruments and invests $1000 in the top 5 instruments ranked by momentum. Investor A is limiting his potential for diversification by only having a universe of 10 instruments. Investor B has a much larger universe of instruments and can in theory be more diversified. Theoretically speaking, you can trade an infinite number of instruments with a finite amount of trading capital using a momentum or relative strength strategy.

Note that the for loop converts the data to monthly and subsets the data so that the only column we keep is the adjusted close column. We now have four objects (XLY, XLP, XLE, XLF) that have the Adjusted Close price.

That will wrap up this first post for a quick and easy way to rank assets based on 3 month simple returns. Future posts will explore other methods for ranking and using quantstrat to backtest momentum.

Here is the code in full.

require(quantstrat)
# Load ETFs from yahoo
currency("USD")
symbols = c("XLY","XLP","XLE","XLF")
stock(symbols, currency="USD", multiplier=1)
getSymbols(symbols, src='yahoo', index.class=c("POSIXt","POSIXct"), from='2000-01-01')
# Convert to monthly and drop all columns except Adjusted Close
for(symbol in symbols) {
  x <- get(symbol)
  x <- to.monthly(x, indexAt='lastof', drop.time=TRUE)
  indexFormat(x) <- '%Y-%m-%d'
  colnames(x) <- gsub("x", symbol, colnames(x))
  x <- x[,6] # drops all columns except Adjusted Close which is 6th column
  assign(symbol, x)
}
# merge the symbols into a single object with just the close prices
symbols_close <- do.call(merge, lapply(symbols, get))
# xts object of the 3 period ROC of each column in the close object
# The 3 period ROC will be used as the ranking factor
roc <- ROC(symbols_close, n = 3, type = "discrete")
# xts object with ranks
# symbol with a rank of 1 has the highest ROC
r <- as.xts(t(apply(-roc, 1, rank)))
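To see what the last line is doing, here is a small base R sketch with made-up ROC values (the numbers are purely illustrative, not actual ETF returns). Negating the returns before calling rank() is what makes the highest ROC come out as rank 1, and the apply/t combination simply does this row by row.

```r
# Made-up 3 period ROC values for two month-ends (illustrative only).
roc_toy <- rbind(
  "2012-01-31" = c(XLY = 0.04, XLP = 0.01, XLE = 0.09, XLF = -0.02),
  "2012-02-29" = c(XLY = 0.02, XLP = 0.03, XLE = -0.01, XLF = 0.05)
)

# rank() gives 1 to the smallest value, so negate to make the largest ROC rank 1.
first_row_rank <- rank(-roc_toy[1, ])

# apply() over rows returns one column per row, so transpose back with t().
ranks_toy <- t(apply(-roc_toy, 1, rank))
ranks_toy
```

In the first row XLE has the highest return and gets rank 1, while XLF has the lowest and gets rank 4.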

Just stumbled across a course on Coursera titled “Computing for Data Analysis” taught by Roger D. Peng of the Johns Hopkins Bloomberg School of Public Health.

Here is the description of the course.

In this course you will learn how to program in R and how to use R for effective data analysis. You will learn how to install and configure software necessary for a statistical programming environment, discuss generic programming language concepts as they are implemented in a high-level statistical language. The course covers practical issues in statistical computing which includes programming in R, reading data into R, creating informative data graphics, accessing R packages, creating R packages with documentation, writing R functions, debugging, and organizing and commenting R code. Topics in statistical data analysis and optimization will provide working examples.

I just signed up for it! This course looks like a great opportunity to sharpen skills in R and learn new things.

When we backtest a strategy on a portfolio, it is a simple analysis of a single period in time. There are ways to “stress test” a strategy such as Monte Carlo, random portfolios, or shuffling the returns in a random order. I could never really wrap my head around Monte Carlo, and shuffling the returns seemed to be a better approach because the actual returns of the backtest are used, but it misses one important thing… the impact of consecutive periods of returns. If we are backtesting a strategy and we want to minimize max drawdown, consecutive down periods have a significant impact on max drawdown. If, for example, the max drawdown occurred due to 4 consecutive months during 2008, we want to keep those 4 months together when shuffling returns.

In my opinion, a better way to shuffle returns is to shuffle “blocks” of returns. This is nothing new; the TradingBlox software does Monte Carlo analysis this way. I had a look at the boot package and tseries package for their boot functions, but they were not giving me what I wanted. I wanted to visualize a number of equity curves with blocks of returns randomly shuffled.

To accomplish this in R, I wrote two functions. The shuffle_returns function takes an xts object of returns, the number of samples to run (i.e. how many equity curves to generate), and a number for how many periods of returns make up a ‘block’ as arguments. The ran_gen function is a function within the shuffle_returns function that is used to generate random blocks of returns.

shuffle_returns returns an xts object with the random blocks of returns so we can do further analysis such as max drawdown, plotting, or pretty much anything in the PerformanceAnalytics package that takes an xts object as an argument.

This is not a perfect implementation of this idea, so if anybody knows of a better way I’d be glad to hear from you.
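For readers who just want the gist of block shuffling, here is a minimal base R sketch. It is not the shuffle_returns or ran_gen code; the function names `shuffle_blocks`, `simulate_equity_curves`, and `max_drawdown` are hypothetical, the returns are simulated, and it assumes the blocks are non-overlapping and permuted without replacement.

```r
# Hypothetical sketch: cut the return series into consecutive, non-overlapping
# blocks and put the blocks back together in a random order.
shuffle_blocks <- function(returns, block_size) {
  n <- length(returns)
  starts <- seq(1, n, by = block_size)
  blocks <- lapply(starts, function(s) returns[s:min(s + block_size - 1, n)])
  unlist(blocks[sample(length(blocks))])
}

# One equity curve (growth of 1 unit) per column.
simulate_equity_curves <- function(returns, n_samples, block_size) {
  sapply(seq_len(n_samples), function(i) {
    cumprod(1 + shuffle_blocks(returns, block_size))
  })
}

# Max drawdown as a positive fraction, measured from a starting equity of 1.
max_drawdown <- function(equity) {
  equity <- c(1, equity)
  max(1 - equity / cummax(equity))
}

set.seed(42)
returns <- rnorm(60, mean = 0.005, sd = 0.04)  # simulated monthly returns

curves <- simulate_equity_curves(returns, n_samples = 100, block_size = 5)
drawdowns <- apply(curves, 2, max_drawdown)
summary(drawdowns)
```

Because every curve uses the same returns in a different order, all of the curves end at the same value. What changes is the path, and therefore the drawdown, which is the whole point of the exercise.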

The example below uses sample data from edhec and generates 100 equity curves with blocks of 5 consecutive periods of returns.

Using packages such as ggplot2 and lattice can produce some great charts and visualizations, but googleVis is tough to beat for interactive charts to share on the web. Click on the image below to open up the HTML page.

Entry Rule: Buy 100 shares when RSI(2) is less than 20 (Note that if RSI(2) is below 20 for N days, then you will have accumulated N * 100 shares)

Exit Rule: Exit all positions when RSI(2) is greater than 50

Classification: Short-Medium term reversal (dip buying) strategy
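The entry and exit rules above can be traced by hand with a small base R sketch. This is not the quantstrat implementation: the function name `rsi_positions` is hypothetical, the RSI values are made up, and it assumes trades fill on the same bar as the signal, which a real backtest would not necessarily do.

```r
# Hypothetical sketch: shares held after each bar, given a series of RSI(2) values.
rsi_positions <- function(rsi, entry = 20, exit = 50, lot = 100) {
  pos <- numeric(length(rsi))
  held <- 0
  for (i in seq_along(rsi)) {
    if (!is.na(rsi[i])) {
      if (rsi[i] < entry) {
        held <- held + lot   # entry rule: add another 100 shares
      } else if (rsi[i] > exit) {
        held <- 0            # exit rule: close the whole position
      }
    }
    pos[i] <- held           # otherwise the position is unchanged
  }
  pos
}

# Made-up RSI(2) readings; the first value is NA as in a warm-up period.
rsi <- c(NA, 35, 18, 12, 25, 55, 40, 15, 60)
positions <- rsi_positions(rsi)
positions
# 0 0 100 200 200 0 0 100 0
```

Two consecutive readings below 20 build the position up to 200 shares, which is the N * 100 accumulation mentioned in the entry rule.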

What did we diversify?

1. Symbols? – No, the exact same instruments were used in the strategy.

2. Markets? – No, see #1.

3. Timeframe? – Sort of. Strategy1 is a long term strategy and Strategy2 is a shorter term strategy, but both are on the weekly timeframe. We could diversify further by trading even shorter timeframes (e.g. daily, hourly, minute, tick, etc.)

4. Strategy? – Yes, Strategy1 is a trend following strategy and Strategy2 is a reversal strategy.

We achieved fairly low correlations with only three “levels” of diversification. Think what we could do by using a “kitchen sink” portfolio with grains, softs, metals, currencies, stocks, fixed income, international stocks, international fixed income, style ETFs, etc.

The R scripts are pretty self-explanatory so I won’t go into much detail. However, I do want to call attention to 2 lines of code from strategy1.R. The code for strategy2.R is virtually identical.

# logarithmic returns of the equity curve of strategy1.
strategy1_eclogret <- ec$logret
# write the logarithmic returns of strategy 1 to a csv file with the filename "strategy1.csv"
# you will have to change the file where you want to save it
write.zoo(strategy1_eclogret, file = "~/R/strats_for_cor/strategy1.csv", sep=",")

In this post, I will demonstrate how to quickly visualize correlations using the PerformanceAnalytics package. Thanks to the package creators, it is really easy to compute correlation and many other performance metrics.

The first chart looks at the rolling 252 day correlation of nine sector ETFs using SPY as the benchmark. As expected, the correlation is rather high because the sector ETFs are part of the S&P 500 index, but it has been even more pronounced the last few years.

Chart 2 shows the correlation of five ETFs. Note that there is no single instrument I am using as a benchmark; all five ETFs are benchmarked against one another. (Note that I removed the legend because it literally took up the entire plot.)

Chart 3 shows the same 4 ETFs, this time using SPY as a benchmark.

In my opinion, the beauty of the chart.RollingCorrelation function is that the inputs are time series returns. This means that the correlations of instruments (ETFs, stocks, mutual funds, etc.), hedge fund managers, portfolios, and even strategies we test in quantstrat can all be analyzed the same way.
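To make the calculation behind a rolling correlation concrete, here is a minimal base R sketch. It is not the PerformanceAnalytics code: the function name `rolling_cor` is hypothetical, the two return series are simulated stand-ins for a sector ETF and a benchmark, and it assumes a plain Pearson correlation over a trailing window.

```r
# Hypothetical sketch: trailing-window correlation of a return series
# against a benchmark return series.
rolling_cor <- function(x, benchmark, width) {
  stopifnot(length(x) == length(benchmark), width <= length(x))
  out <- rep(NA_real_, length(x))   # NA until a full window is available
  for (i in width:length(x)) {
    window <- (i - width + 1):i
    out[i] <- cor(x[window], benchmark[window])
  }
  out
}

set.seed(1)
# Simulated daily returns: the "sector" series is built to move with the benchmark.
bench  <- rnorm(300, mean = 0, sd = 0.01)
sector <- 0.8 * bench + rnorm(300, mean = 0, sd = 0.005)

rc <- rolling_cor(sector, bench, width = 252)
tail(rc)
```

Any pair of equal-length return series can be passed in, whether they come from instruments, managers, or strategy equity curves.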

Here is the R code used to generate the first chart. To do your own correlation analysis, just change the symbols or add in new data sets of different returns.