I have a question about random forests and how they can be used in trading.
I have heard that random forests are used for classification; is that accurate? If so, could someone give an example of the sort of classification they help with?

10 Answers

I have not used random forests myself, but I know of someone who applied this classification technique to pattern-recognition problems in machine learning.

I therefore think its advantages over classic regression approaches can be used to discern patterns in financial data, though I get the impression that it vastly overfits the data, so you end up with the classic trade-off that many quants face.

I also read that the SEC uses it to analyze trading patterns and flag insider-trading violations.

When automated trading strategies are developed and evaluated using backtests on historical pricing data, there exists a tendency to overfit to the past. Using a unique dataset of 888 algorithmic trading strategies developed and backtested on the Quantopian platform with at least 6 months of out-of-sample performance, we study the prevalence and impact of backtest overfitting. Specifically, we find that commonly reported backtest evaluation metrics like the Sharpe ratio offer little value in predicting out of sample performance (R² < 0.025). In contrast, higher order moments, like volatility and maximum drawdown, as well as portfolio construction features, like hedging, show significant predictive value of relevance to quantitative finance practitioners. Moreover, in line with prior theoretical considerations, we find empirical evidence of overfitting – the more backtesting a quant has done for a strategy, the larger the discrepancy between backtest and out-of-sample performance. Finally, we show that by training non-linear machine learning classifiers on a variety of features that describe backtest behavior, out-of-sample performance can be predicted at a much higher accuracy (R² = 0.17) on hold-out data compared to using linear, univariate features. A portfolio constructed on predictions on hold-out data performed significantly better out-of-sample than one constructed from algorithms with the highest backtest Sharpe ratios.

So what they basically did was take all kinds of real quant trading algos and ask the old EMH question of whether in-sample performance has any predictive power for out-of-sample performance. They calculated all kinds of measures for these algos and used them (and combinations thereof) to predict out-of-sample performance. They then extracted the most important features from the random forest model; the following picture is taken from the paper (p. 9).
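A minimal sketch of that feature-importance step with scikit-learn: fit a random forest regressor on backtest features and rank them by `feature_importances_`. The feature names and synthetic data below are purely illustrative, not the paper's actual dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
# Hypothetical backtest descriptors, one row per strategy.
features = ["sharpe", "volatility", "max_drawdown", "n_backtests", "hedged"]
X = rng.normal(size=(888, len(features)))
# Toy target: out-of-sample return driven mostly by volatility and drawdown.
y = -0.5 * X[:, 1] - 0.3 * X[:, 2] + 0.1 * rng.normal(size=888)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
# Rank features by importance, most important first.
for name, imp in sorted(zip(features, model.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name:12s} {imp:.3f}")
```

On this toy data the forest correctly ranks volatility and drawdown above the uninformative columns, which mirrors the paper's finding in spirit only.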

As with many machine learning technologies, you can run separate training and testing phases before deploying it live for prediction. All it does is build a collection of decision trees based on the parameters you give it: if the output field is a factor (a finite enumerated set of values), you get classification; if it is numeric, you get regression. One approach might be to add a column for whether a commodity reaches a given profit level within an affordable time period; the random forest can then learn rules relating that label to all the other input columns (such as technical indicators).
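That labeling idea can be sketched as follows: add a binary column for whether the profit target was hit, then fit a classifier against indicator columns. The indicator names and all data here are hypothetical and simulated.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 500
df = pd.DataFrame({
    "rsi": rng.uniform(0, 100, n),     # hypothetical technical indicators
    "macd": rng.normal(0, 1, n),
    "volume_z": rng.normal(0, 1, n),
})
# Label column: did the instrument reach the profit target within the horizon?
future_return = 0.01 * df["macd"] + rng.normal(0, 0.02, n)
df["hit_target"] = (future_return > 0.01).astype(int)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(df[["rsi", "macd", "volume_z"]], df["hit_target"])
proba = clf.predict_proba(df[["rsi", "macd", "volume_z"]].iloc[:3])
print(proba)  # per row: [P(target missed), P(target hit)]
```

In a real setting the label would of course be computed from forward-looking price data aligned carefully to avoid lookahead bias.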

A while ago I implemented a binary fuzzy decision tree forest to classify credit applications as a semester project.

Let's say a tree looks like this:

C1
  C11
    -> X
    -> Y
  C12
    C121
      -> A
      -> B
    -> U

The benefits of decision tree techniques in general are:

Comprehensibility: The paths down the tree have a direct interpretation: "If condition C1 and condition C11 then X". For example "If debt>0 and income == 0 then no_credit."

Expert knowledge: It is possible to change the trees based on background knowledge.

Extensibility: It is possible to include other classification tools at the nodes, for example you could have a neural network which detects trends and then go down the tree depending on the output of the network.

Decision tree forests have additional benefits:

Adaptation: If the problem splits into several domains, the trees can fit to their region more closely.

Smaller trees: The trees can be restricted to much smaller size, which makes them easier to understand.

Confidence information: If a lot of the trees in the forest vote for the same classification, this can be seen as a measure of confidence.

On the downside, forests can be much more expensive to compute and manage. Also, whereas a single tree can avoid overfitting by using standard pruning techniques, there does not yet seem to be consensus on the best approach for forests.
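Two of the benefits above (comprehensibility and confidence information) can be made concrete with scikit-learn: a single tree's paths print as readable rules, and the fraction of trees agreeing serves as a confidence measure. The credit data below is a toy construction around the "debt > 0 and income == 0" example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(2)
debt = rng.integers(0, 2, 300)
income = rng.integers(0, 2, 300)
# Toy rule: credit is granted unless debt > 0 and income == 0.
grant = ((income == 1) | (debt == 0)).astype(int)
X = np.column_stack([debt, income])

# A single tree's paths read directly as if/then rules.
tree = DecisionTreeClassifier(max_depth=2).fit(X, grant)
print(export_text(tree, feature_names=["debt", "income"]))

# In a forest, the share of trees voting for a class acts as confidence.
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, grant)
print(forest.predict_proba([[1, 0]]))  # applicant with debt > 0, income == 0
```

Because the toy rule is deterministic, essentially all trees agree here; on noisy real data the vote split is what carries the confidence information.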

As with any application of machine learning techniques, this approach is only as good as the data and the indicators used to train it.

It could help with things like fraud detection, analysis of bankruptcy probability, default risk, unsupervised learning for qualitative/descriptive purposes, or a purely backwards-looking supervised analysis of returns, again for descriptive/understanding purposes (variable importance, etc., perhaps impulse response analysis).

It may also be good at forecasting low-frequency volatility, which is well known to be relatively easy to forecast. Intuitively, this works because it is likely combinations of events that cause very high volatility, and such interactions are difficult to incorporate into a GARCH variance equation. You could instead rely on the forest to learn regimes, breaks, etc. (consider a dynamic forest).
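A hedged sketch of that volatility idea: forecast next-period realized volatility from lagged realized volatilities with a random forest regressor, letting the trees pick up regime-like interactions. The returns here are simulated with two volatility regimes, not market data, and the lag structure is an arbitrary choice.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
# Simulated returns with occasional high-volatility regimes.
sigma = np.where(rng.random(2000) < 0.2, 0.03, 0.01)
returns = rng.normal(0, sigma)
vol = pd.Series(returns).rolling(20).std().dropna().to_numpy()  # realized-vol proxy

# Features: the last 5 realized vols; target: the next one.
lags = 5
X = np.column_stack([vol[i:len(vol) - lags + i] for i in range(lags)])
y = vol[lags:]

split = int(0.8 * len(y))  # simple chronological train/test split
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X[:split], y[:split])
pred = model.predict(X[split:])
print("out-of-sample RMSE:", np.sqrt(np.mean((pred - y[split:]) ** 2)))
```

The chronological split matters: shuffling the rows would leak overlapping rolling windows between train and test.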

Eugene Fama stated in his Nobel Prize lecture that “there is no statistically reliable evidence that expected stock returns are sometimes negative” (2013). However, various theoretical models such as Barberis et al. (2015) and Barlevy and Veronesi (2003) imply that expected stock returns are sometimes negative. This paper provides evidence that expected excess aggregate stock market returns are sometimes negative, and that portfolios composed of the most liquid stocks have predictable downturns as well. This paper presents a forecasting model that relies exclusively on ex-ante information to predict stock market downturns only when the day-prior confidence of a downturn is relatively high, and shows that the average excess return on days which are predicted to be downturns by the forecasting model is -13.9 basis points. Volatility and classic factor return variables alone are sufficient to predict downturns in the sample and are the most powerful downturn predictors. A market timing portfolio using these ex-ante predictions generates a risk-adjusted return of 3.5 basis points per day, annualized to an average 8.8% risk-adjusted return.

To be more precise, random forests work by building multiple trees, each trained on a sample drawn with replacement (a bootstrap sample) from the same training data. Each tree also considers only a random subset of the features (attributes) at each split. Individual trees are typically grown deep and left unpruned. Predictions are obtained by majority vote (classification) or averaging (regression) over all trees. One of the primary uses of random forests is the reduction of variance; if bias is the problem, one should use boosting (e.g., AdaBoost) instead.
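Those mechanics can be made concrete with a tiny hand-rolled forest: each tree sees a bootstrap sample of the rows and a random subset of the columns, and predictions are averaged across trees. (This simplified sketch draws the feature subset once per tree; real random forests such as scikit-learn's re-draw it at every split.)

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def tiny_forest(X, y, n_trees=50, max_features=2, rng=None):
    """Fit n_trees trees, each on a bootstrap sample and a random feature subset."""
    if rng is None:
        rng = np.random.default_rng(0)
    trees = []
    for _ in range(n_trees):
        rows = rng.integers(0, len(X), len(X))        # sample rows with replacement
        cols = rng.choice(X.shape[1], size=max_features, replace=False)  # feature subset
        trees.append((DecisionTreeRegressor().fit(X[rows][:, cols], y[rows]), cols))
    return trees

def forest_predict(trees, X):
    # Average the per-tree predictions (a majority vote would be used for classes).
    return np.mean([t.predict(X[:, cols]) for t, cols in trees], axis=0)

rng = np.random.default_rng(4)
X = rng.normal(size=(300, 4))
y = X[:, 0] * X[:, 1] + 0.1 * rng.normal(size=300)
trees = tiny_forest(X, y, rng=rng)
print(forest_predict(trees, X[:3]))
```

Averaging many deep, decorrelated trees is exactly where the variance reduction comes from: each individual tree overfits its bootstrap sample, but their errors partially cancel.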

Could you give some context about what the results of the paper are, and where to find it (a link)? Titles of papers without additional information are normally not very helpful.
– vonjd Oct 10 '17 at 16:45