"Trading is statistics and time series analysis." This blog details my progress in developing a systematic trading system for use on the futures and forex markets, with discussion of the various indicators and other inputs used in the creation of the system. Also discussed are some of the issues/problems encountered during this development process. Within the blog posts there are links to other web pages that are/have been useful to me.

Wednesday, 26 November 2014

This post continues from my last one, in which I said I would conduct a more pertinent statistical test of the returns of the bar(s) immediately following the N best bars in the price history, as matched by the Cauchy-Schwarz matching algorithm. Readers may recall that the basic premise behind this algorithm is that, by matching current price action to its N best matches, the price action after these matches can be used to infer what will occur after the current price action. However, rather than test the price action directly, I have decided to apply the test to the MFE/MAE indicator. There are several reasons for this, which are enumerated below.

I intend to use the indicator as the target function for future neural net training

the indicator represents a reward to risk ratio, which indirectly reflects price action itself, but without the noise of said action

this reward to risk ratio is of much more direct concern, from a trading perspective, than accurately predicting price

since the indicator is now included as a feature in the matching algorithm, testing the indicator is, very indirectly, a test of the matching algorithm too

This shows two sampling distributions of the mean for Long MFE/MAE indicator values > 0.5, the upper pane for a sample size of 20 and the lower pane for a sample size of 75. For simplicity I shall only discuss the Long > 0.5 version of the indicator, but everything that follows applies equally to the Short version. As expected, the upper pane shows greater variance, and for the envisioned test a whole series of these sampling distributions will be produced for different sample sizes. The way I intend it to work is as follows:

take a single bar in the history and see what the value of the MFE/MAE indicator is 3 bars later (assume > 0.5 for this exposition, so we compare to the Long sampling distributions only)

get the top 20 matched bars for the above selected bar and the corresponding 20 indicator values for 3 bars later and take the mean of these 20 indicator values

check if this mean falls within the sampling distribution of the mean for samples of 20, as shown in the upper pane above by the vertical black line at 0.8 on the x axis. If it does fall within the sampling distribution, we accept (strictly speaking, fail to reject) the null hypothesis that the future indicator values of the 20 best historical matches and the indicator value after the bar being matched come from the same distribution

repeat the immediately preceding step for the means of the best 21, 22, ... etc. matches until the null hypothesis can be rejected, as shown in the lower pane above. At this point we declare an upper bound on the number of historical matches for the bar to be predicted
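The steps above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the code used for the charts: the function names, the bootstrap resampling used to build each sampling distribution, and the two-sided 95% acceptance region are all my own choices for the sketch, and `population` stands for the full history of Long indicator values > 0.5.

```python
import numpy as np

def sampling_distribution_of_mean(population, n, n_draws=10_000, rng=None):
    """Bootstrap the sampling distribution of the mean for samples of size n.

    Assumption: resampling with replacement from the historical indicator
    values is an acceptable stand-in for the true sampling distribution.
    """
    rng = np.random.default_rng(rng)
    samples = rng.choice(np.asarray(population, dtype=float),
                         size=(n_draws, n), replace=True)
    return samples.mean(axis=1)

def upper_bound_on_matches(population, matched_values, n_start=20,
                           alpha=0.05, seed=0):
    """Largest N for which the mean of the N best matches is not rejected.

    matched_values must be ordered best match first. Returns None if the
    null hypothesis is already rejected at n_start (or there are too few
    matches to test).
    """
    rng = np.random.default_rng(seed)
    matched_values = np.asarray(matched_values, dtype=float)
    last_accepted = None
    for n in range(n_start, len(matched_values) + 1):
        dist = sampling_distribution_of_mean(population, n, rng=rng)
        lower, upper = np.quantile(dist, [alpha / 2, 1 - alpha / 2])
        mean_of_best_n = matched_values[:n].mean()
        if not (lower <= mean_of_best_n <= upper):
            break  # null hypothesis rejected: stop adding matches
        last_accepted = n
    return last_accepted
```

Calling `upper_bound_on_matches(population, matched_values)` for a single bar would then give a candidate upper bound on the number of matches to use for that bar.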

For any single bar to be predicted we can then produce the following chart, which is completely artificial and just for illustrative purposes:

where the cyan and red lines are the +/- 2 standard deviations above/below a notional mean value for the whole distribution of approximately 0.85, and the chart can be considered to be a type of control chart. The upper and lower control lines converge towards the right, reflecting the decreasing variance of increasingly large N sample means, as shown in the first chart above. The green line represents the cumulative N sample mean of the best N historical matches' future values. I have shown it as decreasing as it is to be expected that as more N matches are included, the greater the chance that incorrect matches, unexpected price reversals etc. will be caught up in this mean calculation, resulting in the mean value moving into the left tail of the sampling distribution. This effect combines with the shrinking variance to reach a critical point (rejection of the null hypothesis) at which the green line exits below the lower control line.
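The control chart idea can also be sketched numerically. The snippet below is a hedged illustration only: it assumes the control lines are the notional mean plus or minus k standard errors, with the standard error taken as the standard deviation divided by the square root of N, which is one simple way of producing limits that converge as N grows. The function name and its parameters are hypothetical.

```python
import numpy as np

def control_chart(matched_values, pop_mean, pop_std, k=2.0):
    """Cumulative mean of the best N matches with converging control lines.

    matched_values: future indicator values of the matches, best match first.
    pop_mean, pop_std: notional mean and standard deviation of the whole
    distribution of indicator values (assumed known for this sketch).
    Returns (cum_mean, lower, upper, exit_n), where exit_n is the first N at
    which the cumulative mean falls below the lower control line, or None.
    """
    values = np.asarray(matched_values, dtype=float)
    n = np.arange(1, len(values) + 1)
    cum_mean = np.cumsum(values) / n          # the "green line"
    half_width = k * pop_std / np.sqrt(n)     # shrinks as N grows
    lower = pop_mean - half_width
    upper = pop_mean + half_width
    below = np.flatnonzero(cum_mean < lower)
    exit_n = int(n[below[0]]) if below.size else None
    return cum_mean, lower, upper, exit_n
```

The returned `exit_n` corresponds to the critical point described above, i.e. the N at which the cumulative mean exits below the lower control line.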

The purpose of all the above is to provide a principled manner in which to choose the number N of matches from the Cauchy-Schwarz matching algorithm that supply instances of training data for the envisioned neural net training. An incidental benefit of this approach is that it is indirectly a hypothesis test of the fundamental assumption underlying the matching algorithm, namely that past price action has predictive ability for future price action, and furthermore, it is a test of the MFE/MAE indicator. Discussion of the results of these tests will follow in a future post.

Wednesday, 12 November 2014

This first use is as an input to my Cauchy-Schwarz matching algorithm, previous posts about which can be read here, here and here. The screen shot below shows what I would characterise as a "good" set of matches:

The top left pane shows the original section of the price series to be matched, and the panes labelled #1, #5, etc. are the best match, 5th best match and so on respectively. The last 3 rightmost bars in each pane are "future" price bars, i.e. the 4th bar in from the right is the target bar being matched, and the matching is done over all the bars to the left of, i.e. in the past of, this target bar.
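For readers who have not followed the linked posts, the general idea of ranking historical windows against a target window can be sketched as below. This is not the matching algorithm itself, whose features and normalisation are described in those posts; it is a minimal illustration that assumes a single price series, de-meaned fixed-length windows, and the normalised inner product (which the Cauchy-Schwarz inequality bounds between -1 and 1) as the similarity score. All names are hypothetical.

```python
import numpy as np

def cs_similarity(x, y):
    """Normalised inner product <x, y> / (|x| |y|), in [-1, 1] by Cauchy-Schwarz."""
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return float(np.dot(x, y) / denom) if denom > 0 else 0.0

def top_n_matches(series, target_end, window, n_best):
    """End indices of the n_best past windows most similar to the target window.

    The target window covers series[target_end - window + 1 : target_end + 1].
    Only candidate windows that end at or before the start of the target
    window are scored, so nothing overlapping the target bar is used.
    """
    series = np.asarray(series, dtype=float)
    target = series[target_end - window + 1 : target_end + 1]
    target = target - target.mean()
    scores = []
    for end in range(window - 1, target_end - window + 1):
        candidate = series[end - window + 1 : end + 1]
        scores.append((cs_similarity(target, candidate - candidate.mean()), end))
    scores.sort(reverse=True)  # best match first
    return [end for _, end in scores[:n_best]]
```

The bars following each returned index are then the "future" bars of that match.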

I consider the above to be a set of "good" matches because, for the #1 through #25 matches for "future" bars:

if one considers the logic of the mfe/mae indicator, each pane gives an indicator reading of "long," which agrees with the original "future" bars

similarly the mae (maximum adverse excursion) occurs on the day immediately following the matched day

the mfe (maximum favourable excursion) occurs on the 3rd "future" bar, with the slight exception of pane #10

the marked to market returns of an entry at the open of the 1st "future" bar to the close of the 3rd "future" bar all show a profit, as does the original pane

However, it can be seen that the above noted "goodness" breaks down for panes #25 and #30, which leads me to postulate that there is an upper bound on the number of matches for which there is predictive ability for "future" returns.
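The checks listed above for each pane can be expressed as a small helper. This is a sketch under assumptions: it treats the trade as a long entry at the open of the 1st "future" bar, measures excursions in price units from that entry, and uses a hypothetical `Bar` type. It does not reproduce the indicator's own formula.

```python
from dataclasses import dataclass

@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float

def long_excursions(future_bars):
    """MFE, MAE and marked to market return for a long entry.

    Entry is at the open of the first bar in future_bars and the return is
    marked to the close of the last bar. Bar numbers are 1-based.
    """
    entry = future_bars[0].open
    favourable = [bar.high - entry for bar in future_bars]
    adverse = [entry - bar.low for bar in future_bars]
    mfe = max(favourable)
    mae = max(adverse)
    return {
        "mfe": mfe,
        "mfe_bar": favourable.index(mfe) + 1,  # bar on which the MFE occurs
        "mae": mae,
        "mae_bar": adverse.index(mae) + 1,     # bar on which the MAE occurs
        "return": future_bars[-1].close - entry,
    }
```

A "good" match in the sense used above would give `mae_bar == 1`, `mfe_bar == 3` and a positive `return` for the three "future" bars.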

In the above linked posts the test statistic used to judge the predictive efficacy of the matching algorithm was effect size. However, I think a more pertinent test statistic to use would be the average bar return over the bars immediately following a matched bar, and a discussion of this will be the subject of my next post.

In the above linked post there is a video showing the idea as a "paint bar" study. However, I thought it would be a good idea to render it as an indicator, the C++ Octave .oct code for which is shown in the code box below.

An alternative, if the indicator reading is flat, is to maintain any previous non-flat position. I won't show a chart of the indicator itself as it just looks like a very noisy oscillator, but the equity curve(s) of it, without the benefit of foresight, on the EURUSD forex pair are shown below.

The yellow equity curve is the cumulative, close to close, tick returns of a buy and hold strategy, the blue is the return going flat when indicated, and the red maintaining the previous position when flat is indicated. Not much to write home about. However, this second chart shows the return when one has the benefit of the "peek into the future" as discussed in my earlier post.

The colours of the curves are as before, except for the addition of the green equity curve, which is the cumulative, vwap value to vwap value tick returns, a simple representation of what an equity curve with realistic slippage might look like. This second set of equity curves shows the promise of what could be achievable if a neural net can be trained to accurately predict future values of the above indicator. More in an upcoming post.
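The two ways of treating a flat reading, and the close to close equity curves built from them, can be sketched as follows. This is an illustration only and not the code behind the charts: it assumes the indicator reading has already been reduced to +1 (long), -1 (short) or 0 (flat), that the position taken at one close earns the price change to the next close, and that returns are simple price differences. The function names are hypothetical.

```python
import numpy as np

def hold_previous(signal):
    """Replace flat (0) readings with the most recent non-flat position."""
    position = np.array(signal, dtype=float)
    for i in range(1, len(position)):
        if position[i] == 0:
            position[i] = position[i - 1]
    return position

def equity_curve(close, position):
    """Cumulative close to close returns of holding position[t] from t to t+1."""
    close = np.asarray(close, dtype=float)
    position = np.asarray(position, dtype=float)
    changes = np.diff(close)  # close[t+1] - close[t]
    return np.cumsum(position[:-1] * changes)

close = [100.0, 101.0, 103.0, 102.0, 104.0]
signal = [1, 0, -1, 0, 1]

buy_and_hold = equity_curve(close, np.ones(len(close)))   # the yellow curve
go_flat = equity_curve(close, signal)                     # the blue curve
keep_position = equity_curve(close, hold_previous(signal))  # the red curve
```

The "peek into the future" version would simply shift the signal so that each position is set using the indicator reading from later bars, which is why it cannot be traded as is.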