"Trading is statistics and time series analysis." This blog details my progress in developing a systematic trading system for use on the futures and forex markets, with discussion of the various indicators and other inputs used in the creation of the system. Also discussed are some of the issues/problems encountered during this development process. Within the blog posts there are links to other web pages that are/have been useful to me.


Thursday, 31 October 2013

For my first steps in investigating the earlier proposed Brownian motion model I thought I would use some Exploratory Data Analysis techniques. I have initially decided to use look-back periods of 5 bars and 21 bars: the 5 bar on daily data being a crude attempt to capture the essence of weekly price movement, and the 21 bar to capture the majority of dominant cycle periods in daily data, with both parameters also being non-optimised Fibonacci numbers. Using the two channels over price bars that these create, it is possible to postulate nine distinct price "regimes" from combinations of the criteria of being above, within or below each channel.

These "regimes" are used to bin the normalised price change at time t into nine separate distributions, the normalising factor being the 5 bar simple moving average of absolute natural log differences at time t-1, and the choice of regime being determined by the position of the close at t-1. The rationale behind basing these metrics on preceding bars is explained in this earlier post, and the implementation is made clear in the code box at the end of this post. The following four charts are box plots of these nine distributions, in consecutive order, for the EURUSD, GBPUSD, USDCHF and USDYEN forex pairs:

Looking at these, I'm struck by two things. Firstly, the middle three box plots are for regimes where price is within the 21 bar channel: numbering from the left, boxes 4, 5 and 6 correspond to price below, within and above the 5 bar channel respectively. These are essentially different degrees of a sideways market, and it can be seen that there are far more outliers here than in the other box plots, perhaps indicating a tendency for high volatility breakouts to occur more frequently in sideways markets.

The second striking feature is, again numbering from the left, box plots 3 and 7. These are the regimes below the 21 bar channel but above the 5 bar channel, and above the 21 bar channel but below the 5 bar channel - essentially the distributions for reactions against prevailing trends on the 21 bar time scale. Here it can be seen that there are far fewer outliers and generally shorter whiskers, indicating a tendency for reactions against trends to be less volatile in nature.

It is encouraging to see such differences between these box plots, as it suggests that my binning idea does separate out the different price change characteristics. However, I believe things can be improved a bit in this regard, and this will form the subject matter of my next post.
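The binning scheme can be sketched as follows. This is a Python/NumPy sketch rather than the blog's Octave code box, and it uses a plain rolling high/low channel as a stand-in for the channels described above - the channel construction, function names and parameter defaults are all illustrative assumptions:

```python
import numpy as np

def bin_normalised_changes(close, short_n=5, long_n=21, sma_n=5):
    # Bin the normalised log price change at each bar t into one of nine
    # regime distributions (regime ids 0..8).  The regime is fixed by where
    # the close at t-1 sits relative to a short and a long channel built
    # from the bars preceding t-1; the normaliser is the sma_n-bar SMA of
    # absolute log differences up to t-1.
    c = np.asarray(close, dtype=float)
    logp = np.log(c)
    dlog = np.diff(logp)                  # dlog[k] is the change realised at bar k+1
    bins = {r: [] for r in range(9)}
    start = max(long_n + 1, sma_n + 1)
    for t in range(start, len(c)):
        pos = []
        for n in (short_n, long_n):
            window = c[t - 1 - n:t - 1]   # the n bars strictly before t-1
            lo, hi = window.min(), window.max()
            pos.append(0 if c[t - 1] < lo else (2 if c[t - 1] > hi else 1))
        regime = 3 * pos[1] + pos[0]      # long-channel state picks the "row"
        norm = np.abs(dlog[t - 1 - sma_n:t - 1]).mean()
        if norm > 0:
            bins[regime].append(dlog[t - 1] / norm)
    return bins

# quick demo on a synthetic random walk
rng = np.random.default_rng(0)
prices = 100 * np.exp(np.cumsum(0.01 * rng.standard_normal(500)))
bins = bin_normalised_changes(prices)
```

Feeding each of the nine lists to a box plot routine reproduces the kind of charts shown above.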

Sunday, 27 October 2013

As stated in my previous post, I'm going to rewrite my data generation code for future use in online training of neural nets, and the approach I'm going to take is to combine the concept of Brownian motion with the ideas contained in my earlier Creation of Synthetic Data post.

The basic premise, taken from Brownian motion, is that the natural log of price changes, on average, at a rate proportional to the square root of time. Take, for example, a period of 5 bars leading up to the current bar. If we take a 5 period simple moving average of the absolute differences of log prices over this period, we get a value for the average 1 bar price movement over this period. This value is then multiplied by the square root of 5 and added to and subtracted from the log price of 5 bars ago to get an upper and a lower bound for the current bar. If the current bar lies between the bounds, we say that price movement over the last 5 bars is consistent with Brownian motion and declare an absence of trend, i.e. a sideways market. If the current bar lies outside the bounds, we declare that price movement over the last 5 bars is not consistent with Brownian motion and that a trend is in force, either up or down depending on which bound the current bar is beyond. The following three charts show this concept in action, for consecutive periods of 5, 13 and 21, taken from the Fibonacci sequence:

where yellow is the closing price, blue the upper bound and red the lower bound. It is easy to imagine many uses for this in terms of indicator creation, but I intend to use the bounds to score price randomness/trendiness over various combined periods and so assign price movement to bins for subsequent Monte Carlo creation of synthetic price series. Interested readers are invited to read the above linked Creation of Synthetic Data post for a review of this methodology.

The rough working Octave code that produced the above charts is given below.

Saturday, 26 October 2013

Up until now, when training my neural net classifier, I have been using what I call my "idealised" data, which is essentially a model for prices composed of a sine wave plus a linear trend. In my recent work on Savitzky-Golay filters I wanted to increase the amount of available training data, but I have run into a problem: the data files have become so large that I'm exhausting the memory available in my Octave software install. To overcome this I have decided to convert my NN training to an online regime rather than loading csv data files for mini-batch gradient descent training - in fact, what I envisage doing would perhaps be best described as online mini-batch training.
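One way such an online mini-batch regime might look in practice is sketched below in Python (the stream protocol, names and batch size are my assumptions, not the blog's implementation): samples are consumed one at a time from a generator and handed to the optimiser in fixed-size batches, so the full training set is never held in memory.

```python
import numpy as np

def online_minibatches(sample_stream, batch_size=32):
    # Accumulate (features, target) pairs from a stream and yield each
    # mini-batch as soon as it is full.  Any trailing partial batch is
    # dropped in this sketch.
    feats, targs = [], []
    for x, y in sample_stream:
        feats.append(x)
        targs.append(y)
        if len(feats) == batch_size:
            yield np.array(feats), np.array(targs)
            feats, targs = [], []
```

Each yielded batch can then be used for one gradient descent step before being discarded, in place of reading a monolithic csv file.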

Tuesday, 15 October 2013

In my previous post I said that I was going to refactor code to use my existing, trained classification neural net with Savitzky-Golay filter inputs, in the hope that this might show some improvement. I have to report that this was an abysmal failure, and I suppose it was a bit naive to have ever expected it to work. It looks like I will have to explicitly train a new neural network to take advantage of Savitzky-Golay filter features.

Sunday, 6 October 2013

For the past couple of weeks I have been playing around with Savitzky-Golay filters in the hope of creating a moving endpoint regression line as a form of zero lag smoothing, but unfortunately I have been unable to come up with anything remotely satisfying, and I think I'm going to abandon this for now. My view at the moment is that the utility of SG filters, for my purposes at least, is limited to feature extraction via the polynomial coefficients - using them to get the 2nd, 3rd and 4th derivatives as inputs for my neural net classifier.
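The derivative-extraction idea can be illustrated directly: a Savitzky-Golay filter fits a polynomial to each window by least squares, so the derivatives at the window centre fall straight out of the fitted coefficients. A Python/NumPy sketch for a single window (the window length, polynomial order and function name are illustrative, not the blog's settings):

```python
import numpy as np
from math import factorial

def sg_window_derivatives(window_prices, polyorder=4, dt=1.0):
    # Fit a polynomial of the given order to one window of prices, with
    # x = 0 at the window centre; the k-th derivative there is then just
    # k! * coeffs[k].  This is exactly what an SG filter computes at each
    # step, so the 2nd, 3rd and 4th derivatives come for free.
    y = np.asarray(window_prices, dtype=float)
    x = np.arange(len(y)) - len(y) // 2   # centre the window at x = 0
    coeffs = np.polynomial.polynomial.polyfit(x, y, polyorder)
    return np.array([factorial(k) * coeffs[k] / dt ** k for k in (2, 3, 4)])
```

Sliding this over the price series yields a three-feature vector per bar for the classifier.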

To do this will require a time consuming retraining period, so before I embark on it I'm going to try what I think will be an immediately useful SG filter application. In a previous post readers can see three screenshots of moving windows of my idealised price series with SG filters overlaid. The SG filters follow these windowed "prices" so closely that, for practical purposes, SG filters more or less == windowed idealised prices and the features derived therefrom. However, when it comes to applying my currently trained NN, the features are derived from the noisy real prices seen in the video of the above linked post. What I intend to do, wherever possible, is apply a cubic SG filter to windowed prices and then derive the input features from the SG filter values. Effectively I will be coercing real prices into smooth curves that far more closely resemble the idealised prices my NN was trained on. The advantage is that I will not have to retrain the NN, but only refactor my feature extraction code to work on real prices. The hope, of course, is that this tweak will result in vastly improved performance - to the extent that I will not need to consider retraining the classifier NN with polynomial derivatives and can proceed directly to training other NNs.
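The coercion step can be sketched as a centred moving cubic fit, again in Python rather than Octave (window length and edge handling here are illustrative choices, not the blog's):

```python
import numpy as np

def sg_smooth(prices, window=11, polyorder=3):
    # Centred moving polynomial fit: for each interior point, fit a cubic
    # to the surrounding window and keep the fitted value at the centre.
    # Endpoints are simply left as the raw prices in this sketch; a
    # production SG filter would handle the edges properly.
    p = np.asarray(prices, dtype=float)
    half = window // 2
    out = p.copy()
    x = np.arange(window) - half          # window centred at x = 0
    for i in range(half, len(p) - half):
        coeffs = np.polynomial.polynomial.polyfit(
            x, p[i - half:i + half + 1], polyorder)
        out[i] = coeffs[0]                # fitted value at the centre
    return out
```

Applied to a noisy sine-plus-trend series of the "idealised" kind, the output tracks the underlying smooth curve far more closely than the raw prices do, which is precisely the resemblance the NN inputs need.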

Finally, on a related note, I have recently been directed to this paper, which has some approaches to time series classification that I might incorporate in future NN training. Fig. 5 on page 9 shows six examples of time series classification categories. My idealised price categories currently encompass what the paper describes as "Cyclic," "Increasing Trend" and "Decreasing Trend." I shall have to consider whether it might be useful to extend my categories to include "Normal," "Upward Shift" and "Downward Shift."

Wednesday, 2 October 2013

I was recently contacted and asked if I wouldn't mind writing a short review of the Quantpedia website in return for complimentary access to that site's premium section, and I am happy to do so. Readers can get a flavour of the website by perusing the free content that is available in the Screener tab and reading the Home, About and How We Do It tabs. Since these are readily accessible by any visitor to the site, this review will concentrate purely on the content available in the premium section.

The thing that strikes me is the wide and eclectic range of studies available, which can easily be seen by clicking on the keywords dropdown box on the Screener tab. There is something for almost all trading styles, and I would be surprised if a visitor to the premium section found nothing of value. As always the devil is in the details, and of course I haven't read anywhere near all the information available in the various studies, but a brief visual overview of the studies' performance, taken from Quantpedia's screener page, is provided in the scatter chart below:

(N.B. Studies that lie exactly on the y-axis (volatility = 0%) do not actually have zero volatility; no % figure for volatility was given in the screener.)

As can be seen there are some impressive performers, which are fully disclosed in the premium section. A few studies, irrespective of performance, caught my immediate attention. Firstly, there are a couple of studies that look at lunar effects in stock markets and precious metals. Long time readers of this blog will know that some time back I did a series of tests on the Delta Phenomenon, which is basically a lunar effect model of price turning points. The conclusion of those tests was that the Delta Phenomenon did have statistically significant predictive ability. The above-mentioned studies come to the same conclusion with regard to lunar effects, although via a different testing methodology. It is comforting to have one's own research conclusions confirmed by independent researchers, and it is a valuable resource to be able to see how other researchers approach the testing of similar market hypotheses.

Secondly, there is a study on using Principal Components Analysis to characterise the current state of the market. This is very much in tune with what I'm working on at the moment with my neural net market classifier, and the idea of using PCA as an input is a new one to me and one that I shall almost certainly look into in more detail.

This second point, I think, neatly sums up the main value of the studies on the Quantpedia site - they can give you new insights into how one might develop one's own trading system(s), with the added benefit that the idea is not a dead end, because it has already been tested by the original paper's authors and by the Quantpedia team. You could also theoretically take one of the studies as a stand-alone system and tweak it to suit your own needs, or add it as a form of diversification to an existing set of trading systems. Given the wide range of studies available, this would be a much more robust form of diversification than merely adjusting a look-back length, a parameter value or some other such cosmetic adjustment.

In conclusion, since I can appreciate the value in the Quantpedia site, I would like to thank Martin Nizny of Quantpedia for extending me the opportunity to review the premium section of the site.