"Trading is statistics and time series analysis." This blog details my progress in developing a systematic trading system for use on the futures and forex markets, with discussion of the various indicators and other inputs used in the creation of the system. Also discussed are some of the issues/problems encountered during this development process. Within the blog posts there are links to other web pages that are/have been useful to me.

Friday, 30 March 2012

I am pleased to say that the first phase of my Kalman filter coding, namely writing Octave code, is now complete. In doing so I have used/adapted code from the MATLAB toolbox available here. The second phase of coding, at some future date, will be to convert this code into a C++ .oct function. My code is a stripped down version of the 2D CWPA demo, which models price as a moving object with position and velocity, and which is described in detail with my model assumptions below.

The first thing I had to decide was what to actually model, and I decided on VWAP. The framework of the Kalman filter is that it tracks an underlying process that is not necessarily directly observable but for which measurements are available. VWAP calculated from OHLC bars fits this framework nicely. If one had access to high frequency daily tick data the VWAP could be calculated exactly, but since the only information available for my purposes is the daily OHLC, the daily OHLC approximation of VWAP is the observable measurement of the "unobservable" exact VWAP.

The next thing I considered was the measurement noise of the filter. Some algebraic manipulation of the VWAP approximation formula (see here) led me to choose two thirds (or 0.666) of the Hi-Lo range of the bar as the measurement noise associated with any single VWAP approximation, this being the maximum possible range of values that the VWAP can take given a bar's OHLC values.
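The linked VWAP approximation formula is not reproduced in this post, so the following Python sketch rests on an assumption: that the approximation is the mean of the open, the close and the bar's midpoint. That form is chosen only because it is consistent with the two thirds figure quoted above; the function names are hypothetical.

```python
def approx_vwap(o, h, l, c):
    """ASSUMED OHLC approximation of VWAP: mean of open, close and bar midpoint."""
    return (o + c + (h + l) / 2.0) / 3.0


def measurement_noise(h, l):
    """Two thirds of the bar's Hi-Lo range, as described in the post."""
    return 2.0 * (h - l) / 3.0


# Extreme cases for one bar: open and close both at the low, or both at the high.
h, l = 110.0, 100.0
lowest = approx_vwap(l, h, l, l)
highest = approx_vwap(h, h, l, h)

# Under the assumed formula the spread of possible values equals 2/3 of the range.
print(highest - lowest, measurement_noise(h, l))
```

Under this assumed formula the open and close can each move anywhere between the low and the high while the midpoint stays fixed, which is where the two thirds of the range comes from.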

Finally, for the process noise I employed a simple heuristic of the noise being half the bar to bar variation in successive VWAPs, the other half in this assumption being attributable to the process itself.

Having decided on the above, the next step was to initialise the filter covariances. To do this I decided to use the Median Absolute Deviation (MAD) of the noise processes as a consistent estimator of the standard deviation, with the scale factor of 1.4826 for normally distributed data (the Kalman filter assumes Gaussian noise), to calculate the noise variances (see this wiki for more details). I did have a concern about "look ahead bias" with this approach, but a simple test dispelled these fears. This code box

shows the last 50 values of the Kalman filter with different amounts of data used for the calculations for the initialisation of the filter. The leftmost column shows filter values using all available data for initialisation, the next all data except the most recent 50 values, then all data except the most recent 100 values etc. with the rightmost column being calculated using all data except for the most recent 350 values. This last column is akin to using the data through to the end of 2010, and nothing after this date. Comparison between the left and rightmost columns shows virtually insignificant differences. If one were to begin trading the right hand edge of the chart today, initialisation would be done using all available data. If one then traded for the next one and a half years and then re-initialised the filter using all this "new" data, there would be no practical difference in the filter values over this one and a half year period. So, although there may be "look ahead bias," frankly it doesn't matter. Such is the power of robust statistics and the recursive calculations of the Kalman filter combined!
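As a rough illustration of the MAD based initialisation described above, here is a minimal Python sketch. It is not the Octave code used for the results in this post. It assumes the measurement noise series is two thirds of each bar's Hi-Lo range and the process noise series is half the signed bar to bar change in VWAP; the function names and the sample numbers are made up for the example.

```python
from statistics import median

MAD_TO_SIGMA = 1.4826  # consistency constant for normally distributed data


def mad_sigma(values):
    """Robust standard deviation estimate: 1.4826 * median(|x - median(x)|)."""
    values = list(values)
    centre = median(values)
    return MAD_TO_SIGMA * median([abs(v - centre) for v in values])


def noise_variances(vwap, high, low):
    """Return (measurement_variance, process_variance) under the stated assumptions."""
    meas_noise = [2.0 * (h - l) / 3.0 for h, l in zip(high, low)]
    proc_noise = [0.5 * (b - a) for a, b in zip(vwap, vwap[1:])]
    return mad_sigma(meas_noise) ** 2, mad_sigma(proc_noise) ** 2


# Made-up bars, purely to exercise the functions.
vwap = [10.0, 11.0, 10.5, 12.0, 11.0]
high = [11.0, 13.0, 11.0, 13.5, 13.5]
low = [9.0, 9.0, 10.0, 10.5, 8.5]
r_var, q_var = noise_variances(vwap, high, low)
print(r_var, q_var)

# One wild value barely moves the estimate, which is the point of using the MAD.
print(mad_sigma([1, 2, 3, 4, 100]))
```

A look ahead check along the lines of the test above would call noise_variances on the full history and again on a truncated history, then compare the resulting filter outputs.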

Note that this code calls three functions (lti_disc, kf_predict and kf_update) which are part of the above mentioned MATLAB toolbox. If readers wish to replicate my results, they will have to download said toolbox and put these functions where they may be called by this script.
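For readers who want to see the shape of the recursion without the toolbox, the predict and update steps for a position and velocity model can be sketched in plain Python. This is not the toolbox code and not a translation of the script above; the functions predict and update are hypothetical stand-ins. The process covariance uses the standard discretisation of a continuous white noise acceleration model with spectral density q, and the values of q, r and the ramp data are arbitrary choices for illustration. The 2 sigma band here is taken from the predicted position variance, which is an assumption about how the levels in the chart were derived.

```python
import math


def predict(x, P, dt, q):
    """Propagate state (position, velocity) and covariance one step ahead."""
    pos, vel = x
    (p00, p01), (p10, p11) = P
    x_pred = (pos + dt * vel, vel)
    # F P F' + Q for F = [[1, dt], [0, 1]] and a white noise acceleration Q
    P_pred = (
        (p00 + dt * (p01 + p10) + dt * dt * p11 + q * dt ** 3 / 3.0,
         p01 + dt * p11 + q * dt ** 2 / 2.0),
        (p10 + dt * p11 + q * dt ** 2 / 2.0,
         p11 + q * dt),
    )
    return x_pred, P_pred


def update(x, P, z, r):
    """Correct the prediction with a scalar position measurement z of variance r."""
    pos, vel = x
    (p00, p01), (p10, p11) = P
    innovation = z - pos
    s = p00 + r                   # innovation variance
    k0, k1 = p00 / s, p10 / s     # Kalman gain for position and velocity
    x_new = (pos + k0 * innovation, vel + k1 * innovation)
    P_new = (
        ((1.0 - k0) * p00, (1.0 - k0) * p01),
        (p10 - k1 * p00, p11 - k1 * p01),
    )
    return x_new, P_new


# Illustrative data: a noiseless ramp rising 2 points per bar.
measurements = [100.0 + 2.0 * k for k in range(50)]
x, P = (measurements[0], 0.0), ((1.0, 0.0), (0.0, 1.0))
for z in measurements:
    x_pred, P_pred = predict(x, P, dt=1.0, q=0.1)
    band = 2.0 * math.sqrt(P_pred[0][0])  # assumed 2 sigma level around the prediction
    x, P = update(x_pred, P_pred, z, r=1.0)

print(x, band)
```

On the ramp the filter starts with zero velocity and should settle on the true slope after a few bars. With real data, r and q would come from the measurement and process noise variances discussed earlier.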

Below is a screen shot of my Kalman filter in action.

This shows the S&P E-mini contract (daily bars) up to a week or so ago. The white line is the Kalman filter, the dotted white lines are the plus and minus 2 sigma levels taken from the covariance matrix, and the red and light blue triangles show the output of the kf_predict function, prior to being updated by the kf_update function, but only shown if above (red) or below (blue) the 2 sigma level. As can be seen, while price is obviously trending most points are within these levels. The colour coding of the bars is based upon the market type as determined by my Naive Bayesian Classifier, Mark 2.

This next screen shot

shows price bars immediately prior to the first screen shot where price is certainly not trending, and it is interesting to note that the kf_predict triangles are now appearing at the turns in price. This fact may mean that the kf_predict function might be a complementary indicator to my Perfect Oscillator function

and Delta

along with my stable of other turn indicators. The next thing I will have to do is come up with a robust rule set that combines all these disparate indicators into a coherent whole. Also, I am now going to use the Kalman filter output as the input to all my other indicators. Up till now I have been using the typical price, (High+Low+Close)/3, as my input, but I think the Kalman filtered VWAP for "today's" price action is a much more meaningful price input than "tomorrow's" pivot point!

Wednesday, 14 March 2012

In my travels around the internet as part of research on the Kalman filter I have found this YouTube tutorial which, although quite chatty, is a good introduction, and as an added bonus the MATLAB/Octave code is also supplied. A typical plot of this code is:

where

cyan is the noisy measurement

red is the underlying trajectory (hardly discernible as it lies under the plot of the filter)

Sunday, 11 March 2012

Over the years, on and off, I have tried to find code or otherwise code for myself a Kalman filter but unfortunately I have never really found what I want; the best I have at the moment is an implementation that is available from the technical papers and seminars section at the MESA Software web page. However, I recently read this R-Bloggers post which inspired me to look again for code on the web, and this time I found this, which is exactly what I want: accessible Octave-like code that will enable me to fully understand (I hope!) the theory behind the Kalman filter and to be able to code my own Kalman filter function. After a little tinkering with the code (mostly plotting and inputs) a typical script run produces this plot:

I particularly like this example script as it mirrors the approach I have taken in the past with regard to creating my "idealised" sine wave time series for development purposes. I think the screen shot speaks for itself; the Kalman filter seems uncannily accurate in filtering out the noise to get the "true" underlying signal, with almost no lag at all! I shall definitely be doing some work with this in the very near future.

Wednesday, 7 March 2012

Following on from a quick post to this Trading Blox forum thread I provide the C++ .oct function code for my implementation of the Robust Repeated Median in the code box below. Due to formatting issues the headers are not shown in the code box; they should be:
#include <octave/oct.h>
#include <octave/dColVector.h>
#include <octave/parse.h> // necessary for the call to feval
#include <octave/ov.h> // necessary for conversion to double

For more information about the Robust Repeated Median, interested readers are referred to this webpage and this pdf file. Taking the example given in this linked pdf, the above function gives this plot

where it can be seen that the regression line is completely unaffected by the two outliers. As always, if readers see any errors in my code or can suggest improvements, please let me know.
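For readers who do not have the .oct function to hand, the idea can be sketched in a few lines of Python. This is not a port of my C++ code; it is a minimal version of Siegel's repeated median in which the slope is the median over points of the median pairwise slope, and the intercept is taken as the median of y - slope * x (one common choice). The data are made up for the illustration and are not the example from the linked pdf.

```python
from statistics import median


def repeated_median(x, y):
    """Return (slope, intercept) of the repeated median regression line."""
    n = len(x)
    per_point = []
    for i in range(n):
        # Slopes from point i to every other point with a different x value
        pairwise = [(y[j] - y[i]) / (x[j] - x[i])
                    for j in range(n) if j != i and x[j] != x[i]]
        per_point.append(median(pairwise))
    slope = median(per_point)
    intercept = median([yi - slope * xi for xi, yi in zip(x, y)])
    return slope, intercept


# A clean line y = 2x + 1 with two gross outliers planted in it.
x = list(range(10))
y = [2 * xi + 1 for xi in x]
y[3] = 50
y[7] = -40

slope, intercept = repeated_median(x, y)
print(slope, intercept)
```

Because each inner median is taken over nine pairwise slopes of which at most two involve an outlier, the two bad points cannot shift the fitted line in this example.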