Sunday, August 31, 2014

I came across a blog post and a paper describing methods for implementing an ARIMA model as a price forecaster. They both deal with the stock market, which is not exactly the same as FOREX. But given the lack of ARIMA examples involving FOREX, I've decided to take a look at these:

From the blog post it seems that ARIMA doesn't have very good resolution in its predictions. The most it seems able to predict is a likely dynamic range for the future return rate. The paper shows how an ANN predicts fluctuations more closely. Still, this paper uses an FFNN, whereas the Russian paper from one of my earlier posts uses an SRN.

ARIMA seems more limited than the SRN in its predictive power. Therefore, I have decided to give priority to the SRN approach, using the newelm function from neurolab which I mentioned in an earlier post.

The ARIMA implementation, nevertheless, has inspired me to attempt a new rustic buy-and-hold algorithm. One which takes into account the density of price falls through time, and a likely dynamic range for it. I will be reporting on the results later on.

Thursday, August 28, 2014

In the previous post I showed that price fluctuations have an approximately Gaussian distribution, which makes them resemble Gaussian noise. Apparently, there are several ways to predict Gaussian time series. One is described here:

http://www.gaussianprocess.org/gpml/chapters/RW.pdf

But it involves concepts which are too advanced and complicated for me at this moment. Besides, Gaussian processes are not the same as Gaussian noise. It seems that a Gaussian process is formally defined as a process involving a "multivariate normal distribution". Since the price fluctuations I'm studying have only one variable, I believe using Gaussian process prediction methods would be overkill. I might be wrong, though.

Another method is one used to filter out Gaussian noise. It's called "Kalman filtering". Kalman filtering seems useful, because it works by predicting Gaussian noise in order to eliminate it. Because currency price fluctuations follow a Gaussian distribution, I think the predictive component of the Kalman filter may be useful in predicting short-term future fluctuations in currency price.

The following seems like a good, comprehensive, introduction to Kalman filtering:

So the basic idea is to treat FOREX price fluctuation as if it were Gaussian noise, and try to predict it short-term with a Kalman filter.
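A minimal sketch of that idea, under stated assumptions: a scalar Kalman filter with a random-walk state model, fed one observation at a time. The function name kalman_1d, the noise variances and the initial uncertainty are illustrative choices, not values taken from any of the references. With this simple model the "prediction" for the next sample is just the current filtered estimate, so it tracks a slowly drifting level rather than individual spikes.

```python
def kalman_1d(observations, process_var=1e-5, measurement_var=1e-3):
    """Scalar Kalman filter with a random-walk state model.

    Returns one-step-ahead predictions: predictions[k] is the estimate
    of observations[k] made before that observation is seen.
    """
    estimate = observations[0]  # initial state guess (assumption)
    error_var = 1.0             # initial uncertainty (assumption)
    predictions = []
    for z in observations:
        # Predict: a random walk keeps the estimate; uncertainty grows.
        error_var += process_var
        predictions.append(estimate)
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
    return predictions


if __name__ == "__main__":
    # Made-up fluctuation values, only to show the call.
    changes = [0.0, 0.0001, -0.0002, 0.0001, 0.0003, -0.0001]
    print(kalman_1d(changes))
```

A larger process_var makes the filter react faster to new samples. A larger measurement_var makes it smoother and slower.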

I want to be able to predict short-term future price fluctuations because I discovered that sudden and large price decrements produce important losses when using my rustic buy-and-hold algorithm. These predictions may turn out to be useful in setting up an effective predictive stop-loss alarm for my rustic buy-and-hold algorithm.

Sunday, August 24, 2014

I've been measuring the distribution of negative fluctuations in forex market history for the EUR_USD. I discovered that most variations in price (from one instant to the next, with a time base of 1 minute) are "small" compared to the larger ones. Sudden, large increments or decrements are scarce. The following plot shows the sorted (smallest to largest) price changes, both positive and negative, over many weeks (each week has a different color).

The plot shows a large number of "small" values near the center (which is zero), and only a few large ones near the sides. This implies that a span of time with a high density of price decrements is more likely to contain large negative price fluctuations than a span with few price decrements.

To extract the frequency of decrements at any point in time, only the direction of the currency price is required. The following picture shows price decrements in red, and increments in green. Blank spaces represent unknown currency fluctuations in a given instant. What I try to find here is how many red squares there are within a given window. Then I shift that window to the right, and count the red squares again. I keep shifting and counting until I reach the end of the data set.
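The window counting described above can be sketched as follows. This is an illustration only: the function name decrement_counts is made up, unknown prices are represented as None, and treating a step with an unknown price as "not a decrement" is an assumption.

```python
def decrement_counts(prices, window):
    """Count price decrements inside each sliding window of `window` steps."""
    # 1 where the price fell from one sample to the next, 0 otherwise.
    falls = []
    for prev, cur in zip(prices, prices[1:]):
        if prev is None or cur is None:
            falls.append(0)  # assumption: unknown steps are not counted
        else:
            falls.append(1 if cur < prev else 0)

    if window <= 0 or window > len(falls):
        return []

    # Count the first window, then shift it one step at a time,
    # adding the step that enters and dropping the step that leaves.
    count = sum(falls[:window])
    counts = [count]
    for i in range(window, len(falls)):
        count += falls[i] - falls[i - window]
        counts.append(count)
    return counts


if __name__ == "__main__":
    sample = [5, 4, 3, 4, None, 3, 2, 1]
    print(decrement_counts(sample, 3))
```

Updating the count incrementally avoids re-scanning the whole window at every shift, which matters with thousands of 1-minute samples.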

Interestingly, it turns out the set of frequencies (sampled by shifting the window from top-left to bottom-right) also has a normal distribution. This means there are more "small" frequencies than "big" ones in all consecutive windows of time.

Plotting the histogram for both data-sets showed an approximate normal (Gaussian) distribution.

This shows that the variations in price are mostly smooth, with occasional outbursts. That is: they are more likely to fluctuate within a given (relatively) small range at any given instant. There is a certain stability in currency price change.

Wednesday, August 20, 2014

From NN's to buy-and-hold. From the relatively complex, to the very simple.

In these last few days I've been imagining a new (?) algorithm for doing simple buy-and-hold trades. The key is in finding the most probable highest and lowest currency prices for a given point in time. It would involve something like taking a list of the last N prices, sorting it, and averaging the top M values to get some "local average maximum" and the bottom L values to get a "local average minimum". Then, placing limit orders with those values. Come to think of it, it sounds a little like RSI's overbought and oversold indicators. But it's not quite it.
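As a rough sketch of that calculation (the function name local_extremes and the sample numbers are made up for illustration, and nothing here places real orders):

```python
def local_extremes(prices, n, m, l):
    """Return (local average maximum, local average minimum).

    Takes the last `n` prices, sorts them, then averages the top `m`
    values and the bottom `l` values.
    """
    if not (1 <= m <= n and 1 <= l <= n):
        raise ValueError("m and l must be between 1 and n")
    recent = sorted(prices[-n:])
    if len(recent) < max(m, l):
        raise ValueError("not enough prices for the requested averages")
    top = recent[-m:]
    bottom = recent[:l]
    return sum(top) / len(top), sum(bottom) / len(bottom)


if __name__ == "__main__":
    history = [1.10, 1.30, 1.20, 1.50, 1.40, 1.00]
    sell_limit, buy_limit = local_extremes(history, n=5, m=2, l=2)
    print(sell_limit, buy_limit)
```

The two returned values would be the candidate prices for the sell and buy limit orders. How N, M and L should be chosen is left open here.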

Pretty crude. But I'll do it while I work on the NN approach, and see what happens.

Let's say that I find the most profitable MACD parameters for a certain month or year. This is no guarantee that the same parameters will be profitable next year or month. I think trying to find optimal MACD parameters for a particular dataset would be curve-fitting. I recently found a message board post about skepticism towards technical analysis. It's from a trusted source (the James Randi website [atheist/skeptic/critical thinker here BTW]). Do take a look at the references mentioned by the posters, though:

http://forums.randi.org/showthread.php?t=96372

That said, I've come across a paper which says neural networks do a good job predicting forex market trends (http://arxiv.org/pdf/cond-mat/0304469.pdf). It uses a neural network architecture which is a mix between an Elman and a Jordan SRN. I believe it doesn't say what training algorithm they used to teach the network to predict market trends. In any case, I will probably be using RTRL, because it seems less resource-consuming than BPTT. I doubt my crappy computer can handle BPTT for the large amounts of data I plan to feed my SRN. Also, I would like to start with a pure Elman architecture, instead of the "Elman-Jordan" architecture suggested in the paper. I just don't have the expertise in NN's to copy the paper step by step.

I should mention I've never implemented an FFN, much less an SRN. After spending several days looking into the details of how FFN's and SRN's work, I've come up with a new TODO list:

Understand how FFN's work (check)

Understand how SRN's work (check)

Understand the backpropagation algorithm for FFN's

Understand the BPTT algorithm for SRN's

Understand the RTRL algorithm for SRN's

It would seem one can't understand RTRL without first understanding BPTT and the classic BP algorithm, as they are somehow part of RTRL. Fortunately, there are several YouTube videos on all of them, as well as some pretty good online resources with worked examples:

https://www.youtube.com/watch?v=yecGyZFyfbQ

https://www.youtube.com/watch?v=hYenZlvBwr4

http://neuralnetworksanddeeplearning.com/chap1.html

From what I've gathered so far, the NN has to be an SRN because those are the ones useful for predicting time series. FFN's are useless for that. But that's all I know so far. I'm still trying to grasp the BP algorithm, in order to implement it.

Also, I've found some interesting NN resources. The only library I've found (and liked) which includes an Elman network generator and trainer is "neurolab" for Python:

I like the fact that it's in Python because it would allow me to integrate it easily with a shell script. Still, I'd like to implement my own SRN in C. Also, I haven't found out whether neurolab's newelm uses BPTT or RTRL. I don't know enough Python to re-program the whole learning function for Elman, so I'm probably better off writing everything from scratch in C.

A final word about technical analysis though... I'm not sure if NN's fit into the "technical analysis" category. Maybe not ALL technical analysis is fake. In the Randi boards, they mostly mention the typical indicators: MACD, RSI, CCI, etc. Also, the mathematical analysis in the SRN paper from earlier goes pretty deep into statistics. I should be looking into that too. Grasping the statistics surrounding FOREX seems worthwhile.

So who knows... I guess I'll wait and see what my own experiments tell me.

Sunday, August 17, 2014

I have finished programming the MACD fitness function in C. This function takes in the 3 usual parameters for the MACD and returns the balance after the last sell. This function will be useful in implementing the genetic algorithm (GA).

In the following source code, macdbal() is this fitness function. The program works by filling in all the missing candles from the data downloaded by the dl2.sh script from an earlier post. It then calculates the Simple Moving Averages used to obtain the MACD. Based on the MACD, macdbal() "buys" (subtracts the closing ask price from the balance) and "sells" (adds the closing bid price to the balance).
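For readers who want the gist without the C source, here is a rough Python sketch of a fitness function of this kind. It is not the original macdbal(). It follows the description above in using simple moving averages (the textbook MACD uses exponential ones), and the trading rule (buy when the MACD line is above its signal line, sell when it drops below) is an assumption, as are the default periods 12, 26 and 9.

```python
def sma(values, period):
    """Simple moving average; entries before a full window are None."""
    out = [None] * len(values)
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= period:
            running -= values[i - period]
        if i >= period - 1:
            out[i] = running / period
    return out


def macdbal(bids, asks, fast=12, slow=26, signal=9):
    """Return the balance after the last sell of an SMA-based MACD strategy."""
    fast_ma = sma(bids, fast)
    slow_ma = sma(bids, slow)
    offset = max(fast, slow) - 1  # first candle where both averages exist
    macd = [fast_ma[i] - slow_ma[i] for i in range(offset, len(bids))]
    signal_line = sma(macd, signal)

    balance = 0.0
    balance_after_last_sell = 0.0
    holding = False
    for j in range(signal - 1, len(macd)):
        i = offset + j  # index of the candle this MACD value belongs to
        if not holding and macd[j] > signal_line[j]:
            balance -= asks[i]  # "buy" at the closing ask
            holding = True
        elif holding and macd[j] < signal_line[j]:
            balance += bids[i]  # "sell" at the closing bid
            holding = False
            balance_after_last_sell = balance
    return balance_after_last_sell


if __name__ == "__main__":
    bids = [10, 9, 8, 7, 6, 7, 8, 9, 10, 9, 8, 7, 6]
    asks = [b + 0.1 for b in bids]
    print(macdbal(bids, asks, fast=2, slow=4, signal=2))
```

Returning the balance as of the last completed sell means a position that is still open at the end of the data does not distort the score.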

A few days ago I had the idea of using a tektonic virtual server (http://www.tektonic.net/virtual-servers.html) to run my FOREX trading scripts reliably. The advantages are obvious: I'd free myself from hardware maintenance and any fear of power outages interfering with my trading.

It's only 15 dollars a month, and accessible by SSH. This would allow me to download reports and tweak my trading algorithms from my smartphone 24/7. But I will probably be doing this only once I achieve a profitable trading algorithm.

I've been interested in FOREX trading since I enrolled in college around 8 years ago, reading pieces of articles about it here and there. I am now a 26-year-old electronics engineer with some graduate school experience in computer science. I think I've matured enough, mathematically and intellectually, to start playing with the FOREX market.

I admit I have no formal or previous experience with trading. This blog is not intended as a guide of any kind. Much less as trading advice. This blog is about the development of my FOREX learning curve, from the very beginning. The notes contained here are for my personal use, but I will share them in case anybody out there wants to collaborate, make suggestions, corrections, criticism, or simply to learn alongside myself.

That said, I'll start by sharing some of the experiences I've collected so far.

I've started by becoming familiar with the basic FOREX trading concepts. I believe I have a good enough grasp on pips, leverage and the MACD to start playing around with an OANDA test account.

Recently I've run some experiments using the OANDA HTTP API, UNIX shell scripts and C programming. So far I haven't been able to turn any profit using my MACD schemes with the OANDA API. I blame a set of factors for this:

The input parameters I've used are based on only a few trading simulations I ran using the EUR_USD price data of 3.5 days (5000 1-minute samples).

My simulations do not entirely represent real trading. I must improve them.

I am using MACD exclusively. I should try combining it with other methods, such as a 1-2-3 scheme and RSI.

I am aware that there exists software which already implements MACD and others. But I've refrained from using such things, mostly because of my obsession with writing my own software, and with being in absolute control of what my software does. Still, I don't reject the idea of using pre-existing software. I think experimenting with pre-existing software may help me turn a profit sooner. It may even help me improve my own software.

To solve the issues which have prevented me from turning a profit with my OANDA test account, I have formulated the following TODO list:

Write a proper trading simulator.

Use that simulator to run a genetic algorithm which finds the optimal MACD parameters (genetic MACD tuning). These parameters include the typical 3 numeric values, and a fourth value representing the time frame for sampling closing bid price values.

Run my MACD shell script with the obtained values.

Copy this trading strategy: http://forex-strategies-revealed.com/simple/123-rsi-macd using only shell scripting and C programming, so as to understand the nuts and bolts of MACD, 1-2-3 and RSI.

Copy the same trading strategy using the suggested pre-existing software packages.
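The genetic tuning step in the list above could look roughly like the following sketch. Everything in it is an assumption made for illustration: the function name tune_macd, the population size, the mutation rate, and the use of uniform crossover with one elite survivor per generation. The fitness argument stands in for whatever the trading simulator returns (for example, a final balance), and the toy fitness in the usage example is not a market model.

```python
import random


def tune_macd(fitness, bounds, pop_size=20, generations=30,
              mutation_rate=0.2, seed=0):
    """Search integer parameter tuples that maximise `fitness`.

    `bounds` is a list of (low, high) pairs, one per parameter.
    """
    rng = random.Random(seed)

    def random_individual():
        return tuple(rng.randint(lo, hi) for lo, hi in bounds)

    def crossover(a, b):
        # Uniform crossover: each gene comes from either parent.
        return tuple(rng.choice(pair) for pair in zip(a, b))

    def mutate(ind):
        # Each gene is occasionally replaced by a fresh random value.
        return tuple(
            rng.randint(lo, hi) if rng.random() < mutation_rate else gene
            for gene, (lo, hi) in zip(ind, bounds)
        )

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(2, pop_size // 2)]
        children = [ranked[0]]  # elitism: the best individual survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b)))
        population = children
    return max(population, key=fitness)


if __name__ == "__main__":
    # Toy fitness with a known optimum, standing in for a simulator.
    target = (12, 26, 9, 5)

    def toy_fitness(ind):
        return -sum((g - t) ** 2 for g, t in zip(ind, target))

    bounds = [(2, 50), (5, 100), (2, 30), (1, 60)]
    print(tune_macd(toy_fitness, bounds))
```

Parameters tuned this way on one stretch of history are fitted to that stretch, so they would need to be checked on data the algorithm has not seen.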

Currently my source code is not ready for publishing. I will polish it and then share it on my github account. I will be reporting as I finish the elements in my TODO list.