Shifting sands

Thursday, January 7, 2016

I have been looking at OANDA's REST API, for potential
use with an FX trading system I developed.

The API has two streaming endpoints, one for prices and one
for account events such as trades opening and closing.

Asynchronous IO is always a bit fiddly, and I wanted separate
processes for incoming tick data and events. This enables them to be managed
separately, and generally makes handling disconnects or other errors a bit
cleaner.

If you try to mash your various feeds, trade
identification, trade management, logging, accounting and so on into one big
process, it gets messy and convoluted. Sometimes you don't have a choice, given the
API available, but thankfully in this case it's easy to separate everything out.

Details

I use the requests library for streaming from OANDA, and ZeroMQ endpoints with the pub/sub pattern for passing data to the client(s).

Conceptually, it works like this:

The prices feed connects to the OANDA prices endpoint and
publishes what it receives to a ZeroMQ socket. The events feed does
the same, publishing to another ZeroMQ socket.

The client connects to the respective sockets and subscribes
to its feeds, and can do whatever is required with the data it receives. You
are not limited to one client subscriber either.
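A rough sketch of that plumbing, using pyzmq's pub/sub sockets. The addresses, topic names and two-frame message layout here are my own assumptions for illustration, not necessarily what the linked code does:

```python
import json

# Hypothetical local endpoints for the two feeds.
PRICES_ADDR = "tcp://127.0.0.1:5556"
EVENTS_ADDR = "tcp://127.0.0.1:5557"

def frame(topic, payload):
    """Build a two-part pub/sub message: topic frame + raw JSON frame."""
    return [topic.encode(), payload.encode()]

def unframe(parts):
    """Inverse of frame(): return (topic, parsed JSON)."""
    topic, payload = parts
    return topic.decode(), json.loads(payload.decode())

def run_feed(lines, topic, addr):
    """Publish each raw JSON line from a stream to a ZeroMQ PUB socket."""
    import zmq  # pyzmq; imported here so the framing helpers stay stdlib-only
    sock = zmq.Context.instance().socket(zmq.PUB)
    sock.bind(addr)
    for line in lines:
        sock.send_multipart(frame(topic, line))

def run_client(addr, topic):
    """Subscribe to one feed and print whatever arrives."""
    import zmq
    sock = zmq.Context.instance().socket(zmq.SUB)
    sock.connect(addr)
    sock.setsockopt(zmq.SUBSCRIBE, topic.encode())
    while True:
        t, msg = unframe(sock.recv_multipart())
        print(t, msg)
```

Putting the topic in its own frame lets subscribers filter with a plain prefix subscription, and keeps the JSON payload untouched end to end.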

I use JSON as the serialization format, since that is how the data arrives from
OANDA. There is no real point deserializing the JSON to an object, then immediately
re-serializing it with pickle or similar for transport over ZeroMQ. Your needs may
differ, of course.

All up, a pretty painless process.

Code

The single file of code (here) contains the prices feed, the events feed, and a
client that just prints out what it receives. Run it three times, passing
either prices, events or client as a single argument. You will also need to
fill in your OANDA account details.

I use a 10 second timeout for the prices feed, and a 20
second timeout for the events feed, as per OANDA's recommendations. The feed
streams will print heartbeats but not publish them to clients.
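The heartbeat filtering might look something like this; I'm assuming here that OANDA's heartbeat lines arrive as JSON objects with a top-level "heartbeat" key, so check that against the actual stream format:

```python
import json

def is_heartbeat(raw_line):
    # Assumption: heartbeats are JSON objects with a top-level "heartbeat"
    # key, while real messages use other keys (e.g. "tick").
    return "heartbeat" in json.loads(raw_line)

def stream_lines(url, headers, timeout):
    """Yield non-heartbeat lines from a streaming HTTP response.

    Heartbeats are printed for visibility but never yielded, so they are
    never published to clients. If nothing arrives within the timeout
    (not even a heartbeat), requests raises and the caller can reconnect.
    """
    import requests  # imported here so is_heartbeat stays stdlib-only
    resp = requests.get(url, headers=headers, stream=True, timeout=timeout)
    for line in resp.iter_lines():
        if not line:
            continue
        if is_heartbeat(line):
            print("heartbeat:", line)
        else:
            yield line
```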

Sunday, December 13, 2015

The Halloween effect, aka “sell in May and go away”, is the observation that equity market returns tend to be worse over the summer in the northern hemisphere. Anyone who has followed markets for a while has probably noticed a distinct lull over the summer period.

But can we quantify this effect? Does it really exist? We can, and it does, and it's simple to show with less than 10 lines of Python.

Methods and madness

We create a two column data frame, one column with the monthly return, and the other a dummy variable that is 1 for our hold months (October – May) and 0 for our sell months (June – September).

Once we have created the dummy variable marking the events we wish to distinguish between, we run an OLS regression and look at the coefficient on our factor.

If it is “significant”, we conclude there is a material difference between a factor being present and when it is not.
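A minimal sketch of the idea, using the fact that with an intercept and a single 0/1 dummy, the OLS coefficient on the dummy reduces to the difference between the two group means. The month coding follows the October to May hold period described above:

```python
HOLD_MONTHS = {10, 11, 12, 1, 2, 3, 4, 5}  # October through May

def halloween_dummy(month):
    """1 for hold months (Oct-May), 0 for sell months (Jun-Sep)."""
    return 1 if month in HOLD_MONTHS else 0

def dummy_coefficient(returns, dummy):
    """OLS slope on a single 0/1 dummy (with intercept).

    With one dummy regressor this reduces to
    mean(returns where dummy == 1) - mean(returns where dummy == 0).
    """
    ones = [r for r, d in zip(returns, dummy) if d == 1]
    zeros = [r for r, d in zip(returns, dummy) if d == 0]
    return sum(ones) / len(ones) - sum(zeros) / len(zeros)
```

Plugging in the monthly averages reported below (+1.11% for the Halloween period, -0.28% for summer), this closed form puts the dummy coefficient at about 1.39 percentage points per month.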

If you are a commercial data scientist, you can use this same method to see if some key metric has actually changed after a marketing campaign or new release. This could be things like increasing user signups or revenue. Your dummy variable would be 0 before the campaign, and 1 afterwards.

If we can show our campaign worked, we can tell our boss how great we are and not to forget all our hard work come bonus time.

Example

As an example, let's look at SPY from 1993 onwards. First we download the data from Yahoo and create a column of monthly returns. Then we code our dummy variable as described above and run the regression.

In the regression output, x is our Halloween dummy variable, with a p-value of 0.0096. Significant at any reasonable level. Take that, EMH!

Looking at the data, the average monthly return is -0.28% for summer, and +1.11% for the Halloween period.

End notes

For the Halloween effect, rather than looking at monthly returns, we should probably look at the difference between the monthly return and the risk-free rate. There's a fairly comprehensive paper with a good historical review available here.

Also, there is a great and freely available book on working with time series data available here; the examples are in R but should be pretty easy to follow along.

Monday, May 11, 2015

I have been looking at equities a bit of late.
I am particularly interested in ranking a universe of equities for “low
frequency” manual trading on a weekly or monthly basis.

Every period I would rank each name on a
bunch of different factors, then invest in the highest ranked ones for that
month.

I was initially working in R, but the code
grew unwieldy, and I wanted a second opinion on my approach, so I took the time to
re-implement it in Python using pandas.

Setup

For each symbol in our universe, we load
the raw data and generate the information used for ranking. If we have 5 names,
we end up with 5 dataframes.

Then we combine those dataframes into one
big dataframe, and iterate through month by month, selecting the symbols that
meet our ranking criteria. From those selected, we equally weight and sum the next period returns.
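A toy version of that combine-and-iterate step. The symbols and numbers are made up, and "npr" is the next-period return column described in the code notes below:

```python
import pandas as pd

# Hypothetical per-symbol frames sharing a DatetimeIndex; concatenating
# them stacks rows, so each month holds one row per symbol.
a = pd.DataFrame({"sym": "AAA", "npr": [0.01, 0.02]},
                 index=pd.to_datetime(["2015-01-31", "2015-02-28"]))
b = pd.DataFrame({"sym": "BBB", "npr": [0.03, -0.01]},
                 index=pd.to_datetime(["2015-01-31", "2015-02-28"]))

combined = pd.concat([a, b]).sort_index()

# Iterate month by month; here the "ranking" simply keeps every symbol
# and equal-weights the next-period returns.
monthly = {str(period): rows["npr"].mean()
           for period, rows in combined.groupby(combined.index.to_period("M"))}
```

The real code replaces the trivial selection inside the loop with the moving-average filter and top-n ranking described below.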

One thing that is really cool about the
pandas DataFrame is that it allows multiple rows with the same index.

This makes it easy to get the data for the
month under consideration. We just index with the month and get the
subset of data for that month, e.g.

>>> df.loc['2015-02']
                 cpr       npr       avg   over  sym
Date
2015-02-28  0.043302 -0.062449 -0.038914  False  DBC
2015-02-28 -0.025028  0.008524  0.006130   True  IEF
2015-02-28  0.056838 -0.014239  0.005434   True  VEU
2015-02-28 -0.037434  0.017171  0.015900   True  VNQ
2015-02-28  0.055832 -0.011697  0.009236   True  VTI

[5 rows x 5 columns]

>>>

In this example there are 5 symbols, and we
see the ranking information for February 2015.

Another option would be to use hierarchical
indexing, with a sub-index for each month, but this way worked for my needs and
I think is quite clean and simple.

If anyone knows an equivalent in R that is
as clean and easy to work with for multiple time series I would love to hear about it.

Code Notes

The demo code does a simple back test of
the GTAA/Relative Strength trend following system using ETFs.

I have stripped it down to the basics so hopefully it is easy to understand. Load
the data, generate the dataframe with the info we want, make a combined data
frame, then go through month by month.

The ranking is done by filtering out names
under their 10 month moving average, then selecting the top n based on average
3 month return.
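Sketched against the columns shown earlier, where over flags names above their 10-month moving average and avg is the average 3-month return (the numbers below are the February 2015 snapshot from above, not fresh data):

```python
import pandas as pd

# One month of the combined frame, mirroring the columns printed earlier.
df = pd.DataFrame({
    "npr": [-0.062449, 0.008524, -0.014239, 0.017171, -0.011697],
    "avg": [-0.038914, 0.006130, 0.005434, 0.015900, 0.009236],
    "over": [False, True, True, True, True],
    "sym": ["DBC", "IEF", "VEU", "VNQ", "VTI"],
}, index=pd.to_datetime(["2015-02-28"] * 5))

def month_return(month, n=3):
    """Rank one month: drop names below their 10-month moving average
    (over == False), keep the top n by 3-month average return, and
    equal-weight their next-period returns."""
    eligible = month[month["over"]]
    picked = eligible.nlargest(n, "avg")
    return picked["npr"].mean()
```

For this month the filter drops DBC, and the top three by avg are VNQ, VTI and IEF, so the portfolio return is the mean of those three npr values.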

The “cpr” column is the current period
return, and the “npr” column is the next period return, which is the return
realized if we select a given security for that month.

The data is just ETF data from Yahoo, which
I have put up here. Code is here.