Monthly Archives: July 2011

I was looking at a book to find some algorithm to detect clusters within financial data. I managed to find a decent algorithm for that matter, but I then wanted to test it on some real data.

Hence, I had to get the data from somewhere and I chose to use Yahoo! Finance. The thing is, I wanted to do it in an automatized way in order to be able to re-use the module for further projects.

I remembered Luca Bolognese’s presentation of F# a few years back (certainly one of the best presentations I’ve ever seen) and I thought it was a good time to give it a try.

The idea is quite simple: you can import data from Yahoo! Finance by parsing the CSV files the website produces. These files are actually generated automatically, which means that you can access them by querying the right URL with the right parameters.

The following module shows how to generate the right query to get the CSV file from Yahoo! Finance:

The idea is now to write a module to download this data. This would be pretty straightforward and I wouldn’t make you waste your time by reading further code right here.

What I’m trying to do here is to download these files asynchronously. Doing this using C# is possible but requires some heavy coding and the result will most likely contain some bugs. F# (and functional programming languages in general) allows you to do it much more easily.

Share this:

Good evening everybody, I’ve been paying attention to portfolio modelling for the past few months. When you tackle such problem, you first try to think about how you could represent a portfolio as an object so that you can dive into your C#/C++/Java code so that you can start making money ASAP. However, you’ll soon find yourself cornered in numerous problems, especially when you want to backtest different allocation strategies.

The object-oriented approach

Usually, when people model a portfolio, they will see it as a mapping between assets and weights associated to a date. There is however a misinterpretation of what a portfolio is. Indeed, what the common programmer describer above in his model is in fact a snapshot of the portfolio stat at some time t. If you were to make some changes in between two dates (all the subsequent instances of the portfolio will then be erroneous and the programmer would have to recompute them all to get the right simulation. Let’s take an example. Say we have a portfolio going through time t=1,2,3. We assume the stock has two assets, Microsoft (MSFT) and Yahoo (YHOO), and that the allocation strategy is to have an equally weighted portfolio (weights={0.5,0.5}). Here’s how the implementation would look like (in C#):

Now assume you want to add another stock (STO) to the portfolio at time 2, the previous implementation needs to be extended as follows:

class History
{
public Dictionary<DateTime,Portfolio> history;
public Add(DateTime t, Portfolio p)
{
history.Add(t,p);
if (history.Any(x=>x.Key>=t))
{
/* Compare the new portfolio composition with the subsequent states
* Take the necessary operation to adjust the portfolio.
* OUCH!!!! This is complicated.
*/
}
}
}

As you can see in the comments I added, backward changes requires recomputing the portfolio at time 2 and 3. This is computationally intensive and the kind of function you really do not fancy writing. I’m not even discussing the probability that some bug will exist or the change that you would have to add if you were to make more complicated backward operations. Furthermore, assuming you want to see how the portfolio behave before a change, all the information about the previous simulation would be lost. This is because the class actually stores the stateof the portfolio, not really the portfolio itself. “Thanks for the heads-up Einstein! You got anything better to do?” Well, as a matter of fact, yes.

The functional approach

I would like to introduce a different way of representing a portfolio; a way which would be especially meaningful in a functional environment (F#, Scala). First of all, I would like to define a portfolio snapshot as a list of tuples of an Asset and a double representing each asset and its weight. To me, a portfolio is a strategy more than an object. In terms of allocation, the strategy outputs portfolio snapshots with weights, and these weights can actually generate buy/sell order to adjust a “real” portfolio (a basket of real assets) position. But the whole point here is that the portfolio is in fact just as set of operations. In my opinion, the correct way of representing an operation in a programming language is a function. In our case, an operation would be a function. This includes adding an asset to the portfolio, optimizing a portfolio and so on. The portfolio is hence a list of tuples of type DateTime*Operation (the date being the time at which the operation should occur). Let’s just define some formal definitions to these concepts (in F#):

An action is a function taking some type as an input and return a modified version of this object (actually, a new instance of the object with the modification included). A changeis an action happening at a certain time. Finally, a history is a list of changes. Simple. Now, how do we apply this to portfolio modeling? First, some more definitions:

For those of you not familiar with functional programming, this might look complicated, but if you look a bit into the language (particularly Pattern Matching), you’ll see it’s actually quite trivial. Let’s continue with the optimization of the portfolio:

Trivial. Finally, we need a function that will evaluate the portfolio. This requires the application in succession of all the changes to an initial portfolio allocation (most of the time, an empty portfolio, initially). This kind of operation is well-known in functional programming, it simply consists in foldinga list:

The first function implements the general logic: first sorting the operations and then applying them sequentially. The second function just modifies the first one by giving an initial empty portfolio. The application of this model is as follows:

When hence see how trivial it is to compute the state of the portfolio at different times. With this representation, altering a portfolio at time t=2 means actually adding a function to a list, but does not change the subsequent operations. The state of the portfolio (the snapshot) is computed on demand. Let’s try doing so:

As you can see, no rocket science to add backward operations! Note that we could have done so for any portfolio action… The model could be improved of course, but the idea is here. Keeping track of the old simulation would just mean keeping the date of creation of the change. I hope you enjoyed the ride, please feel free to comment! See you next time!