How to analyze hedge fund risk with sparse data

While big firms struggle with a glut of data, Larch Lane marches on with ‘little data’

By Eric Konigsberg, Director of Risk Management, Larch Lane

While the big financial services firms are worrying about how to handle big data, we’re challenged by the lack of it. Which vendor do you choose when you have limited data? We chose SAS because it gave us fast, reliable risk analysis on limited data.

At Larch Lane Advisors, we invest in early stage hedge funds: funds that are less than three years old or that have assets of less than $400 million. In addition to our fund-of-funds investments, we’ve made about 26 hedge fund seeding investments since our founding in 2001. Risk management is integrated into all aspects of our investment process, and protection of capital is our primary objective.

As the Director of Risk Management, I use SAS® to measure our market-related risks and help with portfolio construction. SAS sifts through the “little data” to give our portfolio management and research departments the information they need to make decisions about future investments.

Data comes in one end, magic happens, and answers pop out the other.

It’s efficient and works perfectly. It’s a marvel of engineering. It’s something that data scientists will talk about in hushed tones for decades, usually only after extensive initiation rituals.

Unfortunately, it’s only perfect until it suddenly isn’t. In the immortal words of Jeff Goldblum, “Life finds a way.” In the also immortal words of Murphy, “Anything that can go wrong, will.”

At some stage, things break. And without the ability to track changes through the underlying data value chain, it’s nearly impossible to determine where or why things went wrong.

A classic example involved an organisation I knew that saw their customer base shift dramatically. On Monday, they serviced mainly young professionals. By Friday, their models were telling them that they serviced octogenarians almost exclusively.

This obviously made no sense. Neither their churn nor their acquisition rates suggested that anything like this was possible. Short of discovering a hitherto unknown time machine, it was almost certainly a mistake. Needless to say, it also did little to build the analytical marketing team’s credibility.

After more than a few late nights and sour stomachs, they eventually worked out that one of their data sources had stopped updating. This issue cascaded and eventually ruined their segmentation model.

Sadly, they only got to the answer after they’d re-validated their model, reviewed all their imputations, and double-checked all their reports for accuracy. While they eventually fixed the problem, they could have avoided it entirely had they preserved lineage through their data value chain.
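A cheap guardrail would have caught this stale feed before it ever reached the segmentation model. Here is a minimal sketch of a freshness check; the threshold, function name, and the CSV file layout are all hypothetical, not from any particular vendor’s toolkit:

```python
import os
import tempfile
from datetime import datetime, timedelta, timezone
from pathlib import Path

# Hypothetical staleness threshold -- tune to how often each feed should update.
STALE_AFTER = timedelta(days=2)

def stale_sources(source_dir: str) -> list[str]:
    """Names of source files whose last update is older than STALE_AFTER."""
    now = datetime.now(timezone.utc)
    stale = []
    for path in Path(source_dir).glob("*.csv"):
        modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
        if now - modified > STALE_AFTER:
            stale.append(path.name)
    return sorted(stale)

# Demo: one fresh feed and one that silently stopped updating five days ago.
feeds = tempfile.mkdtemp()
Path(feeds, "customers.csv").write_text("id,age\n1,34\n")
Path(feeds, "transactions.csv").write_text("id,amount\n1,9.99\n")
old = (datetime.now(timezone.utc) - timedelta(days=5)).timestamp()
os.utime(Path(feeds, "transactions.csv"), (old, old))

print(stale_sources(feeds))  # → ['transactions.csv']
```

Run before the pipeline starts, a check like this turns a week of late nights into a one-line alert.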

Every analytical process goes through the same five steps. The easiest way to quickly identify and fix issues with analytical data management is to make sure every step (where possible) is isolated and preserved. These five milestones are:

Sourcing raw data.

Cleansing and imputing data.

Calculating transformations and derivations.

Calculating value-added data.

Translating data into a consumable form.

Isolating and preserving the data associated with each of these steps makes it easier for teams to maintain, optimise, and debug their day-to-day work. Not only does this reduce risk, but it also drives efficiency and, in the long run, reduces ramp-up times. While it carries additional storage costs, the non-technical benefits and maintenance efficiencies greatly outweigh the incremental investment needed to cover storage.
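The five milestones above can be sketched as a pipeline where every stage checkpoints its output before the next stage runs. This is a toy illustration, assuming a tiny inline record set; the stage names, file layout, and the imputation and derivation rules are all hypothetical:

```python
import json
from pathlib import Path

OUT = Path("stages")
OUT.mkdir(exist_ok=True)

def checkpoint(stage: str, records: list[dict]) -> list[dict]:
    """Persist a stage's output so later debugging can replay from here."""
    (OUT / f"{stage}.json").write_text(json.dumps(records, indent=2))
    return records

# 1. Source raw data (stubbed inline for the sketch).
raw = checkpoint("1_raw", [{"age": 34}, {"age": None}, {"age": 29}])

# 2. Cleanse and impute: fill missing ages with the mean of the known ones.
known = [r["age"] for r in raw if r["age"] is not None]
mean_age = sum(known) / len(known)
clean = checkpoint("2_clean", [{"age": r["age"] or mean_age} for r in raw])

# 3. Transformations and derivations: add an age band.
derived = checkpoint("3_derived",
                     [dict(r, band="under_40" if r["age"] < 40 else "40_plus")
                      for r in clean])

# 4. Value-added data: aggregate per band.
counts: dict = {}
for r in derived:
    counts[r["band"]] = counts.get(r["band"], 0) + 1
value_added = checkpoint("4_value_added", [counts])

# 5. Translate into a consumable form.
report = "\n".join(f"{band}: {n} customers" for band, n in sorted(counts.items()))
(OUT / "5_report.txt").write_text(report)
print(report)  # → under_40: 3 customers
```

When a report looks wrong, you can diff the surviving stage files and find exactly which milestone went off the rails, instead of re-validating everything from scratch.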

It’s a simple step, but it’s a lifesaver when it comes to driving incremental efficiencies. Everything eventually breaks and time spent fixing things is time taken away from creating new value-generating processes and assets.

And, taking too long to fix things is embarrassing. We’re supposed to be experts, right?