Sunday, June 07, 2009

Freerisk.org No Threat to Moody's

Wired Magazine had an article about some internet geeks (and I mean that as a compliment) who are trying to set up 'a better way of measuring corporate credit risk'. All well and fine. Their site, freerisk.org, aims to be a place where people can gather data and access models derived from that data. But a strategy needs good tactics, and I don't see this working out.

To really create a 'better corporate credit risk' model, one needs a historical, survivorship-bias-free set of financial and market data with information on 'bads', that is, a record of which companies were delisted for performance reasons, defaulted, or went bankrupt. I don't think that data is available anywhere for free, and without it you cannot develop, validate, or calibrate a model. Current financial data is, by definition, biased towards firms that have not defaulted in the past.

Further, to get relevant data one needs not merely recent financial data, but stock price information. The current project seems focused on getting SEC filings into some common format, but one also needs to address how to define robust firm IDs. Tickers, even CUSIPs, change. Stock price information, so important in calculating the Merton model and its derivatives, comes from a different set of data. Thus, there needs to be some work on creating a unique identifier across these two sources. This is a job that must be tackled top-down, and won't come from users.

Also, the financial data one needs is often of a 'relative to now' nature. You want the latest rolling 4-quarter net income (minus extraordinary items), and its change over the previous 4 quarters. This means someone has to arrange the data into lags. This is non-trivial when SEC statements are presented as single-quarter snapshots with specific dates (one does not say 'current').
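To make the 'relative to now' point concrete, here is a minimal sketch of turning dated single-quarter snapshots into a rolling 4-quarter figure and its change. The function name, the tuple layout, and the numbers are all made up for illustration. It keys on period-end dates for simplicity, whereas a real dataset would more likely key on the date each filing became public.

```python
from datetime import date

# Hypothetical quarterly snapshots: (period end, net income before extraordinary items).
# The figures are invented purely for illustration.
quarters = [
    (date(2007, 3, 31), 10.0),
    (date(2007, 6, 30), 12.0),
    (date(2007, 9, 30), 11.0),
    (date(2007, 12, 31), 13.0),
    (date(2008, 3, 31), 9.0),
    (date(2008, 6, 30), 8.0),
    (date(2008, 9, 30), 7.0),
    (date(2008, 12, 31), 6.0),
]


def trailing_four_quarters(snapshots, as_of):
    """Return (latest rolling 4-quarter sum, change versus the prior 4 quarters).

    Only snapshots dated on or before `as_of` are used, so the result is
    'relative to now' for whatever date the caller treats as now.
    """
    known = sorted((d, v) for d, v in snapshots if d <= as_of)
    if len(known) < 8:
        raise ValueError("need at least 8 quarters on or before as_of")
    values = [v for _, v in known]
    latest = sum(values[-4:])    # most recent four quarters
    prior = sum(values[-8:-4])   # the four quarters before those
    return latest, latest - prior


latest, change = trailing_four_quarters(quarters, date(2008, 12, 31))
print(latest, change)
```

The awkward part in practice is not the arithmetic but getting every firm's snapshots dated, ordered, and free of gaps before a function like this can be applied.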

Their video goes over some credit problems, such as Moody's missing Lehman, or Enron, which is true enough. A Merton model that used stock price information would have been much better, but there is a downside to the Merton model, namely, it generates a large amount of ratings volatility that investors do not like. The agency ratings aren't optimal for predicting defaults, but they create a common metric that works pretty well (I know, not optimal). The Altman Z-Score and the Piotroski method mentioned by Freerisk are really bad alternatives, hardly worth calculating ((2*net income - liabilities)/assets works as well as either).
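For readers who want to see what these two measures look like, here is a rough sketch. The simple ratio is exactly the one quoted above. The Merton piece is the textbook distance-to-default formula with asset value and asset volatility treated as given inputs, which is an assumption made for brevity: in practice those have to be backed out from the equity price and equity volatility. All function and parameter names are hypothetical.

```python
import math


def simple_ratio(net_income, liabilities, assets):
    """The back-of-envelope score from the text: (2*net income - liabilities)/assets."""
    return (2 * net_income - liabilities) / assets


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))


def distance_to_default(asset_value, debt, asset_vol, drift, horizon=1.0):
    """Textbook Merton distance to default.

    Assumes asset_value and asset_vol are already known; a full implementation
    would infer them from equity market data.
    """
    numerator = math.log(asset_value / debt) + (drift - 0.5 * asset_vol ** 2) * horizon
    return numerator / (asset_vol * math.sqrt(horizon))


def merton_default_probability(asset_value, debt, asset_vol, drift, horizon=1.0):
    """Probability that assets fall below debt at the horizon under the Merton setup."""
    return normal_cdf(-distance_to_default(asset_value, debt, asset_vol, drift, horizon))


# Made-up inputs: a firm with assets of 150 against debt of 100.
print(simple_ratio(net_income=5, liabilities=60, assets=100))
print(merton_default_probability(asset_value=150, debt=100, asset_vol=0.3, drift=0.05))
```

Because the inputs move with the stock price, the output moves with it too, which is where the ratings volatility comes from.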

Before Moody's RiskCalc(TM), several people who had the credibility and means to create an algorithm nonetheless failed to create something people were willing to pay for. S&P, Loan Pricing Corp, and Ed Altman all should have been able to create models, yet they failed. S&P failed because they use a non-transparent and non-intuitive neural net that appears ridiculously overfit. Loan Pricing Corp had access to banks and their proprietary data, but created a model that was too dumb. Ed Altman created the first risk model in 1968, and while he gets a perfunctory mention in every discussion of risk models because he was first, his model is an anachronism, and he never extended it into something that would be useful (by, say, mapping its output to default probabilities or incorporating stock price volatility).

The freerisk.org video mentions several red herrings. Copulas used in CDOs. Macro data on the Fed's FRED database. Issues in CDOs. Correlations between defaults and various sector risks. Nouriel Roubini's prescient macro forecasts (permabear is 'vindicated'). They note financial companies. These are all pretty independent of the nonfinancial corporate credit risk problem, that is, of trying to improve on the 'rating' for, say, IBM. That they emphasize these issues suggests they really have no understanding of what is important, relevant, or feasible in the nonfinancial corporate credit risk objective (a model relevant to AIG is not relevant to IBM). Macro forecasts, conditional correlations, and asset-backed securities are all very parochial problems, almost independent of each other. A good solution to one is not very similar to a good solution to another.

They also mention they will allow real-time correlations between scores and default rates. By this, I presume they will look at the small number of current defaults. This would induce a horrific backtest bias, because if you know what defaulted recently, you can adjust an algorithm to do very well over the past 12 months. You really need a longer dataset going back 10 years on defaults, and should be very wary of anything that merely shows it did well over the past quarter.

Credit risk calculation is an eminently feasible problem with a 'flat maximum', which is why the Altman model 'works' (so, too, does Net Income/Assets). A near-optimal measure of risk is not too difficult (I present data over at defprob for free!). Nevertheless, most people screw it up, because they don't collect a good dataset for construction and validation, they try to be too fancy, they are too rule-based (expert rule systems with many if-then clauses that create a knife-edged kluge), or they don't calibrate the output into default probabilities.
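As an illustration of that last step, here is one simple way to calibrate a raw score into default probabilities: sort historical observations by score, cut them into buckets, and read off the realized default rate in each. This is only a sketch of the general idea, not the method behind any particular commercial model. The function name and the tiny dataset are invented, and it assumes a lower score means a riskier firm.

```python
def calibrate_by_bucket(scores, defaulted, n_buckets):
    """Map score ranges to observed default rates.

    scores:    historical model scores (assumed: lower means riskier)
    defaulted: 1 if the firm later defaulted, else 0, aligned with scores
    Returns a list of (min_score, max_score, default_rate), one per bucket.
    """
    pairs = sorted(zip(scores, defaulted))
    size = len(pairs) // n_buckets
    if size == 0:
        raise ValueError("more buckets than observations")
    table = []
    for b in range(n_buckets):
        start = b * size
        # The last bucket absorbs any leftover observations.
        end = (b + 1) * size if b < n_buckets - 1 else len(pairs)
        chunk = pairs[start:end]
        rate = sum(flag for _, flag in chunk) / len(chunk)
        table.append((chunk[0][0], chunk[-1][0], rate))
    return table


# Invented toy history: six firms, three of which defaulted.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
toy_defaults = [1, 1, 0, 1, 0, 0]
for low, high, rate in calibrate_by_bucket(toy_scores, toy_defaults, 3):
    print(low, high, rate)
```

With six observations the rates are meaningless, of course, which is the whole point about needing a long, survivorship-bias-free history of defaults before a table like this says anything.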

5 comments:

When examining all listed non-financial companies, I had to use much more than the ratings, and performance, of Moody's default database (which, by the way, is proprietary). Even if I did use Moody's, 10 years of data *OR* Baa bonds are not enough to draw any conclusions about default rates in general. Baa is investment grade, very unlike most companies.