A recent MarketWatch piece cited a talk in Hong Kong by Economics Nobel Prize winner Professor Robert Merton in which he discussed the challenges of evaluating investment managers. The following article assumes that the MarketWatch summary of Professor Merton’s talk is accurate. The piece, and presumably the talk, argued that, given typical nominal portfolio returns and volatilities, it takes impractically long to detect evidence of investment skill. The argument claimed to prove that all manager selection is futile. Instead, it proved only that naïve nominal performance metrics are of little use.

In this article, we will illustrate the difference between a naïve attempt to detect evidence of investment skill using nominal returns and a more productive effort relying on alphas (residual, security selection, stock picking returns) isolated using a capable modern multi-factor equity risk model. Whereas the former approach is futile at best, the latter approach is successful. In fact, rather than taking decades, a capable modern system can identify skill with high confidence in months.

Detecting Evidence of Investment Skill Using Nominal Returns

Consider nominal returns of a Portfolio and a Benchmark. The Portfolio is a live long-only fund implementing a Smart Beta active investment strategy:

With a heroic assumption that log returns follow a normal distribution, a t-test appears to confirm Professor Merton’s argument. Even with over six years of data, the returns are too noisy for a statistical inference:

Detecting Evidence of Investment Skill Using Alphas/Residuals

By comparison, consider the same Portfolio’s residual returns, or alphas, for the same period, isolated with the AlphaBetaWorks’ standard Long-Horizon Statistical U.S. Equity Risk Model. These are also the returns Portfolio would have generated if its factor exposures had been fully hedged (its returns factor-neutralized, or residualized) using the Model:

Portfolio’s Cumulative Residual/Alpha

With an equally questionable assumption that log residuals follow a normal distribution, a t-test is now highly statistically significant:

Whereas Professor Merton’s argument does indeed apply to nominal returns, it does not apply to their residuals. A critical difference is the lower dispersion of residual returns. Over 90% of the variance of a typical active equity portfolio is due to factor exposures rather than to stock picking. Therefore, using nominal returns to measure skill is like trying to take a baby’s temperature by examining her bath water, rather than the baby herself.
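The effect of lower dispersion on detection time can be made concrete with a back-of-the-envelope calculation. The sketch below assumes i.i.d. normal annual returns, so that the t-statistic of a mean excess return α with volatility σ observed over T years is α√T/σ. The 4% alpha, the 16% and 4% volatilities, and the function name are illustrative assumptions, not figures for the Portfolio discussed here.

```python
def years_to_significance(annual_alpha, annual_vol, t_target=2.0):
    """Years of data needed for the mean return's t-statistic to reach t_target.

    Assumes i.i.d. normal returns, so t = alpha * sqrt(T) / vol and
    therefore T = (t_target * vol / alpha) ** 2.
    """
    if annual_alpha <= 0:
        raise ValueError("annual_alpha must be positive")
    return (t_target * annual_vol / annual_alpha) ** 2


# Hypothetical inputs: the same 4% annual alpha measured against two noise levels.
nominal_years = years_to_significance(0.04, 0.16)   # noisy nominal returns
residual_years = years_to_significance(0.04, 0.04)  # factor-hedged residuals

print(f"nominal returns:  {nominal_years:.0f} years")
print(f"residual returns: {residual_years:.0f} years")
```

Because the required history scales with the square of volatility, cutting the noise by a factor of four cuts the required history by a factor of sixteen under these assumptions.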

Whereas at least 67 out of 100 monkeys picking stocks at random are expected to outperform the Portfolio, fewer than 1 out of 1,000 is expected to generate higher residuals – a highly statistically significant result. Thus, with the help of a capable equity risk model, strong evidence of skill can be identified in months rather than in decades.

Converting Residuals into Nominal Outperformance

Assuming the equity risk model uses investable factors, as AlphaBetaWorks’ models do, the residual return stream above is investable. In fact, in the idealized case of costless leverage, positive residual returns can be turned into outperformance relative to any benchmark. Below is the performance of Portfolio after it is hedged to match the factor exposures of the Benchmark. The evidence of skill is now plainly visible in the naïve absolute and relative nominal return metrics:

Cumulative Returns for the Portfolio Hedged to Match the Benchmark and the Benchmark

In earlier articles, we showed that the returns of most U.S. and international smart beta ETFs are primarily due to their dumb beta (dumb factor) exposures. Thus, smart beta ETFs turn out to actively time dumb beta factors. In fact, most smart beta strategies are primarily different approaches to sector rotation. After readers’ requests to extend the above analysis to relative returns and tracking error, we showed that the findings hold for U.S. smart beta tracking error. This article extends the analysis to international smart beta tracking error.

The results hold for international smart beta tracking error: Though some international smart beta ETFs do provide valuable exposures to idiosyncratic factors, most primarily re-shuffle basic dumb Region and Sector Factors. In fact, dumb beta exposures are even more influential for international smart beta equity ETFs than for their U.S. peers. Most international smart beta equity ETFs are mainly region and sector rotation strategies in disguise. Consequently, investors and allocators must guard against fancy re-packaging of dumb international risk factors as smart beta and should perform rigorous region and sector factor analysis of their smart beta allocations. Further, many international smart beta strategies can be substantially replicated and blended using simple passive Region and Sector Factors.

Measuring the Influence of Dumb Beta Factors on International Smart Beta ETFs

We used the same dataset of International Smart Beta Equity ETFs as our earlier article on international smart beta ETFs’ dumb factor exposures. We estimated monthly positions of each ETF and then used these positions to calculate portfolio factor exposures to traditional factors such as Regions and Sectors. These ex-ante factor exposures at the end of month m1 were used to predict returns during the subsequent month m2. We then calculated predicted and actual returns relative to the Global Equity Market (defined as the iShares MSCI ACWI Index Fund – ACWI). The correlation between actual and predicted relative returns quantified the influence of dumb beta factors on smart beta tracking error. The higher the correlation, the more similar a smart beta ETF is to a portfolio of traditional, simple, and dumb systematic risk factors.
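The procedure above can be sketched in a few lines. This is a minimal illustration, assuming a linear factor model in which the predicted relative return is the sum of exposure differences between the ETF and the benchmark, each multiplied by the following month's factor return. The region names, exposures, returns, and function names are hypothetical, and the robust estimation used in the actual study is not reproduced here.

```python
import math


def predicted_relative_return(etf_exposures, benchmark_exposures, factor_returns):
    """Relative return implied by exposure differences and one month of factor returns."""
    factors = set(etf_exposures) | set(benchmark_exposures)
    return sum(
        (etf_exposures.get(f, 0.0) - benchmark_exposures.get(f, 0.0))
        * factor_returns.get(f, 0.0)
        for f in factors
    )


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: exposures known at the end of m1, returns realized during m2.
benchmark = {"North America": 0.6, "Europe": 0.4}
history = [
    # (ETF exposures at end of m1, factor returns in m2, actual relative return in m2)
    ({"North America": 0.3, "Europe": 0.7}, {"North America": 0.02, "Europe": -0.01}, -0.008),
    ({"North America": 0.4, "Europe": 0.6}, {"North America": -0.01, "Europe": 0.03}, 0.006),
    ({"North America": 0.5, "Europe": 0.5}, {"North America": 0.01, "Europe": 0.02}, 0.002),
]

predicted = [predicted_relative_return(e, benchmark, f) for e, f, _ in history]
actual = [a for _, _, a in history]
print(f"correlation of predicted and actual relative returns: {pearson(predicted, actual):.2f}")
```

Only information available at the end of m1 enters the prediction for m2, which keeps the test out-of-sample.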

The Influence of Region Beta on International Smart Beta ETFs

The simplest systematic risk factor to describe each security is its Region (Region Market Beta). Region Beta measures exposure to one of 10 broad regional equity markets (e.g., North America, Developing Asia). These are the most basic and cheap passive international factors. Even this single-factor model estimated with robust statistical techniques delivered 0.67 mean and 0.68 median correlations between predicted and actual relative returns:

For a quarter of smart beta ETFs, Region Factor alone replicates 61% (0.7805²) of tracking error. For this group of strategies, international smart beta tracking error is largely due to region rotation.

The Influence of Region and Sector Betas on International Smart Beta ETFs

For most international smart beta ETFs, Region and Sector Factors replicate 59% (0.7673²) of tracking error. Thus, for most international smart beta equity ETFs, over half of tracking error is due to region and sector rotation.
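The percentages quoted alongside the correlations follow from squaring them: for a linear predictor, the squared correlation is the share of variance explained and the remainder is unexplained. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the input is the correlation quoted above.

```python
def variance_split(correlation):
    """Split relative variance into fractions explained and unexplained by the predictor."""
    if not -1.0 <= correlation <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    explained = correlation ** 2
    return explained, 1.0 - explained


# Correlation between predicted and actual relative returns quoted in the text.
explained, unexplained = variance_split(0.7673)
print(f"explained by Region and Sector Factors: {explained:.0%}")
print(f"unexplained: {unexplained:.0%}")
```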

International Smart Beta Variance and International Dumb Beta Variance

Rather than measure correlations between relative returns of replicating dumb beta portfolios and ETFs, we can instead measure the fractions of their (relative) variances unexplained by dumb beta exposures. The Dumb Beta Variance (in red below) is the distribution of ETFs’ relative variances due to their dumb Region and Sector Factor exposures. The Smart Beta Variance (in blue below) is the distribution of ETFs’ relative variances unrelated to their dumb beta exposures:

International Equity Smart Beta ETFs: Percentage of tracking error explained and unexplained by the Region and Sector dumb beta exposures

For a quarter of strategies, over 2/3 of international smart beta tracking error is due to region and sector rotation, and for some, it reaches 90%.

Our analysis focused on the most basic Region and Sector Factors and excluded Value/Growth and Size Factors, which are decades old and also considered dumb beta by some. If one expands the list of dumb beta factors, smart beta variance shrinks further.

Conclusions

Traditional, or dumb, Region and Sector Betas account for the majority of international smart beta tracking error.

Smart beta effects, unexplained by the traditional Region and Sector Betas, account for 41% of variance or less for most international smart beta ETFs.

With proper analytics, investors and allocators can identify products that do provide unique international smart beta exposures and can guard against fancy re-packaging of dumb international beta.

Investors and allocators can monitor the majority of international smart beta ETF relative risk by focusing on their Region and Sector Factor exposures.

Most international smart beta strategies can be combined using Region and Sector Factor portfolios.

Our earlier articles discussed how some smart beta strategies turn out to be merely high beta strategies and how others actively time the market. We also showed that, for the majority of smart beta ETFs, returns are mostly attributable to the traditional dumb Market and Sector Factors. Consequently, the absolute performance of most smart beta strategies can be substantially captured by sector rotation. We received questions about these studies’ focus on absolute performance: The attribution of absolute performance to the Market and Sector factors tells little about tracking error and relative variance. Perhaps smart beta volatility is attributable to dumb factors, but smart beta tracking error is not?

This article addresses the above criticism and analyzes smart beta tracking error rather than (absolute) volatility. The results hold: Though some smart beta ETFs do provide valuable exposures to idiosyncratic factors, most primarily re-shuffle basic dumb factors. Whether one considers their absolute or relative performance, most smart beta equity ETFs are largely sector rotation strategies in disguise. Consequently, investors and allocators must guard against elaborate re-packaging of dumb factors as smart beta and perform rigorous sector and industry analysis of their smart beta allocations. Further, dozens of smart beta strategies can be substantially replicated and blended using simple sector factor portfolios.

Measuring the Influence of Dumb Beta Factors on Smart Beta ETFs

We used the same U.S. Smart Beta ETF dataset as our earlier study of smart beta ETFs’ dumb factor exposures. For each ETF, we estimated monthly positions and then used these positions to calculate portfolio factor exposures for traditional (dumb beta) factors such as Market and Sectors. The ex-ante factor exposures at the end of each month were used to predict the following month’s returns. The correlation between actual and predicted returns relative to the U.S. Equity Market (defined as the iShares Russell 3000 ETF – IWV) quantified the influence of dumb beta factors on smart beta tracking error. The higher the correlation, the more similar a smart beta ETF is to a portfolio of traditional, simple, and dumb systematic risk factors.

Put differently: For most broad U.S. equity smart beta ETFs, U.S. Market and Sector Betas alone account for approximately half of tracking error and relative variance.

The Influence of all Dumb Factor Betas on Smart Beta ETFs

For the final tests, we added additional dumb factors such as Bonds, Value, and Size. All dumb factor betas delivered 0.74 mean and 0.78 median correlations between predicted and actual relative returns of smart beta ETFs:

Thus, for most broad U.S. equity smart beta ETFs, dumb beta factors account for the majority of tracking error.

Smart Beta Tracking Error and Dumb Beta Tracking Error

Rather than measure correlations between relative returns predicted by dumb beta exposures and actual relative returns, we can instead measure the fraction of relative variance unexplained by dumb beta exposures. This value (in blue below) is the fraction of smart beta tracking error that is unrelated to dumb beta factors:

Our earlier work showed that simple performance metrics, such as nominal returns and Sharpe Ratios, revert. Because of this reversion, above-average past performers tend to become below-average and vice versa. This reversion is primarily due to systematic (factor) noise. Consequently, metrics that remove factor effects from performance reveal persistent stock picking skill. Prompted by readers’ questions, we have investigated the predictive power of popular performance metrics. This article reviews the predictive power of information ratios. They offer a large improvement over simple nominal returns, naive alphas, and Sharpe ratios, but still fall short of the most predictive metrics. Over a 3-year window, the predictive power of information ratios for skill evaluation and manager selection is approximately half that of security selection distilled with a statistical equity risk model.

Measuring the Predictive Power of Information Ratios

We analyze portfolios of all institutions that have filed Forms 13F in the past 15 years. This survivorship-free portfolio dataset covers firms that have held at least $100 million in long U.S. assets. Approximately 5,000 portfolios had sufficiently long histories and low turnover to be analyzable.

To measure the persistence of performance metrics over time, we compare metrics measured in two 12-month periods separated by variable delay. One example is the 24-month delay that separates metrics for 1/31/2010-1/31/2011 and 1/31/2013-1/31/2014. A 24-month delay of 12-month metrics thus covers a 48-month time window. We use Spearman’s rank correlation coefficient to calculate statistically robust correlations.
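The test design above can be sketched as follows. This is a minimal standard-library illustration of Spearman’s rank correlation (Pearson correlation of ranks, with ties given their average rank) applied to a metric measured in two windows. The five metric values per window are hypothetical, as are the function names.

```python
import math


def window_span(metric_months, delay_months):
    """Total months covered by two metric windows separated by a delay."""
    return 2 * metric_months + delay_months


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical 12-month metrics for five portfolios in the earlier and later window.
early = [0.10, 0.02, -0.05, 0.07, 0.01]
late = [0.04, 0.03, -0.02, 0.08, -0.01]

print(f"months covered with a 24-month delay: {window_span(12, 24)}")
print(f"rank correlation between windows: {spearman(early, late):.2f}")
```

Working on ranks rather than raw values keeps a single extreme portfolio from dominating the estimate, which is the sense in which the rank correlation is robust.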

Serial Correlation of Information Ratios

The information ratio is similar to the Sharpe ratio, but with a key upgrade: Sharpe ratio evaluates returns relative to the risk-free rate. Information ratio evaluates returns relative to a (presumably appropriate) benchmark. We use the S&P 500 Index as the benchmark, following a common practice. As a benchmark increasingly matches the factor exposures of a portfolio, information ratios converge to the standard score (z-score) of active returns estimated with a capable equity risk model. Due to the more effective handling of systematic risk, the predictive power of information ratios receives a boost.
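The difference between the two ratios is only the reference series that is subtracted before dividing mean by standard deviation. The sketch below assumes monthly arithmetic returns, the sample standard deviation, and annualization by the square root of 12; these conventions, the function names, and the sample returns are illustrative assumptions rather than the exact calculation used in the study.

```python
import math
import statistics


def _annualized_ratio(differences, periods_per_year):
    """Mean over sample standard deviation, scaled by sqrt(periods per year)."""
    stdev = statistics.stdev(differences)
    if stdev == 0:
        raise ValueError("ratio is undefined when the differences are constant")
    return statistics.fmean(differences) / stdev * math.sqrt(periods_per_year)


def sharpe_ratio(returns, risk_free_returns, periods_per_year=12):
    """Returns in excess of the risk-free rate, per unit of their volatility."""
    excess = [r - rf for r, rf in zip(returns, risk_free_returns)]
    return _annualized_ratio(excess, periods_per_year)


def information_ratio(returns, benchmark_returns, periods_per_year=12):
    """Returns in excess of a benchmark, per unit of tracking error."""
    active = [r - b for r, b in zip(returns, benchmark_returns)]
    return _annualized_ratio(active, periods_per_year)


# Hypothetical monthly returns.
portfolio = [0.02, 0.01, 0.03, 0.00]
risk_free = [0.0, 0.0, 0.0, 0.0]
benchmark = [0.01, 0.01, 0.01, 0.01]

print(f"Sharpe ratio:      {sharpe_ratio(portfolio, risk_free):.2f}")
print(f"Information ratio: {information_ratio(portfolio, benchmark):.2f}")
```

The closer the benchmark tracks the portfolio's factor exposures, the more of the systematic noise cancels in the subtraction, which is why the choice of benchmark matters so much for this metric.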

The chart below shows correlation between 12-month Information Ratios calculated with lags of one to sixty months (1-60 month delay):

13F Equity Portfolios: Serial correlation of Information Ratios

Delay (months)    Serial Correlation
1                  0.06
6                  0.05
12                 0.03
18                 0.05
24                 0.06
30                 0.06
36                 0.02
42                -0.02
48                -0.06
54                -0.04
60                 0.02

Over the 3-year window, the serial correlation (autocorrelation) of Information Ratios is approximately half of the serial correlation of security selection returns provided in the following section. Unlike simple nominal returns and Sharpe ratios, information ratios do not suffer from short-term reversion.

Serial Correlation of Nominal Returns

For comparison, the following chart shows serial correlation of 12-month cumulative nominal returns calculated with 1-60 month lags. As we discussed in prior articles, these revert with an approximately 18-month cycle – so strong past nominal returns are actually predictive of poor short-term future nominal returns:

13F Equity Portfolios: Serial correlation of nominal returns

Delay (months)    Serial Correlation
1                 -0.14
6                 -0.24
12                -0.33
18                -0.06
24                 0.16
30                 0.23
36                 0.08
42                -0.26
48                -0.44
54                -0.23
60                 0.13

Serial Correlation of Security Selection Returns

As we mentioned above, when a benchmark’s factor exposures match those of the portfolio, information ratio is equivalent to the standard score (z-score) of active returns estimated with a capable equity risk model. In practice, however, information ratio is typically calculated relative to a broad benchmark, such as the S&P 500 Index for equity portfolios. Consequently, one would expect the predictive power of information ratios to be lower than the predictive power of security selection returns, properly estimated. For comparison, we provide serial correlation of a security selection metric that uses an equity risk model to control for factor exposures.

To eliminate the disruptive factor effects responsible for performance reversion, the AlphaBetaWorks Performance Analytics Platform calculates each portfolio’s return from security selection net of factor effects. αReturn is the return a portfolio would have generated if all factor returns had been zero. The following chart shows correlation between 12-month cumulative αReturns calculated with 1-60 month lags:

The predictive power of αReturns, as measured by their serial correlation of 12-month performance metrics, is approximately twice that of information ratios over a 3-year window (12-month delay between 12-month performance metrics), but the two begin to converge after three years.

For all performance metrics, the above data is aggregate, spanning thousands of portfolios and return windows. Individual firms can overcome the averages; however, the exceptions require especially careful monitoring.

Summary

The predictive power of information ratios is significantly higher than that of nominal returns and Sharpe ratios.

As a benchmark converges to the factor exposures of a portfolio, information ratios converge to the standard score (z-score) of active returns estimated with a capable risk model.

Over a 3-year window, the predictive power of information ratios, as commonly calculated, is approximately half that of the security selection return calculated with a predictive equity risk model.

Measuring the Predictive Power of Sharpe Ratios

We analyze portfolios of all institutions that have filed Forms 13F in the past 15 years. This survivorship-free portfolio dataset covers firms that have held at least $100 million in long U.S. assets. Approximately 5,000 portfolios had sufficiently long histories, low turnover, and broad holdings to be analyzable.

To measure the decay of performance metrics over time, we compare metrics measured in two 12-month periods separated by variable delay. One example is the 24-month delay that separates metrics for 1/31/2010-1/31/2011 and 1/31/2013-1/31/2014. We use Spearman’s rank correlation coefficient to calculate statistically robust correlations.

Serial Correlation of Sharpe Ratios

Sharpe Ratio is perhaps the most common performance metric. Since it does not directly control for systematic (factor) portfolio exposures, one would expect it to suffer from reversion similar to that of nominal returns. Indeed, tests reveal that Sharpe Ratios fail to isolate security selection performance: Sharpe Ratios of portfolios revert when factor regimes change. Thus, former leaders tend to become laggards, and former laggards tend to become leaders.

The serial correlation (autocorrelation) of Sharpe Ratios is similar to the serial correlation of nominal returns in the next section. The following chart shows correlation between 12-month Sharpe Ratios calculated with lags of one to sixty months (1-60 month lag):

13F Equity Portfolios: Serial correlation of Sharpe Ratios

Delay (months)    Serial Correlation
1                 -0.09
6                 -0.20
12                -0.28
18                -0.06
24                 0.12
30                 0.15
36                -0.08
42                -0.40
48                -0.49
54                -0.24
60                 0.12

Sharpe Ratios revert with an approximately 18-month cycle. Historical Sharpe Ratios thus have some predictive value, but a negative one. There is a narrow window at 2-3 year lag when past Sharpe Ratios are positively predictive of the future Sharpe Ratios. This is due to the approximately 18-month cycle of reversion.

Serial Correlation of Nominal Returns

For comparison, the following chart shows serial correlations between 12-month cumulative nominal returns calculated with 1-60 month lags. The relationship is similar to that of the Sharpe Ratios. Strong past (nominal) returns are predictive of poor short-term future returns:

13F Equity Portfolios: Serial correlation of nominal returns

Delay (months)    Serial Correlation
1                 -0.14
6                 -0.24
12                -0.33
18                -0.06
24                 0.16
30                 0.23
36                 0.08
42                -0.26
48                -0.44
54                -0.23
60                 0.13

Serial Correlation of Security Selection Returns

For additional comparison, we provide serial correlation of a security selection metric that adjusts for factor exposures. To eliminate the disruptive factor effects responsible for performance reversion, the AlphaBetaWorks Performance Analytics Platform calculates each portfolio’s return from security selection net of factor effects. αReturn is the return a portfolio would have generated if all factor returns had been flat. Firms with above-average αReturns in one period are likely to maintain them, though with a decay. The following chart shows correlation between 12-month cumulative αReturns calculated with 1-60 month lags:

Though the above serial correlations of αReturn may appear low, they are amplified and compounded in practical portfolios of multiple funds. A hedged portfolio of the net consensus longs (relative overweights) of the top 5% long U.S. equity stock pickers delivered approximately 8% return independently of the market. The above data is aggregate. Specific outstanding disciplined firms can overcome performance reversion, but they are the exceptions that require careful monitoring.

Summary

Sharpe Ratios revert rapidly and are not significantly better predictors of future performance than nominal returns.

Once performance is controlled for systematic (factor) exposures, security selection returns persist for approximately 5 years.

Selection of superior future performers is possible, but it requires abandoning popular non-predictive metrics and spotting skill long before it is plainly visible and arbitraged away.

Berkshire’s Recent Exposure and Performance Raise Questions

There is a good reason many investors consider Berkshire Hathaway the paragon of investment success. For over 20 years, Berkshire’s equity portfolio has outperformed the general market on a risk-adjusted basis. But, in the past two years Berkshire’s alpha has turned negative. We identify the principal culprit and provide evidence of style drift at Berkshire – either by Warren Buffett or one of his possible successors.

Berkshire Hathaway Security Selection Return

Below is a chart showing Berkshire’s long equity portfolio performance since 2006, during which time it gained 108%. Many investors and guru-followers will dig no further. But simplistic analysis produces dangerous conclusions. Much of Berkshire’s performance—and that of most stocks—can be attributed to systematic sources, or factors. Berkshire’s nominal performance is influenced by the Market, bond rates, and FX rates. Since factor returns revert, and nominal returns are a contrarian indicator of future performance, it is important to separate Berkshire’s factor performance from its stock-picking skill.

In the chart above, the black line represents Berkshire’s long equity portfolio total return. Within this, the gray line is performance due to factors, and the blue area reflects Berkshire’s positive returns from stock selection, or αReturn. Since “alpha” and security selection performance are widely and often inconsistently used, AlphaBetaWorks defines a rigorous metric of security selection performance as αReturn, which is performance net of all factor effects, or the return a portfolio would have generated if markets were flat.
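The idea of performance net of factor effects can be illustrated with a simplified sketch. It assumes a linear factor model in which each period's residual is the portfolio return minus the sum of exposure times factor return, and it compounds those residuals. This is an illustration of the concept only, not the AlphaBetaWorks αReturn calculation; the function name, factor names, and all numbers are hypothetical.

```python
def residual_cumulative_return(portfolio_returns, exposures, factor_returns):
    """Compound per-period residuals: portfolio return minus factor contribution.

    exposures[t] and factor_returns[t] are dicts keyed by factor name for period t.
    """
    cumulative = 1.0
    for r, period_exposures, period_factor_returns in zip(
        portfolio_returns, exposures, factor_returns
    ):
        factor_part = sum(
            exposure * period_factor_returns.get(factor, 0.0)
            for factor, exposure in period_exposures.items()
        )
        cumulative *= 1.0 + (r - factor_part)
    return cumulative - 1.0


# Hypothetical two-period example.
portfolio_returns = [0.05, -0.02]
exposures = [
    {"Market": 1.0, "Technology": 0.2},
    {"Market": 0.9},
]
factor_returns = [
    {"Market": 0.03, "Technology": 0.05},
    {"Market": -0.01, "Technology": 0.02},
]

total = (1 + portfolio_returns[0]) * (1 + portfolio_returns[1]) - 1
residual = residual_cumulative_return(portfolio_returns, exposures, factor_returns)
print(f"total return:    {total:.2%}")
print(f"residual return: {residual:.2%}")
```

In this toy example the total return is positive while the compounded residual is slightly negative, which is the kind of divergence that nominal performance alone would hide.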

From 2006 to 2013 Berkshire’s return from stock selection alone was +29.9%. But in 2014 Berkshire’s αReturn started to decline (thinning blue area). Between 2014 and 2015, Berkshire’s return from stock selection was -16.3%.

Style Drift, Succession Issues?

Warren Buffett has long been very public about his avoidance of technology investments, citing reluctance to allocate capital to a business he cannot understand. So why was a manager with no record in, and known skepticism towards, the technology industry making large bets in it? In this light, the above data indicates worrying style drift. Perhaps this large technology sector bet was made by one of his likely successors, which raises an equally important question about manager succession and the style drift risk therein: does a new portfolio manager demonstrate security selection skill (positive αReturn) in general, and in the areas of growing allocation like technology? And what other changes should investors expect of new leadership?

Whether Berkshire’s recent negative αReturn is Buffett’s own error of commission or a byproduct of portfolio manager transition is beyond the scope of this piece. Our focus is identifying active return, its sources (factor timing, stock selection), and their predictive value. Understanding each of these leads to improved performance. In Berkshire’s case, our approach has identified that an established star manager shifted focus to an area previously avoided (technology), and that this shift reduced stock selection performance. Whatever the reason, the results are troublesome, and early warnings of this kind are critical to investors’ and managers’ performance.

Such analysis of risk and performance using holdings data and a predictive equity risk model provides indicators that enhance manager/fund selection and future performance. It can also highlight emerging issues, such as style drift.

Our earlier piece tested several equity market hedging techniques on U.S. equity mutual fund portfolios. We now extend the tests to U.S. hedge fund long equity portfolios. Since these are generally less diversified and more active than mutual funds, simplistic approaches that use a fixed 100% short (1 beta) or rely on returns-based style analysis (RBSA) fail even more dramatically for hedge funds. Yet, a robust statistical equity risk model applied to portfolio holdings remains close to the ideal of perfect hedging. A robust and well-tested technique is thus even more vital for managing hedge fund exposures.

Equity Market Hedging Techniques

We analyze approximately 600 hedge fund long U.S. equity portfolios that are tractable from regulatory filings. Note that roughly half of U.S. hedge fund portfolios are impossible to analyze accurately due to the quarterly data frequency and high turnover. Similarly to our earlier analysis of mutual fund portfolio hedging, we evaluate three approaches to calculating market hedge ratios:

Constant 100% market exposure (1 beta): This common ad-hoc approach used by portfolio managers and analytics vendors supposes that all portfolios have the same risk as a benchmark or a hedge.

Our study spans 10 years. We calculate hedge ratios at the end of each month and use these to hedge portfolios during the following month. This produces a series of 10-year realized (ex-post) hedged portfolio returns. We further break these series into 12-month intervals and calculate their correlations to the Market. Low average market correlation and low dispersion of correlations indicate that a hedging technique effectively eliminates systematic market exposure of a typical portfolio.
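The mechanics of this test can be sketched briefly. The code below assumes a single market hedge, applies the hedge ratio estimated at the end of month t to month t + 1, and computes market correlations over 12-month windows. Whether the windows overlap is not specified above, so the `step` parameter, like the function names and sample returns, is an assumption of this sketch.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def hedged_returns(portfolio_returns, market_returns, hedge_ratios):
    """hedge_ratios[t] is estimated at the end of month t and applied in month t + 1."""
    return [
        portfolio_returns[t + 1] - hedge_ratios[t] * market_returns[t + 1]
        for t in range(len(portfolio_returns) - 1)
    ]


def window_market_correlations(hedged, market_returns, window=12, step=12):
    """Market correlation of hedged returns in successive windows.

    hedged[i] is realized in the same month as market_returns[i + 1].
    """
    aligned_market = market_returns[1:]
    return [
        pearson(hedged[s:s + window], aligned_market[s:s + window])
        for s in range(0, len(hedged) - window + 1, step)
    ]


# Hypothetical 13 months of market returns and a portfolio with a true beta of 0.5.
market = [0.01, 0.02, -0.01, 0.03, -0.02, 0.01, 0.00, 0.02, -0.03, 0.01, 0.02, -0.01, 0.015]
low_beta_portfolio = [0.5 * m for m in market]

# A fixed 100% hedge (1 beta) over-hedges this portfolio.
fixed_hedge = hedged_returns(low_beta_portfolio, market, [1.0] * 12)
print(window_market_correlations(fixed_hedge, market))
```

In this noise-free toy case the over-hedged portfolio is left net short the market, so its realized market correlation is -1; real portfolios carry residual noise, which pulls such values toward but not onto -1.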

Realized Market Correlations of Hedged Hedge Fund Portfolios

Realized Market Correlations of Random Return Series

A large return dataset, even when perfectly random, will contain some subsets with high market correlations. To control for this, we generate random return samples (observations) and calculate their market correlations. These results, attainable only with a perfect hedge, are the standard against which we evaluate equity market hedging techniques:

A 100% hedge is too small for high-risk portfolios and too large for low-risk ones. As was also seen for mutual funds, some hedged low-exposure portfolios formed a fat tail of realized market correlations near -1.

Most RBSA assumes that portfolio factor exposures are constant over the regression window. Some advanced techniques may allow for random variation in exposures over the window, yet even this relaxed assumption is flawed. Our earlier posts covered the problems that arise when RBSA fails to detect rapid changes in portfolio risk. It turns out that the months or years of delay before RBSA captures changes in factor exposures are especially damaging when analyzing hedge fund portfolios:

RBSA fails more severely for hedge funds than for mutual funds. In fact, for some low-exposure portfolios RBSA has a defect similar to that of a fixed 100% hedge and produces a fat tail of market correlations near -1. Hedge funds’ long equity portfolios can and do cut risk rapidly, so RBSA’s failure to detect these rapid exposure reductions is expected.

The edge comes from the analysis of individual positions that responds rapidly to portfolio changes and the robust regression methods that are resilient to outliers. The result is superior analysis of individual funds.

Whereas tests using hedge fund long equity portfolios accentuate the flaws of simple hedging and returns-based analysis, the AlphaBetaWorks Statistical Equity Risk Model remains close to the baseline of a perfect hedge. Thus, it is even more vital that portfolio managers and investors who analyze or manage hedge fund equity risk rely on robust models and thoroughly tested methods.

Summary

Random portfolio returns that would be produced by a perfect hedge are the standard to which equity market hedging techniques can be compared.

Simplistic hedging that assumes 1 beta for all hedge fund long equity portfolios over-hedges some and under-hedges others, resulting in hedged portfolios with net short and net long realized exposures, respectively.

Equity market hedging techniques can be complex and their effectiveness hard to assess. In this piece we evaluate the effectiveness of several market hedging techniques by comparing them to the (idealized and unattainable) perfect market hedge. Specifically, we compare realized market correlations of hedged U.S. equity mutual fund portfolios to market correlations of random return series. Random return series are the ideal that would have been produced by perfect hedging of portfolios satisfying the random walk hypothesis. Whereas hedges that use a fixed 1 beta and hedges that use returns-based style analysis (RBSA) are flawed, a statistical equity risk model applied to portfolio holdings is close to the ideal.

Equity Market Hedging Techniques

We analyze approximately 3,000 non-index U.S. Equity Mutual Funds over 10 years. These provide a broad sample of the real-world long equity portfolios that investors may attempt to hedge. We evaluate the effectiveness of three techniques for calculating hedge ratios:

Assuming constant 100% market exposure (1 beta): Absent deeper statistical analysis, it is common to assume that all portfolios have the same market risk, equal to that of the broad benchmarks.

Using returns-based style analysis (RBSA): RBSA is a popular technique that attempts to estimate portfolio factor exposures by regressing portfolio returns against factor returns.

Applying a statistical equity risk model to portfolio holdings: This technique essentially performs RBSA on the individual portfolio holdings and aggregates the results.
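The contrast between the second and third techniques can be illustrated with a toy example. The sketch below uses a single market factor and plain ordinary least squares, whereas the model discussed in this piece uses robust regression, so it is an illustration of the idea rather than the actual method. The security names, returns, and function names are hypothetical.

```python
def ols_beta(asset_returns, market_returns):
    """Slope of an ordinary least-squares regression of asset on market returns."""
    n = len(market_returns)
    if n != len(asset_returns) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_m = sum(market_returns) / n
    mean_a = sum(asset_returns) / n
    cov = sum((m - mean_m) * (a - mean_a) for m, a in zip(market_returns, asset_returns))
    var = sum((m - mean_m) ** 2 for m in market_returns)
    if var == 0:
        raise ValueError("beta is undefined for constant market returns")
    return cov / var


def holdings_beta(weights, security_returns, market_returns):
    """Weighted sum of per-security betas, using current portfolio weights."""
    return sum(
        weight * ols_beta(security_returns[name], market_returns)
        for name, weight in weights.items()
    )


# Hypothetical 24 months: two securities with betas of 1.5 and 0.5.
market = [0.02, -0.01] * 12
securities = {
    "HighBeta": [1.5 * m for m in market],
    "LowBeta": [0.5 * m for m in market],
}

# The portfolio held HighBeta for 18 months, then switched entirely to LowBeta.
portfolio = securities["HighBeta"][:18] + securities["LowBeta"][18:]
current_weights = {"LowBeta": 1.0}

rbsa_estimate = ols_beta(portfolio, market)  # regression on portfolio returns
holdings_estimate = holdings_beta(current_weights, securities, market)
print(f"returns-based beta:  {rbsa_estimate:.2f}")
print(f"holdings-based beta: {holdings_estimate:.2f}")
```

In this setup the returns-based regression blends the two regimes into a beta of about 1.25, while the holdings-based estimate reflects the current exposure of 0.5. A hedge sized from the regression would therefore be more than twice as large as needed.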

For each fund and for each month of history we calculate market exposure at the end of the month and then use this estimated (ex-ante) exposure to hedge the fund during the following month. We then analyze realized (ex-post) 12-month hedged portfolio returns and calculate their correlations to the Market. The lower this correlation, the more effective a hedging technique is at eliminating systematic market exposure of a typical U.S. equity mutual fund portfolio.

Realized Market Correlations of Hedged U.S. Mutual Fund Portfolios

Realized Market Correlations of Random Return Series

An effective hedging technique should produce zero mean and median market correlations of hedged portfolio returns. Yet, if sufficiently large, even a set of perfectly random 12-month return series will contain some with large market correlations. Since our study covers over 200,000 12-month samples (observations), some market correlations are close to 1 by mere chance. To account for this and to create a baseline for comparisons, we calculated market correlations for random return series with a Monte Carlo simulation. These results, attainable only with a perfect hedge, are the baseline against which we evaluate equity market hedging techniques:
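A baseline of this kind can be sketched with a small Monte Carlo simulation. The code below draws independent normal 12-month series and correlates each with a fixed market series. The normal distribution, the seeds, the sample size of 5,000, and the simulated market returns are assumptions of this sketch, not the settings of the study.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def random_series_correlations(market_returns, n_series, seed=0):
    """Market correlations of independent random series of the same length."""
    rng = random.Random(seed)
    correlations = []
    for _ in range(n_series):
        # Scale and mean do not affect correlation, so a standard normal suffices.
        series = [rng.gauss(0.0, 1.0) for _ in market_returns]
        correlations.append(pearson(series, market_returns))
    return correlations


# Hypothetical 12 months of market returns.
market_rng = random.Random(42)
market = [market_rng.gauss(0.01, 0.04) for _ in range(12)]

correlations = random_series_correlations(market, n_series=5000)
mean_corr = sum(correlations) / len(correlations)
share_large = sum(abs(c) > 0.5 for c in correlations) / len(correlations)
print(f"mean correlation: {mean_corr:.3f}")
print(f"share with |correlation| > 0.5: {share_large:.1%}")
print(f"largest correlation: {max(correlations):.2f}")
```

With only twelve observations per series, a noticeable share of purely random series shows sizeable market correlation. This is why the hedging techniques are compared to the simulated distribution rather than to zero.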

Hedging with a constant 100% market exposure over-hedges some portfolios and under-hedges others. There is a group of low-exposure portfolios for which a fixed 100% market short is too large. These produce a fat tail of negative market correlations of nearly -1 for some hedged portfolios. There is also a group of portfolios for which a fixed 100% market hedge is too small.

Returns-based style analysis with multiple factors suffers from known issues of overfitting and collinearity. Less well-known are the problems that arise from RBSA’s assumption that exposures are constant over the regression window. In practice, portfolio exposures vary over time and can change rapidly as positions change. RBSA will capture these changes months or even years later once they influence portfolio returns, if at all.
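The lag can be illustrated with a single-factor regression on simulated data. This is a hedged sketch: the fund, the size of the exposure change, and the 36-month window are all assumptions made for illustration.

```python
import random


def ols_beta(portfolio, market):
    """Slope of an OLS regression of portfolio returns on market returns."""
    n = len(market)
    mean_p = sum(portfolio) / n
    mean_m = sum(market) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(portfolio, market))
    var = sum((m - mean_m) ** 2 for m in market)
    return cov / var


# Hypothetical fund whose true market beta jumps from 0.5 to 1.5 at month 60.
rng = random.Random(1)
market = [rng.gauss(0.005, 0.04) for _ in range(96)]
true_beta = [0.5] * 60 + [1.5] * 36
fund = [b * m + rng.gauss(0.0, 0.005) for b, m in zip(true_beta, market)]

window = 36  # assumed regression window, in months

# Six months after the change, the window still holds 30 months of old exposure.
beta_after_6_months = ols_beta(fund[30:66], market[30:66])

# Only after a full window has elapsed does the estimate reflect the new exposure.
beta_after_36_months = ols_beta(fund[60:96], market[60:96])

print(f"true beta after the change:   {true_beta[-1]:.2f}")
print(f"estimate 6 months afterward:  {beta_after_6_months:.2f}")
print(f"estimate 36 months afterward: {beta_after_36_months:.2f}")
```

A hedge sized with the six-month estimate would be materially too small for this fund, even though its positions changed half a year earlier. A holdings-based estimate would, in principle, register the change as soon as the new positions are observed.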

RBSA thus fails similarly to the fixed hedging above, if less dramatically: hedges are too large in some cases and too small in others. The exposure estimates are also apparently biased, since they produce hedges that are too large and market correlations that are negative, on average:

The statistical equity risk model estimates security market exposures using robust regression methods to control for outliers. Though robust techniques perform well for most portfolios, they appear to produce hedge ratios that are too low for some high-beta portfolios. This leads to small positive mean and median market correlations of hedged portfolio returns and to a higher probability of positive market correlations than for random portfolios.

Aside from this under-hedging of a small fraction of portfolios, application of the AlphaBetaWorks Statistical Equity Risk Model to fund holdings comes closest to perfect equity market hedging. Portfolio managers and investors who rely on robust risk models and hedging techniques can thus nearly perfectly hedge the market risk of a typical equity portfolio.

Summary

The effectiveness of equity market hedging techniques can be assessed by comparing hedged portfolio returns to random portfolio returns that would be produced by a perfect hedge.

Simplistic hedging that assumes 1 beta for all portfolios fails, most spectacularly for low-risk portfolios.

Returns-based style analysis (RBSA) both over-hedges and under-hedges, likely due to its failure to capture rapidly changing exposures.

And why Poor Nominal Returns are a Reason to Hire Rather than Fire a Manager

Our earlier pieces discussed how nominal investment performance reverts. Since returns are dominated by systematic risk factors (primarily the Market), they are subject to reversal when investment regimes change. In the simplest terms, high-risk funds do well in bull markets, and low-risk funds do well in bear markets, irrespective of stock picking skill. When the tide turns, so does the funds’ relative performance. The persistence of stock picking skill becomes evident once systematic effects are removed.

Measuring Persistence of Investment Performance

As in our prior performance persistence work, this study analyzes portfolios of all institutions that have filed Form 13F during the past 15 years. This survivorship-free portfolio database covers thousands of firms that have held at least $100 million in U.S. long assets during this period.

The relationship between performance metrics of a portfolio calculated at different points in time captures their persistence. To measure the persistence of nominal returns, we analyze nominal returns during two 12-month periods separated by a variable delay. For example, the analysis of a 24-month delay includes the periods 1/31/2010-1/31/2011 and 1/31/2013-1/31/2014. We use Spearman’s rank correlation coefficient to calculate statistically robust correlations between metrics. Technically speaking, we are studying the metrics’ serial correlation, or autocorrelation.
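The measurement can be sketched as follows. This is a minimal illustration with hypothetical data and helper names (`cumulative_return`, `serial_correlation`), assuming monthly return series of equal length for every portfolio. Spearman's coefficient is computed as the Pearson correlation of ranks.

```python
import math


def ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def cumulative_return(monthly, start, months=12):
    """Compounded return over `months` months beginning at index `start`."""
    total = 1.0
    for r in monthly[start:start + months]:
        total *= 1.0 + r
    return total - 1.0


def serial_correlation(portfolios, start, delay, months=12):
    """Rank correlation between two periods separated by `delay` months."""
    first = [cumulative_return(p, start, months) for p in portfolios]
    second = [cumulative_return(p, start + months + delay, months) for p in portfolios]
    return spearman(first, second)


# Hypothetical 12-month returns of five portfolios in two periods.
first_period = [0.12, 0.05, -0.03, 0.20, 0.08]
second_period = [0.01, 0.06, 0.04, -0.02, 0.10]
rho = spearman(first_period, second_period)
print(f"rank correlation: {rho:+.2f}")
```

Because only ranks enter the calculation, a few extreme returns cannot dominate the result, which is the sense in which the correlations are statistically robust.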

The Persistence of Investment Performance

Serial Correlation of Nominal Returns

Portfolios with above-average nominal returns for prior 12 months tend to underperform for approximately the following two years; similarly, those with below average nominal returns tend to then outperform:

13F Equity Portfolios: Serial correlation of nominal returns

| Delay (months) | Serial Correlation |
|---|---|
| 1 | -0.11 |
| 6 | -0.26 |
| 12 | -0.36 |
| 18 | -0.09 |
| 24 | 0.15 |
| 30 | 0.22 |
| 36 | 0.08 |
| 42 | -0.26 |
| 48 | -0.42 |
| 54 | -0.22 |
| 60 | 0.17 |

Serial Correlation of Security Selection Returns

To eliminate the disruptive factor effects responsible for the above reversion, the AlphaBetaWorks Performance Analytics Platform calculates return from security selection after controlling for the factor exposures. The resulting metric, αReturn, is the return a portfolio would have generated if all factor returns had been flat. Above-average and below-average 12-month αReturns tend to persist for approximately four years:

| Delay (months) | Serial Correlation |
|---|---|
| 1 | 0.08 |
| 6 | 0.10 |
| 12 | 0.08 |
| 18 | 0.05 |
| 24 | 0.05 |
| 30 | 0.04 |
| 36 | -0.01 |
| 42 | -0.03 |
| 48 | -0.04 |
| 54 | -0.03 |
| 60 | -0.01 |

The Persistence of Negative Investment Performance

The autocorrelation of overall nominal returns and αReturns captures the persistence of both negative and positive investment performance, but positive and negative metrics need not have similar persistence. In fact, the problems with nominal returns and simplistic performance metrics derived from them are accentuated when the nominal returns are negative.

Serial Correlation of Negative Nominal Returns

Negative 12-month nominal returns revert even more rapidly and more strongly than overall returns. The rank correlation coefficient for 12-month nominal returns separated by 6 months is approximately -0.5 for negative nominal returns and -0.2 for overall nominal returns. Poor recent nominal returns are a reason to hire rather than fire a manager, at least in the short term (the subsequent 12-18 months):

13F Equity Portfolios: Serial correlation of negative nominal returns

| Delay (months) | Serial Correlation |
|---|---|
| 1 | -0.57 |
| 6 | -0.23 |
| 12 | -0.01 |
| 18 | -0.14 |
| 24 | 0.52 |
| 30 | 0.18 |
| 36 | -0.27 |
| 42 | -0.27 |
| 48 | -0.24 |
| 54 | -0.12 |
| 60 | 0.07 |

Serial Correlation of Negative Security Selection Returns

This reversion is not present for αReturns. Negative αReturns have similar autocorrelation for the first few years and decay more slowly than overall αReturns:

The decay in security selection performance is typically due to such things as talent turnover, style drift, management distraction, and asset growth. Since these are more likely to affect the top-performing funds, negative αReturn remains predictive for longer. The above data is aggregate and specific firms can and do overcome the average fate. Though the above serial correlations may appear low, they are amplified and compounded in portfolios of multiple funds.

Cheerful consensus is usually a recipe for mediocrity, whether investing in a stock or in a fund. Fear and panic in the face of nominal underperformance are more dangerous still. Just as it pays to be a contrarian stock picker, it pays to be a contrarian fund investor or allocator.

Measuring the Decay of Stock Picking Skill

This study analyzes portfolios of all institutions that have filed Form 13F. This is the broadest and most representative survivorship-free portfolio database, covering thousands of firms that hold at least $100 million in U.S. long assets. Approximately 5,000 firms had sufficiently long histories, low turnover, and broad portfolios suitable for skill evaluation.

To measure the decay of stock picking performance over time, we compare metrics measured in two 12-month periods separated by variable delay. One example of 24-month delay is metrics for 1/31/2010-1/31/2011 and 1/31/2013-1/31/2014. We use Spearman’s rank correlation coefficient to calculate statistically robust correlations.

Serial Correlation of Nominal Returns

The following chart shows the serial correlation (autocorrelation) between 12-month cumulative nominal returns calculated with lags of one to sixty months. The relationship is generally negative. This illustrates that strong past nominal returns are predictive of future returns, albeit of poor ones in the short term:

13F Equity Portfolios: Serial correlation of nominal returns

| Delay (months) | Serial Correlation |
|---|---|
| 1 | -0.11 |
| 6 | -0.26 |
| 12 | -0.36 |
| 18 | -0.09 |
| 24 | 0.15 |
| 30 | 0.22 |
| 36 | 0.08 |
| 42 | -0.26 |
| 48 | -0.42 |
| 54 | -0.22 |
| 60 | 0.17 |

There is a narrow window at a 2-3 year lag when past returns are predictive of future results. This appears to be due to the approximately 18-month cycle of reversion in 12-month nominal performance.

Serial Correlation of Naive Alphas

It is common to measure alpha simply as outperformance relative to a benchmark. We will call this approach “naive alpha.” Since it ignores portfolio risk, this approach does not eliminate systematic (factor) effects and fails to isolate security selection performance: The top nominal performers who took the most systematic risk in a bullish regime remain the top performers after a benchmark return is subtracted. When regimes change, these former leaders tend to become the laggards, and vice versa.
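A small worked example shows why naive alpha reverts. This sketch rests on simplifying assumptions: a hypothetical fund with a market beta of 1.5 and no stock picking skill at all, single-period arithmetic, and no risk-free rate.

```python
def naive_alpha(portfolio_return, benchmark_return):
    """Outperformance relative to the benchmark, ignoring portfolio risk."""
    return portfolio_return - benchmark_return


def beta_adjusted_alpha(portfolio_return, benchmark_return, beta):
    """Return left after subtracting the portfolio's own market exposure."""
    return portfolio_return - beta * benchmark_return


# Hypothetical fund: beta of 1.5 and zero security selection return.
beta = 1.5
bull_market, bear_market = 0.20, -0.20
fund_in_bull = beta * bull_market   # about +30%
fund_in_bear = beta * bear_market   # about -30%

naive_bull = naive_alpha(fund_in_bull, bull_market)   # about +10%: looks skilled
naive_bear = naive_alpha(fund_in_bear, bear_market)   # about -10%: looks unskilled
adjusted_bull = beta_adjusted_alpha(fund_in_bull, bull_market, beta)  # about 0
adjusted_bear = beta_adjusted_alpha(fund_in_bear, bear_market, beta)  # about 0

print(f"naive alpha:         bull {naive_bull:+.2%}, bear {naive_bear:+.2%}")
print(f"beta-adjusted alpha: bull {adjusted_bull:+.2%}, bear {adjusted_bear:+.2%}")
```

The naive alpha flips sign with the regime even though nothing about the manager has changed, while the beta-adjusted figure stays at zero in both regimes.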

Indeed, the serial correlation of naive alphas is similar to the serial correlation of nominal returns. The following chart shows correlation between 12-month cumulative naive alphas calculated with 1-60 month lags:

Serial Correlation of Security Selection Returns

To eliminate the disruptive factor effects responsible for performance reversion, the AlphaBetaWorks Performance Analytics Platform calculates each portfolio’s return from security selection net of factor effects. αReturn is the return a portfolio would have generated if all factor returns had been flat.
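The idea of a return with all factor returns held flat can be sketched as below. This is only an illustration of the concept, not the AlphaBetaWorks calculation, which the text does not specify: the two factors, the exposures, the returns, and the choice to compound monthly residuals are all assumptions.

```python
def residual_returns(portfolio_returns, exposures, factor_returns):
    """Subtract exposure-weighted factor returns from each period's return."""
    residuals = []
    for r, period_exposures, period_factors in zip(
        portfolio_returns, exposures, factor_returns
    ):
        factor_part = sum(
            period_exposures[name] * period_factors[name]
            for name in period_exposures
        )
        residuals.append(r - factor_part)
    return residuals


def compound(returns):
    """Cumulative return from a list of periodic returns."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0


# Hypothetical three-month example with two assumed factors.
portfolio_returns = [0.040, -0.010, 0.030]
exposures = [  # exposures may differ each month as positions change
    {"market": 1.2, "sector": 0.5},
    {"market": 1.1, "sector": 0.4},
    {"market": 0.9, "sector": 0.6},
]
factor_returns = [
    {"market": 0.020, "sector": 0.010},
    {"market": -0.020, "sector": 0.010},
    {"market": 0.030, "sector": -0.010},
]

residuals = residual_returns(portfolio_returns, exposures, factor_returns)
nominal_return = compound(portfolio_returns)
selection_return = compound(residuals)

print(f"nominal return:            {nominal_return:.2%}")
print(f"security selection return: {selection_return:.2%}")
```

In this example most of the nominal return comes from factor exposures, and only the smaller residual is attributed to security selection.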

Firms with above-average αReturns in one period are likely to maintain them in the other, but with decay. The following chart shows correlation between 12-month cumulative αReturns calculated with 1-60 month lags:

Though the above serial correlations of αReturn may appear low, they are amplified and compounded in practical portfolios of multiple funds. A hedged portfolio of the net consensus longs (relative overweights) of the top 5% long U.S. equity stock pickers delivered approximately 8% return independently of the market.

For approximately 3 years, strong security selection performance, as measured by the 12-month αReturn, is predictive of the future 12-month results. Returns due to security selection thus persist for approximately 5 years. This means that as little as 12 months of consistently positive αReturns is a positive indicator for the following four years. Skilled stock pickers can be spotted years before their skill is plainly visible and broadly exploited.

The decay in security selection performance is typically due to the following sources: talent turnover, style drift, management distraction, and asset growth. It is not a coincidence that the conventional requirement for a large institutional allocation is a 3-5 year track record.

The above data is aggregate. Specific outstanding disciplined firms can overcome this reversion, but they are the exceptions that require careful monitoring. Spotting skilled managers before their skill is visible to all is a sounder path to superior selection. In this respect, investing with managers is very similar to investing in stocks. Manager skill is arbitraged away – analytical advantage over the crowd is key. Cheerful consensus is usually a recipe for mediocrity whether investing in a stock or in a fund.