I'm looking to run portfolio optimizations using various objectives, e.g. minimum variance, maximum diversification, etc. My challenge is: if I want to do this on ETFs, which ones do I pick to run the optimization on?

Say there is a universe of 200 or so ETFs: is there some form of clustering I can do to reduce this down to a smaller set of 20 or so to optimize? Or is this best handled by letting the portfolio optimizer itself assign the appropriate weights across the larger set?

What techniques should I consider for clustering, and which metrics make sense? Correlation? Mean return (I doubt it, since that's so noisy)? Anything else?

Updates:

To clarify, what I'm trying to do is whittle down the 500 names in the S&P 500 to 20 or so clusters, and then from each cluster take the most representative stock to get 20 names. I would then run the portfolio optimization on those 20 names.
Following a series from the amazing Systematic Investor blog, I've been able to get really nice results doing some clustering along these lines:
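A minimal R sketch of that kind of workflow (not the blog's actual code), assuming a $t \times 500$ return matrix `returns` with tickers as column names; the Ward linkage and the cut at 20 clusters are illustrative choices:

```r
## Cluster names on a correlation-based distance, then keep one
## representative name per cluster.
R   <- cor(returns)
d   <- as.dist(1 - R)                  # correlated names are "close"
hc  <- hclust(d, method = "ward.D2")   # hierarchical clustering
grp <- cutree(hc, k = 20)              # cut the tree into 20 clusters
## Representative per cluster: the name with the highest average
## intra-cluster correlation (a proxy for "closest to the center").
reps <- sapply(split(colnames(R), grp), function(nm)
  nm[which.max(rowMeans(R[nm, nm, drop = FALSE]))])
```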

I'm actually doing the exact same thing, but with a smaller universe of assets. Mine is for performance-tracking purposes.
– user1234440 Dec 19 '12 at 3:39

I would definitely do a preselection by looking at bid-ask spreads, cost/tracking error, trading volume, NAV premium/discount, and replication mode (not necessarily in this order, but I would advise it). I think you will find that after this analysis your universe will be narrowed down...
– vanguard2k Dec 19 '12 at 7:32

4 Answers

First, find out which ETFs are correlated with one another over time. Let the data matrix $\mathbf{X}$ of ETF price returns have $t$ rows and $p$ columns, where the $t$ rows are bars or days and the $p$ columns are ETFs. Next, compute the correlation matrix $\mathbf{R}$ of ETF-to-ETF correlations, and then run principal components analysis (PCA) to identify which ETFs load on (i.e., are correlated with) each principal component (PC). To get an intuition for PCA: if you have a correlation matrix $\mathbf{R}$ for, say, 50 ETFs, then PCA will create 50 PCs that have zero correlation with one another. For each ETF, there will be a loading value (i.e., a correlation) on each PC. If the price returns of several ETFs are correlated, they will likely load on the same PC. Commonly, loadings (correlations) of 0.45 and greater are considered large enough to warrant further investigation.
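As an illustrative sketch (assuming `returns` is the $t \times p$ matrix $\mathbf{X}$ described above), the correlation-scale loadings can be computed in R from the eigendecomposition of $\mathbf{R}$:

```r
## PCA on the ETF return correlation matrix.
R   <- cor(returns)                    # p x p correlation matrix
eig <- eigen(R)                        # eigendecomposition of R
## Correlation-scale loadings: loadings[j, k] = cor(ETF j, PC k)
loadings <- eig$vectors %*% diag(sqrt(pmax(eig$values, 0)))
rownames(loadings) <- colnames(returns)
round(loadings[, 1:5], 2)              # inspect loadings on first 5 PCs
```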

You can use the loadings of each ETF on the various PCs to group together ETFs whose price returns are correlated. For example, bank ETFs may correlate (loading > 0.45) with the 1st PC, oil ETFs with the 2nd PC, emerging-growth ETFs with the 3rd, and so on.

Assets (ETFs) typically don't correlate with everything else, but do correlate with assets in the same sector. PCA creates artificial vectors (eigenvectors) that have zero correlation with one another, so by observing which ETFs correlate most strongly (>0.45) with a given PC, you can essentially group them.
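Continuing the sketch above, one way to form the groups is to assign each ETF to the PC on which its absolute loading is largest, keeping only assignments that clear the 0.45 rule of thumb:

```r
## Group each ETF by its strongest PC; loadings below 0.45 are
## treated as "no clear group" (NA entries are dropped by split()).
abs_load <- abs(loadings)
best_pc  <- max.col(abs_load, ties.method = "first")
strong   <- abs_load[cbind(seq_len(nrow(abs_load)), best_pc)] > 0.45
groups   <- ifelse(strong, best_pc, NA)
split(rownames(loadings), groups)      # lists of ETFs per PC
```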

PCA is wholly linear; there are non-linear methods such as Laplacian Eigenmaps, Locally Linear Embedding, Locality Preserving Projections, Diffusion Maps, etc. that can be employed. Another term for these non-linear techniques is "distance metric learning."

I would run a pre-optimization routine over the whole universe of 200+ ETFs, and use it to reduce the universe to a cardinality that provides optimal diversification effects. You can do that by first looking at pair-wise correlations and then also running optimizations that reduce portfolio variance by utilizing the covariance matrices. That way you will already filter out highly correlated assets that are useless to combine in a portfolio. You could use additional constraints, such as minimum liquidity requirements, to decide which assets to keep and which to toss out. In any case, try to eliminate highly correlated assets and those assets that do not reduce portfolio variance in a meaningful way.
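A rough sketch of such a correlation pre-filter (the greedy rule and the 0.95 cutoff are illustrative assumptions, not the answer's exact procedure):

```r
## Greedy pre-filter: drop one asset from every pair whose absolute
## return correlation exceeds a (hypothetical) 0.95 cutoff.
prefilter <- function(returns, cutoff = 0.95) {
  R    <- abs(cor(returns))
  keep <- colnames(returns)
  repeat {
    Rk <- R[keep, keep, drop = FALSE]
    diag(Rk) <- 0
    worst <- which(Rk == max(Rk), arr.ind = TRUE)[1, ]
    if (Rk[worst[1], worst[2]] <= cutoff) break
    ## Drop the member of the pair with the higher average correlation
    avg  <- rowMeans(Rk[worst, , drop = FALSE])
    keep <- setdiff(keep, names(which.max(avg)))
  }
  keep
}
## universe <- prefilter(etf_returns)  # assumed t x p return matrix
```

The caret package's `findCorrelation` implements a similar heuristic if you prefer a packaged version.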

I wanted to add one thing here: eliminating highly correlated assets reduces the condition number of the variance-covariance matrix, thus giving you more stable results. At some point in the optimization procedure you have to invert this matrix somehow, so eliminating correlated products is also sensible from a numerical point of view.
– vanguard2k Dec 19 '12 at 7:43
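To see the conditioning point concretely, here is a toy example (not from the comment) using base R's `kappa()`, which estimates the condition number:

```r
## The covariance matrix of two nearly identical assets is badly
## conditioned; dropping one of them fixes it.
set.seed(1)
x <- rnorm(1000)
X <- cbind(a = x, b = x + rnorm(1000, sd = 0.01), c = rnorm(1000))
kappa(cov(X))                  # huge: a and b are nearly collinear
kappa(cov(X[, c("a", "c")]))   # well conditioned after dropping b
```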

When I select assets for a portfolio from a given universe, I tend to pick ones that span the beta spectrum relative to your selected benchmark. I find that if the assets in your portfolio have varying volatilities or correlations, you can achieve better diversification. I didn't come up with the idea; it comes from a rotational-system framework from the link below.
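A minimal sketch of spanning the beta spectrum, assuming (hypothetical) inputs `etf_returns` and `bench_returns` and a target of 20 picks:

```r
## Estimate each asset's beta against the benchmark via OLS,
## then take names spread evenly from low beta to high beta.
betas <- apply(etf_returns, 2, function(r)
  coef(lm(r ~ bench_returns))[2])      # slope coefficient = beta
ranked <- sort(betas)
picks  <- names(ranked)[round(seq(1, length(ranked), length.out = 20))]
```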

One thing to be careful of is randomly throwing assets into an optimization procedure, as you may find the resulting allocation concentrated in risk that is hidden at the portfolio level. It may be useful to first manually separate the ETFs into their respective categories.

Update: I am not sure of the relevance to your research, but CSS Analytics just started a two-part post about Cluster Risk Parity, which has some ideas about subsetting assets from a portfolio universe.

The description says it works on sparse matrices. What does the complete process you used look like?
– Bob Jansen♦ Jan 3 '13 at 20:27

I read the attached doc, and it's very unclear from that document how your procedure works. Can you please elaborate?
– nxstock-trader Jan 6 '13 at 22:15

The easiest way to get a feel for the method is to download the R package and experiment with some toy examples. Bob, the output is a sparse correlation matrix (i.e., clustered). The input is the full asset correlation matrix, which is not sparse but which you intend to shrink until it is.
– zweiterlinde Feb 23 '13 at 17:36
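The package's algorithm isn't spelled out here, but as a toy illustration of the input/output shapes described (a dense correlation matrix shrunk until it is sparse; the hard threshold below is an arbitrary stand-in for the package's actual shrinkage):

```r
## Start from a dense toy correlation matrix and zero out weak links.
R        <- cor(matrix(rnorm(1000 * 10), ncol = 10))  # dense input
R_sparse <- R * (abs(R) > 0.2)                        # sparse output
mean(R_sparse == 0)                                   # fraction of zeros
```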