Saturday, October 27, 2012

Let's give the glmnet package a little workout. We'll generate data with a large number of features, only a handful of which actually drive the response; the rest are unrelated. Of course, we'll inject some noise into the data, too.
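A minimal sketch of such a setup (the dimensions, seed, and coefficient values here are my own choices, not necessarily the ones used in the original post):

```r
set.seed(42)
n <- 100                              # observations
p <- 1000                             # features, far more than observations
k <- 5                                # only the first k features carry signal

X <- matrix(rnorm(n * p), n, p)
beta <- c(rep(2, k), rep(0, p - k))   # true coefficients: mostly zero
y <- drop(X %*% beta + rnorm(n))      # response plus Gaussian noise
```

With 1000 candidate features and only 5 real predictors, this is exactly the sparse setting that penalized regression is designed for.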

Elastic net estimates the true parameters pretty well. It also gives a small weight to a spurious predictor. With 1000 features to choose from, it's not unlikely that one will look like a predictor by chance.
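A fit along these lines might look like the following (again a sketch: the simulated data and the choice of `alpha = 0.5` are my assumptions, not the post's exact settings):

```r
library(glmnet)

# Simulated data: 1000 features, 5 true predictors, Gaussian noise
set.seed(42)
n <- 100; p <- 1000
X <- matrix(rnorm(n * p), n, p)
beta <- c(rep(2, 5), rep(0, p - 5))
y <- drop(X %*% beta + rnorm(n))

# alpha = 0.5 mixes the ridge and lasso penalties; cv.glmnet picks lambda
fit <- cv.glmnet(X, y, alpha = 0.5)

# Nonzero coefficients at the CV-chosen lambda: the true signals,
# often plus a stray feature or two that looks predictive by chance
coefs <- as.matrix(coef(fit, s = "lambda.min"))
coefs[coefs != 0, , drop = FALSE]
</imports>
```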

For some reason, cv.glmnet cross-validates over a range of lambda values, but doesn't do the same for alpha. Tweaking alpha might help eliminate that extra coefficient.