A test of a single linear combination of multiple signal-noise ratios, based on independent samples, can be performed using the sr_unpaired_test function. This code performs inference via the Upsilon distribution. The sr_test function also acts as a frontend to this code for the two-sample case.
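A minimal sketch of that two-sample frontend on simulated data (both streams are drawn under the null of equal signal-noise, so the p-value is just one draw from its null distribution):

```r
library(SharpeR)
set.seed(1111)
# two independent daily return streams with the same signal-noise ratio
x <- rnorm(500, mean = 0.001, sd = 0.01)
y <- rnorm(500, mean = 0.001, sd = 0.01)
# unpaired two-sample test of equality of Sharpe ratios
tst <- sr_test(x, y, ope = 252)
print(tst$p.value)
```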

First we perform this test under the null, using randomly generated data. With six streams of returns and the default contrasts, we are testing the sum of three differences of Sharpes here.

set.seed(9001)
pvals <- replicate(1000, {
    X <- matrix(rnorm(500 * 6), ncol = 6)
    inp <- as.sr(X)
    etc <- sr_unpaired_test(inp)
    etc$p.value
})

require(ggplot2)
data <- data.frame(pvals = pvals)
# empirical CDF of the p-values; should be uniform
ph <- ggplot(data, aes(sample = pvals)) + stat_qq(dist = qunif) +
    geom_abline(slope = 1, intercept = 0, colour = "red") +
    theme(text = element_text(size = 8)) + labs(title = "P-P plot")
print(ph)

Now we repeat with a non-zero null value:

set.seed(9002)
pvals <- replicate(1000, {
    zeta <- 0.1
    sg <- 0.01
    X <- matrix(rnorm(500 * 6, mean = zeta * sg, sd = sg), ncol = 6)
    inp <- as.sr(X)
    etc <- sr_unpaired_test(inp, contrasts = rep(1, dim(X)[2]),
        null.value = dim(X)[2] * zeta)
    etc$p.value
})

require(ggplot2)
data <- data.frame(pvals = pvals)
# empirical CDF of the p-values; should be uniform
ph <- ggplot(data, aes(sample = pvals)) + stat_qq(dist = qunif) +
    geom_abline(slope = 1, intercept = 0, colour = "red") +
    theme(text = element_text(size = 8)) + labs(title = "P-P plot")
print(ph)

Now for real data. We download monthly returns of the three Fama French factors plus Momentum
(the original Fifth Beatle), then divide the sample into January and non-January periods. We regress
Momentum against the other three factors, then convert each linear regression into a Sharpe ratio
estimate. The two Sharpe ratios are then fed to an unpaired two-sample test. We reject the null
of equal idiosyncratic Momentum in January versus the rest of the year at the 0.05 level.
Is this 'the January Effect'? Perhaps.
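The download step is not reproduced here; the following is a sketch of the procedure on simulated stand-in data (the data frame `ff4`, the dates `dts`, and the annualized `ope = 12` are assumptions for illustration, not the real factor data, so no January effect should appear):

```r
library(SharpeR)
set.seed(2718)
# simulated stand-in for monthly four-factor data (hypothetical; the real
# example downloads the Fama French factors plus Momentum)
n <- 1080
dts <- seq(as.Date("1927-01-01"), by = "month", length.out = n)
ff4 <- data.frame(Mom = rnorm(n, 0.005, 0.04), Mkt = rnorm(n, 0.005, 0.05),
    SMB = rnorm(n, 0.002, 0.03), HML = rnorm(n, 0.003, 0.03))
is.jan <- (format(dts, "%m") == "01")
# attribute Momentum to the other three factors, in and out of January
mod.jan <- lm(Mom ~ Mkt + SMB + HML, data = ff4[is.jan, ])
mod.rest <- lm(Mom ~ Mkt + SMB + HML, data = ff4[!is.jan, ])
sr.jan <- as.sr(mod.jan, ope = 12)
sr.rest <- as.sr(mod.rest, ope = 12)
# unpaired test of the difference of idiosyncratic signal-noise ratios
etc <- sr_unpaired_test(list(sr.jan, sr.rest), contrasts = c(1, -1))
print(etc$p.value)
```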

Using the Upsilon distribution, we can
compute prediction intervals for the future realized Sharpe ratio, with the nominal coverage
frequency holding over the full experiment (in-sample estimation and out-of-sample realization
together). Here is an example on simulated data:

set.seed(9003)
n1 <- 500
n2 <- 100
okvals <- replicate(500, {
    zeta <- 0.1
    sg <- 0.01
    X <- rnorm(n1 + n2, mean = zeta * sg, sd = sg)
    inp <- as.sr(X[1:n1])
    oos <- as.sr(X[n1 + (1:n2)])
    pint <- predint(inp, oosdf = n2 - 1, ope = 1)
    is.ok <- (pint[, 1] <= oos$sr) & (oos$sr <= pint[, 2])
    is.ok
})
coverage <- mean(okvals)
print(coverage)

## [1] 0.95

For a more complicated example, consider the 'Sharpe' under the attribution
model. Here we download the daily data of the three Fama French factors, then
perform an attribution of SMB against the market and HML. We compute the
factor-model Sharpe separately for the first nine months and the last three
months of each year. Using the first three quarters, we compute a prediction interval
for the fourth quarter, then check coverage:

library(xts)
library(Quandl)
# auth-fu
quandl.auth <- Sys.getenv("QUANDL_AUTH")
options(quandl.auth = ifelse(nchar(quandl.auth), quandl.auth, ""))
rm(quandl.auth)
Quandl.auth(options()$quandl.auth)
# get data!
fff.xts <- Quandl("KFRENCH/FACTORS_D", start_date = "1927-01-31",
    end_date = "2014-12-31", type = "xts")
fff.xts$Mkt <- fff.xts[, "Mkt-RF"] + fff.xts[, "RF"]
# shoot me if this is how to get the year number from a time index.
yrno <- as.numeric(floor(as.yearmon(index(fff.xts))))
qname <- quarters(index(fff.xts))
is.q4 <- (qname == "Q4")
okvals <- lapply(unique(yrno), function(yr) {
    isi <- (yrno == yr) & !is.q4
    oosi <- (yrno == yr) & is.q4
    mod.is <- lm(SMB ~ Mkt + HML, data = fff.xts[isi, ])
    mod.oos <- lm(SMB ~ Mkt + HML, data = fff.xts[oosi, ])
    sr.is <- as.sr(mod.is)
    sr.oos <- as.sr(mod.oos)
    # compute prediction intervals
    pint <- predint(sr.is, oosdf = sr.oos$df, oosrescal = sr.oos$rescal,
        ope = sr.oos$ope)
    is.ok <- (pint[, 1] <= sr.oos$sr) & (sr.oos$sr <= pint[, 2])
    is.ok
})
coverage <- mean(unlist(okvals))
print(coverage)

## [1] 0.8

It is not clear whether non-normality or omitted variable bias (or broken code!)
is to blame for the apparent under-coverage of the prediction intervals in this case.
We can check by simply shuffling the daily returns data and repeating the
experiment:

# shuffle the returns data by row
set.seed(1234)
shufff <- as.data.frame(fff.xts)
shufff <- shufff[sample.int(nrow(shufff)), ]
okvals <- lapply(unique(yrno), function(yr) {
    isi <- (yrno == yr) & !is.q4
    oosi <- (yrno == yr) & is.q4
    mod.is <- lm(SMB ~ Mkt + HML, data = shufff[isi, ])
    mod.oos <- lm(SMB ~ Mkt + HML, data = shufff[oosi, ])
    sr.is <- as.sr(mod.is)
    sr.oos <- as.sr(mod.oos)
    # compute prediction intervals
    pint <- predint(sr.is, oosdf = sr.oos$df, oosrescal = sr.oos$rescal,
        ope = sr.oos$ope)
    is.ok <- (pint[, 1] <= sr.oos$sr) & (sr.oos$sr <= pint[, 2])
    is.ok
})
coverage <- mean(unlist(okvals))
print(coverage)

## [1] 0.95

Of course, this could be a 'lucky seed', but one suspects that non-normality is not
the issue; rather, there is some autocorrelation of (idiosyncratic) returns (or of volatility!).

The (negative) Markowitz portfolio appears in the inverse of
the uncentered second moment matrix of the 'augmented' vector
of returns. Via the Central Limit Theorem and the delta method,
the asymptotic distribution of the Markowitz portfolio can
be found. From this, Wald statistics on the individual portfolio
weights can be computed. Here I perform this computation on a
portfolio consisting of three large-cap stocks, and find
that the Markowitz weighting of AAPL is significantly non-zero
(modulo the selection biases in universe construction). The
results are little changed when using a 'robust' covariance
estimator.
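That computation is not reproduced here; the following is a minimal base-R sketch of the delta-method Wald machinery on simulated returns (the tickers are hypothetical labels, and nothing is claimed about the real AAPL result):

```r
set.seed(1)
n <- 1000
# simulated daily returns for three hypothetical large-cap stocks
X <- matrix(rnorm(n * 3, mean = 0.001, sd = 0.01), ncol = 3)
colnames(X) <- c("AAPL", "MSFT", "XOM")
A <- cbind(1, X)              # 'augmented' observations (1, x')
p <- ncol(A)
Theta <- crossprod(A) / n     # uncentered second moment matrix
iTheta <- solve(Theta)
# the negative Markowitz portfolio sits in the first row of iTheta
neg.marko <- iTheta[1, -1]
# CLT: vec(Theta-hat) is asymptotically normal; estimate its vcov from
# the per-observation outer products
aat <- t(apply(A, 1, function(a) c(outer(a, a))))
Omega <- cov(aat) / n
# delta method: the Jacobian of vec(solve(Theta)) is -(iTheta %x% iTheta)
J <- -(iTheta %x% iTheta)
V <- J %*% Omega %*% t(J)     # approximate vcov of vec(iTheta)
# positions of iTheta[1, -1] within the column-stacked vec(iTheta)
idx <- p * (1:(p - 1)) + 1
wald <- neg.marko / sqrt(diag(V)[idx])
print(wald)
```

A robust variant would swap `cov(aat)` for a heteroskedasticity- and autocorrelation-consistent estimate of the same matrix.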