Covariance

Measure of how much two random variables vary together.

A positive value means that when one variable is above its mean, the other tends to be above its mean too. A negative value means that when one is above its mean, the other tends to be below its mean. A value of 0 does not mean they are independent of each other though (see `Y=X^2` with X symmetric about 0).

`"Cov"(X,Y)="Covariance"=E((X-mu_X)(Y-mu_Y))`

`"Cov"(X,Y)=int_c^d int_a^b (x-mu_x)(y-mu_y)f(x,y)dxdy`

` = (int_c^d int_a^b xy f(x,y) dxdy) - mu_x mu_y`

Properties:

`"Cov"(aX+b, cY+d) = ac"Cov"(X,Y)`

`"Cov"(X_1+X_2,Y) = "Cov"(X_1,Y) + "Cov"(X_2,Y)`

`"Cov"(X,X) = "Var"(X)`

`"Cov"(X,Y) = E(XY) - mu_X mu_Y`

`"Var"(X+Y) = "Var"(X) + "Var"(Y) + 2"Cov"(X,Y)`

If X and Y are independent then Cov(X,Y) = 0.
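A minimal sketch to check the formulas above on tiny discrete distributions, using exact fractions. The two joint pmfs and the constants a, b, c, d are made up for illustration; the sums play the role of the double integrals.

```python
from fractions import Fraction

def expectation(pmf, g):
    """E[g(X, Y)] for a joint pmf given as {(x, y): probability}."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

def covariance(pmf):
    """Cov(X, Y) = E[(X - mu_X)(Y - mu_Y)]."""
    mu_x = expectation(pmf, lambda x, y: x)
    mu_y = expectation(pmf, lambda x, y: y)
    return expectation(pmf, lambda x, y: (x - mu_x) * (y - mu_y))

def affine(pmf, a, b, c, d):
    """Joint pmf of (aX + b, cY + d); assumes a and c are non-zero."""
    return {(a * x + b, c * y + d): p for (x, y), p in pmf.items()}

half, third = Fraction(1, 2), Fraction(1, 3)

# Hypothetical pmf 1: X uniform on {-1, 0, 1} and Y = X^2 (fully dependent).
square = {(-1, 1): third, (0, 0): third, (1, 1): third}
cov_square = covariance(square)

# Hypothetical pmf 2: X = Y, each value 0 or 1 with probability 1/2.
equal = {(0, 0): half, (1, 1): half}
cov_equal = covariance(equal)

# Shortcut formula: Cov(X, Y) = E(XY) - mu_X mu_Y.
shortcut = (expectation(equal, lambda x, y: x * y)
            - expectation(equal, lambda x, y: x) * expectation(equal, lambda x, y: y))

# Scaling property: Cov(aX + b, cY + d) = ac Cov(X, Y).
a, b, c, d = 2, 5, -3, 1
cov_affine = covariance(affine(equal, a, b, c, d))

print(cov_square, cov_equal, shortcut, cov_affine)  # 0 1/4 1/4 -3/2
```

The first pmf has covariance 0 even though Y is a function of X, so zero covariance does not give independence. The shifts b and d drop out, and only the product ac rescales the covariance.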

Correlation Coefficient

Measures the strength of the linear relationship between two variables. It does not capture higher order relationships (like `Y=X^2`).

`rho` is the covariance of the standardizations of X and Y.

`"Cor"(X,Y) = rho = ("Cov"(X,Y))/(sigma_X sigma_Y)`

`rho` is dimensionless (it's a ratio)

`-1 <= rho <= 1`

`rho = +1` iff `Y=aX+b` with a > 0

`rho = -1` iff `Y=aX+b` with a < 0
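A small sketch of the correlation facts above on paired data, treating the five points as the whole population. The data values and helper names are invented for illustration.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance: average of (x - mean_x)(y - mean_y)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    """rho = Cov(X, Y) / (sigma_X sigma_Y)."""
    sigma_x = math.sqrt(covariance(xs, xs))  # Cov(X, X) = Var(X)
    sigma_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sigma_x * sigma_y)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]  # symmetric about 0

rho_up = correlation(xs, [3 * x + 1 for x in xs])      # Y = aX + b with a > 0
rho_down = correlation(xs, [-3 * x + 1 for x in xs])   # Y = aX + b with a < 0
rho_square = correlation(xs, [x ** 2 for x in xs])     # Y = X^2, not linear

# Made-up noisy data: rho should land strictly between -1 and 1.
noisy = [1.0, 2.5, 2.0, 4.5, 4.0]
rho_noisy = correlation(xs, noisy)

# Changing the units of X (say metres to centimetres) leaves rho unchanged.
rho_rescaled = correlation([100 * x for x in xs], noisy)

print(rho_up, rho_down, rho_square, rho_noisy, rho_rescaled)
```

Up to floating-point rounding, the two linear cases give +1 and -1, the quadratic case gives 0 despite full dependence, and rescaling X does not move `rho`.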

Quantiles

The 60th percentile is the same as the 0.60 quantile and the 6th decile. The 3rd quartile is the 75th percentile.

The median is the x for which `P(X<=x) = P(X>=x)`.

Equivalently, for a continuous X, the median is the x where the cdf satisfies `F(x) = P(X<=x) = .5`.

The pth quantile of X is the value `q_p` such that `F(q_p)=P(X<=q_p)=p`. In this notation `q_.5` is the median.
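A quick sketch of the quantile definition, assuming an exponential distribution because its cdf inverts in closed form. The distribution choice and the rate value are assumptions for illustration only.

```python
import math

RATE = 2.0  # hypothetical rate parameter of the exponential distribution

def cdf(x):
    """F(x) = P(X <= x) for X ~ Exponential(RATE)."""
    return 1.0 - math.exp(-RATE * x) if x >= 0 else 0.0

def quantile(p):
    """q_p solving F(q_p) = p, valid for 0 <= p < 1."""
    return -math.log(1.0 - p) / RATE

median = quantile(0.5)            # q_.5
percentile_60 = quantile(60 / 100)
decile_6 = quantile(6 / 10)
quartile_3 = quantile(3 / 4)      # 75th percentile

# At the median both tails carry the same probability.
left_tail = cdf(median)           # P(X <= median)
right_tail = 1.0 - cdf(median)    # P(X >= median) for a continuous X

print(median, percentile_60, decile_6, quartile_3)
```

Plugging each `q_p` back into the cdf should return p, and the 60th percentile and 6th decile are the same number because both are `q_.6`.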

Central Limit Theorem & Law of Large Numbers

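Here `X_1, ..., X_n` are i.i.d. with mean `mu` and variance `sigma^2`, and `bar(X)_n = (X_1 + ... + X_n)/n` is the sample mean.

LoLN: As n grows, the probability that `bar(X)_n` is close to `mu` goes to 1.

LoLN: `lim_(n->oo) P(|bar(X)_n - mu| < alpha) = 1` for every `alpha > 0`

CLT: As n grows, the distribution of `bar(X)_n` is approximately the normal distribution `N(mu,sigma^2/n)`. Precisely, the standardized mean `(bar(X)_n - mu)/(sigma/sqrt(n))` converges in distribution to `N(0,1)`.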
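A rough simulation sketch of both statements for the sample mean of n draws. The choice of Uniform(0, 1) draws, the seed, and the sample sizes are all assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so reruns give the same numbers

MU = 0.5          # mean of Uniform(0, 1)
SIGMA2 = 1 / 12   # variance of Uniform(0, 1)

def sample_mean(n):
    """Average of n independent Uniform(0, 1) draws."""
    return sum(rng.random() for _ in range(n)) / n

# LoLN: the distance between the sample mean and mu tends to shrink with n.
errors = {n: abs(sample_mean(n) - MU) for n in (10, 1000, 100000)}

# CLT: many sample means of size n should look like N(mu, sigma^2 / n).
n = 50
means = [sample_mean(n) for _ in range(2000)]
observed_var = statistics.pvariance(means)
predicted_var = SIGMA2 / n
predicted_sd = predicted_var ** 0.5

# A normal distribution puts about 68% of its mass within one sd of the mean.
within_one_sd = sum(abs(m - MU) <= predicted_sd for m in means) / len(means)

print(errors)
print(observed_var, predicted_var, within_one_sd)
```

The observed variance of the simulated means should sit near `sigma^2/n`, and roughly 68% of them should fall within one predicted standard deviation of `mu`, even though the individual draws are uniform rather than normal.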