Recently in an interview I was asked the following (I am paraphrasing):

The magnitude of uncertainty of the movement of $S_t$ is represented by $\sigma$ and is clearly captured in the term $\exp\{\sigma W_t\}$. But intuitively, why does $\sigma$ appear again in the term $r-\frac{\sigma^2}{2}$? That is, why are we deducting $\frac{\sigma^2}{2}$ from our drift $r$? What is the interpretation?

8 Answers

The rigorous answer: because Ito calculus tells us that we need the second order term. Look at
$$
S_t = S_0\exp(\mu t + \sigma B_t).
$$
Assume that $S_0$ is known and fixed. Then, by Ito's formula,
$$
\frac{dS_t}{S_t} = \mu\, dt + \sigma\, dB_t + \frac{\sigma^2}{2}\, dt.
$$
Taking expectations, with some abuse of notation:
$$
E\left[\frac{dS_t}{S_t}\right] = \mu\, dt + \frac{\sigma^2}{2}\, dt,
$$
and we get the convexity term. So it is again the crucial story of Ito calculus: second order terms don't vanish (as in usual calculus) - they just stay. If you want to see this from the SDE then you have to use the Stratonovich formulation (see e.g. here).
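The claim that second-order terms "just stay" can be checked numerically. The following is a minimal sketch, assuming a uniform time grid and a seeded pseudo-random generator (the function names and parameter values are illustrative, not from the answer): the sum of squared Brownian increments over $[0,t]$ stays close to $t$, whereas the same sum for a smooth path vanishes as the grid is refined.

```python
import math
import random


def brownian_quadratic_variation(t=1.0, n_steps=100_000, seed=0):
    """Sum of squared Brownian increments on a uniform grid over [0, t]."""
    rng = random.Random(seed)
    dt = t / n_steps
    sd = math.sqrt(dt)  # each increment is N(0, dt)
    return sum(rng.gauss(0.0, sd) ** 2 for _ in range(n_steps))


def smooth_quadratic_variation(t=1.0, n_steps=100_000):
    """Same sum for the smooth path f(u) = u, whose increments are all dt."""
    dt = t / n_steps
    return n_steps * dt ** 2  # equals t * dt, which shrinks with the grid


qv_brownian = brownian_quadratic_variation()
qv_smooth = smooth_quadratic_variation()
print(f"Brownian path: {qv_brownian:.4f}  (stays near t = 1)")
print(f"Smooth path:   {qv_smooth:.6f}  (goes to 0)")
```

This is why $(dB_t)^2$ is treated as $dt$ in Ito's formula, while $(dt)^2$ is dropped.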

The intuitive answer:
Just look at $E[\exp(\sigma W_t)]$. You can think of this as
$$
\exp(\sigma W_t) \approx \exp(Z \sqrt{t} \sigma)
$$
where $Z$ takes the values $\pm 1$ with probability $1/2$ (note that the noise gets $\sqrt t $ whereas a drift term would get a $t$).
Then the expectation is
$$
E[\exp(\sigma W_t)] \approx \frac12 (\exp(\sqrt{t} \sigma)+\exp(-\sqrt{t} \sigma)),
$$
using the Taylor series expansion this is
$$
\frac12 ((1+ \sqrt{t} \sigma + \frac{t \sigma^2}{2} + \text{terms of higher order}) + (1- \sqrt{t} \sigma + \frac{t \sigma^2}{2} + \text{terms of higher order} )),
$$
and you see that the terms of order $\sqrt{t}$ cancel out. You get something like
$$
E[\exp(\sigma W_t)] \approx 1 + \frac{t \sigma^2}{2} + \text{terms of higher order} \approx \exp(\frac{t \sigma^2}{2}).
$$

As a last comment: if you have $\exp(\sigma W_t)$ and $W_t$ is symmetric, then the positive outcomes pull the expectation up, because $\exp(\sqrt{t} \sigma)$ is further away from $\exp(0)=1$ than $\exp(-\sqrt{t} \sigma)$ is. For example, for $\sqrt{t} = 0.1$ and $\sigma=0.2$ you have $1.020201$ versus $0.9801987$. Thus if it goes up, it goes further from $1$ than if it goes down.
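The two-point approximation is easy to reproduce. Here is a small sketch using the values from the text ($\sqrt{t} = 0.1$, $\sigma = 0.2$); the function and variable names are only illustrative.

```python
import math


def two_point_mean(sigma, t):
    """E[exp(sigma * sqrt(t) * Z)] when Z is +1 or -1 with probability 1/2."""
    x = sigma * math.sqrt(t)
    return 0.5 * (math.exp(x) + math.exp(-x))


sigma, t = 0.2, 0.01  # sqrt(t) = 0.1, as in the example above
up = math.exp(sigma * math.sqrt(t))     # outcome after an up move
down = math.exp(-sigma * math.sqrt(t))  # outcome after a down move

approx = two_point_mean(sigma, t)
second_order = 1.0 + 0.5 * sigma ** 2 * t  # Taylor expansion up to order t
exact = math.exp(0.5 * sigma ** 2 * t)     # E[exp(sigma * W_t)] for normal W_t

print(f"up = {up:.6f}, down = {down:.7f}")
print(f"distance above 1: {up - 1:.6f}, distance below 1: {1 - down:.6f}")
print(f"two-point mean = {approx:.10f}")
print(f"1 + t*sigma^2/2 = {second_order:.10f}")
print(f"exp(t*sigma^2/2) = {exact:.10f}")
```

The up move lands further from $1$ than the down move, and the three expressions for the mean agree up to terms of higher order in $t$.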

EDIT:

The very short answer: because $W_t$ is symmetric around $0$ but $\exp(x)$ is not symmetric around $1$.

The convexity of the exponential function of the random variable $W$ makes its expectation greater than the exponential of the expectation of $W$. This is an example of Jensen's inequality, $E[e^{\sigma W}]> e^{\sigma E[W]}=1$. $\sigma$ can be interpreted as the magnitude of the convexity of the exponential function, as can be seen by Taylor expanding $e^{\sigma W}$ around $W=0$ up to the quadratic term. The convexity thus produces a drift that increases with $\sigma$. We know the expected growth factor should be $e^{rt}$. Therefore the factor in front should scale down the drift by the amount of convexity measured by $\sigma$.

From a probabilistic point of view, the "drift adjustment" comes into play so that the expected value of $S_t$ will be $e^{rt}$ rather than $e^{(r+0.5\sigma^2)t}$ (taking $S_0=1$).
The expected value of a log-normally distributed variable whose logarithm has mean $\mu t$ and variance $\sigma^2 t$ equals $e^{(\mu+0.5\sigma^2)t}$ (see the very detailed Wikipedia article). Thus by setting $\mu= r-0.5\sigma^2$ we arrive at $E[S_t]=e^{rt}$.

Now in most cases $r$ will denote the market risk free rate. Thus on average our stock will earn only that rate.

You can interpret the $-0.5\sigma^2$ as the volatility-dependent drift adjustment that ensures the risk neutrality of the process. Thus, judging by average returns, the investor won't care whether he is invested in the risk-free portfolio or in the market portfolio.

To pick up the comment on MSE: the discounted expected payoff is then $S_0$, and the discounted process $e^{-rt}S_t$ is a martingale. This further supports the view that the market setting created this way is fair.
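A quick Monte Carlo check makes the drift adjustment concrete. This is a minimal sketch with assumed, purely illustrative parameters ($S_0 = 100$, $r = 5\%$, $\sigma = 20\%$, $t = 1$) and a fixed seed: with drift $r - 0.5\sigma^2$ in the exponent, the sample mean of $S_t$ should sit near $S_0 e^{rt}$, the discounted mean near $S_0$, and the median near $S_0 e^{(r - 0.5\sigma^2)t}$.

```python
import math
import random


def simulate_terminal_prices(s0, r, sigma, t, n_paths, seed=0):
    """Draw S_t = s0 * exp((r - sigma^2/2) * t + sigma * sqrt(t) * Z), Z ~ N(0, 1)."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    return [s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0)) for _ in range(n_paths)]


# Illustrative parameters (assumptions, not taken from the answer).
s0, r, sigma, t = 100.0, 0.05, 0.2, 1.0
prices = simulate_terminal_prices(s0, r, sigma, t, n_paths=100_000)

mean_price = sum(prices) / len(prices)
discounted_mean = math.exp(-r * t) * mean_price
median_price = sorted(prices)[len(prices) // 2]

print(f"sample mean of S_t:      {mean_price:.2f}")
print(f"S_0 * exp(r t):          {s0 * math.exp(r * t):.2f}")
print(f"discounted sample mean:  {discounted_mean:.2f}")
print(f"sample median of S_t:    {median_price:.2f}")
print(f"S_0 * exp((r - s^2/2)t): {s0 * math.exp((r - 0.5 * sigma ** 2) * t):.2f}")
```

The mean grows at $r$ while the typical (median) path grows at the lower rate $r - 0.5\sigma^2$; the gap between the two is the convexity effect.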

There are a few good answers above that explain the technical differences between Brownian and geometric Brownian motion. I think it may still help to give a binomial model breakdown to get an intuitive feel.

Now you can convince yourself (using Martingales or properties of lognormal distributions) that setting $r=0$ in the original question is the equivalent of demanding $E[y]=y$. We find that $p=\frac{1}{1+e^s}$. If we feed this $p$ into $E[x]$ we find that it is not driftless. In fact taking the small $s$ limit by Taylor expanding we get
$$
\begin{eqnarray}
E[x] &=& x- \frac{1}{2}s^2 \\
Var(x) &=& s^2
\end{eqnarray}
$$
matching the continuous time results.
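The small-$s$ limit can be checked directly. The sketch below assumes a one-step setup that is inferred from the formulas rather than stated in the answer: $x$ moves to $x+s$ with probability $p$ and to $x-s$ with probability $1-p$, and $y = e^x$. Under that assumption, $E[y]=y$ means $p e^{s} + (1-p) e^{-s} = 1$, which gives $p=\frac{1}{1+e^s}$.

```python
import math


def martingale_up_probability(s):
    """Up-move probability p solving p*e^s + (1-p)*e^(-s) = 1 (y = e^x driftless)."""
    return 1.0 / (1.0 + math.exp(s))


def step_moments(s):
    """Mean and variance of the change in x over one step of size +/- s."""
    p = martingale_up_probability(s)
    mean = s * (2.0 * p - 1.0)           # p*(+s) + (1-p)*(-s)
    var = 4.0 * p * (1.0 - p) * s ** 2   # s^2 minus the squared mean
    return mean, var


s = 0.01
p = martingale_up_probability(s)
mean, var = step_moments(s)

print(f"p = {p:.6f}")
print(f"E[y'/y] = {p * math.exp(s) + (1 - p) * math.exp(-s):.12f}")
print(f"mean step = {mean:.3e}   vs  -s^2/2 = {-0.5 * s ** 2:.3e}")
print(f"variance  = {var:.3e}   vs   s^2   = {s ** 2:.3e}")
```

For small $s$ the mean step is close to $-\frac{1}{2}s^2$ and the step variance $4p(1-p)s^2$ is close to $s^2$, which is the discrete analogue of a drift of $-\frac{\sigma^2}{2}$ and a variance of $\sigma^2$ per unit time.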

Punchline: Since geometric Brownian motion corresponds to exponentiating a Brownian motion, if the former is driftless, the latter is not.

Relation to a puzzle

Well, this is not strictly a puzzle, but it may seem counterintuitive at first. Suppose we play a game where you have $X$ dollars and toss a fair coin: on heads your holding becomes $2X$, and on tails it becomes $\frac{1}{2}X$. How much is the game worth? Even though the tree is recombining, in that an equal number of heads and tails gets you back to $X$, the process is not a martingale, since
$$
0.5 \cdot 2X + 0.5 \cdot \frac{1}{2} X = \frac{5}{4} X,
$$
and so the price is $\frac{1}{4} X$.
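The arithmetic of the game fits in a few lines. This sketch reads the game as "wealth doubles on heads and halves on tails" (an interpretation consistent with the recombining tree); the function names are illustrative.

```python
import math


def one_round_outcomes(x):
    """(probability, wealth) pairs after one fair coin toss starting from x."""
    return [(0.5, 2.0 * x), (0.5, 0.5 * x)]


def expected_wealth(x):
    return sum(p * w for p, w in one_round_outcomes(x))


def expected_log_growth(x):
    return sum(p * math.log(w / x) for p, w in one_round_outcomes(x))


x = 100.0
print(f"expected wealth after one round: {expected_wealth(x)}")
print(f"expected gain (price of the game): {expected_wealth(x) - x}")
print(f"expected log growth: {expected_log_growth(x)}")
```

The log of wealth is a driftless symmetric walk (steps of $\pm\ln 2$), yet wealth itself has positive expected growth, which is the same effect as in the binomial model above.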

Were we doing physics and we said there was an arithmetic Brownian motion, we could indeed have a drift rate other than $\mu=r$ and it would make sense. Suppose, for example, that a fluid is moving at velocity $v$ and we have a random walk of a particle in it. This would be reality.

The whole point of using SDEs in finance is to identify what ought to be true in equilibrium. Where people go wrong in a lot of interpretation is saying something like this: "if the risk free rate is 5% for no risk, and I demand 10% return to take a risk of $\sigma$, then my drift, $\mu$ ought to be 10%..."

Put another way, we often say that such a person demands a 10% return for a particular risk ($\sigma$). Nature may demand a drift in a moving fluid, but saying an investor demands something of a stock is a bit twisted.

The whole idea of the solution to the SDE for asset pricing is to separate a proposed price process into a drift that is there with certainty and a random element. The first is the drift, $\mu$, and the only drift that is certain is the risk-free rate, $r$.

A better way to say what an investor thinks is to say that, in equilibrium, a stock is priced such that investors have found a price at which there is no selling or buying moving the price, and at which they (collectively) feel the stock is priced such that the expected return is appropriate for the expected risk.

Deep down this is what the risk-neutral measure really means. Under the risk-neutral measure, investors must believe that there is a 50% chance the stock will, in reality, deliver a return that adequately compensates for the actual risk, and a 50% chance of the opposite. More accurately, they are indifferent between holding the bond and the stock at equilibrium pricing taking into account their own estimates of the distribution payoff and their personal risk appetite. In reality this means collectively that the expected return is greater than $r$, though we don't know by how much or what the expected $\sigma$ is either. Due to risk aversion the actual expected return must be above $r$, but we work in risk-neutral space.

Yet one more vector on this is to say that, at equilibrium, an investor with 105 dollars arriving in one year who thinks the stock is fairly priced can either borrow 100 (at 5%) and buy the stock now with the debt paid off in one year, or enter into a forward contract to buy the stock at 105 in one year. These have identical risk and return outcomes. Such an investor believes there will be a return of over 5% this year (or else he would not do the trade) and in fact expects enough return to compensate for the risk. In other words, at the current price, he is risk-neutral (or perhaps a better term is risk-indifferent). Now, if he 'demands' a 20% return of the stock to feel this way he will be sorely disappointed if the stock only delivers a 2% return (or an 11% return or whatever). But, deep in his brain, he expects the stock to go up to (perhaps) 120.

Note that if he is happy with the discount that compensates for risk then he is risk-neutral at that point. Then, of course, it makes perfect sense that the drift is only the risk-free rate since this is the cost of funding. Our investor wants to take on the expected risk for the expected return premium, and the drift, as it were, has nothing to do with risk - only the cost of financing.

Many people find it easier to think of the bond as having a future (riskless) target price of 105, and then the stock to also have a 'target price' of 105 but to be priced, today, at a 'discount' to the bond at 80 (or so).

But, if a bond has a price trajectory of $e^{rt}$, then a stock must have a price trajectory, in the risk-neutral way, that has a future value well below 105 - something like 85 - so that, when we discount back to today using $r$, its spot price is something like the 80 it must be. And that is where the $-\sigma^2/2$ comes in...

There are some subtleties around what $E[e^{\mu X}]$ is but the real meat of the matter is the above. If we stay in risk-neutral space, which assumes there is adequate expected compensation for anticipated risk, then risk-neutral investors - who would hold the stock at a current price $S_0$ - need a (risk-neutral) price process that delivers a future value of about $S_0e^{rt}$. To get enough 'discount' so that the current price is around 80, in my example, we need to subtract something from $r$ and that something is $\sigma^2/2$. Try it out on a spreadsheet.

I think too much cleverness goes into risk-neutral measures and the like, when the answer is largely "because we need it to work out right".

Arithmetic Brownian Motion (BM) is a simple random walk. If we start at $X_0=0$ and the increments are independent, identically distributed normal increments, the distribution at time $t$ is $N(\mu t, t\sigma^2)$.

Now, if we used that as a model for stock prices, they could go negative. So the 'easy' dodge is to say let's use $X_t$ in a useful way, and say $S_t$ (stock price at time $t$) equals:
$$S_t=S_0e^{X_t}$$

thus defining a Geometric Brownian Motion (GBM). If you think about this, all it does is take the simple (arithmetic) random walk and transform it so that, at any time $t$, $S_t$ is just a mapping of $X_t$ to $e^{X_t}$: if $X_t$ is positive it gives a price $S_t$ that is above $S_0$, and the opposite if $X_t$ is negative. This is easiest if you imagine a realized path.

The problem is that $e^x$ is convex; so the negative points of $X_t$ get squeezed into the region $0 <S_t<S_0$, and all of the positive values of $X_t$ get 'expanded' into the region $S_t>S_0$ - which given the way $e^x$ acts, spreads out many points into a very large region.

When you work out all the math, you find that $e^{X_t}$ has a log-normal distribution. Its mean (or expected value) is $E[e^X] = e^{\mu +\frac{\sigma^2}{2}}$, which is larger than $e^{E[X]}=e^\mu$ by Jensen's inequality.
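For completeness, the log-normal mean follows from completing the square in the Gaussian integral. The following is a sketch for a single normal variable $X \sim N(\mu, \sigma^2)$ (that is, the one-period case; for time $t$ replace $\mu$ by $\mu t$ and $\sigma^2$ by $\sigma^2 t$):

$$
\begin{aligned}
E[e^X] &= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{x}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\, dx \\
&= e^{\mu + \frac{\sigma^2}{2}} \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu-\sigma^2)^2}{2\sigma^2}}\, dx \\
&= e^{\mu + \frac{\sigma^2}{2}},
\end{aligned}
$$

where the second line uses $x - \frac{(x-\mu)^2}{2\sigma^2} = -\frac{(x-\mu-\sigma^2)^2}{2\sigma^2} + \mu + \frac{\sigma^2}{2}$, and the remaining integral is that of a normal density and so equals $1$.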

So, to avoid having a disconnect with reality (and discrete-time compounding), $\sigma^2/2$ needs to get subtracted out so that the expected value at time $t$ is $e^{\mu t}$, the same as the value when $\sigma = 0$.

From another angle, if you solve the SDE, using Ito's lemma, etc., you get
$$S_t=S_0e^{(\mu -\frac{\sigma^2}{2})t+\sigma W_t}.$$
Now, again, $W_t$ is a mean-zero random walk. But $e^{\sigma W_t}$ has an expected value greater than one, again due to Jensen's inequality. The subtraction of $t\frac{\sigma^2}{2}$ takes away that bias.
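The Ito's lemma step can be sketched as follows, assuming the SDE in question is the usual GBM equation $dS_t = \mu S_t\, dt + \sigma S_t\, dW_t$ (the answer does not write it out). Applying Ito's lemma to $\ln S_t$ and using $(dW_t)^2 = dt$:

$$
\begin{aligned}
d(\ln S_t) &= \frac{1}{S_t}\, dS_t - \frac{1}{2 S_t^2}\, (dS_t)^2 \\
&= \mu\, dt + \sigma\, dW_t - \frac{1}{2}\sigma^2\, dt \\
&= \left(\mu - \frac{\sigma^2}{2}\right) dt + \sigma\, dW_t .
\end{aligned}
$$

Integrating from $0$ to $t$ gives $\ln S_t = \ln S_0 + \left(\mu - \frac{\sigma^2}{2}\right) t + \sigma W_t$, and since $E[e^{\sigma W_t}] = e^{\frac{\sigma^2}{2} t}$, the expectation is $E[S_t] = S_0 e^{\mu t}$.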

An answer without formulas (just right for the interview!): however large the drift of $dS_t$ is, once $S_t$ hits zero, it is stuck there forever, so the negative term in the price equation can be thought of as a way to keep an eye on this possibility. Disclaimer: of course, $S_t$ never hits zero, but can still spend a lot of time close to it if the variance is large, and it is way harder for a GBM to jump back to the top than drop to the bottom.

The intuitive explanation, without any math, is that volatility exerts a drag on the mean return (the drift $\mu$), which is why the term has a negative sign: if an asset goes down 50%, it needs to go up 100% to get back to its initial point.
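The 50% down, 100% up example can be spelled out as a comparison between the arithmetic and the compounded (geometric) average return. This is a small sketch; the helper names are illustrative.

```python
import math


def arithmetic_mean_return(returns):
    """Simple average of the per-period returns."""
    return sum(returns) / len(returns)


def geometric_mean_return(returns):
    """Constant per-period return that compounds to the same final wealth."""
    growth = math.prod(1.0 + r for r in returns)
    return growth ** (1.0 / len(returns)) - 1.0


returns = [-0.50, 1.00]  # down 50%, then up 100%
final_wealth = math.prod(1.0 + r for r in returns)

print(f"final wealth per unit invested: {final_wealth}")
print(f"arithmetic mean return: {arithmetic_mean_return(returns):.2%}")
print(f"geometric mean return:  {geometric_mean_return(returns):.2%}")
```

The average return is positive even though the investor ends up exactly where he started; the gap between the two averages grows with the size of the swings, which is the discrete counterpart of the $\frac{\sigma^2}{2}$ term.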