We know the x-values run from $-\infty$ to $+\infty$, but what are the y-values?
The normal distribution takes two parameters, $\mathcal{N}(\mu, \sigma^2)$, but what is the range of y?

Clearly $y>0$, and $y$ depends on the mean and variance you pick: $y=\frac{\exp(-z^2/2)}{\sqrt{2\pi\sigma^2}}$ with $z=(x-\mu)/\sigma$. But I have trouble understanding what it means. If I take the S&P 500 and difference the series (SPX-SPX(-1)), the histogram of the returns will be approximately normally distributed and will show the number of times I have had a return of -1%, -0.5%, 0%, 0.5%, 1%, etc. throughout its history. So is the "y" of the normal distribution the number of times I have had that x as a value? Should I think of the normal distribution, in practical terms, as the number of times a given point event has occurred? In some plots of normal distributions the y-axis ranges from 0 to 4; in others it ranges from 0 to 1, as a probability should. I know the area under the curve should be 1, but shouldn't the y-values always be less than 1?

2 Answers

You may be thinking of the cumulative distribution function, which takes on all values in the interval $(0,1)$. Or else you may be thinking of the (probability) density function
$$\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}},$$
the familiar "bell-shaped" curve.
This density function is positive, but not necessarily between $0$ and $1$. It reaches a maximum when $x=\mu$. The maximum value (what your post would call the maximum $y$-value) is $\dfrac{1}{\sqrt{2\pi}\sigma}$. The range of the density function is the interval $\left(0,\frac{1}{\sqrt{2\pi}\sigma}\right]$.

In particular, when $\sigma$ is small, the maximum value can be quite large: the density function reaches a sharp, high peak. If $\sigma$ is large, the density function, though still characteristically bell-shaped, is flat and low. The area under the density curve, and above the $x$-axis, is always $1$. So if the density function falls to nearly $0$ very quickly (small variance), it is intuitively clear that the curve must reach quite high.
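As a rough numerical check of the point above, here is a minimal Python sketch. The helper names `normal_pdf` and `area_under_pdf` are made up for this illustration, and the area is only a midpoint-rule approximation over $\mu \pm 10\sigma$. It prints the peak height $\frac{1}{\sqrt{2\pi}\sigma}$ and the approximate area for a few values of $\sigma$.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def area_under_pdf(mu, sigma, n=20000, width=10.0):
    """Midpoint-rule approximation of the area over mu +/- width*sigma."""
    a = mu - width * sigma
    b = mu + width * sigma
    h = (b - a) / n
    return h * sum(normal_pdf(a + (i + 0.5) * h, mu, sigma) for i in range(n))

# Small sigma gives a tall, narrow peak; large sigma gives a low, flat one.
for sigma in (0.1, 1.0, 5.0):
    peak = normal_pdf(0.0, 0.0, sigma)   # maximum is attained at x = mu
    area = area_under_pdf(0.0, sigma)    # should stay close to 1
    print(f"sigma={sigma}: peak={peak:.4f}, area={area:.6f}")
```

With $\sigma = 0.1$ the peak is about $3.99$, which would explain plots whose vertical axis runs from 0 to 4, while the area stays at $1$.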

Remark: Let $f(x)$ be our probability density function. Then for small $h$, the probability that our random variable lies between $x$ and $x+h$ is approximately $hf(x)$. In that sense, you can pick up a pretty good picture of $f(x)$ if you have a largish number of data points: the fraction of the observations that fall between $x$ and $x+h$, divided by $h$, is approximately $f(x)$.
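The remark can be tried out numerically. The sketch below is only an illustration under stated assumptions: a standard normal, an arbitrarily chosen point $x = 0.5$ and width $h = 0.05$, simulated data from Python's `random` module with a fixed seed, and hypothetical helper names. It compares $hf(x)$ with the exact interval probability, and compares a frequency-based estimate (count divided by number of observations and by bin width) with $f(x)$.

```python
import math
import random

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def empirical_density(samples, x, h):
    """Fraction of samples in [x, x + h), divided by the bin width h."""
    count = sum(1 for s in samples if x <= s < x + h)
    return count / (len(samples) * h)

x, h = 0.5, 0.05

# Probability of landing in [x, x + h]: exact value versus h * f(x).
exact = normal_cdf(x + h) - normal_cdf(x)
approx = h * normal_pdf(x)

# Simulated data: frequency / (number of observations * bin width).
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
estimate = empirical_density(samples, x, h)

print(f"exact probability = {exact:.5f}, h*f(x) = {approx:.5f}")
print(f"empirical density = {estimate:.3f}, f(x) = {normal_pdf(x):.3f}")
```

Dividing the count by the number of observations alone gives an estimate of the probability of the bin; dividing by the bin width as well puts the histogram on the same vertical scale as the density.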

I see, so the "Y" is unbounded. As the variance becomes infinitely small, the "Y" becomes infinitely large. But is there any interpretation of the "Y"? P(X=c)=Some Y Value. I know the area underneath this single point is 0, but does the "Y" value tell us something? OK, forget that the distribution is continuous, and suppose we have the possible X values as our X-scale and the frequency as our Y-scale. Should I divide the frequency of the event by the number of observations to get the probability of the event?
– gabriel, Oct 20 '12 at 1:46

I have added a little to the post. Hope it helps answer your question.
– André Nicolas, Oct 20 '12 at 1:53

To add some commentary, the "bell curve" shape is governed by the PDF, as @AndreNicolas pointed out. However, the actual "y"-value of this curve is itself more or less meaningless. The integral of the PDF $f$ gives the probability that your random variable is less than some value: $P(x < X) = \int_{-\infty}^X f(t)\,dt$. This is known as the CDF, or cumulative distribution function. By the fundamental theorem of calculus, the PDF is then the derivative of the CDF; that is, the PDF is the derivative of a function that returns a probability. So what is that intuitively? Honestly... it's not really anything. The "units" of the vertical axis in the PDF plot don't lead to anything intuitive; they are meaningful, but only in a derived, mathematical sense.
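To make the derivative relationship concrete, here is a small Python sketch. The function names are made up for the illustration, and $\mu = 0$, $\sigma = 0.1$ are arbitrary choices. It compares a central-difference slope of the normal CDF with the closed-form density at a few points.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def cdf_slope(x, mu=0.0, sigma=1.0, h=1e-5):
    """Central-difference approximation to the derivative of the CDF at x."""
    return (normal_cdf(x + h, mu, sigma) - normal_cdf(x - h, mu, sigma)) / (2.0 * h)

mu, sigma = 0.0, 0.1
for x in (-0.1, 0.0, 0.05):
    slope = cdf_slope(x, mu, sigma)
    density = normal_pdf(x, mu, sigma)
    print(f"x={x}: slope of CDF={slope:.4f}, pdf={density:.4f}")
```

The CDF values stay between 0 and 1, but nothing stops the CDF from rising steeply over a short stretch, so its slope (the density) can exceed 1.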

Some people wish to think that $f(X)$ is the probability that $x = X$, but this is untrue for continuous distributions ($P(x = X) = 0$). However, for the PDF's discrete analog, the Probability Mass Function (PMF), this statement is quite true.

But as a pdf, shouldn't the function always be less than 1? I mean, the area underneath the function should be 1, but if the pdf is a probability function, shouldn't that probability be less than or equal to 1?
– gabriel, Oct 20 '12 at 1:41

@gabriel The area under the pdf equals $1$. If the pdf value $f(x)$ exceeds $1$ for some and indeed many values of $x$, that is perfectly fine: but $f(x)$ cannot exceed $1$ for all $x$ in an interval $I$ of length exceeding $1$. If the latter condition were to hold, then $$\int_I f(x)\,\mathrm dx=\text{area under pdf in interval}~ I>1$$ in violation of the constraint that the total area is $1$. The value of $f(x)$ is not a probability. The units of $f(x)$ are probability per unit length and you must multiply by length (more generally, find an area) to get a probability.
– Dilip Sarwate, Oct 20 '12 at 2:09