Standard Deviation Formula

I would like to understand the theory behind the standard deviation formula. The way it was explained to me, you subtract the mean from each value and square the result to keep the negative deviations from canceling out the positive ones. After weighting by the appropriate frequencies and dividing, we take the square root of the sum to undo the effect of the squaring. This explanation doesn't satisfy me, because we could just as well have used absolute values to avoid the positive/negative problem, and that seems more natural. So why do we use this square-then-square-root method?
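For concreteness, here is a quick numerical sketch of the two candidates being compared, the root-mean-square deviation versus the mean absolute deviation (the dataset is made up purely for illustration):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]
mean = statistics.mean(data)   # 5.0

# Population standard deviation: square the deviations,
# average them, then take the square root
sd = statistics.pstdev(data)   # sqrt(32/8) = 2.0

# Mean absolute deviation: the "absolute value" alternative
mad = sum(abs(x - mean) for x in data) / len(data)   # 12/8 = 1.5

print(sd, mad)
```

Both numbers measure spread, but they generally disagree, so the choice of definition matters.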

It is, basically, measuring the distance from the data to the norm, just like [itex]\sqrt{(x- a)^2+ (y- b)^2+ (z- c)^2}[/itex] for distance in three-dimensional space.

Of course, you could measure "distance" by [itex]|x- a|+ |y- b|+ |z- c|[/itex], and you could calculate such a number for a set of data, but absolute value is not a very nice function: it is not differentiable at 0.

The most important reason for using the "root mean square" definition is that it is the one that shows up in the normal distribution.
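To make that claim concrete, the standard deviation [itex]\sigma[/itex] appears directly as the scale parameter of the normal density:

[tex]f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2/(2\sigma^2)}[/tex]

Note that the exponent involves the squared deviation [itex](x-\mu)^2[/itex], not the absolute deviation, which is why the root-mean-square definition is the natural match.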

How does it "show up" in the normal distribution? I know there is a relation (for example, about 68% of the data lies within one standard deviation of the mean), but why does there have to be a root mean square in order to have this correlation?