Monday, June 27, 2011

While skewness and kurtosis are not as often calculated and reported as mean and standard deviation, they can be useful at times. Skewness is the 3rd moment around the mean, and characterizes whether the distribution is symmetric (skewness=0). Kurtosis is a function of the 4th central moment, and characterizes peakedness, where the normal distribution has a value of 3 and smaller values correspond to thinner tails (less peakedness).

Some packages (including SAS) subtract three from the kurtosis, so that the normal distribution has a kurtosis of 0 (this is sometimes called excess kurtosis).
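For reference, one common set of explicit definitions is sketched below. It uses divisor n throughout and reports skewness in its standardized form; other packages and options apply small-sample adjustments, so treat this as one convention rather than the only one. The symbols m_k, x_i and x-bar are notation introduced here, not taken from the post.

```latex
m_k = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^k, \qquad
\text{skewness} = \frac{m_3}{m_2^{3/2}}, \qquad
\text{kurtosis} = \frac{m_4}{m_2^{2}}, \qquad
\text{excess kurtosis} = \frac{m_4}{m_2^{2}} - 3
```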

SAS

SAS includes much detail on the moments and other statistics in the output from proc univariate. As usual, the quantity of output can be off-putting for new users and students. Here we extract the moments we need with the ODS system. We also generate kernel density estimates roughly analogous to the densityplot() results shown above.

5 comments:

A reader pointed out that the "e1071" package provides three different flavors for each measure. The corresponding documentation discusses some of the properties and which statistical software prefers which version. This is another option to consider to calculate these quantities.

I probably also should have added that different estimates for these properties can be generated in SAS using the vardef option to the proc univariate statement. To get the results shown from the kurtosis() function in the moments package, use vardef=n.
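To make the difference between these estimates concrete, here is a minimal sketch in plain Python (chosen only for illustration; it is not the SAS or R code). It computes the divisor-n skewness and kurtosis, then a small-sample adjusted pair. The adjusted formulas are what I understand to be one of the three flavors described in the e1071 documentation; that mapping, the function names, and the data are all assumptions made for this example.

```python
import math


def central_moment(x, k):
    """k-th central moment of the sample x, using divisor n."""
    n = len(x)
    mean = sum(x) / n
    return sum((xi - mean) ** k for xi in x) / n


def skew_kurt_n(x):
    """Divisor-n skewness and (non-excess) kurtosis."""
    m2 = central_moment(x, 2)
    g1 = central_moment(x, 3) / m2 ** 1.5
    b2 = central_moment(x, 4) / m2 ** 2
    return g1, b2


def skew_kurt_adjusted(x):
    """Small-sample adjusted skewness and excess kurtosis (requires n > 3)."""
    n = len(x)
    g1, b2 = skew_kurt_n(x)
    g2 = b2 - 3.0  # excess kurtosis, divisor-n version
    big_g1 = g1 * math.sqrt(n * (n - 1)) / (n - 2)
    big_g2 = ((n + 1) * g2 + 6) * (n - 1) / ((n - 2) * (n - 3))
    return big_g1, big_g2


x = [1.0, 2.0, 3.0, 4.0, 10.0]  # made-up illustrative data
print("divisor n:", skew_kurt_n(x))
print("adjusted: ", skew_kurt_adjusted(x))
```

With only five observations the two sets of estimates differ noticeably, which is why it matters to know which version a given package reports.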

"Skewness is the 3rd moment around the mean, and characterizes whether the distribution is symmetric (skewness=0)."

A couple of issues:

First, (Pearson) skewness is a standardized third central moment (third central moment divided by the cube of the standard deviation - it doesn't change when you change from meters to feet, even though the third central moment does).

(followup) I realized just after clicking "post" that I shouldn't have called it Pearson skewness (since there are a bunch of things that get called that), but should have said moment-based skewness (rather less ambiguous).
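The point about units can be checked numerically. The following is a small Python sketch (not SAS or R code) with made-up data and hypothetical helper names: converting from meters to feet multiplies the raw third central moment by the cube of the conversion factor, while the standardized, moment-based skewness is unchanged up to floating-point rounding.

```python
def central_moment(x, k):
    """k-th central moment of the sample x, using divisor n."""
    n = len(x)
    mean = sum(x) / n
    return sum((xi - mean) ** k for xi in x) / n


def moment_skewness(x):
    """Third central moment divided by the cube of the (divisor-n) standard deviation."""
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5


FEET_PER_METER = 3.28084
lengths_m = [1.0, 2.0, 3.0, 4.0, 10.0]  # made-up data, in meters
lengths_ft = [FEET_PER_METER * v for v in lengths_m]

# The raw third central moment depends on the unit of measurement ...
m3_m = central_moment(lengths_m, 3)
m3_ft = central_moment(lengths_ft, 3)
print("ratio of third central moments:", m3_ft / m3_m)
print("cube of conversion factor:     ", FEET_PER_METER ** 3)

# ... but the standardized skewness does not.
print("skewness (m): ", moment_skewness(lengths_m))
print("skewness (ft):", moment_skewness(lengths_ft))
```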

Thanks for those clarifications. For what should be a straightforward topic, there's a lot of loose language used in discussing both skewness and kurtosis. Most likely we should use explicit formulae, rather than mere words, when precision about them seems important. And one hopes that an assertion of symmetry will be based on inspection of the pdf, not just the skewness, however calculated.
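As a small illustration of why zero skewness alone does not establish symmetry, here is a Python sketch with a made-up sample (the values and variable names are assumptions for this example only). The sample takes the value -2 four times, 1 five times and 3 once; its mean and third central moment are both zero, yet it is plainly not symmetric about its mean.

```python
# Made-up sample: -2 (x4), 1 (x5), 3 (x1)
sample = [-2] * 4 + [1] * 5 + [3] * 1

n = len(sample)
mean = sum(sample) / n
m2 = sum((v - mean) ** 2 for v in sample) / n
m3 = sum((v - mean) ** 3 for v in sample) / n
skewness = m3 / m2 ** 1.5  # divisor-n, moment-based skewness

# A sample is symmetric about its mean if reflecting every value
# through the mean reproduces the same set of values.
reflected = sorted(2 * mean - v for v in sample)
is_symmetric = reflected == sorted(sample)

print("mean:", mean)
print("skewness:", skewness)
print("symmetric about the mean:", is_symmetric)
```

The skewness comes out as zero while the symmetry check fails, so a plot of the distribution is still needed before claiming symmetry.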
