
Probability Plots

This section describes how to create probability plots in R, both for didactic purposes and for data analyses.

Probability Plots for Teaching and Demonstration

When I was a college professor teaching statistics, I used to have to draw normal distributions by hand. They always came out looking like bunny rabbits. What can I say?

R makes it easy to draw probability distributions and demonstrate statistical concepts. Some of the more common probability distributions available in R are given below.

distribution        R name      distribution        R name
Beta                beta        Lognormal           lnorm
Binomial            binom       Negative Binomial   nbinom
Cauchy              cauchy      Normal              norm
Chi-square          chisq       Poisson             pois
Exponential         exp         Student t           t
F                   f           Uniform             unif
Gamma               gamma       Tukey               tukey
Geometric           geom        Weibull             weibull
Hypergeometric      hyper       Wilcoxon            wilcox
Logistic            logis

For a comprehensive list, see Statistical Distributions on the R wiki. The functions available for each distribution follow this format:

name        description
dname( )    density or probability function
pname( )    cumulative distribution function
qname( )    quantile function
rname( )    random deviates

For example, pnorm(0) = 0.5 (the area under the standard normal curve to the left of zero), and qnorm(0.9) is approximately 1.28 (the 90th percentile of the standard normal distribution). rnorm(100) generates 100 random deviates from a standard normal distribution.

Each function has parameters specific to that distribution. For example, rnorm(100, mean=50, sd=10) generates 100 random deviates from a normal distribution with mean 50 and standard deviation 10.
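As a rough illustration of the four function types working together, here is a minimal sketch for a normal distribution with mean 50 and standard deviation 10. The variable names and the seed are arbitrary choices made for this example.

```r
# Illustrative parameters: normal distribution with mean 50 and sd 10
mu <- 50
sigma <- 10

# d: height of the density curve at the mean
dens_at_mean <- dnorm(mu, mean = mu, sd = sigma)

# p: cumulative probability P(X <= 60), one sd above the mean
p_below_60 <- pnorm(60, mean = mu, sd = sigma)

# q: the 90th percentile (the inverse of pnorm)
q90 <- qnorm(0.9, mean = mu, sd = sigma)

# r: 100 random deviates; set.seed() makes the draw reproducible
set.seed(1234)
x <- rnorm(100, mean = mu, sd = sigma)

print(c(density = dens_at_mean, p = p_below_60, q90 = q90))
summary(x)
```

Because qnorm( ) inverts pnorm( ), passing q90 back through pnorm( ) with the same parameters should return 0.9.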

You can use these functions to demonstrate various aspects of probability distributions. Two common examples are given below.

# Display the Student's t distributions with various degrees of freedom
# and compare to the normal distribution
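A minimal sketch of such a comparison follows. The grid of x values, the degrees of freedom (1, 3, 8, 30), the colors, and the titles are all assumptions made for illustration and are not taken from the text.

```r
# Grid of x values and degrees of freedom to compare (illustrative choices)
x <- seq(-4, 4, length.out = 200)
dfs <- c(1, 3, 8, 30)
colors <- c("red", "blue", "darkgreen", "gold")
labels <- c(paste("df =", dfs), "normal")

# Standard normal density as the dashed reference curve
plot(x, dnorm(x), type = "l", lty = 2, lwd = 2,
     xlab = "x value", ylab = "Density",
     main = "Comparison of t Distributions")

# Overlay one t density per degrees-of-freedom value
for (i in seq_along(dfs)) {
  lines(x, dt(x, df = dfs[i]), lwd = 2, col = colors[i])
}

# Legend: solid lines for the t curves, dashed for the normal
legend("topright", inset = 0.05, title = "Distributions",
       legend = labels, lwd = 2,
       lty = c(rep(1, length(dfs)), 2),
       col = c(colors, "black"))
```

The t curves have lower peaks and heavier tails than the normal curve, and they approach it as the degrees of freedom increase.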

Fitting Distributions

There are several methods of fitting distributions in R. Here are some options.

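You can use the qqnorm( ) function to create a Quantile-Quantile plot evaluating the fit of sample data to the normal distribution. More generally, the qqplot( ) function creates a Quantile-Quantile plot of one set of values against another, so it can be used with any theoretical distribution by supplying that distribution's quantiles as one of the two arguments.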
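A minimal sketch of both kinds of plot, using simulated data. The sample sizes, the seed, and the choice of a t distribution with 5 degrees of freedom are arbitrary assumptions for this example.

```r
set.seed(1234)
x <- rnorm(100, mean = 50, sd = 10)   # simulated sample, illustrative only

# Normal Q-Q plot with a reference line through the quartiles
qqnorm(x)
qqline(x)

# Q-Q plot against a t distribution with 5 df (an arbitrary choice)
y <- rt(200, df = 5)                  # simulated sample to evaluate
probs <- ppoints(length(y))           # evenly spaced probabilities
theo <- qt(probs, df = 5)             # matching theoretical quantiles
qqplot(theo, y,
       xlab = "Theoretical t(5) quantiles",
       ylab = "Sample quantiles")
abline(0, 1)                          # points near this line suggest a good fit
```

Points that fall close to the reference line suggest that the sample is consistent with the theoretical distribution; systematic curvature suggests a poor fit.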

The fitdistr() function in the MASS package provides maximum-likelihood fitting of univariate distributions. The format is fitdistr(x, densityfunction) where x is the sample data and densityfunction is one of the following: "beta", "cauchy", "chi-squared", "exponential", "f", "gamma", "geometric", "log-normal", "lognormal", "logistic", "negative binomial", "normal", "Poisson", "t" or "weibull".

# Estimate parameters assuming log-Normal distribution

# create some sample data
x <- rlnorm(100)

# estimate parameters
library(MASS)
fitdistr(x, "lognormal")

Finally, R has a wide range of goodness-of-fit tests for evaluating whether it is reasonable to assume that a random sample comes from a specified theoretical distribution. These include chi-square, Kolmogorov-Smirnov, and Anderson-Darling.
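As a hedged sketch, two of these tests can be run with base R functions, ks.test( ) and chisq.test( ). The simulated data, the hypothesized N(50, 10) distribution, and the four equal-probability bins are assumptions made for illustration. Anderson-Darling tests are not part of base R and are provided by add-on packages (for example, ad.test( ) in the nortest package), so they are omitted here.

```r
set.seed(1234)
x <- rnorm(200, mean = 50, sd = 10)   # simulated sample, illustrative only

# Kolmogorov-Smirnov test against a fully specified N(50, 10)
ks <- ks.test(x, "pnorm", mean = 50, sd = 10)

# Chi-square goodness-of-fit test: bin the data into four intervals
# that each have probability 0.25 under the hypothesized distribution
breaks <- qnorm(c(0, 0.25, 0.5, 0.75, 1), mean = 50, sd = 10)
observed <- as.vector(table(cut(x, breaks = breaks)))
chi <- chisq.test(observed, p = rep(0.25, 4))

print(ks$p.value)
print(chi$p.value)
```

Large p-values give no evidence against the hypothesized distribution. Note that the Kolmogorov-Smirnov p-value assumes the parameters were specified in advance rather than estimated from the same sample.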