Arguments

y

an optional numeric vector of data values: as with x,
non-finite values will be omitted.

alternative

a character string specifying the alternative
hypothesis; must be one of "two.sided" (default),
"greater" or "less". Only the initial letter needs to be
given.

mu

a number specifying an optional parameter used to form the
null hypothesis. See ‘Details’.

paired

a logical indicating whether you want a paired test.

exact

a logical indicating whether an exact p-value
should be computed.

correct

a logical indicating whether to apply continuity
correction in the normal approximation for the p-value.

conf.int

a logical indicating whether a confidence interval
should be computed.

conf.level

confidence level of the interval.

formula

a formula of the form lhs ~ rhs where lhs
is a numeric variable giving the data values and rhs a factor
with two levels giving the corresponding groups.

data

an optional matrix or data frame (or similar: see
model.frame) containing the variables in formula.
By default the variables are taken from
environment(formula).

subset

an optional vector specifying a subset of observations
to be used.

na.action

a function which indicates what should happen when
the data contain NAs. Defaults to
getOption("na.action").

...

further arguments to be passed to or from methods.

Details

The formula interface is only applicable for the 2-sample tests.
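As a minimal sketch of the formula interface, the following compares it with the default interface on the same two groups. The data frame d and its columns value and group are hypothetical and exist only for illustration.

```r
# Hypothetical data: ten observations in two groups of five.
d <- data.frame(
  value = c(1.1, 2.3, 3.8, 4.6, 5.2, 2.9, 4.1, 6.4, 7.7, 8.5),
  group = factor(rep(c("a", "b"), each = 5))
)

# Formula interface: lhs is the response, rhs a two-level factor.
res_formula <- wilcox.test(value ~ group, data = d)

# Default interface on the same data; the first factor level ("a")
# plays the role of x and the second ("b") that of y.
res_default <- wilcox.test(d$value[d$group == "a"],
                           d$value[d$group == "b"])

print(res_formula$statistic)
print(res_default$statistic)
```

Both calls should report the same statistic and p-value, since the formula method only splits the response by group before running the two-sample test.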

If only x is given, or if both x and y are given
and paired is TRUE, a Wilcoxon signed rank test is performed. The
null hypothesis is that the distribution of x (in the one sample
case) or of x - y (in the paired two sample case) is symmetric
about mu.
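The signed rank cases can be sketched as follows. The vectors before and after are made-up measurements, chosen so that the differences contain no zeros and no tied absolute values.

```r
# Hypothetical paired measurements (no zero or tied differences).
before <- c(12.1, 9.8, 11.4, 10.6, 13.3, 8.9)
after  <- c(10.9, 9.1, 11.9, 8.8, 11.2, 8.6)

# Paired two-sample case: tests symmetry of before - after about mu = 0.
res_paired <- wilcox.test(before, after, paired = TRUE)

# The same test written as a one-sample test on the differences.
res_diff <- wilcox.test(before - after)

# One-sample case with a non-zero mu: symmetry of 'before' about 10.
res_one <- wilcox.test(before, mu = 10)

# The statistic V is the sum of the ranks of |d| over positive d.
d <- before - after
v_manual <- sum(rank(abs(d))[d > 0])

print(c(paired = unname(res_paired$statistic),
        diff   = unname(res_diff$statistic),
        manual = v_manual))
```

The three printed values should agree, which illustrates that the paired test is the one-sample test applied to x - y.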

Otherwise, if both x and y are given and paired
is FALSE, a Wilcoxon rank sum test (equivalent to the
Mann-Whitney test: see the Note) is carried out. In this case, the
null hypothesis is that the distributions of x and y
differ by a location shift of mu and the alternative is that
they differ by some other location shift (and the one-sided
alternative "greater" is that x is shifted to the right
of y).
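To illustrate the role of mu in the rank sum test, the sketch below uses hypothetical tie-free samples and checks that testing a shift of mu gives the same result as subtracting mu from x first.

```r
# Hypothetical samples with no ties, also after shifting x by 1.
x <- c(5.1, 6.3, 7.8, 9.4)
y <- c(2.2, 3.9, 4.5, 6.0, 8.1)

# Null hypothesis: x and y differ by a location shift of mu = 1.
res_mu <- wilcox.test(x, y, mu = 1)

# The same test after removing the hypothesised shift from x by hand.
res_shifted <- wilcox.test(x - 1, y)

# One-sided alternative: x is shifted to the right of y by more than mu.
res_greater <- wilcox.test(x, y, mu = 1, alternative = "greater")

print(c(mu = unname(res_mu$statistic),
        shifted = unname(res_shifted$statistic)))
print(c(two.sided = res_mu$p.value, greater = res_greater$p.value))
```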

By default (if exact is not specified), an exact p-value
is computed if the samples contain fewer than 50 finite values and
there are no ties. Otherwise, a normal approximation is used.
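The default choice can be checked on a small example. The samples below are hypothetical, tie-free and well under 50 values, so the default is expected to match exact = TRUE and to differ from the normal approximation.

```r
# Hypothetical small samples without ties.
x <- c(1.1, 2.3, 3.8, 4.6, 5.2)
y <- c(2.9, 4.1, 6.4, 7.7, 8.5)

p_default <- wilcox.test(x, y)$p.value                 # exact not specified
p_exact   <- wilcox.test(x, y, exact = TRUE)$p.value   # exact p-value
p_normal  <- wilcox.test(x, y, exact = FALSE)$p.value  # normal approximation

print(c(default = p_default, exact = p_exact, normal = p_normal))
```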

Optionally (if argument conf.int is TRUE), a nonparametric
confidence interval is computed, together with an estimator for the
pseudomedian (one-sample case) or for the difference of the location
parameters x-y (two-sample case). (The pseudomedian of a distribution F is the median
of the distribution of (u+v)/2, where u and v are
independent, each with distribution F. If F is symmetric,
then the pseudomedian and median coincide. See Hollander & Wolfe
(1973), page 34.) Note that in the two-sample case the estimator for
the difference in location parameters does not estimate the
difference in medians (a common misconception) but rather the median
of the difference between a sample from x and a sample from
y.
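The distinction can be made concrete with a small sketch. The samples are hypothetical and tie-free, so the exact method applies, and the reported estimate is compared with the median of all pairwise differences and with the difference in sample medians.

```r
# Hypothetical tie-free samples.
x <- c(5.1, 6.3, 7.8, 9.4)
y <- c(2.2, 3.9, 4.5, 6.0, 8.1)

res <- wilcox.test(x, y, conf.int = TRUE)

# Median of all pairwise differences x[i] - y[j].
est_pairwise <- median(outer(x, y, "-"))

# Difference in sample medians, which is a different quantity.
diff_medians <- median(x) - median(y)

print(c(reported     = unname(res$estimate),
        pairwise     = est_pairwise,
        diff_medians = diff_medians))
```

For these data the reported estimate should agree with the pairwise median and not with the difference in medians.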

If exact p-values are available, an exact confidence interval is
obtained by the algorithm described in Bauer (1972), and the
Hodges-Lehmann estimator is employed. Otherwise, the returned
confidence interval and point estimate are based on normal
approximations. These are continuity-corrected for the interval but
not the estimate (as the correction depends on the
alternative).
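For the one-sample exact case, the Hodges-Lehmann estimator is the median of the Walsh averages (z[i] + z[j]) / 2 for i <= j. The sketch below computes it by hand on a hypothetical tie-free sample z and compares it with the reported estimate.

```r
# Hypothetical sample with no zeros and no tied absolute values.
z <- c(1.2, 0.7, -0.5, 1.8, 2.1, 0.3, 0.9, -1.1)

res <- wilcox.test(z, conf.int = TRUE)

# Walsh averages: (z[i] + z[j]) / 2 for all i <= j.
walsh <- outer(z, z, "+") / 2
walsh <- walsh[upper.tri(walsh, diag = TRUE)]

# Hodges-Lehmann estimate of the pseudomedian.
est_hl <- median(walsh)

print(c(reported = unname(res$estimate), by_hand = est_hl))
print(res$conf.int)
```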

With small samples it may not be possible to achieve very high
confidence interval coverages. If this happens a warning will be given
and an interval with lower coverage will be substituted.

Value

estimate

an estimate of the location parameter.
(Only present if argument conf.int = TRUE.)

Warning

This function can use large amounts of memory and stack space (and
may even crash R if the stack limit is exceeded) if exact = TRUE and
one sample is large (several thousand observations or more).

Note

The literature is not unanimous about the definitions of the Wilcoxon
rank sum and Mann-Whitney tests. The two most common definitions
correspond to the sum of the ranks of the first sample with the
minimum value subtracted or not: R subtracts and S-PLUS does not,
giving a value which is larger by m(m+1)/2 for a first sample
of size m. (It seems Wilcoxon's original paper used the
unadjusted sum of the ranks but subsequent tables subtracted the
minimum.)

R's value can also be computed as the number of all pairs
(x[i], y[j]) for which y[j] is not greater than
x[i], the most common definition of the Mann-Whitney test.
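Both definitions can be checked by hand. The sketch below uses hypothetical tie-free samples and computes the statistic from the rank sum of the first sample with m(m+1)/2 subtracted, and as a count of pairs.

```r
# Hypothetical tie-free samples; m is the size of the first sample.
x <- c(5.1, 6.3, 7.8, 9.4)
y <- c(2.2, 3.9, 4.5, 6.0, 8.1)
m <- length(x)

# Rank sum of the first sample, with its minimum value m(m+1)/2 subtracted.
rank_sum <- sum(rank(c(x, y))[seq_len(m)])
w_ranks  <- rank_sum - m * (m + 1) / 2

# Number of pairs (x[i], y[j]) with y[j] not greater than x[i].
w_pairs <- sum(outer(x, y, ">="))

# Statistic as reported by wilcox.test.
w_test <- unname(wilcox.test(x, y)$statistic)

print(c(rank_sum = rank_sum, w_ranks = w_ranks,
        w_pairs = w_pairs, w_test = w_test))
```

Here the unadjusted rank sum exceeds the reported statistic by m(m+1)/2 = 10.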