t-tests in R

The t.test() function in R conducts a t-test and produces a corresponding confidence interval. It’s a bit of a multipurpose function: it handles one-sample t-tests (which we saw last week), paired t-tests (a special case of a one-sample t-test) and two-sample t-tests (which we will see soon).

To do a one-sample t-test you simply pass in the sample values as the first argument x. The arguments alternative (default is "two.sided"), mu (default is 0) and conf.level (default is 0.95) control the direction of the alternative hypothesis, the null hypothesized value and the confidence level for the CI respectively. Take a look at ?t.test for more info.

As an example, let’s look at a data set, friday, from the package openintro, that compares traffic accidents on Friday the 13th to those on Friday the 6th of the same month:

?friday

A hypothesis might be that accidents on the two Fridays should, on average, be the same. That is, the differences between the number of accidents on the 13th and those on the 6th have a population mean of zero.

Let’s grab just those rows that correspond to the traffic accidents and pull out the column of differences (more on filter() and pull() next lab):

library(openintro)
library(dplyr)

diffs <- friday %>%
  filter(type == "accident") %>%
  pull(diff)

Then conduct a t-test with the null hypothesis that the mean difference is zero, \(H_0: \mu = 0\), versus the two-sided alternative:

t.test(diffs)

Try writing a statistical summary based on these results.

(In Homework 3 you are asked to do a t-test. You may either do the calculations yourself from the summary statistics, or use t.test(). I’d recommend doing it both ways: you get practice doing it from summary stats, and you can check your work with t.test().)

The behaviour of p-values

We’ve talked about the performance of tests in terms of their rejection rates:

If the null hypothesis is true, we want the rejection rate to be close to the stated significance level \(\alpha\).

If the alternative is true, we want the rejection rate to be high (i.e. high power) and increase with sample size.

You’ve investigated these by simulating samples, calculating the test statistic and comparing the simulated test statistics to a critical value for a specific level, \(\alpha\). Alternatively, you could simulate samples, get the p-value for the test, and then compare the simulated p-values to any significance level \(\alpha\).

Let’s explore with the t-test. We’ll simulate our samples for a case where we know the t-test is exact: a Normal(0, 1) population and a sample size of 30.
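One way to set this up is a sketch along these lines (the number of replicates, the seed, and the use of replicate() are my own choices, not prescribed here):

```r
# Simulate 1000 samples of size 30 from a Normal(0, 1) population,
# run a one-sample t-test of H0: mu = 0 on each, and keep the p-values
set.seed(1234)  # for reproducibility
p_values <- replicate(1000, t.test(rnorm(30, mean = 0, sd = 1))$p.value)

# Proportion of simulated p-values below 0.05
mean(p_values < 0.05)

# A histogram of the p-values should look roughly flat
hist(p_values)
```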

What proportion of the simulated samples would you expect to have a p-value less than 0.05? Why?

What proportion of the simulated samples would you expect to have a p-value less than 0.10? Why?

What proportion of the simulated samples would you expect to have a p-value less than \(\alpha\)? Why?

For an exact test, under the null hypothesis the distribution of p-values should be Uniform(0, 1).

Try simulating p-values when the samples come from an Exponential(1) population, with a small sample size, \(n = 5\). Does the test look exact?
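A possible set-up for this exercise (again, the replicate count and seed are my own choices) tests the true mean, since an Exponential(1) population has mean 1:

```r
# Simulate 1000 samples of size 5 from an Exponential(1) population.
# The population mean is 1, so the true null hypothesis is H0: mu = 1.
set.seed(5678)
p_values_exp <- replicate(1000, t.test(rexp(5, rate = 1), mu = 1)$p.value)

# If the test were exact, this would be close to 0.05
mean(p_values_exp < 0.05)

# And this histogram would look roughly flat
hist(p_values_exp)
```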

Power and sample size

Recall that the power of the Z-test depends on the sample size, \(n\), the difference between the hypothesized mean and the true mean (under the alternative), \(\mu_0 - \mu_A\), the population variance, \(\sigma^2\), and the significance level, \(\alpha\).

The power for the t-test depends on the same quantities but in a slightly more complicated way. Luckily, there’s an R function that will calculate it for you: power.t.test().

For example, to find the power for a specific alternative, \(\mu = \mu_A\), for a two-sided t-test of \(H_0: \mu = \mu_0\), we could use:
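A minimal sketch (the specific values of delta, sd and n here are illustrative, not part of the lab):

```r
# Power of a two-sided one-sample t-test of H0: mu = mu_0,
# when the true mean differs from mu_0 by delta = mu_A - mu_0
power.t.test(n = 30, delta = 0.5, sd = 1, sig.level = 0.05,
             type = "one.sample", alternative = "two.sided")
```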

(The tidy() function comes from the broom package, and will convert a lot of statistical objects into a ‘tidy’ data frame.)
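For instance, assuming the diffs vector from earlier and that broom is installed, tidy() turns the output of t.test() into a one-row data frame with columns for the estimate, test statistic, p-value and confidence interval:

```r
library(broom)

# A 'tidy' one-row summary of the t-test result
tidy(t.test(diffs))
```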

As the sample size increases, the power increases to an asymptote of 1.

For \(n = 30\) explore how the power depends on delta.

For \(n = 30\) explore how the power depends on the population standard deviation.

For \(n = 30\) explore how the power depends on the significance level.

In power.t.test(), if you specify the power argument as a desired power and leave one of the other arguments unspecified, it will return the value required to achieve the desired power. For example, if we want to detect a difference of 0.1 with power 0.8, with the same population standard deviation and level of test:
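Sketched with illustrative values for sd and sig.level (here sd = 1 and \(\alpha\) = 0.05 stand in for the "same" values from before):

```r
# Leave n unspecified and supply power; power.t.test() solves for
# the sample size needed to detect delta = 0.1 with power 0.8
power.t.test(delta = 0.1, sd = 1, sig.level = 0.05, power = 0.8,
             type = "one.sample", alternative = "two.sided")
```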