In the [[AP_Statistics_Curriculum_2007_Infer_2Means_Dep | previous section we discussed the inference on two paired random samples]]. Now, we show how to do inference on two independent samples.


===Independent Samples Designs===

Independent samples designs refer to experimental or observational designs in which all measurements are independent of each other within their groups and the groups themselves are independent. The groups may be drawn from different populations with different distribution characteristics.


|}

|}

</center>

</center>


===Effect-Sizes===

Effect size refers to a statistic calculated from a sample of data; effect sizes are most commonly reported with their corresponding standard errors. Sample-driven estimates of effect sizes differ from the test statistics used in hypothesis testing: the former evaluate the strength of an apparent relationship, whereas the latter assign a significance level reflecting whether the relationship could be due to chance. The effect size does not determine the significance level, or vice versa. Given a large enough sample size, a statistical comparison will almost always show a significant difference unless the population effect size is exactly zero. For example, a sample Pearson correlation coefficient of 0.1 would be strongly statistically significant when the sample size is 1000, but would not be significant if the sample size is just 10. Reporting only the significant p-value from this analysis could be misleading if a correlation of 0.1 is too small to be of interest in a particular application.
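The correlation example can be checked numerically. The following is a minimal sketch that assumes Fisher's z-transform with a Normal approximation (a swapped-in technique, not necessarily the test the text has in mind); the function name <code>correlation_p_value</code> is hypothetical.

```python
import math

def correlation_p_value(r, n):
    """Approximate two-sided p-value for H0: rho = 0 (Fisher z, Normal approximation)."""
    # Assumption: atanh(r) * sqrt(n - 3) is approximately N(0, 1) under H0.
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) equals the two-sided standard Normal tail area.
    return math.erfc(abs(z) / math.sqrt(2))

# Same effect size (r = 0.1), two very different sample sizes.
p_large = correlation_p_value(0.1, 1000)
p_small = correlation_p_value(0.1, 10)
print(f"n = 1000: p = {p_large:.4f}")
print(f"n = 10:   p = {p_small:.4f}")
```

The effect size is identical in both calls; only the sample size changes, which is why the p-value alone says little about the strength of the relationship.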


===Cohen's d===

Cohen's ''d'' is the normalized difference between two means, i.e., the difference of the means divided by a standard deviation for the data:

: \(d = \frac{\bar{x}_1 - \bar{x}_2}{s}\).


Cohen's ''d'' is frequently used in estimating sample sizes.

* A lower Cohen's ''d'' indicates the need for larger sample sizes, i.e., larger cohorts are required to detect between-group differences in the observed data.

* The standard deviation \(s\) is the pooled sample standard deviation, which estimates the common population standard deviation (assuming the [[AP_Statistics_Curriculum_2007_Infer_BiVar#Comparing_Two_Variances_.28.5C.28.5Csigma_1.5E2_.3D_.5Csigma_2.5E2.5C.29.3F.29|groups have approximately the same variances]]):

: \(s = \sqrt{\frac{(n_1-1)s^2_1 + (n_2-1)s^2_2}{n_1+n_2-2}}\),

: where \(s_1^2 = \frac{1}{n_1-1} \sum_{i=1}^{n_1} (x_{1,i} - \bar{x}_1)^2\) is the sample variance of the first group, and \(s_2^2\) is defined analogously. A variant of \(s\) that divides by \(n_1+n_2\) rather than \(n_1+n_2-2\) is termed the maximum likelihood estimator of "Cohen's ''d''", and it is related to [http://en.wikipedia.org/wiki/Effect_size#Hedges.27s_g Hedges's g] by a scaling factor.
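As an illustration, Cohen's ''d'' with the pooled standard deviation can be sketched as follows. The data values and the function names <code>cohens_d</code> and <code>n_per_group</code> are hypothetical. The sample-size rule \(n \approx 2(z_{1-\alpha/2} + z_{power})^2 / d^2\) per group is a common Normal-approximation shortcut assumed here for illustration; it is not taken from the text above.

```python
import math
from statistics import NormalDist, mean, variance

def cohens_d(x1, x2):
    """Difference of means divided by the pooled sample standard deviation."""
    n1, n2 = len(x1), len(x2)
    # statistics.variance uses the (n - 1) denominator, matching s_1^2 and s_2^2.
    pooled_var = ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)
    return (mean(x1) - mean(x2)) / math.sqrt(pooled_var)

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group sample size for a two-sided, two-sample comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)

# Hypothetical measurements, for illustration only.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0]
group_b = [4.2, 4.8, 5.0, 4.5, 4.7]
print("d =", round(cohens_d(group_a, group_b), 3))

# Smaller effect sizes call for larger cohorts.
for d in (0.8, 0.5, 0.2):
    print(f"d = {d}: about {n_per_group(d)} subjects per group")
```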


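Recall that for a random sample \(\{x_1, x_2, \ldots, x_n\}\) of the process, the population mean may be estimated by the sample average, \(\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i\).

The standard error of \(\bar{x}\) is given by \(SE(\bar{x}) = \frac{s}{\sqrt{n}}\), where \(s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}\) is the sample standard deviation.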
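A minimal sketch of these two estimates, assuming the usual standard error estimate (sample standard deviation divided by \(\sqrt{n}\)); the function name <code>mean_and_se</code> and the data are hypothetical.

```python
import math
from statistics import mean, stdev

def mean_and_se(sample):
    """Return the sample average and its estimated standard error."""
    n = len(sample)
    # statistics.stdev uses the (n - 1) denominator.
    return mean(sample), stdev(sample) / math.sqrt(n)

# Hypothetical observations, for illustration only.
m, se = mean_and_se([2, 4, 4, 4, 5, 5, 7, 9])
print(f"mean = {m}, SE = {se:.4f}")
```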

===Analysis Protocol for Independent Designs===

To study independent samples, we would like to examine the differences between two group means. Suppose \(\{x_{1,1}, x_{1,2}, \ldots, x_{1,n_1}\}\) and \(\{x_{2,1}, x_{2,2}, \ldots, x_{2,n_2}\}\) represent the two independent samples. Then we want to study the differences of the two group means relative to the internal sample variations. If the two samples were drawn from populations that had different centers, then we would expect that the two sample averages will be distinct.

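The degrees of freedom are: \(df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1-1} + \frac{(s_2^2/n_2)^2}{n_2-1}}\). Always round down the degrees of freedom to the next smaller integer. Also, \(t_{(df, \alpha/2)}\) is the critical value for a Student's T distribution with \(df\) degrees of freedom at \(\alpha/2\).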
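The protocol can be sketched in code. This is an illustration under stated assumptions: the degrees of freedom use the Welch-Satterthwaite approximation (rounded down, as described), the test statistic is the difference of sample means divided by its estimated standard error, and the function name <code>welch_t</code> and the data are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(x, y, mu0=0.0):
    """Two-sample T-statistic and rounded-down Welch-Satterthwaite df."""
    n1, n2 = len(x), len(y)
    # Squared standard errors of the two sample means.
    v1, v2 = variance(x) / n1, variance(y) / n2
    t = (mean(x) - mean(y) - mu0) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, math.floor(df)  # always round df down

# Hypothetical samples, for illustration only.
t_stat, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"T = {t_stat:.4f}, df = {df}")
```

Rounding down is the conservative choice: fewer degrees of freedom give heavier tails and therefore a larger critical value.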

===Example===

Nine observations of surface soil pH were made at each of two different locations. Do the data suggest that the true mean soil pH values differ for the two locations? Formulate testable hypotheses and make inference about the effect of the treatment at α = 0.05. Check any necessary assumptions for the validity of your test.

====Inference====

Test statistics: We can use the sample summary statistics to compute the degrees of freedom and the T-statistic.

Rounding the computed degrees of freedom down gives df = 11, and the observed T-statistic is \(T_o = 5.829\).

p-value = 2 × P(T(df = 11) > |To| = 5.829) ≈ 0.0001 for this two-sided test. Therefore, we can reject the null hypothesis at α = 0.05. The area in the tails of the T(df=11) distribution depicts graphically the probability of interest, which represents the strength of the evidence (in the data) against the null hypothesis. In this case, this area is approximately 0.0001, which is much smaller than the preset Type I error rate α = 0.05, so we reject the null hypothesis.
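The tail area can be verified without statistical tables. The sketch below integrates the Student's T density numerically with Simpson's rule, using only the Python standard library; the function names <code>t_pdf</code> and <code>t_two_sided_p</code> are hypothetical, and a statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of the Student's T distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_two_sided_p(t_obs, df, steps=20000):
    """Two-sided p-value: 1 minus the area under the density on [-|t|, |t|]."""
    a = abs(t_obs)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3  # the density is symmetric about 0
    return 1 - central

p_value = t_two_sided_p(5.829, 11)
print(f"two-sided p-value = {p_value:.6f}")
print("reject H0 at alpha = 0.05:", p_value < 0.05)
```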

====Conclusion====

These data show a statistically significant difference between the mean pH values at Location 1 and Location 2 (p < 0.001).

====Independent T-test Validity====

Both the confidence intervals and the hypothesis testing methods in the independent-sample design require that both samples come from Normally distributed populations. If the sample sizes are large (say >50), Normality is not as critical, as the CLT implies that the sampling distributions of the means are approximately Normal. If these parametric assumptions are invalid, we must use a non-parametric (distribution-free) test, even if the latter is less powerful.

The plots below indicate that Normal assumptions are not unreasonable for these data, and hence we may be justified in using the two independent sample T-tests in this case.

===Detailed specifications for independent-sample design studies===

There are several different situations that arise in studies involving independent samples inference. These cases are separated by whether the sample sizes are equal and whether the variances are (approximately) equal (i.e., \(\sigma_1^2 = \sigma_2^2\)). The table below illustrates the differences in the statistical inference in each of these situations. This table uses the following notation:
