
Two Samples Hypothesis Testing
Introduction
• In a previous learning module, we discussed how to perform hypothesis tests for a single variable x.
• Here, we extend the concept of hypothesis testing to the comparison of two variables xA and xB.
Two Samples Hypothesis Testing when n is the same for the two Samples
Two-tailed paired samples hypothesis test:
• In engineering analysis, we often want to test whether some modification to a system causes a statistically
significant change to the system (the system is either improved or made worse).
• We conduct some experiments in which the sample mean $\bar{x}_A$ of sample A (without the modification) is
indeed different than the sample mean $\bar{x}_B$ of sample B (with the modification). In other words, the
modification appears to have led to a change, but is the change statistically significant?
• Here we discuss the simplest such statistical test – a test of whether one sample of data has a significantly
different predicted population mean compared to a second sample of data, and with the number of data points
n being the same in the two samples.
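• Statisticians refer to this case as a paired samples hypothesis test: each data point in sample A is paired
with a corresponding data point in sample B, so n is necessarily the same in the two samples.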
• The procedure is very similar to the single-sample hypothesis tests we have already discussed, except that we
replace variable x by the difference between the two variables, $\delta = x_B - x_A$.
• In a two-tailed paired-samples hypothesis test, we want to know whether there is a statistically significant
change in the predicted population means of the two samples. We don’t care if the change is positive or
negative in a two-tailed hypothesis test – we are concerned only about whether there is a change.
• From the definition of variable $\delta$, we see that an appropriate null hypothesis is $\mu_\delta = 0$, where
$\mu_\delta$ is the population mean of $\delta$, i.e., there is no change in the population mean between the two
samples (the least likely scenario). Thus, we set: [This is a two-tailed hypothesis test.]
o Null hypothesis: Critical value is $\mu_0 = 0$; the least likely scenario is $\mu_\delta = \mu_0$ (there is no
statistically significant change in the population means). [This is the least likely scenario since
$\bar{x}_A \neq \bar{x}_B$.]
o Alternative hypothesis: (opposite of the null hypothesis), $\mu_\delta \neq \mu_0$. In other words, either
$\mu_\delta < \mu_0$ or $\mu_\delta > \mu_0$ (there is a statistically significant change in the population means).
[This is the most likely scenario since $\bar{x}_A \neq \bar{x}_B$.]
• The critical t-statistic is calculated as previously, but using the sample mean of $\delta$ instead of x, and the
sample standard deviation of $\delta$ instead of x, i.e., $t = \dfrac{\bar{\delta} - \mu_0}{S_\delta / \sqrt{n}}$.
• The corresponding p-value is calculated as previously, based on the critical t-statistic. In this case we are
considering a two-tailed hypothesis test. p is calculated in Excel using the function TDIST(ABS(t),df,2), where
df is the number of degrees of freedom, df = n – 1, and the “2” specifies two tails. Newer versions of Excel
also provide the equivalent function T.DIST.2T(ABS(t),df).
• If Excel is not available, we can use tables; some modern calculators can also calculate the p-value.
• We formulate our conclusions (to 95% confidence level) based on the p-value:
o If p < 0.05, we reject the null hypothesis because the least likely scenario ($\mu_\delta = \mu_0$) has less than a 5%
chance of being true. Thus, we can state confidently that there is a statistically significant change in the
population mean of the variable, i.e., $\mu_A \neq \mu_B$.
o If 0.05 < p < 0.95, we cannot reject or accept the null hypothesis because the least likely scenario
($\mu_\delta = \mu_0$) has more than a 5% chance of being true, but less than a 95% chance of being true. The results
are therefore inconclusive – we should conduct more tests.
o If p > 0.95, we accept the null hypothesis because what we set as the least likely scenario ($\mu_\delta = \mu_0$)
turns out to have more than a 95% chance of being true. Thus, we can state confidently that there is no
statistically significant change in the population mean of the variable, i.e., $\mu_A = \mu_B$.
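The two-tailed procedure above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original notes: the function names (t_pdf, two_tailed_p, paired_two_tailed, conclusion) and the sample data are hypothetical, and the sign convention δ = x_B − x_A is an assumption (for a two-tailed test the opposite convention gives the same p-value). To stay dependency-free, the p-value is obtained by numerically integrating the Student's t density instead of calling Excel's TDIST; if SciPy is available, scipy.stats.ttest_rel should return the same t and p.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: total area under the t density outside [-|t|, |t|].

    The area between 0 and |t| is found with Simpson's rule (steps must be even);
    by symmetry the two tails together hold 1 - 2 * that area.
    Intended for small samples: math.gamma overflows for very large df.
    """
    a = abs(t)
    if a == 0.0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Clamp at zero to guard against tiny negative values from round-off.
    return max(0.0, 1.0 - 2.0 * (total * h / 3))


def paired_two_tailed(sample_a, sample_b, mu_0=0.0):
    """Return (t, df, p) for the two-tailed paired-samples test."""
    if len(sample_a) != len(sample_b):
        raise ValueError("paired samples must have the same number of data points")
    # Assumed sign convention: delta = x_B - x_A (with minus without modification).
    delta = [b - a for a, b in zip(sample_a, sample_b)]
    n = len(delta)
    # t = (mean of delta - mu_0) / (sample standard deviation of delta / sqrt(n))
    t = (mean(delta) - mu_0) / (stdev(delta) / math.sqrt(n))
    df = n - 1
    return t, df, two_tailed_p(t, df)


def conclusion(p):
    """Apply the 95% confidence decision rule stated in the notes."""
    if p < 0.05:
        return "reject the null hypothesis: statistically significant change"
    if p > 0.95:
        return "accept the null hypothesis: no statistically significant change"
    return "inconclusive: conduct more tests"


# Hypothetical measurements, invented purely for illustration.
sample_a = [10.0, 12.0, 11.0, 13.0, 9.0]    # without the modification
sample_b = [11.0, 14.0, 12.0, 16.0, 10.0]   # with the modification

t, df, p = paired_two_tailed(sample_a, sample_b)
print(f"t = {t:.3f}, df = {df}, p = {p:.4f}")
print(conclusion(p))
```

For these made-up numbers the differences are 1, 2, 1, 3, 1, so the mean of δ is 1.6, S_δ/√n is 0.4, and t = 4 with df = 4. The corresponding two-tailed p-value is about 0.016, which falls under the p < 0.05 case.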
One-tailed paired samples hypothesis test: [This is the more common one used in engineering analysis.]
• We assume here that our experiments yield $\bar{x}_B > \bar{x}_A$. In other words, the modification we made leads to an
improvement in the mean between Sample A and Sample B. But is the improvement statistically significant?
• In a one-tailed paired-samples hypothesis test, we want to know whether there is a statistically significant
improvement in the predicted population means of the two samples. From the definition of variable $\delta$, we see