Comparing Two Samples: Differences in Means and Variances - Prof. Jenny A. Baglivo, Study notes of Mathematical Statistics

Notes on comparing two probability distributions, focusing on the difference in means and the ratio of variances. They cover exact methods for normal distributions, approximate methods, and transformations to normality, as well as the sampling distribution of the ratio of sample variances and exact methods for the ratio of variances parameter for normal distributions.


Uploaded on 02/24/2010

MT427 Notebook 5
prepared by Professor Jenny Baglivo
© Copyright 2009 by Jenny A. Baglivo. All Rights Reserved.
5 MT427 Notebook 5
  5.1 Two Sample Analysis: Difference in Means
    5.1.1 Introduction: Notation and Model Summaries
    5.1.2 Exact Methods for Normal Distributions
    5.1.3 Approximate Methods
    5.1.4 Transformations to Normality
  5.2 Two Sample Analysis: Ratio of Variances
    5.2.1 F Ratio Distribution
    5.2.2 Sampling Distribution of Ratio of Sample Variances
    5.2.3 Exact Methods for Normal Distributions
  5.3 Nonparametric Methods for Two Sample Analysis
    5.3.1 Definitions
    5.3.2 Wilcoxon Rank Sum Statistic
    5.3.3 Wilcoxon Rank Sum Distribution and Methods
    5.3.4 Mann-Whitney U Statistic
    5.3.5 Mann-Whitney U Distribution and Methods
    5.3.6 Hodges-Lehmann (HL) Estimator of Shift Parameter
    5.3.7 Exact Confidence Interval Procedure for Shift Parameter
  5.4 Sampling Models
    5.4.1 Population Model
    5.4.2 Randomization Model


5.1.2 Exact Methods for Normal Distributions

If X and Y are independent normal random variables, then the difference in sample means, $\bar{X} - \bar{Y}$, has a normal distribution.

There are two situations where this fact can be used to construct exact methods for analyzing the difference in means:

  1. $\sigma_x$, $\sigma_y$ Known: Statistical methods use the fact that the standardized difference

$$Z = \frac{(\bar{X} - \bar{Y}) - \delta}{\sqrt{\dfrac{\sigma_x^2}{n} + \dfrac{\sigma_y^2}{m}}}$$

has a standard normal distribution.

  2. $\sigma_x = \sigma_y$ Estimated: Statistical methods use the fact that the approximately standardized difference

$$T = \frac{(\bar{X} - \bar{Y}) - \delta}{\sqrt{S_p^2\left(\dfrac{1}{n} + \dfrac{1}{m}\right)}}$$

has a Student t distribution with $n + m - 2$ df.

In this formula, $S_p^2$ is the pooled estimator of the common variance:

$$S_p^2 = \frac{(n - 1)S_x^2 + (m - 1)S_y^2}{n + m - 2},$$

where $S_x^2$ and $S_y^2$ are the sample variances for the X and Y samples.

Note that, in order to get an exact Student t distribution in the second situation, we need to assume that the unknown standard deviations are equal.

To illustrate the computation for estimating a common variance, suppose that $n = 8$, $m = 6$, $s_x^2 = 8.58$ and $s_y^2 = 12.35$ are observed. Then the estimate of the common variance is

$$s_p^2 = \frac{7(8.58) + 5(12.35)}{12} = \frac{121.81}{12} \approx 10.15.$$
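The pooled-variance arithmetic for this illustration can be checked with a short sketch. The function name `pooled_variance` is a hypothetical helper introduced here, not something from the notes.

```python
def pooled_variance(n, var_x, m, var_y):
    """Pooled estimate of a common variance from two sample variances.

    Weights each sample variance by its degrees of freedom (n - 1, m - 1).
    """
    return ((n - 1) * var_x + (m - 1) * var_y) / (n + m - 2)


# Values from the illustration: n = 8, m = 6, s_x^2 = 8.58, s_y^2 = 12.35.
sp2 = pooled_variance(8, 8.58, 6, 12.35)
print(round(sp2, 2))  # about 10.15
```

Because the weights are the degrees of freedom, the pooled estimate always lies between the two sample variances.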

Exercise. Let $\sigma^2 = \sigma_x^2 = \sigma_y^2$ be the common variance of the X and Y distributions. Under the assumptions of this section, demonstrate that $S_p^2$ is an unbiased estimator of $\sigma^2$.

Confidence interval procedures. The following tables give 100(1−α)% confidence interval procedures for the difference in means parameter, $\delta = \mu_x - \mu_y$.

  1. $\sigma_x$, $\sigma_y$ Known:

$$(\bar{X} - \bar{Y}) \pm z(\alpha/2)\sqrt{\frac{\sigma_x^2}{n} + \frac{\sigma_y^2}{m}},$$

where $z(\alpha/2)$ is the 100(1 − α/2)% point of the standard normal distribution.

  2. $\sigma_x = \sigma_y$ Estimated:

$$(\bar{X} - \bar{Y}) \pm t_{n+m-2}(\alpha/2)\sqrt{S_p^2\left(\frac{1}{n} + \frac{1}{m}\right)},$$

where $t_{n+m-2}(\alpha/2)$ is the 100(1 − α/2)% point on the Student t distribution with $(n + m - 2)$ df.
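The two interval procedures can be sketched in code as follows. This is a minimal illustration, not the notes' own implementation: the function names are hypothetical, the Student t critical value is passed in as an argument (as if read from a table), and all sample means used in the example calls are made-up values.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, ybar, sigma_x, sigma_y, n, m, alpha=0.05):
    """CI for delta = mu_x - mu_y when both standard deviations are known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z(alpha/2)
    half_width = z * sqrt(sigma_x**2 / n + sigma_y**2 / m)
    diff = xbar - ybar
    return diff - half_width, diff + half_width


def pooled_t_interval(xbar, ybar, var_x, var_y, n, m, t_crit):
    """CI for delta when a common variance is estimated by pooling.

    t_crit is t_{n+m-2}(alpha/2), supplied by the caller from a t table.
    """
    sp2 = ((n - 1) * var_x + (m - 1) * var_y) / (n + m - 2)
    half_width = t_crit * sqrt(sp2 * (1 / n + 1 / m))
    diff = xbar - ybar
    return diff - half_width, diff + half_width


# Hypothetical summaries: sample means 10.0 and 7.0, known sigma = 2, n = 8, m = 12.
z_lo, z_hi = z_interval(10.0, 7.0, 2.0, 2.0, 8, 12)

# Hypothetical means with the sample variances 8.58 and 12.35 (n = 8, m = 6);
# 2.179 is the usual table value for the upper 2.5% point with 12 df.
t_lo, t_hi = pooled_t_interval(10.0, 7.0, 8.58, 12.35, 8, 6, t_crit=2.179)

print(round(z_lo, 2), round(z_hi, 2))
print(round(t_lo, 2), round(t_hi, 2))
```

Passing the t critical value as an argument keeps the sketch dependent only on the standard library; a statistics package could supply it instead.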

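Hypothesis testing procedures. The following tables give size α tests of the null hypothesis that the difference in means parameter is a fixed value: $H_o: \delta = \delta_o$.

| | 1. $\sigma_x$, $\sigma_y$ Known | 2. $\sigma_x = \sigma_y$ Estimated |
|---|---|---|
| Test Statistic | $Z = \dfrac{(\bar{X} - \bar{Y}) - \delta_o}{\sqrt{\dfrac{\sigma_x^2}{n} + \dfrac{\sigma_y^2}{m}}}$ | $T = \dfrac{(\bar{X} - \bar{Y}) - \delta_o}{\sqrt{S_p^2\left(\dfrac{1}{n} + \dfrac{1}{m}\right)}}$ |
| RR for $H_a: \delta < \delta_o$ | $Z \le -z(\alpha)$ | $T \le -t_{n+m-2}(\alpha)$ |
| RR for $H_a: \delta > \delta_o$ | $Z \ge z(\alpha)$ | $T \ge t_{n+m-2}(\alpha)$ |
| RR for $H_a: \delta \ne \delta_o$ | $\lvert Z \rvert \ge z(\alpha/2)$ | $\lvert T \rvert \ge t_{n+m-2}(\alpha/2)$ |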
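The rejection regions for the known-variance case can be sketched as a small decision function. The function name, the `alternative` labels, and the example summaries are all hypothetical choices made for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(xbar, ybar, sigma_x, sigma_y, n, m,
                      delta0=0.0, alternative="two-sided", alpha=0.05):
    """Return (Z, reject) for H0: delta = delta0 with known sigmas."""
    z_stat = ((xbar - ybar) - delta0) / sqrt(sigma_x**2 / n + sigma_y**2 / m)
    std_normal = NormalDist()
    if alternative == "less":          # Ha: delta < delta0
        reject = z_stat <= -std_normal.inv_cdf(1 - alpha)
    elif alternative == "greater":     # Ha: delta > delta0
        reject = z_stat >= std_normal.inv_cdf(1 - alpha)
    else:                              # Ha: delta != delta0
        reject = abs(z_stat) >= std_normal.inv_cdf(1 - alpha / 2)
    return z_stat, reject


# Hypothetical summaries: sample means 10.0 and 7.0, known sigma = 2, n = 8, m = 12.
z_stat, reject_two_sided = two_sample_z_test(10.0, 7.0, 2.0, 2.0, 8, 12)
_, reject_less = two_sample_z_test(10.0, 7.0, 2.0, 2.0, 8, 12, alternative="less")
print(round(z_stat, 3), reject_two_sided, reject_less)
```

The pooled t column follows the same pattern, with T in place of Z and Student t critical points on n + m − 2 df in place of the normal ones.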

Exercise. Assume the following data are the values of independent random samples from normal distributions with common standard deviation 2.

  1. X Sample ($n = 8$, $\bar{x} = 10.1$):

     6.07, 7.00, 9.49, 9.76, 11.19, 11.31, 12.96, 13.02

  2. Y Sample ($m = 12$, $\bar{y} = 6.83$):

     3.86, 4.52, 5.14, 5.23, 5.33, 6.32, 7.21, 7.56, 7.94, 8.19, 9.07, 11.59

Exercise (Source: Shoemaker, JSE, 1996): Normal body temperatures of 148 subjects were taken several times over two consecutive days. A total of 130 values are reported below.

  1. X Sample: 65 temperatures (in degrees Fahrenheit) for women

     96.4  96.7  96.8  97.2  97.2  97.4  97.6  97.7  97.7  97.8  97.8  97.8  97.9
     97.9  97.9  98.0  98.0  98.0  98.0  98.0  98.1  98.2  98.2  98.2  98.2  98.2
     98.2  98.3  98.3  98.3  98.4  98.4  98.4  98.4  98.4  98.5  98.6  98.6  98.6
     98.6  98.7  98.7  98.7  98.7  98.7  98.7  98.8  98.8  98.8  98.8  98.8  98.8
     98.8  98.9  99.0  99.0  99.1  99.1  99.2  99.2  99.3  99.4  99.9  100.0  100.8

     Sample summaries: $n = 65$, $\bar{x} = 98.3938$, $s_x = 0.7435$

  2. Y Sample: 65 temperatures (in degrees Fahrenheit) for men

     96.3  96.7  96.9  97.0  97.1  97.1  97.1  97.2  97.3  97.4  97.4  97.4  97.4
     97.5  97.5  97.6  97.6  97.6  97.7  97.8  97.8  97.8  97.8  97.9  97.9  98.0
     98.0  98.0  98.0  98.0  98.0  98.1  98.1  98.2  98.2  98.2  98.2  98.3  98.3
     98.4  98.4  98.4  98.4  98.5  98.5  98.6  98.6  98.6  98.6  98.6  98.6  98.7
     98.7  98.8  98.8  98.8  98.9  99.0  99.0  99.0  99.1  99.2  99.3  99.4  99.5

     Sample summaries: $m = 65$, $\bar{y} = 98.1046$, $s_y = 0.6988$

  1. Left Plot: Side-by-side box plots of the two samples are shown on the left. The sample distributions are approximately symmetric.
  2. Right Plot: A normal probability plot of standardized temperatures is shown on the right, where

(a) each x value is replaced by $(x - \bar{x})/s_x$; (b) each y value is replaced by $(y - \bar{y})/s_y$; and (c) the 130 ordered standardized values (vertical axis; observed) are plotted against the k/131st quantiles of the standard normal distribution (horizontal axis; expected).

The normal probability plot has been enhanced to include the results of 100 simulations from the standard normal distribution: For each k = 1, 2, ..., 130, the minimum and maximum value of the 100 simulated kth order statistics are plotted.

Assume these data are the values of independent random samples from normal distributions with a common variance.

  • Test $\mu_x = \mu_y$ versus $\mu_x \ne \mu_y$ at the 5% level.
  • Construct a 95% confidence interval for the difference in means, $\mu_x - \mu_y$.
  • Comment on the analyses.
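One way to carry out the pooled t computations from the reported sample summaries is sketched below. The critical value 1.979 is an assumed approximate table value for the upper 2.5% point of the Student t distribution with 128 df; it is not given in the notes, and the variable names are illustrative.

```python
from math import sqrt

# Sample summaries reported above.
n, xbar, sx = 65, 98.3938, 0.7435   # women
m, ybar, sy = 65, 98.1046, 0.6988   # men

# Pooled estimate of the common variance and the standard error of the difference.
sp2 = ((n - 1) * sx**2 + (m - 1) * sy**2) / (n + m - 2)
se = sqrt(sp2 * (1 / n + 1 / m))

# Test statistic for H0: mu_x = mu_y (delta_o = 0).
t_stat = (xbar - ybar) / se

# Assumed approximate value of t_128(0.025), as if read from a table.
T_CRIT = 1.979
reject = abs(t_stat) >= T_CRIT

# 95% confidence interval for mu_x - mu_y.
ci = (xbar - ybar - T_CRIT * se, xbar - ybar + T_CRIT * se)

print(round(t_stat, 3), reject)
print(round(ci[0], 3), round(ci[1], 3))
```

Under these assumptions the two-sided test and the interval agree, as they must: the hypothesis of equal means is rejected exactly when the interval excludes zero.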


5.1.3 Approximate Methods

In addition to the exact methods given in the last section, there are approximate methods we can use to answer questions about the difference in means parameter, δ = μx − μy.

  1. $\sigma_x \ne \sigma_y$ Estimated, Normal Samples: Assume that X and Y are normal random variables, and that $\sigma_x \ne \sigma_y$. Statistical methods use the fact that the approximate standardization

$$T = \frac{(\bar{X} - \bar{Y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{S_x^2}{n} + \dfrac{S_y^2}{m}}}$$

has an approximate Student t distribution with degrees of freedom as follows:

$$\text{df} = \frac{\left(S_x^2/n + S_y^2/m\right)^2}{\dfrac{(S_x^2/n)^2}{n - 1} + \dfrac{(S_y^2/m)^2}{m - 1}}.$$

  2. $\sigma_x$, $\sigma_y$ Estimated, Large Samples: Assume that n and m are large. Statistical methods use the fact that the approximate standardization

$$Z = \frac{(\bar{X} - \bar{Y}) - (\mu_x - \mu_y)}{\sqrt{\dfrac{S_x^2}{n} + \dfrac{S_y^2}{m}}}$$

has an approximate standard normal distribution.

Notes:

  1. Pooled versus Welch t Methods: Exact methods for normal samples when σx = σy is estimated using pooled information are called pooled t methods. Approximate methods for normal samples where σx and σy are separately estimated are called Welch t methods, after the mathematician who proved (in the 1940’s) that the sampling distribution was approximately Student t.
  2. Computing the Degrees of Freedom: To apply the formula for df developed by Welch for the first situation above, you would round the expression on the right to the closest whole number. The computed df satisfies the following inequality:

min(n, m) − 1 ≤ df ≤ n + m − 2.

A quick by-hand method is to use the lower bound for df instead of Welch’s formula.

  3. Central Limit Theorem: The central limit theorem can be used to demonstrate that the difference in sample means, $\bar{X} - \bar{Y}$, is approximately normally distributed when both n and m are large enough. Thus, the Z given in the second situation above has an approximately standard normal distribution when both n and m are large enough.
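The degrees-of-freedom computation described in the notes above can be sketched as follows, using the standard Welch formula with (n − 1) and (m − 1) in the denominator. The function name `welch_df` is hypothetical, and the inputs reuse the sample variances from the pooled-variance illustration in Section 5.1.2 purely as example numbers.

```python
def welch_df(var_x, n, var_y, m):
    """Welch's approximate degrees of freedom for the unequal-variance t method."""
    a = var_x / n   # estimated variance of the X sample mean
    b = var_y / m   # estimated variance of the Y sample mean
    return (a + b) ** 2 / (a**2 / (n - 1) + b**2 / (m - 1))


# Example numbers: n = 8, s_x^2 = 8.58, m = 6, s_y^2 = 12.35.
n, m = 8, 6
df = welch_df(8.58, n, 12.35, m)

# The inequality quoted in the notes: min(n, m) - 1 <= df <= n + m - 2.
lower, upper = min(n, m) - 1, n + m - 2
print(round(df, 2), lower <= df <= upper, round(df))
```

Here the computed value rounds to 10, while the quick by-hand lower bound would use 5 df, which gives a wider (more conservative) interval.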

Assume the information on the previous page is a summary of independent random samples from normal distributions. Construct an approximate 95% confidence interval for the difference in means parameter, $\delta = \mu_x - \mu_y$, and comment on your analysis.

5.1.4 Transformations to Normality

Methods based on sampling from normal distributions are popular and easy to apply.

For this reason, researchers often transform their data to achieve approximate normality, and then use normal theory methods on the transformed scale.

For example, the left plot below shows side-by-side box plots of samples taken from skewed positive distributions, and the right plot shows an enhanced normal probability plot of combined standardized values.

Notice that the boxes are asymmetric, there are large outliers, and the normal probability plot has a pronounced bend.

By contrast, plots based on a log transformation of the data suggest that normal theory methods could be applied to the log-transformed data.

Footnotes. Although the use of transformations is attractive, there are many drawbacks. For example, it may be difficult to find an appropriate transformation, or it may be difficult to interpret the results back on the original scale.

In Section 5.3 (page 19) of these notes, we will study methods that can be used for a broad range of distributions.

  1. Reciprocal: If F has an f ratio distribution with $n_1$ and $n_2$ degrees of freedom, then the reciprocal of F has an f ratio distribution with $n_2$ and $n_1$ degrees of freedom.
  2. Quantiles: The notation $f_p$ is used to denote the pth quantile (100pth percentile) of the f ratio distribution. The Rice textbook includes tables for p = 0.90 (page A10), p = 0.95 (page A11), p = 0.975 (page A12), and p = 0.99 (page A13). The p = 0.10, 0.05, 0.025, 0.01 quantiles can be computed using reciprocals. Specifically,

$$f_p \text{ on } n_1, n_2 \text{ df} = \frac{1}{f_{1-p} \text{ on } n_2, n_1 \text{ df}}.$$

To illustrate the use of the tables in the textbook, let $n_1 = 8$ and $n_2 = 10$. Then

  1. When p = 0.90, 0.95, 0.975 and 0.99, the values are read from the tables: $f_{0.90} = 2.38$, $f_{0.95} = 3.07$, $f_{0.975} = 3.85$, $f_{0.99} = 5.06$.
  2. When p = 0.10, 0.05, 0.025, and 0.01, the quantiles are computed using reciprocals. Specifically, since

$$P(F \le x) = P\left(\frac{1}{F} \ge \frac{1}{x}\right) \quad \text{for every } x > 0,$$

to obtain the 0.10, 0.05, 0.025, and 0.01 quantiles of the distribution with 8 degrees of freedom in the numerator and 10 degrees of freedom in the denominator, we use the reciprocals of the 0.90, 0.95, 0.975, and 0.99 quantiles of the f ratio distribution with 10 degrees of freedom in the numerator and 8 degrees of freedom in the denominator. Thus,

$$f_{0.10} = \frac{1}{2.54} = 0.39, \quad f_{0.05} = \frac{1}{3.35} = 0.30, \quad f_{0.025} = \frac{1}{4.30} = 0.23, \quad f_{0.01} = \frac{1}{5.81} = 0.17.$$
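The reciprocal rule can be checked with a few lines of arithmetic. In this sketch the upper quantiles for 10 and 8 df (2.54, 3.35, 4.30, 5.81) are standard F table values consistent with the reciprocals quoted above; the variable names are illustrative.

```python
# Upper quantiles f_q of the f ratio distribution with 10 (numerator) and
# 8 (denominator) df, as read from standard F tables.
upper_10_8 = [(0.90, 2.54), (0.95, 3.35), (0.975, 4.30), (0.99, 5.81)]

# Reciprocal rule: f_p on (8, 10) df = 1 / f_{1-p} on (10, 8) df.
lower_8_10 = {round(1 - q, 3): round(1 / f, 2) for q, f in upper_10_8}

for p, value in sorted(lower_8_10.items(), reverse=True):
    print(f"f_{p} on 8, 10 df = {value}")
```

Note that the degrees of freedom swap when the reciprocal is taken, which is why the 10 and 8 df table is the one consulted for the 8 and 10 df lower quantiles.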

5.2.2 Sampling Distribution of Ratio of Sample Variances

Let X be a normal random variable with mean μx and standard deviation σx, and let Y be a normal random variable with mean μy and standard deviation σy.

The following theorem tells us about the sampling distribution of the ratio of sample variances when samples are chosen independently from the X and Y distributions.

Theorem (Sampling Distribution). Let $S_x^2$ and $S_y^2$ be the sample variances of independent random samples of sizes n and m, respectively, from the X and Y distributions. Then

$$F = \frac{S_x^2/S_y^2}{\sigma_x^2/\sigma_y^2}$$

has an f ratio distribution with (n − 1) and (m − 1) degrees of freedom, where the numerator is the ratio of sample variances and the denominator is the ratio of model variances.

To demonstrate that the conclusion of the theorem is correct, first note that

  1. $U = \dfrac{(n - 1)S_x^2}{\sigma_x^2}$ has a chi-square distribution with (n − 1) df, and
  2. $V = \dfrac{(m - 1)S_y^2}{\sigma_y^2}$ has a chi-square distribution with (m − 1) df.

Now (please complete the demonstration),
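One way to complete the demonstration is sketched here. It assumes the usual definition of the f ratio distribution (the ratio of two independent chi-square random variables, each divided by its degrees of freedom), which belongs to Section 5.2.1 and is not reproduced in this excerpt. Since the two samples are independent, U and V are independent, and

$$\frac{U/(n-1)}{V/(m-1)} = \frac{S_x^2/\sigma_x^2}{S_y^2/\sigma_y^2} = \frac{S_x^2/S_y^2}{\sigma_x^2/\sigma_y^2} = F.$$

Under that definition, the left-hand side has an f ratio distribution with (n − 1) and (m − 1) degrees of freedom, so F does as well.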

5.2.3 Exact Methods for Normal Distributions

Let X and Y be normal random variables. Under the conditions of the last section, the following tables give exact confidence interval and hypothesis test methods for the ratio of variances parameter, $r = \sigma_x^2/\sigma_y^2$.

  1. 100(1 − α)% CI for $r = \sigma_x^2/\sigma_y^2$, when $\mu_x$ and $\mu_y$ are estimated:

$$\left[\frac{S_x^2/S_y^2}{f_{n-1,m-1}(\alpha/2)},\ \frac{S_x^2/S_y^2}{f_{n-1,m-1}(1 - \alpha/2)}\right],$$

where $f_{n-1,m-1}(p)$ is the 100(1 − p)% point of the f ratio distribution with (n − 1) and (m − 1) df.

  2. 100α% tests of $H_o: r = r_o$, when $\mu_x$ and $\mu_y$ are estimated:

Test Statistic: $F = \dfrac{S_x^2/S_y^2}{r_o}$

RR for $H_a: r < r_o$: $F \le f_{n-1,m-1}(1 - \alpha)$

RR for $H_a: r > r_o$: $F \ge f_{n-1,m-1}(\alpha)$

RR for $H_a: r \ne r_o$: $F \le f_{n-1,m-1}(1 - \alpha/2)$ or $F \ge f_{n-1,m-1}(\alpha/2)$
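A minimal sketch of the interval and the two-sided test follows. The function names and the sample variances (12.0 and 8.0, with n = 9 and m = 11) are hypothetical; the critical points 3.85 and 0.23 are the 8 and 10 df values quoted in the f ratio quantile illustration earlier in these notes.

```python
def variance_ratio_ci(var_x, var_y, f_upper, f_lower):
    """CI for r = sigma_x^2 / sigma_y^2.

    f_upper is f_{n-1,m-1}(alpha/2) and f_lower is f_{n-1,m-1}(1 - alpha/2),
    both supplied by the caller from F tables.
    """
    ratio = var_x / var_y
    return ratio / f_upper, ratio / f_lower


def variance_ratio_test(var_x, var_y, r0, f_upper, f_lower):
    """Two-sided test of H0: r = r0; returns (F, reject)."""
    f_stat = (var_x / var_y) / r0
    return f_stat, (f_stat <= f_lower or f_stat >= f_upper)


# Hypothetical data: n = 9, m = 11 (so 8 and 10 df), s_x^2 = 12.0, s_y^2 = 8.0.
# For alpha = 0.05: f_{8,10}(0.025) = 3.85 and f_{8,10}(0.975) = 0.23.
ci = variance_ratio_ci(12.0, 8.0, f_upper=3.85, f_lower=0.23)
f_stat, reject = variance_ratio_test(12.0, 8.0, r0=1.0, f_upper=3.85, f_lower=0.23)
print(round(ci[0], 3), round(ci[1], 3), f_stat, reject)
```

In this hypothetical case the interval contains 1 and the test of equal variances does not reject, which illustrates how wide these intervals can be for small samples.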

5.3 Nonparametric Methods for Two Sample Analysis

This section focuses on broadly-applicable two sample analysis methods.

5.3.1 Definitions

  1. Parametric/Nonparametric Methods: Statistical methods that require strong assumptions about the shapes of distributions (for example, uniform or exponential), and ask questions about parameter values are called parametric methods. By contrast, nonparametric methods (also known as distribution-free methods) make mild assumptions, such as, “the distributions are continuous” or “the continuous distributions are symmetric around their centers.”
  2. Stochastically Larger/Smaller: Let V and W be continuous random variables. V is stochastically larger than W (correspondingly, W is stochastically smaller than V) if

$$P(V \ge x) \ge P(W \ge x) \quad \text{for all real numbers } x,$$

with strict inequality (that is, where “>” replaces “≥”) for at least one x.

  3. Shift Model: The random variables V and W are said to satisfy a shift model if

V − ∆ and W have the same distribution,

where ∆ is the difference in medians: ∆ = Median(V) − Median(W).

  4. Shift Parameter: The parameter ∆ from above is called the shift parameter.

Example: Quantile confidence interval procedure. Most of the statistical methods we have worked with so far have been parametric methods. An example of a nonparametric method is the quantile confidence interval procedure from Section 4.2.3 of these notes:

Let X be a continuous random variable, θ be the pth quantile of the X distribution, for some proportion p ∈ (0, 1), and $X_{(k)}$ be the kth order statistic of a random sample of size n from the X distribution. Then

$$\left[X_{(k_1)},\ X_{(k_2)}\right]$$

is a 100(1 − α)% confidence interval for θ, where the indices $k_1$ and $k_2$ are chosen so that

$$P(\theta < X_{(k_1)}) = \sum_{j=0}^{k_1 - 1} \binom{n}{j} p^j (1 - p)^{n-j} = \alpha/2,$$

$$P(X_{(k_1)} < \theta < X_{(k_2)}) = \sum_{j=k_1}^{k_2 - 1} \binom{n}{j} p^j (1 - p)^{n-j} = 1 - \alpha,$$

$$P(\theta > X_{(k_2)}) = \sum_{j=k_2}^{n} \binom{n}{j} p^j (1 - p)^{n-j} = \alpha/2.$$
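The three binomial sums can be evaluated directly for candidate indices. The sketch below uses hypothetical names and a made-up example (the median, p = 0.5, with n = 10, k1 = 2 and k2 = 9). Because the binomial distribution is discrete, the target values α/2 and 1 − α can usually only be matched approximately.

```python
from math import comb


def binom_pmf(j, n, p):
    """P(exactly j of the n observations fall below the p-th quantile)."""
    return comb(n, j) * p**j * (1 - p) ** (n - j)


def quantile_ci_probs(n, p, k1, k2):
    """Return (P(theta < X_(k1)), P(X_(k1) < theta < X_(k2)), P(theta > X_(k2)))."""
    below = sum(binom_pmf(j, n, p) for j in range(0, k1))
    inside = sum(binom_pmf(j, n, p) for j in range(k1, k2))
    above = sum(binom_pmf(j, n, p) for j in range(k2, n + 1))
    return below, inside, above


# Hypothetical example: confidence interval [X_(2), X_(9)] for the median, n = 10.
below, inside, above = quantile_ci_probs(n=10, p=0.5, k1=2, k2=9)
print(round(below, 4), round(inside, 4), round(above, 4))
```

For this choice the interval between the 2nd and 9th order statistics has confidence level 1002/1024, or about 97.9%, with about 1.1% in each tail.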

Illustration: Stochastically larger/smaller random variables. To illustrate the definition of stochastically larger/smaller, consider the following plots of the PDFs (left plot), and the CDFs (right plot) of two random variables: V (solid blue) and W (dashed gray).

V is stochastically larger than W (correspondingly, W is stochastically smaller than V ).

Note that if V is stochastically larger than W , then their CDFs satisfy the inequality

$$F_V(x) \le F_W(x) \quad \text{for all } x, \text{ with strict inequality for at least one } x.$$

Example: Random variables satisfying shift models. If V and W satisfy a shift model with shift parameter ∆, then their distributions must have the same shape.

Here are two examples:

  1. Normal Distribution, σ = 5: If V is a normal random variable with mean 10 and standard deviation 5, and W is a normal random variable with mean 3 and standard deviation 5, then V and W satisfy a shift model with shift parameter ∆ = 7.

Since ∆ > 0, V is stochastically larger than W.

  2. Shifted Exponential Distribution, λ = 1/10: Let V be an exponential random variable with parameter 1/10, and let W be a shifted exponential random variable with PDF as follows:

$$f(x) = \frac{1}{10} e^{-(x - 8)/10} \quad \text{when } x > 8,$$

and 0 otherwise. Then V and W satisfy a shift model with shift parameter ∆ = −8.

Since ∆ < 0, W is stochastically larger than V.
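Both shift-model examples can be checked numerically on a grid of x values. This is only a sanity check under the stated distributions, not a proof; the helper `shifted_exp_cdf` and the grid are choices made for this sketch.

```python
from math import exp
from statistics import NormalDist

grid = [x / 2 for x in range(-40, 101)]  # x from -20 to 50 in steps of 0.5

# Example 1: normal distributions with common sigma = 5.
V = NormalDist(mu=10, sigma=5)
W = NormalDist(mu=3, sigma=5)
delta_normal = V.median - W.median
# Shift model: P(V - delta <= x) = P(V <= x + delta) should equal P(W <= x).
normal_shift_ok = all(abs(V.cdf(x + delta_normal) - W.cdf(x)) < 1e-12 for x in grid)
# V stochastically larger than W: F_V(x) <= F_W(x) for all x.
normal_order_ok = all(V.cdf(x) <= W.cdf(x) for x in grid)


def shifted_exp_cdf(x, shift, rate=0.1):
    """CDF of shift + Exponential(rate); zero at or below the shift."""
    return 0.0 if x <= shift else 1.0 - exp(-rate * (x - shift))


# Example 2: V is exponential with parameter 1/10, W is V's distribution shifted by 8.
delta_exp = -8
# Shift model: P(V - delta <= x) = P(V <= x - 8) should equal P(W <= x).
exp_shift_ok = all(
    abs(shifted_exp_cdf(x + delta_exp, 0) - shifted_exp_cdf(x, 8)) < 1e-12 for x in grid
)
# W stochastically larger than V: F_W(x) <= F_V(x) for all x.
exp_order_ok = all(shifted_exp_cdf(x, 8) <= shifted_exp_cdf(x, 0) for x in grid)

print(delta_normal, normal_shift_ok, normal_order_ok, exp_shift_ok, exp_order_ok)
```

In both cases the CDF of the stochastically larger variable lies on or below the other CDF, matching the inequality stated in the illustration above.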