Chapter 9 — Part A
Inferences Based on Two Samples:
Confidence Intervals and Tests of
Hypotheses
Identifying the Target Parameter

| Parameter | Key words or phrases | Type of data |
|---|---|---|
| $\mu_1 - \mu_2$ | Mean difference; difference in averages | Quantitative |
| $p_1 - p_2$ | Difference between proportions, percentages, fractions or rates; compare proportions | Qualitative |
| $\sigma_1^2/\sigma_2^2$ | Ratio of variances; difference in variability or spread; compare variation | Quantitative |

Comparing Two Population Means: Independent Sampling

Large Sample Confidence Interval for $\mu_1 - \mu_2$

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\sigma_{(\bar{x}_1 - \bar{x}_2)} = (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \cong (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Example (Part I)

Company officials are concerned about the length of time a particular drug retains its potency. A random sample (sample 1) of 50 bottles of the product is drawn from the current production and analyzed for potency. A second sample (sample 2) of 50 bottles is obtained, stored for 1 year, and then analyzed. The summary readings are:

Sample 1: $n_1 = 50$, $\bar{x}_1 = 10.37$, $\sigma_1 \approx s_1 = 0.32$
Sample 2: $n_2 = 50$, $\bar{x}_2 = 9.83$, $\sigma_2 \approx s_2 = 0.24$

Obtain a 95% confidence interval for the difference between the two population means.

With $\alpha = 0.05$, $z_{\alpha/2} = 1.96$:

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\sigma_{(\bar{x}_1 - \bar{x}_2)} \cong (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

$$= (10.37 - 9.83) \pm 1.96\sqrt{\frac{0.32^2}{50} + \frac{0.24^2}{50}} = 0.54 \pm 1.96 \times 0.057 = [0.429;\ 0.651]$$

(The limits are computed with the unrounded standard error, 0.0566.)

Note that 0 is not contained in this interval; therefore, at the 95% confidence level, there is evidence of a difference between the two population means.

Example (Part II)

Is there enough evidence to conclude that the drug potency changes after the drug has been stored for 1 year? Use $\alpha = 0.05$.

Sample 1: $n_1 = 50$, $\bar{x}_1 = 10.37$, $\sigma_1 \approx s_1 = 0.32$
Sample 2: $n_2 = 50$, $\bar{x}_2 = 9.83$, $\sigma_2 \approx s_2 = 0.24$

$H_0: (\mu_1 - \mu_2) = 0$, or equivalently $H_0: \mu_1 = \mu_2$
$H_a: (\mu_1 - \mu_2) \neq 0$, or equivalently $H_a: \mu_1 \neq \mu_2$

$$\sigma_{(\bar{x}_1 - \bar{x}_2)} \cong \sqrt{\frac{0.32^2}{50} + \frac{0.24^2}{50}} = 0.057$$

$$z_o = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sigma_{(\bar{x}_1 - \bar{x}_2)}} = \frac{(10.37 - 9.83) - 0}{0.057} = 9.55$$

Since $|z_o| = 9.55 > z_{\alpha/2} = 1.96$, we reject $H_0$. Note that we can also calculate a p-value for this test.
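The two parts of the drug potency example can be reproduced numerically. The following is a minimal Python sketch using only the standard library; the function name `two_sample_z` and its argument names are hypothetical and not part of the original notes. It computes the large-sample interval, the observed $z_o$, and the two-tailed p-value mentioned at the end of Part II.

```python
from math import erfc, sqrt
from statistics import NormalDist


def two_sample_z(n1, xbar1, s1, n2, xbar2, s2, alpha=0.05, d0=0.0):
    """Large-sample CI and two-tailed z test for mu1 - mu2 (hypothetical helper)."""
    diff = xbar1 - xbar2
    # Standard error, with s1 and s2 standing in for sigma1 and sigma2.
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    # Critical value z_{alpha/2}; about 1.96 when alpha = 0.05.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    z_obs = (diff - d0) / se
    # Two-tailed p-value 2 * P(Z > |z_obs|), written with erfc so that
    # very small tail probabilities are not rounded to zero.
    p_value = erfc(abs(z_obs) / sqrt(2))
    return ci, z_obs, z_crit, p_value


# Summary readings from the drug potency example.
ci, z_obs, z_crit, p_value = two_sample_z(50, 10.37, 0.32, 50, 9.83, 0.24)
reject_h0 = abs(z_obs) > z_crit

print(f"95% CI for mu1 - mu2: [{ci[0]:.3f}; {ci[1]:.3f}]")
print(f"z_o = {z_obs:.2f}, reject H0: {reject_h0}, p-value = {p_value:.2e}")
```

The p-value is far below 0.05, which agrees with the rejection of $H_0$ based on the critical value.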
Comparing Two Population Means: Independent Sampling

For small samples, the t-distribution can be used with a pooled sample estimator of $\sigma^2$, $s_p^2$:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Small Sample Confidence Interval for $\mu_1 - \mu_2$

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\,\sigma_{(\bar{x}_1 - \bar{x}_2)} = (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,n_1+n_2-2}\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

The value of $t$ is based on df = $n_1 + n_2 - 2$.

Comparing Two Population Means: Independent Sampling

One-Tailed Test
$H_0: (\mu_1 - \mu_2) = D_0$
$H_a: (\mu_1 - \mu_2) > D_0$ (or $< D_0$)
Rejection region: $t_o > t_\alpha$ (or $t_o < -t_\alpha$ when $H_a$ is $<$)

Two-Tailed Test
$H_0: (\mu_1 - \mu_2) = D_0$
$H_a: (\mu_1 - \mu_2) \neq D_0$
Rejection region: $|t_o| > t_{\alpha/2}$

Test Statistic:

$$t_o = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Conditions:
1) The two samples are randomly selected from the target populations and are independent of each other.
2) Both sampled populations have distributions that are approximately normal.
3) The population variances are equal.

Example

Sample 1: $n_1 = 12$, $\bar{x}_1 = 26.58$, $s_1 = 14.36$
Sample 2: $n_2 = 12$, $\bar{x}_2 = 39.67$, $s_2 = 13.86$

$H_0: \mu_1 = \mu_2$, i.e. $H_0: (\mu_1 - \mu_2) = 0$
$H_a: \mu_1 < \mu_2$, i.e. $H_a: (\mu_1 - \mu_2) < 0$
$\alpha = 0.05$

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{11 \times 14.36^2 + 11 \times 13.86^2}{12 + 12 - 2} = 14.11^2$$

$$t_o = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(26.58 - 39.67) - 0}{\sqrt{14.11^2\left(\dfrac{1}{12} + \dfrac{1}{12}\right)}} = -2.272$$

Since $t_o = -2.272 < -t_{\alpha,\,n_1+n_2-2} = -t_{0.05,\,22} = -1.717$ (equivalently, $|t_o| = 2.272 > 1.717$), we reject $H_0$. Note that we can also calculate a p-value for this test.

Example

$H_0: \mu_1 = \mu_2$, $H_a: \mu_1 < \mu_2$, $\alpha = 0.05$

    /* Tapeworm counts for drug-treated and untreated groups */
    data Tapeworms;
    input Treatment $ Count;
    datalines;
    Drug 18
    Drug 43
    Drug 28
    Drug 50
    Drug 16
    Drug 32
    Drug 13
    Drug 35
    Drug 38
    Drug 33
    Drug 6
    Drug 7
    Untreated 40
    Untreated 54
    Untreated 26
    Untreated 63
    Untreated 21
    Untreated 37
    Untreated 39
    Untreated 23
    Untreated 48
    Untreated 58
    Untreated 28
    Untreated 39
    ;
    /* alpha=0.10 sets the confidence level of the reported intervals to 90% */
    proc ttest data=Tapeworms alpha=0.10;
    class Treatment;
    var Count;
    run;

Note that we need to look at the output for the Pooled method and that the reported p-value needs to be divided by 2, as we are dealing with a 1-sided hypothesis (halving is appropriate here because the observed difference falls in the direction stated by $H_a$).
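The pooled-variance calculation in the worked example can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the helper name `pooled_t` is hypothetical, it works from the rounded summary statistics quoted in the example rather than from the raw counts, and the critical value 1.717 is taken from the example instead of being computed.

```python
from math import sqrt


def pooled_t(n1, xbar1, s1, n2, xbar2, s2, d0=0.0):
    """Pooled-variance t statistic for H0: mu1 - mu2 = d0 (hypothetical helper)."""
    df = n1 + n2 - 2
    # Pooled estimator of the common variance sigma^2.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    # Estimated standard error of (xbar1 - xbar2).
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    t_obs = ((xbar1 - xbar2) - d0) / se
    return t_obs, sqrt(sp2), df


# Summary statistics quoted in the example (sample 1, then sample 2).
t_obs, sp, df = pooled_t(12, 26.58, 14.36, 12, 39.67, 13.86)

T_CRIT = 1.717  # t_{0.05, 22}, the table value quoted in the example
# Lower-tailed test (Ha: mu1 < mu2): reject when t_obs falls below -t_alpha.
reject_h0 = t_obs < -T_CRIT

print(f"s_p = {sp:.2f}, df = {df}, t_o = {t_obs:.3f}, reject H0: {reject_h0}")
```

Running the same calculation on the raw tapeworm counts gives a statistic that differs slightly in the third decimal place, because the summary statistics above are rounded.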
Comparing Two Population Means: Paired Difference Experiments

• Known also as paired data.
• Pairs of observations are dependent (i.e., correlated).
• It can provide more information about the difference between population means than an independent samples experiment.
• The population means are compared by looking at the differences between pairs of experimental units that were similar prior to the experiment.
• Differencing removes some sources of variation (mainly correlation).

Example

$d_i = x_{i1} - x_{i2}$, $n = 24$, $n_d = 12$, $\alpha = 0.05$

| Rat | Before | After | Diff |
|---|---|---|---|
| 1 | 8.7 | 9.4 | -0.7 |
| 2 | 7.9 | 9.8 | -1.9 |
| 3 | 8.3 | 9.9 | -1.6 |
| 4 | 8.4 | 10.3 | -1.9 |
| 5 | 9.2 | 8.9 | 0.3 |
| 6 | 9.1 | 8.8 | 0.3 |
| 7 | 8.2 | 9.8 | -1.6 |
| 8 | 8.1 | 8.2 | -0.1 |
| 9 | 8.9 | 9.4 | -0.5 |
| 10 | 8.2 | 9.9 | -1.7 |
| 11 | 8.9 | 12.2 | -3.3 |
| 12 | 7.5 | 9.3 | -1.8 |
| Mean | 8.450 | 9.658 | -1.208 |
| St. dev. | 0.516 | 0.988 | 1.077 |

Confidence interval for the mean difference:

$$\bar{x}_d \pm t_{\alpha/2,\,n_d-1}\,\frac{s_d}{\sqrt{n_d}}$$

$$\bar{x}_d \pm t_{\alpha/2,\,n_d-1}\,\frac{s_d}{\sqrt{n_d}} = -1.208 \pm t_{0.025,\,11}\,\frac{1.077}{\sqrt{12}} = -1.208 \pm 2.201 \times \frac{1.077}{\sqrt{12}} = [-1.892;\ -0.524]$$

Comparing Two Population Means: Paired Difference Experiments

Hypothesis test for $\mu_d = \mu_1 - \mu_2$

One-Tailed Test
$H_0: \mu_d = D_0$
$H_a: \mu_d < D_0$ (or $> D_0$)
Rejection region: $t_o < -t_\alpha$ (or $t_o > t_\alpha$)

Two-Tailed Test
$H_0: \mu_d = D_0$
$H_a: \mu_d \neq D_0$
Rejection region: $|t_o| > t_{\alpha/2}$

Test Statistic:

$$t_o = \frac{\bar{x}_d - D_0}{s_d/\sqrt{n_d}} \quad \text{for small sample sizes}$$

$$z_o = \frac{\bar{x}_d - D_0}{\sigma_d/\sqrt{n_d}} \quad \text{for large sample sizes}$$
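The paired-difference interval and the small-sample test statistic can be illustrated on the rat data. The sketch below is a minimal Python example using only the standard library. The variable names are hypothetical, the critical value 2.201 is the table value $t_{0.025,\,11}$ quoted in the example, and the choice $D_0 = 0$ with a two-tailed alternative is an assumption made for illustration, since the notes do not state a hypothesis for these data.

```python
from math import sqrt
from statistics import mean, stdev

# Before/after measurements for the 12 rats in the example.
before = [8.7, 7.9, 8.3, 8.4, 9.2, 9.1, 8.2, 8.1, 8.9, 8.2, 8.9, 7.5]
after = [9.4, 9.8, 9.9, 10.3, 8.9, 8.8, 9.8, 8.2, 9.4, 9.9, 12.2, 9.3]

# Paired differences d_i = x_i1 - x_i2 (before minus after).
diffs = [b - a for b, a in zip(before, after)]
n_d = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)      # sample standard deviation (n_d - 1 in the denominator)
se = s_d / sqrt(n_d)    # standard error of the mean difference

T_CRIT = 2.201  # t_{0.025, 11}, the table value quoted in the example
ci = (d_bar - T_CRIT * se, d_bar + T_CRIT * se)

# Assumed for illustration: H0: mu_d = 0 against Ha: mu_d != 0 at alpha = 0.05.
D0 = 0.0
t_obs = (d_bar - D0) / se
reject_h0 = abs(t_obs) > T_CRIT

print(f"mean diff = {d_bar:.3f}, s_d = {s_d:.3f}")
print(f"95% CI: [{ci[0]:.3f}; {ci[1]:.3f}]")
print(f"t_o = {t_obs:.3f}, reject H0: {reject_h0}")
```

Because the code uses unrounded values of the mean and standard deviation, the interval limits can differ from the hand calculation in the last decimal place.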