




























































































Statistical analysis results of various hypothesis tests, including t-tests and f-tests, conducted on different sets of data. The tests aim to compare means and variances between different groups or conditions. The data sets include mean load capacities of alloys, flare-illumination values of rocket propellants, and evaluation scores of two groups. The document also includes box plots and chi-square tests to further analyze the data.





























































































1.1 a. The population of interest is the weight of shrimp maintained on the specific diet for a period of 6 months. b. The sample is the 100 shrimp selected from the pond and maintained on the specific diet for a period of 6 months. c. The weight gain of the shrimp over 6 months. d. Since the sample is only a small proportion of the whole population, it is necessary to evaluate what the mean weight may be for any other randomly selected 100 shrimp.
1.5 a. All football helmets produced by the five companies over a given period of time. b. The 540 helmets selected from the output of the five companies. c. The amount of shock transmitted to the neck when the helmet’s face mask is twisted. d. The neck strength of players is extremely variable for high school players. Hence, the amount of damage to the neck varies considerably from player to player for exactly the same amount of shock transmitted by the helmet.
2.1 The relative merits of the different types of sampling units depend on the availability of a sampling frame for individuals, the desired precision of the estimates from the sample to the population, and the budgetary and time constraints of the project.
2.3 A more precise estimate can be obtained by considering individual cars, but it may be very difficult to obtain the sampling frame. By selecting parking lots and examining all cars in the lot, the data are more easily obtained, but the individual cars in the lot may have common characteristics reflecting the set of persons using the parking lot. Thus, the cars in the lot are a cluster sample and not a simple random sample. This results in a less precise estimate of the population than examining the same number of cars selected individually.
2.5 The agency could stratify farms based on the total acreage of farms in the state. A simple random sample of farms could then be selected within each stratum and a questionnaire sent to the farmer.
2.9 a. No. The survey in which the interviewer showed the peanut butter should be the more accurate because it does not rely on the respondent’s memory of which brand was purchased. b. Both surveys may have survey nonresponse bias because an entire segment of the population (those not at home) cannot be contacted. Also, both surveys may have interviewer bias resulting from the way the question is posed (e.g., tone of voice). In the first survey, results may be biased by the respondent’s ability to recall correctly which brand was purchased. The second survey may be biased by the respondent’s unwillingness to show the interviewer the peanut butter jar (too intrusive), or by the respondent not recognizing that the peanut butter that had been purchased was low fat.
2.11 a. "Employee" should refer to anyone who is eligible for sick days.
b. Use payroll records. Stratify by employee categories (full-time, part-time, etc.), employment location (plant, city, etc.), or other relevant subgroup categories. Consider systematic selection within categories.
c. Sex (women more likely to be caregivers), age (younger workers less likely to have elderly relatives), whether or not they care for elderly relatives now or anticipate doing so in the near future, how many hours of care they (would) provide (to define "substantial"), etc. The company might want to explore alternative work arrangements, such as flex-time, offering employees 4 ten-hour days, cutting back to 3/4-time to allow more time to care for relatives, etc., or other options that might be mutually beneficial and provide alternatives to taking sick days.
2.13 If phosphorus first: [P,N]
[10,40], [10,50], [10,60], then [20,60], [30,60]
Or [20,40], [20,50], [20,60], then [10,60], [30,60]
Or [30,40], [30,50], [30,60], then [10,60], [20,60]
If nitrogen first: [N,P]
[40,10], [40,20], [40,30], then [50,30], [60,30]
Or [50,10], [50,20], [50,30], then [40,30], [60,30]
Or [60,10], [60,20], [60,30], then [40,30], [50,30]
b. Sample modes = 2.10, 2.77, 2. c. Median = 2. d. Mean = 2.
3.37 a. The values are given below:
Group   Mean    Median   Mode
I       2.923   2.805    no mode
II      1.592   1.565    1.55, 1.
III     0.797   0.755    0.
b. mean = 1.7707 median = 1.565 mode = 0.70, 1.55, 1.
3.39 a. X¯ = 5. Yes.
b. $\sum_{i=1} (y_i - 5)$
c. s = 2.56 CV = 51%.
3.41 Age: s ≈ 2.
Experience: s ≈ 5.25.
3.43 The quantile plot is given here.
[Figure: Quantile Plot of Times; x-axis: u (0.0 to 1.0); y-axis: Treatment Times (minutes)]
a. The 25th percentile is 14 minutes. b. Yes; the 90th percentile is 31.5 minutes.
3.45 a. Range ≈ 200
b. s ≈ 50 using the range formula; s ≈ 37.2 using the grouped data formula
c. ȳ ± s yields (59, 133.4); 121/191 ≈ 64%
ȳ ± 2s yields (21.8, 170.6); 181/191 ≈ 94.8%
ȳ ± 3s yields (−15.4, 207.8); 191/191 = 100%
The empirical rule works well for this data set since the relative frequency histogram is mound-shaped.
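The three intervals in part c can be reproduced with a short sketch. The sample mean is not stated in the answer; the value 96.2 used here is an assumption, inferred as the midpoint of the reported one-standard-deviation interval (59, 133.4). The function name `interval` is illustrative only.

```python
# Sketch of the empirical-rule intervals y_bar ± k*s for exercise 3.45.
# ASSUMPTION: y_bar = 96.2 is inferred as the midpoint of (59, 133.4);
# the answer key does not state the mean directly.
y_bar = 96.2
s = 37.2  # standard deviation from the grouped data formula

# Approximate coverage the empirical rule predicts for mound-shaped data.
empirical_rule = {1: 68.0, 2: 95.0, 3: 99.7}

def interval(k):
    """Return (y_bar - k*s, y_bar + k*s), rounded to one decimal place."""
    return (round(y_bar - k * s, 1), round(y_bar + k * s, 1))

for k in (1, 2, 3):
    low, high = interval(k)
    print(f"y_bar ± {k}s = ({low}, {high}); rule predicts about {empirical_rule[k]}%")
```

Comparing the observed fractions in part c with the predicted percentages is what supports the conclusion that the rule works well here.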
3.47 a. The time series plot is given here.
[Figure: time series plot, Number of Blood Donors On Fridays; y-axis: Number of Volunteers]
The distribution is approximately symmetric with no outliers.
3.51 a. CAN: Q1 ≈ 1.45, Q2 ≈ 1.65, Q3 ≈ 2.4
DRY: Q1 ≈ 0.55, Q2 ≈ 0.60, Q3 ≈ 0.7
3.52 a. Stacked Bar Graph is given here:
[Figure: stacked bar graph, Literacy Level of Three Subsistence Groups; x-axis: Subsistence Group (Shifting, Settled, Town Dweller); y-axis: Percent in Literacy Level; legend: Middle, Primary, Illiterate]
b. Illiterate: 46%, Primary Schooling: 4%, At Least Middle School: 50% Shifting Cultivators: 27%, Settled Agriculturists: 21%, Town Dwellers: 51%
3.53 50% of workers who resign are in the youngest age group; of those who transfer, 68.2% are in their 30s; and of those who either retire or are fired, 86.24% are at least 40 years old.
3.55 a. The means and standard deviations are given here:
Supplier   ȳ        s
1          189.23   2.
2          156.28   3.
3          203.94   8.
b. Side-by-side Boxplots are given here:
[Figure: Money Supply (Trillions of Dollars); x-axis: Month (1 to 20); y-axis: Money Supply]
3.59 a. mode = 5, median = 15, mean = 15.
b. range = 30, s ≈ 7.5 c. s = 8. d. No, the histogram for the data set is skewed to the right and hence is not mound-shaped.
3.63 a. Mean = 0.65; Median = 0.55; Mode = 0.
b. Mean = 1.21; Median = 0.55; Mode = 0. The mean increases but the median and mode remain the same.
3.65 b. Highly skewed to the right
c. 1424 ± 3488 ⇒ (-2063, 4912) contains 37/41 = 90.2%
1424 ± (2)3488 ⇒ (-5551, 8400) contains 38/41 = 92.7%
1424 ± (3)3488 ⇒ (-9039, 11888) contains 39/41 = 95.1%
These values do not match the percentages from the Empirical Rule: 68%, 95%, and 99.7%.
d. 1.48 ± 1.54 ⇒ (-0.06, 3.02) contains 31/41 = 75.6%
3.67 a. There has been very little change from 1985 to 1996.
b. Yes c. No d. The cost of homes is very high.
3.69 a. Mode = 2.5 Median ≈ 7.04
b. Mean ≈ 8.3 c. median
3.71 The quantile plots are given here.
[Figure: quantile plot, Homeownership in Percentage (by State) for 1985; x-axis: u (0.0 to 1.0); y-axis: Homeownership (%)]
[Figure: quantile plot, Homeownership in Percentage (by State) for 1996; x-axis: u (0.0 to 1.0); y-axis: Homeownership (%)]
a. 20th percentile is approximately 63%. b. Michigan, Indiana, West Virginia, Minnesota, and Maine. c. Pennsylvania, South Carolina, Wyoming, Maine, and West Virginia
3.73 a. mode = 1.0 median = 1.
b. mean = 1. c. skewed to the right.
3.75 a. Relative frequency histogram
20% trimmed mean: Plants: ȳ = 133039.33 Arrests: ȳ = 41.30
3.87 a. A time series plot is given here:
[Figure: time series plot, Four Monthly FDC Indices; x-axis: Month (Jan to Dec); y-axis: FDC Index; series: Pharmaceuticals, Diversified, Chain, Wholesaler]
3.89 a. Average Price = 76.
b. Range = 202. c. DJIA = 10856. d. Yes; The stocks covered on the NYSE. No
3.91 Only the District of Columbia, with 37.4% and 40.4% homeownership in 1985 and 1996. The raw and 10% trimmed means are given here:
Year   Raw Mean   10% Trimmed Mean
1985   65.88      66.
1996   66.84      67.
Chapter 4: Probability and Probability Distributions
4.1 a. Subjective probability b. Relative frequency c. Classical d. Relative frequency e. Subjective probability f. Subjective probability g. Classical
4.3 Using the binomial formula, the probability of guessing correctly 15 or more of the 20 questions is 0.021.
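As a check on the figure in 4.3, the upper-tail binomial probability can be summed directly. The answer does not restate the chance of guessing a single question correctly; the value p = 0.5 below is an assumption (two-choice questions), chosen because it reproduces the reported 0.021. The helper name `binomial_upper_tail` is illustrative.

```python
from math import comb

def binomial_upper_tail(n, p, k):
    """P(Y >= k) for Y ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# ASSUMPTION: each guess is correct with probability 0.5 (not stated in the answer).
prob = binomial_upper_tail(n=20, p=0.5, k=15)
print(round(prob, 3))  # 0.021
```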
4.7 a. P
b. Events A and B are not mutually exclusive
4.9 Because P(A|B) = 3/7 ≠ 3/8 = P(A) ⇒ A and B are not independent. A and C are not independent. B and C are not independent.
4.11 A and B are not independent.
A and C are not independent. A and D are not independent. B and C are independent. B and D are not independent. C and D are not independent. None of the pairs of the events are mutually exclusive.
4.13 No
4.15 a. A: Generator 1 does not work
b. B|A: Generator 2 does not work given that Generator 1 does not work c. A ∪ B: Generator 1 works or Generator 2 works or Both Generators work
4.17 a. S = {F1F2, F1F3, F1F4, F1F5, F2F1, F2F3, F2F4, F2F5, F3F1, F3F2,
F3F4, F3F5, F4F1, F4F2, F4F3, F4F5, F5F1, F5F2, F5F3, F5F4}
a. P(y ≤ 3) = 0.5327 (using a computer program) b. The postings of price changes are independent with the same probability 0.07 of being
d. P(y = 10) = 0. b. 0.0136 c. 0.0021 d. 0.2327
4.73 a. 0.0336 b. P(y > 55) = 0.0038. We would then conclude that the voucher has been lost.
4.75 a. P(y > 200) = 0.0764 b. P(y > 220) = 0.0228 c. P(y < 120) = 0.1949 d. P(100 < y < 200) = 0.8472
4.77 a. P(y < 38) = 0.1151 ⇒ 11.51th percentile. b. P(z ≤ k) = 0.44
4.79 A sample is a random sample if every possible sample of size n from the population has an equal probability of being selected.
4.81 We would number the people in the population from 1 to 1,000 and then go to Table 13. Starting at Line 1, Column 25, we obtain 816, 309, 763, 078, 061, 277, 988, 188, 174, 530, 709, 496, 889, 482, 772. These would be the items in the sample.
4.83 In order to make the sampling random, the network might choose voters based on draws from a random number table, or more simply choose every nth person exiting.
4.85 We would then select the women numbered: 054, 636, 533, 482, 526.
4.87 The sampling distribution would have a mean of 60 and a standard deviation of 5/√16 = 1.25. If the population distribution is somewhat mound-shaped, then the sampling distribution of ȳ should be approximately mound-shaped. In this situation, we would expect approximately 95% of the possible values of ȳ to lie in 60 ± (2)(1.25) = (57.5, 62.5).
4.91 a. P(z < 1.28) = 1096.4 b. IQR = 175.38
4.93 Facility size should be at least 178 for .05. Facility size should be at least 200 for .01.
4.95 a. P(y > 2.7) = 0.0228 b. P(z > 0.6745) = 2.30 c. Let μNew = 2.2065
4.97 P(y > 20,000) = 0.0217
4.99 No. The last date may not be representative of all days in the month.
4.101 a. P(y < 5) = 0.0059
P(y < 5) ≈ 0.0125
c. Using the sample estimate of π, we can compute P(y ≤ 250) by the normal approximation if nπ and n(1 − π) are both greater than 5.
4.115 n = 400, π = 0.2
a. P (y ≤ 25) ≈ 0. b. The ad is not successful. With π = .20, we expect 80 positive responses out of 400 but we observed only 25. The probability of getting so few positive responses is virtually 0 if π = .20. We therefore conclude that π is much less than 0.20.
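The claim that the probability is virtually 0 can be illustrated with the normal approximation to the binomial. This is a minimal sketch: the continuity correction (using 25.5 rather than 25) is an assumption, since the answer does not say whether one was applied, and the function name `normal_cdf` is illustrative.

```python
from math import erfc, sqrt

def normal_cdf(z):
    """Standard normal CDF, written with erfc to stay accurate in the far tail."""
    return 0.5 * erfc(-z / sqrt(2.0))

n, pi = 400, 0.20
mean = n * pi                   # expected number of positive responses
sd = sqrt(n * pi * (1 - pi))    # binomial standard deviation

# ASSUMPTION: continuity correction of 0.5 applied to the observed count of 25.
z = (25 + 0.5 - mean) / sd
p = normal_cdf(z)

print(f"mean = {mean:.0f}, sd = {sd:.0f}, z = {z:.2f}, P(y <= 25) ≈ {p:.1e}")
```

With the observed count almost seven standard deviations below the expected 80, the approximate probability is far smaller than any conventional significance level.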
4.117 The sampling distribution of the sample mean consists of the following values for ȳ and their frequency of occurrence.
ȳ      P(ȳ)   ȳ      P(ȳ)   ȳ      P(ȳ)   ȳ      P(ȳ)   ȳ      P(ȳ)
7.25   1/70   17.00  1/70   21.50  2/70   26.00  1/70   35.75  1/
10.50  1/70   17.25  1/70   21.75  1/70   26.25  1/
11.25  1/70   17.50  1/70   22.25  2/70   26.50  1/
11.50  1/70   18.00  2/70   22.50  2/70   26.75  1/
12.00  1/70   18.25  1/70   23.25  2/70   27.50  2/
12.25  1/70   18.75  2/70   23.50  1/70   28.25  1/
13.00  2/70   19.00  1/70   23.75  1/70   29.00  2/
14.00  2/70   19.25  1/70   24.00  1/70   30.00  2/
14.75  1/70   19.50  1/70   24.25  2/70   30.75  1/
15.50  2/70   19.75  2/70   24.75  1/70   31.00  1/
16.25  1/70   20.50  2/70   25.00  2/70   31.50  1/
16.50  1/70   20.75  2/70   25.50  1/70   31.75  1/
16.75  1/70   21.25  1/70   25.75  1/70   32.50  1/
4.119 The sampling distribution of the sample median consists of the following values for the median (M) and their frequency of occurrence.
M      P(M)
7.5    5/
9.0    4/
10.5   8/
15.5   3/
17.0   6/
17.5   2/
18.5   9/
19.0   4/
20.5   6/
22.5   1/
24.0   2/
25.5   3/
27.0   8/
32.0   4/
34.0   5/
4.120 a. The normal probability plot is given here.
[Figure: Normal Probability Plot for the Sample Median; x-axis: Quantiles of Standard Normal; y-axis: Sample Mean]
Note that the plotted points deviate considerably from the straight line. Thus, the sampling distribution is not approximated very well by a normal distribution. If the sample size were much larger than 4, the approximation would be greatly improved.
b. The population median equals (12 + 25)/2 = 18.5, whereas the mean of the 70 values of the sample median is 19.536. The values differ by a significant amount due to the fact that the sample size was only 4.
4.121 a.,b. The mean and standard deviation of the sampling distribution of ȳ are given when the population distribution has values μ = 100, σ = 15:
Sample Size   Mean   Standard Deviation
5             100    6.
20            100    3.
80            100    1.
c. As the sample size increases, the sampling distribution of ȳ concentrates about the true value of μ. For n = 5 and 20, the values of ȳ could be a considerable distance from 100.
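The standard deviations in this table follow from the relation σ/√n for the sampling distribution of the mean. A minimal sketch, using the μ = 100 and σ = 15 given in the answer (the function name `standard_error` is illustrative):

```python
from math import sqrt

mu, sigma = 100, 15  # population values stated in the answer

def standard_error(sigma, n):
    """Standard deviation of the sample mean for a sample of size n."""
    return sigma / sqrt(n)

print("Sample Size   Mean   Standard Deviation")
for n in (5, 20, 80):
    print(f"{n:<13} {mu:<6} {standard_error(sigma, n):.3f}")
```

Each fourfold increase in n halves the standard deviation, which is the concentration about μ described in part c.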
4.123 n = 36, μ = 40, σ = 12
a. The sampling distribution of ȳ is approximately normal with a mean of 40 and a standard deviation of 2 b. P(ȳ > 36) = 0.9772
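Both parts of 4.123 can be verified numerically from the stated n = 36, μ = 40, σ = 12. The sketch below uses the standard library's complementary error function in place of a normal table; the function name `normal_upper_tail` is illustrative.

```python
from math import erfc, sqrt

def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * erfc(z / sqrt(2.0))

n, mu, sigma = 36, 40, 12

se = sigma / sqrt(n)      # standard deviation of the sample mean: 12 / 6 = 2
z = (36 - mu) / se        # standardized cutoff: (36 - 40) / 2 = -2
p = normal_upper_tail(z)  # P(y_bar > 36)

print(f"se = {se}, z = {z}, P(y_bar > 36) = {p:.4f}")
```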