














Examples of statistical analysis using hypothesis tests and confidence intervals to determine whether there is a significant difference in Math SAT scores between high school seniors in Mudville and Hicksville. It includes various scenarios and calculations.















General comments:
Bring your student ID! The midterm exam will be mostly multiple choice or fill in the blank. In particular, many questions that ask for answers here will be multiple choice or fill in the blank on the midterm. A small number of problems on the real midterm will require work to be shown. Likely examples include experiment designs, confidence intervals, and hypothesis tests. Material from the lecture of Thursday 30 April will be in the exam. There are many more problems in this list than there will be on the midterm. In this list of sample problems, some continue across page breaks. Thus, you might find four answers to a multiple choice problem on the same page as the problem, but the correct answer is on the next page.
Date: 30 April 2009.
(Not all the steps have the same point value.) Lobotozine is a new recreational drug which causes severe mental impairment even in low doses. You have bought an apparatus which measures lobotozine concentrations in blood specimens in the range 10–500 micrograms/liter (μg/l). The results for any given specimen are approximately normally distributed with standard deviation σ = 2 μg/l. The mean is supposed to be the true concentration, but you suspect the manufacturer calibrated it incorrectly, so that the mean differs from the true concentration. You test this apparatus by using it to make 5 measurements on a reference specimen in which the concentration of lobotozine is known to be 100 μg/l. The results are:
….6  103.6  102.1  100.6  103.1
Assuming that the appropriate statistical procedure is safe to use, is this convincing evidence that the calibration is incorrect, at significance level α = 0.05? Answer using the following steps.
(1) State the hypotheses you will test. (2) State the test you will use. (3) Calculate the test statistic. (4) Give a P-value (or give two values between which P lies). (5) Draw the appropriate conclusion, expressing it in words appropriate for the context of the problem.
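Steps (3) and (4) can be sketched in Python as a one-sample z test with known σ. This is only an illustration: the function name `one_sample_z` and the `readings` list are hypothetical (the readings are made up, not the exam's data), while μ0 = 100 and σ = 2 are taken from the problem statement.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z(data, mu0, sigma, alternative="two-sided"):
    """Return (z, p_value) for a one-sample z test with known sigma."""
    n = len(data)
    z = (mean(data) - mu0) / (sigma / sqrt(n))
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if alternative == "two-sided":
        p = 2 * (1 - std_normal.cdf(abs(z)))
    elif alternative == "greater":
        p = 1 - std_normal.cdf(z)
    elif alternative == "less":
        p = std_normal.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z, p


# Hypothetical readings (NOT the exam's data); mu0 and sigma come from the problem.
readings = [101.0, 102.0, 103.0, 102.0, 102.0]
z, p = one_sample_z(readings, mu0=100, sigma=2)
print(f"z = {z:.3f}, P = {p:.4f}")
```

A two-sided alternative is used because the question asks whether the calibration is incorrect in either direction; a problem asking whether a group does worse would pass `alternative="less"` instead.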
550 590 430 390 490 270 300 380.
Assume that the standard deviation of Math SAT scores of high school seniors in Mudville is the same as for the general population. Is this convincing evidence that high school seniors in Mudville who take the Math SAT do worse than the general population of test takers, at significance level α = 0.02? Answer using the following steps.
(1) State the hypotheses you will test. (2) State the test you will use. (3) Calculate the test statistic. (4) Give a P-value (or give two values between which P lies). (5) Draw the appropriate conclusion, expressing it in words appropriate for the context of the problem.
3a. A sociologist believes that high school seniors in Hicksville on average do better on the Math SAT than the general population of high school seniors. He tests this belief at significance level α = 0.05, using the one sample z procedure, by choosing a simple random sample of 50 high school seniors in Hicksville. His test gives a P-value of 0.0772. What is the correct conclusion of the sociologist’s test?
a. He does not reject the null hypothesis, and concludes that there is strong evidence that high school seniors in Hicksville do not do better on the Math SAT than the general population of high school seniors.
f. Ha: μ < 176. g. No hypotheses are appropriate, because of improper data collection. h. None of the above.
a. Determine whether the sample or experiment was properly designed. b. Decide whether one can reasonably rule out the observed effect being due to chance. c. Decide whether one can reasonably rule out the observed effect being unimportant. d. Determine how large the observed effect is. e. Determine whether the experimenter is knowledgeable about statistics. f. Decide whether the problem being investigated is important. g. All of the above.
a. There is strong evidence that the mean GPA of students in my Math 243 class is larger than the mean GPA of all University of Oregon students. b. While the difference may be small, a difference as or more extreme than that observed in the group is unlikely to have arisen by chance if the mean GPA of students in my Math 243 class is the same as the mean GPA of all University of Oregon students. c. The mean GPA of students in my Math 243 class is probably much larger than the mean GPA of all University of Oregon students. d. Nothing, because of a poor choice of sampling method. e. The mean GPA of students in my Math 243 class is larger by at least 0.05 than the mean GPA of all University of Oregon students. f. Students who sit in the front row have higher GPAs than those who sit elsewhere in my class.
a. There is strong evidence that Eugene residents plan to spend more than $719 this year. b. While the difference may be small, a difference as or more extreme than that observed in the group is unlikely to have arisen by chance if Eugene residents plan to spend $719 this year. c. Eugene residents plan to spend much more than $719 this year. d. Nothing, because of a poor choice of sampling method. e. Eugene residents plan to spend at least 1% more than $719 this year.
f. People who shop at the Gateway Mall probably plan to spend more than other Eugene residents do.
a. 0.9370 b. 0.0630 c. 0.1260 d. 1.8740 e. Impossible to tell from the given data. f. None of the above.
a. An observed effect which satisfies the criteria of the “1.5 × IQR” rule. b. An observed effect which could never occur by chance. c. An observed effect so large that it would rarely occur by chance. d. An observed effect that cannot be explained. e. None of the above.
The distribution of the data plotted is:
a. Roughly symmetric with no outliers. b. Roughly symmetric with one or more outliers. c. Skewed to the right with no outliers. d.
a. The new mean is significantly smaller than the old mean. b. The new mean is about the same as the old mean. c. The new mean must be exactly the same as the old mean. d. The new mean is significantly larger than the old mean. e. Any of the above can happen. f. None of the above.
a. The new median is significantly smaller than the old median. b. The new median is about the same as the old median. c. The new median must be exactly the same as the old median. d. The new median is significantly larger than the old median. e. Any of the above can happen. f. None of the above.
a. The new first quartile is significantly smaller than the old first quartile. b. The new first quartile is about the same as the old first quartile. c. The new first quartile must be exactly the same as the old first quartile. d. The new first quartile is significantly larger than the old first quartile. e. Any of the above can happen. f. None of the above.
a. The new third quartile is significantly smaller than the old third quartile. b. The new third quartile must be exactly the same as the old third quartile. c. The new third quartile is about the same as the old third quartile. d. The new third quartile is significantly larger than the old third quartile. e. Any of the above can happen. f. None of the above.
new standard deviation (of the data without this outlier) compare with the standard deviation of the original data?
a. The new standard deviation is significantly smaller than the old standard deviation. b. The new standard deviation is about the same as the old standard deviation. c. The new standard deviation must be exactly the same as the old standard deviation. d. The new standard deviation is significantly larger than the old standard deviation. e. Any of the above can happen. f. None of the above.
[Table omitted in this copy: columns a, b, c; rows Mean, Median.]
16 22 22 27 45 40 17 49 56 56 48 40 22 8 13 20 35 27 30 18
(1) Is 56 an outlier?
(2) Find the mean number of home runs Griffey hit.
(3) Find the standard deviation of Griffey’s home run totals for his first four seasons. (Do this by hand.)
(4) Look at the mean, median and histogram. Which of the following describes the distribution? a. The distribution is skewed left and has no outliers. b. The distribution is skewed right and has no outliers.
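Parts (1) to (3) can be checked with a short Python sketch. It assumes the 20 values listed above are the complete data set and that quartiles are taken as the medians of the lower and upper halves of the sorted data (one common textbook convention; other conventions give slightly different quartiles). The helper name `quartiles` is hypothetical.

```python
from statistics import mean, median, stdev

home_runs = [16, 22, 22, 27, 45, 40, 17, 49, 56, 56,
             48, 40, 22, 8, 13, 20, 35, 27, 30, 18]


def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data.

    When the count is odd, the overall median is left out of both halves.
    """
    s = sorted(values)
    half = len(s) // 2
    return median(s[:half]), median(s[-half:])


q1, q3 = quartiles(home_runs)
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr

# (1) Values outside the fences are flagged by the 1.5 x IQR rule.
outliers = [x for x in home_runs if x < low_fence or x > high_fence]

# (2) Mean over all seasons listed.
overall_mean = mean(home_runs)

# (3) Sample standard deviation (divisor n - 1) of the first four seasons.
first_four_sd = stdev(home_runs[:4])

print(q1, q3, low_fence, high_fence, outliers)
print(overall_mean, first_four_sd)
```

The script is a check on the by-hand work that part (3) asks for, not a replacement for it.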
$25,800  $37,000  $43,200  $61,000  $102,000.
The owner has decided to give the highest paid person a $200,000 raise. (He did something that earned the company millions in extra profits.) In all parts of this problem, include appropriate units. a. What is the new mean salary? Why?
b. What is the new median salary? Why?
c. Suppose that instead the owner gives everyone a $10,000 raise. What is the new standard deviation? Why?
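A short Python sketch can confirm the reasoning behind parts a to c. It assumes the five salaries shown above are the complete list (the opening of the problem is not reproduced here); the variable names are illustrative only.

```python
from math import isclose
from statistics import mean, median, stdev

salaries = [25_800, 37_000, 43_200, 61_000, 102_000]  # dollars

# Parts a and b: only the highest salary is raised by $200,000.
top_raise = sorted(salaries)
top_raise[-1] += 200_000

# Part c: every salary is raised by $10,000 instead.
shifted = [s + 10_000 for s in salaries]

print("old mean:", mean(salaries), "new mean:", mean(top_raise))
print("old median:", median(salaries), "new median:", median(top_raise))
print("sd unchanged by a shift:", isclose(stdev(salaries), stdev(shifted)))
```

The mean moves because it uses every value, the median stays put because the middle value is untouched, and adding a constant to every salary shifts the center without changing the spread.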
a. The crumple-horned snorkack. b. The spiral-horned snorkack. c. The purple-spotted snorkack. d. The crumple-horned snorkack and the spiral-horned snorkack are equally underweight. e. Cannot be determined from the information given. f. None of the above.
a. Approximately what percentage of fourth graders is taller than 41.7 inches?
b. Approximately 40% of the fourth graders are shorter than ____ inches.
c. Approximately what percentage of simple random samples of size 16 of fourth graders has sample mean more than 48.9 inches?
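The three kinds of normal calculation in parts a to c can be sketched as follows. The population mean and standard deviation are not reproduced in this copy of the problem, so `MU` and `SIGMA` below are HYPOTHETICAL placeholders; only the cutoffs 41.7 and 48.9, the 40% figure, and the sample size 16 come from the problem.

```python
from math import sqrt
from statistics import NormalDist

# HYPOTHETICAL population parameters (inches); replace with the problem's values.
MU, SIGMA = 48.0, 3.0
heights = NormalDist(MU, SIGMA)

# a. Percentage of individuals above a cutoff: upper-tail area.
pct_taller = 100 * (1 - heights.cdf(41.7))

# b. Height with 40% of the population below it: inverse cdf.
cutoff_40 = heights.inv_cdf(0.40)

# c. Sample means of n = 16 vary less: sd is SIGMA / sqrt(n).
n = 16
sample_means = NormalDist(MU, SIGMA / sqrt(n))
pct_means_above = 100 * (1 - sample_means.cdf(48.9))

print(f"{pct_taller:.2f}%  {cutoff_40:.2f} in  {pct_means_above:.2f}%")
```

The point of part c is the change of scale: the same cutoff is many more standard deviations from the mean for a sample average than for a single child.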
a. Approximately what percentage of the zucchinis weigh more than 1.17 pounds?
b. Approximately what percentage of the zucchinis weigh between 1.79 and 3.65 pounds?
c. Approximately 10% of the zucchinis weigh more than ____ pounds.
a. One cannot distinguish their effects on a response variable. b. They are highly correlated. c. They do not have a normal distribution. d. One of them is a placebo. e. Both are lurking variables. f. The statistician conducting the study is incompetent.
[Scatterplot omitted in this copy; both axes are marked from 5 to 25.]
The association of the variables plotted is:
a. Clearly positive and roughly linear. b. Clearly positive and clearly nonlinear. c. Neither clearly positive nor clearly negative. d. Clearly negative and roughly linear. e. Clearly negative and clearly nonlinear.
Give the approximate coordinates of all outliers. If there are none, write “NONE”.
Midterm 1: Five number summary 22 51 77 82 99; mean 69; standard deviation 23.63.
Midterm 2: Five number summary 15 42 71 84 98; mean 65; standard deviation 26.80.
Correlation r ≈ 0.8570; r^2 ≈ 0.7345. (All noninteger values given to 4 significant digits.)
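The question attached to these summaries is not reproduced in this copy. Assuming it asks for the least squares line predicting the Midterm 2 score from the Midterm 1 score, the line can be found from the summaries alone, using slope = r · s_y / s_x and the fact that the line passes through the point of means. The function name `predict_midterm2` is hypothetical.

```python
# Summary statistics from the problem (x = Midterm 1, y = Midterm 2).
mean_x, sd_x = 69.0, 23.63
mean_y, sd_y = 65.0, 26.80
r = 0.8570

# Least squares slope and intercept from summary statistics.
slope = r * sd_y / sd_x
intercept = mean_y - slope * mean_x  # line passes through (mean_x, mean_y)


def predict_midterm2(midterm1_score):
    """Predicted Midterm 2 score for a given Midterm 1 score."""
    return intercept + slope * midterm1_score


print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"r^2 = {r ** 2:.4f}")
```

Regressing in the other direction (Midterm 1 on Midterm 2) would swap the roles of the two standard deviations and give a different line.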
Individual  Egg diameter (cm)  Adult mass (tons)
A           20                 19
B           10                 12
C           15                 18
D           8                  14
E           15                 14
a. Draw a scatterplot on the axes provided. Be sure to label your axes, and make an appropriate choice of which variable to put on the horizontal axis. [Grid omitted in sample problem.]
b. Find the equation of the least squares regression line and plot it on the graph above. c. What is the correlation between egg diameter and adult mass? What percentage of the variation in adult mass is explained by egg diameter?
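Parts b and c can be checked with a small pure-Python sketch. It treats egg diameter as the explanatory variable and adult mass as the response, which is an assumption about the intended choice of axes; the function name `least_squares` is hypothetical.

```python
from math import sqrt

egg_diameter = [20, 10, 15, 8, 15]   # cm, individuals A to E (explanatory)
adult_mass = [19, 12, 18, 14, 14]    # tons, individuals A to E (response)


def least_squares(x, y):
    """Return (slope, intercept, r) for the least squares line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = least_squares(egg_diameter, adult_mass)
print(f"mass = {intercept:.3f} + {slope:.3f} * diameter")
print(f"r = {r:.4f}, r^2 = {r * r:.4f}")
```

The value of r^2, read as a percentage, is the share of the variation in adult mass explained by the regression on egg diameter.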
[Scatterplot omitted in this copy; horizontal axis marked from 10 to 50, vertical axis marked from 5 to 25.]
Identify all influential points for the regression line by giving their approximate coordinates.
a. The Registrar’s office. b. All University of Oregon students who took the Math SAT. c. The 100 selected students. d. The Math SAT scores of the 100 selected students. e. The Math SAT scores of all University of Oregon students who took the Math SAT. f. Impossible to tell from the information given. g. None of the above.
The next two problems ask the same question about slightly different situations.
a. All college students. b. All University of Oregon students. c. The 398 students who were emailed the survey. d. The 87 students who returned the survey. e. The 211 students who did not return the survey. f. None of the above.
a. The fact that the survey omitted Alaska and Hawaii. b. The fact that people without telephones were missed.
a. An observational study. b. An experiment. c. A double blind design. d. This has elements of an observational study and an experiment. e. None of the above.
a. The amount of alcohol the people drank. b. The rate of severe heart attacks. c. The length of time hospitalized after a severe heart attack. d. Survival or nonsurvival after four years. e. The survival rate after a severe heart attack. f. Alcohol consumption. g. None of the above.
a. Anesthetic C is causing more deaths than the other three. b. The sample size is not sufficient to conclude anything definitive. c. Due to the possible presence of lurking variables confounding the results, we cannot conclude anything definitive. d. Due to nonresponse issues, we cannot conclude anything definitive. e. None of the above.
a. In a line of 40 digits, there are exactly four 0’s. b. In a line of 40 digits there can never be an easily seen pattern, such as the same four digits repeating.
c. Every pair of digits has a 1/100 chance of being 00. d. Every digit must appear at least once in the table. e. None of the above.
a. A controlled experiment. b. A matched pairs design. c. An uncontrolled experiment. d. A double blind design. e. An observational study. f. None of the above.
a. One factor, a choice of diet. b. One factor, the blood pressure of men on the diet. c. Two factors, normal vs. vegetarian diet and unrestricted vs. restricted diet. d. Four factors, the four diets being compared. e. None of the above.
a. The drug being tested. b. The amount of drug prescribed. c. The length of the recovery time. d. The results on the patients. e. The disease the patients had. f. None of the above.
a. To ensure that each treatment group had the same number of men and women. b. To make all possible outcomes of the experiment equally likely.
a. 0 b. 7/2 c. 1 d. 6/7 e. 1/2 f. 1/7 g. Impossible to determine from the information provided. h. None of the above.
X: The sum of the rolls is greater than 9. Y: The product of the rolls is 25. Z: Both dice rolled odd numbers.
Which of the following statements about the probabilities of these events are true?
a. P(X or Y) = P(X) + P(Y). b. P(X or Z) = P(X) + P(Z). c. P(Y or Z) = P(Y) + P(Z). d. P(X or Y or Z) = 1. e. None of the above.
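Each statement can be checked by brute force over the sample space. The sketch below assumes two fair six-sided dice (the setup sentence of the problem is not reproduced in this copy); the helper names `prob` and `either` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# ASSUMPTION: two fair six-sided dice, so 36 equally likely outcomes.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Exact probability of an event given as a predicate on (die1, die2)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def X(o):
    return o[0] + o[1] > 9          # sum greater than 9


def Y(o):
    return o[0] * o[1] == 25        # product equals 25


def Z(o):
    return o[0] % 2 == 1 and o[1] % 2 == 1   # both rolls odd


def either(*events):
    """Event that at least one of the given events occurs."""
    return lambda o: any(e(o) for e in events)


checks = {
    "a": prob(either(X, Y)) == prob(X) + prob(Y),
    "b": prob(either(X, Z)) == prob(X) + prob(Z),
    "c": prob(either(Y, Z)) == prob(Y) + prob(Z),
    "d": prob(either(X, Y, Z)) == 1,
}
print(checks)
```

The addition rule P(A or B) = P(A) + P(B) holds only for disjoint events. Under the two-dice assumption, the roll (5, 5) belongs to X, Y, and Z at once, which is what the checks pick up.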
Color        Brown  Yellow  Red  Green  Orange  Blue
Probability  0.2    0.3     0.1  0.1    ?       0.
Remember to show your work (even if you can do the problem in your head). a. Find the probability of drawing an orange M&M. b. Find the probability of drawing a yellow or green M&M.
a. A parameter. b. A population. c. A statistic. d. A sample. e. Both (a) and (b). f. Both (c) and (d). g. None of the above.
a. The distribution of the population is exactly normal. b. The distribution of the population is approximately normal. c. The distribution of the sample mean is approximately normal. d. The distribution of the sample mean is exactly normal. e. The limit of the sample size is the center of the population size. f. The center of the population distribution is limited.
a. The Registrar’s office. b. All University of Oregon students. c. The 100 selected students. d. The mean of the Math SAT scores of the 100 selected students. e. The mean of the Math SAT scores of all University of Oregon students. f. Impossible to tell from the information given. g. None of the above.
a. N(2.5, 0.5). b. N(2.5, 0.1). c. N(2.5, 0.02). d. N(2.5, 0.004). e. N(0.5, 0.1). f. N(0.5, 0.02). g. Impossible to determine from the information given.
a. Increasing the confidence level to 98%. b. Decreasing the sample size to 50. c. Increasing the sample size to 200.
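The question these options belong to is not reproduced in this copy. Assuming it concerns the margin of error of a z confidence interval for a mean, m = z* · σ / √n, the effect of each change can be sketched as below. The baseline (95% confidence, n = 100) and the value of `SIGMA` are HYPOTHETICAL; only the three changes come from the options above.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(confidence, sigma, n):
    """Margin of error z* x sigma / sqrt(n) for a z interval with known sigma."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * sigma / sqrt(n)


SIGMA = 10.0                                   # hypothetical population sd
baseline = margin_of_error(0.95, SIGMA, 100)   # hypothetical starting point

changes = {
    "a. confidence 98%": margin_of_error(0.98, SIGMA, 100),
    "b. sample size 50": margin_of_error(0.95, SIGMA, 50),
    "c. sample size 200": margin_of_error(0.95, SIGMA, 200),
}
for label, m in changes.items():
    direction = "smaller" if m < baseline else "larger"
    print(f"{label}: margin {m:.3f} is {direction} than baseline {baseline:.3f}")
```

The direction of each change does not depend on the particular σ chosen, since σ multiplies every margin by the same factor.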