Practice Exam 3 Fall 2003 | Statistical Methods 1 | STAT 515, Exams of Data Analysis & Statistical Methods

Material Type: Exam; Class: STATISTICAL METHODS I; Subject: Statistics; University: University of South Carolina - Columbia; Term: Fall 2003;

Uploaded on 09/02/2009

Statistics 515 - Fall 2003 - Practice Exam 3 (based on past exams)
Part I: Answer three of the following four questions. If you complete more than three, I will grade only
the first three. Five points each.
1) Define what is meant by the p-value (or the observed significance level) of a test. _________
______________________________________________________________________________
_____________________________________________________________________________.
2) (Circle the correct answers) When conducting a two-sample t-test of whether two population means are equal, we
use $s_1^2$ and $s_2^2$ / $s_p^2$ and $n_1+n_2-2$ degrees of freedom / Satterthwaite’s formula if the variances are equal.
3) (Circle the correct answers) A student achieved a score of 780 out of 800 on an aptitude test. If you knew
nothing else about the student, regression to the mean would imply that they would score lower / about the
same / higher if they retook the test. A student scoring a 500 out of 800 (near the average) would score lower /
about the same / higher if they retook it.
4) Say we reject the null hypothesis $\beta_1 = 0$ in a regression problem. Briefly explain why we can’t automatically
assume that changing x causes y to change.
Part II: Answer every part of the next three problems. Read each problem carefully, and show your
work for full credit. Twenty points each.
1) A candidate for political office wants to determine if there is a difference in his popularity between men and
women. To test this he collects a sample of 250 men and 250 women and records how many of them plan on
voting for him in the upcoming election.
a) State the appropriate null and alternative hypotheses for determining whether the candidate differs in popularity
between men and women. Be sure to identify what the parameter(s) you are using mean in terms of the problem (e.g. if you use
$\mu$, $\sigma^2$, $p$, $s^2$, $\bar{x}$, or $\hat{p}$, say what the symbol stands for.)
b) Of those sampled, 105 of the men and 128 of the women plan on voting for the candidate. Report the p-value
for the test of the hypotheses in part a).
c) Besides the sample being randomly chosen, what other assumption(s) are required to trust the test in part b)?
If possible, check that the assumption(s) hold.
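
The arithmetic behind part b) can be sketched as follows. This is only an illustration under stated assumptions: it uses the pooled two-proportion z-test with a two-sided alternative, which may not be the exact form expected in the course, and the function name `two_proportion_z_test` is hypothetical.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test of H0: p1 = p2 (assumed test form)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Under H0 both groups share one proportion, so pool the successes.
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # erfc(|z| / sqrt(2)) equals 2 * P(Z > |z|) for a standard normal Z.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Counts from the problem: 105 of 250 men, 128 of 250 women.
z, p_value = two_proportion_z_test(105, 250, 128, 250)
print(f"z = {z:.3f}, two-sided p-value = {p_value:.4f}")
```

The normal approximation used here is the reason part c) asks about additional assumptions: the counts of supporters and non-supporters in each group need to be reasonably large.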

2) The following partial ANOVA table is for comparing how many hours it takes for different headache
remedies to provide relief. To test this, a group of patients was randomly divided into separate groups, with
each group taking a different remedy.

Source       SS       DF   MS       F        p-value
Treatments   3.3054   __   ______   ______   0.
Error        1.9553   30   ______
Total        ______   32

a) Complete the above ANOVA table by filling in the missing values.
b) How many different headache remedies were compared in this experiment? ___________
How many total patients were used in this experiment? ____________
c) What null and alternative hypotheses are being tested by the p-value in this ANOVA table? (Be sure to identify
any parameters that you use.)
d) Do the headache remedies provide the same average relief, or is there a difference? (How did you decide?)
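
The bookkeeping needed for parts a) and b) can be sketched as below, assuming a standard one-way ANOVA layout in which sums of squares and degrees of freedom add up to their totals. The function name `complete_anova_table` and the dictionary keys are hypothetical, and the p-value is not computed here.

```python
def complete_anova_table(ss_treat, ss_error, df_error, df_total):
    """Fill in the missing entries of a one-way ANOVA table."""
    df_treat = df_total - df_error   # DF add up to the total
    ss_total = ss_treat + ss_error   # SS add up to the total
    ms_treat = ss_treat / df_treat   # MS = SS / DF
    ms_error = ss_error / df_error
    f_stat = ms_treat / ms_error     # F = MS(Treatments) / MS(Error)
    return {
        "df_treat": df_treat,
        "ss_total": ss_total,
        "ms_treat": ms_treat,
        "ms_error": ms_error,
        "f_stat": f_stat,
        "n_groups": df_treat + 1,    # treatment DF = (number of groups) - 1
        "n_subjects": df_total + 1,  # total DF = (number of subjects) - 1
    }

# Values given in the partial table above.
table = complete_anova_table(3.3054, 1.9553, 30, 32)
for name, value in table.items():
    print(f"{name}: {value:.4f}" if isinstance(value, float) else f"{name}: {value}")
```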

3) The attached data set concerns the eruptions of the Old Faithful Geyser from August 1 to August 8, 1978.
The variable date is the day of the eruption, interval is the length of time until the next eruption (in minutes), and
duration is the length of the last eruption (in minutes). The goal is to predict the time until the next eruption
(the interval) from the length of the last eruption (the duration).

a) It is possible to check three of the four regression assumptions by using the graphs that are produced by SAS.
Say which three assumptions those are and why they seem to be met in this case.
b) Assuming the assumptions of the regression model are met, what is the p-value for testing the hypothesis
that $\beta_1 = 0$? Do we accept or reject this null hypothesis at $\alpha = 0.01$? Does the duration of the previous eruption
help predict the time until the next eruption?
c) If Old Faithful just erupted for 3 minutes, how long do you predict it will be until the next eruption?
d) What is the estimate of the standard deviation of the errors ($\sigma$) for this regression?
e) What percent of the variation or error in predicting the time until the next eruption is explained by the
duration of the previous eruption?
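
For part c), the prediction amounts to plugging the duration into the fitted line. The sketch below assumes the coefficients reported under "Model Equation" in the SAS output (interval = 33.8212 + 10.7412 duration). The constant and function names are hypothetical.

```python
# Coefficients taken from the Model Equation in the SAS output.
INTERCEPT = 33.8212
SLOPE = 10.7412

def predict_interval(duration_minutes):
    """Predicted minutes until the next eruption for a given duration."""
    return INTERCEPT + SLOPE * duration_minutes

predicted = predict_interval(3)
print(f"Predicted interval: {predicted:.2f} minutes")
```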

PROC INSIGHT;
OPEN oldfaith;
FIT interval=duration;
RUN;

Model Equation
interval = 33.8212 + 10.7412 duration

Summary of Fit
Mean of Response   71.
Root MSE           6.
R-Square           0.
Adj R-Sq           0.

Analysis of Variance
Source    DF   Sum of Squares   Mean Square   F Stat   Pr > F
Model
Error
C Total

Parameter Estimates
Variable    DF   Estimate   Std Error   t Stat   Pr >|t|   Tolerance   Var Inflation
Intercept
duration

95% C.I. for Parameters
Variable    Estimate   Lower   Upper
Intercept
duration

[Plots: R_interval vs. P_interval; R_interval vs. RN_interval; interval vs. duration]