2 Problems on Statistical Inference - Project | STAT 431, Study Guides, Projects, Research of Statistics

Material Type: Project; Class: STATISTICAL INFERENCE; Subject: Statistics; University: University of Pennsylvania; Term: Fall 2003;

Uploaded on 03/28/2010

koofers-user-hqm
Stat 431, Fall 2003, Due Nov 4 (in class)
First Project (ANOVA)
Use this page as the cover page for your project. Staple it to additional pages with answers and
JMPIN output as necessary.
NAMES: ______________________________________
The data for this project is located on the class website:
www-stat.wharton.upenn.edu/~lzhao.
For Problem 1 use the data entitled “Philadelphia (Suburban)”. (This is a large portion of the data
set that is being used on the handout in class to illustrate linear regression. The complete dataset
is also on our website, labeled as “Philadelphia (all)”.)
For Problem 2 use the data set “Call Center Arrivals”.
For all questions include the relevant part of the JMP output – if any. If you include additional
JMP output be sure to describe, circle, or otherwise indicate the part of the output that is relevant
to your answer. (You may answer the questions directly on the JMP printout, if that is most
convenient, but be sure your answers are clearly indicated and easy to read.)
[P.S. A convenient way to print out JMP tables is as follows: When the window with the table or
plot is open on the computer screen click on “Edit → Journal” on the menu bar. This creates a
“Journal” in JMP that contains this table or plot. When the Journal window is on the screen you
may then go to “File → Save” to save it. Be sure to save it as an RTF or HTML file. You can
then open and edit that file using a word processor such as WORD.]
  1. The Philadelphia (Suburban) data set contains information on name-of-county and community average house price for various communities within the counties. Perform a one-way ANOVA for this data set using community-average House Price as the y-variable and County as the grouping (x) variable. (For convenience the house price has been given in $10,000 units.)

a. Perform the usual overall F-test. What is the P-value for this test? What null hypothesis is it testing? What is the conclusion from this test concerning the mean community-average house price among the Philadelphia counties?
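[For reference, the arithmetic behind the overall F-test can be sketched in a few lines of Python. This is only an illustration of the between-group and within-group sums of squares; the function name `one_way_f` and the toy numbers are assumptions made for this sketch and are not the Philadelphia data. The P-value itself still has to come from the F distribution, for example from the JMP output.]

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a dict mapping group name -> list of values."""
    all_values = [y for ys in groups.values() for y in ys]
    grand_mean = mean(all_values)
    k = len(groups)            # number of groups
    n_total = len(all_values)  # total number of observations

    # Between-group sum of squares: group size times squared distance of group mean from grand mean
    ss_between = sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values())
    # Within-group sum of squares: squared distance of each observation from its own group mean
    ss_within = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up numbers for illustration only (hypothetical groups A, B, C)
toy = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [5, 6, 7]}
f_stat, df1, df2 = one_way_f(toy)
print(f_stat, df1, df2)
```

[A large F relative to the F distribution with (df1, df2) degrees of freedom is evidence against the hypothesis being tested.]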

b. A friend of mine was considering buying a house in one of the counties outside of Philadelphia. Consequently he looked at this data and observed that among those counties Montgomery had the highest mean price and Delaware had the lowest. For this reason he looked at the usual t-test of the difference in mean price between Montgomery and Delaware, and concluded that this difference was statistically significant at the 0.10 level. Show the analysis he performed, and comment on whether his conclusion was justified.
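[One way to think about this question is to ask how often a "highest versus lowest" comparison would look significant when all the group means are in fact equal. The simulation below is a rough sketch under assumptions that are not taken from the data: 5 hypothetical groups of 10 observations each, normal errors, a t-statistic built from the pooled ANOVA error variance, and 1.679 as the approximate two-sided 0.10 critical value for 45 degrees of freedom. If you change `k` or `n` you would need to change `t_crit` as well.]

```python
import random
from statistics import mean

def simulate_best_vs_worst(k=5, n=10, t_crit=1.679, reps=2000, seed=0):
    """Fraction of null datasets in which the highest-vs-lowest group comparison rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        # All k groups share the same true mean, so any rejection is a false alarm
        groups = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
        means = [mean(g) for g in groups]
        # Pooled within-group variance (ANOVA mean squared error), df = k * (n - 1)
        mse = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g) / (k * (n - 1))
        # t-statistic for the two groups chosen AFTER looking at the data
        t = (max(means) - min(means)) / (2 * mse / n) ** 0.5
        if t > t_crit:
            rejections += 1
    return rejections / reps

rate = simulate_best_vs_worst()
print(f"false-alarm rate at nominal 0.10 level: {rate:.3f}")
```

[Compare the printed rate with the nominal 0.10 level when deciding whether a test chosen after inspecting the group means can be taken at face value.]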

c. Investigate whether the standard assumptions for an ANOVA are justified. There are fairly clear indications that suggest using a transformation of the data. What are they? [Provide the usual diagnostic plots, and comment on them. Note: To get a normal quantile plot of residuals in JMP you need to first save the residuals from the “Fit Y by X” or “Fit Model” platforms. The residuals will appear as a column in your data table, and you can work from there using the “Analyze → Distribution” command.]
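[For readers who want to see what the saved residuals and the normal quantile plot contain, here is a minimal Python sketch. It is not what JMP does internally; the helper names, the toy data, and the plotting position (i - 0.5)/n are assumptions chosen for illustration, and other packages use slightly different plotting positions.]

```python
from statistics import NormalDist, mean

def anova_residuals(groups):
    """Residual = observation minus the mean of its own group."""
    return [y - mean(ys) for ys in groups.values() for y in ys]

def normal_quantile_points(residuals):
    """Pair each sorted residual with the standard normal quantile for its rank."""
    n = len(residuals)
    std_normal = NormalDist()
    # (i - 0.5) / n is one common plotting position for the i-th smallest value
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(residuals)))

# Made-up numbers for illustration only; group C contains one unusually large value
toy = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [5, 6, 12]}
points = normal_quantile_points(anova_residuals(toy))
for q, r in points:
    print(f"{q:7.3f}  {r:7.3f}")
```

[If the residuals were roughly normal, the printed pairs would fall close to a straight line when plotted; curvature or a few points far off the line are the kind of indications the question asks about.]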

d. Justified or not, the statistician decided to transform to Log(House Price) and redo the analysis. Perform the F-test for the transformed data (as in question (a) above). Also perform additional tests to identify significant differences between community house prices (as in question (b) above). Do your conclusions qualitatively agree with those you found in questions (a) and (b)? [To do this you need to create a new column variable with the formula (property) “log”. There are both menu methods and double-click methods for creating new columns.]
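[The effect of a log column can be illustrated outside JMP as well. The sketch below uses made-up numbers, constructed so that the spread within each group grows in proportion to the group's level; it compares the ratio of the largest to the smallest group standard deviation before and after taking natural logs. The names `sd_ratio`, `toy`, and `toy_log` are hypothetical.]

```python
import math
from statistics import stdev

def sd_ratio(groups):
    """Largest group standard deviation divided by the smallest."""
    sds = [stdev(ys) for ys in groups.values()]
    return max(sds) / min(sds)

# Made-up numbers: each group is a multiple of the first, so spread scales with level
toy = {"A": [10, 12, 14], "B": [20, 24, 28], "C": [40, 48, 56]}

# The "new column": natural log of every value, kept in the same groups
toy_log = {name: [math.log(y) for y in ys] for name, ys in toy.items()}

raw_ratio = sd_ratio(toy)
log_ratio = sd_ratio(toy_log)
print(raw_ratio, log_ratio)
```

[In this constructed example the group spreads differ by a factor of 4 on the raw scale and are essentially equal on the log scale. Real data will rarely behave this cleanly.]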

e. Do the standard assumptions for validity appear to be reasonably well satisfied in the transformed model? [Provide the usual diagnostic plots, and comment on them.]

f. My friend would like to predict the community house price in a randomly chosen Delaware County community. He has (randomly) chosen a community in Delaware County and would like an interval of values that has a .95 probability of containing that community’s community-average house price. Find such an interval. [Note: To most conveniently answer this question use the “Save Columns” option in the “Fit Model” platform instead of the “Fit Y by X” platform. Also the “Fit Model” platform uses the word “individual” where we would use the word “prediction”.]
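[As a check on the saved “individual” limits, the standard prediction interval for one new observation from group i in a one-way ANOVA can be written as follows. The notation is an assumption introduced here, not taken from the handout: \bar{y}_i is the sample mean of group i, n_i its sample size, N the total number of communities, k the number of counties, and MSE the error mean square from the ANOVA table.]

```latex
\[
  \bar{y}_i \;\pm\; t_{0.975,\,N-k}\,\sqrt{\mathrm{MSE}\left(1 + \frac{1}{n_i}\right)}
\]
```

[If the interval is computed on the Log(House Price) scale, exponentiating both endpoints gives an interval on the original price scale with the same coverage, provided the same base is used for the log and the back-transformation.]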