Analysis of Variance, Simple Linear Regression - Slides | STAT 51200, Study notes of Statistics

Material Type: Notes; Professor: Zhang; Class: Applied Regression Analysis; Subject: STAT-Statistics; University: Purdue University - Main Campus; Term: Spring 2009;

Uploaded on 07/30/2009

Purdue University Spring 2009

Statistics 512: Applied Regression Analysis

Overview

• We will cover
  - simple linear regression (SLR)
  - multiple linear regression (MLR)
  - analysis of variance (ANOVA)

January 12, 2009

Emphasis will be placed on using selected practical tools (such as SAS) rather than on mathematical manipulations. We want to understand the theory so that we can apply it appropriately. Some of the material on SLR will be review, but our goal with SLR is to be able to generalize the methods to MLR.

Course Information

Class: Section 3, MWF 2:30-3:20pm at REC 121

Text: Applied Linear Statistical Models, 5th edition, by Kutner, Neter, Nachtsheim, and Li.

Recommended: Applied Statistics and the SAS Programming Language, 5th edition, by Cody and Smith.

Professor: Dabao Zhang, MATH 534.

Office Hours: MW 3:30pm-4:30pm or by appointment, or phone (46046) or e-mail [email protected]

Evaluation: Problem sets will be assigned (more or less) weekly. They will typically be due on Friday. Refer to the handout about specific evaluation policies.

Lecture Notes

Available as MS-Word or PDF

Usually (hopefully) prepared a week in advance

Not comprehensive (Be prepared to take notes.)

One/two chapters per week

Ask questions if you’re confused


Webpage

http://www.stat.purdue.edu/

zhangdb/stat512/

Announcements

Lecture Notes

Homework Assignments

Data Sets and SAS files

– General handouts (please see immediately)

Course Information


Mailing List

• I will very occasionally send reminders or announcements through e-mail.

• Blackboard Vista
  - Holds solutions documents
  - Monitor grades
  - Information restricted to enrolled students
  - Discussion groups

One midterm exam has been scheduled on March 5, 2009 (8-10pm). Please check your schedule and make sure that it works for you. Please notify me one week in advance for any conflict.

If the lecture viewing schedule is not realistic for homework deadlines, please let me know as soon as possible.

In class, please try to make sure I hear your question.

Chatting with your neighbors may disturb others; please be courteous to your classmates.

SAS is the program we will use to perform data analysis for this class. Learning to use SAS will be a large part of the course.

Getting Help with SAS

• Several sources for help:
  - SAS Help Files (not always best)
  - World Wide Web (look up the syntax in your favorite search engine)
  - SAS Getting Started (in SAS Files section of class website) and Tutorials
  - Statistical Consulting Service

• Wednesday Evening Help Sessions
• Applied Statistics and the SAS Programming Language, 5th edition, by Cody and Smith; most relevant material in Chapters 1, 2, 5, 7, and 9.
• Your instructor

Statistical Consulting Service

• Math B5
• Hours 10-4 M through F
• http://www.stat.purdue.edu/scs/

Off-campus students:

If DACS doesn’t work for you, fill out a license agreement online (in SAS folder), mail or fax it to Pro Ed. Disks will be sent to you. I need the license agreements (or notification that you’re sending a license agreement) by the end of the first week of classes.

Evening Computer Labs

• SC 283
• help with SAS for multiple Stat courses
• Hours 7pm-9pm Wednesdays
• starting second week of classes
• staffed with graduate student TA

I will often give examples from SAS in class. The programs used in lecture (and any other programs you should need) will be available for you to download from the website.

I will usually have to edit the output somewhat to get it to fit on the page of notes. You should run the SAS programs yourself to see the real output and experiment with changing the commands to learn how they work. Let me know if you get confused about what is input, output, or my comments.

I will tell you the names of all SAS files I use in these notes. If the notes differ from the SAS file, take the SAS file to be correct, since there may be cut-and-paste errors.

There is a tutorial in SAS to help you get started: Help → Getting Started with SAS Software

You should spend some time before next week getting comfortable with SAS. For today, don’t worry about the detailed syntax of the commands. Just try to get a sense of what is going on.

Example (Price Analysis for Diamond Rings in Singapore)

• Variables
  - response variable: price in Singapore dollars (Y)
  - explanatory variable: weight of diamond in carats (X)

• Goals
  - Create a scatterplot
  - Fit a regression line
  - Predict the price of a sale for a 0.43 carat diamond ring

SAS Data Step

File diamond.sas on website.

One way to input data in SAS is to type or paste it in. In this case, we have a sequence of ordered pairs (weight, price).

data diamonds;
  input weight price @@;
  cards;
.17 355 .16 328 .17 350 .18 325 .25 642 .16 342 .15 322 .19 485
.21 483 .15 323 .18 462 .28 823 .16 336 .20 498 .23 595 .29 860
.12 223 .26 663 .25 750 .27 720 .18 468 .16 345 .17 352 .16 332
.17 353 .18 438 .17 318 .18 419 .17 346 .15 315 .17 350 .32 918
.32 919 .15 298 .16 339 .16 338 .23 595 .23 553 .17 345 .33 945
.25 655 .35 1086 .18 443 .25 678 .25 675 .15 287 .26 693 .15 316
.43 .
;
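As a minimal sketch of what the trailing @@ on the input statement does, the toy data step below reads several (x, y) pairs from each data line. The data set name toy and its values are made up for illustration and are not part of the course files. Without @@, SAS would move to a new data line for every observation and read only the first pair on each line. A single period stands for a missing value, as in the last pair of the diamond data.

```sas
/* Hypothetical toy example; names and values are illustrative only */
data toy;
  input x y @@;   /* @@ holds the current line so more pairs can be read from it */
  cards;
1 10 2 20 3 30
4 40 5 .
;
run;

/* List the result: five observations, with y missing for the last one */
proc print data=toy;
run;
```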


data diamonds1;
  set diamonds;
  if price ne .;

• Syntax Notes
  - Each line must end with a semi-colon.
  - There is no output from this statement, but information does appear in the log window.

Often you will obtain data from an existing SAS file or import it from another file, such as a spreadsheet. Examples showing how to do this will come later.
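A statement of the form "if condition;" is a subsetting IF: only observations for which the condition is true are written to the new data set. One way to confirm that the row with the missing price was dropped is to compare counts before and after. The sketch below uses proc means with the n and nmiss statistics, which is only one of several ways to check this; the log also reports the number of observations in each data set.

```sas
/* Count non-missing (N) and missing (NMISS) values before the subsetting IF */
proc means data=diamonds n nmiss;
  var weight price;
run;

/* After the subsetting IF, price should have no missing values */
proc means data=diamonds1 n nmiss;
  var weight price;
run;
```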


SAS proc print

Now we want to see what the data look like.

proc print data=diamonds;
run;

Obs    weight    price
  1     0.17      355
  2     0.16      328
  3     0.17      350
...
 47     0.26      693
 48     0.15      316
 49     0.43        .

We want to plot the data as a scatterplot, using circles to represent data points and adding a curve to see if it looks linear. The symbol statement “v = circle” (v stands for “value”) lets us do this. The symbol statement “i = sm” will add a smooth line using splines (interpolation = smooth). These are options which stay on until you turn them off. In order for the smoothing to work properly, we need to sort the data by the X variable.

proc sort data=diamonds1; by weight;

symbol1 v=circle i=sm70;
title1 'Diamond Ring Price Study';
title2 'Scatter plot of Price vs. Weight with Smoothing Curve';
axis1 label=('Weight (Carats)');
axis2 label=(angle=90 'Price (Singapore $$)');
proc gplot data=diamonds1;
  plot price*weight / haxis=axis1 vaxis=axis2;
run;
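Because symbol, title, and axis definitions stay in effect for later graphs, it can help to clear them explicitly before trying something different. The sketch below assumes the diamonds1 data set and the axis1 and axis2 definitions already exist. The smoothing value 30 is an arbitrary choice for comparison with sm70; the number after sm controls how smooth the spline is, with larger values giving a smoother curve.

```sas
/* Cancel all symbol definitions made so far (axis and title settings are kept) */
goptions reset=symbol;

/* Redefine symbol1 with less smoothing (30 is an arbitrary illustrative value) */
symbol1 v=circle i=sm30;
proc gplot data=diamonds1;
  plot price*weight / haxis=axis1 vaxis=axis2;
run;
```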


Now we want to use the simple linear regression to fit a line through the data. We use the symbol option “i = rl”, meaning “interpolation = regression line” (that’s an “L”, not a one).

symbol1 v=circle i=rl;
title2 'Scatter plot of Price vs. Weight with Regression Line';
proc gplot data=diamonds1;
  plot price*weight / haxis=axis1 vaxis=axis2;
run;

We use proc reg (regression) to estimate a regression line and calculate predictors and residuals from the straight line. We tell it what the data are, what the model is, and what options we want.

proc reg data=diamonds;
  model price=weight/clb p r;
  output out=diag p=pred r=resid;
  id weight;
run;
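One of the goals was to predict the price of a 0.43 carat ring. Because proc reg is run on diamonds, which still contains the 0.43 row with a missing price, that row is not used to fit the line, but the output data set diag should still hold a predicted value for it. A minimal sketch for pulling out that row, using the variable names pred and resid defined in the output statement:

```sas
/* Show the predicted price for the observation whose price is missing */
proc print data=diag;
  where price = .;
  var weight price pred resid;
run;
```

The residual for this row is missing, since there is no observed price to compare with the prediction.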


Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value
Model
Error
Corrected Total

Root MSE                 R-Square
Dependent Mean           Adj R-Sq
Coeff Var

Parameter Estimates

Variable     DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept                                                             <.0001
weight                                                                <.0001
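The entries in this output are tied together by a few standard identities. Writing n for the number of observations used in the fit and using the SAS row labels as subscripts (this notation is chosen here for illustration and is not taken from the slides), the relationships for simple linear regression can be sketched as:

```latex
\begin{align*}
SS_{\text{Total}} &= SS_{\text{Model}} + SS_{\text{Error}} \\
df_{\text{Model}} &= 1, \qquad df_{\text{Error}} = n - 2, \qquad df_{\text{Total}} = n - 1 \\
MS_{\text{Model}} &= SS_{\text{Model}} / df_{\text{Model}}, \qquad
MS_{\text{Error}} = SS_{\text{Error}} / df_{\text{Error}} \\
F &= MS_{\text{Model}} / MS_{\text{Error}} \\
\text{Root MSE} &= \sqrt{MS_{\text{Error}}} \\
R^2 &= SS_{\text{Model}} / SS_{\text{Total}} \\
t &= \text{Parameter Estimate} / \text{Standard Error}
\end{align*}
```

In simple linear regression the F value equals the square of the t value for the slope, which gives a quick consistency check on the two tables.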
