Simple Linear Regression, Slides of Statistics

Georgia Institute of Technology - Main Campus Statistics

Foundational concepts. Model specification: what is SLR? y = β₀ + β₁x + ε. Response vs. predictor variable. What does the line represent? (The conditional mean.) Slope interpretation. Intercept interpretation.

Typology: Slides

2024/2025

Uploaded on 06/26/2026

az-fin 🇺🇸


Regression Analysis

Simple Linear Regression & ANOVA

Nicoleta Serban, Ph.D.

Professor

Simple Linear Regression: Assumptions, Diagnostics and Model Performance

School of Industrial and Systems Engineering

About This Lesson

Learning Objectives:

• Examine diagnostics to evaluate the model assumptions and to identify outliers

• Differentiate between goodness-of-fit and linear model performance


Simple Linear Regression: Model

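Data: $\{(x_1, y_1), \ldots, (x_n, y_n)\}$

Model: $Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, $i = 1, \ldots, n$

Assumptions:

• Linearity/Mean Zero Assumption: $E(\varepsilon_i) = 0$

• Constant Variance Assumption: $\mathrm{Var}(\varepsilon_i) = \sigma^2$

• Independence Assumption: $\{\varepsilon_1, \ldots, \varepsilon_n\}$ are independent random variables

• (Later we assume $\varepsilon_i \sim$ Normal)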
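As a rough illustration of the model and its assumptions, the sketch below simulates data with independent, mean-zero, constant-variance normal errors and recovers the coefficients by least squares. The parameter values `BETA0`, `BETA1`, `SIGMA` and the helper name `fit_slr` are hypothetical choices for this example only. Python is used here for illustration.

```python
import random

def fit_slr(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1 * x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical "true" parameters, chosen only for illustration.
BETA0, BETA1, SIGMA = 2.0, 0.5, 1.0

rng = random.Random(0)
x = [i / 10 for i in range(200)]
# Errors are independent draws with mean 0 and constant variance SIGMA**2.
y = [BETA0 + BETA1 * xi + rng.gauss(0.0, SIGMA) for xi in x]

b0_hat, b1_hat = fit_slr(x, y)
print(f"intercept estimate: {b0_hat:.3f}, slope estimate: {b1_hat:.3f}")
```

With 200 points the estimates should land close to the hypothetical true values; how close depends on `SIGMA` and the spread of the x values.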

Residual Analysis

Residual Values: $\hat{\varepsilon}_i = y_i - \hat{y}_i = y_i - (\hat{\beta}_0 + \hat{\beta}_1 x_i)$, $i = 1, \ldots, n$

Graphical display: Plot of the residuals $\hat{\varepsilon}_i$

If the scatter of $\hat{\varepsilon}_i$ is not random around the zero line, it could be that

• The relationship between X and Y is not linear

• Variances of error terms are not equal

• Response data are not independent

→ Goodness-of-fit (GOF): Use diagnostics to evaluate assumptions.
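A small sketch of what a non-random residual pattern can look like: a straight line is fitted to toy data whose true relationship is quadratic, and the residuals come out positive at both ends and negative in the middle. The data and the helper names `fit_slr` and `residuals` are made up for this illustration.

```python
def fit_slr(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1 * x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

def residuals(x, y):
    """Residuals y_i - (b0 + b1 * x_i) from the least-squares line."""
    b0, b1 = fit_slr(x, y)
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Toy data: the true relationship is quadratic, so linearity is violated.
x = [float(i) for i in range(-5, 6)]
y = [xi ** 2 for xi in x]

res = residuals(x, y)
for xi, ri in zip(x, res):
    print(f"x = {xi:5.1f}   residual = {ri:7.2f}")
```

The residuals still sum to zero, as they always do for a least-squares line with an intercept, so the average alone cannot reveal the problem. The systematic curve in the plot is what signals the violation.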

Using residual analysis, we check for uncorrelated errors but not independence. Independence is a more complicated matter. If the data are from a randomized trial, then independence is established, but most data are from observational studies.
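One simple way to quantify correlation among residuals is the lag-1 autocorrelation of the residuals taken in observation order. This particular statistic is an assumption of this sketch, not something the slides prescribe, and the function name `lag1_autocorrelation` and the toy residual sequences are hypothetical.

```python
import random

def lag1_autocorrelation(residuals):
    """Correlation between each residual and the next one in sequence."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    num = sum(dev[t] * dev[t + 1] for t in range(n - 1))
    den = sum(d * d for d in dev)
    return num / den

# Toy residuals arranged in clusters of same-signed values.
clustered = [1.0] * 5 + [-1.0] * 5 + [1.0] * 5 + [-1.0] * 5

# Toy residuals drawn independently, for comparison.
rng = random.Random(2)
independent = [rng.gauss(0.0, 1.0) for _ in range(500)]

print(f"clustered:   {lag1_autocorrelation(clustered):.2f}")
print(f"independent: {lag1_autocorrelation(independent):.2f}")
```

A value far from zero points to correlated errors. A value near zero is consistent with uncorrelated errors but, in line with the point above, does not establish independence.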

Checking Assumptions: Residual Analysis

Independence Assumption (residual plot example): there are clusters of residuals, so the independence assumption does not hold.

Checking the Assumption of Normality

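One way to check this assumption in a regression is using a Normal Probability Plot:

x-axis: theoretical normal quantiles $F^{-1}(p_k)$, where $p_k$ is a plotting position computed from $k$ and $n$ with the offsets 3/8 and 1/4

y-axis: the residuals $\hat{\varepsilon}_i$

$k$ = rank of $\hat{\varepsilon}_i$ (between 1 and $n$)

$F$ = CDF of the Normal Distribution

• Let the R statistical software do this for you!

• A straight line in the normal probability plot suggests that the assumption of normality is reasonable

• Curvature (especially at the ends) shows non-normality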
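The coordinates of a normal probability plot can be sketched with the standard library alone. The plotting position $(k - 3/8)/(n + 1/4)$ used here is an assumption (a common choice), not necessarily the slide's exact formula. The function names and the simulated residuals are hypothetical. The correlation between the two coordinates is used only as a rough numerical stand-in for "how straight the plot looks".

```python
import random
from statistics import NormalDist

def normal_plot_points(residuals):
    """Return (theoretical normal quantile, ordered residual) pairs."""
    n = len(residuals)
    std_normal = NormalDist()  # standard normal; inv_cdf plays the role of F^{-1}
    points = []
    for k, r in enumerate(sorted(residuals), start=1):
        p = (k - 3 / 8) / (n + 1 / 4)  # ASSUMED plotting position for rank k
        points.append((std_normal.inv_cdf(p), r))
    return points

def correlation(pairs):
    """Pearson correlation of the two coordinates in a list of pairs."""
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)
normal_res = [rng.gauss(0.0, 1.0) for _ in range(200)]         # normal errors
skewed_res = [rng.expovariate(1.0) - 1.0 for _ in range(200)]  # skewed errors

corr_normal = correlation(normal_plot_points(normal_res))
corr_skewed = correlation(normal_plot_points(skewed_res))
print(f"normal residuals: {corr_normal:.3f}")
print(f"skewed residuals: {corr_skewed:.3f}")
```

The points for the normal residuals should fall close to a straight line (correlation near 1), while the skewed residuals bend away from it at the ends.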

Assumption of Normality: Examples

Checking the Assumption of Normality

A complementary approach to checking the normality assumption is to plot a histogram of the residuals.

Normality Assumption: The residuals should have an approximately symmetric, unimodal distribution with no gaps in the data.
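A text-only sketch of the histogram check: residuals are binned into equal-width intervals and the counts are inspected for symmetry and empty interior bins. The helper names, the number of bins and the toy residuals are hypothetical, and the code assumes the residuals are not all identical (so the bin width is positive).

```python
def histogram_counts(values, n_bins):
    """Counts of values in n_bins equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins  # assumes hi > lo
    counts = [0] * n_bins
    for v in values:
        # The maximum value is placed in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts

def has_interior_gap(counts):
    """True if some bin strictly inside the range is empty."""
    return any(c == 0 for c in counts[1:-1])

# Toy residuals, made up for illustration.
residuals = [-2.0, -1.0, -1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0]
counts = histogram_counts(residuals, 5)

for i, c in enumerate(counts):
    print(f"bin {i}: {'#' * c}")
print("symmetric counts:", counts == counts[::-1])
print("interior gap:", has_interior_gap(counts))
```

For these toy values the counts rise to a single peak in the middle and fall off evenly, which is the shape the normality assumption calls for.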

Outliers in Regression

A data point far from the majority of the data (in y and/or x) may be called an outlier, especially if it does not follow the general trend of the rest of the data.

• Data points that are far from the mean of the x’s are called leverage points.

• A data point that is far from the mean of the x’s and/or the y’s is an influential point if it influences the fit of the regression.

• An outlier may or may not impact the regression fit significantly; thus it may or may not be an influential point.

The upshot: Sometimes there are good reasons for excluding subsets of the data (there were errors in the data entry; there were errors in the experiment). Sometimes the outlier belongs in the data. Outliers should always be examined.
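The distinction between leverage and influence can be sketched numerically. In simple linear regression the leverage of point $i$ is $h_{ii} = 1/n + (x_i - \bar{x})^2 / \sum_j (x_j - \bar{x})^2$, and influence can be gauged by refitting without the point. The toy data and the helper names `fit_slr` and `leverages` are made up for this illustration.

```python
def fit_slr(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1 * x + error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

def leverages(x):
    """Leverage h_ii of each point: large when x_i is far from the mean of x."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

# Toy data: five points on the line y = x, plus one point far out in x
# that does not follow the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 20.0]
y = [1.0, 2.0, 3.0, 4.0, 5.0, 5.0]

h = leverages(x)
_, slope_all = fit_slr(x, y)
_, slope_without = fit_slr(x[:-1], y[:-1])

print("leverages:", [round(v, 3) for v in h])
print(f"slope with the last point:    {slope_all:.3f}")
print(f"slope without the last point: {slope_without:.3f}")
```

The last point has by far the largest leverage, and dropping it changes the slope substantially, so in this toy example it is both a leverage point and an influential point.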

Checking for Outliers

Look at the standardized residuals, that is, each residual divided by an estimate of its standard deviation.

Compare the standardized residuals to the −2 to +2 band (or −1 to +1):

• Standardized residuals larger than 1 in absolute value are large.

• Standardized residuals larger than 2 in absolute value are extremely large.

Most statistics packages will calculate these automatically.
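A minimal sketch of the ±2 band check, assuming one common definition of the standardized residual, $r_i = \hat{\varepsilon}_i / (\hat{\sigma} \sqrt{1 - h_{ii}})$ with $\hat{\sigma}^2 = \mathrm{SSE}/(n - 2)$. That definition is an assumption of this example and may differ from the one on the slide. The function name and the toy data are hypothetical.

```python
def standardized_residuals(x, y):
    """Residuals divided by their estimated standard deviations (assumed form)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    res = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Estimate of the error standard deviation, with n - 2 degrees of freedom.
    sigma_hat = (sum(r * r for r in res) / (n - 2)) ** 0.5
    # Leverage of each point in simple linear regression.
    lev = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    return [r / (sigma_hat * (1 - h) ** 0.5) for r, h in zip(res, lev)]

# Toy data: points exactly on y = 1 + 2x, except one shifted upward.
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * xi for xi in x]
y[5] += 8.0

std_res = standardized_residuals(x, y)
flagged = [i for i, r in enumerate(std_res) if abs(r) > 2]
print("standardized residuals:", [round(r, 2) for r in std_res])
print("outside the -2 to +2 band:", flagged)
```

Only the shifted observation falls outside the band; all other standardized residuals stay well inside −1 to +1.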

Effect of Outliers: Examples

Fitted lines from the example plots (figures not reproduced):

• y = 2.6 + 0.07x

• y = 1.34 + 0.13x

• y = 1.29 + 0.17x

• y = 4.9 − 0.07x

Coefficient of Determination

A statistic that efficiently summarizes how well the X’s can be used to predict Y is the R-square:

$R^2 = 1 - \mathrm{SSE} / \mathrm{SST}$

where

$\mathrm{SSE} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$

$\mathrm{SST} = \sum_{i=1}^{n} (y_i - \bar{y})^2$

which is interpreted as:

$R^2$ = Proportion of total variability in Y that can be explained by the regression (that uses X)
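The computation of $R^2$ from SSE and SST can be sketched on a tiny data set. The data values and the function name `r_squared` are made up for this illustration.

```python
def r_squared(x, y):
    """Return (SSE, SST, R^2) for the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained variability
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total variability
    return sse, sst, 1 - sse / sst

# Toy data, made up for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

sse, sst, r2 = r_squared(x, y)
print(f"SSE = {sse:.2f}, SST = {sst:.2f}, R^2 = {r2:.3f}")
# prints: SSE = 2.40, SST = 6.00, R^2 = 0.600
```

Here the fitted line is y = 2.2 + 0.6x, so 60% of the total variability in y is explained by the regression on x.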
