Lecture4.pdf, Lecture notes of Economic statistics

Rice University Economic statistics

Variance, standard deviation ... Range and Inter-quartile range are relatively easy to compute. ... Definition: The Sample Standard Deviation is defined by:.

Typology: Lecture notes

2022/2023

Uploaded on 02/28/2023

jannine 🇺🇸


Measure of Variability

(Dispersion, Spread)

•Variance, standard deviation

•Range

•Inter-Quartile Range

•Pseudo-standard deviation

[Figure: Variability]

Range

Definition

Let min = the smallest observation

Let max = the largest observation

Then Range = max - min

[Figure: Range]

Inter-Quartile Range (IQR)

Definition

Let Q1 = the first quartile,

Q3 = the third quartile

Then the Inter-Quartile Range = IQR = Q3 - Q1


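[Figure: Inter-Quartile Range. Q1 and Q3 are marked on the horizontal axis, with 25% of the observations below Q1, 50% between Q1 and Q3, and 25% above Q3.]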

Example

The data, Verbal IQ on n = 23 students, arranged in increasing order are:

80 82 84 86 86 89 90 94 94 95 95 96 99 99 102 102 104 105 105 109 111 118 119

min = 80, Q1 = 89, Q2 = 96, Q3 = 105, max = 119
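The summary values in this example can be checked with a short Python sketch. It assumes the quartiles are taken at position (n + 1)p in the sorted data, which is what the standard library's "exclusive" method does; that convention happens to reproduce the quartiles quoted above, but the slides do not say which rule was used.

```python
import statistics

# Verbal IQ scores for the n = 23 students, as listed in the example.
verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96,
             99, 99, 102, 102, 104, 105, 105, 109, 111, 118, 119]

# Range = max - min
data_range = max(verbal_iq) - min(verbal_iq)

# Quartiles at positions (n + 1) * p (assumed convention, see lead-in).
q1, q2, q3 = statistics.quantiles(verbal_iq, n=4, method="exclusive")

# IQR = Q3 - Q1
iqr = q3 - q1

print("range:", data_range)
print("Q1, Q2, Q3:", q1, q2, q3)
print("IQR:", iqr)
```

Other quartile conventions can give slightly different values for Q1 and Q3 on small samples, so the IQR may differ a little between software packages.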

The numbers

$$d_1 = x_1 - \bar{x},\quad d_2 = x_2 - \bar{x},\quad d_3 = x_3 - \bar{x},\quad \ldots,\quad d_n = x_n - \bar{x}$$

are called deviations from the mean.

The sum

$$\sum_{i=1}^{n} d_i^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

is called the sum of squares of deviations from the mean. Writing it out in full:

$$d_1^2 + d_2^2 + d_3^2 + \cdots + d_n^2 = (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2$$

The Sample Variance

is defined as the quantity

$$\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

and is denoted by the symbol $s^2$.

Example

Let $x_1, x_2, x_3, x_4, x_5$ denote the set of 5 numbers in the following table.

i     1    2    3    4    5
xi   10   15   21    7   13

Then

$$\sum_{i=1}^{5} x_i = x_1 + x_2 + x_3 + x_4 + x_5 = 10 + 15 + 21 + 7 + 13 = 66$$

and

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + x_3 + \cdots + x_{n-1} + x_n}{n} = \frac{66}{5} = 13.2$$

The deviations from the mean $d_1, d_2, d_3, d_4, d_5$ are given in the following table.

i      1     2     3     4     5
xi    10    15    21     7    13
di  -3.2   1.8   7.8  -6.2  -0.2
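As a rough check of this example, the sketch below recomputes the mean and the deviations, then carries the calculation through to the sample variance using the definition given earlier (sum of squared deviations divided by n - 1). The variable names are illustrative only, and the final variance and standard deviation values are computed here rather than quoted from the slides.

```python
import math
import statistics

x = [10, 15, 21, 7, 13]
n = len(x)

# Mean: sum of the observations divided by n.
mean = sum(x) / n

# Deviations from the mean, d_i = x_i - mean.
deviations = [xi - mean for xi in x]

# Sum of squares of deviations from the mean.
sum_sq = sum(d ** 2 for d in deviations)

# Sample variance s^2 and sample standard deviation s.
variance = sum_sq / (n - 1)
std_dev = math.sqrt(variance)

print("mean:", mean)
print("deviations:", [round(d, 1) for d in deviations])
print("sum of squares:", round(sum_sq, 2))
print("s^2:", round(variance, 2), " s:", round(std_dev, 3))

# Cross-check against the standard library's sample variance.
assert math.isclose(variance, statistics.variance(x))
```

Note that the deviations always sum to zero (up to rounding), which is why they are squared before being added.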

Interpretations of s

In Normal distributions
- Approximately 2/3 of the observations will lie within one standard deviation of the mean
- Approximately 95% of the observations lie within two standard deviations of the mean
- In a histogram of the Normal distribution, the standard deviation is approximately the distance from the mode to the inflection point

[Figure: histogram of a Normal distribution with the mode and the inflection point marked]

Example

A researcher collected data on 1500 males aged 60-65. The variables measured were cholesterol and blood pressure.

The mean blood pressure was 155 with a standard deviation of 12.
The mean cholesterol level was 230 with a standard deviation of 15.
In both cases the data were normally distributed.

Interpretation of these numbers

Blood pressure levels vary about the value 155 in males aged 60-65.
Cholesterol levels vary about the value 230 in males aged 60-65.
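The "2/3 within one standard deviation" and "95% within two standard deviations" rules can be applied to these numbers directly. The sketch below is one way to do that, assuming an exact Normal model with the stated means and standard deviations; the helper name `within` is hypothetical.

```python
from statistics import NormalDist

def within(mean, sd, k):
    """Return (low, high, probability) for the interval mean +/- k*sd
    under a Normal distribution with the given mean and sd."""
    dist = NormalDist(mu=mean, sigma=sd)
    low, high = mean - k * sd, mean + k * sd
    return low, high, dist.cdf(high) - dist.cdf(low)

for label, mean, sd in [("blood pressure", 155, 12), ("cholesterol", 230, 15)]:
    for k in (1, 2):
        low, high, prob = within(mean, sd, k)
        print(f"{label}: about {prob:.1%} of values between {low} and {high}")
```

Under this model roughly two thirds of the blood pressure readings fall between 143 and 167, and roughly 95% of the cholesterol levels fall between 200 and 260.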

The sum of squares of deviations from the mean can also be computed using the following identity:

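$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$

To use this identity we need to compute:

$$\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n \quad\text{and}\quad \sum_{i=1}^{n} x_i^2 = x_1^2 + x_2^2 + \cdots + x_n^2$$

Then:

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$

and

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$$

and

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}}$$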

Example

The data, Verbal IQ on n = 23 students, arranged in increasing order are:

80 82 84 86 86 89 90 94 94 95 95 96 99 99 102 102 104 105 105 109 111 118 119

$$\begin{aligned}
\sum_{i=1}^{n} x_i &= 80 + 82 + 84 + 86 + 86 + 89 \\
&\quad + 90 + 94 + 94 + 95 + 95 + 96 \\
&\quad + 99 + 99 + 102 + 102 + 104 \\
&\quad + 105 + 105 + 109 + 111 + 118 + 119 = 2244
\end{aligned}$$

$$\begin{aligned}
\sum_{i=1}^{n} x_i^2 &= 80^2 + 82^2 + 84^2 + 86^2 + 86^2 + 89^2 \\
&\quad + 90^2 + 94^2 + 94^2 + 95^2 + 95^2 + 96^2 \\
&\quad + 99^2 + 99^2 + 102^2 + 102^2 + 104^2 \\
&\quad + 105^2 + 105^2 + 109^2 + 111^2 + 118^2 + 119^2 = 221494
\end{aligned}$$
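The two sums in this example are all that the computing identity needs. The sketch below plugs them into the identity, confirms that the result agrees with the direct sum of squared deviations, and then takes the square root of the sample variance. Variable names are illustrative.

```python
import math

verbal_iq = [80, 82, 84, 86, 86, 89, 90, 94, 94, 95, 95, 96,
             99, 99, 102, 102, 104, 105, 105, 109, 111, 118, 119]
n = len(verbal_iq)

# The two sums required by the identity.
sum_x = sum(verbal_iq)
sum_x2 = sum(x * x for x in verbal_iq)

# Shortcut: sum of x^2 minus (sum of x)^2 / n.
ss_shortcut = sum_x2 - sum_x ** 2 / n

# Direct definition, for comparison: sum of (x - mean)^2.
mean = sum_x / n
ss_direct = sum((x - mean) ** 2 for x in verbal_iq)

# Sample variance and standard deviation from the shortcut.
variance = ss_shortcut / (n - 1)
s = math.sqrt(variance)

print("sum x:", sum_x, " sum x^2:", sum_x2)
print("sum of squares:", round(ss_shortcut, 4))
print("s:", round(s, 3))
```

The shortcut avoids computing each deviation separately, which matters for hand calculation. In floating-point arithmetic it can lose precision when the mean is very large relative to the spread, so the direct form is often preferred in software.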

A quick (rough) calculation of s

$$s \approx \frac{\text{Range}}{4}$$

The reason for this is that approximately all (95%) of the observations are between $\bar{x} - 2s$ and $\bar{x} + 2s$. Thus $\max \approx \bar{x} + 2s$ and $\min \approx \bar{x} - 2s$, and

$$\text{Range} = \max - \min \approx (\bar{x} + 2s) - (\bar{x} - 2s) = 4s.$$

Hence $s \approx \text{Range}/4$.

Example

Verbal IQ on n = 23 students: min = 80 and max = 119, so

$$s \approx \frac{119 - 80}{4} = \frac{39}{4} = 9.75$$

This compares with the exact value of s, which is 10.782. The rough method is useful for checking your calculation of s.
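The rough rule is simple enough to code as a sanity check. This sketch applies it to the Verbal IQ minimum and maximum and compares the result with the exact value quoted in the text; the function name `rough_sd` is hypothetical.

```python
def rough_sd(minimum, maximum):
    """Quick estimate of the standard deviation: Range / 4."""
    return (maximum - minimum) / 4

estimate = rough_sd(80, 119)
exact = 10.782  # exact value of s quoted in the text

print("rough s:", estimate)
print("difference from exact s:", round(exact - estimate, 3))
```

The estimate is only a plausibility check: a computed s that is several times larger or smaller than Range/4 usually signals an arithmetic error.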

The Pseudo Standard Deviation (PSD)

Definition: The Pseudo Standard Deviation (PSD) is defined by:

$$\text{PSD} = \frac{\text{IQR}}{1.35} = \frac{\text{Inter-Quartile Range}}{1.35}$$

Properties

For Normal distributions the magnitude of the pseudo standard deviation ( PSD ) and the standard deviation ( s ) will be approximately the same value
For leptokurtic distributions the standard deviation ( s ) will be larger than the pseudo standard deviation ( PSD )
For platykurtic distributions the standard deviation ( s ) will be smaller than the pseudo standard deviation ( PSD )

Example

Verbal IQ on n = 23 students

Inter-Quartile Range = IQR = Q3 - Q1 = 105 - 89 = 16

Pseudo standard deviation:

$$\text{PSD} = \frac{\text{IQR}}{1.35} = \frac{16}{1.35} = 11.85$$

This compares with the standard deviation

$$s = 10.782$$
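The sketch below reproduces the PSD calculation and also shows where a divisor of about 1.35 comes from: for a Normal distribution the quartiles sit roughly 0.674 standard deviations on either side of the mean, so the IQR is about 1.35 standard deviations. That explanation is standard Normal-distribution arithmetic rather than something stated in the slides.

```python
from statistics import NormalDist

# Quartiles from the Verbal IQ example.
q1, q3 = 89, 105
iqr = q3 - q1

# Pseudo standard deviation as defined in the text.
psd = iqr / 1.35

# IQR of a standard Normal distribution (sd = 1), for comparison with 1.35.
normal_iqr = NormalDist().inv_cdf(0.75) - NormalDist().inv_cdf(0.25)

print("IQR:", iqr)
print("PSD:", round(psd, 2))
print("IQR of a standard Normal:", round(normal_iqr, 3))
```

Because PSD depends only on the quartiles, it is not affected by a few extreme observations, which is why it can differ from s when the tails are heavy or light.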

An outlier is a "wild" observation in the data.
Outliers occur because of
- errors (typographical and computational)
- extreme cases in the population
We will now consider the drawing of box-plots where outliers are identified.

Observations that are between the lower and upper inner fences are considered to be non-outliers.
Observations that are outside the inner fences but not outside the outer fences are considered to be mild outliers.
Observations that are outside the outer fences are considered to be extreme outliers.
Mild outliers are plotted individually in a box-plot using a dedicated symbol.
Extreme outliers are plotted individually in a box-plot using a second, distinct symbol.
Non-outliers are represented with the box and whiskers, with
- Max = largest observation within the fences
- Min = smallest observation within the fences

[Figure: box-plot with outliers. Labels: inner fences, outer fence, mild outliers, extreme outlier, and the box-whisker plot representing the data that are not outliers.]
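The classification into non-outliers, mild outliers and extreme outliers can be sketched in code. The fence positions are not given in the surviving text, so this example assumes the common convention of inner fences at 1.5 × IQR and outer fences at 3 × IQR beyond the quartiles. The function name and the test values 135 and 160 are hypothetical.

```python
def classify(value, q1, q3, inner=1.5, outer=3.0):
    """Label a value relative to the fences.

    ASSUMPTION: inner fences at q1 - 1.5*IQR and q3 + 1.5*IQR,
    outer fences at q1 - 3*IQR and q3 + 3*IQR (a common convention,
    not confirmed by the source text).
    """
    iqr = q3 - q1
    if value < q1 - outer * iqr or value > q3 + outer * iqr:
        return "extreme outlier"
    if value < q1 - inner * iqr or value > q3 + inner * iqr:
        return "mild outlier"
    return "non-outlier"

# Quartiles from the Verbal IQ example: Q1 = 89, Q3 = 105, IQR = 16.
# Under the assumed multipliers the inner fences are 65 and 129,
# and the outer fences are 41 and 153.
for value in (80, 119, 135, 160):
    print(value, "->", classify(value, 89, 105))
```

With these assumed fences, every observation in the Verbal IQ data (80 to 119) is a non-outlier, so the whiskers would simply run to the minimum and maximum.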

Measures of Shape

Skewness
Kurtosis


Skewness – based on the sum of cubes

$$\sum_{i=1}^{n} (x_i - \bar{x})^3$$

Kurtosis – based on the sum of 4th powers

$$\sum_{i=1}^{n} (x_i - \bar{x})^4$$
