Descriptive Statistics - Basic Statistics for Sociology - Lecture Slides, Slides of Statistics for Psychologists

Descriptive Statistics, Measures of Central Tendency, Category or Interval, Highest Frequency, Omit Formula, Protestant, Largest Category, Array Data, Middle Numbers, Exact Median are the important key points of lecture slides of Basic Statistics for Sociology.

Typology: Slides

2012/2013

Uploaded on 01/05/2013

gajendera
Descriptive Statistics
Healey Chapters 3 and 4
(2nd Cdn Ch. 3)
Measures of Central Tendency
And Dispersion

Measures of Central Tendency

  • 1. Mode = can be used for any kind of data, but it is the only measure of central tendency for nominal (qualitative) data.
  • Formula: the value that occurs most often, or the category or interval with the highest frequency.
  • Note: Omit Formula 3.1 Variation Ratio in Healey and Prus 2nd Cdn.
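
A minimal sketch of finding the mode by counting frequencies. The helper name `mode_with_count` is an illustration only. The test scores come from the raw-data example in these slides; the religion list is hypothetical data made up to show a nominal variable, where the mode is simply the largest category.

```python
from collections import Counter

def mode_with_count(values):
    """Return (modes, count): every value tied for the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    modes = [v for v, c in counts.items() if c == top]
    return modes, top

# Test scores from the raw-data example in these slides.
scores = [66, 89, 41, 98, 76, 77, 69, 60, 60, 66,
          69, 66, 98, 52, 74, 66, 89, 95, 66, 69]
score_modes, score_count = mode_with_count(scores)
print(score_modes, score_count)  # [66] 5

# Hypothetical nominal data (not from the slides).
religion = ["Protestant", "Catholic", "Protestant",
            "None", "Protestant", "Catholic"]
religion_modes, religion_count = mode_with_count(religion)
print(religion_modes, religion_count)  # ['Protestant'] 3
```

Returning a list allows for ties: a distribution can have more than one mode.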

Central Tendency (cont.)

  • 2. Median = exact centre or middle of ordered data. The 50th percentile.
  • Formula:
  • Array the data (put the scores in order).
  • When the sample size is even, the median falls halfway between the two middle numbers.
  • To calculate: find the (n/2)th and ((n/2)+1)th cases, add them, and divide the total by 2 to find the exact median.
  • When the sample size is odd, the median is the exact middle case, the ((n+1)/2)th case.

Example for Raw Data:

  • Suppose you have the following set of test scores:
  • 66, 89, 41, 98, 76, 77, 69, 60, 60, 66, 69, 66, 98, 52, 74, 66, 89, 95, 66, 69
    1. Array data:
  • 98 98 95 89 89 77 76 74 69 69 69 66 66 66 66 66 60 60 52 41
  • N = 20 (N is even)
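
The positional rule for the median can be sketched as follows, applied to the 20 test scores above. The function name `median_by_position` is hypothetical. The code sorts in ascending order rather than descending as on the slide; the two middle cases are the same either way.

```python
def median_by_position(values):
    """Median via the slide's rule: the middle case, or the average of the two middle cases."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 0:
        lower = ordered[n // 2 - 1]  # the (n/2)th case, counting from 1
        upper = ordered[n // 2]      # the ((n/2)+1)th case
        return (lower + upper) / 2
    return ordered[(n + 1) // 2 - 1]  # the ((n+1)/2)th case

scores = [66, 89, 41, 98, 76, 77, 69, 60, 60, 66,
          69, 66, 98, 52, 74, 66, 89, 95, 66, 69]
median_score = median_by_position(scores)
print(median_score)  # 69.0
```

With N = 20 the 10th and 11th cases are both 69, so the median is (69 + 69) / 2 = 69.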

Median for Aggregate (grouped) Data

  • This formula is shown in Healey 1st Cdn Edition and in Healey 8e but NOT in 2nd Cdn
  • We will NOT COVER this one!

Properties of median:

  • for numerical data at interval or ordinal level
  • "balance point"
  • not affected by outliers
  • appropriate when the distribution is highly skewed

Example for Mean

  • Formula: X̄ = ΣXi / N

X̄ = 1446 / 20 = 72.3

The mean for these test scores is 72.3.
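
As a check on the arithmetic, a short sketch that sums the 20 test scores from the raw-data example and divides by N. The variable names are illustrative only.

```python
scores = [66, 89, 41, 98, 76, 77, 69, 60, 60, 66,
          69, 66, 98, 52, 74, 66, 89, 95, 66, 69]

total = sum(scores)      # sum of all Xi
n = len(scores)          # N, the number of cases
mean_score = total / n   # mean = (sum of Xi) / N

print(total, n, mean_score)  # 1446 20 72.3
```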

Mean for Aggregate (Grouped) Data

(Note: 1st Cdn. Edition: use this formula! Omitted in 2nd Cdn. Ed. but covered in class.)

  • To calculate the mean for grouped data, you need a frequency table that includes a column for the midpoints (m) and a column for the product of the frequencies times the midpoints (fm).

Formula: X̄ = Σ(fm) / N

Calculating Mean for Grouped Data:

Formula: X̄ = Σ(fm) / N = 1420 / 20 = 71

The mean for the grouped data is 71.
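
The frequency table behind this calculation is not reproduced in the text. The sketch below rebuilds one under an assumption: the 20 test scores are grouped into class intervals of width 10 (40-49 through 90-99), with each midpoint taken as the average of the interval limits. That assumed grouping gives Σ(fm) = 1420, which matches the figure above, but the original table may have been laid out differently.

```python
from collections import Counter

scores = [66, 89, 41, 98, 76, 77, 69, 60, 60, 66,
          69, 66, 98, 52, 74, 66, 89, 95, 66, 69]

WIDTH = 10  # assumed interval width (not stated in the text)

# Frequency of scores per interval, keyed by the interval's lower limit.
freq = Counter((x // WIDTH) * WIDTH for x in scores)

sum_fm = 0.0
for lower in sorted(freq):
    upper = lower + WIDTH - 1
    midpoint = (lower + upper) / 2   # e.g. 40-49 has midpoint 44.5
    f = freq[lower]
    fm = f * midpoint                # frequency times midpoint
    sum_fm += fm
    print(f"{lower}-{upper}  m={midpoint}  f={f}  fm={fm}")

grouped_mean = sum_fm / len(scores)
print(sum_fm, grouped_mean)  # 1420.0 71.0
```

The grouped mean (71) differs slightly from the raw-data mean (72.3) because every score in an interval is treated as if it sat at the midpoint.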

Properties of the Mean:

  • only for numerical data at interval level
  • "balance point"
  • can be affected by outliers = skewed distribution
  • tail becomes elongated and the mean is pulled in direction of outlier

Example… no outlier: $30000, 30000, 35000, 25000, 30000, then mean = $30000. But if an outlier is present: $130000, 30000, 35000, 25000, 30000, then mean = $50000 (the mean is pulled up or down in the direction of the outlier).
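
A small sketch working through the two income lists in this example. It also computes the median for comparison, to illustrate the earlier point that the median is not affected by outliers. The variable names are illustrative only.

```python
from statistics import median

incomes = [30000, 30000, 35000, 25000, 30000]
incomes_with_outlier = [130000, 30000, 35000, 25000, 30000]

mean_plain = sum(incomes) / len(incomes)
mean_outlier = sum(incomes_with_outlier) / len(incomes_with_outlier)
print(mean_plain, mean_outlier)      # 30000.0 50000.0

median_plain = median(incomes)
median_outlier = median(incomes_with_outlier)
print(median_plain, median_outlier)  # 30000 30000
```

One high income of $130000 raises the mean by $20000, while the median stays at $30000.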

Measures of Dispersion

  • Describe how variable the data are.
  • i.e. how spread out around the mean
  • Also called measures of variation or variability

Variability for Non-numerical Data

(Nominal or Ordinal Level Data)

  • Measures of variability for non-numerical (nominal or ordinal) data are rarely used
  • We will not be covering these in class
  • Omit Formula 4.1 IQV in Healey and Prus 1st Canadian Edition and in Healey 8e
  • Omit Formula 3.1 Variation Ratio in Healey and Prus 2nd Canadian Edition

Interquartile Range (Q):

  • This is the difference between the 75th and the 25th percentiles (the middle 50%)
  • Gives better idea than range of what the middle of the distribution looks like.

Formula: Q = Q3 - Q1 (where Q3 = N x .75, and Q1 = N x .25)

Using above data: Q = Q3 - Q1 = (6th case - 2nd case) = $30000 - $25000 = $5000

The interquartile range (Q) is $5000.
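
The income data used for the calculation above is not reproduced in the text, so the sketch below applies the same positional rule to the 20 test scores from the raw-data example instead. It assumes the cases are ordered from lowest to highest and that N is divisible by 4, so that N x .25 and N x .75 are whole case numbers. The function name is hypothetical, and other textbooks and software packages use different quartile conventions that can give slightly different answers.

```python
def interquartile_range(values):
    """Return (Q1, Q3, Q) using case numbers N x .25 and N x .75 in ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 4 != 0:
        raise ValueError("this sketch assumes N is divisible by 4")
    q1 = ordered[n // 4 - 1]       # case number N x .25, counting from 1
    q3 = ordered[3 * n // 4 - 1]   # case number N x .75, counting from 1
    return q1, q3, q3 - q1

scores = [66, 89, 41, 98, 76, 77, 69, 60, 60, 66,
          69, 66, 98, 52, 74, 66, 89, 95, 66, 69]
q1, q3, q = interquartile_range(scores)
print(q1, q3, q)  # 66 77 11
```

For N = 20, Q1 is the 5th case (66) and Q3 is the 15th case (77), so Q = 11.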

3. Variance and Standard Deviation:

  • For raw data at the interval/ratio level.
  • Most common measure of variation.
  • The numerator in the formula is known as the sum of squares, and the denominator is either the population size N or, for a sample, n - 1.
  • The variance is denoted by S² and the standard deviation, which is the square root of the variance, by S.
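
A minimal sketch of the calculation, using the 20 test scores from the raw-data example. The formula itself is not shown in the text, so this assumes the standard definition: sum the squared deviations from the mean, divide by N for a population (or by n - 1 for a sample, the usual convention), and take the square root for the standard deviation. The variable names are illustrative only.

```python
import math

scores = [66, 89, 41, 98, 76, 77, 69, 60, 60, 66,
          69, 66, 98, 52, 74, 66, 89, 95, 66, 69]

n = len(scores)
mean_score = sum(scores) / n

# Numerator: the sum of squares (squared deviations from the mean).
sum_of_squares = sum((x - mean_score) ** 2 for x in scores)

pop_variance = sum_of_squares / n           # denominator N
sample_variance = sum_of_squares / (n - 1)  # denominator n - 1 (assumed convention)

pop_sd = math.sqrt(pop_variance)            # S = square root of the variance
sample_sd = math.sqrt(sample_variance)

print(round(sum_of_squares, 2), round(pop_variance, 2))  # 4358.2 217.91
print(round(pop_sd, 2), round(sample_sd, 2))
```

The sample version is always a little larger than the population version because it divides the same sum of squares by a smaller number.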