Computer Applications in IE
Introduction to Descriptive Statistics
Assoc. Prof. Ho Thanh Phong HCMC University of Technology
Contents
OUTLINES
- Introduction to Descriptive Statistics
- Sample and Population
- Grouped Data and the Histogram
- Percentiles and Quartiles
- Measures of Central Tendency
- Measures of Variability
- Mean and Standard Deviation
- Data displaying
- Exploratory Data Analysis
Samples and Populations
Sample statistics: $\bar{X}$, $s^2$, $\hat{p}$
Dept. of Industrial & Systems Engineering
A population consists of the set of all measurements in which the investigator is
interested.
A sample is a subset of the measurements selected from the population.
A census is a complete enumeration of every item in a population.
Population (N): parameters $\mu$, $\sigma^2$, $p$
Sample (n)
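Why Sample?
❑ A census of a population may be impossible, impractical, or too costly
❑ A sample is used to estimate the population parameters
[Diagram: sampling goes from the population to the sample; estimation goes from the sample back to the population parameters]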
TERMINOLOGY (English: Vietnamese)
Descriptive Statistics: thống kê mô tả
Inferential Statistics: thống kê suy luận
Population: quần thể
Sample: mẫu
Census: điều tra tổng thể
Two Types of Data
Qualitative (Categorical, Nominal or Non-metric):
Examples:
❑ Color
❑ Gender
❑ Nationality
Quantitative (Measurable, Countable or Metric):
Examples:
❑ Temperatures
❑ Salaries
❑ Number of points scored on a 100-point exam
Scales of Measurement
Nominal Scale - groups or classes
❑ Gender
Ordinal Scale - order matters
❑ Ranks
Interval Scale - difference or distance matters
❑ Temperatures
Ratio Scale - Ratio matters
❑ Salaries
Grouped Data and the Histogram
Dividing data into groups or classes or intervals
Groups should be:
Mutually exclusive
❑ Not overlapping - every observation is assigned to only
one group
Exhaustive
❑ Every observation is assigned to a group
Equal-width (if possible)
❑ First or last group may be open-ended
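The grouping rules above can be sketched in code. This is a minimal illustration, assuming classes of the form "a to less than b" (half-open intervals), which makes them mutually exclusive by construction. The boundaries in `edges`, the helper `class_index`, and the observations are all hypothetical and chosen only to show the rule.

```python
from bisect import bisect_right

# Hypothetical equal-width class boundaries: [1, 3), [3, 5), ..., [11, 13)
edges = [1, 3, 5, 7, 9, 11, 13]

def class_index(x, edges):
    """Return the index i of the class [edges[i], edges[i+1]) that contains x."""
    if not edges[0] <= x < edges[-1]:
        # An observation outside every class means the grouping is not exhaustive.
        raise ValueError(f"{x} falls outside all classes")
    # bisect_right counts the edges that are <= x; subtracting 1 gives the class.
    return bisect_right(edges, x) - 1

# Hypothetical observations; a value on a boundary (3.0) goes to exactly one class.
observations = [1.0, 2.9, 3.0, 6.5, 12.9]
counts = [0] * (len(edges) - 1)
for x in observations:
    counts[class_index(x, edges)] += 1

print(counts)
```

Because each observation maps to exactly one index, the class counts always add up to the number of observations.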
Frequency Distribution
❑ Class midpoint is the middle value of a group, class, or interval
❑ Relative frequency is the proportion of total observations in each class
▪ Sum of relative frequencies = 1
❑ Cumulative frequency is a running total of frequencies through the classes
Example
Class                 Midpoint  Frequency  Relative   Cumulative  Cumulative
                                           Frequency  Frequency   Relative Freq.
1 to less than 3          2        16       0.40         16         0.40
3 to less than 5          4         2       0.05         18         0.45
5 to less than 7          6         4       0.10         22         0.55
7 to less than 9          8         3       0.075        25         0.625
9 to less than 11        10         9       0.225        34         0.85
11 to less than 13       12         6       0.150        40         1.00
Total                              40       1.00
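The columns of the example table can be recomputed from the class limits and frequencies alone. The sketch below assumes only those two inputs; the variable names are illustrative and not part of the original slides.

```python
from itertools import accumulate

# Class limits ("lo to less than hi") and frequencies from the example table.
classes = [(1, 3), (3, 5), (5, 7), (7, 9), (9, 11), (11, 13)]
frequencies = [16, 2, 4, 3, 9, 6]

total = sum(frequencies)                                # number of observations
midpoints = [(lo + hi) / 2 for lo, hi in classes]       # middle of each class
relative = [f / total for f in frequencies]             # proportion per class
cumulative = list(accumulate(frequencies))              # running total
cumulative_relative = [c / total for c in cumulative]   # running proportion

for (lo, hi), m, f, r, c, cr in zip(
    classes, midpoints, frequencies, relative, cumulative, cumulative_relative
):
    print(f"{lo:>2} to less than {hi:<2}  {m:>4.0f}  {f:>3}  {r:.3f}  {c:>3}  {cr:.3f}")
```

The last cumulative frequency equals the total number of observations, and the last cumulative relative frequency equals 1.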
Histogram
A histogram is a chart made of bars of different heights.
❑ Widths and locations of bars correspond to widths and locations
of data groupings
❑ Heights of bars correspond to frequencies or relative frequencies
of data groupings
[Figure: histogram of the example data; x-axis: Numbers (class midpoints 2, 4, 6, 8, 10, 12); y-axis: Frequency]
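A rough text version of such a histogram can be produced from the class midpoints and the frequencies of the frequency distribution example (16, 2, 4, 3, 9, 6). This is only a sketch: the helper `text_histogram` is hypothetical, and the bars are drawn horizontally, so bar length plays the role of bar height.

```python
# Midpoints and frequencies from the frequency distribution example.
midpoints = [2, 4, 6, 8, 10, 12]
frequencies = [16, 2, 4, 3, 9, 6]

def text_histogram(midpoints, frequencies, symbol="#"):
    """Return one text line per class; bar length equals the class frequency."""
    lines = []
    for m, f in zip(midpoints, frequencies):
        lines.append(f"{m:>2} | {symbol * f} {f}")
    return lines

bars = text_histogram(midpoints, frequencies)
print("\n".join(bars))
```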
TERMINOLOGY (English: Vietnamese)
Histogram: biểu đồ cột
Percentiles: bách phân vị
Ascending array: dãy tăng dần
Whole number: số nguyên
Examples
A large department store collects data on sales made by each of its salespeople. The
number of sales made on a given day by each of 22 salespeople is shown below, together
with the same data sorted in ascending order. n = 22
Sales          9  6 12 10 13 15 16 14 14 16 17 16 24 21 22 18 19 18 20 17 27 29
Sorted Sales   6  9 10 12 13 14 14 15 16 16 16 17 17 18 18 19 20 21 22 24 27 29
Order          1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
Find the 50th, 80th, and 90th percentiles of this data set.
❑ To find the 50th percentile, determine the data point in position nP/100 = (22)(50/100) = 11. Because 11 is a whole number, the 50th percentile is the average of the 11th and 12th values: (16 + 17)/2 = 16.5.
❑ To find the 80th percentile, the position is nP/100 = (22)(80/100) = 17.6. Because 17.6 is not a whole number, round up: the 80th percentile is the 18th value, 21.
❑ To find the 90th percentile, the position is nP/100 = (22)(90/100) = 19.8. Rounding up again, the 90th percentile is the 20th value, 24.
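The position rule used in this example can be written as a short function. This is a sketch of the rule as applied above (average two neighbours when nP/100 is a whole number, otherwise round the position up), assuming a whole-number P strictly between 0 and 100. The name `percentile` is illustrative, and other percentile conventions exist and can give slightly different values.

```python
def percentile(data, p):
    """P-th percentile by the position rule n*P/100 (p a whole number, 0 < p < 100)."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(data)          # ascending array
    n = len(ordered)
    # Integer arithmetic avoids floating-point trouble when testing for a whole number.
    k, remainder = divmod(n * p, 100)
    if remainder == 0:
        # Whole-number position k: average the k-th and (k+1)-th values (1-based).
        return (ordered[k - 1] + ordered[k]) / 2
    # Otherwise round the position up to k+1; its 0-based index is k.
    return ordered[k]

# Sales data from the example (n = 22).
sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16, 17,
         16, 24, 21, 22, 18, 19, 18, 20, 17, 27, 29]

for p in (50, 80, 90):
    print(p, percentile(sales, p))
```

For the sales data this reproduces the values found by hand: 16.5, 21, and 24.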
Interquartile Range
❑ The first quartile (25th percentile) is often called the
lower quartile.
❑ The second quartile (50th percentile) is often called the
median or the middle quartile.
❑ The third quartile (75th percentile) is often called the
upper quartile.
❑ The interquartile range (IQR) is the difference between the third
and the first quartiles: IQR = Q3 - Q1.
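As a sketch, the quartiles of the sales data from the percentile example can be computed by treating them as the 25th, 50th, and 75th percentiles. Applying the same n*P/100 position rule to the quartiles is an assumption made here for consistency; the helper name `percentile` is illustrative, and other quartile conventions can give slightly different values.

```python
def percentile(data, p):
    """P-th percentile by the position rule n*P/100 (p a whole number, 0 < p < 100)."""
    ordered = sorted(data)
    k, remainder = divmod(len(ordered) * p, 100)
    if remainder == 0:
        # Whole-number position: average the k-th and (k+1)-th values (1-based).
        return (ordered[k - 1] + ordered[k]) / 2
    # Fractional position: round up to the (k+1)-th value.
    return ordered[k]

# Sales data from the percentile example (n = 22).
sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16, 17,
         16, 24, 21, 22, 18, 19, 18, 20, 17, 27, 29]

q1 = percentile(sales, 25)   # lower quartile
q2 = percentile(sales, 50)   # median
q3 = percentile(sales, 75)   # upper quartile
iqr = q3 - q1                # interquartile range

print(q1, q2, q3, iqr)
```

Under this rule the positions for Q1 and Q3 are 5.5 and 16.5, so they round up to the 6th and 17th sorted values.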
TERMINOLOGY (English: Vietnamese)
Quartiles: tứ phân vị
Median: trung vị
Interquartile Range: khoảng liên tứ phân