Data Classification and Presentation, Study notes of Introduction to Business Management

Virginia Polytechnic Institute and State University (Virginia Tech), Introduction to Business Management

An overview of data classification, focusing on nominal, ordinal, interval, and ratio data. It also discusses qualitative and quantitative data, cross-sectional and time series data, and the presentation of data through frequency distribution tables, absolute and relative frequency histograms, stem-and-leaf diagrams, and scatter diagrams.

Typology: Study notes

Pre 2010

Uploaded on 12/10/2008

vanilla-bear 🇺🇸

1 document


BIT 2405 Week 2

1. Classification of Data (Variables)

a. Nominal/Ordinal/Interval/Ratio

Nominal data classify observations into groups or categories (gender, true/false)

Ordinal data allow classifying, ranking, or ordering observations (plus the properties of Nominal)

Interval data allow meaningful statements about the differences between values, but have no true zero (SAT scores/temperature) (plus the properties of Nominal and Ordinal)

Ratio data have a meaningful zero and allow amounts to be compared (weights/heights/profits/time)

b. Qualitative/Quantitative

Qualitative data are categorical (nominal and ordinal) and cannot be measured on a numerical scale

Quantitative data can be recorded on a numerical scale (interval and ratio)

c. Cross-sectional Data are collected at the same point in time

d. Time Series Data are collected over several time periods
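One way to read the four levels of measurement is as a cumulative hierarchy, in which each level keeps the properties of the levels below it. The Python sketch below is only an illustration of that reading. The names `Level`, `MIN_LEVEL`, `permits`, and `kind` are hypothetical and do not come from the course notes.

```python
from enum import IntEnum


class Level(IntEnum):
    """Levels of measurement, ordered from weakest to strongest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4


# Lowest level at which each operation becomes meaningful (illustrative labels).
MIN_LEVEL = {
    "group": Level.NOMINAL,         # classify into categories
    "rank": Level.ORDINAL,          # order the observations
    "difference": Level.INTERVAL,   # compare differences between values
    "ratio": Level.RATIO,           # compare amounts (needs a meaningful zero)
}


def permits(level, operation):
    """Return True if data at `level` support `operation`."""
    return level >= MIN_LEVEL[operation]


def kind(level):
    """Nominal and ordinal are qualitative; interval and ratio are quantitative."""
    return "quantitative" if level >= Level.INTERVAL else "qualitative"


print(permits(Level.RATIO, "rank"))          # ratio data can also be ranked
print(permits(Level.ORDINAL, "difference"))  # ordinal data cannot
print(kind(Level.ORDINAL))
```

Using an ordered enum keeps the "plus Nominal and Ordinal" idea explicit: a single comparison decides whether an operation is available.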

Quantitative data can be further classified as continuous or discrete.

Continuous data

Discrete data

Summary

Exercises:

1. A supervisor must give a summary evaluation rating from among the following choices: 1) Poor 2) Fair 3) Good 4) Very Good 5) Excellent

Are these data qualitative or quantitative? Qualitative / Quantitative

Are these data discrete or continuous? Discrete / Continuous / Neither

What is the highest level of measurement the data possess? Nominal / Ordinal / Interval / Ratio

2. A company is evaluating customer satisfaction with one of their products. A survey of 400 persons is conducted. Each person is asked: “What is your level of satisfaction with the company’s products?” 1) Poor 2) Average 3) Good 4) Excellent

Are these data qualitative or quantitative? Qualitative / Quantitative

Are these data discrete or continuous? Discrete / Continuous / Neither

What is the highest level of measurement the data possess? Nominal / Ordinal / Interval / Ratio

3. The weight of 50 newborn babies at a local hospital.

Are these data qualitative or quantitative? Qualitative / Quantitative

Are these data discrete or continuous? Discrete / Continuous / Neither

What is the highest level of measurement the data possess? Nominal / Ordinal / Interval / Ratio


4. You want to order a pizza. There are four kinds of pizza: 1) Pepperoni 2) Mushroom 3) Black Olive 4) Sausage

Are these data qualitative or quantitative? Qualitative / Quantitative

Are these data discrete or continuous? Discrete / Continuous / Neither

What is the highest level of measurement the data possess? Nominal / Ordinal / Interval / Ratio

5. You toss a coin and record “head” as 0 and “tail” as 1.

Are these data qualitative or quantitative? Qualitative / Quantitative

Are these data discrete or continuous? Discrete / Continuous / Neither

What is the highest level of measurement the data possess? Nominal / Ordinal / Interval / Ratio

Discrete data: quantitative data that can take on only integer values
Presentation of Data

a. Graphical Presentation of Quantitative Information
i. Frequency Distribution Tables
ii. Histograms
   Absolute Frequency Histogram
   Relative Frequency Histogram
   Cumulative Frequency Histogram
iii. Stem-and-Leaf Diagrams
iv. Crosstabulations
v. Scatter Diagrams

b. Graphical Presentation of Qualitative Information
i. Frequency Distribution Tables
ii. Bar Charts
iii. Pie Charts

Brief review of Summation Notation

Numerical Measures of Location
a. Arithmetic Mean
b. Median
c. Mode
d. Weighted Average

Numerical Measures of Dispersion or Variability
a. Range
b. Mean (Average) Absolute Deviation
c. Variance
d. Standard Deviation

Classification of Data

What we can do with a data set (e.g., summarize, present, make inferences) depends on the type of

Suppose we try using 6 classes. The width of each interval would then be approximately:

Width = (12 - 2.4)/6 = 1.6 => Round up to a class width of 2

One form for our frequency table would then be:

Tuition Rates (in $000)     Number of Schools
2.0 but less than 4.0       13
4.0 but less than 6.0       24
6.0 but less than 8.0        9
8.0 but less than 10.0       8
10.0 but less than 12.0      5
12.0 but less than 14.0      1

A variation of the above absolute frequency table is to display the relative frequency of observations that fall in the specified intervals rather than the absolute frequencies. A relative frequency table has 3 or 4 columns:

Category
Frequency
Relative Frequency
Percent (optional)

For our data set one possible form for a relative frequency table is as follows (each proportion is the class count divided by the 60 schools):

Tuition Rates (in $000)     Proportion of Schools
2.0 but less than 4.0       0.2167
4.0 but less than 6.0       0.4000
6.0 but less than 8.0       0.1500
8.0 but less than 10.0      0.1333
10.0 but less than 12.0     0.0833
12.0 but less than 14.0     0.0167

Yet another variation is to display the cumulative frequency distribution, i.e., the number of observations that are less than the upper boundary of each class interval. For example, the Data Analysis routine in Excel provides us with the following output:

Upper Limit     Frequency     Cumulative %
2               13             21.67%
4               24             61.67%
6                9             76.67%
8                8             90.00%
10               5             98.33%
12               1            100.00%

Histograms

A histogram is simply a graphical display of a frequency distribution (table). There are a number of different forms of histograms. We will consider three types of histograms:

1. Absolute frequency histograms
2. Relative frequency histograms
3. Cumulative frequency histograms
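The class-width calculation and the three kinds of frequency table can be sketched in Python as follows. This is a minimal illustration that starts from the class counts given in the notes (13, 24, 9, 8, 5, 1). The function names `class_width` and `frequency_table` are hypothetical, and rounding the width up to the next whole number is an assumption that matches the "round up" step above.

```python
import math


def class_width(minimum, maximum, n_classes):
    """Approximate class width, rounded up to the next whole number."""
    return math.ceil((maximum - minimum) / n_classes)


def frequency_table(lower_bounds, width, counts):
    """Build absolute, relative, and cumulative-percent columns per class."""
    total = sum(counts)
    rows, running = [], 0
    for lower, count in zip(lower_bounds, counts):
        running += count
        rows.append({
            "class": f"{lower:.1f} but less than {lower + width:.1f}",
            "frequency": count,
            "relative": count / total,
            "cumulative_pct": 100 * running / total,
        })
    return rows


counts = [13, 24, 9, 8, 5, 1]          # schools per class, from the notes
width = class_width(2.4, 12.0, 6)      # (12 - 2.4) / 6 = 1.6, rounded up
table = frequency_table([2.0, 4.0, 6.0, 8.0, 10.0, 12.0], width, counts)

for row in table:
    print(f"{row['class']:<26}{row['frequency']:>4}"
          f"{row['relative']:>9.4f}{row['cumulative_pct']:>9.2f}%")
```

The cumulative percentages produced this way agree with the Excel output quoted in the notes (21.67%, 61.67%, and so on).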

Similar to constructing a frequency table, we have three major considerations:

1. Number of intervals
2. Interval width
3. Check that the intervals are exhaustive and mutually exclusive

Absolute frequency histogram: A graphical display of the information found in an absolute frequency table.

Note: When we examine a frequency distribution (either in tabular or graphical form) we are very much interested in two things:

Spread / dispersion

Stem-and-Leaf Diagrams

A stem-and-leaf diagram is a tool similar to a histogram. Like a histogram, it reflects frequencies, concentrations of data, and shapes.

Advantages:

Given a set of numbers, generally the first digit or two will be the ‘stem,’ the rest will be the ‘leaf.’

Example: Consider the tuition data for the 60 Texas schools presented in the ordered array. The resulting stem-and-leaf diagram is presented below:

Stem-and-leaf of Tuition; N = 60; Leaf Unit = 0.1

Freq  Stem  Leaf
  1     2   4
 12     3   455666899999
 19     4   1456778888999999999
  5     5   04889
  4     6   0446
  5     7   02479
  8     8   00033568
  0     9
  3    10   347
  2    11   06
  1    12   0

BE CAREFUL! Too few or too many stems give little descriptive information about the distribution of numbers.

Ordered array of the tuition data:

2.4, 3.4, 3.5, 3.5, 3.6, 3.6, 3.6, 3.8, 3.9, 3.9, 3.9, 3.9, 3.9, 4.1, 4.4, 4.5, 4.6, 4.7, 4.7, 4.8, 4.8, 4.8, 4.8, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 4.9, 5.0, 5.4, 5.8, 5.8, 5.9, 6.0, 6.4, 6.4, 6.6, 7.0, 7.2, 7.4, 7.7, 7.9, 8.0, 8.0, 8.0, 8.3, 8.3, 8.5, 8.6, 8.8, 10.3, 10.4, 10.7, 11.0, 11.6, 12.0

Crosstabulations

Restaurant  Quality    Price
1           Good       18
2           Very Good  22
3           Good       28
4           Excellent  38
5           Very Good  33
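The stem-and-leaf construction described above (leading digits as the stem, the final digit as the leaf) can be sketched in Python. This is only an illustration under the assumption that the leaf unit is 0.1 and that empty stems should still be listed. The function name `stem_and_leaf` is hypothetical, and the sample uses a handful of values from the tuition data rather than all 60.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=0.1):
    """Return (frequency, stem, leaves) rows for a stem-and-leaf display."""
    rows = defaultdict(list)
    for value in sorted(values):
        units = round(value / leaf_unit)   # the value expressed in leaf units
        stem, leaf = divmod(units, 10)     # last digit is the leaf
        rows[stem].append(leaf)
    display = []
    # Walk every stem between the smallest and largest, including empty ones.
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        display.append((len(leaves), stem, leaves))
    return display


sample = [2.4, 3.4, 3.5, 5.0, 5.4]   # a few tuition values (in $000)
for freq, stem, leaves in stem_and_leaf(sample):
    print(f"{freq:>3} {stem:>3} | {leaves}")
```

Listing the empty stem (here, stem 4) keeps gaps in the data visible, just as stem 9 appears with no leaves in the tuition display.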

$X_1 = 5$, $X_2 = 8$, $X_3 = 14$

Mathematically, we could denote the sum as

$X_1 + X_2 + X_3$

A more convenient way of doing this would be to use the shorthand

$\sum_{i=1}^{3} X_i$

If we had n observations, we can generalize this to:

$\sum_{i=1}^{n} X_i = X_1 + X_2 + \cdots + X_n$

For our example,

$\sum_{i=1}^{3} X_i = 5 + 8 + 14 = 27$

An additional example:

Note that:

Measures of Location or Central Tendency

What we seek is a number that we feel is typical or representative of the data set. We will consider four such measures:

1. Arithmetic Mean
2. Median
3. Mode
4. Weighted Average

Arithmetic Mean

The most commonly used measure of central tendency. We denote the mean for a population and a sample differently but compute them in the same manner.

For a population: $\mu = \frac{\sum_{i=1}^{N} X_i}{N}$

For a sample: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$

Example: A large department store collects data on sales made by each of its salespeople. The data, number of sales made on a given day by each of 20 salespeople, are as follows:

9, 6, 12, 10, 13, 15, 16, 14, 14, 16, 17, 16, 24, 21, 22, 18, 19, 18, 20, 17

The sample mean is $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{317}{20} = 15.85$

Properties of the mean:

1. The mean is sensitive to ALL of the data. In other words, if one score in the distribution is changed, the mean will change too.

Example:

Obs.         Mean
1, 2, 3      2
1, 2, 30     11
1, 2, 300    101

As shown above, the mean is affected by (runs to) extreme values. This can be a drawback, and there is therefore a need to consider other measures of central tendency.

2. The sum of the deviations about the mean equals zero.

$\sum_{i=1}^{N} (X_i - \mu) = 0$ or $\sum_{i=1}^{n} (X_i - \bar{X}) = 0$

Example: for the observations 2, 3, 5, 10 the mean is 5, the deviations are -3, -2, 0, 5, and their sum is 0.
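The mean and its two properties can be checked numerically. The Python sketch below is a minimal illustration using the sales data and the small examples from the notes (1, 2, 3 / 1, 2, 30 / 1, 2, 300 and 2, 3, 5, 10). The helper name `mean` is a hypothetical choice.

```python
def mean(values):
    """Arithmetic mean: the sum of the observations divided by their count."""
    return sum(values) / len(values)


sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]
sample_mean = mean(sales)
print("sample mean:", sample_mean)

# Property 1: changing a single observation moves the mean.
for obs in ([1, 2, 3], [1, 2, 30], [1, 2, 300]):
    print(obs, "->", mean(obs))

# Property 2: deviations about the mean sum to zero.
data = [2, 3, 5, 10]
deviations = [x - mean(data) for x in data]
print(deviations, "sum =", sum(deviations))
```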

Median

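Definition: The median is the middle value when the observations are arranged in ascending order.

If the number of observations is odd, the median is the observation in position $(n + 1)/2$ of the ordered array.

If the number of observations is even, the median is the average of the observations in positions $n/2$ and $n/2 + 1$ of the ordered array.

Previous example revisited: Sorting the 20 observations in ascending order we have:

6, 9, 10, 12, 13, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 20, 21, 22, 24

Since the number of observations is even, the median is the average of the 10th and 11th observations in the ordered array: (16 + 16)/2 = 16.

NOTE: The median is resistant to extreme values.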
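The odd and even cases for the median can be sketched in Python. This is a minimal illustration, not the notes' own procedure. The function name `median` is a hypothetical choice, and Python's standard `statistics.median` would give the same results.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # odd n: single middle observation
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average the middle pair


sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]
print("median of sales:", median(sales))

# Resistance to extreme values: the median does not move with the outlier.
print(median([1, 2, 3]), median([1, 2, 300]))
```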

Mode

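Definition: The mode is the value that occurs most frequently in the data set.

Working with the same sample data:

6, 9, 10, 12, 13, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 20, 21, 22, 24

The value 16 occurs three times, more often than any other value, so the mode is 16.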
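Finding the mode amounts to counting how often each value occurs. The Python sketch below is a minimal illustration. The function name `modes` is hypothetical, and returning every tied value is an assumption about how ties should be handled that the notes do not address.

```python
from collections import Counter


def modes(values):
    """Return all values that share the highest frequency, in ascending order."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(value for value, count in counts.items() if count == top)


sales = [6, 9, 10, 12, 13, 14, 14, 15, 16, 16,
         16, 17, 17, 18, 18, 19, 20, 21, 22, 24]
print("mode(s):", modes(sales))
```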

Median and Mode in Stem-and-Leaf Diagrams

Example: Car mileage; n = 29; leaf unit = 0.1

29.8, 30.1, 30.4, 30.4, 30.5, 30.6, 30.8, 30.8, 31.2, 31.3, 31.3, 31.4, 31.4, 31.5, 31.5, 31.7, 31.7, 31.7, 31.8, 31.9, 32.0, 32.2, 32.2, 32.4, 32.4, 32.5, 32.5, 32.8, 33.3

Freq  Stem  Leaf
  1   29    8
  3   30*   144
  4   30    5688
  5   31*   23344
 (7)  31    5577789
  5   32*   02244
  3   32    558
  1   33*   3

Median = 31.5 (the 15th of the 29 ordered observations)

Mode = 31.7 (occurs three times)

Note that the stem labels used provide a more detailed display. For example, row 30* contains the mileages from 30.0 to 30.4, while row 30 contains the mileages from 30.5 to 30.9.

Relationship among the mean, mode, and median

(a) If skewed right then mode < median < mean

(b) If skewed left then mean < median < mode

(c) If symmetrical then mean = median = mode

Effect of Outliers

Definition of an Outlier:

Section B


Range

Definition: The range is the difference between the largest and the smallest observation.

Example: Consider the sales data (number of sales for a particular salesperson on 20 different days) displayed as an ordered array.

6, 9, 10, 12, 13, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 20, 21, 22, 24

Range = 24 - 6 = 18

Note: The range has the disadvantage of considering only 2 of the N population or n sample observations. A logical alternative would be to base our measure of variability on the distance of all observations from a typical value such as the arithmetic mean.

Average (or Mean) Absolute Deviation (AVEDEV)

Definition: The average of the absolute deviations of the observations from their mean. Excel uses the term AVEDEV.

Mathematical definition: $\text{AVEDEV} = \frac{\sum_{i=1}^{n} |X_i - \bar{X}|}{n}$

Example: Sales data set

6, 9, 10, 12, 13, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 20, 21, 22, 24

Recall that the Mean = 15.85.

$\text{AVEDEV} = \frac{\sum_{i=1}^{20} |X_i - 15.85|}{20} = \frac{67.6}{20} = 3.38$
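The range and the average absolute deviation for the sales data can be computed with a few lines of Python. This is a minimal sketch. The function names `value_range` and `avedev` are hypothetical; the latter is named after the Excel function mentioned in the notes and divides by the number of observations.

```python
def value_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)


def avedev(values):
    """Average of the absolute deviations from the arithmetic mean."""
    m = sum(values) / len(values)
    return sum(abs(x - m) for x in values) / len(values)


sales = [6, 9, 10, 12, 13, 14, 14, 15, 16, 16,
         16, 17, 17, 18, 18, 19, 20, 21, 22, 24]
print("range: ", value_range(sales))
print("AVEDEV:", round(avedev(sales), 2))
```

Unlike the range, `avedev` uses every observation, which is the motivation given in the notes for moving beyond the range.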

Variance (Mean Square)

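Definition (important!): The variance measures variability using the squared deviations of the observations from their mean.

Formulas: The mathematical definition and computational formulas for the population and sample variance are

Population Variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N} = \frac{\sum_{i=1}^{N} X_i^2 - N\mu^2}{N}$

Sample Variance: $s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1} = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n - 1}$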

Note: The unit of measurement corresponding to the variance for both the sample and population is the square of the original unit of measurement. To get back to the original unit, we take the square root of the variance. The resulting number is termed the standard deviation.
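The definitional and computational forms of the variance give the same number. The short derivation below shows why for the sample case, assuming the common computational form $\sum X_i^2 - n\bar{X}^2$ for the numerator. The exact form used in the original handout is not recoverable, so this is one standard version rather than necessarily the author's.

```latex
\begin{align*}
\sum_{i=1}^{n} (X_i - \bar{X})^2
  &= \sum_{i=1}^{n} \left( X_i^2 - 2\bar{X}X_i + \bar{X}^2 \right) \\
  &= \sum_{i=1}^{n} X_i^2 - 2\bar{X}\sum_{i=1}^{n} X_i + n\bar{X}^2 \\
  &= \sum_{i=1}^{n} X_i^2 - 2\bar{X}\,(n\bar{X}) + n\bar{X}^2
     \qquad \text{since } \sum_{i=1}^{n} X_i = n\bar{X} \\
  &= \sum_{i=1}^{n} X_i^2 - n\bar{X}^2
\end{align*}
```

Dividing both sides by $n - 1$ gives the sample variance. The population case is identical with $\mu$ in place of $\bar{X}$ and $N$ as the divisor.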

Standard Deviation

Definition (important!): The standard deviation is the positive square root of the variance.

Formulas: Mathematical definitions of the standard deviation:

Population: $\sigma = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$

Sample: $s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$

Example: Consider the following data (number of sales for a particular salesperson on 5 different days) displayed as an ordered array.

6, 9, 10, 12, 13

The calculation of the sample variance and standard deviation is illustrated below using the definition and computational formulas for the two statistics.

Calculations:

Mean = $\bar{X} = \frac{6 + 9 + 10 + 12 + 13}{5} = \frac{50}{5} = 10$
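The sample variance and standard deviation for the five-day example (6, 9, 10, 12, 13) can be worked through step by step in Python. This is a minimal sketch. The variable names are hypothetical, and the computational form used (sum of squares minus n times the squared mean, divided by n - 1) is one common version assumed here.

```python
import math

days = [6, 9, 10, 12, 13]   # sales on 5 different days, from the notes
n = len(days)

mean = sum(days) / n                                   # 50 / 5

# Definitional formula: squared deviations from the mean, divided by n - 1.
squared_deviations = [(x - mean) ** 2 for x in days]   # 16, 1, 0, 4, 9
var_definitional = sum(squared_deviations) / (n - 1)   # 30 / 4

# Computational formula: (sum of squares - n * mean^2) / (n - 1).
sum_of_squares = sum(x * x for x in days)              # 530
var_computational = (sum_of_squares - n * mean ** 2) / (n - 1)

std_dev = math.sqrt(var_definitional)

print("mean:", mean)
print("sample variance (definitional): ", var_definitional)
print("sample variance (computational):", var_computational)
print("sample standard deviation:", round(std_dev, 4))
```

Both routes give the same variance, and the square root returns the measure to the original unit (number of sales).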
