







Data Analysis, Data Description, Parameter, Statistic, Measures of central tendency, Arithmetic average, Mean for grouped data, Median, Mode, Modal class, and Revisiting distribution shapes are the learning points covered in these lecture notes.








Section III Data Analysis

Data Description

When measuring data, it is important to note the difference between studies on samples and studies on populations. A parameter is a measure or characteristic obtained by studying all data values from a population, while a statistic is derived from a sample. For many attributes we will have separate symbols for a statistic and a parameter, even though the method for computing them is the same. The number of data values in a sample will be n as before, but for a population it will be denoted N. For writing equations, a generic sample or population will be denoted with X values for each datum:

Example Data:
Sample: {X_1, X_2, X_3, ..., X_n}
Population: {X_1, X_2, X_3, ..., X_N}

The ambiguous term "average" actually refers to a category known in statistics as measures of central tendency, which includes the mean, median, mode and midrange. Another often-used average is the weighted mean. How data varies compared to these averages is a very useful characteristic to study.

Measures of Central Tendency

"A person has on average 1460 dreams in 1 year"

The mean: Using a population or sample (the classical arithmetic average).

Sample (of size n): $\bar{X} = \frac{\sum X}{n}$
Population (of size N): $\mu = \frac{\sum X}{N}$

Keep track of and memorize symbols like $\bar{X}$ and $\mu$, as other equations will sometimes include them without review.

Find the mean for the following population and label appropriately:
22 19 8 2 4 13 16 7

Math tips:
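For the population exercise above, the arithmetic can be checked with a short Python sketch. The helper name `mean` is only illustrative and is not part of the notes; the result is labeled mu because the data set is described as a population.

```python
# Population values taken from the exercise above.
population = [22, 19, 8, 2, 4, 13, 16, 7]


def mean(values):
    """Arithmetic mean: the sum of the values divided by how many there are."""
    return sum(values) / len(values)


mu = mean(population)  # mu, not X-bar, because this is a population
print(f"N = {len(population)}, sum = {sum(population)}, mu = {mu}")
# prints: N = 8, sum = 91, mu = 11.375
```

The same function applies to a sample; only the symbol used to label the result changes.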
The Median: The median is the halfway point of the data set.

Finding the median MD:
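The usual procedure is to sort the data and take the middle value, or the mean of the two middle values when the number of values is even. A minimal Python sketch of that procedure follows; reusing the population from the mean exercise is an assumption made only for illustration, and the function name `median` is not from the notes.

```python
def median(values):
    """Return the halfway point of the data set."""
    ordered = sorted(values)          # the data must be in order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: one middle value
        return ordered[mid]
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [22, 19, 8, 2, 4, 13, 16, 7]   # assumed example data
md = median(data)
print(md)  # prints 10.5 (halfway between 8 and 13)
```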
For frequency distributions we speak of the modal class. The modal class is the class with the highest frequency.

Class        Tally                 Frequency
100 - 104    //                    2
105 - 109    ////////              8
110 - 114    //////////////////    18
115 - 119    /////////////         13
120 - 124    ///////               7
125 - 129    /                     1
130 - 134    /                     1

Clearly, in our record high temperature example, the modal class is 110˚ – 114˚.

Notes on the mode:
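The modal class can also be picked out programmatically. The following Python sketch stores the frequency distribution above in a dictionary (the variable names are illustrative) and selects the class with the highest frequency.

```python
# Frequency distribution of record high temperatures, copied from the table.
frequencies = {
    "100 - 104": 2,
    "105 - 109": 8,
    "110 - 114": 18,
    "115 - 119": 13,
    "120 - 124": 7,
    "125 - 129": 1,
    "130 - 134": 1,
}

# The modal class is the key whose frequency is largest.
modal_class = max(frequencies, key=frequencies.get)
print(modal_class, frequencies[modal_class])  # prints: 110 - 114 18
```

Note that `max` returns only the first class it finds with the top frequency, so a distribution with tied classes would need an explicit check for more than one mode.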
Measures of Variation

Averages are useful concepts, but they become even more useful when you combine them with the concept of variation. One measure of variation is the distance between the highest and lowest values, or the range. Perhaps the most important measure has to do with the average distance of a datum from the mean.
Paint Example

Test: Brand A vs. Brand B
Variable: Months before fading

Two small populations of 6 cans of each brand are tested with the following results:

Brand A    Brand B
10         35
60         45
50         30
30         35
40         40
20         25

We can calculate the means:
Brand A: $\mu = \frac{\sum X}{N} = \frac{210}{6} = 35$ months
Brand B: $\mu = \frac{\sum X}{N} = \frac{210}{6} = 35$ months
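Both brands have the same mean, yet the values are spread out very differently. A short Python sketch, using the paint data above and an illustrative helper `data_range`, makes this visible by comparing the mean and the range of each brand.

```python
# Months before fading for each can, from the paint example.
brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]


def mean(values):
    """Arithmetic mean of a population."""
    return sum(values) / len(values)


def data_range(values):
    """Range: distance between the highest and lowest value."""
    return max(values) - min(values)


for name, data in (("Brand A", brand_a), ("Brand B", brand_b)):
    print(f"{name}: mean = {mean(data)} months, range = {data_range(data)} months")
# prints:
# Brand A: mean = 35.0 months, range = 50 months
# Brand B: mean = 35.0 months, range = 20 months
```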
Difference (X − μ) for Brand A:
60 - 35 = 25
50 - 35 = 15
40 - 35 = 5
30 - 35 = -5
20 - 35 = -15
10 - 35 = -25

For variance and standard deviation we only want to know how far off the values are on average, not in which direction. Square the differences, add them up and divide by N; this gives the average of the squared distances from the mean, called the variance.

Variance = (625 + 625 + 225 + 25 + 25 + 225) / 6 = 1750 / 6 ≈ 291.7

To get the standard deviation we simply return to our original scale by taking the square root.

Standard deviation = $\sqrt{291.7} \approx 17.1$

Standard deviation for a population: σ
Standard deviation for a sample: s

III. Variance and standard deviation for populations

The algorithm we just used gives us the formula for variance when using a population, and in turn the standard deviation for populations:

$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$ and $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
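The population formulas can be sketched in Python as follows. The function names are illustrative, and applying the same calculation to Brand B is an addition for comparison that the notes do not work out at this point.

```python
import math


def population_variance(values):
    """Average of the squared distances from the mean (divide by N)."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def population_std_dev(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))


brand_a = [10, 60, 50, 30, 40, 20]
brand_b = [35, 45, 30, 35, 40, 25]

for name, data in (("Brand A", brand_a), ("Brand B", brand_b)):
    var = population_variance(data)
    sd = population_std_dev(data)
    print(f"{name}: variance = {var:.1f}, standard deviation = {sd:.1f}")
# prints:
# Brand A: variance = 291.7, standard deviation = 17.1
# Brand B: variance = 41.7, standard deviation = 6.5
```

Python's standard library offers `statistics.pvariance` and `statistics.pstdev` for the same population calculations.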
Method: These are the same steps we took from the Brand A population’s raw data set:
V. Variance and standard deviation - grouped data (frequency distributions):

So far we have only computed variance for samples and populations from raw data.

Finding Sample Variance and Standard Deviation for Grouped Data: We will again use the midpoints of each class as an average value to get an approximate answer. The adjusted formula for variance is:

$s^2 = \frac{n \sum f \cdot X_m^2 - \left( \sum f \cdot X_m \right)^2}{n(n - 1)}$

where f is the frequency of each class, $X_m$ is its midpoint, and $n = \sum f$ is the total number of data values.

Example: Compute $s^2$ and $s$ for our earlier data for record temperatures.
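One way to carry out this example is sketched below in Python. It assumes the standard shortcut formula for grouped data, s^2 = [n Σ f·X_m^2 − (Σ f·X_m)^2] / [n(n − 1)], with X_m the class midpoint and f the class frequency; the function and variable names are illustrative and not taken from the notes.

```python
import math

# (lower limit, upper limit, frequency) for the record high temperature classes.
classes = [
    (100, 104, 2),
    (105, 109, 8),
    (110, 114, 18),
    (115, 119, 13),
    (120, 124, 7),
    (125, 129, 1),
    (130, 134, 1),
]


def grouped_sample_variance(rows):
    """Approximate sample variance, using each class midpoint as its value."""
    n = sum(f for _, _, f in rows)                               # total count
    sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in rows)        # sum of f * X_m
    sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in rows)  # sum of f * X_m^2
    return (n * sum_fx2 - sum_fx ** 2) / (n * (n - 1))


s2 = grouped_sample_variance(classes)
s = math.sqrt(s2)
print(f"s^2 = {s2:.2f}, s = {s:.2f}")  # prints: s^2 = 36.90, s = 6.07
```

Because every value in a class is replaced by the class midpoint, the result is an approximation of the variance of the underlying raw data.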