

A complete lecture series for the Instrumentation, Measurements, Statistics course is available at Docsity. It is free to download for everyone. This lecture covers the following keywords: Basic Statistics, Data Analysis Using Statistics, Statistics Definitions Associated with Systematic Error, Average Absolute Deviation, Sample Standard Deviation, Standard Error, Root Mean Square Error, Mean Bias Error


Introduction
The purpose of this learning module is to introduce you to some of the fundamental definitions and techniques related to analyzing measurements with statistics. In all the definitions and examples discussed here, we consider a collection (sample) of measurements of a steady parameter, e.g., repeated measurements of a temperature, distance, or voltage.
Basic Definitions for Data Analysis using Statistics
First, some definitions are necessary:
o Population – the entire collection of measurements, not all of which will be analyzed statistically.
o Sample – a subset of the population that is analyzed statistically. A sample consists of n measurements.
o Statistic – a numerical attribute of the sample (e.g., mean, median, standard deviation).
Suppose a population, i.e., a series of measurements (or readings) of some variable x, is available. Variable x can be anything that is measurable, such as a length, time, voltage, current, resistance, etc. Consider a sample of these measurements: some portion of the population that is to be analyzed statistically. The measurements are $x_1, x_2, x_3, \ldots, x_n$, where n is the number of measurements in the sample under consideration. The following represent some of the statistics that can be calculated:
Mean – the sample mean is simply the arithmetic average, as is commonly calculated, i.e.,
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
where $x_i$ is the i-th of the n measurements in the sample.
o We sometimes use the notation x_avg instead of $\bar{x}$ to indicate the average of all x values in the sample, especially when using Excel, since overbars are difficult to add.
o The sample mean, although it is the simplest statistic to calculate, is not always as useful as the sample median, which is discussed later.
Deviation – the deviation of a measurement is defined as the difference between a particular measurement and the mean, i.e., for measurement i, $d_i = x_i - \bar{x}$.
o When considering a group or sample of measurements, the deviation of one particular measurement is the same as the precision error or random error of that measurement.
o Deviation is not the same as accuracy error. Recall that accuracy error (inaccuracy) is defined as the difference between a particular measurement and the true value of the quantity being measured: accuracy error = $x_i - x_{\text{true}}$. Because of bias (systematic) error, $x_{\text{true}}$ is often not even known, and the mean is not equal to $x_{\text{true}}$ if there are bias errors.
Average deviation – to get some feel for how much deviation is represented in the sample, we might first think of averaging all the deviations to obtain some kind of mean or average deviation. It turns out that the average of all the deviations is zero! Try it for any set of numbers, and you will convince yourself that this is true. Why? Because the deviations are measured from the mean itself: $\sum_{i=1}^{n} d_i = \sum_{i=1}^{n} x_i - n\bar{x} = 0$ by the definition of $\bar{x}$, so the negative deviations of the smaller measurements exactly cancel the positive deviations of the larger ones. The average deviation is therefore a meaningless calculation; it is always zero.
Average absolute deviation – a better measure of deviation is the average absolute deviation (also called the average positive error), defined as the average of the absolute value of each deviation. Mathematically,
$$\overline{|d|} = \frac{1}{n}\sum_{i=1}^{n} |d_i|$$
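The two ideas above can be checked numerically. The following is a minimal Python sketch, assuming a small set of hypothetical readings; the values and the helper names `deviations` and `average_absolute_deviation` are illustrative and do not come from the original notes. It shows that the deviations sum to zero (to within floating-point round-off), while the average absolute deviation gives a usable measure of scatter.

```python
from statistics import fmean

def deviations(sample):
    """Return the deviation d_i = x_i - mean for each measurement."""
    x_bar = fmean(sample)
    return [x - x_bar for x in sample]

def average_absolute_deviation(sample):
    """Return the average of |d_i| over the sample."""
    return fmean([abs(d) for d in deviations(sample)])

# Hypothetical repeated voltage readings (illustrative values only)
readings = [5.1, 4.9, 5.0, 5.2, 4.8]

d = deviations(readings)
print("sum of deviations:", sum(d))  # zero up to round-off
print("average absolute deviation:", average_absolute_deviation(readings))
```

For these readings the mean is 5.0, so the absolute deviations are 0.1, 0.1, 0, 0.2, and 0.2, and their average is about 0.12.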
Sample standard deviation – an even better, and more accepted, measure of how much deviation or scatter is in the data is obtained by calculating the sample standard deviation. For n measurements,
$$S = \sqrt{\frac{\sum_{i=1}^{n} d_i^2}{n-1}} = \sqrt{\frac{\sum_{i=1}^{n} \left(x_i - \bar{x}\right)^2}{n-1}}$$
o S is something like an average of the deviations, but it is constructed by taking the square root of the average of the squared deviations, since $d_i$ can be either positive or negative and squaring removes the sign.
o Notice that the denominator is n – 1, not simply n. It turns out that for small sample size (small n), n – 1 yields a better estimate of the standard deviation than does n itself. (Details are beyond the scope of this course.) As n gets big, the difference between using n or n – 1 in the denominator becomes negligible.
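The effect of the n – 1 denominator can be seen in a short Python sketch. It assumes the same kind of hypothetical readings as before; `sample_std` is an illustrative helper name, while `stdev` (n – 1 denominator) and `pstdev` (n denominator) are functions from Python's standard `statistics` module.

```python
import math
from statistics import fmean, pstdev, stdev

def sample_std(sample):
    """Sample standard deviation S with the n - 1 denominator."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two measurements are required")
    x_bar = fmean(sample)
    return math.sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))

# Hypothetical readings (illustrative values only)
readings = [5.1, 4.9, 5.0, 5.2, 4.8]

print("S with n - 1 (by hand):    ", sample_std(readings))
print("S with n - 1 (statistics): ", stdev(readings))
print("result with n instead:     ", pstdev(readings))

# For the same squared deviations, the two results differ by the
# factor sqrt(n / (n - 1)), which approaches 1 as n grows.
for n in (5, 50, 500):
    print(n, math.sqrt(n / (n - 1)))
```

For five readings the two denominators give results that differ by about 12%; for 500 readings the difference is about 0.1%.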
Sample variance – the sample variance of the sample is simply the square of the sample standard deviation, namely, sample variance = $S^2$.
Relative standard deviation – the relative standard deviation of the sample is simply the sample standard deviation divided by the mean, namely, $\text{RSD} = S / \bar{x}$.
o RSD is nondimensional.
o RSD is usually written as a percentage (multiply RSD by 100%); it is then sometimes called %RSD.
Standard error – the standard error is the standard deviation divided by the square root of the number of measurements, namely, standard error = $S / \sqrt{n}$.
Median – the median of the sample is defined as the value at which half of the measurements are lower and half are higher.
o A simple way to calculate the median is to order all the measurements from lowest to highest. If n is odd, the number in the middle is the median. If n is even, the median is the average of the middle two values.
o The median is sometimes more useful than the mean, particularly in cases where one or two values are significantly different from the rest of the values.
Mode – the mode of the sample is the most probable value of the n measurements, i.e., the one that occurs most frequently.
o Mode is not used as often as mean or median because it can be a misleading quantity, especially if the sample size is small and/or the distribution of measurements is not purely random.
o If none of the measurements are repeated, the mode is undefined.
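The remaining statistics can be sketched in Python as follows. This is an illustration under stated assumptions: the readings are hypothetical and include one deliberately large value, and `relative_std`, `standard_error`, and `mode_or_none` are illustrative helper names. Returning `None` when no value repeats is a choice made here to mirror the statement that the mode is then undefined.

```python
import math
from collections import Counter
from statistics import fmean, median, stdev

def relative_std(sample):
    """RSD = S / mean (nondimensional; multiply by 100 for %RSD)."""
    return stdev(sample) / fmean(sample)

def standard_error(sample):
    """Standard error = S / sqrt(n)."""
    return stdev(sample) / math.sqrt(len(sample))

def mode_or_none(sample):
    """Most frequent value, or None if no measurement is repeated.

    If several values tie for most frequent, the one that appears
    first in the sample is returned.
    """
    value, count = Counter(sample).most_common(1)[0]
    return value if count > 1 else None

# Hypothetical readings with one value far from the rest
readings = [5.1, 4.9, 5.0, 5.0, 9.0]

print("mean:          ", fmean(readings))   # pulled upward by the 9.0 reading
print("median:        ", median(readings))  # stays at the middle value
print("mode:          ", mode_or_none(readings))
print("%RSD:          ", 100 * relative_std(readings))
print("standard error:", standard_error(readings))
```

Here the single large reading raises the mean to about 5.8, while the median remains 5.0, which illustrates why the median can be the more useful statistic when one or two values differ greatly from the rest.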
Example:
Given: Ten length measurements: 12.1, 12.3, 12.2, 12.2, 12.4, 12.3, 12.2, 12.4, 12.2, and 12.5 m.
To do: Calculate the mean, variance, average absolute deviation, standard deviation, median, and mode.
Solution:
o We use Excel for convenience. A portion of the spreadsheet is shown below:
[Figure: portion of the Excel spreadsheet]
o In Excel, the simplest way to obtain the statistics is to use the built-in statistical analysis macro:
  Office 2003: Tools-Data Analysis-Descriptive Statistics-OK.
  Office 2007: Click the Data tab instead of Tools; the rest is the same.
  Click in the field called Input Range, and highlight all the measurements.
  Select Output Range as the Output Option, click in the field for Output Range, and then click on a cell in some clean (available) portion of the spreadsheet.
  Select Summary Statistics, and OK.
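As a cross-check on the spreadsheet approach, the same statistics can be computed with Python's standard `statistics` module. This is a sketch offered as an alternative, not the method used in the original notes; the variable names `lengths_m` and `summary` are illustrative. The ten measurements are those given in the example.

```python
from statistics import fmean, median, mode, stdev, variance

# The ten length measurements from the example, in meters
lengths_m = [12.1, 12.3, 12.2, 12.2, 12.4, 12.3, 12.2, 12.4, 12.2, 12.5]

x_bar = fmean(lengths_m)
summary = {
    "mean (m)": x_bar,
    "variance (m^2)": variance(lengths_m),  # n - 1 denominator
    "average absolute deviation (m)": fmean([abs(x - x_bar) for x in lengths_m]),
    "standard deviation (m)": stdev(lengths_m),  # n - 1 denominator
    "median (m)": median(lengths_m),
    "mode (m)": mode(lengths_m),
}

for name, value in summary.items():
    print(f"{name:>32}: {value:.4f}")
```

Working the arithmetic by hand gives a mean of 12.28 m, a variance of 0.136/9 ≈ 0.0151 m², an average absolute deviation of 0.10 m, a standard deviation of about 0.123 m, a median of 12.25 m, and a mode of 12.2 m, so the script should print values close to these.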