Lesson 1
Introduction

Outline
Statistics
Descriptive versus inferential statistics
Population versus Sample
Statistic versus Parameter
Simple Notation
Summation Notation

Statistics

What are statistics? What do you think of when you think of statistics? Can you think of some examples where you have seen statistics used? You might think about where in the real world you see statistics being used, or think about how statistics is used in your major.

Statistics is divided into two main areas: descriptive and inferential statistics.

Descriptive statistics- These are numbers that are used to consolidate a large amount of information. Any average, for example, is a descriptive statistic. So, batting averages, average daily rainfall, or average daily temperature are good examples of descriptive statistics.

Inferential statistics- Inferential statistics are used when we want to draw conclusions, for example when we want to determine if some treatment is better than another, or if there are differences in how two groups perform. A good textbook definition is: using samples to draw inferences about populations. More on this once we define samples and populations.

Population- Any set of people or objects with something in common. Anything could be a population. We could have a population of college students. We might be interested in the population of the elderly. Other examples include single-parent families, people with depression, or burn victims. For anything we might be interested in studying we could define a population.

Very often we would like to test something about a population. For example, we might want to test whether a new drug might be effective for a specific group. Most of the time it is impossible to give everyone a new treatment to determine whether it worked or not.
Instead we commonly give it to a group of people from the population to see if it is effective. This subset of the population is called a sample.

When we measure something in a population it is called a parameter. When we measure something in a sample it is called a statistic. For example, if I got the average age of all parents in single-parent homes, the measure would be called a parameter. If I measured the average age of a sample of these same individuals it would be called a statistic. Thus, a population is to a parameter as a sample is to a statistic.

This distinction between samples and populations is important because this course is about inferential statistics. With inferential statistics we want to draw inferences about populations from samples. Thus, this course is mainly concerned with the rules or logic of how a relatively small sample from a large population can be tested, and how the results of those tests can be inferred to hold for everyone in the population. For example, if we want to test whether Bayer aspirin is better than Tylenol at relieving pain, we could not give these drugs to everyone in the population. It's not practical since the general population is so large. Instead we might give them to a couple of hundred people and see which one works better for them. With inferential statistics we can infer that what was true for a few hundred people is likely also true for a very large population of hundreds of thousands of people.

The symbols we write for populations and samples differ too. With populations we will use Greek letters to symbolize parameters. When we symbolize a measure from a sample (a statistic) we will use the letters you are familiar with (Roman letters). Thus, if I measure the average age of a population I'd indicate the value with the Greek letter "mu" (µ = 24), while if I were to measure the same value for a subset of the population, or a sample, I would indicate the value with a Roman letter (X̄ = 24).
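To make the parameter versus statistic distinction concrete, here is a minimal Python sketch. The population of ages, the sample size of 200, and the random seed are all made up for illustration; they do not come from this lesson. The only point is that the same measure (an average) is called µ when it is computed on the whole population and X̄ when it is computed on a sample.

```python
import random
import statistics

# Hypothetical population: ages of 10,000 people (made-up data, not from the lesson).
rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = [rng.randint(18, 30) for _ in range(10_000)]

# Parameter: the average measured on the ENTIRE population (written with Greek mu).
mu = statistics.mean(population)

# Statistic: the same average measured on a SAMPLE, a subset of the population
# (written with the Roman letter X-bar).
sample = rng.sample(population, 200)
x_bar = statistics.mean(sample)

print(f"parameter  mu    = {mu:.2f}  (N = {len(population)})")
print(f"statistic  X-bar = {x_bar:.2f}  (n = {len(sample)})")
```

The two values will usually be close but not identical. How close a sample statistic can be expected to fall to the population parameter is what inferential statistics is about.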
Simple Notation

You might think about descriptive statistics as the vocabulary of the "language" of statistics. If this is true, then summation notation can be thought of as the alphabet of that language. Notation, including summation notation, is just a shorthand way of representing the information we have collected and the mathematical operations we want to perform. For example, if I collect data on a variable, say the amount of time (in minutes) several people spent waiting at a bus stop, I can represent that group of numbers with the variable X. The variable X represents all of the data that I collected.

Amount of Time
X
5.0
11.1
8.9
3.5
12.3
15.6

With subscripts I can also represent an individual data point within the variable set we have labeled X. For example, the third data point, 8.9, is the X3 data point. The fifth data point, X5, is the number 12.3. Very often when we want to represent ALL of the data