

















These notes cover statistical concepts and techniques such as frequency distributions, bar graphs, histograms, box-and-whisker plots, measures of central tendency, and sampling techniques. They also include exercises and examples that illustrate how these concepts are applied. Useful for students studying statistics and data analysis.


















Each number below represents the age of a U.S. president on his first inauguration. 57 61 57 57 58 57 61 54 68 51 49 50 48 65 52 56 46 54 49 50 47 55 54 42 51 56 55 51 54 51 60 62 43 56 61 52 69 64 64 55 55 47 54 47
15.1 The Frequency Distribution
1. Frequency: How many times something occurs.
2. Frequency Distribution: A set of data listed with their frequencies (usually listed with categories or class intervals).
3. Relative Frequency Distribution: A set of data listed with their percentages (usually listed with categories or class intervals).
4. Class Intervals: The range of values covered by each class. In a frequency table, all class intervals should be equal. To identify good class intervals, find the range of the data by subtracting the minimum value from the maximum value, then divide the range into equal parts. Usually 8 to 10 classes give a good presentation of the data.
   Ex: A television network has asked 25 viewers to evaluate a new police drama. The possible evaluations are (E)xcellent, (A)bove average, a(V)erage, (B)elow average, (P)oor. After the show, the 25 evaluations were as follows:
   A, V, V, B, P, E, A, E, V, V, A, E, P, B, V, V, A, A, A, E, B, V, A, B, V.
   Construct a frequency table and a relative frequency table for this list of evaluations.
5. Bar Graph: A graphic form of data that uses bars with space between them, where the height of each bar gives the frequency of that category or class.
   Ex: Draw a bar graph of the frequency distribution of the TV viewers' responses from the previous example.
   Ex: The bar graph shows the number of Atlantic hurricanes over a period of years. Use it to answer the following questions.
   a. What was the smallest number of hurricanes in a year during this period?
   b. What number of hurricanes per year occurred most frequently?
   c. For how many years were the hurricanes counted?
   d. In what percentage of the years were there more than ten hurricanes?
6. Back-to-back Bar Graph: Plotted on a two-quadrant coordinate system with the horizontal scale repeated in each direction from the central axis.
Class Limits    Tally                 Frequency
58-62           ||
62-66           |||| ||
66-70           |||| |||| |||| |
70-74           |||| |||| ||
74-78           ||||

a. Determine if the class intervals of 4 are appropriate for the data.
b. Draw a histogram of the data.
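The class-interval procedure from section 15.1 (find the range, then split it into equal classes) can be sketched on the presidents' ages listed at the top of these notes. This is only an illustration: the class width of 4 and the choice to start the first class at the minimum value are assumptions, not part of the original exercise.

```python
# Ages of U.S. presidents at first inauguration, copied from the data list above.
ages = [57, 61, 57, 57, 58, 57, 61, 54, 68, 51, 49, 50, 48, 65, 52, 56,
        46, 54, 49, 50, 47, 55, 54, 42, 51, 56, 55, 51, 54, 51, 60, 62,
        43, 56, 61, 52, 69, 64, 64, 55, 55, 47, 54, 47]

# Range of the data: maximum value minus minimum value.
data_range = max(ages) - min(ages)

# Assumption: class width 4, first class starting at the minimum age.
class_width = 4
start = min(ages)
num_classes = data_range // class_width + 1

# counts[k] is the frequency of the class that begins at start + k * class_width.
counts = [0] * num_classes
for age in ages:
    counts[(age - start) // class_width] += 1

# Print a tally-style frequency table, one class per row.
for k, freq in enumerate(counts):
    low = start + k * class_width
    high = low + class_width - 1
    print(f"{low}-{high}: {'|' * freq} ({freq})")
```

With a width of 4 this data set falls into 7 classes; a narrower width such as 3 would land closer to the 8 to 10 classes suggested above.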
11. Stem-and-leaf Plot: A display of numerical data in which each value is separated into two parts, a stem and a leaf. Each leaf is a single digit.
    Ex: Make a stem-and-leaf plot to compare Babe Ruth's and Hank Aaron's home run records.
    Ruth: 54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22
    Aaron: 44, 30, 39, 40, 34, 45, 44, 32, 44, 39, 44, 38, 47, 34, 40
12. Median: The middle value of a data set. If there is an even number of data items, the median is the average of the two middle items. Extreme values have very little influence on the median.
13. Range: The difference between the greatest and least values in a set of data.
14. Quartile: One of four groupings of a data set, determined by the median of the set and the medians of the lower and upper halves that the median creates.
15. Interquartile Range: The difference between the third quartile point and the first quartile point.
16. Box-and-whisker Plot: A diagram that graphically displays the median, quartiles, extreme values, and outliers in a set of data. Whiskers extend to the extreme values of the data unless there is an outlier. When there is an outlier, it is plotted as the extreme value and the whisker stops at the last value that is not an outlier.
17. Outliers: Extreme values that are more than 1.5 times the interquartile range beyond the upper or lower quartile. If outliers exist, each whisker is extended to the last value of the data that is not an outlier.
18. Use the following box plot of student test scores on last year's advanced algebra mid-year exam.
    a. What is the median score?
    b. What is the interquartile range?
    c. What percent of the students scored between 62 and 91?
    d. What is the interval of scores of students who ranked below the lower quartile?
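The definitions of median, quartiles, interquartile range, and outliers above can be sketched on Babe Ruth's home run data from the stem-and-leaf example. This is a minimal sketch, assuming the "median of each half" quartile rule described in item 14, with the median itself left out of both halves when the number of values is odd. Calculators and software may use slightly different quartile conventions. The function name `five_number_summary` is made up for this illustration.

```python
from statistics import median

# Babe Ruth's home run totals from the stem-and-leaf example above.
ruth = [54, 59, 35, 41, 46, 25, 47, 60, 54, 46, 49, 46, 41, 34, 22]

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]                # values below the median position
    upper = data[len(data) - half:]    # values above the median position
    return data[0], median(lower), median(data), median(upper), data[-1]

low, q1, med, q3, high = five_number_summary(ruth)

# Interquartile range and the 1.5 * IQR fences used to flag outliers.
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in ruth if x < lower_fence or x > upper_fence]

print("five-number summary:", low, q1, med, q3, high)
print("IQR:", iqr, "outliers:", outliers)
```

The five values returned are exactly the ones a box-and-whisker plot draws; any value outside the two fences would be plotted as a separate point.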
19. The National Football League is separated into two parts: the American Football Conference (AFC) and the National Football Conference (NFC). Here are separate box plots of the capacities of the football stadiums used by the AFC and NFC.
    a. What is the median capacity in each conference?
    b. What is the size of the largest stadium in each conference?
    c. About what percent of the stadiums in the AFC hold fewer than 60,000 people?
    d. On the whole, which conference has larger stadiums?
20. Below is a stem plot of the amount of money spent by 25 shoppers at a grocery store. The stem is in $10 units.
    [Stem plot. Stems: 0 1 2 3 4 5 6 7 8 9 10 11. Leaves as extracted: 3 8 0 1 7 8 9 0 0 3 6 8 1 3 4 7 2 5 5 0 5 2 6 7 3]
21. Steps in the Calculator

Homework: Bar Graphs, Histograms, and Box and Whisker Plots Worksheet

Measures of Central Tendency:
Range: The difference between the greatest and least values; it measures variability.
Deviation: ________________________________________________________________________________
Variance: _________________________________________________________________________________
Standard (Mean) Deviation: ________________________________________________________________________

Population Standard Deviation versus Sample Standard Deviation
Population: Dividing by "n" is used when the data set is the entire population.
    \sigma_x = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}
Sample: Dividing by "n - 1" is used for a sample because it gives a better estimate of the population standard deviation.
    S_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}
"n" = _____________________

Value | Mean | Deviation from mean | Square of deviation from mean

a. Find the median.
b. Find the lower quartile.
c. Find the upper quartile.
d. Determine if there are any outliers.
e. Construct a box-and-whisker plot.
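The difference between dividing by n and dividing by n - 1 can be seen by following the columns of the table above (value, mean, deviation from mean, square of deviation). This is a minimal sketch: the data values are made up for illustration and do not come from the worksheet.

```python
import math
from statistics import mean, pstdev, stdev

# Hypothetical data values, made up for illustration only.
values = [2, 4, 4, 4, 5, 5, 7, 9]

# Mean, then the squared deviation of each value from the mean.
x_bar = mean(values)
squared_deviations = [(x - x_bar) ** 2 for x in values]

n = len(values)
population_sd = math.sqrt(sum(squared_deviations) / n)        # divide by n
sample_sd = math.sqrt(sum(squared_deviations) / (n - 1))      # divide by n - 1

# The standard library functions should agree with the hand computation.
assert math.isclose(population_sd, pstdev(values))
assert math.isclose(sample_sd, stdev(values))

print("population SD:", population_sd)
print("sample SD:", round(sample_sd, 3))
```

Because the sample formula divides by a smaller number, the sample standard deviation is always slightly larger than the population value computed from the same data.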
3. Given the quiz grades in Theresa Chair's two sections of Geometry, answer the following questions.
   [Back-to-back stem plot: 1st Period | stem | 3rd Period. Values as extracted: 0 8 5 3 9 6 5 2 0 5 5 2 0 5 2 1 0 0 0 | 3 4 5 6 7 8 9 10 | 2 9 8 0 5 8 8 8 9 0 2 4 5 5 5 6 0 3 5 8]
   a. Find the mean for 1st period.
   b. Find the standard deviation for 1st period. Will you use population or sample? Why?
   c. Find the mean for 3rd period.
   d. Find the standard deviation for all of Ms. Chair's Geometry students. Will you use population or sample? Why?

Determine whether the data in each table appear to be left-skewed, right-skewed, normally distributed, or bimodal.

1. U.S. Population
   Age      Percent
   0-19     28.
   20-39    29.
   40-59    25.
   60-79    13.

5. Record Low Temperatures in the 50 States
   Temperature ( )    Number of States
                      4
                      12
                      19

6. GPAs of Jr. at Apex
   GPA        Frequency
   0.0-0.4    4
   0.5-0.9    4
   1.0-1.4    2
   1.5-1.9    32
   2.0-2.4    96

7. Time Spent in a Museum
   Minutes    Frequency
   0-25       27
   26-50      46
   51-75      89

8. Average Age of High School Principals
   Age in Years    Number
   31-35           3
   36-40           8
   41-45           15
   46-50           32

9.
   Shoe Size          4  5  6  7  8  9  10
   No. of students    1  2  4  8  5  1  2

33 Sampling Techniques and Misuses
1. Statistics: The science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions. In addition, statistics is about providing a measure of confidence in any conclusions.
2. Population: The entire group of individuals that we want information about.
3. Sample: A part of the population that we actually examine in order to gather information. The sample is used to make generalizations about the population.
4. Qualitative Data: Data that classify individuals based on some attribute or characteristic.
5. Quantitative Data: Data that provide numerical measures of individuals.
   Ex: Determine whether the following variables are qualitative or quantitative.
   a. Gender
   b. Temperature
   c. Number of days during the past week that a college student aged 21 years or older has had at least one drink.
   d. Zip code
6. Discrete Variable: A quantitative variable that has either a finite number of possible values or a countable number of values.
7. Continuous Variable: A quantitative variable that has an infinite number of possible values that are not countable.
   Ex: Determine whether the following quantitative variables are discrete or continuous.
   a. The number of heads obtained after flipping a coin five times.
   b. The number of cars that arrive at a McDonald's drive-through between 12:00 P.M. and 1:00 P.M.
   c. The distance a 2007 Toyota Prius can travel in city driving conditions with a full tank of gas.
8. How do we gather data?
   Surveys
   Observational Studies
   Experimental Studies
   Simulations
9. Observational Study: Investigators observe subjects and measure variables of interest without assigning treatments to the subjects. The treatment that each subject receives is determined beyond the control of the investigator.
   *Observational studies do not allow a researcher to claim causation, only association.
10. Experimental Study: Investigators apply treatments to experimental units (people, animals, plots of land, etc.) and then observe the effect of the treatments on the experimental units.
11. Simulations: The use of a mathematical model to recreate a situation, often repeatedly, so that the likelihood of various outcomes can be more accurately estimated.
    Ex: Identify each as an observational study, experimental study, survey, or simulation.
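The idea of a simulation in item 11 can be illustrated with the coin example from item 7: flip a fair coin five times, count the heads, and repeat many times to estimate how likely each count is. This is only a sketch; the function name `simulate_heads`, the number of repetitions, and the fixed seed are choices made for this illustration.

```python
import random

def simulate_heads(num_trials, flips_per_trial=5, seed=0):
    """Estimate how often each number of heads occurs in repeated fair coin flips."""
    rng = random.Random(seed)              # fixed seed so the run is repeatable
    counts = [0] * (flips_per_trial + 1)   # counts[h] = trials with exactly h heads
    for _ in range(num_trials):
        heads = sum(rng.random() < 0.5 for _ in range(flips_per_trial))
        counts[heads] += 1
    # Convert raw counts into relative frequencies.
    return [c / num_trials for c in counts]

estimated = simulate_heads(10_000)
for heads, share in enumerate(estimated):
    print(f"{heads} heads: {share:.3f}")
```

The number of heads is a discrete variable with only six possible values (0 through 5), which is why the result fits in a short list of relative frequencies.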
Disadvantages: Difficult to do if you must divide strata; formulas for SD and confidence intervals are more complicated; a sampling frame is needed.
16. Cluster Sample: Based upon location. Randomly pick a location and sample everyone there.
    Suppose we want to do a cluster sample of APEX students. One way to do this would be to randomly select 10 classrooms during 2nd period and sample all students in those rooms.
    Advantages: Unbiased; cost is reduced; a sampling frame may not be available (and is not needed).
    Disadvantages: Clusters may not be representative of the population; formulas are complicated.
17. Convenience Sample: Ask people who are easy to ask. Produces biased results.
    An example would be stopping friendly-looking people in the mall to survey. Another example is the surveys left on tables at restaurants, a convenient method. The data obtained by a convenience sample will be biased; however, this method is often used for surveys and results reported in newspapers and magazines.
    Ex: Identify the sampling design.
    a. The Educational Testing Service (ETS) needed a sample of colleges. ETS first divided all colleges into groups of similar types (small public, small private, etc.). Then they randomly selected 3 colleges from each group.
    b. A county commissioner wants to survey people in her district to determine their opinions on a particular law up for adoption. She decides to randomly select blocks in her district and then survey all who live on those blocks.
18. Bias: A systematic error in measuring the estimate that favors certain outcomes. Anything that causes the data to be wrong! It might be attributed to the researchers, the respondents, or the sampling method.

Sampling Design Worksheet
Name: _______________________________

I. Classify the variable as qualitative or quantitative.
II. Determine whether the quantitative variable is discrete or continuous.
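The cluster sample described in item 16 (randomly select 10 classrooms, then sample every student in them) can be sketched as follows. The school roster here is entirely hypothetical: the 40 rooms, the 25 students per room, and the student labels are made up for illustration.

```python
import random

# Hypothetical school: 40 classrooms with 25 students each (made-up numbers).
classrooms = {
    f"Room {r}": [f"student_{r}_{s}" for s in range(25)]
    for r in range(1, 41)
}

rng = random.Random(1)                                # fixed seed for repeatability

# Step 1: randomly pick 10 clusters (classrooms) without replacement.
chosen_rooms = rng.sample(sorted(classrooms), k=10)

# Step 2: sample every student in each chosen classroom.
cluster_sample = [s for room in chosen_rooms for s in classrooms[room]]

print(len(chosen_rooms), "rooms chosen,", len(cluster_sample), "students sampled")
```

Only the list of classrooms is needed to draw the sample, which matches the advantage noted above that a full sampling frame of individual students is not required.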
2. Properties of Normal Curve:
Ex: Find the upper and lower limits of the interval about the mean in which 72% of the values of a set of normally distributed data can be found if the mean is 124 and the standard deviation is 16.
Ex: In a normal distribution, about what percent of the data:
a. fall within 2.2 standard deviations of the mean?
b. do not fall within 2.2 standard deviations of the mean?
Ex: A day is selected at random at a post office whose daily letter-handling rate is normally distributed. The mean number of letters per day is 10,000 and the standard deviation is 350. What is the probability that the post office handles between 9,000 and 11,000 letters per day?
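The three normal-distribution examples above can be checked numerically. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module in place of a printed z-table, so the answers may differ slightly from values read off a table.

```python
from statistics import NormalDist

z = NormalDist()   # standard normal: mean 0, standard deviation 1

# Share of the data within 2.2 standard deviations of the mean, and outside it.
within = z.cdf(2.2) - z.cdf(-2.2)
outside = 1 - within

# Post office: mean 10,000 letters per day, standard deviation 350.
letters = NormalDist(mu=10_000, sigma=350)
p_between = letters.cdf(11_000) - letters.cdf(9_000)

# Central 72% interval for mean 124 and standard deviation 16:
# 14% is left in each tail, so the limits are the 14th and 86th percentiles.
scores = NormalDist(mu=124, sigma=16)
lower, upper = scores.inv_cdf(0.14), scores.inv_cdf(0.86)

print(f"within 2.2 SD: {within:.4f}, outside: {outside:.4f}")
print(f"P(9000 < letters < 11000): {p_between:.4f}")
print(f"72% interval: {lower:.1f} to {upper:.1f}")
```

Roughly 97% of the data fall within 2.2 standard deviations, the post office probability comes out above 99%, and the 72% interval runs from about 106.7 to about 141.3.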
Normal Distribution Worksheet
1. Mrs. Sung gave a test in her trigonometry class. The scores were normally distributed with a mean of 85 and a standard deviation of 3.
   a. What percent would you expect to score between 82 and 88?
   b. What percent would you expect to score between 88 and 91?
   c. What is the probability that a student chosen at random scored between 79 and 91?
Statistics Test Review Sheet
I. The table shows the scores on the final exam in AFM for 90 students at Apex High School.

Scores       53-61   61-69   69-77   77-85   85-93   93-
Frequency    3       7       11      38      19      12