CRASH COURSE
INCLUDES FULLY SOLVED PROBLEMS FOR EVERY TOPIC
EXPERT TIPS FOR MASTERING BUSINESS STATISTICS
ALL YOU NEED TO KNOW TO PASS THE COURSE
LEONARD J. KAZMIER, Ph.D.
SCHAUM’S Easy OUTLINES
BUSINESS STATISTICS

Copyright © 2003 by The McGraw-Hill Companies, Inc. All rights reserved. Manufactured in the United States of America. Except as permitted under the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher.

0-07-142584-5

The material in this eBook also appears in the print version of this title: 0-07-139876-7

All trademarks are trademarks of their respective owners. Rather than put a trademark symbol after every occurrence of a trademarked name, we use names in an editorial fashion only, and to the benefit of the trademark owner, with no intention of infringement of the trademark. Where such designations appear in this book, they have been printed with initial caps.

McGraw-Hill eBooks are available at special quantity discounts to use as premiums and sales promotions, or for use in corporate training programs. For more information, please contact George Hoare, Special Sales, at [email protected] or (212) 904-4069.

TERMS OF USE

This is a copyrighted work and The McGraw-Hill Companies, Inc. (“McGraw-Hill”) and its licensors reserve all rights in and to the work. Use of this work is subject to these terms. Except as permitted under the Copyright Act of 1976 and the right to store and retrieve one copy of the work, you may not decompile, disassemble, reverse engineer, reproduce, modify, create derivative works based upon, transmit, distribute, disseminate, sell, publish or sublicense the work or any part of it without McGraw-Hill’s prior consent. You may use the work for your own noncommercial and personal use; any other use of the work is strictly prohibited. Your right to use the work may be terminated if you fail to comply with these terms.

THE WORK IS PROVIDED “AS IS”. McGRAW-HILL AND ITS LICENSORS MAKE NO GUARANTEES OR WARRANTIES AS TO THE ACCURACY, ADEQUACY OR COMPLETENESS OF OR RESULTS TO BE OBTAINED FROM USING THE WORK, INCLUDING ANY INFORMATION THAT CAN BE ACCESSED THROUGH THE WORK VIA HYPERLINK OR OTHERWISE, AND EXPRESSLY DISCLAIM ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

McGraw-Hill and its licensors do not warrant or guarantee that the functions contained in the work will meet your requirements or that its operation will be uninterrupted or error free. Neither McGraw-Hill nor its licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of cause, in the work or for any damages resulting therefrom. McGraw-Hill has no responsibility for the content of any information accessed through the work. Under no circumstances shall McGraw-Hill and/or its licensors be liable for any indirect, incidental, special, punitive, consequential or similar damages that result from the use of or inability to use the work, even if any of them has been advised of the possibility of such damages.
This limitation of liability shall apply to any claim or cause whatsoever whether such claim or cause arises in contract, tort or otherwise.

DOI: 10.1036/0071425845

Contents

Chapter 1 Analyzing Business Data
Chapter 2 Statistical Presentations and Graphical Displays
Chapter 3 Describing Business Data: Measures of Location
Chapter 4 Describing Business Data: Measures of Dispersion
Chapter 5 Probability
Chapter 6 Probability Distributions for Discrete Random Variables: Binomial, Hypergeometric, and Poisson
Chapter 7 Probability Distributions for Continuous Random Variables: Normal and Exponential
Chapter 8 Sampling Distributions and Confidence Intervals for the Mean
Chapter 9 Other Confidence Intervals
Chapter 10 Testing Hypotheses Concerning the Value of the Population Mean
Chapter 11 Testing Other Hypotheses
Chapter 12 The Chi-Square Test for the Analysis of Qualitative Data
Chapter 13 Analysis of Variance

Chapter 1
Analyzing Business Data

In This Chapter:

✔ Definition of Business Statistics
✔ Descriptive and Inferential Statistics
✔ Types of Applications in Business
✔ Discrete and Continuous Variables
✔ Obtaining Data through Direct Observation vs. Surveys
✔ Methods of Random Sampling
✔ Other Sampling Methods
✔ Solved Problems

Definition of Business Statistics

Statistics refers to the body of techniques used for collecting, organizing, analyzing, and interpreting data.
The data may be quantitative, with values expressed numerically, or they may be qualitative, with characteristics such as consumer preferences being tabulated. Statistics are used in business to help make better decisions by understanding the sources of variation and by uncovering patterns and relationships in business data.

Descriptive and Inferential Statistics

Descriptive statistics include the techniques that are used to summarize and describe numerical data for the purpose of easier interpretation. These methods can either be graphical or involve computational analysis.

Inferential statistics include those techniques by which decisions about a statistical population or process are made based only on a sample having been observed. Because such decisions are made under conditions of uncertainty, the use of probability concepts is required. Whereas the measured characteristics of a sample are called sample statistics, the measured characteristics of a statistical population are called population parameters. The procedure by which the characteristics of all the members of a defined population are measured is called a census. When statistical inference is used in process control, the sampling is concerned particularly with uncovering and controlling the sources of variation in the quality of the output.

Types of Applications in Business

The methods of classical statistics were developed for the analysis of sample data, and for the purpose of inference about the population from which the sample was selected. There is explicit exclusion of personal judgments about the data, and there is an implicit assumption that sampling is done from a static population. The methods of decision analysis focus on incorporating managerial judgments into statistical analysis.
The methods of statistical process control are used with the premise that the output of a process may not be stable. Rather, the process may be dynamic, with assignable causes associated with variation in the quality of the output over time.

Discrete and Continuous Variables

A discrete variable can have observed values only at isolated points along a scale of values. In business statistics, such data typically occur through [...]

A strict random sample is not usually feasible in statistical process control, since only readily available items or transactions can easily be inspected. In order to capture changes that are taking place in the quality of process output, small samples are taken at regular intervals of time. Such a sampling scheme is called the method of rational subgroups. Such sample data are treated as if random samples were taken at each point in time, with the understanding that one should be alert to any known reasons why such a sampling scheme could lead to biased results.

Remember
The four principal methods of random sampling are the simple, systematic, stratified, and cluster sampling methods.

Solved Problems

Solved Problem 1.1 Indicate which of the following terms or operations are concerned with a sample or sampling (S), and which are concerned with a population (P): (a) group measures called parameters, (b) use of inferential statistics, (c) taking a census, (d) judging the quality of an incoming shipment of fruit by inspecting several crates of the large number included in the shipment.
Solution: (a) P, (b) S, (c) P, (d) S

Solved Problem 1.2 Indicate which of the following types of information could be used most readily in either classical statistical inference (CI), decision analysis (DA), or statistical process control (PC): (a) managerial judgments about the likely level of sales for a new product, (b) subjecting every fiftieth car assembled to a comprehensive quality evaluation, (c) survey results for a simple random sample of people who purchased a particular car model, (d) verification of bank account balances for a systematic random sample of accounts.

Solution: (a) DA, (b) PC, (c) CI, (d) CI

Solved Problem 1.3 For the following types of values, designate discrete variables (D) and continuous variables (C): (a) weight of the contents of a package of cereal, (b) diameter of a bearing, (c) number of defective items produced, (d) number of individuals in a geographic area who are collecting unemployment benefits, (e) the average number of prospective customers contacted per sales representative during the past month, (f) dollar amount of sales.

Solution: (a) C, (b) C, (c) D, (d) D, (e) C, (f) D

Solved Problem 1.4 Indicate which of the following data-gathering procedures would be considered an experiment (E), and which would be considered a survey (S): (a) a political poll of how individuals intend to vote in an upcoming election, (b) customers in a shopping mall interviewed about why they shop there, (c) comparing two approaches to marketing an annuity policy by having each approach used in comparable geographic areas.
Solution: (a) S, (b) S, (c) E

Solved Problem 1.5 Indicate which of the following types of samples best exemplify or would be concerned with either a judgment sample (J), a convenience sample (C), or the method of rational subgroups (R): (a) samples of five light bulbs each are taken every 20 minutes in a production process to determine their resistance to high voltage, (b) a beverage company assesses consumer response to the taste of a proposed alcohol-free beer by taste tests in taverns located in the city where the corporate offices are located, (c) an opinion pollster working for a political candidate talks to people at various locations in the district based on the assessment that the individuals appear representative of the district’s voters.

Solution: (a) R, (b) C, (c) J

Chapter 2
Statistical Presentations and Graphical Displays

In This Chapter:

✔ Frequency Distributions
✔ Class Intervals
✔ Histograms and Frequency Polygons
✔ Frequency Curves
✔ Cumulative Frequency Distributions
✔ Relative Frequency Distributions
✔ The “And-Under” Type of Frequency Distributions
✔ Stem-and-Leaf Diagrams
✔ Dotplots
✔ Pareto Charts

[...]

2. leptokurtic: peaked, with the observations concentrated within a narrow range of values; or
3. mesokurtic: neither flat nor peaked, in terms of the distribution of observed values.

Cumulative Frequency Distributions

A cumulative frequency distribution identifies the cumulative number of observations included below the upper exact limit of each class in the distribution. The cumulative frequency for a class can be determined by adding the observed frequency for that class to the cumulative frequency for the preceding class. The graph of a cumulative frequency distribution is called an ogive.
For the less-than type of cumulative distribution, this graph indicates the cumulative frequency below each exact class limit of the frequency distribution. When such a line graph is smoothed, it is called an ogive curve.

Remember
Terms of skewness: Negatively skewed, Positively skewed, or Symmetrical.
Terms of kurtosis: Platykurtic, Leptokurtic, or Mesokurtic.

Relative Frequency Distributions

A relative frequency distribution is one in which the number of observations associated with each class has been converted into a relative frequency by dividing by the total number of observations in the entire distribution. Each relative frequency is thus a proportion, and can be converted into a percentage by multiplying by 100.

One of the advantages associated with preparing a relative frequency distribution is that the cumulative distribution and the ogive for such a distribution indicate the cumulative proportion of observations up to the various possible values of the variable. A percentile value is the cumulative percentage of observations up to a designated value of a variable.

The “And-Under” Type of Frequency Distribution

The class limits that are given in computer-generated frequency distributions usually are “and-under” types of limits. For such limits, the stated class limits are also the exact limits that define the class. The values that are grouped in any one class are equal to or greater than the lower class limit, and up to but not including the value of the upper class limit. A descriptive way of presenting such class limits is:

5 and under 8
8 and under 11

In addition to this type of distribution being more convenient to implement for computer software, it sometimes also reflects a more “natural” way of collecting the data in the first place. For instance, people’s ages generally are reported as the age at the last birthday, rather than the age at the nearest birthday.
Thus, to be 24 years old is to be at least 24 but less than 25 years old.

Stem-and-Leaf Diagrams

A stem-and-leaf diagram is a relatively simple way of organizing and presenting measurements in a rank-ordered bar chart format. This is a popular technique in exploratory data analysis. As the name implies, exploratory data analysis is concerned with techniques for preliminary analyses of data in order to gain insights about patterns and relationships. Frequency distributions and the associated graphic techniques covered in the previous sections of this chapter are also often used for this purpose. In contrast, confirmatory data analysis includes the principal methods of statistical inference that constitute most of this book. Confirmatory data analysis is concerned with coming to final statistical conclusions about patterns and relationships in data.

A stem-and-leaf diagram is similar to a histogram, except that it is easier to construct and shows the actual data values, rather than having the specific values lost by being grouped into defined classes. However, the technique is most readily applicable and meaningful only if the first digit of the measurement, or possibly the first two digits, provides a good basis for separating data into groups, as in test scores. Each group then is analogous to a class or category in a frequency distribution. Where the first digit alone is used to group the measurements, the name stem-and-leaf refers to the fact that the first digit is the stem, and each of the measurements with that first-digit value becomes a leaf in the display.

Dotplots

A dotplot is similar to a histogram in that a distribution of the data values is portrayed graphically. However, the difference is that the values are plotted individually, rather than being grouped into classes.
Dotplots are more applicable for small data sets, for which grouping the values into classes of a frequency distribution is not warranted. Dotplots are particularly useful for comparing two different data sets, or two subgroups of a data set.

Pareto Charts

A Pareto chart is similar to a histogram, except that it is a frequency bar chart for a qualitative variable, rather than being used for quantitative data that have been grouped into classes. The bars of the chart, which can represent either frequencies or relative frequencies, are arranged in descending order from left to right. This arrangement results in the most important categories of data, according to frequency of occurrence, being located at the initial positions in the chart. Pareto charts are used in process control to tabulate the causes associated with assignable-cause variations in the quality of process output. It is typical that only a few categories of causes are associated with most quality problems, and Pareto charts permit worker teams and managers to focus on these most important areas that are in need of corrective action.

Bar Charts and Line Graphs

A time series is a set of observed values, such as production or sales data, for a sequentially ordered series of time periods. For the purpose of [...]

Solved Problem 2.3 Prepare a frequency polygon and a frequency curve for the data in Table 2.1. Describe the frequency curve from the standpoint of skewness.

Solution
The frequency curve appears to be somewhat negatively skewed.

Solved Problem 2.4 Prepare a cumulative frequency distribution for Table 2.1. Present the cumulative frequency distribution graphically by means of an ogive curve.
[Figure 2-2]

Solution
[Table 2-2 Cumulative frequency distribution of apartment rental rates]
[Figure 2-3]

Solved Problem 2.5 Given that frequency curve (a) in Figure 2-4 is both symmetrical and mesokurtic, describe curves (b), (c), (d), (e), and (f) in terms of skewness and kurtosis.

[Figure 2-4]

Solution
Curve (b) is symmetrical and leptokurtic; curve (c), positively skewed and mesokurtic; curve (d), negatively skewed and mesokurtic; curve (e), symmetrical and platykurtic; and curve (f), positively skewed and leptokurtic.

[...]

The Weighted Mean

The weighted mean or weighted average is an arithmetic mean in which each value is weighted according to its importance in the overall group. The formulas for the population and sample weighted means are identical:

$$\mu_w \ \text{or} \ \bar{X}_w = \frac{\sum (wX)}{\sum w}$$

Operationally, each value in the group (X) is multiplied by the appropriate weight factor (w), and the products are then summed and divided by the sum of the weights.

Note!
The formulas for the sample mean and population mean are as follows:

$$\bar{X} = \frac{\sum X}{n} \qquad \mu = \frac{\sum X}{N}$$

The Median

The median of a group of items is the value of the middle item when all the items in the group are arranged in either ascending or descending order, in terms of value. For a group with an even number of items, the median is assumed to be midway between the two values adjacent to the middle. When a large number of values is contained in the group, the following formula to determine the position of the median in the ordered group is useful:

$$\text{Med} = X_{[(n/2) + (1/2)]}$$

The Mode

The mode is the value that occurs most frequently in a set of values. A distribution with one such value is described as being unimodal.
For a small data set in which no measured values are repeated, there is no mode. When two nonadjoining values are about equal in having maximum frequencies associated with them, the distribution is described as being bimodal. Distributions of measurements with several modes are referred to as being multimodal.

Relationship between the Mean and Median

For any symmetrical distribution, the mean, median, and mode all coincide in value (see Figure 3-1 (a) below). For a positively skewed distribution the mean is always larger than the median (see Figure 3-1 (b) below). For a negatively skewed distribution the mean is always smaller than the median (see Figure 3-1 (c) below). These latter two relationships are always true, regardless of whether the distribution is unimodal or not.

[Figure 3-1]

Mathematical Criteria Satisfied by the Median and the Mean

One purpose for determining any measure of central tendency, such as a median or mean, is to use it to represent the general level of the values included in the group. Both the median and the mean are “good” representative measures, but from the standpoint of different mathematical criteria or objectives. The median is the representative value that minimizes the sum of the absolute values of the differences between each value in the group and the median. That is, the median minimizes the sum of the absolute deviations with respect to the individual values being represented. In contrast, the arithmetic mean focuses on minimizing the sum of the squared deviations with respect to the individual values in the group. The criterion whose objective is to minimize the sum of the squared deviations associated with a representative value is called the least-squares criterion. This criterion is the one that is most important in statistical inference based on sample data.
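The two criteria can be checked numerically. The following Python sketch is an illustration only: the five data values are hypothetical and are not taken from the text, and the helper names sum_abs_dev and sum_sq_dev are made up for this example. It scans candidate representative values in steps of 0.1 and reports which candidate minimizes each sum.

```python
from statistics import mean, median

# Hypothetical data (not from the text), skewed by one large value.
values = [2, 3, 5, 8, 22]


def sum_abs_dev(center, data):
    """Sum of absolute deviations of the data from a candidate center."""
    return sum(abs(x - center) for x in data)


def sum_sq_dev(center, data):
    """Sum of squared deviations of the data from a candidate center."""
    return sum((x - center) ** 2 for x in data)


m = mean(values)      # arithmetic mean of the group
med = median(values)  # middle value of the ordered group

# Candidate representative values 0.0, 0.1, ..., 25.0.
candidates = [i / 10 for i in range(0, 251)]
best_abs = min(candidates, key=lambda c: sum_abs_dev(c, values))
best_sq = min(candidates, key=lambda c: sum_sq_dev(c, values))

print("median:", med, "minimizer of absolute deviations:", best_abs)
print("mean:", m, "minimizer of squared deviations:", best_sq)
```

For these values the scan settles on the median (5) under the absolute-deviation criterion and on the mean (8) under the least-squares criterion, which is the distinction drawn in the paragraph above.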
Use of the Mean, Median, and Mode

We first consider the use of these measures of average for representing population data. The value of the mode indicates where most of the observed values are located. It can be useful as a descriptive measure for a population group, but only if there is one clear mode. On the other hand, the median is always an excellent measure by which to represent the “typical” level of observed values in a population. This is true regardless of whether there is more than one mode or whether the population distribution is skewed or symmetrical. The lack of symmetry is no special problem because the median wage rate, for example, is always the wage rate of the “middle person” when the wage rates are listed in order of magnitude. The arithmetic mean is also an excellent representative value for a population, but only if the population is fairly symmetrical. For nonsymmetrical data, the extreme values will serve to distort the value of the mean as a representative value. Thus, the median is generally the best measure of data location for describing population data.

[...]

Solution
Because statistical inference for a population is involved, our main concern is to report an average that is the most stable and has the least variability from sample to sample. The average that satisfies this requirement is the mean, because it satisfies the least-squares criterion. Therefore, the value reported should be the sample mean, or $1.21.

Solved Problem 3.5 A sample of 20 production workers in a company earned the following net pay amounts after all deductions for a given week: $240, 240, 240, 240, 240, 240, 240, 240, 255, 255, 265, 265, 280, 280, 290, 300, 305, 325, 330, 340. Calculate the (a) mean, (b) median, and (c) mode for this group of wages.
Solution
(a) Mean = $270.50
(b) Median = $260.00
(c) Mode = most frequent value = $240.00

Chapter 4
Describing Business Data: Measures of Dispersion

In This Chapter:

✔ Measures of Dispersion in Data Sets
✔ The Range and Modified Ranges
✔ The Mean Absolute Deviation
✔ The Variance and Standard Deviation
✔ Simplified Calculations for the Variance and Standard Deviation
✔ Mathematical Criterion Associated with the Variance and Standard Deviation
✔ Use of the Standard Deviation in Data Description
✔ Use of the Range and Standard Deviation in Statistical Process Control
✔ The Coefficient of Variation
✔ Pearson’s Coefficient of Skewness
✔ Solved Problems

Measures of Dispersion in Data Sets

The measures of central tendency described in Chapter 3 are useful for identifying the “typical” value in a group of values. In contrast, measures of dispersion, or variability, are concerned with describing the variability among the values. Several techniques are available for measuring the extent of variability in data sets. The ones described in this chapter are the range, modified ranges, average deviation, variance, standard deviation, and coefficient of variation.

The Range and Modified Ranges

The range, or R, is the difference between highest and lowest values included in a data set. Thus, when H represents the highest value in the group and L represents the lowest value, the range for ungrouped data is: R = H − L.

A modified range is a range for which some of the extreme values at each end of the distribution are eliminated from consideration. The middle 50 percent is the range between the values at the 25th percentile point and the 75th percentile point of the distribution. As such, it is also the range between the first and third quartiles of the distribution.
For this reason, the middle 50 percent range is usually designated as the interquartile range (IQR). Thus,

IQR = Q3 − Q1

Other modified ranges that are sometimes used are the middle 80 percent, middle 90 percent, and middle 95 percent.

[...] mean must be determined. Alternative formulas, which are mathematically equivalent but which do not require the determination of each deviation, have been derived. Because these formulas are generally easier to use for computations, they are called computational formulas. The computational formulas are:

Population variance:

$$\sigma^2 = \frac{\sum X^2 - N\mu^2}{N}$$

Population standard deviation:

$$\sigma = \sqrt{\frac{\sum X^2 - N\mu^2}{N}}$$

Sample variance:

$$s^2 = \frac{\sum X^2 - n\bar{X}^2}{n - 1}$$

Sample standard deviation:

$$s = \sqrt{\frac{\sum X^2 - n\bar{X}^2}{n - 1}}$$

Mathematical Criterion Associated with the Variance and Standard Deviation

In Chapter 3 we described the least-squares criterion and established that the arithmetic mean is the measure of data location that satisfies this criterion. Now refer to the formula for population variance and note that the variance is in fact a type of arithmetic mean, in that it is the sum of squared deviations divided by the number of such values. From this standpoint alone, the variance is thereby associated with the least-squares criterion. Note also that the sum of the squared deviations in the numerator of the variance formula is precisely the sum that is minimized when the arithmetic mean is used as the measure of location. Therefore, the variance and its square root, the standard deviation, have a close mathematical relationship with the mean, and both are used in statistical inference with sample data.
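As a quick check that the computational formulas agree with the deviations formulas, the sketch below evaluates both forms in Python. The five sample values are hypothetical and chosen only so the arithmetic is easy to follow by hand; the comparison against the standard library functions statistics.variance and statistics.pvariance is an extra sanity check and not something the text itself describes.

```python
import math
import statistics

# Hypothetical measurements (not from the text).
data = [4, 7, 9, 10, 15]
n = len(data)
x_bar = sum(data) / n                 # mean = 45 / 5 = 9.0
sum_x2 = sum(x * x for x in data)     # sum of squared values = 471

# Deviations formula: squared deviations from the mean, divided by n - 1.
s2_deviations = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# Computational formula: (sum of X^2 - n * mean^2) / (n - 1).
s2_computational = (sum_x2 - n * x_bar ** 2) / (n - 1)

# Population versions divide by N instead of n - 1 (data treated as a population).
var_pop_deviations = sum((x - x_bar) ** 2 for x in data) / n
var_pop_computational = (sum_x2 - n * x_bar ** 2) / n

s = math.sqrt(s2_computational)           # sample standard deviation
sigma = math.sqrt(var_pop_computational)  # population standard deviation

print("sample variance:", s2_deviations, s2_computational)
print("population variance:", var_pop_deviations, var_pop_computational)
print("sample std dev:", round(s, 4), "population std dev:", round(sigma, 4))
```

Both routes give a sample variance of 16.5 and a population variance of 13.2 for these values. With real data the two forms can differ in the last decimal place when intermediate results are rounded.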
Use of the Standard Deviation in Data Description

As established in the preceding section, the standard deviation is used in conjunction with a number of methods of statistical inference covered in later chapters of this book. A description of these methods is beyond the scope of the present chapter. However, aside from the uses of the standard deviation in inference, we can now briefly introduce a use of the standard deviation in data description.

Consider a distribution of data values that is both symmetrical and mesokurtic. The frequency curve for such a distribution is called a normal curve. For a set of values that is normally distributed, it is always true that approximately 68 percent of the values are included within one standard deviation of the mean and approximately 95 percent of the values are included within two standard deviation units of the mean. These observations are presented diagrammatically in Figures 4-1(a) and (b), respectively. Thus, in addition to the mean and standard deviation both being associated with the least-squares criterion, they are also mutually used in analyses for normally distributed variables.

[Figure 4-1(a)]
[Figure 4-1(b)]

Use of the Range and Standard Deviation in Statistical Process Control

As introduced in Chapter 3, the sample mean is used in process control for averages by the construction of charts. In addition to controlling process averages, there is at least an equal interest in controlling process variability. To monitor and control variability, either the ranges or the standard deviations of the rational subgroups that constitute the sequential samples are determined.
In either case, the values are plotted identically in form to the run chart for the sequence of sample mean weights. Such a chart for sample ranges is called an R chart, while the chart for sample standard deviations is called an s chart. From the standpoint of using the measure of variability that is most stable, the least-squares oriented s chart is preferred. Historically, the range has been used most frequently for monitoring process variability because it can be easily determined with little calculation. However, availability of more sophisticated weighing devices that are programmed to calculate both the sample mean and standard deviation has resulted in greater use of s charts.

The Coefficient of Variation

The coefficient of variation, CV, indicates the relative magnitude of the standard deviation as compared with the mean of the distribution of measurements, as a percentage. Thus, the formulas are:

Population:

$$CV = \frac{\sigma}{\mu} \times 100$$

Sample:

$$CV = \frac{s}{\bar{X}} \times 100$$

[...]

Solved Problem 4.3 Determine the sample standard deviation for the data in Solved Problems 4.1 and 4.2 by using (a) the deviations formula and (b) the alternative computational formula, and demonstrate that the answers are equivalent.

Solution: (a) s = $1.28 (b) s = $1.27

[Table 4.2 Worksheet for calculating the sample standard deviation for the snack bar data]

Solved Problem 4.4 Many national academic achievement and aptitude tests, such as the SAT, report standardized test scores with the mean for the normative group used to establish scoring standards converted to 500 with a standard deviation of 100. Suppose that the distribution of scores for such a test is known to be approximately normally distributed. Determine the approximate percentage of reported scores that would be between (a) 400 and 600 and (b) between 500 and 700.
Solution: (a) 68% (b) 47.5% (i.e., one-half of the middle 95%)

Solved Problem 4.5 Referring to the standardized achievement test in Solved Problem 4.4, what are the percentile values that would be reported for scores of (a) 400, (b) 500, (c) 600 and (d) 700?

Solution: (a) 16, (b) 50, (c) 84, and (d) 97.5

Chapter 5
Probability

In This Chapter:

✔ Basic Definitions of Probability
✔ Expressing Probability
✔ Mutually Exclusive and Nonexclusive Events
✔ The Rules of Addition
✔ Independent Events, Dependent Events, and Conditional Probability
✔ The Rules of Multiplication
✔ Bayes’ Theorem
✔ Joint Probability Tables
✔ Permutations
✔ Combinations
✔ Solved Problems

[...]

Note!
This definition does not indicate that such events must necessarily always occur jointly.

For instance, suppose we consider the two possible events “ace” and “king” with respect to a card being drawn from a deck of playing cards. These two events are mutually exclusive, because any given card cannot be both an ace and a king. Suppose we consider the two possible events “ace” and “spade.” These events are not mutually exclusive, because a given card can be both an ace and a spade; however, it does not follow that every ace is a spade or every spade is an ace.

The Rules of Addition

The rules of addition are used when we wish to determine the probability of one event or another (or both) occurring in a single observation. Symbolically, we can represent the probability of event A or event B occurring by P(A or B). In the language of set theory this is called the union of A and B and the probability is designated by P(A ∪ B) (read “probability of A union B”). There are two variations of the rule of addition, depending on whether or not the two events are mutually exclusive.
The rule of addition for mutually exclusive events is

P(A or B) = P(A ∪ B) = P(A) + P(B)

For events that are not mutually exclusive, the probability of the joint occurrence of the two events is subtracted from the sum of the simple probabilities of the two events. We can represent the probability of joint occurrence by P(A and B). In the language of set theory this is called the intersection of A and B and the probability is designated by P(A ∩ B) (read “probability of A intersect B”). Thus, the rule of addition for events that are not mutually exclusive is

P(A or B) = P(A) + P(B) − P(A and B)

That formula is also often called the general rule of addition, because for events that are mutually exclusive the last term would always be zero, resulting in the formula then being equivalent to the formula for mutually exclusive events.

Venn diagrams can be used to portray the rationale underlying the two rules of addition. In Figure 5-1(a), note that the probability of A or B occurring is conceptually equivalent to adding the proportion of area included in A and B. In Figure 5-1(b), for events that are not mutually exclusive, some elementary events are included in both A and B; thus there is overlap between these event sets. When the areas included in A and B are added together for events that are not mutually exclusive, the area of overlap is essentially added in twice. Thus, the rationale of subtracting P(A and B) in the rule of addition for nonexclusive events is to correct the sum for the duplicate addition of the intersect area.

Independent Events, Dependent Events, and Conditional Probability

Two events are independent when the occurrence or nonoccurrence of one event has no effect on the probability of occurrence of the other event. Two events are dependent when the occurrence or nonoccurrence of one event does affect the probability of occurrence of the other event.
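The two rules of addition can be checked against the card examples used earlier (ace or king; ace or spade). The following is a minimal Python sketch and is not part of the original text; the function names are illustrative assumptions, and exact fractions are used so that the results can be verified by hand.

```python
from fractions import Fraction

DECK_SIZE = 52  # standard deck of playing cards


def p_or_exclusive(p_a, p_b):
    """Rule of addition for mutually exclusive events: P(A) + P(B)."""
    return p_a + p_b


def p_or_general(p_a, p_b, p_a_and_b):
    """General rule of addition: P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


p_ace = Fraction(4, DECK_SIZE)
p_king = Fraction(4, DECK_SIZE)
p_spade = Fraction(13, DECK_SIZE)
p_ace_and_spade = Fraction(1, DECK_SIZE)  # only the ace of spades is both

# "Ace" and "king" are mutually exclusive, so the simple rule applies.
p_ace_or_king = p_or_exclusive(p_ace, p_king)

# "Ace" and "spade" overlap, so the joint probability is subtracted once.
p_ace_or_spade = p_or_general(p_ace, p_spade, p_ace_and_spade)

print(p_ace_or_king)   # 2/13
print(p_ace_or_spade)  # 4/13
```

Passing a joint probability of zero to the general rule gives the same result as the rule for mutually exclusive events, which is why the general rule covers both cases.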
When two events are dependent, the concept of conditional probability is employed to designate the probability of occurrence of the related event. The expression P(B|A) indicates the probability of event B occurring given that event A has occurred. Note that B|A is not a fraction.

Conditional probability expressions are not required for independent events because by definition there is no relationship between the occurrence of such events. Therefore, if events A and B are independent, the conditional probability P(B|A) is always equal to the simple probability P(B). If the simple probability of a first event A and the joint probability of two events A and B are known, then the conditional probability P(B|A) can be determined by:

P(B|A) = P(A and B)/P(A)

Figure 5-1

There is often some confusion regarding the distinction between mutually exclusive and nonexclusive events on the one hand, and the concepts of independence and dependence on the other hand. Particularly, note the difference between events that are mutually exclusive and events that are independent. Mutual exclusiveness indicates that two events cannot both occur, whereas independence indicates that the probability of occurrence of one event is not affected by the occurrence of the other event. Therefore it follows that if two events are mutually exclusive, this is a particular example of highly dependent events, because the probability of one event given that the other has occurred would always be equal to zero.

The Rules of Multiplication

The rules of multiplication are concerned with determining the probability of the joint occurrence of A and B. This concerns the intersection of A and B: P(A ∩ B). There are two variations of the rule of multiplication, according to whether the two events are independent or dependent.
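The distinction between independent events and mutually exclusive events can be made concrete with the conditional probability formula above. The sketch below is an illustration only, not part of the original text; the helper name `conditional` and the card events chosen are assumptions made for the example.

```python
from fractions import Fraction


def conditional(p_a_and_b, p_a):
    """P(B|A) = P(A and B) / P(A); undefined when P(A) is zero."""
    if p_a == 0:
        raise ValueError("P(A) must be greater than zero")
    return p_a_and_b / p_a


p_ace = Fraction(4, 52)
p_king = Fraction(4, 52)
p_spade = Fraction(13, 52)

# "Ace" and "spade": knowing the card is a spade does not change P(ace).
p_ace_given_spade = conditional(Fraction(1, 52), p_spade)
print(p_ace_given_spade == p_ace)  # True, so the events are independent

# "Ace" and "king" are mutually exclusive: the joint probability is zero,
# so P(king | ace) is zero even though P(king) is not.
p_king_given_ace = conditional(Fraction(0), p_ace)
print(p_king_given_ace, p_king)  # 0 1/13, so the events are dependent
```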
The rule of multiplication for independent events is:

P(A and B) = P(A ∩ B) = P(A)P(B)

For dependent events the probability of the joint occurrence of A and B is the probability of A multiplied by the conditional probability of B given A. An equivalent value is obtained if the two events are reversed in position. Thus the rule of multiplication for dependent events is:

P(A and B) = P(A)P(B|A); or
P(A and B) = P(B and A) = P(B)P(A|B)

The first formula is often called the general rule of multiplication, because for events that are independent the conditional probability P(B|A) is always equal to the unconditional probability value P(B), resulting in the formula then being equivalent to the formula for independent events.

[...] determine the probability of an event by determining the number of combinations of outcomes that include that event as compared with the total number of combinations that are possible. Of course, this again represents the classical approach to probability and is based on the assumption that all combinations are equally likely.

Solved Problems

Solved Problem 5.1 For each of the following situations, indicate whether the classical, relative frequency, or subjective approach would be most useful for determining the required probability value.
(a) Probability that there will be a recession next year.
(b) Probability that a six-sided die will show either a 6 or 1.
(c) Probability that a randomly chosen person who enters a large department store will make a purchase in that store.

Solution: (a) subjective, (b) classical, and (c) relative frequency

Solved Problem 5.2 Determine the probability of obtaining an ace (A), king (K), or a deuce (D) when one card is drawn from a well-shuffled deck of 52 playing cards.

Solution: P(A or K or D) = P(A) + P(K) + P(D) = 4/52 + 4/52 + 4/52 = 12/52 = 3/13

Solved Problem 5.3 In general, the probability that a prospect will make a purchase after being contacted by a salesperson is P = 0.40.
If a salesperson selects three prospects randomly from a file and makes contact with them, what is the probability that all three prospects will make a purchase?

Solution: P(all are purchasers) = (0.40) × (0.40) × (0.40) = 0.064

Chapter 6
Probability Distributions for Discrete Random Variables: Binomial, Hypergeometric, and Poisson

In This Chapter:
✔ What Is a Random Variable?
✔ Describing a Discrete Random Variable
✔ The Binomial Distribution
✔ The Binomial Variable Expressed by Proportions
✔ The Hypergeometric Distribution
✔ The Poisson Distribution
✔ Poisson Approximation of Binomial Probabilities
✔ Solved Problems

What Is a Random Variable?

In contrast to categorical events, such as drawing a particular card from a deck of cards, a random variable is a numerical event whose value is determined by a chance process. When probability values are assigned to all possible numerical values of a random variable X, either by a listing or by a mathematical function, the result is a probability distribution. The sum of the probabilities for all the possible numerical outcomes must equal 1.0. Individual probability values may be denoted by the symbol f(x), which indicates that a mathematical function is involved, by P(x = X), which recognizes that the random variable can have various specific values, or simply by P(X).

For a discrete random variable, observed values can occur only at isolated points along a scale of values. Therefore, it is possible that all numerical values for the variable can be listed in a table with accompanying probabilities. There are several standard probability distributions that can serve as models for a wide variety of discrete random variables involved in business applications.
The standard models described in this chapter are the binomial, hypergeometric, and Poisson probability distributions.

For a continuous random variable, all possible fractional values of the variable cannot be listed, and therefore, the probabilities that are determined by a mathematical function are portrayed graphically by a probability density function or probability curve.

Describing a Discrete Random Variable

Just as for collections of sample and population data, it is often useful to describe a random variable in terms of its mean and its variance, or standard deviation.

The values of p referenced in a table of binomial probabilities typically do not exceed p = 0.50. If the value of p in a particular application exceeds 0.50, the problem is restated so that the event is defined in terms of the number of failures rather than the number of successes.

The expected value (long-run mean) and variance for a given binomial distribution could be determined by listing the probability distribution in a table and applying the formulas presented earlier in the chapter. However, the expected number of successes can be computed directly:

E(X) = np

Where q = (1 − p), the variance of the number of successes can also be computed directly:

V(X) = npq

The Binomial Variable Expressed by Proportions

Instead of expressing the random binomial variable as the number of successes X, we can designate it in terms of the proportion of successes p̂, which is the ratio of the number of successes to the number of trials:

p̂ = X/n

In such cases, the formula is modified only with respect to defining the proportion.
Thus, the probability of observing exactly p̂ proportion of successes in n Bernoulli trials is:

\[ P(\hat{p} = X/n \mid n, p) = {}_nC_X \, p^X q^{n-X} \]
\[ P(\hat{p} = X/n \mid n, \pi) = {}_nC_X \, \pi^X (1 - \pi)^{n-X} \]

In the second formula, π is the equivalent of p except that it specifically indicates that the probability of success in an individual trial is a population or process parameter.

When the binomial variable is expressed as a proportion, the distribution is still discrete and not continuous. Only the particular proportions for which the number of successes X is a whole number can occur. The expected value for a binomial probability distribution expressed by proportions is equal to the population proportion, which may be designated by either p or π:

E(p̂) = p or E(p̂) = π

The variance of the proportion of successes for a binomial probability distribution, when q = (1 − p), is:

V(p̂) = pq/n or V(p̂) = π(1 − π)/n

The Hypergeometric Distribution

When sampling is done without replacement of each sampled item taken from a finite population of items, the Bernoulli process does not apply because there is a systematic change in the probability of success as items are removed from the population.

Don’t Forget! When sampling without replacement is used in a situation that would otherwise qualify as a Bernoulli process, the hypergeometric distribution is the appropriate discrete probability distribution.

The Poisson Distribution

The Poisson distribution can be used to determine the probability of a designated number of events occurring when the events occur in a continuum of time or space. Such a process is called a Poisson process; it is similar to the Bernoulli process except that the events occur over a continuum and there are no trials as such. An example of such a process is the arrival of incoming calls at a telephone switchboard.
As was the case for the Bernoulli process, it is assumed that the events are independent and that the process is stationary.

Only one value is required to determine the probability of a designated number of events occurring in a Poisson process: the long-run mean number of events for the specific time or space dimension of interest. This mean generally is represented by λ (Greek lambda), or possibly by μ. The formula for determining the probability of a designated number of successes X in a Poisson distribution is:

\[ P(X \mid \lambda) = \frac{\lambda^X e^{-\lambda}}{X!} \]

Because a Poisson process is assumed to be stationary, it follows that the mean of the process is always proportional to the length of the time or space continuum. Therefore, if the mean is available for one length of time, the mean for any other required time period can be determined.

You Need to Know: This is important, because the value of λ that is used must apply to the time period of interest.

By definition, the expected value for a Poisson probability distribution is equal to the mean of the distribution: E(X) = λ. As it happens, the variance of the number of events for a Poisson probability distribution is also equal to the mean of the distribution λ: V(X) = λ.

Poisson Approximation of Binomial Probabilities

When the number of observations or trials n in a Bernoulli process is large, computations are quite tedious. Further, tabled probabilities for very small values of p are not generally available. Fortunately, the Poisson distribution is suitable as an approximation of binomial probabilities when n is large and p or q is small. A convenient rule is that such approximation can be made when n ≥ 30, and either np < 5 or nq < 5.

[...]

✔ The Exponential Probability Distribution
✔ Solved Problems

Continuous Random Variables

As contrasted to a discrete random variable, a continuous random variable is one that can assume any fractional value within a defined range of values. Because there is an infinite number of possible fractional measurements, one cannot list every possible value with corresponding probability. Instead, a probability density function is defined. This mathematical expression gives the function of X, represented by the symbol f(X), for any designated value of the random variable X. The plot for such a function is called a probability curve, and the area between any two points under the curve indicates the probability of a value between these two points occurring by chance.

Several standard continuous probability distributions are applicable as models to a wide variety of continuous variables under designated circumstances. Probability tables have been prepared for these standard distributions, making it unnecessary to use the method of integration in order to determine areas under the probability curve for these distributions. The standard continuous probability models described in this chapter are the normal and exponential probability distributions.

The Normal Probability Distribution

The normal probability distribution is a continuous probability distribution that is both symmetrical and mesokurtic. The probability curve representing the normal probability distribution is often described as being bell-shaped. The normal probability distribution is important in statistical inference for three distinct reasons:

1. The measurements obtained in many random processes are known to follow this distribution.
2. Normal probabilities can often be used to approximate other probability distributions, such as the binomial and Poisson distributions.
3. Distributions of such statistics as the sample mean and sample proportion are approximately normally distributed when the sample size is large, regardless of the distribution of the parent population.

As is true for any continuous probability distribution, a probability value for a continuous random variable can be determined only for an interval of values. The height of the density function, or probability curve, for a normally distributed variable is given by

\[ f(X) = \frac{1}{\sqrt{2\pi\sigma^2}} \, e^{-[(X - \mu)^2 / 2\sigma^2]} \]

where π is the constant 3.1416, e is the constant 2.7183, μ is the mean of the distribution, and σ is the standard deviation of the distribution. Since every different combination of μ and σ would generate a different normal probability distribution (all symmetrical and mesokurtic), tables of normal probabilities are based on one particular distribution: the standard normal distribution. This is the normal probability distribution with μ = 0 and σ = 1. Any value X from a normally distributed population can be converted into the equivalent standard normal value z by the formula:

z = (X − μ)/σ

Important! Any z value restates the original value X in terms of the number of units of the standard deviation by which the original value differs from the mean of the distribution. A negative value of z would indicate that the original value X was below the value of the mean.

Normal Approximation of Binomial Probabilities

When the number of observations or trials n is relatively large, the normal probability distribution can be used to approximate binomial probabilities. A convenient rule is that such approximation is acceptable when n ≥ 30, and both np ≥ 5 and nq ≥ 5.
This rule, combined with the one for the Poisson approximation of binomial probabilities, means that whenever n ≥ 30, binomial probabilities can be approximated by either the normal or the Poisson distribution, depending on the values of np and nq. Different texts use somewhat different rules for determining when such approximations are appropriate.

When the normal probability distribution is used as the basis for approximating a binomial probability value, the mean and standard deviation are based on the expected value and variance of the number of successes for the binomial distribution. The mean number of successes is: μ = np. The standard deviation of the number of successes is:

\[ \sigma = \sqrt{npq} \]

Normal Approximation of Poisson Probabilities

When the mean λ of a Poisson distribution is relatively large, the normal probability distribution can be used to approximate Poisson probabilities. A convenient rule is that such approximation is acceptable when λ ≥ 10.0.

The mean and standard deviation of the normal probability distribution are based on the expected value and the variance of the number of events in a Poisson process. This mean is: μ = λ. The standard deviation is:

\[ \sigma = \sqrt{\lambda} \]

The Exponential Probability Distribution

If events occur in the context of a Poisson process, then the length of time or space between successive events follows an exponential probability distribution. Because the time or space is a continuum, such a measurement is a continuous random variable.
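As an illustration of the normal approximation of binomial probabilities, the sketch below compares an exact binomial probability with its normal approximation. It is not part of the original text: the values n = 50 and p = 0.40 are hypothetical, the function names are illustrative, and the half-unit continuity correction is a common refinement assumed here rather than taken from the passage above.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(x, n, p):
    """Exact P(X <= x) for a binomial variable with n trials."""
    q = 1 - p
    return sum(comb(n, k) * p**k * q**(n - k) for k in range(x + 1))


def normal_approx_cdf(x, n, p):
    """Normal approximation of P(X <= x), using mu = np and sigma = sqrt(npq)."""
    q = 1 - p
    # Rule of thumb from the text: n >= 30 and both np >= 5 and nq >= 5.
    if n < 30 or n * p < 5 or n * q < 5:
        raise ValueError("normal approximation rule of thumb is not met")
    mu = n * p
    sigma = sqrt(n * p * q)
    # Assumed refinement: add 0.5 because a discrete count is being
    # approximated by a continuous distribution.
    return NormalDist(mu, sigma).cdf(x + 0.5)


n, p = 50, 0.40  # hypothetical values: np = 20 and nq = 30
exact = binomial_cdf(15, n, p)
approx = normal_approx_cdf(15, n, p)
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

For a Poisson variable the same approach would apply with mu = λ and sigma = sqrt(λ), subject to the λ ≥ 10.0 rule stated above.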
As is the case of [...]

Chapter 8
Sampling Distributions and Confidence Intervals for the Mean

In This Chapter:
✔ Point Estimation of a Population or Process Parameter
✔ The Concept of a Sampling Distribution
✔ Sampling Distribution of the Mean
✔ The Central Limit Theorem
✔ Determining Probability Values for the Sample Mean
✔ Confidence Intervals for the Mean Using the Normal Distribution
✔ Determining the Required Sample Size for Estimating the Mean
✔ The t Distribution and Confidence Intervals for the Mean
✔ Summary Table for Interval Estimation of the Population Mean
✔ Solved Problems

Point Estimation of a Population or Process Parameter

Because of factors such as time and cost, the parameters of a population or process frequently are estimated on the basis of sample statistics. A parameter is a summary value for a population or process, whereas a sample statistic is a summary value for a sample. In order to use a sample statistic as an estimator of a parameter, the sample must be a random sample from a population or a rational subgroup from a process.

A point estimator is the numeric value of a sample statistic that is used to estimate the value of a population or process parameter. One of the most important characteristics of an estimator is that it be unbiased. An unbiased estimator is a sample statistic whose expected value is equal to the parameter being estimated. An expected value is the long-run mean of the sample statistic. The elimination of any systematic bias is assured when the sample statistic is for a random sample taken from a population or a rational subgroup taken from a process.
Either sampling method assures that the sample is unbiased but does not eliminate sampling variability, or sampling error, as explained in the following section.

Table 8.1 presents some frequently used point estimators of population parameters. In every case, the appropriate estimator of a population parameter simply is the corresponding sample statistic. However, note that the formula in Chapter 4 for the sample variance includes a correction factor. Without this correction, the sample variance would be a biased estimator of the population variance.

The Concept of a Sampling Distribution

Your understanding of the concept of a sampling distribution is fundamental to your understanding of statistical inference. As we have already established, a population distribution is the distribution of all the individual measurements in a population, and a sample distribution is the distribution of the individual values included in a sample. In contrast to such distributions for individual measurements, a sampling distribution refers to the distribution of different values that a sample statistic, or estimator, would have over many samples of the same size. Thus, even though we typically would have just one random sample or rational subgroup, we recognize that the particular sample statistic that we determine, such as the sample mean or median, is not exactly equal to the respective population parameter. Further, a sample statistic will vary in value from sample to sample because of random sampling variability, or sampling error. This is the idea underlying the concept that any sample statistic is in fact a type of variable whose distribution of values is represented by a sampling distribution.

Sampling Distribution of the Mean

We now turn our attention specifically to the sampling distribution of the sample mean.
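The idea that a sample mean is itself a variable can be illustrated by simulation. The following sketch is not part of the original text: it assumes a hypothetical normally distributed population with mean 260 and standard deviation 45, draws many samples of size 36, and summarizes the resulting sample means. The spread of those means is compared with σ/√n, the standard error of the mean used later in this chapter.

```python
import random
from math import sqrt
from statistics import mean, stdev

# Hypothetical population parameters and simulation settings (assumptions).
MU, SIGMA = 260.0, 45.0
N = 36               # size of each sample
NUM_SAMPLES = 2000   # number of repeated samples

rng = random.Random(0)  # fixed seed so the run is repeatable


def sample_mean(n):
    """Draw one random sample of size n and return its mean."""
    return mean([rng.gauss(MU, SIGMA) for _ in range(n)])


# Each element is the mean of a different sample of the same size.
sample_means = [sample_mean(N) for _ in range(NUM_SAMPLES)]

print("mean of the sample means:", round(mean(sample_means), 2))
print("std. deviation of the sample means:", round(stdev(sample_means), 2))
print("sigma / sqrt(n):", SIGMA / sqrt(N))
```

The individual sample means differ from one another, yet their average stays close to the population mean, which is the behavior the sampling distribution describes.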
When the mean of just one sample is used in statistical inference [...]

Table 8.1 Frequently used point estimators

Determining Probability Values for the Sample Mean

If the sampling distribution of the mean is normally distributed, either because the population is normally distributed or because the central limit theorem is invoked, then we can determine probabilities regarding the possible values of the sample mean, given that the population mean and standard deviation are known. The process is analogous to determining probabilities for individual observations using the normal distribution. In the present application, however, it is the designated value of the sample mean that is converted into a value of z in order to use the table of normal probabilities. This conversion formula uses the standard error of the mean because this is the standard deviation for the variable X̄. Thus, the conversion formula is:

\[ z = \frac{\bar{X} - \mu}{\sigma_{\bar{x}}} \]

Example 8.1 An auditor takes a random sample of size n = 36 from a population of 1,000 accounts receivable. The mean value of the accounts receivable for the population is μ = $260.00, with the population standard deviation σ = $45.00. What is the probability that the sample mean will be less than $250.00?

Figure 8-1

Figure 8-1 portrays the probability curve. The sampling distribution is described by the mean and standard error:

μ = 260.00 (as given)

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{45}{\sqrt{36}} = \frac{45}{6} = 7.50 \]

\[ z = \frac{250.00 - 260.00}{7.50} = \frac{-10}{7.50} = -1.33 \]

Therefore,

\[ P(\bar{X} < 250.00 \mid \mu = 260.00, \sigma_{\bar{x}} = 7.50) = P(z < -1.33) \]
\[ P(z < -1.33) = 0.5000 - P(-1.33 \le z \le 0) = 0.5000 - 0.4082 = 0.0918 \]

Confidence Intervals for the Mean Using the Normal Distribution

Example 8.1 above is concerned with determining the probability that the sample mean will have various values given that the population mean and standard deviation are known. What is involved is deductive reasoning with respect to the sample result based on known population parameters. We now concern ourselves with inductive reasoning by using sample data to make statements about the value of the population mean.

The methods of interval estimation in this section are based on the assumption that the normal probability distribution can be used. Such use is warranted whenever n ≥ 30, because of the central limit theorem, or when n < 30 but the population is normally distributed and σ is known.

Although the sample mean is useful as an unbiased estimator of the population mean, there is no way of expressing the degree of accuracy of a point estimator. In fact, mathematically speaking, the probability that the sample mean is exactly correct as an estimator of the population mean is P = 0. A confidence interval for the mean is an estimate interval constructed with respect to the sample mean by which the likelihood that the interval includes the value of the population mean can be specified. The level of confidence associated with a confidence interval indicates the long-run percentage of such intervals that would include the parameter being estimated.

Confidence intervals for the mean typically are constructed with the unbiased estimator at the midpoint of the interval. When use of the normal probability distribution is warranted, the confidence interval for the mean is determined by:

\[ \bar{X} \pm z\sigma_{\bar{x}} \]

or, when the population σ is not known, by:

\[ \bar{X} \pm z s_{\bar{x}} \]

The most frequently used confidence intervals are the 90 percent, 95 percent, and 99 percent confidence intervals. The values of z required in conjunction with such intervals are given in Table 8.2.
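The computation in Example 8.1 and the confidence interval formula for a known σ can be sketched in Python as follows. This is an illustration rather than part of the original text: the function names are assumptions, and the sample mean of $250.00 used for the interval is hypothetical. Because the z value is not rounded to −1.33 before the probability is looked up, the result differs slightly from the tabled 0.0918.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def prob_sample_mean_below(x_bar, mu, sigma, n):
    """P(sample mean < x_bar) when the sampling distribution is normal."""
    z = (x_bar - mu) / standard_error(sigma, n)
    return NormalDist().cdf(z)


def confidence_interval(x_bar, std_error, level):
    """Two-sided interval x_bar +/- z * std_error for a confidence level in (0, 1)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return x_bar - z * std_error, x_bar + z * std_error


# Example 8.1: n = 36, mu = 260.00, sigma = 45.00.
se = standard_error(45.00, 36)
p_below = prob_sample_mean_below(250.00, 260.00, 45.00, 36)
print(f"standard error = {se:.2f}")
print(f"P(sample mean < 250.00) = {p_below:.4f}")

# Hypothetical use of the interval formula: a sample mean of 250.00
# with the same standard error, at the 95 percent level.
low, high = confidence_interval(250.00, se, 0.95)
print(f"95 percent interval: {low:.2f} to {high:.2f}")
```

Computing z from the confidence level avoids a table lookup; for the 95 percent level it yields a value close to the familiar 1.96.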
Determining the Required Sample Size for Estimating the Mean

Suppose that the desired size of a confidence interval and the level of confidence to be associated with it are specified. If σ is known or can be estimated, such as from the results of similar studies, the required sample size based on the use of the normal distribution is:

\[ n = \left( \frac{z\sigma}{E} \right)^2 \]

Table 8.2 Selected proportions of area under the normal curve

Solved Problems

Solved Problem 8.1 For a particular brand of TV picture tube, it is known that the mean operating life of the tubes is μ = 9,000 hr with a standard deviation of σ = 500 hr. (a) Determine the expected value and standard error of the sampling distribution of the mean given a sample size of n = 25. (b) Interpret the meaning of the computed values.

Solution:
(a)
\[ E(\bar{X}) = \mu = 9{,}000 \]
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{500}{\sqrt{25}} = 100 \]
(b) These calculations indicate that in the long run the mean of a large group of sample means, each based on a sample size of n = 25, will be equal to 9,000 hr. Further, the variability of these sample means with respect to the expected value of 9,000 hr is expressed by a standard deviation of 100 hr.

Solved Problem 8.2 Suppose that the standard deviation of the tube life for a particular brand of TV picture tube is known to be σ = 500, but that the mean operating life is unknown. Overall, the operating life of the tubes is assumed to be approximately normally distributed. For a sample of n = 15, the mean operating life is X̄ = 8,900 hr. Determine the 95 percent confidence interval for estimating the population mean.

Solution: The normal probability distribution can be used in this case because the population is normally distributed and σ is known.

\[ \bar{X} \pm z\sigma_{\bar{x}} = 8{,}900 \pm 1.96 \frac{\sigma}{\sqrt{n}} = 8{,}647 \text{ to } 9{,}153 \text{ hr} \]

Solved Problem 8.3 With respect to Solved Problem 8.2, suppose that the population can be assumed to be normally distributed, but that the population standard deviation is not known. Rather, the sample standard deviation s = 500 and X̄ = 8,900. Estimate the population mean using a 90 percent confidence interval.

Solution: Because n ≥ 30 (the computation below takes n = 35) the normal distribution can be used as an approximation of the t distribution. However, because the population is normally distributed, the central limit theorem need not be invoked. Therefore,

\[ \bar{X} \pm z s_{\bar{x}} = 8{,}900 \pm 1.645 \frac{500}{\sqrt{35}} = 8{,}761 \text{ to } 9{,}039 \text{ hr} \]

Solved Problem 8.4 A prospective purchaser wishes to estimate the mean dollar amount of sales per customer at a toy store located at an airlines terminal. Based on the data from other similar airports, the standard deviation of such sales amounts is estimated to be about σ = $3.20. What size of random sample should be collected, as a minimum, if the purchaser wants to estimate the mean sales amount within $1.00 and with 99 percent confidence?

Solution: n = (zσ/E)² = [(2.58)(3.20)/1.00]² = 68.16, so a sample of at least 69 customers is required.

Solved Problem 8.5 Referring to Solved Problem 8.4, what is the minimum required sample size if the distribution of sales amounts is not assumed to be normal and the purchaser wishes to estimate the mean sales amount within $2.00 with 99 percent confidence?

Solution: n = (zσ/E)² = [(2.58)(3.20)/2.00]² = 17.04

However, because the population is not assumed to be normally distributed, the minimum sample size is n = 30, so that the central limit theorem can be invoked as the basis for using the normal probability distribution for constructing the confidence interval.

Chapter 9
Other Confidence Intervals

In This Chapter:
✔ Confidence Intervals for the Difference between Two Means Using the Normal Distribution
✔ The t Distribution and Confidence Intervals for the Difference between Two Means
✔ Confidence Intervals for the Population Proportion
✔ Determining the Required Sample Size for Estimating the Proportion
✔ Confidence Intervals for the Difference between Two Proportions
✔ The Chi-Square Distribution and Confidence Intervals for the Variance and Standard Deviation

Where df = n₁ + n₂ − 2, the confidence interval is:

\[ (\bar{X}_1 - \bar{X}_2) \pm t_{df} \, \hat{\sigma}_{\bar{x}_1 - \bar{x}_2} \]

Note that in the two-sample case it is possible for each sample to be small, and yet the normal distribution could be used to approximate the t because df ≥ 29. However, in such use the two populations must be assumed to be approximately normally distributed, because the central limit theorem cannot be invoked with respect to a small sample.

Confidence Intervals for the Population Proportion

The probability distribution that is applicable to proportions is the binomial probability distribution. However, the mathematics associated with determining a confidence interval for an unknown population proportion on the basis of the Bernoulli process is complex. Therefore, all applications-oriented textbooks utilize the normal distribution as an approximation of the exact solution for confidence intervals for proportions. However, when the population proportion π (or p) is not known, most statisticians suggest that a sample of n ≥ 100 should be taken.

The variance of the distribution of proportions serves as the basis for the standard error. Given an observed sample proportion of p̂, the estimated standard error of the proportion is:

\[ s_{\hat{p}} = \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]

In the context of statistical estimation, the population π would not be known because that is the value being estimated.
If the population is finite, then use of the finite correction factor is appropriate. As was the case for the standard error of the mean, use of this correction is generally not considered necessary if n < 0.05N. The approximate confidence interval for a population proportion is \( \hat{p} \pm z s_{\hat{p}} \). In addition to the two-sided confidence interval, a one-sided confidence interval for the population proportion can also be determined.

Determining the Required Sample Size for Estimating the Proportion

Before a sample is actually collected, the minimum required sample size can be determined by specifying the level of confidence required, the sampling error that is acceptable, and by making an initial estimate of π, the unknown population proportion:

\[ n = \frac{z^2 \pi (1 - \pi)}{E^2} \]

Above, z is the value used for the specified confidence interval, π is the initial estimate of the population proportion, and E is the “plus and minus” sampling error allowed in the interval (always one-half the total confidence interval).

If an initial estimate of π is not possible, then it should be estimated as being 0.50. Such an estimate is conservative in that it is the value for which the largest sample size would be required. Under such an assumption, the general formula for sample size is simplified as follows: n = (z/2E)².

Confidence Intervals for the Difference between Two Proportions

In order to estimate the difference between the proportions in two populations, the unbiased point estimate of (π₁ − π₂) is (p̂₁ − p̂₂). The confidence interval involves use of the standard error of the difference between proportions. Use of the normal distribution is based on the same conditions as for the sampling distribution of the proportion, except that two samples are involved and the requirements apply to each of the two samples.
The confidence interval for estimating the difference between two population proportions is:

$$(\hat{p}_1 - \hat{p}_2) \pm z s_{\hat{p}_1 - \hat{p}_2}$$

The standard error of the difference between proportions is determined by the formula below, wherein the value of each respective standard error of the proportion is calculated as described before:

$$s_{\hat{p}_1 - \hat{p}_2} = \sqrt{s_{\hat{p}_1}^2 + s_{\hat{p}_2}^2}$$

The Chi-Square Distribution and Confidence Intervals for the Variance and Standard Deviation

Given a normally distributed population of values, the χ² (chi-square) distributions can be shown to be the appropriate probability distributions for the ratio (n − 1)s²/σ². There is a different chi-square distribution according to the value of n − 1, which represents the degrees of freedom (df). Thus,

$$\chi^2_{df} = \frac{(n - 1)s^2}{\sigma^2}$$

Because the sample variance is the unbiased estimator of the population variance, the long-run expected value of the above ratio is equal to the degrees of freedom, or n − 1. However, in any given sample the sample variance generally is not identical in value to the population variance. Since the ratio above is known to follow a chi-square distribution, this probability distribution can be used for statistical inference concerning an unknown variance or standard deviation.

Chi-square distributions are not symmetrical. Therefore, a two-sided confidence interval for a variance or standard deviation involves the use of two different chi-square values, rather than the plus and minus approach used with the confidence intervals based on the normal and t distributions. The formula for constructing a confidence interval for the population variance is:

$$\frac{(n - 1)s^2}{\chi^2_{df,\,upper}} \le \sigma^2 \le \frac{(n - 1)s^2}{\chi^2_{df,\,lower}}$$

Chapter 10
Testing Hypotheses Concerning the Value of the Population Mean

In This Chapter:
✔ Introduction
✔ Basic Steps in Hypothesis Testing by the Critical Value Approach
✔ Testing a Hypothesis Concerning the Mean by Use of the Normal Distribution
✔ Type I and Type II Errors in Hypothesis Testing
✔ Determining the Required Sample Size for Testing the Mean
✔ Testing a Hypothesis Concerning the Mean by Use of the t Distribution
✔ The P-Value Approach to Testing Hypotheses Concerning the Population Mean
✔ The Confidence Interval Approach to Testing Hypotheses Concerning the Mean
✔ Testing with Respect to the Process Mean in Statistical Process Control
✔ Summary Table for Testing a Hypothesized Value of the Mean
✔ Solved Problems

Introduction

The purpose of hypothesis testing is to determine whether a claimed (hypothesized) value for a population parameter, such as a population mean, should be accepted as being plausible based on sample evidence. Recall from Chapter 8 on sampling distributions that a sample mean generally will differ in value from the population mean. If the observed value of a sample statistic, such as the sample mean, is close to the claimed parameter value and differs only by an amount that would be expected because of random sampling, then the hypothesized value is not rejected. If the sample statistic differs from the claim by an amount that cannot be ascribed to chance, then the hypothesis is rejected as not being plausible.
Three different procedures have been developed for testing hypotheses, with all of them leading to the same decision when the same probability (and risk) standards are used. In this chapter we first describe the critical value approach to hypothesis testing. By this approach, the so-called critical values of the test statistic that would dictate rejection of a hypothesis are determined, and then the observed test statistic is compared to the critical values. This is the first approach that was developed, and thus much of the language of hypothesis testing stems from it. More recently, the P-value approach has become popular because it is the one most easily applied with computer software. This approach is based on determining the conditional probability that a value of the sample statistic as extreme as the one observed could occur by chance, given that a particular claim for the value of the associated population parameter is in fact true. Finally, the confidence interval approach is based on observing whether the claimed value of a population parameter is included within the range of values that define a confidence interval for that parameter.

No matter which approach to hypothesis testing is used, note that if a hypothesized value is not rejected, and therefore is accepted, this does not constitute a “proof” that the hypothesized value is correct. Acceptance of a claimed value for the parameter simply indicates that it is a plausible value, based on the observed value of the sample statistic.

Basic Steps in Hypothesis Testing by the Critical Value Approach

Step 1. Formulate the null hypothesis and the alternative hypothesis. The null hypothesis (H0) is the hypothesized parameter value that is compared with the sample result. It is rejected only if the sample result is unlikely to have occurred given the correctness of the hypothesis.
The alternative hypothesis (H1) is accepted only if the null hypothesis is rejected. The alternative hypothesis is also designated by (Ha) in many texts.

Step 2. Specify the level of significance to be used. The level of significance is the statistical standard that is specified for rejecting the null hypothesis. If a 5 percent level of significance is specified, then the null hypothesis is rejected only if the sample result is so different from the hypothesized value that a difference of that amount or larger would occur by chance with a probability of 0.05 or less.

For a two-sided test, the critical values of the sample mean, according to whether or not σ is known, are:

$$\bar{X}_{CR} = \mu_0 \pm z\sigma_{\bar{x}}$$

or

$$\bar{X}_{CR} = \mu_0 \pm z s_{\bar{x}}$$

Instead of establishing critical values in terms of the sample mean, the critical values in hypothesis testing typically are specified in terms of z values. For the 5 percent level of significance the critical values of z for a two-sided test are −1.96 and +1.96, for example. When the value of the sample mean is determined, it is converted to a z value so that it can be compared with the critical values of z. The conversion formula, according to whether or not σ is known, is:

$$z = \frac{\bar{X} - \mu_0}{\sigma_{\bar{x}}}$$

or

$$z = \frac{\bar{X} - \mu_0}{s_{\bar{x}}}$$

A one-sided test is appropriate when we are concerned about possible deviations in only one direction from the hypothesized value of the mean. There is only one region of rejection for a one-sided test. The region of rejection for a one-sided test is always in the tail that represents support of the alternative hypothesis. As is the case for a two-sided test, the critical value can be determined for the mean, as such, or in terms of a z value. However, critical values for one-sided tests differ from those for two-sided tests because the given proportion of area is all in one tail of the distribution. Table 10.2 presents the values of z needed for one-sided and two-sided tests. The general formula to establish the critical value of the sample mean for a one-sided test, according to whether or not σ is known, is the same as for the two-sided test.

Table 10.2 Critical values of z in hypothesis testing

Type I and Type II Errors in Hypothesis Testing

In this section Type I and Type II errors are considered entirely with respect to one-sided testing of a hypothesized mean. However, the basic concepts illustrated here apply to other hypothesis testing models as well.

The maximum probability of Type I error is designated by the Greek α (alpha). It is always equal to the level of significance used in testing the null hypothesis. This is so because by definition the proportion of area in the region of rejection is equal to the proportion of sample results that would occur in that region given that the null hypothesis is true.

The probability of Type II error is generally designated by the Greek β (beta). The only way it can be determined is with respect to a specific value included within the range of the alternative hypothesis.

With the level of significance and sample size held constant, the probability of Type II error decreases as the specific alternative value of the mean is set farther from the value in the null hypothesis. It increases as the alternative value is set closer to the value in the null hypothesis.

An operating characteristic (OC) curve portrays the probability of accepting the null hypothesis given various alternative values of the population mean. Figure 10-1 is the OC curve applicable to any lower-tail test for a hypothesized mean carried out at the 5 percent level of significance and based on the use of the normal probability distribution. Note that it is applicable to any such test, because the values on the horizontal axis are stated in units of the standard error of the mean. For any values to the left of μ0, the probability of acceptance indicates the probability of Type II error.
To the right of μ0, the probabilities indicate correct acceptance of the null hypothesis. As indicated by the dashed lines, when μ = μ0, the probability of accepting the null hypothesis is 1 − α.

In hypothesis testing, the concept of power refers to the probability of rejecting a null hypothesis that is false, given a specific alternative value of the parameter. Where the probability of Type II error is designated β, it follows that the power of the test is always 1 − β. Referring to Figure 10-1, note that the power for alternative values of the mean is the difference between the value indicated by the OC curve and 1.0, and thus a power curve can be obtained by subtraction, with reference to the OC curve.

Figure 10-1

Determining the Required Sample Size for Testing the Mean

Before a sample is actually collected, the required sample size can be determined by specifying:
1. The hypothesized value of the mean
2. A specific alternative value of the mean such that the difference from the null hypothesized value is considered important
3. The level of significance to be used in the test