BMAL-590 Quantitative Research
1. What Is Statistics?: "Statistics is a way to get information from data." Statistics is a tool for creating new understanding from a set of numbers. You need data and information.
2. descriptive statistics: one of two branches of statistics, which focuses on methods of organizing, summarizing, and presenting data in a convenient and informative way. One form of descriptive statistics uses graphical techniques, which allow statistics practitioners to present data in ways that make it easy for the reader to extract useful information. Another form of descriptive statistics uses numerical techniques to summarize data. Rather than providing the raw data, the professor may only share summary data with the student.
3. Histogram: (or bar graph) can show if the data is evenly distributed across the range of values, if it falls symmetrically from a center peak (normal distribution), if there is a peak but more of the data falls on one side of the peak than the other (a skewed distribution), or if there are two or more peaks in the data (bi- or multi-modal).
4. average: mean
5. range: calculated by subtracting the smallest number from the largest.
6. mode: the most frequently occurring score(s) in a distribution
7. variance: the average squared deviation from the mean
8. Standard deviation: the square root of the variance; it gets the variability measure back to the same units as the data. Standard deviation has many useful properties when the data is normally distributed.
9. inferential statistics: a body of methods used to draw conclusions or inferences about characteristics of populations based on sample data. Exit polls are a very common application of statistical inference.
10. Statistical inference problems involve three key concepts: the population, the sample, and the statistical inference.
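The numerical measures in cards 4 to 8 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the `describe` helper and the `scores` data are hypothetical, and the population formulas (`pvariance`, `pstdev`) are used because card 7 defines variance as the average squared deviation.

```python
import statistics

def describe(data):
    """Return the descriptive measures defined in cards 4 to 8."""
    return {
        "mean": statistics.mean(data),            # the average
        "range": max(data) - min(data),           # largest minus smallest
        "modes": statistics.multimode(data),      # most frequent value(s)
        "variance": statistics.pvariance(data),   # average squared deviation from the mean
        "std_dev": statistics.pstdev(data),       # square root of the variance
    }

# Hypothetical scores, chosen only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(scores)
print(summary)
```

For a sample rather than a whole population, `statistics.variance` and `statistics.stdev` divide by n - 1 instead of n.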
11. Population: the group of all items of interest to a statistics practitioner. It is frequently very large and may, in fact, be infinitely large. In the language of statistics, population does not necessarily refer to a group of people. It may, for example, refer to the population of diameters of ball bearings produced at a large plant. A descriptive measure of a population is called a parameter. In most applications of inferential statistics, the parameter represents the information we need.
12. Sample: a set of data drawn from the population. A descriptive measure of a sample is called a statistic. We use statistics to make inferences about parameters.
13. statistical inference: the process of making an estimate, prediction, or decision about a population based on sample data. Because populations are almost always very large, investigating each member of the population would be impractical and expensive. It is far easier and cheaper to take a sample from the population of interest and draw conclusions or make estimates about the population on the basis of information provided by the sample. However, such conclusions and estimates are not always going to be correct. For this reason, we build into the statistical inference a measure of reliability. There are two such measures, the confidence level and the significance level. The confidence level is the proportion of times that an estimating procedure will be
29. personal interview: involves an interviewer soliciting information from a respondent by asking prepared questions. A personal interview has the advantage of having a higher expected response rate than other methods of data collection. In addition, there will probably be fewer incorrect responses resulting from respondents misunderstanding some questions, because the interviewer can clarify misunderstandings when asked. But the interviewer must also be careful not to say too much, for fear of biasing the response.
The main disadvantage of personal interviews is that they are expensive, especially when travel is involved.
30. telephone interview: usually less expensive, but it is also less personal and has a lower expected response rate. Unless the issue is of interest, many people will refuse to respond to telephone surveys. This problem is exacerbated by telemarketers trying to sell services or products.
31. self-administered questionnaire: usually mailed to a sample of people. This is an inexpensive method of conducting a survey and is, therefore, attractive when the number of people to be surveyed is large. But self-administered questionnaires usually have a low response rate and may have a relatively high number of incorrect responses due to respondents misunderstanding some questions.
32. Questionnaire Design:
1. Keep the questionnaire as short as possible. Most people are unwilling to spend much time filling out a questionnaire.
2. Ask short, simple, and clearly worded questions to enable respondents to answer quickly, correctly, and without ambiguity.
3. Start with demographic questions to help respondents get started and become comfortable quickly.
4. Use dichotomous (yes/no) and multiple-choice questions because of their simplicity.
5. Use open-ended questions cautiously because they are time consuming and more difficult to tabulate and analyze.
6. Avoid using leading questions that tend to lead the respondent to a particular answer.
7. Trial a questionnaire on a small number of people to uncover potential problems, such as ambiguous wording.
8. Think about the way you intend to use the collected data when preparing the questionnaire. First determine whether you are soliciting values for a quantitative variable or a categorical variable. Then consider which type of statistical techniques, descriptive or inferential, you intend to apply to the data to be collected, and note the requirements of the specific techniques to be used.
33. Sampling: statistical inference permits us to draw conclusions about a population based on a sample. The chief motives for examining a sample rather than a population are cost and practicality. Statistical inference permits us to draw conclusions about a population parameter based on a sample that is quite small in comparison to the size of the population. Another illustration of sampling can be taken from the field of quality management. To ensure that a production process is operating properly, the operations manager needs to know what proportion of items being produced is defective. If the quality technician must destroy the item to determine whether it is defective, then there is no alternative to sampling: a complete inspection of the product population would destroy the entire output of the production process, which is impractical. The sample statistic can come quite close to the parameter it is designed to estimate if the target population (the population about which we want to draw inferences) and the sampled population (the actual population from which the sample has been taken) are equal. But in practice, these populations may not be equal. In any case, the sampled population and the target population should be close to one another.
34. Self-selected samples: almost always biased, because the individuals who participate in them are more keenly interested in the issue than are the other members of the population. As a result, the conclusions drawn from such surveys are frequently wrong.
35. sampling plan: a method or procedure for specifying how a sample will be taken from a population. Three different sampling plans are simple random sampling, stratified random sampling, and cluster sampling.
36. simple random sample: a sample selected in such a way that every possible sample with the same number of observations is equally likely to be chosen.
One way to conduct a simple random sample is to assign a number to each element in the population, write these numbers on individual slips of paper, toss them into a hat, and draw the required number of slips (the sample size, n) from the hat. Sometimes the elements of the population are already numbered, such as by Social Security numbers, employee numbers, or driver's license numbers. After each element of the chosen population has been assigned a unique number, sample numbers can be selected at random.
population, resulting in biased results.
45. The response rate: the proportion of all people selected who complete the survey. It is a key survey parameter and helps in understanding the validity of the survey and the sources of non-response error. Non-response errors can occur for a number of reasons. An interviewer may be unable to contact a person listed in the sample, or the sampled person may refuse to respond for some reason. In either case, responses are not obtained from a sampled person and bias is introduced. The problem of non-response error is even greater when self-administered questionnaires are used rather than an interviewer, who can attempt to reduce the non-response rate by means of callbacks.
46. Selection bias: occurs when the sampling plan is such that some members of the target population cannot possibly be selected for inclusion in the sample.
47. Which of the following statements is true regarding the design of a good survey?: The questions should be kept as short as possible.
48. (part 2) Which of the following statements is true regarding the design of a good survey?: a. The questions should be kept as short as possible. b. A mixture of dichotomous, multiple-choice, and open-ended questions may be used. c. Leading questions must be avoided. d. All of these choices are true.
49. Which method of data collection is involved when a researcher counts and records the number of students wearing backpacks on campus on a given day?: Direct observation
50.
The manager of the customer service division of a major consumer electronics company is interested in determining whether the customers who have purchased a videocassette recorder over the past 12 months are satisfied with their products. If there are four different brands of videocassette recorders made by the company, the best sampling strategy would be to use a ____: stratified random sample
51. Which of the following types of samples is almost always biased?: Self-selected samples
52. A random experiment: an action or process that leads to one of several possible outcomes
53. exhaustive: all possible outcomes must be included. Additionally, the outcomes must be mutually exclusive, which means that no two outcomes can occur at the same time. A list of exhaustive and mutually exclusive outcomes is called a sample space and is denoted by $S$. The outcomes are denoted by $O_1, O_2, \ldots, O_k$. Using set notation, we represent the sample space and its outcomes as: $S = \{O_1, O_2, \ldots, O_k\}$
54. There are three approaches to assign probability to outcomes. Each of these approaches must follow the two rules governing probabilities: The probability of any outcome must lie between 0 and 1. That is: $0 \le P(O_i) \le 1$. The sum of the probabilities of all the outcomes in a sample space must be 1. That is: $\sum_{i=1}^{k} P(O_i) = 1$
55. classical approach: used by mathematicians to help determine probability associated with games of chance. If an experiment has n possible outcomes, this method would assign a probability of 1/n to each outcome.
56. When rolling two dice, what is the total number of possible outcomes?: there are 36 different outcomes but only 11 possible totals
57. relative frequency approach: an objective way of determining probabilities based on observing frequencies over a number of trials
58. subjective approach: defines probability as the degree of belief that we hold in the occurrence of an event.
Subjective probabilities can also be described as hunches or educated guesses.
59. "Probability of Precipitation" (P.O.P.): defined in different ways by different forecasters, but basically it's a subjective probability based on past observations
the multiplication of the probabilities of linked branches.
70. Bayes' Law: $P(A \mid B) = P(A \text{ and } B) / P(B)$. The probabilities $P(A)$ and $P(A^C)$ are called prior probabilities because they are determined prior to the decision about taking the preparatory course. The conditional probability $P(A \mid B)$ is called a posterior probability (or revised probability), because the prior probability is revised after the decision about taking the preparatory course.
71. Identifying the Correct Method: Although it is difficult to offer strict rules on which probability method to use, we can provide some general guidelines. The key issue is whether joint probabilities are provided or are required. Where the joint probabilities are given, we can compute marginal probabilities by adding across rows and down columns. We can use the joint and marginal probabilities to compute conditional probabilities, for which a formula is available. This allows us to determine whether the events described by the table are independent or dependent. We can also apply the addition rule to compute the probability that either of two events occurs. The first step in assigning probability is to create an exhaustive and mutually exclusive list of outcomes. The second step is to use the classical, relative frequency, or subjective approach and assign probability to the outcomes. There are a variety of methods available to compute the probability of other events. These methods include probability rules and trees. An important application of these rules is Bayes' Law, which allows us to compute conditional probabilities from other forms of probability.
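As a rough sketch of how Bayes' Law turns a prior into a posterior, the Python function below multiplies along the branches of a two-event probability tree and then divides the joint probability by the marginal. The function name `posterior` and the three input probabilities are hypothetical values chosen for illustration; they do not come from the course example about the preparatory course.

```python
def posterior(prior_a, p_b_given_a, p_b_given_not_a):
    """Return P(A | B) = P(A and B) / P(B) for a two-branch probability tree."""
    # Joint probabilities: multiply the probabilities of linked branches.
    joint_a_and_b = prior_a * p_b_given_a
    joint_not_a_and_b = (1 - prior_a) * p_b_given_not_a
    # Marginal probability of B: add the joint probabilities that contain B.
    p_b = joint_a_and_b + joint_not_a_and_b
    if p_b == 0:
        raise ValueError("P(B) is zero, so P(A | B) is undefined")
    return joint_a_and_b / p_b

# Hypothetical inputs: P(A) = 0.30, P(B | A) = 0.80, P(B | A^C) = 0.20.
revised = posterior(0.30, 0.80, 0.20)
print(round(revised, 4))
```

With these assumed inputs, observing B raises the probability of A from the prior 0.30 to a posterior of 0.24 / 0.38.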
72. Bayes' Law is used to compute: posterior probabilities
73. The classical approach describes a probability: in terms of the proportion of times that an event can be theoretically expected to occur
74. If a set of events includes all the possible outcomes of an experiment, these events are considered to be: exhaustive
75. Which of the following statements is not correct?: If event A does not occur, then its complement A' will also not occur.
76. Sampling Distribution of the Mean: Sampling distributions describe the distributions of sample statistics. There are two ways to create a sampling distribution. The first is to actually draw samples of the same size from a population, calculate the statistic of interest, and then use descriptive techniques to learn more about the sampling distribution. The second method relies on the rules of probability and the laws of expected value and variance to derive the sampling distribution.
77. Central Limit Theorem: The theory that, as sample size increases, the distribution of sample means of size n, randomly selected, approaches a normal distribution.
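The first way of building a sampling distribution (card 76) and the Central Limit Theorem (card 77) can be illustrated with a small simulation. This is a sketch under stated assumptions: the skewed exponential population (mean 1, standard deviation 1), the sample size of 40, the 5,000 repetitions, the seed, and the helper name `simulate_sample_means` are all choices made here for illustration.

```python
import math
import random
import statistics

def simulate_sample_means(draw, n, num_samples, rng):
    """Draw num_samples samples of size n and return the list of sample means."""
    means = []
    for _ in range(num_samples):
        sample = [draw(rng) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

rng = random.Random(42)                      # fixed seed so the run is repeatable
n = 40                                       # assumed sample size
means = simulate_sample_means(lambda r: r.expovariate(1.0), n, 5000, rng)

mean_of_means = statistics.fmean(means)      # should sit near the population mean, 1
sd_of_means = statistics.stdev(means)        # should sit near sigma / sqrt(n)
print(mean_of_means, sd_of_means, 1 / math.sqrt(n))
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the population itself is strongly skewed.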
78. standard error of the proportion: the standard deviation of sample proportions, which measures the average variation around the mean of the sample proportions
79. The concept that allows us to draw conclusions about the population based strictly on sample data without having any knowledge about the distribution of the underlying population is: the central limit theorem
80. Each of the following is a characteristic of the sampling distribution of the mean except: if the original population is not normally distributed, the sampling distribution of the mean will also be approximately normal for large sample sizes
81. Each of the following is a characteristic of the sampling distribution of the mean: the sampling distribution of the mean has the same mean as the original population; the standard deviation of the sampling distribution of the mean is referred to as the standard error; if the original population is not normally distributed, the sampling distribution of the mean will be approximately normal for large sample sizes
82. Suppose you are given 3 numbers that relate to the number of people in a university student sample. The three numbers are 10, 20, and 30. If the standard deviation is 10, the standard error equals: 5.77
83. You are tasked with finding the sample standard deviation. You are given 4 numbers. The numbers are 5, 10, 15, and 20. The sample standard deviation equals: 6.455
84. Two methods exist to create a sampling distribution. One involves using parallel samples from a population and the other is to use the: rules of probability
85. hypothesis testing: make and test an educated guess about a problem/solution
86. null hypothesis: a statement or idea that can be falsified, or proved wrong; represented by $H_0$ (pronounced H-nought)
87. alternative or research hypothesis: the opposite of the null hypothesis; consists of a statement about the expected relationship between the variables; denoted $H_1$
88. Type I Error: occurs when we reject a true null hypothesis.
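The arithmetic behind cards 82 and 83 can be checked with a short Python sketch. The helper name `standard_error` is hypothetical; it applies the usual formula s / sqrt(n), and `statistics.stdev` uses the n - 1 divisor appropriate for a sample.

```python
import math
import statistics

def standard_error(std_dev, n):
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return std_dev / math.sqrt(n)

# Card 82: three observations (10, 20, 30) with a standard deviation of 10.
se = standard_error(10, 3)

# Card 83: sample standard deviation of 5, 10, 15, 20 (divides by n - 1).
s = statistics.stdev([5, 10, 15, 20])

print(round(se, 2), round(s, 3))
```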
96. Types of Errors: A Type I error occurs when we reject a true null hypothesis (reject $H_0$ when it is TRUE). A Type II error occurs when we don't reject a false null hypothesis (do NOT reject $H_0$ when it is FALSE).
97. rejection region method: a range of values such that if the test statistic falls into that range, we decide to reject the null hypothesis in favor of the alternative hypothesis. It can be used in conjunction with the computer, but it is mandatory for those computing statistics manually.
98. the p-value approach: the researcher determines the exact probability of obtaining the observed sample difference, under the assumption that the null hypothesis is correct
99. several drawbacks to the rejection region method: Foremost among them is the type of information provided by the result of the test. The rejection region method produces a yes or no response to the question, "Is there sufficient statistical evidence to infer that the alternative hypothesis is true?" The implication is that the result of the test of hypothesis will be converted automatically into one of two possible courses of action: one action as a result of rejecting the null hypothesis in favor of the alternative and another as a result of not rejecting the null hypothesis in favor of the alternative. The rejection of the null hypothesis seems to imply that the new billing system will be installed. What is needed to take full advantage of the information available from the test result and make a better decision is a measure of the amount of statistical evidence supporting the alternative hypothesis, so that it can be weighed in relation to the other factors, especially the financial ones. The p-value of a test provides this measure.
100. We rejected the null hypothesis. Does this prove that the alternative hypothesis is true?: No. Our conclusion is based on sample data (and not on the entire population), so we can never prove anything by using statistical inference.
Consequently, we summarize the test by stating that there is enough statistical evidence to infer that the null hypothesis is false and that the alternative hypothesis is true.
101. if the value of the test statistic does not fall into the rejection region (or the p-value is large): rather than say we accept the null hypothesis (which implies that we're stating that the null hypothesis is true), we state that we do not reject the null hypothesis, and we conclude that not enough evidence exists to show that the alternative hypothesis is true. Although it may appear to be the case, we are not being overly technical.
102. One-Tail Test: Predicts that the results will fall in only one direction, either positive or negative
103. Two-Tail Test: used when we want to test a research hypothesis that a parameter is not equal ($\ne$) to some value
104. Effects on $\beta$ of Changing $\alpha$: Decreasing the significance level $\alpha$ increases the value of $\beta$, and vice versa. Shifting the critical value line to the right to decrease $\alpha$ will mean a larger area under the lower curve for $\beta$, and vice versa.
105. The hypothesis of most interest to the researcher is: the alternative hypothesis
106. A Type I error occurs when we: reject a true null hypothesis
107. Statisticians can translate p-values into several descriptive terms. Suppose you typically reject H0 at level 0.05. Which of the following statements is incorrect?: If the p-value < 0.01, there is overwhelming evidence to infer that the alternative hypothesis is false.
108. In a criminal trial where the null hypothesis states that the defendant is innocent, a Type I error is made when: an innocent person is found guilty
109. Population Mean: $\mu$
110. Population Proportion: P
111. An unbiased estimator is: a sample statistic, which has an expected value equal to the value of the population parameter
112. Thirty-six months were randomly sampled and the discount rate on new issues of 91-day Treasury Bills was collected.
The sample mean is 4.76% and the standard deviation is 171.21. What is the unbiased estimate for the mean of the population?: 4.76%
113. A 98% confidence interval estimate for a population mean is determined to be 75.38 to 86.52. If the confidence level is reduced to 90%, the confidence interval for the population mean: becomes narrower
114. Suppose the population of blue whales is 8,000. Researchers are able to garner a sample of oceanic movements from 100 blue whales from within this population. Thus: researchers can ignore the finite population correction factor
115. In the sample proportion, represented by p = x / n, the variable x refers to: the number of successes in the sample
116. analysis of variance: determines whether differences exist between population means. Ironically, the procedure works by analyzing the sample variance, hence the name. Analysis of variance is an extremely powerful and widely used procedure.
117. One-Way Analysis of Variance: analysis of variance in which there is only one grouping variable. Independent samples are drawn from k populations:
129. uniform distribution: Distribution where populations are spaced evenly
130. In Fisher's least significant difference (LSD) multiple comparison method, the LSD value will be the same for all pairs of means if: all sample sizes are the same
131. One-way ANOVA is applied to three independent samples having means 10, 13, and 18, respectively. If each observation in the third sample were increased by 30, the value of the F-statistic would: increase
132. Assume a null hypothesis is found true. By dividing the sum of squares of all observations, or SS(Total), by (n - 1), we can retrieve the: sample variance
133. Which of the following is true about one-way analysis of variance?: n1 = n2 = ... = nk is not required.
134. the technique for hypothesis testing: concludes with either rejecting or not rejecting some hypothesis concerning a dimension of a population.
In hypothesis testing, the decision is based on the statistical evidence available. Costs (and profits) are only indirectly considered (in the selection of a significance level or in interpreting the p-value) in the formulation of a hypothesis test.
135. In decision analysis: we deal with the problem of selecting one alternative from a list of several possible decisions. There may be no statistical data, or if there are data, the decision may depend only partly on them. Decision analysis directly involves profits and losses. Because of these major differences, the topics covered previously that are required for an understanding of decision analysis are probability (including Bayes' Law) and expected value.
136. payoff table: table showing the expected payoffs for each alternative in every possible state of nature
137. opportunity loss: The amount you would lose by not picking the best alternative. For any state of nature, this is the difference between the consequences of any alternative and the best possible alternative. Opportunity loss is calculated row-wise, by taking the combination of act and state of nature with the highest value and then subtracting each payoff in the row from this maximum value. If done correctly, we will be left with a zero (where the maximum payoff was located) and positive numbers for all other act / state of nature combinations.
138. A tabular presentation that shows the outcome for each decision alternative under the various states of nature is called a: payoff table
139. Which of the following statements is false regarding the expected monetary value (EMV)?: In general, the expected monetary values represent possible payoffs.
140. Which of the following statements is true regarding the expected monetary value (EMV)?: To calculate the EMV, the probabilities of the states of nature must be already decided upon. We choose the decision with the largest EMV.
To calculate the EMV, the probabilities of at least one of the states of nature must be already decided upon.
141. In the context of an investment decision, ____ is the difference between what the profit for an act is and the potential profit given an optimal decision.: an opportunity loss
142. The branches in a decision tree are equivalent to: events and acts
143. Which of the following is not necessary to compute posterior probabilities?: EMV
144. Which of the following is necessary to compute posterior probabilities?: the sum of all $P(s_j \text{ and } I_i)$'s; likelihood probabilities $P(I_i \mid s_j)$
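The payoff table, opportunity loss, and EMV cards can be tied together with a small Python sketch. Everything concrete here is an assumption made for illustration: the states `s1` and `s2`, the acts `a1` to `a3`, the payoffs, and the probabilities are hypothetical, and the table is laid out with one row per state of nature so that opportunity loss is computed row-wise.

```python
# Hypothetical payoff table: payoffs[state][act].
payoffs = {
    "s1": {"a1": 100, "a2": 60, "a3": 20},
    "s2": {"a1": -40, "a2": 30, "a3": 20},
}
# Hypothetical probabilities of the states of nature (must sum to 1).
probs = {"s1": 0.4, "s2": 0.6}

def opportunity_loss_table(payoffs):
    """Row-wise: best payoff in the state's row minus each payoff in that row."""
    table = {}
    for state, row in payoffs.items():
        best = max(row.values())
        table[state] = {act: best - value for act, value in row.items()}
    return table

def expected_values(table, probs):
    """Probability-weighted average of each act's column."""
    acts = next(iter(table.values())).keys()
    return {act: sum(probs[s] * table[s][act] for s in table) for act in acts}

losses = opportunity_loss_table(payoffs)
emv = expected_values(payoffs, probs)   # expected monetary value per act
eol = expected_values(losses, probs)    # expected opportunity loss per act

best_by_emv = max(emv, key=emv.get)     # choose the largest EMV
best_by_eol = min(eol, key=eol.get)     # equivalently, the smallest EOL
print(emv, eol, best_by_emv, best_by_eol)
```

In this assumed table both criteria select the same act, which is expected because the largest EMV and the smallest expected opportunity loss always point to the same decision.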