































A thorough explanation of hypothesis testing in statistics, covering key concepts such as null and alternative hypotheses, Type I and Type II errors, P-values, and confidence intervals. It includes numerous questions and answers to reinforce understanding and facilitate learning. The guide is suitable for university students and those seeking a deeper understanding of statistical methods.
Typology: Exams
What is at the "heart" of hypothesis testing in statistics? Make an assumption about reality, and collect sample evidence to determine whether it contradicts the assumption. What is a hypothesis? A statement regarding a characteristic of one or more populations. Why do we test statements about a population parameter using sample data? Because it is usually impossible or impractical to gain access to the entire population. State the definition of hypothesis testing. A procedure based on sample evidence and probability, used to test statements regarding a characteristic of one or more populations. List the 3 steps in hypothesis testing.
When observed results are unlikely under the assumption that the null hypothesis is true, we say that the result is statistically significant and we reject the statement in the null hypothesis. A criterion for testing hypotheses is to determine how likely the observed sample proportion is under the assumption that the statement in the null hypothesis is true. Give the definition of a P-value. A P-value is the probability of observing a sample statistic as extreme as or more extreme than the one observed, under the assumption that the statement in the null hypothesis is true. Stated another way, the P-value is the likelihood or probability that a sample will result in a statistic such as the one obtained if the null hypothesis is true. Explain how to determine whether the null hypothesis should be rejected using the P-value approach. If the probability of getting a sample statistic as extreme as or more extreme than the one obtained is small under the assumption that the statement in the null hypothesis is true, reject the null hypothesis. What are the three conditions that must be satisfied before testing a hypothesis regarding a population proportion, p? (1) The sample is obtained by simple random sampling or the data result from a randomized experiment; (2) np0(1−p0) ≥ 10, where p0 is the proportion stated in the null hypothesis; and (3) the sampled values are independent of each other, which is taken to hold when the sample size is no more than 5% of the population size (n ≤ 0.05N). State the five steps for testing a hypothesis about a population proportion, p.
Step 1: Determine the null and alternative hypotheses. The hypotheses can be structured in one of three ways: two-tailed (H0: p=p0 versus H1: p≠p0), left-tailed (H0: p=p0 versus H1: p<p0), or right-tailed (H0: p=p0 versus H1: p>p0).
Step 2: Select a level of significance, α, depending on the seriousness of making a Type I error.
Step 3: Compute the test statistic and its P-value, either by hand or using technology.
Step 4: If P-value < α, reject the null hypothesis.
Step 5: State the conclusion.
Explain how to make a decision about the null hypothesis when performing a two-tailed test using confidence intervals.
When testing H0: p=p0 versus H1: p≠p0, if a (1−α)·100% confidence interval contains p0, we do not reject the null hypothesis. However, if the confidence interval does not contain p0, we conclude that p≠p0 at the level of significance α. For the sampling distribution of p̂ to be approximately normal, we require that np(1−p) be at least 10. If this requirement is not satisfied, we use the binomial probability formula to determine the P-value. When there are small sample sizes, the evidence against the statement in the null hypothesis must be __________. One should be wary of studies that _____________ the null hypothesis when the test was conducted with a small sample size. substantial; do not reject. State the definition of a point estimate. A point estimate is the value of a statistic that estimates the value of a parameter. Give the definition for a confidence interval for an unknown parameter. A confidence interval for an unknown parameter consists of an interval of numbers based on a point estimate. What does the level of confidence represent? The level of confidence represents the expected proportion of intervals that will contain the parameter if a large number of different samples is obtained. The level of confidence is denoted (1−α)·100%. What is the form of confidence interval estimates for a population parameter? point estimate ± margin of error. For a 95% confidence interval, any sample proportion that lies within 1.96 standard errors of the population proportion will result in a confidence interval that includes p. This will happen in 95% of all possible samples. Any sample proportion that is more than 1.96 standard errors from the population proportion ______ result in a confidence interval that does not contain p. This will happen in 5% of all possible samples (those sample proportions in the tails of the distribution). will
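The five steps for testing a hypothesis about a population proportion can be sketched in Python. This is only an illustration under stated assumptions: the function name `one_proportion_z_test`, the sample counts, and the choice α = 0.05 are all hypothetical, and the test statistic used is the standard one-proportion z statistic, z0 = (p̂ − p0) / √(p0(1−p0)/n), which the guide refers to but does not write out.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(x, n, p0, alternative="two-sided"):
    """Return (z0, p_value) for H0: p = p0 using the normal approximation.

    x is the number of successes in a sample of size n (hypothetical inputs).
    """
    # Condition from the guide: n * p0 * (1 - p0) must be at least 10.
    if n * p0 * (1 - p0) < 10:
        raise ValueError("n*p0*(1-p0) < 10: use the binomial formula instead")

    p_hat = x / n
    # Standard z statistic for a proportion (assumed, not quoted from the guide).
    z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "less":        # H1: p < p0
        p_value = std_normal.cdf(z0)
    elif alternative == "greater":   # H1: p > p0
        p_value = 1 - std_normal.cdf(z0)
    else:                            # H1: p != p0
        p_value = 2 * (1 - std_normal.cdf(abs(z0)))
    return z0, p_value


# Made-up data: 58 successes in 100 trials, testing H0: p = 0.5.
z0, p_value = one_proportion_z_test(x=58, n=100, p0=0.5)
alpha = 0.05
print(f"z0 = {z0:.3f}, P-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "do not reject H0")
```

With these made-up numbers the P-value is larger than α, so the null hypothesis is not rejected.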
List the critical value associated with the given level of confidence. A) 90%: 1.645 B) 95%: 1.96 C) 99%: 2.576 State the interpretation of a confidence interval. A (1−α)·100% confidence interval indicates that (1−α)·100% of all simple random samples of size n from the population whose parameter is unknown will result in an interval that contains the parameter.
Constructing a Confidence Interval for a Population Proportion using StatCrunch
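StatCrunch is the tool the guide names. As a rough equivalent, the same interval can be sketched with Python's standard library. The helper names `critical_value` and `proportion_confidence_interval` and the sample counts are hypothetical, and the margin of error uses the usual formula z·√(p̂(1−p̂)/n), which is an assumption because the guide only gives the general form "point estimate ± margin of error".

```python
from math import sqrt
from statistics import NormalDist


def critical_value(confidence):
    """Critical value z_(alpha/2) for a confidence level such as 0.95."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def proportion_confidence_interval(x, n, confidence=0.95):
    """Return (lower, upper) bounds: point estimate +/- margin of error."""
    p_hat = x / n
    # Condition from the guide: n * p_hat * (1 - p_hat) should be at least 10.
    if n * p_hat * (1 - p_hat) < 10:
        raise ValueError("normal approximation is not appropriate here")
    margin = critical_value(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {critical_value(level):.3f}")

# Made-up data: 58 successes in 100 trials.
low, high = proportion_confidence_interval(x=58, n=100, confidence=0.95)
print(f"95% CI for p: ({low:.4f}, {high:.4f})")
```

For a two-tailed test of H0: p = 0.5 at α = 0.05, the decision rule from the guide applies directly: the interval here contains 0.5, so the null hypothesis would not be rejected.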
Chose Student as his pseudonym. State six properties of the t-distribution.
Decrease the confidence level and increase the sample size. Explain why the t-distribution has less spread as the number of degrees of freedom increases. The t-distribution has less spread as the degrees of freedom increase because, as n increases, s becomes closer to σ by the law of large numbers. What type of data are needed to construct a confidence interval for a population proportion, p? Qualitative with 2 outcomes. Besides the fact that the sample must be obtained by simple random sampling or through a randomized experiment, list the two conditions that must be met when constructing a confidence interval for a population proportion, p. np̂(1−p̂) ≥ 10 and n ≤ 0.05N. What type of data are needed to construct a confidence interval for a population mean, μ? Quantitative. Besides the facts that the sample must be obtained by simple random sampling or through a randomized experiment and that the sample size must be small relative to the size of the population, what other condition must be satisfied? If n ≥ 30, the sample is large enough. If n < 30, we create a boxplot to check that the data have no outliers and come from an approximately normal population. Statistics are _________ variables because the value of a statistic varies from sample to sample. random. Remember, when we describe a distribution, we do so in terms of its ___________. shape, center, and spread. What is the sampling distribution of a statistic? The sampling distribution of a statistic is a probability distribution for all possible values of the statistic computed from a sample of size n. What is the sampling distribution of the sample mean? The sampling distribution of the sample mean x̄ is the probability distribution of all possible values of the random variable x̄ computed from a sample of size n from a population with mean μ and standard deviation σ. List the three steps for determining the sampling distribution of the sample mean.
Step 1: Obtain a simple random sample of size n. Step 2: Compute the sample mean. Step 3: Assuming that we are sampling from a finite population, repeat Steps 1 and 2 until all distinct simple random samples of size n have been obtained. Note: Once a particular sample is obtained, it cannot be obtained a second time. Describe the shape of the distribution of the sample mean as the sample size increases. As the sample size increases, the shape of the distribution becomes approximately normal. What does the mean of the distribution of the sample mean, x̄, equal? The mean of the distribution of the sample mean will equal the mean of the parent population. As the sample size n increases, what happens to the standard deviation of the distribution of the sample mean? The standard deviation decreases. The standard deviation of the distribution of the sample mean is less than the standard deviation of the population, and the larger the sample size, n, the smaller the standard deviation of the distribution of the sample mean. What is the standard error of the mean? The standard deviation of the sampling distribution of the mean. State the Central Limit Theorem. The shape of the distribution of the sample mean becomes approximately normal as the sample size n increases, regardless of the shape of the underlying population. How large does the sample size need to be before we can say that the sampling distribution of x̄ is approximately normal? The answer depends on the shape of the distribution of the underlying population. Distributions that are highly skewed will require a larger sample size for the distribution of x̄ to become approximately normal. State the rule of thumb for invoking the Central Limit Theorem. If the distribution of the population is unknown or not normal, then the distribution of the sample mean is approximately normal provided that the sample size is greater than or equal to 30. To cut the standard error of the mean in half, the sample size must be quadrupled, because the standard error is σ/√n.
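The behaviour of the sample mean described above can be checked with a small simulation. This is a sketch, not part of the guide: it assumes the standard formula σ/√n for the standard error, and the skewed population (exponential with mean 1 and standard deviation 1), the seed, and the function names are all hypothetical choices.

```python
import random
from math import sqrt
from statistics import mean, stdev


def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the sample mean."""
    return sigma / sqrt(n)


def simulate_sample_means(n, reps=2000, seed=1):
    """Draw `reps` samples of size n from a skewed population; return their means."""
    rng = random.Random(seed)
    # Exponential population with rate 1: mean 1, standard deviation 1.
    return [mean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]


# Multiplying the sample size by 4 halves the standard error.
print(standard_error(sigma=10, n=25))    # sigma / sqrt(25)
print(standard_error(sigma=10, n=100))   # sigma / sqrt(100)

# Simulated sample means for n = 30, the rule-of-thumb sample size.
means = simulate_sample_means(n=30)
print(f"mean of sample means: {mean(means):.3f} (population mean is 1)")
print(f"sd of sample means:   {stdev(means):.3f} "
      f"(theory: {standard_error(1, 30):.3f})")
```

The simulated means should centre on the population mean, with a spread close to σ/√n, even though the population itself is strongly skewed.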
What does it mean to say that a continuous random variable is normally distributed? A continuous random variable is normally distributed, or has a normal probability distribution, if its relative frequency histogram has the shape of a normal curve. What value of x is associated with the peak of a normal curve? The mean, μ. What values of x are associated with the inflection points of a normal curve? x=μ−σ and x=μ+σ.
99.7% of the area is between x=μ−3σ and x=μ+3σ. Suppose that a random variable X is normally distributed with mean μ and standard deviation σ. Give two representations for the area under the normal curve for any interval of values of the random variable X.
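Areas under a normal curve, such as the 99.7% figure above, can be computed directly. In this sketch the values μ = 100 and σ = 15 and the helper name `area_between` are hypothetical. The area can be read either as the proportion of the population falling in the interval or as the probability that a randomly selected individual falls in it.

```python
from statistics import NormalDist


def area_between(mu, sigma, a, b):
    """Area under the normal curve with mean mu and sd sigma between a and b."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)


mu, sigma = 100, 15  # hypothetical parameters
for k in (1, 2, 3):
    area = area_between(mu, sigma, mu - k * sigma, mu + k * sigma)
    print(f"within {k} standard deviation(s) of the mean: {area:.4f}")
```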
The probability distribution of a discrete random variable X provides the possible values of the random variable and their corresponding probabilities. A probability distribution can be in the form of a table, graph, or mathematical formula. What does the notation P(x) represent? The probability that the random variable X equals x In the graph of a discrete probability distribution, what do the horizontal axis and the vertical axis represent? In the graph of a discrete probability distribution, the horizontal axis is the value of the discrete random variable and the vertical axis is the corresponding probability of the discrete random variable. When graphing a discrete probability distribution, how do we emphasize that the data is discrete? When graphing a discrete probability distribution, we want to emphasize that the data are discrete. Therefore, draw the graph of discrete probability distributions using vertical lines above each value of the random variable to a height that is the probability of the random variable. State the formula for the mean of a discrete random variable. μx = Σ [x*P(x)] As the number of repetitions of the experiments increases, what does the mean value of the n trials approach? As the number of repetitions of the experiments increases, the mean value of the n trials will approach μx, the mean of the distribution of the random variable x. As the number of repetitions of the experiments increases, what happens to the difference between the mean outcome and the mean of the probability distribution? It gets closer to 0 as n increases In each simulation, what value is the graph (that shows the mean number of free throws made) drawn towards? In each simulation over time, the mean is pulled towards the theoretical mean of the random variable Because the mean of a random variable represents what we would expect to happen in the long run, it is also called the expected value, E(X).
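The formula μx = Σ [x·P(x)] and the long-run behaviour of the mean can be illustrated as follows. The probability table below is invented for the example, the function names are hypothetical, and the variance formula Σ (x − μx)²·P(x) is the standard definition rather than something quoted from the guide.

```python
import random
from math import sqrt

# Hypothetical probability distribution: value -> probability (sums to 1).
distribution = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}


def expected_value(dist):
    """Mean of a discrete random variable: sum of x * P(x)."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Variance: sum of (x - mean)^2 * P(x) (standard definition, assumed)."""
    mu = expected_value(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def simulated_mean(dist, trials, seed=0):
    """Mean outcome over `trials` simulated repetitions of the experiment."""
    rng = random.Random(seed)
    values = list(dist.keys())
    weights = list(dist.values())
    draws = rng.choices(values, weights=weights, k=trials)
    return sum(draws) / trials


mu = expected_value(distribution)
print(f"E(X) = {mu:.2f}, standard deviation = {sqrt(variance(distribution)):.2f}")
for trials in (10, 1000, 100000):
    print(f"{trials:>6} trials: mean outcome = {simulated_mean(distribution, trials):.3f}")
```

As the number of trials grows, the printed mean outcome should settle near E(X), which is the sense in which the mean is "what we would expect to happen in the long run".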
The interpretation of the expected value is the same as the interpretation of the mean of a discrete random variable. The ___________ of the discrete random variable, σ²X, is the value under the square root in the computation of the standard deviation. variance. What is a binomial probability distribution? The binomial probability distribution is a discrete probability distribution that describes probabilities for experiments in which there are two mutually exclusive (disjoint) outcomes. These two outcomes are generally referred to as success (such as making a free throw) and failure (such as missing a free throw). Experiments in which only two outcomes are possible are referred to as binomial experiments, provided that certain criteria are met. What are the four criteria for a binomial experiment?
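A binomial probability can be computed with the standard formula P(X = k) = C(n, k)·p^k·(1−p)^(n−k). The guide does not state this formula, so treat the sketch as an assumption-laden illustration: it presumes a fixed number n of independent trials, each with the same success probability p, and the free-throw numbers (10 shots, 80% success) are made up.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k): probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 10, 0.8  # hypothetical: 10 free throws, each made with probability 0.8
print(f"P(exactly 8 made) = {binomial_pmf(8, n, p):.4f}")

# The probabilities over all possible values 0..n add up to 1.
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))
print(f"sum of all probabilities = {total:.4f}")
```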
Probability is the measure of the likelihood of a random phenomenon or chance behavior occurring. It deals with experiments that yield random short-term results or outcomes yet reveal long-term predictability. The long-term proportion in which a certain outcome is observed is the probability of that outcome. State the Law of Large Numbers. As the number of repetitions of a probability experiment increases, the proportion with which a certain outcome is observed gets closer to the probability of the outcome. Explain the meaning of the sentence, "In a random process, the trials are memoryless." Trials do not recall what has happened in the past, so past outcomes do not change what will happen in the future. In probability, what is an experiment? In probability, an experiment is any process with uncertain results that can be repeated. The result of any single trial of the experiment is not known ahead of time. However, the results of the experiment over many trials produce regular patterns that allow accurate predictions. A(n) _______ is any collection of outcomes from a probability experiment. event. What is a probability model? A probability model lists the possible outcomes of a probability experiment and each outcome's probability. A probability model must satisfy Rules 1 and 2 of the rules of probabilities. What is an unusual event? What cutoff points do statisticians typically use for identifying unusual events? An unusual event is an event that has a low probability of occurring. Typically, an event with a probability less than 0.05 (or 5%) is considered unusual, but this cutoff point is not set in stone. Statisticians typically use cutoff points of 0.01, 0.05, and 0.10. List the three methods in this section for determining the probability of an event.
The _________ method gives an approximate probability of an event by conducting a probability experiment. The ____________ method of computing probabilities does not require that a probability experiment actually be performed, rather it relies on counting techniques. empirical; classical What requirement must be met in order to compute probabilities using the classical method? The classical method requires equally likely outcomes. An experiment has equally likely outcomes when each outcome has the same probability of occurring As the number of trials of an experiment increase, how does the empirical probability of an event occurring compare to the classical probability of that event occurring? The empirical probability will get closer to the classical probability as the number of trials of the experiment increases due to the Law of Large Numbers. If the two probabilities do not get closer, we may suspect that the dice are not fair. In ______________, each individual has the same chance of being selected. Therefore, we can use the classical method to compute the probability of obtaining a specific sample simple random sampling What is a subjective probability? Explain why subjective probabilities are used. A subjective probability is a probability that is determined based on personal judgement. Subjective probabilities are legitimate and are often the only method of assigning likelihood to an outcome. For instance, a financial reporter may ask an economist about the likelihood of the economy falling into recession next year. Again, we cannot conduct an experiment n times to find a relative frequency. The economist must use knowledge of the current conditions of the economy and make an educated guess about the likelihood of recession. Explain the Law of Large Numbers. How does this apply to gambling casinos? 
As the number of repetitions of a probability experiment increases, the proportion with which a certain outcome is observed gets closer to the probability of the outcome. This applies to casinos because they are able to make a profit in the long run because they have a small statistical advantage in each game. What does it mean for two events to be disjoint? Two events are disjoint if they have no outcomes in common. Another name for disjoint events is mutually exclusive events. If two events are mutually exclusive, it means that they cannot occur at the same time. In a Venn diagram, what does the rectangle represent? What does a circle represent?
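The Law of Large Numbers, and the relationship between the empirical and classical methods, can be seen by simulating rolls of a fair six-sided die. Everything here is a hypothetical illustration: the event ("roll a 5 or 6"), the seed, and the function names are not from the guide.

```python
import random
from fractions import Fraction


def classical_probability(event, sample_space):
    """Classical method: favourable outcomes / equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))


def empirical_probability(event, trials, seed=0):
    """Empirical method: relative frequency of the event in simulated rolls."""
    rng = random.Random(seed)
    hits = sum(rng.randint(1, 6) in event for _ in range(trials))
    return hits / trials


sample_space = {1, 2, 3, 4, 5, 6}
event = {5, 6}  # hypothetical event: roll a 5 or a 6

exact = classical_probability(event, sample_space)
print(f"classical probability: {exact} = {float(exact):.4f}")
for trials in (10, 1000, 100000):
    print(f"{trials:>6} rolls: empirical probability = "
          f"{empirical_probability(event, trials):.4f}")
```

With more rolls, the empirical probability should move closer to the classical value. A persistent gap would be a reason to suspect that the die is not fair.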