Hypothesis Testing and Confidence Intervals for Population Proportions and Means, Exams of Advanced Education

Key concepts and methods related to hypothesis testing and confidence interval construction for population proportions and means. It discusses the properties, assumptions, and interpretation of confidence intervals, as well as the test statistics and decision-making process for hypothesis testing. The document also addresses the comparison of two population means, including the appropriate test statistic and p-value calculations. Overall, this resource provides a comprehensive overview of statistical inference techniques for population parameters, which are fundamental to many fields of study and research.

Typology: Exams

2024/2025

Available from 10/11/2024

solution-master

STAT 203 Final questions and answers

What is SE? - The standard error of p̂: SE(p̂) = sqrt(p̂(1 - p̂)/n)

What is a confidence interval? - A range of plausible values for a population parameter, computed from a sample so that a specified percentage of the intervals from repeated samples would be expected to contain the parameter.

What property does the population proportion p have? - p is fixed and does not vary. It is the confidence intervals that vary from sample to sample because of sampling variability.

What is z? - The critical value: the z-score such that the upper-tail area under the standard Normal curve equals (100% - confidence level)/2.

What is a confidence level? - The percentage of confidence intervals, over repeated samples, that enclose the population proportion p. Common choices are 90%, 95%, and 99%.

The higher the confidence level_______ - the larger the value of z.

What is ME? - The margin of error, which balances certainty against precision: ME = z × SE(p̂)

What happens if we allow a larger margin of error? - The confidence interval will be wider and hence will have a greater chance of including the true value of p. However, with higher confidence in capturing p, the interval becomes less precise (wider).

What are the assumptions and conditions for constructing a confidence interval? -
  • the sample is randomly drawn from the population
  • the sample values are independent (the sample size is a small fraction of the population size)
  • the sample size is sufficiently large (p is unknown, so we check the conditions n p̂ ≥ 10 and n(1 - p̂) ≥ 10)

What are the properties of a confidence interval? -
  • it is centered at the sample proportion p̂
  • the width of a CI increases with the confidence level, and hence precision is lower
  • the width of a CI decreases with the sample size, and hence precision is greater (when the sample size is doubled, the standard error decreases by a factor of sqrt(2))

How can we interpret a confidence interval for p? -
  • Over the collection of all confidence intervals of a fixed confidence level C that could be constructed from repeated random samples of size n obtained from the population, the percentage of confidence intervals that contain the true proportion p equals C.
  • We are C% confident that the true proportion falls within the confidence interval that we have obtained.

What are CIs constructed for? - For parameters, not statistics (p, not p̂).

Why do we need to determine an appropriate sample size n? - When we intend to estimate a population proportion, we need to achieve a certain precision in the estimate. We specify the maximum tolerable margin of error and the confidence level, and from these find the required sample size.

When calculating sample size, what do we use for p̂? - If we have an estimate of the population proportion from previous studies, we can use that value in the sample size calculation. If there are no prior studies, we use p̂ = 0.5 to compute the required sample size.

What is a hypothesis? - A statement or a claim about a parameter (a numerical characteristic of a population).
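The interval and sample-size answers above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`z_critical`, `proportion_ci`, `required_sample_size`) and the example numbers are assumptions, not part of the original notes, and the sample-size formula n = z² p̂(1 - p̂)/ME² is obtained by solving ME = z × SE(p̂) for n.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Critical value z whose upper-tail area is (1 - confidence) / 2."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def proportion_ci(successes, n, confidence=0.95):
    """One-proportion z-interval: p_hat +/- z * SE(p_hat)."""
    p_hat = successes / n
    # Sample-size condition from the notes: n*p_hat >= 10 and n*(1 - p_hat) >= 10
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("sample too small for the Normal approximation")
    se = sqrt(p_hat * (1 - p_hat) / n)   # standard error of p_hat
    me = z_critical(confidence) * se     # margin of error
    return p_hat - me, p_hat + me


def required_sample_size(max_me, confidence=0.95, p_guess=0.5):
    """Smallest n whose margin of error is at most max_me.

    p_guess defaults to 0.5, the value the notes use when no prior
    estimate of the proportion is available.
    """
    z = z_critical(confidence)
    return ceil(z ** 2 * p_guess * (1 - p_guess) / max_me ** 2)


# Hypothetical data: 120 successes in a random sample of 400
low, high = proportion_ci(120, 400, confidence=0.95)
n_needed = required_sample_size(0.03, confidence=0.95)
```

With these made-up numbers the interval is roughly (0.255, 0.345). Using p_guess = 0.5 gives the largest possible value of p̂(1 - p̂), so the resulting sample size is conservative.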

What is a null hypothesis? - A statement about the value of a population parameter whose general form is H0: population parameter = some specific value. For a population proportion p, H0: p = p0, where p0 is some fixed value of p.

What does the null hypothesis say? - It states that an observed difference is due to chance variation, not a significant difference.

What is an alternative hypothesis? - A statement that opposes the null hypothesis and states that an observed difference is real. It has 3 possible forms:
  • p ≠ specific value (two-sided alternative, two-tailed test)
  • p > specific value (one-sided alternative, right-tailed test)
  • p < specific value (one-sided alternative, left-tailed test)

What is the test for a population proportion called? - The one-proportion z-test.

What do we assume when testing a hypothesis? - We assume that the null hypothesis is true, and we evaluate whether the evidence presented by the data is compatible or incompatible with the null hypothesis.

What does the sample proportion p̂ follow? - N(p0, sqrt(p0(1 - p0)/n)), the null model for p̂.

What do we have to check in order for the Normal approximation to be valid for the sample proportion p̂ in the null model? - Check that the sample size is sufficiently large: n p0 ≥ 10 and n(1 - p0) ≥ 10.
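As a small illustration of the null model and its sample-size check, the sketch below returns the mean and standard deviation of the null model for p̂ and refuses to proceed when the conditions fail. The function name `null_model` and the example values are hypothetical.

```python
from math import sqrt


def null_model(p0, n):
    """Return (mean, sd) of the Normal null model for p_hat under H0: p = p0."""
    # Conditions from the notes: n*p0 >= 10 and n*(1 - p0) >= 10
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("Normal approximation not justified for this n and p0")
    return p0, sqrt(p0 * (1 - p0) / n)


# Hypothetical example: H0: p = 0.5 with n = 100
mean, sd = null_model(0.5, 100)
```

Note that the check uses the hypothesized value p0 rather than p̂, because the null model is built under the assumption that H0 is true.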

What is a test statistic? - The z-score of the observed sample proportion under the null model: z = (p̂ - p0)/sqrt(p0(1 - p0)/n)

If the test statistic is extremely positive or negative______ - the observed difference is large and is unusual under the null model. This suggests that the data are incompatible with the null hypothesis.

How do we evaluate how unusual an observed difference is when the null model is true? - We compute the P-value. This is defined as the probability of getting a value for the test statistic (or sample proportion) that is as extreme as or more extreme than the observed test statistic, assuming that H0 is true.

The smaller the P-value ... - the stronger the evidence against the null hypothesis.

What does a small P-value suggest? - That what we observe is unlikely to be due to chance variation if H0 is true. In other words, the data are incompatible with H0.

What kind of probability is the P-value? - A conditional probability (the condition is that the null hypothesis is true).

How do we make a decision about whether to reject the null hypothesis? -
  • If the P-value is smaller than alpha, we reject the null hypothesis and the test is considered significant at the given alpha level. We conclude that the population parameter is significantly different/larger/smaller than the value specified under H0.
  • If the P-value is greater than or equal to alpha, we do not reject the null hypothesis and the test is considered not significant at the given alpha level. We conclude that the population parameter is not significantly different/larger/smaller than the value specified under H0.

What is an alpha level/significance level? - Denoted by alpha, it sets the criterion for the decision to reject the null hypothesis. Common alpha values = 0.01, 0.05, 0.
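The test statistic, P-value, and decision rule described above can be put together as a one-proportion z-test. The following is a minimal sketch under the assumptions of the notes (random sample, large enough n); the function name `one_proportion_z_test`, the `alternative` labels, and the data are illustrative choices, not taken from the original.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0, alternative="two-sided", alpha=0.05):
    """Return (z, p_value, reject_h0) for H0: p = p0."""
    p_hat = successes / n
    # z-score of the observed sample proportion under the null model
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if alternative == "greater":        # HA: p > p0, right-tailed
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":         # HA: p < p0, left-tailed
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":    # HA: p != p0, two-tailed
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    # Reject H0 only when the P-value is smaller than alpha
    return z, p_value, p_value < alpha


# Hypothetical data: 60 successes in 100 trials, testing H0: p = 0.5
z, p_value, reject = one_proportion_z_test(60, 100, 0.5)
```

For these made-up numbers z is about 2.0 and the two-sided P-value is about 0.046, so H0 would be rejected at alpha = 0.05 but not at alpha = 0.01.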

What kind of errors can happen in hypothesis testing? - Type I and Type II errors.

What is the Type I error? - The mistake of rejecting H0 when H0 is true.

What is the Type II error? - The mistake of failing to reject H0 when H0 is false.

How can we measure the variability of a statistic? - We can use its standard deviation, since the value of a statistic varies from sample to sample.

What is s? - The sample standard deviation.

How do we find the standard error of the sample mean ȳ? - Most of the time the population SD is unknown, so we estimate it with the standard deviation s of a random sample and use the standard error SE(ȳ) = s/sqrt(n).

How do we get the CI for the mean μ? - The CI is based on the t-model instead of the standard Normal model: ȳ ± t(n-1) × s/sqrt(n), where the critical value t(n-1) is obtained from the t-table with n - 1 degrees of freedom.

What is a confidence interval for the mean μ called? - The one-sample t-interval.

What are the properties of the t-model? -
  • perfectly symmetric about mean = 0
  • unimodal and bell-shaped
  • has one model parameter, the degrees of freedom (df), which determines the shape of the curve
  • has thicker tails when the sample size is smaller
  • approaches the Normal model for larger and larger samples
  • the smaller the sample size, the more spread out it is

What to do if the df is not listed in the table? - Use the closest df for the calculation.

Assumptions and conditions for constructing a confidence interval under the t-model -
  • the sample is randomly drawn
  • the sampled values are independent (the sample size is a small fraction of the population)
  • the 'nearly' Normal condition: when the underlying distribution is exactly or nearly Normal, using the t-model is justified even with a small sample size; when the underlying distribution is non-Normal or unknown, we need a larger sample to use the t-model

How do we interpret the CI for the mean? -
  • Over the collection of all CIs for the mean of a fixed confidence level C that could be constructed from repeated random samples of size n obtained from the population, the percentage of confidence intervals that contain the true mean equals C.
  • Suppose we get a 99% confidence interval of (5.88, 9.42) for a population mean. We are then 99% confident that the mean is between 5.88 and 9.42.

What is the test statistic using the t-model for the mean? - t = (ȳ - μ0)/(s/sqrt(n)), which is compared with the t-model with n - 1 degrees of freedom. This test is called the one-sample t-test.

What is the test statistic definition in terms of μ? - The standardized score of the observed sample mean under the assumption that the null hypothesis is true (i.e., μ = μ0).

When might we need to compare the means of two populations? - When comparing mean IQ scores, comparing the mean reduction in blood pressure, etc.
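The one-sample t-interval and t-statistic above can be sketched as follows. Because the notes read the critical value from a t-table, the sketch takes it as an argument (`t_star`) instead of computing it. The function names and the data are hypothetical; 2.776 is the usual table value for 95% confidence with 4 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(sample, t_star):
    """One-sample t-interval: y_bar +/- t_star * s / sqrt(n).

    t_star is the critical value looked up in a t-table for df = n - 1.
    """
    n = len(sample)
    y_bar = mean(sample)
    se = stdev(sample) / sqrt(n)   # stdev() is the sample SD (divides by n - 1)
    me = t_star * se
    return y_bar - me, y_bar + me


def t_statistic(sample, mu0):
    """Return (t, df) for the one-sample t-test of H0: mu = mu0."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)
    return (mean(sample) - mu0) / se, n - 1


# Hypothetical sample of size 5, so df = 4
data = [4, 6, 8, 10, 12]
low, high = t_interval(data, t_star=2.776)
t, df = t_statistic(data, mu0=5)
```

Here the sample mean is 8 and SE(ȳ) is sqrt(2), so the interval is roughly (4.07, 11.93) and the test statistic for μ0 = 5 is about 2.12 on 4 degrees of freedom.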

How do we compare the two population means? - We consider the difference between the two means μ1 and μ2, that is, μ1 - μ2. To estimate this value we use the sample means ȳ1 and ȳ2, i.e., ȳ1 - ȳ2. The difference between the two sample means varies because of sampling variability.

When calculating the df for the comparison of two means, which df do you use? - We use the smaller of the two, i.e., the smaller of n1 - 1 and n2 - 1.

Assumptions and conditions for two-sample inference and validity using the t-model -
  • the two samples are random
  • the two sample sizes are no greater than 10% of their respective populations
  • if both y1 and y2 follow the Normal model, there is no restriction on sample size; if y1 and y2 are non-Normal, the sample sizes must be sufficiently large for the CLT to justify the use of the t-model
  • the two samples must be independent of each other (i.e., no influence or association between the observations in the two samples)

What are the possible alternative hypotheses for comparing two population means? -
  • HA: μ1 ≠ μ2
  • HA: μ1 > μ2
  • HA: μ1 < μ2

What is the test statistic when comparing two population means? - t0 = (ȳ1 - ȳ2)/SE(ȳ1 - ȳ2)

What is the P-value when comparing two means? -
  • area to the right of t0 under the t-curve (for HA: μ1 > μ2)
  • area to the left of t0 under the t-curve (for HA: μ1 < μ2)
  • two-tailed area: double the area to the right of |t0| under the t-curve (for HA: μ1 ≠ μ2)

When do we reject H0 when comparing two means? - when the P-value is less than alpha.
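The two-sample procedure can be sketched end to end in plain Python. Several pieces here are assumptions rather than content from the notes: the notes do not spell out SE(ȳ1 - ȳ2), so the sketch uses the common unpooled form sqrt(s1²/n1 + s2²/n2); the tail area is approximated by numerically integrating the t density with Simpson's rule instead of reading a t-table; and all function names and data are hypothetical.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of the t-model with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Approximate area to the right of t (Simpson's rule on [0, |t|])."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    right_of_abs = 0.5 - total * h / 3  # the curve is symmetric about 0
    return right_of_abs if t >= 0 else 1 - right_of_abs


def two_sample_t_test(mean1, s1, n1, mean2, s2, n2,
                      alternative="two-sided", alpha=0.05):
    """Return (t0, df, p_value, reject_h0) for H0: mu1 = mu2."""
    se = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)  # assumed unpooled standard error
    t0 = (mean1 - mean2) / se
    df = min(n1, n2) - 1                    # smaller of the two df, as in the notes
    if alternative == "greater":            # HA: mu1 > mu2
        p_value = t_upper_tail(t0, df)
    elif alternative == "less":             # HA: mu1 < mu2
        p_value = 1 - t_upper_tail(t0, df)
    elif alternative == "two-sided":        # HA: mu1 != mu2
        p_value = 2 * t_upper_tail(abs(t0), df)
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return t0, df, p_value, p_value < alpha


# Hypothetical summary statistics for two independent samples of size 16
t0, df, p_value, reject = two_sample_t_test(10, 2, 16, 8, 2, 16)
```

Taking the smaller df is a conservative choice: it gives thicker tails and therefore a slightly larger P-value than a more refined df formula would. With the made-up numbers above, t0 is about 2.83 on 15 degrees of freedom and the two-sided P-value falls between 0.01 and 0.02.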