MATH 201 PROBABILITY & STATISTICS COMPLETED EXAM 2024
Typology: Exams
Therefore, the probability of getting exactly 6 heads in 10 tosses of a fair coin is about 0.205. Rationale: The binomial distribution models the number of successes in a fixed number of independent trials with a constant probability of success in each trial. The formula gives the probability of getting exactly k successes out of n trials.
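The binomial result above can be checked numerically. The following is a minimal Python sketch using only the standard library; the function name `binomial_pmf` is an illustrative choice and does not come from the exam.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p): (n choose k) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Exactly 6 heads in 10 tosses of a fair coin: 210 / 1024.
prob_six_heads = binomial_pmf(6, 10, 0.5)
print(round(prob_six_heads, 3))  # 0.205
```

The same function applies to any fixed number of independent trials with a constant success probability, by changing `k`, `n`, and `p`.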
Therefore, the mean, median, mode, and standard deviation of the ratings are 3.25, 3, 3, and 1. respectively. Rationale: The mean, median, mode, and standard deviation are measures of central tendency and dispersion that describe the distribution of a set of data. The mean is the average of all the values, the median is the middle value when the data are sorted, the mode is the most frequent value or values, and the standard deviation measures how much the values vary around the mean.

A fair die is rolled 10 times. What is the probability that the number 6 appears exactly twice? Use the binomial distribution to calculate your answer. Answer: The probability is given by the formula P(X = k) = (n choose k) * p^k * (1-p)^(n-k), where n is the number of trials, k is the number of successes, and p is the probability of success in each trial. In this case, n = 10, k = 2, and p = 1/6. Therefore, P(X = 2) = (10 choose 2) * (1/6)^2 * (5/6)^8 = 45 * (1/36) * 0.2326 ≈ 0.2907. Rationale: The binomial distribution models the number of successes in a fixed number of independent trials with a constant probability of success. Rolling a die repeatedly and counting sixes is an example of such a scenario.

A random variable X follows a normal distribution with mean 50 and standard deviation 10. What is the probability that X is between 40 and 60? Use the standard normal table to find your answer. Answer: The probability is given by the formula P(a < X < b) = P((X - mu)/sigma < (b - mu)/sigma) - P((X - mu)/sigma < (a - mu)/sigma), where mu is the mean and sigma is the standard deviation of X. In this case, mu = 50 and sigma = 10, so the bounds standardize to (40 - 50)/10 = -1 and (60 - 50)/10 = 1, and we need to find P(-1 < Z < 1), where Z is a standard normal variable. Using the standard normal table, we get P(-1 < Z < 1) = 0.8413 - 0.1587 = 0.6826. Rationale: The normal distribution is a symmetric bell-shaped curve that describes many natural phenomena.
The standard normal distribution has mean 0 and standard deviation 1, and any normal variable can be transformed into a standard normal variable by subtracting the mean and dividing by the standard deviation. The standard normal table gives the cumulative probability P(Z < z) for each tabulated value z.

A random variable Y follows an exponential distribution with rate parameter lambda = 0.2. What are the expected value and variance of Y? Use the properties of the exponential distribution to find your answer. Answer: The expected value of Y is given by E(Y) = 1/lambda, where lambda is the rate parameter of the exponential distribution. In this case, lambda = 0.2, so E(Y) = 1/0.2 = 5. The variance of Y is given by Var(Y) = 1/lambda^2, so Var(Y) = 1/0.2^2 = 25. Rationale: The exponential distribution models the time between events in a Poisson process, such as radioactive decay or customer arrivals. The rate parameter lambda indicates how often the events occur on average. The expected value is inversely proportional to the rate parameter, and the variance is inversely proportional to its square.

A researcher wants to test whether there is a relationship between gender and political affiliation in a sample of 100 voters. The observed frequencies are shown in the table below. Use a chi-square test to test the null hypothesis that gender and political affiliation are independent at a 0.05 significance level. Report the test statistic, the degrees of freedom, the p-value, and the conclusion.
Gender | Democrat | Republican | Independent | Total |
---|---|---|---|---|
Male | 18 | 32 | 10 | 60 |
Female | 22 | 8 | 10 | 40 |
Total | 40 | 40 | 20 | 100 |
Answer: The expected frequencies are calculated by multiplying the row total and the column total and dividing by the grand total. For example, the expected frequency for male Democrats is (60 x 40) / 100 = 24. The full set of expected frequencies is 24, 24, and 12 for males and 16, 16, and 8 for females. The chi-square test statistic is the sum of (observed - expected)^2 / expected over all cells: X^2 = (18 - 24)^2 / 24 + (32 - 24)^2 / 24 + (10 - 12)^2 / 12 + (22 - 16)^2 / 16 + (8 - 16)^2 / 16 + (10 - 8)^2 / 8 = 1.5 + 2.667 + 0.333 + 2.25 + 4 + 0.5 = 11.25. The degrees of freedom are calculated by (number of rows - 1) x (number of columns - 1). In this case, df = (2 - 1) x (3 - 1) = 2. The p-value is the probability of obtaining a chi-square value equal to or greater than the test statistic under the null hypothesis. Using a chi-square table or a calculator, the p-value is approximately 0.0036. Since the p-value is less than the significance level of 0.05, we reject the null hypothesis and conclude that there is evidence of a relationship between gender and political affiliation.

A coin is tossed 100 times and the numbers of heads and tails are recorded. The observed frequencies are shown in the table below. Use a chi-square test to test the null hypothesis that the coin is fair at a 0. significance level. Report the test statistic, the degrees of freedom, the p-value, and the conclusion.
Outcome | Heads | Tails | Total |
---|---|---|---|
Observed | 45 | 55 | 100 |
Expected | 50 | 50 | 100 |
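Answer: The chi-square test statistic is the sum of (observed - expected)^2 / expected over all cells: X^2 = (45 - 50)^2 / 50 + (55 - 50)^2 / 50 = 25/50 + 25/50 = 1. With two categories, the degrees of freedom are 2 - 1 = 1, and the p-value for X^2 = 1 with 1 degree of freedom is approximately 0.317. This is larger than any conventional significance level, so we fail to reject the null hypothesis that the coin is fair.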
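Both chi-square calculations above (the coin and the gender by political affiliation table) can be reproduced with a short script. This is a sketch under stated assumptions: the helper names are illustrative, and the p-value function uses closed forms that are valid only for 1 degree of freedom or an even number of degrees of freedom, which happens to cover both examples.

```python
import math

def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_p_value(statistic, df):
    """Upper-tail probability P(X^2 >= statistic).

    Assumption: closed forms are implemented only for df = 1 or even df.
    """
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2))
    if df > 0 and df % 2 == 0:
        half = statistic / 2
        terms = sum(half**i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    raise ValueError("closed form implemented only for df = 1 or even df")

# Coin example: goodness of fit with 2 categories, so df = 2 - 1 = 1.
coin_stat = chi_square_statistic([45, 55], [50, 50])
coin_p = chi_square_p_value(coin_stat, df=1)

# Gender by party table: expected = row total * column total / grand total.
table = [[18, 32, 10], [22, 8, 10]]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)
observed = [cell for row in table for cell in row]
expected = [r * c / grand_total for r in row_totals for c in col_totals]
table_stat = chi_square_statistic(observed, expected)
table_df = (len(table) - 1) * (len(table[0]) - 1)
table_p = chi_square_p_value(table_stat, table_df)

print(f"coin:  X^2 = {coin_stat:.2f}, df = 1, p = {coin_p:.4f}")
print(f"table: X^2 = {table_stat:.2f}, df = {table_df}, p = {table_p:.4f}")
```

For the coin, the script gives X^2 = 1 with a p-value of about 0.317. For the two-way table, it gives X^2 of about 11.25 on 2 degrees of freedom with a p-value of about 0.0036.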
What is the coefficient of determination (R^2) for a simple linear regression model? How can it be interpreted? How can it be calculated from the correlation coefficient (r)? Answer: The coefficient of determination (R^2) is the proportion of the variance in the dependent variable that is explained by the independent variable. It can be interpreted as a measure of how well the regression line fits the data. It can be calculated from the correlation coefficient (r) by squaring it: R^2 = r^2.
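The identity R^2 = r^2 for simple linear regression can be verified numerically. The sketch below computes R^2 as 1 - SSE/SST from a least-squares fit and compares it with the squared Pearson correlation. The data values and function names are hypothetical and chosen only for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def r_squared(x, y):
    """R^2 = 1 - SSE/SST for the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1 - sse / sst

# Hypothetical data, not taken from the exam.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(x, y)
print(f"r^2 = {r**2:.6f}, R^2 = {r_squared(x, y):.6f}")
```

The two printed values agree up to floating-point rounding. The identity relies on the model having a single predictor and an intercept; it does not carry over directly to multiple regression.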
What is the difference between a residual and an error in a regression model? How can they be used to assess the quality of the model? Answer: A residual is the difference between the observed value and the value predicted by the fitted model for a given observation. An error is the difference between the observed value and the value given by the true (unknown) regression function, so errors cannot be observed directly and residuals serve as estimates of them. The residuals can be used to assess the quality of the model by checking their distribution, mean, variance, and correlation with the independent variable. Ideally, the residuals should be approximately normally distributed, have a mean of zero, have a constant variance, and be uncorrelated with the independent variable.

What is multicollinearity in a multiple linear regression model? How can it affect the estimation and interpretation of the model parameters? How can it be detected and corrected? Answer: Multicollinearity is a situation where two or more independent variables in a multiple linear regression model are highly correlated with each other. It affects the estimation and interpretation of the model parameters by inflating their standard errors, making the estimates less precise and less likely to be statistically significant. It can also make the estimates unstable and sensitive to small changes in the data. It can be detected by calculating the variance inflation factor (VIF) for each independent variable, which measures how much the variance of its estimated coefficient is inflated by correlation with the other independent variables. A rule of thumb is that a VIF above 10 indicates a serious multicollinearity problem. It can be corrected by dropping some of the correlated variables, transforming them, or using regularization techniques such as ridge or lasso regression.

What is heteroscedasticity in a regression model? How can it affect the estimation and interpretation of the model parameters? How can it be detected and corrected?
Answer: Heteroscedasticity is a situation where the variance of the errors in a regression model is not constant across different values of the independent variable. It violates the constant-variance assumption of ordinary least squares (OLS) regression. The OLS coefficient estimates remain unbiased and consistent, but they are no longer efficient, and the usual standard errors are biased, which undermines the validity of hypothesis tests and confidence intervals based on them. It can be detected by plotting the residuals against the predicted values or the independent variable, or by using formal tests such as the Breusch-Pagan or White test. It can be corrected by using weighted least squares (WLS) regression, transforming the dependent or independent variables, or using robust standard errors.
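A formal check of the kind mentioned above can be sketched for a model with one predictor. The code below implements the studentized (Koenker) variant of the Breusch-Pagan test: it regresses the squared OLS residuals on x and uses LM = n * R^2 of that auxiliary regression, compared with a chi-square distribution with 1 degree of freedom. The data are hypothetical and constructed so that the error spread grows with x; the function names are illustrative assumptions.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(x, y):
    """R^2 = 1 - SSE/SST for the fitted line."""
    intercept, slope = fit_line(x, y)
    mean_y = sum(y) / len(y)
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    return 1 - sse / sst

def breusch_pagan_koenker(x, y):
    """Return (LM statistic, p-value) for one predictor, so df = 1."""
    intercept, slope = fit_line(x, y)
    squared_residuals = [(b - (intercept + slope * a)) ** 2 for a, b in zip(x, y)]
    lm = len(x) * r_squared(x, squared_residuals)
    p_value = math.erfc(math.sqrt(lm / 2))  # chi-square upper tail for df = 1
    return lm, p_value

# Hypothetical data: the deviation from the line 2 + 3x has magnitude x,
# so the spread increases with x.
x = list(range(1, 21))
y = [2 + 3 * xi + xi * (-1) ** xi for xi in x]

lm, p_value = breusch_pagan_koenker(x, y)
print(f"LM = {lm:.2f}, p = {p_value:.4f}")
```

A small p-value indicates that the squared residuals vary systematically with x, which is evidence against constant variance. With several predictors, the auxiliary regression would include all of them and the degrees of freedom would equal the number of predictors.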