Question 1. Which of the following best describes the primary purpose of regression analysis in health data?
A) To prove causation between two variables
B) To estimate the expected value of a dependent variable given predictors
C) To calculate the mean of a single variable
D) To perform a non‑parametric test of differences
Answer: B
Explanation: Regression analysis estimates how the dependent variable changes with predictors, providing expected values, not proof of causation.

Question 2. In the context of modeling, a deterministic model differs from a probabilistic model because it
A) Includes random error terms
B) Assumes a fixed relationship without error
C) Requires large sample sizes
D) Uses categorical predictors only
Answer: B
Explanation: Deterministic models specify exact outcomes with no random error, whereas probabilistic models incorporate error terms.

Question 3. Which step is NOT part of the typical regression modeling process?
A) Data collection
B) Model specification
C) Randomization of the dependent variable
D) Model validation
Answer: C
Explanation: Randomizing the dependent variable is not a standard step; the process moves from data collection to validation.

Question 4. The Pearson correlation coefficient measures
A) The causal effect of X on Y
B) The linear association strength and direction between two continuous variables
C) The variance of a single variable
D) The probability that two variables are independent
Answer: B
Explanation: Pearson’s r quantifies linear strength and direction, not causation.

Question 5. A correlation coefficient of -0.85 indicates
A) A strong positive linear relationship
B) No relationship
C) A strong negative linear relationship
D) A perfect positive relationship
Answer: C
Explanation: Negative values denote inverse direction; magnitude 0.85 signals a strong relationship.

Question 6. Which statement correctly distinguishes correlation from causation?
Answer: C
Explanation: \(\epsilon\) captures the deviation of observed y from the deterministic part of the model.

Question 9. Which of the following is NOT an assumption of simple linear regression?
A) Linearity of the relationship
B) Independence of errors
C) Normality of predictors
D) Homoscedasticity of errors
Answer: C
Explanation: Normality is required for errors, not for the predictor variable.

Question 10. The Ordinary Least Squares (OLS) method chooses coefficient estimates that
A) Maximize the sum of squared residuals
B) Minimize the sum of absolute residuals
C) Minimize the sum of squared residuals
D) Maximize the likelihood of the predictor
Answer: C
Explanation: OLS minimizes the sum of squared differences between observed and predicted values.

Question 11. The OLS estimator of the slope \(\beta_1\) in simple linear regression can be calculated as
A) \(\frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}\)
B) \(\frac{\sum (y_i - \bar{y})^2}{\sum (x_i - \bar{x})^2}\)
C) \(\frac{\sum (x_i - \bar{x})^2}{\sum (y_i - \bar{y})^2}\)
D) \(\frac{\sum (x_i - \bar{x})}{\sum (y_i - \bar{y})}\)
Answer: A
Explanation: The slope estimator is the covariance divided by the variance of X.

Question 12. The coefficient of determination \(R^2\) represents
A) The proportion of total variability in Y explained by the model
B) The slope of the regression line
C) The standard error of estimate
D) The correlation between residuals
Answer: A
Explanation: \(R^2 = SSR/SST\) quantifies explained variance.

Question 13. Which of the following residual types is scaled by its estimated standard error?
A) Raw residuals
B) Standardized residuals
C) Studentized residuals
D) Leverage residuals
Answer: C
Explanation: Studentized residuals divide raw residuals by an estimate of their standard deviation.
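The slope formula from Question 11 and the \(R^2\) definition from Question 12 can be checked numerically. The following is a minimal sketch in plain Python; the function name `fit_simple_ols` and the five data points are made up for illustration and do not come from the exam.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (b0, b1, r_squared) for the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope: sum of cross-deviations divided by sum of squared x-deviations
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Total and residual sums of squares
    sst = sum((yi - y_bar) ** 2 for yi in y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    # For OLS with an intercept, 1 - SSE/SST equals SSR/SST
    r_squared = 1 - sse / sst
    return b0, b1, r_squared


# Hypothetical illustrative data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
b0, b1, r2 = fit_simple_ols(x, y)
print(f"intercept={b0:.3f}, slope={b1:.3f}, R^2={r2:.3f}")
```

For this toy data the cross-deviation sum is 6 and the squared x-deviation sum is 10, so the slope is about 0.6, which matches option A of Question 11.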
Answer: C
Explanation: Two predictors make it a multiple linear regression model.

Question 17. The matrix notation \(\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}\) is useful because
A) It eliminates the need for residual analysis
B) It allows compact expression of OLS estimators as \((\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}\)
C) It guarantees multicollinearity is absent
D) It automatically satisfies normality
Answer: B
Explanation: Matrix form simplifies derivation of the OLS solution.

Question 18. Adjusted \(R^2\) differs from ordinary \(R^2\) in that it
A) Increases with every added predictor regardless of usefulness
B) Decreases when a predictor does not improve model fit, adjusting for number of predictors
C) Is always larger than \(R^2\)
D) Ignores the total sum of squares
Answer: B
Explanation: Adjusted \(R^2\) penalizes unnecessary variables, preventing over‑fitting.

Question 19. In a multiple regression output, a standardized coefficient (Beta weight) of 0.45 for predictor X1 indicates
A) X1 has a weaker impact than a predictor with Beta = 0.60, holding units constant
B) X1 explains 45 % of the variance in Y
C) A one‑standard‑deviation increase in X1 leads to a 0.45‑standard‑deviation increase in Y, all else equal
D) The unstandardized coefficient is 0.
Answer: C
Explanation: Standardized coefficients express effect sizes in standard‑deviation units.

Question 20. To include a categorical variable “Gender” (Male/Female) in a linear regression, you would typically
A) Use the raw text values directly
B) Create a dummy variable (e.g., 0 = Male, 1 = Female)
C) Exclude it because regression only handles numeric data
D) Convert it to a logarithmic scale
Answer: B
Explanation: Dummy coding translates categories into numeric 0/1 values.

Question 21. An interaction term between variables X1 and X2 in a regression model allows you to
A) Test whether the effect of X1 on Y changes at different levels of X2
B) Reduce multicollinearity automatically
C) Remove the need for main effects
D) Ensure homoscedasticity
C) Presence of multicollinearity
D) Linear relationship strength
Answer: B
Explanation: Shapiro‑Wilk evaluates the null hypothesis that data come from a normal distribution.

Question 25. Leverage points are observations that
A) Have unusually large residuals only
B) Have extreme predictor values, potentially influencing the fitted line
C) Always increase \(R^2\)
D) Are identified by high Cook’s distance only
Answer: B
Explanation: Leverage reflects the distance of an observation’s predictor values from the mean.

Question 26. Cook’s Distance combines information on
A) Leverage and residual size to assess overall influence of an observation
B) Only the magnitude of the residual
C) The correlation between predictors
D) The variance inflation factor
Answer: A
Explanation: Cook’s D flags points that both have large residuals and high leverage.
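To make Questions 25 and 26 concrete, the sketch below computes leverage and Cook's distance for a simple linear regression in plain Python. It assumes the standard closed forms \(h_i = 1/n + (x_i - \bar{x})^2 / S_{xx}\) and \(D_i = \frac{e_i^2}{p\,s^2} \cdot \frac{h_i}{(1 - h_i)^2}\) with \(p = 2\) coefficients. The function name `influence_measures` and the data are hypothetical.

```python
from statistics import mean


def influence_measures(x, y):
    """Return (leverage, cooks_d) lists for the simple linear fit of y on x."""
    n, p = len(x), 2  # p = number of estimated coefficients (intercept + slope)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in resid) / (n - p)  # residual variance estimate s^2
    # Leverage grows with the squared distance of x_i from the mean of x
    leverage = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Cook's D combines residual size with leverage
    cooks_d = [
        (e ** 2 / (p * mse)) * h / (1 - h) ** 2
        for e, h in zip(resid, leverage)
    ]
    return leverage, cooks_d


# Hypothetical data: the last point has an extreme x value and sits off the trend
x = [1, 2, 3, 4, 5, 15]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 20.0]
leverage, cooks_d = influence_measures(x, y)
for xi, h, d in zip(x, leverage, cooks_d):
    print(f"x={xi:>2}  leverage={h:.3f}  cooks_d={d:.3f}")
```

In this toy data the last observation has the highest leverage and a Cook's distance well above 1. Its raw residual is not the largest, because a high-leverage point pulls the fitted line toward itself, which is why residual size alone is not a sufficient diagnostic.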
Question 27. Multicollinearity most directly affects
A) The overall F‑test significance
B) The stability and interpretability of individual coefficient estimates
C) The calculation of \(R^2\)
D) The normality of residuals
Answer: B
Explanation: High collinearity inflates standard errors, making coefficients unstable.

Question 28. A Variance Inflation Factor (VIF) of 12 for predictor X3 suggests
A) No multicollinearity concerns
B) Moderate multicollinearity (acceptable)
C) Severe multicollinearity; X3 may need to be removed or combined
D) That X3 is the strongest predictor
Answer: C
Explanation: VIF > 10 is commonly considered indicative of serious multicollinearity.

Question 29. A logarithmic transformation of the dependent variable is most appropriate when
A) The relationship between X and Y is already linear
B) The residuals show increasing variance with larger fitted values
C) Predictors are categorical
D) The sample size is small
C) Adding 2 × (number of parameters) to the negative log‑likelihood
D) Ignoring the likelihood altogether
Answer: C
Explanation: AIC = -2 log(L) + 2k, where k is the number of estimated parameters.

Question 33. Cross‑validation helps to
A) Increase multicollinearity
B) Estimate model performance on unseen data, reducing over‑fitting risk
C) Ensure residuals are normally distributed
D) Convert categorical variables to continuous
Answer: B
Explanation: By repeatedly training and testing on different subsets, cross‑validation assesses generalizability.

Question 34. Polynomial regression of degree 2 (quadratic) can be expressed as
A) \(y = \beta_0 + \beta_1 x + \beta_2 x^2 + \epsilon\)
B) \(y = \beta_0 + \beta_1 \log(x) + \epsilon\)
C) \(y = \beta_0 + \beta_1 x + \epsilon\)
D) \(y = \beta_0 + \beta_1 \sqrt{x} + \epsilon\)
Answer: A
Explanation: Adding a squared term allows curvature in the fitted relationship.
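The train-and-test idea behind Question 33 can be sketched with leave-one-out cross-validation, one simple variant in which each observation is held out once. The example below compares an intercept-only model with a simple linear regression. All function names and the data are hypothetical and chosen only for illustration.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b1 * x_bar, b1


def mean_model(x_train, y_train, x_new):
    """Intercept-only model: always predict the training mean."""
    return mean(y_train)


def line_model(x_train, y_train, x_new):
    """Simple linear regression fitted on the training data only."""
    b0, b1 = fit_line(x_train, y_train)
    return b0 + b1 * x_new


def loocv_mse(x, y, predict):
    """Leave-one-out CV: hold out each point, fit on the rest, score it."""
    errors = []
    for i in range(len(x)):
        x_train, y_train = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        errors.append((y[i] - predict(x_train, y_train, x[i])) ** 2)
    return mean(errors)


# Hypothetical data with a roughly linear trend
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1, 6.8, 8.2]
mse_mean = loocv_mse(x, y, mean_model)
mse_line = loocv_mse(x, y, line_model)
print(f"LOOCV MSE, intercept-only: {mse_mean:.3f}")
print(f"LOOCV MSE, linear model:   {mse_line:.3f}")
```

Because every prediction is made for a point the model did not see during fitting, the resulting error estimates out-of-sample performance rather than in-sample fit.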
Question 35. Which of the following is a correct interpretation of a 99 % prediction interval for a new observation?
A) 99 % of the sample data falls within this interval
B) There is a 99 % probability that the true mean response lies within the interval
C) There is a 99 % chance that a future individual’s response will fall inside the interval, given the model
D) The interval is narrower than a 95 % confidence interval for the mean
Answer: C
Explanation: Prediction intervals account for both the uncertainty in the mean estimate and the individual error term.

Question 36. The Durbin‑Watson statistic is used to detect
A) Heteroscedasticity
B) Autocorrelation of residuals
C) Multicollinearity
D) Non‑linearity
Answer: B
Explanation: Values near 2 indicate no autocorrelation; values approaching 0 or 4 suggest positive or negative autocorrelation, respectively.

Question 37. When the residuals plot shows a systematic curved pattern, the most likely cause is
A) Correct model specification
B) Violation of linearity assumption; a transformation or polynomial term may be needed
C) Homoscedasticity
D) Multicollinearity
B) Encode the predictor with two dummy variables, using one level as the reference category
C) Use a single dummy variable equal to 1 for all three levels
D) Convert the categories into ordinal numbers
Answer: B
Explanation: For k categories, k‑1 dummy variables are needed; the omitted level serves as the baseline.

Question 41. If the residuals from a regression are not normally distributed, which remedial action is most appropriate?
A) Increase the sample size dramatically
B) Apply a transformation to the dependent variable or use a robust regression method
C) Add more predictors regardless of relevance
D) Ignore the violation because normality is not important
Answer: B
Explanation: Transformations or robust methods can mitigate non‑normal residuals.

Question 42. A model with \(R^2 = 0.92\) and Adjusted \(R^2 = 0.70\) suggests
A) The model is over‑fitting; many predictors add little explanatory power
B) The model is under‑fitting; more predictors are needed
C) Both statistics indicate excellent fit
D) Multicollinearity is absent
Answer: A
Explanation: A large drop from \(R^2\) to adjusted \(R^2\) signals many predictors that do not improve the model.

Question 43. In a regression context, “weighting” observations is most commonly done when
A) All observations have equal variance
B) Some observations have larger measurement error (heteroscedasticity)
C) The sample size is very large
D) Predictors are highly correlated
Answer: B
Explanation: Weighting reduces the influence of observations with larger error variance.

Question 44. The term “studentized residual” differs from a standardized residual because
A) It uses the residual’s own estimated standard error rather than the overall residual standard deviation
B) It is always larger in magnitude
C) It does not require the leverage value
D) It is only used in logistic regression
Answer: A
Explanation: Studentized residuals adjust for each observation’s leverage, providing a more precise scaling.

Question 45. Which of the following is true regarding the F‑test in a multiple regression ANOVA table?
A) It tests the null hypothesis that all slope coefficients are zero simultaneously
Explanation: BIC includes a \(\log(n)\) factor, which grows faster than the constant 2 in AIC, leading to stronger penalization.

Question 48. In a regression with interaction term \(X_1X_2\), a significant interaction coefficient indicates
A) Both main effects are non‑significant
B) The effect of \(X_1\) on Y changes depending on the level of \(X_2\) (or vice versa)
C) Multicollinearity is resolved
D) Heteroscedasticity is present
Answer: B
Explanation: Interaction significance means the slope for one predictor is conditional on the other predictor’s value.

Question 49. Which of the following plots is most useful for checking the normality of residuals?
A) Residuals vs. fitted values plot
B) Q‑Q plot (quantile‑quantile) of residuals
C) Scatter plot of Y vs. X
D) Leverage plot
Answer: B
Explanation: A Q‑Q plot compares residual quantiles to a normal distribution; deviations indicate non‑normality.

Question 50. The term “tolerance” in multicollinearity diagnostics is defined as
A) The inverse of VIF
B) The square root of VIF
C) The correlation between two predictors
D) The p‑value of the predictor
Answer: A
Explanation: Tolerance = 1/VIF; low tolerance (<0.1) signals multicollinearity.

Question 51. In a regression analysis, a Cook’s Distance value of 0.8 for an observation is
A) Always considered highly influential
B) Potentially influential if the threshold is 1 or \(4/n\) (whichever is smaller)
C) Irrelevant for diagnostics
D) Equivalent to the standardized residual
Answer: B
Explanation: Common rule: D > 1 or D > 4/n flags influence; 0.8 may be concerning in small samples.

Question 52. When performing a log‑log transformation (both Y and X logged), the slope coefficient can be interpreted as
A) The additive change in Y for a one‑unit change in X
B) The percent change in Y associated with a 1 % change in X
C) The elasticity of Y with respect to X
D) Both B and C
Answer: D
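The elasticity reading in Question 52 can be illustrated with synthetic data. The sketch below assumes an exact power law \(y = 2x^{1.5}\), a made-up relationship chosen so that the elasticity is 1.5 by construction, and regresses log y on log x in plain Python.

```python
import math
from statistics import mean

# Synthetic, noise-free data: y = 2 * x^1.5 (hypothetical, for illustration only)
x = [1.0, 2.0, 4.0, 8.0, 16.0]
y = [2.0 * xi ** 1.5 for xi in x]

# Log-log transformation: both variables are logged
log_x = [math.log(v) for v in x]
log_y = [math.log(v) for v in y]

# Simple OLS slope and intercept on the logged data
lx_bar, ly_bar = mean(log_x), mean(log_y)
slope = (
    sum((a - lx_bar) * (b - ly_bar) for a, b in zip(log_x, log_y))
    / sum((a - lx_bar) ** 2 for a in log_x)
)
intercept = ly_bar - slope * lx_bar

# A 1 % increase in x multiplies y by 1.01 ** slope
pct_change_y = (1.01 ** slope - 1) * 100

print(f"slope (elasticity) = {slope:.4f}")
print(f"exp(intercept)     = {math.exp(intercept):.4f}")
print(f"% change in y for a 1% change in x = {pct_change_y:.3f}")
```

The fitted slope recovers the exponent (about 1.5), and a 1 % increase in x produces roughly a 1.5 % increase in y. The percent-change reading is an approximation that holds for small changes, which is why options B and C describe the same quantity.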