PrepIQ NWCA Linear Regression Analysis Ultimate Exam, Exams of Technology

The PrepIQ NWCA Linear Regression Analysis Ultimate Exam focuses on statistical modeling and predictive analysis techniques. Learners study regression equations, correlation analysis, trend forecasting, data interpretation, and statistical decision-making concepts.

Typology: Exams

2025/2026

Available from 06/04/2026

shilpi-jain-3 🇮🇳

PrepIQ NWCA Linear Regression Analysis Ultimate Exam
**Question 1. Which of the following best describes the primary purpose of
regression analysis in health data?**
A) To calculate the mean of a single variable
B) To predict the value of a dependent variable from one or more independent
variables
C) To test for normality of a dataset
D) To rank variables by frequency
Answer: B
Explanation: Regression analysis models the relationship between a dependent
variable and one or more independent variables to make predictions.
**Question 2. In the context of regression, what distinguishes a deterministic model
from a probabilistic model?**
A) Deterministic models include random error terms, while probabilistic models do
not
B) Probabilistic models produce the same output for identical inputs, deterministic
models do not
C) Deterministic models predict exact outcomes, whereas probabilistic models
predict outcomes with associated uncertainty
D) Probabilistic models are only used for categorical data
Answer: C
Explanation: Deterministic models give fixed outputs for given inputs, while
probabilistic models incorporate randomness and provide predictions with
uncertainty.
**Question 3. Which step typically follows data collection in the regression modeling
process?**
A) Hypothesis testing of the slope
B) Model validation
C) Variable selection
D) Data cleaning and exploratory analysis
Answer: D
Explanation: After collecting data, analysts usually clean the data and explore relationships before building a model.
**Question 4. The Pearson correlation coefficient measures which of the following?**
A) The causal effect of X on Y
B) The linear association strength and direction between two continuous variables
C) The difference between means of two groups
D) The variance of a single variable
Answer: B
Explanation: Pearson's r quantifies the linear relationship between two continuous variables, ranging from -1 to +1.
**Question 5. A correlation of -0.85 between physical activity minutes and BMI indicates what?**
A) A strong positive linear relationship
B) No linear relationship
C) A strong negative linear relationship
D) That increased activity causes higher BMI
Answer: C
Explanation: An r of -0.85 reflects a strong inverse linear association; as activity increases, BMI tends to decrease.
**Question 6. Which statement correctly differentiates correlation from causation?**
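Returning to Questions 4 and 5, the sketch below computes Pearson's r directly from its definition. The function name `pearson_r` and the six data points are invented for illustration and are not part of the exam; they are chosen only to produce a strong negative association like the one described in Question 5.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: co-variation of X and Y scaled by both spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: weekly activity minutes and BMI for six people.
activity = [30, 60, 90, 120, 150, 180]
bmi = [31.0, 29.5, 28.0, 27.5, 25.0, 24.5]

r = pearson_r(activity, bmi)
print(f"r = {r:.3f}")  # negative: BMI tends to fall as activity rises
```

A value near -1 here describes association only; it says nothing about whether activity causes the change in BMI.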

Explanation: \(\epsilon\) denotes the random error (residual) capturing variability not explained by the linear model.
**Question 9. Which of the following is NOT an assumption of simple linear regression?**
A) Linearity of the relationship between X and Y
B) Independence of observations
C) Homoscedasticity (constant variance of errors)
D) Multicollinearity among predictors
Answer: D
Explanation: Multicollinearity concerns multiple predictors and is not an assumption for simple linear regression with a single predictor.
**Question 10. In the OLS method, the slope \(\beta_1\) is estimated by which formula?**
A) \(\frac{\sum (x_i - \bar{x})}{\sum (y_i - \bar{y})}\)
B) \(\frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}\)
C) \(\frac{\sum y_i}{\sum x_i}\)
D) \(\frac{\sum (y_i - \bar{y})^2}{\sum (x_i - \bar{x})^2}\)
Answer: B
Explanation: The OLS slope estimator is the covariance of X and Y divided by the variance of X.
**Question 11. The intercept \(\beta_0\) in a simple linear regression model represents:**
A) The predicted value of Y when X equals its mean
B) The predicted value of Y when X is zero

C) The average of Y across all observations
D) The residual variance
Answer: B
Explanation: \(\beta_0\) is the estimated value of Y when the predictor X equals zero.
**Question 12. Which metric quantifies the proportion of total variability in Y explained by the regression model?**
A) Standard Error of Estimate
B) Adjusted R²
C) Coefficient of Determination (R²)
D) F-statistic
Answer: C
Explanation: R² = SSR / SST measures the proportion of total variation accounted for by the model.
**Question 13. If a simple linear regression yields \(R^2 = 0.64\), what does this imply?**
A) 64% of the variation in the predictor is explained by the response
B) The model explains 64% of the variation in the response variable
C) The slope is 0.
D) There is a 64% chance that the model is correct
Answer: B
Explanation: An R² of 0.64 indicates that 64% of the variability in Y is explained by X.
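The slope formula from Question 10, the intercept from Question 11, and R² = SSR / SST from Question 12 can be tied together in a small worked example. This is a minimal sketch: the function name `fit_simple_ols` and the five data points are assumptions made for illustration, and the intercept is computed with the standard identity b0 = ȳ − b1·x̄, which the exam text does not state explicitly.

```python
def fit_simple_ols(xs, ys):
    """Return (intercept, slope, r_squared) for a one-predictor OLS fit."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: sum of cross-deviations over sum of squared X deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through (mean_x, mean_y).
    b0 = mean_y - b1 * mean_x
    fitted = [b0 + b1 * x for x in xs]
    sst = sum((y - mean_y) ** 2 for y in ys)    # total variation
    ssr = sum((f - mean_y) ** 2 for f in fitted)  # explained variation
    return b0, b1, ssr / sst

# Hypothetical data, chosen so the arithmetic is easy to check by hand.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1, r2 = fit_simple_ols(xs, ys)
print(f"intercept={b0:.2f}, slope={b1:.2f}, R^2={r2:.2f}")
```

For these numbers the cross-deviation sum is 6 and the squared X deviations sum to 10, so the slope is 0.6, the intercept is 4 − 0.6·3 = 2.2, and R² is 3.6 / 6 = 0.6.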

Answer: A
Explanation: SST = SSR (regression sum of squares) + SSE (error sum of squares).
**Question 17. The F-test in simple linear regression assesses:**
A) Whether the intercept differs from zero
B) Whether the slope coefficient is significantly different from zero
C) Whether the residuals are normally distributed
D) Whether the predictor variable is categorical
Answer: B
Explanation: The overall F-test evaluates the null hypothesis that all regression coefficients (in SLR, just the slope) are zero.
**Question 18. To test the hypothesis \(H_0: \beta_1 = 0\) at α = 0.05, which statistic is most appropriate?**
A) Z-statistic
B) t-statistic with n-2 degrees of freedom
C) Chi-square statistic
D) F-statistic with 1 and n-2 degrees of freedom
Answer: B
Explanation: In SLR, the slope is tested using a t-test with n-2 degrees of freedom.
**Question 19. A 95% confidence interval for the slope \(\beta_1\) is (0.12, 0.45). Which conclusion is correct?**
A) The slope is not significantly different from zero at the 0.05 level
B) The true slope is definitely 0.

C) We are 95% confident that the true slope lies between 0.12 and 0.45
D) The interval indicates heteroscedasticity
Answer: C
Explanation: The interval provides a range within which the true slope is expected to fall with 95% confidence.
**Question 20. A prediction interval for a new observation is wider than the corresponding confidence interval for the mean response because:**
A) It accounts for both the uncertainty in estimating the mean and the random error of an individual observation
B) It uses a larger critical t-value
C) It ignores the residual variance
D) It is based on a smaller sample size
Answer: A
Explanation: Prediction intervals incorporate both the variability of the estimated mean and the inherent variability of individual outcomes.
**Question 21. In multiple linear regression, the model expands to \(y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \epsilon\). What does \(\beta_2\) represent?**
A) The effect of \(x_2\) on Y when all other predictors are held constant
B) The total effect of all predictors on Y
C) The correlation between \(x_1\) and \(x_2\)
D) The intercept for the second predictor
Answer: A
Explanation: Each \(\beta_i\) reflects the change in Y associated with a one-unit change in its predictor, controlling for all other variables.
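The comparison in Question 20 can be written out for simple linear regression. The two intervals below are the standard textbook forms at a predictor value \(x_0\) and are not quoted from the exam; here \(s\) is the residual standard error, \(s^2 = \mathrm{SSE}/(n-2)\), and \(S_{xx} = \sum (x_i - \bar{x})^2\).

```latex
\begin{align*}
\text{CI for the mean response:}\quad
  & \hat{y}_0 \pm t_{\alpha/2,\,n-2}\; s\,
    \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} \\
\text{PI for a new observation:}\quad
  & \hat{y}_0 \pm t_{\alpha/2,\,n-2}\; s\,
    \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}}
\end{align*}
```

Both intervals use the same critical t-value. The only difference is the extra 1 under the square root, which adds the variance of an individual observation's error, so the prediction interval is always the wider of the two.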

Answer: B
Explanation: Partial coefficients isolate the effect of one predictor controlling for the others.
**Question 25. Which test evaluates the overall significance of a multiple regression model?**
A) t-test for each coefficient
B) F-test comparing the full model to a model with no predictors
C) Chi-square test of residuals
D) Durbin-Watson test
Answer: B
Explanation: The overall F-test assesses whether at least one predictor has a non-zero coefficient.
**Question 26. Standardized regression coefficients (beta weights) are useful because they:**
A) Provide coefficients in original measurement units
B) Allow direct comparison of predictor importance regardless of variable scales
C) Eliminate multicollinearity
D) Convert categorical variables to continuous
Answer: B
Explanation: Standardized coefficients are expressed in standard deviation units, facilitating comparison across variables.
**Question 27. To include gender (male/female) in a regression model, which coding scheme is appropriate?**

A) Treat gender as a continuous variable ranging from 0 to 1
B) Use a dummy variable (e.g., 0 = female, 1 = male)
C) Exclude gender because it is categorical
D) Use the Pearson correlation coefficient
Answer: B
Explanation: Dummy coding converts a binary categorical variable into a numeric 0/1 indicator.
**Question 28. When modeling the interaction between exercise frequency (X1) and diet quality (X2) on weight loss, the interaction term is:**
A) \(X1 + X2\)
B) \(X1 - X2\)
C) \(X1 \times X2\)
D) \(\frac{X1}{X2}\)
Answer: C
Explanation: An interaction term is the product of the two predictors, capturing how the effect of one variable changes at levels of the other.
**Question 29. Heteroscedasticity in regression residuals can be detected most readily by:**
A) A Q-Q plot of residuals
B) Plotting residuals versus fitted values and observing a funnel shape
C) Computing the correlation coefficient
D) Performing a chi-square test
Answer: B
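To make the coding schemes from Questions 27 and 28 concrete, the sketch below builds one design-matrix row per record, with a 0/1 dummy for gender and a product term for the interaction. The function name `design_row`, the column order, and the two records are hypothetical and exist only for this illustration.

```python
def design_row(exercise_freq, diet_quality, gender):
    """Build [intercept, X1, X2, X1*X2, male_dummy] for one record."""
    # Dummy coding as in Question 27: 0 = female, 1 = male.
    male = 1 if gender == "male" else 0
    # Interaction term as in Question 28: the product of the two predictors.
    interaction = exercise_freq * diet_quality
    return [1.0, exercise_freq, diet_quality, interaction, male]

# Hypothetical records: (exercise sessions per week, diet score, gender).
records = [(3, 7.0, "female"), (5, 4.0, "male")]
X = [design_row(*rec) for rec in records]

for row in X:
    print(row)
```

With the product column in the model, the estimated effect of exercise frequency is no longer a single number: it shifts with the diet score by the size of the interaction coefficient.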

D) Severe heteroscedasticity
Answer: C
Explanation: A Durbin-Watson value around 2 indicates that residuals are approximately independent.
**Question 33. Which plot is most useful for checking normality of residuals?**
A) Scatter plot of residuals vs. predictor
B) Histogram of residuals
C) Normal probability (Q-Q) plot
D) Bar chart of residual frequencies
Answer: C
Explanation: A Q-Q plot compares the distribution of residuals to a normal distribution; deviations from the line suggest non-normality.
**Question 34. The Shapiro-Wilk test evaluates:**
A) Equality of variances across groups
B) Whether a sample comes from a normal distribution
C) Presence of multicollinearity
D) The significance of the regression slope
Answer: B
Explanation: The Shapiro-Wilk test is a formal test for normality.
**Question 35. Leverage points in regression are identified by:**
A) Large absolute residuals alone
B) High values of the hat matrix diagonal \(h_{ii}\)

C) Small Cook’s distance values D) Low VIF values Answer: B Explanation: Leverage measures how far an observation’s predictor values are from the mean of the predictors; high hat values indicate high leverage. Question 36. Cook’s distance combines information from residuals and leverage to assess: A) Multicollinearity B) Influence of an observation on overall regression coefficients C) Heteroscedasticity D) Normality of errors Answer: B Explanation: Cook’s D quantifies the change in fitted values when a particular observation is omitted. Question 37. A VIF (Variance Inflation Factor) value of 12 for a predictor suggests: A) No multicollinearity concerns B) Moderate multicollinearity C) Severe multicollinearity, possibly inflating standard errors D) That the predictor should be transformed Answer: C Explanation: VIF values above 10 are commonly taken as indicating serious multicollinearity.

Explanation: WLS assigns weights inversely proportional to error variance, correcting heteroscedasticity.
**Question 41. In stepwise forward selection, a variable is added to the model only if:**
A) Its p-value exceeds a pre-specified threshold
B) Its inclusion reduces the Akaike Information Criterion (AIC)
C) Its VIF is greater than 5
D) It has the highest correlation with the response
Answer: B
Explanation: Forward stepwise adds variables that improve model fit, often judged by decreasing AIC (or increasing F-statistic significance).
**Question 42. The Bayesian Information Criterion (BIC) differs from AIC primarily by:**
A) Penalizing model complexity more heavily, especially with larger sample sizes
B) Ignoring likelihood altogether
C) Being used only for logistic regression
D) Always selecting the model with the most predictors
Answer: A
Explanation: BIC includes a stronger penalty term (log n) for the number of parameters, favoring simpler models as n grows.
**Question 43. Cross-validation helps to assess:**
A) Multicollinearity within a single dataset
B) The predictive performance of a model on unseen data
C) The normality of residuals

D) The exact value of β coefficients
Answer: B
Explanation: Cross-validation partitions data into training and testing subsets to estimate out-of-sample prediction error.
**Question 44. In k-fold cross-validation with k = 5, the data are:**
A) Split into 5 equal parts, each serving once as the validation set while the other 4 are used for training
B) Randomly sampled 5 times with replacement
C) Divided into 5 groups, but only one group is ever used for training
D) Tested on all 5 folds simultaneously
Answer: A
Explanation: 5-fold cross-validation rotates the validation set across the five partitions.
**Question 45. Polynomial regression of degree 2 can be expressed as:**
A) \(y = \beta_0 + \beta_1 x + \epsilon\)
B) \(y = \beta_0 + \beta_1 x + \beta_2 x^2 + \epsilon\)
C) \(y = \beta_0 + \beta_1 \log(x) + \epsilon\)
D) \(y = \beta_0 + \beta_1 x^{-1} + \epsilon\)
Answer: B
Explanation: A quadratic (second-degree) polynomial includes both \(x\) and \(x^2\) terms.
**Question 46. When adding a quadratic term \(x^2\) to a linear model, the primary purpose is to:**
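The rotation described in Question 44 can be sketched as an index generator. This is an illustration under simplifying assumptions: the function name `k_fold_indices` is made up, and the rows are split in their existing order, whereas in practice the data would usually be shuffled first.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        yield train, validation
        start = stop

# Ten hypothetical observations, k = 5: each fold validates on 2 rows.
folds = list(k_fold_indices(10, 5))
for train, validation in folds:
    print("train:", train, "validate:", validation)
```

Every observation appears in exactly one validation set and in the training set of the other four folds, which is what lets the averaged validation error stand in for out-of-sample performance.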

Explanation: Over-fitting occurs when a model is overly complex and fits idiosyncrasies of the sample rather than the underlying pattern.
**Question 49. Which diagnostic plot is most useful for detecting influential observations?**
A) Residuals vs. fitted values
B) Cook's distance plot
C) Histogram of Y
D) Scatter plot of X vs. Y
Answer: B
Explanation: A plot of Cook's D highlights observations that have a large impact on regression coefficients.
**Question 50. A Durbin-Watson statistic of 1.2 suggests:**
A) Positive autocorrelation of residuals
B) Negative autocorrelation of residuals
C) No autocorrelation
D) Perfect multicollinearity
Answer: A
Explanation: Values substantially less than 2 indicate positive autocorrelation.
**Question 51. Which of the following best describes the purpose of the "adjusted R²" when comparing two models with different numbers of predictors?**
A) To always increase with more predictors
B) To penalize unnecessary predictors, allowing fair comparison of model explanatory power
C) To replace the F-test for overall significance

D) To measure the correlation between observed and predicted values
Answer: B
Explanation: Adjusted R² adjusts for model complexity, decreasing if added variables do not improve fit.
**Question 52. In a multiple regression output, a predictor has a p-value of 0.08. At α = 0.05, the appropriate decision is to:**
A) Retain the predictor because p < 0.
B) Remove the predictor because it is not statistically significant at the 0.05 level
C) Conclude that the model is invalid
D) Increase the sample size to achieve significance
Answer: B
Explanation: With α = 0.05, a p-value of 0.08 fails to meet the significance criterion, suggesting the predictor may be dropped.
**Question 53. The term "partial F-test" in multiple regression is used to:**
A) Compare two nested models, testing whether a set of predictors improves the model significantly
B) Test the normality of residuals
C) Evaluate multicollinearity among predictors
D) Determine the best transformation for Y
Answer: A
Explanation: A partial F-test assesses whether adding (or removing) a group of variables significantly changes model fit.
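The penalty in Question 51 and the nested-model comparison in Question 53 both reduce to short formulas. The sketch below uses the standard definitions, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1) and F = [(SSE_reduced − SSE_full)/q] / [SSE_full/(n − k − 1)], where k is the number of predictors in the full model and q is the number of predictors being tested. These formulas, the function names, and all of the numbers are assumptions added for illustration, not material from the exam.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def partial_f(sse_reduced, sse_full, q, n, k_full):
    """F statistic for adding q predictors to reach a full model of k_full."""
    numerator = (sse_reduced - sse_full) / q
    denominator = sse_full / (n - k_full - 1)
    return numerator / denominator

# Hypothetical comparison with n = 50: two extra predictors raise R^2
# only slightly, from 0.640 (k = 3) to 0.645 (k = 5).
adj_small = adjusted_r2(0.640, n=50, k=3)
adj_large = adjusted_r2(0.645, n=50, k=5)
print(f"adjusted R^2: {adj_small:.4f} -> {adj_large:.4f}")

# Hypothetical sums of squared errors for a nested pair of models.
f_stat = partial_f(sse_reduced=120.0, sse_full=100.0, q=2, n=50, k_full=5)
print(f"partial F = {f_stat:.2f} on (2, 44) degrees of freedom")
```

In the first comparison plain R² goes up while adjusted R² goes down, which is the behaviour Question 51 describes. The partial F value would then be compared against an F distribution with q and n − k − 1 degrees of freedom.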