PrepIQ Predictive Analytics Certificate Program Ultimate Exam, Exams of Technology

This practice exam is designed for those seeking certification in predictive analytics. It tests knowledge in statistical methods, data analysis, machine learning models, and business forecasting techniques, preparing candidates to use data for decision-making and predictive insights.

Typology: Exams

2025/2026

Available from 05/02/2026

shilpi-jain-3

PrepIQ Predictive Analytics
Certificate Program Ultimate Exam
**Question 1.** Which of the following best describes the primary difference
between descriptive and predictive analytics?
A) Descriptive analytics forecasts future outcomes, while predictive analytics
summarizes past data.
B) Descriptive analytics uses statistical models, whereas predictive analytics
relies on data visualization only.
C) Descriptive analytics explains what happened, while predictive analytics
estimates what will happen.
D) Descriptive analytics is limited to structured data, while predictive
analytics works only with unstructured data.
Answer: C
Explanation: Descriptive analytics focuses on summarizing historical data to
understand past events, whereas predictive analytics applies statistical or
machine learning models to estimate future outcomes.
**Question 2.** In the CRISP-DM methodology, which phase directly follows
“Data Understanding”?
A) Business Understanding
B) Data Preparation
C) Modeling
D) Deployment
Answer: B
Explanation: After gaining an understanding of the data, the next step is to
clean, transform, and prepare it for modeling.
**Question 3.** Which statistical measure is most appropriate for describing
the central tendency of a highly skewed continuous variable?
A) Mean
B) Median
C) Mode
D) Standard deviation
Answer: B
Explanation: The median is resistant to extreme values and better represents the center of a skewed distribution than the mean.
**Question 4.** The probability mass function of a Binomial distribution with parameters n=10 and p=0.3 gives the probability of obtaining exactly 4 successes. Which expression computes this probability?
A) C(10,4)·0.3⁴·0.7⁶
B) (10⁴)·0.3·0.
C) 10·0.3⁴·0.7⁶
D) C(10,4)·0.7⁴·0.3⁶
Answer: A
Explanation: The binomial probability formula is C(n,k)·pᵏ·(1-p)ⁿ⁻ᵏ; substituting n=10, k=4, p=0.3 yields option A.
**Question 5.** In hypothesis testing, a p-value of 0.04 indicates which of the following at a 5 % significance level?
A) Fail to reject the null hypothesis
B) Reject the null hypothesis
C) Accept the alternative hypothesis with certainty
D) The test is inconclusive
Answer: B
Explanation: Since 0.04 < 0.05, the result is statistically significant and the null hypothesis is rejected.
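
The binomial formula from Question 4 can be checked numerically. The following is a minimal sketch using only Python's standard library; the function name `binomial_pmf` is chosen here for illustration and is not part of the exam.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Question 4: n = 10 trials, p = 0.3, exactly k = 4 successes.
prob_four = binomial_pmf(4, 10, 0.3)
print(f"P(X = 4) = {prob_four:.4f}")  # about 0.2001

# Sanity check: the probabilities over k = 0..n should sum to 1.
total = sum(binomial_pmf(k, 10, 0.3) for k in range(11))
print(f"Sum over all k = {total:.4f}")
```

Here C(10,4) = 210, so option A evaluates to 210 · 0.0081 · 0.117649, roughly 0.20.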

Explanation: K-NN imputation uses similar observations to estimate missing values, better retaining the distribution’s variance than simple mean/median substitution.
**Question 9.** Which of the following is a common method for detecting outliers in a univariate continuous feature?
A) Pearson correlation
B) Z-score greater than 3
C) Chi-square test
D) One-hot encoding
Answer: B
Explanation: Observations with absolute Z-scores > 3 are typically considered outliers in a normally distributed variable.
**Question 10.** Min-Max scaling transforms a feature to which range?
A) 0 to 1
B) –1 to 1
C) –∞ to +∞
D) 0 to 100
Answer: A
Explanation: Min-Max scaling linearly rescales data so that the minimum becomes 0 and the maximum becomes 1.
**Question 11.** When creating dummy variables for a categorical variable with three levels (Red, Blue, Green), how many dummy columns should be generated to avoid the dummy-variable trap?
A) 1
B) 2
C) 3
D) 4
Answer: B
Explanation: With k categories, k-1 dummy variables are required; the omitted category serves as the reference level.
**Question 12.** In time-series feature engineering, which lag feature would you create to capture the value from two periods ago?
A) lag_
B) lag_2
C) lead_
D) diff_
Answer: B
Explanation: lag_2 stores the observation from two time steps prior, useful for autoregressive modeling.
**Question 13.** A box plot is most useful for visualizing which of the following aspects of a variable?
A) Frequency distribution
B) Correlation with another variable
C) Central tendency and dispersion, including outliers
D) Time-based trends
Answer: C
Explanation: Box plots display median, quartiles, and potential outliers, summarizing distribution shape and spread.
**Question 14.** In a scatter plot matrix of several continuous variables, what does a strong linear pattern between two variables suggest?
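
The preprocessing ideas in Questions 10 to 12 above (Min-Max scaling, k-1 dummy coding, and lag features) can be sketched in plain Python. This is an illustrative sketch only: the helper names `min_max_scale`, `dummy_encode`, and `lag`, the `is_` column prefix, and the sample values are assumptions made here, not part of the exam.

```python
def min_max_scale(values):
    """Linearly rescale so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant feature: nothing to rescale, so return zeros (a convention chosen here).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def dummy_encode(values, reference):
    """Create k-1 indicator columns by omitting the reference level."""
    levels = sorted(set(values) - {reference})
    return {f"is_{level}": [int(v == level) for v in values] for level in levels}


def lag(values, periods):
    """Shift a series back by `periods` steps; early positions have no history."""
    kept = max(len(values) - periods, 0)
    return [None] * (len(values) - kept) + list(values[:kept])


scaled = min_max_scale([10, 15, 20])
dummies = dummy_encode(["Red", "Blue", "Green", "Blue"], reference="Red")
lag_2 = lag([100, 110, 120, 130], periods=2)

print(scaled)   # [0.0, 0.5, 1.0]
print(dummies)  # {'is_Blue': [0, 1, 0, 1], 'is_Green': [0, 0, 1, 0]}
print(lag_2)    # [None, None, 100, 110]
```

With three colour levels and Red as the reference, only two indicator columns are produced; a Red row is the one where both indicators are 0.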

**Question 17.** Which assumption of ordinary least squares regression is violated if residuals exhibit a funnel-shaped pattern when plotted against fitted values?
A) Linearity
B) Independence
C) Homoscedasticity
D) Normality
Answer: C
Explanation: A funnel shape indicates non-constant variance (heteroscedasticity), breaching the homoscedasticity assumption.
**Question 18.** Variance Inflation Factor (VIF) values greater than 10 typically indicate:
A) Strong multicollinearity among predictors.
B) Overfitting due to too many observations.
C) Non-linear relationships requiring transformation.
D) Perfect model fit.
Answer: A
Explanation: High VIF values suggest that a predictor is highly correlated with other predictors, leading to multicollinearity.
**Question 19.** Ridge regression differs from Lasso regression mainly in its penalty term because:
A) Ridge uses L1 norm, Lasso uses L2 norm.
B) Ridge uses L2 norm, Lasso uses L1 norm.
C) Both use L1 norm, but Ridge adds a bias term.
D) Both use L2 norm, but Lasso shrinks coefficients to zero.
Answer: B
Explanation: Ridge adds an L2 (squared) penalty, while Lasso adds an L1 (absolute) penalty. Both shrink coefficients, and the L1 penalty can set some coefficients exactly to zero, eliminating those variables.
**Question 20.** In logistic regression, the odds ratio associated with a predictor of 1.5 means:
A) The odds increase by 50 % for each unit increase in the predictor.
B) The odds decrease by 1.5 times for each unit increase.
C) The probability of the outcome increases by 1.5 %.
D) The log-odds increase by 1.5 units.
Answer: A
Explanation: An odds ratio of 1.5 indicates that the odds are multiplied by 1.5 (i.e., increase by 50 %) for each one-unit increase in the predictor.
**Question 21.** Which loss function is minimized when training a binary classification Support Vector Machine?
A) Hinge loss
B) Logistic loss
C) Squared error loss
D) Cross-entropy loss
Answer: A
Explanation: SVMs use hinge loss to maximize the margin between classes while penalizing misclassifications.
**Question 22.** In a decision tree, which impurity measure is based on the concept of information entropy?
A) Gini impurity
B) Misclassification error
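
The odds-ratio interpretation in Question 20 above is easy to verify by hand. The sketch below assumes a hypothetical fitted coefficient of ln(1.5) and an illustrative helper `apply_odds_ratio`; it shows that a one-unit increase multiplies the odds by 1.5, while the change in probability depends on the starting point.

```python
import math


def apply_odds_ratio(probability: float, odds_ratio: float) -> float:
    """Return the new probability after multiplying the odds by odds_ratio."""
    odds = probability / (1 - probability)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)


# Hypothetical logistic regression coefficient (log-odds per unit increase).
coefficient = math.log(1.5)
odds_ratio = math.exp(coefficient)  # exponentiating recovers the odds ratio

for start in (0.10, 0.50, 0.90):
    end = apply_odds_ratio(start, odds_ratio)
    print(f"p = {start:.2f} -> {end:.3f} after a one-unit increase")
```

The odds always grow by 50 %, but the probability moves by a different amount at each starting value, which is why options phrased in terms of a fixed probability change do not describe an odds ratio.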

A) The validation set left aside before training.
B) The subset of trees that did not use a particular observation during bootstrap sampling.
C) The average error across all trees on the training data.
D) The error on the test set after model deployment.
Answer: B
Explanation: OOB error uses observations not included in the bootstrap sample for a given tree, providing an internal performance estimate.
**Question 26.** Which kernel function enables a Support Vector Machine to model non-linear relationships by mapping data into an infinite-dimensional space?
A) Linear kernel
B) Polynomial kernel
C) Radial Basis Function (RBF) kernel
D) Sigmoid kernel
Answer: C
Explanation: The RBF kernel computes similarity based on Euclidean distance and implicitly maps data to an infinite-dimensional feature space, capturing complex patterns.
**Question 27.** In k-Nearest Neighbors classification, choosing a very large k value will most likely:
A) Increase model variance.
B) Decrease bias but increase variance.
C) Increase bias and reduce variance.
D) Have no effect on bias-variance trade-off.
Answer: C
Explanation: A large k smooths decision boundaries, leading to higher bias (oversimplification) while reducing variance (less sensitivity to noise).
**Question 28.** Which of the following is NOT a typical step in the model tuning process?
A) Grid search over hyperparameter space.
B) Random search for hyperparameters.
C) Manual selection of the final model without validation.
D) Cross-validation to assess each hyperparameter combination.
Answer: C
Explanation: Manual selection without validation defeats the purpose of systematic tuning; the other options are standard practices.
**Question 29.** The confusion matrix element representing instances correctly predicted as the negative class is:
A) True Positive (TP)
B) False Positive (FP)
C) True Negative (TN)
D) False Negative (FN)
Answer: C
Explanation: True Negative counts the correctly identified negative cases.
**Question 30.** For an imbalanced binary classification problem, which metric is most informative when the minority class is of primary interest?
A) Accuracy
B) Precision
C) Recall (Sensitivity)
D) Specificity
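
The four metrics listed in Question 30 all derive from the confusion matrix cells described in Question 29. The following sketch computes them for a made-up imbalanced example (20 positives out of 1,000 cases); the counts and the function name `classification_metrics` are assumptions for illustration only.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute common metrics from confusion matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        # Share of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of actual positives that the model finds.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of actual negatives that the model correctly rejects.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical counts: 20 actual positives, 980 actual negatives.
metrics = classification_metrics(tp=8, fp=5, tn=975, fn=12)
for name, value in metrics.items():
    print(f"{name:>11}: {value:.3f}")
```

In this example accuracy and specificity are both above 0.98 even though the model misses 12 of the 20 minority-class cases, which is why accuracy alone can be misleading on imbalanced data.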

D) Stratified sampling validation
Answer: C
Explanation: Rolling-origin validation respects temporal order by training on past data and testing on subsequent periods.
**Question 34.** The Augmented Dickey-Fuller (ADF) test is used to assess:
A) Autocorrelation in residuals.
B) Stationarity of a time-series.
C) Seasonality strength.
D) Normality of errors.
Answer: B
Explanation: ADF tests the null hypothesis of a unit root; rejection indicates stationarity.
**Question 35.** When a time-series exhibits both trend and seasonal patterns, which decomposition model is appropriate if the amplitude of the seasonal component grows with the level of the series?
A) Additive decomposition
B) Multiplicative decomposition
C) STL decomposition only
D) No decomposition needed
Answer: B
Explanation: Multiplicative decomposition assumes that seasonal variation changes proportionally with the series level.
**Question 36.** In exponential smoothing, which method is best suited for data with a trend but no seasonality?
A) Simple Exponential Smoothing (SES)
B) Holt’s Linear Trend method
C) Holt-Winters Seasonal method
D) Moving Average
Answer: B
Explanation: Holt’s method extends SES by adding a component to capture linear trends.
**Question 37.** For an ARIMA(p,d,q) model, the parameter “d” represents:
A) Number of autoregressive terms.
B) Number of differencing operations applied to achieve stationarity.
C) Number of moving-average terms.
D) Seasonal period length.
Answer: B
Explanation: “d” is the order of differencing needed to render the series stationary.
**Question 38.** In the identification of ARIMA parameters, the partial autocorrelation function (PACF) is primarily used to determine:
A) The moving-average order (q).
B) The differencing order (d).
C) The autoregressive order (p).
D) Seasonal period (s).
Answer: C
Explanation: PACF cuts off after lag p for a pure AR process, helping to select the AR order.
**Question 39.** Which forecasting accuracy metric is scale-independent and expressed as a percentage?
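
Two of the time-series ideas above, rolling-origin validation and the differencing order "d" from Question 37, can be sketched without any libraries. The function names, the expanding-window design, and the sample series are assumptions chosen for illustration; other rolling-origin variants (for example a fixed-width window) exist.

```python
def rolling_origin_splits(n_obs: int, initial_train: int, horizon: int = 1):
    """Yield (train_indices, test_indices) pairs with an expanding training window."""
    end = initial_train
    while end + horizon <= n_obs:
        # Train on everything before `end`, test on the next `horizon` points.
        yield list(range(end)), list(range(end, end + horizon))
        end += horizon


def difference(series, d: int = 1):
    """Apply first differencing d times, as the 'd' in ARIMA(p,d,q) does."""
    result = list(series)
    for _ in range(d):
        result = [later - earlier for earlier, later in zip(result, result[1:])]
    return result


splits = list(rolling_origin_splits(n_obs=6, initial_train=3))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)

quadratic = [1, 4, 9, 16, 25]          # series with a curved trend
print(difference(quadratic, d=1))      # [3, 5, 7, 9]  (still trending)
print(difference(quadratic, d=2))      # [2, 2, 2]     (constant)
```

Every test index is later than all of its training indices, so no future information leaks into training. The second example shows why some series need d = 2 before the trend disappears.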

**Question 42.** Principal Component Analysis (PCA) primarily aims to:
A) Increase the number of features.
B) Reduce dimensionality while preserving as much variance as possible.
C) Convert categorical variables into numeric.
D) Perform supervised classification.
Answer: B
Explanation: PCA creates orthogonal components that capture maximal variance, facilitating dimensionality reduction.
**Question 43.** In association rule mining, the “lift” metric greater than 1 indicates:
A) The antecedent and consequent are independent.
B) The rule is less useful than random chance.
C) Positive association; the occurrence of the antecedent increases the likelihood of the consequent.
D) Negative correlation between antecedent and consequent.
Answer: C
Explanation: Lift = confidence / (support of consequent); values > 1 mean the rule predicts the consequent better than chance.
**Question 44.** Which of the following is a common technique for handling high-cardinality categorical variables in tree-based models?
A) One-hot encoding all categories.
B) Target encoding (mean encoding).
C) Dropping the variable entirely.
D) Converting to binary using ASCII codes.
Answer: B
Explanation: Target encoding replaces categories with the mean of the target, reducing dimensionality while preserving predictive information for tree models.
**Question 45.** When evaluating a regression model on a test set, you obtain an R-squared of –0.12. What does this indicate?
A) The model explains 12 % of variance.
B) The model performs worse than a simple mean-only predictor.
C) The model has perfect fit.
D) There is a calculation error; R-squared cannot be negative.
Answer: B
Explanation: Negative R² occurs when the model’s residual sum of squares exceeds that of the baseline (mean) model, indicating poor performance.
**Question 46.** In a classification problem with three classes, which metric extends the binary ROC AUC concept?
A) One-vs-Rest AUC
B) Macro-averaged F1 score only
C) Confusion matrix diagonal sum
D) Gini coefficient only
Answer: A
Explanation: For multiclass, AUC can be computed using a One-vs-Rest approach and then averaged (macro or weighted).
**Question 47.** Which regularization technique can perform both variable selection and coefficient shrinkage simultaneously?
A) Ridge regression
B) Elastic Net
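
The negative R² case from Question 45 above follows directly from the definition R² = 1 − SS_res / SS_tot. The sketch below uses made-up test values and an illustrative `r_squared` helper to show a mean-only predictor scoring exactly 0 and a worse model scoring below 0.

```python
def r_squared(y_true, y_pred) -> float:
    """R^2 = 1 - SS_res / SS_tot, with SS_tot measured around the mean of y_true."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical test-set targets and two sets of predictions.
y_test = [1, 2, 3, 4, 5]
mean_only = [3, 3, 3, 3, 3]      # baseline: always predict the mean
bad_model = [5, 4, 3, 2, 1]      # predictions that move the wrong way

print(r_squared(y_test, mean_only))  # 0.0
print(r_squared(y_test, bad_model))  # -3.0
```

For the second model SS_res is 40 against an SS_tot of 10, so R² = 1 − 4 = −3: its squared errors are four times those of the mean-only baseline.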

**Question 50.** Which validation technique provides the most unbiased estimate of model performance when the dataset is small?
A) Simple train-test split (70/30)
B) 5-fold cross-validation
C) Leave-One-Out Cross-Validation (LOOCV)
D) Bootstrap .632 estimator
Answer: C
Explanation: LOOCV uses every observation as a test case once, maximizing training data while providing an almost unbiased performance estimate for small datasets.
**Question 51.** In a hierarchical clustering dendrogram, the height at which two clusters are merged represents:
A) The number of observations in each cluster.
B) The distance (or dissimilarity) between the clusters.
C) The silhouette score.
D) The number of features used.
Answer: B
Explanation: The vertical line height indicates the linkage distance, i.e., how dissimilar the merged clusters are.
**Question 52.** Which distance metric is most appropriate for binary presence/absence data in clustering?
A) Euclidean distance
B) Manhattan distance
C) Jaccard similarity (converted to distance)
D) Cosine similarity
Answer: C
Explanation: Jaccard focuses on the proportion of shared “1”s relative to the union, making it suitable for binary sparse data.
**Question 53.** In a Gaussian Mixture Model (GMM), the Expectation-Maximization (EM) algorithm iteratively performs which two steps?
A) Gradient descent and backpropagation.
B) Sampling and bootstrapping.
C) E-step (estimate responsibilities) and M-step (update parameters).
D) Pruning and bagging.
Answer: C
Explanation: EM alternates between estimating the probability that each data point belongs to each component (E-step) and maximizing the likelihood by updating component parameters (M-step).
**Question 54.** When deploying a predictive model in production, which practice helps ensure that model inputs remain consistent with the training data schema?
A) Retraining the model daily without monitoring.
B) Implementing data validation and schema enforcement at the inference layer.
C) Ignoring missing values during prediction.
D) Using a different feature set for each prediction.
Answer: B
Explanation: Validating incoming data against the original schema catches schema mismatches before they reach the model and prevents runtime errors.
**Question 55.** Which of the following is a common cause of data leakage in predictive modeling?
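
The Jaccard measure from Question 52 above can be computed in a few lines. This is a minimal sketch with an illustrative function name and made-up presence/absence vectors; treating two all-zero vectors as distance 0 is a convention assumed here, since the ratio is undefined in that case.

```python
def jaccard_distance(a, b) -> float:
    """1 - |intersection| / |union| over the positions where either vector is 1."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    if either == 0:
        return 0.0  # assumed convention: two empty sets are treated as identical
    return 1 - both / either


# Hypothetical presence/absence vectors for two customers over five products.
customer_a = [1, 1, 0, 0, 1]
customer_b = [1, 0, 0, 1, 1]
print(jaccard_distance(customer_a, customer_b))  # 0.5

# Shared absences (0, 0) do not count toward similarity.
sparse_a = [1, 0, 0, 0, 0, 0, 0, 0]
sparse_b = [0, 1, 0, 0, 0, 0, 0, 0]
print(jaccard_distance(sparse_a, sparse_b))      # 1.0
```

The second pair agrees on six of eight positions, yet the distance is the maximum of 1.0 because the vectors share no "1"s. Ignoring joint absences in this way is what makes the measure suitable for sparse binary data.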