Predictive Analytics Practice Exam: Questions and Answers, Exams of Technology

This practice exam for a predictive analytics certificate program features 26 multiple-choice questions. It covers key concepts like statistical models, CRISP-DM, data preparation, hypothesis testing, linear algebra, and machine learning. Detailed answer explanations are provided, making it ideal for certification preparation or enhancing predictive analytics knowledge. Topics include data types, distributions, regression, decision trees, and model evaluation, offering a comprehensive review of essential principles. It tests and reinforces knowledge of concepts and methodologies, from basic statistics to advanced machine learning, providing a thorough assessment.

Typology: Exams

2025/2026

Available from 12/30/2025

shilpi-jain-1

Predictive Analytics Certificate Program
Practice Exam
**Question 1.** Which of the following most accurately defines predictive analytics?
A) Summarizing historical data to describe what happened
B) Using statistical models to forecast future events
C) Optimizing decisions based on simulation outcomes
D) Visualizing data trends for stakeholder communication
Answer: B
Explanation: Predictive analytics employs statistical and machine-learning models to estimate
future outcomes based on historical patterns.
**Question 2.** In the CRISP-DM methodology, which phase is primarily concerned with
assessing the quality of the data and handling missing values?
A) Business Understanding
B) Data Understanding
C) Data Preparation
D) Modeling
Answer: C
Explanation: Data Preparation focuses on cleaning, transforming, and preparing data for
modeling, including missing-value treatment.
**Question 3.** Which of the following is a continuous variable?
A) Customer gender (Male/Female)
B) Order status (Pending, Shipped, Delivered)
C) Daily sales revenue in dollars
D) Product category (Electronics, Clothing)




Answer: C
Explanation: Continuous variables can take any numeric value within a range; daily revenue is measured on a continuous scale.
**Question 4.** The mean of a normally distributed variable is 50 and the standard deviation is 5. Approximately what proportion of observations fall between 45 and 55?
A) 34%
B) 68%
C) 95%
D) 99.7%
Answer: B
Explanation: In a normal distribution, about 68% of values lie within ±1 standard deviation of the mean (45–55).
**Question 5.** Which hypothesis testing method is appropriate for comparing the means of three independent groups?
A) Paired t-test
B) One-way ANOVA
C) Chi-squared test
D) Two-sample proportion test
Answer: B
Explanation: One-way ANOVA evaluates whether there are statistically significant differences among three or more group means.
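The 68% figure in Question 4 can be checked numerically. The following is a minimal sketch using only Python's standard library, assuming a standard deviation of 5 (the value implied by 45–55 being ±1 standard deviation around a mean of 50). The helper name `proportion_between` is illustrative and not part of the exam.

```python
from statistics import NormalDist

# Assumed parameters: mean 50, standard deviation 5 (so 45-55 is +/- 1 SD).
dist = NormalDist(mu=50, sigma=5)


def proportion_between(d: NormalDist, low: float, high: float) -> float:
    """Probability that a draw from d falls between low and high."""
    return d.cdf(high) - d.cdf(low)


within_1sd = proportion_between(dist, 45, 55)  # +/- 1 SD
within_2sd = proportion_between(dist, 40, 60)  # +/- 2 SD
within_3sd = proportion_between(dist, 35, 65)  # +/- 3 SD

# Roughly 0.6827, 0.9545, 0.9973: the 68-95-99.7 rule.
print(f"{within_1sd:.4f} {within_2sd:.4f} {within_3sd:.4f}")
```

The three values line up with options B, C, and D of Question 4, which is why the width of the interval (one, two, or three standard deviations) decides the answer.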


D) Data have a linear trend over time
Answer: B
Explanation: Median imputation is robust to outliers, reducing distortion when extreme values are present.
**Question 9.** Which scaling technique transforms each feature to have a mean of 0 and a standard deviation of 1?
A) Min-Max scaling
B) Log transformation
C) Z-score standardization
D) Decimal scaling
Answer: C
Explanation: Z-score (standard score) standardizes features to zero mean and unit variance.
**Question 10.** Creating dummy variables from a categorical feature with three levels results in how many binary columns?
A) 1
B) 2
C) 3
D) 4
Answer: B
Explanation: For k categories, k-1 dummy variables are needed to avoid perfect multicollinearity.
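Questions 9 and 10 can both be illustrated in a few lines. This is a rough sketch in plain Python rather than any particular library's API. The function names `zscore` and `dummy_encode`, the choice of the population standard deviation, and the sample data are all assumptions made for illustration.

```python
from statistics import fmean, pstdev


def zscore(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    mu = fmean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def dummy_encode(values, drop_first=True):
    """Build one 0/1 column per level; optionally drop the first sorted level
    so it acts as the baseline (k levels -> k-1 columns)."""
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels
    return {level: [1 if v == level else 0 for v in values] for level in kept}


revenue = [120.0, 80.0, 100.0, 140.0, 60.0]  # made-up numbers
scaled = zscore(revenue)                     # mean ~0, std ~1

status = ["Pending", "Shipped", "Delivered", "Shipped"]  # three levels
dummies = dummy_encode(status)               # two columns; "Delivered" is the baseline
print(sorted(dummies))
```

With `drop_first=False` the three columns always sum to 1 in every row, which duplicates the intercept of a linear model. That is the perfect multicollinearity the explanation to Question 10 refers to.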


**Question 11.** Which plot is most appropriate for visualizing the distribution of a single continuous variable?
A) Scatter plot
B) Histogram
C) Bar chart
D) Pie chart
Answer: B
Explanation: Histograms display frequency counts across intervals, revealing the shape of a continuous distribution.
**Question 12.** In a correlation matrix, a value of -0.85 between two variables indicates:
A) Strong positive linear relationship
B) Weak negative linear relationship
C) Strong negative linear relationship
D) No linear relationship
Answer: C
Explanation: Correlation coefficients near -1 denote a strong inverse linear association.
**Question 13.** Simple Linear Regression assumes that the residuals are:
A) Heteroscedastic
B) Autocorrelated
C) Normally distributed with constant variance
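To make the correlation value in Question 12 concrete, the sketch below computes Pearson's r by hand for two made-up series. The function name `pearson_r` and the data are illustrative assumptions and do not come from the exam.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


price = [1, 2, 3, 4, 5]
demand = [10, 8, 6, 4, 2]        # exactly linear and decreasing
noisy_demand = [9, 9, 5, 6, 2]   # decreasing overall, with some noise

r_exact = pearson_r(price, demand)        # -1 up to floating-point rounding
r_noisy = pearson_r(price, noisy_demand)  # strongly negative, but above -1
print(round(r_exact, 3), round(r_noisy, 3))
```

The sign gives the direction of the relationship and the magnitude gives its strength, so a value such as -0.85 is read as "strong" and "negative" independently.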


**Question 16.** Ridge regression primarily addresses which issue in linear models?
A) High bias
B) Multicollinearity
C) Non-linear relationships
D) Categorical predictors
Answer: B
Explanation: Ridge adds an L2 penalty, shrinking coefficients and reducing variance caused by multicollinearity.
**Question 17.** Lasso regression differs from Ridge regression because Lasso can:
A) Increase model bias more than Ridge
B) Set some coefficients exactly to zero, performing variable selection
C) Only be applied to classification problems
D) Use a quadratic penalty term
Answer: B
Explanation: Lasso’s L1 penalty can zero out less important coefficients, effectively selecting features.
**Question 18.** Which of the following is a correct interpretation of an odds ratio (OR) of 2.5 for a binary predictor in logistic regression?
A) The predictor reduces the odds of the outcome by 2.5 times
B) The predictor increases the probability of the outcome by 2.5%
C) The odds of the outcome are 2.5 times higher when the predictor is present


D) The predictor has no effect on the outcome
Answer: C
Explanation: An OR > 1 indicates higher odds of the event when the predictor equals 1.
**Question 19.** In a decision tree, which impurity measure is based on the probability of misclassifying a randomly chosen element?
A) Gini impurity
B) Entropy
C) Mean Squared Error
D) Information Gain Ratio
Answer: A
Explanation: Gini impurity reflects the expected misclassification rate if a randomly selected element were labeled according to class proportions.
**Question 20.** Pruning a decision tree primarily helps to:
A) Increase the depth of the tree
B) Reduce overfitting and improve generalization
C) Convert the tree into a linear model
D) Add more splits for each leaf
Answer: B
Explanation: Pruning removes branches that provide little predictive power, lowering variance.
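The Gini impurity from Question 19 is short enough to compute directly. The sketch below uses the usual definition, 1 minus the sum of squared class proportions. The function name `gini_impurity` and the label lists are made up for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """1 - sum(p_k^2): the chance of mislabeling a random element when the
    label is drawn according to the class proportions in `labels`."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


pure = ["yes"] * 6                 # a single class: impurity 0
mixed = ["yes"] * 3 + ["no"] * 3   # 50/50 split: impurity 0.5 (two-class maximum)
skewed = ["yes"] * 5 + ["no"] * 1  # mostly one class: impurity 10/36

for name, node in [("pure", pure), ("mixed", mixed), ("skewed", skewed)]:
    print(name, round(gini_impurity(node), 4))
```

A tree-growing procedure based on this measure would prefer splits whose child nodes look more like `pure` than like `mixed`.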


Answer: C
Explanation: The RBF (Gaussian) kernel corresponds to an infinite-dimensional Hilbert space.
**Question 24.** In k-Nearest Neighbors classification, the choice of k primarily influences:
A) Model’s ability to capture non-linear relationships
B) The bias-variance trade-off, with larger k increasing bias and reducing variance
C) The depth of the decision boundary
D) The number of features used
Answer: B
Explanation: Larger k smooths the decision surface (higher bias, lower variance); smaller k does the opposite.
**Question 25.** Which of the following is NOT a valid method for evaluating a regression model’s performance?
A) Mean Squared Error (MSE)
B) R-squared
C) Confusion Matrix
D) Mean Absolute Error (MAE)
Answer: C
Explanation: A confusion matrix applies to classification, not regression.
**Question 26.** A model with high training accuracy but low validation accuracy is likely suffering from:
A) Underfitting


B) Overfitting
C) Data leakage
D) Class imbalance
Answer: B
Explanation: Overfitting occurs when a model captures noise in the training set, failing to generalize.
**Question 27.** In K-fold cross-validation with K = 5, each fold is used as the validation set how many times?
A) 1
B) 2
C) 4
D) 5
Answer: A
Explanation: Each of the 5 folds serves as the validation set exactly once, while the remaining 4 folds train the model; the procedure therefore runs 5 training rounds in total.
**Question 28.** Which metric is most appropriate when the cost of false negatives is much higher than false positives?
A) Accuracy
B) Precision
C) Recall (Sensitivity)
D) F1-Score
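The explanation to Question 27 says that each fold is held out once while the other folds train the model. A small index-splitting sketch shows this. It assumes contiguous folds with no shuffling, and the helper name `k_fold_indices` is hypothetical rather than a library function.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for contiguous K-fold splits."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop


kfold_splits = list(k_fold_indices(n_samples=10, k=5))

# Count how often each sample index lands in a validation fold.
val_counts = {}
for _, val_idx in kfold_splits:
    for i in val_idx:
        val_counts[i] = val_counts.get(i, 0) + 1

print(len(kfold_splits), set(val_counts.values()))
```

There are five rounds in total, but any given fold (and therefore any given sample) is validated in only one of them.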


B) Presence of autocorrelation
C) Stationarity of a series
D) Forecast accuracy
Answer: C
Explanation: ADF tests the null hypothesis that a unit root is present (non-stationary).
**Question 32.** Differencing a time series once is primarily intended to:
A) Remove seasonality
B) Make the series stationary by eliminating trend
C) Increase the series length
D) Smooth out random noise
Answer: B
Explanation: First-order differencing subtracts the previous observation, helping to eliminate trends.
**Question 33.** In an ARIMA(p,d,q) model, the ‘q’ component refers to:
A) Number of autoregressive terms
B) Number of differencing operations
C) Number of moving-average terms
D) Seasonal period
Answer: C
Explanation: ‘q’ is the order of the moving-average part, modeling past forecast errors.
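First-order differencing, as described in Question 32, can be sketched in a couple of lines. The series below are synthetic and noise-free so that the effect is easy to see. The function name `difference` is an illustrative choice.

```python
def difference(series, lag=1):
    """Subtract the observation `lag` steps earlier from each observation."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


# Synthetic series with a linear trend (slope 3) and no noise.
linear = [10 + 3 * t for t in range(8)]
linear_diff = difference(linear)          # constant: the trend is gone

# A quadratic trend needs two rounds of differencing (d = 2 in ARIMA terms).
quadratic = [t * t for t in range(6)]
quad_diff1 = difference(quadratic)        # still trending
quad_diff2 = difference(quad_diff1)       # constant

print(linear_diff, quad_diff1, quad_diff2)
```

Each round of differencing shortens the series by one observation, and the number of rounds applied corresponds to the `d` in ARIMA(p,d,q) from Question 33.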


**Question 34.** Which method is best suited for forecasting data with both trend and seasonal patterns?
A) Simple Moving Average
B) Holt’s Linear Trend method
C) Holt-Winters Triple Exponential Smoothing
D) Naïve Forecast
Answer: C
Explanation: Holt-Winters captures level, trend, and seasonality simultaneously.
**Question 35.** The Mean Absolute Percentage Error (MAPE) is problematic when actual values contain zeros because:
A) Division by zero is undefined, leading to infinite errors
B) It over-penalizes large errors
C) It becomes negative
D) It gives the same result as MAE
Answer: A
Explanation: MAPE involves dividing by the actual value; zero actuals cause undefined or infinite percentages.
**Question 36.** In clustering, the silhouette coefficient measures:
A) The distance between cluster centroids
B) The similarity of an object to its own cluster compared to other clusters
C) The probability that a point belongs to a cluster
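The division-by-zero issue from Question 35 is easy to reproduce. The sketch below is one possible way to write MAPE and MAE by hand. Raising an error on zero actuals is a design choice made here for illustration, and the data are invented.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Raises ZeroDivisionError if any actual value is zero, because the
    percentage error is undefined there.
    """
    if any(a == 0 for a in actual):
        raise ZeroDivisionError("MAPE is undefined when an actual value is zero")
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def mae(actual, forecast):
    """Mean absolute error, in the units of the data; zeros are not a problem."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


actual = [100, 200, 50]
forecast = [110, 180, 55]    # each forecast is off by 10% of the actual value

mape_value = mape(actual, forecast)   # about 10 (percent)
mae_value = mae(actual, forecast)     # (10 + 20 + 5) / 3

try:
    mape([0, 10], [1, 9])
    zero_actual_fails = False
except ZeroDivisionError:
    zero_actual_fails = True

print(round(mape_value, 2), round(mae_value, 2), zero_actual_fails)
```

When zeros are expected in the actual values, a scale-based metric such as MAE avoids the problem because it never divides by the actuals.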


**Question 39.** In association rule mining, the metric “lift” greater than 1 indicates:
A) The antecedent and consequent are independent
B) The rule has high confidence but low support
C) The occurrence of the antecedent increases the likelihood of the consequent
D) The rule is invalid
Answer: C
Explanation: Lift > 1 suggests a positive association beyond chance between antecedent and consequent.
**Question 40.** Which of the following best describes the “curse of dimensionality”?
A) Model performance always improves with more features
B) High-dimensional spaces cause distances between points to become less discriminative
C) Data become more sparse as sample size increases
D) Computation time decreases with more dimensions
Answer: B
Explanation: In high dimensions, all points tend to be similarly distant, making pattern detection harder.
**Question 41.** When applying a log transformation to a positively skewed variable, the primary effect is:
A) Increasing variance
B) Making the distribution more symmetric
C) Converting it to a categorical variable
D) Removing outliers


Answer: B
Explanation: Log transformation compresses large values, reducing skewness and approximating normality.
**Question 42.** Which evaluation metric is most appropriate for imbalanced binary classification where the minority class is the focus?
A) Overall accuracy
B) Macro-averaged F1-Score
C) ROC-AUC
D) Precision-Recall AUC
Answer: D
Explanation: Precision-Recall curves emphasize performance on the positive (minority) class and are more informative than ROC-AUC in severe imbalance.
**Question 43.** In a confusion matrix, the term “True Negative Rate” is synonymous with:
A) Specificity
B) Sensitivity
C) Precision
D) Recall
Answer: A
Explanation: True Negative Rate (TNR) measures the proportion of actual negatives correctly identified, i.e., specificity.
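The rates named in Questions 42 and 43 all come from the same four confusion-matrix counts. The sketch below uses invented counts for an imbalanced problem (50 positives, 950 negatives). The function name `confusion_rates` is illustrative.

```python
def confusion_rates(tp, fp, tn, fn):
    """Common rates derived from the four cells of a binary confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),   # recall, true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Invented counts: 50 actual positives, 950 actual negatives.
rates = confusion_rates(tp=30, fp=20, tn=930, fn=20)

for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

With these counts accuracy is 0.96 and specificity is about 0.979, yet only 60% of the positives are found and only 60% of the positive predictions are correct. This is the situation in which the explanation to Question 42 recommends looking at precision and recall rather than overall accuracy.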


D) Cosine similarity
Answer: C
Explanation: Hamming distance counts mismatches between binary strings, suiting dummy-encoded categorical data.
**Question 47.** Why is time-series cross-validation (rolling origin) preferred over random K-fold splits for temporal data?
A) It provides more folds
B) It respects temporal ordering, preventing future data from leaking into training
C) It reduces computational cost
D) It increases model variance
Answer: B
Explanation: Rolling origin maintains chronological integrity, essential for realistic forecasting evaluation.
**Question 48.** Which of the following is a common technique to detect multicollinearity before modeling?
A) Plotting residuals vs. fitted values
B) Calculating pairwise Pearson correlation coefficients
C) Performing a chi-squared test
D) Using the Kolmogorov-Smirnov test
Answer: B
Explanation: High pairwise correlations suggest potential multicollinearity among predictors.
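The rolling-origin scheme from Question 47 can be sketched as an index generator. This version assumes an expanding training window and a fixed forecast horizon. The function name `rolling_origin_splits` and its parameters are hypothetical, not taken from any library.

```python
def rolling_origin_splits(n_samples, initial_train, horizon=1):
    """Yield (train_idx, test_idx) pairs where training always precedes testing.

    The training window starts with `initial_train` observations and grows by
    `horizon` observations after every split (expanding window).
    """
    end = initial_train
    while end + horizon <= n_samples:
        yield list(range(end)), list(range(end, end + horizon))
        end += horizon


ts_splits = list(rolling_origin_splits(n_samples=8, initial_train=4, horizon=2))

for train_idx, test_idx in ts_splits:
    print("train:", train_idx, "test:", test_idx)
```

Every test index is later than every training index in the same split, so no future observation can leak into training. A shuffled K-fold split offers no such guarantee.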


**Question 49.** The “bias-variance trade-off” states that reducing bias typically leads to:
A) Increased variance
B) Decreased variance
C) No change in variance
D) Better interpretability
Answer: A
Explanation: Simplifying a model (high bias) reduces variance, while complex models (low bias) increase variance; they move inversely.
**Question 50.** In a logistic regression model, the log-likelihood function is maximized using:
A) Ordinary Least Squares
B) Gradient Descent or Newton-Raphson methods
C) K-means clustering
D) Principal Component Analysis
Answer: B
Explanation: Logistic regression uses iterative optimization (e.g., Newton-Raphson) to maximize the log-likelihood.
**Question 51.** Which of the following best describes “bagging” in ensemble learning?
A) Sequentially correcting errors of previous models
B) Training multiple models on different random subsets of data and aggregating predictions
C) Combining models with different algorithms into a single meta-learner