Exam PA Predictive Analytics Practice Exam, Exams of Technology

Technology

Exam PA covers predictive modeling techniques used in actuarial science. It includes machine learning algorithms, data analysis, and statistical methods used to predict future trends, customer behavior, and financial risks. Candidates should demonstrate proficiency in applying analytics to real-world actuarial problems.

Typology: Exams

2025/2026

Available from 12/26/2025

shilpi-jain-1 🇮🇳

Exam PA Predictive Analytics Practice Exam

**Question 1.** Which data type best describes free‑text comments collected from policyholders?

A) Structured

B) Semi‑structured

C) Unstructured

D) Quantitative

**Answer:** C

**Explanation:** Free‑text comments lack a predefined schema and are considered unstructured data.

**Question 2.** In a relational database, which language is primarily used to retrieve data?

A) HTML

B) SQL

C) NoSQL

D) XML

**Answer:** B

**Explanation:** Structured Query Language (SQL) is the standard for querying relational databases.

**Question 3.** When modeling claim frequency, which distribution is most appropriate if the variance exceeds the mean?

A) Poisson

B) Normal

C) Negative Binomial

D) Exponential

**Answer:** C


**Explanation:** The Negative Binomial distribution handles over‑dispersion (variance > mean) common in claim counts.

**Question 4.** Which imputation method preserves the original distribution of a numeric variable the best?

A) Mean imputation

B) Median imputation

C) Hot‑deck imputation

D) Mode imputation

**Answer:** C

**Explanation:** Hot‑deck draws observed values from similar records, maintaining the variable’s distribution.

**Question 5.** A Z‑score of 4.2 for a claim amount indicates:

A) A typical observation

B) A mild outlier

C) A severe outlier

D) Missing data

**Answer:** C

**Explanation:** Z‑scores beyond ±3 are commonly treated as extreme outliers.

**Question 6.** Which transformation would you apply to a positively skewed severity variable to approximate normality?

A) Log transformation

B) Square root transformation

C) Inverse transformation

[…]

C) Non‑linear relationship

D) Multicollinearity

**Answer:** C

**Explanation:** Curvature indicates a non‑linear association between the two variables.

**Question 10.** Which of the following is a systematic component of a GLM?

A) Distribution family

B) Link function

C) Linear predictor

D) Deviance

**Answer:** C

**Explanation:** The systematic component is the linear predictor (β0 + β1X1 + …).

**Question 11.** The canonical link for a Poisson distribution is:

A) Identity

B) Log

C) Inverse

D) Probit

**Answer:** B

**Explanation:** The log link is the canonical link for the Poisson family.

**Question 12.** In a GLM for claim severity using a Gamma distribution, which link is most commonly employed?

A) Log

B) Identity

C) Inverse

D) Probit

**Answer:** A

**Explanation:** The log link ensures positive fitted values for Gamma‑distributed responses.

**Question 13.** When interpreting a GLM coefficient for a categorical factor, the coefficient represents:

A) Absolute risk

B) Relative risk compared to the reference level

C) Probability of the event

D) Standard deviation change

**Answer:** B

**Explanation:** With a log link, coefficients for factor levels are log‑relativities measured against the base level; exponentiating a coefficient gives the multiplicative relativity.

**Question 14.** Which statistic is used to compare nested GLMs?

A) R‑squared

B) AIC

C) Deviance difference (likelihood ratio test)

D) BIC

**Answer:** C

**Explanation:** The likelihood ratio test (deviance difference) assesses whether the more complex model improves fit.

**Question 15.** A model with a very low training R² but high test R² likely suffers from:

A) Overfitting
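The log‑relativity reading from Question 13 can be illustrated with a short sketch. The coefficient values and the region labels below are made up for illustration and do not come from any fitted model; the only point is that, under a log link, exponentiating a factor coefficient gives a multiplier relative to the base level.

```python
import math

# Hypothetical log-link GLM coefficients (illustrative values only).
intercept = -2.0                                  # base level is region "A"
region_coef = {"A": 0.0, "B": 0.25, "C": -0.10}   # assumed factor coefficients

def relativity(level):
    """Multiplicative factor relative to the base level: exp(beta)."""
    return math.exp(region_coef[level])

def predicted_mean(level):
    """Inverse of the log link: mu = exp(linear predictor)."""
    return math.exp(intercept + region_coef[level])

for level in region_coef:
    print(level, round(relativity(level), 3), round(predicted_mean(level), 4))
```

The ratio of two predicted means equals the relativity, which is why these coefficients are usually reported as multipliers rather than absolute effects.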

[…]

B) Bagging multiple trees on bootstrapped samples

C) Pruning deep trees

D) Using a single deep tree

**Answer:** B

**Explanation:** Bagging (bootstrap aggregating) creates many trees and averages predictions, lowering variance.

**Question 19.** Gradient Boosting differs from Random Forest because it:

A) Grows trees in parallel

B) Fits each new tree to the residuals of the previous ensemble

C) Uses only categorical variables

D) Does not require hyper‑parameter tuning

**Answer:** B

**Explanation:** Boosting sequentially fits trees to correct previous errors, reducing bias.

**Question 20.** Which activation function suffers from vanishing gradients for large negative inputs?

A) ReLU

B) Sigmoid

C) Leaky ReLU

D) Softmax

**Answer:** B

**Explanation:** The sigmoid saturates at 0 for large negative values, causing vanishing gradients.

**Question 21.** In logistic regression, the odds ratio for a predictor is obtained by:

A) Exponentiating the coefficient

B) Squaring the coefficient

C) Taking the natural log of the coefficient

D) Dividing the coefficient by its standard error

**Answer:** A

**Explanation:** e^(β) converts the log‑odds coefficient to an odds ratio.

**Question 22.** Support Vector Machines find a hyperplane that maximizes:

A) The number of support vectors

B) The margin between classes

C) The classification error

D) The depth of the tree

**Answer:** B

**Explanation:** SVMs aim to maximize the distance (margin) between the closest points of each class.

**Question 23.** Which data split ratio is most commonly used for training, validation, and testing in predictive modeling?

A) 60/20/20

B) 70/15/15

C) 80/10/10

D) 50/25/25

**Answer:** C

**Explanation:** An 80/10/10 split provides ample training data while reserving separate sets for tuning and final evaluation.
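A minimal sketch of the 80/10/10 split from Question 23, using only the Python standard library. The function name, the seed, and the integer stand‑ins for policy records are assumptions made for illustration; in practice a library helper or a time‑based split might be preferred.

```python
import random

def train_val_test_split(records, percents=(80, 10, 10), seed=42):
    """Shuffle a copy of records and cut it into three consecutive parts."""
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # reproducible shuffle
    n = len(shuffled)
    n_train = n * percents[0] // 100
    n_val = n * percents[1] // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]       # remainder goes to the test set
    return train, val, test

policies = list(range(1000))                # stand-in for policy records
train, val, test = train_val_test_split(policies)
print(len(train), len(val), len(test))      # 800 100 100
```

The validation part is used for tuning and the test part is touched only once, for the final evaluation.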

[…]

**Question 27.** In a confusion matrix, which cell represents false negatives?

A) TP

B) FP

C) FN

D) TN

**Answer:** C

**Explanation:** False negatives are actual positives incorrectly predicted as negatives.

**Question 28.** Which metric combines precision and recall into a single harmonic mean?

A) Accuracy

B) F1 Score

C) ROC‑AUC

D) Specificity

**Answer:** B

**Explanation:** The F1 score is the harmonic mean of precision and recall.

**Question 29.** The ROC curve plots:

A) Recall vs. Precision

B) Sensitivity vs. 1 – Specificity

C) Accuracy vs. Threshold

D) True Positive Rate vs. False Positive Rate

**Answer:** D (True Positive Rate vs. False Positive Rate)

**Explanation:** ROC displays TPR (sensitivity) against FPR (1 – specificity) across thresholds. Option B names the same two axes in different terms, so it describes the same plot.

**Question 30.** A model with high variance but low bias typically:

A) Underfits the data

B) Overfits the data

C) Has perfect generalization

D) Is unaffected by regularization

**Answer:** B

**Explanation:** High variance indicates sensitivity to training data, leading to overfitting.

**Question 31.** Which technique can directly address model drift in production?

A) Increasing the number of trees

B) Periodic recalibration using recent data

C) Using a deeper neural network

D) Removing all categorical variables

**Answer:** B

**Explanation:** Monitoring and updating the model with fresh data mitigates drift.

**Question 32.** In A/B testing of a new pricing model, the null hypothesis usually states that:

A) The new model yields higher profit

B) The new model yields lower loss

C) There is no difference in the key metric between control and treatment groups

D) The treatment group will have more outliers

**Answer:** C

**Explanation:** The default null hypothesis asserts no difference between the two groups.

**Question 33.** SHAP values provide:

A) Global feature importance only

[…]

**Question 36.** Under GDPR, which principle requires that personal data be processed only for a specific, explicit purpose?

A) Data minimization

B) Purpose limitation

C) Right to be forgotten

D) Data portability

**Answer:** B

**Explanation:** Purpose limitation mandates that data be collected for clearly defined purposes.

**Question 37.** Proxy variables can introduce bias when:

A) They are highly correlated with the target

B) They are unrelated to the protected attribute

C) They serve as indirect measures of a protected characteristic

D) They have low variance

**Answer:** C

**Explanation:** Proxy variables can encode protected attributes indirectly, leading to unfair discrimination.

**Question 38.** Actuaries are primarily responsible for which of the following when deploying predictive models?

A) Writing production code in Java

B) Validating model assumptions and documenting limitations

C) Designing user interfaces

D) Managing the IT infrastructure

**Answer:** B

**Explanation:** Actuaries ensure models are statistically sound, appropriately documented, and used responsibly.

**Question 39.** Which of the following best describes “granularity” in data?

A) The number of missing values

B) The level of detail (e.g., transaction vs. entity)

C) The proportion of categorical variables

D) The size of the dataset in megabytes

**Answer:** B

**Explanation:** Granularity refers to the detail level of data records.

**Question 40.** A “hot‑deck” imputation method draws replacement values from:

A) A statistical distribution fitted to the data

B) The same variable’s mean

C) Similar records (donor pool) within the dataset

D) Random numbers generated by a seed

**Answer:** C

**Explanation:** Hot‑deck selects observed values from “donor” records that are similar to the missing case.

**Question 41.** In a time‑series claim dataset, the “day of week” feature is an example of:

A) Interaction term

B) Derived variable

C) Target variable

[…]

C) Recall

D) Specificity

**Answer:** C

**Explanation:** Recall (sensitivity) captures the ability to detect actual frauds, that is, to reduce false negatives.

**Question 45.** In a GLM with a log link, the predicted mean μ̂ is obtained by:

A) μ̂ = β0 + β1X1

B) μ̂ = exp(β0 + β1X1)

C) μ̂ = log(β0 + β1X1)

D) μ̂ = 1 / (β0 + β1X1)

**Answer:** B

**Explanation:** The inverse of the log link is the exponential function.

**Question 46.** Which regularization technique can both shrink coefficients and perform variable selection?

A) Ridge (L2)

B) Lasso (L1)

C) Elastic Net (L1 + L2)

D) Dropout

**Answer:** C

**Explanation:** Elastic Net combines L1 and L2 penalties, enabling shrinkage and selection. Lasso (option B) also shrinks and selects through its L1 penalty alone; Ridge shrinks coefficients but does not set them to zero.

**Question 47.** In a Random Forest, the “out‑of‑bag” (OOB) error estimate is derived from:

A) The validation set

B) Observations not used in building each individual tree

C) The training data after pruning

D) The test set predictions

**Answer:** B

**Explanation:** The OOB estimate scores each tree on the training observations left out of that tree’s bootstrap sample.

**Question 48.** The “learning rate” (η) in gradient boosting controls:

A) The maximum depth of each tree

B) The proportion of features sampled at each split

C) The contribution of each new tree to the ensemble

D) The number of trees grown

**Answer:** C

**Explanation:** η scales the incremental updates added by each tree.

**Question 49.** Which loss function is appropriate for a binary classification neural network?

A) Mean Squared Error

B) Huber loss

C) Binary Cross‑Entropy

D) Poisson deviance

**Answer:** C

**Explanation:** Binary cross‑entropy measures the distance between predicted probabilities and actual binary outcomes.

**Question 50.** In logistic regression, a predictor with an odds ratio of 0.70 indicates:

A) The predictor increases odds by 30%

B) The predictor reduces odds by 30%
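The binary cross‑entropy named in Question 49 can be written out in a few lines. This is a sketch under simple assumptions: labels are 0 or 1, predictions are probabilities, and the clipping constant `eps` is an arbitrary choice to keep the logarithm finite. The example labels and probabilities are invented.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

labels = [1, 0, 1, 0]
mostly_right = [0.9, 0.1, 0.8, 0.2]   # high probability on the true class
mostly_wrong = [0.1, 0.9, 0.2, 0.8]   # high probability on the wrong class
print(binary_cross_entropy(labels, mostly_right))
print(binary_cross_entropy(labels, mostly_wrong))
```

The loss is small when the predicted probability of the true class is high and grows quickly for confident wrong predictions, which is the behavior wanted when training a classifier.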

[…]

B) SHAP values (global aggregation)

C) Decision rules extraction

D) Partial Dependence Plots

**Answer:** D

**Explanation:** Partial dependence plots summarize the average effect of a feature across the entire model.

**Question 54.** When deploying a model as a micro‑service, which container technology is most commonly used?

A) Docker

B) Hadoop

C) Spark

D) Kafka

**Answer:** A

**Explanation:** Docker packages applications with dependencies for easy deployment.

**Question 55.** Which metric is most appropriate for evaluating a highly imbalanced classification problem where the minority class is the focus?

A) Accuracy

B) ROC‑AUC

C) Precision‑Recall AUC

D) Mean Squared Error

**Answer:** C

**Explanation:** PR‑AUC captures performance on the positive (minority) class better than ROC‑AUC when classes are imbalanced.
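To make the contrast in Question 55 concrete, the sketch below computes ROC‑AUC and average precision (one common estimate of PR‑AUC) by hand on a small invented dataset with 2 positives among 20 records. The scores and labels are assumptions chosen so that the positives rank third and sixth; both helper functions are illustrative rather than a reference implementation.

```python
def roc_auc(labels, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(labels, scores):
    """Mean of the precision at each positive's rank (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), reverse=True)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

scores = [1 - i / 20 for i in range(20)]             # strictly decreasing scores
labels = [1 if i in (2, 5) else 0 for i in range(20)]  # positives at ranks 3 and 6
print(round(roc_auc(labels, scores), 3))             # 0.833
print(round(average_precision(labels, scores), 3))   # 0.333
```

The ranking looks strong by ROC‑AUC, yet only about one in three of the top‑ranked records flagged along the way is a true positive, which is the gap the precision‑recall view exposes.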

**Question 56.** In a GLM, the dispersion parameter φ is:

A) Always equal to 1 for all families

B) Estimated separately for the Gaussian family only

C) Fixed for Poisson and Binomial families

D) Used to scale the variance function in the Gamma family

**Answer:** D

**Explanation:** In the Gamma family, variance = φ · μ², so φ scales the variance function and must be estimated. φ is fixed at 1 for the Poisson and Binomial families, so option C is also a true statement.

**Question 57.** Which of the following best describes “model interpretability” in actuarial modeling?

A) Ability to run the model on any hardware

B) Ability to trace predictions back to input features and understand their impact

C) Ability to achieve the lowest possible error metric

D) Ability to handle big data volumes

**Answer:** B

**Explanation:** Interpretability concerns understanding how inputs drive outputs.

**Question 58.** The “bias‑variance trade‑off” states that:

A) Increasing model complexity always reduces both bias and variance

B) Reducing bias inevitably increases variance, and vice versa

C) Bias and variance are unrelated

D) Only bias matters for predictive performance

**Answer:** B

**Explanation:** Increasing model complexity reduces bias but typically raises variance.
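The Gamma variance relationship from Question 56, variance = φ · μ², can be checked by simulation. The sketch below assumes the standard shape/scale parameterization of `random.gammavariate`, with shape = 1/φ and scale = μ · φ; the sample size, seed, and the two example means are arbitrary choices.

```python
import random
import statistics

def simulate_gamma(mu, phi, n, seed=0):
    """Draw n Gamma values with mean mu and variance phi * mu**2."""
    rng = random.Random(seed)
    shape = 1.0 / phi      # alpha
    scale = mu * phi       # beta, chosen so that shape * scale == mu
    return [rng.gammavariate(shape, scale) for _ in range(n)]

phi = 0.5
for mu in (100.0, 1000.0):
    draws = simulate_gamma(mu, phi, n=20_000)
    m = statistics.fmean(draws)
    v = statistics.pvariance(draws)
    print(mu, round(v / m**2, 2))   # ratio should sit near phi for both means
```

The variance grows with the square of the mean while the ratio stays close to φ, which is why φ has to be estimated for the Gamma family instead of being fixed in advance.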
