AIDSAI Certified Machine Learning CMLP Exam, Exams of Technology

The CMLP exam certifies core competencies in machine learning. It covers supervised and unsupervised learning, model evaluation, feature selection, overfitting prevention, and deployment fundamentals. Candidates demonstrate practical understanding of building, testing, and maintaining machine learning solutions across diverse application domains.

Typology: Exams

2025/2026

Available from 01/21/2026

shilpi-jain-2

AIDSAI Certified Machine Learning CMLP
Exam
**Question 1.** Which data ingestion method is most appropriate for processing millions of sensor readings per second with minimal latency?
A) Batch ETL scheduled nightly
B) Micro-batch Spark Streaming
C) Real-time Kafka streaming
D) Manual CSV upload
Answer: C
Explanation: Real-time Kafka streaming can ingest high-velocity data continuously with low latency, suitable for millions of sensor events per second, unlike batch or micro-batch approaches.
**Question 2.** In a data lake architecture for machine learning, which characteristic best differentiates it from a traditional data warehouse?
A) Strict schema enforcement at write time
B) Storage of raw, unstructured data alongside structured data
C) Use of OLAP cubes for fast analytics
D) Mandatory ACID transactions for all operations
Answer: B
Explanation: Data lakes store raw, unstructured, semi-structured, and structured data without requiring a predefined schema, whereas warehouses enforce schemas and are optimized for structured queries.
**Question 3.** When connecting to a semi-structured data source containing nested JSON objects, which Spark feature is most useful for flattening the hierarchy for model training?
A) DataFrame.persist()
B) spark.read.format("csv")
C) spark.sql.functions.explode()
D) broadcast joins
Answer: C
Explanation: The explode() function expands an array or map column into one row per element, which flattens nested JSON collections for downstream processing.
**Question 4.** Which ETL design pattern ensures that each transformation step can be independently re-executed without re-processing the entire pipeline?
A) Monolithic pipeline
B) Lambda architecture
C) Incremental (delta) processing with checkpoints
D) Full reload on each run
Answer: C
Explanation: Incremental processing with checkpoints records the state after each step, allowing isolated re-execution of only the affected stages.
**Question 5.** In data cleaning, which statistical technique is most appropriate for imputing missing values in a normally distributed numeric feature?
A) Median imputation
B) Mode imputation
C) Mean imputation

Explanation: Capturing data version IDs and lineage ensures that the exact inputs can be retrieved, making experiments reproducible.
**Question 8.** Which storage solution offers the lowest latency for serving feature vectors to an online inference service?
A) Amazon S3 object storage
B) HDFS distributed file system
C) In-memory key-value store (e.g., Redis)
D) Cold archival tape storage
Answer: C
Explanation: In-memory stores like Redis provide sub-millisecond latency, ideal for real-time feature retrieval, unlike disk-based or archival systems.
**Question 9.** Which encryption method protects data both at rest and in transit for a cloud-based ML data repository?
A) AES-256 for storage and TLS 1.2 for network traffic
B) MD5 hashing for files
C) Base64 encoding of data
D) Plaintext storage with VPN
Answer: A
Explanation: AES-256 encrypts data at rest, while TLS 1.2 secures data in transit; both together provide comprehensive protection.
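
The in-transit half of Question 9 can be sketched with Python's standard `ssl` module: a client context that refuses any protocol version older than TLS 1.2. This is a minimal illustration, not a complete security configuration. The helper name `make_tls12_context` is hypothetical, and at-rest AES-256 encryption is not shown because it is not part of the standard library and is typically handled by the storage service or a dedicated cryptography library.

```python
import ssl


def make_tls12_context() -> ssl.SSLContext:
    """Return a client-side context that rejects protocols below TLS 1.2.

    Hypothetical helper for illustration only.
    """
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Refuse SSLv3, TLS 1.0, and TLS 1.1 during the handshake.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


context = make_tls12_context()
print(context.minimum_version.name)
```

A context like this would be passed to whatever client opens the connection to the data repository, so that the protocol floor is enforced in one place.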

**Question 10.** In IAM policy design, which principle minimizes the risk of unauthorized data access?
A) Granting admin rights to all users
B) Using role-based access control with least privilege
C) Sharing credentials across teams
D) Disabling multi-factor authentication
Answer: B
Explanation: Role-based access with least privilege ensures users receive only the permissions necessary for their tasks, reducing exposure.
**Question 11.** Which scaling technique transforms features to a range of 0 to 1 and is sensitive to outliers?
A) Standardization (Z-score)
B) Min-Max scaling
C) Robust scaling
D) Log transformation
Answer: B
Explanation: Min-Max scaling maps values linearly to [0,1]; extreme outliers can compress the majority of data into a narrow range.
**Question 12.** When a feature exhibits a heavy-tailed distribution, which scaling method is most robust?
A) Min-Max scaling
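
The outlier sensitivity described in Question 11 can be seen on a toy feature. The sketch below compares Min-Max scaling with a median/IQR ("robust") scaling written in plain Python. The data values and function names are made up for illustration, and the quartiles use the `inclusive` method of `statistics.quantiles`, which is one of several reasonable conventions.

```python
import statistics


def min_max_scale(values):
    """Map values linearly onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def robust_scale(values):
    """Center on the median and divide by the interquartile range."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return [(v - median) / iqr for v in values]


# Five ordinary readings and one extreme outlier (illustrative data).
data = [10, 12, 11, 13, 12, 1000]

mm = min_max_scale(data)
rb = robust_scale(data)

# After Min-Max scaling the ordinary readings are squeezed close to 0.
print([round(v, 4) for v in mm])
# After robust scaling they keep a usable spread around 0.
print([round(v, 2) for v in rb])
```

With Min-Max scaling the single outlier defines the upper end of the range, so every other value ends up almost indistinguishable from zero. The median and IQR barely move when the outlier is added, which is why the robust variant preserves the spread of the bulk of the data.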

Answer: C
Explanation: Target encoding replaces each category with the mean target value for that category, reducing dimensionality while preserving predictive signal for high-cardinality features.
**Question 15.** Which encoding method can unintentionally introduce ordinal relationships where none exist?
A) One-hot encoding
B) Label encoding
C) Frequency encoding
D) Hashing trick
Answer: B
Explanation: Label encoding assigns integer values, implying order, which may mislead models that interpret numeric magnitude as ordinal.
**Question 16.** When creating polynomial features of degree 3 from two original features x₁ and x₂, how many interaction terms (products involving both features) are generated?
A) 3
B) 4
C) 5
D) 6
Answer: A
Explanation: A degree-3 expansion of x₁ and x₂ adds seven terms beyond the original features: x₁², x₁x₂, x₂², x₁³, x₁²x₂, x₁x₂², and x₂³. Only the three that multiply both features (x₁x₂, x₁²x₂, x₁x₂²) are interaction terms; the remaining four are pure powers.
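
The term count in Question 16 can be checked by enumerating the monomials directly. The sketch below uses only the standard library and represents each monomial as a tuple of feature names; the function name `polynomial_terms` is an illustrative choice, not a library API, and the constant (bias) term is deliberately left out.

```python
from itertools import combinations_with_replacement


def polynomial_terms(features, degree):
    """Return all monomials of total degree 1..degree as tuples of names."""
    terms = []
    for d in range(1, degree + 1):
        # ("x1", "x1", "x2") stands for the monomial x1^2 * x2.
        terms.extend(combinations_with_replacement(features, d))
    return terms


terms = polynomial_terms(["x1", "x2"], degree=3)

# Terms of degree 2 or 3, i.e. everything beyond the original x1 and x2.
higher_order = [t for t in terms if len(t) > 1]
# Terms that multiply both features together.
interactions = [t for t in terms if len(set(t)) > 1]

print(len(terms), len(higher_order), len(interactions))  # prints: 9 7 3
```

Library implementations may also emit a constant column or offer an interactions-only mode that restricts the output differently, so the count reported by a given tool depends on its settings.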

**Question 17.** In text preprocessing, which step is essential for reducing dimensionality while preserving semantic meaning in bag-of-words models?
A) Stemming or lemmatization
B) Adding HTML tags
C) Converting to uppercase
D) Random word shuffling
Answer: A
Explanation: Stemming/lemmatization reduces word variants to a common root, decreasing the vocabulary size without losing core meaning.
**Question 18.** Which vectorization technique captures term importance across documents by weighting term frequency with inverse document frequency?
A) CountVectorizer
B) One-hot encoding
C) TF-IDF
D) Word2Vec
Answer: C
Explanation: TF-IDF multiplies term frequency by inverse document frequency, emphasizing words that are frequent in a document but rare across the corpus.
**Question 19.** Which dimensionality reduction algorithm is non-linear and preserves local neighborhood structure, often used for visualizing high-dimensional data?
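
The TF-IDF weighting from Question 18 can be sketched in a few lines of plain Python. This uses the textbook, unsmoothed definition (relative term frequency times log(N / document frequency)); libraries often apply smoothing or normalization on top of it, so their numbers will differ. The three sample documents and the function name `tf_idf` are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document (unsmoothed TF-IDF)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


docs = [
    "the model overfits the data",
    "the model generalizes",
    "the data pipeline",
]
scores = tf_idf(docs)

# "the" occurs in every document, so its weight collapses to zero,
# while "overfits" is unique to the first document and scores highest there.
for term, score in sorted(scores[0].items(), key=lambda kv: -kv[1]):
    print(term, round(score, 3))
```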

D) Random forest importance
Answer: C
Explanation: Recursive feature elimination (RFE) iteratively fits a model, eliminates the weakest features, and repeats until the desired number of features remains.
**Question 22.** In a correlation matrix, a pair of features with a Pearson coefficient of 0.95 suggests what action?
A) Keep both; they add unique information
B) Remove one to reduce multicollinearity
C) Transform both using log scaling
D) Increase regularization
Answer: B
Explanation: A coefficient of 0.95 indicates strong linear dependence; retaining both can cause multicollinearity, so dropping one is advisable.
**Question 23.** Which regression algorithm adds an L2 penalty to the loss function to shrink coefficients?
A) Linear Regression
B) Ridge Regression
C) Lasso Regression
D) Elastic Net
Answer: B
Explanation: Ridge regression incorporates an L2 regularization term, penalizing large coefficients and reducing overfitting.
**Question 24.** Which linear model is capable of performing both L1 and L2 regularization simultaneously?
A) Ridge
B) Lasso
C) Elastic Net
D) Bayesian Regression
Answer: C
Explanation: Elastic Net combines L1 (lasso) and L2 (ridge) penalties, offering a balance between feature selection and coefficient shrinkage.
**Question 25.** In logistic regression, what does the sigmoid activation function output?
A) Class label directly
B) Probability between 0 and 1
C) Log-odds
D) Decision tree leaf
Answer: B
Explanation: The sigmoid function maps any real-valued input to a value in (0,1), interpreted as the probability of the positive class.
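
The sigmoid mapping in Question 25 can be sketched as follows. The linear score z plays the role of the log-odds, and the sigmoid turns it into a probability. The weights, bias, and input below are arbitrary illustrative numbers, not values from the exam, and `predict_proba` is a hypothetical helper name.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, x):
    """Probability of the positive class for one sample."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))  # log-odds
    return sigmoid(z)


# Illustrative (made-up) coefficients and input.
weights, bias = [0.8, -1.2], 0.1
p = predict_proba(weights, bias, [2.0, 0.5])
print(round(p, 3))

# Thresholding the probability, commonly at 0.5, yields the class label.
label = int(p >= 0.5)
```

The model itself outputs a probability; the class label in option A only appears after a decision threshold is applied.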

B) Gradient Boosted Trees
C) Bagging
D) Stacking
Answer: B
Explanation: Gradient Boosting adds trees iteratively, each trained on the residual errors of the combined previous trees.
**Question 29.** Which hyperparameter in XGBoost controls the depth of each tree and thus influences model complexity?
A) learning_rate
B) max_depth
C) n_estimators
D) subsample
Answer: B
Explanation: max_depth sets the maximum depth of each decision tree, directly affecting the capacity and risk of overfitting.
**Question 30.** In K-Means clustering, what does the “elbow method” help determine?
A) Optimal number of clusters by locating a point where within-cluster sum of squares decreases slowly
B) Distance metric selection
C) Initialization seed
D) Data scaling technique
Answer: A
Explanation: The elbow method plots SSE vs. k; the “elbow” point indicates diminishing returns, suggesting a suitable k.
**Question 31.** Which clustering algorithm can discover arbitrarily shaped clusters and does not require specifying the number of clusters a priori?
A) K-Means
B) Hierarchical Agglomerative Clustering
C) DBSCAN
D) Gaussian Mixture Models
Answer: C
Explanation: DBSCAN groups points based on density, handling irregular shapes and automatically determining the number of clusters.
**Question 32.** In anomaly detection for credit-card fraud, which unsupervised technique models normal behavior and flags deviations?
A) One-Class SVM
B) Logistic Regression
C) Decision Tree
D) Naïve Bayes
Answer: A
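
The elbow method from Question 30 can be illustrated with a small one-dimensional K-Means written in plain Python. This is a sketch under simplifying assumptions: the data are made-up readings forming three tight groups, the centers are initialized deterministically from the sorted data rather than randomly, and the function name `kmeans_1d_sse` is hypothetical.

```python
def kmeans_1d_sse(points, k, max_iters=100):
    """Run Lloyd's algorithm on 1-D data and return the within-cluster SSE."""
    pts = sorted(points)
    n = len(pts)
    # Deterministic start: one center from each evenly spaced slice of the data.
    centers = [pts[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(max_iters):
        clusters = [[] for _ in range(k)]
        for p in pts:
            nearest = min(range(k), key=lambda i: (p - centers[i]) ** 2)
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        updated = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return sum(min((p - c) ** 2 for c in centers) for p in pts)


# Three tight groups around 1, 5, and 9 (illustrative data).
readings = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1, 8.9, 9.0, 9.1]
sse_by_k = {k: kmeans_1d_sse(readings, k) for k in range(1, 6)}

for k, sse in sse_by_k.items():
    print(k, round(sse, 3))
```

On this data the SSE falls sharply up to k = 3 and only marginally afterwards, so the elbow sits at three clusters, matching how the data were constructed.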

**Question 35.** Which activation function is preferred for hidden layers in deep networks because it avoids saturation for positive inputs?
A) Sigmoid
B) Tanh
C) ReLU
D) Softmax
Answer: C
Explanation: ReLU outputs zero for negative inputs and identity for positive, providing sparse activation and reducing saturation.
**Question 36.** Which optimizer adapts learning rates per parameter based on first and second moments of gradients?
A) Stochastic Gradient Descent (SGD)
B) AdaGrad
C) RMSprop
D) Adam
Answer: D
Explanation: Adam combines momentum (first moment) and RMSprop-like scaling (second moment) for adaptive learning rates.
**Question 37.** When performing hyperparameter tuning with Grid Search, what is a major drawback compared to Bayesian Optimization?
A) Requires prior knowledge of the search space
B) Explores fewer configurations
C) Computationally expensive due to exhaustive enumeration
D) Cannot handle categorical parameters
Answer: C
Explanation: Grid Search evaluates every combination, leading to high computational cost, whereas Bayesian Optimization intelligently selects promising points.
**Question 38.** Which regularization technique randomly disables a proportion of neurons during each training iteration?
A) L1 regularization
B) L2 regularization
C) Dropout
D) Early stopping
Answer: C
Explanation: Dropout sets a random subset of activations to zero, preventing co-adaptation and reducing overfitting.
**Question 39.** In K-Fold cross-validation, how many distinct models are trained when K=5?
A) 1
B) 3
C) 5
D) 10
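
The K-Fold procedure in Question 39 can be sketched in plain Python. Each fold serves once as the validation set while the remaining folds form the training set, so one model is fit per fold. The sketch below uses contiguous, unshuffled folds on ten hypothetical samples; the function name `k_fold_indices` is illustrative, and real pipelines usually shuffle or stratify first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, validation
        start = stop


folds = list(k_fold_indices(n_samples=10, k=5))

for fold_number, (train, validation) in enumerate(folds, start=1):
    # A separate model would be trained on `train` and scored on `validation`.
    print(fold_number, train, validation)
```

The number of train/validation splits, and therefore the number of model fits, equals K.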

**Question 42.** Which classification metric is most informative when the positive class is rare and false negatives are costly?
A) Accuracy
B) Precision
C) Recall
D) F1-Score
Answer: C
Explanation: Recall measures the proportion of actual positives correctly identified, crucial when missing positives is expensive.
**Question 43.** In a binary classifier, a ROC-AUC of 0.5 indicates what level of performance?
A) Perfect discrimination
B) Good discrimination
C) No discriminative ability (random guessing)
D) Overfitting
Answer: C
Explanation: An AUC of 0.5 corresponds to the diagonal line, equivalent to random chance.
**Question 44.** Which component of a confusion matrix directly reflects Type II errors?
A) True Positives
B) False Positives
C) False Negatives
D) True Negatives
Answer: C
Explanation: False Negatives are instances where the model missed the positive class, representing Type II errors.
**Question 45.** When a learning curve shows high training error and similarly high validation error, the model is likely suffering from:
A) High bias (underfitting)
B) High variance (overfitting)
C) Data leakage
D) Imbalanced classes
Answer: A
Explanation: Both errors being high indicates the model cannot capture underlying patterns, a bias problem.
**Question 46.** Which model-agnostic technique explains the contribution of each feature to a single prediction by approximating the model locally?
A) SHAP values
B) Global feature importance plot
C) Confusion matrix
D) ROC curve
Answer: A
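
The confusion-matrix quantities behind Questions 42 and 44 can be worked through on a small example. The labels and predictions below are made up to mimic a rare positive class; the function name `confusion_counts` is hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # Type I errors
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # Type II errors
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


# Illustrative data: 3 positives out of 10 samples.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(tp, fp, fn, tn)                # prints: 1 1 2 6
print(accuracy, precision, recall)
```

Accuracy is 0.7 here even though two of the three positives were missed; recall, at one third, exposes those false negatives directly, which is why it is the more informative metric when missing a positive is costly.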