PrepIQ Professional Machine Learning Engineer Ultimate Exam, Exams of Technology

This exam evaluates knowledge and skills in designing, building, and deploying machine learning models. Topics include data preprocessing, model selection, training and evaluation, deployment, monitoring, optimization, ethical AI, and cloud-based ML tools. Candidates must demonstrate the ability to develop scalable, efficient, and reliable machine learning solutions.

Typology: Exams

2025/2026

Available from 04/19/2026

shilpi-jain-3

**Question 1.** Which Google Cloud service is best suited for rapidly building a
chatbot that leverages pre-trained large language models without writing custom
inference code?
A) Vertex AI AutoML Image
B) Vertex AI Agent Builder
C) BigQuery ML
D) Cloud Dataflow
Answer: B
Explanation: Vertex AI Agent Builder (formerly Generative AI App Builder) provides a
low-code environment to create conversational agents using foundation models,
eliminating the need for custom code.
**Question 2.** When choosing between a pre-trained Vision API and training a
custom image classification model on Vertex AI, which factor most strongly favors
using the pre-trained API?
A) Need for domain-specific classes not covered by the API
B) Requirement for sub-second latency on edge devices
C) Limited labeled training data and fast time-to-market
D) Desire to fine-tune model hyperparameters
Answer: C
Explanation: Pre-trained Vision APIs require no training data and can be deployed
instantly, making them ideal when data is scarce and speed is critical.
**Question 3.** In Vertex AI Model Garden, what does the term “Foundation Model”
refer to?
A) A model trained on a specific downstream task
B) A large, pre-trained model that can be fine-tuned for many tasks
C) A model that only runs on TPU hardware
D) A model that is automatically deployed to GKE
Answer: B
Explanation: Foundation models are large, general-purpose models (e.g., Gemini,
PaLM) that can be adapted to various downstream applications via fine-tuning.
**Question 4.** Which SQL statement in BigQuery ML creates a linear regression model to predict price from sqft and bedrooms?
A) CREATE MODEL mydataset.price_model OPTIONS(model_type='linear_reg') AS SELECT price, sqft, bedrooms FROM mydataset.homes;
B) CREATE MODEL mydataset.price_model OPTIONS(model_type='linear_reg') AS SELECT * FROM mydataset.homes;
C) CREATE MODEL mydataset.price_model OPTIONS(model_type='linear_reg') AS SELECT price, sqft, bedrooms FROM mydataset.homes;
D) CREATE MODEL mydataset.price_model OPTIONS(model_type='linear_reg') AS SELECT price, sqft, bedrooms FROM mydataset.homes WHERE price IS NOT NULL;
Answer: C
Explanation: The correct syntax lists the target and feature columns; option C follows the required format. BigQuery ML also needs to know which column is the target: either name it label or set input_label_cols=['price'] in OPTIONS.
**Question 5.** Which BigQuery ML feature enables you to register a trained model directly to Vertex AI for serving?
A) EXPORT MODEL TO GCS
B) CREATE OR REPLACE MODEL WITH model_path
C) CREATE MODEL … OPTIONS (model_registry='vertex_ai')
D) MODEL_REGISTRY='vertex_ai' clause in CREATE MODEL
Answer: C
Explanation: The model_registry='vertex_ai' option automatically registers the BQML model in Vertex AI Model Registry.
**Question 6.** ARIMA_PLUS in BigQuery ML is primarily used for:
A) Classification of textual data
B) Time-series forecasting with automatic hyperparameter selection
C) Image segmentation
D) Clustering of customer segments
Answer: B
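
The CREATE MODEL syntax from Questions 4 and 5 can be sketched as a small Python helper that assembles the statement as a string. This is only an illustration: the function name build_create_model_sql and its parameters are hypothetical, the dataset and column names are the ones used in Question 4, and actually running the statement would need a BigQuery client, which is not shown here.

```python
def build_create_model_sql(model, table, label, features, register_in_vertex=False):
    """Compose a BigQuery ML CREATE MODEL statement for linear regression.

    Hypothetical helper for illustration; identifiers are interpolated
    directly, so it should only be used with trusted names.
    """
    # model_type and input_label_cols are BigQuery ML OPTIONS keys.
    options = ["model_type='linear_reg'", f"input_label_cols=['{label}']"]
    if register_in_vertex:
        # Registers the trained model in Vertex AI Model Registry (Question 5).
        options.append("model_registry='vertex_ai'")
    columns = ", ".join([label, *features])
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS({', '.join(options)}) AS\n"
        f"SELECT {columns}\n"
        f"FROM `{table}`\n"
        f"WHERE {label} IS NOT NULL;"
    )


sql = build_create_model_sql(
    "mydataset.price_model",
    "mydataset.homes",
    label="price",
    features=["sqft", "bedrooms"],
    register_in_vertex=True,
)
print(sql)
```

Setting input_label_cols explicitly avoids having to rename the target column to label, and the WHERE clause simply keeps rows with a missing target out of training.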

Explanation: AutoML Text expects a CSV where one column contains the raw text and another contains the label.
**Question 10.** What does “data parallelism” refer to in distributed training on Vertex AI?
A) Splitting the model across multiple devices
B) Replicating the whole model on each worker and feeding each a different data shard
C) Using a single GPU for all training steps
D) Sharing parameters via parameter servers only
Answer: B
Explanation: Data parallelism replicates the model on each worker and each worker processes a different portion of the dataset.
**Question 11.** When using Vertex AI Vizier for hyperparameter tuning, which of the following is NOT a valid search space type?
A) Discrete
B) Continuous
C) Categorical
D) Temporal
Answer: D
Explanation: Vizier supports discrete, continuous, and categorical spaces; “temporal” is not a defined type.
**Question 12.** Which of the following is a primary advantage of using TensorFlow Transform (tf.Transform, a TFX library) for preprocessing?
A) Preprocessing runs only during serving, not training
B) Guarantees that training and serving pipelines apply identical transformations
C) Eliminates the need for feature engineering
D) Automatically converts all features to embeddings
Answer: B
Explanation: TensorFlow Transform ensures that the same preprocessing graph is used in both training and serving, preventing training-serving skew.
**Question 13.** In Dataflow pipelines, which programming model does Apache Beam provide?
A) Map-Reduce only
B) Stream-and-batch unified model using PCollections and transforms
C) SQL-only processing
D) Graph-based neural network execution
Answer: B
Explanation: Apache Beam’s unified model lets you define pipelines with PCollections and transforms that can run in batch or streaming mode.
**Question 14.** Which feature store serving mode provides the low latency needed for online predictions?
A) Offline (BigQuery)
B) Online (Vertex AI Feature Store)
C) Batch (Dataflow)
D) Edge (TensorFlow Lite)
Answer: B
Explanation: The online feature store is optimized for low-latency, point-lookup queries required for real-time inference.
**Question 15.** When handling class imbalance in a binary classification problem, which technique is NOT commonly used?
A) Oversampling the minority class
B) Undersampling the majority class
C) Adding random noise to the majority class features
D) Using class-weighting in the loss function
Answer: C
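
Class weighting, option D in Question 15, can be illustrated with a minimal plain-Python sketch. It assumes the common "balanced" heuristic (weight = n_samples / (n_classes * class_count)) and a weighted binary cross-entropy; the function names and the 90/10 toy labels are made up for the example and are not taken from any particular library.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency (assumed 'balanced' heuristic)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def weighted_log_loss(y_true, y_prob, weights, eps=1e-12):
    """Binary cross-entropy where each example is scaled by its class weight."""
    total, weight_sum = 0.0, 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        w = weights[y]
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
        weight_sum += w
    return total / weight_sum


# Toy dataset: 90 negatives, 10 positives.
labels = [0] * 90 + [1] * 10
weights = balanced_class_weights(labels)
print(weights)

# A constant prediction of 0.5 for every example.
loss = weighted_log_loss(labels, [0.5] * len(labels), weights)
print(loss)
```

With these weights the ten positive examples carry the same total weight as the ninety negatives, so the loss no longer rewards a model for ignoring the minority class.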

**Question 19.** Traffic splitting in Vertex AI Endpoints is primarily used for:
A) Load balancing across multiple GPUs
B) A/B testing different model versions or performing canary releases
C) Distributing data preprocessing tasks
D) Scaling the underlying compute cluster automatically
Answer: B
Explanation: Traffic splitting allows you to route a percentage of requests to different model versions for testing or gradual rollouts.
**Question 20.** Model quantization mainly helps with:
A) Increasing model accuracy
B) Reducing model size and inference latency on edge devices
C) Converting models to TensorFlow.js format
D) Improving training speed on TPUs
Answer: B
Explanation: Quantization reduces the precision of weights (e.g., from float32 to int8), decreasing memory footprint and latency.
**Question 21.** Which evaluation metric is most appropriate for an imbalanced binary classification where false negatives are especially costly?
A) Accuracy
B) ROC-AUC
C) F1-Score
D) Precision-Recall AUC
Answer: D
Explanation: Precision-Recall AUC focuses on performance for the positive class, making it suitable when the minority class (often the positive) is critical.
**Question 22.** In a regression problem, which metric directly measures the average magnitude of errors without squaring them?
A) RMSE
B) MAE
C) R²
D) MSE
Answer: B
Explanation: Mean Absolute Error (MAE) is the average absolute difference between predictions and true values.
**Question 23.** Which of the following is a technique for preserving privacy when training models on sensitive data?
A) Data augmentation
B) Differential privacy
C) Model ensembling
D) Early stopping
Answer: B
Explanation: Differential privacy adds calibrated noise to gradients or outputs, providing provable privacy guarantees.
**Question 24.** Which Vertex AI feature allows you to explain predictions by attributing importance to input features?
A) Model Monitoring
B) Explainable AI (Feature Attributions)
C) AutoML Tabular
D) Pipelines UI
Answer: B
Explanation: Explainable AI provides feature-level attributions (e.g., Sampled Shapley or Integrated Gradients) for individual predictions.
**Question 25.** When deploying a large generative model (e.g., Gemini) for inference, which deployment option typically offers the best scalability?
A) Vertex AI Endpoints with dedicated GPU nodes

B) A model that has been retrained on domain-specific data, adjusting its weights
C) A model that is exported as ONNX for edge deployment
D) A model that uses reinforcement learning from human feedback (RLHF) only
Answer: B
Explanation: Fine-tuning adjusts the pre-trained weights using domain data, creating a specialized version of the foundation model.
**Question 29.** What is the primary purpose of the “Model Monitoring” feature in Vertex AI?
A) To automatically retrain models on a schedule
B) To detect data drift, prediction skew, and performance degradation in production
C) To visualize training curves in real time
D) To manage model versioning in GCS buckets
Answer: B
Explanation: Model Monitoring continuously checks for drift and performance issues and raises alerts, which teams can use to trigger retraining.
**Question 30.** Which of the following is a recommended practice for handling outliers before training a linear regression model?
A) Always remove the top 5% of rows by target value
B) Apply an outlier-robust transformation (e.g., RobustScaler or QuantileTransformer) to reduce outlier influence
C) Encode outliers as a separate categorical feature
D) Increase the learning rate to converge faster
Answer: B
Explanation: Robust scaling methods lessen the impact of extreme values without discarding data.
**Question 31.** In Vertex AI Feature Store, what distinguishes “offline” features from “online” features?
A) Offline features are stored in Cloud SQL, online in BigQuery
B) Offline features are accessed via batch reads, online via low-latency key-value lookups
C) Offline features are only for training, online only for inference
D) Offline features are encrypted, online are not
Answer: B
Explanation: Offline features are typically stored in BigQuery for batch processing, while online features reside in a low-latency serving store.
**Question 32.** Which of the following is an example of a “managed service” offering that abstracts away infrastructure for ML?
A) Compute Engine VM instances
B) Vertex AI AutoML
C) Kubernetes Engine clusters you manage yourself
D) Custom Docker containers on Cloud Run
Answer: B
Explanation: AutoML abstracts the underlying compute, data handling, and model training, providing a fully managed experience.
**Question 33.** When using Dataflow for scaling data preprocessing, which runner is used by default in a fully managed environment?
A) DirectRunner
B) SparkRunner
C) DataflowRunner
D) FlinkRunner
Answer: C
Explanation: In Google Cloud, the DataflowRunner executes Apache Beam pipelines on the managed Dataflow service.
**Question 34.** What is the main benefit of using “model versioning” in Vertex AI Model Registry?
A) It automatically improves model accuracy over time

C) Manually uploading model artifacts via the console
D) Deploying the new model version to an endpoint automatically
Answer: C
Explanation: CI/CD aims to automate deployments; manual uploads break the pipeline’s automation.
**Question 38.** When deploying a model to a Vertex AI Endpoint, which parameter controls the maximum number of concurrent requests a replica can handle?
A) machine_type
B) max_concurrency
C) autoscaling_min_nodes
D) traffic_split
Answer: B
Explanation: max_concurrency sets the per-replica request concurrency limit.
**Question 39.** Which of the following is a key difference between “training-serving skew” and “data drift”?
A) Skew occurs during model training, drift occurs after deployment
B) Skew refers to mismatched preprocessing between train and serve, drift refers to changes in input data distribution over time
C) Skew is a type of model bias, drift is a type of model variance
D) Both are synonyms; there is no difference
Answer: B
Explanation: Training-serving skew arises when preprocessing pipelines differ, while data drift reflects evolving input distributions.
**Question 40.** Which of the following is an appropriate use case for Vertex AI Agent Builder?
A) Large-scale image segmentation for satellite imagery
B) Real-time fraud detection on streaming transaction data
C) Building a customer-service chatbot that answers FAQs using an LLM
D) Training a recommendation system with matrix factorization
Answer: C
Explanation: Agent Builder specializes in low-code conversational agents powered by large language models.
**Question 41.** In Vertex AI AutoML Tables, what is the effect of enabling “feature importance” during training?
A) It automatically removes low-importance features before training
B) It generates a report indicating how much each feature contributed to predictions
C) It converts all categorical features to embeddings
D) It forces the model to use a linear algorithm only
Answer: B
Explanation: Feature importance provides post-training insights about each feature’s impact on the model’s decisions.
**Question 42.** Which Cloud service can be used to orchestrate end-to-end ML workflows that include steps in BigQuery, Dataflow, and Vertex AI?
A) Cloud Scheduler only
B) Cloud Composer (Apache Airflow)
C) Cloud Functions alone
D) Cloud DNS
Answer: B
Explanation: Cloud Composer provides Airflow-based orchestration, allowing DAGs that integrate multiple GCP services.
**Question 43.** When fine-tuning a foundation model on a domain-specific corpus, which of the following strategies reduces catastrophic forgetting?
A) Using a very high learning rate
B) Freezing all transformer layers except the final classification head
C) Applying a small learning rate and optionally using “adapter” layers
D) Training only on synthetic data

Answer: B
Explanation: Data augmentation expands the effective dataset, and dropout prevents co-adaptation of neurons, both reducing overfitting.
**Question 47.** What does the “Explainable AI” (XAI) method “Integrated Gradients” compute?
A) The gradient of loss with respect to model parameters
B) The accumulated gradients of the output with respect to each input feature along a straight-line path from a baseline
C) The SHAP values using a game-theoretic approach
D) The attention weights from a transformer model
Answer: B
Explanation: Integrated Gradients approximates feature attributions by integrating gradients along the path from a baseline input to the actual input.
**Question 48.** Which of the following statements about quantization-aware training (QAT) is true?
A) QAT converts a trained model to INT8 after training only
B) QAT simulates quantization effects during training to preserve accuracy after conversion
C) QAT is only applicable to NLP models
D) QAT increases model size dramatically
Answer: B
Explanation: Quantization-aware training inserts fake quantization nodes during training, allowing the model to adapt to reduced precision.
**Question 49.** In Vertex AI, which component is responsible for automatically scaling the number of machines in a training job based on resource utilization?
A) Managed Pipelines
B) Hyperparameter Tuning Service
C) Training Cluster Autoscaler
D) Cloud Scheduler
Answer: C
Explanation: The Training Cluster Autoscaler adjusts the number of workers for distributed training jobs.
**Question 50.** Which of the following is NOT a typical step in the “data ingestion” phase for ML on GCP?
A) Loading CSV files into BigQuery tables
B) Streaming sensor data into Pub/Sub and then to BigQuery
C) Directly training a model on raw log files stored in Cloud Storage without any parsing
D) Using Dataprep to clean and transform raw CSVs before loading
Answer: C
Explanation: Raw log files must be parsed and transformed before they can be used for model training; training directly on unprocessed logs is not standard.
**Question 51.** Which of the following best describes “model pruning” in the context of deep learning?
A) Removing entire layers to reduce depth
B) Deleting weights with values close to zero to reduce model size and inference cost
C) Converting a model to a decision tree
D) Adding more neurons to improve accuracy
Answer: B
Explanation: Pruning eliminates insignificant weights, decreasing model size and often improving latency with minimal loss in accuracy.
**Question 52.** When using BigQuery ML to train a K-means clustering model, which statement is true?
A) K-means can only be used for supervised learning
B) The number of clusters K must be specified in the OPTIONS clause
C) BigQuery ML automatically determines the optimal K
D) K-means in BQML supports categorical features without encoding
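
The pruning idea from Question 51 (dropping weights whose values are close to zero) can be sketched in a few lines of plain Python. This is a simplified illustration of unstructured magnitude pruning on a flat list of weights, assuming a target sparsity fraction; the function name magnitude_prune and the sample weights are hypothetical, and real frameworks apply the same idea per tensor with masks rather than lists.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be between 0 and 1")
    k = int(len(weights) * sparsity)  # number of weights to remove
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:k]:
        pruned[i] = 0.0
    return pruned


weights = [0.8, -0.01, 0.03, -1.2, 0.002, 0.5]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.0, -1.2, 0.0, 0.5]
```

The zeroed entries only save memory and compute when the model is stored or executed in a sparse-aware format, which is why pruning is usually paired with fine-tuning and a suitable runtime.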

Explanation: TPUs require TPUStrategy and often benefit from XLA compilation for performance.
**Question 56.** Which of the following is a recommended way to detect “concept drift” in a deployed model?
A) Periodically re-train the model on the same training set
B) Compare the distribution of input features over time using statistical tests (e.g., KS test)
C) Increase the learning rate during serving
D) Disable model monitoring to reduce overhead
Answer: B
Explanation: Statistical tests on feature distributions reveal shifts that may indicate concept drift.
**Question 57.** What does the “online prediction” option in Vertex AI Endpoints primarily optimize for?
A) Maximum throughput with batch jobs
B) Minimal latency for individual inference requests
C) Automatic model compression
D) Model interpretability
Answer: B
Explanation: Online prediction serves single requests with low latency, ideal for real-time applications.
**Question 58.** Which of the following is NOT a supported input format for Vertex AI Vision Object Detection AutoML?
A) JPEG images stored in Cloud Storage
B) PNG images stored in Cloud Storage
C) TFRecord files with embedded images
D) Raw binary image data sent via REST without base64 encoding
Answer: D
Explanation: AutoML Vision expects images stored in Cloud Storage or TFRecord; raw binary payloads are not directly supported.
**Question 59.** When using Vertex AI Feature Store, which operation is used to retrieve a set of features for a given entity key at inference time?
A) ReadFeatureValues API call
B) ExportFeatureValues to BigQuery
C) BatchCreateFeatures
D) DeleteFeature
Answer: A
Explanation: ReadFeatureValues provides low-latency online access to feature values for a specific entity.
**Question 60.** Which of the following statements about “model registries” is true?
A) They store only the binary model file, not metadata
B) They enable model versioning, lineage tracking, and approval workflows
C) They are only available for TensorFlow models
D) They automatically generate API keys for each model
Answer: B
Explanation: Model registries manage versions, metadata, and can integrate with CI/CD for approvals.
**Question 61.** In Vertex AI AutoML Text Sentiment, which metric is used by default to evaluate model performance?
A) Accuracy
B) F1-Score (macro)
C) ROC-AUC
D) Mean Squared Error
Answer: B
Explanation: For multi-class sentiment analysis, AutoML uses macro-averaged F1 as the primary metric.
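
To make the macro-averaged F1 metric from Question 61 concrete, here is a minimal plain-Python sketch. It computes F1 per class from true positives, false positives, and false negatives, then takes the unweighted mean across classes. The function name macro_f1 and the three-class toy labels are illustrative assumptions, not output from any Vertex AI service.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


y_true = ["pos", "pos", "neg", "neg", "neu", "neu"]
y_pred = ["pos", "neg", "neg", "neg", "neu", "pos"]
score = macro_f1(y_true, y_pred)
print(round(score, 4))  # 0.6556
```

Because every class contributes equally to the average, a model that performs poorly on a rare sentiment class is penalized as much as one that performs poorly on a common class.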