




























































































A set of practice exam questions and answers for the Google machine learning certification. It covers key concepts and services such as Vertex AI, BigQuery ML, AutoML, and TensorFlow. The questions test understanding of model training, evaluation, and deployment, as well as data preprocessing and feature engineering techniques. Topics include time-series forecasting, image classification, and hyperparameter tuning, along with responsible AI considerations such as mitigating bias in machine learning models. The document is intended for self-assessment and exam preparation on Google Cloud.
Typology: Exams





























































































Question 1. Which Google Cloud service allows you to train a machine-learning model directly from SQL queries without writing custom code?
A) Vertex AI Workbench
B) BigQuery ML
C) Cloud AutoML Tables
D) Dataflow
Answer: B
Explanation: BigQuery ML (BQML) lets you create and train models using standard SQL syntax, eliminating the need for external code.

Question 2. In BQML, which model type is most appropriate for predicting a continuous numeric value such as house price?
A) LOGISTIC_REG
B) LINEAR_REG
C) KMEANS
D) ARIMA
Answer: B
Explanation: LINEAR_REG performs linear regression, which predicts continuous numeric outcomes.

Question 3. Which BQML function returns the confusion matrix for a classification model?
A) ML.PREDICT
B) ML.EVALUATE
C) ML.CONFUSION_MATRIX
Answer: C
Explanation: ML.CONFUSION_MATRIX returns the confusion matrix for a classification model, that is, the counts of true positives, false positives, true negatives, and false negatives.

Question 4. When using BQML for time-series forecasting, which model type should you select?
A) BOOSTED_TREE_REGRESSOR
B) ARIMA
C) MATRIX_FACTORIZATION
D) KMEANS
Answer: B
Explanation: ARIMA is designed for time-series data and can capture trends and seasonality. In current BigQuery ML releases this model type is named ARIMA_PLUS.

Question 5. Which SQL clause in BQML is used to split data into training and evaluation sets automatically?
A) CREATE MODEL … OPTIONS (model_type='LINEAR_REG', input_label_cols=['label'])
B) SELECT * FROM dataset.table WHERE RAND() < 0.
C) ML.TRAINING_DATA
D) ML.SPLIT
Answer: B
Explanation: Filtering on RAND() in a WHERE clause is a common technique for randomly partitioning data. BQML's CREATE MODEL statement also accepts a data_split_method option that controls how training and evaluation data are split.
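
The random partition described in Question 5 can be illustrated outside SQL. The following is a minimal Python sketch, not BigQuery ML code: the function name `random_split`, the 0.2 evaluation fraction, and the seed are illustrative assumptions rather than values taken from the question.

```python
import random

def random_split(rows, eval_fraction=0.2, seed=42):
    """Randomly assign each row to the training or evaluation set."""
    rng = random.Random(seed)  # seeded so the split is reproducible
    train, evaluation = [], []
    for row in rows:
        # Same idea as filtering on RAND() < eval_fraction in SQL.
        if rng.random() < eval_fraction:
            evaluation.append(row)
        else:
            train.append(row)
    return train, evaluation

rows = list(range(1000))
train_rows, eval_rows = random_split(rows)
```

Because each row is assigned independently, the evaluation set is only approximately 20% of the data; an exact split would need a shuffle followed by slicing.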

C) A decision tree classifier
D) A convolutional neural network
Answer: B
Explanation: RAG (retrieval-augmented generation) retrieves relevant documents from a vector store to augment the generation process.

Question 9. Which AutoML product is best suited for training a custom image classification model?
A) AutoML Tables
B) AutoML Video
C) AutoML Vision
D) AutoML Natural Language
Answer: C
Explanation: AutoML Vision is designed for image classification and object detection tasks.

Question 10. When preparing data for AutoML Tables, which of the following is NOT required?
A) Splitting data into train/validation sets
B) Converting categorical columns to one-hot vectors manually
C) Ensuring a target column is labeled “target”
D) Removing rows with missing values
Answer: B
Explanation: AutoML Tables automatically handles categorical encoding; manual one-hot encoding is unnecessary.

Question 11. Which TensorFlow activation function is typically used in the output layer for multi-class classification?
A) ReLU
B) Sigmoid
C) Softmax
D) Tanh
Answer: C
Explanation: Softmax converts logits into a probability distribution across classes.

Question 12. In a supervised learning problem, what does the loss function measure?
A) Model complexity
B) Difference between predicted and true values
C) Number of features used
D) Training time
Answer: B
Explanation: The loss quantifies how far predictions deviate from actual labels.

Question 13. Which of the following is a common technique to reduce overfitting?
A) Decrease learning rate to zero
B) Increase model depth indefinitely
C) Apply dropout during training

Question 16. Which metric is best for evaluating a binary classifier when class imbalance is severe?
A) Accuracy
B) ROC-AUC
C) Mean Squared Error
D) R²
Answer: B
Explanation: Accuracy is misleading when one class dominates, and Mean Squared Error and R² are regression metrics. ROC-AUC measures how well the model ranks positives above negatives across all thresholds, independent of the class distribution.

Question 17. In Vertex AI Feature Store, what is the purpose of an “entity type”?
A) To store raw images
B) To group related feature vectors belonging to the same logical object (e.g., user)
C) To define a model architecture
D) To schedule batch predictions
Answer: B
Explanation: Entity types represent a category of objects (users, products) whose features are stored together.

Question 18. Which of the following data preprocessing steps can cause training-serving skew if not applied consistently?
A) Storing data in Cloud Storage
B) Normalizing numeric columns only on the training set
C) Using BQML to train a model
D) Deploying a model to Vertex AI
Answer: B
Explanation: If normalization is applied to the training data but the same normalization parameters are not applied to serving data, the model sees differently scaled inputs at prediction time and its predictions become inconsistent.

Question 19. Which Google Cloud service is primarily used for building large-scale data transformation pipelines before training?
A) Cloud Run
B) Dataflow
C) Cloud Functions
D) Cloud Scheduler
Answer: B
Explanation: Dataflow (Apache Beam) processes streaming or batch data at scale, making it well suited to ETL pipelines.

Question 20. When exporting a TensorFlow model for serving in other environments, which format should you use?
A) .h
B) SavedModel directory
C) .pbtxt only
D) CSV
Answer: B
Explanation: The SavedModel format contains the graph, variables, and signatures required for serving.
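
The training-serving skew from Question 18 can be shown with a small example. This is a minimal sketch in plain Python under stated assumptions: `fit_scaler` and `transform` are hypothetical helper names, and the values are made up for illustration. The point is that statistics are fitted once on training data and reused unchanged at serving time.

```python
from statistics import mean, pstdev

def fit_scaler(values):
    """Compute normalization statistics from the training data only."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 for constant columns
    return {"mean": mu, "std": sigma}

def transform(values, scaler):
    """Apply previously fitted statistics without refitting."""
    return [(v - scaler["mean"]) / scaler["std"] for v in values]

train_values = [10.0, 20.0, 30.0, 40.0]
scaler = fit_scaler(train_values)

train_scaled = transform(train_values, scaler)
# Consistent: the serving request reuses the training statistics.
serving_scaled = transform([40.0], scaler)
# Skewed: refitting on the serving request alone changes the scale.
serving_skewed = transform([40.0], fit_scaler([40.0]))
```

Here the same raw value, 40.0, maps to the value the model saw during training only in the consistent case; the refitted version collapses to 0.0, which is the kind of mismatch the question describes.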

Answer: B
Explanation: Feature drift indicates that the characteristics of incoming data have changed, potentially degrading model performance.

Question 24. Which Vertex AI feature enables you to automatically trigger model retraining when performance drops below a threshold?
A) Model Registry
B) Model Monitoring Alerts
C) Pipelines with conditional steps
D) AutoML
Answer: C
Explanation: Pipelines can include conditionals that launch a new training job when monitoring metrics breach predefined limits.

Question 25. What is the purpose of a canary deployment in Vertex AI Endpoints?
A) To serve only 1% of traffic for testing the new model version before full rollout
B) To compress model files for faster download
C) To convert models to ONNX format
D) To automatically delete old model versions
Answer: A
Explanation: Canary releases route a small portion of traffic to the new version, allowing validation before scaling.
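
The canary routing in Question 25 amounts to a weighted random choice between model versions. The sketch below simulates that idea in plain Python; it does not call the Vertex AI API, and the version names `stable` and `canary`, the 99/1 split, and the helper `pick_version` are assumptions for illustration.

```python
import random
from collections import Counter

def pick_version(traffic_split, rng):
    """Choose a model version with probability proportional to its traffic share."""
    versions = list(traffic_split)
    weights = [traffic_split[v] for v in versions]
    return rng.choices(versions, weights=weights, k=1)[0]

# Hypothetical version names and percentages.
traffic_split = {"stable": 99, "canary": 1}
rng = random.Random(0)  # seeded for a repeatable simulation

# Simulate 10,000 incoming prediction requests.
counts = Counter(pick_version(traffic_split, rng) for _ in range(10_000))
```

Roughly one request in a hundred reaches the canary, which is enough to compare its error rate and latency against the stable version before raising its share.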

Question 26. Which explanation method in Vertex Explainable AI computes the contribution of each input feature by integrating gradients along a path from a baseline to the input?
A) LIME
B) SHAP
C) Integrated Gradients
D) Counterfactual
Answer: C
Explanation: Integrated Gradients approximates feature importance by accumulating gradients along the path from a baseline to the input.

Question 27. In the context of Responsible AI, which practice helps mitigate bias in training data?
A) Ignoring protected attributes
B) Over-sampling minority classes only during evaluation
C) Performing fairness analysis and re-weighting or resampling as needed
D) Using only synthetic data
Answer: C
Explanation: Systematic fairness analysis and corrective sampling help reduce bias.

Question 28. Which Vertex AI component stores versioned model artifacts and metadata for reproducibility?
A) Model Registry
B) Feature Store
C) Dataflow

Question 31. In AutoML Vision, what does the “bounding box” annotation type enable?
A) Image classification only
B) Object detection with location coordinates
C) Semantic segmentation masks
D) Text extraction
Answer: B
Explanation: Bounding boxes label the position of objects for detection tasks.

Question 32. Which Google Cloud service is specifically designed for orchestrating complex ML pipelines using a DAG of tasks?
A) Cloud Functions
B) Cloud Composer
C) Cloud Build
D) Cloud DNS
Answer: B
Explanation: Cloud Composer (Apache Airflow) allows you to define directed-acyclic-graph pipelines.

Question 33. What is the primary benefit of using Vertex AI Custom Training containers?
A) Automatic data labeling
B) Ability to bring any ML framework or custom dependencies not natively supported
C) Free GPU usage
D) Built-in hyperparameter tuning
Answer: B
Explanation: Custom containers let you package any libraries or frameworks required for your training job.

Question 34. Which loss function is most appropriate for a binary classification problem?
A) Mean Squared Error
B) Binary Cross-Entropy (Log Loss)
C) Hinge Loss
D) Categorical Cross-Entropy
Answer: B
Explanation: Binary cross-entropy measures the error for two-class predictions.

Question 35. In a K-means clustering job in BQML, which option controls the number of clusters?
A) num_clusters
B) k
C) cluster_count
D) max_iterations
Answer: A
Explanation: In BQML, the num_clusters option of CREATE MODEL specifies how many clusters (centroids) the K-means algorithm should find.

Question 36. Which feature of Vertex AI Pipelines helps you track input data versions used by each pipeline run?

Explanation: Inconsistent preprocessing leads to mismatched input distributions.

Question 39. When using AutoML Tables, what does the “target column” represent?
A) The feature to be dropped
B) The column containing the label you want to predict
C) The column used for data sharding
D) The column that stores timestamps
Answer: B
Explanation: The target column holds the ground-truth label for supervised learning.

Question 40. Which of the following is a key difference between batch prediction and online prediction in Vertex AI?
A) Batch prediction can only use CPU; online prediction requires GPU
B) Batch prediction processes large datasets asynchronously; online prediction serves low-latency requests in real time
C) Batch prediction automatically retrains the model; online prediction does not
D) Batch prediction stores results in Cloud SQL only
Answer: B
Explanation: Batch jobs are suited for high-throughput, delayed processing, while online endpoints provide immediate responses.

Question 41. In TensorFlow, which optimizer is most commonly used for deep learning tasks due to its adaptive learning rate?
A) SGD
B) Adam
C) RMSprop
D) Adagrad
Answer: B
Explanation: Adam combines momentum and adaptive learning rates, often yielding faster convergence.

Question 42. What does the “epoch” parameter control during model training?
A) Number of layers in the neural network
B) Number of times the entire training dataset is passed through the model
C) Size of each mini-batch
D) Learning rate
Answer: B
Explanation: An epoch is a full pass over all training samples.

Question 43. Which of the following is NOT a typical step in the data ingestion phase for ML on Google Cloud?
A) Uploading raw files to Cloud Storage
B) Registering the dataset in Vertex AI Datasets
C) Running inference on a deployed model
D) Creating a BigQuery external table
Answer: C
Explanation: Inference occurs after the model is trained and deployed, not during ingestion.
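
To make the Question 41 explanation concrete, the Adam update can be written out by hand. This is a minimal one-dimensional sketch in plain Python rather than TensorFlow's optimizer; the function name `adam_minimize`, the learning rate of 0.1, the step count, and the test function are assumptions chosen for illustration.

```python
import math

def adam_minimize(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a one-dimensional function with the Adam update rule."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g        # first moment: momentum term
        v = beta2 * v + (1 - beta2) * g * g    # second moment: adaptive scale
        m_hat = m / (1 - beta1 ** t)           # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
```

The first moment supplies the momentum and the second moment rescales each step by the recent gradient magnitude, which is the combination the explanation refers to.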

C) A/B Testing (traffic splitting)
D) Model Auto-Scaling
Answer: C
Explanation: Traffic splitting enables A/B testing across model versions.

Question 47. What is the primary purpose of the “Feature Crossing” technique?
A) To reduce dimensionality by merging features
B) To create interaction features that capture combined effects of two or more original features
C) To normalize numerical columns
D) To encrypt sensitive data
Answer: B
Explanation: Feature crossing generates new features representing the joint occurrence of original features.

Question 48. Which of the following is a recommended practice for handling personally identifiable information (PII) before training a model?
A) Store it in plain text on Cloud Storage
B) Mask or remove PII, or use differential privacy techniques
C) Use it as a primary feature for prediction
D) Share it publicly for transparency
Answer: B
Explanation: Protecting privacy requires de-identifying or applying privacy-preserving methods.
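
The feature crossing described in Question 47 can be sketched in a few lines. This is an illustrative Python example, not a BigQuery ML or TensorFlow API: the helper `feature_cross`, the `_x_` separator, the bucket count, and the day-of-week and hour features are all assumptions.

```python
import hashlib

def feature_cross(a, b, num_buckets=1000):
    """Combine two categorical values into one crossed feature and a bucket id."""
    crossed = f"{a}_x_{b}"
    # Stable hash so the same pair maps to the same bucket across runs.
    digest = hashlib.sha256(crossed.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % num_buckets
    return crossed, bucket

# Hypothetical features: day of week and hour of day.
crossed, bucket = feature_cross("Mon", "08")
```

The crossed value lets a linear model learn a separate weight for "Monday at 08:00" instead of only independent weights for the day and the hour; hashing into buckets keeps the vocabulary size bounded.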

Question 49. In Vertex AI, which component stores real-time logs of endpoint predictions for later analysis?
A) Model Registry
B) Model Monitoring
C) Feature Store
D) Cloud Scheduler
Answer: B
Explanation: Model Monitoring captures prediction logs, latency, and drift metrics.

Question 50. When using BigQuery ML’s ML.FEATURE_IMPORTANCE, what does a higher value indicate?
A) The feature is less important for the model
B) The feature contributes more to reducing loss during training
C) The feature has missing values
D) The feature is categorical
Answer: B
Explanation: Higher importance scores reflect greater impact on model performance.

Question 51. Which of the following is a key distinction between AutoML Video Classification and AutoML Video Object Tracking?
A) Classification predicts video-level labels; tracking identifies and follows objects frame-by-frame
B) Both perform the same task
C) Tracking only works on black-and-white videos