Databricks Machine Learning Associate Exam Questions, Exams of Advanced Education

Typology: Exams

2025/2026

Available from 04/03/2026

Lect_Smith
Databricks Machine Learning Associate Exam Questions

Databricks Certified Machine Learning Associate Certification - Correct Answer
✔✔The Databricks Certified Machine Learning Associate certification exam assesses
an individual's ability to use Databricks to perform basic machine learning tasks.
This includes an ability to understand and use Databricks Machine Learning and its
capabilities like AutoML, Feature Store, and select capabilities of MLflow. It also
assesses the ability to make correct decisions in machine learning workflows and
implement those workflows using Spark ML. Finally, the ability to understand
advanced characteristics of scaling machine learning models is assessed.
Individuals who pass this certification exam can be expected to complete basic
machine learning tasks using Databricks and its associated tools.
About the Databricks Machine Learning Associate Exam - Correct Answer ✔✔
● Number of items: 45 multiple-choice questions
● Time limit: 90 minutes
● Registration fee: $200
● Languages: English
● Delivery method: Online Proctored
● Type: Proctored certification
● Test aides: None allowed
● Prerequisite: None required; course attendance and six months of hands-on experience in Databricks are highly recommended
● Validity: 2 years
● Recommended experience: 6+ months of hands-on experience performing the machine learning tasks outlined in the exam guide

Exam Topics - Correct Answer ✔✔
Section 1: Databricks Machine Learning 29%
Section 2: ML Workflows 29%
Section 3: Spark ML 33%
Section 4: Scaling ML Models 9%

A data scientist wants to efficiently tune the hyperparameters of a scikit-learn model in parallel. They elect to use the Hyperopt library to facilitate this process. Which of the following Hyperopt tools provides the ability to optimize hyperparameters in parallel?
A. fmin
B. SparkTrials
C. quniform
D. search_space
E. objective_function
Correct Answer ✔✔ Answer: B

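As a study aid for the Hyperopt question above, here is a minimal sketch of how SparkTrials is passed to fmin so that trials are evaluated on Spark workers. The toy objective stands in for a cross-validated scikit-learn score, and the function name tune_in_parallel, the parameter name "C", the search range, and the parallelism value are all illustrative assumptions. The search itself needs the hyperopt package and an active Spark session, so those imports are kept inside the function.

```python
def objective(params):
    """Toy loss standing in for a cross-validated scikit-learn score (assumption)."""
    c = params["C"]
    # Hyperopt minimizes the returned value; the minimum here is at C = 3.0.
    return (c - 3.0) ** 2


def tune_in_parallel(max_evals=16, parallelism=4):
    """Run the search with trials distributed across Spark workers.

    Hypothetical helper: requires hyperopt and an active Spark session.
    """
    from hyperopt import fmin, hp, tpe, SparkTrials

    # Search space: one continuous hyperparameter sampled uniformly.
    search_space = {"C": hp.uniform("C", 0.0, 10.0)}

    # SparkTrials is the piece that enables parallel evaluation.
    spark_trials = SparkTrials(parallelism=parallelism)

    return fmin(
        fn=objective,
        space=search_space,
        algo=tpe.suggest,
        max_evals=max_evals,
        trials=spark_trials,
    )
```

On a cluster, calling tune_in_parallel() would return a dictionary with the best value found for "C". The other options in the question (fmin, quniform, the search space, the objective function) are all still used in the search, but none of them controls parallelism.
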
An organization is developing a feature repository and is electing to one-hot encode all categorical feature variables. A data scientist suggests that the categorical feature variables should not be one-hot encoded within the feature repository. Which of the following explanations justifies this suggestion?
A. One-hot encoding is a potentially problematic categorical variable strategy for some machine learning algorithms.
B. One-hot encoding is dependent on the target variable's values which differ for each application.
C. One-hot encoding is computationally intensive and should only be performed on small samples of training sets for individual machine learning problems.
D. One-hot encoding is not a common strategy for representing categorical feature variables numerically.
Correct Answer ✔✔ Answer: A
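
To illustrate the reasoning behind answer A, the sketch below keeps the categorical value raw in a stand-in feature table and leaves the choice of encoding to each downstream model. The rows, the column name "color", and the helper names one_hot and ordinal_index are hypothetical and written in plain Python; this is not the Feature Store API.

```python
# Hypothetical feature-table rows: the category is stored raw, not encoded.
feature_rows = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "blue"},
    {"id": 3, "color": "red"},
]


def one_hot(rows, column):
    """Expand one categorical column into 0/1 indicator columns."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        out = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            out[f"{column}_{cat}"] = 1 if row[column] == cat else 0
        encoded.append(out)
    return encoded


def ordinal_index(rows, column):
    """Replace each category with an integer index (sorted order)."""
    categories = sorted({row[column] for row in rows})
    lookup = {cat: i for i, cat in enumerate(categories)}
    return [{**row, column: lookup[row[column]]} for row in rows]


# A linear model might one-hot encode at training time...
linear_model_input = one_hot(feature_rows, "color")
# ...while a tree-based model might prefer a single indexed column.
tree_model_input = ordinal_index(feature_rows, "color")
```

Because the repository holds only the raw category, each consumer can apply whichever representation suits its algorithm.
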

A data scientist has created a linear regression model that uses log(price) as a label variable. Using this model, they have performed inference and the predictions and actual label values are in Spark DataFrame preds_df. They are using the following code block to evaluate the model:
regression_evaluator.setMetricName("rmse").evaluate(preds_df)
Which of the following changes should the data scientist make to evaluate the RMSE in a way that is comparable with price?
A. They should exponentiate the computed RMSE value
B. They should take the log of the predictions before computing the RMSE
C. They should evaluate the MSE of the log predictions to compute the RMSE
D. They should exponentiate the predictions before computing the RMSE
Correct Answer ✔✔ Answer: D

A data scientist has a Spark DataFrame spark_df. They want to create a new Spark DataFrame that contains only the rows from spark_df where the value in column discount is less than or equal to 0. Which of the following code blocks will accomplish this task?
A. spark_df.loc[:,spark_df['discount'] <= 0]
B. spark_df[spark_df['discount'] <= 0]
C. spark_df.filter(col('discount') <= 0)
D. spark_df.loc(spark_df['discount'] <= 0, :]
Correct Answer ✔✔ Answer: C

A data scientist has written a feature engineering notebook that utilizes the pandas library. As the size of the data processed by the notebook increases, the notebook's runtime increases drastically. Which of the following tools can the data scientist use to spend the least amount of time refactoring their notebook to scale with big data?
A. PySpark DataFrame API
B. pandas API on Spark
C. Spark SQL
D. Feature Store
Correct Answer ✔✔ Answer: B

Which of the following tools can be used to distribute large-scale feature engineering without the use of a UDF or pandas Function API for machine learning pipelines?
A. Keras
B. Scikit-learn
C. PyTorch
D. Spark ML
Correct Answer ✔✔ Answer: D

Which statement describes a Spark ML transformer?
A. A transformer is an algorithm which can transform one DataFrame into another DataFrame
B. A transformer is a hyperparameter grid that can be used to train a model
C. A transformer chains multiple algorithms together to transform an ML workflow
D. A transformer is a learning algorithm that can use a DataFrame to train a model
Correct Answer ✔✔ Answer: A

A machine learning engineer has been notified that a new Staging version of a model registered to the MLflow Model Registry has passed all tests. As a result, the machine learning engineer wants to put this model into production by transitioning it to the Production stage in the Model Registry. From which of the following pages in Databricks Machine Learning can the machine learning engineer accomplish this task?
A. The home page of the MLflow Model Registry
B. The experiment page in the Experiments observatory
C. The model version page in the MLflow Model Registry
D. The model page in the MLflow Model Registry
Correct Answer ✔✔ Answer: C
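
For the log(price) question in this set, the small worked example below shows why the predictions are exponentiated before computing RMSE (answer D), and why exponentiating the finished RMSE (option A) gives a different number. The prices are made-up values and the computation is plain Python, not Spark. In Spark, the same back-transform would presumably be applied to the prediction and label columns, for example with exp(col("prediction")), before calling the evaluator.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error of two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical labels and predictions on the log(price) scale.
log_actual = [math.log(100.0), math.log(200.0), math.log(400.0)]
log_pred = [math.log(110.0), math.log(180.0), math.log(400.0)]

# RMSE as the original code block computes it: in log units.
rmse_log = rmse(log_actual, log_pred)

# Answer D: exponentiate predictions (and labels) first, then compute
# RMSE, which is now expressed in price units.
price_actual = [math.exp(v) for v in log_actual]
price_pred = [math.exp(v) for v in log_pred]
rmse_price = rmse(price_actual, price_pred)

# Option A for comparison: exponentiating the log-scale RMSE.
exp_of_rmse_log = math.exp(rmse_log)

print(f"RMSE on log scale:       {rmse_log:.4f}")
print(f"RMSE on price scale:     {rmse_price:.4f}")
print(f"exp(RMSE on log scale):  {exp_of_rmse_log:.4f}")
```

The price-scale errors here are 10, 20, and 0, so the price-scale RMSE is the square root of 500/3, while exponentiating the log-scale RMSE yields a value close to 1 that is not an error in price units.
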