dbt Labs Analytics Engineer Certification Ultimate Exam, Exams of Technology

The dbt Labs Analytics Engineer Certification Ultimate Exam is a comprehensive study guide for data professionals working with dbt Labs tools. It covers data modeling, SQL transformations, testing, documentation, and analytics engineering best practices. Learners will gain hands-on understanding of building scalable data pipelines and managing data workflows. With scenario-based questions and detailed explanations, this ultimate exam prepares candidates to excel in certification and apply analytics engineering concepts in real-world data environments.

Typology: Exams

2025/2026

Available from 04/20/2026

nicky-jone

dbt Labs Analytics Engineer Certification Ultimate Exam
Question 1. Which dbt macro is used to declare a dependency on another model within a SELECT
statement?
A) source()
B) ref()
C) config()
D) var()
Answer: B
Explanation: The `ref()` macro tells dbt that the current model depends on another dbt model, allowing
dbt to build the correct DAG and to resolve the reference to that model's relation name at compile time.
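The dependency idea behind `ref()` can be illustrated with a minimal Python sketch. This is not how dbt parses projects; it only assumes that each model's SQL contains `ref('...')` calls, extracts them with a regular expression, and orders the models so that parents come first. The model names and SQL strings are hypothetical.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical model files: model name -> SQL text containing ref() calls.
models = {
    "stg_orders": "select * from raw.orders",
    "stg_customers": "select * from raw.customers",
    "orders_enriched": (
        "select * from {{ ref('stg_orders') }} o "
        "join {{ ref('stg_customers') }} c on o.customer_id = c.id"
    ),
    "revenue": "select sum(amount) from {{ ref('orders_enriched') }}",
}

# Matches {{ ref('name') }} or {{ ref("name") }} with optional whitespace.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def find_refs(sql):
    """Return the set of model names referenced through ref() in a SQL string."""
    return set(REF_PATTERN.findall(sql))


def build_order(model_sql):
    """Map each model to its parents, then return an order with parents first."""
    graph = {name: find_refs(sql) for name, sql in model_sql.items()}
    return list(TopologicalSorter(graph).static_order())


order = build_order(models)
print(order)
```

Any order printed here places `stg_orders` and `stg_customers` before `orders_enriched`, and `orders_enriched` before `revenue`, which is the property a DAG-based build needs.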
Question 2. In dbt, what does the `source()` macro primarily accomplish?
A) It creates a temporary table.
B) It defines a seed file.
C) It references a raw data table defined in a .yml source configuration.
D) It runs a Python script.
Answer: C
Explanation: `source()` points to a raw table or view that is managed outside of dbt, enabling lineage
tracking and freshness checks.
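The freshness checks mentioned above compare the age of the newest row in a source against two thresholds. The sketch below is a simplified illustration, not dbt's implementation: the function name `freshness_status` and the use of `timedelta` values are assumptions, while `warn_after` and `error_after` mirror the threshold names used in source freshness definitions.

```python
from datetime import datetime, timedelta, timezone


def freshness_status(max_loaded_at, now, warn_after, error_after):
    """Classify a source by the age of its most recently loaded row."""
    age = now - max_loaded_at
    if age > error_after:
        return "error"
    if age > warn_after:
        return "warn"
    return "pass"


# Fixed reference time so the example is deterministic.
now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
warn_after = timedelta(hours=12)
error_after = timedelta(hours=24)

fresh = freshness_status(now - timedelta(hours=2), now, warn_after, error_after)
stale = freshness_status(now - timedelta(hours=18), now, warn_after, error_after)
broken = freshness_status(now - timedelta(hours=30), now, warn_after, error_after)
print(fresh, stale, broken)
```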
Question 3. Which materialization type stores the results of a model as a view that does not persist data
on disk?
A) table
B) incremental
C) view
D) ephemeral
Answer: C
Explanation: A view materialization creates a virtual table; the underlying query executes each time the view is queried, and no data is stored permanently.
Question 4. When would you choose an incremental materialization over a full table refresh?
A) When the dataset is static and never changes.
B) When you need to recompute the entire dataset on every run.
C) When only new or changed rows need to be added to an existing table.
D) When you want to create a temporary table for testing.
Answer: C
Explanation: Incremental models process only new or updated rows on each run and add them to the existing table, optionally matching existing rows on a `unique_key`. This reduces compute cost for large tables where only a small share of rows changes between runs.
Question 5. Which Jinja function is used to test whether a model is being run in an incremental context?
A) is_incremental()
B) is_table()
C) is_snapshot()
D) is_ephemeral()
Answer: A
Explanation: `is_incremental()` returns true only when the model is materialized as incremental, the target table already exists, and the run is not a full refresh. It is typically used to wrap the filter that limits the query to new or changed rows.
Question 6. In dbt, what is the purpose of a snapshot?
A) To store static CSV data.
B) To capture point-in-time versions of source data for Slowly Changing Dimensions.
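The incremental behavior from Questions 4 and 5 can be sketched with SQLite from the Python standard library. This is an illustration under stated assumptions, not dbt's materialization code: the tables `raw_events` and `fct_events`, the `updated_at` filter column, and the helper names are hypothetical, and SQLite's `insert or replace` stands in for a warehouse merge on a unique key.

```python
import sqlite3


def table_exists(conn, name):
    """Return True if a table with this name already exists."""
    row = conn.execute(
        "select 1 from sqlite_master where type = 'table' and name = ?", (name,)
    ).fetchone()
    return row is not None


def run_incremental(conn, full_refresh=False):
    """Build or update fct_events from raw_events; return the rows processed."""
    # Same idea as is_incremental(): target exists and no full refresh requested.
    incremental = table_exists(conn, "fct_events") and not full_refresh
    if incremental:
        where = "where updated_at > (select coalesce(max(updated_at), -1) from fct_events)"
    else:
        conn.execute("drop table if exists fct_events")
        conn.execute(
            "create table fct_events "
            "(id integer primary key, status text, updated_at integer)"
        )
        where = ""
    processed = conn.execute(f"select count(*) from raw_events {where}").fetchone()[0]
    # The primary key on id plays the role of unique_key: matching rows are replaced.
    conn.execute(
        "insert or replace into fct_events (id, status, updated_at) "
        f"select id, status, updated_at from raw_events {where}"
    )
    conn.commit()
    return processed


conn = sqlite3.connect(":memory:")
conn.execute("create table raw_events (id integer, status text, updated_at integer)")
conn.executemany("insert into raw_events values (?, ?, ?)", [(1, "new", 1), (2, "new", 2)])
first_run = run_incremental(conn)  # first build: processes every row

conn.execute("update raw_events set status = 'shipped', updated_at = 3 where id = 2")
conn.execute("insert into raw_events values (3, 'new', 4)")
second_run = run_incremental(conn)  # incremental: only changed and new rows

rows = conn.execute("select id, status from fct_events order by id").fetchall()
print(first_run, second_run, rows)
```

The second run touches only the updated row and the new row, while the target table still ends up with one row per `id`.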

Question 9. Which configuration file defines the folder structure, default materializations, and naming conventions for a dbt project?
A) packages.yml
B) profiles.yml
C) dbt_project.yml
D) schema.yml
Answer: C
Explanation: `dbt_project.yml` holds project-level configurations such as resource paths and default materializations.
Question 10. In dbt, what does the `grants` configuration control?
A) The number of rows to process.
B) Permissions for the created objects in the warehouse.
C) The order of model execution.
D) The naming of seed files.
Answer: B
Explanation: The `grants` config sets warehouse privileges (for example, select) on the objects dbt builds: models, seeds, and snapshots.
Question 11. Which of the following is a generic test provided out-of-the-box by dbt?
A) custom_sql_test()
B) not_null()
C) validate_date_range()
D) check_nulls()
Answer: B
Explanation: dbt includes four built-in generic tests: `unique`, `not_null`, `accepted_values`, and `relationships`.
Question 12. How would you define a custom generic test in dbt?
A) By writing a Python script in the models folder.
B) By creating a .sql file in the tests directory that uses the test macro.
C) By adding a custom_test: key to dbt_project.yml.
D) By using the source_freshness configuration.
Answer: B
Explanation: Custom generic tests are SQL files that define a `{% test %}` block taking the model (and usually a column name) as arguments. They are placed in `tests/generic/` or in the macros directory and are then referenced by name in a .yml properties file such as schema.yml.
Question 13. What does the `store_failures` configuration do when set to true?
A) It prevents any tests from running.
B) It writes failing rows to a table for later inspection.
C) It deletes all data that fails a test.
D) It logs failures to a JSON file only.
Answer: B
Explanation: `store_failures: true` saves the rows that failed a test to a table in the warehouse, aiding debugging and audit.
Question 14. Which parameter in a source freshness definition determines the threshold for raising a warning?
A) error_after
B) warn_after
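The generic tests in Questions 11 to 13 share one idea: a test is a query that returns the failing rows, and it passes when no rows come back. The sketch below illustrates that idea with SQLite; the SQL is a simplified approximation rather than dbt's own test definitions, and the `customers` table, helper names, and failure table name are hypothetical.

```python
import sqlite3


def not_null_sql(table, column):
    """Query returning the rows where the column is NULL."""
    return f"select * from {table} where {column} is null"


def unique_sql(table, column):
    """Query returning the values that appear more than once."""
    return (
        f"select {column}, count(*) as n from {table} "
        f"where {column} is not null group by {column} having count(*) > 1"
    )


def run_test(conn, name, sql, store_failures=False):
    """Run a test query; optionally persist the failing rows in a table."""
    failures = conn.execute(sql).fetchall()
    if store_failures:
        conn.execute(f"drop table if exists {name}")
        conn.execute(f"create table {name} as {sql}")
    return len(failures)


conn = sqlite3.connect(":memory:")
conn.execute("create table customers (id integer, email text)")
conn.executemany(
    "insert into customers values (?, ?)",
    [(1, "a@example.com"), (2, None), (2, "b@example.com")],
)

null_failures = run_test(
    conn, "not_null_customers_email", not_null_sql("customers", "email"),
    store_failures=True,
)
unique_failures = run_test(conn, "unique_customers_id", unique_sql("customers", "id"))
stored = conn.execute("select id from not_null_customers_email").fetchall()
print(null_failures, unique_failures, stored)
```

A result of zero failing rows would count as a pass; here both tests fail, and the stored table keeps the offending row for inspection.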

Question 17. Which dbt feature allows you to track the downstream impact of a model on dashboards or ML pipelines?
A) exposures
B) seeds
C) snapshots
D) macros
Answer: A
Explanation: Exposures are defined in YAML to link models to downstream artifacts like dashboards, enabling impact analysis.
Question 18. How can you run only models that have changed since the last successful run?
A) dbt run --select state:modified
B) dbt run --full-refresh
C) dbt run --exclude state:modified
D) dbt run --models changed_only
Answer: A
Explanation: The `state:modified` selector runs nodes that differ from the manifest of a previous run, which is supplied as a comparison state (for example, with the `--state` flag).
Question 19. Which dbt artifact contains a complete list of all nodes, their dependencies, and metadata for a run?
A) run_results.json
B) manifest.json
C) catalog.json
D) sources.json
Answer: B
Explanation: `manifest.json` holds the full project graph, including models, tests, sources, exposures, and their relationships.
Question 20. When debugging a dbt model, where can you find the compiled SQL that dbt sends to the warehouse?
A) target/run/
B) target/compiled/
C) logs/dbt.log
D) profiles.yml
Answer: B
Explanation: The `target/compiled/` directory contains the rendered SELECT statement for each model. `target/run/` holds the same SQL wrapped in the DDL/DML statements that dbt executes.
Question 21. What is the primary benefit of using dbt’s clone command?
A) To copy data between warehouses.
B) To create a zero-copy clone of a warehouse schema for isolated development.
C) To duplicate a model file.
D) To generate documentation automatically.
Answer: B
Explanation: `dbt clone` clones selected nodes from a prior state (for example, production) into the target schema, using zero-copy cloning where the warehouse supports it (e.g., Snowflake), so developers can test changes without affecting production data.
Question 22. Which of the following is NOT a supported warehouse for dbt as of the latest release?
A) Snowflake
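The project graph described for `manifest.json` in Question 19 can be walked with a few lines of Python. The sketch below assumes the manifest's top-level `child_map` key, which maps each node's unique ID to its direct children; the tiny inline manifest, the project name `shop`, and the helper `downstream` are hypothetical. In a real project the dictionary would come from `json.load` on the manifest file in the target directory.

```python
from collections import deque

# Hypothetical, heavily trimmed manifest content (only child_map is shown).
manifest = {
    "child_map": {
        "source.shop.raw.orders": ["model.shop.stg_orders"],
        "model.shop.stg_orders": [
            "model.shop.orders",
            "test.shop.not_null_stg_orders_id",
        ],
        "model.shop.orders": ["exposure.shop.weekly_dashboard"],
        "test.shop.not_null_stg_orders_id": [],
        "exposure.shop.weekly_dashboard": [],
    }
}


def downstream(manifest, unique_id):
    """Breadth-first walk of child_map; returns every node below unique_id."""
    seen = []
    queue = deque([unique_id])
    while queue:
        current = queue.popleft()
        for child in manifest["child_map"].get(current, []):
            if child not in seen:
                seen.append(child)
                queue.append(child)
    return seen


impacted = downstream(manifest, "model.shop.stg_orders")
print(impacted)
```

The same walk, started from a source node, shows which models, tests, and exposures would be affected by a change to that source.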

Question 25. Which configuration precedence is correct from highest to lowest?
A) dbt_project.yml > model config() > package defaults
B) model config() > dbt_project.yml > package defaults
C) package defaults > dbt_project.yml > model config()
D) dbt_project.yml > package defaults > model config()
Answer: B
Explanation: Settings defined with `config()` inside a model override project-level settings, which in turn override package defaults.
Question 26. What does the `unique_key` config option do for an incremental model?
A) It defines the column used to detect duplicates during upserts.
B) It sets the primary key for a view.
C) It determines which rows to delete on full refresh.
D) It specifies the order of row insertion.
Answer: A
Explanation: `unique_key` tells dbt how to match incoming rows to existing rows for incremental merges.
Question 27. Which dbt command validates that all model and test definitions compile without executing them?
A) dbt run
B) dbt test
C) dbt compile
D) dbt debug
Answer: C
Explanation: `dbt compile` renders all Jinja and writes the compiled SQL files without building any models. It catches Jinja and reference errors; the SQL itself is not validated until it runs in the warehouse.
Question 28. In dbt, what is a seed file?
A) A Python script for ML models.
B) A static CSV file loaded into the warehouse as a table.
C) A macro library.
D) A snapshot definition.
Answer: B
Explanation: Seed files are CSVs placed in the `seeds/` folder (`data/` in versions before dbt 1.0); dbt loads them as tables using the `dbt seed` command.
Question 29. How does dbt enforce schema contracts on a model?
A) By using the contract config in the model file.
B) By adding a schema.yml with tests: under the model.
C) By defining a model_contracts block in dbt_project.yml.
D) By setting enforce_contracts: true in profiles.yml.
Answer: A
Explanation: Adding `config(contract={'enforced': true})` to the model (or `contract: {enforced: true}` under the model's `config` in a .yml file) tells dbt to verify, before the model is built, that the query returns the declared column names and data types.
Question 30. Which of the following is a correct way to reference a column description in a dbt model’s .yml file?
A) description: "Customer ID"
B) doc: "Customer ID"
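The contract check described in Question 29 amounts to comparing the declared columns with the columns a query actually returns. The following sketch illustrates that comparison only; it is not dbt's implementation, and the function `check_contract`, the column names, and the type strings are hypothetical.

```python
def check_contract(declared, actual):
    """Compare declared and actual columns (name -> data type); return violations."""
    problems = []
    for name, dtype in declared.items():
        if name not in actual:
            problems.append(f"missing column: {name}")
        elif actual[name].lower() != dtype.lower():
            problems.append(
                f"type mismatch on {name}: expected {dtype}, got {actual[name]}"
            )
    for name in actual:
        if name not in declared:
            problems.append(f"unexpected column: {name}")
    return problems


# Declared in the contract (hypothetical) versus returned by the model's query.
declared = {"id": "integer", "email": "text"}
actual = {"id": "integer", "email": "varchar", "signup_date": "date"}

violations = check_contract(declared, actual)
print(violations)
```

An empty list would mean the query matches the contract; any entry would stop the build before the model is materialized.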

Question 33. Which macro would you use to generate a list of all models that are downstream of a given model?
A) downstream()
B) depends_on()
C) graph()
D) ref_dependencies()
Answer: A
Explanation: The answer key gives A, but `downstream()` is not a documented macro in dbt or dbt_utils. In practice, downstream models are listed with the `+` graph operator in node selection (a model name followed by `+` in `--select`), or by traversing the `graph` context variable in Jinja.
Question 34. What is the effect of setting `full_refresh: true` on an incremental model run?
A) It skips the model entirely.
B) It drops and recreates the target table, recomputing all rows.
C) It only refreshes the newest partition.
D) It runs the model in a sandbox environment.
Answer: B
Explanation: `full_refresh` forces dbt to drop the incremental table and rebuild it from scratch.
Question 35. In dbt, which file type is used to define model contracts, column types, and tests?
A) .sql
B) .yml
C) .json
D) .md
Answer: B
Explanation: Schema files (schema.yml or any .yml under the models directory) hold contracts, tests, and documentation.
Question 36. Which command will generate the interactive documentation site for a dbt project?
A) dbt docs generate && dbt docs serve
B) dbt run --docs
C) dbt compile --docs
D) dbt test --docs
Answer: A
Explanation: `dbt docs generate` builds the documentation artifacts (including `catalog.json`), and `dbt docs serve` starts a local web server to view them.
Question 37. What does the exposures YAML block enable you to do?
A) Define which models are materialized as tables.
B) Track the downstream assets (e.g., dashboards) that depend on a model.
C) Configure source freshness thresholds.
D) Set up custom Jinja macros.
Answer: B
Explanation: Exposures link models to downstream tools, allowing impact analysis when a model changes.
Question 38. Which dbt feature helps you to limit a run to only models that are directly impacted by a change in source data?
A) selector: state:modified
B) selector: tag:critical

A) The name of the current model file.
B) The active profile configuration (e.g., warehouse, schema).
C) The list of all downstream models.
D) The compiled SQL of the model.
Answer: B
Explanation: `target` contains connection information such as type, schema, database, and threads for the current run.
Question 42. Which of the following statements about dbt’s run_results.json is correct?
A) It contains the compiled SQL for each model.
B) It records the execution status, timing, and error messages for each node in a run.
C) It defines the project’s folder structure.
D) It stores source freshness metrics.
Answer: B
Explanation: `run_results.json` is generated after a run and includes details like success/failure, execution time, and any errors per node.
Question 43. In dbt, what is the purpose of the `pre-hook` configuration?
A) To execute SQL before a model runs, such as setting session variables.
B) To validate model contracts.
C) To generate documentation automatically.
D) To clone the warehouse schema.
Answer: A
Explanation: `pre-hook` runs specified SQL statements right before the model’s main query executes.
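The per-node details that Question 42 attributes to `run_results.json` can be summarized with a short script. This is a sketch under stated assumptions: the inline sample is a hypothetical, heavily trimmed artifact that keeps only the `results` list with `unique_id`, `status`, `execution_time`, and `message`, and the helper `summarize` is illustrative.

```python
import json

# Hypothetical, heavily trimmed run_results.json content.
SAMPLE = """
{
  "results": [
    {"unique_id": "model.shop.stg_orders", "status": "success",
     "execution_time": 1.2, "message": "OK"},
    {"unique_id": "model.shop.orders", "status": "error",
     "execution_time": 0.4, "message": "Database Error"},
    {"unique_id": "test.shop.not_null_orders_id", "status": "skipped",
     "execution_time": 0.0, "message": null}
  ]
}
"""


def summarize(run_results):
    """Count nodes per status, collect failed nodes, and find the slowest node."""
    counts = {}
    problems = []
    for result in run_results["results"]:
        status = result["status"]
        counts[status] = counts.get(status, 0) + 1
        if status in ("error", "fail"):
            problems.append((result["unique_id"], result["message"]))
    slowest = max(run_results["results"], key=lambda r: r["execution_time"])
    return counts, problems, slowest["unique_id"]


counts, problems, slowest = summarize(json.loads(SAMPLE))
print(counts, problems, slowest)
```

In a real project the JSON would be read from the artifact written to the target directory after a run, rather than from an inline string.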

Question 44. Which of these is NOT a valid dbt materialization option?
A) view
B) table
C) incremental
D) stored_procedure
Answer: D
Explanation: dbt does not have a stored_procedure materialization. Its built-in materializations are view, table, incremental, ephemeral, and materialized view; adapters and projects can also define custom ones.
Question 45. When you want to ensure that a column never contains NULL values, which generic test should you apply?
A) unique
B) not_null
C) accepted_values
D) relationships
Answer: B
Explanation: The `not_null` test fails if any row has a NULL in the specified column.
Question 46. Which dbt feature enables you to enforce that a column’s values belong to a predefined list?
A) accepted_values test
B) unique test
C) not_null test
D) relationship test

D) {{ adapter.get_columns_in_relation(ref('source_name')) }}
Answer: A
Explanation: The answer key gives A. Note that `source_columns()` is not a documented dbt macro; the documented way to retrieve column metadata for a relation is `adapter.get_columns_in_relation()`, called with a relation such as the one returned by `source()`.
Question 50. Which of the following best describes a “state” operation in dbt?
A) Running a model in a different schema.
B) Comparing the current manifest to a previous run’s manifest to detect changes.
C) Exporting data to a CSV file.
D) Creating a snapshot of a model.
Answer: B
Explanation: State operations (e.g., `state:modified`) allow dbt to detect what has changed since a prior run, enabling selective execution.
Question 51. In dbt, what is the purpose of the `on-run-start` hook?
A) To run SQL after every model finishes.
B) To execute SQL before any models in the run start, typically for session setup.
C) To generate documentation automatically.
D) To delete temporary tables after the run.
Answer: B
Explanation: `on-run-start` runs once at the beginning of the dbt run, useful for setting session variables or logging.
Question 52. Which of the following is a correct way to define a model that uses a Python script for transformation?
A) Create a .py file in the models/ directory and set materialization: python.
B) Use {{ python() }} macro inside a .sql model.
C) Place a .sql file with {{ config(materialized='python') }} and write Python code in a {{ python }} block.
D) Create a .py file in the models/ folder and set materialized: python in the model’s config.
Answer: D
Explanation: dbt Python models are .py files in the models/ directory that define a `model(dbt, session)` function returning a DataFrame; they run on platforms such as Snowflake (Snowpark), Databricks, and BigQuery. The answer key gives D, but `python` is not a documented `materialized` value: Python models are materialized as `table` or `incremental`.
Question 53. What does the `meta` key in a schema.yml file allow you to store?
A) Column data types.
B) Arbitrary key-value pairs for downstream tools or custom logic.
C) Model documentation.
D) Test definitions.
Answer: B
Explanation: `meta` is a free-form dictionary that can be accessed in macros or external systems for custom behavior.
Question 54. Which dbt command is used to validate connection settings and ensure the profile is correctly configured?
A) dbt compile
B) dbt debug
C) dbt run
D) dbt test
Answer: B