









































The dbt Labs Analytics Engineer Certification Ultimate Exam is a comprehensive study guide for data professionals working with dbt Labs tools. It covers data modeling, SQL transformations, testing, documentation, and analytics engineering best practices. Learners will gain hands-on understanding of building scalable data pipelines and managing data workflows. With scenario-based questions and detailed explanations, this ultimate exam prepares candidates to excel in certification and apply analytics engineering concepts in real-world data environments.










































Question 1. Which dbt macro is used to declare a dependency on another model within a SELECT statement?
A) source()
B) ref()
C) config()
D) var()
Answer: B
Explanation: The ref() macro tells dbt that the current model depends on another dbt model, so dbt can build the correct DAG and resolve the reference to that model's relation name at compile time.

Question 2. In dbt, what does the source() macro primarily accomplish?
A) It creates a temporary table.
B) It defines a seed file.
C) It references a raw data table defined in a .yml source configuration.
D) It runs a Python script.
Answer: C
Explanation: source() points to a raw table or view that is managed outside of dbt, enabling lineage tracking and freshness checks.

Question 3. Which materialization type stores the results of a model as a view that does not persist data on disk?
A) table
B) incremental
C) view
D) ephemeral
Answer: C
Explanation: A view materialization creates a virtual table; the underlying query executes each time the view is queried, and no data is stored permanently.

Question 4. When would you choose an incremental materialization over a full table refresh?
A) When the dataset is static and never changes.
B) When you need to recompute the entire dataset on every run.
C) When only new or changed rows need to be added to an existing table.
D) When you want to create a temporary table for testing.
Answer: C
Explanation: Incremental models process only new or changed rows on each run (matched to existing rows by unique_key when one is configured), which reduces compute cost on large tables.

Question 5. Which Jinja function is used to test whether a model is being run in an incremental context?
A) is_incremental()
B) is_table()
C) is_snapshot()
D) is_ephemeral()
Answer: A
Explanation: is_incremental() returns true when the model is materialized as incremental, the target table already exists, and the run is not a full refresh, so it can gate incremental-only logic such as a filter on new rows.

Question 6. In dbt, what is the purpose of a snapshot?
A) To store static CSV data.
B) To capture point-in-time versions of source data for Slowly Changing Dimensions.
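The incremental behaviour described in Questions 4 and 5 can be pictured with a small plain-Python model. This is only a sketch of the idea, not dbt code: the function name incremental_run, the dictionary-based "table", and the column names are all assumptions made for illustration.

```python
def incremental_run(existing, source_rows, unique_key, updated_col, full_refresh=False):
    """Return the table contents after one simulated run (illustrative only)."""
    # First run (no table yet) or full refresh: rebuild from every source row.
    if existing is None or full_refresh:
        return {row[unique_key]: row for row in source_rows}

    # Incremental run: keep only rows newer than what the table already holds.
    high_water_mark = max((r[updated_col] for r in existing.values()), default=None)
    new_rows = [
        r for r in source_rows
        if high_water_mark is None or r[updated_col] > high_water_mark
    ]

    # Upsert: rows with a known key replace the old version, others are added.
    merged = dict(existing)
    for row in new_rows:
        merged[row[unique_key]] = row
    return merged


day_1 = [
    {"id": 1, "status": "placed", "updated_at": 1},
    {"id": 2, "status": "placed", "updated_at": 1},
]
day_2 = [
    {"id": 1, "status": "placed", "updated_at": 1},
    {"id": 2, "status": "shipped", "updated_at": 2},
    {"id": 3, "status": "placed", "updated_at": 2},
]

table = incremental_run(None, day_1, "id", "updated_at")
table = incremental_run(table, day_2, "id", "updated_at")
```

On the second call only the two rows with updated_at = 2 are processed: row 2 is updated in place and row 3 is appended, while row 1 is left untouched.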
Question 9. Which configuration file defines the folder structure, default materializations, and naming conventions for a dbt project?
A) packages.yml
B) profiles.yml
C) dbt_project.yml
D) schema.yml
Answer: C
Explanation: dbt_project.yml holds project-level configurations such as resource paths and default materializations.

Question 10. In dbt, what does the grants configuration control?
A) The number of rows to process.
B) Permissions for the created objects in the warehouse.
C) The order of model execution.
D) The naming of seed files.
Answer: B
Explanation: The grants config applies warehouse privileges (for example, select) to the relations dbt builds for models, seeds, and snapshots.

Question 11. Which of the following is a generic test provided out-of-the-box by dbt?
A) custom_sql_test()
B) not_null()
C) validate_date_range()
D) check_nulls()
Answer: B
Explanation: dbt includes four generic tests: unique, not_null, accepted_values, and relationships.

Question 12. How would you define a custom generic test in dbt?
A) By writing a Python script in the models folder.
B) By creating a .sql file in the tests directory that uses the test macro.
C) By adding a custom_test: key to dbt_project.yml.
D) By using the source_freshness configuration.
Answer: B
Explanation: Custom generic tests are SQL files that wrap a query in a {% test %} block, conventionally under tests/generic/ (or in macros/), and are then referenced by name in a .yml properties file such as schema.yml.

Question 13. What does the store_failures configuration do when set to true?
A) It prevents any tests from running.
B) It writes failing rows to a table for later inspection.
C) It deletes all data that fails a test.
D) It logs failures to a JSON file only.
Answer: B
Explanation: store_failures: true saves the rows that failed a test to a table in the warehouse, aiding debugging and audit.

Question 14. Which parameter in a source freshness definition determines the threshold for raising a warning?
A) error_after
B) warn_after
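The four built-in generic tests from Question 11 all follow the same rule: the test query returns the offending rows, and the test passes when nothing comes back. The sketch below mimics that rule in plain Python. The function bodies and the orders/customers sample data are assumptions for illustration; the real tests are SQL queries run in the warehouse.

```python
from collections import Counter


def not_null(rows, column):
    # Failing rows: the column is NULL (None here).
    return [r for r in rows if r.get(column) is None]


def unique(rows, column):
    # Failing rows: a non-null value that appears more than once.
    counts = Counter(r[column] for r in rows if r.get(column) is not None)
    return [r for r in rows if counts.get(r.get(column), 0) > 1]


def accepted_values(rows, column, values):
    # Failing rows: a non-null value outside the allowed list.
    return [r for r in rows if r.get(column) is not None and r[column] not in values]


def relationships(child_rows, column, parent_rows, parent_column):
    # Failing rows: a non-null child value with no matching parent value.
    parents = {p[parent_column] for p in parent_rows}
    return [
        r for r in child_rows
        if r.get(column) is not None and r[column] not in parents
    ]


# Hypothetical sample data.
orders = [
    {"order_id": 1, "status": "placed", "customer_id": 10},
    {"order_id": 2, "status": "shipped", "customer_id": 11},
    {"order_id": 2, "status": "lost", "customer_id": None},
]
customers = [{"customer_id": 10}]

failures = {
    "not_null": not_null(orders, "customer_id"),
    "unique": unique(orders, "order_id"),
    "accepted_values": accepted_values(orders, "status", {"placed", "shipped"}),
    "relationships": relationships(orders, "customer_id", customers, "customer_id"),
}
```

Each list in failures plays the role of the table that store_failures (Question 13) would keep for inspection.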
Question 17. Which dbt feature allows you to track the downstream impact of a model on dashboards or ML pipelines?
A) exposures
B) seeds
C) snapshots
D) macros
Answer: A
Explanation: Exposures are defined in YAML to link models to downstream artifacts like dashboards, enabling impact analysis.

Question 18. How can you run only models that have changed since the last successful run?
A) dbt run --select state:modified
B) dbt run --full-refresh
C) dbt run --exclude state:modified
D) dbt run --models changed_only
Answer: A
Explanation: The state:modified selector picks nodes that differ from the manifest of a previous run, which is supplied to dbt with the --state flag.

Question 19. Which dbt artifact contains a complete list of all nodes, their dependencies, and metadata for a run?
A) run_results.json
B) manifest.json
C) catalog.json
D) sources.json
Answer: B
Explanation: manifest.json holds the full project graph, including models, tests, sources, exposures, and their relationships.

Question 20. When debugging a dbt model, where can you find the compiled SQL that dbt sends to the warehouse?
A) target/run/
B) target/compiled/
C) logs/dbt.log
D) profiles.yml
Answer: B
Explanation: The target/compiled/ directory contains the rendered SQL for each model.

Question 21. What is the primary benefit of using dbt's clone command?
A) To copy data between warehouses.
B) To create a zero-copy clone of a warehouse schema for isolated development.
C) To duplicate a model file.
D) To generate documentation automatically.
Answer: B
Explanation: dbt clone clones selected nodes from a given state into the target schema, using zero-copy cloning where the warehouse supports it (e.g., Snowflake), so developers can test changes without affecting production data.

Question 22. Which of the following is NOT a supported warehouse for dbt as of the latest release?
A) Snowflake
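Because manifest.json (Question 19) records the project graph, downstream impact can be read straight from it. The sketch below walks the child_map section of a manifest-shaped dictionary. The node IDs and the shop project name are invented for illustration, and a real manifest would be loaded from target/manifest.json rather than written inline.

```python
from collections import deque


def downstream_nodes(manifest, unique_id):
    """Collect every node reachable from unique_id via the manifest's child_map."""
    child_map = manifest["child_map"]
    seen = set()
    queue = deque([unique_id])
    while queue:
        current = queue.popleft()
        for child in child_map.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


# Hypothetical, heavily trimmed manifest: only the child_map section is shown.
manifest = {
    "child_map": {
        "model.shop.stg_orders": ["model.shop.orders"],
        "model.shop.orders": [
            "exposure.shop.weekly_dashboard",
            "test.shop.not_null_orders_id",
        ],
        "exposure.shop.weekly_dashboard": [],
        "test.shop.not_null_orders_id": [],
    }
}

impacted = downstream_nodes(manifest, "model.shop.stg_orders")
```

Here impacted lists the orders model, its test, and the dashboard exposure, which is the kind of impact analysis that exposures (Question 17) make possible.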
Question 25. Which configuration precedence is correct from highest to lowest?
A) dbt_project.yml > model config() > package defaults
B) model config() > dbt_project.yml > package defaults
C) package defaults > dbt_project.yml > model config()
D) dbt_project.yml > package defaults > model config()
Answer: B
Explanation: Settings defined with config() inside a model override project-level settings, which in turn override package defaults.

Question 26. What does the unique_key config option do for an incremental model?
A) It defines the column used to detect duplicates during upserts.
B) It sets the primary key for a view.
C) It determines which rows to delete on full refresh.
D) It specifies the order of row insertion.
Answer: A
Explanation: unique_key tells dbt how to match incoming rows to existing rows for incremental merges.

Question 27. Which dbt command validates that all model and test definitions compile without executing them?
A) dbt run
B) dbt test
C) dbt compile
D) dbt debug
Answer: C
Explanation: dbt compile renders all Jinja and writes the compiled SQL files without building anything in the warehouse. It surfaces Jinja and ref()/source() errors; SQL that is invalid for the warehouse is only caught when the model actually runs.

Question 28. In dbt, what is a seed file?
A) A Python script for ML models.
B) A static CSV file loaded into the warehouse as a table.
C) A macro library.
D) A snapshot definition.
Answer: B
Explanation: Seed files are CSVs placed in the seeds/ folder (data/ in versions before dbt 1.0); dbt loads them as tables using the seed command.

Question 29. How does dbt enforce schema contracts on a model?
A) By using the contract config in the model file.
B) By adding a schema.yml with tests: under the model.
C) By defining a model_contracts block in dbt_project.yml.
D) By setting enforce_contracts: true in profiles.yml.
Answer: A
Explanation: Adding config(contract={'enforced': true}) in the model file, or contract: {enforced: true} under the model in a .yml file, tells dbt to check the model's column names and data types against its declared columns before building it.

Question 30. Which of the following is a correct way to reference a column description in a dbt model's .yml file?
A) description: "Customer ID"
B) doc: "Customer ID"
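The precedence order in Question 25 amounts to layering dictionaries from least to most specific. The following sketch shows that idea only; resolve_config and the sample settings are hypothetical, and real dbt is more nuanced (for example, some keys such as tags and meta are merged rather than replaced).

```python
def resolve_config(package_defaults, project_config, model_config):
    """Layer configs so that later (more specific) layers win."""
    resolved = {}
    for layer in (package_defaults, project_config, model_config):
        resolved.update(layer)
    return resolved


# Hypothetical settings at each level.
package_defaults = {"materialized": "view", "schema": "analytics"}
project_config = {"materialized": "table"}        # from dbt_project.yml
model_config = {"materialized": "incremental"}    # from config() in the model

resolved = resolve_config(package_defaults, project_config, model_config)
```

The model ends up incremental because its own config() wins, while schema falls through from the package default since nothing more specific sets it.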
Question 33. Which macro would you use to generate a list of all models that are downstream of a given model?
A) downstream()
B) depends_on()
C) graph()
D) ref_dependencies()
Answer: A
Explanation: The answer key names a downstream() macro and attributes it to dbt_utils. Treat that with caution: downstream models are normally listed with the graph selector, for example dbt ls --select my_model+, where the trailing + includes everything that depends on the model.

Question 34. What is the effect of setting full_refresh: true on an incremental model run?
A) It skips the model entirely.
B) It drops and recreates the target table, recomputing all rows.
C) It only refreshes the newest partition.
D) It runs the model in a sandbox environment.
Answer: B
Explanation: A full refresh makes dbt drop the existing incremental table and rebuild it from scratch.

Question 35. In dbt, which file type is used to define model contracts, column types, and tests?
A) .sql
B) .yml
C) .json
D) .md
Answer: B
Explanation: Schema files (schema.yml or any .yml under the models directory) hold contracts, tests, and documentation.

Question 36. Which command will generate the interactive documentation site for a dbt project?
A) dbt docs generate && dbt docs serve
B) dbt run --docs
C) dbt compile --docs
D) dbt test --docs
Answer: A
Explanation: dbt docs generate builds the docs JSON, and dbt docs serve starts a local web server to view them.

Question 37. What does the exposures YAML block enable you to do?
A) Define which models are materialized as tables.
B) Track the downstream assets (e.g., dashboards) that depend on a model.
C) Configure source freshness thresholds.
D) Set up custom Jinja macros.
Answer: B
Explanation: Exposures link models to downstream tools, allowing impact analysis when a model changes.

Question 38. Which dbt feature helps you to limit a run to only models that are directly impacted by a change in source data?
A) selector: state:modified
B) selector: tag:critical
A) The name of the current model file.
B) The active profile configuration (e.g., warehouse, schema).
C) The list of all downstream models.
D) The compiled SQL of the model.
Answer: B
Explanation: target contains connection information such as type, schema, database, and threads for the current run.

Question 42. Which of the following statements about dbt's run_results.json is correct?
A) It contains the compiled SQL for each model.
B) It records the execution status, timing, and error messages for each node in a run.
C) It defines the project's folder structure.
D) It stores source freshness metrics.
Answer: B
Explanation: run_results.json is generated after a run and includes details like success/failure, execution time, and any errors per node.

Question 43. In dbt, what is the purpose of the pre-hook configuration?
A) To execute SQL before a model runs, such as setting session variables.
B) To validate model contracts.
C) To generate documentation automatically.
D) To clone the warehouse schema.
Answer: A
Explanation: pre-hook runs specified SQL statements right before the model's main query executes.
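As a rough illustration of what Question 42 describes, the sketch below parses a run_results.json-shaped document and pulls out the failing nodes and total execution time. The sample content is invented and heavily trimmed; in practice the file is read from the target/ directory and carries many more fields.

```python
import json

# Hypothetical, heavily trimmed run_results.json content.
SAMPLE = """
{
  "results": [
    {"unique_id": "model.shop.stg_orders", "status": "success",
     "execution_time": 1.5, "message": "OK"},
    {"unique_id": "model.shop.orders", "status": "success",
     "execution_time": 0.25, "message": "OK"},
    {"unique_id": "test.shop.not_null_orders_id", "status": "fail",
     "execution_time": 0.5, "message": "Got 2 results, configured to fail if != 0"}
  ]
}
"""


def summarize(run_results):
    """Report node count, total execution time, and any failed or errored nodes."""
    results = run_results["results"]
    problems = [
        (r["unique_id"], r["status"], r["message"])
        for r in results
        if r["status"] in ("error", "fail")
    ]
    return {
        "nodes": len(results),
        "total_seconds": sum(r["execution_time"] for r in results),
        "problems": problems,
    }


summary = summarize(json.loads(SAMPLE))
```

A script like this is a common starting point for CI checks or alerting, since each result carries its own status and timing.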
Question 44. Which of these is NOT a valid dbt materialization option?
A) view
B) table
C) incremental
D) stored_procedure
Answer: D
Explanation: dbt has no stored_procedure materialization; the built-in model materializations are view, table, incremental, and ephemeral (newer versions add materialized_view, and adapters or packages can define custom ones).

Question 45. When you want to ensure that a column never contains NULL values, which generic test should you apply?
A) unique
B) not_null
C) accepted_values
D) relationships
Answer: B
Explanation: The not_null test fails if any row has a NULL in the specified column.

Question 46. Which dbt feature enables you to enforce that a column's values belong to a predefined list?
A) accepted_values test
B) unique test
C) not_null test
D) relationship test
D) {{ adapter.get_columns_in_relation(ref('source_name')) }}
Answer: A
Explanation: The answer key attributes this to a source_columns() macro that returns column metadata for a given source definition. Verify that name before relying on it: the documented call for column metadata is adapter.get_columns_in_relation(), which takes a relation such as the result of source() or ref().

Question 50. Which of the following best describes a "state" operation in dbt?
A) Running a model in a different schema.
B) Comparing the current manifest to a previous run's manifest to detect changes.
C) Exporting data to a CSV file.
D) Creating a snapshot of a model.
Answer: B
Explanation: State operations (e.g., state:modified) allow dbt to detect what has changed since a prior run, enabling selective execution.

Question 51. In dbt, what is the purpose of the on-run-start hook?
A) To run SQL after every model finishes.
B) To execute SQL before any models in the run start, typically for session setup.
C) To generate documentation automatically.
D) To delete temporary tables after the run.
Answer: B
Explanation: on-run-start runs once at the beginning of the dbt run, useful for setting session variables or logging.

Question 52. Which of the following is a correct way to define a model that uses a Python script for transformation?
A) Create a .py file in the models/ directory and set materialization: python.
B) Use {{ python() }} macro inside a .sql model.
C) Place a .sql file with {{ config(materialized='python') }} and write Python code in a {{ python }} block.
D) Create a .py file in the models/ folder and set materialized: python in the model's config.
Answer: D
Explanation: dbt supports Python models on platforms such as Snowflake (Snowpark), Databricks, and BigQuery: a .py file in models/ defines a model(dbt, session) function that returns a DataFrame. Note that a Python model is materialized as table or incremental; python is not itself a materialization value, so option D is correct only in the sense of a .py file in models/ with its materialization set in config.

Question 53. What does the meta key in a schema.yml file allow you to store?
A) Column data types.
B) Arbitrary key-value pairs for downstream tools or custom logic.
C) Model documentation.
D) Test definitions.
Answer: B
Explanation: meta is a free-form dictionary that can be accessed in macros or external systems for custom behavior.

Question 54. Which dbt command is used to validate connection settings and ensure the profile is correctly configured?
A) dbt compile
B) dbt debug
C) dbt run
D) dbt test
Answer: B
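For Question 52, a Python model is a .py file in models/ that defines a function named model taking dbt and session. The sketch below shows that shape under stated assumptions: stg_orders is a hypothetical upstream model, and FakeDbt is an invented stand-in that lets the function be exercised locally. In a real project dbt supplies both arguments and the return value is a DataFrame.

```python
def model(dbt, session):
    # Python models build as "table" or "incremental".
    dbt.config(materialized="table")

    # dbt.ref() plays the same role as {{ ref() }} in a SQL model.
    orders = dbt.ref("stg_orders")  # hypothetical upstream model

    # A real model would transform the DataFrame here before returning it.
    return orders


class FakeDbt:
    """Hypothetical stand-in for the object dbt passes in, for local tryout only."""

    def __init__(self, relations):
        self.relations = relations
        self.settings = {}

    def config(self, **kwargs):
        self.settings.update(kwargs)

    def ref(self, name):
        return self.relations[name]


fake = FakeDbt({"stg_orders": [{"order_id": 1}]})
result = model(fake, session=None)
```

Only the model function would live in the project file; the stand-in class and the two lines after it exist purely to show the call pattern.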