DAA B01 SnowPro Advanced Data Analyst Beta Practice Exam, Exams of Technology

A beta practice exam designed for advanced Snowflake data analysts. It assesses complex SQL development, performance tuning, data modeling, semi-structured data handling, Snowpark usage, and BI integration strategies. The exam emphasizes exploratory analysis, transformation design, and optimization for large-scale data environments.

Typology: Exams

2025/2026

Available from 01/08/2026

shilpi-jain-1
shilpi-jain-1 🇮🇳

4.2

(5)

29K documents

1 / 92

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
DAA B01 SnowPro Advanced Data Analyst Beta Practice
Exam
**Question 1.** Which Snowflake object is best suited for encapsulating reusable business logic that
can be called from multiple SQL statements?
A) View
B) Stored Procedure
C) UserDefined Function (UDF)
D) Materialized View
Answer: C
Explanation: A UDF allows you to define custom scalar or tablevalued logic that can be invoked in any
SELECT, making it ideal for reusable calculations.
**Question 2.** When loading JSON data from an external S3 stage using COPY INTO, which option
ensures that semistructured fields are automatically parsed into VARIANT columns?
A) FILE_FORMAT = (TYPE = 'CSV')
B) FILE_FORMAT = (TYPE = 'JSON')
C) ON_ERROR = 'SKIP_FILE'
D) FORCE = TRUE
Answer: B
Explanation: Specifying FILE_FORMAT TYPE=JSON tells Snowflake to treat each line as JSON and store it
in a VARIANT column.
**Question 3.** In Snowflake, which of the following statements about primary keys is true?
A) Snowflake enforces primary key uniqueness at load time.
B) Primary keys are only metadata and are not enforced.
C) Primary keys must be defined on clustered tables.
D) Primary keys automatically create a unique index.
Answer: B
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21
pf22
pf23
pf24
pf25
pf26
pf27
pf28
pf29
pf2a
pf2b
pf2c
pf2d
pf2e
pf2f
pf30
pf31
pf32
pf33
pf34
pf35
pf36
pf37
pf38
pf39
pf3a
pf3b
pf3c
pf3d
pf3e
pf3f
pf40
pf41
pf42
pf43
pf44
pf45
pf46
pf47
pf48
pf49
pf4a
pf4b
pf4c
pf4d
pf4e
pf4f
pf50
pf51
pf52
pf53
pf54
pf55
pf56
pf57
pf58
pf59
pf5a
pf5b
pf5c

Partial preview of the text

Download DAA B01 SnowPro Advanced Data Analyst Beta Practice Exam and more Exams Technology in PDF only on Docsity!

Exam

Question 1. Which Snowflake object is best suited for encapsulating reusable business logic that can be called from multiple SQL statements? A) View B) Stored Procedure C) User‑Defined Function (UDF) D) Materialized View Answer: C Explanation: A UDF allows you to define custom scalar or table‑valued logic that can be invoked in any SELECT, making it ideal for reusable calculations. Question 2. When loading JSON data from an external S3 stage using COPY INTO, which option ensures that semi‑structured fields are automatically parsed into VARIANT columns? A) FILE_FORMAT = (TYPE = 'CSV') B) FILE_FORMAT = (TYPE = 'JSON') C) ON_ERROR = 'SKIP_FILE' D) FORCE = TRUE Answer: B Explanation: Specifying FILE_FORMAT TYPE=JSON tells Snowflake to treat each line as JSON and store it in a VARIANT column. Question 3. In Snowflake, which of the following statements about primary keys is true? A) Snowflake enforces primary key uniqueness at load time. B) Primary keys are only metadata and are not enforced. C) Primary keys must be defined on clustered tables. D) Primary keys automatically create a unique index. Answer: B

Exam

Explanation: Snowflake treats primary keys as informational metadata; they are not enforced, but they help BI tools understand relationships. Question 4. Which Snowflake feature allows you to query a table as it existed 48 hours ago without restoring data? A) Zero‑Copy Clone B) Time Travel C) Data Retention Policy D) Fail‑Safe Answer: B Explanation: Time Travel enables querying historical data within the defined retention period (up to 90 days) without needing to restore. Question 5. You need to schedule a daily aggregation that runs at 2 AM UTC and writes results to a summary table. Which Snowflake object should you use? A) Stream B) Task C) Pipe D) External Table Answer: B Explanation: Tasks provide scheduled execution of SQL statements; they are ideal for daily batch aggregations. Question 6. Which join type returns all rows from the left table and matching rows from the right table, filling non‑matches with NULLs? A) INNER JOIN

Exam

Question 9. To enforce column‑level security that masks credit‑card numbers for non‑privileged users, you should implement: A) Row Access Policy B) Dynamic Data Masking C) Secure View D) Network Policy Answer: B Explanation: Dynamic Data Masking applies a masking expression to a column based on the role of the requester. Question 10. Which Snowflake function would you use to calculate the 90th percentile of a numeric column? A) PERCENT_RANK() B) PERCENTILE_CONT(0.9) C) MEDIAN() D) APPROX_PERCENTILE(0.9) Answer: B Explanation: PERCENTILE_CONT computes a continuous percentile; passing 0.9 returns the 90th percentile. Question 11. You need to create a clone of a production database for testing transformations without using additional storage. Which command accomplishes this? A) CREATE DATABASE test_clone CLONE prod_db; B) CREATE SCHEMA test_clone CLONE prod_schema; C) CREATE TABLE test_clone CLONE prod_table; D) CREATE VIEW test_clone AS SELECT * FROM prod_db;

Exam

Answer: A Explanation: CREATE DATABASE … CLONE creates a zero‑copy clone that shares storage until changes are made. Question 12. Which of the following best describes a Snowflake Stream? A) A continuous data ingestion pipeline from external sources. B) A change‑data‑capture (CDC) object that tracks DML changes on a table. C) A scheduled task that runs a SQL script. D) A view that materializes data for fast access. Answer: B Explanation: Streams capture inserts, updates, and deletes on a source table, enabling incremental processing. Question 13. When optimizing a query that scans a large fact table, which technique can most reduce the amount of data read? A) Increase the virtual warehouse size. B) Use RESULT_SCAN(). C) Apply partition pruning via a filter on a clustering key. D) Disable caching. Answer: C Explanation: Partition pruning (micro‑partition pruning) limits scanning to only relevant partitions based on filter predicates. Question 14. Which Snowflake object type guarantees that the underlying data cannot be accessed by unauthorized users, even if they have SELECT privileges on the underlying tables? A) View

Exam

Question 17. Which analytic function would you use to assign a rank to each row within a partition, restarting the rank for each new partition? A) ROW_NUMBER() B) RANK() C) DENSE_RANK() D) NTILE() Answer: B Explanation: RANK() provides ranking with gaps and restarts for each partition defined in the OVER clause. Question 18. When you need to expose a dataset to an external partner without copying the data, which Snowflake feature should you use? A) External Table B) Data Sharing C) Clone D) Export to CSV Answer: B Explanation: Snowflake Data Sharing lets you share live data securely without moving or copying it. Question 19. Which SQL clause can be used to limit the amount of data returned by a query to the first 100 rows? A) LIMIT 100 B) TOP 100 C) FETCH FIRST 100 ROWS ONLY D) Both A and C are valid in Snowflake

Exam

Answer: D Explanation: Snowflake supports both LIMIT and the ANSI FETCH syntax. Question 20. What is the primary benefit of using a Materialized View for a frequently run aggregation query? A) It automatically updates the source tables. B) It stores pre‑computed results, reducing query latency. C) It enforces row‑level security. D) It eliminates the need for a virtual warehouse. Answer: B Explanation: Materialized Views persist the results of the underlying query, allowing faster reads. Question 21. Which function would you use to extract the year component from a DATE column named order_date? A) YEAR(order_date) B) DATE_PART('YEAR', order_date) C) EXTRACT(YEAR FROM order_date) D) All of the above are supported in Snowflake Answer: D Explanation: Snowflake supports YEAR(), DATE_PART(), and EXTRACT() for date component extraction. Question 22. When performing a MERGE statement, which clause defines the condition that determines whether a source row matches a target row? A) WHEN MATCHED THEN B) ON

Exam

A) ALTER TABLE employees MODIFY COLUMN ssn SET MASKING POLICY ssn_mask; B) CREATE MASKING POLICY ssn_mask AS (VAL STRING) RETURNS STRING - > CASE WHEN CURRENT_ROLE() = 'ADMIN' THEN VAL ELSE CONCAT('XXX-XX-', RIGHT(VAL,4)) END; C) GRANT MASK ON COLUMN employees.ssn TO ROLE analyst; D) CREATE ROW ACCESS POLICY ssn_policy... Answer: B Explanation: A masking policy defines the expression applied to the column; the example masks all but the last four digits. Question 26. Which Snowflake system function returns the number of rows affected by the last DML statement? A) ROW_COUNT() B) RESULT_SCAN() C) LAST_QUERY_ID() D) SYSTEM$ROW_COUNT() Answer: D Explanation: SYSTEM$ROW_COUNT() reports the row count for the most recent INSERT/UPDATE/DELETE/MERGE. Question 27. When loading data with COPY INTO and you want rows that cause errors to be logged but not stop the load, which ON_ERROR setting should you use? A) ABORT_STATEMENT B) CONTINUE C) SKIP_FILE D) SKIP_FILE_NUM Answer: B

Exam

Explanation: ON_ERROR='CONTINUE' logs the offending rows and continues loading the rest of the file. Question 28. Which function would you use to generate a random integer between 1 and 100 in a SELECT statement? A) RANDOM() B) UNIFORM(1,100,0) C) RANDINT(1,100) D) SEQ4() Answer: B Explanation: UNIFORM generates a random value within a specified range; UNIFORM(1,100,0) returns an integer between 1 and 100. Question 29. In Snowflake, what does the keyword “CLUSTER BY” do when creating a table? A) It creates a primary key constraint. B) It defines a clustering key to improve pruning. C) It enforces uniqueness on the specified columns. D) It partitions the table into separate physical files. Answer: B Explanation: CLUSTER BY defines one or more columns used for automatic micro‑partition clustering, aiding query performance. Question 30. Which Snowflake function can be used to convert a semi‑structured VARIANT column containing an array into a set of rows? A) FLATTEN() B) SPLIT() C) TO_ARRAY()

Exam

B) CUMULATIVE_SUM(sales_amount) C) SUM(sales_amount) PARTITION BY transaction_date D) RUNNING_TOTAL(sales_amount) Answer: A Explanation: The window function SUM … OVER (ORDER BY …) computes a running total based on the specified order. Question 34. When you need to provide a BI tool with a dataset that refreshes every 15 minutes, which Snowflake feature is most appropriate to automate the refresh? A) Materialized View with AUTO_REFRESH = TRUE B) Task that runs the INSERT … SELECT every 15 minutes C) External Table that points to a CSV updated every 15 minutes D) Secure View with real‑time access Answer: B Explanation: A Task can be scheduled to run at a 15‑minute interval to refresh the staging table used by the BI tool. Question 35. Which of the following best explains the purpose of the RESULT_SCAN function? A) It scans the physical storage for corrupted files. B) It replays the result set of a previously executed query without re‑running it. C) It returns metadata about the last query plan. D) It clears the result cache. Answer: B Explanation: RESULT_SCAN(<query_id>) returns the exact result set of a prior query, leveraging the result cache.

Exam

Question 36. To enforce that a column “email” never contains NULL values, you would define the column as: A) email STRING NOT NULL B) email VARCHAR UNIQUE C) email STRING DEFAULT '' D) email STRING CONSTRAINT NOT_NULL Answer: A Explanation: Adding NOT NULL to the column definition prevents NULL insertion. Question 37. Which Snowflake function can be used to compute the geographic distance (in miles) between two latitude/longitude points? A) ST_DISTANCE() B) GEO_DISTANCE() C) Haversine() D) SPATIAL_DISTANCE() Answer: A Explanation: ST_DISTANCE takes two GEOGRAPHY objects and returns the distance, defaulting to meters; you can convert to miles. Question 38. In Snowflake, what does the “SECURE” keyword do when creating a view? A) It encrypts the view definition. B) It prevents the view from being dropped. C) It hides underlying table metadata from users without direct privileges. D) It forces the view to be materialized. Answer: C

Exam

D) CONSTRAINT_COLUMN_USAGE

Answer: C Explanation: KEY_COLUMN_USAGE contains information about columns participating in UNIQUE and PRIMARY KEY constraints. Question 42. Which Snowflake function can be used to parse a string containing ISO‑ 8601 timestamps into TIMESTAMP_TZ values? A) TO_TIMESTAMP_TZ() B) TO_DATE() C) PARSE_TIMESTAMP() D) CONVERT_TIMEZONE() Answer: A Explanation: TO_TIMESTAMP_TZ parses a string and attaches time‑zone information, supporting ISO‑8601 format. Question 43. When using the COPY INTO command with a CSV file that includes a header row, which option should you set to skip the header? A) SKIP_HEADER = 1 B) HEADER = TRUE C) FILE_FORMAT = (SKIP_HEADER=1) D) Both A and C are correct Answer: D Explanation: You can specify SKIP_HEADER either directly in COPY or inside the FILE_FORMAT definition. Question 44. Which of the following best describes the purpose of Snowflake’s “Result Cache” expiration policy?

Exam

A) Results are kept indefinitely. B) Results are invalidated when the underlying data changes. C) Results expire after 7 days regardless of data changes. D) Results are cleared when the virtual warehouse is suspended. Answer: B Explanation: The result cache is automatically invalidated if any source table is modified. Question 45. To calculate a moving average of sales over the previous 7 days for each product, which window frame clause would you use? A) ROWS BETWEEN 6 PRECEDING AND CURRENT ROW B) RANGE BETWEEN INTERVAL '7' DAY PRECEDING AND CURRENT ROW C) ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW D) RANGE BETWEEN 7 PRECEDING AND CURRENT ROW Answer: A Explanation: ROWS BETWEEN 6 PRECEDING AND CURRENT ROW provides a 7‑row (including current) moving window. Question 46. Which Snowflake object can you use to expose a reusable transformation that takes input parameters and returns a table? A) Table Function B) Stored Procedure C) View D) UDF (Scalar) Answer: A Explanation: Table Functions accept arguments and return a result set, ideal for parameterized transformations.

Exam

Explanation: CURRENT_TIMEZONE() yields the timezone setting for the current session. Question 50. To load data from a CSV file stored in an S3 bucket without creating a named stage, which syntax is appropriate? A) COPY INTO my_table FROM 's3://bucket/data.csv' CREDENTIALS=(AWS_KEY_ID='…' AWS_SECRET_KEY='…') FILE_FORMAT=(TYPE='CSV'); B) PUT file://data.csv @%my_table; COPY INTO my_table; C) CREATE STAGE tmp_stage URL='s3://bucket/'; COPY INTO my_table FROM @tmp_stage/data.csv; D) LOAD DATA FROM S3 's3://bucket/data.csv' INTO my_table; Answer: A Explanation: The COPY INTO command can reference an external location directly with credentials. Question 51. Which Snowflake function would you use to concatenate an array of strings into a single comma‑separated string? A) ARRAY_TO_STRING() B) LISTAGG() C) STRING_AGG() D) CONCAT_ARRAY() Answer: B Explanation: LISTAGG aggregates string values with a delimiter, suitable for array flattening. Question 52. When creating a table that stores JSON data, which column data type should you choose? A) STRING B) VARIANT C) OBJECT

Exam

D) ARRAY

Answer: B Explanation: VARIANT is Snowflake’s generic type for semi‑structured data such as JSON. Question 53. Which of the following is a benefit of using a Multi‑Cluster Warehouse with “MAX_CLUSTER_COUNT = 5” and “SCALING_POLICY = ECONOMY”? A) Guarantees zero‑latency query execution. B) Automatically adds clusters up to 5 when concurrency spikes, then releases them when idle. C) Forces all queries to run on a single cluster to reduce cost. D) Disables auto‑suspend for cost savings. Answer: B Explanation: ECONOMY scaling adds clusters as needed, up to the defined maximum, and shuts them down when idle. Question 54. Which Snowflake built‑in function can be used to calculate the percentile rank of a value within a result set? A) PERCENT_RANK() B) RANK_PERCENTILE() C) PERCENTILE_RANK() D) NTILE() Answer: A Explanation: PERCENT_RANK() returns the relative rank of each row as a value between 0 and 1. Question 55. To ensure that a Snowflake task runs only when the dependent stream has new data, you should set the task’s: A) WHEN = SYSTEM$STREAM_HAS_DATA('my_stream')