




























































































Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
This exam prepares analysts for complex data exploration, modeling, analytics, and reporting tasks within Snowflake. It includes scenarios on advanced SQL analytics, windowing functions, materialized views, BI integration, dataset optimization, secure data sharing, and performance tuning for analytical workloads. Candidates practice interpreting business requirements, shaping analytical datasets, and designing efficient analytical solutions using Snowflake’s rich capabilities.
Typology: Exams
1 / 100
This page cannot be seen from the preview
Don't miss anything!





























































































Question 1. Which Snowflake object is best suited for loading a continuously arriving stream of JSON logs from an Amazon S3 bucket? A) Internal stage B) External stage with AUTO_REFRESH = TRUE C) Named pipe D) Snowpipe Answer: D Explanation: Snowpipe automatically ingests data from an external stage as soon as new files appear, making it ideal for continuous JSON log loading. Question 2. When assessing data volume for ingestion, which characteristic primarily influences the choice between using a bulk COPY command versus Snowpipe? A) Data velocity B) Data schema complexity C) Number of columns D) Data encryption Answer: A Explanation: High‑velocity (continuous) data favors Snowpipe for near‑real‑time ingestion, whereas bulk loads are better for large, infrequent batches. Question 3. Which Snowflake feature allows you to query a shared database without copying the data into your own account? A) Snowflake Marketplace B) Data Sharing (Secure Data Sharing) C) Clone
D) External Table Answer: B Explanation: Secure Data Sharing provides read‑only access to another account’s objects without data duplication. Question 4. You need to load a set of daily CSV files that follow the naming pattern sales_2023- 07 - ??.csv from an Azure Blob storage location. Which COPY option correctly selects these files? A) FILE_FORMAT = (TYPE = 'CSV') B) PATTERN = 'sales_2023- 07 - ...csv' C) PATTERN = 'sales_2023- 07 - .*.csv' D) MATCH_BY_COLUMN_NAME = CASE_SENSITIVE Answer: C Explanation: The PATTERN regular expression sales_2023- 07 - .*\.csv matches any file with that prefix and a .csv extension. Question 5. During a COPY operation, you encounter the error “Invalid UTF‑8 byte sequence”. Which action will most likely resolve the issue? A) Increase the warehouse size B) Use the FILE_FORMAT option ENCODING = 'UTF8' C) Set ON_ERROR = 'CONTINUE' D) Add a compression option Answer: B
Answer: A Explanation: COALESCE returns the first non‑null argument, allowing you to substitute the column average for nulls. Question 9. Which Snowflake scalar function would you use to extract the year component from a TIMESTAMP column? A) DATE_PART('YEAR', column) B) YEAR(column) C) EXTRACT(YEAR FROM column) D) TO_CHAR(column, 'YYYY') Answer: A Explanation: DATE_PART('YEAR', …) returns the integer year from a timestamp value. Question 10. Which window function assigns a unique sequential number to rows within a partition based on the order of a sales amount column, starting at 1? A) RANK() B) ROW_NUMBER() C) DENSE_RANK() D) NTILE()
Answer: B Explanation: ROW_NUMBER() provides a unique sequential integer for each row in the defined order. Question 11. When flattening a JSON array column, which table function should you use? A) SPLIT_TO_TABLE B) FLATTEN C) PARSE_JSON D) ARRAY_AGG Answer: B Explanation: FLATTEN expands semi‑structured arrays or objects into relational rows. Question 12. Which system function returns the name of the current Snowflake role executing the query? A) CURRENT_USER() B) CURRENT_ROLE() C) CURRENT_ACCOUNT() D) SESSION_ROLE() Answer: B Explanation: CURRENT_ROLE() reports the active role for the session. Question 13. You need to create a reusable calculation that converts a string to uppercase and trims whitespace. Which Snowflake object is most appropriate? A) View
Explanation: The AT clause with a timestamp (or offset) accesses a historical version via Time Travel. Question 16. To generate a random sample of 5% of rows from a large fact table, which expression is most efficient? A) SELECT * FROM table WHERE RAND() < 0. B) SELECT * FROM TABLE(SAMPLE(5)) C) SELECT * FROM table SAMPLE (5 PERCENT) D) SELECT * FROM table WHERE RANDOM() < 0. Answer: C Explanation: The SAMPLE clause (e.g., SAMPLE (5 PERCENT)) is optimized for random row sampling. Question 17. Which grouping extension allows you to produce subtotals for each combination of region and product category in a single query? A) GROUP BY ROLLUP(region, product_category) B) GROUP BY CUBE(region, product_category) C) GROUP BY GROUPING SETS((region, product_category), (region), (product_category)) D) All of the above Answer: D Explanation: ROLLUP, CUBE, and GROUPING SETS can all generate the required subtotals; each provides a different level of granularity. Question 18. You need to classify customers into “High”, “Medium”, or “Low” value based on total spend using a CASE expression. Which construct is correct?
A) CASE WHEN total_spend > 10000 THEN 'High' WHEN total_spend > 5000 THEN 'Medium' ELSE 'Low' END B) IF(total_spend > 10000, 'High', IF(total_spend > 5000, 'Medium', 'Low')) C) DECODE(total_spend, >10000, 'High', >5000, 'Medium', 'Low') D) SELECT CLASSIFY(total_spend) Answer: A Explanation: The standard CASE WHEN syntax evaluates conditions sequentially and returns the appropriate label. Question 19. Which Snowflake feature can be used to revert a table to its state before a mistaken DELETE operation performed 12 hours ago? A) UNDROP TABLE B) TIME TRAVEL with AT BEFORE (DELETE_TIME) C) CLONE the table and drop the clone D) MATERIALIZED VIEW refresh Answer: B Explanation: Time Travel lets you query the table “AT BEFORE” a specific timestamp, effectively restoring the pre‑delete state. Question 20. When modeling a data warehouse for BI, which schema design is most appropriate for fast slice‑and‑dice analysis on sales data? A) Star schema B) Snowflake schema C) Data vault D) Normalized OLTP schema
Question 23. In the Query Profile, which tab provides a visual representation of the execution steps and their relative cost? A) History B) Details C) Graph D) Timeline Answer: C Explanation: The Graph tab shows the query plan DAG with node costs, aiding performance troubleshooting. Question 24. Which caching mechanism stores the result set of a query for up to 24 hours, allowing subsequent identical queries to be served instantly? A) Metadata Cache B) Result Cache C) Warehouse Cache D) Local Disk Cache Answer: B Explanation: Result Cache retains query results and can be reused without recomputation for up to 24 hours. Question 25. Partition pruning in Snowflake is most effective when: A) The table is clustered on a column used in the WHERE clause B) The table has a primary key defined C) The warehouse size is X‑Large
D) The query uses a SELECT * Answer: A Explanation: Clustering on a column referenced in filters enables Snowflake to prune micro‑partitions, reducing scanned data. Question 26. Which object automatically refreshes its data based on changes in the underlying base table and can improve query performance for frequently accessed aggregates? A) View B) Materialized View C) Temporary Table D) Clone Answer: B Explanation: Materialized Views store pre‑computed results and refresh incrementally, speeding up repeated aggregate queries. Question 27. To compute an approximate distinct count of users on a massive event table, which Snowflake function provides the most efficient solution? A) COUNT(DISTINCT user_id) B) APPROX_COUNT_DISTINCT(user_id) C) HLL_COUNT.MERGE(HLL(user_id)) D) SELECT DISTINCT user_id Answer: B Explanation: APPROX_COUNT_DISTINCT uses HyperLogLog internally for fast, memory‑efficient approximate distinct counts.
D) Heat grid Answer: B Explanation: Bar charts effectively display categorical bucket counts or sums, making them ideal for distribution analysis. Question 31. When connecting Power BI to Snowflake, which authentication method provides the highest security without storing credentials in the client? A) Username/Password B) OAuth C) SSO via Okta (SAML) D) Key Pair Authentication Answer: C Explanation: SSO (SAML) delegates authentication to an identity provider, avoiding credential storage in Power BI. Question 32. Row Access Policies in Snowflake are evaluated: A) At load time only B) At query compile time for each row returned C) Only for materialized views D) Only for external tables Answer: B Explanation: Row Access Policies are applied during query execution, filtering rows per the policy logic.
Question 33. Which Snowflake object enables you to automatically run a SQL statement when new data arrives in a stage? A) Task B) Stream C) Pipe D) Alert Answer: C Explanation: A Pipe defines a Snowpipe that triggers loading when files appear in a stage. Question 34. To capture changes (inserts, updates, deletes) on a source table for downstream analytics, which Snowflake feature should you enable? A) Time Travel B) Streams C) Tasks D) Cloning Answer: B Explanation: Streams track DML changes on a table, providing a change data capture (CDC) mechanism. Question 35. Which of the following statements about Snowflake’s Virtual Warehouses is FALSE? A) They can be sized independently of each other B) They share compute resources across all warehouses by default
Explanation: TRY_TO_TIMESTAMP safely attempts conversion and returns NULL on failure, suitable for VARIANT values. Question 38. When creating a clone of a large table for data cleaning, which of the following statements is true? A) The clone consumes the same amount of storage as the source immediately B) The clone is read‑only until the source is dropped C) Changes to the clone do not affect the source table D) Clones cannot be created on tables larger than 1 TB Answer: C Explanation: Clones are independent logical copies; modifications to the clone are isolated from the source. Question 39. Which Snowflake function returns the number of elements in a JSON array stored in a VARIANT column? A) ARRAY_SIZE(variant_column) B) JSON_LENGTH(variant_column) C) GET_ARRAY_LENGTH(variant_column) D) COUNT(variant_column) Answer: A Explanation: ARRAY_SIZE returns the count of elements in a VARIANT that holds an array. Question 40. To enforce that every row in a table has a non‑null, unique email address, which combination is recommended? A) PRIMARY KEY(email)
B) UNIQUE(email) NOT NULL C) FOREIGN KEY(email) REFERENCES other_table(email) D) CHECK(email IS NOT NULL) Answer: B Explanation: UNIQUE ensures uniqueness, and NOT NULL guarantees presence; together they enforce the desired rule. Question 41. Which statement about Snowflake’s RESULT_SCAN function is correct? A) It can retrieve results of any query in the account history B) It only works for queries executed within the last 24 hours C) It requires the query ID as an argument D) It bypasses all caching mechanisms Answer: C Explanation: RESULT_SCAN takes a query ID and returns the result set of that specific query. Question 42. When you need to generate a series of dates from 2023‑ 01 ‑01 to 2023‑ 01 ‑ 31 for a calendar table, which Snowflake function is most convenient? A) SEQUENCE_GENERATOR B) GENERATOR(ROWCOUNT => 31) C) DATEADD(day, SEQ4(), '2023- 01 - 01') D) SPLIT_TO_TABLE Answer: C Explanation: Using DATEADD with SEQ4() creates incremental dates efficiently.
C) Automatic Refresh (built‑in) D) Manual REFRESH MATERIALIZED VIEW command Answer: C Explanation: Materialized Views have built‑in automatic incremental refresh; no external task is required. Question 46. You need to calculate the median sales amount per region. Which combination of functions will give the correct result? A) PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY sales) B) MEDIAN(sales) OVER (PARTITION BY region) C) APPROX_PERCENTILE(sales, 0.5) D) All of the above Answer: D Explanation: All three approaches compute the median; PERCENTILE_CONT is exact, MEDIAN is a shortcut, APPROX_PERCENTILE provides an approximation. Question 47. Which of the following statements about Snowflake’s external tables is FALSE? A) They query data stored in external cloud storage without loading it B) They support push‑down predicates to limit data read C) They can be updated directly with DML statements D) They require a defined file format Answer: C
Explanation: External tables are read‑only; DML cannot modify the underlying external files. Question 48. Which function would you use to safely parse a string column that may contain malformed JSON into a VARIANT, returning NULL for invalid rows? A) PARSE_JSON(string) B) TRY_PARSE_JSON(string) C) FROM_JSON(string) D) JSON_EXTRACT(string) Answer: B Explanation: TRY_PARSE_JSON returns NULL when parsing fails, preventing query errors. Question 49. When using the SPLIT_TO_TABLE function, what is the result? A) A single string with delimiters removed B) A set of rows, each containing one token from the split string C) A JSON array of tokens D) A temporary table stored in the warehouse Answer: B Explanation: SPLIT_TO_TABLE returns a result set where each row holds one token from the original delimited string. Question 50. Which Snowflake feature allows you to query a table as it existed at a specific point in time without creating a clone? A) CLONE AT B) TIME TRAVEL with AT clause