




























































































A practical engineering-oriented exam that validates skills in building resilient ETL/ELT frameworks, optimizing data distribution, Teradata utilities (TPT, BTEQ, FastLoad), query design, job scheduling, automation scripts, and performance tuning.





























































































Question 1. Which of the following best describes a relational database?
A) Stores data as key‑value pairs
B) Organizes data in tables with rows and columns and enforces relationships via keys
C) Stores data only in hierarchical files
D) Uses a flat file with no schema
Answer: B
Explanation: A relational database models data as tables (relations) where rows represent records and columns represent attributes, and relationships are defined using primary and foreign keys.

Question 2. In Teradata Vantage, what is the primary difference between a Primary Index (PI) and a Primary Key (PK)?
A) PI determines data distribution; PK enforces uniqueness only
B) PI is optional; PK must be defined for every table
C) PI can be composite; PK cannot be composite
D) There is no difference; they are synonyms
Answer: A
Explanation: The Primary Index controls how rows are hashed and distributed across AMPs, while the Primary Key is a logical constraint that ensures uniqueness but does not affect data distribution.

Question 3. Which schema is characterized by a single fact table surrounded by dimension tables with no further normalization?
A) Snowflake schema
B) Star schema
C) Galaxy schema
D) Third normal form
Answer: B
Explanation: A star schema places a central fact table directly linked to denormalized dimension tables, forming a star‑like shape.

Question 4. When would you choose a Snowflake schema over a Star schema in Teradata?
A) When query performance is the top priority
B) When you need to minimize storage by normalizing dimensions
C) When you have only a few dimensions
D) When you want to avoid using foreign keys
Answer: B
Explanation: Snowflake schemas normalize dimension tables to reduce redundancy and storage, at the cost of slightly more complex joins.

Question 5. Which of the following is NOT a typical characteristic of a Data Lake architecture?
A) Stores raw, unstructured data
B) Enforces strict schema-on-write
C) Supports multiple data formats (JSON, Parquet, etc.)
D) Allows low‑cost, high‑volume storage

Explanation: A view is a virtual table defined by a SELECT statement; it can be queried like a regular table.

Question 8. ClearScape Analytics primarily provides which capability?
A) Real‑time streaming ingestion
B) In‑database advanced analytics using R, Python, and SQL
C) Hardware‑level compression only
D) Network firewall management
Answer: B
Explanation: ClearScape Analytics enables running statistical and machine‑learning models directly inside the database using R, Python, and SQL.

Question 9. Which component of Vantage architecture is responsible for executing user queries?
A) Storage Node
B) Compute Cluster (AMP)
C) Front‑End Load Balancer
D) Object Store
Answer: B
Explanation: The Access Module Processors (AMPs) are the compute engines that execute the query steps against the rows they own. Parsing and optimization happen earlier, in the Parsing Engine (PE), which dispatches the resulting steps to the AMPs.

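To make the AMP's role in Question 9 more concrete, the following is a toy Python model of how rows might be assigned to AMPs by hashing the Primary Index value. It is a sketch under stated assumptions: `NUM_AMPS`, `amp_for`, and `distribute` are hypothetical names, and MD5 is only a stand-in for Teradata's actual row-hash function, which is not reproduced here.

```python
import hashlib
from collections import Counter

NUM_AMPS = 4  # hypothetical system size, chosen only for illustration


def amp_for(pi_value, num_amps=NUM_AMPS):
    """Map a Primary Index value to an AMP number.

    MD5 is an assumption standing in for the real row-hash function.
    """
    digest = hashlib.md5(str(pi_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_amps


def distribute(rows, pi_column, num_amps=NUM_AMPS):
    """Count how many rows each AMP would own for the chosen PI column."""
    counts = Counter({amp: 0 for amp in range(num_amps)})
    for row in rows:
        counts[amp_for(row[pi_column], num_amps)] += 1
    return counts


rows = [{"order_id": i, "country": "US"} for i in range(10_000)]

by_order_id = distribute(rows, "order_id")  # near-unique PI: rows spread out
by_country = distribute(rows, "country")    # constant PI: every row on one AMP

print("PI = order_id:", dict(by_order_id))
print("PI = country: ", dict(by_country))
```

In this model a near-unique column spreads the rows roughly evenly, while a column with a single value sends every row to one AMP, which is the skew that a good Primary Index choice avoids.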
Question 10. What is the purpose of Teradata Active System Management (TASM)?
A) To encrypt data at rest
B) To classify, prioritize, and throttle workloads to meet SLAs
C) To backup databases automatically
D) To generate data visualizations
Answer: B
Explanation: TASM classifies incoming queries, applies priorities, and can throttle resources to ensure service‑level agreements are met.

Question 11. Which scalability option allows Vantage to automatically add compute resources in response to workload spikes?
A) Static node allocation
B) Elastic scaling in the cloud
C) Manual node provisioning only
D) Fixed‑size on‑premise cluster
Answer: B
Explanation: Elastic scaling in a cloud environment lets Vantage provision additional compute nodes on demand.

Question 12. How does a Primary Index affect data distribution in Teradata?
A) It stores data in a single physical file
D) Eliminates need for Primary Index
Answer: C
Explanation: Row partitioning (for example, with RANGE_N or CASE_N partitioning expressions) enables the optimizer to skip entire partitions when predicates match the partitioning columns.

Question 15. Which locking level is applied when a SELECT statement reads data without modifying it?
A) Transaction lock
B) Request lock
C) Statement lock (read lock)
D) No lock is taken
Answer: C
Explanation: By default, a SELECT acquires a READ (shared) lock for the duration of the statement, allowing concurrent reads but blocking conflicting writes.

Question 16. In the data access layer, which component translates a user’s SQL request into an execution plan?
A) Optimizer
B) Storage Manager
C) Data Lake
D) Scheduler
Answer: A
Explanation: The optimizer parses the SQL, determines the best execution strategy, and produces the plan.

Question 17. Which data type is best suited for storing semi‑structured JSON documents in Teradata?
A) VARCHAR(255)
B) BLOB
C) JSON
D) CLOB
Answer: C
Explanation: Teradata provides a native JSON data type that supports indexing and functions specific to JSON.

Question 18. Which of the following statements about column‑level attributes is FALSE?
A) FORMAT defines display formatting for a column
B) DEFAULT provides a value when none is supplied
C) COMPRESS can be used to reduce storage for repeated values
D) COLUMN LEVEL constraints are ignored during bulk load
Answer: D
Explanation: Column‑level constraints (e.g., NOT NULL, CHECK) are still enforced during bulk load unless explicitly disabled.

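The behavior described in Question 18 can be sketched as a small Python model of a loader that fills in DEFAULT values and then enforces NOT NULL and CHECK rules on every incoming row. This is an illustration only, not how any Teradata utility is implemented: `load_rows` and the `columns` specification format are hypothetical.

```python
def load_rows(rows, columns):
    """Apply DEFAULT values, then enforce NOT NULL and CHECK rules per row.

    `columns` maps a column name to a dict with optional keys 'default',
    'not_null', and 'check' (a predicate). This structure is hypothetical
    and only mirrors the attributes named in Question 18.
    """
    accepted, rejected = [], []
    for row in rows:
        candidate = dict(row)
        errors = []
        for name, spec in columns.items():
            # DEFAULT supplies a value when none was provided.
            if candidate.get(name) is None and "default" in spec:
                candidate[name] = spec["default"]
            value = candidate.get(name)
            if value is None:
                if spec.get("not_null"):
                    errors.append(f"{name}: NOT NULL violated")
                continue  # a CHECK predicate is not applied to a NULL value
            check = spec.get("check")
            if check is not None and not check(value):
                errors.append(f"{name}: CHECK violated")
        if errors:
            rejected.append((row, errors))
        else:
            accepted.append(candidate)
    return accepted, rejected


columns = {
    "status": {"default": "NEW", "not_null": True},
    "qty": {"not_null": True, "check": lambda q: q > 0},
}
incoming = [
    {"qty": 5},                         # status filled by DEFAULT
    {"status": "SHIPPED", "qty": 0},    # fails CHECK (qty > 0)
    {"status": "OPEN", "qty": None},    # fails NOT NULL
]
accepted, rejected = load_rows(incoming, columns)
print(accepted)
print(rejected)
```

Rows that violate a rule are collected separately rather than loaded, which mirrors the point of the explanation: a bulk load does not silently bypass column-level constraints.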
Answer: C
Explanation: The CASCADE option propagates the delete operation to dependent rows.

Question 22. Which profiling activity helps determine the optimal data type for a column?
A) Analyzing data distribution and range of values
B) Counting the number of tables in the database
C) Measuring network latency
D) Checking user login times
Answer: A
Explanation: Profiling the range, precision, and cardinality of column values guides the selection of the most efficient data type.

Question 23. When should you collect statistics on a table in Teradata?
A) Only after loading data for the first time
B) Whenever significant data changes occur that could affect the optimizer’s estimates
C) Never; statistics are automatically generated
D) Only for tables larger than 1 TB
Answer: B
Explanation: Updating statistics after substantial data modifications ensures the optimizer makes accurate cost estimates.

Question 24. Which TPT operator is used for high‑speed bulk loading of data into a Teradata table?
A) EXPORT
B) UPDATE
C) LOAD
D) SELECT
Answer: C
Explanation: The LOAD operator streams data directly into the target table, leveraging parallelism for maximum throughput.

Question 25. In a TPT script, what is the purpose of the “APPLY” clause?
A) To define the target table
B) To specify the source file format
C) To map input columns to target columns and apply transformations
D) To set user authentication
Answer: C

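The idea behind answer C of Question 25, mapping input columns to target columns while applying transformations, can be sketched in Python. This is not TPT syntax and does not reproduce how an APPLY clause is written; `COLUMN_MAP`, `apply_mapping`, and the field names are hypothetical and serve only to illustrate the mapping step.

```python
from datetime import date

# Hypothetical mapping: target column -> function of the input record.
COLUMN_MAP = {
    "customer_id": lambda rec: int(rec["cust_id"]),
    "full_name": lambda rec: f'{rec["first"].strip()} {rec["last"].strip()}',
    "signup_date": lambda rec: date.fromisoformat(rec["signup"]),
}


def apply_mapping(records, column_map=COLUMN_MAP):
    """Yield one target row per input record, transforming each column."""
    for rec in records:
        yield {target: transform(rec) for target, transform in column_map.items()}


source_records = [
    {"cust_id": "42", "first": " Ada ", "last": "Lovelace", "signup": "2024-01-15"},
]
target_rows = list(apply_mapping(source_records))
print(target_rows)
```

Each target column is derived from the input record by its own small function, so renaming, type conversion, and cleanup all happen in the same mapping step.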
Question 28. When moving data from an S3 bucket into Vantage Lake, which step is essential?
A) Creating a Teradata user with no privileges
B) Defining an external table that references the S3 location
C) Disabling encryption on S3
D) Deleting all existing Vantage objects
Answer: B
Explanation: An external table points to the S3 objects, enabling SELECT/INSERT operations directly on the data.

Question 29. Which authentication method uses Kerberos tickets for user verification?
A) LDAP
B) LDAP+Kerberos
C) Password only
D) OAuth
Answer: B
Explanation: LDAP+Kerberos combines directory lookup with Kerberos ticket validation for secure authentication.

Question 30. In Vantage, which privilege is required to create a new database?
A) SELECT
B) CREATE DATABASE
Answer: B
Explanation: The CREATE DATABASE privilege explicitly allows a user to define a new database object.

Question 31. What does the “spool” space in Teradata refer to?
A) Permanent storage for tables
B) Temporary workspace for query execution results and intermediate data
C) Space reserved for user login logs
D) Disk space for backups
Answer: B
Explanation: Spool space is allocated per session to hold intermediate query results, sorts, and temporary tables.

Question 32. Which metric would you examine to identify a query that is causing high CPU usage?
A) Disk I/O latency
B) CPU time in the Explain plan (CPU per AMP)
C) Network packet loss
D) Number of users logged in

Question 35. In the context of ModelOps, what does “model versioning” refer to?
A) Storing multiple copies of the same model for backup only
B) Maintaining distinct, identifiable iterations of a model to track changes and facilitate rollback
C) Converting a model to a different programming language
D) Running the model on multiple servers simultaneously
Answer: B
Explanation: Model versioning tracks each iteration, allowing data scientists to compare performance and revert if needed.

Question 36. Which ClearScape Analytics language is best suited for deep learning tasks?
A) SQL
B) R
C) Python
D) Java
Answer: C
Explanation: Python, with libraries such as TensorFlow and PyTorch, is the primary language for deep learning within ClearScape.

Question 37. Which visualization type is most appropriate for showing the distribution of a single numeric variable?
A) Bar chart
B) Scatter plot
C) Histogram
D) Pie chart
Answer: C
Explanation: A histogram displays frequency counts of numeric ranges, revealing distribution shape.

Question 38. In statistical analysis, what does a p‑value indicate?
A) The probability that the null hypothesis is true
B) The probability of observing the data assuming the null hypothesis is true
C) The correlation coefficient between two variables
D) The size of the sample
Answer: B
Explanation: The p‑value is the probability of observing data at least as extreme as what was actually seen, assuming the null hypothesis is true. It is not the probability that the null hypothesis itself is true.

Question 39. Which Teradata utility is primarily used for exporting data from a table to a flat file?
A) BTEQ
B) FastLoad
C) MultiLoad
D) TPT EXPORT

Explanation: An exclusive lock blocks both reads and writes on the locked row until the transaction commits.

Question 42. Which of the following is a characteristic of a “Join Index” (JI)?
A) It stores the join result physically, reducing runtime join cost
B) It is always unique
C) It can only be defined on primary keys
D) It cannot be used in a SELECT statement
Answer: A
Explanation: A Join Index pre‑stores the join output, allowing queries to read from the index instead of performing the join each time.

Question 43. When would you use a “Secondary Index” (SI) instead of a “Primary Index”?
A) When the column is already the Primary Index
B) To improve query performance on a column that is not part of the Primary Index and has high selectivity
C) To enforce uniqueness only
D) When you want to replicate data across all AMPs
Answer: B
Explanation: A Secondary Index provides an alternate access path for high‑selectivity columns not covered by the PI.

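The "alternate access path" from Question 43 can be illustrated with a toy Python model: a table keyed by its Primary Index column, a full scan on a non-PI column, and a lookup structure that plays the role of a Secondary Index. This is a conceptual sketch, not a description of how Teradata stores index subtables; the table contents and the function names are hypothetical.

```python
from collections import defaultdict

# Toy table keyed by its Primary Index column (order_id).
orders = {
    1001: {"order_id": 1001, "email": "a@example.com", "total": 40},
    1002: {"order_id": 1002, "email": "b@example.com", "total": 15},
    1003: {"order_id": 1003, "email": "a@example.com", "total": 99},
    1004: {"order_id": 1004, "email": "c@example.com", "total": 7},
}


def full_scan(table, column, value):
    """Without an index on `column`, every row has to be examined."""
    return [row for row in table.values() if row[column] == value]


def build_secondary_index(table, column):
    """Map each value of a non-PI column to the PI values of matching rows."""
    index = defaultdict(list)
    for pi_value, row in table.items():
        index[row[column]].append(pi_value)
    return index


def lookup_via_index(table, index, value):
    """Use the index to jump straight to the matching rows by PI value."""
    return [table[pi_value] for pi_value in index.get(value, [])]


email_index = build_secondary_index(orders, "email")
via_scan = full_scan(orders, "email", "a@example.com")
via_index = lookup_via_index(orders, email_index, "a@example.com")
print(via_index)
```

Both paths return the same rows; the index version touches only the matching entries, which is why the benefit is greatest when the indexed column is highly selective.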
Question 44. Which of the following statements about “Dynamic Partition Elimination” (DPE) is true?
A) DPE works only for column‑partitioned tables
B) DPE allows the optimizer to skip scanning partitions that cannot satisfy the predicate at runtime
C) DPE requires manual hints in every query
D) DPE increases query execution time
Answer: B
Explanation: DPE automatically removes irrelevant partitions during execution, reducing I/O.

Question 45. In Teradata, what does the “FASTEXPORT” utility do?
A) Loads data into a table quickly
B) Extracts data from a table to a client in a high‑throughput, parallel manner
C) Rebuilds indexes automatically
D) Generates execution plans
Answer: B
Explanation: FASTEXPORT streams data out of Teradata using parallel processes, ideal for bulk exports.

Question 46. Which of the following is a recommended practice when designing a table to minimize data skew?
A) Choose a Primary Index column with uniform distribution of values
B) Use a constant value for the Primary Index