PrepIQ Informatica Data Engineering 10 2 Developer Professional Ultimate Exam, Exams of Technology

Technology

Focuses on validating skills in Informatica Data Engineering Integration using DEI/BDM 10.2. Topics include big data integration, mass ingestion, transformations, mappings, workflows, Hadoop ecosystem integration, Spark engine optimization, pushdown execution, and real-time streaming pipelines. The practice exam simulates real-world processing tasks such as ingesting large datasets, applying transformations efficiently, debugging jobs, monitoring performance, and tuning pipelines in distributed environments.

Typology: Exams

2025/2026

Available from 04/28/2026

shilpi-jain-3 🇮🇳

**Question 1. Which component of the Hadoop ecosystem is primarily responsible for storing large data sets across a cluster?**

A) MapReduce

B) YARN

C) HDFS

D) Hive

Answer: C

Explanation: HDFS (Hadoop Distributed File System) provides scalable, fault-tolerant storage by distributing data blocks across multiple nodes.
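To make the idea of distributing blocks across nodes concrete, here is a minimal Python sketch. It is an illustration only: the 128 MiB block size and replication factor of 3 are assumed values, the node names are hypothetical, and the round-robin placement is a simplification (real HDFS placement is rack-aware).

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size (128 MiB)
REPLICATION = 3                 # assumed replication factor


def place_blocks(file_size, nodes, block_size=BLOCK_SIZE, replication=REPLICATION):
    """Return {block_index: [node, ...]} using simple round-robin placement.

    This is NOT the real HDFS placement policy; it only shows that a file
    is cut into fixed-size blocks and each block is stored on several nodes.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    n_blocks = math.ceil(file_size / block_size)
    placement = {}
    for b in range(n_blocks):
        # Each replica of a block goes to a different node.
        placement[b] = [nodes[(b + r) % len(nodes)] for r in range(replication)]
    return placement


# Hypothetical 300 MiB file on a four-node cluster.
layout = place_blocks(300 * 1024 * 1024, ["node1", "node2", "node3", "node4"])
for block, replicas in layout.items():
    print(block, replicas)
```

Because every block lives on three different nodes in this sketch, losing any single node still leaves two copies of each block, which is the fault-tolerance property the explanation refers to.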

**Question 2. In the context of YARN, what does the ResourceManager do?**

A) Executes map and reduce tasks on nodes

B) Schedules containers and arbitrates resources across the cluster

C) Stores metadata about HDFS files

D) Provides a SQL-like query interface

Answer: B

Explanation: The YARN ResourceManager is the central authority that allocates resources (CPU, memory) to applications by managing containers.
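The arbitration role can be pictured with a toy model. The sketch below is an assumption-laden illustration, not YARN's API: the class name `ToyResourceManager`, the first-fit policy, and the node and application names are all hypothetical, and real YARN delegates the policy to pluggable schedulers.

```python
class ToyResourceManager:
    """Tracks free memory per node and hands out containers first-fit."""

    def __init__(self, node_memory_mb):
        # node name -> free memory in MB
        self.free = dict(node_memory_mb)

    def allocate(self, app_id, memory_mb):
        """Return (app_id, node, memory_mb) for a granted container, else None."""
        for node, free in self.free.items():
            if free >= memory_mb:
                self.free[node] = free - memory_mb
                return (app_id, node, memory_mb)
        return None  # no node has enough free memory


rm = ToyResourceManager({"n1": 4096, "n2": 2048})
c1 = rm.allocate("app1", 3072)  # fits on n1
c2 = rm.allocate("app2", 2048)  # n1 has only 1024 MB left, so n2
c3 = rm.allocate("app3", 2048)  # nothing large enough remains
print(c1, c2, c3)
```

The point of the sketch is only that a single component sees the whole cluster's free capacity and decides which request gets a container where.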

**Question 3. Which Informatica engine enables the execution of transformations on Spark?**

A) Blaze Engine

B) Smart Executor

C) Polyglot Computing Engine

D) Data Integration Engine

Answer: C

Explanation: Informatica’s Polyglot Computing Engine abstracts underlying processing engines, allowing mappings to run on Spark when configured.

**Question 4. The Smart Executor in Informatica primarily provides which benefit?**

A) Automatic code generation for Java

B) Parallel execution of transformation logic on the target database

C) Dynamic allocation of compute resources based on workload

D) Real-time data profiling

Answer: C

Explanation: Smart Executor monitors runtime metrics and dynamically scales resources, optimizing performance for large data jobs.

**Question 5. Which layer of the Informatica abstraction model isolates developers from underlying Hadoop infrastructure?**

A) Physical Data Object layer

B) Logical Mapping layer

C) Integration Service layer

D) Abstraction Layer

Answer: D

Explanation: The Informatica abstraction layer abstracts Hadoop details, allowing developers to design mappings without handling low-level HDFS or YARN specifics.

**Question 6. When creating a Physical Data Object (PDO) for a Hive table, which property must be defined to enable partition pruning?**

A) Primary key

B) Partition columns

C) Data type of each column

D) File format

Answer: B

Explanation: Defining partition columns lets Informatica generate queries that prune unnecessary partitions, improving performance.
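Partition pruning (Question 6) can be sketched in a few lines of Python. This is a conceptual illustration only: the table layout, the date-valued partition keys, and the `scan` helper are hypothetical and do not reflect how Informatica or Hive implement pruning internally.

```python
# Hypothetical partitioned table: partition value -> rows stored in that partition.
table = {
    "2024-01-01": [{"id": 1, "amount": 10}],
    "2024-01-02": [{"id": 2, "amount": 20}, {"id": 3, "amount": 30}],
    "2024-01-03": [{"id": 4, "amount": 40}],
}


def scan(table, wanted_partitions=None):
    """Return (rows, partitions_read).

    When wanted_partitions is given, partitions outside it are skipped
    without reading their rows, which is the essence of pruning.
    """
    rows, partitions_read = [], 0
    for part, part_rows in table.items():
        if wanted_partitions is not None and part not in wanted_partitions:
            continue  # pruned: this partition is never opened
        partitions_read += 1
        rows.extend(part_rows)
    return rows, partitions_read


full_rows, full_read = scan(table)
pruned_rows, pruned_read = scan(table, {"2024-01-02"})
print(full_read, pruned_read)
```

A filter on the partition column lets the reader open one partition instead of three here; without a declared partition column there is nothing to prune on, so every partition would be read.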

**Question 10. Which lookup mode provides the best performance for large reference data sets in a Hadoop environment?**

A) Connected, active

B) Connected, passive

C) Unconnected, active

D) Unconnected, passive

Answer: B

Explanation: A connected passive lookup loads the reference data once and keeps it in memory, reducing repeated I/O on large Hadoop tables.

**Question 11. In a dynamic mapping, what does the “Dynamic Port” feature allow?**

A) Automatic generation of target tables

B) Handling of schema changes without redesigning the mapping

C) Real-time monitoring of row counts

D) Encryption of data at rest

Answer: B

Explanation: Dynamic ports adapt to schema drift, enabling the mapping to process new or altered source columns without manual changes.

**Question 12. Which parameter type is evaluated only once at the start of a workflow execution?**

A) Mapping variable

B) Workflow variable

C) Object parameter with “runtime” scope

D) Parameter file entry

Answer: B

Explanation: Workflow variables are set before the workflow starts and remain constant throughout its execution.

**Question 13. Which task in Workflow Manager is used to pause a workflow until a specific file appears in a directory?**

A) Event Task

B) Timer Task

C) Decision Task

D) Command Task

Answer: A

Explanation: An Event Task can be configured to wait for a file-based event, such as the arrival of a trigger file.

**Question 14. When deploying an application, which Informatica service is responsible for executing the mapping logic?**

A) Repository Service

B) Integration Service

C) Domain Service

D) Monitoring Service

Answer: B

Explanation: The Integration Service runs the data integration jobs, including mappings and workflows, on the designated runtime engine.

**Question 15. In Big Data Streaming (BDS), which component provides the ability to process continuous data streams using micro-batches?**

A) Kafka Connect

B) Spark Streaming Engine

C) Flume Agent

D) Hive Streaming

Answer: C

Explanation: ORC (Optimized Row Columnar) stores data in a columnar layout, offering high compression and fast query performance.

**Question 19. In Informatica, what does the “Pushdown Optimization” option do for a mapping executed on Hadoop?**

A) Forces all logic to run on the client machine

B) Pushes eligible transformation logic to the database or Hadoop engine for execution

C) Compresses source files before loading

D) Generates a Java source file for debugging

Answer: B

Explanation: Pushdown Optimization offloads compatible transformations to the underlying engine (e.g., Hive, Spark), reducing data movement.

**Question 20. Which transformation would you use to assign a sequential numeric value to each row in a mapping?**

A) Sequence Generator

B) Rank

C) Sorter

D) Filter

Answer: A

Explanation: The Sequence Generator produces a monotonically increasing number for each row, useful for surrogate keys.

**Question 21. When profiling data in the Developer tool, which metric indicates the percentage of null values in a column?**

A) Distinct Count

B) Null Count

C) Min Value

D) Average Length

Answer: B

Explanation: Null Count reports the number of rows where the column value is null; dividing it by the total row count gives the null percentage.

**Question 22. Which of the following best describes the role of the Repository Service?**

A) Executes data integration jobs

B) Manages metadata storage and versioning

C) Provides user authentication for the domain

D) Monitors workflow performance

Answer: B

Explanation: The Repository Service stores and manages metadata objects (mappings, sessions, workflows) and handles version control.

**Question 23. In a workflow, what does the “Decision Task” evaluate?**

A) Time-based triggers

B) File existence

C) Boolean expressions based on workflow variables

D) Completion status of previous tasks

Answer: C

Explanation: Decision Tasks use expressions that reference workflow variables to determine the next path in the workflow.

**Question 24. Which of the following is a key characteristic of a “connected” lookup transformation?**

A) It can be used only in the source qualifier

B) It passes data rows through the lookup regardless of match status

C) It requires a separate mapping call to retrieve lookup values

C) Real-time schema validation against a central repository

D) Encryption of schema metadata

Answer: B

Explanation: Dynamic Schema lets a source adapt to changes such as added, removed, or reordered columns, facilitating schema drift handling.

**Question 28. Which of the following best describes a “parameter file” in Informatica?**

A) A file that stores workflow logs

B) A file containing key-value pairs for runtime variable substitution

C) A file used to define security policies

D) A file that holds transformation code snippets

Answer: B

Explanation: Parameter files supply values for parameters and variables at runtime, enabling environment-specific configurations.

**Question 29. When using a “Union” transformation, what must be true about the input ports?**

A) All inputs must have the same number of ports and matching data types

B) Input ports can have different data types and will be auto-converted

C) Only one input can be connected at a time

D) Union can only be used with flat file sources

Answer: A

Explanation: Union merges rows from multiple pipelines, requiring each input to have identical port structures for consistent output.

**Question 30. What is the primary purpose of the “Transaction Control” transformation?**

A) To enforce data type conversions

B) To group rows into transactions for commit/rollback control

C) To generate surrogate keys

D) To perform row-level security checks

Answer: B

Explanation: Transaction Control allows you to define commit, rollback, or disconnect points within a mapping, giving fine-grained transaction management.

**Question 31. Which of the following statements about “Incremental Loading” is correct?**

A) It always overwrites the entire target table

B) It loads only rows that have changed since the last load, typically using a high-water mark column

C) It requires a full table scan of the source each run

D) It can only be implemented with flat file sources

Answer: B

Explanation: Incremental loads use a change indicator (e.g., timestamp, version) to fetch only new or updated rows, reducing data volume.

**Question 32. In the context of NoSQL data sources, which Informatica connector is used to read from MongoDB?**

A) Hadoop Connector

B) NoSQL Connector

C) MongoDB Native Connector

D) JSON Connector

Answer: C

Explanation: The MongoDB Native Connector provides direct read/write capabilities for MongoDB collections.
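The high-water mark approach to incremental loading described in Question 31 can be sketched as follows. This is a minimal illustration under stated assumptions: the `incremental_extract` function, the `updated_at` column name, and the sample rows are hypothetical, and a real pipeline would persist the mark between runs.

```python
from datetime import datetime


def incremental_extract(source_rows, last_mark, mark_column="updated_at"):
    """Return (changed_rows, new_mark).

    Only rows whose mark column is strictly greater than last_mark are
    extracted; the new mark is the largest value seen, or the old mark
    when nothing changed.
    """
    changed = [r for r in source_rows if r[mark_column] > last_mark]
    new_mark = max((r[mark_column] for r in changed), default=last_mark)
    return changed, new_mark


# Hypothetical source table.
rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 2)},
    {"id": 3, "updated_at": datetime(2024, 1, 3)},
]

# First run: everything after 2 January is new.
changed, mark = incremental_extract(rows, datetime(2024, 1, 2))
print([r["id"] for r in changed], mark)

# Second run with the stored mark: nothing new, mark unchanged.
changed_again, mark_again = incremental_extract(rows, mark)
print(changed_again, mark_again)
```

The strict comparison assumes the mark column increases with every change; if several rows can share the same timestamp as the stored mark, a tie-breaking column or an inclusive comparison with de-duplication would be needed.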

Explanation: Command Tasks run operating-system commands or scripts on the host where the Integration Service resides.

**Question 36. Which of the following best describes “Schema Drift” in big data environments?**

A) Gradual performance degradation of a Hadoop cluster

B) Changes in source data structure (e.g., added columns) over time

C) Loss of metadata due to repository corruption

D) Increase in data volume beyond cluster capacity

Answer: B

Explanation: Schema drift refers to evolving source schemas that can break static mappings unless handled dynamically.

**Question 37. Which Informatica transformation can be used to rank rows based on a numeric column and keep only the top N rows?**

A) Sorter

B) Rank

C) Aggregator

D) Filter

Answer: B

Explanation: Rank assigns a rank number to rows based on a sort order and can limit output to a specified rank range.

**Question 38. When configuring a Kafka source in a streaming mapping, which property defines the offset reset behavior?**

A) bootstrap.servers

B) group.id

C) auto.offset.reset

D) key.deserializer

Answer: C

Explanation: The auto.offset.reset property determines where the consumer starts reading if no previous offset is found (earliest or latest).

**Question 39. Which of the following is NOT a valid execution mode for an Informatica mapping on a Hadoop cluster?**

A) Spark

B) Blaze

C) Hive

D) MapReduce only (without any higher-level engine)

Answer: D

Explanation: While MapReduce is the underlying engine, Informatica abstracts execution through Spark, Blaze, or Hive; you do not select “MapReduce only” directly.

**Question 40. In the Developer tool, what does the “Preview” button do for a source definition?**

A) Executes the entire mapping and writes to the target

B) Retrieves a sample of source rows for quick inspection

C) Generates the SQL code for the source query

D) Validates the mapping syntax only

Answer: B

Explanation: Preview fetches a limited number of rows from the source, allowing developers to verify column data and types.

**Question 41. Which of the following statements about “Pushdown Optimization Full” is correct?**

A) It pushes only filter conditions to the source database

B) It pushes all eligible transformation logic, including joins and aggregations, to the source engine

C) It disables all pushdown and forces local execution

D) 1025

Answer: A

Explanation: With a start value of 1000 and an increment of 5, the sequence runs 1000, 1005, 1010. After two increments, the third value is 1000 + 2 × 5 = 1010, which is option A.

**Question 45. Which transformation can be used to calculate a running total across rows in a mapping?**

A) Aggregator with group by set to none and a cumulative expression

B) Filter

C) Rank

D) Joiner

Answer: A

Explanation: An Aggregator without a group-by clause can compute cumulative expressions using the “running total” syntax.

**Question 46. In a streaming mapping, which component is responsible for maintaining state across micro-batches?**

A) Kafka Producer

B) Spark Structured Streaming checkpoint

C) Hive Metastore

D) HDFS NameNode

Answer: B

Explanation: Checkpointing in Spark Structured Streaming preserves state (offsets, aggregations) between micro-batches.
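The sequence arithmetic and the running-total idea from the questions above can be checked with a short Python sketch. It illustrates the arithmetic only; the `sequence` helper and the sample amounts are hypothetical and say nothing about how the Sequence Generator or Aggregator transformations are implemented.

```python
from itertools import accumulate, count, islice


def sequence(start, increment):
    """Yield start, start + increment, start + 2 * increment, ..."""
    return count(start, increment)


# Start value 1000, increment 5: the n-th value is 1000 + (n - 1) * 5.
first_three = list(islice(sequence(1000, 5), 3))
print(first_three)  # [1000, 1005, 1010]

# Running total: each output is the sum of all inputs seen so far.
amounts = [10, 20, 30]
running_total = list(accumulate(amounts))
print(running_total)  # [10, 30, 60]
```

The third generated value is 1010 because only two increments have been applied by then, which is the off-by-one point the explanation turns on.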

**Question 47. Which of the following is a valid reason to use a “Connected” lookup instead of an “Unconnected” lookup?**

A) When you need to reuse the same lookup logic in multiple mappings

B) When you want to pass additional columns downstream without extra expressions

C) When the lookup source is a flat file

D) When you need to perform a self-join on the source

Answer: B

Explanation: Connected lookups automatically forward rows downstream, allowing you to enrich data without extra mapping logic.

**Question 48. What does the “High-Water Mark” technique rely on for incremental loads?**

A) A checksum of the entire source table

B) A column (often timestamp or ID) that monotonically increases with each change

C) The total row count of the source

D) The size of the source file

Answer: B

Explanation: High-water mark columns indicate the latest processed value, enabling the extraction of only newer rows.

**Question 49. Which of the following statements about the “Smart Executor” is FALSE?**

A) It can automatically switch between Spark and Hive based on workload

B) It monitors runtime metrics to adjust resource allocation

C) It provides built-in data profiling during execution

D) It works only with on-premise Hadoop clusters

Answer: B

Explanation: Parallelism creates multiple partitions of the source data, allowing concurrent processing to improve throughput.

**Question 53. Which of the following is a primary benefit of using the “Hive” execution mode for a mapping?**

A) Real-time processing of streaming data

B) Ability to leverage existing HiveQL queries and tables

C) Automatic generation of Java code for custom logic

D) Direct write to NoSQL stores without transformation

Answer: B

Explanation: Hive execution translates mapping logic into HiveQL, allowing reuse of existing Hive tables and queries.

**Question 54. Which transformation can be used to split rows into multiple output groups based on a condition, without discarding any rows?**

A) Filter

B) Router

C) Aggregator

D) Joiner

Answer: B

Explanation: Router evaluates each group's filter condition and passes a row to every group whose condition it meets; rows that meet no condition go to the default group, so no rows are discarded.

**Question 55. What does the “Cache Type” property of a lookup define?**

A) Whether the cache is stored in memory, on disk, or both

B) The data type of the lookup key column

C) The number of rows to cache per batch

D) The timeout for cache refresh

Answer: A

Explanation: Cache Type determines if the lookup cache resides entirely in memory, on disk, or uses a hybrid approach.

**Question 56. Which of the following is NOT a valid source type for a Physical Data Object in Informatica?**

A) Relational database

B) Hadoop HDFS file

C) Kafka topic

D) FTP server directory

Answer: D

Explanation: Files on an FTP server can be read through a file connection, but an FTP server directory is not itself a distinct Physical Data Object type, which makes D the least accurate choice.

**Question 57. In a workflow, which task can be used to pause execution for a specific amount of time?**

A) Timer Task

B) Event Task

C) Decision Task

D) Command Task

Answer: A

Explanation: Timer Tasks introduce a delay based on a defined interval before proceeding to the next task.

**Question 58. Which of the following best explains the term “Polyglot” in Informatica’s computing engines?**

A) Ability to translate code into multiple programming languages

B) Support for executing mappings on different processing engines (Spark, Hive, etc.)
