

















































































The Data Analyst Exam certifies analytical skills for extracting insights from large datasets within Nuix Discover. Topics include advanced searching, filtering, analytics dashboards, pattern detection, trend analysis, and visualization. Candidates demonstrate the ability to transform raw digital evidence into actionable intelligence for investigations and compliance reviews.
Typology: Exams


















































































Question 1. In the Nuix Discover Analytics workspace, which pane directly displays the list of documents that match the current filter?
A) Visualization pane
B) Document list pane
C) Analytics tab
D) Map view pane
Answer: B
Explanation: The Document list pane shows the actual documents returned by the active filter, while the Visualization pane renders charts and the Analytics tab holds the analytical tools.

Question 2. Which permission is required for a user to access the “Map” feature in Nuix Discover?
A) View Only
B) Analyst
C) Map Access
D) Administrator
Answer: C
Explanation: “Map Access” is a specific permission that enables the Map clustering tool; other roles may not include this capability.

Question 3. When customizing a workspace view, which metadata column can be added to the document list to aid in timeline analysis?
A) File hash
B) Custodian
C) Creation date
D) File size
Answer: C
Explanation: Adding the Creation date column allows analysts to sort and filter documents chronologically, which is essential for timeline investigations.

Question 4. In an OLAP cube, what is the term for the categorical field that defines rows, such as “Custodian”?
A) Measure
B) Dimension
C) Hierarchy
D) Filter
Answer: B
Explanation: Dimensions are the categorical axes of a cube; “Custodian” would be a dimension, while counts or sizes would be measures.

Question 5. Which of the following actions creates a new cube in the Discover interface?
A) Right‑click the Document list and select “Create Cube”
B) Click the “New Cube” button on the Analytics tab
C) Drag a dimension onto the Visualization pane
D) Use the “Export” menu and choose “Cube”
Answer: B
Explanation: The Analytics tab contains the “New Cube” button that initiates the cube‑building wizard.

Question 6. When building a cube, selecting “File Type” as a dimension primarily helps an analyst to:
A) Identify duplicate files
B) Determine the most common document formats in the collection
C) Locate encrypted files
C) Index concepts based on semantic similarity
D) Require manual tagging of documents
Answer: C
Explanation: Mines generate conceptual indexes that capture ideas and topics, not just literal text strings.

Question 10. Which confidence metric in a Mine indicates the likelihood that a document truly belongs to the identified concept?
A) Relevance Score
B) Frequency Count
C) Confidence Percentage
D) Weighting Factor
Answer: C
Explanation: Confidence Percentage reflects the algorithm’s certainty that the document matches the conceptual cluster.

Question 11. When reviewing a Mine, an analyst notices a low‑confidence cluster of “financial transactions.” The appropriate next step is to:
A) Delete the cluster
B) Increase the confidence threshold and re‑run the Mine
C) Manually verify documents within the cluster
D) Export the cluster to a separate case
Answer: C
Explanation: Low confidence warrants manual validation to confirm relevance before any further action.

Question 12. In the Mine Map view, a “parent” node represents:
A) The most frequent keyword in the dataset
B) A broader concept that contains several sub‑concepts
C) The custodian who authored the most documents
D) The oldest document in the collection
Answer: B
Explanation: Parent nodes are higher‑level concepts; child nodes are more specific sub‑concepts derived from the parent.

Question 13. Which action in the Mine interface allows you to view all documents associated with a specific concept node?
A) Double‑click the node
B) Right‑click and choose “Export”
C) Drag the node onto the Document list pane
D) Hover to see a tooltip
Answer: A
Explanation: Double‑clicking a concept node opens a filtered view of all documents linked to that concept.

Question 14. The Map feature groups documents based on:
A) File extension similarity
B) Conceptual similarity derived from text analytics
C) File size proximity
D) Creation date intervals
Answer: B
Explanation: Map clusters are formed by measuring conceptual similarity, not by simple file attributes.
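The contrast in Question 14 between conceptual similarity and simple file attributes can be illustrated with a toy example. The sketch below is not Nuix Discover's clustering algorithm, which the source does not describe; it swaps in plain bag-of-words cosine similarity as a stand-in. The sample documents, file names, and function names are all hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Lowercase the text and count word occurrences (a bag-of-words vector)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents: two share a topic (despite different file types),
# one is unrelated.
docs = {
    "email_1.msg": "Budget approval for Project Phoenix is pending",
    "memo_7.pdf": "Project Phoenix budget needs approval this week",
    "note_3.txt": "Lunch menu for the cafeteria on Friday",
}
vectors = {name: term_vector(text) for name, text in docs.items()}

same_topic = cosine_similarity(vectors["email_1.msg"], vectors["memo_7.pdf"])
other_topic = cosine_similarity(vectors["email_1.msg"], vectors["note_3.txt"])
print(f"email vs memo: {same_topic:.2f}, email vs note: {other_topic:.2f}")
```

The email and the PDF score higher against each other than the email does against the unrelated note, even though their file extensions differ, which is the behavior the answer describes.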
Explanation: The outlier detection overlay highlights documents that deviate from the cluster’s typical characteristics, suggesting possible miscoding.

Question 18. The “Quick Code” option on the Map is best suited for:
A) Assigning a new custodian tag to a single document
B) Applying the same tag to all documents in a selected cluster with one click
C) Generating a summary report of cluster contents
D) Exporting the cluster to an external review platform
Answer: B
Explanation: Quick Code streamlines mass coding by applying the chosen tag to every document in the cluster instantly.

Question 19. Which of the following is NOT a typical use case for the “Unsweep” function?
A) Reversing a previous Sweep action
B) Removing a tag from an entire cluster
C) Deleting the cluster from the case
D) Restoring documents to the unreviewed pool
Answer: C
Explanation: Unsweep only changes coding status; it does not delete clusters from the case.

Question 20. In a social network graph, a node with the highest degree (most connections) is commonly referred to as:
A) Hub node
B) Leaf node
C) Bridge node
D) Isolated node
Answer: A
Explanation: A hub node connects to many other nodes, indicating a “top talker” in communication analysis.

Question 21. Which filter would you apply to view only email communications exchanged between two specific custodians?
A) Entity = “Email” AND Date > 2020
B) Sender = Custodian A AND Recipient = Custodian B
C) Domain = “example.com”
D) Communication Type = “Chat”
Answer: B
Explanation: Filtering by both Sender and Recipient isolates the exact email exchange between the two custodians.

Question 22. The “Concept Cloud” for a person in Discover primarily displays:
A) A list of all file types they have created
B) The most frequent words and concepts appearing in their communications
C) Their total storage consumption
D) The number of meetings they attended
Answer: B
Explanation: Concept Clouds visualize dominant topics and keywords in a person’s messages.

Question 23. To limit a social network analysis to the month of March 2023, which filter element is most appropriate?
A) Date >= “2023‑03‑01” AND Date <= “2023‑03‑31”
B) Year = 2023
Question 26. In a cube, which measure would you select to see the total data volume contributed by each custodian?
A) Document Count
B) File Size Sum
C) Unique Keywords
D) Email Count
Answer: B
Explanation: “File Size Sum” aggregates the byte size of all documents per custodian, showing data volume.

Question 27. Which of the following best describes a “drill‑through” operation in a cube?
A) Exporting the entire cube to a PDF file
B) Opening the underlying document list for a selected cell
C) Merging two dimensions into a single axis
D) Refreshing the cube with new data sources
Answer: B
Explanation: Drill‑through lets the analyst view the raw documents that make up the aggregated cell.

Question 28. A Mine returns a concept with a confidence of 92% and a relevance score of 0.45. What does the relevance score indicate?
A) Percentage of total documents in the case belonging to the concept
B) Strength of the concept’s relationship to the query terms
C) Number of custodians associated with the concept
D) Average file size of documents in the concept
Answer: B
Explanation: Relevance scores quantify how closely the concept matches the user’s search or query intent.

Question 29. To improve the precision of a Mine that is returning many unrelated documents, an analyst should:
A) Increase the confidence threshold
B) Decrease the number of dimensions in the cube
C) Enable “Exact Match” mode in the Mine settings
D) Export the Mine results and manually filter them
Answer: A
Explanation: Raising the confidence threshold filters out low‑confidence matches, tightening precision.

Question 30. In the Map view, clusters are colored based on:
A) File extension type
B) Date of creation
C) Conceptual similarity score
D) Custodian ownership
Answer: C
Explanation: Color coding reflects the similarity metric used to form each cluster.

Question 31. Which action can be used to merge two adjacent clusters that represent the same concept in the Map?
A) Drag one cluster onto the other and select “Merge”
B) Use the “Combine” button in the toolbar after selecting both clusters
C) Right‑click a cluster and choose “Absorb Neighbor”
D) Merging is not supported; clusters must remain separate
C) Export cubes directly to PowerBI
D) Generate automated review scripts
Answer: B
Explanation: The Filter Builder is used to construct advanced filter expressions for cubes and document lists.

Question 35. Which view would you use to quickly assess the distribution of document types across all custodians?
A) Map view
B) Cube with dimensions Custodian (rows) and File Type (columns)
C) Mine concept list
D) Social network graph
Answer: B
Explanation: A cube with Custodian and File Type dimensions provides a matrix of document type counts per custodian.

Question 36. The “Export to CSV” function in a cube view includes:
A) Only the visible rows and columns
B) All underlying raw data regardless of visibility
C) A summary of measures but no dimension values
D) The visual styling of the cube
Answer: A
Explanation: Exporting to CSV writes the current view (visible rows/columns) to a flat file.

Question 37. Which metric on a Mine concept indicates the proportion of total documents that the concept covers?
A) Confidence Percentage
B) Coverage Ratio
C) Relevance Score
D) Frequency Count
Answer: B
Explanation: Coverage Ratio measures how much of the overall dataset the concept represents.

Question 38. An analyst notices that a Map cluster contains a mix of emails and PDFs that share the phrase “Project Phoenix.” What does this suggest about the clustering algorithm?
A) It groups solely by file type
B) It uses conceptual similarity based on shared text
C) It clusters by creation date
D) It relies on file size similarity
Answer: B
Explanation: The presence of different file types with a common phrase shows that the algorithm clusters on shared concepts.

Question 39. In the social network interface, what does an edge thickness represent?
A) The total size of files exchanged between two nodes
B) The frequency or volume of communications between the two entities
C) The number of distinct concepts discussed
D) The duration of the relationship
Answer: B
Explanation: Thicker edges indicate a higher volume or frequency of messages exchanged.
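Edge thickness (Question 39) and hub nodes (Question 20) both reduce to simple counts over communication records. The following is a minimal sketch with hypothetical sender and recipient names; it illustrates the graph concepts only and makes no claim about how Nuix Discover computes them.

```python
from collections import Counter

# Hypothetical (sender, recipient) pairs, one per message.
messages = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("alice", "bob"),
    ("alice", "carol"),
    ("dave", "alice"),
]

# Edge weight: number of messages between two people, ignoring direction.
# A thicker edge in a graph view would correspond to a larger weight.
edge_weights = Counter(frozenset(pair) for pair in messages)

# Degree: number of distinct people each person communicates with.
neighbors = {}
for sender, recipient in messages:
    neighbors.setdefault(sender, set()).add(recipient)
    neighbors.setdefault(recipient, set()).add(sender)
degree = {person: len(others) for person, others in neighbors.items()}

hub = max(degree, key=degree.get)                    # highest-degree node
thickest = max(edge_weights, key=edge_weights.get)   # heaviest edge
print("hub:", hub, "| thickest edge:", sorted(thickest), edge_weights[thickest])
```

Here the hub is the person connected to the most distinct correspondents, while the thickest edge is the pair with the most messages; the two measures answer different questions and need not point at the same people.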
Question 43. Which of the following best describes the purpose of “Dimension Hierarchies” in a cube?
A) To group dimensions into logical parent‑child relationships for drill‑down analysis
B) To assign colors to each dimension automatically
C) To encrypt sensitive dimension data
D) To limit the number of rows displayed in the cube
Answer: A
Explanation: Hierarchies enable multi‑level aggregation (e.g., Year > Quarter > Month) for progressive drill‑down.

Question 44. A Mine returns a concept labeled “Travel Expenses” with a confidence of 78% and a relevance of 0.62. An analyst should interpret this as:
A) The concept is highly reliable and should be coded as privileged
B) The concept is moderately confident and fairly relevant; manual review is advisable
C) The concept is irrelevant and can be ignored
D) The concept only appears in PDF files
Answer: B
Explanation: Confidence below 80% suggests caution; a relevance of 0.62 indicates a decent match, so manual verification is recommended.

Question 45. Which feature allows an analyst to tag a whole Map cluster as “Privileged” with a single click?
A) Sweep → Privileged
B) Quick Code → Privileged
C) Export → Apply Tag
D) Merge → Set Privilege
Answer: B
Explanation: Quick Code provides rapid mass‑tagging; selecting “Privileged” applies it to every document in the cluster.

Question 46. In the social network view, a “community detection” algorithm is used to:
A) Identify clusters of nodes that interact more frequently with each other than with the rest of the network
B) Assign a unique color to each node based on file type
C) Sort nodes alphabetically by custodian name
D) Remove all isolated nodes from the graph
Answer: A
Explanation: Community detection finds tightly knit groups, revealing sub‑networks of frequent communication.

Question 47. To limit a cube’s data to documents created after January 1, 2022, which filter syntax is correct?
A) CreatedDate > “2022‑01‑01”
B) DateCreated >= 2022‑01‑01
C) Created >= “2022/01/01”
D) CreationDate => 2022‑01‑01
Answer: A
Explanation: The correct syntax combines the CreatedDate field with the > operator and a quoted date: CreatedDate > “YYYY‑MM‑DD”.

Question 48. Which of the following visualizations is NOT available directly from the cube’s Visualization pane?
Question 51. In the Analytics workspace, the “Pin” icon next to a dimension allows you to:
A) Freeze the dimension so it remains visible while scrolling
B) Export the dimension to a separate file
C) Delete the dimension from the cube
D) Convert the dimension into a measure
Answer: A
Explanation: Pinning keeps the selected dimension column fixed in the view.

Question 52. Which of the following best explains why a Mine might return a high‑confidence concept that appears in only a few documents?
A) The concept is rare but highly distinctive, leading to strong confidence scores
B) The algorithm is malfunctioning and needs to be reset
C) Confidence scores are unrelated to document frequency
D) The concept is automatically generated for each custodian
Answer: A
Explanation: Rare but uniquely identifying language can produce high confidence despite low coverage.

Question 53. To visualize the evolution of a concept over time, an analyst should:
A) Create a cube with dimensions Concept and Date, then plot a line chart
B) Use the Map’s time‑slider feature
C) Export the Mine to PowerBI and add a timeline
D) Apply a filter on the document list for the concept and sort by size
Answer: A
Explanation: Building a cube with Concept and Date enables temporal aggregation, which can be charted as a line graph.
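The temporal aggregation described in Question 53 amounts to counting documents per (concept, date bucket) cell. As a rough sketch outside the product, assuming hypothetical records and monthly buckets, the same two-dimension cube with a document-count measure could be built like this:

```python
from collections import Counter
from datetime import date

# Hypothetical (concept, document date) records.
records = [
    ("Travel Expenses", date(2023, 1, 12)),
    ("Travel Expenses", date(2023, 1, 30)),
    ("Travel Expenses", date(2023, 3, 2)),
    ("Contracts", date(2023, 1, 5)),
    ("Contracts", date(2023, 2, 17)),
]

# Cube cell = (concept, month); the measure is a document count.
cube = Counter((concept, d.strftime("%Y-%m")) for concept, d in records)

months = sorted({month for _, month in cube})
concepts = sorted({concept for concept, _ in cube})

# One series per concept, ready to plot as a line chart; empty cells count as 0.
series = {c: [cube[(c, m)] for m in months] for c in concepts}
for concept, counts in series.items():
    print(concept, dict(zip(months, counts)))
```

Each entry in `series` is one line on the chart, with the sorted months as the shared x-axis.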
Question 54. In the Map, what does a “cluster density” indicator represent?
A) Number of files per cluster
B) Average file size within the cluster
C) Similarity score among documents in the cluster
D) Number of custodians represented
Answer: C
Explanation: Density reflects how tightly the documents are conceptually related.

Question 55. Which of the following actions can be performed directly from the social network’s “Concept Cloud” panel?
A) Export the list of top concepts to CSV
B) Merge two nodes in the network graph
C) Apply a privilege tag to all related documents
D) Re‑run the Mine algorithm on the selected node
Answer: A
Explanation: The Concept Cloud panel includes an export option for the displayed concepts.

Question 56. When a cube’s measure shows a negative number, the most likely cause is:
A) An error in the data ingestion process
B) The use of a calculated measure that subtracts values (e.g., Net Change)
C) Corrupt visualization settings
D) The presence of encrypted files
Answer: B
Explanation: Negative numbers typically arise from calculated measures that perform subtraction.
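To see why a subtraction-based measure such as the Net Change mentioned in Question 56 can legitimately go negative, consider this small sketch. The custodian names, counts, and the exact definition of Net Change (current period minus previous period) are assumptions made for illustration.

```python
# Hypothetical document counts per custodian for two periods.
previous = {"alice": 120, "bob": 80, "carol": 45}
current = {"alice": 150, "bob": 60, "carol": 45}

# Calculated measure (assumed definition): Net Change = current - previous.
net_change = {name: current[name] - previous[name] for name in previous}

for name, delta in net_change.items():
    print(f"{name}: {delta:+d}")
```

A custodian whose count dropped between periods yields a negative value even though every underlying count is non-negative, so a negative cell alone does not indicate bad data.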