CertDA Certificate in Data Analytics Practice Exam, Exams of Technology

This exam tests data acquisition, statistical analysis, data visualization, predictive modeling, and data-driven decision-making. Topics include SQL, Excel analytics, Python/R fundamentals, machine learning basics, dashboard creation, and KPI development. Candidates interpret datasets, build analytical models, design visual reports, and explain insights aligned with business objectives. Real-world case scenarios require optimizing decisions using data mining, forecasting, and exploratory analysis.

Typology: Exams

2025/2026

Available from 01/14/2026

shilpi-jain-1
CertDA Certificate in Data Analytics Practice Exam
**Question 1.** In the CRISP-DM framework, which phase focuses on translating the business problem into a data-mining goal?
A) Data Understanding
B) Business Understanding
C) Data Preparation
D) Deployment
Answer: B
Explanation: Business Understanding is the first CRISP-DM step, where project objectives and data-mining goals are defined.
**Question 2.** Which of the following activities belongs to the Data Understanding phase?
A) Building predictive models
B) Collecting initial data sets
C) Deploying the model in production
D) Cleaning and transforming data
Answer: B
Explanation: Data Understanding involves gathering initial data and exploring its characteristics.
**Question 3.** During Data Preparation, which process is primarily responsible for handling
missing values?
A) Feature selection
B) Data cleaning
C) Model evaluation
D) Business case definition
Answer: B
Explanation: Data cleaning addresses missing, inconsistent, or erroneous data before modeling.
**Question 4.** In CRISP-DM, the Data Modeling phase typically includes which activity?
A) Defining key performance indicators (KPIs)
B) Selecting appropriate modeling techniques
C) Conducting stakeholder interviews
D) Archiving raw data
Answer: B
Explanation: Data Modeling is where analysts choose algorithms and build models.
**Question 5.** Which CRISP-DM phase assesses whether the model meets the business objectives?
A) Data Evaluation
B) Data Preparation
C) Data Understanding
D) Deployment
Answer: A
Explanation: Data Evaluation compares model results against business goals.
**Question 6.** The final CRISP-DM phase, Deployment, most commonly includes which task?
A) Splitting data into training and test sets

**Question 9.** Which data type is best described as semi-structured?
A) Relational tables in SQL
B) Plain text documents
C) JSON files
D) Binary image files
Answer: C
Explanation: JSON organizes data with keys and nested hierarchy but does not conform to a rigid schema, making it semi-structured.
**Question 10.** Structured data is typically stored in:
A) Hadoop Distributed File System (HDFS)
B) Relational databases
C) Email archives
D) Video streaming platforms
Answer: B
Explanation: Relational databases enforce a schema, making them ideal for structured data.
**Question 11.** Which source would be considered an external data source for a retail company?
A) Point-of-sale transaction logs
B) Employee payroll system
C) Social media sentiment feeds
D) Internal inventory database

Answer: C
Explanation: Social media feeds originate outside the organization.
**Question 12.** An ERP system primarily provides which type of data?
A) External market trends
B) Internal operational data
C) Public demographic data
D) Weather forecasts
Answer: B
Explanation: ERP (Enterprise Resource Planning) captures internal business processes.
**Question 13.** Which method is most appropriate for collecting real-time stock price data?
A) Manual entry
B) API integration
C) Printed newspaper scanning
D) Email attachment
Answer: B
Explanation: APIs allow automated, real-time data retrieval.
**Question 14.** Web scraping is best suited for obtaining:
A) Structured data from relational databases
B) Unstructured text from web pages
C) Sensor data from IoT devices

B) Heat-map visualization
C) Data cleansing
D) Business requirement gathering
Answer: A
Explanation: Predictive analytics employs models like ARIMA for forecasting.
**Question 18.** Prescriptive analytics differs from predictive analytics by:
A) Using only descriptive statistics
B) Suggesting optimal actions based on predictions
C) Ignoring business constraints
D) Focusing solely on data collection
Answer: B
Explanation: Prescriptive analytics provides recommendations on what to do next.
**Question 19.** Which language is most widely used for statistical modeling in finance?
A) HTML
B) Python
C) SQL
D) R
Answer: D
Explanation: R has extensive packages for statistical analysis and is popular among finance analysts.

**Question 20.** Which tool is best suited for ad-hoc data manipulation by non-technical users?
A) Hadoop
B) Excel
C) TensorFlow
D) SAS
Answer: B
Explanation: Excel offers a familiar interface for quick data tasks.
**Question 21.** In SQL, which clause is used to filter rows after aggregation?
A) WHERE
B) GROUP BY
C) HAVING
D) ORDER BY
Answer: C
Explanation: HAVING filters grouped results, whereas WHERE applies before aggregation.
**Question 22.** Which Python library is primarily used for data manipulation and analysis?
A) Matplotlib
B) NumPy
C) Pandas
D) Scikit-learn
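The WHERE versus HAVING distinction from Question 21 can be seen in a small query. The sketch below uses Python's built-in `sqlite3` module; the `sales` table, its columns, and the values are hypothetical and invented only for illustration.

```python
import sqlite3

# Hypothetical table and values, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("North", 100.0),
        ("North", 300.0),
        ("South", 50.0),
        ("South", 60.0),
        ("East", 500.0),
    ],
)

# WHERE removes individual rows before grouping;
# HAVING removes whole groups after SUM() has been computed.
query = """
    SELECT region, SUM(amount) AS total
    FROM sales
    WHERE amount >= 60
    GROUP BY region
    HAVING SUM(amount) > 200
    ORDER BY region
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('East', 500.0), ('North', 400.0)]
```

The South group survives the WHERE filter with one row (60.0) but is dropped by HAVING because its total does not exceed 200.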

C) Time-series trends
D) Hierarchical data
Answer: B
Explanation: Scatter plots show correlation between two numeric dimensions.
**Question 26.** Which chart type can be misleading if the y-axis does not start at zero?
A) Pie chart
B) Bar chart
C) Line chart
D) Scatter plot
Answer: B
Explanation: Bar charts rely on a zero baseline; truncating the axis can exaggerate differences.
**Question 27.** In data visualization, “chart junk” refers to:
A) Missing data points
B) Unnecessary decorative elements that obscure insight
C) Inconsistent color palettes
D) Overly large datasets
Answer: B
Explanation: Chart junk adds visual clutter without adding informational value.
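The exaggeration described in Question 26 can be quantified with simple arithmetic. The sketch below compares the ratio of drawn bar heights for two values under a zero baseline and under a truncated one; the helper `bar_height_ratio` and the sample values are hypothetical and chosen only for illustration.

```python
def bar_height_ratio(a, b, baseline=0.0):
    """Ratio of drawn bar heights for values a and b when the axis starts at baseline."""
    return (b - baseline) / (a - baseline)


# Hypothetical values chosen for illustration.
low, high = 95.0, 100.0

honest = bar_height_ratio(low, high)           # axis starts at 0
truncated = bar_height_ratio(low, high, 90.0)  # axis starts at 90

print(round(honest, 3))  # 1.053
print(truncated)         # 2.0
```

A real difference of about 5% is drawn as one bar twice the height of the other once the axis starts at 90.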

**Question 28.** Which visualization would best depict market share percentages among five companies?
A) Stacked bar chart
B) Pie chart
C) Heat map
D) Box plot
Answer: B
Explanation: Pie charts effectively show parts of a whole for a limited number of categories.
**Question 29.** What is the primary purpose of a box plot?
A) Show distribution quartiles and outliers
B) Display cumulative totals over time
C) Compare categorical frequencies
D) Illustrate geographic data
Answer: A
Explanation: Box plots summarize median, quartiles, and potential outliers.
**Question 30.** Which of the following best describes data veracity?
A) The speed of data generation
B) The trustworthiness and quality of data
C) The size of the dataset
D) The variety of data formats
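The statistics behind the box plot in Question 29 can be computed directly. The sketch below uses the standard-library `statistics.quantiles` function with the inclusive method and flags outliers with the common 1.5 × IQR rule; that rule and the sample data are assumptions made for illustration, not something the exam specifies.

```python
import statistics

# Hypothetical sample with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]

# Quartiles as a box plot would draw them (inclusive method).
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, median, q3)  # 3.25 5.5 7.75
print(outliers)        # [50]
```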

C) Network latency
D) File format compatibility
Answer: A
Explanation: AI decisions must be explainable to avoid unfair discrimination.
**Question 34.** Which security measure is essential when transmitting financial data over the internet?
A) CSV formatting
B) SSL/TLS encryption
C) Data compression
D) Color-coded charts
Answer: B
Explanation: SSL/TLS secures data in transit against interception.
**Question 35.** Which of the following best defines “data value”?
A) The monetary cost of storing data
B) The insight and ROI derived from analyzing data
C) The number of rows in a dataset
D) The bandwidth required to transfer data
Answer: B
Explanation: Data value measures the business benefit obtained from data insights.
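As a rough illustration of Question 34, a client can be configured so that it only talks over verified TLS. The sketch below builds such a configuration with Python's standard `ssl` module; requiring TLS 1.2 or newer is an assumption chosen for this example rather than a requirement stated in the exam, and no network connection is opened.

```python
import ssl

# Default client context: certificate verification and hostname checks are on.
context = ssl.create_default_context()

# Assumed policy for this sketch: refuse protocol versions older than TLS 1.2.
context.minimum_version = ssl.TLSVersion.TLSv1_2

print(context.verify_mode == ssl.CERT_REQUIRED)  # True
print(context.check_hostname)                    # True
```

A context like this can then be handed to a client, for example through the `context` argument of `urllib.request.urlopen`, so that data in transit is encrypted and the server's identity is checked.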

**Question 36.** An example of unstructured data is:
A) A relational table of sales transactions
B) A JSON file containing product attributes
C) An email body with free-text comments
D) A CSV file of inventory counts
Answer: C
Explanation: Free-text emails lack a predefined schema, classifying them as unstructured.
**Question 37.** Which of the following is a key advantage of using APIs for data collection?
A) Manual verification of each record
B) Real-time or near-real-time data retrieval
C) Unlimited storage capacity
D) Automatic data visualization
Answer: B
Explanation: APIs enable programmatic, timely access to external data sources.
**Question 38.** In the context of data preparation, “ETL” stands for:
A) Extract, Transform, Load
B) Evaluate, Test, Learn
C) Encode, Transfer, Link
D) Estimate, Track, Log
Answer: A
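The three ETL stages named in Question 38 can be sketched in a few lines. This is a minimal illustration using only the standard library; the CSV content, the `inventory` table, and the rule of dropping rows with a missing quantity are all hypothetical choices made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical raw export, invented for illustration.
RAW_CSV = """sku,qty,unit_price
A100, 3 ,2.50
B200,,4.00
C300,2,10.00
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace, cast types, drop rows with a missing quantity."""
    cleaned = []
    for row in rows:
        qty = row["qty"].strip()
        if not qty:
            continue  # assumed cleaning rule for this sketch
        cleaned.append((row["sku"].strip(), int(qty), float(row["unit_price"])))
    return cleaned


def load(records, conn):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS inventory (sku TEXT, qty INTEGER, unit_price REAL)"
    )
    conn.executemany("INSERT INTO inventory VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
loaded = conn.execute(
    "SELECT sku, qty, unit_price FROM inventory ORDER BY sku"
).fetchall()
print(loaded)  # [('A100', 3, 2.5), ('C300', 2, 10.0)]
```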

C) A visualization dashboard for executives
D) An encrypted file system for backups
Answer: B
Explanation: Data lakes hold large volumes of raw, often unprocessed data.
**Question 42.** When performing feature engineering, creating a “month-of-year” variable from a timestamp is an example of:
A) Dimensionality reduction
B) Data encoding
C) Data aggregation
D) Variable transformation
Answer: D
Explanation: Extracting the month from a date transforms the raw timestamp into a useful feature.
**Question 43.** Which of the following is a primary purpose of cross-validation in model building?
A) To increase the size of the training set
B) To assess model performance on unseen data
C) To visualize model coefficients
D) To speed up model training
Answer: B
Explanation: Cross-validation tests generalization by rotating training and validation subsets.
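The rotation of training and validation subsets described in Question 43 can be sketched without any library. The example below is a simplified k-fold loop: the "model" just predicts the mean of the training targets, the score is mean absolute error, and the data are hypothetical. Real projects would typically shuffle the data first and use an actual model.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, val_idx
        start += size


def cross_validate_mean_model(y, k):
    """Score a mean-predicting baseline on each held-out fold (mean absolute error)."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(y), k):
        prediction = sum(y[i] for i in train_idx) / len(train_idx)
        mae = sum(abs(y[i] - prediction) for i in val_idx) / len(val_idx)
        scores.append(mae)
    return scores


# Hypothetical target values.
y = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
scores = cross_validate_mean_model(y, k=3)
print(scores)  # [1.5, 1.0, 1.5]
```

Each score comes from data the baseline never saw while "training", which is what makes the average a fairer estimate of performance on unseen data.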

**Question 44.** In a regression model, a high Variance Inflation Factor (VIF) indicates:
A) Strong predictive power
B) Multicollinearity among predictors
C) Overfitting due to too many observations
D) Underfitting due to insufficient variables
Answer: B
Explanation: VIF measures how much a predictor is linearly related to other predictors.
**Question 45.** Which of the following is an example of a KPI for a marketing analytics project?
A) Number of rows in the dataset
B) Click-through rate (CTR)
C) Size of the database server
D) Frequency of data backups
Answer: B
Explanation: CTR directly reflects marketing performance and is a common KPI.
**Question 46.** Which visualization technique is most effective for showing a time-series trend with seasonal patterns?
A) Scatter plot
B) Histogram
C) Line chart with multiple series
D) Radar chart
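The VIF from Question 44 is defined as 1 / (1 − R²), where R² comes from regressing one predictor on the others. With exactly two predictors, that R² equals the squared Pearson correlation between them, which the sketch below exploits. The helper names and data are hypothetical, and the often-quoted cut-offs of 5 or 10 are rules of thumb rather than part of the definition.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2_collinear = [2.1, 3.9, 6.2, 8.0, 9.8]      # almost exactly 2 * x1
x2_unrelated = [1.0, -2.0, 2.0, -2.0, 1.0]    # uncorrelated with x1

vif_collinear = vif_two_predictors(x1, x2_collinear)
vif_unrelated = vif_two_predictors(x1, x2_unrelated)
print(vif_unrelated)        # 1.0
print(vif_collinear > 10)   # True
```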

B) It may not meet the specific needs of different audiences
C) It reduces data security risks
D) It improves data latency
Answer: B
Explanation: Different roles require tailored views; a one-size-fits-all dashboard can be ineffective.
**Question 50.** In the context of RPA, “bot” refers to:
A) A statistical model
B) A software robot that automates repetitive tasks
C) A data visualization component
D) A hardware device for data capture
Answer: B
Explanation: RPA bots mimic human actions to perform rule-based processes.
**Question 51.** Which of the following best illustrates “data bias” in a predictive model?
A) The model runs faster on a GPU
B) Training data over-represents a particular demographic, leading to skewed predictions
C) The model uses a linear algorithm
D) The dataset contains missing values
Answer: B
Explanation: Over-representation creates bias, affecting fairness and accuracy.

**Question 52.** Which SQL function is used to calculate the average of a numeric column?
A) SUM()
B) AVG()
C) COUNT()
D) MAX()
Answer: B
Explanation: AVG() returns the mean value of the specified column.
**Question 53.** Which of the following is a key benefit of using cloud-based data warehouses?
A) Fixed hardware capacity
B) Unlimited on-premises storage
C) Scalability and pay-as-you-go pricing
D) Inability to integrate with APIs
Answer: C
Explanation: Cloud warehouses can scale resources dynamically and charge based on usage.
**Question 54.** In data visualization, the term “small multiples” refers to:
A) Using tiny fonts to fit more information
B) Displaying several similar charts side-by-side for comparison
C) Aggregating data into a single bar
D) Combining multiple data sources into one plot