Data Analytics Certificate Programs Practice Exam, Exams of Technology

Technology

This exam measures foundational to intermediate data analytics skills including data wrangling, visualization, dashboards, statistical modeling, and interpretation of insights. Candidates practice analysis using real-life business datasets, evaluating trends, forecasting, building KPIs, and recommending data-guided solutions. The exam includes analytics workflow design, ETL basics, and ethical data use scenarios.

Typology: Exams

2025/2026

Available from 12/12/2025

shilpi-jain-1 🇮🇳


Data Analytics Certificate Programs Practice Exam

**Question 1. Which of the following best describes qualitative data?**

A) Data that can be measured on a numeric scale

B) Data that represent categories or attributes

C) Data that follow a normal distribution

D) Data that are always stored in relational tables

Answer: B

Explanation: Qualitative data consist of non‑numeric categories such as gender, color, or product type, unlike quantitative data, which are numeric.

**Question 2. In a relational database, what is the purpose of a primary key?**

A) To enforce referential integrity between tables

B) To uniquely identify each row in a table

C) To store large binary objects

D) To speed up data encryption processes

Answer: B

Explanation: A primary key uniquely identifies each record, ensuring no duplicate rows and enabling efficient indexing.
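The uniqueness guarantee described in Question 2 can be observed directly. The following is a minimal sketch using Python's built-in sqlite3 module; the table name `customers` and its columns are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")

try:
    # Reusing customer_id 1 violates the primary key constraint.
    conn.execute("INSERT INTO customers VALUES (1, 'Grace')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(duplicate_rejected, row_count)
```

The second insert is rejected, so the table still holds a single row.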

**Question 3. Which data format is best suited for hierarchical data exchange over the web?**

A) CSV

B) JSON


C) Parquet

D) SQL dump

Answer: B

Explanation: JSON (JavaScript Object Notation) naturally represents nested structures, making it ideal for hierarchical data interchange.

**Question 4. Which NoSQL database type stores data as key‑value pairs?**

A) Document store

B) Graph database

C) Column‑family store

D) Key‑value store

Answer: D

Explanation: Key‑value stores (e.g., Redis, DynamoDB) map a unique key to an opaque value, enabling fast lookups.

**Question 5. In a star schema, the central table is called a:**

A) Fact table

B) Dimension table

C) Bridge table

D) Lookup table

Answer: A
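To illustrate the point made in Question 3, here is a small sketch of a nested record round-tripped through JSON with Python's built-in json module. The `order` record and its field names are hypothetical; a flat format such as CSV would need this structure split across several rows or files.

```python
import json

# Hypothetical order record with a nested customer object and a list of items.
order = {
    "order_id": 1001,
    "customer": {"name": "Ada", "city": "London"},
    "items": [
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}

payload = json.dumps(order)      # serialize to a JSON string
restored = json.loads(payload)   # parse it back into Python objects

# The nesting survives the round trip, so nested fields stay addressable.
total_qty = sum(item["qty"] for item in restored["items"])
print(restored["customer"]["city"], total_qty)
```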

**Question 8. Data lineage is important because it helps:**

A) Increase storage capacity

B) Track the origin and transformations of data

C) Reduce network latency

D) Generate random numbers for simulations

Answer: B

Explanation: Data lineage documents where data came from and how it was transformed, aiding transparency and debugging.

**Question 9. Which missing‑data mechanism assumes the probability of missingness is unrelated to any observed or unobserved values?**

A) MCAR

B) MAR

C) MNAR

D) NMAR

Answer: A

Explanation: MCAR (Missing Completely at Random) means missingness is independent of all data, so analyzing only the complete cases gives unbiased estimates, at the cost of a smaller sample.

**Question 10. When using mean imputation for a numeric variable, which issue may arise?**

A) Increased variance

B) Introduction of outliers

C) Underestimation of variability

D) Violation of primary key constraints

Answer: C

Explanation: Replacing missing values with the mean reduces the variable’s variance, potentially biasing downstream analyses.

**Question 11. Min‑Max scaling transforms a feature to which range?**

A) 0 to 1

B) −1 to 1

C) 0 to 100

D) −∞ to +∞

Answer: A

Explanation: Min‑Max scaling subtracts the minimum and divides by the range, yielding values between 0 and 1.

**Question 12. One‑hot encoding is appropriate for:**

A) Ordinal variables with natural ordering

B) Nominal categorical variables without order

C) Continuous numeric variables

D) Binary variables already coded as 0/1

Answer: B
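The claims in Questions 10 and 11 can be checked with a few lines of arithmetic. This is a minimal sketch using only the standard library; the values in `observed` and the count `n_missing` are made up for illustration.

```python
from statistics import mean, pvariance

observed = [10.0, 12.0, 14.0, 16.0, 18.0]  # hypothetical observed values
n_missing = 3                               # hypothetical number of missing entries

# Mean imputation: every missing entry is replaced by the observed mean.
imputed = observed + [mean(observed)] * n_missing

var_before = pvariance(observed)  # population variance of the observed values
var_after = pvariance(imputed)    # population variance after imputation

# Min-Max scaling: (x - min) / (max - min) maps the feature onto [0, 1].
lo, hi = min(observed), max(observed)
scaled = [(x - lo) / (hi - lo) for x in observed]

print(var_before, var_after, scaled)
```

With these numbers the variance falls from 8 to 5 after imputation, because the imputed entries add no spread, and the scaled values run from 0 to 1.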

A) Only rows with matching keys in both tables

B) All rows from the left table and matching rows from the right table

C) All rows from the right table and matching rows from the left table

D) Rows that do not match in either table

Answer: B

Explanation: LEFT JOIN preserves all records from the left table and adds data from the right table when keys match.

**Question 16. Which Python library provides the groupby operation for data frames?**

A) NumPy

B) Matplotlib

C) Pandas

D) Scikit‑learn

Answer: C

Explanation: Pandas’ groupby method groups rows based on column values and enables aggregation.

**Question 17. In hypothesis testing, a Type II error occurs when:**

A) The null hypothesis is true but rejected

B) The null hypothesis is false but not rejected

C) The alternative hypothesis is true and accepted

D) The p‑value is exactly 0.

Answer: B

Explanation: A Type II error (false negative) fails to reject a false null hypothesis.

**Question 18. Which test is appropriate for comparing the means of three independent groups?**

A) Paired t‑test

B) One‑way ANOVA

C) Chi‑square test of independence

D) Mann‑Whitney U test

Answer: B

Explanation: One‑way ANOVA assesses whether at least one group mean differs among three or more independent groups.

**Question 19. The p‑value represents:**

A) The probability that the null hypothesis is true

B) The probability of observing data at least as extreme as the sample, assuming the null is true

C) The confidence level of the test

D) The effect size of the treatment

Answer: B
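For the one‑way ANOVA in Question 18, the F statistic is the ratio of between‑group to within‑group mean squares. The sketch below computes it by hand for three hypothetical groups; it stops at the F statistic, since turning it into a p‑value needs the F distribution, which the standard library does not provide.

```python
from statistics import mean

# Hypothetical scores for three independent groups.
groups = [
    [4.0, 5.0, 6.0],
    [6.0, 7.0, 8.0],
    [8.0, 9.0, 10.0],
]

all_values = [x for g in groups for x in g]
grand_mean = mean(all_values)
k = len(groups)       # number of groups
n = len(all_values)   # total number of observations

# Between-group sum of squares: spread of group means around the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: spread of observations around their group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

# F = mean square between / mean square within.
f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
print(f_stat)
```

A large F relative to the F distribution with (k − 1, n − k) degrees of freedom suggests that at least one group mean differs.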

**Question 22. Which metric is most appropriate for evaluating a highly imbalanced binary classifier?**

A) Accuracy

B) Precision

C) Recall

D) F1‑score

Answer: D

Explanation: F1‑score is the harmonic mean of precision and recall, giving a single measure that is more informative than accuracy when classes are imbalanced.

**Question 23. In K‑means clustering, the algorithm minimizes:**

A) The sum of absolute deviations from the median

B) The total within‑cluster sum of squares (inertia)

C) The distance between cluster centroids and the origin

D) The number of clusters

Answer: B

Explanation: K‑means iteratively assigns points to clusters to minimize within‑cluster variance (sum of squared distances).

**Question 24. Which of the following is a characteristic of a stationary time series?**

A) Increasing mean over time

B) Constant variance and mean over time

C) Seasonal patterns that change amplitude

D) Trend component that grows linearly

Answer: B

Explanation: Stationarity implies that statistical properties (mean, variance, autocorrelation) are constant over time.

**Question 25. A box plot visualizes all the following EXCEPT:**

A) Median

B) Interquartile range

C) Mean

D) Outliers

Answer: C

Explanation: Traditional box plots display median, quartiles, and outliers, but not the mean (unless explicitly added).

**Question 26. Which chart type is best for showing the proportion of categories that sum to 100%?**

A) Bar chart

B) Line chart

C) Pie chart

D) Scatter plot

Explanation: Quick filters can be displayed as sliders, enabling interactive range selection on dashboards.

**Question 29. When creating a heat map, what does color intensity typically represent?**

A) Geographic location

B) Frequency or magnitude of a variable

C) Time sequence

D) Data type (numeric vs. categorical)

Answer: B

Explanation: In heat maps, darker or more saturated colors indicate higher values or counts.

**Question 30. Which of the following best describes a data story’s “challenge” component?**

A) The final recommendation to stakeholders

B) The background context and business problem being addressed

C) The technical steps taken to clean the data

D) The visual design of the dashboard

Answer: B

Explanation: The “challenge” defines the problem or question that motivates the analysis, setting the stage for the story.

**Question 31. In Git, which command records changes to the repository?**

A) git clone

B) git pull

C) git commit

D) git merge

Answer: C

Explanation: git commit creates a new snapshot of staged changes, preserving version history.

**Question 32. Which DDL statement is used to remove an existing table?**

A) DROP TABLE

B) DELETE FROM

C) TRUNCATE TABLE

D) ALTER TABLE

Answer: A

Explanation: DROP TABLE permanently deletes the table definition and its data.

**Question 33. In Excel, which function retrieves a value from a table based on a matching key in the first column?**

A) SUMIF

B) VLOOKUP

C) CONCATENATE

Explanation: Visualization is an analytical step, not part of the ETL (Extract‑Transform‑Load) pipeline.

**Question 36. Which statistical test would you use to examine the relationship between two categorical variables?**

A) Pearson correlation

B) Independent t‑test

C) Chi‑square test of independence

D) Paired t‑test

Answer: C

Explanation: The chi‑square test assesses whether the distribution of one categorical variable differs across levels of another.

**Question 37. In a regression model, multicollinearity refers to:**

A) High correlation between the dependent variable and residuals

B) Strong correlation among independent variables

C) Non‑linear relationship between predictors and outcome

D) Heteroscedasticity of residuals

Answer: B

Explanation: Multicollinearity occurs when predictors are highly correlated, inflating coefficient variance and destabilizing estimates.
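The chi‑square test from Question 36 compares observed cell counts with the counts expected if the two variables were independent. Below is a minimal hand computation on a hypothetical 2×2 table; it applies no continuity correction and stops at the test statistic and its degrees of freedom.

```python
# Hypothetical 2x2 contingency table:
# rows = customer segment, columns = purchased (yes, no).
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count under independence of rows and columns.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) * (columns - 1).
dof = (len(observed) - 1) * (len(observed[0]) - 1)
print(chi2, dof)
```

Here every expected count is 25, giving a statistic of 4 with 1 degree of freedom, which exceeds the usual 5% critical value of about 3.84.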

**Question 38. Which Python library provides the train_test_split function for creating validation sets?**

A) NumPy

B) pandas

C) scikit‑learn

D) matplotlib

Answer: C

Explanation: train_test_split is part of scikit‑learn’s model_selection module, facilitating data partitioning.

**Question 39. Which window function assigns a sequential integer to rows ordered by a specified column?**

A) LAG()

B) ROW_NUMBER()

C) SUM() OVER()

D) FIRST_VALUE()

Answer: B

Explanation: ROW_NUMBER() generates a unique, consecutive integer based on the defined ordering.

**Question 40. In a data pipeline, which component ensures that transformed data meets quality standards before loading?**

A) Extractor

Answer: C

Explanation: AVG() returns the arithmetic mean of the specified column’s values.

**Question 43. Which of the following is a common method for handling high‑cardinality categorical variables?**

A) One‑hot encoding all categories

B) Dropping the variable entirely

C) Target encoding (mean encoding)

D) Converting to binary using ASCII values

Answer: C

Explanation: Target encoding replaces categories with the mean of the target variable, reducing dimensionality while preserving predictive power.

**Question 44. In time‑series forecasting, the “seasonal decomposition” technique separates a series into which components?**

A) Trend, noise, and outliers

B) Trend, seasonality, and residual (error)

C) Level, slope, and curvature

D) Autocorrelation, partial autocorrelation, and lag

Answer: B

Explanation: Seasonal decomposition (e.g., STL) extracts trend, seasonal pattern, and residual error components.
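The target encoding described in Question 43 amounts to a per‑category mean of the target. The sketch below shows the basic idea in plain Python on hypothetical `(city, target)` rows. It is deliberately naive: in practice the means are usually computed on training folds only, often with smoothing, to limit target leakage.

```python
from collections import defaultdict

# Hypothetical training rows: (city, target) pairs with a binary target.
rows = [
    ("paris", 1), ("paris", 0), ("paris", 1), ("paris", 1),
    ("oslo", 0), ("oslo", 0),
    ("lima", 1),
]

sums = defaultdict(float)
counts = defaultdict(int)
for city, target in rows:
    sums[city] += target
    counts[city] += 1

# Each category maps to the mean target value observed for it.
encoding = {city: sums[city] / counts[city] for city in counts}

# One numeric column replaces the categorical one, whatever its cardinality.
encoded = [encoding[city] for city, _ in rows]
print(encoding)
```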

**Question 45. Which visualization best shows the relationship between two continuous variables?**

A) Bar chart

B) Scatter plot

C) Stacked area chart

D) Histogram

Answer: B

Explanation: Scatter plots display paired observations, making it easy to assess correlation or patterns between two numeric variables.

**Question 46. In Power BI, which feature allows you to create a reusable calculation across multiple reports?**

A) Data source

B) Measure (DAX)

C) Query editor step

D) Custom visual

Answer: B

Explanation: Measures, defined using DAX, are reusable calculations that can be referenced in any visual within the report.

**Question 47. Which of the following is NOT a typical characteristic of a “big data” environment?**
