














































































This study guide focuses on implementing data science solutions using Python. It covers data analysis libraries, machine learning frameworks, visualization tools, and practical coding applications. Candidates gain hands-on knowledge to analyze data and build predictive models using Python-based workflows.
Typology: Exams

Question 1. Which Python data structure is immutable?
A) List
B) Dictionary
C) Tuple
D) Set
Answer: C
Explanation: Tuples cannot be altered after creation, making them immutable, whereas lists, dictionaries, and sets are mutable.

Question 2. In a Jupyter notebook, which shortcut clears the output of the current cell?
A) Shift + Enter
B) Esc + O
C) Ctrl + Shift + -
D) Ctrl + M + O
Answer: D
Explanation: In the classic Jupyter Notebook, Ctrl+M enters command mode (as Esc does), and O then toggles the visibility of the selected cell's output. The output is hidden rather than deleted, and Esc followed by O has the same effect.

Question 3. What does the dtype argument in np.array() control?
A) Number of dimensions
B) Memory layout
C) Data type of elements
D) Shape of the array
Answer: C
Explanation: dtype specifies the type (e.g., int64, float32) of the array’s elements.

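The points behind Questions 1 and 3 can be checked in a few lines. This is a minimal sketch; the variable names (point, ints, floats) are illustrative and not taken from the guide.

```python
import numpy as np

# Tuples are immutable: item assignment raises TypeError.
point = (1, 2, 3)
try:
    point[0] = 10
    tuple_was_modified = True
except TypeError:
    tuple_was_modified = False

# dtype fixes the element type when the array is created.
ints = np.array([1, 2, 3], dtype=np.int64)
floats = np.array([1, 2, 3], dtype=np.float32)

print(tuple_was_modified)        # False
print(ints.dtype, floats.dtype)  # int64 float32
```
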
Question 4. Which Pandas method removes duplicate rows based on all columns?
A) drop_duplicates()
B) unique()
C) duplicated()
D) remove_duplicates()
Answer: A
Explanation: DataFrame.drop_duplicates() compares all columns by default and returns a new DataFrame without duplicate rows.

Question 5. In NumPy broadcasting, which shape can be broadcast to shape (5, 4)?
A) (5,)
B) (4,)
C) (1, 4)
D) (5, 1)
Answer: C (B and D are also valid)
Explanation: Shapes are compared from the trailing dimension, and each pair of dimensions must be equal or one of them must be 1. Shape (1, 4) is stretched across the first dimension to match (5, 4). Shapes (4,) and (5, 1) also satisfy the rule; only (5,) fails, because its trailing dimension 5 does not match 4.

Question 6. Which Matplotlib function creates a histogram?
A) plt.plot()
B) plt.bar()
C) plt.hist()
D) plt.scatter()
Answer: C
Explanation: plt.hist() bins data and visualizes the frequency distribution as a histogram.

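Questions 4 and 5 are easy to verify by experiment. The sketch below uses a small made-up DataFrame and tries each candidate shape from Question 5 against the target (5, 4) with np.broadcast_to, which raises ValueError for incompatible shapes. The names df, deduped, and compatible are illustrative.

```python
import numpy as np
import pandas as pd

# drop_duplicates() compares all columns by default and returns a new DataFrame.
df = pd.DataFrame({"a": [1, 1, 2], "b": ["x", "x", "y"]})
deduped = df.drop_duplicates()

# Try each candidate shape against the target shape (5, 4).
target = (5, 4)
compatible = {}
for shape in [(5,), (4,), (1, 4), (5, 1)]:
    try:
        np.broadcast_to(np.zeros(shape), target)
        compatible[shape] = True
    except ValueError:
        compatible[shape] = False

print(len(df), len(deduped))  # 3 2
print(compatible)
```

Only (5,) is rejected: NumPy aligns shapes from the right, so its single dimension of 5 is compared against the trailing 4.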
Question 10. In NumPy, which function computes the determinant of a square matrix?
A) np.trace()
B) np.linalg.det()
C) np.linalg.inv()
D) np.prod()
Answer: B
Explanation: np.linalg.det() returns the determinant value of a square matrix.

Question 11. Which keyword is used to handle exceptions in Python?
A) catch
B) error
C) except
D) finally
Answer: C
Explanation: The except block follows try to capture and handle raised exceptions.

Question 12. What does the Pandas Series object represent?
A) Two-dimensional table
B) Ordered one-dimensional array with axis labels
C) Unordered collection of key-value pairs
D) Immutable list
Answer: B
Explanation: A Series is a one-dimensional labeled array, similar to a column in a DataFrame.

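As a rough illustration of Questions 10 to 12, the sketch below computes a determinant, uses try/except to handle the error NumPy raises for a non-square input, and builds a labeled Series. The matrix values and labels are made up for the example.

```python
import numpy as np
import pandas as pd

# Determinant of a 2x2 matrix: 1*4 - 2*3 = -2 (up to floating-point rounding).
m = np.array([[1.0, 2.0], [3.0, 4.0]])
det = np.linalg.det(m)

# except follows try; a non-square matrix makes np.linalg.det raise LinAlgError.
try:
    np.linalg.det(np.ones((2, 3)))
    det_error = None
except np.linalg.LinAlgError as exc:
    det_error = type(exc).__name__

# A Series is one-dimensional and carries an index of labels.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

print(round(det, 6), det_error, s["b"])
```
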
Question 13. Which of the following is NOT a valid NumPy indexing technique?
A) Boolean indexing
B) Fancy indexing
C) Slice indexing
D) Regex indexing
Answer: D
Explanation: NumPy does not support regular-expression based indexing.

Question 14. Which Seaborn plot is best suited for visualizing the distribution of a single continuous variable?
A) pairplot
B) heatmap
C) violinplot
D) scatterplot
Answer: C
Explanation: A violin plot shows a kernel density estimate of a single variable’s distribution.

Question 15. Which method in Pandas is used to fill missing values with the mean of a column?
A) fillna(method='mean')
B) replace(np.nan, df.mean())
C) fillna(df.mean())
D) interpolate()
Answer: C
Explanation: df.fillna(df.mean()) replaces the NaNs in each column with that column’s mean.

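The three valid indexing techniques from Question 13 and the mean-fill from Question 15 might look like this in practice. It is a minimal sketch with made-up data; arr, df, and filled are illustrative names.

```python
import numpy as np
import pandas as pd

arr = np.array([10, 20, 30, 40, 50])
sliced = arr[1:4]          # slice indexing   -> [20 30 40]
masked = arr[arr > 25]     # boolean indexing -> [30 40 50]
fancy = arr[[0, 2, 4]]     # fancy indexing   -> [10 30 50]

# df.mean() returns one mean per column; fillna aligns them by column name.
df = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": [np.nan, 4.0, 6.0]})
filled = df.fillna(df.mean())

print(filled)
```

Here the missing value in x becomes 2.0 and the one in y becomes 5.0, the means of the remaining entries in each column.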
Explanation: A left join keeps all rows from the left DataFrame and matches rows from the right where possible.

Question 19. In Matplotlib, which command adds a legend to the current axes?
A) plt.title()
B) plt.show()
C) plt.legend()
D) plt.xlabel()
Answer: C
Explanation: plt.legend() displays the legend based on labeled plot elements.

Question 20. Which statistical measure is robust to outliers?
A) Mean
B) Standard deviation
C) Median
D) Variance
Answer: C
Explanation: The median, being the middle value, is less affected by extreme observations.

Question 21. What does the @ operator do in Python 3.5+?
A) List concatenation
B) Matrix multiplication
C) Decorator definition
D) Exponentiation
Answer: B
Explanation: Since Python 3.5, @ is the binary operator for matrix multiplication, e.g., A @ B. The @ that introduces a decorator is part of the statement syntax, not an operator.

Question 22. Which Pandas function returns the first n rows of a DataFrame?
A) head(n)
B) top(n)
C) first(n)
D) slice(n)
Answer: A
Explanation: df.head(n) returns the leading n rows.

Question 23. In NumPy, what is the result of np.where(condition, x, y)?
A) Indices where condition is True
B) Array with elements from x where condition is True, else from y
C) Boolean mask of condition
D) Count of True values in condition
Answer: B
Explanation: np.where selects elements from x or y based on the boolean condition.

Question 24. Which of the following is a non-parametric test for comparing two independent samples?
A) Student’s t-test
B) Mann-Whitney U test
C) Paired t-test
D) Z-test
Answer: B

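The @ operator and the three-argument form of np.where discussed above can be tried on small arrays. This is a sketch with made-up values; A, B, and values are illustrative names.

```python
import numpy as np

# Matrix multiplication with the @ operator.
A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
product = A @ B            # [[2 1], [4 3]]

# np.where(condition, x, y): take x where the condition holds, else y.
values = np.array([-2, 5, -1, 7])
clipped = np.where(values > 0, values, 0)   # [0 5 0 7]

print(product)
print(clipped)
```
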
Explanation: The alpha argument accepts a float between 0 (transparent) and 1 (opaque).

Question 28. What does the np.corrcoef(a, b) function compute?
A) Covariance matrix
B) Correlation coefficient matrix
C) Linear regression coefficients
D) Euclidean distance
Answer: B
Explanation: np.corrcoef returns the Pearson correlation coefficients between arrays.

Question 29. Which Pandas method reshapes data from long to wide format?
A) melt()
B) pivot()
C) stack()
D) unstack()
Answer: B
Explanation: pivot creates a new DataFrame where unique values become columns.

Question 30. Which of the following is a valid lambda expression that squares its input?
A) lambda x: x ** 2
B) lambda x: pow(x,2)
C) Both A and B
D) None of the above
Answer: C
Explanation: Both lambda definitions return the square of x.

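Questions 28 to 30 can be illustrated together. The sketch below uses made-up data: b is an exact linear function of a, so the off-diagonal correlation should be 1 up to rounding, and the long-format table is a hypothetical temperature log.

```python
import numpy as np
import pandas as pd

# np.corrcoef returns a 2x2 matrix for two 1-D inputs.
a = np.array([1.0, 2.0, 3.0, 4.0])
b = 2 * a + 1
corr = np.corrcoef(a, b)

# pivot: unique values of "city" become the columns of the wide table.
long_df = pd.DataFrame({
    "date": ["d1", "d1", "d2", "d2"],
    "city": ["A", "B", "A", "B"],
    "temp": [10, 20, 11, 21],
})
wide_df = long_df.pivot(index="date", columns="city", values="temp")

# Two equivalent lambdas that square their input.
square_a = lambda x: x ** 2
square_b = lambda x: pow(x, 2)

print(wide_df)
print(square_a(7), square_b(7))  # 49 49
```
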
Question 31. In probability, the variance of a Bernoulli(p) distribution is:
A) p
B) p(1-p)
C) p²
D) 1-p
Answer: B
Explanation: For a Bernoulli trial, E[X] = p and E[X²] = p, so the variance is p - p² = p(1-p).

Question 32. Which NumPy function can be used to compute the dot product of two 1-D arrays?
A) np.multiply()
B) np.dot()
C) np.cross()
D) np.inner()
Answer: B
Explanation: np.dot returns the scalar dot product for 1-D arrays. For 1-D inputs, np.inner gives the same result.

Question 33. Which Pandas function detects missing values?
A) isnull()
B) notnull()
C) isna()
D) All of the above
Answer: D
Explanation: isnull and isna are aliases that mark missing values as True; notnull returns the opposite boolean mask.

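The Bernoulli variance in Question 31 can be computed directly from the definition, and the same sketch covers the dot product and the missing-value masks from Questions 32 and 33. The value p = 0.3 and the arrays are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# Bernoulli(p): outcomes 0 and 1 with probabilities 1-p and p.
p = 0.3
outcomes = np.array([0, 1])
probs = np.array([1 - p, p])
mean = np.dot(outcomes, probs)                     # p
variance = np.dot((outcomes - mean) ** 2, probs)   # p * (1 - p) = 0.21

# For 1-D arrays, np.dot and np.inner both give the scalar dot product.
u = np.array([1, 2, 3])
v = np.array([4, 5, 6])
dot_uv = np.dot(u, v)      # 1*4 + 2*5 + 3*6 = 32
inner_uv = np.inner(u, v)  # 32

# isna/isnull flag missing values; notnull is the inverse mask.
s = pd.Series([1.0, None, 3.0])
missing = s.isna()
present = s.notnull()

print(round(variance, 4), dot_uv, missing.tolist())
```
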
Question 37. Which statistical concept measures the asymmetry of a distribution?
A) Kurtosis
B) Skewness
C) Variance
D) Standard deviation
Answer: B
Explanation: Skewness quantifies the degree of asymmetry around the mean.

Question 38. In Pandas, what does the astype() method do?
A) Changes the DataFrame’s index
B) Converts column data types
C) Sorts the DataFrame
D) Renames columns
Answer: B
Explanation: astype casts Series or DataFrame columns to a specified dtype.

Question 39. Which of the following is a correct way to catch any exception in Python?
A) except:
B) except Exception:
C) except BaseException:
D) All of the above
Answer: D
Explanation: All three forms are valid syntax. A bare except: and except BaseException: catch every exception, including SystemExit, KeyboardInterrupt, and GeneratorExit. except Exception: catches all ordinary errors but lets those three propagate, which is why it is the most common practice.

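The difference between the handler types in Question 39 shows up when the raised exception is not a subclass of Exception. The sketch below also casts a string column with astype, as in Question 38. The helper caught_by is hypothetical, written only for this illustration.

```python
import pandas as pd

# astype converts a column of numeric strings to integers.
df = pd.DataFrame({"n": ["1", "2", "3"]})
df["n"] = df["n"].astype("int64")


def caught_by(handler_type, exc):
    """Return True if `except handler_type` would catch `exc`."""
    try:
        raise exc
    except handler_type:
        return True
    except BaseException:
        # Fallback so the probe itself never propagates the exception.
        return False


print(caught_by(Exception, ValueError("bad")))        # True
print(caught_by(Exception, KeyboardInterrupt()))      # False
print(caught_by(BaseException, KeyboardInterrupt()))  # True
```

KeyboardInterrupt derives from BaseException but not from Exception, so except Exception: leaves it alone.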
Question 40. In NumPy, which function computes the element-wise maximum of two arrays?
A) np.max()
B) np.maximum()
C) np.where()
D) np.argmax()
Answer: B
Explanation: np.maximum returns an array with the larger value at each position.

Question 41. Which of the following is a valid way to create a Pandas DataFrame from a dictionary of lists?
A) pd.DataFrame({'A':[1,2], 'B':[3,4]})
B) pd.Series({'A':[1,2], 'B':[3,4]})
C) pd.read_dict({'A':[1,2], 'B':[3,4]})
D) pd.DataFrame.from_dict({'A':[1,2], 'B':[3,4]}, orient='columns')
Answer: A
Explanation: Passing a dict of equal-length lists directly to DataFrame constructs the table. The from_dict call in option D builds the same table; option A is the most direct form.

Question 42. Which method in Matplotlib adds grid lines to a plot?
A) plt.showgrid()
B) plt.grid()
C) plt.add_grid()
D) plt.lines()
Answer: B
Explanation: plt.grid(True) turns grid lines on; calling plt.grid() with no arguments toggles their visibility.

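Questions 40 and 41 can be checked side by side. The sketch contrasts np.maximum (element-wise, two arrays) with np.max (reduction over one array) and builds the same table through both the DataFrame constructor and DataFrame.from_dict. The array values are made up.

```python
import numpy as np
import pandas as pd

x = np.array([1, 7, 3])
y = np.array([4, 2, 9])
elementwise = np.maximum(x, y)   # larger value at each position -> [4 7 9]
overall = np.max(x)              # single largest value in x     -> 7

data = {"A": [1, 2], "B": [3, 4]}
df_direct = pd.DataFrame(data)
df_from_dict = pd.DataFrame.from_dict(data, orient="columns")

print(elementwise, overall)
print(df_direct.equals(df_from_dict))  # True
```
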
Question 46. In Seaborn, which argument controls the categorical variable for a box plot?
A) x
B) y
C) hue
D) All of the above
Answer: D
Explanation: x or y defines the categorical axis; hue can add a secondary categorical grouping.

Question 47. Which Python statement creates a generator that yields squares of numbers from 0 to 9?
A) [x**2 for x in range(10)]
B) (x**2 for x in range(10))
C) list(x**2 for x in range(10))
D) set(x**2 for x in range(10))
Answer: B
Explanation: Parentheses produce a generator expression; it yields values lazily.

Question 48. Which statistical test assesses the association between two categorical variables?
A) Pearson correlation
B) Chi-square test of independence
C) Paired t-test
D) ANOVA
Answer: B
Explanation: The chi-square test evaluates whether observed frequencies differ from expected frequencies.

Question 49. In NumPy, which function returns the indices that would sort an array?
A) np.sort()
B) np.argsort()
C) np.lexsort()
D) np.order()
Answer: B
Explanation: np.argsort provides the permutation indices that sort the array.

Question 50. Which Pandas method returns a Boolean Series indicating whether each element is in a given list?
A) isin()
B) in()
C) contains()
D) match()
Answer: A
Explanation: Series.isin([list]) checks membership element-wise.

Question 51. In Matplotlib, which function is used to create a figure with multiple subplots arranged in a grid?
A) plt.subplot2grid()
B) plt.subplots()
C) plt.grid()
D) plt.layout()
C) sum()
D) aggregate('cumsum')
Answer: A
Explanation: Series.cumsum() returns a running total.

Question 55. Which statistical measure is defined as the square root of variance?
A) Mean
B) Standard deviation
C) Median absolute deviation
D) Interquartile range
Answer: B
Explanation: Standard deviation quantifies dispersion in the same units as the data.

Question 56. Which of the following NumPy functions creates an uninitialized array of shape (3, 3)?
A) np.zeros((3,3))
B) np.empty((3,3))
C) np.full((3,3), 0)
D) np.arange(9).reshape(3,3)
Answer: B
Explanation: np.empty allocates memory without initializing entries.

Question 57. In Seaborn, which parameter of sns.heatmap() controls whether the cells are annotated with their numeric values?
A) cmap
B) linewidths
C) annot
D) fmt
Answer: C
Explanation: Setting annot=True displays the data values inside each heatmap cell.

Question 58. Which of the following is TRUE about Python’s list comprehension syntax?
A) It can include conditional filters.
B) It cannot be nested.
C) It always returns a tuple.
D) It must be assigned to a variable.
Answer: A
Explanation: List comprehensions support an optional if clause to filter items.

Question 59. In hypothesis testing, a Type I error occurs when:
A) The null hypothesis is incorrectly rejected.
B) The null hypothesis is incorrectly accepted.
C) The test has low power.
D) The sample size is too small.
Answer: A
Explanation: A Type I error is a false positive: rejecting a true null hypothesis.

Question 60. Which Pandas function can be used to convert a column of timestamps stored as strings into datetime objects?
A) pd.to_datetime()
B) datetime.strptime()