













































































A practice exam for Python in data science and machine learning. It consists of multiple-choice questions on Python data types, NumPy, pandas, scikit-learn, and Matplotlib; each question is followed by the correct answer and a brief explanation. The exam is designed to help students and professionals assess their knowledge and prepare for certification or job interviews in data science and machine learning. The questions cover fundamental concepts and practical applications.
Typology: Exams














































































Question 1. Which Python data type is immutable?
A) list
B) dict
C) tuple
D) set
Answer: C
Explanation: Tuples cannot be changed after creation, unlike lists, dicts, and sets.

Question 2. What is the output of len("DataScience")?
A) 10
B) 11
C) 12
D) 13
Answer: B
Explanation: The string contains 11 characters, so len returns 11.

Question 3. Which keyword is used to handle exceptions in Python?
A) catch
B) error
C) try
D) excepted
Answer: C
Explanation: try starts a block where exceptions can be caught with except.

Question 4. What does the list comprehension [x**2 for x in range(5)] produce?
Answer: A
Explanation: Squares of numbers 0‑4 are computed, yielding the list shown.

Question 5. Which method removes a key from a dictionary and returns its value?
A) popitem()
B) del
C) remove()
D) pop()
Answer: D
Explanation: dict.pop(key) deletes the key and returns the associated value.

Question 6. What is the result of set([1,2,2,3])?
A) {1,2,3,2}
B) {1,2,3}
C) [1,2,3]
D) (1,2,3)
Answer: B
Explanation: Sets store unique elements, so duplicate 2 is removed.

Question 7. Which NumPy function creates an array of zeros with shape (3,2)?
A) np.empty((3,2))

B) arr[:,2]
C) arr[2]
D) arr[3,:]
Answer: B
Explanation: The slice : selects all rows, and 2 selects the third column (0‑based indexing).

Question 11. Which Pandas object is a one‑dimensional labeled array?
A) DataFrame
B) Series
C) Panel
D) Index
Answer: B
Explanation: A Pandas Series holds data and an index, similar to a column vector.

Question 12. Which method reads a CSV file into a DataFrame?
A) pd.read_excel()
B) pd.read_sql()
C) pd.read_csv()
D) pd.load_csv()
Answer: C
Explanation: pd.read_csv parses CSV files and returns a DataFrame.

Question 13. How do you select rows where column age is greater than 30 using loc?
A) df.loc[df['age'] > 30]
B) df.iloc[df['age'] > 30]
C) df.select(df['age'] > 30)
D) df.where(df['age'] > 30)
Answer: A
Explanation: loc accepts a boolean Series as a row mask and returns the rows where the mask is True.

Question 14. Which function drops rows containing any NaN values?
A) df.fillna()
B) df.dropna()
C) df.isnull()
D) df.replace()
Answer: B
Explanation: dropna removes rows (or columns) with missing values.

Question 15. What does the StandardScaler do to numeric features?
A) Scales to [0,1] range
B) Centers to mean 0 and scales to unit variance
C) Rounds to nearest integer
D) Encodes categories as integers
Answer: B
Explanation: StandardScaler subtracts the mean and divides by the standard deviation.

Question 16. Which encoding creates a binary column for each category?
A) Label Encoding
B) One‑Hot Encoding
C) Ordinal Encoding

Answer: B
Explanation: fit learns parameters from the training data.

Question 20. Which algorithm is a non‑parametric, instance‑based learner?
A) Linear Regression
B) Logistic Regression
C) K‑Nearest Neighbors
D) Decision Tree
Answer: C
Explanation: KNN stores training instances and makes predictions based on nearest neighbors.

Question 21. Which plot is best for visualizing the distribution of a single numeric variable?
A) Scatter plot
B) Histogram
C) Bar chart
D) Box plot
Answer: B
Explanation: Histograms show frequency of value ranges for a continuous variable.

Question 22. In Matplotlib, which function creates a figure with multiple subplots arranged in 2 rows and 3 columns?
A) plt.subplot(2,3)
B) plt.subplots(2,3)
C) plt.plot(2,3)
D) plt.grid(2,3)
Answer: B
Explanation: plt.subplots(nrows, ncols) returns a Figure and an array of Axes.

Question 23. Which Seaborn function creates a heatmap of a correlation matrix?
A) sns.boxplot()
B) sns.heatmap()
C) sns.distplot()
D) sns.pairplot()
Answer: B
Explanation: sns.heatmap visualizes matrix‑like data, often used for correlations.

Question 24. What does the groupby('city').mean() operation return?
A) Mean of each column for the entire DataFrame
B) Mean of numeric columns grouped by unique city values
C) Count of rows per city
D) Median of numeric columns per city
Answer: B
Explanation: groupby creates groups based on city; mean computes column-wise averages within each group. In pandas 2.0 and later, non-numeric columns must be excluded explicitly, for example with numeric_only=True.

Question 25. Which Pandas method reshapes data from wide to long format?
A) pivot()
B) melt()
C) stack()
D) unstack()

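The grouped mean described in Question 24 can be sketched as follows. The column names and values are invented for illustration, and numeric_only=True is passed on the assumption that a recent pandas version is in use, where non-numeric columns are not dropped automatically.

```python
import pandas as pd

# Hypothetical data: the names and numbers are made up for this sketch.
df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Rome", "Rome"],
    "name": ["Ann", "Bo", "Cy", "Di"],
    "age": [30, 40, 20, 50],
    "income": [100.0, 200.0, 300.0, 500.0],
})

# One row per unique city; each numeric column holds the mean within that group.
# numeric_only=True skips the non-numeric "name" column.
city_means = df.groupby("city").mean(numeric_only=True)
print(city_means)
```

The result is indexed by city, so an individual group mean can be read with city_means.loc["Oslo", "age"].
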
Answer: A
Explanation: arange creates values from start to stop (exclusive) with step size 2.

Question 29. Which of the following statements about Python’s global keyword is true?
A) It creates a new global variable inside a function.
B) It allows a function to modify a variable defined at module level.
C) It makes a variable accessible in all imported modules.
D) It is required to read a global variable inside a function.
Answer: B
Explanation: Declaring global var inside a function tells Python to use the module‑level variable.

Question 30. Which Pandas function returns the number of unique values in a Series?
A) .count()
B) .nunique()
C) .unique()
D) .value_counts()
Answer: B
Explanation: nunique counts distinct elements.

Question 31. What is the purpose of np.linalg.inv()?
A) Compute matrix determinant
B) Compute matrix transpose
C) Compute matrix inverse
D) Compute eigenvalues
Answer: C
Explanation: np.linalg.inv returns the inverse of a square matrix if it exists.

Question 32. Which of the following is NOT a valid Pandas indexer?
A) .loc
B) .iloc
C) .ix
D) .select
Answer: D
Explanation: .select is not an indexer. .loc and .iloc are the current indexers; .ix was an indexer in older pandas releases but was deprecated in 0.20 and removed in 1.0.

Question 33. In scikit‑learn, which class implements cross‑validation for hyper‑parameter tuning?
A) GridSearchCV
B) RandomForestClassifier
C) LinearRegression
D) PCA
Answer: A
Explanation: GridSearchCV exhaustively searches over parameter grids using cross‑validation.

Question 34. Which metric is appropriate for evaluating a binary classifier when classes are imbalanced?
A) Accuracy
B) Precision‑Recall AUC
C) R‑squared
D) Mean Absolute Error

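As a small illustration of Question 31, the sketch below inverts a matrix with np.linalg.inv and shows what happens when no inverse exists. The matrices are arbitrary examples chosen for this sketch.

```python
import numpy as np

# A small invertible matrix chosen for illustration.
A = np.array([[2.0, 0.0],
              [0.0, 4.0]])
A_inv = np.linalg.inv(A)

# Multiplying a matrix by its inverse gives the identity (up to rounding).
print(np.allclose(A @ A_inv, np.eye(2)))

# A singular matrix has no inverse; NumPy raises LinAlgError instead.
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])
try:
    np.linalg.inv(singular)
    inverted = True
except np.linalg.LinAlgError:
    inverted = False
print(inverted)
```
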
Answer: B
Explanation: np.exp applies the exponential function to each element.

Question 38. In Pandas, what does the astype('float') method do?
A) Converts column names to floats
B) Casts the DataFrame or Series to float dtype
C) Rounds values to the nearest integer
D) Removes rows with non‑numeric values
Answer: B
Explanation: astype changes the data type of the object.

Question 39. Which of the following is a correct way to define a recursive function that computes factorial?
A) def fact(n): return n * fact(n-1) if n > 1 else 1
B) def fact(n): while n>1: n *= n-1
C) def fact(n): return n ** n
D) def fact(n): return n + fact(n-1)
Answer: A
Explanation: The function calls itself with n-1 and returns 1 once n is 1 or less, which ends the recursion.

Question 40. What does the plt.title('Sales') command do?
A) Sets the x‑axis label
B) Sets the y‑axis label
C) Adds a legend
D) Adds a title to the current axes
Answer: D
Explanation: title adds a textual title above the plot.

Question 41. Which of the following is the default solver for scikit‑learn’s LogisticRegression?
A) liblinear
B) saga
C) lbfgs
D) newton‑cg
Answer: C
Explanation: lbfgs has been the default optimizer for LogisticRegression since scikit‑learn 0.22.

Question 42. In Pandas, which method returns a DataFrame with duplicate rows removed?
A) dropna()
B) drop_duplicates()
C) unique()
D) distinct()
Answer: B
Explanation: By default, drop_duplicates eliminates rows that are identical across all columns.

Question 43. Which of the following statements about Python’s enumerate() is correct?
A) It returns a list of (index, value) tuples.
B) It creates a dictionary mapping indices to values.
C) It yields pairs of (index, element) during iteration.
D) It only works on NumPy arrays.

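The behavior described in Question 42 can be checked with a short sketch. The DataFrame below is invented for illustration; the subset argument is shown as one way to compare only some columns.

```python
import pandas as pd

# Hypothetical data with one fully repeated row (id=1, score=10).
df = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "score": [10, 10, 20, 30],
})

# By default a row counts as a duplicate only if it matches an earlier row
# in every column, so the two id=2 rows are both kept.
deduped = df.drop_duplicates()
print(len(deduped))

# The subset parameter limits the comparison to the named column(s).
by_id = df.drop_duplicates(subset="id")
print(len(by_id))
```
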
Explanation: legend displays the legend based on labeled plot elements.

Question 47. Which Pandas method converts a column of strings to datetime objects?
A) pd.to_datetime()
B) pd.datetime()
C) pd.time()
D) pd.as_date()
Answer: A
Explanation: pd.to_datetime parses strings into pandas Timestamp objects.

Question 48. What does the sklearn.metrics.mean_absolute_error compute?
A) Average of squared errors
B) Average absolute difference between true and predicted values
C) Ratio of correct predictions
D) Area under the ROC curve
Answer: B
Explanation: MAE measures average magnitude of errors without considering direction.

Question 49. Which of the following statements about the zip() function is true?
A) It returns a list of tuples.
B) It merges two dictionaries.
C) It creates an iterator of tuples from multiple iterables.
D) It only works with numeric sequences.
Answer: C
Explanation: zip pairs elements from each iterable, producing an iterator of tuples.

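Questions 48 and 49 can be tied together in a small pure-Python sketch: zip pairs true and predicted values, and the mean absolute error is computed by hand rather than with sklearn.metrics.mean_absolute_error. The numbers are made up for illustration.

```python
# Hypothetical true and predicted values, invented for this sketch.
y_true = [3.0, 5.0, 2.0, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

# zip is lazy: it returns an iterator of tuples, not a list.
pairs = zip(y_true, y_pred)
print(type(pairs).__name__)

# Mean absolute error by hand: the average of |true - predicted|.
mae = sum(abs(t - p) for t, p in pairs) / len(y_true)
print(mae)

# The iterator is now exhausted, so a second pass yields nothing.
leftover = list(pairs)
print(leftover)
```

Because the iterator can be consumed only once, wrap the call in list(...) when the pairs are needed more than once.
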
Question 50. In Seaborn, which plot is ideal for visualizing the relationship between three numeric variables?
A) boxplot
B) pairplot
C) heatmap
D) stripplot
Answer: B
Explanation: pairplot creates a matrix of scatterplots, one for each variable pair, with each variable's distribution shown on the diagonal.

Question 51. Which NumPy method creates a copy of an array with a new shape without changing data?
A) reshape()
B) resize()
C) flatten()
D) ravel()
Answer: A
Explanation: reshape returns a view or copy with the specified shape, leaving original data intact.

Question 52. What does the df.isnull().sum() expression compute?
A) Total number of rows with any missing values
B) Number of missing values per column
C) Boolean mask of missing entries
D) Fills missing values with zeros
Answer: B

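The "view or copy" wording in the explanation for Question 51 can be illustrated with a minimal sketch. The array is an arbitrary example; np.shares_memory is used here only to show that, for a contiguous array like this one, reshape does not copy the data.

```python
import numpy as np

a = np.arange(6)          # six values: 0, 1, 2, 3, 4, 5
b = a.reshape(2, 3)       # same six values, arranged as 2 rows by 3 columns

# The original array keeps its own shape.
print(a.shape, b.shape)

# For a contiguous array like this one, reshape returns a view, not a copy.
print(np.shares_memory(a, b))

# The total number of elements must match, otherwise reshape raises ValueError.
try:
    a.reshape(4, 2)
    reshaped = True
except ValueError:
    reshaped = False
print(reshaped)
```
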
Answer: D
Explanation: All three syntaxes correctly create a new column with the summed values.

Question 56. Which of the following is the correct syntax to raise a custom exception MyError?
A) raise MyError
B) throw MyError()
C) exception MyError
D) error MyError
Answer: A
Explanation: raise is used to trigger exceptions in Python.

Question 57. In NumPy, what is the shape of the array returned by np.eye(3)?
A) (3,)
B) (3,3)
C) (1,3)
D) (3,1)
Answer: B
Explanation: eye creates a 2‑D identity matrix of size 3×3.

Question 58. Which scikit‑learn transformer is used to convert categorical variables into one‑hot encoded arrays?
A) LabelEncoder
B) OneHotEncoder
C) OrdinalEncoder
D) MinMaxScaler
Answer: B
Explanation: OneHotEncoder creates binary columns for each category.

Question 59. What does the plt.figure(figsize=(10,5)) command do?
A) Sets the size of the plot window to 10 inches wide and 5 inches tall
B) Changes the DPI to 10×
C) Sets the resolution of the figure to 10×5 pixels
D) Adds a border of 10 by 5 units
Answer: A
Explanation: figsize specifies width and height in inches.

Question 60. Which of the following is a valid way to create a Pandas DataFrame from a dictionary of lists?
A) pd.DataFrame({'col1':[1,2], 'col2':[3,4]})
B) pd.Series({'col1':[1,2], 'col2':[3,4]})
C) pd.Panel({'col1':[1,2], 'col2':[3,4]})
D) pd.read_csv({'col1':[1,2], 'col2':[3,4]})
Answer: A
Explanation: DataFrame constructor accepts a dict of column names mapped to list‑like data.

Question 61. Which metric combines precision and recall into a single value?
A) Accuracy
B) F1‑Score
C) ROC‑AUC
D) Log‑Loss
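
The constructor call from Question 60 can be tried directly. The sketch below reuses the dictionary from option A and adds one assumed failure case, lists of unequal length, to show the constraint on the input.

```python
import pandas as pd

# Each dictionary key becomes a column name; each list becomes that column's values.
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]})
print(df.shape)
print(list(df.columns))

# The lists must have equal length, otherwise the constructor raises ValueError.
try:
    pd.DataFrame({"col1": [1, 2], "col2": [3]})
    built = True
except ValueError:
    built = False
print(built)
```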