Data Science and Machine Learning Exam Questions, Exams of Technology

A set of multiple-choice questions and answers on data science and machine learning concepts. It covers data types, data structures, data preprocessing, linear regression, regularization, logistic regression, k-nearest neighbors (KNN), support vector machines (SVM), decision trees, clustering, and dimensionality reduction techniques such as PCA and t-SNE. The questions test understanding of key concepts and their application in data analysis and model building, and are aimed at students and practitioners in the field.

Typology: Exams

2024/2025

Available from 10/25/2025

anil-kumar-jain-1

Question 1. Which of the following is an example of a nominal data
type?
A) Temperature in Celsius
B) Gender
C) Height in centimeters
D) Income
Answer: B
Explanation: Nominal data is categorical with no inherent order. Gender
is a classic example.
Question 2. What distinguishes ordinal data from nominal data?
A) Ordinal data has an inherent order
B) Ordinal data is always numeric



C) Nominal data can be ranked
D) Nominal data has units
Answer: A
Explanation: Ordinal data are categorical with a logical order, unlike
nominal data.
Question 3. Which variable is continuous?
A) Eye color
B) Number of children
C) Weight
D) Brand of car
Answer: C
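
The distinction between nominal, ordinal, and continuous variables in Questions 1 to 3 can be illustrated with a small sketch. All sample values and the names `eye_color`, `shirt_size`, `weight_kg`, and `SIZE_RANK` are hypothetical, invented only for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample values, invented purely for illustration.
eye_color = ["brown", "blue", "brown", "green", "brown"]      # nominal
shirt_size = ["small", "large", "medium", "small", "medium"]  # ordinal
weight_kg = [61.5, 72.0, 68.25, 80.0, 55.75]                  # continuous

# Nominal: categories can be counted, but there is no meaningful ranking.
most_common_color = Counter(eye_color).most_common(1)[0][0]

# Ordinal: an explicit ranking makes sorting and a median meaningful.
SIZE_RANK = {"small": 0, "medium": 1, "large": 2}
ordered_sizes = sorted(shirt_size, key=SIZE_RANK.get)
median_size = ordered_sizes[len(ordered_sizes) // 2]

# Continuous: arithmetic such as the mean is meaningful.
mean_weight = mean(weight_kg)

print(most_common_color, median_size, mean_weight)
```

The mode is the only summary used for the nominal variable, while the ordinal variable also supports a median once a ranking is supplied.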

A) Matrix
B) Vector
C) Data frame
D) Tensor
Answer: B
Explanation: A vector (or array) is optimal for a single column of values.
Question 6. What is a key difference between a matrix and a data frame?
A) Matrices store only numeric data, data frames can store mixed types
B) Data frames are always larger
C) Matrices can only be two-dimensional
D) Data frames are only available in R

Answer: A
Explanation: A matrix holds elements of a single type (typically
numeric); a data frame can hold a different data type in each column.
Question 7. Which function in pandas is used to read a CSV file?
A) read_table()
B) read_csv()
C) load_csv()
D) import_csv()
Answer: B
Explanation: read_csv() is the standard pandas function for CSV files.
Question 8. What does the pandas .info() method display?

Explanation: MCAR stands for Missing Completely At Random.
Question 10. What is the first step in handling missing data?
A) Imputation
B) Identification
C) Deletion
D) Scaling
Answer: B
Explanation: You must identify missing values before handling them.
Question 11. Which imputation method is best for categorical data?
A) Mean
B) Median

C) Mode
D) Regression
Answer: C
Explanation: Mode imputation is appropriate for categorical data.
Question 12. What is KNN imputation?
A) Replacing missing values with the mean
B) Predicting missing values using similar data points
C) Dropping all rows with missing data
D) Using the last observed value
Answer: B
Explanation: KNN imputation finds similar records to estimate missing
values.
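
The two imputation strategies in Questions 11 and 12 can be sketched in plain Python. This is a minimal illustration, not a library implementation: the helper names `impute_mode` and `knn_impute` are hypothetical, missing values are assumed to be encoded as `None`, only one column is assumed to contain gaps, and Euclidean distance with an unweighted mean of the neighbours is one choice among several.

```python
import math
from collections import Counter


def impute_mode(values):
    """Fill None entries with the most frequent observed category."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def knn_impute(rows, target_col, k=2):
    """Fill None in target_col with the mean of that column over the k
    complete rows that are closest in the remaining columns.

    Assumes at least one complete row and no gaps in the other columns.
    """
    complete = [r for r in rows if r[target_col] is not None]
    filled = []
    for row in rows:
        if row[target_col] is not None:
            filled.append(list(row))
            continue

        def distance(other, row=row):
            # Euclidean distance over every column except the missing one.
            return math.sqrt(sum(
                (a - b) ** 2
                for i, (a, b) in enumerate(zip(row, other))
                if i != target_col
            ))

        neighbours = sorted(complete, key=distance)[:k]
        estimate = sum(n[target_col] for n in neighbours) / len(neighbours)
        new_row = list(row)
        new_row[target_col] = estimate
        filled.append(new_row)
    return filled


# Hypothetical toy data.
colors = ["red", None, "blue", "red"]
rows = [[1.0, 10.0], [2.0, 20.0], [9.0, 90.0], [1.5, None]]

print(impute_mode(colors))
print(knn_impute(rows, target_col=1, k=2))
```

In the toy data the incomplete row is closest to the first two rows, so its missing value is estimated from them rather than from the distant third row.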

B) Z-score
C) Min-Max scaling
D) PCA
Answer: B
Explanation: Z-score measures how far a value deviates from the mean, in
units of standard deviation.
Question 15. What does a box plot help visualize?
A) Data skewness
B) Outliers and quartiles
C) Correlations
D) Data types
Answer: B
Explanation: Box plots show quartiles, medians, and outliers.
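
Both ideas above can be sketched with Python's `statistics` module. The helper names `z_scores` and `iqr_outliers` and the sample data are hypothetical. The 1.5 × IQR fence is the common box-plot convention, assumed here rather than taken from the questions.

```python
from statistics import mean, pstdev, quantiles


def z_scores(values):
    """Distance of each value from the mean, in standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    return [(v - mu) / sigma for v in values]


def iqr_outliers(values, whisker=1.5):
    """Values outside the box-plot fences Q1 - w*IQR and Q3 + w*IQR."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sample with one obvious extreme value.
data = [10, 12, 11, 13, 12, 11, 50]

print([round(z, 2) for z in z_scores(data)])
print(iqr_outliers(data))
```

On a sample this small the extreme value inflates the standard deviation, so its z-score stays below the frequently quoted cutoff of 3, while the quartile-based fences still flag it.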

Question 16. What is capping/winsorization?
A) Removing missing values
B) Trimming data extremes
C) Replacing outliers with boundary values
D) Scaling values
Answer: C
Explanation: Winsorization replaces extreme values with the values at
specified percentiles.
Question 17. What is normalization in data preprocessing?
A) Removing outliers
B) Scaling values to [0,1] range

Question 19. Why use log transformation on data?
A) To normalize data
B) To reduce skewness
C) To handle categorical variables
D) To encode missing values
Answer: B
Explanation: Log transformation helps reduce skewness in highly skewed
data.
Question 20. Which is NOT an assumption of linear regression?
A) Linearity
B) Homoscedasticity

C) Independence
D) Non-linearity
Answer: D
Explanation: Linear regression assumes a linear relationship, so
non-linearity is not one of its assumptions.
Question 21. What does the R² value in linear regression represent?
A) The slope
B) The intercept
C) Variance explained by the model
D) The error
Answer: C
Explanation: R² indicates the proportion of variance explained by the
model.
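
As a rough illustration of Question 21, R² can be computed by hand as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below fits a one-variable least-squares line. The helper names `fit_line` and `r_squared` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Proportion of the variance in ys explained by the fitted line."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data points.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
print(slope, intercept, r_squared(xs, ys, slope, intercept))
```

A value near 1 means the line accounts for most of the variation in `ys`. A value near 0 means it does little better than predicting the mean.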

C) Improve data quality
D) Reduce variance
Answer: B
Explanation: Regularization discourages overly complex models.
Question 24. Polynomial features are used in regression to:
A) Encode categorical data
B) Model non-linear relationships
C) Standardize data
D) Detect outliers
Answer: B
Explanation: Polynomial features enable linear models to capture
non-linear patterns.
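
The point of Question 24 can be shown with a deliberately small sketch: a straight line fitted to the raw input misses a quadratic pattern, while the same linear fit on a squared feature recovers it. The helper `fit_line` and the data (generated from y = x² + 1) are hypothetical, and only a single squared term is used rather than a full polynomial expansion.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical data following y = x**2 + 1.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 + 1 for x in xs]

# Linear fit on the raw feature: the symmetric pattern yields a flat line.
raw_slope, raw_intercept = fit_line(xs, ys)

# Same linear fit on the polynomial feature x**2.
squared = [x ** 2 for x in xs]
poly_slope, poly_intercept = fit_line(squared, ys)

print(raw_slope, raw_intercept)
print(poly_slope, poly_intercept)
```

The model stays linear in its coefficients. Only the input feature has changed, which is why a linear method can still be used.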

Question 25. What is the purpose of the sigmoid function in logistic
regression?
A) Normalize data
B) Map outputs to probabilities
C) Detect outliers
D) Encode labels
Answer: B
Explanation: The sigmoid function maps any real value into the interval
(0, 1), so the output can be interpreted as a probability.
Question 26. In K-Nearest Neighbors (KNN), what does ‘K’ represent?
A) Number of features

Explanation: Cosine similarity is less common for KNN classification.
Question 28. What is a kernel trick in SVM?
A) Data normalization
B) Transforming data to higher dimensions
C) Scaling features
D) Regularization
Answer: B
Explanation: The kernel trick lets SVMs find non-linear boundaries by
working implicitly in a higher-dimensional feature space, without
computing the transformation explicitly.
Question 29. Which splitting criterion is used in decision trees for
classification?
A) Mean squared error

B) Gini impurity
C) R² score
D) Ridge penalty
Answer: B
Explanation: Gini impurity is commonly used to measure node purity in
classification trees.
Question 30. What does MSE stand for in regression metrics?
A) Mean Standard Error
B) Mean Squared Error
C) Median Squared Error
D) Maximum Squared Error
Answer: B
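
The two measures named in Questions 29 and 30 are short enough to write out directly. This is a minimal sketch with hypothetical helper names `gini_impurity` and `mean_squared_error` and invented sample values. Gini impurity is taken as 1 minus the sum of squared class proportions, and MSE as the average squared difference between true and predicted values.

```python
from collections import Counter


def gini_impurity(labels):
    """1 - sum of squared class proportions; 0 means a pure node."""
    total = len(labels)
    counts = Counter(labels)
    return 1 - sum((count / total) ** 2 for count in counts.values())


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between truth and prediction."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical examples.
print(gini_impurity(["a", "a", "a", "a"]))   # pure node
print(gini_impurity(["a", "a", "b", "b"]))   # evenly mixed two-class node
print(mean_squared_error([1, 2, 3], [1, 2, 5]))
```

A classification tree would pick the split that most reduces the weighted impurity of the child nodes, whereas MSE plays the analogous role for regression.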