Machine Learning Exam Questions and Answers, Exams of Technology

A set of multiple-choice questions and answers on machine learning concepts. It covers supervised and unsupervised learning, feature selection, model evaluation, and hyperparameter tuning. The questions test understanding of key concepts and principles in machine learning, making the set a useful resource for students and practitioners alike. Each answer comes with an explanation, which makes the set suitable for exam preparation or self-study. Topics include linear regression, logistic regression, k-nearest neighbors, SVMs, decision trees, ensemble methods, PCA, and feature scaling.

Typology: Exams

2024/2025

Available from 10/18/2025

anil-kumar-jain-1

44566 Test 1 5 Exam
Question 1. Which of the following best defines Machine Learning (ML)?
A) A set of rigid rules programmed explicitly by humans
B) A subset of artificial intelligence focused on data-driven pattern recognition
C) A hardware-based approach to data processing
D) A statistical method that does not involve algorithms
Answer: B
Explanation: Machine Learning is a subset of AI that enables systems to learn
from data and improve performance without explicit programming.
Question 2. What are the primary types of Machine Learning?
A) Supervised, Unsupervised, Reinforcement Learning
B) Linear, Nonlinear, Polynomial Learning
C) Batch, Online, Offline Learning
D) Predictive, Descriptive, Prescriptive Learning
Answer: A
Explanation: The main types are supervised learning, unsupervised learning, and
reinforcement learning, each differing in how they learn from data.
Question 3. In ML terminology, what does a 'Feature' refer to?
A) The output variable to be predicted
B) An individual measurable property of the data
C) The entire dataset used for training
D) The model's decision rule
Answer: B
Explanation: Features are individual measurable attributes or variables that are used as input to a model.
Question 4. Which term describes the set of all possible functions that a learning algorithm can choose from?
A) Hypothesis space
B) Data universe
C) Parameter set
D) Model library
Answer: A
Explanation: The hypothesis space encompasses all functions that the learning algorithm can select to approximate the target function.
Question 5. What is the primary goal of minimizing bias in a machine learning model?
A) To increase model complexity
B) To improve the model's ability to capture true patterns
C) To reduce computation time
D) To make the model more interpretable

Question 8. What do eigenvalues and eigenvectors of a matrix represent?
A) Scalar factors and directions that characterize the matrix's transformations
B) The mean and variance of the data
C) The maximum and minimum elements in a matrix
D) The row and column sums
Answer: A
Explanation: An eigenvector is a direction that the matrix only stretches or shrinks, and its eigenvalue is the scaling factor along that direction.
Question 9. In calculus, what does the gradient of a function indicate?
A) The curvature of the function
B) The direction and rate of fastest increase
C) The point of inflection
D) The minimum point only
Answer: B
Explanation: The gradient points in the direction of steepest ascent, and its magnitude gives the rate of increase in that direction.
Question 10. Which probability rule states that the probability of the union of two events is the sum of their individual probabilities minus their intersection?
A) Addition rule
B) Multiplication rule
C) Bayes' theorem
D) Conditional probability
Answer: A
Explanation: The addition rule accounts for overlapping probabilities to avoid double-counting.
Question 11. What is a 'sample' in the context of machine learning?
A) The entire population data
B) A subset of data used for training or testing
C) The predicted output of a model
D) The features used for modeling
Answer: B
Explanation: A sample is a subset of data taken from the population for analysis or training.
Question 12. Which type of learning uses labeled data for training?
A) Supervised learning
B) Unsupervised learning
C) Reinforcement learning

Explanation: Cross-entropy loss measures the difference between predicted probabilities and actual labels in classification.
Question 15. How does the K-Nearest Neighbors (KNN) algorithm classify a new data point?
A) By finding the majority class among its K nearest neighbors
B) By fitting a linear boundary
C) By optimizing a decision tree
D) By calculating the mean of all data points
Answer: A
Explanation: KNN classifies based on the majority label of the K closest data points using a distance metric.
Question 16. Which kernel function is commonly used in SVMs to handle nonlinear data?
A) Polynomial kernel
B) Linear kernel
C) Sigmoid kernel
D) RBF (Radial Basis Function)
Answer: D
Explanation: The RBF kernel implicitly maps data into a higher-dimensional space, where a linear separator can capture nonlinear structure in the original features. Polynomial and sigmoid kernels are also nonlinear, but RBF is the most common choice.
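
Worked example for Question 15. The majority-vote rule can be sketched in a few lines of plain Python. This is a minimal illustration, not part of the original exam: the function name `knn_predict`, the toy points, and the choice of Euclidean distance are all assumptions made for the example.

```python
from collections import Counter
from math import dist


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep the labels of the k closest points and return the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data (made up for illustration): two small, well-separated clusters.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction = knn_predict(points, labels, query=(2.0, 2.0), k=3)
print(prediction)
```

The query point sits next to the first cluster, so its three nearest neighbors all carry the label "a". An odd K is a common way to avoid ties in two-class problems.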

Question 17. What does a decision tree split on a feature using Gini impurity aim to minimize?
A) The variability within each split
B) The sum of squared errors
C) The heterogeneity of the classes within the node
D) The total number of splits
Answer: C
Explanation: Gini impurity measures how mixed the classes are within a node, and splitting aims to minimize this.
Question 18. What is the main purpose of ensemble methods like Random Forests?
A) To combine multiple models to improve accuracy
B) To reduce the size of the dataset
C) To simplify the model
D) To perform dimensionality reduction
Answer: A
Explanation: Ensemble methods aggregate multiple models to enhance predictive performance and robustness.
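
Worked example for Question 17. The sketch below computes Gini impurity as 1 minus the sum of squared class proportions, then compares two candidate splits by the size-weighted impurity of their child nodes. The helper names and the toy labels are assumptions made for illustration; they do not come from the exam.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 - sum of squared class proportions (0 means pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def weighted_gini(left, right):
    """Impurity of a split: child impurities weighted by child size."""
    total = len(left) + len(right)
    left_part = (len(left) / total) * gini_impurity(left)
    right_part = (len(right) / total) * gini_impurity(right)
    return left_part + right_part


# Toy node (made up): three "yes" and three "no" samples.
parent = ["yes", "yes", "yes", "no", "no", "no"]
mixed_split = weighted_gini(["yes", "no", "yes"], ["no", "yes", "no"])
clean_split = weighted_gini(["yes", "yes", "yes"], ["no", "no", "no"])

print(gini_impurity(parent), mixed_split, clean_split)
```

The split that separates the classes perfectly has weighted impurity 0, so a tree that minimizes Gini impurity would prefer it over the mixed split.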

B) To increase model complexity
C) To eliminate outliers
D) To select the best features
Answer: A
Explanation: Feature scaling ensures that all features contribute equally to the model by normalizing their ranges.
Question 22. Which imputation method replaces missing values with the median of the feature?
A) Mean imputation
B) Median imputation
C) Mode imputation
D) Random imputation
Answer: B
Explanation: Median imputation replaces missing data with the median, which is robust to outliers.
Question 23. What does a high ROC AUC score indicate about a classifier?
A) Excellent ability to distinguish between classes
B) Poor performance
C) Overfitting
D) High bias
Answer: A
Explanation: A high AUC indicates the model has a strong ability to discriminate between positive and negative classes.
Question 24. Which metric measures the proportion of actual positives correctly identified?
A) Precision
B) Recall (Sensitivity)
C) Accuracy
D) F1-Score
Answer: B
Explanation: Recall is the fraction of actual positives that the model correctly identifies: TP / (TP + FN).
Question 25. In regression, what does the R² score indicate?
A) The proportion of variance explained by the model
B) The average error magnitude
C) The correlation between features
D) The probability of prediction correctness
Answer: A
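
Worked example for Questions 24 and 25. The two metrics can be computed directly from their definitions. This is a minimal sketch with made-up data; the function names are chosen for the example, and it assumes at least one actual positive (for recall) and non-constant targets (for R²), since either case would otherwise divide by zero.

```python
def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted as positive."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == positive and p == positive)
    false_neg = sum(1 for t, p in pairs if t == positive and p != positive)
    # Assumes at least one actual positive in y_true.
    return true_pos / (true_pos + false_neg)


def r2_score(y_true, y_pred):
    """R^2 = 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # Assumes y_true is not constant, so ss_tot is non-zero.
    return 1.0 - ss_res / ss_tot


# Classification toy data: 3 actual positives, 2 of them detected.
class_recall = recall([1, 1, 1, 0, 0], [1, 0, 1, 0, 1])

# Regression toy data: predictions close to the targets.
fit_quality = r2_score([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 3.0, 3.5])

print(class_recall, fit_quality)
```

Here two of the three actual positives are found, giving a recall of 2/3, and the residual sum of squares is one tenth of the total sum of squares, giving an R² of 0.9.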

Question 28. What does stratified K-Fold cross-validation aim to preserve?
A) The distribution of classes in each fold
B) The order of data points
C) The randomness of data splitting
D) The grouping of similar data points
Answer: A
Explanation: Stratified K-Fold maintains the class distribution across folds, especially important for imbalanced datasets.
Question 29. Which metric is used to evaluate the accuracy of a classification model by considering true positives, true negatives, false positives, and false negatives?
A) Confusion matrix
B) Mean Squared Error
C) R² score
D) Precision
Answer: A
Explanation: The confusion matrix summarizes classification performance by counting TP, TN, FP, FN.
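
Worked example for Question 28. One simple way to keep class proportions similar across folds is to group sample indices by class and deal each group out round-robin. The sketch below is an assumption-laden illustration, not a library implementation: `stratified_folds` is a hypothetical helper, it does not shuffle, and production code would typically shuffle within each class first.

```python
from collections import defaultdict


def stratified_folds(labels, n_folds):
    """Assign each sample index to a fold so class proportions stay similar."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    # Deal each class's indices round-robin across the folds.
    for indices in by_class.values():
        for position, index in enumerate(indices):
            folds[position % n_folds].append(index)
    return folds


# Imbalanced toy labels (made up): 25% positive, 75% negative.
labels = ["pos"] * 3 + ["neg"] * 9
folds = stratified_folds(labels, n_folds=3)

for fold in folds:
    print(sorted(fold), [labels[i] for i in sorted(fold)])
```

Each of the three folds ends up with one positive and three negatives, matching the 25% positive rate of the full dataset.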

Question 30. Which of the following is NOT a common method for handling missing data?
A) Mean imputation
B) Dropping missing values
C) One-Hot Encoding
D) Median imputation
Answer: C
Explanation: One-Hot Encoding is used for categorical feature encoding, not for handling missing data.
Question 31. Which feature transformation technique rescales data to a fixed range, typically [0,1]?
A) Min-Max Scaling
B) Standardization
C) Log Transformation
D) Power Transformation
Answer: A
Explanation: Min-Max Scaling rescales features to a specified range, often [0,1].
Question 32. In the context of supervised learning, what does the term 'training set' refer to?
A) Data used to train the model

D) Agglomerative Clustering
Answer: A
Explanation: DBSCAN groups points based on density and can identify noise as outliers.
Question 35. What is the primary purpose of pruning in decision trees?
A) To reduce overfitting by removing branches that do not provide predictive power
B) To increase the depth of the tree
C) To optimize the split criteria
D) To select features
Answer: A
Explanation: Pruning simplifies the tree, reducing overfitting and improving generalization.
Question 36. Which of the following best describes ensemble learning?
A) Combining multiple models to improve overall performance
B) Reducing the number of features used
C) Applying a single complex model
D) Clustering data into subsets
Answer: A
Explanation: Ensemble learning aggregates predictions from multiple models to enhance accuracy and robustness.
Question 37. What is the main difference between bagging and boosting?
A) Bagging reduces variance; boosting reduces bias
B) Bagging uses sequential models; boosting uses parallel models
C) Bagging combines models; boosting trains models sequentially to correct errors
D) Bagging is for classification; boosting is for regression only
Answer: C
Explanation: Bagging trains models independently on bootstrap samples and aggregates their predictions, while boosting trains models sequentially so that each new model focuses on the errors of the previous ones. Option A describes a typical effect of each method (bagging mainly reduces variance, boosting mainly reduces bias) rather than how they operate.
Question 38. Which clustering method builds a hierarchy from the bottom up?
A) Agglomerative clustering
B) Divisive clustering
C) K-Means
D) DBSCAN
Answer: A
Explanation: Agglomerative clustering starts with individual points and merges them into clusters hierarchically.
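
Worked example for Question 38. The bottom-up merging process can be sketched for one-dimensional data. This is a simplified illustration under stated assumptions: it uses single linkage (distance between the closest members of two clusters), it stops at a requested number of clusters instead of building the full hierarchy, and the function name and data are made up for the example.

```python
def agglomerative_1d(values, n_clusters):
    """Bottom-up single-linkage clustering of 1-D values."""
    # Start with every point in its own cluster.
    clusters = [[v] for v in sorted(values)]
    while len(clusters) > n_clusters:
        # With sorted 1-D data, the closest pair of clusters is always adjacent,
        # and the single-linkage distance is the gap between neighbours.
        gaps = [
            clusters[i + 1][0] - clusters[i][-1]
            for i in range(len(clusters) - 1)
        ]
        i = gaps.index(min(gaps))
        # Merge the closest pair, then repeat.
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters


result = agglomerative_1d([1.0, 1.2, 5.0, 5.1, 9.0], n_clusters=3)
print(result)
```

The two tightest pairs (5.0 with 5.1, then 1.0 with 1.2) merge first, leaving 9.0 as its own cluster. Recording the order of merges instead of stopping early would give the full hierarchy.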

B) F1-Score
C) Confusion Matrix
D) ROC Curve
Answer: A
Explanation: MAE measures the average magnitude of errors in continuous predictions.
Question 42. In the context of hyperparameter tuning, what does Random Search do?
A) Randomly samples hyperparameter combinations to find the best
B) Exhaustively searches all possible combinations
C) Uses gradient descent to optimize hyperparameters
D) Selects hyperparameters based on prior knowledge
Answer: A
Explanation: Random Search explores hyperparameter space randomly, often more efficiently than grid search.
Question 43. What is the purpose of stratified sampling in data splitting?
A) To maintain class distribution in training and test sets
B) To randomly select data points without regard to class
C) To balance the dataset by oversampling minority classes
D) To reduce overfitting
Answer: A
Explanation: Stratified sampling ensures that the proportion of classes remains consistent across splits.
Question 44. Which metric is used to measure the overlap between predicted and actual positive cases in classification?
A) Precision
B) Recall
C) F1-Score
D) Accuracy
Answer: A
Explanation: Precision quantifies the proportion of predicted positives that are actually positive.
Question 45. Which of the following is NOT a common kernel used in SVMs?
A) Linear kernel
B) Polynomial kernel
C) Sigmoid kernel
D) Exponential kernel
Answer: D
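
Worked example for Question 42. Random search can be sketched as a loop that draws one value per hyperparameter and keeps the best-scoring combination. Everything named here is an assumption made for illustration: `random_search` is a hypothetical helper, and `toy_objective` stands in for what would normally be a cross-validated model score.

```python
import random


def random_search(objective, space, n_trials, seed=0):
    """Sample hyperparameter combinations at random and keep the best score."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        # Draw one value per hyperparameter, independently of earlier trials.
        params = {name: rng.choice(choices) for name, choices in space.items()}
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    """Hypothetical stand-in for a validation score; peaks at depth=5, lr=0.1."""
    return -abs(params["depth"] - 5) - abs(params["lr"] - 0.1)


space = {"depth": [1, 3, 5, 7, 9], "lr": [0.001, 0.01, 0.1, 1.0]}
best_params, best_score = random_search(toy_objective, space, n_trials=20)
print(best_params, best_score)
```

Unlike grid search, the number of trials is fixed in advance and does not grow with the number of hyperparameters, which is why random search is often cheaper when only a few hyperparameters matter.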