Data Science and Machine Learning Exam Questions, Exams of Technology

Technology

A set of multiple-choice questions and answers related to data science and machine learning concepts. It covers topics such as data types, data structures, data preprocessing, linear regression, regularization, logistic regression, k-nearest neighbors (KNN), support vector machines (SVM), decision trees, clustering, and dimensionality reduction techniques like PCA and t-SNE. The questions are designed to test understanding of key concepts and their applications in data analysis and model building, making it a valuable resource for students and practitioners in the field.

Typology: Exams

2024/2025

Available from 10/25/2025

anil-kumar-jain-1 🇮🇳



Question 1. Which of the following is an example of a nominal data type?
A) Temperature in Celsius
B) Gender
C) Height in centimeters
D) Income
Answer: B
Explanation: Nominal data is categorical with no inherent order. Gender is a classic example.

Question 2. What distinguishes ordinal data from nominal data?
A) Ordinal data has an inherent order
B) Ordinal data is always numeric
C) Nominal data can be ranked
D) Nominal data has units
Answer: A
Explanation: Ordinal data are categorical with a logical order, unlike nominal data.

Question 3. Which variable is continuous?
A) Eye color
B) Number of children
C) Weight
D) Brand of car
Answer: C

A) Matrix
B) Vector
C) Data frame
D) Tensor
Answer: B
Explanation: A vector (or array) is optimal for a single column of values.

Question 6. What is a key difference between a matrix and a data frame?
A) Matrices store only numeric data, data frames can store mixed types
B) Data frames are always larger
C) Matrices can only be two-dimensional
D) Data frames are only available in R

Answer: A
Explanation: A matrix holds values of a single type (most often numeric), whereas a data frame can hold a different data type in each column.

Question 7. Which function in pandas is used to read a CSV file?
A) read_table()
B) read_csv()
C) load_csv()
D) import_csv()
Answer: B
Explanation: read_csv() is the standard pandas function for CSV files.

Question 8. What does the pandas .info() method display?

Explanation: MCAR stands for Missing Completely At Random.

Question 10. What is the first step in handling missing data?
A) Imputation
B) Identification
C) Deletion
D) Scaling
Answer: B
Explanation: You must identify missing values before handling them.

Question 11. Which imputation method is best for categorical data?
A) Mean
B) Median

C) Mode
D) Regression
Answer: C
Explanation: Mode imputation is appropriate for categorical data, since a mean cannot be computed for category labels.

Question 12. What is KNN imputation?
A) Replacing missing values with the mean
B) Predicting missing values using similar data points
C) Dropping all rows with missing data
D) Using the last observed value
Answer: B
Explanation: KNN imputation finds similar records to estimate missing values.
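The two imputation ideas in Questions 11 and 12 can be sketched in plain Python. This is a minimal illustration, not a library implementation: the function names `mode_impute` and `knn_impute`, the sample data, and the choice of Euclidean distance with a mean over the neighbors are all assumptions made for the example, and the sketch assumes the non-target columns contain no missing values.

```python
import math
from statistics import mode


def mode_impute(values):
    """Replace None entries with the most frequent observed category."""
    observed = [v for v in values if v is not None]
    fill = mode(observed)
    return [fill if v is None else v for v in values]


def knn_impute(rows, target, k=2):
    """Fill None in column `target` with the mean of that column over the
    k nearest complete rows (Euclidean distance on the other columns)."""
    complete = [r for r in rows if r[target] is not None]

    def distance(a, b):
        # Skip the target column, which is missing in the row being filled.
        return math.sqrt(sum((x - y) ** 2
                             for i, (x, y) in enumerate(zip(a, b))
                             if i != target))

    result = []
    for row in rows:
        row = list(row)
        if row[target] is None:
            nearest = sorted(complete, key=lambda r: distance(row, r))[:k]
            row[target] = sum(r[target] for r in nearest) / len(nearest)
        result.append(row)
    return result


# Hypothetical sample data for illustration only.
colors = mode_impute(["red", "blue", None, "red"])
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [2.1, None]]
filled = knn_impute(rows, target=1, k=2)
```

Here the missing color is filled with "red", the most frequent label, and the missing numeric value is estimated from the two rows whose first feature is closest to 2.1.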

B) Z-score
C) Min-Max scaling
D) PCA
Answer: B
Explanation: A Z-score measures how far a value deviates from the mean, in units of standard deviation.

Question 15. What does a box plot help visualize?
A) Data skewness
B) Outliers and quartiles
C) Correlations
D) Data types
Answer: B
Explanation: Box plots show quartiles, medians, and outliers.
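As a rough illustration of the Z-score idea above, the sketch below standardizes a list of values and flags those whose absolute Z-score exceeds a threshold. The function names, the sample data, and the threshold of 2.0 are assumptions chosen for the example (a threshold of 3 is also widely used), and the sketch assumes the values are not all identical, since the standard deviation would then be zero.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return (value - mean) / population standard deviation for each value."""
    mu = mean(values)
    sigma = pstdev(values)  # assumes sigma > 0
    return [(v - mu) / sigma for v in values]


def flag_outliers(values, threshold=2.0):
    """Return the values whose absolute Z-score exceeds the threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]


# Hypothetical sample with one obvious extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 11, 100]
outliers = flag_outliers(data, threshold=2.0)
```

With a small sample, a single extreme value inflates the standard deviation, which caps how large its own Z-score can get; that is one reason the threshold is a judgment call rather than a fixed rule.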

Question 16. What is capping/winsorization?
A) Removing missing values
B) Trimming data extremes
C) Replacing outliers with boundary values
D) Scaling values
Answer: C
Explanation: Winsorization replaces extreme values with the values at specified percentiles (for example, the 5th and 95th).

Question 17. What is normalization in data preprocessing?
A) Removing outliers
B) Scaling values to [0,1] range

Question 19. Why use log transformation on data?
A) To normalize data
B) To reduce skewness
C) To handle categorical variables
D) To encode missing values
Answer: B
Explanation: Log transformation helps reduce skewness in highly skewed data.

Question 20. Which is NOT an assumption of linear regression?
A) Linearity
B) Homoscedasticity

C) Independence
D) Non-linearity
Answer: D
Explanation: Linear regression assumes a linear relationship, so non-linearity is not one of its assumptions.

Question 21. What does the R² value in linear regression represent?
A) The slope
B) The intercept
C) Variance explained by the model
D) The error
Answer: C
Explanation: R² indicates the proportion of variance explained by the model.
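The "proportion of variance explained" in Question 21 can be made concrete with a short calculation: R² is one minus the ratio of the residual sum of squares to the total sum of squares. The function name `r_squared` and the sample values below are assumptions for illustration, and the sketch assumes the true values are not all equal (otherwise the total sum of squares is zero).

```python
def r_squared(y_true, y_pred):
    """Compute R^2 = 1 - SS_res / SS_tot for paired true and predicted values."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))  # unexplained
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)             # total
    return 1 - ss_res / ss_tot


# Hypothetical data: a perfect fit versus always predicting the mean.
y = [1, 2, 3, 4]
perfect = r_squared(y, [1, 2, 3, 4])
baseline = r_squared(y, [2.5, 2.5, 2.5, 2.5])
```

A perfect fit leaves no residual variance and scores 1, while a model that always predicts the mean explains none of the variance and scores 0.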

C) Improve data quality
D) Reduce variance
Answer: B
Explanation: Regularization discourages overly complex models.

Question 24. Polynomial features are used in regression to:
A) Encode categorical data
B) Model non-linear relationships
C) Standardize data
D) Detect outliers
Answer: B
Explanation: Polynomial features enable linear models to capture non-linear patterns.
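A small sketch may help show why polynomial features let a linear model capture a curve: the model stays linear in its weights, but the inputs are expanded to powers of x. The function names `polynomial_features` and `predict`, and the hand-picked weights, are assumptions for illustration; no fitting is performed here.

```python
def polynomial_features(x, degree):
    """Expand a single input x into [x, x^2, ..., x^degree]."""
    return [x ** d for d in range(1, degree + 1)]


def predict(x, weights, bias):
    """Linear model applied to the expanded features: bias + sum(w_d * x^d)."""
    features = polynomial_features(x, len(weights))
    return bias + sum(w * f for w, f in zip(weights, features))


# Hypothetical weights chosen by hand so the model represents y = x^2.
expanded = polynomial_features(3, 3)
y_hat = predict(3, weights=[0.0, 1.0], bias=0.0)
```

The prediction is a weighted sum of the features, exactly as in ordinary linear regression; the non-linearity comes entirely from the feature expansion.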

Question 25. What is the purpose of the sigmoid function in logistic regression?
A) Normalize data
B) Map outputs to probabilities
C) Detect outliers
D) Encode labels
Answer: B
Explanation: The sigmoid function maps any real value into the interval (0, 1), so the output can be interpreted as a probability.

Question 26. In K-Nearest Neighbors (KNN), what does ‘K’ represent?
A) Number of features

Explanation: Cosine similarity is less common for KNN classification.

Question 28. What is a kernel trick in SVM?
A) Data normalization
B) Transforming data to higher dimensions
C) Scaling features
D) Regularization
Answer: B
Explanation: The kernel trick allows SVMs to find non-linear boundaries by working implicitly in a higher-dimensional feature space, without computing the transformed coordinates explicitly.

Question 29. Which splitting criterion is used in decision trees for classification?
A) Mean squared error

B) Gini impurity
C) R² score
D) Ridge penalty
Answer: B
Explanation: Gini impurity is commonly used to measure node purity in classification trees.

Question 30. What does MSE stand for in regression metrics?
A) Mean Standard Error
B) Mean Squared Error
C) Median Squared Error
D) Maximum Squared Error
Answer: B
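The two metrics named in Questions 29 and 30 are short enough to compute by hand. The sketch below is a minimal illustration in plain Python; the function names and sample values are assumptions for the example. Gini impurity is taken as one minus the sum of squared class proportions in a node, and MSE as the average of squared differences between true and predicted values.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between true and predicted values."""
    return sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical examples: a pure node, an evenly mixed node, and one bad prediction.
pure = gini_impurity(["a", "a", "a", "a"])
mixed = gini_impurity(["a", "a", "b", "b"])
mse = mean_squared_error([1, 2, 3], [1, 2, 5])
```

A node containing a single class has impurity 0, and an even two-class split has impurity 0.5, the maximum for two classes.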
