Data and Machine Learning Practice Exam, Exams of Machine Learning

University of New South Wales (UNSW), Machine Learning

MATH5836 Data and Machine Learning Practice Exam from University of New South Wales 2025

Typology: Exams

2025/2026

Uploaded on 12/08/2025

aakash-12 🇦🇺


Assessment 4: Practice Exam

Introduction

Sample Final Examination

There are three parts to this examination:

Part A (Quiz): 10 marks

Part B (Short answers): 14 marks

Part C (Programming): 26 marks (Note that you have the option to do either Q2 or Q3)

Instructions

All answers must be submitted online using the instructions provided in the respective questions.

Answer all the questions in Part A and Part B, and one question in Part C.

Questions may be answered in any order.

Ensure you submit all answers.

Part B and Part C need to be submitted as a single PDF document in Moodle - Special Exam. For Part C, do not include any code in this document; code will be submitted in Ed.

Do not use email or any other software to communicate during the exam.

Do not use ChatGPT or other AI tools for the exam.

Please email me directly if you have any issues (rohitash.chandra@unsw.edu.au).

You are free to use any software installed on the lab computers, including Jupyter notebooks. Note that the entire course is run on Edstem and some libraries may not work on the desktop computers, so you should use Edstem rather than the desktop for coding.

Ensure that you create a PDF using OpenOffice and upload it to Moodle. You can simply save as doc and print as PDF.

The exam will be held for 2 hours in the lab with restricted internet, with access to the following sites and resources: https://edstem.org/au/courses/19116/lessons/60882/slides/413118

Note that the above weights for sections can change in your final exam. For example, you can have 15 marks with 17 multiple-choice questions (best 15) in Part A. The allocation and questions for Part B may be reduced if this happens.


You can reuse code from the lessons in the course and from exercise solutions.

Part A: Online quiz (10 marks)

You need to answer 10 multiple choice questions in this section. Each question is worth 1 mark. All the questions are compulsory.

Question 1

Which activation function in the output layer of a neural network would be most suited for a multiclass classification problem?

A. Softmax
B. ReLU
C. Hyperbolic tangent
D. Linear
E. None of the above

Question 2

Which of the following statements is correct?

A. Adam is generally faster than SGD.
B. Achieving excellent training performance on the training dataset implies that you have an excellent model.
C. It is best to randomly assign the number of hidden neurons irrespective of the dataset.
D. Keras employs scikit-learn in its core framework.
E. None of the above

Question 3

What would be a major difference between the role of a data scientist and a data engineer?

A. They do not have any differences in roles at major companies.
B. Data scientists typically use machine learning models to develop solutions and compile reports, while data engineers work with databases/datasets to organise, process and visualise data.
C. Data engineers are database managers and data scientists are programmers.
D. They both do similar work, but data scientists present mostly while data engineers develop models.
E. None of the above.

Question 4

What would be the best model for a highly non-linear and chaotic time series prediction problem?

A. Linear regression model
B. Logistic regression model
C. Neural network model with sigmoid activation function in the output layer
D. Neural network model with linear activation function in the output layer
E. Either linear or sigmoid activation can be used in the output layer of a neural network for this problem.

Question 5

Given the ROC curve and AUC (0.7) in the figure below, which of the following statements is true?

[Figure: ROC curve with AUC = 0.7]

Question 7

Which one of the following statements is true?

A. In bagging, models are trained sequentially, and the aim is to reduce errors in every subsequent step.
B. In boosting, models are trained in parallel, independent of each other, and the outcomes are combined.
C. In stacking, models are trained in parallel, independent of each other, and the outcomes are combined.
D. In bagging, models are trained independently of each other and the outcomes are combined.
E. None of the above.

Question 8

Suppose you want to cluster the following data set into two clusters. Which one of the following algorithms is the most suitable for your task?

[Figure: data set to be clustered]

A. K-Means Algorithm
B. DBSCAN Algorithm
C. Agglomerative Clustering Algorithm
D. Random Forest Algorithm

Question 9

Which one of the following sentences is correct?

A. Model-based collaborative filtering uses descriptions of items for recommendations, and is similar to Amazon-style recommender systems.
B. Collaborative filtering works well even with very limited past recommendations.
C. Memory-based collaborative filtering uses descriptions of items for recommendations, and is similar to Amazon-style recommender systems.
D. Model-based collaborative filtering uses well-understood techniques from information retrieval.
E. None of the above.

Question 10

Which one of the following statements is not true about Principal Component Analysis (PCA)?

A. PCA is an unsupervised method.
B. PCA searches for the directions in which the data have the smallest variance.
C. Maximum number of principal components <= number of features.
D. All principal components are orthogonal to each other.
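The PCA statements in the last question can be checked numerically. The sketch below is only an illustration: the synthetic 2-D data, the random seed, and the variable names are assumptions and not part of the exam. It computes the principal directions from the covariance matrix with NumPy and prints the variance along each component, whether the components are orthogonal, and how many components there are.

```python
import numpy as np

# Synthetic 2-D data (an assumption for illustration): wide spread along x, narrow along y.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) * np.array([3.0, 0.5])
X = X - X.mean(axis=0)                      # centre the data

# Eigen-decomposition of the covariance matrix gives the principal directions.
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns eigenvalues in ascending order

# Reorder so that the first component is the direction of largest variance.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

projected = X @ eigvecs
print("variance along each component:", projected.var(axis=0, ddof=1))
print("components orthogonal:", np.allclose(eigvecs.T @ eigvecs, np.eye(2)))
print("number of components:", eigvecs.shape[1], "<= number of features:", X.shape[1])
```

The first component carries the largest variance, which is why a statement that PCA looks for the directions of smallest variance does not hold.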

Pandas_Cheat_Sheet.pdf
Scikit_Learn_Cheat_Sheet_Python.pdf
numpy-user.pdf
matplotlib.pdf
Machine Learning Modelling in R.pdf
data-transformation.pdf

Calculator: https://www.desmos.com/scientific
Machine Learning in Python. Retrieved from https://scikit-learn.org/stable/
R interface to Keras. Retrieved from https://keras.rstudio.com/
Introduction to Keras. Retrieved from https://keras.io/
The caret Package. Retrieved from https://topepo.github.io/caret/
Python. Retrieved from https://docs.python.org/3/.
Rdrr.io. Retrieved from https://rdrr.io/r/
https://edstem.org/
https://pandas.pydata.org/
https://matplotlib.org/
https://moodle.telt.unsw.edu.au/

You need to upload a PDF document of your response in Moodle - Final Exam, depending on your session. Note that only one document needs to be uploaded, and it will include Part B and Part C.

Moodle submission link: Upload to Moodle (Section B and C Answers): https://moodle.telt.unsw.edu.au/mod/turnitintooltwo/view.php?id=

Part A: Solutions

1: A

2: A

3: B

4: E

5: A

6: A

7: D

8: B

9: E

10: B
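Solution 1 selects softmax for a multiclass output layer. As a minimal sketch of why, the function below (the name softmax and the example scores are assumptions for illustration) turns raw output-layer scores into probabilities that sum to 1, so the largest one can be read as the predicted class.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    shift = max(logits)                          # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])                 # hypothetical output-layer scores
predicted_class = probs.index(max(probs))
print(probs, predicted_class)
```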

Part B: Q1 (2 marks)

If a Decision Tree is overfitting the training set, is it a good idea to try decreasing max_depth? Briefly explain your answer. Type your response in the Challenge workspace (in the file answer.txt) and then click on the Submit button at the bottom right of the screen.
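One way to explore this question is to compare an unrestricted tree with a shallow one. The sketch below uses scikit-learn on noisy synthetic data; the dataset (make_moons), the noise level, and the depth of 3 are assumptions chosen for illustration, not values from the course. A large gap between training and test accuracy for the unrestricted tree is the usual sign of overfitting, and limiting max_depth is one form of regularisation that typically narrows it.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Noisy synthetic data (an assumption; not a course dataset).
X, y = make_moons(n_samples=600, noise=0.35, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

results = {}
for depth in (None, 3):      # None lets the tree grow until every leaf is pure
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    results[depth] = (
        tree.score(X_train, y_train),   # training accuracy
        tree.score(X_test, y_test),     # test accuracy
        tree.get_depth(),               # depth the tree actually reached
    )
    print(f"max_depth={depth}: train={results[depth][0]:.3f}, "
          f"test={results[depth][1]:.3f}, depth={results[depth][2]}")
```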

Part B: Q2 (2 marks)

Briefly explain the most important difference between the AdaBoost and the Gradient Boosting methods. Type your response in the Challenge workspace (in the file answer.txt) and then click on the Submit button at the bottom right of the screen.
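The two methods differ in what is passed on to the next weak learner. The sketch below shows one round of each on toy values; all labels, predictions, and targets are hypothetical, and the gradient boosting part assumes a squared loss. AdaBoost re-weights the samples so that misclassified ones count more, whereas gradient boosting leaves the weights alone and fits the next learner to the residuals of the current prediction.

```python
import math

# Toy binary labels in {-1, +1} and the predictions of one weak learner (both hypothetical).
y = [1, 1, -1, -1, 1]
h = [1, -1, -1, -1, 1]            # the learner misclassifies the second sample

# AdaBoost: re-weight the samples so the next learner focuses on the mistakes.
w = [1 / len(y)] * len(y)
err = sum(wi for wi, yi, hi in zip(w, y, h) if yi != hi)
alpha = 0.5 * math.log((1 - err) / err)             # this learner's vote in the ensemble
w = [wi * math.exp(-alpha * yi * hi) for wi, yi, hi in zip(w, y, h)]
total = sum(w)
w = [wi / total for wi in w]
print("AdaBoost sample weights:", w)

# Gradient boosting (squared loss): the next learner is fitted to the residuals
# of the current ensemble prediction; sample weights are left untouched.
targets = [3.0, 5.0, 2.0, 8.0, 6.0]                 # hypothetical regression targets
F = [sum(targets) / len(targets)] * len(targets)    # initial prediction: the mean
residuals = [t - f for t, f in zip(targets, F)]
print("Gradient boosting residuals:", residuals)
```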

Part B: Q4 (2 marks)

In a multi-layer perceptron, does increasing the number of hidden layers improve performance? Explain your answer with reference to any dataset example from the lessons or assignments. Type your response in the Challenge workspace (in the file answer.txt) and then click on the Submit button at the bottom right of the screen.

Part B: Q5 (2 marks)

Explain what is happening in the code below.

    def BackwardPass(self, input_vec, desired):
        out_delta = (desired - self.out) * (self.out * (1 - self.out))
        hid_delta = out_delta.dot(self.W2.T) * (self.hidout * (1 - self.hidout))
        if self.vanilla == True:
            self.W2 += self.hidout.T.dot(out_delta) * self.learn_rate
            self.B2 += (-1 * self.learn_rate * out_delta)
            self.W1 += (input_vec.T.dot(hid_delta) * self.learn_rate)
            self.B1 += (-1 * self.learn_rate * hid_delta)
        else:
            v2 = self.W2.copy()
            v1 = self.W1.copy()
            b2 = self.B2.copy()
            b1 = self.B1.copy()
            self.W2 += (v2 * self.momenRate) + (self.hidout.T.dot(out_delta) * self.lear
            self.W1 += (v1 * self.momenRate) + (input_vec.T.dot(hid_delta) * self.learn_
            self.B2 += (b2 * self.momenRate) + (-1 * self.learn_rate * out_delta) # self.B1 += (b1 * self.momenRate) + (-1 * self.learn_rate * hid_delta)

Type your response in the Challenge workspace (in the file answer.txt) and then click on the Submit button at the bottom right of the screen.
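The update rules in the question can be exercised in a small self-contained sketch. Everything beyond the delta formulas is an assumption for illustration: the class name TinyNetwork, its constructor arguments, the forward pass with subtracted biases (chosen to match the sign of the bias updates in the question), the toy input, and the use of classic momentum that stores the previous update, which is not the same as the weight copies v1 and v2 in the question.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyNetwork:
    """One-hidden-layer sigmoid network trained one sample at a time (hypothetical)."""

    def __init__(self, n_in, n_hid, n_out, learn_rate=0.5, momentum=0.0, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.5, size=(n_in, n_hid))
        self.B1 = np.zeros((1, n_hid))
        self.W2 = rng.normal(scale=0.5, size=(n_hid, n_out))
        self.B2 = np.zeros((1, n_out))
        self.learn_rate = learn_rate
        self.momentum = momentum
        # Previous updates; they only matter when momentum > 0.
        self.prev = [np.zeros_like(p) for p in (self.W1, self.B1, self.W2, self.B2)]

    def forward(self, x):
        # Biases are subtracted, matching the "-1 * learn_rate * delta" bias updates.
        self.hidout = sigmoid(x @ self.W1 - self.B1)
        self.out = sigmoid(self.hidout @ self.W2 - self.B2)
        return self.out

    def backward(self, x, desired):
        # Output error scaled by the sigmoid derivative out * (1 - out).
        out_delta = (desired - self.out) * self.out * (1 - self.out)
        # Propagate the error back through W2 to the hidden layer.
        hid_delta = (out_delta @ self.W2.T) * self.hidout * (1 - self.hidout)
        steps = [
            self.learn_rate * (x.T @ hid_delta),              # W1
            -self.learn_rate * hid_delta,                     # B1
            self.learn_rate * (self.hidout.T @ out_delta),    # W2
            -self.learn_rate * out_delta,                     # B2
        ]
        # Classic momentum: add a fraction of the previous update to the current one.
        self.prev = [self.momentum * p + s for p, s in zip(self.prev, steps)]
        self.W1 += self.prev[0]
        self.B1 += self.prev[1]
        self.W2 += self.prev[2]
        self.B2 += self.prev[3]

x = np.array([[0.0, 1.0]])          # toy input (assumption)
target = np.array([[1.0]])          # toy desired output (assumption)
net = TinyNetwork(n_in=2, n_hid=3, n_out=1)
before = abs(target - net.forward(x)).item()
for _ in range(200):
    net.forward(x)
    net.backward(x, target)
after = abs(target - net.forward(x)).item()
print(f"error before: {before:.3f}, after: {after:.3f}")
```

With momentum left at 0.0 this corresponds to the plain gradient-descent branch; passing a non-zero momentum adds a fraction of the previous step to each update.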

Part C: Programming questions (26 marks)

For Part C questions, you need to answer Part C: Q1 and one of Part C: Q2 OR Part C: Q

Part C: Q1 (6 marks)

This remains a challenge for large models and unstructured datasets. For the following tasks, you need to write Python (or R) code, along with the required comments, in the file answer.py (or answer.r) and submit your solution. Load the dataset available in dataset_clustering.csv. Your task is to cluster the dataset using K-Means. You need to use silhouette scores to select a suitable number of clusters and store that value in the variable named best_k; the corresponding model should be stored in the variable named best_model. In your comments, provide brief justifications, with clearly articulated reasons, for the alternatives you explored to build the model you submitted.
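A minimal sketch of the selection loop described above, using scikit-learn. Since dataset_clustering.csv is not available here, the data is generated with make_blobs as a stand-in; the candidate range of 2 to 8 clusters, n_init, and the random seeds are also assumptions. Only best_k and best_model are names taken from the task.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Stand-in data (an assumption): in the exam this would be loaded from
# dataset_clustering.csv, for example with pandas.read_csv.
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6, random_state=0)

best_k, best_model, best_score = None, None, -1.0
for k in range(2, 9):            # the silhouette score needs at least 2 clusters
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    score = silhouette_score(X, model.labels_)
    print(f"k={k}: silhouette={score:.3f}")
    if score > best_score:       # keep the k with the highest mean silhouette
        best_k, best_model, best_score = k, model, score

print("best_k =", best_k)
```

The silhouette score ranges from -1 to 1, and higher values indicate tighter, better separated clusters, which is why the loop keeps the k with the highest score.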

How to submit

Type your solution (Python code and comments) in the Challenge workspace (in the file answer.py or answer.r) and then click on the Submit button at the bottom right of the screen.
