Assignment I Questions - Machine Learning and Data Mining | CS 434, Assignments of Computer Science

Oregon State University (OSU), Computer Science

Material Type: Assignment; Class: MACHINE LEARNING AND DATA MINING; Subject: Computer Science; University: Oregon State University; Term: Unknown 1989;

Typology: Assignments

Pre 2010

Uploaded on 08/31/2009

koofers-user-0ro 🇺🇸


CS 434 Assignment 1

Due: Monday Oct 13th in class

Part I. (14pts)

Throughout the course, Weka will be a very useful tool for exploring different machine learning and data mining algorithms. The purpose of this part of the assignment is to familiarize you with this software package. For this assignment, please download and install the Weka software package (version 3.4) from

http://www.cs.waikato.ac.nz/ml/weka/

If you don’t have access to a computer for this purpose, please inform the instructor so that special arrangements can be made to accommodate your needs.

Basic information on this software package can be found in the following tutorial:

http://easynews.dl.sourceforge.net/sourceforge/weka/ExplorerGuide-3.4.pdf

Note that the Weka webpage offers many documents introducing different aspects of Weka. For example, another useful document to read is the following, which describes Weka's input format, ARFF (Attribute-Relation File Format):

http://www.cs.waikato.ac.nz/~ml/weka/arff.html
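To make the ARFF layout concrete, here is a minimal sketch of a file and a reader for it. The relation `toy`, its attribute names, and its values are made up for illustration and are not a data set shipped with Weka; the `parse_arff` helper is hypothetical and handles only the basic case (no quoted values, sparse rows, or missing-value markers).

```python
# A hypothetical toy ARFF file: a header section (@relation, @attribute lines)
# followed by a @data section with one comma-separated instance per line.
TOY_ARFF = """\
% Lines starting with a percent sign are comments.
@relation toy
@attribute length numeric
@attribute width numeric
@attribute class {small,large}
@data
1.0,0.5,small
1.2,0.7,small
3.1,2.0,large
2.9,2.2,large
"""


def parse_arff(text):
    """Return (attributes, rows) for a simple ARFF string.

    attributes is a list of (name, type) pairs; rows is a list of
    string lists, one per instance. Keywords are matched
    case-insensitively.
    """
    attributes = []
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        lower = line.lower()
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
        elif lower.startswith("@attribute"):
            _, name, kind = line.split(None, 2)
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return attributes, rows


attributes, rows = parse_arff(TOY_ARFF)
```

Here the last attribute is nominal, with its allowed values listed in braces; in many data sets that attribute serves as the class label.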

Please use Weka to explore the “iris” data set that comes with the software. To open this data set, choose “Explorer” from the Weka GUI Chooser, which opens a panel with several tabs. Select the “Preprocess” tab and click “Open file”, then open the “data” folder and choose the “iris.arff” file.

With the help of the Weka software, answer the following questions:

1. How many classes are there in this data set? (2pts)

2. How many (non-class) attributes are there? (2pts)

3. What are the mean and standard deviation of each attribute? (2pts)

4. If you were to choose only one attribute to build your classifier, which attribute should you choose? (4pts)

5. Which pair of attributes provides the best discrimination among classes? (4pts) (Suggestion: use the visualization tool to look for good separations among classes.)

Note that questions 4 and 5 are subjective; please give reasons for your answers. Reasons can be described in words and/or shown through figures.
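The kind of summary asked for in questions 1 to 3 can also be computed by hand as a cross-check on what Weka displays. The sketch below uses a made-up four-row data set (not the iris data) and hypothetical helper names. It uses the sample standard deviation (dividing by n - 1), which is an assumption; whether Weka reports the sample or the population value is worth checking against its documentation.

```python
from statistics import mean, stdev

# Hypothetical toy data: two numeric attributes followed by a class label.
ATTRIBUTES = ["length", "width"]
ROWS = [
    (1.0, 0.5, "small"),
    (1.2, 0.7, "small"),
    (3.1, 2.0, "large"),
    (2.9, 2.2, "large"),
]


def summarize(rows, attribute_names):
    """Map each attribute name to its (mean, sample standard deviation)."""
    summary = {}
    for index, name in enumerate(attribute_names):
        column = [row[index] for row in rows]
        summary[name] = (mean(column), stdev(column))
    return summary


def class_labels(rows):
    """Return the distinct class labels, taken from the last column."""
    return sorted({row[-1] for row in rows})


summary = summarize(ROWS, ATTRIBUTES)
labels = class_labels(ROWS)

for name, (mu, sigma) in summary.items():
    print(f"{name}: mean={mu:.3f}, std={sigma:.3f}")
print("classes:", labels)
```

Computing the same statistics separately for each class is one possible way to support an answer to question 4, since an attribute whose per-class means are far apart relative to its spread tends to separate the classes well.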


Part II. (14pts)

Below is a set of 2-d data points, with black dots representing the positive class and red dots representing the negative class. The blue line segments show the Voronoi diagram of these points.

a. What is the training error of 1-nearest neighbor? (2pts)

b. What is the training error of 3-nearest neighbor? (4pts)

c. Please mark the 1-nearest neighbor decision boundary for this data set, which should be a subset of the blue line segments. (4pts)

d. Now consider 3-nearest neighbor. True or false: the decision boundary of 3-NN is also formed by a subset of these blue line segments, but a different subset from the answer to (c). Explain your answer. (4pts)
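For readers who want to check their reasoning about k-NN training error numerically, here is a minimal sketch. The points are made up and are not the ones in the assignment's figure; the function names are hypothetical. The sketch assumes Euclidean distance, a majority vote, and that each training point counts as one of its own neighbors, which is the usual convention when computing training error.

```python
from collections import Counter
from math import dist

# Hypothetical labeled 2-d points: ((x, y), label).
POINTS = [
    ((0.0, 0.0), "+"),
    ((1.0, 0.0), "+"),
    ((0.0, 1.0), "+"),
    ((0.4, 0.4), "-"),  # a negative point surrounded by positives
    ((5.0, 5.0), "-"),
    ((6.0, 5.0), "-"),
]


def knn_predict(train, query, k):
    """Predict the label of query by majority vote among its k nearest points."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def training_error(train, k):
    """Fraction of training points misclassified when each is its own neighbor."""
    mistakes = sum(
        1 for point, label in train if knn_predict(train, point, k) != label
    )
    return mistakes / len(train)


error_1nn = training_error(POINTS, 1)
error_3nn = training_error(POINTS, 3)
print(error_1nn, error_3nn)
```

With distinct training points, each point's nearest neighbor is itself, so 1-NN reproduces every training label. With k = 3, an isolated point of one class can be outvoted by two neighbors of the other class, as happens to the point at (0.4, 0.4) here.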
