About Supervised Learning, Assignments of Machine Learning

About supervised learning: abstract, advantages, disadvantages, and complete details of supervised learning.

Typology: Assignments

2020/2021

Uploaded on 05/27/2021

kalyan-kumar-7
Abstract
Supervised learning is the construction of algorithms that are able to produce general
patterns and hypotheses by using externally supplied instances to predict the fate of future
instances. Supervised machine learning classification algorithms aim at categorizing data
from prior information.
For example, a supervised learner is shown labelled examples of what a face looks like, in terms of
structure, colour, etc., so that after several iterations it learns to recognize a face.

Contents

  • 1. Introduction
  • 2. Supervised learning general overview
  • 3. History
  • 4. How do supervised learning algorithms work?
  • 5. Principle
  • 6. Types
  • 7. Application
  • 8. Advantages
  • 9. Disadvantages
  • 10. Conclusion
  • 11. References

2. Supervised learning – a general overview

A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples. In the optimal scenario, the algorithm correctly determines the class labels for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a “reasonable” way.

In practice, the historical data is split by random sampling into two parts, typically 70% and 30% of the records. The model is trained on the 70%. It is important that the training data is general, that is, representative of the whole problem rather than of one specific case. Once the system is trained, it provides a (statistical) model: a set of formulas that capture what has been learned from the data. This model, which acts as the trained “brain”, then has to be evaluated to check that it works. The remaining 30% of the records contain both inputs and outputs, but the model is given only the independent variables and calculates an output for each record. Comparing the model's predicted outputs with the actual values gives the accuracy, expressed as a percentage.

Figure 2.1: Technical details regarding supervised learning
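
The 70/30 procedure described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data set (hours studied versus passing an exam) is invented, and the "model" is a deliberately simple one-feature threshold rather than any particular algorithm from this assignment. The names `train_test_split`, `fit_threshold`, and `accuracy` are hypothetical helpers written for this sketch.

```python
import random


def train_test_split(records, train_fraction=0.7, seed=0):
    """Shuffle the records and split them into a training and a test part."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def fit_threshold(train):
    """Train a one-feature model: the midpoint between the two class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    return (sum(xs0) / len(xs0) + sum(xs1) / len(xs1)) / 2.0


def predict(threshold, x):
    """The model sees only the independent variable x."""
    return 1 if x > threshold else 0


def accuracy(threshold, test):
    """Compare predicted outputs with the actual values."""
    correct = sum(1 for x, y in test if predict(threshold, x) == y)
    return correct / len(test)


# Invented historical data: (hours studied, passed the exam?) pairs.
rng = random.Random(42)
records = [(rng.gauss(2.0, 1.0), 0) for _ in range(100)]
records += [(rng.gauss(6.0, 1.0), 1) for _ in range(100)]

train, test = train_test_split(records)       # 70% / 30%
threshold = fit_threshold(train)              # training
test_accuracy = accuracy(threshold, test)     # evaluation on unseen records

print(f"train size: {len(train)}, test size: {len(test)}")
print(f"test accuracy: {test_accuracy:.0%}")
```

The test records are never used during training, so the reported accuracy estimates how well the model generalizes to unseen instances.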

4. How do supervised learning algorithms work?

Ordinary programming algorithms tell the computer what to do in a straightforward way. For example, sorting algorithms turn unordered data into data ordered by some criterion, often the numeric or alphabetical order of one or more fields in the data.

Linear regression algorithms fit a straight line, or another function that is linear in its parameters such as a polynomial, to numeric data, typically by performing matrix inversions to minimize the squared error between the line and the data. Squared error is used as the metric because you don’t care whether the regression line is above or below the data points; you only care about the distance between the line and the points.

Nonlinear regression algorithms, which fit curves that are not linear in their parameters to data, are a little more complicated because, unlike linear regression problems, they generally cannot be solved directly in closed form. Instead, nonlinear regression algorithms implement some kind of iterative minimization process, often a variation on the method of steepest descent. Steepest descent computes the squared error and its gradient at the current parameter values, picks a step size (also known as the learning rate), moves the parameters in the direction opposite to the gradient (“down the hill”), and then recomputes the squared error and its gradient at the new parameter values. Eventually, with luck, the process converges. The variants of steepest descent try to improve its convergence properties.
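
As a rough illustration of the two approaches, the sketch below fits a straight line to a small invented data set twice: once with the direct least-squares formula, and once with steepest descent on the mean squared error. A linear problem is used on purpose so that the iterative result can be checked against the direct one; the step size and the number of steps are assumptions chosen for this data, not recommended values.

```python
def fit_line_closed_form(xs, ys):
    """Least-squares line y = a*x + b, computed directly."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def fit_line_steepest_descent(xs, ys, learning_rate=0.01, steps=5000):
    """Minimize the mean squared error of y = a*x + b iteratively."""
    n = len(xs)
    a, b = 0.0, 0.0
    for _ in range(steps):
        # Gradient of the mean squared error at the current (a, b).
        grad_a = (2.0 / n) * sum((a * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((a * x + b - y) for x, y in zip(xs, ys))
        # Step against the gradient, i.e. "down the hill".
        a -= learning_rate * grad_a
        b -= learning_rate * grad_b
    return a, b


# Invented data that roughly follows y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 10.9]

a_direct, b_direct = fit_line_closed_form(xs, ys)
a_iter, b_iter = fit_line_steepest_descent(xs, ys)

print(f"direct:           a = {a_direct:.4f}, b = {b_direct:.4f}")
print(f"steepest descent: a = {a_iter:.4f}, b = {b_iter:.4f}")
```

With a learning rate that is too large the iteration diverges instead of converging, which is why the text says the process converges "with luck".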

5. Principle

Machine learning algorithms are described as learning a target function (f) that best maps input variables (X) to an output variable (Y):

    Y = f(X)

This is a general learning task in which we would like to make predictions (Y) for new examples of the input variables (X). We don’t know what the function (f) looks like or what its form is. If we did, we would use it directly and would not need to learn it from data using machine learning algorithms. The task is harder than it may seem, because there is also an error term (e) that is independent of the input data (X):

    Y = f(X) + e

This error might arise, for example, from not having enough attributes to sufficiently characterize the best mapping from X to Y. It is called irreducible error because no matter how good we get at estimating the target function (f), we cannot reduce it. This is to say that learning a function from data is a difficult problem, and this is the reason why the field of machine learning and machine learning algorithms exists.
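
A small simulation can make the idea of irreducible error concrete. In the sketch below the target function `f` and the noise level `noise_std` are invented for illustration; in a real problem neither is known. Even when predictions are made with the true `f`, the mean squared error does not drop to zero: it stays close to the variance of the error term e.

```python
import random


def f(x):
    """Hypothetical target function (unknown in a real problem)."""
    return 3.0 * x + 2.0


def mean_squared_error(predict, data):
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)


rng = random.Random(0)
noise_std = 0.5  # assumed standard deviation of the error term e

data = []
for _ in range(20000):
    x = rng.uniform(0.0, 10.0)
    y = f(x) + rng.gauss(0.0, noise_std)  # Y = f(X) + e
    data.append((x, y))

# Predicting with the true f leaves only the irreducible error.
mse_true_f = mean_squared_error(f, data)

# A worse estimate of f adds reducible error on top of it.
mse_rough = mean_squared_error(lambda x: 3.0 * x, data)

print(f"variance of e:          {noise_std ** 2:.3f}")
print(f"MSE using the true f:   {mse_true_f:.3f}")
print(f"MSE using a rough f:    {mse_rough:.3f}")
```

The gap between the two errors is what better learning can remove; the remainder is the floor that no estimate of f can go below.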

c) Naive Bayesian Model:

The naive Bayesian model of classification is used for large finite datasets. It assigns class labels using a directed acyclic graph that comprises one parent node (the class) and multiple child nodes (the features). Each child node is assumed to be independent of the other child nodes, given the parent.

Decision Trees: A decision tree is a flowchart-like model containing conditional control statements, comprising decisions and their probable consequences. The output relates to the labelling of unseen data.
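
The naive Bayesian model described above can be sketched for categorical features as follows. This is a minimal illustration, not a reference implementation: the weather-style data set is invented, the function names are hypothetical, and add-one (Laplace) smoothing is an assumption added here so that unseen feature values do not produce zero probabilities.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (feature index, label) -> value counts
    feature_values = defaultdict(set)     # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            feature_values[i].add(value)
    return class_counts, value_counts, feature_values


def predict_naive_bayes(model, row):
    """Pick the class with the highest log prior plus summed log likelihoods."""
    class_counts, value_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # prior of the parent node (class)
        for i, value in enumerate(row):
            # Each feature contributes independently, given the class.
            numerator = value_counts[(i, label)][value] + 1
            denominator = count + len(feature_values[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training data: (outlook, temperature) -> play outside?
rows = [
    ("sunny", "hot"), ("sunny", "mild"),
    ("rainy", "mild"), ("rainy", "cool"),
    ("overcast", "hot"), ("overcast", "cool"),
]
labels = ["no", "no", "yes", "yes", "yes", "yes"]

model = train_naive_bayes(rows, labels)
print(predict_naive_bayes(model, ("sunny", "hot")))
print(predict_naive_bayes(model, ("rainy", "cool")))
```

Summing logarithms instead of multiplying probabilities avoids numerical underflow when there are many features.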

d) Random Forest Model

The random forest model is an ensemble method. It operates by constructing a multitude of decision trees and outputs the class chosen by the majority of the individual trees. Suppose you want to predict which undergraduate students will perform well in the GMAT, a test taken for admission into graduate management programs. A random forest model would accomplish the task, given the demographic and educational factors of a set of students who have previously taken the test.
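
The ensemble idea can be sketched as follows. To keep the example short, each "tree" is simplified to a decision stump (a single threshold on a single feature), trained on a bootstrap sample with a randomly chosen feature; a real random forest grows full decision trees. The student records (grade point average, practice test score) and all function names are invented for illustration.

```python
import random
from collections import Counter


def fit_stump(sample, feature):
    """Find the threshold on one feature that misclassifies the fewest rows."""
    best = None
    for x, _ in sample:
        threshold = x[feature]
        for above_label in (0, 1):
            errors = sum(
                1 for row, y in sample
                if (above_label if row[feature] > threshold else 1 - above_label) != y
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, above_label)
    _, threshold, above_label = best
    return feature, threshold, above_label


def stump_predict(stump, row):
    feature, threshold, above_label = stump
    return above_label if row[feature] > threshold else 1 - above_label


def fit_forest(data, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    n_features = len(data[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(data) for _ in data]   # sample with replacement
        feature = rng.randrange(n_features)         # random feature choice
        forest.append(fit_stump(sample, feature))
    return forest


def forest_predict(forest, row):
    """Majority vote over the individual predictions."""
    votes = Counter(stump_predict(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Invented records: ((GPA, practice score), performed well? 1 = yes, 0 = no)
data = [
    ((2.1, 480), 0), ((2.5, 510), 0), ((2.8, 530), 0), ((3.0, 560), 0),
    ((3.4, 650), 1), ((3.6, 690), 1), ((3.8, 700), 1), ((3.9, 740), 1),
]

forest = fit_forest(data)
strong_student = forest_predict(forest, (3.7, 720))
weak_student = forest_predict(forest, (2.3, 500))
print(strong_student, weak_student)
```

Because every stump sees a slightly different sample, individual stumps can be wrong on a given student, while the majority vote tends to be more stable.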

e) Neural Networks

Neural networks are designed to classify raw input, recognize patterns, or interpret sensory data. Despite their multiple advantages, neural networks require significant computational resources, and fitting one can get complicated when there are thousands of observations. A neural network is also called a ‘black-box’ algorithm, as interpreting the logic behind its predictions can be challenging.

f) Support Vector Machines

The Support Vector Machine (SVM) is a supervised learning algorithm developed in the 1990s. It draws on the statistical learning theory developed by Vapnik. An SVM separates the classes with a hyperplane, which makes it a discriminative classifier. The output is an optimal hyperplane that categorizes new examples. SVMs are closely connected to the kernel framework and are used in diverse fields, including bioinformatics, pattern recognition, and multimedia information retrieval.
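
The notion of a separating hyperplane can be illustrated with a small linear SVM. The sketch below is only one possible way to train it: it uses sub-gradient descent on the regularized hinge loss, with no kernel, and the learning rate, regularization strength, and data points are assumptions chosen for this example. Labels are +1 and -1.

```python
def train_linear_svm(points, labels, learning_rate=0.1, reg=0.01, epochs=100):
    """Sub-gradient descent on the regularized hinge loss (labels are +1/-1)."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Regularization shrinks w a little on every step.
            w = [wi - learning_rate * reg * wi for wi in w]
            if margin < 1:
                # The point is misclassified or inside the margin:
                # move the hyperplane away from it.
                w = [wi + learning_rate * y * xi for wi, xi in zip(w, x)]
                b += learning_rate * y
    return w, b


def svm_predict(w, b, x):
    """Classify by the side of the hyperplane w.x + b = 0 that x falls on."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Invented, linearly separable two-dimensional data.
points = [
    (2.0, 2.0), (3.0, 3.0), (3.0, 2.0), (2.0, 3.0),
    (-2.0, -2.0), (-3.0, -3.0), (-3.0, -2.0), (-2.0, -3.0),
]
labels = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(points, labels)
print("w =", w, "b =", b)
print(svm_predict(w, b, (2.5, 2.5)), svm_predict(w, b, (-2.5, -2.5)))
```

Only points with a margin below 1 change the hyperplane, which mirrors the fact that an SVM solution is determined by the examples closest to the boundary.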

8. Advantages

● You will have an exact idea about the classes in the training data.
● Supervised learning is a simple process to understand. In the case of unsupervised learning, we don’t easily understand what is happening inside the machine, how it is learning, etc.
● You can find out exactly how many classes there are before giving the data for training.
● It is possible to be very specific about the definition of the classes, that is, you can train the classifier so that it has a well-defined decision boundary that distinguishes the different classes accurately.
● After the entire training is completed, you don’t necessarily need to keep the training data in memory. Instead, you can keep the decision boundary as a mathematical formula.
● Supervised learning can be very helpful in classification problems.
● Another typical task of supervised machine learning is regression, that is, predicting a numerical target value from some given data and labels.

9. Disadvantages

Supervised learning has a number of challenges and disadvantages that you could face while working with these algorithms:
● You could easily overfit your model to the training data.
● Good examples need to be used to train the model.
● Training can require a lot of computation time.
● Unwanted data could reduce the accuracy.
● Pre-processing of data is always a challenge.
● If the dataset is incorrect, the algorithm learns incorrectly, which can bring losses.

10. Conclusion

Supervised learning is the simplest subcategory of machine learning and serves as an introduction to the field for many machine learning practitioners. It is also the most commonly used form of machine learning and has proven to be an excellent tool in many fields. Supervised learning uses labelled data, together with regression and classification techniques, to train a machine or an application and to develop predictive models that have applications across many domains and industries. We all use such models even in our daily lives.

Supervised learning requires experienced data scientists to build, scale, and update the models. If the algorithms go wrong, the results will be inaccurate. Therefore, the selection of relevant data is crucial for supervised learning to work efficiently. Selecting the right and relevant data is always vital for a training set, and the real-life applications of supervised learning are tremendous.

THANK YOU