Lecture 1: Tutorial on probability and estimation

TTIC 31020: Introduction to Machine Learning

Instructor: Greg Shakhnarovich, Lecture by Dhruv Batra

TTI–Chicago

September 27, 2010

Welcome

TTIC 31020, Introduction to Machine Learning

MWF 9:30-10:20am

Instructor: Greg Shakhnarovich, [email protected]

TA: Feng Zhao, [email protected]

Greg is traveling this week; all administrative details will be discussed next Monday.

Plan for this week:

  • Today and Wednesday: tutorial/refresher on probability and estimation – Dhruv
  • Friday: general introduction to machine learning – Nati

Why probability?

This class is mostly about statistical methods and models in machine learning

Probability is fundamental in dealing with the uncertainty inherent in real-world problems.

Statistics leverages the laws of probability to evaluate important properties of the world from data and make intelligent predictions about the future.

Background

Things you should have seen before

  • Discrete vs. Continuous Random Variables
  • pmf vs pdf
  • Joint vs Marginal vs Conditional Distributions
  • IID: Independent Identically Distributed
  • Bayes Rule and Priors

This refresher WILL revise these topics.

Problem: estimating bias in coin toss

A single coin toss produces H or T.

A sequence of n coin tosses produces a sequence of values; for example, with n = 4:

T, H, T, H
H, H, T, T
T, T, T, H

A probabilistic model allows us to model the uncertainty inherent in the process (randomness in tossing a coin), as well as our uncertainty about the properties of the source (fairness of the coin).
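
To make the data-generating process concrete (an illustrative sketch, not part of the lecture; the helper name toss_sequence and the bias value 0.3 are assumptions), such sequences can be simulated in a few lines of Python:

    import random

    def toss_sequence(n, bias=0.3, seed=None):
        """Simulate n tosses of a coin that lands H with probability `bias`."""
        rng = random.Random(seed)
        return ["H" if rng.random() < bias else "T" for _ in range(n)]

    # Three sequences of n = 4 tosses, analogous to the examples above.
    for s in range(3):
        print(",".join(toss_sequence(4, seed=s)))

The estimation problem is then: given only such observed sequences, recover the unknown bias of the coin.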

Probabilistic model

First, for convenience, convert H → 1, T → 0.

  • We have a random variable X taking values in {0, 1}

Bernoulli distribution with parameter μ:

Pr(X = 1; μ) = μ.

For simplicity, we will write p(x) or p(x; μ) instead of Pr(X = x; μ)

The parameter μ ∈ [0, 1] specifies the bias of the coin

  • Coin is fair if μ = 1/2
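
A minimal sketch of this model in Python (numpy is assumed available; the function names below are illustrative, not from the slides):

    import numpy as np

    def bernoulli_pmf(x, mu):
        """p(x; mu) for x in {0, 1}: p(1; mu) = mu, p(0; mu) = 1 - mu."""
        return mu if x == 1 else 1.0 - mu

    def sample_bernoulli(mu, n, rng=None):
        """Draw n independent samples X in {0, 1} with Pr(X = 1) = mu."""
        rng = np.random.default_rng() if rng is None else rng
        return (rng.random(n) < mu).astype(int)

    mu = 0.5  # fair coin
    print(bernoulli_pmf(1, mu), bernoulli_pmf(0, mu))      # mu and 1 - mu
    samples = sample_bernoulli(mu, 10_000, np.random.default_rng(0))
    print(samples.mean())  # empirical frequency of 1s is close to mu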

Reminder: probability distributions

Discrete random variable X taking values in a set X = {x₁, x₂, ...}

Probability mass function p : X → [0, 1] satisfies the law of total probability:

∑_{x ∈ X} p(X = x) = 1

Hence, for the Bernoulli distribution we know

p(0) = 1 − p(1; μ) = 1 − μ.
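
As a quick illustration (not from the slides), a discrete pmf stored as a dictionary can be checked against this constraint, and the Bernoulli pmf passes it for any μ:

    def is_valid_pmf(pmf, tol=1e-12):
        """Check that a discrete pmf (value -> probability) lies in [0, 1] and sums to 1."""
        probs = pmf.values()
        return all(0.0 <= p <= 1.0 for p in probs) and abs(sum(probs) - 1.0) < tol

    mu = 0.3
    bernoulli = {1: mu, 0: 1.0 - mu}  # p(1) = mu, p(0) = 1 - mu
    print(is_valid_pmf(bernoulli))    # True for any mu in [0, 1]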

Sequence probability

Now consider two tosses of the same coin, ⟨X₁, X₂⟩

We can consider a number of probability distributions:

  • Joint distribution p(X₁, X₂)
  • Conditional distributions p(X₁ | X₂), p(X₂ | X₁)
  • Marginal distributions p(X₁), p(X₂)
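
A small numerical sketch of how these three objects relate (assuming, for illustration only, that the two tosses are independent Bernoulli(μ) draws; the variable names are not from the slides):

    import numpy as np

    mu = 0.3
    p = np.array([1.0 - mu, mu])             # p[x] = Pr(X = x) for x in {0, 1}

    # Joint distribution of two independent tosses: joint[x1, x2] = p(x1) * p(x2)
    joint = np.outer(p, p)

    marginal_x1 = joint.sum(axis=1)          # p(X1): sum the joint over X2
    marginal_x2 = joint.sum(axis=0)          # p(X2): sum the joint over X1
    cond_x1_given_x2 = joint / marginal_x2   # column x2 holds p(X1 | X2 = x2)

    print(joint.sum())        # 1.0: the joint is itself a valid distribution
    print(marginal_x1)        # [0.7, 0.3], i.e. the original p
    print(cond_x1_given_x2)   # every column sums to 1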