Maximum Likelihood - Computer System Modeling Fundamentals - Lecture Slides, Slides of Java Programming

In the Java programming class we learn the basic concepts of programming. Here are some of the major points discussed in these lecture slides, which I have shared with you: Maximum Likelihood, Parameter Estimation, Graphical Models, Hypothesis Testing, Maximum a Posteriori, Argmax, Independent Observations, Exponential Random Variable, Unknown Parameter, Log Likelihood For Computation

Typology: Slides

2012/2013

Uploaded on 04/23/2013

saritae

Today…

  • More about parameter estimation
    • Using maximum likelihood
    • Using MAP
  • Next time: graphical models

Hypothesis Testing

  • The maximum likelihood (ML) hypothesis is the hypothesis that makes the data most likely

$$H_{ML} = \arg\max_{H_i} P(D \mid H_i)$$
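As an illustration of picking the ML hypothesis, the sketch below compares three candidate hypotheses about a coin. The hypotheses, their names, and the data (7 heads in 10 flips) are assumptions made up for this example; they do not come from the slides.

```python
from math import comb

def likelihood(p_heads, heads, flips):
    """P(D | H): binomial probability of `heads` heads in `flips` flips."""
    return comb(flips, heads) * p_heads**heads * (1 - p_heads)**(flips - heads)

# Hypothetical hypotheses H_i: candidate values for the coin's heads probability.
hypotheses = {"fair": 0.5, "biased_heads": 0.8, "biased_tails": 0.2}

# Hypothetical data D: 7 heads observed in 10 flips.
heads, flips = 7, 10

# Evaluate P(D | H_i) for every hypothesis and keep the one with the largest value.
likelihoods = {name: likelihood(p, heads, flips) for name, p in hypotheses.items()}
h_ml = max(likelihoods, key=likelihoods.get)

print(h_ml)  # biased_heads
```

Note that the ML hypothesis only compares how well each hypothesis explains the data; it does not take into account how plausible each hypothesis was beforehand.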

ML Parameter Estimation

  • The maximum likelihood (ML) estimate is the parameter value that makes the data most likely

$$\hat{\theta}_{ML} = \arg\max_{\theta} P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n; \theta)$$

  • If $X_1, \dots, X_n$ are independent observations, then

$$\hat{\theta}_{ML} = \arg\max_{\theta} \prod_{i=1}^{n} P(X_i = x_i; \theta) = \arg\max_{\theta} \sum_{i=1}^{n} \log P(X_i = x_i; \theta)$$
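For discrete observations, the argmax can be approximated by evaluating the summed log probabilities on a grid of candidate parameter values. The sketch below does this for a Bernoulli model; the model, the data, and the grid resolution are assumptions chosen for illustration only.

```python
import math

def log_likelihood(theta, xs):
    """Sum over i of log P(X_i = x_i; theta) for Bernoulli(theta) observations."""
    return sum(math.log(theta if x == 1 else 1.0 - theta) for x in xs)

# Hypothetical observations: 1 = success, 0 = failure.
xs = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]

# Candidate theta values strictly inside (0, 1) so every log is defined.
grid = [k / 1000 for k in range(1, 1000)]
theta_ml = max(grid, key=lambda t: log_likelihood(t, xs))

# For a Bernoulli model the ML estimate is known to be the sample mean.
closed_form = sum(xs) / len(xs)

print(theta_ml, closed_form)  # 0.7 0.7
```

A grid search is rarely the most efficient approach, but it makes the "try every θ and keep the best" reading of argmax explicit.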

ML Parameter Estimation

  • The maximum likelihood (ML) estimate is the parameter value that makes the data most likely

$$\hat{\theta}_{ML} = \arg\max_{\theta} f_{X_1, \dots, X_n}(x_1, x_2, \dots, x_n; \theta)$$

  • If $X_1, \dots, X_n$ are independent observations, then

$$\hat{\theta}_{ML} = \arg\max_{\theta} \prod_{i=1}^{n} f_{X_i}(x_i; \theta) = \arg\max_{\theta} \sum_{i=1}^{n} \log f_{X_i}(x_i; \theta)$$
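As a worked instance of the density version, consider independent observations from an exponential random variable with unknown rate λ. The preview text does not show the slides' own example, so the derivation below is a standard textbook sketch under that assumed model.

```latex
% Assumed model (not taken from the slides): X_i ~ Exponential(\lambda), x_i >= 0
f_{X_i}(x_i; \lambda) = \lambda e^{-\lambda x_i}

% Log likelihood of n independent observations
\ell(\lambda) = \sum_{i=1}^{n} \log\left(\lambda e^{-\lambda x_i}\right)
             = n \log \lambda - \lambda \sum_{i=1}^{n} x_i

% Set the derivative to zero
\frac{d\ell}{d\lambda} = \frac{n}{\lambda} - \sum_{i=1}^{n} x_i = 0
\quad\Longrightarrow\quad
\hat{\lambda}_{ML} = \frac{n}{\sum_{i=1}^{n} x_i} = \frac{1}{\bar{x}}

% The second derivative is negative, so this stationary point is a maximum
\frac{d^2\ell}{d\lambda^2} = -\frac{n}{\lambda^2} < 0
```

Taking logs first is what makes the derivative easy: the product of densities becomes a sum with only two terms.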

Log Likelihood for Computation

  • The log likelihood has a computational benefit too…
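The slide does not spell out the benefit in this preview. One commonly cited benefit is numerical: a product of many probabilities can underflow to zero in floating point, while the sum of their logs stays finite. The sketch below demonstrates this with made-up values (2000 observations, each with probability 0.5).

```python
import math

# Hypothetical setup: 2000 independent observations, each assigned
# probability 0.5 by the model being evaluated.
probs = [0.5] * 2000

# Multiplying the probabilities directly underflows in double precision.
product = 1.0
for p in probs:
    product *= p

# Summing the logs gives a moderate, finite number instead.
log_likelihood = sum(math.log(p) for p in probs)

print(product)         # 0.0
print(log_likelihood)  # roughly -1386.29
```

Because log is monotonically increasing, the θ that maximizes the log likelihood is the same θ that maximizes the likelihood, so nothing is lost by working in log space.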

The Good and the Bad of ML

  • Maximum likelihood is consistent – as the number of observations gets large, the maximum likelihood estimate gets closer and closer to the true parameter value
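Consistency can be observed in a small simulation. The sketch below assumes an exponential model with rate λ, for which the ML estimate is the standard result n / Σ x_i; the true rate, seed, and sample sizes are arbitrary choices for illustration.

```python
import random

def exponential_mle(samples):
    """ML estimate of the rate of an exponential distribution: n / sum(x_i)."""
    return len(samples) / sum(samples)

rng = random.Random(0)  # fixed seed so the run is repeatable
true_rate = 2.0         # hypothetical "true" parameter value

# Draw increasingly large samples and compare the estimate to the true rate.
for n in (10, 100, 10_000, 1_000_000):
    samples = [rng.expovariate(true_rate) for _ in range(n)]
    estimate = exponential_mle(samples)
    print(n, estimate, abs(estimate - true_rate))
```

The error tends to shrink as n grows, although in any single run it need not decrease at every step.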

The Bayesian Point of View

  • Instead of treating parameters as fixed but unknown values θ, Bayesians treat them as random variables Θ
  • Can then define the notions of prior and posterior…
    • Prior: $P(\Theta = \theta)$, or the density $f_{\Theta}(\theta)$
    • Posterior: $P(\Theta = \theta \mid X_1 = x_1, \dots, X_n = x_n)$, or the density $f_{\Theta \mid X_1, \dots, X_n}(\theta \mid x_1, \dots, x_n)$
  • As before, priors may be subjective or estimated from data
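To make prior and posterior concrete, the sketch below applies Bayes' rule to a parameter that can take only three values. The Bernoulli model, the prior weights, the data, and the assumption that the observations are independent given Θ are all choices made for this example.

```python
def bernoulli(x, theta):
    """P(X = x | Theta = theta) for a single 0/1 observation."""
    return theta if x == 1 else 1.0 - theta

def posterior(prior, likelihood_fn, xs):
    """Return P(Theta = theta | data) for each theta in a discrete prior.

    Assumes the observations are independent given Theta.
    """
    unnormalized = {}
    for theta, p_theta in prior.items():
        like = 1.0
        for x in xs:
            like *= likelihood_fn(x, theta)
        unnormalized[theta] = p_theta * like  # prior times likelihood
    evidence = sum(unnormalized.values())     # P(data), the normalizer
    return {theta: u / evidence for theta, u in unnormalized.items()}

# Hypothetical prior P(Theta = theta) over three candidate values.
prior = {0.2: 0.25, 0.5: 0.5, 0.8: 0.25}

# Hypothetical observations.
xs = [1, 1, 0, 1]

post = posterior(prior, bernoulli, xs)
for theta, p in post.items():
    print(theta, round(p, 4))
```

After three successes in four trials, probability mass moves away from θ = 0.2 and toward θ = 0.8, while θ = 0.5 stays the most probable value because of its larger prior weight.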

MAP Parameter Estimation

  • The maximum a posteriori (MAP) estimate is the most likely parameter value given the data

$$\hat{\theta}_{MAP} = \arg\max_{\theta} P(\Theta = \theta \mid X_1 = x_1, \dots, X_n = x_n)$$

$$= \arg\max_{\theta} P(X_1 = x_1, \dots, X_n = x_n \mid \Theta = \theta) \, P(\Theta = \theta)$$

  • If $X_1, \dots, X_n$ are independent given Θ, then

$$\hat{\theta}_{MAP} = \arg\max_{\theta} P(\Theta = \theta) \prod_{i=1}^{n} P(X_i = x_i \mid \Theta = \theta)$$

  • Can use the same log trick here too
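With the log trick, the MAP objective becomes log P(Θ = θ) plus the sum of log P(X_i = x_i | Θ = θ). The sketch below maximizes that sum on a grid for a Bernoulli model. The prior (proportional to θ(1 − θ), which favors values near 0.5), the data, and the grid are assumptions for illustration; the prior is left unnormalized because a constant factor does not change the argmax.

```python
import math

def log_prior(theta):
    """Log of an unnormalized prior proportional to theta * (1 - theta)."""
    return math.log(theta) + math.log(1.0 - theta)

def log_likelihood(theta, xs):
    """Sum over i of log P(X_i = x_i | Theta = theta) for 0/1 observations."""
    return sum(math.log(theta if x == 1 else 1.0 - theta) for x in xs)

# Hypothetical observations: 7 successes in 10 trials.
xs = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]

# Candidate theta values strictly inside (0, 1) so every log is defined.
grid = [k / 1000 for k in range(1, 1000)]

# MAP: maximize log prior + log likelihood. ML: maximize log likelihood only.
theta_map = max(grid, key=lambda t: log_prior(t) + log_likelihood(t, xs))
theta_ml = max(grid, key=lambda t: log_likelihood(t, xs))

print(theta_map, theta_ml)  # 0.667 0.7
```

The MAP estimate is pulled from the ML estimate of 0.7 toward 0.5, where the prior is largest. With more data, the likelihood term dominates and the two estimates move closer together.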