Bayesian Inference and Machine Learning: Computing Expectations with Unknowns - Prof. Haro, Study notes of Computer Science

University of Utah (The U), Computer Science

Prof. Harold Daume

These notes address the problem of computing expectations of functions with respect to a probabilistic model with unknowns, in the context of Bayesian inference. They cover several methods for approximating the resulting integrals: summation, uniform sampling, importance sampling, and rejection sampling. The notes focus on the case where the variable of interest, θ, is univariate and bounded, and also touch on the difficulty of extending these methods to high-dimensional and non-discrete variables.

Typology: Study notes

Pre 2010

Uploaded on 08/31/2009



Machine Learning (CS 5350/CS 6350) 03 Apr 2007

Bayesian inference

The general problem we face in Bayesian inference is to compute the expectation of some function with respect to a probabilistic model with unknowns.

In the simplest case, we want the expectation of a single variable. In more complex cases, we want an expectation of a complex function of all variables.

Suppose p(θ) is our distribution of interest (some θ will be known, some not). We want:

$$Z = \mathbb{E}_{\theta \sim p}\big[ f(\theta) \big] = \int d\theta \; p(\theta) \, f(\theta)$$

For some cases this integral will be available in closed form (e.g., HMMs). For many (most?) cases, however, it will not.

Let's say that θ is discrete and univariate. Then we can compute the expectation by just summing over all possible values. Obviously, though, this won’t scale well to high-dimensional or non-discrete variables. But let’s see what happens if we try...
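For the discrete, univariate case just described, the expectation is a finite sum and can be written out directly. The sketch below is only an illustration: the probability table `p` and the function `f` are made-up examples, not taken from the notes.

```python
# Hypothetical discrete distribution over theta in {0, 1, 2, 3} (illustrative only).
p = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}


def f(theta):
    # Hypothetical function of interest.
    return theta ** 2


def discrete_expectation(p, f):
    # Z = sum over all values theta of p(theta) * f(theta)
    return sum(prob * f(theta) for theta, prob in p.items())


Z = discrete_expectation(p, f)
print(Z)
```

The cost is one term per possible value of θ, which is why this approach stops being practical once θ has many dimensions.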

Integration by Summation

Suppose θ is univariate, bounded continuous. Wlog, θ ∈ [0, 1]. If we remember how we first learned integration, we can break [0, 1] into R equally-sized rectangles. Then, we have:

$$Z \approx \sum_{i=0}^{R-1} \frac{1}{R} \, p(i/R) \, f(i/R)$$

As R → ∞, this approximation to Z becomes increasingly accurate.

One way of thinking about this is that we have a set S containing R-many equally spaced points, and the integral is approximated by:

$$Z \approx \frac{1}{|S|} \sum_{\theta \in S} p(\theta) \, f(\theta)$$

Unfortunately, if θ is D-dimensional, then we need to sum over R^D values of θ.
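A minimal sketch of integration by summation, under assumed inputs: the density p(θ) = 2θ on [0, 1] and the function f(θ) = θ² are hypothetical choices made here so that the exact answer, Z = 0.5, is known. The helper `num_grid_points` is likewise an illustrative name for the R^D count.

```python
def p(theta):
    # Hypothetical density on [0, 1] (integrates to 1); not from the notes.
    return 2.0 * theta


def f(theta):
    # Hypothetical function of interest; the exact Z is 0.5 for this p and f.
    return theta ** 2


def integrate_by_summation(p, f, R):
    # Sum over R equally spaced points i/R, each weighted by the width 1/R.
    return sum(p(i / R) * f(i / R) for i in range(R)) / R


def num_grid_points(R, D):
    # Number of grid points needed with R points per dimension in D dimensions.
    return R ** D


estimates = {R: integrate_by_summation(p, f, R) for R in (10, 100, 1000)}
for R, estimate in estimates.items():
    print(R, estimate)
print(num_grid_points(10, 6))
```

With these assumed functions the estimate approaches 0.5 as R grows, while the grid size for a fixed R grows exponentially in D.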

Uniform Sampling

Instead of spacing θ ∈ S evenly, let’s space them randomly. This is the idea of “Monte Carlo” integration, which essentially means “randomized” integration. Uniform sampling is the simplest case. Let S be a random sampling of θs. Then, we still have:

$$Z \approx \frac{1}{|S|} \sum_{\theta \in S} p(\theta) \, f(\theta)$$


This scales better computationally, but the number of samples required to guarantee a close approximation is still huge.

It’s worth thinking about how hard this problem is. Think of a boat on a lake. We want to estimate the volume of the lake, but cannot see the bottom. We can drive the boat to any position in the lake and drop an anchor, thereby measuring the depth there. How can we approximate the volume? Uniform sampling says to drive randomly around the lake, dropping the anchor at the flip of a coin. But there are many cases in which we can do better.
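Uniform sampling on [0, 1] can be sketched as follows. As an assumption for illustration, the density p(θ) = 2θ and the function f(θ) = θ² are hypothetical choices with known exact answer Z = 0.5; the seed and sample count are arbitrary.

```python
import random


def p(theta):
    # Hypothetical density on [0, 1]; not from the notes.
    return 2.0 * theta


def f(theta):
    # Hypothetical function of interest; the exact Z is 0.5 for this p and f.
    return theta ** 2


def uniform_sampling_estimate(p, f, num_samples, rng):
    # Draw the set S uniformly at random from [0, 1), then average p * f over S.
    samples = [rng.random() for _ in range(num_samples)]
    return sum(p(theta) * f(theta) for theta in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
estimate = uniform_sampling_estimate(p, f, 100_000, rng)
print(estimate)
```

The estimate is random: repeated runs with different seeds scatter around the true value, and the scatter shrinks only slowly as the number of samples grows.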

Importance Sampling

Here we use prior knowledge in the form of a helper distribution q that we expect to be “similar” to p and from which we can sample. It must cover the “support” of p (i.e., q(θ) must be nonzero wherever p(θ) is nonzero). Then, we compute:

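$$\begin{aligned}
Z &= \mathbb{E}_{\theta \sim p}\big[ f(\theta) \big] \\
  &= \int d\theta \; p(\theta) \, f(\theta) \\
  &= \int d\theta \; q(\theta) \, \frac{p(\theta)}{q(\theta)} \, f(\theta) \\
  &= \mathbb{E}_{\theta \sim q}\left[ \frac{p(\theta)}{q(\theta)} \, f(\theta) \right]
\end{aligned}$$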

So instead of computing an expectation with respect to p, we compute one with respect to q, weighting each sample θ by p(θ)/q(θ).
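A minimal sketch of this weighting scheme, under assumed inputs: p(θ) = 2θ and f(θ) = θ² on [0, 1] (exact Z = 0.5), with helper density q(θ) = 3θ². All three functions, and the inverse-CDF sampler `sample_q`, are hypothetical choices for illustration and do not come from the notes.

```python
import random


def p(theta):
    # Hypothetical target density on [0, 1].
    return 2.0 * theta


def f(theta):
    # Hypothetical function of interest; the exact Z is 0.5 for this p and f.
    return theta ** 2


def q(theta):
    # Hypothetical helper density on [0, 1]; nonzero wherever p is nonzero.
    return 3.0 * theta ** 2


def sample_q(rng):
    # Inverse-CDF sampling: the CDF of q is theta**3, so theta = u**(1/3).
    # Using 1 - random() keeps u in (0, 1], which avoids dividing by q(0) = 0.
    u = 1.0 - rng.random()
    return u ** (1.0 / 3.0)


def importance_sampling_estimate(p, q, f, sample_q, num_samples, rng):
    total = 0.0
    for _ in range(num_samples):
        theta = sample_q(rng)
        weight = p(theta) / q(theta)  # importance weight for this sample
        total += weight * f(theta)
    return total / num_samples


rng = random.Random(0)  # fixed seed so the run is repeatable
is_estimate = importance_sampling_estimate(p, q, f, sample_q, 20_000, rng)
print(is_estimate)
```

For this particular pair, the weighted integrand (p/q)·f varies less under q than p·f does under uniform draws, so fewer samples are needed for the same accuracy. A poorly matched q can have the opposite effect.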

Rejection Sampling

The idea in rejection sampling is similar to importance sampling. Let q be a proposal distribution that satisfies p(x) ≤ M q(x) for all x, for some constant M < ∞. Now, draw points from q and accept each one with probability p(x)/[M q(x)]. Compute expectations only over the accepted points.
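The accept/reject step can be sketched as below. The inputs are assumptions made for illustration: target density p(θ) = 2θ on [0, 1], a uniform proposal q(θ) = 1, the bound M = 2 (so that p ≤ M q holds everywhere), and f(θ) = θ², for which the exact expectation under p is 0.5. The code writes θ where the paragraph above writes x.

```python
import random


def p(theta):
    # Hypothetical target density on [0, 1].
    return 2.0 * theta


def q(theta):
    # Hypothetical proposal density: uniform on [0, 1].
    return 1.0


def f(theta):
    # Hypothetical function of interest; E_p[f] is 0.5 for this p and f.
    return theta ** 2


M = 2.0  # assumed bound: p(theta) <= M * q(theta) for all theta in [0, 1]


def rejection_sample(p, q, sample_q, M, num_proposals, rng):
    accepted = []
    for _ in range(num_proposals):
        theta = sample_q(rng)
        # Accept the proposal with probability p(theta) / (M * q(theta)).
        if rng.random() < p(theta) / (M * q(theta)):
            accepted.append(theta)
    return accepted


rng = random.Random(0)  # fixed seed so the run is repeatable
num_proposals = 40_000
accepted = rejection_sample(p, q, lambda r: r.random(), M, num_proposals, rng)
rs_estimate = sum(f(theta) for theta in accepted) / len(accepted)
acceptance_rate = len(accepted) / num_proposals
print(rs_estimate, acceptance_rate)
```

When p and q are both normalized, the expected acceptance rate is 1/M, so a loose bound M wastes most of the proposals.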
