Probability Final Exam by Prof. Shou-De Lin, Exams of Probability and Statistics

A final exam in probability by Prof. Shou-De Lin, including problems on probability density functions, dice throws, company bids, random variables, the correlation coefficient, hypothesis testing, the chi-square test, mutual information, Bayesian networks and association rules, TF-IDF, social network analysis, and n-gram language models.

Typology: Exams

2012/2013

Uploaded on 02/21/2013

rahas
Probability Final Exam

Instructor: Prof. Shou-De Lin
14:30 ∼ 17:30, Wed., June 19th, 2008

  • Total score: 120 points
  • Final exam grades and final grades will be e-mailed in early July. You have three days to make petitions.

Problems

  1. Let $X$ have the following probability density function:

$$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}.$$

What is the probability density function of $Y = e^X$? (5 points)

Ans. $P(Y \le y) = P(e^X \le y) = P(X \le \ln y)$, and therefore

$$f_Y(y) = \frac{dP(Y \le y)}{dy} = \frac{dP(X \le \ln y)}{d\ln y} \cdot \frac{d\ln y}{dy} = \frac{1}{y\sqrt{2\pi}\,\sigma}\, e^{-\frac{(\ln y-\mu)^2}{2\sigma^2}}, \quad 0 < y.$$

  2. Person A throws an unbiased die $n$ times and B throws the same die $n + 1$ times. We care about how many ‘6’s they throw. If you are told that

$$P(\text{B has more ‘6’s than A}) = \frac{5}{12},$$

then what is the probability that A and B have equally many ‘6’s after throwing the die $n$ times? (7 points)

Hint: condition on which player has more ‘6’s after each has thrown $n$ times.

Ans. Let A and B respectively throw $A_n$ and $B_n$ ‘6’s in their first $n$ throws. By symmetry, $P(B_n > A_n) = P(A_n > B_n) = \frac{1 - P(B_n = A_n)}{2}$. The probability that B has more ‘6’s than A is

$$\frac{5}{12} = P(B_n > A_n) + P(B_n = A_n)\,P(\text{B throws a ‘6’ on the } (n+1)\text{th throw}) = \frac{1 - P(B_n = A_n)}{2} + \frac{P(B_n = A_n)}{6},$$

so $P(A_n = B_n) = 1/4$.
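The last step of Problem 2 is a linear equation in $p = P(A_n = B_n)$. A minimal sketch of solving it exactly, assuming only the relation $(1-p)/2 + p/6 = 5/12$ from the answer above (the function name `equal_sixes_probability` is made up for this illustration):

```python
from fractions import Fraction


def equal_sixes_probability(p_b_more: Fraction,
                            p_six: Fraction = Fraction(1, 6)) -> Fraction:
    """Solve (1 - p)/2 + p * p_six = p_b_more for p = P(A_n = B_n)."""
    half = Fraction(1, 2)
    # Rearranged: p * (p_six - 1/2) = p_b_more - 1/2
    return (p_b_more - half) / (p_six - half)


p_equal = equal_sixes_probability(Fraction(5, 12))
print(p_equal)  # 1/4
```

Using `Fraction` keeps the arithmetic exact, so the result can be compared directly with the stated answer.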

  3. Your company must make a sealed bid for a construction project. Your company will win if your bid is lower than the bids of the other companies. If you win the bid, then you plan to pay another firm 100 thousand dollars to do the work. You are competing with two other companies, and you believe their bids are two independent random variables uniformly distributed in [70, 250] and [140, 300], respectively.

(a) Suppose your bid is $x$. What is the probability that you win? (4 points)

Ans. When $x \in [0, 70)$, we will win. When $x \in [70, 140)$, we only have to beat the company whose bid is in [70, 250]. Therefore, the probability to win is

$$\frac{250 - x}{180}.$$

When $x \in [140, 250]$, we need to beat both of the competitors, so the winning probability is

$$\left(\frac{250 - x}{180}\right)\left(\frac{300 - x}{160}\right).$$

When $x > 250$, we will lose.

(b) Suppose your bid is $x$. What is the expected profit? (2 points)

Ans.

$$\text{expected profit} = \begin{cases} x - 100 & \text{if } x \in [0, 70), \\ \dfrac{(x-100)(250-x)}{180} & \text{if } x \in [70, 140), \\ \dfrac{(x-100)(250-x)(300-x)}{180 \cdot 160} & \text{if } x \in [140, 250], \\ 0 & \text{if } x > 250. \end{cases}$$

(c) Determine the $x$ that maximizes your profit. (2 points)

Ans. $x = (1300 - 100\sqrt{13})/6 \approx 156.57$; the maximum expected profit is $(156.57 - 100)(250 - 156.57)(300 - 156.57)/(180 \cdot 160) \approx 26.32$ thousand dollars.
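The maximizer in part (c) can be cross-checked numerically. The following is a rough sketch, assuming the piecewise winning probability from part (a) and a simple grid search with a step of 0.01 (the grid and the function names are choices made here, not part of the exam):

```python
def win_probability(x: float) -> float:
    """Probability that bid x is lower than both competitors' bids."""
    if x < 70:
        return 1.0
    if x < 140:
        return (250 - x) / 180
    if x <= 250:
        return (250 - x) / 180 * (300 - x) / 160
    return 0.0


def expected_profit(x: float) -> float:
    """Expected profit in thousand dollars: (bid - cost) * P(win)."""
    return (x - 100) * win_probability(x)


# Evaluate every bid from 0 to 300 in steps of 0.01 and keep the best one.
bids = [i / 100 for i in range(0, 30001)]
best_bid = max(bids, key=expected_profit)
best_profit = expected_profit(best_bid)
print(round(best_bid, 2), round(best_profit, 2))
```

The grid search also covers the other pieces of the profit function, so it confirms that the maximum lies in [140, 250] rather than at a boundary of another interval.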

  4. $X$, $Y$, and $Z$ are three random variables. Can you propose a real-world example of them that satisfies the following conditions? (6 points)

(a) $X$ and $Y$ are independent.
(b) $X$ and $Y$ become dependent given $Z$.

Ans. $X$: father’s blood type, $Y$: mother’s blood type, $Z$: their child’s blood type.

  5. $T_1$ and $T_2$ are two positive continuous random variables that satisfy:
    • $T_1 > T_2$.
    • $T_1 + T_2 < 2$.

(a) Given $\alpha = 0.025$, what is the critical region? (5 points)

Ans. Let $n$ be the sample size and $y$ be the number of positive votes. The critical region for $\alpha = 0.025$ is

$$z = \frac{y/n - 0.65}{\sqrt{(0.65)(0.35)/n}} \ge 1.96.$$

(b) Given that 414 out of a sample of 600 favor this proposal, find the p-value. ( points)

Ans.

$$z = \frac{414/600 - 0.65}{\sqrt{(0.65)(0.35)/600}} \approx 2.054,$$

and the p-value $\approx P(Z \ge 2.054) = 0.0200$.

(c) Should we reject or accept $H_0$? (2 points)

Ans. Since $z > 1.96$ and the p-value $< 0.0250$, we reject $H_0$ at the $\alpha = 0.025$ significance level.
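The numbers in parts (b) and (c) can be reproduced with the standard library. This is a sketch under the assumption that the test is the one-sided test of $H_0: p = 0.65$ against $H_1: p > 0.65$, which is what the answer implies; the helper name `one_sided_z_test` is hypothetical:

```python
import math


def one_sided_z_test(successes: int, n: int, p0: float) -> tuple[float, float]:
    """Return (z, p_value) for H0: p = p0 against H1: p > p0."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    # Upper tail of the standard normal: P(Z >= z) = erfc(z / sqrt(2)) / 2
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


z, p_value = one_sided_z_test(414, 600, 0.65)
reject_h0 = z >= 1.96  # critical value for alpha = 0.025, as used in the answer
print(round(z, 3), round(p_value, 4), reject_h0)
```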

  8. Chi-Square Test: (8 points) The teacher claims that 1/4 of the students will receive an A grade, 1/4 will receive a B grade, and 1/2 will receive a C grade. Among the 40 students, 6 receive A, 7 receive B, and 27 receive C. Would the claim be rejected at the $\alpha = 0.05$ significance level?

Ans. $q_2 = 4.95 < 5.991 = \chi^2_{0.05}(2)$, so we do not reject $H_0$ at $\alpha = 0.05$.

  9. Mutual Information: (7 points) A six-sided fair die is rolled. What is the mutual information between the top side and the front face (the side most facing you)? Hint: The sum of two opposite sides is always 7.

Ans. Note that having observed a side $F$ of the cube facing us, there are four equally probable possibilities for the top $T$. Thus,

$$I(T; F) = H(T) - H(T \mid F) = \log 6 - \log 4 = \log 3 - 1,$$

since $T$ has a uniform distribution on $\{1, 2, \ldots, 6\}$.
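Both of the last two answers are short computations that can be checked directly. The sketch below assumes base-2 logarithms (suggested by the form $\log 3 - 1$) and that all 24 (top, front) orientations of the die are equally likely; the variable names are chosen only for this illustration:

```python
import math

# Chi-square statistic for the grade claim: sum of (observed - expected)^2 / expected.
observed = [6, 7, 27]
claimed = [0.25, 0.25, 0.5]
total = sum(observed)
q = sum((o - total * p) ** 2 / (total * p) for o, p in zip(observed, claimed))
reject_claim = q > 5.991  # chi-square critical value for 2 degrees of freedom, from the answer

# Mutual information between top and front faces of a die.
# The front face can be neither the top face nor its opposite (7 - top).
pairs = [(t, f) for t in range(1, 7) for f in range(1, 7)
         if f != t and f != 7 - t]
p_joint = 1 / len(pairs)  # each valid orientation is equally likely
p_top = p_front = 1 / 6   # marginals are uniform
mi_bits = sum(p_joint * math.log2(p_joint / (p_top * p_front)) for _ in pairs)

print(round(q, 2), reject_claim, round(mi_bits, 4))
```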

  10. Bayesian Network and Association Rule: Half of the Taiwanese students in the class get a high score, and 2/3 of the students in the class are Taiwanese. Only 1/10 of the non-Taiwanese students get a high score.

(a) Define the random variables and draw the Bayesian network (with conditional probability table) for this statement. (2 points)

Ans. Let $T$ denote “the student is Taiwanese” and $H$ denote “the student gets a high score.” The network has a single edge $T \to H$, with

$P(T) = 2/3$, $P(H \mid T) = 1/2$, $P(H \mid \tilde{T}) = 1/10$.

(b) What is the probability that a randomly chosen student is a Taiwanese who gets a high score? (3 points)

Ans. $P(T, H) = P(T)P(H \mid T) = (2/3)(1/2) = 1/3$.

(c) Given an association rule that says “Japanese = true” → “score = high,” please provide a pair of “reasonable” min-support and min-confidence that make this rule true. (5 points)

Ans. The answers have to satisfy the following conditions:

  • min-support < $P(\text{score} = \text{high})$,
  • $P(\text{score} = \text{high})$ < min-confidence.

Here $P(\text{score} = \text{high}) = P(H \mid T)P(T) + P(H \mid \tilde{T})P(\tilde{T}) = (1/2)(2/3) + (1/10)(1/3) = 11/30$.
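  11. Bayesian Network Inference: (10 points) Given the following Bayesian network,

[Figure: Bayesian network with edges $H \to F$, $H \to S$, and $S \to W$]

$P(H) = 1/2$

$P(F \mid H) = 0$, $P(F \mid \tilde{H}) = 1/2$

$P(S \mid H) = 1/10$, $P(S \mid \tilde{H}) = 1/2$

$P(W \mid S) = 1/2$, $P(W \mid \tilde{S}) = 1$

please calculate $P(W, F)$.

Ans. $P(W, F) = P(W, F, H) + P(W, F, \tilde{H})$, but $P(W, F, H) = 0$ because $P(F \mid H) = 0$. Therefore it suffices to compute

$$\begin{aligned}
P(W, F, \tilde{H}) &= P(W, F \mid \tilde{H})P(\tilde{H}) = P(W \mid \tilde{H})P(F \mid \tilde{H})P(\tilde{H}) \\
&= \left[P(W, S \mid \tilde{H}) + P(W, \tilde{S} \mid \tilde{H})\right] \cdot \frac{1}{2} \cdot \frac{1}{2} \\
&= \left[P(W \mid S)P(S \mid \tilde{H}) + P(W \mid \tilde{S})P(\tilde{S} \mid \tilde{H})\right] \cdot \frac{1}{4} \quad \text{(given } S\text{, } W \text{ is independent of } H\text{)} \\
&= \left[\frac{1}{2} \cdot \frac{1}{2} + 1 \cdot \frac{1}{2}\right] \cdot \frac{1}{4} = \frac{3}{16}.
\end{aligned}$$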
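The same quantity can be obtained by brute-force enumeration of the joint distribution. This is a sketch that assumes the network factorizes as $P(H)P(F \mid H)P(S \mid H)P(W \mid S)$, the structure suggested by the conditional probability tables above; the helper names are hypothetical:

```python
from fractions import Fraction as Fr
from itertools import product

# Conditional probability tables, keyed by the value of the parent node.
P_H = Fr(1, 2)
P_F_GIVEN_H = {True: Fr(0), False: Fr(1, 2)}
P_S_GIVEN_H = {True: Fr(1, 10), False: Fr(1, 2)}
P_W_GIVEN_S = {True: Fr(1, 2), False: Fr(1)}


def bernoulli(p_true: Fr, value: bool) -> Fr:
    """Probability that a binary variable with P(True) = p_true equals value."""
    return p_true if value else 1 - p_true


def joint(h: bool, f: bool, s: bool, w: bool) -> Fr:
    """P(H=h, F=f, S=s, W=w) under the assumed factorization."""
    return (bernoulli(P_H, h)
            * bernoulli(P_F_GIVEN_H[h], f)
            * bernoulli(P_S_GIVEN_H[h], s)
            * bernoulli(P_W_GIVEN_S[s], w))


# Marginalize out H and S with W = F = True.
p_w_and_f = sum(joint(h, True, s, True)
                for h, s in product([True, False], repeat=2))
print(p_w_and_f)  # 3/16
```

Enumeration scales poorly with the number of variables, but for four binary nodes it is a convenient check on the hand derivation.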

12. TFIDF

Corpus C consists of only three documents:

  • $D_1$: “new york times”
  • $D_2$: “new york post”
  • $D_3$: “los angeles times”

[Figure: directed graph on the nodes Sue, Jean, $N_1$, $P_1$, and $C_1$, with edges labeled “calls” or “emails”]

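There are six paths of length 2:

  • $\text{Sue} \xrightarrow{\text{calls}} N_1 \xrightarrow{\text{calls}} \text{Jean}$
  • $\text{Jean} \xrightarrow{\text{emails}} \text{Sue} \xrightarrow{\text{emails}} P_1$
  • $\text{Jean} \xrightarrow{\text{emails}} \text{Sue} \xrightarrow{\text{calls}} N_1$
  • $N_1 \xrightarrow{\text{calls}} \text{Jean} \xrightarrow{\text{calls}} C_1$
  • $N_1 \xrightarrow{\text{calls}} \text{Jean} \xrightarrow{\text{calls}} P_1$
  • $N_1 \xrightarrow{\text{calls}} \text{Jean} \xrightarrow{\text{emails}} \text{Sue}$

Suppose we perform a random experiment that picks a length-2 path uniformly at random, and define two random variables $S$ and $P$:

  • $S$: the starting node of the path (e.g., “Sue”)
  • $P$: the link-combination of the path (e.g., {calls, emails}, {calls, calls})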

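(a) What is the size of $P$’s outcome space? (2 points)

Ans. 4.

(b) What is the mutual information $I(S; P)$? (5 points)

Ans. We need to average all possible combinations of $\mathrm{PMI}(S, P)$:

$$\frac{1}{6}\log\frac{\frac{1}{6}}{\frac{1}{6} \cdot \frac{1}{2}} + \frac{1}{6}\log\frac{\frac{1}{6}}{\frac{1}{3} \cdot \frac{1}{6}} + \frac{1}{6}\log\frac{\frac{1}{6}}{\frac{1}{3} \cdot \frac{1}{6}} + \frac{1}{3}\log\frac{\frac{1}{3}}{\frac{1}{2} \cdot \frac{1}{2}} + \frac{1}{6}\log\frac{\frac{1}{6}}{\frac{1}{2} \cdot \frac{1}{6}} = \log 2.$$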
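Parts (a) and (b) can be checked by enumerating the paths programmatically. The edge list below is an assumption reconstructed from the six listed paths, and the link-combination is treated as an ordered pair of labels, which is what the answer of 4 outcomes implies:

```python
import math
from collections import Counter

# Directed, labeled edges (source, label, target), inferred from the six paths.
edges = [
    ("Sue", "calls", "N1"), ("Sue", "emails", "P1"),
    ("N1", "calls", "Jean"),
    ("Jean", "emails", "Sue"), ("Jean", "calls", "C1"), ("Jean", "calls", "P1"),
]

# A length-2 path is two edges where the first target is the second source.
# Each path is recorded as (start node, (first label, second label)).
paths = [(a, (l1, l2))
         for a, l1, b in edges
         for b2, l2, _ in edges if b == b2]

n_paths = len(paths)
joint_counts = Counter(paths)
start_counts = Counter(s for s, _ in paths)
combo_counts = Counter(c for _, c in paths)

# I(S; P) = sum over (s, c) of p(s, c) * log2( p(s, c) / (p(s) * p(c)) )
mi_bits = sum(
    (k / n_paths) * math.log2(
        (k / n_paths)
        / ((start_counts[s] / n_paths) * (combo_counts[c] / n_paths)))
    for (s, c), k in joint_counts.items()
)
print(n_paths, len(combo_counts), round(mi_bits, 6))
```

With base-2 logarithms the result corresponds to $\log 2 = 1$ bit.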

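(c) Assume that min-support is 0.3 and min-confidence is 0.7. Can we conclude an association rule $N_1 \to$ {calls, calls}? Why? (3 points)

Ans. No. Although support = 2/6 = 1/3 > min-support, confidence = 2/3 < min-confidence.

(d) Assume the initial PageRank value of each node is 0.2. Which node(s) have the highest PageRank values after two iterations? (5 points)

Ans. Let “$\cong$” denote normalization.

| Node | $T = 0$ | $T = 1$ | $T = 2$ |
|---|---|---|---|
| Sue | $\frac{1}{5}$ | $\frac{1}{3} \cdot \frac{1}{5} \cong \frac{1}{9}$ | $\frac{1}{3} \cdot \frac{1}{3} \cong \frac{2}{11}$ |
| Jean | $\frac{1}{5}$ | $1 \cdot \frac{1}{5} \cong \frac{1}{3}$ | $1 \cdot \frac{1}{6} \cong \frac{3}{11}$ |
| $N_1$ | $\frac{1}{5}$ | $\frac{1}{2} \cdot \frac{1}{5} \cong \frac{1}{6}$ | $\frac{1}{2} \cdot \frac{1}{9} \cong \frac{1}{11}$ |
| $P_1$ | $\frac{1}{5}$ | $\frac{1}{2} \cdot \frac{1}{5} + \frac{1}{3} \cdot \frac{1}{5} \cong \frac{5}{18}$ | $\frac{1}{2} \cdot \frac{1}{9} + \frac{1}{3} \cdot \frac{1}{3} \cong \frac{3}{11}$ |
| $C_1$ | $\frac{1}{5}$ | $\frac{1}{3} \cdot \frac{1}{5} \cong \frac{1}{9}$ | $\frac{1}{3} \cdot \frac{1}{3} \cong \frac{2}{11}$ |

Therefore, Jean and $P_1$ have the highest PageRank values after two iterations.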
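The two iterations can be replayed exactly with rational arithmetic. This sketch assumes the simplified update used in the answer (no damping factor, each node splits its value evenly over its out-links, then all values are renormalized to sum to 1) and an out-link table inferred from the listed paths:

```python
from fractions import Fraction as Fr

# Out-links per node, inferred from the six length-2 paths in the problem.
out_links = {
    "Sue": ["N1", "P1"],
    "Jean": ["Sue", "C1", "P1"],
    "N1": ["Jean"],
    "P1": [],
    "C1": [],
}


def pagerank_step(rank: dict[str, Fr]) -> dict[str, Fr]:
    """One undamped update followed by normalization."""
    new_rank = {node: Fr(0) for node in rank}
    for source, targets in out_links.items():
        for target in targets:
            # Each node shares its current value equally among its out-links.
            new_rank[target] += rank[source] / len(targets)
    total = sum(new_rank.values())
    return {node: value / total for node, value in new_rank.items()}


rank = {node: Fr(1, 5) for node in out_links}
for _ in range(2):
    rank = pagerank_step(rank)

print({node: str(value) for node, value in rank.items()})
# {'Sue': '2/11', 'Jean': '3/11', 'N1': '1/11', 'P1': '3/11', 'C1': '2/11'}
```

Standard PageRank adds a damping factor and redistributes the value of nodes without out-links; both are omitted here to match the hand computation.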

  14. N-gram Language Model: (10 points) The problem of “Chinese poetry segmentation” aims at breaking a Chinese poetry sentence into a sequence of terms, for example,

Can you carefully describe a way to use an n-gram LM to do this job? Hint: You need to determine not only where to put the breaks but also how many breaks there are.

Ans. A plausible answer is as follows. First, calculate the bi-gram probability for each term (using the poetry corpus), for example

0.5 0.01 0.7 0.3 0.2 0.5
夜 半 鐘 聲 到 客 船

The second step is to choose some break points. Assuming each break is a binary decision, in this case we have $2^6$ different choices. For each choice, we can calculate the corresponding probability. For example, in the following case the probability is $(0.5 \times 0.7 \times 0.3 \times 0.5)/3$ (we normalize the value by dividing the whole probability by the number of sections).

0.5 0.01 0.7 0.3 0.2 0.5
夜 半