Introduction to Probability and Statistics - Final Exam Review Sheet | STAT 20, Study notes of Statistics

Material Type: Notes; Class: Introduction to Probability and Statistics; Subject: Statistics; University: University of California - Berkeley; Term: Summer 2007;


Uploaded on 09/07/2009

koofers-user-tz9

Stat 20 Final Exam Review

  1. Are there gender differences in the progress of students in doctoral programs? A major university classified all students entering Ph.D. programs in a given year by their status 6 years later. The categories used were as follows: completed the degree, still enrolled, and dropped out. The data appear below:

Status            Men   Women   Total
Completed         423      98     521
Still Enrolled    143      33     176
Dropped Out       238      98     336
Total             804     229    1033

Assume that these data can be viewed as a random sample giving us information on student progress.

(a) We want to test the null hypothesis H_0: “The distribution of students’ status is the same for men and women.” Calculate the expected value of each cell under H_0.

(b) What test statistic should be used to test H_0? Calculate this test statistic.

(c) Under H_0, what is the distribution of your test statistic? What is the P-value of your test statistic?

(d) What is your conclusion about gender differences in the progress of students in doctoral programs?
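One way to check the arithmetic for parts (a) to (c) is sketched below. It assumes the intended procedure is the usual Pearson chi-square test of homogeneity, which the sheet does not name explicitly; the variable names are illustrative only. The closed-form tail probability used at the end is valid only because this table has 2 degrees of freedom.

```python
import math

# Observed counts from the table: each row is a status, columns are (men, women).
observed = {
    "Completed": (423, 98),
    "Still Enrolled": (143, 33),
    "Dropped Out": (238, 98),
}

row_totals = {status: sum(counts) for status, counts in observed.items()}
col_totals = [sum(counts[j] for counts in observed.values()) for j in range(2)]
n = sum(row_totals.values())

# Expected count under H_0: (row total * column total) / grand total.
expected = {
    status: tuple(row_totals[status] * col_totals[j] / n for j in range(2))
    for status in observed
}

# Pearson chi-square statistic: sum over cells of (observed - expected)^2 / expected.
chi2 = sum(
    (observed[s][j] - expected[s][j]) ** 2 / expected[s][j]
    for s in observed
    for j in range(2)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (2 - 1)

# Assumption: for df = 2 only, the chi-square upper tail equals exp(-x / 2).
p_value = math.exp(-chi2 / 2)

for status, (men, women) in expected.items():
    print(f"{status:15s} expected men = {men:8.2f}, women = {women:8.2f}")
print(f"chi-square = {chi2:.2f}, df = {df}, P-value = {p_value:.5f}")
```

For a table with a different number of rows or columns, the tail probability would have to come from a chi-square table or a statistics library instead of the `exp(-x / 2)` shortcut.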

  2. A university financial aid office polled a random sample of undergraduate students to study their summer employment. Not all students were employed the previous summer. Here are the results for men and women:

                Men   Women   Total
Employed         72      59     131
Not employed      8      14      22
Total            80      73     153

(a) We are interested in testing H_0: p_1 = p_2 vs. H_a: p_1 ≠ p_2, where p_1 is the proportion of male students employed during the summer and p_2 is the proportion of female students employed during the summer. What is your test statistic for testing the hypotheses?

(b) What is the P-value of your test statistic?

(c) Would you reject H_0 at the .05 significance level?

(d) Find a 99% confidence interval for the difference between the proportions of male and female students who were employed during the summer.
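The calculation for this problem can be sketched as follows, assuming the standard large-sample approach: a two-proportion z test with a pooled standard error for the hypothesis test, and an unpooled standard error for the confidence interval. The sheet does not state which formulas the course expects, so treat this as one reasonable reading; all names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the table: employed students and sample sizes by sex.
x1, n1 = 72, 80   # men
x2, n2 = 59, 73   # women

p1_hat = x1 / n1
p2_hat = x2 / n2
diff = p1_hat - p2_hat

# Test of H_0: p_1 = p_2 uses the pooled proportion in the standard error.
p_pool = (x1 + x2) / (n1 + n2)
se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = diff / se_pooled

# Two-sided P-value from the standard normal distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# 99% confidence interval uses the unpooled standard error.
se_unpooled = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z_star = NormalDist().inv_cdf(0.995)
ci = (diff - z_star * se_unpooled, diff + z_star * se_unpooled)

print(f"z = {z:.3f}, two-sided P-value = {p_value:.3f}")
print(f"99% CI for p_1 - p_2: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Comparing `p_value` with .05 answers part (c). Checking whether the interval `ci` contains 0 gives a consistent picture at the 1% level.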

  3. There are two kinds of twins: identical twins result when a fertilized egg divides, and fraternal twins come from the fertilization of two separate eggs in the womb. The probability that a maternity results in identical twins is 1/300 and the probability of fraternal twins is 1/125. (A maternity is defined as a pregnancy that results in birth.) For this problem, we assume that a maternity can result only in a single birth or in twins (e.g., no triplets). Identical twins are always of the same sex, and we assume that for each pair of identical twins, the probability is 0.5 that they are girls and 0.5 that they are boys. Fraternal twins are like siblings and may be of the same or different sexes; we assume that the sexes of the two babies are independent and that each has a 50% chance of being a girl or a boy. Also, for a single birth, the baby has an equal chance of being either sex.

(a) What is the probability that a maternity will result in the single birth of a girl?

(b) What is the probability that a maternity will result in twin girls?

(c) A woman has twins. What is the conditional probability that they are identical twin boys?

(d) Elvis Presley had a twin brother who died at birth. What is the conditional probability that Elvis was an identical twin, given that he had a twin brother?
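The probabilities in this problem follow from the multiplication rule and the definition of conditional probability. The sketch below works them out with exact fractions. For part (d) it assumes the conditioning event is "both twins are boys", since Elvis himself was a boy; that reading is an interpretation, not something the sheet spells out.

```python
from fractions import Fraction

# Probabilities given in the problem statement.
p_identical = Fraction(1, 300)
p_fraternal = Fraction(1, 125)
p_single = 1 - p_identical - p_fraternal   # only single births or twins occur
half = Fraction(1, 2)

# (a) Single birth and the baby is a girl.
p_single_girl = p_single * half

# (b) Twin girls: identical pair that is female, or fraternal pair with two girls.
p_twin_girls = p_identical * half + p_fraternal * half * half

# (c) P(identical twin boys | twins).
p_twins = p_identical + p_fraternal
p_identical_boys_given_twins = (p_identical * half) / p_twins

# (d) P(identical | both twins are boys), by Bayes' rule.
p_identical_two_boys = p_identical * half
p_fraternal_two_boys = p_fraternal * half * half
p_identical_given_twin_brother = p_identical_two_boys / (
    p_identical_two_boys + p_fraternal_two_boys
)

print("(a)", p_single_girl)
print("(b)", p_twin_girls)
print("(c)", p_identical_boys_given_twins)
print("(d)", p_identical_given_twin_brother)
```

Exact fractions avoid rounding and make it easy to compare the result with a hand calculation.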

  4. Plotted below are seed count against seed weight in milligrams (left) and log count against log weight (right) for common tree species. Below the plots is the output from the statistical software package STATA for the regression of log seed count (lCount) on log seed weight (lWeight). All logs are natural logs, also denoted log_e or ln.

[Figure: two scatterplots. Left: "Plot of Seed Count versus Weight", Count against Weight (mg). Right: "Plot of Log Seed Count versus Log Weight", Log Count against Log Weight (log(mg)).]

. regress lCount lWeight

      Source |       SS       df       MS              Number of obs =      19
-------------+------------------------------           F(  1,    17) =    107.
  Regression |  54.5897644     1  54.5897644           Prob > F      =      0.
    Residual |  8.65741923    17  .509259955           R-squared     =      0.
-------------+------------------------------           Adj R-squared =      0.
       Total |  63.2471836    18  3.51373242           Root MSE      =       .

      lCount |      Coef.   Std. Err.       t    P>|t|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lWeight |  -.5670124   .0547655   -10.35   0.000    -.6825574         -.
       _cons |   9.758665   .3027314    32.24   0.000     9.119958        10.


(a) Let Y denote the seed count and X the seed weight. Express the fitted linear relationship above in terms of X and Y.

(b) It is claimed that the product of the seed count and the square root of the seed weight is constant across tree species. Let α denote this constant. Express this relationship between seed count and seed weight in terms of Y, X and α.

(c) Take logs of the relationship in part (b). Compare this to the relationship in part (a) and formulate a null hypothesis and a two-sided alternative to test this claim.

(d) Test the claim at the 5% level using the 95% confidence interval.

(e) Estimate α.
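Taking logs of Y·sqrt(X) = α gives ln Y = ln α − 0.5 ln X, so the claim corresponds to a slope of −0.5 in the log-log regression. The sketch below rebuilds the 95% confidence interval for the slope from the coefficient and standard error in the output. The critical value 2.1098 for a t distribution with 17 degrees of freedom is an assumed table value, and estimating α as exp(_cons) from the unconstrained fit is one simple choice, not necessarily the intended answer.

```python
from math import exp

# Values read from the STATA output above.
slope, slope_se = -0.5670124, 0.0547655
intercept = 9.758665
df_resid = 17

# Assumption: two-sided 5% critical value of t with 17 df, taken from a t table.
T_CRIT_17 = 2.1098

# 95% confidence interval for the slope on lWeight.
margin = T_CRIT_17 * slope_se
ci = (slope - margin, slope + margin)

# The claim Y * sqrt(X) = alpha implies ln Y = ln(alpha) - 0.5 * ln X.
claimed_slope = -0.5
reject = not (ci[0] <= claimed_slope <= ci[1])

# Assumption: estimate alpha by exponentiating the fitted intercept.
alpha_hat = exp(intercept)

print(f"95% CI for slope: ({ci[0]:.4f}, {ci[1]:.4f})")
print(f"Reject H_0: slope = -0.5 at the 5% level? {reject}")
print(f"Estimated alpha: {alpha_hat:.0f}")
```

The lower limit computed this way can be compared with the −.6825574 printed in the output as a check on the critical value.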

  5. For the items below, select true or false. If false, explain why the statement is incorrect.

(a) Even if the covariance of X and Y is greater than zero, E(X + Y) = E(X) + E(Y).

True False

(b) The average of a list of numbers cannot be smaller than its standard deviation.

True False

(c) In a simple linear regression, if the correlation between Y and X is negative, then so is the estimate of the slope, b.

True False

(d) If the correlation between two random variables X and Y is equal to 0, then X and Y are independent.

True False

(e) A 90% confidence interval is always wider than a 95% confidence interval.

True False
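Statements like (b) and (d) can be probed with small numerical examples. The sketch below uses two hypothetical examples that are not part of the original sheet: the list [-1, 1] for (b), and a variable X uniform on {-1, 0, 1} with Y = X² for (d). It only illustrates how a single counterexample can settle such a statement.

```python
from fractions import Fraction
from statistics import mean, pstdev

# (b) A list whose average is smaller than its standard deviation.
values = [-1, 1]
avg = mean(values)
sd = pstdev(values)   # population SD, dividing by the list length

# (d) X takes the values -1, 0, 1 with probability 1/3 each, and Y = X^2.
support = [-1, 0, 1]
p = Fraction(1, 3)

e_x = sum(p * x for x in support)
e_y = sum(p * x**2 for x in support)
e_xy = sum(p * x * x**2 for x in support)
cov_xy = e_xy - e_x * e_y          # zero covariance implies zero correlation

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_joint_00 = p                     # Y = 0 exactly when X = 0
p_x0 = p
p_y0 = p
independent = p_joint_00 == p_x0 * p_y0

print(f"(b) average = {avg}, SD = {sd}")
print(f"(d) Cov(X, Y) = {cov_xy}, independent? {independent}")
```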