Random Variables and Probability Distributions - Lecture Notes | STAT 380, Study notes of Statistics

Material Type: Notes; Class: Statistics and Applications; Subject: Statistics ; University: University of Nebraska - Lincoln; Term: Spring 2005;

Uploaded on 08/30/2009

Chapter 3: Random Variables and Probability Distributions

[Diagram from Chapter 1: a sample is taken from the population ("Take Sample"), and inference is made from the sample about the population ("Inference").]

In Chapter 2, we learned about some of the basics of probability. In this chapter, we are going to learn more about probability and how we can use “probability distributions”. One can think of probability distributions as population quantities since they summarize the possible values that a random variable can take on.
2005 Christopher R. Bilder

3.1-3.2: Concept of a Random Variable and Discrete Probability Distributions

Example: Fifty numbers from 0 to 9 (fifty_numbers_ch3.xls)

Suppose I draw 50 numbers at random from 0, 1, …, 9 with replacement and each number has an equal chance of being drawn. Notice that for each draw, the sample space is S = {0, 1, …, 9}. Below are the results.

Below is a table summarizing frequencies and percentages for each category:

Frequency Percentage

Let X = the number selected on one draw from 0, 1, …, 9.

X is called a random variable because it can change from draw to draw in a random manner that is controlled by a probability structure.

Notice a capital X is used!

We know before the experiment of drawing numbers that X can be any number from 0, 1, …, 9, and we know the percentage of draws (probability) in which we expect X to be each of these numbers. Thus, we can talk about a probability distribution for X before the experiment. Note that this is for the population!

x    P(X=x)
0    P(X=0) = 0.1
1    P(X=1) = 0.1
⋮
9    P(X=9) = 0.1

Notes:

• A lowercase “x” is used to denote the values that X can take on. Be careful!!!

• $\sum_{x=0}^{9} P(X = x) = 1$

• Values of X not listed above have a probability of 0; for example, P(X=4.2) = 0.

• Remember that P(X=7) is the proportion of draws equal to 7 that we expect if the experiment is repeated an infinite number of times. In our sample, the proportion of draws on which a 7 occurred was 0.06.

• The random variable used in this example is called a discrete random variable since there are a finite number of values that it can take on: 0, 1, …, 9. Section 3.3 will discuss continuous random variables, where there are an infinite number of values a random variable can take on within a particular region.

• The more general definition of a discrete random variable is that the set of possible values for X is “countable”. Countable corresponds to a set which is “finite” or “countably infinite”.

• When there are a finite number of values for X (i.e., you can count all possible values), the random variable is discrete (as in this example).

• A random variable is also called “discrete” if the set of possible values of X is “countably infinite”. There will be more on this at the end of the section.

• In order to make this more general (we need to for continuous random variables), the notation of f(x) is

• Select Tools > Data Analysis > Random Number Generation from the main Excel menu bar.

• The Random Number Generation window will appear.

• To generate a sample of 50, the Number of Variables is set to 1, and the Number of Random Numbers is set to 50.

• Select Discrete as the distribution since there are a discrete number of values of X.

• Select the area in the spreadsheet where the probability distribution is located. For example, cell F14 contains the “0” and cell G23 contains the last “0.1”. The “:” in the list of cells means “to”. The “$” signs are not needed to make this work (see window below).

• For the Output Range, select the cell where the output should begin.

• Select OK.

The final Random Number Generation window is shown below with part of the output (see the 50 random numbers column).
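For readers without Excel, the same kind of draw can be sketched in Python. This is only an illustration and not part of the original spreadsheet: the names `values`, `probs`, and `sample` and the seed are assumptions, and the numbers produced will differ from the sample in fifty_numbers_ch3.xls.

```python
import random

# Possible values of X and their probabilities (the population distribution).
values = list(range(10))   # 0, 1, ..., 9
probs = [0.1] * 10         # each value is equally likely

# Fix a seed (an arbitrary choice) so the sketch is reproducible.
rng = random.Random(380)

# Draw 50 numbers with replacement according to the distribution.
sample = rng.choices(values, weights=probs, k=50)

print(sample)
```

Changing `probs` to unequal weights plays the same role as editing the probability column that the Excel tool reads.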

• The Histogram window will now appear. Fill in the Input Range with the cell addresses of the sample, the Bin Range with the cell addresses of the possible values that X can take on, and the Output Range with an empty spot in the spreadsheet where the Excel-generated output will go. Below is what the window should look like:

• Select OK to produce the table below. The titles of the table can be changed to whatever you desire.

Bin     Frequency
0       8
1       4
2       5
3       10
4       5
5       3
6       0
7       3
8       5
9       7
More    0
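The sample percentages follow from these counts by dividing each frequency by the sample size. A short Python sketch of that arithmetic is given here; the variable names are illustrative assumptions, and the counts are copied from the table above.

```python
# Frequencies from the Excel histogram output (bins 0 through 9).
frequency = {0: 8, 1: 4, 2: 5, 3: 10, 4: 5, 5: 3, 6: 0, 7: 3, 8: 5, 9: 7}

n = sum(frequency.values())  # total number of draws

# Sample proportion for each value of X.
proportion = {x: count / n for x, count in frequency.items()}

for x in sorted(frequency):
    print(f"{x}  {frequency[x]:>3}  {proportion[x]:.2f}")
```

For example, the value 7 has sample proportion 3/50 = 0.06, while its population probability is 0.1.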

Let’s review what has happened here.

• The random variable X denotes the number which is drawn from 0, 1, …, 9.

• The probability distribution assigns probabilities to particular values that X can take on. For this example, all 10 possible values have a probability of 0.1. By specifying the probability distribution, we are specifying the possible values that can exist in the population along with how often they occur.

• The value “observed” from the first draw was 3. The value “observed” from the second draw was 1. All 50 observed values together constitute a sample of size 50 from a population which has the specified probability distribution.
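Using the notation f(x) = P(X=x), the population distribution in this example can be written as a small function. The sketch below is one possible way to express it in Python (the use of exact fractions is a choice made here to avoid rounding, not something from the notes), and it checks two of the properties noted earlier: the probabilities sum to 1, and unlisted values such as 4.2 have probability 0.

```python
from fractions import Fraction

def f(x):
    """P(X = x) for one draw from 0, 1, ..., 9 with equal chances."""
    # Only the integers 0 through 9 have positive probability.
    if x in range(10):
        return Fraction(1, 10)
    return Fraction(0)

# The probabilities over all possible values add up to 1.
total = sum(f(x) for x in range(10))

print(total)     # sum of f(x) over x = 0, ..., 9
print(f(7))      # probability of drawing a 7
print(f(4.2))    # a value that X cannot take on
```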

x P(X=x) = f(x)

is used instead. What do you expect would happen

with the sample? Try it out for homework.

In order to plot the probability distributions for discrete random variables, the following steps can be performed.

• Select Insert > Chart from the main Excel menu bar. Below is the window that will appear.

• We want to make a column chart since this will plot the probability for each X value as a bar. Select Next > to go on to the next step.

• Select the Series tab at the top of the window. Select Add to add a set of values to appear on the plot.

• Put in the range of the probabilities in the Values box and put in the range of X values in the Category (X) axis labels box. Below is what the finished window will look like. Select Next > to go on to the next step.

• The next window asks you where you want to put the plot. Select Finish to put it in the current sheet.

• The plot should now appear in the sheet. You can move it anywhere by clicking on it and moving it to a new location while holding down the left mouse button.

• Various items can be changed in the plot by selecting the items and making the appropriate changes.

  o Change the title: Select the title so that it is highlighted (black squares around it) and then click in the text area. Type in a new name.

  o Change the y-axis scale and gridlines: Select the y-axis (black squares will appear on it), right-click on it to bring up a shortcut menu, and select Format axis. Select whatever option you want to change from the Format axis window.

  o Change the gridlines?

  o Remove the gray background?

• Below is the final plot:

[Figure: column chart titled “Probability distribution for X”; horizontal axis X = 0, 1, …, 9; vertical axis f(x)]

Note that it may be better to represent each probability with just a line instead of a bar (see for example p. 68). This cannot be done easily in Excel. Below is a plot done in the statistical software package called R which demonstrates this:

[Figure: R plot titled “PDF”; horizontal axis X; vertical axis f(x)]

• The names probability distribution function and cumulative distribution function are often abbreviated as PDF and CDF.

Example: Fifty numbers from 0 to 9 (fifty_numbers_ch3.xls)

x P(X=x) = f(x) F(x)

Questions:

• What is F(0)?

• What is F(1.5)?

• What is F(1.99999)?

• What is F(2)?

• What is F(-1)?
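One way to check answers to these questions is to compute F(x) directly. The sketch below assumes the standard definition of the cumulative distribution function, F(x) = P(X ≤ x), and the equal-probability distribution from this example. The function name `F` and the use of exact fractions are choices made here for illustration.

```python
from fractions import Fraction

def F(x):
    """Cumulative distribution function F(x) = P(X <= x)."""
    # Add f(k) = 1/10 for every possible value k in 0, ..., 9 with k <= x.
    return sum((Fraction(1, 10) for k in range(10) if k <= x), Fraction(0))

# Evaluate F at the points asked about in the questions.
for x in [0, 1.5, 1.99999, 2, -1]:
    print(x, F(x))
```

Because X only takes integer values, F(x) stays flat between integers and jumps at each of 0, 1, …, 9, which is why the CDF is drawn as a step function.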

Plotting the cumulative distribution function for discrete random variables in Excel can be a little more difficult. Because of this, you will not be responsible for knowing how to do it. Please see fifty_numbers_ch3.xls for an example of how to do it. Below is the plot.

[Figure: plot titled “CDF”; horizontal axis X from -1 to 10; vertical axis F(x) from 0 to 1]

Example: Let’s Play Plinko! (plinko.xls in Chapter 4)

From Haws (1995):

   The contestant drops a circular disk down a board with nails arranged in the pattern indicated in figure 1. Assume also that the disk is equally likely to go to the left or to the right at each nail it encounters. The contestant wins the amount of cash indicated on the reservoir in which the disk lands.
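The "equally likely to go to the left or to the right at each nail" assumption can be simulated directly. The Python sketch below is a rough illustration only: the number of rows of nails (`n_rows`) is hypothetical because the figure is not reproduced here, the edges of the board are ignored, and the function name `drop_disk` and the seed are assumptions.

```python
import random

def drop_disk(n_rows, rng):
    """Return how many times the disk moved right after n_rows nails."""
    # At each nail the disk goes right (1) or left (0) with probability 1/2.
    return sum(rng.choice([0, 1]) for _ in range(n_rows))

rng = random.Random(2005)   # arbitrary seed for reproducibility
n_rows = 12                 # hypothetical value; not taken from the source

# Drop many disks and record where each one ends up.
results = [drop_disk(n_rows, rng) for _ in range(10_000)]

# Tabulate the proportion of disks with each number of moves to the right.
for k in range(n_rows + 1):
    print(k, results.count(k) / len(results))
```

Under these assumptions, the number of moves to the right is itself a discrete random variable, and the printed proportions estimate its probability distribution.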