Stata Commands for Statistics for Sociologists: Module 7 – Inference for Distributions, Study notes of Statistics

Stata commands for Module 7 of the Statistics for Sociologists course at the University of North Carolina, Chapel Hill, focusing on inference for distributions. It covers functions for the normal and Student's t distributions, as well as instructions for calculating confidence intervals and performing one-sample tests.

Typology: Study notes

2011/2012

Uploaded on 12/29/2012

University of North Carolina
Chapel Hill
Soci708-001 Statistics for Sociologists
Fall 2009
Professor François Nielsen
Stata Commands for Module 7 – Inference for Distributions
For further information on any command in this handout, simply type help
followed by the name of the command in Stata.
For confidence intervals, also see page 35 of the Stata and SAS Guide pdf
(click on Documents in side bar; guide is linked under Software Documentation).
1 Statistical Functions in Stata
1.1 Normal Distribution Functions
The function normal(z) returns the cumulative standard normal distribution
P(Z ≤ z).
. display normal(1.207)
.88628393
The function invnormal(p) returns z such that P(Z ≤ z) = p.
. display invnormal(0.975)
1.959964
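Because normal() returns a lower-tail probability, upper-tail and two-sided probabilities come from its complement. The lines below are a minimal sketch of that arithmetic, reusing the value 1.207 from the example above; the 90% level in the last line is only an illustration.

```stata
* Upper-tail probability P(Z > 1.207), the complement of normal()
display 1 - normal(1.207)
* Two-sided probability P(|Z| > 1.207)
display 2*(1 - normal(1.207))
* invnormal() undoes normal(): this returns 1.207 up to rounding
display invnormal(normal(1.207))
* Critical value z* for a 90% confidence interval (5% in each tail)
display invnormal(0.95)
```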
1.2 Student t Distribution Functions
The function ttail(df, t) returns the reverse cumulative (upper-tail) Student’s
t distribution for df degrees of freedom; given t it returns the probability
P(T > t).
. display ttail(7, 1.960)
.04540985
The function invttail(df, p) returns the inverse reverse cumulative
(upper-tail) Student’s t distribution for df degrees of freedom; given p it returns
t such that P(T > t) = p.
. display invttail(7, 0.025)
2.3646243
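Since ttail() works with the upper tail, lower-tail probabilities and two-sided p-values have to be built from it. The following sketch shows the usual conversions, reusing 7 degrees of freedom from the examples above; the 90% level and the choice of 1000 degrees of freedom are illustrative.

```stata
* Lower-tail (cumulative) probability P(T <= 1.960) with 7 df
display 1 - ttail(7, 1.960)
* Two-sided p-value for an observed t of 1.960 with 7 df
display 2*ttail(7, abs(1.960))
* Critical value t* for a 90% confidence interval with 7 df
display invttail(7, 0.05)
* With many degrees of freedom, t* approaches the normal z* of about 1.96
display invttail(1000, 0.025)
```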
1.3 Curve for Problem 7.113 p.481
For IPS6e Problem 7.113 p.481 Degrees of freedom and confidence interval
width.
This is how to draw the curve requested in this problem in Stata:

. twoway function y=invttail(x,0.025), range(2 100) yline(1.96)

This is how to do it in R:

curve(qt(0.975,x),2,100,xlab="Degrees of freedom", ylab="t*")
abline(1.96,0,col="blue")
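The curve matters because the margin of error of a 95% confidence interval for a mean is t* times s/√n, so a larger t* means a wider interval. As a rough companion to the plot, the loop below prints t* at a few degrees of freedom; the particular values 2, 5, 10, 30 and 100 are chosen only for illustration.

```stata
* Print t* for a 95% confidence interval at selected degrees of freedom
foreach df of numlist 2 5 10 30 100 {
    display "df = " `df' "   t* = " invttail(`df', 0.025)
}
```

The printed values fall toward 1.96, the reference line drawn by yline(1.96) in the Stata plot.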

2 Entering Data for Confidence Intervals and One-Sample Tests

There are several ways to enter data in Stata to calculate confidence intervals

for the mean and one-sample test statistics. Here are three of them.

2.1 Method 1

A quick-and-dirty method from Andrew Ritchey. Take IPS6e Problem 7.

p.442 as an example. You have to enter 20 observations. In Stata, first clear

any data in memory. Then create a data frame with 20 observations and create

a variable with all values missing:

. clear
. set obs 20
. gen mpg=.

Then go to the data editor (Data/Data Editor, or click the icon) and replace the

missing values with the actual values. Then close the Data Editor (click on the

×). Your data are ready to use. See below.
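Before using hand-entered data it is worth checking that nothing was skipped. A minimal check, assuming the variable is named mpg as in the commands above, might look like this:

```stata
* Number of observations still missing; this should be 0 after data entry
count if missing(mpg)
* Show the values as entered, to compare against the textbook
list mpg
* Number of observations, mean, standard deviation, minimum and maximum
summarize mpg
```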

2.2 Method 2

Another technique from Michele Easter. You want to enter 9 observations. After

clearing any data in memory, we create a new variable called var and input

the data. (Type input var and then enter the values. After the last value,

type end.)

. clear
. input var

var

  1. 3
  2. 3
  3. end

2.3 Method 3

Still another technique that is a time-saver for moderately long data sets from

the textbook. Use the disk that comes with the text and navigate to the data

sets, and go to the Excel folder. The data for Problem 7.24 p. 442 are under

. * lower bound of CI
. display 43.17 - 2.

. * upper bound
. display 43.17 + 2.

Now we do the CI the easy way.

. ci mpg

    Variable |        Obs        Mean    Std. Err.       [95% Conf. Interval]
-------------+---------------------------------------------------------------
         mpg |         20       43.17    .9872104        41.10374         45.
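The interval reported by ci is the sample mean plus or minus t* times the standard error, with n - 1 degrees of freedom. The sketch below rebuilds it from the results that summarize leaves behind in r(); it assumes the variable mpg is in memory, and the scalar names nobs, xbar, sem and tstar are arbitrary. In Stata 14 and later the easy-way command is written ci means mpg.

```stata
* Assumes the variable mpg entered earlier is in memory
quietly summarize mpg
scalar nobs  = r(N)
scalar xbar  = r(mean)
* Standard error of the mean: s / sqrt(n)
scalar sem   = r(sd)/sqrt(r(N))
* Critical value for a 95% interval with n - 1 degrees of freedom
scalar tstar = invttail(nobs - 1, 0.025)
display "lower bound: " xbar - tstar*sem
display "upper bound: " xbar + tstar*sem
```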

3 One-Sample t-Test in Stata

The following two examples involve inputting data using the keyboard, but in

general it is easier just to go into the Data Editor, or copy and paste from an

already-entered Excel spreadsheet.

In the first command, we create a new variable called var and input the

data. (Type input var and then enter the values. After the last value, type

end.)

. input var

var

  1. 3
  2. 3
  3. end

Check the mean by using the summarize command (a.k.a. sum or su).

. su var

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
         var |          9    3.133333    .1581139        2.9         3.

The mean for this sample is 3.133333. Now we would like to do a t-test to

assess whether the true mean is greater than 3. Stata reports the results for

the alternatives μ < 3, μ ≠ 3, and μ > 3 at the same time.

. ttest var=3

One-sample t test

Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
     var |       9    3.133333    .0527046    .1581139    3.011796          3.

    mean = mean(var)                                              t =       2.
Ho: mean = 3                                     degrees of freedom =        8

    Ha: mean < 3                 Ha: mean != 3                 Ha: mean > 3
 Pr(T < t) = 0.9824         Pr(|T| > |t|) = 0.0353          Pr(T > t) = 0.

Stata by default assigns a 95% confidence interval, but this can be changed

using the option level. To tell Stata to use a 90% confidence interval, you

would enter the command (output not shown):

. ttest var=3, level(90) ...

See Stata Help for more on the ttest command.
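The test statistic and p-values in the ttest output can be reproduced by hand from the reported mean, the null value 3, and the standard error, using ttail() with 9 - 1 = 8 degrees of freedom. This is a sketch of the arithmetic only; the scalar name tstat is arbitrary.

```stata
* t statistic: (sample mean - null value) / standard error
scalar tstat = (3.133333 - 3)/.0527046
display tstat
* One-sided p-value for Ha: mean > 3, with 8 degrees of freedom
display ttail(8, tstat)
* Two-sided p-value for Ha: mean != 3
display 2*ttail(8, abs(tstat))
```

The two-sided result should agree with the Pr(|T| > |t|) = 0.0353 that ttest reports, and the one-sided result is the complement of Pr(T < t) = 0.9824.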

4 Matched Pairs in Stata

This example is very similar.

. input x

x

  1. 9000
  2. 8000
  3. 6000
  4. 6000
  5. 8000
  6. 7000
  7. end

. ttest x=0

One-sample t test

Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |       6    7333.333    494.4132     1211.06    6062.404       8604.

    mean = mean(x)                                                t =      14.
Ho: mean = 0                                     degrees of freedom =        5

    Ha: mean < 0                 Ha: mean != 0                 Ha: mean > 0
 Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.
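The example above enters the paired differences directly. If the two measurements for each pair are stored as separate variables instead, the same test can be run either on a computed difference or with the paired form of ttest. The variable names before and after below are hypothetical and stand in for whatever the two measurements are called.

```stata
* Hypothetical variables: before and after hold the two paired measurements
generate diff = after - before
* One-sample t test of the differences against 0, as in the example above
ttest diff == 0
* Paired t test on the two variables directly; it tests the same hypothesis
ttest after == before
```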

5 Two-Samples Problems in Stata

Now, using the dataset woprops (available from sidebar of course website,

under Datasets), do a t-test to see whether State cabinets under Democratic

. ttesti 17 .1712353 .0822401 22 .2398636 .1350461, unequal ...

. ttesti 17 .1712353 .0822401 22 .2398636. ...
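The arguments of ttesti are, in order, the number of observations, mean, and standard deviation of the first group, followed by the same three values for the second group. With the unequal option, Stata does not pool the variances and uses Satterthwaite's approximation for the degrees of freedom. The sketch below reproduces that calculation by hand from the six numbers in the first command; the scalar names are arbitrary.

```stata
* Squared standard errors of the two group means: s^2 / n
scalar v1 = .0822401^2/17
scalar v2 = .1350461^2/22
* t statistic for the difference in means, variances not pooled
scalar tstat = (.1712353 - .2398636)/sqrt(v1 + v2)
* Satterthwaite approximation to the degrees of freedom
scalar dfs = (v1 + v2)^2/(v1^2/(17 - 1) + v2^2/(22 - 1))
display "t = " tstat "   df = " dfs
* Two-sided p-value
display 2*ttail(dfs, abs(tstat))
```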