
Statistical Analysis of Professor Salaries: Confidence Intervals and Hypothesis Testing (Study notes, Statistics)

These notes give the calculations for constructing 95% confidence intervals for the mean salaries of female and male professors using the t-distribution. They also include a hypothesis test for the difference between the mean salaries, a test of the independence of gender and rank, and the calculation of the regression line for salary by years employed.

Typology: Study notes

2011/2012

Uploaded on 10/02/2012

danielle-fitzpatrick1

  1. The sample mean is from a random sample, the population distribution is normal, and the population standard deviation is unknown. Therefore, to construct a 95% confidence interval for μ_f and μ_m, I will use the following formula:

\[ \bar{x} \pm (t\ \text{critical value}) \left( \frac{s}{\sqrt{n}} \right) \]

a) For μ_f, where n = 14, x̄_f = 21357.14, s_f = 6151.873, df_f = 13, and t critical value = 2.160:

A 95% confidence interval for μ_f is (17805.760, 24908.520). The confidence interval is defined by the sample statistic ± the margin of error, and the uncertainty is expressed by the confidence level. We are therefore 95% confident that the true mean salary of female professors lies between $17,805.76 and $24,908.52.

b) For μ_m, where n = 38, x̄_m = 24696.79, s_m = 5646.409, df_m = 37, and t critical value = 2.026:

A 95% confidence interval for μ_m is (22841.038, 26552.542). The confidence interval is defined by the sample statistic ± the margin of error, and the uncertainty is expressed by the confidence level. We are therefore 95% confident that the true mean salary of male professors lies between $22,841.04 and $26,552.54.
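The two intervals can be reproduced with a few lines of Python. This is a minimal sketch: the function name `t_confidence_interval` is illustrative, and the critical values 2.160 (df = 13) and 2.026 (df = 37) are taken as the usual t-table entries for a 95% interval rather than computed.

```python
import math

def t_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    margin = t_crit * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Sample statistics as reported in the notes.
# ASSUMPTION: t critical values are standard t-table entries (two-sided 95%).
female_ci = t_confidence_interval(21357.14, 6151.873, 14, 2.160)
male_ci = t_confidence_interval(24696.79, 5646.409, 38, 2.026)

print("female: (%.2f, %.2f)" % female_ci)
print("male:   (%.2f, %.2f)" % male_ci)
```

Because the male sample is larger and slightly less variable, its standard error is smaller and its interval is narrower than the female interval.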

  1. Population characteristic of interest: μ = mean salary for all professors.
  2. Null hypothesis: H_0: μ = 26,000
  3. Alternative hypothesis: H_a: μ < 26,000
  4. Significance level: α = 0.05
  5. Test statistic:

\[ t = \frac{\bar{x} - \text{hypothesized value}}{s / \sqrt{n}} \]

  6. Assumptions: The population is normal and the population standard deviation is unknown, so the use of the t test is reasonable.
  7. Computations: n = 52, x̄ = 23797.654, s = 5917.289, df = 52 − 1 = 51, SE = s / √n = 820.580, t = (23797.654 − 26000) / 820.580 = −2.684.
  8. P-value: Since this is a one-tailed (lower-tailed) test, the P-value is the probability that a t-score with 51 degrees of freedom is less than −2.684. From the t distribution, the P-value is approximately 0.005.
  9. Conclusion: Since the P-value (0.005) is less than the significance level (0.05), we reject the null hypothesis. There is convincing evidence that the population mean salary is less than $26,000.
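The one-sample test can be sketched in Python as follows. This sketch recomputes the sample standard deviation from the totals Σy = 1237478 and Σy² = 31234802944 listed in the regression summary of these notes, and approximates the lower-tail area of the t distribution with Simpson's rule because the standard library has no t CDF. The helper names `t_pdf` and `t_lower_tail` are illustrative, not from the source.

```python
import math

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def t_lower_tail(t, df, steps=2000):
    """P(T <= t), integrating the density on [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # P(0 <= T <= |t|)
    return 0.5 + area if t > 0 else 0.5 - area

# Totals reported in the notes (n professors, sum of salaries, sum of squares).
n = 52
sum_y = 1237478
sum_y2 = 31234802944
mu0 = 26000  # hypothesized mean salary

mean = sum_y / n
s = math.sqrt((sum_y2 - sum_y ** 2 / n) / (n - 1))  # sample standard deviation
se = s / math.sqrt(n)                               # standard error of the mean
t_stat = (mean - mu0) / se
p_value = t_lower_tail(t_stat, n - 1)               # lower-tailed test

print("mean = %.3f, s = %.3f, SE = %.3f" % (mean, s, se))
print("t = %.3f, P-value = %.4f" % (t_stat, p_value))
```

Dividing the sum of squared deviations by n − 1 gives the sample standard deviation; dividing by n instead gives the smaller population version, which should not be used as s in the test statistic.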

  1. The population characteristic of interest is μ1 − μ2 = difference in mean salary, where μ1 = true mean male salary and μ2 = true mean female salary.
  2. Null hypothesis: H_0: μ1 − μ2 = 0
  3. Alternative hypothesis: H_a: μ1 − μ2 > 0
  4. Significance level: α = 0.05
  5. Test statistic:

\[ t = \frac{\bar{x}_1 - \bar{x}_2 - \text{hypothesized value}}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \]

  6. Assumptions: Both samples are independently selected random samples. The population distributions are normal and the population standard deviations are unknown, so the use of the two-sample t test is reasonable.
  7. Computation: x̄_1 = 24696.79, x̄_2 = 21357.14, s_1² = 31881934.9, s_2² = 37845542, n_1 = 38, n_2 = 14, df = 21, so

\[ t^* = \frac{24696.79 - 21357.14 - 0}{\sqrt{\dfrac{31881934.9}{38} + \dfrac{37845542}{14}}} \approx 1.77 \]

  8. P-value: This is an upper-tailed test, so the P-value is the area to the right of t*: P-value = P(t > t*). With df = 21, the P-value is 0.045.
  9. Conclusion: Since the P-value (0.045) is less than the significance level (0.05), we reject the null hypothesis. There is convincing evidence that the mean salary for males is higher than the mean salary for females.
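A short Python sketch of the two-sample computation follows. It assumes the df = 21 in the notes comes from the Welch-Satterthwaite approximation truncated to a whole number, which the notes do not state explicitly; the function name `welch_t` is illustrative.

```python
import math

def welch_t(mean1, var1, n1, mean2, var2, n2, hypothesized=0.0):
    """Two-sample t statistic with unpooled variances, plus approximate df."""
    a = var1 / n1  # squared standard error contributed by sample 1
    b = var2 / n2  # squared standard error contributed by sample 2
    t_stat = (mean1 - mean2 - hypothesized) / math.sqrt(a + b)
    # ASSUMPTION: Welch-Satterthwaite degrees of freedom.
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t_stat, df

# Sample 1 = male professors, sample 2 = female professors (values from the notes).
t_stat, df = welch_t(24696.79, 31881934.9, 38, 21357.14, 37845542, 14)

print("t* = %.3f" % t_stat)
print("df = %.2f (truncated: %d)" % (df, int(df)))
```

Truncating the approximate df downward is the conservative choice, since fewer degrees of freedom give a slightly larger P-value.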

  1. H_0: Gender and rank are independent.
  2. H_a: Gender and rank are not independent.
  3. Significance level: α = 0.05
  4. Test statistic:

\[ X^2 = \sum_{\text{all cells}} \frac{(\text{observed count} - \text{expected count})^2}{\text{expected count}} \]

  5. Expected cell counts: The following table shows the observed counts with the expected cell counts in parentheses.

                 Rank
Gender     Full           Non-Full       Total
Female     4 (5.385)      10 (8.615)     14
Male       16 (14.615)    22 (23.385)    38
Total      20             32             52

  6. Assumptions: This is a random sample. All expected counts are larger than 5.
  7. Computations:

\[ X^2 = \frac{(4 - 5.385)^2}{5.385} + \frac{(10 - 8.615)^2}{8.615} + \frac{(16 - 14.615)^2}{14.615} + \frac{(22 - 23.385)^2}{23.385} = 0.792 \]

  8. P-value: The two-way table for this example has 2 rows and 2 columns, so the P-value is based on a chi-square distribution with df = (2 − 1)(2 − 1) = 1. Using Excel, the P-value is 0.374.
  9. Conclusion: Since the P-value (0.374) > α (0.05), we cannot reject H_0. There is not convincing evidence that gender and rank are dependent.
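The chi-square computation can be checked with the sketch below. It builds the expected counts from the row and column totals and, because df = 1, obtains the P-value from the standard normal tail via `math.erfc`; this shortcut is specific to one degree of freedom and is offered in place of the Excel lookup the notes mention.

```python
import math

# Observed counts: rows = Female, Male; columns = Full, Non-Full.
observed = [[4, 10],
            [16, 22]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell = (row total * column total) / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

# For df = 1 only: P(chi-square > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print("expected counts:", [[round(e, 3) for e in row] for row in expected])
print("X^2 = %.3f, P-value = %.3f" % (chi2, p_value))
```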

[Figure: Salary by Years Employed. Plot of Salary (vertical axis) against Year (horizontal axis) with linear trend line f(x) = 752.8x + 18166.]

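Let y denote the dependent variable, salary, and x denote the independent variable, years employed.

a) The summary statistics of the data are:

\[ n = 52, \qquad \sum_{i=1}^{n} x_i = 389, \qquad \sum_{i=1}^{n} y_i = 1237478, \qquad \sum_{i=1}^{n} x_i^2 = 4457 \]

\[ \sum_{i=1}^{n} x_i y_i = 10421851, \qquad \sum_{i=1}^{n} y_i^2 = 31234802944, \qquad \left( \sum_{i=1}^{n} x_i \right) \left( \sum_{i=1}^{n} y_i \right) = 481378942 \]

Then we have:

\[ S_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{\left( \sum_{i=1}^{n} x_i \right) \left( \sum_{i=1}^{n} y_i \right)}{n} = 1164564 \]

\[ S_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{n} = 1546.981 \]

\[ \bar{x} = 389 / 52 = 7.481 \]

\[ \bar{y} = 1237478 / 52 = 23797.654 \]

\[ \hat{\beta} = \frac{S_{xy}}{S_{xx}} = \frac{1164564}{1546.981} = 752.798 \]

\[ \hat{\alpha} = \bar{y} - \hat{\beta} \bar{x} = 23797.654 - (752.798)(7.481) = 18165.972 \]

The equation of the estimated regression line is then:

\[ \hat{y} = \hat{\alpha} + \hat{\beta} x = 18165.972 + 752.798 x \]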

b) y is expected to increase by β̂ units for each 1-unit increase in x. Here, salary is expected to increase by about $752.80 for each additional year employed.

c) If the number of years employed were 18, we would have:

\[ \hat{y} = \hat{\alpha} + \hat{\beta} x = 18165.972 + 752.798(18) = 31716.336 \]

That is, the predicted salary is about $31,716.
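The regression coefficients can be recomputed from the summary totals with the sketch below. The value Σx² = 4457 is an assumption: it is not legible in the source and is back-calculated here from the reported S_xx and slope. The function name `predict` is illustrative. Because the code carries full precision instead of the rounded x̄ = 7.481, the intercept and prediction differ from the hand calculation by a few tenths of a dollar.

```python
# Summary totals from the notes (x = years employed, y = salary).
n = 52
sum_x = 389
sum_y = 1237478
sum_xy = 10421851
sum_x2 = 4457  # ASSUMPTION: back-calculated, not legible in the source

s_xy = sum_xy - sum_x * sum_y / n   # corrected sum of cross products
s_xx = sum_x2 - sum_x ** 2 / n      # corrected sum of squares of x

beta = s_xy / s_xx                        # estimated slope
alpha = sum_y / n - beta * (sum_x / n)    # estimated intercept

def predict(years):
    """Predicted salary from the fitted least-squares line."""
    return alpha + beta * years

print("S_xy = %.3f, S_xx = %.3f" % (s_xy, s_xx))
print("slope = %.3f, intercept = %.3f" % (beta, alpha))
print("predicted salary at 18 years = %.2f" % predict(18))
```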