Delivery Time vs. Housing Tenure: Chi-Squared Test & Contingency Tables, Study notes of Mathematical Methods

An explanation of the chi-squared test for association and its application to the relationship between time of delivery and housing tenure. The notes cover the calculation of expected frequencies and the comparison of observed and expected frequencies for different housing tenure groups. They also cover other statistical tests, such as Fisher's exact test and the chi-squared test for linear association.

Typology: Study notes

2010/2011

Uploaded on 09/10/2011

Applied Biostatistics
Chi-squared tests
Martin Bland
Professor of Health Statistics
University of York
http://www-users.york.ac.uk/~mb55/
Analyses for qualitative data
Also called nominal, categorical.
Only two categories: dichotomous, attribute, quantal,
binary.
Methods:
Chi-squared test for association
Fisher’s exact test
Chi-squared test for linear association
Risk ratio, relative risk
Odds ratio
Contingency tables
Cross tabulation of two categorical variables:
Time of delivery by housing tenure
Housing tenure Premature Term Total
---------------------------------------------
Owner-occupier 50 849 899
Council tenant 29 229 258
Private tenant 11 164 175
Lives with parents 6 66 72
Other 3 36 39
---------------------------------------------
Total 99 1344 1443
This kind of cross-tabulation of frequencies is also called a
contingency table or cross classification.
Want to test the null hypothesis that there is no relationship
or association between the two variables.

If the sample is large, we can do this by a chi-squared test.

If the sample is small, we must use Fisher’s exact test.

The chi-squared test for association

Null hypothesis: no association between the two variables.

Alternative hypothesis: an association of some type.

Proportion who are premature = 99/1443.

Out of 899 owner occupiers, expect 899 × 99/1443 = 61.7 to be premature deliveries if the null hypothesis were true.

Proportion who are term = 1344/1443.

Out of 899 owner occupiers, expect 899 × 1344/1443 = 837.3 to be term deliveries if the null hypothesis were true.

Note that 61.7 + 837.3 = 899.

Out of 258 council tenants, expect 258 × 99/1443 = 17.7 to be premature deliveries if the null hypothesis were true.

Out of 258 council tenants, expect 258 × 1344/1443 = 240.3 to be term deliveries if the null hypothesis were true.

Note that 17.7 + 240.3 = 258.

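Time of delivery by housing tenure (observed frequencies, with expected frequencies in parentheses)
Housing tenure         Premature      Term            Total
------------------------------------------------------------
Owner-occupier         50 (61.7)      849 (837.3)      899
Council tenant         29 (17.7)      229 (240.3)      258
Private tenant         11 (12.0)      164 (163.0)      175
Lives with parents      6  (4.9)       66  (67.1)       72
Other                   3  (2.7)       36  (36.3)       39
------------------------------------------------------------
Total                  99            1344             1443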

Compare the observed and expected frequencies.

Add (observed – expected)²/expected for all cells.

If the null hypothesis is true and the samples are large enough, this is an observation from a Chi-squared distribution, often written χ².
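The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the method used to prepare these notes: the function names `expected_frequencies` and `chi_squared_statistic` are made up for the example, and the observed counts are copied from the time of delivery by housing tenure table.

```python
# Observed frequencies from the time of delivery by housing tenure table.
# Each row is (premature, term).
observed = {
    "Owner-occupier": (50, 849),
    "Council tenant": (29, 229),
    "Private tenant": (11, 164),
    "Lives with parents": (6, 66),
    "Other": (3, 36),
}


def expected_frequencies(table):
    """Expected count for each cell = row total x column total / grand total."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]


def chi_squared_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    rows = list(table.values())
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(rows, expected)
        for o, e in zip(obs_row, exp_row)
    )


expected = expected_frequencies(observed)
chi2 = chi_squared_statistic(observed)

for name, exp_row in zip(observed, expected):
    print(name, [round(e, 1) for e in exp_row])
print("chi-squared =", round(chi2, 1))
```

The expected values for each row add up to the row total, as noted for the owner occupiers and council tenants.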

The Chi-squared distribution

Family of distributions, one parameter, called the degrees of freedom.

[Figure: probability density of the Chi-squared distribution with 4 degrees of freedom, plotted from 0 to 30, with a 5% region labelled.]

Percentage points of the Chi-squared Distribution

Degrees of    Probability that the tabulated value of χ² is exceeded
freedom       10%       5%        1%        0.1%
              0.10      0.05      0.01      0.001
------------------------------------------------------
 1            2.71      3.84      6.63     10.83
 2            4.61      5.99      9.21     13.82
 3            6.25      7.81     11.34     16.27
 4            7.78      9.49     13.28     18.47
 5            9.24     11.07     15.09     20.52
 6           10.64     12.59     16.81     22.46
 7           12.02     14.07     18.48     24.32
 8           13.36     15.51     20.09     26.12
 9           14.68     16.92     21.67     27.88
10           15.99     18.31     23.21     29.59
...           ...       ...       ...       ...
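As a rough check on where such tabulated values come from, the sketch below recomputes percentage points numerically. It rests on two assumptions that are not part of these notes: it uses the closed-form upper tail probability exp(-x/2) × Σ (x/2)^k / k! (k from 0 to df/2 - 1), which holds only for even degrees of freedom, and it finds the percentage point by simple bisection. The function names are hypothetical.

```python
import math


def upper_tail(x, df):
    """P(Chi-squared with df degrees of freedom > x). Even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers even degrees of freedom")
    half = x / 2.0
    term = 1.0   # (x/2)^k / k! for k = 0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def percentage_point(df, prob, hi=200.0):
    """Value exceeded with probability prob, found by bisection."""
    lo = 0.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if upper_tail(mid, df) > prob:
            lo = mid   # tail still too large: move right
        else:
            hi = mid   # tail small enough: move left
    return (lo + hi) / 2.0


for df in (2, 4, 6, 8, 10):
    points = [round(percentage_point(df, p), 2) for p in (0.10, 0.05, 0.01, 0.001)]
    print(df, points)
```

For 4 degrees of freedom this gives 9.49 as the 5% point. Odd degrees of freedom need the incomplete gamma function, which a statistics package would normally supply.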

The chi-squared test for association

For a contingency table, the degrees of freedom are given by:

(number of rows – 1) × (number of columns – 1).

We have (5 – 1) × (2 – 1) = 4 degrees of freedom.

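χ² = 10.5, 4 d.f. This exceeds the tabulated 5% point for 4 d.f. (9.49) but not the 1% point (13.28), so 0.01 < P < 0.05.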

The chi-squared test for association

The chi-squared statistic is not an index of the strength of the association.

If we double the frequencies, this will double chi-squared, but the strength of the association is unchanged.
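The point about doubling can be checked directly. The sketch below, with a hypothetical helper `chi_squared_statistic`, doubles every frequency in the time of delivery by housing tenure table and compares the two statistics. The proportion premature in each tenure group is used here only as a simple stand-in for "strength of association".

```python
def chi_squared_statistic(rows):
    """Pearson chi-squared statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    total = 0.0
    for row, rt in zip(rows, row_totals):
        for o, ct in zip(row, col_totals):
            e = rt * ct / n          # expected frequency for this cell
            total += (o - e) ** 2 / e
    return total


# (premature, term) for each housing tenure group
observed = [(50, 849), (29, 229), (11, 164), (6, 66), (3, 36)]
doubled = [(2 * a, 2 * b) for a, b in observed]

chi2 = chi_squared_statistic(observed)
chi2_doubled = chi_squared_statistic(doubled)

# Proportion premature in each group is the same in both tables
prop = [a / (a + b) for a, b in observed]
prop_doubled = [a / (a + b) for a, b in doubled]

print(round(chi2, 2), round(chi2_doubled, 2))
print(prop == prop_doubled)
```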

The chi-squared test for association

The test statistic follows the Chi-squared Distribution provided the expected values are large enough.

This is a large sample test.

The smaller the expected values become, the more dubious will be the test.

The conventional criterion for the test to be valid is this: the chi-squared test is valid if at least 80% of the expected frequencies exceed 5 and all the expected frequencies exceed 1.

Also known as the Pearson chi-squared test.
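The conventional criterion is easy to apply mechanically. The following sketch uses a hypothetical function `meets_conventional_criterion` and applies it to two tables from these notes: the time of delivery by housing tenure table and the 2 by 2 renal failure table that appears under Fisher's exact test.

```python
def expected_frequencies(rows):
    """Expected count for each cell = row total x column total / grand total."""
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def meets_conventional_criterion(rows):
    """True if at least 80% of expected values exceed 5 and all exceed 1."""
    expected = [e for row in expected_frequencies(rows) for e in row]
    n_over_5 = sum(1 for e in expected if e > 5)
    # Integer form of n_over_5 / len(expected) >= 0.8, to avoid rounding issues
    enough_over_5 = 5 * n_over_5 >= 4 * len(expected)
    return enough_over_5 and all(e > 1 for e in expected)


tenure = [(50, 849), (29, 229), (11, 164), (6, 66), (3, 36)]
renal = [(6, 31), (2, 62)]

tenure_ok = meets_conventional_criterion(tenure)
renal_ok = meets_conventional_criterion(renal)
print(tenure_ok, renal_ok)
```

In the tenure table 8 of the 10 expected frequencies exceed 5, so the criterion is only just met. In the renal failure table one of the four expected frequencies is 2.9, so it is not.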

Fisher’s exact test

Also called the Fisher-Irwin exact test.

Works for any sample size.

Formerly used only for small samples in 2 by 2 tables, because of computing problems.

Calculate the probability of every possible table with the given row and column totals.

Sum the probabilities for all the tables as or less probable than the observed.

Fisher’s exact test

For the time of delivery by housing tenure table:

χ² = 10.5, 4 d.f. Using a computer, P = 0.033.

Fisher's exact test: P = 0.034.
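The chi-squared P value quoted here can be reproduced without a statistics package, because for 4 degrees of freedom the upper tail probability has the simple closed form exp(-x/2) × (1 + x/2). That formula is a standard result brought in for this sketch, not something stated in these notes, and it applies to 4 d.f. only. The function name is hypothetical.

```python
import math


def upper_tail_4df(chi2):
    """P(Chi-squared with 4 d.f. > chi2) = exp(-x/2) * (1 + x/2)."""
    half = chi2 / 2.0
    return math.exp(-half) * (1.0 + half)


# Rounded statistic from the time of delivery by housing tenure table
p_value = upper_tail_4df(10.5)
print(round(p_value, 3))
```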

Fisher’s exact test

Renal failure and mortality in peritonitis patients (Fennell, unpublished)

Status after |  Renal failure  |
3 months     |   Yes       No  |  Total
-------------+-----------------+----------
Dead         |     6       31  |    37
Alive        |     2       62  |    64
-------------+-----------------+----------
Total        |     8       93  |   101

Ex = 2.9 is the expected frequency for patients with renal failure who died (37 × 8/101).
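For a 2 by 2 table such as this one, the procedure described for Fisher's exact test (calculate the probability of every table with the given row and column totals, then sum those as or less probable than the observed) can be sketched as below. This is a minimal illustration with a hypothetical function name, not the routine of any particular package. It relies on the standard result that, with the margins fixed, the top-left cell follows a hypergeometric distribution, and it uses exact fractions so that "as or less probable" is not affected by rounding.

```python
from fractions import Fraction
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test P value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability of a table with x in the top-left cell, margins fixed
        return Fraction(comb(row1, x) * comb(row2, col1 - x), denom)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over every possible table that is as or less probable than observed
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed
    )
    return float(total)


# Dead: 6 with renal failure, 31 without; Alive: 2 with, 62 without
p_renal = fisher_exact_2x2(6, 31, 2, 62)
print(p_renal)
```

Tables larger than 2 by 2 need every table with the given margins to be enumerated, which is why the test used to be restricted to small 2 by 2 tables.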

The chi-squared test for linear association

Assessment of radiological appearance at six months as compared with appearance on admission (MRC 1948)

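Radiological assessment              Streptomycin   Control
-----------------------------------------------------------
Considerable improvement                  28            4
Moderate or slight improvement            10           13
No material change                         2            3
Moderate or slight deterioration           5           12
Considerable deterioration                 6            6
Deaths                                     4           14
-----------------------------------------------------------
Total                                     55           52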

Chi-squared = 26.97, 5 d.f., P = 0.0001.

Does not take the ordering of the categories into account.

Several tests do, including the Armitage chi-squared test for trend, the Mantel-Haenszel linear-by-linear association, and Kendall's tau b.

SPSS does the Mantel-Haenszel linear-by-linear association chi-squared test, whether you want it or not.

Trend: chi-squared = 17.93, 1 d.f., P < 0.0001.

Mantel-Haenszel linear-by-linear: chi-squared = 17.76, 1 d.f., P < 0.0001.

Assigns numerical values to categories. This must be meaningful.

E.g.:
Considerable improvement = 1,
Moderate or slight improvement = 2,
No material change = 3,
Moderate or slight deterioration = 4,
Considerable deterioration = 5,
Death = 6,
and
Streptomycin = 1,
Control = 2.

We then ask: given these numerical scales, is there a relationship of the form

improvement = constant + another constant × treatment?
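The two 1 d.f. statistics quoted for the streptomycin data can be reproduced from the scores above. The sketch assumes the usual formulations, which are not spelled out in these notes: the linear-by-linear statistic is taken as (N - 1) × r² and the trend statistic as N × r², where r is the correlation between the row and column scores over the N subjects. The function names are hypothetical.

```python
import math

# Scores: assessment 1..6 (rows), Streptomycin = 1 and Control = 2 (columns)
row_scores = [1, 2, 3, 4, 5, 6]
col_scores = [1, 2]
counts = [(28, 4), (10, 13), (2, 3), (5, 12), (6, 6), (4, 14)]


def squared_correlation(counts, row_scores, col_scores):
    """Return (N, r squared) for the scores weighted by the cell counts."""
    n = sx = sy = sxx = syy = sxy = 0
    for x, row in zip(row_scores, counts):
        for y, freq in zip(col_scores, row):
            n += freq
            sx += freq * x
            sy += freq * y
            sxx += freq * x * x
            syy += freq * y * y
            sxy += freq * x * y
    cov = sxy - sx * sy / n
    var_x = sxx - sx * sx / n
    var_y = syy - sy * sy / n
    return n, cov * cov / (var_x * var_y)


def upper_tail_1df(chi2):
    """P(Chi-squared with 1 d.f. > chi2), via the Normal distribution."""
    return math.erfc(math.sqrt(chi2 / 2.0))


n, r2 = squared_correlation(counts, row_scores, col_scores)
linear_by_linear = (n - 1) * r2
trend = n * r2

print(round(linear_by_linear, 2), upper_tail_1df(linear_by_linear))
print(round(trend, 2), upper_tail_1df(trend))
```

Under these assumptions the results agree with the 17.76 and 17.93 quoted above. Changing the scores, for example giving Death a score further from the rest, would change both statistics, which is why the numerical values must be meaningful.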
