The Chi Square Test, Study notes of Statistics

Typology: Study notes

2021/2022

Uploaded on 08/05/2022

The Chi Square Test
Diana Mindrila, Ph.D.
Phoebe Balentyne, M.Ed.
Based on Chapter 23 of The Basic Practice of Statistics (6th ed.)
Concepts:
Two-Way Tables
The Problem of Multiple Comparisons
Expected Counts in Two-Way Tables
The Chi-Square Test Statistic
Cell Counts Required for the Chi-Square Test
Uses of the Chi-Square Test
The Chi-Square Distributions
Objectives:
Construct and interpret two-way tables.
Describe the problem of multiple comparisons.
Calculate expected counts in two-way tables.
Describe the chi-square test statistic.
Describe the cell counts required for the chi-square test.
Describe uses of the chi-square test.
Describe the chi-square distributions.
Perform a chi-square goodness of fit test.
References:
Moore, D. S., Notz, W. I., & Fligner, M. A. (2013). The basic practice of statistics (6th ed.). New York, NY: W. H. Freeman and Company.

Example

Question: Is there an association between students’ preference for online or face-to-face instruction and their education level?

Survey Items:

Are you an undergraduate or graduate student?
o Undergraduate
o Graduate

Which method of instructional delivery do you prefer?
o Face-to-face
o Online

• The information gathered from this survey must be organized in a data file within the statistical software.
• For each question, a categorical (or nominal) variable is created.

Data File:
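
A data file of this kind typically holds one row per student and one column per categorical variable. As a minimal sketch of how such rows are tallied into a two-way table, the Python below counts a handful of responses. The six rows and the name `two_way_table` are hypothetical and chosen only for illustration; they are not the survey data.

```python
from collections import Counter

# Hypothetical raw responses (education level, preferred delivery),
# one tuple per student. These rows are made up for illustration.
responses = [
    ("Undergraduate", "Face-to-face"),
    ("Undergraduate", "Online"),
    ("Graduate", "Online"),
    ("Undergraduate", "Face-to-face"),
    ("Graduate", "Online"),
    ("Graduate", "Face-to-face"),
]


def two_way_table(rows):
    """Count how many records fall in each (row category, column category) cell."""
    return Counter(rows)


table = two_way_table(responses)

# Print one line per education level: [face-to-face count, online count].
for level in ("Undergraduate", "Graduate"):
    counts = [table[(level, pref)] for pref in ("Face-to-face", "Online")]
    print(level, counts)
```

Each cell of the resulting table is the number of students who gave that combination of answers.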

Chi-Square Test

• To determine whether the association between two qualitative variables is statistically significant, researchers must conduct a test of significance called the Chi-Square Test.
• There are five steps to conduct this test.

Step 1: Formulate the hypotheses

• Null Hypothesis:

H0: There is no significant association between students’ educational level and their preference for online or face-to-face instruction.

or

H0: There is no difference in the distribution of instructional preferences between undergraduate and graduate students.

• If there is no association between the two variables, the proportion of students preferring each method is the same at both education levels, so each cell count is determined by its row and column totals. The counts need not be equal across the cells.

• The alternative hypothesis for a chi-square test is neither one-sided nor two-sided. It is "many-sided", because the observed counts may differ from the expected counts in either direction in any cell of the table.

• Alternative Hypothesis:

Ha: There is a significant association between students’ educational level and their preference for online or face-to-face instruction.

or

Ha: There is a significant difference in the distribution of instructional preferences between undergraduate and graduate students.

Step 2: Specify the expected values for each cell of the table (when the null hypothesis is true)

• The expected values specify what the values of each cell of the table would be if there were no association between the two variables.
• The formula for computing the expected values requires the sample size, the row totals, and the column totals.
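
The expected count for a cell is (row total × column total) / sample size. A minimal sketch in Python follows. The observed table used here is an assumption: the notes state only the face-to-face counts (40 undergraduate, 5 graduate), so the online counts (20 and 35) were inferred so that the expected values of 27 and 18 and the statistic of 28.451 quoted in these notes are reproduced. The function name `expected_counts` is also illustrative.

```python
def expected_counts(observed):
    """Expected cell counts under no association:
    (row total * column total) / sample size for every cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: undergraduate, graduate. Columns: face-to-face, online.
# ASSUMPTION: the online counts (20, 35) are inferred, not stated in the notes.
observed = [[40, 20],
            [5, 35]]

expected = expected_counts(observed)
for label, row in zip(("Undergraduate", "Graduate"), expected):
    print(label, row)
```

With these counts the row totals are 60 and 40, the column totals are 45 and 55, and the sample size is 100.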

The Chi-Square Distributions

• The chi-square statistic starts at zero and cannot take negative values.
• The shape of the distribution is very different from that of the t or z statistic: it is skewed to the right.
• The shape of the distribution changes as the degrees of freedom increase.

The chi-square distributions are a family of distributions that take only positive values and are skewed to the right. A particular chi-square distribution is specified by giving its degrees of freedom. The chi-square test for a two-way table with r rows and c columns uses critical values from the chi-square distribution with (r – 1)(c – 1) degrees of freedom. The P-value is the area under the density curve of this chi-square distribution to the right of the value of the test statistic.
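
The degrees of freedom and the right-tail area can be sketched with the Python standard library alone. The function names below are illustrative, and the tail-area routine is one possible closed-form approach for whole-number degrees of freedom, not the method used by any particular statistical package (in practice a library routine such as `scipy.stats.chi2.sf` would normally be used).

```python
import math


def degrees_of_freedom(r, c):
    """Degrees of freedom for a two-way table with r rows and c columns."""
    return (r - 1) * (c - 1)


def chi2_p_value(x, df):
    """Area to the right of x under the chi-square density with
    integer df >= 1, using closed forms of the upper incomplete gamma."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-y) * sum_{k=0}^{df/2 - 1} y^k / k!
        total = sum(y ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-y) * total
    # Odd df: erfc(sqrt(y)) + exp(-y) * sum_{k=1}^{m} y^(k - 1/2) / Gamma(k + 1/2)
    m = df // 2
    tail = sum(y ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, m + 1))
    return math.erfc(math.sqrt(y)) + math.exp(-y) * tail


print(degrees_of_freedom(2, 2))              # a 2 x 2 table has 1 degree of freedom
print(round(chi2_p_value(3.841, 1), 3))      # close to 0.05
```

The values 3.841 (1 degree of freedom) and 5.991 (2 degrees of freedom) are the familiar 0.05 critical values, so the tail area at each should come out near 0.05.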

Chi-Square Test – Test Statistic

• The observed and expected values for the example problem are entered into the formula for the chi-square test statistic, which gives a value of 28.451.
• Is this value high enough to reject the null hypothesis?
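
The chi-square statistic is the sum, over all cells, of (observed − expected)² / expected. The sketch below recomputes it in Python. The face-to-face counts and their expected values come from these notes; the online counts (20 and 35) and their expected values (33 and 22) are assumptions, inferred so that the quoted value of 28.451 is reproduced. The P-value line relies on the fact that, for 1 degree of freedom, the right-tail area equals erfc(√(X²/2)).

```python
import math

# Rows: undergraduate, graduate. Columns: face-to-face, online.
# ASSUMPTION: the online column is inferred, not stated in the notes.
observed = [[40, 20],
            [5, 35]]
expected = [[27.0, 33.0],
            [18.0, 22.0]]


def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


statistic = chi_square_statistic(observed, expected)

# A 2 x 2 table has (2 - 1)(2 - 1) = 1 degree of freedom;
# for df = 1 the right-tail area is erfc(sqrt(statistic / 2)).
p_value = math.erfc(math.sqrt(statistic / 2))

print(round(statistic, 3))
print(p_value < 0.001)
```

For 1 degree of freedom the 0.05 critical value is 3.841, so a statistic of 28.451 is far into the right tail and the null hypothesis is rejected.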

The Chi-Square Test Interpretation

The chi-square test is an overall test for detecting relationships between two categorical variables. If the test is significant, it is important to look at the data to learn the nature of the relationship. There are three ways to look at the data:

1) Compare selected percents: which cells occur in very different percentages than the other cells?

2) Compare observed and expected cell counts: which cells have more or fewer observations than would be expected if H0 were true?

3) Look at the terms of the chi-square statistic: which cells contribute the most to the value of χ²?

• In the example, only 5 graduate students preferred face-to-face instruction, compared with the expected value of 18.
• Forty undergraduate students preferred face-to-face instruction, while the expected value was 27.
• Therefore, researchers can conclude not only that there is an association between the variables, but they can also describe the association.
• Researchers can conclude that graduate students tend to prefer online instruction, whereas undergraduate students tend to prefer face-to-face instruction.
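
The three ways of looking at the data can be sketched in Python as follows. As before, the online counts (20 and 35) are an assumption inferred from the values quoted in these notes, and all variable names are illustrative.

```python
levels = ("Undergraduate", "Graduate")
methods = ("Face-to-face", "Online")

# ASSUMPTION: the online column (20, 35) is inferred, not stated in the notes.
observed = [[40, 20],
            [5, 35]]
expected = [[27.0, 33.0],
            [18.0, 22.0]]

# 1) Selected percents: share of each education level preferring face-to-face.
pct_face_to_face = {
    level: 100 * row[0] / sum(row) for level, row in zip(levels, observed)
}

# 2) Observed minus expected count in each cell.
differences = {
    (level, method): o - e
    for level, o_row, e_row in zip(levels, observed, expected)
    for method, o, e in zip(methods, o_row, e_row)
}

# 3) Each cell's term of the chi-square statistic.
contributions = {
    cell: diff ** 2 / expected[levels.index(cell[0])][methods.index(cell[1])]
    for cell, diff in differences.items()
}

largest = max(contributions, key=contributions.get)

print(pct_face_to_face)
print(differences)
print(largest, round(contributions[largest], 3))
```

Under these assumed counts, about two thirds of undergraduates but only one eighth of graduate students prefer face-to-face instruction, and the graduate face-to-face cell contributes the largest single term to the statistic.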

Cell Counts Required for the Chi-Square Test

• The chi-square test is an approximate method that becomes more accurate as the counts in the cells of the table get larger. Therefore, it is important to check that the counts are large enough to result in a trustworthy P-value. Fortunately, the chi-square approximation is accurate for very modest counts.

You can safely use the chi-square test with critical values from the chi-square distribution when no more than 20% of the expected counts are less than 5 and all individual expected counts are 1 or greater. In particular, all four expected counts in a 2 × 2 table should be 5 or greater.
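
This rule can be written as a small check on the table of expected counts. The sketch below is one way to encode it in Python; the function name `counts_large_enough` is illustrative, and the example tables other than the first are made up purely to exercise the rule.

```python
def counts_large_enough(expected):
    """Apply the cell-count rule to a table of expected counts:
    2 x 2 tables need every expected count to be 5 or greater;
    larger tables need all counts >= 1 and at most 20% of them below 5."""
    cells = [e for row in expected for e in row]
    is_two_by_two = len(expected) == 2 and all(len(row) == 2 for row in expected)
    if is_two_by_two:
        return all(e >= 5 for e in cells)
    share_below_five = sum(e < 5 for e in cells) / len(cells)
    return min(cells) >= 1 and share_below_five <= 0.20


# Expected counts 27 and 18 are from the notes; 33 and 22 are inferred.
print(counts_large_enough([[27.0, 33.0], [18.0, 22.0]]))   # True

# Made-up 2 x 3 tables: one small cell out of six passes, two do not.
print(counts_large_enough([[6, 6, 4], [6, 6, 6]]))         # True
print(counts_large_enough([[4, 4, 6], [6, 6, 6]]))         # False
```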