
Graham Hole, January 2006

Using SPSS to perform Chi-Square tests:

This handout explains how to perform the two types of Chi-Square test that were discussed in the lecture on Chi-Square last term: the Chi-Square Goodness of Fit test, and the Chi-Square test of association between two variables. (See the "Chi-Square test" on my website, www.sussex.ac.uk/Users/grahamh/teaching06, for more information on the Chi-Square test and how to calculate it by hand.)

1. The Chi-Square Goodness of Fit test:

The most common use of this test is to see whether or not a number of categories have occurred equally frequently. The example used in the lecture last term was shoppers' preference for various soap-powder names. Suppose each shopper is given a list of four soap-powder names ("Kostik", "Smelloff", "Noscum" and "Grungefree") and asked to pick the one they like best. Our data consist of how many people pick each soap-powder; in other words, each person falls into one (and only one) of four categories. If names are chosen at random (i.e. shoppers show no consistent preference for one soap-powder over any other), then similar numbers of shoppers will choose each of the four names. However, if there are any consistent preferences for soap-powder names, then one or more names will be chosen more often than the others. The Chi-Square Goodness of Fit test enables us to see whether the observed pattern of frequencies, obtained from our data, differs significantly from the frequencies we would expect to get by chance (i.e. all categories having roughly similar frequencies).

Data entry:

There are two quite different ways of entering data into SPSS in order to perform a Chi-Square test.

(a) Using Chi-Square on raw data:

Using Chi-Square is most straightforward when you have all of the raw data. You produce one column that tells SPSS what each participant's choice was. In this example, I've got 24 shoppers. I've used "1" as a code to represent "Kostik", "2" as a code for "Smelloff", "3" for "Noscum" and "4" for "Grungefree".

In line with the conventions for SPSS, each row represents one participant. Thus the first entry in the "soappowder" column is one participant's choice: "1", or "Kostik". The second entry is another participant's choice (also "Kostik") and so on. (I've not bothered to do it here, but you could go to "Variable view", and use "Values" to assign the name of each soap-powder to its particular code-number; this makes life easier when it comes to looking at the SPSS output.)

To perform the Chi-Square Goodness of Fit test, go to "Analyze"; select "Nonparametric tests"; and then click on "Chi-Square".

A dialog box appears. Click on the name of the variable containing the data, and then click on the arrow to move this name to the "Test Variable List" box. Then click on "OK" to run the Chi-Square test.
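To make it clearer what the test works out from the raw data, here is a minimal Python sketch (it is not SPSS, and not necessarily how SPSS does it internally). It assumes the 24 raw codes tally to the totals used in this handout (12, 4, 6 and 2); the order of the entries is invented, and the names chi_square_gof and chi2_sf_df3 are hypothetical helpers of my own. It tallies the codes, compares each tally with the equal frequency expected by chance (24 / 4 = 6), and sums (observed - expected) squared over expected.

```python
from collections import Counter
import math

# Hypothetical raw data: one code per shopper (1 = Kostik, 2 = Smelloff,
# 3 = Noscum, 4 = Grungefree). Only the totals (12, 4, 6, 2) come from the
# handout; the order of the entries is invented for illustration.
soappowder = [1] * 12 + [2] * 4 + [3] * 6 + [4] * 2


def chi_square_gof(codes, categories):
    """Goodness of fit against equal expected frequencies."""
    counts = Counter(codes)
    observed = [counts[c] for c in categories]
    expected = len(codes) / len(categories)
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    df = len(categories) - 1
    return observed, expected, statistic, df


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))


observed, expected, statistic, df = chi_square_gof(soappowder, [1, 2, 3, 4])
p_value = chi2_sf_df3(statistic)
print(observed, expected, round(statistic, 3), df, round(p_value, 3))
```

With these totals the statistic is 56 / 6, or about 9.33, on 3 degrees of freedom. The closed-form tail probability used here only holds for 3 degrees of freedom, so it would need replacing for a different number of categories.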

The second way is to enter only the total for each category. Here, if we simply type in the number of occurrences for each category (i.e. the number of shoppers picking each soap-powder name), SPSS will think we have four participants, scoring 12, 4, 6 and 2 respectively! We need to force SPSS to treat these values as frequencies for our four categories.

Here's how to do this.

  1. You need two columns. One contains the frequency with which each category occurred. The other gives the category identifiers (1 to 4 as before, representing the names of the four soap-powders).
  2. Click on "Data", and then click on "Weight cases" at the bottom of the menu that appears. A dialog box appears.
  3. Click on "Weight cases". Then put the variable that contains the frequency data (the number of occurrences of each category) into the "Frequency Variable" box. Then click "OK". SPSS will now treat the numbers in the "frequency" column as the totals for the categories identified in the "soappowder" column. (In other words, by using the "Weight cases" option, we have fooled SPSS into thinking that we have typed in "Kostik", "Smelloff", "Noscum" and "Grungefree" 12, 4, 6 and 2 times respectively.)
  4. Now run the Chi-Square analysis as before, using the "frequency" column in the "Test Variable List" box. You should get exactly the same results for the Chi-Square test as you did when using method (a).
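If the idea of weighting is hard to picture, this short Python sketch may help. It is only a conceptual illustration: SPSS does not necessarily copy rows internally, and weight_cases is a hypothetical helper name of my own. It repeats each category code as many times as its frequency says, which gives back a raw-data column like the one used in method (a).

```python
from collections import Counter

# Two columns, as in step 1: category code and its frequency.
# Values are the handout's totals (12, 4, 6, 2 for codes 1 to 4).
soappowder = [1, 2, 3, 4]
frequency = [12, 4, 6, 2]


def weight_cases(codes, weights):
    """Repeat each code `weight` times, mimicking what weighting implies."""
    expanded = []
    for code, weight in zip(codes, weights):
        expanded.extend([code] * weight)
    return expanded


cases = weight_cases(soappowder, frequency)
tally = Counter(cases)
print(len(cases), dict(tally))
```

The expanded column has 24 entries, one per shopper, and tallying it returns the original four totals. That is why the two methods of data entry give the same Chi-Square result.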

2. The Chi-Square test of association between two independent variables:

This is the most common use of Chi-Square in psychology. We have categorical data for two independent variables and we want to see if there is some relationship between them. As with the Goodness of Fit test, there are two ways of entering the data, depending on whether you enter each participant separately, or want to use the summary frequency with which each category occurred.

(a) Using Chi-Square on raw data:

Suppose we want to see if there is an association between brand of anti-dandruff shampoo ("Noflakes" and "Head and Shudders") and hair loss (totally bald versus no hair loss). In this case, we would have two columns. One would give the brand of shampoo that a participant used (coded "1" for "Noflakes" or "2" for "Head and Shudders") and the other would give the same participant's state of hairiness (coded "1" for "bald" or "2" for "full head of hair"). If there is no association between hair loss and shampoo brand, we would expect to see as many slapheads using "Noflakes" as using "Head and

Move one of the variable names from the left-hand box to the box entitled "Row(s)" and the other variable name to the box named "Column(s)". Then click on "Statistics".

A dialog box will appear. Click on the little box next to "Chi-Square" to select it. (There are another dozen tests here, but you can ignore them at present!) Then click on "Continue" to get back to the previous dialog box.

Click on "Cells..." and make sure that there are ticks in the boxes next to "Observed" and "Expected", so that SPSS will show you both the observed and expected frequencies for each combination of the two variables. Click on "Continue" to get back to the previous dialog box.

Finally click on "OK" to get the results of the analysis. The bits that you want are the table of observed and expected frequencies, and the results of the Chi-Square test. The table shows us that 17 people used "Noflakes" and 13 used "Head and Shudders". It also shows us the observed frequencies (how many users of each shampoo actually were bald and how many were actually hairy) and the expected frequencies (how many bald and hairy users of each shampoo we would expect to get if baldness and shampoo choice had nothing to do with each other). Hopefully you can see that the observed and expected frequencies are rather different from each other.

hairloss * shampoo Crosstabulation

                                      shampoo
hairloss                    Noflakes   Head and Shudders   Total
bald      Count                 12             3              15
          Expected Count        8.5           6.5            15.0
hairy     Count                  5            10              15
          Expected Count        8.5           6.5            15.0
Total     Count                 17            13              30
          Expected Count       17.0          13.0            30.0
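The expected counts in the table can be checked by hand: each one is the row total times the column total, divided by the overall total (for example, 15 x 17 / 30 = 8.5). The Python sketch below does this for the observed counts above and then works out the uncorrected Pearson Chi-Square statistic. It is an illustration only, not SPSS output, and expected_counts and pearson_chi_square are hypothetical helper names.

```python
import math

# Observed counts from the crosstabulation: rows = bald, hairy;
# columns = Noflakes, Head and Shudders.
observed = [[12, 3],
            [5, 10]]


def expected_counts(table):
    """Expected count for each cell = row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Uncorrected Pearson chi-square statistic and degrees of freedom."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


expected = expected_counts(observed)
statistic, df = pearson_chi_square(observed)
# With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(statistic / 2))
print(expected, round(statistic, 3), df, round(p_value, 4))
```

The expected counts come out as 8.5 and 6.5 in both rows, matching the table, and the statistic is about 6.65 on 1 degree of freedom. The shortcut for the probability only applies to a 2 x 2 table.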

In the other two columns, I've used "1" and "2" to tell SPSS which combination of conditions each frequency refers to. They are the same codes as I used before: "1" for "Noflakes" or "2" for "Head and Shudders", and "1" for "bald" or "2" for "full head of hair". Thus a "1" for "hairiness" and "1" for "shampooh" means "bald Noflakes users"; "1" for "hairiness" and "2" for "shampooh" means "bald Head and Shudders users"; and so on. Replacing the variable labels with words hopefully makes this clearer:

The next step is to weight the cases. Click on "Data", and then on "Weight cases". In the dialog box that appears, move "frequency" into the "Frequency Variable" box. Then click on "OK".

Now perform the Chi-Square analysis as before - you should get the same results as when you used all the raw data.
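To see why the weighted data give the same answer, here is a small Python sketch of the idea, not of SPSS itself. It assumes the four frequencies are the observed counts from the crosstabulation shown earlier (12, 3, 5 and 10), and the names summary_rows and raw_cases are my own. Each summary row is repeated as many times as its frequency, and the resulting raw cases are then counted back into a 2 x 2 table.

```python
from collections import Counter

# One row per combination: (hairiness code, shampoo code, frequency).
# Codes follow the handout: hairiness 1 = bald, 2 = full head of hair;
# shampoo 1 = Noflakes, 2 = Head and Shudders. The frequencies are assumed
# to be the observed counts from the crosstabulation shown earlier.
summary_rows = [(1, 1, 12), (1, 2, 3), (2, 1, 5), (2, 2, 10)]

# Expanding each row `frequency` times gives the equivalent raw data.
raw_cases = [
    (hairiness, shampoo)
    for hairiness, shampoo, frequency in summary_rows
    for _ in range(frequency)
]

# Cross-tabulate the raw cases back into a 2 x 2 table of counts.
counts = Counter(raw_cases)
table = [[counts[(h, s)] for s in (1, 2)] for h in (1, 2)]
print(len(raw_cases), table)
```

The expansion yields 30 cases, one per participant, and the rebuilt table has the same four cell counts as the summary rows, so the Chi-Square test sees identical data either way.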

Reassurance for the cognitively challenged: If you find the "weight cases" business confusing, don't worry - so do many other people (including me)! It's not exactly intuitive, but you can always resort to using the more straightforward method of data entry, as long as you don't mind typing in the data.