






This handout provides instructions on how to use SPSS to perform two types of Chi-Square tests: the Chi-Square Goodness of Fit test and the Chi-Square test of association between two variables. It covers the concept of Chi-Square tests, how to enter data into SPSS, and how to interpret the output.







This handout explains how to perform the two types of Chi-Square test that were discussed in the lecture on Chi-Square last term: the Chi-Square Goodness of Fit test, and the Chi-Square test of association between two variables. (See the "Chi-Square test" on my website, www.sussex.ac.uk/Users/grahamh/teaching06, for more information on the Chi-Square test and how to calculate it by hand).
The most common use of the Chi-Square Goodness of Fit test is to see whether or not instances of a number of categories have occurred equally frequently. The example used in the lecture last term was shoppers' preference for various soap-powder names. Suppose each shopper is given a list of four soap-powder names ("Kostik", "Smelloff", "Noscum" and "Grungefree") and asked to pick the one they like best. Our data consist of how many people pick each soap-powder; in other words, each person falls into one (and only one) of four categories. If names are chosen at random (i.e. shoppers show no consistent preference for one soap-powder over any other) then similar numbers of shoppers will choose each of the four names. However, if there are any consistent preferences for soap-powder names, then one or more names will be chosen more often than the others. The Chi-Square Goodness of Fit test enables us to see whether the observed pattern of frequencies, obtained from our data, differs significantly from the frequencies we would expect to get by chance (i.e. all categories having roughly similar frequencies).
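To make the logic concrete, here is a minimal Python sketch of the arithmetic behind the Goodness of Fit test. It is not what SPSS runs internally, just the same calculation done by hand. The counts (12, 4, 6 and 2) are the summary frequencies quoted later in this handout; which soap-powder each count belongs to is my assumption, and the critical value is the standard tabled one for 3 degrees of freedom at the .05 level.

```python
# Observed counts: the summary frequencies quoted later in this handout.
# Which name goes with which count is an assumption made for illustration.
observed = {"Kostik": 12, "Smelloff": 4, "Noscum": 6, "Grungefree": 2}

n = sum(observed.values())        # 24 shoppers in total
expected = n / len(observed)      # no preference: 24 / 4 = 6 per name

# Chi-Square = sum over categories of (observed - expected)^2 / expected
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())
df = len(observed) - 1            # degrees of freedom = categories - 1

CRITICAL_05_DF3 = 7.815           # tabled critical value for p = .05, 3 df

print(f"Chi-Square = {chi_square:.2f}, df = {df}")
if chi_square > CRITICAL_05_DF3:
    print("Observed frequencies differ significantly from chance (p < .05)")
else:
    print("No significant departure from chance at the .05 level")
```

With these counts the statistic works out at about 9.33 with 3 degrees of freedom, which is larger than 7.815, so the pattern of choices would be judged unlikely to have arisen by chance.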
Data entry: There are two quite different ways of entering data into SPSS in order to perform a Chi-Square test.
(a) Using Chi-Square on raw data: Using Chi-Square is most straightforward when you have all of the raw data. You produce one column that tells SPSS what each participant's choice was. In the example below, I've got 24 shoppers. I've used "1" as a code to represent "Kostik", "2" as a code for "Smelloff", "3" for "Noscum" and "4" for "Grungefree". In line with the conventions for SPSS, each row represents one participant. Thus the first entry in the "soappowder" column is one participant's choice - "1", or "Kostik". The second entry is another participant's choice (also "Kostik") and so on. (I've not bothered to do it here, but you could go to "Variable view", and use "Values" to assign the name of each soap-powder to its particular code-number; this makes life easier when it comes to looking at the SPSS output).
To perform the Chi-Square Goodness of Fit test, go to "Analyze"; select "Nonparametric tests"; and then click on "Chi-Square".
The following dialog box appears. Click on the name of the variable containing the data, and then click on the arrow to move this name to the "Test variable list" box. Then click on "OK" to run the Chi-Square test.
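If it helps to see what happens to a raw-data column, the Python sketch below tallies one code per shopper into a frequency for each soap-powder, which is the first thing any Chi-Square routine has to do. The column itself is hypothetical: I have simply built 24 codes that add up to the frequencies 12, 4, 6 and 2 used later in this handout, and the real data file need not be in this order.

```python
from collections import Counter

# Value labels, as you might assign them under "Values" in Variable view.
labels = {1: "Kostik", 2: "Smelloff", 3: "Noscum", 4: "Grungefree"}

# Hypothetical "soappowder" column: one code per shopper, 24 rows in all.
soappowder = [1] * 12 + [2] * 4 + [3] * 6 + [4] * 2

# Tally how many shoppers chose each code.
counts = Counter(soappowder)

for code in sorted(labels):
    print(f"{labels[code]:<11}{counts[code]:>3}")
print(f"{'Total':<11}{sum(counts.values()):>3}")
```

Attaching the labels to the codes here plays the same role as "Values" in SPSS: the output is easier to read than a table of bare 1s, 2s, 3s and 4s.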
Here, if we simply type in the number of occurrences for each category (i.e. the number of shoppers picking each soap-powder name), SPSS will think we have four participants, scoring 12, 4, 6 and 2 respectively! We need to force SPSS to treat these values as frequencies for our four categories.
Here's how to do this.
The Chi-Square test of association between two variables is the most common use of Chi-Square in psychology. We have categorical data for two independent variables and we want to see if there is some relationship between them. As with the Goodness of Fit test, there are two ways of entering the data, depending on whether you enter each participant separately, or want to use the summary frequency with which each category occurred.
(a) Using Chi-Square on raw data: Suppose we want to see if there is an association between brand of anti-dandruff shampoo ("Noflakes" and "Head and Shudders") and hair loss (totally bald versus no hair loss). In this case, we would have two columns. One would give the brand of shampoo that a participant used (coded "1" for "Noflakes" or "2" for "Head and Shudders") and the other would give the same participant's state of hairiness (coded "1" for "bald" or "2" for "full head of hair"). If there is no association between hair loss and shampoo brand, we would expect to see as many slapheads using "Noflakes" as using "Head and
Move one of the variable names from the left-hand box to the box entitled "Row(s)" and the other variable name to the box named "Column(s)". Then click on "Statistics".
The following dialog box will appear. Click on the little box next to "Chi-Square" to select it. (There are another dozen tests here, but you can ignore them at present!) Then click on "Continue" to get back to the previous dialog box.
Click on "Cells..." and make sure that there are ticks in the boxes next to "Observed" and "Expected", so that SPSS will show you both the observed and expected frequencies for each permutation of variables. Click on "Continue" to get back to the previous dialog box.
Finally click on "OK" to get the results of the analysis. The bits that you want are the table of observed and expected frequencies, and the results of the Chi-Square test. The table shows us that 17 people used "Noflakes" and 13 used "Head and Shudders". It also shows us the observed frequencies (how many users of each shampoo actually were bald and how many were actually hairy) and the expected frequencies (how many bald and hairy users of each shampoo we would expect to get if baldness and shampoo choice had nothing to do with each other). Hopefully you can see that the observed and expected frequencies are rather different from each other.
hairloss * shampoo Crosstabulation
                                         shampoo
                                   Noflakes   Head and Shudders   Total
hairloss   bald    Count              12              3             15
                   Expected Count      8.5            6.5           15.0
           hairy   Count               5             10             15
                   Expected Count      8.5            6.5           15.0
Total              Count              17             13             30
                   Expected Count     17.0           13.0           30.0
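To show where the expected counts in the crosstabulation come from, here is a minimal Python sketch that rebuilds them from the observed counts and then works out the Chi-Square statistic. It computes the plain (uncorrected) Pearson Chi-Square by hand; the variable names are mine, and the critical value is the standard tabled one for 1 degree of freedom at the .05 level.

```python
# Observed counts from the crosstabulation: rows = hairloss, columns = shampoo.
observed = {
    "bald":  {"Noflakes": 12, "Head and Shudders": 3},
    "hairy": {"Noflakes": 5,  "Head and Shudders": 10},
}

# Row totals, column totals and the grand total.
row_totals = {row: sum(cells.values()) for row, cells in observed.items()}
col_totals = {}
for cells in observed.values():
    for col, count in cells.items():
        col_totals[col] = col_totals.get(col, 0) + count
n = sum(row_totals.values())

# Expected count for a cell = (row total x column total) / grand total
expected = {
    row: {col: row_totals[row] * col_totals[col] / n for col in col_totals}
    for row in observed
}

# Chi-Square = sum over all cells of (observed - expected)^2 / expected
chi_square = sum(
    (observed[r][c] - expected[r][c]) ** 2 / expected[r][c]
    for r in observed
    for c in col_totals
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

CRITICAL_05_DF1 = 3.841   # tabled critical value for p = .05, 1 df

for r in observed:
    for c in col_totals:
        print(f"{r:<6}{c:<19}observed {observed[r][c]:>2}  expected {expected[r][c]:.1f}")
print(f"Pearson Chi-Square = {chi_square:.2f}, df = {df}")
```

The expected counts come out as 8.5 and 6.5 in each row, matching the table, and the Pearson Chi-Square is about 6.65 with 1 degree of freedom, which is above 3.841. For a 2 x 2 table SPSS also prints a continuity-corrected value, which will be smaller than this uncorrected figure.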
In the other two columns, I've used "1" and "2" to tell SPSS which permutation of conditions each frequency refers to. They are the same codes as I used before: "1" for "Noflakes" or "2" for "Head and Shudders", and "1" for "bald" or "2" for "full head of hair". Thus a "1" for "hairiness" and "1" for "shampooh" means "bald Noflakes users"; "1" for "hairiness" and "2" for "shampooh" means "bald Head and Shudders users"; and so on. Replacing the variable labels with words hopefully makes this clearer:
The next step is to weight the cases. Click on "Data", and then on "Weight cases". In the dialog box that appears, move "frequency" into the "frequency variable" box. Then click on "OK".
Now perform the Chi-Square analysis as before - you should get the same results as when you used all the raw data.
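If you want to convince yourself that weighting really is equivalent to typing in every participant, the Python sketch below may help. It treats each summary row as "repeat this combination of codes this many times", which is my informal reading of what "Weight cases" does rather than a description of SPSS's internals. The frequencies are the observed counts from the crosstabulation; the tuple layout is just an illustration.

```python
from collections import Counter

# Summary data: (hairiness code, shampooh code, frequency), one row per cell.
# hairiness: 1 = bald, 2 = full head of hair
# shampooh:  1 = Noflakes, 2 = Head and Shudders
summary = [(1, 1, 12), (1, 2, 3), (2, 1, 5), (2, 2, 10)]

# Weighting by frequency: repeat each combination of codes "frequency" times,
# giving one row per participant, exactly as in the raw-data method.
raw = [(hair, shampoo) for hair, shampoo, freq in summary for _ in range(freq)]

# Cross-tabulating the expanded rows recovers the original cell counts.
crosstab = Counter(raw)

print(f"Participants after weighting: {len(raw)}")
for (hair, shampoo), count in sorted(crosstab.items()):
    print(f"hairiness={hair} shampooh={shampoo} count={count}")
```

The four summary rows expand to 30 participants, and tallying them gives back 12, 3, 5 and 10, which is why the Chi-Square results are the same whichever way you enter the data.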
Reassurance for the cognitively challenged: If you find the "weight cases" business confusing, don't worry - so do many other people (including me)! It's not exactly intuitive, but you can always resort to using the more straightforward method of data entry, as long as you don't mind typing in the data.