

























































Rerun the analysis in this chapter using principal component analysis and compare the results to those in the chapter. (Set the iterations to convergence to 30.)
Access the main dialog box (Figure 1) from the SPSS menus. Select the variables you want to include in the analysis (remember to exclude any variables that were identified as problematic during the data screening) and transfer them to the box labelled Variables.
Figure 1: Main dialog box for factor analysis
There are several options available, the first of which opens the dialog box in Figure 2. The Univariate descriptives option provides means and standard deviations for each variable. Most of the other options relate to the correlation matrix of variables (the R-matrix). The Coefficients option produces the R-matrix, and selecting the Significance levels option will include the significance value of each correlation in the R-matrix. You can also ask for the Determinant of this matrix, and this option is useful for testing for multicollinearity or singularity. KMO and Bartlett’s test of sphericity produces the Kaiser–Meyer–Olkin measure of sampling adequacy and Bartlett’s test. We have already stumbled across KMO and Bartlett’s test and have seen the various criteria for adequacy, but with a sample of 2571 we shouldn’t have cause to worry.
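To make the Determinant and Bartlett options less of a black box, here is a minimal sketch in Python of what they compute. It assumes NumPy is available and uses the standard chi-square approximation for Bartlett's test; the function name `bartlett_sphericity` and the 3-variable matrix `R_toy` are made up for illustration and are not the chapter's data or SPSS's own code.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Return (chi_square, df) for Bartlett's test of sphericity.

    R is a p x p correlation matrix and n is the number of cases.
    Uses the usual approximation: -(n - 1 - (2p + 5) / 6) * ln|R|.
    """
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    # slogdet returns the sign and the log of |det|, which is more stable
    # than taking the log of a tiny determinant directly.
    _sign, log_det = np.linalg.slogdet(R)
    chi_square = -(n - 1 - (2 * p + 5) / 6) * log_det
    df = p * (p - 1) // 2
    return chi_square, df

# Hypothetical correlation matrix for three variables (illustration only).
R_toy = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])

# A determinant close to zero would signal multicollinearity or singularity.
determinant = np.linalg.det(R_toy)

# n = 2571 is the sample size quoted in the text.
chi_square, df = bartlett_sphericity(R_toy, n=2571)
print(determinant, chi_square, df)
```

An identity matrix has a determinant of 1, so its log is 0 and the test statistic is 0: that is the null hypothesis Bartlett's test is checking against.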
The Reproduced option produces a correlation matrix based on the model (rather than the real data). Differences between the matrix based on the model and the matrix based on the observed data indicate the residuals of the model. SPSS produces these residuals in the lower table of the reproduced matrix, and we want relatively few of these values to be greater than .05. Luckily, to save us scanning this matrix, SPSS produces a summary of how many residuals lie above .05. The Reproduced option should be selected to obtain this summary. The Anti-image option produces an anti-image matrix of covariances and correlations. These matrices contain measures of sampling adequacy for each variable along the diagonal and the negatives of the partial correlation/covariances on the off-diagonals. The diagonal elements, like the KMO measure, should all be greater than 0.5 at a bare minimum if the sample is adequate for a given pair of variables. If any pair of variables has a value less than this, consider dropping one of them from the analysis. The off-diagonal elements should all be very small (close to zero) in a good model. When you have finished with this dialog box, return to the main dialog box.
Figure 2: Descriptives in factor analysis
To access the extraction dialog box (Figure 3), click on the relevant button in the main dialog box. There are several ways of conducting a factor analysis, and when and where you use the various methods will depend on numerous things. For our purposes we will use principal component analysis which, strictly speaking, isn’t factor analysis; however, the two procedures may often yield similar results. In the Analyze box there are two options: to analyse the Correlation matrix or to analyse the Covariance matrix. The Display box has two options within it: to display the Unrotated factor solution and a Scree plot. The scree plot is a useful way of establishing how many factors should be retained in an analysis.
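The residual summary described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not SPSS's implementation: the reproduced matrix is taken to be the loadings multiplied by their transpose, and both the correlation matrix `R_toy` and the one-factor `loadings_toy` are hypothetical values chosen so the arithmetic is easy to follow.

```python
import numpy as np

def residual_summary(R, loadings, threshold=0.05):
    """Compare observed correlations with those implied by the loadings.

    Returns the reproduced matrix, the residual matrix, and the count and
    proportion of off-diagonal residuals whose absolute value exceeds
    the threshold.
    """
    R = np.asarray(R, dtype=float)
    L = np.asarray(loadings, dtype=float)
    reproduced = L @ L.T               # diagonal holds the communalities
    residuals = R - reproduced
    upper = np.triu_indices_from(R, k=1)   # each pair counted once
    off_diag = residuals[upper]
    n_large = int(np.sum(np.abs(off_diag) > threshold))
    return reproduced, residuals, n_large, n_large / off_diag.size

# Hypothetical observed correlations and one-factor loadings.
R_toy = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])
loadings_toy = np.array([[0.8], [0.6], [0.5]])

reproduced, residuals, n_large, proportion = residual_summary(R_toy, loadings_toy)
print(n_large, proportion)
```

In this toy case two of the three residuals are larger than .05, which would be a poor fit; in a real analysis we want that proportion to be small.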
The unrotated factor solution is useful in assessing the improvement of interpretation due to rotation. If the rotated solution is little better than the unrotated solution then it is possible that an inappropriate (or less optimal) rotation method has been used.
Figure 4: Factor Analysis: Rotation dialog box
A final option is to set the Maximum Iterations for Convergence, which specifies the number of times that the computer will search for an optimal solution. In most circumstances the default of 25 is more than adequate for SPSS to find a solution for a given data set. However, if you have a large data set (like we have here) then the computer might have difficulty finding a solution (especially for oblique rotation). To allow for the large data set we are using, change the value to 30. The factor scores dialog box (Figure 5) can be accessed from the main dialog box. This option allows you to save factor scores for each case in the data editor. SPSS creates a new column for each factor extracted and then places the factor score for each case within that column. These scores can then be used for further analysis, or simply to identify groups of participants who score highly on particular factors. There are three methods of obtaining these scores. If you want to ensure that factor scores are uncorrelated then select the Anderson-Rubin method; if correlations between factor scores are acceptable then choose the Regression method. As a final option, you can ask SPSS to produce the factor score coefficient matrix.
Figure 5: Factor scores dialog box
The final two options relate to how coefficients are displayed. By default SPSS will list variables in the order in which they are entered into the data editor. Usually, this format is most convenient. However, when interpreting factors it is sometimes useful to list variables by size. If you select Sorted by size, SPSS will order the variables by their factor loadings. In fact, it does this sorting fairly intelligently so that all of the variables that load highly onto the same factor are displayed together. The second option is to Suppress small coefficients: Absolute value below a specified value (by default .1). This option ensures that factor loadings within ± .1 are not displayed in the output. Again, this option is useful for assisting in interpretation. The default value is probably sensible, but on your first analysis I recommend changing it either to .4 (for interpretation purposes) or to a value reflecting the expected value of a significant factor loading given the sample size. This will make interpretation simpler. You can, if you like, rerun the analysis and set this value lower just to check you haven’t missed anything (like a loading of .39). For this example set the value at .4.
Figure 6: Factor analysis options dialog box
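If it helps to see what these two display options do to a loading matrix, the sketch below imitates them in Python. It is only a rough imitation and an assumption on my part about the ordering rule (group variables by the factor they load on most strongly, then sort by size within each group); the function `format_loadings` and the four example variables are hypothetical.

```python
import numpy as np

def format_loadings(names, loadings, cutoff=0.4):
    """Sort variables by size and blank out loadings below the cutoff."""
    L = np.asarray(loadings, dtype=float)
    dominant = np.argmax(np.abs(L), axis=1)   # factor with the largest loading
    strength = np.max(np.abs(L), axis=1)      # size of that loading
    # Group by dominant factor, then largest loading first within a group.
    order = sorted(range(len(names)),
                   key=lambda i: (int(dominant[i]), -float(strength[i])))
    rows = []
    for i in order:
        # Loadings whose absolute value is below the cutoff are left blank.
        cells = [f"{v:.3f}" if abs(v) >= cutoff else "" for v in L[i]]
        rows.append((names[i], cells))
    return rows

# Hypothetical loadings for four variables on two factors.
names = ["Q1", "Q2", "Q3", "Q4"]
loadings = [[0.30, 0.65],
            [0.72, 0.10],
            [0.45, 0.41],
            [0.05, 0.80]]

rows = format_loadings(names, loadings)
for name, cells in rows:
    print(f"{name:<4}" + "".join(f"{c:>8}" for c in cells))
```

Lowering `cutoff` (say to .3) and rerunning is the equivalent of the check suggested above for loadings that only just miss the criterion.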
Select the same options as I have in the screenshots and run a PCA with orthogonal rotation. Repeat, but using direct oblimin rotation. For the purposes of saving space in this section I set the default SPSS options such that each variable is referred to only by its label on the data editor (e.g., Q12). On the output you obtain, you should find that SPSS uses the value label (the question itself) in all of the output. When using the output in this chapter just remember that Q1 represents question 1, Q2 represents question 2 and Q17 represents question 17. The first body of output concerns data screening, assumption testing and sampling adequacy. You’ll find several large tables (or matrices) that tell us interesting things about our data. If you selected the Univariate descriptives option in Figure 2 then the first table will contain descriptive statistics for each variable (the mean, standard deviation and number of cases). This table is not included here, but you should have enough experience to be able to interpret it. The table also includes the number of missing cases; this summary is a useful way to determine the extent of missing data. The top half of the R-matrix (or correlation matrix) shows the Pearson correlation coefficient between all pairs of questions, whereas the bottom half contains the one-tailed
five variables). The anti-image correlation and covariance matrices provide similar information (remember the relationship between covariance and correlation) and so only
For the KMO statistic Kaiser (1974) recommends a bare minimum of .5 and that values between .5 and .7 are mediocre, values between .7 and .8 are good, values between .8 and .9 are great and values above .9 are superb (Hutcheson & Sofroniou, 1999). For these data the value is .93, which falls into the range of being superb, so we should be confident that
Output: KMO and Bartlett’s Test (columns: Kaiser-Meyer-Olkin Measure of Sampling Adequacy; Approx. Chi-Square; df; Sig.)
I mentioned that KMO can be calculated for multiple and individual variables. The KMO values for individual variables are produced on the diagonal of the anti-image correlation matrix (I have highlighted these cells). These values make the anti-image correlation matrix an extremely important part of the output (although the anti-image covariance matrix can be ignored). As well as checking the overall KMO statistic, it is important to examine the diagonal elements of the anti-image correlation matrix: the value should be above the bare minimum of .5 for all variables (and preferably higher). For these data all values are well above .5, which is good news! If you find any variables with values below .5 then you should consider excluding them from the analysis (or run the analysis with and without them and note the difference). Removal of a variable affects the KMO statistics, so if you do remove a variable be sure to re-examine the new anti-image correlation matrix. As for the rest of the anti-image correlation matrix, the off-diagonal elements represent the partial correlations between variables. For a good factor analysis we want these correlations to be very small (the smaller, the better). So, as a final check, you can just look through to see that the off-diagonal elements are small (they should be for these data). Bartlett’s measure tests the null hypothesis that the original correlation matrix is an identity matrix. A significant test tells us that the R-matrix is not an identity matrix; therefore, there are some relationships between the variables we hope to include in the analysis. For these data, Bartlett’s test is highly significant (p < .001); it usually is. The first part of the factor extraction process is to determine the linear components within the data set (the eigenvectors) by calculating the eigenvalues of the R-matrix.
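For readers who want to see where the overall and per-variable KMO values come from, the following Python sketch computes both from a correlation matrix. It assumes the usual definition (squared correlations divided by squared correlations plus squared partial correlations, with partial correlations obtained from the inverse of the R-matrix). The function name `kmo` and the example matrices are hypothetical, not the chapter's data.

```python
import numpy as np

def kmo(R):
    """Return (overall KMO, per-variable KMO) for a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    S = np.linalg.inv(R)
    d = np.sqrt(np.diag(S))
    # Partial correlation between each pair, controlling for all the others.
    partial = -S / np.outer(d, d)
    off = ~np.eye(R.shape[0], dtype=bool)      # mask for off-diagonal cells
    r2 = np.where(off, R ** 2, 0.0)
    p2 = np.where(off, partial ** 2, 0.0)
    per_variable = r2.sum(axis=0) / (r2.sum(axis=0) + p2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + p2.sum())
    return overall, per_variable

# Hypothetical correlation matrix for three variables.
R_toy = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])

overall, per_variable = kmo(R_toy)
print(overall, per_variable)
```

A handy sanity check: with only two variables the partial correlation equals the ordinary correlation, so KMO comes out at exactly .5, the bare minimum mentioned above.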
We know that there are as many components (eigenvectors) in the R-matrix as there are variables, but most will be unimportant. To determine the importance of a particular vector we look at the magnitude of the associated eigenvalue. We can then apply criteria to determine which factors to retain and which to discard. By default SPSS uses Kaiser’s criterion of retaining factors with eigenvalues greater than 1 (see Figure 3). Output lists the eigenvalues associated with each linear component (factor) before extraction, after extraction and after rotation. Before extraction, SPSS has identified 23 linear components within the data set (we know that there should be as many eigenvectors as there are variables and so there will be as many factors as variables). The eigenvalues associated with each factor represent the variance explained by that particular linear component, and SPSS also displays the eigenvalue in terms of the percentage of variance explained (so factor 1 explains 31.696% of total variance). It should be clear that the first few factors explain relatively large amounts of variance (especially factor 1), whereas subsequent factors explain only small amounts of variance. SPSS then extracts all factors with eigenvalues greater than 1, which leaves us with four factors. The eigenvalues associated with these factors are again displayed (and the percentage of variance explained) in the columns labelled Extraction Sums of Squared Loadings. The values in this part of the table are the same as the values before extraction, except that the values for the discarded factors are ignored (hence, the table is blank after the fourth factor). In the final part of the table (labelled Rotation Sums of Squared Loadings), the eigenvalues of the factors after rotation are displayed. Rotation has the effect of optimizing the factor structure, and one consequence for these data is that the relative importance of the four factors is equalized.
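The link between eigenvalues, percentage of variance and Kaiser's criterion can be sketched in Python as follows. This is a minimal illustration assuming NumPy and a hypothetical 3-variable correlation matrix (`R_toy`), not the 23-item data set. Because the eigenvalues of a correlation matrix sum to the number of variables, each percentage is just the eigenvalue divided by that number; by the same logic, 31.696% of 23 variables implies an eigenvalue of roughly 7.29 for factor 1.

```python
import numpy as np

def eigen_summary(R):
    """Return eigenvalues (largest first), % variance, and Kaiser's count."""
    R = np.asarray(R, dtype=float)
    # eigvalsh is for symmetric matrices and returns ascending order.
    eigenvalues = np.linalg.eigvalsh(R)[::-1]
    percent = 100 * eigenvalues / R.shape[0]
    retained = int(np.sum(eigenvalues > 1.0))   # Kaiser's criterion
    return eigenvalues, percent, retained

# Hypothetical correlation matrix for three variables.
R_toy = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])

eigenvalues, percent, retained = eigen_summary(R_toy)
print(eigenvalues, percent, retained)

# Worked check against the figure quoted in the text.
implied_eigenvalue = 0.31696 * 23
print(round(implied_eigenvalue, 2))   # 7.29
```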
loadings of each variable onto each factor. By default SPSS displays all loadings; however, we requested that all loadings less than .4 be suppressed in the output (see Figure 6) and so there are blank spaces for many of the loadings. This matrix is not particularly important for interpretation, but it is interesting to note that before rotation most variables load highly onto the first factor. At this stage SPSS has extracted four factors. Factor analysis is an exploratory tool and so it should be used to guide the researcher to make various decisions: you shouldn’t leave the computer to make them. One important decision is the number of factors to extract. By Kaiser’s criterion we should extract four factors and this is what SPSS has done. However, this criterion is accurate when there are fewer than 30 variables and communalities after extraction are greater than .7 or when the sample size exceeds 250 and the average communality is greater than .6. The communalities are shown in Output, and only one exceeds .7. The average of the communalities can be found by adding them up and dividing by the number of communalities (11.573/23 = .503). So, on both grounds Kaiser’s rule may not be accurate. However, you should consider the huge sample that we have, because the factors, but there is little to recommend this criterion over Kaiser’s. As a final guide we can use the scree plot which we asked SPSS to produce by using the option in Figure 3. The scree plot is shown in Output. This curve is difficult to interpret because it begins to tail off after three factors, but there is another drop after four factors before a stable plateau is reached. Therefore, we could probably justify retaining either two or four factors. Given the large sample, it is probably safe to assume Kaiser’s criterion; however, you might like to rerun the analysis specifying that SPSS extract only two factors (see Figure 3) and compare the results.
Output 8
Output shows an edited version of the reproduced correlation matrix. The top half of this matrix (labelled Reproduced Correlations) contains the correlation coefficients between all of the questions based on the factor model. The diagonal of this matrix contains the communalities after extraction for each variable.
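Since the communalities feed directly into the decision about Kaiser's criterion, it may help to write the check out explicitly. The short Python sketch below uses only numbers quoted in the text (23 variables, communalities summing to 11.573, a sample of 2571, and the statement that only one communality exceeds .7); the variable names are my own.

```python
n_variables = 23
sum_of_communalities = 11.573    # total quoted in the text
sample_size = 2571               # sample size quoted in the text

average_communality = sum_of_communalities / n_variables

# Condition A: fewer than 30 variables AND all communalities above .7.
# The text reports that only one communality exceeds .7, so this is False.
all_communalities_above_07 = False
condition_a = n_variables < 30 and all_communalities_above_07

# Condition B: sample size above 250 AND average communality above .6.
condition_b = sample_size > 250 and average_communality > 0.6

print(round(average_communality, 3))   # 0.503
print(condition_a, condition_b)        # False False
```

Neither condition holds, which is why the number of factors is treated here as a judgement call rather than something to read straight off the eigenvalues.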
Output shows the rotated component matrix (also called the rotated factor matrix in factor analysis), which is a matrix of the factor loadings for each variable onto each factor. This matrix contains the same information as the component matrix, except that it is calculated after rotation. There are several things to consider about the format of this matrix. First, factor loadings less than .4 have not been displayed because we asked for these loadings to be suppressed using the option in Figure 6. If you didn’t select this option, or didn’t adjust the criterion value to .4, then your output will differ. Second, the variables are listed in the order of size of their factor loadings. By default, SPSS orders the variables as they are in the data editor; however, we asked for the output to be Sorted by size using the option in Figure
Output 8
The final part of the output is the factor transformation matrix. This matrix provides information about the degree to which the factors were rotated to obtain a solution. If no rotation were necessary, this matrix would be an identity matrix. If orthogonal rotation were completely appropriate then we would expect a symmetrical matrix (same values above and below the diagonal). In reality the matrix is not easy to interpret, although very asymmetrical matrices might be taken as a reason to try oblique rotation. If you are an inexperienced factor analyst, you are probably best advised to ignore the factor transformation matrix.
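To get a feel for what a transformation matrix does, here is a small two-factor sketch in Python. It is not the varimax algorithm: it simply rotates a set of hypothetical loadings by a chosen angle, on the assumption that an orthogonal rotation can be written as the unrotated loadings multiplied by an orthogonal matrix T. The function `rotate` and the loadings are made up for illustration.

```python
import numpy as np

def rotate(loadings, angle_degrees):
    """Rotate two-factor loadings by an angle; return (rotated, T)."""
    theta = np.radians(angle_degrees)
    T = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return np.asarray(loadings, dtype=float) @ T, T

# Hypothetical unrotated loadings for three variables on two factors.
unrotated = np.array([[0.7, 0.3],
                      [0.6, -0.4],
                      [0.5, 0.5]])

rotated, T = rotate(unrotated, 30)
_, T_none = rotate(unrotated, 0)

# No rotation at all gives an identity transformation matrix.
print(np.allclose(T_none, np.eye(2)))

# An orthogonal T satisfies T @ T.T = I ...
print(np.allclose(T @ T.T, np.eye(2)))

# ... and leaves each variable's communality (row sum of squares) unchanged.
communalities_before = (unrotated ** 2).sum(axis=1)
communalities_after = (rotated ** 2).sum(axis=1)
print(np.allclose(communalities_before, communalities_after))
```

The point of the sketch is only that rotation redistributes variance between factors without changing how much of each variable is explained.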
Rotated Component Matrix (loadings on components 1 to 4 not recoverable; variables listed in output order)
I have little experience of computers
SPSS always crashes when I try to use it
I worry that I will cause irreparable damage because of my incompetence with computers
All computers hate me
Computers have minds of their own and deliberately go wrong whenever I use them
Computers are useful only for playing games
Computers are out to get me
I can't sleep for thoughts of eigenvectors
I wake up under my duvet thinking that I am trapped under a normal distribution
Standard deviations excite me
People try to tell you that SPSS makes statistics easier to understand but it doesn't
I dream that Pearson is attacking me with correlation coefficients
I weep openly at the mention of central tendency
Statistics makes me cry
I don't understand statistics
I have never been good at mathematics
I slip into a coma whenever I see an equation
I did badly at mathematics at school
My friends are better at statistics than me
My friends are better at SPSS than I am
If I'm good at statistics my friends will think I'm a nerd
My friends will think I'm stupid for not being able to cope with SPSS
Everybody looks at me when I use SPSS
Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization. a. Rotation converged in 9 iterations.
Component Transformation Matrix (final column truncated)
Component      1       2       3       4
1            .635    .585    .443    -.
2            .137   -.168    .488    .
3            .758   -.513   -.403    .
4            .067    .605   -.635    .
Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.
Output 1
The final part of the output is a correlation matrix between the factors (Output). This matrix contains the correlation coefficients between factors. As predicted from the structure matrix, factor 2 has little or no relationship with any other factors (correlation coefficients are low), but all other factors are interrelated to some degree (notably factors 1 and 3 and factors 3 and 4). The fact that these correlations exist tells us that the constructs measured can be interrelated. If the constructs were independent then we would expect oblique rotation to provide an identical solution to an orthogonal rotation and the component correlation matrix should be an identity matrix (i.e., all factors have correlation coefficients of 0). Therefore, this final matrix gives us a guide to whether it is reasonable to assume independence between factors: for these data it appears that we cannot assume independence. Therefore, the results of the orthogonal rotation should not be trusted: the obliquely rotated solution is probably more meaningful. On a theoretical level the dependence between our factors does not cause concern; we might expect a fairly strong relationship between fear of maths, fear of statistics and fear of computers. Generally, the less mathematically and technically minded people struggle with statistics. However, we would not expect these constructs to correlate with fear of peer evaluation (because this construct is more socially based). In fact, this factor is the one that correlates fairly badly with all others, so on a theoretical level, things have turned out rather well!
Structure Matrix (loadings on components 1 to 4 not recoverable; variables listed in output order)
I wake up under my duvet thinking that I am trapped under a normal distribution
I can't sleep for thoughts of eigenvectors
Standard deviations excite me
I weep openly at the mention of central tendency
I dream that Pearson is attacking me with correlation coefficients
Statistics makes me cry
I don't understand statistics
My friends are better at SPSS than I am
My friends are better at statistics than me
If I'm good at statistics my friends will think I'm a nerd
My friends will think I'm stupid for not being able to cope with SPSS
Everybody looks at me when I use SPSS
I have little experience of computers
SPSS always crashes when I try to use it
All computers hate me
I worry that I will cause irreparable damage because of my incompetence with computers
Computers have minds of their own and deliberately go wrong whenever I use them
People try to tell you that SPSS makes statistics easier to understand but it doesn't
Computers are out to get me
Computers are useful only for playing games
I have never been good at mathematics
I slip into a coma whenever I see an equation
I did badly at mathematics at school
Extraction Method: Principal Component Analysis. Rotation Method: Oblimin with Kaiser Normalization.
Output 2
Having reached a suitable solution and rotated that solution, we can look at the factor scores. Output shows the component score matrix B from which the factor scores are calculated and the covariance matrix of factor scores. The component score matrix is not particularly useful in itself. It can be useful in understanding how the factor scores have been computed, but with large data sets like this one you are unlikely to want to delve into the mathematics behind the factor scores. However, the covariance matrix of scores is useful. This matrix in effect tells us the relationship between factor scores (it is an unstandardized correlation matrix). If factor scores are uncorrelated then this matrix should be an identity matrix (i.e., diagonal elements will be 1 but all other elements are 0). For these data the covariances are all zero, indicating that the resulting scores are uncorrelated.
Component Correlation Matrix
Component        1            2            3            4
1              1.000        -.154         .364        -.279
2              -.154        1.000        -.185    8.155E-02
3               .364        -.185        1.000        -.464
4              -.279    8.155E-02        -.464        1.000
Extraction Method: Principal Component Analysis. Rotation Method: Oblimin with Kaiser Normalization.
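The Component Correlation Matrix printed above can be compared with an identity matrix directly. The Python sketch below types in the printed values; the entries in the fourth column of rows 1 to 3 are cut off in the output as printed, so they are filled in here on the assumption that the matrix is symmetric (as any correlation matrix is). The variable names are my own.

```python
import numpy as np

# Component correlations from the oblimin solution.
# Assumption: the truncated fourth-column values are taken from row 4 by symmetry.
phi = np.array([
    [ 1.000,   -0.154,    0.364,   -0.279],
    [-0.154,    1.000,   -0.185,    0.08155],
    [ 0.364,   -0.185,    1.000,   -0.464],
    [-0.279,    0.08155, -0.464,    1.000],
])

# Independent factors would give an identity matrix, i.e. all zeros here.
off_diagonal = phi - np.eye(4)

largest = np.abs(off_diagonal).max()
per_factor_max = np.abs(off_diagonal).max(axis=1)
least_related = int(np.argmin(per_factor_max)) + 1   # 1-based factor number

print(largest)          # largest correlation between any two factors
print(per_factor_max)   # strongest correlation each factor has with another
print(least_related)    # factor with the weakest links to the others
```

The largest value (between factors 3 and 4) is far from zero, and factor 2 has the weakest links to the others, which matches the description of the oblique solution given earlier.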
Output 13
To sum up, the analyses revealed four underlying scales in our questionnaire that may, or may not, relate to genuine sub-components of SPSS anxiety. It also seems as though an obliquely rotated solution was preferred due to the interrelationships between factors. The use of factor analysis is purely exploratory; it should be used only to guide future hypotheses, or to inform researchers about patterns within data sets. A great many decisions are left to the researcher using factor analysis, and I urge you to make informed decisions, rather than basing decisions on the outcomes you would like to get. The next question is whether or not our scale is reliable.
Task 2
The University of Sussex constantly seeks to employ the best people possible as lecturers. They wanted to revise the ‘Teaching of Statistics for Scientific Experiments’ (TOSSE) questionnaire, which is based on Bland’s theory that says that good research methods lecturers should have: (1) a profound love of statistics; (2) an enthusiasm for experimental design; (3) a love of teaching; and (4) a complete absence of normal interpersonal skills. These characteristics should be related (i.e., correlated). The university revised this questionnaire to become the ‘Teaching of Statistics for Scientific Experiments – Revised’ (TOSSE-R). They gave this questionnaire to 239 research methods lecturers around the world to see if it supported Bland’s theory. The data are in TOSSE-R.sav. Conduct principal axis factoring analysis (with appropriate rotation) and interpret the factor structure.
Figure 7: The TOSSE-R
Teaching of Statistics for Scientific Experiments – Revised (TOSSE-R)
Response scale: SD, D, N, A, SA
I once woke up in a vegetable patch hugging a turnip that I'd mistakenly dug up thinking it was Roy's largest root
If I had a big gun I'd shoot all the students I have to teach
I memorise probability values for the F-distribution
I worship at the shrine of Pearson
I still live with my mother and have little personal hygiene
Teaching others makes me want to swallow a large bottle of bleach because the pain of my burning oesophagus would be light relief in comparison
I like control conditions
Helping others to understand sums of squares is a great feeling
I could spend all day explaining statistics to people
I calculate 3 ANOVAs in my head before getting out of bed
People fall asleep as soon as I open my mouth to speak
I like it when I've helped people to understand factor rotation
Designing experiments is fun
I'd rather think about appropriate dependent variables than go to the pub
I soil my pants with excitement at the mere mention of Factor Analysis
Thinking about whether to use repeated- or independent-measures thrills me
I enjoy sitting in the park contemplating whether to use participant observation in my next experiment
Standing in front of 300 people in no way makes me lose control of my bowels
I like to help students
Passing on knowledge is the greatest gift you can bestow an individual
Thinking about Bonferroni corrections gives me a tingly feeling in my groin
I quiver with excitement when thinking about designing my next experiment
I often spend my spare time talking to the pigeons ... and even they die of boredom
I tried to build myself a time machine so that I could go back to the 1930s and follow Fisher around on my hands and knees licking the floor on which he'd just trodden
I love teaching
I spend lots of time helping students
I love teaching because students have to pretend to like me or they'll get bad marks
My cat is my only friend