



Material Type: Exam; Class: APPLIED MULTIVARIATE ANLY; Subject: STATISTICS; University: Texas A&M University; Term: Unknown 1989;




Eigenvalues

     Eigenvalue    Difference    Proportion    Cumulative
1    2.87331359    1.07665350    0.5747        0.
2    1.79666009    1.58182321    0.3593        0.
3    0.21483689    0.11490283    0.0430        0.
4    0.09993405    0.08467868    0.0200        0.
5    0.01525537                  0.0031        1.
Factor Pattern

              Factor1    Factor2
Population    0.58096    0.
School        0.76704    -0.
Employment    0.67243    0.
Services      0.93239    -0.
Value         0.79116    -0.
The sociologist reached the following conclusions:
(i) People tend to live where the jobs are plentiful.
(ii) People with higher education live in more expensive houses.
(iii) Over 93% of the variability in the data can be explained by two factors.
(iv) A single factor consisting of an average (possibly weighted average) of the five variables might account for much of the difference between the 12 areas.
Question #1.a: Comment on why you think these conclusions are appropriate using the above data to support your answers.
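Conclusion (iii) can be checked directly from the eigenvalue column. The following is a minimal sketch, assuming the five eigenvalues printed above are those of a 5 x 5 correlation matrix, so that they sum to 5; the variable names are illustrative only.

```python
# Eigenvalues copied from the Eigenvalues table above.
eigenvalues = [2.87331359, 1.79666009, 0.21483689, 0.09993405, 0.01525537]

# For a correlation matrix the eigenvalues sum to the number of variables.
total = sum(eigenvalues)

# Proportion of total variance carried by each factor.
proportions = [ev / total for ev in eigenvalues]

# Running (cumulative) proportion.
cumulative = []
running = 0.0
for p in proportions:
    running += p
    cumulative.append(running)

for i, (p, c) in enumerate(zip(proportions, cumulative), start=1):
    print(f"{i}  proportion={p:.4f}  cumulative={c:.4f}")
```

The first two cumulative proportions add up to roughly 0.934, which is the basis for the "over 93%" statement.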
In an attempt to clarify the interpretation, a VARIMAX rotation was performed (~35 degrees), yielding the following factor pattern.

Rotated Factor Pattern

              Factor1    Factor2
Population    0.01602    0.
School        0.94076    -0.
Employment    0.13702    0.
Services      0.82481    0.
HouseValue    0.96823    -0.
Question #1.b: Does this pattern clarify the underlying variables? Support your answer with a brief description of your observations.
The same data was analyzed using PROC PRINCOMP yielding a set of eigenvalues and eigenvectors.
Question #1.c: Recalling how the above factor analysis was obtained, what are the eigenvalues and eigenvectors for the principal components analysis?
NOTE: SINCE WE USED PRINCIPAL FACTORING WITH INITIAL COMMUNALITIES = 1 AND DID NOT ITERATE, THE EIGENVALUES IN THE PC ANALYSIS WOULD BE THE SAME, AND WE HAVE THE RELATION

$FACTOR_i = \sqrt{\lambda_i} \, EIGENVECTOR_i$

WHERE $\lambda_i$ IS THE iTH EIGENVALUE. THE ELEMENTS OF $FACTOR_i$ ARE THE CORRELATIONS OF THE DATA X WITH THE iTH FACTOR. RECALL THAT $FACTOR_i = L_i$ AS DEFINED IN THE PC DISCUSSION.
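The relation between factor loadings and eigenvectors stated in the note can be inverted to recover the first principal component eigenvector. This is a minimal sketch that uses only the Factor1 column and the first eigenvalue from the output above; the second column is not used because its values are truncated in this copy. The variable names are illustrative.

```python
import math

# Factor1 loadings and the first eigenvalue, copied from the output above.
factor1 = {
    "Population": 0.58096,
    "School": 0.76704,
    "Employment": 0.67243,
    "Services": 0.93239,
    "Value": 0.79116,
}
lambda1 = 2.87331359

# Invert FACTOR_i = sqrt(lambda_i) * EIGENVECTOR_i.
eigenvector1 = {name: loading / math.sqrt(lambda1)
                for name, loading in factor1.items()}

# Check 1: the squared loadings of a factor sum to its eigenvalue.
sum_sq_loadings = sum(v ** 2 for v in factor1.values())

# Check 2: the recovered eigenvector has unit length.
length = math.sqrt(sum(v ** 2 for v in eigenvector1.values()))

for name, value in eigenvector1.items():
    print(f"{name:<12}{value:.5f}")
```

Both checks agree with the printed eigenvalue to within rounding of the loadings, which supports reading the garbled relation as a square-root scaling.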
Classification Table

            Corn    Soybeans    Total
Corn        6       1           7
Soybeans    1       5           6
Total       7       6           13
(iv) What do you conclude from the CANDISC output?
In a further study, three crops, Clover, Cotton and Sugar beets were examined on these same four variables. PROC DISCRIM yields the following result.
Linear Discriminant Function for Crop

Variable    Clover    Cotton    Sugar beets
Constant    -6.0      -7.0      -5.
x1          0.10      0.01      0.
x2          0.12      0.11      0.
x3          0.10      0.11      0.
x4          0.12      0.15      0.
Question #2.b (i) Determine the rule for classifying a new field. (ii) Suppose a new field yields (x1, ..., x4) = (20, 30, 10, 20). How would you classify it?
JUST AS ABOVE, WE CAN DEFINE THREE LINES, $L_{CLOVER/COTTON}$, $L_{CLOVER/BEETS}$ AND $L_{COTTON/BEETS}$, BY TAKING DIFFERENCES IN COLUMNS. IF OUR NEW FIELD YIELDS $L_{CLOVER/COTTON} > 0$ WE CHOOSE CLOVER OVER COTTON, AND IF IT YIELDS $L_{CLOVER/BEETS} > 0$ WE CHOOSE CLOVER OVER BEETS, ETC. THIS HELPS US VISUALIZE THE RULE.
ALTERNATIVELY, WE CAN EVALUATE EACH COLUMN FOR THE NEW FIELD, THUS COMPUTING $L_{CLOVER}$, $L_{COTTON}$ AND $L_{BEETS}$. NOTING, FOR EXAMPLE, THAT $L_{CLOVER/COTTON} = L_{CLOVER} - L_{COTTON}$, ETC., WE SEE THAT THE ABOVE RULE SAYS: ASSIGN THE NEW FIELD TO THE CROP THAT YIELDS THE MAXIMUM OF $L_{CLOVER}$, $L_{COTTON}$ AND $L_{BEETS}$. IN THIS CASE WE HAVE $L_{CLOVER} = 3$, $L_{COTTON} = 0.6$ AND $L_{BEETS} = 1.3$, HENCE WE CONCLUDE CLOVER.
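The maximum-score rule can be reproduced numerically. The sketch below evaluates the Clover and Cotton columns of the linear discriminant function at the new field (20, 30, 10, 20). The Sugar beets coefficients are truncated in this copy of the table, so its score is taken from the worked answer (1.3) rather than recomputed; that substitution is an assumption of the sketch.

```python
# Coefficients from the PROC DISCRIM table: (constant, x1, x2, x3, x4).
# Sugar beets is left out here because its coefficients are truncated.
coefficients = {
    "Clover": (-6.0, 0.10, 0.12, 0.10, 0.12),
    "Cotton": (-7.0, 0.01, 0.11, 0.11, 0.15),
}

# New field from Question #2.b (ii).
field = (20, 30, 10, 20)

def discriminant_score(coef, x):
    """Constant plus the weighted sum of the four measurements."""
    constant, weights = coef[0], coef[1:]
    return constant + sum(w * v for w, v in zip(weights, x))

scores = {crop: discriminant_score(c, field) for crop, c in coefficients.items()}

# Score quoted in the worked answer, not recomputed from the table.
scores["Sugar beets"] = 1.3

# Classification rule: assign the field to the crop with the largest score.
assigned = max(scores, key=scores.get)

for crop, value in scores.items():
    print(f"{crop:<12}{value:.1f}")
print("Assigned to:", assigned)
```

The pairwise rule gives the same result, since a difference of two scores is positive exactly when the first score is the larger one.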
Question #3a. Describe briefly how you would obtain this in SAS. It is not necessary to give detailed computer commands.
Suppose the following cluster description was obtained:
OBS    ID    CLUS
1      2     1
2      5     1
3      1     1
4      15    1
5      33    1
6      4     1
7      17    1
8      3     1
9      6     1
10     14    1
11     16    1
12     27    1
Question #3b: After the analysis was performed, it was determined that the three crops observed were CORN with ID = 1 to 7, COTTON with ID = 14 to 19 and CLOVER with ID = 26 to 36. Develop a frequency table to assess the effectiveness of the clustering procedure.
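A crop-by-cluster frequency table can be sketched as follows. It uses only the 12 observations visible in the listing above, all of which fall in cluster 1, so it shows the tabulation step rather than the full answer. Reading the ID ranges as 1 to 7, 14 to 19 and 26 to 36 is an assumption, because the range symbols are garbled in this copy; the function name `crop_for` is hypothetical.

```python
from collections import Counter

# (ID, cluster) pairs exactly as listed in the cluster description above.
listing = [(2, 1), (5, 1), (1, 1), (15, 1), (33, 1), (4, 1),
           (17, 1), (3, 1), (6, 1), (14, 1), (16, 1), (27, 1)]

def crop_for(obs_id):
    """Map an ID to its crop, assuming the ID ranges described in the lead-in."""
    if 1 <= obs_id <= 7:
        return "CORN"
    if 14 <= obs_id <= 19:
        return "COTTON"
    if 26 <= obs_id <= 36:
        return "CLOVER"
    return "UNKNOWN"

# Count observations in each (crop, cluster) cell.
table = Counter((crop_for(obs_id), cluster) for obs_id, cluster in listing)

clusters = sorted({cluster for _, cluster in listing})
print("CROP      " + "".join(f"CLUS{c:<4}" for c in clusters))
for crop in ("CORN", "COTTON", "CLOVER"):
    print(f"{crop:<10}" + "".join(f"{table[(crop, c)]:<8}" for c in clusters))
```

A clustering that separates the crops well would put most of each crop's count in a different cluster column; several crops sharing one column would indicate poor separation.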
Question #4. Discuss (briefly) four multivariate methods that are used for one or the other of these purposes, indicating the objective of the method and a concise description of the basic computations involved.
PC: GIVEN THE P-VARIATE DATA X, WE WISH TO DEFINE A SMALL NUMBER OF LINEAR FUNCTIONS OF X THAT EXPLAIN THE VARIABILITY. IN PARTICULAR, WE SEEK Y1 THAT HAS MAXIMUM VARIABILITY, Y2 THAT IS UNCORRELATED WITH Y1 AND HAS THE NEXT LARGEST VARIABILITY, ETC. THESE ARE DETERMINED BY THE EIGENVECTORS, AND THEIR VARIANCES BY THE EIGENVALUES. THIS CORRESPONDS TO AN ORTHOGONAL ROTATION OF THE COORDINATE SYSTEM. SMALL EIGENVALUES SUGGEST DEGENERACIES IN THE DATA.
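The principal component computation described above can be illustrated in the two-variable case, where the eigenvalues of the covariance matrix have a closed form. This is a minimal sketch on a small hypothetical data set (the values of `x1` and `x2` are invented for illustration and are not from the exam); a real analysis would use PROC PRINCOMP or a numerical eigenvalue routine.

```python
import math

# Hypothetical two-variable data, used only for illustration.
x1 = [2.0, 4.0, 6.0, 8.0, 10.0]
x2 = [1.0, 3.0, 2.0, 5.0, 4.0]

def covariance(u, v):
    """Sample covariance with divisor n - 1."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    return sum((p - mean_u) * (q - mean_v) for p, q in zip(u, v)) / (n - 1)

# 2 x 2 sample covariance matrix [[a, b], [b, d]].
a, b, d = covariance(x1, x1), covariance(x1, x2), covariance(x2, x2)

# Closed-form eigenvalues of a symmetric 2 x 2 matrix.
centre = (a + d) / 2
half_gap = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
lam1, lam2 = centre + half_gap, centre - half_gap

# Unit eigenvector for lam1: proportional to (b, lam1 - a); valid since b != 0 here.
norm = math.hypot(b, lam1 - a)
e1 = (b / norm, (lam1 - a) / norm)

# Scores on the first principal component: Y1 = e1' (x - mean).
m1, m2 = sum(x1) / len(x1), sum(x2) / len(x2)
y1 = [e1[0] * (p - m1) + e1[1] * (q - m2) for p, q in zip(x1, x2)]
var_y1 = covariance(y1, y1)

print(f"eigenvalues: {lam1:.4f}, {lam2:.4f}")
print(f"variance of Y1: {var_y1:.4f}")
```

The variance of Y1 equals the largest eigenvalue, and the two eigenvalues together account for the total variance (the trace of the covariance matrix), which is the sense in which the eigenvalues give the variances of the components.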
FA: HERE WE ARE INTERESTED IN EXPLAINING THE DATA X IN TERMS OF A SMALL NUMBER OF COMMON FACTORS. THE FACTORS ARE SIMILAR TO THE PC EIGENVECTORS (SEE QUESTION 1C). THE FACTORS DEFINE LOADINGS THAT ARE THE CORRELATIONS OF THE X DATA WITH THE FACTORS. AFTER OBTAINING THE FACTORS WE MAY CONSIDER ROTATING THEM TO