Final Exam Key for Multivariate Data Analysis | EDMS 771, Exams of Descriptive statistics

Material Type: Exam; Class: MULTIVARIATE DATA ANAL; Subject: Measurement, Statistics, and Evaluation; University: University of Maryland; Term: Unknown 1989;

Uploaded on 02/13/2009

EDMS 771 FINAL EXAM = KEY
Grading: This exam is worth a total of 45 points. Average = 40.9, SD = 3.5
__________________________________________________________________________________________
The items in Parts A and B refer to the World95 database that is available for download (SPSS SAV file) from
the EDMS771 website. Brief descriptions of the variables can be seen in the Variables View tab of the file; note
that missing data are indicated by the SPSS default {.} and that not all variables are used in the analyses.
Part A
1. Compute descriptive statistics for the following 12 variables: lifeexpf, lifeexpm, literacy, gdp_cap, birth_rt,
death_rt, density, urban, pop_incr, babymort, aids_rt, and religion. For later use, recode religion into a new
variable with four categories (3 categories are unchanged): Catholic, Muslim, Protstnt, Other; present a
frequency distribution and bar graph for this new variable. Note that religion is a string (alphanumeric) variable
and should be recoded as a numeric variable. [2 points] Freqs in new religion cats = 41, 27, 16, 24
2. Perform cluster analysis using Quick Cluster with 2, 3 and 4 clusters for the following 6 variables: lifeexpf,
lifeexpm, literacy, gdp_cap, birth_rt, death_rt. Present two (2) different reasons that a researcher might prefer
the 3 cluster solution. [3 points] Reasons include: (1) 3 cluster solution seems to conform to developed,
developing and third-world classifications; (2) 4 cluster solution has one very small cluster {and other
possibilities}
3. Rerun the 3-cluster solution and save cluster membership as a variable in your database; present a frequency
distribution and bar graph for this new variable. [2 points] Freqs = 63, 19, 24
4. Run a MANOVA with cluster membership as the grouping variable (fixed factor) and the following 5
dependent variables: density, urban, pop_incr, babymort, aids_rt.
(A) Explain why a researcher would be interested in an analysis of this type. [3 points] In order to profile
the clusters
(B) What assumptions are assessed by the Box and Levene tests? Based on your analysis, are these
assumptions reasonable? [3 points] Box = homogeneity of covariance (not reasonable); Levene =
homogeneity of variance per variable (only reasonable for aids rate variable)
(C) Perform Tukey HSD post hoc pairwise comparisons for the 5 dependent variables; interpret results for
each variable. [3 points] Note patterns of subsets of homogeneous means
(D) The Games-Howell variation on Tukey tests does not require equal n or homogeneity of variance; rerun
the pairwise comparisons using this procedure and compare, in detail, to the results from Tukey tests. [3 points]
Mainly agrees
5. Run a discriminant analysis with cluster membership as the grouping variable and the following 5
independent variables: density, urban, pop_incr, babymort, aids_rt.
(A) Provide a brief interpretation of the statistically significant discriminant function(s). [3 points] First DF
is significant; seems to distinguish between developed countries and others
(B) Present a plot of the group centroids; comment on the apparent separation among the groups. [2 points]
Separation only along first DF as expected due to NS of second DF
(C) Present and interpret the classification table [3 points] 68-79% correct classifications; could comment
on misclassifications
6. Create a cross-tabulation of cluster membership and the recoded 4-category religion variable; using
appropriate statistical tests, discuss whether or not cluster membership is independent of religion (as recoded).
[3 points] Pearson Chi-square = 24.365 (6 df) indicates lack of independence
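
As a sketch of the computation behind item 6: the Pearson statistic sums (observed - expected)^2 / expected over all cells, and a 3-cluster by 4-religion table has (3 - 1)(4 - 1) = 6 df, matching the key. The counts below are made up for illustration and are NOT the World95 results; the function name is hypothetical.

```python
def pearson_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts (NOT the World95 data):
# rows = 3 clusters, columns = 4 recoded religion categories
toy_table = [
    [10, 20, 5, 15],
    [12, 2, 3, 3],
    [8, 4, 8, 10],
]

chi2, df = pearson_chi_square(toy_table)
CRITICAL_05_DF6 = 12.592  # chi-square critical value for alpha = .05 with 6 df
print(f"chi-square = {chi2:.3f}, df = {df}")
print("reject independence at .05:", chi2 > CRITICAL_05_DF6)
```

The reported value of 24.365 exceeds the .05 critical value of 12.592 for 6 df, which is consistent with the key's conclusion of lack of independence.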

Part B

  1. One of the clusters has relatively high values for lifeexpf, lifeexpm, literacy, gdp_cap and relatively low values for birth_rt, death_rt (in my analysis, this is cluster 3 but it could be a different cluster in your results); we want to compare this cluster, coded as 1, to a combination of the other two clusters, coded as 0 – create this combination by creating a new variable (do not use results from the two-cluster cluster analysis). Present a frequency distribution and bar graph for this new variable. [2 points] Freqs for new groups are 82 and 24
  2. Perform a logistic regression analysis comparing the two clusters on the following 5 covariates: density, urban, pop_incr, babymort, aids_rt (DO NOT use stepwise procedures). [2 points] Chi-square block 1 = 90.
     (A) At Block 0, some variables are significant in the “Variables not in the Equation” table but are not significant at Block 1 in the “Variables in the Equation” table; explain why this happened. [3 points] Multicollinearity
     (B) Interpret the effect of the babymort variable on predicted membership in the two groups. [2 points] Odds ratio = .616; each unit increase in babymort decreases the odds for group 1 by a multiplicative factor equal to this odds ratio
     (C) As part of the LRA printout, you should have included a Classification Table with default cut value of .500; comment on the success of classification that resulted from using the 5 covariates. [3 points] Darn good!
     (D) Redo the analysis with a cut value that matches the actual sizes of the two clusters; compare and comment on the results using this new value and the default value. [3 points] Using .23, classification actually gets worse
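
The arithmetic behind the babymort odds ratio and the size-matched cut value in Part B, item 2 can be sketched as follows. An odds ratio is exp(b) for the logistic coefficient b, so the reported .616 implies b = ln(.616), and a k-unit increase in babymort multiplies the odds of group 1 membership by .616^k. The size-matched cut value is taken here to be the proportion of cases in group 1, 24/(82 + 24), which rounds to the .23 used in the key. The function names and the starting odds of 1.0 are assumptions chosen only for illustration.

```python
import math

odds_ratio = 0.616        # reported for babymort in the key
b = math.log(odds_ratio)  # implied logistic coefficient (negative)


def odds_after(start_odds, units, ratio=odds_ratio):
    """Odds of group 1 membership after `units` one-unit increases in the covariate."""
    return start_odds * ratio ** units


def to_probability(odds):
    """Convert odds to a probability."""
    return odds / (1 + odds)


# Group frequencies from Part B, item 1: 82 cases coded 0 and 24 coded 1
n_group0, n_group1 = 82, 24
cut_value = n_group1 / (n_group0 + n_group1)

start_odds = 1.0  # assumption for illustration: even odds before any increase
for k in range(4):
    odds = odds_after(start_odds, k)
    print(k, round(odds, 3), round(to_probability(odds), 3))

print("implied coefficient:", round(b, 3))
print("size-matched cut value:", round(cut_value, 2))
```

Lowering the cut value from .500 to the group 1 base rate classifies more cases into group 1; whether that improves overall accuracy depends on the data, and the key reports that here it did not.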