







































































The Multivariate Data Analysis MVDA Ultimate Exam is designed for students and professionals working with statistical data analysis. It covers advanced topics such as regression, factor analysis, cluster analysis, and data modeling. This package includes a detailed study guide and extensive practice questions with explanations. Learners will develop analytical and statistical skills. With up to 1000 practice questions, this ultimate exam ensures deep understanding and exam readiness.
Typology: Exams








































































Question 1. Which of the following statements correctly defines an eigenvalue of a square matrix A?
A) It is the determinant of A.
B) It is a scalar λ such that Av = λv for some non‑zero vector v.
C) It is the trace of A divided by its dimension.
D) It is the inverse of a singular value of A.
Answer: B
Explanation: By definition, an eigenvalue λ satisfies Av = λv for a non‑zero eigenvector v.

Question 2. In the context of Singular Value Decomposition (SVD), the matrix Σ contains:
A) Eigenvectors of A.
B) Orthogonal basis for the column space of A.
C) Singular values on its diagonal, all non‑negative.
D) Inverse of the covariance matrix.
Answer: C
Explanation: SVD writes A = UΣVᵀ where Σ is diagonal with singular values (square roots of eigenvalues of AᵀA).

Question 3. The Moore‑Penrose pseudoinverse of a matrix A is particularly useful when:
A) A is square and nonsingular.
B) A is rectangular or singular.
C) A is diagonalizable.
D) A has full rank only.
Answer: B
Explanation: The pseudoinverse provides a least‑squares solution for over‑ or under‑determined systems, i.e., when A is not square or is singular.

Question 4. For a covariance matrix Σ, the trace of Σ equals:
A) The product of its eigenvalues.
B) The sum of its variances (diagonal elements).
C) The determinant of Σ.
D) The Mahalanobis distance of the mean vector.
Answer: B
Explanation: The trace is the sum of diagonal entries, which are the variances of each variable.

Question 5. Which property does NOT hold for a multivariate normal (MVN) distribution?
A) Any linear combination of MVN variables is normal.
B) Marginal distributions are also MVN.
C) The joint pdf is symmetric about its mean vector.
D) Skewness can be non‑zero.
Answer: D
Explanation: MVN distributions are symmetric and have zero skewness; all linear combinations remain normal.

Question 6. Mardia’s test for multivariate normality assesses:
A) Only skewness.
B) Only kurtosis.
C) Both multivariate skewness and kurtosis.
D) Homogeneity of variances.
Answer: C
Explanation: Mardia’s test provides separate statistics for multivariate skewness and kurtosis.

Question 7. Mahalanobis distance is used to detect outliers because it:
A) Measures Euclidean distance ignoring correlation.
B) Scales distances by the covariance structure of the data.
C) Is equivalent to the Z‑score for each variable.
D) Only works for binary data.
Answer: B
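
The point of Question 7 can be illustrated with a small sketch. It assumes a hypothetical 2×2 covariance matrix with correlation 0.8 and two made-up points that are equally far from the mean in Euclidean terms; the helper name `mahalanobis_2d` is an illustration only, not a library function.

```python
import math

def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of a 2-D point from `mean` under a 2x2 covariance `cov`."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Closed-form inverse of a 2x2 matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    dx = [x[0] - mean[0], x[1] - mean[1]]
    # Quadratic form dx^T * inv * dx.
    q = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
         + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.sqrt(q)

mean = (0.0, 0.0)
cov = [[1.0, 0.8], [0.8, 1.0]]  # assumed: two strongly correlated variables

along = (1.0, 1.0)     # consistent with the positive correlation
against = (1.0, -1.0)  # contradicts the positive correlation

d_euclid_along = math.hypot(*along)
d_euclid_against = math.hypot(*against)
d_along = mahalanobis_2d(along, mean, cov)
d_against = mahalanobis_2d(against, mean, cov)

print("Euclidean:  ", d_euclid_along, d_euclid_against)
print("Mahalanobis:", d_along, d_against)
```

Both points have the same Euclidean distance, but the point that goes against the correlation pattern gets a much larger Mahalanobis distance, which is why the measure is useful for flagging multivariate outliers.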
Question 11. When performing PCA on a correlation matrix instead of a covariance matrix, the resulting components are:
A) Scale‑dependent.
B) Based on standardized variables, thus unit‑free.
C) Only suitable for variables measured in the same units.
D) Identical to those from the covariance matrix.
Answer: B
Explanation: Correlation matrix standardizes each variable to variance 1, making PCA results independent of original scales.

Question 12. In PCA, the proportion of variance explained by the first principal component is:
A) Always greater than 50 %.
B) Equal to the eigenvalue of the first component divided by the sum of all eigenvalues.
C) Determined by the number of observations.
D) Unrelated to eigenvalues.
Answer: B
Explanation: The eigenvalue of a component equals the variance it captures; dividing by total variance (sum of eigenvalues) yields the proportion explained.

Question 13. The Kaiser criterion recommends retaining components with eigenvalues:
A) Greater than 0.5.
B) Greater than 1.
C) Greater than the average eigenvalue.
D) Equal to the number of variables.
Answer: B
Explanation: Kaiser’s rule keeps components whose eigenvalues exceed 1, indicating they explain more variance than a single standardized variable.

Question 14. Parallel analysis improves upon the Kaiser criterion by:
A) Comparing observed eigenvalues to those obtained from random data.
B) Using a scree‑plot visual inspection.
C) Retaining all components until cumulative variance reaches 90 %.
D) Ignoring eigenvalues altogether.
Answer: A
Explanation: Parallel analysis generates eigenvalues from simulated random data; components are kept only if observed eigenvalues exceed the random ones.

Question 15. In Exploratory Factor Analysis (EFA), the term “common variance” refers to:
A) Variance unique to each variable.
B) Variance shared among variables and explained by factors.
C) Total variance of the dataset.
D) Measurement error variance.
Answer: B
Explanation: Common variance is the portion of each variable’s variance accounted for by the latent factors.

Question 16. Which extraction method in EFA is most appropriate when the data are not normally distributed?
A) Maximum Likelihood (ML).
B) Principal Axis Factoring (PAF).
C) Principal Component Analysis (PCA).
D) Independent Component Analysis (ICA).
Answer: B
Explanation: PAF does not assume multivariate normality, making it robust for non‑normal data, whereas ML relies on normality.

Question 17. Varimax rotation is classified as:
A) Orthogonal, preserving factor independence.
B) Oblique, allowing correlated factors.
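
The arithmetic behind Questions 12 and 13 can be sketched for the smallest possible case: two standardized variables with an assumed correlation of 0.6. The helper `eigenvalues_sym_2x2` is a hypothetical name and uses the closed-form eigenvalues of a symmetric 2×2 matrix; real analyses would use a linear algebra library.

```python
import math

def eigenvalues_sym_2x2(m):
    """Eigenvalues of a symmetric 2x2 matrix, largest first (closed form)."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    centre = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return [centre + radius, centre - radius]

r = 0.6  # assumed correlation between two standardized variables
corr = [[1.0, r], [r, 1.0]]

eigvals = eigenvalues_sym_2x2(corr)
total = sum(eigvals)  # equals the trace, i.e. the number of variables
explained = [ev / total for ev in eigvals]   # proportion of variance per component
kaiser_keep = [ev > 1.0 for ev in eigvals]   # Kaiser criterion: eigenvalue > 1

for ev, prop, keep in zip(eigvals, explained, kaiser_keep):
    print(f"eigenvalue={ev:.3f}  proportion={prop:.3f}  retain={keep}")
```

For a 2×2 correlation matrix the eigenvalues are 1 + r and 1 − r, so only the first component passes the Kaiser cutoff whenever the variables are positively correlated.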
Question 21. Box’s M test in MANOVA evaluates:
A) Multivariate normality.
B) Homogeneity of covariance matrices across groups.
C) Equality of sample sizes.
D) Linear relationships among variables.
Answer: B
Explanation: Box’s M tests the assumption that covariance matrices are equal (homogeneous) across groups.

Question 22. Which MANOVA test statistic is most robust to violations of the equality of covariance matrices?
A) Wilks’ Lambda.
B) Pillai’s Trace.
C) Hotelling‑Lawley Trace.
D) Roy’s Largest Root.
Answer: B
Explanation: Pillai’s Trace is considered the most robust to heterogeneity of covariance matrices.

Question 23. Linear Discriminant Analysis (LDA) assumes that the groups share:
A) Identical covariance matrices.
B) Different covariance matrices.
C) Non‑linear decision boundaries.
D) Only binary outcomes.
Answer: A
Explanation: LDA assumes homoscedasticity (equal covariance) across groups, leading to linear discriminant functions.

Question 24. Quadratic Discriminant Analysis (QDA) differs from LDA because:
A) It assumes equal covariance matrices.
B) It estimates a separate covariance matrix for each group, allowing quadratic boundaries.
C) It cannot be used with more than two groups.
D) It requires the data to be ordinal.
Answer: B
Explanation: QDA relaxes the equal‑covariance assumption, fitting a distinct covariance matrix per group, resulting in quadratic decision surfaces.

Question 25. The canonical discriminant function that maximizes the ratio of between‑group to within‑group variance is:
A) The first canonical root.
B) The last canonical root.
C) The eigenvector with the smallest eigenvalue.
D) The function with the highest Mahalanobis distance.
Answer: A
Explanation: The first canonical discriminant function captures the greatest separation among groups.

Question 26. In cross‑validation of a discriminant model, “leave‑one‑out” (LOO) refers to:
A) Removing one variable at a time.
B) Using all observations except one for training and testing on the omitted case.
C) Splitting the data into a 50/50 train‑test split.
D) Dropping the observation with the highest leverage.
Answer: B
Explanation: LOO repeatedly fits the model on n‑1 cases and predicts the omitted case, providing an almost unbiased estimate of classification accuracy.

Question 27. Canonical Correlation Analysis (CCA) seeks to:
A) Reduce dimensionality of a single set of variables.
B) Find linear combinations of two variable sets that are maximally correlated.
C) Test equality of group means.
D) Cluster observations into homogeneous groups.
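
The leave‑one‑out procedure from Question 26 can be sketched as follows. To keep the example dependency‑free, a simple nearest‑centroid classifier on one made‑up feature stands in for a discriminant model; this is a swapped‑in technique, not LDA, and all names and data are hypothetical.

```python
def nearest_centroid_predict(train, x):
    """Predict the label whose training mean is closest to x (single feature)."""
    sums, counts = {}, {}
    for value, label in train:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    centroids = {label: sums[label] / counts[label] for label in sums}
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def loo_accuracy(data):
    """Fit on n-1 cases, predict the omitted case, repeat for every case."""
    hits = 0
    for i, (value, label) in enumerate(data):
        train = data[:i] + data[i + 1:]  # every observation except case i
        hits += nearest_centroid_predict(train, value) == label
    return hits / len(data)

# Made-up data: (feature value, group label); the groups overlap near 3.
data = [(1.0, "a"), (1.5, "a"), (2.0, "a"), (3.1, "a"),
        (3.0, "b"), (4.0, "b"), (4.5, "b"), (5.0, "b")]

accuracy = loo_accuracy(data)
print("LOO accuracy:", accuracy)
```

Each case is classified by a model that never saw it, so the two overlapping observations near 3 are counted as errors rather than being flattered by their own contribution to the centroids.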
Question 31. In logistic regression, the odds ratio for a predictor is obtained by:
A) Raising e to the power of the coefficient (β).
B) Taking the absolute value of the coefficient.
C) Squaring the coefficient.
D) Computing the coefficient’s t‑statistic.
Answer: A
Explanation: The logistic model is logit(p) = β₀ + β₁x; exponentiating β₁ gives the multiplicative change in odds per unit increase in x.

Question 32. Which fit index is NOT commonly used for evaluating multinomial logistic regression models?
A) Pseudo‑R² (Nagelkerke).
B) Akaike Information Criterion (AIC).
C) Bayesian Information Criterion (BIC).
D) Root Mean Square Error of Approximation (RMSEA).
Answer: D
Explanation: RMSEA is a fit index for structural equation models, not for logistic regression.

Question 33. In hierarchical clustering, Ward’s method differs from single linkage because it:
A) Merges clusters based on the smallest increase in total within‑cluster variance.
B) Uses the maximum distance between cluster members.
C) Always produces balanced clusters.
D) Is equivalent to K‑means clustering.
Answer: A
Explanation: Ward’s method minimizes the increase in the sum of squared errors (within‑cluster variance) at each merge.

Question 34. Which distance metric is most appropriate for mixed data types (continuous, ordinal, nominal) in clustering?
A) Euclidean distance.
B) Manhattan distance.
C) Gower’s distance.
D) Mahalanobis distance.
Answer: C
Explanation: Gower’s distance handles mixed variable types by scaling each component appropriately.

Question 35. In K‑means clustering, the algorithm guarantees convergence to:
A) The global optimum of within‑cluster sum of squares.
B) A local optimum, which may depend on initial centroids.
C) The true underlying class labels.
D) The smallest possible number of clusters.
Answer: B
Explanation: K‑means iteratively reduces within‑cluster variance but can get trapped in local minima; results depend on initial seeds.

Question 36. Metric multidimensional scaling (MDS) seeks to preserve:
A) The rank order of dissimilarities.
B) The actual magnitudes of dissimilarities in a low‑dimensional space.
C) Only the largest distances.
D) The number of clusters.
Answer: B
Explanation: Metric MDS aims to reproduce the original distance matrix as faithfully as possible in reduced dimensions.

Question 37. In non‑metric MDS, the stress function is minimized to:
A) Preserve Euclidean distances exactly.
B) Preserve the monotonic relationship (rank order) of dissimilarities.
C) Maximize the number of dimensions.
D) Ensure orthogonal axes.
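
The dependence on initial centroids described in Question 35 can be reproduced with a tiny sketch. It assumes four made‑up points at the corners of a wide rectangle and a plain Lloyd‑style K‑means written from scratch; the function name `kmeans` and both starting configurations are hypothetical.

```python
def sq_dist(p, c):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2

def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd iterations; returns (centroids, within-cluster sum of squares)."""
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [sq_dist(p, c) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl)) if cl else c
            for cl, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    wcss = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, wcss

# Corners of a 4 x 1 rectangle: the natural clusters are left pair vs right pair.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

_, wcss_good = kmeans(points, [(0.0, 0.5), (4.0, 0.5)])  # starts near the left/right split
_, wcss_bad = kmeans(points, [(2.0, 0.0), (2.0, 1.0)])   # starts on a top/bottom split

print("WCSS, left/right start:", wcss_good)
print("WCSS, top/bottom start:", wcss_bad)
```

Both runs converge, but the second one is stuck in a configuration with a far larger within‑cluster sum of squares, which is why practical implementations typically restart from several initializations.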
Explanation: Configural invariance checks whether the same factor structure holds across groups; it is the baseline for subsequent invariance tests.

Question 41. In structural equation modeling (SEM), the distinction between the measurement model and the structural model is:
A) Measurement model specifies relationships among latent variables; structural model links observed variables.
B) Measurement model defines how latent constructs are measured by observed indicators; structural model specifies causal paths among latent constructs.
C) Both models are identical in SEM.
D) Measurement model addresses multicollinearity; structural model addresses heteroscedasticity.
Answer: B
Explanation: SEM separates the specification of latent‑variable indicators (measurement) from the hypothesized relationships among the latent constructs (structural).

Question 42. A recursive SEM model is characterized by:
A) Presence of feedback loops (cycles).
B) No directed cycles; all paths flow in one direction.
C) Simultaneous equations with endogenous variables on both sides.
D) Non‑identifiable parameters.
Answer: B
Explanation: Recursive models have a directed acyclic graph, meaning no variable simultaneously predicts and is predicted by another.

Question 43. In K‑fold cross‑validation, the primary advantage over a single train‑test split is:
A) Reduced computational time.
B) Increased bias in performance estimates.
C) More stable and less variable estimate of model predictive ability.
D) Elimination of the need for a validation set.
Answer: C
Explanation: Averaging performance across K folds reduces variance of the estimate, providing a more reliable assessment than a single split.

Question 44. Bootstrapping in the context of MVDA is used to:
A) Increase the sample size by duplicating observations.
B) Approximate the sampling distribution of a statistic and obtain confidence intervals without strong parametric assumptions.
C) Perform variable selection automatically.
D) Replace missing data with random draws.
Answer: B
Explanation: Bootstrapping resamples with replacement to generate empirical distributions for statistics, aiding inference especially when analytic solutions are complex.

Question 45. A Variance Inflation Factor (VIF) greater than 10 typically indicates:
A) Severe multicollinearity among predictors.
B) Heteroscedastic errors.
C) Non‑linear relationships.
D) Over‑dispersion in count data.
Answer: A
Explanation: VIF quantifies how much the variance of a coefficient is inflated due to correlation with other predictors; values >10 are a common rule‑of‑thumb for problematic multicollinearity.

Question 46. In multivariate regression diagnostics, the condition index is derived from:
A) The eigenvalues of the correlation matrix of predictors.
B) The sum of squared residuals.
C) The determinant of the covariance matrix.
D) The Mahalanobis distances of observations.
Answer: A
Explanation: The condition index is the square root of the ratio of the largest to each eigenvalue of the predictor correlation matrix; large values signal multicollinearity.
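
Questions 45 and 46 can be made concrete for the two‑predictor case, where both diagnostics reduce to simple functions of the correlation r. The sketch below uses made‑up data and follows the definition given in the explanation above (eigenvalues of the predictor correlation matrix, which are 1 + |r| and 1 − |r| for two predictors); other texts compute the condition index from a scaled design matrix instead.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Made-up predictors: x2 is almost a copy of x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]

r = pearson_r(x1, x2)

# With two predictors, regressing one on the other gives R^2 = r^2,
# so VIF = 1 / (1 - R^2) = 1 / (1 - r^2).
vif = 1.0 / (1.0 - r ** 2)

# Eigenvalues of the 2x2 correlation matrix [[1, r], [r, 1]].
eig_large, eig_small = 1.0 + abs(r), 1.0 - abs(r)
condition_index = math.sqrt(eig_large / eig_small)

print(f"r = {r:.4f}, VIF = {vif:.1f}, condition index = {condition_index:.1f}")
```

Here the VIF lands far above the rule‑of‑thumb threshold of 10, and the smallest eigenvalue is close to zero, so the condition index is large as well.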
A) The product of all eigenvalues.
B) The smallest eigenvalue of the hypothesis matrix.
C) The largest eigenvalue, focusing on the most dominant dimension of group separation.
D) Always identical to Wilks’ Lambda.
Answer: C
Explanation: Roy’s Largest Root uses the greatest eigenvalue, emphasizing the single dimension where groups differ the most.

Question 51. In discriminant analysis, the term “canonical correlation” refers to:
A) The correlation between each predictor and the group variable.
B) The correlation between the discriminant scores and the original variables.
C) The correlation between the discriminant function and the group membership.
D) The correlation among the predictor variables.
Answer: C
Explanation: Canonical correlation in DFA quantifies the relationship between the discriminant scores (linear combinations) and the categorical group variable.

Question 52. When applying the Box‑Cox transformation, the parameter λ = 0 corresponds to which transformation?
A) No transformation (identity).
B) Square root transformation.
C) Natural logarithm transformation.
D) Reciprocal transformation.
Answer: C
Explanation: In the Box‑Cox family, λ = 0 is defined as the natural log of the variable.

Question 53. The scree plot is used to:
A) Visualize eigenvalues and help decide how many components/factors to retain.
B) Display residuals from a regression model.
C) Plot Mahalanobis distances.
D) Show the distribution of missing values.
Answer: A
Explanation: A scree plot graphs eigenvalues in descending order; an “elbow” suggests an appropriate cutoff.

Question 54. In a PCA based on the covariance matrix, scaling the variables before analysis will:
A) Have no effect on the resulting components.
B) Change the relative importance of variables with larger variances.
C) Make the components orthogonal.
D) Convert the analysis to factor analysis.
Answer: B
Explanation: Covariance‑based PCA is sensitive to variable scales; larger‑variance variables dominate the component extraction unless standardized.

Question 55. The term “latent variable” in SEM refers to:
A) An observed indicator measured without error.
B) A variable that is not directly measured but inferred from observed indicators.
C) A variable that is always binary.
D) The residual term of a regression equation.
Answer: B
Explanation: Latent variables represent unobserved constructs modeled through their relationships with observed measures.

Question 56. In a logistic regression model with a multinomial outcome, the reference category is:
A) The category with the highest frequency.
B) The category coded as 0 in all dummy variables.
C) Arbitrarily chosen; it determines the baseline for log‑odds comparisons.
D) The category with the smallest odds ratio.
Answer: C
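
Why λ = 0 maps to the natural logarithm (Question 52) can be checked numerically. The sketch assumes the standard one‑parameter Box‑Cox form, (x^λ − 1)/λ for λ ≠ 0 and ln x for λ = 0; the function name `box_cox` and the test value x = 5 are illustrative only.

```python
import math

def box_cox(x, lam):
    """One-parameter Box-Cox transform of a single positive value."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    if lam == 0:
        return math.log(x)  # defined as the limit of the power form
    return (x ** lam - 1.0) / lam

x = 5.0
for lam in (1.0, 0.5, 0.1, 0.001, 0.0):
    print(f"lambda={lam:<6} transformed={box_cox(x, lam):.6f}")
```

As λ shrinks toward zero, the power form approaches ln x, so defining the λ = 0 case as the logarithm keeps the family continuous in λ.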
Question 60. In the context of multivariate outlier detection, a squared Mahalanobis distance exceeding the 97.5 % quantile of a chi‑square distribution with p degrees of freedom is commonly considered:
A) A typical observation.
B) A potential multivariate outlier.
C) Evidence of multicollinearity.
D) An indication of homoscedasticity.
Answer: B
Explanation: Under MVN, squared Mahalanobis distances approximately follow a chi‑square(p) distribution; large values signal observations far from the multivariate mean.

Question 61. Which of the following is true about the relationship between PCA and Factor Analysis (FA) when communalities are set to 1?
A) PCA and FA become identical.
B) PCA yields fewer components than FA.
C) FA provides orthogonal components, while PCA does not.
D) The two methods are unrelated regardless of communalities.
Answer: A
Explanation: When communalities are fixed at 1, FA reduces to PCA because all variance is treated as common variance.

Question 62. In a multivariate regression model, the matrix of regression coefficients B is obtained by:
A) B = (XᵀX)⁻¹ XᵀY.
B) B = XᵀY (XᵀX).
C) B = (YᵀY)⁻¹ YᵀX.
D) B = (XᵀX)ᵖ XᵀY, where p is the number of predictors.
Answer: A
Explanation: The ordinary least squares solution for multivariate Y is B = (XᵀX)⁻¹ XᵀY.
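
The formula in Question 62 can be traced by hand on a tiny example. The sketch assumes one predictor plus an intercept (so XᵀX is 2×2 and can be inverted in closed form) and two made‑up response columns generated without noise; the helper names are hypothetical, and larger problems would use a numerical library rather than an explicit inverse.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Matrix product of two lists-of-rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("X'X is singular; the OLS solution is not unique")
    return [[d / det, -b / det], [-c / det, a / det]]

xs = [0.0, 1.0, 2.0, 3.0]
X = [[1.0, x] for x in xs]                   # design matrix: intercept column + predictor
Y = [[1.0 + 2.0 * x, 3.0 - x] for x in xs]   # two responses: y1 = 1 + 2x, y2 = 3 - x

Xt = transpose(X)
B = matmul(inverse_2x2(matmul(Xt, X)), matmul(Xt, Y))  # B = (X'X)^-1 X'Y

for name, row in zip(("intercept", "slope"), B):
    print(name, [round(v, 6) for v in row])
```

Each column of B holds the coefficients for one response, which shows that the multivariate solution is the same as fitting a separate ordinary least squares regression per response column.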
Question 63. In the context of SEM, a modification index (MI) suggests:
A) The degree of model misfit.
B) The expected reduction in chi‑square if a fixed parameter is freed.
C) The total number of observed variables.
D) The proportion of variance explained by the model.
Answer: B
Explanation: MI indicates how much the overall chi‑square would decrease if a constrained parameter were estimated freely.

Question 64. Which of the following best describes the purpose of the “parallel analysis” technique in factor extraction?
A) To test for multivariate normality.
B) To compare observed eigenvalues with those obtained from randomly generated data of the same size.
C) To rotate factor loadings for interpretability.
D) To compute communalities.
Answer: B
Explanation: Parallel analysis retains factors whose eigenvalues exceed the corresponding eigenvalues from random data, reducing over‑extraction.

Question 65. In a multivariate GLM with a continuous outcome vector Y and a set of predictors X, the link function that relates E(Y) to X is typically:
A) Identity link for each component.
B) Logit link for each component.
C) Probit link for each component.
D) Inverse link for each component.
Answer: A
Explanation: For continuous outcomes, the multivariate linear model uses the identity link, preserving linear relationships.

Question 66. The term “wide data” (p > n) poses challenges for traditional regression because: