6. Let C and D be matrices conformable to multiplication. Then (CD)ᵀ =


Quiz 1. Name: 10 points per correct answer (20 points for attendance).

1. Let A = [3 x]ᵀ and B = [3 y]. When is A equal to B? A. When x = 3 B. When y = 3 C. When x = y D. Never

2. See 1. What is the dimension of A? A. 1 B. 2 C. 1×2 D. 2×1

3. See 1. BA =

4. See 1. What is B⁻¹? A. 1/3 + 1/y B. 1/3 − 1/y C. [1/3 1/y] D. B⁻¹ does not exist

5. See 1. Suppose I = [1 0; 0 1]. Then IA A. does not exist B. = [3 x]ᵀ C. = 3 + x D. = [3 x]

6. Let C and D be matrices conformable to multiplication. Then (CD)ᵀ = A. DᵀCᵀ B. CᵀDᵀ C. C⁻¹D D. D⁻¹C

7. See 5. Trace(I) = A. 0 B. 1 C. 2 D. 4

8. See 5. |I| = A. 0 B. 1 C. 2 D. 4
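The matrix facts behind Quiz 1 can be checked numerically in R; the particular matrices below are toy values chosen only for illustration.

```r
# Numerical check of the transpose rule (CD)' = D'C' with toy matrices
C <- matrix(1:6, nrow = 2)    # a 2x3 matrix
D <- matrix(7:12, nrow = 3)   # a 3x2 matrix, so CD is conformable
lhs <- t(C %*% D)
rhs <- t(D) %*% t(C)
all.equal(lhs, rhs)           # TRUE: transposing reverses the order

I2 <- diag(2)                 # the 2x2 identity matrix
sum(diag(I2))                 # trace(I) = 2
det(I2)                       # |I| = 1
```

Multiplying any conformable matrix by the identity leaves it unchanged, which is the point of question 5.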

Quiz 2. Name: Closed book and notes.

1. Suppose (X, Y) are distributed as bivariate normal. Then the expected value of Y, given that X = x, is A. unrelated to x B. a normal function of x C. a linear function of x D. a quadratic function of x

2. Let Y = blood pressure, X = weight, and Z = age of a generic person. The partial correlation of (Y, X), given Z = 51, is A. The correlation between weight and age among 51-year-olds B. The correlation between blood pressure and age among 51-year-olds C. The correlation between blood pressure and weight among 51-year-olds

3. What information do you need to find the partial correlation in 2.? A. Both the mean vector of (Y, X, Z) and the covariance matrix of (Y, X, Z) are needed B. Only the mean vector of (Y, X, Z) is needed C. Only the covariance matrix of (Y, X, Z) is needed

4. Let Z be an affine transformation of a vector X that has the multivariate normal distribution. Then Z has a ___ distribution. A. multivariate normal B. multivariate Bernoulli C. multivariate chi-squared D. multivariate exponential
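The linearity in question 1 comes from the standard bivariate normal result E(Y | X = x) = µy + ρ(σy/σx)(x − µx); the parameter values below are assumed toy numbers.

```r
# Conditional mean of Y given X = x for a bivariate normal
# (means, sds, and correlation below are assumed toy values)
mu_x <- 1; mu_y <- 2; s_x <- 2; s_y <- 3; rho <- 0.5
cond_mean <- function(x) mu_y + rho * (s_y / s_x) * (x - mu_x)
cond_mean(mu_x)                     # at x = mu_x it is just mu_y = 2
(cond_mean(3) - cond_mean(1)) / 2   # constant slope 0.75: linear in x
```

Because the slope ρσy/σx does not depend on x, the regression of Y on X is exactly linear, not merely approximately so.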

Quiz 3. Name:

1. What do the authors use for the dimension of a typical multivariate data matrix X? A. n×n B. n×q C. q×n D. q×q

2. Inside an R data frame, how is a missing value denoted? A. By NA B. By ? C. By . D. By " " (i.e., by a blank space)

3. What is another term for the nontrivial extraction of implicit, previously unknown, and potentially useful information from data? A. Multivariate analysis B. Multiple regression analysis C. Data mining D. Data wrangling

4. Suppose you have 4 variables in your multivariate data set. How many covariances are there? A. 4 B. 6 C. 9 D. 20
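Two of these facts are easy to verify in R: a missing value is the special constant NA, and q variables give choose(q, 2) distinct covariances between different pairs. The tiny data frame is an assumed example.

```r
# With q variables there are choose(q, 2) covariances between
# distinct pairs of variables
q <- 4
choose(q, 2)                  # 6 covariances for 4 variables

# Missing values in an R data frame are denoted NA
df <- data.frame(x = c(1, NA, 3))
sum(is.na(df$x))              # 1 missing value
```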

Quiz 4. Name: Closed book, notes, and no electronic devices.

1. A multivariate data set has 20 rows and 3 columns. How many normal quantile-quantile plots will you look at? A. 1 B. 3 C. 17 D. 20

2. A multivariate data set has 20 rows and 3 columns. How many chi-square quantile-quantile plots will you look at? A. 1 B. 3 C. 17 D. 20

3. Suppose X ~ N₃(µ, Σ) (a trivariate normal distribution). What is the distribution of (X − µ)ᵀ Σ⁻¹ (X − µ)? A. N(0, 1) B. N(3, 1) C. χ²₁ D. χ²₃

4. What is the expected appearance of the chi-squared quantile-quantile plot when the data come from a multivariate normal distribution? A. A straight line B. A bell curve C. A right-skewed curve D. An S-shaped curve
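The quadratic form in question 3 is the squared Mahalanobis distance, and the chi-square QQ plot in question 4 is built from it. A base-R sketch with simulated data (sample size and seed are arbitrary choices):

```r
# Squared Mahalanobis distances of multivariate normal data behave like
# chi-square(q) draws, so the chi-square QQ plot is roughly straight
set.seed(1)
n <- 200; q <- 3
X <- matrix(rnorm(n * q), ncol = q)          # a toy N3(0, I) sample
d2 <- mahalanobis(X, colMeans(X), cov(X))    # squared distances
chisq_q <- qchisq(ppoints(n), df = q)        # theoretical quantiles
# plot(chisq_q, sort(d2)); abline(0, 1)      # the QQ plot itself
mean(d2)                                     # equals q*(n-1)/n exactly
```

The identity mean(d2) = q(n − 1)/n holds because the distances are computed from the sample mean and covariance; with the true µ and Σ the mean would be exactly q.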

Quiz 5. Name: Closed books, notes, and no electronic devices. Note: The first problem is a select-all-that-apply problem. The rest are pure multiple choice.

1. According to the authors, what are the goals of using graphical displays of data? (Select all that apply; 5 points per correct selection/non-selection) A. To provide an overview B. To tell a story C. To suggest hypotheses D. To criticise a model

2. What kind of graph does the R command plot(x, y) create? A. A bivariate boxplot B. A bivariate histogram C. A contour plot D. A scatterplot

3. How were the marginal distributions displayed on the same axes as a scatterplot in the reading? A. By using density plots B. By using rug plots C. By using histograms D. By using normal distributions

4. What is the point of using bubble and glyph plots? A. To estimate the bivariate distributions B. To check whether the distribution is bivariate normal C. To counter the effect of discreteness D. To visualize data that have more than two dimensions

Quiz 6. Name: Closed book, notes, and no electronic devices.

1. Why do the authors complain about using bivariate histograms? A. The volume under the histogram is not 1.0 B. There is often not enough data for reliable estimates C. They are too smooth D. They can show non-bivariate normal appearances

2. A kernel density estimator is a ___ estimate of the distribution that produced the data. A. Parametric B. Nonparametric C. Univariate normal D. Multivariate normal

3. The density estimate that uses Gaussian kernels is a "sum of bumps." What is an individual bump? A. A single normal density function B. A single data point C. A collection of nearby data points D. A rectangular box centered over the data value

4. What is an example of a perspective plot? A. A two-dimensional scatterplot B. A three-dimensional scatterplot C. A contour plot of a bivariate density function D. A three-dimensional plot of a bivariate density function
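The "sum of bumps" idea in question 3 can be written directly: the Gaussian kernel estimate is the average of normal densities, one centered at each data point. The data values and bandwidth h below are assumptions for illustration.

```r
# A Gaussian kernel density estimate is a sum of normal "bumps", one
# centered over each data point (data and bandwidth h are assumed)
x <- c(1.2, 2.5, 3.1, 4.8)
h <- 0.5
kde <- function(t) mean(dnorm(t, mean = x, sd = h))   # average of bumps
integrate(Vectorize(kde), -Inf, Inf)$value            # ~1: a proper density
```

Because each bump integrates to 1 and the estimate averages n of them, the whole estimate integrates to 1 as well, which a raw bivariate histogram need not.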

Quiz 7. Name: Closed books, notes, and no electronic devices.

1. What is a "principal component"? A. a linear combination B. an eigenvalue C. an eigenvector D. a variance

2. What is tr(R) when R is a correlation matrix calculated from an n×q data matrix? A. n B. q C. n−q D. n+q

3. When there are 10 original variables, what percentage of the variation in the 10 original variables is accounted for by all ten principal components? A. 1% B. 10% C. 90% D. 100%

4. When the principal components are extracted from the correlation matrix, then the average variance of the principal components is A. −1 B. 0 C. 1 D. 100
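Questions 2-4 all follow from one fact: the eigenvalues of a correlation matrix sum to its trace, which is q. A base-R check on simulated data (dimensions and seed are arbitrary):

```r
# For a correlation matrix R of q variables, tr(R) = q, the PC variances
# (eigenvalues) also sum to q, and so their average is 1
set.seed(2)
X <- matrix(rnorm(50 * 4), ncol = 4)   # toy data, q = 4
R <- cor(X)
sum(diag(R))                           # tr(R) = 4
ev <- eigen(R)$values                  # variances of the 4 PCs
sum(ev)                                # also 4: all PCs explain 100%
mean(ev)                               # average PC variance = 1
```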

Quiz 8. Name: Closed books, notes, and no electronic devices.

1. In the first example about head lengths, the second principal component was essentially A. the sum of the head lengths of the two sons B. the mean of the head lengths of the two sons C. the head length of the first-born son D. the difference of the head lengths of the two sons

2. In the heptathlon example, why was the correlation between the first principal component (PC) score and the standard scoring system negative? A. Because the loadings were all negative B. Because the first PC is unrelated to the standard scoring C. Because the first PC explained so little variance in the heptathlon results D. Because the eigenvalue of the first PC was negative

3. Which principal component scores were useful to predict Sulfur Dioxide concentration? A. Potentially, all of them B. Only the first two C. Only the last two D. Only those whose eigenvalue was greater than 1.0

4. How does canonical correlation analysis (CCA) differ from principal components analysis (PCA)? A. CCA is derived from relationships between sets of variables while PCA is derived from relations within a set of variables B. CCA uses unstandardized data while PCA uses standardized data C. CCA finds nonlinear relationships between variables while PCA finds linear relationships between variables D. CCA allows the data to come from non-multivariate normal distributions, while PCA assumes the data come from multivariate normal distributions

Quiz 9. Name: Closed books, notes, and no electronic devices.

5. In the first example about head lengths, the first principal component was essentially E. the sum of the head lengths of the two sons F. the head length of the first-born son G. the head length of the second-born son H. the difference of the head lengths of the two sons

6. In the heptathlon example, what was the relationship between the first principal component (PC) score and the score assigned by official Olympic rules? E. A strong inverse (negative) relationship F. A weak inverse (negative) relationship G. A strong direct (positive) relationship H. A weak direct (positive) relationship

7. Which principal component scores were statistically significant to predict Sulfur Dioxide concentration? E. All of them F. The first two G. Those whose eigenvalue was greater than 1.0 H. Those whose p-value was less than 0.05 in the regression model

8. Suppose the computer gives you an eigenvalue/eigenvector pair for the covariance matrix of a vector Y = (Y₁, Y₂, Y₃)ᵀ as {3.4, (0.53, …)ᵀ}. What is the variance of the linear combination 0.80Y₁ + …Y₂ + …Y₃? A. … B. … C. 3.4 D. …
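Question 8 rests on the defining property of principal components: if a is a unit eigenvector of the covariance matrix S with eigenvalue λ, then Var(aᵀY) = aᵀSa = λ. The covariance matrix below is an assumed toy example, not the one from the quiz.

```r
# If a is a unit-length eigenvector of a covariance matrix S with
# eigenvalue lambda, then Var(a'Y) = a'Sa = lambda (S is a toy matrix)
S <- matrix(c(2, 1, 0,
              1, 3, 1,
              0, 1, 2), nrow = 3)
e <- eigen(S)
a <- e$vectors[, 1]        # first eigenvector (unit length)
drop(t(a) %*% S %*% a)     # equals e$values[1], the largest eigenvalue
```

So when the stated linear combination is exactly the reported eigenvector, its variance is the reported eigenvalue, no matrix arithmetic required.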

Quiz 10. Name:

1. Multidimensional scaling (MDS) is a model for the A. n×q data matrix B. q×q covariance matrix C. q×q correlation matrix D. n×n proximity matrix

2. What did the MDS graph of the airline data purport to show? A. The clustering of airports on either the west or east coast B. The airports that are considered outliers C. The known spatial arrangement of the airports D. The three-dimensional locations of the airplanes

3. In the MDS analysis of the skull data, distance between epochs (time periods) was determined using A. Mahalanobis distance B. Average distance C. Euclidean distance D. Manhattan distance

4. The conclusion from the MDS analysis of water voles was that A. British water voles in different regions are relatively different from one another B. British water voles in different regions are relatively similar to one another C. British water voles are similar to Spanish water voles D. British water voles have longer tails than Yugoslavian water voles
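Classical MDS operates on the n×n proximity matrix of question 1; R's cmdscale reproduces a configuration whose interpoint distances match the input. The three toy points below are an assumption for illustration.

```r
# Classical MDS (cmdscale) works on an n x n distance matrix; with exact
# Euclidean distances it recovers the configuration up to rotation
pts <- matrix(c(0, 0,
                3, 0,
                0, 4), ncol = 2, byrow = TRUE)   # assumed toy points
d <- dist(pts)                  # the proximity (distance) matrix
fit <- cmdscale(d, k = 2)       # 2-dimensional MDS solution
max(abs(dist(fit) - d))         # ~0: interpoint distances preserved
```

This is why the airline-data plot could reproduce the known spatial arrangement of the airports from flight distances alone.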

Quiz 11. Name:

5. What kind of data do you use for a standard correspondence analysis? E. Bivariate categorical data F. Multivariate normal data G. Correlation matrix data H. Covariance matrix data

6. Correspondence analysis is commonly used to supplement which statistical test? E. Contingency table test for independence F. Two-sample t-test G. Pearson product-moment correlation test H. Spearman rank correlation test

7. Like multidimensional scaling, correspondence analysis also uses distance matrices. Which distance is used? E. Mahalanobis distance F. Average distance G. Euclidean distance H. Chi-squared distance

8. The conclusion from the correspondence analysis graph in the real example was that E. British water voles in different regions are relatively different from one another F. The first-born son is more similar to the second-born son than he is to the third-born son G. Historical skulls in the epoch c4000bc are more similar to skulls in c3300bc than they are to skulls in cad150 H. Girls … years old are more likely to have sexual relations than girls … years old

Quiz 12. Name:

1. The factor analysis model states that the A. manifest variables are functions of the latent variables B. latent variables are functions of the manifest variables C. manifest variables are functions of linear combinations of the manifest variables D. latent variables are linear combinations of the manifest variables

2. What is assumed about a latent variable f in factor analysis? A. E(f) = 0 and Var(f) = 0 B. E(f) = 0 and Var(f) = 1 C. E(f) = 1 and Var(f) = 0 D. E(f) = 1 and Var(f) = 1

3. The coefficient that multiplies a latent variable in a factor analysis model is called a A. squared correlation B. standard deviation C. eigenvector component D. loading

4. In the example where the factor analysis model was used to predict the exam scores in Classics, French, and English, what latent variable was assumed for the factor? A. Socioeconomic status B. Intelligence or intellectual ability C. Time spent studying for the exams D. Student-teacher ratio

Quiz 13. Name:

1. What is the conclusion of the Scale Invariance section? A. You should standardize the data before performing factor analysis B. When the variances of the variables differ greatly, you should use the correlation matrix in factor analysis C. The results of factor analysis are essentially equivalent, no matter whether the covariance matrix or correlation matrix is used D. You should use the scale function in R prior to performing factor analysis

2. What are the parameters that you have to estimate in the factor analysis model? A. The values of the latent variables and the loadings B. The loadings and the specific variances C. The values of the latent variable and the specific variances D. The specific means and the specific variances

3. What is the most respectable method of estimating the parameters in the factor analysis model? A. The spectral decomposition method B. The least squares method C. The maximum likelihood method D. The principal factor method

4. What is the main goal of factor rotation? A. To achieve a simple structure of the loading pattern B. To make the data better conform to the assumptions of factor analysis C. To better estimate the values of the latent variables D. To better estimate the values of the specific variances
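R's factanal estimates the loadings and specific variances by maximum likelihood, the method question 3 points to, and applies varimax rotation by default. The one-factor data below are simulated with assumed loadings, purely to show the workflow.

```r
# factanal() fits the factor model by maximum likelihood; rotation
# (varimax by default) is applied to the estimated loadings.
# One-factor data with assumed loadings, simulated for illustration:
set.seed(3)
f <- rnorm(300)                                # latent factor scores
E <- matrix(rnorm(900, sd = 0.5), ncol = 3)    # specific (unique) errors
X <- cbind(0.9 * f, 0.8 * f, 0.7 * f) + E      # manifest variables
fa <- factanal(X, factors = 1)                 # ML estimation
loadings(fa)                                   # all variables load on one factor
```

The fitted object also reports the uniquenesses (specific variances), the other set of parameters named in question 2.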

Quiz 14. Name:

1. In the expectations of life example, the manifest variables were A. four male life expectancy measures and four female life expectancy measures B. different countries C. three factors defining life expectancy D. the columns of the matrix ΛΛᵀ + Ψ

2. In the expectations of life example, the latent variables were life force measures defined for A. different people B. different countries C. different years D. different companies

3. In both the drug use example and the expectations of life example, how did they decide how many factors to choose? A. By using a chi-square test B. By using a t test C. By noting that the percentage of variance explained was more than 70% D. By noting that the percentage of variance explained was more than 90%

4. In the drug use example, the latent variables were drug seeking measures defined for A. different students B. different schools C. different years D. different teachers

Quiz 15. Name:

1. What is another term for clusters? A. Normal groups B. Non-normal groups C. Homogeneous groups D. Heterogeneous groups

2. Consider the following scatterplot. How many clusters are there? A. 2 B. 3 C. 4 D. It is not clear how many clusters there are

3. Before the process of agglomerative clustering can begin, what must first be calculated? A. A dendrogram B. A covariance or correlation matrix C. A mean vector and a covariance matrix D. A similarity or distance matrix

4. What does the two-cluster solution of jet fighters largely correspond to? A. High fuel efficiency jets and low fuel efficiency jets B. High altitude jets and low altitude jets C. Jets that can or cannot land on a carrier D. Old jets and new jets

Quiz 16. Name:

1. In the k-means clustering of entities based on crime rates, which was the outlier? A. Texas B. New York C. Washington, DC D. Honolulu

2. How were the variables standardized in the crime rate example? A. By dividing by their respective ranges B. By multiplying by their respective ranges C. By dividing by their respective standard deviations D. By multiplying by their respective standard deviations

3. When is a k-means clustering solution good? A. When the total number of clusters is large B. When the average number of observations within a cluster is high C. When the correlations between variables within a cluster are larger than 0.7 D. When the distances from within-cluster data values to the cluster mean are small

4. The clusters found in the Greco-Roman pottery example correspond best to A. Region where the pottery was found B. Year of discovery of the pottery C. Degree of solubility (in water) of the pottery D. Heat at which the pottery was fired
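A small k-means sketch ties questions 2 and 3 together: standardization can divide by standard deviations (scale) or by ranges (as in the crime-rate reading), and a good solution has small within-cluster distances. The two-group data are simulated toy values.

```r
# k-means after standardizing; scale() divides by standard deviations,
# while the crime-rate reading divided by ranges (both shown, toy data)
set.seed(4)
X <- rbind(matrix(rnorm(40, mean = 0), ncol = 2),
           matrix(rnorm(40, mean = 5), ncol = 2))  # two separated groups
Xs <- scale(X)                                     # divide by sd
rng <- apply(X, 2, function(v) diff(range(v)))
Xr <- sweep(X, 2, rng, "/")                        # divide by range
km <- kmeans(Xs, centers = 2, nstart = 10)
km$tot.withinss < km$totss   # TRUE for a good (compact) solution
```

kmeans reports tot.withinss, the summed squared distances from each point to its cluster mean, which is exactly the quantity question 3 says should be small.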


Quiz 17. Name:

1. What is the model in model-based clustering? A. The maximum likelihood method B. The hierarchical agglomeration method C. The Euclidean distance method D. The mixture of normals distribution

2. What do you assume about the shapes of the data in the clusters when you use model-based clustering? A. They are spherical B. They are elongated with the same orientation C. They can have different shapes but have the same volume D. They are ellipsoidal

3. In the diabetes analysis, what do the three chosen clusters refer to? A. three clinical diagnosis groups B. three levels of blood glucose C. young, middle age, and elderly patients D. three drug treatment groups

4. The mclust function gives results most similar to what? A. principal components B. factor analysis C. kernel density estimation D. multidimensional scaling
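The "model" of question 1 is a finite mixture of normal densities, which functions such as mclust::Mclust fit by maximum likelihood. A base-R sketch of a univariate two-component mixture (the weights and means are assumed toy values):

```r
# Model-based clustering assumes the data come from a mixture of
# normal densities; a two-component univariate mixture looks like this
dmix <- function(x) 0.4 * dnorm(x, mean = -2) + 0.6 * dnorm(x, mean = 3)
integrate(dmix, -Inf, Inf)$value                  # ~1: a proper density
optimize(dmix, c(0, 6), maximum = TRUE)$maximum   # mode near 3
```

Each fitted component plays the role of a cluster, and its estimated covariance controls the ellipsoidal shape that question 2 asks about.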


Quiz 18. Name:

1. In the wine example, what are the three known classes (or groups)? A. Wine in either a small, medium, or large bottle B. Wine from three different grape varieties (cultivars) C. Wine aged in either French oak, American oak, or stainless steel containers D. Red, white, or rosé wine

2. In general (not necessarily for the wine data), which model is fit for the within-group data when using MclustDA? A. A multivariate normal model with diagonal covariance matrix B. A multivariate normal model with any covariance matrix C. A mixture of multivariate normals model with diagonal covariance matrices D. A mixture of multivariate normals model with any covariance matrices

3. What kind of data are used to estimate the distributions used in the discriminant analysis model? A. Distance data B. Covariance data C. Training data D. Test data

4. What is test error? A. the proportion of test cases that are misclassified B. the difference between fitted and observed data in the test data set C. the sum of Type I and Type II errors for the likelihood ratio test D. the number of mistakes made on an examination

Quiz 19. Name:

5. How does exploratory factor analysis (EFA) differ from confirmatory factor analysis (CFA)? A. EFA constrains loadings to zero; CFA does not B. CFA constrains loadings to zero; EFA does not C. EFA assumes multivariate normal distributions; CFA does not D. CFA assumes multivariate normal distributions; EFA does not

6. What kind of data can be used to estimate CFA and/or structural equation (SEM) models? A. Covariance data B. Distance data C. Latent data D. Nominal data

7. The CFA and SEM model parameters are chosen to minimize a discrepancy function. How is discrepancy determined? A. By least squared differences B. By weighted least squared differences C. By average relative absolute difference D. By maximum likelihood assuming multivariate normality

8. The parameters θ of the SEM/CFA model are identifiable if A. Σ(θ₁) = Σ(θ₂) implies that θ₁ = θ₂ B. θ₁ = θ₂ implies that Σ(θ₁) = Σ(θ₂) C. The GFI index is greater than 0.90 or 0.95 D. The GFI index is less than 0.90 or 0.95

Quiz 20. Name: PCA = Principal Components Analysis; EFA = Exploratory Factor Analysis; CFA = Confirmatory Factor Analysis.

1. (20) The reading said EFA was usually ___ and PCA was usually ___. A. an exploratory method; a confirmatory method B. a confirmatory method; an exploratory method C. an end in and of itself; a means to an end D. a means to an end; an end in and of itself

2. (40) In which ways are EFA and CFA similar? Select all that apply. Five points per correct selection/non-selection. A. Both allow rotation of the loadings B. Both assume uncorrelated factors C. Both assume uncorrelated errors D. Both assume reflective measurements E. Both are exploratory F. Both allow you to set the loadings to zero a priori G. Both use maximum likelihood assuming multivariate normality to estimate parameters H. Both are models to explain the covariance structure of the observed variables

3. (20) How do you decide how many latent factors to use in your CFA model? (Select the best single answer) A. By using your a priori theory about the measurements B. By using a scree plot C. By using the chi-square test D. By using the discrepancy statistic

Quiz 21. Name:

1. The chi-square tests of model fit (i.e., tests of H0: Σ = Σ(θ), where Σ(θ) is the model-implied covariance matrix) discussed in Section 7.3 gave the following results. Ability and Aspiration study: ___; Drug use study: ___ A. Significant (p < .05); Significant (p < .05) B. Significant (p < .05); Insignificant (p > .05) C. Insignificant (p > .05); Significant (p < .05) D. Insignificant (p > .05); Insignificant (p > .05)

2. What is a normed residual? A. The difference between an observed and model-implied covariance B. The scaled difference between an observed and a model-implied covariance C. The difference between the observed data vector Xi and the predicted data vector D. The scaled difference between the observed data vector Xi and the predicted data vector

3. How is correlation between factors modelled using the R function sem? A. F1 <-> F2 B. F2 ~ F2 C. F1 ~~ F2 D. F1 >-< F2

4. In the video, how did the author initially choose three latent variables in the Confirmatory Factor Analysis? A. By using regression analysis B. By using principal components analysis C. By using exploratory factor analysis D. By using structural equation modeling

5. The usual CFA model fixes the variances of the latent variables at 1.0. The video allowed a second method where those three variances were free parameters, but A. the disturbance terms (error variances) were all fixed at 1.0 B. three of the disturbance terms (error variances) were fixed at 1.0 C. the loadings from the latent variable to manifest variables were all fixed at 1.0 D. three of the loadings from the latent variable to manifest variables were fixed at 1.0

6. In my data analysis file, the negative items were replaced via the code A. Q = −Q B. Q = Q − 1 C. Q = 5 − Q D. Q = 6 − Q

Consider the following plot, from my data analysis.

7. This is an example of what kind of structural model? A. Mediation B. Standard regression C. Multivariate regression D. Hierarchical factor

8. See the path diagram. The 0.56 between Q28 and Q29 refers to A. the variance of the common factor for Q28 and Q29 B. the correlation between Q28 and Q29 C. the covariance between Q28 and Q29 D. the covariance between error terms for Q28 and Q29

Quiz 22. Name:

1. What is the main distinguishing feature of structural equation models? A. Non-identifiability of parameters B. Uncorrelated latent factors C. Direct relationships between the manifest variables D. Direct relationships between the latent variables

2. The variable socioeconomic status was a ___ variable in the study. A. latent B. manifest C. normally distributed D. linear combination

3. The variable Alienation71 was modelled as a function of which variables? A. Alienation67, SES B. Alienation67, Powerlessness71 C. Anomia71, Powerlessness71 D. Anomia67, Powerlessness67

4. What was the result of the test of model fit (i.e., the test of H0: Σ = Σ(θ), where Σ(θ) is the model-implied covariance matrix) after modifying the model by allowing the measurement errors for anomia in 1967 and in 1971 to be correlated? A. Chi-square fit statistic with 6 degrees of freedom = …; model still fits poorly B. Chi-square fit statistic with 5 degrees of freedom = 6.359; model fit is better C. Chi-square fit statistic with 5 degrees of freedom = 6.359; model still fits poorly D. Chi-square fit statistic with 6 degrees of freedom = …; model fit is better

Short Answer Questions: Answer on your separate blank paper. Points are given in parentheses.



More information

Unconstrained Ordination

Unconstrained Ordination Unconstrained Ordination Sites Species A Species B Species C Species D Species E 1 0 (1) 5 (1) 1 (1) 10 (4) 10 (4) 2 2 (3) 8 (3) 4 (3) 12 (6) 20 (6) 3 8 (6) 20 (6) 10 (6) 1 (2) 3 (2) 4 4 (5) 11 (5) 8 (5)

More information

Fundamentals to Biostatistics. Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur

Fundamentals to Biostatistics. Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur Fundamentals to Biostatistics Prof. Chandan Chakraborty Associate Professor School of Medical Science & Technology IIT Kharagpur Statistics collection, analysis, interpretation of data development of new

More information

LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS

LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS NOTES FROM PRE- LECTURE RECORDING ON PCA PCA and EFA have similar goals. They are substantially different in important ways. The goal

More information

Linear Dimensionality Reduction

Linear Dimensionality Reduction Outline Hong Chang Institute of Computing Technology, Chinese Academy of Sciences Machine Learning Methods (Fall 2012) Outline Outline I 1 Introduction 2 Principal Component Analysis 3 Factor Analysis

More information

Applied Multivariate Statistical Analysis Richard Johnson Dean Wichern Sixth Edition

Applied Multivariate Statistical Analysis Richard Johnson Dean Wichern Sixth Edition Applied Multivariate Statistical Analysis Richard Johnson Dean Wichern Sixth Edition Pearson Education Limited Edinburgh Gate Harlow Essex CM20 2JE England and Associated Companies throughout the world

More information

Applied Multivariate Analysis

Applied Multivariate Analysis Department of Mathematics and Statistics, University of Vaasa, Finland Spring 2017 Dimension reduction Exploratory (EFA) Background While the motivation in PCA is to replace the original (correlated) variables

More information

Dependence. MFM Practitioner Module: Risk & Asset Allocation. John Dodson. September 11, Dependence. John Dodson. Outline.

Dependence. MFM Practitioner Module: Risk & Asset Allocation. John Dodson. September 11, Dependence. John Dodson. Outline. MFM Practitioner Module: Risk & Asset Allocation September 11, 2013 Before we define dependence, it is useful to define Random variables X and Y are independent iff For all x, y. In particular, F (X,Y

More information

Introduction to Matrix Algebra and the Multivariate Normal Distribution

Introduction to Matrix Algebra and the Multivariate Normal Distribution Introduction to Matrix Algebra and the Multivariate Normal Distribution Introduction to Structural Equation Modeling Lecture #2 January 18, 2012 ERSH 8750: Lecture 2 Motivation for Learning the Multivariate

More information

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN

I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN Introduction Edps/Psych/Stat/ 584 Applied Multivariate Statistics Carolyn J Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN c Board of Trustees,

More information

Principal Component Analysis

Principal Component Analysis I.T. Jolliffe Principal Component Analysis Second Edition With 28 Illustrations Springer Contents Preface to the Second Edition Preface to the First Edition Acknowledgments List of Figures List of Tables

More information

Subject CS1 Actuarial Statistics 1 Core Principles

Subject CS1 Actuarial Statistics 1 Core Principles Institute of Actuaries of India Subject CS1 Actuarial Statistics 1 Core Principles For 2019 Examinations Aim The aim of the Actuarial Statistics 1 subject is to provide a grounding in mathematical and

More information

From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author...

From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author... From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. Contents About This Book... xiii About The Author... xxiii Chapter 1 Getting Started: Data Analysis with JMP...

More information

Multivariate Data Analysis a survey of data reduction and data association techniques: Principal Components Analysis

Multivariate Data Analysis a survey of data reduction and data association techniques: Principal Components Analysis Multivariate Data Analysis a survey of data reduction and data association techniques: Principal Components Analysis For example Data reduction approaches Cluster analysis Principal components analysis

More information

HANDBOOK OF APPLICABLE MATHEMATICS

HANDBOOK OF APPLICABLE MATHEMATICS HANDBOOK OF APPLICABLE MATHEMATICS Chief Editor: Walter Ledermann Volume VI: Statistics PART A Edited by Emlyn Lloyd University of Lancaster A Wiley-Interscience Publication JOHN WILEY & SONS Chichester

More information

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation. CS 189 Spring 2015 Introduction to Machine Learning Midterm You have 80 minutes for the exam. The exam is closed book, closed notes except your one-page crib sheet. No calculators or electronic items.

More information

MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 1: Introduction, Multivariate Location and Scatter

MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 1: Introduction, Multivariate Location and Scatter MS-E2112 Multivariate Statistical Analysis (5cr) Lecture 1:, Multivariate Location Contents , pauliina.ilmonen(a)aalto.fi Lectures on Mondays 12.15-14.00 (2.1. - 6.2., 20.2. - 27.3.), U147 (U5) Exercises

More information

Basic Statistical Tools

Basic Statistical Tools Structural Health Monitoring Using Statistical Pattern Recognition Basic Statistical Tools Presented by Charles R. Farrar, Ph.D., P.E. Los Alamos Dynamics Structural Dynamics and Mechanical Vibration Consultants

More information

Introduction to Structural Equation Modeling

Introduction to Structural Equation Modeling Introduction to Structural Equation Modeling Notes Prepared by: Lisa Lix, PhD Manitoba Centre for Health Policy Topics Section I: Introduction Section II: Review of Statistical Concepts and Regression

More information

Exploratory Factor Analysis and Principal Component Analysis

Exploratory Factor Analysis and Principal Component Analysis Exploratory Factor Analysis and Principal Component Analysis Today s Topics: What are EFA and PCA for? Planning a factor analytic study Analysis steps: Extraction methods How many factors Rotation and

More information

Principle Components Analysis (PCA) Relationship Between a Linear Combination of Variables and Axes Rotation for PCA

Principle Components Analysis (PCA) Relationship Between a Linear Combination of Variables and Axes Rotation for PCA Principle Components Analysis (PCA) Relationship Between a Linear Combination of Variables and Axes Rotation for PCA Principle Components Analysis: Uses one group of variables (we will call this X) In

More information

Pharmaceutical Experimental Design and Interpretation

Pharmaceutical Experimental Design and Interpretation Pharmaceutical Experimental Design and Interpretation N. ANTHONY ARMSTRONG, B. Pharm., Ph.D., F.R.Pharm.S., MCPP. KENNETH C. JAMES, M. Pharm., Ph.D., D.Sc, FRSC, F.R.Pharm.S., C.Chem. Welsh School of Pharmacy,

More information

STAT 730 Chapter 1 Background

STAT 730 Chapter 1 Background STAT 730 Chapter 1 Background Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 27 Logistics Course notes hopefully posted evening before lecture,

More information

FINM 331: MULTIVARIATE DATA ANALYSIS FALL 2017 PROBLEM SET 3

FINM 331: MULTIVARIATE DATA ANALYSIS FALL 2017 PROBLEM SET 3 FINM 331: MULTIVARIATE DATA ANALYSIS FALL 2017 PROBLEM SET 3 The required files for all problems can be found in: http://www.stat.uchicago.edu/~lekheng/courses/331/hw3/ The file name indicates which problem

More information

Key Algebraic Results in Linear Regression

Key Algebraic Results in Linear Regression Key Algebraic Results in Linear Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) 1 / 30 Key Algebraic Results in

More information

Readings Howitt & Cramer (2014) Overview

Readings Howitt & Cramer (2014) Overview Readings Howitt & Cramer (4) Ch 7: Relationships between two or more variables: Diagrams and tables Ch 8: Correlation coefficients: Pearson correlation and Spearman s rho Ch : Statistical significance

More information

Statistícal Methods for Spatial Data Analysis

Statistícal Methods for Spatial Data Analysis Texts in Statistícal Science Statistícal Methods for Spatial Data Analysis V- Oliver Schabenberger Carol A. Gotway PCT CHAPMAN & K Contents Preface xv 1 Introduction 1 1.1 The Need for Spatial Analysis

More information

Multivariate Statistics (I) 2. Principal Component Analysis (PCA)

Multivariate Statistics (I) 2. Principal Component Analysis (PCA) Multivariate Statistics (I) 2. Principal Component Analysis (PCA) 2.1 Comprehension of PCA 2.2 Concepts of PCs 2.3 Algebraic derivation of PCs 2.4 Selection and goodness-of-fit of PCs 2.5 Algebraic derivation

More information

A User's Guide To Principal Components

A User's Guide To Principal Components A User's Guide To Principal Components J. EDWARD JACKSON A Wiley-Interscience Publication JOHN WILEY & SONS, INC. New York Chichester Brisbane Toronto Singapore Contents Preface Introduction 1. Getting

More information

Exploratory Factor Analysis and Principal Component Analysis

Exploratory Factor Analysis and Principal Component Analysis Exploratory Factor Analysis and Principal Component Analysis Today s Topics: What are EFA and PCA for? Planning a factor analytic study Analysis steps: Extraction methods How many factors Rotation and

More information

THE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2014, Mr. Ruey S. Tsay. Solutions to Final Exam

THE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2014, Mr. Ruey S. Tsay. Solutions to Final Exam THE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2014, Mr. Ruey S. Tsay Solutions to Final Exam 1. City crime: The distance matrix is 694 915 1073 528 716 881 972 464

More information

CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA

CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA Examples: Multilevel Modeling With Complex Survey Data CHAPTER 9 EXAMPLES: MULTILEVEL MODELING WITH COMPLEX SURVEY DATA Complex survey data refers to data obtained by stratification, cluster sampling and/or

More information

STAT 730 Chapter 9: Factor analysis

STAT 730 Chapter 9: Factor analysis STAT 730 Chapter 9: Factor analysis Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Data Analysis 1 / 15 Basic idea Factor analysis attempts to explain the

More information

1 A Review of Correlation and Regression

1 A Review of Correlation and Regression 1 A Review of Correlation and Regression SW, Chapter 12 Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then

More information

Readings Howitt & Cramer (2014)

Readings Howitt & Cramer (2014) Readings Howitt & Cramer (014) Ch 7: Relationships between two or more variables: Diagrams and tables Ch 8: Correlation coefficients: Pearson correlation and Spearman s rho Ch 11: Statistical significance

More information

MACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA

MACHINE LEARNING. Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA 1 MACHINE LEARNING Methods for feature extraction and reduction of dimensionality: Probabilistic PCA and kernel PCA 2 Practicals Next Week Next Week, Practical Session on Computer Takes Place in Room GR

More information

psychological statistics

psychological statistics psychological statistics B Sc. Counselling Psychology 011 Admission onwards III SEMESTER COMPLEMENTARY COURSE UNIVERSITY OF CALICUT SCHOOL OF DISTANCE EDUCATION CALICUT UNIVERSITY.P.O., MALAPPURAM, KERALA,

More information

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami

Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami Parametric Assumptions The observations must be independent. Dependent variable should be continuous

More information

Statistics Toolbox 6. Apply statistical algorithms and probability models

Statistics Toolbox 6. Apply statistical algorithms and probability models Statistics Toolbox 6 Apply statistical algorithms and probability models Statistics Toolbox provides engineers, scientists, researchers, financial analysts, and statisticians with a comprehensive set of

More information

The Common Factor Model. Measurement Methods Lecture 15 Chapter 9

The Common Factor Model. Measurement Methods Lecture 15 Chapter 9 The Common Factor Model Measurement Methods Lecture 15 Chapter 9 Today s Class Common Factor Model Multiple factors with a single test ML Estimation Methods New fit indices because of ML Estimation method

More information

Overview of clustering analysis. Yuehua Cui

Overview of clustering analysis. Yuehua Cui Overview of clustering analysis Yuehua Cui Email: cuiy@msu.edu http://www.stt.msu.edu/~cui A data set with clear cluster structure How would you design an algorithm for finding the three clusters in this

More information

Dependence. Practitioner Course: Portfolio Optimization. John Dodson. September 10, Dependence. John Dodson. Outline.

Dependence. Practitioner Course: Portfolio Optimization. John Dodson. September 10, Dependence. John Dodson. Outline. Practitioner Course: Portfolio Optimization September 10, 2008 Before we define dependence, it is useful to define Random variables X and Y are independent iff For all x, y. In particular, F (X,Y ) (x,

More information

A Program for Data Transformations and Kernel Density Estimation

A Program for Data Transformations and Kernel Density Estimation A Program for Data Transformations and Kernel Density Estimation John G. Manchuk and Clayton V. Deutsch Modeling applications in geostatistics often involve multiple variables that are not multivariate

More information

Confirmatory Factor Analysis. Psych 818 DeShon

Confirmatory Factor Analysis. Psych 818 DeShon Confirmatory Factor Analysis Psych 818 DeShon Purpose Takes factor analysis a few steps further. Impose theoretically interesting constraints on the model and examine the resulting fit of the model with

More information

x. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ).

x. Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ 2 ). .8.6 µ =, σ = 1 µ = 1, σ = 1 / µ =, σ =.. 3 1 1 3 x Figure 1: Examples of univariate Gaussian pdfs N (x; µ, σ ). The Gaussian distribution Probably the most-important distribution in all of statistics

More information

WELCOME! Lecture 14: Factor Analysis, part I Måns Thulin

WELCOME! Lecture 14: Factor Analysis, part I Måns Thulin Quantitative methods II WELCOME! Lecture 14: Factor Analysis, part I Måns Thulin The first factor analysis C. Spearman (1904). General intelligence, objectively determined and measured. The American Journal

More information

Visualizing Tests for Equality of Covariance Matrices Supplemental Appendix

Visualizing Tests for Equality of Covariance Matrices Supplemental Appendix Visualizing Tests for Equality of Covariance Matrices Supplemental Appendix Michael Friendly and Matthew Sigal September 18, 2017 Contents Introduction 1 1 Visualizing mean differences: The HE plot framework

More information

Lecture 13. Principal Component Analysis. Brett Bernstein. April 25, CDS at NYU. Brett Bernstein (CDS at NYU) Lecture 13 April 25, / 26

Lecture 13. Principal Component Analysis. Brett Bernstein. April 25, CDS at NYU. Brett Bernstein (CDS at NYU) Lecture 13 April 25, / 26 Principal Component Analysis Brett Bernstein CDS at NYU April 25, 2017 Brett Bernstein (CDS at NYU) Lecture 13 April 25, 2017 1 / 26 Initial Question Intro Question Question Let S R n n be symmetric. 1

More information

CHAPTER 5. Outlier Detection in Multivariate Data

CHAPTER 5. Outlier Detection in Multivariate Data CHAPTER 5 Outlier Detection in Multivariate Data 5.1 Introduction Multivariate outlier detection is the important task of statistical analysis of multivariate data. Many methods have been proposed for

More information

26:010:557 / 26:620:557 Social Science Research Methods

26:010:557 / 26:620:557 Social Science Research Methods 26:010:557 / 26:620:557 Social Science Research Methods Dr. Peter R. Gillett Associate Professor Department of Accounting & Information Systems Rutgers Business School Newark & New Brunswick 1 Overview

More information

Syllabus: Statistics 2 (7.5 hp)

Syllabus: Statistics 2 (7.5 hp) Department of Psychology, Stockholm University Doctoral program in Psychology, spring semester 2014 Syllabus: Statistics 2 (7.5 hp) Prior knowledge The course assumes prior knowledge corresponding to the

More information

Midterm 2 - Solutions

Midterm 2 - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman Midterm 2 - Solutions You have until 10:20am to complete this exam. Please remember to put

More information

Textbook Examples of. SPSS Procedure

Textbook Examples of. SPSS Procedure Textbook s of IBM SPSS Procedures Each SPSS procedure listed below has its own section in the textbook. These sections include a purpose statement that describes the statistical test, identification of

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

Introduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin

Introduction to Machine Learning. PCA and Spectral Clustering. Introduction to Machine Learning, Slides: Eran Halperin 1 Introduction to Machine Learning PCA and Spectral Clustering Introduction to Machine Learning, 2013-14 Slides: Eran Halperin Singular Value Decomposition (SVD) The singular value decomposition (SVD)

More information

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Lecture - 39 Regression Analysis Hello and welcome to the course on Biostatistics

More information

Gaussian random variables inr n

Gaussian random variables inr n Gaussian vectors Lecture 5 Gaussian random variables inr n One-dimensional case One-dimensional Gaussian density with mean and standard deviation (called N, ): fx x exp. Proposition If X N,, then ax b

More information

Principal Component Analysis. Applied Multivariate Statistics Spring 2012

Principal Component Analysis. Applied Multivariate Statistics Spring 2012 Principal Component Analysis Applied Multivariate Statistics Spring 2012 Overview Intuition Four definitions Practical examples Mathematical example Case study 2 PCA: Goals Goal 1: Dimension reduction

More information

Vector Space Models. wine_spectral.r

Vector Space Models. wine_spectral.r Vector Space Models 137 wine_spectral.r Latent Semantic Analysis Problem with words Even a small vocabulary as in wine example is challenging LSA Reduce number of columns of DTM by principal components

More information

Robustness of Principal Components

Robustness of Principal Components PCA for Clustering An objective of principal components analysis is to identify linear combinations of the original variables that are useful in accounting for the variation in those original variables.

More information

Lecture Topic Projects 1 Intro, schedule, and logistics 2 Applications of visual analytics, data types 3 Data sources and preparation Project 1 out 4

Lecture Topic Projects 1 Intro, schedule, and logistics 2 Applications of visual analytics, data types 3 Data sources and preparation Project 1 out 4 Lecture Topic Projects 1 Intro, schedule, and logistics 2 Applications of visual analytics, data types 3 Data sources and preparation Project 1 out 4 Data reduction, similarity & distance, data augmentation

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables

More information

Dimensionality Reduction and Principal Components

Dimensionality Reduction and Principal Components Dimensionality Reduction and Principal Components Nuno Vasconcelos (Ken Kreutz-Delgado) UCSD Motivation Recall, in Bayesian decision theory we have: World: States Y in {1,..., M} and observations of X

More information

Dimensionality Reduction and Principle Components

Dimensionality Reduction and Principle Components Dimensionality Reduction and Principle Components Ken Kreutz-Delgado (Nuno Vasconcelos) UCSD ECE Department Winter 2012 Motivation Recall, in Bayesian decision theory we have: World: States Y in {1,...,

More information

Lecture 16: Small Sample Size Problems (Covariance Estimation) Many thanks to Carlos Thomaz who authored the original version of these slides

Lecture 16: Small Sample Size Problems (Covariance Estimation) Many thanks to Carlos Thomaz who authored the original version of these slides Lecture 16: Small Sample Size Problems (Covariance Estimation) Many thanks to Carlos Thomaz who authored the original version of these slides Intelligent Data Analysis and Probabilistic Inference Lecture

More information

Psych Jan. 5, 2005

Psych Jan. 5, 2005 Psych 124 1 Wee 1: Introductory Notes on Variables and Probability Distributions (1/5/05) (Reading: Aron & Aron, Chaps. 1, 14, and this Handout.) All handouts are available outside Mija s office. Lecture

More information

Dimension reduction, PCA & eigenanalysis Based in part on slides from textbook, slides of Susan Holmes. October 3, Statistics 202: Data Mining

Dimension reduction, PCA & eigenanalysis Based in part on slides from textbook, slides of Susan Holmes. October 3, Statistics 202: Data Mining Dimension reduction, PCA & eigenanalysis Based in part on slides from textbook, slides of Susan Holmes October 3, 2012 1 / 1 Combinations of features Given a data matrix X n p with p fairly large, it can

More information

DISCOVERING STATISTICS USING R

DISCOVERING STATISTICS USING R DISCOVERING STATISTICS USING R ANDY FIELD I JEREMY MILES I ZOE FIELD Los Angeles London New Delhi Singapore j Washington DC CONTENTS Preface How to use this book Acknowledgements Dedication Symbols used

More information

Principal Components Theory Notes

Principal Components Theory Notes Principal Components Theory Notes Charles J. Geyer August 29, 2007 1 Introduction These are class notes for Stat 5601 (nonparametrics) taught at the University of Minnesota, Spring 2006. This not a theory

More information

Prentice Hall Stats: Modeling the World 2004 (Bock) Correlated to: National Advanced Placement (AP) Statistics Course Outline (Grades 9-12)

Prentice Hall Stats: Modeling the World 2004 (Bock) Correlated to: National Advanced Placement (AP) Statistics Course Outline (Grades 9-12) National Advanced Placement (AP) Statistics Course Outline (Grades 9-12) Following is an outline of the major topics covered by the AP Statistics Examination. The ordering here is intended to define the

More information

Unsupervised machine learning

Unsupervised machine learning Chapter 9 Unsupervised machine learning Unsupervised machine learning (a.k.a. cluster analysis) is a set of methods to assign objects into clusters under a predefined distance measure when class labels

More information

Warner, R. M. (2008). Applied Statistics: From bivariate through multivariate techniques. Thousand Oaks: Sage.

Warner, R. M. (2008). Applied Statistics: From bivariate through multivariate techniques. Thousand Oaks: Sage. Errata for Warner, R. M. (2008). Applied Statistics: From bivariate through multivariate techniques. Thousand Oaks: Sage. Most recent update: March 4, 2009 Please send information about any errors in the

More information

Pattern Recognition and Machine Learning

Pattern Recognition and Machine Learning Christopher M. Bishop Pattern Recognition and Machine Learning ÖSpri inger Contents Preface Mathematical notation Contents vii xi xiii 1 Introduction 1 1.1 Example: Polynomial Curve Fitting 4 1.2 Probability

More information

Transition Passage to Descriptive Statistics 28

Transition Passage to Descriptive Statistics 28 viii Preface xiv chapter 1 Introduction 1 Disciplines That Use Quantitative Data 5 What Do You Mean, Statistics? 6 Statistics: A Dynamic Discipline 8 Some Terminology 9 Problems and Answers 12 Scales of

More information

Chapter Goals. To understand the methods for displaying and describing relationship among variables. Formulate Theories.

Chapter Goals. To understand the methods for displaying and describing relationship among variables. Formulate Theories. Chapter Goals To understand the methods for displaying and describing relationship among variables. Formulate Theories Interpret Results/Make Decisions Collect Data Summarize Results Chapter 7: Is There

More information

5. Discriminant analysis

5. Discriminant analysis 5. Discriminant analysis We continue from Bayes s rule presented in Section 3 on p. 85 (5.1) where c i is a class, x isap-dimensional vector (data case) and we use class conditional probability (density

More information

Statistics Introductory Correlation

Statistics Introductory Correlation Statistics Introductory Correlation Session 10 oscardavid.barrerarodriguez@sciencespo.fr April 9, 2018 Outline 1 Statistics are not used only to describe central tendency and variability for a single variable.

More information