2023 – 2024 IBM Cognitive Class – Machine Learning – Dimensionality Reduction Answers

### Module 1: Data Series

**1. Which of the following techniques can be used to reduce the dimensions of the population?**

- Exploratory Data Analysis
- Principal Component Analysis
- Exploratory Factor Analysis
- **Cluster Analysis**

**2. Cluster Analysis partitions the columns of the data, whereas principal component and exploratory factor analyses partition the rows of the data. True or false?**

- **False**
- True

**3. Which of the following options are true? Select all that apply.**

- PCA explains the total variance
- EFA explains the common variance
- EFA identifies measures that are sufficiently similar to each other to justify combination
- PCA captures latent constructs that are assumed to cause variance

### Module 2: Data Refinement

**1. Which of the following options is true?**

- **A matrix of correlations describes all possible pairwise relationships**
- Eigenvalues are the principal components
- Correlation does not explain the covariation between two vectors
- Eigenvectors are a measure of total variance, as explained by the principal components

**2. PCA is a method to reduce your data to the fewest ‘principal components’ while maximizing the variance explained. True or false?**

- False
- **True**
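
As a rough illustration of the statement in this question, PCA can be sketched as an eigen-decomposition of the correlation matrix, where each eigenvalue is the variance carried by one component. The data below are synthetic and the function name `pca_variance_explained` is hypothetical; this is a minimal sketch, not the course's own code.

```python
import numpy as np

def pca_variance_explained(data):
    """Return eigenvalues (largest first) and the cumulative share of variance."""
    corr = np.corrcoef(data, rowvar=False)          # p x p correlation matrix
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]    # eigvalsh sorts ascending, so reverse
    proportion = eigenvalues / eigenvalues.sum()
    return eigenvalues, np.cumsum(proportion)

# Synthetic data (an assumption for illustration): 6 variables mixed from 2 hidden sources.
rng = np.random.default_rng(0)
sources = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 6))
data = sources @ mixing + 0.3 * rng.normal(size=(500, 6))

eigenvalues, cumulative = pca_variance_explained(data)
print(np.round(eigenvalues, 2))
print(np.round(cumulative, 2))
```

There are as many eigenvalues as variables, and on a correlation matrix they sum to the number of variables; the cumulative share shows how few components are needed to cover most of the variance.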

**3. Which of the following techniques was NOT covered in this lesson?**

- **Parallel analysis**
- Percentage of Common Variance
- Scree Test
- Kaiser-Guttman Rule

### Module 3: Exploring Data

**1. EFA is commonly used in which of the following applications? Select all that apply.**

- **Customer satisfaction surveys**
- **Personality tests**
- **Performance evaluations**
- Image analysis

**2. Which of the following options is an example of an Oblique Rotation?**

- Regmax
- Varimax
- Softmax
- **Promax**

**3. An Orthogonal Rotation assumes that factors are correlated with each other. True or false?**

- **False**
- True
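
To make the idea of an orthogonal rotation concrete, the sketch below multiplies a loadings matrix by an orthogonal matrix. The loadings and the 30-degree angle are hypothetical values chosen only for illustration; a real rotation method such as Varimax would choose the angle by optimizing a criterion.

```python
import numpy as np

# Hypothetical loadings: 4 variables (rows) on 2 factors (columns).
loadings = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.3, 0.6],
])

theta = np.deg2rad(30)  # arbitrary angle, not chosen by any rotation criterion
rotation = np.array([
    [np.cos(theta), -np.sin(theta)],
    [np.sin(theta),  np.cos(theta)],
])

rotated = loadings @ rotation

# An orthogonal matrix satisfies R'R = I, so the rotated axes stay at right angles.
print(np.allclose(rotation.T @ rotation, np.eye(2)))

# The variance each variable shares with the factors is unchanged by the rotation.
print(np.allclose((loadings ** 2).sum(axis=1), (rotated ** 2).sum(axis=1)))
```

Because the rotation matrix is orthogonal, the factors remain uncorrelated after rotation; an oblique rotation such as Promax relaxes that constraint.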

### Machine Learning – Dimensionality Reduction Final Exam Answers

**1. Why might you use cluster analysis as an analytic strategy?**

- To identify higher-order dimensions
- To identify outliers
- To reduce the number of variables
- **To segment the market**
- None of the above

**2. Suppose you have 100,000 individuals in a dataset, and each individual varies along 60 dimensions. On average, the dimensions are correlated at r = .45. You want to group the variables together, so you decide to run principal component analysis. How many meaningful, higher-order components can you extract?**

- 60
- 3
- **20**
- 24
- The answer cannot be determined

**3. What technique should you use to identify the dimensions that hang together?**

- Principal axis factoring
- Confirmatory factor analysis
- **Exploratory factor analysis**
- Two of the above
- None of the above

**4. What are loadings?**

- Covariance between the two factors
- Correlations between each variable and its factor
- Correlations between each variable and its component
- **Two of the above**
- None of the above
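
For the PCA case, the description of loadings as correlations between each variable and its component can be checked numerically. The sketch below uses synthetic data and assumes PCA on the correlation matrix, where loadings are the eigenvectors scaled by the square root of their eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data (an assumption): 4 variables sharing one common source, 300 observations.
base = rng.normal(size=(300, 1))
data = base + 0.8 * rng.normal(size=(300, 4))

p = data.shape[1]
z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)   # standardize each variable
corr = np.corrcoef(z, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)            # eigenvalues in ascending order

loadings = eigenvectors * np.sqrt(eigenvalues)   # scale each column by sqrt(eigenvalue)
scores = z @ eigenvectors                        # component scores for every observation

# Correlate every variable with every component score.
joint = np.corrcoef(np.hstack([z, scores]), rowvar=False)
variable_component_corr = joint[:p, p:]

print(np.allclose(loadings, variable_component_corr))
```

The final check compares the scaled eigenvectors with the directly computed variable-to-component correlations.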

**5. When would you use PCA over EFA?**

- When you want to use an orthogonal rotation
- **When you are interested in explaining the total variance in a variance-covariance matrix**
- When you have too many variables
- When you are interested in a latent construct
- None of the above

**6. What is uniqueness?**

- A measure of replicability of the factor
- **The amount of variance not explained by the factor structure**
- The amount of variance explained by the factor structure
- The amount of variance explained by the factor
- None of the above

**7. Suppose you are looking to extract the major dimensions of a parrot’s personality. Which technique would you use?**

- Maximum likelihood
- Principal component analysis
- Cluster analysis
- **Factor analysis**
- None of the above

**8. Suppose you have 60 variables in a dataset, and you know that 2 components explain the data very well. How many components can you extract?**

- 45
- 5
- **60**
- 2
- None of the above

**9. When would you use an orthogonal rotation?**

- When correlations between the variables are large
- When you observe small correlations between the variables in the dataset
- **When you think that the factors are uncorrelated**
- All of the above
- None of the above

**10. When would you use confirmatory factor analysis?**

- **When you want to validate the factor solution**
- When you want to explain the variance in the matrix accounting for the measurement error
- When you want to identify the factors
- Two of the above
- None of the above

**11. Which of the following is NOT a rule when deciding on the number of factors?**

- **Newman-Frank Test**
- Percentage of common variance explained
- Scree test
- Kaiser-Guttman
- None of the above
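
Two of the retention rules listed in this question can be sketched in a few lines. The eigenvalues below are hypothetical, both function names are illustrative, and the variance rule is applied here to the share of the eigenvalue total, which is a simplification of the common-variance version used in factor analysis.

```python
def kaiser_guttman(eigenvalues):
    """Count components whose eigenvalue exceeds 1 (correlation-matrix scale)."""
    return sum(1 for value in eigenvalues if value > 1.0)

def components_for_variance(eigenvalues, threshold):
    """Smallest number of components whose cumulative share reaches `threshold`."""
    total = sum(eigenvalues)
    running = 0.0
    for count, value in enumerate(sorted(eigenvalues, reverse=True), start=1):
        running += value
        if running / total >= threshold:
            return count
    return len(eigenvalues)

# Hypothetical eigenvalues for a 6-variable correlation matrix (they sum to 6).
eigenvalues = [2.9, 1.4, 0.7, 0.5, 0.3, 0.2]

print(kaiser_guttman(eigenvalues))                # 2: only 2.9 and 1.4 exceed 1
print(components_for_variance(eigenvalues, 0.5))  # 2: 2.9/6 < 0.5, (2.9 + 1.4)/6 >= 0.5
```

A scree test would instead look for the elbow in a plot of these same eigenvalues, which is a visual judgment rather than a fixed threshold.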

**12. What is one assumption of factor analysis?**

- A number of factors can be determined via the Scree test
- Factor analysis will extract only unique factors
- **A latent variable causes the variance in observed variables**
- There is no measurement error
- None of the above

**13. What is an eigenvector?**

- The proportion of the variance explained in the matrix
- A higher-order dimension that subsumes all of the lower-order errors
- **A higher-order dimension that subsumes similar lower-order dimensions**
- A higher-order dimension that subsumes all lower-order dimensions
- None of the above

**14. What is a promax rotation?**

- A rotation method that minimizes the square loadings on each factor
- **A rotation method that maximizes the variance explained**
- A rotation method that maximizes the square loadings on each factor
- A rotation method that minimizes the variance explained
- None of the above

**15. What is the cut-off point for the Common Variance Explained rule?**

- 80% of variance explained
- **50% of variance explained**
- 3 variables
- 1 unit
- None of the above

**16. Why would you try to reduce dimensions?**

- Individuals need to be placed into groups
- Variables are highly correlated
- **Many variables are likely assessing the same thing**
- Two of the above
- All of the above

**17. If you have 20 variables in a dataset, how many dimensions are there?**

- **At most 20**
- At least 20
- As many as the number of factors you can extract
- Not enough information
- None of the above

**18. What term describes the amount of variance of each variable explained by the factor structure?**

- Eigenvector
- Commonality
- Similarity
- **Communality**
- None of the above
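
Communality, and its complement uniqueness from question 6, can be computed directly from a loadings matrix. The sketch below assumes standardized variables and uncorrelated (orthogonal) factors; the loadings are hypothetical.

```python
import numpy as np

# Hypothetical loadings: 4 variables (rows) on 2 orthogonal factors (columns).
loadings = np.array([
    [0.8, 0.1],
    [0.7, 0.2],
    [0.1, 0.9],
    [0.3, 0.6],
])

communality = (loadings ** 2).sum(axis=1)  # variance of each variable explained by the factors
uniqueness = 1.0 - communality             # variance of each variable left unexplained

for h2, u2 in zip(communality, uniqueness):
    print(f"communality={h2:.2f}  uniqueness={u2:.2f}")
```

For the first variable this gives a communality of 0.8² + 0.1² = 0.65 and a uniqueness of 0.35.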

**19. What package contains the necessary functions to perform PCA and EFA?**

- ggplot2
- FA
- **psych**
- factAnalis
- None of the above

**20. What is the best method for identifying the number of factors to extract?**

- **Parallel Analysis**
- Scree test
- Newman-Frank Test
- Percentage of common variance explained
- All of the above
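
The idea behind parallel analysis is to keep only the components whose eigenvalues are larger than those obtained from random data of the same shape. The sketch below is a simplified version under stated assumptions: normal random noise, a 95th-percentile threshold, and synthetic data built from two hidden factors. The function name `parallel_analysis` is hypothetical and this is not the `psych` package's implementation.

```python
import numpy as np

def parallel_analysis(data, n_simulations=100, percentile=95, seed=0):
    """Count leading components whose eigenvalue beats the random-data threshold."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    simulated = np.empty((n_simulations, p))
    for i in range(n_simulations):
        noise = rng.normal(size=(n, p))   # uncorrelated data with the same shape
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(simulated, percentile, axis=0)

    keep = 0
    for obs, thr in zip(observed, threshold):
        if obs <= thr:        # stop at the first component that noise could explain
            break
        keep += 1
    return keep, observed, threshold

# Synthetic data (an assumption): variables 1-3 follow factor 1, variables 4-6 follow factor 2.
rng = np.random.default_rng(42)
factors = rng.normal(size=(500, 2))
pattern = np.array([[1, 1, 1, 0, 0, 0],
                    [0, 0, 0, 1, 1, 1]], dtype=float)
data = factors @ pattern + 0.5 * rng.normal(size=(500, 6))

keep, observed, threshold = parallel_analysis(data)
print(keep)
print(np.round(observed, 2))
print(np.round(threshold, 2))
```

With a structure this clear, the sketch should suggest retaining two components, matching the two factors used to generate the data.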