| 26. Review: Orthogonal projections | blank notes | annotated notes |
| 27. PCA: geometric interpretation | blank notes | annotated notes |
| 28. PCA: further components | blank notes | annotated notes |
| pca_mnist.ipynb | notebook file | HTML version |
- How would you explain the purpose of finding the orthogonal projection of a vector \(v\) onto \(u\)?
- Why might it be beneficial to use Principal Component Analysis (PCA) on a data set with many variables?
- How much time do we save by using PCA (as in our example of going from 784 to roughly 250 dimensions)?
- Are there any courses (at UB or online) that you would recommend which teach MNIST?
How can we determine the optimal number of principal components to retain in PCA, especially in scenarios where we’re dealing with high-dimensional data?
The answer depends on the particular situation: the data being analyzed and what one wants to do with it. There are some rules of thumb that people sometimes follow, e.g. that one should use enough components to capture at least 90% or 95% of the total variance. One can also plot the eigenvalues of the covariance matrix and see if at some point they start decreasing less rapidly. A frequently used method is cross-validation, which amounts to trying different numbers of principal components on some training data and checking which number works best.
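The variance rule mentioned above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `components_for_variance`, the default threshold, and the synthetic data are hypothetical and are not taken from the course notebook.

```python
import numpy as np

def components_for_variance(X, threshold=0.95):
    """Smallest number of principal components whose eigenvalues
    account for at least `threshold` of the total variance.

    X has one sample per row and one variable per column (an assumption).
    """
    Xc = X - X.mean(axis=0)                    # demean each variable
    cov = np.cov(Xc, rowvar=False)             # covariance matrix of the variables
    eigvals = np.linalg.eigvalsh(cov)[::-1]    # eigenvalues, largest first
    ratio = np.cumsum(eigvals) / eigvals.sum() # cumulative fraction of variance
    k = int(np.searchsorted(ratio, threshold)) + 1
    return min(k, len(eigvals)), ratio

# Synthetic data (made up for illustration): 200 samples, 10 variables,
# with most of the variance concentrated in the first two variables.
rng = np.random.default_rng(0)
scales = np.array([10, 5, 1, 1, 1, 1, 1, 1, 1, 1])
X = rng.normal(size=(200, 10)) * scales

k, ratio = components_for_variance(X, threshold=0.90)
print("components kept:", k)
print("cumulative variance fractions:", np.round(ratio, 3))
```

Plotting `ratio` (or the eigenvalues themselves) against the component index gives the picture used for the "decreasing less rapidly" check.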
What are the main applications for PCA? What fields use it the most?
PCA is most commonly used as a dimensionality reduction technique. It is used in data analysis and machine learning.
If a dataset contains outliers, how does PCA account for it or how do we account for it to still effectively use PCA?
PCA is sensitive to outliers in the same way that \(L_2\) regression is. For this reason, data should be checked for outliers before PCA is applied. There are some methods for this.
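A small made-up example shows how strong this sensitivity can be. The data and the helper `first_component` are hypothetical: 21 points lie on the horizontal axis, and a single distant point is enough to turn the first principal direction from horizontal to vertical.

```python
import numpy as np

def first_component(X):
    """Unit vector in the direction of the first principal component."""
    Xc = X - X.mean(axis=0)                       # demean the data
    eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc)  # eigenvalues in ascending order
    return eigvecs[:, -1]                         # eigenvector of the largest one

# 21 points on the x-axis between -1 and 1
t = np.linspace(-1, 1, 21)
clean = np.column_stack([t, np.zeros_like(t)])

# the same data with one outlier added far away on the y-axis
dirty = np.vstack([clean, [0.0, 20.0]])

pc_clean = first_component(clean)
pc_dirty = first_component(dirty)
print("without outlier:", pc_clean)
print("with outlier:   ", pc_dirty)
```

The sign of each printed vector is arbitrary, since an eigenvector and its negative describe the same direction.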
Do people ever make art out of these topics? For example, people have made art about the Collatz Conjecture, so I’m wondering if there is any graph-related art.
I have not seen art made with PCA, but PCA is often used to analyze satellite photos, and the resulting images look quite artistic, see this one, for example.
Is demeaning data necessary or is it more for the presentation of the results?
It is necessary. Without it, PCA will give wrong results.
I have a question about the Python code. If I use the Python code from your post, how can I use it in my HW?
I am afraid I don’t understand the issue. Please ask me in person (in class, after class, at office hours) and I will try to help.
- In lecture note 27 I saw the proj being used and wondered if I could get more insight about it. I have seen proj in previous courses before but I would still like more clarification about what proj means and what it does and how we compute it.
- Another question I have is about the difference matrix. When computing the difference matrix do the matrices have to have the same dimensions or can we take the difference matrix from matrices of different dimensions just like how we can multiply a 2x3 and a 3x2 matrix?
Orthogonal projection of a vector \(v\) onto a vector \(u\) decomposes \(v\) into a vector that is a scalar multiple of \(u\) and a vector which is orthogonal to \(u\). A different way of looking at this is that the orthogonal projection produces a vector which is a scalar multiple of \(u\), but at the same time is as close to \(v\) as possible.
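In terms of dot products, the projection is \(\operatorname{proj}_u v = \frac{v \cdot u}{u \cdot u}\, u\) for a nonzero vector \(u\). The following is a minimal sketch of that computation; the function name `proj` and the sample vectors are chosen here for illustration only.

```python
import numpy as np

def proj(v, u):
    """Orthogonal projection of v onto the line spanned by a nonzero vector u."""
    v = np.asarray(v, dtype=float)
    u = np.asarray(u, dtype=float)
    return (v @ u) / (u @ u) * u

v = np.array([3.0, 4.0])
u = np.array([1.0, 0.0])

p = proj(v, u)   # the part of v that is a scalar multiple of u
w = v - p        # the remaining part of v, orthogonal to u

print("projection:", p)
print("orthogonal part:", w)
print("dot product of orthogonal part with u:", w @ u)

# p is at least as close to v as any other multiple of u tried here
for c in (-1.0, 0.0, 2.0, 5.0):
    assert np.linalg.norm(v - p) <= np.linalg.norm(v - c * u)
```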
Addition and subtraction of matrices are defined only for matrices of the same dimensions.
Why do we always have to use demeaned data for PCA?
One way to look at it is that PCA provides a new coordinate system for a given set of data. The origin of this coordinate system is the center (i.e. the mean) of the data. Demeaning shifts the data so that the origin of this new coordinate system agrees with the usual origin.
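A small made-up example illustrates what goes wrong otherwise. The data and the helper `first_direction` are hypothetical: the points are spread along the direction \((1, -1)\) but centered at \((10, 10)\). Without demeaning, the leading eigenvector points from the origin toward the mean of the data and says nothing about how the data is spread.

```python
import numpy as np

def first_direction(X):
    """Unit eigenvector of X^T X corresponding to the largest eigenvalue."""
    eigvals, eigvecs = np.linalg.eigh(X.T @ X)   # eigenvalues in ascending order
    return eigvecs[:, -1]

# points spread along the direction (1, -1), centered at (10, 10)
t = np.linspace(-1, 1, 21)
X = np.column_stack([10 + t, 10 - t])

raw = first_direction(X)                         # data used as is
centered = first_direction(X - X.mean(axis=0))   # demeaned data

print("without demeaning:", raw)       # parallel to (1, 1), the direction of the mean
print("with demeaning:   ", centered)  # parallel to (1, -1), the direction of the spread
```

The sign of each printed vector is arbitrary; only the direction matters.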