Probabilistic View of Principal Component Analysis | by Saptashwa Bhattacharyya | Jul, 2023


One of the most widely used dimension reduction techniques in data science and machine learning is Principal Component Analysis (PCA). Previously, we have discussed a few examples of applying PCA in a pipeline with Support Vector Machine, and here we will look at a probabilistic perspective of PCA to provide a more robust and comprehensive understanding of the underlying data structure. One of the biggest advantages of Probabilistic PCA (PPCA) is that it can handle missing values in a dataset, which is not possible with classical PCA. Since we will discuss the Latent Variable Model and the Expectation-Maximization algorithm, you can also check this detailed post.

What can you expect to learn from this post?

  1. Short Intro to PCA.
  2. Mathematical building blocks for PPCA.
  3. Expectation-Maximization (EM) algorithm or Variational Inference: which one to use for parameter estimation?
  4. Implementing PPCA with TensorFlow Probability for a toy dataset.

Let’s dive into this!

1. Singular Value Decomposition (SVD) and PCA:

One of the most important concepts in Linear Algebra is SVD. It is a factorization technique for real or complex matrices, where a matrix (say A) can be factorized as:

A = U Σ Vᵀ

where U, V are orthogonal matrices (transpose equals the inverse) and Σ is a diagonal matrix. A need not be a square matrix; say it's an N×D matrix, so we can already think of it as our data matrix with N instances and D features. U, V are square matrices (N×N and D×D respectively), and Σ will then be an N×D matrix whose D×D upper block is diagonal and whose remaining entries are zero.
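As a quick sanity check, here is a minimal sketch (assuming NumPy and a small random matrix as stand-in data, both chosen for illustration) showing that the full SVD returns factors with exactly these shapes and that they reconstruct A:

```python
import numpy as np

# Hypothetical toy data matrix: N = 6 instances, D = 3 features
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 3))

# Full SVD: U is N x N, Vt is D x D, and s holds the D singular values
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the N x D Sigma with the singular values on its D x D diagonal block
Sigma = np.zeros_like(A)
Sigma[:A.shape[1], :A.shape[1]] = np.diag(s)

# A is recovered (up to floating-point error) as U @ Sigma @ Vt
print(np.allclose(A, U @ Sigma @ Vt))  # True
```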

We also know eigenvalue decomposition: a square matrix (B) that is diagonalizable can be factorized as:

B = Q Λ Q⁻¹

where Q is the square N×N matrix whose ith column is the eigenvector q_i of B, and Λ is the diagonal matrix whose diagonal elements are the corresponding eigenvalues.
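A similarly hedged sketch, again assuming NumPy and a small hand-picked symmetric matrix B (chosen only so the eigenvectors come out real), verifies this decomposition numerically:

```python
import numpy as np

# A diagonalizable square matrix (symmetric, so eigenvalues/eigenvectors are real)
B = np.array([[2.0, 1.0],
              [1.0, 3.0]])

# Columns of Q are the eigenvectors q_i; Lam holds the eigenvalues on its diagonal
eigvals, Q = np.linalg.eig(B)
Lam = np.diag(eigvals)

# B is recovered as Q @ Lam @ inv(Q)
print(np.allclose(B, Q @ Lam @ np.linalg.inv(Q)))  # True
```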


