What this post covers
This post introduces SVD and dimensionality reduction.
- How SVD factorizes a matrix
- What singular values measure
- Why low-rank approximation works
- How SVD connects to compression, recommendation systems, and embeddings
Key terms
- SVD: a factorization of a matrix into orthonormal directions and scaling
- singular value: the strength of stretching along a special direction
- low-rank approximation: approximating a matrix with fewer important directions
- dimensionality reduction: keeping the important structure while reducing coordinates
Core idea
SVD factorizes a matrix as
A = U Σ V^T
with shapes
A (m x n) = U (m x m) Σ (m x n) V^T (n x n)
The big idea is simple: SVD reveals which directions a matrix stretches most, which directions it barely changes, and how to describe data using fewer important coordinates.
A useful way to read this is:
- V^T gives orthonormal directions in the input space (perpendicular unit vectors forming a clean coordinate system)
- Σ tells us how much each direction is stretched
- U gives the corresponding orthonormal directions in the output space
So SVD tells us which directions matter most and how strongly the matrix acts on them.
That is why SVD naturally leads to dimensionality reduction. If some singular values are tiny, then the corresponding directions contribute very little, so we can often discard them and keep a shorter coordinate description.
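To make the factorization concrete, here is a minimal NumPy sketch. The matrix values are arbitrary, chosen only for illustration:

```python
import numpy as np

# A small illustrative matrix (values chosen arbitrarily for this sketch).
A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 0.1]])

# Compact SVD: A = U @ diag(s) @ Vt, with s sorted largest first.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Multiplying the factors back together recovers A.
A_rebuilt = U @ np.diag(s) @ Vt
print(np.allclose(A, A_rebuilt))  # True
print(s)  # singular values, in decreasing order
```

The ordering of s matters: because NumPy returns singular values largest first, truncating to the first k entries keeps exactly the most important directions.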
Step-by-step examples
Example 1) Image compression
If an image is stored as a matrix, then keeping only the largest singular values can preserve the main visual structure while reducing storage. For example, if an image effectively uses only the top 20 singular values out of hundreds, then we can store a much smaller approximation while still keeping the main shapes and contrast patterns.
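A sketch of this idea, using a synthetic low-rank matrix in place of a real image (the helper name rank_k_approx is just for illustration):

```python
import numpy as np

def rank_k_approx(A, k):
    """Keep only the k largest singular values of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# A synthetic "image": strong rank-1 structure plus small noise.
rng = np.random.default_rng(0)
base = np.outer(np.linspace(0, 1, 100), np.linspace(1, 2, 80))
img = base + 0.01 * rng.standard_normal((100, 80))

approx = rank_k_approx(img, 5)
# Relative error of the rank-5 approximation (small, since img is near rank 1).
err = np.linalg.norm(img - approx) / np.linalg.norm(img)
print(err)
```

Storing U[:, :k], s[:k], and Vt[:k, :] takes k(m + n + 1) numbers instead of m times n, which is where the compression comes from.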
Example 2) Recommendation systems
A user-item matrix can be huge and sparse, but often most of its structure can be approximated by a lower-rank representation. That is why low-rank methods are useful in recommendation systems.
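As a rough illustration, here is a tiny hypothetical rating matrix compressed to two coordinates per user and per item (the numbers are made up):

```python
import numpy as np

# A tiny hypothetical user-item rating matrix (rows: users, cols: items).
R = np.array([[5, 4, 0, 1],
              [4, 5, 1, 0],
              [1, 0, 5, 4],
              [0, 1, 4, 5]], dtype=float)

U, s, Vt = np.linalg.svd(R, full_matrices=False)

# Keep the top-2 directions: each user and item gets a 2-number description.
k = 2
user_factors = U[:, :k] * s[:k]   # each row: a user's coordinates
item_factors = Vt[:k, :].T        # each row: an item's coordinates
R_approx = user_factors @ item_factors.T
print(np.round(R_approx, 1))
```

The reconstruction error is exactly the squared magnitude of the discarded singular values, so the approximation is good whenever the tail of s is small.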
Example 3) Embeddings and document matrices
High-dimensional text or embedding matrices often contain structure that can be summarized in fewer directions. SVD helps compress while still preserving important relationships.
Math notes
- Every real or complex matrix has an SVD, including rectangular matrices.
- The singular values are the nonnegative square roots of the eigenvalues of A^T A.
- If v_i is a right singular vector, then A v_i = σ_i u_i for the corresponding left singular vector u_i.
- Truncated SVD gives the optimal rank-k approximation in Frobenius norm.
- PCA can be viewed through SVD on centered data.
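These notes can be checked numerically. The sketch below uses a random rectangular matrix, which also shows that SVD needs no squareness assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))  # rectangular: the SVD still exists

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Check A v_i = sigma_i u_i for every singular triple.
for i in range(len(s)):
    v_i = Vt[i]      # i-th right singular vector (row of V^T)
    u_i = U[:, i]    # i-th left singular vector (column of U)
    print(np.allclose(A @ v_i, s[i] * u_i))  # True each time

# Singular values are the square roots of the eigenvalues of A^T A.
eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]  # sorted descending
print(np.allclose(s**2, eigvals))  # True
```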
Common mistakes
Thinking SVD is just a library call
It is also a way to read matrix structure.
Confusing eigendecomposition with SVD
They are related, but SVD is more general and works for rectangular matrices too.
Thinking small singular values are always useless
Whether a direction matters depends on the goal. Sometimes subtle structure matters.
Practice or extension
- Why can keeping only a few large singular values still preserve a lot of structure?
- What advantage does low-rank approximation give in recommendation systems?
- If the singular values are [10, 5, 1, 0.1], how much of the total squared magnitude is kept by the top two?
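To check the arithmetic in the last question, a quick computation:

```python
s = [10, 5, 1, 0.1]
total = sum(x**2 for x in s)     # 100 + 25 + 1 + 0.01 = 126.01
top2 = sum(x**2 for x in s[:2])  # 100 + 25 = 125
print(top2 / total)              # about 0.992, so over 99% is kept
```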
Wrap-up
This post introduced SVD and dimensionality reduction.
- SVD breaks a matrix into meaningful directions and scaling.
- Large singular values point to important structure.
- Low-rank approximation helps with compression and representation.
- This completes the main arc of the linear algebra series.