What this post covers
This post introduces the matrix as a transformation, not just a table of numbers.
- Why the “rectangular array” definition is incomplete
- How rows and columns carry structural meaning
- Why a matrix can be read as a rule that sends one vector to another
- How this viewpoint connects to graphics, neural networks, and data transformation
Key terms
- matrix: a coordinate representation of a transformation rule
- vector: the basic object a matrix takes as input and produces as output
- linear transformation: a rule that preserves vector addition and scalar multiplication
Core idea
When people first meet matrices, they often remember them as “rectangular tables of numbers.” That description is not wrong, but it is too weak. If you stop there, it becomes hard to see why matrix multiplication has the shape it does, or why matrices sit at the center of linear algebra.
A tiny example makes the real role clearer. If
A = [2 0
0 3]
then the vector (1, 1) is sent to (2, 3). The matrix is not just storing numbers. It is describing a rule that changes vectors.
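A minimal NumPy sketch of this tiny example (assuming NumPy is installed):

```python
import numpy as np

# The diagonal matrix from the example above
A = np.array([[2, 0],
              [0, 3]])

v = np.array([1, 1])
print(A @ v)  # [2 3]: the rule sends (1, 1) to (2, 3)
```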
For programming, a more useful sentence is this:
matrix = a coordinate representation of a rule that sends vectors to vectors
More specifically, a matrix represents a linear transformation. A transformation is just a rule or function that takes an input and produces an output. In this case, the inputs and outputs are vectors, and the rule preserves addition and scalar multiplication. In finite-dimensional linear algebra, once we choose bases for both the input space and the output space, that rule can be written as one matrix.
So a matrix is not only storage. It is also a machine for changing vectors.
That is why the same mathematical object appears in so many places:
- rotating or scaling points in graphics
- mapping input features into a new feature space in a neural network
- expressing a coordinate change or data transformation
Why columns matter so much
Suppose a matrix A acts on an input vector x. The expression Ax can look like a purely mechanical rule. But to really understand it, it helps to look at the columns of A.
For a 2 x 2 matrix, the first column is the result of applying A to the first standard basis vector e1 = (1, 0), and the second column is the result of applying A to e2 = (0, 1). In other words, the first column is Ae1 and the second column is Ae2.
So each column tells you where one basic input axis goes. If you change a column, you are changing where one basis direction is sent.
Once you know where the basic axes go, you can understand where any vector goes, because every vector is built from those basis directions.
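You can check the column reading directly in NumPy; this sketch reuses the scaling matrix from earlier:

```python
import numpy as np

A = np.array([[2, 0],
              [0, 3]])
e1 = np.array([1, 0])
e2 = np.array([0, 1])

# Applying A to each standard basis vector recovers the columns of A
print(A @ e1)  # [2 0], the first column
print(A @ e2)  # [0 3], the second column
print(A[:, 0], A[:, 1])
```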
Coordinates, bases, and representation
A matrix is not the transformation itself in some abstract sense. More precisely, it is the representation of a linear transformation relative to chosen bases for the domain and codomain.
That means the same transformation can have a different matrix if you change the basis in either space.
At the beginner stage, it is perfectly fine to remember “a matrix is the representation of a linear transformation.” But later, when you study diagonalization or eigenvectors, this distinction becomes very important.
Step-by-step examples
Example 1) A 2D scaling transformation
Consider
A = [2 0
0 3]
This stretches the x direction by 2 and the y direction by 3.
If we apply it to the standard basis vectors,
Ae1 = (2, 0)
Ae2 = (0, 3)
So the first column tells us where e1 goes, and the second column tells us where e2 goes.
Now (1, 1) = 1e1 + 1e2, so
A(1, 1) = 1(2, 0) + 1(0, 3) = (2, 3)
This is the bridge from “column meanings” to “matrix-vector multiplication.”
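The bridge can be verified numerically: build the output from the columns by hand and compare it with the built-in matrix-vector product.

```python
import numpy as np

A = np.array([[2, 0],
              [0, 3]])
x = np.array([1, 1])

# x = 1*e1 + 1*e2, so Ax = 1*(first column) + 1*(second column)
by_columns = 1 * A[:, 0] + 1 * A[:, 1]
print(by_columns)  # [2 3]
print(A @ x)       # [2 3], the same result
```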
Example 2) A neural-network linear layer
Suppose the input has 3 features and the output has 2 features. Then the weight matrix can be 2 x 3: 2 rows and 3 columns.
That means the matrix maps a 3-dimensional input vector to a 2-dimensional output vector. In the column-vector convention, a 2 x 3 matrix takes vectors in R^3 and produces vectors in R^2.
In code you often see
y = W @ x + b
The matrix W is the linear part that changes feature space. Once the bias b is added, the full map is no longer a purely linear transformation; it is an affine transformation. The linear-algebra core still lives inside W.
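A small sketch of such a layer, with random weights standing in for trained ones (the shapes are the point, not the values):

```python
import numpy as np

rng = np.random.default_rng(0)

W = rng.normal(size=(2, 3))  # 2 output features, 3 input features
b = rng.normal(size=2)       # bias: the affine part
x = rng.normal(size=3)       # one input vector

y = W @ x + b
print(y.shape)  # (2,): 3 features in, 2 features out
```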
Example 3) Graphics and geometric change
In computer graphics, matrices are used to rotate, scale, reflect, and otherwise transform points.
The important idea is that a matrix does not just change one point. It changes the whole space according to one consistent rule.
So a matrix is better read as a space-level rule than as a pile of numbers.
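As one concrete geometric rule, here is the standard 2D rotation matrix applied to a point (a sketch; any point would follow the same rule):

```python
import numpy as np

theta = np.pi / 2  # rotate by 90 degrees
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

p = np.array([1.0, 0.0])
print(R @ p)  # approximately [0, 1]: the point on the x-axis moves to the y-axis
```

The same matrix R rotates every point in the plane, which is exactly the "space-level rule" reading.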
Math notes
- A matrix is the coordinate representation of a linear transformation.
- Once bases for both the domain and codomain are fixed, a linear transformation can be written as one matrix.
- In this post we use the column-vector convention.
- An m x n matrix has m rows and n columns, so it maps an n-dimensional input vector to an m-dimensional output vector. A good mnemonic is: columns = input size, rows = output size.
That dimension-reading habit matters in both mathematics and code.
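The dimension-reading habit shows up directly in code: NumPy enforces it and raises an error when the input size does not match the number of columns.

```python
import numpy as np

A = np.zeros((2, 3))  # m = 2 rows, n = 3 columns
x = np.ones(3)        # input must be n-dimensional

y = A @ x
print(y.shape)        # (2,): rows = output size

try:
    A @ np.ones(2)    # wrong input dimension
except ValueError:
    print("shape mismatch")
```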
Common mistakes
Treating a matrix as just table-shaped data
Some datasets can be stored in matrix form, but in linear algebra a matrix is also an operator. You should separate storage shape from mathematical role.
Treating rows and columns as mere layout details
Rows and columns are not only formatting. Each column tells you where one input-space basis direction goes, and each row tells you how one output coordinate is built from the input.
Calling y = Wx + b fully linear
In machine-learning practice, people often say “linear layer,” but strictly speaking the + b part makes it affine rather than purely linear.
Thinking matrix intuition only works in 2D
2D and 3D examples are easier to picture, but the exact same ideas extend to high-dimensional feature spaces.
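You can check this numerically: the defining linearity properties hold in any dimension. Here is a sketch with an arbitrary 50 x 100 matrix of random values:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(50, 100))  # a map from R^100 to R^50
u = rng.normal(size=100)
v = rng.normal(size=100)
c = 3.0

# Linearity is not a 2D artifact; it holds in 100 dimensions too
assert np.allclose(A @ (u + v), A @ u + A @ v)
assert np.allclose(A @ (c * u), c * (A @ u))
print("linearity checks pass in 100 dimensions")
```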
Practice or extension
- What input dimension and output dimension does a 2 x 3 matrix represent?
- What does a diagonal matrix do to each axis?
- Why does each column describe the movement of one basic input direction?
- In y = Wx + b, which part carries the linear structure?
If you want a quick coding exercise, try a small matrix and vector in NumPy and check how the input changes under the transformation.
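For example, the following sketch answers the diagonal-matrix question from the practice list (the matrix and vector are arbitrary choices):

```python
import numpy as np

D = np.array([[2.0, 0.0],
              [0.0, 0.5]])  # diagonal: stretch x by 2, shrink y by half
x = np.array([3.0, 4.0])

print(D @ x)  # [6. 2.]: each axis is scaled independently
```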
Wrap-up
This post reframed the matrix as a transformation.
- A matrix is not only a table of numbers.
- It represents a rule that sends vectors to vectors.
- Each column shows where one basic axis goes.
- This viewpoint powers graphics, neural networks, and data transformations.
In the next post, we will study what the product Ax actually does and read it through the column-vector viewpoint.