Matrix Properties and Transformations: From Theory to Machine Learning Practice
Learn matrix rank, determinants, linear transformations, and how to solve linear systems with Python. Understand how these concepts power ML algorithms with practical examples.

When Matrix Properties Actually Mattered
I was training a neural network and kept getting this error: "Singular matrix." My code crashed, and I had no idea why.
After some digging, I learned that not all matrices are "invertible"—some lack the properties needed for certain operations. That's when matrix properties like rank, determinant, and invertibility stopped being abstract concepts and became debugging tools!
In this post, we'll explore the properties that make matrices useful (or problematic) in machine learning, and you'll learn to spot issues before they crash your code.
Matrix Rank: How Much Information?
What is Rank?
The rank tells you how many linearly independent rows (or columns) a matrix has. Think of it as measuring "how much unique information" the matrix contains.
- Full rank: All rows/columns are independent (maximum information)
- Rank-deficient: Some rows are combinations of others (redundant information)
import numpy as np
# Full rank matrix (2×2)
A = np.array([
[1, 2],
[3, 4]
])
rank_A = np.linalg.matrix_rank(A)
print(f"Matrix A:\n{A}")
print(f"Rank: {rank_A}")
print(f"Is full rank? {rank_A == min(A.shape)}")
# Output: Rank: 2, Is full rank? True
Why Rank Matters in ML
Problem: What if one feature is just a copy of another?
# Rank-deficient matrix
B = np.array([
[1, 2, 3],
[2, 4, 6], # This is 2× the first row!
[4, 5, 6]
])
rank_B = np.linalg.matrix_rank(B)
print(f"Matrix B:\n{B}")
print(f"Rank: {rank_B} (should be 3 for full rank)")
print(f"Is full rank? {rank_B == min(B.shape)}")
# Output: Rank: 2, Is full rank? False
ML Impact:
- Duplicate features reduce rank (waste of computation!)
- Linearly dependent features cause numerical instability
- Low rank might indicate you can reduce dimensions (hello, PCA!)
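To make the first point concrete, here's a minimal sketch (the feature values are made up for illustration) that uses a rank check to flag a duplicated feature column before training:
import numpy as np
# Hypothetical feature matrix: the third column is an exact copy of the first
X = np.array([
    [1.0, 10.0, 1.0],
    [2.0, 12.0, 2.0],
    [3.0, 15.0, 3.0],
    [4.0, 11.0, 4.0]
])
rank = np.linalg.matrix_rank(X)
n_features = X.shape[1]
if rank < n_features:
    print(f"Warning: rank {rank} < {n_features} features -- some columns are redundant")
# Output: Warning: rank 2 < 3 features -- some columns are redundant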
Determinants: The "Invertibility Test"
What is a Determinant?
The determinant is a single number computed from a square matrix. It tells you:
- det ≠ 0: Matrix is invertible (good!)
- det = 0: Matrix is singular/not invertible (problem!)
# Calculate determinant
A = np.array([[3, 8], [4, 6]])
det_A = np.linalg.det(A)
print(f"Matrix A:\n{A}")
print(f"Determinant: {det_A:.2f}")
print(f"Is invertible? {det_A != 0}")
# For 2×2 matrix: det = ad - bc
# Manual: 3*6 - 8*4 = 18 - 32 = -14 ✓
Determinant = 0: The Red Flag
# Singular matrix (determinant = 0)
singular = np.array([
[1, 2],
[2, 4] # Second row is 2× first row
])
det_singular = np.linalg.det(singular)
print(f"Singular matrix:\n{singular}")
print(f"Determinant: {det_singular:.10f}")
print("Cannot be inverted! Will cause errors in algorithms.")
# Trying to invert a singular matrix raises LinAlgError
try:
    inv = np.linalg.inv(singular)
except np.linalg.LinAlgError as e:
    print(f"Error: {e}")
# Output: Error: Singular matrix
ML Warning Signs:
- Training fails with "singular matrix" errors
- Numerical instability during optimization
- Features are perfectly correlated
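One practical guard against that last warning sign is to check the condition number before relying on an inverse. This is only a sketch: the threshold below is an arbitrary illustrative value, not a standard cutoff.
import numpy as np
# Nearly singular: the second row is almost exactly 2x the first
A = np.array([
    [1.0, 2.0],
    [2.0, 4.0000001]
])
cond = np.linalg.cond(A)
print(f"Condition number: {cond:.2e}")
if cond > 1e6:  # arbitrary threshold for illustration
    print("Matrix is ill-conditioned -- inverting it will amplify noise")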
The Trace: Sum of Diagonal Elements
The trace is simply the sum of diagonal elements. It's used in many ML algorithms!
A = np.array([
[5, 2, 1],
[3, 7, 4],
[6, 8, 9]
])
trace = np.trace(A)
print(f"Matrix:\n{A}")
print(f"Diagonal elements: {np.diag(A)}")
print(f"Trace (sum): {trace}")
# Calculation: 5 + 7 + 9 = 21
Where You'll See Trace:
- Computing variance in PCA
- Calculating model complexity
- Optimization algorithms
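As a quick taste of the first bullet: in PCA, the total variance of a dataset equals the trace of its covariance matrix, which also equals the sum of its eigenvalues. A small sketch with made-up data:
import numpy as np
# Total variance = trace of the covariance matrix = sum of its eigenvalues
data = np.random.randn(100, 3)         # made-up dataset: 100 samples, 3 features
cov = np.cov(data.T)
total_variance = np.trace(cov)
eigenvalues = np.linalg.eigvalsh(cov)  # eigenvalues of a symmetric matrix
print(f"Trace of covariance: {total_variance:.4f}")
print(f"Sum of eigenvalues:  {eigenvalues.sum():.4f}")  # matches the trace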
Matrix Inverse: Solving for X
What is Matrix Inverse?
The inverse of matrix A (written A⁻¹) satisfies: A × A⁻¹ = I
It's like "undoing" a transformation. If A transforms vectors, A⁻¹ transforms them back!
# Original matrix
A = np.array([[4, 7], [2, 6]])
print(f"Matrix A:\n{A}")
# Calculate inverse
A_inv = np.linalg.inv(A)
print(f"\nInverse A⁻¹:\n{A_inv}")
# Verify: A @ A⁻¹ = I
identity = A @ A_inv
print(f"\nA @ A⁻¹ (should be identity):\n{np.round(identity, 10)}")
# Output: approximately [[1, 0], [0, 1]]
Using Inverse to Solve Equations
If you have Ax = b, then x = A⁻¹b
# System: 3x + 2y = 7, 2x + 5y = 12
A = np.array([[3, 2], [2, 5]])
b = np.array([7, 12])
# Solve using inverse
x = np.linalg.inv(A) @ b
print(f"Solution: x = {x[0]:.2f}, y = {x[1]:.2f}")
# Verify: A @ x should equal b
verification = A @ x
print(f"Check A @ x = {verification} (should be {b})")
Better way: Use np.linalg.solve() instead of computing the inverse explicitly; it's faster and more numerically stable!
# Same problem, better solution
x = np.linalg.solve(A, b)
print(f"Solution using solve(): x = {x[0]:.2f}, y = {x[1]:.2f}")
Solving Systems of Linear Equations
Real ML Problem: Linear Regression
Linear regression solves a system of equations (usually an overdetermined one, in the least-squares sense) to find the best-fit line!
# Simple linear regression: y = mx + c
# Given 3 data points, find m and c
# Data points: (1, 3), (2, 5), (3, 7)
# System: c + 1m = 3
# c + 2m = 5
# c + 3m = 7
# Coefficient matrix A
A = np.array([
[1, 1], # [c, m] for point 1
[1, 2], # [c, m] for point 2
[1, 3] # [c, m] for point 3
])
# Observed values b
b = np.array([3, 5, 7])
# Solve using least squares (for overdetermined systems)
params, residuals, rank, s = np.linalg.lstsq(A, b, rcond=None)
c, m = params
print(f"Best fit line: y = {m:.2f}x + {c:.2f}")
print(f"Intercept (c): {c:.2f}")
print(f"Slope (m): {m:.2f}")
# Predictions
x_test = np.array([1, 2, 3])
y_pred = m * x_test + c
print(f"\nPredictions: {y_pred}")
print(f"Actual: {b}")
The Normal Equation: ML's Secret Weapon
For linear regression, there's a closed-form solution: θ = (XᵀX)⁻¹Xᵀy
# Housing price prediction
# Features: size (sq ft)
X = np.array([[1, 1000], [1, 1500], [1, 2000], [1, 1200]]) # Add bias column
y = np.array([200000, 250000, 300000, 220000]) # Prices
# Normal equation
XtX = X.T @ X
Xty = X.T @ y
theta = np.linalg.inv(XtX) @ Xty
print(f"Parameters: intercept = {theta[0]:.0f}, coef = {theta[1]:.2f}")
# Make prediction for 1800 sq ft house
prediction = np.array([1, 1800]) @ theta
print(f"Predicted price for 1800 sq ft: ${prediction:,.0f}")
This is essentially what scikit-learn's LinearRegression computes, though under the hood it uses a more numerically stable least-squares solver instead of an explicit inverse.
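You can sanity-check the normal equation against NumPy's own least-squares routine, continuing from the block above; both should return the same parameters.
# Cross-check: a least-squares solver gives the same parameters
theta_lstsq, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
print(f"Normal equation: {theta}")
print(f"np.linalg.lstsq: {theta_lstsq}")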
Linear Transformations: How Matrices Change Space
What is a Transformation?
A transformation takes a vector and changes it. Matrices represent transformations!
Key idea: Multiplying by a matrix = applying a transformation
# Scaling transformation: makes vectors bigger/smaller
scale_matrix = np.array([
[2, 0], # Scale x by 2
[0, 3] # Scale y by 3
])
original = np.array([1, 1])
transformed = scale_matrix @ original
print(f"Original vector: {original}")
print(f"After scaling: {transformed}") # Output: [2, 3]
Common Transformations in ML
1. Rotation: Changing Direction
import numpy as np
# Rotate 90° counterclockwise
# Formula: [[cos θ, -sin θ], [sin θ, cos θ]]
theta = np.pi / 2 # 90 degrees
rotation = np.array([
[np.cos(theta), -np.sin(theta)],
[np.sin(theta), np.cos(theta)]
])
vector = np.array([1, 0]) # Point along x-axis
rotated = rotation @ vector
print(f"Original: {vector}")
print(f"After 90° rotation: {np.round(rotated)}") # Output: [0, 1]
ML Use: Data augmentation (rotating images), coordinate transformations
2. Reflection: Mirroring
# Reflect across y-axis (flip horizontally)
reflection = np.array([
[-1, 0], # Negate x
[0, 1] # Keep y
])
point = np.array([3, 2])
reflected = reflection @ point
print(f"Original: {point}")
print(f"Reflected: {reflected}") # Output: [-3, 2]
ML Use: Flipping images for data augmentation
3. Shearing: Skewing
# Shear transformation
shear = np.array([
[1, 0.5], # x' = x + 0.5y
[0, 1] # y' = y
])
square = np.array([[0, 1, 1, 0], [0, 0, 1, 1]]) # Square corners
sheared = shear @ square
print("Original square corners:")
print(square)
print("\nSheared:")
print(sheared)
Orthogonality: When Vectors are Independent
What Does Orthogonal Mean?
Two vectors are orthogonal (perpendicular) if their dot product is zero: neither has any component in the other's direction.
# Perpendicular vectors
v1 = np.array([1, 0, 0]) # x-axis
v2 = np.array([0, 1, 0]) # y-axis
dot_product = np.dot(v1, v2)
print(f"v1: {v1}")
print(f"v2: {v2}")
print(f"Dot product: {dot_product}")
print(f"Are orthogonal? {dot_product == 0}")
Orthogonal Projection: Finding Components
Project vector b onto vector a to find "how much of b goes in direction of a"
Formula: proj_a(b) = (a·b / a·a) × a
# Project b onto a
a = np.array([3, 4])
b = np.array([5, 2])
# Calculate projection
dot_ab = np.dot(a, b)
dot_aa = np.dot(a, a)
projection = (dot_ab / dot_aa) * a
print(f"Vector a: {a}")
print(f"Vector b: {b}")
print(f"Projection of b onto a: {projection}")
# The perpendicular component
perpendicular = b - projection
print(f"Perpendicular component: {perpendicular}")
# Verify: projection and perpendicular should be orthogonal
verify = np.dot(projection, perpendicular)
print(f"Dot product (should be ~0): {verify:.10f}")
ML Application:
- Feature decomposition in PCA
- Gram-Schmidt orthogonalization
- Creating independent features
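To give the Gram-Schmidt bullet above some substance, here's a minimal two-vector sketch: subtracting the projection leaves a component that is orthogonal to the first vector (same numbers as the projection example above).
import numpy as np
# Minimal Gram-Schmidt step: remove the u1-component from v2
u1 = np.array([3.0, 4.0])
v2 = np.array([5.0, 2.0])
u2 = v2 - (np.dot(u1, v2) / np.dot(u1, u1)) * u1
print(f"u1: {u1}")
print(f"u2: {u2}")
print(f"u1 . u2 (should be ~0): {np.dot(u1, u2):.10f}")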
Vector Spaces: The Big Picture
What is a Vector Space?
A vector space is a collection of vectors where you can add vectors and multiply by scalars, staying within the space.
Key concepts:
- Span: All possible linear combinations of vectors
- Basis: Minimum set of independent vectors that span the space
- Dimension: Number of vectors in a basis
# Standard basis for R³
e1 = np.array([1, 0, 0])
e2 = np.array([0, 1, 0])
e3 = np.array([0, 0, 1])
# Any vector in R³ can be written as combination
target = np.array([5, 3, 7])
print(f"Target vector: {target}")
print(f"As combination: {target[0]}*e1 + {target[1]}*e2 + {target[2]}*e3")
# Verify
combination = target[0]*e1 + target[1]*e2 + target[2]*e3
print(f"Reconstruction: {combination}")
Linear Independence: No Redundancy
Vectors are linearly independent if none can be written as a combination of others.
import sympy as sp
# Check if vectors are independent
v1 = sp.Matrix([1, 2, 3])
v2 = sp.Matrix([4, 5, 6])
v3 = sp.Matrix([7, 8, 9])
# Put as columns in a matrix
M = sp.Matrix([[1, 4, 7], [2, 5, 8], [3, 6, 9]])
rank = M.rank()
print(f"Rank: {rank}")
print(f"Number of vectors: 3")
print(f"Are linearly independent? {rank == 3}")
# They're not independent because v3 = 2*v2 - v1
verification = 2*v2 - v1
print(f"\n2*v2 - v1 = {verification.T}")
print(f"v3 = {v3.T}")
print(f"Equal? {verification == v3}")
ML Insight: If features are linearly dependent, you have redundant information. Remove them for efficiency!
Practical Example: Data Preprocessing
Let's combine these concepts to preprocess a dataset!
# Raw dataset: 5 samples, 3 features
data = np.array([
[100, 2.5, 30],
[150, 3.0, 45],
[120, 2.8, 35],
[180, 3.5, 50],
[110, 2.6, 32]
])
print("Original data:")
print(data)
print(f"Shape: {data.shape}")
# Step 1: Check rank (are features independent?)
rank = np.linalg.matrix_rank(data)
print(f"\nRank: {rank}")
print(f"Full rank? {rank == min(data.shape)}")
# Step 2: Center the data (subtract mean)
mean = np.mean(data, axis=0)
centered = data - mean
print(f"\nCentered data (first 2 rows):")
print(centered[:2])
# Step 3: Compute covariance matrix
cov_matrix = np.cov(centered.T)
print(f"\nCovariance matrix:")
print(cov_matrix)
print(f"Determinant: {np.linalg.det(cov_matrix):.2f}")
# Step 4: Check how strongly features co-vary (off-diagonal entries)
print(f"\nCovariance between features 0 and 1: {cov_matrix[0,1]:.2f}")
Common Pitfalls and How to Avoid Them
Pitfall 1: Ignoring Singular Matrices
# This matrix looks fine but is singular!
A = np.array([
[1, 2, 3],
[2, 4, 6],
[1, 1, 2]
])
det = np.linalg.det(A)
print(f"Determinant: {det:.10f}")
if abs(det) < 1e-10:
    print("Warning: Matrix is (nearly) singular!")
    print("Check for duplicate or linearly dependent rows/columns")
Pitfall 2: Not Checking Dimensions
# Always verify dimensions before operations
A = np.random.randn(5, 3)
B = np.random.randn(4, 5)
print(f"A shape: {A.shape}")
print(f"B shape: {B.shape}")
# Can we multiply A @ B?
if A.shape[1] == B.shape[0]:
    result = A @ B
    print(f"Result shape: {result.shape}")
else:
    print("Cannot multiply! Inner dimensions don't match")
Pitfall 3: Numerical Instability
# Small determinant = numerical problems
A = np.array([[1, 1], [1, 1.0001]])
det = np.linalg.det(A)
print(f"Determinant: {det}")
if abs(det) < 1e-3:  # det is about 1e-4 here, so this check fires
    print("Warning: Nearly singular! Results may be unstable")
    print("Consider regularization or feature engineering")
Quick Reference: Matrix Properties
| Property | Formula/Code | Meaning | ML Use |
|---|---|---|---|
| Rank | np.linalg.matrix_rank(A) | # independent rows | Detect redundancy |
| Determinant | np.linalg.det(A) | Invertibility test | Check for singularity |
| Trace | np.trace(A) | Sum of diagonal | Variance, complexity |
| Inverse | np.linalg.inv(A) | Undo transformation | Solve equations |
| Solve | np.linalg.solve(A, b) | Find x in Ax=b | Linear regression |
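To put the table to work, here is a small hypothetical helper you could adapt for your own projects; the tolerance is illustrative, not a standard value.
import numpy as np

def matrix_health_check(A, tol=1e-10):
    """Quick diagnostic report for a 2-D NumPy array (illustrative helper)."""
    rank = np.linalg.matrix_rank(A)
    report = {"shape": A.shape, "rank": rank, "full_rank": rank == min(A.shape)}
    if A.shape[0] == A.shape[1]:  # determinant is only defined for square matrices
        det = np.linalg.det(A)
        report["determinant"] = det
        report["near_singular"] = abs(det) < tol
    return report

print(matrix_health_check(np.array([[1, 2], [2, 4]])))
# Expect: rank 1, full_rank False, near_singular True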
What's Coming Next
You now understand matrix properties and transformations! In the final post, we'll cover:
- Eigenvalues and Eigenvectors: The "DNA" of matrices
- Matrix Decomposition: SVD and its power
- PCA: Dimensionality reduction in action
- Real ML Applications: Seeing it all come together
These advanced topics build directly on what you've learned here!
Matrix properties aren't just theory—they're debugging tools, performance indicators, and keys to understanding when algorithms will work (or fail). Master these, and you'll write better ML code! If this guide helped you understand matrix properties and transformations, connect with me on Twitter or LinkedIn.
Support My Work
If this guide helped you understand matrix properties, transformations, and how they apply to machine learning, I'd really appreciate your support! Creating comprehensive, free educational content takes significant time and effort. Your support helps me continue sharing knowledge and creating more helpful tutorials for students like you.
☕ Buy me a coffee - Every contribution, big or small, means the world to me and keeps me motivated to create more content!
Cover image by Vlado Paunovic on Unsplash