Matrix Properties and Transformations: From Theory to Machine Learning Practice
Learn matrix rank, determinants, linear transformations, and how to solve linear systems with Python. Understand how these concepts power ML algorithms with practical examples.

When Matrix Properties Actually Mattered
I was training a neural network and kept getting this error: "Singular matrix." My code crashed, and I had no idea why.
After some digging, I learned that not all matrices are "invertible"—some lack the properties needed for certain operations. That's when matrix properties like rank, determinant, and invertibility stopped being abstract concepts and became debugging tools!
In this post, we'll explore the properties that make matrices useful (or problematic) in machine learning, and you'll learn to spot issues before they crash your code.
Matrix Rank: How Much Information?
What is Rank?
The rank tells you how many linearly independent rows (or columns) a matrix has. Think of it as measuring "how much unique information" the matrix contains.
- Full rank: All rows/columns are independent (maximum information)
- Rank-deficient: Some rows are combinations of others (redundant information)
import numpy as np
# Full rank matrix (2×2)
A = np.array([
[1, 2],
[3, 4]
])
rank_A = np.linalg.matrix_rank(A)
print(f"Matrix A:\n{A}")
print(f"Rank: {rank_A}")
print(f"Is full rank? {rank_A == min(A.shape)}")
# Output: Rank: 2, Is full rank? True
Why Rank Matters in ML
Problem: What if one feature is just a copy of another?
# Rank-deficient matrix
B = np.array([
[1, 2, 3],
[2, 4, 6], # This is 2× the first row!
[4, 5, 6]
])
rank_B = np.linalg.matrix_rank(B)
print(f"Matrix B:\n{B}")
print(f"Rank: {rank_B} (should be 3 for full rank)")
print(f"Is full rank? {rank_B == min(B.shape)}")
# Output: Rank: 2, Is full rank? False
ML Impact:
- Duplicate features reduce rank (waste of computation!)
- Linearly dependent features cause numerical instability
- Low rank might indicate you can reduce dimensions (hello, PCA!)
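To make the first point concrete, here's a minimal sketch (the feature values are made up for illustration) that uses a rank check to flag a duplicated feature column before training:
import numpy as np
# Hypothetical feature matrix: the third column is an exact copy of the first
X = np.array([
    [1.0, 10.0, 1.0],
    [2.0, 12.0, 2.0],
    [3.0, 15.0, 3.0],
    [4.0, 11.0, 4.0]
])
rank = np.linalg.matrix_rank(X)
n_features = X.shape[1]
if rank < n_features:
    print(f"Warning: rank {rank} < {n_features} features -- some columns are redundant")
# Output: Warning: rank 2 < 3 features -- some columns are redundant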
Determinants: The "Invertibility Test"
What is a Determinant?
The determinant is a single number computed from a square matrix. It tells you:
- det ≠ 0: Matrix is invertible (good!)
- det = 0: Matrix is singular/not invertible (problem!)
# Calculate determinant
A = np.array([[3, 8], [4, 6]])
det_A = np.linalg.det(A)
print(f"Matrix A:\n{A}")
print(f"Determinant: {det_A:.2f}")
print(f"Is invertible? {det_A != 0}")
# For 2×2 matrix: det = ad - bc
# Manual: 3*6 - 8*4 = 18 - 32 = -14 ✓
Determinant = 0: The Red Flag
# Singular matrix (determinant = 0)
singular = np.array([
[1, 2],
[2, 4] # Second row is 2× first row
])
det_singular = np.linalg.det(singular)
print(f"Singular matrix:\n{singular}")
print(f"Determinant: {det_singular:.10f}")
print("Cannot be inverted! Will cause errors in algorithms.")
# Trying to invert a singular matrix raises LinAlgError
try:
    inv = np.linalg.inv(singular)
except np.linalg.LinAlgError as e:
    print(f"Error: {e}")
# Output: Error: Singular matrix
ML Warning Signs:
- Training fails with "singular matrix" errors
- Numerical instability during optimization
- Features are perfectly correlated
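One practical guard against that last warning sign is to check the condition number before relying on an inverse. This is only a sketch: the threshold below is an arbitrary illustrative value, not a standard cutoff.
import numpy as np
# Nearly singular: the second row is almost exactly 2x the first
A = np.array([
    [1.0, 2.0],
    [2.0, 4.0000001]
])
cond = np.linalg.cond(A)
print(f"Condition number: {cond:.2e}")
if cond > 1e6:  # arbitrary threshold for illustration
    print("Matrix is ill-conditioned -- inverting it will amplify noise")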
The Trace: Sum of Diagonal Elements
The trace is simply the sum of diagonal elements. It's used in many ML algorithms!
A = np.array([
[5, 2, 1],
[3, 7, 4],
[6, 8, 9]
])
trace = np.trace(A)
print(f"Matrix:\n{A}")
print(f"Diagonal elements: {np.diag(A)}")
print(f"Trace (sum): {trace}")
# Calculation: 5 + 7 + 9 = 21
Where You'll See Trace:
- Computing variance in PCA
- Calculating model complexity
- Optimization algorithms
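As a quick taste of the first bullet: in PCA, the total variance of a dataset equals the trace of its covariance matrix, which also equals the sum of its eigenvalues. A small sketch with made-up data:
import numpy as np
# Total variance = trace of the covariance matrix = sum of its eigenvalues
data = np.random.randn(100, 3)         # made-up dataset: 100 samples, 3 features
cov = np.cov(data.T)
total_variance = np.trace(cov)
eigenvalues = np.linalg.eigvalsh(cov)  # eigenvalues of a symmetric matrix
print(f"Trace of covariance: {total_variance:.4f}")
print(f"Sum of eigenvalues:  {eigenvalues.sum():.4f}")  # matches the trace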
Matrix Inverse: Solving for X
What is Matrix Inverse?
The inverse of matrix A (written A⁻¹) satisfies: A × A⁻¹ = I
It's like "undoing" a transformation. If A transforms vectors, A⁻¹ transforms them back!
# Original matrix
A = np.array([[4, 7], [2, 6]])
print(f"Matrix A:\n{A}")
# Calculate inverse
A_inv = np.linalg.inv(A)
print(f"\nInverse A⁻¹:\n{A_inv}")
# Verify: A @ A⁻¹ = I
identity = A @ A_inv
print(f"\nA @ A⁻¹ (should be identity):\n{np.round(identity, 10)}")
# Output: approximately [[1, 0], [0, 1]]
Using Inverse to Solve Equations
If you have Ax = b, then x = A⁻¹b
# System: 3x + 2y = 7, 2x + 5y = 12
A = np.array([[3, 2], [2, 5]])
b = np.array([7, 12])
# Solve using inverse
x = np.linalg.inv(A) @ b
print(f"Solution: x = {x[0]:.2f}, y = {x[1]:.2f}")
# Verify: A @ x should equal b
verification = A @ x
print(f"Check A @ x = {verification} (should be {b})")
Better way: Use np.linalg.solve() instead of computing the inverse explicitly; it's faster and more numerically stable!
# Same problem, better solution
x = np.linalg.solve(A, b)
print(f"Solution using solve(): x = {x[0]:.2f}, y = {x[1]:.2f}")
Solving Systems of Linear Equations
Real ML Problem: Linear Regression
Linear regression solves a system of equations (usually an overdetermined one, in the least-squares sense) to find the best-fit line!
# Simple linear regression: y = mx + c
# Given 3 data points, find m and c
# Data points: (1, 3), (2, 5), (3, 7)
# System: c + 1m = 3
# c + 2m = 5
# c + 3m = 7
# Coefficient matrix A
A = np.array([
[1, 1], # [c, m] for point 1
[1, 2], # [c, m] for point 2
[1, 3] # [c, m] for point 3
])
# Observed values b
b = np.array([3, 5, 7])
# Solve using least squares (for overdetermined systems)
params, residuals, rank, s = np.linalg.lstsq(A, b, rcond=None)
c, m = params
print(f"Best fit line: y = {m:.2f}x + {c:.2f}")
print(f"Intercept (c): {c:.2f}")
print(f"Slope (m): {m:.2f}")
# Predictions
x_test = np.array([1, 2, 3])
y_pred = m * x_test + c
print(f"\nPredictions: {y_pred}")
print(f"Actual: {b}")
The Normal Equation: ML's Secret Weapon
For linear regression, there's a closed-form solution: θ = (XᵀX)⁻¹Xᵀy
# Housing price prediction
# Features: size (sq ft)
X = np.array([[1, 1000], [1, 1500], [1, 2000], [1, 1200]]) # Add bias column
y = np.array([200000, 250000, 300000, 220000]) # Prices
# Normal equation
XtX = X.T @ X
Xty = X.T @ y
theta = np.linalg.inv(XtX) @ Xty
print(f"Parameters: intercept = {theta[0]:.0f}, coef = {theta[1]:.2f}")
# Make prediction for 1800 sq ft house
prediction = np.array([1, 1800]) @ theta
print(f"Predicted price for 1800 sq ft: ${prediction:,.0f}")
This is essentially what scikit-learn's LinearRegression computes, though under the hood it uses a more numerically stable least-squares solver instead of an explicit inverse.
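You can sanity-check the normal equation against NumPy's own least-squares routine, continuing from the block above; both should return the same parameters.
# Cross-check: a least-squares solver gives the same parameters
theta_lstsq, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
print(f"Normal equation: {theta}")
print(f"np.linalg.lstsq: {theta_lstsq}")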
Linear Transformations: How Matrices Change Space
What is a Transformation?
A transformation takes a vector and changes it. Matrices represent transformations!
Key idea: Multiplying by a matrix = applying a transformation
# Scaling transformation: makes vectors bigger/smaller
scale_matrix = np.array([
[2, 0], # Scale x by 2
[0, 3] # Scale y by 3
])
original = np.array([1, 1])
transformed = scale_matrix @ original
print(f"Original vector: {original}")
print(f"After scaling: {transformed}") # Output: [2, 3]
Common Transformations in ML
1. Rotation: Changing Direction
import numpy as np
# Rotate 90° counterclockwise
# Formula: [[cos θ, -sin θ], [sin θ, cos θ]]
theta = np.pi / 2 # 90 degrees
rotation = np.array([
[np.cos(theta), -np.sin(theta)],
[np.sin(theta), np.cos(theta)]
])
vector = np.array([1, 0]) # Point along x-axis
rotated = rotation @ vector
print(f"Original: {vector}")
print(f"After 90° rotation: {np.round(rotated)}") # Output: [0, 1]
ML Use: Data augmentation (rotating images), coordinate transformations
2. Reflection: Mirroring
# Reflect across y-axis (flip horizontally)
reflection = np.array([
[-1, 0], # Negate x
[0, 1] # Keep y
])
point = np.array([3, 2])
reflected = reflection @ point
print(f"Original: {point}")
print(f"Reflected: {reflected}") # Output: [-3, 2]
ML Use: Flipping images for data augmentation
3. Shearing: Skewing
# Shear transformation
shear = np.array([
[1, 0.5], # x' = x + 0.5y
[0, 1] # y' = y
])
square = np.array([[0, 1, 1, 0], [0, 0, 1, 1]]) # Square corners
sheared = shear @ square
print("Original square corners:")
print(square)
print("\nSheared:")
print(sheared)
Orthogonality: When Vectors are Independent
What Does Orthogonal Mean?
Two vectors are orthogonal (perpendicular) if their dot product is zero: neither has any component in the other's direction.
# Perpendicular vectors
v1 = np.array([1, 0, 0]) # x-axis
v2 = np.array([0, 1, 0]) # y-axis
dot_product = np.dot(v1, v2)
print(f"v1: {v1}")
print(f"v2: {v2}")
print(f"Dot product: {dot_product}")
print(f"Are orthogonal? {dot_product == 0}")
Orthogonal Projection: Finding Components
Project vector b onto vector a to find "how much of b goes in direction of a"
Formula: proj_a(b) = (a·b / a·a) × a
# Project b onto a
a = np.array([3, 4])
b = np.array([5, 2])
# Calculate projection
dot_ab = np.dot(a, b)
dot_aa = np.dot(a, a)
projection = (dot_ab / dot_aa) * a
print(f"Vector a: {a}")
print(f"Vector b: {b}")
print(f"Projection of b onto a: {projection}")
# The perpendicular component
perpendicular = b - projection
print(f"Perpendicular component: {perpendicular}")
# Verify: projection and perpendicular should be orthogonal
verify = np.dot(projection, perpendicular)
print(f"Dot product (should be ~0): {verify:.10f}")
ML Application:
- Feature decomposition in PCA
- Gram-Schmidt orthogonalization
- Creating independent features
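To give the Gram-Schmidt bullet above some substance, here's a minimal two-vector sketch: subtracting the projection leaves a component that is orthogonal to the first vector (same numbers as the projection example above).
import numpy as np
# Minimal Gram-Schmidt step: remove the u1-component from v2
u1 = np.array([3.0, 4.0])
v2 = np.array([5.0, 2.0])
u2 = v2 - (np.dot(u1, v2) / np.dot(u1, u1)) * u1
print(f"u1: {u1}")
print(f"u2: {u2}")
print(f"u1 . u2 (should be ~0): {np.dot(u1, u2):.10f}")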
Vector Spaces: The Big Picture
What is a Vector Space?
A vector space is a collection of vectors where you can add vectors and multiply by scalars, staying within the space.
Key concepts:
- Span: All possible linear combinations of vectors
- Basis: Minimum set of independent vectors that span the space
- Dimension: Number of vectors in a basis
# Standard basis for R³
e1 = np.array([1, 0, 0])
e2 = np.array([0, 1, 0])
e3 = np.array([0, 0, 1])
# Any vector in R³ can be written as combination
target = np.array([5, 3, 7])
print(f"Target vector: {target}")
print(f"As combination: {target[0]}*e1 + {target[1]}*e2 + {target[2]}*e3")
# Verify
combination = target[0]*e1 + target[1]*e2 + target[2]*e3
print(f"Reconstruction: {combination}")
Linear Independence: No Redundancy
Vectors are linearly independent if none can be written as a combination of others.
import sympy as sp
# Check if vectors are independent
v1 = sp.Matrix([1, 2, 3])
v2 = sp.Matrix([4, 5, 6])
v3 = sp.Matrix([7, 8, 9])
# Put as columns in a matrix
M = sp.Matrix([[1, 4, 7], [2, 5, 8], [3, 6, 9]])
rank = M.rank()
print(f"Rank: {rank}")
print(f"Number of vectors: 3")
print(f"Are linearly independent? {rank == 3}")
# They're not independent because v3 = 2*v2 - v1
verification = 2*v2 - v1
print(f"\n2*v2 - v1 = {verification.T}")
print(f"v3 = {v3.T}")
print(f"Equal? {verification == v3}")
ML Insight: If features are linearly dependent, you have redundant information. Remove them for efficiency!
Practical Example: Data Preprocessing
Let's combine these concepts to preprocess a dataset!
# Raw dataset: 5 samples, 3 features
data = np.array([
[100, 2.5, 30],
[150, 3.0, 45],
[120, 2.8, 35],
[180, 3.5, 50],
[110, 2.6, 32]
])
print("Original data:")
print(data)
print(f"Shape: {data.shape}")
# Step 1: Check rank (are features independent?)
rank = np.linalg.matrix_rank(data)
print(f"\nRank: {rank}")
print(f"Full rank? {rank == min(data.shape)}")
# Step 2: Center the data (subtract mean)
mean = np.mean(data, axis=0)
centered = data - mean
print(f"\nCentered data (first 2 rows):")
print(centered[:2])
# Step 3: Compute covariance matrix
cov_matrix = np.cov(centered.T)
print(f"\nCovariance matrix:")
print(cov_matrix)
print(f"Determinant: {np.linalg.det(cov_matrix):.2f}")
# Step 4: Check how strongly features co-vary (off-diagonal entries)
print(f"\nCovariance between features 0 and 1: {cov_matrix[0,1]:.2f}")
Common Pitfalls and How to Avoid Them
Pitfall 1: Ignoring Singular Matrices
# This matrix looks fine but is singular!
A = np.array([
[1, 2, 3],
[2, 4, 6],
[1, 1, 2]
])
det = np.linalg.det(A)
print(f"Determinant: {det:.10f}")
if abs(det) < 1e-10:
    print("Warning: Matrix is (nearly) singular!")
    print("Check for duplicate or linearly dependent rows/columns")
Pitfall 2: Not Checking Dimensions
# Always verify dimensions before operations
A = np.random.randn(5, 3)
B = np.random.randn(4, 5)
print(f"A shape: {A.shape}")
print(f"B shape: {B.shape}")
# Can we multiply A @ B?
if A.shape[1] == B.shape[0]:
    result = A @ B
    print(f"Result shape: {result.shape}")
else:
    print("Cannot multiply! Inner dimensions don't match")
Pitfall 3: Numerical Instability
# Small determinant = numerical problems
A = np.array([[1, 1], [1, 1.0001]])
det = np.linalg.det(A)
print(f"Determinant: {det}")
if abs(det) < 1e-3:  # det is about 1e-4 here, so this check fires
    print("Warning: Nearly singular! Results may be unstable")
    print("Consider regularization or feature engineering")
Quick Reference: Matrix Properties
| Property | Formula/Code | Meaning | ML Use |
|---|---|---|---|
| Rank | np.linalg.matrix_rank(A) | # independent rows | Detect redundancy |
| Determinant | np.linalg.det(A) | Invertibility test | Check for singularity |
| Trace | np.trace(A) | Sum of diagonal | Variance, complexity |
| Inverse | np.linalg.inv(A) | Undo transformation | Solve equations |
| Solve | np.linalg.solve(A, b) | Find x in Ax=b | Linear regression |
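To put the table to work, here is a small hypothetical helper you could adapt for your own projects; the tolerance is illustrative, not a standard value.
import numpy as np

def matrix_health_check(A, tol=1e-10):
    """Quick diagnostic report for a 2-D NumPy array (illustrative helper)."""
    rank = np.linalg.matrix_rank(A)
    report = {"shape": A.shape, "rank": rank, "full_rank": rank == min(A.shape)}
    if A.shape[0] == A.shape[1]:  # determinant is only defined for square matrices
        det = np.linalg.det(A)
        report["determinant"] = det
        report["near_singular"] = abs(det) < tol
    return report

print(matrix_health_check(np.array([[1, 2], [2, 4]])))
# Expect: rank 1, full_rank False, near_singular True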
What's Coming Next
You now understand matrix properties and transformations! In the final post, we'll cover:
- Eigenvalues and Eigenvectors: The "DNA" of matrices
- Matrix Decomposition: SVD and its power
- PCA: Dimensionality reduction in action
- Real ML Applications: Seeing it all come together
These advanced topics build directly on what you've learned here!
Matrix properties aren't just theory—they're debugging tools, performance indicators, and keys to understanding when algorithms will work (or fail). Master these, and you'll write better ML code! If this guide helped you understand matrix properties and transformations, connect with me on Twitter or LinkedIn.
Support My Work
If this guide helped you understand matrix properties, transformations, and how they apply to machine learning, I'd really appreciate your support! Creating comprehensive, free educational content takes significant time and effort. Your support helps me continue sharing knowledge and creating more helpful tutorials for students like you.
☕ Buy me a coffee - Every contribution, big or small, means the world to me and keeps me motivated to create more content!
Cover image by Vlado Paunovic on Unsplash