Linear Algebra with NumPy: Essential Mathematical Operations for Data Science
Master linear algebra with NumPy for machine learning and data science. Learn matrices, eigenvalues, SVD, matrix decomposition, vector operations, and solve real-world problems with practical Python examples.

The Day Linear Algebra Finally Made Sense
I'll be honest—linear algebra intimidated me for years. Eigenvalues? Matrix decomposition? It felt like abstract math with no practical purpose. Then I started implementing machine learning algorithms from scratch.
Suddenly, everything clicked. Principal Component Analysis (PCA) is just eigenvalue decomposition. Neural networks are matrix multiplications. Recommendation systems use matrix factorization. Linear algebra wasn't abstract—it was the language of data science.
In this guide, I'll show you the linear algebra operations that power modern AI, explained through practical NumPy implementations. No pure theory—just the math you actually need, with real code examples.
Why Linear Algebra Matters in Data Science
Every major machine learning technique relies on linear algebra:
- Neural Networks: Forward and backward propagation are matrix operations
- PCA: Reduces dimensionality using eigenvectors
- SVD: Powers recommendation systems (Netflix, Amazon)
- Linear Regression: Solved using matrix operations
- Computer Vision: Images are matrices, convolutions are linear operations
- NLP: Word embeddings live in vector spaces
Understanding linear algebra transforms you from using libraries blindly to understanding what happens under the hood.
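To make the first point concrete, here is a minimal sketch of a dense layer's forward pass (the sizes and values are made up for illustration): the core computation really is just a matrix product plus a bias.
import numpy as np
# Hypothetical sizes: a batch of 4 samples with 3 features, a layer with 2 units
X = np.random.randn(4, 3)   # input data (batch × features)
W = np.random.randn(3, 2)   # layer weights (features × units)
b = np.zeros(2)             # layer bias
# Forward pass of one dense layer: matrix multiplication plus bias
Z = X @ W + b
activations = np.maximum(Z, 0)   # ReLU, applied element-wise
print(activations.shape)         # (4, 2)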
Vectors: The Building Blocks
Vectors represent data points, features, or directions in n-dimensional space.
Vector Creation and Properties
import numpy as np
# Create vectors
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
print(f"Vector v1: {v1}")
print(f"Shape: {v1.shape}")
print(f"Dimensions: {v1.ndim}")
# Vector as column (important for matrix operations)
col_vector = v1.reshape(-1, 1)
print(f"\nColumn vector shape: {col_vector.shape}")
print(col_vector)
Vector Operations
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
# Addition and subtraction
print(f"v1 + v2 = {v1 + v2}")
print(f"v1 - v2 = {v1 - v2}")
# Scalar multiplication
print(f"3 * v1 = {3 * v1}")
# Dot product (inner product)
dot_product = np.dot(v1, v2)
# Or: v1 @ v2
print(f"\nv1 · v2 = {dot_product}")
print(f"Manual: {v1[0]*v2[0]} + {v1[1]*v2[1]} + {v1[2]*v2[2]} = {dot_product}")
# Vector magnitude (L2 norm)
magnitude = np.linalg.norm(v1)
print(f"\n||v1|| = {magnitude}")
print(f"Manual: sqrt({v1[0]}² + {v1[1]}² + {v1[2]}²) = {magnitude}")
# Unit vector (normalization)
unit_vector = v1 / magnitude
print(f"\nUnit vector: {unit_vector}")
print(f"Magnitude: {np.linalg.norm(unit_vector)}") # Should be 1.0
Angle Between Vectors
v1 = np.array([1, 0, 0])
v2 = np.array([1, 1, 0])
# cos(θ) = (v1 · v2) / (||v1|| * ||v2||)
cos_angle = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
angle_rad = np.arccos(cos_angle)
angle_deg = np.degrees(angle_rad)
print(f"Vectors: {v1} and {v2}")
print(f"Angle: {angle_deg:.2f}°")
# Orthogonal vectors (90°)
v3 = np.array([1, 0, 0])
v4 = np.array([0, 1, 0])
print(f"\nOrthogonal check: v3 · v4 = {np.dot(v3, v4)}") # Should be 0
Matrices: Transforming Data
Matrices represent linear transformations, datasets, and systems of equations.
Matrix Creation
# 2D array (matrix)
A = np.array([[1, 2, 3],
              [4, 5, 6]])
print("Matrix A:")
print(A)
print(f"Shape: {A.shape}") # (rows, columns)
# Identity matrix (I @ A = A)
I = np.eye(3)
print("\nIdentity matrix (3×3):")
print(I)
# Diagonal matrix
D = np.diag([1, 2, 3])
print("\nDiagonal matrix:")
print(D)
# Random matrix
R = np.random.randn(3, 3)
print("\nRandom matrix:")
print(R)
Matrix Operations
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# Addition
print("A + B:")
print(A + B)
# Element-wise multiplication (Hadamard product)
print("\nA * B (element-wise):")
print(A * B)
# Matrix multiplication
print("\nA @ B (matrix multiplication):")
print(A @ B)
# Transpose
print("\nA transpose:")
print(A.T)
# Trace (sum of diagonal)
print(f"\nTrace of A: {np.trace(A)}")
Matrix Multiplication Deep Dive
# (m×n) @ (n×p) = (m×p)
A = np.array([[1, 2, 3],   # 2×3
              [4, 5, 6]])
B = np.array([[7, 8],      # 3×2
              [9, 10],
              [11, 12]])
C = A @ B # Result: 2×2
print("A (2×3):")
print(A)
print("\nB (3×2):")
print(B)
print("\nA @ B (2×2):")
print(C)
# Manual calculation for C[0,0]
print(f"\nC[0,0] = {A[0,0]}*{B[0,0]} + {A[0,1]}*{B[1,0]} + {A[0,2]}*{B[2,0]}")
print(f" = {A[0,0]*B[0,0] + A[0,1]*B[1,0] + A[0,2]*B[2,0]}")
Matrix Inverse and Determinant
Matrix Inverse
A = np.array([[4, 7],
              [2, 6]])
# Compute inverse (only for square, non-singular matrices)
A_inv = np.linalg.inv(A)
print("Matrix A:")
print(A)
print("\nA inverse:")
print(A_inv)
# Verify: A @ A_inv = I
identity = A @ A_inv
print("\nA @ A_inv (should be identity):")
print(np.round(identity, 10))
# Solve linear system: Ax = b
b = np.array([1, 2])
x = A_inv @ b
print(f"\nSolution to Ax = b: {x}")
print(f"Verification (A @ x): {A @ x}")
Determinant
A = np.array([[3, 8],
              [4, 6]])
det = np.linalg.det(A)
print(f"Matrix A:")
print(A)
print(f"\nDeterminant: {det}")
print(f"Is invertible: {det != 0}")
# 2×2 determinant: ad - bc
print(f"\nManual: {A[0,0]}*{A[1,1]} - {A[0,1]}*{A[1,0]} = {A[0,0]*A[1,1] - A[0,1]*A[1,0]}")
# Singular matrix (det = 0, not invertible)
singular = np.array([[1, 2],
                     [2, 4]])  # Row 2 = 2 * Row 1
print(f"\nSingular matrix determinant: {np.linalg.det(singular)}")
Eigenvalues and Eigenvectors: The Heart of ML
Eigenvectors show directions that remain unchanged (only scaled) by a transformation.
Understanding Eigenvalues
A = np.array([[4, -2],
              [1, 1]])
# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)
print("Matrix A:")
print(A)
print(f"\nEigenvalues: {eigenvalues}")
print("\nEigenvectors (as columns):")
print(eigenvectors)
# Verify: A @ v = λ * v
v1 = eigenvectors[:, 0] # First eigenvector
lambda1 = eigenvalues[0] # First eigenvalue
print(f"\n{'='*50}")
print("Verification for first eigenvector:")
print(f"v1 = {v1}")
print(f"λ1 = {lambda1}")
print(f"\nA @ v1 = {A @ v1}")
print(f"λ1 * v1 = {lambda1 * v1}")
print(f"Equal? {np.allclose(A @ v1, lambda1 * v1)}")
Eigenvalue Decomposition (Spectral Decomposition)
A = np.array([[3, 1],
              [1, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)
# A = Q @ Λ @ Q^(-1)
Q = eigenvectors # Eigenvectors as columns
Lambda = np.diag(eigenvalues) # Eigenvalues on diagonal
Q_inv = np.linalg.inv(Q)
print("Original A:")
print(A)
print("\nQ (eigenvectors):")
print(Q)
print("\nΛ (eigenvalues):")
print(Lambda)
# Reconstruct A
A_reconstructed = Q @ Lambda @ Q_inv
print("\nReconstructed A = Q @ Λ @ Q^(-1):")
print(A_reconstructed)
print(f"\nMatch: {np.allclose(A, A_reconstructed)}")
Singular Value Decomposition (SVD): The Most Powerful Tool
SVD decomposes any matrix A into three factors, A = U @ Σ @ V^T, where U and V have orthonormal columns and Σ holds the singular values on its diagonal.
Understanding SVD
# Any matrix A = U @ Σ @ V^T
A = np.array([[3, 1, 1],
              [1, 3, 1]])
U, singular_values, Vt = np.linalg.svd(A)
# Create Σ matrix (same shape as A)
Sigma = np.zeros_like(A, dtype=float)
np.fill_diagonal(Sigma, singular_values)
print("Original A (2×3):")
print(A)
print("\nU (2×2) - left singular vectors:")
print(U)
print("\nΣ (2×3) - singular values:")
print(Sigma)
print(f"Singular values: {singular_values}")
print("\nV^T (3×3) - right singular vectors:")
print(Vt)
# Reconstruct A
A_reconstructed = U @ Sigma @ Vt
print("\n" + "="*50)
print("Reconstructed A = U @ Σ @ V^T:")
print(A_reconstructed)
print(f"Match: {np.allclose(A, A_reconstructed)}")
Low-Rank Approximation with SVD
# Create larger matrix
np.random.seed(42)
A = np.random.randn(10, 10)
U, s, Vt = np.linalg.svd(A)
print(f"Original matrix shape: {A.shape}")
print(f"Singular values: {s}")
# Keep only k largest singular values
for k in [1, 2, 5]:
    # Zero out smaller singular values
    s_k = s.copy()
    s_k[k:] = 0
    # Reconstruct the rank-k approximation
    Sigma_k = np.diag(s_k)
    A_k = U @ Sigma_k @ Vt
    # Relative reconstruction error (Frobenius norm)
    error = np.linalg.norm(A - A_k, 'fro') / np.linalg.norm(A, 'fro')
    # Energy captured = sum of squared singular values kept / total
    energy = np.sum(s[:k]**2) / np.sum(s**2)
    print(f"\nRank-{k} approximation:")
    print(f"  Relative error: {error*100:.2f}%")
    print(f"  Energy captured: {energy*100:.2f}%")
Principal Component Analysis (PCA)
PCA uses eigenvalue decomposition for dimensionality reduction.
PCA from Scratch
# Generate sample data (100 samples, 5 features)
np.random.seed(123)
data = np.random.randn(100, 5)
print("Step 1: Center the data")
mean = np.mean(data, axis=0)
data_centered = data - mean
print(f"Original shape: {data.shape}")
print(f"Mean (should be ~0): {np.mean(data_centered, axis=0)}")
print("\nStep 2: Compute covariance matrix")
cov_matrix = np.cov(data_centered.T) # Features as rows
print(f"Covariance matrix shape: {cov_matrix.shape}")
print("\nStep 3: Eigenvalue decomposition")
eigenvalues, eigenvectors = np.linalg.eig(cov_matrix)
# Sort by eigenvalues (descending)
idx = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[idx]
eigenvectors = eigenvectors[:, idx]
print(f"Eigenvalues: {eigenvalues}")
print("\nStep 4: Explained variance")
explained_var = eigenvalues / np.sum(eigenvalues)
cumulative_var = np.cumsum(explained_var)
for i, (var, cum) in enumerate(zip(explained_var, cumulative_var)):
    print(f"  PC{i+1}: {var*100:.2f}% (cumulative: {cum*100:.2f}%)")
print("\nStep 5: Project to 2D")
n_components = 2
principal_components = eigenvectors[:, :n_components]
data_pca = data_centered @ principal_components
print(f"Reduced shape: {data_pca.shape}")
print(f"Variance preserved: {cumulative_var[n_components-1]*100:.2f}%")
Solving Linear Systems
Using Matrix Inverse (Small Systems)
# System: 3x + 2y = 7
# 2x + 5y = 12
A = np.array([[3, 2],
              [2, 5]])
b = np.array([7, 12])
# Method 1: Using inverse
x = np.linalg.inv(A) @ b
print("Solution using inverse:")
print(f"x = {x[0]:.4f}, y = {x[1]:.4f}")
# Verify
print(f"Verification: A @ x = {A @ x}")
Using np.linalg.solve (Faster and Numerically Stable)
# Same system
A = np.array([[3, 2],
              [2, 5]])
b = np.array([7, 12])
# Method 2: More efficient, numerically stable
x = np.linalg.solve(A, b)
print("Solution using solve:")
print(f"x = {x[0]:.4f}, y = {x[1]:.4f}")
# For least squares (overdetermined systems)
# A (m×n), b (m,), where m > n
A_over = np.array([[1, 2],
                   [3, 4],
                   [5, 6]])
b_over = np.array([1, 2, 3])
x_lstsq = np.linalg.lstsq(A_over, b_over, rcond=None)[0]
print(f"\nLeast squares solution: {x_lstsq}")
Matrix Factorizations for Data Science
QR Decomposition
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])
# A = Q @ R where Q is orthogonal, R is upper triangular
Q, R = np.linalg.qr(A)
print("Original A:")
print(A)
print("\nQ (orthogonal matrix):")
print(Q)
print("\nR (upper triangular):")
print(R)
# Verify Q is orthogonal: Q^T @ Q = I
print("\nQ^T @ Q (should be identity):")
print(np.round(Q.T @ Q, 10))
# Reconstruct A
print("\nReconstructed A = Q @ R:")
print(Q @ R)
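One practical payoff: QR gives a numerically stable way to solve least-squares problems without ever forming A^T A. A sketch, reusing the overdetermined system from the linear-systems section (the _ls names are just to avoid clobbering Q and R above):
# Least squares via QR: minimize ||Ax - b|| by solving R x = Q^T b
A_over = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0]])
b_over = np.array([1.0, 2.0, 3.0])
Q_ls, R_ls = np.linalg.qr(A_over)              # reduced QR: Q is 3×2, R is 2×2
x_qr = np.linalg.solve(R_ls, Q_ls.T @ b_over)  # back-substitution on R x = Q^T b
print(f"Least squares via QR: {x_qr}")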
Cholesky Decomposition (Positive Definite Matrices)
# Create positive definite matrix
A = np.array([[4, 2],
              [2, 3]])
# A = L @ L^T where L is lower triangular
L = np.linalg.cholesky(A)
print("Positive definite A:")
print(A)
print("\nL (lower triangular):")
print(L)
print("\nL @ L^T:")
print(L @ L.T)
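A common data-science use of Cholesky is generating correlated Gaussian samples with a chosen covariance. A quick sketch, treating the matrix A above as the target covariance:
# Sample correlated Gaussian data: if z has identity covariance,
# z @ L.T has covariance L @ L^T = A
np.random.seed(0)
z = np.random.randn(10000, 2)   # uncorrelated standard normal samples
samples = z @ L.T
print("Empirical covariance (should be close to A):")
print(np.round(np.cov(samples.T), 2))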
Real-World Application: Image Compression
# Simulate grayscale image
np.random.seed(42)
image = np.random.rand(50, 50)
print(f"Original image size: {image.shape}")
# SVD decomposition
U, s, Vt = np.linalg.svd(image)
print(f"Number of singular values: {len(s)}")
print(f"Top 5 singular values: {s[:5]}")
# Compress by keeping only k singular values
for k in [5, 10, 20]:
    # Reconstruct with k components
    S_k = np.diag(s[:k])
    image_compressed = U[:, :k] @ S_k @ Vt[:k, :]
    # Calculate compression ratio (storage for the truncated factors vs. full image)
    original_size = image.size
    compressed_size = k * (image.shape[0] + image.shape[1] + 1)
    ratio = (1 - compressed_size / original_size) * 100
    # Calculate relative reconstruction error
    error = np.linalg.norm(image - image_compressed) / np.linalg.norm(image)
    print(f"\nk={k}:")
    print(f"  Compression: {ratio:.1f}% reduction")
    print(f"  Error: {error*100:.2f}%")
Recommendation System with Matrix Factorization
# User-item rating matrix (5 users × 4 movies)
# 0 = not rated
ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4]
])
print("Original ratings (0 = not rated):")
print(ratings)
# Fill missing with mean for SVD
mask = ratings > 0
mean_rating = ratings[mask].mean()
ratings_filled = ratings.astype(float)  # float copy so the mean isn't truncated to an integer
ratings_filled[ratings == 0] = mean_rating
# SVD factorization
U, s, Vt = np.linalg.svd(ratings_filled, full_matrices=False)
# Keep only k=2 latent factors
k = 2
ratings_pred = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(f"\nPredicted ratings:")
print(np.round(ratings_pred, 2))
# Show predictions for unrated items
print("\nRecommendations (originally unrated):")
for i in range(ratings.shape[0]):
    for j in range(ratings.shape[1]):
        if ratings[i, j] == 0:
            print(f"  User {i+1}, Movie {j+1}: {ratings_pred[i, j]:.2f}")
Gram-Schmidt Orthogonalization
# Create linearly independent vectors
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, 0.0, 1.0])
v3 = np.array([0.0, 1.0, 1.0])
def gram_schmidt(vectors):
    """Orthonormalize vectors using the Gram-Schmidt process."""
    orthogonal = []
    for v in vectors:
        # Subtract projections onto the previously found basis vectors
        u = v.copy()
        for basis in orthogonal:
            projection = np.dot(v, basis) / np.dot(basis, basis) * basis
            u = u - projection
        # Normalize to unit length
        u = u / np.linalg.norm(u)
        orthogonal.append(u)
    return np.array(orthogonal)
# Apply Gram-Schmidt
original = [v1, v2, v3]
orthonormal = gram_schmidt(original)
print("Original vectors:")
for i, v in enumerate(original):
    print(f"  v{i+1}: {v}")
print("\nOrthonormal vectors:")
for i, u in enumerate(orthonormal):
    print(f"  u{i+1}: {u}")
    print(f"  Norm: {np.linalg.norm(u):.6f}")
print("\nOrthogonality check (dot products should be ~0):")
print(f" u1·u2 = {np.dot(orthonormal[0], orthonormal[1]):.10f}")
print(f" u1·u3 = {np.dot(orthonormal[0], orthonormal[2]):.10f}")
print(f" u2·u3 = {np.dot(orthonormal[1], orthonormal[2]):.10f}")
Performance Tips
Use Built-in Functions
import time
# Generate large matrices
n = 1000
A = np.random.randn(n, n)
B = np.random.randn(n, n)
# Time matrix multiplication
start = time.time()
C = A @ B
numpy_time = time.time() - start
print(f"NumPy matrix multiplication ({n}×{n}): {numpy_time:.4f}s")
print("NumPy uses optimized BLAS/LAPACK libraries!")
Avoid Unnecessary Copies
A = np.random.randn(1000, 1000)
# Good: In-place operation
A += 1
# Bad: Creates copy
A = A + 1 # New array allocated
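If you're unsure whether an operation allocated a new array, np.shares_memory can tell you; many ufuncs also accept an out= argument for explicit in-place writes. A small sketch:
# Check whether the result reuses the same memory
A = np.random.randn(1000, 1000)
view = A
A += 1
print(np.shares_memory(A, view))   # True: modified in place
A = A + 1
print(np.shares_memory(A, view))   # False: a new array was allocated
# Ufuncs can write into an existing array explicitly
np.add(A, 1, out=A)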
Common Linear Algebra Mistakes
Mistake 1: Dimension Mismatches
# Be explicit about shapes
A = np.array([[1, 2, 3]]) # Shape (1, 3)
B = np.array([[1], [2], [3]]) # Shape (3, 1)
print(f"A shape: {A.shape}")
print(f"B shape: {B.shape}")
print(f"A @ B shape: {(A @ B).shape}") # (1, 1)
print(f"B @ A shape: {(B @ A).shape}") # (3, 3)
Mistake 2: Forgetting Transpose
# Row vector
v = np.array([1, 2, 3])
# For column vector in matrix operations
v_col = v.reshape(-1, 1)
# Or: v[:, np.newaxis]
A = np.array([[1, 0, 0],
              [0, 2, 0],
              [0, 0, 3]])
result = A @ v_col
print(f"A @ v (column): {result.ravel()}")
Mistake 3: Not Checking Invertibility
A = np.array([[1, 2],
              [2, 4]])  # Singular!
det = np.linalg.det(A)
print(f"Determinant: {det}")
if abs(det) < 1e-10:
    print("Matrix is singular! Cannot invert.")
else:
    A_inv = np.linalg.inv(A)
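When a matrix is singular or nearly singular, one common fallback is the Moore-Penrose pseudo-inverse (np.linalg.pinv), which returns a least-squares solution instead of raising an error. A sketch using the singular matrix above:
# Pseudo-inverse: works even when the matrix cannot be inverted
b = np.array([1, 2])
x_pinv = np.linalg.pinv(A) @ b
print(f"Pseudo-inverse solution: {x_pinv}")
print(f"A @ x: {A @ x_pinv}")   # the closest achievable match to b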
Your Linear Algebra Toolkit
You now have the essential linear algebra operations for data science:
- Vector operations - Dot products, norms, angles
- Matrix operations - Multiplication, transpose, inverse
- Eigenvalues - PCA, spectral analysis
- SVD - Dimensionality reduction, compression
- Linear systems - Regression, optimization
- Factorizations - QR, Cholesky for numerical stability
These operations power every machine learning algorithm you'll encounter.
Next Steps
- Implement PCA from scratch on real data
- Build a recommender system using SVD
- Study 3Blue1Brown's Linear Algebra series
- Explore gradient descent with matrix calculus
- Learn about advanced NumPy linear algebra functions
Remember: mastering linear algebra isn't about memorizing formulas—it's about understanding transformations, projections, and how data moves through dimensional space.
Found this guide useful? Share it with your data science community! Let's connect on Twitter or LinkedIn to discuss linear algebra applications in ML.
Support My Work
If this guide helped you with this topic, I'd really appreciate your support! Creating comprehensive, free content like this takes significant time and effort. Your support helps me continue sharing knowledge and creating more helpful resources for aspiring data scientists and engineers.
☕ Buy me a coffee - Every contribution, big or small, means the world to me and keeps me motivated to create more content!
Cover image by Pawel Czerwinski on Unsplash