Linear Algebra with NumPy: Essential Mathematical Operations for Data Science
Master linear algebra with NumPy for machine learning and data science. Learn matrices, eigenvalues, SVD, matrix decomposition, vector operations, and solve real-world problems with practical Python examples.

The Day Linear Algebra Finally Made Sense
I'll be honest—linear algebra intimidated me for years. Eigenvalues? Matrix decomposition? It felt like abstract math with no practical purpose. Then I started implementing machine learning algorithms from scratch.
Suddenly, everything clicked. Principal Component Analysis (PCA) is just eigenvalue decomposition. Neural networks are matrix multiplications. Recommendation systems use matrix factorization. Linear algebra wasn't abstract—it was the language of data science.
In this guide, I'll show you the linear algebra operations that power modern AI, explained through practical NumPy implementations. No pure theory—just the math you actually need, with real code examples.
Why Linear Algebra Matters in Data Science
Every major machine learning technique relies on linear algebra:
- Neural Networks: Forward and backward propagation are matrix operations
- PCA: Reduces dimensionality using eigenvectors
- SVD: Powers recommendation systems (Netflix, Amazon)
- Linear Regression: Solved using matrix operations
- Computer Vision: Images are matrices, convolutions are linear operations
- NLP: Word embeddings live in vector spaces
Understanding linear algebra transforms you from using libraries blindly to understanding what happens under the hood.
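To make the first point concrete, here is a minimal sketch of a dense layer's forward pass (the sizes and values are made up for illustration): the core computation really is just a matrix product plus a bias.
import numpy as np
# Hypothetical sizes: a batch of 4 samples with 3 features, a layer with 2 units
X = np.random.randn(4, 3)   # input data (batch × features)
W = np.random.randn(3, 2)   # layer weights (features × units)
b = np.zeros(2)             # layer bias
# Forward pass of one dense layer: matrix multiplication plus bias
Z = X @ W + b
activations = np.maximum(Z, 0)   # ReLU, applied element-wise
print(activations.shape)         # (4, 2)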
Vectors: The Building Blocks
Vectors represent data points, features, or directions in n-dimensional space.
Vector Creation and Properties
import numpy as np
# Create vectors
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
print(f"Vector v1: {v1}")
print(f"Shape: {v1.shape}")
print(f"Dimensions: {v1.ndim}")
# Vector as column (important for matrix operations)
col_vector = v1.reshape(-1, 1)
print(f"\nColumn vector shape: {col_vector.shape}")
print(col_vector)
Vector Operations
v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])
# Addition and subtraction
print(f"v1 + v2 = {v1 + v2}")
print(f"v1 - v2 = {v1 - v2}")
# Scalar multiplication
print(f"3 * v1 = {3 * v1}")
# Dot product (inner product)
dot_product = np.dot(v1, v2)
# Or: v1 @ v2
print(f"\nv1 · v2 = {dot_product}")
print(f"Manual: {v1[0]*v2[0]} + {v1[1]*v2[1]} + {v1[2]*v2[2]} = {dot_product}")
# Vector magnitude (L2 norm)
magnitude = np.linalg.norm(v1)
print(f"\n||v1|| = {magnitude}")
print(f"Manual: sqrt({v1[0]}² + {v1[1]}² + {v1[2]}²) = {magnitude}")
# Unit vector (normalization)
unit_vector = v1 / magnitude
print(f"\nUnit vector: {unit_vector}")
print(f"Magnitude: {np.linalg.norm(unit_vector)}") # Should be 1.0
Angle Between Vectors
v1 = np.array([1, 0, 0])
v2 = np.array([1, 1, 0])
# cos(θ) = (v1 · v2) / (||v1|| * ||v2||)
cos_angle = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
angle_rad = np.arccos(cos_angle)
angle_deg = np.degrees(angle_rad)
print(f"Vectors: {v1} and {v2}")
print(f"Angle: {angle_deg:.2f}°")
# Orthogonal vectors (90°)
v3 = np.array([1, 0, 0])
v4 = np.array([0, 1, 0])
print(f"\nOrthogonal check: v3 · v4 = {np.dot(v3, v4)}") # Should be 0
Matrices: Transforming Data
Matrices represent linear transformations, datasets, and systems of equations.
Matrix Creation
# 2D array (matrix)
A = np.array([[1, 2, 3],
              [4, 5, 6]])
print("Matrix A:")
print(A)
print(f"Shape: {A.shape}") # (rows, columns)
# Identity matrix (I @ A = A)
I = np.eye(3)
print("\nIdentity matrix (3×3):")
print(I)
# Diagonal matrix
D = np.diag([1, 2, 3])
print("\nDiagonal matrix:")
print(D)
# Random matrix
R = np.random.randn(3, 3)
print("\nRandom matrix:")
print(R)
Matrix Operations
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# Addition
print("A + B:")
print(A + B)
# Element-wise multiplication (Hadamard product)
print("\nA * B (element-wise):")
print(A * B)
# Matrix multiplication
print("\nA @ B (matrix multiplication):")
print(A @ B)
# Transpose
print("\nA transpose:")
print(A.T)
# Trace (sum of diagonal)
print(f"\nTrace of A: {np.trace(A)}")
Matrix Multiplication Deep Dive
# (m×n) @ (n×p) = (m×p)
A = np.array([[1, 2, 3],   # 2×3
              [4, 5, 6]])
B = np.array([[7, 8],      # 3×2
              [9, 10],
              [11, 12]])
C = A @ B # Result: 2×2
print("A (2×3):")
print(A)
print("\nB (3×2):")
print(B)
print("\nA @ B (2×2):")
print(C)
# Manual calculation for C[0,0]
print(f"\nC[0,0] = {A[0,0]}*{B[0,0]} + {A[0,1]}*{B[1,0]} + {A[0,2]}*{B[2,0]}")
print(f" = {A[0,0]*B[0,0] + A[0,1]*B[1,0] + A[0,2]*B[2,0]}")
Matrix Inverse and Determinant
Matrix Inverse
A = np.array([[4, 7],
              [2, 6]])
# Compute inverse (only for square, non-singular matrices)
A_inv = np.linalg.inv(A)
print("Matrix A:")
print(A)
print("\nA inverse:")
print(A_inv)
# Verify: A @ A_inv = I
identity = A @ A_inv
print("\nA @ A_inv (should be identity):")
print(np.round(identity, 10))
# Solve linear system: Ax = b
b = np.array([1, 2])
x = A_inv @ b
print(f"\nSolution to Ax = b: {x}")
print(f"Verification (A @ x): {A @ x}")
Determinant
A = np.array([[3, 8],
              [4, 6]])
det = np.linalg.det(A)
print(f"Matrix A:")
print(A)
print(f"\nDeterminant: {det}")
print(f"Is invertible: {det != 0}")
# 2×2 determinant: ad - bc
print(f"\nManual: {A[0,0]}*{A[1,1]} - {A[0,1]}*{A[1,0]} = {A[0,0]*A[1,1] - A[0,1]*A[1,0]}")
# Singular matrix (det = 0, not invertible)
singular = np.array([[1, 2],
                     [2, 4]])  # Row 2 = 2 * Row 1
print(f"\nSingular matrix determinant: {np.linalg.det(singular)}")
Eigenvalues and Eigenvectors: The Heart of ML
Eigenvectors show directions that remain unchanged (only scaled) by a transformation.
Understanding Eigenvalues
A = np.array([[4, -2],
              [1, 1]])
# Compute eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)
print("Matrix A:")
print(A)
print(f"\nEigenvalues: {eigenvalues}")
print("\nEigenvectors (as columns):")
print(eigenvectors)
# Verify: A @ v = λ * v
v1 = eigenvectors[:, 0] # First eigenvector
lambda1 = eigenvalues[0] # First eigenvalue
print(f"\n{'='*50}")
print("Verification for first eigenvector:")
print(f"v1 = {v1}")
print(f"λ1 = {lambda1}")
print(f"\nA @ v1 = {A @ v1}")
print(f"λ1 * v1 = {lambda1 * v1}")
print(f"Equal? {np.allclose(A @ v1, lambda1 * v1)}")
Eigenvalue Decomposition (Spectral Decomposition)
A = np.array([[3, 1],
              [1, 3]])
eigenvalues, eigenvectors = np.linalg.eig(A)
# A = Q @ Λ @ Q^(-1)
Q = eigenvectors # Eigenvectors as columns
Lambda = np.diag(eigenvalues) # Eigenvalues on diagonal
Q_inv = np.linalg.inv(Q)
print("Original A:")
print(A)
print("\nQ (eigenvectors):")
print(Q)
print("\nΛ (eigenvalues):")
print(Lambda)
# Reconstruct A
A_reconstructed = Q @ Lambda @ Q_inv
print("\nReconstructed A = Q @ Λ @ Q^(-1):")
print(A_reconstructed)
print(f"\nMatch: {np.allclose(A, A_reconstructed)}")
Singular Value Decomposition (SVD): The Most Powerful Tool
SVD decomposes any matrix A into three factors, A = U @ Σ @ V^T, where U and V have orthonormal columns and Σ holds the singular values on its diagonal.
Understanding SVD
# Any matrix A = U @ Σ @ V^T
A = np.array([[3, 1, 1],
              [1, 3, 1]])
U, singular_values, Vt = np.linalg.svd(A)
# Create Σ matrix (same shape as A)
Sigma = np.zeros_like(A, dtype=float)
np.fill_diagonal(Sigma, singular_values)
print("Original A (2×3):")
print(A)
print("\nU (2×2) - left singular vectors:")
print(U)
print("\nΣ (2×3) - singular values:")
print(Sigma)
print(f"Singular values: {singular_values}")
print("\nV^T (3×3) - right singular vectors:")
print(Vt)
# Reconstruct A
A_reconstructed = U @ Sigma @ Vt
print("\n" + "="*50)
print("Reconstructed A = U @ Σ @ V^T:")
print(A_reconstructed)
print(f"Match: {np.allclose(A, A_reconstructed)}")
Low-Rank Approximation with SVD
# Create larger matrix
np.random.seed(42)
A = np.random.randn(10, 10)
U, s, Vt = np.linalg.svd(A)
print(f"Original matrix shape: {A.shape}")
print(f"Singular values: {s}")
# Keep only k largest singular values
for k in [1, 2, 5]:
    # Zero out smaller singular values
    s_k = s.copy()
    s_k[k:] = 0
    # Reconstruct the rank-k approximation
    Sigma_k = np.diag(s_k)
    A_k = U @ Sigma_k @ Vt
    # Relative reconstruction error (Frobenius norm)
    error = np.linalg.norm(A - A_k, 'fro') / np.linalg.norm(A, 'fro')
    # Energy captured = sum of squared singular values kept / total
    energy = np.sum(s[:k]**2) / np.sum(s**2)
    print(f"\nRank-{k} approximation:")
    print(f"  Relative error: {error*100:.2f}%")
    print(f"  Energy captured: {energy*100:.2f}%")
Principal Component Analysis (PCA)
PCA uses eigenvalue decomposition for dimensionality reduction.
PCA from Scratch
# Generate sample data (100 samples, 5 features)
np.random.seed(123)
data = np.random.randn(100, 5)
print("Step 1: Center the data")
mean = np.mean(data, axis=0)
data_centered = data - mean
print(f"Original shape: {data.shape}")
print(f"Mean (should be ~0): {np.mean(data_centered, axis=0)}")
print("\nStep 2: Compute covariance matrix")
cov_matrix = np.cov(data_centered.T) # Features as rows
print(f"Covariance matrix shape: {cov_matrix.shape}")
print("\nStep 3: Eigenvalue decomposition")
eigenvalues, eigenvectors = np.linalg.eig(cov_matrix)
# Sort by eigenvalues (descending)
idx = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[idx]
eigenvectors = eigenvectors[:, idx]
print(f"Eigenvalues: {eigenvalues}")
print("\nStep 4: Explained variance")
explained_var = eigenvalues / np.sum(eigenvalues)
cumulative_var = np.cumsum(explained_var)
for i, (var, cum) in enumerate(zip(explained_var, cumulative_var)):
    print(f"  PC{i+1}: {var*100:.2f}% (cumulative: {cum*100:.2f}%)")
print("\nStep 5: Project to 2D")
n_components = 2
principal_components = eigenvectors[:, :n_components]
data_pca = data_centered @ principal_components
print(f"Reduced shape: {data_pca.shape}")
print(f"Variance preserved: {cumulative_var[n_components-1]*100:.2f}%")
Solving Linear Systems
Using Matrix Inverse (Small Systems)
# System: 3x + 2y = 7
# 2x + 5y = 12
A = np.array([[3, 2],
              [2, 5]])
b = np.array([7, 12])
# Method 1: Using inverse
x = np.linalg.inv(A) @ b
print("Solution using inverse:")
print(f"x = {x[0]:.4f}, y = {x[1]:.4f}")
# Verify
print(f"Verification: A @ x = {A @ x}")
Using np.linalg.solve (Faster and Numerically Stable)
# Same system
A = np.array([[3, 2],
              [2, 5]])
b = np.array([7, 12])
# Method 2: More efficient, numerically stable
x = np.linalg.solve(A, b)
print("Solution using solve:")
print(f"x = {x[0]:.4f}, y = {x[1]:.4f}")
# For least squares (overdetermined systems)
# A (m×n), b (m,), where m > n
A_over = np.array([[1, 2],
                   [3, 4],
                   [5, 6]])
b_over = np.array([1, 2, 3])
x_lstsq = np.linalg.lstsq(A_over, b_over, rcond=None)[0]
print(f"\nLeast squares solution: {x_lstsq}")
Matrix Factorizations for Data Science
QR Decomposition
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])
# A = Q @ R where Q is orthogonal, R is upper triangular
Q, R = np.linalg.qr(A)
print("Original A:")
print(A)
print("\nQ (orthogonal matrix):")
print(Q)
print("\nR (upper triangular):")
print(R)
# Verify Q is orthogonal: Q^T @ Q = I
print("\nQ^T @ Q (should be identity):")
print(np.round(Q.T @ Q, 10))
# Reconstruct A
print("\nReconstructed A = Q @ R:")
print(Q @ R)
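One practical payoff: QR gives a numerically stable way to solve least-squares problems without ever forming A^T A. A sketch, reusing the overdetermined system from the linear-systems section (the _ls names are just to avoid clobbering Q and R above):
# Least squares via QR: minimize ||Ax - b|| by solving R x = Q^T b
A_over = np.array([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0]])
b_over = np.array([1.0, 2.0, 3.0])
Q_ls, R_ls = np.linalg.qr(A_over)              # reduced QR: Q is 3×2, R is 2×2
x_qr = np.linalg.solve(R_ls, Q_ls.T @ b_over)  # back-substitution on R x = Q^T b
print(f"Least squares via QR: {x_qr}")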
Cholesky Decomposition (Positive Definite Matrices)
# Create positive definite matrix
A = np.array([[4, 2],
              [2, 3]])
# A = L @ L^T where L is lower triangular
L = np.linalg.cholesky(A)
print("Positive definite A:")
print(A)
print("\nL (lower triangular):")
print(L)
print("\nL @ L^T:")
print(L @ L.T)
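A common data-science use of Cholesky is generating correlated Gaussian samples with a chosen covariance. A quick sketch, treating the matrix A above as the target covariance:
# Sample correlated Gaussian data: if z has identity covariance,
# z @ L.T has covariance L @ L^T = A
np.random.seed(0)
z = np.random.randn(10000, 2)   # uncorrelated standard normal samples
samples = z @ L.T
print("Empirical covariance (should be close to A):")
print(np.round(np.cov(samples.T), 2))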
Real-World Application: Image Compression
# Simulate grayscale image
np.random.seed(42)
image = np.random.rand(50, 50)
print(f"Original image size: {image.shape}")
# SVD decomposition
U, s, Vt = np.linalg.svd(image)
print(f"Number of singular values: {len(s)}")
print(f"Top 5 singular values: {s[:5]}")
# Compress by keeping only k singular values
for k in [5, 10, 20]:
    # Reconstruct with k components
    S_k = np.diag(s[:k])
    image_compressed = U[:, :k] @ S_k @ Vt[:k, :]
    # Calculate compression ratio (storage for the truncated factors vs. full image)
    original_size = image.size
    compressed_size = k * (image.shape[0] + image.shape[1] + 1)
    ratio = (1 - compressed_size / original_size) * 100
    # Calculate relative reconstruction error
    error = np.linalg.norm(image - image_compressed) / np.linalg.norm(image)
    print(f"\nk={k}:")
    print(f"  Compression: {ratio:.1f}% reduction")
    print(f"  Error: {error*100:.2f}%")
Recommendation System with Matrix Factorization
# User-item rating matrix (5 users × 4 movies)
# 0 = not rated
ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [1, 0, 0, 4],
    [0, 1, 5, 4]
])
print("Original ratings (0 = not rated):")
print(ratings)
# Fill missing with mean for SVD
mask = ratings > 0
mean_rating = ratings[mask].mean()
ratings_filled = ratings.astype(float)  # float copy so the mean isn't truncated to an integer
ratings_filled[ratings == 0] = mean_rating
# SVD factorization
U, s, Vt = np.linalg.svd(ratings_filled, full_matrices=False)
# Keep only k=2 latent factors
k = 2
ratings_pred = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(f"\nPredicted ratings:")
print(np.round(ratings_pred, 2))
# Show predictions for unrated items
print("\nRecommendations (originally unrated):")
for i in range(ratings.shape[0]):
    for j in range(ratings.shape[1]):
        if ratings[i, j] == 0:
            print(f"  User {i+1}, Movie {j+1}: {ratings_pred[i, j]:.2f}")
Gram-Schmidt Orthogonalization
# Create linearly independent vectors
v1 = np.array([1.0, 1.0, 0.0])
v2 = np.array([1.0, 0.0, 1.0])
v3 = np.array([0.0, 1.0, 1.0])
def gram_schmidt(vectors):
    """Orthonormalize vectors using the Gram-Schmidt process."""
    orthogonal = []
    for v in vectors:
        # Subtract projections onto the previously found basis vectors
        u = v.copy()
        for basis in orthogonal:
            projection = np.dot(v, basis) / np.dot(basis, basis) * basis
            u = u - projection
        # Normalize to unit length
        u = u / np.linalg.norm(u)
        orthogonal.append(u)
    return np.array(orthogonal)
# Apply Gram-Schmidt
original = [v1, v2, v3]
orthonormal = gram_schmidt(original)
print("Original vectors:")
for i, v in enumerate(original):
    print(f"  v{i+1}: {v}")
print("\nOrthonormal vectors:")
for i, u in enumerate(orthonormal):
    print(f"  u{i+1}: {u}")
    print(f"  Norm: {np.linalg.norm(u):.6f}")
print("\nOrthogonality check (dot products should be ~0):")
print(f" u1·u2 = {np.dot(orthonormal[0], orthonormal[1]):.10f}")
print(f" u1·u3 = {np.dot(orthonormal[0], orthonormal[2]):.10f}")
print(f" u2·u3 = {np.dot(orthonormal[1], orthonormal[2]):.10f}")
Performance Tips
Use Built-in Functions
import time
# Generate large matrices
n = 1000
A = np.random.randn(n, n)
B = np.random.randn(n, n)
# Time matrix multiplication
start = time.time()
C = A @ B
numpy_time = time.time() - start
print(f"NumPy matrix multiplication ({n}×{n}): {numpy_time:.4f}s")
print("NumPy uses optimized BLAS/LAPACK libraries!")
Avoid Unnecessary Copies
A = np.random.randn(1000, 1000)
# Good: In-place operation
A += 1
# Bad: Creates copy
A = A + 1 # New array allocated
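If you're unsure whether an operation allocated a new array, np.shares_memory can tell you; many ufuncs also accept an out= argument for explicit in-place writes. A small sketch:
# Check whether the result reuses the same memory
A = np.random.randn(1000, 1000)
view = A
A += 1
print(np.shares_memory(A, view))   # True: modified in place
A = A + 1
print(np.shares_memory(A, view))   # False: a new array was allocated
# Ufuncs can write into an existing array explicitly
np.add(A, 1, out=A)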
Common Linear Algebra Mistakes
Mistake 1: Dimension Mismatches
# Be explicit about shapes
A = np.array([[1, 2, 3]]) # Shape (1, 3)
B = np.array([[1], [2], [3]]) # Shape (3, 1)
print(f"A shape: {A.shape}")
print(f"B shape: {B.shape}")
print(f"A @ B shape: {(A @ B).shape}") # (1, 1)
print(f"B @ A shape: {(B @ A).shape}") # (3, 3)
Mistake 2: Forgetting Transpose
# Row vector
v = np.array([1, 2, 3])
# For column vector in matrix operations
v_col = v.reshape(-1, 1)
# Or: v[:, np.newaxis]
A = np.array([[1, 0, 0],
              [0, 2, 0],
              [0, 0, 3]])
result = A @ v_col
print(f"A @ v (column): {result.ravel()}")
Mistake 3: Not Checking Invertibility
A = np.array([[1, 2],
              [2, 4]])  # Singular!
det = np.linalg.det(A)
print(f"Determinant: {det}")
if abs(det) < 1e-10:
    print("Matrix is singular! Cannot invert.")
else:
    A_inv = np.linalg.inv(A)
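When a matrix is singular or nearly singular, one common fallback is the Moore-Penrose pseudo-inverse (np.linalg.pinv), which returns a least-squares solution instead of raising an error. A sketch using the singular matrix above:
# Pseudo-inverse: works even when the matrix cannot be inverted
b = np.array([1, 2])
x_pinv = np.linalg.pinv(A) @ b
print(f"Pseudo-inverse solution: {x_pinv}")
print(f"A @ x: {A @ x_pinv}")   # the closest achievable match to b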
Your Linear Algebra Toolkit
You now have the essential linear algebra operations for data science:
- Vector operations - Dot products, norms, angles
- Matrix operations - Multiplication, transpose, inverse
- Eigenvalues - PCA, spectral analysis
- SVD - Dimensionality reduction, compression
- Linear systems - Regression, optimization
- Factorizations - QR, Cholesky for numerical stability
These operations power every machine learning algorithm you'll encounter.
Next Steps
- Implement PCA from scratch on real data
- Build a recommender system using SVD
- Study 3Blue1Brown's Linear Algebra series
- Explore gradient descent with matrix calculus
- Learn about advanced NumPy linear algebra functions
Remember: mastering linear algebra isn't about memorizing formulas—it's about understanding transformations, projections, and how data moves through dimensional space.
Found this guide useful? Share it with your data science community! Let's connect on Twitter or LinkedIn to discuss linear algebra applications in ML.
Support My Work
If this guide helped you with this topic, I'd really appreciate your support! Creating comprehensive, free content like this takes significant time and effort. Your support helps me continue sharing knowledge and creating more helpful resources for aspiring data scientists and engineers.
☕ Buy me a coffee - Every contribution, big or small, means the world to me and keeps me motivated to create more content!
Cover image by Pawel Czerwinski on Unsplash