Ojaswi Athghara | String Algorithms Tutorial: KMP, Anagrams & Pattern Matching for Interviews

String Algorithms Tutorial: KMP, Anagrams & Pattern Matching for Interviews

The String Problem That Changed My Perspective

Strings seemed simple—just arrays of characters, right? Then an interviewer asked: "Find the longest palindromic substring." My nested-loop solution worked but was O(n³). They asked: "Can you do better?"

That's when I discovered that string problems have their own specialized patterns—two pointers, sliding windows, hash maps, and elegant algorithms like KMP that I'd never even heard of.

In this guide, I'll share the string patterns that transformed my interview performance. You'll learn when to use each technique and how to recognize string problem patterns instantly.

Pattern 1: Two Pointers for Strings

Valid Palindrome

LeetCode #125 - Valid Palindrome

def is_palindrome(s):
    """
    Check if string is palindrome (alphanumeric only).
    Two pointers from both ends.
    """
    left, right = 0, len(s) - 1

    while left < right:
        # Skip non-alphanumeric characters
        while left < right and not s[left].isalnum():
            left += 1
        while left < right and not s[right].isalnum():
            right -= 1

        # Compare characters (case-insensitive)
        if s[left].lower() != s[right].lower():
            return False

        left += 1
        right -= 1

    return True

Time: O(n), Space: O(1)

Longest Palindromic Substring (Expand Around Center)

LeetCode #5 - Longest Palindromic Substring

def longest_palindrome(s):
    """
    Find longest palindromic substring.
    Expand around each possible center.
    """
    if not s:
        return ""

    def expand_around_center(left, right):
        """Expand while characters match."""
        while left >= 0 and right < len(s) and s[left] == s[right]:
            left -= 1
            right += 1
        return right - left - 1  # Length of palindrome

    start, max_len = 0, 0

    for i in range(len(s)):
        # Odd length palindromes (center is one character)
        len1 = expand_around_center(i, i)
        # Even length palindromes (center is between characters)
        len2 = expand_around_center(i, i + 1)

        current_len = max(len1, len2)
        if current_len > max_len:
            max_len = current_len
            start = i - (current_len - 1) // 2

    return s[start:start + max_len]

Time: O(n²), Space: O(1)

Key insight: Each position (or between positions) is a potential palindrome center. Expand outward while characters match.

Pattern 2: Sliding Window for Strings

Longest Substring Without Repeating Characters

LeetCode #3 - Longest Substring Without Repeating Characters

def length_of_longest_substring(s):
    """
    Find length of longest substring without repeating characters.
    Sliding window with hash set.
    """
    char_set = set()
    left = 0
    max_length = 0

    for right in range(len(s)):
        # Contract window while duplicate exists
        while s[right] in char_set:
            char_set.remove(s[left])
            left += 1

        # Add current character
        char_set.add(s[right])
        max_length = max(max_length, right - left + 1)

    return max_length

Time: O(n), Space: O(min(n, m)) where m = charset size

Minimum Window Substring

LeetCode #76 - Minimum Window Substring

from collections import Counter

def min_window(s, t):
    """
    Find minimum window in s containing all characters of t.
    Sliding window with character counts.
    """
    if not s or not t:
        return ""

    # Count characters needed
    needed = Counter(t)
    window_counts = {}

    have, need = 0, len(needed)
    left = 0
    min_len = float('inf')
    min_window = [0, 0]

    for right in range(len(s)):
        char = s[right]
        window_counts[char] = window_counts.get(char, 0) + 1

        # Check if we've satisfied requirement for this char
        if char in needed and window_counts[char] == needed[char]:
            have += 1

        # Contract window while valid
        while have == need:
            # Update minimum window
            if right - left + 1 < min_len:
                min_len = right - left + 1
                min_window = [left, right]

            # Remove from left
            char = s[left]
            window_counts[char] -= 1
            if char in needed and window_counts[char] < needed[char]:
                have -= 1
            left += 1

    return s[min_window[0]:min_window[1] + 1] if min_len != float('inf') else ""

Pattern: Expand window to find valid state, contract to minimize.

Pattern 3: Anagram Problems

Valid Anagram

LeetCode #242 - Valid Anagram

from collections import Counter

def is_anagram(s, t):
    """
    Check if two strings are anagrams.
    Compare character frequencies.
    """
    return Counter(s) == Counter(t)

# Alternative: Sort and compare
def is_anagram_v2(s, t):
    return sorted(s) == sorted(t)

Group Anagrams

LeetCode #49 - Group Anagrams

from collections import defaultdict

def group_anagrams(strs):
    """
    Group strings that are anagrams.
    Use sorted string as key.
    """
    anagrams = defaultdict(list)

    for s in strs:
        # Sort string to get canonical form
        key = ''.join(sorted(s))
        anagrams[key].append(s)

    return list(anagrams.values())

Alternative key: Use character frequency tuple (avoids sorting):

def group_anagrams_v2(strs):
    anagrams = defaultdict(list)

    for s in strs:
        # Count frequency of each character
        count = [0] * 26
        for c in s:
            count[ord(c) - ord('a')] += 1

        # Use tuple as key (lists aren't hashable)
        key = tuple(count)
        anagrams[key].append(s)

    return list(anagrams.values())

Pattern 4: String Building and Manipulation

Reverse Words in String

LeetCode #151 - Reverse Words in a String

def reverse_words(s):
    """
    Reverse words in string.
    Split, reverse, join.
    """
    # Split removes extra spaces automatically
    words = s.split()
    # Reverse and join
    return ' '.join(reversed(words))

# In-place version (for languages without built-in split)
def reverse_words_inplace(s):
    """Three-step reverse."""
    # 1. Reverse entire string
    # 2. Reverse each word
    # 3. Clean up spaces
    pass  # Implementation details omitted

String Compression

LeetCode #443 - String Compression

def compress(chars):
    """
    Compress string in-place.
    Return new length.
    """
    write = 0  # Position to write
    read = 0

    while read < len(chars):
        char = chars[read]
        count = 0

        # Count occurrences
        while read < len(chars) and chars[read] == char:
            read += 1
            count += 1

        # Write character
        chars[write] = char
        write += 1

        # Write count if > 1
        if count > 1:
            for digit in str(count):
                chars[write] = digit
                write += 1

    return write

Pattern 5: Substring Problems

Longest Common Substring

def longest_common_substring(s1, s2):
    """
    Find longest common substring (consecutive).
    Dynamic programming approach.
    """
    m, n = len(s1), len(s2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    max_len = 0
    end_pos = 0

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s1[i-1] == s2[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
                if dp[i][j] > max_len:
                    max_len = dp[i][j]
                    end_pos = i

    return s1[end_pos - max_len:end_pos]

Longest Repeating Character Replacement

LeetCode #424 - Longest Repeating Character Replacement

def character_replacement(s, k):
    """
    Longest substring with same character after k replacements.
    Sliding window tracking max frequency.
    """
    from collections import Counter

    count = Counter()
    left = 0
    max_count = 0  # Max frequency in current window
    max_len = 0

    for right in range(len(s)):
        count[s[right]] += 1
        max_count = max(max_count, count[s[right]])

        # If window_size - max_count > k, shrink window
        while (right - left + 1) - max_count > k:
            count[s[left]] -= 1
            left += 1

        max_len = max(max_len, right - left + 1)

    return max_len

Insight: In a valid window, we can replace (window_size - max_frequency) characters. If that exceeds k, shrink window.

Pattern 6: Pattern Matching (KMP Algorithm)

Knuth-Morris-Pratt Algorithm for efficient pattern matching.

def kmp_search(text, pattern):
    """
    Find all occurrences of pattern in text using KMP.
    Preprocessing: build failure function (LPS array).
    """
    def build_lps(pattern):
        """Build Longest Proper Prefix which is also Suffix array."""
        lps = [0] * len(pattern)
        length = 0  # Length of previous longest prefix suffix
        i = 1

        while i < len(pattern):
            if pattern[i] == pattern[length]:
                length += 1
                lps[i] = length
                i += 1
            else:
                if length != 0:
                    length = lps[length - 1]
                else:
                    lps[i] = 0
                    i += 1

        return lps

    # Build LPS array
    lps = build_lps(pattern)

    # Search for pattern
    matches = []
    i = j = 0  # i for text, j for pattern

    while i < len(text):
        if pattern[j] == text[i]:
            i += 1
            j += 1

        if j == len(pattern):
            matches.append(i - j)
            j = lps[j - 1]
        elif i < len(text) and pattern[j] != text[i]:
            if j != 0:
                j = lps[j - 1]
            else:
                i += 1

    return matches

Time: O(n + m) where n = text length, m = pattern length Space: O(m)

Why KMP? Naive pattern matching is O(n×m). KMP achieves O(n+m) by avoiding re-checking characters we already know match.

Pattern 7: String DP Problems

Edit Distance

LeetCode #72 - Edit Distance

def min_distance(word1, word2):
    """
    Minimum edits to transform word1 to word2.
    Operations: insert, delete, replace.
    """
    m, n = len(word1), len(word2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]

    # Base cases
    for i in range(m + 1):
        dp[i][0] = i  # Delete all from word1
    for j in range(n + 1):
        dp[0][j] = j  # Insert all from word2

    # Fill DP table
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if word1[i-1] == word2[j-1]:
                dp[i][j] = dp[i-1][j-1]  # No operation needed
            else:
                dp[i][j] = 1 + min(
                    dp[i-1][j],    # Delete
                    dp[i][j-1],    # Insert
                    dp[i-1][j-1]   # Replace
                )

    return dp[m][n]

Time: O(m×n), Space: O(m×n)

Common String Tricks

Convert Case

s.lower()  # Lowercase
s.upper()  # Uppercase
s.capitalize()  # First char uppercase
s.title()  # Each word capitalized

Check Character Type

c.isalnum()  # Alphanumeric
c.isalpha()  # Alphabetic
c.isdigit()  # Digit
c.isspace()  # Whitespace

String Building (Efficient)

# WRONG: O(n²) due to string immutability
result = ""
for char in s:
    result += char

# RIGHT: O(n) using list
result = []
for char in s:
    result.append(char)
return ''.join(result)

Reverse String

s[::-1]  # Pythonic way
''.join(reversed(s))  # Using reversed()

String Pattern Decision Tree

When you see a string problem:

1. Checking/validating strings?

Palindrome → Two pointers
Anagram → Sort or frequency count
Valid characters → Hash set

2. Finding substring/pattern?

Simple pattern → Built-in in or find()
Repeated pattern matching → KMP
With conditions → Sliding window

3. Transforming strings?

Character by character → Loop and build
Word level → Split, transform, join

4. Counting/grouping?

Frequencies → Counter/hash map
Grouping similar → Hash map with canonical form

5. Optimization problems?

Edit distance, LCS → DP
Maximum/minimum with constraints → Sliding window

Common String Mistakes

Mistake 1: String Concatenation in Loop

# WRONG: O(n²)
result = ""
for i in range(n):
    result += str(i)

# RIGHT: O(n)
result = ''.join(str(i) for i in range(n))

Mistake 2: Not Handling Empty Strings

# Always check
if not s:
    return default_value

Mistake 3: Index Out of Bounds

# WRONG
for i in range(len(s)):
    if s[i] == s[i+1]:  # Crashes at last char!

# RIGHT
for i in range(len(s) - 1):
    if s[i] == s[i+1]:

My Practice Strategy

Week 1: Two Pointers and Palindromes

Palindrome variations
Two pointer techniques
15-20 problems

Week 2: Sliding Window for Strings

Substring problems
Window with conditions
15-20 problems

Week 3: Anagrams and Hashing

Group anagrams, valid anagram
Frequency-based problems
15-20 problems

Week 4: Advanced Patterns

KMP, DP string problems
Mixed practice
20-30 problems

Conclusion: Your String Mastery Roadmap

String problems are pattern recognition:

Two pointers - Palindromes, validation
Sliding window - Substring with conditions
Hash maps - Anagrams, frequencies
KMP - Efficient pattern matching
DP - Edit distance, transformations

Master these patterns, and string problems become straightforward.

Within 4 weeks of focused practice, you'll recognize which pattern to use instantly.

Remember: strings are just specialized arrays. Use the same techniques, adapted for character operations!

Happy coding, and may your strings always be valid!