Heap Data Structure Explained: Master Priority Queue & Kth Element Problems
Complete guide to heaps and priority queues with Python heapq. Learn min-heap vs max-heap, solve Kth largest element, top K problems, and running median for coding interviews with efficient implementations

The "Aha" Moment That Made Heaps Click
When I first learned about heaps, I memorized their properties—complete binary tree, heap property, O(log n) operations—but I had no idea when to actually use them. Then I encountered the "Kth largest element" problem in an interview, and something clicked.
The interviewer asked: "Find the Kth largest element in an unsorted array." My first instinct was to sort (O(n log n)) or use a selection algorithm. Then they asked: "What if you're processing a stream and need to maintain the Kth largest continuously?"
That's when heaps made sense. Heaps are perfect for maintaining "top K" or "bottom K" elements efficiently. Once I understood this single use case, I started seeing heap problems everywhere.
In this guide, I'll share the simple framework that made heaps intuitive for me. You'll learn when to use min-heap vs max-heap, and solve the most common heap interview patterns.
What Makes Heaps Special
A heap is a complete binary tree where:
- Min-heap: Parent ≤ children (root is minimum)
- Max-heap: Parent ≥ children (root is maximum)
Key operations (all O(log n)):
- Insert element
- Extract min/max
- Peek at min/max (O(1))
Why heaps matter:
- Efficient priority access: Always know the min/max in O(1)
- Efficient updates: Insert/delete in O(log n)
- Space efficient: No need to sort entire array
Python's heapq module:
Python provides a min-heap by default. For max-heap, negate values.
import heapq
# Min-heap operations
heap = []
heapq.heappush(heap, 5) # Insert
min_val = heapq.heappop(heap) # Extract min
min_val = heap[0] # Peek min (don't remove)
# Max-heap: negate values
max_heap = []
heapq.heappush(max_heap, -5) # Insert -5 (represents 5)
max_val = -heapq.heappop(max_heap) # Get 5
# Heapify array in O(n)
arr = [3, 1, 4, 1, 5, 9]
heapq.heapify(arr) # arr is now a min-heap
Pattern 1: Kth Largest/Smallest Element
This is THE most common heap pattern. The trick: use a heap of size K.
Kth Largest Element
LeetCode #215 - Kth Largest Element in Array
The insight: Maintain a min-heap of size K containing the K largest elements. The root is the Kth largest!
import heapq
def find_kth_largest(nums, k):
"""
Find kth largest using min-heap of size k.
Heap root is always the kth largest element.
"""
# Build min-heap of first k elements
heap = nums[:k]
heapq.heapify(heap)
# Process remaining elements
for num in nums[k:]:
if num > heap[0]: # If larger than smallest in heap
heapq.heappushpop(heap, num) # Replace smallest
return heap[0] # Root is kth largest
Time: O(n log k) Space: O(k)
Why this works:
- Heap maintains K largest elements seen so far
- Smallest of these K elements is the Kth largest overall
- When we see a larger element, we evict the smallest from heap
Test: find_kth_largest([3,2,1,5,6,4], 2) → 5
Kth Smallest Element
Just invert the logic—use a max-heap of size K!
def find_kth_smallest(nums, k):
"""
Find kth smallest using max-heap of size k.
"""
# Max-heap: negate values
heap = [-x for x in nums[:k]]
heapq.heapify(heap)
for num in nums[k:]:
if num < -heap[0]: # If smaller than largest in heap
heapq.heappushpop(heap, -num)
return -heap[0] # Root is kth smallest
Mental model:
- Kth largest → min-heap of K elements (smallest of largest K)
- Kth smallest → max-heap of K elements (largest of smallest K)
Pattern 2: Top K Elements
Generalization of "Kth element"—return all K elements, not just the Kth.
Top K Frequent Elements
LeetCode #347 - Top K Frequent Elements
from collections import Counter
import heapq
def top_k_frequent(nums, k):
"""
Find k most frequent elements.
Use min-heap of size k, track by frequency.
"""
# Count frequencies
count = Counter(nums)
# Min-heap: (frequency, element)
heap = []
for num, freq in count.items():
heapq.heappush(heap, (freq, num))
if len(heap) > k:
heapq.heappop(heap) # Remove least frequent
return [num for freq, num in heap]
Alternative: Use max-heap of all elements, pop K times:
def top_k_frequent_v2(nums, k):
"""
Alternative: max-heap of all elements, extract k times.
"""
count = Counter(nums)
# Max-heap: negate frequencies
heap = [(-freq, num) for num, freq in count.items()]
heapq.heapify(heap)
return [heapq.heappop(heap)[1] for _ in range(k)]
Time: O(n + m log k) where m = unique elements First approach is better when k is small.
Pattern 3: Merge K Sorted Lists/Arrays
Heaps are perfect for merging multiple sorted sequences.
Merge K Sorted Lists
LeetCode #23 - Merge k Sorted Lists
import heapq
def merge_k_lists(lists):
"""
Merge k sorted linked lists using min-heap.
Heap always contains smallest current element from each list.
"""
# Initialize heap with first node from each list
heap = []
for i, node in enumerate(lists):
if node:
# (value, list_index, node)
heapq.heappush(heap, (node.val, i, node))
dummy = ListNode(0)
curr = dummy
while heap:
val, i, node = heapq.heappop(heap)
# Add smallest node to result
curr.next = node
curr = curr.next
# Add next node from same list
if node.next:
heapq.heappush(heap, (node.next.val, i, node.next))
return dummy.next
Why heap? We have K lists, need to repeatedly find the minimum among K elements. Heap does this in O(log K)!
Time: O(N log K) where N = total nodes, K = number of lists Space: O(K) for heap
Kth Smallest in Sorted Matrix
LeetCode #378 - Kth Smallest Element in Sorted Matrix
def kth_smallest_in_matrix(matrix, k):
"""
Find kth smallest in row and column sorted matrix.
Use min-heap, process elements in increasing order.
"""
n = len(matrix)
heap = []
# Add first element from each row
for r in range(min(n, k)): # Only need first k rows
heapq.heappush(heap, (matrix[r][0], r, 0))
# Extract min k times
for _ in range(k):
val, r, c = heapq.heappop(heap)
# Add next element from same row
if c + 1 < n:
heapq.heappush(heap, (matrix[r][c+1], r, c+1))
return val
Key insight: Start with first column, always add the next element from the same row as the extracted minimum. This ensures we process in sorted order.
Pattern 4: Running Median
Maintain median of a stream using two heaps: max-heap for smaller half, min-heap for larger half.
Find Median from Data Stream
LeetCode #295 - Find Median from Data Stream
import heapq
class MedianFinder:
"""
Maintain median using two heaps.
Max-heap: smaller half
Min-heap: larger half
Median is between tops of two heaps.
"""
def __init__(self):
self.small = [] # Max-heap (negate values)
self.large = [] # Min-heap
def addNum(self, num):
"""Add number while maintaining balance."""
# Add to max-heap (small half) by default
heapq.heappush(self.small, -num)
# Balance: largest of small half ≤ smallest of large half
if self.small and self.large and (-self.small[0] > self.large[0]):
val = -heapq.heappop(self.small)
heapq.heappush(self.large, val)
# Balance sizes: differ by at most 1
if len(self.small) > len(self.large) + 1:
val = -heapq.heappop(self.small)
heapq.heappush(self.large, val)
if len(self.large) > len(self.small) + 1:
val = heapq.heappop(self.large)
heapq.heappush(self.small, -val)
def findMedian(self):
"""Find median in O(1)."""
if len(self.small) > len(self.large):
return -self.small[0]
elif len(self.large) > len(self.small):
return self.large[0]
else:
return (-self.small[0] + self.large[0]) / 2.0
The brilliance:
- Max-heap stores smaller half (largest at top)
- Min-heap stores larger half (smallest at top)
- Median is between the two tops
- Keep heaps balanced (sizes differ by at most 1)
Time: O(log n) per add, O(1) per findMedian
When to Use Min-Heap vs Max-Heap
Use min-heap when:
- Finding Kth largest (maintain K largest, smallest of them is Kth largest)
- Need quick access to minimum
- Merging sorted lists (min is next element)
Use max-heap when:
- Finding Kth smallest (maintain K smallest, largest of them is Kth smallest)
- Need quick access to maximum
- Median (smaller half)
Remember: It's often counterintuitive!
- Kth largest → min-heap
- Kth smallest → max-heap
Common Heap Pitfalls
Mistake 1: Using Wrong Heap Type
# WRONG: Max-heap for Kth largest
# This gives you the Kth smallest!
heap = []
for num in nums:
heapq.heappush(heap, -num) # Max-heap
if len(heap) > k:
heapq.heappop(heap)
# Wrong answer!
# RIGHT: Min-heap for Kth largest
heap = []
for num in nums:
heapq.heappush(heap, num) # Min-heap
if len(heap) > k:
heapq.heappop(heap)
Mistake 2: Not Handling Ties in Tuple Comparison
# If heap contains tuples, Python compares element-wise
# This can fail if second element isn't comparable
# WRONG: Might crash if nodes aren't comparable
heapq.heappush(heap, (node.val, node))
# RIGHT: Add index for tie-breaking
heapq.heappush(heap, (node.val, i, node))
Mistake 3: Forgetting to Heapify
# WRONG: Just creating list doesn't make it a heap
heap = [3, 1, 4, 1, 5, 9]
min_val = heap[0] # Might not be minimum!
# RIGHT: Heapify first
heap = [3, 1, 4, 1, 5, 9]
heapq.heapify(heap)
min_val = heap[0] # Now it's guaranteed minimum
The Heap Decision Tree
When you see a problem:
1. Does it mention "Kth largest/smallest"?
- YES → Heap of size K
- Kth largest → min-heap
- Kth smallest → max-heap
2. Does it ask for "top K" or "most/least frequent"?
- YES → Heap of size K with custom comparison
3. Does it involve merging sorted sequences?
- YES → Min-heap tracking current minimum from each sequence
4. Does it involve maintaining median or percentile?
- YES → Two heaps (max-heap for smaller half, min-heap for larger)
5. Does it require repeated min/max extraction?
- YES → Heap (more efficient than sorting repeatedly)
My Practice Strategy
Here's how I mastered heaps:
Week 1: Understand Heap Operations
- Implement heap from scratch (educational)
- Master Python's heapq module
- Solve basic Kth element problems
Week 2: Pattern Recognition
- Top K problems (frequent, closest, largest)
- Merge K sorted sequences
- 15-20 problems
Week 3: Advanced Patterns
- Running median (two heaps)
- Sliding window with heap
- Custom comparisons
Week 4: Mixed Practice
- 30 problems mixing all patterns
- Practice recognizing "this needs a heap"
- Optimize brute force solutions with heaps
Real Interview Success
I was asked: "Find K closest points to origin."
My immediate thought: "Closest K" → heap of size K!
Decision: Kth closest → maintain K closest → max-heap (largest distance in heap is Kth closest)
def k_closest(points, k):
"""
Find k points closest to origin.
Use max-heap of size k to track k closest.
"""
heap = []
for x, y in points:
dist = -(x*x + y*y) # Negative for max-heap
if len(heap) < k:
heapq.heappush(heap, (dist, x, y))
elif dist > heap[0][0]: # Closer than farthest in heap
heapq.heappushpop(heap, (dist, x, y))
return [[x, y] for dist, x, y in heap]
Coded in 8 minutes. The interviewer was impressed by instant pattern recognition.
Conclusion: Your Heap Mastery Roadmap
Heaps are simpler than they appear. Here's what matters:
- Recognize "Kth element" problems - Heap of size K
- Master min-heap vs max-heap choice - Often counterintuitive
- Know Python's heapq - It's your friend
- Practice pattern variations - Top K, merge K, median
Start with simple Kth element problems. Once comfortable, move to merging and median. Finally, tackle custom comparisons and optimization problems.
Within 2-3 weeks of focused practice, you'll recognize heap problems instantly and choose the right heap type immediately.
Remember: heaps are just efficient priority queues. When you need repeated min/max access, think heap!
Happy coding, and may your heaps always be balanced!