Getting Started with OpenCV: My First Computer Vision Experiments
Learning OpenCV through hands-on projects. Follow my journey installing OpenCV, processing images, and detecting faces with Python.

The Day I Decided to Actually Try OpenCV
Reading about computer vision was interesting, but I wanted to actually do something. Build something. Make a computer see something.
Everyone kept mentioning OpenCV: apparently it's the library for computer vision. So I thought: How hard can it be? I'll just install it and start detecting things.
Spoiler: It was harder than I expected. But also way more fun and educational than just reading documentation.
Here's the story of my first experiments with OpenCV: the setup struggles, the confusing moments, the small victories, and what I'm learning by actually getting my hands dirty with computer vision.
What Even Is OpenCV?
Before diving in, I had to understand what I was getting into.
OpenCV (Open Source Computer Vision Library) is basically a massive collection of tools for computer vision. It was created by Intel back in 1999 and has been continuously developed since.
Think of it as a Swiss Army knife for working with images and videos:
- Reading and writing images/videos
- Basic operations (resize, crop, rotate)
- Filtering and transformations
- Feature detection (edges, corners)
- Object detection (faces, people, specific objects)
- And hundreds of other functions
The cool part? It's free, open-source, and works with Python (among other languages).
Why Everyone Recommends It
As I researched, I kept seeing:
- Start with OpenCV
- OpenCV is essential for computer vision
- Every CV engineer should know OpenCV
Why? Because it's:
- Battle-tested: Used in production by major companies
- Fast: Optimized C++ code running under the hood
- Comprehensive: Covers most CV tasks you'd want to do
- Well-documented: Years of tutorials and examples available
- Cross-platform: Works on Windows, Mac, Linux, even mobile
Okay, I was convinced. Time to actually use it.
Step 1: Installation (Trickier Than Expected)
I thought installation would be straightforward. Type pip install opencv, done, right?
Not quite.
The Confusion: opencv vs opencv-python vs opencv-contrib-python
There are multiple OpenCV packages on pip:
- opencv-python: Main modules
- opencv-contrib-python: Main modules + extra contributed modules
- opencv-python-headless: Main modules without GUI support, for servers
Which one to install?!
After reading (and getting confused), I settled on:
pip install opencv-contrib-python
This gives me the main modules plus extras. Seemed like the safest bet for learning.
Testing the Installation
Did it work?
import cv2
print(cv2.__version__)
# Output: 4.8.1 (or whatever current version)
Success! OpenCV was installed and importable. Small victory, but it felt good.
The Gotcha: cv2, Not opencv
One thing that confused me initially: you install opencv-python but import cv2.
Why cv2? It's the name of the Python binding module. Confusing naming, but now I know.
My First OpenCV Script: Loading and Displaying an Image
Let's start with the absolute basics: load an image and display it.
import cv2
# Load an image
img = cv2.imread('test_photo.jpg')
# Display it
cv2.imshow('My First Image', img)
cv2.waitKey(0) # Wait for key press
cv2.destroyAllWindows() # Close window
I ran this and... the colors were wrong! Everything had a weird blue-ish tint.
The RGB vs BGR Problem
After some frantic Googling, I learned: OpenCV loads images as BGR (Blue, Green, Red) instead of RGB (Red, Green, Blue) that most other libraries use.
Why? Historical reasons. OpenCV was created when BGR was common in camera hardware.
To fix it:
import cv2
# Load image (BGR by default)
img = cv2.imread('test_photo.jpg')
# Convert to RGB
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# Now colors look correct!
cv2.imshow('Correct Colors', img_rgb)
cv2.waitKey(0)
cv2.destroyAllWindows()
Lesson learned: Always remember OpenCV uses BGR! This tripped me up multiple times.
Experiment 1: Basic Image Operations
Once I could load images, I wanted to actually manipulate them.
Resizing Images
import cv2
# Load image
img = cv2.imread('photo.jpg')
print(f"Original size: {img.shape}") # (height, width, channels)
# Resize to specific dimensions
resized = cv2.resize(img, (800, 600))
# Resize by scale factor (50% of original)
half_size = cv2.resize(img, None, fx=0.5, fy=0.5)
# Display results
cv2.imshow('Original', img)
cv2.imshow('Resized', resized)
cv2.imshow('Half Size', half_size)
cv2.waitKey(0)
cv2.destroyAllWindows()
Seeing images change size based on my code felt satisfying!
Rotating Images
# Get image dimensions
height, width = img.shape[:2]
# Calculate rotation matrix (rotate 45 degrees around center)
center = (width // 2, height // 2)
rotation_matrix = cv2.getRotationMatrix2D(center, 45, 1.0)
# Apply rotation
rotated = cv2.warpAffine(img, rotation_matrix, (width, height))
cv2.imshow('Rotated', rotated)
cv2.waitKey(0)
Rotation was more complex than I expected, requiring a transformation matrix. But it worked!
Cropping Images
This one was surprisingly simple:
# Images are just numpy arrays, so use array slicing!
# Crop: img[y:y+h, x:x+w]
cropped = img[100:400, 200:600] # From y=100 to y=400, x=200 to x=600
cv2.imshow('Cropped', cropped)
cv2.waitKey(0)
Wait, cropping is just array slicing? Mind. Blown. This made me appreciate that images really are just arrays of numbers.
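Since crops are just numpy slices, one quirk worth knowing: a slice is a view into the same memory, not a copy. A numpy-only sketch (no cv2 needed here):

```python
import numpy as np

img = np.full((500, 600, 3), 255, dtype=np.uint8)  # fake all-white image

cropped = img[100:400, 200:600]
print(cropped.shape)  # (300, 400, 3): 300 rows, 400 columns

# Careful: slicing gives a *view*, so modifying the crop
# modifies the original image too.
cropped[:] = 0
print(img[100, 200])  # the original pixel changed as well

# Use .copy() when you need an independent crop:
safe = img[0:50, 0:50].copy()
```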
Experiment 2: Converting Images (Color Spaces and Effects)
Grayscale Conversion
# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('Grayscale', gray)
cv2.waitKey(0)
One line of code, and color photos became black and white. Simple but powerful.
Blur Effects
# Gaussian blur (reduce noise, make image smoother)
blurred = cv2.GaussianBlur(img, (15, 15), 0)
cv2.imshow('Original', img)
cv2.imshow('Blurred', blurred)
cv2.waitKey(0)
The (15, 15) is the kernel size: how much to blur. Larger numbers = more blur.
I tried different values to see the effect. At (51, 51), my photo looked like impressionist art!
Edge Detection
This one blew my mind:
# First convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Apply Canny edge detection
edges = cv2.Canny(gray, 100, 200)
cv2.imshow('Original', img)
cv2.imshow('Edges Only', edges)
cv2.waitKey(0)
Seeing my photo reduced to just edges (the outlines of objects) felt like magic. The computer found boundaries automatically!
Experiment 3: Face Detection (The Exciting Part!)
This is what I really wanted to try: detecting faces.
Using Haar Cascades
OpenCV comes with pre-trained models called Haar Cascades for detecting faces, eyes, etc.
import cv2
# Load the pre-trained face detector
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
)
# Load an image
img = cv2.imread('people_photo.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Detect faces
faces = face_cascade.detectMultiScale(
    gray,
    scaleFactor=1.1,
    minNeighbors=5,
    minSize=(30, 30)
)
print(f"Found {len(faces)} faces!")
# Draw rectangles around detected faces
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2)
# Display result
cv2.imshow('Faces Detected', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
When I ran this on a group photo and saw green rectangles appear around each face... that was the moment I thought okay, this is genuinely cool.
Understanding the Parameters
The detectMultiScale parameters confused me at first:
- scaleFactor: How much to reduce image size at each scale (1.1 = 10% reduction)
- minNeighbors: How many neighbors each candidate rectangle should have to keep it (higher = fewer false positives but might miss faces)
- minSize: Minimum face size to detect
I experimented with these values:
- scaleFactor too high: Missed faces
- minNeighbors too low: Detected faces in random places (false positives)
- minNeighbors too high: Missed some real faces
Finding the right balance took trial and error.
When Face Detection Failed
Not every photo worked perfectly:
Problems I encountered:
- Side profiles weren't detected (model is trained for frontal faces)
- Very small faces in the background were missed
- Unusual lighting caused issues
- Partially covered faces weren't detected
Limitations are important to understand. Face detection isn't magicâit has constraints.
Experiment 4: Video Processing (Images in Motion)
Images are cool, but what about video? Turns out, video is just a sequence of images (frames).
Capturing Video from Webcam
import cv2
# Open webcam (0 is usually the default camera)
cap = cv2.VideoCapture(0)
while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    if not ret:
        break
    # Display the frame
    cv2.imshow('Webcam', frame)
    # Press 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
# Release the capture and close windows
cap.release()
cv2.destroyAllWindows()
Seeing myself on screen via my code was oddly exciting!
Real-Time Face Detection
Combining face detection with webcam input:
import cv2
# Load face detector
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
)
# Open webcam
cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    # Convert to grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Detect faces
    faces = face_cascade.detectMultiScale(gray, 1.1, 5)
    # Draw rectangles around faces
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)
    # Display
    cv2.imshow('Face Detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
Watching the green rectangle track my face in real-time? That's when computer vision felt real, not just theoretical.
What Worked and What Didn't
What Worked: Starting simple, using pre-trained models, experimenting with parameters, and getting immediate visual feedback made learning satisfying.
What Didn't Work: Jumping to complex tasks too quickly, not reading documentation, ignoring BGR vs RGB format differences, and not handling errors properly.
Common Mistakes I Made
- Forgetting cv2.waitKey(0): windows would flash and close immediately.
- Not converting BGR to RGB: colors looked wrong.
- Wrong array dimensions: confusing errors; always check img.shape.
- Not releasing resources: forgetting cap.release() can leave the camera locked for other programs.
- Unrealistic expectations: face detection works well on frontal faces in good lighting, but struggles otherwise.
Key Learnings
- Images are arrays: every operation is math on numbers.
- Preprocessing matters: grayscale conversion, resizing, and blurring significantly affect results.
- Parameters need experimentation: there are no perfect values; test and iterate.
- Pre-trained models are powerful: don't hesitate to use them.
- Real-time processing is demanding: speed matters when processing video.
Cool Things I Built (Small But Satisfying)
Project 1: Photo Filter App
A simple script that applies different filters:
import cv2
img = cv2.imread('photo.jpg')
# Create different versions
grayscale = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(img, (21, 21), 0)
edges = cv2.Canny(grayscale, 50, 150)
# Display all versions
cv2.imshow('Original', img)
cv2.imshow('Grayscale', grayscale)
cv2.imshow('Blurred', blurred)
cv2.imshow('Edges', edges)
cv2.waitKey(0)
cv2.destroyAllWindows()
Like Instagram filters, but I made them!
Project 2: Webcam Face Detector
The real-time face detection script I showed earlier. This felt like actual computer vision!
Project 3: Motion Detector
Detecting motion by comparing consecutive frames:
import cv2
cap = cv2.VideoCapture(0)
# Read first frame
ret, prev_frame = cap.read()
prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
prev_gray = cv2.GaussianBlur(prev_gray, (21, 21), 0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (21, 21), 0)
    # Compute difference between frames
    frame_diff = cv2.absdiff(prev_gray, gray)
    thresh = cv2.threshold(frame_diff, 25, 255, cv2.THRESH_BINARY)[1]
    # Display
    cv2.imshow('Motion Detection', thresh)
    prev_gray = gray
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
White pixels show where motion was detected. Wave your hand, and it lights up!
Resources That Helped Me
Official Documentation
- OpenCV Python Tutorials: Official and comprehensive
Helpful Websites
- PyImageSearch: Practical OpenCV tutorials
- Learn OpenCV: Beginner-friendly guides
What I'm Planning to Try Next
Now that I have the basics down, I want to explore:
- Object detection: Detecting specific objects, not just faces
- Color detection: Finding objects by color
- Contour detection: Finding shapes and boundaries
- Feature matching: Comparing images to find similarities
- Optical flow: Tracking moving objects across frames
- Text detection (OCR): Reading text from images
Each of these will teach me something new about computer vision.
Tips for Beginners
- Start with images before video; they're simpler.
- Use pre-trained models to learn faster.
- Experiment constantly with parameters.
- Read error messages; they're helpful.
- Keep images small for faster iteration.
- Save experiments; failures teach lessons.
- Join communities for support.
- Be patient; there's a learning curve.
Final Thoughts
OpenCV taught me that computer vision is accessible. My code isn't production-ready, and I haven't mastered anything, but I learned tons through hands-on experimentation. That's what beginner projects are for: getting your hands dirty, making mistakes, and building confidence.
If you're curious about OpenCV, stop reading and start coding. Install it, load an image, experiment. You'll get stuck, debug, and eventually see results. And when you do, it feels great.
Experimenting with OpenCV? I'd love to hear what you're building! Connect with me on Twitter or LinkedIn and let's share our experiments.
Support My Work
If this guide helped you with this topic, I'd really appreciate your support! Creating comprehensive, free content like this takes significant time and effort. Your support helps me continue sharing knowledge and creating more helpful resources for developers.
☕ Buy me a coffee - Every contribution, big or small, means the world to me and keeps me motivated to create more content!