Python Modules and Packages: Organize Code Like a Pro
Master Python modules and packages for scalable projects. Learn imports, __init__.py, package structure, relative imports, and best practices for ML projects

When My Project Became a Mess
My ML project was one giant file, 3,000 lines of chaos:
my_project.py # Everything in one file!
Then I learned about modules and packages:
my_project/
├── __init__.py
├── models/
│   ├── __init__.py
│   ├── random_forest.py
│   └── neural_net.py
├── preprocessing/
│   ├── __init__.py
│   └── scalers.py
└── utils/
    ├── __init__.py
    └── metrics.py
Organization = Professional code!
Modules: Single Files
A module is simply a Python file containing definitions and statements. Every .py file you create is automatically a module.
Creating Your First Module
# math_utils.py (module)
def add(a, b):
    """Add two numbers."""
    return a + b
def multiply(a, b):
    """Multiply two numbers."""
    return a * b
def power(base, exponent):
    """Calculate base raised to exponent."""
    return base ** exponent
PI = 3.14159
E = 2.71828
class Calculator:
    """Simple calculator class."""
    def __init__(self):
        self.history = []
    def calculate(self, operation, a, b):
        result = operation(a, b)
        self.history.append(f"{operation.__name__}({a}, {b}) = {result}")
        return result
Using Modules
# main.py
import math_utils
# Access functions
result = math_utils.add(5, 3)
print(result) # 8
# Access constants
print(math_utils.PI) # 3.14159
# Access classes
calc = math_utils.Calculator()
calc.calculate(math_utils.add, 10, 5)
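To try this without setting up files by hand, here is a minimal, self-contained sketch. It writes a tiny stand-in module to a temporary directory and imports it. The name math_utils_demo is an assumption I made up for the demo so it cannot clash with a real math_utils.py on your machine.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Write a tiny stand-in for math_utils.py into a temporary directory.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "math_utils_demo.py").write_text(
    "PI = 3.14159\n"
    "def add(a, b):\n"
    "    return a + b\n"
)

# Make the directory importable, then import the module by name.
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
import math_utils_demo

total = math_utils_demo.add(5, 3)
print(total)                # 8
print(math_utils_demo.PI)   # 3.14159

# A second import reuses the cached module object from sys.modules.
import math_utils_demo as again
print(again is sys.modules["math_utils_demo"])  # True
```

The last line shows why importing the same module in many files is cheap: the file is executed once and then served from the sys.modules cache.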
Module Attributes
Every module has special attributes:
# math_utils.py
print(__name__) # Module name
print(__file__) # File path
print(__doc__) # Module docstring
def main():
    print("Running as script")
# Run only when executed directly
if __name__ == "__main__":
    main()
When you import: __name__ is "math_utils".
When you run directly: __name__ is "__main__".
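You can watch both cases from a single script. This is a sketch under a few assumptions: the file name name_demo.py is hypothetical, and runpy.run_path stands in for typing python name_demo.py in a terminal.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

demo_dir = Path(tempfile.mkdtemp())
script = demo_dir / "name_demo.py"
script.write_text("SEEN_NAME = __name__\n")

# Case 1: imported as a module, so __name__ is the module's own name.
sys.path.insert(0, str(demo_dir))
importlib.invalidate_caches()
import name_demo
imported_name = name_demo.SEEN_NAME

# Case 2: executed like `python name_demo.py`, so __name__ is "__main__".
script_globals = runpy.run_path(str(script), run_name="__main__")
script_name = script_globals["SEEN_NAME"]

print(imported_name)  # name_demo
print(script_name)    # __main__
```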
Packages: Directories with __init__.py
A package is a directory containing __init__.py. This special file tells Python "this directory is a package."
Basic Package Structure
ml_toolkit/
├── __init__.py # Makes it a package
├── models.py
├── preprocessing.py
└── utils.py
The __init__.py File
__init__.py can be empty, but it's powerful when used properly:
# ml_toolkit/__init__.py
"""
ML Toolkit - Machine Learning Utilities
"""
# Import key components for convenient access
from .models import RandomForest, NeuralNetwork
from .preprocessing import scale_data, normalize
from .utils import save_model, load_model
# Define public API
__all__ = ['RandomForest', 'NeuralNetwork', 'scale_data', 'normalize']
# Package metadata
__version__ = '1.0.0'
__author__ = 'Your Name'
# Initialize package-level variables
DEFAULT_CONFIG = {
    'random_state': 42,
    'verbose': True
}
print(f"Loaded ML Toolkit v{__version__}")
Now users can import directly:
from ml_toolkit import RandomForest, scale_data
# Instead of: from ml_toolkit.models import RandomForest
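Here is a small runnable sketch of that re-export trick. The package name toolkit_demo and its one-class models.py are made up for illustration; the point is only that both import paths end up at the same class object.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "toolkit_demo"   # hypothetical package name
pkg.mkdir()

# A submodule that holds the actual implementation.
(pkg / "models.py").write_text(
    "class RandomForest:\n"
    "    name = 'random forest'\n"
)
# __init__.py re-exports the class so callers can skip the submodule path.
(pkg / "__init__.py").write_text(
    "from .models import RandomForest\n"
    "__all__ = ['RandomForest']\n"
    "__version__ = '1.0.0'\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

from toolkit_demo import RandomForest                    # short path via __init__.py
from toolkit_demo.models import RandomForest as Direct   # full path still works

same_class = RandomForest is Direct
print(same_class)  # True
```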
Nested Packages
Packages can contain subpackages:
ml_toolkit/
├── __init__.py
├── models/
│   ├── __init__.py
│   ├── classification/
│   │   ├── __init__.py
│   │   ├── random_forest.py
│   │   └── svm.py
│   └── regression/
│       ├── __init__.py
│       └── linear.py
├── preprocessing/
│   ├── __init__.py
│   ├── scalers.py
│   └── encoders.py
└── utils/
    ├── __init__.py
    └── metrics.py
# Access nested modules
from ml_toolkit.models.classification import random_forest
from ml_toolkit.preprocessing.scalers import StandardScaler
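The standard library uses nested packages too, so you can poke at one without building anything. This sketch uses xml.etree.ElementTree as a stand-in for a module like ml_toolkit.models.classification.random_forest.

```python
import importlib
import sys

# xml.etree.ElementTree is a module two levels deep in the standard library.
element_tree = importlib.import_module("xml.etree.ElementTree")

module_name = element_tree.__name__
parent_package = element_tree.__spec__.parent
print(module_name)     # xml.etree.ElementTree
print(parent_package)  # xml.etree

# Importing a nested module also imports each parent package along the way.
parents_loaded = all(name in sys.modules for name in ("xml", "xml.etree"))
print(parents_loaded)  # True
```

That last check is why every __init__.py on the path to a nested module runs when you import it.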
Import Styles
Different Ways to Import
# 1. Import entire module
import numpy
result = numpy.array([1, 2, 3])
# 2. Import specific function/class
from numpy import array
result = array([1, 2, 3])
# 3. Import with alias (most common for popular libraries)
import pandas as pd
df = pd.DataFrame()
# 4. Import multiple items
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
# 5. Import submodule
from sklearn import ensemble
model = ensemble.RandomForestClassifier()
# 6. Import everything (AVOID - pollutes namespace!)
from numpy import * # Don't do this!
When to Use Each Style
Use full import when you need many functions from a module:
import math
x = math.sin(math.pi / 2)
y = math.cos(0)
z = math.sqrt(16)
Use specific imports when you only need a few items:
from math import sin, cos, pi
x = sin(pi / 2)
y = cos(0)
Use aliases for commonly used libraries:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
Why Avoid from module import *
# Bad - unclear where functions come from
from math import *
from numpy import *
result = sqrt(16) # math.sqrt or numpy.sqrt?
# Good - explicit is better
import math
import numpy as np
result = math.sqrt(16)
array_result = np.sqrt(np.array([16, 25, 36]))
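To see the shadowing happen, here is a sketch that needs only the standard library. I swapped numpy for cmath (an assumption, just so it runs without installing anything) and ran the star imports inside a plain dict so the result is easy to inspect.

```python
import cmath
import math

# Run the star imports inside an explicit namespace dict.
namespace = {}
exec("from math import *", namespace)
first_sqrt = namespace["sqrt"]

exec("from cmath import *", namespace)   # silently replaces sqrt, pi, e, ...
second_sqrt = namespace["sqrt"]

print(first_sqrt is math.sqrt)    # True
print(second_sqrt is cmath.sqrt)  # True
print(second_sqrt(16))            # (4+0j)
```

No warning, no error: the later star import simply wins, and sqrt now returns complex numbers.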
ML Project Structure
ml_project/
├── __init__.py
├── data/
│   ├── __init__.py
│   ├── loader.py
│   └── preprocessor.py
├── models/
│   ├── __init__.py
│   ├── base.py
│   ├── random_forest.py
│   └── neural_net.py
├── training/
│   ├── __init__.py
│   └── trainer.py
├── evaluation/
│   ├── __init__.py
│   └── metrics.py
└── utils/
    ├── __init__.py
    └── helpers.py
Relative vs Absolute Imports
Absolute Imports (Recommended)
Absolute imports specify the full path from the project root:
# In ml_project/models/neural_net.py
from ml_project.models.random_forest import RandomForest
from ml_project.data.loader import load_data
from ml_project.evaluation.metrics import accuracy_score
Advantages:
- Clear and explicit
- Works from anywhere
- Easier to understand
- Better for large projects
Relative Imports
Relative imports use dots to navigate the package hierarchy:
# In ml_project/models/neural_net.py
# Import from the same package (models/)
from .random_forest import RandomForest
# Import from a sibling package (data/) via the parent package
from ..data.loader import load_data
# Import from another sibling package (evaluation/)
from ..evaluation.metrics import accuracy_score
# Go up two levels (works only if ml_project is itself inside a parent package)
from ...config import settings
Dot notation:
- . = current package
- .. = parent package
- ... = grandparent package
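If the dots feel abstract, importlib.util.resolve_name will translate a relative name into its absolute form for you. This sketch assumes ml_project is the top-level package and that the importing file lives in ml_project/models/.

```python
from importlib.util import resolve_name

current_package = "ml_project.models"   # the package containing neural_net.py

same_level = resolve_name(".random_forest", current_package)
one_up = resolve_name("..data.loader", current_package)
print(same_level)  # ml_project.models.random_forest
print(one_up)      # ml_project.data.loader

# Three dots would climb above ml_project, which has no parent package here.
try:
    resolve_name("...config", current_package)
    beyond_top_level = False
except (ImportError, ValueError):   # ValueError before Python 3.9
    beyond_top_level = True
print(beyond_top_level)  # True
```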
When to Use Each
Use absolute imports:
- In entry point scripts
- When clarity is important
- In large, complex projects
- When module might move later
Use relative imports:
- Within a tightly coupled package
- When you want package portability
- For internal package structure
Common Relative Import Mistake
# This FAILS in script run directly
# Can only use relative imports inside packages
# wrong.py (run as script)
from .utils import helper # Error: attempted relative import with no known parent package
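The usual fix is to run the file as a module with python -m from the directory that contains the package. Here is a sketch that shows both outcomes side by side; the package name relimport_demo and its files are hypothetical, and subprocess is only there so one script can launch both runs.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "relimport_demo"   # hypothetical package name
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "utils.py").write_text("def helper():\n    return 'ok'\n")
(pkg / "runner.py").write_text("from .utils import helper\nprint(helper())\n")

# Run as a plain script: Python sees no parent package, so the import fails.
as_script = subprocess.run(
    [sys.executable, str(pkg / "runner.py")],
    capture_output=True, text=True,
)
# Run as a module from the directory that contains the package: it works.
as_module = subprocess.run(
    [sys.executable, "-m", "relimport_demo.runner"],
    capture_output=True, text=True, cwd=root,
)

print(as_script.returncode != 0)   # True
print(as_module.stdout.strip())    # ok
```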
Module Search Path
Python searches for modules in specific locations. Understanding this prevents import errors.
How Python Finds Modules
import sys
print(sys.path)
# Output:
# [
# '/current/directory',
# '/usr/lib/python3.9',
# '/usr/lib/python3.9/site-packages',
# ...
# ]
Search order:
- The directory of the script being run (or the current directory in an interactive session)
- PYTHONPATH environment variable directories
- Standard library directories
- Site-packages (installed packages)
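When an import surprises you, it helps to ask Python where it would load a module from. A minimal sketch using importlib.util.find_spec (the missing module name is deliberately made up):

```python
import importlib.util
import sys

# sys.path is an ordinary list; earlier entries win when names collide.
for position, entry in enumerate(sys.path[:5]):
    print(position, repr(entry))

# find_spec reports where a module would be loaded from.
json_spec = importlib.util.find_spec("json")
print(json_spec.origin)   # path to the standard library's json/__init__.py

missing_spec = importlib.util.find_spec("module_that_does_not_exist_xyz")
print(missing_spec)       # None
```

If origin points at a file in your own project instead of the standard library, you have found a shadowing problem.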
Adding Custom Paths
# Method 1: Modify sys.path
import sys
sys.path.append('/path/to/my/modules')
# Method 2: Use PYTHONPATH environment variable
# In terminal:
# export PYTHONPATH="/path/to/my/modules:$PYTHONPATH"
# Method 3: Use .pth file in site-packages
# Create mymodules.pth with path to your modules
Practical Example
# Project structure
my_project/
├── main.py
├── config.py
└── src/
    ├── __init__.py
    ├── models.py
    └── utils.py
# main.py
import sys
from pathlib import Path
# Add src to path
src_path = Path(__file__).parent / 'src'
sys.path.insert(0, str(src_path))
# Now can import from src
from models import MyModel
from utils import helper_function
Common Import Errors and Solutions
Error 1: ModuleNotFoundError
import my_module # ModuleNotFoundError: No module named 'my_module'
Solutions:
- Check if the file exists in the current directory
- Verify __init__.py exists in package directories
- Check sys.path includes the module's directory
- Install the package if it's external: pip install my_module
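One detail that saves time while working through that checklist: the exception tells you which module was actually missing, and it is not always the one you typed. A small sketch, where try_import is a hypothetical helper name:

```python
import importlib

def try_import(module_name):
    """Return (module, None) on success or (None, reason) on failure."""
    try:
        return importlib.import_module(module_name), None
    except ModuleNotFoundError as exc:
        # exc.name is the module that could not be found, which may be a
        # dependency of module_name rather than module_name itself.
        return None, f"missing module: {exc.name}"

found, found_reason = try_import("json")
print(found.__name__, found_reason)   # json None

missing, missing_reason = try_import("module_that_does_not_exist_xyz")
print(missing, missing_reason)        # None missing module: module_that_does_not_exist_xyz
```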
Error 2: Circular Imports
# module_a.py
from module_b import function_b
def function_a():
    return function_b()
# module_b.py
from module_a import function_a # Circular import!
def function_b():
    return function_a()
Solution: Restructure code or use import inside function:
# module_b.py
def function_b():
    from module_a import function_a # Import inside function
    return function_a()
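Here is a runnable sketch of both the failure and the fix. The module names (broken_a, broken_b, fixed_a, fixed_b) are made up, and I simplified the function bodies so the two functions do not call each other forever once the import works.

```python
import importlib
import sys
import tempfile
from pathlib import Path

demo_dir = Path(tempfile.mkdtemp())
sources = {
    # Broken pair: each module needs the other at import time.
    "broken_a.py": "from broken_b import function_b\ndef function_a():\n    return 'a'\n",
    "broken_b.py": "from broken_a import function_a\ndef function_b():\n    return function_a() + 'b'\n",
    # Fixed pair: fixed_b defers its import until the function is called.
    "fixed_a.py": "from fixed_b import function_b\ndef function_a():\n    return 'a'\n",
    "fixed_b.py": "def function_b():\n    from fixed_a import function_a\n    return function_a() + 'b'\n",
}
for filename, source in sources.items():
    (demo_dir / filename).write_text(source)

sys.path.insert(0, str(demo_dir))
importlib.invalidate_caches()

try:
    import broken_a
    circular_error = None
except ImportError as exc:
    circular_error = str(exc)
print(circular_error)   # cannot import name 'function_a' from partially initialized module ...

import fixed_a
import fixed_b
fixed_result = fixed_b.function_b()
print(fixed_result)     # ab
```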
Error 3: Relative Import Beyond Top-Level
# Attempting to go beyond package root
from ...something import anything # ImportError: attempted relative import beyond top-level package (older Python versions raised ValueError)
Solution: Use absolute imports or restructure package hierarchy.
Error 4: Name Conflicts
# Shadowing standard library
import json # Our json.py file!
data = json.loads('{"key": "value"}') # AttributeError: module 'json' has no attribute 'loads'
Solution: Rename your file to avoid conflicts with standard library or installed packages.
Best Practices
1. Use Meaningful Names
# Bad
import utils
from helpers import func
# Good
import data_preprocessing_utils
from validation_helpers import validate_email
2. Keep __init__.py Clean
# Good __init__.py
"""Package for data processing utilities."""
from .preprocessing import clean_data, normalize
from .validation import validate_input
__all__ = ['clean_data', 'normalize', 'validate_input']
__version__ = '1.0.0'
Avoid complex logic in __init__.py: it runs the first time the package (or any of its submodules) is imported in a process!
3. Organize by Functionality
# Good structure
ml_project/
├── data/        # Data-related modules
├── models/      # Model implementations
├── training/    # Training logic
├── evaluation/  # Evaluation metrics
└── utils/       # General utilities
4. Use __all__ to Define Public API
# models.py
__all__ = ['RandomForest', 'NeuralNetwork'] # Public API
class RandomForest:
    pass
class NeuralNetwork:
    pass
class _PrivateHelper: # Not exported
    pass
5. Document Your Modules
"""
Module: data_preprocessing
This module provides utilities for preprocessing raw data including:
- Data cleaning
- Normalization
- Feature extraction
Example:
    from data_preprocessing import clean_data
    cleaned = clean_data(raw_data)
"""
6. Avoid Circular Dependencies
Organize imports to create a dependency tree, not a web:
# Good architecture
config → utils → models → training → main
# Bad architecture (circular)
models ↔ training ↔ utils
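If you want to check a dependency map for loops, a depth-first search is enough. This is a rough sketch, not a real import analyzer: find_cycle is a hypothetical helper, and the two dictionaries are hand-written stand-ins for "module X imports modules Y and Z".

```python
def find_cycle(dependencies):
    """Return one import cycle as a list of module names, or None."""
    visiting, finished = [], set()

    def visit(module):
        if module in finished:
            return None
        if module in visiting:
            # Found a back edge: slice out the loop and close it.
            return visiting[visiting.index(module):] + [module]
        visiting.append(module)
        for dependency in dependencies.get(module, ()):
            cycle = visit(dependency)
            if cycle:
                return cycle
        visiting.pop()
        finished.add(module)
        return None

    for module in dependencies:
        cycle = visit(module)
        if cycle:
            return cycle
    return None

# Each key imports the modules listed in its value (hypothetical layout).
layered = {"main": ["training"], "training": ["models"],
           "models": ["utils"], "utils": ["config"], "config": []}
tangled = {"models": ["training"], "training": ["utils"], "utils": ["models"]}

print(find_cycle(layered))   # None
print(find_cycle(tangled))   # ['models', 'training', 'utils', 'models']
```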
Real-World Example: Complete ML Project
Let's build a proper structure for a machine learning project:
ml_classification_project/      # Project root (any name works)
├── setup.py                    # For installation
├── main.py
└── ml_classification/          # The importable package
    ├── __init__.py
    ├── data/
    │   ├── __init__.py
    │   ├── loader.py           # Data loading
    │   └── preprocessor.py     # Data preprocessing
    ├── models/
    │   ├── __init__.py
    │   ├── base.py             # Base model class
    │   ├── random_forest.py
    │   └── neural_net.py
    ├── training/
    │   ├── __init__.py
    │   ├── trainer.py
    │   └── callbacks.py
    ├── evaluation/
    │   ├── __init__.py
    │   └── metrics.py
    └── utils/
        ├── __init__.py
        ├── config.py
        └── logging.py
data/loader.py:
"""Data loading utilities."""
import pandas as pd
from pathlib import Path
class DataLoader:
    def __init__(self, data_path: str):
        self.data_path = Path(data_path)
    def load_csv(self) -> pd.DataFrame:
        return pd.read_csv(self.data_path)
models/__init__.py:
"""Model implementations."""
from .random_forest import RandomForestModel
from .neural_net import NeuralNetModel
__all__ = ['RandomForestModel', 'NeuralNetModel']
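The tree lists models/base.py as the base model class, but I have not shown it. One possible shape, purely as an assumption on my part, is an abstract class that every model implements. BaseModel and the toy MajorityClassModel below are hypothetical names, not the contents of the real files.

```python
from abc import ABC, abstractmethod

class BaseModel(ABC):
    """Hypothetical shared interface for every model in the package."""

    @abstractmethod
    def fit(self, X, y):
        """Train on features X and labels y; return self."""

    @abstractmethod
    def predict(self, X):
        """Return one prediction per row of X."""

class MajorityClassModel(BaseModel):
    """Toy subclass: always predicts the most common training label."""

    def fit(self, X, y):
        self.majority_ = max(set(y), key=list(y).count)
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]

model = MajorityClassModel().fit([[0], [1], [2]], ["cat", "dog", "cat"])
predictions = model.predict([[5], [6]])
print(predictions)   # ['cat', 'cat']
```

A shared base class like this is what lets a trainer accept any model from the package without caring which one it got.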
Main script:
# main.py
from ml_classification.data.loader import DataLoader
from ml_classification.data.preprocessor import Preprocessor
from ml_classification.models import RandomForestModel
from ml_classification.training.trainer import Trainer
from ml_classification.evaluation.metrics import calculate_accuracy
# Load data
loader = DataLoader('data/train.csv')
data = loader.load_csv()
# Preprocess
preprocessor = Preprocessor()
X, y = preprocessor.prepare(data)
# Train model
model = RandomForestModel()
trainer = Trainer(model)
trainer.fit(X, y)
# Evaluate
accuracy = calculate_accuracy(model, X, y)
print(f"Accuracy: {accuracy}")
Making Your Package Installable
Create setup.py:
from setuptools import setup, find_packages
setup(
    name='ml_classification',
    version='1.0.0',
    packages=find_packages(),
    install_requires=[
        'numpy>=1.20.0',
        'pandas>=1.3.0',
        'scikit-learn>=1.0.0',
    ],
    author='Your Name',
    description='ML Classification Package',
    python_requires='>=3.8',
)
Install in development mode:
pip install -e .
Now you can import from anywhere:
from ml_classification.models import RandomForestModel
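To confirm the install actually registered, you can ask importlib.metadata for the version. A minimal sketch, where installed_version is a hypothetical helper and the second distribution name is deliberately fake:

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None

# Prints '1.0.0' after `pip install -e .`, otherwise None.
print(installed_version("ml_classification"))

not_installed = installed_version("distribution-that-does-not-exist-xyz")
print(not_installed)   # None
```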
Advanced Topics
Lazy Imports for Performance
Lazy imports delay module loading until actually needed:
# Instead of importing at the top of the file (slows startup):
# import heavy_ml_library
def train_model():
    # Import only when the function is called
    import heavy_ml_library
    model = heavy_ml_library.Model()
    return model
When to use:
- Large libraries that aren't always needed
- Speeding up script startup time
- Optional dependencies
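You can verify that a deferred import really is deferred by checking sys.modules before and after the first call. In this sketch the small standard library module colorsys stands in for a heavy ML library, and to_hsv is a hypothetical function name.

```python
import sys

sys.modules.pop("colorsys", None)   # start from a clean slate for the demo

def to_hsv(red, green, blue):
    # Deferred import: colorsys is loaded on the first call, not at startup.
    import colorsys
    return colorsys.rgb_to_hsv(red, green, blue)

loaded_before = "colorsys" in sys.modules
hsv = to_hsv(1.0, 0.0, 0.0)
loaded_after = "colorsys" in sys.modules

print(loaded_before, loaded_after)   # False True
print(hsv)                           # (0.0, 1.0, 1.0)
```

Later calls pay almost nothing, because the import statement finds the module already cached.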
Module Reloading (Development)
During development, reload modules without restarting Python:
import importlib
import my_module
# Make changes to my_module.py...
# Reload the module
importlib.reload(my_module)
Warning: Reloading can cause issues with existing instances. Use mainly in interactive sessions.
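Here is a self-contained sketch of the reload cycle. The module name reload_demo is made up, and rewriting the file from Python simulates you editing it in your editor.

```python
import importlib
import sys
import tempfile
from pathlib import Path

demo_dir = Path(tempfile.mkdtemp())
module_file = demo_dir / "reload_demo.py"
module_file.write_text("VALUE = 1\n")

sys.dont_write_bytecode = True   # avoid a stale .pyc masking the quick edit below
sys.path.insert(0, str(demo_dir))
importlib.invalidate_caches()
import reload_demo

old_value = reload_demo.VALUE
module_file.write_text("VALUE = 'edited'\n")   # simulate editing the file
importlib.reload(reload_demo)
new_value = reload_demo.VALUE

print(old_value, new_value)   # 1 edited
```

Note that old_value still holds 1: anything you copied out of the module before the reload keeps pointing at the old objects.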
Conditional Imports
Handle optional dependencies gracefully:
try:
    import matplotlib.pyplot as plt
    HAS_MATPLOTLIB = True
except ImportError:
    HAS_MATPLOTLIB = False
def plot_data(data):
    if not HAS_MATPLOTLIB:
        print("Matplotlib not available. Install with: pip install matplotlib")
        return
    plt.plot(data)
    plt.show()
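If you only need to know whether an optional package is available, without paying the cost of importing it, importlib.util.find_spec can answer that. A small sketch; has_module and describe_plot_support are hypothetical helper names.

```python
import importlib.util

def has_module(module_name):
    """Return True if module_name can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None

HAS_MATPLOTLIB = has_module("matplotlib")

def describe_plot_support():
    if HAS_MATPLOTLIB:
        return "plotting enabled"
    return "plotting disabled: pip install matplotlib"

print(has_module("json"))        # True
print(describe_plot_support())   # depends on whether matplotlib is installed
```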
Key Takeaways
- Modules are Python files; packages are directories with __init__.py
- Use absolute imports for clarity and maintainability
- Understand sys.path to troubleshoot import errors
- Organize code by functionality in separate packages
- Use __all__ to define your public API
- Avoid circular dependencies through proper architecture
- Make packages installable with setup.py for reusability
Proper module organization transforms chaotic code into professional, maintainable projects. Start small, refactor as you grow, and your future self will thank you!
Cover image by Claudio Schwarz on Unsplash