Enhancement Levels
Epochly uses a progressive enhancement model with five levels. Each level builds upon the previous, with increasing optimization capability.
Level Overview
| Level | Name | Value | Description |
|---|---|---|---|
| 0 | LEVEL_0_MONITOR | 0 | Lightweight monitoring only |
| 1 | LEVEL_1_THREADING | 1 | Thread pool optimizations |
| 2 | LEVEL_2_JIT | 2 | JIT compilation enabled |
| 3 | LEVEL_3_FULL | 3 | Full parallelism with sub-interpreters |
| 4 | LEVEL_4_GPU | 4 | GPU acceleration |
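For orientation, the sketch below shows the two ways a level is selected that appear throughout this page: programmatically with epochly.set_level() and via the EPOCHLY_LEVEL environment variable. It assumes only the integer values from the table above and the epochly.get_level() call shown later in this page.

```python
import os
import epochly

# Option 1: select a level in code, using the integer values from the table
epochly.set_level(2)  # LEVEL_2_JIT

# Option 2: select a level through the environment before startup:
#   export EPOCHLY_LEVEL=2
print(os.environ.get("EPOCHLY_LEVEL", "not set"))

# Confirm which level is currently active
print(epochly.get_level())
```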
Level 0: Monitor
Purpose
Level 0 collects baseline performance metrics without applying any optimization. This serves as the starting point for all functions.
What It Does
- Collects execution timing
- Monitors memory usage
- Identifies workload characteristics
- Negligible performance overhead
When to Use
- Diagnosing performance issues
- Establishing baseline metrics
- Validating that code works correctly before optimization
Configuration
import epochly

# Set to monitoring only
epochly.set_level(0)
export EPOCHLY_LEVEL=0
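A minimal sketch of establishing a baseline under Level 0, using only epochly.set_level() from this page plus standard-library timing; the workload function is illustrative.

```python
import time
import epochly

# Keep Epochly in monitoring-only mode so no optimization is applied
epochly.set_level(0)

def workload():
    return sum(i * i for i in range(1_000_000))

# Record a baseline timing to compare against higher levels later
start = time.perf_counter()
workload()
baseline = time.perf_counter() - start
print(f"Level 0 baseline: {baseline:.4f}s")
```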
Level 1: Threading
Purpose
Level 1 provides thread pool optimization for I/O-bound workloads. It enables concurrent execution of I/O operations.
What It Does
- Creates a thread pool for parallel I/O
- Enables concurrent file operations
- Parallelizes network requests
- Handles database queries concurrently
When Level 1 Helps
| Workload | Benefit |
|---|---|
| Multiple file reads | 2-10x speedup |
| Multiple API calls | 2-10x speedup |
| Database batch operations | 2-5x speedup |
| Mixed I/O and CPU | 1.5-3x speedup |
When Level 1 Does Not Help
- Pure CPU computation (use Level 2/3)
- Single long-running I/O operation
- Operations with strict sequential dependencies
Configuration
import epochly

# Set to threading level
epochly.set_level(1)

export EPOCHLY_LEVEL=1
export EPOCHLY_MAX_WORKERS=32
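To make the tables above concrete, this sketch shows the kind of I/O-bound fan-out a thread pool accelerates. It uses concurrent.futures directly to illustrate the mechanism Level 1 applies; it is not Epochly's internal code, and the URLs are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

URLS = ["https://example.com/a", "https://example.com/b", "https://example.com/c"]

def fetch(url: str) -> bytes:
    # Each call spends most of its time waiting on the network,
    # so running the calls on worker threads overlaps the waits.
    with urlopen(url, timeout=10) as resp:
        return resp.read()

# Conceptual equivalent of Level 1's thread pool (cap mirrors EPOCHLY_MAX_WORKERS)
with ThreadPoolExecutor(max_workers=32) as pool:
    pages = list(pool.map(fetch, URLS))
```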
Level 2: JIT Compilation
Purpose
Level 2 applies Just-In-Time compilation to numerical code paths, converting Python loops to native machine code.
What It Does
- Identifies hot code paths (frequently executed)
- Compiles numerical operations to machine code
- Caches compiled code for reuse
- Provides 58–193x speedup for numerical loops
JIT Backends
| Backend | Python Versions | Platform | Notes |
|---|---|---|---|
| Numba | 3.9-3.12 | All | Primary backend, installed automatically |
| Native JIT | 3.13+ | All | Built into Python 3.13 |
| Pyston-Lite | 3.9-3.10 | Linux x86_64 only | Installed automatically where supported |
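A quick, hedged way to check which of the backends listed above are importable in the current environment; the module names numba and pyston_lite are assumptions about how those packages are exposed, and this does not query Epochly's own backend selection (which EPOCHLY_JIT_BACKEND controls).

```python
import sys
from importlib.util import find_spec

# Availability of the backends listed above (module names are assumptions)
print("numba available:", find_spec("numba") is not None)
print("pyston_lite available:", find_spec("pyston_lite") is not None)
print("python version:", sys.version_info[:2])  # 3.13+ can use the native JIT
```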
When Level 2 Helps
| Code Pattern | Benefit |
|---|---|
| Simple numerical loops | 50–100x speedup |
| Nested loops with math | 30–60x speedup |
| Polynomial/mathematical operations | 100–200x speedup |
| Iterative algorithms | 50–150x speedup |
When Level 2 Does Not Help
- String operations
- Dictionary manipulation
- Object-heavy code
- Code with many Python C API calls
Configuration
export EPOCHLY_LEVEL=2
export EPOCHLY_JIT_HOT_PATH_THRESHOLD=1000
export EPOCHLY_JIT_BACKEND=auto
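A minimal example of the kind of loop Level 2 targets, assuming only epochly.set_level() from this page; whether and when this particular function gets compiled depends on the hot-path threshold shown in the configuration above.

```python
import epochly

epochly.set_level(2)

def horner(coeffs, x):
    # Tight numerical loop over floats: the pattern listed above under
    # "Polynomial/mathematical operations".
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc

coeffs = [1.0, -3.0, 2.0, 0.5]
# Hot-path detection needs repeated calls (EPOCHLY_JIT_HOT_PATH_THRESHOLD above),
# so the speedup appears only after the function has run many times.
total = sum(horner(coeffs, x * 0.001) for x in range(100_000))
print(total)
```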
Level 3: Multicore Parallelism
Purpose
Level 3 enables true multicore execution, bypassing Python's Global Interpreter Lock (GIL).
What It Does
- Distributes work across multiple CPU cores
- Uses sub-interpreters (Python 3.12+) or ProcessPool
- Provides near-linear scaling with core count
- Enables shared memory for efficient data transfer
Parallelism Methods
| Python Version | Method | Notes |
|---|---|---|
| 3.12+ | Sub-interpreters | True parallel Python, low overhead |
| 3.9-3.11 | ProcessPool | Process-based, higher overhead |
When Level 3 Helps
| Workload | Benefit |
|---|---|
| CPU-bound parallel tasks | 2-8x, scaling with core count |
| Data parallel operations | Near-linear scaling |
| Independent computations | Excellent scaling |
| Batch processing | Excellent scaling |
When Level 3 Does Not Help
- Single-threaded algorithms
- Operations with strict dependencies
- Small workloads (overhead exceeds benefit)
- Memory-bound operations
Configuration
export EPOCHLY_LEVEL=3
export EPOCHLY_MAX_WORKERS=8
export EPOCHLY_MEMORY_SHARED_SIZE=134217728  # 128MB
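To show the workload shape Level 3 accelerates, here is a conceptual sketch using the ProcessPool mechanism named in the table above for Python 3.9-3.11; it is not Epochly's internal implementation, and the simulate function is illustrative.

```python
from concurrent.futures import ProcessPoolExecutor

def simulate(seed: int) -> float:
    # Independent, CPU-bound work: the "data parallel" shape from the table above
    acc = 0.0
    for i in range(1, 200_000):
        acc += (seed % 7 + i) ** 0.5
    return acc

if __name__ == "__main__":
    # Conceptual equivalent of Level 3 on Python 3.9-3.11 (ProcessPool path);
    # worker count mirrors EPOCHLY_MAX_WORKERS=8 from the configuration above.
    with ProcessPoolExecutor(max_workers=8) as pool:
        results = list(pool.map(simulate, range(32)))
    print(sum(results))
```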
Level 4: GPU Acceleration
Purpose
Level 4 offloads suitable workloads to GPU for massive parallelism.
What It Does
- Detects NVIDIA GPUs via CuPy
- Automatically transfers data to GPU
- Executes array operations on GPU
- Returns results to CPU
Requirements
- NVIDIA GPU with CUDA support
- CuPy installed: pip install cupy-cuda12x
- Appropriate license tier
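A small check of the hardware/CuPy prerequisites above; it confirms only that CuPy can see a CUDA device and says nothing about Epochly's license-tier requirement.

```python
# Quick check of the CuPy/CUDA side of the requirements (does not check licensing)
try:
    import cupy as cp
    n = cp.cuda.runtime.getDeviceCount()
    print(f"CuPy import OK, {n} CUDA device(s) visible")
except Exception as exc:  # ImportError or CUDA runtime errors
    print(f"GPU path unavailable: {exc}")
```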
When Level 4 Helps
| Operation | Benefit |
|---|---|
| Large matrix operations (>4096x4096) | 7–10x speedup |
| Elementwise ops on large arrays (10M+) | 66–70x speedup |
| Batched convolutions | 14–19x speedup |
| Large array reductions (100M+) | 22–36x speedup |
When Level 4 Does Not Help
- Small arrays (<10,000 elements)
- Scalar operations
- Operations dominated by CPU-GPU transfer
- Code requiring frequent CPU-GPU synchronization
Configuration
export EPOCHLY_LEVEL=4
export EPOCHLY_GPU_ENABLED=true
export EPOCHLY_GPU_MEMORY_LIMIT=4096  # MB
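For concreteness, a short CuPy sketch of the large elementwise pattern from the table above; it shows the underlying mechanism rather than Epochly's own API, and the array size echoes the 10M-element row.

```python
import numpy as np
import cupy as cp

# Elementwise math on a large array: the pattern the "When Level 4 Helps"
# table lists with the biggest wins. Shown directly in CuPy to illustrate
# the mechanism; under Level 4 the CPU<->GPU transfer is handled for you.
x_cpu = np.random.rand(10_000_000).astype(np.float32)

x_gpu = cp.asarray(x_cpu)            # host -> device transfer
y_gpu = cp.sqrt(x_gpu) * 2.0 + 1.0   # runs on the GPU
y_cpu = cp.asnumpy(y_gpu)            # device -> host transfer

print(y_cpu[:5])
```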
Automatic Level Progression
Epochly automatically progresses through levels based on:
- Stability: No errors at current level
- Performance: Measurable improvement (>5% speedup)
- Compatibility: Required features available
- Resources: Sufficient memory/CPU available
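A small sketch for observing automatic progression, using only epochly.get_level() from this page; the cadence of level changes is internal to Epochly, so this loop simply reports whichever level is active at each iteration.

```python
import time
import epochly

def work():
    return sum(i * i for i in range(500_000))

# Run the workload repeatedly and watch which level Epochly settles on.
for i in range(10):
    work()
    print(f"iteration {i}: level {epochly.get_level()}")
    time.sleep(0.5)
```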
Manual Level Control
Override automatic progression:
import epochly

# Force specific level
epochly.set_level(3)

# Check current level
current = epochly.get_level()