Epochly Optimization Guide: Workload Analysis

How to analyze your Python workloads to predict which enhancement levels will apply and expected performance gains.

Overview

Understanding your workload type is essential for choosing the right enhancement level:

| Workload Type | Characteristic | Recommended Level |
| --- | --- | --- |
| I/O-bound | Waiting for external resources | Level 1 |
| CPU-bound loops | Numerical computation | Level 2 |
| Data-parallel | Independent computations | Level 3 |
| Large arrays | Massive parallelism needed | Level 4 |

Identifying Workload Type

I/O-Bound Workloads

Characteristics:

  • Time spent waiting for network, disk, database
  • Low CPU utilization during execution
  • Performance limited by external resource latency

```python
import requests

# I/O-bound: network requests
def fetch_data(urls):
    return [requests.get(url) for url in urls]

# I/O-bound: file operations
def read_files(paths):
    return [open(p).read() for p in paths]

# I/O-bound: database queries (db is an open database connection)
def query_db(queries):
    return [db.execute(q) for q in queries]
```

CPU-Bound Workloads

Characteristics:

  • High CPU utilization
  • Time spent in computation
  • Limited by processing speed

```python
# CPU-bound: numerical computation
def compute_heavy(data):
    result = 0
    for x in data:
        result += x ** 2 + x ** 0.5
    return result

# CPU-bound: data transformation
def transform(records):
    return [complex_transform(r) for r in records]
```

Memory-Bound Workloads

Characteristics:

  • Large data sets
  • Performance limited by memory bandwidth
  • CPU often waiting for memory

```python
# Memory-bound: chunked access over a large array (arr is a NumPy array)
def memory_heavy(arr):
    return [arr[i:i + 1000].sum() for i in range(0, len(arr), 1000)]
```

Using Profiling Tools

Epochly Built-in Profiling

```python
import epochly

# Enable monitoring
epochly.set_level(0)

with epochly.monitoring_context() as metrics:
    result = my_function(data)

# Analyze workload
print(f"CPU time: {metrics.get('cpu_time_ms')} ms")
print(f"Wait time: {metrics.get('wait_time_ms')} ms")
print(f"Memory peak: {metrics.get('memory_peak_mb')} MB")

# Determine type
cpu_ratio = metrics.get('cpu_time_ms') / metrics.get('total_time_ms')
if cpu_ratio > 0.8:
    print("CPU-bound: use Level 2 or 3")
elif cpu_ratio < 0.3:
    print("I/O-bound: use Level 1")
else:
    print("Mixed workload")
```

Python cProfile

```python
import cProfile
import pstats

def analyze_with_cprofile():
    profiler = cProfile.Profile()
    profiler.enable()
    result = my_function(data)
    profiler.disable()
    stats = pstats.Stats(profiler)
    stats.sort_stats('cumulative')
    stats.print_stats(10)  # Top 10 functions by cumulative time
    return result
```
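
For a concrete, self-contained run of the same approach, the following profiles one of the CPU-bound examples from earlier (rather than the my_function placeholder) and captures the top entries. This uses only the standard library; exact timings will vary by machine:

```python
import cProfile
import pstats
import io

def compute_heavy(data):
    return sum(x ** 2 + x ** 0.5 for x in data)

profiler = cProfile.Profile()
profiler.enable()
compute_heavy(range(100_000))
profiler.disable()

# Capture the report in a string instead of printing to stdout
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats('cumulative').print_stats(5)
print(stream.getvalue())
```

Look for your own functions near the top of the cumulative-time listing; if most time is inside library I/O calls instead, the workload is likely I/O-bound.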

line_profiler

```python
# Install: pip install line_profiler
@profile  # Run with: kernprof -l -v script.py
def my_function(data):
    result = []
    for x in data:  # Timed line by line
        result.append(x ** 2)
    return result
```

Interpreting Epochly Metrics

Key Metrics

| Metric | Meaning | Optimization Hint |
| --- | --- | --- |
| High cpu_time | CPU-bound | Level 2 or 3 |
| High wait_time | I/O-bound | Level 1 |
| High memory_peak | Memory-bound | Batch processing |
| Many loop_iterations | Hot loops | Level 2 |
| Large data_size | Parallelizable | Level 3 or 4 |
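
These heuristics can be folded into a small triage helper. A minimal sketch in plain Python; the threshold values are illustrative placeholders rather than Epochly defaults, and optimization_hints is a hypothetical helper name, not part of the Epochly API:

```python
def optimization_hints(metrics):
    """Map profiling metrics (a plain dict) to the hints from the table above.

    Thresholds here are illustrative placeholders, not Epochly defaults.
    """
    hints = []
    total = metrics.get('total_time_ms', 1)
    if metrics.get('cpu_time_ms', 0) / total > 0.8:
        hints.append('Level 2 or 3')        # CPU-bound
    if metrics.get('wait_time_ms', 0) / total > 0.5:
        hints.append('Level 1')             # I/O-bound
    if metrics.get('memory_peak_mb', 0) > 1024:
        hints.append('Batch processing')    # Memory-bound
    if metrics.get('loop_iterations', 0) > 10_000:
        hints.append('Level 2')             # Hot loops
    if metrics.get('data_size', 0) > 100_000:
        hints.append('Level 3 or 4')        # Parallelizable
    return hints

print(optimization_hints({'total_time_ms': 100, 'cpu_time_ms': 90}))
```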

Example Analysis

```python
import epochly

@epochly.optimize(level=0)
def analyze_me(data):
    return process(data)

# Run multiple times to gather stats
for _ in range(100):
    analyze_me(test_data)

metrics = epochly.get_metrics()

# Analyze
if metrics.get('avg_loop_iterations', 0) > 10000:
    print("Hot loops detected - consider Level 2")
if metrics.get('avg_data_size', 0) > 1_000_000:
    print("Large data - consider Level 3 or 4")
if metrics.get('io_wait_ratio', 0) > 0.5:
    print("I/O-bound - consider Level 1")
```

Optimization Strategy Selection

Decision Tree

```
Start
├─ Is it I/O-bound? (network, disk, DB)
│  └─ YES → Level 1 (Threading)
├─ Is it CPU-bound with hot loops?
│  └─ YES → Level 2 (JIT)
├─ Is data large and parallelizable?
│  └─ YES → Level 3 (Multicore)
├─ Are there large array operations?
│  └─ YES → Level 4 (GPU) [if licensed]
└─ Mixed or unclear → Start with Level 0, analyze
```

Code-Based Decision

```python
import epochly

def select_optimal_level(workload_metrics):
    """Select optimal level based on workload analysis."""
    io_ratio = workload_metrics.get('io_wait_ratio', 0)
    loop_iters = workload_metrics.get('avg_loop_iterations', 0)
    data_size = workload_metrics.get('avg_data_size', 0)
    if io_ratio > 0.5:
        return 1  # I/O-bound
    elif loop_iters > 10000:
        if data_size > 10_000_000 and epochly.check_feature('gpu_acceleration'):
            return 4  # GPU for large arrays
        elif data_size > 100_000:
            return 3  # Parallel for large data
        else:
            return 2  # JIT for hot loops
    elif data_size > 100_000:
        return 3  # Parallel for large data
    else:
        return 0  # Monitor only (not enough work)
```

Benchmarking Levels

Comparative Benchmarking

```python
import epochly
import time

def benchmark_levels(func, data, levels=(0, 1, 2, 3)):
    results = {}
    for level in levels:
        with epochly.optimize_context(level=level):
            # Warmup
            func(data)
            # Measure
            start = time.perf_counter()
            for _ in range(10):
                func(data)
            elapsed = time.perf_counter() - start
        results[level] = elapsed / 10
    return results

# Compare
timings = benchmark_levels(my_function, test_data)
print("Level | Time (s) | Speedup")
baseline = timings[0]
for level, t in sorted(timings.items()):
    speedup = baseline / t if t > 0 else 0
    print(f"  {level}   | {t:.4f}   | {speedup:.2f}x")
```
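
The same warmup-then-measure pattern is useful for comparing any two implementations, with or without Epochly. A standalone sketch using only the standard library (bench is a hypothetical helper name; a median over several runs is less noise-sensitive than a single timing):

```python
import time
import statistics

def bench(func, *args, warmup=1, repeats=5):
    """Time func(*args): discard warmup runs, return the median of repeats."""
    for _ in range(warmup):
        func(*args)
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        func(*args)
        times.append(time.perf_counter() - start)
    return statistics.median(times)

# Example: compare squaring every element against the builtin sum
data = list(range(100_000))
t_square = bench(lambda xs: [x * x for x in xs], data)
t_sum = bench(sum, data)
print(f"square each: {t_square:.6f}s  sum: {t_sum:.6f}s")
```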

Common Patterns

Pattern: API Client

```python
# I/O-bound: use Level 1
@epochly.optimize(level=1)
def api_client(endpoints):
    return [call_api(ep) for ep in endpoints]
```

Pattern: Data Pipeline

```python
# Mixed: Level 1 for I/O, Level 2 for transform
def data_pipeline(sources):
    with epochly.optimize_context(level=1):
        raw_data = [fetch(s) for s in sources]
    with epochly.optimize_context(level=2):
        transformed = [transform(d) for d in raw_data]
    return transformed
```

Pattern: Scientific Computing

```python
# CPU-bound with large arrays: Level 3 or 4
@epochly.optimize(level=3)
def scientific_compute(matrix):
    # Parallel processing for matrix operations
    return complex_calculation(matrix)
```