
Workload Analysis

How to analyze workloads and choose the right Epochly enhancement level.


Overview

Understanding your workload type is essential for choosing the right enhancement level:

Workload Type   | Characteristic                 | Recommended Level
I/O-bound       | Waiting for external resources | Level 1
CPU-bound loops | Numerical computation          | Level 2
Data-parallel   | Independent computations       | Level 3
Large arrays    | Massive parallelism needed     | Level 4
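
Once the workload type is identified, the chosen level is applied with the epochly.optimize decorator or optimize_context context manager used throughout this guide. A minimal sketch for a data-parallel workload (process_records and transform are placeholder names):

import epochly

# Data-parallel work over independent records: Level 3 (multicore)
@epochly.optimize(level=3)
def process_records(records):
    return [transform(r) for r in records]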

Identifying Workload Type

I/O-Bound Workloads

Characteristics:

  • Time spent waiting for network, disk, database
  • Low CPU utilization during execution
  • Performance limited by external resource latency
import requests

# I/O-bound: network requests
def fetch_data(urls):
    return [requests.get(url) for url in urls]

# I/O-bound: file operations
def read_files(paths):
    return [open(p).read() for p in paths]

# I/O-bound: database queries (db is an existing connection object)
def query_db(queries):
    return [db.execute(q) for q in queries]

CPU-Bound Workloads

Characteristics:

  • High CPU utilization
  • Time spent in computation
  • Limited by processing speed
# CPU-bound: numerical computation
def compute_heavy(data):
    result = 0
    for x in data:
        result += x ** 2 + x ** 0.5
    return result

# CPU-bound: data transformation
def transform(records):
    return [complex_transform(r) for r in records]

Memory-Bound Workloads

Characteristics:

  • Large data sets
  • Performance limited by memory bandwidth
  • CPU often waiting for memory
# Memory-bound: large array access
def memory_heavy(large_array):
    return [large_array[i:i+1000].sum() for i in range(0, len(large_array), 1000)]
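
The metrics table below recommends batch processing for memory-bound work. A minimal sketch that streams fixed-size chunks with a generator so only one chunk's partial result is held at a time (the chunk size is illustrative):

def memory_heavy_batched(large_array, chunk_size=1000):
    # Yield partial sums one chunk at a time instead of building a full list
    for i in range(0, len(large_array), chunk_size):
        yield large_array[i:i + chunk_size].sum()

# Consume lazily, e.g. total = sum(memory_heavy_batched(large_array))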

Using Profiling Tools

Epochly Built-in Profiling

import epochly

# Enable monitoring
epochly.set_level(0)

with epochly.monitoring_context() as metrics:
    result = my_function(data)

# Analyze workload
print(f"CPU time: {metrics.get('cpu_time_ms')} ms")
print(f"Wait time: {metrics.get('wait_time_ms')} ms")
print(f"Memory peak: {metrics.get('memory_peak_mb')} MB")

# Determine type
cpu_ratio = metrics.get('cpu_time_ms') / metrics.get('total_time_ms')
if cpu_ratio > 0.8:
    print("CPU-bound: use Level 2 or 3")
elif cpu_ratio < 0.3:
    print("I/O-bound: use Level 1")
else:
    print("Mixed workload")

Python cProfile

import cProfile
import pstats

def analyze_with_cprofile():
    profiler = cProfile.Profile()
    profiler.enable()
    result = my_function(data)
    profiler.disable()

    stats = pstats.Stats(profiler)
    stats.sort_stats('cumulative')
    stats.print_stats(10)  # Top 10 functions
    return result

line_profiler

# Install: pip install line_profiler
@profile  # Use with: kernprof -l -v script.py
def my_function(data):
    result = []
    for x in data:  # Line-by-line timing
        result.append(x ** 2)
    return result

Interpreting Epochly Metrics

Key Metrics

Metric               | Meaning        | Optimization Hint
High cpu_time        | CPU-bound      | Level 2 or 3
High wait_time       | I/O-bound      | Level 1
High memory_peak     | Memory-bound   | Batch processing
Many loop_iterations | Hot loops      | Level 2
Large data_size      | Parallelizable | Level 3 or 4

Example Analysis

import epochly

@epochly.optimize(level=0)
def analyze_me(data):
    return process(data)

# Run multiple times to gather stats
for _ in range(100):
    analyze_me(test_data)

metrics = epochly.get_metrics()

# Analyze
if metrics.get('avg_loop_iterations', 0) > 10000:
    print("Hot loops detected - consider Level 2")
if metrics.get('avg_data_size', 0) > 1_000_000:
    print("Large data - consider Level 3 or 4")
if metrics.get('io_wait_ratio', 0) > 0.5:
    print("I/O-bound - consider Level 1")

Optimization Strategy Selection

Decision Tree

Start
├─ Is it I/O-bound? (network, disk, DB)
│    └─ YES → Level 1 (Threading)
├─ Is it CPU-bound with hot loops?
│    └─ YES → Level 2 (JIT)
├─ Is data large and parallelizable?
│    └─ YES → Level 3 (Multicore)
├─ Are there large array operations?
│    └─ YES → Level 4 (GPU) [if licensed]
└─ Mixed or unclear → Start with Level 0, analyze

Code-Based Decision

import epochly

def select_optimal_level(workload_metrics):
    """Select the optimal level based on workload analysis."""
    io_ratio = workload_metrics.get('io_wait_ratio', 0)
    loop_iters = workload_metrics.get('avg_loop_iterations', 0)
    data_size = workload_metrics.get('avg_data_size', 0)

    if io_ratio > 0.5:
        return 1  # I/O-bound
    elif loop_iters > 10000:
        if data_size > 10_000_000 and epochly.check_feature('gpu_acceleration'):
            return 4  # GPU for large arrays
        elif data_size > 100_000:
            return 3  # Parallel for large data
        else:
            return 2  # JIT for hot loops
    elif data_size > 100_000:
        return 3  # Parallel for large data
    else:
        return 0  # Monitor only (not enough work)
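
One way to wire this into the monitoring flow shown earlier, assuming epochly.get_metrics() returns the same keys used above:

import epochly

metrics = epochly.get_metrics()
level = select_optimal_level(metrics)

# Apply the selected level to the hot path
with epochly.optimize_context(level=level):
    result = my_function(test_data)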

Benchmarking Levels

Comparative Benchmarking

import epochly
import time

def benchmark_levels(func, data, levels=(0, 1, 2, 3)):
    results = {}
    for level in levels:
        with epochly.optimize_context(level=level):
            # Warmup
            func(data)
            # Measure
            start = time.perf_counter()
            for _ in range(10):
                func(data)
            elapsed = time.perf_counter() - start
        results[level] = elapsed / 10
    return results

# Compare
timings = benchmark_levels(my_function, test_data)
print("Level | Time (s) | Speedup")
baseline = timings[0]
for level, t in sorted(timings.items()):
    speedup = baseline / t if t > 0 else 0
    print(f"  {level}   | {t:.4f}   | {speedup:.2f}x")

Common Patterns

Pattern: API Client

import epochly

# I/O-bound: use Level 1
@epochly.optimize(level=1)
def api_client(endpoints):
    return [call_api(ep) for ep in endpoints]

Pattern: Data Pipeline

import epochly

# Mixed: Level 1 for I/O, Level 2 for transform
def data_pipeline(sources):
    with epochly.optimize_context(level=1):
        raw_data = [fetch(s) for s in sources]
    with epochly.optimize_context(level=2):
        transformed = [transform(d) for d in raw_data]
    return transformed

Pattern: Scientific Computing

import epochly

# CPU-bound with large arrays: Level 3 or 4
@epochly.optimize(level=3)
def scientific_compute(matrix):
    # Parallel processing for matrix operations
    return complex_calculation(matrix)
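
When very large arrays are involved and the GPU feature is licensed, Level 4 can be selected instead. A sketch that falls back to Level 3 otherwise, reusing the check_feature call from the decision code above (large_array_compute and complex_calculation are placeholder names):

import epochly

# Level 4 only if the GPU feature is available, otherwise multicore
level = 4 if epochly.check_feature('gpu_acceleration') else 3

@epochly.optimize(level=level)
def large_array_compute(matrix):
    return complex_calculation(matrix)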