Data science in Python means spending time waiting. A DataFrame groupby on 10 million rows. A NumPy reduction across 100 million elements. A scikit-learn cross-validation grid search that runs overnight.
Epochly accelerates these workloads without changing your code.
## Where Data Science Time Goes
A typical data science workflow breaks down into:
| Phase | Tools | CPU Profile |
|---|---|---|
| Data loading | Pandas read_csv, read_parquet | I/O-bound |
| Data transformation | Pandas groupby, apply, merge | Mixed (Python + C) |
| Feature engineering | Custom Python loops, NumPy | CPU-bound Python |
| Model training | scikit-learn fit, cross_val_score | CPU-bound (mixed) |
| Evaluation | NumPy, custom metrics | CPU-bound Python |
The bottleneck shifts depending on your stage. Data loading is I/O-bound (Epochly won't help). But feature engineering, custom transformations, and numerical computations are CPU-bound Python -- exactly what Epochly optimizes.
## Accelerating NumPy Workflows
Large array operations benefit from GPU acceleration (Level 4).
```python
import epochly
import numpy as np

@epochly.optimize
def compute_features(data):
    """Feature engineering on large arrays."""
    normalized = (data - np.mean(data, axis=0)) / np.std(data, axis=0)
    correlations = np.corrcoef(normalized.T)
    eigenvalues = np.linalg.eigvalsh(correlations)
    return eigenvalues
```
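Decorator aside, `compute_features` is plain NumPy, so you can sanity-check it without Epochly installed. One useful invariant: a correlation matrix has ones on its diagonal, so its eigenvalues sum to the number of features. A minimal check (the random data and sizes are illustrative):

```python
import numpy as np

def compute_features(data):
    """Same pipeline as above, without the decorator."""
    normalized = (data - np.mean(data, axis=0)) / np.std(data, axis=0)
    correlations = np.corrcoef(normalized.T)
    return np.linalg.eigvalsh(correlations)

rng = np.random.default_rng(0)
data = rng.normal(size=(10_000, 8))   # 10K samples, 8 features
eigenvalues = compute_features(data)

# Eigenvalues of the correlation matrix sum to its trace,
# which is the number of features.
assert np.isclose(eigenvalues.sum(), 8.0)
```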
### Expected speedups by data size
| Array Size | Operation Type | Speedup |
|---|---|---|
| 100K elements | Elementwise | 2.3x |
| 1M elements | Elementwise | 12.3x |
| 10M+ elements | Elementwise | 65-70x |
| 1024x1024 matrix | Matrix multiply | 9.5x |
| 100M elements | Reduction (sum, mean) | 35.9x |
The full speedup requires 10M+ elements for elementwise operations; below roughly 1M elements, GPU kernel launch overhead exceeds the benefit, and the 1M-10M range sees intermediate gains.
## Accelerating Pandas Workflows
Pandas operations that call back into Python (apply with custom functions, iterrows, itertuples) are prime targets for JIT compilation (Level 2).
```python
import epochly
import pandas as pd

@epochly.optimize
def process_transactions(df):
    """Custom transformation on DataFrame."""
    results = []
    for _, row in df.iterrows():
        score = row['amount'] * 0.05 + row['frequency'] ** 1.5
        if score > row['threshold']:
            results.append(score)
        else:
            results.append(0.0)
    return results
```
The iterrows loop above runs as interpreted Python. Epochly's Level 2 JIT compiles the numerical inner loop to native code: 58-193x speedup (113x average on numerical loops).
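Even without a JIT, scoring logic like this can often be rewritten as whole-column operations. A hedged sketch (small illustrative frame, not the benchmark workload) that reproduces the loop's output with vectorized Pandas:

```python
import numpy as np
import pandas as pd

def process_transactions_loop(df):
    """Row-by-row version, as in the example above."""
    results = []
    for _, row in df.iterrows():
        score = row['amount'] * 0.05 + row['frequency'] ** 1.5
        results.append(score if score > row['threshold'] else 0.0)
    return results

def process_transactions_vectorized(df):
    """Same logic as whole-column operations."""
    score = df['amount'] * 0.05 + df['frequency'] ** 1.5
    return score.where(score > df['threshold'], 0.0).tolist()

df = pd.DataFrame({
    'amount': [100.0, 2500.0, 40.0],
    'frequency': [4.0, 9.0, 1.0],
    'threshold': [10.0, 200.0, 50.0],
})
# Both versions agree; only the vectorized one stays in C.
assert np.allclose(process_transactions_loop(df),
                   process_transactions_vectorized(df))
```

Vectorizing by hand is a manual rewrite; the point of the JIT path is getting comparable native-code execution while keeping the loop as written.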
### When Pandas won't speed up
- `df.groupby().sum()` -- already uses C internally (~1.0x)
- `df.merge()` -- already optimized in the Pandas C engine (~1.0x)
- `pd.read_csv()` -- I/O-bound; no CPU optimization helps
Profile first. If the bottleneck is a Python-level loop or custom apply function, Epochly helps. If it's a built-in Pandas operation, it's already fast.
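The standard library's `cProfile` is enough for this first pass. In this sketch, an illustrative per-row scoring loop (a hypothetical function, not an Epochly API) shows up at the top of the report, marking it as a Python-level bottleneck:

```python
import cProfile
import io
import pstats

def slow_feature(rows):
    """Illustrative CPU-bound Python: a per-row scoring loop."""
    return [r * 0.05 + r ** 1.5 for r in rows]

profiler = cProfile.Profile()
profiler.enable()
slow_feature(list(range(100_000)))
profiler.disable()

# Print the top entries sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

If the top entries are instead built-in Pandas or NumPy calls, the time is already being spent in C, and no Python-level optimizer will move the needle.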
## Accelerating scikit-learn
scikit-learn training involves numerical computation that can benefit from multi-core parallelism (Level 3).
```python
import epochly
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

@epochly.optimize
def evaluate_model(X, y):
    """Cross-validation with parallel execution."""
    clf = RandomForestClassifier(n_estimators=100)
    scores = cross_val_score(clf, X, y, cv=5, scoring='accuracy')
    return scores.mean(), scores.std()
```
Level 3 parallel execution distributes work across CPU cores: 8-12x on 16 cores for CPU-bound workloads (ProcessPool, 50-60% parallel efficiency).
Note: scikit-learn already supports n_jobs=-1 for built-in parallelism. Epochly helps most when you have custom preprocessing pipelines or evaluation loops that aren't covered by scikit-learn's own parallelism.
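scikit-learn's built-in parallelism is worth enabling either way. A minimal sketch (small synthetic dataset and estimator counts chosen for illustration) passing `n_jobs=-1` so both tree building and the CV folds fan out across cores:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# n_jobs=-1 uses all available cores for tree building,
# and again for the five cross-validation folds.
clf = RandomForestClassifier(n_estimators=50, n_jobs=-1, random_state=0)
scores = cross_val_score(clf, X, y, cv=5, n_jobs=-1)
print(scores.mean())
```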
## What Epochly Does NOT Help With in Data Science
Transparency means acknowledging limitations:
- **Pandas built-in operations**: `groupby`, `merge`, `pivot_table` use C engines internally. ~1.0x improvement.
- **Data loading**: bound by disk/network speed. ~1.0x.
- **Small datasets**: arrays under 1M elements or DataFrames under 10K rows. Overhead exceeds the benefit.
- **Already vectorized NumPy**: operations like `np.matmul` already use optimized BLAS libraries internally. ~1.0x.
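The BLAS point is easy to verify: `np.matmul` already does in native code what a hand-written Python loop does slowly, so there is nothing left for a Python-level optimizer to reclaim. A toy equivalence check (sizes are illustrative):

```python
import numpy as np

def matmul_python(a, b):
    """Naive triple loop -- the kind of code a JIT can help."""
    n, k = a.shape
    _, m = b.shape
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for p in range(k):
                out[i, j] += a[i, p] * b[p, j]
    return out

rng = np.random.default_rng(1)
a = rng.normal(size=(8, 8))
b = rng.normal(size=(8, 8))

# np.matmul dispatches to BLAS; results agree to float precision.
assert np.allclose(matmul_python(a, b), a @ b)
```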
## Getting Started
```bash
# Install Epochly
pip install epochly
```

```python
# Add to your data science workflow
import epochly

@epochly.optimize
def your_feature_engineering(data):
    # Your existing code -- no changes needed
    pass
```
Start with Level 0 monitoring to understand where your time goes. Then let Epochly graduate to the appropriate optimization level.
Benchmark conditions: Python 3.12.3, Linux WSL2, 16 cores, NVIDIA Quadro M6000 24GB (CUDA 12.1). January 29, 2026 comprehensive benchmark report.