Benchmarks
These benchmarks are small, reproducible checks for MolScope's core operations: PDB parsing, residue contact maps, and graph export. They are meant to give a practical sense of overhead on the bundled sample structures, not to replace domain-specific benchmark suites for MDAnalysis, MDTraj, RDKit, or simulation engines.
Run them locally from the repository root:
uv run python scripts/benchmark_core.py
Local reference run
Measured on May 26, 2026 with Python 3.12.11 on macOS arm64. Each row reports the median of 7 timed runs after one warm-up call.
| Structure | Operation | Median (ms) | Range (ms) | Notes |
|---|---|---|---|---|
| 1fqy | parse PDB | 3.37 | 3.27-4.09 | single 1,661-atom model |
| 1aml | parse PDB models | 23.81 | 23.35-24.87 | 20-model NMR ensemble, 11,960 atoms total |
| 1fqy CA | distance matrix (NumPy) | 1.25 | 1.13-1.54 | 226 x 226 dense CA matrix |
| 1fqy | residue contact map | 1.49 | 1.49-1.55 | 226 x 226 residue map |
| 1aml model 1 | residue contact map | 0.19 | 0.18-0.22 | 40 x 40 residue map |
| 1fqy | graph export | 0.96 | 0.90-0.99 | inferred atom-level bonds |
| 1aml model 1 | graph export | 0.43 | 0.40-1.22 | inferred atom-level bonds |
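
The timing protocol described above (one warm-up call, then the median of 7 timed runs) can be sketched with the standard library alone. This is an illustrative harness, not the contents of `scripts/benchmark_core.py`; the `bench` helper and the stand-in workload are hypothetical names introduced here.

```python
import statistics
import time


def bench(func, *, repeats=7, warmups=1):
    """Return (median_ms, min_ms, max_ms) over `repeats` timed calls to func()."""
    for _ in range(warmups):
        func()  # warm-up call; its timing is discarded
    samples_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples_ms), min(samples_ms), max(samples_ms)


# Stand-in workload; a real run would time a parsing or analysis call instead.
median_ms, low_ms, high_ms = bench(lambda: sum(i * i for i in range(10_000)))
print(f"median {median_ms:.2f} ms, range {low_ms:.2f}-{high_ms:.2f} ms")
```

Reporting the median alongside the range keeps a single slow run (for example, the 1.22 ms outlier in the last row) from distorting the headline number.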
What is measured
- `parse PDB`: fixed-column PDB parsing into `Molecule` objects.
- `parse PDB models`: all `MODEL` records from the NMR ensemble.
- `distance matrix`: `alpha_carbons().distance_matrix(backend="numpy")`.
- `residue contact map`: `contact_map(cutoff=8.0, level="residue")`.
- `graph export`: `to_graph()` using explicit or inferred bonds.
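
To give a sense of what the distance-matrix and contact-map rows compute, here is a minimal NumPy sketch of a dense pairwise distance matrix thresholded at 8.0 Å. It is not MolScope's implementation: the function names are reused only for readability, the coordinates are made up, and the choice of `<=` at the cutoff and of one point per residue are assumptions.

```python
import numpy as np


def distance_matrix(coords):
    """Dense pairwise Euclidean distances for an (N, 3) coordinate array."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]  # broadcast to shape (N, N, 3)
    return np.sqrt((diff * diff).sum(axis=-1))


def contact_map(coords, cutoff=8.0):
    """Boolean (N, N) map, True where two points lie within `cutoff`."""
    return distance_matrix(coords) <= cutoff


# Four hypothetical CA positions spaced 5 A apart along the x axis.
ca = np.array(
    [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [10.0, 0.0, 0.0], [15.0, 0.0, 0.0]]
)
contacts = contact_map(ca, cutoff=8.0)
print(contacts.astype(int))
```

With this spacing, only the diagonal and directly adjacent points fall within the cutoff, since the next-nearest neighbours are 10 Å apart.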
The benchmark script keeps parsed molecules in memory for contact-map and graph export tests so those rows measure analysis/export cost rather than file I/O. If PyTorch or CuPy is installed, the script also adds optional dense-backend rows for CPU/GPU distance matrices.
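
One common way to add rows only for installed backends is to probe for the modules before importing them. The sketch below assumes nothing about how the benchmark script actually does this; `available_backends` is a hypothetical helper, and `torch` and `cupy` are the usual import names for PyTorch and CuPy.

```python
import importlib.util


def available_backends(candidates=("numpy", "torch", "cupy")):
    """Return the candidate module names that can be imported in this environment."""
    return [name for name in candidates if importlib.util.find_spec(name) is not None]


print(available_backends())
```

A successful probe only shows that the package is importable; whether a GPU is usable still has to be checked at run time, for example with `torch.cuda.is_available()`.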
Interpreting the numbers
The bundled examples are intentionally small enough for teaching and CI. Larger systems scale differently:
- PDB parsing scales approximately with line count.
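- Residue-level contact maps are much smaller than atom-level dense maps: for 1fqy, a 226 x 226 residue map has roughly 54 times fewer entries than a 1,661 x 1,661 atom-level map.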
- Atom-level dense contact outputs are still `O(N^2)` in memory.
- Bond/contact searches use SciPy KD-tree paths when SciPy is installed and fall back to pure NumPy paths otherwise.
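
The quadratic memory growth noted above is easy to estimate up front. The sketch below assumes 8-byte (float64) entries, which may not match the dtype MolScope actually uses; the 100,000-atom system is a hypothetical example, while the other two sizes come from the reference table.

```python
def dense_matrix_bytes(n_points, itemsize=8):
    """Bytes needed for an n_points x n_points dense matrix (itemsize bytes per entry)."""
    return n_points * n_points * itemsize


sizes = [
    ("1fqy CA atoms", 226),
    ("1fqy all atoms", 1_661),
    ("hypothetical 100k-atom system", 100_000),
]
for label, n in sizes:
    mib = dense_matrix_bytes(n) / 2**20
    print(f"{label}: {n:,} points -> {mib:,.1f} MiB")
```

Under these assumptions the CA matrix needs about 0.4 MiB and the all-atom 1fqy matrix about 21 MiB, while a 100,000-atom dense matrix would need 80 GB, which is why sparse or KD-tree based searches matter for larger systems.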
Use the benchmark script as a smoke test when changing parsers, contact maps, or graph export internals.