Performance
This page summarizes the current benchmark results for Atompack. The main pattern is consistent across the reported comparisons:
- Atompack is strongest on read-heavy serving paths, especially once access becomes less sequential.
- Atompack also leads the write-throughput slices used by the report, particularly when using the native batch ingestion APIs.
- Artifact size stays close to HDF5 SOA and substantially smaller than the LMDB and ASE baselines used in this repository.
Baselines
The benchmark set is intentionally mixed. It is not only comparing Atompack to one alternative storage engine, but to the main layout families that show up in atomistic ML codebases:
- `hdf5_soa` is the conventional scientific-array baseline: one chunked HDF5 dataset per field such as positions, atomic numbers, energies, and forces. Its main trade-off is that it is very compact and works well for bulk array access, but shuffled single-molecule reads can force extra chunk traffic and more Python-side reconstruction work.
- `lmdb_packed` is a key-value baseline where each molecule is serialized into one compact binary record and stored in LMDB. That is closer to the access pattern used by training dataloaders and usually behaves better than HDF5 on random reads, but it pays per-record encode/decode overhead and is less storage-efficient than a compact array-oriented layout.
- `lmdb_pickle` is a common Python-first LMDB pattern where each entry is a pickled dict of numpy arrays and scalar metadata. It is flexible and easy to integrate into existing repos, but pickle framing and object reconstruction are exactly the costs that show up in the read and size numbers.
- `ase_sqlite` and `ase_lmdb` are included as ecosystem reference baselines. They are not designed first as high-throughput training stores, but ASE is still the main interchange layer in many atomistic and materials ML repositories: datasets are often published as `ase.Atoms` or `ase.db` collections, and training code frequently starts from an ASE-based reader or preprocessing step. That makes ASE an important baseline for practical compatibility, even when its database backends trade away throughput for generality and broad tool support.
The ASE results answer a different question from the HDF5 and LMDB baselines. They show the cost of staying on the most common ecosystem path end to end, while the HDF5 and LMDB results show the more specialized storage trade-offs that practitioners already use when they start optimizing training throughput.
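To make the difference between the `lmdb_packed` and `lmdb_pickle` styles described above more concrete, here is a minimal sketch using only the Python standard library. The record layout, the function names, and the use of `array` in place of numpy are all assumptions made for illustration; this is not the format used by the benchmark baselines or by Atompack itself.

```python
import pickle
import struct
from array import array

# Hypothetical layout for illustration only: atom count (uint32) and energy
# (float64), followed by raw atomic numbers, positions, and forces.
HEADER = struct.Struct("<Id")


def pack_molecule(numbers, positions, forces, energy):
    """Packed style: one compact binary record per molecule."""
    n = len(numbers)
    if len(positions) != 3 * n or len(forces) != 3 * n:
        raise ValueError("positions and forces must hold 3 values per atom")
    return b"".join((
        HEADER.pack(n, energy),
        array("B", numbers).tobytes(),    # one byte per atomic number
        array("f", positions).tobytes(),  # flattened xyz as float32
        array("f", forces).tobytes(),     # flattened xyz as float32
    ))


def unpack_molecule(record):
    """Decode a record written by pack_molecule."""
    n, energy = HEADER.unpack_from(record)
    start = HEADER.size
    numbers = array("B", record[start:start + n])
    floats = array("f")
    floats.frombytes(record[start + n:])
    return list(numbers), list(floats[:3 * n]), list(floats[3 * n:]), energy


def pickle_molecule(numbers, positions, forces, energy):
    """Pickle style: a pickled dict holding the same raw buffers plus metadata."""
    return pickle.dumps(
        {
            "numbers": array("B", numbers).tobytes(),
            "positions": array("f", positions).tobytes(),
            "forces": array("f", forces).tobytes(),
            "energy": energy,
        },
        protocol=pickle.HIGHEST_PROTOCOL,
    )


# A 64-atom toy molecule with values that float32 represents exactly.
n_atoms = 64
numbers = [6] * n_atoms
positions = [0.5] * (3 * n_atoms)
forces = [-0.25] * (3 * n_atoms)

packed = pack_molecule(numbers, positions, forces, -1.5)
pickled = pickle_molecule(numbers, positions, forces, -1.5)
print(f"packed: {len(packed)} bytes, pickled: {len(pickled)} bytes")
```

Either blob could be stored as the value of a key-value entry. The packed form needs an explicit decoder that knows the layout, while the pickled form carries its own key names and framing, which is the flexibility-versus-overhead trade-off described above.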
Read Performance
For a representative NVMe slice at 64 atoms per molecule, the comparison below uses:
- sequential read throughput
- the single-worker `multiprocessing` slice as the random or shuffled read proxy
For that slice, Atompack reaches about:
- 646k mol/s on sequential reads
- 446k mol/s on the single-worker random or shuffled read path
Relative to the same benchmark slice, Atompack is:
- 1.37x faster than HDF5 SOA on sequential reads and 24.0x faster on the random or shuffled path
- 3.32x faster than LMDB Packed on sequential reads and 2.81x faster on the random or shuffled path
- 5.18x faster than LMDB Pickle on sequential reads and 3.82x faster on the random or shuffled path
| backend | sequential read (mol/s) | random/shuffled read (mol/s) |
|---|---|---|
| atompack | 646,261 | 445,830 |
| hdf5_soa | 470,417 | 18,569 |
| lmdb_packed | 194,467 | 158,871 |
| lmdb_pickle | 124,706 | 116,579 |
| ase_lmdb | 4,637 | 4,620 |
| ase_sqlite | 1,790 | 1,803 |
The “random/shuffled” column above comes from the single-worker multiprocessing benchmark
slice used for this benchmark report, not from a separate point-query microbenchmark.
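The speedup factors quoted in this section follow directly from the throughput table. As a quick check, the short sketch below recomputes them. The numbers are copied from the table, and the `speedups` helper is a hypothetical name introduced here, not part of Atompack.

```python
# Throughput in molecules per second, copied from the read table:
# (sequential read, random/shuffled read).
READ_MOL_PER_S = {
    "atompack": (646_261, 445_830),
    "hdf5_soa": (470_417, 18_569),
    "lmdb_packed": (194_467, 158_871),
    "lmdb_pickle": (124_706, 116_579),
    "ase_lmdb": (4_637, 4_620),
    "ase_sqlite": (1_790, 1_803),
}


def speedups(table, reference="atompack"):
    """Return {backend: (sequential_ratio, random_ratio)} relative to `reference`."""
    ref_seq, ref_rand = table[reference]
    return {
        name: (ref_seq / seq, ref_rand / rand)
        for name, (seq, rand) in table.items()
        if name != reference
    }


for name, (seq_x, rand_x) in speedups(READ_MOL_PER_S).items():
    print(f"{name:12s} sequential {seq_x:7.2f}x   random/shuffled {rand_x:7.2f}x")
```

The same helper accepts a different `reference` backend if a comparison against, for example, `lmdb_packed` is more relevant for an existing pipeline.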
Scaling And Filesystems
The benchmark results also cover atom-count scaling and behavior on shared filesystems.
Across NVMe, NFS, GPFS, and Lustre, Atompack keeps a clear lead on the single-worker random/shuffled read slice:
- vs LMDB Packed: about 2.6x to 2.9x faster
- vs LMDB Pickle: about 3.8x to 4.8x faster
- vs HDF5 SOA: about 15.7x to 31.6x faster
That consistency matters for shared-storage training setups where local-NVMe numbers are not representative of the final deployment environment.
Write Throughput
For write throughput, the comparison below uses the current NVMe write benchmark slices.
For the 64-atom slice, Atompack leads the plotted backends both with builtin fields only and with additional custom properties:
| backend | write builtins (mol/s) | write with custom props (mol/s) |
|---|---|---|
| atompack | 105,473 | 77,193 |
| hdf5_soa | 91,431 | 57,198 |
| lmdb_packed | 37,573 | 26,323 |
| lmdb_pickle | 24,967 | 18,477 |
| ase_sqlite | 1,417 | 1,398 |
| ase_lmdb | 919 | 755 |
This is the path where Atompack’s native batch ingestion matters most:
- `Database.add_arrays_batch(...)` for stacked numpy inputs
- `atompack.add_ase_batch(...)` for iterables of `ase.Atoms`
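To see how large the write lead is in the table above, the sketch below ranks the backends per column and reports the ratio between the fastest and the next-fastest entry. The numbers are copied from the write table, and `lead_over_runner_up` is a hypothetical helper written for this page.

```python
# Write throughput in molecules per second, copied from the write table:
# (builtins only, with custom properties).
WRITE_MOL_PER_S = {
    "atompack": (105_473, 77_193),
    "hdf5_soa": (91_431, 57_198),
    "lmdb_packed": (37_573, 26_323),
    "lmdb_pickle": (24_967, 18_477),
    "ase_sqlite": (1_417, 1_398),
    "ase_lmdb": (919, 755),
}


def lead_over_runner_up(table, column):
    """Return (fastest, runner_up, ratio) for one throughput column."""
    ranked = sorted(table.items(), key=lambda item: item[1][column], reverse=True)
    (best, best_vals), (second, second_vals) = ranked[0], ranked[1]
    return best, second, best_vals[column] / second_vals[column]


for label, column in (("builtins", 0), ("custom props", 1)):
    best, second, ratio = lead_over_runner_up(WRITE_MOL_PER_S, column)
    print(f"{label:12s} {best} leads {second} by {ratio:.2f}x")
```

On these numbers the margin over the runner-up is much narrower than on the random-read path, which matches the summary that Atompack is strongest on reads.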
Storage Efficiency
For storage footprint, the comparison below reports normalized artifact size for the same write benchmark slices.
For the 64-atom slice:
| backend | size ratio vs atompack (builtins) | size ratio vs atompack (custom) |
|---|---|---|
| hdf5_soa | 0.96x | 0.95x |
| atompack | 1.00x | 1.00x |
| lmdb_packed | 2.34x | 1.35x |
| lmdb_pickle | 2.35x | 1.35x |
| ase_sqlite | 3.05x | 2.08x |
| ase_lmdb | 4.69x | 2.69x |
The practical takeaway is not that Atompack is always the absolute smallest representation. The more important result is that it stays in the compact-storage regime while pairing that with much stronger read behavior.
Reproducing The Benchmarks
    uv run --project atompack-py --no-sync python atompack-py/benchmarks/benchmark.py \
      --out /tmp/atompack-bench/benchmark.json
    uv run --project atompack-py --no-sync python atompack-py/benchmarks/scaling_benchmark.py \
      --out /tmp/atompack-bench/scaling.json
    uv run --project atompack-py --no-sync python atompack-py/benchmarks/write_benchmark.py \
      --out /tmp/atompack-bench/write.json
For running quick microbenchmarks or inspecting the code, you can also run this binary directly:
    cargo run -p atompack --release --bin atompack-bench -- --help
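Once the benchmark scripts have written their JSON files, a small helper can be handy for a first look at the results. The JSON schema is not described on this page, so the sketch below makes no assumption about it and only prints the key structure it finds. `summarize_json` is a hypothetical helper name, not something shipped with Atompack.

```python
import json
from pathlib import Path


def summarize_json(path, max_depth=2):
    """Return indented lines describing the key structure of a JSON file."""
    data = json.loads(Path(path).read_text())
    lines = []

    def walk(node, depth):
        # Stop descending once the requested nesting depth is exceeded.
        if depth > max_depth:
            return
        if isinstance(node, dict):
            for key, value in node.items():
                lines.append("  " * depth + f"{key}: {type(value).__name__}")
                walk(value, depth + 1)
        elif isinstance(node, list) and node:
            # Describe a list by its length and the shape of its first item.
            lines.append("  " * depth + f"[{len(node)} items]")
            walk(node[0], depth + 1)

    walk(data, 0)
    return lines
```

For example, `print("\n".join(summarize_json("/tmp/atompack-bench/benchmark.json")))` lists the top-level keys and their types for the file written by the first command above.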
Practical Guidance
- Use `Database.open(path)` for read-mostly datasets. It defaults to mmap-backed read-only mode.
- Reopen with `Database.open(path, mmap=False)` when you need to append more molecules.
- Prefer `add_arrays_batch(...)` or `add_ase_batch(...)` when ingesting large datasets from existing array or ASE pipelines.
- Prefer `db[i]` or `db.get_molecules(...)` for straightforward read paths, and use `get_molecules_flat(...)` when you specifically want already-stacked training batches.
- Compression is available when artifact size matters, but it should be treated as a workload tuning knob rather than as the main reason to adopt Atompack.