.. Copyright 2026 Entalpic

Performance
===========

This page summarizes the current benchmark results for Atompack.

The main pattern is consistent across the reported comparisons:

- Atompack is strongest on read-heavy serving paths, especially once access
  becomes less sequential.
- Atompack also leads the write-throughput slices used by the report,
  particularly when using the native batch ingestion APIs.
- Artifact size stays close to HDF5 SOA and substantially smaller than the LMDB
  and ASE baselines used in this repository.

Baselines
---------

The benchmark set is intentionally mixed. It is not only comparing Atompack to
one alternative storage engine, but to the main layout families that show up in
atomistic ML codebases:

- ``hdf5_soa`` is the conventional scientific-array baseline: one chunked HDF5
  dataset per field such as positions, atomic numbers, energies, and forces.
  Its main trade-off is that it is very compact and works well for bulk array
  access, but shuffled single-molecule reads can force extra chunk traffic and
  more Python-side reconstruction work.
- ``lmdb_packed`` is a key-value baseline where each molecule is serialized
  into one compact binary record and stored in LMDB. That is closer to the
  access pattern used by training dataloaders and usually behaves better than
  HDF5 on random reads, but it pays per-record encode/decode overhead and is
  less storage-efficient than a compact array-oriented layout.
- ``lmdb_pickle`` is a common Python-first LMDB pattern where each entry is a
  pickled dict of numpy arrays and scalar metadata. It is flexible and easy to
  integrate into existing repos, but pickle framing and object reconstruction
  are exactly the costs that show up in the read and size numbers.
- ``ase_sqlite`` and ``ase_lmdb`` are included as ecosystem reference
  baselines. They are not designed first as high-throughput training stores,
  but ASE is still the main interchange layer in many atomistic and materials
  ML repositories: datasets are often published as ``ase.Atoms`` or ``ase.db``
  collections, and training code frequently starts from an ASE-based reader or
  preprocessing step. That makes ASE an important baseline for practical
  compatibility, even when its database backends trade away throughput for
  generality and broad tool support.

The ASE results answer a different question from the HDF5 and LMDB baselines.
They show the cost of staying on the most common ecosystem path end to end,
while the HDF5 and LMDB results show the more specialized storage trade-offs
that practitioners already use when they start optimizing training throughput.

Read Performance
----------------

For a representative NVMe slice at 64 atoms per molecule, the comparison below
uses:

- sequential read throughput
- the single-worker ``multiprocessing`` slice as the random or shuffled read
  proxy

.. figure:: _static/img/atompack-story/story_read_hero.svg
   :alt: Atompack read throughput benchmark hero figure

For that slice, Atompack reaches about:

- ``646k mol/s`` on sequential reads
- ``446k mol/s`` on the single-worker random or shuffled read path

Relative to the same benchmark slice, Atompack is:

- ``1.37x`` faster than HDF5 SOA on sequential reads and ``24.0x`` faster on
  the random or shuffled path
- ``3.32x`` faster than LMDB Packed on sequential reads and ``2.81x`` faster on
  the random or shuffled path
- ``5.18x`` faster than LMDB Pickle on sequential reads and ``3.82x`` faster on
  the random or shuffled path

.. list-table::
   :header-rows: 1

   * - backend
     - sequential read (mol/s)
     - random/shuffled read (mol/s)
   * - atompack
     - 646,261
     - 445,830
   * - hdf5_soa
     - 470,417
     - 18,569
   * - lmdb_packed
     - 194,467
     - 158,871
   * - lmdb_pickle
     - 124,706
     - 116,579
   * - ase_lmdb
     - 4,637
     - 4,620
   * - ase_sqlite
     - 1,790
     - 1,803

The “random/shuffled” column above comes from the single-worker
``multiprocessing`` benchmark slice used for this benchmark report, not from a
separate point-query microbenchmark.

Scaling And Filesystems
-----------------------

The benchmark results also cover atom-count scaling and behavior on shared
filesystems:

.. figure:: _static/img/atompack-story/story_size_scaling.svg
   :alt: Atompack scaling figure across atom counts

.. figure:: _static/img/atompack-story/story_random_filesystems.svg
   :alt: Atompack random read behavior across filesystems

Across NVMe, NFS, GPFS, and Lustre, Atompack keeps a clear lead on the
single-worker random/shuffled read slice:

- vs LMDB Packed: about ``2.6x`` to ``2.9x`` faster
- vs LMDB Pickle: about ``3.8x`` to ``4.8x`` faster
- vs HDF5 SOA: about ``15.7x`` to ``31.6x`` faster

That consistency matters for shared-storage training setups where local-NVMe
numbers are not representative of the final deployment environment.

Write Throughput
----------------

For write throughput, the comparison below uses the current NVMe write
benchmark slices:

.. figure:: _static/img/atompack-story/story_write_overview.svg
   :alt: Atompack write throughput overview

For the 64-atom slice, Atompack leads the plotted backends both with builtin
fields only and with additional custom properties:

.. list-table::
   :header-rows: 1

   * - backend
     - write builtins (mol/s)
     - write with custom props (mol/s)
   * - atompack
     - 105,473
     - 77,193
   * - hdf5_soa
     - 91,431
     - 57,198
   * - lmdb_packed
     - 37,573
     - 26,323
   * - lmdb_pickle
     - 24,967
     - 18,477
   * - ase_sqlite
     - 1,417
     - 1,398
   * - ase_lmdb
     - 919
     - 755

This is the path where Atompack's native batch ingestion matters most:

- ``Database.add_arrays_batch(...)`` for stacked numpy inputs
- ``atompack.add_ase_batch(...)`` for iterables of ``ase.Atoms``

Storage Efficiency
------------------

For storage footprint, the comparison below reports normalized artifact size
for the same write benchmark slices:

.. figure:: _static/img/atompack-story/story_write_storage.svg
   :alt: Atompack write storage efficiency comparison

For the 64-atom slice:

.. list-table::
   :header-rows: 1

   * - backend
     - size ratio vs atompack (builtins)
     - size ratio vs atompack (custom)
   * - hdf5_soa
     - 0.96x
     - 0.95x
   * - atompack
     - 1.00x
     - 1.00x
   * - lmdb_packed
     - 2.34x
     - 1.35x
   * - lmdb_pickle
     - 2.35x
     - 1.35x
   * - ase_sqlite
     - 3.05x
     - 2.08x
   * - ase_lmdb
     - 4.69x
     - 2.69x

The practical takeaway is not that Atompack is always the absolute smallest
representation. The more important result is that it stays in the
compact-storage regime while pairing that with much stronger read behavior.

Reproducing The Benchmarks
--------------------------

.. code-block:: bash

   uv run --project atompack-py --no-sync python atompack-py/benchmarks/benchmark.py \
     --out /tmp/atompack-bench/benchmark.json
   uv run --project atompack-py --no-sync python atompack-py/benchmarks/scaling_benchmark.py \
     --out /tmp/atompack-bench/scaling.json
   uv run --project atompack-py --no-sync python atompack-py/benchmarks/write_benchmark.py \
     --out /tmp/atompack-bench/write.json

For running quick microbenchmarks or inspecting the code, you can also run this
binary directly:

.. code-block:: bash

   cargo run -p atompack --release --bin atompack-bench -- --help

Practical Guidance
------------------

- Use ``Database.open(path)`` for read-mostly datasets. It defaults to
  mmap-backed read-only mode.
- Reopen with ``Database.open(path, mmap=False)`` when you need to append more
  molecules.
- Prefer ``add_arrays_batch(...)`` or ``add_ase_batch(...)`` when ingesting
  large datasets from existing array or ASE pipelines.
- Prefer ``db[i]`` or ``db.get_molecules(...)`` for straightforward read paths,
  and use ``get_molecules_flat(...)`` when you specifically want
  already-stacked training batches.
- Compression is available when artifact size matters, but it should be
  treated as a workload tuning knob rather than as the main reason to adopt
  Atompack.
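
Before acting on the guidance above, it can help to check how a store behaves
under sequential versus shuffled access on your own hardware. The following is
a minimal sketch, not the methodology of the repository's benchmark scripts:
``measure_throughput`` is a hypothetical helper, and a plain Python list stands
in for the dataset. It assumes only that the store supports integer indexing,
which this page lists as a read path for Atompack (``db[i]``).

.. code-block:: python

   import random
   import time
   from typing import Sequence


   def measure_throughput(dataset: Sequence, indices: Sequence[int]) -> float:
       """Return items read per second when visiting ``dataset`` in index order."""
       start = time.perf_counter()
       for i in indices:
           _ = dataset[i]  # the only requirement on the store is integer indexing
       elapsed = time.perf_counter() - start
       # Guard against a zero interval on very fast in-memory stand-ins.
       return len(indices) / max(elapsed, 1e-9)


   # Stand-in dataset (assumption): a list of small dicts, not a real store.
   dataset = [{"n_atoms": 64, "index": i} for i in range(10_000)]

   sequential = list(range(len(dataset)))
   shuffled = sequential.copy()
   random.Random(0).shuffle(shuffled)  # fixed seed so every run uses one order

   seq_rate = measure_throughput(dataset, sequential)
   shuf_rate = measure_throughput(dataset, shuffled)
   print(f"sequential: {seq_rate:,.0f} items/s")
   print(f"shuffled:   {shuf_rate:,.0f} items/s")

Absolute numbers from an in-memory list say nothing about a storage backend;
only the shape of the comparison carries over. To compare against the tables
above, replace the list with an opened store and read enough molecules that
operating-system caching does not dominate the measurement.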