.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/core/plot_load_save_external.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_core_plot_load_save_external.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_core_plot_load_save_external.py:


.. _l-example-plot-load-save-external:

Load and save ONNX models with external data
============================================

This example demonstrates how to load and save ONNX models that store tensor
weights in separate *external data* files. This layout allows the graph
structure to be inspected independently of the (potentially very large) weight
payload, and it enables memory-mapping the weight file directly when the model
is loaded back through :func:`onnx_light.onnx.load`.

External-data I/O is performed through :mod:`onnx_light.onnx`, which routes all
I/O through C++ without any Python-level tensor iteration.

.. GENERATED FROM PYTHON SOURCE LINES 16-46

.. code-block:: Python


    import os
    import shutil

    import numpy as np
    import onnx
    import onnx.helper as oh
    import onnx.numpy_helper as onh

    import onnx_light.onnx as onnxl


    def check_w0_roundtrip(model_proto, expected: np.ndarray, label: str = "") -> None:
        """Verifies that the first initializer matches *expected* after a round-trip.

        Extracts the raw bytes of the first initializer in *model_proto*,
        interprets them as float32, reshapes to match *expected*, and asserts
        element-wise closeness.

        :param model_proto: Loaded :class:`onnxl.ModelProto` to inspect.
        :param expected: Reference numpy array to compare against.
        :param label: Optional suffix appended to the assertion message.
        """
        raw = bytes(model_proto.graph.initializer[0].raw_data)
        loaded = np.frombuffer(raw, dtype=np.float32).reshape(expected.shape)
        suffix = f" ({label})" if label else ""
        assert np.allclose(expected, loaded), f"Round-trip mismatch for W0{suffix}"
        print(f"W0 round-trip{suffix}: OK")


.. GENERATED FROM PYTHON SOURCE LINES 47-53

Build a tiny synthetic ONNX model
-----------------------------------

The model has two ``Gemm`` nodes with float32 weight matrices. Both weight
tensors are stored as initializers, so they end up in the external data file
when the model is saved with ``location``.

.. GENERATED FROM PYTHON SOURCE LINES 53-72

.. code-block:: Python


    DIM = 64 if os.environ.get("UNITTEST_GOING") == "1" else 256

    w0 = np.random.randn(DIM, DIM).astype(np.float32)
    w1 = np.random.randn(DIM, DIM).astype(np.float32)

    inputs = [oh.make_tensor_value_info("X", onnx.TensorProto.FLOAT, [None, DIM])]
    outputs = [oh.make_tensor_value_info("Y1", onnx.TensorProto.FLOAT, [None, DIM])]
    initializers = [onh.from_array(w0, name="W0"), onh.from_array(w1, name="W1")]
    nodes = [
        oh.make_node("Gemm", ["X", "W0"], ["Y0"], transB=1),
        oh.make_node("Gemm", ["Y0", "W1"], ["Y1"], transB=1),
    ]
    graph = oh.make_graph(nodes, "demo_graph", inputs, outputs, initializer=initializers)
    onnx_model = oh.make_model(graph, opset_imports=[oh.make_opsetid("", 18)], ir_version=9)

    print(f"Number of initializers: {len(onnx_model.graph.initializer)}")
    print(f"Model ByteSize: {onnx_model.ByteSize() / 1024:.1f} KB")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Number of initializers: 2
    Model ByteSize: 512.2 KB


.. GENERATED FROM PYTHON SOURCE LINES 73-79

Save to a single .onnx file first
-----------------------------------

Write the model to disk using the standard ``onnx.save`` so that it can later
be converted to the two-file layout via :func:`onnxl.load` /
:func:`onnxl.save`.

.. GENERATED FROM PYTHON SOURCE LINES 79-87

.. code-block:: Python


    out_dir = "temp_plot_load_save_external"
    os.makedirs(out_dir, exist_ok=True)

    single_file_path = os.path.join(out_dir, "model_single.onnx")
    onnx.save(onnx_model, single_file_path)
    print(f"Saved single-file model: {single_file_path}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Saved single-file model: temp_plot_load_save_external/model_single.onnx


.. GENERATED FROM PYTHON SOURCE LINES 88-93

Load with onnx_light
---------------------

:func:`onnxl.load` memory-maps the main ``.onnx`` file (and any external
weights file) and can optionally parse tensors in parallel.

.. GENERATED FROM PYTHON SOURCE LINES 93-98

.. code-block:: Python


    onnxl_model = onnxl.load(single_file_path)
    print(f"Loaded model ir_version={onnxl_model.ir_version}")
    print(f"Graph name: {onnxl_model.graph.name}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Loaded model ir_version=9
    Graph name: demo_graph


.. GENERATED FROM PYTHON SOURCE LINES 99-108

Save with external data (two files)
-------------------------------------

Passing a ``location`` argument to :func:`onnxl.save` routes all raw tensor
data to a separate binary file. The ``.onnx`` file stores only the graph
structure plus a small metadata record (offset + length) for each weight
tensor.

* ``model_ext.onnx`` – graph structure (kilobytes)
* ``model_ext.onnx.data`` – raw weight bytes (megabytes)

.. GENERATED FROM PYTHON SOURCE LINES 108-120

.. code-block:: Python


    ext_onnx = os.path.join(out_dir, "model_ext.onnx")
    ext_data = ext_onnx + ".data"

    onnxl.save(onnxl_model, ext_onnx, location=ext_data)

    onnx_size = os.path.getsize(ext_onnx)
    data_size = os.path.getsize(ext_data)
    print("Saved two-file model:")
    print(f" {ext_onnx!r:<50} {onnx_size / 1024:7.1f} KB (graph structure)")
    print(f" {ext_data!r:<50} {data_size / 1024:7.1f} KB (tensor weights)")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Saved two-file model:
     'temp_plot_load_save_external/model_ext.onnx'          0.3 KB (graph structure)
     'temp_plot_load_save_external/model_ext.onnx.data'   512.0 KB (tensor weights)


.. GENERATED FROM PYTHON SOURCE LINES 121-127

Load from the two-file layout
------------------------------

Pass ``load_external_data=True`` so that :func:`onnxl.load` scans the model
metadata and auto-discovers the external data file from the ``location`` entry
stored in each tensor's ``external_data`` field.

.. GENERATED FROM PYTHON SOURCE LINES 127-134

.. code-block:: Python


    loaded_ext = onnxl.load(ext_onnx, load_external_data=True)
    print(f"Loaded two-file model, initializers={len(loaded_ext.graph.initializer)}")

    # Verify that the first weight round-trips correctly.
    check_w0_roundtrip(loaded_ext, w0)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Loaded two-file model, initializers=2
    W0 round-trip: OK


.. GENERATED FROM PYTHON SOURCE LINES 135-140

Override the data-file location at load time
----------------------------------------------

If the weight file has been moved or renamed, the ``location`` keyword lets
you override the path stored inside the ``.onnx`` metadata.

.. GENERATED FROM PYTHON SOURCE LINES 140-144

.. code-block:: Python


    loaded_override = onnxl.load(ext_onnx, location=ext_data, load_external_data=True)
    print(f"Loaded with explicit location, initializers={len(loaded_override.graph.initializer)}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Loaded with explicit location, initializers=2


.. GENERATED FROM PYTHON SOURCE LINES 145-152

Save with parallel I/O
-----------------------

Large models benefit from writing raw-data blocks in parallel. Pass
``num_threads`` (either ``0`` or a value greater than 1) to control the thread
pool. The ``min_block_size`` parameter prevents spawning threads for tiny
tensors.

.. GENERATED FROM PYTHON SOURCE LINES 152-159

.. code-block:: Python


    ext_par_onnx = os.path.join(out_dir, "model_ext_par.onnx")
    ext_par_data = ext_par_onnx + ".data"

    onnxl.save(onnxl_model, ext_par_onnx, location=ext_par_data, num_threads=4)
    print(f"Saved with parallel I/O: {ext_par_onnx!r}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Saved with parallel I/O: 'temp_plot_load_save_external/model_ext_par.onnx'


.. GENERATED FROM PYTHON SOURCE LINES 160-168

Split external data across multiple files
------------------------------------------

Use ``max_external_file_size`` to cap the size of each external weight file.
Once the primary file (``model_split.onnx.data``) reaches the limit, a new file
is opened automatically with the suffix ``.1``, ``.2``, and so on. When
loading, only the primary location needs to be specified; the loader follows
the split-file references stored in each tensor's metadata.

.. GENERATED FROM PYTHON SOURCE LINES 168-193

.. code-block:: Python


    ext_split_onnx = os.path.join(out_dir, "model_split.onnx")
    ext_split_data = ext_split_onnx + ".data"

    # Cap each file at half the total weight size so that at least two data
    # files are produced regardless of the chosen DIM.
    max_file_bytes = (w0.nbytes + w1.nbytes) // 2

    onnxl.save(
        onnxl_model,
        ext_split_onnx,
        location=ext_split_data,
        max_external_file_size=max_file_bytes,
    )

    split_files = sorted(p for p in os.listdir(out_dir) if p.startswith("model_split.onnx.data"))
    print("Files produced by split save:")
    for fname in split_files:
        fpath = os.path.join(out_dir, fname)
        print(f" {fname!r:<40} {os.path.getsize(fpath) / 1024:7.1f} KB")

    # Load back – only the primary data file is needed.
    loaded_split = onnxl.load(ext_split_onnx, load_external_data=True)
    check_w0_roundtrip(loaded_split, w0, "split")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Files produced by split save:
     'model_split.onnx.data'                    256.0 KB
     'model_split.onnx.data.1'                  256.0 KB
    W0 round-trip (split): OK


.. GENERATED FROM PYTHON SOURCE LINES 194-196

Cleanup
--------

.. GENERATED FROM PYTHON SOURCE LINES 196-198

.. code-block:: Python


    shutil.rmtree(out_dir, ignore_errors=True)


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.137 seconds)


.. _sphx_glr_download_auto_examples_core_plot_load_save_external.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_load_save_external.ipynb <plot_load_save_external.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_load_save_external.py <plot_load_save_external.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_load_save_external.zip <plot_load_save_external.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_