.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/core/plot_load_save_external.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_core_plot_load_save_external.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_core_plot_load_save_external.py:


.. _l-example-plot-load-save-external:

Load and save ONNX models with external data
============================================

This example demonstrates how to load and save ONNX models that store tensor weights
in separate *external data* files.  The approach allows the graph structure to
be inspected independently of the (potentially very large) weight payload and
enables memory-mapping the weight file directly when the model is loaded
back through :func:`onnx_light.onnx.load`.

Loading and saving go through :mod:`onnx_light.onnx`, which routes all
I/O through C++ without any Python-level tensor iteration.  The standard
``onnx`` package is used only to build the initial model and write it once.

.. GENERATED FROM PYTHON SOURCE LINES 16-46

.. code-block:: Python


    import os
    import shutil

    import numpy as np
    import onnx
    import onnx.helper as oh
    import onnx.numpy_helper as onh

    import onnx_light.onnx as onnxl


    def check_w0_roundtrip(model_proto, expected: np.ndarray, label: str = "") -> None:
        """Verifies that the first initializer matches *expected* after a round-trip.

        Extracts the raw bytes of the first initializer in *model_proto*, interprets
        them as float32, reshapes to match *expected*, and asserts element-wise
        closeness.

        :param model_proto: Loaded :class:`onnxl.ModelProto` to inspect.
        :param expected: Reference numpy array to compare against.
        :param label: Optional suffix appended to the assertion message.
        """
        raw = bytes(model_proto.graph.initializer[0].raw_data)
        loaded = np.frombuffer(raw, dtype=np.float32).reshape(expected.shape)
        suffix = f" ({label})" if label else ""
        assert np.allclose(expected, loaded), f"Round-trip mismatch for W0{suffix}"
        print(f"W0 round-trip{suffix}: OK")


.. GENERATED FROM PYTHON SOURCE LINES 47-53

Build a tiny synthetic ONNX model
-----------------------------------

The model has two ``Gemm`` nodes, each with a float32 weight matrix of shape
``(DIM, DIM)``.  Both weights are stored as initializers, so they end up in
the external data file when the model is saved with ``location``.

.. GENERATED FROM PYTHON SOURCE LINES 53-72

.. code-block:: Python


    DIM = 64 if os.environ.get("UNITTEST_GOING") == "1" else 256

    w0 = np.random.randn(DIM, DIM).astype(np.float32)
    w1 = np.random.randn(DIM, DIM).astype(np.float32)

    inputs = [oh.make_tensor_value_info("X", onnx.TensorProto.FLOAT, [None, DIM])]
    outputs = [oh.make_tensor_value_info("Y1", onnx.TensorProto.FLOAT, [None, DIM])]
    initializers = [onh.from_array(w0, name="W0"), onh.from_array(w1, name="W1")]
    nodes = [
        oh.make_node("Gemm", ["X", "W0"], ["Y0"], transB=1),
        oh.make_node("Gemm", ["Y0", "W1"], ["Y1"], transB=1),
    ]
    graph = oh.make_graph(nodes, "demo_graph", inputs, outputs, initializer=initializers)
    onnx_model = oh.make_model(graph, opset_imports=[oh.make_opsetid("", 18)], ir_version=9)

    print(f"Number of initializers: {len(onnx_model.graph.initializer)}")
    print(f"Model ByteSize: {onnx_model.ByteSize() / 1024:.1f} KB")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Number of initializers: 2
    Model ByteSize: 512.2 KB
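
The reported size follows from the tensor shapes.  As a quick check, assuming
the run above used ``DIM = 256`` (the ``UNITTEST_GOING`` branch was not
taken), the two float32 matrices account for almost all of the serialized
bytes:

.. code-block:: Python

    DIM = 256               # assumed: value used when UNITTEST_GOING is not "1"
    BYTES_PER_FLOAT32 = 4
    NUM_WEIGHTS = 2         # W0 and W1

    weight_bytes = NUM_WEIGHTS * DIM * DIM * BYTES_PER_FLOAT32
    weight_kib = weight_bytes / 1024
    print(f"Raw weight payload: {weight_kib:.1f} KB")

The remaining fraction of a kilobyte in the reported ``ByteSize`` is the graph
structure and protobuf framing.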


.. GENERATED FROM PYTHON SOURCE LINES 73-79

Save to a single .onnx file first
-----------------------------------

Write the model to disk with the standard ``onnx.save`` so that it can later
be converted to the two-file layout via :func:`onnxl.load` and
:func:`onnxl.save`.

.. GENERATED FROM PYTHON SOURCE LINES 79-87

.. code-block:: Python


    out_dir = "temp_plot_load_save_external"
    os.makedirs(out_dir, exist_ok=True)

    single_file_path = os.path.join(out_dir, "model_single.onnx")
    onnx.save(onnx_model, single_file_path)
    print(f"Saved single-file model: {single_file_path}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Saved single-file model: temp_plot_load_save_external/model_single.onnx


.. GENERATED FROM PYTHON SOURCE LINES 88-93

Load with onnx_light
---------------------

:func:`onnxl.load` memory-maps the main ``.onnx`` file (and any external
weights file) and can optionally parse tensors in parallel.

.. GENERATED FROM PYTHON SOURCE LINES 93-98

.. code-block:: Python


    onnxl_model = onnxl.load(single_file_path)
    print(f"Loaded model ir_version={onnxl_model.ir_version}")
    print(f"Graph name: {onnxl_model.graph.name}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Loaded model ir_version=9
    Graph name: demo_graph
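
To illustrate what memory-mapping provides, the following standard-library
sketch maps a file of float32 values and reads a few elements without copying
the whole file into memory.  It only demonstrates the general technique and is
not the code path ``onnx_light`` uses internally; the file name
``weights.bin`` is made up for the example.

.. code-block:: Python

    import array
    import mmap
    import os
    import tempfile

    # 1024 float32 values standing in for a weight tensor.
    values = array.array("f", (float(i) for i in range(1024)))

    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "weights.bin")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(values.tobytes())

        # Map the file read-only; the operating system loads pages lazily.
        with open(path, "rb") as f, mmap.mmap(
            f.fileno(), 0, access=mmap.ACCESS_READ
        ) as mapped:
            raw_view = memoryview(mapped)
            view = raw_view.cast("f")  # float32 view over the mapping, no copy
            first_three = [view[i] for i in range(3)]
            last_value = view[-1]
            # Release both views before the mapping is closed.
            view.release()
            raw_view.release()

    print(first_three, last_value)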


.. GENERATED FROM PYTHON SOURCE LINES 99-108

Save with external data (two files)
-------------------------------------

Passing a ``location`` argument routes all tensor raw-data to a separate
binary file.  The ``.onnx`` file stores only the graph structure plus a
small metadata record (offset + length) for each weight tensor.

* ``model_ext.onnx`` – graph structure (small; 0.3 KB in the run below)
* ``model_ext.onnx.data`` – raw weight bytes (512 KB in the run below)

.. GENERATED FROM PYTHON SOURCE LINES 108-120

.. code-block:: Python


    ext_onnx = os.path.join(out_dir, "model_ext.onnx")
    ext_data = ext_onnx + ".data"

    onnxl.save(onnxl_model, ext_onnx, location=ext_data)

    onnx_size = os.path.getsize(ext_onnx)
    data_size = os.path.getsize(ext_data)
    print("Saved two-file model:")
    print(f"  {ext_onnx!r:<50} {onnx_size / 1024:7.1f} KB  (graph structure)")
    print(f"  {ext_data!r:<50} {data_size / 1024:7.1f} KB  (tensor weights)")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Saved two-file model:
      'temp_plot_load_save_external/model_ext.onnx'          0.3 KB  (graph structure)
      'temp_plot_load_save_external/model_ext.onnx.data'   512.0 KB  (tensor weights)
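
The per-tensor metadata record (offset + length) can be pictured with a small
standard-library sketch.  It packs two byte payloads into one data file and
keeps an ``(offset, length)`` pair for each.  The dictionary layout and the
names ``records`` and ``read_tensor`` are illustrative assumptions, not the
on-disk format written by ``onnx_light``.

.. code-block:: Python

    import os
    import tempfile

    # Stand-ins for the raw bytes of two weight tensors.
    payloads = {"W0": b"\x01" * 16, "W1": b"\x02" * 32}


    def read_tensor(data_path, record):
        """Read one payload back using only its offset and length."""
        with open(data_path, "rb") as f:
            f.seek(record["offset"])
            return f.read(record["length"])


    with tempfile.TemporaryDirectory() as tmp_dir:
        data_path = os.path.join(tmp_dir, "model.onnx.data")  # hypothetical name
        records = {}
        with open(data_path, "wb") as f:
            for name, blob in payloads.items():
                # Record where this payload starts before writing it.
                records[name] = {"offset": f.tell(), "length": len(blob)}
                f.write(blob)
        restored = {name: read_tensor(data_path, rec) for name, rec in records.items()}

    print(records)

Because each record is self-describing, a single tensor can be fetched without
reading the payloads stored before or after it.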


.. GENERATED FROM PYTHON SOURCE LINES 121-127

Load from the two-file layout
------------------------------

Pass ``load_external_data=True`` so :func:`onnxl.load` scans the model
metadata and auto-discovers the external data file from the ``location``
entry stored in each tensor's ``external_data`` field.

.. GENERATED FROM PYTHON SOURCE LINES 127-134

.. code-block:: Python


    loaded_ext = onnxl.load(ext_onnx, load_external_data=True)
    print(f"Loaded two-file model, initializers={len(loaded_ext.graph.initializer)}")

    # Verify that the first weight round-trips correctly.
    check_w0_roundtrip(loaded_ext, w0)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Loaded two-file model, initializers=2
    W0 round-trip: OK


.. GENERATED FROM PYTHON SOURCE LINES 135-140

Override the data-file location at load time
----------------------------------------------

If the weight file has been moved or renamed, the ``location`` keyword lets
you override the path stored inside the ``.onnx`` metadata.

.. GENERATED FROM PYTHON SOURCE LINES 140-144

.. code-block:: Python


    loaded_override = onnxl.load(ext_onnx, location=ext_data, load_external_data=True)
    print(f"Loaded with explicit location, initializers={len(loaded_override.graph.initializer)}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Loaded with explicit location, initializers=2
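
One way to picture the override is as a path-resolution step.  The helper
below is hypothetical (``resolve_data_path`` is not part of ``onnx_light``).
It assumes that a stored location is interpreted relative to the directory of
the ``.onnx`` file, as in the ONNX external data convention, and that an
explicit override simply takes precedence.

.. code-block:: Python

    import os


    def resolve_data_path(model_path, stored_location, override=None):
        """Return the data file path, preferring an explicit override."""
        if override is not None:
            return override
        # Assumption: stored locations are relative to the model's directory.
        return os.path.join(os.path.dirname(model_path), stored_location)


    default_path = resolve_data_path("models/net.onnx", "net.onnx.data")
    moved_path = resolve_data_path(
        "models/net.onnx", "net.onnx.data", override="backup/weights.bin"
    )
    print(default_path)
    print(moved_path)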


.. GENERATED FROM PYTHON SOURCE LINES 145-152

Save with parallel I/O
-----------------------

Large models benefit from writing raw-data blocks in parallel.  Pass
``num_threads`` (either ``0`` or a value ``N > 1``) to control the thread
pool.  The ``min_block_size`` parameter prevents spawning threads for tiny
tensors.

.. GENERATED FROM PYTHON SOURCE LINES 152-159

.. code-block:: Python


    ext_par_onnx = os.path.join(out_dir, "model_ext_par.onnx")
    ext_par_data = ext_par_onnx + ".data"

    onnxl.save(onnxl_model, ext_par_onnx, location=ext_par_data, num_threads=4)
    print(f"Saved with parallel I/O: {ext_par_onnx!r}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Saved with parallel I/O: 'temp_plot_load_save_external/model_ext_par.onnx'
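
The interplay between a thread pool and a minimum block size can be sketched
with the standard library.  This is not how ``onnx_light`` is implemented (its
I/O happens in C++); the function ``write_blocks`` and its parameters are
hypothetical and only show the idea: offsets are fixed up front, large blocks
are handed to worker threads, and small ones are written inline.

.. code-block:: Python

    import os
    import tempfile
    from concurrent.futures import ThreadPoolExecutor


    def _write_at(path, offset, blob):
        # Each call opens its own handle so threads never share a file position.
        with open(path, "r+b") as f:
            f.seek(offset)
            f.write(blob)


    def write_blocks(path, blobs, num_threads=4, min_block_size=1024):
        """Write blobs back to back; only large blobs go to the thread pool."""
        offsets, total = [], 0
        for blob in blobs:
            offsets.append(total)
            total += len(blob)
        with open(path, "wb") as f:
            f.truncate(total)  # pre-size the file so every offset is valid
        with ThreadPoolExecutor(max_workers=num_threads) as pool:
            futures = []
            for offset, blob in zip(offsets, blobs):
                if len(blob) >= min_block_size:
                    futures.append(pool.submit(_write_at, path, offset, blob))
                else:
                    _write_at(path, offset, blob)  # too small to be worth a task
            for future in futures:
                future.result()  # re-raise any worker exception
        return offsets


    blobs = [b"a" * 4096, b"b" * 10, b"c" * 2048]
    with tempfile.TemporaryDirectory() as tmp_dir:
        out_path = os.path.join(tmp_dir, "blocks.bin")
        offsets = write_blocks(out_path, blobs)
        with open(out_path, "rb") as f:
            written = f.read()

    print(offsets, len(written))

Computing the offsets before any write starts is what makes the blocks
independent, so the order in which threads finish does not matter.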


.. GENERATED FROM PYTHON SOURCE LINES 160-168

Split external data across multiple files
------------------------------------------

Use ``max_external_file_size`` to cap the size of each external weight file.
Once the primary file (``model_split.onnx.data``) reaches the limit, a new
file is opened automatically with the suffix ``.1``, ``.2``, and so on.
When loading, only the primary location needs to be specified; the loader
follows the split-file references stored in each tensor's metadata.

.. GENERATED FROM PYTHON SOURCE LINES 168-193

.. code-block:: Python


    ext_split_onnx = os.path.join(out_dir, "model_split.onnx")
    ext_split_data = ext_split_onnx + ".data"

    # Cap at half the total weight size so that at least two data files are
    # produced regardless of the chosen DIM.
    max_file_bytes = (w0.nbytes + w1.nbytes) // 2

    onnxl.save(
        onnxl_model,
        ext_split_onnx,
        location=ext_split_data,
        max_external_file_size=max_file_bytes,
    )

    split_files = sorted(p for p in os.listdir(out_dir) if p.startswith("model_split.onnx.data"))
    print("Files produced by split save:")
    for fname in split_files:
        fpath = os.path.join(out_dir, fname)
        print(f"  {fname!r:<40} {os.path.getsize(fpath) / 1024:7.1f} KB")

    # Load back; the loader finds every split file from the tensor metadata.
    loaded_split = onnxl.load(ext_split_onnx, load_external_data=True)
    check_w0_roundtrip(loaded_split, w0, "split")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Files produced by split save:
      'model_split.onnx.data'                    256.0 KB
      'model_split.onnx.data.1'                  256.0 KB
    W0 round-trip (split): OK
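
The split behaviour can be modelled as a greedy assignment.  The sketch below
is an assumption about the policy, not a transcription of the ``onnx_light``
source: a tensor goes into the current file unless adding it would exceed the
cap, in which case the next file (suffix ``.1``, ``.2``, ...) is opened.
``plan_split`` is a hypothetical helper.

.. code-block:: Python

    def plan_split(tensor_sizes, primary_name, max_file_size):
        """Map each tensor name to (file name, offset) under a per-file cap."""
        plan = {}
        file_index, used = 0, 0
        for name, size in tensor_sizes.items():
            # Move to a new file only if the current one already holds data.
            if used > 0 and used + size > max_file_size:
                file_index += 1
                used = 0
            if file_index == 0:
                file_name = primary_name
            else:
                file_name = f"{primary_name}.{file_index}"
            plan[name] = (file_name, used)
            used += size
        return plan


    KB = 1024
    sizes = {"W0": 256 * KB, "W1": 256 * KB}  # two float32 matrices, DIM = 256
    plan = plan_split(sizes, "model_split.onnx.data", max_file_size=256 * KB)
    for name, (file_name, offset) in plan.items():
        print(f"{name}: {file_name} @ offset {offset}")

With the sizes from this example the plan places one 256 KB tensor in each
file, which matches the two files listed above.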


.. GENERATED FROM PYTHON SOURCE LINES 194-196

Cleanup
--------

.. GENERATED FROM PYTHON SOURCE LINES 196-198

.. code-block:: Python


    shutil.rmtree(out_dir, ignore_errors=True)


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.137 seconds)


.. _sphx_glr_download_auto_examples_core_plot_load_save_external.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_load_save_external.ipynb <plot_load_save_external.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_load_save_external.py <plot_load_save_external.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_load_save_external.zip <plot_load_save_external.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_