.. _l-design-differences:

Differences between ``onnx`` and ``onnx_light``
===============================================

This section explains the internal design of *onnx-light* and how it differs from
the reference :epkg:`onnx` package.
Both packages share the same on-disk format (binary protobuf-encoded ``.onnx``
files) and expose very similar Python APIs, so ``onnx_light`` can act as a
near-drop-in replacement for common model loading and inspection tasks.
The key differences lie in how the serialization layer is implemented.

.. contents::
   :local:
   :depth: 2

----

No protobuf dependency
-----------------------

``onnx`` wraps the official `Protocol Buffers
<https://protobuf.dev/>`_ (protobuf) runtime.  Every message class is
auto-generated by the ``protoc`` compiler from ``.proto`` schema files, and
the resulting Python objects delegate all parsing and serialization to the
``libprotobuf`` C++ library.

``onnx_light`` ships its own hand-written parser and serializer implemented
entirely in C++ (see ``onnx_light/onnx_proto/``).  There is **no
dependency on protobuf at compile time or at runtime**.

Design implications:

* The C++ shared object (``_onnxpy.so``) built by ``onnx_light`` is smaller
  because it does not statically link any portion of ``libprotobuf``.
* All parsing and serialization code lives in a single self-contained library
  that can be consumed by other C++ projects without installing protobuf
  (see :epkg:`C++ onnx-light examples`).
* C++ projects that only need protobuf-compatible message types can link
  ``onnx_light::lib_onnx_proto`` directly; ``onnx_light::onnx_light`` is only
  needed when features tied to operator notions are required (checker, schema
  lookup, shape inference, version conversion, ...).
* The wire format produced by ``onnx_light`` is fully compatible with the
  official ONNX binary format, so models can be freely exchanged between the
  two libraries.
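
To make the idea of a hand-written parser concrete, the following sketch walks
the top-level fields of a protobuf-encoded buffer without the protobuf runtime.
It is an illustration in Python, not the C++ code from
``onnx_light/onnx_proto/``; the helper names ``read_varint`` and
``iter_fields`` are hypothetical.

.. code-block:: python

    def read_varint(buf, pos):
        """Decode a base-128 varint starting at pos; return (value, new_pos)."""
        result = 0
        shift = 0
        while True:
            byte = buf[pos]
            pos += 1
            result |= (byte & 0x7F) << shift
            if not byte & 0x80:  # high bit clear: last byte of the varint
                return result, pos
            shift += 7


    def iter_fields(buf):
        """Yield (field_number, wire_type, value) for each top-level field."""
        pos = 0
        while pos < len(buf):
            key, pos = read_varint(buf, pos)
            field_number, wire_type = key >> 3, key & 0x7
            if wire_type == 0:    # varint
                value, pos = read_varint(buf, pos)
            elif wire_type == 1:  # fixed 64-bit
                value, pos = buf[pos:pos + 8], pos + 8
            elif wire_type == 2:  # length-delimited (strings, bytes, messages)
                length, pos = read_varint(buf, pos)
                value, pos = buf[pos:pos + length], pos + length
            elif wire_type == 5:  # fixed 32-bit
                value, pos = buf[pos:pos + 4], pos + 4
            else:
                raise ValueError(f"unsupported wire type {wire_type}")
            yield field_number, wire_type, value


    # In ModelProto, field 1 is ir_version (varint) and field 2 is
    # producer_name (string).
    payload = bytes([0x08, 0x08, 0x12, 0x04]) + b"demo"
    fields = list(iter_fields(payload))

Because both libraries read and write this same encoding, a file written by
one can be parsed by the other.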

----

Files larger than 2 GB
-----------------------

The protobuf C++ runtime enforces a hard **2 GB message-size limit**.
Loading a model larger than that threshold with the standard ``onnx`` package
raises a ``DecodeError``, and saving one to a single file fails as well.

``onnx_light`` imposes no such limit.  Internally it tracks byte counts with
64-bit unsigned integers throughout the parsing and serialization path, so
models of arbitrary size are supported.
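
The sketch below illustrates why the width of the size counter matters.  It
assumes the 2 GB ceiling comes from signed 32-bit size counters; the helper
``encode_varint`` is hypothetical and only mimics the length prefix used by
the wire format.

.. code-block:: python

    import struct


    def encode_varint(value):
        """Encode a non-negative integer as a protobuf-style varint."""
        out = bytearray()
        while True:
            byte = value & 0x7F
            value >>= 7
            if value:
                out.append(byte | 0x80)  # more bytes follow
            else:
                out.append(byte)
                return bytes(out)


    size = 3 * 1024**3  # a 3 GiB payload

    # The wire format itself can express this length.
    length_prefix = encode_varint(size)

    # A signed 32-bit counter cannot hold it ...
    try:
        struct.pack("<i", size)
        fits_signed_32 = True
    except struct.error:
        fits_signed_32 = False

    # ... whereas an unsigned 64-bit counter can.
    as_uint64 = struct.pack("<Q", size)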

----

Buffered file I/O
-----------------

For file-based loading, ``onnx_light`` uses ``FileStream``
(``stream.h`` / ``stream.cc``), a buffered binary reader that opens the
file with ``std::ifstream`` and reads ahead in 4096-byte chunks.  On
POSIX platforms a second file descriptor is opened for parallel block reads
via ``pread``.

The ``onnx`` package reads the whole file into a Python ``bytes`` object first
and then passes it to protobuf, which copies it again internally.
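
The two access patterns described above can be sketched in Python as follows.
This is only an analogy for the C++ ``FileStream``: the function names
``iter_chunks`` and ``read_block`` are hypothetical, and ``os.pread`` is
available on POSIX platforms only.

.. code-block:: python

    import os
    import tempfile

    CHUNK_SIZE = 4096  # same chunk size as mentioned above


    def iter_chunks(path, chunk_size=CHUNK_SIZE):
        """Sequential read in fixed-size chunks instead of one big read()."""
        with open(path, "rb") as f:
            while True:
                chunk = f.read(chunk_size)
                if not chunk:
                    return
                yield chunk


    def read_block(path, offset, size):
        """Positional read on a separate descriptor (no shared file cursor)."""
        fd = os.open(path, os.O_RDONLY)
        try:
            return os.pread(fd, size, offset)
        finally:
            os.close(fd)


    # Demo on a temporary 10240-byte file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(bytes(range(256)) * 40)
        demo_path = tmp.name

    chunk_sizes = [len(c) for c in iter_chunks(demo_path)]
    block = read_block(demo_path, offset=4096, size=4)
    os.remove(demo_path)

Since ``pread`` takes an explicit offset, several threads can read different
blocks of the same file without coordinating a shared position.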

----

Parallel tensor loading
-----------------------

Large ONNX models contain hundreds or thousands of initializers (tensor
weights).  Parsing these sequentially is often the dominant cost when loading
a model.

``onnx_light`` exposes a ``num_threads`` option that distributes the initializer
parsing across a thread pool:

.. code-block:: python

    import onnx_light.onnx as onnxl

    model = onnxl.load("model.onnx", num_threads=4)

On the C++ side the thread pool is implemented in ``thread_pool.h`` /
``thread_pool.cc``.  Each worker independently parses a slice of the
initializer list, so wall-clock loading time can decrease as threads are
added, up to the number of hardware threads available.

The standard ``onnx`` package is single-threaded; it offers no built-in
parallel loading mechanism.
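
The slicing pattern can be sketched with the Python standard library.  This
is not the ``onnx_light`` thread pool: ``parse_slice`` is a hypothetical
stand-in for tensor parsing, and how the real implementation divides the work
is an assumption.

.. code-block:: python

    from concurrent.futures import ThreadPoolExecutor


    def parse_slice(blobs):
        """Stand-in for tensor parsing: here it only measures each blob."""
        return [len(b) for b in blobs]


    def parallel_parse(blobs, num_threads):
        """Split blobs into contiguous slices and parse them concurrently."""
        step = max(1, -(-len(blobs) // num_threads))  # ceiling division
        slices = [blobs[i:i + step] for i in range(0, len(blobs), step)]
        with ThreadPoolExecutor(max_workers=num_threads) as pool:
            # map() returns results in the order of the input slices.
            parsed = list(pool.map(parse_slice, slices))
        return [item for part in parsed for item in part]


    initializers = [bytes(n) for n in range(10)]  # blobs of 0..9 bytes
    sizes = parallel_parse(initializers, num_threads=4)

In CPython the global interpreter lock limits the speed-up for CPU-bound
Python code, so the sketch shows only the work-splitting idea; a native C++
pool is not subject to that restriction.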

----

Zero-copy parsing
-----------------

When the full model bytes are already in memory (e.g. downloaded into a
``bytes`` object), ``onnx_light`` can skip the ``malloc + memcpy`` that would
normally be used to copy each tensor's raw data into an owned buffer:

.. code-block:: python

    import onnx_light.onnx as onnxl

    with open("model.onnx", "rb") as f:
        serialized = f.read()   # keep alive!
    model = onnxl.load(serialized, no_copy=True)
    # tensor.raw_data now points directly into 'serialized'

Internally, each ``TensorProto`` stores a non-owning ``ByteSpan`` (from
``simple_span.h``) that borrows the bytes from the source buffer.  The
span's ``is_borrowed()`` predicate reports whether the raw data is owned or
borrowed.

.. warning::
    The source ``bytes`` object **must** remain alive for as long as the model
    is in use.  Freeing it while ``raw_data`` fields still point into it
    causes undefined behavior.  This constraint does not exist in the standard
    ``onnx`` package.
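
The difference between a borrowed view and an owned copy can be demonstrated
with plain Python objects.  The snippet is an analogy for ``ByteSpan`` and
does not use ``onnx_light``; the buffer contents are made up for the demo.

.. code-block:: python

    # Source buffer: a 6-byte header followed by 8 bytes of "tensor" data.
    data = bytearray(b"header" + bytes(range(8)))

    borrowed = memoryview(data)[6:]  # no copy: a view into 'data'
    owned = bytes(data[6:])          # copy: an independent buffer

    data[6] = 0xFF  # mutate the source buffer

    # The view observes the change, the copy does not.
    borrowed_first = borrowed[0]
    owned_first = owned[0]

Unlike a C++ span, a Python ``memoryview`` keeps its source object alive, so
the lifetime hazard described in the warning does not show up in this toy
example.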

----

C++ class generation via macros
---------------------------------

The ``onnx`` package generates Python message classes from ``.proto`` schema
files using ``protoc``.  ``onnx_light`` takes a different approach: message
classes are generated **at compile time** from a small set of C++ macros
defined in ``stream_class.h``:

* ``BEGIN_PROTO(cls, doc)`` / ``END_PROTO()`` — open/close a message class.
* ``FIELD(type, name, order, doc)`` — declare a scalar field with typed
  accessors ``ref_<name>()``, ``has_<name>()``, ``set_<name>()``.
* ``FIELD_STR(name, order, doc)`` — shorthand for ``utils::String`` fields
  that also accepts ``std::string``.
* ``FIELD_REPEATED(type, name, order, doc)`` — declare a repeated (list)
  field.
* ``SERIALIZATION_METHOD()`` — inject ``ParseFromString``,
  ``SerializeToString``, ``ParseFromStream``, and ``SerializeToStream``
  declarations.

The resulting classes in ``onnx.h`` closely mirror the protobuf-generated
classes so that code originally written for ``onnx`` can be adapted with
minimal changes.

----

External-data / multi-file models
----------------------------------

Large ONNX models can be split across two files: a small ``.onnx`` file that
holds the graph structure and a separate binary blob (the *external data file*)
that holds the raw tensor weights.  This layout allows the structural metadata
to be inspected quickly without loading the weights and makes it possible to
memory-map only the weight region.

Saving with external data
~~~~~~~~~~~~~~~~~~~~~~~~~

Pass a ``location`` argument to ``onnxl.save`` to route tensor weights to a
separate file:

.. code-block:: python

    import onnx_light.onnx as onnxl

    # model.onnx – graph structure only
    # model.onnx.data – all tensor weights
    onnxl.save(model, "model.onnx", location="model.onnx.data")

Serializing to two files does **not** mutate the in-memory ``ModelProto``.
``onnx_light`` applies external-data metadata on a temporary copy while writing
and keeps the original model unchanged.

The ``location`` value stored inside the ``.onnx`` metadata is automatically
reduced to a *relative* path (just the file name) when an absolute path is
provided, so the two files can be moved together without breaking the
reference.

Loading with external data
~~~~~~~~~~~~~~~~~~~~~~~~~~

When the ``.onnx`` file already references an external data file through its
tensor metadata, ``onnxl.load`` can discover and load the weights
automatically:

.. code-block:: python

    import onnx_light.onnx as onnxl

    model = onnxl.load("model.onnx", load_external_data=True)

To override the data-file location (for example when the file has been moved),
pass ``location`` explicitly:

.. code-block:: python

    model = onnxl.load("model.onnx", location="/data/weights.bin",
                        load_external_data=True)

When ``no_copy=True`` is combined with external data, ``onnx_light`` reads
each external weights file once into a shared model-owned buffer and every
tensor points into that shared storage. This avoids one allocation and copy
per tensor while still handling split external-data files transparently.

Splitting external data across multiple files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For very large models it can be useful to cap the size of each external weight
file.  Set ``max_external_file_size`` (in bytes) and ``onnxl.save`` will
automatically open a new file once the limit is reached, appending ``.1``,
``.2``, … suffixes to the base name:

.. code-block:: python

    import onnx_light.onnx as onnxl

    # Produces: model.onnx, model.onnx.data, model.onnx.data.1, …
    onnxl.save(
        model,
        "model.onnx",
        location="model.onnx.data",
        max_external_file_size=2 * 1024 ** 3,  # 2 GiB per file
    )

When loading, only the primary location (``model.onnx.data``) needs to be
specified; the loader automatically opens ``model.onnx.data.1``,
``model.onnx.data.2``, … as required.

All I/O is performed in C++ via ``TwoFilesWriteStream`` /
``TwoFilesStream``, so no Python overhead is incurred per tensor.
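
The file naming and the relative-path rule can be sketched as follows.  The
rollover policy used here (start a new file when the next tensor would exceed
the limit, never split a tensor) is an assumption made for the example, and
``plan_split`` is a hypothetical helper, not part of ``onnx_light``.

.. code-block:: python

    import os


    def external_file_name(base, index):
        """First file keeps the base name, later ones get .1, .2, ..."""
        return base if index == 0 else f"{base}.{index}"


    def plan_split(tensor_sizes, max_file_size, location):
        """Return (file_name, offset, size) for each tensor, in order."""
        base = os.path.basename(location)  # keep only the relative file name
        plan = []
        index, used = 0, 0
        for size in tensor_sizes:
            # Assumed policy: roll over before a tensor that would not fit.
            if used > 0 and used + size > max_file_size:
                index, used = index + 1, 0
            plan.append((external_file_name(base, index), used, size))
            used += size
        return plan


    plan = plan_split(
        [600, 300, 500, 900],
        max_file_size=1000,
        location="/tmp/out/model.onnx.data",
    )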

----

Encrypted model save / load
-----------------------------

``onnx_light`` optionally supports saving and loading models in an
**AES-256-CBC encrypted** binary format (extension ``.onnxc``).  The
standard ``onnx`` package offers no equivalent functionality.

The feature is available only when ``onnx_light`` is built with OpenSSL
(``-DONNX_LIGHT_HAS_OPENSSL``); when OpenSSL is absent the helpers raise
``NotImplementedError`` with a clear message.

File format
~~~~~~~~~~~

The encrypted file is a compact, self-contained binary:

.. code-block:: text

    Offset  Size  Field
    ------  ----  -----
         0     8  Magic: "ONNXCRY1"
         8    16  Random PBKDF2 salt
        24    16  Random AES-CBC initialisation vector
        40     N  AES-256-CBC ciphertext (PKCS#7-padded protobuf payload)

Key derivation uses **PBKDF2-HMAC-SHA256** with 100 000 iterations, which
makes brute-force attacks on the passphrase computationally expensive.
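
The header layout and the key derivation can be reproduced with the Python
standard library alone.  The sketch stops before the AES step, which needs a
third-party library.  The helper names are hypothetical, and encoding the
passphrase as UTF-8 is an assumption; the 32-byte key length follows from
AES-256.

.. code-block:: python

    import hashlib
    import os
    import struct

    MAGIC = b"ONNXCRY1"
    HEADER = struct.Struct("<8s16s16s")  # magic, salt, IV: 40 bytes in total


    def split_encrypted_blob(blob):
        """Validate the magic and return (salt, iv, ciphertext)."""
        if len(blob) < HEADER.size:
            raise ValueError("blob is shorter than the 40-byte header")
        magic, salt, iv = HEADER.unpack_from(blob, 0)
        if magic != MAGIC:
            raise ValueError("not an ONNXCRY1 blob")
        return salt, iv, blob[HEADER.size:]


    def derive_key(passphrase, salt, iterations=100_000):
        """PBKDF2-HMAC-SHA256 producing a 32-byte (AES-256) key."""
        return hashlib.pbkdf2_hmac(
            "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
        )


    # Build a fake blob: real header fields, placeholder ciphertext.
    salt, iv = os.urandom(16), os.urandom(16)
    blob = MAGIC + salt + iv + bytes(32)

    parsed_salt, parsed_iv, ciphertext = split_encrypted_blob(blob)
    key = derive_key("my_passphrase", parsed_salt)

Because the salt is stored in the file, the same passphrase always yields the
same key for a given file, while two files encrypted with the same passphrase
use different keys.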

Python API (file-based)
~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    import onnx_light.onnx as onnxl

    # Save an encrypted model to a file
    onnxl.save_encrypted(model, "model.onnxc", key="my_passphrase")

    # Load and decrypt from a file
    model = onnxl.load_encrypted("model.onnxc", key="my_passphrase")

    # A wrong key raises RuntimeError
    try:
        onnxl.load_encrypted("model.onnxc", key="wrong")
    except RuntimeError as exc:
        print(f"decryption failed: {exc}")

Python API (in-memory / bytes)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When no file I/O is desired, the model can be encrypted to a ``bytes``
object and decrypted back directly:

.. code-block:: python

    import onnx_light.onnx as onnxl

    # Encrypt to bytes (no file written)
    blob: bytes = onnxl.save_encrypted_string(model, key="my_passphrase")

    # Decrypt from bytes
    model = onnxl.load_encrypted_string(blob, key="my_passphrase")

The ``bytes`` object produced by :func:`save_encrypted_string` is in the
same ``ONNXCRY1`` format as the file produced by :func:`save_encrypted`,
so the two forms are interchangeable.

C++ API
~~~~~~~

.. code-block:: cpp

    #include "onnx_crypt.h"

    // File-based
    ONNX_LIGHT_NAMESPACE::SaveEncryptedModel(model, "model.onnxc", "passphrase");
    ONNX_LIGHT_NAMESPACE::LoadEncryptedModel(model, "model.onnxc", "passphrase");

    // In-memory
    std::string blob = ONNX_LIGHT_NAMESPACE::SaveEncryptedModelToString(model, "passphrase");
    ONNX_LIGHT_NAMESPACE::LoadEncryptedModelFromString(model, blob, "passphrase");

See :ref:`l-api-onnx-onnx-proto-onnx-crypt` for the full C++ API reference.

----

API compatibility
-----------------

``onnx_light`` aims to cover the most common operations of the ``onnx``
Python API and adds a few features that ``onnx`` does not offer:

.. list-table::
   :header-rows: 1
   :widths: 40 30 30

   * - Operation
     - ``onnx``
     - ``onnx_light``
   * - Load from file
     - ``onnx.load(path)``
     - ``onnxl.load(path)``
   * - Load from bytes
     - ``onnx.load_from_string(b)``
     - ``onnxl.load(b)``
   * - Save to file
     - ``onnx.save(model, path)``
     - ``onnxl.save(model, path)``
   * - Save with external data
     - ``onnx.save_model(model, path, save_as_external_data=True, location=loc)``
     - ``onnxl.save(model, path, location=loc)``
   * - Save external data with aligned tensor offsets
     - not supported
     - ``opts = onnxl.SerializeOptions(); opts.alignment = 4096; model.SerializeToFile(path, opts, loc)``
   * - Load with external data
     - ``onnx.load(path, load_external_data=True)``
     - ``onnxl.load(path, load_external_data=True)``
   * - Load external data with shared no-copy buffers
     - not supported
     - ``onnxl.load(path, load_external_data=True, no_copy=True)``
   * - Split external data
     - not supported
     - ``onnxl.save(model, path, location=loc, max_external_file_size=N)``
   * - Save encrypted to file
     - not supported
     - ``onnxl.save_encrypted(model, path, key=k)``
   * - Load encrypted from file
     - not supported
     - ``onnxl.load_encrypted(path, key=k)``
   * - Save encrypted to bytes
     - not supported
     - ``onnxl.save_encrypted_string(model, key=k)``
   * - Load encrypted from bytes
     - not supported
     - ``onnxl.load_encrypted_string(blob, key=k)``
   * - Parse a message
     - ``msg.ParseFromString(b)``
     - ``msg.ParseFromString(b)``
   * - Serialize a message
     - ``msg.SerializeToString()``
     - ``msg.SerializeToString()``
   * - Parallel load
     - not supported
     - ``onnxl.load(path, num_threads=N)``
   * - Zero-copy parse
     - not supported
     - ``onnxl.load(b, no_copy=True)``
   * - File size limit
     - 2 GB (protobuf)
     - unlimited

Not every helper utility present in ``onnx`` has a counterpart in
``onnx_light``, whose main focus is fast, protobuf-free loading and saving.
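
The "aligned tensor offsets" row in the table refers to rounding each
tensor's start offset in the external file up to a multiple of the chosen
alignment.  A minimal sketch of that arithmetic, with hypothetical helper
names and no claim about how ``SerializeOptions.alignment`` is implemented:

.. code-block:: python

    def align_up(offset, alignment):
        """Round offset up to the next multiple of alignment."""
        if alignment <= 0:
            raise ValueError("alignment must be positive")
        return -(-offset // alignment) * alignment  # ceiling division


    def aligned_offsets(sizes, alignment):
        """Start offset of each tensor when laid out back to back, aligned."""
        offsets, cursor = [], 0
        for size in sizes:
            cursor = align_up(cursor, alignment)
            offsets.append(cursor)
            cursor += size
        return offsets


    offsets = aligned_offsets([10, 5000, 4096], alignment=4096)

Aligning to a page-sized boundary such as 4096 is what makes it practical to
memory-map individual tensors from the weights file.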

----

Summary
-------

.. list-table::
   :header-rows: 1
   :widths: 35 30 35

   * - Aspect
     - ``onnx``
     - ``onnx_light``
   * - Serialization runtime
     - Google protobuf
     - Custom C++ (no protobuf)
   * - Max model size
     - 2 GB
     - Unlimited
   * - File I/O
     - Read-into-bytes
     - Buffered stream (``FileStream``, ``pread`` on POSIX)
   * - Tensor loading
     - Single-threaded
     - Optional parallel (thread pool)
   * - Raw-data copying
     - Always copied
     - Zero-copy option (``no_copy=True``)
   * - External data (2-file)
     - Yes (``save_model`` / ``load``)
     - Yes (``save`` / ``load``)
   * - External data no-copy shared buffers
     - No
     - Yes (``load(..., no_copy=True)``)
   * - Split external data (N files)
     - No
     - Yes (``max_external_file_size``)
   * - Tensor offset alignment in external files
     - No
     - Yes (``SerializeOptions.alignment``)
   * - Standalone C++ library
     - Yes
     - Yes (``onnx_light::lib_onnx_proto`` for proto-only code,
       ``onnx_light::onnx_light`` when operator-aware APIs are needed)
   * - Wire format
     - ONNX binary protobuf
     - ONNX binary protobuf (identical)
   * - Encrypted save / load
     - No
     - Yes (AES-256-CBC, requires OpenSSL)