.. _l-design-differences:
Differences between ``onnx`` and ``onnx_light``
===============================================
This section explains the internal design of *onnx-light* and how it differs from
the reference :epkg:`onnx` package.
Both packages share the same on-disk format (binary protobuf-encoded ``.onnx``
files) and expose very similar Python APIs, so ``onnx_light`` can act as a
near-drop-in replacement for common model loading and inspection tasks.
The key differences lie in how the serialization layer is implemented.
.. contents::
    :local:
    :depth: 2
----
No protobuf dependency
-----------------------
``onnx`` wraps the official Protocol Buffers (protobuf) runtime. Every message class is
auto-generated by the ``protoc`` compiler from ``.proto`` schema files, and
the resulting Python objects delegate all parsing and serialization to the
``libprotobuf`` C++ library.
``onnx_light`` ships its own hand-written parser and serializer implemented
entirely in C++ (see ``onnx_light/onnx_proto/``). There is **no
dependency on protobuf at compile time or at runtime**.
Design implications:
* The C++ shared object (``_onnxpy.so``) built by ``onnx_light`` is smaller
  because it does not statically link any portion of ``libprotobuf``.
* All parsing and serialization code lives in a single self-contained library
  that can be consumed by other C++ projects without installing protobuf
  (see :epkg:`C++ onnx-light examples`).
* C++ projects that only need protobuf-compatible message types can link
  ``onnx_light::lib_onnx_proto`` directly; ``onnx_light::onnx_light`` is only
  needed when features tied to operator notions are required (checker, schema
  lookup, shape inference, version conversion, ...).
* The wire format produced by ``onnx_light`` is fully compatible with the
  official ONNX binary format, so models can be freely exchanged between the
  two libraries.
----
Files larger than 2 GB
-----------------------
The protobuf C++ runtime enforces a hard **2 GB message-size limit**.
Loading a model larger than that threshold with the standard ``onnx``
package raises a ``DecodeError``, and serializing one fails as well.
``onnx_light`` imposes no such limit. Internally it tracks byte counts with
64-bit unsigned integers throughout the parsing and serialization path, so
models of arbitrary size are supported.
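To illustrate why the width of the byte counter matters, the sketch below encodes and decodes a protobuf-style base-128 varint, the encoding protobuf uses for the length prefix of length-delimited fields. The helper names ``encode_varint`` and ``decode_varint`` are illustrative only and are not part of ``onnx_light``; the point is simply that a 3 GiB length decodes fine but no longer fits in a signed 32-bit counter.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit counter can hold


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Return (value, next_position) for the varint starting at ``pos``."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


three_gib = 3 * 1024**3
length, _ = decode_varint(encode_varint(three_gib))
fits_in_int32 = length <= INT32_MAX  # a 32-bit signed counter would overflow
```

Python integers are unbounded, so the overflow is only simulated here by the comparison against ``INT32_MAX``; in C++ the counter type has to be chosen explicitly.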
----
Buffered file I/O
-----------------
For file-based loading, ``onnx_light`` uses ``FileStream``
(``stream.h`` / ``stream.cc``), a buffered binary reader that opens the
file with ``std::ifstream`` and reads ahead in 4096-byte chunks. On
POSIX platforms a second file descriptor is opened for parallel block reads
via ``pread``.
The ``onnx`` package reads the whole file into a Python ``bytes`` object first
and then passes it to protobuf, which copies it again internally.
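The difference between the two strategies can be sketched in plain Python. The generator below reads a file in fixed 4096-byte blocks instead of loading it whole; it is only an analogy for the buffered reader described above, not the actual ``FileStream`` code, and ``read_in_chunks`` is a hypothetical helper name.

```python
import os
import tempfile

CHUNK_SIZE = 4096  # same read-ahead size as mentioned in the text


def read_in_chunks(path: str, chunk_size: int = CHUNK_SIZE):
    """Yield successive blocks of at most ``chunk_size`` bytes from a file."""
    with open(path, "rb") as handle:
        while True:
            block = handle.read(chunk_size)
            if not block:
                break
            yield block


# Demo on a temporary 10 000-byte file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 10_000)
    demo_path = tmp.name

chunk_sizes = [len(block) for block in read_in_chunks(demo_path)]
os.remove(demo_path)  # clean up the demo file
```

With this pattern at most one block is resident at a time, whereas reading the whole file first requires a buffer as large as the model.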
----
Parallel tensor loading
-----------------------
Large ONNX models contain hundreds or thousands of initializers (tensor
weights). Parsing these sequentially is often the dominant cost when loading
a model.
``onnx_light`` exposes a ``num_threads`` option that distributes the initializer
parsing across a thread pool:
.. code-block:: python

    import onnx_light.onnx as onnxl

    model = onnxl.load("model.onnx", num_threads=4)
On the C++ side the thread pool is implemented in ``thread_pool.h`` /
``thread_pool.cc``. Each worker independently parses a slice of the
initializer list, so wall-clock loading time can drop as more hardware
threads are made available.
The standard ``onnx`` package is single-threaded; it offers no built-in
parallel loading mechanism.
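The slicing strategy can be sketched with Python's standard thread pool. This is a simplified model, not the C++ implementation: ``parse_initializer`` is a stand-in that only measures its payload, and the contiguous-slice partitioning is an assumption about how work might be divided.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_initializer(raw: bytes) -> int:
    """Stand-in for real tensor parsing: here it only measures the payload."""
    return len(raw)


def parse_slice(items: list[bytes]) -> list[int]:
    """Parse one contiguous slice of the initializer list."""
    return [parse_initializer(item) for item in items]


def parallel_parse(items: list[bytes], num_threads: int) -> list[int]:
    """Split ``items`` into contiguous slices and parse each slice in a worker."""
    if num_threads <= 1 or len(items) <= 1:
        return parse_slice(items)
    step = -(-len(items) // num_threads)  # ceiling division
    slices = [items[i:i + step] for i in range(0, len(items), step)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        parsed = pool.map(parse_slice, slices)  # map() preserves slice order
        return [value for chunk in parsed for value in chunk]


initializers = [bytes(n) for n in range(10)]  # payloads of 0..9 bytes
sizes = parallel_parse(initializers, num_threads=4)
```

In pure Python the global interpreter lock prevents CPU-bound work from speeding up this way; the sketch only shows the shape of the partitioning, which pays off when the workers run native code.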
----
Zero-copy parsing
-----------------
When the full model bytes are already in memory (e.g. downloaded into a
``bytes`` object), ``onnx_light`` can skip the ``malloc + memcpy`` that would
normally be used to copy each tensor's raw data into an owned buffer:
.. code-block:: python

    import onnx_light.onnx as onnxl

    # Read the whole file; 'serialized' must be kept alive while 'model' is used.
    with open("model.onnx", "rb") as f:
        serialized = f.read()

    model = onnxl.load(serialized, no_copy=True)
    # tensor.raw_data now points directly into 'serialized'
Internally, each ``TensorProto`` stores a non-owning ``ByteSpan`` (from
``simple_span.h``) that borrows the bytes from the source buffer. The
span's ``is_borrowed()`` predicate reports whether the raw data is owned or
borrowed.
.. warning::

    The source ``bytes`` object **must** remain alive for as long as the model
    is in use. Freeing it while ``raw_data`` fields still point into it
    causes undefined behavior. This constraint does not exist in the standard
    ``onnx`` package.
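The owned-versus-borrowed distinction can be demonstrated with Python's built-in ``memoryview``, used here as a rough analogue of a non-owning span. This is not how ``ByteSpan`` is implemented; the buffer layout and offsets are made up for the example.

```python
# A mutable source buffer standing in for the serialized model bytes.
serialized = bytearray(b"header" + b"\x01\x02\x03\x04" + b"footer")

# Owned copy: converting the slice to bytes allocates a new buffer.
owned = bytes(serialized[6:10])

# Borrowed view: a memoryview slice shares storage with 'serialized'.
borrowed = memoryview(serialized)[6:10]

serialized[6] = 0xFF  # mutate the source buffer

owned_first = owned[0]        # unaffected by the mutation
borrowed_first = borrowed[0]  # observes the mutation
```

Unlike a C++ span, a ``memoryview`` holds a reference that keeps its source alive, so this Python analogue cannot dangle; the lifetime hazard described in the warning is specific to borrowing raw pointers.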
----
C++ class generation via macros
---------------------------------
The ``onnx`` package generates Python message classes from ``.proto`` schema
files using ``protoc``. ``onnx_light`` takes a different approach: message
classes are generated **at compile time** from a small set of C++ macros
defined in ``stream_class.h``:
* ``BEGIN_PROTO(cls, doc)`` / ``END_PROTO()``: open and close a message class.
* ``FIELD(type, name, order, doc)``: declare a scalar field with typed
  accessors ``ref_<name>()``, ``has_<name>()``, ``set_<name>()``.
* ``FIELD_STR(name, order, doc)``: shorthand for ``utils::String`` fields
  that also accepts ``std::string``.
* ``FIELD_REPEATED(type, name, order, doc)``: declare a repeated (list)
  field.
* ``SERIALIZATION_METHOD()``: inject ``ParseFromString``,
  ``SerializeToString``, ``ParseFromStream``, and ``SerializeToStream``
  declarations.
The resulting classes in ``onnx.h`` closely mirror the protobuf-generated
classes so that code originally written for ``onnx`` can be adapted with
minimal changes.
----
External-data / multi-file models
----------------------------------
Large ONNX models can be split across two files: a small ``.onnx`` file that
holds the graph structure and a separate binary blob (the *external data file*)
that holds the raw tensor weights. This layout allows the structural metadata
to be inspected quickly without loading the weights and makes it possible to
memory-map only the weight region.
Saving with external data
~~~~~~~~~~~~~~~~~~~~~~~~~
Pass a ``location`` argument to ``onnxl.save`` to route tensor weights to a
separate file:
.. code-block:: python

    import onnx_light.onnx as onnxl

    # model.onnx      : graph structure only
    # model.onnx.data : all tensor weights
    onnxl.save(model, "model.onnx", location="model.onnx.data")
Serializing to two files does **not** mutate the in-memory ``ModelProto``.
``onnx_light`` applies external-data metadata on a temporary copy while writing
and keeps the original model unchanged.
The ``location`` value stored inside the ``.onnx`` metadata is automatically
reduced to a *relative* path (just the file name) when an absolute path is
provided, so the two files can be moved together without breaking the
reference.
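The path reduction can be pictured with the small helper below. It is a guess at the behavior based on the description above, not the library's code: ``stored_location`` is a hypothetical name, and leaving relative paths untouched is an assumption.

```python
import posixpath


def stored_location(location: str) -> str:
    """Return the value recorded in the metadata for a given ``location``.

    Assumption: absolute paths are reduced to their file name, while
    relative paths are stored unchanged.
    """
    if posixpath.isabs(location):
        return posixpath.basename(location)
    return location


stored = stored_location("/tmp/export/model.onnx.data")
```

``posixpath`` is used instead of ``os.path`` so the example behaves the same on every platform.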
Loading with external data
~~~~~~~~~~~~~~~~~~~~~~~~~~
When the ``.onnx`` file already references an external data file through its
tensor metadata, ``onnxl.load`` can discover and load the weights
automatically:
.. code-block:: python

    import onnx_light.onnx as onnxl

    model = onnxl.load("model.onnx", load_external_data=True)
To override the data-file location (for example when the file has been moved),
pass ``location`` explicitly:
.. code-block:: python

    import onnx_light.onnx as onnxl

    model = onnxl.load("model.onnx", location="/data/weights.bin",
                       load_external_data=True)
When ``no_copy=True`` is combined with external data, ``onnx_light`` reads
each external weights file once into a shared model-owned buffer and every
tensor points into that shared storage. This avoids one allocation and copy
per tensor while still handling split external-data files transparently.
Splitting external data across multiple files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For very large models it can be useful to cap the size of each external weight
file. Set ``max_external_file_size`` (in bytes) and ``onnxl.save`` will
automatically open a new file once the limit is reached, appending ``.1``,
``.2``, … suffixes to the base name:
.. code-block:: python

    import onnx_light.onnx as onnxl

    # Produces: model.onnx, model.onnx.data, model.onnx.data.1, …
    onnxl.save(
        model,
        "model.onnx",
        location="model.onnx.data",
        max_external_file_size=2 * 1024 ** 3,  # 2 GiB per file
    )
When loading, only the primary location (``model.onnx.data``) needs to be
specified; the loader automatically opens ``model.onnx.data.1``,
``model.onnx.data.2``, … as required.
All I/O is performed in C++ via ``TwoFilesWriteStream`` /
``TwoFilesStream``, so no Python overhead is incurred per tensor.
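The naming scheme and the rollover rule can be sketched as follows. The suffix convention comes from the description above; everything else is an assumption made for illustration, in particular that a tensor is never split across files and that an oversized tensor gets a file of its own. ``external_file_name`` and ``assign_files`` are hypothetical helpers.

```python
def external_file_name(base: str, index: int) -> str:
    """File 0 keeps the base name; later files get .1, .2, ... suffixes."""
    return base if index == 0 else f"{base}.{index}"


def assign_files(tensor_sizes: list[int], base: str, max_size: int) -> list[str]:
    """Assign each tensor to an external file, opening a new file when full.

    Assumption: tensors are written whole, so a new file is started whenever
    the next tensor would push the current (non-empty) file past ``max_size``.
    """
    names = []
    index, used = 0, 0
    for size in tensor_sizes:
        if used > 0 and used + size > max_size:
            index += 1
            used = 0
        names.append(external_file_name(base, index))
        used += size
    return names


layout = assign_files([600, 300, 500, 1200, 100], "model.onnx.data", max_size=1000)
```

With a 1000-byte cap, the first two tensors share ``model.onnx.data`` and each later tensor triggers a rollover, including the 1200-byte tensor that exceeds the cap on its own.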
----
Encrypted model save / load
-----------------------------
``onnx_light`` optionally supports saving and loading models in an
**AES-256-CBC encrypted** binary format (extension ``.onnxc``). The
standard ``onnx`` package offers no equivalent functionality.
The feature is available only when ``onnx_light`` is built with OpenSSL
(``-DONNX_LIGHT_HAS_OPENSSL``); when OpenSSL is absent the helpers raise
``NotImplementedError`` with a clear message.
File format
~~~~~~~~~~~
The encrypted file is a compact, self-contained binary:
.. code-block:: text

    Offset  Size  Field
    ------  ----  -----
    0       8     Magic: "ONNXCRY1"
    8       16    Random PBKDF2 salt
    24      16    Random AES-CBC initialisation vector
    40      N     AES-256-CBC ciphertext (PKCS#7-padded protobuf payload)
Key derivation uses **PBKDF2-HMAC-SHA256** with 100 000 iterations, which
makes brute-force attacks on the passphrase computationally expensive.
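The header layout and key derivation can be reproduced with the Python standard library alone. The sketch below splits a blob into salt, IV, and ciphertext according to the table above and derives a 32-byte key with ``hashlib.pbkdf2_hmac``; it does not perform the AES step, which needs a third-party library. The function names are hypothetical, and encoding the passphrase as UTF-8 is an assumption.

```python
import hashlib
import os
import struct

MAGIC = b"ONNXCRY1"
HEADER = struct.Struct("8s16s16s")  # magic, salt, IV: 40 bytes in total


def derive_key(passphrase: str, salt: bytes) -> bytes:
    """Derive a 32-byte (AES-256) key with PBKDF2-HMAC-SHA256, 100 000 rounds."""
    # Assumption: the passphrase is encoded as UTF-8 before derivation.
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, 100_000, dklen=32
    )


def split_encrypted_blob(blob: bytes) -> tuple[bytes, bytes, bytes]:
    """Return (salt, iv, ciphertext) after validating the magic prefix."""
    if len(blob) < HEADER.size:
        raise ValueError("blob is shorter than the 40-byte header")
    magic, salt, iv = HEADER.unpack_from(blob)
    if magic != MAGIC:
        raise ValueError("missing ONNXCRY1 magic")
    return salt, iv, blob[HEADER.size:]


# Build a fake blob with a placeholder ciphertext to exercise the parser.
fake_blob = MAGIC + os.urandom(16) + os.urandom(16) + b"\x00" * 32
salt, iv, ciphertext = split_encrypted_blob(fake_blob)
key = derive_key("my_passphrase", salt)
```

Because the salt is random per file, the same passphrase yields a different key for every encrypted model.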
Python API (file-based)
~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: python

    import onnx_light.onnx as onnxl

    # Save an encrypted model to a file
    onnxl.save_encrypted(model, "model.onnxc", key="my_passphrase")

    # Load and decrypt from a file
    model = onnxl.load_encrypted("model.onnxc", key="my_passphrase")

    # A wrong key raises RuntimeError
    try:
        onnxl.load_encrypted("model.onnxc", key="wrong")
    except RuntimeError as exc:
        print(f"decryption failed: {exc}")
Python API (in-memory / bytes)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When no file I/O is desired, the model can be encrypted to a ``bytes``
object and decrypted back directly:
.. code-block:: python

    import onnx_light.onnx as onnxl

    # Encrypt to bytes (no file written)
    blob: bytes = onnxl.save_encrypted_string(model, key="my_passphrase")

    # Decrypt from bytes
    model = onnxl.load_encrypted_string(blob, key="my_passphrase")
The ``bytes`` object produced by :func:`save_encrypted_string` is in the
same ``ONNXCRY1`` format as the file produced by :func:`save_encrypted`,
so the two forms are interchangeable.
C++ API
~~~~~~~
.. code-block:: cpp

    #include "onnx_crypt.h"

    // File-based
    ONNX_LIGHT_NAMESPACE::SaveEncryptedModel(model, "model.onnxc", "passphrase");
    ONNX_LIGHT_NAMESPACE::LoadEncryptedModel(model, "model.onnxc", "passphrase");

    // In-memory
    std::string blob = ONNX_LIGHT_NAMESPACE::SaveEncryptedModelToString(model, "passphrase");
    ONNX_LIGHT_NAMESPACE::LoadEncryptedModelFromString(model, blob, "passphrase");
See :ref:`l-api-onnx-onnx-proto-onnx-crypt` for the full C++ API reference.
----
API compatibility
-----------------
``onnx_light`` aims to be a functional subset of the ``onnx`` Python API for
the most common operations:
.. list-table::
:header-rows: 1
:widths: 40 30 30
* - Operation
- ``onnx``
- ``onnx_light``
* - Load from file
- ``onnx.load(path)``
- ``onnxl.load(path)``
* - Load from bytes
- ``onnx.load_from_string(b)``
- ``onnxl.load(b)``
* - Save to file
- ``onnx.save(model, path)``
- ``onnxl.save(model, path)``
* - Save with external data
- ``onnx.save_model(model, path, save_as_external_data=True, location=loc)``
- ``onnxl.save(model, path, location=loc)``
* - Save external data with aligned tensor offsets
- not supported
- ``opts = onnxl.SerializeOptions(); opts.alignment = 4096; model.SerializeToFile(path, opts, loc)``
* - Load with external data
- ``onnx.load(path, load_external_data=True)``
- ``onnxl.load(path, load_external_data=True)``
* - Load external data with shared no-copy buffers
- not supported
- ``onnxl.load(path, load_external_data=True, no_copy=True)``
* - Split external data
- not supported
- ``onnxl.save(model, path, location=loc, max_external_file_size=N)``
* - Save encrypted to file
- not supported
- ``onnxl.save_encrypted(model, path, key=k)``
* - Load encrypted from file
- not supported
- ``onnxl.load_encrypted(path, key=k)``
* - Save encrypted to bytes
- not supported
- ``onnxl.save_encrypted_string(model, key=k)``
* - Load encrypted from bytes
- not supported
- ``onnxl.load_encrypted_string(blob, key=k)``
* - Parse a message
- ``msg.ParseFromString(b)``
- ``msg.ParseFromString(b)``
* - Serialize a message
- ``msg.SerializeToString()``
- ``msg.SerializeToString()``
* - Parallel load
- not supported
- ``onnxl.load(path, num_threads=N)``
* - Zero-copy parse
- not supported
- ``onnxl.load(b, no_copy=True)``
* - File size limit
- 2 GB (protobuf)
- unlimited
Some helper utilities present in ``onnx`` may be missing or only partially
available in the ``onnx_light`` Python API, which focuses on fast,
protobuf-free loading and saving.
----
Summary
-------
.. list-table::
:header-rows: 1
:widths: 35 30 35
* - Aspect
- ``onnx``
- ``onnx_light``
* - Serialization runtime
- Google protobuf
- Custom C++ (no protobuf)
* - Max model size
- 2 GB
- Unlimited
* - File I/O
- Read-into-bytes
- Buffered streaming (``FileStream``, ``pread`` on POSIX)
* - Tensor loading
- Single-threaded
- Optional parallel (thread pool)
* - Raw-data copying
- Always copied
- Zero-copy option (``no_copy=True``)
* - External data (2-file)
- Yes (``save_model`` / ``load``)
- Yes (``save`` / ``load``)
* - External data no-copy shared buffers
- No
- Yes (``load(..., no_copy=True)``)
* - Split external data (N files)
- No
- Yes (``max_external_file_size``)
* - Tensor offset alignment in external files
- No
- Yes (``SerializeOptions.alignment``)
* - Standalone C++ library
- Yes
- Yes (``onnx_light::lib_onnx_proto`` for proto-only code,
``onnx_light::onnx_light`` when operator-aware APIs are needed)
* - Wire format
- ONNX binary protobuf
- ONNX binary protobuf (identical)
* - Encrypted save / load
- No
- Yes (AES-256-CBC, requires OpenSSL)