onnx-light#
onnx without protobuf#
Files larger than 2 GB – The standard onnx package relies on protobuf, which enforces a 2 GB message-size limit and cannot load or save models that exceed that threshold. onnx-light bypasses protobuf entirely and supports arbitrarily large ONNX files.
External-data / multi-file models – Large models can be split across a .onnx structure file and one or more separate weight files. Pass a location argument to onnxl.save to route tensor weights to a separate file, and use load_external_data=True when loading them back. The max_external_file_size option automatically partitions weights across multiple files (model.data, model.data.1, …) when a single file would exceed the given byte limit.
Parallel loading – Tensor weights can be read in parallel using multiple threads, which significantly reduces wall-clock load time for large models.
Zero-copy parsing – When parsing from an in-memory bytes buffer, the no_copy=True option makes each tensor’s raw_data point directly into the source bytes without allocating an extra copy. This eliminates one malloc + memcpy per tensor initializer:
import onnx_light.onnx as onnxl
serialized = open("model.onnx", "rb").read()  # keep alive!
model = onnxl.load(serialized, no_copy=True)
# tensor.raw_data now points into 'serialized' – no extra copy
Warning: The original bytes object must remain alive for as long as the returned model is in use. This constraint does not apply to the standard onnx package.
Encrypted save / load – Models can be encrypted with AES-256-CBC (PBKDF2-HMAC-SHA256 key derivation) and saved to a single self-contained .onnxc file, or serialized to an in-memory bytes object. This feature is unique to onnx-light and requires that the package was built with OpenSSL support:
import onnx_light.onnx as onnxl
# File-based
onnxl.save_encrypted(model, "model.onnxc", key="my_passphrase")
model = onnxl.load_encrypted("model.onnxc", key="my_passphrase")
# In-memory bytes (no file I/O)
blob = onnxl.save_encrypted_string(model, key="my_passphrase")
model = onnxl.load_encrypted_string(blob, key="my_passphrase")
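To make the zero-copy lifetime warning from the feature list more concrete, the sketch below contrasts a copy with a view using only Python built-ins. It does not call onnx-light and says nothing about how the library is implemented internally. The bytearray named source is a stand-in for the serialized model bytes, and all variable names are illustrative.

```python
# Illustration with built-in types only; this is not onnx-light code.
source = bytearray(b"\x00\x01\x02\x03\x04\x05\x06\x07")  # stands in for file bytes

copied = bytes(source[2:6])      # slicing then converting copies the four bytes
view = memoryview(source)[2:6]   # a memoryview slice shares the source buffer

source[2] = 0xFF                 # modify the source after both were created

copy_first = copied[0]           # the copy still holds the old value
view_first = view[0]             # the view observes the new value
print(copy_first, view_first)    # prints: 2 255
```

A view is only as valid as the buffer behind it, which is the reason the source bytes have to outlive a model loaded with no_copy=True. A Python memoryview happens to keep its source alive by holding a reference; a data pointer held on the native side may not, so holding the reference yourself is the safe choice.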
Getting started#
Install the package in editable mode:
pip install -e .[dev] -v
or
python setup.py build_ext --inplace
To speed up compilation with multiple threads, pass --parallel (or -j)
with the number of jobs:
python setup.py build_ext --inplace --parallel 8
By default, python setup.py build_ext auto-enables parallel builds
(--parallel <cpu_count>) unless CMAKE_BUILD_PARALLEL_LEVEL is already set.
Alternatively, when installing with pip, control parallel builds using the
CMAKE_BUILD_PARALLEL_LEVEL environment variable:
CMAKE_BUILD_PARALLEL_LEVEL=8 pip install -e .[dev]
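The rule for choosing the number of build jobs can be summarized with a small sketch. This is not the project's setup.py code: resolve_parallel_jobs is a hypothetical helper, and the assumption that an explicit --parallel value takes precedence over the environment variable is ours, not something stated above.

```python
import os

def resolve_parallel_jobs(cli_jobs=None, environ=None):
    """Return the job count to pass as --parallel, or None to defer to CMake.

    Hypothetical helper illustrating the documented defaults.
    """
    environ = os.environ if environ is None else environ
    if cli_jobs is not None:
        # Assumption: an explicit --parallel / -j value always wins.
        return int(cli_jobs)
    if environ.get("CMAKE_BUILD_PARALLEL_LEVEL"):
        # The variable is already set: do not auto-enable, let CMake read it.
        return None
    # Nothing specified anywhere: fall back to the number of CPUs.
    return os.cpu_count() or 1

explicit = resolve_parallel_jobs(8, {})
deferred = resolve_parallel_jobs(None, {"CMAKE_BUILD_PARALLEL_LEVEL": "2"})
automatic = resolve_parallel_jobs(None, {})
```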
Run a quick check:
python -c "import onnx_light; print(onnx_light.__version__)"
Build and run the C++ unit tests from the editable build:
With pip install:
pip install -C build-dir=build -C cmake.build-type=Debug -C cmake.define.ONNX_LIGHT_BUILD_TESTS=ON -e .[dev]
ctest --test-dir build --output-on-failure
With setup.py:
python setup.py build_ext --inplace --build-temp build --cpp-tests
ctest --test-dir build --output-on-failure
On multi-config generators such as Visual Studio, add the matching
configuration to ctest: use -C Debug when the build was configured with
cmake.build-type=Debug, and -C Release after python setup.py
build_ext --cpp-tests.
Load a model with parallel tensor parsing:
import onnx_light.onnx
model = onnx_light.onnx.load("model.onnx", num_threads=4)
print(model.ir_version)
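For intuition about what reading tensor weights in parallel involves, here is a general-purpose sketch using only the standard library. It is not how onnx-light is implemented: the helpers read_range and read_tensors_parallel and the (offset, size) layout are hypothetical, and a temporary file stands in for a weights file.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

def read_range(path, offset, size):
    """Read `size` bytes starting at `offset`, using a private file handle."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)

def read_tensors_parallel(path, ranges, num_threads=4):
    """Read every (offset, size) range concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        return list(pool.map(lambda r: read_range(path, *r), ranges))

# Build a small stand-in weights file holding three "tensors".
payloads = [b"a" * 10, b"b" * 20, b"c" * 5]
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"".join(payloads))
    path = tmp.name

ranges = [(0, 10), (10, 20), (30, 5)]  # (offset, size) of each payload
chunks = read_tensors_parallel(path, ranges, num_threads=4)
os.remove(path)
```

Each worker opens its own file handle so that seeks do not interfere with one another. Any speedup from this pattern depends on the storage device and on tensor sizes, so it is worth measuring with a few values of num_threads.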
Source code: xadupre/onnx-light