onnx_light.onnx.io_helper
- onnx_light.onnx.io_helper.load(f: str | Path, skip_raw_data: bool = False, raw_data_threshold: int = 1024, load_external_data: bool | None = None, num_threads: int = 1, location: str = '', min_block_size: int = 0, no_copy: bool = False, touch_raw_data_pages: bool = False, file_load_mode: FileLoadMode | str = FileLoadMode.AUTO) → ModelProto
Loads a serialized ModelProto into memory.
When f is a file path, the file is memory-mapped (mmap on POSIX, CreateFileMapping on Windows) and parsed directly out of the mapped region. The OS page cache is exposed as contiguous memory, so no per-byte system call buffering is required and parsing is comparable to parsing from an already-in-memory bytes object. The same mapping strategy applies to the external weights file when a model is stored with external data: each weights file is mapped once into a shared buffer that all tensors point into. When no_copy=True is requested with a single-file model, the loader still copies inline raw_data so that the parsed model does not depend on the lifetime of the mmap region; zero-copy of inline raw data is reserved for bytes inputs (where the caller owns the buffer) and for external weights files. When f is a bytes object, it is parsed in place using a StringStream with no additional copy.
- Parameters:
f – path or bytes
skip_raw_data – skips the raw data of every tensor; this can be used to load only the architecture of the model even if the model is stored in a single file
raw_data_threshold – if skip_raw_data is True, still keeps the tensors smaller than this size (in bytes)
load_external_data – whether to load the external data. Set to True if the data is in the same directory as the model.
num_threads – number of threads to use for parallel parsing. 1 (default) disables parallelization, > 1 uses exactly that many worker threads, and any negative value picks a sensible value based on the number of available CPU cores.
location – location of the external weights (can be different from the value stored in the main model). When load_external_data is True and this parameter is omitted, the primary external data file is auto-discovered from the tensor metadata stored in the model file.
min_block_size – minimum raw-data block size in bytes to read in parallel when num_threads != 1; tensor blocks smaller than this threshold are read on the calling thread to avoid thread-pool overhead for tiny tensors. A value of 0 (default) parallelizes all blocks.
no_copy – if True, raw tensor data is not copied into per-tensor owned buffers. Inline protobuf raw_data then points directly into the source bytes buffer. For models with external tensor data, each external weights file is loaded once into a shared model-owned buffer and every tensor points into that buffer. This avoids one memory allocation and copy per tensor and can be significantly faster for large models.
Warning
When f is a bytes object, the caller must keep that original bytes object alive for the entire lifetime of the returned model. Modifying or releasing the bytes while the model is still in use leads to undefined behaviour. External-data files do not have this lifetime requirement because onnx-light keeps the shared file buffers alive.
touch_raw_data_pages – if True, touches one byte per memory page in each non-empty tensor raw_data buffer (plus the last byte) after parsing. This forces lazy page faults (for example, in mmap-backed no-copy buffers) to happen during the load, so they are included in load timings.
file_load_mode – selects the file-backed stream implementation used when f is a file path. Accepts either a FileLoadMode value or its name as a string ("AUTO", "MMAP" or "IFSTREAM", case-insensitive). FileLoadMode.AUTO (default) lets onnx-light pick the fastest implementation compatible with the other options: currently MmapFileStream for the main model file, except when no_copy=True is set on a single-file model (in which case the buffered FileStream is used so borrowed pointers do not outlive the stream). FileLoadMode.MMAP forces memory-mapped I/O and FileLoadMode.IFSTREAM forces the buffered std::ifstream-based reader. Ignored when f is a bytes object or when an external weights file is provided via location.
- Returns:
Loaded in-memory ModelProto.
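The page-touching behaviour described for touch_raw_data_pages can be illustrated with the standard library alone. The sketch below is not onnx-light's implementation: the helper names page_touch_offsets and touch_pages and the fixed 4096-byte page size are assumptions made for illustration. It computes the offsets to read (one byte per page, plus the last byte) and reads them from an mmap-backed buffer, which is what forces the lazy page faults to occur.

```python
import mmap
import os
import tempfile


def page_touch_offsets(size: int, page_size: int = 4096) -> list[int]:
    """Offsets to read so that every page of a `size`-byte buffer is faulted in.

    Hypothetical helper: one offset per page, plus the last byte.
    """
    if size <= 0:
        return []  # empty buffers are skipped
    offsets = list(range(0, size, page_size))
    if offsets[-1] != size - 1:
        offsets.append(size - 1)
    return offsets


def touch_pages(buf, page_size: int = 4096) -> int:
    """Reads one byte at each offset and returns the sum of the bytes read."""
    total = 0
    for off in page_touch_offsets(len(buf), page_size):
        total += buf[off]  # indexing bytes or an mmap yields an int
    return total


def demo() -> int:
    # Write 10 000 bytes of value 1 to a temporary file and map it read-only.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\x01" * 10_000)
        path = tmp.name
    try:
        with open(path, "rb") as fh:
            with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
                return touch_pages(mapped)
    finally:
        os.remove(path)
```

A real implementation would query the platform page size rather than assume 4096; the constant is used here only to keep the offsets predictable.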
- onnx_light.onnx.io_helper.load_encrypted(f: str | Path, key: str | bytes, *, num_threads: int = 1, min_block_size: int = 0) → ModelProto
Decrypts and parses an AES-256-CBC encrypted ONNX model.
The file must have been produced by save_encrypted() with the same key. Decryption is performed with AES-256-CBC using a key derived from the passphrase via PBKDF2-HMAC-SHA256.
Note
This function requires that onnx-light was built with OpenSSL support (ONNX_LIGHT_HAS_OPENSSL compile-time flag).
- Parameters:
f – Source file path (str or pathlib.Path).
key – Passphrase or raw bytes (must match the one used to save). bytes values are decoded as latin-1.
num_threads – Number of threads to use for parallel parsing. 1 (default) disables parallelization, > 1 uses exactly that many worker threads, and any negative value picks a sensible value based on the number of available CPU cores.
min_block_size – Minimum block size (bytes) to parallelise when num_threads != 1.
- Returns:
The decrypted and parsed ModelProto.
- Raises:
RuntimeError – On decryption failures or I/O errors.
NotImplementedError – When OpenSSL support is not compiled in.
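The num_threads convention shared by the loading and saving functions (1 means sequential, a value above 1 is taken literally, a negative value is derived from the CPU count) can be summarised in a few lines. This is a sketch of the documented convention, not onnx-light's code: the function name resolve_num_threads is hypothetical, "a sensible value" is assumed to be the CPU count, and the handling of 0 is an assumption because the documentation does not cover it.

```python
import os


def resolve_num_threads(num_threads: int) -> int:
    """Maps the documented num_threads convention to a worker count."""
    if num_threads < 0:
        # Assumption: "a sensible value" is taken to be the CPU count here.
        return os.cpu_count() or 1
    if num_threads == 0:
        # Assumption: 0 is not documented, so this sketch rejects it.
        raise ValueError("num_threads=0 is not covered by the documentation")
    # 1 disables parallelization, > 1 means exactly that many workers.
    return num_threads
```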
- onnx_light.onnx.io_helper.load_encrypted_string(data: bytes, key: str | bytes, *, num_threads: int = 1, min_block_size: int = 0) → ModelProto
Decrypts and parses an in-memory AES-256-CBC encrypted ONNX model.
Equivalent to load_encrypted() but takes a bytes object instead of a file path. The bytes must be in ONNXCRY1 format as produced by save_encrypted_string() (or save_encrypted()).
Note
This function requires that onnx-light was built with OpenSSL support (ONNX_LIGHT_HAS_OPENSSL compile-time flag).
- Parameters:
data – Encrypted model bytes in ONNXCRY1 format.
key – Passphrase or raw bytes (must match the one used to encrypt). bytes values are decoded as latin-1.
num_threads – Number of threads to use for parallel parsing. 1 (default) disables parallelization, > 1 uses exactly that many worker threads, and any negative value picks a sensible value based on the number of available CPU cores.
min_block_size – Minimum block size (bytes) to parallelise when num_threads != 1.
- Returns:
The decrypted and parsed ModelProto.
- Raises:
RuntimeError – On decryption failures.
NotImplementedError – When OpenSSL support is not compiled in.
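The role of min_block_size can be pictured as a split between work done on the calling thread and work handed to a pool. The following is a minimal sketch of that idea using concurrent.futures, not the library's actual scheduler. The name process_blocks and the work callback are hypothetical, and num_threads is assumed to be a positive count that has already been resolved.

```python
from concurrent.futures import ThreadPoolExecutor


def process_blocks(blocks, work, num_threads=1, min_block_size=0):
    """Applies `work` to every block: small ones inline, large ones in a pool."""
    results = [None] * len(blocks)
    if num_threads == 1:
        # Parallelization disabled: everything runs on the calling thread.
        for i, block in enumerate(blocks):
            results[i] = work(block)
        return results
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = {}
        for i, block in enumerate(blocks):
            if len(block) < min_block_size:
                # Below the threshold: avoid thread-pool overhead.
                results[i] = work(block)
            else:
                futures[i] = pool.submit(work, block)
        for i, fut in futures.items():
            results[i] = fut.result()
    return results
```

With min_block_size=0 no block is below the threshold, so every block goes to the pool, which matches the documented default.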
- onnx_light.onnx.io_helper.save(proto: ModelProto, f: str | Path, format: str = 'protobuf', *, save_as_external_data: bool = False, all_tensors_to_one_file: bool = True, location: str | None = None, size_threshold: int = 1024, convert_attribute: bool = False, num_threads: int = 1, min_block_size: int = 0, max_external_file_size: int = 0) → None
Saves the ModelProto to the specified path and, optionally, serializes tensors with raw data as external data before saving. When external data is used, serialization writes through a temporary view, so the input in-memory ModelProto is left unchanged.
- Parameters:
proto – should be an in-memory ModelProto
f – can be a file-like object (has a “write” function), a string containing a file name, or a path-like object
format – the serialization format. When it is not specified, it is inferred from the file extension when f is a path. If not specified and f is not a path, ‘protobuf’ is used. The encoding is assumed to be “utf-8” when the format is a text format.
save_as_external_data – if True, saves tensors to external file(s).
all_tensors_to_one_file – effective only if save_as_external_data is True. If True, saves all tensors to one external file specified by location. If False, saves each tensor to a file named with the tensor name.
location – effective only if save_as_external_data is True. Specifies the external file that all tensors are saved to. If an absolute path is given, it is used as-is; the value stored in the ONNX metadata will be the path relative to the model file. If not specified, defaults to str(f) + ".data" (next to the model file with a .data suffix).
size_threshold – effective only if save_as_external_data is True. Threshold for the size of the data: a tensor is converted to external data only when its data size is >= size_threshold. To convert every tensor with raw data to external data, set size_threshold=0.
convert_attribute – effective only if save_as_external_data is True. If True, converts all tensors to external data. If False, converts only non-attribute tensors to external data.
num_threads – number of threads to use for parallel serialization. 1 (default) disables parallelization, > 1 uses exactly that many worker threads, and any negative value picks a sensible value based on the number of available CPU cores.
min_block_size – minimum raw-data block size in bytes to write in parallel when num_threads != 1; tensor blocks smaller than this threshold are written on the calling thread to avoid thread-pool overhead. A value of 0 (default) parallelizes all blocks.
max_external_file_size – maximum size in bytes for one external weight file when saving with external data. A value of 0 (default) means no limit.
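The rules for location (default of str(f) + ".data", and a metadata value expressed relative to the model file) can be written out as a small path computation. This is a sketch under stated assumptions rather than the library's logic: resolve_external_location is a hypothetical name, and resolving a relative location next to the model file is an assumption the documentation does not state explicitly.

```python
import os
from pathlib import Path


def resolve_external_location(f, location=None):
    """Returns (path of the weights file, value stored in the model metadata)."""
    model_path = Path(f)
    if location is None:
        # Documented default: the model path with a ".data" suffix appended.
        data_path = Path(str(f) + ".data")
    else:
        data_path = Path(location)
        if not data_path.is_absolute():
            # Assumption: relative locations are resolved next to the model.
            data_path = model_path.parent / data_path
    # The metadata holds the path relative to the model file's directory.
    stored = os.path.relpath(data_path, start=model_path.parent)
    return str(data_path), stored
```

For example, saving to models/net.onnx without a location would, under these assumptions, store net.onnx.data in the metadata.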
- onnx_light.onnx.io_helper.save_encrypted(proto: ModelProto, f: str | Path, key: str | bytes, *, num_threads: int = 1, size_threshold: int = 1024, min_block_size: int = 0) → None
Serializes and encrypts a ModelProto to a single AES-256-CBC file.
The model is first serialized to an in-memory buffer and then encrypted using AES-256-CBC. The passphrase is stretched to a 32-byte key via PBKDF2-HMAC-SHA256 (100 000 iterations). The output file is a self-contained binary that can only be loaded by load_encrypted() with the same key.
Note
This function requires that onnx-light was built with OpenSSL support (ONNX_LIGHT_HAS_OPENSSL compile-time flag).
- Parameters:
proto – The ModelProto to save.
f – Destination file path (str or pathlib.Path).
key – Passphrase or raw bytes used to derive the AES-256 key. When key is bytes it is decoded as latin-1 before PBKDF2 so that arbitrary byte values are preserved faithfully.
num_threads – Number of threads to use for parallel serialization. 1 (default) disables parallelization, > 1 uses exactly that many worker threads, and any negative value picks a sensible value based on the number of available CPU cores.
size_threshold – Minimum tensor raw-data size (bytes) that is considered “large” for the purposes of parallelisation.
min_block_size – Minimum raw-data block size (bytes) parallelised when num_threads != 1.
- Raises:
RuntimeError – On OpenSSL errors or I/O failures.
NotImplementedError – When OpenSSL support is not compiled in.
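The key-derivation step described above (PBKDF2-HMAC-SHA256, 100 000 iterations, 32-byte output, bytes keys decoded as latin-1) can be reproduced with Python's hashlib for illustration. This sketch does not read or write the ONNXCRY1 format and is not onnx-light's implementation. The function names are hypothetical, the salt is supplied by the caller because its handling is not documented here, and encoding the passphrase as UTF-8 before hashing is an assumption.

```python
import hashlib


def normalize_key(key) -> str:
    """bytes keys become str via latin-1, which maps each byte to one code point."""
    return key.decode("latin-1") if isinstance(key, bytes) else key


def derive_aes256_key(key, salt: bytes, iterations: int = 100_000) -> bytes:
    """Stretches a passphrase to a 32-byte key with PBKDF2-HMAC-SHA256."""
    passphrase = normalize_key(key)
    # Assumption: the passphrase is encoded as UTF-8 before hashing.
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )
```

Decoding with latin-1 never fails and is reversible for all 256 byte values, which is why it preserves arbitrary binary keys.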
- onnx_light.onnx.io_helper.save_encrypted_string(proto: ModelProto, key: str | bytes, *, num_threads: int = 1, size_threshold: int = 1024, min_block_size: int = 0) → bytes
Serializes and encrypts a ModelProto to an in-memory AES-256-CBC bytes object.
Equivalent to save_encrypted() but returns the ciphertext as bytes instead of writing it to a file. The returned bytes are in ONNXCRY1 format and can be decrypted with load_encrypted_string() (or load_encrypted() after writing the bytes to a file).
Note
This function requires that onnx-light was built with OpenSSL support (ONNX_LIGHT_HAS_OPENSSL compile-time flag).
- Parameters:
proto – The ModelProto to save.
key – Passphrase or raw bytes used to derive the AES-256 key. When key is bytes it is decoded as latin-1 before PBKDF2 so that arbitrary byte values are preserved faithfully.
num_threads – Number of threads to use for parallel serialization. 1 (default) disables parallelization, > 1 uses exactly that many worker threads, and any negative value picks a sensible value based on the number of available CPU cores.
size_threshold – Minimum tensor raw-data size (bytes) that is considered “large” for the purposes of parallelisation.
min_block_size – Minimum raw-data block size (bytes) parallelised when num_threads != 1.
- Returns:
Encrypted model bytes in ONNXCRY1 format.
- Raises:
RuntimeError – On OpenSSL errors.
NotImplementedError – When OpenSSL support is not compiled in.