onnx_light.onnx.io_helper

onnx_light.onnx.io_helper.load(f: str | Path, skip_raw_data: bool = False, raw_data_threshold: int = 1024, load_external_data: bool | None = None, num_threads: int = 1, location: str = '', min_block_size: int = 0, no_copy: bool = False, touch_raw_data_pages: bool = False, file_load_mode: FileLoadMode | str = FileLoadMode.AUTO) -> ModelProto

Loads a serialized ModelProto into memory.

When f is a file path, the file is memory-mapped (mmap on POSIX, CreateFileMapping on Windows) and parsed directly out of the mapped region. The OS page cache is exposed as contiguous memory, so no per-byte system call buffering is required and parsing speed is comparable to parsing an already-in-memory bytes object. The same mapping strategy applies to the external weights file when a model is stored with external data: each weights file is mapped once into a shared buffer that all tensors point into.

When no_copy=True is requested with a single-file model, the loader still copies inline raw_data so that the parsed model does not depend on the lifetime of the mmap region. Zero-copy of inline raw data is reserved for bytes inputs (where the caller owns the buffer) and for external weights files.

When f is a bytes object, it is parsed in place using a StringStream with no additional copy.
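The memory-mapping idea described above can be illustrated with Python's standard mmap module. This is only a sketch of the general technique, not onnx-light's implementation; the helper name read_prefix_via_mmap and the temporary file are hypothetical and exist purely for illustration.

```python
import mmap
import os
import tempfile


def read_prefix_via_mmap(path, nbytes):
    """Return the first `nbytes` bytes of `path`, read through a read-only mapping."""
    with open(path, "rb") as fh:
        if os.fstat(fh.fileno()).st_size == 0:
            return b""  # an empty file cannot be memory-mapped
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Slicing copies only the requested bytes out of the mapping;
            # pages that are never accessed need not be faulted in.
            return mapped[:nbytes]


# Hypothetical stand-in for a model file: arbitrary bytes, not a real ONNX model.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "blob.bin")
    with open(demo_path, "wb") as fh:
        fh.write(b"header" + bytes(100_000))
    prefix = read_prefix_via_mmap(demo_path, 6)
```

The mapping is closed before the function returns, which is why the sketch copies the prefix out. That is the same lifetime concern that makes the loader copy inline raw_data for single-file models.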

Parameters:
  • f – path or bytes

  • skip_raw_data – skips the raw data of every tensor; this can be used to load only the architecture of the model, even if the model is stored in a single file

  • raw_data_threshold – if skip_raw_data is True, still keeps the tensors smaller than this size (in bytes)

  • load_external_data – Whether to load the external data. Set to True if the data is under the same directory of the model.

  • num_threads – number of threads to use for parallel parsing. 1 (default) disables parallelization, > 1 uses exactly that many worker threads, and any negative value picks a sensible value based on the number of available CPU cores.

  • location – location of the external weights (can be different from the value stored in the main model). When load_external_data is True and this parameter is omitted, the primary external data file is auto-discovered from the tensor metadata stored in the model file.

  • min_block_size – minimum raw-data block size in bytes to read in parallel when num_threads != 1; tensor blocks smaller than this threshold are read on the calling thread to avoid thread-pool overhead for tiny tensors. A value of 0 (default) parallelizes all blocks.

  • no_copy

    if True, raw tensor data is not copied into per-tensor owned buffers. Inline protobuf raw_data then points directly into the source bytes buffer. For models with external tensor data, each external weights file is loaded once into a shared model-owned buffer and every tensor points into that buffer. This avoids one memory allocation + copy per tensor and can be significantly faster for large models.

    Warning

    When f is a bytes object, the caller must keep that original bytes object alive for the entire lifetime of the returned model. Modifying or releasing the bytes while the model is still in use leads to undefined behaviour. External-data files do not have this lifetime requirement because onnx-light keeps the shared file buffers alive.

  • touch_raw_data_pages – if True, touches one byte per memory page in each non-empty tensor raw_data buffer (plus the last byte) after parsing. This forces lazy page faults (for example mmap-backed no-copy buffers) to happen during load timing.

  • file_load_mode – selects the file-backed stream implementation used when f is a file path. Accepts either a FileLoadMode value or its name as a string ("AUTO", "MMAP" or "IFSTREAM", case-insensitive). FileLoadMode.AUTO (default) lets onnx-light pick the fastest implementation compatible with the other options — currently MmapFileStream for the main model file, except when no_copy=True is set on a single-file model (in which case the buffered FileStream is used so borrowed pointers do not outlive the stream). FileLoadMode.MMAP forces memory-mapped I/O and FileLoadMode.IFSTREAM forces the buffered std::ifstream-based reader. Ignored when f is a bytes object or when an external weights file is provided via location.
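The access pattern described for touch_raw_data_pages (one byte per memory page, plus the last byte) can be sketched in pure Python as follows. The function names are hypothetical and the real pass runs inside onnx-light's native code; the sketch only shows which offsets such a pass would read, assuming a buffer of unsigned bytes.

```python
import mmap


def page_touch_offsets(nbytes, page_size=mmap.PAGESIZE):
    """Offsets to read: the first byte of every page, plus the last byte."""
    if nbytes <= 0:
        return []  # empty buffers are skipped
    offsets = list(range(0, nbytes, page_size))
    last = nbytes - 1
    if offsets[-1] != last:
        offsets.append(last)
    return offsets


def touch_pages(buf, page_size=mmap.PAGESIZE):
    """Read one byte per page so that lazy page faults happen immediately."""
    view = memoryview(buf)
    checksum = 0
    for offset in page_touch_offsets(len(view), page_size):
        # The XOR only ensures that every read is actually performed.
        checksum ^= view[offset]
    return checksum
```

Running such a pass right after loading moves the cost of page faults into the load call, which matters mainly when load time is being benchmarked.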

Returns:

Loaded in-memory ModelProto.
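As a usage sketch, the two helpers below combine parameters documented above. The helper names are hypothetical, and the import is done inside each function so that the definitions can be read without onnx-light installed. Only keyword arguments listed in the signature are used.

```python
def load_architecture_only(path, keep_below=1024):
    """Load the graph structure while skipping large tensor payloads."""
    from onnx_light.onnx import io_helper  # imported lazily

    # Tensors smaller than `keep_below` bytes are still kept.
    return io_helper.load(path, skip_raw_data=True, raw_data_threshold=keep_below)


def load_with_all_cores(path):
    """Load a model and its external data using parallel parsing."""
    from onnx_light.onnx import io_helper  # imported lazily

    return io_helper.load(
        path,
        load_external_data=True,
        num_threads=-1,          # negative: let onnx-light pick from CPU count
        file_load_mode="MMAP",   # string form of FileLoadMode.MMAP
    )
```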

onnx_light.onnx.io_helper.load_encrypted(f: str | Path, key: str | bytes, *, num_threads: int = 1, min_block_size: int = 0) -> ModelProto

Decrypts and parses an AES-256-CBC encrypted ONNX model.

The file must have been produced by save_encrypted() with the same key. Decryption is performed with AES-256-CBC using a key derived from the passphrase via PBKDF2-HMAC-SHA256.

Note

This function requires that onnx-light was built with OpenSSL support (ONNX_LIGHT_HAS_OPENSSL compile-time flag).

Parameters:
  • f – Source file path (str or pathlib.Path).

  • key – Passphrase or raw bytes (must match the one used to save). bytes values are decoded as latin-1.

  • num_threads – Number of threads to use for parallel parsing. 1 (default) disables parallelization, > 1 uses exactly that many worker threads, and any negative value picks a sensible value based on the number of available CPU cores.

  • min_block_size – Minimum block size (bytes) to parallelise when num_threads != 1.

Returns:

The decrypted and parsed ModelProto.

Raises:

onnx_light.onnx.io_helper.load_encrypted_string(data: bytes, key: str | bytes, *, num_threads: int = 1, min_block_size: int = 0) -> ModelProto

Decrypts and parses an in-memory AES-256-CBC encrypted ONNX model.

Equivalent to load_encrypted() but takes a bytes object instead of a file path. The bytes must be in ONNXCRY1 format as produced by save_encrypted_string() (or save_encrypted()).

Note

This function requires that onnx-light was built with OpenSSL support (ONNX_LIGHT_HAS_OPENSSL compile-time flag).

Parameters:
  • data – Encrypted model bytes in ONNXCRY1 format.

  • key – Passphrase or raw bytes (must match the one used to encrypt). bytes values are decoded as latin-1.

  • num_threads – Number of threads to use for parallel parsing. 1 (default) disables parallelization, > 1 uses exactly that many worker threads, and any negative value picks a sensible value based on the number of available CPU cores.

  • min_block_size – Minimum block size (bytes) to parallelise when num_threads != 1.

Returns:

The decrypted and parsed ModelProto.

Raises:

onnx_light.onnx.io_helper.save(proto: ModelProto, f: str | Path, format: str = 'protobuf', *, save_as_external_data: bool = False, all_tensors_to_one_file: bool = True, location: str | None = None, size_threshold: int = 1024, convert_attribute: bool = False, num_threads: int = 1, min_block_size: int = 0, max_external_file_size: int = 0) -> None

Saves the ModelProto to the specified path and, optionally, serializes tensors with raw data as external data before saving. When external data is used, serialization writes through a temporary view, so the input in-memory ModelProto is left unchanged.

Parameters:
  • proto – should be an in-memory ModelProto

  • f – can be a file-like object (has “write” function) or a string containing a file name or a pathlike object

  • format – The serialization format. When it is not specified, it is inferred from the file extension when f is a path. If it is not specified and f is not a path, ‘protobuf’ is used. The encoding is assumed to be “utf-8” when the format is a text format.

  • save_as_external_data – If true, save tensors to external file(s).

  • all_tensors_to_one_file – Effective only if save_as_external_data is True. If true, save all tensors to one external file specified by location. If false, save each tensor to a file named with the tensor name.

  • location – Effective only if save_as_external_data is True. Specifies the external file that all tensors are saved to. If an absolute path is given it is used as-is; the value stored in the ONNX metadata will be the path relative to the model file. If not specified, defaults to str(f) + ".data" (next to the model file with a .data suffix).

  • size_threshold – Effective only if save_as_external_data is True. Size threshold in bytes: a tensor is converted to external data only when its data size is >= size_threshold. To convert every tensor with raw data to external data, set size_threshold=0.

  • convert_attribute – Effective only if save_as_external_data is True. If true, convert all tensors to external data. If false, convert only non-attribute tensors to external data.

  • num_threads – number of threads to use for parallel serialization. 1 (default) disables parallelization, > 1 uses exactly that many worker threads, and any negative value picks a sensible value based on the number of available CPU cores.

  • min_block_size – minimum raw-data block size in bytes to write in parallel when num_threads != 1; tensor blocks smaller than this threshold are written on the calling thread to avoid thread-pool overhead. A value of 0 (default) parallelizes all blocks.

  • max_external_file_size – maximum size in bytes for one external weight file when saving with external data. A value of 0 (default) means no limit.
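The external-data rules above can be made concrete with a small sketch. The first two helpers restate documented rules in plain Python (the default location and the >= comparison against size_threshold); the third shows one possible call. All helper names are hypothetical, and the import is lazy so the definitions can be read without onnx-light installed.

```python
def default_external_location(f):
    """Default weights file when `location` is not given: str(f) + '.data'."""
    return str(f) + ".data"


def goes_external(nbytes, size_threshold=1024):
    """A tensor is moved to external data when its size is >= the threshold."""
    return nbytes >= size_threshold


def save_with_external_weights(model, path, threshold=1024):
    """Write `model` to `path`, with large tensors in one external file."""
    from onnx_light.onnx import io_helper  # imported lazily

    io_helper.save(
        model,
        path,
        save_as_external_data=True,
        all_tensors_to_one_file=True,
        size_threshold=threshold,
        num_threads=-1,  # negative: let onnx-light pick from CPU count
    )
```

With these defaults, saving to model.onnx would place the weights in model.onnx.data next to the model file.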

onnx_light.onnx.io_helper.save_encrypted(proto: ModelProto, f: str | Path, key: str | bytes, *, num_threads: int = 1, size_threshold: int = 1024, min_block_size: int = 0) -> None

Serializes and encrypts a ModelProto to a single AES-256-CBC file.

The model is first serialized to an in-memory buffer and then encrypted using AES-256-CBC. The passphrase is stretched to a 32-byte key via PBKDF2-HMAC-SHA256 (100 000 iterations). The output file is a self-contained binary that can only be loaded by load_encrypted() with the same key.

Note

This function requires that onnx-light was built with OpenSSL support (ONNX_LIGHT_HAS_OPENSSL compile-time flag).

Parameters:
  • proto – The ModelProto to save.

  • f – Destination file path (str or pathlib.Path).

  • key – Passphrase or raw bytes used to derive the AES-256 key. When key is bytes it is decoded as latin-1 before PBKDF2 so that arbitrary byte values are preserved faithfully.

  • num_threads – Number of threads to use for parallel serialization. 1 (default) disables parallelization, > 1 uses exactly that many worker threads, and any negative value picks a sensible value based on the number of available CPU cores.

  • size_threshold – Minimum tensor raw-data size (bytes) that is considered “large” for the purposes of parallelisation.

  • min_block_size – Minimum raw-data block size (bytes) parallelised when num_threads != 1.
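The key handling described above (latin-1 decoding of bytes keys, then PBKDF2-HMAC-SHA256 with 100 000 iterations to a 32-byte key) can be sketched with Python's hashlib. Two details are assumptions, because the text does not specify them: how the salt is generated and stored, and how the passphrase string is turned back into bytes before PBKDF2. The sketch takes the salt as an argument and uses UTF-8, so its output is not guaranteed to match the key onnx-light derives.

```python
import hashlib

PBKDF2_ITERATIONS = 100_000  # iteration count stated in the description
AES256_KEY_BYTES = 32        # AES-256 uses a 32-byte key


def normalize_key(key):
    """Turn a bytes key into a str; latin-1 maps each byte to one code point."""
    if isinstance(key, bytes):
        return key.decode("latin-1")
    return key


def derive_aes256_key(key, salt):
    """Stretch a passphrase to 32 bytes with PBKDF2-HMAC-SHA256."""
    passphrase = normalize_key(key)
    # Assumption: UTF-8 encoding and a caller-supplied salt; the source
    # does not document either choice.
    return hashlib.pbkdf2_hmac(
        "sha256",
        passphrase.encode("utf-8"),
        salt,
        PBKDF2_ITERATIONS,
        dklen=AES256_KEY_BYTES,
    )
```

Latin-1 is a natural choice for the decoding step because it is a one-to-one mapping between the 256 byte values and the first 256 code points, so no byte sequence is rejected or altered.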

Raises:

onnx_light.onnx.io_helper.save_encrypted_string(proto: ModelProto, key: str | bytes, *, num_threads: int = 1, size_threshold: int = 1024, min_block_size: int = 0) -> bytes

Serializes and encrypts a ModelProto to an in-memory AES-256-CBC bytes object.

Equivalent to save_encrypted() but returns the ciphertext as bytes instead of writing it to a file. The returned bytes are in ONNXCRY1 format and can be decrypted with load_encrypted_string() (or load_encrypted() after writing the bytes to a file).

Note

This function requires that onnx-light was built with OpenSSL support (ONNX_LIGHT_HAS_OPENSSL compile-time flag).

Parameters:
  • proto – The ModelProto to save.

  • key – Passphrase or raw bytes used to derive the AES-256 key. When key is bytes it is decoded as latin-1 before PBKDF2 so that arbitrary byte values are preserved faithfully.

  • num_threads – Number of threads to use for parallel serialization. 1 (default) disables parallelization, > 1 uses exactly that many worker threads, and any negative value picks a sensible value based on the number of available CPU cores.

  • size_threshold – Minimum tensor raw-data size (bytes) that is considered “large” for the purposes of parallelisation.

  • min_block_size – Minimum raw-data block size (bytes) parallelised when num_threads != 1.

Returns:

Encrypted model bytes in ONNXCRY1 format.
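A minimal in-memory round trip, pairing save_encrypted_string() with load_encrypted_string(), might look like the sketch below. The helper name is hypothetical, the model is assumed to be an existing ModelProto, and the calls only succeed on a build of onnx-light with OpenSSL support.

```python
def encrypted_roundtrip(model, passphrase):
    """Encrypt `model` to bytes and decrypt it again with the same key."""
    from onnx_light.onnx import io_helper  # imported lazily

    blob = io_helper.save_encrypted_string(model, passphrase)
    # The same passphrase must be used for decryption.
    restored = io_helper.load_encrypted_string(blob, passphrase)
    return blob, restored
```

The returned blob could equally be written to disk and later read with load_encrypted(), as noted above.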

Raises: