stream_class.h#

Defines

FIELD_VARINT#

FIELD_FIXED64#

FIELD_FIXED_SIZE#

FIELD_FIXED32#

SERIALIZATION_METHOD()#: Serialization/parsing API declaration macro for generated proto classes.

BEGIN_PROTO(cls, doc)#: Macro for beginning a generated proto class with a default constructor.

BEGIN_PROTO_NOINIT(cls, doc)#: Macro for beginning a generated proto class without adding a default constructor.

END_PROTO()#: Macro for ending a generated proto class and injecting the serialization/parsing API.

FIELD(type, name, order, doc)#

FIELD_DEFAULT(type, name, order, default_value, doc)#

FIELD_STR(name, order, doc)#

FIELD_REPEATED(type, name, order, doc)#

FIELD_REPEATED_STR(type, name, order, doc)#

FIELD_REPEATED_PROTO(type, name, order, doc)#

FIELD_REPEATED_PACKED(type, name, order, doc)#

_FIELD_OPTIONAL(type, name, order, doc)#

FIELD_OPTIONAL(type, name, order, doc)#

FIELD_OPTIONAL_ONEOF(type, name, order, oneof, doc)#

FIELD_OPTIONAL_ENUM(type, name, order, doc)#

namespace onnx_light

Alias that makes onnx-light headers compatible with code that references ONNX_NAMESPACE (the macro used in the standard onnx package).

Set to ONNX_LIGHT_NAMESPACE so both names resolve to the same namespace.

Symbol-visibility attribute for the public onnx-light C++ API.

Defined as empty because onnx-light does not require explicit __declspec(dllexport) or __attribute__((visibility("default"))) annotations — visibility is controlled at the shared-library level. The macro is provided so that vendored ONNX headers that decorate their declarations with ONNX_API compile without modification.

Namespace alias so that ONNX C++ code (and consumers such as onnxruntime) that refers to the literal onnx namespace — rather than the ONNX_NAMESPACE macro — resolves to the onnx-light namespace. The standard onnx package lives in namespace onnx; onnx-light uses onnx_light (via ONNX_LIGHT_NAMESPACE), so this alias keeps onnx-light a true drop-in. It is only introduced when the onnx-light namespace differs from onnx.

Enums

enum class FileLoadMode : int32_t#

Selects which file-backed BinaryStream implementation is used when parsing a model from a file path (for example via ModelProto::ParseFromFile).

kAuto (default): pick the fastest implementation that is compatible with the other options. Today that means MmapFileStream except when no_copy is true with a single-file model — see ParseFromFile for the precise selection rules.
kMmap: force usage of MmapFileStream (memory-mapped file).
kFileStream: force usage of FileStream (buffered std::ifstream).

Values:

enumerator kAuto#

enumerator kMmap#

enumerator kFileStream#

enum class SerializeFormat : int32_t#

Selects the on-disk serialization format used when parsing or serializing a ModelProto. kOnnx is the default ONNX protobuf format. kOrtFlatbuffers selects the flatbuffer-based format used by onnxruntime (*.ort files).

Values:

enumerator kOnnx#

enumerator kOrtFlatbuffers#

Functions

inline bool EnforceMaxSerializedSize(const SerializeSizeResult &total_size, const SerializeOptions &options, const char *context)#

Enforces SerializeOptions::max_serialized_size_bytes for a computed serialized size. Returns false when options.max_serialized_size_bytes is non-zero and total_size exceeds the configured limit.

template<typename T> inline bool _has_field_(const T&)#: Returns true if the field holds a non-default value (always true for scalar types other than String and raw-bytes vectors, which have their own specializations).

template<> inline bool _has_field_(const utils::String &field)#: Returns true if the string field is non-empty.

template<> inline bool _has_field_(const std::vector<uint8_t> &field)#: Returns true if the raw-bytes field is non-empty.

template<> inline bool _has_field_(const utils::ByteSpan &field)#: Returns true if the ByteSpan field is non-empty (owned or borrowed).
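The presence checks above follow a common pattern: a generic template that reports scalar fields as present, plus specializations that test for emptiness. The following is a minimal sketch of that pattern, not the library's code. The name has_field and the use of std::string in place of utils::String are assumptions made so the example compiles on its own.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

namespace sketch {

// Generic case: scalar fields are always reported as present.
template <typename T>
inline bool has_field(const T&) { return true; }

// Stand-in for the utils::String specialization (std::string is an assumption).
template <>
inline bool has_field(const std::string& field) { return !field.empty(); }

// Raw-bytes specialization: present only when the vector is non-empty.
template <>
inline bool has_field(const std::vector<std::uint8_t>& field) {
  return !field.empty();
}

// Small self-check of the three behaviors.
inline void demo() {
  assert(has_field(0));
  assert(!has_field(std::string()));
  assert(has_field(std::vector<std::uint8_t>{1}));
}

}  // namespace sketch
```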

template<typename T> void CopyProtoFrom(T &dest, const T &src)#: Copies all fields from src into dest. Generated for every proto class.

template<typename T, typename = std::enable_if_t<std::is_base_of<Message, T>::value>> inline void swap(T &a, T &b) noexcept#: ADL-visible swap for generated proto messages so that unqualified swap(a, b) calls in consuming code (e.g. onnxruntime) resolve, mirroring the friend swap that protobuf generates for every message class.
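To illustrate why an ADL-visible swap matters, the sketch below declares a constrained swap next to a stand-in Message base and calls it unqualified from outside the namespace. The types Message and NodeLike here are hypothetical stand-ins, and the move-based body is an assumption; the real implementation may exchange fields differently.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {

struct Message {};  // stand-in for the documented base class

struct NodeLike : Message {  // hypothetical generated message
  std::string name;
};

// Enabled only for types derived from Message; found through
// argument-dependent lookup because it lives in the same namespace.
template <typename T,
          typename = std::enable_if_t<std::is_base_of<Message, T>::value>>
inline void swap(T& a, T& b) noexcept {
  T tmp(std::move(a));
  a = std::move(b);
  b = std::move(tmp);
}

}  // namespace sketch

// Consumer code outside the namespace: the unqualified call resolves via ADL.
inline void exchange(sketch::NodeLike& a, sketch::NodeLike& b) { swap(a, b); }

// Returns true when the two messages ended up with each other's content.
inline bool swap_demo() {
  sketch::NodeLike a;
  a.name = "first";
  sketch::NodeLike b;
  b.name = "second";
  exchange(a, b);
  return a.name == "second" && b.name == "first";
}
```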

class Message#

#include <stream_class.h>

Base class for generated ONNX proto messages.

Subclassed by onnx_light::AttributeProto, onnx_light::DeviceConfigurationProto, onnx_light::FunctionProto, onnx_light::GraphProto, onnx_light::IntIntListEntryProto, onnx_light::MapProto, onnx_light::ModelProto, onnx_light::NodeDeviceConfigurationProto, onnx_light::NodeProto, onnx_light::OperatorSetIdProto, onnx_light::OptionalProto, onnx_light::SequenceProto, onnx_light::ShardedDimProto, onnx_light::ShardingSpecProto, onnx_light::SimpleShardedDimProto, onnx_light::SparseTensorProto, onnx_light::StringStringEntryProto, onnx_light::TensorAnnotation, onnx_light::TensorProto, onnx_light::TensorProto::Segment, onnx_light::TensorShapeProto, onnx_light::TensorShapeProto::Dimension, onnx_light::TypeProto, onnx_light::TypeProto::Map, onnx_light::TypeProto::Opaque, onnx_light::TypeProto::Optional, onnx_light::TypeProto::Sequence, onnx_light::TypeProto::SparseTensor, onnx_light::TypeProto::Tensor, onnx_light::ValueInfoProto

Public Functions

inline explicit Message()#: Constructs an empty message base object.

inline bool operator==(const Message&) const#: Throws an exception as a placeholder; generated classes provide their own operator==.

struct ParseOptions : public onnx_light::TensorBufferOptions #

#include <stream_class.h>

Controls behavior when parsing ONNX protobuf messages from a stream or string.

Public Functions

inline ParseOptions()#: Constructs a ParseOptions instance with the default raw_data_threshold of 1024 bytes.

inline bool is_parallel() const#: Returns true when parallel reading should be enabled, i.e. when num_threads is greater than 1 or negative. num_threads == 0 and num_threads == 1 both disable parallelization.

Public Members

SerializeFormat format = SerializeFormat::kOnnx #: Selects the on-disk serialization format expected when parsing. SerializeFormat::kOnnx (default) parses the ONNX protobuf wire format; SerializeFormat::kOrtFlatbuffers parses the onnxruntime flatbuffer format (.ort files). The flatbuffer path is not yet implemented and raises an error when used.

bool skip_raw_data = false#: If true, raw data is skipped instead of being read. Tensors are not valid in that case, but the model structure is still available.

int32_t num_threads = 1#

Number of threads to use for parallel reading of big blocks.

1 (default): no parallelization, everything runs on the calling thread.
> 1: use exactly this many worker threads.
< 0: choose a sensible value based on the number of available CPU cores (std::thread::hardware_concurrency()).
0: treated the same as 1 (no parallelization) for the purposes of is_parallel().
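The num_threads rules above can be summarized in a few lines. This is a sketch under stated assumptions: is_parallel mirrors the documented rule as a free function, while resolve_workers is a hypothetical helper (it does not appear in the header) that maps each documented value to a worker count.

```cpp
#include <cassert>
#include <cstdint>
#include <thread>

namespace sketch {

// Mirrors the documented rule: greater than 1 or negative enables parallel work.
inline bool is_parallel(std::int32_t num_threads) {
  return num_threads > 1 || num_threads < 0;
}

// Hypothetical helper: turns the documented values into a worker count.
inline unsigned resolve_workers(std::int32_t num_threads) {
  if (num_threads > 1) return static_cast<unsigned>(num_threads);
  if (num_threads < 0) {
    // hardware_concurrency() may return 0 when the value is not computable.
    const unsigned cores = std::thread::hardware_concurrency();
    return cores == 0 ? 1u : cores;
  }
  return 1u;  // 0 and 1: everything runs on the calling thread
}

inline void demo() {
  assert(!is_parallel(0));
  assert(resolve_workers(4) == 4u);
}

}  // namespace sketch
```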

int64_t min_parallel_block_size = 0#: Minimum raw-data block size, in bytes, submitted to the thread pool when parallel reading is enabled (is_parallel() returns true). Smaller blocks are read on the main thread to avoid thread-pool overhead.

bool no_copy = false#: If true, raw_data blocks are not copied into a new buffer. Inline protobuf raw_data borrows directly from the source bytes buffer (for example the bytes passed to ParseFromString), so the caller MUST keep that buffer alive for as long as any TensorProto references it. For external-data files, onnx-light loads each weights file once into a shared model-owned buffer and each tensor borrows a view into that buffer.
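The lifetime requirement of no_copy comes from the difference between a borrowed view and an owned copy. The sketch below shows that difference with stand-in types only: BorrowedBytes, borrow and copy_out are hypothetical names, not part of onnx-light.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Non-owning view, standing in for a borrowed raw_data block.
struct BorrowedBytes {
  const std::uint8_t* data;
  std::size_t size;
};

// Models no_copy = true: the view points into the caller's buffer.
inline BorrowedBytes borrow(const std::vector<std::uint8_t>& source,
                            std::size_t offset, std::size_t length) {
  assert(offset <= source.size() && length <= source.size() - offset);
  return BorrowedBytes{source.data() + offset, length};
}

// Models no_copy = false: the result owns an independent copy.
inline std::vector<std::uint8_t> copy_out(const std::vector<std::uint8_t>& source,
                                          std::size_t offset, std::size_t length) {
  assert(offset <= source.size() && length <= source.size() - offset);
  return std::vector<std::uint8_t>(source.begin() + offset,
                                   source.begin() + offset + length);
}

// Returns true when the view tracks the source buffer and the copy does not.
inline bool demo() {
  std::vector<std::uint8_t> source = {1, 2, 3, 4};
  const BorrowedBytes view = borrow(source, 1, 2);
  const std::vector<std::uint8_t> owned = copy_out(source, 1, 2);
  source[1] = 9;  // the caller changes its buffer after "parsing"
  return view.data[0] == 9 && owned[0] == 2;
}

}  // namespace sketch
```

Because the view shares memory with the source, destroying or reallocating the source while the view is in use would leave a dangling pointer, which is why the caller must keep the buffer alive.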

bool _touch_raw_data_pages = false#: If true, parses all tensors normally and then touches one byte per memory page in each non-empty raw_data buffer (plus the last byte). This forces lazy page faults (for example mmap-backed no-copy buffers) to occur within the parse timing window.
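Touching one byte per page can be sketched as a strided read loop. The function below is illustrative only: touch_pages is a hypothetical name, the 4096-byte page size is an assumption, and the stride starts at the beginning of the buffer rather than at a page-aligned address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Reads one byte per page_size stride plus the last byte.
// Returns the number of reads performed (0 for an empty buffer).
inline std::size_t touch_pages(const std::uint8_t* data, std::size_t size,
                               std::size_t page_size = 4096) {
  if (data == nullptr || size == 0 || page_size == 0) return 0;
  volatile std::uint8_t sink = 0;  // volatile keeps the reads from being elided
  std::size_t reads = 0;
  for (std::size_t i = 0; i < size; i += page_size) {
    sink = data[i];
    ++reads;
  }
  sink = data[size - 1];  // the last byte is always touched as well
  (void)sink;
  return reads + 1;
}

// Convenience wrapper that touches a freshly allocated buffer.
inline std::size_t demo_touch_count(std::size_t bytes) {
  std::vector<std::uint8_t> buffer(bytes, 1);
  return touch_pages(buffer.data(), buffer.size());
}

}  // namespace sketch
```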

int64_t tiny_external_data_threshold = -1#

Loads tiny external-data tensors inline during parsing when reading a model file without an explicit external weights stream.

< 0 (default): disabled.
>= 0: if a tensor is marked EXTERNAL and its external metadata declares length/size below this threshold (in bytes), parsing loads it from disk into raw_data and clears data_location and external_data.
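The threshold rule above reduces to a single predicate. In this sketch the name should_inline_external is hypothetical; it only encodes the two documented cases, with "below" read as a strict comparison.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// True when an EXTERNAL tensor of declared_length bytes would be loaded inline.
inline bool should_inline_external(std::int64_t declared_length,
                                   std::int64_t threshold) {
  if (threshold < 0) return false;     // negative threshold: feature disabled
  return declared_length < threshold;  // strictly below the threshold
}

inline void demo() {
  assert(!should_inline_external(10, -1));
  assert(should_inline_external(10, 64));
}

}  // namespace sketch
```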

FileLoadMode file_load_mode = FileLoadMode::kAuto #: Selects the file-backed BinaryStream implementation used when parsing a model from a file path (e.g. ModelProto::ParseFromFile). See FileLoadMode for the semantics of each value. Ignored when parsing from bytes/streams.

int32_t max_recursion_depth = 50#: Maximum nesting depth of protobuf sub-messages accepted while parsing. Protects the parser against stack overflow / out-of-memory caused by maliciously or accidentally deeply nested messages. Parsing raises an error when a message nests deeper than this value. The default is deliberately more conservative than protobuf’s limit of 100: the recursive-descent parser uses large per-message stack frames (especially in debug builds), so a lower limit guarantees the guard rejects the message before the recursion exhausts the platform’s default thread stack (e.g. 1 MB on Windows). It is still far above any realistic ONNX message nesting.

int32_t _recursion_depth = 0#: Internal counter tracking the current sub-message nesting depth while parsing. Managed automatically by the parser through a scoped guard; it is not a user-facing setting and is reset to 0 once a top-level parse completes.
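A scoped guard of the kind described for _recursion_depth might look like the sketch below. Options, DepthGuard, parse_nested and accepts_depth are hypothetical names and std::runtime_error is an assumed error type; the parser's actual guard is not shown in this reference.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

namespace sketch {

struct Options {  // stand-in carrying the two documented fields
  std::int32_t max_recursion_depth = 50;
  std::int32_t _recursion_depth = 0;
};

// Increments the depth on entry, throws past the limit, restores on exit.
class DepthGuard {
 public:
  explicit DepthGuard(Options& options) : options_(options) {
    if (options_._recursion_depth >= options_.max_recursion_depth) {
      throw std::runtime_error("message nesting exceeds max_recursion_depth");
    }
    ++options_._recursion_depth;
  }
  ~DepthGuard() { --options_._recursion_depth; }
  DepthGuard(const DepthGuard&) = delete;
  DepthGuard& operator=(const DepthGuard&) = delete;

 private:
  Options& options_;
};

// Hypothetical recursive parse step: each nested level holds one guard.
inline void parse_nested(Options& options, int levels) {
  DepthGuard guard(options);
  if (levels > 1) parse_nested(options, levels - 1);
}

// Returns true when a message nested `levels` deep is accepted.
inline bool accepts_depth(int levels, std::int32_t max_depth = 50) {
  Options options;
  options.max_recursion_depth = max_depth;
  bool accepted = true;
  try {
    parse_nested(options, levels);
  } catch (const std::runtime_error&) {
    accepted = false;
  }
  assert(options._recursion_depth == 0);  // counter is back to 0 either way
  return accepted;
}

}  // namespace sketch
```

Because the check happens before the increment, a constructor that throws leaves the counter untouched, and stack unwinding restores it for every guard that did succeed.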

int64_t max_tensor_size_bytes = 0#

Maximum number of bytes that may be allocated for a single tensor’s raw data (or packed repeated-field payload) during parsing. This guards against OOM caused by maliciously or accidentally large size prefixes in the wire format.

0 (default): no limit — any allocation is allowed.
> 0: parsing raises an error when the declared byte count for a single tensor allocation exceeds this value. The check fires before the allocation, so the process is never asked to commit memory larger than this threshold. Set this to a value comfortably above the largest legitimate tensor you expect, e.g. 2 GB for most models: options.max_tensor_size_bytes = 2LL * 1024 * 1024 * 1024;
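The key property of this limit is that the check runs before any memory is requested. A minimal sketch of that ordering follows; size_allowed and allocate_checked are hypothetical helpers, and std::runtime_error is an assumed error type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

namespace sketch {

// 0 means no limit; otherwise the declared size must not exceed the limit.
inline bool size_allowed(std::int64_t declared_bytes,
                         std::int64_t max_tensor_size_bytes) {
  if (declared_bytes < 0) return false;  // a negative size prefix is never valid
  return max_tensor_size_bytes == 0 || declared_bytes <= max_tensor_size_bytes;
}

// Validates the declared size first, then allocates.
inline std::vector<std::uint8_t> allocate_checked(
    std::int64_t declared_bytes, std::int64_t max_tensor_size_bytes) {
  if (!size_allowed(declared_bytes, max_tensor_size_bytes)) {
    throw std::runtime_error("declared tensor size exceeds max_tensor_size_bytes");
  }
  return std::vector<std::uint8_t>(static_cast<std::size_t>(declared_bytes));
}

inline void demo() {
  assert(allocate_checked(16, 64).size() == 16);
}

}  // namespace sketch
```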

std::function<std::function<void()>(TensorProto&)> raw_data_callback = {}#

Holds an optional callback invoked for each TensorProto once its raw_data has been parsed (including external-data tensors, after their bytes have been resolved). The callback receives the freshly parsed TensorProto and returns a deleter — a zero-argument callable invoked once when the tensor’s raw_data is released (the tensor and all copies sharing the same buffer go out of scope, or the buffer is overwritten/cleared).

This lets callers take custom ownership of tensor data and register the matching cleanup, regardless of whether the bytes live on disk (no_copy borrowed view of an mmap or external file) or in CPU memory (owned buffer): the returned deleter is attached on top of the existing storage without moving the bytes. Return an empty std::function to leave the tensor’s ownership unchanged.

By default it is empty (no callback) and parsing behaves exactly as before.
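The callback shape described above (take a tensor, return a deleter) can be exercised in isolation. The sketch below uses a stand-in TensorLike type and a hypothetical on_tensor_parsed hook that simulates the moment the parser would invoke the callback; calling the deleter by hand stands in for the release of raw_data.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

namespace sketch {

struct TensorLike {  // stand-in for TensorProto
  std::string name;
  std::vector<std::uint8_t> raw_data;
};

using RawDataCallback = std::function<std::function<void()>(TensorLike&)>;

// Simulated parser hook: runs the callback once raw_data is available.
inline std::function<void()> on_tensor_parsed(const RawDataCallback& callback,
                                              TensorLike& tensor) {
  return callback ? callback(tensor) : std::function<void()>{};
}

// Returns how many deleters ran in a two-tensor scenario.
inline int demo_release_count() {
  int released = 0;
  RawDataCallback callback =
      [&released](TensorLike& tensor) -> std::function<void()> {
    if (tensor.raw_data.empty()) return {};  // empty: ownership unchanged
    return [&released] { ++released; };      // cleanup for this tensor
  };
  TensorLike weights{"weights", {1, 2, 3}};
  TensorLike empty{"empty", {}};
  std::function<void()> first = on_tensor_parsed(callback, weights);
  std::function<void()> second = on_tensor_parsed(callback, empty);
  if (first) first();    // simulate the release of the first buffer
  if (second) second();  // no deleter was registered for the empty tensor
  return released;
}

}  // namespace sketch
```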

struct SerializeOptions : public onnx_light::TensorBufferOptions #

#include <stream_class.h>

Controls behavior when serializing ONNX protobuf messages to a stream or string.

Public Functions

inline SerializeOptions()#: Constructs a SerializeOptions instance with the default raw_data_threshold.

inline bool is_parallel() const#: Returns true when parallel writing should be enabled, i.e. when num_threads is greater than 1 or negative. num_threads == 0 and num_threads == 1 both disable parallelization.

Public Members

SerializeFormat format = SerializeFormat::kOnnx #: Selects the on-disk serialization format produced when serializing. SerializeFormat::kOnnx (default) writes the ONNX protobuf wire format; SerializeFormat::kOrtFlatbuffers writes the onnxruntime flatbuffer format (.ort files). The flatbuffer path is not yet implemented and raises an error when used.

bool skip_raw_data = false#: If true, raw data is skipped instead of being written. Tensors are not valid in that case, but the model structure is still available.

int32_t num_threads = 1#

Number of threads to use for parallel writing of big blocks.

1 (default): no parallelization, everything runs on the calling thread.
> 1: use exactly this many worker threads.
< 0: choose a sensible value based on the number of available CPU cores (std::thread::hardware_concurrency()).
0: treated the same as 1 (no parallelization) for the purposes of is_parallel().

int64_t min_parallel_block_size = 0#: Minimum raw-data block size, in bytes, submitted to the thread pool when parallel writing is enabled (is_parallel() returns true). Smaller blocks are written on the main thread to avoid thread-pool overhead.

bool use_external_data_location = true#: If true, tensors already marked with data_location=EXTERNAL are serialized to the location given in their external_data metadata, which can target multiple weights files.

int64_t max_serialized_size_bytes = 0#

Maximum serialized size in bytes allowed for one serialization operation. The limit applies to the total output size (protobuf payload + external data).

0 (default): no limit.
> 0: serialization returns false when the computed size exceeds this limit.
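The limit check can be sketched as a function of the two size components named above. This is an assumption-laden illustration: the real EnforceMaxSerializedSize takes a SerializeSizeResult whose layout is not shown in this reference, so the sketch passes the protobuf and external sizes as plain integers.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Returns false only when a limit is set and the total output exceeds it.
// Negative limits are not covered by the documentation and are not modeled.
inline bool enforce_max_serialized_size(std::int64_t protobuf_bytes,
                                        std::int64_t external_bytes,
                                        std::int64_t max_serialized_size_bytes) {
  if (max_serialized_size_bytes == 0) return true;  // 0: no limit
  return protobuf_bytes + external_bytes <= max_serialized_size_bytes;
}

inline void demo() {
  assert(enforce_max_serialized_size(100, 900, 1000));
}

}  // namespace sketch
```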

int64_t max_external_file_size = 0#: Maximum size in bytes of one external weights file when saving with external data. 0 means no limit (a single weights file).

std::function<int64_t(TensorProto&, uint8_t*, size_t, bool)> raw_data_callback = {}#

Holds an optional callback invoked for each TensorProto carrying raw_data immediately before serialization.

Serialization calls the callback twice per tensor:

size pass: fn(tensor, nullptr, 0, true) must return the number of bytes that the callback will serialize for that tensor.
fill pass: onnx-light allocates a buffer of that size, then calls fn(tensor, buffer, buffer_size, false). The callback may update the tensor metadata in place (for example dims or data_type), must fill buffer with exactly that many bytes, and must return the same size again.

When the tensor was previously marked with data_location=EXTERNAL and still carries raw_data (for example after load_external_data), serialization regenerates the external-data metadata after the callback so the stored length and offset reflect the rewritten bytes.

By default it is empty (no callback) and serialization behaves exactly as before.
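The size pass and fill pass described above can be simulated with a small driver. Everything here is a sketch: TensorLike stands in for TensorProto, run_two_pass is a hypothetical driver that imitates what the serializer is documented to do, and keep_first_half is an example callback that treats the tensor as a one-dimensional byte tensor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <vector>

namespace sketch {

struct TensorLike {  // stand-in for TensorProto
  std::vector<std::int64_t> dims;
  std::vector<std::uint8_t> raw_data;
};

using SerializeCallback =
    std::function<std::int64_t(TensorLike&, std::uint8_t*, std::size_t, bool)>;

// Hypothetical driver: size pass, allocation, then fill pass.
inline std::vector<std::uint8_t> run_two_pass(const SerializeCallback& fn,
                                              TensorLike& tensor) {
  const std::int64_t size = fn(tensor, nullptr, 0, true);  // size pass
  if (size < 0) throw std::runtime_error("callback returned a negative size");
  std::vector<std::uint8_t> buffer(static_cast<std::size_t>(size));
  const std::int64_t written =
      fn(tensor, buffer.data(), buffer.size(), false);  // fill pass
  if (written != size) throw std::runtime_error("size changed between passes");
  return buffer;
}

// Example callback: serializes only the first half of the bytes.
inline std::int64_t keep_first_half(TensorLike& tensor, std::uint8_t* buffer,
                                    std::size_t buffer_size, bool size_only) {
  const std::size_t half = tensor.raw_data.size() / 2;
  if (size_only) return static_cast<std::int64_t>(half);  // no buffer yet
  if (buffer_size != half) throw std::runtime_error("unexpected buffer size");
  std::copy_n(tensor.raw_data.begin(), half, buffer);
  tensor.dims = {static_cast<std::int64_t>(half)};  // metadata updated in place
  return static_cast<std::int64_t>(half);
}

// Runs the callback on a four-byte tensor and returns the serialized bytes.
inline std::vector<std::uint8_t> demo() {
  TensorLike tensor{{4}, {10, 20, 30, 40}};
  return run_two_pass(keep_first_half, tensor);
}

}  // namespace sketch
```

The callback computes its size from state that does not change between the two passes, which is what lets it return the same value both times.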

struct TensorBufferOptions#

#include <stream_class.h>

Common options shared by tensor buffer operations: in-place consolidation (ConsolidateTensorsToBuffer), serialization (SerializeOptions) and parsing (ParseOptions).

Subclassed by onnx_light::ParseOptions, onnx_light::SerializeOptions

Public Members

int64_t raw_data_threshold = kSmallTensorDataThresholdBytes #: Specifies the minimum raw_data size (in bytes) to include in buffer operations. Tensors whose raw_data is smaller than this threshold are left in-place.

int64_t alignment = 0#: Controls the alignment boundary for tensor offsets within the buffer. If > 0, each tensor’s offset is padded to a multiple of this many bytes. 0 disables alignment. Use 4096 for mmap-friendly page-aligned offsets.
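The interaction of raw_data_threshold and alignment can be shown with a small layout pass. This is a sketch, not the behavior of ConsolidateTensorsToBuffer itself: align_up and plan_offsets are hypothetical helpers, and -1 is used here only to mark tensors that stay in place.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

namespace sketch {

// Rounds offset up to the next multiple of alignment; 0 disables alignment.
inline std::int64_t align_up(std::int64_t offset, std::int64_t alignment) {
  if (alignment <= 0) return offset;
  const std::int64_t remainder = offset % alignment;
  return remainder == 0 ? offset : offset + (alignment - remainder);
}

// Assigns a buffer offset to every tensor at or above the threshold.
// Tensors below the threshold are left in place and reported as -1.
inline std::vector<std::int64_t> plan_offsets(
    const std::vector<std::int64_t>& sizes, std::int64_t raw_data_threshold,
    std::int64_t alignment) {
  std::vector<std::int64_t> offsets;
  offsets.reserve(sizes.size());
  std::int64_t cursor = 0;
  for (const std::int64_t size : sizes) {
    if (size < raw_data_threshold) {
      offsets.push_back(-1);
      continue;
    }
    cursor = align_up(cursor, alignment);
    offsets.push_back(cursor);
    cursor += size;
  }
  return offsets;
}

inline void demo() {
  assert(align_up(5000, 4096) == 8192);
}

}  // namespace sketch
```

With sizes of 100, 5000 and 2000 bytes, a threshold of 1024 and an alignment of 4096, the first tensor stays in place, the second starts at offset 0, and the third is padded to offset 8192.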