Python Runtime for ONNX#
This runtime does not depend on scikit-learn. It only depends on numpy and scipy, and it includes custom implementations in C++ (cython, pybind11).
Inference#
The main class reads an ONNX file and computes predictions with a runtime implemented in Python. The ONNX model relies on the operators listed in Python Runtime for ONNX operators.
mlprodict.onnxrt.OnnxInference
(self, onnx_or_bytes_or_stream, runtime = None, skip_run = False, inplace = True, input_inplace = False, ir_version = None, target_opset = None, runtime_options = None, session_options = None, inside_loop = False, static_inputs = None, new_outputs = None, new_opset = None, existing_functions = None)
Loads an ONNX file or object or stream. Computes the output of the ONNX graph. Several runtimes are available.
'python'
: the runtime implements every onnx operator needed to run a scikit-learn model by using numpy or C++ code.
'python_compiled'
: it is the same runtime as the previous one, except that every operator is called from a compiled function (_build_compile_run) instead of a method going through the list of operators
'onnxruntime1'
: uses onnxruntime (or onnxruntime1-cuda, …)
'onnxruntime2'
: this mode is mostly used for debugging: Python handles the call to every operator, but onnxruntime is called to run each of them. This process may fail due to wrong type inference, especially if the graph includes custom nodes; in that case, it is better to compute the output of intermediate nodes. It is much slower because every node is computed for every output, but it is more robust.
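The difference between walking a list of operators and calling them from one compiled function can be sketched on a toy graph. This is only an illustration under stated assumptions: the node tuples, the `OPS` table, and the helpers `run_interpreted` and `build_compiled` are hypothetical and do not reproduce mlprodict's internals.

```python
import operator

# Toy graph: each node is (operator name, input names, output name).
# This layout is an assumption made for the sketch, not the ONNX format.
NODES = [
    ("Add", ["X", "Y"], "s"),
    ("Mul", ["s", "X"], "p"),
    ("Neg", ["p"], "Z"),
]

# Hypothetical operator table mapping operator names to Python callables.
OPS = {"Add": operator.add, "Mul": operator.mul, "Neg": operator.neg}


def run_interpreted(nodes, inputs):
    """Goes through the list of nodes one by one and stores every result."""
    values = dict(inputs)
    for op, ins, out in nodes:
        values[out] = OPS[op](*(values[name] for name in ins))
    return values


def build_compiled(nodes, input_names, output_name):
    """Generates the source of a single function calling every operator,
    then compiles it with exec."""
    lines = ["def compiled_run(%s):" % ", ".join(input_names)]
    for op, ins, out in nodes:
        lines.append("    %s = OPS[%r](%s)" % (out, op, ", ".join(ins)))
    lines.append("    return %s" % output_name)
    namespace = {"OPS": OPS}
    exec("\n".join(lines), namespace)
    return namespace["compiled_run"]


interpreted = run_interpreted(NODES, {"X": 2.0, "Y": 3.0})
compiled_run = build_compiled(NODES, ["X", "Y"], "Z")
result = compiled_run(2.0, 3.0)
```

Both paths compute the same value for `Z`. The interpreted path also keeps every intermediate result, which is convenient for debugging, while the compiled path avoids the loop over the node list.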
check_model
(self)Checks that the model follows ONNX conventions.
get_profiling
(self, as_df = False)Returns the profiling after a couple of executions.
run
(self, inputs, clean_right_away = False, intermediate = False, verbose = 0, node_time = False, overwrite_types = None, yield_ops = None, fLOG = None)Computes the predictions for this onnx graph.
run2onnx
(self, inputs, verbose = 0, fLOG = None, as_parameter = True, suffix = '_DBG', param_name = None, node_type = 'DEBUG', domain = 'DEBUG', domain_opset = 1)Executes the graph with the given inputs, then adds the intermediate results as ONNX nodes in the original graph. Once saved, the model can be inspected with a tool such as netron.
shape_inference
(self)Infers the shape of the outputs with onnx package.
mlprodict.onnxrt.onnx_micro_inference.OnnxMicroRuntime
The following class is technically implemented as a runtime, but it performs shape inference.
mlprodict.onnxrt.OnnxShapeInference
(self, model_onnx)
Implements a micro runtime for ONNX graphs. It does not implement all operator types.
run
(self, inputs = None)Runs shape and type inference given known inputs.
The execution produces a result of type:
mlprodict.onnxrt.ops_shape.shape_container.ShapeContainer
(self)
Stores all inferred shapes as ShapeResult.
Attributes:
- shapes: dictionary { result name: ShapeResult }
- names: some dimensions are unknown and represented as variables; this dictionary keeps track of them
- names_rev: reverse dictionary of names
get
(self)Returns the value of attribute resolved_ (method resolve() must have been called first).
Method get returns a dictionary mapping each result name to the following type:
mlprodict.onnxrt.ops_shape.shape_result.ShapeResult
(self, name, shape = None, dtype = None, sparse = False, mtype = OnnxKind.Tensor, constraints = None)
Contains information about shape and type of a result in an onnx graph.
broadcast
(sh1, sh2, name = None)Broadcasts dimensions for an element-wise operator.
copy
(self, deep = False)Returns a copy of the result.
is_compatible
(self, shape)Tells if this shape is compatible with the given tuple.
merge
(self, other_result)Merges constraints from other_result into self.
n_dims
(self)Returns the number of dimensions if it is a tensor. Raises an exception otherwise.
resolve
(self, variables)Resolves variables in a shape using values stored in variables. It does not copy any constraints.
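The ideas behind broadcast, is_compatible and resolve can be sketched with plain tuples in which an unknown dimension is a string. This is a minimal sketch under that assumption; the three functions below are hypothetical stand-ins and ignore the constraints that ShapeResult tracks.

```python
def broadcast(sh1, sh2):
    """Broadcasts two shapes for an element-wise operator.
    Unknown dimensions are variable names (strings)."""
    # Left-pad the shorter shape with 1, as numpy broadcasting does.
    n = max(len(sh1), len(sh2))
    a = (1,) * (n - len(sh1)) + tuple(sh1)
    b = (1,) * (n - len(sh2)) + tuple(sh2)
    out = []
    for d1, d2 in zip(a, b):
        if d1 == d2 or d2 == 1:
            out.append(d1)
        elif d1 == 1:
            out.append(d2)
        elif isinstance(d1, str) and isinstance(d2, str):
            # Two different unknowns: this sketch does not try to decide.
            raise ValueError("cannot broadcast %r and %r" % (d1, d2))
        elif isinstance(d1, str):
            # Unknown against a known dimension > 1: the known one wins.
            out.append(d2)
        elif isinstance(d2, str):
            out.append(d1)
        else:
            raise ValueError("incompatible dimensions %r and %r" % (d1, d2))
    return tuple(out)


def is_compatible(shape, concrete):
    """Tells if a shape with unknown dimensions matches a concrete tuple."""
    return len(shape) == len(concrete) and all(
        isinstance(d, str) or d == c for d, c in zip(shape, concrete))


def resolve(shape, variables):
    """Replaces variable names by the values stored in variables."""
    return tuple(variables.get(d, d) if isinstance(d, str) else d
                 for d in shape)


merged = broadcast(("n", 1, 4), (3, 4))
resolved = resolve(merged, {"n": 5})
```

Here `merged` keeps the unknown first dimension `"n"`, and `resolve` replaces it once a value is known.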
Backend validation#
mlprodict.tools.onnx_backend.enumerate_onnx_tests
mlprodict.tools.onnx_backend.OnnxBackendTest
Python to ONNX#
mlprodict.onnx_tools.onnx_grammar.translate_fct2onnx
(fct, context = None, cpl = False, context_cpl = None, output_names = None, dtype = <class 'numpy.float32'>, verbose = 0, fLOG = None)
Translates a function into ONNX. The code it produces uses classes OnnxAbs, OnnxAdd, …
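The kind of translation described above can be sketched with the standard ast module. This is a minimal sketch, not the implementation of translate_fct2onnx: the helpers `to_onnx_expr` and `translate` are hypothetical, they only produce a string, and they only cover the two class names quoted above (OnnxAbs and OnnxAdd).

```python
import ast

# Mapping from Python constructs to the class names mentioned in the text.
BINOPS = {ast.Add: "OnnxAdd"}
CALLS = {"abs": "OnnxAbs"}


def to_onnx_expr(node):
    """Recursively converts an expression node into a string of Onnx* calls."""
    if isinstance(node, ast.Name):
        # A variable becomes the name of an input, written as a string.
        return repr(node.id)
    if isinstance(node, ast.BinOp) and type(node.op) in BINOPS:
        return "%s(%s, %s)" % (BINOPS[type(node.op)],
                               to_onnx_expr(node.left),
                               to_onnx_expr(node.right))
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id in CALLS):
        args = ", ".join(to_onnx_expr(a) for a in node.args)
        return "%s(%s)" % (CALLS[node.func.id], args)
    raise NotImplementedError("unsupported node %s" % type(node).__name__)


def translate(source):
    """Translates the return expression of the first function in source."""
    fct = ast.parse(source).body[0]
    ret = fct.body[-1]
    if not isinstance(ret, ast.Return):
        raise ValueError("the function must end with a return statement")
    return to_onnx_expr(ret.value)


SOURCE = "def f(x, y):\n    return abs(x) + y\n"
code = translate(SOURCE)
```

For the function above, `code` is the string `OnnxAdd(OnnxAbs('x'), 'y')`.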
ONNX Export#
mlprodict.onnxrt.onnx_inference_exports.OnnxInferenceExport
(self, oinf)
Implements methods to export an instance of OnnxInference into json, dot, text, python.
ONNX Structure#
mlprodict.onnx_tools.onnx_manipulations.enumerate_model_node_outputs
(model, add_node = False, order = False)
Enumerates all the nodes of a model.
mlprodict.onnx_tools.onnx_manipulations.select_model_inputs_outputs
(model, outputs = None, inputs = None, infer_shapes = False, overwrite = None, remove_unused = True, verbose = 0, fLOG = None)
Takes a model and changes its outputs.
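Changing the outputs of a model usually means keeping only the nodes needed to compute the requested results. The following is a minimal sketch of that idea on a toy graph made of dictionaries; the function `select_outputs` and the node layout are assumptions for illustration and do not use the onnx package.

```python
def select_outputs(nodes, outputs):
    """Keeps the nodes required to compute the requested outputs.
    nodes is a list of dictionaries in topological order."""
    needed = set(outputs)
    kept = []
    # Walk backwards: a node is kept if it produces a needed result,
    # and its inputs then become needed as well.
    for node in reversed(nodes):
        if needed.intersection(node["outputs"]):
            kept.append(node)
            needed.update(node["inputs"])
    kept.reverse()
    return kept


# Hypothetical classifier graph.
NODES = [
    {"name": "scale", "inputs": ["X"], "outputs": ["Xs"]},
    {"name": "linear", "inputs": ["Xs"], "outputs": ["scores"]},
    {"name": "argmax", "inputs": ["scores"], "outputs": ["label"]},
    {"name": "softmax", "inputs": ["scores"], "outputs": ["probabilities"]},
]

kept = select_outputs(NODES, ["scores"])
names = [node["name"] for node in kept]
```

Requesting the intermediate result `scores` drops the `argmax` and `softmax` nodes, which is the behaviour one would expect when unused nodes are removed.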
onnxruntime#
mlprodict.onnxrt.onnx_inference_ort.device_to_providers
mlprodict.onnxrt.onnx_inference_ort.get_ort_device
Validation of scikit-learn models#
mlprodict.onnxrt.validate.enumerate_validated_operator_opsets
(verbose = 0, opset_min = -1, opset_max = -1, check_runtime = True, debug = False, runtime = 'python', models = None, dump_folder = None, store_models = False, benchmark = False, skip_models = None, assume_finite = True, node_time = False, fLOG = <built-in function print>, filter_exp = None, versions = False, extended_list = False, time_kwargs = None, dump_all = False, n_features = None, skip_long_test = True, fail_bad_results = False, filter_scenario = None, time_kwargs_fact = None, time_limit = 4, n_jobs = None)
Tests all possible configurations for all possible operators and returns the results.
mlprodict.onnxrt.validate.side_by_side.side_by_side_by_values
(sessions, *args, inputs = None, return_results = False, **kwargs)
Compares the execution of two sessions. It calls method OnnxInference.run with intermediate=True and compares the results.
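Comparing two executions result by result can be sketched as follows. This is a minimal sketch: the function `side_by_side`, the tolerance, and the dictionaries of flat lists are assumptions standing in for the intermediate results that two runtimes would return.

```python
def side_by_side(results_a, results_b, tol=1e-5):
    """Compares two dictionaries {result name: list of floats}.
    Returns one row (name, max absolute difference) per result of the
    first run, and the name of the first result which differs."""
    rows = []
    for name, va in results_a.items():
        if name not in results_b:
            # The second run did not produce this result.
            rows.append((name, None))
            continue
        vb = results_b[name]
        diff = max(abs(x - y) for x, y in zip(va, vb))
        rows.append((name, diff))
    first_diff = next(
        (name for name, d in rows if d is None or d > tol), None)
    return rows, first_diff


# Hypothetical intermediate results of two runtimes on the same input.
run_a = {"X": [1.0, 2.0], "scores": [0.5, 0.25], "label": [0.0]}
run_b = {"X": [1.0, 2.0], "scores": [0.5, 0.75], "label": [1.0]}

rows, first_diff = side_by_side(run_a, run_b)
```

Results are listed in execution order, so `first_diff` points at the first node whose output diverges, which is usually where the investigation should start.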
mlprodict.onnxrt.validate.summary_report
(df, add_cols = None, add_index = None)
Finalizes the results computed by function enumerate_validated_operator_opsets.
mlprodict.onnxrt.validate.validate_graph.plot_validate_benchmark
C++ classes#
Gather
mlprodict.onnxrt.ops_cpu.op_gather_.GatherDouble
(self, arg0)
Implements runtime for operator Gather. The code is inspired by tfidfvectorizer.cc in onnxruntime.
mlprodict.onnxrt.ops_cpu.op_gather_.GatherFloat
(self, arg0)
Implements runtime for operator Gather. The code is inspired by tfidfvectorizer.cc in onnxruntime.
mlprodict.onnxrt.ops_cpu.op_gather_.GatherInt64
(self, arg0)
Implements runtime for operator Gather. The code is inspired by tfidfvectorizer.cc in onnxruntime.
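For reference, the result expected from operator Gather can be sketched with numpy, since the operator behaves like `numpy.take` along a given axis. This is a minimal numpy sketch, not the C++ code; the function name `gather` is hypothetical.

```python
import numpy as np


def gather(data, indices, axis=0):
    """Numpy equivalent of operator Gather: picks entries of data
    along axis at the positions given by indices."""
    return np.take(data, indices, axis=axis)


data = np.array([[1.0, 1.2], [2.3, 3.4], [4.5, 5.7]], dtype=np.float64)
indices = np.array([[0, 1], [1, 2]], dtype=np.int64)
out = gather(data, indices, axis=0)
```

The output shape is the shape of `indices` followed by the remaining dimensions of `data`, here `(2, 2, 2)`, and the dtype of `data` is preserved.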
ArrayFeatureExtractor
mlprodict.onnxrt.ops_cpu._op_onnx_numpy.array_feature_extractor_double
(arg0, arg1)
array_feature_extractor_double(arg0: numpy.ndarray[numpy.float64], arg1: numpy.ndarray[numpy.int64]) -> numpy.ndarray[numpy.float64]
C++ implementation of operator ArrayFeatureExtractor for float64. The function only works with contiguous arrays.
mlprodict.onnxrt.ops_cpu._op_onnx_numpy.array_feature_extractor_float
(arg0, arg1)
array_feature_extractor_float(arg0: numpy.ndarray[numpy.float32], arg1: numpy.ndarray[numpy.int64]) -> numpy.ndarray[numpy.float32]
C++ implementation of operator ArrayFeatureExtractor for float32. The function only works with contiguous arrays.
mlprodict.onnxrt.ops_cpu._op_onnx_numpy.array_feature_extractor_int64
(arg0, arg1)
array_feature_extractor_int64(arg0: numpy.ndarray[numpy.int64], arg1: numpy.ndarray[numpy.int64]) -> numpy.ndarray[numpy.int64]
C++ implementation of operator ArrayFeatureExtractor for int64. The function only works with contiguous arrays.
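What operator ArrayFeatureExtractor computes can be sketched with numpy indexing on the last axis. This is a minimal sketch and not the C++ function: `array_feature_extractor` is a hypothetical name, and the calls to `numpy.ascontiguousarray` only illustrate how a caller could satisfy the contiguity requirement stated above.

```python
import numpy as np


def array_feature_extractor(data, indices):
    """Selects features on the last axis of data.
    Both arrays are made contiguous first, mirroring the requirement
    of the C++ functions."""
    data = np.ascontiguousarray(data)
    indices = np.ascontiguousarray(indices, dtype=np.int64)
    return data[..., indices.ravel()]


data = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.float32)
indices = np.array([2, 0], dtype=np.int64)
out = array_feature_extractor(data, indices)

# A transposed view is not C-contiguous; ascontiguousarray makes a copy.
view = data.T
fixed = np.ascontiguousarray(view)
```

The output keeps the dtype of `data`, which matches the one-function-per-type layout of the C++ implementations.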
SVM
mlprodict.onnxrt.ops_cpu.op_svm_classifier_.RuntimeSVMClassifier
mlprodict.onnxrt.ops_cpu.op_svm_regressor_.RuntimeSVMRegressor
Tree Ensemble
mlprodict.onnxrt.ops_cpu.op_tree_ensemble_classifier_.RuntimeTreeEnsembleClassifierDouble
(self)
Implements runtime for operator TreeEnsembleClassifier. The code is inspired by tree_ensemble_classifier.cc in onnxruntime. Supports double only.
mlprodict.onnxrt.ops_cpu.op_tree_ensemble_classifier_.RuntimeTreeEnsembleClassifierFloat
(self)
Implements runtime for operator TreeEnsembleClassifier. The code is inspired by tree_ensemble_classifier.cc in onnxruntime. Supports float only.
mlprodict.onnxrt.ops_cpu.op_tree_ensemble_regressor_.RuntimeTreeEnsembleRegressorDouble
(self)
Implements double runtime for operator TreeEnsembleRegressor. The code is inspired by tree_ensemble_regressor.cc in onnxruntime. Supports double only.
mlprodict.onnxrt.ops_cpu.op_tree_ensemble_regressor_.RuntimeTreeEnsembleRegressorFloat
(self)
Implements float runtime for operator TreeEnsembleRegressor. The code is inspired by tree_ensemble_regressor.cc in onnxruntime. Supports float only.
The following classes also implement tree ensembles, in a refactored version.
mlprodict.onnxrt.ops_cpu.op_tree_ensemble_classifier_p_.RuntimeTreeEnsembleClassifierPDouble
(self, arg0, arg1, arg2, arg3)
Implements double runtime for operator TreeEnsembleClassifier. The code is inspired by tree_ensemble_classifier.cc in onnxruntime. Supports double only.
mlprodict.onnxrt.ops_cpu.op_tree_ensemble_classifier_p_.RuntimeTreeEnsembleClassifierPFloat
(self, arg0, arg1, arg2, arg3)
Implements float runtime for operator TreeEnsembleClassifier. The code is inspired by tree_ensemble_classifier.cc in onnxruntime. Supports float only.
mlprodict.onnxrt.ops_cpu.op_tree_ensemble_regressor_p_.RuntimeTreeEnsembleRegressorPDouble
(self, arg0, arg1, arg2, arg3)
Implements double runtime for operator TreeEnsembleRegressor. The code is inspired by tree_ensemble_regressor.cc in onnxruntime. Supports double only.
mlprodict.onnxrt.ops_cpu.op_tree_ensemble_regressor_p_.RuntimeTreeEnsembleRegressorPFloat
(self, arg0, arg1, arg2, arg3)
Implements float runtime for operator TreeEnsembleRegressor. The code is inspired by tree_ensemble_regressor.cc in onnxruntime. Supports float only.
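As a rough illustration of what a tree ensemble regressor computes, the sketch below stores each tree as parallel lists and sums the leaf values of all trees. Everything in it is an assumption made for the example: the dictionary keys, the "less than or equal" branching rule, and the helper names do not come from the C++ classes above.

```python
# Each tree is stored as parallel lists indexed by node id.
# feature == -1 marks a leaf; "value" is only meaningful on leaves.
TREES = [
    {   # tree 0: a single split on feature 0
        "feature": [0, -1, -1],
        "threshold": [0.5, 0.0, 0.0],
        "true_id": [1, -1, -1],
        "false_id": [2, -1, -1],
        "value": [0.0, 1.0, 2.0],
    },
    {   # tree 1: a split on feature 1, then a split on feature 0
        "feature": [1, -1, 0, -1, -1],
        "threshold": [2.0, 0.0, 1.0, 0.0, 0.0],
        "true_id": [1, -1, 3, -1, -1],
        "false_id": [2, -1, 4, -1, -1],
        "value": [0.0, 5.0, 0.0, 10.0, 20.0],
    },
]


def predict_tree(tree, x):
    """Walks one tree from the root down to a leaf."""
    node = 0
    while tree["feature"][node] >= 0:
        feature = tree["feature"][node]
        if x[feature] <= tree["threshold"][node]:
            node = tree["true_id"][node]
        else:
            node = tree["false_id"][node]
    return tree["value"][node]


def predict_ensemble(trees, x):
    """Sums the contribution of every tree (no averaging, no base value)."""
    return sum(predict_tree(tree, x) for tree in trees)


prediction = predict_ensemble(TREES, [0.2, 3.0])
```

Flat lists of this kind are easy to traverse in a tight loop, which is one plausible reason to implement such a runtime in C++ with one class per floating point type.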
Topk
mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_max_double
(arg0, arg1, arg2, arg3)
topk_element_max_double(arg0: numpy.ndarray[numpy.float64], arg1: int, arg2: bool, arg3: int) -> numpy.ndarray[numpy.int64]
C++ implementation of operator TopK for float64. The function only works with contiguous arrays. The function is parallelized for more than th_para rows. It only operates on the last axis.
mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_max_float
(arg0, arg1, arg2, arg3)
topk_element_max_float(arg0: numpy.ndarray[numpy.float32], arg1: int, arg2: bool, arg3: int) -> numpy.ndarray[numpy.int64]
C++ implementation of operator TopK for float32. The function only works with contiguous arrays. The function is parallelized for more than th_para rows. It only operates on the last axis.
mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_max_int64
(arg0, arg1, arg2, arg3)
topk_element_max_int64(arg0: numpy.ndarray[numpy.int64], arg1: int, arg2: bool, arg3: int) -> numpy.ndarray[numpy.int64]
C++ implementation of operator TopK for int64. The function only works with contiguous arrays. The function is parallelized for more than th_para rows. It only operates on the last axis.
mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_min_double
(arg0, arg1, arg2, arg3)
topk_element_min_double(arg0: numpy.ndarray[numpy.float64], arg1: int, arg2: bool, arg3: int) -> numpy.ndarray[numpy.int64]
C++ implementation of operator TopK for float64. The function only works with contiguous arrays. The function is parallelized for more than th_para rows. It only operates on the last axis.
mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_min_float
(arg0, arg1, arg2, arg3)
topk_element_min_float(arg0: numpy.ndarray[numpy.float32], arg1: int, arg2: bool, arg3: int) -> numpy.ndarray[numpy.int64]
C++ implementation of operator TopK for float32. The function only works with contiguous arrays. The function is parallelized for more than th_para rows. It only operates on the last axis.
mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_min_int64
(arg0, arg1, arg2, arg3)
topk_element_min_int64(arg0: numpy.ndarray[numpy.int64], arg1: int, arg2: bool, arg3: int) -> numpy.ndarray[numpy.int64]
C++ implementation of operator TopK for int64. The function only works with contiguous arrays. The function is parallelized for more than th_para rows. It only operates on the last axis.
mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_fetch_double
(arg0, arg1)
topk_element_fetch_double(arg0: numpy.ndarray[numpy.float64], arg1: numpy.ndarray[numpy.int64]) -> numpy.ndarray[numpy.float64]
Fetches the top k elements given their indices on each row (the last dimension for a multidimensional array).
mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_fetch_float
(arg0, arg1)
topk_element_fetch_float(arg0: numpy.ndarray[numpy.float32], arg1: numpy.ndarray[numpy.int64]) -> numpy.ndarray[numpy.float32]
Fetches the top k elements given their indices on each row (the last dimension for a multidimensional array).
mlprodict.onnxrt.ops_cpu._op_onnx_numpy.topk_element_fetch_int64
(arg0, arg1)
topk_element_fetch_int64(arg0: numpy.ndarray[numpy.int64], arg1: numpy.ndarray[numpy.int64]) -> numpy.ndarray[numpy.int64]
Fetches the top k elements given their indices on each row (the last dimension for a multidimensional array).
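The two steps described in this section, computing the indices of the top k elements on the last axis and then fetching the corresponding values, can be sketched with numpy. This is a minimal sketch and not the C++ code: `topk_indices` and `topk_fetch` are hypothetical names, the parameters `k` and `largest` are assumptions (the meaning of arg1 to arg3 is not documented on this page), and no parallelization is attempted.

```python
import numpy as np


def topk_indices(x, k, largest=True):
    """Returns the indices (int64) of the k largest or smallest values
    on the last axis, sorted from best to worst."""
    # Sorting the negated array gives a descending order.
    key = -x if largest else x
    order = np.argsort(key, axis=-1, kind="stable")
    return order[..., :k].astype(np.int64)


def topk_fetch(x, indices):
    """Fetches the values of x at the given indices on each row."""
    return np.take_along_axis(x, indices, axis=-1)


x = np.array([[0.1, 0.9, 0.5], [3.0, 1.0, 2.0]], dtype=np.float32)
idx_max = topk_indices(x, 2, largest=True)
idx_min = topk_indices(x, 2, largest=False)
values = topk_fetch(x, idx_max)
```

Splitting the work this way mirrors the signatures above: the index functions always return int64 arrays, while the fetch functions return an array with the dtype of the input.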