module `onnxrt.validate.validate`#

Short summary#

module mlprodict.onnxrt.validate.validate

Validates runtime for many scikit-learn operators. The submodule relies on onnxconverter_common and sklearn-onnx.

Functions#

function	truncated documentation
`_call_conv_runtime_opset`
`_call_runtime`	Private.
`_check_run_benchmark`
`_dofit_model`
`_enumerate_validated_operator_opsets_ops`
`_enumerate_validated_operator_opsets_version`
`_retrieve_problems_extra`	Used by `enumerate_compatible_opset()`.
`_run_skl_prediction`
`enumerate_compatible_opset`	Lists all compatible opsets for a specific model.
`enumerate_validated_operator_opsets`	Tests all possible configurations for all possible operators and returns the results.

Documentation#

Validates runtime for many scikit-learn operators. The submodule relies on onnxconverter_common and sklearn-onnx.


mlprodict.onnxrt.validate.validate._call_conv_runtime_opset(obs, opsets, debug, new_conv_options, model, prob, scenario, extra, extras, conv_options, init_types, inst, optimisations, verbose, benchmark, runtime, filter_scenario, check_runtime, X_test, y_test, ypred, Xort_test, method_name, output_index, kwargs, time_limit, fLOG)#

mlprodict.onnxrt.validate.validate._call_runtime(obs_op, conv, opset, debug, inst, runtime, X_test, y_test, init_types, method_name, output_index, ypred, Xort_test, model, dump_folder, benchmark, node_time, fLOG, verbose, store_models, time_kwargs, dump_all, skip_long_test, time_limit)#

Private.


mlprodict.onnxrt.validate.validate._check_run_benchmark(benchmark, stat_onnx, bench_memo, runtime)#

mlprodict.onnxrt.validate.validate._dofit_model(dofit, obs, inst, X_train, y_train, X_test, y_test, Xort_test, init_types, store_models, debug, verbose, fLOG)#

mlprodict.onnxrt.validate.validate._enumerate_validated_operator_opsets_ops(extended_list, models, skip_models)#

mlprodict.onnxrt.validate.validate._enumerate_validated_operator_opsets_version(runtime)#

mlprodict.onnxrt.validate.validate._retrieve_problems_extra(model, verbose, fLOG, extended_list)#

Used by enumerate_compatible_opset.


mlprodict.onnxrt.validate.validate._run_skl_prediction(obs, check_runtime, assume_finite, inst, method_name, predict_kwargs, X_test, benchmark, debug, verbose, time_kwargs, skip_long_test, time_kwargs_fact, fLOG)#

mlprodict.onnxrt.validate.validate.enumerate_compatible_opset(model, opset_min=-1, opset_max=-1, check_runtime=True, debug=False, runtime='python', dump_folder=None, store_models=False, benchmark=False, assume_finite=True, node_time=False, fLOG=<built-in function print>, filter_exp=None, verbose=0, time_kwargs=None, extended_list=False, dump_all=False, n_features=None, skip_long_test=True, filter_scenario=None, time_kwargs_fact=None, time_limit=4, n_jobs=None)#

Lists all compatible opsets for a specific model.

Parameters

model – operator class
opset_min – starts with this opset
opset_max – ends with this opset (None to use current onnx opset)
check_runtime – checks that runtime can consume the model and compute predictions
debug – catch exception (True) or not (False)
runtime – test a specific runtime, by default 'python'
dump_folder – dump information to replicate in case of mismatch
dump_all – dump all models, not only the ones which fail
store_models – if True, the function also stores the fitted model and its conversion into ONNX
benchmark – if True, measures the time taken by each function to predict for different numbers of rows
fLOG – logging function
filter_exp – function which tells if the experiment must be run, None to run all, takes model, problem as an input
filter_scenario – second function which tells if the experiment must be run, None to run all, takes model, problem, scenario, extra, options as an input
node_time – collect time for each node in the ONNX graph
assume_finite – see config_context; if True, validation for finiteness is skipped, which saves time but can lead to crashes; if False, validation for finiteness is performed, which avoids such errors
verbose – verbosity
extended_list – extends the list to custom converters and problems
time_kwargs – to define a more precise way to measure a model
n_features – modifies the short datasets used to train the models so that they use exactly this number of features; it can also be a list to test multiple datasets
skip_long_test – skips tests for high values of N if they seem too long
time_kwargs_fact – see _multiply_time_kwargs
time_limit – to stop benchmarking after this amount of time was spent
n_jobs – set to the number of CPUs by default unless this value is changed

Returns

dictionaries; each one has the following keys: opset, exception if any, conversion time, problem chosen to test the conversion…
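The two filter parameters described above can be illustrated with a short sketch. It assumes that `problem` and `scenario` are passed as strings; the names `"b-cl"`, `"b-reg"` and `"slow"` and the class `DummyModel` are placeholders chosen for illustration, not values guaranteed by the library.

```python
# Hypothetical filters matching the documented argument lists.
# Assumption: `problem` and `scenario` are strings.

def filter_exp(model, problem):
    """Keeps only experiments whose problem name ends with '-cl'."""
    return problem.endswith("-cl")


def filter_scenario(model, problem, scenario, extra, options):
    """Skips a scenario named 'slow' (a made-up name) and keeps the rest."""
    return scenario != "slow"


class DummyModel:
    """Stand-in for an operator class such as a scikit-learn estimator."""


print(filter_exp(DummyModel, "b-cl"))   # True
print(filter_exp(DummyModel, "b-reg"))  # False
print(filter_scenario(DummyModel, "b-cl", "slow", {}, None))  # False
```

Both functions only need to return a truthy value when the experiment should run, so they can be passed as `filter_exp=filter_exp` and `filter_scenario=filter_scenario`.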

The function requires sklearn-onnx. The outcome can be seen at pages referenced by scikit-learn Converters and Benchmarks. The parameter time_kwargs is a dictionary which defines the number of times to repeat the same predictions in order to give more precise figures. The default value (if None) is returned by the following code:

<<<

from mlprodict.onnxrt.validate.validate_helper import default_time_kwargs
import pprint
pprint.pprint(default_time_kwargs())

>>>

    {1: {'number': 15, 'repeat': 20},
     10: {'number': 10, 'repeat': 20},
     100: {'number': 4, 'repeat': 10},
     1000: {'number': 4, 'repeat': 4},
     10000: {'number': 2, 'repeat': 2}}
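As an illustration of how a dictionary with `number` and `repeat` entries can drive a measurement, here is a minimal sketch built on the standard `timeit` module. It assumes the integer keys stand for the number of rows in the batch; `predict` and `measure` are hypothetical helpers, and this is not the benchmarking code the module itself runs.

```python
import timeit

# Assumption: each key is a number of rows, each value holds timeit settings.
# The values are deliberately small so the sketch runs quickly.
time_kwargs = {
    1: {"number": 5, "repeat": 3},
    10: {"number": 2, "repeat": 3},
}


def predict(rows):
    """Stand-in for a model prediction: sums every row."""
    return [sum(row) for row in rows]


def measure(time_kwargs):
    """Returns the best average time per call for every batch size."""
    results = {}
    for n_rows, settings in time_kwargs.items():
        data = [[float(i), float(i + 1)] for i in range(n_rows)]
        timings = timeit.repeat(
            lambda: predict(data),
            number=settings["number"],
            repeat=settings["repeat"],
        )
        # Each timing covers `number` calls, so divide to get a per-call time.
        results[n_rows] = min(timings) / settings["number"]
    return results


results = measure(time_kwargs)
for n_rows, seconds in results.items():
    print(n_rows, seconds)
```

Higher `number` and `repeat` values smooth out noise at the cost of a longer run, which is presumably why the defaults shrink as the settings go on.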

Parameter time_kwargs_fact multiplies these values for some specific models. 'lin' multiplies by 10 when the model is linear.
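The effect of such a factor can be sketched as follows. The function `multiply_time_kwargs` is a hypothetical helper written for this illustration; scaling only `number` and leaving `repeat` untouched is an assumption, and the library's own `_multiply_time_kwargs` may behave differently.

```python
import copy


def multiply_time_kwargs(time_kwargs, factor):
    """Returns a copy of time_kwargs with every 'number' multiplied by factor.

    Assumption: only 'number' is scaled, 'repeat' is left as it is.
    """
    scaled = copy.deepcopy(time_kwargs)
    for settings in scaled.values():
        settings["number"] *= factor
    return scaled


base = {1: {"number": 15, "repeat": 20}}
scaled = multiply_time_kwargs(base, 10)
print(scaled)  # {1: {'number': 150, 'repeat': 20}}
```

The deep copy keeps the original dictionary unchanged, so the same defaults can be reused for models that do not need the factor.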


mlprodict.onnxrt.validate.validate.enumerate_validated_operator_opsets(verbose=0, opset_min=-1, opset_max=-1, check_runtime=True, debug=False, runtime='python', models=None, dump_folder=None, store_models=False, benchmark=False, skip_models=None, assume_finite=True, node_time=False, fLOG=<built-in function print>, filter_exp=None, versions=False, extended_list=False, time_kwargs=None, dump_all=False, n_features=None, skip_long_test=True, fail_bad_results=False, filter_scenario=None, time_kwargs_fact=None, time_limit=4, n_jobs=None)#

Tests all possible configurations for all possible operators and returns the results.

Parameters

verbose – integer 0, 1, 2
opset_min – checks conversion starting from the opset, -1 to get the last one
opset_max – checks conversion up to this opset, None means __max_supported_opset__
check_runtime – checks the python runtime
models – only process a small list of operators, set of model names
debug – stops whenever an exception is raised
runtime – test a specific runtime, by default 'python'
dump_folder – dump information to replicate in case of mismatch
dump_all – dump all models, not only the ones which fail
store_models – if True, the function also stores the fitted model and its conversion into ONNX
benchmark – if True, measures the time taken by each function to predict for different numbers of rows
filter_exp – function which tells if the experiment must be run, None to run all, takes model, problem as an input
filter_scenario – second function which tells if the experiment must be run, None to run all, takes model, problem, scenario, extra, options as an input
skip_models – models to skip
assume_finite – see config_context; if True, validation for finiteness is skipped, which saves time but can lead to crashes; if False, validation for finiteness is performed, which avoids such errors
node_time – measure time execution for every node in the graph
versions – add columns with versions of used packages, numpy, scikit-learn, onnx, onnxruntime, sklearn-onnx
extended_list – also check models this module implements a converter for
time_kwargs – to define a more precise way to measure a model
n_features – modifies the short datasets used to train the models so that they use exactly this number of features; it can also be a list to test multiple datasets
skip_long_test – skips tests for high values of N if they seem too long
fail_bad_results – fails if the results are not aligned with scikit-learn
time_kwargs_fact – see _multiply_time_kwargs
time_limit – to skip the rest of the test after this limit (in seconds)
n_jobs – set to the number of CPUs by default unless this value is changed
fLOG – logging function

Returns

list of dictionaries
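A call restricted to one operator might look like the sketch below. It uses only parameters from the signature above and requires mlprodict, scikit-learn and sklearn-onnx to be installed, so the import is deferred until the function is called. The helper `split_rows` and the key name `"exception"` are hypothetical; inspect the returned rows for the keys the library actually produces.

```python
def validate_linear_regression():
    """Runs the validation for a single operator with the python runtime."""
    # Imported lazily: defining this function does not require mlprodict.
    from mlprodict.onnxrt.validate.validate import (
        enumerate_validated_operator_opsets,
    )

    return list(
        enumerate_validated_operator_opsets(
            verbose=0,
            models={"LinearRegression"},
            runtime="python",
            debug=False,
            benchmark=False,
        )
    )


def split_rows(rows, error_key="exception"):
    """Separates rows with and without a value under error_key.

    Assumption: error_key is a placeholder name, not a documented key.
    """
    ok = [row for row in rows if row.get(error_key) is None]
    failed = [row for row in rows if row.get(error_key) is not None]
    return ok, failed


# Hand-made rows standing in for real results.
sample = [{"opset": 11}, {"opset": 12, "exception": "conversion failed"}]
ok, failed = split_rows(sample)
print(len(ok), len(failed))  # 1 1
```

Wrapping the call in `list(...)` works whether the function returns a list or yields rows one at a time.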

The function is available through command line validate_runtime. The default for time_kwargs is the following:

<<<

from mlprodict.onnxrt.validate.validate_helper import default_time_kwargs
import pprint
pprint.pprint(default_time_kwargs())

>>>

    {1: {'number': 15, 'repeat': 20},
     10: {'number': 10, 'repeat': 20},
     100: {'number': 4, 'repeat': 10},
     1000: {'number': 4, 'repeat': 4},
     10000: {'number': 2, 'repeat': 2}}

