module onnxrt.validate.validate#
Short summary#
module mlprodict.onnxrt.validate.validate
Validates the runtime for many scikit-learn operators. The submodule relies on onnxconverter_common and sklearn-onnx.
Functions#
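function | truncated documentation |
---|---|
_call_runtime | Private. |
_retrieve_problems_extra | Used by enumerate_compatible_opset. |
enumerate_compatible_opset | Lists all compatible opsets for a specific model. |
enumerate_validated_operator_opsets | Tests all possible configurations for all possible operators and returns the results. |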
Documentation#
- mlprodict.onnxrt.validate.validate._call_conv_runtime_opset(obs, opsets, debug, new_conv_options, model, prob, scenario, extra, extras, conv_options, init_types, inst, optimisations, verbose, benchmark, runtime, filter_scenario, check_runtime, X_test, y_test, ypred, Xort_test, method_name, output_index, kwargs, time_limit, fLOG)#
- mlprodict.onnxrt.validate.validate._call_runtime(obs_op, conv, opset, debug, inst, runtime, X_test, y_test, init_types, method_name, output_index, ypred, Xort_test, model, dump_folder, benchmark, node_time, fLOG, verbose, store_models, time_kwargs, dump_all, skip_long_test, time_limit)#
Private.
- mlprodict.onnxrt.validate.validate._check_run_benchmark(benchmark, stat_onnx, bench_memo, runtime)#
- mlprodict.onnxrt.validate.validate._dofit_model(dofit, obs, inst, X_train, y_train, X_test, y_test, Xort_test, init_types, store_models, debug, verbose, fLOG)#
- mlprodict.onnxrt.validate.validate._enumerate_validated_operator_opsets_ops(extended_list, models, skip_models)#
- mlprodict.onnxrt.validate.validate._enumerate_validated_operator_opsets_version(runtime)#
- mlprodict.onnxrt.validate.validate._retrieve_problems_extra(model, verbose, fLOG, extended_list)#
Used by enumerate_compatible_opset.
- mlprodict.onnxrt.validate.validate._run_skl_prediction(obs, check_runtime, assume_finite, inst, method_name, predict_kwargs, X_test, benchmark, debug, verbose, time_kwargs, skip_long_test, time_kwargs_fact, fLOG)#
- mlprodict.onnxrt.validate.validate.enumerate_compatible_opset(model, opset_min=-1, opset_max=-1, check_runtime=True, debug=False, runtime='python', dump_folder=None, store_models=False, benchmark=False, assume_finite=True, node_time=False, fLOG=<built-in function print>, filter_exp=None, verbose=0, time_kwargs=None, extended_list=False, dump_all=False, n_features=None, skip_long_test=True, filter_scenario=None, time_kwargs_fact=None, time_limit=4, n_jobs=None)#
Lists all compatible opsets for a specific model.
- Parameters
model – operator class
opset_min – starts with this opset
opset_max – ends with this opset (None to use current onnx opset)
check_runtime – checks that runtime can consume the model and compute predictions
debug – catch exception (True) or not (False)
runtime – tests a specific runtime, 'python' by default
dump_folder – dump information to replicate in case of mismatch
dump_all – dumps all models, not only the ones which fail
store_models – if True, the function also stores the fitted model and its conversion into ONNX
benchmark – if True, measures the time taken by each function to predict for different number of rows
fLOG – logging function
filter_exp – function which tells if the experiment must be run, None to run all, takes model, problem as an input
filter_scenario – second function which tells if the experiment must be run, None to run all, takes model, problem, scenario, extra, options as an input
node_time – collect time for each node in the ONNX graph
assume_finite – see config_context. If True, validation for finiteness is skipped, which saves time but may lead to crashes. If False, validation for finiteness is performed, which avoids such errors.
verbose – verbosity
extended_list – extends the list to custom converters and problems
time_kwargs – to define a more precise way to measure a model
n_features – modifies the short datasets used to train the models so that they use exactly this number of features; it can also be a list to test multiple datasets
skip_long_test – skips tests for high values of N if they seem too long
time_kwargs_fact – see _multiply_time_kwargs
time_limit – to stop benchmarking after this amount of time was spent
n_jobs – set to the number of CPUs by default unless this value is changed
- Returns
dictionaries, each row has the following keys: opset, exception if any, conversion time, problem chosen to test the conversion…
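As a hedged illustration of the filter_exp and filter_scenario parameters, the sketch below defines two callables with the argument lists given in the parameter descriptions. The problem name "b-cl", the scenario name "default", and the class DummyOperator are placeholders chosen for this example; they are assumptions, not values confirmed by this page.

```python
# Hypothetical allow-list of problem names; real names depend on the library.
ALLOWED_PROBLEMS = {"b-cl"}


def filter_exp(model, problem):
    """Return True when the experiment (model, problem) should run."""
    return problem in ALLOWED_PROBLEMS


def filter_scenario(model, problem, scenario, extra, options):
    """Return True when this scenario of the experiment should run."""
    # Assumption: `scenario` is a string label; only "default" is kept here.
    return scenario == "default"


class DummyOperator:
    """Stand-in for a scikit-learn operator class (illustration only)."""


# The first experiment is kept, the second is filtered out.
print(filter_exp(DummyOperator, "b-cl"))
print(filter_exp(DummyOperator, "some-other-problem"))
print(filter_scenario(DummyOperator, "b-cl", "default", {}, None))
```

Such callables would be passed as filter_exp=filter_exp and filter_scenario=filter_scenario; passing None for either parameter runs every experiment.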
The function requires sklearn-onnx. The outcome can be seen on the pages referenced by scikit-learn Converters and Benchmarks. The parameter time_kwargs is a dictionary which defines the number of times to repeat the same predictions in order to give more precise figures. The default value (if None) is returned by the following code:
<<<
from mlprodict.onnxrt.validate.validate_helper import default_time_kwargs
import pprint
pprint.pprint(default_time_kwargs())
>>>
{1: {'number': 15, 'repeat': 20},
 10: {'number': 10, 'repeat': 20},
 100: {'number': 4, 'repeat': 10},
 1000: {'number': 4, 'repeat': 4},
 10000: {'number': 2, 'repeat': 2}}
Parameter time_kwargs_fact multiplies these values for some specific models. For example, 'lin' multiplies them by 10 when the model is linear.
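The scaling described for time_kwargs_fact can be pictured with a small sketch. The helper scale_time_kwargs below is hypothetical and is not the library's _multiply_time_kwargs; it assumes that both 'number' and 'repeat' are multiplied by the same factor, which this page does not state, and it leaves out how a model is recognised as linear.

```python
# Default values as printed in the documentation.
DEFAULT_TIME_KWARGS = {
    1: {"number": 15, "repeat": 20},
    10: {"number": 10, "repeat": 20},
    100: {"number": 4, "repeat": 10},
    1000: {"number": 4, "repeat": 4},
    10000: {"number": 2, "repeat": 2},
}


def scale_time_kwargs(time_kwargs, factor):
    """Return a new dictionary with 'number' and 'repeat' scaled by factor.

    Hypothetical helper: scaling both entries is an assumption.
    """
    return {
        n_rows: {"number": kw["number"] * factor,
                 "repeat": kw["repeat"] * factor}
        for n_rows, kw in time_kwargs.items()
    }


scaled = scale_time_kwargs(DEFAULT_TIME_KWARGS, 10)
print(scaled[1])      # {'number': 150, 'repeat': 200}
print(scaled[10000])  # {'number': 20, 'repeat': 20}
```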
- mlprodict.onnxrt.validate.validate.enumerate_validated_operator_opsets(verbose=0, opset_min=-1, opset_max=-1, check_runtime=True, debug=False, runtime='python', models=None, dump_folder=None, store_models=False, benchmark=False, skip_models=None, assume_finite=True, node_time=False, fLOG=<built-in function print>, filter_exp=None, versions=False, extended_list=False, time_kwargs=None, dump_all=False, n_features=None, skip_long_test=True, fail_bad_results=False, filter_scenario=None, time_kwargs_fact=None, time_limit=4, n_jobs=None)#
Tests all possible configurations for all possible operators and returns the results.
- Parameters
verbose – integer 0, 1, 2
opset_min – checks conversion starting from the opset, -1 to get the last one
opset_max – checks conversion up to this opset, None means __max_supported_opset__
check_runtime – checks the python runtime
models – only process a small list of operators, set of model names
debug – stops whenever an exception is raised
runtime – tests a specific runtime, 'python' by default
dump_folder – dump information to replicate in case of mismatch
dump_all – dumps all models, not only the ones which fail
store_models – if True, the function also stores the fitted model and its conversion into ONNX
benchmark – if True, measures the time taken by each function to predict for different number of rows
filter_exp – function which tells if the experiment must be run, None to run all, takes model, problem as an input
filter_scenario – second function which tells if the experiment must be run, None to run all, takes model, problem, scenario, extra, options as an input
skip_models – models to skip
assume_finite – see config_context. If True, validation for finiteness is skipped, which saves time but may lead to crashes. If False, validation for finiteness is performed, which avoids such errors.
node_time – measure time execution for every node in the graph
versions – add columns with versions of used packages, numpy, scikit-learn, onnx, onnxruntime, sklearn-onnx
extended_list – also check models this module implements a converter for
time_kwargs – to define a more precise way to measure a model
n_features – modifies the short datasets used to train the models so that they use exactly this number of features; it can also be a list to test multiple datasets
skip_long_test – skips tests for high values of N if they seem too long
fail_bad_results – fails if the results are not aligned with scikit-learn
time_kwargs_fact – see _multiply_time_kwargs
time_limit – to skip the rest of the test after this limit (in seconds)
n_jobs – set to the number of CPUs by default unless this value is changed
fLOG – logging function
- Returns
list of dictionaries
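A minimal calling sketch, using only parameters that appear in the signature above, could look like the following. The wrapper validate_models is hypothetical, and the assumption that model names are scikit-learn class names such as "LinearRegression" is not confirmed by this page. The import is placed inside the function so that the helper can be defined even when mlprodict is not installed.

```python
def validate_models(model_names, runtime="python"):
    """Run the validation for a few operators and return the rows as a list.

    Hypothetical wrapper around enumerate_validated_operator_opsets.
    """
    # Lazy import: mlprodict is only needed when the function is called.
    from mlprodict.onnxrt.validate.validate import (
        enumerate_validated_operator_opsets)

    rows = enumerate_validated_operator_opsets(
        verbose=0,
        models=set(model_names),  # documented as a set of model names
        runtime=runtime,
        check_runtime=True,
        debug=False,
        benchmark=False,
    )
    # list() works whether the function returns a list or yields rows lazily.
    return list(rows)


# Example call (requires mlprodict, sklearn-onnx and scikit-learn):
# rows = validate_models({"LinearRegression"})
# for row in rows:
#     print(row)
```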
The function is also available through the command line validate_runtime. The default for time_kwargs is the following:
<<<
from mlprodict.onnxrt.validate.validate_helper import default_time_kwargs
import pprint
pprint.pprint(default_time_kwargs())
>>>
{1: {'number': 15, 'repeat': 20},
 10: {'number': 10, 'repeat': 20},
 100: {'number': 4, 'repeat': 10},
 1000: {'number': 4, 'repeat': 4},
 10000: {'number': 2, 'repeat': 2}}
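To make the role of these values more concrete, the sketch below shows one way a dictionary of this shape could drive a measurement with the standard timeit module. It assumes that 'number' and 'repeat' have the same meaning as the arguments of timeit.repeat, which this page does not confirm. The function fake_predict is a placeholder workload, not a real model prediction.

```python
import timeit

# Two entries taken from the default values shown above.
time_kwargs = {
    1: {"number": 15, "repeat": 20},
    10: {"number": 10, "repeat": 20},
}


def fake_predict(n_rows):
    """Placeholder workload standing in for a prediction on n_rows rows."""
    return sum(i * i for i in range(n_rows))


def measure(time_kwargs):
    """Return the best per-call time (in seconds) for every number of rows."""
    results = {}
    for n_rows, kw in time_kwargs.items():
        timings = timeit.repeat(
            lambda: fake_predict(n_rows),
            number=kw["number"],  # calls per timing
            repeat=kw["repeat"],  # number of timings collected
        )
        # Each timing covers `number` calls; divide to get a per-call figure.
        results[n_rows] = min(timings) / kw["number"]
    return results


results = measure(time_kwargs)
for n_rows, seconds in results.items():
    print(n_rows, seconds)
```

Smaller inputs get more calls and more repetitions because each call is short and noisy; larger inputs need fewer to give a stable figure.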