Converters with options#

Some converters have options to change the way a specific operator is converted. The whole list is described at Converters with options.

Option cdist for GaussianProcessRegressor#

The notebook Pairwise distances with ONNX (pdist) shows how much slower an ONNX implementation of function cdist is: from 3 to 10 times slower. One way to optimize the converted model is to create dedicated operators such as the one for function cdist. The first example shows how to convert a GaussianProcessRegressor into standard ONNX (see also CDist).

[Figure: ONNX graph of the GaussianProcessRegressor converted with standard operators. Input X, double((0, 4)), feeds a Scan node (Sc_Scan) that computes the pairwise distances to the 112 training points, followed by Transpose, Sqrt, Div, Mul, Sin, Div, Pow, Mul, Exp, MatMul, Add and Reshape, which produce output GPmean, double((0, 1)).]

Now the new model with the operator CDist.

[Figure: ONNX graph of the same model converted with option cdist. The Scan, Transpose and Sqrt nodes are replaced by a single CDist node (kgpd_CDist, metric='euclidean'); the rest of the graph (Div, Mul, Sin, Div, Pow, Mul, Exp, MatMul, Add, Reshape) is unchanged and still produces output GPmean, double((0, 1)).]

The only change is the parameter options, set to options={GaussianProcessRegressor: {'optim': 'cdist'}}. It tells the conversion function that every model sklearn.gaussian_process.GaussianProcessRegressor must be converted with the option optim='cdist'. The converter of this model checks that option and uses the custom operator CDist instead of its standard implementation based on operator Scan. Section GaussianProcess shows how large the gain is, depending on the number of observations, for this example.

Other models supporting cdist#

Pairwise distances also appear in all nearest neighbours models. The same cdist option is also supported for these models.
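To make concrete what function cdist computes, and why nearest neighbours models rely on it, here is a minimal pure-Python sketch. The helper name pairwise_euclidean and the toy data are assumptions made for illustration; this is not the code used by sklearn-onnx or by the CDist operator.

```python
import math


def pairwise_euclidean(xa, xb):
    """Return the matrix D with D[i][j] = Euclidean distance between xa[i] and xb[j]."""
    return [[math.dist(a, b) for b in xb] for a in xa]


# Toy data (hypothetical): three training rows, two query rows.
train = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
queries = [[0.0, 1.0], [6.0, 7.0]]

# One row per query, one column per training row.
dist = pairwise_euclidean(queries, train)

# A 1-nearest-neighbour lookup is an argmin over each row of the distance matrix.
nearest = [min(range(len(row)), key=row.__getitem__) for row in dist]
```

Every prediction of a nearest neighbours model needs the full distance matrix, which is why a dedicated operator for this step can matter for the runtime of the converted model.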

Option zipmap for classifiers#

By default, the library sklearn-onnx produces a list of dictionaries {label: prediction}, but this data structure takes a significant time to build. The converted model can stick to matrices by removing operator ZipMap. This is done by using option {'zipmap': False}.

[Figure: ONNX graph of a classifier converted with zipmap set to False. Input X, double((0, 4)), goes through MatMul and Add to produce raw_scores; Softmax produces output probabilities, double((0, 3)), while ArgMax, ArrayFeatureExtractor, Cast and Reshape produce output label, int64((0,)). The graph contains no ZipMap node.]
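The difference between the two output formats can be sketched in plain Python. The function zipmap below is a hypothetical stand-in that only mimics the shape of the data described above (one dictionary per row); it is not the implementation of the ONNX operator, and the numbers are made up.

```python
def zipmap(labels, probabilities):
    """Build one {label: probability} dictionary per row of the matrix."""
    return [dict(zip(labels, row)) for row in probabilities]


labels = [0, 1, 2]

# What the model returns with {'zipmap': False}: a plain matrix,
# one row per observation, one column per class.
probabilities = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]

# What the default output looks like: a list of dictionaries.
as_dicts = zipmap(labels, probabilities)
```

Building one dictionary per observation is the extra work the option avoids; the matrix already contains the same information, with the column order given by the class labels.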

Option raw_scores for classifiers#

By default, the library sklearn-onnx produces probabilities whenever it is possible for a classifier. Raw scores can usually still be obtained by using option {'raw_scores': True}.

[Figure: ONNX graph of the same classifier converted with raw_scores set to True. The Softmax node is gone: output probabilities, double((0, 3)), is directly the result of MatMul and Add, and output label, int64((0,)), is still computed with ArgMax, ArrayFeatureExtractor, Cast and Reshape.]
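For a linear classifier such as the one in these graphs, the relation between raw scores and probabilities is a softmax. The following pure-Python sketch, with made-up scores and a hypothetical helper softmax, shows that relation and why the predicted label does not depend on the option.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities (shifted by the max for numerical stability)."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical raw scores for one observation and three classes.
raw_scores = [2.0, 1.0, -1.0]
probabilities = softmax(raw_scores)

# Softmax is monotonic, so both outputs lead to the same label.
label_from_scores = argmax(raw_scores)
label_from_probabilities = argmax(probabilities)
```

With {'raw_scores': True} the second output holds the equivalent of raw_scores here instead of probabilities, so its values no longer sum to one.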

Picklability and Pipeline#

The proposed way to specify options is not always picklable. The value of id(model) depends on the execution, and mapping an option to a class may not be enough to customize the conversion. However, it is possible to specify an option the same way parameters are referenced in a scikit-learn pipeline with method get_params. The following syntaxes are supported:

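from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

pipe = Pipeline([('pca', PCA()), ('classifier', LogisticRegression())])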

options = {'classifier': {'zipmap': False}}

Or

options = {'classifier__zipmap': False}
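Why keys based on id(model) do not survive pickling can be illustrated with the standard library only. In this sketch a SimpleNamespace is a hypothetical stand-in for a fitted model, and a pickle round trip in the same process simulates saving the model and its options and loading them again.

```python
import pickle
from types import SimpleNamespace

# Hypothetical stand-in for a fitted estimator named 'classifier' in a pipeline.
model = SimpleNamespace(name="classifier")

# Options keyed by the object identity versus keyed by the step name.
options_by_id = {id(model): {"zipmap": False}}
options_by_name = {"classifier": {"zipmap": False}}

# Round trip: the restored model is a new object with a new identity.
restored = pickle.loads(pickle.dumps(model))
restored_by_id = pickle.loads(pickle.dumps(options_by_id))
restored_by_name = pickle.loads(pickle.dumps(options_by_name))

# The old identity stored in the dictionary no longer matches the restored model...
found_by_id = id(restored) in restored_by_id
# ...whereas the name still does.
found_by_name = restored.name in restored_by_name
```

Here found_by_id is False and found_by_name is True, which is the reason name-based keys are the safer choice whenever options have to be stored alongside a model.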

Options are applied to one model, not to a pipeline, as the converter replaces the pipeline structure by a single ONNX graph. Following that rule, option zipmap would not have any impact if applied to the pipeline instead of its last step. However, because there is no ambiguity about what the conversion should be, for options zipmap and nocl, the following options would have the same effect:

pipe = Pipeline([('pca', PCA()), ('classifier', LogisticRegression())])

options = {id(pipe.steps[-1][1]): {'zipmap': False}}
options = {id(pipe): {'zipmap': False}}
options = {'classifier': {'zipmap': False}}
options = {'classifier__zipmap': False}