Converters with options
Some converters have options that change the way
a specific operator is converted. The whole list
is described at Converters with options.
The notebook Pairwise distances with ONNX (pdist) shows how much slower
an ONNX implementation of function
cdist is: from 3 to 10 times.
One way to optimize the converted model is to
create dedicated operators such as the one for function
cdist. The first example shows how to
convert a GaussianProcessRegressor into
standard ONNX (see also CDist).
[Graph: GaussianProcessRegressor converted with standard ONNX operators. Input X (double, shape (N, 4)) and the constant training matrix (float64, shape (112, 4)) feed a Scan node, followed by Transpose, Sqrt, Div, Mul, Sin, Div, Pow, Mul, Exp, MatMul, Add and Reshape, which produces output GPmean (double, shape (N, 1)).]
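The following NumPy sketch illustrates the kind of computation a Scan-based implementation performs. It assumes that each Scan iteration handles one row of the training matrix and returns squared distances, which are then transposed and passed to a square root. This is an illustration only, not the subgraph the converter actually emits; the helper name `cdist_rowwise` is hypothetical.

```python
import numpy as np
from scipy.spatial.distance import cdist


def cdist_rowwise(X, train):
    """Euclidean distances computed one training row at a time."""
    cols = []
    for row in train:  # one Scan-like iteration per training row
        diff = X - row  # broadcast the row against every row of X
        cols.append((diff ** 2).sum(axis=1))  # squared distances to X
    squared = np.vstack(cols)  # shape (len(train), len(X))
    return np.sqrt(squared.T)  # transpose, then square root


rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))
train = rng.normal(size=(112, 4))

rowwise = cdist_rowwise(X, train)
direct = cdist(X, train, metric="euclidean")
print(rowwise.shape)
print(np.allclose(rowwise, direct))
```

A dedicated operator replaces the whole loop with a single call, which is where the speed-up is expected to come from.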
Now the new model with the operator CDist.
[Graph: GaussianProcessRegressor converted with operator CDist. Input X (double, shape (N, 4)) and the constant training matrix (float64, shape (112, 4)) feed a single CDist node (metric='euclidean'), followed by Div, Mul, Sin, Div, Pow, Mul, Exp, MatMul, Add and Reshape, which produces output GPmean (double, shape (N, 1)).]
The only change is parameter options,
set to options={GaussianProcessRegressor: {'optim': 'cdist'}}.
It tells the conversion function that every model
sklearn.gaussian_process.GaussianProcessRegressor
must be converted with the option optim='cdist'.
The converter of this model checks that option and uses custom operator CDist
instead of its standard implementation based on operator
Scan.
Section GaussianProcess shows how much is gained,
depending on the number of observations, for this example.
Pairwise distances are also used in all nearest neighbours models.
The same cdist option is also supported for these models.
By default, the library sklearn-onnx produces a list
of dictionaries {label: prediction}
but this data structure
takes a significant time to build. The converted
model can stick to matrices by removing operator ZipMap.
This is done by using option {'zipmap': False}.
[Graph: classifier converted with option zipmap=False. Input X (double, shape (N, 4)) goes through MatMul (coef, shape (4, 3)) and Add (intercept, shape (1, 3)) to give raw_scores. Softmax turns raw_scores into output probabilities (double, shape (N, 3)). ArgMax, ArrayFeatureExtractor (classes [0 1 2]), Cast, Reshape and Cast produce output label (int64, shape (N,)). The graph contains no ZipMap node.]
By default, the library sklearn-onnx produces probabilities
whenever it is possible for a classifier. Raw scores can usually
still be obtained by using option {'raw_scores': True}.
[Graph: classifier converted with option raw_scores=True. Input X (double, shape (N, 4)) goes through MatMul (coef, shape (4, 3)) and Add (intercept, shape (1, 3)), whose result is returned directly as output probabilities (double, shape (N, 3)) without a Softmax node. ArgMax, ArrayFeatureExtractor (classes [0 1 2]), Cast, Reshape and Cast produce output label (int64, shape (N,)).]
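To make the difference between raw scores and probabilities concrete, here is a sketch on the scikit-learn side only. It assumes a multiclass LogisticRegression with the default solver, for which the probabilities are the softmax of the raw scores. It does not call the converter; the option itself would be passed as, for example, options={id(clf): {'raw_scores': True}}.

```python
import numpy as np
from scipy.special import softmax
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=500).fit(X, y)

raw = clf.decision_function(X[:5])  # raw scores, one column per class
proba = clf.predict_proba(X[:5])  # probabilities, returned by default

# For this model, probabilities are the softmax of the raw scores.
same_values = np.allclose(softmax(raw, axis=1), proba)

# Softmax preserves the ranking of the classes, so the label is unchanged.
same_labels = np.array_equal(raw.argmax(axis=1), proba.argmax(axis=1))
print(same_values, same_labels)
```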
The proposed way to specify options is not always picklable.
The value returned by id(model) depends on the execution, and mapping an option
to one class may not be enough to customize the conversion.
However, it is possible to specify an option the same way
parameters are referenced in a scikit-learn pipeline
with method get_params.
The following syntaxes are supported:
pipe = Pipeline([('pca', PCA()), ('classifier', LogisticRegression())])
options = {'classifier': {'zipmap': False}}
Or:
options = {'classifier__zipmap': False}
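The reason options keyed by id(model) do not survive pickling can be seen with a small sketch. It uses only pickle and an unfitted scikit-learn estimator; the variable names are illustrative.

```python
import pickle

from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
options = {id(model): {"zipmap": False}}

# A pickle round trip creates a new object with a different identity,
# so the key stored in options no longer refers to the restored model.
restored = pickle.loads(pickle.dumps(model))

print(id(model) in options)  # True
print(id(restored) in options)  # False
```

Options keyed by a step name are plain strings and do not depend on object identity.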
Options are applied to one model, not to a pipeline, as the converter
replaces the pipeline structure by a single ONNX graph.
Following that rule, option zipmap would not have any impact
if applied to a pipeline rather than to the last step of the pipeline.
However, because there is no ambiguity about what the conversion
should be, for options zipmap and nocl, the following
options would have the same effect:
pipe = Pipeline([('pca', PCA()), ('classifier', LogisticRegression())])
options = {id(pipe.steps[-1][1]): {'zipmap': False}}
options = {id(pipe): {'zipmap': False}}
options = {'classifier': {'zipmap': False}}
options = {'classifier__zipmap': False}