2021-05-05 Numpy API for ONNX and scikit-learn (part I)
sklearn-onnx converts most pipelines that include numerical preprocessing or predictors, but it fails whenever custom code is involved. That covers the use of FunctionTransformer or a new model inheriting from BaseEstimator. To be successful, the conversion needs a way to convert the custom code into ONNX. The solution proposed here is to bypass that complex step (rewriting a Python function with ONNX operators) by writing the custom code directly with ONNX operators. However, even though most of the operators are close to numpy functions, they are not the same. To avoid spending time looking them up, many numpy functions were implemented with ONNX operators. The custom function or predictor can then simply be implemented with this API to build a single ONNX graph executed with a runtime.
The next sections take some examples from Numpy to ONNX: Create ONNX graphs with an API similar to numpy.
numpy API for ONNX
Let’s take an example with a FunctionTransformer.
The mechanism is similar to what pytorch or tensorflow
put in place: write a graph assuming every node processes a variable.
Then the user instantiates a variable and executes the graph.
It works the same with ONNX. The following snippet implements the
function log(1 + x).
import numpy as np
import mlprodict.npy.numpy_onnx_impl as npnx
def onnx_log_1(x):
    # log(1 + x) written with the ONNX-backed numpy API
    return npnx.log(x + np.float32(1))
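The mechanism described above (write a graph on variables, then execute it) can be sketched without any ONNX dependency. The block below is only an illustration in plain numpy: the class `Var` and the helper `log` are hypothetical names invented for this sketch and are not part of mlprodict.

```python
import numpy as np


class Var:
    """Hypothetical stand-in for a graph variable: it records operations
    instead of computing them."""

    def __init__(self, op, *inputs):
        self.op = op          # name of the operation, or "input"
        self.inputs = inputs  # child nodes or constants

    def __add__(self, other):
        # Record the addition as a new node; nothing is computed here.
        return Var("add", self, other)

    def run(self, value):
        """Walk the recorded graph and evaluate it with numpy."""
        if self.op == "input":
            return value
        args = [a.run(value) if isinstance(a, Var) else a
                for a in self.inputs]
        if self.op == "add":
            return args[0] + args[1]
        if self.op == "log":
            return np.log(args[0])
        raise ValueError(f"unknown op {self.op!r}")


def log(v):
    """Graph-building counterpart of np.log (hypothetical helper)."""
    return Var("log", v)


def log_1(x):
    # Same body as onnx_log_1: written once against a variable.
    return log(x + np.float32(1))


graph = log_1(Var("input"))   # builds the graph, computes nothing
x = np.array([[1, 2], [3, 4]], dtype=np.float32)
result = graph.run(x)         # executes the graph on real data
```

Calling `log_1` on a `Var` only records nodes; the numerical work happens in `run`, which plays the role a runtime plays for an ONNX graph.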
The list of implemented functions is in module npy.numpy_onnx_impl. ONNX is strongly typed, so the types need to be specified with annotations.
from typing import Any
import numpy as np
from mlprodict.npy import NDArray
import mlprodict.npy.numpy_onnx_impl as npnx
def onnx_log_1(x: NDArray[Any, np.float32]) -> NDArray[(None, None), np.float32]:
    return npnx.log(x + np.float32(1))
And finally, this function does not run on a numpy array, as every
function expects a variable (see OnnxVariable) to define an ONNX graph
which can be executed with a runtime. That’s the purpose of the decorator
onnxnumpy_default.
from typing import Any
import numpy as np
from mlprodict.npy import onnxnumpy_default, NDArray
import mlprodict.npy.numpy_onnx_impl as npnx
@onnxnumpy_default
def onnx_log_1(x: NDArray[Any, np.float32]) -> NDArray[(None, None), np.float32]:
    return npnx.log(x + np.float32(1))
x = np.array([[1, 2], [3, 4]], dtype=np.float32)
print(onnx_log_1(x))
print(type(onnx_log_1))

[[0.693 1.099]
 [1.386 1.609]]
<class 'mlprodict.npy.onnx_numpy_wrapper.onnxnumpy_onnx_log_1_None_None'>
onnx_log_1 is not a function but an instance of a class which defines the operator __call__. That class holds the ONNX graph and all the information sklearn-onnx needs to convert any pipeline using it, once a new converter for FunctionTransformer is registered to handle this API.
The ONNX graph is created when the function is called for the first time, and it is then loaded by the runtime. That explains why the first call is much slower than all the other calls.
from mlprodict.onnx_conv import register_rewritten_operators
register_rewritten_operators()
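The slower first call mentioned above is typical of any wrapper that builds its executable object lazily. The following is a minimal sketch of that pattern in plain Python, not the mlprodict implementation: `LazyCompiled` and `build_log_1` are hypothetical names, and the "build" step is a placeholder for creating the ONNX graph and loading it into a runtime.

```python
import numpy as np


class LazyCompiled:
    """Hypothetical wrapper: builds an expensive object on first call only."""

    def __init__(self, build):
        self.build = build      # function that creates the executable object
        self.compiled = None    # nothing is built when the wrapper is created
        self.n_builds = 0       # counts how many times the build step ran

    def __call__(self, x):
        if self.compiled is None:
            # First call: pay the one-off construction cost.
            self.compiled = self.build()
            self.n_builds += 1
        return self.compiled(x)


def build_log_1():
    # Placeholder for "create the ONNX graph and load it into a runtime".
    return lambda x: np.log(x + np.float32(1))


log_1 = LazyCompiled(build_log_1)
x = np.array([[1, 2], [3, 4]], dtype=np.float32)
first = log_1(x)    # triggers the build
second = log_1(x)   # reuses the cached object
```

Like onnx_log_1, `log_1` here is an instance with a `__call__` method rather than a plain function, which is what lets it keep the built object between calls.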
The complete example:
from typing import Any
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler
from sklearn.linear_model import LogisticRegression
import mlprodict.npy.numpy_onnx_impl as npnx
from mlprodict.npy import onnxnumpy_default, NDArray
from mlprodict.onnxrt import OnnxInference
from skl2onnx import to_onnx
from mlprodict.onnx_conv import register_rewritten_operators
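@onnxnumpy_default
def onnx_log_1(x: NDArray[Any, np.float32]) -> NDArray[(None, None), np.float32]:
    return npnx.log(x + np.float32(1))
data = load_iris()
X, y = data.data.astype(np.float32), data.target
X_train, X_test, y_train, y_test = train_test_split(X, y)
pipe = make_pipeline(
    FunctionTransformer(onnx_log_1),
    StandardScaler(),
    LogisticRegression())
pipe.fit(X_train, y_train)
# probabilities computed by scikit-learn
print(pipe.predict_proba(X_test[:2]))
# register the converters handling this API, then convert the pipeline
register_rewritten_operators()
onx = to_onnx(pipe, X_train[:1],
              options={LogisticRegression: {'zipmap': False}})
# probabilities computed from the ONNX graph
oinf = OnnxInference(onx)
print(oinf.run({'X': X_test[:2]})['probabilities'])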
[[9.860e-01 1.396e-02 3.130e-07]
[2.012e-02 8.711e-01 1.088e-01]]
[[9.860e-01 1.396e-02 3.130e-07]
[2.012e-02 8.711e-01 1.088e-01]]
The decorator has parameters to change the way the function is converted or executed. ONNX has different versions, or opsets, and it is possible to target a specific one. The ONNX graph must be executed with a runtime, either the one implemented in mlprodict or onnxruntime; this can be chosen too. The function is strongly typed, but it is possible to have an implementation which supports multiple types. An ONNX graph is then created for every distinct type, like a template in C++. See Numpy to ONNX: Create ONNX graphs with an API similar to numpy for more information.
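The template analogy can be illustrated with a small sketch that keeps one specialised object per input type. This is an assumption-laden illustration in plain numpy, not the decorator's actual code: `PerTypeCompiled` and `build_log_1` are hypothetical names, and each cached callable stands in for one ONNX graph.

```python
import numpy as np


class PerTypeCompiled:
    """Hypothetical wrapper keeping one compiled object per input dtype."""

    def __init__(self, build):
        self.build = build   # build(dtype) -> callable specialised for dtype
        self.cache = {}      # dtype -> compiled callable

    def __call__(self, x):
        key = x.dtype
        if key not in self.cache:
            # New type: create a dedicated specialisation, like a C++
            # template instantiated for a new type argument.
            self.cache[key] = self.build(key)
        return self.cache[key](x)


def build_log_1(dtype):
    one = dtype.type(1)  # constant typed like the input
    return lambda x: np.log(x + one)


log_1 = PerTypeCompiled(build_log_1)
y32 = log_1(np.array([1, 2], dtype=np.float32))
y64 = log_1(np.array([1, 2], dtype=np.float64))
y32_again = log_1(np.array([3, 4], dtype=np.float32))  # reuses float32 entry
```

After these three calls the cache holds two entries, one for float32 and one for float64, and the third call builds nothing new.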