{"cells": [{"cell_type": "markdown", "metadata": {}, "source": ["# Precision loss due to float32 conversion with ONNX\n", "\n", "This notebook studies the loss of precision when converting a non-continuous model to float32. It first looks at the conversion of a [GradientBoostingClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.GradientBoostingClassifier.html) and then at a [DecisionTreeRegressor](https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html), for which a runtime supporting float64 was implemented."]}, {"cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [{"data": {"text/html": ["
\n", ""], "text/plain": [""]}, "execution_count": 2, "metadata": {}, "output_type": "execute_result"}], "source": ["from jyquickhelper import add_notebook_menu\n", "add_notebook_menu()"]}, {"cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": ["%matplotlib inline"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## GradientBoostingClassifier\n", "\n", "We train such a model on the Iris dataset."]}, {"cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": ["from sklearn.datasets import load_iris\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.ensemble import GradientBoostingClassifier"]}, {"cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [{"data": {"text/plain": ["GradientBoostingClassifier(n_estimators=20)"]}, "execution_count": 5, "metadata": {}, "output_type": "execute_result"}], "source": ["iris = load_iris()\n", "X, y = iris.data, iris.target\n", "X_train, X_test, y_train, _ = train_test_split(\n", " X, y, random_state=1, shuffle=True)\n", "clr = GradientBoostingClassifier(n_estimators=20)\n", "clr.fit(X_train, y_train)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["We are interested in the probability of the last class."]}, {"cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([0.03010582, 0.03267555, 0.03267424, 0.03010582, 0.94383517,\n", " 0.02866979, 0.94572751, 0.03010582, 0.03010582, 0.94383517,\n", " 0.03267555, 0.03010582, 0.94696795, 0.0317053 , 0.03267555,\n", " 0.03010582, 0.03267555, 0.03267555, 0.03010582, 0.03010582,\n", " 0.03267555, 0.03267555, 0.94577389, 0.03010582, 0.91161635,\n", " 0.03267555, 0.03010582, 0.03010582, 0.03267424, 0.94282974,\n", " 0.03267424, 0.94696795, 0.03267555, 0.94696795, 0.9387834 ,\n", " 0.03010582, 0.03267555, 0.03010582])"]}, "execution_count": 6, "metadata": {}, "output_type": "execute_result"}], "source": ["exp = 
clr.predict_proba(X_test)[:, 2]\n", "exp"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Conversion to ONNX and comparison to original outputs"]}, {"cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": ["import numpy\n", "from mlprodict.onnxrt import OnnxInference\n", "from mlprodict.onnx_conv import to_onnx"]}, {"cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [{"data": {"text/plain": ["{'output_label': array([0, 1, 1, 0, 2, 1, 2, 0, 0, 2, 1, 0, 2, 1, 1, 0, 1, 1, 0, 0, 1, 1,\n", " 2, 0, 2, 1, 0, 0, 1, 2, 1, 2, 1, 2, 2, 0, 1, 0], dtype=int64),\n", " 'output_probability': [{0: 0.94445217, 1: 0.025442092, 2: 0.030105816},\n", " {0: 0.02932842, 1: 0.9379961, 2: 0.032675553},\n", " {0: 0.029367255, 1: 0.93795854, 2: 0.032674246},\n", " {0: 0.94445217, 1: 0.025442092, 2: 0.030105816},\n", " {0: 0.026494453, 1: 0.02967037, 2: 0.9438352},\n", " {0: 0.027988827, 1: 0.94334143, 2: 0.028669795},\n", " {0: 0.026551371, 1: 0.027721122, 2: 0.9457275},\n", " {0: 0.94445217, 1: 0.025442092, 2: 0.030105816},\n", " {0: 0.94445217, 1: 0.025442092, 2: 0.030105816},\n", " {0: 0.026494453, 1: 0.02967037, 2: 0.9438352},\n", " {0: 0.02932842, 1: 0.9379961, 2: 0.032675553},\n", " {0: 0.94445217, 1: 0.025442092, 2: 0.030105816},\n", " {0: 0.026586197, 1: 0.026445853, 2: 0.946968},\n", " {0: 0.027929045, 1: 0.9403657, 2: 0.0317053},\n", " {0: 0.02932842, 1: 0.9379961, 2: 0.032675553},\n", " {0: 0.94445217, 1: 0.025442092, 2: 0.030105816},\n", " {0: 0.02932842, 1: 0.9379961, 2: 0.032675553},\n", " {0: 0.02932842, 1: 0.9379961, 2: 0.032675553},\n", " {0: 0.94445217, 1: 0.025442092, 2: 0.030105816},\n", " {0: 0.94445217, 1: 0.025442092, 2: 0.030105816},\n", " {0: 0.02932842, 1: 0.9379961, 2: 0.032675553},\n", " {0: 0.02932842, 1: 0.9379961, 2: 0.032675553},\n", " {0: 0.026503632, 1: 0.027722482, 2: 0.9457739},\n", " {0: 0.94445217, 1: 0.025442092, 2: 0.030105816},\n", " {0: 0.041209597, 1: 0.04717405, 2: 0.9116163},\n", " {0: 
0.02932842, 1: 0.9379961, 2: 0.032675553},\n", " {0: 0.94445217, 1: 0.025442092, 2: 0.030105816},\n", " {0: 0.94445217, 1: 0.025442092, 2: 0.030105816},\n", " {0: 0.029367255, 1: 0.93795854, 2: 0.032674246},\n", " {0: 0.027969029, 1: 0.029201236, 2: 0.9428297},\n", " {0: 0.029367255, 1: 0.93795854, 2: 0.032674246},\n", " {0: 0.026586197, 1: 0.026445853, 2: 0.946968},\n", " {0: 0.02932842, 1: 0.9379961, 2: 0.032675553},\n", " {0: 0.026586197, 1: 0.026445853, 2: 0.946968},\n", " {0: 0.027941188, 1: 0.033275396, 2: 0.9387834},\n", " {0: 0.94445217, 1: 0.025442092, 2: 0.030105816},\n", " {0: 0.02932842, 1: 0.9379961, 2: 0.032675553},\n", " {0: 0.94445217, 1: 0.025442092, 2: 0.030105816}]}"]}, "execution_count": 8, "metadata": {}, "output_type": "execute_result"}], "source": ["model_def = to_onnx(clr, X_train.astype(numpy.float32))\n", "oinf = OnnxInference(model_def)\n", "inputs = {'X': X_test.astype(numpy.float32)}\n", "outputs = oinf.run(inputs)\n", "outputs"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Let's extract the probability of the last class."]}, {"cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([0.03010582, 0.03267555, 0.03267425, 0.03010582, 0.9438352 ,\n", " 0.0286698 , 0.9457275 , 0.03010582, 0.03010582, 0.9438352 ,\n", " 0.03267555, 0.03010582, 0.946968 , 0.0317053 , 0.03267555,\n", " 0.03010582, 0.03267555, 0.03267555, 0.03010582, 0.03010582,\n", " 0.03267555, 0.03267555, 0.9457739 , 0.03010582, 0.9116163 ,\n", " 0.03267555, 0.03010582, 0.03010582, 0.03267425, 0.9428297 ,\n", " 0.03267425, 0.946968 , 0.03267555, 0.946968 , 0.9387834 ,\n", " 0.03010582, 0.03267555, 0.03010582], dtype=float32)"]}, "execution_count": 9, "metadata": {}, "output_type": "execute_result"}], "source": ["def output_fct(res):\n", " val = res['output_probability'].values\n", " return val[:, 2]\n", "\n", "output_fct(outputs)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["Let's compare both 
predictions."]}, {"cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([1.35649712e-09, 1.35649712e-09, 1.35649712e-09, 1.35649712e-09,\n", " 1.35649712e-09, 1.35649712e-09, 1.35649712e-09, 1.35649712e-09,\n", " 1.35649712e-09, 1.35649712e-09, 1.40241483e-09, 1.40403427e-09,\n", " 1.40403427e-09, 1.40403427e-09, 4.08553857e-09, 7.87733068e-09,\n", " 8.05985446e-09, 8.05985446e-09, 8.05985446e-09, 8.05985446e-09,\n", " 8.05985446e-09, 8.05985446e-09, 8.05985446e-09, 8.05985446e-09,\n", " 8.05985446e-09, 8.05985446e-09, 8.05985446e-09, 8.05985446e-09,\n", " 8.05985446e-09, 9.19990018e-09, 9.34906490e-09, 1.80944041e-08,\n", " 2.73915506e-08, 2.81494498e-08, 2.81494498e-08, 6.50696940e-08,\n", " 6.50696940e-08, 6.50696940e-08])"]}, "execution_count": 10, "metadata": {}, "output_type": "execute_result"}], "source": ["diff = numpy.sort(numpy.abs(output_fct(outputs) - exp))\n", "diff"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The highest difference is quite high but there is only one."]}, {"cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [{"data": {"text/plain": ["6.506969396635753e-08"]}, "execution_count": 11, "metadata": {}, "output_type": "execute_result"}], "source": ["max(diff)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Why this difference?\n", "\n", "The function *astype_range* returns single floats (float32) surrounding the true value of the original features stored as double floats (float64). 
"]}, {"cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [{"data": {"text/plain": ["(array([[5.7999997 , 3.9999995 , 1.1999999 , 0.19999999],\n", " [5.0999994 , 2.4999998 , 2.9999998 , 1.0999999 ],\n", " [6.5999994 , 2.9999998 , 4.3999996 , 1.3999999 ],\n", " [5.3999996 , 3.8999996 , 1.2999998 , 0.39999998],\n", " [7.899999 , 3.7999995 , 6.3999996 , 1.9999998 ]], dtype=float32),\n", " array([[5.8000007 , 4.0000005 , 1.2000002 , 0.20000002],\n", " [5.1000004 , 2.5000002 , 3.0000002 , 1.1000001 ],\n", " [6.6000004 , 3.0000002 , 4.4000006 , 1.4000001 ],\n", " [5.4000006 , 3.9000006 , 1.3000001 , 0.40000004],\n", " [7.900001 , 3.8000004 , 6.4000006 , 2.0000002 ]], dtype=float32))"]}, "execution_count": 12, "metadata": {}, "output_type": "execute_result"}], "source": ["from mlprodict.onnx_tools.model_checker import astype_range\n", "astype_range(X_test[:5])"]}, {"cell_type": "markdown", "metadata": {}, "source": ["If a decision tree uses a threshold which verifies ``float32(t) != t``, it cannot be converted into single float without discrepencies. The interval ``[float32(t - |t|*1e-7), float32(t + |t|*1e-7)]`` is close to all double values converted to the same *float32* but every feature *x* in this interval verifies ``float32(x) >= float32(t)``. It is not an issue for continuous machine learned models as all errors usually compensate. For non continuous models, there might some outliers. 
The next function considers all intervals of input features and randomly chooses one extremity for each of them."]}, {"cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": ["from mlprodict.onnx_tools.model_checker import onnx_shaker"]}, {"cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [{"data": {"text/plain": ["(38, 100)"]}, "execution_count": 14, "metadata": {}, "output_type": "execute_result"}], "source": ["n = 100\n", "shaked = onnx_shaker(oinf, inputs, dtype=numpy.float32, n=n,\n", " output_fct=output_fct)\n", "shaked.shape"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The function draws 100 input vectors, randomly choosing one extremity for each feature. It then sorts every row. The first column is the lower bound, the last column is the upper bound."]}, {"cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([0. , 0. , 0. , 0. , 0. ,\n", " 0. , 0. , 0. , 0. , 0. ,\n", " 0. , 0. , 0. , 0. , 0. ,\n", " 0. , 0. , 0. , 0. , 0. ,\n", " 0. , 0. , 0.02333647, 0. , 0. ,\n", " 0. , 0. , 0. , 0. , 0. ,\n", " 0. , 0. , 0. , 0. , 0. ,\n", " 0. , 0. , 0. ], dtype=float32)"]}, "execution_count": 15, "metadata": {}, "output_type": "execute_result"}], "source": ["diff2 = shaked[:, n-1] - shaked[:, 0]\n", "diff2"]}, {"cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [{"data": {"text/plain": ["0.02333647"]}, "execution_count": 16, "metadata": {}, "output_type": "execute_result"}], "source": ["max(diff2)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["We get the same value as before. 
At least one feature of one observation is really close to one threshold and changes the prediction."]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Bigger datasets"]}, {"cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [{"data": {"text/plain": ["GradientBoostingClassifier()"]}, "execution_count": 17, "metadata": {}, "output_type": "execute_result"}], "source": ["from sklearn.datasets import load_breast_cancer\n", "\n", "data = load_breast_cancer()\n", "X, y = data.data, data.target\n", "X_train, X_test, y_train, _ = train_test_split(\n", " X, y, random_state=1, shuffle=True)\n", "clr = GradientBoostingClassifier()\n", "clr.fit(X_train, y_train)"]}, {"cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": ["model_def = to_onnx(clr, X_train.astype(numpy.float32))\n", "oinf = OnnxInference(model_def)\n", "inputs = {'X': X_test.astype(numpy.float32)}"]}, {"cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [{"data": {"text/plain": ["(143, 100)"]}, "execution_count": 19, "metadata": {}, "output_type": "execute_result"}], "source": ["def output_fct1(res):\n", " val = res['output_probability'].values\n", " return val[:, 1]\n", "\n", "n = 100\n", "shaked = onnx_shaker(oinf, inputs, dtype=numpy.float32, n=n,\n", " output_fct=output_fct1, force=1)\n", "shaked.shape"]}, {"cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [{"data": {"image/png": "\n", "text/plain": ["
"]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["import matplotlib.pyplot as plt\n", "plt.plot(shaked[:, n-1] - shaked[:, 0])\n", "plt.title(\"Observed differences on a dataset\\nwhen exploring rounding to float32\");"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## DecisionTreeRegressor\n", "\n", "This model is much simple than the previous one as it contains only one tree. We study it on the [Boston](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_boston.html#sklearn.datasets.load_boston) datasets."]}, {"cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": ["from sklearn.datasets import load_boston\n", "data = load_boston()\n", "X, y = data.data, data.target\n", "X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=2, random_state=2)"]}, {"cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [{"data": {"text/plain": ["DecisionTreeRegressor()"]}, "execution_count": 22, "metadata": {}, "output_type": "execute_result"}], "source": ["from sklearn.tree import DecisionTreeRegressor\n", "clr = DecisionTreeRegressor()\n", "clr.fit(X_train, y_train)"]}, {"cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [], "source": ["ypred = clr.predict(X_test)"]}, {"cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": ["model_onnx = to_onnx(clr, X_train.astype(numpy.float32))"]}, {"cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [], "source": ["oinf = OnnxInference(model_onnx)\n", "opred = oinf.run({'X': X_test.astype(numpy.float32)})['variable']"]}, {"cell_type": "code", "execution_count": 25, "metadata": {}, "outputs": [{"data": {"text/plain": ["array([1.52587891e-06, 1.52587891e-06, 1.52587891e-06, 1.52587891e-06,\n", " 1.52587891e-06])"]}, "execution_count": 26, "metadata": {}, "output_type": "execute_result"}], "source": ["numpy.sort(numpy.abs(ypred - opred))[-5:]"]}, 
{"cell_type": "code", "execution_count": 26, "metadata": {}, "outputs": [{"data": {"text/plain": ["4.680610146230323e-06"]}, "execution_count": 27, "metadata": {}, "output_type": "execute_result"}], "source": ["numpy.max(numpy.abs(ypred - opred) / ypred) * 100"]}, {"cell_type": "code", "execution_count": 27, "metadata": {}, "outputs": [{"name": "stdout", "output_type": "stream", "text": ["highest relative error: 4.68e-06%\n"]}], "source": ["print(\"highest relative error: {0:1.3}%\".format((numpy.max(numpy.abs(ypred - opred) / ypred) * 100)))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The last difference is quite big. Let's reuse function *onnx_shaker*."]}, {"cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [{"data": {"text/plain": ["(127, 1000)"]}, "execution_count": 29, "metadata": {}, "output_type": "execute_result"}], "source": ["def output_fct_reg(res):\n", " val = res['variable']\n", " return val\n", "\n", "n = 1000\n", "shaked = onnx_shaker(oinf, {'X': X_test.astype(numpy.float32)},\n", " dtype=numpy.float32, n=n,\n", " output_fct=output_fct_reg, force=1)\n", "shaked.shape"]}, {"cell_type": "code", "execution_count": 29, "metadata": {}, "outputs": [{"data": {"image/png": "\n", "text/plain": ["
"]}, "metadata": {"needs_background": "light"}, "output_type": "display_data"}], "source": ["plt.plot(shaked[:, n-1] - shaked[:, 0])\n", "plt.title(\"Observed differences on a Boston dataset\\nwith a DecisionTreeRegressor\"\n", " \"\\nwhen exploring rounding to float32\");"]}, {"cell_type": "markdown", "metadata": {}, "source": ["That's consistent. This function is way to retrieve the error due to the conversion into float32 without using the expected values."]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Runtime supporting float64 for DecisionTreeRegressor\n", "\n", "We prooved that the conversion to float32 introduces discrepencies in a statistical way. But if the runtime supports float64 and not only float32, we should have absolutely no discrepencies. Let's verify that error disappear when the runtime supports an operator handling float64, which is the case for the python runtime for *DecisionTreeRegression*."]}, {"cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [], "source": ["model_onnx64 = to_onnx(clr, X_train, rewrite_ops=True)"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The option **rewrite_ops** is needed to tell the function the operator we need is not (yet) supported by the official specification of ONNX. [TreeEnsembleRegressor](https://github.com/onnx/onnx/blob/master/docs/Operators-ml.md#ai.onnx.ml.TreeEnsembleRegressor) only allows float coefficients and we need double coefficients. That's why the function rewrites the converter of this operator and selects the appropriate runtime operator **RuntimeTreeEnsembleRegressorDouble**. 
It works as if the ONNX specification were extended to support an operator *TreeEnsembleRegressorDouble* which behaves the same but with doubles."]}, {"cell_type": "code", "execution_count": 31, "metadata": {}, "outputs": [], "source": ["oinf64 = OnnxInference(model_onnx64)\n", "opred64 = oinf64.run({'X': X_test})['variable']"]}, {"cell_type": "markdown", "metadata": {}, "source": ["The runtime operator is accessible through the following path:"]}, {"cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [{"data": {"text/plain": [""]}, "execution_count": 33, "metadata": {}, "output_type": "execute_result"}], "source": ["oinf64.sequence_[0].ops_"]}, {"cell_type": "markdown", "metadata": {}, "source": ["It differs from this one:"]}, {"cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [{"data": {"text/plain": [""]}, "execution_count": 34, "metadata": {}, "output_type": "execute_result"}], "source": ["oinf.sequence_[0].ops_"]}, {"cell_type": "markdown", "metadata": {}, "source": ["And the highest absolute difference is now zero."]}, {"cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [{"data": {"text/plain": ["0.0"]}, "execution_count": 35, "metadata": {}, "output_type": "execute_result"}], "source": ["numpy.max(numpy.abs(ypred - opred64))"]}, {"cell_type": "markdown", "metadata": {}, "source": ["## Interpretation\n", "\n", "We may wonder if we should extend the ONNX specifications to support double for every operator. However, the fact that the model predicts a very different value for an observation indicates the prediction cannot be trusted, as a very small modification of the input introduces a huge change in the output. I would use a different model. 
We may also wonder which prediction is the best one compared to the expected value..."]}, {"cell_type": "code", "execution_count": 35, "metadata": {}, "outputs": [{"data": {"text/plain": ["26"]}, "execution_count": 36, "metadata": {}, "output_type": "execute_result"}], "source": ["i = numpy.argmax(numpy.abs(ypred - opred))\n", "i"]}, {"cell_type": "code", "execution_count": 36, "metadata": {}, "outputs": [{"data": {"text/plain": ["(50.0, 43.1, 43.1, 43.1)"]}, "execution_count": 37, "metadata": {}, "output_type": "execute_result"}], "source": ["y_test[i], ypred[i], opred[i], opred64[i]"]}, {"cell_type": "markdown", "metadata": {}, "source": ["In the end, it is only luck on that kind of example."]}, {"cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [], "source": []}], "metadata": {"kernelspec": {"display_name": "Python 3", "language": "python", "name": "python3"}, "language_info": {"codemirror_mode": {"name": "ipython", "version": 3}, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.2"}}, "nbformat": 4, "nbformat_minor": 2}