com.microsoft - QuantizeLinear#

QuantizeLinear - 1#

Version

name: QuantizeLinear (GitHub)
domain: com.microsoft
since_version: 1
function:
support_level: SupportType.COMMON
shape inference: True

This version of the operator has been available since version 1 of domain com.microsoft.

Summary

Attributes

axis - INT : The axis along which same quantization parameters are applied. It’s optional.If it’s not specified, it means per-tensor quantization and input ‘x_scale’ and ‘x_zero_point’ must be scalars.If it’s specified, it means per ‘axis’ quantization and input ‘x_scale’ and ‘x_zero_point’ must be 1-D tensors.

Inputs

x (heterogeneous) - T1:
y_scale (heterogeneous) - T1:
y_zero_point (heterogeneous) - T2:

Outputs

y (heterogeneous) - T2:

Type Constraints

T1 in ( tensor(float), tensor(float16) ): Constrain ‘x’, ‘y_scale’ to float tensors.
T2 in ( tensor(int8), tensor(uint8) ): Constrain ‘y_zero_point’ and ‘y’ to 8-bit integer tensors.

Examples

default

import numpy as np
import onnx

node = onnx.helper.make_node(
    "QuantizeLinear",
    inputs=["x", "y_scale", "y_zero_point"],
    outputs=["y"],
)

x = np.array([0, 2, 3, 1000, -254, -1000]).astype(np.float32)
y_scale = np.float32(2)
y_zero_point = np.uint8(128)
y = np.array([128, 129, 130, 255, 1, 0]).astype(np.uint8)

expect(
    node,
    inputs=[x, y_scale, y_zero_point],
    outputs=[y],
    name="test_quantizelinear",
)

_axis

import numpy as np
import onnx

node = onnx.helper.make_node(
    "QuantizeLinear",
    inputs=["x", "y_scale", "y_zero_point"],
    outputs=["y"],
)

x = np.array(
    [
        [
            [[-162, 10], [-100, 232], [-20, -50]],
            [[-76, 0], [0, 252], [32, -44]],
            [[245, -485], [-960, -270], [-375, -470]],
        ],
    ],
    dtype=np.float32,
)
y_scale = np.array([2, 4, 5], dtype=np.float32)
y_zero_point = np.array([84, 24, 196], dtype=np.uint8)
y = (x / y_scale.reshape(1, 3, 1, 1) + y_zero_point.reshape(1, 3, 1, 1)).astype(
    np.uint8
)

expect(
    node,
    inputs=[x, y_scale, y_zero_point],
    outputs=[y],
    name="test_quantizelinear_axis",
)

_e4m3fn

import numpy as np
import onnx

node = onnx.helper.make_node(
    "QuantizeLinear",
    inputs=["x", "y_scale", "y_zero_point"],
    outputs=["y"],
)

x = np.array([0.0, 1.0, 2.0, 100000.0, 200.0]).astype(np.float32)
y_scale = np.float32(2)
y_zero_point = make_tensor("zero_point", TensorProto.FLOAT8E4M3FN, [1], [0])
y = make_tensor(
    "zero_point", TensorProto.FLOAT8E4M3FN, [5], [0, 0.5, 1, 448, 104]
)

expect(
    node,
    inputs=[x, y_scale, y_zero_point],
    outputs=[y],
    name="test_quantizelinear_e4m3fn",
)

_e5m2

import numpy as np
import onnx

node = onnx.helper.make_node(
    "QuantizeLinear",
    inputs=["x", "y_scale", "y_zero_point"],
    outputs=["y"],
)

x = np.array([0.0, 1.0, 2.0, 100000.0, 200.0]).astype(np.float32)
y_scale = np.float32(2)
y_zero_point = make_tensor("zero_point", TensorProto.FLOAT8E5M2, [1], [0.0])
y = make_tensor(
    "zero_point", TensorProto.FLOAT8E5M2, [5], [0, 0.5, 1, 49152, 96]
)

expect(
    node,
    inputs=[x, y_scale, y_zero_point],
    outputs=[y],
    name="test_quantizelinear_e5m2",
)