QuantizeLinear - 10 vs 13
The next section compares an older version of the operator with a newer one after both definitions have been converted to markdown text. Lines prefixed with `+` are additions in the newer version, lines prefixed with `-` are deletions, and everything else is unchanged.
QuantizeLinear10 → QuantizeLinear13
RENAMED
```diff
@@ -1 +1 @@
- The linear quantization operator. It consumes a high precision tensor, a scale,
+ The linear per-tensor/layer quantization operator. It consumes a high precision tensor, a scale, a zero point to compute the low precision / quantized tensor.
+ The quantization formula is y = saturate ((x / y_scale) + y_zero_point). For saturation, it saturates to [0, 255] if it's uint8, or [-128, 127] if it's int8.
- The scale factor and zero point must have same shape, and can be either a scalar for per-tensor / per layer quantization, or a 1-D tensor for per-axis quantization.
- The quantization formula is y = saturate ((x / y_scale) + y_zero_point).
- For saturation, it saturates to [0, 255] if it's uint8, or [-128, 127] if it's int8.
  For (x / y_scale), it's rounding to nearest ties to even. Refer to https://en.wikipedia.org/wiki/Rounding for details. 'y_zero_point' and 'y' must have same type.
-
- **Attributes**
-
- * **axis**:
-   (Optional) The axis of the quantization dimension of the input
-   tensor. Ignored for per-tensor quantization. Negative value means
-   counting dimensions from the back. Accepted range is [-r, r-1] where
-   r = rank(input).
  **Inputs**
  Between 2 and 3 inputs.
  * **x** (heterogeneous) - **T1**:
    N-D full precision Input tensor to be quantized.
  * **y_scale** (heterogeneous) - **tensor(float)**:
-   Scale for doing quantization to get 'y'. It
+   Scale for doing quantization to get 'y'. It's a scalar, which means
+   a per-tensor/layer quantization.
-   means per-tensor/layer quantization, or a 1-D Tensor for per-axis
-   quantization.
  * **y_zero_point** (optional, heterogeneous) - **T2**:
-   Zero point for doing quantization to get 'y'.
+   Zero point for doing quantization to get 'y'. It's a scalar, which
-
+   means a per-tensor/layer quantization. Default value is uint8 typed
-   specified.
+   0 if it's not specified.
  **Outputs**
  * **y** (heterogeneous) - **T2**:
    N-D quantized output tensor. It has same shape as input 'x'.
  **Type Constraints**
  * **T1** in (
    tensor(float),
    tensor(int32)
  ):
    Constrain 'x' to float or int32 tensor.
  * **T2** in (
    tensor(int8),
    tensor(uint8)
  ):
    Constrain 'y_zero_point' and 'y' to 8-bit integer tensor.
```
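To make the formula concrete, below is a minimal NumPy sketch of y = saturate((x / y_scale) + y_zero_point) with round-to-nearest-ties-to-even, covering both the per-tensor/layer mode (scalar scale and zero point) and the per-axis mode (1-D scale applied along an axis) that the two definitions differ on. It is not the ONNX reference implementation; the function name `quantize_linear`, its signature, and the default `axis=1` are illustrative assumptions.

```python
import numpy as np

def quantize_linear(x, y_scale, y_zero_point=None, axis=1, dtype=np.uint8):
    """Sketch of y = saturate((x / y_scale) + y_zero_point).

    A scalar y_scale/y_zero_point performs per-tensor/layer quantization;
    a 1-D y_scale/y_zero_point performs per-axis quantization along `axis`
    (axis=1 is just this sketch's default, not normative).
    """
    lo, hi = (0, 255) if dtype == np.uint8 else (-128, 127)
    y_scale = np.asarray(y_scale, dtype=np.float32)
    if y_zero_point is None:
        y_zero_point = np.zeros_like(y_scale, dtype=dtype)
    y_zero_point = np.asarray(y_zero_point)

    if y_scale.ndim == 1:
        # Per-axis: reshape the 1-D scale/zero point so they broadcast along `axis`.
        shape = [1] * np.ndim(x)
        shape[axis] = -1
        y_scale = y_scale.reshape(shape)
        y_zero_point = y_zero_point.reshape(shape)

    # np.rint rounds to nearest with ties to even, matching the spec's rounding rule.
    y = np.rint(np.asarray(x, dtype=np.float32) / y_scale)
    # Add the zero point, then saturate to the target 8-bit range.
    y = np.clip(y + y_zero_point.astype(np.int32), lo, hi)
    return y.astype(dtype)

x = np.array([[-1.5, 0.0, 1.25], [2.5, 3.75, 5.0]], dtype=np.float32)

# Per-tensor/layer quantization: scalar scale and zero point.
print(quantize_linear(x, y_scale=0.5, y_zero_point=np.uint8(3)))

# Per-axis quantization: one scale per element of axis 1.
print(quantize_linear(x, y_scale=np.array([0.5, 1.0, 2.0], dtype=np.float32), axis=1))
```

Per-axis mode only generalizes the per-tensor case by broadcasting a vector of scales along one dimension; the rounding and saturation rules are the same in both modes.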