com.microsoft - QuantizeWithOrder
QuantizeWithOrder - 1
Version
domain: com.microsoft
since_version: 1
function:
support_level: SupportType.COMMON
shape inference: True
This version of the operator has been available since version 1 of domain com.microsoft.
Summary
Attributes
order_input - INT (required) : cublasLt order of the input matrix. ORDER_COL = 0, ORDER_ROW = 1, ORDER_COL32 = 2, ORDER_COL4_4R2_8C = 3, ORDER_COL32_2R_4R4 = 4. Refer to https://docs.nvidia.com/cuda/cublas/index.html#cublasLtOrder_t for the meaning of each value.
order_output - INT (required) : cublasLt order of the output matrix. Takes the same values as order_input.
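The integer values accepted by order_input and order_output can be mirrored in a small enum, which may make model-building code easier to read. The sketch below is illustrative only: the class name CublasLtOrder and the helper element_offset are hypothetical and not part of ONNX Runtime. The offset formulas follow the layout descriptions in the cuBLASLt documentation, assuming a tightly packed matrix (no extra padding in the leading dimension), and cover only the three simplest orders.

```python
from enum import IntEnum


class CublasLtOrder(IntEnum):
    """Hypothetical mirror of the values listed for order_input/order_output."""
    ORDER_COL = 0
    ORDER_ROW = 1
    ORDER_COL32 = 2
    ORDER_COL4_4R2_8C = 3
    ORDER_COL32_2R_4R4 = 4


def element_offset(order, row, col, rows, cols):
    """Linear offset of element (row, col) in a tightly packed rows x cols matrix.

    Assumption: the leading dimension is the minimum allowed for each layout.
    Only COL, ROW and COL32 are sketched here.
    """
    if order == CublasLtOrder.ORDER_COL:
        # Column-major: consecutive elements of a column are adjacent.
        return col * rows + row
    if order == CublasLtOrder.ORDER_ROW:
        # Row-major: consecutive elements of a row are adjacent.
        return row * cols + col
    if order == CublasLtOrder.ORDER_COL32:
        # Tiles of 32 columns; inside a tile each row holds 32 adjacent elements.
        return (col // 32) * (32 * rows) + row * 32 + (col % 32)
    raise NotImplementedError(f"layout {order!r} is not covered by this sketch")
```

For example, element_offset(CublasLtOrder.ORDER_COL32, 1, 33, rows=2, cols=33) addresses the second column of the second 32-column tile.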
Inputs
input (heterogeneous) - F:
scale_input (heterogeneous) - S:
Outputs
output (heterogeneous) - Q:
Type Constraints
Q in ( tensor(int8) ): Constrain the output type to int8 tensors.
F in ( tensor(float), tensor(float16) ): Constrain the input to float types.
S in ( tensor(float) ): Constrain the scale to float32 types.
Examples
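This page does not spell out the arithmetic applied to each element, so the following is only a rough sketch of what a scale-based float-to-int8 quantization typically looks like. It assumes symmetric quantization (no zero point), round-to-nearest, and saturation to the int8 range [-128, 127]. The function name quantize_int8 is hypothetical, and the sketch ignores the layout conversion controlled by order_input and order_output.

```python
def quantize_int8(values, scale):
    """Quantize an iterable of floats to int8 values using a single scale.

    Assumptions (not confirmed by the operator documentation):
    symmetric quantization, q = round(x / scale), saturated to [-128, 127].
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    quantized = []
    for x in values:
        q = round(x / scale)                      # round to nearest (ties to even)
        quantized.append(max(-128, min(127, q)))  # saturate to the int8 range
    return quantized


# Values outside the representable range are clamped rather than wrapped.
result = quantize_int8([0.0, 1.0, -3.2, 100.0, -100.0], scale=0.5)
```

Under these assumptions, a scale of 0.5 maps 1.0 to 2, while 100.0 and -100.0 saturate at the ends of the int8 range.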