com.microsoft - QuantizeWithOrder#

QuantizeWithOrder - 1#

Version

  • name: QuantizeWithOrder (GitHub)

  • domain: com.microsoft

  • since_version: 1

  • function:

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 1 of domain com.microsoft.

Summary

Attributes

  • order_input - INT (required) : cublasLt order of input matrix. ORDER_COL = 0, ORDER_ROW = 1, ORDER_COL32 = 2, ORDER_COL4_4R2_8C = 3, ORDER_COL32_2R_4R4 = 4. Please refer https://docs.nvidia.com/cuda/cublas/index.html#cublasLtOrder_t for their meaning.

  • order_output - INT (required) : cublasLt order of output matrix.

Inputs

  • input (heterogeneous) - F:

  • scale_input (heterogeneous) - S:

Outputs

  • output (heterogeneous) - Q:

Type Constraints

  • Q in ( tensor(int8) ): Constrain input and output types to int8 tensors.

  • F in ( tensor(float), tensor(float16) ): Constrain to float types

  • S in ( tensor(float) ): Constrain Scale to float32 types

Examples