com.microsoft - QOrderedMatMul#

QOrderedMatMul - 1#

Version

  • name: QOrderedMatMul (GitHub)

  • domain: com.microsoft

  • since_version: 1

  • function:

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 1 of domain com.microsoft.

Summary

Attributes

  • order_A - INT (required) : cublasLt order of matrix A. See the schema of QuantizeWithOrder for order definition.

  • order_B - INT (required) : cublasLt order of matrix B

  • order_Y - INT (required) : cublasLt order of matrix Y and optional matrix C

Inputs

Between 5 and 8 inputs.

  • A (heterogeneous) - Q:

  • scale_A (heterogeneous) - S:

  • B (heterogeneous) - Q:

  • scale_B (heterogeneous) - S:

  • scale_Y (heterogeneous) - S:

  • bias (optional, heterogeneous) - S:

  • C (optional, heterogeneous) - Q:

  • scale_C (optional, heterogeneous) - S:

Outputs

  • Y (heterogeneous) - Q:

Type Constraints

  • Q in ( tensor(int8) ): Constrain input and output types to int8 tensors.

  • S in ( tensor(float) ): Constrain bias and scales to float32

Examples