com.microsoft - QOrderedLayerNormalization#

QOrderedLayerNormalization - 1#

Version

This version of the operator has been available since version 1 of domain com.microsoft.

Summary

Attributes

  • axis - INT : The first normalization dimension: normalization will be performed along dimensions axis : rank(inputs).

  • epsilon - FLOAT : The epsilon value to use to avoid division by zero.

  • order_X - INT : cublasLt order of input X. Default is ROW MAJOR. See the schema of QuantizeWithOrder for order definition.

  • order_Y - INT : cublasLt order of matrix Y, must be same as order_X. Default is ROW MAJOR.

Inputs

  • X (heterogeneous) - Q:

  • scale_X (heterogeneous) - S:

  • scale (heterogeneous) - F:

  • B (optional, heterogeneous) - F:

  • scale_Y (heterogeneous) - S:

Outputs

  • Y (heterogeneous) - Q:

Type Constraints

  • F in ( tensor(float), tensor(float16) ): Constrain input gamma and bias could be float16/float tensors. float may get better precision, float16 runs faster.

  • S in ( tensor(float) ): quantization scale must be float tensors.

  • Q in ( tensor(int8) ): quantization tensor must be int8 tensors.

Examples