com.microsoft - QOrderedLayerNormalization#

QOrderedLayerNormalization - 1#

Version

name: QOrderedLayerNormalization (GitHub)
domain: com.microsoft
since_version: 1
function:
support_level: SupportType.COMMON
shape inference: True

This version of the operator has been available since version 1 of domain com.microsoft.

Summary

Attributes

axis - INT : The first normalization dimension: normalization will be performed along dimensions axis : rank(inputs).
epsilon - FLOAT : The epsilon value to use to avoid division by zero.
order_X - INT : cublasLt order of input X. Default is ROW MAJOR. See the schema of QuantizeWithOrder for order definition.
order_Y - INT : cublasLt order of matrix Y, must be same as order_X. Default is ROW MAJOR.

Inputs

X (heterogeneous) - Q:
scale_X (heterogeneous) - S:
scale (heterogeneous) - F:
B (optional, heterogeneous) - F:
scale_Y (heterogeneous) - S:

Outputs

Y (heterogeneous) - Q:

Type Constraints

F in ( tensor(float), tensor(float16) ): Constrain input gamma and bias could be float16/float tensors. float may get better precision, float16 runs faster.
S in ( tensor(float) ): quantization scale must be float tensors.
Q in ( tensor(int8) ): quantization tensor must be int8 tensors.

Examples