com.microsoft - QuantizeBFP
QuantizeBFP - 1
Version
name: QuantizeBFP
domain: com.microsoft
since_version: 1
function:
support_level: SupportType.COMMON
shape inference: True
This version of the operator has been available since version 1 of domain com.microsoft.
Summary
Attributes
bfp_type - INT (required) : The type of BFP. Must match a value of the BFPType enum.
block_dim - INT : Each bounding box spans this dimension. Typically, the block dimension corresponds to the reduction dimension of the matrix multiplication that consumes the output of this operator. For example, for a 2D matrix multiplication A@W, QuantizeBFP(A) would use block_dim 1 and QuantizeBFP(W) would use block_dim 0. The default is the last dimension.
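The choice of block_dim for A@W can be illustrated with a minimal pure-Python sketch. It is not the operator's implementation and does not reproduce its uint8 encoding. The helper names blocks_along and shared_exponent, the box_size parameter, and the "largest exponent in the box" rule are all assumptions made for illustration; in the real operator the box size and layout are determined by bfp_type.

```python
import math

def blocks_along(matrix, block_dim, box_size):
    """Split a 2D list-of-lists into 1D boxes that span `block_dim`.

    Hypothetical helper: `box_size` stands in for whatever bounding-box
    size the chosen bfp_type implies.
    """
    rows, cols = len(matrix), len(matrix[0])
    if block_dim == 1:
        # Boxes run along each row (across columns).
        lines = [list(row) for row in matrix]
    elif block_dim == 0:
        # Boxes run down each column (across rows).
        lines = [[matrix[r][c] for r in range(rows)] for c in range(cols)]
    else:
        raise ValueError("block_dim must be 0 or 1 for a 2D input")
    return [line[i:i + box_size]
            for line in lines
            for i in range(0, len(line), box_size)]

def shared_exponent(block):
    """Largest binary exponent in the box (math.frexp convention).

    Assumption: the elements of a box share one exponent, taken here as
    the maximum over the box; an all-zero box gets exponent 0.
    """
    exps = [math.frexp(v)[1] for v in block if v != 0.0]
    return max(exps) if exps else 0

# A has shape (M=2, K=4); W has shape (K=4, N=2). K is the reduction dimension.
A = [[1.0, 2.0, 3.0, 4.0],
     [0.5, 0.25, 8.0, 16.0]]
W = [[1.0, 2.0],
     [3.0, 4.0],
     [5.0, 6.0],
     [7.0, 8.0]]

a_blocks = blocks_along(A, block_dim=1, box_size=2)  # K is axis 1 of A
w_blocks = blocks_along(W, block_dim=0, box_size=2)  # K is axis 0 of W
a_exponents = [shared_exponent(b) for b in a_blocks]
```

With these settings every box of A and every box of W covers a run of consecutive indices along K, so the terms accumulated in one slice of the dot product come from matching boxes on both sides.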
Inputs
x (heterogeneous) - T1:
Outputs
y (heterogeneous) - T2:
shape (heterogeneous) - T3:
strides (heterogeneous) - T3:
Type Constraints
T1 in ( tensor(bfloat16), tensor(float), tensor(float16) ): Constrain the input to float, float16, and bfloat16.
T2 in ( tensor(uint8) ): Constrain y to uint8.
T3 in ( tensor(int64) ): Constrain shape and strides to int64.
Examples