RoiAlign
Domain: ai.onnx
Since version: 22
Region of Interest (RoI) align operation described in the Mask R-CNN paper. RoiAlign consumes an input tensor X and regions of interest (rois) and applies pooling across each RoI; it produces a 4-D tensor of shape (num_rois, C, output_height, output_width).
RoiAlign was proposed to avoid misalignment by removing the quantization steps otherwise applied when mapping from the original image to the feature map and from the feature map to the RoI feature; in each RoI bin, the values at the sampled locations are computed directly through bilinear interpolation.
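The bilinear interpolation step can be illustrated with a minimal pure-Python sketch. The helper name `bilinear_sample` and the choice to clamp out-of-range locations to the border are assumptions of this example, not part of the operator definition.

```python
import math

def bilinear_sample(feature, y, x):
    """Sample a 2-D feature map (a list of rows) at a fractional (y, x) location."""
    height, width = len(feature), len(feature[0])
    # Clamp the location to the valid index range (an assumption of this sketch).
    y = min(max(y, 0.0), height - 1.0)
    x = min(max(x, 0.0), width - 1.0)
    # Integer corners surrounding the sampling location.
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    # Fractional offsets act as the interpolation weights.
    ly, lx = y - y0, x - x0
    hy, hx = 1.0 - ly, 1.0 - lx
    return (hy * hx * feature[y0][x0] + hy * lx * feature[y0][x1]
            + ly * hx * feature[y1][x0] + ly * lx * feature[y1][x1])

feature = [[0.0, 1.0],
           [2.0, 3.0]]
# The centre of the four cells weights each corner equally.
center = bilinear_sample(feature, 0.5, 0.5)
```

Because no coordinate is rounded to an integer before sampling, a small shift in the RoI produces a correspondingly small change in the sampled value.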
Inputs
X (T1): Input data tensor from the previous operator; 4-D feature map of shape (N, C, H, W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data.
rois (T1): RoIs (Regions of Interest) to pool over; rois is a 2-D input of shape (num_rois, 4) given as [[x1, y1, x2, y2], …]. The RoIs’ coordinates are in the coordinate system of the input image. Each coordinate set has a 1:1 correspondence with the ‘batch_indices’ input.
batch_indices (T2): 1-D tensor of shape (num_rois,) with each element denoting the index of the corresponding image in the batch.
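The relationships between the three inputs can be checked with a short sanity test. This is a sketch only: `check_roi_align_inputs` is a hypothetical helper, and the range check on batch indices is inferred from the statement that each element is the index of an image in the batch.

```python
def check_roi_align_inputs(x_shape, rois, batch_indices):
    """Validate input shapes for RoiAlign and return num_rois."""
    if len(x_shape) != 4:
        raise ValueError("X must be 4-D: (N, C, H, W)")
    batch_size = x_shape[0]
    # Every RoI is one [x1, y1, x2, y2] coordinate set.
    if any(len(roi) != 4 for roi in rois):
        raise ValueError("rois must have shape (num_rois, 4)")
    # One batch index per RoI (1:1 correspondence).
    if len(batch_indices) != len(rois):
        raise ValueError("batch_indices must have shape (num_rois,)")
    # Each index should select an image that exists in the batch.
    if any(not 0 <= b < batch_size for b in batch_indices):
        raise ValueError("batch index out of range")
    return len(rois)

num_rois = check_roi_align_inputs(
    (2, 3, 8, 8),                      # N=2, C=3, H=8, W=8
    [[0, 0, 4, 4], [1, 1, 5, 5]],      # two RoIs
    [0, 1],                            # first RoI from image 0, second from image 1
)
```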
Outputs
Y (T1): RoI pooled output, 4-D tensor of shape (num_rois, C, output_height, output_width). The r-th batch element Y[r-1] is a pooled feature map corresponding to the r-th RoI rois[r-1].
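To make the output layout concrete, the following is a deliberately simplified sketch rather than a reference implementation. It assumes average pooling, a fixed `samples` x `samples` grid of sampling points per bin, and RoI coordinates used directly as feature-map coordinates with no scaling or offset. The operator's attributes that control these choices are not covered here, and the names `roi_align_sketch` and `samples` are hypothetical.

```python
def bilinear(fm, y, x):
    """Bilinear sample of a 2-D map at (y, x), clamped to the border."""
    h, w = len(fm), len(fm[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    ly, lx = y - y0, x - x0
    return ((1 - ly) * (1 - lx) * fm[y0][x0] + (1 - ly) * lx * fm[y0][x1]
            + ly * (1 - lx) * fm[y1][x0] + ly * lx * fm[y1][x1])

def roi_align_sketch(X, rois, batch_indices, out_h, out_w, samples=2):
    """Return Y as nested lists of shape (num_rois, C, out_h, out_w)."""
    Y = []
    for (x1, y1, x2, y2), b in zip(rois, batch_indices):
        # Bin sizes stay fractional; nothing is rounded to the pixel grid.
        bin_h = (y2 - y1) / out_h
        bin_w = (x2 - x1) / out_w
        pooled = []
        for channel in X[b]:  # X[b] has shape (C, H, W)
            rows = []
            for ph in range(out_h):
                row = []
                for pw in range(out_w):
                    total = 0.0
                    # Evenly spaced sampling points inside the bin.
                    for iy in range(samples):
                        yy = y1 + ph * bin_h + (iy + 0.5) * bin_h / samples
                        for ix in range(samples):
                            xx = x1 + pw * bin_w + (ix + 0.5) * bin_w / samples
                            total += bilinear(channel, yy, xx)
                    row.append(total / (samples * samples))
                rows.append(row)
            pooled.append(rows)
        Y.append(pooled)
    return Y

# One image, one channel, 4x4 map whose value equals its column index.
X = [[[[float(col) for col in range(4)] for _ in range(4)]]]
rois = [[0, 0, 2, 2], [1, 1, 3, 3]]
batch_indices = [0, 0]
Y = roi_align_sketch(X, rois, batch_indices, out_h=2, out_w=2)
```

Here Y[0] is the pooled map for rois[0] and Y[1] is the pooled map for rois[1], each with C channels of size out_h by out_w.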
Type Constraints
T1: Constrain types to float tensors. Allowed types: tensor(bfloat16), tensor(double), tensor(float), tensor(float16).
T2: Constrain types to int tensors. Allowed types: tensor(int64).
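The type constraints can be expressed as a small check on type strings. This sketch assumes that inputs sharing the type variable T1 (X and rois) must resolve to the same concrete type, and that Y takes that type as well; `check_types` is a hypothetical helper, not an ONNX API.

```python
# Allowed types copied from the constraints above.
T1 = {"tensor(bfloat16)", "tensor(double)", "tensor(float)", "tensor(float16)"}
T2 = {"tensor(int64)"}

def check_types(x_type, rois_type, batch_indices_type):
    """Return the type of Y if the input types satisfy T1 and T2."""
    if x_type not in T1:
        raise TypeError("X must be a float tensor type in T1")
    # X and rois are both declared as T1, so this sketch requires them to match.
    if rois_type != x_type:
        raise TypeError("rois must have the same type as X")
    if batch_indices_type not in T2:
        raise TypeError("batch_indices must be tensor(int64)")
    return x_type  # Y is declared as T1

y_type = check_types("tensor(float)", "tensor(float)", "tensor(int64)")
```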
Differences with previous version (16)
SchemaDiff: RoiAlign (domain 'ai.onnx')
old version: 16
new version: 22
breaking: no