RoiAlign

Domain: ai.onnx
Since version: 22

Region of Interest (RoI) align operation described in the Mask R-CNN paper. RoiAlign consumes an input tensor X and regions of interest (rois) and applies pooling across each RoI; it produces a 4-D tensor of shape (num_rois, C, output_height, output_width).

RoiAlign was proposed to avoid misalignment by removing the quantization steps used when converting from the original image to the feature map and from the feature map to the RoI feature; in each RoI bin, the values at the sampled locations are computed directly through bilinear interpolation.
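The bilinear interpolation step mentioned above can be sketched in plain Python. This is a minimal illustration, not the operator's reference implementation: the function name `bilinear_sample` is hypothetical, the feature map is a nested list, and clamping out-of-range sample points to the map border is an assumption made here for simplicity.

```python
import math

def bilinear_sample(feature, y, x):
    """Sample a 2-D feature map (list of rows) at a fractional (y, x) location."""
    height, width = len(feature), len(feature[0])
    # Assumption: clamp the sample point into the valid range of the map.
    y = min(max(y, 0.0), height - 1)
    x = min(max(x, 0.0), width - 1)
    # Indices of the four neighbouring grid points.
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    # Fractional offsets inside the grid cell; no rounding of (y, x) takes place.
    ly, lx = y - y0, x - x0
    top = (1 - lx) * feature[y0][x0] + lx * feature[y0][x1]
    bottom = (1 - lx) * feature[y1][x0] + lx * feature[y1][x1]
    return (1 - ly) * top + ly * bottom

feature = [[0.0, 1.0], [2.0, 3.0]]
center = bilinear_sample(feature, 0.5, 0.5)  # weighted mix of all four values
```

Because the sample location is never rounded to an integer grid index, the sampled value changes smoothly as an RoI boundary moves by a fraction of a pixel.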

Inputs

X (T1): Input data tensor from the previous operator; 4-D feature map of shape (N, C, H, W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data.
rois (T1): RoIs (Regions of Interest) to pool over; rois is a 2-D input of shape (num_rois, 4) given as [[x1, y1, x2, y2], …]. The RoI coordinates are in the coordinate system of the input image. Each coordinate set has a 1:1 correspondence with the ‘batch_indices’ input.
batch_indices (T2): 1-D tensor of shape (num_rois,) with each element denoting the index of the corresponding image in the batch.
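The shape relationships between the three inputs can be checked with a short sketch such as the following. The helper name `check_roi_align_inputs` is hypothetical, and the inputs are plain Python sequences standing in for tensors; the range check on the batch index is inferred from the description of batch_indices above.

```python
def check_roi_align_inputs(x_shape, rois, batch_indices):
    """Check the shape relationships between X, rois, and batch_indices."""
    if len(x_shape) != 4:
        raise ValueError("X must be 4-D: (N, C, H, W)")
    n = x_shape[0]
    if any(len(roi) != 4 for roi in rois):
        raise ValueError("each RoI must be [x1, y1, x2, y2]")
    if len(batch_indices) != len(rois):
        raise ValueError("batch_indices must have one entry per RoI")
    if any(not 0 <= b < n for b in batch_indices):
        raise ValueError("batch index out of range for X")
    return len(rois)  # num_rois

x_shape = (2, 3, 8, 8)  # N=2 images, C=3 channels, H=W=8
rois = [[0.0, 0.0, 4.0, 4.0], [2.0, 2.0, 7.0, 6.0], [1.0, 0.0, 3.0, 5.0]]
batch_indices = [0, 1, 1]  # the first RoI is on image 0, the other two on image 1
num_rois = check_roi_align_inputs(x_shape, rois, batch_indices)
```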

Outputs

Y (T1): RoI pooled output, 4-D tensor of shape (num_rois, C, output_height, output_width). The r-th batch element Y[r-1] is a pooled feature map corresponding to the r-th RoI rois[r-1], computed from the image X[batch_indices[r-1]].
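To make the output layout concrete, here is a rough pure-Python sketch of an average-pooling variant. It is an illustration under several assumptions that this section does not state: RoI coordinates are taken to be already in feature-map units (no scaling), each bin is sampled on a fixed `samples` x `samples` grid, no half-pixel offset is applied, and out-of-range points are clamped. The names `roi_align_avg`, `bilinear`, and `samples` are hypothetical.

```python
import math

def bilinear(fmap, y, x):
    """Bilinear interpolation on a 2-D map, clamping to the border (assumption)."""
    h, w = len(fmap), len(fmap[0])
    y = min(max(y, 0.0), h - 1)
    x = min(max(x, 0.0), w - 1)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    ly, lx = y - y0, x - x0
    top = (1 - lx) * fmap[y0][x0] + lx * fmap[y0][x1]
    bot = (1 - lx) * fmap[y1][x0] + lx * fmap[y1][x1]
    return (1 - ly) * top + ly * bot

def roi_align_avg(X, rois, batch_indices, out_h, out_w, samples=2):
    """Average-pooling RoiAlign sketch over nested lists; X is (N, C, H, W)."""
    Y = []
    for (x1, y1, x2, y2), b in zip(rois, batch_indices):
        bin_h = (y2 - y1) / out_h  # bin size stays fractional, never rounded
        bin_w = (x2 - x1) / out_w
        roi_out = []
        for fmap in X[b]:  # one 2-D map per channel of the selected image
            rows = []
            for ph in range(out_h):
                row = []
                for pw in range(out_w):
                    total = 0.0
                    for iy in range(samples):
                        for ix in range(samples):
                            sy = y1 + (ph + (iy + 0.5) / samples) * bin_h
                            sx = x1 + (pw + (ix + 0.5) / samples) * bin_w
                            total += bilinear(fmap, sy, sx)
                    row.append(total / (samples * samples))
                rows.append(row)
            roi_out.append(rows)
        Y.append(roi_out)
    return Y

# X has shape (1, 1, 4, 4); each value equals its column index.
X = [[[[float(j) for j in range(4)] for _ in range(4)]]]
Y = roi_align_avg(X, [[0.0, 0.0, 2.0, 2.0]], [0], out_h=2, out_w=2)
```

The nested result has one entry per RoI, then one per channel, then out_h rows of out_w values, matching the (num_rois, C, output_height, output_width) layout described for Y.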

Type Constraints

T1: Constrain types to float tensors. Allowed types: tensor(bfloat16), tensor(double), tensor(float), tensor(float16).
T2: Constrain types to int tensors. Allowed types: tensor(int64).

Differences with previous version (16)

SchemaDiff: RoiAlign (domain 'ai.onnx')

old version: 16
new version: 22
breaking: no
