.. _op_ai_onnx_RoiAlign:

RoiAlign
========

- **Domain**: ``ai.onnx``
- **Since version**: 22

Region of Interest (RoI) align operation described in the
`Mask R-CNN paper <https://arxiv.org/abs/1703.06870>`_.
RoiAlign consumes an input tensor X and a set of regions of interest (rois)
and applies pooling across each RoI; it produces a 4-D tensor of shape
(num_rois, C, output_height, output_width).

RoiAlign was proposed to avoid the misalignment caused by
quantization when mapping from the original image to the feature
map and from the feature map to the RoI feature; in each RoI bin,
the values at the sampled locations are computed directly
through bilinear interpolation.
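
As an illustration of the bilinear interpolation step, the following is a minimal pure-Python sketch that samples a single 2-D feature map at a fractional location. The function name ``bilinear_sample`` and the clamping of out-of-range points are assumptions made for this example; they are not taken from the operator specification, and the sketch does not cover bin layout or pooling across samples.

.. code-block:: python

    import math


    def bilinear_sample(feature, y, x):
        """Interpolate a 2-D feature map (list of rows) at fractional (y, x)."""
        height, width = len(feature), len(feature[0])
        # Assumption of this sketch: clamp the point into the valid index range.
        y = min(max(float(y), 0.0), float(height - 1))
        x = min(max(float(x), 0.0), float(width - 1))
        # Integer corners surrounding the sampling point.
        y0, x0 = int(math.floor(y)), int(math.floor(x))
        y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
        # Fractional offsets act as the interpolation weights.
        ly, lx = y - y0, x - x0
        top = feature[y0][x0] * (1.0 - lx) + feature[y0][x1] * lx
        bottom = feature[y1][x0] * (1.0 - lx) + feature[y1][x1] * lx
        return top * (1.0 - ly) + bottom * ly


    feature = [[0.0, 1.0], [2.0, 3.0]]
    center = bilinear_sample(feature, 0.5, 0.5)  # equal weight on all four corners

Because no coordinate is rounded to an integer before sampling, the result varies continuously with the RoI coordinates, which is the property the paragraph above describes.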

**Inputs**

- **X** (*T1*): Input data tensor from the previous operator; 4-D feature map of shape (N, C, H, W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data.
- **rois** (*T1*): RoIs (Regions of Interest) to pool over; rois is a 2-D input of shape (num_rois, 4) given as [[x1, y1, x2, y2], ...]. The RoIs' coordinates are in the coordinate system of the input image. Each coordinate set has a 1:1 correspondence with the 'batch_indices' input.
- **batch_indices** (*T2*): 1-D tensor of shape (num_rois,) with each element denoting the index of the corresponding image in the batch.

**Outputs**

- **Y** (*T1*): RoI pooled output, 4-D tensor of shape (num_rois, C, output_height, output_width). The r-th batch element Y[r-1] is a pooled feature map corresponding to the r-th RoI rois[r-1].
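
The shape relationships between the inputs and the output can be sketched as a small checker. The helper ``roi_align_output_shape`` is hypothetical and exists only for this example; it assumes ``output_height`` and ``output_width`` are supplied by the caller, and it treats a batch index outside ``[0, N)`` as an error, which is a choice of this sketch rather than a documented rule.

.. code-block:: python

    def roi_align_output_shape(x_shape, rois_shape, batch_indices,
                               output_height, output_width):
        """Return the shape of Y for the given input shapes (illustrative only)."""
        if len(x_shape) != 4:
            raise ValueError("X must be 4-D: (N, C, H, W)")
        n, c, _, _ = x_shape
        if len(rois_shape) != 2 or rois_shape[1] != 4:
            raise ValueError("rois must be 2-D: (num_rois, 4)")
        num_rois = rois_shape[0]
        # One batch index per RoI (the 1:1 correspondence described above).
        if len(batch_indices) != num_rois:
            raise ValueError("batch_indices must have num_rois elements")
        # Assumption of this sketch: each index must name an image in the batch.
        for index in batch_indices:
            if not 0 <= index < n:
                raise ValueError("batch index out of range")
        return (num_rois, c, output_height, output_width)


    # Two images, three channels; three RoIs pooled to 5 x 5 each.
    y_shape = roi_align_output_shape((2, 3, 10, 10), (3, 4), [0, 1, 1], 5, 5)

Note that the leading dimension of Y is ``num_rois``, not the batch size N: several RoIs may refer to the same image.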

**Type Constraints**

- **T1**: Constrain types to float tensors.
  Allowed types: tensor(bfloat16), tensor(double), tensor(float), tensor(float16).
- **T2**: Constrain types to int tensors.
  Allowed types: tensor(int64).

Differences with previous version (16)
--------------------------------------

**SchemaDiff**: ``RoiAlign`` (domain ``'ai.onnx'``)

* old version: 16
* new version: 22
* breaking: no

Version History
---------------

- :doc:`Version 16 <RoiAlign-16>`
- :doc:`Version 10 <RoiAlign-10>`