AveragePool - version 19

This page documents version 19 of operator AveragePool. See AveragePool for the latest version (since version 22).

  • Domain: ai.onnx

  • Since version: 19

AveragePool consumes an input tensor X and applies average pooling across the tensor according to kernel sizes, stride sizes, and pad lengths. Average pooling consists of computing the average of all values in a subset of the input tensor, selected according to the kernel size, and downsampling the data into the output tensor Y for further processing. The output spatial shape is calculated differently depending on whether explicit padding (the pads attribute) or auto padding (the auto_pad attribute) is used. With explicit padding (https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html?highlight=maxpool#torch.nn.MaxPool2d):

output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - dilation[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)

or

output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - dilation[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)

if ceil_mode is enabled. pad_shape[i] is the sum of pads along axis i.
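The explicit-padding formula can be evaluated per spatial axis as in the following minimal sketch. The function name `explicit_pad_output_shape` is hypothetical, and two things are assumed rather than stated in this section: that `pads` follows the ONNX layout `[x1_begin, x2_begin, ..., x1_end, x2_end, ...]`, and that `strides` and `dilations` default to 1 along each axis when omitted.

```python
def explicit_pad_output_shape(input_spatial_shape, kernel_shape, strides=None,
                              pads=None, dilations=None, ceil_mode=False):
    """Evaluate the explicit-padding shape formula for every spatial axis.

    Hypothetical helper: defaults and the pads layout are assumptions.
    """
    n = len(input_spatial_shape)
    strides = strides or [1] * n
    dilations = dilations or [1] * n
    pads = pads or [0] * (2 * n)
    output = []
    for i in range(n):
        # pad_shape[i] is the sum of the begin and end pads along axis i.
        pad_shape = pads[i] + pads[i + n]
        numerator = (input_spatial_shape[i] + pad_shape
                     - dilations[i] * (kernel_shape[i] - 1) - 1)
        if ceil_mode:
            # Integer ceiling division: ceil(a / b) == -(-a // b) for b > 0.
            quotient = -(-numerator // strides[i])
        else:
            quotient = numerator // strides[i]
        output.append(quotient + 1)
    return output


# kernel 3x3, stride 2, ceil_mode on a 4x4 input (test_cc_averagepool_2d_ceil)
print(explicit_pad_output_shape([4, 4], [3, 3], strides=[2, 2], ceil_mode=True))  # [2, 2]
# kernel 3x3, pads of 2 on every side of a 4x4 input (test_cc_averagepool_2d_pads)
print(explicit_pad_output_shape([4, 4], [3, 3], pads=[2, 2, 2, 2]))  # [6, 6]
```

The sketch applies the formula literally. For test_cc_averagepool_2d_ceil_last_window_starts_on_pad it would return [2, 2], whereas the example on this page reports a 1 x 1 output, so that case appears to involve an adjustment the formula alone does not express.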

auto_pad is a DEPRECATED attribute. If you are currently using it, the output spatial shape is as follows when ceil_mode is enabled:

VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) + 1) / strides_spatial_shape[i])
SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])

or when ceil_mode is disabled (https://www.tensorflow.org/api_docs/python/tf/keras/layers/AveragePooling2D):

VALID: output_spatial_shape[i] = floor((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i]) + 1
SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = floor((input_spatial_shape[i] - 1) / strides_spatial_shape[i]) + 1
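The four auto_pad cases can be collected into one function, as in this sketch. The name `auto_pad_output_shape` is hypothetical, and a default dilation of 1 per axis is an assumption. Integer arithmetic is used so that the floor and ceil operations are exact.

```python
def auto_pad_output_shape(auto_pad, input_spatial_shape, kernel_spatial_shape,
                          strides, dilations=None, ceil_mode=False):
    """Evaluate the auto_pad shape formulas (hypothetical helper)."""
    n = len(input_spatial_shape)
    dilations = dilations or [1] * n  # assumed default
    output = []
    for i in range(n):
        size, stride = input_spatial_shape[i], strides[i]
        # Extent covered by a dilated kernel along axis i.
        effective_kernel = (kernel_spatial_shape[i] - 1) * dilations[i] + 1
        if auto_pad == "VALID":
            if ceil_mode:
                dim = -(-(size - effective_kernel + 1) // stride)  # ceil
            else:
                dim = (size - effective_kernel) // stride + 1      # floor
        elif auto_pad in ("SAME_UPPER", "SAME_LOWER"):
            if ceil_mode:
                dim = -(-size // stride)                           # ceil
            else:
                dim = (size - 1) // stride + 1                     # floor
        else:
            raise ValueError(f"unsupported auto_pad value: {auto_pad!r}")
        output.append(dim)
    return output


print(auto_pad_output_shape("VALID", [5, 5], [3, 3], [2, 2]))       # [2, 2]
print(auto_pad_output_shape("SAME_UPPER", [5, 5], [3, 3], [2, 2]))  # [3, 3]
```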

The pad shape is as follows for SAME_UPPER or SAME_LOWER:

pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) - input_spatial_shape[i]
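As a quick check of the pad shape formula, the sketch below computes the total padding per axis from an already known output shape. The helper name `same_pad_shape` is hypothetical and a default dilation of 1 is assumed. It returns only the total per axis; how that total is divided between the beginning and the end of the axis depends on whether SAME_UPPER or SAME_LOWER is selected and is not modeled here.

```python
def same_pad_shape(input_spatial_shape, output_spatial_shape,
                   kernel_spatial_shape, strides, dilations=None):
    """Total padding per axis for SAME_UPPER / SAME_LOWER (hypothetical helper)."""
    n = len(input_spatial_shape)
    dilations = dilations or [1] * n  # assumed default
    pad_shape = []
    for i in range(n):
        effective_kernel = (kernel_spatial_shape[i] - 1) * dilations[i] + 1
        pad_shape.append((output_spatial_shape[i] - 1) * strides[i]
                         + effective_kernel - input_spatial_shape[i])
    return pad_shape


# 5x5 input, 3x3 kernel, stride 2: the SAME output shape is 3x3,
# which needs two padded elements in total along each axis.
print(same_pad_shape([5, 5], [3, 3], [3, 3], [2, 2]))  # [2, 2]
```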

The output of each pooling window is divided by the number of elements (excluding padding when the attribute count_include_pad is zero).
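The effect of count_include_pad on the divisor can be seen in a small NumPy sketch. This is an illustrative 2-D implementation under stated assumptions, not the reference implementation: it handles a single channel, floor rounding, no dilation, zero-valued padding, and a `pads` order of `[top, left, bottom, right]`. The name `average_pool_2d` is hypothetical.

```python
import numpy as np


def average_pool_2d(x, kernel_shape, strides=(1, 1), pads=(0, 0, 0, 0),
                    count_include_pad=False):
    """Average-pool a 2-D array (illustrative sketch; see assumptions above)."""
    kh, kw = kernel_shape
    sh, sw = strides
    top, left, bottom, right = pads
    h, w = x.shape
    # Zero-padded copy of the input, plus a mask marking the real elements.
    padded = np.zeros((h + top + bottom, w + left + right), dtype=np.float64)
    padded[top:top + h, left:left + w] = x
    mask = np.zeros_like(padded)
    mask[top:top + h, left:left + w] = 1.0
    out_h = (padded.shape[0] - kh) // sh + 1
    out_w = (padded.shape[1] - kw) // sw + 1
    y = np.empty((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * sh, j * sw
            window_sum = padded[r:r + kh, c:c + kw].sum()
            if count_include_pad:
                count = kh * kw                       # padded cells count too
            else:
                count = mask[r:r + kh, c:c + kw].sum()  # real elements only
            # A window lying entirely in the padding would give count == 0;
            # that case is not handled in this sketch.
            y[i, j] = window_sum / count
    return y


x = np.arange(1, 26, dtype=np.float64).reshape(5, 5)
with_pad = average_pool_2d(x, (3, 3), pads=(1, 1, 1, 1), count_include_pad=True)
without_pad = average_pool_2d(x, (3, 3), pads=(1, 1, 1, 1))
print(with_pad[0, 0])     # (1 + 2 + 6 + 7) / 9
print(without_pad[0, 0])  # (1 + 2 + 6 + 7) / 4
```

The top-left window covers four real elements and five padded ones, so the two settings differ only in the divisor: 9 when padding is counted, 4 when it is not.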

Inputs

  • X (T): Input data tensor from the previous operator; dimensions for the image case are (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and the width of the data. For the non-image case, the dimensions are in the form of (N x C x D1 x D2 … Dn), where N is the batch size. Optionally, if dimension denotation is in effect, the operation expects the input data tensor to arrive with the dimension denotation of [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE …].

Outputs

  • Y (T): Output data tensor from average or max pooling across the input tensor. Dimensions will vary based on various kernel, stride, and pad sizes. The floor value of the dimension is used.

Type Constraints

  • T: Constrain input and output types to float tensors. Allowed types: tensor(double), tensor(float), tensor(float16).

Examples

test_cc_averagepool_1d_default

Attributes:
  kernel_shape = [2]
Inputs:
  x: shape=(1, 1, 8), dtype=float32
    [[[1., 2., 3., 4., 5., 6., 7., 8.]]]

Outputs:
  y: shape=(1, 1, 7), dtype=float32
    [[[1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]]]

test_cc_averagepool_2d_ceil

Attributes:
  kernel_shape = [3, 3]
  strides = [2, 2]
  ceil_mode = 1
Inputs:
  x: shape=(1, 1, 4, 4), dtype=float32
    [[[[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12.],
       [13., 14., 15., 16.]]]]

Outputs:
  y: shape=(1, 1, 2, 2), dtype=float32
    [[[[ 6. ,  7.5],
       [12. , 13.5]]]]

test_cc_averagepool_2d_ceil_last_window_starts_on_pad

Attributes:
  kernel_shape = [3, 3]
  strides = [3, 3]
  pads = [1, 1, 1, 1]
  ceil_mode = 1
  count_include_pad = 1
Inputs:
  x: shape=(1, 1, 2, 2), dtype=float32
    [[[[1., 2.],
       [3., 4.]]]]

Outputs:
  y: shape=(1, 1, 1, 1), dtype=float32
    [[[[1.1111112]]]]
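The single output value above can be reproduced by hand, as in this NumPy sketch. It assumes zero-valued padding and that only the window starting at the padded origin is kept, which is what the 1 x 1 output shape and the test name suggest; the second window would start at padded index 3, inside the end padding.

```python
import numpy as np

x = np.array([[1., 2.],
              [3., 4.]], dtype=np.float32)

# pads = [1, 1, 1, 1]: one row/column of zeros on every side gives a 4x4 array.
padded = np.pad(x, 1)

# The only retained 3x3 window starts at the padded origin.
window = padded[0:3, 0:3]

# count_include_pad = 1, so the divisor is the full window size (9),
# even though only four of the nine cells hold real data.
y = window.sum() / np.float32(window.size)
print(y)  # approximately 10 / 9
```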

test_cc_averagepool_2d_default

Attributes:
  kernel_shape = [2, 2]
Inputs:
  x: shape=(1, 1, 4, 4), dtype=float32
    [[[[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12.],
       [13., 14., 15., 16.]]]]

Outputs:
  y: shape=(1, 1, 3, 3), dtype=float32
    [[[[ 3.5,  4.5,  5.5],
       [ 7.5,  8.5,  9.5],
       [11.5, 12.5, 13.5]]]]

test_cc_averagepool_2d_pads

Attributes:
  kernel_shape = [3, 3]
  pads = [2, 2, 2, 2]
Inputs:
  x: shape=(1, 1, 4, 4), dtype=float32
    [[[[ 1.,  2.,  3.,  4.],
       [ 5.,  6.,  7.,  8.],
       [ 9., 10., 11., 12.],
       [13., 14., 15., 16.]]]]

Outputs:
  y: shape=(1, 1, 6, 6), dtype=float32
    [[[[ 1. ,  1.5,  2. ,  3. ,  3.5,  4. ],
       [ 3. ,  3.5,  4. ,  5. ,  5.5,  6. ],
       [ 5. ,  5.5,  6. ,  7. ,  7.5,  8. ],
       [ 9. ,  9.5, 10. , 11. , 11.5, 12. ],
       [11. , 11.5, 12. , 13. , 13.5, 14. ],
       [13. , 13.5, 14. , 15. , 15.5, 16. ]]]]

test_cc_averagepool_2d_pads_count_include_pad

Attributes:
  kernel_shape = [3, 3]
  pads = [1, 1, 1, 1]
  count_include_pad = 1
Inputs:
  x: shape=(1, 1, 5, 5), dtype=float32
    [[[[ 1.,  2.,  3.,  4.,  5.],
       [ 6.,  7.,  8.,  9., 10.],
       [11., 12., 13., 14., 15.],
       [16., 17., 18., 19., 20.],
       [21., 22., 23., 24., 25.]]]]

Outputs:
  y: shape=(1, 1, 5, 5), dtype=float32
    [[[[ 1.7777778,  3.       ,  3.6666667,  4.3333335,  3.1111112],
       [ 4.3333335,  7.       ,  8.       ,  9.       ,  6.3333335],
       [ 7.6666665, 12.       , 13.       , 14.       ,  9.666667 ],
       [11.       , 17.       , 18.       , 19.       , 13.       ],
       [ 8.444445 , 13.       , 13.666667 , 14.333333 ,  9.777778 ]]]]

test_cc_averagepool_2d_precomputed_pads

Attributes:
  kernel_shape = [5, 5]
  pads = [2, 2, 2, 2]
Inputs:
  x: shape=(1, 1, 5, 5), dtype=float32
    [[[[ 1.,  2.,  3.,  4.,  5.],
       [ 6.,  7.,  8.,  9., 10.],
       [11., 12., 13., 14., 15.],
       [16., 17., 18., 19., 20.],
       [21., 22., 23., 24., 25.]]]]

Outputs:
  y: shape=(1, 1, 5, 5), dtype=float32
    [[[[ 7. ,  7.5,  8. ,  8.5,  9. ],
       [ 9.5, 10. , 10.5, 11. , 11.5],
       [12. , 12.5, 13. , 13.5, 14. ],
       [14.5, 15. , 15.5, 16. , 16.5],
       [17. , 17.5, 18. , 18.5, 19. ]]]]

test_cc_averagepool_2d_precomputed_pads_count_include_pad

Attributes:
  kernel_shape = [5, 5]
  pads = [2, 2, 2, 2]
  count_include_pad = 1
Inputs:
  x: shape=(1, 1, 5, 5), dtype=float32
    [[[[ 1.,  2.,  3.,  4.,  5.],
       [ 6.,  7.,  8.,  9., 10.],
       [11., 12., 13., 14., 15.],
       [16., 17., 18., 19., 20.],
       [21., 22., 23., 24., 25.]]]]

Outputs:
  y: shape=(1, 1, 5, 5), dtype=float32
    [[[[ 2.52,  3.6 ,  4.8 ,  4.08,  3.24],
       [ 4.56,  6.4 ,  8.4 ,  7.04,  5.52],
       [ 7.2 , 10.  , 13.  , 10.8 ,  8.4 ],
       [ 6.96,  9.6 , 12.4 , 10.24,  7.92],
       [ 6.12,  8.4 , 10.8 ,  8.88,  6.84]]]]

test_cc_averagepool_2d_precomputed_strides

Attributes:
  kernel_shape = [2, 2]
  strides = [2, 2]
Inputs:
  x: shape=(1, 1, 5, 5), dtype=float32
    [[[[ 1.,  2.,  3.,  4.,  5.],
       [ 6.,  7.,  8.,  9., 10.],
       [11., 12., 13., 14., 15.],
       [16., 17., 18., 19., 20.],
       [21., 22., 23., 24., 25.]]]]

Outputs:
  y: shape=(1, 1, 2, 2), dtype=float32
    [[[[ 4.,  6.],
       [14., 16.]]]]

test_cc_averagepool_2d_strides

Attributes:
  kernel_shape = [3, 3]
  strides = [2, 2]
Inputs:
  x: shape=(1, 1, 5, 5), dtype=float32
    [[[[ 1.,  2.,  3.,  4.,  5.],
       [ 6.,  7.,  8.,  9., 10.],
       [11., 12., 13., 14., 15.],
       [16., 17., 18., 19., 20.],
       [21., 22., 23., 24., 25.]]]]

Outputs:
  y: shape=(1, 1, 2, 2), dtype=float32
    [[[[ 7.,  9.],
       [17., 19.]]]]
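For unpadded cases such as the one above, the result can be cross-checked with NumPy's `sliding_window_view` (available in NumPy 1.20 and later). This is only a verification sketch for a single 2-D channel, not how any runtime implements the operator.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

x = np.arange(1, 26, dtype=np.float32).reshape(5, 5)

# All 3x3 windows at stride 1: shape (3, 3, 3, 3).
windows = sliding_window_view(x, (3, 3))

# Keep every second window along both axes (strides = [2, 2]),
# then average over the two window axes.
y = windows[::2, ::2].mean(axis=(-2, -1))
print(y)
```

The four retained windows are centred on the elements 7, 9, 17 and 19, which matches the output listed for test_cc_averagepool_2d_strides.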

test_cc_averagepool_3d_default

Attributes:
  kernel_shape = [2, 2, 2]
Inputs:
  x: shape=(1, 1, 3, 3, 3), dtype=float32
    [[[[[ 1.,  2.,  3.],
        [ 4.,  5.,  6.],
        [ 7.,  8.,  9.]],

       [[10., 11., 12.],
        [13., 14., 15.],
        [16., 17., 18.]],

       [[19., 20., 21.],
        [22., 23., 24.],
        [25., 26., 27.]]]]]

Outputs:
  y: shape=(1, 1, 2, 2, 2), dtype=float32
    [[[[[ 7.5,  8.5],
        [10.5, 11.5]],

       [[16.5, 17.5],
        [19.5, 20.5]]]]]

Differences with previous version (11)

SchemaDiff: AveragePool (domain 'ai.onnx')

  • old version: 11

  • new version: 19

  • breaking: no

Documentation:

  • line similarity: 0.63 (+13/-10 lines)

--- AveragePool v11
+++ AveragePool v19
@@ -3,28 +3,31 @@
  the tensor according to kernel sizes, stride sizes, and pad lengths.
  average pooling consisting of computing the average on all values of a
  subset of the input tensor according to the kernel size and downsampling the
- data into the output tensor Y for further processing. The output spatial shape will be following:
+ data into the output tensor Y for further processing. The output spatial shape is calculated differently
+ depending on whether explicit padding is used, where pads is employed, or auto padding is used, where auto_pad is utilized.
+ With explicit padding (https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html?highlight=maxpool#torch.nn.MaxPool2d):
  ```
- output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - kernel_spatial_shape[i]) / strides_spatial_shape[i] + 1)
+ output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - dilation[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)
  ```
  or
  ```
- output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - kernel_spatial_shape[i]) / strides_spatial_shape[i] + 1)
+ output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - dilation[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)
  ```
- if ceil_mode is enabled
+ if ceil_mode is enabled. `pad_shape[i]` is the sum of pads along axis `i`.

+ `auto_pad` is a DEPRECATED attribute. If you are using them currently, the output spatial shape will be following when ceil_mode is enabled:
  ```
- * pad_shape[i] is sum of pads along axis i
+ VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) + 1) / strides_spatial_shape[i])
+ SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
  ```
-
- `auto_pad` is a DEPRECATED attribute. If you are using them currently, the output spatial shape will be following:
+ or when ceil_mode is disabled (https://www.tensorflow.org/api_docs/python/tf/keras/layers/AveragePooling2D):
  ```
- VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - kernel_spatial_shape[i] + 1) / strides_spatial_shape[i])
- SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
+ VALID: output_spatial_shape[i] = floor((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i]) + 1
+ SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = floor((input_spatial_shape[i] - 1) / strides_spatial_shape[i]) + 1
  ```
  And pad shape will be following if `SAME_UPPER` or `SAME_LOWER`:
  ```
- pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + kernel_spatial_shape[i] - input_spatial_shape[i]
+ pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) - input_spatial_shape[i]
  ```
  The output of each pooling window is divided by the number of elements (exclude pad when attribute count_include_pad is zero).