.. _op_ai_onnx_AveragePool-19:

AveragePool - version 19
========================

This page documents version **19** of operator **AveragePool**.
See :doc:`AveragePool` for the latest version (since version 22).

- **Domain**: ``ai.onnx``
- **Since version**: 19

AveragePool consumes an input tensor X and applies average pooling across
the tensor according to kernel sizes, stride sizes, and pad lengths.
Average pooling consists of computing the average of all values of a
subset of the input tensor according to the kernel size and downsampling the
data into the output tensor Y for further processing. The output spatial shape is calculated differently
depending on whether explicit padding is used, where pads is employed, or auto padding is used, where auto_pad is utilized.
With explicit padding (https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html?highlight=maxpool#torch.nn.MaxPool2d):

.. code-block::

    output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - dilation[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)

or

.. code-block::

    output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - dilation[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)

if ceil_mode is enabled. ``pad_shape[i]`` is the sum of pads along axis ``i``.

``auto_pad`` is a DEPRECATED attribute. If you are using it currently, the output spatial shape is as follows when ceil_mode is enabled:

.. code-block::

    VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) + 1) / strides_spatial_shape[i])
    SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])

or when ceil_mode is disabled (https://www.tensorflow.org/api_docs/python/tf/keras/layers/AveragePooling2D):

.. code-block::

    VALID: output_spatial_shape[i] = floor((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i]) + 1
    SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = floor((input_spatial_shape[i] - 1) / strides_spatial_shape[i]) + 1

The pad shape is as follows for ``SAME_UPPER`` or ``SAME_LOWER``:

.. code-block::

    pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) - input_spatial_shape[i]

The output of each pooling window is divided by the number of elements (excluding padding when the attribute count_include_pad is zero).

**Inputs**

- **X** (*T*): Input data tensor from the previous operator; dimensions for
  the image case are (N x C x H x W), where N is the batch size, C is the
  number of channels, and H and W are the height and the width of the data.
  For the non-image case, the dimensions are in the form of
  (N x C x D1 x D2 ... Dn), where N is the batch size. Optionally, if
  dimension denotation is in effect, the operation expects the input data
  tensor to arrive with the dimension denotation of
  [DATA_BATCH, DATA_CHANNEL, DATA_FEATURE, DATA_FEATURE ...].

**Outputs**

- **Y** (*T*): Output data tensor from average or max pooling across the
  input tensor. Dimensions will vary based on various kernel, stride, and
  pad sizes. Floor value of the dimension is used.

**Type Constraints**

- **T**: Constrain input and output types to float tensors. Allowed types:
  tensor(double), tensor(float), tensor(float16).

Examples
--------

**test_cc_averagepool_1d_default**

.. code-block:: text

    Attributes:
      kernel_shape = [2]

.. code-block:: text

    Inputs:
      x: shape=(1, 1, 8), dtype=float32
        [[[1., 2., 3., 4., 5., 6., 7., 8.]]]
    Outputs:
      y: shape=(1, 1, 7), dtype=float32
        [[[1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5]]]

**test_cc_averagepool_2d_ceil**

.. code-block:: text

    Attributes:
      kernel_shape = [3, 3]
      strides = [2, 2]
      ceil_mode = 1

.. code-block:: text

    Inputs:
      x: shape=(1, 1, 4, 4), dtype=float32
        [[[[ 1.,  2.,  3.,  4.],
           [ 5.,  6.,  7.,  8.],
           [ 9., 10., 11., 12.],
           [13., 14., 15., 16.]]]]
    Outputs:
      y: shape=(1, 1, 2, 2), dtype=float32
        [[[[ 6. ,  7.5],
           [12. , 13.5]]]]

**test_cc_averagepool_2d_ceil_last_window_starts_on_pad**

.. code-block:: text

    Attributes:
      kernel_shape = [3, 3]
      strides = [3, 3]
      pads = [1, 1, 1, 1]
      ceil_mode = 1
      count_include_pad = 1

.. code-block:: text

    Inputs:
      x: shape=(1, 1, 2, 2), dtype=float32
        [[[[1., 2.],
           [3., 4.]]]]
    Outputs:
      y: shape=(1, 1, 1, 1), dtype=float32
        [[[[1.1111112]]]]

**test_cc_averagepool_2d_default**

.. code-block:: text

    Attributes:
      kernel_shape = [2, 2]

.. code-block:: text

    Inputs:
      x: shape=(1, 1, 4, 4), dtype=float32
        [[[[ 1.,  2.,  3.,  4.],
           [ 5.,  6.,  7.,  8.],
           [ 9., 10., 11., 12.],
           [13., 14., 15., 16.]]]]
    Outputs:
      y: shape=(1, 1, 3, 3), dtype=float32
        [[[[ 3.5,  4.5,  5.5],
           [ 7.5,  8.5,  9.5],
           [11.5, 12.5, 13.5]]]]

**test_cc_averagepool_2d_pads**

.. code-block:: text

    Attributes:
      kernel_shape = [3, 3]
      pads = [2, 2, 2, 2]

.. code-block:: text

    Inputs:
      x: shape=(1, 1, 4, 4), dtype=float32
        [[[[ 1.,  2.,  3.,  4.],
           [ 5.,  6.,  7.,  8.],
           [ 9., 10., 11., 12.],
           [13., 14., 15., 16.]]]]
    Outputs:
      y: shape=(1, 1, 6, 6), dtype=float32
        [[[[ 1. ,  1.5,  2. ,  3. ,  3.5,  4. ],
           [ 3. ,  3.5,  4. ,  5. ,  5.5,  6. ],
           [ 5. ,  5.5,  6. ,  7. ,  7.5,  8. ],
           [ 9. ,  9.5, 10. , 11. , 11.5, 12. ],
           [11. , 11.5, 12. , 13. , 13.5, 14. ],
           [13. , 13.5, 14. , 15. , 15.5, 16. ]]]]

**test_cc_averagepool_2d_pads_count_include_pad**

.. code-block:: text

    Attributes:
      kernel_shape = [3, 3]
      pads = [1, 1, 1, 1]
      count_include_pad = 1

.. code-block:: text

    Inputs:
      x: shape=(1, 1, 5, 5), dtype=float32
        [[[[ 1.,  2.,  3.,  4.,  5.],
           [ 6.,  7.,  8.,  9., 10.],
           [11., 12., 13., 14., 15.],
           [16., 17., 18., 19., 20.],
           [21., 22., 23., 24., 25.]]]]
    Outputs:
      y: shape=(1, 1, 5, 5), dtype=float32
        [[[[ 1.7777778,  3.       ,  3.6666667,  4.3333335,  3.1111112],
           [ 4.3333335,  7.       ,  8.       ,  9.       ,  6.3333335],
           [ 7.6666665, 12.       , 13.       , 14.       ,  9.666667 ],
           [11.       , 17.       , 18.       , 19.       , 13.       ],
           [ 8.444445 , 13.       , 13.666667 , 14.333333 ,  9.777778 ]]]]

**test_cc_averagepool_2d_precomputed_pads**

.. code-block:: text

    Attributes:
      kernel_shape = [5, 5]
      pads = [2, 2, 2, 2]

.. code-block:: text

    Inputs:
      x: shape=(1, 1, 5, 5), dtype=float32
        [[[[ 1.,  2.,  3.,  4.,  5.],
           [ 6.,  7.,  8.,  9., 10.],
           [11., 12., 13., 14., 15.],
           [16., 17., 18., 19., 20.],
           [21., 22., 23., 24., 25.]]]]
    Outputs:
      y: shape=(1, 1, 5, 5), dtype=float32
        [[[[ 7. ,  7.5,  8. ,  8.5,  9. ],
           [ 9.5, 10. , 10.5, 11. , 11.5],
           [12. , 12.5, 13. , 13.5, 14. ],
           [14.5, 15. , 15.5, 16. , 16.5],
           [17. , 17.5, 18. , 18.5, 19. ]]]]

**test_cc_averagepool_2d_precomputed_pads_count_include_pad**

.. code-block:: text

    Attributes:
      kernel_shape = [5, 5]
      pads = [2, 2, 2, 2]
      count_include_pad = 1

.. code-block:: text

    Inputs:
      x: shape=(1, 1, 5, 5), dtype=float32
        [[[[ 1.,  2.,  3.,  4.,  5.],
           [ 6.,  7.,  8.,  9., 10.],
           [11., 12., 13., 14., 15.],
           [16., 17., 18., 19., 20.],
           [21., 22., 23., 24., 25.]]]]
    Outputs:
      y: shape=(1, 1, 5, 5), dtype=float32
        [[[[ 2.52,  3.6 ,  4.8 ,  4.08,  3.24],
           [ 4.56,  6.4 ,  8.4 ,  7.04,  5.52],
           [ 7.2 , 10.  , 13.  , 10.8 ,  8.4 ],
           [ 6.96,  9.6 , 12.4 , 10.24,  7.92],
           [ 6.12,  8.4 , 10.8 ,  8.88,  6.84]]]]

**test_cc_averagepool_2d_precomputed_strides**

.. code-block:: text

    Attributes:
      kernel_shape = [2, 2]
      strides = [2, 2]

.. code-block:: text

    Inputs:
      x: shape=(1, 1, 5, 5), dtype=float32
        [[[[ 1.,  2.,  3.,  4.,  5.],
           [ 6.,  7.,  8.,  9., 10.],
           [11., 12., 13., 14., 15.],
           [16., 17., 18., 19., 20.],
           [21., 22., 23., 24., 25.]]]]
    Outputs:
      y: shape=(1, 1, 2, 2), dtype=float32
        [[[[ 4.,  6.],
           [14., 16.]]]]

**test_cc_averagepool_2d_strides**

.. code-block:: text

    Attributes:
      kernel_shape = [3, 3]
      strides = [2, 2]

.. code-block:: text

    Inputs:
      x: shape=(1, 1, 5, 5), dtype=float32
        [[[[ 1.,  2.,  3.,  4.,  5.],
           [ 6.,  7.,  8.,  9., 10.],
           [11., 12., 13., 14., 15.],
           [16., 17., 18., 19., 20.],
           [21., 22., 23., 24., 25.]]]]
    Outputs:
      y: shape=(1, 1, 2, 2), dtype=float32
        [[[[ 7.,  9.],
           [17., 19.]]]]

**test_cc_averagepool_3d_default**

.. code-block:: text

    Attributes:
      kernel_shape = [2, 2, 2]

.. code-block:: text

    Inputs:
      x: shape=(1, 1, 3, 3, 3), dtype=float32
        [[[[[ 1.,  2.,  3.],
            [ 4.,  5.,  6.],
            [ 7.,  8.,  9.]],
           [[10., 11., 12.],
            [13., 14., 15.],
            [16., 17., 18.]],
           [[19., 20., 21.],
            [22., 23., 24.],
            [25., 26., 27.]]]]]
    Outputs:
      y: shape=(1, 1, 2, 2, 2), dtype=float32
        [[[[[ 7.5,  8.5],
            [10.5, 11.5]],
           [[16.5, 17.5],
            [19.5, 20.5]]]]]

Differences with previous version (11)
--------------------------------------

**SchemaDiff**: ``AveragePool`` (domain ``'ai.onnx'``)

* old version: 11
* new version: 19
* breaking: no

**Documentation:**

* line similarity: 0.63 (+13/-10 lines)

.. code-block:: diff

    --- AveragePool v11
    +++ AveragePool v19
    @@ -3,28 +3,31 @@
      the tensor according to kernel sizes, stride sizes, and pad lengths.
      average pooling consisting of computing the average on all values of a
      subset of the input tensor according to the kernel size and downsampling the
    - data into the output tensor Y for further processing. The output spatial shape will be following:
    + data into the output tensor Y for further processing. The output spatial shape is calculated differently
    + depending on whether explicit padding is used, where pads is employed, or auto padding is used, where auto_pad is utilized.
    + With explicit padding (https://pytorch.org/docs/stable/generated/torch.nn.MaxPool2d.html?highlight=maxpool#torch.nn.MaxPool2d):
      ```
    - output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - kernel_spatial_shape[i]) / strides_spatial_shape[i] + 1)
    + output_spatial_shape[i] = floor((input_spatial_shape[i] + pad_shape[i] - dilation[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)
      ```
      or
      ```
    - output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - kernel_spatial_shape[i]) / strides_spatial_shape[i] + 1)
    + output_spatial_shape[i] = ceil((input_spatial_shape[i] + pad_shape[i] - dilation[i] * (kernel_shape[i] - 1) - 1) / strides_spatial_shape[i] + 1)
      ```
    - if ceil_mode is enabled
    + if ceil_mode is enabled. `pad_shape[i]` is the sum of pads along axis `i`.

    + `auto_pad` is a DEPRECATED attribute. If you are using them currently, the output spatial shape will be following when ceil_mode is enabled:
      ```
    - * pad_shape[i] is sum of pads along axis i
    + VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) + 1) / strides_spatial_shape[i])
    + SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
      ```
    -
    - `auto_pad` is a DEPRECATED attribute. If you are using them currently, the output spatial shape will be following:
    + or when ceil_mode is disabled (https://www.tensorflow.org/api_docs/python/tf/keras/layers/AveragePooling2D):
      ```
    - VALID: output_spatial_shape[i] = ceil((input_spatial_shape[i] - kernel_spatial_shape[i] + 1) / strides_spatial_shape[i])
    - SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = ceil(input_spatial_shape[i] / strides_spatial_shape[i])
    + VALID: output_spatial_shape[i] = floor((input_spatial_shape[i] - ((kernel_spatial_shape[i] - 1) * dilations[i] + 1)) / strides_spatial_shape[i]) + 1
    + SAME_UPPER or SAME_LOWER: output_spatial_shape[i] = floor((input_spatial_shape[i] - 1) / strides_spatial_shape[i]) + 1
      ```
      And pad shape will be following if `SAME_UPPER` or `SAME_LOWER`:
      ```
    - pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + kernel_spatial_shape[i] - input_spatial_shape[i]
    + pad_shape[i] = (output_spatial_shape[i] - 1) * strides_spatial_shape[i] + ((kernel_spatial_shape[i] - 1) * dilations[i] + 1) - input_spatial_shape[i]
      ```
      The output of each pooling window is divided by the number of elements (exclude pad when attribute count_include_pad is zero).
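
The version 19 formulas differ from version 11 mainly by the dilation term, so a small worked sketch may help when checking shapes by hand. The Python below is an illustration only, not the ONNX reference implementation: ``pool_output_size`` and ``average_pool_1d`` are hypothetical helper names, the shape function covers only the explicit-padding formula, and the 1-D pooling helper assumes floor mode, a dilation of 1, and that every window overlaps at least one real input element.

```python
def pool_output_size(in_size, kernel, stride=1, pad_total=0, dilation=1,
                     ceil_mode=False):
    """Length of one output axis under explicit padding (v19 formula).

    pad_total plays the role of pad_shape[i]: the sum of the begin and
    end pads on that axis.
    """
    # Numerator shared by both rounding modes:
    # input + pad - dilation * (kernel - 1) - 1
    span = in_size + pad_total - dilation * (kernel - 1) - 1
    if ceil_mode:
        return -(-span // stride) + 1  # ceil(span / stride) + 1, integers only
    return span // stride + 1          # floor(span / stride) + 1


def average_pool_1d(x, kernel, stride=1, pads=(0, 0), count_include_pad=False):
    """Average pooling over a 1-D list (hypothetical helper, see assumptions)."""
    begin, end = pads
    # None marks a padded cell so it can be excluded from the divisor.
    padded = [None] * begin + list(x) + [None] * end
    out_len = pool_output_size(len(x), kernel, stride, begin + end)
    out = []
    for o in range(out_len):
        window = padded[o * stride:o * stride + kernel]
        real = [v for v in window if v is not None]
        # Padded cells contribute zero to the sum; they count toward the
        # divisor only when count_include_pad is set.
        divisor = len(window) if count_include_pad else len(real)
        out.append(sum(real) / divisor)
    return out


# Shapes of some of the examples on this page, one axis at a time.
print(pool_output_size(8, 2))                          # 1d_default: 7
print(pool_output_size(4, 3, stride=2, ceil_mode=True))  # 2d_ceil: 2
print(pool_output_size(4, 3, pad_total=4))             # 2d_pads: 6

# With dilation 2 a kernel of 3 spans 5 input cells, which shrinks the output.
print(pool_output_size(8, 3, dilation=2))              # 4

# 1-D analogue of test_cc_averagepool_1d_default.
print(average_pool_1d([1, 2, 3, 4, 5, 6, 7, 8], kernel=2))
```

Applied to ``test_cc_averagepool_2d_ceil_last_window_starts_on_pad`` (input 2, kernel 3, stride 3, total pad 2, ceil mode), the formula alone gives 2 per axis, while the example lists an output of size 1. The sketch does not model whatever adjustment produces that smaller shape, so treat it as a check of the formula rather than of a full implementation.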