ConvTranspose

ConvTranspose - 11

Version

  • name: ConvTranspose

  • domain: main

  • since_version: 11

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 11.

Summary

The convolution transpose operator consumes an input tensor and a filter, and computes the output.

If the pads parameter is provided, the shape of the output is calculated via the following equation:

output_shape[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - pads[start_i] - pads[end_i]
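As a quick check on the equation above, here is a minimal sketch in plain Python. The function name conv_transpose_output_shape and its keyword defaults are illustrative and not part of the ONNX API. It assumes pads is laid out as [x1_begin, x2_begin, ..., x1_end, x2_end, ...], as described under Attributes.

```python
def conv_transpose_output_shape(input_size, kernel_shape, strides=None,
                                dilations=None, pads=None, output_padding=None):
    """Spatial output shape of ConvTranspose when pads are given explicitly.

    Hypothetical helper: all list arguments hold one entry per spatial axis,
    except pads, which holds all begin values followed by all end values.
    """
    n = len(input_size)
    strides = strides or [1] * n
    dilations = dilations or [1] * n
    pads = pads or [0] * (2 * n)
    output_padding = output_padding or [0] * n
    shape = []
    for i in range(n):
        # Extent covered by the dilated kernel along axis i.
        effective_kernel = (kernel_shape[i] - 1) * dilations[i] + 1
        shape.append(strides[i] * (input_size[i] - 1) + output_padding[i]
                     + effective_kernel - pads[i] - pads[i + n])
    return shape


print(conv_transpose_output_shape([3], [3]))                        # [5]
print(conv_transpose_output_shape([3, 3], [3, 3], strides=[3, 2],
                                  pads=[1, 2, 1, 2]))               # [7, 3]
print(conv_transpose_output_shape([3, 3], [2, 2], dilations=[2, 2]))  # [5, 5]
```

The three calls use the input and kernel sizes of the convtranspose_1d, convtranspose_pads and convtranspose_dilations examples, and the results agree with the spatial shapes annotated there.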

output_shape can also be specified explicitly, in which case the pads values are auto-generated using these equations:

total_padding[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - output_shape[i]
If (auto_pads == SAME_UPPER): pads[start_i] = total_padding[i]/2; pads[end_i] = total_padding[i] - (total_padding[i]/2)
Else: pads[start_i] = total_padding[i] - (total_padding[i]/2); pads[end_i] = (total_padding[i]/2).
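The pad-generation equations can be sketched the same way. This is an illustration under two assumptions that the text does not state: "/" is read as integer division, and total_padding is non-negative. The function name conv_transpose_auto_pads is hypothetical.

```python
def conv_transpose_auto_pads(input_size, kernel_shape, output_shape,
                             strides=None, dilations=None,
                             output_padding=None, auto_pad="NOTSET"):
    """Pads implied by an explicit output_shape, as [begins..., ends...].

    Hypothetical helper. Assumes "/" in the equations is integer division
    and that total_padding[i] >= 0 on every axis.
    """
    n = len(input_size)
    strides = strides or [1] * n
    dilations = dilations or [1] * n
    output_padding = output_padding or [0] * n
    begins, ends = [], []
    for i in range(n):
        total = (strides[i] * (input_size[i] - 1) + output_padding[i]
                 + (kernel_shape[i] - 1) * dilations[i] + 1 - output_shape[i])
        half = total // 2
        if auto_pad == "SAME_UPPER":
            # Smaller half first: an odd total puts the extra unit at the end.
            begins.append(half)
            ends.append(total - half)
        else:
            # Larger half first: an odd total puts the extra unit at the start.
            begins.append(total - half)
            ends.append(half)
    return begins + ends


# 3x3 input, 3x3 kernel, strides [2, 2], requested output 6x6: total_padding is 1 per axis.
print(conv_transpose_auto_pads([3, 3], [3, 3], [6, 6], strides=[2, 2],
                               auto_pad="SAME_UPPER"))  # [0, 0, 1, 1]
print(conv_transpose_auto_pads([3, 3], [3, 3], [6, 6], strides=[2, 2],
                               auto_pad="SAME_LOWER"))  # [1, 1, 0, 0]
```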

Attributes

  • auto_pad: Must be one of NOTSET, SAME_UPPER, SAME_LOWER or VALID. The default, NOTSET, means explicit padding is used. SAME_UPPER and SAME_LOWER pad the input so that output_shape[i] = input_shape[i] * strides[i] for each axis i. The padding is split between the two sides equally, or almost equally when it is odd: the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER. Default value is 'NOTSET'.

  • dilations: dilation value along each spatial axis of the filter. If not present, the dilation defaults to 1 along each spatial axis.

  • group: number of groups input channels and output channels are divided into. Default value is 1.

  • kernel_shape: The shape of the convolution kernel. If not present, should be inferred from input W.

  • output_padding: Additional elements added to the side with higher coordinate indices in the output. Each padding value in “output_padding” must be less than the corresponding stride/dilation dimension. By default, this attribute is a zero vector. Note that this attribute doesn’t directly affect the computed output values. It only controls the selection of the computed values, so changing this attribute only adds or removes output elements. If “output_shape” is explicitly provided, “output_padding” does not contribute additional size to “output_shape” but participates in the computation of the needed padding amount. This is also called adjs or adjustment in some frameworks.

  • output_shape: The shape of the output can be set explicitly, which causes the pads values to be auto-generated. If output_shape is specified, pads values are ignored. See the Summary for the equations used to generate pads.

  • pads: Padding for the beginning and end of each spatial axis; each value must be greater than or equal to 0. Each value is the number of pixels added to the beginning or end of the corresponding axis. The format is [x1_begin, x2_begin, …, x1_end, x2_end, …], where xi_begin is the number of pixels added at the beginning of axis i and xi_end is the number of pixels added at the end of axis i. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 at the start and end of each spatial axis.

  • strides: Stride along each spatial axis. If not present, the stride defaults to 1 along each spatial axis.
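The pads layout is easy to misread because all begin values come first, followed by all end values. A small sketch of how the list maps to per-axis pairs; the helper name split_pads is illustrative only:

```python
def split_pads(pads):
    """Return [(begin, end), ...] per spatial axis from an ONNX-style pads list."""
    if len(pads) % 2 != 0:
        raise ValueError("pads must hold a begin and an end value for every axis")
    if any(p < 0 for p in pads):
        raise ValueError("pads values must be greater than or equal to 0")
    n = len(pads) // 2
    # The first n entries are the begins, the last n entries are the ends.
    return list(zip(pads[:n], pads[n:]))


print(split_pads([1, 2, 1, 2]))  # [(1, 1), (2, 2)]
print(split_pads([0, 1, 2, 3]))  # [(0, 2), (1, 3)]
```

In the second call, axis 1 gets 0 at the beginning and 2 at the end, and axis 2 gets 1 at the beginning and 3 at the end.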

Inputs

Between 2 and 3 inputs.

  • X (heterogeneous) - T: Input data tensor from previous layer; has size (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and width. Note that this is for the 2D image. Otherwise the size is (N x C x D1 x D2 … x Dn)

  • W (heterogeneous) - T: The weight tensor that will be used in the convolutions; has size (C x M/group x kH x kW), where C is the number of channels, kH and kW are the height and width of the kernel, and M is the number of feature maps. For more than 2 dimensions, the weight shape will be (C x M/group x k1 x k2 x … x kn), where (k1 x k2 x … x kn) is the dimension of the kernel. The number of channels in the output should be equal to W.shape[1] * group (assuming zero-based indices of the shape array).

  • B (optional, heterogeneous) - T: Optional 1D bias to be added to the convolution, has size of M.

Outputs

  • Y (heterogeneous) - T: Output data tensor that contains the result of the convolution. The output dimensions are functions of the kernel size, stride size, pad lengths and group count. The number of channels in the output should be equal to W.shape[1] * group (assuming zero-based indices of the shape array).
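Putting the input and output descriptions together, the full shape of Y can be sketched from the shapes of X and W. The helper name conv_transpose_y_shape is hypothetical; it assumes explicit pads (no auto_pad or output_shape) and checks only the channel relationship stated above.

```python
def conv_transpose_y_shape(x_shape, w_shape, group=1, strides=None,
                           dilations=None, pads=None, output_padding=None):
    """Full shape of Y, (N, M, ...), from X (N, C, ...) and W (C, M/group, ...)."""
    n = len(x_shape) - 2  # number of spatial axes
    if x_shape[1] != w_shape[0]:
        raise ValueError("X.shape[1] and W.shape[0] must both equal C")
    strides = strides or [1] * n
    dilations = dilations or [1] * n
    pads = pads or [0] * (2 * n)
    output_padding = output_padding or [0] * n
    out_channels = w_shape[1] * group  # M
    spatial = [
        strides[i] * (x_shape[2 + i] - 1) + output_padding[i]
        + (w_shape[2 + i] - 1) * dilations[i] + 1
        - pads[i] - pads[i + n]
        for i in range(n)
    ]
    return [x_shape[0], out_channels] + spatial


print(conv_transpose_y_shape([1, 1, 3, 3], [1, 2, 3, 3],
                             strides=[3, 2], pads=[1, 2, 1, 2]))  # [1, 2, 7, 3]
print(conv_transpose_y_shape([1, 4, 5, 5], [4, 3, 3, 3], group=2))  # [1, 6, 7, 7]
```

The first call uses the shapes of the convtranspose_pads example. The second uses made-up shapes to show that with group=2 and W.shape[1] = 3 the output has 6 channels.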

Type Constraints

  • T in ( tensor(double), tensor(float), tensor(float16) ): Constrain input and output types to float tensors.

Examples

convtranspose_1d

import numpy as np
import onnx
# expect() is the test helper from ONNX's backend node-test suite.

x = np.array([[[0., 1., 2.]]]).astype(np.float32)  # (1, 1, 3)

W = np.array([[[1., 1., 1.],  # (1, 2, 3)
               [1., 1., 1.]]]).astype(np.float32)

node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"])

y = np.array([[[0., 1., 3., 3., 2.],  # (1, 2, 5)
               [0., 1., 3., 3., 2.]]]).astype(np.float32)

expect(node, inputs=[x, W], outputs=[y], name='test_convtranspose_1d')
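To see where the expected values come from, the following plain-Python sketch computes a single-channel 1D transposed convolution by scatter-add: each input element adds a scaled copy of the kernel into the output. It is an illustration, not the ONNX reference implementation, and it ignores padding, dilation and groups.

```python
def conv_transpose_1d(x, w, stride=1):
    """Single-channel 1D transposed convolution (no padding, no dilation)."""
    out = [0.0] * (stride * (len(x) - 1) + len(w))
    for i, xi in enumerate(x):
        for k, wk in enumerate(w):
            # Input element i contributes to output positions i*stride .. i*stride + len(w) - 1.
            out[i * stride + k] += xi * wk
    return out


print(conv_transpose_1d([0., 1., 2.], [1., 1., 1.]))  # [0.0, 1.0, 3.0, 3.0, 2.0]
```

The result matches one output channel of y above. Both output channels are identical in the example because both kernels are all ones.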

convtranspose_3d

x = np.array([[[[[0., 1., 2., 3., 4.],  # (1, 1, 3, 4, 5)
                 [5., 6., 7., 8., 9.],
                 [10., 11., 12., 13., 14.],
                 [15., 16., 17., 18., 19.]],
                [[20., 21., 22., 23., 24.],
                 [25., 26., 27., 28., 29.],
                 [30., 31., 32., 33., 34.],
                 [35., 36., 37., 38., 39.]],
                [[40., 41., 42., 43., 44.],
                 [45., 46., 47., 48., 49.],
                 [50., 51., 52., 53., 54.],
                 [55., 56., 57., 58., 59.]]]]]).astype(np.float32)

W = np.array([[[[[1., 1., 1.],  # (1, 2, 3, 3, 3)
                 [1., 1., 1.],
                 [1., 1., 1.]],
                [[1., 1., 1.],
                 [1., 1., 1.],
                 [1., 1., 1.]],
                [[1., 1., 1.],
                 [1., 1., 1.],
                 [1., 1., 1.]]],
               [[[1., 1., 1.],
                 [1., 1., 1.],
                 [1., 1., 1.]],
                [[1., 1., 1.],
                 [1., 1., 1.],
                 [1., 1., 1.]],
                [[1., 1., 1.],
                 [1., 1., 1.],
                 [1., 1., 1.]]]]]).astype(np.float32)

node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"])

y = np.array([[[[[0., 1., 3., 6., 9., 7., 4.],  # (1, 2, 5, 6, 7)
                 [5., 12., 21., 27., 33., 24., 13.],
                 [15., 33., 54., 63., 72., 51., 27.],
                 [30., 63., 99., 108., 117., 81., 42.],
                 [25., 52., 81., 87., 93., 64., 33.],
                 [15., 31., 48., 51., 54., 37., 19.]],

                [[20., 42., 66., 72., 78., 54., 28.],
                 [50., 104., 162., 174., 186., 128., 66.],
                 [90., 186., 288., 306., 324., 222., 114.],
                 [120., 246., 378., 396., 414., 282., 144.],
                 [90., 184., 282., 294., 306., 208., 106.],
                 [50., 102., 156., 162., 168., 114., 58.]],

                [[60., 123., 189., 198., 207., 141., 72.],
                 [135., 276., 423., 441., 459., 312., 159.],
                 [225., 459., 702., 729., 756., 513., 261.],
                 [270., 549., 837., 864., 891., 603., 306.],
                 [195., 396., 603., 621., 639., 432., 219.],
                 [105., 213., 324., 333., 342., 231., 117.]],

                [[60., 122., 186., 192., 198., 134., 68.],
                 [130., 264., 402., 414., 426., 288., 146.],
                 [210., 426., 648., 666., 684., 462., 234.],
                 [240., 486., 738., 756., 774., 522., 264.],
                 [170., 344., 522., 534., 546., 368., 186.],
                 [90., 182., 276., 282., 288., 194., 98.]],

                [[40., 81., 123., 126., 129., 87., 44.],
                 [85., 172., 261., 267., 273., 184., 93.],
                 [135., 273., 414., 423., 432., 291., 147.],
                 [150., 303., 459., 468., 477., 321., 162.],
                 [105., 212., 321., 327., 333., 224., 113.],
                 [55., 111., 168., 171., 174., 117., 59.]]],

               [[[0., 1., 3., 6., 9., 7., 4.],
                 [5., 12., 21., 27., 33., 24., 13.],
                 [15., 33., 54., 63., 72., 51., 27.],
                 [30., 63., 99., 108., 117., 81., 42.],
                 [25., 52., 81., 87., 93., 64., 33.],
                 [15., 31., 48., 51., 54., 37., 19.]],

                [[20., 42., 66., 72., 78., 54., 28.],
                 [50., 104., 162., 174., 186., 128., 66.],
                 [90., 186., 288., 306., 324., 222., 114.],
                 [120., 246., 378., 396., 414., 282., 144.],
                 [90., 184., 282., 294., 306., 208., 106.],
                 [50., 102., 156., 162., 168., 114., 58.]],

                [[60., 123., 189., 198., 207., 141., 72.],
                 [135., 276., 423., 441., 459., 312., 159.],
                 [225., 459., 702., 729., 756., 513., 261.],
                 [270., 549., 837., 864., 891., 603., 306.],
                 [195., 396., 603., 621., 639., 432., 219.],
                 [105., 213., 324., 333., 342., 231., 117.]],

                [[60., 122., 186., 192., 198., 134., 68.],
                 [130., 264., 402., 414., 426., 288., 146.],
                 [210., 426., 648., 666., 684., 462., 234.],
                 [240., 486., 738., 756., 774., 522., 264.],
                 [170., 344., 522., 534., 546., 368., 186.],
                 [90., 182., 276., 282., 288., 194., 98.]],

                [[40., 81., 123., 126., 129., 87., 44.],
                 [85., 172., 261., 267., 273., 184., 93.],
                 [135., 273., 414., 423., 432., 291., 147.],
                 [150., 303., 459., 468., 477., 321., 162.],
                 [105., 212., 321., 327., 333., 224., 113.],
                 [55., 111., 168., 171., 174., 117., 59.]]]]]).astype(np.float32)

expect(node, inputs=[x, W], outputs=[y], name='test_convtranspose_3d')

convtranspose_attributes

x = np.array([[[[0., 1., 2.],  # (1, 1, 3, 3)
                [3., 4., 5.],
                [6., 7., 8.]]]]).astype(np.float32)

W = np.array([[[[1., 1., 1.],  # (1, 2, 3, 3)
                [1., 1., 1.],
                [1., 1., 1.]],
               [[1., 1., 1.],
                [1., 1., 1.],
                [1., 1., 1.]]]]).astype(np.float32)

y = np.array([[[[0., 0., 1., 1., 3., 2., 2., 0.],  # (1, 2, 10, 8)
                [0., 0., 1., 1., 3., 2., 2., 0.],
                [0., 0., 1., 1., 3., 2., 2., 0.],
                [3., 3., 7., 4., 9., 5., 5., 0.],
                [3., 3., 7., 4., 9., 5., 5., 0.],
                [3., 3., 7., 4., 9., 5., 5., 0.],
                [6., 6., 13., 7., 15., 8., 8., 0.],
                [6., 6., 13., 7., 15., 8., 8., 0.],
                [6., 6., 13., 7., 15., 8., 8., 0.],
                [0., 0., 0., 0., 0., 0., 0., 0.]],

               [[0., 0., 1., 1., 3., 2., 2., 0.],
                [0., 0., 1., 1., 3., 2., 2., 0.],
                [0., 0., 1., 1., 3., 2., 2., 0.],
                [3., 3., 7., 4., 9., 5., 5., 0.],
                [3., 3., 7., 4., 9., 5., 5., 0.],
                [3., 3., 7., 4., 9., 5., 5., 0.],
                [6., 6., 13., 7., 15., 8., 8., 0.],
                [6., 6., 13., 7., 15., 8., 8., 0.],
                [6., 6., 13., 7., 15., 8., 8., 0.],
                [0., 0., 0., 0., 0., 0., 0., 0.]]]]).astype(np.float32)

node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"],
                             strides=[3, 2],
                             output_shape=[10, 8])
expect(node, inputs=[x, W], outputs=[y], name='test_convtranspose_output_shape')

node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"],
                             strides=[3, 2],
                             output_padding=[1, 1])
expect(node, inputs=[x, W], outputs=[y], name='test_convtranspose_pad')

node = onnx.helper.make_node(
    'ConvTranspose', ['X', 'W'], ['Y'],
    name='test',
    strides=[3, 2],
    output_shape=[10, 8],
    kernel_shape=[3, 3],
    output_padding=[1, 1]
)
expect(node, inputs=[x, W], outputs=[y],
       name='test_convtranspose_kernel_shape')

convtranspose_pads

x = np.array([[[[0., 1., 2.],  # (1, 1, 3, 3)
                [3., 4., 5.],
                [6., 7., 8.]]]]).astype(np.float32)

W = np.array([[[[1., 1., 1.],  # (1, 2, 3, 3)
                [1., 1., 1.],
                [1., 1., 1.]],
               [[1., 1., 1.],
                [1., 1., 1.],
                [1., 1., 1.]]]]).astype(np.float32)

node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"],
                             strides=[3, 2],
                             pads=[1, 2, 1, 2])

y = np.array([[[[1., 1., 3.],  # (1, 2, 7, 3)
                [1., 1., 3.],
                [7., 4., 9.],
                [7., 4., 9.],
                [7., 4., 9.],
                [13., 7., 15.],
                [13., 7., 15.]],

               [[1., 1., 3.],
                [1., 1., 3.],
                [7., 4., 9.],
                [7., 4., 9.],
                [7., 4., 9.],
                [13., 7., 15.],
                [13., 7., 15.]]]]).astype(np.float32)

expect(node, inputs=[x, W], outputs=[y], name='test_convtranspose_pads')
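One way to read this example is that pads crops the unpadded result: the full 9 x 7 output is computed first, then 1 row is removed from the top and bottom and 2 columns from the left and right. The sketch below follows that reading for a single channel in plain Python. The function name is hypothetical and the code is not the ONNX reference implementation.

```python
def conv_transpose_2d(x, w, strides=(1, 1), pads=(0, 0, 0, 0)):
    """Single-channel 2D transposed convolution.

    pads follows the ONNX layout [x1_begin, x2_begin, x1_end, x2_end],
    i.e. (top, left, bottom, right) for a 2D image.
    """
    sh, sw = strides
    kh, kw = len(w), len(w[0])
    full_h = sh * (len(x) - 1) + kh
    full_w = sw * (len(x[0]) - 1) + kw
    full = [[0.0] * full_w for _ in range(full_h)]
    for i, row in enumerate(x):
        for j, value in enumerate(row):
            for a in range(kh):
                for b in range(kw):
                    # Scatter a scaled copy of the kernel at (i*sh, j*sw).
                    full[i * sh + a][j * sw + b] += value * w[a][b]
    top, left, bottom, right = pads
    # Crop the padded border off the full result.
    return [r[left:full_w - right] for r in full[top:full_h - bottom]]


x = [[0., 1., 2.], [3., 4., 5.], [6., 7., 8.]]
w = [[1., 1., 1.]] * 3
for row in conv_transpose_2d(x, w, strides=(3, 2), pads=(1, 2, 1, 2)):
    print(row)
```

The seven printed rows agree with one output channel of y above.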

convtranspose_dilations

x = np.array([[[[3., 8., 1.],  # (1, 1, 3, 3)
                [9., 5., 7.],
                [3., 2., 6.]]]]).astype(np.float32)
W = np.array([[[[7., 2.],  # (1, 1, 2, 2)
                [1., 9.]]]]).astype(np.float32)

node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"], dilations=[2, 2])

y = np.array([[[[21., 56., 13., 16., 2.],  # (1, 1, 5, 5)
                [63., 35., 67., 10., 14.],
                [24., 22., 76., 76., 21.],
                [9., 5., 88., 45., 63.],
                [3., 2., 33., 18., 54.]]]]).astype(np.float32)

expect(node, inputs=[x, W], outputs=[y], name='test_convtranspose_dilations')
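A dilated kernel can be thought of as an ordinary kernel with zeros inserted between its taps, which is where the ((kernel_shape[i] - 1) * dilations[i] + 1) term in the shape equation comes from. A minimal sketch of that expansion; the helper name dilate_kernel is illustrative only:

```python
def dilate_kernel(w, dilations):
    """Insert zeros between the taps of a 2D kernel according to dilations."""
    dh, dw = dilations
    kh, kw = len(w), len(w[0])
    out = [[0.0] * ((kw - 1) * dw + 1) for _ in range((kh - 1) * dh + 1)]
    for a in range(kh):
        for b in range(kw):
            out[a * dh][b * dw] = w[a][b]
    return out


print(dilate_kernel([[7., 2.], [1., 9.]], (2, 2)))
# [[7.0, 0.0, 2.0], [0.0, 0.0, 0.0], [1.0, 0.0, 9.0]]
```

The 2 x 2 kernel with dilations=[2, 2] covers a 3 x 3 extent, so the 3 x 3 input above yields a (3 - 1) + 3 = 5 wide output on each axis, which matches the annotated shape of y.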

convtranspose_autopad_same

x = np.array([[[[0., 1., 2.],  # (1, 1, 3, 3)
                [3., 4., 5.],
                [6., 7., 8.]]]]).astype(np.float32)

W = np.array([[[[1., 1., 1.],  # (1, 2, 3, 3)
                [1., 1., 1.],
                [1., 1., 1.]],
               [[1., 1., 1.],
                [1., 1., 1.],
                [1., 1., 1.]]]]).astype(np.float32)

node = onnx.helper.make_node("ConvTranspose", ["X", "W"], ["Y"], auto_pad="SAME_UPPER", strides=[2, 2])

y = np.array([[[[0., 0., 1., 1., 3., 2.],  # (1, 2, 6, 6)
                [0., 0., 1., 1., 3., 2.],
                [3., 3., 8., 5., 12., 7.],
                [3., 3., 7., 4., 9., 5.],
                [9., 9., 20., 11., 24., 13.],
                [6., 6., 13., 7., 15., 8.]],

               [[0., 0., 1., 1., 3., 2.],
                [0., 0., 1., 1., 3., 2.],
                [3., 3., 8., 5., 12., 7.],
                [3., 3., 7., 4., 9., 5.],
                [9., 9., 20., 11., 24., 13.],
                [6., 6., 13., 7., 15., 8.]]]]).astype(np.float32)

expect(node, inputs=[x, W], outputs=[y], name='test_convtranspose_autopad_same')

Differences

Changes from version 1 to version 11:

  • Summary: the condition in the auto-generated pads equations changed from != SAME_UPPER to == SAME_UPPER.

  • auto_pad: SAME_UPPER and SAME_LOWER are now described as padding the input so that output_shape[i] = input_shape[i] * strides[i] for each axis i, rather than so that the output spatial size matches the input. The description of how the padding is split between the two sides was expanded, and the sentence stating that VALID means no padding was dropped.

  • dilations: the description now states that the dilation defaults to 1 along each spatial axis if not present.

  • output_padding: the one-sentence description (the zero-padding added to one side of the output) was replaced by the longer description given above.

  • strides: the description now states that the stride defaults to 1 along each spatial axis if not present.

  • The remaining attributes, Inputs, Outputs and Type Constraints are unchanged.

ConvTranspose - 1

Version

  • name: ConvTranspose

  • domain: main

  • since_version: 1

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 1.

Summary

The convolution transpose operator consumes an input tensor and a filter, and computes the output.

If the pads parameter is provided, the shape of the output is calculated via the following equation:

output_shape[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - pads[start_i] - pads[end_i]

output_shape can also be specified explicitly, in which case the pads values are auto-generated using these equations:

total_padding[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - output_shape[i]
If (auto_pads != SAME_UPPER): pads[start_i] = total_padding[i]/2; pads[end_i] = total_padding[i] - (total_padding[i]/2)
Else: pads[start_i] = total_padding[i] - (total_padding[i]/2); pads[end_i] = (total_padding[i]/2).

Attributes

  • auto_pad: Must be one of NOTSET, SAME_UPPER, SAME_LOWER or VALID. The default, NOTSET, means explicit padding is used. SAME_UPPER and SAME_LOWER pad the input so that the output spatial size matches the input. If the padding is an odd number, the extra padding is added at the end for SAME_UPPER and at the beginning for SAME_LOWER. VALID means no padding. Default value is 'NOTSET'.

  • dilations: dilation value along each spatial axis of the filter.

  • group: number of groups input channels and output channels are divided into. Default value is 1.

  • kernel_shape: The shape of the convolution kernel. If not present, should be inferred from input W.

  • output_padding: The zero-padding added to one side of the output. This is also called adjs/adjustment in some frameworks.

  • output_shape: The shape of the output can be set explicitly, which causes the pads values to be auto-generated. If output_shape is specified, pads values are ignored. See the Summary for the equations used to generate pads.

  • pads: Padding for the beginning and end of each spatial axis; each value must be greater than or equal to 0. Each value is the number of pixels added to the beginning or end of the corresponding axis. The format is [x1_begin, x2_begin, …, x1_end, x2_end, …], where xi_begin is the number of pixels added at the beginning of axis i and xi_end is the number of pixels added at the end of axis i. This attribute cannot be used simultaneously with the auto_pad attribute. If not present, the padding defaults to 0 at the start and end of each spatial axis.

  • strides: Stride along each spatial axis.

Inputs

Between 2 and 3 inputs.

  • X (heterogeneous) - T: Input data tensor from previous layer; has size (N x C x H x W), where N is the batch size, C is the number of channels, and H and W are the height and width. Note that this is for the 2D image. Otherwise the size is (N x C x D1 x D2 … x Dn)

  • W (heterogeneous) - T: The weight tensor that will be used in the convolutions; has size (C x M/group x kH x kW), where C is the number of channels, kH and kW are the height and width of the kernel, and M is the number of feature maps. For more than 2 dimensions, the weight shape will be (C x M/group x k1 x k2 x … x kn), where (k1 x k2 x … x kn) is the dimension of the kernel. The number of channels in the output should be equal to W.shape[1] * group (assuming zero-based indices of the shape array).

  • B (optional, heterogeneous) - T: Optional 1D bias to be added to the convolution, has size of M.

Outputs

  • Y (heterogeneous) - T: Output data tensor that contains the result of the convolution. The output dimensions are functions of the kernel size, stride size, pad lengths and group count. The number of channels in the output should be equal to W.shape[1] * group (assuming zero-based indices of the shape array).

Type Constraints

  • T in ( tensor(double), tensor(float), tensor(float16) ): Constrain input and output types to float tensors.