ConvTranspose - 1 vs 11
The next section compares an older and a newer version of the same operator, with both definitions rendered as markdown text. Lines prefixed with `+` are additions in the newer version, lines prefixed with `-` are deletions; everything else is unchanged.
ConvTranspose1 → ConvTranspose11

```diff
 The convolution transpose operator consumes an input tensor and a filter,
 and computes the output.
 If the pads parameter is provided the shape of the output is calculated via the following equation:
 output_shape[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - pads[start_i] - pads[end_i]
 output_shape can also be explicitly specified in which case pads values are auto generated using these equations:
 total_padding[i] = stride[i] * (input_size[i] - 1) + output_padding[i] + ((kernel_shape[i] - 1) * dilations[i] + 1) - output_shape[i]
-If (auto_pads == SAME_UPPER): pads[start_i] = total_padding[i]/2; pads[end_i] = total_padding[i] - (total_padding[i]/2)
+If (auto_pads != SAME_UPPER): pads[start_i] = total_padding[i]/2; pads[end_i] = total_padding[i] - (total_padding[i]/2)
 Else: pads[start_i] = total_padding[i] - (total_padding[i]/2); pads[end_i] = (total_padding[i]/2).

 **Attributes**

 * **auto_pad**:
   auto_pad must be either NOTSET, SAME_UPPER, SAME_LOWER or VALID.
   Where default value is NOTSET, which means explicit padding is used.
-  SAME_UPPER or SAME_LOWER mean pad the input so that output_shape[i]
+  SAME_UPPER or SAME_LOWER mean pad the input so that the output
+  spatial size match the input. In case of odd number add the extra
+  padding at the end for SAME_UPPER and at the beginning for
+  SAME_LOWER. VALID mean no padding.
-  = input_shape[i] * strides[i] for each axis i. The padding is
-  split between the two sides equally or almost equally (depending on
-  whether it is even or odd). In case the padding is an odd number,
-  the extra padding is added at the end for SAME_UPPER and at the
-  beginning for SAME_LOWER.
 * **dilations**:
-  dilation value along each spatial axis of the filter. If not
+  dilation value along each spatial axis of the filter.
-  present, the dilation defaults to 1 along each spatial axis.
 * **group**:
   number of groups input channels and output channels are divided
   into.
 * **kernel_shape**:
   The shape of the convolution kernel. If not present, should be
   inferred from input W.
 * **output_padding**:
+  The zero-padding added to one side of the output. This is also
+  called adjs/adjustment in some frameworks.
-  Additional elements added to the side with higher coordinate indices
-  in the output. Each padding value in "output_padding" must be less
-  than the corresponding stride/dilation dimension. By default, this
-  attribute is a zero vector. Note that this attribute doesn't
-  directly affect the computed output values. It only controls the
-  selection of the computed values, so changing this attribute only
-  adds or removes output elements. If "output_shape" is explicitly
-  provided, "output_padding" does not contribute additional size to
-  "output_shape" but participates in the computation of the needed
-  padding amount. This is also called adjs or adjustment in some
-  frameworks.
 * **output_shape**:
   The shape of the output can be explicitly set which will cause pads
   values to be auto generated. If output_shape is specified pads
   values are ignored. See doc for details for equations to generate
   pads
 * **pads**:
   Padding for the beginning and ending along each spatial axis, it can
   take any value greater than or equal to 0. The value represent the
   number of pixels added to the beginning and end part of the
   corresponding axis. pads format should be as follow [x1_begin,
   x2_begin...x1_end, x2_end,...], where xi_begin the number of pixels
   added at the beginning of axis i and xi_end, the number of pixels
   added at the end of axis i. This attribute cannot be used
   simultaneously with auto_pad attribute. If not present, the padding
   defaults to 0 along start and end of each spatial axis.
 * **strides**:
-  Stride along each spatial axis. If not present, the stride defaults
-  to 1 along each spatial axis.
+  Stride along each spatial axis.

 **Inputs**

 Between 2 and 3 inputs.

 * **X** (heterogeneous) - **T**:
   Input data tensor from previous layer; has size (N x C x H x W),
   where N is the batch size, C is the number of channels, and H and W
   are the height and width. Note that this is for the 2D image.
   Otherwise the size is (N x C x D1 x D2 ... x Dn)
 * **W** (heterogeneous) - **T**:
   The weight tensor that will be used in the convolutions; has size (C
   x M/group x kH x kW), where C is the number of channels, and kH and
   kW are the height and width of the kernel, and M is the number of
   feature maps. For more than 2 dimensions, the weight shape will be
   (C x M/group x k1 x k2 x ... x kn), where (k1 x k2 x ... x kn) is
   the dimension of the kernel. The number of channels in the output
   should be equal to W.shape[1] * group (assuming zero based indices
   of the shape array)
 * **B** (optional, heterogeneous) - **T**:
   Optional 1D bias to be added to the convolution, has size of M.

 **Outputs**

 * **Y** (heterogeneous) - **T**:
   Output data tensor that contains the result of the convolution. The
   output dimensions are functions of the kernel size, stride size, pad
   lengths and group count. The number of channels in the output should
   be equal to W.shape[1] * group (assuming zero based indices of the
   shape array)

 **Type Constraints**

 * **T** in (
   tensor(double),
   tensor(float),
   tensor(float16)
 ):
   Constrain input and output types to float tensors.
```
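The two shape equations in the definition above can be checked numerically. The sketch below uses hypothetical helper names (not part of any ONNX API), assumes a single spatial axis, and follows the convention stated in the prose that SAME_UPPER places the extra odd padding at the end:

```python
# Sketch of the ConvTranspose shape arithmetic described above.
# Helper names are illustrative only; all lists hold one value per spatial axis.

def conv_transpose_output_shape(input_size, kernel_shape, stride,
                                dilation, output_padding, pads_begin, pads_end):
    # output_shape[i] = stride[i] * (input_size[i] - 1) + output_padding[i]
    #                   + ((kernel_shape[i] - 1) * dilations[i] + 1)
    #                   - pads[start_i] - pads[end_i]
    return [
        stride[i] * (input_size[i] - 1) + output_padding[i]
        + ((kernel_shape[i] - 1) * dilation[i] + 1)
        - pads_begin[i] - pads_end[i]
        for i in range(len(input_size))
    ]

def auto_pads_from_output_shape(input_size, kernel_shape, stride, dilation,
                                output_padding, output_shape,
                                auto_pad="SAME_UPPER"):
    # total_padding[i] = stride[i] * (input_size[i] - 1) + output_padding[i]
    #                    + ((kernel_shape[i] - 1) * dilations[i] + 1)
    #                    - output_shape[i]
    pads_begin, pads_end = [], []
    for i in range(len(input_size)):
        total = (stride[i] * (input_size[i] - 1) + output_padding[i]
                 + ((kernel_shape[i] - 1) * dilation[i] + 1) - output_shape[i])
        if auto_pad == "SAME_UPPER":       # extra odd padding goes at the end
            pads_begin.append(total // 2)
            pads_end.append(total - total // 2)
        else:                              # SAME_LOWER: extra padding at the start
            pads_begin.append(total - total // 2)
            pads_end.append(total // 2)
    return pads_begin, pads_end

def conv_transpose_1d(x, w, stride=1):
    # Naive single-channel 1-D transposed convolution: each input element
    # scatters a scaled copy of the kernel into the output (no pads, dilation 1).
    out = [0.0] * (stride * (len(x) - 1) + len(w))
    for i, xi in enumerate(x):
        for j, wj in enumerate(w):
            out[i * stride + j] += xi * wj
    return out

# Input length 3, kernel 3, stride 2: 2*(3-1) + ((3-1)*1 + 1) - 0 - 0 = 7.
shape = conv_transpose_output_shape([3], [3], [2], [1], [0], [0], [0])
assert shape == [7]
assert len(conv_transpose_1d([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], stride=2)) == 7

# Request an explicit output_shape of 6: total_padding = 7 - 6 = 1, and
# SAME_UPPER assigns the odd extra element to the end pad.
pb, pe = auto_pads_from_output_shape([3], [3], [2], [1], [0], [6])
assert (pb, pe) == ([0], [1])
assert conv_transpose_output_shape([3], [3], [2], [1], [0], pb, pe) == [6]
```

Note that the two versions disagree only on which branch of the pads split applies to SAME_UPPER; the split itself (floor on one side, remainder on the other) is the same in both.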