Resize - 11 vs 18#

Next section compares an older to a newer version of the same operator after both definition are converted into markdown text. Green means an addition to the newer version, red means a deletion. Anything else is unchanged.

Files changed (1) hide show

Resize11 → Resize18 +41 -80

Resize11 → Resize18 RENAMED Viewed

@@ -1 +1 @@
  Resize the input tensor. In general, it calculates every value in the output tensor as a weighted average of neighborhood (a.k.a. sampling locations) in the input tensor.
- Each dimension value of the output tensor is: <br/>
+ Each dimension value of the output tensor is:
-   output_dimension = floor(input_dimension * (roi_end - roi_start) * scale) <br/>
+   output_dimension = floor(input_dimension * (roi_end - roi_start) * scale) if input "sizes" is not specified.
- if input "sizes" is not specified.
  **Attributes**
- * **antialias**:
-   If set to 1, "linear" and "cubic" interpolation modes will use an
-   antialiasing filter when downscaling. Antialiasing is achieved by
-   stretching the resampling filter by a factor max(1, 1 / scale),
-   which means that when downsampling, more input pixels contribute to
-   an output pixel.
- * **axes**:
-   If provided, it specifies a subset of axes that 'roi', 'scales' and
-   'sizes' refer to. If not provided, all axes are assumed [0, 1, ...,
-   r-1], where r = rank(data). Non-specified dimensions are interpreted
-   as non-resizable. Negative value means counting dimensions from the
-   back. Accepted range is [-r, r-1], where r = rank(data). Behavior is
-   undefined if an axis is repeated.
  * **coordinate_transformation_mode**:
     This attribute describes how to transform the coordinate in the
    resized tensor to the coordinate in the original tensor. <br/>  The
    coordinate of each dimension is transformed individually. Let's
    describe a case using axis x as an example. Denote x_resized as the
    coordinate of axis x in the resized tensor, x_original as the
-   coordinate of axis x in the original tensor, length_original as
+   coordinate of axis x in the original tensor, length_original as the
-   the length of the original tensor in axis x, length_resized as the
+   length of the original tensor in axis x, length_resized as the
    length of the resized tensor in axis x, roi_x = (start_x, end_x) of
-   the axis x in input "roi", scale = length_resized /
+   the axis x in input "roi", scale = length_resized / length_original,
-   length_original, <br/>  if coordinate_transformation_mode is
-   "half_pixel", <br/> x_original = (x_resized + 0.5) / scale - 0.5
-   <br/>  if coordinate_transformation_mode is "pytorch_half_pixel",
+   <br/>  if coordinate_transformation_mode is "half_pixel", <br/>
+   x_original = (x_resized + 0.5) / scale - 0.5, <br/>  if
+   coordinate_transformation_mode is "pytorch_half_pixel", <br/>
-   <br/> x_original = length_resized > 1 ? (x_resized + 0.5) / scale -
+   x_original = length_resized > 1 ? (x_resized + 0.5) / scale - 0.5 :
-   0.5 : 0 <br/>  if coordinate_transformation_mode is
+   0, <br/>  if coordinate_transformation_mode is "align_corners",
-   "align_corners", <br/> x_original = x_resized * (length_original
+   <br/> x_original = x_resized * (length_original - 1) /
-   - 1) / (length_resized - 1) <br/>  if
-   coordinate_transformation_mode is "asymmetric", <br/> x_original
-   = x_resized / scale <br/>  if coordinate_transformation_mode is
+   (length_resized - 1), <br/>  if coordinate_transformation_mode is
-   "tf_crop_and_resize", <br/> x_original = length_resized > 1 ?
+   "asymmetric", <br/> x_original = x_resized / scale, <br/>  if
+   coordinate_transformation_mode is "tf_half_pixel_for_nn", <br/>
+   x_original = (x_resized + 0.5) / scale, <br/>  if
+   coordinate_transformation_mode is "tf_crop_and_resize", <br/>
-   start_x * (length_original - 1) + x_resized * (end_x - start_x) *
+   x_original = length_resized > 1 ? start_x * (length_original - 1) +
-   (length_original - 1) / (length_resized - 1) : 0.5 * (start_x +
-   end_x) * (length_original - 1) .
+   x_resized * (end_x - start_x) * (length_original - 1) /
+   (length_resized - 1) : 0.5 * (start_x + end_x) * (length_original -
+   1).
  * **cubic_coeff_a**:
    The coefficient 'a' used in cubic interpolation. Two common choice
    are -0.5 (in some cases of TensorFlow) and -0.75 (in PyTorch). Check
    out Equation (4) in https://ieeexplore.ieee.org/document/1163711 for
-   the details. This attribute is valid only if mode is "cubic".
+   the details. This attribute is valid only if "mode" is "cubic".
  * **exclude_outside**:
    If set to 1, the weight of sampling locations outside the tensor
    will be set to 0 and the weight will be renormalized so that their
    sum is 1.0. The default value is 0.
  * **extrapolation_value**:
    When coordinate_transformation_mode is "tf_crop_and_resize" and
    x_original is outside the range [0, length_original - 1], this value
    is used as the corresponding output value. Default is 0.0f.
- * **keep_aspect_ratio_policy**:
-    This attribute describes how to interpret the sizes input with
-   regard to keeping the original aspect ratio of the input, and it is
-   not applicable when the scales input is used. <br/>  Given a set
-   of sizes, associated with a subset of axes (explicitly provided
-   or default), and assuming d = axes[i], with i being the index of
-   the provided sizes. <br/>  If keep_aspect_ratio_policy is
-   "stretch", the original aspect ratio is disregarded, and the input
-   is resized to the specified size: <br/> out_size[d] = sizes[i]
-   <br/>  If keep_aspect_ratio_policy is "not_larger", the sizes
-   are adjusted so that no extent of the output is larger than the
-   specified size, while keeping the original aspect ratio: <br/>
-   scale = Min(sizes[i] / in_size[d]) <br/> out_size[d] =
-   round_int(scale * in_size[i]) <br/>  If keep_aspect_ratio_policy
-   is "not_smaller", the sizes are adjusted so that no extent of the
-   output is smaller than the specified size, while keeping the
-   original aspect ratio: <br/> scale = Max(sizes[i] / in_size[d])
-   <br/> out_size[d] = round_int(scale * in_size[i]) <br/>  For non-
-   resizable axes (those not specified in axes), the output size will
-   be equal to the input size.  Note: round_int stands for computing
-   the nearest integer value, rounding halfway cases up.
  * **mode**:
-   Three interpolation modes: "nearest" (default), "linear" and
+   Three interpolation modes: nearest (default), linear and cubic. The
-   "cubic". The "linear" mode includes linear interpolation for 1D
+   "linear" mode includes linear interpolation for 1D tensor and
-   tensor and N-linear interpolation for N-D tensor (for example,
+   N-linear interpolation for N-D tensor (for example, bilinear
-   bilinear interpolation for 2D tensor). The "cubic" mode includes
+   interpolation for 2D tensor). The "cubic" mode includes cubic
-   cubic interpolation for 1D tensor and N-cubic interpolation for N-D
+   interpolation for 1D tensor and N-cubic interpolation for N-D tensor
-   tensor (for example, bicubic interpolation for 2D tensor).
+   (for example, bicubic interpolation for 2D tensor).
  * **nearest_mode**:
-   Four modes: "round_prefer_floor" (default, as known as round half
+   Four modes: round_prefer_floor (default, as known as round half
-   down), "round_prefer_ceil" (as known as round half up), "floor",
+   down), round_prefer_ceil (as known as round half up), floor, ceil.
-   "ceil". Only used by nearest interpolation. It indicates how to get
+   Only used by nearest interpolation. It indicates how to get
    "nearest" pixel in input tensor from x_original, so this attribute
    is valid only if "mode" is "nearest".
  **Inputs**
- Between 1 and 4 inputs.
+ Between 3 and 4 inputs.
  * **X** (heterogeneous) - **T1**:
    N-D tensor
- * **roi** (optional, heterogeneous) - **T2**:
+ * **roi** (heterogeneous) - **T2**:
    1-D tensor given as [start1, ..., startN, end1, ..., endN], where N
+   is the rank of X. The RoIs' coordinates are normalized in the
+   coordinate system of the input image. It only takes effect when
+   coordinate_transformation_mode is "tf_crop_and_resize"
-   is the rank of X or the length of axes, if provided. The RoIs'
-   coordinates are normalized in the coordinate system of the input
-   image. It only takes effect when coordinate_transformation_mode is
-   "tf_crop_and_resize"
- * **scales** (optional, heterogeneous) - **tensor(float)**:
+ * **scales** (heterogeneous) - **tensor(float)**:
    The scale array along each dimension. It takes value greater than 0.
    If it's less than 1, it's sampling down, otherwise, it's upsampling.
    The number of elements of 'scales' should be the same as the rank of
+   input 'X'. If 'size' is needed, the user must set 'scales' to an
+   empty tensor.
-   input 'X' or the length of 'axes', if provided. One of 'scales' and
-   'sizes' MUST be specified and it is an error if both are specified.
-   If 'sizes' is needed, the user can use an empty string as the name
-   of 'scales' in this operator's input list.
  * **sizes** (optional, heterogeneous) - **tensor(int64)**:
+   The size of the output tensor. The number of elements of 'sizes'
-   Target size of the output tensor. Its interpretation depends on the
-   'keep_aspect_ratio_policy' value.The number of elements of 'sizes'
-   should be the same as the rank of input 'X', or the length of
+   should be the same as the rank of input 'X'. May only be set if
+   'scales' is set to an empty tensor.
-   'axes', if provided. Only one of 'scales' and 'sizes' can be
-   specified.
  **Outputs**
  * **Y** (heterogeneous) - **T1**:
    N-D tensor after resizing
  **Type Constraints**
  * **T1** in (
-   tensor(bfloat16),
    tensor(bool),
    tensor(complex128),
    tensor(complex64),
    tensor(double),
    tensor(float),
    tensor(float16),
    tensor(int16),
    tensor(int32),
    tensor(int64),
    tensor(int8),
    tensor(string),
    tensor(uint16),
    tensor(uint32),
    tensor(uint64),
    tensor(uint8)
    ):
    Constrain input 'X' and output 'Y' to all tensor types.
  * **T2** in (
    tensor(double),
    tensor(float),
    tensor(float16)
    ):
    Constrain roi type to float or double.