Scan#

Scan - 16#

Version

  • name: Scan (GitHub)

  • domain: main

  • since_version: 16

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 16.

Summary

Scan can be used to iterate over one or more scan_input tensors, constructing zero or more scan_output tensors. It combines ideas from general recurrences, functional programming constructs such as scan, fold, map, and zip and is intended to enable generalizations of RNN-like constructs for sequence-to-sequence processing. Other tensors (referred to as state_variables here) can be used to carry a state when iterating from one element to another (similar to hidden-state in RNNs, also referred to as loop-carried dependences in the context of loops). Many common usages involve a single scan_input tensor (where functionality similar to scan, fold and map can be obtained). When more than one scan_input is used, a behavior similar to zip is obtained.

The attribute body must be a graph, specifying the computation to be performed in every iteration. It takes as input the current values of the state_variables and the current iterated element of the scan_inputs. It must return the (updated) values of the state_variables and zero or more scan_output_element tensors. The values of the scan_output_element tensors are concatenated over all the iterations to produce the scan_output values of the scan construct (similar to the concatenated intermediate hidden-state values of RNN-like constructs). All the output tensors (state_variables as well as scan_output_element tensors) are required to have the same shape in each iteration of the loop (a restriction imposed to enable efficient memory allocation).

Note that the iterated element passed to the body subgraph does not have a sequence axis. It will have a rank one less than the rank of the corresponding scan_input.

The scan operation returns the final values of the state_variables as well as the scan_outputs.

The optional attribute scan_input_directions specifies the direction (forward or backward) for each scan input. If this attribute is omitted, all sequences are scanned in the forward direction. A bidirectional scan may be performed by specifying the same tensor input twice in the scan_inputs, once with a forward direction, and once with a backward direction.

The scan_output of the operation is produced by concatenating the scan_output_element values produced by the body in each iteration. The optional attribute scan_output_directions specifies the direction in which scan_output is constructed (by appending or prepending the scan_output_element to scan_output in each iteration) for each scan_output. If this attribute is omitted, the scan_output_element is appended to the scan_output in each iteration.

The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input. If omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is the batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1. Note that scanning a non-zero axis may be less efficient than scanning axis zero.

The optional attribute scan_output_axes specifies the axis along which the scan_outputs are accumulated for each scan_output. For example, if axis 1 is the time axis (to be scanned) for both inputs and outputs, specify a scan_input axis and scan_output axis value of 1.

Note that because of the ONNX restriction that only the last parameter of an operator can be variadic, the initial-states and scan-inputs are listed together as one input parameter. Similarly, the final-states and scan-outputs are listed together as one output parameter. The attribute num_scan_inputs indicates the number M of scan-inputs.

The behavior of

Scan <

num_scan_inputs = m, body = loop-body, scan_input_axes = [axis_1, …, axis_m]

> (init_1, …, init_n, scan_1, …, scan_m)

is equivalent to the following pseudo-code:

// scan_i.shape[axis_i] denotes the (max) sequence-length of scan_i // scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j. sequence_length = scan_1.shape[axis_1];

// initialize state-variables st_1 = init_1; … st_n = init_n; // initialize scan-output variables: [] denotes an empty tensor scan_out_1 = []; …; scan_out_k = []; // identify number of iterations:

// execute loop for (int t = 0; t < sequence_length; ++t) {

// generate the scan-input elements: the notation T<axis=k>[t] indicates the sub-tensor // of rank one less than T obtained by indexing T at position t along axis k. si_1 = scan_1<axis=axis_1>[t]; … ; si_m = scan_m<axis=axis_m>[t]; // execute loop-body st_1, …, st_n, so_1, …, so_k = loop-body(st_1, …, st_n, si_1, …, si_m) // accumulate the scan-output elements scan_out_1 = Concat<axis=0>(scan_out_1, so_1); … ; scan_out_k = Concat<axis=0>(scan_out_k, so_k);

}

return st_1, …, st_n, scan_out_1, …, scan_out_k;

Sample usage: Encoding RNN using a Scan

The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi, recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 can be encoded as a ScanLoop. Note that the loop-body is a nested graph, and it directly computes %Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these values are computed in the outer graph, they need to be passed in as extra state_variables.

graph rnn-encoding {

%H_0 = … %X = … %Y_h, %Y = Scan[body = <graph rnn-cell-1>, num_scan_inputs=1](%H_0, %X) return %Y, %Y_h

}

graph rnn-cell-1 (

%H_tminus1[FLOAT, tensor] %X_t[FLOAT, tensor]

) {

%Wi = … %Ri = … %Wbi = … %Rbi = … %t1 = X_t * (Wi^T) %t2 = H_tminus1*(Ri^T) %t3 = Add(%t1, %t2) %t4 = Add(%t3, %Wbi) %t5 = Add(%t4, %Rbi) %Ht = Tanh(%t5) %Accumulate = Identity(%Ht) return %Ht, %Accumulate

}

Attributes

  • body (required): The graph run each iteration. It has N+M inputs: (loop state variables…, scan_input_elts…). It has N+K outputs: (loop state variables…, scan_output_elts…). Each scan_output is created by concatenating the value of the specified scan_output_elt value at the end of each iteration of the loop. It is an error if the dimensions of these values change across loop iterations.

  • num_scan_inputs (required): An attribute specifying the number of scan_inputs M.

  • scan_input_axes: An optional list of M flags. The i-th element of the list specifies the axis to be scanned (the sequence axis) for the i-th scan_input. If omitted, 0 will be used as the scan axis for every scan_input. Negative value for an axis means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).

  • scan_input_directions: An optional list of M flags. The i-th element of the list specifies the direction to be scanned for the i-th scan_input tensor: 0 indicates forward direction and 1 indicates reverse direction. If omitted, all scan_input tensors will be scanned in the forward direction.

  • scan_output_axes: An optional list of K flags. The i-th element of the list specifies the axis for the i-th scan_output. The scan outputs are accumulated along the specified axis. If omitted, 0 will be used as the scan axis for every scan_output. Negative value for an axis means counting dimensions from the back. Accepted range is [-r, r-1].

  • scan_output_directions: An optional list of K flags, one for each scan_output. The i-th element of the list specifies whether the i-th scan_output should be constructed by appending or prepending a new value in each iteration: 0 indicates appending and 1 indicates prepending. If omitted, all scan_output tensors will be produced by appending a value in each iteration.

Inputs

Between 1 and 2147483647 inputs.

  • initial_state_and_scan_inputs (variadic) - V: Initial values of the loop’s N state variables followed by M scan_inputs

Outputs

Between 1 and 2147483647 outputs.

  • final_state_and_scan_outputs (variadic) - V: Final values of the loop’s N state variables followed by K scan_outputs

Type Constraints

  • I in ( tensor(int64) ): Int64 tensor

  • V in ( tensor(bfloat16), tensor(bool), tensor(complex128), tensor(complex64), tensor(double), tensor(float), tensor(float16), tensor(int16), tensor(int32), tensor(int64), tensor(int8), tensor(string), tensor(uint16), tensor(uint32), tensor(uint64), tensor(uint8) ): All Tensor types

Examples

scan_8

# Given an input sequence [x1, ..., xN], sum up its elements using a scan
# returning the final state (x1+x2+...+xN) as well the scan_output
# [x1, x1+x2, ..., x1+x2+...+xN]
#
# create graph to represent scan body
sum_in = onnx.helper.make_tensor_value_info('sum_in', onnx.TensorProto.FLOAT, [2])
next = onnx.helper.make_tensor_value_info('next', onnx.TensorProto.FLOAT, [2])
sum_out = onnx.helper.make_tensor_value_info('sum_out', onnx.TensorProto.FLOAT, [2])
scan_out = onnx.helper.make_tensor_value_info('scan_out', onnx.TensorProto.FLOAT, [2])
add_node = onnx.helper.make_node(
    'Add',
    inputs=['sum_in', 'next'],
    outputs=['sum_out']
)
id_node = onnx.helper.make_node(
    'Identity',
    inputs=['sum_out'],
    outputs=['scan_out']
)
scan_body = onnx.helper.make_graph(
    [add_node, id_node],
    'scan_body',
    [sum_in, next],
    [sum_out, scan_out]
)
# create scan op node
no_sequence_lens = ''   # optional input, not supplied
node = onnx.helper.make_node(
    'Scan',
    inputs=[no_sequence_lens, 'initial', 'x'],
    outputs=['y', 'z'],
    num_scan_inputs=1,
    body=scan_body
)
# create inputs for batch-size 1, sequence-length 3, inner dimension 2
initial = np.array([0, 0]).astype(np.float32).reshape((1, 2))
x = np.array([1, 2, 3, 4, 5, 6]).astype(np.float32).reshape((1, 3, 2))
# final state computed = [1 + 3 + 5, 2 + 4 + 6]
y = np.array([9, 12]).astype(np.float32).reshape((1, 2))
# scan-output computed
z = np.array([1, 2, 4, 6, 9, 12]).astype(np.float32).reshape((1, 3, 2))

expect(node, inputs=[initial, x], outputs=[y, z],
       name='test_scan_sum', opset_imports=[onnx.helper.make_opsetid("", 8)])

scan_9

# Given an input sequence [x1, ..., xN], sum up its elements using a scan
# returning the final state (x1+x2+...+xN) as well the scan_output
# [x1, x1+x2, ..., x1+x2+...+xN]
#
# create graph to represent scan body
sum_in = onnx.helper.make_tensor_value_info('sum_in', onnx.TensorProto.FLOAT, [2])
next = onnx.helper.make_tensor_value_info('next', onnx.TensorProto.FLOAT, [2])
sum_out = onnx.helper.make_tensor_value_info('sum_out', onnx.TensorProto.FLOAT, [2])
scan_out = onnx.helper.make_tensor_value_info('scan_out', onnx.TensorProto.FLOAT, [2])
add_node = onnx.helper.make_node(
    'Add',
    inputs=['sum_in', 'next'],
    outputs=['sum_out']
)
id_node = onnx.helper.make_node(
    'Identity',
    inputs=['sum_out'],
    outputs=['scan_out']
)
scan_body = onnx.helper.make_graph(
    [add_node, id_node],
    'scan_body',
    [sum_in, next],
    [sum_out, scan_out]
)
# create scan op node
node = onnx.helper.make_node(
    'Scan',
    inputs=['initial', 'x'],
    outputs=['y', 'z'],
    num_scan_inputs=1,
    body=scan_body
)
# create inputs for sequence-length 3, inner dimension 2
initial = np.array([0, 0]).astype(np.float32).reshape((2,))
x = np.array([1, 2, 3, 4, 5, 6]).astype(np.float32).reshape((3, 2))
# final state computed = [1 + 3 + 5, 2 + 4 + 6]
y = np.array([9, 12]).astype(np.float32).reshape((2,))
# scan-output computed
z = np.array([1, 2, 4, 6, 9, 12]).astype(np.float32).reshape((3, 2))

expect(node, inputs=[initial, x], outputs=[y, z],
       name='test_scan9_sum', opset_imports=[onnx.helper.make_opsetid("", 9)])

Differences

00Scan can be used to iterate over one or more scan_input tensors,Scan can be used to iterate over one or more scan_input tensors,
11constructing zero or more scan_output tensors. It combines ideas from general recurrences,constructing zero or more scan_output tensors. It combines ideas from general recurrences,
22functional programming constructs such as scan, fold, map, and zip and is intended to enablefunctional programming constructs such as scan, fold, map, and zip and is intended to enable
33generalizations of RNN-like constructs for sequence-to-sequence processing.generalizations of RNN-like constructs for sequence-to-sequence processing.
44Other tensors (referred to as state_variables here) can be used to carry a stateOther tensors (referred to as state_variables here) can be used to carry a state
55when iterating from one element to another (similar to hidden-state in RNNs, also referredwhen iterating from one element to another (similar to hidden-state in RNNs, also referred
66to as loop-carried dependences in the context of loops).to as loop-carried dependences in the context of loops).
77Many common usages involve a single scan_input tensor (where functionalityMany common usages involve a single scan_input tensor (where functionality
88similar to scan, fold and map can be obtained). When more than one scan_input is used,similar to scan, fold and map can be obtained). When more than one scan_input is used,
99a behavior similar to zip is obtained.a behavior similar to zip is obtained.
1010
1111The attribute body must be a graph, specifying the computation to be performed inThe attribute body must be a graph, specifying the computation to be performed in
1212every iteration. It takes as input the current values of the state_variables andevery iteration. It takes as input the current values of the state_variables and
1313the current iterated element of the scan_inputs. It must return the (updated) valuesthe current iterated element of the scan_inputs. It must return the (updated) values
1414of the state_variables and zero or more scan_output_element tensors. The values of theof the state_variables and zero or more scan_output_element tensors. The values of the
1515scan_output_element tensors are concatenated over all the iterations to produce thescan_output_element tensors are concatenated over all the iterations to produce the
1616scan_output values of the scan construct (similar to the concatenated intermediatescan_output values of the scan construct (similar to the concatenated intermediate
1717hidden-state values of RNN-like constructs). All the output tensors (state_variables ashidden-state values of RNN-like constructs). All the output tensors (state_variables as
1818well as scan_output_element tensors) are required to have the same shape in each iterationwell as scan_output_element tensors) are required to have the same shape in each iteration
1919of the loop (a restriction imposed to enable efficient memory allocation).of the loop (a restriction imposed to enable efficient memory allocation).
2020
2121Note that the iterated element passed to the body subgraph does not have a sequenceNote that the iterated element passed to the body subgraph does not have a sequence
2222axis. It will have a rank one less than the rank of the corresponding scan_input.axis. It will have a rank one less than the rank of the corresponding scan_input.
2323
2424The scan operation returns the final values of the state_variables as well as theThe scan operation returns the final values of the state_variables as well as the
2525scan_outputs.scan_outputs.
2626
2727The optional attribute scan_input_directions specifies the direction (forward or backward)The optional attribute scan_input_directions specifies the direction (forward or backward)
2828for each scan input. If this attribute is omitted, all sequences are scanned in the forwardfor each scan input. If this attribute is omitted, all sequences are scanned in the forward
2929direction. A bidirectional scan may be performed by specifying the same tensor input twicedirection. A bidirectional scan may be performed by specifying the same tensor input twice
3030in the scan_inputs, once with a forward direction, and once with a backward direction.in the scan_inputs, once with a forward direction, and once with a backward direction.
3131
3232The scan_output of the operation is produced by concatenating the scan_output_elementThe scan_output of the operation is produced by concatenating the scan_output_element
3333values produced by the body in each iteration. The optional attribute scan_output_directionsvalues produced by the body in each iteration. The optional attribute scan_output_directions
3434specifies the direction in which scan_output is constructed (by appending or prepending thespecifies the direction in which scan_output is constructed (by appending or prepending the
3535scan_output_element to scan_output in each iteration) for each scan_output. If this attributescan_output_element to scan_output in each iteration) for each scan_output. If this attribute
3636is omitted, the scan_output_element is appended to the scan_output in each iteration.is omitted, the scan_output_element is appended to the scan_output in each iteration.
3737
3838The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input.The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input.
3939If omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is theIf omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is the
4040batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1.batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1.
4141Note that scanning a non-zero axis may be less efficient than scanning axis zero.Note that scanning a non-zero axis may be less efficient than scanning axis zero.
4242
4343The optional attribute scan_output_axes specifies the axis along which the scan_outputsThe optional attribute scan_output_axes specifies the axis along which the scan_outputs
4444are accumulated for each scan_output. For example, if axis 1 is the time axis (to beare accumulated for each scan_output. For example, if axis 1 is the time axis (to be
4545scanned) for both inputs and outputs, specify a scan_input axis and scan_output axisscanned) for both inputs and outputs, specify a scan_input axis and scan_output axis
4646value of 1.value of 1.
4747
4848Note that because of the ONNX restriction that only the last parameter of an operator canNote that because of the ONNX restriction that only the last parameter of an operator can
4949be variadic, the initial-states and scan-inputs are listed together as one input parameter.be variadic, the initial-states and scan-inputs are listed together as one input parameter.
5050Similarly, the final-states and scan-outputs are listed together as one output parameter.Similarly, the final-states and scan-outputs are listed together as one output parameter.
5151The attribute num_scan_inputs indicates the number M of scan-inputs.The attribute num_scan_inputs indicates the number M of scan-inputs.
5252
5353The behavior ofThe behavior of
5454
5555 Scan < Scan <
5656 num_scan_inputs = m, num_scan_inputs = m,
5757 body = loop-body, body = loop-body,
5858 scan_input_axes = [axis_1, ..., axis_m] scan_input_axes = [axis_1, ..., axis_m]
5959 > (init_1, ..., init_n, scan_1, ..., scan_m) > (init_1, ..., init_n, scan_1, ..., scan_m)
6060
6161is equivalent to the following pseudo-code:is equivalent to the following pseudo-code:
6262
6363 // scan_i.shape[axis_i] denotes the (max) sequence-length of scan_i // scan_i.shape[axis_i] denotes the (max) sequence-length of scan_i
6464 // scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j. // scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j.
6565 sequence_length = scan_1.shape[axis_1]; sequence_length = scan_1.shape[axis_1];
6666
6767 // initialize state-variables // initialize state-variables
6868 st_1 = init_1; ... st_n = init_n; st_1 = init_1; ... st_n = init_n;
6969 // initialize scan-output variables: [] denotes an empty tensor // initialize scan-output variables: [] denotes an empty tensor
7070 scan_out_1 = []; ...; scan_out_k = []; scan_out_1 = []; ...; scan_out_k = [];
7171 // identify number of iterations: // identify number of iterations:
7272
7373 // execute loop // execute loop
7474 for (int t = 0; t < sequence_length; ++t) { for (int t = 0; t < sequence_length; ++t) {
7575 // generate the scan-input elements: the notation T[t] indicates the sub-tensor // generate the scan-input elements: the notation T[t] indicates the sub-tensor
7676 // of rank one less than T obtained by indexing T at position t along axis k. // of rank one less than T obtained by indexing T at position t along axis k.
7777 si_1 = scan_1[t]; si_1 = scan_1[t];
7878 ... ; ... ;
7979 si_m = scan_m[t]; si_m = scan_m[t];
8080 // execute loop-body // execute loop-body
8181 st_1, ..., st_n, so_1, ..., so_k = loop-body(st_1, ..., st_n, si_1, ..., si_m) st_1, ..., st_n, so_1, ..., so_k = loop-body(st_1, ..., st_n, si_1, ..., si_m)
8282 // accumulate the scan-output elements // accumulate the scan-output elements
8383 scan_out_1 = Concat(scan_out_1, so_1); ... ; scan_out_k = Concat(scan_out_k, so_k); scan_out_1 = Concat(scan_out_1, so_1); ... ; scan_out_k = Concat(scan_out_k, so_k);
8484 } }
8585
8686 return st_1, ..., st_n, scan_out_1, ..., scan_out_k; return st_1, ..., st_n, scan_out_1, ..., scan_out_k;
8787
8888*Sample usage: Encoding RNN using a Scan**Sample usage: Encoding RNN using a Scan*
8989
9090The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi,The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi,
9191recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 canrecurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 can
9292be encoded as a ScanLoop. Note that the loop-body is a nested graph, and it directly computesbe encoded as a ScanLoop. Note that the loop-body is a nested graph, and it directly computes
9393%Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these%Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these
9494values are computed in the outer graph, they need to be passed in as extra state_variables.values are computed in the outer graph, they need to be passed in as extra state_variables.
9595
9696 graph rnn-encoding { graph rnn-encoding {
9797 %H_0 = ... %H_0 = ...
9898 %X = ... %X = ...
9999 %Y_h, %Y = Scan[body = , num_scan_inputs=1](%H_0, %X) %Y_h, %Y = Scan[body = , num_scan_inputs=1](%H_0, %X)
100100 return %Y, %Y_h return %Y, %Y_h
101101 } }
102102
103103 graph rnn-cell-1 ( graph rnn-cell-1 (
104104 %H_tminus1[FLOAT, tensor] %H_tminus1[FLOAT, tensor]
105105 %X_t[FLOAT, tensor] %X_t[FLOAT, tensor]
106106 ) { ) {
107107 %Wi = ... %Wi = ...
108108 %Ri = ... %Ri = ...
109109 %Wbi = ... %Wbi = ...
110110 %Rbi = ... %Rbi = ...
111111 %t1 = X_t * (Wi^T) %t1 = X_t * (Wi^T)
112112 %t2 = H_tminus1*(Ri^T) %t2 = H_tminus1*(Ri^T)
113113 %t3 = Add(%t1, %t2) %t3 = Add(%t1, %t2)
114114 %t4 = Add(%t3, %Wbi) %t4 = Add(%t3, %Wbi)
115115 %t5 = Add(%t4, %Rbi) %t5 = Add(%t4, %Rbi)
116116 %Ht = Tanh(%t5) %Ht = Tanh(%t5)
117117 %Accumulate = Identity(%Ht) %Accumulate = Identity(%Ht)
118118 return %Ht, %Accumulate return %Ht, %Accumulate
119119 } }
120120
121121**Attributes****Attributes**
122122
123123* **body** (required):* **body** (required):
124124 The graph run each iteration. It has N+M inputs: (loop state The graph run each iteration. It has N+M inputs: (loop state
125125 variables..., scan_input_elts...). It has N+K outputs: (loop state variables..., scan_input_elts...). It has N+K outputs: (loop state
126126 variables..., scan_output_elts...). Each scan_output is created by variables..., scan_output_elts...). Each scan_output is created by
127127 concatenating the value of the specified scan_output_elt value at concatenating the value of the specified scan_output_elt value at
128128 the end of each iteration of the loop. It is an error if the the end of each iteration of the loop. It is an error if the
129129 dimensions of these values change across loop iterations. dimensions of these values change across loop iterations.
130130* **num_scan_inputs** (required):* **num_scan_inputs** (required):
131131 An attribute specifying the number of scan_inputs M. An attribute specifying the number of scan_inputs M.
132132* **scan_input_axes**:* **scan_input_axes**:
133133 An optional list of M flags. The i-th element of the list specifies An optional list of M flags. The i-th element of the list specifies
134134 the axis to be scanned (the sequence axis) for the i-th scan_input. the axis to be scanned (the sequence axis) for the i-th scan_input.
135135 If omitted, 0 will be used as the scan axis for every scan_input. If omitted, 0 will be used as the scan axis for every scan_input.
136136 Negative value for an axis means counting dimensions from the back. Negative value for an axis means counting dimensions from the back.
137137 Accepted range is [-r, r-1] where r = rank(input). Accepted range is [-r, r-1] where r = rank(input).
138138* **scan_input_directions**:* **scan_input_directions**:
139139 An optional list of M flags. The i-th element of the list specifies An optional list of M flags. The i-th element of the list specifies
140140 the direction to be scanned for the i-th scan_input tensor: 0 the direction to be scanned for the i-th scan_input tensor: 0
141141 indicates forward direction and 1 indicates reverse direction. If indicates forward direction and 1 indicates reverse direction. If
142142 omitted, all scan_input tensors will be scanned in the forward omitted, all scan_input tensors will be scanned in the forward
143143 direction. direction.
144144* **scan_output_axes**:* **scan_output_axes**:
145145 An optional list of K flags. The i-th element of the list specifies An optional list of K flags. The i-th element of the list specifies
146146 the axis for the i-th scan_output. The scan outputs are accumulated the axis for the i-th scan_output. The scan outputs are accumulated
147147 along the specified axis. If omitted, 0 will be used as the scan along the specified axis. If omitted, 0 will be used as the scan
148148 axis for every scan_output. Negative value for an axis means axis for every scan_output. Negative value for an axis means
149149 counting dimensions from the back. Accepted range is [-r, r-1]. counting dimensions from the back. Accepted range is [-r, r-1].
150150* **scan_output_directions**:* **scan_output_directions**:
151151 An optional list of K flags, one for each scan_output. The i-th An optional list of K flags, one for each scan_output. The i-th
152152 element of the list specifies whether the i-th scan_output should be element of the list specifies whether the i-th scan_output should be
153153 constructed by appending or prepending a new value in each constructed by appending or prepending a new value in each
154154 iteration: 0 indicates appending and 1 indicates prepending. If iteration: 0 indicates appending and 1 indicates prepending. If
155155 omitted, all scan_output tensors will be produced by appending a omitted, all scan_output tensors will be produced by appending a
156156 value in each iteration. value in each iteration.
157157
158158**Inputs****Inputs**
159159
160160Between 1 and 2147483647 inputs.Between 1 and 2147483647 inputs.
161161
162162* **initial_state_and_scan_inputs** (variadic) - **V**:* **initial_state_and_scan_inputs** (variadic) - **V**:
163163 Initial values of the loop's N state variables followed by M Initial values of the loop's N state variables followed by M
164164 scan_inputs scan_inputs
165165
166166**Outputs****Outputs**
167167
168168Between 1 and 2147483647 outputs.Between 1 and 2147483647 outputs.
169169
170170* **final_state_and_scan_outputs** (variadic) - **V**:* **final_state_and_scan_outputs** (variadic) - **V**:
171171 Final values of the loop's N state variables followed by K Final values of the loop's N state variables followed by K
172172 scan_outputs scan_outputs
173173
174174**Type Constraints****Type Constraints**
175175
176176* **I** in (* **I** in (
177177 tensor(int64) tensor(int64)
178178 ): ):
179179 Int64 tensor Int64 tensor
180180* **V** in (* **V** in (
181 tensor(bfloat16),
181182 tensor(bool), tensor(bool),
182183 tensor(complex128), tensor(complex128),
183184 tensor(complex64), tensor(complex64),
184185 tensor(double), tensor(double),
185186 tensor(float), tensor(float),
186187 tensor(float16), tensor(float16),
187188 tensor(int16), tensor(int16),
188189 tensor(int32), tensor(int32),
189190 tensor(int64), tensor(int64),
190191 tensor(int8), tensor(int8),
191192 tensor(string), tensor(string),
192193 tensor(uint16), tensor(uint16),
193194 tensor(uint32), tensor(uint32),
194195 tensor(uint64), tensor(uint64),
195196 tensor(uint8) tensor(uint8)
196197 ): ):
197198 All Tensor types All Tensor types

Scan - 11#

Version

  • name: Scan (GitHub)

  • domain: main

  • since_version: 11

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 11.

Summary

Scan can be used to iterate over one or more scan_input tensors, constructing zero or more scan_output tensors. It combines ideas from general recurrences, functional programming constructs such as scan, fold, map, and zip and is intended to enable generalizations of RNN-like constructs for sequence-to-sequence processing. Other tensors (referred to as state_variables here) can be used to carry a state when iterating from one element to another (similar to hidden-state in RNNs, also referred to as loop-carried dependences in the context of loops). Many common usages involve a single scan_input tensor (where functionality similar to scan, fold and map can be obtained). When more than one scan_input is used, a behavior similar to zip is obtained.

The attribute body must be a graph, specifying the computation to be performed in every iteration. It takes as input the current values of the state_variables and the current iterated element of the scan_inputs. It must return the (updated) values of the state_variables and zero or more scan_output_element tensors. The values of the scan_output_element tensors are concatenated over all the iterations to produce the scan_output values of the scan construct (similar to the concatenated intermediate hidden-state values of RNN-like constructs). All the output tensors (state_variables as well as scan_output_element tensors) are required to have the same shape in each iteration of the loop (a restriction imposed to enable efficient memory allocation).

Note that the iterated element passed to the body subgraph does not have a sequence axis. It will have a rank one less than the rank of the corresponding scan_input.

The scan operation returns the final values of the state_variables as well as the scan_outputs.

The optional attribute scan_input_directions specifies the direction (forward or backward) for each scan input. If this attribute is omitted, all sequences are scanned in the forward direction. A bidirectional scan may be performed by specifying the same tensor input twice in the scan_inputs, once with a forward direction, and once with a backward direction.

The scan_output of the operation is produced by concatenating the scan_output_element values produced by the body in each iteration. The optional attribute scan_output_directions specifies the direction in which scan_output is constructed (by appending or prepending the scan_output_element to scan_output in each iteration) for each scan_output. If this attribute is omitted, the scan_output_element is appended to the scan_output in each iteration.

The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input. If omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is the batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1. Note that scanning a non-zero axis may be less efficient than scanning axis zero.

The optional attribute scan_output_axes specifies the axis along which the scan_outputs are accumulated for each scan_output. For example, if axis 1 is the time axis (to be scanned) for both inputs and outputs, specify a scan_input axis and scan_output axis value of 1.

Note that because of the ONNX restriction that only the last parameter of an operator can be variadic, the initial-states and scan-inputs are listed together as one input parameter. Similarly, the final-states and scan-outputs are listed together as one output parameter. The attribute num_scan_inputs indicates the number M of scan-inputs.

The behavior of

Scan <

num_scan_inputs = m, body = loop-body, scan_input_axes = [axis_1, …, axis_m]

> (init_1, …, init_n, scan_1, …, scan_m)

is equivalent to the following pseudo-code:

// scan_i.shape[axis_i] denotes the (max) sequence-length of scan_i // scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j. sequence_length = scan_1.shape[axis_1];

// initialize state-variables st_1 = init_1; … st_n = init_n; // initialize scan-output variables: [] denotes an empty tensor scan_out_1 = []; …; scan_out_k = []; // identify number of iterations:

// execute loop for (int t = 0; t < sequence_length; ++t) {

// generate the scan-input elements: the notation T<axis=k>[t] indicates the sub-tensor // of rank one less than T obtained by indexing T at position t along axis k. si_1 = scan_1<axis=axis_1>[t]; … ; si_m = scan_m<axis=axis_m>[t]; // execute loop-body st_1, …, st_n, so_1, …, so_k = loop-body(st_1, …, st_n, si_1, …, si_m) // accumulate the scan-output elements scan_out_1 = Concat<axis=0>(scan_out_1, so_1); … ; scan_out_k = Concat<axis=0>(scan_out_k, so_k);

}

return st_1, …, st_n, scan_out_1, …, scan_out_k;

Sample usage: Encoding RNN using a Scan

The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi, recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 can be encoded as a ScanLoop. Note that the loop-body is a nested graph, and it directly computes %Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these values are computed in the outer graph, they need to be passed in as extra state_variables.

graph rnn-encoding {

%H_0 = … %X = … %Y_h, %Y = Scan[body = <graph rnn-cell-1>, num_scan_inputs=1](%H_0, %X) return %Y, %Y_h

}

graph rnn-cell-1 (

%H_tminus1[FLOAT, tensor] %X_t[FLOAT, tensor]

) {

%Wi = … %Ri = … %Wbi = … %Rbi = … %t1 = X_t * (Wi^T) %t2 = H_tminus1*(Ri^T) %t3 = Add(%t1, %t2) %t4 = Add(%t3, %Wbi) %t5 = Add(%t4, %Rbi) %Ht = Tanh(%t5) %Accumulate = Identity(%Ht) return %Ht, %Accumulate

}

Attributes

  • body (required): The graph run each iteration. It has N+M inputs: (loop state variables…, scan_input_elts…). It has N+K outputs: (loop state variables…, scan_output_elts…). Each scan_output is created by concatenating the value of the specified scan_output_elt value at the end of each iteration of the loop. It is an error if the dimensions of these values change across loop iterations.

  • num_scan_inputs (required): An attribute specifying the number of scan_inputs M.

  • scan_input_axes: An optional list of M flags. The i-th element of the list specifies the axis to be scanned (the sequence axis) for the i-th scan_input. If omitted, 0 will be used as the scan axis for every scan_input. Negative value for an axis means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input).

  • scan_input_directions: An optional list of M flags. The i-th element of the list specifies the direction to be scanned for the i-th scan_input tensor: 0 indicates forward direction and 1 indicates reverse direction. If omitted, all scan_input tensors will be scanned in the forward direction.

  • scan_output_axes: An optional list of K flags. The i-th element of the list specifies the axis for the i-th scan_output. The scan outputs are accumulated along the specified axis. If omitted, 0 will be used as the scan axis for every scan_output. Negative value for an axis means counting dimensions from the back. Accepted range is [-r, r-1].

  • scan_output_directions: An optional list of K flags, one for each scan_output. The i-th element of the list specifies whether the i-th scan_output should be constructed by appending or prepending a new value in each iteration: 0 indicates appending and 1 indicates prepending. If omitted, all scan_output tensors will be produced by appending a value in each iteration.

Inputs

Between 1 and 2147483647 inputs.

  • initial_state_and_scan_inputs (variadic) - V: Initial values of the loop’s N state variables followed by M scan_inputs

Outputs

Between 1 and 2147483647 outputs.

  • final_state_and_scan_outputs (variadic) - V: Final values of the loop’s N state variables followed by K scan_outputs

Type Constraints

  • I in ( tensor(int64) ): Int64 tensor

  • V in ( tensor(bool), tensor(complex128), tensor(complex64), tensor(double), tensor(float), tensor(float16), tensor(int16), tensor(int32), tensor(int64), tensor(int8), tensor(string), tensor(uint16), tensor(uint32), tensor(uint64), tensor(uint8) ): All Tensor types

Differences

00Scan can be used to iterate over one or more scan_input tensors,Scan can be used to iterate over one or more scan_input tensors,
11constructing zero or more scan_output tensors. It combines ideas from general recurrences,constructing zero or more scan_output tensors. It combines ideas from general recurrences,
22functional programming constructs such as scan, fold, map, and zip and is intended to enablefunctional programming constructs such as scan, fold, map, and zip and is intended to enable
33generalizations of RNN-like constructs for sequence-to-sequence processing.generalizations of RNN-like constructs for sequence-to-sequence processing.
44Other tensors (referred to as state_variables here) can be used to carry a stateOther tensors (referred to as state_variables here) can be used to carry a state
55when iterating from one element to another (similar to hidden-state in RNNs, also referredwhen iterating from one element to another (similar to hidden-state in RNNs, also referred
66to as loop-carried dependences in the context of loops).to as loop-carried dependences in the context of loops).
77Many common usages involve a single scan_input tensor (where functionalityMany common usages involve a single scan_input tensor (where functionality
88similar to scan, fold and map can be obtained). When more than one scan_input is used,similar to scan, fold and map can be obtained). When more than one scan_input is used,
99a behavior similar to zip is obtained.a behavior similar to zip is obtained.
1010
1111The attribute body must be a graph, specifying the computation to be performed inThe attribute body must be a graph, specifying the computation to be performed in
1212every iteration. It takes as input the current values of the state_variables andevery iteration. It takes as input the current values of the state_variables and
1313the current iterated element of the scan_inputs. It must return the (updated) valuesthe current iterated element of the scan_inputs. It must return the (updated) values
1414of the state_variables and zero or more scan_output_element tensors. The values of theof the state_variables and zero or more scan_output_element tensors. The values of the
1515scan_output_element tensors are concatenated over all the iterations to produce thescan_output_element tensors are concatenated over all the iterations to produce the
1616scan_output values of the scan construct (similar to the concatenated intermediatescan_output values of the scan construct (similar to the concatenated intermediate
1717hidden-state values of RNN-like constructs). All the output tensors (state_variables ashidden-state values of RNN-like constructs). All the output tensors (state_variables as
1818well as scan_output_element tensors) are required to have the same shape in each iterationwell as scan_output_element tensors) are required to have the same shape in each iteration
1919of the loop (a restriction imposed to enable efficient memory allocation).of the loop (a restriction imposed to enable efficient memory allocation).
2020
2121Note that the iterated element passed to the body subgraph does not have a sequenceNote that the iterated element passed to the body subgraph does not have a sequence
2222axis. It will have a rank one less than the rank of the corresponding scan_input.axis. It will have a rank one less than the rank of the corresponding scan_input.
2323
2424The scan operation returns the final values of the state_variables as well as theThe scan operation returns the final values of the state_variables as well as the
2525scan_outputs.scan_outputs.
2626
2727The optional attribute scan_input_directions specifies the direction (forward or backward)The optional attribute scan_input_directions specifies the direction (forward or backward)
2828for each scan input. If this attribute is omitted, all sequences are scanned in the forwardfor each scan input. If this attribute is omitted, all sequences are scanned in the forward
2929direction. A bidirectional scan may be performed by specifying the same tensor input twicedirection. A bidirectional scan may be performed by specifying the same tensor input twice
3030in the scan_inputs, once with a forward direction, and once with a backward direction.in the scan_inputs, once with a forward direction, and once with a backward direction.
3131
3232The scan_output of the operation is produced by concatenating the scan_output_elementThe scan_output of the operation is produced by concatenating the scan_output_element
3333values produced by the body in each iteration. The optional attribute scan_output_directionsvalues produced by the body in each iteration. The optional attribute scan_output_directions
3434specifies the direction in which scan_output is constructed (by appending or prepending thespecifies the direction in which scan_output is constructed (by appending or prepending the
3535scan_output_element to scan_output in each iteration) for each scan_output. If this attributescan_output_element to scan_output in each iteration) for each scan_output. If this attribute
3636is omitted, the scan_output_element is appended to the scan_output in each iteration.is omitted, the scan_output_element is appended to the scan_output in each iteration.
3737
3838The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input.The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input.
3939If omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is theIf omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is the
4040batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1.batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1.
4141Note that scanning a non-zero axis may be less efficient than scanning axis zero.Note that scanning a non-zero axis may be less efficient than scanning axis zero.
4242
4343The optional attribute scan_output_axes specifies the axis along which the scan_outputsThe optional attribute scan_output_axes specifies the axis along which the scan_outputs
4444are accumulated for each scan_output. For example, if axis 1 is the time axis (to beare accumulated for each scan_output. For example, if axis 1 is the time axis (to be
4545scanned) for both inputs and outputs, specify a scan_input axis and scan_output axisscanned) for both inputs and outputs, specify a scan_input axis and scan_output axis
4646value of 1.value of 1.
4747
4848Note that because of the ONNX restriction that only the last parameter of an operator canNote that because of the ONNX restriction that only the last parameter of an operator can
4949be variadic, the initial-states and scan-inputs are listed together as one input parameter.be variadic, the initial-states and scan-inputs are listed together as one input parameter.
5050Similarly, the final-states and scan-outputs are listed together as one output parameter.Similarly, the final-states and scan-outputs are listed together as one output parameter.
5151The attribute num_scan_inputs indicates the number M of scan-inputs.The attribute num_scan_inputs indicates the number M of scan-inputs.
5252
5353The behavior ofThe behavior of
5454
5555 Scan < Scan <
5656 num_scan_inputs = m, num_scan_inputs = m,
5757 body = loop-body, body = loop-body,
5858 scan_input_axes = [axis_1, ..., axis_m] scan_input_axes = [axis_1, ..., axis_m]
5959 > (init_1, ..., init_n, scan_1, ..., scan_m) > (init_1, ..., init_n, scan_1, ..., scan_m)
6060
6161is equivalent to the following pseudo-code:is equivalent to the following pseudo-code:
6262
6363 // scan_i.shape[axis_i] denotes the (max) sequence-length of scan_i // scan_i.shape[axis_i] denotes the (max) sequence-length of scan_i
6464 // scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j. // scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j.
6565 sequence_length = scan_1.shape[axis_1]; sequence_length = scan_1.shape[axis_1];
6666
6767 // initialize state-variables // initialize state-variables
6868 st_1 = init_1; ... st_n = init_n; st_1 = init_1; ... st_n = init_n;
6969 // initialize scan-output variables: [] denotes an empty tensor // initialize scan-output variables: [] denotes an empty tensor
7070 scan_out_1 = []; ...; scan_out_k = []; scan_out_1 = []; ...; scan_out_k = [];
7171 // identify number of iterations: // identify number of iterations:
7272
7373 // execute loop // execute loop
7474 for (int t = 0; t < sequence_length; ++t) { for (int t = 0; t < sequence_length; ++t) {
7575 // generate the scan-input elements: the notation T[t] indicates the sub-tensor // generate the scan-input elements: the notation T[t] indicates the sub-tensor
7676 // of rank one less than T obtained by indexing T at position t along axis k. // of rank one less than T obtained by indexing T at position t along axis k.
7777 si_1 = scan_1[t]; si_1 = scan_1[t];
7878 ... ; ... ;
7979 si_m = scan_m[t]; si_m = scan_m[t];
8080 // execute loop-body // execute loop-body
8181 st_1, ..., st_n, so_1, ..., so_k = loop-body(st_1, ..., st_n, si_1, ..., si_m) st_1, ..., st_n, so_1, ..., so_k = loop-body(st_1, ..., st_n, si_1, ..., si_m)
8282 // accumulate the scan-output elements // accumulate the scan-output elements
8383 scan_out_1 = Concat(scan_out_1, so_1); ... ; scan_out_k = Concat(scan_out_k, so_k); scan_out_1 = Concat(scan_out_1, so_1); ... ; scan_out_k = Concat(scan_out_k, so_k);
8484 } }
8585
8686 return st_1, ..., st_n, scan_out_1, ..., scan_out_k; return st_1, ..., st_n, scan_out_1, ..., scan_out_k;
8787
8888*Sample usage: Encoding RNN using a Scan**Sample usage: Encoding RNN using a Scan*
8989
9090The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi,The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi,
9191recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 canrecurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 can
9292be encoded as a ScanLoop. Note that the loop-body is a nested graph, and it directly computesbe encoded as a ScanLoop. Note that the loop-body is a nested graph, and it directly computes
9393%Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these%Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these
9494values are computed in the outer graph, they need to be passed in as extra state_variables.values are computed in the outer graph, they need to be passed in as extra state_variables.
9595
9696 graph rnn-encoding { graph rnn-encoding {
9797 %H_0 = ... %H_0 = ...
9898 %X = ... %X = ...
9999 %Y_h, %Y = Scan[body = , num_scan_inputs=1](%H_0, %X) %Y_h, %Y = Scan[body = , num_scan_inputs=1](%H_0, %X)
100100 return %Y, %Y_h return %Y, %Y_h
101101 } }
102102
103103 graph rnn-cell-1 ( graph rnn-cell-1 (
104104 %H_tminus1[FLOAT, tensor] %H_tminus1[FLOAT, tensor]
105105 %X_t[FLOAT, tensor] %X_t[FLOAT, tensor]
106106 ) { ) {
107107 %Wi = ... %Wi = ...
108108 %Ri = ... %Ri = ...
109109 %Wbi = ... %Wbi = ...
110110 %Rbi = ... %Rbi = ...
111111 %t1 = X_t * (Wi^T) %t1 = X_t * (Wi^T)
112112 %t2 = H_tminus1*(Ri^T) %t2 = H_tminus1*(Ri^T)
113113 %t3 = Add(%t1, %t2) %t3 = Add(%t1, %t2)
114114 %t4 = Add(%t3, %Wbi) %t4 = Add(%t3, %Wbi)
115115 %t5 = Add(%t4, %Rbi) %t5 = Add(%t4, %Rbi)
116116 %Ht = Tanh(%t5) %Ht = Tanh(%t5)
117117 %Accumulate = Identity(%Ht) %Accumulate = Identity(%Ht)
118118 return %Ht, %Accumulate return %Ht, %Accumulate
119119 } }
120120
121121**Attributes****Attributes**
122122
123123* **body** (required):* **body** (required):
124124 The graph run each iteration. It has N+M inputs: (loop state The graph run each iteration. It has N+M inputs: (loop state
125125 variables..., scan_input_elts...). It has N+K outputs: (loop state variables..., scan_input_elts...). It has N+K outputs: (loop state
126126 variables..., scan_output_elts...). Each scan_output is created by variables..., scan_output_elts...). Each scan_output is created by
127127 concatenating the value of the specified scan_output_elt value at concatenating the value of the specified scan_output_elt value at
128128 the end of each iteration of the loop. It is an error if the the end of each iteration of the loop. It is an error if the
129129 dimensions of these values change across loop iterations. dimensions of these values change across loop iterations.
130130* **num_scan_inputs** (required):* **num_scan_inputs** (required):
131131 An attribute specifying the number of scan_inputs M. An attribute specifying the number of scan_inputs M.
132132* **scan_input_axes**:* **scan_input_axes**:
133133 An optional list of M flags. The i-th element of the list specifies An optional list of M flags. The i-th element of the list specifies
134134 the axis to be scanned (the sequence axis) for the i-th scan_input. the axis to be scanned (the sequence axis) for the i-th scan_input.
135135 If omitted, 0 will be used as the scan axis for every scan_input. If omitted, 0 will be used as the scan axis for every scan_input.
136 Negative value for an axis means counting dimensions from the back.
137 Accepted range is [-r, r-1] where r = rank(input).
136138* **scan_input_directions**:* **scan_input_directions**:
137139 An optional list of M flags. The i-th element of the list specifies An optional list of M flags. The i-th element of the list specifies
138140 the direction to be scanned for the i-th scan_input tensor: 0 the direction to be scanned for the i-th scan_input tensor: 0
139141 indicates forward direction and 1 indicates reverse direction. If indicates forward direction and 1 indicates reverse direction. If
140142 omitted, all scan_input tensors will be scanned in the forward omitted, all scan_input tensors will be scanned in the forward
141143 direction. direction.
142144* **scan_output_axes**:* **scan_output_axes**:
143145 An optional list of K flags. The i-th element of the list specifies An optional list of K flags. The i-th element of the list specifies
144146 the axis for the i-th scan_output. The scan outputs are accumulated the axis for the i-th scan_output. The scan outputs are accumulated
145147 along the specified axis. If omitted, 0 will be used as the scan along the specified axis. If omitted, 0 will be used as the scan
146148 axis for every scan_output. axis for every scan_output. Negative value for an axis means
149 counting dimensions from the back. Accepted range is [-r, r-1].
147150* **scan_output_directions**:* **scan_output_directions**:
148151 An optional list of K flags, one for each scan_output. The i-th An optional list of K flags, one for each scan_output. The i-th
149152 element of the list specifies whether the i-th scan_output should be element of the list specifies whether the i-th scan_output should be
150153 constructed by appending or prepending a new value in each constructed by appending or prepending a new value in each
151154 iteration: 0 indicates appending and 1 indicates prepending. If iteration: 0 indicates appending and 1 indicates prepending. If
152155 omitted, all scan_output tensors will be produced by appending a omitted, all scan_output tensors will be produced by appending a
153156 value in each iteration. value in each iteration.
154157
155158**Inputs****Inputs**
156159
157160Between 1 and 2147483647 inputs.Between 1 and 2147483647 inputs.
158161
159162* **initial_state_and_scan_inputs** (variadic) - **V**:* **initial_state_and_scan_inputs** (variadic) - **V**:
160163 Initial values of the loop's N state variables followed by M Initial values of the loop's N state variables followed by M
161164 scan_inputs scan_inputs
162165
163166**Outputs****Outputs**
164167
165168Between 1 and 2147483647 outputs.Between 1 and 2147483647 outputs.
166169
167170* **final_state_and_scan_outputs** (variadic) - **V**:* **final_state_and_scan_outputs** (variadic) - **V**:
168171 Final values of the loop's N state variables followed by K Final values of the loop's N state variables followed by K
169172 scan_outputs scan_outputs
170173
171174**Type Constraints****Type Constraints**
172175
173176* **I** in (* **I** in (
174177 tensor(int64) tensor(int64)
175178 ): ):
176179 Int64 tensor Int64 tensor
177180* **V** in (* **V** in (
178181 tensor(bool), tensor(bool),
179182 tensor(complex128), tensor(complex128),
180183 tensor(complex64), tensor(complex64),
181184 tensor(double), tensor(double),
182185 tensor(float), tensor(float),
183186 tensor(float16), tensor(float16),
184187 tensor(int16), tensor(int16),
185188 tensor(int32), tensor(int32),
186189 tensor(int64), tensor(int64),
187190 tensor(int8), tensor(int8),
188191 tensor(string), tensor(string),
189192 tensor(uint16), tensor(uint16),
190193 tensor(uint32), tensor(uint32),
191194 tensor(uint64), tensor(uint64),
192195 tensor(uint8) tensor(uint8)
193196 ): ):
194197 All Tensor types All Tensor types

Scan - 9#

Version

  • name: Scan (GitHub)

  • domain: main

  • since_version: 9

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 9.

Summary

Scan can be used to iterate over one or more scan_input tensors, constructing zero or more scan_output tensors. It combines ideas from general recurrences, functional programming constructs such as scan, fold, map, and zip and is intended to enable generalizations of RNN-like constructs for sequence-to-sequence processing. Other tensors (referred to as state_variables here) can be used to carry a state when iterating from one element to another (similar to hidden-state in RNNs, also referred to as loop-carried dependences in the context of loops). Many common usages involve a single scan_input tensor (where functionality similar to scan, fold and map can be obtained). When more than one scan_input is used, a behavior similar to zip is obtained.

The attribute body must be a graph, specifying the computation to be performed in every iteration. It takes as input the current values of the state_variables and the current iterated element of the scan_inputs. It must return the (updated) values of the state_variables and zero or more scan_output_element tensors. The values of the scan_output_element tensors are concatenated over all the iterations to produce the scan_output values of the scan construct (similar to the concatenated intermediate hidden-state values of RNN-like constructs). All the output tensors (state_variables as well as scan_output_element tensors) are required to have the same shape in each iteration of the loop (a restriction imposed to enable efficient memory allocation).

Note that the iterated element passed to the body subgraph does not have a sequence axis. It will have a rank one less than the rank of the corresponding scan_input.

The scan operation returns the final values of the state_variables as well as the scan_outputs.

The optional attribute scan_input_directions specifies the direction (forward or backward) for each scan input. If this attribute is omitted, all sequences are scanned in the forward direction. A bidirectional scan may be performed by specifying the same tensor input twice in the scan_inputs, once with a forward direction, and once with a backward direction.

The scan_output of the operation is produced by concatenating the scan_output_element values produced by the body in each iteration. The optional attribute scan_output_directions specifies the direction in which scan_output is constructed (by appending or prepending the scan_output_element to scan_output in each iteration) for each scan_output. If this attribute is omitted, the scan_output_element is appended to the scan_output in each iteration.

The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input. If omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is the batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1. Note that scanning a non-zero axis may be less efficient than scanning axis zero.

The optional attribute scan_output_axes specifies the axis along which the scan_outputs are accumulated for each scan_output. For example, if axis 1 is the time axis (to be scanned) for both inputs and outputs, specify a scan_input axis and scan_output axis value of 1.

Note that because of the ONNX restriction that only the last parameter of an operator can be variadic, the initial-states and scan-inputs are listed together as one input parameter. Similarly, the final-states and scan-outputs are listed together as one output parameter. The attribute num_scan_inputs indicates the number M of scan-inputs.

The behavior of

Scan <

num_scan_inputs = m, body = loop-body, scan_input_axes = [axis_1, …, axis_m]

> (init_1, …, init_n, scan_1, …, scan_m)

is equivalent to the following pseudo-code:

// scan_i.shape[axis_i] denotes the (max) sequence-length of scan_i // scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j. sequence_length = scan_1.shape[axis_1];

// initialize state-variables st_1 = init_1; … st_n = init_n; // initialize scan-output variables: [] denotes an empty tensor scan_out_1 = []; …; scan_out_k = []; // identify number of iterations:

// execute loop for (int t = 0; t < sequence_length; ++t) {

// generate the scan-input elements: the notation T<axis=k>[t] indicates the sub-tensor // of rank one less than T obtained by indexing T at position t along axis k. si_1 = scan_1<axis=axis_1>[t]; … ; si_m = scan_m<axis=axis_m>[t]; // execute loop-body st_1, …, st_n, so_1, …, so_k = loop-body(st_1, …, st_n, si_1, …, si_m) // accumulate the scan-output elements scan_out_1 = Concat<axis=0>(scan_out_1, so_1); … ; scan_out_k = Concat<axis=0>(scan_out_k, so_k);

}

return st_1, …, st_n, scan_out_1, …, scan_out_k;

Sample usage: Encoding RNN using a Scan

The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi, recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 can be encoded as a ScanLoop. Note that the loop-body is a nested graph, and it directly computes %Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these values are computed in the outer graph, they need to be passed in as extra state_variables.

graph rnn-encoding {

%H_0 = … %X = … %Y_h, %Y = Scan[body = <graph rnn-cell-1>, num_scan_inputs=1](%H_0, %X) return %Y, %Y_h

}

graph rnn-cell-1 (

%H_tminus1[FLOAT, tensor] %X_t[FLOAT, tensor]

) {

%Wi = … %Ri = … %Wbi = … %Rbi = … %t1 = X_t * (Wi^T) %t2 = H_tminus1*(Ri^T) %t3 = Add(%t1, %t2) %t4 = Add(%t3, %Wbi) %t5 = Add(%t4, %Rbi) %Ht = Tanh(%t5) %Accumulate = Identity(%Ht) return %Ht, %Accumulate

}

Attributes

  • body (required): The graph run each iteration. It has N+M inputs: (loop state variables…, scan_input_elts…). It has N+K outputs: (loop state variables…, scan_output_elts…). Each scan_output is created by concatenating the value of the specified scan_output_elt value at the end of each iteration of the loop. It is an error if the dimensions of these values change across loop iterations.

  • num_scan_inputs (required): An attribute specifying the number of scan_inputs M.

  • scan_input_axes: An optional list of M flags. The i-th element of the list specifies the axis to be scanned (the sequence axis) for the i-th scan_input. If omitted, 0 will be used as the scan axis for every scan_input.

  • scan_input_directions: An optional list of M flags. The i-th element of the list specifies the direction to be scanned for the i-th scan_input tensor: 0 indicates forward direction and 1 indicates reverse direction. If omitted, all scan_input tensors will be scanned in the forward direction.

  • scan_output_axes: An optional list of K flags. The i-th element of the list specifies the axis for the i-th scan_output. The scan outputs are accumulated along the specified axis. If omitted, 0 will be used as the scan axis for every scan_output.

  • scan_output_directions: An optional list of K flags, one for each scan_output. The i-th element of the list specifies whether the i-th scan_output should be constructed by appending or prepending a new value in each iteration: 0 indicates appending and 1 indicates prepending. If omitted, all scan_output tensors will be produced by appending a value in each iteration.

Inputs

Between 1 and 2147483647 inputs.

  • initial_state_and_scan_inputs (variadic) - V: Initial values of the loop’s N state variables followed by M scan_inputs

Outputs

Between 1 and 2147483647 outputs.

  • final_state_and_scan_outputs (variadic) - V: Final values of the loop’s N state variables followed by K scan_outputs

Type Constraints

  • I in ( tensor(int64) ): Int64 tensor

  • V in ( tensor(bool), tensor(complex128), tensor(complex64), tensor(double), tensor(float), tensor(float16), tensor(int16), tensor(int32), tensor(int64), tensor(int8), tensor(string), tensor(uint16), tensor(uint32), tensor(uint64), tensor(uint8) ): All Tensor types

Differences

00Scan can be used to iterate over one or more scan_input tensors,Scan can be used to iterate over one or more scan_input tensors,
11constructing zero or more scan_output tensors. It combines ideas from general recurrences,constructing zero or more scan_output tensors. It combines ideas from general recurrences,
22functional programming constructs such as scan, fold, map, and zip and is intended to enablefunctional programming constructs such as scan, fold, map, and zip and is intended to enable
33generalizations of RNN-like constructs for sequence-to-sequence processing.generalizations of RNN-like constructs for sequence-to-sequence processing.
44Other tensors (referred to as state_variables here) can be used to carry a stateOther tensors (referred to as state_variables here) can be used to carry a state
55when iterating from one element to another (similar to hidden-state in RNNs, also referredwhen iterating from one element to another (similar to hidden-state in RNNs, also referred
66to as loop-carried dependences in the context of loops). All these tensors are required toto as loop-carried dependences in the context of loops).
7have the same shape in each iteration of the loop (a restriction imposed to enable efficient
87memory allocation). Many common usages involve a single scan_input tensor (where functionalityMany common usages involve a single scan_input tensor (where functionality
98similar to scan, fold and map can be obtained). When more than one scan_input is used,similar to scan, fold and map can be obtained). When more than one scan_input is used,
109a behavior similar to zip is obtained.a behavior similar to zip is obtained.
1110
1211The attribute body must be a graph, specifying the computation to be performed inThe attribute body must be a graph, specifying the computation to be performed in
1312every iteration. It takes as input the current values of the state_variables andevery iteration. It takes as input the current values of the state_variables and
1413the current iterated element of the scan_inputs. It must return the (updated) valuesthe current iterated element of the scan_inputs. It must return the (updated) values
1514of the state_variables and zero or more scan_output_element tensors. The values of theof the state_variables and zero or more scan_output_element tensors. The values of the
1615scan_output_element tensors are concatenated over all the iterations to produce thescan_output_element tensors are concatenated over all the iterations to produce the
1716scan_output values of the scan construct (similar to the concatenated intermediatescan_output values of the scan construct (similar to the concatenated intermediate
17hidden-state values of RNN-like constructs). All the output tensors (state_variables as
1818hidden-state values of RNN-like constructs).well as scan_output_element tensors) are required to have the same shape in each iteration
19of the loop (a restriction imposed to enable efficient memory allocation).
1920
21Note that the iterated element passed to the body subgraph does not have a sequence
22axis. It will have a rank one less than the rank of the corresponding scan_input.
23
2024The scan operation returns the final values of the state_variables as well as theThe scan operation returns the final values of the state_variables as well as the
2125scan_outputs.scan_outputs.
2226
23The operation supports batching, and the batch-axis is required to be 0.
24When multiple scan_input tensors are used, they must all have the same batch-size,
25and they must all have the same maximum-sequence-length (the dimensionality of the
26sequence axis or scan axis). The sequence axis or scan axis is required to be 1.
27
28The operation has an optional sequence_lens input (of shape [BATCH_SIZE]) to
29allow variable length sequences of length <= the maximum-sequence-length. If this
30input is not specified, all sequences are assumed to be of length equal to
31maximum-sequence-length. For variable length input sequences, the scan_outputs
32will consist of a sequence of same length as the input, padded to the
3327maximum-sequence-length.The optional attribute scan_input_directions specifies the direction (forward or backward)
34
35The optional attribute directions can be used to scan a sequence in the reverse direction.
3628If this attribute is omitted, all sequences are scanned in the forward direction.for each scan input. If this attribute is omitted, all sequences are scanned in the forward
3729A bidirectional scan be performed by specifying the same tensor input twice in thedirection. A bidirectional scan may be performed by specifying the same tensor input twice
3830scan_inputs, once with a forward direction, and once with a backward direction.in the scan_inputs, once with a forward direction, and once with a backward direction.
3931
32The scan_output of the operation is produced by concatenating the scan_output_element
33values produced by the body in each iteration. The optional attribute scan_output_directions
34specifies the direction in which scan_output is constructed (by appending or prepending the
35scan_output_element to scan_output in each iteration) for each scan_output. If this attribute
36is omitted, the scan_output_element is appended to the scan_output in each iteration.
37
38The optional attribute scan_input_axes specifies the axis to be scanned for each scan_input.
39If omitted, every scan_input will be scanned in axis 0. For example, if axis 0 is the
40batch axis and axis 1 is the time axis (to be scanned), specify an axis value of 1.
41Note that scanning a non-zero axis may be less efficient than scanning axis zero.
42
43The optional attribute scan_output_axes specifies the axis along which the scan_outputs
44are accumulated for each scan_output. For example, if axis 1 is the time axis (to be
45scanned) for both inputs and outputs, specify a scan_input axis and scan_output axis
46value of 1.
47
4048Note that because of the ONNX restriction that only the last parameter of an operator canNote that because of the ONNX restriction that only the last parameter of an operator can
4149be variadic, the initial-states and scan-inputs are listed together as one input parameter.be variadic, the initial-states and scan-inputs are listed together as one input parameter.
4250Similarly, the final-states and scan-outputs are listed together as one output parameter.Similarly, the final-states and scan-outputs are listed together as one output parameter.
4351The attribute num_scan_inputs indicates the number M of scan-inputs.The attribute num_scan_inputs indicates the number M of scan-inputs.
4452
4553The behavior ofThe behavior of
4654
4755 Scan < Scan <
4856 num_scan_inputs = m, num_scan_inputs = m,
4957 body = loop-body body = loop-body,
58 scan_input_axes = [axis_1, ..., axis_m]
5059 > (sequence_lengths, init_1, ..., init_n, scan_1, ..., scan_m) > (init_1, ..., init_n, scan_1, ..., scan_m)
5160
5261is equivalent to the following pseudo-code:is equivalent to the following pseudo-code:
5362
54 // T.shape[0] denotes the batch-size of T
55 // The batch-size of scan_1, ..., scan_m are all required to be equal
56 batch_size = scan_1.shape[0];
57
5863 // scan_i.shape[1] denotes the (max) sequence-length of scan_i // scan_i.shape[axis_i] denotes the (max) sequence-length of scan_i
5964 // scan_i.shape[1] is required to be equal to scan_j.shape[1] for all i,j. // scan_i.shape[axis_i] is required to be equal to scan_j.shape[axis_j] for all i,j.
6065 max_sequence_length = scan_1.shape[1]; sequence_length = scan_1.shape[axis_1];
6166
62 for (int batch = 0; batch < batch_size; ++batch) {
6367 // initialize state-variables // initialize state-variables
6468 st_1 = init_1; ... st_n = init_n; st_1 = init_1; ... st_n = init_n;
6569 // initialize scan-output variables: [] denotes an empty tensor // initialize scan-output variables: [] denotes an empty tensor
6670 scan_out_1 = []; ...; scan_out_k = []; scan_out_1 = []; ...; scan_out_k = [];
6771 // identify number of iterations: // identify number of iterations:
72
6873 N = (sequence_lengths specified) ? sequence_lengths[batch] : max_sequence_length; // execute loop
69
7074 // execute loop for (int t = 0; t < sequence_length; ++t) {
71 for (int t = 0; t < N; ++t) {
7275 // generate the scan-input elements: the notation T[t] indicates the sub-tensor // generate the scan-input elements: the notation T[t] indicates the sub-tensor
7376 // of rank one less than T obtained by indexing T at position t along axis k. // of rank one less than T obtained by indexing T at position t along axis k.
7477 si_1 = (scan_10>[batch])<axis=1>[t]; si_1 = scan_1_1>[t];
7578 ... ; ... ;
7679 si_m = (scan_m0>[batch])<axis=1>[t]; si_m = scan_m_m>[t];
7780 // execute loop-body // execute loop-body
7881 st_1, ..., st_n, so_1, ..., so_k = loop-body(st_1, ..., st_n, si_1, ..., si_m) st_1, ..., st_n, so_1, ..., so_k = loop-body(st_1, ..., st_n, si_1, ..., si_m)
7982 // accumulate the scan-output elements // accumulate the scan-output elements
8083 scan_out_1 = Concat(scan_out_1, so_1); ... ; scan_out_k = Concat(scan_out_k, so_k); scan_out_1 = Concat(scan_out_1, so_1); ... ; scan_out_k = Concat(scan_out_k, so_k);
81 }
82 // accumulate the outputs for this batch:
83 bst_1[batch] = st_1; ..., bst_n[batch] = st_n;
84 // Note scan-outputs will have size max_sequence_length, but only first N values will be meaningful.
85 // The remaining values have an undefined value.
86 b_scan_out_1[batch] = scan_out_1; ...; b_scan_out_k[batch] = scan_out_k;
8784 } }
85
8886 return bst_1, ..., bst_n, b_scan_out_1, ..., b_scan_out_k; return st_1, ..., st_n, scan_out_1, ..., scan_out_k;
8987
9088*Sample usage: Encoding RNN using a Scan**Sample usage: Encoding RNN using a Scan*
9189
9290The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi,The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi,
9391recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 canrecurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 can
9492be encoded as a ScanLoop. Note that the loop-body is a nested graph, and it directly computesbe encoded as a ScanLoop. Note that the loop-body is a nested graph, and it directly computes
9593%Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these%Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these
9694values are computed in the outer graph, they need to be passed in as extra state_variables.values are computed in the outer graph, they need to be passed in as extra state_variables.
9795
9896 graph rnn-encoding { graph rnn-encoding {
9997 %H_0 = ... %H_0 = ...
10098 %X = ... %X = ...
10199 %Y_h, %Y = Scan[body = , num_scan_inputs=1]("", %H_0, %X) %Y_h, %Y = Scan[body = , num_scan_inputs=1](%H_0, %X)
102100 return %Y, %Y_h return %Y, %Y_h
103101 } }
104102
105103 graph rnn-cell-1 ( graph rnn-cell-1 (
106104 %H_tminus1[FLOAT, tensor] %H_tminus1[FLOAT, tensor]
107105 %X_t[FLOAT, tensor] %X_t[FLOAT, tensor]
108106 ) { ) {
109107 %Wi = ... %Wi = ...
110108 %Ri = ... %Ri = ...
111109 %Wbi = ... %Wbi = ...
112110 %Rbi = ... %Rbi = ...
113111 %t1 = X_t * (Wi^T) %t1 = X_t * (Wi^T)
114112 %t2 = H_tminus1*(Ri^T) %t2 = H_tminus1*(Ri^T)
115113 %t3 = Add(%t1, %t2) %t3 = Add(%t1, %t2)
116114 %t4 = Add(%t3, %Wbi) %t4 = Add(%t3, %Wbi)
117115 %t5 = Add(%t4, %Rbi) %t5 = Add(%t4, %Rbi)
118116 %Ht = Tanh(%t5) %Ht = Tanh(%t5)
119117 %Accumulate = Identity(%Ht) %Accumulate = Identity(%Ht)
120118 return %Ht, %Accumulate return %Ht, %Accumulate
121119 } }
122120
123121**Attributes****Attributes**
124122
125123* **body** (required):* **body** (required):
126124 The graph run each iteration. It has N+M inputs: (loop state The graph run each iteration. It has N+M inputs: (loop state
127125 variables..., scan_input_elts...). It has N+K outputs: (loop state variables..., scan_input_elts...). It has N+K outputs: (loop state
128126 variables..., scan_output_elts...). Each scan_output is created by variables..., scan_output_elts...). Each scan_output is created by
129127 concatenating the value of the specified scan_output_elt value at concatenating the value of the specified scan_output_elt value at
130128 the end of each iteration of the loop. It is an error if the the end of each iteration of the loop. It is an error if the
131129 dimensions of these values change across loop iterations. dimensions of these values change across loop iterations.
130* **num_scan_inputs** (required):
132131* **directions**: An attribute specifying the number of scan_inputs M.
132* **scan_input_axes**:
133133 An optional list of M flags. The i-th element of the list specifies An optional list of M flags. The i-th element of the list specifies
134 the axis to be scanned (the sequence axis) for the i-th scan_input.
135 If omitted, 0 will be used as the scan axis for every scan_input.
136* **scan_input_directions**:
137 An optional list of M flags. The i-th element of the list specifies
134138 the direction to be scanned for the i-th scan_input tensor: 0 the direction to be scanned for the i-th scan_input tensor: 0
135139 indicates forward direction and 1 indicates reverse direction. If indicates forward direction and 1 indicates reverse direction. If
136140 omitted, all scan_input tensors will be scanned in the forward omitted, all scan_input tensors will be scanned in the forward
137 direction.
138141* **num_scan_inputs** (required): direction.
139142 An attribute specifying the number of scan_inputs M.* **scan_output_axes**:
140
141143**Inputs** An optional list of K flags. The i-th element of the list specifies
142
143144Between 2 and 2147483647 inputs. the axis for the i-th scan_output. The scan outputs are accumulated
144
145* **sequence_lens** (optional, heterogeneous) - **I**:
145 along the specified axis. If omitted, 0 will be used as the scan
146146 Optional tensor specifying lengths of the sequences in a batch. If axis for every scan_output.
147147 this input is not specified, all sequences are assumed to be of the* **scan_output_directions**:
148 An optional list of K flags, one for each scan_output. The i-th
149 element of the list specifies whether the i-th scan_output should be
150 constructed by appending or prepending a new value in each
151 iteration: 0 indicates appending and 1 indicates prepending. If
152 omitted, all scan_output tensors will be produced by appending a
153 value in each iteration.
154
155**Inputs**
156
148157 maximum sequence length (the dimension of the sequence axis of theBetween 1 and 2147483647 inputs.
149 scan_input tensors).
158
150159* **initial_state_and_scan_inputs** (variadic) - **V**:* **initial_state_and_scan_inputs** (variadic) - **V**:
151160 Initial values of the loop's N state variables followed by M Initial values of the loop's N state variables followed by M
152161 scan_inputs scan_inputs
153162
154163**Outputs****Outputs**
155164
156165Between 1 and 2147483647 outputs.Between 1 and 2147483647 outputs.
157166
158167* **final_state_and_scan_outputs** (variadic) - **V**:* **final_state_and_scan_outputs** (variadic) - **V**:
159168 Final values of the loop's N state variables followed by K Final values of the loop's N state variables followed by K
160169 scan_outputs scan_outputs
161170
162171**Type Constraints****Type Constraints**
163172
164173* **I** in (* **I** in (
165174 tensor(int64) tensor(int64)
166175 ): ):
167176 Int64 tensor Int64 tensor
168177* **V** in (* **V** in (
169178 tensor(bool), tensor(bool),
170179 tensor(complex128), tensor(complex128),
171180 tensor(complex64), tensor(complex64),
172181 tensor(double), tensor(double),
173182 tensor(float), tensor(float),
174183 tensor(float16), tensor(float16),
175184 tensor(int16), tensor(int16),
176185 tensor(int32), tensor(int32),
177186 tensor(int64), tensor(int64),
178187 tensor(int8), tensor(int8),
179188 tensor(string), tensor(string),
180189 tensor(uint16), tensor(uint16),
181190 tensor(uint32), tensor(uint32),
182191 tensor(uint64), tensor(uint64),
183192 tensor(uint8) tensor(uint8)
184193 ): ):
185194 All Tensor types All Tensor types

Scan - 8#

Version

  • name: Scan (GitHub)

  • domain: main

  • since_version: 8

  • function: False

  • support_level: SupportType.COMMON

  • shape inference: True

This version of the operator has been available since version 8.

Summary

Scan can be used to iterate over one or more scan_input tensors, constructing zero or more scan_output tensors. It combines ideas from general recurrences, functional programming constructs such as scan, fold, map, and zip and is intended to enable generalizations of RNN-like constructs for sequence-to-sequence processing. Other tensors (referred to as state_variables here) can be used to carry a state when iterating from one element to another (similar to hidden-state in RNNs, also referred to as loop-carried dependences in the context of loops). All these tensors are required to have the same shape in each iteration of the loop (a restriction imposed to enable efficient memory allocation). Many common usages involve a single scan_input tensor (where functionality similar to scan, fold and map can be obtained). When more than one scan_input is used, a behavior similar to zip is obtained.

The attribute body must be a graph, specifying the computation to be performed in every iteration. It takes as input the current values of the state_variables and the current iterated element of the scan_inputs. It must return the (updated) values of the state_variables and zero or more scan_output_element tensors. The values of the scan_output_element tensors are concatenated over all the iterations to produce the scan_output values of the scan construct (similar to the concatenated intermediate hidden-state values of RNN-like constructs).

The scan operation returns the final values of the state_variables as well as the scan_outputs.

The operation supports batching, and the batch-axis is required to be 0. When multiple scan_input tensors are used, they must all have the same batch-size, and they must all have the same maximum-sequence-length (the dimensionality of the sequence axis or scan axis). The sequence axis or scan axis is required to be 1.

The operation has an optional sequence_lens input (of shape [BATCH_SIZE]) to allow variable length sequences of length <= the maximum-sequence-length. If this input is not specified, all sequences are assumed to be of length equal to maximum-sequence-length. For variable length input sequences, the scan_outputs will consist of a sequence of same length as the input, padded to the maximum-sequence-length.

The optional attribute directions can be used to scan a sequence in the reverse direction. If this attribute is omitted, all sequences are scanned in the forward direction. A bidirectional scan be performed by specifying the same tensor input twice in the scan_inputs, once with a forward direction, and once with a backward direction.

Note that because of the ONNX restriction that only the last parameter of an operator can be variadic, the initial-states and scan-inputs are listed together as one input parameter. Similarly, the final-states and scan-outputs are listed together as one output parameter. The attribute num_scan_inputs indicates the number M of scan-inputs.

The behavior of

Scan <

num_scan_inputs = m, body = loop-body

> (sequence_lengths, init_1, …, init_n, scan_1, …, scan_m)

is equivalent to the following pseudo-code:

// T.shape[0] denotes the batch-size of T // The batch-size of scan_1, …, scan_m are all required to be equal batch_size = scan_1.shape[0];

// scan_i.shape[1] denotes the (max) sequence-length of scan_i // scan_i.shape[1] is required to be equal to scan_j.shape[1] for all i,j. max_sequence_length = scan_1.shape[1];

for (int batch = 0; batch < batch_size; ++batch) {

// initialize state-variables st_1 = init_1; … st_n = init_n; // initialize scan-output variables: [] denotes an empty tensor scan_out_1 = []; …; scan_out_k = []; // identify number of iterations: N = (sequence_lengths specified) ? sequence_lengths[batch] : max_sequence_length;

// execute loop for (int t = 0; t < N; ++t) {

// generate the scan-input elements: the notation T<axis=k>[t] indicates the sub-tensor // of rank one less than T obtained by indexing T at position t along axis k. si_1 = (scan_1<axis=0>[batch])<axis=1>[t]; … ; si_m = (scan_m<axis=0>[batch])<axis=1>[t]; // execute loop-body st_1, …, st_n, so_1, …, so_k = loop-body(st_1, …, st_n, si_1, …, si_m) // accumulate the scan-output elements scan_out_1 = Concat<axis=0>(scan_out_1, so_1); … ; scan_out_k = Concat<axis=0>(scan_out_k, so_k);

} // accumulate the outputs for this batch: bst_1[batch] = st_1; …, bst_n[batch] = st_n; // Note scan-outputs will have size max_sequence_length, but only first N values will be meaningful. // The remaining values have an undefined value. b_scan_out_1[batch] = scan_out_1; …; b_scan_out_k[batch] = scan_out_k;

} return bst_1, …, bst_n, b_scan_out_1, …, b_scan_out_k;

Sample usage: Encoding RNN using a Scan

The following example shows how a simple RNN over an input tensor %X, with weight tensor %Wi, recurrence weight tensor %Ri, bias tensors %Wbi and %Rbi, and initial hidden-state %H_0 can be encoded as a ScanLoop. Note that the loop-body is a nested graph, and it directly computes %Wi, %Ri, %Wbi, and %Rbi (typically constants or initializers in the body graph). If these values are computed in the outer graph, they need to be passed in as extra state_variables.

graph rnn-encoding {

%H_0 = … %X = … %Y_h, %Y = Scan[body = <graph rnn-cell-1>, num_scan_inputs=1](“”, %H_0, %X) return %Y, %Y_h

}

graph rnn-cell-1 (

%H_tminus1[FLOAT, tensor] %X_t[FLOAT, tensor]

) {

%Wi = … %Ri = … %Wbi = … %Rbi = … %t1 = X_t * (Wi^T) %t2 = H_tminus1*(Ri^T) %t3 = Add(%t1, %t2) %t4 = Add(%t3, %Wbi) %t5 = Add(%t4, %Rbi) %Ht = Tanh(%t5) %Accumulate = Identity(%Ht) return %Ht, %Accumulate

}

Attributes

  • body (required): The graph run each iteration. It has N+M inputs: (loop state variables…, scan_input_elts…). It has N+K outputs: (loop state variables…, scan_output_elts…). Each scan_output is created by concatenating the value of the specified scan_output_elt value at the end of each iteration of the loop. It is an error if the dimensions of these values change across loop iterations.

  • directions: An optional list of M flags. The i-th element of the list specifies the direction to be scanned for the i-th scan_input tensor: 0 indicates forward direction and 1 indicates reverse direction. If omitted, all scan_input tensors will be scanned in the forward direction.

  • num_scan_inputs (required): An attribute specifying the number of scan_inputs M.

Inputs

Between 2 and 2147483647 inputs.

  • sequence_lens (optional, heterogeneous) - I: Optional tensor specifying lengths of the sequences in a batch. If this input is not specified, all sequences are assumed to be of the maximum sequence length (the dimension of the sequence axis of the scan_input tensors).

  • initial_state_and_scan_inputs (variadic) - V: Initial values of the loop’s N state variables followed by M scan_inputs

Outputs

Between 1 and 2147483647 outputs.

  • final_state_and_scan_outputs (variadic) - V: Final values of the loop’s N state variables followed by K scan_outputs

Type Constraints

  • I in ( tensor(int64) ): Int64 tensor

  • V in ( tensor(bool), tensor(complex128), tensor(complex64), tensor(double), tensor(float), tensor(float16), tensor(int16), tensor(int32), tensor(int64), tensor(int8), tensor(string), tensor(uint16), tensor(uint32), tensor(uint64), tensor(uint8) ): All Tensor types