RNN

Domain: ai.onnx
Since version: 22

Computes a one-layer simple RNN. This operator is usually supported via a custom implementation such as cuDNN.

Notations:

X - input tensor
i - input gate
t - time step (t-1 means previous time step)
Wi - W parameter weight matrix for input gate
Ri - R recurrence weight matrix for input gate
Wbi - W parameter bias vector for input gate
Rbi - R parameter bias vector for input gate
WBi - W parameter weight matrix for backward input gate
RBi - R recurrence weight matrix for backward input gate
WBbi - W parameter bias vector for backward input gate
RBbi - R parameter bias vector for backward input gate
H - Hidden state
num_directions - 2 if direction == bidirectional else 1

Activation functions:

Relu(x) - max(0, x)
Tanh(x) - (1 - e^{-2x})/(1 + e^{-2x})
Sigmoid(x) - 1/(1 + e^{-x})

NOTE: The activation functions below are optional.

Affine(x) - alpha*x + beta
LeakyRelu(x) - x if x >= 0 else alpha * x
ThresholdedRelu(x) - x if x >= alpha else 0
ScaledTanh(x) - alpha*Tanh(beta*x)
HardSigmoid(x) - min(max(alpha*x + beta, 0), 1)
Elu(x) - x if x >= 0 else alpha*(e^x - 1)
Softsign(x) - x/(1 + |x|)
Softplus(x) - log(1 + e^x)
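
The definitions above translate directly to NumPy. The sketch below is illustrative only: the function names and the default alpha and beta values are assumptions chosen for the example, not defaults taken from the operator specification.

```python
import numpy as np

# Illustrative NumPy versions of the activation formulas listed above.
# The alpha/beta defaults are placeholder assumptions, not spec defaults.

def relu(x):
    return np.maximum(0.0, x)

def tanh(x):
    return (1.0 - np.exp(-2.0 * x)) / (1.0 + np.exp(-2.0 * x))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def affine(x, alpha=1.0, beta=0.0):
    return alpha * x + beta

def leaky_relu(x, alpha=0.01):
    return np.where(x >= 0, x, alpha * x)

def thresholded_relu(x, alpha=1.0):
    return np.where(x >= alpha, x, 0.0)

def scaled_tanh(x, alpha=1.0, beta=1.0):
    return alpha * np.tanh(beta * x)

def hard_sigmoid(x, alpha=0.2, beta=0.5):
    return np.minimum(np.maximum(alpha * x + beta, 0.0), 1.0)

def elu(x, alpha=1.0):
    return np.where(x >= 0, x, alpha * (np.exp(x) - 1.0))

def softsign(x):
    return x / (1.0 + np.abs(x))

def softplus(x):
    return np.log(1.0 + np.exp(x))

x = np.array([-2.0, 0.0, 2.0])
print(tanh(x))  # agrees with np.tanh(x) up to rounding
```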

Equations (Default: f=Tanh):

H_t = f(X_t*(Wi^T) + H_{t-1}*(Ri^T) + Wbi + Rbi)
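
The recurrence above can be sketched in NumPy as follows. This is a minimal, illustrative sketch and not the operator's reference implementation: it assumes a single forward direction, the tensor shapes listed under Inputs, and full-length sequences (no sequence_lens). The function name rnn_forward is hypothetical.

```python
import numpy as np

def rnn_forward(X, W, R, B=None, initial_h=None, f=np.tanh):
    """Forward-direction simple RNN (hypothetical helper, single direction).

    X: [seq_length, batch_size, input_size]
    W: [1, hidden_size, input_size]
    R: [1, hidden_size, hidden_size]
    B: [1, 2*hidden_size] or None (treated as zeros)
    initial_h: [1, batch_size, hidden_size] or None (treated as zeros)
    """
    seq_length, batch_size, _ = X.shape
    hidden_size = W.shape[1]
    Wi, Ri = W[0], R[0]

    # B stores Wbi in its first half and Rbi in its second half.
    if B is None:
        Wbi = Rbi = np.zeros(hidden_size, dtype=X.dtype)
    else:
        Wbi, Rbi = B[0, :hidden_size], B[0, hidden_size:]

    if initial_h is None:
        H = np.zeros((batch_size, hidden_size), dtype=X.dtype)
    else:
        H = initial_h[0]

    outputs = []
    for t in range(seq_length):
        # One application of the equation: new H from X at step t and previous H.
        H = f(X[t] @ Wi.T + H @ Ri.T + Wbi + Rbi)
        outputs.append(H)

    Y = np.stack(outputs)[:, np.newaxis]  # [seq_length, 1, batch_size, hidden_size]
    Y_h = H[np.newaxis]                   # [1, batch_size, hidden_size]
    return Y, Y_h

X = np.ones((2, 1, 3), dtype=np.float32)
W = np.full((1, 4, 3), 0.1, dtype=np.float32)
R = np.zeros((1, 4, 4), dtype=np.float32)
Y, Y_h = rnn_forward(X, W, R)
print(Y.shape, Y_h.shape)  # (2, 1, 1, 4) (1, 1, 4)
```

Passing a different callable as f swaps the activation; Tanh is used by default, matching the equation's stated default.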

Inputs

X (T): The input sequences packed (and potentially padded) into one 3-D tensor with the shape of [seq_length, batch_size, input_size].
W (T): The weight tensor for input gate. Concatenation of Wi and WBi (if bidirectional). The tensor has shape [num_directions, hidden_size, input_size].
R (T): The recurrence weight tensor. Concatenation of Ri and RBi (if bidirectional). The tensor has shape [num_directions, hidden_size, hidden_size].
B (T): The bias tensor for input gate. Concatenation of [Wbi, Rbi] and [WBbi, RBbi] (if bidirectional). The tensor has shape [num_directions, 2*hidden_size]. Optional: If not specified - assumed to be 0.
sequence_lens (T1): Optional tensor specifying the lengths of the sequences in a batch. If not specified, all sequences in the batch are assumed to have length seq_length. It has shape [batch_size].
initial_h (T): Optional initial value of the hidden state. If not specified, it is assumed to be 0. It has shape [num_directions, batch_size, hidden_size].
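
The shape relationships between the inputs can be checked mechanically. The helper below is a hypothetical sketch (check_rnn_input_shapes is not part of any ONNX API); it validates only the shapes stated above and derives the dimension names from X and W.

```python
import numpy as np

def check_rnn_input_shapes(X, W, R, B=None, sequence_lens=None, initial_h=None):
    """Validate the documented input shapes and return
    (seq_length, batch_size, input_size, hidden_size, num_directions)."""
    seq_length, batch_size, input_size = X.shape
    num_directions, hidden_size, w_input_size = W.shape
    assert num_directions in (1, 2), "num_directions must be 1 or 2"
    assert w_input_size == input_size, "last dim of W must equal input_size"
    assert R.shape == (num_directions, hidden_size, hidden_size)
    if B is not None:
        assert B.shape == (num_directions, 2 * hidden_size)
    if sequence_lens is not None:
        assert sequence_lens.shape == (batch_size,)
    if initial_h is not None:
        assert initial_h.shape == (num_directions, batch_size, hidden_size)
    return seq_length, batch_size, input_size, hidden_size, num_directions

# Shapes taken from the test_cc_rnn_seq_length example.
dims = check_rnn_input_shapes(
    X=np.zeros((2, 3, 3), dtype=np.float32),
    W=np.zeros((1, 5, 3), dtype=np.float32),
    R=np.zeros((1, 5, 5), dtype=np.float32),
    B=np.zeros((1, 10), dtype=np.float32),
)
print(dims)  # (2, 3, 3, 5, 1)
```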

Outputs

Y (T): A tensor that concatenates all the intermediate output values of the hidden state. It has shape [seq_length, num_directions, batch_size, hidden_size].
Y_h (T): The last output value of the hidden state. It has shape [num_directions, batch_size, hidden_size].
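
The relationship between the two outputs can be illustrated with the values from the test_cc_simple_rnn_with_initial_bias example. The sketch assumes a single forward direction and full-length sequences; with sequence_lens or another direction, the last time step of Y is not necessarily Y_h.

```python
import numpy as np

# Y copied from the test_cc_simple_rnn_with_initial_bias example:
# shape [seq_length=2, num_directions=1, batch_size=3, hidden_size=4].
Y = np.array(
    [[[[ 0.14301287, -0.03648381,  0.05295042,  0.1356585 ],
       [ 0.04796317,  0.00149999,  0.04097702,  0.11597578],
       [-0.04796318,  0.03947946,  0.02899186,  0.09620157]]],
     [[[-0.08027796,  0.03108602,  0.03429237,  0.07716657],
       [-0.128676  ,  0.0512534 ,  0.04090462,  0.05721463],
       [-0.17634039,  0.07134689,  0.04759643,  0.03713958]]]],
    dtype=np.float32,
)

# Under the assumptions above, the final hidden state is the last time step of Y.
Y_h = Y[-1]
print(Y.shape, Y_h.shape)  # (2, 1, 3, 4) (1, 3, 4)
```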

Type Constraints

T: Constrain input and output types to float tensors. Allowed types: tensor(bfloat16), tensor(double), tensor(float), tensor(float16).
T1: Constrain sequence_lens to integer tensors. Allowed types: tensor(int32).

Examples

test_cc_rnn_seq_length

Node:
  RNN(X, W, R, B) -> ("", Y_h)
  Attributes:
    hidden_size = 5

Inputs:
  X: shape=(2, 3, 3), dtype=float32
    [[[-0.3       , -0.23000002, -0.16000001],
      [-0.09      , -0.02000001,  0.04999998],
      [ 0.12      ,  0.19      ,  0.26      ]],

     [[ 0.32999998,  0.39999998,  0.46999997],
      [ 0.54      ,  0.61      ,  0.68      ],
      [ 0.74999994,  0.82      ,  0.89000005]]]
  W: shape=(1, 5, 3), dtype=float32
    [[[-0.2       , -0.12      , -0.04000001],
      [ 0.03999999,  0.11999999,  0.19999997],
      [-0.2       , -0.12      , -0.04000001],
      [ 0.03999999,  0.11999999,  0.19999997],
      [-0.2       , -0.12      , -0.04000001]]]
  R: shape=(1, 5, 5), dtype=float32
    [[[-0.15      , -0.11000001, -0.07000001, -0.03000001,  0.00999999],
      [ 0.04999998,  0.08999999,  0.13      ,  0.16999999, -0.15      ],
      [-0.11000001, -0.07000001, -0.03000001,  0.00999999,  0.04999998],
      [ 0.08999999,  0.13      ,  0.16999999, -0.15      , -0.11000001],
      [-0.07000001, -0.03000001,  0.00999999,  0.04999998,  0.08999999]]]
  B: shape=(1, 10), dtype=float32
    [[-0.07      , -0.04      , -0.01      ,  0.02      ,  0.05      ,  0.07999999,
       0.10999999,  0.13999999,  0.16999999,  0.19999999]]

Outputs:
  Y_h: shape=(1, 3, 5), dtype=float32
    [[[-0.1526143 ,  0.2253239 , -0.00296609,  0.32540613,  0.14681508],
      [-0.22053953,  0.3106558 , -0.07594406,  0.3803142 ,  0.07191767],
      [-0.2861242 ,  0.39105037, -0.14813372,  0.43283218, -0.00412378]]]

test_cc_simple_rnn_batchwise

Node:
  RNN(X, W, R) -> (Y, Y_h)
  Attributes:
    hidden_size = 4
    layout = 1

Inputs:
  X: shape=(3, 1, 2), dtype=float32
    [[[1., 2.]],

     [[3., 4.]],

     [[5., 6.]]]
  W: shape=(1, 4, 2), dtype=float32
    [[[0.5, 0.5],
      [0.5, 0.5],
      [0.5, 0.5],
      [0.5, 0.5]]]
  R: shape=(1, 4, 4), dtype=float32
    [[[0.5, 0.5, 0.5, 0.5],
      [0.5, 0.5, 0.5, 0.5],
      [0.5, 0.5, 0.5, 0.5],
      [0.5, 0.5, 0.5, 0.5]]]

Outputs:
  Y: shape=(3, 1, 1, 4), dtype=float32
    [[[[0.90514827, 0.90514827, 0.90514827, 0.90514827]]],


     [[[0.9981779 , 0.9981779 , 0.9981779 , 0.9981779 ]]],


     [[[0.9999666 , 0.9999666 , 0.9999666 , 0.9999666 ]]]]
  Y_h: shape=(3, 1, 4), dtype=float32
    [[[0.90514827, 0.90514827, 0.90514827, 0.90514827]],

     [[0.9981779 , 0.9981779 , 0.9981779 , 0.9981779 ]],

     [[0.9999666 , 0.9999666 , 0.9999666 , 0.9999666 ]]]
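
The numbers in this example can be reproduced by hand. With layout = 1 the batch dimension comes first, so X is [batch_size, seq_length, input_size], Y is [batch_size, seq_length, num_directions, hidden_size], and Y_h is [batch_size, num_directions, hidden_size]. The following is a minimal sketch of that computation, not the code used to generate the test.

```python
import numpy as np

# Inputs from the example; with layout = 1, X is [batch_size, seq_length, input_size].
X = np.array([[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]], dtype=np.float32)
W = np.full((1, 4, 2), 0.5, dtype=np.float32)
R = np.full((1, 4, 4), 0.5, dtype=np.float32)

batch_size, seq_length, _ = X.shape
hidden_size = W.shape[1]

H = np.zeros((batch_size, hidden_size), dtype=np.float32)  # initial_h defaults to 0
steps = []
for t in range(seq_length):
    H = np.tanh(X[:, t, :] @ W[0].T + H @ R[0].T)
    steps.append(H)

# Reassemble the outputs in the batch-first layout.
Y = np.stack(steps, axis=1)[:, :, np.newaxis, :]
Y_h = H[:, np.newaxis, :]
print(Y.shape, Y_h.shape)  # (3, 1, 1, 4) (3, 1, 4)
print(Y_h[:, 0, 0])        # approximately [0.905148 0.998178 0.999967]
```

Because each sequence has a single time step and the initial hidden state is zero, every output element is simply tanh(0.5 * (x1 + x2)), for example tanh(1.5) for the first batch entry.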

test_cc_simple_rnn_defaults

Node:
  RNN(X, W, R) -> ("", Y_h)
  Attributes:
    hidden_size = 4

Inputs:
  X: shape=(2, 3, 2), dtype=float32
    [[[-0.5       , -0.4       ],
      [-0.3       , -0.19999999],
      [-0.09999999,  0.        ]],

     [[ 0.10000002,  0.19999999],
      [ 0.3       ,  0.40000004],
      [ 0.5       ,  0.6       ]]]
  W: shape=(1, 4, 2), dtype=float32
    [[[-0.2       , -0.1       ],
      [ 0.        ,  0.10000001],
      [ 0.2       , -0.2       ],
      [-0.1       ,  0.        ]]]
  R: shape=(1, 4, 4), dtype=float32
    [[[-0.15      , -0.10000001, -0.05      ,  0.        ],
      [ 0.05      ,  0.09999999,  0.15      , -0.15      ],
      [-0.10000001, -0.05      ,  0.        ,  0.05      ],
      [ 0.09999999,  0.15      , -0.15      , -0.10000001]]]

Outputs:
  Y_h: shape=(1, 3, 4), dtype=float32
    [[[-0.05580809,  0.01246275, -0.02940391, -0.00408378],
      [-0.10854553,  0.03447984, -0.02547805, -0.02501091],
      [-0.16059728,  0.05644028, -0.02149644, -0.04596822]]]

test_cc_simple_rnn_with_initial_bias

Node:
  RNN(X, W, R, B, "", initial_h) -> (Y, Y_h)
  Attributes:
    hidden_size = 4

Inputs:
  X: shape=(2, 3, 2), dtype=float32
    [[[-0.5       , -0.4       ],
      [-0.3       , -0.19999999],
      [-0.09999999,  0.        ]],

     [[ 0.10000002,  0.19999999],
      [ 0.3       ,  0.40000004],
      [ 0.5       ,  0.6       ]]]
  W: shape=(1, 4, 2), dtype=float32
    [[[-0.2       , -0.1       ],
      [ 0.        ,  0.10000001],
      [ 0.2       , -0.2       ],
      [-0.1       ,  0.        ]]]
  R: shape=(1, 4, 4), dtype=float32
    [[[-0.15      , -0.10000001, -0.05      ,  0.        ],
      [ 0.05      ,  0.09999999,  0.15      , -0.15      ],
      [-0.10000001, -0.05      ,  0.        ,  0.05      ],
      [ 0.09999999,  0.15      , -0.15      , -0.10000001]]]
  B: shape=(1, 8), dtype=float32
    [[-0.05      , -0.03      , -0.01      ,  0.01      ,  0.03      ,  0.04999999,
       0.06999999,  0.09      ]]
  initial_h: shape=(1, 3, 4), dtype=float32
    [[[-0.1       , -0.07      , -0.04      , -0.01000001],
      [ 0.02      ,  0.04999999,  0.07999999,  0.10999999],
      [ 0.13999999,  0.16999999,  0.19999999,  0.22999999]]]

Outputs:
  Y: shape=(2, 1, 3, 4), dtype=float32
    [[[[ 0.14301287, -0.03648381,  0.05295042,  0.1356585 ],
       [ 0.04796317,  0.00149999,  0.04097702,  0.11597578],
       [-0.04796318,  0.03947946,  0.02899186,  0.09620157]]],


     [[[-0.08027796,  0.03108602,  0.03429237,  0.07716657],
       [-0.128676  ,  0.0512534 ,  0.04090462,  0.05721463],
       [-0.17634039,  0.07134689,  0.04759643,  0.03713958]]]]
  Y_h: shape=(1, 3, 4), dtype=float32
    [[[-0.08027796,  0.03108602,  0.03429237,  0.07716657],
      [-0.128676  ,  0.0512534 ,  0.04090462,  0.05721463],
      [-0.17634039,  0.07134689,  0.04759643,  0.03713958]]]
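
To see how B and initial_h enter the computation, the first time step for the first batch entry can be worked out directly. The sketch below uses the example's inputs rounded to two decimals and assumes, as stated in the Inputs section, that the first half of B holds Wbi and the second half holds Rbi.

```python
import numpy as np

hidden_size = 4
x0 = np.array([-0.5, -0.4])                 # X[0, 0]
h0 = np.array([-0.1, -0.07, -0.04, -0.01])  # initial_h[0, 0]
W = np.array([[-0.2, -0.1], [0.0, 0.1], [0.2, -0.2], [-0.1, 0.0]])
R = np.array([[-0.15, -0.10, -0.05, 0.00],
              [0.05, 0.10, 0.15, -0.15],
              [-0.10, -0.05, 0.00, 0.05],
              [0.10, 0.15, -0.15, -0.10]])
B = np.array([-0.05, -0.03, -0.01, 0.01, 0.03, 0.05, 0.07, 0.09])

# Split the bias into its input (Wbi) and recurrence (Rbi) halves.
Wb, Rb = B[:hidden_size], B[hidden_size:]

h1 = np.tanh(x0 @ W.T + h0 @ R.T + Wb + Rb)
print(h1)  # close to Y[0, 0, 0] = [0.14301287, -0.03648381, 0.05295042, 0.1356585]
```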

Differences with previous version (14)

SchemaDiff: RNN (domain 'ai.onnx')

old version: 14
new version: 22
breaking: no

Type constraints:

changed ‘T’: added types: [‘tensor(bfloat16)’]
