.. _op_ai_onnx_LSTM-7:

LSTM - version 7
================

This page documents version **7** of operator **LSTM**.
See :doc:`LSTM` for the latest version (since version 22).

- **Domain**: ``ai.onnx``
- **Since version**: 7

Computes a one-layer LSTM. This operator is usually supported via some custom implementation such as CuDNN.

Notations:

* ``X`` - input tensor
* ``i`` - input gate
* ``o`` - output gate
* ``f`` - forget gate
* ``c`` - cell gate
* ``t`` - time step (t-1 means previous time step)
* ``W[iofc]`` - W parameter weight matrix for input, output, forget, and cell gates
* ``R[iofc]`` - R recurrence weight matrix for input, output, forget, and cell gates
* ``Wb[iofc]`` - W bias vectors for input, output, forget, and cell gates
* ``Rb[iofc]`` - R bias vectors for input, output, forget, and cell gates
* ``P[iof]`` - P peephole weight vector for input, output, and forget gates
* ``WB[iofc]`` - W parameter weight matrix for backward input, output, forget, and cell gates
* ``RB[iofc]`` - R recurrence weight matrix for backward input, output, forget, and cell gates
* ``WBb[iofc]`` - W bias vectors for backward input, output, forget, and cell gates
* ``RBb[iofc]`` - R bias vectors for backward input, output, forget, and cell gates
* ``PB[iof]`` - P peephole weight vector for backward input, output, and forget gates
* ``H`` - Hidden state
* ``num_directions`` - 2 if direction == bidirectional else 1

Activation functions:

* Relu(x) - max(0, x)
* Tanh(x) - (1 - e^{-2x})/(1 + e^{-2x})
* Sigmoid(x) - 1/(1 + e^{-x})

NOTE: The activation functions below are optional.

* Affine(x) - alpha\*x + beta
* LeakyRelu(x) - x if x >= 0 else alpha \* x
* ThresholdedRelu(x) - x if x >= alpha else 0
* ScaledTanh(x) - alpha\*Tanh(beta\*x)
* HardSigmoid(x) - min(max(alpha\*x + beta, 0), 1)
* Elu(x) - x if x >= 0 else alpha\*(e^x - 1)
* Softsign(x) - x/(1 + ``|x|``)
* Softplus(x) - log(1 + e^x)

Equations (Default: f=Sigmoid, g=Tanh, h=Tanh), where ``(.)`` denotes element-wise multiplication:

* it = f(Xt\*(Wi^T) + Ht-1\*(Ri^T) + Pi (.) Ct-1 + Wbi + Rbi)
* ft = f(Xt\*(Wf^T) + Ht-1\*(Rf^T) + Pf (.) Ct-1 + Wbf + Rbf)
* ct = g(Xt\*(Wc^T) + Ht-1\*(Rc^T) + Wbc + Rbc)
* Ct = ft (.) Ct-1 + it (.) ct
* ot = f(Xt\*(Wo^T) + Ht-1\*(Ro^T) + Po (.) Ct + Wbo + Rbo)
* Ht = ot (.) h(Ct)

**Inputs**

- **X** (*T*): The input sequences packed (and potentially padded) into one 3-D tensor with the shape of ``[seq_length, batch_size, input_size]``.
- **W** (*T*): The weight tensor for the gates. Concatenation of ``W[iofc]`` and ``WB[iofc]`` (if bidirectional) along dimension 0. The tensor has shape ``[num_directions, 4*hidden_size, input_size]``.
- **R** (*T*): The recurrence weight tensor. Concatenation of ``R[iofc]`` and ``RB[iofc]`` (if bidirectional) along dimension 0. This tensor has shape ``[num_directions, 4*hidden_size, hidden_size]``.
- **B** (*T*): The bias tensor for the gates. Concatenation of ``[Wb[iofc], Rb[iofc]]``, and ``[WBb[iofc], RBb[iofc]]`` (if bidirectional) along dimension 0. This tensor has shape ``[num_directions, 8*hidden_size]``. Optional: if not specified, assumed to be 0.
- **sequence_lens** (*T1*): Optional tensor specifying lengths of the sequences in a batch. If not specified, all sequences in the batch are assumed to have length ``seq_length``. It has shape ``[batch_size]``.
- **initial_h** (*T*): Optional initial value of the hidden state. If not specified, assumed to be 0. It has shape ``[num_directions, batch_size, hidden_size]``.
- **initial_c** (*T*): Optional initial value of the cell state. If not specified, assumed to be 0. It has shape ``[num_directions, batch_size, hidden_size]``.
- **P** (*T*): The weight tensor for peepholes. Concatenation of ``P[iof]`` and ``PB[iof]`` (if bidirectional) along dimension 0. It has shape ``[num_directions, 3*hidden_size]``. Optional: if not specified, assumed to be 0.

**Outputs**

- **Y** (*T*): A tensor that concatenates all the intermediate output values of the hidden state. It has shape ``[seq_length, num_directions, batch_size, hidden_size]``.
- **Y_h** (*T*): The last output value of the hidden state. It has shape ``[num_directions, batch_size, hidden_size]``.
- **Y_c** (*T*): The last output value of the cell state. It has shape ``[num_directions, batch_size, hidden_size]``.

**Type Constraints**

- **T**: Constrain input and output types to float tensors. Allowed types: tensor(double), tensor(float), tensor(float16).
- **T1**: Constrain seq_lens to integer tensor. Allowed types: tensor(int32).

Differences with previous version (1)
-------------------------------------

**SchemaDiff**: ``LSTM`` (domain ``'ai.onnx'``)

* old version: 1
* new version: 7
* breaking: no
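
As an informal illustration of the equations and tensor layouts documented on this page, the following is a minimal sketch in plain Python, not the reference implementation. It assumes a single direction, a single batch element, and the default activations (f=Sigmoid, g=Tanh, h=Tanh). The names ``lstm_forward``, ``matvec``, and ``sigmoid`` are hypothetical helpers introduced only for this example. The gate order ``i, o, f, c`` inside ``W``, ``R``, and ``B``, and ``i, o, f`` inside ``P``, follows the concatenations described under **Inputs**.

```python
import math


def sigmoid(x):
    """Sigmoid(x) = 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lstm_forward(X, W, R, B=None, P=None, h0=None, c0=None):
    """Illustrative single-direction LSTM for one batch element.

    X: seq_length rows of input_size values.
    W: 4*hidden_size rows of input_size values, gate order i, o, f, c.
    R: 4*hidden_size rows of hidden_size values, gate order i, o, f, c.
    B: 8*hidden_size values, Wb[iofc] followed by Rb[iofc] (default 0).
    P: 3*hidden_size values, gate order i, o, f (default 0).
    Returns (Y, Y_h, Y_c).
    """
    hs = len(R[0])  # hidden_size
    B = B if B is not None else [0.0] * (8 * hs)
    P = P if P is not None else [0.0] * (3 * hs)
    H = list(h0) if h0 is not None else [0.0] * hs
    C = list(c0) if c0 is not None else [0.0] * hs
    Y = []
    for x in X:
        wx = matvec(W, x)   # Xt * W^T for all four gates
        rh = matvec(R, H)   # Ht-1 * R^T for all four gates
        pre = [wx[k] + rh[k] + B[k] + B[4 * hs + k] for k in range(4 * hs)]
        new_H, new_C = [], []
        for j in range(hs):
            i_t = sigmoid(pre[j] + P[j] * C[j])                    # input gate
            f_t = sigmoid(pre[2 * hs + j] + P[2 * hs + j] * C[j])  # forget gate
            c_t = math.tanh(pre[3 * hs + j])                       # cell candidate
            C_t = f_t * C[j] + i_t * c_t                           # new cell state
            o_t = sigmoid(pre[hs + j] + P[hs + j] * C_t)           # output gate
            new_C.append(C_t)
            new_H.append(o_t * math.tanh(C_t))                     # new hidden state
        H, C = new_H, new_C
        Y.append(H)
    return Y, H, C


# Usage: one time step with input_size = 1 and hidden_size = 1;
# only the cell-gate input weight is non-zero.
W = [[0.0], [0.0], [0.0], [1.0]]
R = [[0.0], [0.0], [0.0], [0.0]]
Y, Y_h, Y_c = lstm_forward([[1.0]], W, R)
print(Y_h, Y_c)
```

The sketch omits ``sequence_lens``, the backward direction, and the batch dimension; a full implementation would repeat the same step per batch element and, for bidirectional models, run a second pass over the reversed sequence with the ``WB``, ``RB``, ``WBb``, ``RBb``, and ``PB`` parameters.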