LogSoftmax - 1 vs 13

The following section compares an older and a newer version of the same operator after both definitions are converted into markdown text. Green (lines prefixed with `+`) marks an addition to the newer version, red (lines prefixed with `-`) marks a deletion. Anything else is unchanged.

Files changed (1)
  1. LogSoftmax1 → LogSoftmax13 +20 -12
LogSoftmax1 → LogSoftmax13 RENAMED
@@ -1 +1 @@
- The operator computes the log of softmax values for the given input:
- LogSoftmax(input, axis) = Log(Softmax(input, axis=axis))
-
- The "axis" attribute indicates the dimension along which LogSoftmax
- will be performed. The output tensor has the same shape
- and contains the LogSoftmax values of the corresponding input.
+ The operator computes the logsoftmax (log of softmax) values for each layer in the batch
+ of the given input. The input is a 2-D tensor (Tensor<float>) of size
+ (batch_size x input_feature_dimensions). The output tensor has the same shape
+ and contains the logsoftmax values of the corresponding input.
+ Input does not need to explicitly be a 2D vector; rather, it will be
+ coerced into one. For an arbitrary n-dimensional tensor
+ input in [a_0, a_1, ..., a_{k-1}, a_k, ..., a_{n-1}] and k is
+ the axis provided, then input will be coerced into a 2-dimensional tensor with
+ dimensions [a_0 * ... * a_{k-1}, a_k * ... * a_{n-1}]. For the default
+ case where axis=1, this means the input tensor will be coerced into a 2D tensor
+ of dimensions [a_0, a_1 * ... * a_{n-1}], where a_0 is often the batch size.
+ In this situation, we must have a_0 = N and a_1 * ... * a_{n-1} = D.
+ Each of these dimensions must be matched correctly, or else the operator
+ will throw errors.
  **Attributes**
  * **axis**:
- Describes the dimension LogSoftmax will be performed on. Negative
- value means counting dimensions from the back. Accepted range is
- [-r, r-1] where r = rank(input).
+ Describes the axis of the inputs when coerced to 2D; defaults to one
+ because the 0th axis most likely describes the batch_size
  **Inputs**
  * **input** (heterogeneous) - **T**:
- The input tensor of rank >= axis.
+ The input tensor that's coerced into a 2D matrix of size (NxD) as
+ described above.
  **Outputs**
  * **output** (heterogeneous) - **T**:
- The output values with the same shape as the input tensor.
+ The output values with the same shape as input tensor (the original
+ size without coercion).
  **Type Constraints**
  * **T** in (
- tensor(bfloat16),
  tensor(double),
  tensor(float),
  tensor(float16)
  ):
  Constrain input and output types to float tensors.
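
The two definitions compared above describe different normalization groups: one coerces the input into a 2D tensor of dimensions [a_0 * ... * a_{k-1}, a_k * ... * a_{n-1}] and normalizes each row, the other performs LogSoftmax along a single dimension and accepts a negative axis. The sketch below illustrates that difference in plain Python. It is not the ONNX implementation: the function names are hypothetical, and the tensor is assumed to be stored as a flat row-major list alongside a shape tuple.

```python
import math


def _log_softmax_1d(values):
    """Numerically stable log-softmax over a flat list of floats."""
    peak = max(values)
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in values))
    return [v - log_sum for v in values]


def log_softmax_coerced_2d(flat, shape, axis=1):
    """Coerce to 2D [a_0*...*a_{k-1}, a_k*...*a_{n-1}] and normalize each row.

    `flat` holds the tensor in row-major order (an assumption of this sketch).
    """
    rows = math.prod(shape[:axis])
    cols = math.prod(shape[axis:])
    out = []
    for r in range(rows):
        out.extend(_log_softmax_1d(flat[r * cols:(r + 1) * cols]))
    return out  # same flat layout, so the original shape still applies


def log_softmax_along_axis(flat, shape, axis):
    """Normalize along one axis only; a negative axis counts from the back."""
    rank = len(shape)
    if not -rank <= axis <= rank - 1:
        raise ValueError(f"axis {axis} outside [-{rank}, {rank - 1}]")
    axis %= rank
    outer = math.prod(shape[:axis])
    size = shape[axis]
    inner = math.prod(shape[axis + 1:])
    out = [0.0] * len(flat)
    for o in range(outer):
        for i in range(inner):
            # Row-major indices of the elements that share (o, i).
            idx = [(o * size + k) * inner + i for k in range(size)]
            for j, v in zip(idx, _log_softmax_1d([flat[j] for j in idx])):
                out[j] = v
    return out


# Hypothetical 2 x 2 x 2 input holding the values 0.0 .. 7.0.
shape = (2, 2, 2)
x = [float(v) for v in range(8)]

coerced = log_softmax_coerced_2d(x, shape, axis=1)   # rows of 4 values
per_axis = log_softmax_along_axis(x, shape, axis=1)  # groups of 2 values
print(coerced[:4])
print(per_axis[:4])
```

With axis=1 on this input the two functions normalize over different groups (four values per row versus two values per group), so their outputs differ. When the axis is the last dimension, the coerced rows and the per-axis groups coincide and both functions return the same values.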