Softmax

Softmax - 13
Softmax - 11
Softmax - 1

Softmax - 13

Version

name: Softmax (GitHub)
domain: main
since_version: 13
function: False
support_level: SupportType.COMMON
shape inference: True

This version of the operator has been available since version 13.

Summary

The operator computes the normalized exponential values for the given input:

Softmax(input, axis) = Exp(input) / ReduceSum(Exp(input), axis=axis, keepdims=1)

The “axis” attribute indicates the dimension along which Softmax will be performed. The output tensor has the same shape and contains the Softmax values of the corresponding input.
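The formula above can be transcribed almost literally into NumPy. The following is a minimal sketch, not the reference implementation: the helper name softmax_v13 is hypothetical, and the literal form is not protected against overflow for large inputs.

```python
import numpy as np


def softmax_v13(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Exp(x) / ReduceSum(Exp(x), axis=axis, keepdims=1), written with NumPy."""
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


x = np.arange(24, dtype=np.float32).reshape(2, 3, 4) / 10.0
y = softmax_v13(x, axis=1)

print(y.shape)        # same shape as the input
print(y.sum(axis=1))  # each slice along axis 1 sums to approximately 1
```

keepdims=True plays the role of keepdims=1 in the formula: the reduced dimension is kept with size 1 so the sum broadcasts against the input.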

Attributes

axis: Describes the dimension Softmax will be performed on. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input). Default value is -1.
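To make the accepted range concrete, the sketch below maps an axis in [-r, r-1] to its non-negative equivalent. The helper normalize_axis is hypothetical and only illustrates the counting rule; it is not part of the ONNX API.

```python
def normalize_axis(axis: int, rank: int) -> int:
    """Map an axis in [-rank, rank - 1] to its non-negative equivalent."""
    if not -rank <= axis <= rank - 1:
        raise ValueError(f"axis {axis} is outside [-{rank}, {rank - 1}]")
    return axis + rank if axis < 0 else axis


rank = 3  # e.g. an input of shape (3, 4, 5)
print(normalize_axis(-1, rank))  # 2: the default axis is the last dimension
print(normalize_axis(-3, rank))  # 0
print(normalize_axis(1, rank))   # 1
```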

Inputs

input (heterogeneous) - T: The input tensor of rank >= axis.

Outputs

output (heterogeneous) - T: The output values with the same shape as the input tensor.

Type Constraints

T in ( tensor(bfloat16), tensor(double), tensor(float), tensor(float16) ): Constrain input and output types to float tensors.

Examples

softmax_axis

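import numpy as np
import onnx
from onnx.backend.test.case.node import expect


# softmax() is not defined in this excerpt; a NumPy reference is assumed here.
def softmax(x, axis=-1):
    # Subtracting the maximum keeps exp() from overflowing on large inputs.
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)


x = np.array([[0, 1, 2, 3], [10000, 10001, 10002, 10003]]
             ).astype(np.float32)
# expected output
# [[0.032058604 0.08714432  0.23688284  0.6439143  ]
#  [0.032058604 0.08714432  0.23688284  0.6439143  ]]
y = softmax(x)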

node = onnx.helper.make_node(
    'Softmax',
    inputs=['x'],
    outputs=['y'],
)
expect(node, inputs=[x], outputs=[y],
       name='test_softmax_large_number')

x = np.abs(np.random.randn(3, 4, 5).astype(np.float32))
node = onnx.helper.make_node(
    'Softmax',
    inputs=['x'],
    outputs=['y'],
    axis=0,
)
y = softmax(x, axis=0)
expect(node, inputs=[x], outputs=[y],
       name='test_softmax_axis_0')

node = onnx.helper.make_node(
    'Softmax',
    inputs=['x'],
    outputs=['y'],
    axis=1,
)
y = softmax(x, axis=1)
expect(node, inputs=[x], outputs=[y],
       name='test_softmax_axis_1')

node = onnx.helper.make_node(
    'Softmax',
    inputs=['x'],
    outputs=['y'],
    axis=2,
)
y = softmax(x, axis=2)
expect(node, inputs=[x], outputs=[y],
       name='test_softmax_axis_2')

node = onnx.helper.make_node(
    'Softmax',
    inputs=['x'],
    outputs=['y'],
    axis=-1,
)
y = softmax(x, axis=-1)
expect(node, inputs=[x], outputs=[y],
       name='test_softmax_negative_axis')

# default axis is -1
node = onnx.helper.make_node(
    'Softmax',
    inputs=['x'],
    outputs=['y'],
)
expect(node, inputs=[x], outputs=[y],
       name='test_softmax_default_axis')
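The test_softmax_large_number case above expects both rows to give the same result, even though exp(10000) is not representable in float32. A minimal sketch of why, assuming NumPy: softmax is unchanged when a constant is added along the axis, so subtracting the per-row maximum before exponentiating gives the same values without overflow. The variable names are illustrative only.

```python
import numpy as np

x = np.array([[0, 1, 2, 3], [10000, 10001, 10002, 10003]], dtype=np.float32)

# Literal formula: exp(10000) overflows to inf in float32, and inf / inf is nan.
with np.errstate(over="ignore", invalid="ignore"):
    e = np.exp(x)
    naive = e / np.sum(e, axis=-1, keepdims=True)

# Shifted formula: subtract the per-row maximum before exponentiating.
shifted = np.exp(x - np.max(x, axis=-1, keepdims=True))
stable = shifted / np.sum(shifted, axis=-1, keepdims=True)

print(np.isnan(naive[1]).all())           # True
print(np.allclose(stable[0], stable[1]))  # True
```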

Differences

Changes from Softmax - 11 to Softmax - 13:

Summary: the description of coercing the input into a 2D tensor is replaced by the formula Softmax(input, axis) = Exp(input) / ReduceSum(Exp(input), axis=axis, keepdims=1). Softmax is now performed along the single dimension given by axis.

axis: now describes the dimension Softmax will be performed on, rather than the axis of the inputs when coerced to 2D. The default value changes from 1 to -1. The accepted range [-r, r-1] is unchanged.

input: described as "The input tensor of rank >= axis." instead of a tensor coerced into a 2D matrix of size (NxD).

output: the note "(the original size without coercion)" is dropped.

Type Constraints: tensor(bfloat16) is added to T.

Softmax - 11

Version

name: Softmax (GitHub)
domain: main
since_version: 11
function: False
support_level: SupportType.COMMON
shape inference: True

This version of the operator has been available since version 11.

Summary

The operator computes the softmax (normalized exponential) values for each layer in the batch of the given input.

The input does not need to explicitly be a 2D vector; rather, it will be coerced into one. For an arbitrary n-dimensional input tensor with dimensions [a_0, a_1, …, a_{k-1}, a_k, …, a_{n-1}], where k is the axis provided, the input will be coerced into a 2-dimensional tensor with dimensions [a_0 * … * a_{k-1}, a_k * … * a_{n-1}]. For the default case where axis=1, this means the input tensor will be coerced into a 2D tensor of dimensions [a_0, a_1 * … * a_{n-1}], where a_0 is often the batch size. In this situation, we must have a_0 = N and a_1 * … * a_{n-1} = D. Each of these dimensions must be matched correctly, or else the operator will throw errors. The output tensor has the same shape as the input and contains the softmax values of the corresponding input.
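The coercion described above can be sketched in NumPy as a reshape, a row-wise softmax, and a reshape back. This is an illustration under stated assumptions rather than the reference implementation: the helper names softmax_last and softmax_coerced are hypothetical, and the maximum is subtracted before exponentiating for numerical safety.

```python
import numpy as np


def softmax_last(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)


def softmax_coerced(x, axis=1):
    """Softmax with the 2D coercion described above (hypothetical helper)."""
    k = axis % x.ndim                 # map a negative axis into [0, r-1]
    rows = int(np.prod(x.shape[:k]))  # a_0 * ... * a_{k-1}
    flat = x.reshape(rows, -1)        # columns: a_k * ... * a_{n-1}
    return softmax_last(flat).reshape(x.shape)


x = np.random.default_rng(0).standard_normal((2, 3, 4)).astype(np.float32)
y = softmax_coerced(x, axis=1)

print(y.shape)                       # (2, 3, 4)
print(y.reshape(2, -1).sum(axis=1))  # each of the 2 coerced rows sums to about 1
```

With axis=1 on a (2, 3, 4) input, normalization runs over all 12 trailing elements of each batch entry at once. When axis is the last dimension, the coerced form and a per-axis softmax give the same result.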

Attributes

axis: Describes the axis of the inputs when coerced to 2D; defaults to one because the 0th axis most likely describes the batch_size. Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input). Default value is 1.

Inputs

input (heterogeneous) - T: The input tensor that’s coerced into a 2D matrix of size (NxD) as described above.

Outputs

output (heterogeneous) - T: The output values with the same shape as input tensor (the original size without coercion).

Type Constraints

T in ( tensor(double), tensor(float), tensor(float16) ): Constrain input and output types to float tensors.

Differences

Changes from Softmax - 1 to Softmax - 11:

Summary: the sentence stating that the input is a 2-D tensor (Tensor<float>) of size (batch_size x input_feature_dimensions) is removed. The statement that the output tensor has the same shape and contains the softmax values of the corresponding input moves to the end of the summary. "Input does not need" becomes "The input does not need".

axis: adds "Negative value means counting dimensions from the back. Accepted range is [-r, r-1] where r = rank(input)." The default value remains 1.

Inputs, Outputs, and Type Constraints are unchanged.

Softmax - 1

Version

name: Softmax (GitHub)
domain: main
since_version: 1
function: False
support_level: SupportType.COMMON
shape inference: True

This version of the operator has been available since version 1.

Summary

The operator computes the softmax (normalized exponential) values for each layer in the batch of the given input. The input is a 2-D tensor (Tensor<float>) of size (batch_size x input_feature_dimensions). The output tensor has the same shape and contains the softmax values of the corresponding input.

Input does not need to explicitly be a 2D vector; rather, it will be coerced into one. For an arbitrary n-dimensional input tensor with dimensions [a_0, a_1, …, a_{k-1}, a_k, …, a_{n-1}], where k is the axis provided, the input will be coerced into a 2-dimensional tensor with dimensions [a_0 * … * a_{k-1}, a_k * … * a_{n-1}]. For the default case where axis=1, this means the input tensor will be coerced into a 2D tensor of dimensions [a_0, a_1 * … * a_{n-1}], where a_0 is often the batch size. In this situation, we must have a_0 = N and a_1 * … * a_{n-1} = D. Each of these dimensions must be matched correctly, or else the operator will throw errors.
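As a worked example of the N and D arithmetic above, the following sketch computes the coerced 2D dimensions for a given input shape. The helper coerced_shape is hypothetical and assumes a non-negative axis, since this version does not describe negative values.

```python
import math


def coerced_shape(shape, axis=1):
    """Return (N, D): N = a_0 * ... * a_{k-1}, D = a_k * ... * a_{n-1}."""
    return math.prod(shape[:axis]), math.prod(shape[axis:])


print(coerced_shape((8, 3, 32, 32)))          # (8, 3072)
print(coerced_shape((8, 3, 32, 32), axis=2))  # (24, 1024)
print(coerced_shape((8, 10)))                 # (8, 10)
```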

Attributes

axis: Describes the axis of the inputs when coerced to 2D; defaults to one because the 0th axis most likely describes the batch_size. Default value is 1.

Inputs

input (heterogeneous) - T: The input tensor that’s coerced into a 2D matrix of size (NxD) as described above.

Outputs

output (heterogeneous) - T: The output values with the same shape as input tensor (the original size without coercion).

Type Constraints

T in ( tensor(double), tensor(float), tensor(float16) ): Constrain input and output types to float tensors.
