LabelEncoder

Domain: ai.onnx.ml
Since version: 4

Maps each element of the input tensor to another value.

The mapping is defined by two parallel attributes, one from the ‘keys*’ family and one from the ‘values*’ family. The i-th entry of the specified ‘keys*’ attribute is mapped to the i-th entry of the specified ‘values*’ attribute. Consequently, the input’s element type must be identical to the element type of the specified ‘keys*’ attribute, and the output’s element type is identical to that of the specified ‘values*’ attribute. Only one ‘keys*’ attribute and only one ‘values*’ attribute can be set, and the two must have the same length.

If an input element cannot be found in the specified ‘keys*’ attribute, the ‘default*’ attribute that matches the specified ‘values*’ attribute is used as its output value. The type of the ‘default*’ attribute must match the ‘values*’ attribute chosen.

Consider an example that maps a string tensor to an integer tensor. Assume ‘keys_strings’ is [“Amy”, “Sally”], ‘values_int64s’ is [5, 6], and ‘default_int64’ is -1. The input [“Dori”, “Amy”, “Amy”, “Sally”, “Sally”] would be mapped to [-1, 5, 5, 6, 6].

Since this operator is a one-to-one, elementwise mapping, its input and output shapes are the same. Float keys with value ‘NaN’ match any input ‘NaN’ value regardless of bit value. If a key is repeated, its last occurrence takes precedence.
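The lookup rule described above can be sketched in a few lines of plain Python. This is only an illustration of the semantics, not the operator's implementation. The function name `label_encode` and its parameters are hypothetical, and Python lists stand in for tensors.

```python
def label_encode(xs, keys, values, default):
    """Map each element of xs through the parallel keys/values lists."""
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    # Building the dict left to right means a repeated key keeps its last value.
    table = dict(zip(keys, values))
    # Elements missing from the table fall back to the default.
    return [table.get(x, default) for x in xs]


result = label_encode(
    ["Dori", "Amy", "Amy", "Sally", "Sally"],
    keys=["Amy", "Sally"],
    values=[5, 6],
    default=-1,
)
print(result)  # [-1, 5, 5, 6, 6]
```

The output list always has the same length as the input, which mirrors the statement that input and output shapes are the same.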

Inputs

X (T1): Input data. It must have the same element type as the keys* attribute that is set.

Outputs

Y (T2): Output data. This tensor’s element type is determined by the values* attribute that is set.

Type Constraints

T1: The input type is a tensor of any shape. Allowed types: tensor(double), tensor(float), tensor(int16), tensor(int32), tensor(int64), tensor(string).
T2: Output type is determined by the specified ‘values*’ attribute. Allowed types: tensor(double), tensor(float), tensor(int16), tensor(int32), tensor(int64), tensor(string).
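Because the input and output types follow from whichever ‘keys*’ and ‘values*’ attributes are set, it can help to check an attribute set before building a node. The sketch below assumes the attributes are held in a plain dict. The helper names `pick_one` and `resolve_mapping` are hypothetical, and the attribute name lists are taken from the examples on this page, with `values_strings` assumed by symmetry with `keys_strings`.

```python
KEY_ATTRS = ("keys_strings", "keys_int64s", "keys_floats", "keys_tensor")
# values_strings is assumed here; the other names appear in the examples.
VALUE_ATTRS = ("values_strings", "values_int64s", "values_floats", "values_tensor")


def pick_one(attrs, candidates, kind):
    """Return the single attribute name from candidates that is present in attrs."""
    present = [name for name in candidates if name in attrs]
    if len(present) != 1:
        raise ValueError(f"exactly one {kind}* attribute must be set, got {present}")
    return present[0]


def resolve_mapping(attrs):
    """Validate an attribute dict and return (keys_name, values_name)."""
    keys_name = pick_one(attrs, KEY_ATTRS, "keys")
    values_name = pick_one(attrs, VALUE_ATTRS, "values")
    if len(attrs[keys_name]) != len(attrs[values_name]):
        raise ValueError("keys* and values* must have the same length")
    return keys_name, values_name


attrs = {"keys_strings": ["a", "b", "c"], "values_int64s": [0, 1, 2], "default_int64": 42}
print(resolve_mapping(attrs))  # ('keys_strings', 'values_int64s')
```

The returned pair names the attributes that fix T1 and T2 for the node.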

Examples

test_ai_onnx_ml_label_encoder_string_int

Node:
  ai.onnx.ml.LabelEncoder(x) -> (y)
  Attributes:
    keys_strings = ['a', 'b', 'c']
    values_int64s = [0, 1, 2]
    default_int64 = 42

Inputs:
  x: shape=(5,), dtype=object
    ['a', 'b', 'd', 'c', 'g']

Outputs:
  y: shape=(5,), dtype=int64
    [ 0,  1, 42,  2, 42]

test_ai_onnx_ml_label_encoder_string_int_no_default

Node:
  ai.onnx.ml.LabelEncoder(x) -> (y)
  Attributes:
    keys_strings = ['a', 'b', 'c']
    values_int64s = [0, 1, 2]

Inputs:
  x: shape=(5,), dtype=object
    ['a', 'b', 'd', 'c', 'g']

Outputs:
  y: shape=(5,), dtype=int64
    [ 0,  1, -1,  2, -1]

test_ai_onnx_ml_label_encoder_tensor_mapping

Node:
  ai.onnx.ml.LabelEncoder(x) -> (y)
  Attributes:
    keys_tensor = <tensor>
    values_tensor = <tensor>
    default_tensor = <tensor>

Inputs:
  x: shape=(5,), dtype=object
    ['a', 'b', 'd', 'c', 'g']

Outputs:
  y: shape=(5,), dtype=int16
    [ 0,  1, 42,  2, 42]

test_ai_onnx_ml_label_encoder_tensor_value_only_mapping

Node:
  ai.onnx.ml.LabelEncoder(x) -> (y)
  Attributes:
    keys_strings = ['a', 'b', 'c']
    values_tensor = <tensor>
    default_tensor = <tensor>

Inputs:
  x: shape=(5,), dtype=object
    ['a', 'b', 'd', 'c', 'g']

Outputs:
  y: shape=(5,), dtype=int16
    [ 0,  1, 42,  2, 42]

test_cc_label_encoder_float_to_int64

Node:
  ai.onnx.ml.LabelEncoder(x) -> (y)
  Attributes:
    keys_floats = [1.0, 2.0, 3.0]
    values_int64s = [10, 20, 30]
    default_int64 = -1

Inputs:
  x: shape=(2, 2), dtype=float32
    [[1., 2.],
     [3., 9.]]

Outputs:
  y: shape=(2, 2), dtype=int64
    [[10, 20],
     [30, -1]]
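The float-key example above keeps the 2-D shape of its input. A plain-Python sketch of that behaviour, which also handles the NaN rule from the description, might look like the following. The names `encode_nested`, `_norm`, and `_NAN` are hypothetical, and nested lists stand in for tensors. A naive dict lookup would miss NaN keys because NaN does not compare equal to itself, so every NaN is replaced by a single sentinel first.

```python
import math

_NAN = object()  # single stand-in for every NaN, whatever its bit pattern


def _norm(x):
    """Replace any float NaN with the shared sentinel; leave other values alone."""
    return _NAN if isinstance(x, float) and math.isnan(x) else x


def encode_nested(x, keys, values, default):
    """Apply the mapping elementwise to arbitrarily nested lists, keeping the shape."""
    table = {_norm(k): v for k, v in zip(keys, values)}

    def walk(item):
        if isinstance(item, list):
            return [walk(sub) for sub in item]
        return table.get(_norm(item), default)

    return walk(x)


y = encode_nested([[1.0, 2.0], [3.0, 9.0]], [1.0, 2.0, 3.0], [10, 20, 30], -1)
print(y)  # [[10, 20], [30, -1]]

y_nan = encode_nested([float("nan"), 1.0], [float("nan"), 1.0], [7, 10], -1)
print(y_nan)  # [7, 10]
```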

test_cc_label_encoder_int64_to_float

Node:
  ai.onnx.ml.LabelEncoder(x) -> (y)
  Attributes:
    keys_int64s = [0, 1, 2]
    values_floats = [0.5, 1.5, 2.5]
    default_float = -1.0

Inputs:
  x: shape=(4,), dtype=int64
    [0, 1, 2, 7]

Outputs:
  y: shape=(4,), dtype=float32
    [ 0.5,  1.5,  2.5, -1. ]