LabelEncoder - 1 vs 2

LabelEncoder1 → LabelEncoder2 +39 -24

@@ -1 +1 @@
- Converts strings to integers and vice versa.
+ Maps each element in the input tensor to another value.
+ The mapping is determined by the two parallel attributes, 'keys_*' and
+ 'values_*'. The i-th value in the specified 'keys_*' attribute
+ would be mapped to the i-th value in the specified 'values_*' attribute. It
- If the string default value is set, it will convert integers to strings.
+ implies that input's element type and the element type of the specified
- If the int default value is set, it will convert strings to integers.
+ 'keys_*' should be identical while the output type is identical to the
+ specified 'values_*' attribute. If an input element can not be found in the
+ specified 'keys_*' attribute, the 'default_*' that matches the specified
+ 'values_*' attribute may be used as its output value.
- Each operator converts either integers to strings or strings to integers, depending
+ Let's consider an example which maps a string tensor to an integer tensor.
+ Assume 'keys_strings' is ["Amy", "Sally"], 'values_int64s' is [5, 6],
+ and 'default_int64' is '-1'.  The input ["Dori", "Amy", "Amy", "Sally",
- on which default value attribute is provided. Only one default value attribute
+ "Sally"] would be mapped to [-1, 5, 5, 6, 6].
- should be defined.
- When converting from integers to strings, the string is fetched from the
+ Since this operator is a one-to-one mapping, its input and output shapes
- 'classes_strings' list, by simple indexing.
+ are the same. Notice that only one 'keys_*' attribute and only one 'values_*' attribute can be set.
- When converting from strings to integers, the string is looked up in the list
- and the index at which it is found is used as the converted value.
+ For key look-up, bit-wise comparison is used so even a float NaN can be
+ mapped to a value in 'values_*' attribute.
  **Attributes**
- * **classes_strings**:
+ * **default_float**:
-   A list of labels.
+   A float.
  * **default_int64**:
+   An integer.
-   An integer to use when an input string value is not found in the
-   map.<br>One and only one of the 'default_*' attributes must be
-   defined.
  * **default_string**:
+   A string.
+ * **keys_floats**:
+   A list of floats.
+ * **keys_int64s**:
+   A list of ints.
+ * **keys_strings**:
-   A string to use when an input integer value is not found in the
+   A list of strings. One and only one of 'keys_*'s should be set.
+ * **values_floats**:
+   A list of floats.
+ * **values_int64s**:
+   A list of ints.
+ * **values_strings**:
-   map.<br>One and only one of the 'default_*' attributes must be
+   A list of strings. One and only one of 'values_*'s should be set.
-   defined.
  **Inputs**
  * **X** (heterogeneous) - **T1**:
-   Input data.
+   Input data. It can be either tensor or scalar.
  **Outputs**
  * **Y** (heterogeneous) - **T2**:
+   Output data.
-   Output data. If strings are input, the output values are integers,
-   and vice versa.
  **Type Constraints**
  * **T1** in (
+   tensor(float),
    tensor(int64),
    tensor(string)
    ):
+   The input type is a tensor of any shape.
-   The input type must be a tensor of integers or strings, of any
-   shape.
  * **T2** in (
+   tensor(float),
    tensor(int64),
    tensor(string)
    ):
+   Output type is determined by the specified 'values_*' attribute.
-   The output type will be a tensor of strings or integers, and will
-   have the same shape as the input.
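
The mapping described in the version 2 text above can be illustrated with a short pure-Python sketch. It is not the ONNX runtime implementation: `label_encode` and `_key` are hypothetical helper names, nested lists stand in for tensors, and the bit-wise key look-up is approximated by comparing packed float32 bytes (an assumption about how a NaN key could be matched).

```python
import struct


def _key(item):
    # Assumption: floats are compared by their float32 bit pattern,
    # so a NaN input can match a NaN key (NaN != NaN under ==).
    if isinstance(item, float):
        return struct.pack("<f", item)
    return item


def label_encode(x, keys, values, default):
    """Map each element of x via parallel keys/values lists.

    The i-th key maps to the i-th value; elements that are not found
    map to `default`. The (nested list) shape of x is preserved.
    """
    table = {_key(k): v for k, v in zip(keys, values)}

    def encode(item):
        if isinstance(item, list):
            return [encode(sub) for sub in item]
        return table.get(_key(item), default)

    return encode(x)


# The string-to-integer example from the description.
print(label_encode(["Dori", "Amy", "Amy", "Sally", "Sally"],
                   keys=["Amy", "Sally"], values=[5, 6], default=-1))
# [-1, 5, 5, 6, 6]

# A float NaN key mapped to a string value.
print(label_encode([1.0, float("nan")],
                   keys=[float("nan")], values=["missing"], default="other"))
# ['other', 'missing']
```

The dictionary is built once per call, so each look-up is constant time. NaN values with different bit payloads would not match each other under this scheme.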