LabelEncoder - 1 vs 2

Files changed (1)
  1. LabelEncoder1 → LabelEncoder2 +39 -24
LabelEncoder1 → LabelEncoder2 RENAMED
@@ -1 +1 @@
1
- Converts strings to integers and vice versa.
1
+ Maps each element in the input tensor to another value.
2
+ The mapping is determined by the two parallel attributes, 'keys_*' and
3
+ 'values_*'. The i-th value in the specified 'keys_*' attribute
4
+ would be mapped to the i-th value in the specified 'values_*' attribute. It
2
- If the string default value is set, it will convert integers to strings.
5
+ implies that input's element type and the element type of the specified
3
- If the int default value is set, it will convert strings to integers.
6
+ 'keys_*' should be identical while the output type is identical to the
7
+ specified 'values_*' attribute. If an input element cannot be found in the
8
+ specified 'keys_*' attribute, the 'default_*' that matches the specified
9
+ 'values_*' attribute may be used as its output value.
4
- Each operator converts either integers to strings or strings to integers, depending
10
+ Let's consider an example which maps a string tensor to an integer tensor.
11
+ Assume 'keys_strings' is ["Amy", "Sally"], 'values_int64s' is [5, 6],
12
+ and 'default_int64' is '-1'. The input ["Dori", "Amy", "Amy", "Sally",
5
- on which default value attribute is provided. Only one default value attribute
13
+ "Sally"] would be mapped to [-1, 5, 5, 6, 6].
6
- should be defined.
7
- When converting from integers to strings, the string is fetched from the
14
+ Since this operator is a one-to-one mapping, its input and output shapes
8
- 'classes_strings' list, by simple indexing.
15
+ are the same. Notice that only one 'keys_*' attribute and only one 'values_*' attribute can be set.
9
- When converting from strings to integers, the string is looked up in the list
10
- and the index at which it is found is used as the converted value.
16
+ For key look-up, bit-wise comparison is used so even a float NaN can be
17
+ mapped to a value in the 'values_*' attribute.
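The bit-wise key look-up mentioned in the version 2 description can be illustrated with a minimal Python sketch. It is not taken from any ONNX implementation: the helper names `float_bits` and `lookup`, the sample attribute values, and the choice of a single-precision bit pattern are all assumptions made for illustration.

```python
import struct


def float_bits(x: float) -> bytes:
    """Return the IEEE 754 single-precision bit pattern of x as 4 bytes."""
    return struct.pack("<f", x)


# Hypothetical attribute values, chosen only for this illustration.
keys_floats = [1.0, float("nan")]
values_strings = ["one", "not-a-number"]
default_string = "unknown"

# Index the keys by their bit pattern rather than by float equality,
# because NaN never compares equal to itself under ordinary equality.
table = {float_bits(k): v for k, v in zip(keys_floats, values_strings)}


def lookup(x: float) -> str:
    """Map x to its value, falling back to the default for unknown keys."""
    return table.get(float_bits(x), default_string)


print(lookup(float("nan")))  # the NaN key is found through its bit pattern
print(lookup(2.0))           # not among the keys, so the default is used
```

Under a strictly bit-wise comparison such as this one, two NaNs with different payload bits would be treated as different keys; whether a given runtime behaves that way is not stated in the text above.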
11
18
  **Attributes**
12
- * **classes_strings**:
19
+ * **default_float**:
13
- A list of labels.
20
+ A float.
14
21
  * **default_int64**:
22
+ An integer.
15
- An integer to use when an input string value is not found in the
16
- map.<br>One and only one of the 'default_*' attributes must be
17
- defined.
18
23
  * **default_string**:
24
+ A string.
25
+ * **keys_floats**:
26
+ A list of floats.
27
+ * **keys_int64s**:
28
+ A list of ints.
29
+ * **keys_strings**:
19
- A string to use when an input integer value is not found in the
30
+ A list of strings. One and only one of 'keys_*'s should be set.
31
+ * **values_floats**:
32
+ A list of floats.
33
+ * **values_int64s**:
34
+ A list of ints.
35
+ * **values_strings**:
20
- map.<br>One and only one of the 'default_*' attributes must be
36
+ A list of strings. One and only one of 'values_*'s should be set.
21
- defined.
22
37
  **Inputs**
23
38
  * **X** (heterogeneous) - **T1**:
24
- Input data.
39
+ Input data. It can be either tensor or scalar.
25
40
  **Outputs**
26
41
  * **Y** (heterogeneous) - **T2**:
42
+ Output data.
27
- Output data. If strings are input, the output values are integers,
28
- and vice versa.
29
43
  **Type Constraints**
30
44
  * **T1** in (
45
+ tensor(float),
31
46
  tensor(int64),
32
47
  tensor(string)
33
48
  ):
49
+ The input type is a tensor of any shape.
34
- The input type must be a tensor of integers or strings, of any
35
- shape.
36
50
  * **T2** in (
51
+ tensor(float),
37
52
  tensor(int64),
38
53
  tensor(string)
39
54
  ):
55
+ Output type is determined by the specified 'values_*' attribute.
- The output type will be a tensor of strings or integers, and will
40
- have the same shape as the input.
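The version 2 semantics described in this diff can be sketched in plain Python as a dictionary look-up with a default. This is only an illustrative sketch, not the operator's actual implementation: the function name `label_encode` is hypothetical, it handles a flat list rather than a tensor of arbitrary shape, and resolving duplicate keys by keeping the last one is an assumption the text does not address.

```python
from typing import Hashable, List, Sequence, TypeVar

V = TypeVar("V")


def label_encode(
    xs: Sequence[Hashable],
    keys: Sequence[Hashable],
    values: Sequence[V],
    default: V,
) -> List[V]:
    """Map each element of xs through parallel keys/values lists.

    The i-th key maps to the i-th value; elements that are not among
    the keys map to `default`. The output has the same length as xs.
    """
    if len(keys) != len(values):
        raise ValueError("keys and values must have the same length")
    table = dict(zip(keys, values))
    return [table.get(x, default) for x in xs]


# The example from the version 2 description:
# 'keys_strings', 'values_int64s' and 'default_int64'.
result = label_encode(
    ["Dori", "Amy", "Amy", "Sally", "Sally"],
    keys=["Amy", "Sally"],
    values=[5, 6],
    default=-1,
)
print(result)  # [-1, 5, 5, 6, 6]
```

The element type of the result follows the `values` list, which mirrors the statement that the output type is determined by the specified 'values_*' attribute.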