NegativeLogLikelihoodLoss - 12 vs 13

NegativeLogLikelihoodLoss12 → NegativeLogLikelihoodLoss13 RENAMED
@@ -1 +1 @@
  A NegativeLogLikelihoodLoss operator computes (weighted) negative log likelihood loss.
  Its "input" tensor has the shape of (N, C, d1, d2, ..., dk) where k >= 0.
  The "input" tensor contains log-probabilities for input[n, :, d_1, d_2,..., d_k] being in a class of [0, C).
  The operator's "target" input tensor has the shape of (N, d1, d2, ..., dk). It encodes class labels (one of C classes)
  or it may contain a special value (indicated by an attribute ignore_index) for N x d1 x d2 x ... x dk samples.
  The loss value for input[n, :, d_1, d_2,...d_k] being classified as class c = target[n][d_1][d_2]...[d_k] is computed as:
+
      loss[n][d_1][d_2]...[d_k] = -input[n][c][d_1][d_2]...[d_k].
+
  When an optional "weight" is provided, the sample loss is calculated as:
+
      loss[n][d_1][d_2]...[d_k] = -input[n][c][d_1][d_2]...[d_k] * weight[c].
+
  loss is zero for the case when the target value equals ignore_index:
      loss[n][d_1][d_2]...[d_k] = 0, when target[n][d_1][d_2]...[d_k] = ignore_index
+
  If "reduction" attribute is set to "none", the operator's output will be the above loss with shape (N, d1, d2, ..., dk).
  If "reduction" attribute is set to "mean" (the default attribute value), the output loss is (weight) averaged:
+
      mean(loss), if "weight" is not provided,
+
  or if weight is provided,
+
      sum(loss) / sum(weight[target[n][d_1][d_2]...[d_k]]), for all samples.
+
  If "reduction" attribute is set to "sum", the output is a scalar:
      sum(loss).
+
  See also https://pytorch.org/docs/stable/nn.html#torch.nn.NLLLoss.
+
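The sketch below is not part of the operator definition: it is a minimal NumPy transcription of the computation just described, with an illustrative function name and signature. It assumes `x` already holds log-probabilities of shape (N, C, d1, ..., dk) and `target` holds class indices of shape (N, d1, ..., dk).

```python
# Minimal NumPy sketch of the NegativeLogLikelihoodLoss computation described above.
# Illustrative only: names and signature are not part of the ONNX specification.
import numpy as np

def nll_loss(x, target, weight=None, ignore_index=None, reduction="mean"):
    # x: log-probabilities, shape (N, C, d1, ..., dk); target: class ids, shape (N, d1, ..., dk)
    x = np.asarray(x)
    target = np.asarray(target)
    C = x.shape[1]
    w = np.ones(C, dtype=x.dtype) if weight is None else np.asarray(weight, dtype=x.dtype)

    keep = np.ones(target.shape, dtype=bool) if ignore_index is None else (target != ignore_index)
    safe = np.where(keep, target, 0)            # avoid out-of-range gather on ignored entries

    # loss[n][d_1]...[d_k] = -x[n][c][d_1]...[d_k] * w[c], with c = target[n][d_1]...[d_k]
    gathered = np.take_along_axis(x, np.expand_dims(safe, axis=1), axis=1)
    loss = -np.squeeze(gathered, axis=1) * w[safe]
    loss = np.where(keep, loss, 0.0)            # ignored samples contribute zero loss

    if reduction == "none":
        return loss
    if reduction == "sum":
        return loss.sum()
    # "mean": sum of losses divided by the sum of the weights actually applied
    return loss.sum() / np.where(keep, w[safe], 0.0).sum()

# Matches Example 1 below: prints [[-3. -2.], [-0. -2.]]
x = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
     [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
print(nll_loss(x, [[2, 1], [0, 2]], reduction="none"))
```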
  Example 1:
+
      // negative log likelihood loss, "none" reduction
      N, C, d1 = 2, 3, 2
      input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
               [[0.0, 1.0], [2.0, 2.0], [1.0, 2]]]
      target = [[2, 1], [0, 2]]
+
      loss = np.zeros((N, d1))
      for n in range(N):
          for d_1 in range(d1):
              c = target[n][d_1]
              loss[n][d_1] = -input[n][c][d_1]
+
      // print(loss)
      // [[-3. -2.]
      //  [-0. -2.]]
+
  Example 2:
+
      // weighted negative log likelihood loss, sum reduction
      N, C, d1 = 2, 3, 2
      input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
               [[0.0, 1.0], [2.0, 2.0], [1.0, 2]]]
      target = [[2, 1], [0, 2]]
      weight = [0.2, 0.3, 0.1]
      loss = np.zeros((N, d1))
      for n in range(N):
          for d_1 in range(d1):
              c = target[n][d_1]
              loss[n][d_1] = -input[n][c][d_1] * weight[c]
+
      loss = np.sum(loss)
      // print(loss)
      // -1.1
+
  Example 3:
+
      // weighted negative log likelihood loss, mean reduction
      N, C, d1 = 2, 3, 2
      input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
               [[0.0, 1.0], [2.0, 2.0], [1.0, 2]]]
      target = [[2, 1], [0, 2]]
      weight = [0.2, 0.3, 0.1]
      loss = np.zeros((N, d1))
      weight_total = 0
      for n in range(N):
          for d_1 in range(d1):
              c = target[n][d_1]
              loss[n][d_1] = -input[n][c][d_1] * weight[c]
              weight_total = weight_total + weight[c]
+
      loss = np.sum(loss) / weight_total
      // print(loss)
      // -1.57
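The examples above use C-style // comments and an implicit NumPy import, so they are not directly executable. Below is a runnable transcription of Example 3 (illustrative only; it assumes only NumPy is installed):

```python
# Runnable transcription of Example 3 (weighted negative log likelihood loss, "mean" reduction).
import numpy as np

N, C, d1 = 2, 3, 2
x = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],      # named x to avoid shadowing Python's built-in input()
     [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
target = [[2, 1], [0, 2]]
weight = [0.2, 0.3, 0.1]

loss = np.zeros((N, d1))
weight_total = 0.0
for n in range(N):
    for d_1 in range(d1):
        c = target[n][d_1]
        loss[n][d_1] = -x[n][c][d_1] * weight[c]
        weight_total += weight[c]

print(np.sum(loss) / weight_total)   # ~ -1.5714, which the spec rounds to -1.57
```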
  **Attributes**

  * **ignore_index**:
    Specifies a target value that is ignored and does not contribute to
    the input gradient. It's an optional value.
  * **reduction**:
    Type of reduction to apply to loss: none, sum, mean (default).
    'none': the output is the loss for each sample. 'sum': the output
    will be summed. 'mean': the sum of the output will be divided by the
    sum of applied weights.

  **Inputs**

  Between 2 and 3 inputs.

  * **input** (heterogeneous) - **T**:
    Input tensor of shape (N, C) or (N, C, d1, d2, ..., dk).
  * **target** (heterogeneous) - **Tind**:
    Target tensor of shape (N) or (N, d1, d2, ..., dk). Target element
    values shall be in the range [0, C). If ignore_index is specified, it
    may have a value outside [0, C), and the target values should either
    be in the range [0, C) or have the value ignore_index.
  * **weight** (optional, heterogeneous) - **T**:
    Optional rescaling weight tensor. If given, it has to be a tensor of
    size C. Otherwise, it is treated as if having all ones.

  **Outputs**

  * **loss** (heterogeneous) - **T**:
    The negative log likelihood loss.

  **Type Constraints**

  * **T** in (
    tensor(double),
    tensor(float),
    tensor(float16)
    ):
    Constrain input, weight, and output types to floating-point tensors.
  * **Tind** in (
    tensor(int32),
    tensor(int64)
    ):
    Constrain target to integer types.
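Neither version of the operator text shows how to invoke the operator from a model, so the sketch below is an addition rather than part of the specification. It assumes the onnx and onnxruntime Python packages are installed, builds a single-node opset-13 graph, and reproduces Example 1.

```python
# Sketch (not from the spec): a one-node NegativeLogLikelihoodLoss model at opset 13,
# evaluated with onnxruntime on the data of Example 1. Assumes onnx and onnxruntime are installed.
import numpy as np
import onnx
from onnx import TensorProto, helper
import onnxruntime as ort

node = helper.make_node(
    "NegativeLogLikelihoodLoss",
    inputs=["input", "target"],
    outputs=["loss"],
    reduction="none",                      # keep the per-sample loss of shape (N, d1)
)
graph = helper.make_graph(
    [node],
    "nllloss_example",
    inputs=[
        helper.make_tensor_value_info("input", TensorProto.FLOAT, [2, 3, 2]),
        helper.make_tensor_value_info("target", TensorProto.INT64, [2, 2]),
    ],
    outputs=[helper.make_tensor_value_info("loss", TensorProto.FLOAT, [2, 2])],
)
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 13)])
onnx.checker.check_model(model)

sess = ort.InferenceSession(model.SerializeToString(), providers=["CPUExecutionProvider"])
x = np.array([[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
              [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]], dtype=np.float32)
t = np.array([[2, 1], [0, 2]], dtype=np.int64)
print(sess.run(None, {"input": x, "target": t})[0])    # [[-3. -2.] [-0. -2.]], as in Example 1
```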