NegativeLogLikelihoodLoss - 12 vs 13

The next section compares an older version of this operator with the newer one, after both definitions have been converted to markdown text. Green highlights additions in the newer version, red highlights deletions, and everything else is unchanged.

NegativeLogLikelihoodLoss12 → NegativeLogLikelihoodLoss13 RENAMED
@@ -1 +1 @@
A NegativeLogLikelihoodLoss operator computes (weighted) negative log likelihood loss.
Its "input" tensor has the shape of (N, C, d1, d2, ..., dk) where k >= 0.
The "input" tensor contains log-probabilities for input[n, :, d_1, d_2, ..., d_k] being in a class of [0, C).
The operator's "target" input tensor has the shape of (N, d1, d2, ..., dk). It encodes class labels (one of C classes),
or it may contain a special value (indicated by the attribute ignore_index) for N x d1 x d2 x ... x dk samples.
The loss value for input[n, :, d_1, d_2, ..., d_k] being classified as class c = target[n][d_1][d_2]...[d_k] is computed as:

    loss[n][d_1][d_2]...[d_k] = -input[n][c][d_1][d_2]...[d_k].

When an optional "weight" is provided, the sample loss is calculated as:

    loss[n][d_1][d_2]...[d_k] = -input[n][c][d_1][d_2]...[d_k] * weight[c].

The loss is zero when the target value equals ignore_index:

    loss[n][d_1][d_2]...[d_k] = 0, when target[n][d_1][d_2]...[d_k] = ignore_index
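The effect of ignore_index can be made concrete with a small sketch (plain Python; the value -1 for ignore_index is an illustrative choice, not one mandated by the spec):

```python
# Samples whose target equals ignore_index contribute zero loss.
input = [[-0.5, -1.2, -2.0],   # log-probabilities, shape (N, C) with N = 2, C = 3
         [-0.1, -2.3, -3.0]]
target = [1, -1]               # the second sample is ignored
ignore_index = -1

loss = [0.0 if t == ignore_index else -input[n][t]
        for n, t in enumerate(target)]
# loss == [1.2, 0.0]
```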
If the "reduction" attribute is set to "none", the operator's output will be the above loss with shape (N, d1, d2, ..., dk).
If "reduction" is set to "mean" (the default attribute value), the output loss is (weight) averaged:

    mean(loss), if "weight" is not provided,

or, if "weight" is provided,

    sum(loss) / sum(weight[target[n][d_1][d_2]...[d_k]]), for all samples.

If "reduction" is set to "sum", the output is a scalar: sum(loss).

See also https://pytorch.org/docs/stable/nn.html#torch.nn.NLLLoss.
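Since "input" must already contain log-probabilities, it is typically fed from a log-softmax over raw scores, mirroring the LogSoftmax + NLLLoss pairing linked above. A minimal numpy sketch of that pairing (the values are chosen for illustration):

```python
import numpy as np

# Convert raw scores ("logits") into log-probabilities with a
# log-softmax, then gather the per-sample negative log likelihood.
# Shapes: (N, C) with N = 1, C = 3.
logits = np.array([[2.0, 1.0, 0.1]])
log_probs = logits - np.log(np.sum(np.exp(logits), axis=1, keepdims=True))

target = np.array([0])
# Per-sample loss ("none" reduction): -log_probs[n, target[n]]
nll = -log_probs[np.arange(len(target)), target]
```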
Example 1:

    import numpy as np

    # negative log likelihood loss, "none" reduction
    N, C, d1 = 2, 3, 2
    input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
             [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
    target = [[2, 1], [0, 2]]

    loss = np.zeros((N, d1))
    for n in range(N):
        for d_1 in range(d1):
            c = target[n][d_1]
            loss[n][d_1] = -input[n][c][d_1]

    # print(loss)
    # [[-3. -2.]
    #  [-0. -2.]]
Example 2:

    import numpy as np

    # weighted negative log likelihood loss, sum reduction
    N, C, d1 = 2, 3, 2
    input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
             [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
    target = [[2, 1], [0, 2]]
    weight = [0.2, 0.3, 0.1]

    loss = np.zeros((N, d1))
    for n in range(N):
        for d_1 in range(d1):
            c = target[n][d_1]
            loss[n][d_1] = -input[n][c][d_1] * weight[c]

    loss = np.sum(loss)
    # print(loss)
    # -1.1
Example 3:

    import numpy as np

    # weighted negative log likelihood loss, mean reduction
    N, C, d1 = 2, 3, 2
    input = [[[1.0, 2.0], [2.0, 2.0], [3.0, 2.0]],
             [[0.0, 1.0], [2.0, 2.0], [1.0, 2.0]]]
    target = [[2, 1], [0, 2]]
    weight = [0.2, 0.3, 0.1]

    loss = np.zeros((N, d1))
    weight_total = 0
    for n in range(N):
        for d_1 in range(d1):
            c = target[n][d_1]
            loss[n][d_1] = -input[n][c][d_1] * weight[c]
            weight_total = weight_total + weight[c]

    loss = np.sum(loss) / weight_total
    # print(loss)
    # -1.57
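The three examples follow one pattern; consolidating them gives a reference sketch of the full semantics. The helper name `nll_loss` below is hypothetical (for illustration only), not a function provided by the onnx package:

```python
import numpy as np

# Reference sketch covering "weight", "ignore_index", and all three
# "reduction" modes for input of shape (N, C, d1, ..., dk).
def nll_loss(input, target, weight=None, ignore_index=None, reduction="mean"):
    input = np.asarray(input, dtype=np.float64)    # (N, C, d1, ..., dk)
    target = np.asarray(target)                    # (N, d1, ..., dk)
    ignored = np.zeros(target.shape, bool) if ignore_index is None \
        else target == ignore_index
    safe_target = np.where(ignored, 0, target)     # keep indices in [0, C)
    # loss[n, d...] = -input[n, target[n, d...], d...]
    loss = -np.take_along_axis(input, np.expand_dims(safe_target, 1),
                               axis=1).squeeze(axis=1)
    # Applied weights: weight[c] per sample, zero for ignored samples.
    w = np.ones(loss.shape) if weight is None else np.asarray(weight)[safe_target]
    w = np.where(ignored, 0.0, w)
    loss = loss * w
    if reduction == "none":
        return loss
    if reduction == "sum":
        return loss.sum()
    return loss.sum() / w.sum()                    # "mean": weighted average
```

Run against the inputs of Examples 1-3 above, this sketch reproduces their printed results.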
  **Attributes**
* **ignore_index**:
  Specifies a target value that is ignored and does not contribute to
  the input gradient. It is an optional value.
* **reduction**:
  Type of reduction to apply to the loss: none, sum, mean (default).
  'none': the output is the loss for each sample. 'sum': the output
  will be summed. 'mean': the sum of the output will be divided by the
  sum of applied weights.

**Inputs**

Between 2 and 3 inputs.

* **input** (heterogeneous) - **T**:
  Input tensor of shape (N, C) or (N, C, d1, d2, ..., dk).
* **target** (heterogeneous) - **Tind**:
  Target tensor of shape (N) or (N, d1, d2, ..., dk). Target element
  values shall be in the range [0, C). If ignore_index is specified, it
  may have a value outside [0, C), and the target values should either
  be in the range [0, C) or equal ignore_index.
* **weight** (optional, heterogeneous) - **T**:
  Optional rescaling weight tensor. If given, it has to be a tensor of
  size C. Otherwise, it is treated as if having all ones.

**Outputs**

* **loss** (heterogeneous) - **T**:
  The negative log likelihood loss.

**Type Constraints**

* **T** in (
  tensor(double),
  tensor(float),
  tensor(float16)
  ):
  Constrain input, weight, and output types to floating-point tensors.
* **Tind** in (
  tensor(int32),
  tensor(int64)
  ):
  Constrain target to integer types.