pred and label can have arbitrary shape as long as they have the same
number of elements.

Parameters

weight (float or None) – Global scalar weight for loss.

batch_axis (int, default 0) – The axis that represents mini-batch.

Inputs:

pred: prediction tensor with arbitrary shape.

label: target tensor with the same size as pred.

sample_weight: element-wise weighting tensor. Must be broadcastable
to the same shape as pred. For example, if pred has shape (64, 10)
and you want to weigh each sample in the batch separately,
sample_weight should have shape (64, 1).
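
The broadcasting requirement on sample_weight can be checked with a small sketch. It uses NumPy rather than the library's own tensor type, on the assumption that the same broadcasting rules apply. The array values are made up for illustration.

```python
import numpy as np

# Stand-in for an element-wise loss with the same shape as pred: (64, 10).
elementwise = np.ones((64, 10))

# One weight per sample, shaped (64, 1) so it broadcasts across axis 1.
sample_weight = np.arange(64, dtype=float).reshape(64, 1)
weighted = elementwise * sample_weight  # row i is scaled by sample_weight[i]

# A flat (64,) vector does not broadcast against (64, 10).
try:
    np.broadcast_to(np.ones(64), (64, 10))
    flat_weight_ok = True
except ValueError:
    flat_weight_ok = False
```

The trailing axis of length 1 is what lets each per-sample weight stretch across the 10 elements of its row.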

Outputs:

loss: loss tensor with shape (batch_size,). Dimensions other than
batch_axis are averaged out.
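
As a rough illustration of how the parameters and outputs fit together, the sketch below applies sample_weight and weight to an element-wise loss and then averages over every axis except batch_axis. It is written in NumPy, not the library's own API. The helper name reduce_loss is hypothetical, squared error is an assumed element-wise loss, and the order in which the weights are applied is also an assumption.

```python
import numpy as np

def reduce_loss(elementwise, weight=None, sample_weight=None, batch_axis=0):
    """Hypothetical helper: weight an element-wise loss, then average
    over all axes except batch_axis."""
    loss = np.asarray(elementwise, dtype=float)
    if sample_weight is not None:
        loss = loss * sample_weight   # element-wise, via broadcasting
    if weight is not None:
        loss = loss * weight          # global scalar weight
    axes = tuple(i for i in range(loss.ndim) if i != batch_axis)
    return loss.mean(axis=axes)

pred = np.zeros((4, 3))
label = np.ones((4, 3))
elementwise = (pred - label) ** 2     # assumed loss: squared error, all ones here

sample_weight = np.array([[1.0], [2.0], [0.0], [0.5]])  # shape (4, 1)
loss = reduce_loss(elementwise, weight=0.5, sample_weight=sample_weight)
# loss has shape (4,): one value per sample in the batch
```

With batch_axis=0 and an input of shape (4, 3), only axis 1 is averaged, so the result has one entry per sample.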