How to write a customized symbol loss?

Question: How do I write a customized symbol loss? What is the relationship between a customized loss and a metric?

Must I write a backward function for a customized symbol loss? I think that for Gluon we only need to write the forward function for a customized loss, right?

Any advice will be appreciated, thanks!

If you use the Symbol API, then you can use MakeLoss: https://mxnet.incubator.apache.org/versions/master/api/python/symbol/symbol.html#mxnet.symbol.MakeLoss
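For example, a minimal sketch (the network and variable names here are just placeholders):

import mxnet as mx

data = mx.sym.Variable('data')
label = mx.sym.Variable('label')
pred = mx.sym.FullyConnected(data=data, num_hidden=1, name='pred')
# any symbolic expression can be wrapped in MakeLoss; MXNet treats its output
# as the objective and backpropagates through it automatically
loss = mx.sym.MakeLoss(mx.sym.abs(pred - label), name='l1_loss')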
If you use Gluon, then you can just create a new HybridBlock and define the loss computation within the hybrid_forward. For instance:

from mxnet.gluon.loss import Loss

class MyLoss(Loss):
    def __init__(self, weight=None, batch_axis=0, **kwargs):
        super(MyLoss, self).__init__(weight, batch_axis, **kwargs)

    def hybrid_forward(self, F, pred, label):
        # e.g. a simple L1-style loss; F is mx.nd, or mx.sym when hybridized
        myloss = F.abs(pred - label)
        return F.mean(myloss, axis=self._batch_axis, exclude=True)
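
Using it is then the same as using any built-in Gluon loss; net, x and y below are hypothetical:

import mxnet as mx
from mxnet import autograd

loss_fn = MyLoss()
with autograd.record():
    pred = net(x)           # hypothetical network and input batch
    l = loss_fn(pred, y)    # y is the hypothetical label batch
l.backward()                # autograd computes the gradients, no manual backward needed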

Thanks for your reply. If I use MakeLoss, I don’t need to write a backward function, right? My customized loss needs to compute IoU, density and other things, so how can I define it? Thank you very much!

No, you don’t need to write your own backward, because the loss is a terminal node in the computation graph.

Here is a full example of how to create a custom loss with the Symbol API: https://stackoverflow.com/questions/45809154/mxnet-custom-loss-function-and-eval-metric Notice that the code uses both the label and the softmax (which is a transformed output of the network).
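The rough shape of that pattern, as a hedged sketch rather than the exact code from the link, is a hand-written loss wrapped in MakeLoss and grouped with a BlockGrad'ed copy of the softmax so the predictions remain available as an output:

import mxnet as mx

data = mx.sym.Variable('data')
label = mx.sym.Variable('label')              # one-hot labels in this sketch
fc = mx.sym.FullyConnected(data=data, num_hidden=10)
softmax = mx.sym.softmax(fc)
# hand-written cross-entropy used as the training objective
ce = -mx.sym.sum(label * mx.sym.log(softmax + 1e-8), axis=1)
loss = mx.sym.MakeLoss(ce)
# BlockGrad stops gradients, so the softmax is exposed for prediction and
# evaluation without contributing to backpropagation
out = mx.sym.Group([mx.sym.BlockGrad(softmax), loss])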

You may want to try and write the loss yourself by taking this code as an example:
https://www.kaggle.com/c/data-science-bowl-2018/discussion/51553 - Keras version
Or maybe from gluon-cv: https://gluon-cv.mxnet.io/_modules/gluoncv/utils/bbox.html#bbox_iou
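
If you only need a differentiable IoU term for masks, a soft-IoU (Jaccard) loss in Gluon could look roughly like this sketch (not taken from either link above):

from mxnet.gluon.loss import Loss

class SoftIoULoss(Loss):
    # hypothetical soft-IoU loss; pred and label are masks with values in [0, 1]
    def __init__(self, weight=None, batch_axis=0, **kwargs):
        super(SoftIoULoss, self).__init__(weight, batch_axis, **kwargs)

    def hybrid_forward(self, F, pred, label):
        # sum over every axis except the batch axis
        inter = F.sum(pred * label, axis=self._batch_axis, exclude=True)
        union = F.sum(pred + label - pred * label, axis=self._batch_axis, exclude=True)
        iou = inter / (union + 1e-7)
        return 1.0 - iou      # minimizing 1 - IoU maximizes the overlap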

Answering your first question: What is the relationship between customized loss and metric? There is no direct relationship between the two. The loss is a number that the DL algorithm tries to minimize, so it expresses the error of the algorithm as an abstract value. A metric is something that makes sense for the meaning of the task and is easy for a human to interpret.
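For example, a metric in MXNet can be just a plain function evaluated on labels and predictions via mx.metric.CustomMetric (the pixel_accuracy function below is made up for illustration):

import numpy as np
import mxnet as mx

def pixel_accuracy(label, pred):
    # plain NumPy on the labels and predictions; this value is only
    # reported during training, it is never differentiated
    return float(np.mean((pred > 0.5) == (label > 0.5)))

acc = mx.metric.CustomMetric(pixel_accuracy, name='pixel_acc')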

It is very helpful for me. Thank you very much!

Do you have any examples of writing a customized activation function that needs gradients? Any advice will be appreciated, thanks!
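
One possible approach is mx.autograd.Function, which lets you define both the forward computation and its gradient by hand; a minimal sketch, using a Swish-style activation only as a stand-in:

import mxnet as mx

class Swish(mx.autograd.Function):
    def forward(self, x):
        s = mx.nd.sigmoid(x)
        self.save_for_backward(x, s)
        return x * s

    def backward(self, dy):
        x, s = self.saved_tensors
        # d/dx [x * sigmoid(x)] = sigmoid(x) + x * sigmoid(x) * (1 - sigmoid(x))
        return dy * (s + x * s * (1 - s))

x = mx.nd.random.normal(shape=(2, 3))
x.attach_grad()
with mx.autograd.record():
    y = Swish()(x)
y.backward()    # x.grad now holds the hand-written gradient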