I’m training a multi-label classification model using the C++ API.
Let’s say I have N labels. Then each sample may have any combination of labels.
sample1, 0,1,0,0,1,1   # 1 - label is assigned, 0 - not
I use LogisticRegressionOutput as my final layer (I believe it uses a cross-entropy loss function).
Everything seems to be working fine on my initial tests when labels have only 0/1 values.
In my case, there can also be situations where we don’t know the label for a certain position (e.g., the activity wasn’t measured). So every position is one of: 0, 1, or ‘undefined’.
I want these ‘undefined’ values to be ignored during training (i.e., I don’t want the gradients at those positions to be pulled toward 0 or any other value).
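To make the desired behavior concrete, here is a minimal sketch in plain C++ (no framework calls) of what I mean by "ignored". It assumes the per-position gradient of sigmoid cross-entropy with respect to the logit is sigmoid(x) - label. The sentinel `kUndefined` and the function `masked_bce_grad` are hypothetical names I made up for illustration, not part of any library.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sentinel for "label not measured"; any value outside {0, 1} would do.
constexpr float kUndefined = -1.0f;

inline float sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

// Gradient of binary cross-entropy w.r.t. the logits, assuming the usual
// form sigmoid(logit) - label. Undefined positions get a gradient of
// exactly 0, so they do not pull the output in either direction.
std::vector<float> masked_bce_grad(const std::vector<float>& logits,
                                   const std::vector<float>& labels) {
    std::vector<float> grad(logits.size(), 0.0f);
    for (std::size_t i = 0; i < logits.size(); ++i) {
        if (labels[i] == kUndefined) continue;  // skip unmeasured positions
        grad[i] = sigmoid(logits[i]) - labels[i];
    }
    return grad;
}
```

For logits {0, 0, 0} and labels {1, kUndefined, 0}, this gives a nonzero gradient at the first and last positions and exactly 0 at the undefined one, which is the effect I’d like to get from the built-in layers if possible.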
Any suggestions on how to deal with this situation?
Can it be done without writing custom loss and activation functions (e.g., what will happen if I use NaN for these undefined positions)?
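My concern with NaN, sketched in plain C++: if the layer’s gradient is computed as prediction minus label (an assumption on my part, I haven’t checked the operator source), then a NaN label would produce a NaN gradient under IEEE 754 arithmetic rather than being skipped, and that NaN could then spread into the weights. The function names below are hypothetical and only illustrate the difference between the two behaviors.

```cpp
#include <cmath>
#include <limits>

// Assumed form of the per-position gradient: prediction minus label.
// With a NaN label the result is NaN, because NaN propagates through
// floating-point subtraction.
inline float logistic_grad(float prediction, float label) {
    return prediction - label;
}

// What I would want instead: a NaN label is treated as "undefined"
// and contributes a gradient of exactly 0.
inline float logistic_grad_ignore_nan(float prediction, float label) {
    return std::isnan(label) ? 0.0f : prediction - label;
}
```

So unless the layer explicitly checks for NaN, I’d expect the first behavior, not the second.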