Pretrained network for multi-class classification

Hi,

I am very new to MXNet and Gluon. I took the “Transferring knowledge through fine-tuning” example from http://gluon.mxnet.io/chapter08_computer-vision/fine-tuning.html and I am trying to use it on my own data for multi-class classification. My labels are n-hot. I just substituted loss = gluon.loss.SoftmaxCrossEntropyLoss() with loss = gluon.loss.SigmoidBinaryCrossEntropyLoss(), but the loss doesn’t go down. Is this the right approach for multi-class classification? Here is my code: https://gist.github.com/mongoose54/8d47be1359691bae3b2470dfab60fc00
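For reference, here is a minimal sketch of what I mean by the substitution (shapes and label values are made up for illustration). One thing worth checking with this loss is that the network outputs raw scores, since the loss applies the sigmoid itself by default:

```python
from mxnet import nd, gluon

# Hypothetical shapes: a batch of 4 samples, 5 possible labels each.
num_labels = 5
net_output = nd.random.normal(shape=(4, num_labels))  # raw scores (logits) from the network
labels = nd.array([[0, 0, 1, 0, 1],
                   [1, 0, 0, 0, 0],
                   [0, 1, 1, 0, 0],
                   [0, 0, 0, 1, 1]])  # n-hot targets

# from_sigmoid=False (the default) means the loss applies the sigmoid
# itself, so the last layer must output raw scores, not probabilities;
# applying a sigmoid twice is one possible reason a loss plateaus.
loss_fn = gluon.loss.SigmoidBinaryCrossEntropyLoss(from_sigmoid=False)
print(loss_fn(net_output, labels))  # one loss value per sample
```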

Does anyone have any ideas? I would love to switch to MXNet; it is so much faster.

Hi @mongoose54, it is my understanding that if you are training a multi-class problem you cannot use SigmoidBinaryCrossEntropyLoss, since it expects a single probability value per datum, not a vector of probabilities over the classes (in one-hot representation). You need to use SoftmaxCrossEntropyLoss and manually specify the axis along which the one-hot probabilities live.
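For a one-hot setup, I mean something like this (a rough sketch; the shapes and the sparse_label=False flag are just for illustration):

```python
from mxnet import nd, gluon

# Hypothetical one-hot setup: 3 samples, 4 classes.
logits = nd.random.normal(shape=(3, 4))
one_hot = nd.array([[0, 0, 1, 0],
                    [1, 0, 0, 0],
                    [0, 0, 0, 1]])

# sparse_label=False tells the loss that labels are full one-hot
# vectors rather than class indices; axis selects the class dimension.
loss_fn = gluon.loss.SoftmaxCrossEntropyLoss(axis=-1, sparse_label=False)
print(loss_fn(logits, one_hot))  # one loss value per sample
```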

@feevos Thanks for the input. I have to use SigmoidBinaryCrossEntropyLoss because I am doing N-hot classification. I agree with you that if it were one-hot, I would have to use SoftmaxCrossEntropyLoss.

Hi @mongoose54, thanks, this is something I don’t know. Can I ask what the difference is between a one-hot and an n-hot representation? I’ve never seen n-hot anywhere; any links?

@feevos You are right, the term N-hot is not common, but it is used to describe multi-label classification (https://en.wikipedia.org/wiki/Multi-label_classification), in which more than one class can be assigned per sample, e.g. [0 0 1 0 1], hence the term N-hot instead of one-hot. I should have used the term multi-label classification instead. My bad.
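For completeness, in this setting each label is predicted independently at inference time, e.g. by thresholding a per-label sigmoid (a minimal sketch; the scores are made up and the 0.5 threshold is an arbitrary but common choice):

```python
from mxnet import nd

# Hypothetical raw scores for one sample over 5 labels.
scores = nd.array([[-2.0, -1.5, 3.0, -0.5, 1.2]])

# Each label probability is independent: apply a sigmoid per label
# and threshold it.
probs = nd.sigmoid(scores)
predicted = probs > 0.5
print(probs)       # ~[[0.12 0.18 0.95 0.38 0.77]]
print(predicted)   # [[0. 0. 1. 0. 1.]], i.e. classes 2 and 4 are assigned
```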
