On using SoftmaxOutput with single output

When using the SoftmaxOutput layer, if the parameter multi_output is set to False, the input data and output data are reshaped into 2-D tensors:
```cpp
Tensor<xpu, 2, DType> data = in_data[softmaxfocalout_enum::kData].get_with_shape<xpu, 2, DType>(s2, s);
Tensor<xpu, 2, DType> out = out_data[softmaxfocalout_enum::kOut].get_with_shape<xpu, 2, DType>(s2, s);
```
Then Softmax(out, data) is applied.
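For intuition, here is roughly what that reshape-plus-softmax amounts to, sketched in plain numpy (the shapes and names are my own illustration, not taken from the MXNet source):

```python
import numpy as np

def softmax_rows(x):
    # Numerically stable softmax along the last axis.
    z = x - x.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# With multi_output=False, an input of shape (batch, num_classes)
# is treated as one distribution per example.
data = np.random.randn(4, 6)              # 4 examples, 6 classes
out = softmax_rows(data)
assert np.allclose(out.sum(axis=1), 1.0)  # each row sums to 1
```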
My question is:

  1. Why use a softmax function rather than a sigmoid function, which usually outputs a single value?
  2. Why is the label’s shape Tensor<xpu, 1, DType>, rather than Tensor<xpu, 2, DType>, the same shape as the input data?
  3. I would like to use it for multi-label image classification. Normally, the image data is in BCHW format, so the 2-D tensor’s shape would be B × CHW, which may lead to a speed problem: in the backward pass, the grid dimension (GridDim) may be too small when the batch size is small.
  • Sigmoid is an activation function: it maps values from (-inf, inf) to values between 0 and 1, independently for each element. Softmax transforms an array of floats into a probability distribution that sums to 1. They are completely different things. You can use sigmoid (with a binary cross-entropy) as a loss for binary classification problems, whereas you need cross-entropy on top of softmax to use it as a loss function; softmax_cross_entropy is a commonly used loss function for multi-class classification. The sketch below illustrates the difference.
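A small numpy sketch of the contrast (my own illustration, not MXNet code):

```python
import numpy as np

def sigmoid(x):
    # Element-wise squashing to (0, 1); outputs are independent.
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # Converts the whole vector into a probability distribution.
    e = np.exp(x - x.max())
    return e / e.sum()

logits = np.array([2.0, -1.0, 0.5])

print(sigmoid(logits))   # ~[0.881 0.269 0.622]; does NOT sum to 1
p = softmax(logits)
print(p, p.sum())        # one distribution over 3 classes; sums to 1.0
```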

  • The label shape is 1-D because it is easier to specify class indices as labels. For example, [3, 5, 1] means the first example belongs to class 3, the second belongs to class 5, and the third belongs to class 1. If the label were 2-D, it would have to be written as a one-hot array, like [[0,0,0,1,0,0], [0,0,0,0,0,1], [0,1,0,0,0,0]]. If you have a 2-D label, you can use the argmax function to make it 1-D, as in the sketch below. That said, it would be nice to have both options, the way gluon’s SoftmaxCrossEntropyLoss handles it.
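To make the index-vs-one-hot point concrete, a short numpy sketch (my own illustration, assuming 6 classes):

```python
import numpy as np

# Index labels: one class id per example (the 1-D form).
label_1d = np.array([3, 5, 1])

# The equivalent one-hot encoding (the 2-D form, num_classes = 6).
label_2d = np.eye(6)[label_1d]
# [[0. 0. 0. 1. 0. 0.]
#  [0. 0. 0. 0. 0. 1.]
#  [0. 1. 0. 0. 0. 0.]]

# Going back from one-hot to indices with argmax:
recovered = label_2d.argmax(axis=1)
assert (recovered == label_1d).all()
```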

I’m not sure why there would be a speed problem. Could you please explain?