Softmax block in Gluon


#1

The tutorial at http://gluon.mxnet.io/chapter02_supervised-learning/softmax-regression-gluon.html# doesn’t explain how to get a “pure” Softmax block in a Gluon model.

“Note, we didn’t have to include the softmax layer because MXNet’s has an efficient function that simultaneously computes the softmax activation and cross-entropy loss. However, if ever need to get the output probabilities,”

It seems like the last sentence is incomplete. Is building a custom block that calls F.softmax(x) in the hybrid_forward method the only way to get the output probabilities?

Thanks!
-Andrey


#2
>>> import mxnet as mx
>>> from mxnet import ndarray as nd
>>> x = nd.array([1, 2, 3, 4])
>>> y = nd.softmax(x)
>>> y

[ 0.0320586   0.08714432  0.23688284  0.64391428]
<NDArray 4 @cpu(0)>
>>> sum(y)

[ 1.]
<NDArray 1 @cpu(0)>
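For reference, the numbers above can be cross-checked with plain NumPy using just the definition softmax(x) = exp(x) / sum(exp(x)) (a sketch; NumPy isn’t otherwise used in this thread):

```python
import numpy as np

def softmax(x):
    # Subtract the max before exponentiating for numerical stability;
    # the result is unchanged because softmax is invariant to adding
    # a constant to every input.
    e = np.exp(x - np.max(x))
    return e / e.sum()

y = softmax(np.array([1.0, 2.0, 3.0, 4.0]))
print(y)        # matches the NDArray output above
print(y.sum())  # probabilities sum to 1
```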

#3

Thanks astonzhang!

I was actually looking for an existing softmax block to use in Gluon models. In the meantime I am using a little custom one:

from mxnet.gluon import HybridBlock

class Softmax(HybridBlock):

    def __init__(self, **kwargs):
        super(Softmax, self).__init__(**kwargs)

    def hybrid_forward(self, F, x):
        return F.softmax(x)

#4

@avolozin this seems useful enough to be added directly to Gluon.
Would you be OK with creating a PR?
https://mxnet.incubator.apache.org/community/contribute.html


#5

Thanks @madjam! I will need to check out the code and set up a dev environment - might have time toward the end of this week, quite swamped now :frowning:

Another option would be to add ‘softmax’ as a new activation type in mxnet.gluon.nn.Activation. What do you think?


#6

I like that idea. Keras does something similar: https://keras.io/activations/


#7

Thanks! I’ll ping you when I get some time to implement this.