Replace SoftmaxOutput layer from converted Caffe network [solved]


#1

Hi there,

I converted a Caffe network (MTCNNv1*, see url ref below) to MXNet using <mxnet_repo>/tools/caffe_converter/convert_model.py. The conversion seems to have gone well, except for one part. The PNET has two outputs, conv4_2 and prob1, where prob1 is a softmax output. However, the converted MXNet SoftmaxOutput layer does not seem to handle this multidimensional output correctly: when I validate the values, they do not match the expected results. I have implemented my own multidimensional softmax function and want to connect it to conv4_1, the layer that feeds SoftmaxOutput, but I cannot reach that layer as an output. When I ask for the outputs, I only get these options:

sym.list_outputs()
Out[37]: ['conv4_2_output', 'prob1_output']

How do I access conv4_1_output so that I can redirect it to my own softmax implementation?

The rest of the network looks like this:

In[36]: mx.viz.print_summary(sym)
____________________________________________________
Layer (type)             Param #     Previous Layer
====================================================
data(null)               0
____________________________________________________
conv1(Convolution)       10          data
____________________________________________________
PReLU1(LeakyReLU)        0           conv1
____________________________________________________
pool1(Pooling)           0           PReLU1
____________________________________________________
conv2(Convolution)       16          pool1
____________________________________________________
PReLU2(LeakyReLU)        0           conv2
____________________________________________________
conv3(Convolution)       32          PReLU2
____________________________________________________
PReLU3(LeakyReLU)        0           conv3
____________________________________________________
conv4_2(Convolution)     4           PReLU3
____________________________________________________
conv4_1(Convolution)     2           PReLU3
____________________________________________________
prob1(SoftmaxOutput)     0           conv4_1
====================================================
Total params: 64
____________________________________________________

Many thanks,
Blake


#2

I found a hacky way to solve the problem. I edited the mynet-symbol.json file and changed the softmax node there from:

    {
      "op": "SoftmaxOutput", 
      "name": "prob1", 
      "inputs": [[22, 0, 0], [23, 0, 0]]
    }

to

    {
      "op": "SoftmaxActivation",
      "name": "prob1",
      "attrs": {"mode": "channel"},
      "inputs": [[22, 0, 0]]
    }
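
The hand edit above could also be scripted, which may help if the conversion has to be repeated. This is only a rough sketch using the standard json module: the function name replace_softmax_output and the tiny stand-in graph are made up for illustration, and it assumes the symbol file keeps its nodes in a top-level "nodes" list with the "attrs" key shown above.

```python
import json

def replace_softmax_output(graph, name="prob1"):
    """Rewrite the named SoftmaxOutput node as a channel-mode
    SoftmaxActivation, editing the graph dict in place."""
    for node in graph["nodes"]:
        if node.get("op") == "SoftmaxOutput" and node.get("name") == name:
            node["op"] = "SoftmaxActivation"
            node["attrs"] = {"mode": "channel"}
            # Keep only the first (data) input; the second input pointed
            # at the label variable, which the activation does not take.
            node["inputs"] = node["inputs"][:1]
            return graph
    raise KeyError(f"no SoftmaxOutput node named {name!r}")

# Stand-in graph with made-up node indices, not the real converted network.
demo = {"nodes": [
    {"op": "Convolution", "name": "conv4_1", "inputs": []},
    {"op": "null", "name": "prob1_label", "inputs": []},
    {"op": "SoftmaxOutput", "name": "prob1",
     "inputs": [[0, 0, 0], [1, 0, 0]]},
]}
patched = replace_softmax_output(demo)
print(json.dumps(patched["nodes"][2]))
```

For the real file, the same function would be applied to the dict returned by json.load on mynet-symbol.json and the result written back with json.dump. The label node itself is left in the node list, as in the manual edit.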

It now computes a multidimensional softmax over the channels, as intended, and the output is similar to that of the original Matlab/Caffe implementation. Note that input 23 referred to the “prob1_label” node, which is not needed.
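
For anyone wanting to check the numbers outside MXNet, a softmax over the channel axis can be reproduced in a few lines of NumPy. This is a minimal sketch, not the function from post #1: the name channel_softmax and the random test tensor are made up, and it assumes the conv4_1 output is laid out as (batch, channel, height, width).

```python
import numpy as np

def channel_softmax(x, axis=1):
    """Softmax along one axis, applied independently at every other index."""
    # Subtract the per-position maximum for numerical stability;
    # this does not change the result.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

# Made-up scores: 1 image, 2 channels, 3x4 spatial map.
scores = np.random.default_rng(0).normal(size=(1, 2, 3, 4))
probs = channel_softmax(scores)

# At every spatial position the two channel probabilities sum to 1.
assert np.allclose(probs.sum(axis=1), 1.0)
```

Comparing such a reference against the network's prob1 output with np.allclose is one way to confirm that the patched layer behaves as expected.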