Are zero gradients or an additional forward pass suitable in MXNet training?


I am trying to develop a detection network using MXNet.

Previously, I built a subnetwork using Caffe. This subnetwork requires data splitting and an additional forward operation during training.

In my MXNet configuration, the subnetwork is:

conv_part0 | conv_part1 | conv_part2 [these share weights]

roi_pooled0 | roi_pooled1 | roi_pooled2 [ these features are extracted from conv_partx ]

        | feature_selector |

[this selector chooses one roi_pooled feature per ROI; during back-propagation, it sets the gradients of the roi_pooled features that were not selected to 0]
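To be concrete about what the selector does, here is the intended forward/backward behaviour as a NumPy sketch (not the actual layer; the shapes and `select_idx` values are made up for illustration):

```python
import numpy as np

# Hypothetical sizes: 3 branches, 4 ROIs, feature dim 5.
n_branch, n_roi, feat_dim = 3, 4, 5
pooled = [np.random.randn(n_roi, feat_dim) for _ in range(n_branch)]

# One branch index chosen per ROI (made up here; in the real layer it
# comes from the first forward pass).
select_idx = np.array([0, 2, 1, 2])

# Forward: for each ROI, pick the selected branch's feature.
out = np.stack([pooled[b][r] for r, b in enumerate(select_idx)])

# Backward: route the incoming gradient only to the selected branch;
# every unselected branch receives exactly zero for that ROI.
out_grad = np.ones_like(out)
in_grads = [np.zeros_like(p) for p in pooled]
for r, b in enumerate(select_idx):
    in_grads[b][r] = out_grad[r]
```

So per ROI, exactly one of the three `in_grads` rows is nonzero, and that is the gradient that flows back through the corresponding roi_pooled and conv_part symbols.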

Training then needs an additional forward operation to get the labels:

1. mod.forward using data_batch_1
2. data_batch_2 is updated using the output of that forward
3. mod.forward_backward using data_batch_2
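For reference, this is roughly how I run the two passes with the Module API (a sketch; `make_batch2` stands for my own label-updating code and is not an MXNet function):

```python
# mod is an already-bound mx.mod.Module with initialized
# parameters and optimizer.
def train_step(mod, data_batch_1, make_batch2):
    # Pass 1: forward only, to produce the selector labels.
    # is_train=True so the network behaves as in training,
    # but no backward is run on this pass.
    mod.forward(data_batch_1, is_train=True)
    outputs = [o.asnumpy() for o in mod.get_outputs()]

    # Build the second batch from the first pass's outputs
    # (make_batch2 is my own code, not part of MXNet).
    data_batch_2 = make_batch2(data_batch_1, outputs)

    # Pass 2: forward + backward + parameter update.
    mod.forward_backward(data_batch_2)
    mod.update()
```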

However, this configuration does not seem to back-propagate correctly.

When I set the shared conv weights to a reversed identity matrix, or feed wrong labels to the feature selector, the loss and accuracy stay exactly the same, as if those changes had no effect.

I suspect two possibilities. First, in the feature selector, the gradients of the unselected features are zero and are propagated back to the convolution symbols through ROI pooling; maybe these zeros interfere with proper back-propagation. Second, the additional forward operation may interfere with the module binding.
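About the first possibility: mathematically, zeroing one branch's gradient should not stop the shared weight from learning, because the shared weight accumulates the gradients of all branches and the masked branch simply contributes zero. A tiny NumPy check of that (made-up linear branches, nothing MXNet-specific):

```python
import numpy as np

w = np.random.randn(3, 3)            # shared weight of the conv parts
x0 = np.array([1.0, 2.0, 3.0])       # input to branch 0
x1 = np.array([4.0, 5.0, 6.0])       # input to branch 1

# Two branches share w; pretend the selector picked branch 0.
y0, y1 = w @ x0, w @ x1
g0 = np.ones(3)                      # gradient flowing into branch 0
g1 = np.zeros(3)                     # branch 1 masked to zero

# Gradient of the shared weight = sum over branches.
grad_w = np.outer(g0, x0) + np.outer(g1, x1)
```

Here `grad_w` equals the branch-0 contribution alone, so if the selector's backward really assigns the zeros correctly, the zeros by themselves should be harmless.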

I am trying to track down the problem, but it is not easy.

Are an additional forward pass, and back-propagation where some gradients are zeroed, supported in MXNet?

Any ideas on how to solve this problem?