I’d like to add a sparsity penalty on hidden layers, similar to how the penalty is applied in sparse autoencoders (ref: https://web.stanford.edu/class/cs294a/sparseAutoencoder.pdf).
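For reference, the penalty in those notes is the KL divergence between a target activation level and each hidden unit's average activation:

```latex
\sum_{j=1}^{s_2} \mathrm{KL}(\rho \,\|\, \hat{\rho}_j)
  = \sum_{j=1}^{s_2} \left[ \rho \log \frac{\rho}{\hat{\rho}_j}
    + (1 - \rho) \log \frac{1 - \rho}{1 - \hat{\rho}_j} \right]
```

where \rho is the sparsity target, \hat{\rho}_j is the average activation of hidden unit j, and s_2 is the number of hidden units.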
I think this should be possible using mxnet.symbol.IdentityAttachKLSparseReg, but I’d like to do it with Gluon. Looking around at some of the other operators, and with some trial and error, I ended up with something like:
```python
from mxnet import gluon


class SparsityPenalty(gluon.HybridBlock):
    def __init__(self, sparseness_target, penalty, momentum, **kwargs):
        super(SparsityPenalty, self).__init__(**kwargs)
        self._kwargs = {
            'sparseness_target': sparseness_target,
            'penalty': penalty,
            'momentum': momentum
        }
        # Moving average of the activations; grad_req='null' since the
        # operator updates it in the forward pass rather than via gradients.
        # The name must match the auxiliary state that
        # IdentityAttachKLSparseReg generates (see below).
        self.moving_avg = self.params.get('fwd_moving_avg', grad_req='null',
                                          allow_deferred_init=True)

    def hybrid_forward(self, F, x, moving_avg):
        # moving_avg is supplied by Gluon as the parameter created above.
        self._kwargs['moving_avg'] = moving_avg
        return F.IdentityAttachKLSparseReg(x, name='fwd', **self._kwargs)
```
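For context, here’s roughly how I’m attaching it after an encoder layer (the layer sizes and hyper-parameters are just placeholders):

```python
import mxnet as mx
from mxnet import gluon

net = gluon.nn.HybridSequential()
with net.name_scope():
    net.add(gluon.nn.Dense(64, activation='sigmoid'))  # encoder
    # Identity in the forward pass; attaches the KL penalty to the
    # encoder activations.
    net.add(SparsityPenalty(sparseness_target=0.05, penalty=0.001,
                            momentum=0.9))
    net.add(gluon.nn.Dense(784))  # decoder

net.initialize()
net.hybridize()
out = net(mx.nd.random.uniform(shape=(32, 784)))
```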
I’m not sure if this is set up the right way, especially since moving_avg is used as an auxiliary state by IdentityAttachKLSparseReg rather than as a regular parameter.
I also had to name the param so that it matches the name generated within IdentityAttachKLSparseReg; otherwise, an exception is raised within net.hybridize(). Hence the ‘fwd_moving_avg’. I think this should be fixable, but I haven’t worked out how.
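For what it’s worth, the name seems to come from the operator’s auxiliary state, which is the op name plus ‘_moving_avg’. That’s visible from the plain symbol API:

```python
import mxnet as mx

x = mx.sym.Variable('data')
y = mx.sym.IdentityAttachKLSparseReg(x, name='fwd', sparseness_target=0.1,
                                     penalty=0.001, momentum=0.9)
print(y.list_auxiliary_states())  # ['fwd_moving_avg']
```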
Any guidance on this would be appreciated.