Hi, I need to implement a fully connected linear layer in symbolic mode where the weights are restricted to lie on a simplex, i.e., each row of the weight matrix is a probability distribution (the entries of each row are all non-negative and sum to one). One way I could think of to impose this restriction is to apply softmax to the weights as follows:
y = mx.sym.dot(x, mx.sym.transpose(mx.sym.softmax(W)))
where x is the input symbol, y is the output symbol, and W is the weight matrix.
However, it’s not clear to me how to tell MXNet that W is not a symbol, but a parameter matrix that needs to be learned during training.
Can anyone give me pointers on how to set W in the above equation to be a parameter matrix?
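
To make the computation I have in mind concrete, here is a minimal sketch in plain Python (not MXNet code). The helper names row_softmax and simplex_linear are made up for illustration, and I am assuming W has shape (out_dim, in_dim) with the softmax taken along each row, so that every row of the effective weight matrix lies on the simplex:

```python
import math

def row_softmax(W):
    """Apply a numerically stable softmax to each row of W (a list of lists)."""
    out = []
    for row in W:
        m = max(row)                      # subtract the row max for stability
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def simplex_linear(x, W):
    """Compute y = x . softmax(W)^T.

    x: batch of inputs, shape (batch, in_dim)
    W: unconstrained weights, shape (out_dim, in_dim)
    """
    P = row_softmax(W)                    # each row of P is a probability distribution
    return [[sum(xi * pi for xi, pi in zip(x_row, p_row)) for p_row in P]
            for x_row in x]

# Hypothetical example values
W = [[0.5, -1.0, 2.0], [0.0, 0.0, 0.0]]
x = [[1.0, 2.0, 3.0]]
P = row_softmax(W)
y = simplex_linear(x, W)
```

Since W itself is unconstrained here, it could in principle be updated by ordinary gradient steps while softmax(W) stays on the simplex; each output is then a convex combination of the input entries.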