Hi, I need to implement a fully connected linear layer in symbolic mode, where the weights are restricted to lie on a simplex, i.e., each row of the weight matrix is a probability distribution (the entries of each row are all non-negative and sum to one). One way I could think of to impose this restriction is to apply `softmax` to the weights as follows:

`y = mx.sym.dot(x, mx.sym.transpose(mx.sym.softmax(W)))`,

where `x` is the input symbol, `y` is the output symbol, and `W` is the weight matrix.
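To make the constraint concrete, here is a small NumPy sketch (NumPy rather than `mxnet`, purely to illustrate the math) showing that a row-wise softmax does put each row of `W` on the simplex, and what the resulting layer computes:

```python
import numpy as np

def row_softmax(W):
    # Subtract the row max before exponentiating, for numerical stability.
    Z = np.exp(W - W.max(axis=1, keepdims=True))
    # Normalize each row so it sums to one.
    return Z / Z.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))   # unconstrained weights, shape (out, in)
P = row_softmax(W)                # each row of P now lies on the simplex

assert np.all(P >= 0)
assert np.allclose(P.sum(axis=1), 1.0)

x = rng.standard_normal((2, 3))   # a batch of two inputs
y = x @ P.T                       # the layer: dot(x, transpose(softmax(W)))
print(y.shape)                    # (2, 4)
```

So the optimizer can update `W` freely in all of R^(n×m) while the effective weights `softmax(W)` stay on the simplex.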

However, it’s not clear to me how to tell `mxnet` that `W` is not just a symbol, but a parameter matrix that needs to be learned during training.

Can anyone give me pointers on how to set `W` in the above equation to be a parameter matrix?