I’m building a network that needs a learnable variable, initialized to a constant, that should be optimized together with the other weights.
The example code is below:
scale = mx.symbol.var('scale', shape=(1, 1, 9), init=mx.initializer.Constant(1))
scale_input = mx.symbol.broadcast_mul(input, scale)
… `scale_input` then feeds into some loss function for optimization
But when I load the resulting trained params, the values of `scale` are still all constant 1. I want this variable to receive gradients and be optimized too, like the network weights (only the input training data should stay fixed).
Can someone instruct me how to do this?