Design for `sym.constant` and `sym.value`

Right now it can be confusing to use existing NumPy data with the symbol API. The only way I know of to represent the sum of a NumPy array and a symbol is to create a variable for the NumPy data, pass the value to bind as an input, and block its gradient (adapted from #6087):

import mxnet as mx
import numpy as np

a = mx.sym.var('a')
x = np.random.randn(5)

# 'b' will carry the numpy data; BlockGrad stops gradients from flowing into it.
b = mx.sym.var('b')
b = mx.sym.BlockGrad(b)

a_ = mx.nd.zeros(5)
b_ = mx.nd.zeros(5)
b_[:] = x  # copy the numpy values into the NDArray

func = (a + b).bind(mx.cpu(), {'a': a_, 'b': b_})
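As a quick sanity check that the gradient really is blocked (a sketch continuing the snippet above; to inspect gradients, bind additionally needs args_grad buffers):

ga = mx.nd.zeros(5)
gb = mx.nd.zeros(5)
func = (a + b).bind(mx.cpu(), {'a': a_, 'b': b_},
                    args_grad={'a': ga, 'b': gb})
func.forward(is_train=True)
func.backward(mx.nd.ones(5))
print(ga.asnumpy())  # all ones: the gradient reaches 'a'
print(gb.asnumpy())  # all zeros: BlockGrad stopped it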

Instead it would be great to have an op mx.sym.value that has no input, but an auxiliary state that stores the actual value (auxiliary states, like the moving statistics of BatchNorm, are not part of the gradient computation). The usage would be something like this:

a = mx.sym.var('a')
b = mx.sym.value('b')

a_ = mx.nd.zeros(5)
b_ = mx.nd.zeros(5)
b_[:] = x

func = (a + b).bind(mx.cpu(), {'a': a_}, aux_states={'b': b_})

or with a default value:

a = mx.sym.var('a')
b = mx.sym.value('b', value=mx.nd.zeros(5))

a_ = mx.nd.zeros(5)
func = (a + b).bind(mx.cpu(), {'a': a_})

An additional mx.sym.constant could work exactly the same, except that the input arrays would be set to read-only, so that graph optimizations can assume the value cannot change after bind has been called. This would allow optimizations like constant propagation.
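To make the intended optimization concrete, here is a purely hypothetical usage (mx.sym.constant does not exist today; the folding is what a backend could do, not what MXNet currently does):

c1 = mx.sym.constant('c1', value=mx.nd.ones(5))
c2 = mx.sym.constant('c2', value=2 * mx.nd.ones(5))
x = mx.sym.var('x')
# Both constants are read-only after bind, so a graph pass could
# precompute c1 + c2 once instead of re-adding them on every forward.
y = (c1 + c2) * x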

Using init would probably be a more elegant way to do this:

a = mx.sym.var('a', init=mx.init.Array(np.random.rand(1,2,3)))

Writing a new initializer.Array should be easy.

I use the following initializer (suggested by David):

import mxnet as mx

@mx.init.register
class TensorInitializer(mx.init.Initializer):

    def __init__(self, ini_tensor):
        super(TensorInitializer, self).__init__(ini_tensor=ini_tensor)
        self.ini_tensor = ini_tensor

    def _init_weight(self, _, arr):
        arr[:] = self.ini_tensor

It raises an error:

TypeError: array([…]) is not JSON serializable

Any suggestions?
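A likely cause (an assumption, not confirmed in the thread): Initializer.__init__ stores its kwargs, and Initializer.dumps() serializes them to JSON so the initializer can be attached to the symbol as a string attribute; a raw numpy array is not JSON serializable. A minimal sketch of a workaround is to hand the base class a plain nested list while keeping the array for the actual write:

import mxnet as mx
import numpy as np

@mx.init.register
class TensorInitializer(mx.init.Initializer):

    def __init__(self, ini_tensor):
        # JSON can serialize nested lists but not numpy arrays, so pass
        # a list to the base class; keep the array for _init_weight.
        # np.asarray also covers the round trip, where mx.init.create
        # re-instantiates the class from the JSON string with a list.
        ini_tensor = np.asarray(ini_tensor)
        super(TensorInitializer, self).__init__(ini_tensor=ini_tensor.tolist())
        self.ini_tensor = ini_tensor

    def _init_weight(self, _, arr):
        arr[:] = self.ini_tensor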

However, this way "a" will be changed by backpropagation; what we want is for "a" to remain a constant.
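One way to get that with today's API (a sketch, not from the thread): request no gradient for 'a' at bind time via grad_req, and/or create the variable with lr_mult=0 so optimizers never update it.

import mxnet as mx
import numpy as np

a = mx.sym.var('a', lr_mult=0)  # optimizers will skip updates for 'a'
b = mx.sym.var('b')
out = a + b

a_ = mx.nd.array(np.random.rand(5))
b_ = mx.nd.zeros(5)
gb = mx.nd.zeros(5)

# grad_req 'null' for 'a': no gradient is computed or stored for it.
func = out.bind(mx.cpu(), {'a': a_, 'b': b_},
                args_grad={'b': gb},
                grad_req={'a': 'null', 'b': 'write'})
func.forward(is_train=True)
func.backward(mx.nd.ones(5))  # a_ stays untouched; gb receives ones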