How can I use a sparse constant with Gluon?


#1

Hi

I am using the Gluon API to store a constant via the get_constant method on the ParameterDict of a HybridBlock. The value of this constant is a sparse matrix, so I want to use sparse matrix multiplication. However, this has proven difficult because get_constant converts the matrix to a dense numpy.ndarray by calling asnumpy() before storing it as a Constant.

Is there any way to use sparse matrix multiplication in this case with a HybridBlock?


#2

Instead of calling get_constant(), you could create a regular parameter and set grad_req='null'. Then your sparse array will not be converted to a dense ndarray. You won’t have an immutable tensor, but autograd will at least ignore it.


#3

Thanks for your suggestion.

I tried it out by instantiating a parameter as follows in the constructor of my layer:

    self.A = self.params.get(
        'A', grad_req='null',
        shape=A.shape, dtype=A.dtype,
        stype='csr', init=Constant(A))

Here, A is the sparse matrix and Constant is the initializer class.

However, this raises an error on the first forward pass (the model(X) call after initialization), with the following (shortened) stack trace:

~/.local/share/virtualenvs/rne-U4U7XVrs/lib/python3.6/site-packages/mxnet/gluon/block.py in <dictcomp>(.0)
    909 
    910                 try:
--> 911                     params = {i: j.data(ctx) for i, j in self._reg_params.items()}
    912                 except DeferredInitializationError:
    913                     self._deferred_infer_shape(x, *args)

~/.local/share/virtualenvs/rne-U4U7XVrs/lib/python3.6/site-packages/mxnet/gluon/parameter.py in data(self, ctx)
    491             raise RuntimeError("Cannot return a copy of Parameter '%s' on ctx %s via data() " \
    492                                "because its storage type is %s. Please use row_sparse_data() " \
--> 493                                "instead." % (self.name, str(ctx), self._stype))
    494         return self._check_and_get(self._data, ctx)
    495 

RuntimeError: Cannot return a copy of Parameter 'hybridsequential0_hybridsequential0_graphsagemean0_A'
on ctx cpu(0) via data() because its storage type is csr.
Please use row_sparse_data() instead.

It would appear that mxnet does not support CSRNDArray values in Parameter objects.

For completeness, here is the full stack trace:

RuntimeError                              Traceback (most recent call last)
<ipython-input-7-a025f3eff79e> in <module>
      1 #model.hybridize()
      2 model.initialize(Xavier())
----> 3 model(X)

~/.local/share/virtualenvs/rne-U4U7XVrs/lib/python3.6/site-packages/mxnet/gluon/block.py in __call__(self, *args)
    539             hook(self, args)
    540 
--> 541         out = self.forward(*args)
    542 
    543         for hook in self._forward_hooks.values():

~/.local/share/virtualenvs/rne-U4U7XVrs/lib/python3.6/site-packages/mxnet/gluon/block.py in forward(self, x, *args)
    916                     params = {i: j.data(ctx) for i, j in self._reg_params.items()}
    917 
--> 918                 return self.hybrid_forward(ndarray, x, *args, **params)
    919 
    920         assert isinstance(x, Symbol), \

~/.local/share/virtualenvs/rne-U4U7XVrs/lib/python3.6/site-packages/mxnet/gluon/nn/basic_layers.py in hybrid_forward(self, F, x)
    115     def hybrid_forward(self, F, x):
    116         for block in self._children.values():
--> 117             x = block(x)
    118         return x
    119 

~/.local/share/virtualenvs/rne-U4U7XVrs/lib/python3.6/site-packages/mxnet/gluon/block.py in __call__(self, *args)
    539             hook(self, args)
    540 
--> 541         out = self.forward(*args)
    542 
    543         for hook in self._forward_hooks.values():

~/.local/share/virtualenvs/rne-U4U7XVrs/lib/python3.6/site-packages/mxnet/gluon/block.py in forward(self, x, *args)
    916                     params = {i: j.data(ctx) for i, j in self._reg_params.items()}
    917 
--> 918                 return self.hybrid_forward(ndarray, x, *args, **params)
    919 
    920         assert isinstance(x, Symbol), \

~/.local/share/virtualenvs/rne-U4U7XVrs/lib/python3.6/site-packages/mxnet/gluon/nn/basic_layers.py in hybrid_forward(self, F, x)
    115     def hybrid_forward(self, F, x):
    116         for block in self._children.values():
--> 117             x = block(x)
    118         return x
    119 

~/.local/share/virtualenvs/rne-U4U7XVrs/lib/python3.6/site-packages/mxnet/gluon/block.py in __call__(self, *args)
    539             hook(self, args)
    540 
--> 541         out = self.forward(*args)
    542 
    543         for hook in self._forward_hooks.values():

~/.local/share/virtualenvs/rne-U4U7XVrs/lib/python3.6/site-packages/mxnet/gluon/block.py in forward(self, x, *args)
    909 
    910                 try:
--> 911                     params = {i: j.data(ctx) for i, j in self._reg_params.items()}
    912                 except DeferredInitializationError:
    913                     self._deferred_infer_shape(x, *args)

~/.local/share/virtualenvs/rne-U4U7XVrs/lib/python3.6/site-packages/mxnet/gluon/block.py in <dictcomp>(.0)
    909 
    910                 try:
--> 911                     params = {i: j.data(ctx) for i, j in self._reg_params.items()}
    912                 except DeferredInitializationError:
    913                     self._deferred_infer_shape(x, *args)

~/.local/share/virtualenvs/rne-U4U7XVrs/lib/python3.6/site-packages/mxnet/gluon/parameter.py in data(self, ctx)
    491             raise RuntimeError("Cannot return a copy of Parameter '%s' on ctx %s via data() " \
    492                                "because its storage type is %s. Please use row_sparse_data() " \
--> 493                                "instead." % (self.name, str(ctx), self._stype))
    494         return self._check_and_get(self._data, ctx)
    495 

RuntimeError: Cannot return a copy of Parameter 'hybridsequential0_hybridsequential0_graphsagemean0_A' on ctx cpu(0) via data() because its storage type is csr. Please use row_sparse_data() instead.

#4

Hi @TobiasJepsen, unfortunately that’s a bug (or csr should have been removed from the docs). As a workaround, would it work to feed the same csr_matrix as an input to the network on every forward pass (instead of making it a constant parameter)?