How to efficiently build a mask from indexes?

Hi there.
I need to build a mask symbol out of a list of indexes.

For example given a set of indexes idxs:

idxs=[[1,2],[2,3],[0,1]]

then I want a mask like the following

mask= [[0,1,1,0],[0,0,1,1],[1,1,0,0]].

where idxs has shape [batch_size, num_indexes]
and mask has shape [batch_size, max_values]

At the moment I am using the following:
mask = mx.sym.sum(
    mx.sym.one_hot(idxs, depth=max_values),
    axis=1)

Unfortunately, this takes a lot of GPU memory: the intermediate one-hot tensor has shape [batch_size, num_indexes, max_values], so memory scales as batch_size * max_values * num_indexes. I was wondering if anyone here has ideas on how to do this in a more efficient way.
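For reference, here is the same computation in mx.nd form using the toy example above, which makes the intermediate shape explicit (this is just a runnable sketch of the current approach, not a fix):

import mxnet as mx

idxs = mx.nd.array([[1, 2], [2, 3], [0, 1]])
max_values = 4

# the intermediate one-hot tensor has shape
# [batch_size, num_indexes, max_values] = [3, 2, 4],
# which is what dominates the memory usage
one_hot = mx.nd.one_hot(idxs, depth=max_values)
mask = mx.nd.sum(one_hot, axis=1)  # shape [3, 4]
print(mask.asnumpy())
# [[0. 1. 1. 0.]
#  [0. 0. 1. 1.]
#  [1. 1. 0. 0.]]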

Hi @peller,

Sounds like sparse matrices would help out here. You can define the expected output much more directly using your indexes, and you’ll save tons of memory by avoiding the one-hot encoding. Check out the example below using mx.nd; the API should be the same for mx.sym (https://mxnet.incubator.apache.org/api/python/symbol/sparse.html?highlight=sparse#module-mxnet.symbol.sparse).

import mxnet as mx

# indicates which indices and data entries belong to which rows
indptr = mx.nd.array([0, 2, 4, 6])
## row 0 is 0:2 from indices and data
## row 1 is 2:4 from indices and data
## row 2 is 4:6 from indices and data

# same as your `idxs` but flattened
indices = mx.nd.array([1, 2, 2, 3, 0, 1])

# all 1s in your example
data = mx.nd.array([1, 1, 1, 1, 1, 1])

a = mx.nd.sparse.csr_matrix((data, indices, indptr), shape=(3, 4))
a.asnumpy()
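which should give back exactly the mask from your example:

[[0. 1. 1. 0.]
 [0. 0. 1. 1.]
 [1. 1. 0. 0.]]

Since every row has the same number of indexes, you could also build indptr programmatically, something like mx.nd.arange(0, batch_size * num_indexes + 1, num_indexes) (assuming idxs has shape [batch_size, num_indexes] as in your post).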

Hi @thomelane, thanks for the reply. Unfortunately sparse arrays are not an option for me, as we are running on GPUs and I believe they are not supported there.

@eric-haibin-lin could you clarify if sparse arrays can be used on GPU?

csr_matrix is supported on GPU but the scope is limited. You can do common operations such as sparse.where, contrib.SparseEmbedding and sparse.dot.

@peller, what do you want to do with the mask? Would sparse.where work for you?

Hi,
Thank you for the replies!
I need to mask a large softmax layer. I am doing policy gradient, but not all the options are always available, so I am masking out the options that are not available. In practice, I implemented a numerically stable softmax that returns non-zero probabilities only for the indices contained in the idxs symbol, as in the example.
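Roughly, the dense version looks like this (a minimal sketch in mx.nd; the function name and the -1e9 fill value are illustrative, not my exact code):

import mxnet as mx

def masked_softmax(logits, mask):
    # mask: 1.0 for available options, 0.0 for unavailable ones
    # fill unavailable entries with a large negative value so they
    # do not affect the max used for numerical stability
    neg_fill = mx.nd.ones_like(logits) * -1e9
    masked_logits = mx.nd.where(mask, logits, neg_fill)
    shifted = masked_logits - mx.nd.max(masked_logits, axis=1, keepdims=True)
    exp = mx.nd.exp(shifted) * mask  # exactly zero for unavailable options
    return exp / mx.nd.sum(exp, axis=1, keepdims=True)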
I believe sparse arrays are the right way to do it, but given the limited support I am not super keen on going down this path.

Hi @peller, I’m interested in your masked softmax. I implemented sampled_softmax at https://github.com/apache/incubator-mxnet/blob/master/example/rnn/large_word_lm/model.py#L75 with sparse ndarrays (on GPU). Is this similar to what you’re doing? Why do you want a mask in your case?

Hi, I think it is something much simpler. I have a bunch of options and contexts, and my model should output the best option given a context. At training time I know that some of the options are not available for some of the contexts, so I mask the output of my softmax to return non-zero probabilities only for the available options.
Does this make sense?

I see. Would it be helpful if MXNet supported elementwise multiplication of csr * dense = csr on GPU?
That way you would only need a sparse mask, and you would get a sparse output.
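Usage might then look something like this (purely illustrative; it assumes elemwise_mul is the op that would get the csr * dense = csr support on GPU):

# mask_csr: the csr mask built as above; probs: dense softmax output
# result would be csr, non-zero only at the available positions
masked_probs = mx.nd.sparse.elemwise_mul(mask_csr, probs)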

It would be very beneficial. I believe that it would basically solve my problem.

Cool. I’ve filed the feature request at https://github.com/apache/incubator-mxnet/issues/10506