Argpartition on ndarray

leopd · March 7, 2018, 8:05pm

I’m converting a chunk of numpy code to run on MXNet. I’m not sure how to convert this line without dropping back to numpy:

smallest_k = np.argpartition(vec, k)[0:k]

If you’re not familiar, np.partition does a partial sort on the vector: it separates the values into two groups (partitions), one of size k and the other of size len-k, such that every value in the first group is less than or equal to every value in the second. np.argpartition does the same thing but gives you the indices into the vector rather than the values.
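To make the behaviour concrete, here is a small sketch in plain numpy. The sample vector and the value of k are made up for illustration. The order of elements inside each partition is not guaranteed, so the results are sorted before printing.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
vec = np.array([7.0, 2.0, 9.0, 4.0, 1.0, 8.0, 3.0])
k = 3

# Indices of the k smallest values; their order within the group is arbitrary.
smallest_k = np.argpartition(vec, k)[:k]

# np.partition returns the values themselves, partitioned the same way.
smallest_vals = np.partition(vec, k)[:k]

print(sorted(smallest_k.tolist()))     # [1, 4, 6]
print(sorted(smallest_vals.tolist()))  # [1.0, 2.0, 3.0]
```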

Can I do this efficiently in MXNet (symbol or gluon) without writing a custom operator or dropping back to numpy? Advice appreciated.

Thanks!

safrooze · March 8, 2018, 10:40pm

The following code returns the indices of the k smallest values:

smallest_k = mxnet.ndarray.topk(vec, ret_typ='indices', k=k, axis=0, is_ascend=True)

leopd · March 12, 2018, 7:42pm

Thanks! That works great and seems to be quite fast too.

Is there any way to change the return type to integer instead of float32? Returning float32 indices effectively limits the size of vector that can be scanned to ~16 million, because 2^24 = 16,777,216 is the largest value up to which float32 can represent every integer exactly.
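The limit can be checked without MXNet. The sketch below round-trips integers through a 32-bit float using only Python's standard library; to_float32 is a helper name invented for this example, not part of any library.

```python
import struct

def to_float32(x):
    """Round-trip a number through an IEEE 754 single-precision float."""
    return struct.unpack("f", struct.pack("f", float(x)))[0]

limit = 2 ** 24  # 16,777,216

# Every integer up to 2**24 survives the round trip unchanged.
print(to_float32(limit) == limit)          # True

# 2**24 + 1 cannot be represented and is rounded to a neighbouring value,
# so an index this large would come back pointing at the wrong element.
print(to_float32(limit + 1) == limit + 1)  # False
print(to_float32(limit + 1))               # 16777216.0
```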

safrooze · March 14, 2018, 1:33am

Unfortunately, in the current implementation of topk, the returned indices tensor is cast to float. Feel free to open a GitHub issue if this is something you need.

P.S. I realized there is a parameter that allows topk to return smallest values without a multiplication by -1:

smallest_k = mxnet.ndarray.topk(vec, ret_typ='indices', k=k, axis=0, is_ascend=True)
