Argpartition on ndarray


#1

I’m converting a chunk of numpy code to run on MXNet. I’m not sure how to convert this line without dropping back to numpy:

smallest_k = np.argpartition(vec, k)[0:k]

If you’re not familiar, np.partition does a partial sort on the vector: it separates the values into two groups (partitions), one of size k and the other of size len(vec) - k, such that every value in the first group is less than or equal to every value in the second. np.argpartition does the same thing but returns the indices into the vector rather than the values.
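
To make the behaviour concrete, here is a minimal sketch on a made-up six-element vector (the values of `vec` and `k` are illustrative assumptions, not taken from my real code). Note that the order of the first k entries is not guaranteed, so the checks compare sorted results.

```python
import numpy as np

# Hypothetical example data, chosen only for illustration.
vec = np.array([9.0, 1.0, 7.0, 3.0, 8.0, 2.0])
k = 3

# Indices of the k smallest values; their order within the slice is unspecified.
smallest_k = np.argpartition(vec, k)[:k]

# np.partition gives the values themselves instead of the indices.
smallest_vals = np.partition(vec, k)[:k]

# Compare as sorted lists because the partition does not order each group.
print(sorted(smallest_k.tolist()))
print(sorted(smallest_vals.tolist()))
```

Indexing `vec` with `smallest_k` yields the same set of values as `smallest_vals`, which is the property I need to preserve in MXNet.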

Can I do this efficiently in MXNet (symbol or gluon) without writing a custom operator or dropping back to numpy? Advice appreciated.

Thanks!


#2

The following code returns the indices of the k smallest elements:

smallest_k = mxnet.ndarray.topk(vec, ret_typ='indices', k=k, axis=0, is_ascend=True)

#3

Thanks! That works great and seems to be quite fast too.

Is there any way to change the return type to integer instead of float32? Returning float32 indices effectively limits the size of the vector that can be scanned to ~16 million (2^24 = 16,777,216), because above that value float32 can no longer represent every integer exactly.
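
The limit can be checked without MXNet. The sketch below round-trips integers through IEEE 754 single precision using only the standard library; `to_float32` is a hypothetical helper written for this illustration, not part of any library.

```python
import struct

def to_float32(n):
    """Round-trip an integer through a 4-byte IEEE 754 float."""
    return struct.unpack("f", struct.pack("f", float(n)))[0]

limit = 2 ** 24  # 16,777,216

# Every integer up to 2**24 survives the round trip unchanged.
print(to_float32(limit - 1) == limit - 1)
print(to_float32(limit) == limit)

# 2**24 + 1 has no exact float32 representation, so it comes back as a neighbour.
print(to_float32(limit + 1) == limit + 1)
```

An index such as 16,777,217 would therefore be returned as a neighbouring value, silently pointing at the wrong element.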


#4

Unfortunately, in the current implementation of topk, the returned index tensor is cast to float. Feel free to open a GitHub issue if this is something you need.

P.S. I realized there is a parameter, is_ascend, that allows topk to return the smallest values without first multiplying the input by -1:

smallest_k = mxnet.ndarray.topk(vec, ret_typ='indices', k=k, axis=0, is_ascend=True)