Complex CustomOp: Allocate memory once, re-use it?



I have implemented a fairly complex CustomOp. It performs internal computations that require a substantial amount of working memory.

What I’d like to do is allocate this working memory only once, and then re-use it in all subsequent forward and backward calls. The size, dtype, etc. of this memory depend on those of the inputs, but these are known at binding time.

Is there a way to do this? I tried the obvious thing: allocate the working memory in the first call to forward, set a flag, and have subsequent calls check this flag. It turns out this does not work: the working memory does not persist between calls. It is almost as if the whole CustomOp object were created from scratch for every call.
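
To make the pattern concrete, here is a minimal framework-free sketch of the lazy allocation I mean. It is not my actual operator: the class `LazyWorkspaceOp`, the helper `_get_workspace`, the `num_allocs` counter and the use of a `bytearray` as the working memory are all hypothetical stand-ins, and the "computation" is a dummy. It only illustrates why the approach depends on the same operator object surviving between calls.

```python
class LazyWorkspaceOp:
    """Hypothetical stand-in for a custom operator (not the framework class)."""

    def __init__(self):
        self._workspace = None  # working memory, allocated on first use
        self.num_allocs = 0     # counts how often we had to allocate

    def _get_workspace(self, nbytes):
        # Allocate only if no buffer exists yet or the required size changed.
        if self._workspace is None or len(self._workspace) != nbytes:
            self._workspace = bytearray(nbytes)
            self.num_allocs += 1
        return self._workspace

    def forward(self, in_data):
        work = self._get_workspace(8 * len(in_data))  # size derived from input
        work[0] = 1  # pretend to use the scratch space
        return [2.0 * x for x in in_data]  # dummy computation

    def backward(self, out_grad):
        work = self._get_workspace(8 * len(out_grad))  # same buffer is re-used
        work[0] = 2
        return [2.0 * g for g in out_grad]


# Case 1: the same operator object is used for every call.
op = LazyWorkspaceOp()
for _ in range(3):
    y = op.forward([1.0, 2.0, 3.0])
    op.backward([1.0, 1.0, 1.0])
reused_allocs = op.num_allocs

# Case 2: a new operator object is created for every call, which is
# what the behaviour I observe looks like.
fresh_allocs = 0
for _ in range(3):
    tmp = LazyWorkspaceOp()
    tmp.forward([1.0, 2.0, 3.0])
    fresh_allocs += tmp.num_allocs
```

In the first case the buffer is allocated once in total; in the second it is allocated again on every call, because the flag and the buffer live on an object that does not survive.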

Or has a newer concept replaced CustomOp, one that would allow me to allocate working memory once and re-use it?