Memory leak when running CPU inference



I’m running into a memory leak when performing inference with an MXNet model (i.e. converting an image buffer to a tensor and running one forward pass through the model).

A minimal reproducible example is below:

import mxnet
from gluoncv import model_zoo
from gluoncv.data.transforms.presets import ssd

model = model_zoo.get_model('ssd_512_resnet50_v1_coco', pretrained=True)

for _ in range(100000):
  # note: an example imgbuf string is too long to post
  # see gist or use requests etc to obtain
  imgbuf = 
  ndarray = mxnet.image.imdecode(imgbuf, to_rgb=1)
  tensor, orig = ssd.transform_test(ndarray, 512)
  labels, confidences, bboxs = model.forward(tensor)

The result is a linear increase in RSS memory (from 700 MB up to 10 GB+).

Libraries used: gluoncv==0.3.0, mxnet-mkl==1.3.1

The problem persists with other pretrained models and with a custom model that I am trying to use. Inspecting with Python's gc module does not show any growth in the number of tracked objects.

This gist has the full code snippet including an example imgbuf.


This is very likely due to you enqueueing ops faster than MXNet is able to process them.
MXNet is fundamentally asynchronous: when you call forward, you are effectively saying "compute this forward pass as soon as possible", and the Python call returns immediately. That allows very simple and intuitive parallelism, but it also means a tight loop can queue work faster than the engine can execute it.
To properly benchmark you need to add a synchronous call.
For example mx.nd.waitall(), labels.wait_to_read(), or bboxs.asnumpy().
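
For example, here is the loop from above with a synchronization point added (a sketch; only the last line is new):

for _ in range(100000):
    ndarray = mxnet.image.imdecode(imgbuf, to_rgb=1)
    tensor, orig = ssd.transform_test(ndarray, 512)
    labels, confidences, bboxs = model.forward(tensor)
    # block until the engine has drained its queue, so intermediate
    # buffers can be reclaimed before the next iteration
    mxnet.nd.waitall()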


Hey, thanks for the quick reply.

You are right: in the example above, adding the synchronous call stops the memory from increasing.

In my actual use case (which I tried to simplify above, though clearly not faithfully!) I already had this in place, and I am still seeing a constant memory increase. My program uses a queue system to feed image buffers to a function that does the tensor transformation and forward pass, then puts the result back on a different queue (rough sketch below). If I run the same pipeline without the MXNet component (e.g. the function returns a fake result, or does some ML work using a different library such as PyTorch), memory is stable.
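
The worker is roughly shaped like this (a heavily simplified sketch with made-up names, not my real code):

import mxnet
from gluoncv.data.transforms.presets import ssd

def worker(in_queue, out_queue, model):
    # pulls raw image buffers off one queue, pushes predictions onto another
    while True:
        imgbuf = in_queue.get()
        if imgbuf is None:  # sentinel to shut down
            break
        ndarray = mxnet.image.imdecode(imgbuf, to_rgb=1)
        tensor, orig = ssd.transform_test(ndarray, 512)
        labels, confidences, bboxs = model.forward(tensor)
        # asnumpy() is synchronous, so the engine queue is drained here
        out_queue.put((labels.asnumpy(), confidences.asnumpy(), bboxs.asnumpy()))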

Any ideas on what may be causing this? Or do you know if there is a way to force MXNet to release all memory?



Could you share a bigger snippet of your code?
MXNet should release the memory once it is out of scope and gets garbage collected.
My hunch is that you are calling nd.array somewhere and keeping a reference to that object.
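
To illustrate what I mean (a contrived example, not your code): if each iteration keeps a reference to an NDArray in a long-lived container, none of those arrays can ever be collected and RSS grows, even though each iteration looks self-contained:

import mxnet as mx

results = []  # long-lived container

for i in range(100000):
    x = mx.nd.random.uniform(shape=(3, 512, 512))
    y = (x * 2).asnumpy()  # synchronous, so there is no engine backlog
    # keeping a reference to x prevents its buffer from being freed;
    # store y (a plain numpy copy) instead and memory stays flat
    results.append(x)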


Sorry for hijacking.
I have a similar problem, where I repeatedly call a function that loads a model and returns a prediction, and memory keeps increasing with the number of calls to that function.

Here is some stripped-down example code:

import mxnet as mx
import numpy as np
import cv2

CTX = mx.cpu()

def resize(img, img_dims):
    img = cv2.resize(img, (img_dims[0], img_dims[1]))
    img = np.swapaxes(img, 0, 2)  # HWC -> CWH
    img = np.swapaxes(img, 1, 2)  # CWH -> CHW
    img = img[np.newaxis, :].astype(np.float32) / 255.0  # add batch dim, scale to [0, 1]
    return mx.nd.array(img)

def predict():
    img = cv2.imread('/path/to/some/image.jpg')
    small_img = resize(img.copy(), (224, 224))
    model_name = "/path/to/model.json"
    model_params = "/path/to/model.params"
    # the model is deliberately (re)loaded on every call, so it goes out
    # of scope when the function returns
    model = mx.gluon.nn.SymbolBlock.imports(model_name, ['data'], model_params, ctx=CTX)
    return model(small_img).asnumpy()

def main(repeats=3):
    for i in range(repeats):
        result = predict()

if __name__ == '__main__':
    main()

Versions: mxnet 1.3.0, Python 3.6.6.

The idea was to load and predict inside a function so that memory would be freed once the function returns and the model/data go out of scope.
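
For comparison, here is a variant that loads the model once at module level, reusing resize and CTX from above (a sketch with the same placeholder paths). It gives up the scoping idea, but only the prediction work happens per call:

MODEL = mx.gluon.nn.SymbolBlock.imports(
    "/path/to/model.json", ['data'], "/path/to/model.params", ctx=CTX)

def predict_once():
    img = cv2.imread('/path/to/some/image.jpg')
    small_img = resize(img.copy(), (224, 224))
    # reuse the already-loaded model instead of importing it per call
    return MODEL(small_img).asnumpy()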