Accuracy drops when running inference inside the autograd.record scope

Hi everyone, I noticed a weird problem while implementing an adversarial attack model. Below is my main code:

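import mxnet as mx
from mxnet import gluon, autograd as ag

# net, valid_data and ctx (a list of contexts) are defined elsewhere in the full program
metric = mx.metric.Accuracy()
metric.reset()

for i, batch in enumerate(valid_data):
    # Split the data and labels across the devices in ctx
    data = gluon.utils.split_and_load(batch[0], ctx_list=ctx, batch_axis=0)
    label = gluon.utils.split_and_load(batch[1], ctx_list=ctx, batch_axis=0)

    # First metric update: plain forward pass, no autograd
    # output = [net(X) for X in data]
    # metric.update(labels=label, preds=output)

    # Second metric update: forward pass inside the autograd scope
    with ag.record():
        output = [net(X) for X in data]

    metric.update(labels=label, preds=output)

_, acc = metric.get()
print(acc)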

If I compute the accuracy with the first (commented-out) metric update, I get 0.9938 on MNIST. However, I only get 0.9886 if I compute the output inside ag.record(). The full program can be found on

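I found the answer. ag.record() defaults to train_mode=True, so the forward pass inside that scope runs in training mode. Layers such as Dropout and BatchNorm, if the network contains them, then behave differently than they do at inference time, which changes the outputs. When I set train_mode to False, i.e. ag.record(train_mode=False), I get the first result (0.9938) again.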