PyTorch BatchNorm1d equivalent in MXNet Gluon

I want to reproduce the following PyTorch behavior in MXNet Gluon (or simply in MXNet). What is the simplest snippet of code that does this?

(In particular, I want to center and normalize a batch of input data.)

The PyTorch code produces the following output (input mean, then output mean):

tensor([-0.2708, -0.2600])
tensor([-0.0000, -0.0000])

Code:

import torch

num_examples = 10
num_features = 2
_input = torch.randn(num_examples, num_features)

print(_input.mean(dim=0))

m = torch.nn.BatchNorm1d(num_features, affine=False)
_output = m(_input)
print(_output.mean(dim=0))

The MXNet code (on a separate input) produces the following output, where the mean is essentially unchanged:

[-0.06546137 -0.14399691]
[-0.06546105 -0.14399621]

Code:

import mxnet as mx

num_examples = 10
num_features = 2

_input = mx.nd.random_normal(shape=(num_examples, num_features))

print(_input.mean(axis=0))
bn = mx.gluon.nn.BatchNorm(center=True)
bn.initialize()
_output = bn.forward(_input)
print(_output.mean(axis=0))

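You need to call the layer inside the autograd recording scope; otherwise the batch statistics will not be computed. Outside autograd.record() the layer runs in inference mode and normalizes with its running statistics (initialized to mean 0 and variance 1), which is why your output is almost identical to your input. Note that center=True is already the default and only enables the learnable offset beta; it does not control whether the batch is centered.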

from mxnet import autograd

with autograd.record():  # training mode: normalize with batch statistics
    _output = bn(_input)
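
To see why the recording scope matters, here is a minimal NumPy sketch of the two modes. It is an illustration of the arithmetic, not the library's implementation; the helper names batch_norm_train and batch_norm_inference are hypothetical, and eps=1e-5 and the initial running statistics (mean 0, variance 1) are assumed defaults.

```python
import numpy as np

def batch_norm_train(x, eps=1e-5):
    """Normalize each feature (column) with the statistics of the current batch."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)  # biased (population) variance over the batch
    return (x - mean) / np.sqrt(var + eps)

def batch_norm_inference(x, running_mean, running_var, eps=1e-5):
    """Normalize with stored running statistics instead of batch statistics."""
    return (x - running_mean) / np.sqrt(running_var + eps)

rng = np.random.default_rng(0)
x = rng.normal(size=(10, 2))  # 10 examples, 2 features

train_out = batch_norm_train(x)
# Assumed initial running statistics of a fresh layer: mean 0, variance 1.
infer_out = batch_norm_inference(x, np.zeros(2), np.ones(2))

print(x.mean(axis=0))          # input mean per feature
print(train_out.mean(axis=0))  # close to zero
print(infer_out.mean(axis=0))  # nearly the same as the input mean
```

In the inference branch every value is only divided by sqrt(1 + eps), which matches the tiny difference between the two lines of MXNet output in the question.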