MXNet quantization Int8 support with MKLDNN

I applied MXNet quantization to an FCN model that requires Int8 (as opposed to Uint8) because it has negative weights and applies batch-norm to the input data. I was able to run quantization with the MKLDNN backend, but got a runtime error during inference (see below).
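For reference, this is roughly how the quantization pass is being invoked (a minimal sketch; the checkpoint prefix, epoch and calibration settings below are placeholders, not the exact values I use):

```python
import logging
import mxnet as mx
from mxnet.contrib.quantization import quantize_model

# Placeholder checkpoint for the FCN model.
sym, arg_params, aux_params = mx.model.load_checkpoint('fcn', 0)

qsym, qarg_params, qaux_params = quantize_model(
    sym=sym, arg_params=arg_params, aux_params=aux_params,
    ctx=mx.cpu(),                # MKLDNN path runs on CPU
    quantized_dtype='int8',      # signed type, needed because of the negative inputs/weights
    calib_mode='none',           # calibration settings omitted here
    logger=logging)

mx.model.save_checkpoint('fcn-quantized', 0, qsym, qarg_params, qaux_params)
# Inference is then run on the quantized symbol with the MKLDNN subgraph
# backend enabled (MXNET_SUBGRAPH_BACKEND=MKLDNN), which is where the error
# below is raised.
```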

It seems that MKLDNN 0.17 does support Int8 convolutions - https://github.com/intel/mkl-dnn/releases
I pulled the latest (commit 830a10059a018cd2634d94195140cf2d8790a75a) and rebuilt, but I'm still getting the runtime error.

Is the error originating from MKLDNN (i.e. no Int8 convolution support), or from some kind of protection inside the mxnet library?


```
    mod.update_metric(m, batch.label)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/module/module.py", line 773, in update_metric
    self.exec_group.update_metric(eval_metric, labels, pre_sliced)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/module/executor_group.py", line 639, in update_metric
    eval_metric.update_dict(labels, preds)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/metric.py", line 132, in update_dict
    self.update(label, pred)
  File "/usr/local/lib/python2.7/dist-packages/mxnet/metric.py", line 418, in update
    pred_label = pred_label.asnumpy().astype('int32')
  File "/usr/local/lib/python2.7/dist-packages/mxnet/ndarray/ndarray.py", line 1980, in asnumpy
    ctypes.c_size_t(data.size)))
  File "/usr/local/lib/python2.7/dist-packages/mxnet/base.py", line 252, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [19:39:11] src/operator/subgraph/mkldnn/mkldnn_conv.cc:346: Can't handle negetive value for QuantizeData

Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x1d2674) [0x7fd8db1c2674]
[bt] (1) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x1d2a71) [0x7fd8db1c2a71]
[bt] (2) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x25d723) [0x7fd8db24d723]
[bt] (3) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x2c0cee4) [0x7fd8ddbfcee4]
[bt] (4) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x2c15093) [0x7fd8ddc05093]
[bt] (5) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x2bf4784) [0x7fd8ddbe4784]
[bt] (6) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x2bf88c2) [0x7fd8ddbe88c2]
[bt] (7) /usr/local/lib/python2.7/dist-packages/mxnet/libmxnet.so(+0x2bf4ea4) [0x7fd8ddbe4ea4]
[bt] (8) /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb8c80) [0x7fd8fafacc80]
[bt] (9) /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7fd900c656ba]

Process finished with exit code 1
```

I looked into the code where the error is raised, and it seems that this is still a work in progress:

```cpp
        // TODO: Support int8 input when mkldnn supports.
        LOG(FATAL) << "Can't handle negetive value for QuantizeData";
}
```

Int8 is still a work in progress, and it'll probably be a while until it's fully tested and implemented. If possible, I'd recommend using uint8 as the data type for quantization; that should be a fairly well-supported path. Having negative weights in your model isn't an issue: uint8 quantization will account for that, and it shouldn't affect predictions in most cases.
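If it helps, the only change relative to an int8 run should be the quantized_dtype argument of mxnet.contrib.quantization.quantize_model. A minimal sketch, with a placeholder checkpoint prefix and calibration left out:

```python
import logging
import mxnet as mx
from mxnet.contrib.quantization import quantize_model

# Placeholder checkpoint; substitute your FCN symbol/params.
sym, arg_params, aux_params = mx.model.load_checkpoint('fcn', 0)

qsym, qarg_params, qaux_params = quantize_model(
    sym=sym, arg_params=arg_params, aux_params=aux_params,
    ctx=mx.cpu(),
    quantized_dtype='uint8',   # unsigned data type; negative weights are still representable (see below)
    calib_mode='none',         # or 'naive'/'entropy' with calib_data for better scales
    logger=logging)
```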

I believe the GEMM called by MKL is similar to the method described here: https://github.com/google/gemmlowp/blob/master/doc/quantization.md (aka 'gemmlowp'). A higher-level description is available here: https://sahnimanas.github.io/2018/06/24/quantization-in-tf-lite.htm
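To make that concrete, here is a small numpy sketch of the affine scheme (my own toy example, not the actual MKL code path): each tensor gets a scale and a zero point, negative reals map to uint8 codes below the zero point, and the integer GEMM subtracts the zero points before accumulating in int32.

```python
import numpy as np

def affine_quantize_uint8(x):
    """Map a float array to uint8 with a scale and zero point (gemmlowp-style)."""
    lo, hi = min(float(x.min()), 0.0), max(float(x.max()), 0.0)  # keep 0.0 exactly representable
    scale = (hi - lo) / 255.0
    zero_point = int(round(-lo / scale))
    q = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point

rng = np.random.RandomState(0)
w = rng.randn(8, 8).astype(np.float32)   # weights with negative values
a = rng.rand(4, 8).astype(np.float32)    # non-negative activations

qw, sw, zw = affine_quantize_uint8(w)
qa, sa, za = affine_quantize_uint8(a)

# Integer GEMM with int32 accumulation, then rescale back to real values:
#   a @ w.T  ~=  sa * sw * (qa - za) @ (qw - zw).T
acc = (qa.astype(np.int32) - za) @ (qw.astype(np.int32) - zw).T
approx = sa * sw * acc
print(np.abs(approx - a @ w.T).max())    # small quantization error
```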

I believe this GEMM implicitly expects the uint8-quantized inputs to represent the source inputs and weights in real space with an affine transformation applied to them, as sketched above. For example, I was able to run the quantization script with uint8 on the pre-trained ResNet 50 and ResNet 100 models from the model zoo, which do have negative weights, and inference with those models worked as expected. Are you able to give quantization a shot using uint8?