I am saving the symbol and parameters of a hybridized Gluon model in Python as follows:
net.hybridize()
out_sym = net(mx.sym.Variable("data"))
out_sym.save(sym_path)
net.collect_params().save(params_path)
Further, I am able to load the symbol and params in Python again (although there might be a more elegant way to do this) with
sym = mx.sym.load(sym_path)
params = mx.nd.load(params_path)
mod = mx.mod.Module(sym, label_names=[])
mod.bind(data_shapes=[('data', data.shape)])
mod.init_params()
(args, auxs) = mod.get_params()
mod.set_params(params, auxs)
mod.forward(mx.io.DataBatch(data=[data]))
out = mod.get_outputs()[0]
However, I failed to load and use the symbol and params in C++. I implemented the following code based on the examples in the repository:
// Load the network structure and parameters
mxnet::cpp::Context ctx_gpu(mxnet::cpp::kGPU, 0);
mxnet::cpp::Symbol net = mxnet::cpp::Symbol::Load(model_path);
std::map<std::string, mxnet::cpp::NDArray> params;
mxnet::cpp::NDArray::Load(param_path, 0, &params);
std::map<std::string, mxnet::cpp::NDArray> args_map;
std::map<std::string, mxnet::cpp::NDArray> aux_map;
for (const auto &k : params) {
  if (k.first.substr(0, 4) == "aux:") {
    auto name = k.first.substr(4, k.first.size() - 4);
    aux_map[name] = k.second.Copy(ctx_gpu);
  } else if (k.first.substr(0, 4) == "arg:") {
    auto name = k.first.substr(4, k.first.size() - 4);
    args_map[name] = k.second.Copy(ctx_gpu);
  }
}
//Variant 1
mxnet::cpp::Executor* executor = net.SimpleBind(ctx_gpu, args_map);
executor->Forward(false);
//Variant 2
std::vector<mxnet::cpp::NDArray> arg_arrays;
std::vector<mxnet::cpp::NDArray> grad_arrays;
std::vector<mxnet::cpp::OpReqType> grad_reqs;
std::vector<mxnet::cpp::NDArray> aux_arrays;
std::map<std::string, mxnet::cpp::NDArray> arg_grad_store;
std::map<std::string, mxnet::cpp::OpReqType> grad_req_type;
net.InferExecutorArrays(ctx_gpu, &arg_arrays, &grad_arrays, &grad_reqs, &aux_arrays, args_map, arg_grad_store, grad_req_type, aux_map);
auto executor = net.Bind(ctx_gpu, arg_arrays, grad_arrays, grad_reqs, aux_arrays);
executor->Forward(false);
Both variants crash on execution with a cuDNN error (Check failed: e == CUDNN_STATUS_SUCCESS (8 vs. 0) cuDNN: CUDNN_STATUS_EXECUTION_FAILED) when run on the GPU, or with a segfault on the CPU. If I set MXNET_ENGINE_TYPE to NaiveEngine, the example runs, but the output is garbage (it contains NaNs, ...).
What am I missing in the C++ API?
Edit: It seems that reading the params is already problematic, as the names carry no "arg:" or "aux:" prefix. But loading all params into the args_map still fails with the same errors.
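(For reference, the key-splitting logic the C++ loop above implements can be checked in isolation. A parameter file written by collect_params().save() yields bare names, while one written by export() yields "arg:"/"aux:"-prefixed ones; a pure-Python sketch that also falls back to treating unprefixed keys as args:)

```python
def split_params(params):
    """Split a loaded parameter dict into arg/aux maps by key prefix."""
    args, auxs = {}, {}
    for name, value in params.items():
        if name.startswith("arg:"):
            args[name[4:]] = value
        elif name.startswith("aux:"):
            auxs[name[4:]] = value
        else:
            # collect_params().save() writes raw names with no prefix
            args[name] = value
    return args, auxs
```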