Unable to load Gluon HybridSequential model (in Python)

Hi,

I’m creating a Gluon model as follows:

    model = mx.gluon.nn.HybridSequential()
    with model.name_scope():
        model.add(mx.gluon.rnn.LSTM(m_nodes, dropout=0.2, layout='NTC'))
        model.add(mx.gluon.rnn.LSTM(n_nodes, dropout=0.2, layout='NTC'))
        model.add(mx.gluon.nn.Dense(o_nodes, flatten=True))

    model.hybridize()
    model.collect_params().initialize(mx.init.Normal(sigma=1.), ctx=model_ctx)

I can train it successfully and save its params and symbol using:

    model.export(best_model_path, epoch=epoch)

However, when I try to load it back with

    gluon.nn.SymbolBlock.imports(
        os.path.join(model_dir, symbol_file),
        ['data'],
        os.path.join(model_dir, params_file),
        ctx=mx.cpu())

I receive the following error:

    ERROR:root:Parameter 'hybridsequential0_lstm0_l0_i2h_weight' is missing in file '/tmp/model_name.params', which contains parameters: '0._unfused.0.i2h_weight', '0._unfused.0.h2h_bias', '2.weight', ..., '1._unfused.0.i2h_weight', '2.bias', '1._unfused.0.h2h_weight', '0._unfused.0.h2h_weight'. Please make sure source and target networks have the same prefix.

I’m using MXNet 1.3.1 (from mx.__version__). Any clues about what I might be doing wrong here?

Seems to work fine for me. I’m using version 1.4 though, so maybe give that a try. I based my test code on your example but filled in a few of the blanks.

Script to save model:

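    import mxnet as mx

    # Two stacked LSTM layers followed by a dense output layer.
    model = mx.gluon.nn.HybridSequential()
    with model.name_scope():
        model.add(mx.gluon.rnn.LSTM(5, dropout=0.2, layout='NTC'))
        model.add(mx.gluon.rnn.LSTM(6, dropout=0.2, layout='NTC'))
        model.add(mx.gluon.nn.Dense(7, flatten=True))

    ctx = mx.cpu()
    model.hybridize()
    model.initialize(mx.init.Normal(sigma=1.), ctx=ctx)

    # Run one forward pass so the hybridized graph is built before exporting.
    data = mx.nd.random.uniform(shape=(2, 3, 4))  # (batch, time, channels) for layout='NTC'
    outputs = model(data)

    # Writes test.ckp-symbol.json and test.ckp-0000.params.
    model.export('test.ckp', epoch=0)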

Script to load model:

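    import mxnet as mx

    # Files written by the export call in the save script.
    params_file = 'test.ckp-0000.params'
    symbol_file = 'test.ckp-symbol.json'
    ctx = mx.cpu()

    # 'data' is the name of the input variable in the exported symbol.
    new_model = mx.gluon.nn.SymbolBlock.imports(symbol_file, ['data'], params_file, ctx=ctx)
    data = mx.nd.random.uniform(shape=(2, 3, 4))
    outputs = new_model(data)
    outputs.wait_to_read()  # block until the asynchronous computation finishes
    print(outputs.shape)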
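
One more thing that might be worth checking, going only by the error text in the question: the keys it lists ('0._unfused.0.i2h_weight', '2.weight', and so on) look like the structural names that save_parameters writes, whereas a file written by export normally stores names with an 'arg:' or 'aux:' prefix. If that reading is right, the .params file being loaded may not be the one export produced. Below is a rough sketch for telling the two styles apart. guess_param_naming is a hypothetical helper, not part of MXNet, and its rules are a heuristic. For a real file, the keys could be read with mx.nd.load(params_file).keys().

```python
def guess_param_naming(keys):
    """Heuristically guess how the keys of a .params file were named.

    Returns 'export' if any key carries an 'arg:' or 'aux:' prefix,
    'structural' if every key starts with a numeric child index such as
    '0.' or '2.', and 'unknown' otherwise. This is an assumption-based
    check, not an official MXNet rule.
    """
    keys = list(keys)
    if any(k.startswith(('arg:', 'aux:')) for k in keys):
        return 'export'
    if keys and all(k.split('.', 1)[0].isdigit() for k in keys):
        return 'structural'
    return 'unknown'


# Only the keys that are visible in the error message from the question.
keys_from_error = [
    '0._unfused.0.i2h_weight', '0._unfused.0.h2h_bias', '2.weight',
    '1._unfused.0.i2h_weight', '2.bias', '1._unfused.0.h2h_weight',
    '0._unfused.0.h2h_weight',
]

print(guess_param_naming(keys_from_error))
```

If the file turns out to use structural names, rebuilding the network in code and loading the file into it, instead of going through SymbolBlock.imports, may be the better match. That is a suggestion to try, not a confirmed fix.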