How does state work?

I am working on a time series model and want to stack multiple Conv2DLSTMCell layers. Since MXNet's ConvLSTM cell requires “state” as one of its inputs, how can I build the model below in MXNet?


Here is my code:

def evaluate_accuracy(model, dataloader ,state):

    eval_metrics_1 = mx.metric.MAE()
    eval_metrics_2 = mx.metric.MSE()
    eval_metrics_3 = mx.metric.RMSE()
    eval_metrics = mx.metric.CompositeEvalMetric()
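    for child_metric in [eval_metrics_1, eval_metrics_2, eval_metrics_3]:
        # Register each metric with the composite so update() reaches all of them
        eval_metrics.add(child_metric)
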
    for i, (data, label) in enumerate(dataloader):
        data = data.as_in_context(ctx)
        label = label.as_in_context(ctx)
        preds,state=model(data, state)

        eval_metrics.update(labels=label, preds=preds)
    return eval_metrics.get() , state

class Net(gluon.HybridBlock):

    def __init__(self, **kwargs):
        super(Net, self).__init__(**kwargs)
        with self.name_scope():

            self.cnn0 = mx.gluon.contrib.rnn.Conv2DLSTMCell(input_shape=(n_input, rows, columns),hidden_channels=n_input, activation='relu',i2h_kernel=(3,3),i2h_pad=(1,1), h2h_kernel=(3,3))
            #self.cnn1 = mx.gluon.contrib.rnn.Conv2DLSTMCell(input_shape=(n_input, rows, columns),hidden_channels=n_input, activation='relu',i2h_kernel=(5,5),i2h_pad=(2,2), h2h_kernel=(5,5))
            self.rnn1 = mx.gluon.rnn.LSTM(rnn_size, n_layer, 'NTC',bidirectional=True)
            self.dense1 = mx.gluon.nn.Dense(rows*columns)
    def hybrid_forward(self, F, x,state):

        x,state= self.cnn0(x,state)
        x= self.rnn1(x)
        x= x.reshape(shape=(batch_size, -1,columns))        
        return x,state

def fit(model):

    train_loss = []
    val_loss = []

    for e in range(epochs):
        tick = time.time()
        for i, (data, label) in enumerate(train_iter):
            data = data.as_in_context(ctx)
            label = label.as_in_context(ctx)
            state = model.cnn0.begin_state(batch_size=batch_size, ctx=ctx)

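            with autograd.record():
                Y_pred,state= model(data,state)
                loss =loss1(Y_pred, label) 
            # Backpropagate and update the parameters (uses the global trainer)
            loss.backward()
            trainer.step(batch_size)
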
    return train_loss, val_loss, model, state

ctx = mx.gpu()
net = Net()
net.collect_params().initialize(mx.init.Xavier(), ctx=ctx)
loss1 = mx.gluon.loss.L1Loss()
trainer = mx.gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': lr})
t_loss,v_loss, net,state=fit(net)
test=evaluate_accuracy(net, test_iter,state)

How do I stack multiple layers? I don’t know whether the state is passed correctly on every iteration. My problem is only with the states: do I have to pass the state of the previous layer (i.e. cnn0) to the next layer (i.e. cnn1)? Also, am I passing the state properly to the evaluate_accuracy function?
Please help
Thank you!

Hi @komal,

Check out mx.gluon.rnn.SequentialRNNCell() and mx.gluon.rnn.HybridSequentialRNNCell() for a convenient way of handling state between stacked RNN cells, including Conv2DLSTMCell. You can then call begin_state once for all the cells in the Sequential container. Check out this great example from @NRauschmayr that shows how to use stacked Conv2DLSTMCells.

Some useful snippets from that include the model definition…

class convLSTMAE(gluon.nn.HybridBlock):
    def __init__(self, **kwargs):
        super(convLSTMAE, self).__init__(**kwargs)
        with self.name_scope():
          self.encoder = ...
          self.temporal_encoder = gluon.rnn.HybridSequentialRNNCell()
          self.temporal_encoder.add(gluon.contrib.rnn.Conv2DLSTMCell((64,26,26), 64, 3, 3, i2h_pad=1))
          self.temporal_encoder.add(gluon.contrib.rnn.Conv2DLSTMCell((64,26,26), 32, 3, 3, i2h_pad=1))
          self.temporal_encoder.add(gluon.contrib.rnn.Conv2DLSTMCell((32,26,26), 64, 3, 3, i2h_pad=1))
          self.decoder = ...

    def hybrid_forward(self, F, x, states=None, **kwargs):
        x = self.encoder(x)
        x, states = self.temporal_encoder(x, states)
        x = self.decoder(x)
        return x, states

And the training loop, managing states once…

for epoch in range(num_epochs):
    for image in dataloader:
        image  = image.as_in_context(ctx)
        states = model.temporal_encoder.begin_state(func=mx.nd.zeros, batch_size=batch_size, ctx=ctx)
        with mx.autograd.record():
            reconstructed, states = model(image, states)
            loss = l2loss(reconstructed, image)
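
On whether cnn0's state has to be handed to cnn1: as far as I understand the Sequential container, each cell keeps its own slice of one flat state list, and only the output of a layer is fed to the next one. Here is a rough framework-free sketch of that routing. ToyCell and ToySequentialCell are made-up stand-ins (not MXNet classes) and the update rule is arbitrary; only the state bookkeeping is the point.

```python
# Hypothetical stand-ins for illustration only; these are not MXNet classes.
class ToyCell:
    """Fake recurrent cell with two state entries (h, c), like an LSTM cell."""

    def begin_state(self):
        # One [h, c] pair per cell, initialised to zero.
        return [0.0, 0.0]

    def __call__(self, x, states):
        h, c = states
        c = c + x        # toy update: accumulate the input in c
        h = 0.5 * c      # toy output derived from c
        return h, [h, c]


class ToySequentialCell:
    """Fake container that stacks cells and routes a flat state list."""

    def __init__(self, cells):
        self.cells = cells

    def begin_state(self):
        # Concatenate every child's initial state into one flat list.
        states = []
        for cell in self.cells:
            states.extend(cell.begin_state())
        return states

    def __call__(self, x, states):
        next_states = []
        pos = 0
        for cell in self.cells:
            n = len(cell.begin_state())           # entries owned by this cell
            x, new = cell(x, states[pos:pos + n])  # output feeds the next cell
            pos += n
            next_states.extend(new)
        return x, next_states


stack = ToySequentialCell([ToyCell(), ToyCell()])  # think cnn0, cnn1
states = stack.begin_state()       # 4 entries: [h0, c0, h1, c1]
out, states = stack(2.0, states)   # one time step through both layers
```

So the second layer never sees the first layer's state, only its output, and you carry a single list around in the training loop.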

With regard to state in your evaluation function, you shouldn’t need to pass in states. You should call begin_state from within your evaluation function, and not persist state between training and testing.
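
A minimal sketch of that evaluation pattern, kept framework-free so it runs on its own. ToyRecurrentModel and evaluate_mae are hypothetical names, and plain Python lists stand in for NDArrays; with your network you would call model.cnn0.begin_state(batch_size=batch_size, ctx=ctx) at the marked line and keep using your mx.metric objects.

```python
class ToyRecurrentModel:
    """Hypothetical stand-in for a stateful network; not an MXNet block."""

    def begin_state(self, batch_size):
        # Fresh zero state, one entry per sample in the batch.
        return [0.0] * batch_size

    def __call__(self, data, states):
        # Toy recurrence: prediction = input + carried state.
        preds = [x + s for x, s in zip(data, states)]
        return preds, preds


def evaluate_mae(model, dataloader, batch_size):
    """Mean absolute error, creating a fresh state for every batch."""
    abs_err, count = 0.0, 0
    for data, label in dataloader:
        # State is created here, not passed in from training.
        states = model.begin_state(batch_size=batch_size)
        preds, _ = model(data, states)
        abs_err += sum(abs(p - y) for p, y in zip(preds, label))
        count += len(label)
    return abs_err / count


# Two toy batches of (data, label) pairs.
batches = [([1.0, 2.0], [1.0, 2.5]), ([3.0, 4.0], [3.0, 4.0])]
mae = evaluate_mae(ToyRecurrentModel(), batches, batch_size=2)
```

Because the state is reset per batch, the second batch is scored without any leftovers from the first, and the function signature no longer needs a state argument.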


It worked. Thank you so much. You have been of great help.