What makes the RNN layer unable to hybridize?


Follow-up to "How to use hybridization for Language modelling (RNN)".

If the RNN/LSTM… cells are capable of hybridization, what part of the RNN layer (https://github.com/apache/incubator-mxnet/blob/master/python/mxnet/gluon/rnn/rnn_layer.py) makes it unable to hybridize (on GPU specifically)?


When you use cells directly, the sequence length that the symbolic computational graph can handle is fixed once you hybridize. For sentence embedding or language models, for example, every sentence you pass into the model must then have the same length (basically, you need to either pad or trim each sequence to fit your model). This is because the computational graph contains the unrolled RNN, with one copy of the cell per time step, and a graph has to be static to be hybridizable.
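The padding or trimming mentioned above can be sketched in plain Python. This is only an illustration: `pad_or_trim` and the pad value `0` are hypothetical choices made for this example, not part of MXNet.

```python
def pad_or_trim(tokens, length, pad_value=0):
    """Return a copy of `tokens` with exactly `length` items.

    Hypothetical helper for illustration only; not an MXNet function.
    """
    if len(tokens) >= length:
        # Too long (or exactly right): keep the first `length` items.
        return list(tokens[:length])
    # Too short: append the pad value until the target length is reached.
    return list(tokens) + [pad_value] * (length - len(tokens))


# Two sentences of different lengths, forced to the fixed length 5
# that an unrolled, hybridized graph would have been built for.
batch = [[4, 8, 15], [16, 23, 42, 7, 9, 3]]
fixed = [pad_or_trim(sentence, 5) for sentence in batch]
print(fixed)  # [[4, 8, 15, 0, 0], [16, 23, 42, 7, 9]]
```

In a real model you would usually also want to mask the padded positions so that they do not contribute to the loss.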

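The FusedRNN operator is a single operator that runs the entire forward pass of the RNN (for example an LSTM) over the whole sequence internally. This means that the computational graph, which is made up of operators, can stay static and still deal with variable sequence lengths: the sequence length is just part of the input shape of one operator, not part of the graph structure.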
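To make the difference concrete, here is a toy model of the two graph shapes. It is a sketch under strong simplifications: a "graph" is just a list of node names, and `unrolled_graph` and `fused_graph` are made-up functions that do not reflect MXNet's actual internals.

```python
def unrolled_graph(seq_len):
    """Toy graph for cells unrolled over time: one node per time step."""
    return ["lstm_cell_t%d" % t for t in range(seq_len)]


def fused_graph(seq_len):
    """Toy graph for a fused operator: always a single node.

    `seq_len` is deliberately unused, because the length only affects
    the shape of the operator's input, not the graph itself.
    """
    return ["fused_rnn"]


for seq_len in (3, 7):
    print(seq_len, len(unrolled_graph(seq_len)), len(fused_graph(seq_len)))
# 3 3 1
# 7 7 1
```

The unrolled graph changes whenever the sequence length changes, so it cannot be built once and reused, whereas the fused graph is identical for every length.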

Right now, the FusedRNN operator is only implemented in cuDNN (by NVIDIA), and MXNet takes advantage of that when you use CUDA. When you do not use CUDA, the RNN layer still has to work with variable-length inputs, so it unrolls the RNN using the shape of the data on every forward call. That is why this computational graph cannot be represented statically (i.e. it isn't hybridizable).