Speeding up Machine Translation with RNNs


The machine translation example is didactically very sound, but it is also very slow.

What should be done to speed it up?

I’ve tried rewriting everything to HybridBlocks, and I also rewrote the iteration in batch_loss to use F.contrib.foreach, which should work for both types of blocks. The results are still pretty slow, though.

Is there any other way to make this faster, other than rewriting the decoder to unroll over the time dimension instead of decoding step by step?

Of course, I have already rewritten the code to use GPU acceleration, which wasn’t featured in the original notebook.

I know that the Seq2Seq Machine Translation script from GluonNLP is way faster, but it is also way more complex than the d2l example, and, as witnessed in this issue, there is no tutorial that makes the details clear.


I’ve found the problem: the MT example uses LSTMs, which are meant to be unrolled over the time dimension, not LSTMCells as in the GNMT example.