Speeding up Machine Translation with RNNs

gpu
performance
docs
#1

The machine translation example is didactically very sound, but it runs very slowly.

What should be done to speed it up?

I’ve tried rewriting everything to HybridBlocks, and I also rewrote the iteration in batch_loss to use F.contrib.foreach, which should work for both types of blocks. The results are still pretty slow, though.

Is there any other way to make this faster, other than rewriting the decoder to operate over the time dimension instead of doing step-by-step decoding?

PS
Of course, I already rewrote the code to use GPU acceleration, which wasn’t featured in the original notebook.

PS2
I know that the Seq2Seq machine translation script from GluonNLP is way faster, but it is also much more complex than the d2l example, and, as witnessed in this issue, there doesn’t exist any tutorial that makes its details clear.

#2

I’ve found what the problem was: the MT example should use LSTMs, which are meant to be unrolled over the whole sequence, rather than LSTMCells as in the GNMT example.

#3

@lambdaofgod thanks for getting back to share your solution. Could you put your code here in case someone else would like to try your version? Thanks!

#4

I’m working on it. I thought about modifying the notebook and adding it to the d2l notebooks GitHub repo. Is that a good way to do it? Or should I just add it as a branch on the d2l book repo?