Performance of Symbol vs. NDArray vs. PyTorch


The documentation shows that hybridizing gives nearly a 2x performance boost, so I was wondering how each mode compares to other imperative frameworks, particularly PyTorch. It seems to me that PyTorch’s imperative paradigm is similar to using NDArray, so is using Symbol then twice as fast as PyTorch?


Borealis AI did a comparison between Gluon and PyTorch recently which you might find useful:


Thanks! That is a very interesting article. I really wish they had explored hybridization, although since they were using an RNN it’s understandable.


I created this very simple benchmark based on the original hybridization benchmark:

Results on my MacBook Pro
Processor: 2.5 GHz Intel Core i7

 Device: CPU
 Framework      Paradigm        Precision     Time
 MXNet          imperative      32 bit        1171
 MXNet          imperative      16 bit         701
 MXNet          symbolic        32 bit         557
 MXNet          symbolic        16 bit         813
 PyTorch        imperative      32 bit         697
 PyTorch        imperative      16 bit         ---

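At 32-bit precision, PyTorch is faster than MXNet's imperative mode but slower than its symbolic mode. That makes sense: PyTorch is imperative only, so it’s optimized for that case. At 16 bit the picture is different: MXNet's imperative mode is roughly on par with 32-bit PyTorch, and symbolic mode is actually slower than imperative mode on this CPU.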
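To put numbers on that reading, here is a small sketch that computes speedup ratios from the table above. The times are copied from the table as-is. The dictionary layout and the `speedup` helper are only illustrative names, and since the table gives no unit for Time, only the ratios are meaningful.

```python
# Times copied from the CPU results table (unit not stated in the table).
times = {
    ("MXNet", "imperative", 32): 1171,
    ("MXNet", "imperative", 16): 701,
    ("MXNet", "symbolic", 32): 557,
    ("MXNet", "symbolic", 16): 813,
    ("PyTorch", "imperative", 32): 697,
}


def speedup(baseline, candidate):
    """How many times faster `candidate` is than `baseline` (>1 means faster)."""
    return times[baseline] / times[candidate]


# Hybridization gain within MXNet at 32 bit.
hybrid_32 = speedup(("MXNet", "imperative", 32), ("MXNet", "symbolic", 32))
# MXNet symbolic relative to PyTorch at 32 bit.
symbolic_vs_torch = speedup(("PyTorch", "imperative", 32), ("MXNet", "symbolic", 32))
# PyTorch relative to MXNet imperative at 32 bit.
torch_vs_imperative = speedup(("MXNet", "imperative", 32), ("PyTorch", "imperative", 32))
# Hybridization "gain" at 16 bit (below 1 means symbolic was slower).
hybrid_16 = speedup(("MXNet", "imperative", 16), ("MXNet", "symbolic", 16))

print(f"MXNet symbolic vs imperative (32 bit): {hybrid_32:.2f}x")
print(f"MXNet symbolic vs PyTorch (32 bit):    {symbolic_vs_torch:.2f}x")
print(f"PyTorch vs MXNet imperative (32 bit):  {torch_vs_imperative:.2f}x")
print(f"MXNet symbolic vs imperative (16 bit): {hybrid_16:.2f}x")
```

So on this one CPU run, hybridizing gives roughly the 2x the documentation mentions at 32 bit, but symbolic mode is only about a quarter faster than PyTorch, not twice as fast.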

I hope to try it on a GPU soon, especially on a P100 and a V100, to better test the effects of half precision (16 bit).
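For anyone who wants to collect comparable timings, below is a minimal, framework-agnostic timing sketch. It is not the benchmark script linked above: `time_calls`, `dummy_forward`, and the iteration counts are hypothetical and chosen only for illustration. The optional `sync` argument matters because both frameworks can run work asynchronously, so the timer should be stopped only after pending work has finished. For MXNet you could pass `mx.nd.waitall`, and for PyTorch on a GPU `torch.cuda.synchronize`.

```python
import time


def time_calls(fn, n_iters=100, n_warmup=10, sync=None):
    """Return wall-clock seconds spent on n_iters calls of fn.

    sync, if given, is called before starting and before stopping the
    timer so that asynchronously queued work is included in the result.
    """
    # Warm-up runs are excluded from the measurement (caches, lazy init,
    # and graph construction on the first hybridized call).
    for _ in range(n_warmup):
        fn()
    if sync is not None:
        sync()

    start = time.perf_counter()
    for _ in range(n_iters):
        fn()
    if sync is not None:
        sync()
    return time.perf_counter() - start


# Stand-in workload so the sketch runs without any framework installed;
# replace it with a forward pass of the network being benchmarked.
def dummy_forward():
    return sum(i * i for i in range(1000))


elapsed = time_calls(dummy_forward, n_iters=50, n_warmup=5)
print(f"50 iterations took {elapsed:.4f} s")
```

Using the same harness for the imperative, hybridized, and PyTorch versions of a model keeps the warm-up and synchronization handling identical across all three.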