Performance


Topic Replies Activity
Best practices when deploying an MXNet model 3 May 28, 2018
Data copy between cpu and gpu in jetson TX1 2 May 28, 2018
The GPU memory usage is not stable 4 May 12, 2018
Performance issue of BatchNorm with use_global_stats=True 2 May 10, 2018
8x inference runtime difference between pip install and manual install 7 May 2, 2018
Is released python package in pypi compiled with tcmalloc or jemalloc? 2 March 16, 2018
How to scale a symbol 5 April 19, 2018
SSD Finetuning with Resnet50 3 April 12, 2018
Embedding size too big for GPU memory 2 March 27, 2018
Timing for Each Layer 1 March 12, 2018
Dot product on fp16 for simple networks 1 March 11, 2018
Documentation Request: Model Parallelism Tutorial 7 March 10, 2018
Lazy update with Adam optimizer is much slower for sparse input 2 March 10, 2018
Rcnn forward slow during distributed training 0.12 5 February 27, 2018
System crashes when running mxnet-ssd learning on multiple GPUs 1 February 13, 2018
Forward pass performance (for one image) is quite slow. Concerns mxnet 0.11.0 3 January 23, 2018
Simple network does not learn on own images 1 January 5, 2018
Marginal performance improvement with Titan V (volta) + CUDA 9 + CUDNN 7 4 December 29, 2017
MxNet (Python) version of Keras MLP doesn't learn 2 December 20, 2017
How to use argsort to zero out a matrix 2 December 19, 2017
Nd.array() not scalable, fails on large array size 7 November 18, 2017
Kvstore for distributed multi-gpu training 11 November 16, 2017
Very low CPU utilization 4 October 20, 2017
Memory profiling for MxNet 5 October 11, 2017
Training is faster when get_params() is called every mini-batch 2 October 9, 2017