I tried running densenet121 from the Gluon model zoo for the first time since upgrading to 1.5.0. This is a standard pip3 install; I have not built anything on this computer. If I set ignore_stale_grad=True, it doesn't work well, and there's a strange imbalance in GPU usage: one GPU is running at 80% and the other at 30%. The data is a standard rec file made from Caltech256. I have had this notebook since version 1.4.1 came out and have never had any problems. I should note that I do create a kvstore device and set kv=kvstore in the trainer.