Hi everyone,
My training stopped for some reason, and now that I am trying to resume it, the learning rate won't change!
I am following the Classification example here.
I believe I set all the parameters correctly. They are as follows:
DTYPE=float16
BATCHSIZE=384
WORKER=20
EPOCH=187
CHECKPOINT=params_model_mixup/0.3399-imagenet-186-best.states
PARAMS=params_model_mixup/0.3399-imagenet-186-best.params
python train_imagenet.py \
--rec-train /media/void/SSD/ImageNet_DataSet/train/rec_train/train.rec --rec-train-idx /media/void/SSD/ImageNet_DataSet/train/rec_train/train.idx \
--rec-val /media/void/SSD/ImageNet_DataSet/train/rec_val/val.rec --rec-val-idx /media/void/SSD/ImageNet_DataSet/train/rec_val/val.idx \
--model model --mode hybrid \
--lr 0.4 --lr-mode cosine --num-epochs 200 --batch-size $BATCHSIZE --num-gpus 1 -j $WORKER \
--use-rec --dtype $DTYPE --warmup-epochs 0 --no-wd --label-smoothing --mixup \
--save-dir params_model_mixup \
--logging-file model_mixup.log --resume-states $CHECKPOINT --resume-params $PARAMS --resume-epoch $EPOCH
As you can see below, the learning rate stays at the same value from batch to batch:
|Epoch[187] Batch [49]|Speed: 492.394147 samples/sec|rmse=0.019614|lr=0.004371|
|Epoch[187] Batch [99]|Speed: 603.372949 samples/sec|rmse=0.019578|lr=0.004371|
|Epoch[187] Batch [149]|Speed: 604.314057 samples/sec|rmse=0.019593|lr=0.004371|
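For reference, here is a minimal sketch of what I would expect from a cosine schedule at these batches. It assumes the standard cosine annealing formula, stepped once per batch, with no warmup and a final learning rate of 0. The helper name `cosine_lr` and the training set size are my own assumptions, not something taken from train_imagenet.py:

```python
import math


def cosine_lr(base_lr, step, total_steps, final_lr=0.0):
    """Cosine annealing from base_lr at step 0 down to final_lr at total_steps."""
    progress = min(step / total_steps, 1.0)
    return final_lr + (base_lr - final_lr) * (1 + math.cos(math.pi * progress)) / 2


# Assumption: about 1.28M training images, so 1281167 // 384 batches per epoch.
iters_per_epoch = 1281167 // 384
total_steps = 200 * iters_per_epoch  # --num-epochs 200

# Batches 49, 99 and 149 of epoch 187, matching the log lines above.
batches = (49, 99, 149)
lrs = [cosine_lr(0.4, 187 * iters_per_epoch + b, total_steps) for b in batches]
for b, lr in zip(batches, lrs):
    print(f"Epoch[187] Batch [{b}] expected lr={lr:.6f}")
```

Under these assumptions the three values should each be slightly smaller than the previous one, instead of being identical as in my log.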
What am I missing here?
Any help is greatly appreciated.