Recovery from MXNet checkpoint for wide & deep, accuracy decreases


#1

use example/sparse/wide_deep/train.py to train
e.g. in the first epoch the accuracy is 0.83, like below

2018-06-27 18:48:53,896 epoch 0, accuracy = 0.8382098765432099
2018-06-27 18:48:53,900 Saved checkpoint to "./checkpoint/checkpoint-0000.params"
2018-06-27 18:48:53,902 Saved optimizer state to "./checkpoint/checkpoint-0000.states"
2018-06-27 18:48:53,902 Training completed.

load the saved checkpoint and score with the validation dataset; the accuracy decreases to 0.79
INFO:logger:Finished inference with 16200 images
INFO:logger:Finished with 115108.471217 images per second
INFO:logger:('accuracy', 0.7991358024691358)


#2

@glingyan could you share your entire training and validation code so I can try to understand what happened?

A typical mistake here would be to compare the training accuracy (calculated using the training set you used to train) and the testing accuracy, computed with the unseen testing set.


#3

@ThomasDelteil thanks very much, I found the problem:
wide & deep needs to use the hash() function to preprocess the data before training, and the validation data was not preprocessed the same way before scoring
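To illustrate the kind of mismatch described above, here is a minimal sketch of hash-based feature bucketing (the function name, bucket count, and field names are illustrative, not taken from the wide & deep example). One caveat worth noting: Python's built-in hash() is salted per process via PYTHONHASHSEED, so bucket ids produced at training time can differ from those produced in a fresh inference process unless a stable hash is used.

```python
import hashlib

def hash_bucket(feature_value, num_buckets=1000):
    """Map a categorical string to a stable integer bucket id.

    A stable digest (MD5 here) is used instead of the built-in hash(),
    which is salted per process and would break checkpoint recovery
    across runs.
    """
    digest = hashlib.md5(feature_value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets

# Every categorical column must be hashed identically at training time
# and at scoring time; otherwise the restored model sees different
# feature ids and accuracy drops, as in this issue.
row = {"occupation": "Tech-support", "education": "Bachelors"}
hashed = {k: hash_bucket(v) for k, v in row.items()}
```

The key property is determinism across processes: scoring a checkpoint in a new process must reproduce exactly the bucket ids the model was trained on.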