I have been using Gluon to implement Q1.4 Modeling for MNIST-fashion. The loss goes down after each epoch, but both the training and test accuracy stay at 0 for every epoch. Do you know what might be causing this?
I think the net is outputting raw real-valued scores rather than labels. So the accuracy comes out as 0 because values like 3.2 or -11.1 (arbitrary examples, no meaning) are being compared directly to the labels (-1 or 1), and they never match exactly.
I ended up passing these outputs through the sigmoid function, mapping each one to -1 if the post-sigmoid value is less than 0.5 and to 1 otherwise, and then comparing those predictions with the labels to determine accuracy.
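In case it helps, here is a minimal sketch of that thresholding step in plain Python. The function names (`sigmoid`, `predict_labels`, `accuracy`) and the sample scores and labels are made up for illustration; this is not the homework's reference code and it does not use Gluon itself.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_labels(scores, threshold=0.5):
    """Map raw network outputs to labels in {-1, 1} via the sigmoid."""
    return [1 if sigmoid(s) >= threshold else -1 for s in scores]

def accuracy(scores, labels):
    """Fraction of thresholded predictions that equal the labels."""
    preds = predict_labels(scores)
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)

# Hypothetical raw outputs and labels, for illustration only.
scores = [3.2, -11.1, 0.4, -0.7]
labels = [1, -1, -1, -1]
print(accuracy(scores, labels))  # 3 of the 4 predictions match the labels
```

Since the sigmoid is at least 0.5 exactly when its input is at least 0, checking the sign of the raw output gives the same predictions without computing the sigmoid at all. With Gluon the outputs would first need to be converted to something you can loop over (for example with `asnumpy()` on the NDArray).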
I recommend looking back at the first problem in Homework 3. Think of o in that context as f(x_i), so that when you pass the output of the network through the loss, you are effectively computing -\log p(y|x), where p(y|x) is given by the sigmoid function.
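To make that connection concrete, here is a small numerical check. It assumes the usual convention for labels in {-1, 1}, namely p(y|x) = sigmoid(y * o); I have not confirmed that this is exactly how Homework 3 writes it, so treat the formula and the helper names (`prob`, `logistic_loss`) as assumptions.

```python
import math

def prob(o, y):
    """Assumed model: p(y|x) = sigmoid(y * o) for a label y in {-1, 1}."""
    return 1.0 / (1.0 + math.exp(-y * o))

def logistic_loss(o, y):
    """-log p(y|x) = log(1 + exp(-y * o)), computed in a stable form."""
    z = -y * o
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

# Hypothetical network outputs, for illustration only.
for o in [3.2, -11.1, 0.0]:
    for y in [1, -1]:
        # The two printed values should agree up to rounding error.
        print(o, y, logistic_loss(o, y), -math.log(prob(o, y)))
```

Under this assumption the two probabilities p(1|x) and p(-1|x) sum to 1, and predicting the label with the larger probability is the same as thresholding the sigmoid of o at 0.5.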