I’m wondering if there’s a bug in the Factorization Machine code (15.9). The forward method of FM ends with
npx.sigmoid(x), so it appears to return a probability. However, the loss function is
SigmoidBinaryCrossEntropyLoss, which according to the documentation uses
from_sigmoid=False by default, so it interprets the predictions as log odds, not probabilities.
In practice, most of the predictions from
net(X) are extremely close to 0 or 1. That is what you would expect if the loss is treating them as log odds (and so keeps pushing the values near 0 lower and the values near 1 higher) while the forward method constrains them to lie in (0, 1).
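Here is a minimal NumPy sketch (not the book's MXNet code) of why that combination hurts. `bce_from_logits` below imitates what `SigmoidBinaryCrossEntropyLoss` does with `from_sigmoid=False`: it applies a sigmoid to its input internally. If `forward` has already applied `npx.sigmoid`, the loss ends up being computed on sigmoid(sigmoid(x)):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce_from_logits(z, y):
    # Mimics from_sigmoid=False: apply sigmoid internally,
    # then binary cross-entropy against the labels.
    p = sigmoid(z)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

z = np.array([-4.0, 0.0, 4.0])  # raw scores (log odds)
y = np.array([0.0, 0.0, 1.0])

correct = bce_from_logits(z, y)          # loss computed on the logits
double = bce_from_logits(sigmoid(z), y)  # loss when forward already sigmoided

# sigmoid maps (0, 1) into roughly (0.5, 0.73), so the "double" loss can
# never get small even for confident correct predictions -- gradient
# descent keeps pushing the pre-sigmoid scores toward +/- infinity,
# which matches the saturated 0/1 predictions observed.
```

If this diagnosis is right, the fix would be either to drop the `npx.sigmoid` call in `forward`, or to construct the loss with `SigmoidBinaryCrossEntropyLoss(from_sigmoid=True)`.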
Relatedly, I think the
d2l.accuracy call in the training loop might not be right here either: it seems to expect
y_hat to be hard class labels rather than log odds or probabilities.
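One way to handle that would be to threshold the probabilities into hard 0/1 labels before comparing. A sketch, where `accuracy_from_probs` is a hypothetical helper and not part of the d2l library:

```python
import numpy as np

def accuracy_from_probs(probs, labels, threshold=0.5):
    # Hypothetical helper: convert probabilities to hard 0/1 predictions
    # before comparing, rather than passing probabilities straight
    # through as if they were class labels.
    preds = (probs >= threshold).astype(labels.dtype)
    return float((preds == labels).mean())

probs = np.array([0.9, 0.2, 0.6, 0.4])
labels = np.array([1.0, 0.0, 0.0, 0.0])
print(accuracy_from_probs(probs, labels))  # 0.75
```

If the sigmoid were removed from `forward` instead, the equivalent threshold on the raw log odds would be 0 rather than 0.5.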