Factorization Machines


I’m wondering if there’s a bug in the Factorization Machine code (15.9)? The forward method of FM calls npx.sigmoid(x), so it appears to return a probability. However, the loss function is SigmoidBinaryCrossEntropyLoss, which according to the documentation uses from_sigmoid=False by default, meaning it applies a sigmoid internally and interprets the predictions as log odds, not probabilities. So the sigmoid ends up being applied twice.
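If I’ve understood the mismatch correctly, it can be reproduced outside MXNet. The sketch below is plain numpy; sigmoid_bce is my own hypothetical re-implementation of the documented loss semantics (not the Gluon code itself). It shows that feeding a probability to a loss that expects logits leaves a large loss floor even for a confident, correct prediction:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_bce(pred, label, from_sigmoid=False):
    # Hypothetical mirror of the documented semantics of Gluon's
    # SigmoidBinaryCrossEntropyLoss: with from_sigmoid=False the
    # prediction is treated as a logit and squashed internally.
    p = pred if from_sigmoid else sigmoid(pred)
    eps = 1e-12  # numerical guard against log(0)
    return -(label * np.log(p + eps) + (1 - label) * np.log(1 - p + eps))

# A confident, correct prediction: the model already emitted a probability.
prob = 0.99
label = 1.0

loss_mismatched = sigmoid_bce(prob, label, from_sigmoid=False)  # sigmoid applied twice
loss_correct = sigmoid_bce(prob, label, from_sigmoid=True)

print(loss_mismatched)  # stays around 0.32 even though the model is "sure"
print(loss_correct)     # ~0.01, as expected for a near-perfect prediction
```

With the mismatch, sigmoid(0.99) ≈ 0.729, so the loss can never get close to zero no matter how confident the model is.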

In practice, most of the predictions from net(X) are extremely close to 0 or 1. This is what you would expect if the loss is interpreting them as log odds (and therefore tries to drive the negative-class scores lower and the positive-class scores higher) while the forward method constrains the values to lie in (0, 1).
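The saturation also makes sense numerically: if the output already lies in (0, 1) and the loss squashes it again, the implied probabilities can only span sigmoid(0) to sigmoid(1), so the optimizer’s only way to separate the classes is to push the outputs to the very edges of the interval. A quick numpy check (my own sketch, not the book’s code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Outputs constrained to (0, 1), as produced by a forward pass that
# already applied a sigmoid.
outputs = np.linspace(1e-6, 1 - 1e-6, 1000)

# Treating those outputs as logits, the loss sees probabilities only in
# roughly (0.5, 0.731) -- a very narrow band.
implied = sigmoid(outputs)
print(implied.min())  # ~0.5
print(implied.max())  # ~0.731
```

Since the implied probability for a "0" example can never drop below 0.5, the gradient keeps pushing the already-squashed outputs toward the boundary, which matches the near-0/near-1 predictions observed.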

Relatedly, I think the d2l.accuracy call in the training loop may also be wrong here: it seems to expect y_hat to contain class labels rather than log odds or probabilities.
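One way to handle that would be to threshold the probabilities into hard labels before computing accuracy. This is a hypothetical helper of my own (binary_accuracy is not a d2l function), just to illustrate the idea:

```python
import numpy as np

def binary_accuracy(probs, labels, threshold=0.5):
    # Hypothetical helper: convert probabilities into hard 0/1 predictions
    # before comparing, instead of handing raw probabilities (or logits)
    # to an accuracy routine that expects class labels.
    preds = (np.asarray(probs) >= threshold).astype(np.asarray(labels).dtype)
    return float((preds == np.asarray(labels)).mean())

probs = np.array([0.92, 0.41, 0.77, 0.08])
labels = np.array([1.0, 0.0, 0.0, 0.0])
print(binary_accuracy(probs, labels))  # 0.75
```

If the forward method were changed to return raw logits instead (the other way to fix the loss mismatch), the threshold would need to be 0 rather than 0.5.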