Good example :)
Could we define a gluon dataset like the one below, so that negative sampling happens inside the training iterator instead of calling this function once per iteration? I think this would simplify the training process and reduce memory usage, especially with larger datasets such as MovieLens 20M.
def __getitem__(self, idx):
    if idx % (self.nb_neg + 1) == 0:
        idx = idx // (self.nb_neg + 1)
        return self.data[idx][0], self.data[idx][1], np.ones(1, dtype=np.float32).item()
    else:
        idx = idx // (self.nb_neg + 1)
        u = self.data[idx][0]
        j = mx.random.randint(0, self.nb_items).asnumpy().item()
        while (u, j) in self.mat:
            j = mx.random.randint(0, self.nb_items).asnumpy().item()
        return u, j, np.zeros(1, dtype=np.float32).item()
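For anyone who wants to try the idea without the rest of the training code, here is a minimal self-contained sketch of the same scheme. It is only an illustration under some assumptions: the class name `NegSamplingDataset` and the constructor arguments are hypothetical, `data` is assumed to be a list of (user, item) positive pairs, and Python's `random` module stands in for `mx.random` so that it runs without MXNet.

```python
import random


class NegSamplingDataset:
    """Hypothetical dataset that draws negatives lazily in __getitem__.

    Each positive (user, item) pair is followed by nb_neg sampled negatives,
    so nothing has to be pre-generated or stored per epoch.
    """

    def __init__(self, data, nb_items, nb_neg=4, seed=None):
        self.data = data              # assumed: list of (user, item) positives
        self.nb_items = nb_items      # total number of items
        self.nb_neg = nb_neg          # negatives per positive
        self.mat = set(data)          # set of observed pairs for fast lookup
        self.rng = random.Random(seed)

    def __len__(self):
        return len(self.data) * (self.nb_neg + 1)

    def __getitem__(self, idx):
        # Slot 0 of every group is the positive; the other slots are negatives.
        pos_idx, offset = divmod(idx, self.nb_neg + 1)
        u, i = self.data[pos_idx]
        if offset == 0:
            return u, i, 1.0
        # Rejection sampling: redraw until the pair is unobserved.
        # Note: this never terminates if the user has interacted with every item.
        j = self.rng.randrange(self.nb_items)
        while (u, j) in self.mat:
            j = self.rng.randrange(self.nb_items)
        return u, j, 0.0
```

With `nb_neg=2`, indices 0, 3, 6, ... return the observed pairs with label 1.0, and the indices in between return freshly sampled negatives for the same user, so the negatives change every time an index is read.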
Thank you very much for providing the code. I will give it a test! If it helps, we will revise the section.
Thank you so much. I will try it.
Hi, I think the ‘num_users’ in the evaluate_ranking function should be ‘num_items’:
def evaluate_ranking(net, test_input, seq, candidates, num_users, num_items,
                     ctx):
    ranked_list, ranked_items, hit_rate, auc = {}, {}, [], []
    # all_items = set([i for i in range(num_users)])
    all_items = set([i for i in range(num_items)])
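To see why this matters, here is a quick check using the MovieLens 100K sizes (943 users, 1682 items). It is just an illustration of the mismatch, not part of the book's code; the variable names `wrong`, `right` and `missing` are made up for the example.

```python
num_users, num_items = 943, 1682  # MovieLens 100K sizes

wrong = set(range(num_users))   # item universe built from the user count
right = set(range(num_items))   # item universe built from the item count

# Item ids that the num_users version silently leaves out of the ranking.
missing = right - wrong
print(len(missing), min(missing), max(missing))
```

Every item id from `num_users` upward is absent from `wrong`, so those items could never be ranked or counted during evaluation.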
You have a typo - eastimating should be estimating. Your typo did however make me hungry
I have a question about “negative sampling”. The book says:
samples negative items randomly for each user from the candidate set of that user.
But the code says otherwise: all negative samples are drawn from exactly the items that are NOT in that candidate set:
list(self.all - set(self.cand[int(self.users[idx])]))
And the formula for computing AUC also uses I \ S_u to exclude the candidate set.
How can I make sense of it?
Agreed.
According to
- the definition of function load_data_ml100k
- this line
users_train, items_train, ratings_train, candidates = d2l.load_data_ml100k(train_data, num_users, num_items, feedback="implicit")
candidates should be the list of items users have interacted with.
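A small sketch of what the code is doing, under that reading of `candidates`: the toy `candidates` dict and the helper `sample_negative` below are hypothetical and only meant to mirror the `self.all - set(self.cand[...])` expression quoted above.

```python
import random

num_items = 6
all_items = set(range(num_items))

# Hypothetical toy data: items each user has already interacted with.
candidates = {0: [1, 3], 1: [0, 2, 5]}


def sample_negative(user, rng=random):
    """Draw one negative item for `user` from the items they have NOT seen."""
    # Same set difference as in the quoted code: all items minus the
    # user's interacted items (sorted only to make the pool indexable).
    neg_pool = sorted(all_items - set(candidates[user]))
    return rng.choice(neg_pool)


print(sample_negative(0))  # one of 0, 2, 4, 5
print(sample_negative(1))  # one of 1, 3, 4
```

So the negatives are the items outside the user's interaction list, which matches the I \ S_u term in the AUC formula; the sentence in the book seems to be the part that is worded the other way round.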
Hi, thanks for sharing the detailed information. Could you provide an example of how to make predictions once the model is built? Maybe we could use MovieLens itself. It would mean a lot, or just point me to some reference.
Thanks in advance