Hello, @hexiangnan
First of all, thank you for sharing your code.
I went through the code and explored the preprocessed data in the Data folder.
I found something strange: the negative samples for a user contain duplicates.
Section 4.1 (Evaluation Protocols) of the paper contains the following sentence:
we followed the common strategy [6, 21] that randomly samples 100 items that are not interacted by the user, ranking the test item among the 100 items.
Although you mentioned replacement in negative sampling, I think it is more reasonable to draw the negative samples without replacement for each user.
Otherwise, the NDCG on the test set can be overestimated.
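To make the suggestion concrete, here is a minimal sketch of drawing negatives without replacement. The function name `sample_negatives`, its parameters, and the toy item ids are hypothetical and not taken from the repository; `k=99` only follows the example later in this issue.

```python
import random


def sample_negatives(num_items, interacted, k=99, seed=None):
    """Draw k distinct item ids that the user has not interacted with."""
    rng = random.Random(seed)
    # Candidate pool: every item id the user has not interacted with.
    candidates = [i for i in range(num_items) if i not in interacted]
    if len(candidates) < k:
        raise ValueError("not enough non-interacted items to sample from")
    # random.sample draws without replacement, so no item id repeats.
    return rng.sample(candidates, k)


# Toy usage with made-up ids (assumption: item ids are 0..num_items-1).
negatives = sample_negatives(num_items=1000, interacted={3, 17, 42}, k=99, seed=0)
```

With this kind of sampling, the check `len(set(negatives)) == len(negatives)` holds by construction.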
For example, the following scenario can happen.
If the negative samples contain duplicated items, the recommended list can contain duplicated items as well.
# Suppose this is the top-10 recommended list for one positive and 99 negative samples drawn with replacement.
recs = [10, 11, 11, 11, 9, 29, 102, 204, 23, 2]
gt = [11]
ndcg(recs, gt)
The ndcg call above returns 1 / log2(1 + 2), because item 11 first appears at rank 2.
This NDCG is not reasonable because item 11 appears three times in the list, which means other items lose their chance to be recommended.
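The example can be made runnable with a small sketch. The `ndcg_at_k` below is an assumed single-relevant-item NDCG (1 / log2(rank + 1) at the first hit), not necessarily the exact function used in the repository, and `dedupe` is a hypothetical helper that only counts how many of the ten slots are taken by repeats.

```python
import math


def ndcg_at_k(recs, gt_item, k=10):
    """NDCG with one relevant item: 1 / log2(rank + 1) at its first hit, else 0."""
    for rank, item in enumerate(recs[:k], start=1):
        if item == gt_item:
            return 1.0 / math.log2(rank + 1)
    return 0.0


def dedupe(recs):
    """Drop repeated items, keeping the first occurrence and the original order."""
    return list(dict.fromkeys(recs))


recs = [10, 11, 11, 11, 9, 29, 102, 204, 23, 2]
score = ndcg_at_k(recs, 11)               # item 11 is first seen at rank 2
distinct = dedupe(recs)                   # the same list with repeats removed
wasted_slots = len(recs) - len(distinct)  # slots occupied by repeated items
```

Here two of the ten positions are occupied by repeats of item 11, so only eight distinct items are actually ranked.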
Summary
In general, a recommended list contains distinct items.
However, your test negative samples contain duplicated items for a user.
Please check with the following snippet, which reproduces the unreasonable behavior:
for uid, iid, label in test_loader:
    assert len(set(iid)) == len(iid)
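The same check can also be run directly on the raw negative-sample lines, without a data loader. The sketch below assumes each line is tab-separated, with a key in the first field and the negative item ids in the remaining fields; that layout, the function name `find_duplicate_negatives`, and the sample lines are assumptions for illustration, not verified against the files in the Data folder.

```python
from collections import Counter


def find_duplicate_negatives(lines):
    """Return {line_number: sorted repeated item ids} for lines with duplicates."""
    report = {}
    for line_no, line in enumerate(lines, start=1):
        fields = line.rstrip("\n").split("\t")
        # Assumed layout: fields[0] is the key, the rest are negative item ids.
        negatives = [int(x) for x in fields[1:] if x]
        repeated = sorted(i for i, c in Counter(negatives).items() if c > 1)
        if repeated:
            report[line_no] = repeated
    return report


# Made-up lines for illustration only; the first one repeats item 1064.
sample_lines = [
    "(0,25)\t1064\t174\t1064\t2791\n",
    "(1,133)\t1072\t3154\t3368\n",
]
duplicates = find_duplicate_negatives(sample_lines)
```

Passing the lines of an opened file instead of `sample_lines` would list every user row that contains repeated negatives.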