Item-item collaborative filtering
Item-item collaborative filtering, also called item-based or item-to-item collaborative filtering, is a form of collaborative filtering based on the similarity between items, calculated from people's ratings of those items.
Item-item collaborative filtering was invented and used by Amazon.com in 1998.[1] It was first published at an academic conference in 2001. The authors of that paper, Badrul Sarwar, George Karypis, Joseph Konstan, and John Riedl, won the 2016 Test of Time Award for their paper "Item-based collaborative filtering recommendation algorithms". The International World Wide Web Conference committee stated that "this outstanding paper has had a considerable real-world impact".[2]
Earlier collaborative filtering systems based on rating similarity between users (known as user-user collaborative filtering) had several problems:
- systems performed poorly when they had many items but comparatively few ratings
- computing similarities between all pairs of users was expensive
- user profiles changed quickly and the entire system model had to be recomputed
Item-item models resolve these problems in systems that have more users than items. Item-item models use rating distributions per item, not per user. With more users than items, each item tends to have more ratings than each user, so an item's average rating usually doesn't change quickly. This leads to more stable rating distributions in the model, so the model doesn't have to be rebuilt as often. When a user consumes and then rates an item, the items most similar to it are looked up in the existing system model and added to that user's recommendations.
Method
First, the system executes a model-building stage by finding the similarity between all pairs of items. This similarity function can take many forms, such as the correlation between the items' ratings or the cosine of their rating vectors. As in user-user systems, similarity functions can use normalized ratings (correcting, for instance, for each user's average rating).
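The model-building stage can be illustrated with a minimal sketch. It assumes cosine similarity over ratings that have been corrected for each user's average, which is only one of the options mentioned above. The toy ratings and the function names `mean_center`, `item_similarity`, and `build_model` are hypothetical and chosen for illustration only.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "alice": {"A": 5, "B": 4, "C": 1},
    "bob":   {"A": 4, "B": 5, "C": 2},
    "carol": {"A": 1, "B": 2, "C": 5},
}

def mean_center(ratings):
    """Subtract each user's average rating from that user's ratings."""
    centered = {}
    for user, items in ratings.items():
        mean = sum(items.values()) / len(items)
        centered[user] = {item: r - mean for item, r in items.items()}
    return centered

def item_similarity(centered, i, j):
    """Cosine similarity of items i and j over the users who rated both."""
    common = [u for u, items in centered.items() if i in items and j in items]
    dot = sum(centered[u][i] * centered[u][j] for u in common)
    norm_i = sqrt(sum(centered[u][i] ** 2 for u in common))
    norm_j = sqrt(sum(centered[u][j] ** 2 for u in common))
    if norm_i == 0 or norm_j == 0:
        return 0.0  # no usable overlap between the two items
    return dot / (norm_i * norm_j)

def build_model(ratings):
    """Compute the similarity for every ordered pair of distinct items."""
    centered = mean_center(ratings)
    items = sorted({i for user_items in ratings.values() for i in user_items})
    return {
        (i, j): item_similarity(centered, i, j)
        for i in items for j in items if i != j
    }

model = build_model(ratings)
```

In this toy data, items A and B are rated alike by every user and so receive a positive similarity, while A and C are rated in opposite ways and receive a negative one. A production system would typically keep only the most similar neighbours of each item rather than the full pairwise table.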
Second, the system executes a recommendation stage. It uses the items most similar to those a user has already rated to generate a list of recommendations. Usually this calculation is a weighted sum or linear regression. This form of recommendation is analogous to "people who rate item X highly, like you, also tend to rate item Y highly, and you haven't rated item Y yet, so you should try it".
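The weighted-sum variant of the recommendation stage can be sketched as follows. The similarity table is a hypothetical, precomputed model with made-up values, and the helper names `predict` and `recommend` are illustrative. As a simplification, this sketch uses only neighbours with positive similarity; other formulations divide by the sum of absolute similarities instead.

```python
# Hypothetical precomputed model: (item, item) -> similarity
similarity = {
    ("Y", "X"): 0.9, ("X", "Y"): 0.9,
    ("Y", "Z"): 0.2, ("Z", "Y"): 0.2,
    ("W", "X"): 0.1, ("X", "W"): 0.1,
    ("W", "Z"): 0.8, ("Z", "W"): 0.8,
}

def predict(user_ratings, similarity, target, k=2):
    """Predict a rating for `target` as a similarity-weighted average
    of the user's ratings on the k most similar rated items."""
    neighbours = [
        (similarity.get((target, item), 0.0), rating)
        for item, rating in user_ratings.items()
        if item != target
    ]
    # Keep the k most similar items, ignoring non-positive similarities.
    neighbours = sorted((n for n in neighbours if n[0] > 0), reverse=True)[:k]
    total = sum(s for s, _ in neighbours)
    if total == 0:
        return None  # nothing similar has been rated by this user
    return sum(s * r for s, r in neighbours) / total

def recommend(user_ratings, similarity, items, n=1, k=2):
    """Return the n unrated items with the highest predicted rating."""
    scored = []
    for item in items:
        if item in user_ratings:
            continue  # only recommend items the user has not rated
        p = predict(user_ratings, similarity, item, k)
        if p is not None:
            scored.append((p, item))
    scored.sort(reverse=True)
    return [item for _, item in scored[:n]]

user = {"X": 5, "Z": 2}  # the user rated X highly and Z poorly
top = recommend(user, similarity, ["W", "X", "Y", "Z"])
```

Here the user rated X highly, and Y is strongly similar to X, so Y receives the higher predicted rating (about 4.45, against about 2.33 for W) and is recommended first.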
Results
Item-item collaborative filtering had lower prediction error than user-user collaborative filtering. In addition, its less-dynamic model was computed less often and stored in a smaller matrix, so item-item systems performed better than user-user systems.
Further research
Many variations of item-item collaborative filtering exist. For instance, a method named Item2Vec[3] was introduced for scalable item-item collaborative filtering. Item2Vec produces low-dimensional representations of items, in which the affinity between items can be measured by cosine similarity. The method is based on Word2Vec, a method that has been applied successfully in natural language processing.
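Once item embeddings exist, looking up related items reduces to a nearest-neighbour search under cosine similarity. The sketch below shows only that lookup step and does not train Item2Vec. The three-dimensional vectors are hypothetical values invented for illustration, and `cosine` and `most_similar` are illustrative helper names.

```python
from math import sqrt

# Hypothetical embeddings; a trained model would supply real vectors.
embeddings = {
    "guitar":  [0.9, 0.1, 0.0],
    "amp":     [0.8, 0.2, 0.1],
    "blender": [0.0, 0.9, 0.4],
}

def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def most_similar(item, embeddings, top_n=1):
    """Return the top_n other items ranked by cosine similarity to `item`."""
    scores = [
        (cosine(embeddings[item], vec), other)
        for other, vec in embeddings.items()
        if other != item
    ]
    scores.sort(reverse=True)
    return [other for _, other in scores[:top_n]]

nearest = most_similar("guitar", embeddings)
```

With these made-up vectors, "amp" points in almost the same direction as "guitar" and is returned as its nearest neighbour, while "blender" is nearly orthogonal to it.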
Slope One is a family of item-item collaborative filtering algorithms designed to reduce model overfitting problems.
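As commonly presented, the weighted Slope One scheme predicts a rating from the average rating difference between pairs of items, weighting each pair by the number of users who rated both. The sketch below follows that description under the assumption of a small in-memory ratings dictionary. The toy data and the names `build_deviations` and `predict_slope_one` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: user -> {item: rating}
ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}

def build_deviations(ratings):
    """Average rating difference (i minus j) over users who rated both,
    plus the number of such users for each ordered pair."""
    diffs = defaultdict(float)
    counts = defaultdict(int)
    for items in ratings.values():
        for i, ri in items.items():
            for j, rj in items.items():
                if i != j:
                    diffs[(i, j)] += ri - rj
                    counts[(i, j)] += 1
    dev = {pair: diffs[pair] / counts[pair] for pair in diffs}
    return dev, counts

def predict_slope_one(user_ratings, target, dev, counts):
    """Weighted Slope One prediction of the user's rating for `target`."""
    num = 0.0
    den = 0
    for j, rj in user_ratings.items():
        c = counts.get((target, j), 0)
        if j != target and c:
            # Shift the user's rating of j by the average gap between
            # target and j, weighted by how many users support that gap.
            num += (rj + dev[(target, j)]) * c
            den += c
    return num / den if den else None

dev, counts = build_deviations(ratings)
lucy_a = predict_slope_one(ratings["lucy"], "A", dev, counts)
```

In this toy data, A is rated 0.5 higher than B on average (two users) and 3 higher than C (one user), so the prediction for lucy's rating of A is ((2 + 0.5) × 2 + (5 + 3) × 1) / 3, or about 4.33.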
Bibliography
- Sarwar, Badrul; Karypis, George; Konstan, Joseph; Riedl, John (2001). "Item-based collaborative filtering recommendation algorithms". Proceedings of the 10th international conference on the World Wide Web. ACM: 285–295. doi:10.1145/371920.372071. ISBN 1-58113-348-0.
- Linden, G; Smith, B; York, J (22 January 2003). "Amazon.com recommendations: item-to-item collaborative filtering". IEEE Internet Computing. IEEE. 7 (1): 76–80. doi:10.1109/MIC.2003.1167344. ISSN 1089-7801.
References
- [1] "Collaborative recommendations using item-to-item similarity mappings".
- [2] "University of Minnesota professors and alumnus win international award for groundbreaking recommender systems research: Recommender platforms first developed at the University of Minnesota are now used by Amazon and Netflix". University of Minnesota. March 24, 2016.
- [3] Barkan, O; Koenigstein, N (2016). "Item2Vec: Neural Item Embedding for Collaborative Filtering". arXiv:1603.04259.