Probability And Statistics—Contingency table analysis (2008)

Abstract
We have developed a method for recommending items that combines content and collaborative data under a single probabilistic framework. We benchmark our algorithm against a naïve Bayes classifier on the cold-start problem, where we wish to recommend items that no one in the community has yet rated. We systematically explore three testing methodologies using a publicly available data set, and explain how these methods apply to specific real-world applications. We advocate heuristic recommenders when benchmarking to give competent baseline performance. We introduce a new performance metric, the CROC curve, and demonstrate empirically that the various components of our testing strategy combine to obtain deeper understanding of the performance characteristics of recommender systems. Though the emphasis of our testing is on cold-start recommending, our methods for recommending and evaluation are general.
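
The cold-start setting described in the abstract can be illustrated with a small sketch of how an evaluation split might be built so that no test item has any rating in the training data. This is only an illustration under stated assumptions, not the authors' procedure: the function name `cold_start_split`, the `(user_id, item_id)` rating format, and the `test_fraction` and `seed` parameters are all hypothetical.

```python
import random
from typing import List, Set, Tuple

# Hypothetical rating record: (user_id, item_id) means "this user rated this item".
Rating = Tuple[str, str]


def cold_start_split(
    ratings: List[Rating], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Rating], List[Rating], Set[str]]:
    """Hold out whole items, so every test item is unrated in the training set.

    Assumption: splitting is done by item, not by individual rating, which is
    one simple way to mimic "items that no one in the community has yet rated".
    """
    # Sort before shuffling so the result depends only on the seed.
    items = sorted({item for _, item in ratings})
    rng = random.Random(seed)
    rng.shuffle(items)

    # Always hold out at least one item.
    n_test = max(1, int(len(items) * test_fraction))
    test_items = set(items[:n_test])

    train = [r for r in ratings if r[1] not in test_items]
    test = [r for r in ratings if r[1] in test_items]
    return train, test, test_items


if __name__ == "__main__":
    demo = [
        ("u1", "a"), ("u1", "b"), ("u2", "b"),
        ("u2", "c"), ("u3", "c"), ("u3", "d"),
    ]
    train, test, held_out = cold_start_split(demo, test_fraction=0.5)
    print("held-out items:", sorted(held_out))
    print("train:", train)
    print("test:", test)
```

Because the held-out items have no training ratings, a purely collaborative recommender has nothing to work with for them; only content features of those items could inform a prediction, which is the situation the abstract calls cold-start recommending.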

Publication details
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.62.6237
Source http://www.cis.upenn.edu/datamining/Publications/p8734-schein.pdf
Contributors CiteSeerX
Repository CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Keywords Algorithms, Experimentation, Performance; recommender systems, collaborative filtering, content-based filtering, information retrieval, graphical models, performance
Type text
Language English
Relation 10.1.1.136.4322, 10.1.1.30.6583, 10.1.1.21.4665, 10.1.1.109.6332, 10.1.1.33.3584, 10.1.1.29.1951, 10.1.1.18.7888, 10.1.1.38.5499, 10.1.1.30.9676, 10.1.1.36.4620, 10.1.1.42.639, 10.1.1.38.744, 10.1.1.43.3696, 10.1.1.38.6498, 10.1.1.33.4026, 10.1.1.97.9813, 10.1.1.46.3659, 10.1.1.28.4279, 10.1.1.36.6762, 10.1.1.16.6429, 10.1.1.10.6090, 10.1.1.24.6153, 10.1.1.28.3100, 10.1.1.21.8200