Exploring Rated Datasets with Rating Maps.


S Amer-Yahia, S Kleisarchaki, NK Kolloju… - Proceedings of the 26th …, 2017 - dl.acm.org

S Amer-Yahia, S Kleisarchaki, NK Kolloju, LVS Lakshmanan, RH Zamar

Proceedings of the 26th International Conference on World Wide Web, 2017 (dl.acm.org)

Online rated datasets have become a source for large-scale population studies for analysts and a means for end-users to accomplish routine tasks such as finding a book club. Existing systems, however, provide only limited insight into the opinions of different segments of the rater population. In this paper, we develop a framework for finding and exploring population segments and their opinions. We propose rating maps, collections of (population segment, rating distribution) pairs, where a segment, e.g., {18-29 year old males in CA}, has a rating distribution in the form of a histogram that aggregates its ratings for a set of items (e.g., movies starring Russell Crowe). We formalize the problem of building rating maps dynamically given desired input distributions. This problem raises two challenges: (i) choosing an appropriate measure for comparing rating distributions, and (ii) designing efficient algorithms to find segments. We show that the Earth Mover's Distance (EMD) is well suited to comparing rating distributions and prove that finding segments whose rating distributions are close to the input ones is NP-complete. We propose an efficient algorithm for building Partition Decision Trees, together with heuristics for combining the resulting partitions to further improve their quality. Our experiments on real and synthetic datasets validate the utility of rating maps for both analysts and end-users.
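To make the notions of a rating map and an EMD comparison concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's method: it assumes a 5-star scale, uses the standard one-dimensional form of EMD (the sum of absolute differences between cumulative distributions, with unit distance between adjacent ratings), and finds the closest segment by brute-force lookup instead of the Partition Decision Trees the abstract describes. The function names and the sample records are hypothetical.

```python
from collections import Counter

SCALE = (1, 2, 3, 4, 5)  # assumption: ratings on a 5-star scale


def rating_histogram(ratings):
    """Return the normalized histogram of a list of ratings over SCALE."""
    counts = Counter(ratings)
    total = len(ratings)
    return [counts[s] / total for s in SCALE]


def emd_1d(p, q):
    """EMD between two normalized histograms over the same ordered bins.

    With unit distance between adjacent bins, this equals the sum of
    absolute differences between the two cumulative distributions.
    """
    carried = 0.0  # mass that still has to be moved to later bins
    cost = 0.0
    for pi, qi in zip(p, q):
        carried += pi - qi
        cost += abs(carried)
    return cost


def build_rating_map(records, segment_key):
    """Group (rater_attributes, rating) records into segment histograms."""
    by_segment = {}
    for attrs, rating in records:
        by_segment.setdefault(segment_key(attrs), []).append(rating)
    return {seg: rating_histogram(rs) for seg, rs in by_segment.items()}


def closest_segment(rating_map, target):
    """Brute-force search for the segment whose histogram is nearest in EMD."""
    return min(rating_map, key=lambda seg: emd_1d(rating_map[seg], target))


# Hypothetical records: (rater attributes, rating for some item set).
records = [
    ({"age": "18-29", "gender": "M", "state": "CA"}, 5),
    ({"age": "18-29", "gender": "M", "state": "CA"}, 4),
    ({"age": "30-44", "gender": "F", "state": "NY"}, 1),
    ({"age": "30-44", "gender": "F", "state": "NY"}, 2),
]

rating_map = build_rating_map(
    records, lambda a: (a["age"], a["gender"], a["state"])
)
target = [0.0, 0.0, 0.0, 0.0, 1.0]  # desired input distribution: all 5-star
best = closest_segment(rating_map, target)
```

Unlike a bin-by-bin comparison, EMD accounts for how far rating mass has to move along the scale, so a segment rating mostly 4 is treated as closer to an all-5 target than a segment rating mostly 1.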
