DOI: 10.1145/1081870.1081956
Article

Evaluating similarity measures: a large-scale study in the orkut social network

Published: 21 August 2005

Abstract

Online information services have grown too large for users to navigate without the help of automated tools such as collaborative filtering, which makes recommendations to users based on their collective past behavior. While many similarity measures have been proposed and individually evaluated, they have not been evaluated relative to each other in a large real-world environment. We present an extensive empirical comparison of six distinct measures of similarity for recommending online communities to members of the Orkut social network. We determine the usefulness of the different recommendations by actually measuring users' propensity to visit and join recommended communities. We also examine how the ordering of recommendations influenced user selection, as well as interesting social issues that arise in recommending communities within a real social network.
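
To make the idea of a community similarity measure concrete, here is a minimal sketch. The abstract does not name the six measures, so this example assumes one common choice, cosine similarity over member sets, purely for illustration. The function names, the `communities` dictionary, and the toy member IDs are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def cosine_similarity(members_a: set, members_b: set) -> float:
    """Cosine similarity of two communities viewed as binary membership vectors.

    For binary vectors this reduces to |A & B| / sqrt(|A| * |B|).
    """
    if not members_a or not members_b:
        return 0.0
    overlap = len(members_a & members_b)
    return overlap / sqrt(len(members_a) * len(members_b))


def recommend(base: str, communities: dict, top_n: int = 3) -> list:
    """Rank every other community by its similarity to `base`, best first."""
    scored = [
        (cosine_similarity(communities[base], members), name)
        for name, members in communities.items()
        if name != base
    ]
    # Sort by descending score; break ties alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


# Hypothetical toy data: community name -> set of member IDs.
communities = {
    "linux": {1, 2, 3, 4},
    "python": {2, 3, 4, 5},
    "cooking": {4, 9},
    "chess": {7, 8},
}

print(recommend("linux", communities, top_n=2))
```

A study like the one described would swap in each candidate measure for `cosine_similarity` and compare how often users visit or join the communities that each ranking puts forward.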

Published In

KDD '05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
August 2005
844 pages
ISBN:159593135X
DOI:10.1145/1081870
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. collaborative filtering
  2. data mining
  3. online communities
  4. recommender system
  5. similarity measure
  6. social networks

Qualifiers

  • Article

Conference

KDD05

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Article Metrics

  • Downloads (last 12 months): 40
  • Downloads (last 6 weeks): 4

Reflects downloads up to 10 Nov 2024

Cited By

  • (2023) Carving Nature at Its Joints: A Comparison of CEMI Field Theory with Integrated Information Theory and Global Workspace Theory. Entropy 25:12 (1635). DOI: 10.3390/e25121635. Online publication date: 8-Dec-2023
  • (2023) Dynamic Set Similarity Join: An Update Log Based Approach. IEEE Transactions on Knowledge and Data Engineering 35:4 (3727-3741). DOI: 10.1109/TKDE.2021.3126631. Online publication date: 1-Apr-2023
  • (2023) MetricJoin: Leveraging Metric Properties for Robust Exact Set Similarity Joins. 2023 IEEE 39th International Conference on Data Engineering (ICDE) (1045-1058). DOI: 10.1109/ICDE55515.2023.00085. Online publication date: Apr-2023
  • (2023) Scalable Computation of Fuzzy Joins Over Large Collections of JSON Data. 2023 IEEE International Conference on Fuzzy Systems (FUZZ) (01-06). DOI: 10.1109/FUZZ52849.2023.10309759. Online publication date: 13-Aug-2023
  • (2023) Group Recommendation Based on Heterogeneous Graph Algorithm for EBSNs. IEEE Access 11 (1854-1866). DOI: 10.1109/ACCESS.2022.3224598. Online publication date: 2023
  • (2023) Visual Representation for Patterned Proliferation of Social Media Addiction: Quantitative Model and Network Analysis. SN Computer Science 4:6. DOI: 10.1007/s42979-023-02164-7. Online publication date: 18-Sep-2023
  • (2023) Content-based comparison of communities in social networks: Ex-Yugoslavian reactions to the Russian invasion of Ukraine. Applied Network Science 8:1. DOI: 10.1007/s41109-023-00561-8. Online publication date: 28-Jun-2023
  • (2022) Bitmap filter. Information Systems 88:C. DOI: 10.1016/j.is.2019.101449. Online publication date: 21-Apr-2022
  • (2022) Unsupervised discovery of non-trivial similarities between online communities. Expert Systems with Applications 206 (117900). DOI: 10.1016/j.eswa.2022.117900. Online publication date: Nov-2022
  • (2022) HSCRD: Hybridized Semantic Approach for Knowledge Centric Requirement Discovery. Digital Technologies and Applications (70-79). DOI: 10.1007/978-3-031-02447-4_8. Online publication date: 6-May-2022
