skip to main content
research-article

Random walk with restart on hypergraphs: fast computation and an application to anomaly detection

Published: 21 December 2023 Publication History

Abstract

Random walk with restart (RWR) is a widely-used measure of node similarity in graphs, and it has proved useful for ranking, community detection, link prediction, anomaly detection, etc. Since RWR is typically required to be computed separately for a larger number of query nodes or even for all nodes, fast computation of it is indispensable. However, for hypergraphs, the fast computation of RWR has been unexplored, despite its great potential. In this paper, we propose ARCHER, a fast computation framework for RWR on hypergraphs. Specifically, we first formally define RWR on hypergraphs, and then we propose two computation methods that compose ARCHER. Since the two methods are complementary (i.e., offering relative advantages on different hypergraphs), we also develop a method for automatic selection between them, which takes a very short time compared to the total running time. Through our extensive experiments on 18 real-world hypergraphs, we demonstrate (a) the speed and space efficiency of ARCHER, (b) the complementary nature of the two computation methods composing ARCHER, (c) the accuracy of its automatic selection method, and (d) its successful application to anomaly detection on hypergraphs.

References

[1]
Amburg I, Veldt N, Benson A (2020) Clustering in graphs and hypergraphs with categorical edge labels. In: Proceedings of the web conference 2020 (WWW), pp 706–717.
[2]
Benson AR, Abebe R, Schaub MT, et al. Simplicial closure and higher-order link prediction Proceed Natl Academy Sci 2018
[3]
Boyd S, Boyd SP, and Vandenberghe L Convex optimization 2004 Cambridge Cambridge University Press
[4]
Brin S and Page L The anatomy of a large-scale hypertextual web search engine Comput Netw ISDN Syst 1998 30 1–7 107-117
[5]
Chitra U, Raphael B (2019) Random walks on hypergraphs with edge-dependent vertex weights. In: Proceedings of the 36th international conference on machine learning (ICML), pp 1172–1181, arXiv:1905.08287
[6]
Chodrow PS, Veldt N, and Benson AR Generative hypergraph clustering: from blockmodels to modularity Sci Adv 2021 7 28 eabh1303
[7]
Cohen MB, Kelner J, Peebles J, et al (2016) Faster algorithms for computing the stationary distribution, simulating random walks, and more. In: 2016 IEEE 57th annual symposium on foundations of computer science (FOCS), pp 583–592.
[8]
Cohen MB, Kelner J, Kyng R, et al (2018) Solving directed laplacian systems in nearly-linear time through sparse lu factorizations. In: 2018 IEEE 59th annual symposium on foundations of computer science (FOCS), pp 898–909.
[9]
Comrie C, Kleinberg J (2021) Hypergraph ego-networks and their temporal evolution. In: 2021 IEEE international conference on data mining (ICDM), pp 91–100.
[10]
Do MT, Yoon Se, Hooi B, et al (2020) Structural patterns and generative models of real-world hypergraphs. In: Proceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining (KDD). ACM, pp 176–186.
[11]
Fowler JH Connecting the congress: a study of cosponsorship networks Polit Anal 2006 14 4 456-487
[12]
Fowler JH Legislative cosponsorship networks in the US house and senate Soc Netw 2006 28 4 454-465
[13]
Fujiwara Y, Nakatsuji M, Onizuka M, et al (2012) Fast and exact top-k search for random walk with restart. Proceed VLDB Endowment 5(5), 442–453.https://doi.org/10.14778/2140436.2140441
[14]
Gasteiger J, Bojchevski A, Günnemann S (2019a) Predict then propagate: Graph neural networks meet personalized pagerank. In: International conference on learning representations (ICLR). arXiv:1810.05997
[15]
Gasteiger J, Weißenberger S, Günnemann S (2019b) Diffusion improves graph learning. In: Advances in neural information processing systems (NeurIPS). arXiv:1911.05485
[16]
Harper FM and Konstan JA The MovieLens datasets ACM Trans Interact Intell Syst 2015 5 4 1-19
[17]
Hayashi K, Aksoy SG, Park CH, et al (2020) Hypergraph random walks, laplacians, and clustering. In: Proceedings of the 29th ACM international conference on information & knowledge management (CIKM), pp 495–504.
[18]
Horn RA, Johnson CR (2012) Matrix analysis. Cambridge University Press.
[19]
Hou G, Chen X, Wang S, et al (2021) Massively parallel algorithms for personalized pagerank. Proceed VLDB Endow 14(9):1668–1680.
[20]
Jung J, Jin W, Sael L, et al (2016) Personalized ranking in signed networks using signed random walk with restart. In: 2016 IEEE 16th international conference on data mining (ICDM), pp 973–978.
[21]
Jung J, Park N, Lee S, et al. (2017) BePI. In: Proceedings of the 2017 ACM international conference on management of data (SIGMOD), pp 789–804.
[22]
Jung J, Jin W, and Kang U Random walk-based ranking in signed social networks: model and algorithms Knowl Inf Syst 2019 62 2 571-610
[23]
Kang U, Faloutsos C (2011) Beyond ’caveman communities’: Hubs and spokes for graph compression and mining. In: 2011 IEEE 11th international conference on data mining (ICDM), pp 300–309,
[24]
Langville AN, Meyer CD (2006) Google’s PageRank and beyond: the science of search engine rankings. Princeton University Press, Princeton.
[25]
Lee G, Choe M, Shin K (2021) How do hyperedges overlap in real-world hypergraphs?—patterns, measures, and generators. In: Proceedings of the web conference 2021 (WWW), pp 3396–3407.
[26]
Lee G, Choe M, Shin K (2022) HashNWalk: Hash and random walk based anomaly detection in hyperedge streams. In: Proceedings of the thirty-first international joint conference on artificial intelligence (IJCAI), pp 2129–2137.
[27]
Lee G, Yoo J, Shin K (2023) Mining of real-world hypergraphs: Patterns, tools, and generators. In: Proceedings of the 29th ACM SIGKDD international conference on knowledge discovery & data mining (KDD). ACM, pp 5811–5812.,
[28]
Lee J, Jung J (2023) Time-aware random walk diffusion to improve dynamic graph learning. In: Proceedings of the AAAI conference on artificial intelligence (AAAI).
[29]
Leskovec J, Kleinberg J, and Faloutsos C Graph evolution ACM Trans Knowl Discovery Data 2007 1 1 2
[30]
Li J, He J, Zhu Y (2018) E-tail product return prediction via hypergraph-based local graph cut. In: Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining (KDD), pp 519–527.
[31]
Lin D, Wong RCW, Xie M, et al (2020) Index-free approach with theoretical guarantee for efficient random walk with restart query. In: IEEE 36th international conference on data engineering (ICDE), pp 913–924.
[32]
McAuley J, Leskovec J (2013) Discovering social circles in ego networks. arXiv:1210.8182
[33]
Nassar H, Kloster K, Gleich DF (2015) Strong localization in personalized PageRank vectors. In: Algorithms and models for the web graph (WAW), pp 190–202.
[34]
Ni J, Li J, McAuley J (2019) Justifying recommendations using distantly-labeled reviews and fine-grained aspects. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP), pp 188–197.
[35]
Page L, Brin S, Motwani R, et al (1999) The pagerank citation ranking: Bringing order to the web. Tech. rep., Stanford InfoLab
[36]
Rajaraman A, Ullman JD (2011) Mining of massive datasets. Cambridge University Press.
[37]
Ranshous S, Chaudhary M, Samatova NF (2017) Efficient outlier detection in hyperedge streams using MinHash and locality-sensitive hashing. In: Complex networks & their applications VI, pp 105–116.
[38]
Shin K, Jung J, Lee S, et al (2015) Bear: Block elimination approach for random walk with restart on large graphs. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data (SIGMOD), pp 1571–1585.
[39]
Sinha A, Shen Z, Song Y, et al (2015) An overview of microsoft academic service (MAS) and applications. In: Proceedings of the 24th international conference on world wide web (WWW), pp 519–527.
[40]
Sun J, Qu H, Chakrabarti D, et al (2005) Neighborhood formation and anomaly detection in bipartite graphs. In: Proceedings of the fifth IEEE international conference on data mining (ICDM), pp 418–425.
[41]
Sun L, Ji S, Ye J (2008) Hypergraph spectral learning for multi-label classification. In: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD), pp 668–676.
[42]
Tong H, Faloutsos C, Gallagher B, et al (2007a) Fast best-effort pattern matching in large attributed graphs. In: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD), pp 737–746.
[43]
Tong H, Faloutsos C, and Pan JY Random walk with restart: fast solutions and applications Knowl Inf Syst 2007 14 3 327-346
[44]
Trefethen LN, Bau D (2022) Numerical linear algebra, vol 181. Siam,
[45]
Wang R, Wang S, and Zhou X Parallelizing approximate single-source personalized PageRank queries on shared memory VLDB J 2019 28 6 923-940
[46]
Wang S, Yang R, Xiao X, et al (2017) Fora: simple and effective approximate single-source personalized pagerank. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining (KDD), pp 505–514.
[47]
Wang S, Yang R, Wang R et al (2019) Efficient algorithms for approximate single-source personalized PageRank queries. ACM Trans Database Syst 44(4):1–37.
[48]
Wei Z, He X, Xiao X, et al (2018) Topppr: Top-k personalized pagerank queries with precision guarantees on large graphs. In: Proceedings of the 2018 international conference on management of data (SIGMOD), pp 441–456.
[49]
Wu H, Gan J, Wei Z, et al (2021) Unifying the global and local approaches: An efficient power iteration with forward push. In: Proceedings of the 2021 international conference on management of data (SIGMOD), pp 1996–2008.
[50]
Yin H, Benson AR, Leskovec J, et al (2017) Local higher-order graph clustering. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining (KDD), pp 555–564.
[51]
Zhang Y, Zhao Z, Feng Z (2018a) A unified approach to scalable spectral sparsification of directed graphs. arXiv:1812.04165
[52]
Zhang Z, Lin H, Gao Y (2018b) Dynamic hypergraph structure learning. In: Proceedings of the twenty-seventh international joint conference on artificial intelligence (IJCAI), pp 3162–3169.
[53]
Zhou D, Huang J, Schölkopf B (2006) Learning with hypergraphs: clustering, classification, and embedding. In: Proceedings of the 19th international conference on neural information processing systems (NIPS), pp 1601–1608.
[54]
Zhu S, Zou L, and Fang B Content based image retrieval via a transductive model J Intell Inf Syst 2013 42 1 95-109
[55]
Zien J, Schlag M, and Chan P Multi-level spectral hypergraph partitioning with arbitrary vertex sizes ITCSDI 1999 18 9 1389-1399

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Data Mining and Knowledge Discovery
Data Mining and Knowledge Discovery  Volume 38, Issue 3
May 2024
732 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 21 December 2023
Accepted: 23 November 2023
Received: 10 February 2023

Author Tags

  1. Hypergraph
  2. Random walk with restart
  3. Fast computation
  4. Anomaly detection

Qualifiers

  • Research-article

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 0
    Total Downloads
  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 05 Nov 2024

Other Metrics

Citations

View Options

View options

Get Access

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media