Standard
ProbGraph : High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations. / Besta, MacIej; Miglioli, Cesare; Labini, Paolo Sylos; Tetek, Jakub; Iff, Patrick; Kanakagiri, Raghavendra; Ashkboos, Saleh; Janda, Kacper; Podstawski, Michal; Kwasniewski, Grzegorz; Gleinig, Niels; Vella, Flavio; Mutlu, Onur; Hoefler, Torsten.
Proceedings of SC 2022: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 2022. p. 1-17 (International Conference for High Performance Computing, Networking, Storage and Analysis, SC, Vol. 2022-November).
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Harvard
Besta, M, Miglioli, C, Labini, PS
, Tetek, J, Iff, P, Kanakagiri, R, Ashkboos, S, Janda, K, Podstawski, M, Kwasniewski, G, Gleinig, N, Vella, F, Mutlu, O & Hoefler, T 2022,
ProbGraph: High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations. in
Proceedings of SC 2022: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, International Conference for High Performance Computing, Networking, Storage and Analysis, SC, vol. 2022-November, pp. 1-17, 2022 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2022, Dallas, United States,
13/11/2022.
https://doi.org/10.1109/SC41404.2022.00048
APA
Besta, M., Miglioli, C., Labini, P. S.
, Tetek, J., Iff, P., Kanakagiri, R., Ashkboos, S., Janda, K., Podstawski, M., Kwasniewski, G., Gleinig, N., Vella, F., Mutlu, O., & Hoefler, T. (2022).
ProbGraph: High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations. In
Proceedings of SC 2022: International Conference for High Performance Computing, Networking, Storage and Analysis (pp. 1-17). IEEE. International Conference for High Performance Computing, Networking, Storage and Analysis, SC Vol. 2022-November
https://doi.org/10.1109/SC41404.2022.00048
Vancouver
Besta M, Miglioli C, Labini PS
, Tetek J, Iff P, Kanakagiri R et al.
ProbGraph: High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations. In Proceedings of SC 2022: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE. 2022. p. 1-17. (International Conference for High Performance Computing, Networking, Storage and Analysis, SC, Vol. 2022-November).
https://doi.org/10.1109/SC41404.2022.00048
Author
Besta, MacIej ; Miglioli, Cesare ; Labini, Paolo Sylos ; Tetek, Jakub ; Iff, Patrick ; Kanakagiri, Raghavendra ; Ashkboos, Saleh ; Janda, Kacper ; Podstawski, Michal ; Kwasniewski, Grzegorz ; Gleinig, Niels ; Vella, Flavio ; Mutlu, Onur ; Hoefler, Torsten. / ProbGraph : High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations. Proceedings of SC 2022: International Conference for High Performance Computing, Networking, Storage and Analysis. IEEE, 2022. pp. 1-17 (International Conference for High Performance Computing, Networking, Storage and Analysis, SC, Vol. 2022-November).
Bibtex
@inproceedings{0bb780d7b8434f29af591e91b848224a,
title = "ProbGraph: High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations",
abstract = "Important graph mining problems such as Clustering are computationally demanding. To significantly accelerate these problems, we propose ProbGraph: a graph representation that enables simple and fast approximate parallel graph mining with strong theoretical guarantees on work, depth, and result accuracy. The key idea is to represent sets of vertices using probabilistic set representations such as Bloom filters. These representations are much faster to process than the original vertex sets thanks to vectorizability and small size. We use these representations as building blocks in important parallel graph mining algorithms such as Clique Counting or Clustering. When enhanced with ProbGraph, these algorithms significantly outperform tuned parallel exact baselines (up to nearly 50 x on 32 cores) while ensuring accuracy of more than 90% for many input graph datasets. Our novel bounds and algorithms based on probabilistic set representations with desirable statistical properties are of separate interest for the data analytics community. Proofs of theorems & more results: http://arxiv.org/abs/2208.11469",
keywords = "Approximate Community Detection, Approximate Graph Clustering, Approximate Graph Mining, Approximate Graph Pattern Matching, Approximate Triangle Counting, Bloom Filters, Graph Sketching, High-Performance Graph Computations, K Minimum Values, MinHash",
author = "MacIej Besta and Cesare Miglioli and Labini, {Paolo Sylos} and Jakub Tetek and Patrick Iff and Raghavendra Kanakagiri and Saleh Ashkboos and Kacper Janda and Michal Podstawski and Grzegorz Kwasniewski and Niels Gleinig and Flavio Vella and Onur Mutlu and Torsten Hoefler",
note = "Publisher Copyright: {\textcopyright} 2022 IEEE.; 2022 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2022 ; Conference date: 13-11-2022 Through 18-11-2022",
year = "2022",
doi = "10.1109/SC41404.2022.00048",
language = "English",
series = "International Conference for High Performance Computing, Networking, Storage and Analysis, SC",
pages = "1--17",
booktitle = "Proceedings of SC 2022",
publisher = "IEEE",
}
RIS
TY - GEN
T1 - ProbGraph
T2 - 2022 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2022
AU - Besta, MacIej
AU - Miglioli, Cesare
AU - Labini, Paolo Sylos
AU - Tetek, Jakub
AU - Iff, Patrick
AU - Kanakagiri, Raghavendra
AU - Ashkboos, Saleh
AU - Janda, Kacper
AU - Podstawski, Michal
AU - Kwasniewski, Grzegorz
AU - Gleinig, Niels
AU - Vella, Flavio
AU - Mutlu, Onur
AU - Hoefler, Torsten
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Important graph mining problems such as Clustering are computationally demanding. To significantly accelerate these problems, we propose ProbGraph: a graph representation that enables simple and fast approximate parallel graph mining with strong theoretical guarantees on work, depth, and result accuracy. The key idea is to represent sets of vertices using probabilistic set representations such as Bloom filters. These representations are much faster to process than the original vertex sets thanks to vectorizability and small size. We use these representations as building blocks in important parallel graph mining algorithms such as Clique Counting or Clustering. When enhanced with ProbGraph, these algorithms significantly outperform tuned parallel exact baselines (up to nearly 50 x on 32 cores) while ensuring accuracy of more than 90% for many input graph datasets. Our novel bounds and algorithms based on probabilistic set representations with desirable statistical properties are of separate interest for the data analytics community. Proofs of theorems & more results: http://arxiv.org/abs/2208.11469
AB - Important graph mining problems such as Clustering are computationally demanding. To significantly accelerate these problems, we propose ProbGraph: a graph representation that enables simple and fast approximate parallel graph mining with strong theoretical guarantees on work, depth, and result accuracy. The key idea is to represent sets of vertices using probabilistic set representations such as Bloom filters. These representations are much faster to process than the original vertex sets thanks to vectorizability and small size. We use these representations as building blocks in important parallel graph mining algorithms such as Clique Counting or Clustering. When enhanced with ProbGraph, these algorithms significantly outperform tuned parallel exact baselines (up to nearly 50 x on 32 cores) while ensuring accuracy of more than 90% for many input graph datasets. Our novel bounds and algorithms based on probabilistic set representations with desirable statistical properties are of separate interest for the data analytics community. Proofs of theorems & more results: http://arxiv.org/abs/2208.11469
KW - Approximate Community Detection
KW - Approximate Graph Clustering
KW - Approximate Graph Mining
KW - Approximate Graph Pattern Matching
KW - Approximate Triangle Counting
KW - Bloom Filters
KW - Graph Sketching
KW - High-Performance Graph Computations
KW - K Minimum Values
KW - MinHash
U2 - 10.1109/SC41404.2022.00048
DO - 10.1109/SC41404.2022.00048
M3 - Article in proceedings
AN - SCOPUS:85143573874
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
SP - 1
EP - 17
BT - Proceedings of SC 2022
PB - IEEE
Y2 - 13 November 2022 through 18 November 2022
ER -