Paper Title
ConnectIt: A Framework for Static and Incremental Parallel Graph Connectivity Algorithms
Paper Authors
Paper Abstract
Connected components is a fundamental kernel in graph applications. The fastest existing parallel multicore algorithms for connectivity are based on some form of edge sampling and/or linking and compressing trees. However, many combinations of these design choices have been left unexplored. In this paper, we design the ConnectIt framework, which provides different sampling strategies as well as various tree linking and compression schemes. ConnectIt enables us to obtain several hundred new variants of connectivity algorithms, most of which extend to computing a spanning forest. In addition to static graphs, we also extend ConnectIt to support mixes of insertions and connectivity queries in the concurrent setting. We present an experimental evaluation of ConnectIt on a 72-core machine, which we believe is the most comprehensive evaluation of parallel connectivity algorithms to date. Compared to a collection of state-of-the-art static multicore algorithms, we obtain an average speedup of 12.4x (2.36x average speedup over the fastest existing implementation for each graph). Using ConnectIt, we are able to compute connectivity on the largest publicly available graph (with over 3.5 billion vertices and 128 billion edges) in under 10 seconds using a 72-core machine, providing a 3.1x speedup over the fastest existing connectivity result for this graph, in any computational setting. For our incremental algorithms, we show that they can ingest graph updates at up to several billion edges per second. To guide the user in selecting the best variants in ConnectIt for different situations, we provide a detailed analysis of the different strategies. Finally, we show how the techniques in ConnectIt can be used to speed up two important graph applications: approximate minimum spanning forest and SCAN clustering.
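To make "linking and compressing trees" concrete, the following is a minimal sequential sketch of a union-find structure, assuming link-by-smaller-id and path halving as the linking and compression rules. It is not the ConnectIt implementation: the framework described above is parallel and concurrent and includes a sampling phase, none of which is shown here. The names `UnionFind` and `components_and_forest` are hypothetical and chosen only for illustration.

```python
class UnionFind:
    """Sequential union-find (illustrative only; not ConnectIt's code)."""

    def __init__(self, n):
        # Every vertex starts as the root of its own one-node tree.
        self.parent = list(range(n))

    def find(self, v):
        # Path halving (one possible compression scheme): while walking
        # to the root, re-point each visited node at its grandparent.
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]
            v = self.parent[v]
        return v

    def union(self, u, v):
        # Link rule assumed here: the root with the larger id is attached
        # under the root with the smaller id.
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return False  # already connected; the edge closes a cycle
        if ru < rv:
            ru, rv = rv, ru
        self.parent[ru] = rv
        return True

    def connected(self, u, v):
        # Connectivity query: same root means same component.
        return self.find(u) == self.find(v)


def components_and_forest(n, edges):
    """Return (component label per vertex, spanning forest edges)."""
    uf = UnionFind(n)
    # An edge joins the spanning forest exactly when it merges two trees.
    forest = [(u, v) for u, v in edges if uf.union(u, v)]
    labels = [uf.find(v) for v in range(n)]
    return labels, forest


if __name__ == "__main__":
    labels, forest = components_and_forest(6, [(0, 1), (1, 2), (0, 2), (3, 4)])
    print(labels)
    print(forest)
```

In this sketch, an incremental workload amounts to interleaving `union` calls (edge insertions) with `connected` calls (queries) on the same `UnionFind` object. A concurrent version would additionally need atomic updates to the parent pointers, which this sequential sketch leaves out.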
