分布式图算法Pregel

最近看了Google的Pregel论文,图算法有一些经典且不可被替代的应用场景,如社交网络,相互引用等。但是在单个结点上的运算量往往过少,注重的是消息传播和逻辑处理,而不是单纯的大规模计算。虽然已经过去了十几年,但是其中的messsage passing,combiner,aggregator,group partition,状态机等机制还是设计分布式图算法的有效方法。这里针对设计和实现部分做一点点笔记。

Pregel: A System for Large-Scale Graph Processing

key point:

distributed computer clusters -> graph task
graph algorithm

introduction

Target:

Many practical computing problems concern large graphs. e.g. the Web graph and various social networks.

Parallelism for graph challenging:

  1. Graph algorithms often exhibit poor locality of memory access,
  2. very little work per vertex,
  3. and a changing degree of parallelism over the course of execution.

We want a scalable general-purpose system for graph! Compared with existed options:

  • efficient
  • scalable
  • fault-tolerant

inspired by: Valiant’s Bulk Synchronous Parallel model


Model

data structure

input: directed graph(both have value):

  • vertex: vertex identifier!
  • edge: associated with source vertices

process

superstep: working unit with state machine -> sync!
vertex: first class citizens

output: may not as same as input(the graph structure can change)
terminal: voting to halt


API

Compute()
GetValue()
MutableValue()

combiner:

aggregator: global monitor (in my opinion, aggregator is global combiner + global coordinator(more logical than combiner))
aggregator: get global information and do a calculation, not only the combination of data.

e.g. do a particular operation when all the vertices meet a particualr condition


Implementation

Architecture

group partition: default(hash(ID) mod N)

self-define partition allocator: use locality

hierarchy(up to bottom):

  • user program(copies of app) = 1 master + N-1 workers
  • physical machine
  • group partition
  • vertex & outer-edge

Fault Tolerance

checkpoint

worker: partition state -> persistant storage
master: aggregator

regular "ping" message

posted @   真昼小天使daisuki  阅读(58)  评论(0编辑  收藏  举报
相关博文:
阅读排行:
· winform 绘制太阳,地球,月球 运作规律
· 震惊!C++程序真的从main开始吗?99%的程序员都答错了
· AI与.NET技术实操系列(五):向量存储与相似性搜索在 .NET 中的实现
· 超详细:普通电脑也行Windows部署deepseek R1训练数据并当服务器共享给他人
· 【硬核科普】Trae如何「偷看」你的代码?零基础破解AI编程运行原理
点击右上角即可分享
微信分享提示