Introduction to GraphFrames and Basic Usage

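Before reading this post, you should already be familiar with graph data, Scala, and Spark.

GraphFrames is a graph processing library built on top of DataFrames. It benefits from the scalability and performance of DataFrames while providing a uniform graph processing API for Scala, Java, and Python.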

GitHub: https://github.com/graphframes/graphframes
Official documentation: https://graphframes.github.io/graphframes/docs/_site/user-guide.html#graphframe-to-graphx

1. Comparison with GraphX

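+------------------------+---------------------------------+--------------------------------+
| Feature                | GraphFrames                     | GraphX                         |
+------------------------+---------------------------------+--------------------------------+
| Data model             | DataFrames                      | RDD                            |
| Languages              | Scala/Java/Python               | Scala                          |
| Use cases              | Data queries, graph computation | Graph computation              |
| Vertex ID              | Any type                        | Long                           |
| Vertex/edge attributes | DataFrame columns               | Any type (VD, ED)              |
| Return types           | GraphFrame, DataFrame           | Graph[VD, ED], RDD[(Long, VD)] |
+------------------------+---------------------------------+--------------------------------+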
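The vertex ID row is easiest to see when a GraphFrame is converted to GraphX. The following is a minimal sketch, assuming spark-graphx is on the classpath next to GraphFrames. The object name ToGraphXSketch and the two-vertex graph are made up for illustration. It relies on GraphFrame.toGraphX, which according to the GraphFrames API docs returns a Graph[Row, Row] and generates Long IDs when the original IDs are not Long.

```scala
import org.apache.spark.graphx.Graph
import org.apache.spark.sql.{Row, SparkSession}
import org.graphframes.GraphFrame

object ToGraphXSketch {
  // Tiny hypothetical graph with String vertex IDs (not data from the post)
  def buildGraph(spark: SparkSession): GraphFrame = {
    val v = spark.createDataFrame(Seq(("a", "Alice"), ("b", "Bob"))).toDF("id", "name")
    val e = spark.createDataFrame(Seq(("a", "b", "friend"))).toDF("src", "dst", "relationship")
    GraphFrame(v, e)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("ToGraphXSketch").master("local[2]").getOrCreate()
    val g = buildGraph(spark)

    // GraphFrame -> GraphX: vertex and edge attributes become Rows
    val gx: Graph[Row, Row] = g.toGraphX

    // GraphX vertex IDs are always Long; the original String id stays inside the Row attribute
    gx.vertices.collect().foreach { case (longId, attr) => println(s"$longId -> $attr") }

    spark.stop()
  }
}
```

The generated Long values are an internal detail, so downstream code should read the original id from the Row attribute instead of relying on them.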

2. Using GraphFrames in Scala

<!-- Add the GraphFrames dependency -->
<dependency>
  <groupId>graphframes</groupId>
  <artifactId>graphframes</artifactId>
  <version>0.8.1-spark2.4-s_2.11</version>
</dependency>
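For sbt builds, the same coordinates might look like the sketch below. The resolver name and URL are assumptions on my part: as far as I know, artifacts with the graphframes group ID are distributed through the Spark Packages repository, not Maven Central, so please check them against the project's README before relying on them. A Maven build would likely need the same repository declared as well.

```scala
// build.sbt (sketch): the resolver URL is an assumption, verify it before use
resolvers += "Spark Packages" at "https://repos.spark-packages.org/"

// Plain % (not %%) because the Scala version is already part of the version string
libraryDependencies += "graphframes" % "graphframes" % "0.8.1-spark2.4-s_2.11"
```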

3. Working Through the Official Examples

Basic queries

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession
import org.graphframes.GraphFrame

object GraphFramesExample {

  def main(args: Array[String]): Unit = {
    val sparkConfig = new SparkConf().setAppName("GraphFrames").setMaster("local[2]")
      .set("spark.sql.shuffle.partitions", "1") // number of shuffle partitions
    val spark: SparkSession = SparkSession.builder().config(sparkConfig).getOrCreate()
    // Vertex DataFrame
    val v = spark.createDataFrame(List(
      ("a", "Alice", 34),
      ("b", "Bob", 36),
      ("c", "Charlie", 30),
      ("d", "David", 29),
      ("e", "Esther", 32),
      ("f", "Fanny", 36),
      ("g", "Gabby", 60)
    )).toDF("id", "name", "age")
    // Edge DataFrame
    val e = spark.createDataFrame(List(
      ("a", "b", "friend"),
      ("b", "c", "follow"),
      ("c", "b", "follow"),
      ("f", "c", "follow"),
      ("e", "f", "follow"),
      ("e", "d", "friend"),
      ("d", "a", "friend"),
      ("a", "e", "friend")
    )).toDF("src", "dst", "relationship")
    // Create a GraphFrame
    val g = GraphFrame(v, e)
    // Alternative motif: pairs of vertices connected in both directions
    // g.find("(a)-[e]->(b); (b)-[e2]->(a)").show()
    // Motif: directed three-vertex cycles whose first vertex is older than 29
    g.find("(a)-[e]->(b); (b)-[e2]->(c); (c)-[e3]->(a)")
      .where("a.age > 29")
      .show()
    // All vertices in the graph
    g.vertices.show()
    // All edges in the graph
    g.edges.show()
    // In-degree of each vertex
    g.inDegrees.show()
    // Out-degree of each vertex
    g.outDegrees.show()
    // Total degree (in + out) of each vertex
    g.degrees.show()
    // All (src, edge, dst) triplets in the graph
    g.triplets.show()
    // Release local Spark resources
    spark.stop()
  }
}

Output

// All vertices in the graph
g.vertices.show()

+---+-------+---+
| id|   name|age|
+---+-------+---+
|  a|  Alice| 34|
|  b|    Bob| 36|
|  c|Charlie| 30|
|  d|  David| 29|
|  e| Esther| 32|
|  f|  Fanny| 36|
|  g|  Gabby| 60|
+---+-------+---+
// All edges in the graph
g.edges.show()
+---+---+------------+
|src|dst|relationship|
+---+---+------------+
|  a|  b|      friend|
|  b|  c|      follow|
|  c|  b|      follow|
|  f|  c|      follow|
|  e|  f|      follow|
|  e|  d|      friend|
|  d|  a|      friend|
|  a|  e|      friend|
+---+---+------------+
// In-degree of each vertex
g.inDegrees.show()
+---+--------+
| id|inDegree|
+---+--------+
|  b|       2|
|  c|       2|
|  f|       1|
|  d|       1|
|  a|       1|
|  e|       1|
+---+--------+
// Out-degree of each vertex
g.outDegrees.show()
+---+---------+
| id|outDegree|
+---+---------+
|  a|        2|
|  b|        1|
|  c|        1|
|  f|        1|
|  e|        2|
|  d|        1|
+---+---------+
// Total degree (in + out) of each vertex
g.degrees.show()
+---+------+
| id|degree|
+---+------+
|  a|     3|
|  b|     3|
|  c|     3|
|  f|     2|
|  e|     3|
|  d|     2|
+---+------+
// All (src, edge, dst) triplets in the graph
g.triplets.show()
+----------------+--------------+----------------+
|             src|          edge|             dst|
+----------------+--------------+----------------+
|  [a, Alice, 34]|[a, b, friend]|    [b, Bob, 36]|
|    [b, Bob, 36]|[b, c, follow]|[c, Charlie, 30]|
|[c, Charlie, 30]|[c, b, follow]|    [b, Bob, 36]|
|  [f, Fanny, 36]|[f, c, follow]|[c, Charlie, 30]|
| [e, Esther, 32]|[e, f, follow]|  [f, Fanny, 36]|
| [e, Esther, 32]|[e, d, friend]|  [d, David, 29]|
|  [d, David, 29]|[d, a, friend]|  [a, Alice, 34]|
|  [a, Alice, 34]|[a, e, friend]| [e, Esther, 32]|
+----------------+--------------+----------------+
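The degree tables above can be cross-checked by hand. Here is a small sketch in plain Scala collections (no Spark needed) that recomputes the same counts from the edge list of the example graph. The object name DegreeSketch and the helper countById are made up for this illustration and are not part of GraphFrames. It also makes visible why vertex g never appears in the degree tables: it has no edges at all.

```scala
object DegreeSketch {
  // Vertex ids and (src, dst) pairs copied from the example graph above
  val vertexIds: List[String] = List("a", "b", "c", "d", "e", "f", "g")
  val edges: List[(String, String)] = List(
    ("a", "b"), ("b", "c"), ("c", "b"), ("f", "c"),
    ("e", "f"), ("e", "d"), ("d", "a"), ("a", "e")
  )

  // Count how often each id occurs in a list of edge endpoints
  def countById(ids: List[String]): Map[String, Int] =
    ids.groupBy(identity).map { case (id, occurrences) => id -> occurrences.size }

  // Edges arriving at a vertex
  val inDegrees: Map[String, Int] = countById(edges.map(_._2))
  // Edges leaving a vertex
  val outDegrees: Map[String, Int] = countById(edges.map(_._1))
  // Every endpoint of every edge, so in-degree plus out-degree
  val degrees: Map[String, Int] =
    countById(edges.flatMap { case (src, dst) => List(src, dst) })

  // Vertices that take part in no edge are missing from all three maps
  val isolated: List[String] = vertexIds.filterNot(degrees.contains)

  def main(args: Array[String]): Unit = {
    println(s"inDegrees:  ${inDegrees.toList.sorted}")
    println(s"outDegrees: ${outDegrees.toList.sorted}")
    println(s"degrees:    ${degrees.toList.sorted}")
    println(s"isolated:   $isolated")
  }
}
```

If you need isolated vertices in a degree table, one option is to left-join g.vertices with g.degrees and fill the missing values with 0.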

Motif finding

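GraphFrames motif finding uses a small domain-specific language (DSL) to express structural queries. Vertices are written in parentheses () and edges in square brackets [].
For example, graph.find("(a)-[e]->(b); (b)-[e2]->(a)") searches for pairs of vertices a and b that are connected by edges in both directions. It returns a DataFrame of all such structures in the graph, with one column for each named element (vertex or edge) in the motif.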

Find patterns where vertex a points to vertex b and vertex b also points back to a:

g.find("(a)-[e]->(b); (b)-[e2]->(a)").show()

+----------------+--------------+----------------+--------------+
|               a|             e|               b|            e2|
+----------------+--------------+----------------+--------------+
|    [b, Bob, 36]|[b, c, follow]|[c, Charlie, 30]|[c, b, follow]|
|[c, Charlie, 30]|[c, b, follow]|    [b, Bob, 36]|[b, c, follow]|
+----------------+--------------+----------------+--------------+
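The DSL also allows unnamed elements and negated terms, which the official user guide describes. Below is a minimal sketch using the same edges as the example graph. The object name MotifVariantsSketch and its helper functions are hypothetical, and the vertices are reduced to an id column to keep it short. An empty [] means "some edge exists here, but I do not need it as a column", and a leading ! excludes matches where the given edge exists.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.graphframes.GraphFrame

object MotifVariantsSketch {
  // Same edges as the example graph; vertices carry only the id column here
  def buildGraph(spark: SparkSession): GraphFrame = {
    val v = spark.createDataFrame(
      Seq("a", "b", "c", "d", "e", "f", "g").map(Tuple1(_))
    ).toDF("id")
    val e = spark.createDataFrame(Seq(
      ("a", "b", "friend"), ("b", "c", "follow"), ("c", "b", "follow"),
      ("f", "c", "follow"), ("e", "f", "follow"), ("e", "d", "friend"),
      ("d", "a", "friend"), ("a", "e", "friend")
    )).toDF("src", "dst", "relationship")
    GraphFrame(v, e)
  }

  // Anonymous edge []: the edge must exist but gets no column in the result
  def connectedPairs(g: GraphFrame): DataFrame =
    g.find("(a)-[]->(b)")

  // Negated term: keep a -> b only when there is no edge back from b to a
  def oneWayPairs(g: GraphFrame): DataFrame =
    g.find("(a)-[]->(b); !(b)-[]->(a)")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("MotifVariantsSketch").master("local[2]").getOrCreate()
    val g = buildGraph(spark)
    connectedPairs(g).show()
    oneWayPairs(g).show() // the b <-> c pair is filtered out
    spark.stop()
  }
}
```

On this graph the negated query should drop only the two edges between b and c, since every other edge has no counterpart in the opposite direction.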

Find patterns where a points to b, b points to c, and c points back to a, that is, a directed cycle through three vertices:
g.find("(a)-[e]->(b); (b)-[e2]->(c); (c)-[e3]->(a)").show()

+---------------+--------------+---------------+--------------+---------------+--------------+
|              a|             e|              b|            e2|              c|            e3|
+---------------+--------------+---------------+--------------+---------------+--------------+
|[e, Esther, 32]|[e, d, friend]| [d, David, 29]|[d, a, friend]| [a, Alice, 34]|[a, e, friend]|
| [d, David, 29]|[d, a, friend]| [a, Alice, 34]|[a, e, friend]|[e, Esther, 32]|[e, d, friend]|
| [a, Alice, 34]|[a, e, friend]|[e, Esther, 32]|[e, d, friend]| [d, David, 29]|[d, a, friend]|
+---------------+--------------+---------------+--------------+---------------+--------------+

Building on the query above, add a filter condition. Here where is equivalent to filter:
g.find("(a)-[e]->(b); (b)-[e2]->(c); (c)-[e3]->(a)")
.where("a.age > 29")
.show()

+---------------+--------------+---------------+--------------+--------------+--------------+
|              a|             e|              b|            e2|             c|            e3|
+---------------+--------------+---------------+--------------+--------------+--------------+
|[e, Esther, 32]|[e, d, friend]| [d, David, 29]|[d, a, friend]|[a, Alice, 34]|[a, e, friend]|
| [a, Alice, 34]|[a, e, friend]|[e, Esther, 32]|[e, d, friend]|[d, David, 29]|[d, a, friend]|
+---------------+--------------+---------------+--------------+--------------+--------------+
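Since find returns an ordinary DataFrame, the usual Spark filtering options apply. The sketch below shows the same age condition written with filter and a Column expression, plus a condition on an edge column. The object name MotifFilterSketch and its function names are hypothetical, and the vertices are reduced to id and age for brevity.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col
import org.graphframes.GraphFrame

object MotifFilterSketch {
  // Directed three-vertex cycle, as in the query above
  val cycle = "(a)-[e]->(b); (b)-[e2]->(c); (c)-[e3]->(a)"

  // Same ids, ages, and edges as the example graph (name column omitted)
  def buildGraph(spark: SparkSession): GraphFrame = {
    val v = spark.createDataFrame(Seq(
      ("a", 34), ("b", 36), ("c", 30), ("d", 29), ("e", 32), ("f", 36), ("g", 60)
    )).toDF("id", "age")
    val e = spark.createDataFrame(Seq(
      ("a", "b", "friend"), ("b", "c", "follow"), ("c", "b", "follow"),
      ("f", "c", "follow"), ("e", "f", "follow"), ("e", "d", "friend"),
      ("d", "a", "friend"), ("a", "e", "friend")
    )).toDF("src", "dst", "relationship")
    GraphFrame(v, e)
  }

  // SQL-string predicate, as used in the post
  def withWhere(g: GraphFrame): DataFrame =
    g.find(cycle).where("a.age > 29")

  // The same predicate through filter and a Column expression
  def withFilter(g: GraphFrame): DataFrame =
    g.find(cycle).filter(col("a.age") > 29)

  // Predicates can also reference fields of an edge column
  def startingWithFriendEdge(g: GraphFrame): DataFrame =
    g.find(cycle).filter("e.relationship = 'friend'")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("MotifFilterSketch").master("local[2]").getOrCreate()
    val g = buildGraph(spark)
    withWhere(g).show()
    withFilter(g).show()
    startingWithFriendEdge(g).show()
    spark.stop()
  }
}
```

The string form is handy for quick experiments, while the Column form is checked by the Scala compiler and composes more easily with other expressions.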

 

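Author: MuXinu

Link to this post: https://www.cnblogs.com/MuXinu/p/17839253.html

Copyright notice: This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 2.5 China Mainland license.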
