So far we have concentrated on examples where the graph data resides in the memory of the
computer system. The only form of persistence we have looked at so far is saving an entire graph as
JSON or XML and reading it back into memory at a later date. Clearly, for many production
systems, we need a better story for data persistence. As delivered, JanusGraph supports a number
of different back end databases that can be used to persist graph data. A bit later, in the "Using
Docker to experiment with Cassandra and JanusGraph" section, we will explore a simple way to
experiment with one of these database options.
Once JanusGraph has been downloaded and installed (unzipped), you will find a directory called
/conf below the directory where JanusGraph was installed. In this directory you will find a number
of Java properties files that can be used to connect JanusGraph to different back end data stores.
Depending upon your configuration, these property files may work unchanged or may need to be
edited. Each property file has detailed comments that explain what the various settings do.
The official JanusGraph documentation provides detailed configuration
information for each of the currently supported back end stores.
http://docs.janusgraph.org/latest/storage-backends.html
Let's now take a brief look at some of the persistent storage options available to us when using
JanusGraph.
6.8.1. Oracle Berkeley DB
Oracle Berkeley DB may be a good choice if your application runs on a single machine but needs a
persistent store. All data is persisted to the local disk of the system where your application
runs. Berkeley DB is popular with developers who want to develop and test graph applications on a
single machine using more than an in-memory back end. Assuming you are developing an
application using Java or Groovy, the Java version of Berkeley DB, known as Berkeley DB Java Edition
(Berkeley DB JE), is provided as a set of libraries that you embed in your application and run in
the same JVM as your application. Because Berkeley DB JE runs on a single machine, the amount of
graph data that you can store will depend on the size of the disk available on that machine.
For production systems that only need a modestly sized graph this may also be a valid choice. If your
application is likely to generate very large graphs, in excess of 100 million vertices, you will probably
need to investigate some of the other, multi-node, cluster-capable storage options that we will
discuss next. Berkeley DB is probably not a good choice if you need multiple users to be accessing
and changing the graph concurrently.
The JanusGraph /conf directory contains a file called janusgraph-berkeleyje.properties that can be
used to create a new instance of a JanusGraph backed by Berkeley DB.
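
A minimal sketch of opening a graph from that properties file in the Gremlin Console is shown here. It assumes the console was started from the JanusGraph installation directory, so that the relative path resolves, and that the JanusGraph classes are already imported, as they are in the console shipped with JanusGraph. The variable names graph and g are only conventions.

```groovy
// Open (or create on first use) a graph using the settings in the properties file.
graph = JanusGraphFactory.open('conf/janusgraph-berkeleyje.properties')

// Create a traversal source for running Gremlin queries against the graph.
g = graph.traversal()

// Close the graph when finished to release the underlying storage.
graph.close()
```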
Alternatively, as there is not much to configure when using Berkeley DB, you could decide to pass the
properties directly to JanusGraph using a series of set commands. The second set command specifies
where your data will be stored on the disk.
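
A minimal sketch of this approach from the Gremlin Console might look like the following. The directory /tmp/graph is only an example location, not a required value, so substitute any writable path.

```groovy
// Build the configuration in code instead of reading a properties file.
graph = JanusGraphFactory.build().
          set('storage.backend', 'berkeleyje').    // first set: use the Berkeley DB JE back end
          set('storage.directory', '/tmp/graph').  // second set: where data is stored on disk (example path)
          open()
```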
Oracle Berkeley DB can be downloaded from the Oracle web site at the following URL.
http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/index.html