Graph database: an analysis of neo4j's underlying storage structure (3)

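3.3 Common Store types

3.3.1 The id type

In a neo4j db, every Store has its own ID file (a file with the .id suffix), and all of these files share the same format. The listing below shows the ID files of a test database: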

[test00]$ ls -lh target/neo4j-test00.db/ | grep .id
-rw-r--r-- 9 04-11 13:28 neostore.id
-rw-r--r-- 9 04-11 13:28 neostore.labeltokenstore.db.id
-rw-r--r-- 9 04-11 13:28 neostore.labeltokenstore.db.names.id
-rw-r--r-- 9 04-11 13:28 neostore.nodestore.db.id
-rw-r--r-- 9 04-11 13:28 neostore.nodestore.db.labels.id
-rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.arrays.id
-rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.id
-rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.index.id
-rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.index.keys.id
-rw-r--r-- 9 04-11 13:28 neostore.propertystore.db.strings.id
-rw-r--r-- 9 04-11 13:28 neostore.relationshipgroupstore.db.id
-rw-r--r-- 9 04-11 13:28 neostore.relationshipstore.db.id
-rw-r--r-- 9 04-11 13:28 neostore.relationshiptypestore.db.id
-rw-r--r-- 9 04-11 13:28 neostore.relationshiptypestore.db.names.id
-rw-r--r-- 9 04-11 13:28 neostore.schemastore.db.id

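3.3.1.1 Storage format of ID files

The format of neo4j files with the ".id" suffix is shown in the figure above. It consists of two parts, a file header (9 bytes) and an array of long values:

sticky (1 byte): if sticky, the id generator wasn't closed properly, so it has to be rebuilt (by going through the node, relationship, property, rel type etc. files).
nextFreeId (long): stores the high-water mark for IDs, which readHeader below loads into highId; it corresponds to the size of the record array in the matching store.
reuseId (long): holds an ID that has been freed and can be reused. Reusing IDs reduces holes in the record array and improves disk utilization.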
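
The layout above can be sketched as a small parser. This is a minimal illustration rather than neo4j's own code: the class name IdFileLayout and the helper reusableIdCount are hypothetical, and the value 0 for CLEAN_GENERATOR is an assumption. It reads the 1-byte sticky flag, the 8-byte nextFreeId, and then treats every remaining 8 bytes as one reusable ID.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that decodes the bytes of an ".id" file held in memory.
final class IdFileLayout {
    static final int HEADER_SIZE = 9;       // 1-byte sticky flag + 8-byte long
    static final byte CLEAN_GENERATOR = 0;  // assumed marker for "closed cleanly"

    final boolean sticky;
    final long nextFreeId;
    final List<Long> reusableIds = new ArrayList<>();

    IdFileLayout(ByteBuffer file) {
        sticky = file.get() != CLEAN_GENERATOR;  // header byte 0
        nextFreeId = file.getLong();             // header bytes 1..8
        while (file.remaining() >= 8) {          // body: array of longs
            reusableIds.add(file.getLong());
        }
    }

    // Number of reusable IDs implied by a file size, as in initGenerator().
    static long reusableIdCount(long fileSize) {
        return (fileSize - HEADER_SIZE) / 8;
    }

    public static void main(String[] args) {
        ByteBuffer file = ByteBuffer.allocate(HEADER_SIZE + 2 * 8);
        file.put(CLEAN_GENERATOR).putLong(42L).putLong(7L).putLong(13L);
        file.flip();
        IdFileLayout layout = new IdFileLayout(file);
        // prints "false 42 [7, 13]"
        System.out.println(layout.sticky + " " + layout.nextFreeId + " " + layout.reusableIds);
        // prints "0": a 9-byte file holds only the header
        System.out.println(reusableIdCount(9));
    }
}
```

Every file in the listing above is exactly 9 bytes, so under this layout each one contains only the header and no reusable IDs yet.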

3.3.1.2 IdGeneratorImpl.java

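In neo4j, ID allocation for every record type is implemented by IdGeneratorImpl, which is responsible for allocating IDs and for reclaiming and reusing them. For each record type (nodes, relationships, properties and so on), an IdGenerator instance can be created to manage allocation and reuse of that type's IDs.

3.3.1.2.1 Reading the id file during initialization

Below is the initialization path in IdGeneratorImpl.java that reads the id file. IdGeneratorImpl reads up to grabSize reusable IDs (reuseId) from the id file into idsReadFromFile (a LinkedList<Long>), and when an id is requested it hands out the reusable IDs in idsReadFromFile first.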

// initialize the id generator and performs a simple validation
private synchronized void initGenerator()
{
    try
    {
        fileChannel = fs.open( fileName, "rw" );
        ByteBuffer buffer = ByteBuffer.allocate( HEADER_SIZE );
        readHeader( buffer );
        markAsSticky( buffer );
        fileChannel.position( HEADER_SIZE );
        maxReadPosition = fileChannel.size();
        defraggedIdCount = (int) (maxReadPosition - HEADER_SIZE) / 8;
        readIdBatch();
    }
    catch ( IOException e )
    {
        throw new UnderlyingStorageException(
                "Unable to init id generator " + fileName, e );
    }
}
 
private void readHeader( ByteBuffer buffer ) throws IOException
{
    readPosition = fileChannel.read( buffer );
    if ( readPosition != HEADER_SIZE )
    {
        fileChannel.close();
        throw new InvalidIdGeneratorException(
                "Unable to read header, bytes read: " + readPosition );
    }
    buffer.flip();
    byte storageStatus = buffer.get();
    if ( storageStatus != CLEAN_GENERATOR )
    {
        fileChannel.close();
        throw new InvalidIdGeneratorException( "Sticky generator[ " +
                fileName + "] delete this id file and build a new one" );
    }
    this.highId.set( buffer.getLong() );
}
 
private void readIdBatch()
{
    if ( !canReadMoreIdBatches() )
        return;
    try
    {
        int howMuchToRead = (int) Math.min( grabSize*8, maxReadPosition-readPosition );
        ByteBuffer readBuffer = ByteBuffer.allocate( howMuchToRead );
        fileChannel.position( readPosition );
        int bytesRead = fileChannel.read( readBuffer );
        assert fileChannel.position() <= maxReadPosition;
        readPosition += bytesRead;
        readBuffer.flip();
        assert (bytesRead % 8) == 0;
        int idsRead = bytesRead / 8;
        defraggedIdCount -= idsRead;
        for ( int i = 0; i < idsRead; i++ )
        {
            long id = readBuffer.getLong();
            if ( id != INTEGER_MINUS_ONE )
            {
                idsReadFromFile.add( id );
            }
        }
    }
    catch ( IOException e )
    {
        throw new UnderlyingStorageException(
                "Failed reading defragged id batch", e );
    }
}

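3.3.1.2.2 Releasing an id (freeId)

When a caller frees an id, it is first appended to releasedIdList (a LinkedList<Long>). Once the number of ids in releasedIdList reaches grabSize, they are written to the end of the id file. It follows that a single IdGeneratorImpl caches at most 2 * grabSize ids (across releasedIdList and idsReadFromFile).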

/**
 * Frees the <CODE>id</CODE> making it a defragged id that will be
 * returned by next id before any new id (that hasn't been used yet) is
 * returned.
 * <p>
 * This method will throw an <CODE>IOException</CODE> if id is negative or
 * if id is greater than the highest returned id. However as stated in the
 * class documentation above the id isn't validated to see if it really is
 * free.
 */
@Override
public synchronized void freeId( long id )
{
    if ( id == INTEGER_MINUS_ONE )
    {
        return;
    }
    if ( fileChannel == null )
    {
        throw new IllegalStateException( "Generator closed " + fileName );
    }
    if ( id < 0 || id >= highId.get() )
    {
        throw new IllegalArgumentException( "Illegal id[" + id + "]" );
    }
    releasedIdList.add( id );
    defraggedIdCount++;
    if ( releasedIdList.size() >= grabSize )
    {
        writeIdBatch( ByteBuffer.allocate( grabSize*8 ) );
    }
}
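
The buffering behaviour of freeId can be illustrated with a toy model. This is a sketch under stated assumptions, not the real implementation: the class ReleasedIdBuffer is hypothetical, an in-memory list stands in for the end of the id file, and writeIdBatch (not shown in this article) is assumed to flush the whole list.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical model of the releasedIdList flush threshold.
final class ReleasedIdBuffer {
    private final int grabSize;
    private final LinkedList<Long> releasedIdList = new LinkedList<>();
    // Stand-in for the tail of the id file (assumption: no real I/O here).
    private final List<Long> idFileTail = new ArrayList<>();

    ReleasedIdBuffer(int grabSize) {
        this.grabSize = grabSize;
    }

    void freeId(long id) {
        releasedIdList.add(id);
        // Same threshold as freeId above: flush once grabSize ids are buffered.
        if (releasedIdList.size() >= grabSize) {
            idFileTail.addAll(releasedIdList);
            releasedIdList.clear();
        }
    }

    int buffered() {
        return releasedIdList.size();
    }

    List<Long> flushed() {
        return idFileTail;
    }

    public static void main(String[] args) {
        ReleasedIdBuffer buffer = new ReleasedIdBuffer(3);
        buffer.freeId(10);
        buffer.freeId(11);
        System.out.println(buffer.buffered() + " " + buffer.flushed()); // prints "2 []"
        buffer.freeId(12);
        System.out.println(buffer.buffered() + " " + buffer.flushed()); // prints "0 [10, 11, 12]"
    }
}
```

In this model the in-memory list never holds grabSize or more ids after a call returns, which is what bounds the cache size described above.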

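3.3.1.2.3 Requesting an id (nextId)

When a caller requests an id, IdGeneratorImpl uses one of two allocation strategies, the "normal allocation strategy" or the "aggressive allocation strategy" (aggressiveReuse), selected through configuration.

"Normal allocation strategy":

a) Allocate from idsReadFromFile first. If idsReadFromFile is empty, first read freed, reusable ids from the corresponding id file into idsReadFromFile.

b) If neither idsReadFromFile nor the id file holds any freed, reusable ids, allocate a brand-new id: id = highId.get(), then increment highId by 1.

"Aggressive allocation strategy" (aggressiveReuse):

a) Allocate from releasedIdList (the list of ids that were just freed) first.

b) Once releasedIdList is exhausted, allocate from idsReadFromFile. If idsReadFromFile is empty, first read freed, reusable ids from the corresponding id file into idsReadFromFile.

c) If neither idsReadFromFile nor the id file holds any freed, reusable ids, allocate a brand-new id: id = highId.get(), then increment highId by 1.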

/**
 * Returns the next "free" id. If a defragged id exist it will be returned
 * else the next free id that hasn't been used yet is returned. If no id
 * exist the capacity is exceeded (all values <= max are taken) and a
 * {@link UnderlyingStorageException} will be thrown.
 */
@Override
public synchronized long nextId()
{
    assertStillOpen();
    long nextDefragId = nextIdFromDefragList();
    if ( nextDefragId != -1 ) return nextDefragId;
    long id = highId.get();
    if ( id == INTEGER_MINUS_ONE )
    {
        // Skip the integer -1 (0xFFFFFFFF) because it represents
        // special values, f.ex. the end of a relationships/property chain.
        id = highId.incrementAndGet();
    }
    assertIdWithinCapacity( id );
    highId.incrementAndGet();
    return id;
}
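
The body of nextIdFromDefragList is not shown here, so the two strategies described above are modelled below as a rough sketch. The class IdAllocationSketch and its loadFromFile method are hypothetical; reading batches from the id file, the capacity check, and the skipping of -1 are all left out, and freed ids are assumed to be handed back in FIFO order.

```java
import java.util.LinkedList;

// Hypothetical model of the "normal" and "aggressive" allocation order.
final class IdAllocationSketch {
    private final boolean aggressiveReuse;
    private final LinkedList<Long> releasedIdList = new LinkedList<>();
    private final LinkedList<Long> idsReadFromFile = new LinkedList<>();
    private long highId;

    IdAllocationSketch(boolean aggressiveReuse, long highId) {
        this.aggressiveReuse = aggressiveReuse;
        this.highId = highId;
    }

    // An id freed during this session (not yet written to the id file).
    void freeId(long id) {
        releasedIdList.add(id);
    }

    // Pretend this id was read from the id file at start-up.
    void loadFromFile(long id) {
        idsReadFromFile.add(id);
    }

    long nextId() {
        // Aggressive strategy only: reuse ids freed in this session first.
        if (aggressiveReuse && !releasedIdList.isEmpty()) {
            return releasedIdList.removeFirst();
        }
        // Both strategies: reuse ids that came from the id file.
        if (!idsReadFromFile.isEmpty()) {
            return idsReadFromFile.removeFirst();
        }
        // Nothing to reuse: hand out a brand-new id and advance highId.
        return highId++;
    }

    public static void main(String[] args) {
        IdAllocationSketch normal = new IdAllocationSketch(false, 100);
        normal.loadFromFile(5);
        normal.freeId(7);
        // prints "5 100 101": the freshly freed id 7 is not reused
        System.out.println(normal.nextId() + " " + normal.nextId() + " " + normal.nextId());

        IdAllocationSketch aggressive = new IdAllocationSketch(true, 100);
        aggressive.loadFromFile(5);
        aggressive.freeId(7);
        // prints "7 5 100": the freshly freed id 7 is reused first
        System.out.println(aggressive.nextId() + " " + aggressive.nextId() + " " + aggressive.nextId());
    }
}
```

The only difference between the two runs is whether releasedIdList is consulted, which matches step a) of the aggressive strategy.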

posted @ 2015-06-02 10:39 GisClub
