B-Tree Concepts: An Introduction to B-Trees (I had almost forgotten all this :))
1. About B-Trees
Although most search trees are binary trees, there is a popular search tree that is not binary. This tree is known as a B-tree. In short, a B-tree is a balanced multiway search tree, well suited to organizing dynamic lookup tables on direct-access storage devices such as disks.
A B-tree of order M (M >= 3) is a tree with the following structural properties:
(1) The root is either a leaf or has between 2 and M children.
(2) All nonleaf nodes (except the root) have between ⌈M/2⌉ and M children.
(3) All leaves are at the same depth.
(4) Each node contains the following data fields:
(P0, K0, P1, K1, ..., Pi, Ki, ..., Pm, Km)
where [K0, K1, ..., Km] is the increasing sequence of keys, and
[P0, P1, ..., Pm] are the pointers to the node's child nodes, initially null.
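Properties (1) to (3) can be checked mechanically. The following is only a minimal sketch in Python, not a definitive implementation: the `Node` class and the helper names `leaf_depths` and `satisfies_properties` are made up for illustration, and a node is treated as a leaf whenever it has no children.

```python
import math
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    # Hypothetical node layout: a leaf has keys only, an interior node also has children.
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children


def leaf_depths(node: Node, depth: int = 0) -> Set[int]:
    """Collect the set of depths at which leaves occur."""
    if node.is_leaf:
        return {depth}
    depths: Set[int] = set()
    for child in node.children:
        depths |= leaf_depths(child, depth + 1)
    return depths


def satisfies_properties(root: Node, M: int) -> bool:
    """Check properties (1)-(3) for a tree of order M."""
    min_children = math.ceil(M / 2)

    def ok(node: Node, is_root: bool) -> bool:
        if node.is_leaf:
            return True
        lower = 2 if is_root else min_children  # property (1) vs. property (2)
        if not (lower <= len(node.children) <= M):
            return False
        return all(ok(child, False) for child in node.children)

    # Property (3): every leaf sits at the same depth.
    return ok(root, True) and len(leaf_depths(root)) == 1


balanced = Node(keys=[16], children=[Node(keys=[8, 11]), Node(keys=[16, 17])])
lopsided = Node(keys=[16], children=[Node(keys=[8, 11]), balanced])
print(satisfies_properties(balanced, 3), satisfies_properties(lopsided, 3))
```

The second tree fails only because its leaves sit at two different depths, which is exactly what property (3) rules out.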
All data are stored at the leaves. Each interior node contains pointers P1, P2, ..., Pm to its children, and values K1, K2, ..., K(m-1), representing the smallest key found in the subtrees P2, P3, ..., Pm, respectively. Of course, some of these pointers might be NULL, and the corresponding Ki would then be undefined. For every node, all the keys in subtree P1 are smaller than the keys in subtree P2, and so on. The leaves contain all the actual data, which are either the keys themselves or pointers to records containing the keys. We will assume the former to keep our examples simple. There are various definitions of B-trees that change this structure in mostly minor ways, but this definition is one of the popular forms. We will also insist that the number of keys in a leaf is between ⌈M/2⌉ and M.
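To make the role of the values in an interior node concrete, here is a small Python sketch. It assumes a hypothetical `Node` class and helper names (`smallest_key`, `make_interior`) that are not part of the original text, and the key values are made up for the example. The i-th value of an interior node is computed as the smallest key in the subtree to its right.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical layout: leaves hold the actual keys, interior nodes hold separator values.
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def smallest_key(node: Node) -> int:
    """Follow the leftmost pointers down to a leaf and return its first key."""
    while node.children:
        node = node.children[0]
    return node.keys[0]


def make_interior(children: List[Node]) -> Node:
    """Build an interior node whose values are the smallest keys of subtrees P2..Pm."""
    separators = [smallest_key(child) for child in children[1:]]
    return Node(keys=separators, children=list(children))


leaves = [Node(keys=[8, 11, 12]), Node(keys=[16, 17]), Node(keys=[22, 23, 31])]
interior = make_interior(leaves)
print(interior.keys)  # [16, 22]
```

Note that the first subtree contributes no value of its own: everything smaller than the first value simply belongs to P1.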
The following diagram shows a typical B-tree, in this case a B-tree of order 4.
A B-tree of order 4 is more popularly known as a 2-3-4 tree, and a B-tree of order 3 is known as a 2-3 tree. We will describe the operation of B-trees by using the special case of 2-3 trees. Our starting point is the 2-3 tree that follows.
We have drawn the interior nodes (nonleaves), each of which contains two pieces of data. A dash as the second piece of information in an interior node indicates that the node has only two children. Leaves are drawn in boxes, which contain the keys. The keys in the leaves are ordered. To perform a Find, we start at the root and branch in one of (at most) three directions, depending on the relation of the key we are looking for to the two (possibly one) values stored at the node.
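The Find described above can be sketched in a few lines of Python. This is an illustration under assumptions: the `Node` class and the sample tree are made up, and each value in an interior node is taken to be the smallest key of the subtree to its right, as defined earlier. Under that convention, the branch to follow is the number of stored values that are less than or equal to the search key, which `bisect_right` from the standard library computes.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical layout: leaves hold keys, interior nodes hold separator values.
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def find(root: Node, x: int) -> bool:
    """Walk from the root to a leaf, then look for x among the leaf's keys."""
    node = root
    while node.children:
        # Count the values <= x to pick one of (at most) three directions.
        node = node.children[bisect_right(node.keys, x)]
    return x in node.keys


# A made-up 2-3 tree with one interior node and three leaves.
root = Node(
    keys=[16, 22],
    children=[Node(keys=[8, 11, 12]), Node(keys=[16, 17]), Node(keys=[22, 23, 31])],
)
print(find(root, 17), find(root, 18))  # True False
```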
To perform an Insert on a previously unseen key, X, we follow the path as though we were performing a Find. When we get to a leaf node, we have found the correct place to put X. Thus, to insert a node with key 18, we can just add it to a leaf without causing any violations of the 2-3 tree properties. The result is shown in the following figure.
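The easy case of Insert, where the target leaf still has room, might look like the sketch below. The `Node` class, the function name `insert_if_room`, and the sample tree are assumptions made for this example. The function deliberately refuses to insert into a full leaf instead of splitting it.

```python
from bisect import bisect_right, insort
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical layout: leaves hold keys, interior nodes hold separator values.
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def insert_if_room(root: Node, x: int, M: int = 3) -> bool:
    """Follow the Find path; add x to the leaf only if it has fewer than M keys."""
    node = root
    while node.children:
        node = node.children[bisect_right(node.keys, x)]
    if len(node.keys) >= M:
        return False  # leaf is full; a split would be needed
    insort(node.keys, x)  # keep the keys in the leaf ordered
    return True


# A made-up 2-3 tree; the middle leaf has room for one more key.
root = Node(
    keys=[16, 22],
    children=[Node(keys=[8, 11, 12]), Node(keys=[16, 17]), Node(keys=[22, 23, 31])],
)
added_18 = insert_if_room(root, 18)
added_1 = insert_if_room(root, 1)
print(added_18, root.children[1].keys, added_1)  # True [16, 17, 18] False
```

No value in the interior node needs to change here, because a key that reaches a leaf through a stored value is never smaller than that value.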
Unfortunately, since a leaf can hold only two or three keys, this might not always be possible. If we now try to insert 1 into the tree, we find that the node where it belongs is already full. Placing our new key into this node would give it a fourth element, which is not allowed. This can be solved by making two nodes of two keys each and adjusting the information in the parent.
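Splitting a full leaf into two nodes of two keys each, and adjusting the parent, can be sketched as follows. This is a simplified illustration: `Node` and `insert_below` are hypothetical names, the keys are made up, the parent is assumed to sit directly above the leaves, and nothing is done if the parent itself ends up with too many children.

```python
from bisect import bisect_right, insort
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    # Hypothetical layout: leaves hold keys, interior nodes hold separator values.
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def insert_below(parent: Node, x: int, M: int = 3) -> None:
    """Insert x into the proper leaf under parent, splitting the leaf if it overflows."""
    i = bisect_right(parent.keys, x)
    leaf = parent.children[i]
    insort(leaf.keys, x)
    if len(leaf.keys) <= M:
        return
    # The leaf now has M + 1 keys: cut it in half.
    mid = len(leaf.keys) // 2
    right = Node(keys=leaf.keys[mid:])
    leaf.keys = leaf.keys[:mid]
    # Adjust the parent: a new pointer, plus the smallest key of the new right leaf.
    parent.children.insert(i + 1, right)
    parent.keys.insert(i, right.keys[0])


parent = Node(keys=[16], children=[Node(keys=[8, 11, 12]), Node(keys=[16, 17])])
insert_below(parent, 1)
print(parent.keys, [leaf.keys for leaf in parent.children])
# [11, 16] [[1, 8], [11, 12], [16, 17]]
```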
Unfortunately, this idea does not always work, as can be seen by an attempt to insert 19 into the current tree. If we make two nodes of two keys each, we obtain the following tree.
This tree has an internal node with four children, but we only allow three per node. The solution is simple. We merely split this node into two nodes with two children each. Of course, this node might be one of three children itself, and thus splitting it would create a problem for its parent (which would now have four children), but we can keep on splitting nodes on the way up to the root until we either get to the root or find a node with only two children.
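The upward chain of splits can be expressed recursively. The sketch below is one possible way to write it, not the definitive algorithm: `Node`, `_insert`, and `insert` are hypothetical names, keys are assumed to be previously unseen, and a split of the root is handled by creating a new root with two children, a step the text above does not spell out.

```python
from bisect import bisect_right, insort
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    # Hypothetical layout: leaves hold keys, interior nodes hold separator values.
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def _insert(node: Node, x: int, M: int) -> Optional[Tuple[int, Node]]:
    """Insert x below node. If node splits, return (smallest key of new sibling, sibling)."""
    if not node.children:
        insort(node.keys, x)
        if len(node.keys) <= M:
            return None
        mid = len(node.keys) // 2
        right = Node(keys=node.keys[mid:])
        node.keys = node.keys[:mid]
        return right.keys[0], right
    i = bisect_right(node.keys, x)
    split = _insert(node.children[i], x, M)
    if split is None:
        return None
    sep, new_child = split
    node.keys.insert(i, sep)
    node.children.insert(i + 1, new_child)
    if len(node.children) <= M:
        return None  # this node absorbed the extra child; splitting stops here
    # Too many children: split this interior node and pass the problem to the parent.
    mid = len(node.children) // 2
    up = node.keys[mid - 1]  # smallest key under the new right sibling
    right = Node(keys=node.keys[mid:], children=node.children[mid:])
    node.keys = node.keys[: mid - 1]
    node.children = node.children[:mid]
    return up, right


def insert(root: Node, x: int, M: int = 3) -> Node:
    """Insert x and return the (possibly new) root."""
    split = _insert(root, x, M)
    if split is None:
        return root
    sep, right = split
    return Node(keys=[sep], children=[root, right])  # the tree grows by one level


# Made-up example: three full leaves under one interior node.
root = Node(keys=[4, 7], children=[Node(keys=[1, 2, 3]), Node(keys=[4, 5, 6]), Node(keys=[7, 8, 9])])
root = insert(root, 10)
print(root.keys, [child.keys for child in root.children])  # [7] [[4], [9]]
```

In this run the leaf split gives the interior node four children, so it is split into two nodes with two children each, and because it was the root, a new root is placed above them.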