[墨鳌] [Data Structures] Balanced Trees: the (2,4) Tree

2-3-4 Tree (Balanced 2-3-4 Tree)

2-3-4 Tree Learning Resources

Definition

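A 2-3-4 tree is either empty or built from the following four kinds of nodes.

Internal nodes come in three kinds:

  1. 2-node: one key and links to the two intervals it delimits
  2. 3-node: two keys and links to the three intervals they delimit
  3. 4-node: three keys and links to the four intervals they delimit

Leaf nodes are the fourth kind: a leaf holds 1, 2 or 3 keys (that is, it delimits 2, 3 or 4 intervals) but has no children. In other words, a leaf is a 2-node, 3-node or 4-node whose child links are all null.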

A balanced 2-3-4 tree is a 2-3-4 search tree in which every null link is the same distance from the root. Put differently, all leaf nodes are at the same level (the tree is perfectly balanced).

Structure of a 2-3-4 Node

Each node in a 2-3-4 tree contains the following:

  1. A sorted list of data elements (the keys).
  2. A list of pointers to its child nodes.
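The two lists above map naturally onto a small class. The following is only a minimal sketch: the names `Node`, `keys`, `children`, `is_leaf` and `kind` are illustrative assumptions, not part of the original material.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical 2-3-4 node: sorted keys plus child pointers."""
    keys: List[int] = field(default_factory=list)         # 1 to 3 sorted keys
    children: List["Node"] = field(default_factory=list)  # empty for a leaf

    def is_leaf(self) -> bool:
        # A leaf has no child pointers at all.
        return not self.children

    def kind(self) -> int:
        # 2, 3 or 4: the number of intervals delimited by the keys.
        return len(self.keys) + 1


# A 2-node root with a 2-node leaf and a 4-node leaf below it.
leaf_a = Node(keys=[5])
leaf_b = Node(keys=[20, 30, 40])
root = Node(keys=[10], children=[leaf_a, leaf_b])
```

An internal node always carries exactly one more child than it has keys, which is why `kind()` can be derived from the key count alone.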

Getting a Feel for It First

[Image: animated step-by-step example of a 2-3-4 tree]

Searching in a 2-3-4 Tree

A 2-3-4 tree stores sorted data, so searching works much as in an ordinary search tree. Starting at the root, compare the target with the keys of the current node to pick the interval that could contain it, then recurse into that child. If you reach a leaf node and still haven't found the key, the key is not in the tree.
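The search described above can be sketched as follows. This is an illustrative version, assuming a bare `Node` class with `keys` and `children` lists (both names are assumptions) and using the standard library's `bisect_left` to pick the interval.

```python
from bisect import bisect_left


class Node:
    """Hypothetical 2-3-4 node used only for this sketch."""
    def __init__(self, keys, children=None):
        self.keys = keys                # sorted, 1 to 3 keys
        self.children = children or []  # empty for a leaf


def search(node, target):
    """Return True if target is stored in the tree rooted at node."""
    while node is not None:
        # Index of the first key >= target; it is also the interval index.
        i = bisect_left(node.keys, target)
        if i < len(node.keys) and node.keys[i] == target:
            return True
        if not node.children:  # reached a leaf without finding the key
            return False
        node = node.children[i]
    return False


root = Node([10, 20], [Node([5]), Node([12, 15]), Node([25, 30, 35])])
```

With this sample tree, `search(root, 15)` descends into the middle child and finds the key, while `search(root, 13)` ends at the same leaf and reports that the key is absent.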

Insertion into a 2-3-4 Tree

Here's the most important thing to remember about insertion into a 2-3-4 tree:

New elements are inserted only in leaf nodes.

With the above rule in mind, to insert x into the tree, search for the leaf node whose interval contains x. Once we reach that leaf, we insert x into it. Sounds easy, doesn't it?

But wait, there's a catch. What happens if the leaf node is already full, that is, it already holds three elements?

The obvious question is how to avoid arriving at a leaf that is a 4-node. The trick is to split every 4-node we encounter as we traverse down to find the node for insertion. This way, once we reach the leaf node, it is guaranteed to have room for at least one more element.

This splitting technique is called a preemptive split, and it goes hand in hand with top-down insertion. The rule is: before visiting a node, make sure it is not a 4-node; if it is, split it first. Applying this rule gives us an invariant: the node we are about to visit is never a 4-node.

What happens if the root of the tree is a 4-node when we start the insertion? We simply split the root before traversing further: a new root is created and the old root is split into its two children (for an example, look at the transition from step 10 to step 11 in the above animation).

We've talked about 'split' several times without actually discussing how a split works. Let's look at the splitting of a 4-node in detail.

Splitting a 4-Node in a 2-3-4 Tree

To split a 4-node, follow the steps below.

  1. Pick the middle element and move it to the parent.

    If the node is the root and has no parent, a new root is created and the middle element moves to the new root.

  2. Create two 2-nodes and move the left element to one of them (the left 2-node) and the right element to the other (the right 2-node). They sit immediately to the left and right of the element that moved up.

  3. If the original 4-node had children, the newly created 2-nodes now point to them: the two leftmost children go under the left 2-node and the two rightmost under the right 2-node.
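The three steps above can be sketched as a pair of helpers. This is a hedged illustration, not a reference implementation: `Node`, `split_child` and `split_root` are hypothetical names, and the parent is assumed to have room (it is not itself a 4-node).

```python
class Node:
    """Hypothetical 2-3-4 node used only for this sketch."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def split_child(parent, i):
    """Split the 4-node parent.children[i]; parent must not be a 4-node."""
    full = parent.children[i]
    assert len(full.keys) == 3 and len(parent.keys) < 3
    left_key, middle, right_key = full.keys
    # Step 2: two new 2-nodes; step 3: each inherits two of the old children
    # (for a leaf the slices are simply empty).
    left = Node([left_key], full.children[:2])
    right = Node([right_key], full.children[2:])
    # Step 1: the middle key moves up into the parent.
    parent.keys.insert(i, middle)
    parent.children[i:i + 1] = [left, right]


def split_root(root):
    """If the root is a 4-node, split it under a brand new root."""
    if len(root.keys) != 3:
        return root
    new_root = Node([], [root])  # temporarily empty; filled by the split
    split_child(new_root, 0)
    return new_root


parent = Node([50], [Node([10, 20, 30]), Node([60])])
split_child(parent, 0)
root = split_root(Node([1, 2, 3]))
```

After the calls, `parent` holds the keys 20 and 50 with three 2-node children, and `root` is a new 2-node holding 2 above the leaves 1 and 3.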

So a split only moves the middle element of the 4-node to the parent and creates two new 2-nodes. But wait, what if the parent doesn't have the capacity for another element because it is itself a 4-node?

Good question. The answer is the same as before: this will never happen.

Why? Because the parent can never be a 4-node. If it were, we would have missed splitting it before traversing to its child. Remember, we never visit a 4-node: if we are at a node whose child is a 4-node, we first split that child and then visit whichever resulting 2-node covers the range of the element to be inserted. Since every ancestor on the search path has already been split where necessary, there is always room to move one element from the child up into its parent.

During insertion, whenever we find a 4-node child, the current node is always a 2-node or a 3-node. Hence a split has no cascading effects and does not disturb the rest of the tree. We call such transformations local transformations.

To summarize, by following the preemptive split rule, we ensure that all splits are local transformations.
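Putting the pieces together, top-down insertion with preemptive splits might look like the sketch below. It assumes distinct integer keys, and all names (`Node`, `split_child`, `insert`, `inorder`, `leaf_depths`) are illustrative assumptions rather than anything defined in the original material.

```python
from bisect import bisect_left


class Node:
    """Hypothetical 2-3-4 node used only for this sketch."""
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []


def split_child(parent, i):
    """Split the 4-node parent.children[i], pushing its middle key up."""
    full = parent.children[i]
    left = Node(full.keys[:1], full.children[:2])
    right = Node(full.keys[2:], full.children[2:])
    parent.keys.insert(i, full.keys[1])
    parent.children[i:i + 1] = [left, right]


def insert(root, key):
    """Insert key (assumed not yet present) and return the possibly new root."""
    if root is None:
        return Node([key])
    if len(root.keys) == 3:           # 4-node root: split it under a new root
        root = Node([], [root])
        split_child(root, 0)
    node = root
    while node.children:
        i = bisect_left(node.keys, key)
        if len(node.children[i].keys) == 3:
            split_child(node, i)      # preemptive split before visiting
            if key > node.keys[i]:    # pick the correct half
                i += 1
        node = node.children[i]
    # The leaf is guaranteed not to be a 4-node, so there is room.
    node.keys.insert(bisect_left(node.keys, key), key)
    return root


def inorder(node):
    """All keys in ascending order."""
    if not node.children:
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out += inorder(child)
        if i < len(node.keys):
            out.append(node.keys[i])
    return out


def leaf_depths(node, depth=0):
    """Set of depths at which leaves occur (one value if balanced)."""
    if not node.children:
        return {depth}
    depths = set()
    for child in node.children:
        depths |= leaf_depths(child, depth + 1)
    return depths
```

Because the tree only grows in height when the root is split, every leaf stays at the same depth, which `leaf_depths` makes easy to check.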

Reviewing the Example

[Image: the worked example, revisited]

Deletion in a 2-3-4 Tree

Deletion is trickier in 2-3-4 trees. As with insertion, deletion always happens at a leaf node. When the element to be deleted is in an internal node, we follow a series of steps that move the target element to a leaf node, and then delete it there.

When the leaf node containing the target element is a 3-node or 4-node, deletion is straightforward. Just remove the element from the node; the resulting node is a 2-node or 3-node respectively.

However, if the leaf node is a 2-node and we delete its only element, the node becomes empty, which violates the properties of a 2-3-4 tree (the leaves would no longer all sit at the same depth). We call this situation an underflow.

To avoid underflow, as we traverse towards the leaf node we follow certain steps to ensure that the leaf node containing the target element is never a 2-node (a node with only one element).

The rules in detail:

  1. Start searching for the node containing the element that we are going to delete.

  2. If the element is in a leaf node, delete it. We are guaranteed that the leaf node will not underflow: it is always a 3-node or 4-node (except when the leaf node is the root).

  3. If the element is not in a leaf node, note down the node and continue down to the leaf that contains the element's predecessor or successor.

    Once we find the leaf node with the predecessor or successor, we swap the element to be deleted with it. Then we can safely remove the element from the leaf node.

  4. While traversing down towards the leaf node, whenever we encounter a 2-node (other than the root node), we convert it into a 3-node or 4-node (using the techniques described below). This way, when we reach the leaf, we are guaranteed that it is not a 2-node and can safely delete (or swap and delete) the element.

To dig further into the deletion algorithm, let's first go over a few basic concepts (you can skip them if you already know them).

In-Order Predecessor and In-Order Successor

For an element X, its in-order predecessor is defined as the largest key smaller than X.

For an element X, its in-order successor is defined as the smallest key larger than X.
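For a key stored in an internal node, these two definitions translate into a simple walk: the predecessor is the rightmost key of the subtree just left of the key, and the successor is the leftmost key of the subtree just right of it. A minimal sketch, with `Node`, `predecessor` and `successor` as assumed names:

```python
class Node:
    """Hypothetical 2-3-4 node used only for this sketch."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def predecessor(node, i):
    """In-order predecessor of node.keys[i]; node must be internal."""
    cur = node.children[i]        # subtree holding the keys just below keys[i]
    while cur.children:
        cur = cur.children[-1]    # keep going right
    return cur.keys[-1]


def successor(node, i):
    """In-order successor of node.keys[i]; node must be internal."""
    cur = node.children[i + 1]    # subtree holding the keys just above keys[i]
    while cur.children:
        cur = cur.children[0]     # keep going left
    return cur.keys[0]


root = Node([20, 50], [Node([5, 10]), Node([30, 40]), Node([60])])
```

Both walks always end in a leaf, which is what lets deletion swap an internal key with its predecessor or successor and then delete at the leaf.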

Converting a 2-Node into a 3-Node

During a deletion in a 2-3-4 tree, whenever we find a 2-node while traversing towards the leaf, we first convert it to a 3-node (or to a 4-node, when the conversion is a merge). The only exception is the root: we leave it alone even if it is a 2-node.

To convert a 2-node, we can use one of the following three options:

1. Option 1: Rotation (also known as Transfer)

This is the simplest case. We can apply it when the 2-node has an adjacent 3-node or 4-node sibling, in other words when its immediate left or right sibling has 2 or 3 elements. In such cases we 'steal' an element from that sibling. We cannot put the stolen element directly into the current node, as that would break the ordering of the 2-3-4 tree. Instead, we move the sibling's element up into the parent and move a parent element down into the 2-node. That's why it's called a rotation.

Here are the steps for the rotation.

Stealing from a Left Adjacent Sibling

  1. Pick the parent key that separates the left sibling from the 2-node and move it down into the 2-node (which is being transformed into a 3-node).

  2. Pick the highest key in the left sibling (call it X) and move it up into the parent.

  3. If the sibling has a subtree for the interval greater than X, that subtree is also moved under the original 2-node.

Stealing from a Right Adjacent Sibling

  1. Pick the parent key that separates the 2-node from the right sibling and move it down into the 2-node (which is being transformed into a 3-node).
  2. Pick the smallest key in the right sibling (call it Y) and move it up into the parent.
  3. If the sibling has a subtree for the interval less than Y, that subtree is also moved under the original 2-node.
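Both directions of the rotation can be sketched in a few lines. This is an illustration under assumptions: `Node`, `rotate_from_left` and `rotate_from_right` are hypothetical names, `i` is the index of the 2-node among the parent's children, and the caller is assumed to have checked that the chosen sibling has at least two keys.

```python
class Node:
    """Hypothetical 2-3-4 node used only for this sketch."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def rotate_from_left(parent, i):
    """Grow the 2-node parent.children[i] using its left sibling."""
    node, sibling = parent.children[i], parent.children[i - 1]
    # The separating parent key moves down to the front of the 2-node.
    node.keys.insert(0, parent.keys[i - 1])
    # The sibling's highest key X moves up into the parent.
    parent.keys[i - 1] = sibling.keys.pop()
    # The sibling's subtree for keys greater than X follows along.
    if sibling.children:
        node.children.insert(0, sibling.children.pop())


def rotate_from_right(parent, i):
    """Grow the 2-node parent.children[i] using its right sibling."""
    node, sibling = parent.children[i], parent.children[i + 1]
    # The separating parent key moves down to the end of the 2-node.
    node.keys.append(parent.keys[i])
    # The sibling's smallest key Y moves up into the parent.
    parent.keys[i] = sibling.keys.pop(0)
    # The sibling's subtree for keys less than Y follows along.
    if sibling.children:
        node.children.append(sibling.children.pop(0))


p1 = Node([30], [Node([10, 20]), Node([40])])
rotate_from_left(p1, 1)

p2 = Node([30], [Node([10]), Node([40, 50])])
rotate_from_right(p2, 0)
```

In the first example 20 rises into the parent and 30 drops into the former 2-node; the second example mirrors it with 40 rising and 30 dropping.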


2. Option 2: Fusion (or Merge)

Rotation can convert a 2-node to a 3-node if either adjacent sibling is a 3-node or 4-node. But what if both adjacent siblings are 2-nodes (or, even worse, one sibling doesn't exist and the other is a 2-node)?

In such cases, if the parent node is a 3-node or 4-node, we create a new 4-node consisting of the following three elements:

  1. The element in the current node.

  2. The element in the 2-node sibling.

  3. The element in the parent node that sits between these two siblings. We remove this element from the parent and put it in the new 4-node.

The parent has lost one element. If the parent was a 3-node, it is now a 2-node; if it was a 4-node, it is now a 3-node.
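A fusion of two adjacent 2-nodes can be sketched as below. The names `Node` and `fuse` are assumptions made for illustration, and `i` is taken to be the index of the left node of the pair.

```python
class Node:
    """Hypothetical 2-3-4 node used only for this sketch."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def fuse(parent, i):
    """Merge parent.children[i] and parent.children[i + 1] into one 4-node.

    The parent key between the two siblings moves down into the merged node,
    so the parent must be a 3-node or 4-node to stay non-empty.
    """
    left, right = parent.children[i], parent.children[i + 1]
    separator = parent.keys.pop(i)           # the parent loses one element
    left.keys = left.keys + [separator] + right.keys
    left.children = left.children + right.children
    del parent.children[i + 1]               # the right sibling is absorbed
    return left


parent = Node([20, 40], [Node([10]), Node([30]), Node([50])])
merged = fuse(parent, 0)
```

Here the 3-node parent gives up 20 and becomes a 2-node holding 40, while the merged child becomes a 4-node holding 10, 20 and 30.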

3. Option 3: Shrink Tree (or Root Removal)

Rotation and fusion take care of most cases. However, there's a relatively rare case in which the current node's sibling is a 2-node and the parent is a 2-node as well. Whenever that happens, the 2-node parent is guaranteed to be the root (during traversal we convert every 2-node except the root to a 3-node or 4-node, so the root is the only possible 2-node parent). The three nodes are merged and the height of the tree decreases by 1.

  • As I understand it, the merge produces a new 4-node.

In such situations, we merge the following nodes to create a 4-Node.

  1. The Element in the current Node
  2. The Element in the 2-Node parent (root).
  3. The Element in the 2-Node sibling

The current 2-node is replaced by this newly created 4-node. The children of the current node and of the sibling become children of the new node. The root is now empty, as its only value has moved down to the new node, so we delete the root and reduce the height of the tree.
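The shrink step can be sketched as a function that returns the new root. `Node`, `shrink_root` and `height` are hypothetical names chosen for this illustration.

```python
class Node:
    """Hypothetical 2-3-4 node used only for this sketch."""
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def shrink_root(root):
    """Merge a 2-node root with its two 2-node children into one 4-node."""
    left, right = root.children
    if not (len(root.keys) == len(left.keys) == len(right.keys) == 1):
        raise ValueError("shrink applies only when all three are 2-nodes")
    # The merged node becomes the new root; the old root is discarded.
    return Node(left.keys + root.keys + right.keys,
                left.children + right.children)


def height(node):
    """Number of edges from node down to a leaf."""
    h = 0
    while node.children:
        node = node.children[0]
        h += 1
    return h


old_root = Node([20], [
    Node([10], [Node([5]), Node([15])]),
    Node([30], [Node([25]), Node([35])]),
])
new_root = shrink_root(old_root)
```

The new root holds 10, 20 and 30 over the four original leaves, and the tree is one level shorter than before.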


posted @ 2022-04-05 15:17  墨鳌