TRIE - Data Structure
Introduction
介绍
Trie,又称单词查找树,是一种树形结构,
Trie is an ordered tree data structure that uses strings as keys. Unlike Binary Trees, Tries do not store keys associated with the node. The key is actually determined based on the position of the node on the tree. Any descendants of a node shares a common prefix of the key string associated with that node. Hence, trie is also called as Prefix Tree. The word "trie" comes from Retrieval, and it is pronounced as "try".
Trie是一种以字符串为键的有序树状数据结构,
Since this data structure is a prefix tree, trie is commonly used in Dictionaries, Phone Directories and matching algorithms. Trie is best-suited for phone directory (any matching application for that matter) because it is very efficient in matching strings.
因为这个数据结构是一个前缀树,Trie通常被用在字典算法中,
So I have decided to implement Trie myself in C#. I have created three classes:
所以我决定使用C#来实现一个Trie数据结构,
- Node: Represents a single tree node;
- Node:树节点类;
- NodeCollection: Represents the children of a node;
- NodeCollection:树节点结合类;
- Trie: Trie implementation to insert and search nodes.
- Trie:Trie类实现了节点的查找以及插入操作。
Implementation
实现
Node: Node represents a basic tree node. Node implements both Depth First and Breadth First algorithms to search its children. It also contains its Parent node and Children node. Node has a key and a Value. Key contains the single character and the value has the actual value of the node. The actual key of the node will be determined by suffixing the single character to its parent's key. Node has a special property called IsTerminal. This property is set to true if the key or a value represents a complete string.
Node:Node类代表了一个基本的节点类,
See the picture below:
参见下面的图片:
NodeCollection: This class is a simple collection class which implements the IEnumerable interface for iteration operations. This class contains an internal List class of type Node. NodeCollection implements the standard list methods such as Add( ), Contains( ), Remove( ) etc.
NodeCollection:该类是一个简单的集合类型,
Trie: This class implements the Trie algorithms such as Insert and Find methods.
Trie:该类实现了Trie算法,实现了插入和查找操作。
示例代码:
public static Node InsertNode(string name, Node root)
{
//Is name null?
if (string.IsNullOrEmpty(name))
throw new ArgumentNullException("Null Key");
//set the index, start inserting characters
int index = 1;
//key
string key;
//start with the root node
Node currentNode = root;
//loop for all charecters in the name
while (index <= name.Length)
{
//get the key character
key = name[index - 1].ToString();
//does the node with same key already exist?
Node resultNode = currentNode.Children.GetNodeByKey(key);
//No, this is a new key
if (resultNode == null)
{
//Add a node
Node newNode = new Node(key, name.Substring(0, index));
//If reached the last charaecter, this is a valid full name
if (index == name.Length)
newNode.IsTerminal = true;
//add the node to currentNode(i.e. Root node for the first time)
currentNode.Children.Add(newNode);
//set as the current node
currentNode = newNode;
}
else
{
//node already exist, set as tghe current node
//and move to the next character in the name
currentNode = resultNode;
}
//move to the next character in the name
index++;
}
//all done, return root node
return root;
}
The Insert method inserts the string as one character at a time. It starts with the first character; if the first character doesn't already exist in the root node it adds a new node with the new character and returns the new node. Otherwise it returns the node with the fist character for adding remaining characters. It loops until it adds the entire string. Once it reaches the last character, it marks that node as a terminal node because this node represents a complete string in the tree hierarchy.
插入操作再插入一个字符串的时候,从第一个字符开始,
The Find methods is implemented by Depth First search algorithm. The tree is searched until the complete string is found. Below is the code.
查找方法实现了深度优先的查找算法,
public static bool Find(Node node, string key){
//Is key empty
if (string.IsNullOrEmpty(key))
return true;//terminal Node
//get the first character
string first = key.Substring(0, 1);
//get the tail: key - first character
string tail = key.Substring(1);
Node curNode = node.Children.GetNodeByKey(first);
//loop until you locate the key i.e. "Jo"
if (curNode != null)
{
return Find(curNode, tail);
}
else
{
//not found, return false
return false;
}
}
I've attached the entire source code above. The source code contains the Trie class library and a console application to test the Trie library. The console application loads a set of names (stored in names.txt in debug folder) in to the tree and provides options to run Depth First & Breadth First algorithm. The application also provides options for Directory Look-Up and Find option.
我已经将源代码添加在附件中了,源代码中包含了Trie算法类库
The class library can be further used to develop a web based phone directory. The data can also be stored on the client (it is too small) and the Trie can be implemented in JavaScript.
这个类库可以被进一步开发成为一个基于Web的电话目录,
Happy Coding!
编程快乐!
原贴地址:http://www.codeproject.