Data Structures and Algorithm Analysis: Trees
Concept
Nodes with the same parent are siblings.
We say that u is an ancestor of v if one of the following holds:
- u = v
- u is the parent of v, or
- u is the parent of an ancestor of v.
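Note that a node is its own ancestor. Hence there is also the notion of a proper ancestor: u is a proper ancestor of v when u is an ancestor of v and u ≠ v.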
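The ancestor definition can be checked by walking parent pointers upward from v. This is only a minimal sketch: the `Node` class and the function names are hypothetical helpers for these notes, not part of DSAA.

```python
class Node:
    """Tree node with a parent pointer (hypothetical helper for this sketch)."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def is_ancestor(u, v):
    """True if u is an ancestor of v; every node counts as its own ancestor."""
    node = v
    while node is not None:
        if node is u:
            return True
        node = node.parent  # climb one edge toward the root
    return False


def is_proper_ancestor(u, v):
    """Ancestor in the strict sense: u must differ from v."""
    return u is not v and is_ancestor(u, v)


# Small example tree: r has children a and b, and a has child c.
root = Node("r")
a = Node("a", root)
b = Node("b", root)
c = Node("c", a)
```

Here `is_ancestor(c, c)` holds while `is_proper_ancestor(c, c)` does not, which is exactly the distinction noted above.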
A leaf node is a node without children.
An internal node is a node with one or more children.
Note that every node is either a leaf or an internal node.
level/depth
Depth of a node = number of edges on the path from the node to the root.
Nodes with the same depth form a level of the tree.
Note that in DSAA levels are numbered from 0, i.e., a node of depth i is on level i.
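A small sketch of depth and levels, assuming the tree is given as a hypothetical `parent` map from each node to its parent (the root maps to `None`). The representation is chosen for brevity and is not prescribed by the notes.

```python
from collections import defaultdict

# Hypothetical example tree: r is the root, a and b are its children,
# and c and d are children of a.
parent = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a"}


def depth(node, parent):
    """Number of edges on the path from node up to the root."""
    edges = 0
    while parent[node] is not None:
        node = parent[node]
        edges += 1
    return edges


def levels(parent):
    """Group nodes by depth; level i holds the nodes of depth i (starting at 0)."""
    by_level = defaultdict(list)
    for node in parent:
        by_level[depth(node, parent)].append(node)
    return dict(by_level)
```

With this map the root sits on level 0, `a` and `b` on level 1, and `c` and `d` on level 2.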
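No new concepts here.
Note the difference in time complexity between the two list implementations (array, linked list). Note that the key is what is used for lookups. key-value, key-value!
What trees are used for:
Data Dictionary
Maintaining an ordering relation on the data.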
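A list implemented using an array (kept sorted by key)
- Searching for an item: O(log n), by binary search
- Inserting an item: O(n), since later elements must be shifted
A list implemented using a linked list (kept sorted by key)
- Searching for an item: O(n)
- Inserting an item: O(n), since the insert position must be found first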
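The costs above can be sketched as follows, assuming both lists are kept sorted by key (an assumption; the notes do not say so explicitly). The array version uses Python's standard `bisect` module, and the `ListNode` class is a hypothetical helper.

```python
import bisect


def array_search(sorted_keys, key):
    """Binary search in a sorted array: O(log n) comparisons."""
    i = bisect.bisect_left(sorted_keys, key)
    return i < len(sorted_keys) and sorted_keys[i] == key


def array_insert(sorted_keys, key):
    """Finding the slot is O(log n), but shifting later elements makes it O(n)."""
    bisect.insort(sorted_keys, key)


class ListNode:
    """Singly linked list node (hypothetical helper for this sketch)."""

    def __init__(self, key, next=None):
        self.key = key
        self.next = next


def linked_search(head, key):
    """No random access, so the list is scanned node by node: O(n)."""
    node = head
    while node is not None:
        if node.key == key:
            return True
        node = node.next
    return False


def linked_insert_sorted(head, key):
    """Insert key keeping ascending order; returns the (possibly new) head.

    The splice itself is O(1), but finding the position is O(n).
    """
    if head is None or key <= head.key:
        return ListNode(key, head)
    node = head
    while node.next is not None and node.next.key < key:
        node = node.next
    node.next = ListNode(key, node.next)
    return head
```

Neither structure gets both operations below O(n), which is the motivation for moving to trees.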
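Be careful to distinguish the list in the tree (dictionary) setting from a plain chain-style list, i.e., whether the list has to maintain an ordering.
Key field = the field used when searching for a data item. The key is what is used for lookups.
Multiple data items with the same key are referred to as duplicates.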
Properties of Trees (focusing on binary trees)
- A tree with n nodes has n-1 edges
Each non-root node v has exactly one edge connecting it to its parent, and the root has none, so the n-1 non-root nodes account for n-1 edges.
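- Let T be a tree where every internal node has at least 2 child nodes. If m is the number of leaf nodes, then the number of internal nodes is at most m-1.
Let x be the average number of child nodes per internal node; the number of internal nodes on each level can then be bounded by a geometric series. More directly: with i internal nodes the tree has i + m - 1 edges, and every internal node accounts for at least 2 of them, so 2i ≤ i + m - 1, i.e., i ≤ m - 1.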
- A tree is a recursive data structure, so many tree algorithms are most naturally written recursively.
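- A complete binary tree with n ≥ 2 nodes has height O(log n)
Again this follows from a geometric series: levels 0, 1, …, h-1 are full and together hold 1 + 2 + … + 2^(h-1) = 2^h - 1 nodes, so n ≥ 2^h and h ≤ log2 n.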
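The leaf/internal-node bound and the recursive nature of trees can be illustrated together with a short recursive count. This is a sketch under an assumed representation: a tree is a nested list of its child subtrees, and the empty list `[]` is a leaf.

```python
def count_nodes(tree):
    """Return (leaves, internal) for a tree given as a nested list of children."""
    if not tree:  # a node with no children is a leaf
        return 1, 0
    leaves, internal = 0, 1  # this node itself is internal
    for child in tree:
        child_leaves, child_internal = count_nodes(child)
        leaves += child_leaves
        internal += child_internal
    return leaves, internal


# Root with three children: one internal node (with two leaf children)
# and two leaves.
ternary_root = [[[], []], [], []]

# Full binary tree of height 2: every internal node has exactly 2 children.
full_binary = [[[], []], [[], []]]
```

For `ternary_root` the count is 4 leaves and 2 internal nodes; for `full_binary` it is 4 leaves and 3 internal nodes, so the bound m-1 is reached exactly when every internal node has exactly 2 children.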
Definition: Complete Binary Tree
A binary tree of height h is complete if:
- Levels 0, 1, …, h-1 are all full
- At level h, the leaf nodes are as far left as possible
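One way to test this definition is a level-order scan: in a complete binary tree, once a missing child is seen, no further node may appear. The sketch below assumes a hypothetical `BinNode` class. It also includes the height formula that follows from 2^h ≤ n ≤ 2^(h+1) - 1.

```python
from collections import deque


class BinNode:
    """Binary tree node (hypothetical helper for this sketch)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def is_complete(root):
    """Level-order scan; any node appearing after a gap breaks completeness."""
    queue = deque([root])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True
        else:
            if seen_gap:
                return False
            queue.append(node.left)
            queue.append(node.right)
    return True


def complete_tree_height(n):
    """Height (in edges) of a complete binary tree with n >= 1 nodes: floor(log2 n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n.bit_length() - 1
```

For example, a tree whose last level has its only leaf under the right child of the root fails the check, because the leaf is not as far left as possible.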
Binary Tree Traversal
Preorder Traversal
Visit the root first.
Postorder Traversal
Visit the root last.
Inorder Traversal
Visit the root in between the two subtrees.
Level-order Traversal
Visit the nodes level by level.
To summarize:
- Preorder: root, left subtree, right subtree
- Postorder: left subtree, right subtree, root
- Inorder: left subtree, root, right subtree
- Level-order: top to bottom, left to right
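The four orders can be sketched as below. The three depth-first traversals are written recursively, and level-order uses a queue. `BinNode` and the example tree are hypothetical, chosen only for illustration.

```python
from collections import deque


class BinNode:
    """Binary tree node (hypothetical helper for this sketch)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def preorder(node):
    """Root, left subtree, right subtree."""
    if node is None:
        return []
    return [node.key] + preorder(node.left) + preorder(node.right)


def inorder(node):
    """Left subtree, root, right subtree."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


def postorder(node):
    """Left subtree, right subtree, root."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.key]


def level_order(root):
    """Top to bottom, left to right, using a FIFO queue."""
    order = []
    queue = deque([root] if root is not None else [])
    while queue:
        node = queue.popleft()
        order.append(node.key)
        for child in (node.left, node.right):
            if child is not None:
                queue.append(child)
    return order


# Example tree: a has children b and c; b has children d and e.
tree = BinNode("a", BinNode("b", BinNode("d"), BinNode("e")), BinNode("c"))
```

On this tree the results are a b d e c (preorder), d b e a c (inorder), d e b c a (postorder), and a b c d e (level-order).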
Algebraic expression
We only consider fully parenthesized expressions with binary operators: +, -, ∗, /
Leaf nodes are variables or constants, internal nodes are operators.
- Inorder gives conventional algebraic notation
  • print “(” before visiting the left subtree
  • print “)” after visiting the right subtree
  • for the tree on the right: ((a+(b∗c))-(d/e))
- Preorder gives functional notation
  • print “(” and “)” as for inorder, and a comma after visiting the left subtree
  • for the tree on the right: -(+(a,∗(b,c)),/(d,e))
- Postorder gives the order in which the computation must be carried out on a stack.
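The three notations can be reproduced on the example expression ((a+(b∗c))-(d/e)). This is a sketch: the `Expr` class, the function names, and the variable values passed to the evaluator are assumptions, and ASCII `*` stands in for ∗.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


class Expr:
    """Expression tree node: leaves hold variables, internal nodes hold operators."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def infix(node):
    """Inorder with parentheses around every operator application."""
    if node.left is None:  # leaf
        return str(node.value)
    return "(" + infix(node.left) + node.value + infix(node.right) + ")"


def functional(node):
    """Preorder with parentheses and a comma after the left subtree."""
    if node.left is None:
        return str(node.value)
    return (node.value + "(" + functional(node.left) + ","
            + functional(node.right) + ")")


def postfix(node):
    """Postorder token list."""
    if node.left is None:
        return [node.value]
    return postfix(node.left) + postfix(node.right) + [node.value]


def evaluate_postfix(tokens, env):
    """Evaluate postfix tokens on a stack; env maps variable names to numbers."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()  # the right operand was pushed last
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(env[tok])
    return stack.pop()


tree = Expr("-",
            Expr("+", Expr("a"), Expr("*", Expr("b"), Expr("c"))),
            Expr("/", Expr("d"), Expr("e")))
```

With the assumed values a=1, b=2, c=3, d=8, e=4, the postfix sequence a b c * + d e / - evaluates to (1 + 6) - (8 / 4) = 5.0.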
Character Encoding
A character encoding maps each character to a number.
- fixed-length character encoding
- variable-length encoding
Huffman encoding
Huffman encoding is a type of variable-length encoding on the basis of the actual character frequencies in a given document.
Requirement: no character’s encoding can be the prefix of another character’s encoding (e.g., we cannot have the encodings ‘00’ and ‘001’ at the same time).
Huffman encoding is an optimal prefix code: among prefix codes that encode one character at a time, it minimizes the total length of the encoded document.
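The prefix requirement can be checked mechanically. After sorting the codewords lexicographically, any codeword that is a prefix of another is also a prefix of its immediate successor, so comparing neighbours is enough. The function name and the sample codes below are hypothetical.

```python
def is_prefix_free(codes):
    """True if no codeword in codes (a char -> bit-string map) prefixes another."""
    words = sorted(codes.values())
    return all(not later.startswith(earlier)
               for earlier, later in zip(words, words[1:]))


bad_codes = {"a": "00", "b": "001"}             # '00' is a prefix of '001'
good_codes = {"a": "0", "b": "10", "c": "11"}   # a valid variable-length code
```

A fixed-length encoding with distinct codewords is always prefix-free, so the check only matters for variable-length codes.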
At each step, merge the two trees whose roots (represented by their frequencies) are the smallest.
(1) Begin by reading through the text to determine the frequencies.
(2) Create a list of nodes containing (character, frequency) pairs for each character that appears in the text.
(3) Remove and merge the nodes with the two lowest frequencies, forming a new node as their parent.
(4) Add the parent to the list of nodes.
(5) Repeat steps (3) and (4) until there is only a single node in the list, which will be the root of the Huffman tree.
In a Huffman tree, the two nodes with the lowest frequencies always share the same parent and, ties aside, sit at the greatest depth.
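The construction steps can be sketched as below. Instead of scanning a plain list for the two lowest frequencies, this sketch swaps in a binary heap (`heapq`), which is a common choice but not something the notes prescribe. Leaves are single characters and internal nodes are `(left, right)` tuples. The function name and the 0/1 labelling of branches are assumptions.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(text):
    """Build a Huffman tree for text and return a char -> bit-string map."""
    freq = Counter(text)  # step (1): character frequencies
    if not freq:
        return {}
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    # Step (2): one node per character that appears in the text.
    heap = [(f, next(tiebreak), ch) for ch, f in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:  # a single distinct character still needs one bit
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        # Step (3): remove the two lowest-frequency nodes and merge them.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        # Step (4): add the parent back into the collection.
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))

    codes = {}

    def assign(tree, prefix):
        if isinstance(tree, tuple):  # internal node: recurse into both children
            assign(tree[0], prefix + "0")
            assign(tree[1], prefix + "1")
        else:  # leaf: record the path from the root as the codeword
            codes[tree] = prefix

    assign(heap[0][2], "")
    return codes
```

For the text "aaaabbc" (frequencies a: 4, b: 2, c: 1), b and c are merged first and end up as siblings at depth 2, while a gets a 1-bit code, for a total of 10 bits.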
Summary:
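Note the difference in time complexity between the two list implementations (array, linked list). Note that the key is what is used for lookups. key-value, key-value!
Note that a node is its own ancestor. Hence there is also the notion of a proper ancestor: u is a proper ancestor of v when u is an ancestor of v and u ≠ v.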