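Graph Theory
1. How is a graph stored in a computer?
G=<V,E> (G denotes the graph, V the set of vertices, E the set of edges)
A graph is a complex data structure, but it can be represented with basic data structures: an adjacency matrix or an adjacency list. An adjacency matrix is essentially a linear table, only two-dimensional. Once a graph is stored as a matrix, linear algebra can be applied to it, for example matrix multiplication (entry (i, j) of the k-th power of the adjacency matrix counts the walks of length k from i to j) or matrix inversion. An adjacency list keeps, for each vertex, a linked list headed by that vertex and containing the vertices adjacent to it.
In a dense graph the vertices have many relationships, so the linked lists of an adjacency list grow long and take a lot of space. The adjacency list is therefore better suited to sparse graphs, where its space cost is O(V+E).
An adjacency matrix, by contrast, takes n^2 space for any graph with n vertices, regardless of how many edges it has. Storing a sparse graph this way wastes most of that space, so the adjacency matrix is better suited to dense graphs.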
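A minimal sketch of the two representations, assuming an undirected graph whose vertices are numbered 0 to n-1. The function names and the 4-vertex example graph are hypothetical, and Python lists stand in for the linked lists described above.

```python
def to_adjacency_matrix(n, edges):
    """Build an n x n 0/1 matrix for an undirected graph."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
        matrix[v][u] = 1  # undirected: mirror the entry
    return matrix


def to_adjacency_list(n, edges):
    """Build one neighbor list per vertex for an undirected graph."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


# Hypothetical 4-vertex graph, used only for illustration.
edges = [(0, 1), (0, 2), (2, 3)]
matrix = to_adjacency_matrix(4, edges)  # always 4 * 4 = 16 cells
adj = to_adjacency_list(4, edges)       # 2 entries per edge = 6 entries
```

The matrix stores 16 cells for 3 edges, while the list stores 6 entries, which is the sparse-graph trade-off described above.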
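2. Directed graphs, undirected graphs, in-degree, out-degree, connected graphs, weights
If no edge of a graph has a direction, the graph is undirected, and each edge can be traversed both ways. If the edges have directions, the graph is directed, and an edge can be traversed only in the direction its arrow points.
Out-degree: the number of edges leaving a vertex.
In-degree: the number of edges entering a vertex.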
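The two degree definitions can be computed in one pass over the edge list. This is a small sketch, assuming a directed graph given as (from, to) pairs over vertices 0 to n-1; the function name and the example edges are hypothetical.

```python
def degrees(n, edges):
    """Return (in_degree, out_degree) lists for a directed graph."""
    in_deg = [0] * n
    out_deg = [0] * n
    for u, v in edges:
        out_deg[u] += 1  # the edge leaves u
        in_deg[v] += 1   # the edge enters v
    return in_deg, out_deg


# Hypothetical directed graph: 0->1, 0->2, 1->2, 2->3
directed_edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
in_deg, out_deg = degrees(4, directed_edges)
```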
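Connected graph: an undirected graph is connected if there is a path between every pair of vertices. A directed graph is strongly connected if, for every pair of vertices u and v, there is a directed path from u to v and one from v to u.
If every edge of a graph carries a weight, the graph is called a weighted graph.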
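3. Euler circuits, Hamiltonian graphs
The Euler circuit problem comes from the seven bridges of Königsberg: a river with two banks and two islands in it, all connected by seven bridges. Is there a route that crosses every bridge (edge) exactly once and returns to the starting point?
The Hamiltonian problem is similar, but concerns vertices rather than edges: is there a route that visits every vertex exactly once and returns to the starting point?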
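The Euler circuit question has a simple test that these notes do not state: by Euler's theorem, an undirected graph has such a circuit exactly when every vertex has even degree and all vertices that touch an edge are connected. The sketch below applies that test; the function name and the labels A to D for the two banks and two islands are assumptions made for illustration.

```python
from collections import defaultdict


def has_euler_circuit(edges):
    """True if a closed walk uses every edge of the undirected multigraph once."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    if not adj:
        return True  # no edges at all: trivially true
    # Condition 1: every vertex has even degree.
    if any(len(neighbors) % 2 for neighbors in adj.values()):
        return False
    # Condition 2: all vertices that touch an edge are connected.
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj)


# Seven bridges: A and B are the banks, C and D the islands (labels assumed).
konigsberg = [("A", "C"), ("A", "C"), ("B", "C"), ("B", "C"),
              ("A", "D"), ("B", "D"), ("C", "D")]
print(has_euler_circuit(konigsberg))  # False: every land mass has odd degree
```

No comparably simple test is known for the Hamiltonian question, so only the Euler case is sketched here.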
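4. To solve a real-world problem, first model it: decide which data structure holds the data conveniently and describes the problem abstractly. Once the model is in place, an algorithm can be written to solve the concrete problem. This way of thinking (data structure + algorithm) is especially important.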
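5. What are the well-known algorithms related to graphs?
1) Minimum spanning tree:
The minimum spanning tree problem is to find, in a connected weighted graph, a tree that contains every vertex of the graph and whose total edge weight is minimal. A real-world counterpart: given a graph in which each vertex is a city, how should fiber-optic lines be laid so that all cities are interconnected at the lowest cost?
Algorithm idea (greedy; this is Prim's algorithm): pick an arbitrary starting vertex S, find the adjacent vertex G connected to it by the lightest edge, and form the set {S,G}. Then, among all remaining vertices, find the one connected to {S,G} by the lightest edge and add it to the set. Repeat until every vertex is in the set.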
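The greedy step above can be sketched as follows. This is one possible implementation, assuming a connected undirected graph stored as a dict that maps each vertex to a list of (weight, neighbor) pairs; a binary heap is used to find the lightest edge leaving the current set, and the function name, the city labels, and the weights are all hypothetical.

```python
import heapq


def prim_mst(adj, start):
    """Return (total_weight, tree_edges) of a minimum spanning tree."""
    in_tree = {start}
    # Candidate edges leaving the tree, ordered by weight.
    heap = [(w, start, v) for w, v in adj[start]]
    heapq.heapify(heap)
    total = 0
    tree_edges = []
    while heap and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue  # both ends already connected; skip this edge
        in_tree.add(v)
        total += w
        tree_edges.append((u, v, w))
        for w2, x in adj[v]:
            if x not in in_tree:
                heapq.heappush(heap, (w2, v, x))
    return total, tree_edges


# Hypothetical cities and cable costs.
cities = {
    "A": [(1, "B"), (4, "C")],
    "B": [(1, "A"), (2, "C"), (5, "D")],
    "C": [(4, "A"), (2, "B"), (3, "D")],
    "D": [(5, "B"), (3, "C")],
}
total, tree_edges = prim_mst(cities, "A")
print(total, tree_edges)  # 6 [('A', 'B', 1), ('B', 'C', 2), ('C', 'D', 3)]
```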
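2) Huffman tree:
Consider classifying student scores: below 60 is a fail, 60~70 is a pass, 70~80 is fair, 80~90 is good, and 90~100 is excellent. A conventional program tests the ranges in order with an if / else if / else chain. That ignores an important fact: the score ranges do not occur equally often. If a Huffman tree is built from the weight (frequency) of each range, the most frequent ranges are tested first, and the expected number of comparisons per score drops.
Sending messages is similar: letters do not occur with equal frequency. Encoding them with a Huffman tree gives frequent letters shorter codes, so the total size of the data shrinks and less network bandwidth is used.
Algorithm idea: from the set of all nodes, select the two with the smallest weights and make them the children of a new binary tree node whose weight is the sum of theirs. Remove the two nodes from the set and add the new parent. Repeat, always selecting the two smallest weights in the set, until one tree remains; its weighted path length is then minimal. To produce codes, label left branches 0 and right branches 1 and read the labels along the path from the root to each leaf.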
https://www.siggraph.org/education/materials/HyperGraph/video/mpeg/mpegfaq/huffman_tutorial.html
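A minimal sketch of the construction, assuming the symbols are strings and their weights are given in a dict. The function name and the letter frequencies are hypothetical; internal tree nodes are represented as (left, right) tuples, and a counter breaks ties between equal weights so the heap never has to compare nodes.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """Map each symbol in freq (symbol -> weight) to its bit string."""
    tiebreak = count()
    heap = [(w, next(tiebreak), sym) for sym, w in freq.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}  # a single symbol still needs one bit
    while len(heap) > 1:
        # Merge the two lightest nodes under a new parent.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: 0 = left, 1 = right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


# Hypothetical letter frequencies (per 100 letters).
freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
codes = huffman_codes(freq)
weighted_length = sum(freq[s] * len(codes[s]) for s in freq)
print(weighted_length)  # 224 bits, versus 300 with a fixed 3-bit code
```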
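3) Dijkstra's algorithm:
A single-source shortest-path algorithm: it finds the shortest paths from one source vertex to the other vertices of a graph whose edge weights are non-negative. A real-world application: a DiDi driver following navigation picks the shortest route to take you to your destination.
Optimal substructure of shortest paths
The property: if P(i,j)={Vi....Vk..Vs...Vj} is a shortest path from vertex i to vertex j, and k and s are intermediate vertices on it, then the sub-path P(k,s) must be a shortest path from k to s. The proof follows.
Assume P(i,j)={Vi....Vk..Vs...Vj} is a shortest path from i to j, so that P(i,j)=P(i,k)+P(k,s)+P(s,j). Suppose P(k,s) is not a shortest path from k to s. Then there is a shorter path P'(k,s) from k to s, and P'(i,j)=P(i,k)+P'(k,s)+P(s,j)<P(i,j). This contradicts the assumption that P(i,j) is a shortest path from i to j, so the property holds.
Algorithm idea: greedy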
Let the node at which we are starting be called the initial node. Let the distance of node Y be the distance from the initial node to Y. Dijkstra's algorithm will assign some initial distance values and will try to improve them step by step.
(1) Assign to every node a tentative distance value: set it to zero for our initial node and to infinity for all other nodes.
(2) Set the initial node as current. Mark all other nodes unvisited. Create a set of all the unvisited nodes called the unvisited set.
(3) For the current node, consider all of its neighbors and calculate their tentative distances. Compare the newly calculated tentative distance to the current assigned value and assign the smaller one. For example, if the current node A is marked with a distance of 6, and the edge connecting it with a neighbor B has length 2, then the distance to B (through A) will be 6 + 2 = 8. If B was previously marked with a distance greater than 8 then change it to 8. Otherwise, keep the current value.
(4) When we are done considering all of the neighbors of the current node, mark the current node as visited and remove it from the unvisited set. A visited node will never be checked again.
(5) If the destination node has been marked visited or if the smallest tentative distance among the nodes in the unvisited set is infinity, then stop. The algorithm has finished.
(6) Otherwise, select the unvisited node that is marked with the smallest tentative distance, set it as the new "current node", and go back to step (3).
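In short: set the distance of the source to 0 and the distance of every other vertex (its distance from the source) to infinity, and let Q be the set of all vertices in the graph. Repeatedly take the vertex u in Q with the smallest dist[u] and remove it from Q; the first such vertex is the source itself. For each neighbor v of u still in Q, update dist[v] to dist[u] + w(u,v) if that is smaller, where w(u,v) is the weight of the edge from u to v. Continue until Q is empty or the smallest remaining distance is infinity.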
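The procedure can be sketched as follows, assuming non-negative edge weights and a graph stored as a dict that maps each vertex to a list of (neighbor, weight) pairs. Instead of scanning the unvisited set for the smallest tentative distance, this version keeps candidates in a binary heap, a common substitution; the function name and the example graph are hypothetical.

```python
import heapq


def dijkstra(adj, source):
    """Return a dict of shortest distances from source to every vertex."""
    dist = {v: float("inf") for v in adj}
    dist[source] = 0
    heap = [(0, source)]  # (tentative distance, vertex)
    visited = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale entry: u was already finalized
        visited.add(u)
        for v, w in adj[u]:
            candidate = d + w
            if candidate < dist[v]:  # found a shorter route to v through u
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


# Hypothetical directed road network.
roads = {
    "S": [("A", 4), ("B", 1)],
    "A": [("C", 1)],
    "B": [("A", 2), ("C", 5)],
    "C": [],
}
dist = dijkstra(roads, "S")
print(dist)  # {'S': 0, 'A': 3, 'B': 1, 'C': 4}
```

Here the direct edge S to A costs 4, but the route through B costs 1 + 2 = 3, so the tentative distance of A is lowered when B becomes the current node.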
Animation:
https://upload.wikimedia.org/wikipedia/commons/5/57/Dijkstra_Animation.gif
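4) Depth-first and breadth-first graph traversal
Depth-first search (DFS): similar to a preorder traversal of a tree (visit the root, then recursively traverse the left subtree, then the right subtree). It can be implemented with a stack.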
The DFS algorithm is a recursive algorithm that uses the idea of backtracking. It involves exhaustive searches of all the nodes by going ahead, if possible, else by backtracking.
Here, the word backtrack means that when you are moving forward and there are no more nodes along the current path, you move backwards on the same path to find nodes to traverse. All the nodes will be visited on the current path till all the unvisited nodes have been traversed, after which the next path will be selected.
This recursive nature of DFS can be implemented using stacks. The basic idea is as follows:
- Pick a starting node and push all its adjacent nodes into a stack.
- Pop a node from the stack to select the next node to visit and push all its adjacent nodes into the stack.
- Repeat this process until the stack is empty.
However, ensure that the nodes that are visited are marked. This will prevent you from visiting the same node more than once. If you do not mark the nodes that are visited and you visit the same node more than once, you may end up in an infinite loop.
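Non-recursive implementation
(1) Initialize stack S; set visited[i]=0 for every vertex i; // mark all vertices unvisited
(2) Visit vertex v; visited[v]=1; push v onto S
(3) while (S is not empty)
        x = top element of S (do not pop);
        if (x has an unvisited neighbor w)
            visit w; visited[w]=1;
            push w onto S; // only w is pushed; the loop then continues from w
        else
            pop x from S;
(If only there were a program that could turn natural-language pseudocode directly into a programming language.)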
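The non-recursive pseudocode translates almost line for line into Python. This is a sketch, assuming the graph is a dict that maps each vertex to a list of its neighbors; the function name and the example graph are hypothetical.

```python
def dfs_iterative(adj, start):
    """Return the vertices reachable from start in depth-first order."""
    visited = {start}
    order = [start]  # "visiting" a vertex means appending it here
    stack = [start]
    while stack:
        x = stack[-1]  # peek at the top without popping
        for w in adj[x]:
            if w not in visited:
                visited.add(w)
                order.append(w)
                stack.append(w)  # go deeper from w on the next iteration
                break
        else:
            stack.pop()  # no unvisited neighbor left: backtrack
    return order


# Hypothetical directed graph: 0->1, 0->2, 1->3, 2->3
graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(dfs_iterative(graph, 0))  # [0, 1, 3, 2]
```

Rescanning the neighbors of x from the beginning after each backtrack keeps the code close to the pseudocode; remembering a per-vertex position in the neighbor list would avoid the repeated scans.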
Breadth-first search (BFS): similar to a level-order traversal of a tree (level by level from the top down). It can be implemented with a queue.
There are many ways to traverse graphs. BFS is the most commonly used approach. BFS is a traversing algorithm where you start from a source node and traverse the graph layerwise, first exploring the neighbour nodes (nodes which are directly connected to the source node). You must then move towards the next-level neighbour nodes. As the name BFS suggests, you are required to traverse the graph breadthwise as follows:
- First move horizontally and visit all the nodes of the current layer
- Move to the next layer
Consider a graph in which node 0 is connected to nodes 1, 2, and 3, node 1 to nodes 4 and 5, node 2 to nodes 6 and 7, and node 3 to node 7.
A graph can contain cycles, which may bring you to the same node again while traversing the graph. To avoid processing the same node again, use a boolean array that marks each node once it has been processed. While visiting the nodes in a layer, store them so that their child nodes can later be traversed in the same order. In this graph, start traversing from 0 and visit its child nodes 1, 2, and 3. Store them in the order in which they are visited. This lets you visit the child nodes of 1 first (4 and 5), then those of 2 (6 and 7), and then those of 3 (7, which is already marked by then).
To make this easy, use a queue: mark each node as visited when it is added to the queue, and remove a node from the queue to process its neighbours (the vertices directly connected to it). The queue is First In First Out (FIFO), so nodes are processed in the order in which they were inserted into the queue: the node inserted first is visited first, and so on.
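(1) Initialize queue Q; set visited[i]=0 for every vertex i;
(2) Visit vertex v; visited[v]=1; enqueue v in Q;
(3) while (Q is not empty)
        v = dequeue the head of Q; // all unvisited neighbors of v are enqueued below
        w = first neighbor of v;
        while (w exists)
            if (w is unvisited)
                visit w;
                visited[w]=1;
                enqueue w in Q;
            w = next neighbor of v;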
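The queue-based pseudocode can be sketched in Python with collections.deque. The function name is hypothetical, and the example graph is the one described in the BFS text (0 connected to 1, 2, 3; 1 to 4 and 5; 2 to 6 and 7; 3 to 7), assumed here to be undirected.

```python
from collections import deque


def bfs(adj, start):
    """Return the vertices reachable from start in breadth-first order."""
    visited = {start}
    order = [start]
    queue = deque([start])
    while queue:
        v = queue.popleft()  # FIFO: the oldest queued vertex comes first
        for w in adj[v]:
            if w not in visited:
                visited.add(w)  # mark when enqueued so w is never queued twice
                order.append(w)
                queue.append(w)
    return order


layers = {
    0: [1, 2, 3],
    1: [0, 4, 5],
    2: [0, 6, 7],
    3: [0, 7],
    4: [1],
    5: [1],
    6: [2],
    7: [2, 3],
}
print(bfs(layers, 0))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

Node 7 is reachable from both 2 and 3, but it is marked when 2 is processed, so it appears only once in the result.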
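5) Topological sort:
Topological sorting applies to directed acyclic graphs. It arranges the vertices in a linear order such that, if the graph has an edge from n to m, n appears before m in the sequence. There may be more than one valid order; that depends on whether the edges define a partial order or a total order (a total order gives a unique result).
Real-world application: such a graph mainly describes dependencies between nodes. Course selection is an example: to take Artificial Intelligence you must first take Data Structures, and before Data Structures you must take Discrete Mathematics. A partial order relates only some pairs of nodes (for the others, the relative order does not matter). A total order relates every pair of nodes, so there is only one sorted result.
Algorithm idea (Kahn's algorithm):
First find all nodes with in-degree 0 and put them in a set S. Then repeatedly remove a node n from S, append it to the list L (the final sorted sequence), and delete every edge leaving n; any node whose in-degree drops to 0 as a result is added to S. When S is empty, return L if no edges remain; otherwise the graph has a cycle, so return an error.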
L ← Empty list that will contain the sorted elements
S ← Set of all nodes with no incoming edges
while S is non-empty do
    remove a node n from S
    add n to L
    for each node m with an edge e from n to m do
        remove edge e from the graph
        if m has no other incoming edges then
            insert m into S
if graph has edges then
    return error   (graph has at least one cycle)
else
    return L   (a topologically sorted order)
The time complexity of this algorithm is O(V+E).
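Kahn's algorithm can be sketched in Python as follows. Rather than deleting edges from the graph, this version decrements an in-degree counter per node, which has the same effect and keeps the O(V+E) bound. The function name is hypothetical, and the example uses the course prerequisites mentioned above.

```python
from collections import deque


def topological_sort(nodes, edges):
    """Return a topological order of nodes; raise ValueError on a cycle."""
    adj = {n: [] for n in nodes}
    in_deg = {n: 0 for n in nodes}
    for n, m in edges:  # edge n -> m: n must come before m
        adj[n].append(m)
        in_deg[m] += 1
    ready = deque(n for n in nodes if in_deg[n] == 0)  # the set S
    order = []                                         # the list L
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in adj[n]:
            in_deg[m] -= 1  # stands in for "remove edge e from the graph"
            if in_deg[m] == 0:
                ready.append(m)
    if len(order) != len(nodes):
        raise ValueError("graph has at least one cycle")
    return order


courses = ["Artificial Intelligence", "Data Structures", "Discrete Mathematics"]
prerequisites = [
    ("Discrete Mathematics", "Data Structures"),
    ("Data Structures", "Artificial Intelligence"),
]
print(topological_sort(courses, prerequisites))
# ['Discrete Mathematics', 'Data Structures', 'Artificial Intelligence']
```

Because the prerequisites here form a chain (a total order), this is the only valid result.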