Algorithms (Part 42)

Shared by: Tran Anh Phuong | File type: PDF | Pages: 10

CONNECTIVITY 403

function fastfind(x, y: integer; union: boolean): boolean;
var i, j, t: integer;
begin
i:=x; while dad[i]>0 do i:=dad[i];
j:=y; while dad[j]>0 do j:=dad[j];
while dad[x]>0 do
  begin t:=x; x:=dad[x]; dad[t]:=i end;
while dad[y]>0 do
  begin t:=y; y:=dad[y]; dad[t]:=j end;
if union and (i<>j) then
  if dad[j]
404 CHAPTER 30

        A  B  C  D  E  F  G  H  I  J  K  L  M
AG:     1  0  0  0  0  0  A  0  0  0  0  0  0
AB:     2  A  0  0  0  0  A  0  0  0  0  0  0
AC:     3  A  A  0  0  0  A  0  0  0  0  0  0
LM:     3  A  A  0  0  0  A  0  0  0  0  1  L
JM:     3  A  A  0  0  0  A  0  0  L  0  2  L
JL:     3  A  A  0  0  0  A  0  0  L  0  2  L  *
JK:     3  A  A  0  0  0  A  0  0  L  L  3  L
ED:     3  A  A  E  1  0  A  0  0  L  L  3  L
FD:     3  A  A  E  2  E  A  0  0  L  L  3  L
HI:     3  A  A  E  2  E  A  1  H  L  L  3  L
FE:     3  A  A  E  2  E  A  1  H  L  L  3  L  *
AF:     6  A  A  E  A  E  A  1  H  L  L  3  L
GE:     6  A  A  E  A  E  A  1  H  L  L  3  L  *
GC:     6  A  A  E  A  E  A  1  H  L  L  3  L  *
GH:     8  A  A  E  A  E  A  A  H  L  L  3  L
JG:    12  A  A  E  A  E  A  A  H  L  L  A  L
LG:    12  A  A  E  A  E  A  A  H  L  L  A  L  *

For clarity in this table, each positive entry i is replaced by the ith letter of the alphabet (the name of the father), and each negative entry is complemented to give a positive integer (the weight of the tree).

Several other techniques have been developed to avoid degenerate structures. For example, path compression has the disadvantage that it requires another pass up through the tree. Another technique, called halving, is to make each node point to its granddad on the way up the tree. Still another technique, splitting, is like halving, but is applied only to every other node on the search path. Either of these can be used in combination with weight balancing or with height balancing, which is similar but uses tree height instead of tree size to decide which way to merge trees.

How is one to choose from among all these methods? And exactly how “flat” are the trees produced? Analysis for this problem is quite difficult because the performance depends not only on the V and E parameters, but also on the number of find operations and, what’s worse, on the order in which the union and find operations appear. Unlike sorting, where the actual files that appear in practice are quite often close to “random,” it’s hard to see how to model graphs and request patterns that might appear in practice. For this reason, algorithms which do well in the worst case are normally preferred for union-find (and other graph algorithms), though this may be an overly conservative approach.
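The combination of weight balancing with a one-pass find that links nodes to their granddads can be sketched as follows. This is a minimal Python illustration, not the book's Pascal: the names `make_sets`, `find`, and `union` are hypothetical, vertices are numbered from 0, and a root stores its tree size negated (so a singleton holds -1 rather than the 0 used in the table above).

```python
def make_sets(n):
    """Return a parent array for n singleton sets.

    Assumption of this sketch: a negative entry -s marks a root
    whose tree currently holds s nodes.
    """
    return [-1] * n


def find(parent, x):
    """Return the root of x in one pass up the tree.

    Each node visited that has a granddad is repointed at that
    granddad, so no second pass over the path is needed.
    """
    while parent[x] >= 0:
        p = parent[x]
        if parent[p] >= 0:          # p is not a root, so x has a granddad
            parent[x] = parent[p]
        x = p
    return x


def union(parent, x, y):
    """Merge the sets holding x and y.

    Returns False when x and y were already connected, True otherwise.
    """
    i, j = find(parent, x), find(parent, y)
    if i == j:
        return False
    if parent[i] > parent[j]:       # sizes are negated: i is the smaller tree
        i, j = j, i
    parent[i] += parent[j]          # larger root i absorbs the size of j
    parent[j] = i                   # smaller tree hangs below the larger one
    return True
```

Feeding the edge sequence from the table (AG, AB, AC, LM, ...) to `union`, with letters mapped to 0 through 12, reports the starred edges as already connected; the intermediate array contents differ from the table because the root encoding and tie-breaking here are not the book's.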
Even if only the worst case is being considered, the analysis of union-find algorithms is extremely complex and intricate. This can be seen even from the nature of the results, which do give us clear indications of how the algorithms will perform in a practical situation. If either weight balancing or height balancing is used in combination with either path compression, halving, or splitting, then the total number of operations required to build up a structure with E edges is proportional to Eα(E), where α(E) is a function that is so slowly growing that α(E) < 4 unless E is so large that taking lg E, then taking lg of the result, then taking lg of that result, and continuing 16 times still gives a number bigger than 1. This is a stunningly large number; for all practical purposes, it is safe to assume that the average amount of time to execute each union and find operation is constant. This result is due to R. E. Tarjan, who further showed that no algorithm for this problem (from a certain general class) can do better than Eα(E), so that this function is intrinsic to the problem.

An important practical application of union-find algorithms is that they can be used to determine whether a graph with V vertices and E edges is connected in space proportional to V (and almost linear time). This is an advantage over depth-first search in some situations: here we never need to store the edges. Thus connectivity for a graph with thousands of vertices and millions of edges can be determined with one quick pass through the edges.
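The one-pass connectivity test described above might look like the following Python sketch. The function names and the 0-based vertex numbering are assumptions made for illustration; the edges can come from any iterable (a file reader, a generator) and are never stored, so the extra space is two arrays of length V.

```python
def _root(parent, x):
    """Find the root of x, then compress the path in a second pass."""
    root = x
    while parent[root] != root:
        root = parent[root]
    while parent[x] != root:
        # Right-hand side is evaluated first, so x advances to its old parent.
        parent[x], x = root, parent[x]
    return root


def is_connected(num_vertices, edge_stream):
    """Return True if the edges join all vertices into one component.

    edge_stream yields (u, v) pairs with 0 <= u, v < num_vertices and is
    consumed once; reading stops early when one component remains.
    """
    parent = list(range(num_vertices))   # every vertex starts as its own root
    size = [1] * num_vertices            # tree sizes, used for weight balancing
    components = num_vertices
    for u, v in edge_stream:
        ru, rv = _root(parent, u), _root(parent, v)
        if ru == rv:
            continue                     # edge lies inside an existing component
        if size[ru] < size[rv]:
            ru, rv = rv, ru              # make ru the root of the larger tree
        parent[rv] = ru
        size[ru] += size[rv]
        components -= 1
        if components == 1:
            return True
    return components <= 1
```

For example, `is_connected(4, iter([(0, 1), (2, 3)]))` is false, while adding the edge `(1, 2)` to the stream makes it true.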
Exercises

1. Give the articulation points and the biconnected components of the graph formed by deleting GJ and adding IK to our sample graph.
2. Write a program to print out the biconnected components of a graph.
3. Give adjacency lists for one graph where breadth-first search would find a cycle before depth-first search would, and another graph where depth-first search would find the cycle first.
4. Draw the search tree that results if, in the depth-first search, we ignore nodes already on the fringe (as in breadth-first search).
5. Draw the search tree that results if, in the breadth-first search, we change the priority of nodes already on the fringe (as in depth-first search).
6. Draw the union-find forest constructed for the example in the text, but assuming that find is changed to set a[i]=j rather than a[j]=i.
7. Solve the previous problem, assuming further that path compression is used.
8. Draw the union-find forests constructed for the edges AB BC CD DE EF . . . YZ, assuming first that weight balancing without path compression is used, then that path compression without weight balancing is used, then that both are used.
9. Implement the union-find variants described in the text, and empirically determine their comparative performance for 1000 union operations with both arguments random integers between 1 and 100.
10. Write a program to generate a random connected graph on V vertices by generating random pairs of integers between 1 and V. Estimate how many edges are needed to produce a connected graph as a function of V.
31. Weighted Graphs

It is often necessary to model practical problems using graphs in which weights or costs are associated with each edge. In an airline map where edges represent flight routes, these weights might represent distances or fares. In an electric circuit where edges represent wires, the length or cost of the wire is a natural weight to use. In a job-scheduling chart, weights could represent time or cost of performing tasks or of waiting for tasks to be performed.

Questions entailing minimizing costs naturally arise for such situations. In this chapter, we’ll examine algorithms for two such problems in detail: “find the lowest-cost way to connect all of the points,” and “find the lowest-cost path between two given points.” The first, which is obviously useful for graphs representing something like an electric circuit, is called the minimum spanning tree problem; the second, which is obviously useful for graphs representing something like an airline route map, is called the shortest path problem. These problems are representative of a variety of problems that arise on weighted graphs.

Our algorithms involve searching through the graph, and sometimes our intuition is supported by thinking of the weights as distances: we speak of “the closest vertex to x,” etc. In fact, this bias is built into the nomenclature for the shortest path problem. Despite this, it is important to remember that the weights need not be proportional to any distance at all; they might represent time or cost or something else entirely. When the weights actually do represent distances, other algorithms may be appropriate. This issue is discussed in further detail at the end of the chapter.

A typical weighted undirected graph is diagramed below, with edges comprising a minimum spanning tree drawn with double lines.
Note that the shortest paths in the graph do not necessarily use edges of the minimum spanning tree: for example, the shortest path from vertex A to vertex G is AFEG.
It is obvious how to represent weighted graphs: in the adjacency matrix representation, the matrix can contain edge weights rather than boolean values, and in the adjacency structure representation, each list element (which represents an edge) can contain a weight.

We’ll start by assuming that all of the weights are positive. Some of the algorithms can be adapted to handle negative weights, but they become significantly more complicated. In other cases, negative weights change the nature of the problem in an essential way, and require far more sophisticated algorithms than those considered here. For an example of the type of difficulty that can arise, suppose that we have a situation where the sum of the weights of the edges around a cycle is negative: an infinitely short path could be generated by simply spinning around the cycle.

Minimum Spanning Tree

A minimum spanning tree of a weighted graph is a collection of edges that connects all the vertices such that the sum of the weights of the edges is at least as small as the sum of the weights of any other collection of edges that connects all the vertices. The minimum spanning tree need not be unique: for example, the following diagram shows three other minimum spanning trees for our sample graph.
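The two weighted-graph representations mentioned at the start of this section can be sketched in Python as follows. The helper names and the tiny four-vertex edge list in the usage note are made up for illustration (they are not the book's sample graph), and `None` is an assumed marker for "no edge" in the matrix.

```python
def to_matrix(num_vertices, edges, absent=None):
    """Build a symmetric adjacency matrix holding edge weights.

    edges is a list of (u, v, weight) triples for an undirected graph;
    cells for missing edges hold the `absent` marker.
    """
    matrix = [[absent] * num_vertices for _ in range(num_vertices)]
    for u, v, weight in edges:
        matrix[u][v] = weight
        matrix[v][u] = weight       # undirected: store both directions
    return matrix


def to_adjacency_lists(num_vertices, edges):
    """Build adjacency lists whose elements carry the edge weight."""
    adj = [[] for _ in range(num_vertices)]
    for u, v, weight in edges:
        adj[u].append((v, weight))
        adj[v].append((u, weight))  # each edge appears on both lists
    return adj
```

With `edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4)]`, the matrix uses space proportional to V² whatever the number of edges, while the lists use space proportional to V + E, which is why the choice matters for dense versus sparse graphs later in the chapter.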
It’s easy to prove that the “collection of edges” referred to in the definition above must form a spanning tree: if there’s any cycle, some edge in the cycle can be deleted to give a collection of edges which still connects the vertices but has a smaller weight.

We’ve seen in previous chapters that many graph traversal procedures compute a spanning tree for the graph. How can we arrange things for a weighted graph so that the tree computed is the one with the lowest total weight? The answer is simple: always visit next the vertex which can be connected to the tree using the edge of lowest weight. The following sequence of diagrams illustrates the sequence in which the edges are visited when this strategy is used for our example graph.

The implementation of this strategy is a trivial application of the priority graph search procedure in the previous chapter: we simply add a weight field to the edge record (and modify the input code to read in weights as well), then use t^.weight for priority in that program. Thus we always visit next the vertex in the fringe which is closest to the tree. The traversal is diagramed as above for comparison with a completely different method that we’ll examine below; we can also redraw the graph in our standard search tree format:
This method is based on the following fundamental property of minimum spanning trees: “Given any division of the vertices of a graph into two sets, the minimum spanning tree contains the shortest of the edges connecting a vertex in one of the sets to a vertex in the other set.” For example, if we divide the vertices into the sets ABCD and EFG in our sample graph, this says that DF must be in any minimum spanning tree.

This is easy to prove by contradiction. Call the shortest edge connecting the two sets s, and assume that s is not in the minimum spanning tree. Then consider the graph formed by adding s to the purported minimum spanning tree. This graph has a cycle; furthermore, that cycle must have some other edge besides s connecting the two sets. Deleting this edge and adding s gives a shorter spanning tree, contradicting the assumption that s is not in the minimum spanning tree.

When we use priority-first searching, the two sets of nodes in question are the visited nodes and the unvisited ones. At each step, we pick the shortest edge from a visited node to a fringe node (there are no edges from visited nodes to unseen nodes). By the property above, every edge that is picked is on the minimum spanning tree.

As described in the previous chapter, the priority graph traversal algorithm has a worst-case running time proportional to (E + V) log V, though a different implementation of the priority queue can give a V² algorithm, which is appropriate for dense graphs. Later in this chapter, we’ll examine this implementation of the priority graph traversal for dense graphs in full detail. For minimum spanning trees, this reduces to a method discovered by R. Prim in 1957 (and independently by E. Dijkstra soon thereafter).
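The strategy of always visiting next the fringe vertex closest to the tree can be sketched in Python as shown here. This is an illustration under stated assumptions rather than the book's sparsepfs program: it uses the standard library `heapq` module, leaves stale fringe entries in the heap and skips them when they surface instead of changing priorities in place, and expects adjacency lists of `(neighbor, weight)` pairs with vertices numbered from 0. The name `mst_priority_first` is hypothetical.

```python
import heapq


def mst_priority_first(adj, start=0):
    """Grow a minimum spanning tree outward from `start`.

    adj[v] is a list of (neighbor, weight) pairs. Returns
    (total_weight, tree_edges) for the component containing start,
    where each tree edge is (tree_vertex, new_vertex, weight).
    """
    visited = [False] * len(adj)
    tree_edges = []
    total = 0
    # Fringe entries: (weight, vertex, tree vertex it would attach to).
    fringe = [(0, start, start)]
    while fringe:
        weight, v, attach = heapq.heappop(fringe)
        if visited[v]:
            continue                  # stale entry: v joined the tree earlier
        visited[v] = True
        if v != attach:               # the start vertex brings no edge with it
            tree_edges.append((attach, v, weight))
            total += weight
        for nxt, w in adj[v]:
            if not visited[nxt]:
                heapq.heappush(fringe, (w, nxt, v))
    return total, tree_edges
```

Because the heap may hold up to one entry per edge, this variant takes time proportional to E log E rather than (E + V) log V; the difference is only a constant factor in the logarithm, since E is at most V².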
Though the methods are the same in essence (just the graph representation and implementation of priority queues differ), we’ll refer to the sparsepfs program of the previous chapter with priority replaced by t^.weight as the “priority-first search solution” to the minimum spanning tree problem and the adjacency matrix version given later in this chapter (for dense graphs) as “Prim’s algorithm.” Note that Prim’s algorithm takes time proportional to V² even for sparse graphs (a factor of about V²/(E log V) slower than the priority-first search solution), and that the priority-first search solution is a factor of log V slower than Prim’s algorithm for dense graphs.

A completely different approach to finding the minimum spanning tree is to simply add edges one at a time, at each step using the shortest edge that does not form a cycle. This algorithm gradually builds up the tree one edge at a time from disconnected components, as illustrated in the following sequence of diagrams for our sample graph:

The correctness of this algorithm also follows from the general property of minimum spanning trees that is proved above.

The code for this method can be pieced together from programs that we’ve already seen. A priority queue is obviously the data structure to use to consider the edges in order of their weight, and the job of testing for cycles can obviously be done with union-find structures. The appropriate data structure to use for the graph is simply an array edges with one entry for each edge. The indirect priority queue procedures pqconstruct and pqremove from Chapter 11 can be used to maintain the priority queue, using the weight fields in the edges array for priorities. Also, the program uses the findinit and fastfind procedures from Chapter 30. The program simply prints out the edges which comprise the spanning tree; with slightly more work a dad array or other representation could be computed:
program kruskal(input, output);
const maxV=50; maxE=2500;
type edge=record x, y, weight: integer end;
var i, j, m, x, y, V, E: integer;
    c, d: char;   { vertex names as read from the input }
    edges: array [0..maxE] of edge;
begin
readln(V, E);
for j:=1 to E do
  begin
  readln(c, d, edges[j].weight);
  edges[j].x:=index(c);
  edges[j].y:=index(d);
  end;
findinit; pqconstruct; i:=0;
repeat
  m:=pqremove; x:=edges[m].x; y:=edges[m].y;
  { fastfind returns false when x and y were not yet connected, }
  { so the edge joins two components and belongs in the tree }
  if not fastfind(x, y, true) then
    begin
    writeln(name(x), name(y), edges[m].weight);
    i:=i+1
    end
until i=V-1;
end.

The running time of this program is dominated by the time spent processing edges in the priority queue. Suppose that the graph consists of two clusters of vertices all connected together by very short edges, and only one edge which is very long connecting the two clusters. Then the longest edge in the graph is in the minimum spanning tree, but it will be the last edge out of the priority queue. This shows that the running time could be proportional to E log E in the worst case, although we might expect it to be much smaller for typical graphs (though it always takes time proportional to E to build the priority queue initially).

An alternate implementation of the same strategy is to sort the edges by weight initially, then simply process them in order. Also, the cycle testing can be done in time proportional to E log E with a much simpler strategy than union-find, to give a minimum spanning tree algorithm that always takes E log E steps. This method was proposed by J. Kruskal in 1956, even earlier than Prim’s algorithm. We’ll refer to the modernized version above, which uses priority queues and union-find structures, as “Kruskal’s algorithm.”

The performance characteristics of these three methods indicate that the