Min Cost Flows: Kruskal, MooreBellmann, Floyd, Dijkstra Algorithms

JavaTools provides several network flow optimization functions from the realm of combinatorial optimization. At a glance:

JDijkstra[] to compute the set of all shortest paths from a source node to all other nodes.
JFloyd[] to compute the optimal all-pairs shortest path solution (arbitrary arc costs, including negative)
JMooreBellmann[] to compute all shortest paths from a source node to all other nodes (arbitrary arc costs, including negative)
JKruskal[] to compute the minimum spanning tree of a connected undirected network

All of these methods by far outperform their Mathematica equivalents, to the point of being so much faster that they allow for the solution of problem sizes that the built-in functions cannot cope with.

JDijkstra[]

The function JDijkstra[] computes the set of all shortest paths from a source node s to all other nodes in a cycle-free directed network with non-negative arc costs. JDijkstra[] uses the collections from the Java collections framework as well as the Google Collections (now part of Google Guava) internally as sets of nodes and arcs and their dependencies can be easily represented through these data structures.

Although the Dijkstra algorithm is not the fastest algorithm to determine all shortest paths from a source node to all other nodes in the network, the JavaTools implementation with the above-mentioned collections is MUCH faster than the Dijkstra algorithm implementation in the Mathematica Combinatorica package, as the latter is entirely written with symbolic Mathematica code and doesn't execute in a compiled environment like the Java virtual machine. In particular on large problems with hundreds of nodes and arcs the performance difference is substantial. In fact, the large examples at the end of this page couldn't be solved with Mathematica's Dijkstra implementation at all.

The inputs to the function JDijkstra[] are a sparse array defining the arc costs for every node pair having an arc, and an integer indicating which node is the source node. Spare arrays are particularly memory-efficient on large graphs where most node pairs do not have an arc.

The following example uses the Dijkstra algorithm on a very simple problem with 6 nodes and 6 arcs:

moorebellmann_1.gif

This computes all shortest paths from source node 1:

moorebellmann_2.gif

moorebellmann_3.gif

This is to be interpreted as follows: The first list is a list of the predecessor nodes. The second list is a list of the shortest path distances from the source node to every other node. The predecessor  lists tells us that in the optimal solution the predecessor of node 1 (the source node) is Null (meaning, it has none), the predecssor of node 2 is node 1 (the source node), the predecssor of node 3 is node 1, the predecssor of node 4 is node 2, asf. The second list tells us that in the optimal solution the shortest path distance for node 1 is 0, for node 2 is 1, for node 3 is 2, for node 4 is 4, asf.

Note how fast the optimal solution is computed with JDijkstra[] from the JavaTools:

moorebellmann_4.gif

moorebellmann_5.gif

The arc costs matrix, in which the cost for the arc (i,j) is shown in the element in row i and column j, looks as follows (0 means the nodes have no connecting arc):

moorebellmann_6.gif

moorebellmann_7.gif

We can see that the arc between nodes 1 and 2 is included in the shortest paths for nodes 2 and 4. Therefore, if we increase its value (=costs), that will be reflected in the costs of the optimal solution, and we can also see that due to its high cost, arc (1,2) is NOT included in the optimal solution for node 6. While the cost for the latter was 7.4 previously, going "over" arc (1,2), it is now 9.3, going "over" arcs (1,3 ), (3,5), and (5,6).

moorebellmann_8.gif

moorebellmann_9.gif

moorebellmann_10.gif

moorebellmann_11.gif

moorebellmann_12.gif

moorebellmann_13.gif

Here is a slightly more complicated example:

moorebellmann_14.gif

moorebellmann_15.gif

moorebellmann_16.gif

moorebellmann_17.gif

moorebellmann_18.gif

moorebellmann_19.gif

moorebellmann_20.gif

JDijkstra[] is very fast. It finishes in less than 2 milliseconds.

moorebellmann_21.gif

moorebellmann_22.gif

The following example reads a file from the MatrixMart example data, uses Abs to ensure that all arc costs are positive, and subtracts the main diagonal to ensure that no node has an arc to itself. Then we compute all shortest paths with node 1 taken to be the source node. JDijkstra[] finiahes in 6 seconds. Dijkstra{} in the Mathematica Combinatorica package can not compute this problem at all due to its size.

moorebellmann_23.gif

moorebellmann_24.gif

moorebellmann_25.gif

moorebellmann_26.gif

moorebellmann_27.gif

The runtime complexity of the Dijkstra algorithm is O(moorebellmann_28.gif).

For sparse networks with only non-negative arc costs the D'Esopo Pape variant of the Moore Bellmann algorithm is more efficient (O(min(nmC,mmoorebellmann_29.gif))), thus possibily linear) than the Dijkstra algorithm, see JMooreBellmann[] below. While JDijkstra[] computed the above result in 6 seconds, JMooreBellmann[] computes it in 78 milliseconds due to the extremely efficient double-edged queue used internally.

JFloyd[]

The JavaTools function JFloyd[] computes the optimal all-pairs shortest path solution, permitting negative arc costs as well. "all-pairs" means that unlike the Dijkstra algorithm the Floyd algorithm computes the shortest paths to all nodes for ALL  nodes as source nodes. The output is the so-called "shortest paths" matrix, which contains the optimal nodes for every row (where row index n represents node n as source node), where Infinity indicates that there is no path from that source node to the target node (column index), as well as the matrix with the corresponding distances. Using the example data from above we get:

moorebellmann_30.gif

moorebellmann_31.gif

moorebellmann_32.gif

moorebellmann_33.gif

moorebellmann_34.gif

moorebellmann_35.gif

moorebellmann_36.gif

moorebellmann_37.gif

moorebellmann_38.gif

JFloyd[] finishes in less than 2 milliseconds:

moorebellmann_39.gif

moorebellmann_40.gif

Note that the last example for JDijkstra[] is just a special case of the output of the Floyd algorithm (first row).

Using the MatrixMart example from above, JFloyd[] computes the all-pairs optimal solution, and we can compare the first row with the output from the Dijkstra algorithm, which must be identical:

moorebellmann_41.gif

moorebellmann_42.gif

moorebellmann_43.gif

The values attained by the two solutions have to be the same (not necessarily the solutions themselves). Three cases contain Infinity (which means no directed path is possible between the two nodes), and where one solution has Infinity, so does the other. In all other cases the difference between the two values is 0:

moorebellmann_44.gif

moorebellmann_45.gif

moorebellmann_46.gif

moorebellmann_47.gif

The runtime is less than 9 seconds:

moorebellmann_48.gif

moorebellmann_49.gif

The runtime complexity of the Floyd algorithm is O(moorebellmann_50.gif), and its data storage requirements are O(moorebellmann_51.gif). It is therefore much more efficient to use the MooreBellmann or Dijkstra algorithms over the Floyd algorithm if only the optimal paths for one source node have to be computed.

The author is inviting competing implementations in Mathematica or Java or both for large problems at info@lauschkeconsulting.net to challenge the presented performance claim.

JMooreBellmann[]

The JavaTools function JMooreBellmann[] finds the shortest paths from a source node to all other nodes for arbitrary arc costs, using the D'Esopo Pape variant, which is the most efficient method known today for practical problems (its theoretical worse-case complexity is exponential). The D'Esopo Pape variant is also sometimes known as the "Deque" variant ("double-edged queue"), as a deque is used internally for updating nodes. Unlike the Dijkstra algorithm, the Moore-Bellmann algorithm works correctly even for arbitrary (negative) arc costs. However, for the Moore-Bellmann to work correctly, it is imperative that the network does not contain any *circles* of negative arc length (as it would be possible to "walk" in the circle an infinite number of times and always decrease the total arc length).

Using the example from above:

moorebellmann_52.gif

moorebellmann_53.gif

moorebellmann_54.gif

moorebellmann_55.gif

moorebellmann_56.gif

This is the same output as was computed with the Dijkstra algorithm, and it matches the first line of the output of the Floyd algorithm.

Its running time is very fast: less than 2 millisconds for this example:

moorebellmann_57.gif

moorebellmann_58.gif

In the following example we use a few negative arc costs, not introducing a negative cycle:

moorebellmann_59.gif

moorebellmann_60.gif

moorebellmann_61.gif

moorebellmann_62.gif

moorebellmann_63.gif

This is the same output as was computed with the Dijkstra algorithm, and it matches the first line of the output of the Floyd algorithm.

moorebellmann_64.gif

moorebellmann_65.gif

For dense networks with only non-negative arc costs the Dijkstra algorithm is more efficient (O(moorebellmann_66.gif)), for sparse networks with only non-negative arc costs the D'Esopo Pape variant of the Moore Bellmann algorithm (implemented in JMooreBellmann[]) is more efficient (O(min(nmC,mmoorebellmann_67.gif)), thus possibily linear).

Using the MatrixMart example from above, JMooreBellmann[] computes the result in 78 milliseconds.

moorebellmann_68.gif

moorebellmann_69.gif

moorebellmann_70.gif

Here as well we get different solutions, but identical values of the solution:

moorebellmann_71.gif

moorebellmann_72.gif

moorebellmann_73.gif

moorebellmann_74.gif

moorebellmann_75.gif

However, the solutions themselves returned by the two algorithms are NOT necessarily the same (the solutions of the shortest path problem, and of the minimum spanning tree, see below, are not unique). In this example, out of the 1074 paths, 1038 are the same, 36 are different:

moorebellmann_76.gif

moorebellmann_77.gif

moorebellmann_78.gif

moorebellmann_79.gif

This means that in 36 cases different paths were chosen, however, they both attain the same optimum value.

The author is inviting competing implementations in Mathematica or Java or both for large problems at info@lauschkeconsulting.net to challenge the presented performance claim.

JKruskal[]

JavaTools also provides a very efficient implementation of the Kruskal algorithm to compute the minimum spanning tree of a connected undirected network. The "modified Kruskal algorithm" is invoked with JKruskal[] in the JavaTools, and it operates on sparse arrays or lists with the nodes and costs. Here we use the sparse array version:

moorebellmann_80.gif

moorebellmann_81.gif

moorebellmann_82.gif

moorebellmann_83.gif

moorebellmann_84.gif

This means the arcs (1,2), (2,4), (3,4), and (3,5) form the minimum cost spanning tree.

moorebellmann_85.gif

Note the speed: JKruskal[] finishes in roughly a millisecond:

moorebellmann_86.gif

moorebellmann_87.gif

The author is inviting competing implementations in Mathematica or Java or both for large problems at info@lauschkeconsulting.net to challenge the presented performance claim.

More Examples

The following example uses a slight modification of an example from the Mathematica help browser. We compute a minimum spanning tree (assuming arcs are just undirected edges and all edge=arc costs are their -- 2-dimensional -- Euclidean lenghts) using the modified Kruskal algorithm, the shortest paths from the source node 1 to all other nodes using the Dijkstra algorithm as well as the Moore-Bellmann algorithms, the all-pairs shortest path solutions using the Floyd algorithm, and show related timing results.

Note that the Kruskal algorithm and all its variants compute EXACT optimal solutions, that means they are not heuristics to compute a solution that is not necessarily optimal (only "good"). The solution is not necessarily unique, but it is guaranteed to be an optimal solution. In other words, other solutions are possbile, but no better ones. Other solutions are only just as good. However, for "sufficiently random" inputs of type Real, such as the coordinates of cities (see below), the solutions usually are unique, hence, the optimal solution is the only optimal one. It is mathematically next to impossible that random inputs would result in problems for which the optimal solutions are not unique. For the next few graph theoretic examples other optimal solutions are readily apparent, but for the city data examples below it is impossible that other optimal solutions exist.

moorebellmann_88.gif

moorebellmann_89.gif

moorebellmann_90.gif

moorebellmann_91.gif

To compute the minimum spanning tree solution using edge lengths as edge costs we have to extract the edge lengths from the graph:

moorebellmann_92.gif

moorebellmann_93.gif

Here we compute the minimum spanning tree solution (in practically 0 time):

moorebellmann_94.gif

moorebellmann_95.gif

moorebellmann_96.gif

This is what the minimum spanning tree solutions looks like, overlaid on the whole graph:

moorebellmann_97.gif

moorebellmann_98.gif

The shortest paths from the source node 1 to all others, using the Dijkstra algorithm:

moorebellmann_99.gif

moorebellmann_100.gif

The shortest paths from the source node 1 to all others, using the D'Esopo Pape variant of the Moore-Bellmann algorithm:

moorebellmann_101.gif

moorebellmann_102.gif

The all-pairs shortest paths solution using the Floyd algorithm (again, note that the first line is identical to the Dijkstra and MooreBellman solutions):

moorebellmann_103.gif

moorebellmann_104.gif

The following example uses the Petersen graph as a starting point, but with double-edges removed:

moorebellmann_105.gif

moorebellmann_106.gif

moorebellmann_107.gif

moorebellmann_108.gif

moorebellmann_109.gif

moorebellmann_110.gif

moorebellmann_111.gif

moorebellmann_112.gif

moorebellmann_113.gif

moorebellmann_114.gif

moorebellmann_115.gif

moorebellmann_116.gif

moorebellmann_117.gif

moorebellmann_118.gif

moorebellmann_119.gif

moorebellmann_120.gif

moorebellmann_121.gif

Another example (a modification of an example from the Mathematica help browser):

moorebellmann_122.gif

moorebellmann_123.gif

moorebellmann_124.gif

moorebellmann_125.gif

moorebellmann_126.gif

moorebellmann_127.gif

moorebellmann_128.gif

moorebellmann_129.gif

moorebellmann_130.gif

moorebellmann_131.gif

moorebellmann_132.gif

moorebellmann_133.gif

moorebellmann_134.gif

moorebellmann_135.gif

moorebellmann_136.gif

moorebellmann_137.gif

More Large Examples

In the following section we show two more large-scale examples. None of the problems could be solved with the functions that Mathematica has built in.

Both examples use an underlying graph that is not planar, i. e. there are edges that are crossing other edges. This makes the problem particularly difficult, and in the solution plots you can see that optimal edges are hidden underneath other edges that were not included in the optimal solution.

The following is also an example from the Mathematica help browser. As in the previous examples we compute the minimum spanning tree with the modified Kruskal algorithm (again assuming all edge costs to be their 2-dimensional Euclidean lengths), and the shortest paths from the source node 1 to all other nodes with the Dijkstra and Moore-Bellmann algorithms, as well as the all-pairs shortest path solutions using the Floyd algorithm, and show related timing results and solution overlays.

moorebellmann_138.gif

moorebellmann_139.gif

moorebellmann_140.gif

moorebellmann_141.gif

JavaTools computes the minimum spanning tree solution in a mere 157 milliseconds:

moorebellmann_142.gif

moorebellmann_143.gif

moorebellmann_144.gif

moorebellmann_145.gif

moorebellmann_146.gif

moorebellmann_147.gif

moorebellmann_148.gif

moorebellmann_149.gif

moorebellmann_150.gif

moorebellmann_151.gif

moorebellmann_152.gif

moorebellmann_153.gif

The following is also an example from the Mathematica help browser. The matrix is related to a structure from NASA's Langley Research Center. As in the previous examples we compute the minimum spanning tree with the modified Kruskal algorithm (again assuming all edge costs to be their 2-dimensional Euclidean lengths), and the shortest paths from the source node 1 to all other nodes with the Dijkstra and Moore-Bellmann algorithms, as well as the all-pairs shortest path solutions using the Floyd algorithm, and show related timing results and overlays.

moorebellmann_154.gif

moorebellmann_155.gif

moorebellmann_156.gif

moorebellmann_157.gif

JavaTools computes the minimum spanning tree solution in 2/3 of a second:

moorebellmann_158.gif

moorebellmann_159.gif

moorebellmann_160.gif

moorebellmann_161.gif

moorebellmann_162.gif

moorebellmann_163.gif

moorebellmann_164.gif

moorebellmann_165.gif

moorebellmann_166.gif

moorebellmann_167.gif

moorebellmann_168.gif

moorebellmann_169.gif

The last Dijkstra and MooreBellmann solutions show just how inferior the Dijkstra algorithm is and how superior the MooreBellmann algorithm is, which can also be shown theoretically.

The author of JVMTools is inviting competing implementations in Mathematica or Java or both for large problems at info@lauschkeconsulting.net to challenge the presented performance claim.

Spikey Created with Wolfram Mathematica 9.0