How can we parallelize this particular algorithm? Recall that combinations of starting and ending nodes were generated prior to generating the permutations. Since the number of processors that we have available are limited, just breaking down the combinations of the starting and ending nodes would be sufficient.
What's tricky in breaking down the jobs is in the order that the programs can handle in visiting different starting and ending nodes. In this particular instance, the program broke them down like this:
for int i=1; i<=|V|; i++ for int j=1; j<i; j++ generate permutations without i and j and compute output end for end forHow can we equally divide these induces equally? Let i be the starting node and j be the ending node. However, the code would not know which i and j to start and to stop. so we introduce the notion of k-indexing where k is the kth combination using i and j. to express k in terms of i and j:
With no extra communication required for each processor other than starting up processes and cleaning up, the result, graphed in figure 3.6 shows an expected result although there are still some loss of time from a strictly linear running time.