

Matrix chains

Suppose you (or your computer) have to multiply a chain of matrices:

$\displaystyle A\times B\times C\times D\times E.$

Matrix multiplication of a pair of matrices is well-defined (if it's defined at all), and for more than two, multiplication is associative:

$\displaystyle (((AB)(CD))E) = (A((BC)(DE))).$

You get the same answer either way, but the amount of work (measured in the number of scalar multiplications) may well differ. Suppose $ A$ is a $ 10\times 100$ matrix, $ B$ is a $ 100\times 5$ matrix, and $ C$ is a $ 5\times 50$ matrix. If we group the multiplication as $ ((AB)C)$, then the amount of work is proportional to $ 10\times 100\times 5$ + $ 10\times 5\times 50$, or 7500 multiplications. On the other hand, if we group the product as $ (A(BC))$, we're faced with $ 100\times 5\times 50$ + $ 10\times 100\times 50$, or 75000 multiplications -- 10 times as many! So, order counts.
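The arithmetic is worth checking for yourself. Here is a minimal sketch in Python (not part of the original notes) that just tallies the two groupings above; the helper name mult_cost is ours.

def mult_cost(p, q, r):
    # Multiplying a p x q matrix by a q x r matrix takes p*q*r scalar multiplications.
    return p * q * r

# A is 10 x 100, B is 100 x 5, C is 5 x 50.
# Grouping ((AB)C): compute AB first, then multiply the 10 x 5 result by C.
cost_AB_then_C = mult_cost(10, 100, 5) + mult_cost(10, 5, 50)

# Grouping (A(BC)): compute BC first, then multiply A by the 100 x 50 result.
cost_A_then_BC = mult_cost(100, 5, 50) + mult_cost(10, 100, 50)

print(cost_AB_then_C)  # 7500
print(cost_A_then_BC)  # 75000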

How much work is it to check all the ways of parenthesizing a product? Try it out with just a few matrices (no more than 10), and you'll find the number of groupings grows rapidly -- roughly like $ 4^n$ for $ n$ matrices. The exact number of groupings is a challenging thing to calculate; it's called the Catalan number.
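One way to see the growth is to count directly: the topmost multiplication splits a chain of $ n$ matrices into two shorter chains, each parenthesized independently. A small counting sketch in Python (ours, not from the notes):

from functools import lru_cache

@lru_cache(maxsize=None)
def groupings(n):
    # Number of ways to fully parenthesize a chain of n matrices:
    # choose where the topmost split falls, then parenthesize each side.
    if n == 1:
        return 1
    return sum(groupings(k) * groupings(n - k) for k in range(1, n))

print([groupings(n) for n in range(1, 11)])
# [1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862] -- the Catalan numbers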

It's not practical to check something proportional to $ 4^n$ before we even get down to the business of calculating the matrix product. Let's look at the structure of the problem. Suppose we (magically) have an optimal grouping for multiplying $ n$ matrices. Examine the last matrix multiplication performed:

$\displaystyle ((A_1 \times \cdots \times A_k)(A_{k+1}\times \cdots \times A_n)).$

If the entire product is optimal (uses the fewest possible multiplications), then the subproducts $ (A_1\cdots A_k)$ and $ (A_{k+1}\cdots A_n)$ must also be optimal. Otherwise, if there were a better way to calculate either subproduct, it would lead to a more efficient grouping for the entire product. So, our Dynamic Programming solution will keep track of the best ways of calculating subproducts, and combine these into the best way of calculating the entire product. We can define the following array, for $ 1 \leq i \leq j \leq n$:

$\displaystyle C[i][j] =$ minimum cost of multiplying $\displaystyle A_i \cdots A_j$.

Clearly $ C[i][i]$ is zero, for all $ i$, since there is no cost in the ``product'' of one matrix. If $ j>i$, then we get the recurrence relation:

$\displaystyle C[i][j] = \min_{i \leq k < j} \{ C[i][k] + C[k+1][j] + r_i r_{k+1} c_j \}.$

where $ r_i r_{k+1} c_j$ is the product of the number of rows in matrix $ i$, the number of rows in matrix $ k+1$, and the number of columns in matrix $ j$. (Since the chain must be multipliable, the number of rows in matrix $ k+1$ equals the number of columns in matrix $ k$, so this term is exactly the cost of the final multiplication that joins the two subproducts.)
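The recurrence fills in bottom-up by increasing chain length, so every subproduct cost is ready when it's needed. Here is a minimal sketch in Python (ours, not from the original notes); it represents each matrix by its (rows, columns) pair and uses 0-based indices.

def matrix_chain_cost(dims):
    # dims[i] is (rows, cols) of the (i+1)-st matrix in the chain; adjacent
    # matrices are assumed compatible: dims[i][1] == dims[i+1][0].
    # C[i][j] is the minimum cost of multiplying matrices i..j; C[i][i] = 0.
    n = len(dims)
    C = [[0] * n for _ in range(n)]
    # Fill by increasing chain length so C[i][k] and C[k+1][j] are already
    # computed when C[i][j] is.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            C[i][j] = min(
                C[i][k] + C[k + 1][j] + dims[i][0] * dims[k + 1][0] * dims[j][1]
                for k in range(i, j)
            )
    return C[0][n - 1]

# The three-matrix example from above: the best grouping, ((AB)C), costs 7500.
print(matrix_chain_cost([(10, 100), (100, 5), (5, 50)]))  # 7500

Filling the table this way takes time proportional to $ n^3$ and space proportional to $ n^2$ -- a big improvement over checking all the groupings.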


Danny Heap 2002-12-16