
Optimal structure of a BST

Suppose you need a binary search tree (BST) that makes searches as efficient as possible, given how frequently each key is looked up. You are given a list of pairs

$\displaystyle (K_0, f_0), (K_1, f_1), \ldots, (K_n, f_n),$

where the $K_i$ are keys with $K_i < K_j$ (in some ordering) whenever $i < j$, and the $f_i$ are integers specifying how frequently the keys occur. For each BST composed of nodes containing these keys, denote the depth of the node containing $K_i$ by $d_i$ (the root has depth 1, its children depth 2, and so on). The cost of such a BST is the sum of $d_i \times f_i$ over all $i$: you expect to visit $d_i$ nodes on each of the $f_i$ retrievals of the node with key $K_i$. What is the best way to construct a BST in order to minimize the cost? Note: For this assignment, BSTs have keys in all nodes, not just in the leaves.
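To make the cost definition concrete, here is a small Python sketch that evaluates the cost of a given BST. The tuple representation `(left, i, right)`, the function name `bst_cost`, and the frequencies are all assumptions chosen for illustration; they are not part of the assignment.

```python
def bst_cost(tree, freqs, depth=1):
    """Return the sum of d_i * f_i for a BST.

    A tree is either None (empty) or a tuple (left, i, right), where i is
    the index of the key stored at that node. The root has depth 1.
    """
    if tree is None:
        return 0
    left, i, right = tree
    return (depth * freqs[i]
            + bst_cost(left, freqs, depth + 1)
            + bst_cost(right, freqs, depth + 1))


# Hypothetical frequencies f_0, f_1, f_2 for keys K_0 < K_1 < K_2.
freqs = [5, 1, 2]

# K_1 at the root, K_0 and K_2 as its children.
balanced = ((None, 0, None), 1, (None, 2, None))
# K_0 at the root, K_1 below it, K_2 below K_1.
chain = (None, 0, (None, 1, (None, 2, None)))

print(bst_cost(balanced, freqs))  # 1*1 + 2*5 + 2*2 = 15
print(bst_cost(chain, freqs))     # 1*5 + 2*1 + 3*2 = 13
```

With these frequencies the balanced tree is not the cheapest one, because the most frequent key sits at depth 2.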

Here is a first attempt. Denote the minimum cost of a BST composed of nodes containing $K_i, \ldots, K_k$ as $c[i][k]$, and the weight added by selecting some $K_j$ ($i \leq j \leq k$) as the root, with subtrees $K_i, \ldots, K_{j-1}$ and $K_{j+1}, \ldots, K_k$, as $w(i,j,k) = f_i + \cdots + f_j + \cdots + f_k$. You should draw some small BSTs to see why this $w(i,j,k)$ is suitable. Now, for each choice of $j$ with $i \leq j \leq k$, find $c[i][j-1] + c[j+1][k] + w(i,j,k)$; the smallest of these values is $c[i][k]$, where an empty range of keys has cost 0. Again, this is close, but not identical, to the formula for triangulation and matrix multiplication, so you'll need to slightly change the notation.
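The first attempt can be sketched directly as a memoized recursion in Python. This is only an illustration under stated assumptions: the name `min_cost_inclusive`, the prefix-sum helper, and the convention that an empty range of keys costs 0 are choices made here, not something the assignment prescribes.

```python
from functools import lru_cache


def min_cost_inclusive(freqs):
    """Minimum BST cost for frequencies f_0..f_n, using inclusive ranges."""
    # prefix[t] = f_0 + ... + f_{t-1}, so any w(i, j, k) is one subtraction.
    prefix = [0]
    for f in freqs:
        prefix.append(prefix[-1] + f)

    @lru_cache(maxsize=None)
    def c(i, k):
        # Assumed base case: no keys in the range K_i..K_k.
        if i > k:
            return 0
        w = prefix[k + 1] - prefix[i]  # f_i + ... + f_k, independent of j
        # Try every key K_j in the range as the root.
        return w + min(c(i, j - 1) + c(j + 1, k) for j in range(i, k + 1))

    return c(0, len(freqs) - 1)


print(min_cost_inclusive([5, 1, 2]))  # 12
```

Note that $w(i,j,k)$ does not actually depend on $j$, which is why it can be added once, outside the minimum.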

Denote the minimum cost of a BST composed of nodes containing $K_{i+1}, \ldots, K_{k-1}$ as $c[i][k]$, and let $w(i,j,k)$ be the sum $f_{i+1} + \cdots + f_{k-1}$. Now your recursive formula is Equation 1. If you set out to find a minimal BST for values $(K_0, f_0), \ldots, (K_n, f_n)$, the corresponding minimal cost will be $c[-1][n+1]$, and the algorithm is the same (except for the definition of $w(i,j,k)$) as for the other two problems.
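As a rough illustration of the shifted notation, the following Python sketch fills the table bottom-up. Equation 1 is not reproduced in this section, so the code assumes it has the form $c[i][k] = \min_{i<j<k} \left( c[i][j] + c[j][k] + w(i,j,k) \right)$, with $c[i][k] = 0$ when no key lies strictly between positions $i$ and $k$. Because Python lists cannot be indexed from $-1$ in the intended sense, the hypothetical `table` stores $c[i][k]$ at `table[i + 1][k + 1]`; that shift and the function name are assumptions of this sketch.

```python
def min_cost_exclusive(freqs):
    """Minimum BST cost where c[i][k] covers keys K_{i+1}..K_{k-1}."""
    m = len(freqs)  # keys are K_0..K_n, so m = n + 1

    # prefix[t] = f_0 + ... + f_{t-1}
    prefix = [0] * (m + 1)
    for t, f in enumerate(freqs):
        prefix[t + 1] = prefix[t] + f

    # table[i + 1][k + 1] holds c[i][k] for -1 <= i, k <= n + 1.
    # Entries with k - i <= 1 stay 0: no key lies strictly between them.
    table = [[0] * (m + 2) for _ in range(m + 2)]

    for gap in range(2, m + 2):            # gap = k - i
        for i in range(-1, m - gap + 1):   # ensures k = i + gap <= n + 1
            k = i + gap
            w = prefix[k] - prefix[i + 1]  # f_{i+1} + ... + f_{k-1}
            best = min(table[i + 1][j + 1] + table[j + 1][k + 1]
                       for j in range(i + 1, k))
            table[i + 1][k + 1] = best + w

    return table[0][m + 1]  # c[-1][n+1]


print(min_cost_exclusive([5, 1, 2]))  # 12
```

Filling the table in order of increasing gap guarantees that both $c[i][j]$ and $c[j][k]$ are already available when $c[i][k]$ is computed.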


Danny Heap 2002-11-27