next up previous contents
Next: Using the Path Index Up: Path Indexing Previous: Building the Path Index   Contents


Key Extraction in Path Indexing

The vectors used as the second layer index should be of the same size for each pair of matching mothers and daughters. More than that, a vector $ V_i$ from the indexing scheme of a daughter $ D$ that matches a mother $ M$ and a vector $ V_j$ from the indexing scheme of $ M$ associated with $ D$ should refer to sets of equivalent nodes (through $ \bowtie$) in the mother and in the daughter.

The types extracted for the indexing vectors are those of nodes found at the end of the indexing paths. An indexing path $ \pi$ for a mother $ M$ that unifies with a daughter $ D$ is defined as:

Definition 6.8   If $ \widehat{M}^{\textrm{\Cutleft}_D}_{\textrm{\Cutleft}}= \langle
Q',\overline{Q'}, \theta,\delta \rangle$ is the Dynamically Cut typed feature structure of a mother $ M=\langle Q,\overline{q},
\theta, \delta\rangle$ with respect to a daughter $ D$ for which $ M\sqcup D\downarrow$ before parsing, then $ \pi\in Path$ is an indexing path iff $ \delta(\pi,\overline{q})\in
[\overline{Q'}]^{-1}$.

In the current implementation, a more efficient definition of indexing paths is used:

Definition 6.9   If $ \widehat{M}^{\textrm{\Cutleft}_D}= \langle Q',\overline{Q'},
\theta,\delta \rangle$ is the Statically Cut typed feature structure of a mother $ M=\langle Q,\overline{q},
\theta, \delta\rangle$ with respect to a daughter $ D$ for which $ M\sqcup D\downarrow$ before parsing, then $ \pi\in Path$ is an indexing path iff $ \delta(\pi,\overline{q})\in
[\overline{Q'}]^{-1}$.

The motivation for not using the $ DynamicCut$ is the high cost of evaluating the Condition 2 of Definition 6.6, a condition that is tested during parsing (after each edge is created). Any significant operation performed during run-time negatively affects the efficiency of the parser. An experiment using the $ DynamicCut$ to determine the indexing paths yielded parsing times of more than 100% slower than those of the baseline. For the $ StaticCut$ however, the conditions are tested off-line, with no penalty placed on parsing times.

From an implementation point of view, in order to avoid traversing the indexing path during run-time, the path index key contains pointers to the types at the ends of the indexing paths of the TFSs in the chart.


next up previous contents
Next: Using the Path Index Up: Path Indexing Previous: Building the Path Index   Contents