

Statistical vs. Non-Statistical Methods

Previous sections gave an overview of several approaches to improving parsing times for TFSGs. This section analyzes those approaches that use indexing or filtering. They fall into two major categories: statistical and non-statistical.

Statistical methods, such as the commonly used quick-check [Malouf et al. 2000], need a training phase to determine the best criteria for predicting unification failure. Their advantage is their simplicity, and experimental results often show significant improvements. Their major disadvantage is the training phase itself. If the grammar is modified often (even through very small changes, which occur frequently during development), the time spent on training is not compensated for by the improvements in parsing times. Moreover, if the training set is not properly chosen, the statistical filter may fail to detect any unification failure at all. A further disadvantage of methods such as quick-check, which must determine the failure-causing paths during training, is that finding the optimal paths takes exponential time [Penn and Munteanu 2003].
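The trade-off described above can be illustrated with a minimal sketch of a quick-check-style filter. It is not the implementation of [Malouf et al. 2000]: feature structures are modelled as nested dictionaries with atomic string values, two values clash simply when they differ, and the names `train`, `quick_check`, `first_failure_path` and `value_at` are hypothetical. Training records the path at which each failing pair first clashes and keeps the `k` most frequent paths; at parse time only those paths are compared before a full unification is attempted.

```python
from collections import Counter


def value_at(fs, path):
    """Follow a feature path through a nested dict; return None if it is absent."""
    for feat in path:
        if not isinstance(fs, dict) or feat not in fs:
            return None
        fs = fs[feat]
    return fs


def first_failure_path(a, b, prefix=()):
    """Return the first path at which two toy feature structures clash, or None."""
    if isinstance(a, dict) and isinstance(b, dict):
        # Only features present in both structures can clash in this toy model.
        for feat in sorted(a.keys() & b.keys()):
            clash = first_failure_path(a[feat], b[feat], prefix + (feat,))
            if clash is not None:
                return clash
        return None
    # At least one side is atomic: the values clash unless they are equal.
    return None if a == b else prefix


def train(pairs, k):
    """Training phase: keep the k paths that most often cause failure."""
    counts = Counter()
    for a, b in pairs:
        path = first_failure_path(a, b)
        if path is not None:
            counts[path] += 1
    return [path for path, _ in counts.most_common(k)]


def quick_check(a, b, paths):
    """Return False if failure is certain; True means full unification is still needed."""
    for path in paths:
        va, vb = value_at(a, path), value_at(b, path)
        if va is None or vb is None:
            continue  # path missing on one side: no information
        if isinstance(va, dict) and isinstance(vb, dict):
            continue  # complex values are left to full unification
        if va != vb:
            return False
    return True


# Toy data (assumed for illustration only).
a = {"HEAD": {"CAT": "noun"}, "AGR": {"NUM": "sg"}}
b = {"HEAD": {"CAT": "verb"}, "AGR": {"NUM": "sg"}}
c = {"HEAD": {"CAT": "noun"}, "AGR": {"NUM": "pl"}}

paths = train([(a, b), (a, b), (a, c)], k=1)
```

With `k=1` the filter keeps only the `HEAD`/`CAT` path, so it rejects the pair `(a, b)` cheaply but lets `(a, c)` through to full unification. This is the sensitivity to the training set noted above: failures on paths that the training data under-represents are not filtered.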

Non-statistical indexing or filtering methods for improving parsing times have received less attention in recent years. One of the very few such methods is rule filtering [Kiefer et al. 1999], which is similar to the general indexing presented in Chapter 4. However, its authors neither exploit the benefits of an indexing structure nor present their results in a large experimental context. As will be shown in Chapter 6, an indexing structure allows various extensions to be implemented, such as "personalized" index keys for each mother-daughter pair, which lead to a larger percentage of avoided failed unifications.
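To make the contrast with statistical methods concrete, the following is a minimal sketch of a non-statistical filter that is computed from the grammar alone, with no training corpus. It is neither the method of [Kiefer et al. 1999] nor the indexing scheme of Chapter 6: the toy grammar `RULES`, the stand-in test `compatible`, and the functions `build_filter` and `may_unify` are all assumptions made for illustration. The table is keyed by (rule, daughter position) pairs, loosely mirroring the idea of a separate key for each mother-daughter pair.

```python
# Hypothetical toy grammar: rule name -> (mother category, daughter categories).
RULES = {
    "s_np_vp": ("s", ["np", "vp"]),
    "np_det_n": ("np", ["det", "n"]),
    "vp_v_np": ("vp", ["v", "np"]),
}


def compatible(mother, daughter):
    """Stand-in for a real unifiability test between two rule descriptions."""
    return mother == daughter


def build_filter(rules):
    """Precompute, for each (rule, daughter position), the rules whose mother may fill it."""
    table = {}
    for name, (_, daughters) in rules.items():
        for position, daughter in enumerate(daughters):
            table[(name, position)] = {
                other
                for other, (mother, _) in rules.items()
                if compatible(mother, daughter)
            }
    return table


def may_unify(table, rule, position, edge_rule):
    """Table lookup at parse time: False means the unification can be skipped."""
    return edge_rule in table[(rule, position)]


table = build_filter(RULES)
```

Because the table depends only on the grammar, it has to be rebuilt when the grammar changes but needs no training data, and a lookup such as `may_unify(table, "s_np_vp", 0, "vp_v_np")` replaces an attempted unification that is bound to fail.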

It should be mentioned that the non-statistical indexing methods presented in Chapter 6 do not preclude statistical improvements (see footnote 4.1). Although quick-check and rule filtering both demonstrate improvements in parsing times, both methods were evaluated using parsers that are not fully optimized from a non-statistical point of view. A complete evaluation of such statistical methods would be more relevant if performed after all possible non-statistical optimizations have been implemented.

