next up previous contents
Next: Variables in TFS Representations Up: Implementation Aspects Previous: Implementation Aspects   Contents


Compiling Grammar Rules

As mentioned in Section 2.3.3, the constraints reflected in a TFS can be expressed through descriptions. Without getting into explicit details about various description languages, this section presents the rationale behind the transformation of grammar rules into a representation usable in a programming environment (specifically, in Prolog). For this, the Attribute Logic Engine (ALE) will be used as the example parsing system.

ALE [Carpenter and Penn 2001] is a phrase structure parsing system that supports various formalisms, such as HPSG. Its grammar handling mechanism is built on the foundations of Prolog's built-in DCG system, with the important difference that categories are represented by descriptions of TFSs instead of Prolog terms.

There are two main components in the grammar handling mechanism: the lexicon and the grammar rules. The lexicon consists of lexical entries and lexical rules. Lexical rules are used to express the redundancies among lexical entries.

Of interest to the work presented in this thesis are the grammar rules. An example of a phrase rule in ALE is given in Figure 3.2.

Figure 3.2: A phrase rule in ALE. A line of the form syn:vp represents the description of a feature named syn (syntactic category) with the value (type restriction) vp. The rule states that a phrase of syntactic category s can be formed by combining np and vp categories if their values for the feature agr are the same. The semantics of s is the semantics of the verb phrase, with the role of agent filled by the semantics of the noun phrase.
\begin{figure}\centering
\begin{verbatim}s_np_vp rule
(syn:s,
sem:(VPSem,
ag...
...
sem:NPSem),
cat>
(syn:vp,
agr:Agr,
sem:VPSem).\end{verbatim}
\end{figure}
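
The combination that the caption of Figure 3.2 describes can be sketched outside ALE. The following is a minimal illustration in Python, used here only for exposition and not how ALE itself works: categories are plain dictionaries, a simple equality test stands in for unification of the shared agr value, and the feature name `agent` as well as all sample values are assumptions, since the corresponding line of the figure is truncated.

```python
def apply_s_np_vp(np, vp):
    """Return the mother category s, or None if the rule does not apply.

    Categories are plain dicts; this is a simplification of typed
    feature structures, and equality stands in for unification.
    """
    # The daughters must have the syntactic categories np and vp.
    if np.get("syn") != "np" or vp.get("syn") != "vp":
        return None
    # The shared variable Agr: both daughters must carry the same agr value.
    if np.get("agr") != vp.get("agr"):
        return None
    # The semantics of s is the semantics of the verb phrase ...
    sem = dict(vp["sem"])
    # ... with the noun phrase semantics filling the agent role.
    # "agent" is an assumed feature name (truncated in the figure).
    sem["agent"] = np["sem"]
    return {"syn": "s", "sem": sem}


# Hypothetical sample categories.
john = {"syn": "np", "agr": "sg3", "sem": "john"}
sleeps = {"syn": "vp", "agr": "sg3", "sem": {"pred": "sleep"}}
sleep_pl = {"syn": "vp", "agr": "pl", "sem": {"pred": "sleep"}}
```

With these sample values, `apply_s_np_vp(john, sleeps)` builds an s category, while `apply_s_np_vp(john, sleep_pl)` returns `None` because the agr values differ.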

Using the descriptions presented in Figure 3.2 directly in an implementation would not be practical, not only for efficiency reasons, but also because types can be promoted, whereas Prolog variables, once instantiated, cannot be changed. Therefore, the grammar is first compiled into an efficient internal representation. The choice of internal representation for each category in the grammar depends on the programming language chosen for the implementation. For the particular case of Prolog, the next section presents several encodings of TFSs.
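
To make the notion of type promotion concrete, here is a small sketch in Python. It is only an illustration under stated assumptions, not one of the Prolog encodings discussed in this thesis: the type hierarchy, the `Node` class, and its `promote` method are all hypothetical. The point is that the type stored for a category may later be replaced by a more specific one, which a directly instantiated Prolog variable would not allow.

```python
# Hypothetical type hierarchy, child -> parent (single inheritance for brevity).
PARENT = {
    "sign": None,
    "phrase": "sign",
    "word": "sign",
    "np": "phrase",
    "vp": "phrase",
}


def subsumes(general, specific):
    """True if `general` is `specific` or one of its ancestors."""
    t = specific
    while t is not None:
        if t == general:
            return True
        t = PARENT[t]
    return False


def unify_types(a, b):
    """Return the more specific of two compatible types, else None."""
    if subsumes(a, b):
        return b
    if subsumes(b, a):
        return a
    return None


class Node:
    """A category whose type can be promoted to a more specific type."""

    def __init__(self, type_):
        self.type = type_

    def promote(self, new_type):
        """Refine the stored type; return False if the types clash."""
        result = unify_types(self.type, new_type)
        if result is None:
            return False
        self.type = result
        return True
```

A node created as `Node("phrase")` can be promoted to `np`, after which a promotion to `vp` fails and leaves the type unchanged.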

Another reason for compiling the grammar off-line is that it allows several optimizations to be performed. As will be shown later in this thesis, an analysis of the grammar rules carried out at compile time yields a better indexing scheme, leading to faster parsing times.
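
The general idea of paying an analysis cost once, at compile time, can be sketched as follows. This Python fragment is a deliberately simple illustration and is not the indexing scheme developed in this thesis: it assumes rules are indexed by the syntactic category of their leftmost daughter, and every rule other than s_np_vp is hypothetical.

```python
from collections import defaultdict

# (rule name, mother category, daughter categories).
# Only s_np_vp comes from the text; the other rules are hypothetical.
RULES = [
    ("s_np_vp", "s", ["np", "vp"]),
    ("vp_v_np", "vp", ["v", "np"]),
    ("np_det_n", "np", ["det", "n"]),
]


def compile_index(rules):
    """Compile-time step: map each leftmost-daughter category to its rules."""
    index = defaultdict(list)
    for name, _mother, daughters in rules:
        index[daughters[0]].append(name)
    return dict(index)


# Built once, before parsing starts.
INDEX = compile_index(RULES)


def candidate_rules(category):
    """Parse-time step: only rules whose first daughter matches are tried."""
    return INDEX.get(category, [])
```

At parse time, `candidate_rules("np")` returns only the rules that can start with an np, so the parser does not attempt to match the remaining rules at all.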

