next up previous contents
Next: Experiments Up: Experimental Evaluation Previous: Resources   Contents


Prolog Data Structure

Before presenting (in the following section) the experimental evaluation of the proposed indexing methods, the data structure used to encode TFSs in Prolog is introduced. It should be mentioned that the experiments are highly dependent on the chosen data structure, which is unrelated to the proposed indexing method. Therefore, details about the TFS encoding are given here, rather than in earlier chapters.

The feature structures are encoded as Prolog terms. From the existing methods that efficiently encode TFSs ([Mellish1988], [Gerdemann1995], [Penn1999a]), an optimized encoding similar to the one presented in [Penn1999a] was chosen. As is shown in the aforementioned paper, if the feature graph is $ N$-colourable (but not $ (N-1)$-colourable), the least number of argument positions in a flat encoding is $ N$. The feature graph is an undirected graph where each vertex represents an introduced feature, and each edge, a pair of features that are appropriate to a common type [Penn1999a]. Each position $ N$ in a term represents a feature with colour $ N$ in the feature graph. The encoding introduced in this thesis extends the encoding from [Penn1999a] by allowing for the enforcement of type constraints during unification of typed feature structures.

Figure 7.1 presents the structure used to encode TFSs as Prolog terms. The variable Type encodes the type of the feature structure, while each variable F$ _i$ represents a feature value assigned to position $ i$ (coloured as $ i$ in the feature graph). The arity of feats is constant, and is the minimum number of colours $ N$. Each of feats's arguments is either a singleton variable (meaning that particular feature value is not defined for the current type or is $ \bot$), a shared variable (representing structure sharing), or an embedded tfs term. It is possible for a tfs term to have a variable as its second argument (for structure sharing).

Figure 7.1: The encoding of TFSs as Prolog terms
\begin{figure}\hrule\vspace*{0.2cm}
Typed feature structures: \texttt{tfs(Type,...
...:
\texttt{type(TypeValue,TypeRef,FeatRef).} \vspace*{0.2cm}\hrule\end{figure}

Types were encoded using the attributed variables library in SICStus Prolog [SICS2003]. While existing systems (such as ALE[Carpenter and Penn2001]) use a reference-based encoding for types, attributed variables are more suitable for indexing. Performing de-referencing each time a type is needed for the indexing key during parsing would result in slower parsing time. Attributed variables offer direct access to the encoded type.

In the Prolog encoding of a TFS (tfs(Type,Feats,TypeRef)) shown in Figure 7.1, the variable Type carries the term type(TypeValue,TypeRef,FeatRef) as an attribute. TypeValue represents the actual type value, while FeatRef is a pointer to the features arguments Feats in the TFS encoding tfs/3. TypeRef is a trigger variable (it has no value associated with it) used to enforce the type constraints. It is pointed to by the TypeRef variable from the tfs/3 term. Its purpose is to delay the activation of the constraint enforcing mechanism until the unification of all features in Feats is completed (considering that typical Prolog systems perform the unification of arguments in a term from left to right).

Figure 7.2: The encoding of the TFS for the word mary as Prolog terms



\includegraphics[width=8cm]{example_hpsg.eps}
\includegraphics[width=4.8cm]{example_feature_graph.eps}


$ mary \longrightarrow
\left[\begin{array}{l}
\textrm{\textbf{sign}}\\
\textrm{...
...bf{noun}}\\
\textrm{SUBJ: \textbf{head}}
\end{array}\right]
\end{array}\right]$


TFS for mary: tfs(Type1,Feats1,TypeRef1).
Type1 attribute: type(sign,TypeRef1,Feats1).
Feats1=feats(tfs(Type2,Feats2,TypeRef2),_).
Type2 attribute: type(cat,TypeRef2,Feats2).
Feats2=feats(tfs(Type3,Feats3,TypeRef3),tfs(Type4,Feats4,TypeRef4)).
Type3 attribute: type(noun,TypeRef3,Feats3).
Type4 attribute: type(head,TypeRef4,Feats4).
Feats3=_, Feats4=_.

Figure 7.2 presents the Prolog encoding for a typed feature structure representing the lexical entry for mary, given a type hierarchy and its coloured feature graph.


next up previous contents
Next: Experiments Up: Experimental Evaluation Previous: Resources   Contents