After the ALE program compiles without any error messages, it is possible to test the program to make sure it does what it is supposed to. We consider the problem from the bottom-up, as this is the best way to proceed in testing grammars. ALE does not have a sophisticated input/output package, and thus all ALE procedures must be accessed through Prolog queries.
Once the signature is compiled, it is possible to test the results of the compilation. To test whether or not a type exists, use the following query :
| ?- type(Type). Type = bot ?; Type = cat ?; Type = synsem ? yesNote that the prompt | ?- is provided by Prolog, while the query consists of the string type(Type)., including the period and a return after the period. Prolog then responds with instantiations of any variables in the query if the query is successful. Thus the first solution for Type that is found above is Type = bot. After providing an instantiation representing the solution to the query, Prolog then provides another prompt, this time in the form of a single question mark. After the first prompt above, the user typed a semicolon and return, indicating that another solution is desired. The second solution Prolog found was Type = cat. After this prompt, the user requested a third solution. After the third solution, Type = synsem, the user simply input a return, indicating that no more solutions were desired. These two options, semicolon followed by return, and a simple return, are the only ones relevant for ALE. If the anonymous variable _ is used in a query, no substitutions are given for it in the solution. If there are no solutions to a query, Prolog returns no as an answer. Consider the following two queries:
| ?- type(bot). yes
| ?- type(foobar). noIn both cases, no variables are given in the input, so a simple yes/no answer, followed by another prompt, is all that is returned.
The second useful probe on the signature indicates type subsumptions and type unifications. To test type subsumption, use the following form of query :
| ?- sub_type(X,Y). X = and, Y = and ?; X = backward, Y = backward ? yesNote that with two variables, substitutions for both are given, allowing the possibility of iterating through the cases. In general, wherever a variable may be used in a query, a constant may also be used. Thus sub_type(synsem,forward). is a valid query, as are sub_type(synsem,X) and sub_type(Y,forward). The first argument is the more general type, with the second argument being the subtype.
Type unifications are handled by the following form of query :
| ?- unify_type(T1,T2,T).The interpretation here is that T1 unified with T2 produces T3. As before, any subset of the three variables may be instantiated for the test and the remaining variables will be solved for.
The following query will indicate whether given features have been defined and can also be used to iterate through the features if the argument is uninstantiated :
| ?- feature(F).
Feature introduction can be tested by :
| ?- introduce(F,T).which holds if feature F is introduced at type T.
Type constraints can be tested using :
| ?- show_cons(Type).which will display the description of the constraint assigned to the type, Type.
Finally, the inherited appropriateness function can be tested by :
| ?- approp(Feat,Type,Restr).A solution indicates that the value for feature Feat for a type Type structure is of type Restr. As usual, any of the variables may be instantiated, so that it is possible to iterate through the types appropriate for a given feature or the features appropriate for a given type, the restrictions on a given feature in a fixed type, and so on.
There is one higher-level debugging routine for the signature that outputs a complete specification for a type, including a list of its subtypes and supertypes, along with the most general feature structure of that type (after all type inference and constraint satisfaction has been performed). An example of the show_type/1 query is as follows :
| ?- show_type functional.
TYPE: functional
SUBTYPES: [forward,backward]
SUPERTYPES: [synsem]
MOST GENERAL SATISFIER:
functional
ARG synsem
RES synsem
If synsem had any appropriate features, these would have been
added, along with their most general appropriate values.
Descriptions can be evaluated in order to find their most general satisfiers. ALE provides the following form of query :
| ?- mgsat tl:e_list.
ne_list_quant
HD quant
RESTR proposition
SCOPE proposition
VAR individual
TL e_list
ANOTHER? n.
yes
Note that there must be whitespace between the mgsat and the
description to be satisfied. The answer given above is the most
general satisfier of the description tl:e_list using the
signature in the categorial grammar in the appendix. It is important
to note here that type inference is being performed to find most
general satisfiers. In the case at hand, because lists in the
categorial grammar are typed to have quantifiers as their HD
values, the value of the HD feature in the most general
satisfier has been coerced to be a quantifier.
Satisfiable non-disjunctive descriptions always have unique most general satisfiers as a consequence of the way in which the type system is constrained. But a description with disjunctions in it may have multiple satisfiers. Consider the following query:
| ?- mgsat hit,hitter:(j;m). hit HITTEE individual HITTER j ANOTHER? y. hit HITTEE individual HITTER m ANOTHER? y. noAfter finding the first most general satisfier to the description, the user is prompted as to whether or not another most general satisfier should be sought. As there are only two most general satisfiers of the description, the first request for another satisfier succeeds, while the second one fails. Failure to find additional solutions is indicated by the no response from Prolog.
Error messages will result if there is a violation of the type hierarchy in the query. For instance, consider the following query containing two type errors before a satisfiable disjunct:
| ?- mgsat hd:j ; a ; j.
add_to could not add incompatible type j to:
quant
RESTR proposition
SCOPE proposition
VAR individual
add_to could not add undefined type: a to
bot
MOST GENERAL SATISFIER OF: hd:j;a;j
j
ANOTHER?
Here the two errors are indicated, followed by a display of the unique
most general satisfiers. The problem with the first disjunct is that
lists have elements which must be of the quantifier type, which
conflicts with the individual type of j, while the second
disjunct involves an undefined type a. Note that in the error
messages, there is some indication of how the conflict arose as well
as the current state of the structure when the error occurred. For
instance, the system had already figured out that the head must be a
quantifier, which it determined before arriving at the incompatible
type j. The conflict arose when an attempt was made to add the
type j to the quant type object.
To explore unification, simply use conjunction and mgsat. In particular, to see the unification of descriptions D1 and D2, simply display the most general satisfiers of D1, D2, and their conjunction (D1,D2). To obtain the correct results, D1 and D2 must not share any common variables. If they do, the values of these will be unified across D1 and D2, a fact which is not represented by the most general satisfiers of either D1 or D2. Providing most general satisfiers also allows the user to test for subsumption or logical equivalence by visual inspection, by using mgsat/1 and comparing the set of solutions. Future releases should contain mechanisms for evaluating subsumption (entailment), and hence logical equivalence of descriptions.
With a feature structure system such as ALE, grammars and programs often manipulate very large feature structures. To aid in debugging, two queries allow the user to focus attention on particular types and features by supressing the printing of other types and features.
The following command supresses printing of a type :
| ?- no_write_type(T).After no_write_type(T) is called, the type T will no longer be displayed during printing. To restore the type T to printed status, use :
| ?- write_type(T).If T is a variable in a call to write_type/1, then all types are subsequently printed. Alternatively, the following query restores printing of all types :
| ?- write_types.
Features and their associated values can be supressed in much the same way as types. In particular, the following command blocks the feature F and its values from being printed :
| ?- no_write_feat(F).To restore printing of feature F, use :
| ?- write_feat(F).If F is a variable here, all features will subsequently be printed. The following special query also restores printing of all features .
| ?- write_feats.
It is possible to display definite clauses in feature structure format by name. The following form of query can be used :
| ?- show_clause append.
HEAD: append(e_list,
[0] bot,
[0] )
BODY: true
ANOTHER? y.
HEAD: append(ne_list_quant
HD [0] quant
RESTR proposition
SCOPE proposition
VAR individual
TL [1] list_quant,
[2] bot,
ne_list_quant
HD [0]
TL [3] list_quant)
BODY: append([1],
[2],
[3])
ANOTHER? y.
no
Note that this example comes from the categorial grammar in the
appendix. Also note that the feature structures are displayed in full
with tags indicating structure sharing. Next, note that prompts allow
the user to iterate through all the clauses. The number of solutions
might not correspond to the number of clause definitions in the
program due to disjunctions in descriptions which are resolved
non-deterministically when displaying rules. But it is important to
keep in mind that this feature structure notation for rules is not the
one ALE uses internally, which compiles rules down into
elementary operations which are then compiled, rather than evaluating
them as feature structures by unification. In this way, ALE is
more like a logic programming compiler than an interpreter. Finally,
note that the arity of the predicate being listed may be represented
in the query as in Prolog. For instance, the query show_clause
append/3 would show the clauses for append with three
arguments.
Definite clauses in ALE can be evaluated by using a query such as :
| ?- query append(X,Y,[a,b]).
append(e_list,
[0] ne_list
HD a
TL ne_list
HD b
TL e_list,
[0] )
ANOTHER? y.
append(ne_list
HD [0] a
TL e_list,
[1] ne_list
HD b
TL e_list,
ne_list
HD [0]
TL [1] )
ANOTHER? y.
append(ne_list
HD [0] a
TL ne_list
HD [1] b
TL e_list,
[2] e_list,
ne_list
HD [0]
TL ne_list
HD [1]
TL [2] )
ANOTHER? y.
no
The definition of append/3 is taken from the syllabification
grammar in the appendix. After displaying the first solution,
ALE queries the user as to whether or not to display another
solution. In this case, there are only three solutions, so the third
query for another solution fails. Note that the answers are given in
feature structure notation, where the macro [a,b] is converted to a
head/tail feature structure encoding.
Unlike Prolog, in which a solution is displayed as a substitution for the variables in the query, ALE displays a solution as a satisfier of the entire query. The reason for this is that structures which are not given as variables may also be further instantiated due to the type system. Definite clause resolution in ALE is such that only the most general solutions to queries are displayed. For instance, consider the following query, also from the syllabification grammar in the appendix:
| ?- query less_sonorous(X,r).
less_sonorous(nasal,
r)
ANOTHER? y.
less_sonorous(sibilant,
r)
ANOTHER? n.
Rather than enumerating all of the nasal and sibilant
types, ALE simply displays their supertype. On the other hand,
it is important to note that the query less_sonorous(s,r) would
succeed because s is a subtype of sibilant. This example
also clearly illustrates how ALE begins each argument on its own
line arranged with the query.
In general, the goal to be solved must be a literal, consisting only of a relation applied to arguments. In particular, it is not allowed to contain conjunction, disjunction, cuts, or other definite clause control structures. To solve a more complex goal, a definite clause must be defined with the complex goal as a body and then the head literal solved, which will involve the resolution of the body.
There are no routines to trace the execution of definite clauses. Future releases of ALE will contain a box port tracer similar to that used for Prolog. At present, the best suggestion is to develop definite clauses modularly and test them from the bottom-up to make sure they work before trying to incorporate them into larger programs.
ALE provides a number of routines for displaying and debugging grammar specifications. After compile-time errors have been taken care of, the queries described in this section can display the result of compilation.
Lexical entries can be displayed using the following form of query :
| ?- lex(kid).
WORD: kid
ENTRY:
cat
QSTORE e_list
SYNSEM basic
SEM property
BODY kid
ARG1 [0] individual
IND [0]
SYN n
ANOTHER? y.
no
As usual, if there are multiple entries, ALE makes a query as
to whether more should be displayed. In this case, there was only one
entry for kid in the categorial grammar in the appendix.
Empty lexical entries can be displayed using :
| ?- empty.
EMPTY CATEGORY:
cat
QSTORE ne_list_quant
HD some
RESTR [0] proposition
SCOPE proposition
VAR [1] individual
TL e_list
SYNSEM forward
ARG basic
SEM property
BODY [0]
IND [1]
SYN n
RES basic
SEM [1]
SYN np
ANOTHER? no.
Note that the number specification was removed to allow the empty
category to be processed with respect to the categorial grammar type
system. As with the other display predicates, empty provides
the option of iterating through all of the possibilities for empty
categories.
Grammar rules can be displayed by name, as in :
| ?- rule forward_application.
RULE: forward_application
MOTHER:
cat
QSTORE [4] list_quant
SYNSEM [0] synsem
DAUGHTERS/GOALS:
CAT cat
QSTORE [2] list_quant
SYNSEM forward
ARG [1] synsem
RES [0]
CAT cat
QSTORE [3] list_quant
SYNSEM [1]
GOAL append([2],
[3],
[4])
ANOTHER? n.
Rules are displayed as most general satisfiers of their mother,
category and goal descriptions. It is important to note that this is
for display purposes only. The rules are not converted to feature
structures internally, but rather to predicates consisting of
low-level compiled instructions. Displaying a rule will also flag any
errors in finding most general satisfiers of the categories and rules
in goals, and can thus be used for rule debugging. This can detect
errors not found at compile-time, as there is no satisfiability
checking of rules performed during compilation.
Macros can also be displayed by name, using :
| ?- macro np(X).
MACRO:
np([0] sem_obj)
ABBREVIATES:
basic
SEM [0]
SYN np
ANOTHER? n.
First note that the macro name itself is displayed, with all
descriptions in the macro name given replaced with their most general
satisfiers. Following the macro name is the macro satisfied by the
macro description with the variables instantiated as shown in the
macro name display. Note that there is sharing between the
description in the macro name and the SEM feature in the result.
This shows where the parameter is added to the macro's description.
Finally, it is possible to display lexical rules, using the following query :
| ?- lex_rule plural_n.
LEX RULE: plural_n
INPUT CATEGORY:
n
NUM sing
PERS pers
OUTPUT CATEGORY:
n
NUM plu
PERS pers
MORPHS:
[g,o,o,s,e] becomes [g,e,e,s,e]
[k,e,y] becomes [k,e,y,s]
A,[m,a,n] becomes A,[m,e,n]
A,B becomes A,B,[e,s]
when fricative(B)
A,[e,y] becomes A,[i,e,s]
A becomes A,[s]
ANOTHER? n.
Note that the morphological components of a rule is displayed in
canonical form when it is displayed. Note that variables in
morphological rules are displayed as upper case characters. When
there is sharing of structure between the input and output of a
lexical rule, it will be displayed as such. As with the other ALE grammar display predicates, if there are multiple solutions to
the descriptions, these will be displayed in order. Also, if there is
a condition on the categories in the form of an ALE definite
clause goal, this condition will be displayed before the morphological
clauses. As with grammar rules, lexical rules are compiled internally
and not actually executed as feature structures. The feature
structure notation is only for display. Also, as with grammar rules,
displaying a lexical rule may uncover inconsistencies which are not
found at compile time.
In this section, we consider the execution of ALE phrase structure grammars. The examples shown in this section have been produced while running with the mini-interpreter off. The mini-interpreter will be discussed in the next section.
The primary predicate for parsing is illustrated as follows :
| ?- rec [john,hits,every,toy].
STRING:
0 john 1 hits 2 every 3 toy 4
CATEGORY:
cat
QSTORE e_list
SYNSEM basic
SEM every
RESTR toy
ARG1 [0] individual
SCOPE hit
HITTEE [0]
HITTER j
VAR [0]
SYN s
ANOTHER? y.
CATEGORY:
cat
QSTORE ne_list_quant
HD every
RESTR toy
ARG1 [0] individual
SCOPE proposition
VAR [0]
TL e_list
SYNSEM basic
SEM hit
HITTEE [0]
HITTER j
SYN s
ANOTHER? y.
no
The first thing to note here is that the input string must be entered
as a Prolog list of atoms. In particular, it must have an opening and
closing bracket, with words separated by commas. No variables should
occur in the query, nor anything other than atoms. The first part of
the output repeats the input string, separated by numbers ( nodes
in the chart) which indicate positions in the string for later use in
inspecting the chart directly. ALE asserts one lexical item for
every unit interval, with empty categories being stored as loops from
every single node to itself. The second part of the output is a
category which is derived for the input string. If there are multiple
solutions, these can be iterated through by providing positive answers
to the query. The final no response above indicates that the
category displayed is the only one that was found. If there are no
parses for a string, an answer of no is returned, as with:
| ?- rec([runs,john]). STRING: 0 runs 1 john 2 no
Notice that there is no notion of ``distinguished start symbol'' in parsing. Rather, the recognizer generates all categories which it can find for the input string. This allows sentence fragments and phrases to be analyzed, as in:
| ?- rec [big,kid].
STRING:
0 big 1 kid 2
CATEGORY:
cat
QSTORE ne_list_quant
HD some
RESTR and
CONJ1 kid
ARG1 [0] individual
CONJ2 big
ARG1 [0]
SCOPE proposition
VAR [0]
TL e_list
SYNSEM basic
SEM [0]
SYN np
ANOTHER? n.
Once parsing has taken place for a sentence using rec/1, it is possible to look at categories that were generated internally. In general, the parser will find every possible analysis of every substring of the input string, and these will be available for later inspection. For instance, suppose the last call to rec/1 executed was rec [john,hits,every,toy], the results of which are given above. Then the following query can be made :
| ?- edge(2,4).
COMPLETED CATEGORIES SPANNING: every toy
cat
QSTORE ne_list_quant
HD every
RESTR toy
ARG1 [0] individual
SCOPE proposition
VAR [0]
TL e_list
SYNSEM basic
SEM [0]
SYN np
ANOTHER? n.
This tells us that from positions 2 to 4, which covers the
string every toy in the input, the indicated category was found.
Even though an active chart parser is used, it is not possible to
inspect active edges. This is because ALE represents active
edges as dynamic structures which are not available after they have
been evaluated.
Using edge/2 it is possible to debug grammars by seeing how far analyses got and inspecting analyses of substrings.
ALE contains a mini-interpreter which allows the user to traverse and edit an ALE parse tree. By default, the mini-interpreter is off when ALE is loaded. To turn the mini-interpreter on, simply type :
| ?- interp. interpreter is active yes | ?-To turn it off again, use nointerp. Any parse created while the mini-interpreter is active will automatically store the following information on the edges added to ALE's chart:
The status of the mini-interpreter has no effect on compilation. The same compiled code is used regardless of whether the mini-interpreter is active or inactive. The mini-interpreter has an effect on two run-time commands: rec/1 and edge/2.
When the mini-interpreter is active, rec/1 operates in one of three modes, add-mode, go-mode, and quiet-mode. When the mini-interpreter is active, rec/1 always begins in add-mode. In add-mode, the user is prompted just before any edge is added. Because ALE parses from right to left, the edges are encountered in that order. The prompt consists of a display of the feature structure for the edge, followed by the mini-interpreter information for that edge, followed by an action-line, which lists the options available to the user. For example:
| ?- rec([kim,sees,sandy]).
STRING:
0 kim 1 sees 2 sandy 3
word
QRETR list_quant
QSTORE e_set
SYNSEM synsem
LOC loc
CAT cat
HEAD noun
CASE case
MOD none
PRD bool
MARKING unmarked
SUBCAT e_list
CONT nom_obj
INDEX [0] ref
GEN gend
NUM sing
PER third
RESTR e_set
Edge created for category above:
from: 2 to: 3
string: sandy
rule: lexicon
# of dtrs: 0
Action(add,noadd,go(-#),quiet,break,dtr-#,abort)?
|:
We see, in this example, the action-line for rec. If the user
selects add , the edge is added, and rec
proceeds, in add-mode, as usual. If noadd is
selected, the edge is not added, and rec proceeds in add-mode.
go puts the mini-interpreter into go-mode. In go-mode, rec proceeds to add all of the edges that it would if the mini-interpreter were inactive, or to think of it another way, it functions as if the user always chose add, but it does not stop to ask. As it adds the edges, it displays them, along with their mini-interpreter information. go suffixed with a number , e.g. go-1, puts the mini-interpreter into go-mode until it encounters an edge whose left node is that number, and then, beginning with that edge, automatically switches back into add-mode. With ALE's current parsing strategy, go- N will remain in go-mode until it encounters the first edge corresponding to the ( N+1)st lexical item in the string being parsed.
quiet puts the mini-interpreter into quiet-mode. Quiet-mode is just like go-mode, except that the feature structures and their mini-interpreter information are not printed. What separates quiet-mode from simply turning the mini-interpreter off is that the mini-interpreter information is still recorded in quiet-mode, which the user can examine later with edge/2.
break simply invokes the Prolog break commmand, placing the user into an interpreter with a new break-level. Edges that have been added so far can be examined and retracted at this time. When the user pops out of the break, the current prompt is redisplayed.
dtr-N displays the Nth daughter, its mini-interpreter information, and the action-line for dtr:
Action(retract,dtr-#,parent)? |:retract removes the daughter from the chart. When the parse continues, ALE grammatical rules will not be able to use that edge. The current edge which is the parent of this daughter, however, can still be added. dtr-N has the same effect as in the rec action-line. parent returns to the current edge's parent and its action-line (either rec or dtr).
The mini-interpreter will not display any edge that has already been retracted.
Currently, the mini-interpreter cannot traverse the daughter(s) corresponding to a cats> operator in a rule unless it is preceded by one or more cat> operators. The number of daughters will not include them, and the daughters which are visible to the mini-interpreter will be numbered beginning with 1, i.e. the first cat> daughter will be numbered as the first daughter. This shortcoming will be corrected in future versions.
If abort is selected, the parse is aborted. All of the edges added so far remain in memory until the next rec statement. The edge which was displayed when abort was chosen is discarded.
When the mini-interpreter is active, edge/2 displays not only the feature structures of the edges spanning the two argument nodes, but their mini-interpreter information, and the action-line for edge/2, e.g.:
| ?- edge(0,3).
COMPLETED CATEGORIES SPANNING: kim sees sandy
phrase
QRETR e_list
QSTORE e_set
SYNSEM synsem
LOC loc
CAT cat
HEAD verb
MARKING unmarked
SUBCAT e_list
CONT psoa
NUCLEUS see
SEEN [1] ref
GEN gend
NUM sing
PER third
SEER [0] ref
GEN gend
NUM sing
PER third
QUANTS e_list
Edge created for category above:
index: 21
from: 0 to: 3
string: kim sees sandy
rule: schema1
# of dtrs: 2
Action(retract,dtr-#,next)?
|:
Every edge that is actually asserted into the chart is assigned a
unique number, called an index (since ALE has no
subsumption checking yet, the feature structures themselves may not be
unique), which edge displays also. retract and dtr
behave the same as in the dtr action-line.
next tells the mini-interpreter that the user is done
traversing the parse tree rooted at the current edge. ALE then
asks if the user wants to search for more edges spanning the two
argument nodes given to edge/2, just as when the
mini-interpreter is inactive.
Note that in order for edge/2 to display the mini-interpreter information for an edge, the mini-interpreter must be active, and the mini-interpreter must have been active when the edge in question was asserted. If the former is not true, and the latter is, the information will be ignored. If the former is true, but the latter is not, the following line will be appear in place of the information:
(Edge created while interpreter was inactive)