% ALE:  The Attribute Logic Engine  /  User's Guide Version 3.2.1
% Bob Carpenter, SpeechWorks Research
% Gerald Penn, Department of Computer Science, University of Toronto
% Process in LaTeX

% STYLE
% ==============================================================================

\documentclass[fleqn,leqno,11pt,twoside]{report}
\usepackage{epsf}
\pagestyle{headings}

\setlength{\oddsidemargin}{0.625in}
\setlength{\evensidemargin}{0.625in}
\setlength{\textwidth}{5.5in}

\setlength{\topmargin}{-0.25in}
\setlength{\textheight}{9.0in}


\setlength{\headsep}{0.275in}            

\sloppy

\raggedbottom


% AV-Logic Commands
% -----------------
\newcommand{\type}[1]{\mbox{\bf #1}}

% DAGs
% -------------
\newcommand{\feat}[1]{\mbox{{\sc #1}}}
\newcommand{\ind}[1]{\setlength{\fboxsep}{2pt}\fbox{\scriptsize #1}}
\newcommand{\sdg}[2]{\left[ 
                         \! \!
                         \begin{array}{l}
                           #1 \\
                           #2
                         \end{array}
                         \! \!
                       \right] }
\newcommand{\sdi}[2]{#1 \fv \! #2 \\[2pt]}
\newcommand{\fv}{\colon}

% Text Commands
% -------------

\newcommand{\acronym}[1]{{\sc #1}\index{{\sc #1}}}

% Picture Commands
% ----------------
\newcommand{\picpar}[1]{\begin{minipage}[t]{0pt}
                        \begin{tabbing}#1\end{tabbing}
                        \end{minipage}}
\newcommand{\puttop}[2]{\put#1{\makebox(0,0)[t]{#2}}}
\newcommand{\putleft}[2]{\put#1{\makebox(0,0)[l]{#2}}}
\newcommand{\putcenter}[2]{\put#1{\makebox(0,0){#2}}}

% References
% ----------
\newcommand{\annotate}[1]{\begin{quote}
                          \vspace{-8pt} {\small #1}
                          \end{quote}}

% Error Messages
% --------------
\newcommand{\errorcond}[2]{\mbox{ } \\[-6pt]
                           {\tt #1}
                           \begin{quote} \vspace{-4pt} #2 \end{quote}}

% Sections
% --------

\newcommand{\mychapter}[1]{\chapter{#1}}
\newcommand{\mychapterstar}[1]{\chapter*{#1}}
\newcommand{\mysection}[1]{\section{#1}}
\newcommand{\mysectionstar}[1]{\section*{#1}}
\newcommand{\myappendix}[1]{\chapter{#1}}
\newcommand{\mysubsection}[1]{\subsection{#1}}
\newcommand{\mysubsubsection}[1]{\subsubsection*{#1}}


% DOCUMENT
% ===================================================


\begin{document}

\begin{titlepage}
\begin{center}
\mbox{ } \\[.75in]
\mbox{ }\fbox{\Huge\bf ALE}\mbox{ }
\mbox{ } \\[.5in]
{\Huge\bf The Attribute Logic Engine} \\[.25in]
{\huge\bf User's Guide} \\[1.0in]
{\LARGE Version 3.2.1} \\[6pt]
{\Large December 2001} \\[1.5in]
% {\Large Bob Carpenter} \qquad \qquad {\Large Gerald Penn} \\[3pt]
\hspace*{0.7cm}
\begin{tabular}{lll}
{\Large Bob Carpenter} & \hspace*{2cm} & {\Large Gerald Penn} \\[3pt]
{\large SpeechWorks Research} & & {\large Department of Computer Science}
 \\[3pt]
{\large 55 Broad St.} & & {\large University of Toronto} \\[3pt]
{\large New York, NY 10004} & & {\large 10 King's College Rd.} \\[3pt]
{\large USA} & & {\large Toronto M5S 3G4} \\[3pt]
& & {\large Canada} \\[3pt]
{\large carp{\normalsize @}colloquial.com} & & {\large gpenn{\normalsize @}cs.toronto.edu} \\[1in]
\end{tabular}
\end{center}

\begin{center}
\begin{tabular}{l}
\copyright 1992--1995, Bob Carpenter and Gerald Penn\\
\copyright 1998, 1999, 2001, Gerald Penn
\end{tabular}
\end{center}
\end{titlepage}


\pagenumbering{roman}

\tableofcontents

\mychapterstar{Preface -- Version 3.0}

{\sc ale} 3.0 is completely compatible with {\sc ale} 2.0 grammars,
and adds the following new features:
%
\begin{itemize}
\item A semantic-head-driven generator, based on the algorithm
  presented in Shieber et al. (1990).  The generator was adapted to
  the logic of typed feature structures by Octav Popescu in his
  Carnegie Mellon Master's Thesis, Popescu (1996).  Octav also wrote
  most of the generation code for this release.  Grammars can be
  compiled for parsing only, generation only, or both.  Some glue-code
  is also available from the {\sc ale} homepage, to parse and generate
  with different grammars through a unix pipe.
\item A source-level debugger with a graphical XEmacs interface.
  This debugger works only with SICStus Prolog 3.0.6 and higher.  A
  debugger with reduced functionality will be made available to SWI
  Prolog users in a later release.  This debugger builds on, and
  incorporates the functionality of the code for the SICStus
  source-level debugger, written by Per Mildner at Uppsala University.
\item Functional descriptions.  Instead of binding variables in a
  description and calling a procedural attachment, e.g., {\tt a cons
  f:X,g:Y,h:Z goal append(X,Y,Z)}, it is now possible to incorporate
  certain functional relations into descriptions themselves, e.g., {\tt
  a cons f:X,g:Y,h:append(X,Y)}.
\item {\tt a\_/1} atoms.  There are now an infinite number of atoms
  (types with no appropriate features), implicitly declared in every
  signature.  These atoms can be arbitrary Prolog terms, including
  unbound variables, and can be used wherever normal {\sc ale} types
  can, e.g., {\tt f:(a\_ p(3.7))}.  {\tt a\_/1} atoms are extensional
  as Prolog terms, i.e., are taken to be identical according to the
  Prolog predicate, {\tt ==/2}.  In particular, this means that ground
  atoms behave exactly as {\sc ale} extensional types.
\item Optional edge subsumption checking.  For completeness of
  parsing, one only needs to ensure that, for every pair of nodes in
  the chart, the most general feature structure spanning those nodes
  is stored in the chart.  This can reduce the number of edges in many
  domains.
\item An autonomous {\tt intro/2} operator.  Features can now be
  declared on their own in a separate part of the grammar.
\item Default specifications for types.  These are NOT default types.
  If a type appears on the right-hand side of a {\tt sub/2} or {\tt
    intro/2} specification, but not on the left-hand side of one, {\sc
    ale} will assume this type is maximal, i.e., assume the
  specification, {\tt $Type$ sub [].}  Similarly, if it occurs on a
  left-hand side, but not on a right-hand side, {\sc ale} will assume
  the type is immediately subsumed by {\tt bot}, the most general
  type.  In both cases, {\sc ale} will announce these assumptions
  during compilation.
\item Several bug corrections and more compile-time warning and error
  messages.
\item An SWI Prolog 2.9.7 port.  Version 3.0 will be the last version
  for which we will port the system to Quintus Prolog.  We will now
  support {\sc ale} for SICStus Prolog and SWI Prolog.  The SICStus
  version of the system works for SICStus Prolog 3.0.5 and higher,
  except for the source-level debugger, which requires version 3.0.6.
\end{itemize}

We would like to thank all of the users who have supplied us with
feedback and suggestions over the past few years for how to improve
{\sc ale}.  In particular, we would like to thank Ion Androutsopoulos,
Mike Calcagno, Mats Carlsson, Frederik Fouvry, Gertjan van Noord,
Peter van Roy, Margriet Verlinden, and Jan Wielemaker for their
patient assistance.  This release incorporates many, but unfortunately
not all of those changes.  Quite a few more will be made in the next
several releases, along with many performance improvements.
\\
\mbox{ } \hfill Bob Carpenter and Gerald Penn
\\
\mbox{ } \hfill Murray Hill and Tuebingen, March 1998


\mychapterstar{Preface -- Version 2.0}

{\sc ale} 2.0 is a proper extension of version 1.0.  Specifically,
version 2.0 will run any grammar that will run under version 1.0.  But
version 2.0 includes many extensions to version 1.0, including the
following.
%
\begin{itemize}
\item  Inequations
\item  Extensionality
\item  General Constraints on Types
\item  Mini-interpreter
\item  Error-suppression
\end{itemize}
%
Inequations allow inequational constraints to be imposed between two
structures.  Extensionality allows structures of specified types to be
identified if they are structurally identicial.  Together, these
provide the ability to simulate Prolog II programs (Colmerauer 1987).
{\sc ale} 2.0 also allows general constraints to be placed on types,
using arbitrary descriptions from the constraint language, including
path equations, inequations and disjunctions, and procedural
attachments.  It also has a mini-interpreter, which allows the user to
traverse and edit an {\sc ale} parse tree.  Error messages for
incompatible descriptions are now automatically disabled during
lexicon and empty category compilation.

The second release of {\sc ale}, Version 2.0, is based on an extension of
the first version of {\sc ale}, that was completed for Gerald Penn's (1993)
MS Project in the Computational Linguistics Program at Carnegie Mellon
University.

There are many people whom we would like to thank for their comments
and feedback on version 1.0 and $\beta$-versions of 2.0.  These people
have actually used the system in their research and have thus had the
best opportunity to provide us with practical feedback.  First, we
would like to thank the first group of users, housed at Sharp
Laboratories of Europe, located in Oxford, England, including Pete
Whitelock, Antonio Sanfillipo, and Osamu Nishida.  They not only used
the system but provided feedback on the code.  Secondly, the group at
University of T\"ubingen, who are developing a competing system,
Troll, have rigorously tested existing systems, including {\sc ale}, both
for their ability to express grammars naturally and for efficiency.
Specifically, we would like to thank Detmar Meurers, Dale Gerdemann,
Thilo G\"otz, Paul King, John Griffith, and Erhard Hinrichs.  John and
Thilo also provided the changes necessary for the system to run
directly in Quintus Prolog.  This group is undoubtedly the best
informed when it comes to implemented grammar formalisms.  We would
also like to thank the grammar development group at Stanford
University, including Ivan Sag, Chris Manning, Suzanne Riehemann.  We
would further like to thank Bob Kasper, Carl Pollard, and Andreas
Kathol of the Ohio State University, for a great deal of feedback on
the design of HPSG grammars in general, and {\sc ale} implementations of
them in particular.  Chris Manning, in addition, found a bug in
SICStus Prologs prior to 2.1.8, which prevented cyclic structures from
being used in completed chart edges, a bug found by both Steven Bird
of Edinburgh and C.~J. Rupp of IDSIA.  Their feedback on Bob
Carpenter's prototype implementation of HPSG for English led to the
design of Gerald Penn's much more comprehensive implementation of HPSG
and was the primary impetus for the importation of general type
constraints into version 2.0.  Next, we would like to thank Claire
Gardent, who has been using {\sc ale} to develop discourse grammars in
Amsterdam.  We should also thank Carsten Guenther and Markus Walther,
of the Universities of Hamburg and D\"usseldorf, respectively, who
have used the system to develop phonological grammars.  Finally, we
should thank Michael Mastroianni, who implemented a comprehensive
approach to constraint-based phonology in {\sc ale} (Mastroianni 1993).  He
suffered through early, buggy versions of the system, thus sparing the
rest of us much of that pain. The feedback we received from these
users was invaluable.

We would like to thank EAGLES, the European Advisory Group on
Linguistic Engineering Standards, for allowing us to present our
system at a meeting in Saarbr\"ucken in March 1993 of the European
Expert Group on Linguistic Formalisms devoted to implemented
formalisms.  We learned a great deal from the other participants in
the workshop including especially Jochen D\"orre, Michael Dorna, and
Martin Emele, of Stuttgart, and Andreas Podelski, then associated with
the Digital Equipment Paris Research Lab.  We also benefitted from
discussions with Hans Uszkoreit, Rolf Backofen, and Uli
Krieger, of Saarbr\"ucken,  Steve Pulman from SRI in Cambridge, and
C.~J.~Rupp and Graham Russell, of ISSCO in Switzerland.

We had many discussions of the {\sc ale} formalism at the HPSG workshop
running concurrently with the LSA Linguistic Institute in Columbus.
We would especially like to thank Gregor Erbach for comments on our
system, including benchmark test results.  We would also like to thank
Hiroshi Tusda, of the Institute for New Gernateion Computer
Technology, for discussion of our systems and comparisons to his
system, cu-Prolog.  We also discussed {\sc ale} heavily during the
workshop on implementations of attribute-value logics, during the 1993
Summer School on Logic, Language, and Information in Lisbon, Portugal.
We especially benefitted from discussions with Suresh Manandhar, of
the University of Edinburgh, and Gerrit Rentier of Tilburg University,
and Gert Webelhuth of the University of North Carolina, among those we
have not already thanked.  We also benefitted from discussions with Ed
Stabler and Mark Johnson, and from sitting in on their class on the
implementation of constraint-based grammars.

We would also like to thank Ann Copestake and Ted Briscoe, of the
Cambridge Computing Laboratory, for feedback on the design of the
system.  

We would like to thank Richard O'Keefe, who provided some invaluable
feedback on coding style.  Of course, any glitches or failure to
follow his excellent example are our own.

We would also like to thank Elizabeth Hinkelman, who runs the Software
Registry, and Mark Kantrowitz, who administers the Prolog Resource
Guide and the Prime Time Freeware for AI CD-ROM.  They have helped in
publicizing the system description as well as providing access.  

The extensions we have
not made, though would like to, include the addition of:
\begin{itemize}
\item  Primitive, Atomic Data Types
\item  Parametric Types
\item  Partial Type Inference
\item  Assert Mode in Compiler
\item  Peephole Code Optimization
\item  Subsumption Checking of Chart Edges
\end{itemize}
The incorporation of Prolog data types such as Real, Integer,
Character, and String, is straightforward theoretically, but not so
straightforward in terms of {\sc ale}.  The same goes for parametric
polymorphism at the type level.  Partial type inference could provide
a great deal of optimization in some circumstances.  We could not
figure out how to incoprorate these three changes without drastically
modifying the underlying representations and algorithms.  The
remaining changes are lying dormant because we have other obligations.

The next wave of development of attribute-logic grammars should not be
in Prolog, but rather through the use of a direct abstract machine.
Bob Carpenter has worked on an abstract machine with Yan Qu, in the
context of her MS project in the Carnegie Mellon Computational
Linguistics Program, and with Shuly Wintner, of the Technion, in
Haifa, Israel, who is writing a PhD dissertation on the topic.  Such
an undertaking is also underway among the LIFE community, led by
Hassan A\"{\i}t-Kaci, Andreas Podelski, and Peter van Roy.

We would like to thank a number of people for discovering bugs and
providing comments on Version 2.0:  Ingo Schroeder, Frank Morawietz,
Detmar Meurers, Rob Malouf, Frederik Fouvry, Jo Calder, and Suresh
Manandhar.

Finally, we would like to thank Jo Calder,Chris Brew, Kevin Humphreys,
and Mike Reape, who developed the Pleuk grammar development
environment as well as interfacing it to {\sc ale}.  Details of that system
can be found in the appropriate Appendix.  

This material is based upon work supported under a National Science
Foundation Graduate Research Fellowship (for Gerald Penn).  Any
opinions, findings, conclusions or recommendations expressed in this
publication are those of the author(s) and do not necessarily reflect
the views of the National Science Foundation.
\\
\mbox{ } \hfill Bob Carpenter and Gerald Penn
\\
\mbox{ } \hfill Pittsburgh,  August 1994 


\mychapterstar{Preface -- Version Beta}

A number of people have asked me to make this system, along with its
documentation, available to the public.  Now that it's available, I
hope that it's useful.  But a word of caution is in order.  The system
is still only a prototype, hence the label ``version $\beta$.''  

Any bug reports would be greatly appreciated.  But what I'd really
like is comments on the functionality of the system, as well as on the
utility of its documentation.  I am also interested in hearing of any
applications that are made of the system.  I would also be glad to
answer questions about the system.  I have tried to document the
strategies used by {\sc ale} in this guide.  I have also tried to
comment the code to the point where it might be adaptable by others.
I would, of course, be interested in any kind of improvements or
extensions that are discovered or developed, and would like to have
the chance to incorporate any such improvements in future versions of
this package.

In the implementation, I have endeavored to follow the logic
programming methodology laid out by O'Keefe (1990), but there are many
spots where I have fallen short.  Thus the code is not as fast as it
could be, even in Prolog.  But I view this system more as a prototype,
indicating the utility of a typed logic programming and grammar
development system.  Borrowing techniques from the {\sc wam} directly,
implementing an abstract machine C, would lead to roughly a 100-fold
speedup, as there is no reason that {\sc ale} should be slower than
Prolog itself.

I would like to acknowledge the help of Gerald Penn in working through
many implementation details of a general constraint resolver, which
was the inspiration for this implementation.  This version of the
system is a great improvement on the last version due to Gerald's work
on the system.  Secondly, I would like to thank Michael Mastroianni,
who has actually used the system to develop grammars for phonology.
Finally, I would like to thank Carl Pollard and Bob Kasper for looking
over a grammar of {\sc hpsg} coded in {\sc ale} and providing the
impetus for the inclusion of empty categories and lexical rules.

The system is available without charge from the author.  It is
designed to run in either SICStus or Quintus Prologs.  
\\[4pt]
\mbox{ } \hfill Bob Carpenter \\
\mbox{ } \hfill Pittsburgh, 1992


\cleardoublepage
\pagenumbering{arabic}
\setcounter{page}{1}

\mychapter{Introduction}

This report serves as an introduction to both the {\sc ale} formalism
and its Prolog implementation.  {\sc ale} is an integrated phrase
structure parsing and definite clause logic programming system in
which the terms are typed feature structures.  Typed feature
structures combine type inheritance and appropriateness specifications
for features and their values.  The feature structures used in {\sc
ale} generalize the common feature structure systems found in the
linguistic programming systems {\sc patr-ii} and {\sc fug}, the
grammar formalisms {\sc hpsg} and {\sc lfg}, as well as the logic
programming systems Prolog-II and {\sc login}.  Programs in any of
these languages can be encoded directly in {\sc ale}.  

Terms in grammars and logic programs are specified in {\sc ale} using
a typed version of Rounds and Kasper's attribute-value logic with
variables.  At the term level, we have variables, types, feature value
restrictions, path equations, inequations, general constraints, and
disjunction.  The definite clause programs allow disjunction, negation
and cut, specified with Prolog syntax.  Phrase structure grammars are
specified in a manner similar to {\sc dcg}s, allowing definite clause
procedural attachment.  The grammar formalism also fully supports
empty categories.  Lexical development is supported by a very general
form of lexical rule which operates on both categories and surface
strings.  Macros are available to help organize large descriptions,
either in programs or in grammars.  Both definite clause programs and
grammars are compiled into abstract machine instructions.  These
instructions are then interpreted by an emulator compiled from the
type specifications.  Like Prolog compilers, a structure copying
strategy is used for matching both definite clauses and grammar rules.

For parsing, {\sc ale} compiles from the grammar specification a
Prolog-optimized bottom-up, dynamic chart parser.  Definite clauses
are also compiled into Prolog.  As it stands, the current version of
{\sc ale}, running the feature-structure-based naive reverse of a
30-element list on top of the SICStus 3.7 native code compiler, runs
at roughly 152,300 logical inferences per second (LIPS) on a Sun Ultra
Enterprise 450.  This is slightly faster than the speed of the SICStus
3.7 interpreter on the Prolog naive reverse, 60\% as fast as SWI
Prolog 3.2.7, and about 9\% as fast as the SICStus 3.7 compact-code
compiler.  The definite clause compiler performs last call
optimization, but does not index first arguments or use specialised
list cells at the WAM level.  Thus it will perform relatively well
versus non-optimized interpreters, but lag further behind compiled
grammars when programs are written in a more sophisticated manner than
naive reverse.

Full details of the theory behind {\sc ale} can be found in Carpenter
(1992).  

The user who is only interested in definite clause programming can
skip the material on phrase structure grammars, while those interested
in only grammars without procedural attachments may skip the material
in the section on definite clauses.


\mychapter{Prolog Preliminaries}

While it is not absolutely necessary, some familiarity with logic
programming in general, and Prolog in particular, is helpful in
understanding the definite clause portion of {\sc ale}.  Similarly,
experience with unification grammar systems such as {\sc patr-ii},
{\sc dcg}s, or {\sc fug} is helpful in understanding the phrase
structure component of the system.  In particular, writing efficient
programs and grammars in {\sc ale} involves the same kinds of
strategies necessary for writing efficient programs in Prolog or {\sc
patr-ii}.  For those not familiar with Prolog, the sequence of two
books by Sterling and Shapiro (1986) and by O'Keefe (1990) are
excellent general introductions to the theory and practice of logic
programming.  For those not familiar with unification-based grammar
formalisms, Shieber (1986), Gazdar and Mellish (1987) and Pereira and
Shieber (1987) are useful resources.

For those not familiar with Prolog, we need to point out the salient
features of the language which will be assumed throughout this report.
This section contains all of the information necessary about Prolog
required to run {\sc ale}.  


\mysection{Terms}

A Prolog {\em constant} is composed of either a sequence of
characters and/or under-scores, beginning with a lower case letter, a
number, or any sequence of symbols surrounded by apostrophes.  So,
{\tt abc, johnDoe, b\_17, 123, 'JohnDoe', '65\$'}, and {\tt '\_65a.'}
are constants, but {\tt A19, JohnDoe, B\_112, \_au8}, and {\tt [dd,e]}
are not (although see the warning at the end of this section).  
 A variable, on the other hand, is any string of letters,
underscores or numbers beginning with a capital letter.  Thus {\tt C,
C\_foo,} and {\tt TR5ab} are variables, but {\tt 1Xa, aXX}, and {\tt
\_Xy}%
\footnote{Technically, a variable may begin with an underscore, but
such variables, said to be {\em anonymous}, have a very different
status than those which begin with a capital letter.  The use of
anonymous variables is discussed later.}
%
are not.  

In general, it is a bad idea to have constants or variables
which are only distinguished by the capitalization of some of their
letters.  For instance, while {\tt aBa} and {\tt aba} are different
constants, they should not both be used in one program.  One reason
for this in the context of {\sc ale} is that the output routines adopt
standard capitalization conventions which hide the differences between
such constants.  

{\bf Warning:}  As pointed out to us by Ingo Schroeder, constants or
atoms beginning with a capital letter are not treated properly by the
compiler.  Thus constants such as {\tt 'Foo'} should not be used.  


\mysection{Space and Comments}

In your own program and grammar files, extra {\em whitespace} between
symbols beyond that needed to separate constants or variables is
ignored.  Whitespace consists of either spaces, blank lines or line
breaks are ignored.  This allows you to format your programs in a
manner that is readable.  Furthermore, any symbols on a line appearing
after a {\tt \%} symbol are treated as {\em comments} and ignored.


\mysection{Running Prolog}

To fire up Prolog locally, you should contact your systems
administrator.  You should have either SICStus or SWI Prolog, or a
Prolog compiler compatible with one of these.  Once Prolog is fired
up, you will see a {\em prompt}.  The Prolog prompt should look like:
%
\begin{verbatim}
  | ?-
\end{verbatim}
%

It is important that Prolog be invoked from a directory for which the
user has write permission.  {\sc ale}, in the process of compiling
user programs, writes a number of local files.

\mysection{Queries}

What you type after the prompt is called a {\em query}.  Queries
should always end with a period and be followed by a carriage return.
In fact, all of the grammar rules, definite clauses, macros and
lexical entries in your programs should also end with periods.  Most
of the interface in {\sc ale} is handled directly by top-level Prolog
queries.  Many of these will return {\tt yes} or {\tt no} after they
are called, the significance of which within {\sc ale} is explained on
a query by query basis.


\mysection{Running ALE}

To run {\sc ale}, it is only necessary to type the following query:
%
\begin{quote}
  {\tt | ?- compile({\it File}).}\label{COMPILE}  
\end{quote}
%
where {\it File} is the file in which the file {\tt ale.pl} resides.
SWI Prolog users must type:
%
\begin{quote}
  {\tt | ?- consult({\it File}).}\label{CONSULT}
\end{quote}
%
Note that {\it File} does not have to be local to the directory from
which Prolog was invoked.  

In SICStus Prolog, when {\sc ale} is loaded, it turns on Prolog
  character escapes.  {\sc ale} will not be able to generate code
  properly during compilation without this.

\mysection{Exiting Prolog and Breaking}

To exit from Prolog, you can type {\tt halt}\label{HALT} at any prompt
(followed by a period, of course).

If you find Prolog hanging at some point, and you are working on a
standard Unix implementation, typing {\tt control-c}\label{CTRLC}
should produce something like the following message:
%
\begin{verbatim}
  Prolog interruption (h for help)?
\end{verbatim}
%
You should reply with the character {\tt a}, with or without a
following period, followed by a carriage return.  If this doesn't
work, typing {\tt control-z}\label{CTRLZ} should take you out of
Prolog altogether.


\mysection{Saved States}

All information concerning an {\sc ale} state is encoded in the
current Prolog state.  Thus, any options presented by the local system
to save Prolog states should be able to save {\sc ale} states.  


\mychapter{Feature Structures, Types and Descriptions}

This section reviews the basic material from Carpenter (1992), Chapters
1--10, which is necessary to use {\sc ale}.  


\mysection{Inheritance Hierarchies}

{\sc ale} is a language with strong typing.  What this means is that
every structure it uses comes with a type.  These types are arranged
in an inheritance hierarchy, whereby type constraints on more general
types are inherited by their more specific subtypes, leading to what
is known as {\em inheritance-based polymorphism}.  Inheritance-based
polymorphism is a cornerstone of object-oriented programming.  In this
section, we discuss the organization of types into an inheritance
hierarchy.  Thus many types will have {\em subtypes}, which are more
specific instances of the type.  For instance, {\tt person} might have
subtypes {\tt male} and {\tt female}.  

{\sc ale} does much of its processing of types at compile time, as it
is reading and processing the grammar file.  Thus the user is required
to declare all of the types that will be used along with the subtyping
relationship between them.  An example of a simple
{\sc ale} type declaration is as follows:
%
\begin{verbatim}
  bot sub [b,c].          % two basic types -- b and c
    b sub [d,e].
      d sub [g,h].        
      e sub [].
    c sub [d,f].          % b and c unify to d
      f sub [].
\end{verbatim}
%
There are quite a few things to note about this declaration.  The
types declared here are \label{BOT}{\tt bot, b, c, d, e, f} and {\tt
  g}.  Note that each type that is mentioned gets its own
specification.  Of course, the whitespace is not important, but it is
convenient to have each type start its own line.  A simple type
specification consists of the name of the type, followed by the
keyword {\tt sub}\label{SUB}, followed by a list of its subtypes
(separated by whitespace).  In this case, {\tt bot} has two subtypes,
{\tt b} and {\tt c}, while {\tt f, d} and {\tt e} have no subtypes.
The subtypes are specified by a Prolog list.  In this case, a Prolog
list consists of a sequence of elements separated by commas and
enclosed in square brackets.  Note that no whitespace is needed
between the list brackets and types, between the types and commas, or
between the final bracket and the period.  Whitespace is only needed
between constants.  The extra whitespace on successive lines is
conventional, indicating the level in the ordering for the user, but
is ignored by the program.  Also notice that there are comments on two
of the lines; recall that comments begin with a {\tt
  \%}\label{PERCENT} sign and continue the length of the line.  Every
type (except {\tt a\_/1} atoms, discussed below) must have at most one
{\tt sub} declaration, i.e., all of the immediate subtypes must be
declared in one declaration.

The subtyping relation is only specified by {\it immediate} subtyping
declarations; but subtyping itself is transitive.  Thus, in the
example, {\tt d} is a subtype of {\tt c}, and {\tt c} is a subtype of
{\tt bot}, so {\tt d} is also a subtype of {\tt bot}.  The user only
needs to specify the direct subtyping relationship.  The transitive
closure of this relation is computed by the compiler.  While redundant
specifications, such as putting {\tt d} directly on the subtype list
of {\tt bot}, will not alter the behavior of the compiler, they are
confusing to the reader of the program and should be avoided.  In
addition, the derived transitive subtyping relationship must be
anti-symmetric.  In particular, this means that there should not be
two distinct types each of which is a subtype of the other.

There are two additional restrictions on the inheritance hierarchy
beyond the requirement that it form a partial order.  First, there is
a special type {\tt bot}, which must be declared as the unique most
general type.  In other words, every type must be a subtype of {\tt
bot}.  If a type is used on the left-hand side of a {\tt sub}
declaration, but never declared as a sub-type of anything else, it is
assumed that this type is an immediate subtype of {\tt bot}.
Similarly, {\sc ale} assumes that all types for which no subtypes are
declared are maximal, i.e., have no subtypes.

The second and more subtle restriction on type hierarchies is that
they be {\em bounded complete}.  Since type declarations must be
finite, this amounts to the restriction that every pair of types which
have a common subtype have a unique most general common subtype.  In
the case at hand, {\tt b} and {\tt c} have three common subtypes, {\tt
d, g}, and {\tt h}.  But these subtypes of {\tt b} and {\tt c} are
ordered in such a way that {\tt d} is the most general type in the
set, as both {\tt g} and {\tt h} are subtypes of {\tt d}.  An example
of a type declaration violating this condition is:
%
\begin{verbatim}
  bot sub [a,b].
    a sub [c,d].
      c sub [].
      d sub [].
    b sub [c,d].
\end{verbatim}
%
The problem here is that while {\tt a} and {\tt b} have two common
subtypes, namely {\tt c} and {\tt d}, they do not have a most general
common subtype, since {\tt c} is not a subtype of {\tt d}, and {\tt d}
is not a subtype of {\tt c}.  In general, a violation of the bounded
completeness condition such as is found in this example can be patched
without destroying the ordering by simply adding additional types.
In this case, the following type hierarchy preserves all of the
subtyping relations of the one above, but satisfies bounded
completeness:
%
\begin{verbatim}
  bot sub [a,b].
    a sub [e].
      e sub [c,d].
        c sub [].
        d sub [].
    b sub [e].
\end{verbatim}
%
In this case, the new type {\tt e} is the most general subtype of {\tt
a} and {\tt b}.  

This last example brings up another point about inheritance
hierarchies.  When a type only has one subtype, the system provides a
warning message (as opposed to an error message).  This condition will
not cause any compile-time or run-time errors, and is perfectly
compatible with the logic of the system.  It is simply not a
very good idea from either a conceptual or implementational point of
view.  For more on this topic, see Carpenter (1992:Chapter 9).  
 

\mysection{Feature Structures}

The primary representational device in {\sc ale} is the {\em typed
feature structure}.  In phrase structure grammars, feature structures
model categories, while in the definite clause programs, they serve
the same role as first-order terms in Prolog, that of a universal data
structure.  Feature structures are much like the frames of AI systems,
the records of imperative programming languages like C or Pascal, and
the feature descriptions used in standard linguistic theories of
phonology, and more recently, of syntax.

Rather than presenting a formal definition of feature structures,
which can be found in Carpenter (1992:Chapter 2), we present an
informal description here.  In fact, we begin by discussing feature
structures which are not necessarily well-typed.  In the next section,
the type system is presented.

A feature structure consists of two pieces of information.  The first
is a type.  Every feature structure must have a type drawn from the
inheritance hierarchy.  The other kind of information specified by a
feature structure is a finite, possibly empty, collection of
feature/value pairs.  A feature value pair consists of a feature and a
value, where the value is itself a feature structure.  The difference
between feature structures and the representations used in phonology
and in {\sc gpsg}, for instance, is that it is possible for two different
substructures (values of features at some level of nesting) to be
token identical in a feature structure.  Consider the following
feature structure drawn from the lexical entry for {\em John} in the
categorial grammar in the appendix, displayed in the output notation
of {\sc ale}:
%
\begin{verbatim}
  cat
  QSTORE e_list
  SYNSEM basic
         SEM j  
         SYN np
\end{verbatim}
%
The type of this feature structure is {\tt cat}, which is interpreted
to mean it is a category.  It is defined for two features, {\tt
QSTORE} and {\tt SYNSEM}.  As can be seen from this example, we follow
the {\sc hpsg} notational convention of displaying features in all caps,
while types are displayed in lower case.  Also note that features and
their values are printed in alphabetic order of the feature names.  
In this case, the value of the {\tt QSTORE} feature is
the simple feature structure of type {\tt e\_list},%
%
\footnote{Set values, like those employed in {\sc hpsg}, are not
supported by {\sc ale}.  In the categorial grammar in the appendix,
they are represented by lists and treated by attached procedures for
union and selection.}
%
which has no feature
values.  On the other hand, the feature {\tt SYNSEM} has a complex
feature as its value, which is of type {\tt basic}, and has two feature
values {\tt SEM} and {\tt SYN}, both of which have simple values.  

This last feature structure doesn't involve any structure sharing.
But consider the lexical entry for {\em runs}:
%
\begin{verbatim}
  cat
  QSTORE e_list
  SYNSEM backward
         ARG basic
             SEM [0] individual
             SYN np
         RES basic
             SEM run
                 RUNNER [0] 
             SYN s
\end{verbatim}
%
Here there is structure sharing between the path {\tt SYNSEM ARG SEM}
and the path {\tt SYNSEM RES SEM RUNNER}, where a {\em path} is simply
a sequence of features.  This structure sharing is indicated by the
{\em tag} {\tt [0]}.  In this case, the sharing indicates that the
semantics of the argument of {\em runs} fills the runner role in the
semantics of the result.  Also note that a shared structure is only
displayed once;  later occurrences simply list the tag.  Of course,
this example only involves structure sharing of a very simple feature
structure, in this case one consisting of only a type with no
features.  In general, structures of arbitrary complexity may be
shared, as we will see in the next example. 

{\sc ale}, like Prolog II and {\sc hpsg}, but unlike most other systems,
allows cyclic structures to be processed and even printed.  For
instance, consider the following representation we might use for the
liar sentence {\em This sentence is false}:
%
\begin{verbatim}
  [0] false
      ARG1 [0]
\end{verbatim}
%
In this case, the empty path and the feature {\tt ARG1} share a value.
Similarly, the path {\tt ARG1 ARG1 ARG1} and the path {\tt ARG1 ARG1},
both of which are defined, are also identical.  But consider a
representation for the negation of the liar sentence, {\em It is false
that this sentence is false}:
%
\begin{verbatim}
  false
  ARG1 [0] false
       ARG1 [0]
\end{verbatim}
%
Unlike Prolog II, {\sc ale} does not necessarily treat these two feature
structures as being identical, as it does not conflate a cyclic
structure with its infinite unfolding.  We take up the notion of
token identical structures in the section below on extensionality.

It is interesting to note that with typed feature structures, there is
a choice between representing information using a type and
representing the same information using feature values.  This is a
familiar situation found in most inheritance-based representation
schemes.  Thus the relation specified in the value of the path {\tt
SYNSEM RES SEM} is represented using a type, in:
%
\begin{verbatim}
  SEM run
      RUNNER [0]
\end{verbatim}
%
An alternative encoding, which is not without merit, is:
%
\begin{verbatim}
  SEM unary_rel
      REL run
      ARG1 [0]
\end{verbatim}
%
In general, type information is processed much more efficiently than
feature value information, so as much information as possible should
be placed in the types.  The drawback is that type information must be
computed at compile-time and remain accessible at run-time.  More
types simply require more memory.%
%
\footnote{In general, the amount of memory required to represent $n$
types is proportional to the number of pairs of consistent types.  
In the worst case, this is ${\cal O}(n^2)$ in the number of types.}


\mysection{Subsumption and Unification}

Feature structures are inherently partial in the information they
provide.  Based on the type inheritance ordering, we can order feature
structures based on how much information they provide.  This ordering
is referred to as the {\em subsumption} ordering.  The notion of
subsumption, or information containment, can be used to define the
notion of unification, or information combination.  Unification
conjoins the information in two feature structures into a single
result if they are consistent and detects an inconsistency otherwise.


\mysubsection{Subsumption}

We define subsumption, saying that $F$ {\it subsumes} $G$, if and only if: 
%
\begin{itemize}
\item
  the type of $F$ is more general than the type of $G$
\item
  if a feature $f$ is defined in $F$ then $f$ is also defined in $G$
such that the value in $F$ subsumes the value in $G$
\item
  if two paths are shared in $F$ then they are also shared in $G$
\end{itemize}
%
Consider the following examples of subsumption, where we let {\tt <}
stand for subsumption:
%
\begin{verbatim}
  agr         <  agr
  PERS first     PERS first
                 NUM plu
\end{verbatim}
%
\begin{verbatim}
  sign                phrase                 
  SUBJ agr         <  SUBJ agr 
       PERS pers           PERS first
                           NUM plu
\end{verbatim}
%
\begin{verbatim}
  sign                sign
  SUBJ agr            SUBJ [0] agr
       PERS first          PERS first
       NUM plu     <       NUM plu
  OBJ agr             OBJ [0]
      PERS first
      NUM plu
\end{verbatim}
%
\begin{verbatim}
  false               false              [1] false
  ARG1 false       <  ARG1 [0] false  <  ARG1 [1]
       ARG1 false          ARG1 [0]
\end{verbatim}
%
Note that the second of these subsumptions holds only if {\tt pers} is
a more general type than {\tt first}, and {\tt sign} is a more general
type than {\tt phrase}.  It is also important to note that the feature
structure consisting simply of the type {\tt bot} will subsume every
other structure, as the type {\tt bot} is assumed to be more general
than every other type.

\mysubsection{Unification}

Unification is an operation defined over pairs of feature structures that
combines the information contained in both of them if they are
consistent and fails otherwise.  In {\sc ale}, unification is very
efficient.%
%
\footnote{Using a typed version of the Martelli and Montanari (1982)
algorithm, which was adapted to cyclic structures by Jaffar (1984),
unification can be performed in what is known as quasi-linear time in
the size of the input structures, where in this case, quasi-linear in
$n$ is defined to be ${\cal O}(n \cdot \mbox{\it ack}^{-1}(n))$, where
${\it ack}^{-1}$ is the inverse of Ackermann's function, which will
never exceed 4 or 5 for structures that can be represented on existing
computers.  There is also a factor in the complexity of unification
stemming from the type hierarchy and appropriateness conditions, which
we discuss below.}
%
Declaratively, unifying two feature structures computes a result which
is the most general feature structure subsumed by both input
structures.  But the operational definition is more enlightening, and
can be given by simple conditions which tell us how to unify two
structures.  We begin by unifying the types of the structures in the
type hierarchy.  This is why we required the bounded completeness
condition on our inheritance hierarchies; we want unification to
produce a unique result.  If the types are inconsistent, unification
fails.  If the types are consistent, the resulting type is the
unification of the input types.  Next, we recursively unify all of the
feature values of the structures being unified which occur in both
structures.  If a feature only occurs in one structure, we copy it
over into the result.  This algorithm terminates because we only need
to unify structures which are non-distinct and there are a finite
number of nodes in any input structure.

Some examples of unification follow, where we use {\tt +} to represent
the operation:
%
\begin{verbatim}
  agr          +  agr      =  agr
  PERS first      NUM plu     PERS first
                              NUM sing
\end{verbatim}
%
\begin{verbatim}
  sign              sign             sign
  SUBJ agr       +  SUBJ [0] bot  =  SUBJ [0] agr
       PERS 1st     OBJ [0]               PERS first
  OBJ agr                                 NUM plu
      NUM plu                        OBJ [0]
\end{verbatim}
%
\begin{verbatim}
  t           t           t
  F [0] t  +  F t      =  F [1] t
  G [0]         F [1]       F [1]
              G [1]       G [1]
\end{verbatim}
%
\begin{verbatim}
  agr         +  agr          =  *failure*
  PERS first     PERS second    
\end{verbatim}
%
\begin{verbatim}
  e_list  +  ne_list     =  *failure*
             HD a
             TL e_list
\end{verbatim}
%
Note that the second example respects our assumption that the type
{\tt bot} is the most general type, and thus more general than {\tt
agr}.  The second example illustrates what happens in a simple case of
structure sharing:  information is retrieved from both the {\tt SUBJ}
and {\tt OBJ} and shared in the result.  The third example shows how
two structures without cycles can be unified to produce a structure
with a cycle.  Just as the feature structure {\tt bot} subsumes every
other structure, it is also the identity with respect to unification;
unifying the feature structure consisting just of the type {\tt bot}
with any feature structure $F$ results simply in $F$.  The last two
unification attempts fail, assuming that the types {\tt first} and
{\tt second} and the types {\tt e\_list} and {\tt ne\_list} are
incompatible. 

\mysection{Inequations}

Feature structures may also incorporate inequational constraints
following (Carpenter 1992), which is in turn based on the notion of
inequation in Prolog II (Colmerauer 1987).  For instance, we might
have the following representation of the semantics of a sentence:
%
\begin{verbatim}
  SEM binary_rel
      REL know
      ARG1  [0] referent
            GENDER masc
            PERS third
            NUM sing
      ARG2  [1] referent
            GENDER masc
            PERS third
            NUM sing
  [0] =\= [1]
\end{verbatim}
%
Below the feature information, we have included the constraint that
the value of the structure {\tt [0]} is not identical to that of
structure {\tt [1]}.  As a result, we cannot unify this structure with
the following one:
%
\begin{verbatim}
  REL know
  ARG1 [2]
  ARG2 [2]
\end{verbatim}
%
Any attempt to unify the structures {\tt [0]} and {\tt [1]} causes
failure.  

\mysection{Type System}

As we mentioned in the introduction, what distinguishes {\sc ale} from
other approaches to feature structures and most other approaches to
terms, is that there is a strong type discipline enforced on feature
structures.  We have already demonstrated how to define a type
hierarchy, but that is only half the story with respect to typing.
The other component of our type system is a notion of feature {\em
appropriateness}, whereby each type must specify which features it can
be defined for, and furthermore, which types of values such features
can take.  The notion of appropriateness used here is similar to that
found in object-oriented approaches to typing.  For instance, if a
feature is appropriate for a type, it will also be appropriate for all
of the subtypes of that type.  In other words, appropriateness
specifications are inherited by a type from its supertypes.
Furthermore, value restrictions on feature values are also inherited.
Another important consideration for {\sc ale}'s type system is the
notion of type inference, whereby types for structures which are
underspecified can be automatically inferred.  This is a property our
system shares with the functional language {\sc ml}, though our notion
of typing is only first-order.  To further put {\sc ale}'s type system
in perspective, we note that type inheritance must be declared by the
user at compile time, rather than being inferred.  Furthermore, types
in {\sc ale} are semantic, in Smolka's (1988b) terms, meaning that
types are used at run-time.  Even though {\sc ale} employs semantic
typing, the type system is employed statically (at compile-time) to
detect type errors in grammars and programs.

As an example of an appropriateness declaration, consider the simple
type specification for lists with a head/tail encoding:
%
\begin{verbatim}
  bot sub [list,atom].
    list sub [e_list,ne_list].
      e_list sub [].
      ne_list sub []
              intro [hd:bot,
                     tl:list].
  atom sub [a,b].
    a sub [].
    b sub [].
\end{verbatim}
%
This specification tells us that a list can be either empty ({\tt
e\_list}) or non-empty ({\tt ne\_list}).  It implicitly tells us that
an empty list cannot have any features defined for it, since none are
declared directly or inherited from more general types.  The
declaration also tells us that a non-empty list has two features,
representing the head and the tail of a list, and, furthermore, that
the head of a list can be anything (since every structure is of type
{\tt bot}), but the tail of the list must itself be a list.  Note that
features must also be Prolog constants, even though the output
routines convert them to all caps.  The appropriateness declaration,
{\tt intro}\label{INTRO}, can be specified along with subsumption, as
shown above, or separately; but for any given type, all features must
be declared at once.  If no {\tt intro} declaration is given for a
type, it is assumed that that type introduces no appropriate features.
If an {\tt intro} declaration is made for a type that does not occur
on either side of a {\tt sub} declaration, that type is assumed to be
an immediate subtype of {\tt bot} with no subtypes of its own.  If a
value restrictor (such as {\tt list} above for feature {\tt tl}) does
not occur on either side of a {\tt sub} declaration, it too is assumed
to be maximal and an immediate subtype of {\tt bot}.

In {\sc ale}, every feature structure must respect the appropriateness
restrictions in the type declarations.  This amounts to two
restrictions.  First, if a feature is defined for a feature structure
of a given type, then that type must be appropriate for the feature.
Furthermore, the value of the feature must be of the appropriate type,
as declared in the appropriateness conditions.  The second condition
goes the other way around: if a feature is appropriate for a type,
then every feature structure of that type must have a value for the
feature.  A feature structure respecting these two conditions is said
to be {\em totally well-typed} in the terminology of Carpenter (1992,
Chapter 6).%
%
\footnote{The choice of totally well-typed structures was motivated by
the desire to represent feature structures as records at run-time,
without listing their features.  Internally, a feature structure is
represented as a term of the form {\tt Tag-Sort(V1,...,VN)} where {\tt
Tag} represents the token identity of the structure using a Prolog
variable, {\tt Sort} is the type of structure, and {\tt V1} through
{\tt VN} are the values of the appropriate features, in alphabetical
order of the features' names, which are themselves left implicit.
Furthermore, the {\tt Tag} is used for forwarding and dereferencing
during unification.}
%
For instance, consider the following feature structures:
%
\begin{verbatim}
  list
  HD a
  TL bot
\end{verbatim}
%
\begin{verbatim}
  ne_list
  HD bot
  TL ne_list
     HD atom
     TL list
\end{verbatim}
%
\begin{verbatim}
  ne_list
  HD [0] ne_list
     HD [0]
     TL [0]
  TL e_list
\end{verbatim}
%
The first structure violates the typing condition because the type
{\tt list} is not appropriate for any features, only {\tt ne\_list} is.
But even if we were to change its type to {\tt ne\_list}, it would
still violate the type conditions, because {\tt bot} is not an
appropriate type for the value of {\tt TL} in a {\tt ne\_list}.  On
the other hand, the second and third structures above are totally
well-typed.  Note that the second such structure does not specify what
kind of list occurs at the path {\tt TL TL}, nor does it specify what
the {\tt HD} value is, but it does specify that the second element of
the list, the {\tt TL HD} value is an {\tt atom}, but it doesn't
specify which one.  

To demonstrate how inheritance works in a simple case, consider the
specification fragment from the categorial grammar in the appendix:
%
\begin{verbatim}
  functional sub [forward,backward]
             intro [arg:synsem,
                    res:synsem].
    forward sub [].
    backward sub [].
\end{verbatim}
%
This tells us that {\tt functional} objects have {\tt ARG} and {\tt
RES} features.  Because {\tt forward} and {\tt backward} are subtypes
of {\tt functional}, they will also have {\tt ARG} and {\tt RES}
features, with the same restrictions.  

There are a couple of important restrictions placed on appropriateness
conditions in {\sc ale}.  The most significant of these is the
acyclicity requirement.  This condition disallows type specifications
which require a type to have a value which is of the same or more
specific type.  For example, the following specification is {\em not}
allowed:
%
\begin{verbatim}
  person sub [male,female]
         intro [father:male,
                mother:female].
    male sub [].
    female sub [].
\end{verbatim}
%
The problem here is the obvious one that there are no most general
feature structures that are both of type {\tt person} and totally
well-typed.%
%
\footnote{The only finite feature structures that could meet this type
system would have to be cyclic, as noted in Carpenter (1992).  The
problem is that there is no most general such cyclic structure, so
type inference cannot be unique.}
%
This is because any person must have a father and mother feature,
which are male and female respectively, but since male and female are
subtypes of person, they must also have mother and father values.  It
is significant to note that the acyclicity condition does not rule out
recursive structures, as can be seen with the example of lists.  The
{\tt list} type specification is acceptable because not every list is
required to have a head and tail, only non-empty lists are.  The
acyclicity restriction can be stated graph theoretically by
constructing a directed graph from the type specification.  The nodes
of the graph are simply the types.  There is an edge from every type
to all of its supertypes, and an edge from every type to the types in
the type restrictions in its features.  Type specifications are only
acceptable if they produce a graph with no cycles.  One cycle in the
{\tt person} graph is from {\tt male} to {\tt person} (by the
supertype relation) and from {\tt person} to {\tt male} (by the {\tt
FATHER} feature).  On the other hand, there are no cycles in the
specification of {\tt list}.

The second restriction placed on appropriateness declarations is
designed to limit non-determinism in much the same way as the
bounded completeness condition on the inheritance hierarchy.  This
second condition requires every feature to be {\em introduced} at a unique
most general type.  In other words, the set of types appropriate for a
feature must have a most general element.  Thus the following type
declaration fragment is {\em invalid}:
%
\begin{verbatim}
  a sub [b,c,d].
    b sub []
      intro [f:w,
             g:x].
    c sub []
      intro [f:y,
             h:z].
    d sub [].
\end{verbatim}
%
The problem is that the feature {\tt F} is appropriate for types {\tt
b} and {\tt c}, but there is not a unique most general type for which
it's appropriate.  In general, just like the bounded completeness
condition, type specifications which violate the feature introduction
condition can be patched, without violating any of their existing
structure, by adding additional types.  In this case, we add a new type
between {\tt a} and the types {\tt b} and {\tt c}, producing the
equivalent well-formed specification:
%
\begin{verbatim}
  a sub [e,d].
    e sub [b,c]
      intro [f:bot].
      b sub []
        intro [f:w,
               g:x].
      c sub []
        intro [f:y,
               h:z].
    d sub [].
\end{verbatim}
%
This example also illustrates how subtypes of a type can place
additional restrictions on values on features as well as introducing
additional features.

As a further illustration of how feature introduction can be obeyed in
general, consider the following specification of a type system for
representing first-order terms:
%
\begin{verbatim}
  sem_obj sub [individual,proposition].
    individual sub [a,b].
      a sub []. 
      b sub [].
  proposition sub [atomic_prop,relational].
    atomic_prop sub [].
    relational_prop sub [unary_prop,transitive_prop]
                    intro [arg1:individual].
      unary_prop sub [].
      transitive_prop sub [binary_prop,ternary_prop]
                      intro [arg2:individual].
        binary_prop sub [].
        ternary_prop sub []
                     intro [arg3:individual].
\end{verbatim}
%
In this case, unary propositions have one argument feature, binary
propositions have two argument features, and ternary propositions have
three argument features, all of which must be filled by individuals.


\mysection{Extensionality}

{\sc ale} also respects the distinction between {\em intensional} and
{\em extensional} types (see Carpenter (1992:Chapter 8).  The concept
of extensional typing has its origins in the assumption in standard
treatments of feature structures, that there can only be one copy of
any {\em atom} (a feature structure with no appropriate features) in a
feature structure.  Thus, if path $\pi_{1}$ leads to atom {\em a}, and
path $\pi_{2}$ leads to atom {\em a}, then the values for those two
paths are token-identical.  Token-identity refers to an identity
between two feature structures as objects, as opposed to {\em
  structure-identity}, which refers to an identity between two feature
structures that contain the same information.

Smolka (1988a) partitioned his atoms according to whether more than
one copy could exist or not.  In {\sc ale}, following Carpenter
(1992), this notion of copyability has been extended to arbitrary
types -- loosely speaking, those types which are copyable we call {\em
intensional}, and those which are not we call {\em extensional}.
Thus, it is possible to have two feature structures of the same
intensional type which, although they may be structure-identical, are
not token-identical.  Formally:
%
\begin{quote}
  Given the set of types, {\sf Type}, defined by an {\sc ale}
  signature, we designate a subset, ${\sf ExtType} \subseteq {\sf
    Type}$, as the set of extensional types.  With the exception of
  {\tt a\_/1} atoms, discussed below, this set consists only of {\em
    maximally specific types}, i.e., for each $\sigma \in {\sf
    ExtType}$, there is no type $\tau$ such that $\sigma$ subsumes
  $\tau$.
\end{quote}
%
The restriction of {\sf ExtType} to maximally specific types is
peculiar to {\sc ale}, and is levied in order to reduce the
computational complexity of enforcing extensionality.%
%
\footnote{In theory (Carpenter 1992), this set is only required to be
  {\em upward closed}, which means that if $\sigma \in {\sf ExtType}$,
  and $\sigma$ subsumes $\tau$, then $\tau \in {\sf ExtType}$.  This
  relaxation of our requirement that extensional types be maximal
  would actually not be too difficult to implement.}

We need one more definition to formally state the effect which an
extensional type has on feature structures in {\sc ale}.
%
\begin{quote}
Given a set of extensional types, {\sf ExtType}, we define an
equivalence relation, $\asymp$, the {\em collasping relation}, on
well-typed feature structures, such that $F_{1} \asymp F_{2}$ for
$F_{1} \neq F_{2}$ only if:
\begin{itemize}
\item $F_{1}$ has the same type, $\sigma$, as $F_{2}$, and $\sigma \in
{\sf ExtType}$, and
\item for every feature, $f$, appropriate to $\sigma$,
$F_{1}^{f}$, the value of $f$ in $F_{1}$, and $F_{2}^{f}$, the value
of $f$ in $F_{2}$, are defined, and $F_{1}^{f} \asymp F_{2}^{f}$.
\end{itemize}
\end{quote}
%
In {\sc ale}, all feature structures behave as if they are what
Carpenter (1992) referred to as {\em collapsed}.
That is, the only collapsing relation that exists between any two
feature structures is the trivial collapsing relation, namely:
%
\begin{quote}
$F_{1} \asymp F_{2}$ if and only if $F_{1}$ is token-identical to $F_{2}$.
\end{quote}
%
In the case of acyclic feature structures, this definition is
equivalent to saying that two feature structures of the same
extensional type are token-identical if and only if, for every feature
appropriate to that type, their respective values on that feature are
token-identical.  For example, supposing that we have a signature
representing dates, then the two substructures representing dates in
the following structure must be token identical.
%
\begin{verbatim}
  married_person
  BIRTHDAY [1] date
           DAY  12
           MONTH nov
           YEAR 1971
  SPOUSE BIRTHDAY [1] date
                  DAY 12
                  MONTH nov
                  YEAR 1971
\end{verbatim}
%
In other words, this represents a person born on 12 November 1971, who
is married to a person with the same birthdate.

Now consider a slightly more complex example, which employs the
following signature.
%
\begin{verbatim}
  bot sub [a,b,c,g].
    a sub []
      intro [f:b,g:c].
    b sub [].
    c sub [].
    g sub []
      intro [h:a,j:a].
\end{verbatim}
%
If {\tt a}, {\tt b}, and {\tt c} are extensional, then the values of
{\tt H} and {\tt J} in {\tt g} are always token-identical, i.e., every
feature structure of type {\tt g} satisfies:
%
\begin{verbatim}
  g
  H [0] a
    F b
    G c
  J [0]
\end{verbatim}
%
But if only {\tt a}, and {\tt b} are extensional, and {\tt c} is
intensional, then the values of {\tt H} and {\tt J} are not
necessarily token-identical, although they are always
structure-identical:
%
\begin{verbatim}
  g
  H a
    F [1] b
    G c
  J a
    F [1]
    G c
\end{verbatim}
%

To cite an earlier example, suppose we were to specify that the type
{\tt false}, used in the liar sentence and its negation, were
extensional.  Now the liar sentence's representation is:
%
\begin{verbatim}
  [0] false
      ARG1 [0]
\end{verbatim}
%
as before, but the negation of the liar sentence would also be
represented by:
%
\begin{verbatim}
  [0] false
      ARG1 [0]
\end{verbatim}
%
since if were still represented by:
%
\begin{verbatim}
  [1] false
  ARG1 [0] false
       ARG1 [0]
\end{verbatim}
%
then we could cite a non-trivial collasping relation, $\asymp$, in
which $[1] \asymp [0]$.  

As a related example, consider:
%
\begin{verbatim}
  s
  A [0] t
    C [0]
  B [1] t
    C [1]
\end{verbatim}
%
Assuming that {\tt t} is extensional and only appropriate for the
feature {\tt C}, then the structures {\tt [0]} and {\tt [1]} in the
above structure would be identified.  

Extensionality allows the proper representation of feature structures
and terms in both PATR-II, the Rounds-Kasper system, and in Prolog and
Prolog II.  For PATR-II and the Rounds-Kasper system, all atoms (those
types with no appropriate features) are assumed to be extensional.
Furthermore, in the Rounds-Kasper and PATR-II systems, which are
monotyped, there is only one type that is appropriate for any
features, and it must be appropriate for all features in the grammar.
In Prolog and Prolog II, the type hierarchy is assumed to be flat, and
every type is extensional.  

Just as with implementations of Prolog, collapsing is only performed
as it is needed.  As shown by Carpenter (1992), collapsing can be
restricted to cases where inequations are tested between two
structures, with exactly the same failure behavior.  It turns out to
be less efficient to collapse structures before asserting them into
{\sc ale}'s parsing chart, primarily because the time to test arbitrary
structures for collapsibility is at least quadratic in the size of the
structures being collapsed.  See the section below on inequations for
further discussion.  Currently, extensionality is only enforced before
the answer to a user query is given.

Extensional types in {\sc ale} are specified all at once in
a list: 
%
\begin{quote}
  ${\tt ext([}ext_{1}, \ldots, ext_{n}{\tt ]).}$\label{EXT}
\end{quote}
%
in the same file in which the subsumption relation is defined.  All
types that are not mentioned in the {\tt ext} specification are
assumed to be intensional, except {\sc ale}'s {\tt a\_/1} atoms,
discussed below, which have the same extensionality as Prolog terms,
i.e., if they are ground or have the same variables in the same
positions.%
%
\footnote{This is given by the {\tt ==} operator in Prolog.}
%
These do not need to be declared as such.  If more than
one {\tt ext} specification is given, the first one is used.  If no
{\tt ext} specification is given, then the specification:
%
\begin{verbatim}
  ext([]).
\end{verbatim}
%
is assumed.  If a type occurs in {\tt ext/1}, but does not appear on
the left or right-hand side of a {\tt sub} declaration, it assumed to
be maximal, and immediately subsumed by {\tt bot}.

Of course, collapsing is only enforced between feature structures
whose life-spans in {\sc ale}%
%
\footnote{The life-span of a feature
structure in {\sc ale} is the period from its creation to the point
when the user command currently being executed finishes, unless
that feature structure is asserted as an edge in {\sc ale}'s chart
parser.  In this case, the life of the feature structure ends when the
edge is removed.  Every new request for a parse to {\sc ale} removes
all of the current edges.} 
%
overlap.  So, for example, if one request
is made for the representation of the liar sentence:
%
\begin{verbatim}
  [0] false
      ARG1 [0]
\end{verbatim}
%
and then another is made for that of its negation, the output is not:
\begin{verbatim}
  [0]   
\end{verbatim}
(referring to the same token above) but rather:
%
\begin{verbatim}
  [0] false
      ARG1 [0]
\end{verbatim}
%
Every time a new context arises, numbering of structures begins again
from {\tt [0]}.

\mysection{{\tt a\_/1} Atoms}

{\sc ale} also provides an infinite collection of atoms.
These are of the form:\label{ATOM}
\begin{quote}
{\tt a\_} {\it Term}
\end{quote}
where {\it Term} is a prolog term.  Two {\tt a\_/1} atoms subsume each
other if and only if their terms subsume each other as prolog terms.
As a result, no two different, ground atoms subsume each
other; and the most general atom of this collection is {\tt a\_ \_}.
They implicitly exist in every type hierarchy, with {\tt a\_ \_} being
immediately subsumed by {\tt bot}, and with every ground
atom being maximal.  {\tt a\_/1} atoms are extensional; and non-ground
{\tt a\_/1} atoms are extensionally identical as Prolog terms, i.e.,
if they have the same variables in the same positions.  For example,
{\tt a\_ f(X)} and {\tt a\_ g(X)} are not taken to be the same atom,
nor are {\tt a\_ f(X)} and {\tt a\_ f(f(X)}, nor are {\tt a\_ f(X)}
and {\tt a\_ f(Y)}.  But {\tt a\_ f(X)} and {\tt a\_ f(X)} are.  Their
status in the type hierarchy should not be explicitly declared, nor
should the fact that they bear no features, nor should their
extensionality.

Some care must be exercised when using non-ground atoms in chart
edges.  {\sc ale}'s chart parser copies edges, including the Prolog
variables inside {\tt a\_/1} atoms.  When these variables are copied,
identity among variables within a single edge is preserved, but
identity among variables between different edges may be lost.  Because
{\sc ale} delays the enforcement of extensional type checking, this
could result in {\sc ale} losing a path equation between two atoms.
The best way to avoid this is always to use ground atoms in chart
edges.  Otherwise, the user should at least avoid relying on
extensional identity when writing grammars by not using the {\sc ale}
built-in {\tt @=} or inequations between non-ground atoms from
different edges.

If the user requires intensional atoms, they must be explicitly
declared.  There must also be no user-defined type, {\tt a\_}, in the
type hierarchy.  Certain arguments to {\tt a\_/1} cannot be used, such
as {\tt a\_} itself, and other prolog reserved words, such as {\tt
  mod}, unless they are used with the proper operator precedence and
proper number of arguments to be parsed by Prolog.

Otherwise, {\tt a\_/1} atoms can be used wherever a normal type can.
They are particularly useful as members of large domains that are too
tedious to define, such as phonology attributes in natural language
grammars, or to pass extra-logical information around a parse tree,
such as numbers representing probabilities.  To declare a feature's
value as any {\tt a\_/1} atom, use {\tt a\_ \_}:
%
\begin{verbatim}
sign intro [phon:(a_ _)].
\end{verbatim}
%
The parentheses are recommended for readability, but not necessary.
Because subsumption among {\tt a\_/1} atoms mirrors subsumption as
prolog terms, one can also declare features appropriate for only
certain kinds of atoms.  For example:
%
\begin{verbatim}
sign intro [phon:(a_ phon(_))].
\end{verbatim}
%
declares {\sc phon} appropriate to any atom whose term's functor is
{\tt phon/1}.

Structure-sharing between {\tt a\_/1} prolog terms in feature
appropriateness declarations is ignored by {\sc ale}.  For example,
the declaration:
%
\begin{verbatim}
foo intro [f:(a_ X),g:(a_ X)].
\end{verbatim}
%
is treated as:
%
\begin{verbatim}
foo intro [f:(a_ _),g:(a_ _)].
\end{verbatim}
%
{\sc ale} does respect structure-sharing between {\tt a\_/1} prolog
terms in descriptions.

\mysection{Attribute-Value Logic}

Now that we have seen how the type system must be specified, we turn
our attention to the specification of feature structures themselves.
The most convenient and expressive method of describing feature
structures is the logical language developed by Kasper and Rounds
(1986), which we modify here in three ways.  First, we replace the
notion of path sharing with the more compact and expressive notion of
variable due to Smolka (1988a).  Second, we extend the language to
types, following Pollard (in press). Finally, we add inequations.

The collection of {\em descriptions} used in {\sc ale} can be described by
the following {\sc bnf} grammar:
%
\begin{verbatim}
  <desc> ::= <type>
           | <variable>
           | (<feature>:<desc>)
           | (<desc>,<desc>)
           | (<desc>;<desc>)
           | (=\= <desc>)
\end{verbatim}
%
As we have said before, both types and features are represented by
Prolog constants.  Variables, on the other hand, are represented by
Prolog variables.  As indicated by the {\sc bnf}, no whitespace is
needed around the feature selecting colon, conjunction
comma\label{COMMADESC} and disjunction semi-colon\label{SEMIDESC}, but any
whitespace occurring will be ignored.

These descriptions are used for picking out feature structures that
satisfy them.  We consider the clauses of the definition in turn.  A
description consisting of a type picks out all feature structures of
that type.  A variable can be used to refer to any feature structure,
but multiple occurrences of the same variable must refer to the same
structure.  A description of the form {\tt
  (<feature>:<desc>)}\label{COLON} picks out a feature structure whose
value for the feature satisfies the nested description.  An inequation
{\tt =$\backslash$= <desc>}{\label{INEQ}} is satisfied by those
feature structures that are not token-identical to the feature
structure described by {\tt <desc>}.  Inequations are discussed in
more detail below.

There are two ways of logically combining descriptions: following
Prolog, the comma represents conjunction and the semi-colon represents
disjunction.  A feature structure satisfies a conjunction of
descriptions just in case it satisfies both conjuncts, while it
satisfies a disjunction of descriptions if it satisfies either of the
disjuncts.

We should also add to the above {\sc bnf} grammar the following line:
%
\begin{verbatim}
  <desc> ::= (<path> == <path>)
\end{verbatim}
%
This is an equational description\label{EQUAL}, of which inequations
are the negation.  Equational or inequational descriptions are
satisfied by the presence or absence, respectively, of token-identity.
In particular, an inequation between two structurally-identical
feature structures can be satisfied, while a path equation can {\it
 only} be satisfied by two structurally-identical feature structures,
but is not necessarily satisfied.

All instances of equational descriptions can be captured by using
multiple occurrences of variables.  For example, the description:
%
\begin{verbatim}
  ([arg1]==[arg2])
\end{verbatim}
%
is equivalent to the description:
%
\begin{verbatim}
  (arg1:X,arg2:X).
\end{verbatim}
%
assuming there are no other occurrences of {\tt X}.  

Standard assumptions about operator precedence and association are
followed by {\sc ale}, allowing us to omit most of the parentheses in
descriptions.  In particular, equational descriptions bind the most
tightly, followed by feature selecting colon, then by inequations,
then conjunction and finally disjunction.  Furthermore, conjunction
and disjunction are left-associative, while the feature selector is
right-associative.  For instance, this gives us the following
equivalences between descriptions:
%
\begin{verbatim}
  a, b ; c, d ; e = (a,b);(c,d);e

  a,b,c = a,(b,c)
  
  f:g:bot,h:j = (f:(g:bot)),(h:j)

  f:g: =\=k,h:j = (f:(g: =\=(k))),(h:j)

  f:[g]==[h],h:j = (f:([g]==[h])),(h:j) 
\end{verbatim}
Note that a space must occur between {\tt =$\backslash$=} and other
operators such as {\tt :}.

A description may be satisfied by no structure, a finite number of
structures or an infinite collection of feature structures.  A
description is said to be {\em satisfiable} if it is satisfied by at
least one structure.  A description $\phi$ {\em entails} a description
$\psi$ if every structure satisfying $\phi$ also satisfies $\psi$.
Two descriptions are {\em logically equivalent} if they entail each
other, or equivalently, if they are satisfied by exactly the same set
of structures.  

{\sc ale} is only sensitive to the differences between logically
equivalent formulas in terms of speed.  For instance, the two
descriptions {\tt (tl:list,ne\_list,hd:bot)} and {\tt hd:bot} are
satisfied by exactly the same set of totally well-typed structures
assuming the type declaration for lists given above, but the smaller
description will be processed much more efficiently.  There are also
efficiency effects stemming from the order in which conjuncts (and
disjuncts) are presented.  The general rule for speedy processing is
to eliminate descriptions from a conjunction if they are entailed by
other conjuncts, and to put conjuncts with more type and feature
entailments first.  Thus with our specification for relations above,
the description {\tt (arg1:a, binary\_proposition)} would be slower
than {\tt (binary\_proposition,arg1:a)}, since {\tt
binary\_proposition} entails the existence of the feature {\tt arg1},
but not conversely.%
%
\footnote{This is because the depth of dereferencing depends on the
history of types a given structure is instantiated to.  There is no
path compression on-line, but it is carried out before an edge is
asserted into the chart.} 

At run-time, {\sc ale} computes a representation of the most general
feature structure that satisfies a description.  Thus a description
such as {\tt hd:a} with respect to the list grammar is satisfied by
the structure:
%
\begin{verbatim}
  ne_list
  HD a
  TL list
\end{verbatim}
%
Every other structure satisfying the description {\tt hd:a} is
subsumed by the structure given above.  In fact, the above structure
is said to be a {\em vague} representation of all of the structures
that satisfy the description.  The type conditions in {\sc ale} were
devised to obey the very important property, first noted by Kasper and
Rounds (1986), that every non-disjunctive description is satisfied by
a unique most general feature structure.  Thus in the case of {\tt
hd:a}, there is no more general feature structure than the one above
which also satisfies {\tt hd:a}.

The previous example also illustrates the kind of {\em type inference}
used by {\sc ale}.  Even though the description {\tt hd:a} does not
explicitly mention either the feature {\tt TL} or the type {\tt
ne\_list}, to find a feature structure satisfying the description,
{\sc ale} must infer this information.  In particular, because {\tt
ne\_list} is the most general type for which {\tt HD} is appropriate,
we know that the result must be of type {\tt ne\_list}.  Furthermore,
because {\tt ne\_list} is appropriate for both the features {\tt HD}
and {\tt TL}, {\sc ale} must add an appropriate {\tt TL} value.  The
value type {\tt list} is also inferred, due to the fact that a {\tt
ne\_list} must have a {\tt TL} value which is a list.  As far as type
inference goes, the user does not need to provide anything other than
the type specification; the system computes type inference based on
the appropriateness specification.  In general, type inference is very
efficient in terms of time.  The biggest concern should be how large
the structures become.%
%
\footnote{Finding most general satisfiers for non-disjunctive
descriptions, even those involving type inference, is quasi-linear in
the size of the description.  But it should be kept in mind that there
is also a factor of complexity determined by the size of the type
specification.  In practice, this factor is proportional to how large
the inferred structure is.  In general, the size of the inferred
structure is linear in the size of the description, with a constant
for the type specification.}
%
In contrast to a vague description, a disjunctive description is
usually {\em ambiguous}.  Disjunction is where the complexity arises
in satisfying descriptions, as it corresponds operationally to
non-determinism.%
%
\footnote{It corresponds so closely with non-determinism that
satisfiability of descriptions with disjunctions is $\mbox{\sc
NP}$-complete.  Furthermore, the algorithm employed by {\sc ale} may
produce up to $2^n$ satisfiers for a description with $n$
disjunctions.}
%
For instance, the description {\tt hd:(a;b)}
is satisfied by two distinct minimal structures, neither of which
subsumes the other:
%
\begin{verbatim}
  ne_list       ne_list
  HD a          HD b
  TL list       TL list
\end{verbatim}
%
On the other hand, the description {\tt hd:atom} is satisfied by the
structure:
%
\begin{verbatim}
  ne_list
  HD atom
  TL list
\end{verbatim}
%
Even though the descriptions {\tt hd:atom} and {\tt hd:(a;b)} are not
logically equivalent (though the former entails the latter), they have
the interesting property of being unifiable with exactly the same set
of structures.  In other words, if a feature structure can be unified
with the most general satisfier of {\tt hd:atom}, then it can be
unified with one of the minimal satisfiers of {\tt hd:(a;b)}.  

In terms of efficiency, it is very important to use vagueness wherever
possible rather than ambiguity.  In fact, it is almost always a good
idea to arrange the type specification with just this goal in mind.
For instance, consider the difference between the following pair of
type specifications, which might be used for English gender:
%
\begin{verbatim}
  gender sub [masc,fem,neut].        gender sub [animate,neut].
    masc sub [].                       animate sub [masc,fem].
    fem sub [].                          masc sub [].
    neut sub [].                         fem sub [].
                                       neut sub [].
\end{verbatim}
%
Now consider the fact that the relative pronouns {\em who} and {\em
which} are distinguished on the basis of whether they select animate
or inanimate genders.  In the flatter hierarchy, the only way to
select the animate genders is by the ambiguous description {\tt
masc;fem}.  The hierarchy with an explicit animate type can capture
the same possibilities with the vague description {\tt animate}.  An
effective rule of thumb is that {\sc ale} does an amount of work at
best proportional to the number of most general satisfiers of a
description and at worst proportional to $2^n$, where $n$ is the
number of disjuncts in the description.  Thus the ambiguous
description requires roughly twice the time and memory to process than
the vague description.  Whether the amount of work is closer to the
number of satisfiers or exponential in the number of disjuncts depends
on how many unsatisfiable disjunctive possibilities drop out early in
the computation.

\mysubsection{Enforcement of Inequations}

Inequations are persistent in that once that are created, they remain
as long as one of the structures being inequated remains.  Thus the
following two descriptions are logically equivalent:
%
\begin{verbatim}
  f:(=\=c), f:c

  f:c, f:(=\=c)
\end{verbatim}
%
Both will cause failure; but they are not operationally equivalent.
An inequation is evaluated when it arises, and again after high-level
unifications in the system; but inequations are not evaluated every
time an inequated structure is modified.  In an ideal system,
inequations would be attached directly to structures so that they
could be evaluated on-line during unification.  As things stand, {\sc
  ale} represents a feature structure as a regular feature
structure with a separate set of inequations.  Also, the complexity is
sensitive to the conjunctive normal form of inequations at the time at
which it is evaluated, though this form may later be reduced.

These sets of inequations are evaluated at run-time at the point they
are encountered, before answers are given to top-level queries, before
any edge is added to {\sc ale}'s parsing chart, after every daughter
is matched to a description in an {\sc ale} grammar rule%
%
\footnote{In the case of {\tt cats>}, they are enforced after
the list description itself is matched, and also after every element
of the list is matched.},
%
and after the head of an {\sc ale} definite clause has been matched
to a goal.  At compile-time, inequations are checked for every empty
category, for every lexical entry, and after every lexical rule
application.

Inequations are also symmetric.  Thus the following two descriptions
are logically equivalent:
%
\begin{verbatim}
  f:(=\= X),g:X

  f:X,g:(=\= X)
\end{verbatim}
%
Both inequate the values of {\sc f} and {\sc g}.  Again, these are not
operationally equivalent.  Because inequations are evaluated at the
time they are encountered, the second ordering will normally detect an
immediate failure {\it sooner} than the first.

An inequation between two feature structures is a requirement for them
not to be {\em token-identical}.  Thus, if a type is intensional, it
is possible for two feature structures to be of that same type, and
still satisfy an inequation between them.  Thus, any attempt to
inequate two structures that should be identical as a result of
extensional typing will also cause failure.  For instance, consider
the following:
%
\begin{verbatim}
  s              s
  F [1] t        F [3]
    H [1]   +    G [4]           =  failure
  G [2] t        [3] =\= [4]
    H [1]
\end{verbatim}
%
The values of the features {\tt F} and {\tt G} cannot be inequated
because they are extensionally identical (assuming the type {\tt t} is
declared to be extensional and is only appropriate for the feature
{\tt H}.  

When inequations are evaluated, they are reduced.  This reduction
consists, in part, of partial evaluation of extensionality checking,
which is otherwise delayed in {\sc ale}.  For instance, consider the
following:
%
\begin{verbatim}
  F [1] s
    H bot
    J bot
  G [2] s
    H bot
    J bot
  [1] =\= [2]
\end{verbatim}
%
If the type {\tt s} is extensional and appropriate for the features
{\tt H} and {\tt J}, then the inequation above is reduced to the
following:
%
\begin{verbatim}
  F [1] s
    H [3] bot
    J [4] bot
  G [2] s
    H [5]
    J [6]
  [3] =\= [5] ; [4] =\= [6]
\end{verbatim}
%
The set of inequations is stored in conjunctive normal form.  The cost
is some space over the re-evaluation of inequations.  Of course, if
the types on {\tt [3]} and {\tt [4]} were more refined than {\tt bot},
then the inequations {\tt [3] \verb+=\=+ [5]} and {\tt [4] \verb+=\=+
[6]} would further reduce.  In addition, when reducing inequations in
this way, we eliminate those that are trivially satisfied.  The ones
that are printed are only the residue after reduction.  For instance,
consider the following structure:
%
\begin{verbatim}
  F [1] s
    H [3] a
    J bot
  G [2] s
    H [3]
    J bot
  [1] =\= [2]
\end{verbatim}
%
Since the {\tt H} values are token-identical, this
structure reduces to the following.
%
\begin{verbatim}
  F s
    H [3] a
    J [4] bot
  G s
    H [3]
    J [5] bot
  [4] =\= [5]
\end{verbatim}
%
If structures {\tt [4]} and {\tt [5]} had been of non-unifiable types,
then there would be no residual inequation at all --- the original
inequation would trivially be satisfied.

An important subcase is that of an inequation between extensional
atoms.  If an atom is extensional, then there is only one instance of
it.  Thus an inequation between two identical, extensional atoms
always fails.  For example, if a type signature includes:
%
\begin{verbatim}
  bot sub [..., a, b, ...].
  a sub [].
    intro [f:bot].
  b sub [].
  ...
  ext([..., b, ...]).
\end{verbatim}
%
then the description:
%
\begin{verbatim}
  (a,f:=\= b)
\end{verbatim}
%
is satisfied just in those cases where the value of {\sc f} is not of
type {\bf b}.  If {\bf b} were intensional, then the inequation in
this description would essentially have no effect.  In fact, the only
productive instances of inequations between two intensionally typed
feature structures are those used with multiply occurring variables.
In all other instances, there is no way for the inequation to be
violated, since there is no way to render a structurally-identical
copy of an intensionally typed feature structure token-identical to
any other structure.  {\sc ale} detects these trivially satisfied
inequations and disposes of them.

\mysection{Macros}

{\sc ale} allows the user to employ a general form of parametric
macros in descriptions.  Macros allow the user to define a description
once and then use a shorthand for it in other descriptions.  We first
consider a simple example of a macro definition, drawn from the
categorial grammar in the appendix.  Suppose the user wants to employ
a description {\tt qstore:e\_list} frequently within a program.  The
following macro definition\label{MACROSIG} can be used in the program
file:
%
\begin{verbatim}
  quantifier_free macro
    qstore:e_list.
\end{verbatim}
%
Then, rather than including the description {\tt qstore:e\_list} in
another description, {\tt @ quantifier\_free} can be used instead.
Whenever {\tt @ quantifier\_free} is used, {\tt qstore:e\_list} is
substituted.

In the above case, the {\tt <macro\_spec>} was a simple atom, but in
general, it can be supplied with arguments.  
The full {\sc bnf} for macro definitions is as follows:
%
\begin{verbatim}
  <macro_def> ::= <macro_head> macro <desc>.

  <macro_head> ::= <macro_name>
                 | <macro_name>(<seq(<var>)>)

  <macro_spec> ::= <macro_name>
                 | <macro_name>(<seq(<desc>)>)

  <seq(X)> ::= X
             | X, <seq(X)>
\end{verbatim}
%
Note that {\tt <seq(X)>} is a parametric category in the {\sc bnf}
which abbreviates non-empty sequences of objects of category {\tt X}.
The following clause should be added to recursive definition of
descriptions: 
%
\begin{verbatim}
  <desc> ::=  @ <macro\_spec>
\end{verbatim}
%
A feature structure satisfies a description of the form {\tt @
<macrospec>}\label{AT} just in case the structure satisfies the body
of the definition of the macro.

Again considering the categorial grammar in the appendix, we have the
following macros with one and two arguments respectively:
%
\begin{verbatim}
  np(Ind) macro
    syn:np,
    sem:Ind.
\end{verbatim}
%
\begin{verbatim}
  n(Restr,Ind) macro
    syn:n,
    sem:(body:Restr,
         ind:Ind).
\end{verbatim}
%
In general, the arguments in the definition of a macro must be Prolog
variables, which can then be used as variables in the body of the
macro.  With the first macro, the description {\tt @ np(j)} would then
be equivalent to the description {\tt syn:np,sem:j}.  When evaluating
a macro, the argument supplied, in this case {\tt j}, is substituted
for the variable when expanding the macro.  In general, the argument
to a macro can itself be an arbitrary description (possibly containing
macros).  For instance, the description:
%
\begin{verbatim}
  n((and,conj1:R1,conj2:R2),Ind3)
\end{verbatim}
%
would be equivalent to the description:
%
\begin{verbatim}
  syn:n, 
  sem:(body:(and,conj1:R1,conj2:R2), 
       ind:Ind3)
\end{verbatim}
%
This example illustrates how other variables and even complex
descriptions can be substituted for the arguments of macros.
Also note the parentheses around the arguments to the first argument
of the macro.  Without the parentheses, as in {\tt
n(and,conj1:R1,conj2:R2,Ind3)}, the macro expansion routine would take
this to be a four argument macro, rather than a two argument macro
with a complex first argument.  This brings up a related point, which
is that different macros can have the same name as long as they have
the different numbers of arguments.  

Macros can also contain other macros, as illustrated by the macro for
proper names in the categorial grammar:
%
\begin{verbatim}
  pn(Name) macro
    synsem: @ np(Name),
    @ quantifier_free.
\end{verbatim}
%
In this case, the macros are expanded recursively, so that the
description {\tt pn(j)} would be equivalent to the description
%
\begin{verbatim}
  synsem:(syn:np,sem:j),qstore:e_list
\end{verbatim}

It is usually a good idea to use macros whenever the same description
is going to be re-used frequently.  Not only does this make the
grammars and programs more readable, it reduces the number of simple
typing errors that lead to inconsistencies.

As is to be expected, macros can't be recursive.  That is, a macro,
when expanded, is not allowed to invoke itself, as in the {\em
ill-formed} example:
%
\begin{verbatim}
  infinite_list(Elt) macro
    hd:Elt,
    tl:infinite_list(Elt)
\end{verbatim}
%
The reason is simple;  it is not possible to expand this macro to a
finite description.  Thus all recursion must occur in grammars or
programs;  it can't occur in either the appropriateness conditions or
in macros.  

The user should note that variables in the scope of a macro are not
the same as {\sc ale} feature structure variables --- they denote
where macro-substitutions of parameters are made, {\em not}\/
instances of re-entrancy in a feature structure.  If we employ
the following macro:
%
\begin{verbatim}
  blah(X) macro
    b,
    f: X,
    g: X.
\end{verbatim}
%
with the argument {\tt (c,h:a)} for example we obtain the following
feature structure:
%
\begin{verbatim}
  b
  F c
    H a
  G c
    H a
\end{verbatim}
%
where the values of {\tt F} and {\tt G} are not shared (unless {\tt c}
and {\tt a} are extensional).  We can, of course, create a shared
structure using {\tt blah}, by including an {\sc ale} variable in the
actual argument to the macro.  Thus {\tt blah((Y,c,h:a))} yields:
%
\begin{verbatim}
  b
  F [0] c
    H a
  G [0]
\end{verbatim}

Because programming with lists is so common, {\sc ale} has a special
macro\label{LIST} for it, based on the Prolog list notation.  A
description may also take any of the forms on the left, which will be
treated equivalently to the descriptions on the right in the following
diagram:
%
\begin{verbatim}
  []                e_list

  [H|T]             (hd:H, 
                     tl:T)

  [A1,A2,...,AN]    (hd:A1,
                     tl:(hd:A2,
                         tl: ...
                                 tl:(hd:AN,
                                     tl:e_list)...))

  [A1,...,AN|T]      (hd:A1,
                     tl:(hd:A2,
                         tl: ...
                                 tl:(hd:AN,
                                     tl:T)...))
\end{verbatim}
%
Note that this built-in macro does not require the macro operator {\tt
@}.  Thus, for example, the description {\tt [a|T3]} is equivalent to
{\tt hd:a,tl:T3}, and the description {\tt [a,b,c]} is equivalent to
{\tt hd:a,tl:(hd:b,tl:(hd:c,tl:e\_list))}.  There are many example of
this use of Prolog's list notation in the grammars in the appendix.

\mysection{Functional Descriptions}

{\sc ale} also provides the means to define functions mapping
descriptions to descriptions.  The syntax is:
%
\begin{verbatim}
  <func_def> ::= <func_spec> +++> <desc>.

  <func_spec> ::= <func_name>
                 | <func_name>(<seq(desc)>)
\end{verbatim}
%
Functional descriptions are compiled into code that is evaluated at
run-time, first adding to each argument (if any) the description given
for that argument, and then evaluating to the resulting description,
which can itself include other functional descriptions, including
recursive calls.  As an example, one may consider the {\tt append/2}
function:\label{APPEND-DEF}
%
\begin{verbatim}
append([],L) +++> L.
append([X|L1],L2) +++> [X|append(L1,L2)].
\end{verbatim}
%
The lists shown here are instances of {\sc ale}'s list macro notation.
Notice that the only type checking in this definition is performed by
appropriateness and the list macro, so the first clause could succeed
with any {\tt L}.  To add more, one could redefine the first clause
as:
%
\begin{verbatim}
append([],(list,L)) +++> L.
\end{verbatim}
%
Functional descriptions can be used wherever other descriptions can.
The user should be particularly careful with ensuring that the
arguments to a functional description call will be sufficiently
instantiated for the call to terminate; and the clauses, correctly
ordered for the description to terminate correctly.  Type checking
cannot ensure this, in general.  For example, in the definition for
{\tt append/2}, with or without explicit type-checking on the second
argument, both clauses will match a functional description whose first
argument or any {\sc tl} value in the first argument is strictly of
type {\tt list}, i.e.,  a list that is not known to be {\tt elist} or
{\tt nelist}.  Such a functional description will, thus, evaluate to
an infinite number of results.  Clauses in functional descriptions are
considered in the order they are given; and the search for solutions
{\it always} backtracks into subsequent clauses.

Any functional description that can be defined so that its argument
descriptions are only variables, and its result has no (mutual)
recursion should be defined as a macro instead, which is completely
expanded at compile-time.  Remember, however, that it is not always
sufficient to replace the {\tt +++>} with {\tt macro}: variable
replacement in macros works by true textual replacement, whereas the
variables in functional descriptions are {\sc ale} descriptions.  For
example, the functional description:
%
\begin{verbatim}
foo(X) +++> (bar,f:X,g:X).
\end{verbatim}
%
evaluates to a feature structure with a re-entrancy.  The macro:
%
\begin{verbatim}
foo(X) macro (bar,f:X,g:X).
\end{verbatim}
%
does not necessarily evaluate to a feature structure with a
re-entrancy.  The functional description, {\tt foo/1} is correctly
converted to the macro:
%
\begin{verbatim}
foo(X) macro (bar,f:(Y,X),g:(Y,X)).
\end{verbatim}
%
The {\sc ale} variable, {\tt Y}, establishes the re-entrancy that the macro
variable, {\tt X}, does not.

\mysection{Type Constraints}

Our logical language of descriptions can be used with the type system
in order to enforce constraints on feature structures of a particular
type.  Constraints are attached to types, and may consist of arbitrary
descriptions.  Their effect is to require every structure of the
constrained type to always satisfy the constraint description.  

Constraints are enforced using the {\tt cons}\label{CONS} operator,
e.g.:
%
\begin{verbatim}
  bot sub [a,b].
    a sub []
      intro [f:b,g:b].
    b sub [].
  a cons (f:X,g:=\= X).
\end{verbatim}
%
The constraint on the type {\tt a} (which must occur within
parentheses) requires all feature structures of type {\tt a} to have
non-token-identical values for features {$f$} and {$g$}.  Notice
that the type {\tt b} has no constraints expressed.  This is
equivalent to specifying the constraint:
%
\begin{verbatim}
  b cons bot.
\end{verbatim}
%
which is satisfied by any feature structure (of type {\tt b}).  A type
constraint may use any of the operators in the description language,
including further type descriptions, which may themselves be
constrained.  The type, {\tt bot}, may not have type constraints, nor
may {\tt a\_/1} atoms.

It is crucial that the type descriptions be finitely resolvable.
Because simple depth-first search is used to evaluate constraints,
infinite resolution paths will cause the system to hang.  For example,
the following signature should not contain the following constraints:
%
\begin{verbatim}
  bot sub [a,b].
    a sub [c]
      intro [f:bot].
      c sub [].
    b sub []
      intro [g:bot].
  a cons f:b.
  b cons g:c.
\end{verbatim}
%
This is because {\tt a} subsumes {\tt c}.  Notice, however, that type
constraints can be used to provide additional information regarding
value restrictions on appropriate features.  In general, {\sc ale}
performs more efficiently when restrictions are provided in the
appropriateness conditions, rather than in general constraints; but
type constraints can encode a greater variety of restrictions.
Specifically, they allow constraints to express path equations and
inequations, as well as deeper path restrictions.  Constraints may
include relational constraints, which are defined using definite
clauses, which are discussed below.  Type constraints are efficiently
compiled in the same way as other descriptions.  Also, like
appropriateness conditions, they are only enforced once for any
given structure.
% future version: error flagging when constraint is inconsistent with
%  appropriateness conditions
%                 error flagging cyclicity of constraints?

It is also important to note that because of the delay in {\sc ale}'s
inequational enforcement, type constraints that involve recursion that
terminates by an inequation failure may go into infinite loops due to
this delay in enforcement.  Because extensionality is only enforced
before the answer to a top-level query is given, recursive type
constraints that rely on the extensional identity of two feature
structures to terminate on the basis of their type will not terminate.

\mysection{Example: The Zebra Puzzle}

We now provide a simplified form of the Zebra Puzzle
(Figure~\ref{zebra-fig}), a common puzzle for constraint resolution.
This puzzle was solved by A{\"{\i}}t-Kaci (1984) using roughly the
same methods as we use here.  The puzzle illustrates the expressive
power of the combination of extensional types, inequations and type
constraints.  Such puzzles, sometimes known as logic puzles or
constraint puzzles, require one to find a state of affairs within some
situation that simultaneously satisfies a set of constraints.  The
situation itself very often implicitly levies certain constraints.
%
\begin{figure}
{\em Puzzle}:
Three different houses each contain a different pet, a different
drink, and an inhabitant of a different nationality.  The following
six facts are true about these houses:
\begin{enumerate}
\item The man in the third (middle) house drinks milk.
\item The Spaniard owns the dog.
\item The Ukranian drinks the tea.
\item The Norwegian lives in the first house.
\item The Norwegian lives next to the tea-drinker.
\item The juice-drinker owns the fox.
\end{enumerate}
{\em Questions}: Who owns the zebra?  Who drinks juice? 
\caption{The Zebra Puzzle.}\label{zebra-fig}
\end{figure}
%

We can represent the simplified Zebra Puzzle in {\sc ale} as:
%
\begin{verbatim}
  % Subsumption
  %=======================
  
  bot sub [house,descriptor,background].
  
    descriptor sub [nat_type,ani_type,bev_type].
      nat_type sub [norwegian,ukranian,spaniard].
        norwegian sub [].
        ukranian sub [].
        spaniard sub [].
      ani_type sub [fox,dog,zebra].
        fox sub [].
        dog sub [].
        zebra sub [].
      bev_type sub [juice,tea,milk].
        juice sub [].
        tea sub [].
        milk sub [].
  
    house sub []
        intro [nationality:nat_type,animal:ani_type,beverage:bev_type].
  
    background sub [clue]
               intro [house1:house,house2:house,house3:house].
      clue sub [maximality].
        maximality sub [].
  
  ext([norwegian,ukranian,spaniard,fox,dog,zebra,juice,tea,milk]).
  
  % Constraints
  %=============================
  background cons
    (house1:nationality:N1,                          % inequational constraints
     house2:nationality:(N2,(=\= N1)),
     house3:nationality:((=\= N1),(=\= N2)),
  
     house1:animal:A1,
     house2:animal:(A2,(=\= A1)),
     house3:animal:((=\= A1),(=\= A2)),
  
     house1:beverage:B1,
     house2:beverage:(B2,(=\= B1)),
     house3:beverage:((=\= B1),(=\= B2))).
  
  clue cons
    (house3:beverage:milk,                             % clue 1
  
     (house1:nationality:spaniard,house1:animal:dog    % clue 2
     ;house2:nationality:spaniard,house2:animal:dog
     ;house3:nationality:spaniard,house3:animal:dog),
  
     (house1:nationality:ukranian,house1:beverage:tea  % clue 3
     ;house2:nationality:ukranian,house2:beverage:tea
     ;house3:nationality:ukranian,house3:beverage:tea),
  
     house1:nationality:norwegian,                     % clue 4
  
     (house1:nationality:norwegian,house2:beverage:tea % clue 5
     ;house2:nationality:norwegian,house3:beverage:tea
     ;house2:nationality:norwegian,house1:beverage:tea
     ;house3:nationality:norwegian,house2:beverage:tea),

     (house1:beverage:juice,house1:animal:fox          % clue 6
     ;house2:beverage:juice,house2:animal:fox
     ;house3:beverage:juice,house3:animal:fox)).
  
  
  maximality cons
    (house1:nationality:(norwegian;ukranian;spaniard), % maximality constraints
     house2:nationality:(norwegian;ukranian;spaniard),
     house3:nationality:(norwegian;ukranian;spaniard),
  
     house1:animal:(fox;dog;zebra),
     house2:animal:(fox;dog;zebra),
     house3:animal:(fox;dog;zebra),
  
     house1:beverage:(juice;tea;milk),
     house2:beverage:(juice;tea;milk),
     house3:beverage:(juice;tea;milk)).
\end{verbatim}
%    
The subsumption hierarchy is shown pictorially in
Figure~\ref{inher-fig}.  The type, {\tt background}, with the
assistance of the types subsumed by {\tt house} and {\tt descriptor},
represents the situation of three houses (the features of {\tt
  background}), each of which has three attributes (the features of
{\tt house}).  The implicit constraints levied by the situation appear
as constraints on the type, {\tt background}, namely that no two
houses can have the same value for any attribute.  These are
represented by inequations.  But notice that, since we are interested
in treating the values of attributes as tokens, we must represent
those values by extensional types.  If we had not done this, then we
could still, for example, have two different houses with the beverage,
{\tt juice}, since there could be two different feature structures of
type {\tt juice} that were not token-identical.  Notice also that all
of these types are maximal, and hence satisfy the restriction that
{\sc ale} places on extensional types.

The explicit constraints provided by the clues to the puzzle are
represented as constraints on the type {\tt clue}, a subtype of {\tt
background}.  We also need a subtype of {\tt clue}, {\tt maximality},
to enforce another constraint implicit to all constraint puzzles,
namely the one which requires that we provide only maximally specific
answers, rather than vague solutions which say, for example, that the
beverage for the third house is a type of beverage ({\tt bev\_type}),
which may actually still satisfy a puzzle's constraints.

To solve the puzzle, we simply type:
%
\begin{verbatim}
| ?- mgsat maximality.

MOST GENERAL SATISFIER OF: maximality

maximality
HOUSE1 house
       ANIMAL fox
       BEVERAGE juice
       NATIONALITY norwegian
HOUSE2 house
       ANIMAL zebra
       BEVERAGE tea
       NATIONALITY ukranian
HOUSE3 house
       ANIMAL dog
       BEVERAGE milk
       NATIONALITY spaniard


ANOTHER?  y.

no
| ?- 
\end{verbatim}
%
So the Ukranian owns the zebra, and the Norwegian drinks juice.  A
most general satisfier of {\tt maximality} will also satisfy the
constraints of its supertypes, {\tt background} and {\tt clue}.
%
\begin{figure}
\setlength{\unitlength}{.5cm}
%\begin{center}
   \begin{picture}(30,15)(0,1)
     \put(16,1){\makebox(0,0){\type{$\bot$}}}
%
     \put(13,6){\makebox(0,0){\type{descriptor}}}
%
     \put(4,6){\makebox(0,0){\type{house}}}
     \put(4,5){\makebox(0,0)[r]{\feat{nationality}}}
     \put(4,4){\makebox(0,0)[r]{\feat{animal}}}
     \put(4,3){\makebox(0,0)[r]{\feat{beverage}}}
%
     \put(25,6){\makebox(0,0){\type{background}}}
     \put(25,5){\makebox(0,0)[l]{\feat{house1}}}
     \put(25,4){\makebox(0,0)[l]{\feat{house2}}}
     \put(25,3){\makebox(0,0)[l]{\feat{house3}}}
%
     \put(6,10){\makebox(0,0){\type{nat-type}}}
     \put(11,10){\makebox(0,0){\type{ani-type}}}
     \put(16,10){\makebox(0,0){\type{bev-type}}}
     \put(25,10){\makebox(0,0){\type{clue}}}
%
     \put(7,12){\makebox(0,0){\it{spaniard}}}
     \put(7,13){\makebox(0,0){\it{ukranian}}}
     \put(7,14){\makebox(0,0){\it{norwegian}}}
%
     \put(12,12){\makebox(0,0){\it{dog}}}
     \put(12,13){\makebox(0,0){\it{fox}}}
     \put(12,14){\makebox(0,0){\it{zebra}}}
%
     \put(25,14){\makebox(0,0){\type{maximality}}}
%
     \put(17,12){\makebox(0,0){\it{tea}}}
     \put(17,13){\makebox(0,0){\it{milk}}}
     \put(17,14){\makebox(0,0){\it{orange-juice}}}
%
     \put(16,2){\line(-4,1){12}}
     \put(16,2){\line(-1,1){3}}
     \put(16,2){\line(3,1){9}}
%
     \put(13,7){\line(-3,1){6}}
     \put(13,7){\line(-1,1){2}}
     \put(13,7){\line(1,1){2}}
%
     \put(25,7){\line(0,1){2}}
     \put(25,11){\line(0,1){2}}
%
     \put(4,10){\line(0,1){4}}
     \put(4,10){\line(1,0){.2}}
     \put(4,12){\line(1,0){.4}}
     \put(4,13){\line(1,0){.4}}
     \put(4,14){\line(1,0){.4}}
%
     \put(10.3,10.5){\line(0,1){3.5}}
     \put(10.3,12){\line(1,0){.4}}
     \put(10.3,13){\line(1,0){.4}}
     \put(10.3,14){\line(1,0){.4}}
%
     \put(14,10){\line(0,1){4}}
     \put(14,10){\line(1,0){.1}}
     \put(14,12){\line(1,0){.4}}
     \put(14,13){\line(1,0){.4}}
     \put(14,14){\line(1,0){.4}}
%
   \end{picture}
%\end{center}
\caption{Inheritance Network for the Zebra Puzzle.}
\label{inher-fig}
\end{figure}%
%

Although {\sc ale} is capable of representing such puzzles and solving
them, it is not actually very good at solving them efficiently.  To
solve such puzzles efficiently, a system must determine an optimal
order in which to satisfy all of the various constraints.  {\sc ale}
does not do this since it can express definite clauses in its
constraints, and the reordering would also be very difficult for the
user to keep track of while designing a grammar.  A system that does
do this is the general constraint resolver proposed by Penn and
Carpenter (1993)%
%
\footnote{This system was actually the precursor to {\sc ale}.
It implemented a completely reversible constraint-based
parser/generator with a weighting on the constraints based on their
maximal non-determinism.  Re-ordering constraints, however, proved to
be insufficient for efficient parsing or generation, compared to a
uni-directional system such as {\sc ale}.}.
%

\mychapter{Definite Clauses}

The next two sections, covering the constraint logic programming
and phrase structure components of {\sc ale}, simply describe how to
write {\sc ale} programs and how they will be executed.  Discussion of
interacting with the system itself follows the description of the
programming language {\sc ale} provides.

The definite logic programming language built into {\sc ale} is a
constraint logic programming ({\sc clp}) language, where the
constraint system is the attribute-value logic described above.  Thus,
it is very closely related to both Prolog and {\sc login}.  Like
Prolog, definite clauses may be defined with disjunction, negation and
cut.  The definite clauses of {\sc ale} are executed in a
depth-first, left to right search, according to the order of clauses
in the database.  {\sc ale} performs last call optimization, but does
not perform any clause indexing.%
%
\footnote{Thus, additional cuts might be necessary to ensure
determinism, so that last call optimization is effective.}
%
Those familiar with Prolog should have no trouble adapting that
knowledge to programming with definite clauses in {\sc ale}.  The only
significant difference is that first-order terms are replaced with
descriptions of feature structures.

While it is not within the scope of this user's guide to detail the
logic programming paradigm, much less {\sc clp}, this section will
explain all that the user familiar with logic programming needs to
know to exploit the special features of {\sc ale}.  For background,
the user is encouraged to consult Sterling and Shapiro (1986) with
regard to general logic programming techniques, most of which are
applicable in the context of {\sc ale}, and A\"{\i}t-Kaci and Nasr
(1986) for more details on programming with sorted feature structures.
For more advanced material on programming in Prolog with a compiler,
see O'Keefe (1990).  The general theory of {\sc clp} is developed in a
way compatible with {\sc ale} in H\"ohfeld and Smolka (1988).  Of
course, since {\sc ale} is literally an implementation of the theory
found in Carpenter (1992), the user is strongly encouraged to consult
Chapter 14 of that book for full theoretical details.

The syntax of {\sc ale}'s logic programming component is broadly
similar to that of Prolog, with the only difference being that
first-order terms are replaced with attribute-value logic
descriptions.  The language in which clauses are expressed in {\sc
ale} is given in {\sc bnf} as:
%
\begin{verbatim}
  <clause> ::= <literal> if <goal>.

  <literal> ::=  <pred_sym>
              |  <pred_sym>(<seq(desc)>)

  <goal> ::= true
           | <literal>
           | (<goal>,<goal>)
           | (<goal>;<goal>)
           | (<desc> =@ <desc>)
           | (<cut_free_goal> -> <goal>)
           | (<cut_free_goal> -> <goal> ; <goal>)
           | !
           | (\+ <goal>)
           | prolog(<prolog_goal>)
\end{verbatim}
%
Just as in Prolog, predicate symbols must be Prolog atoms.  This is a
more restricted situation than the definite clause language discussed
in Carpenter (1992), where literals were also represented as feature
structures and described using attribute-value logic.  Also note that
{\sc ale} requires every clause to have a body, which might simply be
the goal {\tt true}.  There must be whitespace around the {\tt if}
operator, but none is required around the conjunction comma, the
disjunction semicolon, the cut or shallow cut symbols {\tt !,->}, or
the unprovability symbol {\tt $\setminus$+}.  Parentheses, in general,
may be dropped and reconstructed based on operator precedences.  The
precedence is such that the comma binds more tightly than the
semicolon, while the unprovability symbol binds the most tightly.
Both the semicolon and comma are right associative.

The operational behavior of {\sc ale} is nearly identical to Prolog
with respect to goal resolution.  That is, it evaluates a sequence of
goals depth-first, from the left to right, using the order of clauses
established in the program.  The only difference arises from the fact
that, in Prolog, terms cannot introduce non-determinism.  In {\sc
ale}, due to the fact that disjunctions can be nested inside of
descriptions, additional choice points might be created both in
matching literals against the heads of clauses and in expanding the
literals within the body of a clause.  In evaluating these choices,
{\sc ale} maintains a depth-first left to right strategy.  

We begin with a simple example, the {\tt member/2} predicate:%
%
\footnote{As in Prolog, we refer to predicates by their name and
arity.} 
%
\begin{verbatim}
  member(X,hd:X) if 
    true.
  member(X,tl:Xs) if 
    member(X,Xs).
\end{verbatim}
%
As in Prolog, {\sc ale} clauses may be read logically, as
implications, from right to left.  Thus the first clause above states
that {\tt X} is a member of a list if it is the head of a
list\label{IF}.  The second clause states that {\tt X} is a member of
a list if {\tt X} is a member of the tail of the list, {\tt Xs}.  Note
that variables in {\sc ale} clauses are used the same way as in
Prolog, due to the notational convention of our description language.
Further note that, unlike Prolog, {\sc ale} requires a body for every
clause.  In particular, note that the first clause above has the
trivial body {\tt true}\label{TRUE}.  The compiler is clever enough to
remove such goals at compile time, so they do not incur any run-time
overhead.

Given the notational convention for lists built into {\sc ale}, the
above program could equivalently be written as:
%
\begin{verbatim}
  member(X,[X|_]) if
    true.
  member(X,[_|Xs]) if
    member(X,Xs).
\end{verbatim}
%
But recall that {\sc ale} would expand {\tt [X|\_]} as {\tt
(hd:X,tl:\_)}.  Not only does {\sc ale} not support anonymous variable
optimizations, it also creates a conjunction of two descriptions,
where {\tt hd:X} would have sufficed.  Thus the first method is not
only more elegant, but also more efficient.  

Due to the fact that lists have little hierarchical structure, list
manipulation predicates in {\sc ale} look very much like their
correlates in Prolog.  They will also execute with similar
performance.  But when the terms in the arguments of literals have
more interesting taxonomic structure, {\sc ale} actually provides a
gain over Prolog's evaluation method, as pointed out by A\"{\i}t-Kaci
and Nasr (1986).  Consider the following fragment drawn from the
syllabification grammar in the appendix, in which there is a
significant interaction between the inheritance hierarchy and the
definite clause {\tt less\_sonorous/2}:
%
\begin{verbatim}
  segment sub [consonant,vowel].
    consonant sub [nasal,liquid,glide].
      nasal sub [n,m].
        n sub [].
        m sub [].
      liquid sub [l,r].
        l sub [].
        r sub [].
      glide sub [y,w].
        y sub [].
        w sub [].
    vowel sub [a,e,i]
      a sub [].
      e sub [].
      i sub [].

  less_sonorous_basic(nasal,liquid) if true.
  less_sonorous_basic(liquid,glide) if true.
  less_sonorous_basic(glide,vowel) if true.

  less_sonorous(L1,L2) if
    less_sonorous_basic(L1,L2).
  less_sonorous(L1,L2) if
    less_sonorous_basic(L1,L3),
    less_sonorous(L3,L2).
\end{verbatim}
%
For instance, the third clause of {\tt less\_sonorous\_basic/2}, being
expressed as a relation between the types {\tt glide} and {\tt vowel},
allows solutions such as {\tt less\_sonorous\_basic(w,e)}, where {\tt
glide} and {\tt vowel} have been instantiated as the particular
subtypes {\tt w} and {\tt e}.  This fact would not be either as
straightforward or as efficient to code in Prolog, where relations
between the individual letters would need to be defined.  The loss in
efficiency stems from the fact that Prolog must either code all 14
pairs represented by the above three clauses and type hierarchy, or
perform additional logical inferences to infer that {\tt w} is a
glide, and hence less sonorous than the vowel {\tt e}.  {\sc ale}, on
the other hand, performs these operations by unification, which, for
types, is a simple table look-up.%
%
\footnote{Table look-ups involved in unification in {\sc ale} rely on
double hashing, once for the type of each structure being unified.}
%
All in all, the three clauses for {\tt less\_sonorous\_basic/2} given
above represent relations between 14 pairs of letters.  Of course, the
savings is even greater when considering the transitive closure of
{\tt less\_sonorous\_basic/2}, given above as {\tt less\_sonorous/2}, and
would be greater still for a type hierarchy involving a greater degree
of either depth or branching.  

While we do not provide examples here, suffice it to say that
cuts\label{CUT}, shallow cuts\label{SHALLOW},
negation\label{NEGATION}, conjunction\label{COMMADCL} and
disjunction\label{SEMIDCL} work exactly the same as they do in Prolog.
In particular, cuts conserve stack space representing backtracking
points, disjunctions create choice points and negation is evaluated by
failure, with the same results on binding as in Prolog.

The definite clause language also allows arbitrary prolog goals, using
the predicate, {\tt prolog(<prolog\_goal>)}\label{PROLOG}.  This is
perhaps most useful when used with the Prolog predicates, {\tt
assert}\label{ASSERT} and {\tt retract}\label{RETRACTPRO}, which provide
{\sc ale} users with access to the Prolog database, and with I/O
statements, which can be quite useful for debugging definite clauses.

Should {\tt prolog\_goal} contain a variable that has been
instantiated to an {\sc ale} feature structure, this will appear to
Prolog as {\sc ale}'s internal representation of that feature
structure.  Feature structures can be asserted and retracted, however,
without regard to their internal structure.  The details of {\sc
ale}'s internal representation of feature structures can be found in
Carpenter and Penn (1996), and briefly on p.~\pageref{INTERNAL-REP}.

Another side effect of not directly attaching inequations to feature
structures is that if a feature structure with inequations is asserted
and a copy of it is later instantiated from the Prolog database or
retracted, the copy will have lost the inequations.  In general,
passing feature structures with inequations to Prolog hooks should be
avoided.

Because the enforcement of extensionality is delayed in
{\sc ale}, a variable which is instantiated to an extensionally typed
feature structure and then passed to a prolog hook may also not
reflect token identities as a result of extensionality.  Provided that
there are no inequations (to which the user does not have direct
access), this can be enforced within the hook by calling the {\sc ale}
internal predicate {\tt extensionalise(FS,[])}.

There is a special literal predicate, {\tt =@}\label{EQAT}, used with
infix notation, which is true when its arguments are token-identical.
As with inequations, which forbid token-identity, the {\tt =@}
operator is of little use unless multiply occurring variables are used
in its arguments' descriptions.  Note, however, that while inequations
(\verb+=\=+) and path equations (\verb+==+) are part of the description
language, {\tt =@} is a definite clause predicate, and cannot be used
as a description.  It is more important to note that while the
negation of the structure-identity operator (\verb+==+), namely the
inequation (\verb+=\=+), is monotonic when interpreted persistently,
the negation of the token-identity operator ({\tt =@}), achieved by
using it inside the argument of the \verb1\+1 operator, is
non-monotonic, and thus its use should be avoided.

It is significant to note that clauses in {\sc ale} are truly definite
in the sense that only a single literal is allowed as the {\em head}
of a clause, while the {\em body} can be a general goal.  In
particular, disjunctions in descriptions of the arguments to the head
literal of a clause are interpreted as taking wide scope over the
entire clause, thus providing the effect of multiple solutions rather
than single disjunctive solutions.  The most simple example of this
behavior can be found in the following program:
%
\begin{verbatim}
  foo((b;c)) if true.

  bar(b) if true.

  baz(X) if foo(X), bar(X).
\end{verbatim}
%
Here the query {\tt foo(X)} will provide two distinct solutions, one
where {\tt X} is of type {\tt b}, and another where it is of type {\tt
c}.  Also note that the queries {\tt foo(b)} and {\tt foo(c)} will
succeed.  Thus the disjunction is equivalent to the two single clauses:
%
\begin{verbatim}
  foo(b) if true.
  foo(c) if true.
\end{verbatim}
%
In particular, note that the query {\tt baz(X)} can be solved, with
{\tt X} instantiated to an object of type {\tt b}.  In general, using
embedded disjunctions will usually be more efficient than using multiple
clauses in {\sc ale}, especially if the disjunctions are deeply nested
late in the description.  On the other hand, cuts can be inserted for
control with multiple clauses, making them more efficient in some
cases.

\mysection{Type Constraints Revisited}

The type constraints mentioned in the last chapter can also
incorporate relational constraints defined by definite clauses, with
the optional operator {\tt goal}\label{GOAL}.  Consider the following
example from {\sc hpsg}:
%
\begin{verbatim}
  word cons W
       goal (single_rel_constraint(W),
             clausal_rel_prohibition(W)).
\end{verbatim}
%
In this example, the constraint from the description language is
simply the variable {\tt W}, which is used to match any feature
structure of type {\tt word}.  That feature structure is then passed
as an argument to the two procedural attachments {\tt
single\_rel\_constraint/1} and {\tt clausal\_rel\_prohibition/1},
which each represent a principle from {\sc hpsg} which governs words
(among other objects).  Notice that the goal, when non-literal, must
occur within parentheses.

While every type constraint must have a description, procedural
attachments are optional.  If they do occur, they occur after the
description.  The syntax is given in {\sc bnf} as:
%
\begin{verbatim}
  <cons_spec> ::= <type> cons <desc>
                | <type> cons <desc>
                         goal <goal>
\end{verbatim}


\mychapter{Phrase Structure Grammars}

The {\sc ale} phrase structure processing component is loosely based
on a combination of the functionality of the {\sc patr-ii} system and
the {\sc dcg} system built into Prolog.  Roughly speaking, {\sc ale}
provides a system like that of {\sc dcg}s, with two primary
differences.  The first difference stems from the fact that {\sc ale}
uses attribute-value logic descriptions of typed feature structures
for representing categories and their parts, while {\sc dcg}s use
first-order terms (or possibly cyclic variants thereof).  The second
primary difference is that {\sc ale}'s parser uses a bottom-up active
chart parsing algorithm and a semantic-head-driven generator rather
than encoding grammars directly as Prolog clauses and evaluating them
top-down and depth-first.  In the spirit of {\sc dcg}s, {\sc ale}
allows definite clause procedures to be attached and evaluated at
arbitrary points in a phrase structure rule, the difference being that
these rules are given by definite clauses in {\sc ale}'s logic
programming system, rather than directly in Prolog.

Phrase structure grammars come with two basic components, one for
describing lexical entries and empty categories, and one for
describing grammar rules.  We consider these components in turn.

\mysection{Lexical Entries}

Lexical entries\label{ARROW} in {\sc ale} are specified as rewriting
rules, as given by the following {\sc bnf} syntax:
%
\begin{verbatim}
  <lex_entry> ::=  <word> ---> <desc>.
\end{verbatim}
%
For instance, in the categorial grammar lexicon in the appendix, the
following lexical entry is provided, along with the relevant macros:
%
\begin{verbatim}
  john ---> 
    @ pn(j).

  pn(Name) macro
    synsem: @ np(Name),
    @ quantifier_free.

  np(Ind) macro
    syn:np,
    sem:Ind.

  quantifier_free macro
    qstore:[].
\end{verbatim}
%
Read declaratively, this rule says that the word {\tt john} has as its
lexical category the most general satisfier of the description
{\tt @ pn(j)}, which is:
%
\begin{verbatim}
  cat
  SYNSEM basic
         SYN np
         SEM j
  QSTORE e_list
\end{verbatim}
%
Note that this lexical entry is equivalent to that given without
macros by: 
%
\begin{verbatim}
  john --->
    synsem:(syn:np,
            sem:j),
    qstore:e_list.
\end{verbatim}
%
Macros are useful as a method of organizing lexical information to
keep it consistent across lexical entries.  The lexical entry for the
word {\tt runs} is:
%
\begin{verbatim}
  runs ---> @ iv((run,runner:Ind),Ind).

  iv(Sem,Arg) macro
    synsem:(backward,
            arg: @ np(Arg),
            res:(syn:s,
                 sem:Sem)),
    @ quantifier_free.
\end{verbatim}
%
This entry uses nested macros along with structure sharing, and
expands to the category:
%
\begin{verbatim}
  cat
  SYNSEM backward
         ARG synsem
             SYN np
             SEM [0] sem_obj
         RES SYN s
             SEM run
                 RUNNER [0]
  QSTORE e_list
\end{verbatim}
%
It also illustrates how macro parameters are in fact treated as
variables.  

Multiple lexical entries may be provided for each word.  Disjunctions
may also be used in lexical entries, but are expanded out at
compile-time.  Thus the first three lexical entries, taken together,
compile to the same result as the fourth:
%
\begin{verbatim}
  bank ---> 
    syn:noun, 
    sem:river_bank.
  bank ---> 
    syn:noun, 
    sem:money_bank.
  bank ---> 
    syn:verb, 
    sem:roll_plane.

  bank ---> 
    ( syn:noun, 
      sem:( river_bank
          ; money_bank
          )
    ; syn:verb, 
      sem:roll_plane
    ).
\end{verbatim}
%
Note that this last entry uses the standard Prolog layout conventions
of placing each conjunct and disjunct on its own line, with commas at
the end of lines, and disjunctions set off with vertically aligned
parentheses at the beginning of lines.

The compiler finds all the most general satisfiers for lexical entries
at compile time, reporting on those lexical entries that have
unsatisfiable descriptions.  In the above case of {\tt bank}, the
second combined method is marginally faster at compile-time, but their
run-time performance is identical.  The reason for this is that both
entries have the same set of most general satisfiers.

{\sc ale} supports the construction of large lexica, as it relies on
Prolog's hashing mechanism to actually look up a lexical entry for a
word during bottom-up parsing.  For generation, {\sc ale} indexes
lexical entries for faster unification, as described in Penn and
Popescu (1997).  Constraints on types can also be used to enforce
conditions on lexical representations, allowing for further
factorization of information.

\mysection{Empty Categories}

{\sc ale} allows the user to specify certain categories as occurring
without any corresponding surface string.  These are usually referred
to somewhat misleadingly as {\em empty categories}, or sometimes as
{\em null productions}.  In {\sc ale}, they are supported by a special
declaration of the form\label{EMPTYSIG}:
%
\begin{verbatim}
  empty <desc>.
\end{verbatim}
%
Where {\tt <desc>} is a description of the empty category.  

For example, a common treatment of bare plurals is to hypothesize an
empty determiner.  For instance, consider the contrast between the
sentences {\em kids overturned my trash cans} and {\em a kid
overturned my trash cans}.  In the former sentence, which has a plural
subject, there is no corresponding determiner.  In our categorial
grammar, we might assume an empty determiner with the following
lexical entry (presented here with the macros expanded):
%
\begin{verbatim}
  empty @ gdet(some).

  gdet(Quant) macro
    synsem:(forward,
            arg:(syn:(n,
                      num:plu),
                 sem:(body:Restr,
                      ind:Ind)),
            res:(syn:(np,
                      num:plu),
                 sem:Ind),
    qstore:[ (Quant,
              var:Ind,
              restr:Restr) ].
\end{verbatim}
%
Of course, it should be noted that this entry does not match the
type system of the categorial grammar in the appendix, as it assumes a
number feature on nouns and noun phrases.

Empty categories are expensive to compute under a bottom-up parsing
scheme such as the one used in {\sc ale}.  The reason for this is that
these categories can be used at every position in the chart during
parsing (with the same begin and end points).  If the empty categories
cause local structural ambiguities, parsing will be slowed down
accordingly as these structures are calculated and then propagated.
Consider the empty determiner given above.  It can be used as an
inactive edge at every node in the chart, then match the forward
application rule scheme and search through every edge to its right
looking for a nominal complement.  If there are relatively few
nouns in a sentence, not many noun phrases will be created by this
rule and thus not many structural ambiguities will propagate.  But in
a sentence such as {\em the kids like the toys}, there will be an edge
spanning {\em kids like the toys} corresponding to an empty determiner
analysis of {\em kids}.  The corresponding noun phrase created
spanning {\em toys} will not propagate any further, as there is no way
to combine a noun phrase with the determiner {\em the}.  But now
consider the empty slash categories of form $X/X$ in {\sc gpsg}.
These categories, when coupled with the slash passing rules, would
roughly double parsing time, even for sentences that can be analyzed
without any such categories.  The reason is that these empty
categories are highly underspecified and thus have many options for
combinations.  Thus empty categories should be used sparingly, and
prefarably in environments where their effects will not propagate.

Another word of caution is in order concerning empty categories: they
can occur in constructions with other empty categories.  For instance,
if we specify categories $C_{1}$ and $C_{2}$ as empty categories, and
have a rule that allows a $C$ to be constructed from a $C_{1}$ and a
$C_{2}$, then $C$ will act as an empty category, as well.  These
combinations of empty categories are computed at compile-time; but the
sheer number of empty categories produced under this closure may be a
processing burden if they apply at run-time too productively.  Keep in
mind that {\sc ale} computes all of the inactive edges that can be
produced from a given input string, so there is no way of eliminating
the extra work produced by empty categories interacting with other
categories, including empty ones.


\mysection{Lexical Rules}

Lexical rules provide a mechanism for expressing redundancies in the
lexicon, such as the kinds of inflectional morphology used for word
classes, derivational morphology as found with suffixes and prefixes,
as well as zero-derivations as found with detransitivization,
nominalization of some varieties and so on.  The format {\sc ale}
provides for stating lexical rules is similar to that found in both 
{\sc patr-ii} and {\sc hpsg}.  

In order to implement them efficiently, lexical rules, as well as
their effects on lexical entries, are compiled in much the same way as
grammars.  To enhance their power, lexical rules, like grammar rules,
allow arbitrary procedural attachment with {\sc ale} definite
constraints.  

The lexical rule system of {\sc ale} is productive in that it allows
lexical rules to apply sequentially to their own output or the output
of other lexical rules.  Thus, it is possible to derive the nominal
{\em runner} from the verb {\em run}, and then derive the plural
nominal {\em runners} from {\em runner}, and so on.  At the same time,
the lexical system is leashed to a fixed depth-bound, which may be
specified by the user.  This bound limits the number of rules that can
be applied to any given category.  The bound on application of rules
is specified by a command such as the following, which should appear
somewhere at the beginning of the input file\label{LEXRULEDEPTH}:
%
\begin{verbatim}
  :-lex_rule_depth(2).
\end{verbatim}
%
Of course, bounds other than {\tt 2} can be used.  The bound indicates
how many applications of lexical rules can be made, and may be {\tt
0}.  If there is more than one such specification in an input file,
the last one will be the one that is used.  If no specification
is given, the default is {\tt 2}.

The format for lexical rules is as follows\label{LEXRULESIG}:
%
\begin{verbatim}
  <lex_rule> ::= <lex_rule_name> lex_rule <lex_rewrite> 
                                 morphs <morphs>.

  <lex_rewrite> ::= <desc> **> <desc>
                  | <desc> **> <desc> if <goal>

  <morphs> ::= <morph>
             | <morph>, <morphs>

  <morph> ::= (<string_pattern>) becomes (<string_pattern>)
            | (<string_pattern>) becomes (<string_pattern>) 
                                 when <prolog_goal>

  <string_pattern> ::= <atomic_string_pattern>
                     | <atomic_string_pattern>, <string_pattern>
  
  <atomic_string_pattern> ::=  <atom>
                            |  <var>
                            |  <list(<var_char>)>
  <var_char> ::= <char>
               | <var>
\end{verbatim}
%
An example of a lexical rule with almost all of the bells and whistles
(we put off procedural attachment for now) is:
%
\begin{verbatim}
plural_n lex_rule  
  (n,
   num:sing)
   **> (n,
        num:plu)
  morphs
    goose becomes geese,
    [k,e,y] becomes [k,e,y,s],
    (X,man) becomes (X,men),
    (X,F) becomes (X,F,es) when fricative(F),
    (X,ey) becomes (X,[i,e,s]),
    X becomes (X,s).

  fricative([s]).
  fricative([c,h]).
  fricative([s,h]).
  fricative([x]).
\end{verbatim}
%
We will use this lexical rule to explain the behavior of the lexical
rule system.  First note that the name of a lexical rule, in this case
{\tt plural\_n}, must in general be a Prolog atom.  Further note that
the top-level parentheses around both the descriptions and the
patterns are necessary.  If the Prolog goal, in this case {\tt
fricative(F)}, had been a complex goal, then it would need to be
parenthesized as well.  The next thing to note about the lexical rule
is that there are two descriptions --- the first describes the input
category to the rule, while the second describes the output category.
These are arbitrary descriptions, and may contain disjunctions,
macros, etc.  We will come back to the clauses for {\tt fricative/1}
shortly.  Note that the patterns in the morphological component are
built out of variables, sequences and lists.  Thus a simple rewriting
can be specified either using atoms as with {\tt goose} above, with a
list, as in {\tt [k,e,y]}, or with a sequence as in {\tt (X,man)}, or
with both, as in {\tt (X,[i,e,s])}.  The syntax of the morphological
operations is such that in sequences, atoms may be used as a shorthand
for lists of characters.  But lists must consist of variables or
single characters only.  Thus we could not have used {\tt (X,[F])} in
the fricative case, as {\tt F} might is itself a complex list such as
{\tt [s,h]} or {\tt [x]}.  But in general, variables ranging over
single characters can show up in lists.

The basic operation of a lexical rule is quite simple.  First, every
lexical entry, including a word and a category, that is produced
during compilation is checked to see if its category satisfies the
input description of a lexical rule.  If it does, then a new category
is generated to satisfy the output description of the lexical rule, if
possible.  Note that there might be multiple solutions, and all
solutions are considered and generated.  Thus multiple solutions to
the input or output descriptions lead to multiple lexical entries.

After the input and output categories have been computed, the word of
the input lexical entry is fed through the morphological analyzer to
produce the corresponding output word.  Unlike the categorial component
of lexical rules, only one output word will be constructed, based on
the first input/output pattern that is matched.%
%
\footnote{Thus {\sc ale}'s lexical rule system is not capable of
handling cases of partial suppletion, where both a regular and
irregular morphological form are both allowed.  To allow two ouptut
forms, one must be coded by hand with its own lexical entry or a
separate lexical rule.}
%
The input word is matched against the patterns on the left hand side
of the morphological productions.  When one is found that the input
word matches, any condition imposed by a {\tt when} clause on the
production is evaluated.  This ordering is imposed so that the Prolog
goal will have all of the variables for the input string instantiated.
At this point, Prolog is invoked to evaluate the {\tt when} clause.
In the most restricted case, as illustrated in the above lexical rule,
Prolog is only used to provide abbreviations for classes.  Thus the
definition for {\tt fricative/1} consists only of unit clauses.  For
those unfamiliar with Prolog, this strategy can be used in general for
simple morphological abbreviations.  Evaluating these goals requires
the {\tt F} in the input pattern to match one of the strings given.
The shorthand of using atoms for the lists of their characters only
operates within the morphological sequences.  In particular, the
Prolog goals do not automatically inherit the ability of the lexical
system to use atoms as an abbreviation for lists, so they have to be
given in lists.  Substituting {\tt fricative(sh)} for {\tt
fricative([s,h])} would not yield the intended interpretation.
Variables in sequences in morphological productions will always be
instantiated to lists, even if they are single characters.  For
instance, consider the lexical rule above with every atom written out
as an explicit list:
%
\begin{verbatim}
    [g,o,o,s,e] becomes [g,e,e,s,e],
    [k,e,y] becomes [k,e,y,s],
    (X,[m,a,n]) becomes (X,[m,e,n]),
    (X,F) becomes (X,F,[e,s]) when fricative(F),
    (X,[e,y]) becomes (X,[i,e,s]),
    X becomes (X,[s]).
\end{verbatim}
%
In this example, the {\tt s} in the final production is given as a
list, even though it is only a single character.  

The morphological productions are considered one at a time until one
is matched.  This ordering allows a form of suppletion, whereby
special forms such as those for the irregular plural of {\tt goose}
and {\tt key} to be listed explicitly.  It also allows
subregularities, such as the rule for fricatives above, to override
more general rules.  Thus the input word {\tt beach} becomes {\tt
beaches} because {\tt beach} matches {\tt (X,F)} with {\tt X =
[b,e,a]} and {\tt F = [c,h]}, the goal {\tt fricative([c,h])} succeeds
and the word {\tt beaches} matches the output pattern {\tt
(X,F,[e,s])}, instantiated after the input is matched to {\tt
([b,e,a],[c,h],[e,s])}.  Similarly, words that end in {\tt [e,y]} have
this sequence replaced by {\tt [i,e,s]} in the plural, which is why an
irregular form is required for {\tt keys}, which would otherwise match
this pattern.  Finally, the last rule matches any input, because it is
just a variable, and the output it produces simply suffixes an {\tt
[s]} to the input.  

For lexical rules with no morphological effect, the production:
%
\begin{verbatim}
  X becomes X
\end{verbatim}
%
suffices.  To allow lexical operations to be stated wholly within
Prolog, a rule may be used such as the following:
%
\begin{verbatim}
  X becomes Y when morph_plural(X,Y)
\end{verbatim}
%
In this case, when {\tt morph\_plural(X,Y)} is called, {\tt X} will be
instantiated to the list of the characters in the input, and as a
result of the call, {\tt Y} should be instantiated to a ground list of
output characters.

We finally turn to the case of lexical rules with procedural
attachments, as in the following (simplified) example from {\sc hpsg}:
%
\begin{verbatim}
extraction lex_rule
  local:(cat:(head:H,
              subcat:Xs),  
         cont:C),
  nonlocal:(to_bind:Bs,
            inherited:Is)
  **> local:(cat:(head:H,
                  subcat:Xs2),
             cont:C),
      nonlocal:(to_bind:Bs,
                inherited:[G|Is])
  if 
    select(G,Xs,Xs2)
  morphs
    X becomes X.

  select(X,(hd:X),Xs) if true.
  select(X,[Y|Xs],[Y|Ys]) if
    select(X,Xs,Ys).
\end{verbatim}
%
This example illustrates an important point other than the use of
conditions on categories in lexical rules.  The point is that even
though only the {\tt LOCAL CAT SUBCAT} and {\tt NONLOCAL INHERITED}
paths are affected, information that stays the same must also be
mentioned.  For instance, if the {\tt cont:C} specification had been
left out of either the input our output category description, then the
output category of the rule would have a completely unconstrained
content value.  This differs from the default-based nature of the
usual presentation of lexical rules, which assumes all information
that hasn't been explicitly specified is shared between the input and
the output.  As another example, we must also specify that the {\tt
HEAD} and {\tt TO\_BIND} features are to be copied from the input to
the output; otherwise there would be no specification of them in the
output of the rule.  This fact follows from the description of the
application of lexical rules: they match a given category against the
input description and produce the most general category(s) matching
the output description.

Turning to the use of conditions in the above rule, the {\tt select/3}
predicate is defined so that it selects its first argument as a list
member of its second argument, returning the third argument as the
second argument with the selected element deleted.  In effect, the
above lexical rule produces a new lexical entry which is like the
original entry, except for the fact that one of the elements on the
subcat list of the input is removed from the subcat list and added to
the inherited value in the output.  Nothing else changes.

Procedurally, the definite clause is invoked after the lexical rule
has matched the input description against the input category.  Like
the morphological system, this control decision was made to ensure
that the relevant variables are instantiated at the time the condition
is resolved.  The condition here can be an arbitrary goal, but if it
is complex, there should be parentheses around the whole thing.  Cuts
should not be used in conditions on lexical rules (see the comments on
cuts in grammar rules below, which also apply to cuts in lexical
rules).

Currently, {\sc ale} does not check for redundancies or for entries
that subsume each other, either in the base lexicon or after closure
under lexical rules.  {\sc ale} also does not apply lexical rules to
empty categories.


\mysection{Grammar Rules}

Grammar rules in {\sc ale} are of the phrase structure variety, with
annotations for both goals that need to be solved and for
attribute-value descriptions of categories.  The {\sc bnf} syntax for
rules is as follows\label{RULESIG}:
%
\begin{verbatim}
  <rule> ::=  <rule_name> rule <desc> ===> <rule_body>.
 
  <rule_body> ::=  <rule_clause>
                |  <rule_clause>, <rule_body>

  <rule_clause> ::=  cat> <desc>
                  |  cats> <desc>
                  |  sem_head> <desc>
                  |  goal> <goal>
                  |  sem_goal> <goal>
\end{verbatim}
%
The {\tt <rule\_name>} must be a Prolog atom.  The description in the
rule is taken to be the mother category in the rule, while the rule
body specifies the daughters in the rule along with any side
conditions on the rule, expressed as {\sc ale} goals.  A further
restriction on rules, which is not expressed in the {\sc bnf} syntax
above, is that there must be at least one category-seeking rule clause
in each rule body.%
%
\footnote{By doubling the size of the {\sc bnf} for rules, this
requirement could be expressed.}
%
Thus empty productions are not allowed and will be
flagged as errors at compile time.

A simple example of such a rule, without any goals, is as follows:
%
\begin{verbatim}
  s_np_vp rule
  (syn:s,
   sem:(VPSem,
        agent:NPSem))
  ===>
  cat>
    (syn:np,
     agr:Agr,
     sem:NPSem),
  cat>
    (syn:vp,
     agr:Agr,
     sem:VPSem).
\end{verbatim}
%
There are a few things to notice about this rule.  The first is that
the parentheses around the category and mother descriptions are
necessary.  Looking at what the rule means, it allows the combination
of an {\tt np} category with a {\tt vp} type category if they have
compatible (unifiable) values for {\tt agr}.  It then takes the
semantics of the result to be the semantics of the verb phrase, with
the additional information that the noun phrase semantics fills the
agent role.

Unlike the {\sc patr-ii} rules, but similar to {\sc dcg} rules,
``unifications'' are specified by variable co-occurrence rather than
by path equations, while path values are specified using the colon rather
than by a second kind of path equation.  The rule above is similar to
a {\sc patr-ii} rule which would look roughly as follows:
%
\begin{verbatim}
  x0 ---> x1, x2 if
    (x0 syn) == s,
    (x1 syn) == np,
    (x2 syn) == vp,
    (x0 sem) == (x2 sem),
    (x0 sem agent) == (x1 sem),
    (x1 agr) == (x2 agr)
\end{verbatim}

Unlike lexical entries, rules are not expanded to feature structures
at compile-time.  Rather, they are compiled down into
structure-copying operations involving table look-ups for feature and
type symbols, unification operations for variables, sequencing for
conjunction, and choice point creation for disjunction.

The descriptions for {\tt cat>} and {\tt cats>} daughters are always
evaluated in the order they are specified, from left to right.  This
is significant when considering goals that might be interleaved with
searches in the chart for consistent daughter categories.  The order
in which the code for the mother's and semantic head's descriptions is
executed depends on the control strategy used during parsing or
generation.  These are described in Sections~\ref{PARSING}
and~\ref{GENERATION}, respectively.  In theory, the same grammar can
be used for both parsing and generation.  In practice, a single
grammar is rarely efficient in both directions, and can even exhibit
termination problems in one, just as a Prolog program may have these
problems with queries that have different argument instantiations.  So
while it is not necessary to fully understand the parsing or
generation algorithms used by {\sc ale} to exploit its power for
developing grammars, practical implementations will order their
procedural attachments and distribute their description-level
information with these algorithms in mind.

Within a single description, in the case of feature and type symbols,
a double-hashing is performed on the type of the structure being added
to, as well as either the feature or the type being added.  Additional
operations arise from type coercions that adding features or types
require.  Thus there is nothing like disjunctive normal-form
conversion of rules at compile time, as there is for lexical entries.
In particular, if there is a local disjunction in a rule, it will be
evaluated locally at run time.  For instance, consider the following
rule, which is the local part of {\sc hpsg}'s Schema 1:
%
\begin{verbatim}
  schema1 rule
  (cat:(head:Head,
        subcat:[]),
   cont:Cont)
  ===>
  cat> 
   (Subj,
    cat:head:( subst 
             ; spec:HeadLoc,
             )),
  cat>
   (HeadLoc,
    cat:(head:Head,
         subcat:[Subj]),
    cont:Cont).
\end{verbatim}
%
Note that there is a disjunction in the {\tt cat:head} value of the
first daughter category (the subject in this case).  This disjunction
represents the fact that the head value is either a substantive
category (one of type {\tt subst}), or it has a specifier value which
is shared with the entire second daughter.  But the choice between the
disjuncts in the first daughter of this rule is made locally, when the
daughter category is fully known, and thus does not create needless
rule instantiations.  

{\sc ale}'s general treatment of disjunction in descriptions, which is
an extension of Kasper and Round's (1986) attribute-value logic to
phrase structure rules, is a vast improvement over a system such as
{\sc patr-ii}, which would not allow disjunction in a rule, thus
forcing the user to write out complete variants of rules that only
differ locally.  Disjunctions in rules do create local choice points,
though, even if the first goal in the disjunction is the one that is
solvable.%
%
\footnote{In a future release, cuts will be allowed within
descriptions, to allow the user to eliminate disjunctive choice points
when possible.}
%
This is because, in general, both parts of a disjunction might be
consistent with a given category, and lead to two solutions.  Or one
disjunct might be discarded as inconsistent only when its variables are
further instantiated elsewhere in the rule.

\mysubsection{Procedural Attachments}

A more complicated rule, drawn from the categorial grammar in the
appendix is as follows:
%
\begin{verbatim}
  backward_application rule
  (synsem:Z,
   qstore:Qs)
  ===> 
  cat> 
    (synsem:Y,
     qstore:Qs1),
  cat> 
    (synsem:(backward,
             arg:Y,
             res:Z),
     qstore:Qs2),
  goal>
    append(Qs1,Qs2,Qs).
\end{verbatim}
%
Note that the goal in this rule is specified after the two category
descriptions.  Consequently, it will be evaluated after categories
matching the descriptions have already been found, thus ensuring in
this case that the variables {\tt Qs1} and {\tt Qs2} are instantiated.
The {\tt append(Qs1,Qs2,Qs)} goal is evaluated by {\sc ale}'s definite
clause resolution mechanism.  {\tt goal>} attachments are always
evaluated in the order they are specified relative to the enforcement
of {\tt cat>} and {\tt cats>} daughters, from left to right.  All
possible solutions to the goal are found with the resulting
instantiations carrying over to the rule.  These solutions are found
using the depth-first search built into {\sc ale}'s definite
constraint resolver.  In general, goals may be interleaved with the
category specifications, giving the user control over when the goals
are fired.  Also note that goals may be arbitrary {\em cut-free} {\sc
  ale} definite clause goals, and thus may include disjunctions,
conjunctions, and negations.  Cuts may occur, however, within the code
for any literal clause specified in a procedural attachment.  The
attachments themselves must be cut-free to avoid the cut taking
precedence over the entire rule after compilation, thus preventing the
rule to apply to other edges in the chart or for later rules to apply.
Instead, if cuts are desired in rules, they must be encapsulated in an
auxiliary predicate, which will restrict the scope of the cut.  For
instance, in the context of a phrase structure rule, rather than a
goal of the form:
%
\begin{verbatim}
  goal>
    (a, !, b)
\end{verbatim}
%
it is necessary to encode this as follows:
%
\begin{verbatim}
  goal>  
    c
\end{verbatim}
%
where the predicate {\tt c} is defined by:
\begin{verbatim}
  c if 
    (a, !, b).
\end{verbatim}
%
This prevents backtracking through the cut in the goal, but does not
block the further application of the rule.  A similar strategy should
be employed for cuts in lexical rules.  

As a programming strategy, rules should be formulated like Prolog
clauses, so that they fail as early as possible.  Thus the features
that discriminate whether a rule is applicable should occur first in
category descriptions.  The only work incurred by testing whether a
rule is applicable is up to the point where it fails.

Just as with {\sc patr-ii}, {\sc ale} is {\sc re}-complete
(equivalently, Turing-equivalent), meaning that any computable
language can be encoded.  Thus it is possible to represent undecidable
grammars, even without resorting to the kind of procedural attachment
possible with arbitrary definite clause goals.  With its mix of
depth-first and breadth-first evaluation strategies, {\sc ale} is not
strictly complete with respect to its intended semantics if an
infinite number of edges can be generated with the grammar.  This
situation is similar to that in Prolog, where a declaratively
impeccable program might hang operationally.  

\mysubsection{The {\tt cats>} Operator}

The {\tt cats>}\label{CATS} operator is used to describe a list of
daughters, whose length cannot be determined until run-time.
Daughters are not parsed or generated as quickly as part of a {\tt
  cats>} specification.  Note also the interpretation of {\tt cats>}
requires that its argument is subsumed by the type {\tt list}, which
must be defined, along with {\tt ne\_list}, {\tt e\_list}, etc., and
the features {\tt HD}, and {\tt TL}, which we defined above.  This
check is not made using unification, so that an underspecified list
argument will not work either.  If the argument of {\tt cats>} is not
subsumed by {\tt list}, then the rule in which that argument occurs
will never match any string, and a run-time error message will be
given.  This operator is useful for so-called ``flat'' rules, such as
{\sc hpsg}'s Schema 2, part of which is given (in simplified form)
below:
%
\begin{verbatim}
  schema2 rule
  (cat:(head:Head,
        subcat:[Subj]))
  ===>
  cat>
   (cat:(head:Head,
         subcat:[Subj|Comps])),
  cats> Comps.
\end{verbatim}
%
Since various lexical items have {\tt SUBCAT} lists of various
lengths, e.g., zero for proper nouns, one for intransitive verbs, two
for transitive verbs, {\tt cats>} is required in order to match the
actual list of complements at run-time.

It is common to require a goal to produce an output for the argument
of {\tt cats>}.  If this is done, the goal must be placed before the
{\tt cats>}.  Our use of {\tt cats>} is problematic in that we require
the argument of {\tt cats>} to evaluate to a list of fixed length.
Thus parsing with the following head-final version of {\sc hpsg}'s
Schema 2 would not work:
%
\begin{verbatim}
  schema2 rule
  (cat:(head:Head,
        subcat:[SubjSyn]))
  ===>
  cats> Comps,
  cat>
   (cat:(head:Head,
         subcat:[Subj|Comps])).
\end{verbatim}
%
One way to work around this is to place some finite upper bound on the
size of the {\tt Comps} list by means of a constraint.
%
\begin{verbatim}
  schema2 rule
  (cat:(head:Head,
        subcat:[SubjSyn]))
  goal>  three_or_less(Comps),
  cats> Comps,
  cat>
   (cat:(head:Head,
         subcat:[Subj|Comps])).

  three_or_less([]) if true.
  three_or_less([_]) if true.
  three_or_less([_,_]) if true.
  three_or_less([_,_,_]) if true.
\end{verbatim}
%
The problem with this strategy from an efficiency standpoint is that
arbitrary sequences of three categories will be checked at every point
in the grammar; in the English case, the search is directed by the
types instantiated in {\tt Comps} as well as that list's length.  From
a theoretical standpoint, it is impossible to get truly unbounded
length arguments in this way.  

\mysubsection{Parsing}
\label{PARSING}

The {\sc ale} system employs a bottom-up active chart parser that has
been tailored to the implementation of attribute-value grammars in
Prolog.  The single most important fact to keep in mind is that rules
are evaluated from left to right, with the mother description coming
last.  Most of the implementational considerations follow from this
rule evaluation principle and its specific implementation in Prolog.
In parsing, {\tt sem\_head>} and {\tt sem\_goal>} specifications are
treated exactly as {\tt cat>} and {\tt goal>} specifications,
respectively.

The chart is filled in using a combination of depth- and breadth-first
control.  In particular, the edges are filled in from right to left,
even though the rules are evaluated from left to right.  Furthermore,
the parser proceeds breadth-first in the sense that it incrementally
moves through the string from right to left, one word at a time,
recording all of the inactive edges that can be created beginning from
the current left-hand position in the string.  For instance, in the
string {\tt The kid ran yesterday}, the order of processing is as
follows.  First, lexical entries for {\tt yesterday} are looked up,
and entered into the chart as inactive edges.  For each inactive edge
that is added to the chart, the rules are also fired according to the
bottom-up rule of chart parsing.  But no active edges are recorded.
Active edges are purely dynamic structures, existing only locally to
exploit Prolog's copying and backtracking schemes.  The benefit of
parsing from right to left is that when an active edge is proposed by
a bottom-up rule, every inactive edge it might need to be completed
has already been found.  This is actually true as long as the
direction of traversal through the string is the opposite of the
direction of matching daughter categories in a rule; thus the real reason
for the right-to-left parsing strategy is to allow the active edges to
be represented dynamically, while still evaluating the rules from left
to right.  While the overall strategy is bottom-up, and breadth-first
insofar as it steps incrementally through the string, filling in every
possible inactive edge as it goes, the rest of the processing is done
depth-first to keep as many data structures dynamic as possible, to
avoid copying other than that done by Prolog's backtracking mechanism.
In particular, lexical entries, bottom-up rules, and active
edges are all evaluated depth-first, which is perfectly sound, because
they all start at the same left point (that before the current word in
the right to left pass through the string), and thus do not interact
with one another.

{\sc ale} computes the closure of its grammar rules under application
of the first daughter's description to empty categories at
compile-time.  This is known as {\it Empty-First-Daughter closure} or
EFD closure.\label{EFDCLOSURE} This closure operation has three
advantages.  First, given {\sc ale}'s combination of depth-first and
breadth-first processing, it is necessary in order to ensure
completeness of parsing with empty categories, because any permutation
of empty categories can, in principle, be combined to form a new empty
category.  Second, it works around a problem that many
non-ISO-compatible Prologs, including SICStus Prolog, have with
asserted predicates that results in empty category leftmost daughters
not being able to combine with their own outputs.  Third, it allows
the run-time parser to establish a precondition that rules only need
to be closed with non-empty leftmost daughters at run-time.  As a
result, when a new mother category is created and closed under rules
as the leftmost daughter, it cannot combine with other edges created
with the same left node.  This allows {\sc ale}, at each step in its
right-to-left pass throught the input string, to copy all of the edges
in the internal database back onto the heap before they can be used
again, and thus reduces edge copying to a constant two times per edge
for non-empty categories.  Keeping a copy of the chart on the heap
also allows for more sophisticated indexing strategies that would
otherwise be overwhelmed by the cost of copying edges with large-sized
categories in Prolog before the match.  The EFD closure algorithm
itself is described in Penn (1999).

EFD closure potentially creates new rules, a prefix of whose daughters
have matched empty categories, and new empty categories, formed when
every daughter of a rule has matched an empty category.  The closure
in computed breadth-first.

EFD closure may not terminate.  As a result, compilation of some
grammars may go into infinite loops.  This only occurs, however, with
grammars for which every parse would go into an infinite loop at
run-time if EFD closure were not applied --- specifically, when empty
categories alone can produce infinitely many empty categories using
the rules of the grammar.  Because early versions of {\sc ale} did not
compute a complete closure of grammar rules over empty categories
(even at run-time), some grammars that terminated at run-time under
these early versions will not terminate at compile-time under the
current version.

Rules can incorporate definite clause goals before, between or after
category specifications.  These goals are evaluated when they are
found.  For instance, if a goal occurs between two categories on the
right hand side of a rule, the goal is evaluated after the first
category is found, but before the second one is.  The goals are
evaluated by {\sc ale}'s definite clause resolution mechanism, which
operates in a depth-first manner.  Thus care should be taken to make
sure the required variables in a goal are instantiated before the goal
is called.  The resolution of all goals should terminate with a finite
(possibly empty) number of solutions, taking into account the
variables that are instantiated when they are called.

The parser will terminate after finding all of the inactive edges
derivable from the lexical entries and the grammar rules.  Of course,
if the grammar is such that an infinite number of derivations can be
produced, {\sc ale} will not terminate.  Such an infinite number of
derivations can creep in either through recursive unary rules or
through the evaluation of goals.

{\sc ale} now has an optional mechanism for checking edge subsumption
(Section~\ref{SUBSUMPTION}).  This can be used to prevent the
propagation of spurious ambiguities through the parse.  A category $C$
spanning a given subsequence is said to be {\em spurious} if there is
another category $C'$ spanning the same subsequence such that $C$ is
subsumed by $C'$.  Only the most general category needs to be recorded
to ensure soundness.  It can also be used to detect two derivations of
the same category.  Our experience, however, has been that most
unification-based grammars do not have any spurious ambiguity.  They
normally incorporate some notion of thematic or functional structure
representing the meaning of a sentence; and in these cases, most
structural ambiguities result in semantic ambiguities.  For such
grammars, subsumption checking is probably not worth the effort, and
should be left disabled.

\mysubsection{Generation}
\label{GENERATION}

{\sc ale} also contains a generator, based on the {\em Semantic
  Head-Driven Generation} algorithm of van Noord (1989), as extended
by Shieber et al. (1990), and adapted to the typed feature logic of
Carpenter (1992) by Popescu (1996).  Its implementation in {\sc ale}
is described in Penn and Popescu (1997).

Given a description of a feature structure, {\sc ale}'s generator will
non-deterministically generate all the word strings that correspond to
its most general satisfier(s).  In other words, the generated word
strings, when parsed in {\sc ale} using the same grammar, will result
in feature structures that {\em unify} with a most general satisfier
of the initial description (rather than necessarily be identical).
That part of the feature structure which represents semantic
information drives the generation process.

\mysubsubsection{The {\tt semantics/1} Directive}

{\sc ale} identifies this part using a user-defined directive, {\tt
  semantics/1}\label{SEMANTICS}.  This directive distinguishes a
binary user-defined definite clause predicate as the predicate to use
to find semantic information.  The first argument is always the
feature structure whose semantics are being identified; and the second
argument is always the semantic information.  The example below, taken
again from the sample generation grammar, simply says that the
semantics of a feature structure is the value of its {\tt sem}
feature:
%
\begin{verbatim}
  semantics sem1.
  sem1(sem:S,S) if true.
\end{verbatim}

In general, the second argument does not need to be a sub-structure of
the first --- it could have a special type that is used only for the
purpose of collecting semantic information, possibly spread over
several unrelated sub-structures.  The body can be arbitrarily
complex; and there can be multiple clauses for the definition of this
predicate.  The predicate must, however, have the property that it
will terminate when only its first argument is instantiated, and when
only its second argument is instantiated.  {\sc ale} will use this
predicate in both ``directions'' --- to find semantics information,
and in reverse to build templates to find structures that have
matching semantic information.  There can be only one predicate
distinguished by {\tt semantics/1}.  If there are multiple directives,
{\sc ale} will only use the first.

\mysubsubsection{The Algorithm}

Semantic-head-driven generation makes use of the notion of a {\em
  semantic head} of a rule, a daughter whose semantics is shared
with the mother.  In semantic-head-driven generation, there are two
kinds of rules: {\em chain rules}, which have a semantic head, and
{\em non-chain rules}, which lack such a head.  These two subsets are
processed differently during the generation process.

Given a feature structure, called the {\em root} goal, to generate a
string for, the generator builds a new feature structure that shares
its semantic information (using the user-defined semantics predicate
with the second argument instantiated) and finds a {\em pivot} that
unifies with it.  The pivot is the lowest node in a derivation tree
that has the same semantics as the root.  The pivot may be either a
{\em lexical entry} or {\em empty category} (the base cases), or the
{\em mother category of a non-chain rule}.  Once a pivot is
identified, one can recursively generate top-down from the pivot using
non-chain rules.  Since the pivot must be the {\em lowest}, there can
be no lower semantic heads, and thus no lower chain-rule applications.
Just as in parsing, the daughters of non-chain rules are processed
from left to right.

Once top-down generation from the pivot is complete, the pivot is
linked to the root bottom-up by chain rules.  At each step, the
current chain node (beginning with the pivot) is unified with the
semantic head of a chain-rule, its non-head sisters are generated
recursively, and the mother becomes the new chain node.  The non-head
daughters of chain rules are also processed from left to right.  The
base case is where the current chain node unifies with the root.

An example from the sample generation grammar in
Appendix~\ref{GENGRAMMAR} illustrates this better.  Suppose that the
generator is given the goal description:
%
\begin{verbatim}
  (sentence,
   sem:(pred:decl,
        args:[(pred:call_up,
               args:[pred:mary,pred:john])]))
\end{verbatim}
%
\begin{figure}
\centering
\leavevmode\epsfbox{fig5.1.eps}
\caption{The initial root.}\label{fig:gen-a}
\end{figure}
%
\begin{figure}
\centering
\leavevmode\epsfbox{fig5.2.eps}
\caption{The semantics template.}\label{fig:gen-b}
\end{figure}
%
Figure~\ref{fig:gen-a} shows the initial root (the most general
satisfier of the input description); and Figure~\ref{fig:gen-b} shows
the template that {\sc ale} uses to find a pivot.  Next
(Figure~\ref{fig:gen-c}), a pivot is selected, in this case by
unifying the template with the mother category of the non-chain rule,
{\tt sentence1}:
%
\begin{figure}
\centering
\leavevmode\epsfbox{fig5.3.eps}
\caption{The first pivot is found.}\label{fig:gen-c}
\end{figure}
%
\begin{verbatim}
sentence1 rule
  (sentence,sem:(pred:decl,
                 args:[S]))
  ===>
  cat> (s,form:finite,
          sem:S).
\end{verbatim}
%
We can tell that {\tt sentence1} is a non-chain rule because it lacks
a {\tt sem\_head>} daughter, unlike, for example, the chain rule {\tt
  s}:
%
\begin{verbatim}
  s rule
  (s,form:Form,
     sem:S)
  ===>
  cat> Subj,
  sem_head>
    (vp,form:Form,
        subcat:[SUBJ],
               sem:S).
\end{verbatim}
%
The only daughter of {\tt sentence1} becomes the new root.

The pivot chosen for that root is the lexical entry for {\tt calls},
which is obtained by applying the lexical rule, {\tt sg3}, to the
first entry for {\tt call} in the grammar.  That pivot has no
daughters, so it must now be connected by chain rules to the new root
in Figure~\ref{fig:gen-c}.  The chain rule, {\tt vp1}, is chosen, its
semantic head is unified with the entry for {\tt calls}, its one
non-head daughter is recursively generated (which simply succeeds by
unifying with the lexical entry for {\tt john}), and its mother
becomes the new chain node (Figure~\ref{fig:gen-d}).
%
\begin{figure}
\centering
\leavevmode\epsfbox{fig5.4.eps}
\caption{First chain rule application.}\label{fig:gen-d}
\end{figure}

Again (Figure~\ref{fig:gen-e}), chain rule {\tt vp1} is chosen, its
semantic head is unified with the new pivot, its non-head daughter is
recursively generated by unifying with the lexical entry for {\tt up},
and its mother becomes the next chain node.
%
\begin{figure}
\centering
\hspace*{-0.75in}\leavevmode\epsfbox{fig5.5.eps}
\caption{Second chain rule application.}\label{fig:gen-e}
\end{figure}

Finally, the chain rule, {\tt s} is chosen.  Its non-head daughter is
recursively generated by unifying with the lexical entry for {\tt
mary}, and its mother, the new chain node, unifies with the new root
(Figure~\ref{fig:gen-f}).
%
\begin{figure}
\centering
\hspace*{-0.75in}\leavevmode\epsfbox{fig5.6.eps}
\caption{Third chain rule application and unification.}\label{fig:gen-f}
\end{figure}
%
With generation below the pivot of Figure~\ref{fig:gen-c} having been
completed, it is linked to its root directly by unification, yielding
the solution, {\tt mary calls john up}.

\mysubsubsection{Pivot Checking}

{\sc ale}'s generator uses a simple depth-first search strategy,
displaying solutions as it finds them.  As a result, it is not
complete.  Following the suggestion made in Shieber et al. (1990),
{\sc ale} also checks whether there are {\em semantic head $\rightarrow$
mother} sequences that could possibly link a potential pivot
to the current root before recursively generating its non-head
daughters.  If not, then the pivot is discarded.  Semantic-head-driven
generators that do not prune away such bad pivots from the search tree
run a greater risk of missing solutions because top-down generation of
the bad pivot's non-head daughters may not terminate, even though it
can never yield solutions.  This check is valuable because it
incorporates {\em syntactic} information from the mother and semantic
head into the otherwise {\em semantic} prediction step.

It also creates another termination problem, however, namely the
potential for infinitely long {\it semantic head $\rightarrow$ mother}
sequences in some grammars.  To avoid this, {\sc ale} requires the
user to specify a bound on the length of chain rule sequences at
compile-time.  This can be specified with the
declaration:\label{CHAINLENGTH}
%
\begin{verbatim}
  :- chain_length(4).
\end{verbatim}
%
Other values than {\tt 4} can be used, including 0.  The default value
is 4.  {\sc ale} compiles chains of semantic head and mother
descriptions of this length to perform the pivot check more efficiently 
at run-time.

\mysubsubsection{The {\tt sem\_goal>} Operator}

For the most part, the generator treats procedural attachments as the
parser does --- it evaluates them with respect to other daughter
specifications in the order given.  The one exception to this is {\tt
  sem\_goal>} attachments.  These goals are distinguished as attached
to the semantic head, and are therefore evaluated either immediately
before or immediately after the {\tt sem\_head>} description.  As a
result, {\tt sem\_goal>} specification can only occur immediately
before or immediately after a {\tt sem\_head>} specification; and thus
only in chain rules.  {\tt sem\_goal>} attachments are not evaluated
during the pivot check described above --- only the {\tt sem\_head>}
and mother descriptions.  During parsing, {\tt sem\_goal>}
specifications are treated exactly the same as {\tt goal>}
specifications, i.e., evaluated in order.

To summarize, the order of execution for a non-chain rule specification
during generation is:
%
\begin{itemize}
\item the mother description, then
\item the {\tt cat>}, {\tt cats>}, and {\tt goal>} descriptions, in
  order, from left to right.
\end{itemize}
%
There are no {\tt sem\_head>} or {\tt sem\_goal>} specifications in a
non-chain rule.  The order for a chain rule specification during
generation is:
%
\begin{itemize}
\item the pre-head {\tt sem\_goal>} specification, if it exists, 
\item the {\tt sem\_head>} description,
\item the post-head {\tt sem\_goal>} specification, if it exists,
\item the mother description, then\footnotemark
\item the {\tt cat>}, {\tt cats>} and {\tt goal>} specifications, in
  order, from left to right.
\end{itemize}
\footnotetext{The standard head-driven-generation algorithm enforces
the mother description after the non-semantic-head-related daughters.
We deviate from this order in order to enforce the pivot check, which
requires instantiating the mother, more efficiently.}
%
Again, practical grammar implementations will arrange information in
rules in such a way as to ensure termination and to force failure as
early as possible.  For non-chain rules, this means making the mother
and early daughters or goals as informative as possible at the
description level (that is, up to where type inferencing can take
over).  For chain rules, the semantic head and its attachments should
be maximally informative.


\mychapter{Compiling ALE Programs}

This section is devoted to showing how {\sc ale} programs can actually
be compiled.  {\sc ale} was developed to be run with a Prolog
compiler, such as SICStus Prolog's.  An SWI port of {\sc ale} is
available, which also has a less extensive compilation phase.  We
strongly recommend SICStus Prolog.  SWI Prolog does not scale up well
to large-sized grammars.  The local systems administrator should be
able to provide help on running Prolog.  This documentation only
assumes the user has figured out how to run Prolog as well as write
and edit files.  It is otherwise self-contained.

\mysection{File Management}

After starting up Prolog, the following command should be used to load
the {\sc ale} system:
%
\begin{quote}
  {\tt | ?- compile({\em{AleFile}}).}
\end{quote}
%
where {\em AleFile} is an atom specifying the file name in which {\sc
ale} resides.  For instance, in Unix, you might need to use something
like: {\tt compile('/users/carp/Prolog/ALE/ale.pl').}, or a local
abbreviation for it like {\tt compile(ale).} if the system is in a
file named {\tt ale.pl} in the local directory (SICStus and SWI can
fill in the ``{\tt{.pl}}'' suffix).  With SWI Prolog, the command:
%
\begin{quote}
  {\tt | ?- consult({\em{AleFile}}).}
\end{quote}
%
must be used instead.  Note that the argument to {\tt compile} must be
an atom, which means it should be single-quoted if it is not otherwise
an atom.  After the system has compiled, you should see another Prolog
prompt.  It is necessary to have write permission in the directory
from which Prolog is invoked, because {\sc ale} creates files during
compilation.  But note that neither the grammar nor {\sc ale} need to
be locally defined; it is only necessary to have local write
permission.

{\sc ale} source code, being a kind of Prolog code, must be organized
so that predicate definitions are not spread across files, unless the
appropriate {\tt multifile} declarations are made.  For instance, the
{\tt sub/intro} clauses specifying the type hierarchy must all be in
one file.  Similarly, the definite clauses must all be in one file, as
must the grammar rules and macros.


\mysection{Compiling Programs}

{\sc ale} can compile a program incrementally to some extent.  In
particular, the compiler is broken down into six primary components
for compiling the type hierarchy, functional descriptions, type
constraints, the attribute-value logic, the definite clauses and the
grammar.  Compiling the type hierarchy consists of compiling type
subsumption, type unification, appropriateness specifications, and
extensionality information.  The logic compiler compiles predicates
which know how to add a type to a feature structure, how to find a
feature value in a type and how to perform feature structure
unification, as well as the most general satisfiers of every type,
with code attached to enforce {\tt cons/2} constraints.  Compiling the
grammar consists of compiling the lexicon, empty categories, rules and
lexical rules, and if compilation for generation is enabled, the {\tt
semantics/1} directive.  Macros are not compiled, but are rather
interpreted during compilation.

There is one predicate {\tt compile\_gram/1}\label{COMPILEGRAM} that
can be used to compile a whole {\sc ale} grammar from one file, as
follows:
%
\begin{quote}
  {\tt | ?- compile\_gram({\em{GramFile}}).}
\end{quote}
where {\em GramFile} is the name of the file in which the grammar
resides.  The compiler will display error messages to the screen when
it is compiling.  But since {\sc ale} uses the Prolog compiler to
read the files, Prolog might also complain about syntax errors in
specifying the files.  In either case, there should be some indication
of what the error is and which clause of the file contained it.  

{\sc ale}'s compiler creates code for parsing, generation, or both.
As of the present version of {\sc ale}, only one grammar can be used,
even if code for both modes is to be created.  Two files, {\tt
ale\_parse.pl} and {\tt ale\_gen.pl}, are included with the
distribution, which provide some example glue code to link together
two {\sc ale} processes running under SICStus Prolog 3.0 or higher in
order to parse and generate with two different grammars.

At startup, {\sc ale} produces code only for parsing.  To produce code
for generation only, use the command\label{GENERATE}:
%
\begin{verbatim}
  | ?- generate.
 
  compiler will produce code for generation only
 
  yes
  | ?-
\end{verbatim}
%
To produce code for both parsing and generation, use {\tt
  parse\_and\_gen}\label{PARSEANDGEN} instead.  To switch back to
producing code for parsing only, use {\tt parse}.\label{PARSE} Note
that these commands modify the behaviour of the compiler, not the
compiled code, so grammars may need to be recompiled after these
directives are issued.


The following predicates are available to compile grammars and their
component parts.  They are listed hierarchically, with each command
calling all those listed under it.  Each higher-level command is
nothing more than the combination of those commands below
it\label{TABLE}.
%
\newpage
\begin{verbatim}
Command                      Requires            File  Mode  Clause
-------------------------------------------------------------------
compile_gram                 nothing               *   both
  compile_sig                nothing               *   both
    compile_sub_type         nothing               *         sub
    compile_unify_type       compile_sub_type          
    compile_approp           compile_unify_type    *         intro
    compile_extensional      compile_approp        *         ext
  compile_fun                compile_sig           *   both  +++>
  compile_cons               compile_fun           *   both  cons
  compile_logic              compile_sig               both
    compile_mgsat            compile_sig               
    compile_add_to_type      compile_sig               
    compile_featval          compile_add_to_type       
    compile_u                compile_sig               
  compile_subsume            compile_sig            parse/subtest 
  compile_dcs                compile_logic         *   both  if
  compile_grammar            compile_logic         *
    compile_lex_rules        compile_logic         *   parse **>
    compile_lex              compile_logic         *   parse --->
    compile_rules            compile_logic         *   parse ===>,empty
                             compile_logic         *   gen   ===>,empty
                                                             --->,**>
    compile_generate         compile_rules         *   gen   semantics
\end{verbatim}
%
The table above lists which compilations must have already been
compiled before the next stage of compilation can begin.  Thus before
{\tt compile\_grammar} can be called, {\tt compile\_logic} must be
called (or equivalently, the sequence of {\tt compile\_mgsat}, {\tt
compile\_add\_to\_type}, {\tt compile\_featval}, and {\tt
compile\_u}).  Each command with an asterisk in its clauses column in
the above table may be given an optional file argument.  The file
argument should be an atom which specifies the file in which the
relevant clauses can be found.  The clauses needed before each stage
of compilation can begin are listed to the right of the asterisks.
For instance, the {\tt if} clauses must be loaded before {\tt
compile\_dcs} is called.  But note that {\tt compile\_unify\_type}
does not require any clauses to be loaded, as it uses the compiled
definition of {\tt sub\_type} rather than the user specification in
its operation.  Thus changes to the signature in the source file, even
if the source file is recompiled, will not be reflected in {\tt
compile\_unify\_type} if they have not been recompiled by {\tt
compile\_sub\_type} first.  If an attempt is made to compile a part of
a program where the relevant clauses have not been asserted, an error
will result.

Note that {\tt compile\_subsume} only compiles code if subsumption
checking (p.~\pageref{SUBTEST}) and parsing have been enabled.

Each of the lowest level commands generates intermediate Prolog source
code for that function, which is then compiled further by a Prolog
compiler.  {\sc ale} uses a term-expansion-based compiler in both
SICStus and SWI Prologs that avoids the necessity for creating
intermediate files.  It also improves the speed of intermediate code
compilation.  Because both SICStus and SWI Prologs require the user to
read a file on disk in order to use their compilers, {\sc ale} must
create a zero-byte file called {\tt .alec\_throw} to throw control to
its intermediate code compiler.  For that reason, the Prolog process
must have write permission in the local directory to create this file,
if it does not already exist.

After a grammar is compiled, the system plus grammar code can be saved
with the command:
%
\begin{quote}
  {\tt | ?- save\_program(\em{File}).}
\end{quote}
%
This will save the state of the Prolog database in {\em File}.
SICStus users should normally use this rather than {\tt save/1}, which
creates a larger file by saving other information like the state of
Prolog's internal stacks.  The SWI Prolog command is {\tt
qsave\_program({\em File})}.  With either Prolog, the state can be
reloaded, by executing the saved file directly.

In general, whenever the {\sc ale} source program is changed, it
should be recompiled from the point of change.  For instance, if the
definite clauses are the only thing that have changed since the last
compilation, then only {\tt compile\_dcs({\it FileSpec})} needs to be
run.  But if in changing the definite clauses, the type hierarchy had
to be changed, then everything must be recompiled.

{\sc ale} treats lexicon compilation differently than the other
stages.  Two commands, {\tt lex\_compile/0}\label{LEXCOMPILE} and {\tt
lex\_consult/0}\label{LEXCONSULT}, control whether the intermediate
code for the lexicon and empty categories is compiled or consulted (a
lesser degree of compilation).  Lexicon compilation is usually the
most time-consuming stage of grammar compilation in {\sc ale}, and
consulting the code for this stage can result in a substantial
compile-time speed-up.  The decrease in run-time performance is only
significant in grammars with a high degree of lexical ambiguity, i.e.,
where one string has a very large number of entries in the lexicon.
By default, {\sc ale} consults the code for the lexicon.  In SWI
Prolog, only lexicon consulting is available.

When consulting is chosen, the lexicon and empty categories are also
consulted {\it dynamically}.  This means that individual entries can
be retracted and added without recompiling the entire lexicon.  To
retract a lexical entry, use the command:
%
\begin{quote}
  {\tt | ?- retract\_lex({\em LexSpec}).}\label{RETRACTLEX}
\end{quote}
%
{\em LexSpec} can either be a word, or a list of words.  The words
given to {\tt retract\_lex/1} are not closed under {\tt morph} rules
--- derived entries with different forms must be retracted explicitly.
{\tt retract\_lex/1} iterates through all of the entries that match
the given word(s), asking for each one whether it should be retracted.
{\tt retractall\_lex/1}\label{RETRACTALLLEX} will remove all of them
without asking.

To add lexical entries, use the command:
%
\begin{quote}
  {\tt | ?- update\_lex({\em UpdateFile}).}\label{UPDATELEX}
\end{quote}
%
{\em UpdateFile} is a file containing new lexical entries and empty
categories.  New lexical entries are closed under lexical rules, as
usual.%
%
\footnote{Earlier versions of {\sc ale} also permitted incremental
updating and retraction of empty categories.  Because empty categories
are now closed under phrase structure rules at compile-time, this is
no longer possible.}

\mysection{Compile-Time Error Messages}

There are three sources of compile-time messages generated by {\sc
ale}:  Prolog messages, {\sc ale} errors, and {\sc ale} warnings.  

{\sc ale} uses Prolog term input and output, thus requiring the input
to be specified as a valid Prolog program.  Of course, any {\sc ale}
program meeting the {\sc ale} syntax specification will not cause
Prolog errors.  If there is a Prolog error generated, there is a
corresponding bug in the grammar file(s).  Prolog error messages
usually generate a message indicating what kind of error it found, and
just as importantly, which line(s) of the input the error was found
in.  The most common Prolog error messages concern missing periods or
operators which cannot be parsed.  Such errors are usually caused by
bad punctuation such as missing periods, misplaced commas, commas
before semicolons in disjunctions, etc.  These errors are usually easy
to track down.  

Prolog also generates warnings in some circumstances.  In particular,
if you only use a variable once in a definition, it will report a
singleton variable warning.  The reason for this is that variables
that only occur once are useless in that they do not enforce any
structure sharing.  There is little use for singleton variables in
{\sc ale} outside of the Prolog goals in morphological rules and some
macro parameters.  Usually a singleton variable indicates a typing
error, such as typing {\tt AgrNum} in one location and {\tt Agrnum} in
another.  It is standard Prolog practice to replace all singleton
variables with {\em anonymous} variables.  An anonymous variable is a
variable which begins with the underscore character.  For instance, a
singleton variable such as {\tt Head} can be replaced with the
anonymous variable {\tt \_Head}, or even just {\tt \_}, to suppress
such singleton variable warnings.  Two occurrences of the simple
anonymous variable {\tt \_} are not taken to be co-referential, but
two occurrences of something like {\tt \_Head} are taken to be
co-referential.  In particular, the two descriptions, {\tt (foo:X,
bar:X)} and {\tt (foo:\_X, bar:\_X)} are equivalent to each other, but
distinct from {\tt (foo:\_,bar:\_)} in that the latter description
does not indicate any structure sharing.  The second description above
is considered bad style, though, as it uses the anonymous variable
{\tt \_X} co-referentially.

Besides Prolog syntax errors, there are many errors that {\sc ale} is
able to detect at compile time.  These errors will be flagged during
compilation.  Most errors give some indication of the program clause
in which they are found.  Some errors may be serious enough to halt
compilation before it is finished.  In general, it is a good idea to
fix all of the errors before trying to run a program, as the error
messages only report serious bugs in the code, such as type
mismatches, unspecified types, ill-formed rules, etc.  

In certain cases, it is preferable to disable those error messages
concerned with {\sc ale}'s inability to add incompatible descriptions
to a feature structure.  This is especially true during lexicon and
empty category compilation, when, due to the interaction of
disjunctions and type constraints, the number of such errors can be
overwhelming.  In the current version of {\sc ale}, these errors are
automatically disabled during lexicon and empty category compilation,
and enabled otherwise.  Commands will be added to future versions so
that the user may control when these errors should be displayed.

Less serious problems are flagged with warning messages.  Warning
messages do not indicate an error, but may indicate an omission or
less than optimal {\sc ale} programming style.  

The {\sc ale} error and warning messages are listed in an appendix at
the end of this report, along with an explanation.  The manual for the
Prolog in which {\sc ale} is being run in will probably list the kinds
of errors generated by the Prolog compiler.


\mychapter{Running and Debugging ALE Programs}

After the {\sc ale} program compiles without any error messages, it is
possible to test the program to make sure it does what it is supposed
to.  We consider the problem from the bottom-up, as this is the best
way to proceed in testing grammars.  {\sc ale} does not have a
sophisticated input/output package, and thus all {\sc ale} procedures
must be accessed through Prolog queries.  

\mysection{Testing the Signature}

Once the signature is compiled, it is possible to test the results of
the compilation.  To test whether or not a type exists, use the
following query\label{TYPE}:
%
\begin{verbatim}
  | ?- type(Type).
  
  Type = bot ?;
  
  Type = cat ?;
  
  Type = synsem ?

  yes   
\end{verbatim}
%
Note that the prompt {\tt | ?- } is provided by Prolog, while the
query consists of the string {\tt type(Type).}, including the period
and a return after the period.  Prolog then responds with
instantiations of any variables in the query if the query is
successful.  Thus the first solution for {\tt Type} that is found
above is {\tt Type = bot}.  After providing an instantiation
representing the solution to the query, Prolog then provides another
prompt, this time in the form of a single question mark.  After the
first prompt above, the user typed a semicolon and return, indicating
that another solution is desired.  The second solution Prolog found
was {\tt Type = cat}.  After this prompt, the user requested a third
solution.  After the third solution, {\tt Type = synsem}, the user
simply input a return, indicating that no more solutions were desired.
These two options, semicolon followed by return, and a simple return,
are the only ones relevant for {\sc ale}.  If the anonymous variable
{\tt \_} is used in a query, no substitutions are given for it in the
solution.  If there are no solutions to a query, Prolog returns {\tt
no} as an answer.  Consider the following two queries:
%
\begin{verbatim}
  | ?- type(bot).

  yes
\end{verbatim}
%
%
\begin{verbatim}
  | ?- type(foobar).

  no
\end{verbatim}
%
In both cases, no variables are given in the input, so a simple {\tt
yes/no} answer, followed by another prompt, is all that is returned.

The second useful probe on the signature indicates type subsumptions
and type unifications.   To test type subsumption, use the following
form of query\label{SUBTYPE}:
%
\begin{verbatim}
  | ?- sub_type(X,Y).

  X = and,
  Y = and ?;

  X = backward,
  Y = backward ?

  yes
\end{verbatim}
%
Note that with two variables, substitutions for both are given,
allowing the possibility of iterating through the cases.  In general,
wherever a variable may be used in a query, a constant may also be
used.  Thus {\tt sub\_type(synsem,forward).} is a valid query, as are
{\tt sub\_type(synsem,X)} and {\tt sub\_type(Y,forward)}.  The first
argument is the more general type, with the second argument being the
subtype.

Type unifications are handled by the following form of
query\label{UNIFYTYPE}:
%
\begin{verbatim}
  | ?- unify_type(T1,T2,T).
\end{verbatim}
%
The interpretation here is that {\tt T1} unified with {\tt T2}
produces {\tt T3}.  As before, any subset of the three variables may
be instantiated for the test and the remaining variables will be
solved for.  

The following query will indicate whether given features have been
defined and can also be used to iterate through the features if the
argument is uninstantiated\label{FEATURE}:
%
\begin{verbatim}
  | ?- feature(F).
\end{verbatim}
%

Feature introduction can be tested by\label{INTRODUCE}:
%
\begin{verbatim}
  | ?- introduce(F,T).
\end{verbatim}
%
which holds if feature {\tt F} is introduced at type {\tt T}.  

Type constraints can be tested using\label{SHOWCONS}:
%
\begin{verbatim}
  | ?- show_cons(Type).
\end{verbatim}
%
which will display the description of the constraint assigned to the
type, Type.

Finally, the inherited appropriateness function can be tested
by\label{APPROP}:
%
\begin{verbatim}
  | ?- approp(Feat,Type,Restr).
\end{verbatim}
%
A solution indicates that the value for feature {\tt Feat} for a type
{\tt Type} structure is of type {\tt Restr}.  As usual, any of the
variables may be instantiated, so that it is possible to iterate
through the types appropriate for a given feature or the features
appropriate for a given type, the restrictions on a given feature in a
fixed type, and so on.

There is one higher-level debugging routine for the signature that
outputs a complete specification for a type, including a list of its
subtypes and supertypes, along with the most general feature structure
of that type (after all type inference and constraint satisfaction has
been performed).  An example of the {\tt show\_type/1} query is as
follows\label{SHOWTYPE}:
\begin{verbatim}
  | ?- show_type functional.

  TYPE: functional
  SUBTYPES: [forward,backward]
  SUPERTYPES: [synsem]
  MOST GENERAL SATISFIER: 
       functional
       ARG synsem
       RES synsem
\end{verbatim}
%
If {\tt synsem} had any appropriate features, these would have been
added, along with their most general appropriate values. 

\mysection{Evaluating Descriptions}

Descriptions can be evaluated in order to find their most general
satisfiers.  {\sc ale} provides the following form of query\label{MGSAT}:
%
\begin{verbatim}
  | ?- mgsat tl:e_list.

  ne_list_quant
  HD quant
     RESTR proposition
     SCOPE proposition
     VAR individual
  TL e_list

  ANOTHER?  n.

  yes  
\end{verbatim}
%
Note that there must be whitespace between the {\tt mgsat} and the
description to be satisfied.  The answer given above is the most
general satisfier of the description {\tt tl:e\_list} using the
signature in the categorial grammar in the appendix.  It is important
to note here that type inference is being performed to find most
general satisfiers.  In the case at hand, because lists in the
categorial grammar are typed to have quantifiers as their {\tt HD}
values, the value of the {\tt HD} feature in the most general
satisfier has been coerced to be a quantifier.

Satisfiable non-disjunctive descriptions always have unique most
general satisfiers as a consequence of the way in which the type
system is constrained.  But a description with disjunctions in it may
have multiple satisfiers.  Consider the following query:
%
\begin{verbatim}
  | ?- mgsat hit,hitter:(j;m).

  hit
  HITTEE individual
  HITTER j

  ANOTHER?  y.

  hit
  HITTEE individual
  HITTER m
  
  ANOTHER?  y.
  
  no
\end{verbatim}
%
After finding the first most general satisfier to the description, the
user is prompted as to whether or not another most general satisfier
should be sought.  As there are only two most general satisfiers of
the description, the first request for another satisfier succeeds,
while the second one fails.  Failure to find additional solutions is
indicated by the {\tt no} response from Prolog.

Error messages will result if there is a violation of the type
hierarchy in the query.  For instance, consider the following query
containing two type errors before a satisfiable disjunct:
%
\begin{verbatim}
  | ?- mgsat hd:j ; a ; j.

  add_to could not add incompatible type j to: 
       quant
       RESTR proposition
       SCOPE proposition
       VAR individual
  
  add_to could not add undefined type: a to
       bot
  
  MOST GENERAL SATISFIER OF: hd:j;a;j
  
  j
  
  ANOTHER? 
\end{verbatim}
%
Here the two errors are indicated, followed by a display of the unique
most general satisfiers.  The problem with the first disjunct is that
lists have elements which must be of the quantifier type, which
conflicts with the individual type of {\tt j}, while the second
disjunct involves an undefined type {\tt a}.  Note that in the error
messages, there is some indication of how the conflict arose as well
as the current state of the structure when the error occurred.  For
instance, the system had already figured out that the head must be a
quantifier, which it determined before arriving at the incompatible
type {\tt j}.  The conflict arose when an attempt was made to add the
type {\tt j} to the {\tt quant} type object.

To explore unification, simply use conjunction and {\tt mgsat}.  In
particular, to see the unification of descriptions {\tt D1} and {\tt
D2}, simply display the most general satisfiers of {\tt D1}, {\tt D2},
and their conjunction {\tt (D1,D2)}.  To obtain the correct results,
{\tt D1} and {\tt D2} must not share any common variables.  If they
do, the values of these will be unified across {\tt D1} and {\tt D2},
a fact which is not represented by the most general satisfiers of
either {\tt D1} or {\tt D2}.  Providing most general satisfiers also
allows the user to test for subsumption or logical equivalence by
visual inspection, by using {\tt mgsat/1} and comparing the set of
solutions.  Future releases should contain mechanisms for evaluating
subsumption (entailment), and hence logical equivalence of
descriptions.

{\tt mgsat} can also be used to test functional descriptions, e.g.,
for the functional append on p.~\pageref{APPEND-DEF}:
%
\begin{verbatim}
  | ?- mgsat append([bot],[bot,bot]).

ne_list
HD bot
TL ne_list
  HD bot
  TL ne_list
    HD bot
    TL e_list

ANOTHER?
\end{verbatim}

The Prolog predicate {\tt iso\_desc/2}\label{ISODESC} can be used to
discover whether two descriptions evaluate to the same feature
structure.  This can be useful for testing extensional type
declarations.
%
\begin{verbatim}
  | ?- iso_desc(X,X).
 
  X = _A-bot ? 

  yes
  | ?- iso_desc((a_ atom),(a_ atom)).     % a_ atoms are extensional

  yes
  | ?- iso_desc(b,b).         % for b, intensional

  no
  | ?- iso_desc(a,a).         % for a, extensional, with feature f
 
  no
  | ?- iso_desc(f:(a_ at1),f:(a_ at1)).   % f approp. to a_ atoms
 
  yes
  | ?- iso_desc(f:(a_ at1),f:(a_ at2)).

  no
\end{verbatim}

\mysection{Hiding Types and Features}

With a feature structure system such as {\sc ale}, grammars and
programs often manipulate very large feature structures.  To aid in
debugging, two queries allow the user to focus attention on particular
types and features by supressing the printing of other types and
features.  

The following command supresses printing of a type\label{NOWRITETYPE}:
%
\begin{quote}
  {\tt | ?- no\_write\_type($T$).}
\end{quote}
%
After {\tt no\_write\_type($T$)} is called, the type $T$ will no
longer be displayed during printing.  To restore the type $T$ to
printed status, use\label{WRITETYPE}:
%
\begin{quote}
  {\tt | ?- write\_type($T$).}
\end{quote}
%
If $T$ is a variable in a call to {\tt write\_type/1}, then all types
are subsequently printed.  Alternatively, the following query restores
printing of all types\label{WRITETYPES}:
%
\begin{quote}
  {\tt | ?- write\_types.}
\end{quote}

Features and their associated values can be supressed in much the same
way as types.  In particular, the following command blocks the feature
$F$ and its values from being printed\label{NOWRITEFEAT}:
%
\begin{quote}
  {\tt | ?- no\_write\_feat($F$).}
\end{quote}
%
To restore printing of feature $F$, use\label{WRITEFEAT}:
%
\begin{quote}
  {\tt | ?- write\_feat($F$).}
\end{quote}
%
If $F$ is a variable here, all features will subsequently be printed.
The following special query also restores printing of all
features\label{WRITEFEATS}.
%
\begin{quote}
  {\tt | ?- write\_feats.}
\end{quote}


\mysection{Evaluating Definite Clause Queries}

It is possible to display definite clauses in feature structure format
by name.  The following form of query can be used\label{SHOWCLAUSE}:
%
\begin{verbatim}
| ?- show_clause append.

HEAD: append(e_list,
             [0] bot,
             [0] )
BODY: true

ANOTHER?  y.

HEAD: append(ne_list_quant
             HD [0] quant
                RESTR proposition
                SCOPE proposition
                VAR individual
             TL [1] list_quant,
             [2] bot,
             ne_list_quant
             HD [0] 
             TL [3] list_quant)
BODY: append([1],
             [2],
             [3])

ANOTHER?  y.

no
\end{verbatim}
%
Note that this example comes from the categorial grammar in the
appendix.  Also note that the feature structures are displayed in full
with tags indicating structure sharing.  Next, note that prompts allow
the user to iterate through all the clauses.  The number of solutions
might not correspond to the number of clause definitions in the
program due to disjunctions in descriptions which are resolved
non-deterministically when displaying rules.  But it is important to
keep in mind that this feature structure notation for rules is not the
one {\sc ale} uses internally, which compiles rules down into
elementary operations which are then compiled, rather than evaluating
them as feature structures by unification.  In this way, {\sc ale} is
more like a logic programming compiler than an interpreter.  Finally,
note that the arity of the predicate being listed may be represented
in the query as in Prolog.  For instance, the query {\tt show\_clause
append/3} would show the clauses for {\tt append} with three
arguments.

Definite clauses in {\sc ale} can be evaluated by using a query such
as\label{QUERY}:
%
\begin{verbatim}
  | ?- query append(X,Y,[a,b]).

  append(e_list,
         [0] ne_list
         HD a
         TL ne_list
            HD b
            TL e_list,
         [0] )
  
  ANOTHER?  y.
  append(ne_list
         HD [0] a
         TL e_list,
         [1] ne_list
         HD b
         TL e_list,
         ne_list
         HD [0] 
         TL [1] )
  
  ANOTHER?  y.
  append(ne_list
         HD [0] a
         TL ne_list
            HD [1] b
            TL e_list,
         [2] e_list,
         ne_list
         HD [0] 
         TL ne_list
            HD [1] 
            TL [2] )

  ANOTHER?  y.

  no
\end{verbatim}
%
The definition of {\tt append/3} is taken from the syllabification
grammar in the appendix.  After displaying the first solution, 
{\sc ale} queries the user as to whether or not to display another
solution.  In this case, there are only three solutions, so the third
query for another solution fails.  Note that the answers are given in
feature structure notation, where the macro {\tt [a,b]} is converted to a
head/tail feature structure encoding.

Unlike Prolog, in which a solution is displayed as a substitution for
the variables in the query, {\sc ale} displays a solution as a
satisfier of the entire query.  The reason for this is that structures
which are not given as variables may also be further instantiated due
to the type system.  Definite clause resolution in {\sc ale} is such
that only the most general solutions to queries are displayed.  For
instance, consider the following query, also from the syllabification
grammar in the appendix:
%
\begin{verbatim}
  | ?- query less_sonorous(X,r).
  
  less_sonorous(nasal,
                r)
  
  ANOTHER?  y.

  less_sonorous(sibilant,
                r)
  
  ANOTHER?  n.
\end{verbatim}
%
Rather than enumerating all of the {\tt nasal} and {\tt sibilant}
types, {\sc ale} simply displays their supertype.  On the other hand,
it is important to note that the query {\tt less\_sonorous(s,r)} would
succeed because {\tt s} is a subtype of {\tt sibilant}.  This example
also clearly illustrates how {\sc ale} begins each argument on its own
line arranged with the query.

In general, the goal to be solved must be a literal, consisting only
of a relation applied to arguments.  In particular, it is not allowed
to contain conjunction, disjunction, cuts, or other definite clause
control structures.  To solve a more complex goal, a definite clause
must be defined with the complex goal as a body and then the head
literal solved, which will involve the resolution of the body.

There are no routines to trace the execution of definite clauses.
Future releases of {\sc ale} will contain a box port tracer similar to
that used for Prolog.  At present, the best suggestion is to develop
definite clauses modularly and test them from the bottom-up to make
sure they work before trying to incorporate them into larger programs.


\mysection{Displaying Grammars}

{\sc ale} provides a number of routines for displaying and debugging
grammar specifications.  After compile-time errors have been taken
care of, the queries described in this section can display the result
of compilation.

Lexical entries can be displayed using the following form of
query\label{LEX}:
%
\begin{verbatim}
  | ?- lex(kid).
  
  WORD: kid
  ENTRY: 
  cat
  QSTORE e_list
  SYNSEM basic
         SEM property
             BODY kid
                  ARG1 [0] individual
             IND [0] 
         SYN n
  
  ANOTHER?  y.

  no
\end{verbatim}
%
As usual, if there are multiple entries, {\sc ale} makes a query as
to whether more should be displayed.  In this case, there was only one
entry for {\tt kid} in the categorial grammar in the appendix.

Another predicate, {\tt export\_words({\em Stream},{\em
Delimiter})}\label{EXPORTWORDS}, writes an alphabetised list of all of
the words in the lexicon, separated by {\em Delimiter}, to {\em
Stream}.  In SICStus Prolog, for example, {\tt
export\_words(user\_output,'$\backslash$n')} will write the words to standard
output (such as the screen), one to a line.

Empty lexical entries can be displayed using\label{EMPTY}:
%
\begin{verbatim}
  | ?- empty.
  
  EMPTY CATEGORY: 
      cat
      QSTORE ne_list_quant
             HD some
                RESTR [0] proposition
                SCOPE proposition
                VAR [1] individual
             TL e_list
      SYNSEM forward
             ARG basic
                 SEM property
                     BODY [0] 
                     IND [1] 
                 SYN n
             RES basic
                 SEM [1] 
                 SYN np
  
  ANOTHER?  no.
\end{verbatim}
%
Note that the number specification was removed to allow the empty
category to be processed with respect to the categorial grammar type
system.  As with the other display predicates, {\tt empty} provides
the option of iterating through all of the possibilities for empty
categories.

Grammar rules can be displayed by name, as in\label{RULE}:
%
\begin{verbatim}
| ?- rule forward_application.

RULE: forward_application

MOTHER: 

  cat
  QSTORE [4] list_quant
  SYNSEM [0] synsem

DAUGHTERS/GOALS:

CAT  cat
     QSTORE [2] list_quant
     SYNSEM forward
            ARG [1] synsem
            RES [0] 

CAT  cat
     QSTORE [3] list_quant
     SYNSEM [1] 

GOAL  append([2],
             [3],
             [4])

ANOTHER?  n.
\end{verbatim}
%
Rules are displayed as most general satisfiers of their mother,
category and goal descriptions.  It is important to note that this is
for display purposes only.  The rules are not converted to feature
structures internally, but rather to predicates consisting of
low-level compiled instructions.  Displaying a rule will also flag any
errors in finding most general satisfiers of the categories and rules
in goals, and can thus be used for rule debugging.  This can detect
errors not found at compile-time, as there is no satisfiability
checking of rules performed during compilation.  

Macros can also be displayed by name, using\label{MACRO}:
%
\begin{verbatim}
  | ?- macro np(X).
  
  MACRO: 
      np([0] sem_obj)
  ABBREVIATES:
      basic
      SEM [0] 
      SYN np
  
  ANOTHER?  n.
\end{verbatim}
%
First note that the macro name itself is displayed, with all
descriptions in the macro name given replaced with their most general
satisfiers.  Following the macro name is the macro satisfied by the
macro description with the variables instantiated as shown in the
macro name display.  Note that there is sharing between the
description in the macro name and the {\tt SEM} feature in the result.
This shows where the parameter is added to the macro's description.

Finally, it is possible to display lexical rules, using the following
query\label{LEXRULE}: 
%
\begin{verbatim}
  | ?- lex_rule plural_n.

  LEX RULE: plural_n
  INPUT CATEGORY: 
      n
      NUM sing
      PERS pers
  OUTPUT CATEGORY: 
      n
      NUM plu
      PERS pers
  MORPHS: 
      [g,o,o,s,e] becomes [g,e,e,s,e]
      [k,e,y] becomes [k,e,y,s]
      A,[m,a,n] becomes A,[m,e,n]
      A,B becomes A,B,[e,s]
          when fricative(B)
      A,[e,y] becomes A,[i,e,s]
      A becomes A,[s]
  
  ANOTHER?  n.
\end{verbatim}
%
Note that the morphological components of a rule is displayed in
canonical form when it is displayed.  Note that variables in
morphological rules are displayed as upper case characters.  When
there is sharing of structure between the input and output of a
lexical rule, it will be displayed as such.  As with the other {\sc
ale} grammar display predicates, if there are multiple solutions to
the descriptions, these will be displayed in order.  Also, if there is
a condition on the categories in the form of an {\sc ale} definite
clause goal, this condition will be displayed before the morphological
clauses.  As with grammar rules, lexical rules are compiled internally
and not actually executed as feature structures.  The feature
structure notation is only for display.  Also, as with grammar rules,
displaying a lexical rule may uncover inconsistencies which are not
found at compile time.

 
\mysection{Executing Grammars: Parsing}

In this section, we consider the execution of {\sc ale} phrase
structure grammars compiled for parsing.  The examples shown in this
section have been produced while running with the mini-interpreter
off.  The mini-interpreter will be discussed in the next section.

The primary predicate for parsing is illustrated as follows\label{REC}:
%
\begin{verbatim}
| ?- rec [john,hits,every,toy].

STRING: 
0 john 1 hits 2 every 3 toy 4

CATEGORY: 
cat
QSTORE e_list
SYNSEM basic
       SEM every
           RESTR toy
                 ARG1 [0] individual
           SCOPE hit
                 HITTEE [0] 
                 HITTER j
           VAR [0] 
       SYN s

ANOTHER?  y.

CATEGORY: 
cat
QSTORE ne_list_quant
       HD every
          RESTR toy
                ARG1 [0] individual
          SCOPE proposition
          VAR [0] 
       TL e_list
SYNSEM basic
       SEM hit
           HITTEE [0] 
           HITTER j
       SYN s

ANOTHER?  y.

no
\end{verbatim}
%
The first thing to note here is that the input string must be entered
as a Prolog list of atoms.  In particular, it must have an opening and
closing bracket, with words separated by commas.  No variables should
occur in the query, nor anything other than atoms.  The first part of
the output repeats the input string, separated by numbers ({\em nodes}
in the chart) which indicate positions in the string for later use in
inspecting the chart directly.  {\sc ale} asserts one lexical item for
every unit interval, with empty categories being stored as loops from
every single node to itself.  The second part of the output is a
category which is derived for the input string.  If there are multiple
solutions, these can be iterated through by providing positive answers
to the query.  The final {\tt no} response above indicates that the
category displayed is the only one that was found.  If there are no
parses for a string, an answer of {\tt no} is returned, as with:
%
\begin{verbatim}
| ?- rec([runs,john]).

STRING: 
0 runs 1 john 2

no
\end{verbatim}
%

Notice that there is no notion of ``distinguished start symbol'' in
parsing.  Rather, the recognizer generates all categories that it can
find for the input string.  This allows sentence fragments and phrases
to be analyzed, as in:
%
\begin{verbatim}
  | ?- rec [big,kid].
  
  STRING: 
  0 big 1 kid 2
  
  CATEGORY: 
  cat
  QSTORE ne_list_quant
         HD some
            RESTR and
                  CONJ1 kid
                        ARG1 [0] individual
                  CONJ2 big
                        ARG1 [0] 
            SCOPE proposition
            VAR [0] 
         TL e_list
  SYNSEM basic
         SEM [0] 
         SYN np
  
  ANOTHER?  n.
\end{verbatim}
%

There is also a two-place version of {\tt rec}\label{REC2} that
displays only those parses that satisfy a given description:
%
\begin{verbatim}
  | ?- rec([big,kid],s).

  STRING: 
  0 big 1 kid 2

  no
\end{verbatim}
%
This call to {\tt rec/2} failed because there were no parses of {\tt
big kid} of type, {\tt s}.  Internally, the parser still generates all
of the edges that it normally does --- the extra description is only
applied at the end as a filter.

Once parsing has taken place for a sentence using {\tt rec/1}, it is
possible to look at categories that were generated internally.  In
general, the parser will find every possible analysis of every
substring of the input string, and these will be available for later
inspection.  For instance, suppose the last call to {\tt rec/1}
executed was {\tt rec [john,hits,every,toy]}, the results of which are
given above.  Then the following query can be made\label{EDGE}:
%
\begin{verbatim}
  | ?- edge(2,4).
  
  COMPLETED CATEGORIES SPANNING: every toy 
  
  cat
  QSTORE ne_list_quant
         HD every
            RESTR toy
                  ARG1 [0] individual
            SCOPE proposition
            VAR [0] 
         TL e_list
  SYNSEM basic
         SEM [0] 
         SYN np

  Edge created for category above: 
       index: 20
        from: 2 to: 4
      string: every toy
        rule:  np_det_nbar
   # of dtrs: 2
  Action(retract,dtr-#,continue,abort)? 
  |: continue.
 
no
| ?-
\end{verbatim}
%
The possible replies in the action-line will be discussed in the next
section.  This tells us that from positions {\tt 2} to {\tt 4}, which
covers the string {\tt every toy} in the input, the indicated category
was found.  Even though an active chart parser is used, it is not
possible to inspect active edges.  This is because {\sc ale}
represents active edges as dynamic structures that are not available
after they have been evaluated.

Using {\tt edge/2} it is possible to debug grammars by seeing how far
analyses progressed and by inspecting analyses of substrings.  

There is also a predicate, {\tt rec/4}~\label{REC4} that binds the
answer to variables instead of displaying it:%
%
\footnote{Actually, this example is a bit of an improvisation --- the
  sample {\sc hpsg} grammar included in the {\sc ale} distribution
  does not use inequations.}
%
\begin{verbatim}
  | ?- rec([kim,walks],Ref,SVs,Iqs).
 
  Iqs = [ineq(_A,index(_B-third,_C-plur,_D-masc),_E,index(_B-third,_C-plur,
  _D-masc),done)],
  SVs =  phrase(_F-synsem(_G-loc(_H-cat(_I-verb(_J-minus,_K-minus,_L-none,_M-
  bool,_N-fin),_O-unmarked,_P-e_list),...)))?
\end{verbatim}
%
{\tt Ref}~\label{INTERNAL-REP} is an unbound variable that represents
the root node of the resulting feature structure.  {\tt SVs} is a term
whose functor is the type of the feature structure, whose arity is the
number of appropriate features for that type, and whose arguments are
the values at those appropriate features in alphabetical order.  Each
value, in turn, is of the form, {\tt Ref-SVs}, another pair of root
variable and type-functored term.  {\tt Iqs} is a list of the
inequations that apply to the feature structure and its substructures.
Each member of the list represents a disjunction of inequations, i.e.,
one must be satisfied, with the list itself being interpreted
conjunctively, i.e., every member must be satisfied.  Each member is
represented by a chain of {\tt ineq/5} structures:
%
\begin{verbatim}
ineq(Ref1,SVs1,Ref2,SVs2,ineq(......,done)...)
\end{verbatim}
%
The first four arguments represent the {\tt Ref-SVs} pairs of the two
inequated feature structures of the first disjunct.  The fifth
argument contains another {\tt ineq/5} structure, or {\tt done/0}.
These three structures are suitable for passing to {\tt gen/3} or {\tt
  gen/4}.  These representations are not grounded; so if you want to
assert them into a database, be sure to assert them all in one
predicate to preserve variable sharing.  For more details on {\sc
  ale}'s internal representation, the reader is referred to Carpenter
and Penn (1996).

There is also a {\tt rec/5}\label{REC5}, which works just like {\tt rec/4}, but
filters its output through the description in its last argument, just
like {\tt rec/2}:
%
\begin{verbatim}
  | ?- rec([kim,walks],Ref,SVs,Iqs,phrase).
 
  Iqs = [ineq(_A,index(_B-third,_C-plur,_D-masc),_E,index(_B-third,_C-plur,
  _D-masc),done)],
  SVs =  phrase(_F-synsem(_G-loc(_H-cat(_I-verb(_J-minus,_K-minus,_L-none,_M-
  bool,_N-fin),_O-unmarked,_P-e_list),...)))?
\end{verbatim}
%
The call succeeds here because the given answer is of type, {\tt phrase}.

{\tt rec\_list/2}\label{RECLIST} iteratively displays all of the
solutions for each string in a list of list of words that satisfy a
description:
%
\begin{verbatim}
  | ?- rec_list([[kim,sees,sandy],[sandy,sees,kim]],phrase).

  STRING: 
  0 kim 1 sees 2 sandy 3
  CATEGORY: 
  
  phrase
  QRETR e_list
  QSTORE e_set
  SYNSEM synsem...

  ANOTHER? y.

  STRING: 
  0 sandy 1 sees 2 kim 3
  CATEGORY: 

  phrase
  QRETR e_list
  QSTORE e_set
  SYNSEM synsem

  ANOTHER?  y.

  no
\end{verbatim}
%
If no filtering through a description is desired, the description,
{\tt bot}, which is trivially satisfied, can be used.  When {\tt
rec\_list/2} runs out of solutions for a string, it moves on to the
next string.

{\tt rec\_best/2}\label{RECBEST} iteratively displays all of the
solutions satisfying a given description for the {\it first} string in
a list of list of words that has a solution satisfying that
description.
%
\begin{verbatim}
  | ?- rec_best([[kim,sees,sandy],[sandy,sees,kim]],phrase).

  STRING: 
  0 kim 1 sees 2 sandy 3
  CATEGORY: 
  
  phrase
  QRETR e_list
  QSTORE e_set
  SYNSEM synsem...

  ANOTHER? y.

  no
\end{verbatim}
%
When {\tt rec\_best/2} runs out of solutions for a string that had at
least one solution, it fails.  It only tries the strings in its first
argument until it finds one that has solutions.

There is also a three-place {\tt rec\_list/3}\label{RECLIST3} that
collects the solutions to {\tt rec\_list/2} in a list of terms of the
form {\tt fs(Tag-SVs,Iqs)}.

\mysection{Executing Grammars: Generation}

The generator can be used with the predicate {\tt gen/1,3}\label{GEN}
predicate.  It can take a single description argument:
%
\begin{verbatim}
  | ?- gen((sentence,
            sem:(pred:decl,
                 args:[(pred:call_up,
                        args:[pred:mary,pred:john])]))).

  CATEGORY: 

  sentence
  SEM sem
      ARGS arg_ne_list
           HD sem
              ARGS arg_ne_list
                   HD sem
                      ARGS arg_list
                      PRED mary
                   TL arg_ne_list
                      HD sem
                         ARGS arg_list
                         PRED john
                      TL e_list
              PRED call_up
           TL e_list
      PRED decl


  STRING: 
  mary calls john up 

  ANOTHER?  y.

  STRING: 
  mary calls up john 

  ANOTHER?  y.

  no
\end{verbatim}
%
Notice the extra set of parentheses necessary to make the whole
description a single argument for {\tt gen/1}.

{\tt gen} can also take three arguments:
%
\begin{verbatim}
  gen(Ref,SVs,Iqs)
\end{verbatim}
%
where {\tt Ref}, {\tt SVs} and {\tt Iqs} are the three parts of {\sc
  ale}'s internal representation of a feature structure as defined in
the last section.  This alternative is most useful when the feature
structure has been generated before by another process, like parsing,
or retrieved from a database.
%
\begin{verbatim}
  | ?- rec([john,calls,mary,up],Ref,SVs,Iqs),gen(Ref,SVs,Iqs).

  CATEGORY: 

  sentence
  SEM sem
      ARGS arg_ne_list
           HD sem
              ARGS arg_ne_list
                   HD sem
                      ARGS e_list
                      PRED john
                   TL arg_ne_list
                      HD sem
                         ARGS e_list
                         PRED mary
                      TL e_list
              PRED call_up
           TL e_list
      PRED decl


  STRING: 
  john calls mary up 

  ANOTHER?  n.

  Iqs = [],
  SVs = sentence(_O-sem(_N-arg_ne_list(_M-sem(_L-arg_ne_list(_K-
  sem(_J-e_list,_I-john),_H-arg_ne_list(_G-sem(_F-e_list,_E-mary),
  _D-e_list)),_C-call_up),_B-e_list),_A-decl)) ? 

  yes
\end{verbatim}
%
In both cases, {\sc ale} will print the input feature structure and
then will generate and display all possible string solutions through
backtracking.

It is also possible to bind the string to a variable, using {\tt
  gen/4}~\label{GEN4}:
%
\begin{verbatim}
  gen(Ref,SVs,Iqs,Ws).
\end{verbatim}
%
{\tt Ws} will non-deterministically be bound to the word lists
that constitute valid solutions to the generation problem.  This can
be used as input to {\tt rec/1}, for example.
%
\begin{verbatim}
  | ?- rec([john,calls,mary,up],Ref,SVs,Iqs),gen(Ref,SVs,Iqs,Ws).

  Iqs = [],
  SVs = sentence(_O-sem(_N-arg_ne_list(_M-sem(_L-arg_ne_list(_K-
  sem(_J-e_list,_I-john),_H-arg_ne_list(_G-sem(_F-e_list,_E-mary),
  _D-e_list)),_C-call_up),_B-e_list),_A-decl)),
  Ws = [john,calls,mary,up] ? ;

  Iqs = [],
  SVs = sentence(_O-sem(_N-arg_ne_list(_M-sem(_L-arg_ne_list(_K-
  sem(_J-e_list,_I-john),_H-arg_ne_list(_G-sem(_F-e_list,_E-mary),
  _D-e_list)),_C-call_up),_B-e_list),_A-decl)),
  Ws = [john,calls,up,mary] ? ;

  Iqs = [],
  SVs = s(_L-finite,_K-sem(_J-arg_ne_list(_I-sem(_H-e_list,_G-john),
  _F-arg_ne_list(_E-sem(_D-e_list,_C-mary),_B-e_list)),_A-call_up)),
  Ws = [john,calls,mary,up] ? ;

  Iqs = [],
  SVs = s(_L-finite,_K-sem(_J-arg_ne_list(_I-sem(_H-e_list,_G-john),
  _F-arg_ne_list(_E-sem(_D-e_list,_C-mary),_B-e_list)),_A-call_up)),
  Ws = [john,calls,up,mary] ? ;

no
\end{verbatim}
%
The last two solutions in the example above are generated because the
input string, {\tt john calls mary up}, can be parsed both as a {\tt
sentence} type and as an {\tt s} type.

\mysection{Mini-interpreter (parsing only)}

{\sc ale} contains a mini-interpreter that allows the user to
traverse and edit an {\sc ale} parse tree.  By default, the
mini-interpreter is off when {\sc ale} is loaded.  To turn the
mini-interpreter on, simply type\label{INTERP}:
%
\begin{verbatim}
  | ?- interp.

  interpreter is active

  yes
  | ?-
\end{verbatim}
%
To turn it off again, use {\tt nointerp.}\label{NOINTERP} Any parse
automatically stores the following information on the edges added to
{\sc ale}'s chart:
%
\begin{itemize}
\item  Spanning nodes
\item  Substring spanned
\item  Creator
\item  Daughters (if any)
\end{itemize}
%
The spanning nodes are the nodes in the chart that the edge spans.
The substring spanned is the concatenation of lexical items between
the spanning nodes.  If an edge was formed by the application of an
{\sc ale} grammatical rule, its creator is that rule, with the
daughters being the daughters of the rule, i.e., the {\tt cat>}, and
{\tt cats>} of the rule).  If the rule was created by EFD closure, the
mini-interpreter will treat it as the user's rule it was created from,
displaying the empty categories that had matched its daughters during
EFD closure in the same way as daughters matched at run-time.  If an
edge represents an empty category, its creator is normally {\tt
empty}; but empty categories created during EFD closure will show the
rules that created them, along with their empty category daughters.
If an edge represents a lexical item, its creator is {\tt lexicon}.
In the case of empty categories not created by EFD closure and all
lexical items, there are no daughters.

The status of the mini-interpreter has no effect on compilation.  The
same code is used regardless of whether the mini-interpreter is active
or inactive.  The mini-interpreter only has an effect on the run-time
commands {\tt rec/1,2,4,5} and {\tt drec/1}.

When the mini-interpreter is active, {\tt rec/1} operates in one of
two modes: query-mode or go-mode.  When the mini-interpreter is active,
{\tt rec/1} always begins in {\it query-mode}.  In query-mode, the
user is prompted just before any edge is added.  Because {\sc ale}
parses from right to left, the edges are encountered in that order.
The prompt consists of a display of the feature structure for the
edge, followed by the mini-interpreter information for that edge,
followed by an {\em action-line}, which lists the options available to
the user.  For example (from the {\sc hpsg} grammar included in the
{\sc ale} distribution):
%
\begin{verbatim}
  | ?- rec([kim,sees,sandy]).

  STRING: 
  0 kim 1 sees 2 sandy 3

  word
  QRETR list_quant
  QSTORE e_set
  SYNSEM synsem
         LOC loc
             CAT cat
                 HEAD noun
                      CASE case
                      MOD none
                      PRD bool
                 MARKING unmarked
                 SUBCAT e_list
             CONT nom_obj
                  INDEX [0] ref
                        GEN gend
                        NUM sing
                        PER third
                  RESTR e_set

  Edge created for category above: 
        from: 2 to: 3
      string: sandy 
        rule:  lexicon
   # of dtrs: 0

  Action(add,noadd,go(-#),break,dtr-#,abort)? 
  |: 
\end{verbatim}
%
We see, in this example, the main action-line for {\tt rec}.  If the
user selects {\tt add}\label{ADD}, the edge is added, and {\tt rec}
proceeds, in query-mode, as usual.  If {\tt noadd}\label{NOADD} is
selected, the edge is not added, and {\tt rec} proceeds in query-mode.
In every {\sc ale} action-line, the first option is the one that {\sc
  ale} would have chosen if the mini-interpreter were disabled.

{\tt go}\label{GO} puts the mini-interpreter into {\it go-mode}.  In
go-mode, {\tt rec} proceeds as it would if the mini-interpreter were
inactive, or to think of it another way, it functions as if the user
always chose the first option on every action-line, but it does not
stop to ask.  As it adds the edges, it displays them, along with their
mini-interpreter information.  {\tt go} suffixed with a
number\label{GON}, e.g.,  {\tt go-1}, puts the mini-interpreter into
go-mode until it encounters an edge whose left node is that number,
and then, beginning with that edge, automatically switches back into
query-mode.  With {\sc ale}'s current parsing strategy, {\tt go-}{\em
  N} will remain in go-mode until it encounters the first edge
corresponding to the ({\em N+1})st lexical item in the string being
parsed.

{\tt break}\label{BREAK} simply invokes the Prolog {\tt break}
commmand, placing the user into a Prolog interpreter with a new
break-level.  Edges that have been added so far can be examined and
retracted at this time.  When the user pops out of the break, the
current prompt is redisplayed.

{\tt dtr-N}\label{DTRN} displays the {\tt N}th daughter, its
mini-interpreter information, and the action-line for {\tt dtr}:
%
\begin{verbatim}
  Action(retract,dtr-#,parent,abort)?
  |: 
\end{verbatim}
%
{\tt retract}\label{RETRACTMI} removes the displayed edge (in this
case, the daughter) from the chart.  When the parse continues, {\sc
  ale} grammar rules will not be able to use that edge.  The current
edge (the parent of this daughter), however, can still be added.  {\tt
  dtr-N} has the same effect as in the {\tt rec} action-line.  {\tt
  parent}\label{PARENT} returns to the current edge's parent and its
action-line (either {\tt rec} or {\tt dtr}).

The mini-interpreter will not display any edge that has already been
retracted. Note that if edges are retracted, there may be gaps in the
sequence of chart-edge indices.

If {\tt abort}\label{ABORT} is selected, the parse is aborted.  All of
the edges added so far remain in memory until the next {\tt rec}
statement.  The edge that was displayed when {\tt abort} was chosen is
discarded.

Mini-interpreter information is also available through the run-time
commands, {\tt edge/2} and {\tt edge/1}, e.g.:
%
\begin{verbatim}
  | ?- edge(2,4).
  
  COMPLETED CATEGORIES SPANNING: every toy 
  
  cat
  QSTORE ne_list_quant
         HD every
            RESTR toy
                  ARG1 [0] individual
            SCOPE proposition
            VAR [0] 
         TL e_list
  SYNSEM basic
         SEM [0] 
         SYN np

  Edge created for category above: 
       index: 20
        from: 2 to: 4
      string: every toy
        rule:  np_det_nbar
   # of dtrs: 2
  Action(retract,dtr-#,continue,abort)? 
  |: 
\end{verbatim}
%
Every edge that is actually asserted into the chart is assigned a
unique number, called an {\em index}, which {\tt edge/2} displays also.
{\tt retract} and {\tt dtr} behave the same as in the {\tt dtr}
action-line.  {\tt continue}\label{CONTINUE} tells the
mini-interpreter that the user is done traversing the parse tree
rooted at the current edge, and to find more.

{\tt edge/1} works exactly as {\tt edge/2} does, except that its input
is the unique index number that {\sc ale} stores with every edge.

\mysection{Subsumption Checking (parsing only)}
\label{SUBSUMPTION}

{\sc ale} can perform subsumption checking on edges during parsing.
By default, it does not.  To enable it, use the
command\label{SUBTEST}:
%
\begin{verbatim}
  | ?- subtest.
 
  edge subsumption checking active
 
  yes
  | ?- 
\end{verbatim}
%
To turn it, off, use {\tt nosubtest.}\label{NOSUBTEST}

When subsumption checking is enabled, {\sc ale} will only add a new
feature structure to the chart between nodes $n$ and $m$ if there is
no other edge currently spanning $n$ and $m$ whose feature structure
subsumes the new one.  If, instead, the new feature structure subsumes
an existing one's, then the existing edge is retracted and replaced
with the new one.

Note that if edges are retracted, there may be gaps in the sequence of
chart-edge indices.

Extra compiled code is required in order to make subsumption checking
more efficient.  If subsumption checking is enabled when a grammar is
compiled, this code will be compiled also.  If it is disabled, the
subsumption checking code will be compiled when the {\tt subtest}
command is given.  Only in the former case, however, will subsumption
checking be used during EFD closure to reduce the number of empty
categories to consider at run-time.  The empty categories and phrase
structure rules must be recompiled (with {\tt compile\_rules}) if
subsumption checking is enabled after the initial compilation of a
grammar for empty categories to be tested for subsumption.

Our experience has been that subsumption checking is not required in
most unification-based grammars, and should therefore be left
disabled.  It is useful only for those grammars which have true
spurious ambiguity or redundancy.  Grammars that incorporate some
notion of thematic or functional structure for representing the
meaning of a sentence normally realise structural ambiguities as
semantic ambiguities that should be retained in the chart.

If both subsumption checking and the mini-interpreter are enabled,
then the user may override either of these behaviours.  In the former case,
before the new feature structure is discarded, it will be displayed
along with the discard action-line:
%
\begin{verbatim}
  Action(noadd,continue,break,dtr-#,existing,abort)?
  |:
\end{verbatim}
%
{\tt continue}\label{CONTINUE2} instructs the parser to look for more
subsuming edges.  If no more are found, the new feature structure is
added to the chart.  {\tt existing}\label{EXISTING} displays the
existing chart edge that subsumes the new one with the {\tt edge/2}
action-line.  The rest behave as described above.

In the latter case, before an existing edge is retracted, its feature
structure will be displayed along with the retract action-line:
%
\begin{verbatim}
  Action(retract,continue,break,dtr-#,incoming,abort)?
  |:
\end{verbatim}
%
{\tt retract} retracts the existing, subsumed edge and asserts the
incoming feature structure.  {\tt incoming}\label{INCOMING} displays
the incoming feature structure, along with the {\tt incoming}
action-line:
%
\begin{verbatim}
  Action(noadd,dtr-#,existing,abort)?
  |:
\end{verbatim}
%
the functions of whose options have already been described.

\mysection{Source-Level Debugger}

{\sc ale} also provides an XEmacs-based source-level debugger.  This
can only be used for parsing or definite clause resolution, and only
with SICStus 3.8.6 or higher, and XEmacs 20.3 or higher.  In future
releases, a debugger with a more restricted functionality will be made
available for users of SWI Prolog.

The {\sc ale} source-level debugger is implemented on top of the
SICStus source-level debugger.  The debugger provided with {\sc ale}
has the complete functionality of the SICStus source-level debugger
with ordinary Prolog programs; so you only need this one if you will
need to debug both.  SICStus debugger commands that are not explicitly
mentioned in this section are not supported in {\sc ale} debugging.

{\sc ale} source code must occur in a single file in order to be
debugged.  To debug Prolog source code, please refer to the SICStus
documentation.  For both {\sc ale} and Prolog debugging, the
prolog flag, {\tt source\_info}, needs to be turned on, using the command:
%
\begin{verbatim}
  | ?- prolog_flag(source_info,_,on).
\end{verbatim}
%
The SICStus XEmacs interface should do this automatically.

When a Prolog hook is encountered while debugging an {\sc ale}
grammar, the SICStus debugger is automatically invoked.  The hook will
be embedded in a Prolog {\tt call/1} statement.  If the {\tt leap}
option of the SICStus debugger is used, the leap ends at the end of
the hook --- {\sc ale} will creep when it resumes control.

To install the {\sc ale} source-level debugger, follow the directions
in the distribution file, {\tt debugger/INSTALL}.

\mysubsection{Running without XEmacs}

To run the debugger without XEmacs, simply run SICStus Prolog from the
directory with {\tt debugger.pl} and type:
%
\begin{verbatim}
  | ?- compile(debugger).
\end{verbatim}
%
This assumes that {\tt ale.pl} and the {\tt debugger} subdirectory are
located in the same directory.  If {\sc ale.pl} has not been loaded
already, this will compile {\tt ale.pl} as well.  There will be a
warning message about buffer-mode when it loads, which can be ignored.
Then type:\label{NOEMACS}
%
\begin{verbatim}
  | ?- noemacs.
\end{verbatim}
%
You can add a {\tt noemacs/0} directive to the end of {\tt
debugger.pl} to do this automatically.

\mysubsection{Running with XEmacs}

To run the debugger with XEmacs, you must run SICStus Prolog and {\sc
  ale} within XEmacs as an inferior process.  To do this:
%
\begin{enumerate}
\item
  set the EPROLOG environment variable to the command that
  runs SICStus Prolog in the shell that you will run XEmacs from
  (this is not necessary if the command is 'sicstus'),
\item
  run XEmacs from the directory with {\tt debugger.pl},
\item
  load the file to be debugged in Prolog major mode,
\item
  Use the XEmacs command, {\tt M-x run-prolog}, to run SICStus prolog as an
  inferior process,
\item
  From the SICStus Prolog prompt, type:
%
\begin{verbatim}
  | ?- compile(debugger).
\end{verbatim}
\end{enumerate}
%
There is also a command:\label{EMACS}
%
\begin{verbatim}
  | ?- emacs.
\end{verbatim}
%
that will turn on the XEmacs interface, if it has been disabled by
{\tt noemacs/0}.

\mysubsection{Debugger Commands}

There are seven basic commands in the {\sc ale} debugger, four of
which are variants of normal {\sc ale} commands.  These four, {\tt
dcompile\_gram/1}, {\tt drec/1}, {\tt dquery/1} and {\tt dgen/1}, are
the debugger variants of {\tt compile\_gram/1}, {\tt rec/1}, {\tt
query/1} and {\tt gen/1}, respectively.  The other three, {\tt
dleash/1}, {\tt dskip/1}, and {\tt dclear\_bps/0}, will be explained
below.

Currently, the {\sc ale} source-level debugger can only debug grammars
down to the level of feature structure unification, i.e., feature
structure unification is treated as an atomic operation.  Constraints
and procedural attachments on types using {\tt cons} and {\tt goal}
cannot be debugged either, nor can inequation enforcement, edge
subsumption checking, or extensionalization.  These will be possible
in a future version.  For now, {\tt
dcompile\_gram/1}\label{DCOMPILEGRAM} compiles a grammar in exactly
the same way that {\sc ale} normally does to the point where it can be
debugged.  It then reopens the grammar file and uses the token-stream
directly to index those parts of the grammar source code that it can
debug by line number.  {\tt dcompile\_gram/1} also requires that all
of the {\sc ale} source code for a grammar be in a single file, i.e.,
unlike {\sc ale} without the debugger, you cannot load other auxiliary
files from within a grammar file.  This restriction will also be
lifted in a future version.

{\tt dcompile\_gram/1} also allows for tracing through the EFD closure
algorithm, again down to the level of feature structure unification.

After a grammar has been compiled for debugging with {\tt
dcompile\_gram/1}, {\tt drec/1}, {\tt dquery/1} and {\tt dgen/1} can
be used for parsing, definite clause resolution or generation,
respectively, with the debugger.  {\tt dquery/1}\label{DQUERY} works
exactly as {\tt query/1} does: it first finds most general satisfiers
of the arguments and then searches for a solution with Prolog-style
SLD-resolution using the clauses provided by the user.  {\tt
drec/1}\label{DREC} and {\tt dgen/1}\label{DGEN} work exactly as {\tt
rec/1} and {\tt gen/1} do, except that lexical rules are not compiled
out, so that they can be debugged.  Also remember that {\sc ale}
parses right-to-left with EFD-closed rules and empty categories, and
that it uses a semantic-head-driven generator.  With parsing, the
debugger can also be used in conjunction with the chart
mini-interpreter described above for extra control of the chart.  With
generation, the debugger successively shows the construction of the
pivot template, pivot matching, pivot checking and the linking of the
pivot with the root.  Generation of non-semantic-head daughters is
performed recursively.

\mysubsection{Debugger Ports and Steps}

The {\sc ale} debugger is loosely based on the procedural box model of
execution that many Prolog debuggers use.  There are four kinds of
ports, {\it call}, {\it exit}, {\it redo}, and {\it fail}.  The {\sc
ale} debugger does not support exception ports.  A call port occurs at
every initial invocation of a step that {\sc ale} takes in parsing or
definite clause resolution, a list of which is given below.  An exit
port occurs at the successful completion of such a step; a redo port
occurs when a subsequent step has failed and {\sc ale} backtracks into
the current step to find more solutions, and a fail port occurs when a
step has failed to produce any or any more solutions.

Consider, for example, what happens when {\sc ale} applies the
following description to a feature structure that occurs in the
lexical entry for the word, {\tt foo}
%
\begin{verbatim}
foo --->(a
        ;f:b),
        g:c.
\end{verbatim}
%
The first port we encounter is when {\sc ale} tries to add the type
{\tt a}.  This is a call port:
%
\begin{verbatim}
Call: add type, a, to lex entry?
\end{verbatim}
%
If this succeeds, then an exit port occurs:
%
\begin{verbatim}
Exit: add type, a, to lex entry?
\end{verbatim}
%
and processing moves on to {\tt g:c}.  If this fails, then we must
backtrack through adding {\tt a} for more solutions (for example, if
there is a disjunctive constraint on that type):
%
\begin{verbatim}
Redo: add type, a, to lex entry?
\end{verbatim}
%
If there is another solution, then another exit port occurs.
Otherwise, the next port is fail port:
%
\begin{verbatim}
Fail: add type, a, to lex entry?
\end{verbatim}
%
and processing continues with the other disjunct, {\tt f:b}.
Certain steps in {\sc ale}, notably the depth-first rule application
of the chart parser, are failure-driven loops.  To indicate that these
``failures'' are actually a normal part of execution, they are
displayed as, e.g.:
%
\begin{verbatim}
Finished:close chart edge under rule application?
\end{verbatim}

Enforcing descriptions at a feature also counts as a step:
%
\begin{verbatim}
Call: enforce description on f value of lex entry?
\end{verbatim}
%
If we agree to this, then the next step would be to add the type {\tt b}
%
\begin{verbatim}
Call: add type, b, to value at f?
\end{verbatim}
%
and so on.

The steps in a description whose ports the {\sc ale} debugger keeps
track of are given in Figure~\ref{fig:desc-steps}:
%
\begin{figure}[tbh]
\begin{tabular}{|l|l|l|} \hline
Step & Kind & Example Message \\ \hline
Adding types & {\tt desc} & {\tt add type, a, to lex entry} \\ \hline
Adding {\tt a\_/1} atoms & {\tt desc} & {\tt add atom, foo(X), to lex
  entry} \\ \hline
Feature selection & {\tt desc} & {\tt enforce description on f value} \\
& & {\tt of lex entry} \\ \hline
Adding path equations & {\tt desc} & {\tt path equate [f,g] with [f,h]}
  \\ \hline
Adding inequations & {\tt desc} & {\tt inequate \ldots with description} \\ \hline
Unification (from shared & {\tt desc} & {\tt unify \ldots with
  $Var$} \\
variables) &&\\ \hline
Macro substitution & {\tt desc} & {\tt substitute macro description
for np/1} \\ \hline
Functional description & {\tt desc} & {\tt evaluate
  functional description,} \\
evaluation  & & {\tt append/2} \\ \hline
\end{tabular}
\caption{Description-level Steps}\label{fig:desc-steps}
\end{figure}
%
There are other kinds of steps besides description-level ones, that
pertain to {\sc ale}'s built-in control for parsing and definite
clause resolution.  These steps, along with their {\it kind}, are
given in the table in Figure~\ref{fig:parse-dcs-steps}.
%
\begin{figure}[tbh]
\begin{tabular}{|l|l|l|} \hline
Step & Kind & Example Message \\ \hline
Empty category derivation & {\tt empty} & {\tt empty category} \\ \hline
EFD Closure & {\tt empty} & {\tt close empty categories under rules}
\\ \hline
EFD matching & {\tt empty} & {\tt apply daughter 1 of rule} \\
 & & {\tt sentence1 to empty category} \\ \hline
Lexical entry derivation & {\tt lex} & {\tt derive seen from base entry: see}
    \\ \hline
Lexical rule application & {\tt lr} & {\tt apply lexical rule,
  passive,} \\
 & & {\tt to see} \\ \hline
Input morph application & {\tt lr} & {\tt apply morph to input: see} \\ \hline
Output morph application & {\tt lr} & {\tt apply morph to input: see,} \\ 
  & & {\tt output: seen} \\ \hline
Morph condition application & {\tt lr} & {\tt apply morph condition} \\ \hline
Functional description clause & {\tt fun} & {\tt evaluate
    functional clause for} \\
selection  & & {\tt append/2} \\ \hline
Definite clause selection & {\tt rel} & {\tt evaluate relational}
  clause for \\
& & {\tt head\_feature\_principle/2} \\ \hline
Definite clause resolution & {\tt rel} & {\tt resolve goal, append/3} \\ \hline
Negated goal (\verb-\+-) resolution & {\tt rel} & {\tt resolve negated
  goal} \\ \hline
Shallow cut (\verb+->;+) execution & {\tt rel} & {\tt execute shallow
cut} \\ \hline
Extensional identity (\verb-@=-) & {\tt rel} & {\tt resolve
    extensional identity} \\
check &&\\ \hline
Prolog hook call & {\tt rel} & {\tt resolve prolog hook:} \\
  & & {\tt (num(N),write(N)} \\ \hline
Closing new chart edge under & {\tt
    rule} & {\tt close chart edge under rule} \\
rules as leftmost daughter & & {\tt application} \\ \hline
Rule selection & {\tt rule} & {\tt apply rule, schema1}. \\ \hline
\end{tabular}
\caption{Parsing and Definite Clause Resolution Steps}\label{fig:parse-dcs-steps}
\end{figure}
%
The steps for generation are given in Figure~\ref{fig:gen-steps}.
%
\begin{figure}[tbh]
\begin{tabular}{|l|l|l|} \hline
Step & Kind & Example Message \\ \hline
Build semantic index & {\tt gen} & {\tt build semantic index from
root} \\ \hline
Build pivot template & {\tt gen} & {\tt build pivot template from
index} \\ \hline
Find pivot & {\tt gen} & {\tt find pivot} \\ \hline
Match pivot (lexical entry) & {\tt gen} & {\tt match pivot against
entry derived} \\
&& {\tt from see} \\ \hline
Match pivot (empty category) & {\tt gen} & {\tt match pivot
against empty category} \\ \hline
Match pivot (mother of & {\tt gen} & {\tt match pivot
against mother of} \\
non-chain rule) && {\tt non-chain rule, sentence1} \\ \hline
Apply chain rule & {\tt gen} & {\tt apply chain rule, s, to, pivot} \\
\hline
Pivot check & {\tt gen} & {\tt check for link from pivot to root} \\
\hline
Generate from pivot & {\tt gen} & {\tt recursively generate from
pivot} \\ \hline
Generate pre-head daughters & {\tt gen} & {\tt recursively generate
pre-head} \\
&& {\tt daughters from pivot} \\ \hline
Generate post-head daughters & {\tt gen} & {\tt recursively generate
post-head} \\
&& {\tt daughters from pivot} \\ \hline
Connect pivot to root & {\tt gen} & {\tt connect pivot to root} \\
\hline
Connect chain node to root & {\tt gen} & {\tt connect new chain node
to root} \\ \hline
Unify pivot template with & {\tt lr} & {\tt unify pivot
template and see} \\
lexical entry &&\\ \hline
Unify chain node & {\tt gen} & {\tt unify mother of chain rule, s,} \\
&& {\tt with root} \\ \hline
\end{tabular}
\caption{Generation Steps}\label{fig:gen-steps}
\end{figure}
%
Almost all of these involve sub-steps that enforce descriptions.
Some, such as chart-edge closure, involve other sub-steps such as rule
selection.  Lexical rule application includes input and output morph
application, as well as morph condition (given in a {\tt when} clause)
application.  The one generation step of kind, {\tt lr}, takes place
when the pivot template is matched against a (base or derived) lexical
entry.  Unlike the compiled generator, the debugger does not compile
lexical rules into the non-chain-rule selection step, so that the user
can see their application.  Instead, a base lexical entry is first
selected without reference to the pivot template, then zero or more
lexical rules are applied bottom up, until a derived entry is
eventually unified with the pivot template.  Lexical rules are, thus,
treated as a special kind of unary chain rule.  The length of these
special chains is still controlled by {\tt lex\_rule\_depth/1}, not
{\tt chain\_length/1}.

\mysubsection{Leashing}

With so many steps, and four possible ports, stepping through an
entire parse in a large grammar would be a very trying experience.
The {\sc ale} debugger provides three basic ways to filter through the
steps to find points of interest in a parse or definite clause
query.

The first is leashing.  Leashing allows the user to distinguish at
which steps information is simply displayed and at which steps the
debugger stops and asks the user what to do.  Unlike the SICStus
debugger, leashing in the {\sc ale} debugger is a property of steps,
not ports.  The command to control leashing is {\tt
  dleash/1}.\label{DLEASH} The argument to dleash/1 consists of a
sign, + or -, plus the kind of step.  A + sign indicates that the
debugger should stop and ask what to do at steps of that kind; and a -
sign indicates that it should simply display the port and proceed.
For example, to turn leashing off for empty categories, type:
%
\begin{verbatim}
  | ?- dleash(-empty).

  yes
\end{verbatim}
%
There is also a special kind, {\tt all}, that allows the user to turn
on and off leashing on all kinds of steps at once.  The default
leashing at start-up is {\tt +all}.

If a kind of step is leashed, then the debugger will stop at every
port for every step of that kind, and ask what to do.  The possible
responses are given in Figure~\ref{fig:response}:
%
\begin{figure}[tbh]
\begin{tabular}{|l|l|l|} \hline
Input & Description & Ports \\ \hline
{\tt ?} & show available commands at current port & c,e,r,f \\ \hline
{\tt h} & same as {\tt ?} & \\ \hline
{\tt a} & abort processing & c,e,r,f \\ \hline
{\tt f} & fail at this step (go to fail port) & c,e,r \\ \hline
{\tt r} & retry this step (go from fail to call port) & f \\ \hline
{\tt c} & advance to next port & c,e,r,f \\ \hline
LF & same as {\tt c} & \\ \hline
{\tt s} & advance to next exit/fail port of this step & c,r \\ \hline
{\tt n} & advance to first port on new line of grammar file & c,e,r,f \\ \hline
{\tt l} & advance to next breakpoint & c,e,r,f \\ \hline
{\tt +} & set breakpoint at current line & c,e,r,f \\ \hline
{\tt -} & clear breakpoint at current line & c,e,r,f \\ \hline
{\tt @} {\it PrologGoal} & pass a goal to Prolog &
   c,e,r,f \\ \hline
{\tt i} & toggle chart mini-interpreter & c,e,r,f \\ \hline
{\tt d} & display current structure & c,e,r,f \\ \hline
\end{tabular}
\caption{Possible Responses at Debugger Ports}\label{fig:response}
\end{figure}
%
Not all responses are available at all ports.  The kind of port (call,
fail, etc.) is what determines the possible responses.  The responses,
{\tt ?} and {\tt h} are always available, and list the other legal
responses at the current port.

\mysubsection{Skipping}

Even leashing may not be enough for very large parses or queries
because of the sheer number of ports displayed.  The {\sc ale}
debugger also provides a facility for {\it auto-skipping}.  Whereas
turning leashing off at a kind of step is like automatically answering
{\tt c} (advance to next port) at those steps, auto-skipping is like
automatically answering {\tt s}, which advances to the next exit or
fail port of the current step without stopping at or even displaying
the ports in between.  The command for this is {\tt
  dskip/1},\label{DSKIP} and its argument is of the same form as the
argument to {\tt dleash/1}.  The signs have a different meaning, of
course.  For example, {\tt dskip(+empty)} means that you want the
debugger to auto-skip steps of kind {\tt empty}, i.e., not stop and
ask, whereas {\tt dleash(+empty)} means that you want to leash steps
of kind {\tt empty}, i.e., stop and ask.  When a step where
auto-skipping is set is encountered, it is displayed with an automatic
reply without stopping, e.g.:
%
\begin{verbatim}
Call: empty category? <auto-skip>
Exit: empty category? <auto-skipped>
Edge added: Number:0, Left:1, Right:1, Rule:empty
Call: close chart edge under rule application?
\end{verbatim}

\mysubsection{Breakpoints}

The final kind of filtering is the breakpoint.  In the {\sc ale}
debugger, breakpoints are a property of lines in a grammar source
file, not steps or ports. For a finer grain of resolution, it would be
necessary to give each potential breakable step its own line in the
input.  By setting breakpoints and then using the {\tt l} response,
the debugger will advance to the next step whose line has a breakpoint
without displaying any steps in between.  If that step is not leashed
or has auto-skipping set, the debugger acts accordingly after
displaying it.

There are currently two ways to set a breakpoint.  One is to use the
{\tt +} response from within the debugger at a step at whose line you
wish to set a breakpoint.  The other is only available when the
debugger is used with an installation of XEmacs that supports XPM
resources.  In this case, when a source file is compiled a small glyph
will be displayed at the left edge of every breakable line.  Clicking
on this glyph once with the left mouse button sets a breakpoint.
Clicking again clears it.  A breakpoint can also be cleared with the
{\tt -} response.

It is often the case that a line will have several breakable steps on
it, for example, feature paths:
%
\begin{verbatim}
synsem:local:cat:head:verb,
qretr:e_list
\end{verbatim}
%
If a breakpoint were set at the first line, then leaping from the call
port for {\sc synsem} would still advance to the call port for {\sc
local}:
%
\begin{verbatim}
Call: enforce description on synsem value of lex entry? l
Call: enforce description on local value of value at synsem? 
\end{verbatim}
%
To avoid this, the response {\tt n} is provided, for leaping
automatically to the first port on a different line in the source
file.  The combination of {\tt n} and {\tt l} can be used to leap more
effectively in files that pack many steps, particularly description
steps, into one line.

All breakpoints can be cleared at once using the command, {\tt
  dclear\_bps/0}.\label{DCLEARBPS}


\mychapter{ALE Keyword Summary}

{\samepage
The following is a summary of keywords discussed in this manual, along
with page references.  A table of auxiliary keywords, those that only
occur as arguments of other keyword operators, such as the {\tt cat>}
argument of a {\tt rule}, will be provided in a future version. 

\nopagebreak
A keyword of kind {\em Description} is one that occurs in an {\sc ale}
description of a feature structure.  One of kind {\em Def. Clause}, or
{\em DCL}, is one that occurs in {\sc ale}'s definite clause language.
One of kind {\em Signature} is a declaration that occurs in an {\sc
  ale} signature.  One of kind {\em Type} is an {\sc ale} type with
special properties.  One of kind {\sc ale} is a Prolog query (entered
at the \verb+| ?-+ prompt) that can be used after {\sc ale} has been
loaded (see p.~\pageref{COMPILE}).  One of kind {\em Mini-interpreter} is a
mini-interpreter command that appears in an interpreter action-line.
One of kind {\em Debugger} is a Prolog query that can be used after
the source-level debugger has been loaded.  For debugger responses,
the reader is referred to the table on page~\pageref{fig:response}.

\vspace*{0.2in}
\frenchspacing

\hspace*{-0.3in}
\begin{tabular}{llp{2.5in}r}
Keyword & Kind & Description & Page \\
{\tt ,}               & Desc./DCL                      &
  Conjunction       
 & \pageref{COMMADESC}, \pageref{COMMADCL} \\
{\tt -->}             & Signature                      &
  Declare lexical entry.
 & \pageref{ARROW} \\
{\tt -> (;)}          & Def. Clause                    &
  Shallow cut.
 & \pageref{SHALLOW} \\
{\tt :}               & Description                    &
  Feature value.
 & \pageref{COLON} \\
{\tt ;}               & Desc./DCL                      &
  Disjunction.
 & \pageref{SEMIDESC}, \pageref{SEMIDCL} \\
{\tt !}               & Def. Clause                    &
  Cut.
 & \pageref{CUT} \\
{\verb-\+-}           & Def. Clause                    &
  Negation-by-failure.
 & \pageref{NEGATION} \\
{\tt ==}              & Description                    & 
  Path Equation.
 & \pageref{EQUAL} \\
{\tt =@}              & Def. Clause                    &
  Predefined token-identity definite clause predicate.
 & \pageref{EQAT} \\
{\tt \verb+=\=+}      & Description                    &
  Inequation.
 & \pageref{INEQ} \\
{\tt @}               & Description                    &
  Macro instantiation.
 & \pageref{AT} \\
{\tt [$\ldots$]}      & Description                    &
  Predefined list macro.
 & \pageref{LIST} \\
{\tt \%}              & Prolog                         &
  Comment delimiter.
 & \pageref{PERCENT} 
\end{tabular}

\pagebreak
\hspace*{-1.0in}
\begin{tabular}{llp{3.0in}r}
Keyword & Kind & Description & Page \\
{\tt a\_}              & Description/Signature          &
  Built-in extensional atom.
 & \pageref{ATOM} \\
{\tt abort}           & Mini-interpreter               &
  Abort parse.
 & \pageref{ABORT} \\
{\tt add}             & Mini-interpreter               &
  Add the current edge.
 & \pageref{ADD} \\
{\tt approp}          & {\sc ale}                      &
  Show value restriction on a feature at a type.
 & \pageref{APPROP} \\ 
{\tt assert}          & Prolog                         &
  Add clause to Prolog database.
 & \pageref{ASSERT} \\
{\tt bot}             & Type                           &
  In an {\sc ale} signature, this type must appear, and must subsume all of the
  other types.
 & \pageref{BOT} \\
{\tt break}           & Mini-interpreter               &
  Invoke Prolog break.
 & \pageref{BREAK} \\
{\tt chain\_length}   & {\sc ale}                      &
  Set limit on chain rule sequence length.
 & \pageref{CHAINLENGTH} \\
{\tt compile\_gram}   & {\sc ale}                      &
  Compile {\sc ale} signature (or parts of it --- see table, 
  p.~\pageref{TABLE}).
 & \pageref{COMPILEGRAM} \\
{\tt compile}         & Prolog                         &
  Compile a Prolog file.
 & \pageref{COMPILE} \\
{\tt cons}            & Signature                      &
  Declare type constraint.
 & \pageref{CONS}, \pageref{GOAL} \\
{\tt consult}         & Prolog                         &
  Load Prolog file (such as an {\sc ale} signature) into database.
 & \pageref{CONSULT} \\
{\tt continue}        & Mini-interpreter               &
  Proceed to look for more (subsuming/subsumed) edges.
 & \pageref{CONTINUE},\pageref{CONTINUE2} \\
{\tt control-c}       & Prolog                         &
  Prolog interrupt.
 & \pageref{CTRLC} \\
{\tt control-z}       & Unix                           &
  Unix interrupt.
 & \pageref{CTRLZ} \\
{\tt dclear\_bps}     & Debugger                       &
  Clear all breakpoints in current grammar file.
 & \pageref{DCLEARBPS} \\
{\tt dcompile\_gram}  & Debugger                       &
  Compile grammar file for source-level debugging.
 & \pageref{DCOMPILEGRAM} \\
{\tt dgen}            & Debugger                       &
  Generate with source-level debugger.
 & \pageref{DGEN} \\
{\tt dleash}          & Debugger                       &
  Set or remove leashing on a kind of step.
 & \pageref{DLEASH} \\
{\tt dquery}          & Debugger                       &
  Evaluate a definite clause with source-level debugging.
 & \pageref{DQUERY} \\
{\tt drec}            & Debugger                       &
  Parse with source-level debugger.
 & \pageref{DREC} \\
{\tt dskip}           & Debugger                       &
  Set or remove auto-skipping on a kind of step.
 & \pageref{DSKIP} \\
{\tt dtr-}{\em N}     & Mini-interpreter               &
  Display {\em N}th daughter edge of current edge.
 & \pageref{DTRN} \\
{\tt edge}            & {\sc ale}                      &
  Show a chart edge.
 & \pageref{EDGE} \\
{\tt emacs}           & Debugger                       &
  Turn on XEmacs interface to source-level debugger.
 & \pageref{EMACS} \\
{\tt empty}           & {\sc ale}                      &
  Show empty categories.
 & \pageref{EMPTY} \\
{\tt empty}           & Signature                      &
  Declare empty category.
 & \pageref{EMPTYSIG} \\
{\tt export\_words}   & {\sc ale}                      &
  Send list of words in lexicon to stream
 & \pageref{EXPORTWORDS} \\
{\tt existing}        & Mini-interpreter               &
  Display edge that subsumes new feature structure.
 & \pageref{EXISTING} \\
{\tt ext}             & Signature                      &
  Declare extensional types.
 & \pageref{EXT} \\
{\tt feature}         & {\sc ale}                      &
  Test if feature exists.
 & \pageref{FEATURE} \\
{\tt generate}        & {\sc ale}                      &
  Tell compiler to produce code for generation only.
 & \pageref{GENERATE}
\end{tabular}

\pagebreak
\hspace*{-0.5in}
\begin{tabular}{llp{3.0in}r}
Keyword & Kind & Description & Page  \\
{\tt gen}             & {\sc ale}                      &
  Generate a string using the compiled generator.
 & \pageref{GEN},\pageref{GEN4} \\
{\tt go}              & Mini-interpreter               &
  Add current and all subsequent edges.
 & \pageref{GO} \\
{\tt go-}{\em N}      & Mini-interpreter               &
  Add current and all subsequent edges until node {\em N} is reached.
 & \pageref{GON} \\
{\tt halt}            & Prolog                         &
  Exit from Prolog.
 & \pageref{HALT} \\
{\tt if}              & Def. Clause                    &
  Definite clause language equivalent of {\tt :-}.
 & \pageref{IF} \\
{\tt incoming}        & Mini-interpreter               &
  Display incoming edge that subsumes existing edge.
 & \pageref{INCOMING} \\
{\tt interp}          & {\sc ale}                      &
  Turn on mini-interpreter.
 & \pageref{INTERP} \\
{\tt intro}           & Signature                      &
  Declare appropriate features for type.
 & \pageref{INTRO} \\
{\tt introduce}       & {\sc ale}                      &
  Test if a feature was introduced by a type.
 & \pageref{INTRODUCE} \\
{\tt iso\_desc/2}     & {\sc ale}                      &
  Test whether two descriptions evaluate to the same feature structure.
 & \pageref{ISODESC} \\
{\tt lex}             & {\sc ale}                      &
  Show lexical entry.
 & \pageref{LEX} \\
{\tt lex\_compile}    & {\sc ale}                      &
  Compile lexicon intermediate code (SICStus only)
 & \pageref{LEXCOMPILE} \\
{\tt lex\_consult}    & {\sc ale}                      &
  Dynamically consult lexicon intermediate code
 & \pageref{LEXCONSULT} \\
{\tt lex\_rule}       & {\sc ale}                      &
  Show lexical rule.
 & \pageref{LEXRULE} \\
{\tt lex\_rule}        & Signature                      &
  Declare lexical rule.
 & \pageref{LEXRULESIG} \\
{\tt lex\_rule\_depth}  & {\sc ale}                      &
  Set bound on lexical rule application.
 & \pageref{LEXRULEDEPTH} \\
{\tt list}            & Type                           &
  This type, along with types {\tt e\_list} and {\tt ne\_list}, and features 
  {\tt HD} and {\tt TL}, must be defined in an {\sc ale} signature in order to 
  use the predefined {\tt [$\ldots$]} macro in descriptions, or the {\tt cats>}
  list-argument operator in grammatical rules.
 & \pageref{LIST}, \pageref{CATS} \\
{\tt macro}           & {\sc ale}                      &
  Show macro definition.
 & \pageref{MACRO} \\
{\tt macro}           & Signature                      &
  Declare macro.
 & \pageref{MACROSIG} \\
{\tt mgsat}           & {\sc ale}                      &
  Find most general satisfier(s) of a type.
 & \pageref{MGSAT} \\
{\tt no\_write\_feat} & {\sc ale}                      &
  Hide a feature and its value.
 & \pageref{NOWRITEFEAT} \\
{\tt no\_write\_type} & {\sc ale}                      &
  Hide a type.
 & \pageref{NOWRITETYPE} \\
{\tt noadd}           & Mini-interpreter               &
  Do not add the current edge.
 & \pageref{NOADD} \\   
{\tt noemacs}         & Debugger                       &
  Turn off XEmacs interface to source-level debugger.
 & \pageref{NOEMACS} \\
{\tt nointerp}        & {\sc ale}                      &
  Turn off mini-interpreter.
 & \pageref{NOINTERP} \\
{\tt nosubtest}       & {\sc ale}                      &
  Disable edge subsumption checking.
 & \pageref{NOSUBTEST} \\
{\tt parent}          & Mini-interpreter               &
  Return to parent edge.
 & \pageref{PARENT} \\
{\tt parse}           & {\sc ale}                      &
  Tell compiler to produce code for parsing only.
 & \pageref{PARSE} 
\end{tabular}

\pagebreak
\hspace*{-1.0in}
\begin{tabular}{llp{3.0in}r}
Keyword & Kind & Description & Page \\
{\tt parse\_and\_gen}   & {\sc ale}                      &
  Tell compiler to produce code for parsing and generation.
 & \pageref{PARSEANDGEN} \\
{\tt prolog}          & Def. Clause                    &
  Definite clause hook to Prolog.
 & \pageref{PROLOG} \\ 
{\tt query}           & {\sc ale}                      &
  Evaluate a definite clause.
 & \pageref{QUERY} \\
{\tt rec}             & {\sc ale}                      &
  Parse a string.
 & \pageref{REC},\pageref{REC2}--\pageref{REC5} \\
{\tt rec\_best}       & {\sc ale}                      &
  Parse first parsable string in a list of strings
 & \pageref{RECBEST} \\
{\tt rec\_list}       & {\sc ale}                      &
  Parse a list of strings
 & \pageref{RECLIST},\pageref{RECLIST3} \\
{\tt retract}         & Mini-interpreter               &
  Retract currently displayed edge.
 & \pageref{RETRACTMI} \\
{\tt retract}         & Prolog                         &
  Remove clause from Prolog database.
 & \pageref{RETRACTPRO} \\
{\tt retractall\_lex} & {\sc ale}                      &
  Retract all of a word's entries from lexicon
 & \pageref{RETRACTALLLEX} \\
{\tt retract\_lex}    & {\sc ale}                      &
  Retract a word's entry from lexicon
 & \pageref{RETRACTLEX} \\
{\tt rule}            & {\sc ale}                      &
  Show grammatical rule.
 & \pageref{RULE} \\
{\tt rule}            & Signature                      &
  Declare grammatical rule.
 & \pageref{RULESIG} \\
{\tt semantics}       & {\sc ale}                      &
  Declares a semantics definite clause predicate.
 & \pageref{SEMANTICS} \\
{\tt show\_clause}    & {\sc ale}                      &
  Show a definite clause.
 & \pageref{SHOWCLAUSE} \\
{\tt show\_cons}      & {\sc ale}                      &
  Show constraint for a type.
 & \pageref{SHOWCONS} \\
{\tt show\_type}      & {\sc ale}                      &
  Show subtypes, supertypes, constraint and most general satisfiers for a type.
 & \pageref{SHOWTYPE} \\
{\tt sub}             & Signature                      &
  Declare subtyping relationship.
 & \pageref{SUB} \\
{\tt subtest}         & {\sc ale}                      &
  Enable edge subsumption checking, and, if necessary, compile code
  for it.
 & \pageref{SUBTEST} \\
{\tt sub\_type}       & {\sc ale}                      &
  Test subsumption between two types.
 & \pageref{SUBTYPE} \\
{\tt true}            & Def. Clause                    &
  Definite clause that is always satified (also used to construct ground 
  clauses in def. clause language).
 & \pageref{TRUE} \\
{\tt type}            & {\sc ale}                      &
  Test if type exists.
 & \pageref{TYPE} \\
{\tt unify\_type}     & {\sc ale}                      &
  Unify two types.
 & \pageref{UNIFYTYPE} \\
{\tt update\_lex}     & {\sc ale}                      &
  Add new entries to lexicon
 & \pageref{UPDATELEX} \\
{\tt write\_feat}     & {\sc ale}                      &
  Don't hide a feature.
 & \pageref{WRITEFEAT} \\
{\tt write\_feats}    & {\sc ale}                      &
  Don't hide any features.
 & \pageref{WRITEFEATS} \\
{\tt write\_type}     & {\sc ale}                      &
  Don't hide a type.
 & \pageref{WRITETYPE} \\
{\tt write\_types}    & {\sc ale}                      &
  Don't hide any types.
 & \pageref{WRITETYPES}
\end{tabular}
}


\mychapterstar{HDRUG: A Graphical User Environment for Natural
  Language Processing in Prolog}

Hdrug is an environment to develop logic grammars / parsers /
generators for natural languages. The package is written in Sicstus
Prolog version 3 and uses  library(tcltk) to implement its user
interface.  Tcl/Tk is a powerful script language to develop
applications for the X-windows environment. 

Hdrug offers various tools to visualize lexical entries, grammar
rules, definite-clause definitions, parse trees, feature 
structures, lexical- rule- and type-hierarchies, graphs of the 
comparison of different parsers on a corpus of test sentences
etc., in a Tk widget, LaTeX/DVI format, and the Clig system. 

The package comes with a number of example grammars, including 
the grammars to be found in the distribution of the ALE system.

Hdrug allows for easy comparison of different parsers/generators; it
has extensive possibilities to compile feature equations into Prolog
terms; it can produce graphical (Tk),  and ordinary
Prolog output of trees, feature structures, Prolog terms (and
combinations thereof), plotted graphs of statistical information, and
 tables of statistical information. Etc. Etc.

Using just menu's and buttons it is possible to parse sentences,
generate sentences from logical form representations, view the parse
trees that are derived by the parser or generator, change a particular
version of the parser on the fly, compare the results of parsing the
same sentence(s) with a set of different parsers, etc.

HDRUG was designed and implemented by Gertjan van Noord.  The HDRUG
home-page is \verb+http://www.let.rug.nl/~vannoord/Hdrug/+.


\mychapterstar{Pleuk Grammar Development Environment}


For those using SICStus 2.1\#9 under X windows, the Pleuk
grammar development shell has been adapted for ALE.  Pleuk provides a
graphical user interface, facilities for maintaining and testing
corpora, and an interactive, incremental derivation checker.  Pleuk is
available free of charge from:
%
\begin{verbatim}
     ftp.cogsci.ed.ac.uk:/pub/pleuk
\end{verbatim}

The file README contains instructions for downloading the system.
Pleuk has been ported to Sun SPARCs SunOS 4.* and HP-UX.  
For more information, send email to pleuk@cogsci.ed.ac.uk.
Pleuk was developed by Jo Calder and Chris Brew of the Human
Communication Research Centre at the University of Edinburgh, Kevin
Humphreys of the Centre for Cognitive Science at the University of
Edinburgh, and Mike Reape, of the Computer Science Department, 
Trinity College, Dublin.

As of this release, Pleuk will not work under SICStus 3.0 or later.

\mychapter{References}

This collection of references only scratches the surface of the
relevant literature.  A much more complete survey of the historical
perspective on typed unification grammars and programs can be found in
Carpenter (1992), and in subsequent papers in {\it ACL}, {\it EACL},
{\it COLING}, etc.

\begin{list}{\ }{\setlength{\listparindent}{-1.5in} 
                   \setlength{\itemindent}{-0.25in} 
                   \setlength{\itemsep}{0.0in}
                   \setlength{\leftmargin}{0.25in} 
                   \setlength{\labelwidth}{0.2in} 
                   \setlength{\labelsep}{0.1in} }
\item
A{\"{\i}}t-Kaci, H. (1991).  The {\sc wam}: A (Real) Tutorial.  MIT
Press, Cambridge, Massachusetts.
\annotate{
The best available introduction to Prolog compiler technology,
focusing on Warren's Abstract Machine for Prolog.
}

\item
A{\"{\i}}t-Kaci, H. (1986a).
 An algebraic semantics approach to the effective resolution of type
  equations.
 {\em Theoretical Computer Science}, 45:293--351.
\annotate{Seminal work in sorted feature structures, based on
A\"{\i}t-Kaci's 1984 University of Pennsylvania dissertation.  Focuses
on general constraint resolution.
} 

\item
A{\"{\i}}t-Kaci, H., and Nasr, R. (1986b).  {\sc login}: A logical
programming language with built-in inheritance.  {\em Journal of Logic
Programming}, 3:187--215.
\annotate{
The first application of feature structures to logic programming.
Includes sorted, but not typed feature structures.  Also includes good
details on the Martelli and Montanari (1984) unification
algorithm applied to feature structures.  
}

\item
A{\"{\i}}t-Kaci, H. (1984).  A Lattice-Theoretic Approach to
Computation based on a Calculus of Partially Ordered Type Structures.
Univ. of Pennsylvania dissertation.
\annotate{
A{\"{\i}}t-Kaci's introduction of sorted $\psi$-terms, which are like
our feature structures only without the appropriateness conditions,
inequations and extensionality.  An appendix contains a coding of the
Zebra Puzzle, a benchmark logic puzzle for constraint resolution.  }

\item
Carpenter, B. (1992) {\em The Logic of Typed Feature Structures}.
Cambridge Tracts in Theoretical Computer Science 32, Cambridge
University Press, New York.
\annotate{
Contains all the theoretical details behind the {\sc ale} feature
structures, description language and applications.  A must for fully
understanding {\sc ale} and a number of related variations.
}

\item
  Carpenter, B. and G. Penn (1996) Compiling Typed Attribute-Value
  Logic Grammars.  In H.~Bunt and M.~Tomita, eds., {\it Recent
    Advances in Parsing Technology}. Kluwer.
  \annotate{A description
    of the theoretical underpinnings of ALE 2.0, including the data
    structures, type inference mechanism, description resolution,
    parsing, and inequation solving.}


\item
Colmerauer, A. (1987).  Theoretical model of prolog {II}.  In van
Canegham, M., and Warren, D.~H., editors, {\em Logic Programming and
its Application}, 1--31. Ablex, Norwood, New Jersey.
\annotate{
Describes unification with cyclic terms and inequations in a logic
programming environment.
}

\item
Gazdar, G., and Mellish, C.~S. (1989).  {\em Natural Language
Processing in Prolog}.  Addison-Wesley, Reading, Massachusetts.
\annotate{
An introduction to computational linguistics using Prolog.  Also
contains a very general introduction to simple {\sc patr-ii} phrase
structure grammars, including simple implementations of unification
and parsing algorithms.  A version is also available using Lisp.
}

\item
H{\"{o}}hfeld, M., and Smolka, G. (1988).  Definite relations over
constraint languages.  {\sc lilog--report}~53, {\sc ibm} --
Deutschland GmbH, Stuttgart.
\annotate{
Highly theoretical description of a constraint logic programming
paradigm, including an application to feature structures similar to
those used in {\sc login}. 
}

\item
Jaffar, J. (1984).  Efficient unification over infinite terms.  {\em
New Generation Computing}, 2:207--219.
\annotate{
Unification algorithm for possibly cyclic terms in Prolog II.
Includes quasi-linear complexity analysis.
}

\item
Kasper, R.~T., and Rounds, W.~C. (1990).  The logic of unification in
grammar.  {\em Linguistics and Philosophy}, 13(1):35--58.
\annotate{
The details of Kasper and Rounds original feature structure
description system and related theorems.  
}

\item
Martelli, A., and Montanari, U. (1982).  An efficient unification
algorithm.  {\em ACM Transactions on Programming Languages
and Systems}, 4(2):258--282.
\annotate{
The Union/Find unification algorithm used by {\sc ale}, which was
adapted to the cyclic case by Jaffar (1984).  
}

\item
Mastroianni, M. (1993) Attribute-Logic Phonology.  Carnegie Mellon
University Laboratory for Computational Linguistics Technical Report
CMU-LCL-93-4. Pittsburgh.
\annotate{
The description and motivation for an attribute-logic approach to
phonology.  Includes extensive discussion of its implementation in
ALE, including syllable structure and morphologically conditioned
effects such as epenthesis, harmony and assimilation.
}

\item
O'Keefe, R.~A. (1990) {\it The Craft of Prolog.} MIT Press, Cambridge,
Massachussetts.
\annotate{
Best text on advanced programming techniques using Prolog compilers.
Should read Sterling and Shapiro's introduction as a pre-requisite.
}

\item
Penn, G.  (1993).
A Utility for Typed Feature Structure-based Grammatical Theories.
Technical Report.  Laboratory for Computational Linguistics, Carnegie
Mellon University, Pittsburgh.
\annotate{
This project served as the basis of the version 2.0 updates of ALE.
The report details these updates, including the algorithms used to
implement them and other efficiency issues.  It also describes Penn's
implementation of Head-Driven Phrase Structure Grammar (HPSG), as
represented in the first eight chapters of (Pollard and Sag 1994).
}

\item
Penn, G. (1999).  A Parsing Algorithm to Reduce Copying in Prolog.
Arbeitspapier des Sonderforschungsbereichs 340, Nr. 137.
\annotate{
A presentation of the Empty-First-Daughter closure algorithm, which
can be used to reduce copying in Prolog-based parsers.}

\item
Penn, G., and Carpenter, B. (1993).  Three Sources of Disjunction in a
Typed Feature Structure-based Resolution System.  {\em Feature
Formalisms and Linguistic Ambiguity}, H. Trost ed.  Ellis Horwood, New
York.
\annotate{
A presentation of the principal sources of complexity in solving
constraint puzzles, such as the Zebra Puzzle, a simplified version of
which is presented in this manual; and an outline of steps taken to
cope with them in the reversible general constraint resolver which was
the precursor to {\sc ale}.
}

\item
  Penn, G. and Popescu, O. (1997).  Head-Driven Generation and
  Indexing in {\sc ale}.  {\it Proceedings of Workshop on
    Computational Environments for Grammar Development and Linguistic
    Engineering (ENVGRAM)}, 35th ACL / 8th EACL.
\annotate{
Describes the implementation of {\sc ale}'s head-driven generator, and
  a simple indexing strategy for lexical entries during generation.
}

\item
Popescu, O. (1996).  Head-Driven Generation for Typed Feature
Structures.  Carnegie Mellon University MS Project.
\annotate{
The extension of semantic-head-driven generation to typed feature
structures used in {\sc ale}.
}

\item
Pereira, F. C.~N., and Shieber, S.~M. (1987).  {\em {P}rolog and
Natural-Language Analysis}. Volume~10 of {\em {\small\it CSLI} Lecture
Notes}.  Center for the Study of Language and Information, Stanford.
\annotate{
Excellent introduction to the use of term unification grammars in
natural language.  Includes a survey of Prolog, parsing algorithms and
many sample grammar applications in syntax and semantics.
}

\item
Pollard, C.~J. (in press).  Sorts in unification-based grammar and
what they mean.  In Pinkal, M., and Gregor, B., editors, {\em
Unification in Natural Language Analysis}. MIT Press, Cambridge,
Massachusetts.
\annotate{
Contains the original extension of Rounds and Kasper's logical
language to sorts.  Also motivates the use of sorts in natural
language grammars. 
}

\item
Pollard, C.~J., and Sag, I.~A. (1994).  {\em Head-driven Phrase
Structure Grammar}.  Chicago University Press, Chicago.
\annotate{
The primary grammar formalism which motivated the construction of the
{\sc ale} system.  Provides many examples of how typed feature
structures and their descriptions are employed in a sophisticated
natural language application.
}

\item
Shieber, S.~M. (1986).  {\em An Introduction to Unification-Based
Approaches to Grammar}.  Volume~4 of {\em {\small\it CSLI} Lecture
Notes}.  Center for the Study of Language and Information, Stanford.
\annotate{
Best source for getting acquainted with the application of feature
structures and their descriptions to natural language grammars.
}

\item
Shieber, S.~M., Uszkoreit, H., Pereira, F. C.~N., Robinson, J., and
Tyson, M.  (1983).  The formalism and implementation of {\sc patr-ii}.
In {\em Research on Interactive Acquisition and Use of Knowledge}.
Volume 1894 of {\em {\small\it SRI} Final Report}, {\sc sri}
International, Menlo Park, California.
\annotate{
Original document describing the {\sc patr-ii} formalism.
}

\item
Shieber, S.~M., Pereira, C.~N., van Noord, G., Moore, R.~C. (1990).
Semantic-Head-Driven Generation. In {\em Computational Linguistics},
Vol. 16(1):30--42.
\annotate{The main reference for a description of the semantic-
head-driven generation algorithm.}

\item
Smolka, G. (1988a).  A feature logic with subsorts.  {\sc
lilog--report}~33, {\sc ibm} -- Deutschland GmbH, Stuttgart.
\annotate{
An alternative logic to that of Rounds and Kasper, which includes
sorts, variables and general negation.
}

\item
Smolka, G. (1988b).  Logic programming with polymorphically
order-sorted types.  {\sc lilog--report}~55, {\sc ibm} -- Deutschland
GmbH, Stuttgart.
\annotate{
An application of ordered term unification to typed logic programming.
}

\item
Sterling, L., and Shapiro, E.~Y. (1986).  {\em The Art of Prolog:
Advanced Programming Techniques}.  MIT Press, Cambridge,
Massachusetts.
\annotate{
Best general introduction to logic programming in Prolog.  
}

\item
van Noord, G. (1989).  {\em BUG: A Directed Bottom-Up Generator for
Unification Based Formalisms}.  Working Papers in Natural Language
Processing, Katholieke Universiteit Leuven, Stichting Taaltechnologie
Utrecht.
\annotate{Proposes the first semantic-head-driven generation algorithm.
}

\end{list}


\appendix

\myappendix{Sample Grammars}

\mysection{English Syllabification Grammar}

\begin{verbatim}
% Signature
% =========

bot sub [unit,list,segment].
  unit sub [cluster,syllable,word]
       intro [first:segment,
              last:segment].
    cluster sub [consonant_cluster, vowel_cluster]
            intro [segments:list_segment].
      consonant_cluster sub [onset,coda].
        onset sub [].
        coda sub [].
      vowel_cluster sub [].
    syllable sub []
             intro [syllable:list_segment].
    word sub []
         intro [syllables:list_list_segment].
  segment sub [consonant,vowel].
    consonant sub [sibilant,obstruent,nasal,liquid,glide].
      sibilant sub [s,z].
        s sub [].
        z sub [].
      obstruent sub [p,t,k,b,d,g].
        p sub [].
        t sub [].
        k sub [].
        b sub [].
        d sub [].
        g sub [].
      nasal sub [n,m].
        n sub [].
        m sub [].
      liquid sub [l,r].
        l sub [].
        r sub [].
      glide sub [y,w].
        y sub [].
        w sub [].
    vowel sub [a,e,i,o,u].
      a sub [].
      e sub [].
      i sub [].
      o sub [].
      u sub [].
  list sub [e_list,ne_list,list_segment,list_list_segment].
    e_list sub [].
    ne_list sub [ne_list_segment,ne_list_list_segment]
            intro [hd:bot,
                   tl:list].
    list_segment sub [e_list,ne_list_segment].
      ne_list_segment sub []
                      intro [hd:segment,
                             tl:list_segment].
    list_list_segment sub [e_list,ne_list_list_segment].
      ne_list_list_segment sub []
                           intro [hd:list_segment,
                                  tl:list_list_segment].


% Rules
% =====

word_schema_rec rule
(word,
 syllables:[Syllable|Syllables],
 first:First1,
 last:Last2)
===>
cat> (syllable,
      syllable:Syllable,
      first:First1,
      last:Last1),
cat> (word,
      syllables:Syllables,
      first:First2,
      last:Last2),
goal> (\+ less_sonorous(Last1,First2)).

word_schema_base rule
(word,
 syllables:[Syllable],
 first:First,
 last:Last)
===>
cat> (syllable,
     syllable:Syllable,
     first:First,
     last:Last).

v_syllable rule
(syllable,
 syllable:[Vowel],
 first:Vowel,
 last:Vowel)
===>
cat> (vowel,Vowel).

vc_syllable rule
(syllable,
 syllable:[Vowel|Segs1],
 first:Vowel,
 last:Last)
===>
cat> (vowel,Vowel),
cat> (coda,
     segments:Segs1,
     last:Last).

cv_syllable rule
(syllable,
 syllable:Segs,
 first:First,
 last:Vowel)
===>
cat> (onset,
     segments:Segs1,
     first:First),
cat> (vowel,Vowel),
goal> append(Segs1,[Vowel],Segs).

cvc_syllable rule
(syllable,
 syllable:Segs,
 first:First,
 last:Last)
===>
cat> (onset,
     segments:Segs1,
     first:First),
cat> (vowel,Vowel),
cat> (coda,
     segments:Segs2,
     last:Last),
goal> append(Segs1,[Vowel|Segs2],Segs).

consonant_cluster_base rule
(consonant_cluster,
 segments:[Consonant],
 first:Consonant,
 last:Consonant)
===>
cat> (consonant,Consonant).

onset rule
(onset,
 segments:[Consonant1|Consonants],
 first:Consonant1,
 last:Consonant3)
===>
cat> (consonant,Consonant1),
cat> (onset,
     segments:Consonants,
     first:Consonant2,
     last:Consonant3),
goal> less_sonorous(Consonant1,Consonant2).

coda rule
(coda,
 segments:[Consonant1|Consonants],
 first:Consonant1,
 last:Consonant3)
===>
cat> (consonant,Consonant1),
cat> (coda,
     segments:Consonants,
     first:Consonant2,
     last:Consonant3),
goal> less_sonorous(Consonant2,Consonant1).


% Lexicon
% =======

p ---> p.
t ---> t.
k ---> k.
b ---> b.
d ---> d.
g ---> g.
s ---> s.
z ---> z.
n ---> n.
m ---> m.
l ---> l.
r ---> r.
y ---> y.
w ---> w.
a ---> a.
e ---> e.
i ---> i.
o ---> o.
u ---> u.


% Definite Clauses
% ================

less_sonorous_basic(sibilant,obstruent) if true.
less_sonorous_basic(obstruent,nasal) if true.
less_sonorous_basic(nasal,liquid) if true.
less_sonorous_basic(liquid,glide) if true.
less_sonorous_basic(glide,vowel) if true.

less_sonorous(L1,L2) if
  less_sonorous_basic(L1,L2).
less_sonorous(L1,L2) if
  less_sonorous_basic(L1,L3),
  less_sonorous(L3,L2).

append([],Xs,Xs) if true.
append([X|Xs],Ys,[X|Zs]) if
  append(Xs,Ys,Zs).
\end{verbatim}

\newpage
\mysection{Categorial Grammar with Cooper Storage}

\begin{verbatim}
% Signature
% =========

bot sub [cat,synsem,syn,sem_obj,list_quant].
  cat sub [] 
      intro [synsem:synsem,
             qstore:list_quant].
  synsem sub [functional, basic].
    functional sub [forward,backward]
               intro [arg:synsem,
                      res:synsem].
      forward sub [].
      backward sub [].
    basic sub [] 
          intro [syn:syn, sem:sem_obj].
  syn sub [np,s,n].
    np sub [].
    s sub [].
    n sub [].
  sem_obj sub [individual, proposition, property].
    individual sub [j,m].
      j sub [].
      m sub [].
    property sub []
             intro [ind:individual,
                    body:proposition].
    proposition sub [logical,quant,run,hit,nominal].
      logical sub [and,or].
        and sub []
            intro [conj1:proposition,
                   conj2:proposition].
        or sub []
           intro [disj1:proposition,
                  disj2:proposition].
      quant sub [every,some]
            intro [var:individual,
                   restr:proposition,
                   scope:proposition].
        every sub [].
        some sub [].
      run sub [] 
          intro [runner:individual].
      hit sub []
          intro [hitter:individual, 
                 hittee:individual].
      nominal sub [kid,toy,big,red] 
              intro [arg1:individual].
        kid sub [].
        toy sub [].
        big sub [].
        red sub [].
  list_quant sub [e_list, ne_list_quant].
    e_list sub [].
    ne_list_quant sub []
                  intro [hd:quant,
                         tl:list_quant].


% Lexicon
% =======

kid --->
  @ cn(kid).

toy --->
  @ cn(toy).

big --->
  @ adj(big).

red --->
  @ adj(red).

every --->
  @ gdet(every).

some --->
  @ gdet(some).

john --->
  @ pn(j).

runs --->
  @ iv((run,runner:Ind),Ind).

hits --->
  @ tv(hit).


% Grammar
% =======

forward_application rule
(synsem:Z,
 qstore:Qs) 
===> 
cat> (synsem:(forward,
      arg:Y,
      res:Z),
      qstore:Qs1),
cat> (synsem:Y,
      qstore:Qs2),
goal> append(Qs1,Qs2,Qs).


backward_application rule
(synsem:Z,
 qstore:Qs)
===> 
cat> (synsem:Y,
      qstore:Qs1),
cat> (synsem:(backward,
      arg:Y,
      res:Z),
      qstore:Qs2),
goal> append(Qs1,Qs2,Qs).


s_quantifier rule
(synsem:(syn:s,
         sem:(Q,
              scope:Phi)),
 qstore:QsRest)
===> 
cat> (synsem:(syn:s,
              sem:Phi),
      qstore:Qs),
goal> select(Qs,Q,QsRest).


% Macros
% ======

cn(Pred) macro
  synsem:(syn:n,
          sem:(body:(Pred,
                     arg1:X),
               ind:X)),
  @ quantifier_free.

gdet(Quant) macro
  synsem:(forward,
          arg: @ n(Restr,Ind),
          res: @ np(Ind)),
  qstore:[@ quant(Quant,Ind,Restr)].

quant(Quant,Ind,Restr) macro
  (Quant,
   var:Ind,
   restr:Restr).

adj(Rel) macro
  synsem:(forward,
          arg: @ n(Restr,Ind),
          res: @ n((and,
                    conj1:Restr,
                    conj2:(Rel,
                           arg1:Ind)),
                    Ind)),
  @ quantifier_free.

n(Restr,Ind) macro
  syn:n,
  sem:(body:Restr,
       ind:Ind).

np(Ind) macro
  syn:np,
  sem:Ind.

pn(Name) macro
  synsem: @ np(Name),
  @ quantifier_free.

iv(Sem,Arg) macro
  synsem:(backward,
          arg: @ np(Arg),
          res:(syn:s,
               sem:Sem)),
  @ quantifier_free.

tv(Rel) macro
  synsem:(forward,
          arg:(syn:np,
               sem:Y),
          res:(backward,
               arg:(syn:np,
                    sem:X),
               res:(syn:s,
                    sem:(Rel,
                         hitter:X,
                         hittee:Y)))),
  @ quantifier_free.

quantifier_free macro
  qstore:[].


% Definite Clauses
% ================

append([],Xs,Xs) if
  true.
append([X|Xs],Ys,[X|Zs]) if
  append(Xs,Ys,Zs).


select([Q|Qs],Q,Qs) if
  true.
select([Q1|Qs1],Q,[Q1|Qs2]) if
  select(Qs1,Q,Qs2).
\end{verbatim}


\newpage
\mysection{Simple Generation Grammar}
\label{GENGRAMMAR}

\begin{verbatim}
% An implementation in {\sc ale} of the grammar in Shieber & al,
% "Semantic-Head-Driven Generation", CL 16-1, 1990.

% Signature
% =========

bot sub [pred, list, sem, form, agr, sign].
  pred sub [decl, imp, love, call_up, leave, see, john, mary, mark,
            friends, often, friend, up, you, i].
    decl sub []. imp sub [].
    leave sub []. love sub []. call_up sub []. see sub [].
    john sub []. mary sub []. mark sub [].
    friends sub []. friend sub [].
    often sub []. up sub [].
    you sub []. i sub [].
  list sub [e_list, ne_list, arg_list, subcat_list].
    e_list sub [].
    ne_list sub [arg_ne_list, subcat_ne_list]
            intro [hd:bot, tl:list].
    arg_list sub [e_list, arg_ne_list].
      arg_ne_list sub [] intro [hd:sem, tl:arg_list].
    subcat_list sub [e_list, subcat_ne_list].
      subcat_ne_list sub [] intro [hd:sign, tl:subcat_list].
  sem sub [] intro [pred:pred, args:arg_list].
  form sub [finite, nonfinite].
    finite sub [].
    nonfinite sub [].
  agr sub [sg1, sg2, sg3, pl1, pl2, pl3].
    sg1 sub []. sg2 sub []. sg3 sub [].
    pl1 sub []. pl2 sub []. pl3 sub [].
  sign sub [sentence, verbal, np, adv, p]
       intro [sem:sem].
    sentence sub [].
    verbal sub [s, vp] intro [form:form].
      s sub [].
      vp sub [] intro [subcat:subcat_list].
    np sub [det, n]
       intro [agr:agr, arg:sem].
      det sub [] intro [np_sem:sem].
      n sub [].
    adv sub [] intro [varg:sem].
    p sub [].
ext([sg1,sg2,sg3,pl1,pl2,pl3]).

% Lexicon
% =======

love --->
  vp, form:nonfinite,
  subcat:[(np,sem:Obj),(np,sem:Subj)],
  sem:(pred:love,args:[Subj,Obj]).

call --->
  vp, form:nonfinite,
  subcat:[(np,sem:Obj),(p,sem:(pred:up,args:[])),(np,sem:Subj)],
  sem:(pred:call_up,args:[Subj,Obj]).

call --->
  vp, form:nonfinite,
  subcat:[(p,sem:(pred:up,args:[])),(np,sem:Obj),(np,sem:Subj)],
  sem:(pred:call_up,args:[Subj,Obj]).

leave --->
  vp, form:nonfinite,
  subcat:[(np,sem:Subj)],
  sem:(pred:leave,args:[Subj]).

see --->
  vp, form:nonfinite,
  subcat:[(np,sem:Obj),(np,sem:Subj)],
  sem:(pred:see,args:[Subj,Obj]).

see --->
  vp, form:nonfinite,
  subcat:[(s,form:finite,sem:Obj),(np,sem:Subj)],
  sem:(pred:see,args:[Subj,Obj]).

john --->
  np, agr:sg3, sem:(pred:john,args:[]).

mary --->
  np, agr:sg3, sem:(pred:mary,args:[]).

mark --->
  np, agr:sg3, sem:(pred:mark,args:[]).

friends --->
  np, agr:pl3, sem:(pred:friends,args:[]).

friend --->
  n, agr:sg3, arg:X, sem:(pred:friend,args:[X]).

i --->
  np, agr:sg1, sem:(pred:i,args:[]).

you --->
  np, agr:sg2, sem:(pred:you,args:[]).

often --->
  adv, varg:VP, sem:(pred:often,args:[VP]).

up --->
  p, sem:(pred:up,args:[]).


% Lexical Rules
% =============

sg3 lex_rule (vp, form:nonfinite, subcat:Subcat, sem:Sem) **>
             (vp, form:finite, subcat:NewSubcat, sem:Sem)
  if add_sg3(Subcat,NewSubcat)
  morphs (X,y) becomes (X,i,e,s),
         X becomes (X,s).

non_sg3 lex_rule (vp, form:nonfinite, subcat:Subcat, sem:Sem) **>
                 (vp, form:finite, subcat:NewSubcat, sem:Sem)
  if add_nonsg3(Subcat,NewSubcat)
  morphs X becomes X.


% Grammar Rules
% =============

sentence1 rule
  (sentence,sem:(pred:decl,args:[S])) ===>
  cat> (s,form:finite,sem:S).

sentence2 rule
  (sentence,sem:(pred:imp,args:[S])) ===>
  cat> (vp,form:nonfinite,
        subcat:[(np,sem:(pred:you,args:[]))],sem:S).

s rule
  (s,form:Form,sem:S) ===>
  cat> Subj,
  sem_head> (vp,form:Form,subcat:[Subj],sem:S).

vp1 rule
  (vp,form:Form,subcat:Subcat,sem:S) ===>
  sem_head> (vp,form:Form,subcat:[Compl|Subcat],sem:S),
  cat> Compl.

vp2 rule
  (vp,form:Form,subcat:[Subj],sem:S) ===>
  cat> (vp,form:Form,subcat:[Subj],sem:VP),
  sem_head> (adv,varg:VP,sem:S).


% Semantics Directive
% ====================

semantics sem1.


% Definite Clauses
% ================

sem1(sem:S,S) if true.

add_sg3([(np,sem:Sem)],[(np,agr:sg3,sem:Sem)]) if !, true.
add_sg3([Cat|Cats],[Cat|NewCats]) if add_sg3(Cats,NewCats).

add_nonsg3([(np,sem:Sem)],[(np,agr:(=\=sg3),sem:Sem)]) if !, true.
add_nonsg3([Cat|Cats],[Cat|NewCats]) if add_nonsg3(Cats,NewCats).
\end{verbatim}

\myappendix{Error and Warning Messages}


\mysection{Error Messages}

\vspace{-6pt}
\errorcond{{\tt a\_/1} atom declared subsumed by type $T$}
{Subsumption over {\tt a\_/1} atoms has a fixed definition.  Subtyping
  specifications with {\tt a\_/1} atoms are not allowed.}

\errorcond{add\_to could not unify $FS_{1}$ and $FS_{2}$}
{The unification between feature structures $FS_{1}$ and $FS_{2}$
failed.}

\errorcond{add\_to could not unify paths $\pi$ and $\phi$ in $FS$}
{The unification between paths $\pi$ and $\phi$ failed in feature
structure $FS$ failed.}

\errorcond{add\_to could not inequate $FS_{1}$ and $FS_{2}$}
{The inequation between features structures $FS_{1}$ and $FS_{2}$
could not been satisfied.}

\errorcond{add\_to could not add feature $F$ to $FS$}
{The value of feature $F$ is not unifiable with the similar value in
feature structure $FS$.}

\errorcond{add\_to could not add undefined macro $M$ to $FS$}
{Macro $M$ used in feature structure $FS$ is undefined.}

\errorcond{add\_to could not add incompatible type $T$ to $FS$}
{Type $T$ is not compatible with the type of feature structure $FS$.}

\errorcond{add\_to could not add undefined type $T$ to $FS$}
{Type $T$ in feature structure $FS$ is undefined.}

\errorcond{add\_to could not add ill formed complex description $D$ to
$FS$}
{Description $D$ is ill formed and could not be unified with feature
structure $FS$}

\errorcond{appropriateness cycle following path $\pi$ from type $T$}
{There is a sequence of features $\pi$ which must be defined for
objects of type $T$ where the value must be of type $T$.}

\errorcond{bot has appropriate features}
{The most general type, $\bot$, cannot have any appropriate features.}

\errorcond{bot has constraints}
{The most general type, $\bot$, must not have {\tt cons} constraints.}

\errorcond{{\tt cats>} value with sort $S$ is not a valid list argument}
{An argument of {\tt cats>} was detected at run-time, which is not of
a type subsumed by {\tt list}.}

\errorcond{consistent $T_{1}$ and $T_{2}$ have multiple mgus $Ts$}
{Types $T_{1}$ and $T_{2}$ have the non singleton set $Ts$ as their
set of most general unifiers.}

%\errorcond{constraint cycle following path $\pi$ from $T$}
%{There is a sequence of features $\pi$, defined for objects of type
%$T$ where the value has a type constraint of $T$.}
%
%  not quite yet - coming in a future version
%
%\errorcond{constraint inconsistent with appropriateness for $T$}
%{The value restrictions given in the appropriateness section for $T$
%are inconsistent with the general constraints for $T$}

\errorcond{constraint declaration given for atom}
{{\tt a\_/1} atoms must not have {\tt cons} constraints.}

\errorcond{description uses unintroduced feature $F$}
{A description uses the feature $F$ which has not been defined as
appropriate for any types.}

\errorcond{edge/2: arguments must be non-negative}
{The arguments to edge/2 represent nodes in a parsing chart, and thus
must be non-negative integers.}

\errorcond{edge/2: first argument must be < second argument}
{The arguments to edge/2 represent nodes in a parsing chart.  Edges
only span from one node to an equal or greater valued node.  {\tt edge/2}
shows edges where the other node has a greater value.  {\tt empty/0} shows
edges where the node has an equal value.}

\errorcond{extensional type $E$ is not maximal}
{Type $E$ is declared extensional but does not observe the maximality
restriction.}

\errorcond{feature $F$ multiply introduced at $Ts$}
{The feature $F$ is introduced at the types in $Ts$, which are not
comparable with one another.}

\errorcond{illegal variable occurence in $T$ sub $Ss$ (intro $FRs$)}
{In subtype/feature specifications, neither $T$ or any of the types in
  $Ss$ or $FRs$, or any of the features in $FRs$ can be unbound
  variables.  If a value restriction is an {\tt a\_/1} atom, that atom
  can be unbound, or contain unbound variables, but the {\tt a\_/1}
  operator must still appear.}

\errorcond{incompatible restrictions on feature $F$ at type $T$ are
$Ts$}
{The inherited restrictions, consisting of types $Ts$, on the value of
$F$ at type $T$ are not consistent.}

\errorcond{invalid line $\phi$ in rule}
{A line of a grammar rule is neither a goal nor a category description.}

\errorcond{lexical rule $LR$ lacks morphs specification}
{The obligatory {\tt morphs} part of lexical rule $LR$ is missing.}

\errorcond{multiple constraint declaration error for $T$}
{More than one {\tt cons} declaration exists for type $T$.}

\errorcond{multiple feature specifications for type $T$}
{The appropriate features of $T$ can be introduced along with
  subtyping or by themselves, but there can only be one declaration of
  appropriate features.}

\errorcond{multiple specification for $F$ in declaration of $T$}
{More than one restriction on the value of feature $F$ is given in the
definition of type $T$.}

\errorcond{no lexical entry for $W$}
{Expression $W$ is used, but has no lexical entry.}

\errorcond{pathval: illegal path specified - $\pi$}
{Path $\pi$ is not a valid path specification for the given feature
structure.}

\errorcond{rule $R$ has multiple {\tt sem\_head>} specifications}
{More than one semantic head declaration was found in grammar rule
$R$.}

\newpage
\errorcond{rule $R$ has no {\tt cat>} {\tt cats>} or {\tt sem\_head>}
specification}
{The grammar rule named $R$ is empty in that it does not have any
daughter specification.}

\errorcond{rule $R$ has wrongly placed {\tt sem\_goal>}
  specifications}
{A {\tt sem\_goal>} specification occurs somewhere other than
  immediately before or immediately after a {\tt sem\_head>}
  specification in rule $R$.}

\errorcond{subtype/feature specification given for {\tt a\_/1} atom}
{Subsumption over {\tt a\_/1} atoms has a fixed definition, and they can
  have no features.  Subtype or feature specifications for them are
  not allowed.}

\errorcond{subtyping cycle at $T$}
{The subsumption relation specified is not anti-symmetric.  It can be
inferred that the type $T$ is a proper subtype of itself.}

\errorcond{subtype $T_{1}$ used in $T_{2}$ undeclared}
{Undefined type $T_{1}$ declared as subtype in definition of $T_{2}$.}

\errorcond{$T$ multiply defined}
{There is more than one definition of type $T$.}

\errorcond{$T$ subsumes bot}
{$T$ is declared as subsuming the most general type, $\bot$.}

\errorcond{$T_{1}$ used in appropriateness definition of $T_{2}$
undeclared}
{Undefined type $T_{1}$ used as value restriction in definition of
$T_{2}$.}

\errorcond{undefined macro $M$ used in description}
{A description uses a macro which is not defined.}

\errorcond{undefined type $T$ used in description}
{A description uses a type $T$ which is not defined.}

\errorcond{undefined feature $F$ used in path $\pi$}
{A path $\pi$ of features uses undefined feature $F$ in a description.}

\errorcond{unsatisfiable lexical entry for $W$}
{Word $W$ has a lexical entry which has no satisfying feature structure.}

\errorcond{upward closure fails for $F$ in $S1$ and $S2$}
{$S1$ subsumes $S2$, but the value restriction for $F$ at $S1$ does not
subsume the value restriction for $F$ at $S2$.}

\mysection{Warning Messages}

\errorcond{{\tt =@} accessible by procedural attachment calls from
  constraint for $T$}
{The built-in {\tt =@} predicate (extensional identity check) is
  non-monotonic, so its use should be avoided in constraints attached
  to types.}

\errorcond{atom {\tt a\_/1} $Atom$ is ground in declaration of $T$}
{One of the appropriate features of $T$ has a value restriction that
  is a ground {\tt a\_/1} atom.  Every feature structure of type $T$
  will have the same value at this feature.}

\errorcond{homomorphism condition fails for $F$ in $T_{1}$ and $T_{2}$}
{It is not the case that the appropriateness restriction on the type $T
= T_{1} + T_{2}$ is the unification of the appropriateness
restrictions on $T_{1}$ and $T_{2}$.}

\errorcond{lexical description for $W$ is unsatisfiable}
{Incompatibilities in the lexical description for word $W$ could not
produce a satisfying feature structure.}

\errorcond{no chain rules found}
{All the grammar rules in the program were non-chain rules (no
semantic heads).}

\errorcond{no definite clauses found}
{There were no definite clause rules specified in the program.}

\errorcond{no features introduced}
{There are no appropriate features for any types.}

\errorcond{no functional descriptions found}
{There are no functional description definitions in the program.}

\errorcond{no lexical rules found}
{There were no lexical rules specified in the program.}

\errorcond{no lexicon found}
{There were no lexical entries specified in the program.}

\errorcond{no non\_chain rules found}
{All the grammar rules in the program were chain rules (with semantic
heads).}

\errorcond{no $P$ definite clause found}
{A definition for definite clause predicate $P$, which was declared as
the semantics predicate, was not found in the program.}

\errorcond{no phrase structure rules found}
{There were no phrase structure rules specified in the program.}

\errorcond{no semantics specification found}
{There was no specification of the semantics definite clause predicate
in the program.}

\errorcond{no types defined}
{There were no {\tt sub} or {\tt intro} declarations found in the program.}

\errorcond{unary branch from $T_{1}$ to $T_{2}$}
{The only subtype of $T_{1}$ is $T_{2}$.  In this situation, it is
usually more efficient to elimate $T_{1}$ if every instance of $T_{1}$
is a $T_{2}$.}

\myappendix{BNF for ALE}

\begin{verbatim}
  <desc> ::= <type>
           | <variable>
           | (<feature>:<desc>)
           | (<desc>,<desc>)
           | (<desc>;<desc>)
           | @ <macro_spec>
           | <func_spec>
           | a_ <prolog_term>
           | <path> == <path>
           | =\= <desc>

  <type> ::= <prolog_functor>

  <feature> ::= <prolog_atom>

  <path> ::= list(<feature>)

  <macro_def> ::= <macro_head> macro <desc>.

  <macro_head> ::= <macro_name>
                 | <macro_name>(<seq(var)>)

  <macro_spec> ::= <macro_name>
                 | <macro_name>(<seq(desc)>)

  <func_def> ::= <func_spec> +++> <desc>.

  <func_spec> ::= <func_name>
                 | <func_name>(<seq(desc)>)

  <clause> ::= <literal> if <goal>.

  <literal> ::=  <pred_sym>
              |  <pred_sym>(<seq(desc)>)

  <cut_free_goal> ::= true
                    | <literal>
                    | prolog(<prolog_goal>)
                    | (<cut_free_goal>,<cut_free_goal>)
                    | (<cut_free_goal>;<cut_free_goal>)
                    | (<desc> =@ <desc>)
                    | (<cut_free_goal> -> <cut_free_goal>)
                    | (<cut_free_goal> -> <cut_free_goal> 
                                        ; <cut_free_goal>)
                    | (\+ <cut_free_goal>)

  <goal> ::= true
           | <literal>
           | prolog(<prolog_goal>)
           | (<goal>,<goal>)
           | (<goal>;<goal>)
           | (<desc> =@ <desc>)
           | (<cut_free_goal> -> <goal>)
           | (<cut_free_goal> -> <goal> ; <goal>)
           | !
           | (\+ <goal>)

  <lex_entry> ::=  <word> ---> <desc>.

  <rule> ::=  <rule_name> rule <desc> ===> <rule_body>.

  <rule_body> ::=  <sem_less_rule_body>
                |  <sem_less_rule_body>,<sem_rule_body>,
                   <sem_less_rule_body>

  <sem_less_rule_body> ::= <rule_clause>
                         | <rule_clause>, <sem_less_rule_body>

  <rule_clause> ::=  cat> <desc>
                  |  cats> <desc>
                  |  goal> <goal>

  <sem_rule_body> ::= sem_head> <desc>
                    | sem_goal> <goal>, sem_head> <desc>
                    | sem_head> <desc>, sem_goal> <goal>
                    | sem_goal> <goal>, sem_head> <desc>, 
                      sem_goal> <goal>

  <lex_rule> ::= <lex_rule_name> lex_rule <lex_rewrite>
                                 morphs <morphs>.

\end{verbatim}
\newpage
\begin{verbatim}

  <lex_rewrite> ::= <desc> **> <desc>
                  | <desc> **> <desc> if <goal>

  <morphs> ::= <morph>
             | <morph>, <morphs>

  <morph> ::= (<string_pattern>) becomes (<string_pattern>)
            | (<string_pattern>) becomes (<string_pattern>)
                                 when <prolog_goal>

  <string_pattern> ::= <atomic_string_pattern>
                     | <atomic_string_pattern>, <string_pattern>

  <atomic_string_pattern> ::=  <atom>
                            |  <var>
                            |  <list(<var_char>)>
  <var_char> ::= <char>
               | <var>

  <seq(X)> ::= <X>
             | <X>, <seq(X)>

  <empty_prod> ::= empty <desc>.

  <type_spec> ::= <type> sub <list(<type>)>
                | <type> sub <list(<type>)>
                         intro <list(<frestr_spec>)>
                | <type> intro <list(<frestr_spec>)>

  <frestr_spec> ::= <feature>:<type>
                  | <feature>: a_ <prolog_term>

  <cons_spec> ::= <type> cons <desc>
                | <type> cons <desc>
                         goal <goal>

  <ext_spec> ::= ext(list(<type>))

  <prog> ::= <prog_line>
           | <prog_line> <prog>

\end{verbatim}
\newpage
\begin{verbatim}

  <prog_line> ::= <type_spec>
                | <ext_spec>
                | <cons_spec>
                | <macro_def>
                | <empty_prod>
                | <clause>
                | <rule>
                | <lex_entry>
                | <lex_rule>
\end{verbatim}

\myappendix{Reference Card}

\begin{verbatim}

Grammar
----------------------------------------------------------------------

Type sub Subtypes (intro [F1:R1,...,Fn:Rn]).  (Subtyping)
Type intro [F1:R1,...,Fn:Rn].                 (Appropriateness)
ext([Type1,...,Typen]).                       (Extensionality)

(Types not otherwise declared, but used in the above definitions are 
 assumed to be maximal and/or immediately subsumed by bot)

Type cons Desc                                (Type Constraint)
    (goal Goal).

Word ---> Desc.                               (Lexical Entry)
empty Desc.                                   (Empty Category)
RuleName lex_rule DescIn **> DescOut          (Lexical Rule)
                  (if Goal)
         morph X becomes [X,e,s]
                 when PrologGoal.
RuleName rule Desc ===>                       (Phrase Structure Rule)
                        cat> Desc/            [Daughter]
                        cats> ListDesc/       [List of Daughters]
                        sem_head> Desc/       [Semantic Head]
                        goal> Goal/           [Procedural Attachment]
                        sem_goal> Goal.       [P.A. to Semantic Head]

f(Desc1,...,Descn) if g(Desc1,...,Descn)/     (Definite Clause)
                      Desc1 =@ Desc2/         (Extensionality Check)
                      prolog(PrologGoal).     (Prolog Hook)

fun(Desc1,...,Descn) +++> DescResult.         (Function Declaration)
macro(X1,...,Xn) macro Desc.                  (Macro Declaration)
semantics Pred.                               (Semantic Pred. Declaration)
\end{verbatim}
\newpage

\begin{verbatim}

Compile-time Options
----------------------------------------------------------------------
| ?- parse/generate/parse_and_gen.        (Compilation Mode)
| ?- lex_consult/lex_compile.             (Lexicon Compilation Mode)
| ?- (no)adderrs.                         (Toggle Description Errors)
| ?- (no)subtest.                         (Toggle Subsumption Check)
| ?- chain_length(Num).                   (Chain Rule Length Bound)
| ?- lex_rule_depth(Num).                 (Lexical Rule Depth Bound)

Compilation
----------------------------------------------------------------------
| ?- (d)compile_gram(GramFile).           (Grammar Compilation)
| ?- update_lex(File).                    (Incremental Lexicon Update)
| ?- retract(all)_lex.                (Incremental Lexicon Retraction)

Grammar Inspection
----------------------------------------------------------------------
| ?- (no_)write_type(s).                  (Type Hiding/Showing)
| ?- (no_)write_feat(s).                  (Feature Hiding/Showing)
| ?- show_type Type.                      (Signature Inspection)
| ?- unify_type(Type1,Type2,LUB).         (Type Unification)
| ?- approp(Feat,Type,Restr).             (Appropriateness Inspection)
| ?- introduce(Feat,Type).                (Feature Introduction)
| ?- iso_desc(Desc1,Desc2).               (Extensional Identity)
| ?- mgsat Desc.                          (Most General Satisfier)
| ?- show_clause PredName(/Arity).        (Definite Clause)
| ?- lex Word.                            (Lexical Entry)
| ?- rule Rulename.                       (Phrase Structure Rule)
| ?- empty.                               (Empty Category)
| ?- macro Macro.                         (Macro Definition)
| ?- lex_rule Lexrulename.                (Lexical Rule)
| ?- export_words(Stream,Delim).          (Lexicon Export)

Execution
----------------------------------------------------------------------
| ?- (d)rec WordList.                     (Bottom-up Parsing)
| ?- (d)query Query.                      (SLD Resolution)
| ?- (d)gen Desc.                         (Head-Driven Generation)

Run-time Options
----------------------------------------------------------------------
| ?- (no)interp.                          (Toggle Mini-Interpreter)
| ?- edge(Left,Right).                    (Chart Edge Inspection)
| ?- dleash(+/-Kind).                     (Port Leashing)
| ?- dskip(+/-Kind).                      (Port Auto-Skipping)
| ?- dclear_bps.                          (Breakpoint Clearing)

\end{verbatim}

\end{document}


\myappendix{Future Extensions}

\mysection{Descriptions}
  \mysubsection{Set Values}
  \mysubsection{Negation}
  \mysubsection{Cuts}
  \mysubsection{Disjunctive Paths}
  \mysubsection{Anonymous Variables}
\mysection{Parser}
  \mysubsection{Alternative Strategies}
\mysection{Definite Clauses}
  \mysubsection{Database Interaction}
  \mysubsection{Set Predicates}
\mysection{Compiler}
  \mysubsection{Extended Error Messages}
  \mysubsection{More Incremental Compiling}
  \mysubsection{Compact Atom Representation}
\mysection{Debugging}
  \mysubsection{Incremental Clause Debugger}
  \mysubsection{Incremental Unification/Description Debugger}
  \mysubsection{Incremental Grammar Debugger}