Gerald Penn
Research Interests
Summary
- Typed Feature Logic, Grammar Development, Type Theory
- Parsing and Generation Algorithms, Free Word-order Languages
- Substructural Logics, Parsing with Lambek Categorial Grammars
- Finite-State Transducers, Compilation Methods, Default Reasoning
- ALE, Constraint Logic Programming
- Writing System Analysis, Pronunciation Modelling, Text-to-Speech Synthesis
- Speech Summarisation
2014 Research Guide Entry
My students and I conduct research in mathematical linguistics and
spoken language processing. Not only are these areas
integral components of the overall vision of artificial intelligence
research, but many of the problems that have defined the rest of AI,
programming language and theoretical computer science research in
general can be found inside this very rich empirical domain. More
recently, the proliferation of electronically available text, audio
and other multimedia resources over the World Wide Web has created an
acute demand for better translation systems, query answering, and text
or speech summarisation tools for using this vast source of
information. In order to approach their full potential, the next
generations of these systems must be capable of providing far more
precise information about meaning and the relations between the
people, objects and locations described by these sources. Recent
advances in wireless technology also require a natural means of
interacting with ever more miniature devices, and this can only be
achieved through spoken input and output. My research thus seeks to
provide both a formal perspective on language to realise or improve
such applications, and the algorithms to support them.
The strategy I have adopted to pursue this goal consists of two
major threads of research activity.
The first deals with
specialised mathematical models that are aimed at natural
language processing applications. These include typed feature
logic, a convenient and flexible class of description logics
that has been applied to
virtually every level of linguistic description and every kind of
linguistic application.
Because of its rigid type system and named features, grammatical
knowledge
specified with typed feature logic is more modular, more scalable, and
more tersely described, even when the underlying record structures
grow to thousands of nodes in size; as a result, it is more efficient
to compute with. We first proved that every statically typable signature
admits a Prolog-term encoding of its feature structures, which
has led to a current research project on algebraic encodings of
partially ordered sets. I have also been involved with constraint
resolution and constraint logic programming methods over typed feature
structures, particularly in the context of Head-driven Phrase
Structure Grammar (HPSG) development.
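As a toy illustration of the kind of computation this involves, the following Python sketch unifies two feature structures represented as nested dictionaries. The HPSG-style agreement example and all names here are my own illustrative assumptions; this is a minimal untyped sketch, not ALE's actual typed implementation.

```python
def unify(fs1, fs2):
    """Return the most general feature structure subsuming both
    inputs, or None if they are incompatible."""
    if not isinstance(fs1, dict) or not isinstance(fs2, dict):
        # Atomic values must match exactly.
        return fs1 if fs1 == fs2 else None
    result = dict(fs1)
    for feat, val in fs2.items():
        if feat in result:
            sub = unify(result[feat], val)
            if sub is None:
                return None          # feature clash
            result[feat] = sub
        else:
            result[feat] = val
    return result

# Agreement between a verbal head and its subject, HPSG-style:
verb = {"HEAD": {"AGR": {"NUM": "sg"}}}
subj = {"HEAD": {"AGR": {"NUM": "sg", "PER": "3"}}}
print(unify(verb, subj))
# {'HEAD': {'AGR': {'NUM': 'sg', 'PER': '3'}}}
```

A typed system such as ALE additionally checks every node against a type signature, which is what makes the static-typability result above pay off in practice.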
I am the principal developer, maintainer and one
of the co-designers of The Attribute Logic Engine
(ALE), a Prolog-like logic programming language and
parsing/generation system based upon typed feature logic.
ALE has been used by over 200
universities, research centres, and product developers for grammar
development and testing, a wide variety of NLP applications, and
embedded support for constraint resolution and unification over typed
feature structures in other domains. Since the release of ALE 3.0 in
1998, ALE's speed has improved by a factor of about 70 on large
parses over the English Head-driven
Phrase Structure Grammar that we distribute with ALE.
I am also currently interested in the relationship between feature
logic, category theory and programming language theory.
Parsing/Generation Algorithms. Parsing algorithms are possibly
the best known and most extensively studied area of computational
linguistics, and yet our mastery of them is largely confined to an
antiquated view of syntax. In this view, grammar consists of a number
of context-free rules, with atomic (or at least very small)
categories, and linear precedence is implicitly and rigidly
determined. Constituents — substrings that guide the
composition of meaning — are assumed to come in a single
flavour, on which all constraints governing argument realisation,
scoping properties, long distance dependencies, topic/focus
articulation and prosodic structure happily agree. I am interested in
all aspects of parsing according to a more realistic, modern view of
grammar. That includes parsing with so-called freer word-order
languages, in which constraint solving techniques combined with
morphological information, functional information and/or statistically
derived preferences must be used in order to efficiently match
arguments to the grammatical relations of syntactic heads.
It also includes the use of different control strategies
for parsing, formulated as constraint solving
problems on syntactic trees, their yields or semantic terms.
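The argument-linking idea can be sketched very simply: in a freer word-order language, morphological case rather than position can decide which argument fills which grammatical relation. The Latin-like lexicon and case requirements below are my own illustrative assumptions, and real constraint solvers are of course far more sophisticated than this brute-force search.

```python
from itertools import permutations

# Each grammatical relation of the verbal head constrains the case of
# the argument it accepts (assumed toy requirements).
REQUIRES = {"subject": "nom", "object": "acc"}

def link_arguments(args):
    """Return every assignment of (word, case) arguments to
    grammatical relations that satisfies the case constraints."""
    solutions = []
    for perm in permutations(args):
        mapping = dict(zip(REQUIRES, perm))
        if all(mapping[rel][1] == REQUIRES[rel] for rel in REQUIRES):
            solutions.append({rel: mapping[rel][0] for rel in REQUIRES})
    return solutions

# Whether the arguments appear as "puella-NOM puerum-ACC" or
# "puerum-ACC puella-NOM", the linking is the same.
args = [("puella", "nom"), ("puerum", "acc")]
print(link_arguments(args))
# [{'subject': 'puella', 'object': 'puerum'}]
```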
Substructural logic. It is a common formal view of natural
language understanding that each word of a sentence contributes some
syntactic and semantic information which is combined by rules to form
the meaning of the sentence. The order of the words and the number of
times a word occurs clearly should matter to this composition. In
classical logic, however, the number of instances of the same
hypothesis and the order in which different hypotheses are presented
are irrelevant to deducing a proposition from them. Substructural
logics are logics that lack one or more of the usual structural rules
that encode this insensitivity, making them a natural choice for
expressing composition in natural languages. Categorial
grammar has built
upon this view using the Lambek Calculus and its various extensions to
provide elegant accounts of coordination, quantifier scoping and many
other complex phenomena in natural language. Lambek grammars are
weakly equivalent to context-free grammars, and there has been a great deal of
recent progress on the complexity of the sentence membership problem
in the Lambek Calculus, but our understanding of what factors of
the calculus and its various extensions condition its tractability is
still incomplete. My own recent work in this
area has focussed on graph-theoretic membership
algorithms for the Lambek calculus, which prompted a
reconsideration of this problem within the field that ultimately led
to Pentus's 2003 proof of the traditional calculus's NP-completeness.
I have also previously worked on the problems of machine learning of
categorial grammars, and the use of higher-order logic programming
languages such as Twelf
to represent deductions in the Lambek
calculus.
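The order-sensitivity at the heart of categorial grammar is easy to see in its application-only (AB) fragment, where a functor category X/Y consumes a Y to its right and X\Y consumes a Y to its left. The sketch below checks such derivations by brute force; the lexicon and the tuple encoding of categories are my own illustrative assumptions, and the full Lambek calculus adds hypothetical reasoning that this fragment omits.

```python
def reduce_pair(left, right):
    """Forward application (X/Y, Y => X) or backward (Y, X\\Y => X).
    Complex categories are tuples (result, slash, argument)."""
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None

def derives(cats, goal):
    """Can the category sequence reduce to the goal category?"""
    if len(cats) == 1:
        return cats[0] == goal
    for i in range(len(cats) - 1):
        combined = reduce_pair(cats[i], cats[i + 1])
        if combined is not None and \
           derives(cats[:i] + [combined] + cats[i + 2:], goal):
            return True
    return False

# A transitive verb gets category (s\np)/np (assumed toy lexicon):
LEX = {"John": "np", "Mary": "np",
       "loves": (("s", "\\", "np"), "/", "np")}
sent = [LEX[w] for w in ["John", "loves", "Mary"]]
print(derives(sent, "s"))   # True
```

Note that scrambling the categories would block the derivation, which is exactly the sensitivity to hypothesis order that substructural logics preserve.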
Finite-state methods. Finite-state automata and transducers are
probably the most widely used computational models in speech and
natural language processing today, in part because they are convenient
to work with for many practical applications, but mostly because they
can be statistically trained with relative ease and are typically very
fast. I am interested in compilation methods for computing with very
large-scale transducers or matrices, particularly in the critical but
often unsung areas of memory management and numerically stable
computation. My previous work in this area was confined mostly to
extensions of finite-state specifications, such as contextualised
rules, default reasoning, and feature-based automata.
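Composition is the workhorse operation behind such large-scale transducer pipelines: two letter-to-letter rewrites can be fused into a single machine by a product construction on states. The deterministic, epsilon-free encoding and the toy rewrite rules below are my own illustrative assumptions, far simpler than a production weighted-transducer library.

```python
def apply_fst(fst, s):
    """Run a deterministic transducer over s; None if it blocks
    or ends in a non-final state."""
    state, out = fst["start"], []
    for ch in s:
        if (state, ch) not in fst["trans"]:
            return None
        sym, state = fst["trans"][(state, ch)]
        out.append(sym)
    return "".join(out) if state in fst["final"] else None

def compose(f, g):
    """Product construction for f followed by g: pair up transitions
    whose intermediate symbols match."""
    trans = {}
    for (q, a), (b, q2) in f["trans"].items():
        for (r, c), (d, r2) in g["trans"].items():
            if c == b:
                trans[((q, r), a)] = (d, (q2, r2))
    return {"start": (f["start"], g["start"]),
            "final": {(q, r) for q in f["final"] for r in g["final"]},
            "trans": trans}

# Toy rewrites: f maps 'a' to 'A'; g maps 'A' to '@'; both pass 'b'.
f = {"start": 0, "final": {0},
     "trans": {(0, "a"): ("A", 0), (0, "b"): ("b", 0)}}
g = {"start": 0, "final": {0},
     "trans": {(0, "A"): ("@", 0), (0, "b"): ("b", 0)}}
print(apply_fst(compose(f, g), "abba"))   # '@bb@'
```

The product construction is also where the memory-management concerns mentioned above bite: the composed state space can grow multiplicatively in the sizes of the inputs.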
Speech Summarisation. Speech summarisation is often conducted
simply by transcribing the speech and then summarising the textual
transcript. In fact, much of the published research in speech
summarisation has been conducted on broadcast news, read text and
other conservative domains that define the more interesting aspects of
the problem away for the sake of reporting better evaluation
scores. Our research has tackled the summarisation of spontaneous,
unrestricted conversational speech head-on. We have demonstrated a
clear advantage to using acoustic and prosodic features, as well as
features peculiar to spontaneous conversations, an advantage that
places this technology within reach of practical applications. We are also
very heavily involved in evaluating the
usability of speech in various human-computer interfaces.
Applications. Further improvements are
still to come at both run-time and compile-time, due in great part to
the better theoretical understanding that we now possess of typed
feature logic. This in turn has opened the door to a variety of
extensions of ALE's functionality and applicability, such as
constraint logic programming and speech-to-speech machine
translation, a domain in which we successfully used ALE for
parsing and generation support at Bell Laboratories. More
recently, I have been involved in developing compact alternatives to
finite-state pronunciation modelling for speech recognition and
text-to-speech synthesis. At Bell Labs, I was also involved in
developing a baseline standard for extracting
tabular content and topic summaries from semi-structured text for
display on narrow-bandwidth devices, such as Palm Pilots and PCS
displays — a development programme that continues here in Toronto
in the form of integrated prototypes that incorporate speech
summarisation, together with our partners at Avaya Labs.
gpenn@cs.toronto.edu