Gerald Penn
Research Interests
Summary
- Typed Feature Logic, Grammar Development, Type Theory
- Parsing and Generation Algorithms, Free Word-order Languages
- Substructural Logics, Parsing with Lambek Categorial Grammars
- Finite-State Transducers, Compilation Methods, Default Reasoning
- ALE, Constraint Logic Programming
- Writing System Analysis, Pronunciation Modelling, Text-to-Speech Synthesis
- Speech Summarisation
2007 Research Guide Entry
My research concerns the study of the structure of human languages as
mathematical and computational systems. Not only is natural language
processing an integral component of the overall vision of artificial
intelligence research, but many of the problems that have defined the
rest of AI, programming language and theoretical computer science
research in general can be found inside this very rich empirical
domain. More recently, the proliferation of electronically available
text, audio and other multimedia resources over the World Wide Web has
created an acute demand for machine
translation systems, query answering systems, and text or speech
summarisation tools for using this vast source of information. In
order to approach
their full potential, the next generations of these systems must be
capable of providing far more precise information about meaning and
the relations between the people, objects and locations described by
these sources. Recent advances in wireless technology also require a
natural means of interacting with ever more miniature devices, and
this can only be achieved through spoken input and output. My
research thus seeks to provide both a formal perspective on language
to realise or improve such applications, and the algorithms to support
them.
The strategy I have adopted to pursue this goal consists of several
related threads of research activity in specialised logics for
computational linguistics, other means of grammar specification,
and their applications.
Typed feature logic. Feature logics are a very convenient and
flexible class of description logics that have been applied to
virtually every level of linguistic description and every kind of
linguistic application. Most of my work in this area involves The Logic
of Typed Feature Structures (Carpenter, 1992), one of the most
widely used variants of these logics in computational linguistics.
Because of the rigid type system and features in this logic, grammars
specified with it are more modular and more scalable, can be used more
efficiently in a variety of applications, and are more terse in their
descriptions of feature structures, even when they grow to thousands
of nodes in size, which often happens as they are used to encode and
integrate multiple sources and levels of linguistic knowledge.
The types and features used by a particular grammar are declared in a
partial order called a signature, which describes how feature
structures can combine and inherit information. My doctoral thesis was a study of algebraic
methods for analysing these type signatures, with applications to
grammar transformation, parametric typing, and computing various
structural closures over signature specifications. It also
constructively proved that every statically typable signature admits a
Prolog-term encoding of its feature structures, which has led to a
current research project on graph-theoretic reductions of partial
order representations. I have also been involved with constraint
resolution and constraint logic programming methods over typed feature
structures, particularly in the context of Head-driven Phrase
Structure Grammar (HPSG) development. I am also currently
interested in the relationship between feature logic, category theory,
constructive type theory and object-oriented programming, which may
ultimately lead to a formalism for linguistic theorising that is at
once mathematically more mature and computationally more elegant and
modular.
Parsing/Generation Algorithms. Parsing algorithms are possibly
the best known and most extensively studied area of computational
linguistics, and yet our mastery of them is largely confined to an
antiquated view of syntax. In this view, grammar consists of a number
of context-free rules, with atomic (or at least very small)
categories, and linear precedence is implicitly and rigidly
determined. Constituents — substrings that guide the
composition of meaning — are assumed to come in a single
flavour, on which all constraints governing argument realisation,
scoping properties, long distance dependencies, topic/focus
articulation and prosodic structure happily agree. I am interested in
all aspects of parsing according to a more realistic, modern view of
grammar. That includes parsing with so-called freer word-order
languages, in which constraint solving techniques combined with
morphological information, functional information and/or statistically
derived preferences must be used in order to efficiently match
arguments to the grammatical relations of syntactic heads.
It also includes the use of different control strategies
for parsing, formulated as constraint solving
problems on syntactic trees, their yields or semantic terms.
Substructural logic. It is a common formal view of natural
language understanding that each word of a sentence contributes some
syntactic and semantic information which is combined by rules to form
the meaning of the sentence. The order of the words and the number of
times a word occurs clearly should matter to this composition. In
classical logic, however, the number of instances of the same
hypothesis and the order in which different hypotheses are presented
are irrelevant to deducing a proposition from them. Substructural
logics are logics that lack one or more of the usual structural rules
that encode this insensitivity, making them a natural choice for
expressing composition in natural languages. Categorial
grammar has built
upon this view using the Lambek Calculus and its various extensions to
provide elegant accounts of coordination, quantifier scoping and many
other complex phenomena from natural language. Lambek grammars are
weakly context-free equivalent, and there has been a great deal of
recent progress on the complexity of the sentence membership problem
in the Lambek Calculus, but our understanding of what factors of
the calculus and its various extensions condition its tractability is
still incomplete. My own recent work in this
area has focussed on graph-theoretic membership
algorithms for the Lambek calculus, which prompted a
reconsideration of this problem within the field that ultimately led
to Pentus's 2003 proof of the traditional calculus's NP-completeness.
I have also previously worked on the problems of machine learning of
categorial grammars, and the use of higher-order logic programming
languages such as Twelf
to represent deductions in the Lambek
calculus.
Finite-state methods. Finite-state automata and transducers are
probably the most widely used computational models in speech and
natural language processing today, in part because they are convenient
to work with for many practical applications, but mostly because they
can be statistically trained with relative ease and are typically very
fast. I am interested in compilation methods for computing with very
large-scale transducers or matrices, particularly in the critical but
often unsung areas of memory management and numerically stable
computation. My previous work in this area was confined mostly to
extensions of finite-state specifications such as contextualised rules and default reasoning,
and feature-based automata.
Speech Summarization. Speech summarization is often conducted
simply by transcribing the speech and then summarizing the textual
transcript. In fact, much of the published research in speech
summarization has been conducted on broadcast news, read text and
other conservative domains that define other interesting aspects of
the problem away simply for the sake of reporting better evaluation
scores. Our research has tackled the summarization of spontaneous
unrestricted conversational speech head-on. We have demonstrated a
very clear advantage to using acoustic and prosodic features, as well
as features peculiar to spontaneous conversations that places this
technology within the reach of practical applications. We are also
very heavily involved in evaluating the
usability of speech in various human-computer interfaces.
Applications. I am the principal developer, maintainer and one
of the co-designers of The Attribute Logic Engine
(ALE), a Prolog-like logic programming language and
parsing/generation system. ALE has been used by over 200
universities, research centres, and product developers for grammar
development and testing, a wide variety of NLP applications, and
embedded support for constraint resolution and unification over typed
feature structures in other domains. Since the release of ALE 3.0 in
1998, ALE's speed has improved by a factor of about 70 on large-scale
parses over the widely referenced English Head-driven Phrase Structure
Grammar that we distribute with ALE. Further improvements are
still to come at both run-time and compile-time, due in great part to
the better theoretical understanding that we now possess of typed
feature logic. This in turn has opened the door to a variety of
extensions of ALE's functionality and applicability, such as
constraint logic programming and speech-to-speech machine
translation, a domain in which we successfully used ALE for
parsing and generation support at Bell Laboratories. More
recently, I have been involved in developing compact alternatives to
finite-state pronunciation modelling for speech recognition and
text-to-speech synthesis. At Bell Labs, I was also involved in
developing a baseline standard for extracting
tabular content and topic summaries from semi-structured text for
display on narrow-bandwidth devices, such as Palm Pilots and PCS
displays — a development programme that continues here in Toronto
in the form of integrated prototypes that incorporate speech
summarisation, together with our partners at Avaya Labs.
[ Home
| Contact
| Research
| Publications
| Students
| ALE
]
gpenn@cs.toronto.edu
Gerald Penn
|