Gerald Penn
Research Interests

Summary

  • Typed Feature Logic, Grammar Development, Type Theory
  • Parsing and Generation Algorithms, Free Word-order Languages
  • Substructural Logics, Parsing with Lambek Categorial Grammars
  • Finite-State Transducers, Compilation Methods, Default Reasoning
  • ALE, Constraint Logic Programming
  • Writing System Analysis, Pronunciation Modelling, Text-to-Speech Synthesis
  • Speech Summarisation

2014 Research Guide Entry

My students and I conduct research in mathematical linguistics and spoken language processing. Not only are these areas integral components of the overall vision of artificial intelligence research, but many of the problems that have defined the rest of AI, programming languages and theoretical computer science in general can be found inside this very rich empirical domain. More recently, the proliferation of electronically available text, audio and other multimedia resources over the World Wide Web has created an acute demand for better translation systems, query answering, and text or speech summarisation tools for using this vast source of information. In order to approach their full potential, the next generations of these systems must be capable of providing far more precise information about meaning and the relations between the people, objects and locations described by these sources. Recent advances in wireless technology also require a natural means of interacting with ever more miniature devices, and this can only be achieved through spoken input and output. My research thus seeks to provide both a formal perspective on language to realise or improve such applications, and the algorithms to support them.

The strategy I have adopted to pursue this goal consists of two major threads of research activity.

The first deals with specialised mathematical models that are aimed at natural language processing applications. These include typed feature logic, a convenient and flexible class of description logics that has been applied to virtually every level of linguistic description and every kind of linguistic application. Because of its rigid type system and named features, grammatical knowledge specified with typed feature logic is more modular, more scalable and more tersely described, even when the underlying record structures grow to thousands of nodes in size, and, as a result, is more efficient to compute with. We first proved that every statically typable signature admits a Prolog-term encoding of its feature structures, which has led to a current research project on algebraic encodings of partially ordered sets. I have also been involved with constraint resolution and constraint logic programming methods over typed feature structures, particularly in the context of Head-driven Phrase Structure Grammar (HPSG) development. I am the principal developer, maintainer and one of the co-designers of The Attribute Logic Engine (ALE), a Prolog-like logic programming language and parsing/generation system based upon typed feature logic. ALE has been used by over 200 universities, research centres, and product developers for grammar development and testing, a wide variety of NLP applications, and embedded support for constraint resolution and unification over typed feature structures in other domains. Since the release of ALE 3.0 in 1998, ALE's speed has improved by a factor of about 70 on large parses over the English Head-driven Phrase Structure Grammar that we distribute with it. I am also currently interested in the relationship between feature logic, category theory and programming language theory.
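
To make the unification operation concrete, here is a minimal sketch in Python rather than in ALE's own Prolog. The three-type hierarchy and the feature names are hypothetical, and the union-find forwarding merely stands in for the structure sharing that a real implementation such as ALE maintains internally:

    # Hypothetical toy type hierarchy (single inheritance for brevity;
    # ALE signatures permit multiple inheritance with unique least
    # upper bounds): bot subsumes agr, which subsumes sg and pl.
    PARENT = {"sg": "agr", "pl": "agr", "agr": "bot", "bot": None}

    def ancestors(t):
        out = []
        while t is not None:
            out.append(t)
            t = PARENT[t]
        return out

    def unify_types(s, t):
        # In this toy lattice, two types unify only if one subsumes
        # the other; the result is the more specific of the two.
        if t in ancestors(s): return s
        if s in ancestors(t): return t
        return None

    class FS:
        """A typed feature structure node with union-find forwarding."""
        def __init__(self, type_, feats=None):
            self.type = type_
            self.feats = feats or {}
            self.forward = None
        def deref(self):
            node = self
            while node.forward is not None:
                node = node.forward
            return node

    def unify(a, b):
        a, b = a.deref(), b.deref()
        if a is b:
            return a
        t = unify_types(a.type, b.type)
        if t is None:
            raise ValueError("type clash: %s / %s" % (a.type, b.type))
        a.type = t
        b.forward = a      # forward first, so cyclic structures terminate
        for f, v in b.feats.items():
            if f in a.feats:
                unify(a.feats[f], v)    # recursively unify shared features
            else:
                a.feats[f] = v          # copy features unique to b
        return a

    # Unifying [agr, NUM sg] with [agr, NUM agr, PER bot] refines NUM to sg.
    x = FS("agr", {"NUM": FS("sg")})
    y = FS("agr", {"NUM": FS("agr"), "PER": FS("bot")})
    z = unify(x, y)
    print(z.type, {f: v.deref().type for f, v in z.feats.items()})

The point of the Prolog-term encoding result mentioned above is that, for statically typable signatures, all of this machinery can be compiled away into ordinary first-order term unification.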

Parsing/Generation Algorithms. Parsing algorithms are possibly the best known and most extensively studied area of computational linguistics, and yet our mastery of them is largely confined to an antiquated view of syntax. In this view, grammar consists of a number of context-free rules, with atomic (or at least very small) categories, and linear precedence is implicitly and rigidly determined. Constituents — substrings that guide the composition of meaning — are assumed to come in a single flavour, on which all constraints governing argument realisation, scoping properties, long distance dependencies, topic/focus articulation and prosodic structure happily agree. I am interested in all aspects of parsing according to a more realistic, modern view of grammar. That includes parsing with so-called freer word-order languages, in which constraint solving techniques combined with morphological information, functional information and/or statistically derived preferences must be used in order to efficiently match arguments to the grammatical relations of syntactic heads. It also includes the use of different control strategies for parsing, formulated as constraint solving problems on syntactic trees, their yields or semantic terms.
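
For concreteness, that antiquated view can be captured in a few lines: a CYK recogniser over atomic categories and binary context-free rules (the toy grammar here is hypothetical). Everything described above complicates at least one of its assumptions:

    from itertools import product

    # Toy grammar: atomic categories, binary rules, fixed word order.
    UNARY  = {"she": {"NP"}, "sees": {"V"}, "stars": {"NP"}}
    BINARY = {("NP", "VP"): {"S"}, ("V", "NP"): {"VP"}}

    def cyk(words):
        n = len(words)
        # chart[i][j] holds the categories derivable over words[i:j]
        chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
        for i, w in enumerate(words):
            chart[i][i + 1] |= UNARY.get(w, set())
        for span in range(2, n + 1):
            for i in range(n - span + 1):
                j = i + span
                for k in range(i + 1, j):
                    for a, b in product(chart[i][k], chart[k][j]):
                        chart[i][j] |= BINARY.get((a, b), set())
        return "S" in chart[0][n]

    print(cyk("she sees stars".split()))   # True

Freer word-order languages break exactly the adjacency assumption built into the chart indices above, which is why matching arguments to grammatical relations must instead be posed as a constraint-solving problem.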

Substructural logic. It is a common formal view of natural language understanding that each word of a sentence contributes some syntactic and semantic information which is combined by rules to form the meaning of the sentence. The order of the words and the number of times a word occurs clearly should matter to this composition. In classical logic, however, the number of instances of the same hypothesis and the order in which different hypotheses are presented are irrelevant to deducing a proposition from them. Substructural logics are logics that lack one or more of the usual structural rules that encode this insensitivity, making them a natural choice for expressing composition in natural languages. Categorial grammar has built upon this view using the Lambek calculus and its various extensions to provide elegant accounts of coordination, quantifier scoping and many other complex phenomena from natural language. Lambek grammars are weakly equivalent to context-free grammars, and there has been a great deal of recent progress on the complexity of the sentence membership problem in the Lambek calculus, but our understanding of what factors of the calculus and its various extensions condition its tractability is still incomplete. My own recent work in this area has focussed on graph-theoretic membership algorithms for the Lambek calculus, which prompted a reconsideration of this problem within the field that ultimately led to Pentus's 2003 proof of the traditional calculus's NP-completeness. I have also previously worked on the problems of machine learning of categorial grammars, and the use of higher-order logic programming languages such as Twelf to represent deductions in the Lambek calculus.
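
One simple and instructive necessary condition for derivability in the Lambek calculus is van Benthem's count invariance: every atomic category must occur with balanced polarity across a sequent. The sketch below illustrates it; the category encoding and examples are mine, chosen for brevity:

    from collections import Counter

    # Categories: atoms are strings; ("/", b, a) is b/a (result b,
    # argument a to the right); ("\\", a, b) is a\b (argument a to
    # the left, result b).
    def count(cat, p=1):
        """Signed atom counts; polarity flips on argument subformulas."""
        c = Counter()
        def go(cat, p):
            if isinstance(cat, str):
                c.update({cat: p})
            else:
                slash, x, y = cat
                if slash == "/":
                    go(x, p); go(y, -p)     # x is the result, y the argument
                else:
                    go(x, -p); go(y, p)     # x is the argument, y the result
        go(cat, p)
        return c

    def balanced(antecedent, goal):
        total = Counter()
        for cat in antecedent:
            total.update(count(cat))
        total.update(count(goal, -1))
        return all(v == 0 for v in total.values())

    TV = ("/", ("\\", "np", "s"), "np")      # (np\s)/np, a transitive verb
    print(balanced(["np", TV, "np"], "s"))   # True: the necessary condition holds
    print(balanced(["np", TV], "s"))         # False: underivable, an np is missing

Count invariance is only necessary, never sufficient; Pentus's NP-completeness result implies that no similarly cheap test can be complete unless P = NP.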

Finite-state methods. Finite-state automata and transducers are probably the most widely used computational models in speech and natural language processing today, in part because they are convenient to work with for many practical applications, but mostly because they can be statistically trained with relative ease and are typically very fast. I am interested in compilation methods for computing with very large-scale transducers or matrices, particularly in the critical but often unsung areas of memory management and numerically stable computation. My previous work in this area was confined mostly to extensions of finite-state specifications such as contextualised rules and default reasoning, and feature-based automata.
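
As a small illustration of the numerical-stability point, here is a sketch of scoring a string against a weighted finite-state acceptor, with weights kept as log-probabilities and combined by log-sum-exp so that products of many small probabilities do not underflow. The toy automaton is hypothetical:

    import math

    def logsumexp(xs):
        # Stable log(sum(exp(x))) via the usual max shift.
        m = max(xs)
        if m == -math.inf:
            return m
        return m + math.log(sum(math.exp(x - m) for x in xs))

    # arcs: state -> symbol -> [(next_state, arc_log_prob), ...]
    ARCS = {
        0: {"a": [(0, math.log(0.5)), (1, math.log(0.5))]},
        1: {"b": [(1, math.log(0.9))]},
    }
    FINAL = {1: math.log(0.1)}            # final log-weight of state 1

    def log_score(string, start=0):
        """Log-probability summed over all accepting paths for `string`."""
        alpha = {start: 0.0}              # forward log-weights per state
        for sym in string:
            nxt = {}
            for q, w in alpha.items():
                for r, aw in ARCS.get(q, {}).get(sym, []):
                    nxt.setdefault(r, []).append(w + aw)
            alpha = {r: logsumexp(ws) for r, ws in nxt.items()}
        ends = [w + FINAL[q] for q, w in alpha.items() if q in FINAL]
        return logsumexp(ends) if ends else -math.inf

    print(math.exp(log_score("ab")))      # 0.5 * 0.9 * 0.1 = 0.045

At scale, this forward computation becomes a sequence of operations over very large matrices, which is where memory management and numerical stability become the dominant engineering concerns.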

Speech summarisation. Speech summarisation is often conducted simply by transcribing the speech and then summarising the textual transcript. In fact, much of the published research in speech summarisation has been conducted on broadcast news, read text and other conservative domains that define the more interesting aspects of the problem away simply for the sake of reporting better evaluation scores. Our research has tackled the summarisation of spontaneous, unrestricted conversational speech head-on. We have demonstrated a very clear advantage to using acoustic and prosodic features, as well as features peculiar to spontaneous conversations, and this advantage places the technology within reach of practical applications. We are also very heavily involved in evaluating the usability of speech in various human-computer interfaces.
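
In the extractive setting, the basic recipe can be sketched as follows: score each utterance by combining lexical salience with acoustic/prosodic features, then keep the top-scoring utterances in their original order. The features and weights below are illustrative placeholders, not our actual feature set:

    from collections import Counter

    def summarise(utterances, k=2, w_lex=1.0, w_pitch=0.5, w_energy=0.5):
        # Lexical salience: mean conversation-level term frequency.
        tf = Counter(w for u in utterances for w in u["words"])
        def z(vals):                          # z-score normalisation
            m = sum(vals) / len(vals)
            s = (sum((v - m) ** 2 for v in vals) / len(vals)) ** 0.5 or 1.0
            return [(v - m) / s for v in vals]
        pitch  = z([u["pitch"]  for u in utterances])
        energy = z([u["energy"] for u in utterances])
        scores = []
        for i, u in enumerate(utterances):
            lex = sum(tf[w] for w in u["words"]) / len(u["words"])
            scores.append((w_lex * lex + w_pitch * pitch[i]
                           + w_energy * energy[i], i))
        keep = sorted(i for _, i in sorted(scores, reverse=True)[:k])
        return [utterances[i] for i in keep]

    utts = [
        {"words": ["so", "um", "yeah"],                  "pitch": 110, "energy": 0.2},
        {"words": ["the", "budget", "doubled"],          "pitch": 150, "energy": 0.8},
        {"words": ["budget", "cuts", "start", "monday"], "pitch": 160, "energy": 0.9},
    ]
    for u in summarise(utts, k=2):
        print(" ".join(u["words"]))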

Applications. Further improvements are still to come at both run-time and compile-time, due in great part to the better theoretical understanding that we now possess of typed feature logic. This in turn has opened the door to a variety of extensions of ALE's functionality and applicability, such as constraint logic programming and speech-to-speech machine translation, a domain in which we successfully used ALE for parsing and generation support at Bell Laboratories. More recently, I have been involved in developing compact alternatives to finite-state pronunciation modelling for speech recognition and text-to-speech synthesis. At Bell Labs, I was also involved in developing a baseline standard for extracting tabular content and topic summaries from semi-structured text for display on narrow-bandwidth devices, such as Palm Pilots and PCS displays — a development programme that continues here in Toronto in the form of integrated prototypes that incorporate speech summarisation, together with our partners at Avaya Labs.


