The following are research papers published by our group, grouped by author.
Multi-authored papers are listed under the name of each author.
Papers are listed in reverse chronological order, with more recent papers first.
For each author, only publications related to research in or with the University of Toronto computational linguistics group are included.
Yawar Ali (1) | Understanding adjectives,
Yawar Ali, 1985 Master's Thesis. Department of Computer Science, University of Toronto. January. Published as technical report CSRI-167.
Abstract The first problem is to determine exactly what each adjective modifies. In general, this can only be done by taking account of the semantic properties of the adjective in question, as well as those of other adjectives to its right and of the noun itself. ``Real-world'' knowledge and contextual factors also play a role in this process. This is addressed by developing a classification scheme for adjectives which allows us to substantially reduce the number of candidate interpretations, in some cases to a single one. A system is presented which takes account of the disparate semantic behaviour of different classes of adjectives, word order, punctuation in the noun phrase, and a frame-based store of real-world knowledge, in order to determine the scope of adjectives within a noun phrase. The second problem is to construct a representation of the description embodied in such a noun phrase. Here, it is desirable that the structure of the representation correspond to the structure of modification within the phrase. Particular adjectives are taken to indicate restrictions on the values that objects may take on for associated properties. These properties may be featural, dimensional, or functional in nature. Frame-like structures are used to represent the generic concepts that are taken to be associated with noun phrases. |
(bibtex)
| |
|
Afra Alishahi (5) | A computational model for early Argument Structure Acquisition,
Afra Alishahi and Suzanne Stevenson, 2007, Submitted [Download pdf] (bibtex)
| | A computational usage-based model for learning general properties of semantic roles,
Afra Alishahi and Suzanne Stevenson, 2007 Proceedings of the 2nd European Cognitive Science Conference Delphi, Greece [Download pdf] (bibtex)
| | A cognitive model for the representation and acquisition of verb selectional preferences,
Afra Alishahi and Suzanne Stevenson, 2007 Proceedings of the ACL-2007 Workshop on Cognitive Aspects of Computational Language Acquisition Prague, Czech Republic [Download pdf] (bibtex)
| | A probabilistic model of early argument structure acquisition,
Afra Alishahi and Suzanne Stevenson, 2005 Proceedings of the 27th Annual Conference of the Cognitive Science Society, July, Stresa, Italy
Abstract| We present a computational model of usage-based learning of verb argument structure in young children. The model integrates Bayesian classification and prediction to learn from utterances paired with appropriate semantic representations. The model balances item-based and class-based knowledge in language use, demonstrating appropriate word order generalizations, and recovery from overgeneralizations with no negative evidence or change in learning parameters. |
[Download pdf] (bibtex)
| | The acquisition and use of argument structure constructions,
Afra Alishahi and Suzanne Stevenson, 2005 Proceedings of the Second Workshop on Psychocomputational Models of Human Language Acquisition, pp. 82--90, June, Ann Arbor
Abstract| We present a Bayesian model for the representation, acquisition, and use of argument structure constructions, which is founded on a novel view of constructions as a mapping of a syntactic form to a probability distribution over semantic features. Our computational experiments demonstrate the feasibility of learning general constructions from individual examples of verb usage, and show that the acquired knowledge generalizes to novel or low-frequency situations in language use. |
[Download pdf] (bibtex)
| |
|
Daniel Ansari (2) | Generating warning instructions by planning accidents and injuries,
Daniel Ansari and Graeme Hirst, 1998 Proceedings, 9th International Workshop on Natural Language Generation, pp. 118--127, August, Niagara-on-the-Lake, Ontario
Abstract| We present a system for the generation of natural language instructions, as are found in instruction manuals for household appliances, that is able to automatically generate safety warnings to the user at appropriate points. Situations in which accidents and injuries to the user can occur are considered at every step in the planning of the normal operation of the device, and these ``injury sub-plans'' are then used to instruct the user to avoid these situations. |
[Download pdf] (bibtex)
| | Deriving procedural and warning instructions from device and environment models,
Daniel Ansari, 1995 Master's Thesis. Department of Computer Science, University of Toronto. June. Published as technical report CSRI-329.
Abstract| There has been much interest lately in the automatic generation of documentation; however, much of this research has not considered the cost involved in the production of the natural language generation systems to be a major issue: the benefits obtained from automating the construction of the documentation should outweigh the cost of designing and coding the knowledge base. This study is centred on the generation of instructional text, as is found in instruction manuals for household appliances. We show how knowledge about a device that already exists as part of the engineering effort, together with adequate, domain-independent knowledge about the environment, can be used for reasoning about natural language instructions. The knowledge selected for communication can be planned for, and all the knowledge necessary for the planning should be contained (possibly in a more abstract form) in the knowledge of the artifact together with the world knowledge. We present the planning knowledge for two example domains, in the form of axioms in the situation calculus. This planning knowledge formally characterizes the behaviour of the artifact, and it is used to produce a basic plan of actions that both the device and user take to accomplish a given goal. We explain how the instructions are generated from the basic plan. This plan is then used to derive further plans for states to be avoided. We will also explain how warning instructions about potentially dangerous situations are generated from these plans. These ideas have been implemented using Prolog and the Penman natural language generation system. Finally, this thesis makes the claim that the planning knowledge should be derivable from the device and world knowledge; thus the need for cost effectiveness would be met. To this end, we suggest a framework for an integrated approach to device design and instruction generation. |
[Download gz] (bibtex)
| |
|
Melanie Baljko (7) | Computational simulations of mediated face-to-face multimodal communication,
Melanie Baljko, 2004 Ph.D. Thesis. Department of Computer Science, University of Toronto. July. (bibtex)
| | Articulatory adaptation in multimodal communicative action,
Melanie Baljko, 2001 Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL) Workshop on Adaptation in Dialogue Systems (Cindi Thompson and Tim Paek and Eric Horvitz ed.), pp. 73--74, June, Pittsburgh PA [Download pdf] (bibtex)
| | The evaluation of microplanning and surface realization in the generation of multimodal acts of communication,
Melanie Baljko, 2001 Proceedings of the Workshop on Multimodal Communication and Context in Embodied Agents Fifth International Conference on Autonomous Agents (AA'01) (Catherine Pelachaud and Isabella Poggi ed.), pp. 89--94, May, Montreal Quebec
Abstract| We describe an application domain which requires the computational simulation of human--human communication in which one of the interlocutors has an expressive communication disorder. The importance and evaluation of a process, called here microplanning and surface realization, for such communicative agents is discussed and a related exploratory study is described. |
[Download pdf] (bibtex)
| | Incorporating multimodality in the design of interventions for communication disorders,
Melanie Baljko, 2000 Proceedings of the 4th Swedish Symposium on Multimodal Communication (SSoMC'00) (Patric Dahlqvist ed.), pp. 13--14, October, Stockholm University/KTH [Download pdf] (bibtex)
| | The computational simulation of multimodal, face-to-face communication constrained by physical disabilities,
Melanie Baljko, 2000 Proceedings, Workshop on Integrating Information from Different Channels in Multi-Media-Contexts European Summer School in Logic, Language, and Information, pp. 1--10, August, Birmingham UK
Abstract| In face-to-face interaction, interlocutors often use several modes of articulation simultaneously. An interlocutor's communication will often be multimodal even when he or she knows the other interlocutors cannot perceive all of the modes of communication (e.g., people often gesture while speaking on the telephone). Our present inquiry --- which incorporates computational modeling in conjunction with the analysis of, and comparison to empirical data --- is motivated by the desire to understand a particular design space and is relevant to other research that seeks to understand these ``complex signals'' in human-human and human-computer interaction. |
[Download pdf] (bibtex)
| | The importance of subjectivity in computational stylistic assessment,
Melanie Baljko and Graeme Hirst, 1999 Text Technology, 9(1), pp. 5--17, Spring
Abstract| Often, a text that has been written collaboratively does not ``speak with a single voice.'' Such a text is stylistically incongruous --- as opposed to merely stylistically inconsistent, which might or might not be deleterious to the quality of the text. This widespread problem reduces the overall quality of a text and reflects poorly on its authors. We would like to design a facility for revising style that augments the software environments in which collaborative writing takes place, but before doing so, a question must be answered: what is the role of subjectivity in stylistic assessment for a style-revision facility? We describe an experiment designed to measure the agreement between the stylistic assessments performed by a group of subjects based on a free-sort of writing samples. The results show that there is a statistically significant level of agreement between the subjects' assessments and, furthermore, there was a small number of groupings (three) of even more similar stylistic assessments. The results also show the invalidity of using authorship as an indicator of the reader's perceptions of stylistic similarity between the writing samples. |
[Download pdf] (bibtex)
| | Ensuring stylistic congruity in collaboratively written text: Requirements analysis and design issues,
Melanie Baljko, 1997 Master's Thesis. Department of Computer Science, University of Toronto. May. Published as technical report CSRI-365.
Abstract| Often, texts that have been written collaboratively do not ``speak with a single voice.'' Eliminating stylistic incongruity, a difficult undertaking for both collaborative and singular writers, is the desired function of a software tool. This thesis describes the first cycle of an iterative software development process towards meeting this goal. The user requirements are analyzed with respect to a model that synthesizes established research, and then the requirements are taxonomized. Then, a framework for performing computational stylistic assessments is developed for later tool design. An experiment designed to measure the subjectivity in stylistic assessment --- a relevant issue for making deterministic, computational stylistic assessments --- was performed; the results indicate that future stylistic assessment tools must account for different patterns of assessment. Several design directions motivated by these results are suggested. |
[Download pdf] (bibtex)
| |
|
Faye Baron (2) | Identifying non-compositional idioms in text using WordNet synsets,
Faye Baron, 2007 Master's Thesis. Department of Computer Science, University of Toronto.
Abstract| Any natural language processing system that does not have a knowledge of non-compositional idioms and their interpretation will make mistakes. Previous authors have attempted to automatically identify these expressions through the property of non-substitutability: similar words cannot be successfully substituted for words in non-compositional idiom expressions without changing their meaning. In this study, we use the non-substitutability property of idioms to contrast and expand the ideas of previous works, drawing on WordNet for the attempted substitutions. We attempt to determine the best way to automatically identify idioms through the comparison of algorithms including frequency counts, pointwise mutual information and PMI ranges; the evaluation of the importance of relative word position; and the assessment of the usefulness of syntactic relations. We discover that many of the techniques which we try are not useful for identifying idioms and confirm that non-compositionality doesn't appear to be a necessary or sufficient condition for idiomaticity. |
[Download pdf] (bibtex)
| | Collocations as cues to semantic orientation,
Faye Baron and Graeme Hirst, 2003 [Download pdf] (bibtex)
| |
|
Benjamin Bartlett (2) | Failing to find paraphrases using PNrule,
Benjamin Bartlett, 2007 January [Download pdf] (bibtex)
| | Finding paraphrases using PNrule,
Benjamin Bartlett, 2006 Master's Thesis. Department of Computer Science, University of Toronto. September.
Abstract| In this thesis, we attempt to use a machine-learning algorithm PNrule, along with simple lexical and syntactic measures to detect paraphrases in cases where their existence is rare. We choose PNrule because it was specifically developed for classification in instances where the target class is rare compared to other classes within the data. We test our system both on a dataset we develop based on movie reviews, and on the PASCAL RTE dataset; we obtain poor results on the former, and moderately good results on the latter. We examine why this is the case, and suggest improvements for future research. |
[Download pdf] (bibtex)
| |
|
Barbara Brunson (1) | A processing model for Warlpiri syntax and implications for linguistic theory,
Barbara Brunson, 1986 Master's Thesis. Department of Linguistics, University of Toronto. September. Published as technical report CSRI-206.
Abstract Much of the development of the current Government-Binding (GB) theory of syntax has progressed independently of concerns raised in theories of language processing. Similarly, models of syntactic processing are often proposed that lack any underpinning in syntactic theory. The work described in this report focuses on the language Warlpiri, an Australian aboriginal language with properties that are difficult to reconcile with most theories of Universal Grammar -- properties such as free word-order and discontinuity. This language is studied from the two-fold perspective of establishing a linguistically and computationally sound processing model. This forces the linguistic model to be sufficiently precise to satisfy the demands of implementation as well as forcing the implementation to proceed in a linguistically principled way. This report presents a portion of Warlpiri grammar in a revised GB-based account, addressing the issues of parsability, as well as more theoretical syntactic issues, that together force a reassessment and parametrization of certain linguistic principles. In particular, a revised version of theta theory and the notion of thematic identification are readily interpreted into processing strategies that extend naturally to deal with adjuncts and non-subcategorized arguments in a wide range of languages. The complementary nature of the syntax and morpho-syntax in the satisfaction of syntactic principles as well as in the construction of syntactic representations is addressed, as is the crucial relevance of prosodic information for preserving determinism in the parsing algorithm. |
(bibtex)
| |
|
Alexander Budanitsky (6) | Real-word spelling correction with trigrams: A reconsideration of the Mays, Damerau, and Mercer model,
L. Amber Wilcox-O'Hearn and Graeme Hirst and Alexander Budanitsky, 2008 Proceedings, 9th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing-2008) (Lecture Notes in Computer Science 4919, Springer-Verlag) (Alexander Gelbukh ed.), pp. 605--616, February, Haifa Conference poster with updated results available here
Abstract| The trigram-based noisy-channel model of real-word spelling-error correction that was presented by Mays, Damerau, and Mercer in 1991 has never been adequately evaluated or compared with other methods. We analyze the advantages and limitations of the method, and present a new evaluation that enables a meaningful comparison with the WordNet-based method of Hirst and Budanitsky. The trigram method is found to be superior, even on content words. We then improve the method further and experiment with a new variation that optimizes over fixed-length windows instead of over sentences. |
[Download pdf] (bibtex)
| | Evaluating WordNet-based measures of semantic distance,
Alexander Budanitsky and Graeme Hirst, 2006 Computational Linguistics, 32(1), pp. 13--47, March
Abstract| The quantification of lexical semantic relatedness has many applications in NLP, and many different measures have been proposed. We evaluate five of these measures, all of which use WordNet as their central resource, by comparing their performance in detecting and correcting real-word spelling errors. An information-content--based measure proposed by Jiang and Conrath is found superior to those proposed by Hirst and St-Onge, Leacock and Chodorow, Lin, and Resnik. In addition, we explain why distributional similarity is not an adequate proxy for lexical semantic relatedness. |
[Download pdf] (bibtex)
| | Real-word spelling correction with trigrams: A reconsideration of the Mays, Damerau, and Mercer model,
L. Amber Wilcox-O'Hearn and Graeme Hirst and Alexander Budanitsky, 2006 February Superseded by 2008 CICLing version.
Abstract| The trigram-based noisy-channel model of real-word spelling-error correction that was presented by Mays, Damerau, and Mercer in 1991 has never been adequately evaluated or compared with other methods. We analyze the advantages and limitations of the method, and present a new evaluation that enables a meaningful comparison with the WordNet-based method of Hirst and Budanitsky. The trigram method is found to be superior, even on content words. We then improve the method further and experiment with a new variation that optimizes over fixed-length windows instead of over sentences. |
[Download pdf] (bibtex)
| | Correcting real-word spelling errors by restoring lexical cohesion,
Graeme Hirst and Alexander Budanitsky, 2005 Natural Language Engineering, 11(1), pp. 87--111, March Get paper from publisher's Web site
Abstract| Spelling errors that happen to result in a real word in the lexicon cannot be detected by a conventional spelling checker. We present a method for detecting and correcting many such errors by identifying tokens that are semantically unrelated to their context and are spelling variations of words that would be related to the context. Relatedness to context is determined by a measure of semantic distance initially proposed by Jiang and Conrath (1997). We tested the method on an artificial corpus of errors; it achieved recall of up to 50% and precision of 18 to 25% -- levels that approach practical usability. |
[Download pdf] (bibtex)
| | Semantic distance in WordNet: An experimental, application-oriented evaluation of five measures,
Alexander Budanitsky and Graeme Hirst, 2001 Workshop on WordNet and Other Lexical Resources Second meeting of the North American Chapter of the Association for Computational Linguistics, pp. 29--34, June, Pittsburgh PA
Abstract| Five different proposed measures of similarity or semantic distance in WordNet were experimentally compared by examining their performance in a real-word spelling correction system. It was found that Jiang and Conrath's measure gave the best results overall. That of Hirst and St-Onge seriously over-related, that of Resnik seriously under-related, and those of Lin and of Leacock and Chodorow fell in between. |
[Download pdf] (bibtex)
| | Lexical Semantic Relatedness and its Application in Natural Language Processing,
Alexander Budanitsky, 1999 Department of Computer Science, University of Toronto, Technical Report Number CSRG-390, August
Abstract| A great variety of natural language processing tasks, from word sense disambiguation to text summarization to speech recognition, rely heavily on the ability to measure semantic relatedness or distance between words of a natural language. This report is a comprehensive study of recent computational methods of measuring lexical semantic relatedness. A survey of methods, as well as their applications, is presented, and the question of evaluation is addressed both theoretically and experimentally. Application to the specific task of intelligent spelling checking is discussed in detail: the design of a prototype system for the detection and correction of malapropisms (words that are similar in spelling or sound to, but quite different in meaning from, intended words) is described, and results of experiments on using various measures as plug-ins are considered. Suggestions for research directions in the areas of measuring semantic relatedness and intelligent spelling checking are offered. |
[Download pdf] (bibtex)
| |
|
Mark Catt (2) | An intelligent CALI system for grammatical error diagnosis,
Mark Catt and Graeme Hirst, 1990 Computer Assisted Language Learning, 3, pp. 3--26, November
Abstract| This paper describes an approach to computer-assisted language instruction based on the application of artificial intelligence technology to grammatical error diagnosis. We have developed a prototype system, Scripsi, capable of recognising a wide range of errors in the writing of language learners. Scripsi not only detects ungrammaticality, but hypothesizes its cause and provides corrective information to the student. These diagnostic capabilities rely on the application of a model of the learner's linguistic knowledge. Scripsi operates interactively, accepting the text of the student's composition and responding with diagnostic information about its grammatical structure. In contrast to the narrowly defined limits of interaction available with automated grammatical drills, the framework of interactive composition provides students with the opportunity to express themselves in the language being learned. Although Scripsi's diagnostic functions are limited to purely structural aspects of written language, the way is left open for the incorporation of semantic processing. The design of Scripsi is intended to lay the groundwork for the creation of intelligent tutoring systems for second language instruction. The development of such expertise will remedy many of the deficiencies of existing technology by providing a basis for genuinely communicative instructional tools --- computerised tutors capable of interacting linguistically with the student. The research is based on the assumption that the language produced by the language learner, ``learner language'', differs in systematic ways from that of the native speaker. In particular, the learner's errors can be attributed primarily to two causes: the operation of universal principles of language acquisition and the influence of the learner's native language.
A central concern in the design of Scripsi has been the incorporation of a psychologically sound model of the linguistic competence of the second language learner. |
(bibtex)
| | Intelligent diagnosis of ungrammaticality in computer-assisted language instruction,
Mark Catt, 1988 Master's Thesis. Department of Computer Science, University of Toronto. October. Published as technical report CSRI-218.
Abstract We describe an approach to grammatical error diagnosis in computer-assisted language instruction (CALI). Our prototype system, Scripsi, employs a model of the linguistic competence of the second language learner in diagnosing ungrammaticality in learners' writing. Scripsi not only detects errors, but hypothesises their cause and provides corrective information to the student. Scripsi's grammatical model reflects the results of research in second language acquisition, which has identified language transfer and rule overgeneralisation as the chief sources of error in learner language. Thus, in characterizing the learner's ``transitional competence'', we model not only the grammar of the learner's native language, but also the strategies that give rise to overgeneralisation. Although the approach is language-independent, our implementation targets French-speaking and Chinese-speaking learners of English. The computational realization of the model assumes that linguistic behaviour is rule-governed. We have adopted a rule-oriented grammatical formalism in which the processes of transfer and overgeneralisation are readily interpreted. Linguistic rules are expressed in a feature-based grammatical framework closely related to the Standard Theory of transformational grammar. We have extended the shift-reduce parsing algorithm in order to accommodate context-sensitive and transformational aspects of the formalism. We argue that the development of expertise in intelligent grammatical diagnosis is a prerequisite for the next generation of CALI tools -- genuinely communicative systems capable of interacting linguistically with the student. |
(bibtex)
| |
|
Christopher Collins (4) | DocuBurst: Radial Space-Filling Visualization of Document Content,
Christopher Collins, 2007 Knowledge Media Design Institute, University of Toronto, Technical Report Number KMDI-TR-2007-1 Toronto, Canada
Abstract| We present the first visualization of document content which takes advantage of the human-created structure in lexical databases. We use an accepted design paradigm to generate visualizations which improve the usability and utility of WordNet as the backbone for document content visualization. A radial, space-filling layout of hyponymy (IS-A relation) is presented with interactive techniques of zoom, filter, and details-on-demand for the task of document visualization. The techniques can be generalized to multiple documents. |
[Download pdf] (bibtex)
| | Visualizing Uncertainty in Lattices to Support Decision-Making,
Christopher Collins and Sheelagh Carpendale and Gerald Penn, 2007 Proceedings of the Eurographics/IEEE VGTC Symposium on Visualization, May, Norrköping, Sweden http://diglib.eg.org
Abstract| Lattice graphs are used as underlying data structures in many statistical processing systems, including natural language processing. Lattices compactly represent multiple possible outputs and are usually hidden from users. We present a novel visualization intended to reveal the uncertainty and variability inherent in statistically-derived lattice structures. Applications such as machine translation and automated speech recognition typically present users with a best-guess about the appropriate output, with apparent complete confidence. Through case studies we show how our visualization uses a hybrid layout along with varying transparency, colour, and size to reveal the lattice structure, expose the inherent uncertainty in statistical processing, and help users make better-informed decisions about statistically-derived outputs. |
(bibtex)
| | Head-driven parsing for word lattices,
Christopher Collins and Bob Carpenter and Gerald Penn, 2004 Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, July, Barcelona, Spain
Abstract| We present the first application of the head-driven statistical parsing model of Michael Collins as a simultaneous language model and parser for large-vocabulary speech recognition. The model is adapted to an online left-to-right chart-parser for word lattices, integrating acoustic, n-gram, and parser probabilities. The parser uses structural and lexical dependencies not considered by n-gram models, conditioning recognition on more linguistically-grounded relationships. Experiments on the Wall Street Journal treebank and lattice corpora show word error rates competitive with the standard n-gram language model while extracting additional structural information useful for speech understanding. |
[Download pdf] (bibtex)
| | Head-driven probabilistic parsing for word lattices,
Christopher Collins, 2004 Master's Thesis. Department of Computer Science, University of Toronto. January.
Abstract| This thesis presents the first application of the state-of-the-art head-driven statistical parsing model of Michael Collins as a simultaneous language model and parser for large-vocabulary speech recognition. The model is adapted to an online left-to-right chart-parser for word lattices, integrating acoustic, n-gram and parser probabilities. The parser uses structural and lexical dependencies not considered by n-gram models, conditioning recognition on more linguistically-grounded relationships. By preferring paths through the word lattice for which a probable parse exists, word error rate can be reduced and important syntactic and semantic relationships can be determined in a single step process. New forms of heuristic search and pruning are employed to improve efficiency. Experiments on the Wall Street Journal treebank and lattice corpora show word error rates competitive with the standard n-gram language model while extracting additional structural information useful for speech understanding. |
[Download pdf] (bibtex)
| |
|
Paul Cook (5) | The VNC-Tokens Dataset,
Paul Cook and Afsaneh Fazly and Suzanne Stevenson, 2008 Proceedings of the LREC Workshop on Towards a Shared Task for Multiword Expressions (MWE 2008), June, Marrakech, Morocco
Abstract| Idiomatic expressions formed from a verb and a noun in its direct object position are a productive cross-lingual class of multiword expressions, which can be used both idiomatically and as a literal combination. This paper presents the VNC-Tokens dataset, a resource of almost 3000 English verb--noun combination usages annotated as to whether they are literal or idiomatic. Previous research using this dataset is described, and other studies which could be evaluated more extensively using this resource are identified. |
[Download pdf] (bibtex)
| | Pulling their weight: Exploiting syntactic forms for the automatic identification of idiomatic expressions in context,
Paul Cook and Afsaneh Fazly and Suzanne Stevenson, 2007 Proceedings of the ACL Workshop on A Broader Perspective on Multiword Expressions, Prague, Czech Republic
Abstract| Much work on idioms has focused on type identification, i.e. determining whether a sequence of words can form an idiomatic expression. Since an idiom type often has a literal interpretation as well, token classification of potential idioms in context is critical for NLP. We explore the use of informative prior knowledge about the overall syntactic behaviour of a potentially-idiomatic expression (type-based knowledge) to determine whether an instance of the expression is used idiomatically or literally (token-based knowledge). We develop unsupervised methods for the task, and show that their performance is comparable to that of state-of-the-art supervised techniques. |
[Download pdf] (bibtex)
| | Automagically Inferring the Source Words of Lexical Blends,
Paul Cook and Suzanne Stevenson, 2007 Proceedings of the Conference of the Pacific Association for Computational Linguistics (PACLING-2007), Melbourne, Australia
Abstract| Lexical blending is a highly productive and frequent process by which new words enter a language. A blend is formed when two or more source words are combined, with at least one of them shortened, as in brunch ("breakfast"+"lunch"). We use linguistic and cognitive aspects of this process to motivate a computational treatment of neologisms formed by blending. We propose statistical features that can indicate the source words of a blend, and whether an unknown word was formed by blending. We present computational experiments that show the usefulness in these tasks of features tapping into the recognizability of the source words in the blend, in combination with their semantic properties. |
[Download pdf] (bibtex)
| | Automatically Classifying English Verb-Particle Constructions by Particle Semantics,
Paul Cook, 2006 Master's Thesis. Department of Computer Science, University of Toronto. August.
Abstract| We address the issue of automatically determining the semantic contribution of the particle in a verb-particle construction (VPC), a task which has been previously ignored in computational work on VPCs. Adopting a cognitive linguistic standpoint, we assume that every VPC is compositional, and that the semantic contribution of a particle corresponds to one of a small number of senses. We develop a feature space based on syntactic and semantic properties of verbs and VPCs for type classification of English VPCs according to the sense contributed by their particle. We focus on VPCs using the particle up since it is very frequent and exhibits a wide range of meanings. In our experiments on unseen test VPCs, features which are motivated by properties specific to verbs and VPCs outperform linguistically uninformed word co-occurrence features, and give a reduction in error rate of around 20-30% over a chance baseline. |
[Download pdf] (bibtex)
| | Classifying particle semantics in English verb-particle constructions,
Paul Cook and Suzanne Stevenson, 2006 Proceedings of the ACL/COLING Workshop on Multiword Expressions: Identifying and Exploiting Underlying Properties (MWE 2006), July, Sydney, Australia
Abstract| Previous computational work on learning the semantic properties of verb-particle constructions (VPCs) has focused on their compositionality, and has left unaddressed the issue of which meaning of the component words is being used in a given VPC. We develop a feature space for use in classification of the sense contributed by the particle in a VPC, and test this on VPCs using the particle up. The features that capture linguistic properties of VPCs that are relevant to the semantics of the particle outperform linguistically uninformed word co-occurrence features in our experiments on unseen test VPCs. |
[Download pdf] (bibtex)
| |
|
Adrian Corduneanu (1) | A Pylonic Decision-Tree Language Model with Optimal Question Selection,
Adrian Corduneanu, 1999 Proceedings of the 37th Annual Meeting, Association for Computational Linguistics, pp. 606--609, June, College Park, Maryland
Abstract| This paper discusses a decision-tree approach to the problem of assigning probabilities to words following a given text. In contrast with previous decision-tree language model attempts, an algorithm for selecting nearly optimal questions is considered. The model is to be tested on a standard task, The Wall Street Journal, allowing a fair comparison with the well-known trigram model. |
[Download pdf] (bibtex)
| |
|
Jean-Pierre Corriveau (5) | Time-constrained memory: A reader-based approach to text comprehension,
Jean-Pierre Corriveau, 1995, Mahwah NJ: Lawrence Erlbaum Associates (bibtex)
| | Interpretation of definite reference with a time-constrained memory,
Jean-Pierre Corriveau, 1991 Proceedings, 13th Annual Conference of the Cognitive Science Society, pp. 678--681, August, Chicago IL (bibtex)
| | Constraint satisfaction in time-constrained memory,
Jean-Pierre Corriveau, 1991 Workshop on parallel processing for artificial intelligence, at the International Joint Conference on Artificial Intelligence, August, Sydney (bibtex)
| | Time-constrained memory for reader-based text comprehension,
Jean-Pierre Corriveau, 1991 Ph.D. Thesis. Department of Computer Science, University of Toronto. January.
Abstract| Marvin Minsky writes at the beginning of The Society of Mind (1986, page 18) that ``to explain the mind, we have to show how minds are built from mindless stuff, from parts that are much smaller and simpler than anything we'd consider smart.'' In this dissertation, I develop a model of a strictly quantitative (i.e., non-semantic) memory that can be used to specify a conceptual analyzer for teuchistic (i.e., `constructionist') text comprehension. I view this model as a prototype of Minsky's ``agents of the mind''. Most importantly, I acknowledge the real-time processing constraints derived from the biological constraint (Feldman, 1984) and therefore, assume that linguistic comprehension is a race defined in terms of time-constrained memory processes. Because I do not model an adaptable memory, I partition memory into a static component, which consists of a massively parallel network of simple computing elements whose processes allow for the construction of clusters, and a dynamic component, where these clusters reside. Through specification browsers, the user of the system can input and modify both the topology of the network and the individual behavior of each computing element of static memory, which forms a `knowledge' base. Clusters are built from the processing of an input text with respect to this `knowledge' base and constitute the output of the system. Given that there is widespread disagreement on the nature, modus operandi, and use of inferences in text comprehension, the focus in this work is not on the knowledge required for comprehension, but rather on its specification in terms of constraints to satisfy through the exchange of simple signals and sequences of primitive memory operations to execute upon constraint satisfaction.
I demonstrate at length how typical rules for the problems of syntax, referential resolution, lexical and structural disambiguation, and bridging inferences can be encoded in the proposed representational scheme, and thus illustrate how a theory of text understanding may be `grounded' into a more fundamental quantitative time-constrained memory. |
(bibtex)
| | On the role of time in reader-based text comprehension,
Jean-Pierre Corriveau, 1987 Proceedings of Ninth Annual Conference of the Cognitive Science Society, pp. 794--801, July, Seattle (bibtex)
| |
|
Michael Demko (1) | Statistical Parsing with Context-Free Filtering Grammar,
Michael Demko, 2007 Master's Thesis. Department of Computer Science, University of Toronto.
Abstract| Statistical parsers that simultaneously generate both phrase-structure and lexical dependency trees have been limited in two important ways: the detection of non-projective dependencies has not been integrated with other parsing decisions, or the constraints between phrase-structure and dependency structure have been overly strict. I develop context-free filtering grammar as a generalization of the more restrictive lexicalized factored parsing model, and I develop for the new grammar formalism a scoring model to resolve parsing ambiguities. I demonstrate the flexibility of the new model by implementing a statistical parser for German, a freer-word-order language exhibiting a mixture of context-free and non-projective behaviours. |
[Download pdf] (bibtex)
| |
|
Chrysanne DiMarco (12) | Generation by selection and repair as a method for adapting text for the individual reader,
Chrysanne DiMarco and Graeme Hirst and Eduard Hovy, 1997 Proceedings of the Flexible Hypertext Workshop (held in conjunction with the 8th ACM International Hypertext Conference, Southampton, April 1997), pp. 36--43, August, Microsoft Research Institute, Macquarie University
Abstract| A recent and growing development in Web applications has been the advent of various tools that claim to ``customize'' access to information on the Web by allowing users to specify the kinds of information they want to receive without having to search for it or sift through masses of irrelevant material. But this kind of customization is really just a crude filtering of raw Web material in which the user simply selects the ``channels'' of information she wishes to receive; this selection of information sources is hardly more ``customization'' than someone deciding to tune their television to a certain station. True customization, or tailoring, of information would be done for the user by a system that had access to an actual model of the user, a profile of the user's interests and characteristics. And such tailoring would involve much more than just selecting streams of basic content: the content of the text, whether for an on-line Web page or a paper document, would be carefully selected, structured, and presented in the manner best calculated to appeal to a particular individual. Adaptive-hypertext presentation comes closest to achieving this kind of document tailoring, but the current techniques used for adapting the content of a document to a particular user generally only involve some form of selectively showing (or hiding) portions of text or choosing whole variants of larger parts of the document. If the Web document designer wishes to write and present material in a way that will communicate well with the user, then just displaying the most relevant chunks of information will not be sufficient. For effective communication, both the form and content of the language used in a document should be tailored in rhetorically significant ways to best suit a user's particular personal characteristics and preferences.
Ideally, we would have Web-based natural language generation systems that could produce fully customized and customizable documents on demand by individual users, according to a formal user model. As a first step in this direction, we have been investigating applications of our earlier work on pragmatics in natural language processing to building systems for the automated generation of Web documents tailored to the individual reader. |
[Download pdf] (bibtex)
| | Authoring and generating health-education documents that are tailored to the needs of the individual patient,
Graeme Hirst and Chrysanne DiMarco and Eduard Hovy and Kimberley Parsons, 1997 User Modeling: Proceedings of the Sixth International Conference, UM97 (Anthony Jameson and Cécile Paris and Carlo Tasso, eds.), pp. 107--118, June, Chia Laguna, Sardinia, Italy, Vienna and New York: Springer Wien New York
Abstract| Health-education documents can be much more effective in achieving patient compliance if they are customized for individual readers. For this purpose, a medical record can be thought of as an extremely detailed user model of a reader of such a document. The HealthDoc project is developing methods for producing health-information and patient-education documents that are tailored to the individual personal and medical characteristics of the patients who receive them. Information from an on-line medical record or from a clinician will be used as the primary basis for deciding how best to fit the document to the patient. In this paper, we describe our research on three aspects of the project: the kinds of tailoring that are appropriate for health-education documents; the nature of a tailorable master document and how it can be created; and the linguistic problems that arise when a tailored instance of the document is to be generated. |
[Download pdf] (bibtex)
| | Automatic customization of health-education brochures for individual patients,
Graeme Hirst and Chrysanne DiMarco, 1996 Proceedings, Information Technology and Community Health Conference (ITCH-96), pp. 222--228, November, Victoria, B.C.
Abstract Many studies have shown that health-education messages and patient instructions are more effective when closely tailored to the particular condition and characteristics of the individual recipient. But in situations where many factors interact -- for example, in explaining the pros and cons of hormone replacement therapy -- the number of different combinations is far too large for a set of appropriately tailored messages to be produced in advance. The HealthDoc project is presently developing linguistic techniques for producing, on demand, health-education and patient-information brochures that are customized to the medical and personal characteristics of an individual patient. For each topic, HealthDoc requires a `master document' written by an expert on the subject with the help of a program called an `authoring tool'. The writer decides upon the basic elements of the text -- clauses and sentences -- and the patient conditions under which each element should be included in the output. The program assists the writer in building correctly structured master-document fragments and annotating them with the relationships and conditions for inclusion. When a clinician wishes to give a patient a particular brochure from HealthDoc, she will select it from a menu and specify the name of the patient. HealthDoc will use information from the patient's on-line medical record to then create and print a version of the document appropriate to that patient, by selecting the appropriate pieces of material and then performing the necessary linguistic operations to combine them into a single, coherent text. |
[Download pdf] (bibtex)
| | HealthDoc: Customizing patient information and health education by medical condition and personal characteristics,
Chrysanne DiMarco and Graeme Hirst and Leo Wanner and John Wilkinson, 1995 Workshop on Artificial Intelligence in Patient Education, August, Glasgow, Scotland
Abstract| The HealthDoc project aims to provide a comprehensive approach to the customization of patient-information and health-education materials through the development of sophisticated natural language generation systems. We adopt a model of patient education that takes into account patient information ranging from simple medical data to complex cultural beliefs, so that our work provides both an impetus and testbed for research in multicultural health communication. We propose a model of language generation, `generation by selection and repair', that relies on a `master-document' representation that pre-determines the basic form and content of a text, yet is amenable to editing and revision for customization. The implementation of this model has so far led to the design of a sentence planner that integrates multiple complex planning tasks and a rich set of ontological and linguistic knowledge sources. |
[Download pdf] (bibtex)
| | Usage notes as the basis for a representation of near-synonymy for lexical choice,
Chrysanne DiMarco and Graeme Hirst, 1993 Proceedings, 9th annual conference of the University of Waterloo Centre for the New Oxford English Dictionary and Text Research, pp. 33--43, September, Oxford
Abstract| The task of choosing between lexical near-equivalents in text generation requires the kind of knowledge of fine differences between words that is typified by the usage notes of dictionaries and books of synonym discrimination. These usage notes follow a fairly standard pattern, and a study of their form and content shows the kinds of differentiae adduced in the discrimination of near-synonyms. For appropriate lexical choice in text generation and machine translation systems, it is necessary to develop the concept of formal `computational usage notes', which would be part of the lexical entries in a conceptual knowledge base. The construction of a set of `computational usage notes' adequate for text generation is a major lexicographic task of the future. |
[Download pdf] (bibtex)
| | A computational theory of goal-directed style in syntax,
Chrysanne DiMarco and Graeme Hirst, 1993 Computational Linguistics, 19(3), pp. 451--499, September
Abstract The problem of style is highly relevant to computational linguistics, but current systems deal only superficially, if at all, with subtle but significant nuances of language. Expressive effects, together with their associated meaning, contained in the style of a text are lost to analysis and absent from generation. We have developed an approach to the computational treatment of style that is intended to eventually incorporate three selected components---lexical, syntactic, and semantic. In this paper, we concentrate on certain aspects of syntactic style. We have designed and implemented a computational theory of goal-directed stylistics that can be used in various applications, including machine translation, second-language instruction, and natural language generation. We have constructed a vocabulary of style that contains both primitive and abstract elements of style. The primitive elements describe the stylistic effects of individual sentence components. These elements are combined into patterns that are described by a stylistic meta-language, the abstract elements, that define the concordant and discordant stylistic effects common to a group of sentences. Higher-level patterns are built from the abstract elements and associated with specific stylistic goals, such as clarity or concreteness. Thus, we have defined rules for a syntactic stylistic grammar at three interrelated levels of description: primitive elements, abstract elements, and stylistic goals. Grammars for both English and French have been constructed, using the same vocabulary and the same development methodology. Parsers that implement these grammars have also been built. The stylistic grammars codify aspects of language that were previously defined only descriptively. The theory is being applied to various problems in which the form of an utterance conveys an essential part of meaning and so must be precisely represented and understood. |
[Download pdf] (bibtex)
| | A goal-based grammar of rhetoric,
Chrysanne DiMarco and Graeme Hirst and Marzena Makuta-Giluk, 1993 Association for Computational Linguistics, Workshop on Intentionality and Structure in Discourse Relations, pp. 15--18, June, Ohio State University [Download pdf] (bibtex)
| | The semantic and stylistic differentiation of synonyms and near-synonyms,
Chrysanne DiMarco and Graeme Hirst and Manfred Stede, 1993 AAAI Spring Symposium on Building Lexicons for Machine Translation, pp. 114--121, March, Stanford CA
Abstract| If we want to describe the action of someone who is looking out a window for an extended time, how do we choose between the words gazing, staring, and peering? What exactly is the difference between an argument, a dispute, and a row? In this paper, we describe our research in progress on the problem of lexical choice and the representations of world knowledge and of lexical structure and meaning that the task requires. In particular, we wish to deal with nuances and subtleties of denotation and connotation---shades of meaning and of style---such as those illustrated by the examples above. We are studying the task in two related contexts: machine translation and the generation of multilingual text from a single representation of content. In the present paper, we concentrate on issues in lexical representation. We describe a methodology, based on dictionary usage notes, that we are using to discover the dimensions along which similar words can be differentiated, and we discuss a two-part representation for lexical differentiation. |
[Download pdf] (bibtex)
| | Focus shifts as indicators of style in paragraphs,
Mark Ryan and Chrysanne DiMarco and Graeme Hirst, 1992 Department of Computer Science, University of Waterloo, June In DiMarco, Chrysanne et al, Four papers on computational stylistics. [Download pdf] (bibtex)
| | Accounting for style in machine translation,
Chrysanne DiMarco and Graeme Hirst, 1990 Third International Conference on Theoretical Issues in Machine Translation, June, Austin TX
Abstract| A significant part of the meaning of any text lies in the author's style. Different choices of words and syntactic structure convey different nuances in meaning, which must be carried through in any translation if it is to be considered faithful. Up to now, machine translation systems have been unable to do this. Subtleties of style are simply lost to current machine-translation systems. The goal of the present research is to develop a method to provide machine-translation systems with the ability to understand and preserve the intent of an author's stylistic characteristics. Unilingual natural language understanding systems could also benefit from an appreciation of these aspects of meaning. However, in translation, style plays an additional role, for here one must also deal with the generation of appropriate target-language style. Consideration of style in translation involves two complementary, but sometimes conflicting, aims:
- The translation must preserve, as much as possible, the author's stylistic intent --- the information conveyed through the manner of presentation.
- But it must have a style that is appropriate and natural to the target language.
The study of comparative stylistics is, in fact, guided by the recognition that languages differ in their stylistic approaches: each has its own characteristic stylistic preferences. The stylistic differences between French and English are exemplified by the predominance of the pronominal verb in French. This contrast allows us to recognize the greater preference of English for the passive voice:
(a) Le jambon se mange froid.
(b) Ham is eaten cold.
Such preferences exist at the lexical, syntactic, and semantic levels, but reflect differences in the two languages that can be grouped in terms of more-general stylistic qualities. French words are generally situated at a higher level of abstraction than that of the corresponding English words, which tend to be more concrete (Vinay and Darbelnet 1958, 59). French aims for precision while English is more tolerant of vagueness (Duron 1963, 109). So, a French source text may be abstract and very precise in style, but the translated English text should be looser and less abstract, while still retaining the author's stylistic intent. Translators use this kind of knowledge about comparative stylistics as they clean up raw machine-translation output, dealing with various kinds of stylistic complexities. |
(bibtex)
| | Computational stylistics for natural language translation,
Chrysanne DiMarco, 1990 Ph.D. Thesis. Department of Computer Science, University of Toronto. February. Published as technical report CSRI-239.
Abstract| The problem of style is highly relevant to machine translation, but current systems deal only superficially, if at all, with the preservation of stylistic effects. At best, MT output is syntactically correct but aims no higher than a strict uniformity in style. The expressive effects contained in the source text, together with their associated meaning, are lost. I have developed an approach to the computational treatment of style that incorporates three selected components --- lexical, syntactic and semantic --- and focuses on certain aspects of syntactic style. I have designed and implemented the foundations of a computational model of goal-directed stylistics that could serve as the basis of a system to preserve style in French-to-English translation. First, I developed a vocabulary of style that contains both primitive and abstract elements of style. The primitive elements describe the stylistic effects of individual sentence components. These elements are combined into patterns that are described by a stylistic meta-language, the abstract elements, that define the stylistic effects common to a group of sentences. These elements have as their basis the notions of concord and discord, for it is my contention that style is created by patterns of concord and discord giving an overall integrated arrangement. These patterns are built from the abstract elements and associated with specific stylistic goals such as clarity or concreteness. Thus, I have developed a syntactic stylistic grammar at three interrelated levels of description: primitive shapes, abstract elements, stylistic goals. Grammars for both French and English have been constructed, using the same vocabulary and the same development methodology. As well, Mark Ryan has used this vocabulary and methodology to construct a semantic stylistic grammar. Parsers that implement these grammars have also been built.
Together, the English and French parsers could form the basis of a system that would preserve many aspects of style in translation. The incorporation of stylistic analysis into MT systems should significantly reduce the current reliance on human post-editing and improve the quality of MT output. |
(bibtex)
| | Stylistic grammars in language translation,
Chrysanne DiMarco and Graeme Hirst, 1988 Proceedings, 12th International conference on computational linguistics (COLING-88), pp. 148--153, August, Budapest
Abstract| We are developing stylistic grammars to provide the basis for a French and English stylistic parser. Our stylistic grammar is a branching stratificational model, built upon a foundation dealing with lexical, syntactic, and semantic stylistic realizations. Its central level uses a vocabulary of constituent stylistic elements common to both English and French, while the top level correlates stylistic goals, such as clarity and concreteness, with patterns of these elements. Overall, we are implementing a computational schema of stylistics in French-to-English translation. We believe that the incorporation of stylistic analysis into machine translation systems will significantly reduce the current reliance on human post-editing and improve the quality of the systems' output. |
[Download pdf] (bibtex)
| |
|
Judith Dick (6) | A case-based representation of legal text for conceptual retrieval,
Judith Dick and Graeme Hirst, 1991 Workshop on Language and Informational Processing, American Society for Information Science, October, Washington DC (bibtex)
| | On the usefulness of conceptual graphs in representing knowledge for intelligent retrieval,
Judith Dick, 1991 Proceedings, Sixth Annual Workshop on Conceptual Graphs, pp. 153--167, July, Binghamton NY (bibtex)
| | Intelligent text retrieval,
Judith Dick and Graeme Hirst, 1991 Text retrieval: Workshop notes from the Ninth National Conference on Artificial Intelligence (AAAI-91), July, Anaheim CA (bibtex)
| | Representation of legal text for conceptual retrieval,
Judith Dick, 1991 Proceedings, Third International Conference on Artificial Intelligence and Law, pp. 244--252, June, Oxford [Download pdf] (bibtex)
| | A conceptual, case-relation representation of text for intelligent retrieval,
Judith Dick, 1991 Ph.D. Thesis. Department of Computer Science, University of Toronto. April. Published as technical report CSRI-265.
Abstract| Ideally, a case-law retrieval system would provide the lawyer with conceptual access to cases and help him or her to develop an argument. This research constitutes an attempt to move from contemporary information retrieval towards the ideal by using natural language understanding techniques. A knowledge base of contract cases has been constructed to demonstrate the advantages of using a conceptual representation rather than keywords. The KB consists of knowledge representations of the cases, a lexicon of legal concepts and some semantic constraints. The ratio decidendi or principal argument of each of the cases has been analyzed according to Toulmin's ``good reasons'' argument model. The argument schema is used to structure the representation of the discourse. Sowa's conceptual graphs have been used as a near-first-order notation. Conceptual graphs have an established group of users and a growing software base. The notation is augmented by Somers's 28 definitive deep cases, which are designed to answer the strongest criticisms of case. The lexicon of legal concepts is integrated with the argument representations. Each legal concept has its own definition and pointers to instances. In a full-scale implementation, as the KB grew the legal concepts would be augmented, continuously being redefined, by knowledge from incoming cases. The open-textured concepts are used in the design to improve retrieval. It might be argued that constructing such a KR is slow, requires human ability and is impractical for large-scale applications. Nevertheless, in future, KR construction can reasonably be expected to be automatic. Here, we are not looking to write language in logic just yet, but to model conceptual content in order to facilitate the retrieval of information. We demonstrate that retrieval based on semantics and inference is perceptive and powerful.
The dissertation concludes with a retrieval demonstration using questions derived from cases following those represented in the KB. LOG, a frame-matching algorithm based on spreading activation, is used. The demonstration focuses on pattern-matching among conceptual definitions. Semantic constraints facilitate inference within the type hierarchy. |
[Download pdf] (bibtex)
| | Conceptual retrieval and case law,
Judith Dick, 1987 Proceedings, First International Conference on Artificial Intelligence and Law, pp. 106--115, May, Boston [Download pdf] (bibtex)
| |
|
Philip Edmonds (9) | Near-synonymy and lexical choice,
Philip Edmonds and Graeme Hirst, 2002 Computational Linguistics, 28(2), pp. 105--144, June
Abstract We develop a new computational model for representing the fine-grained meanings of near-synonyms and the differences between them. We also develop a sophisticated lexical-choice process that can decide which of several near-synonyms is most appropriate in a particular situation. This research has direct applications in machine translation and text generation. We first identify the problems of representing near-synonyms in a computational lexicon and show that no previous model adequately accounts for near-synonymy. We then propose a preliminary theory to account for near-synonymy, relying crucially on the notion of granularity of representation, in which the meaning of a word arises out of a context-dependent combination of a context-independent core meaning and a set of explicit differences to its near-synonyms. That is, near-synonyms cluster together. We then develop a clustered model of lexical knowledge, derived from the conventional ontological model. The model cuts off the ontology at a coarse grain, thus avoiding an awkward proliferation of language-dependent concepts in the ontology, and groups near-synonyms into subconceptual clusters that are linked to the ontology. A cluster differentiates near-synonyms in terms of fine-grained aspects of denotation, implication, expressed attitude, and style. The model is general enough to account for other types of variation, for instance in collocational behaviour. An efficient, robust, and flexible fine-grained lexical-choice process is a consequence of a clustered model of lexical knowledge. To make it work, we formalize criteria for lexical choice as preferences to express certain concepts with varying indirectness, to express attitudes, and to establish certain styles. The lexical-choice process itself works on two tiers: between clusters and between near-synonyms of clusters. We describe our prototype implementation of the system, called I-Saurus. |
[Download pdf] (bibtex)
| | Reconciling fine-grained lexical knowledge and coarse-grained ontologies in the representation of near-synonyms,
Philip Edmonds and Graeme Hirst, 2000 Proceedings of the Workshop on Semantic Approximation, Granularity, and Vagueness, April, Breckenridge CO
Abstract| A machine translation system must be able to adequately cope with near-synonymy, for there are often many slightly different translations available for any source language word that can significantly and differently affect the meaning or style of a translated text. Conventional models of lexical knowledge used in natural-language processing systems are inadequate for representing near-synonyms, because they are unable to represent fine-grained lexical knowledge. We will discuss a new model for representing fine-grained lexical knowledge whose basis is the idea of granularity of representation. |
[Download pdf] (bibtex)
| | Semantic representations of near-synonyms for automatic lexical choice,
Philip Edmonds, 1999 Ph.D. Thesis. Department of Computer Science, University of Toronto. September. Published as technical report CSRI-399.
Abstract We develop a new computational model for representing the fine-grained meanings of near-synonyms and the differences between them. We also develop a sophisticated lexical-choice process that can decide which of several near-synonyms is most appropriate in any particular context. This research has direct applications in machine translation and text generation, and also in intelligent electronic dictionaries and automated style-checking and document editing. We first identify the problems of representing near-synonyms in a computational lexicon and show that no previous model adequately accounts for near-synonymy. We then propose a preliminary theory to account for near-synonymy in which the meaning of a word arises out of a context-dependent combination of a context-independent core meaning and a set of explicit differences to its near-synonyms. That is, near-synonyms cluster together. After considering a statistical model and its weaknesses, we develop a clustered model of lexical knowledge, based on the conventional ontological model. The model cuts off the ontology at a coarse grain, thus avoiding an awkward proliferation of language-dependent concepts in the ontology, and groups near-synonyms into subconceptual clusters that are linked to the ontology. A cluster acts as a formal usage note that differentiates near-synonyms in terms of fine-grained aspects of denotation, implication, expressed attitude, and style. The model is general enough to account for other types of variation, for instance, in collocational behaviour. We formalize various criteria for lexical choice as preferences to express certain concepts with varying indirectness, to express attitudes, and to establish certain styles. The lexical-choice process chooses the near-synonym that best satisfies the most preferences.
The process uses an approximate-matching algorithm that determines how well the set of lexical distinctions of each near-synonym in a cluster matches a set of input preferences. We implemented the lexical-choice process in a prototype sentence-planning system. We evaluate the system to show that it can make the appropriate word choices when given a set of preferences. |
[Download pdf] (bibtex)
| | Choosing the word most typical in context using a lexical co-occurrence network,
Philip Edmonds, 1997 Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and the 8th Conference of the European Chapter of the Association for Computational Linguistics, pp. 507--509, July, Madrid Spain
Abstract| This paper presents a partial solution to a component of the problem of lexical choice: choosing the synonym most typical, or expected, in context. We apply a new statistical approach to representing the context of a word through lexical co-occurrence networks. The implementation was trained and evaluated on a large corpus, and results show that the inclusion of second-order co-occurrence relations improves the performance of our implemented lexical choice program. |
[Download pdf] (bibtex)
| | Evoking meaning by choosing the right words,
Philip Edmonds, 1996 Proceedings of the First Student Conference in Computational Linguistics in Montreal, pp. 80--87, June, Montreal Quebec
Abstract| Choosing the right word is difficult. One reason is that the context affects the meaning expressed by a word in complex ways. In particular, when a word is used in a context that is not normal for the word, it may evoke a special meaning. This paper presents a lexical choice process that chooses the word from a set of near-synonyms that best produces the desired effects in the given context. It relies on a clustered representation of lexical knowledge that unites both a statistical model of word co-occurrence (for determining when a word use will be marked) and a knowledge-based model (for determining what specific effects will occur). |
[Download pdf] (bibtex)
| | Collaboration on reference to objects that are not mutually known,
Philip Edmonds, 1994 Proceedings of the 15th International Conference on Computational Linguistics (COLING-94), pp. 1118--1122, August, Kyoto Japan
Abstract| In conversation, a person sometimes has to refer to an object that is not previously known to the other participant. We present a plan-based model of how agents collaborate on reference of this sort. In making a reference, an agent uses the most salient attributes of the referent. In understanding a reference, an agent determines his confidence in its adequacy as a means of identifying the referent. To collaborate, the agents use judgment, suggestion, and elaboration moves to refashion an inadequate referring expression. |
[Download pdf] (bibtex)
| | Repairing conversational misunderstandings and non-understandings,
Graeme Hirst and Susan McRoy and Peter A. Heeman and Philip Edmonds and Diane Horton, 1994 Speech Communication, 15(3--4), pp. 213--229, December
Abstract| Participants in a discourse sometimes fail to understand one another but, when aware of the problem, collaborate upon or negotiate the meaning of a problematic utterance. To address nonunderstanding, we have developed two plan-based models of collaboration in identifying the correct referent of a description: one covers situations where both conversants know of the referent, and the other covers situations such as direction-giving, where the recipient does not. In the models conversants use the mechanisms of refashioning, suggestion, and elaboration, to collaboratively refine a referring expression until it is successful. To address misunderstanding, we have developed a model that combines intentional and social accounts of discourse to support the negotiation of meaning. The approach extends intentional accounts by using expectations deriving from social conventions in order to guide interpretation. Reflecting the inherent symmetry of the negotiation of meaning, all our models can act as both speaker and hearer, and can play both the role of the conversant who is not understood or misunderstood and the role of the conversant who fails to understand. |
[Download pdf] (bibtex)
| | A computational model of collaboration on reference in direction-giving dialogues,
Philip Edmonds, 1993 Master's Thesis. Department of Computer Science, University of Toronto. October. Published as technical report CSRI-289.
Abstract In a conversation, a speaker sometimes has to refer to an object that is not previously known to the hearer. This type of reference occurs frequently in dialogues where the speaker is giving directions to a particular place. To make a reference, the speaker attempts to build a description of the object that will allow the hearer to identify it when she later reaches it. This thesis presents a computational model of how an agent collaborates on reference in direction-giving dialogues. Viewing language as goal-oriented behaviour, we encode route descriptions, referring expressions, and discourse actions in the planning paradigm. This allows an agent to construct plans that achieve communicative goals by means of surface speech actions, and to infer plans and goals from these actions. The basis is that a referring expression plan is acceptable to an agent if she is confident that the plan is adequate as an executable identification plan. By considering the salience of the features used in a referring expression plan, an agent can evaluate her confidence in its adequacy. Driven by the implicit intention of making plans mutually acceptable, the conversants collaborate until the hearer is confident in the adequacy of the current referring expression plan. In doing so, the conversants use suggestion and elaboration discourse actions that operate on the current plan. While collaborating, an agent is in a mental state that includes the intention to achieve the goal of having the direction recipient understand the directions, the plan the agents are currently considering, and a focus of attention into the plan. This collaborative state governs the discourse by sanctioning both the adoption of goals, and the mutual acceptance of plans. Reflecting the inherent symmetry in collaborative dialogue, the model can act as both speaker and hearer, and can play the roles of both the direction-giver and the recipient. |
[Download pdf] (bibtex)
| | Translating near-synonyms: Possibilities and preferences in the interlingua,
Philip Edmonds, Proceedings of the AMTA/SIG-IL Second Workshop on Interlinguas, pp. 23--30, Langhorne PA. Published in technical report MCCS-98-316, Computing Research Laboratory, New Mexico State University
Abstract| This paper argues that an interlingual representation must explicitly represent some parts of the meaning of a situation as possibilities |
|
|