

Brief Research Statement

Textual and speech data are at the forefront of information management problems today.  From email to case law to call center speech recognition transcripts, managing and filtering volumes of linguistic content to find and understand subsets of interest is an immense undertaking that most of us face every day, when we open our inbox or use our favourite search engine.  Linguistic analysis is also very important to business as the corporate world strives to improve text analytics for monitoring customer opinions, to streamline document management and retrieval, and to provide data services to clients.  Drawing on my background in both computational linguistics and information visualization, I aim to create solutions that enhance the communication and understanding of linguistic data.  

I am currently developing interactive visualizations of linguistic data with a focus on convergence and coordination of multiple views of data to provide enhanced insight.  This approach is exemplified through several linguistic data visualizations in my thesis research.  I have developed various methods for generating, reading, and comparing visual summaries of document thematic content for everyday users and data analysts, working simultaneously on novel visual design, helpful interaction techniques, and appropriate computational linguistic techniques for processing and selecting data.  These graphical summaries are interactively linked with the full text in a focus+context approach.  I have also worked on solutions for computational linguists.  Following an ethnographic study of machine translation researchers working in situ, I am developing a general method for relating multiple 2D visualizations in 3D on visualization planes.  Visualizations are linked through edges called VisLinks, which can reveal relationships amongst visualizations and help data experts derive greater insight.  VisLinks developed out of a need in the linguistic visualization realm, but are general enough to apply to any data domain where multiple visual representations can be linked and compared.  I plan to continue collaborating with the domain experts (linguists) to use VisLinks to give them an enhanced ability to analyse and improve their algorithms. 

My research is part of and supported by the NSERC national research network NECTAR (Network for Effective Collaborative Technologies through Advanced Research).

Read my complete research statement and plans


Projects:


Sets over Existing Visualizations
with Gerald Penn and Sheelagh Carpendale

While many data sets contain multiple relationships, depicting more than one data relationship within a single visualization is challenging. We introduce Bubble Sets as a visualization technique for data that has both a primary data relation with a semantically significant spatial organization and a significant set membership relation in which members of the same set are not necessarily adjacent in the primary layout.

[in submission]



Parallel Tag Clouds
with Martin Wattenberg and Fernanda Viégas

Parallel tag clouds combine graphical techniques from parallel coordinates visualizations and traditional tag clouds to provide rich overviews of a document collection while acting as an entry point for exploration of individual texts. We augment basic parallel tag clouds with a details-in-context display and an option to visualize changes over a second facet of the data, such as time.

Accepted to appear at the IEEE Symposium on Visual Analytics Science and Technology (VAST) 2009!


Revealing Relationships Amongst Visualizations
with Sheelagh Carpendale

We are developing exciting new methods for comparing across multiple visualizations, revealing the relationships amongst multiple 2D visualizations while allowing each component visualization to reuse the powerful spatial dimension for data encoding.

[project page]

 


DocuBurst: Visualizing Document Content using Language Structure
with Sheelagh Carpendale and Gerald Penn

DocuBurst is the first visualization of document content which takes advantage of the human-created structure in lexical databases. We use an accepted design paradigm to generate visualizations which improve the usability and utility of WordNet as the backbone for document content visualization. A radial, space-filling layout of hyponymy (IS-A relation) is presented with interactive techniques of zoom, filter, and details-on-demand for the task of document visualization. The techniques can be generalized to multiple documents.
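As a rough illustration of the idea of laying out an IS-A hierarchy in a radial, space-filling way, the sketch below rolls word counts up a tiny hand-made hierarchy and gives each child an angular wedge proportional to its share of the counts. The hierarchy, the function names, and the proportional-angle rule are all assumptions made for this example; it does not use WordNet and is not the DocuBurst implementation.

```python
from collections import Counter

# Hypothetical toy IS-A hierarchy (child -> parent); a stand-in for a lexical database.
HYPERNYM = {
    "dog": "animal",
    "cat": "animal",
    "animal": "entity",
    "car": "vehicle",
    "vehicle": "entity",
}

def rollup_counts(tokens, hypernym):
    """Count each known word and propagate the count to all of its ancestors."""
    known = set(hypernym) | set(hypernym.values())
    counts = Counter()
    for tok in tokens:
        node = tok if tok in known else None
        while node is not None:
            counts[node] += 1
            node = hypernym.get(node)  # None once the root is passed
    return counts

def child_wedges(parent, counts, hypernym, start=0.0, extent=360.0):
    """Split a parent's angular extent among its children in proportion to counts."""
    children = sorted(c for c, p in hypernym.items() if p == parent and counts[c] > 0)
    total = sum(counts[c] for c in children)
    wedges = {}
    angle = start
    for child in children:
        width = extent * counts[child] / total
        wedges[child] = (angle, angle + width)
        angle += width
    return wedges

tokens = "the dog chased the cat and the dog hid under the car".split()
counts = rollup_counts(tokens, HYPERNYM)
wedges = child_wedges("entity", counts, HYPERNYM)
```

Here "animal" receives three quarters of the circle because three of the four recognized words fall under it. Applying `child_wedges` recursively to each child, within that child's own wedge, would give the next ring of the layout.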

[project page]

[DocuBurst featured in the Toronto Star!]

[DocuBurst on 'information aesthetics' blog]

 



Tabletop Text Entry Techniques
with Uta Hinrichs, Mark S. Hancock, and Sheelagh Carpendale

We explored the space of possible text entry techniques for tabletop displays, and suggested important considerations for deciding upon a text-entry technique for a given situation.

Hinrichs, Uta; Hancock, Mark; Collins, Christopher; Carpendale, Sheelagh. Examination of text-entry methods for tabletop displays. In Proceedings of 2nd IEEE International Workshop on Horizontal Human-Computer Systems (Tabletop 2007), pp. 105-112. Newport, USA, October 2007. [PDF]

 


 

Visualizing Uncertainty in Lattices
with Sheelagh Carpendale and Gerald Penn

Lattice graphs are used as underlying data structures in many statistical processing systems, including natural language processing. Lattices compactly represent multiple possible outputs and are usually hidden from users. We present a novel visualization intended to reveal the uncertainty and variability inherent in statistically-derived lattice structures. Applications such as machine translation and automated speech recognition typically present users with a best-guess about the appropriate output, with apparent complete confidence. Through case studies we show how our visualization uses a hybrid layout along with varying transparency, colour, and size to reveal the lattice structure, expose the inherent uncertainty in statistical processing, and help users make better-informed decisions about statistically-derived outputs.
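To make concrete how a lattice compactly represents multiple possible outputs while a system typically shows only its best guess, here is a minimal sketch. The toy lattice, its probabilities, and the function name are hypothetical and chosen only for illustration; they are not taken from the project.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lattice: node -> list of (next_node, words, probability).
Lattice = Dict[int, List[Tuple[int, str, float]]]

LATTICE: Lattice = {
    0: [(1, "recognize", 0.6), (1, "wreck a nice", 0.4)],
    1: [(2, "speech", 0.7), (2, "beach", 0.3)],
    2: [],
}

def enumerate_paths(lattice: Lattice, start: int, end: int) -> List[Tuple[str, float]]:
    """Return every (sentence, probability) pair along paths from start to end."""
    if start == end:
        return [("", 1.0)]
    results = []
    for nxt, words, prob in lattice[start]:
        for rest, rest_prob in enumerate_paths(lattice, nxt, end):
            sentence = (words + " " + rest).strip()
            results.append((sentence, prob * rest_prob))
    return results

# Rank all hypotheses; a typical application would show only the first one.
paths = sorted(enumerate_paths(LATTICE, 0, 2), key=lambda p: p[1], reverse=True)
best_sentence, best_prob = paths[0]
```

Four edges encode four candidate sentences, and the top-ranked one carries well under half of the total probability. That hidden spread is the kind of uncertainty a visualization of the lattice can expose.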

[project page]

 


WordNet Visualization
with Gerald Penn

Interface designs for lexical databases in NLP have suffered from not following design principles developed in the information visualization research community. In this project we outline our preferred design paradigm and show it can be used to generate visualizations which maximize the usability and utility of WordNet. The techniques can be generally applied to other lexical databases used in NLP research.

[project page]

 


Research Interests

  • Computer Science
    Supervisors: Gerald Penn and Sheelagh Carpendale

    Visualization of Natural Language Data (Literature Review)
    Interaction Techniques for Information Visualization (HCI)
    Text Entry Techniques for Large Displays and Mobile Devices
    Multi-touch Tabletop Information Display and Interaction
    Digital Media
    Parsing of Word-Lattices (see M.Sc. thesis)
    Speech Recognition and Generation
    Social Implications of Computing / Ethics & Philosophy of Computing

  • Conferences in my research area

Professional Organizations

Association for Computational Linguistics (ACL)

Association for Computing Machinery (ACM)

ACM SIG on Human Computer Interaction (ACM-SIGCHI)

Society for Teaching and Learning in Higher Education (STLHE)

IEEE Computer Society (IEEE)

 
