Department of Computer Science
University of Toronto

Figure 1: System Components
Both scientific articles and patent claims propose unique solutions, with certain assumptions and claims, to various goals. We use these common goal-oriented elements to structure our templates. In reorganizing the information from text documents, we have created a canonical template with the slots shown in Table 1. This table also includes a ``dummy'' category, OTHER, which we have created to temporarily hold information that is irrelevant for our system. This category and its information do not appear in the final interface to the users.
| Slot Name | Abbreviation | Description |
| GOAL | G | high-level goals, domain goals |
| PROBLEM | P | immediate goals |
| RELATED | R | cited or discussed related works |
| SOLUTION | S | overall proposed approach |
| SOFTGOAL | F | desired benefits of the area |
| ASSUMPTIONS | A | assumptions, definitions, beliefs, trends |
| METHOD | M | particular algorithms, steps, devices |
| CLAIMS | C | results, claims, advantages, limitations |
| EXTENSIONS | Z | extensions, future directions |
| OTHER | X | other information: e.g. background, evidence |
Table 1: Template Slots
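The template in Table 1 can be pictured as a simple data structure. The following is a minimal Python sketch, not the system's actual implementation: the names `Slot`, `Template`, `add`, and `visible` are hypothetical, and only the slot names and abbreviations are taken from Table 1.

```python
from dataclasses import dataclass, field
from enum import Enum


class Slot(Enum):
    """Template slots from Table 1; each value is the slot's abbreviation."""
    GOAL = "G"
    PROBLEM = "P"
    RELATED = "R"
    SOLUTION = "S"
    SOFTGOAL = "F"
    ASSUMPTIONS = "A"
    METHOD = "M"
    CLAIMS = "C"
    EXTENSIONS = "Z"
    OTHER = "X"


@dataclass
class Template:
    """Hypothetical container: text units grouped by slot for one document."""
    title: str
    units: dict = field(default_factory=lambda: {slot: [] for slot in Slot})

    def add(self, abbreviation: str, text: str) -> None:
        # Slot("Z") looks up the enum member by its abbreviation.
        self.units[Slot(abbreviation)].append(text)

    def visible(self) -> dict:
        # OTHER only holds information temporarily and is hidden from users.
        return {s: u for s, u in self.units.items() if s is not Slot.OTHER}


# Placeholder document and sentences, invented for illustration only.
template = Template("Example paper")
template.add("Z", "Placeholder sentence about future directions.")
template.add("X", "Placeholder background sentence.")
print([slot.name for slot in template.visible()])
```

Keeping OTHER inside the structure but filtering it out at display time mirrors the description above: the category exists during processing and never reaches the final interface.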
The corresponding graphical representation for the generic template is shown in Figure 2. Each node in the graph corresponds to a slot in the template, except for the GOAL and SOFTGOAL slots, which may have zero or more nodes in the graph.

Figure 2: Graphical View of Document Contents
The interface produces the graphical model of the template and shows textual details on the side when a node is clicked on. Figure 3 shows a sample use of the system prototype, with the title of the paper placed at the top of the screen.

Figure 3: Sample System Use
In Figure 3, the EXTENSIONS node is highlighted to show that it has been clicked on by the user. The details in the corresponding slot of this template appear on the right-hand side of the screen. Note that the graphical model in this figure does not have any GOAL nodes and has two SOFTGOAL nodes. As mentioned above, the number of nodes for GOAL and SOFTGOAL varies depending on what is written in the original document and what is found by the algorithm. The interface maintains the structural view of the paper while displaying details on demand.
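The node-count rule can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function `graph_nodes` and the dictionary layout are hypothetical, and the rule that every displayed slot other than GOAL and SOFTGOAL always receives exactly one node is one reading of the text, not a confirmed detail of the prototype.

```python
# Slots that may contribute zero or more nodes each (per the text).
MULTI_NODE_SLOTS = ("GOAL", "SOFTGOAL")
# Slots assumed to contribute exactly one node each; OTHER is never displayed.
SINGLE_NODE_SLOTS = ("PROBLEM", "RELATED", "SOLUTION", "ASSUMPTIONS",
                     "METHOD", "CLAIMS", "EXTENSIONS")


def graph_nodes(template):
    """Return a list of (slot, label) pairs, one pair per graph node."""
    nodes = []
    for slot in MULTI_NODE_SLOTS:
        # One node per extracted item; an absent slot yields no node.
        for item in template.get(slot, []):
            nodes.append((slot, item))
    for slot in SINGLE_NODE_SLOTS:
        nodes.append((slot, slot.title()))
    return nodes


# Placeholder content (not the text of Figure 3): no GOAL, two SOFTGOAL items.
example = {
    "SOFTGOAL": ["first soft goal", "second soft goal"],
    "EXTENSIONS": ["details shown when the node is clicked"],
}
nodes = graph_nodes(example)
print(len(nodes), [slot for slot, _ in nodes])
```

With this input the graph has no GOAL node and two SOFTGOAL nodes, which is the same shape as the model described for Figure 3.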
Our visionary system builds concepts across documents in large document collections. The graphical model with GOAL and PROBLEM nodes would make up an ontology of research areas, as shown in Figure 4.

Figure 4: Our Visionary Ontology
This ontology serves as an overview of research problems to tackle. Clicking on a particular node zooms in to the structure of that node, and clicking on any one of the nodes then on the screen brings up its details on demand. A mock-up of this interface is shown in Figure 5. Zooming in on a particular problem gives a further breakdown of that problem and the approaches taken towards it. The details of a particular solution are displayed when a SOLUTION node is clicked on. Our interface follows the visual information-seeking mantra (Shneiderman): ``Overview first, zoom and filter, then details-on-demand''.

Figure 5: Our Visionary System
Note: This is an early version of the system... the graphical layout is not as nice as it should be.
First, to illustrate our goal-oriented graphical representation model, we take the paper titled ``Cubes Marching to the Beat of a Different Drum'' as an example. (The full paper and the extracted sections are both available for viewing.)
| Graphical Interface |
| Using System's Segmentation Results |
| After Manual Verification |
STILL NEED TO EXPLAIN THE ERRORS MADE BY THE SYSTEM, THE NUMBER OF FIXES MADE MANUALLY, AND THE NOTICEABLE DIFFERENCES BETWEEN THE TWO GRAPHICAL INTERFACES OF THE SAME DOCUMENT.
| Domain | Nc | Nm | Ns | Precision | Recall |
| Pat 1: colour toy | 236 | 319 | 324 | 73.981 | 72.839 |
| Pat 2: education mathematics | 349 | 429 | 430 | 81.351 | 81.162 |
| Pat 3: design blouse | 254 | 386 | 400 | 65.803 | 63.500 |
| Pat 4: modern chair | 219 | 326 | 311 | 67.177 | 70.418 |
| Sci 1: women and language | 202 | 534 | 527 | 37.827 | 38.330 |
| Sci 2: children-related HCI | 428 | 769 | 791 | 55.656 | 54.108 |
| Sci 3: interface-related HCI | 260 | 507 | 502 | 51.282 | 51.792 |
| Sci 4: computational linguistics | 314 | 636 | 640 | 49.371 | 49.062 |
Table 2: Segmentation Results By Domain
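The Precision and Recall columns can be reproduced from the three counts. The sketch below assumes, based on the tabulated values, that precision = Nc/Nm and recall = Nc/Ns, reading Nc as the number of units labelled correctly, Nm as the number of units the system assigned, and Ns as the number of units in the reference segmentation. These readings of the column names are an assumption, as is taking the F-score to be the balanced harmonic mean of precision and recall. The function name `prf` is hypothetical.

```python
def prf(nc, nm, ns):
    """Return (precision, recall, F-score) as percentages.

    nc: units labelled correctly (assumed meaning of Nc)
    nm: units assigned by the system (assumed meaning of Nm)
    ns: units in the reference segmentation (assumed meaning of Ns)
    """
    precision = 100.0 * nc / nm if nm else 0.0
    recall = 100.0 * nc / ns if ns else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-score: harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Counts for "Pat 2: education mathematics" from Table 2.
p, r, f = prf(349, 429, 430)
print(f"precision={p:.3f} recall={r:.3f} F={f:.1f}")
```

Because the F-score depends only on the counts, it can also be written as 2·Nc/(Nm + Ns), which is convenient when checking a row by hand.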
| Category | Nc | Nm | Ns | Precision | Recall |
| GOAL | 232 | 242 | 252 | 95.867 | 92.063 |
| PROBLEM | 133 | 216 | 261 | 61.574 | 50.957 |
| RELATED | 6 | 17 | 113 | 35.294 | 05.309 |
| SOLUTION | 95 | 110 | 119 | 86.363 | 79.831 |
| SOFTGOAL | 202 | 329 | 300 | 61.398 | 67.333 |
| ASSUMPTIONS | 9 | 27 | 109 | 33.333 | 08.256 |
| METHOD | 782 | 1516 | 1031 | 51.583 | 75.848 |
| CLAIMS | 387 | 656 | 826 | 58.993 | 46.852 |
| EXTENSIONS | 3 | 4 | 42 | 75.000 | 07.142 |
| OTHER | 413 | 789 | 872 | 52.344 | 47.362 |
| Total | 2262 | 3906 | 3925 | 57.911 | 57.631 |
Table 3: Segmentation Results By Category
The highest-scoring domains are ``education mathematics'', with an F-score of 81.3, for patents and ``children-related HCI'', with an F-score of 54.9, for scientific articles. These two results are consistent with each other, because the documents in both domains discuss children and education in the sciences (including mathematics). The lowest-scoring domains are ``design blouse'', with an F-score of 64.6, for patents and ``women and language'', with an F-score of 38.1, for scientific articles. Examining the results for each document in ``design blouse'', we see that the SOFTGOAL category performed very poorly in most cases (i.e., below 50%), while the PROBLEM, METHOD, and CLAIMS categories fluctuated in performance. For scientific articles, we expected the ``women and language'' domain to score the lowest because its writing style is not very scientific: many of the articles are narrative and contain many inline quotes.
Across all categories, the overall F-score is 57.8 on just over 3,900 units. Ordered from highest to lowest precision, the categories are:
GOAL, SOLUTION, EXTENSIONS, PROBLEM, SOFTGOAL, CLAIMS, OTHER, METHOD, RELATED, ASSUMPTIONS
Ordered from highest to lowest recall, the categories are:
GOAL, SOLUTION, METHOD, SOFTGOAL, PROBLEM, OTHER, CLAIMS, ASSUMPTIONS, EXTENSIONS, RELATED
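Both rankings follow directly from the counts in Table 3. A short Python sketch of the sort is given below; it assumes precision = Nc/Nm and recall = Nc/Ns (which matches the tabulated values), and the helper name `ranked` is hypothetical.

```python
# (category, Nc, Nm, Ns) rows copied from Table 3.
ROWS = [
    ("GOAL", 232, 242, 252),
    ("PROBLEM", 133, 216, 261),
    ("RELATED", 6, 17, 113),
    ("SOLUTION", 95, 110, 119),
    ("SOFTGOAL", 202, 329, 300),
    ("ASSUMPTIONS", 9, 27, 109),
    ("METHOD", 782, 1516, 1031),
    ("CLAIMS", 387, 656, 826),
    ("EXTENSIONS", 3, 4, 42),
    ("OTHER", 413, 789, 872),
]


def ranked(rows, key):
    """Return category names sorted from highest to lowest key value."""
    return [name for name, *_ in sorted(rows, key=key, reverse=True)]


by_precision = ranked(ROWS, key=lambda row: row[1] / row[2])  # Nc / Nm
by_recall = ranked(ROWS, key=lambda row: row[1] / row[3])     # Nc / Ns
print(", ".join(by_precision))
print(", ".join(by_recall))
```

The two orderings differ most for EXTENSIONS, which is precise on only four assigned units but recovers very few of the 42 reference units.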
Furthermore, patents consistently score better than scientific articles, which was also true in our pilot study. On average, patents scored about 72% and scientific articles about 50% in this experiment. Again, we believe that these results are attributable to the well-structured writing that patents exhibit.
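The text does not state how these averages were computed. As one possible check, the Python sketch below pools the Table 2 counts per document type and computes a single F-score for each group; pooling (rather than averaging per-domain scores) is an assumption, and the function name `pooled_f` is hypothetical.

```python
# (domain, Nc, Nm, Ns) rows copied from Table 2.
DOMAINS = [
    ("Pat 1", 236, 319, 324),
    ("Pat 2", 349, 429, 430),
    ("Pat 3", 254, 386, 400),
    ("Pat 4", 219, 326, 311),
    ("Sci 1", 202, 534, 527),
    ("Sci 2", 428, 769, 791),
    ("Sci 3", 260, 507, 502),
    ("Sci 4", 314, 636, 640),
]


def pooled_f(rows):
    """F-score (percent) computed from counts summed over the given rows."""
    nc = sum(row[1] for row in rows)
    nm = sum(row[2] for row in rows)
    ns = sum(row[3] for row in rows)
    # With P = Nc/Nm and R = Nc/Ns, the harmonic mean is 2*Nc / (Nm + Ns).
    return 100.0 * 2 * nc / (nm + ns)


patents = pooled_f([row for row in DOMAINS if row[0].startswith("Pat")])
science = pooled_f([row for row in DOMAINS if row[0].startswith("Sci")])
print(f"patents={patents:.1f} scientific={science:.1f}")
```

Under this assumption the pooled values come out close to the rounded figures quoted above, with patents a little over 72 and scientific articles a little over 49.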
Readers interested in seeing the results of each document are referred to Appendix C.
Our experiment uses a question-answering task modeled after the reviewing task of a conference referee. In this way, evaluators acted as reviewers using only the system output. To keep the evaluators' workload to a minimum, we limited the task to three questions:
| Files | Excerpts | Interface |
| Future Directions for HCI | Extracted Sections | Graphical Layout |
| Educational Treasure Hunt Game | Extracted Sections | Graphical Layout |
Table 4: Files Used in the Pilot Usability Evaluation
Table 5: Files Used in the Real Usability Evaluation
Each evaluator took 1.5 to 2 hours to complete a session of two to four documents (two in the pilot study and four in the real evaluation). Evaluators were asked to type their comments in an email and to summarize their results on a 5-point scale. Principles that could not be assessed in the session were marked as not applicable, which corresponds to a score of 0. A summary of the scores for each principle is presented in Table 6 (this table shows only the scores from the real evaluation).
| Principle | P1 | P2 | Total P's | S1 | S2 | Total S's | Total |
| Conciseness | 11 | 11 | 22 | 12 | 16 | 28 | 50 |
| Retention | 11 | 09 | 20 | 17 | 14 | 31 | 51 |
| Coherence | 10 | 09 | 19 | 14 | 17 | 31 | 50 |
| Consistency | 16 | 11 | 27 | 17 | 18 | 35 | 62 |
| Informativeness | 11 | 09 | 20 | 15 | 14 | 29 | 49 |
| Comprehensibility | 04 | 08 | 12 | 17 | 15 | 32 | 44 |
| Fit for Audience | 13 | 12 | 25 | 16 | 16 | 32 | 57 |
| Fit for Purpose | 15 | 13 | 28 | 17 | 17 | 34 | 62 |
Table 6: Summary of Usability Results by Principles
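The subtotal and total columns of Table 6 are plain sums of the four individual columns. The Python sketch below recomputes them for three of the rows; the function name `totals` and the dictionary keys are hypothetical, and the sketch makes no assumption about what the P and S columns denote beyond what the table headers show.

```python
# (principle, P1, P2, S1, S2) values copied from three rows of Table 6.
SCORES = [
    ("Conciseness", 11, 11, 12, 16),
    ("Comprehensibility", 4, 8, 17, 15),
    ("Fit for Purpose", 15, 13, 17, 17),
]


def totals(row):
    """Return the P subtotal, S subtotal, and grand total for one row."""
    name, p1, p2, s1, s2 = row
    total_p = p1 + p2
    total_s = s1 + s2
    return {"principle": name, "total_p": total_p,
            "total_s": total_s, "total": total_p + total_s}


summary = [totals(row) for row in SCORES]
for entry in summary:
    print(entry)
```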
All of this data can be summarized in two points: