The Problem: Information Overload

In many situations within an organization, a group of collaborating knowledge workers work on a common task by accessing, disseminating, summarizing and debating large amounts of information available in documents, databases or the web. For example, a group of business analysts needs to keep track of current trends and events and assess how these affect the strategic objectives of their organization. Likewise, a group of tax credit experts needs to assess projects running within their organization in order to determine whether they qualify for tax credits because of risks the projects involve. Finally, a group of software engineers owning a large software system needs to share knowledge about the source code, changes made to it, its history, purpose and the like.

These groups have similar needs and similar problems too. Foremost, they suffer from information overload. The explosive growth of the web, the proliferation of internal and external reports and the growth of print publishing have all contributed to it. A study produced by the School of Information Management and Systems at the University of California at Berkeley estimated that the world's production of information in 2002 would require approximately 5 billion gigabytes of storage, and that this number is growing by about 30% per year.

The surfeit of information coming from outside is not the only problem. Even within an organization, information is often hard for workers to access. In addition, experienced workers accumulate a body of tacit knowledge that others in the organization would benefit from being able to access.

A Solution: A Shared Semantic Model

Our research is founded on the premise that such groups of collaborating knowledge workers have a shared semantic model of the application they are working on. For example, a group of strategic business analysts has a shared model of the strategic objectives of their organization, and a group of software engineers has a shared model of the structure and purpose of their software system. EXIP (the Executive Information Portal) is a knowledge management portal that adopts such shared semantic models as an organizing principle. Knowledge from both internal sources (presentations, spreadsheets, reports, emails, etc.) and external sources (web pages, news feeds, etc.) is classified according to the semantic model. Classification is a semi-automatic process that uses information retrieval techniques. Complete documents are classified according to the semantic model, but classification can also be more fine-grained; for example, the paragraphs that make up a report can be classified individually.
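
To make the idea of fine-grained, semi-automatic classification concrete, here is a minimal sketch in Python. It is not EXIP's actual implementation, which is not described here. It assumes, purely for illustration, that each concept in the semantic model carries a short textual description, and it uses TF-IDF weighting with cosine similarity as a stand-in for the information retrieval techniques. The function names, the example model and the threshold value are all hypothetical. The output is a ranked list of suggested concepts per paragraph, which a person would then confirm or reject.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf_vectors(texts):
    """Return one sparse TF-IDF vector (dict of term -> weight) per text."""
    token_lists = [tokenize(t) for t in texts]
    # Document frequency: in how many texts each term appears.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    n = len(texts)
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append({
            # Smoothed inverse document frequency, so weights stay positive.
            term: count * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_concepts(model, paragraphs, threshold=0.1):
    """Suggest concepts for each paragraph, ranked by similarity.

    `model` maps a concept name to a textual description (an assumption of
    this sketch). Only concepts scoring at least `threshold` are suggested;
    a human reviewer makes the final decision.
    """
    names = list(model)
    vectors = tf_idf_vectors([model[name] for name in names] + list(paragraphs))
    concept_vecs = vectors[:len(names)]
    paragraph_vecs = vectors[len(names):]
    suggestions = []
    for paragraph, pvec in zip(paragraphs, paragraph_vecs):
        scored = [(name, cosine(pvec, cvec)) for name, cvec in zip(names, concept_vecs)]
        ranked = sorted(
            (item for item in scored if item[1] >= threshold),
            key=lambda item: item[1],
            reverse=True,
        )
        suggestions.append((paragraph, ranked))
    return suggestions


# Hypothetical semantic model for a group of strategic business analysts.
MODEL = {
    "market growth": "increase market share revenue growth new customers",
    "cost reduction": "reduce operating costs expenses efficiency savings",
}

# Paragraphs of a hypothetical report, classified individually.
PARAGRAPHS = [
    "The report shows revenue growth and new customers in Asia.",
    "Operating costs fell after efficiency savings.",
]

for paragraph, ranked in suggest_concepts(MODEL, PARAGRAPHS):
    print(paragraph, "->", [name for name, _ in ranked])
```

The threshold keeps weak matches out of the suggestion list, so a reviewer sees only plausible candidates. Classifying a complete document instead of its paragraphs would amount to passing the whole text as a single entry.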

Because the model is common to the workers who use it, knowledge becomes more accessible and easier to transfer, which enhances its distribution. In addition, a system of direct and indirect notifications alerts workers when an item or event of interest occurs. Knowledge retention is also enhanced in two ways. First, EXIP stores all knowledge relevant to a group and so provides a central location in which to store and find it. Second, implicit knowledge, such as annotations, ratings of documents and recommendations, is captured.

Research continues in several areas, including the use of natural language processing techniques combined with information retrieval techniques for the classification of knowledge, model analysis techniques, the construction of semantic models for new groups of workers, and the relationship of EXIP to the Semantic Web.

This project is carried out in collaboration with Techné Knowledge Systems.

