next up previous contents
Next: TFSG Indexing through Static Up: Indexing for Non-Parsing Applications Previous: Tree-Based Indexing   Contents


Indexing in Database Systems

Although database systems are not in the scope of this thesis, many of the techniques developed here are connected to the database area. Since the subject of indexing in databases is very vast, just a few essential bibliographical pointers are mentioned in this section.

Databases can store large amounts of data. Usually, each stored entity is a complex structure, called a record (similar to a feature structure). Records are indexed based on the values of certain fields (features). The retrieval is usually not limited to a query where specific values are requested for a field, but must support other types of queries (such as interval queries - where the values should belong to a given interval). An interesting research topic in the area of indexing are the self-adaptive indexing methods, where the indexing can be (semi-)automatically configured. One of the first published work on this topic is [Hammer and Chan1976].

Most of the available database textbooks (such as [Elmasri and Navathe2000]) have chapters dedicated to indexing. Recent research papers on indexing can be found in Kluwer Academic Publishers' series ``Advances in Database Systems'': [Bertino et al.1997], [Manolopoulos et al.1999], or [Mueck and Polaschek1997].

A major difference between indexing in databases and indexing in a TFSG parser should be noted. Typically, a database consists of a large collection of objects, and the indexing scheme is designed to improve retrieval times. It is expected that databases are persistent, with fewer deletions and insertions than retrievals. From this point of view, parsing can be seen as managing a volatile database, that is always empty at start-up. The ratio between insertions and retrievals in a database application is very small (even equal to 0 when used only to retrieve data). For indexed parsing, this ratio is much higher and depends on the structure of grammar rules. For this reason (similar to those discussed in Section 5.2.3), indexing methods such as B-trees (commonly used in databases), where the retrieval can be performed in $ O(1)$ operations, but the insertion needs $ O(log_m(n))$ operations [Ramesh et al.2001] (where $ n$ is the number of nodes in the B-tree and $ m$ the number of index keys assigned to a node), are not recommended for indexed parsing.


next up previous contents
Next: TFSG Indexing through Static Up: Indexing for Non-Parsing Applications Previous: Tree-Based Indexing   Contents