![]() | CSC D43 PROGRAMMING ASSIGNMENT #2 Due by 5pm, 2004. |
Your second programming assignment is to improve significantly the functionality of the database engine that you built in the first one. In particular, you need to extend the DBE component with indexing and joins, and to extend the CLI component to use these features.
int CreateIndex( char* dbName, char* indexName, int attrNum );
int iOpenScan ( char* dbName, char* indexName, int attrNumber, char* value, int* scanid );
int iNextRecord ( int scanid , char** record );
int Join( char* dbName1, char* dbName2, char* index, int attrNum );
As in the first lab, all functions that return an int, return a ``0'' if the operation was successful or error codes (defined by you) in any other case.
int CreateIndex( char* dbName, char* indexName, int attrNum );
This function creates a dense index named indexName for the attribute in position attrNum of the database dbName. The index will be an ndbm database consisting of pairs (v,S), where v is some value for attribute in position attrNum and V is the list of the primary keys of all the records with value v in that attribute. Note: You can assume that all primary keys of records with the sa me indexed-attribute value fit in a single ndbm record.
It is an error to try to create an index for the first attribute, as the ndbm library already provides efficient access based on the primary key.
int iOpenScan ( char* dbName, char* indexName, int attrNumber, char* value, int* scanid );
int iNextRecord ( int scanid , char** record );
The function iOpenScan should return, in variable scanid, an integer identifying a scan, to be used in subsequent calls to iNextRecord. This scanid is from now on associated with the database dbName, the index indexName, and the null-terminated string pointed to by value. After a call to iOpenScan, we say that the scan number scanid is open. The function should return an error if indexName is not associated with dbName.
The function iNextRecord uses the indexName index to find all records with the given value. Each call to iNextRecord places in variable record a pointer to the next record in dbName for which the attribute indexed by indexName has the same value as the string pointed to by value. iNextRecord returns the special error code EOD (that was defined in dbe.h in Programming Assignment 1) if all the records for this scan have already been returned. After the function returns EOD, the scan is considered closed. int Join ( char* dbName1, char* dbName2, char* dbResult, int attrNum1, int attrNum2)
The function Join computes the equi-join of dbName1 and dbName2 on attributes attrNum1 and attrNum2 respectively, and leaves the result in dbResult. If there is no index on either one of the join attributes, the nested loop join algorithm is used. If there is an index on attrNum2, then the method described in Section 15.6.3 is used, with dbName2 as the inner relation. Similarly, if there is an index on attrNum1, then the same method is used but with dbName1 as the inner relation. If there are indexes on both attributes, then choose one of them by any criterion you want and proceed as above.
create index iName on dbName(attrNum)
This will create an index called iName on attribute number attrNum of relation dbName.
The second extension concerns how a query of the form
select a1,...,ak
from dbName
where ai = constant;
is processed. If there is no index on attribute ai of dbName, or ai is the primary key of dbName (that is, ai is $0), then proceed as in Assignment 1. If there is an index, then use the iOpenScan and iNextRecord functions to answer the query.
The third extension is that the CLI should now be able to handle two-relation queries of the form:
select b1,...,bk
from dbName1, dbName2
[into dbName3]
where bi = bj
In the above, the bi's are attribute numbers, and they are interpreted as follows. Suppose dbName1 has m attributes, including the key. Then $0 is the first attribute (i.e. the key) of dbName1, $1 is the second attribute, etc., up to $(m-1). $m refers to the key of dbName2, $(m+1) to the second attribute of dbName2, etc. For example, suppose Movie is a relation with three attributes, MovieKey, Title, and Director, and Actors is a relation with two attributes MovieKey and ActorName. Then the query "list the title and actors for each movie" is:
select $1,$4
from Movie,Actors
where $0=$3
If the optional clause "into dbName3" is used, a new database called dbName3 is created and the join is stored in it, e.g.:
select $1,$4
from Movie,Actors
into MAJoin
where $0=$3
Please submit a 1-page description of your design decisions in a plain text file called PA2.doc, and submit this file and all your source code files electronically using the submit command. Submission details will be given in the tutorial.