CSCD34H Winter 2003. Assignment #3

DueWednesday, March12, 2003, 7:00pm

PART I

Step 3 of Your PDA

(a). (15 pts.) Write five queries on your PDA database, using the select-from-where construct ofSQL. To receive full credit, all of your queries should exhibit some interesting feature ofSQL: queries over more than one relation, aggregation, or subqueries, for example. We suggest that you experiment with your SQL commands on a small database (e.g., your hand- created database), before running them on the large database that you loaded in PDA part 2. Initial debugging is much easier when you're operating on small amounts of data. Once you're confident that your queries are working, run them on your complete database. If you discover that most or all of your ``interesting'' queries return an empty answer on your large database, check whether you followed the instructions in Assignment #2 for generating data values that join properly. You will need to modify your data generator accordingly.Make sure you create at least one query that, when run over the full database, takes a long time, i.e. of the order of minutes. This will be useful for the next part of the assignment. 

            Turn in a copy of all of your SQL queries, along with a script illustrating their execution. Your script should be sufficient to convince us that your commands run successfully. Please do not, however, turn in query results that are thousands of lines long!

(b). (15 pts.)An important technique for improving the performance of queries is to create indexes. An index on an attribute A of relation R allows the database to quickly find alltuples in R with a given value for attribute A. This index is useful if a value of A is specified by your query (in the where-clause). It may also be useful if A is involved in a join that equates it to some other attribute. For example, in the query

SELECT Shows.network

FROM Viewers V, Shows S

WHEREV.name = 'joe'

AND V.likes = S.name;

we might use an index on Viewers.name to help us find the tuple for viewer Joe quickly. We might also like an index on Shows.name, so we can take all the shows Joe likes and quickly find the tuples for those shows to read their networks. 

In DB2, you can get an index by the command: 

CREATE INDEX <IndexName> ON <RelName>(<Attribute List>)

    DB2 creates indexes automatically when you declare an attribute(s) PRIMARY KEY. The name of the index is SQL, followed by a character timestamp (yymmddhhmmssxxx), qualified by SYSIBM. Thus, some indexes may exist before you create them and may also be making certain queries run faster than they would with no indexes at all. Be sure to take this factor into account when trying to explain differences in running times. 

    An illustration of the CREATE INDEX command is 

CREATE INDEX ViewerInd ON Viewers(name)

CREATE INDEX ShowInd ON Shows(name)

        which creates the two indexes mentioned above. To get rid of an index, you can say DROP INDEX followed by the name of the index. Notice that each index must have a name, even though we only refer to the name if we want to drop the index. 

Create at least two useful indexes for your PDA. Run your queries from part (a) on your large database with the indexes and without the indexes. To time your commands, you may use the CURRENT TIME special register (see p.84 of the DB2 SQL Reference). If you do SELECT CURRENT TIME from a table with one arbitrary tuple in it, you will get the value of the current time as output.Naturally these times may be affected by external factors such as system load, etc. Still, you should see a dramatic difference between the execution times with indexes and the times without. Turn in a script showing your commands to create indexes, and showing the relative times of query execution with and without indexes.

(c). (5 pts.) in Part 2 of the project, you indicated all primary key and foreign key constraints. For each of the relation schemas of your PDA, indicate any CHECK constraints that you expect will hold for the relation. Modify your database schema to include all these constraints. If this is not possible because of DB2 limitations, explain this fact. Show the response of the system to updates that violate each of the CHECK constraints. 

(d). (15 pts.) Write five data modification commands on your PDA database. Each of thesecommands should be legal, given the constraints you created for your database in Part 1.Most of these commands should be ``interesting,'' in the sense that they involve some complex feature, such as inserting the result of a query, updating several tuples at once, or deleting a set of tuples that is more than one but less than the whole relation. You may want to try out your commands on small data before trying it on your full database. Hand in a script that shows the result of your modification commands. 

(e). (5 pts.) Create two views on top of your database schema. Show your CREATE VIEWstatements and the response of the system. Also, show a query involving each view and the system response (but truncate the response if there are more than a few tuples produced). Finally, show a script of what happens when you try to update your view, say by inserting a new tuple into it. Are either of your views updatable? Why or why not?.

PART II (individual work)

1. (15 pts.) Given the database schema: 

Suppliers(supplierId,supplierName,supplierCategory,supplierCity)
Items(itemId,description,itemCity)
Sales(supplierId,customerId,itemId,date,quantity,price)
Customers(customerId,customerName,customerCity)

    Write expressions in SQL for the following queries: 

a. Name of all suppliers in Toronto. 

b. Name of all suppliers that sold some item with description "Mint Chocolate". 

c. Name of customers who bought some item supplied by "Good Foods Inc.". 

d.           d. Name of customers who bought some item supplied by a supplier with category higher than the average supplier category. 

e. Description of items that were bought by customer with Id=23, and not by customer with Id=30. 
       f.  Give a list of the form: <supplierCity, itemCity,totalQuantity>, with the total quantity sold per supplier city and item city, considering only sales of quantity  higher than 100. 
 
 

2. (15 pts.) Given the following database schema: 

Employee(employeeId,name,address,city,years)
Task(task#,description)
Plant(plantId,description,city)
Performs(employeeId,task#)
PlantOrganization(plantId,task#)
WorksAt(employeeId,plantId)
Distance(city1, city2,distance)

In relation Employee, attribute years represents the number of years that the employee has been working for the company. Also, assume that, in relation Distance, if there is a tuple <A,B,20>, then, tuple <B,A,20> cannot exist.

    Write expressions in SQL for the following queries: 

a. Name of employees,city of residence and city of the plant, for the employees such that the distance from the city where they live  to the city where the plant where they work is located, is the largest among all employees in the company.
b. Name of employees, and description of the plant they work at, for the employees that have been working for the company a number of years higher than the average number of years of the current plant workers.

3(15 pts.)Given the following database:

Items(itemId,description,itemCity, stock)
Orders(order#,
  customerId, orderdate, status)
Customers(customerId,customerName,customerCity)
OrderDetail(order#,itemid , quantity, unitprice, status) 

Create the database in DB2. Insert  five items(with an initial stock for each one of them), three customers, and orders # 1, 2, and 3, containing one two and three items respectively. For example, OrderDetail  will contain tuples: <2, 2, 23, 270, "pending">, <2,5, 10, 220,"completed">(i.e. 2 items in the detail), and so on.  The status of  an  order  (or of an item in the order detail), can be: "completed" or pending".  An order is "completed" when all of the items in the order detail are  "completed". For example, for order "2" above,  the status will be "pending", because item "2" is still"pending". We ask you to write two triggers such that:

1) When a new order arrives, the stock is modified according to the  quantity ordered for each item. Let us suppose for example, that order number 4  is inserted into the database with the statements:

   insert into Orders  values <4, 1, '3/4/2003,"pending">;

   insert into OrderDetail  values <4, 1, 100, 20 ,"pending">;

   insert into OrderDetail  values <4, 4,100, 20 ,"pending">;

 If  tuples in "Items", for items "1" and "4" are, initially: <1, "item1", "Toronto", 1300> and <4, "item4", Ottawa", 1400>, after insertion of  the tuples for order # 4 above, the  tuples must be updated by an  action specified in the trigger definition, substracting 100 units from the stock of items 1 and 4 (v.g., for item 1: <1, "item1", "Toronto", 1200>)

2) When all the items of an order are completed,  the status of the order must be updated by the trigger to the "completed" status. For example, in  order #4,  when  the following update occurs:

UPDATE orderDetail SET status = "completed" where order# = 4 and itemId = 1

the trigger will check if there is another pending item for the same order. In this case, as item 4 is still pending, the precondition of the trigger  will fail.

When the follwing update is submitted:

UPDATE orderDetail SET status = "completed" where order# = 4 and itemId = 4, the trigger must update the tuple for order #4 in "Orders".

You are asked to write the triggers and run the updates in order to show that they work correctly.  Turn in a copy of the trigger definition, and of the state of the database before and after the trigger execution.  Check the DB2 manual for the DB2 trigger syntax (there are some special keywords you must add to the standard trigger statements).