CSCD34H Winter 2003. Assignment #2

DueWednesday, February 12, 2003, 7:00pm

PART I

Step 2 of Your PDA (Personal Database Application)

(a)(10 pts.) Consider the Entity-Relationship diagram you designed for your PDA in Assignment #1. Please attach a copy of the E/R diagram. Use the method for translating an E/R diagram to relations described in class and in the textbook to produce a set of relations for your database design. Underline the primary key of each relation and specify all key and foreign key constraints. If you'd like to modify the relation schema produced by the translation, feel free to do so. You should be happy with the final design since you will be using it for the rest of the term. Explain any modifications. 

(b) (15 pts.) Write an SQL database schema for your PDA, using the appropriate CREATE TABLE commands. Pick suitable data types for each attribute domain. Hand in a printout of the commands you use to create your database schema (it is a good idea to keep this file for the balance of the course). Show the response of db2 to a request to describe each of your relation schemas. Use the input file, output file and history file mechanism described in the tutorial posted on the course web page to record your input SQL commands and the system's output. 

Execute five INSERT commands to insert tuples into one of your relations. Show the response of db2 and the relation that results when you issue a SELECT * command. 

(c) (20 pts.) Develop a substantial amount of data for your database and load it into your relations using the SQL IMPORT command. See the db2 tutorial  for information on how to bulk-load data. To create the data, write a program in any programming language you like that creates large files of records in ASCII delimited format as expected by the IMPORT command. Load the data into your PDA relations. If you are using real data for you PDA, your program will need to transform the data into files of records conforming to your PDA schema. Otherwise, write a program to fabricate data; your program will generate either random or nonrandom (e.g., sequential) records conforming to your schema. Note that it is both fine and expected for your data values--strings especially-- to be meaningless gibberish. The point of generating large amounts of data is so that you can experiment with a database of realistic size, rather than the small "toy" databases often used in classes. The data you generate and load should have the following characteristics: 

·At least two relations with thousands of tuples 

·At least one additional relation with hundreds of tuples 

If the semantics of your application includes relations that are expected to be relatively small (e.g., departments within a university), it is fine to use some small relations, but please ensure that you have relations of the sizes prescribed above as well. When writing a program to fabricate data, there are two important points to keep in mind:

1. Be sure not to generate duplicate values for key attributes. 

2. Your PDA almost certainly includes relations that are expected to join with each other. For example, you may have a Student relation with attribute courseNo that's expected to join with attribute number in relation Course. In generating data, be sure to generate values that actually do join--otherwise all of your interesting queries will have empty results! One way to guarantee joinability is to generate the values in one relation, then use the generated values to select joining values for the other relation. For example, you could generate course numbers first (either sequentially or randomly), then use these numbers to fill in the courseNo values in the student relation. 

 Turn in your program code for generating or transforming data, a small sample of the recordsgenerated foreach relation (5 or so per relation), the input file containing the SQL statements you used to load your data into the database, and the history file produced during the load. 

PART II

1. (20 pts.) Given the database schema: 

Suppliers(supplierId,supplierName,supplierCategory,supplierCity)
Items(itemId,description,itemCity)
Sales(supplierId,customerId,itemId,date,quantity,price)
Customers(customerId,customerName,customerCity)

     Write expressions in Relational Algebra and SQL for the following queries: 

a. Name of all suppliers in Toronto. 

b. Name of all suppliers that sold some item with description "Mint Chocolate". 

c. Name of customers who bought some item supplied by "Good Foods Inc.". 

d.          d. Name of customers who bought some item supplied by a supplier with category higher than 4. 

e. Description of items that were bought by customer with Id=23, and not by customer with Id=30. 

f. Name of the supplierswith the highest category. 

2. (15 pts.) Given the following database schema: 

Employee(employeeId,name,address,city,years)
Task(task#,description)
Plant(plantId,description,city)
Performs(employeeId,task#)
PlantOrganization(plantId,task#)
WorksAt(employeeId,plantId)
Distance(city1, city2,distance)

In relation Employee, attribute years represents the number of years that the employee has been working for the company. Also, assume that, in relation Distance, if there is a tuple <A,B,20>, then, tuple <B,A,20> cannot exist.

    Write expressions in Relational Algebra and SQL for the following queries: 

a. Name of employees,city of residence and city of the plant, for the employees such that the distance from the city where they live  to the city where the plant where they work is located, is the largest among all employees in the company.

b. Name of employees, and description of the plant they work at, for the employees that have been  working for the company a number of years higher than the average number of years of the current plant workers(only in SQL).

3. (20 pts.) Given the following database schema: 

Person(docType, docNumber, name,address,phone#, gender)
Parent(docType, docNumber,docTypeChild, docNumberChild)

    Write expressions in Relational Algebra and SQL for the following queries: 

a. For each person named"John Cassavets"list the document type and document number, and document type and document number of all of his chidren.

b. Same as above for:

- all of his siblings

- his mother 

- his grandsons.