DueWednesday, February 12, 2003, 7:00pm
(a)(10
pts.) Consider the Entity-Relationship diagram you designed for your
PDA in Assignment #1. Please attach a copy of the E/R diagram. Use the method
for translating an E/R diagram to relations described in class and in the
textbook to produce a set of relations for your database design. Underline
the primary key of each relation and specify all key and foreign key constraints.
If you'd like to modify the relation schema produced by the translation,
feel free to do so. You should be happy with the final design since you will
be using it for the rest of the term. Explain any modifications.
(b) (15 pts.)
Write an SQL database schema for your PDA, using the appropriate CREATE TABLE commands. Pick suitable data types for each attribute
domain. Hand in a printout of the commands you use to create your database
schema (it is a good idea to keep this file for the balance of the course).
Show the response of db2 to a request to describe each of your relation
schemas. Use the input file, output file and history file mechanism described
in the tutorial posted on the course web page to record your input SQL commands
and the system's output.
Execute five INSERT commands to insert tuples into one of your relations.
Show the response of db2 and the relation that results when you issue a SELECT * command.
(c) (20 pts.) Develop
a substantial amount of data for your database and load it into your relations
using the SQL IMPORT command. See the db2 tutorial for information
on how to bulk-load data. To create the data, write a program in any programming
language you like that creates large files of records in ASCII delimited
format as expected by the IMPORT command. Load the data into your PDA relations.
If you are using real data for you PDA, your program will need to transform
the data into files of records conforming to your PDA schema. Otherwise,
write a program to fabricate data; your program will generate either
random or nonrandom (e.g., sequential) records conforming to your schema.
Note that it is both fine and expected for your data values--strings especially--
to be meaningless gibberish. The point of generating large amounts of data
is so that you can experiment with a database of realistic size, rather than
the small "toy" databases often used in classes. The data you generate and
load should have the following characteristics:
·At least two relations with thousands of tuples
·At least one additional relation with hundreds of
tuples
If the semantics of your
application includes relations that are expected to be relatively small (e.g., departments within a university), it
is fine to use some small relations, but please ensure that you have relations
of the sizes prescribed above as well. When writing a program to fabricate
data, there are two important points to keep in mind:
1. Be sure not to generate duplicate values for
key attributes.
2. Your PDA almost certainly includes relations
that are expected to join with each other. For example, you may have a Student relation with attribute courseNo that's expected to join with attribute number in relation Course. In generating data, be sure to generate values
that actually do join--otherwise all of your interesting queries will have
empty results! One way to guarantee joinability is to generate the values
in one relation, then use the generated values to select joining values for
the other relation. For example, you could generate course numbers first
(either sequentially or randomly), then use these numbers to fill in the courseNo values in the student relation.
Turn in your program
code for generating or transforming data, a small sample of the recordsgenerated foreach relation (5 or so per relation), the input
file containing the SQL statements you used to load your data into the database,
and the history file produced during the load.
Suppliers(supplierId,supplierName,supplierCategory,supplierCity)
Items(itemId,description,itemCity)
Sales(supplierId,customerId,itemId,date,quantity,price)
Customers(customerId,customerName,customerCity)
Write expressions in Relational Algebra and SQL for the following queries:
a. Name of all suppliers in Toronto.
b. Name of all suppliers that sold some item with
description "Mint Chocolate".
c. Name of customers who bought some item supplied
by "Good Foods Inc.".
d. d.
Name of customers who bought some item supplied by a supplier with category
higher than 4.
e. Description of items that were bought by customer
with Id=23, and not by customer with Id=30.
f. Name of the supplierswith the highest category.
2. (15 pts.) Given the following database schema:
Employee(employeeId,name,address,city,years)
Task(task#,description)
Plant(plantId,description,city)
Performs(employeeId,task#)
PlantOrganization(plantId,task#)
WorksAt(employeeId,plantId)
Distance(city1, city2,distance)
In relation Employee,
attribute years represents the number of years that the employee has
been working for the company. Also, assume that, in
relation Distance, if there is a tuple <A,B,20>, then, tuple
<B,A,20> cannot exist.
Write
expressions in Relational Algebra and SQL for the following queries:
a.
Name of employees,city of residence and city of the plant, for the employees
such that the distance from the city where they live to the city where
the plant where they work is located, is the largest among all employees
in the company.
b.
Name of employees, and description of the plant they work at, for the employees
that have been working for the company a number
of years higher than the average number of years of the current plant workers(only
in SQL).
3. (20 pts.) Given the following database
schema:
Person(docType, docNumber,
name,address,phone#, gender)
Parent(docType, docNumber,docTypeChild, docNumberChild)
a.
For each person named"John Cassavets"list the document type and document number, and document
type and document number of all of his chidren.
b.
Same as above for:
-
all of his siblings
-
his mother
-
his grandsons.