University of Toronto
CSC148S--Introduction to
Computer Science, Spring 1996
Assignment 5:
Long Strings
Due: Wednesday 10 April, 6:00 pm, in the 148 drop boxes
This assignment is worth 10 marks towards your final grade.
Introduction
This assignment will give you more practice with pointers and
recursion, as well as a chance to manipulate trees.
Its pedagogical purpose, however, is to teach you a little
about object-oriented programming.
To help prepare yourself for the conceptual shift
to object-oriented programming,
read the supplementary lecture notes on object-oriented programming, and
do the assigned readings from chapter 8 of the text.
The problem
The maximum length of a string in OOT is 255 characters.
This limit is adequate for many situations, but not all.
For example,
biologists need programs for manipulating gene sequences,
represented as very long strings made up of combinations of four characters
that stand for the four nucleotides in our DNA.
Another use for long strings is to represent files.
Both word processors and
software for outputting a file to a printer, for instance,
display files for us
as sequences of lines, so we are likely to think of a file
as a two-dimensional object.
However, to the computer, a file is linear --
it is just a (potentially very long) sequence of characters.
Some particular character (with its own ASCII value) can be used to
tell the software when to go to a new line, or the software
can infer when to break the lines from, say, the width of the window or
screen.
Many kinds of programs need a way to represent very long strings.
We will call long strings ropes.
Data structure
For this assignment, you will use trees to represent ropes.
Each leaf will contain a piece of the total rope,
up to 255 characters long.
Internal nodes will contain pointers to two children (one
of which may be nil).
The string represented by an internal node is simply the string
that results from concatenating the two strings represented by
its two children.
(Notice that this is a recursive definition, since each child may
itself be an internal node.)
In addition to child pointers, each internal node will contain
an integer that denotes the length of the rope represented by
the tree of which it is the root.
This extra information will turn out to be handy when performing
rope operations.
Your task
Write an OOT class that uses the data structure described above to represent ropes, and includes methods for the following rope operations:
Line breaks in the input file should be ignored. (Thus the user will be able to break long strings up into separate lines of the input file.)
Note that if reading is done in a naive fashion, the resulting tree will be badly out of balance, hanging off to the right. You must come up with a better method that uses the length information to ensure that the resulting tree is balanced.
For readability, insert a line break return after every so many characters, as indicated by a parameter.
Write also a simple ``driver program'' -- a program whose only
purpose is to test your rope class.
This program need not be general at all. It should simply allow
you to run the tests that you need.
You may even ``hard code'' in specific test cases.
How to design your program
Your program must be written in the object-oriented style,
with a class to represent ropes.
Inside this class will be the variables that hold the trees
that represent ropes.
It is natural to create another class to represent the individual
nodes of our trees.
This class will know how do things like concatenate two nodes,
or print out a node.
But a leaf node contains an OOT string
(which constitutes a piece of a possibly larger rope),
whereas an internal node contains two child pointers and a size
value.
So these two kinds of node store different data;
the subprograms that operate on them also work differently.
For instance,
printing a leaf node involves just printing out the
OOT string that it holds (and that constitutes a piece of a possibly
larger rope);
printing an internal node means printing the whole rope that it
represents, and involves traversing the tree of which it is root.
Because of these differences between the two types of node,
we will need two additional classes:
one for leaf nodes and one for internal nodes,
each of which will be a specialization of the general node class.
In order to help you understand the structure of the
classes that you will need, and the relationships among them,
we have written an outline for your program.
Our starter code is available through the course web page, at
http://www.cs.utoronto.ca/ dianeh/148.html
You must use our starter code, without alterations.
If you
have had some exposure to object-oriented programming and wish
to redesign the program, you must
speak to the course coordinator
to make sure that your plans are appropriate.
What to hand in
Hand in:
Tentative grading scheme
Below is the tentative grading scheme. Although we reserve the right to change it somewhat, it will give you an idea of where to focus your efforts.
Program: Correctness 40Programming style 15
Comments 10
Choice of test cases 10
TOTAL 75
Report: Purpose, use, and other remarks 5
Method 10
Explanation of testing 5
TOTAL 20
Overall quality of writing 5
TOTAL 100
University of Toronto
CSC148S--Introduction to
Computer Science, Spring 1996
Cover sheet for Assignment 5
Complete this page, and tape it to the front of the envelope containing your assignment.
Name: Lecturer:Student #: Tutor:
Turing registration number and owner: (If a home computer was used)
I have discussed this assignment with the following people:
I have read section 7.2 of the CSC148 Course Handbook on plagiarism, and I declare that this assignment is my own work.
Signature: