CSC148H - Summer 96
Assignment 2: Data Encryption
Due: Thursday 20 June, 6pm.
The security of data, whether it is stored or communicated, has always
been a concern of governments and military organizations. But with
the age of electronic communication upon us, ordinary people are now
concerned about security and privacy whether they are making a credit
card transaction over the Internet, sending/receiving sensitive
e-mail, or simply having a chat. Keeping information private
and ensuring its authenticity (i.e., making sure it's not a
forgery) have become major research areas in computer science.
Cryptography is the science (and art) of protecting information by
making it unintelligible to all but the authorized parties. In this
assignment, we will look at two very simple cryptographic systems
(cryptosystems).
Encryption is the process of transforming the original text, called
the plaintext, into a new text, called the ciphertext.
Decryption in the inverse process of transforming the ciphertext back
to the original plaintext. A cryptosystem, for our purposes, is a set
of one-to-one mappings over an alphabet of characters, each
mapping of which is defined by a unique key. A mapping tells
you how to encrypt/decrypt every character in the alphabet depending
on the key.
In this assignment we will look at two very simple cryptosystems from
a class of systems called substitution ciphers. A substitution
cipher replaces each character (of the alphabet) in the plaintext
message with a unique character from the alphabet thereby creating the
ciphertext message. The key determines how each substitution is made.
Thus, from the key, we can generate a substitution table that
lists each individual substitution. For example, this ciphertext
message:
Tkh sxusrvh ri wklv dvvljqphqw lv wr jlyh brx hashulhqfh lq ghdolqj
zlwk d odujh slhfh ri frgh, dqg wr whdfk brx derxw surjudp
vshflilfdwlrqv dqg fruuhfwqhvv.
was encrypted by applying this substitution table:
a b c d e f g h i j k l m n o p q r s t u v w x y z
d e f g h i j k l m n o p q r s t u v w x y z a b c
to this plaintext message:
The purpose of this assignment is to give you experience in dealing
with a large piece of code, and to teach you about program
specifications and correctness.
The alphabet in this example is the set of lower case letters. Notice
that upper case letters, punctuation, and spacing were left untouched.
The substitution table was generated by shifting each character of the
alphabet over by 3 places. This is just one mapping of a set known as
the Caesar ciphers after Julius Caesar, who, not trusting his
messengers, communicated with Marcus Tullius Cicero in 50 B.C. using
this very mapping. The key of a Caesar cipher is the shift factor; in
this case 3. Question: how many different Caesar ciphers are there?
(Or, in the worst case, how many keys would you have to go through to
break the code?)
Assumptions for the assignment: The alphabet will consist of
only the lower case letters; however, a plaintext message may have
more than just these letters in it, in which case the letters and
symbols not in the alphabet should be left untouched in all
encryption/decryption operations. So, a message that you encrypt and
then decrypt should look identical to the original message.
We have divided the assignment into several stages, which you do not
have to complete in the order shown. Some stages require you to
design code, others to provide answers in English. The division is
intended to provide a logical structure to a well-designed program.
The end goal of your work is to develop a simple interface so that a
user can encrypt and decrypt messages stored in files.
- What properties make a substitution table
valid? In other words, what must be true of the table so that
the original message is always recoverable from its ciphertext.
Write one clear and concise paragraph in English that explains your
informal specification for a valid substitution table, and give an
example of an invalid substitution table.
- We have provided a Turing function that
verifies the validity of a substitution table.
It takes a substitution table of type subsTableType as a
parameter, and returns true if the table is valid, and
false otherwise. Your first task is to design the data
structure for the substitution table. Next, in anticipation of
proving that the main loop (marked in the function) is correct, your
task is to specify the precondition(s), postcondition(s), and loop
invariant for the loop. Write them in executable Turing using the
assert and invariant statements. Your conditions and
invariant may call other functions that you have written.
- Prove the correctness of the main loop in the
verify function. You must prove partial correctness and
termination. This means that you must prove (1) that the loop
invariant that you specified in ii is indeed a loop
invariant, (2) that if the loop terminates, then the
postcondition(s) is true, and (3) that the loop terminates.
- Write a Turing procedure that generates a
valid substitution table for a Caesar cipher. Its parameters should
include an integer representing the key, and a substitution table
that will be filled in.
- Write a Turing procedure that generates a
valid substitution table for a random cipher. A random cipher
maps each character of the alphabet to a unique randomly-chosen
character. To make the random choices, your procedure should use
Turing's randnext, which generates a sequence of random
numbers. You will `seed', i.e., initialize, the sequence with the
key. (Documentation on randseed and randnext.) Note: the function should take the same
parameters as the Caesar cipher generator. Question: how many keys
would you have to go through to break this cipher (as compared to
the Caesar cipher)?
- Write a Turing procedure that encrypts a
plaintext message stored in a file. It should output the ciphertext
message into a file. It should take at least three
parameters: the name of the plaintext file (to read from), the name
of the ciphertext file (to write to), and a substitution table to
apply in the encryption. Make sure that just the letters in the
alphabet (the lower case letters) are encrypted. All others,
including punctuation, spaces, and blank lines, should be passed
through unchanged.
- Write a Turing procedure that decrypts a
ciphertext message stored in a file. It should take at least these
three parameters: the name of the ciphertext file (to read from),
the name of the plaintext file (to write to), and the substitution
table to apply in the decryption.
- Substitution ciphers are not very secure
compared to current state-of-the-art cryptosystems, such as PGP
(Pretty Good Privacy), which is pretty much unbreakable. To break a
substitution cipher is to generate the substitution table using only
the ciphertext, thereby allowing one to decrypt the message without
knowing the key. One method to do this is to compare
the frequency that characters occur in the ciphertext to the
frequency that characters occur in normal English text. It is
likely that the most frequent character in the ciphertext
corresponds to the most frequent character in English; that the
second most frequent characters correspond; and so on. While this
method is not always perfect, it can be used to partially uncover
the original message.
Write a Turing procedure that attempts to break a substitution
cipher. It should take at least three parameters: the name of a
file containing the ciphertext, an array of characters ordered
(sorted) by their frequency in normal English text, and a substitution table to fill
in. It should first compute the frequencies, that is, the number of
times each character (of the alphabet) occurs in the ciphertext.
Then, it should match the ordering of characters by frequency in the
ciphertext to the ordering in normal English text.
- Develop a simple user interface that ties
all of the above functions and procedures together. The user should
be able to perform the following operations: generate a substitution
table of her choice for a given key, display a table on the screen,
verify that a table is valid, encrypt a file of her choice, decrypt
a file of her choice, and attempt to break an encrypted file of her
choice. Basic functionality and ease of use will determine your
mark for this interface. We aren't looking for anything fancy, and
you may assume that all user input is correct (i.e., no
error-checking is necessary).
- An encrypted secret message will be put
on the web page. You must try to break its code, and decrypt it.
Include the substitution table and the decrypted secret message in
your report. Note: because of variations in frequency, the
decrypted secret message may not be perfect.
- Write a report, as described in the CSC148
Course Handbook. In the `Method' section, make sure you
include a paragraph that describes your data structure for the
substitution table, and paragraphs that describe how each of the
above subprograms work. Justify all design decisions that you have
made. In the `Tests' section, make sure you explain each of the
tests that you performed.
All of the above subprograms rely on the substitution table; be
careful in designing its data structure. You may also need to know
about converting characters to and from their ASCII codes using the
ord() and chr() functions.
On the web page you will find:
1. the verify function
2. documentation on randseed and randnext
3. a list of letters ordered by their frequency in normal English text
4. the secret message
Hand in the following:
- Your report, including your written answers
to i and iii.
- A printout of your program consisting of your solutions to
ii, iv - ix.
Clearly label all the subprograms in your program so that the
marker can easily find them.
- A printout of the decrypted secret message and substitution
table from part x.
- Printouts of your tests.
- A 3.5" floppy disk containing your program named 148a2.t,
plus test inputs and outputs. Include a README file to
explain what you have put on the disk.
To keep the marker happy, clearly label and organize everything
that you hand in.
This document was generated using the LaTeX2HTML translator Version 96.1 (Feb 5, 1996) Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
The command line arguments were:
latex2html -split 0 -no_navigation as2.
The translation was initiated by Philip Edmonds on Wed May 29 20:55:21 EDT 1996
Philip Edmonds
Wed May 29 20:55:21 EDT 1996