SCI199 Home Page (September 2009 - April 2010)
Announcements for week of March 15
The last assignment, A7,
has now been completed and posted.
I suggest looking at the very subjective topic
of human
attractiveness for some good hints for the first question
of assignment A7.
The 2-3 page individual written report for the service learning
project is due on the last class, March 31. The oral reports for
each group project will take place during the last week
(Monday, March 29 and Wednesday, March 31).
To see a nice report on the positive impact of service learning
see page 7 of the
Kensington Newsletter .
This week we will first discuss the introduction of information
theory by Claude Shannon in 1948. In particular, we will mention
the concepts of entropy, error correcting codes and compression. Compression
is directly related to prediction, so this is a nice segue from
our last topic. We will briefly discuss
Huffman coding and
Ziv-Lempel compression .
Then (if time permits) we will go on to
discuss source level languages and compilers (the topic I originally planned
for this week).
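For those who would like to see Huffman coding in action, here is a small illustrative sketch (our own example, not code from the course materials) that builds a code from character frequencies, so that frequent symbols get short bit strings:

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a Huffman code for the characters of `text`.
    Returns a dict mapping each character to its bit string."""
    freq = Counter(text)
    # Heap entries are (frequency, tiebreaker, tree); a tree is either a
    # single character (leaf) or a (left, right) pair (internal node).
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        # Repeatedly merge the two least frequent subtrees.
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (t1, t2)))
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"  # edge case: one distinct symbol
    walk(heap[0][2], "")
    return codes

codes = huffman_code("abracadabra")
encoded = "".join(codes[c] for c in "abracadabra")
# The frequent 'a' receives a shorter code than the rare 'c' or 'd',
# which is exactly why the encoding beats a fixed-length code.
```

Note that the resulting code is prefix-free: no code word is a prefix of another, so the bit stream can be decoded unambiguously.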
The development of FORTRAN and the
FORTRAN compiler is retold in John Backus' excellent article
on the
History of FORTRAN .
General Announcements
Instead of a newsgroup, there will be a web forum for this class.
There is a link to it at the top of the page that you can
use to access the forum.
We will mainly post course announcements at this web site.
In case you don't know, the Faculty of Arts and Sciences
sponsors a "First-year Learning Community (FLC)"
to help first-year students get the most out of
their University life (for more details, see
FLC ).
You should also take note
of the following resources:
Academic Success Centre
Writing at
University
College Writing Centres
Learning
English as a Second Language
How not to plagiarize
Previous Offerings
This version of SCI199 will be similar to previous versions taught during the
1999-2000, 2001-2002, 2002-2003 and 2003-2004 academic years.
Course Overview
We will pursue the general (and very debatable) theme of GREAT
IDEAS in COMPUTING.
The ambitious goal
is to try to identify the great ideas that have significantly influenced the
field. We will concentrate on mathematical, algorithmic and software ideas with
the understanding that the importance and usefulness of these ideas
depends upon
(and often parallels) the remarkable ideas and progress in computing and
communications hardware. The list of topics we shall discuss will
depend to some
degree on the background and interests of the class. Here you
can find a list of some
possible
course topics . It is interesting (and relevant to this course) that the
Computing Community Consortium is asking the
computing research community to
help identify "game-changing advances from computing research conducted
in the past 20 years."
See the game-changing blog post .
This page will provide WWW access to various documents concerning SCI199.
Some of these documents are electronic versions of handouts given
out in class. Some announcements will also be made on this page. See also the
forum site given in the links above. Please send any comments or questions to
the instructor Allan Borodin (bor ..funnyatsign ..cs.toronto.edu).
Assignments
Assignment 0 .
This assignment will be a small part (1% of final grade) of the
participation component of the grading scheme.
Assignment 1 .
It is
meant to be a relatively easy assignment.
Assignment 2 .
This assignment requires some time so please start
as soon as possible.
Assignment 3 .
Assignment 3 requires some record keeping so again do not
wait too long to begin; assignment 3 is due November 9.
Assignment 4 .
Assignment 5 .
Assignment 6 .
Assignment 7 is now available.
Tutorial Notes and Slides
Slides for first tutorial, September 14
Slides for fourth tutorial, October 5
Slides for fifth tutorial, October 12 in pdf and
slides in powerpoint .
Slides for first tutorial in second term, Jan 4 in pdf and
slides in powerpoint .
Slides for Jan 11 tutorial in pdf .
Slides for March 1 tutorial in pdf .
Readings
I suggest looking at
Ed Lazowska's
September 22, 2009 lecture slides
for many "great ideas" and achievements in computing
and speculation about future great ideas.
Assignment 1 made some reference to hash tables which motivated a
discussion of the birthday paradox. For those who would like to estimate or calculate birthday
paradox probabilities, try
another birthday paradox explanation which
lets you perform random trials for different numbers of possible values
and numbers of samples.
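If you would like to check such probabilities yourself, here is a short sketch of the exact calculation (our own illustration, not the code behind the linked page):

```python
def birthday_collision_prob(n_values, n_samples):
    """Exact probability that among n_samples independent uniform draws
    from n_values possible values, at least two coincide."""
    p_distinct = 1.0
    for i in range(n_samples):
        # i values are already taken, so the next draw avoids them
        # with probability (n_values - i) / n_values.
        p_distinct *= (n_values - i) / n_values
    return 1.0 - p_distinct

# The classic case: 23 people, 365 equally likely birthdays (about 0.507).
p = birthday_collision_prob(365, 23)

# The same formula applies to hash tables: 40 keys hashed into a
# 1000-slot table already collide more often than not.
q = birthday_collision_prob(1000, 40)
```

The connection to hashing is exactly why the paradox came up in Assignment 1: collisions appear long before the table is anywhere near full.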
Assignment 2 concerns engines and I hope
everyone has
read
Vannevar Bush's 1945 prophetic article .
Readings for the 2008-2009 Course
For those interested in the birthday paradox, here is
the Wikipedia article
on birthday paradox and
another birthday paradox explanation which
also lets you calculate similar probabilities.
Here is the
Wikipedia discussion of floating point representation .
Here is the link for those interested in the
history of computing site that
I used in class. Many other sites are available. The history timeline used in
class does seem to emphasize U.S. based contributions to the field.
You might want to look at a more
European history of computing . It is also fair to say
that the University of Toronto was actively involved in the early history
of computing as can be learned from
William's article on U.Toronto Computation
Centre and
Hume's article on software for Ferut
computer . Toronto faculty members Kelly Gotlieb and J. Patterson
Hume were pioneers in
the early development of computing.
There are many sites that provide a discussion of Turing machines
and give examples and simulations for particular computational
problems. Here is one such
Turing machine site . And here is
Turing's famous paper .
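As a small illustration of what such a simulation involves, here is a toy simulator (our own sketch, not the code behind the linked site):

```python
def run_tm(rules, tape, state="start", blank="_", max_steps=1000):
    """Simulate a one-tape Turing machine.
    rules maps (state, symbol) to (new_state, write_symbol, move),
    where move is -1 (left), +1 (right) or 0 (stay).
    Returns the tape contents once the machine reaches state 'halt'."""
    tape = dict(enumerate(tape))  # sparse tape: position -> symbol
    pos = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        sym = tape.get(pos, blank)
        state, tape[pos], move = rules[(state, sym)]
        pos += move
    return "".join(tape[i] for i in sorted(tape)).strip(blank)

# A tiny machine that flips every bit and halts at the first blank.
flip = {
    ("start", "0"): ("start", "1", +1),
    ("start", "1"): ("start", "0", +1),
    ("start", "_"): ("halt", "_", 0),
}
result = run_tm(flip, "1011")  # "0100"
```

Despite its simplicity, this table-of-rules-plus-tape picture is, by the Church-Turing thesis, enough to express any computation at all.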
Perhaps no topic is more widely discussed on the web
than web search engines. We'll provide a very small selection of
some interesting readings. I strongly suggest that everyone read
an exceptionally prophetic
1945 article by Vannevar Bush . (In particular, see sections 6 and 7 for
the concept of hyperlinks.) We mentioned in class that Google doesn't
make it easy to find out specifics about their capabilities and methods.
This
2004 Farach-Colton
article is somewhat dated but still very interesting.
One of the reasons that Google is so secretive can be understood
from this
article on Google ranking and spamming . For those interested
in search anomalies, here is an article on
the OR anomaly . At the end of the October 27 class, I was
asked about
how search engines accommodate geographical location .
Here are a few
of the many histories of the internet:
It is interesting that the history of packet routing seems to be a little
controversial as seen in this who invented packet routing debate .
We have discussed the counter-intuitive idea that negative results (assumptions about
what cannot be computed efficiently) can lead to positive results (pseudo
random generators, public key cryptography). In particular, we have been discussing
results based on the assumed difficulty of factoring numbers. The
Blum Blum Shub pseudo random
generator is one example of a pseudo random generator. Perhaps the best known
public key system is RSA . The original
$100 RSA challenge
and subsequent larger RSA number factoring challenges have
been broken with improved factoring and computational resources but
still most people continue to believe it is difficult to factor large numbers.
For the fifth assignment you need to execute the
extended Euclidean algorithm .
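Here is a sketch of the computation (the numbers e = 17 and phi = 3120 are the standard textbook RSA example, not the assignment's values):

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    # We have b*x + (a % b)*y == g and a % b == a - (a // b) * b,
    # so rearranging gives the coefficients for (a, b).
    return g, y, x - (a // b) * y

# RSA-style use: the decryption exponent d is the inverse of e mod phi.
g, d, _ = extended_gcd(17, 3120)
d %= 3120   # normalize to a positive representative
```

Since gcd(17, 3120) = 1, the coefficient of 17 in the identity is exactly the modular inverse, which is why the algorithm is the standard way to compute RSA decryption exponents.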
Social networks and their study are being impacted by online networks
and an emerging emphasis on more mathematical and algorithmic understanding of such networks.
We start our discussion of social networks by asking (as suggested by Milgram's
seminal studies), if
it is a small world ,
and if so, is this surprising, and is the world made smaller or bigger by online
social networks. Any study of the nature of social ties (e.g. friendship)
must take into account the strength of such ties and in this context we consider
Granovetter's
strength of weak ties . An important aspect of social network
study is the nature of communities in a network (i.e. how they
can be defined and structural information about such communities)
as well as how communities grow (and break apart). Two very interesting
papers are
the Leskovec et al study of community structure in
large scale networks and
the Backstrom et al study of the evolution of groups .
In our discussion of cryptography, we introduced the idea
that certain problems (like factoring) cannot be "efficiently" computed.
And in our discussion of social networks, we came across many large scale graph
theoretic problems where it is not always clear how to solve such problems.
We recall Turing's work and the Church-Turing thesis which equates
the intuitive concept of "computable" with the mathematically defined
"computable by a Turing machine". In order to formalize the concept
of "efficiently computable", we will assume the (debatable) extended
Church-Turing thesis, namely we equate "efficiently computable"
with "computable by a Turing machine in polynomial (as a function of
the length of the encoded input/output) time". Now the question of
what is and what is not efficiently computable becomes the precise (but
very difficult) question of what is and what is not computable in
polynomial time. We will focus on decision problems (as justified in class)
and consider the now famous P vs NP conjecture as introduced
by Stephen Cook in 1971 (and independently in the Soviet Union) by
Leonid Levin. To gain some appreciation of the importance of this
question for not only computing, but for all mathematics and science,
you can consider becoming a millionaire by solving the "P vs NP" question, one
of the seven
Clay Millennium Prizes . The prizes are listed alphabetically
and a description of the problem can be found by clicking on
"P vs NP".
Our next topic was a brief introduction to machine learning
and, in particular, classification. After a brief overview
of some common concepts and terminology, we briefly
considered
support vector machines as one of the generic
classification methods. Then we had
a special guest
lecture by Professor Geoffrey Hinton, celebrated for
his work on machine learning and especially for his seminal work relating
to neural networks. For those wishing to get a related more technical
version of his
presentation, I suggest watching the
YouTube video of Hinton's Google Tech Talk
"The Next Generation of
Neural Networks" .
Based on the von Neumann architecture, programming languages began to
emerge by the early 1950's. While there is an interesting (and perhaps somewhat
disputed) history of the early development of "high level languages", it seems
fair to say that the first commercially successful high level language (and
compiler) was FORTRAN, developed at IBM by John Backus and his group.
To get some
appreciation of some attitudes about the use of high level languages, consider
the following von Neumann anecdote taken from the University of Virginia
History of Computing : In the 1950's von Neumann was employed as a
consultant to IBM to review proposed and ongoing advanced technology projects.
One day a week, von Neumann "held court" at 590 Madison Avenue, New York.
On one
of these occasions in 1954 he was confronted with the FORTRAN concept; John
Backus remembered von Neumann being unimpressed and that he asked
"why would you
want more than machine language?" Frank Beckman, who was also present, recalled
that von Neumann dismissed the whole development as "but an application of the
idea of Turing's `short code'." Donald Gillies, one of von Neumann's
students at
Princeton, and later a faculty member at the University of Illinois,
recalled in
the mid-1970's that the graduate students were being "used" to hand assemble
programs into binary for their early machine (probably the IAS machine).
He took
time out to build an assembler, but when von Neumann found
out about it, he was very
angry, saying (paraphrased), "It is a waste of a valuable scientific computing
instrument to use it to do clerical work."
An excellent recounting of
the
development of Fortran is given by John Backus. Some
other historical notes on John Backus, Fortran and the history of programming
languages can be found at the following sites:
Nebel
article
biographical note
References
for history of programming languages .
One of the programming language features included in FORTRAN was the use
of subprograms. The implementation of FORTRAN did not permit recursion. We
briefly discuss the use of a STACK to implement subprograms (recursive or not).
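As a small illustration of the idea (our own sketch; a real compiler of course pushes activation records in machine code), here is a recursive computation carried out with an explicit stack instead of the language's call stack:

```python
def fib_with_stack(n):
    """Compute the n-th Fibonacci number using an explicit stack of
    small 'activation records' rather than recursive calls."""
    stack = [("call", n)]   # each entry plays the role of a stack frame
    results = []            # values "returned" by completed frames
    while stack:
        tag, val = stack.pop()
        if tag == "call":
            if val < 2:
                results.append(val)   # base case returns immediately
            else:
                # Push a frame that will combine the two results once
                # both "recursive calls" below it have returned.
                stack.append(("combine", None))
                stack.append(("call", val - 1))
                stack.append(("call", val - 2))
        else:  # combine: add the two most recently returned values
            results.append(results.pop() + results.pop())
    return results.pop()
```

The point is that recursion requires no special hardware: a stack of pending frames is all the bookkeeping a subprogram mechanism needs, which is exactly what FORTRAN's non-recursive implementation left out.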
For our next topic, we will discuss operating systems, the software that
controls the allocation of systems resources (computing cycles, memory, I/O).
The idea of having an operating system is another one of those great
ideas that seems so
obvious (now) that it is hard to imagine not having one.
We will look at some of
the great ideas utilized in operating systems, including interrupts,
file systems
and virtual memory/paging.
Allan Borodin, Phillipa Gill; 2009.