SCI199 Home Page (September 2009 - April 2010)
Announcements for week of March 15
The last assignment, A7,
has now been completed and posted.
I suggest looking at the very subjective topic
of human
attractiveness for some good hints for the first question
of assignment A7.
The 2-3 page individual written report for the service learning
project is due on the last class, March 31. The oral reports for
each group project will take place during the last week
(Monday, March 29 and Wednesday, March 31).
To see a nice report on the positive impact of service learning
see page 7 of the
Kensington Newsletter .
This week we will first discuss the introduction of information
theory by Claude Shannon in 1948. In particular, we will mention
the concepts of entropy, error correcting codes and compression. Compression
is directly related to prediction, so this is a nice segue from
our last topic. We will briefly discuss
Huffman coding and
Ziv-Lempel compression .
Then (if time permits) we will go on to
discuss source level languages and compilers (the topic I originally planned
for this week).
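For those who would like to see Huffman coding in action, here is a small illustrative sketch (our own example, not code from the course materials) that builds a code from character frequencies, so that frequent symbols get short bit strings:

```python
import heapq
from collections import Counter

def huffman_code(text):
    """Build a Huffman code for the characters of `text`.
    Returns a dict mapping each character to its bit string."""
    freq = Counter(text)
    # Heap entries are (frequency, tiebreaker, tree); a tree is either a
    # single character (leaf) or a (left, right) pair (internal node).
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        # Repeatedly merge the two least frequent subtrees.
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (t1, t2)))
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"  # edge case: one distinct symbol
    walk(heap[0][2], "")
    return codes

codes = huffman_code("abracadabra")
encoded = "".join(codes[c] for c in "abracadabra")
# The frequent 'a' receives a shorter code than the rare 'c' or 'd',
# which is exactly why the encoding beats a fixed-length code.
```

Note that the resulting code is prefix-free: no code word is a prefix of another, so the bit stream can be decoded unambiguously.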
The development of FORTRAN and the
FORTRAN compiler is retold in John Backus' excellent article
on the
History of FORTRAN .
General Announcements
Instead of a newsgroup, there will be a web forum for this class.
There is a link to it at the top of the page that you can
use to access the forum.
We will mainly post course announcements at this web site.
In case you don't know, the Faculty of Arts and Sciences
sponsors a "First-year Learning Community (FLC)"
to help first-year students get the most out of
their University life (for more details, see
FLC ).
You should also take note
of the following resources:
Academic Success Centre
Writing at
University
College Writing Centres
Learning
English as a Second Language
How not to plagiarize
Previous Offerings
This version of SCI199 will be similar to previous versions taught during the
1999-2000, 2001-2002, 2002-2003 and 2003-2004 academic years.
Course Overview
We will pursue the general (and very debatable) theme of GREAT
IDEAS in COMPUTING.
The ambitious goal
is to try to identify the great ideas that have significantly influenced the
field. We will concentrate on mathematical, algorithmic and software ideas with
the understanding that the importance and usefulness of these ideas
depends upon
(and often parallels) the remarkable ideas and progress in computing and
communications hardware. The list of topics we shall discuss will
depend to some
degree on the background and interests of the class. Here you
can find a list of some
possible
course topics . It is interesting (and relevant to this course) that the
Computing Community Consortium is asking the
computing research community to
help identify "game-changing advances from computing research conducted
in the past 20 years."
See the game-changing blog post .
This page will provide WWW access to various documents concerning SCI199.
Some of these documents are electronic versions of handouts given
out in class. Some announcements will also be made on this page. See also the
forum site given in the links above. Please send any comments or questions to
the instructor Allan Borodin (bor ..funnyatsign ..cs.toronto.edu).
Assignments
Assignment 0 .
This assignment will be a small part (1% of final grade) of the
participation component of the grading scheme.
Assignment 1 .
It is
meant to be a relatively easy assignment.
Assignment 2 .
This assignment requires some time so please start
as soon as possible.
Assignment 3 .
Assignment 3 requires some record keeping so again do not
wait too long to begin; assignment 3 is due November 9.
Assignment 4 .
Assignment 5 .
Assignment 6 .
Assignment 7 is now available.
Tutorial Notes and Slides
Slides for first tutorial, September 14
Slides for fourth tutorial, October 5
Slides for fifth tutorial, October 12 in pdf and
slides in powerpoint .
Slides for first tutorial in second term, Jan 4 in pdf and
slides in powerpoint .
Slides for Jan 11 tutorial in pdf .
Slides for March 1 tutorial in pdf .
Readings
I suggest looking at
Ed Lazowska's
September 22, 2009 lecture slides
for many "great ideas" and achievements in computing
and speculation about future great ideas.
Assignment 1 made some reference to hash tables which motivated a
discussion of the birthday paradox. For those who would like to estimate or calculate birthday
paradox probabilities, try
another birthday paradox explanation which
lets you perform random trials for different numbers of possible values
and numbers of samples.
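If you would like to check such probabilities yourself, here is a short sketch of the exact calculation (our own illustration, not the code behind the linked page):

```python
def birthday_collision_prob(n_values, n_samples):
    """Exact probability that among n_samples independent uniform draws
    from n_values possible values, at least two coincide."""
    p_distinct = 1.0
    for i in range(n_samples):
        # i values are already taken, so the next draw avoids them
        # with probability (n_values - i) / n_values.
        p_distinct *= (n_values - i) / n_values
    return 1.0 - p_distinct

# The classic case: 23 people, 365 equally likely birthdays (about 0.507).
p = birthday_collision_prob(365, 23)

# The same formula applies to hash tables: 40 keys hashed into a
# 1000-slot table already collide more often than not.
q = birthday_collision_prob(1000, 40)
```

The connection to hashing is exactly why the paradox came up in Assignment 1: collisions appear long before the table is anywhere near full.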
Assignment 2 concerns engines and I hope
everyone has
read
Vannevar Bush's 1945 prophetic article .
Readings for the 2008-2009 Course
For those interested in the birthday paradox, here is
the Wikipedia article
on birthday paradox and
another birthday paradox explanation which
also lets you calculate similar probabilities.
Here is the
Wikipedia discussion of floating point representation .
Here is the link for those interested in the
history of computing site that
I used in class. Many other sites are available. The history timeline used in
class does seem to emphasize U.S. based contributions to the field.
You might want to look at a more
European history of computing . It is also fair to say
that the University of Toronto was actively involved in the early history
of computing as can be learned from
William's article on U.Toronto Computation
Centre and
Hume's article on software for Ferut
computer . Toronto faculty members Kelly Gotlieb and J. Patterson
Hume were pioneers in
the early development of computing.
There are many sites that provide a discussion of Turing machines
and give examples and simulations for particular computational
problems. Here is one such
Turing machine site . And here is
Turing's famous paper .
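As a small illustration of what such a simulation involves, here is a toy simulator (our own sketch, not the code behind the linked site):

```python
def run_tm(rules, tape, state="start", blank="_", max_steps=1000):
    """Simulate a one-tape Turing machine.
    rules maps (state, symbol) to (new_state, write_symbol, move),
    where move is -1 (left), +1 (right) or 0 (stay).
    Returns the tape contents once the machine reaches state 'halt'."""
    tape = dict(enumerate(tape))  # sparse tape: position -> symbol
    pos = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        sym = tape.get(pos, blank)
        state, tape[pos], move = rules[(state, sym)]
        pos += move
    return "".join(tape[i] for i in sorted(tape)).strip(blank)

# A tiny machine that flips every bit and halts at the first blank.
flip = {
    ("start", "0"): ("start", "1", +1),
    ("start", "1"): ("start", "0", +1),
    ("start", "_"): ("halt", "_", 0),
}
result = run_tm(flip, "1011")  # "0100"
```

Despite its simplicity, this table-of-rules-plus-tape picture is, by the Church-Turing thesis, enough to express any computation at all.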
Perhaps no topic is more widely discussed on the web
than web search engines. We'll provide a very small selection of
some interesting readings. I strongly suggest that everyone read
an exceptionally prophetic
1945 article by Vannevar Bush . (In particular, see sections 6 and 7 for
the concept of hyperlinks.) We mentioned in class that Google doesn't
make it easy to find out specifics about their capabilities and methods.
This
2004 Farach-Colton
article is somewhat dated but still very interesting.
One of the reasons that Google is so secretive can be understood
from this
article on Google ranking and spamming . For those interested
in search anomalies, here is an article on
the OR anomaly . At the end of the October 27 class, I was
asked about
how search engines accommodate geographical location .
Here are a few
of the many histories of the internet:
It is interesting that the history of packet routing seems to be a little
controversial as seen in this who invented packet routing debate .
We have discussed the counter-intuitive idea that negative results (assumptions about
what cannot be computed efficiently) can lead to positive results (pseudo
random generators, public key cryptography). In particular, we have been discussing
results based on the assumed difficulty of factoring numbers. The
Blum Blum Shub pseudo random
generator is one example of a pseudo random generator. Perhaps the best known
public key system is RSA . The original
$100 RSA challenge
and subsequent larger RSA number factoring challenges have
been broken with improved factoring and computational resources but
still most people continue to believe it is difficult to factor large numbers.
For the fifth assignment you need to execute the
extended Euclidean algorithm .
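Here is a sketch of the computation (the numbers e = 17 and phi = 3120 are the standard textbook RSA example, not the assignment's values):

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y == g."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    # We have b*x + (a % b)*y == g and a % b == a - (a // b) * b,
    # so rearranging gives the coefficients for (a, b).
    return g, y, x - (a // b) * y

# RSA-style use: the decryption exponent d is the inverse of e mod phi.
g, d, _ = extended_gcd(17, 3120)
d %= 3120   # normalize to a positive representative
```

Since gcd(17, 3120) = 1, the coefficient of 17 in the identity is exactly the modular inverse, which is why the algorithm is the standard way to compute RSA decryption exponents.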
Social networks and their study are being impacted by online networks
and an emerging emphasis on more mathematical and algorithmic understanding of such networks.
We start our discussion of social networks by asking (as suggested by Milgram's
seminal studies), if
it is a small world ,
and if so, is this surprising, and is the world made smaller or bigger by online
social networks. Any study of the nature of social ties (e.g. friendship)
must take into account the strength of such ties and in this context we consider
Granovetter's
strength of weak ties . An important aspect of social network
study is the nature of communities in a network (i.e. how they
can be defined and structural information about such communities)
as well as how communities grow (and break apart). Two very interesting
papers are
the Leskovec et al study of community structure in
large scale networks and
the Backstrom et al study of the evolution of groups .
In our discussion of cryptography, we introduced the idea
that certain problems (like factoring) cannot be "efficiently" computed.
And in our discussion of social networks, we came across many large scale graph
theoretic problems where it is not always clear how to solve such problems.
We recall Turing's work and the Church-Turing thesis which equates
the intuitive concept of "computable" with the mathematically defined
"computable by a Turing machine". In order to formalize the concept
of "efficiently computable", we will assume the (debatable) extended
Church-Turing thesis, namely we equate "efficiently computable"
with "computable by a Turing machine in polynomial (as a function of
the length of the encoded input/output) time". Now the question of
what is and what is not efficiently computable becomes the precise (but
very difficult) question of what is and what is not computable in
polynomial time. We will focus on decision problems (as justified in class)
and consider the now famous P vs NP conjecture as introduced
by Stephen Cook in 1971 (and independently in the Soviet Union) by
Leonid Levin. To gain some appreciation of the importance of this
question for not only computing, but for all mathematics and science,
you can consider becoming a millionaire by solving the "P vs NP" question, one
of the seven
Clay Millennium Prizes . The prizes are listed alphabetically
and a description of the problem can be found by clicking on
"P vs NP".
Our next topic was a brief introduction to machine learning
and, in particular, classification. After a brief overview
of some common concepts and terminology, we briefly
considered
support vector machines as one of the generic
classification methods. Then we had
a special guest
lecture by Professor Geoffrey Hinton, celebrated for
his work on machine learning and especially for his seminal work relating
to neural networks. For those wishing to get a related more technical
version of his
presentation, I suggest watching the
YouTube video of Hinton's Google Tech Talk
"The Next Generation of
Neural Networks" .
Based on the von Neumann architecture, programming languages began to
emerge by the early 1950's. While there is an interesting (and perhaps somewhat
disputed) history of the early development of "high level languages", it seems
fair to say that the first commercially successful high level language (and
compiler) was FORTRAN, developed at IBM by John Backus and his group.
To get some
appreciation of some attitudes about the use of high level languages, consider
the following von Neumann anecdote taken from the University of Virginia
History of Computing : In the 1950's von Neumann was employed as a
consultant to IBM to review proposed and ongoing advanced technology projects.
One day a week, von Neumann "held court" at 590 Madison Avenue, New York.
On one
of these occasions in 1954 he was confronted with the FORTRAN concept; John
Backus remembered von Neumann being unimpressed and that he asked
"why would you
want more than machine language?" Frank Beckman, who was also present, recalled
that von Neumann dismissed the whole development as "but an application of the
idea of Turing's `short code'." Donald Gillies, one of von Neumann's
students at
Princeton, and later a faculty member at the University of Illinois,
recalled in
the mid-1970's that the graduate students were being "used" to hand assemble
programs into binary for their early machine (probably the IAS machine).
He took
time out to build an assembler, but when von Neumann found
out about it, he was very
angry, saying (paraphrased), "It is a waste of a valuable scientific computing
instrument to use it to do clerical work."
An excellent recounting of
the
development of Fortran is given by John Backus. Some
other historical notes on John Backus, Fortran and the history of programming
languages can be found at the following sites:
Nebel
article
biographical note
References
for history of programming languages .
One of the programming language features included in FORTRAN was the use
of subprograms. The implementation of FORTRAN did not permit recursion. We
briefly discuss the use of a STACK to implement subprograms (recursive or not).
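As a small illustration of the idea (our own sketch; a real compiler of course pushes activation records in machine code), here is a recursive computation carried out with an explicit stack instead of the language's call stack:

```python
def fib_with_stack(n):
    """Compute the n-th Fibonacci number using an explicit stack of
    small 'activation records' rather than recursive calls."""
    stack = [("call", n)]   # each entry plays the role of a stack frame
    results = []            # values "returned" by completed frames
    while stack:
        tag, val = stack.pop()
        if tag == "call":
            if val < 2:
                results.append(val)   # base case returns immediately
            else:
                # Push a frame that will combine the two results once
                # both "recursive calls" below it have returned.
                stack.append(("combine", None))
                stack.append(("call", val - 1))
                stack.append(("call", val - 2))
        else:  # combine: add the two most recently returned values
            results.append(results.pop() + results.pop())
    return results.pop()
```

The point is that recursion requires no special hardware: a stack of pending frames is all the bookkeeping a subprogram mechanism needs, which is exactly what FORTRAN's non-recursive implementation left out.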
For our next topic, we will discuss operating systems, the software that
controls the allocation of systems resources (computing cycles, memory, I/O).
The idea of having an operating system is another one of those great
ideas that seems so
obvious (now) that it is hard to imagine not having one.
We will look at some of
the great ideas utilized in operating systems, including interrupts,
file systems
and virtual memory/paging.
Allan Borodin, Phillipa Gill; 2009.