csc324, Lecture 1, September 10, 1996

Instructor: Marsha Chechik

Office hours: Tuesday and Thursday 4-5p.m. and by appointment

Office: D.L. Pratt 384, X3820

Class homepage: http://www.cs.toronto.edu/~chechik/courses/csc324

What is this course? A study of programming languages.

Why study programming languages

Increased capacity to express ideas - the languages in which programmers develop software place limits on the kinds of control structures, data structures, and abstractions they can use; thus the forms of algorithms they can construct are also limited. Awareness of a wider variety of programming language features can reduce such limitations in software development. If the language a programmer is forced to write in lacks some desired capabilities, he/she can simulate those features, provided he/she is aware of them.
Improved background for choosing appropriate languages.
Increased ability to learn new languages. Computing is an evolving field, and new languages are constantly being introduced. Practitioners will have to learn these languages to stay competitive. Once a thorough understanding of the fundamental concepts of languages is acquired, it becomes far easier to see how these concepts are incorporated into the design of the language being learned. Also, programmers need to know the vocabulary and fundamental concepts of programming languages so they can read and understand programming language manuals and sales literature for languages and compilers.
Better understanding of the significance of implementation. We can become better programmers by understanding the choices among programming language constructs and the consequences of those choices. Certain kinds of program bugs can only be found and fixed by a programmer who knows some related implementation details. Also, once we understand how a language implementation handles particular constructs, we can judge the relative efficiency of alternative constructs that may be chosen for a program.
Increased ability to design new languages.
Overall advancement of computing. In many cases a particular language becomes widely used not because it is the best, but because those in positions to choose languages were not sufficiently familiar with programming language concepts. (For example, FORTRAN stayed in wide use even though ALGOL 60 was arguably the better-designed language.) In general, if those who choose languages are better informed, better languages will more quickly squeeze out poorer ones.

Introduction

In this course we will go over concepts underlying programming languages and learn a few languages from different programming paradigms. By no means will we be able to learn these languages in their entirety. Projects given in this class are used to illustrate various paradigms and introduce students to various ways of thinking about programming. This course, however, assumes that you have solid knowledge of at least one imperative language, e.g. Turing.

Most of the information regarding this course will be available on-line. I will be distributing some information in class. However, I strongly urge you to read the course Web site for on-line help, assignments, etc. Please check the site on a regular basis. I will announce new information in the news group for this course, ut.ecf.csc324h. This news group is to serve as the course forum. You are encouraged to ask questions there; my markers and I will read it regularly and try to answer them. Do not hesitate to answer questions yourself. And if you have an answer, post it to the news group rather than mailing it directly. Other people might have the same problems, too.

The information sheet for this class is available on-line, although I will bring copies of it next time. Here are some main points. The textbook for the course is R. Sethi, Programming Languages: Concepts and Constructs, 2nd edition. It should be available in the bookstore. The book has a teddy bear on its cover, so we can refer to it as "The Teddy Bear" book. The other books, in the recommended list and in the "Other Good References" list, are for reference purposes only. You do not have to buy these books. If homework or reading comes from these books, it will be distributed to you.

Coursework for this class consists of 3 projects, a midterm, quizzes and a final. Projects are worth 30% total, and are in Scheme, Prolog and OOT or C++ (to be decided later in the course). Projects are to be submitted electronically, and are due at 11:59 p.m. on the due date. There is an automatic 30-minute extension period given for every project. No further extensions are granted, i.e., a project submitted at 12:31 a.m. the next day is considered late. Late policy is 10% off per late day (weekends, holidays, etc. still count). A project late more than 7 days receives a grade of 0. You are supposed to work on the projects on your own. All projects must run on CDF in order to receive credit. You may work on your home PCs and Macs, but then you have to upload your projects to CDF and make sure they run there. All projects have to be turned in WORKING CORRECTLY in order to pass this course, even if you get 0 for the project.

In addition to projects, there will be homework. You are encouraged to collaborate on the homework and to ask me and each other questions about it, but please do not post answers to the news group. Homework will NOT be collected. However, there will be weekly quizzes, given in the tutorial, 5-10 minutes each, that will ask you to answer a homework (or homework-like) question. If you do your homework, you should have no problems with the quizzes. There will be a quiz in each tutorial session except the first one. The 10 best quizzes will count. Grading policy for quizzes is 1 (correct), 0 (wrong) and 1/2 (everything else).

The midterm for this course is worth 25% and will be at the end of October. The final exam is worth 35% and will be held during finals week.

Tutorials: Your tutors are Nick Zahariadis, Irene Fung and David Neto. Tutorial assignment is as follows:

Section 0101 - MP102
Section 0102 - SS1073
Section 0103 - SS1088

Tutorials are on Mondays, 12-1 p.m. You should indicate your tutorial choice on the signup sheet.

Labs are in the Engineering Annex. The access code for this year is 3, 1-2, 4. Your CDF accounts have been created. The user name is a324 followed by 4 letters of your last name. The password is your student ID number. Please change your password as soon as you log on.

Introduction to Programming Languages

What is a programming language? It is a means for communicating with the computer to tell it what to do. Why are there so many languages? They reflect different views on how to use a computer to solve a problem.

There are many criteria by which one could classify or compare programming languages:

Intended application area, e.g., business, numerical computing, simulation
Data structures built into the language, e.g., lists, arrays, records, graphs, strings
Operations built into the language, e.g., pattern matching, list manipulation, unification
Programming paradigm implied by the language, e.g., procedural, functional, declarative, object-oriented
Style of syntax of the language, e.g., ALGOL-like, APL-like
Deterministic or not
Sequential or parallel
Compiled or interpreted

Levels of details (generations of languages)

Machine language (1st generation) - every operation is specified in terms of the actual numeric code (binary), and each operand in terms of its absolute address in machine memory or in terms of a register number (or some combination of these). The basic architecture of most machines includes:

memory locations - contain data or instructions, organized into bytes or words (4K to 2G)
registers - large enough to hold one word of information (1-128 bits; e.g., 16 bits on the 8086 and 286, 32 bits on 386, 486 and Pentium machines)
Instruction set - numerically encoded commands

Programming in machine language requires deep knowledge of underlying machine architecture.

Assembly languages (2nd generation) - specify each operation and operand with a symbolic name. The assembler turns assembly code into machine code. Macro assemblers allow macros as a convenient shorthand for frequently used sequences of code. Assembly languages are machine-dependent, i.e., their features depend on the hardware (or a particular family of CPUs) for which they are intended.

Programming in assembly is similar to (i.e., just as hard as) programming in machine language.

High-level languages (HLLs) (3rd generation) - conventional programming languages (C, C++, Lisp, Basic, COBOL, etc.). Usually machine-independent, but at times implementation-dependent. For example, as some of you might know, Turbo C initializes all variables to 0, whereas the Unix C compilers cc and gcc don't, so programs which run fine on PCs might not run on Unix, and vice versa. A compiler turns an HLL program into assembly, and then an assembler turns it into machine code.

Benefits of high-level language:

Readable, familiar notations (If, While, Procedure)
Machine-independent
Availability of program libraries (e.g., math library for C)
Possibility of syntax checking (semicolons and such) and more thorough type checking (we will talk about this later on in the course), helping to detect errors.

The programmer specifies instructions that a language processor understands and executes.

Very high-level languages (4th generation) (4GL) - declarative languages in which one states what is to be done rather than how to do it. Examples include report-generating languages, database processing languages and Prolog.

This is the future of programming.

History of Programming Languages

Early computing (1940s-50s) was done in machine language and later in assembly language. The first "real" high-level language (mid-1950s) was FORTRAN (it was seen as "automatic programming" when it came out). Around 1960 we got the first versions of COBOL, Lisp and Algol. Algol was the first block-structured language, with begin/end blocks. Algol-like languages include Pascal, Simula, C, Turing, etc. An important concept developed in the late 60s and early 70s was that of structured programming. Structured programming is a technique for writing programs so that the program text helps the reader understand what the program does, i.e., there are no GOTOs. Instead, block structure and loop exits are to be used. (Note: your projects CANNOT contain GOTOs. This is an automatic 0 for this project!)

Different languages were developed for different needs: COBOL for business applications, FORTRAN for scientific computing. A combination, e.g. PL/1, which attempted to create a single "good language", is generally considered a failure. Still, most languages are "general purpose", i.e., a variety of problems can be solved using them. For example, Pascal was designed for teaching purposes, yet a number of companies had industrial-size projects using Pascal.

Imperative Style of Programming

Examples: Cobol, C, Basic, Pascal, Algol, Turing...

The program is intended to tell the computer to execute a set of instructions which solve a particular (usually computational) problem. These languages are based on John von Neumann's view of computers. von Neumann's computer consists of a CPU (divided into Control Unit, Arithmetic Unit and Input/Output Unit), Memory (holding instructions and data) and a communication tube between them. A program's job is to change the contents of memory in some way, doing so by pumping the contents back and forth through the tube. Thus, the basic operation of these languages is iteration: loop... end.

Other statements:

declaration of variables
assignments
procedure call (with or without parameters)
return from procedure
GOTO label or location
conditional statements (If... Then... Else...)
sequence (begin ... end)
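
The statement kinds listed above can be seen together in one short routine. This is only an illustrative sketch written in Python (not one of the course languages); the function name sum_of_evens is made up for the example. Note how the loop repeatedly overwrites the memory cells named total and i.

```python
def sum_of_evens(limit: int) -> int:
    """Add up the even numbers from 1 to limit, imperative style."""
    total = 0                  # variable introduced and given a first value
    i = 1
    while i <= limit:          # iteration (loop ... end)
        if i % 2 == 0:         # conditional statement
            total = total + i  # assignment overwrites the old contents of total
        i = i + 1
    return total               # return from procedure


print(sum_of_evens(10))  # procedure call with one parameter; prints 30
```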

Functional Programming

Started with Lisp (LISt Processor), developed in 1958 for applications in artificial intelligence. Main question: given the program, what does it evaluate to? All programs are sequences of function applications. A function is a mapping from a domain to a range.

Examples:

sin(90 degrees) = 1. Here, sin is a function applied to an angle in degrees.

Signature of this function is Degrees -> Real between -1 and 1

age(Mary) = 15

This signature is Name -> Int

2 + 3 = 5. In this case, the function, "+", has two arguments.

Alternative representation is add(2, 3) = 5. Signature is Int X Int -> Int

Father(Mary) = Fred. Signature is Name -> Name
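
Signatures like these can be written down directly as type annotations. The following is a rough sketch in Python, not course material: the lookup tables AGES and FATHERS and the name sin_degrees are invented for the illustration, and Python's float and str stand in for Real and Name.

```python
import math

# Hypothetical data backing the age and father functions.
AGES = {"Mary": 15}
FATHERS = {"Mary": "Fred"}


def sin_degrees(angle: float) -> float:
    """Degrees -> Real: sine of an angle given in degrees."""
    return math.sin(math.radians(angle))


def age(name: str) -> int:
    """Name -> Int"""
    return AGES[name]


def add(x: int, y: int) -> int:
    """Int X Int -> Int"""
    return x + y


def father(name: str) -> str:
    """Name -> Name"""
    return FATHERS[name]


# A program is built by applying functions to the results of other functions.
print(add(age("Mary"), 3))  # prints 18
```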

Logic Programming

Developed in 1972 with Prolog for natural language processing. Prolog stands for PROgramming in LOGic. The question answered by the program is "what can be inferred from the facts constituting the program?" Programs consist of facts, rules and queries.

Example:

Facts:

Toronto is exciting

It is hot here

Clyde is an elephant

Ann has big ears

Ann has a long trunk

Ann is an animal

Rules:

If X is an animal and X has big ears and X has a long trunk Then X is an elephant

Queries:

Is Toronto exciting? Answer is YES

Is it cold here? Answer is NO

Who is an elephant? Answer is Clyde, Ann (it was able to apply the rule to determine that Ann is an elephant, too)

Is Marc an elephant? Answer is NO
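
To make the facts/rules/queries split concrete, here is a small sketch of the same example in Python. It is only a hand-written imitation: real Prolog finds answers by unification and backtracking, whereas this code hard-wires the single elephant rule. The names facts, holds and who are invented for the illustration.

```python
# Facts: (property, subject) pairs taken from the example above.
facts = {
    ("exciting", "toronto"),
    ("hot", "here"),
    ("elephant", "clyde"),
    ("big_ears", "ann"),
    ("long_trunk", "ann"),
    ("animal", "ann"),
}


def holds(prop: str, subject: str) -> bool:
    """Query: can prop(subject) be inferred from the facts and the rule?"""
    if (prop, subject) in facts:
        return True
    if prop == "elephant":
        # Rule: an animal with big ears and a long trunk is an elephant.
        needed = ("animal", "big_ears", "long_trunk")
        return all((p, subject) in facts for p in needed)
    return False  # anything that cannot be inferred is answered NO


def who(prop: str) -> list:
    """Query with a variable: every known subject for which prop holds."""
    subjects = {subject for (_, subject) in facts}
    return sorted(s for s in subjects if holds(prop, s))


print(holds("exciting", "toronto"))  # True
print(holds("cold", "here"))         # False
print(who("elephant"))               # ['ann', 'clyde']
print(holds("elephant", "marc"))     # False
```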

Object-Oriented Programming

Started with Simula (a language for simulation) between 1961 and 1967. Key concept: classes and subclasses of objects. Another key concept is inheritance, i.e., getting some properties from the base class. For example, both cars and trucks are vehicles, so all properties of vehicles apply to them, but in addition they can have other properties, like the number of passengers for cars and the height for trucks.

Question answered by the program is "what behaviours are supported by (active) objects constituting the program?" Popular OO languages are Smalltalk and C++.
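
The vehicle example can be sketched as a small class hierarchy. This is an illustration in Python rather than Smalltalk or C++; the attribute wheels and the method describe are assumptions added to give the base class something to pass on.

```python
class Vehicle:
    """Base class: properties shared by every vehicle."""

    def __init__(self, wheels: int):
        self.wheels = wheels

    def describe(self) -> str:
        return f"{type(self).__name__} with {self.wheels} wheels"


class Car(Vehicle):
    """A car is a vehicle that also has a number of passengers."""

    def __init__(self, wheels: int, passengers: int):
        super().__init__(wheels)      # reuse the base class set-up
        self.passengers = passengers  # property specific to cars


class Truck(Vehicle):
    """A truck is a vehicle that also has a height."""

    def __init__(self, wheels: int, height_m: float):
        super().__init__(wheels)
        self.height_m = height_m      # property specific to trucks


car = Car(wheels=4, passengers=5)
truck = Truck(wheels=6, height_m=3.5)
print(car.describe())    # inherited behaviour; prints "Car with 4 wheels"
print(truck.describe())  # prints "Truck with 6 wheels"
```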

Programming Language Design

Language design has a number of goals:

Syntax should be unambiguous and easy to specify
The language must be able to express (almost) everything the computer can do
Language processors should emit efficient code (space and time-wise)

At this point, however, programmer time is much more expensive than computer time, in most cases. Thus,

The language should be easy to learn, read and understand
It should be clear in intended meaning; orthogonal with respect to features
Easy to express things that are frequently done
Easy to write correct programs - helpful in protecting against silly but human mistakes. Still, other means for checking that the program is correct are necessary. Typical means are:

Testing (running program with input and checking output.) Problem - don't know where to stop!
Code inspection (independent reader reads code line by line). Problem - can miss "hard to find" errors

Compilation and Interpretation

A compiler is a program translating a source program into machine code (target program), which is then run. Stages:

compile time - translation, analysis of static properties (properties evident from program text)
link time - collecting all files and libraries together to form an executable
run time - execution. Checking of dynamic properties (properties evident only upon running the program).
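
The difference between static and dynamic properties can be observed even in a small experiment. The sketch below uses Python's built-in compile and exec as stand-ins for the compile-time and run-time stages; the helper names are made up, and Python checks far less at compile time than a typical compiled language does, so treat it as an analogy only.

```python
def passes_static_check(source: str) -> bool:
    """Translate the text without running it; only syntax is examined."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


def runs_cleanly(source: str) -> bool:
    """Translate, then execute; this error only shows up while running."""
    code = compile(source, "<example>", "exec")
    try:
        exec(code, {})
        return True
    except ZeroDivisionError:
        return False


bad_syntax = "x = (1 + "            # evident from the program text alone
bad_at_run_time = "y = 0\nx = 1 / y"  # well-formed text, fails when executed

print(passes_static_check(bad_syntax))       # False
print(passes_static_check(bad_at_run_time))  # True
print(runs_cleanly(bad_at_run_time))         # False
```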

An interpreter is a program that behaves like a machine whose native language is the programming language itself (rather than machine language). It takes a source program and its input at the same time, and then scans the source program, executing its operations one by one, producing output.
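
As a rough sketch of that loop, here is an interpreter for a tiny invented stack language (the operations push, read, add, mul and print are assumptions made up for this example). It receives the source program and its input together and executes one operation per line.

```python
def interpret(source: str, program_input: list) -> list:
    """Scan the source line by line, executing each operation as it is read."""
    stack = []
    inputs = iter(program_input)
    output = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue                      # skip blank lines
        op = parts[0]
        if op == "push":
            stack.append(int(parts[1]))   # push a constant
        elif op == "read":
            stack.append(next(inputs))    # take the next input value
        elif op == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "mul":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == "print":
            output.append(stack.pop())    # produce output
        else:
            raise ValueError(f"unknown operation: {op}")
    return output


program = """
read
read
add
push 2
mul
print
"""

print(interpret(program, [3, 4]))  # (3 + 4) * 2, prints [14]
```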

Comparison: Compiled code runs 10-100 times faster than interpreted code. However, compilers take time to compile the code. So, if a program is frequently changed, the entire effort of compilation has to be redone over and over again. An interpreter, on the other hand, examines the program text every time it runs, so the program can be changed on the fly. Thus, if a program is to be changed frequently but not run very often, then interpreting is the right way to go.
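
The trade-off can be put into numbers with a back-of-the-envelope model. Everything here is an assumption chosen for illustration: a 60-second compile, a 1-second compiled run, and a slowdown factor of 10 (the low end of the 10-100 range above).

```python
def compiled_cost(edits: int, runs: int, compile_time: float, run_time: float) -> float:
    """Total seconds: recompile after every edit, then run at full speed."""
    return edits * compile_time + runs * run_time


def interpreted_cost(runs: int, run_time: float, slowdown: float) -> float:
    """Total seconds: no compile step, but every run is slower."""
    return runs * run_time * slowdown


# Program under development: 100 edits, run once after each edit.
print(compiled_cost(100, 100, 60, 1))   # 6100
print(interpreted_cost(100, 1, 10))     # 1000

# Finished program: compiled once, run 10000 times.
print(compiled_cost(1, 10000, 60, 1))   # 10060
print(interpreted_cost(10000, 1, 10))   # 100000
```

Under these made-up figures, interpreting is cheaper while the program is still changing, and compiling wins once the program is stable and run many times.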

Homework:

Read Chapter 1 of the Sethi book. Look at the exercises at the end of the chapter.