(* List is not a type. It's a *parametric type*. A parametric type maps one or more types to types. Parametric types are not functions, in CaML-speak, however. Functions map objects to objects. *) [1;2;3];; [1.0;2.0];; [[1;2];[3;4]];; tl [[[[[[[[[1]]]]]]]]];; (* What type would we assign to: *) [1;2;"ab"];; (1,2,"ab");; (* --- lists are *homogeneous*, but tuples are not *) (1,2.0,"ab");; hd;; let sq(x) = x * x;; sq;; (* 'a is a *type variable*. CaML knows the type of hd, but notice that is has also *inferred* the type of sq - we didn't tell CaML this. This is very handy. *) (* Sound type system: a type system in which all types can always be inferred in any valid program. CaML has a sound type system *) (* list comes with an append operation: @ *) [1;2;3]@[4;5];; (* similar but not identical (types!) to concatenation: ^ *) "123"^"45";; prefix @;; prefix ^;; (* *predicate*: a function that returns true or false here is a good one for lists: *) let null list = if (list=[]) then true else false;; null [];; null [2;3];; (* This is an important predicate because it determines whether we can apply hd or tl *) (* Now, what's the sense in having a predicate that's true for exactly one object? Can't we just look at the object? Yes and no... VALUES AND EVALUATION All objects have a *value* number: its own value symbol: possibly undefined *) x=3;; let x=3;; x=3;; (* let is a reserved word that gives x the value 3: *) x;; (* "x" refers (anonymously) to the value "x", not the variable x. "x" is a string. x is (now) an integer. *) "x";; (* For lists, [] has itself as its value. Other lists have a value of the same length, where each object is replaced by its corresponding value: *) [2+3;4];; (* The value of a function application is the result of applying the function to the *values* of the arguments. That is, first we evaluate the arguments, then we evaluate the function on those arguments' values. This is called *call-by-value* function evaluation. An alternative is to substitute the argument expressions themselves, and then evaluate the resulting function application expression: given fun f(x) = x+5; f(3+4) -> 3+4+5 -> 3+9 -> 12 (built-ins like + would evaluate by call-by-value still). This is called "call-by-name." CaML is call-by-value, although it can be difficult to see the difference in a typed functional language. Algol used call-by-name. Some PLs like Simula have both. Call-by-name is pretty rare now, except for *macros*, which always use call-by-name. Example: inline functions in C(++). Other PLs are lazy, and use *call-by-need*, evaluating expressions only when absolutely forced to do so. Lazy PLs also generally tabulate the results of computations they have already performed so that they don't have to do the work again if asked for the same value. CaML implements call-by-need only in the sense that the programmer can define any object as a function of the unit element (). We'll look at an example of this later. Sometimes people call call-by-value *applicative-order execution*, and call-by-name *normal-order execution*. But we still need an *order of evaluation*. CaML evaluates arguments from left to right, bottom-up. First, it evaluates the arugment expressions from left-to-right: ef(e1 ... en) -- -- || || \ / \ / \/ \/ ef(v1 ... vn) then the expression that denotes the function: ef(v1 ... vn) -- || \ / \/ f(v1 ... vn) then it applies the function to the arguments. Some applicative-order languages (e.g. Scheme) evaluate the function first and then the arguments. Functions can indeed be expressions worthy of evaluation: *) (hd [sq])(3);; (* "PURE" FUNCTIONAL PROGRAMMING Programs are viewed as collections of *functions*. Execution of programs is viewed as *evaluation*. This means: - *Referential transparency*: an antiquated and somewhat inaccurate view of the difference between functional and imperative programming. Essentially, a functional program behaves the same if we replace one expression with another expression, provided that they both evaluate to the same thing. This is true in pure FP, but it arguably holds in well-designed imperative languages too. - *Manifest Interface Principle*: all interfaces are apparent in syntax. Code is then more modular: the arguments are the interface(s). - no assignment statement: a variable may not have a value, or its value might be a function that has not been applied to its arguments yet (see below), but variables in pure FP are said to be *logical* in that, having acquired a value in the course of an evaluation, they retain that value until the end of evaluation. This is similar to the manner in which variables are used in mathematics. Remember when you had to unlearn your mathematics training to make sense of "x := x+1"? In FP, this is anathema! - more generally, FP can be said to be a *higher-level* language, in that we don't have to think about how program state maps to a computer's memory (or at least, not unless we want to write super-efficient code). That means no assignment, no new/free calls to manage our own memory, and no pointers. - recursion vs. iteration: in pure FP, recursion is generally preferred as a matter of style to iterative constructs like while-loops or for-loops. This makes programs (1) easier to prove correct, and (2) easier to conceptualize as functions, because recursions, like functions, identify the structure of one or more *arguments* as the only values upon which the computation (and termination) depends. - more generally, the *state* of a computation is much easier to think about in FP than in imperative programming because it has been *reified* within the language. We'll see some good examples of this later. Historically at least, imperative programming had to refer to Turing machines to talk about state. - Backus's revolutionary idea: program correctness, readability and reliability are more important than efficiency! *) (* PREDICATES A predicate is a function that returns true or false. It can take any kind of argument. List predicates: *) null;; let isList(x:'a list) = true;; (* don't do this - this can never yield false *) (* type inferencing should determine off-line whether a given object is a list *) let isCons(x) = if not(null(x)) then true else false;; (* better way: *) let isCons(x) = not(null(x));; (* String predicates: *) eq_string("ab")("a"^"b");; eq_string("ab","a"^"b");; (* doesn't work - why? *) eq_string;; let eq_ab = eq_string("ab");; eq_ab(char_for_read(`a`)^"b");; char_for_read;; (* built-in coercion *) `a`;; (* characters are given in backquotes *) (* Arithmetic predicates: *) 3=4;; 3<4;; let zero(x) = (x=0);; zero(0);; (* Boolean predicates: *) not true;; true & true;; true or false;; false or (y=3);; (* but CaML only checks the type of the second argument - no evaluation unless necessary *) (* CONDITIONAL EXPRESSIONS if _ then _ else _ : (bool * 'a * 'a) -> 'a let mark(g) = match (g>=50) with true -> "pass" | false -> "fail";; let mark(g) = match g with 85 -> "a" | 75 -> "b" | 65 -> "c" | 55 -> "d" | 45 -> "f";; mark(90);; (* There is no "else" statement as such in a CaML match statement, although there is a pattern that matches anything (of the inferred type): *) let mark(g) = match g with 85 -> "a" | 75 -> "b" | 65 -> "c" | 55 -> "d" | _ -> "f";; (* If more than one pattern matches, the textually first one is used *) let mark(g) = match g with _ -> "f" | 85 -> "a" | 75 -> "b" | 65 -> "c" | 55 -> "d";; mark(85);; (* But matches can be used on their own... LAMBDA: FUNCTIONS AS FIRST-CLASS CITIZENS In functional programming, we can: - return a function as the value of an expression - pass a function as an argument to another function - incorporate a function as an object in a more complex data structure (this eliminates the need to distinguish methods from other objects) - use functions anonymously Lambda Terms include: - variables - constants - lambda x.t, where x is a variable, t is a lambda term - (t1 t2), where t1, t2 are lambda terms "lambda" in CaML looks like this: function Examples: *) (function x -> x+1)(3);; let f = function x -> x+1;; (* "let f(x) = x+1" is actually a shorthand for this *) let x = 3;; f(x);; let lessthan0 = function x -> x<0;; lessthan0(-1);; lessthan0(0);; let lessthan = function (x,y) -> x function y -> x body;; is shorthand for: let f = (function x1 -> (function x2 -> ... (function xn -> body)));; and: (f e1 e2 ... en);; is shorthand for: (((f e1) e2) ... en);; Sometimes fun's pattern matcher gets confused, though, and can't parse your definition. Then you have to use function. Notice that if we use a tuple: *) let linear (a,b,c) = a+b*c;; (* must receive all of its arguments: *) linear(0);; (* Higher-order functions: this is really why functions are first-class citizens... *) let applyfun (f,x) = f(x);; applyfun(lessthan0,-1);; applyfun(lessthan0,1);; (* These permit better abstraction and reuse of code, e.g.: *) let even(x) = ((x mod 2)=0);; let odd(x) = ((x mod 2)=1);; let rec remove_even(l) = if null(l) then [] else if even(hd l) then remove_even(tl l) else (hd l)::remove_even(tl l);; (* can you guess why we need "rec"? *) let rec remove_odd(l) = if null(l) then [] else if odd(hd l) then remove_odd(tl l) else (hd l)::remove_odd(tl l);; remove_even [1;2;3;4];; remove_odd [1;2;3;4];; (* OR *) let rec remove(f)(l) = if null(l) then [] else if f(hd l) then remove(f)(tl l) else (hd l)::remove(f)(tl l);; remove even [1;2;3;4];; remove odd [1;2;3;4];; (* "Anonymous" functions: *) map(lessthan0)([-1;0;1]);; map;; map(function x -> x+1)([-1;0;1]);; (* SIDE EFFECTS *) let f(x)=x+y;; (* y is a *free* variable *) let y = 3;; let f(x)=x+y;; f(2);; let y = 4;; f(2);; (* CaML is (mostly) immune from side effects, through a combination of strict type-checking and strict name-space separation. CaML does have REFERENCES - but they're a parametric datatype: *) let r = ref 3;; let f(x) = x + !r;; f(2);; r := 4;; (* Note: this returns (). That's often the result returned by a function whose only effect is a side effect. *) f(2);; (* Side effects are bad. Some say that means references are bad. But at least in CaML they're labelled as bad (!). Some languages implicitly pass complex arguments like arrays or structured data types to functions/procedures with references. This is called *call-by-reference*. With the demand for explicit pointer datatypes, however, imperative languages have evolved towards a uniform call-by-value mechanism that can be invoked on pointers or data values alike. Call-by-reference, or passing a pointer with call-by-value, is a very good idea if the value is a large data structure. UNIT TYPE We have already seen tuples: *) (1,2,"abc");; (* These are useful because they are flat and heterogeneous. *) let message () = "hello world";; (* The *unit* type is the type of the zero-dimensional tuple, (). () is the only object in the world of type unit. As an argument, the unit delays evaluation of another function application: *) let square (x) = x *. x;; let delay () = square(3.0);; (* As a result, the unit indicates no result, for example, with side effects: *) load;; (* Note: non-zero-dim. tuples and records can also be function results: *) let quotrem(x,y) = ((x / y),(x mod y));;