{29.Aug'03} Summary This note proposes some changes to the Eiffel programming language, which fall in three categories: - A simplification of syntax, which doesn't affect semantics. - A simplification of the semantic description of the language, which neither affects syntax nor semantics. - As a consequence of the former, a generalisation of the language which doesn't affect existing programs, but allows a much simpler new programming style. Drawback of the proposed simplifications is that some (yet undocumented) uses of introspection might become impossible. Drawback of any language extension is added complexity. As we'll see, however, the proposed generalisation does add none. Advantage of the proposal is a tremendous overall simplification of the language, especially but not restricted to the methodological development of programs. Introduction Consider the following Eiffel expressions and their counterparts using the newly proposed syntax and semantics for tuples. IF predicate.item([x]) THEN .. IF predicate.item(x) THEN .. action.call([x]) action.call(x) PROCEDURE[ANY, TUPLE[A, B, ..]] PROCEDURE[A, B, ..] FUNCTION[ANY, TUPLE[A, B, ..], FUNCTION[A, B, ..][X, Y, ..] TUPLE[X, Y, ..]] FUNCTION[ANY, TUPLE[..], RESULT] FUNCTION[..][RESULT] ROUTINE[ANY, TUPLE[..]] ANY (All agent expressions, call and inline, stay as they are.) div_mod( i, j : INTEGER ) : TUPLE[INTEGER, INTEGER] is require div_mod( i, j : INTEGER i >= 0 and j > 0 ) : ( q, r : INTEGER ) is local require q, r : INTEGER i >= 0 and j > 0 do do from q := 0; r := i from q := 0; r := i invariant q*j + r = i invariant q*j + r = i until r < j until r < j loop loop r := r - j r := r - j q := q + 1 q := q + 1 end end Result := [q, r] end -- div_mod end -- div_mod Both old and new style permit: check div_mod(i, j).first = div(i, j) and div_mod(i, j).q = div(i, j) div_mod(i, j).second = mod(i, j) and div_mod(i, j).r = mod(i, j) end Although this will only be used in heavily functional programming styles. (Which keep their status of "possible, but not encouraged" in Eiffel.) Old style: New style: local q_r : TUPLE[INTEGER, INTEGER] local q, r : INTEGER q, r : INTEGER .. .. do do q_r := div_mod(i, j) (q, r) := div_mod(i, j) q := q_r.first .. r := q_r.second end .. end Making also the following possible: div_mod( i, j : INTEGER ) : ( q, r : INTEGER ) is require i >= 0 and j > 0 do from (q, r) := (0, i) invariant q*j + r = i until r < j loop (q, r) := (q + 1, r - j) end end -- div_mod Finally, a restriction: local local q_r : TUPLE[INTEGER, INTEGER] q_r : (INTEGER, INTEGER) r : INTEGER r : INTEGER .. .. do do q_r.set_r(r) q_r := q_r.with_r(r) .. .. end end Note that the two don't have the same semantics under aliasing! This last change is necessary to make contravariant conformance of tuple generics possible. The proposed tuples have no features with arguments of generic type, and are by consequence immutable. The form shown above will be rarely (if at all) needed and feature `with_y' in class TUPLE[x : X; y : Y; z : Z] is declared with type with_y( y : ANY ) : ( x : X; y : like Y; z : Z ) This is the standard pattern for "generic routines". This last change is the only restrictive one, that "takes expressive- ness away", but conformance of mutable tuples was a dangerous feature anyway and programs making use of it may still contain undetected type errors. (And, as far as I know, they were no documented usage patterns or sensible examples for the old style anyway.) Formally, I propose the following changes: 1. The syntax "TUPLE[".."]" is changed to "("..")". The old style will still be legal, making clear that tuples are only a normal abstract type, whose objects could as well be seen as instances of an implicit class (just like agents). 2. The formal arguments list of routines now has the same syntax as tuples and it already has an analogous semantics. We make that "official" by viewing each routine as taking one single argument of tuple type. The empty tuple can be left away instead of writing "()". 3. Since the proposed routine signatures are tuples anyway, there is no more need for the routine types FUNCTION and PROCEDURE to refer to tuples explicitly. A syntax as shown above is proposed instead. 4. To keep a common semantics for reattachment (assignment and parameter passing), we introduce the "multiple assignment" which just generalizes reattachment from signatures to tuples. Discussion The idea for the proposal was born out of the cumbersome use of agents which always needed tuple types in declarations and brackets in agent calls. It seems that much of the clumsiness of current agent syntax is due to their use for introspection mechanisms. A use, however, that is only sparely documented and not methodologically guided. In particular the agent introspection has not been compared with mechanisms of code generation and manipulation of abstract syntax trees which are their primary rival[1]. Indeed, introspection is been more popular with "dynamic languages" (pioneered by Lisp and Smalltalk) while more static languages suggest the other approach for more type safety and efficiency. Eiffel has already a lot of tools which work on the abstract syntax of programs (interface generation, source editing, (alas, not yet) static checking) and the proposed applications of introspection, namely object persistence and migration, could as well fit into that pattern. The proposed syntax change for agents could have been made without affecting tuples: Routine types could just use the same magic (flexible number of generic arguments) that are used by tuples. This doesn't make the language definition much more complicated, but significantly simplifies the use of agents. (While the creation of agents already has an optimal syntax.) Nevertheless I also propose to change the tuple specifications, since that will yield an effective simplification of the language: the concept of "signature" can be completely purged out of the language definition. (It has been a restricted version of tuples anyway.) Note that this change doesn't affect in any way the semantics of the Eiffel language (where routine call is rather ubiquitous), it only simplifies the description of it and generalises it a little bit. (See the use of tuples in [2]. As an acknowledgement and (weaker) argument: the use of tuples for routines with several arguments is largely common in functional languages (especially all modern ones that distinguish between curried and uncurried function application) and is also used in the "Z"-based specification constructs of [3].) Another consequence of that simplification is that functions now become truly symmetric in argument and result types. The new call syntax (x, y, ..) := function_name(a, b, ..) does make this particularly clear. Function declaration is function_name( a, .. : type_a; .. ) : ( x, .. : type_x; .. ) "is" where the special (conventional) case function_name( a, .. : type_a; .. ) : result_type "is" abbreviates function_name( a, .. : type_a; .. ) : ( "Result" : result_type ) "is" I think that the proposition of multiple assignment (a concept that was perhaps as well understood in 1985 as it is now) is the most revolutionary part of my proposition, since it is the only one that is not a simplification of syntax (for the agents) or semantics (for signatures). However, reasons for its inclusion are compelling: - Eiffel has shown much foresight in adopting call by value (and return result by value) as its only parameter passing mechanism. The above function call syntax using multiple assignments does exploit call by value to unify the concepts of "by result arguments" and function call. The advantages of this style (which to my knowledge first appeared in the "B" language) as opposed to "by result" are detailed in [4]. Briefly, it combines the advantages of simple practical use with simple formal semantics. - Reattachment to tuples is needed anyway in the new semantics for routine arguments, generalising it to multiple assignment doesn't complicate the language description in any way. - [4] shows how the notation of "the currently best book on programming methodology"(*) can be adapted and improved to match almost exactly to a subset of Eiffel. Multiple assignments are used in (semi-)formal program derivation since a long time and the introduction of the above function call syntax will improve both Eiffel and "the textbook notation". - And finally, multiple assignment has a simple semantics and is really trivial to implement. All these benefits make off for the drawback that multiple assignment alone does not add much expressiveness. Finally, the proposed replacement of type ROUTINE with ANY above should not be read literally. My proposition is to apply the command- query- seperation also to routine types, which means that functions and procedures don't have a common calling facility. If it is insensible in "normal programming" to call an Eiffel function ignoring its return value, why should it be sensible to do so in programming with routine types? The only justification are vaporous uses of introspection... According to this proposal PROCEDURE objects do (mainly) have their feature `perform' (or `call'), while FUNCTION objects have `apply' (or `item'). All other features of the current ROUTINEs don't have a methodological use and it seems they are just there "because it seems convenient" or "to profit maximally from inheritance (and conformance)". Conformance of tuple types The ubiquitousness of tupels suggests to make the unary tuple of type T isomorphic to type T, which is logical since (T) = T is already suggested by the syntax (which would otherwise be ambiguous). This unification doesn't pose a problem generally, it even makes things easier: otherwise the standard assignment would be "(x) := (expression)" (where the parens on the left create a tuple) unless we'ld introduce the unary assignment as a special case. Taking this one step further the empty tuple becomes equivalent to NONE. This is already practice in some programming languages (like Haskell), it is theoretically pleasing since NONE is the equivalent of `do_nothing' (and the integer 0, as Dijkstra would say) and it is also usefule: for example, one can create a dynamic set structure out of a dictionary by class SET[T] inherit DICTIONARY[T, NONE] rename ... end end Until now, all of the shown would work well when washed in Eiffel. But together with conformance they come even more consequences. Some people might even like those -- they fit with a certain attitude of conformance -- but this attitude is not mine. The problem is this: Tuple conformace states that "longer" tuples conform to shorter ones, if their common actual generics conform. This means This means that any tuple (T, ...) conforms to all unary tuples (T') where T conforms to T', but those unary tuples are the same as plain type T' meaning that each tuple actually conforms to the type of its first actual generic (and each of its "conformant heirs"). This effect doesn't pose any problem (besides being a bit strange). It is completely safe and perhaps there will even be uses of it. The attitude this effect goes with is the same that lead to the generous rules for conformace of generic classes in general (which is also subject to catcalls, but has been much less studied than the covariant redifinition issue). What I don't like about that attitude is it's motto "it's possible, let's allow it!" instead of an "it's necessary, let's allow it!". In general, the former attitude leads to a static type system based on structural subtyping (where anything that's type-safe is indeed allowed) while the latter leads to a system of explicitly declared conformance via inheritance. As I saw it, the latter attitude was always the basis of Eiffel's design. The former attitude, on the other hand, is the basis of Cardelli's theories of objects (which are in fact theories of records, not of abstract data types). The tuple conformance is an introduction of Cardelli- records in Eiffel, and in fact it can be used for an Oberon- like style of object- oriented programming (as in Modula-3, Component Pascal, to some extend OCaml). In short: it is vastly more, than needed. Finally, we'll get a little problem with NONE. Since all tuples conform according to the structural subtyping rule to the nullary tuple, which means that all types T = (T) conform to () = NONE. Structurally this is perfectly valid: NONE has no features, so of course anthing conforms to NONE. But the structural approach to subtyping does just care a little to less for actual language pragmatics :-( As a consequence of this considerations, I am very sceptical about whether structural conformance of tuples is any good. I'm sure that a careful consideration of the needs can find alternatives with less drastic consequences. However, these issues are relatively independent from the simplifications this note proposes. One can as well stay with the conformance rules (and have a unary TUPLE different from NONE). Also, the conformance of the generics discussed above is independent from this (and completely type-safe). Evolution It is clear, that the syntactic changes regarding agents can be applied to Eiffel programs with a simple tool and compilers can easily adapted to accept both versions for a transitional period. The semantic simplification regarding signatures has no externally visible effects and will cause no problems. Together with the other changes it may allow a simpler design of compilers, but this is an internal benefit (or challenge) to compiler writers. Multiple assignment and multiple-result functions don't raise evolution issues regarding legacy code, because they are new in the language. However, the nice code of today is the legacy code of tomorrow, and the use of the new constructs will only be practical if compilers support it properly. It is hoped, however, that the simpler and less promising semantics of tuples and agents will end the current disagreement on some of the issues. Furthermore it is an advantage of the proposed immutable tuples that they don't have any catcall problems anymore. (The catcall problem for agents was already "solved" in [6]. This problem was also subject of some disagreement, and I suspect that solution of [6] subsumes the one proposed by the SmartEiffel team (of which I know of no publication or description, but afaik it is implemented).) Conclusion The proposed simplification of the Eiffel language is so signficant, that I deem it essential for the use of Eiffel in introductory programming courses. And if it's good for students, why shouldn't it be bad for business analysts and engineers who neither need additional language complexity in their way when solving practical problems. The current agent specification suffers from the conflict that introspection is an advanced mechanism whose use lies on the border of language extension, while agents on the other hand in spite of their recentness in the language have become an indispensable means even for simple language uses. The use for the higher-order predicates `for_all' and `exists' has already become ubiquitous in assertions and higher-order functions like `filter', `map' and `fold' (which in turn serves to define `sum', `product', `maximum', `minimum', and (together with `map') the two fore-mentioned predicates) have finally been able to fulfil the early promise of Eiffel that "most programs will contain a few loops anyway"[5]. I admit that those simple uses of agents do not need to call agents or write down agent types (only reading agent types). But I certainly want to show my students how those ubiquitous higher-order functions are implemented, and similarly for the practical programmer it should not be much harder to use objects of routine type than to use any other kind of object. (The overhead needed to fit routines into the object-oriented framework, although minimised by Eiffel, is already a hardship when compared to functional languages (those, of course, make other things hard, no offense meant (;-) ).) Finally, although the changes and extensions are nontrivial, I think they are perfectly in the spirit of Eiffel and recommend their implementation and inclusion into the language standard. And even if it is decided that my proposals are too much "text book" and too little "practice", I recommend to any teacher and author of programming books to use the variant proposed here and in [4]. Although it is tempting to let students work with a "real language", students should grow up with the best notation possible and I think this is the proposed one. [1] Robert Will: ??, SmartEiffel mailing list, TODO date and URL [2] Richard Bird: An introduction to functional programming, 2nd ed. (Prentice Hall, 1998) [3] Carroll C. Morgan: Programming from Specifications, 2nd ed. (Prentice Hall, 1994) [4] R.W.: Modern guarded commands, private note of august 2003 [5] Bertrand Meyer: The Eiffel Loop Construct, comp.lang.eiffel, TODO date [6] Robert Will: Caring for Covariance with Complete Contract Conformance, TODO date and URL (*) The cover of [3] attributes this quote to "The Science of Computer Programming" without naming the reviewer or giving an issue number.