324 Lecture Week 8 - Winter 2006
================================

A variable is characterized by a sextuple of attributes:
(name, address, value, type, lifetime, scope)

  name:     might have none (i.e., dynamically allocated with new)
  address:  memory location (aliases: multiple ways to access storage)
  type:     properties
  value:    bits, interpreted in some way
  lifetime: when do we destroy it
  scope:    discussed already

Types
=====

Types are sets of computational objects with uniform behaviour
  - set of values
  - set of operations

Eg. reals, integers, strings, int->bool, (int->int)->bool

What constitutes a type is language dependent.

Remember: functions have a type too! (even if they're not first-class)

Uses/Merits:
------------

Program organization and documentation
  - separate types for separate concepts
  - indicate intended use of declared identifiers

Identify and prevent errors
  - compile-time or run-time checking can prevent meaningless computations
    eg. 5 + true - "Charlotte"

Support optimization
  - compiler can create better code if it knows what's in each variable
    eg. short integers require fewer bits
  - access record component by known offset

Type declaration and binding
----------------------------

Explicit declaration
  - statement declares variable and specifies exact type

Implicit declaration
  - types determined by convention
  - eg. Fortran: names beginning with i,j,k,l,m,n are ints, otherwise real
  - eg. Perl: $ scalar, @ array, % hash

Literals usually have implicit type

Types can be bound to variables at compile-time or run-time

Static binding - like C, Java, Pascal
  - occurs before program is run
  - sometimes called "early binding"
  - efficient for compilers

Dynamic binding - like Scheme, Javascript, PHP
  - type of a variable isn't known until execution
  - typically useful for interpreted languages

The notion of type can be extended to expressions

Type errors
-----------

A type error occurs when execution of a program is not faithful to the
intended semantics

Hardware errors: eg.
function call y() where y is not a function
  - may cause jump to instruction that does not contain a legal op code

Unintended semantics: eg. int_add(3, 4.5)
  - not a hardware error, but the bits representing 4.5 will be
    interpreted as an integer

Type checking
-------------

Does the language check types?

Type Safety: a programming language is type safe if no program is
allowed to violate its type distinctions.
  - Scheme, Java, ML are type safe
  - C and C++ are not (union structures not type checked)
  - type coercion temporarily suspends type checking

A type safe language is also called strongly typed

Type checking: the process of verifying and enforcing the type constraints
  - can occur either at compile-time (static) or at run-time (dynamic).

run-time (dynamic) type checking
  - Scheme: (car x) checks first to make sure x is a pair
    (i.e., a non-empty list).

compile-time (static) type checking
  - ML and Java: f(x) must have f: A -> B and x: A

Trade-off:
  - Both prevent type errors.
  - Run-time checking slows down execution, but is more flexible.
  - Compile-time checking restricts program flexibility, but finds
    type errors before execution.
    E.g., Scheme list elements can have different types; ML list
    elements must all have the same type.
  - Dynamic checking requires tagging every piece of data with a type
    label; static checking avoids this (tags can be forgotten once
    compiled).
  - Static typing can make programming more difficult, initially.
    It's harder to get things to compile.

Standard Type Checking:
  - Look at the body of each function and use declared types to check
    for agreement.
    eg. int f(int x) { return x+1; }
        int g(int y) { return f(y+1)*2; }

Type Inference:
  - Looks at code without type info and figures out what types could
    have been declared.
  - ML is designed to make type inference tractable.
  - A cool algorithm!
  - Widely regarded as an important language innovation.
  - ML type inference gives you some idea of how other static analysis
    algorithms might work. It uses constraint satisfaction techniques.
    eg.
A3 := B4 + 1;
  Q: What type are A3 and B4?
  A: Must be integer

E.g. if test then ...
  Q: What type is test?
  A: Must be Boolean

Sound type system: a type system in which any program accepted by the
type checker cannot produce a type error at run time.

ML's Type Inference Algorithm (Mitchell):
  1. Assign a type to the expression and each subexpression, using the
     known type of a symbol or a type variable.
  2. Generate a set of constraints on types by using the parse tree of
     the expression.
  3. Solve these constraints by using unification, which is a
     substitution-based algorithm for solving systems of equations.

Type compatibility
------------------

Every expression must have a type that is known
  - Type system: rules for associating a type with expressions
  - an expression is rejected if the rules do not associate a type with it

When are two types compatible?
Can we assign a variable of one type to another?

Name type compatibility - types have same name
  - easy to implement, but very restrictive
    eg. Ada, Modula-2: if "Indextype is 1..100", it's incompatible
        with integers
    eg. C++ (classes)

Structure type compatibility - types have same structure
  - computationally more challenging
  - are 1..11 and 0..10 compatible?
  - are structures with same types but different names compatible?
    eg. C (except struct and union)

Coercion
  - automatically or explicitly convert object of one type to another

Conversions are either narrowing or widening
  - narrowing (eg. double to float) might lose meaning (overflow/underflow)
  - widening (eg. int to double) might lose precision, but nearly always
    at least approximates the original value

Automatic conversions might weaken error checking

Instead of conversions, we could use overloading
  - use a different function for different argument types
  - arithmetic operators are typically overloaded (+ on ints, + on floats)

Generics (Ad-hoc Polymorphism)
------------------------------

Overloaded sub-programs (functions) allow
  - different code for different types of operands
    eg.
variant behaviour, optimised code
  - allow code reuse and type safety

C++ templates - parametric polymorphism
---------------------------------------

Example of generic functions: generated at compile-time
  - kind of a poor cousin to dynamically binding parameters to actual
    types in a call, where only a single copy of code is needed
    (in C++ a copy for each type must be created at compile-time)

Syntax: template