0. our old friend "nat"
-----------------------

Over the natural numbers (nat) we have the < ordering relation. It
has the well-ordered property: a non-empty subset of nat has a
minimal member, i.e.,

  forall P:nat->bool. (exist x. Px) ==> (exist m. Pm /\ ~(exist y. Py /\ y<m))

or, writing the bound on y inside the quantifier,

  forall P:nat->bool. (exist x. Px) ==> (exist m. Pm /\ ~(exist y<m. Py))

Compare with induction:

  forall Q:nat->bool. (forall x. Qx) <== (forall m. (forall y<m. Qy) ==> Qm)

(Exercise: Let "Waldo" = "the base case" = Q0. Where is Waldo?)

They are equivalent:

  forall P. (exist x. Px) ==> (exist m. Pm /\ ~(exist y<m. Py))
= forall P. ~(exist x. Px) <== ~(exist m. Pm /\ ~(exist y<m. Py))
= forall P. (forall x. ~Px) <== (forall m. ~Pm \/ ~(forall y<m. ~Py))
= forall P. (forall x. ~Px) <== (forall m. (forall y<m. ~Py) ==> ~Pm)
= forall Q. (forall x. Qx) <== (forall m. (forall y<m. Qy) ==> Qm)

1. new friends
--------------

In the above equivalence we never used the "nat"ness of nat.
(Actually we never used transitivity or dichotomy of < either.) We
may as well replace nat by some other set, impose a < ordering, and
make sure < has the well-ordering property; then induction has to
work.

Example. Pull, out of thin air, a new symbolic constant w (omega);
amend < so that n<w for every natural number n. The result
nat U {w} = {0, 1, 2, ... , w} is still well-ordered. Note that
there is a 1-1 onto map between nat and nat U {w} (e.g., w|->0,
0|->1, 1|->2, ... is 1-1 onto), but that no 1-1 onto map preserves
the order (in fact w|->0 already breaks the order). There is a 1-1
order-preserving map from nat to nat U {w}. There is no 1-1
order-preserving map the other way round. In this sense nat is
"smaller" and nat U {w} is "larger".

Example. {(0,0), (0,1), (0,2), ... ,
          (1,0), (1,1), (1,2), ... ,
          (2,0), (2,1), (2,2), ... ,
          ... }
using the lexicographical ordering. In "w" notation:

  {0, 1, 2, ... , w, w+1, w+2, ... , w+w, w+w+1, w+w+2, ... , ... }

Transfinite induction simply refers to performing induction over
some well-ordered set larger than nat.

In a well-ordered set, a member m falls into one of three cases:
- m is the minimum, e.g., 0;
- m is the successor of someone else, e.g., 1, 2, 3, ...;
  we call these "successors";
- m is not the successor of someone else, nonetheless it is bigger
  than a whole lot of other members, e.g., w, w+w, ...;
  we call these "limits"

So the proof obligation of induction, (forall y<m. Qy) ==> Qm, can
also take on three forms:
- m is 0: Q0
- m is a successor, say m=n+1: Qn ==> Q(n+1)
  (for pretty much the same reason "weak induction" equivales
  "strong induction")
- m is a limit: no change, (forall y<m. Qy) ==> Qm

So in many treatises, transfinite induction is re-stated as:

  (forall x. Qx) <== Q0
                     /\ (forall n. Qn ==> Q(n+1))
                     /\ (forall limit m. (forall y<m. Qy) ==> Qm)

It is convenient in practice because usually the three cases do
require three conceptually different proofs.

2. ordinals
-----------

{0, 1, 2, ... , w} and {(0,0), (0,1), (0,2), ... , (1,0)} are
order-isomorphic to each other. Their members seem to use different
naming schemes, but that's the end of the difference. We may as well
pick one of them and call it "canonical", "representative", etc.

An ordinal is a "canonical" well-ordered set. We impose these
requirements on our canonical choice to obtain nice properties:
- ordinals are built bottom-up: from smaller ordinals we define
  larger ordinals
- an ordinal x, being a well-ordered set, has elements; each element
  is in turn an ordinal smaller than x
- in fact x = { those ordinals smaller than x }
- natural numbers are recast as ordinals, so that ordinals become a
  generalization of natural numbers

Thus we re-build natural numbers and build ordinals this way (due to
von Neumann):

  0   := the empty set
  1   := {0}             = 0 U {0}
  2   := {0, 1}          = 1 U {1}
  ...
  n+1 := {0, 1, ... , n} = n U {n}
  ...
  w   := nat = {0, 1, ... }            this is the first limit ordinal
  w+1 := {0, 1, ... , w} = w U {w}
  ...
  w+w := {0, 1, ... , w, w+1, ... }    this is the second limit ordinal
  ...

In general, there is 0, there are successor ordinals x+1 = x U {x}
such as 1 and w+1, and there are limit ordinals (non-successors)
such as w and w+w.
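The bottom-up construction above can be tried out concretely for the
finite ordinals by representing each one as a Python frozenset. This
is only a sketch (`ordinal` is an illustrative helper name, not from
the notes): 0 is the empty set, and each step takes x to x U {x}.

```python
# Sketch: finite von Neumann ordinals as Python frozensets.
# ordinal(n) builds n bottom-up via the successor step x+1 := x U {x}.

def ordinal(n):
    x = frozenset()      # 0 := the empty set
    for _ in range(n):
        x = x | {x}      # successor: x U {x}
    return x

three, five = ordinal(3), ordinal(5)
print(len(five))      # 5: the ordinal n has exactly n elements
print(three in five)  # True: a smaller ordinal is a member of a larger one,
print(three < five)   # True: and also a (proper) subset of it
```

Frozensets are used (rather than sets) because ordinals must be
hashable to serve as elements of other ordinals; only the finite
stages are reachable this way, of course, since w has infinitely many
elements.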
The nice thing about the above requirements and construction is this
equivalence:

     x smaller than y (has 1-1 order-preserving map from x to y)
  == x member of y
  == x subset of y

thus many different notions of "smaller/larger" are unified. We can
just write x<y for all of them.

3. transfinite recursion
------------------------

Induction has a companion: recursion. You can define a function f
from the ordinals to a set S this way:

  f0     = whatever you like
  f(y+1) = expression involving fy
  fz     = expression involving ft's, where the t's are < z
           (for limit z)

Example. Let S be a complete lattice, h:S->S a monotonic function.
Define f by

  f0     = bottom
  f(y+1) = h(fy)
  fz     = Join of the ft's, where the t's are < z   (for limit z)