module AlgebraBasic where
import Prelude hiding ( filter, last, reverse, (++), (!!), seq, zip
		      , maximum, minimum, and, or, all, any
		      )

-- Foreword: Abstraction in Programming firmly rests on the separation
-- of Specification and Implementation.  Implementations in functional
-- languages have always been on a "higher level", many of them
-- serving as their own specifications.  (Just take a look into
-- Haskell's Prelude and try to specify one of the functions.)  Due to
-- take higher-level of built-in abstraction, they have resisted for a
-- long time the use of programmer-defined abstraction, that is, the
-- separation of Specification and Implementation.  (Which is by the
-- way also the basis of well-used polymorphism.)  The main data
-- structure of the Haskell programmer is still the same as the one
-- from the first functional language in 1958.  This file might
-- convince the reader that a good functional language should have
-- more to offer.

{--

This file contains type classes with functions for the three abstract
data structures Sequence, Ordered Bag, and Ordered Set, together with
complete formal specifications for all of those functions.

Furthermore there are two type classes with the common functions for
all those data structures (Collection) and the common functions for
the (implicitly) ordered structures (OrdColl).  The axioms stated in
those classes hold for the respective functions in the subclasses,
too. 

Purpose of the file is too design an algebra of Collections that
serves as a basis for the (external) design of Functional Dessy.  It
is already clear, that the functions specified here can be implemented
efficiently in several ways:
http://www.stud.tu-ilmenau.de/~robertw/dessy/fun
The question is: how exactly should they behave?

While Sets, Bags and Sequences are well-defined mathematically, such a
thing as a Collection is not.  When defining Collections (and Ordered
Collections) here, and stating axioms and laws for them, we do
necessary invent something arbitrary, which doesn't correspond to some
known structure.  However, our design is guided by the _usefulness_ of
the result.  Although what we describe is a mathematical structure, it
is meant to serve some purpose: namely allowing algorithms and
specification to be expressed for several kinds of data structures at
once, as long as we don't need to use one of their special properties.

When our design settles on Collections that are Ordered and
Associative, this isn't because that need implicitly be so
mathematically.  Rather it is, because this way, Collections are
simple, general and useful.  Since I took most of the decisions in
this file (among them all the major ones) with several arguments on
the pro side, and often none on the con, I think that there is probably
no fundamentally different design, that can serve the same purpose
well.  This is why I expect the classes in this file to become a
public standard.

Here is the design rationale for (Ordered) Collections.  
Below is the code. 


** Order **

In this file all data structures have a notion of ordering: Sequences
are explicitly ordered: elements stay at the position you put them.
Bags and Sets are implicitly ordered: elements always take an order
according to the <= relation (from class Ord).  

Formally, being ordered means to have a well defined `first` element.
Since one can consider the first element of the rest of the collection
(called `but_first`) and so on, this defines an order for all of the
collection.  For example, the `last` element is defined by taking
`but_first` until there's only one element left.  (Below, `last` is
defined in a different way, not much more complicated, but such that
the definition can also serve as a more efficient implementation.)

The mathematical concept of Set doesn't have a notion of Ordering:
people don't agree what's the first element of a set.  So why do we
have this notion?

 - Sets are usually implemented using search trees or another ordered
   (concrete) data structure, which gives them not only a notion of
   ordering, but also allows efficient implementation of the
   corresponding functions.  Expecially in functional programming,
   there are virtually no alternatives to such an ordered
   implementation.  (Of course on shouldn't let the implementation
   decide over the specification, but this first argument is only the
   point that gave me the idea...)

 - The notion of ordering also helps to explain set semantics and do
   calculations on paper: to compare two sets, just sort them and
   compare element by element.

 - Since Sequences are already ordered, Sets and Bags can share this
   notion with them and provide a lot of useful common operations.
   This is especially important for the semantics of `fold` and
   `reduce`, see next section.

 - Even if we had a class Set for unordered Sets, we would probably
   like to have a class OrdSet, too: `minimum`, `maximum` and so on,
   are useful notions, furthermore we can access elements by index, do
   rank queries or return the nearest smaller/larger element from the
   set. 


** Associativity and Higher-Order Functions **

Sequences (as well as Sets and Bags) are "flat" structures, there is
no difference between "a ++ (b ++ c)" and "(a ++ b) ++ c".
Nevertheless, Haskell's "lists" are based on a non-associative
constructor, namely (:).  (I write "list" in quotes, because the data
structure is really a functional stack.)  Haskell "lists" directly
expose their first element (`head`) to the user, while the last
element is buried deep behind all the others.  In my paper on
Democratic Sequences I have called this "tyranny of the front over the
back".  But after all, Haskell's "list" is only a concrete data
structure and it must make it's choice somehow.

Our collections, on the other hand, are all constructed via the
associative constructor <+> which carries the common semantics of ++
for Sequences, \/ for Sets, and \+/ for Bags.  Since <+> is no data
constructor, every instance of Collection can choose its own
association of elements, and from the user's point of view, no
association is preferred over the other.  This has consequences for
the specification of the folding functions...

Indeed, I have developed this algebra while I was trying to specify
`fold` for the Collection class.  To make a long story short, `foldr`
on lists just replaces the data constructors by user-defined
functions.  We have specified `fold` on collections analogously: it
takes one argument for each of the abstract data constructors `empty`,
`single`, <+>:

    fold   :: (b) ->             -- empty
              (a -> b) ->        -- single 
	      (b -> b -> b) ->   -- <+>
	      coll a -> b

This function compared to `fold` on "lists" has the same nice
properties as Hinze's Abstract Leaf Trees have over binary search
trees: it separates concerns.  To appreciate this, have a look at some
other higher-order functions that are defined via `fold`:

    reduce e = fold e id
    apply f  = fold empty (single . f) (<+>)
    filter p = fold empty (\x -> if p x then single x else empty) (<+>)

The names `apply` and `reduce` come from Scheme (I think).  `apply`
does what Haskell calls `map` but this name is too near to the data
structure Map, is also conflicts with Map`s constructor function.
`reduce` is a useful variant of `fold` discussed below.

Or some other functions:

    size = fold 0 (const 1) (+)

    count coll x = fold 0 eq (+) coll
	where
	eq y | x == y    = 1
	     | otherwise = 0

In exercises on functional programming students are often demanded to
implement some data structure together with some functions on it,
often including a folding function.  Using these data structures
together this gives a myriad of folding functions.  (Remember that
"lists" have already at least five of them.)  Here, however, we do
with just _one_ folding function for all data structures that are
instances of Collection: Sequence, Set, Bag in this file and Relations
and Maps in another one.  And still, the function is incredibly
useful. 


 -- `reduce` --

The major problem of translating "list's" folding functions to general
Collections is that the "list" functions unify several purposes, one
of them being "the replacement of data constructors by user-defined
functions" which we treated with the `fold` function above.  Another
one is the `reduce` operator on abstract Sequences: insert a binary
function between the elements of a Sequence.  Since Sequences by
definition are "flat", i.e. have an associative ++ constructor, this
operation is only well-defined for associative binary operators.
Furthermore, unlike "lists" or trees, Sequences have no specified
place where they contain `empty`: `empty` is the unit of ++ and
therefore disappears in all Sequences but `empty` itself.
Consequently, a call to `reduce (*) e" makes only sense, when `e` is
also the unit of (*).  Together with the associativity this means,
that (*) and `e` must be a Monoid (i.e. an algebraic group, but no
need for inverse elements).  This precondition is part of any sensible
specification of 'reduce'.

 - In practice, most of the uses of fold on "lists" do indeed observe
   that precondition: + * || && ++ min max >>

 - Unlike traditional fold on "lists" our `reduce` is abstract and
   many implementations (on trees) will be able to perform it in
   parallel.

 - And we have the simple law of reasoning:

    reduce (*) e coll | is_empty coll = e
                      | otherwise     = reduce1 (*) coll

   Meaning that anytime the Collection is non-empty, we don't need `e`
   to reduce it, and otherwise the result is just `e`.

`reduce`, like `fold` is therefore a good thing, that we should keep
in the library.

When talking about associativity and Sequences, we should also think
about commutativity of Sets (and Bags):  Usually a `reduce` on Sets
would require a commutative binary function to be inserted between the
elements.  In our case, however, Sets and Bags are ordered, and by
consequence the definition of `reduce` as given is perfectly valid.
Again, it's one function for all kinds of Collection.

We have separated the fold on "lists" into two purposes and proposed
two functions for Collection: `fold` and `reduce`.  We should ask
whether `foldl` and `foldr` still have applications which are not
covered by the cases treated so far.  And there are: some uses of fold
do not use associative binary functions.  These uses depend on a
certain order of traversal that is performed on the data structure.
Examples are Horner's algorithm and the calculation of a (n-ary)
number from a Sequence of digits.  These traversals can easily
specified via a recursion on first/but_first or last/but_last,
respectively.  Technically we should make those traversals into
higher-order class functions, too, since on some structures a
different implementation (not using first/but_first) can be
asymptotically faster.  (O(N) vs O(NlogN).)


 -- `zip` and `merge` --

The `zip_with` function on Sequences is known from "lists" and there
is not much special about it here.  It's specifying implementation
below does much resemble the code from Prelude.  However, at the same
time, it is a shining example of abstraction allowed by the type
classes.  While the implementation in the Prelude is fixed to "lists"
and one and the same code, `zip_with` on Collection will be
implemented differently on tree-based and on deque-based Sequences.
All of this is transparent to the user, who only needs to know the
specification of the function in the (type) class.  (And since we have
that formal specification, any other implementation can be proven
correct formally, but that's another story.  Oh, and there's
QuickCheck which we should talk about some day.)

While `zip_with` works on all ordered structures and pairs elements
according to their index, the `merge_with` structure works only on
implicitly ordered structures and pairs equal elements.  Although I
just discovered that function Yesterday (2003-12-09), it already
proved to be incredibly useful: all of the combination operators of
Ordered Collections can be expressed efficiently (!) using
`merge_with`: \/, /\, \\ on Sets; \+/, \-/ on Bags; <+ on Maps; and
even in user applications (especially with Maps) this function is
useful.  (Although Maps have a separate Algebra and there the function
has a separate name.)

And it even more illustrates abstraction: its specification is already
rather long (perhaps you can simplify?), and the implementation is
different on different concrete data structures.  (Like that of
`zip_with`, but this time the code is really complicated and users can
be happy, that it is hidden from them!)

Due to its high level of built-in abstraction, functional languages
rarely had complicated code and consequently few things to
(information) hide.  With the Collection classes presented here,
abstraction finally pays of.  Long live Design by Contract!

--}

-- In order to type-check the algebraic specifications, I write all the
-- axioms as functions where the arguments represent forall-quantified
-- variables (and the function-name gives a name to the axiom).
-- Unfortunately I can not interleave these specifications with the
-- type class definition, so I'm sorry if you have to jump around a
-- bit to follow the discussion...

-- Furthermore all the implementations in the classes have to be
-- considered as specifications: any overwriting of them must exactly
-- have the same client-visible semantics (but often run faster).  (In
-- general, instances will specialise the functions which return
-- "small structures" (like `size`, `first`, (!!), ...), because for
-- them the overhead is most significant.)  All functions have their
-- precondition stated explicit, and at the moment I think that no
-- subclass will weaken any of this preconditions.

data Splitted coll a = Empty
		     | Single a
		     | (coll a) :+: (coll a)  deriving Eq

-- The :+: invariant.
inner_nonempty coll = case split coll of
--		      (s :+: l) -> not (is_empty s)
--			        && not (is_empty l)
                      (Empty :+: _) -> False
		      (_ :+: Empty) -> False
		      (_)           -> True

-- ** Collections ** --

class (Eq (coll a), Eq a)
    => Collection coll a 
    where
    -- Primitive Constructors.
    empty :: coll a     
    single :: a -> coll a
    (<+>) :: coll a -> coll a -> coll a  -- read <+> as "join"

    -- The one and only basic query.
    split :: coll a -> Splitted coll a

    -- All other queries and modificators are defined via split.
    -- Functions that traverse the complete collection can often be
    -- defined indirectly be `fold`.
    size :: coll a -> Int
    is_empty :: coll a -> Bool

    size = fold 0 (const 1) (+)
    is_empty coll = case split coll of
		    Empty -> True
		    _     -> False
    -- is_empty = fold True undefined (const (const False))

    count :: coll a -> a -> Int
    has   :: coll a -> a -> Bool

{--  count coll x = fold 0 eq (+) coll
	where
	eq y | x == y    = 1
	     | otherwise = 0 --}
    count coll x = size $ filter (==x) coll

    -- has coll x = count coll x > 0
    has coll x = not.is_empty $ filter (==x) coll
    -- Advantage: works also on Infinite Streams.

    -- Queries on the ordering.
    -- (Skip implementations on first reading ;-).
    first,     last     :: coll a -> a      
    but_first, but_last :: coll a -> coll a 

    has_index   :: coll a -> Int -> Bool
    (!!)        :: coll a -> Int -> a       
    front, back :: coll a -> Int -> coll a  

    front_view :: coll a -> Maybe (a, coll a)
    back_view  :: coll a -> Maybe (coll a, a)

    first coll = case split coll of
		 (Empty)    -> error "first.Precondition"
		 (Single x) -> x
		 (s :+: _)  -> first s

    last coll  = case split coll of
		 (Empty)    -> error "last.Precondition"
		 (Single x) -> x
		 (_ :+: l)  -> last l

    but_first coll = case split coll of
		     (Empty)    -> error "but_first.Precondition"
		     (Single x) -> empty
		     (s :+: l)  -> but_first s <+> l

    but_last coll  = case split coll of
		     (Empty)    -> error "but_last.Precondition"
		     (Single x) -> empty
		     (s :+: l)  -> s <+> but_last l

    has_index c i  =  1 <= i && i <= size c

    coll !! i | not (has_index coll i) = error "(!!).Precondition"
	      | otherwise              = last $ front coll i

{-- spec:
    front coll 0 = empty
    front coll i = first coll `cons` front (but_first coll) (i-1)

    back coll 0 = empty
    back coll i = last coll 'cons' back (but_last coll) (i-1)
--}

    front coll i | i <  0         = error "front.Precondition"
		 | i == 0         = empty
		 | size coll <= i = coll
                 -- assert: 1 <= i < size coll, meaning `coll` is a :+:
		 | size s <= i    = front s i
		 | otherwise      = s <+> front l n 
		 where
		 (s :+: l) = split coll
		 n = i - size s

    back coll i  | i <  0         = error "back.Precondition"
		 | i == 0         = empty
		 | size coll <= i = coll
                 -- assert: 1 <= i < size coll, meaning `coll` is a :+:
		 | size l <= i    = back l i
		 | otherwise      = back s n <+> l
		 where
		 (s :+: l) = split coll
		 n = i - size s

    front_view coll | is_empty coll = Nothing
		    | otherwise     = Just (first coll, but_first coll)
    back_view coll  | is_empty coll = Nothing
		    | otherwise     = Just (but_last coll, last coll)

    -- Modification
    remove :: coll a -> a -> coll a

    remove coll x = filter (/=x) coll
    -- (The implementation is more efficient on Ordered Collections.)
    -- On Bags this is not the inverse of `add`, see `subtract`.

    -- Sequences: cons, snoc, insert_*, delete
    -- OrdColl: add, OrdBag: subtract

    -- Higher-Order Kingdom
    fold   :: b -> (a -> b) -> (b -> b -> b) -> coll a -> b
    reduce :: a ->             (a -> a -> a) -> coll a -> a

    apply  :: (Collection coll b) =>
	      (a -> b) -> coll a -> coll b
    filter :: (a -> Bool) -> coll a -> coll a

    fold e f g coll | is_empty coll = e
		    | otherwise     = fold1 f g coll
    -- This makes use of the :+: invariant.				      

    reduce e = fold e id
    apply f  = fold empty (single . f) (<+>)
    filter p = fold empty (\x -> if p x then single x else empty) (<+>)

    -- These are helper functions, but also useful per se...
    -- Both have the precondition that their `coll` is not empty. 
    fold1   :: (a -> b) -> (b -> b -> b) -> coll a -> b
    reduce1 ::             (a -> a -> a) -> coll a -> a

    fold1 f (*) coll = case split coll of
		       (Single x) -> f x
		       (s :+: l)  -> fold1 f (*) s  *  fold1 f (*) l

    reduce1 = fold1 id
    
    -- e.g. 
    maximum, minimum :: Ord a => coll a -> a
    maximum = reduce1 max
    minimum = reduce1 min
    -- Mentioned here, because ordered collections can do it fast. --}

    -- Just for completeness.  In case anyone wants to instantiate
    -- this one feature instead of the basic three.
    unsplit :: Splitted coll a -> coll a
    unsplit (Empty)    = empty
    unsplit (Single x) = single x
    unsplit (s :+: l)  = s <+> l

    empty    = unsplit (Empty)
    single x = unsplit (Single x)
    s <+> l  = unsplit (s :+: l)

    zip_with :: (Collection coll b, Sequence seq' c) =>
		(a    -> b     -> c    ) ->
		coll a -> coll b -> seq' c

    zip_with (*) a b | is_empty a || is_empty b = empty
		     | otherwise = single (first a * first b) 
                                ++ zip_with (*) (but_first a)
				                (but_first b)

    zip :: (Collection coll b, Sequence seq' (a, b)) =>
	   coll a -> coll b -> seq' (a, b)
    zip = zip_with ( \ x y -> (x, y) )


{-- Until now we have indeed defined all functions in terms of the
    first four.  You may ask what kind of collection this definitions
    describe.  Well.. this depends on the axioms of `single`, <+>, and
    `split`.  In class Collection we will underspecify these
    functions, such that they allow several refinements, yielding
    Sequences, Sets, Bags, Maps, ...

    All the functions we add later in the subclasses, will also be
    expressed in terms of `empty`, `single`, <+>, and `split`.  Why
    don't we add them right now?  Depending on the properties of the
    basic functions, the additional definitions would not make sense
    in all subclasses.  E.g., they would degenerate to do the same as
    other functions, or to constant functions, or wouldn't have nice
    applications and own laws.
--}

-- ** Axioms of Collection ** --

instance Eq a => Collection [] a where 
   split ([])   = Empty
   split (x:[]) = Single x
   split (x:xs) = [x] :+: xs

   unsplit (Empty)     = []
   unsplit (Single x)  = [x]
   unsplit (xs :+: ys) = xs Prelude.++ ys
-- instance (Collection a b, Eq c) => Collection a c where {}

-- To avoid "Unresolved top-level overloading" we will type anything
-- as lists.  Nevertheless the axioms must be understood to hold for
-- all collections!

empty_is_Empty = is_empty (empty :: [Int])
		 
single_is_Single x = split (single x :: [Int]) == (Single x)
--single_is_Single x = (split . single) x == (Single x :: Splitted [] Int)

-- Due to the definitions of `is_empty` and `size` above (which are
-- part of the specification), these equations fix the result of
-- `split` for the first two constructors.

-- These two axioms are also implied by "split . unsplit == id", but
-- this is surely too strong: it would make all Collections isomorphic
-- to leave trees and not even allow associativity:

join_associative a b c = (a <+> b) <+> c == a <+> (b <+> c) 

empty_join_unit a = a <+> empty == a
		 && empty <+> a == a


join :: (Collection coll' (coll a), Collection coll a)
	  => coll' (coll a) -> coll a
join = reduce empty (<+>)
-- Having different type variables here is very important:
-- E.g. we can join a Sequence of Sets.
-- It also makes Collection Comprehension with different Collection
-- Types possible.


-- Here the == must be understood in a strong logical sense: There is
-- no means to distinguish between two equal abstract values.

infixr 2 ==>  -- as ||
a ==> b = not a || b  -- strict in first argument

dont_loose a b x = (a `has` x || b `has` x)
		   ==> (a <+> b) `has` x

-- In usual algebraic specifications, every expression is assumed to
-- denote a different value (as if all the functions were data
-- constructors), unless there is an explicit axiom (like
-- `join_associative` above), that states equality.  However, for our
-- Collection that doesn't work, since the subclasses will introduce
-- more axioms.

-- That's why I give the axiom `dont_loose`: it gives a lower bound on
-- equality, stating that (a <+> b) cannot be equal to a Collection
-- that doesn't have `x`.  Don't ask me, how that fits into the theory
-- of abstract data types.  I think it's practical here.


-- ** Sequences ** --

class Collection seq a => Sequence seq a where
    -- Only adds modification of the explicit ordering.
    -- And a famous synonym for <+>.
    (++) :: seq a -> seq a -> seq a
    (++) = (<+>)

    cons :: a -> seq a -> seq a
    snoc :: seq a -> a -> seq a

    cons x s = single x ++ s
    snoc s x = s ++ single x

    has_gap :: seq a -> Int -> Bool
    has_gap s i  =  1 <= i && i <= size s + 1

    insert_element :: seq a -> Int -> a -> seq a
    insert_segment :: seq a -> Int -> seq a -> seq a

    insert_element s i x
	| not (has_gap s i) = error "insert_element.Precondition"
        | otherwise = front s (i - 1) ++ single x ++ back s n
	where n = size s - i + 1

    insert_segment s i s'
	| not (has_gap s i) = error "insert_element.Precondition"
        | otherwise = front s (i - 1) ++ s' ++ back s n
	where n = size s - i + 1

    reverse :: seq a -> seq a
    reverse = fold empty single (flip (++))

infixl 5 `snoc`
infixr 5 `cons`

{-- The Collection axioms allow us to reduce any Collection expression
    to a normal form being `empty` or a combination of `single` and
    <+>.

    The Sequence doesn't add any more axioms on equality.
    Consequently we will consider all the expression of such a normal
    form (modulo associativity) to denote a different value.

    TODO: How to write that formally.
--}

instance Eq a => Sequence [] a where {}


-- ** (Implicitly) Ordered Collections ** --

class (Collection coll a, Ord a) => OrdColl coll a where
    -- <+> is \/ or \+/ for Sets and Bags respectively.
    add :: coll a -> a -> coll a
    add coll a = coll <+> single a

    smaller, larger :: coll a -> a -> coll a
    smaller coll a = filter (<=a) coll
    larger  coll a = filter (>=a) coll

    -- Should we have the synonyms `minimum`, `maximum`, `but_min`,
    -- `but_max`, `smallest`, `largest` for `first`, `last`,
    -- `but_first`, `but_last`, `front`, `back`?

    merge_with :: (Collection coll' b) =>
		  (Maybe a -> Maybe a -> coll' b)
	          -> coll a -> coll a -> coll' b
    merge_with f a b | is_empty a && is_empty b = empty
		     | is_empty a               = take_b
		     | is_empty b               = take_a
		     | first a < first b        = take_a
		     | first b < first a        = take_b
		     | otherwise                = take_both
     where
     take_a = f (Just$ first a) Nothing 
           <+> merge_with f (but_first a) b
     take_b = f Nothing (Just$ first b) 
           <+> merge_with f a (but_first b)
     take_both = f (Just$ first a) (Just$ first b) 
              <+> merge_with f (but_first a) (but_first b)


union_commutative :: (OrdColl coll a) => coll a -> coll a -> Bool
union_commutative a b  =  a <+> b == b <+> a

{-- Despite <+>'s commutativity, Ordered Collections also have a
    proper notion of ordering, which is expressed by the following
    axiom:

    split_ordered :: (OrdColl coll a) => coll a -> Bool
    split_ordered a = case split a of
                      (s :+: l) -> and [ x < y | x <- s, y <- l ]
                      _         -> True

    Haskell '05 will have Collection comprehension, no question.
--}


-- ** (Ordered) Bags ** --

class (OrdColl bag a) => OrdBag bag a where
    (\+/), (\-/) :: bag a -> bag a -> bag a
    (\+/) = (<+>)

bag_union_commutative a b  =  a \+/ b == b \+/ a

-- The difference to Sets is simply that we don't have idempotence of
-- union, which thus gives us a slightly stronger notion of
-- equivalence.  It's lower bound is:

bag_add a b x  =  (a \+/ b) `count` x 
               == a `count` x  +  b `count` x

-- Definition of subtraction:

bag_sub a b x  =  (a \-/ b) `count` x 
               == (a `count` x) -. (b `count` x)
    where i -. j = 0 `max` (i - j)

-- Formerly I preferred to model Bags as Maps from elements to counts,
-- as one would do in Z or B.  However, the algebra of this file
-- clearly suggests the way gone here.  But since the purpose of
-- abstraction in this file is not to insist on one single view, we
-- have the following functions:

{-- Doesn't work yet, since Maps are not done.

bag_map :: (Map map a Int, OrdBag bag a) => bag a -> map a Int
bag_map = fold empty ( \x -> single (x :> 1) ) (merge f)
    where f (Nothing)     (Just m)      = single m
	  f (Just n)      (Nothing)     = single n
	  f (Just (k:>n)) (Just (_:>m)) = single (k :> n+m)

map_bag :: (Map map a Int, OrdBag bag a) => map a Int -> bag a
map_bag = fold empty m (<+>)
    where m (x :> 1) = single x
	  m (x :> n) = single x <+> m (x :> (n+1) )
--}

-- Furthermore, I don't consider Bags to be as useful as Sets and
-- Sequences, I only have included them here, because the algebra
-- allows for it, without further circumstances.  Furthermore (as
-- shown below) we can also have manifest notations for Bags without
-- introducing any new syntax.  Thus Bags are gratuitous and so we
-- include them.


-- ** (Ordered) Sets ** --

class (OrdColl set a) => OrdSet set a where
    (\/), (/\), (\\) :: set a -> set a -> set a
    (\/) = (<+>)

    subset, strict_subset :: set a -> set a -> Bool
--    a `subset` b = size a <= size b && all (b `has`) a
--    a `strict_subset` b = size a <  size b && all (b `has`) a

instance Ord a => OrdColl [] a where {}
instance Ord a => -- only GHC needs that
         OrdSet [] a where {}

union_idempotent x y  =  single x \/ single x == (single x :: [Int])

-- This upper bound on equality matches the lower bound given by
-- `dont_loose`.  By consequence, OrdSets are the "weakest" of our
-- data structures.

-- Here is a trivial consequence from the axiom:
at_most_one a x  =  a `count` x <= 1



-- ** On Manifest Notation ** --

{-- Since Sequences have the strongest notion of equality (and the
    least "loss of information"), we can safely use Sequences as an
    intermediate to construct the other collections.  We don't need to
    change the Haskell syntax in any way, we just give all manifest
    [], [,], [..] and [ | <- ] the type "Sequence seq a => seq a" and
    offer the following functions: --}

set :: (Collection coll a, OrdSet set a) =>
       coll a -> set a
set = convert

bag :: (Collection coll a, OrdBag bag a) =>
       coll a -> bag a
bag = convert

seq :: (Collection coll a, Sequence seq a) =>
       coll a -> seq a
seq = convert

convert :: (Collection coll a, Collection coll' a) =>
       coll a -> coll' a
convert = fold empty single (<+>)

-- `seq` can be used in programs to convert back from any other
-- Collection to Sequences.  I think this explicit style is superior
-- to using "Collection coll a => coll a" as the type and having the
-- context decide on the subtype.  This is too implicit (to much like
-- Perl) for a semantic issue.

-- In the same way we can output Sets and Bags by printing "set " /
-- "bag " and then the Sequence of elements.

-- In the Collection comprehesion, by the way, we can mix all kinds of
-- generator types, e.g. 
--     bag [ x - y | x <- some_seq, y <- some_set, x > y ]

-- A compiler can of course fuse the manifest / enumeration /
-- comprehension notation with the following conversion function, so
-- that we don't pay any overhead.  (The simple rule is that `convert`
-- specialised to type "Collection coll a => coll a -> coll a" is
-- equivalent to `id`.)



-- ** Acknowledgments ** --

-- Ralf Hinze's Abstract Leaf Trees
-- "A fresh look at binary search trees", JFP 2001
-- http://www.informatik.uni-bonn.de/~ralf/publications/SearchTree.ps.gz

-- E.C.R. Hehner's "Practical Theory of Programming"
-- 2nd edition to appear

-- My original inspiration were the structures of the Z and B
-- specification languages (long live Abrial).  But those are based on
-- classical mathematics, while the structures described here (as well
-- as Hehner's) are meant to be self-contained and contain the minimal
-- theory needed to reason about them.


