This is the README for lk-0.5.0

See BUGS and ChangeLog for a summary of recent changes, including two
bug fixes since the last public release (0.4.17).

The home site of this package is 
http://www.cs.utoronto.ca/~neto/research/lk

Enjoy,
David Neto, netod@acm.org, 2000/9/17


Contents
========
 - Introduction
 - Making the programs
 - Why a new implementation?
 - Tools
 - Algorithms
 - Performance
 - Rough bits
 - Copyrights and Licenses
 - References


Introduction
============

Program LK implements the Lin-Kernighan heuristic for the symmetric
traveling salesman problem (TSP).

See the famous 1973 paper by Shen Lin and Brian Kernighan for the first
description and motivation of this algorithm.  (References appear below.)  
A more modern source is the chapter by Johnson and McGeoch; it describes
the Lin-Kernighan algorithm in the context of other local search
algorithms for the TSP.

This program was written by David Neto for his doctoral work in the
Department of Computer Science at the University of Toronto.  It mostly
follows the design outline given by Johnson and McGeoch for the
implementation due to Johnson, Bentley, McGeoch and Rothberg (JBMR),
and is tuned for two-dimensional Euclidean instances.  

I successfully defended my PhD on May 25, 2000.  The thesis
title is ``Efficient Cluster Compensation for Lin-Kernighan Heuristics''.
Find out more by visiting http://www.cs.utoronto.ca/~neto/research/thesis.html
The thesis is now available online in PostScript form.

This program introduces ``efficient cluster compensation'', an
algorithmic technique intended to make the Lin-Kernighan heuristic
more robust in the face of clustered data.

Mail suggestions and bug reports for LK to netod@acm.org.  Please include
the version of `lk', which you can get by running `lk --version'.
If pertinent, include a pointer to sample inputs.

Programs jitter, shake, and tspgen are TSP instance generators; these
generators are the subject of active research.  Program unifd generates
2-d instances with a uniform distribution over a square.


Making the programs
===================

See the file INSTALL for compilation instructions.  In a nutshell:
	./configure
	make
	(cd script; make myall)
	(cd src; ../script/lkdoitcmp)

You'll probably want to execute the ../script/lkdoitcmp script while
inside the src directory.  That creates two binaries: 
	lk.no_d (baseline Lin-Kernighan without cluster compensation), and
	lk.deg (lk with cluster compensation)
You can tell which version is which by executing the binary with
--version.  The cluster compensation binary will have 'deg' at the end
of its version number, e.g.
	LK 0.4.20deg

Scripts for munging data and running experiments live in the script
directory, the data directory, and this directory.  Buyer beware.  :)

The expt directory contains scripts for script/expt.pl that run a battery
of experiments.   The expt directory also contains a horribly convoluted 
makefile for processing the output logs of the experiments.

Once both binaries for lk are built, try this inside the expt directory:
    ../script/expt.pl expttest
    make tsp.lin105..i1..cf.plt
    make tsp.lin105..i1..probe.plt
    make tsp.lin105..i1..move.plt

    # Shows a run-by-run comparison with and without cluster compensation.
    gnuplot tsp.lin105..i1..cf.plt
    # Shows probe depths of lambda-changes (how long the t list grows,
    #	and how often).
    gnuplot tsp.lin105..i1..probe.plt
    # Shows move depths of improving lambda-changes (i.e., those we commit to).
    gnuplot tsp.lin105..i1..move.plt

If you want to try a more extensive set of experiments, try expttsp.

script/TSP.pm is the start of a Perl module that should mostly
encapsulate the munging of TSPLIB files.  For example, see
script/tsplib2rothberg.pl.


Why a new implementation?
=========================

1. "You never really know something until you teach it to a computer."
	-- Donald Knuth (paraphrase)

2.  When I started writing this program, there were no freely available
implementations of the Lin-Kernighan heuristic.

3.  I wanted to instrument the program so I could learn more about the
behaviour of the algorithm.  Writing it myself seemed a natural way
to know where and what to instrument.

4.  I originally wanted to run the code in parallel.  Also, I chose the
performance-oriented implementation language C.  Both factors meant I
needed to architect the code in a particular way.  (Now I don't care
about parallelizing the code.)

5.  I've written (most of) the code in a "literate style".  I've used the
CWEB suite of literate programming tools.  I'm convinced literate
programming is the way to go for experimental and expository
programming.

6.  Free software is a good thing.  By "free" I mean free to
redistribute and modify.  (See http://www.debian.org/intro/free.) 
I wanted to be able to distribute the code under the GNU GPL or the GNU
LGPL, so I needed to write a clean-room implementation.

I do not resent other people for not releasing their code.  I have the
greatest respect for other researchers in this field (see the
references).  They have already given a great deal in describing their
own work.

Let me clarify this stance with two quotes:

"My opinion on licenses is that `he who writes the code gets to choose
the license, and nobody else gets to complain.' Anybody complaining
about a copyright license is a whiner." -- Linus Torvalds.

"Implementation is the sincerest form of flattery." -- L. Peter Deutsch.

Note:
I just found out about the Applegate, Bixby, Chvatal, Cook source code
release.  I'm eager to try out their code.
(See http://www.caam.rice.edu/~keck/concorde.html)
But before I do, I'm releasing my own code, to establish the "cleanliness"
of my code.


Tools
=====

If you want to try the programs in this package, you will need a C compiler
and make.  The only non-standard C feature the program requires is the
BSD resource-usage function getrusage().

If you want to munge the output and run experiments automatically,
you will need Perl and GNU make (for expt/Makefile).  If you want
to plot the munged output of experiments, you will need gnuplot.

If you want to read the code, you will need CWEB and TeX to process
the .w files into .dvi files.

If you want to develop the code, you will need CWEB (and make and a C
compiler).  Edit the .w files, not the .c files!  I also strongly
suggest using RCS.  You'll probably also want GNU autoconf and GNU
automake.  Remaking dependencies can be avoided by first using
automake --include-deps in the project root directory.


Algorithms
==========

For the Lin-Kernighan heuristic for the TSP, I worked from Lin and
Kernighan's original paper and from the Johnson and McGeoch chapter.
(See src/jbmr.w)

Jon Bentley and Doug McIlroy developed the QuickSort variant coded as
dsort() in file src/dsort.w.  See their article "Engineering a Sort
Function".  I used it here because it is deterministic: I need
repeatability for my experiments.  Apparently, the Solaris qsort
function isn't always deterministic.  (See src/dsort.w)

Kd-trees for semi-dynamic point sets are Jon Bentley's creation.
I've implemented 2-d trees only.  3-d and beyond shouldn't be hard.
Of the queries, I've implemented only the nearest-neighbour queries,
not the fixed-radius searches.  (See src/kdtree.w)
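To give a flavour of why these queries are fast, here is a minimal
nearest-neighbour search on a 2-d tree, in plain C.  This is a sketch
of the standard technique, not the code in src/kdtree.w (which also
handles deletion for semi-dynamic point sets); the struct and function
names are my own.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* A node of a 2-d tree: points are split alternately on x and y. */
typedef struct kdnode {
    double pt[2];               /* the point stored at this node   */
    struct kdnode *lo, *hi;     /* children: below / above the cut */
} kdnode;

static double sqdist(const double a[2], const double b[2]) {
    double dx = a[0] - b[0], dy = a[1] - b[1];
    return dx * dx + dy * dy;
}

/* Recursive nearest-neighbour query.  `dim` is the cutting dimension
   (0 = x, 1 = y); `best`/`bestd` track the closest point seen so far.
   The far subtree is visited only when the splitting plane is closer
   than the current best -- the standard kd-tree pruning rule. */
static void nn_search(const kdnode *n, const double q[2], int dim,
                      const kdnode **best, double *bestd) {
    if (n == NULL) return;
    double d = sqdist(n->pt, q);
    if (d < *bestd) { *bestd = d; *best = n; }
    double delta = q[dim] - n->pt[dim];
    const kdnode *nearer  = delta < 0 ? n->lo : n->hi;
    const kdnode *farther = delta < 0 ? n->hi : n->lo;
    nn_search(nearer, q, 1 - dim, best, bestd);
    if (delta * delta < *bestd)          /* can the far side still win? */
        nn_search(farther, q, 1 - dim, best, bestd);
}

const kdnode *nearest(const kdnode *root, const double q[2]) {
    const kdnode *best = NULL;
    double bestd = DBL_MAX;
    nn_search(root, q, 0, &best, &bestd);
    return best;
}
```

Generalizing to 3-d and beyond amounts to cycling `dim` through more
dimensions.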

The TSP code uses the oriented tour ADT as described by Fredman et al.
I've implemented the ADT in terms of arrays and two-level trees.
(See src/array.w, src/twolevel.w)
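The array representation of the oriented tour ADT is simple enough to
sketch in a few lines of C.  This is an illustration of the general
technique, not the code in src/array.w; the names and the fixed size N
are mine, and a serious implementation would reverse the shorter side
of the flipped segment.

```c
#include <assert.h>

#define N 8
static int tour[N];   /* tour[i] = city in position i */
static int pos[N];    /* pos[c]  = position of city c */

void tour_init(void) {
    for (int i = 0; i < N; i++) { tour[i] = i; pos[i] = i; }
}
int tour_next(int c) { return tour[(pos[c] + 1) % N]; }
int tour_prev(int c) { return tour[(pos[c] + N - 1) % N]; }

/* Does b lie on the path from a to c, walking in next() direction? */
int tour_between(int a, int b, int c) {
    int pa = pos[a], pb = pos[b], pc = pos[c];
    if (pa <= pc) return pa <= pb && pb <= pc;
    return pb >= pa || pb <= pc;     /* the a..c path wraps around */
}

/* Reverse the tour segment from city a to city b inclusive.
   O(n) worst case -- this cost is why large instances want the
   two-level tree representation instead. */
void tour_flip(int a, int b) {
    int i = pos[a], j = pos[b];
    int len = (j - i + N) % N + 1;
    for (int k = 0; k < len / 2; k++) {
        int x = (i + k) % N, y = (j - k + N) % N;
        int t = tour[x]; tour[x] = tour[y]; tour[y] = t;
        pos[tour[x]] = x; pos[tour[y]] = y;
    }
}
```

The two-level tree version supports the same four operations but makes
flips cost roughly O(sqrt(n)) instead of O(n).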

I've also implemented a crude version of Held-Karp lower bounds for
the TSP.  (See src/ascend.w)

Program LK can also be used for approximately solving the minimum
weight perfect matching problem.  The details are simpler than for
the TSP.  I don't know of anyone else applying Lin-Kernighan for 
weighted perfect matching, although Bentley used 2-opt for this
problem.  (See his paper on fast geometric algorithms for the TSP.
He used a 2-opt matching algorithm as a component in his approximate
Christofides algorithm.)
(See src/match.w: it plays the role for matching that jbmr.w does for the TSP)

I invented cluster compensation, and I think it's kind of neat.
(See src/decluster.w)

Fast least common ancestor computation in binary trees is implemented
in decluster.w, using the algorithm described by Schieber.

The chaos game for iterated function systems played in ifs.w is described
in Michael Barnsley's _Fractals Everywhere_.
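As a small illustration, here is the textbook chaos game for the
Sierpinski triangle IFS in C: from the current point, repeatedly jump
halfway toward a randomly chosen vertex.  This is a generic sketch, not
the code in ifs.w; the names and the little generator are mine.

```c
#include <assert.h>

/* Tiny deterministic LCG so runs are repeatable (the same reason lk
   uses gb_flip rather than the system generator). */
static unsigned long ifs_seed = 1;
static int rnd3(void) {
    ifs_seed = ifs_seed * 1103515245UL + 12345UL;
    return (int)((ifs_seed >> 16) % 3);
}

/* Vertices of the triangle; the three IFS maps each contract by 1/2
   toward one vertex.  The visited points fall onto the attractor. */
static const double vx[3] = { 0.0, 1.0, 0.5 };
static const double vy[3] = { 0.0, 0.0, 1.0 };

void chaos_step(double *x, double *y) {
    int k = rnd3();
    *x = (*x + vx[k]) / 2.0;
    *y = (*y + vy[k]) / 2.0;
}
```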


Performance
===========

Program lk is a high quality implementation of the Lin-Kernighan
heuristic.  However, I wouldn't call it "state of the art".  
What do I mean?

I consider the JBMR implementation to be the state of the art
implementation of the Lin-Kernighan heuristic for the TSP.  I use
it as the standard of comparison.  In the latter stages of my doctoral
research another group led by Bill Cook publicized their own extremely
scalable implementation of the Lin-Kernighan heuristic.
(As I write, there is a DIMACS implementation challenge for the TSP,
see http://www.research.att.com/~dsj/chtsp/)

My implementation is "high quality" because its algorithmic behaviour
is similar to that of the JBMR implementation.  For example, the average depth
of the searches is about 3 or 4 beyond the backtracking phase, i.e. to
about t_12 or t_14.  (If you know Lin-Kernighan, you'll know what this
means).  It routinely gets tours that are within 2% of optimal (or
above the Held-Karp lower bound).  It can be run on million-city
uniform geometric instances in reasonable time and space.

However, my implementation is slower than the JBMR implementation.
In tests I ran a while back, my implementation was about twice as
slow as the JBMR implementation.  For example, in about 2 CPU hours it
can find near-optimal tours for million-city instances.  (One SGI
Challenge 150MHz MIPS R4400 processor; using about 250MB of RAM; a
million cities drawn randomly from a uniform distribution over the unit
square; distance between cities is the Euclidean distance; within 2% of
optimal, or 2.5% above the expected Held-Karp bound.)


Rough bits
==========

LK is not finished.  There are a number of things that can be improved.

This code is not yet suitable for inclusion in a library, as there are
a number of global structures that would have to be encapsulated.  It
will take some cleaning up and some rearchitecting before this
happens.  That's not a high priority for me right now.  I'd rather
finish my thesis.

There are a number of experimental coding techniques I tried for the
first time with this implementation.  Aside from scripts, all the
coding was done in CWEB.  That was just great from a developmental
point of view, as the code is almost entirely self-documenting.  However, I
took great stylistic liberties in coding tricks and techniques afforded
by CWEB's section-oriented programming model.  (As opposed to C's
function-oriented programming model.)  For instance, module jbmr.w is
over 4000 lines of CWEB, but it is almost entirely the implementation
of a single function, jbmr_run.  That module also extensively uses
multiple section nesting together with preprocessor tricks to weave
code together.  After ctangle'ing, the resulting jbmr.c file is over
45000 lines long, with much of it effectively #ifdef'd out.  But still
it makes for a very large C function.  It's as if all the code were
inlined together.  I'm relying on the optimizer in the C compiler to do
a good job.  The code speed will likely be reduced on register-poor
architectures, and on machines with small instruction caches.

My initial intention was to run Lin-Kernighan in parallel, so I chose C
over C++ so that I could be assured of greater control over memory
allocation.  That's also why jbmr.w is coded as one big function:  all
the local variables are available with known local addresses, and they
wouldn't conflict with other processes running the same procedure. As a
side effect, this design saves indirection in accesses to local
variables.  If I knew I'd be running this code only in sequential mode,
then I might have used C++ instead.

Some distance functions are not yet implemented.  For example, for a
long time I could not validate my implementation of the GEO distance
metric from TSPLIB.  (Fixed in lk-0.4.17/src/read.w; we follow Concorde
behaviour.)

Also, the kd-trees handle only 2-d instances.  3-d and beyond would not
be hard to implement.

Hash tables can be used for the tabu list check in jbmr.w and match.w.
But the distribution of tabu list lengths is so skewed that you may
want to stick with the plain linear-time tabu check.  You choose
between the two at compile time.
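The plain linear-time tabu check mentioned above is short enough to
sketch.  This is a generic illustration in C, not the code in jbmr.w or
match.w; the names and the fixed capacity are my own.

```c
#include <assert.h>

/* Plain linear-time tabu check: the tabu list is a small array of
   city numbers.  Because the lists stay short and their lengths are
   heavily skewed toward tiny values, a linear scan often beats the
   constant-factor overhead of hashing. */
#define TABU_MAX 64
static int tabu[TABU_MAX];
static int tabu_len = 0;

void tabu_clear(void) { tabu_len = 0; }
void tabu_add(int city) {
    if (tabu_len < TABU_MAX) tabu[tabu_len++] = city;
}
int tabu_member(int city) {
    for (int i = 0; i < tabu_len; i++)
        if (tabu[i] == city) return 1;
    return 0;
}
```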

Yes, I know splay trees can be slow.  But I implemented my dictionaries
in terms of splay trees because I wanted to remind myself of the
details of the splay tree algorithms in preparation for implementing
the oriented tours based on splay trees.  I still haven't done that
implementation.  Also, early on I had some ideas about splay trees
relating to parallelism for Lin-Kernighan, but that has fallen by the
wayside in the meantime.

Also, I often write "declustering" when I mean "cluster compensation".
The latter is a better description of what is going on.

Yes, I know C++.  Perhaps not as well as I should.
C++ and its standard library were not finalized when I started this
code, and C++ environments were neither complete nor entirely
compatible at the time.  A C++ version of this code seems like a
natural step.


Copyrights and Licenses
=======================

David Neto is the author of LK and holds its copyright.  However, LK is
free software.   Since I intend LK to evolve into a library, I am
licensing it under the terms of the GNU Library General Public
License.  See the file COPYING.LIB in this directory.

In this distribution, the "preferred form of the work for making
modifications to it" is the .w file, not the .c and .h files.  The .c
and .h files are generated from the .w file by ctangle (a CWEB tool).
The only exception is src/lkconfig.h, which is not generated from
a .w file.

In the interest of supporting forensic software engineering, I strongly
suggest that you also distribute the RCS files as I have.  At some
point in the future, this will likely become impractical, so I have not
included the RCS file as part of the "preferred editing form" of a
source file.  However, any source file derived from the RCS file is 
derived from the original standalone file, and hence is covered by 
my copyright and the associated license.

Each source file I have written contains the following notice.

   Copyright (C) 1994, 1995, 1996, 1997, 1998 David Neto

   This library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Library General Public
   License as published by the Free Software Foundation; either
   version 2 of the License, or (at your option) any later version.

   This library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Library General Public License for more details.

   You should have received a copy of the GNU Library General Public
   License along with this library; if not, write to the
   Free Software Foundation, Inc., 59 Temple Place - Suite 330,
   Boston, MA  02111-1307, USA.

   You may contact David Neto via email at netod@acm.org, or with
   greater latency at
     Department of Computer Science
     University of Toronto
     10 King's College Rd.
     Toronto, Ontario
     M5S 3G4
     Canada 


Two source files in this distribution are exceptions:

Donald Knuth is the author of src/gb_flip.w, the random number generator 
of The Stanford GraphBase (ftp://labrea.stanford.edu/pub/sgb);
see also _The Stanford GraphBase: A platform for combinatorial computing_,
ISBN 0-201-54275-7, Addison-Wesley, 1993.  As per the copying terms,
gb_flip.w has not been changed from its original distribution.  Changes for
use in LK are made in gb_flip.ch and automatically incorporated by the 
CWEB tools.

Knuth (and perhaps others) is the author of the file cwebmac.tex in the
src directory.  It is required only if you want to typeset CWEB programs 
in that directory.  In that case, you may have to modify your TEXINPUTS 
environment variable so that TeX can find it.


References
==========

@phdthesis { Neto1999,
	fullauthor = "David~M.~Neto",
	author = "D.~M.~Neto",
	title = {Efficient Cluster Compensation For Lin-Kernighan Heuristics},
	school = {Department of Computer Science, University of Toronto},
	month = {May},
	year = 1999
}

@article { LinKernighan1973,
    author = "S.~Lin and B.~W.~Kernighan",
    title = {An effective heuristic algorithm for the traveling salesman
        problem},
    journal = "Operations Research",
    volume = 21,
    year = 1973,
    pages = "498--516"
}


@incollection { JohnsonMcGeoch1997,
    fullauthor = "David S.~Johnson and Lyle A.~McGeoch",
    author = "D.~S.~Johnson and L.~A.~McGeoch",
    title = {{The traveling salesman problem}:
            a case study},
    booktitle = {Local Search in Combinatorial Optimization},
    publisher = {John Wiley \& Sons},
    chapter = 8,
    year = 1997,
    fulleditor = {Emile Aarts and Jan Karel Lenstra},
    editor = {E.~Aarts and J.~K.~Lenstra},
    pages = "215--310",
    series = {Wiley Interscience series in discrete mathematics
        and optimization}
}


@book { Knuth1993,
    fullauthor = "Donald~E.~Knuth",
    author = "D.~E.~Knuth",
    title = {The {Stanford} {GraphBase}: A Platform for
        Combinatorial Computing},
    publisher = "Addison-Wesley",
    year = 1993
}

@Article{SPE::BentleyM1993,
  title =        "Engineering a Sort Function",
  author =       "Jon Louis Bentley and M. Douglas McIlroy",
  journal =      "Software---Practice and Experience",
  pages =        "1249--1265",
  month =        nov,
  year =         "1993",
  volume =       "23",
  number =       "11",
}


@article{ FredmanJohnsonMcGeochOstheimer1995,
    author = {M.~L.~Fredman and D.~S.~Johnson and L.~A.~McGeoch and
        G.~Ostheimer},
    title = {Data structures for traveling salesmen},
    journal = {Journal of Algorithms},
    volume = 18,
    pages = "432--479",
    year = 1995
}


@article { Bentley1992,
    fullauthor = "Jon Louis Bentley",
    author = "J.~L.~Bentley",
    title = "Fast algorithms for geometric traveling salesman problems",
    journal = "ORSA Journal on Computing",
    volume = 4,
    number = 4,
    month = "Fall",
    year = 1992,
    pages = "387--411"
}

@InProceedings{Bentley90c,
  author =       "J. L. Bentley",
  title =        "{K}-d Trees for Semidynamic Point Sets",
  booktitle =    "Proceedings of the 6th Annual Symposium on
                 Computational Geometry",
  pages =        "187--197",
  month =        jun,
  year =         "1990",
}


@article { HeldKarp1970,
    fullauthor = "Michael Held and Richard Karp",
    author = "M.~Held and R.~Karp",
    title = "The Traveling Salesman Problem and Minimum Spanning Trees",
    journal = "Operations Research",
    volume = 18,
    pages = "1138--1162",
    year = 1970
}

@article { HeldKarp1971,
    fullauthor = "Michael Held and Richard Karp",
    author = "M.~Held and R.~Karp",
    title = {The Traveling Salesman Problem and Minimum Spanning Trees:
        Part {II}},
    journal = "Mathematical Programming",
    volume = 1,
    pages = "6--25",
    year = 1971
}


@InCollection{Schieber1993,
    fullauthor =   "Baruch Schieber",
    author =       "B.~Schieber",
    title =        "Parallel Lowest Common Ancestor Computation",
    booktitle =    "Synthesis of Parallel Algorithms",
    editor = "John H. Reif",
    publisher =   "Morgan Kaufmann",
    year =         "1993",
    chapter = "6",
    pages = "259--273"
}

