EFC, version 1.1 What ---- EFC is a library that can be used for the development and testing of new algorithms to solve n-ary Constraint Satisfaction Problems. The ideas around which it was built are covered in [1]. Most of the widely used algorithms have already been implemented (BT, CBJ, FC, GAC, FCCBJ, GACCBJ). Notable exceptions are BM and BJ, which I did not think were worth the time any more. Even BT and CBJ are only in there because they were so easy to do, not because they are useful. In addition, some more algorithms have been implemented: vsCBJ (value specific CBJ [2]) and its FC and GAC counterparts. In addition, there are several flavours of nogood recording: unrestricted, length-bounded and relevance-bounded. Those can be combined with any of the algorithms mentioned above. If you want to use EFC for your own research, please reference: EFC, http://www.cs.toronto.edu/~gkatsi/efc/efc.html, 2002-2004 Who --- You can send email to the authors: gkatsi@cs.toronto.edu George Katsirelos fbacchus@cs.toronto.edu Fahiem Bacchus The code was originally written by Fahiem Bacchus but has in large part been rewritten by me (George Katsirelos), so for most questions you should send any questions to me. In particular, any omissions in the documentation or bug reports should be sent to me. Moreover, if you need help using the code (for implementing algorithms, constraints, models or anything else), send me an email and I will do my best to help. Prerequisites ------------- What you need before you start compiling: - a c++ compiler with good conformance. I've used gcc 3.3. I don't know whether older versions of gcc will work. - boost (http://www.boost.org). I have used version 1.30.2. Other versions will probably not work. You need to set the BOOST_ROOT environment variable to point to the directory where you have unpacked boost. You do not need to compile any of the libraries (regex, threads, etc). Only the libraries that are totally contained in header files are used. You also need boost jam to build EFC. - CPlan. If you plan to run the planning problems from CPlan, you need to download and compile CPlan from http://ai.uwaterloo.ca/~vanbeek/software/software.html. If you don't plan to use these problems, you don't need this library. You also need to set the environment variable EFC_ROOT to wherever you unpack efc. Compiling --------- Using bjam, it's easy to compile EFC. Simply go to the directory $EFCROOT and type bjam This will compile a debug version of the library as well as the sample programs. To compile an optimized version, type bjam -sBUILD=release The corresponding versions of the library will be in lib/bin/libefclib.a/gcc/debug/runtime-link-dynamic/libefc.a and lib/bin/libefclib.a/gcc/release/runtime-link-dynamic/libefc.a The naming is not ideal, but these are the choices made by bjam. The executables for the examples are placed in similarly named directories under the example directories. Contents -------- lib/ The source code for the library include/ Include files loki/ The Loki library (see below) The rest are sample problems. Note that the loki library from the book "Modern C++ Design" by Andrei Alexandrescu is included for convenience. Naturally, the license of EFC does not apply to this piece of code. (Also note that there a couple of small patches to Loki, as it is distributed with EFC. The patches are relatively minor, but the program will not compile or run without them.) ------------------------------------------------------------------------ ------------------------------------------------------------------------ Using ----- Creating a model ---------------- First, you need to decide what kinds of constraints you need. Then, you must create a subclass of Cons for each type of constraint. For example, assume you need a 'less than' constraint. Then, you would create the following class: class LessThanCons : public efc::Cons { public: LessThanCons(const efc::vararray& v) : Cons(v) { assert(v.size() == 2); } bool checkAssgn(int *vals) const { Solver::ConsistencyCheck(); int v1_ = get_vars()[0]->getid(); int v2_ = get_vars()[1]->getid(); return vals[v1_] < vals[v2_]; } protected: virtual Cons* clone(efc::csp *newc) { return new LessThanCons(get_mapped_vars(get_vars(), *newc)); } }; The elements of the class are as follows: Constructor: You can define as many constructors as you need. However, the parent class, Cons, does not have a default constructor and needs to be passed a vararray in the initialization list of each constructor. The vararray is the list of variables constrained. [ Note that order matters within that list. LessThanCons( make_vararray(v1, v2) ) is a different constraint than LessThanCons( make_vararray(v2, v1) ) ] checkAssgn: The core of this class, it returns whether the constraint is satisfied by a specific (complete) assignment to its variables. Note that this function will never be called with some of the variables that are constrained uninstantiated. It will however be called with other uninstantiated variables, so unless the constraint is over all the variables of the problem, you cannot depend on the values of variables other than v1_ and v2_. The actual values that the variables have been instantiated to are stored at vals[get_vars()[i]->getid()], which can also be written as getval(vals, i). clone: This is a pure virtual function that has to be overriden in your class. The implementation shown above should be enough for any class: just create and return an object that is exactly the same as the current one, only map the variables to the new csp. More constraint examples can be found in the directory include/efc/cons. This contains a small collection of predefined constraints, including LessThan. Make sure that you read further if you want to do more things with constraints, such as define a propagator. Now that this is out of the way, you need to actually define the constraint (hyper-)graph. First, initialize the model: using namespace efc; csp thecsp; thecsp.initCSP(M); // M is the number of variables // in the model Now, you need to set the domain size of each variable for(int i = 0; i < M; ++i) thecsp.getVar(i)->setVar(i, D[i]); where D[i] is the domain size of variable i. For convenience, you can also create vararrays of interesting variables, as follows: vararray mainvars; for(int i = 0; i < M; ++i) { thecsp.getVar(i)->setVar(i, D[i]); mainvars.push_back(thecsp.getVar(i)); } Now, to post a constraint, you just have to create an instance of an existing constraint class. To go back to our LessThanCons example, we would write: thecsp.post(new LessThanCons(make_vararray(mainvars[5], mainvars[7]))); Because that's a bit of a hassle to write, it may be useful to add a convenience function less_than: Cons* less_than(Var *v1, Var *v2) { return new LessThanCons(make_vararray(v1, v2)); } So that now we can post the constraint as follows: thecsp.post(less_than(mainvars[5], mainvars[7])); Now, you are ready to setup the solver. Solver thesolver(thescp); Then, create the DVO heuristic: boost::shared_ptr dvo = get_dvo_heuris(thesolver, "dom+deg"); You can naturally substitute the name of any available DVO heuristic for "dom+deg". Alternatively, you can simply create the class by yourself: shared_ptr dvo; dvo.reset( new dvo::dom_plus_deg_dvo dd(thesolver) ); You can also apply any operations you want on the heuristic at this point, such as asking it to instantiate a subset of the variables only. Finally, get an algorithm: CSPalgorithm csa = getCSPAlgorithm("gaccbj"); Similar to DVO heuristics, you can choose any algorithm instead of gaccbj. Now you are ready to solve the problem. The solver needs a solution processing function (to be called once for every solution found). You can leave it empty: solution_func slf; thesolver.solve(csa, *dvo, slf); ofstream log("log"); thesolver.print_statistics("problem name", log); Alternatively, you can create a solution function that does something, like printing the solution: void print_solution(efc::Solver& solver) { std::cout << "Solution: "; for(int q = 0; q != solver.itscsp.var_size(); ++q) std::cout << solver.savedsolution[q] << ' '; std::cout << std::endl; } and then you can initialize slf as follows: solution_func slf = print_solution; The rest of the code remains unchanged. After thesolver.solve() returns, the last solution found will be in thesolver.solution. The number of solutions will be in thesolver.SOLUTIONS. Note that the variable solver.FINDALL controls whether the search will go on to find all solutions (if it's true) or just the first (if it's false). A somewhat easier way to deal with all this is to use solve_with. Instead of having to write all this mostly boilerplate code, you can instead write a function setup() which does the interesting work of creating the problem and leave the rest to solve_with. The signature of setup() has to be: std::string setup(efc::csp & thecsp, efc::cmdline::arglist & args) In it, you can parse the command line arguments and setup your problem. Parsing is done as follows: - the function efc::cmdline::has_option(args, "myoption") returns true if the commandline contains "myoption". - the function efc::cmdline::has_argoption(args, "myotheroption") returns a pair p. If p.first, then the command line contains "myotheroption arg" and arg can be parsed as a T. So, for example you can have many possible models for the same problem and choose one as follows: std::pair model = efc::cmdline::has_argoption(args, "model"); if( model.first ) { // the model name is stored in model.second } else // choose the default model The function setup has to return a problem name, which will be used for logging. Note that you do not have to deal with selecting an algorithm or DVO or creating a solver or anything else. To use solve with, your main() should look like this: int main(int argc, char *argv[]) { using namespace efc; setup_func sf = setup; solution_func solf = print_solution; restrict_dvo_func dvof; solve_with(argc, argv, sf, solf, dvof); return 0; } The only thing we have not mentioned yet about this is the dvo function, which can be used to restrict the DVO to consider only a subset of the variables of a problem, using the function Heuris::restrict. Aside: Unary Constraints ------------------------ Unary constraints can be defined like higher arity constraints, but they are only enforced when the variable they constrain is about to be instantiated. This may affect the behavior of the dynamic variable ordering heuristic. If you need to have the effects of the unary constraint before the search starts, just remove the values from the variables domain (using Solver::removeVal(), see later). Creating a constraint propagator -------------------------------- You can associate a constraint propagator with a constraint, so that you avoid the cost of running generic GAC algorithm on that constraint, by taking advantage of domain knowledge (i.e. by applying specific rules implied by the constraint). To create a propagator for a custom constraint, you have to override the propagate() function. The signature of that function is bool MyCons::propagate(Solver& solver, Var *var, int level, conflict_map& cmap, bool ccf) For an example of a propagator, see the files in lib/cons/ Some guidelines to writing propagators: - You access all the constrained variables like so: const vararray& vars = get_vars(); vararray is just a vector, so you can use it like that. - You can iterate over all values of a variable using the iterators allval_begin()/allval_end(). You can tell the value's id using i->id() (where i is an iterator) and you can tell whether it's pruned using i->pruned() - You can iterate over all unpruned values in increasing order of id(), using curval_begin()/curval_end(). This is just a filter over allval_begin()/allval_end(). - You can iterate over all unpruned values in no particular order using curval_unsorted_begin()/curval_unsorted_end(), which is faster than the previous method but can get you the values in an arbitrary manner. - You can prune values that you find to be inconsistent by using remove_val(solver, &*i, level, this, cmap, ccf); where i is an iterator and the rest are either part of propagate()'s signature. - You can access a specific value of a variable using Val *vi = v->getVal(j); You can write vi->id() and vi->pruned(), as previously. - propagate() returns false if the domain of a variable has been wiped out (checkable by v->dwo()), true otherwise. In addition to writing propagate, you have to make sure that your function is called when the domains of its variables are pruned. To do that, you have to override the virtual function initialize(): void MyCons::initialize(Solver& solver) { for(vararray::const_iterator i = get_vars().begin(); i != get_vars().end(); ++i) { solver.propagate_on_domain(this, *i); } } The options are propagate_on_domain, propagate_on_lower_bound, propagate_on_upper_bound, propagate_on_bound, which propagate when the domain/lower bound/upper bound/any bound of the corresponding variable changes. In addition, the following options are provided: void propagate_on_domain_percons(Cons *c, Var *v); void propagate_on_lower_bound_percons(Cons *c, Var *v); void propagate_on_upper_bound_percons(Cons *c, Var *v); void propagate_on_bound_percons(Cons *c, Var *v); The difference between the two versions of each function is what is put on AC stack. In the first case, a pointer to the variable is put on the stack. When that variable is processed, the propagate() function of every constraint that has put itself on its propagate_on_* lists is called. This means that the propagate() function of your constraint may be called many times after it has already ensured that the constraint is GAC. In the second case, a pointer to the constraint is placed on the stack and only the propagate function for that constraint is called when the corresponding stack entry is processed. The advantage of this second method is that your propagate() will not be called as often. The disadvantage is that there is an overhead in stack processing. You have to determine empirically what works best for each constraint. Creating a constraint propagator: CBJ ------------------------------------- You also have to write a set_reason function if you want your constraint to work better with algorithms such as gaccbj. If you don't, a generic function will be used instead, which will not in general provide good reasons. In general, if you view a constraint as a set of satisfying tuples, then for each value you have a set of supporting tuples. A value is pruned by GAC when at least one value has been pruned from each supporting tuple. set_reason should identify a set of pruned values that cover the entire set of supporting tuples. A minimal set would be even better. A non-maximal way is usually easily achievable (in terms of complexity), assuming you have propagate() is easy. The guidelines for set_reason() are almost the same as for propagate(). The signature of the function is void MyCons::set_reason(Solver& solver, Val *v, conflict_map &cmap) You should have a prolog like this in your implementation: value_nogood & vicfset = cmap.conflict[v]; vicfset.clear(); nogood_merge_guard guard(vicfset); Then, for each value that covers a support (or many), have a call: merge_with_neg_assignment(solver, cmap, vicfset, &*i, v->prune_level()); where i is an iterator over the values of a variable. Creating a new algorithm - the standard way ------------------------------------------- [This section is slightly outdated. The design has not changed, but often you have to implement some additional functions in your classes. The good news is that the implementation of those is usually empty or something equally trivial.] To create a new algorithm in the "standard" way, you must decide which part of the standard backtracking algorithm's behavior you want to modify. There are 3 choices: - the constraint propagation phase - the backtracking/jumping phase. - at a leaf - at an internal node Then you have to write the propagator or backjumper for the algorithm. A couple of examples will show how to do that: - Implementing an algorithm than maintains Singleton Arc Consinstency. You would then implement your propagator like this: class sgacpropagator { public: void init(efc::Solver& solver) { // Use this function to extract any information from the solver // (or add to it, or preprocess the problem) } int operator()(Solver& solver, Var *vi, Val *vali, int level) { // This function gets called when the assignment vi <- vali // takes place at level 'level'. Use this to maintain // singleton arc consistency } }; That's all that is required. You can access any variable or constraint from the interface of Solver. Now suppose that you want to do something more refined, like preprocessing the problem to be singleton arc consistent, but only maintaing some other level of consistency during search. Moreover, you want to be able to use the code with any other type of constraint propagation available to the user: forward checking, maintaining GAC, even no constraint propagation, as in CBJ. The, you would write your propagator as a wrapper around an existing one: template class sgacwrapper : public virtual Propagator { public: void init(efc::Solver& solver) { Propagator::init(solver); // establish SGAC } int operator()(Solver& solver, Var* vi, Val *vali, int level) { return Propagator::operator()(solver,vi,vali,level); } }; You have to make sure to publically, virtually inherit from the propagator. Public inheritance is necessary, as some propagators (or backjumpers) need to get a specific component of an algorithm and that is accomplished with a dynamic_cast at runtime. If you use protected or private inheritance, the conversion will not be possible. Moreover, you need to use virtual inheritance, because some algorithms need to have cooperating components which need to share data. If the inheritance is not virtual, or if there is not inheritance at all and you just have a member like : Propagator prop; then you will be unable to use your algorithm with these algorithms, as they will not share the data; instead, each component will have its own private copy. Then the next step is to actually define some algorithms that maintain SGAC or that enforce SGAC before solving. This is very simple to do. An algorithm is really a type, which you have to define and register with a global factory: typedef GenericBTAlgorithm SGAC; EFC_REGISTER_CSP_ALGORITHM(SGAC); typedef GenericBTAlgorithm SGAC_CBJ; EFC_REGISTER_CSP_ALGORITHM(SGAC_CBJ); typedef GenericBTAlgorithm, cbjbackjumper, cbjbackjumper, cfsolutionprocessor> SGAC_FC_CBJ; EFC_REGISTER_CSP_ALGORITHM(SGAC_FC_CBJ); typedef GenericBTAlgorithm, cbjbackjumper, cbjbackjumper, cfsolutionprocessor> SGAC_GAC_CBJ; EFC_REGISTER_CSP_ALGORITHM(SGAC_GAC_CBJ); This is all that is required. Note that for all but the first algorithm to actually work, the propagator needs to set a value-no-good for each value that it prunes. You can create two versions of your propagator if you wish, but there would probably be no significant performance improvement. Note that defining a new propagator is not the way to handle constraints that can be propagated more effectively than with GAC. Instead, you define a constraint propagator (I know, the naming needs improvement), as discussed in the previous section, and attach it to each relevant constraint. The procedure for creating a new backjumper is similar, only in this case the information that the solver passes to the components is the variable whose domain has been exhausted. Then the operator() becomes: int operator()(Solver& solver, Var *vi, int level); which means that the domain of variable vi has been exhausted at level 'level'. The only difference between an internal backjumper (the first of the two backjumpers defined in a typedef above) and a leaf backjumper is that in the second case all of the values in the domain of that variable have been pruned as a result of constraint propagation, not search. You may or may not wish to differentiate your backjumper's behaviour between these two cases. If not, you only need to define one backjumper and use it in both places. The operator() needs to return which level the solver should jump back to. A chronological backjumper always returns level - 1. The only other requirement placed on the backjumper is that whenever it manipulates the conflict set of a value, the maximum level in that conflict set needs to also be the prune level of that value. So, if you have written a backjumper called SuperBackjumper, you can define an algorithm that uses it like this: typedef GenericBTAlgorithm FC_SUPER; EFC_REGISTER_CSP_ALGORITHM(FC_SUPER); and so on for other propagators. Similarly, if the behaviour in a leaf node differs from that in an internal node, the definition becomes: typedef GenericBTAlgorithm FC_SUPER2; EFC_REGISTER_CSP_ALGORITHM(FC_SUPER2); Creating a new algorithm - the other way ---------------------------------------- [Once again, some parts of this may be slightly outdated. Certainly, it must be even harder right now to develop an algorithm outside the mentioned framework] There are a few algorithms that cannot be implemented using the techniques discussed above. For example, you cannot implement local search or dynamic backtracking using this approach. In this case, you have to take the more general route of implementing a subclass of class BTAlgorithm. To do that, you have to override the following pure virtual functions in your subclass: - void init(Solver&) Similar to what you do with custom, propagators and backjumpers you have to initialize the algorithm to work with the problem associated with the solver. If you are going to use any of the generic components used in GenericBTAlgorithm, this is the place where you have to call these components' init() function as well. - int operator()(Solver& s, Heuris& h) Solve the problem. Store the latest solution found in s.solution and the number of solutions in s.SOLUTIONS (find the first solution if FINDALL==0, otherwise find all solutions). If you need to choose which variable to instantiate next, use h. - int search(Solver&, int level, Heuris&) Same as operator(), but treat the first level assignments already recorded in s.solution as "untouchable". (In other words, search subtree in backtracking terms.) - int propagate(Solver&, Var *varp, Val *valp, int level) Assuming that assignment varp<-valp was made at level 'level', do any constraint propagation. - Heuris& get_heuris() just return the heuristic passed to operator()() or search(). - boost::shared_ptr clone(); clone the object. Most of the time, it's just return boost::shared_ptr(new MyAlgorithm(*this)); Besides these requirements, all bets are off. See GenericBTAlgorithm for an implementation. Advice: Even though you don't have to, it would be best if you tried to follow the road of GenericBTAlgorithm, by allowing the use of the same propagators and backjumpers. So, if you were to, say, implement dynamic backtracking, you could reuse the same components to almost instantly get more than 50 algorithms. The alternative would be to reimplement these from scratch. In addition, even if you think it would be hard to use the standard way of writing new algorithms to implement your new one, it might be worthwhile to contact me to see whether/how the generic algorithm could be extended to allow the new one to be implemented more easily. Other features -------------- GACConds -------- [Outdated] You can attach a gaccond to a constraint, in order to restrict when GAC is enforced for that constraint. The condition can be something as simple as the commonly used "only when at most N variables are uninstantiated" to "never when variable X is uninstantiated", or even "Only when variable X is instatiated to values aX, bX or cX". See file gacconds.h for examples. DVO heuristics -------------- [Partially outdated. There are two generic dvo heuristics in genericdvoimpl.h: genericdvo and genericrandomdvo. The first one will call the scorer for each uninstantiated variable and return the one with the smallest score. Automatically, variables with degree==0 have MAX_FLOAT/2 added to their score, so that they are selected last (by definition, a variable current degree 0 cannot influence the problem by propagation or other means, therefore it can only slow things down if instantiated before ones with a non-zero degree). Also, the heuristic will select a variable with a current domain size of zero, even if there are other variables with a better (smaller) score. This is needed, because the generic algorithm does not detect domain wipe outs caused by constraint propagation until the next level. This allows the internal and leaf backjumpers to be the same in most cases, but places this restriction on the heuristic. If there is a tie, the lexicographically first variable with minimum score is returned. The values for that variable will be instantiated in the order specified by ValueOrderer, which is a function object that gets a vector of assignments and reorders it. See the file include/efc/dvo/orderers.hpp for examples. The second generic heuristic will randomly select a variable to return. Variables are however weighed by probfunc(maxscore - scorer(variable)). So, the probability that a specific variable will be returned is proportional to its difference from the maximum (worse) score. Moreover, that probability is scaled by probfunc. Additionally, the same restrictions mentioned for genericdvo apply to this dvo as well (wrt zero domain size and zero degree variables). Finally, the DVO can be strict or non-strict: strict means that the variable returned has to have minimum score. Essentially, strict mode will randomly return one of the variables with minimum score, all with the same probability, regardless of the probability function. For a list of available scorers see include/efc/dvo/scorers.hpp For a list of available probability functions see include/efc/dvo/genericrandomdvo.hpp. Most DVOs are defined in include/efc/dvo/stock_dvos.hpp, include/efc/dvo/stack_ac_dvo.hpp and include/efc/dvo/uselogdvo.hpp References ---------- [1] A Uniform View of Backtracking, Fahiem Bacchus, Unpublished Manuscript http://www.cs.toronto.edu/~fbacchus/on-line.html [2] Extending Forward Checking, F. Bacchus, Principles and Practice of Constraint Programming--CP 2000, pages 35-51, 2000.