mercredi 13 octobre 2010

The importance of redundant constraints: Modeling a Bin-Packing

What makes CP really fun is that modeling is an art and you have to understand (or at least have a good feeling) of how the propagation will behave when you write a model.
Detractors of CP will say it's a weakness, Pascal Van Hentenryck will say that "CP is a smart tool for smart people" ;-)
When I wrote 5 years ago my second CP model (first one was n-queens) it was the Balanced Academic Curriculum Problem (BACP).
I started to read the paper "Modelling a balanced academic curriculum problem" (CPAIOR2002) by Brahim Hnich , Zeynep Kiziltan and Toby Walsh. The experiments in this paper show that the problem is really difficult (they cannot prove optimality with an hybridization of CP and MIP).
In reality this problem is not difficult for CP but like many problems, it has bin-packing component and you have to model it correctly. I understood this when I read the very good paper of Paul Shaw: "A Constraint for Bin Packing" (CP2004).

A bin packing constraint implies some sized items that you must place into some capacitated bins. So you have one variable for each item that tells you in which bin you place the item and one variable for each bin telling you how loaded is the bin. A ground instance is bin-packing([1,2,1,3],[3,4,1,5],[4,4,5]) with 5 items placed in bins [1,2,1,3] with sizes [3,4,1,5] so that the load of bins 1,2,3 are [4,4,5].

A model for this constraint in google cp-solver is the following:

def BinPacking(cp, binvars, weights, loadvars):
'''post the load constraint on bins.

constraints forall j: loadvars[j] == sum_i (binvars[i] == j) * weights[i])
for j in range(len(loadvars)):
b = [cp.IsEqualCstVar(binvars[i], j) for i in range(len(binvars))]
cp.Add(cp.ScalProd(b, weights) == loadvars[j])
cp.Add(cp.SumEquality(loadvars, sum(weights)))

For all bins j, you say that the load is equal to the sum of the weights of items placed into that bin j. This is enough to model the semantic of the constraint but not enough for the pruning. It is very important to add the redundant constraints that the sum of the load variables is equal to the sum of the total size of the items. If you don't do that, there is not communication between the bins in the cp store. Consider this example to realize this:

You have 5 items with size (6,5,5,4,4,4,2) to place into three bins with capacity 10. There is clearly a solution: (5,5) in bin one, (4,4,2) in bin two and (6,4) in the third one. Assume now that you have a partial solution where one item of 5 and one item of 4 is placed into bin one. Clearly this partial solution cannot lead to a final solution since you know that the bins must be completely packed at the end (30 units to place and three bins with capa 10). Unfortunately you can see that the first bin cannot be packed completely with this partial solution. Interestingly this inconsistency can only be detected with the redundant constraint "the sum of the load variables = sum of the total size". Indeed, the load of bin 1 is at most 9, and the load of bins 2 and 3 is at most 10 so at most the total load will sum to 29, which is smaller than 30 => fail!

Now if you want to experiment (like the authors of the first BACP model) how difficult BACP is without this redundant constraint, you can comment the redundant constraint in the google-solver cp model available here

If you don't forget the redundant constraint, you'll solve the problem in 68ms and 1000 failures with a default first fail heuristic. In conclusion this problem is trivial for CP and I hope if you read this that you will never forget to post this redundant constraint to model a bin-packing like me 5 years ago ;-)

Note that this problem has been generalized to be more realistic here. I think this new problem is now challenging for CP.

mardi 12 octobre 2010

Improvement or Innovation ?

There is currently a discussion on the blog of Michael Trick about the LDS (Limited Discrepency Search) which has neither any relation with LSD nor with Lucy, Diamond or Sky, but which comes from a request of Matt Ginsberg asking Christian Schulte for changing the licence of Gecode because the use of Limited Discrepency Search for solving some job scheduling problems is patented.

The same story happened when I was at ILOG. So, M. Ginsberg seems to like repeated business :-) I don't know exactly what was the agreement between ILOG and M. Ginsberg. Everybody knows that  ILOG removed the
LDS acronym and name from all the documentation and from the product, but there are some unknown funny stories (because cp is fun) behind that old problem.

- we discovered that we didn't correctly understand the LDS and what we implemented was  not the LDS. Thus, we name it Slice Based Search and it was published under the name Discrepancy-Bounded Depth First Search (J. Christopher Beck and Laurent Perron)
- all that stuff does not really work well. I think that there is no more any CP model of benchmarks using it.
- LDS is clearly outperformed by restarts with randomization.

However, there are some other lessons that are interesting:

First, it is not because it is open source that you have the right to copy everything and put it into your product.

Second, M. Ginsberg is right when he says :"But if a commercial company wants to sell an implementation of LDS for use in job scheduling, it is — in my opinion — not unreasonable for the people who invented LDS to be compensated.". With such an idea I could become rich, especially if I create a company with N. Beldiceanu :-) But how can we implement this idea ? By patents ? I am not sure it is the good solution. We could accept this only when the idea is really an invention, that's mean it is strongly original. In other words, it is an inovation and not only an improvement. Because if it is an improvement we should also compensate the people who introduce the idea before, and so on... But how do we distinguish these two notions? In the comments of Mike's blog we can see that this is the point! T. Walsh proposed the Left First Search : “Thus, in searching the tree we want to minimize right arcs … We call a search algorithm which does this left-first search. At the n+1-th ply of the search we explore all those nodes whose path back to the root includes n right arcs”. Then Toby proposed an exponential algorithms. If you recognize LDS in Toby's words then LDS is an improvement otherwise it could be seen as an inovation...

At the end, I hope that M. Ginsberg made more money by the breakthrough brought by LDS over other search methods than by the patent and the threats associated with it...

lundi 11 octobre 2010

May the force be with you!

I found another vendor of the future headset interface!
It can help you to control the force:

Even if they make toys, the company seems to be serious

I should try one of these headsets and keep you inform about my tests :-)

[Edit] If you have an iphone, you can immediatly order one :

lundi 4 octobre 2010

CP is fun

CP is fun is the title of this blog.
I needed to find quickly a title when I created the blog and I thought to this sentence. Ten years ago, when I gave talk at Informs, I always finished my talks by this sentence in order to convince people to be more interested in CP. Ten years later I am not sure that I was successfull, except if I consider that Irv Lustig told me once that he really enjoyes when he defines CP models.

When I read Pierre's posts, I really think that CP is fun because we can do a lot of things.
However, one sentence of Pierre drew my attention to optimization languages :
"When you take a close look at the line:
solver.Add(sum([10**(n-i-1)*x[i] for i in range(n)]) == nb)
It is difficult to argue that it is very far from dedicated optimization languages!"

I think the association of the concept of objects and libraries kill a lot of languages. Consider C++, do you want to have a kind of Lisp ? Use a library which manipulates lists like lisp! Do you want to use iterations with mapping functions ? Use  the stl or meta-programmation! In addition, you will benefit from all the advantages of the recent compilers and IDE. I think it is really difficult to obtain such a result with a dedicated language.
Thus, I strongly support Pierre.
However, this is not the best interface we can imagine, because I think that I found it.
Look at this website
Being able to model and to solve some problems with CP while using their products will definitively be a very good way to prove that CP is really fun!

samedi 2 octobre 2010

Open positions

You would like to work on the resolution of problems like green-it, cloud computing or program verification with constraint programming?
You would like to work on industrial problems like some of Google?

We have some open positions at the University of Nice-Sophia Antipolis at the I3S department. We have some grants for master students, phD or postdoctoral stay.
The dates are flexible.

Interested ? Feel free to send me a message :

vendredi 1 octobre 2010

Multiplication vs Exponent Constraints

In the Dudeney problem we have the constraint:
solver.Add(nb = s*s*s)
and the model that I have shown previously finishes with 92 fails with a labeling on x.
But behind the scene, google solver is smart because it doesn’t post multiplication constraints only. The first multiplication introduces a variable (call it t) with the constraint (t==s^2) (an exponent two constraint). Then the multiplication constraint (nb==t*s) is posted.

Is it important? Yes for the pruning of the domains. And CP is all about pruning!

Let’s consider some domains on our variables to realize this. Assume that s has a domain [1..10] and t has a domain of [25..100]. Then you can prune the interval of s to [5..10] with the power two constraint. With a simple multiplication you can only prune it to [3..10].

If you want to see the effect on the Dudeney problem with only multiplication we can post this such that the solver has no way to know that a power two constraint can be posted:

s1 = solver.IntVar(range(1,9*n+1),'s1')
solver.Add(s1 == s)
solver.Add(nb == s*s1*s)

And now you get 5821 fails!

Can we do better in this case?
Yes if you have a power three constraint ;-) But this is generally not implemented in solvers because it’s not so used.
You can also do better on the Dudeney problem with a labeling on another decision variable but this is another story...