The Calendar Puzzle continued - Z3 | metterklume (Abhishek Roy)

This is a follow-up to the previous post.

Recall that we condensed the constraints from the puzzle into a single array of 1s and 0s, M. A solution to the puzzle being an exact cover i.e. a subset of rows such that,

No two rows in the subset, have a 1 in the same column.
Each column has a unique matching row with a 1 in the appropriate position.

In this example, the shaded rows form an exact cover.

Exact cover

To find such a solution, we used backtracking and had to sweat a bit to optimise the search.

Instead, we might stand on the the shoulders of giants and use a SAT solver.

A SAT solver is a program that takes a boolean formula and either finds a solution or proves that no solution exists. Formulas are built from Boolean variables, which can take values 0 or 1. The logical operators are AND, OR and NOT.

I chose Z3 as I was vaguely familiar with the name. Z3 actually goes far beyond SAT in solving problems with integers, and even real numbers. But SAT is really all we need. Skimming through this tutorial, I came up with:

#Assume that M and keep_cols have been constructed as in the previous post.
#keep_cols is a boolean array 

from z3 import *
#A list of boolean variables, one for each row. True means that the row is selected in the solution.
rowz = Bools(['row_%d' % i for i in range(M.shape[0])])

s = Solver()
for col in range(M.shape[1]):
    if not keep_cols[col]:
        continue

    #For each kept column, require exactly one intersecting row to be selected
    constraint = [(rowz[r],1) for r in np.where(M[:,col])[0]]
    s.add(PbEq(constraint, 1))

PbEq is an (undocumented!) feature in Z3 that allows one to demand exactly only one of the following variables must be True, in a concise manner. In full generality, it sets a weighted sum of boolean variables equal to a constant. In our case the constant and weights are all 1. Pb apparently stands for Pseudo-Boolean.

In Z3 a solution is also called a model. We check if the constraints are satisfiable and print the model.

print(s.check()) #creates the model automatically
print(s.model())

>>>sat
>>>[row_516 = False,
    row_620 = False,
    row_197 = False,
    row_257 = False,
    ...]

That’s it, we have a solution! All we did was set the problem up and the SAT solver did the rest.

solution = [i for i in range(M.shape[0]) if m.eval(rowz[i])]
plot_solution(M, solution, initial_grid)

What if we want all solutions not just the first one that Z3 stumbled upon? This is where it gets interesting. Stackoverflowing a bit, I found two answers. The naive way keeps adding each solution to the constraints as it is found, effectively saying “show me something I haven’t seen before”. The sophisticated approach splits the search space, fixing some variables and exploring the rest. It also uses coroutines - a fairly advanced feature of Python that is admittedly a bit out of my comfort zone. Both are described in this comprehensive (and intimidating) guide I also learned that the Z3 solver maintains state between calls that ought to enable incremental search, instead of starting over from scratch each time.

Naive

def block_model(s, terms):
    """
    Add a constraint that blocks the current model (solution). 
    The next call to check() will return a different solution.
    """
    m = s.model()
    s.add(Or([t != m.eval(t, model_completion=True) for t in terms]))

def all_solutions(s, terms):
    while sat == s.check():
        m = s.model()
        sol = [i for i in range(M.shape[0]) if m.eval(rowz[i])]
        print(sol)
        block_model(s, terms)

Sophisticated way with coroutines

def all_smt(s, initial_terms):
    def block_term(s, m, t):
        s.add(t != m.eval(t, model_completion=True))
    def fix_term(s, m, t):
        s.add(t == m.eval(t, model_completion=True))
    def all_smt_rec(terms):
        if sat == s.check():
           m = s.model()
           yield m
           for i in range(len(terms)):
               s.push()
               block_term(s, m, terms[i])
               for j in range(i):
                   fix_term(s, m, terms[j])
               yield from all_smt_rec(terms[i:])
               s.pop()   
    yield from all_smt_rec(list(initial_terms))


ss = all_smt(s,rowz)
m = next(ss, None)
while m:
    sol = [i for i in range(M.shape[0]) if m.eval(rowz[i])]
    print(sol)
    m = next(ss, None)

Now comes the surprise.

Time taken to find all 92 solutions for Dec 25 in the calendar

Naive: 6.7s
Smarter: 10min 8s

The simple way is about 10x faster. Why?