Canonicalizing {0,1}-matrices with Nauty

Previously on this blog:

In the process of deciding that I ought to build a file of small \(d\)-separable matrices, I went down a rabbit-hole that in hindsight wasn’t all that related. I decided that it might be useful to have some way of telling whether a given testing strategy for \(t(n,d)\) (that is, a given \(t\times n\) matrix) was “novel” or merely equivalent to a matrix that I’d already found.

Asking “Is testing strategy \(T_1\) equivalent to testing strategy \(T_2\)?” is equivalent to asking “Can the \(t\times n\) matrix \(T_2\) be derived from \(T_1\) merely by permuting \(T_1\)’s rows and then its columns?” And that question is equivalent to asking “Is graph \(G_1\) isomorphic to graph \(G_2\)?”, where \(G_1\) is the bipartite graph on \(t+n\) vertices whose adjacency matrix is \(T_1\), and \(G_2\) is the bipartite graph on \(t+n\) vertices whose adjacency matrix is \(T_2\).

Graph isomorphism is a hard (but not NP-hard) problem. Fortunately, as a popular practical problem, there exist software tools for it!

So I downloaded Nauty/Traces, an open-source software library by Brendan McKay and Adolfo Piperno, and I wrote some code to canonicalize a Wolves and Sheep testing strategy (or any other {0,1}-matrix). Put in two matrices that are equivalent under row-and-column-permutation, get out two identical matrices. Neat! Find my code on GitHub here.

I also submitted the nauty package to Homebrew and got it accepted, so if you’re on Mac OSX you can just type brew install nauty and you’ll be ready to compile my matrix-canonicalization code.

For more info on Nauty, see:

Nauty’s C API is reasonably clean, but it took me a lot of experimentation to figure out its quirks.

The main entry point for our purposes is densenauty(g, lab, ptn, orbits, options, stats, m, n, canong). (There’s also a sparsenauty API, but I didn’t try to use it.)

m is simply SETWORDSNEEDED(n) (i.e., n divided by the word size and rounded up). I don’t really understand why this needs to be its own parameter, but it is.

g is the graph you’re trying to canonicalize, represented as an adjacency matrix. Except that Nauty doesn’t do matrices; it does sets. So g is really an array of n sets, where each set consists of m SETWORDs. To see whether vertex vi is adjacent to vertex vj, you would test ISELEMENT(g[vi], vj)… except that Nauty doesn’t do multidimensional arrays either! So actually g is an array of n*m SETWORDs, and you have to index into it manually. What you actually write to test vi’s adjacency to vj is ISELEMENT(g[vi*m], vj).

lab and ptn define the graph’s vertex coloring, but in a weird way (which, to be fair, is mostly documented). Each of lab and ptn is an array of int. For example:

lab: 2 5 4 0 3 1 6
ptn: 1 1 0 1 0 1 0

This example makes vertices {2,5,4} red, vertices {0,3} blue, and vertices {1,6} green. lab should be a permutation of the integers \([0,n)\), and ptn should consist of blocks of zero-or-more consecutive 1s followed by single 0s. Each ptn[i]==0 indicates that a color-partition ends with element i. (If you set ptn[n-1]!=0, then I assume bad things happen.)

Nauty’s default behavior is to make all the vertices the same color. It will do this even if you pass in a different coloring! That’s right, Nauty will by default ignore what you pass in for lab and ptn! To make Nauty respect your settings of lab and ptn, you must set options.defaultptn=false.

So if you want all your vertices to be the same color, can you just pass nullptr for lab and ptn? No, of course not. Nauty wants to use those two arrays for scratch space. You must pass in non-null buffers, and they must be writeable (non-const), even if you don’t care about vertex coloring. Also, be aware that Nauty will trash lab and ptn as part of its operation. (Specifically, it may change any non-zero element of ptn. It will never change any element of ptn whose value you have set to 0.)

orbits we don’t care about, but again it cannot be nullptr — it’s used as scratch space.

options we can mostly leave alone, except that we must set options.defaultptn=false. The documentation shows another promising-looking option — options.getcanon=true — but it turns out that this is a red herring. Even though we are trying to canonicalize our graph, we do not want to set options.getcanon, nor do we want to pass in anything for the canong parameter. (That’s right, canong is the one parameter which is allowed to be nullptr!)

When densenauty returns, it will have permuted our input labeling, lab, into a canonical ordering. Each color-partition will be permuted only within itself. So in our example above, we might end up with:

lab: 5 4 2 0 3 6 1

Don’t look at ptn; it’s garbage at this point. And don’t look at canong — it’s a red herring. Look at the original graph g, which (as far as I can tell) Nauty does not trash or modify in any way. If you take its vertices in the order given by lab, then you get the canonical graph we’re after!

This description is probably about as confusing as the real docs at this point; so if you got here looking for Nauty/Traces code to canonicalize a (non-directed, dense, perhaps bipartite) graph, you’ll probably want to read my code on GitHub here.

Posted 2020-01-11