How to improve the constraints that minimize the number of different elements in an array

Hi all,

Your help is very much appreciated on how I should write the constraints.

M is a 2^N{\times}\frac{N(N-1)}{2} matrix of positive constant integers. E is a \frac{N(N-1)}{2}{\times}(N-1) matrix of binary variables. T=ME is a 2^N{\times}(N-1) matrix that I am trying to optimize. Each row of E has a single 1 and N-2 zeros.

This is basically partitioning the columns of matrix M into disjoint sets. The columns in T are the sum of the M columns in each partition. The objective is to minimize the number of different elements in the T columns (I am trying different things such as sum or minimizing the max, but that is irrelevant to my problem here). In order to find the number of different elements, I am currently comparting every two elements on each column but that results in the number of variables/constraints increasing by 2^{2N}. What I currently have does work as expected for for N<10, but I need to solve the same problem for larger N values.

A side information that might help formulating this better, is that I know there are solutions where the maximum number of different elements in T columns is less than let us say 100, therefore even for N=10 where the current method tries to define a million variables and a million constraints just to figure out how many different elements each column has, I know that the solution that I am looking for will have at most 100 different elements.

Any pointers to relevant problems in literature is also very helpful.

Thanks a lot for in advance

There may be some more efficient possibilities. But first can you say more about how large an N value you want to be able to handle?

Also, it is a concern that the data matrix M quickly gets very large as a function of N. Even for N equal to 30, it has 467,077,693,440 elements, which will likely take too much time to generate and too much space to store. Is there any pattern to the elements of M that would make the matrix easier to work with?

With regards to the scale of the problem, let us say N\leq{24}. For larger values, I am only interested in finding “a” solution and not necessarily the optimal answer.

The elements of M are indeed structured. Each row corresponds to a binary representation of the numbers 0\cdots{2}^N-1. Each column is the AND of two bits of the binary representation, see my code below. This code has a few additional details that were omitted in the previous description.

With this code, the solver starts to struggle even with N=10. FYI, I have written a python code to perform the search directly and so far have found some solutions even for N=20 (I don’t know about their optimality though).

I think, a promising direction to improve the speed would be utilizing the fact that for N=10, T_cnt_max \leq 100. I should probably try to calculate the cardinality of the 2^N-element columns in multiple steps instead of doing it in a single shot i.e. break the columns into smaller portions, determine the cardinality on each chunk and then combine the results. Doing it that way the number of variables/conditions for T_eql_m does not scale by 2^{2N} (it would be something like 100\times{2^N} which is still better).

Any other suggestions ? Is there anything else that I can improve in the code below ?

option solver './ampl/cplex'; option cplex_options 'mipdisplay=2 populate=1';

param N > 0 integer;
param M_max = 2^(N-3);

set diag := 1..N-1;
set rows := 0..2^N-1;
set cols := 0..N*(N-1)/2-1;
set rnge := 0..M_max;

param M {rows,cols} >= 0 integer;

# How to define SOS1, does it speed up the solver ?
var E {c in cols,d in diag} binary; subject to E_cond {c in cols} : ( sum {d in diag} E[c,d] ) == 1;

# Partitioning M columns based on E into N-1 subsets, the columns in each subset are added together to create T
var T {r in rows,d in diag} = sum {c in cols} M[r,c]*E[c,d];

# T_mod = T mod 2^(N-2) (largest value in T is 2^(N-1))
var T_div {r in rows,d in diag} integer >= 0, <= 3;
var T_mod {r in rows,d in diag} = T[r,d]-T_div[r,d]*2^(N-2); subject to T_mod_cond {r in rows,d in diag}: 0 <= T_mod[r,d] <= 2^(N-2)-1;

# T_adj = min(T_mod,2^(N-2)-T_mod)
var T_sel {r in rows,d in diag} binary;
var T_adj {r in rows,d in diag} integer; subject to T_adj_cond_1 {r in rows,d in diag} :         T_mod[r,d]-2^(N-2)*   T_sel[r,d]  <= T_adj[r,d];
                                         subject to T_adj_cond_2 {r in rows,d in diag} :                                              T_adj[r,d] <=         T_mod[r,d];
                                         subject to T_adj_cond_3 {r in rows,d in diag} : 2^(N-2)-T_mod[r,d]-2^(N-2)*(1-T_sel[r,d]) <= T_adj[r,d];
                                         subject to T_adj_cond_4 {r in rows,d in diag} :                                              T_adj[r,d] <= 2^(N-2)-T_mod[r,d];

# T_eql_m = T_adj >= m && T_adj <= m (this is the part that should be improved)
var T_mte_m {r in rows,d in diag,m in rnge} binary; subject to T_mte_m_cond {r in rows,d in diag,m in rnge}: 0<= T_adj[r,d]-m+(2*M_max+1)*(1-T_mte_m[r,d,m]) <= 2*M_max;
var T_lte_m {r in rows,d in diag,m in rnge} binary; subject to T_lte_m_cond {r in rows,d in diag,m in rnge}: 0<= m-T_adj[r,d]+(2*M_max+1)*(1-T_lte_m[r,d,m]) <= 2*M_max;
var T_eql_m {r in rows,d in diag,m in rnge} binary; subject to T_eql_m_cond {r in rows,d in diag,m in rnge}: 0<= T_mte_m[r,d,m]+T_lte_m[r,d,m]-2*T_eql_m[r,d,m] <= 1;

# T_eql_m_all = T_eql_m(r=0) || T_eql_m(r=1) || ... || T_eql_m(r=2^N-1)
var T_eql_m_all {d in diag,m in rnge} binary; subject to T_eql_m_all_cond {d in diag,m in rnge}: 0<= 2^N*T_eql_m_all[d,m] - ( sum {r in rows} T_eql_m[r,d,m] ) <= 2^N-1;

# T_cnt is the number of different elements in each column (cardinality of each column)
var T_cnt {d in diag} = sum {m in 0..M_max} T_eql_m_all[d,m];

# Different objectives
var T_cnt_max integer >= 0; subject to T_cnt_max_cond {d in diag}: T_cnt[d] <= T_cnt_max;
var T_cnt_sum = sum {d in diag} T_cnt[d];

minimize cost: T_cnt_max*1000+T_cnt_sum;
problem find_sol: E, E_cond, T, T_div, T_mod, T_mod_cond, T_sel, T_adj, T_adj_cond_1, T_adj_cond_2, T_adj_cond_3, T_adj_cond_4,
                  T_mte_m, T_mte_m_cond, T_lte_m, T_lte_m_cond, T_eql_m, T_eql_m_cond, T_eql_m_all, T_eql_m_all_cond,
                  T_cnt, T_cnt_max, T_cnt_max_cond, T_cnt_sum, cost;

let N := 8;

param r;
param B {0..N-1};
for {k in 0..2^N-1}
  let r := 0;
  for {b in 0..N-1}
    let B[b] := ( k div 2^b ) mod 2;
  for {j in 1..N-1}
    for {i in 1..N-j}
      let M[k,r] := 2^(N-1-j)*B[N-i]*B[N-i-j];
      fix {d in diag: d <= i-1} E[r,d] := 0;
      fix {d in diag: d >= i+j} E[r,d] := 0;
      let r := r + 1;

fix {c in cols,d in diag: c >= N-1 and (d mod 2) == 1} E[c,d] := 0;

solve find_sol;

#display T;
#display T_div;
#display T_mod;
#display T_adj;
display cost, T_cnt_max, T_cnt_sum, T_cnt;

Of course, as long as your formulation grows exponentially in the numbers of variables and constraints, you are going to “hit a wall” at some N value, beyond which you get no useful results from optimization. It seems that you wall is currently around N = 10, but with some refinements you might increase it somewhat.

On the AMPL side, here are a few possibilities that come to mind:

  • You can drop integer from the definition of var T_cnt_max since the minimization should cause that variable to be an integer anyway.

  • The only numbers in M are zeros and powers of 2, so maybe you can write the AMPL expression for T=ME in a way that does not require building the entire matrix.

  • A substantial number of variables and constraints are removed by AMPL’s presolve (see below). These occupy a lot of memory, even though they are eliminated before the problem is sent to the solver. Some speedup might be achieved if it is possible to avoid generating some of these variables and constraints.

Presolve eliminates 2690006 constraints and 2676511 variables.
Substitution eliminates 6773 variables.
Adjusted problem:
930965 variables:
	927330 binary variables
	3634 integer variables
	1 linear variable
935550 constraints, all linear; 2532840 nonzeros
	24 equality constraints
	9776 inequality constraints
	925750 range constraints
1 linear objective; 509 nonzeros.

Also I noticed a few things in CPLEX:

  • CPLEX’s preprocessing phase starts to get bogged down in its “probing” step at N=9. To keep going, you can turn off probing by adding probe=-1 to the cplex_options string.

  • If you set a time limit or interrupt CPLEX, it will return the best solution that it found, which may be sufficient for your purposes.

  • CPLEX generates an unusually large number of cuts, and at the end it lists how many of each kind were kept in the problem (see below). The cuts don’t seem to help all that much, so you might try suppressing some of the more numerous ones. For example, impliedcuts=-1 suppresses generation of implied bound cuts.

        Nodes                                         Cuts/
   Node  Left     Objective  IInf  Best Integer    Best Bound    ItCnt     Gap

*     0+    0                        15041.0000     3013.0000            79.97%
      0     0     3049.6682  4855    15041.0000     3049.6682       17   79.72%
      0     0     7021.0469  4733    15041.0000    Cuts: 7066     6936   53.32%
      0     0     7026.9891  4608    15041.0000    Cuts: 5444    10054   53.28%
      0     0     7108.5833  4828    15041.0000    Cuts: 3372    11933   52.74%
      0     0     7279.6359  6299    15041.0000    Cuts: 2169    15422   51.60%
      0     0     7526.5000  5777    15041.0000    Cuts: 5717    18955   49.96%
      0     0     7526.5000  5622    15041.0000    Cuts: 2441    20828   49.96%
      0     0     7526.5000  5750    15041.0000    Cuts: 1694    22403   49.96%
      0     0     7526.5000  5718    15041.0000    Cuts: 1834    23981   49.96%
      0     0     7526.5000  5665    15041.0000    Cuts: 1778    25104   49.96%
      0     0     7526.5000  5586    15041.0000    Cuts: 1337    26339   49.96%
      0     0     7526.5000  5713    15041.0000    Cuts: 1450    27544   49.96%
      0     2     7526.5000  5650    15041.0000     7526.5000    27861   49.96%
Elapsed time = 44.33 sec. (23825.80 ticks, tree = 0.02 MB)
GUB cover cuts applied:  106
Clique cuts applied:  26
Cover cuts applied:  444
Implied bound cuts applied:  9535
Flow cuts applied:  1010
Mixed integer rounding cuts applied:  2337
Zero-half cuts applied:  2268
Gomory fractional cuts applied:  2

Of course, if you could develop a reformulation that uses fewer variables and/or constraints, that would be valuable, too. That would require further research, however.

Thanks a lot for the very valuable feedback. Solvers have always been a black box for me so what you kindly shared is definitely very helpful. Do you happen to have a better reference/pointer that describes more details on how to optimize the solver performance by looking at how they progress over time ? I usually ignore the log that they generate (besides the best solution, best lower bound and the gap) but if there are indicators that would help adjusting the problem so that they can progress faster it would be a great help.

I think I have found a way to avoid iterating over 2^N using generator functions (see here for example). Basically there is a one to one map between the number of distinct members in the set of M columns and the number of distinct terms in a polynomial expansion and the latter has been studied quite a bit in the ILP context and combinatorics. I have not found the full solution yet but that looks like an angle that could be pursued.

Thanks again.

You could consult CPLEX Performance Tuning for Mixed Integer Programs, which describes a variety of non-default parameter settings that can be useful. Although this page has not been updated in a long time, its suggestions are still helpful. To tell whether any particular suggestion is useful for your model, however, you’ll need to run some tests to see how it performs with different values of N.

Thanks a lot for the link, very much appreciated !

Thanks for the solution, this is very helpful for me.