Performance difference between defined variables and s.t. or syntactic sugar?

I appreciate how defined variables, e.g., var x {...} = ...; help keep my AMPL code maintainable. However, for performance reasons, should I prefer something like var x {...}; s.t. X {...} : x[...] = ...; instead? How does AMPL treat these behind the scenes? Does the presolve benefit from one formulation over the other?

I ask because I am seeking ways to speed up my code in terms of time spent generating my models and time spent evaluating constraints in the solver. Are there any performance differences in choosing one formulation over the other when it comes to model generation and solver function evaluation?

The short answer is that “defined variables”,

var x {...} = ... ;

and “defining constraints”,

var x {...};
subject to X {...} : x[...] = ... ;

are handled differently in AMPL. The effects of this difference on both generation and solve time are highly problem-dependent, and you should experiment with both to determine what works best for your particular model.

The difference begins in generation, where AMPL converts a model and data into a problem instance that is written to a file in “.nl” format.

Defined variables are substituted out of the problem. At each place where a defined variable appears in an objective or constraint, it is replaced in the .nl file by a pointer to another part of the file where its definition is given. Variables in the definition are treated as nonlinear variables.

There is one exception. If the definition of the variable is a linear expression, and it appears in a linear objective or constraint, then the coefficients of the variable’s definition are substituted directly into the linear objective’s or constraint’s coefficient list. Variables in the definition can thus be treated as linear. (This exception can be turned off by setting AMPL option linelim to 0.)
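As a minimal sketch of the linear case described above (the set, parameters, and variable names are hypothetical, chosen only for illustration):

```ampl
# Hypothetical model; all names here are illustrative assumptions.
set PROD;
param cost  {PROD} >= 0;
param price {PROD} >= 0;
param cap >= 0;

var Make {PROD} >= 0;

# Defined variable with a linear definition.
var Profit {p in PROD} = (price[p] - cost[p]) * Make[p];

# Profit appears only in a linear objective, so its coefficients can be
# folded directly into the objective's coefficient list.
maximize TotalProfit: sum {p in PROD} Profit[p];

subject to Capacity: sum {p in PROD} Make[p] <= cap;

# Turn the linear substitution off, as mentioned above, to compare behavior.
option linelim 0;
```

Generating the instance once with and once without the last line is one way to see whether the linear substitution matters for a given model.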

Defining constraints are treated like other constraints. Thus the variable being defined remains as a variable in the problem, and the definition adds a constraint to the problem.

There is an alternative, however. When option substout is set to 1, AMPL tries to convert defining constraints into defined variables:

AMPL scans the constraints in the order of their appearance in the model, and converts a defining constraint to a defined variable if the variable’s declaration imposes no restrictions on it (such as integrality or bounds) and the variable has not appeared in any previous such constraint.
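A small sketch of a defining constraint that should be eligible under these conditions (all names are hypothetical, and whether the conversion actually happens depends on the rest of the model):

```ampl
# Hypothetical fragment; all names here are illustrative assumptions.
set PROD;
param rate {PROD} > 0;
param avail >= 0;

var Make  {PROD} >= 0;
var Hours {PROD};      # no bounds or integrality, so it can be substituted out

# Defining constraint: Hours[p] stands alone on the left of "=".
subject to HoursDef {p in PROD}: Hours[p] = Make[p] / rate[p];

subject to Time: sum {p in PROD} Hours[p] <= avail;

maximize TotalMade: sum {p in PROD} Make[p];

# Ask AMPL to try converting HoursDef into a definition of Hours.
option substout 1;
```

If Hours were declared with a bound such as >= 0, the conditions above would no longer hold and HoursDef would stay an ordinary constraint.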

Additionally, the AMPL-solver interface converts the representation in the .nl file to a form that the solver can deal with. That may involve further changes in the representation of defined variables. And finally, solvers incorporate highly complex implementations of a variety of optimization ideas, which react to defined variables or defining constraints in ways that are different but hard to predict. Put all this together, and you can see why it’s better to experiment than to try to figure out in advance which approach will work best for a certain application.
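One possible way to run such an experiment is sketched below. The file names are placeholders, and the choice of Ipopt is only an example; the timing parameters and the show_stats option are standard AMPL features.

```ampl
# Hypothetical file names; substitute your own model and data.
model mymodel.mod;
data  mymodel.dat;

option solver ipopt;
option show_stats 1;     # report problem sizes after AMPL's presolve
option substout 0;       # rerun the whole script with 1 to compare

solve;

# Cumulative CPU seconds spent in AMPL itself and in the solver.
display _ampl_time, _total_solve_time;
```

Running the script in a fresh AMPL session for each setting (or for each variant of the model) keeps the cumulative timers comparable.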


In the solver interfaces, the newer MP-based drivers substitute defined variables into the containing algebraic expressions, even nonlinear defined variables. This has proved most efficient, particularly with quadratic forms, which most MIP solvers like to see as a whole. This behavior can be configured via solver options cvt:dvelim and, for nonlinear formulas, cvt:expr:nlassign.

Most ASL drivers, such as Knitro and Ipopt, use ASL’s callbacks to evaluate nonlinear expressions and derivatives. ASL evaluates defined variables only once, even if used many times.
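To illustrate, here is a sketch of a nonlinear defined variable that is reused in several places (the model is made up for this example and all names are assumptions):

```ampl
# Hypothetical model; all names here are illustrative assumptions.
set I;
param a {I};

var x {I} >= 0.1;

# Shared nonlinear subexpression, named once as a defined variable.
var r = sqrt(sum {i in I} x[i]^2);

# r is referenced in the objective and in two constraints.
minimize Obj: r^2 + exp(-r);

subject to C1: r * sum {i in I} a[i] * x[i] >= 1;
subject to C2: log(1 + r) <= 2;
```

With an ASL-based driver, per the description above, r would be computed once per evaluation point rather than once per reference. Writing the square root out in full in each of the three places would give the same model mathematically but repeat the expression in the .nl file.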

Thus, defined variables are essentially macros, which leaves the most flexibility for efficient presolve. Note that even when the driver does not substitute them, some solvers can perform the substitution internally.

Thank you both for your detailed and informative responses. They’ve given me ideas for directions to try and pursue. Much appreciated!

Does AMPL internally do any common subexpression elimination (CSE) with the algebraic expressions? The expressions in my AMPL model can become quite large. However, the model will have many common subexpressions. If AMPL doesn’t do any CSE, then I might reformulate how I generate the model, but wanted to make sure AMPL isn’t already doing something like CSE in the backend.

If it doesn’t, then I could experiment with CSE and with defined variables vs. s.t. statements, to see how each affects model generation time and solve time. It seems like defined variables might be the way to go, given that they are evaluated only once during a callback, which complements the use of CSE as an optimization technique.