I am developing a large linear optimization model for energy system modelling. I am running it on a machine with 126 GB of RAM and am reaching the limit of its capacity (solving my problem uses up to 100 GB, and I still need to increase the size of my model).

However, when I look at your FAQ, I find the following:
“AMPL’s memory use for a linear program can be estimated crudely as 1000000 + 260 (m + n) + 50 nz bytes, where m is the number of constraints, n the number of variables, and nz the number of nonzeroes.”

When I compute this estimate for my case from the information in the log file, I get 11.35 GB, which is roughly one-tenth of what I observe in practice. Can you tell me why, and whether I can change that behaviour?
Also, can you tell me how I could estimate the additional memory needed if I increase the size of my model?
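
For reference, the FAQ estimate quoted above can be written as a small function. This is only a sketch: the function name and the sizes below are hypothetical placeholders, not values taken from the log, and the formula covers AMPL's own memory use only, not whatever the solver allocates afterwards.

```python
def ampl_lp_memory_bytes(m: int, n: int, nz: int) -> int:
    """Crude FAQ estimate of AMPL's memory use for a linear program, in bytes.

    m  = number of constraints
    n  = number of variables
    nz = number of nonzeroes
    """
    return 1_000_000 + 260 * (m + n) + 50 * nz


# Hypothetical sizes, for illustration only (not from the log in this thread).
m, n, nz = 10_000_000, 12_000_000, 100_000_000

base = ampl_lp_memory_bytes(m, n, nz)
# The same hypothetical model with every dimension doubled.
grown = ampl_lp_memory_bytes(2 * m, 2 * n, 2 * nz)

print(f"estimate at current size: {base / 1e9:.2f} GB")
print(f"estimate at doubled size: {grown / 1e9:.2f} GB")
```

Because the formula is linear in m, n, and nz, scaling the model by some factor scales the estimate by nearly the same factor; only the fixed 1000000-byte term stays constant.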

Your screenshot shows about 22 GB of memory used by the genmod, merge, and collect phases, which process the model and data to create a representation of the complete optimization problem in memory. However, from this information alone, it is not possible to tell which variables and constraints are most responsible for the memory use, or how much memory was used for the presolve, output, and possibly other phases. Thus it is not yet clear what is causing up to 100 GB of the machine's RAM to be used.

Can you post the entire output from AMPL (and from the solver, if any)? That will be very helpful in determining the cause of the very high memory use. If the listing is very long, you can store it in a file and then upload the file.

The listing for AMPL shows that it uses about 34 GB. Over 60% of the variables and constraints are eliminated by AMPL’s presolve phase; almost 12 GB is associated with presolved variables and constraints, and that might be reduced significantly by changes to the formulation.

But since AMPL is using much less than the 100 GB that you are observing, the greater memory use must be in CPLEX. Indeed, the message “Total non-zeros in factor = 6116241945” already suggests that memory use will be high, since somewhat more than 12 bytes will be needed for each nonzero. Thus, I would recommend first trying to reduce CPLEX’s memory requirements. Here are four things that you can try:

Add the option memoryemphasis=1 to the cplex_options string. This tells CPLEX to compress some data to reduce the memory used. (You might see some increase in computation time, however.)

Try ordering=1 and also ordering=2. Each of these tells CPLEX to choose a different method for factoring the main linear system solved at each barrier iteration, and there is a chance that one of them will produce a sparser factor that uses less memory than CPLEX’s current choice, which corresponds to ordering=3.

Add the option aggregate=0. This tells CPLEX not to eliminate equality constraints by substituting variables out of the problem. As a result a larger problem will be solved, but possibly the factorization will be sparser.

Run CPLEX separately from AMPL. Start your AMPL session as you did before, but instead of typing “solve”, type this command:

    write benergy;

Then quit AMPL, and in your command window, use a command like this (all on one line) to run CPLEX:

    cplex energy -AMPL baropt predual=-1 barstart=4 comptol=1e-5 crossover=0 timelimit=64800 bardisplay=1 prestats=1 display=2

(If you previously added any options to your cplex_options string, you need to add them to this command as well.) After CPLEX finishes, start AMPL again and type all of the commands you entered before, but when you get to where the write command was, enter this command instead:

    solution energy.sol;

Then you can proceed with commands to display or save results.
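
To keep the standalone command consistent with whatever is in your cplex_options string, one possibility is to assemble it with a short script. The following is a minimal sketch, assuming Python 3.8 or later is available; the helper name `cplex_command` is made up for illustration, and the script only prints the command rather than running it.

```python
import shlex

# Base options copied from the command line shown above.
BASE_OPTIONS = [
    "baropt", "predual=-1", "barstart=4", "comptol=1e-5", "crossover=0",
    "timelimit=64800", "bardisplay=1", "prestats=1", "display=2",
]


def cplex_command(stub, extra_options=()):
    """Return the argument list for running CPLEX on <stub>.nl outside AMPL.

    extra_options holds anything previously added to cplex_options,
    for example ["memoryemphasis=1"].
    """
    return ["cplex", stub, "-AMPL", *BASE_OPTIONS, *extra_options]


cmd = cplex_command("energy", ["memoryemphasis=1"])
# Print a single line that can be pasted into the command window.
print(shlex.join(cmd))
```

Keeping the options in one list makes it harder to forget one of them when switching between running CPLEX inside AMPL and running it separately.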

Try each of these ideas separately. If more than one appears helpful, then you can consider using two or more together.
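
One way to judge whether a trial helped is to convert the “Total non-zeros in factor” figure from each CPLEX log into an approximate memory figure, using the 12 bytes per nonzero mentioned earlier. The sketch below does that arithmetic; the 12-byte figure is only a lower bound, and the second nonzero count is a hypothetical placeholder, not a result from any actual run.

```python
BYTES_PER_FACTOR_NONZERO = 12  # lower bound; actual use is somewhat higher


def factor_memory_gb(factor_nonzeros):
    """Approximate lower bound on memory for the barrier factor, in GB."""
    return factor_nonzeros * BYTES_PER_FACTOR_NONZERO / 1e9


# "Total non-zeros in factor" reported in the log discussed above.
baseline_nonzeros = 6_116_241_945
print(f"baseline: at least {factor_memory_gb(baseline_nonzeros):.1f} GB")

# Hypothetical count standing in for what a trial run (say, ordering=1)
# might report; replace it with the number from your own log.
trial_nonzeros = 4_000_000_000
saving = factor_memory_gb(baseline_nonzeros) - factor_memory_gb(trial_nonzeros)
print(f"trial: at least {factor_memory_gb(trial_nonzeros):.1f} GB "
      f"(about {saving:.1f} GB less)")
```

For the baseline log this gives roughly 73 GB for the factor alone, which is consistent with most of the observed memory use being in CPLEX rather than in AMPL.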