Performance / Memory issue

Hi,

I’m new to AMPL. I’ve written a simple model that optimizes picking locations in a warehouse.
The code works with 3 items and Gurobi or CPLEX, but:

  • It does not work with CBC, even with only 3 items …
  • With HiGHS it fails with the error: Error -1 for call Highs_passH
  • If I remove the filter on 3 items, it crashes with the AMPL error “Unexpected end of file while reading AMPL output.”

But it’s such a simple model … I converted the file because I can’t upload it with a .zip extension.

example.mod (32.6 KB)

Hi @Stephane,

Thank you for providing a complete example to reproduce the issue. In this model you have parameters and variables indexed over two sets (locations and items). Since you have 3,000 items and 30,000 locations, this means 90,000,000 variables if all items are available in all locations. This may still be possible to solve with Gurobi or CPLEX on a machine with a substantial amount of memory, but it is way bigger than anything that an open-source solver will be able to tackle nowadays.

To get an idea of the model size, you can do the following to see the size of the data being passed to AMPL:

print(cartesian_locations.unstack().shape)

The following code will display (21191625,) (i.e., 21,191,625 rows):

import polars as pl

orders = pl.read_csv("./data/orders.csv", separator=";")
locations = pl.read_csv("./data/locations.csv", separator=";")

# orders = orders.filter(
#     (pl.col("ITEM_ID") == 34682)
#     | (pl.col("ITEM_ID") == 34657)
#     | (pl.col("ITEM_ID") == 29840)
# )

# Cross join every ordered item with every location
cartesian_locations = (
    orders.select("ITEM_ID").unique().join(locations, how="cross", suffix="_LOCATIONS")
)
# Combine the CART_* flags into a single availability flag (polars keeps the name CART_STD1)
cartesian_locations = cartesian_locations.select(
    pl.col("ITEM_ID"),
    pl.col("LOCATION_ID"),
    (
        pl.col("CART_STD1")
        | pl.col("CART_STD2")
        | pl.col("CART_DEMI-HAUT")
        | pl.col("CART_VOLUMINEUX")
    ),
)
# Convert the boolean flag to 0/1
cartesian_locations = cartesian_locations.with_columns(
    pl.col("CART_STD1").apply(lambda col: int(col))
)
# Pivot into an item x location availability matrix
cartesian_locations = cartesian_locations.pivot(
    values="CART_STD1",
    index="ITEM_ID",
    columns="LOCATION_ID",
    aggregate_function="first",
)
# Flatten back into one row per (location, item) pair
cartesian_locations = cartesian_locations.to_pandas()
cartesian_locations = cartesian_locations.set_index("ITEM_ID").transpose()
df = cartesian_locations.unstack()
print(df.shape)

You can filter this data to only include items available in each location. That should help improve performance and reduce memory usage substantially.
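
For example, a minimal sketch along these lines (assuming, as in the script above, that the CART_* columns are boolean flags indicating whether an item can be stored in a location) builds only the valid (item, location) pairs instead of pivoting the full Cartesian product:

import polars as pl

orders = pl.read_csv("./data/orders.csv", separator=";")
locations = pl.read_csv("./data/locations.csv", separator=";")

# Keep only the (item, location) pairs where the item fits the location
valid_pairs = (
    orders.select("ITEM_ID")
    .unique()
    .join(locations, how="cross", suffix="_LOCATIONS")
    .filter(
        pl.col("CART_STD1")
        | pl.col("CART_STD2")
        | pl.col("CART_DEMI-HAUT")
        | pl.col("CART_VOLUMINEUX")
    )
    .select("ITEM_ID", "LOCATION_ID")
)
print(valid_pairs.shape)  # typically far fewer rows than the full 21,191,625

This pairs table is also exactly what you would need to build the ItemLocations set discussed below.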

Note that instead of having variables indexed over {Items, Locations}, you should index them over the valid allocations to reduce the number of variables. This also makes the constraint Valid_Allocation_Constraint unnecessary. Nevertheless, if the number of valid allocations remains in the order of millions, you will only be able to solve the problem with commercial solvers.

Hi,

Thanks for your answer.
What do you mean by “instead of having variables indexed over {Items, Locations}, you should index them over valid allocations”?

Best Regards
Stephane

If you have a set ItemLocations containing just the valid pairs, you can model it as follows:

set Items;  # Set of items
set Locations;  # Set of locations
set ItemLocations within {Items, Locations}; # Valid allocations

param Quantity{Items};  # Quantity of each item
param Cost_Location{Locations};  # Cost of each location
param Cost_Move := 1; # Cost of moving an item

var assign{ItemLocations} binary;  # Decision variables: 1 if the item is assigned to the location, 0 otherwise
var prev_loc{ItemLocations} binary;  # 1 if the item was previously in the location, 0 otherwise

minimize Total_Cost:
    sum{(i,j) in ItemLocations} assign[i,j] * (Cost_Location[j] + Cost_Move * prev_loc[i,j]);

subject to Assignment_Constraint{i in Items}:
    sum{(i,j) in ItemLocations} assign[i,j] = 1;

# redundant constraint as Assignment_Constraint enforces that each item is assigned to exactly one location:
#subject to AtLeastOneAssignment_Constraint{i in Items}:
#   sum{(i,j) in ItemLocations} assign[i,j] >= 1;

ItemLocations will be smaller than {Items, Locations} as it will only contain valid pairs.

I would also note that even though prev_loc is declared as a variable, it seems to really be a parameter indicating whether or not the item is currently in a given location. Switching it to a parameter reduces the model size, possibly to the point where you can use an open-source solver. You can also filter the data to only include assignments under a certain cost (i.e., exclude all expensive reallocations). The solution is not guaranteed to be optimal, especially if you pick a low threshold, but it should still be quite good. In other words, you can place in ItemLocations just the pairs (item, location) that seem worth considering, rather than all valid assignments.
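
A rough sketch of both ideas, assuming amplpy and the valid_pairs table from the earlier snippet; the REALLOC_COST column and the threshold value are purely illustrative placeholders for whatever measure you use to judge an expensive reallocation:

import polars as pl
from amplpy import AMPL

ampl = AMPL()
ampl.eval(
    """
    set Items;
    set Locations;
    set ItemLocations within {Items, Locations};  # only pairs worth considering

    param Cost_Location{Locations};
    param Cost_Move := 1;
    param prev_loc{ItemLocations} binary default 0;  # data, not a decision variable

    var assign{ItemLocations} binary;

    minimize Total_Cost:
        sum{(i,j) in ItemLocations}
            assign[i,j] * (Cost_Location[j] + Cost_Move * prev_loc[i,j]);

    subject to Assignment_Constraint{i in Items}:
        sum{(i,j) in ItemLocations} assign[i,j] = 1;
    """
)

# Hypothetical filter: REALLOC_COST stands in for whatever cost you attach to a pair
cheap_pairs = valid_pairs.filter(pl.col("REALLOC_COST") <= 10)

ampl.get_set("Items").set_values(cheap_pairs["ITEM_ID"].unique().to_list())
ampl.get_set("Locations").set_values(cheap_pairs["LOCATION_ID"].unique().to_list())
ampl.get_set("ItemLocations").set_values(
    list(zip(cheap_pairs["ITEM_ID"].to_list(), cheap_pairs["LOCATION_ID"].to_list()))
)
# ... then load Cost_Location and prev_loc, pick a solver, and solve

Since prev_loc is now data, the objective is linear, which should also make the model easier for CBC or HiGHS.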
